Category Archives: AWS

New EC2 T4g Instances – Burstable Performance Powered by AWS Graviton2 – Try Them for Free

Two years ago, Amazon Elastic Compute Cloud (EC2) T3 instances were first made available, offering a very cost-effective way to run general purpose workloads. While current T3 instances offer sufficient compute performance for many use cases, many customers have told us that they have additional workloads that would benefit from increased peak performance and lower cost.

Today, we are launching T4g instances, a new generation of low-cost burstable instances powered by AWS Graviton2, a processor custom built by AWS using 64-bit Arm Neoverse cores. With T4g instances, you can enjoy a performance benefit of up to 40% at a 20% lower cost in comparison to T3 instances, providing the best price/performance for a broader spectrum of workloads.

T4g instances are designed for applications that don’t use the CPU at full power most of the time, using the same credit model as T3 instances with unlimited mode enabled by default. Web/application servers, small/medium data stores, and many microservices are examples of production workloads that require high CPU performance only during periods of heavy data processing. Compared to previous generations, the performance of T4g instances makes it possible to migrate additional workloads such as caching servers, search engine indexing, and e-commerce platforms.
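
Since T4g instances use the same credit model as T3, the standard EC2 credit APIs apply. For example (the instance ID below is a placeholder), you can inspect an instance’s credit option, or switch it to standard mode to cap costs:

$ aws ec2 describe-instance-credit-specifications --instance-ids i-1234567890abcdef0
$ aws ec2 modify-instance-credit-specification \
  --instance-credit-specifications "InstanceId=i-1234567890abcdef0,CpuCredits=standard"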

T4g instances are available in 7 sizes providing up to 5 Gbps of network and up to 2.7 Gbps of Amazon Elastic Block Store (EBS) performance:

Name          vCPUs   Baseline Performance/vCPU   CPU Credits Earned/Hour   Memory
t4g.nano        2      5%                           6                       0.5 GiB
t4g.micro       2     10%                          12                       1 GiB
t4g.small       2     20%                          24                       2 GiB
t4g.medium      2     20%                          24                       4 GiB
t4g.large       2     30%                          36                       8 GiB
t4g.xlarge      4     40%                          96                      16 GiB
t4g.2xlarge     8     40%                         192                      32 GiB

Free Trial
To make it easier to develop, test, and run your applications on T4g instances, all AWS customers are automatically enrolled in a free trial of the t4g.micro size. From September 2020 until December 31, 2020, you can run a t4g.micro instance and automatically get 750 free hours per month deducted from your bill, including any CPU credits used during those 750 free hours. The 750 hours are calculated in aggregate across all Regions. For details on the terms and conditions of the free trial, please refer to the EC2 FAQs.

During the free trial, have a look at this getting started guide on using the Arm-based AWS Graviton processors. There, you can find suggestions on how to build and optimize your applications, using different programming languages and operating systems, and on managing container-based workloads. Some of the tips are specific to the Graviton processor, but most of the content applies to anyone using Arm to run their code.

Using T4g Instances
You can start an EC2 instance in different ways, for example using the EC2 console, the AWS Command Line Interface (CLI), AWS SDKs, or AWS CloudFormation. For my first T4g instance, I use the AWS CLI:

$ aws ec2 run-instances \
  --instance-type t4g.micro \
  --image-id ami-09a67037138f86e67 \
  --security-groups MySecurityGroup \
  --key-name my-key-pair

The Amazon Machine Image (AMI) I am using is based on Amazon Linux 2. Other platforms are available, such as Ubuntu 18.04 or newer, Red Hat Enterprise Linux 8.0 and newer, and SUSE Enterprise Server 15 and newer. You can find additional AMIs in the AWS Marketplace, for example Fedora, Debian, NetBSD, CentOS, and NGINX Plus. For containerized applications, Amazon ECS and Amazon Elastic Kubernetes Service optimized AMIs are available as well.

The security group I selected gives me SSH access to the instance. I connect to the instance and do a general update:

$ sudo yum update -y

Since the kernel has been updated, I reboot the instance.

I’d like to set up this instance as a development environment. I can use it to build new applications, or to recompile my existing apps to the 64-bit Arm architecture. To install most development tools, such as Git, GCC, and Make, I use this group of packages:

$ sudo yum groupinstall -y "Development Tools"

AWS is working with several open source communities to drive improvements to the performance of software stacks running on AWS Graviton2. For example, you can see our contributions to PHP for Arm64 in this post.

Using the latest versions helps you obtain maximum performance from your Graviton2-based instances. The amazon-linux-extras command enables new versions for some of my favorite programming environments:

$ sudo amazon-linux-extras enable golang1.11 corretto8 php7.4 python3.8 ruby2.6

The output of the amazon-linux-extras command tells me which packages to install with yum:

$ sudo yum clean metadata
$ sudo yum install -y golang java-1.8.0-amazon-corretto \
  php-cli php-pdo php-fpm php-json php-mysqlnd \
  python38 ruby ruby-irb rubygem-rake rubygem-json rubygems

Let’s check the versions of the tools that I just installed:

$ go version
go version go1.13.14 linux/arm64
$ java -version
openjdk version "1.8.0_265"
OpenJDK Runtime Environment Corretto-8.265.01.1 (build 1.8.0_265-b01)
OpenJDK 64-Bit Server VM Corretto-8.265.01.1 (build 25.265-b01, mixed mode)
$ php -v
PHP 7.4.9 (cli) (built: Aug 21 2020 21:45:13) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
$ python3.8 -V
Python 3.8.5
$ ruby -v
ruby 2.6.3p62 (2019-04-16 revision 67580) [aarch64-linux]

It looks like I am ready to go! Many more packages are available with yum, such as MariaDB and PostgreSQL. If you’re interested in databases, you might also want to try the preview of Amazon RDS powered by AWS Graviton2 processors.

Available Now
T4g instances are available today in US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Tokyo, Mumbai), Europe (Frankfurt, Ireland).

You now have a broad choice of Graviton2-based instances to better optimize your workloads for cost and performance: low cost burstable general-purpose (T4g), general purpose (M6g), compute optimized (C6g) and memory optimized (R6g) instances. Local NVMe-based SSD storage options are also available.

You can use the free trial to develop new applications, or migrate your existing workloads to the AWS Graviton2 processor. Let me know how that goes!

Danilo

AWS Named as a Cloud Leader for the 10th Consecutive Year in Gartner’s Infrastructure & Platform Services Magic Quadrant

At AWS, we strive to provide you a technology platform that allows for agile development, rapid deployment, and unlimited scale, so that you can free up your resources to focus on innovation for your customers. It’s greatly rewarding to see our efforts recognized not just by our customers, but also by leading analysts.

This year, Gartner announced a new Magic Quadrant for Cloud Infrastructure and Platform Services (CIPS). This is an evolution of their Magic Quadrant for Cloud Infrastructure as a Service (IaaS) for which AWS has been named as a Leader for nine consecutive years.

Customers are using the cloud in broad ways, beyond foundational compute, networking, and storage services. We believe that this is why Gartner is expanding the scope to include additional platform as a service (PaaS) capabilities, and is extending coverage for areas such as managed database services, serverless computing, and developer tools.

Today, I am happy to share that AWS has been named as a Leader in the Magic Quadrant for Cloud Infrastructure and Platform Services, and placed highest in Ability to Execute and furthest in Completeness of Vision.

More information on the features and factors that our customers examine when choosing a cloud provider is available in the full report.

Danilo

Gartner, Magic Quadrant for Cloud Infrastructure and Platform Services, Raj Bala, Bob Gill, Dennis Smith, David Wright, Kevin Ji, 1 September 2020 – Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Announcing a second Local Zone in Los Angeles

In December 2019, Jeff Barr published this post announcing the launch of a new Local Zone in Los Angeles, California. A Local Zone extends existing AWS regions closer to end-users, providing single-digit millisecond latency to a subset of AWS services in the zone. Local Zones are attached to parent regions – in this case US West (Oregon) – and access to services and resources is performed through the parent region’s endpoints. This makes Local Zones transparent to your applications and end-users. Applications running in a Local Zone have access to all AWS services, not just the subset in the zone, via Amazon’s redundant and very high bandwidth private network backbone to the parent region.

At the end of the post, Jeff wrote (and I quote) – “In the fullness of time (as Andy Jassy often says), there could very well be more than one Local Zone in any given geographic area. In 2020, we will open a second one in Los Angeles (us-west-2-lax-1b), and are giving consideration to other locations.”

Well – that time has come! I’m pleased to announce that, following customer requests, AWS today launched a second Local Zone in Los Angeles to further help customers in the area (and Southern California generally) achieve even greater availability and fault tolerance for their applications, in conjunction with very low latency. These customers have workloads that require very low latency, such as artist workstations, local rendering, gaming, financial transaction processing, and more.

The new Los Angeles Local Zone contains a subset of services, such as Amazon Elastic Compute Cloud (EC2), Amazon Elastic Block Store (EBS), Amazon FSx for Windows File Server and Amazon FSx for Lustre, Elastic Load Balancing, Amazon Relational Database Service (RDS), and Amazon Virtual Private Cloud, and remember, applications running in the zone can access all AWS services and other resources through the zone’s association with the parent region.

Enabling Local Zones
Once you opt in to use one or more Local Zones in your account, they appear as additional Availability Zones for you to use when deploying your applications and resources. The original zone, launched last December, was us-west-2-lax-1a. The additional zone, available to all customers now, is us-west-2-lax-1b. Local Zones can be enabled using the new Settings section of the EC2 Management Console.

As noted in Jeff’s original post, you can also opt-in (or out) of access to Local Zones with the AWS Command Line Interface (CLI) (aws ec2 modify-availability-zone-group command), AWS Tools for PowerShell (Edit-EC2AvailabilityZoneGroup cmdlet), or by calling the ModifyAvailabilityZoneGroup API from one of the AWS SDKs.
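
For example, opting in to the Los Angeles zones from the CLI should look something like this (us-west-2-lax-1 is the zone group containing both zones):

$ aws ec2 modify-availability-zone-group --group-name us-west-2-lax-1 --opt-in-status opted-in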

Using Local Zones for Highly-Available, Low-Latency Applications
Local Zones can be used to further improve high availability for applications, as well as ultra-low latency. One scenario is for enterprise migrations using a hybrid architecture. These enterprises have workloads that currently run in existing on-premises data centers in the Los Angeles metro area and it can be daunting to migrate these application portfolios, many of them interdependent, to the cloud. By utilizing AWS Direct Connect in conjunction with a Local Zone, these customers can establish a hybrid environment that provides ultra-low latency communication between applications running in the Los Angeles Local Zone and the on-premises installations without needing a potentially expensive revamp of their architecture. As time progresses, the on-premises applications can be migrated to the cloud in an incremental fashion, simplifying the overall migration process. The diagram below illustrates this type of enterprise hybrid architecture.

Anecdotally, we have heard from customers using this type of hybrid architecture, combining Local Zones with AWS Direct Connect, that they are achieving sub-1.5ms latency communication between the applications hosted in the Local Zone and those in the on-premises data center in the LA metro area.

Virtual desktops for rendering and animation workloads are another scenario for Local Zones. For these workloads, latency is critical, and the addition of a second Local Zone for Los Angeles gives additional failover capability, without sacrificing latency, should the need arise.

As always, the teams are listening to your feedback – thank you! – and are working on adding Local Zones in other locations, along with the availability of additional services, including Amazon ECS, Amazon Elastic Kubernetes Service, Amazon ElastiCache, Amazon Elasticsearch Service, and Amazon Managed Streaming for Apache Kafka. If you’re an AWS customer based in Los Angeles or Southern California, with a requirement for even greater availability and very low latency for your workloads, then we invite you to check out the new Local Zone. More details and pricing information can be found on the Local Zones page.

— Steve

Seamlessly Join a Linux Instance to AWS Directory Service for Microsoft Active Directory

Many customers I speak to use Active Directory to manage centralized user authentication and authorization for a variety of applications and services. For these customers, Active Directory is a critical piece of their IT jigsaw.

At AWS, we offer the AWS Directory Service for Microsoft Active Directory that provides our customers with a highly available and resilient Active Directory service that is built on actual Microsoft Active Directory. AWS manages the infrastructure required to run Active Directory and handles all of the patching and software updates needed. It’s fully managed, so for example, if a domain controller fails, our monitoring will automatically detect and replace that failed controller.

Manually connecting a machine to Active Directory is a thankless task; you have to connect to the computer, make a series of manual changes, and then perform a reboot. While none of this is particularly challenging, it does take time, and if you have several machines that you want to onboard, then this task quickly becomes a time sink.

Today the team is unveiling a new feature that enables a Linux EC2 instance, as it is launched, to connect to AWS Directory Service for Microsoft Active Directory seamlessly. This complements the existing feature that allows Windows EC2 instances to seamlessly join a domain as they are launched. This capability enables customers to move faster and improves the experience for administrators.

Now you can have both your Windows and Linux EC2 instances seamlessly connect to AWS Directory Service for Microsoft Active Directory. The directory can be in your own account or shared with you from another account, the only caveat being that both the instance and the directory must be in the same region.

To show you how the process works, let’s take an existing AWS Directory Service for Microsoft Active Directory and work through the steps required to have a Linux EC2 instance seamlessly join that directory.

Create and Store AD Credentials
To seamlessly join a Linux machine to my AWS Managed Active Directory Domain, I will need an account that has permissions to join instances into the domain. While members of the AWS Delegated Administrators have sufficient privileges to join machines to the domain, I have created a service account that has the minimum privileges required. Our documentation explains how you go about creating this sort of service account.

The seamless domain join feature needs to know the credentials of my Active Directory service account. To achieve this, I need to create a secret using AWS Secrets Manager with specifically named secret keys, which the seamless domain join feature will use to join instances to the directory.

In the AWS Secrets Manager console, I click on the Store a new secret button. On the next screen, when asked to Select a secret type, I choose the option named Other type of secrets. I can now add two secret key/values. The first is called awsSeamlessDomainUsername, and in the value textbox, I enter the username for my Active Directory service account. The second key is called awsSeamlessDomainPassword, and here I enter the password for my service account.

Since this is a demo, I chose to use the DefaultEncryptionKey for the secret, but you might decide to use your own key.

After clicking next, I am asked to give the secret a name. I add the following name, replacing d-xxxxxxxxx with my directory ID.

aws/directory-services/d-xxxxxxxxx/seamless-domain-join

The domain join will fail if you mistype this name or if you have any leading or trailing spaces.

I note down the Secret ARN, as I will need it when I create my IAM policy.
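
If you prefer the CLI, the same secret can be created in one step; this is a sketch with placeholder credentials, and the command prints the Secret ARN you need:

$ aws secretsmanager create-secret \
  --name aws/directory-services/d-xxxxxxxxx/seamless-domain-join \
  --secret-string '{"awsSeamlessDomainUsername":"svc-seamless-join","awsSeamlessDomainPassword":"PLACEHOLDER"}'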

Create The Required IAM Policy and Role
Now I need to create an IAM policy that gives permission to read my seamless-domain-join secret.

I sign in to the IAM console and choose Policies. In the content pane, I select Create policy. I switch over to the JSON tab and copy the text from the following JSON policy document, replacing the Secrets Manager ARN with the one I noted down earlier.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue",
                "secretsmanager:DescribeSecret"
            ],
            "Resource": [
                "arn:aws:secretsmanager:us-east-1:############:secret:aws/directory-service/d-xxxxxxxxxx/seamless-domain-join-example"
            ]
        }
    ]
}

On the Review page, I name the policy SeamlessDomainJoin-Secret-Readonly then choose Create policy to save my work.

Now I need to create an IAM role that will use this policy (and a few others). In the IAM console, I choose Roles, and then in the content pane, choose Create role. Under Select type of trusted entity, I select AWS service, then select EC2 as a use case, and click Next: Permissions.


I attach the following policies to my Role: AmazonSSMManagedInstanceCore, AmazonSSMDirectoryServiceAccess, and SeamlessDomainJoin-Secret-Readonly.

I click through to the Review screen where it asks for a Role name. I call the role EC2DomainJoin, but it could be called whatever you like. I then create the role by pressing the button at the bottom right of the screen.

Create an Amazon Machine Image
When I launch a Linux instance later, I will need to pick a Linux Amazon Machine Image (AMI) as a template. Currently, the default Linux AMIs do not contain the version of the AWS Systems Manager agent (SSM agent) that this new seamless domain join feature needs. Therefore, I am going to have to create an AMI with an updated SSM agent. To do this, I first create a new Linux instance in my account and then connect to it using my SSH client. I then follow the documentation to update the SSM agent to 2.3.1644.0 or newer. Once the instance has finished updating, I am able to create a new AMI based on this instance using the following documentation.

I now have a new AMI which I can use in the next step. In the future, the base AMIs will be updated to use the newer SSM agent, and this section can be skipped. If you are interested in knowing which version of the SSM agent an instance is using, this documentation explains how you can check.
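
If you’d rather check from the CLI, instances registered with Systems Manager report their agent version; this is a sketch using the DescribeInstanceInformation API:

$ aws ssm describe-instance-information \
  --query "InstanceInformationList[*].[InstanceId,AgentVersion]" --output table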

Seamless Join
To start, I need to create a Linux instance, and so I head over to the EC2 console and choose Launch Instance.

Next, I pick a Linux Amazon Machine Image (AMI). I select the AMI which I created earlier.

When configuring the instance, I am careful to choose the Amazon Virtual Private Cloud that contains my directory. Using the drop-down labeled Domain join directory I am able to select the directory that I want this instance to join.

In the IAM role, I select the EC2DomainJoin role that I created earlier.

When I launch this instance, it will seamlessly join my directory. Once the instance comes online, I can confirm everything is working correctly by using SSH to connect to the instance using the administrator credentials of my AWS Directory Service for Microsoft Active Directory.

This new feature is available from today, and we look forward to hearing your feedback about this new capability.

Happy Joining

— Martin

Log your VPC DNS queries with Route 53 Resolver Query Logs

The Amazon Route 53 team has just launched a new feature called Route 53 Resolver Query Logs, which will let you log all DNS queries made by resources within your Amazon Virtual Private Cloud. Whether it’s an Amazon Elastic Compute Cloud (EC2) instance, an AWS Lambda function, or a container, if it lives in your Virtual Private Cloud and makes a DNS query, then this feature will log it; you are then able to explore and better understand how your applications are operating.

Our customers explained to us that DNS query logs were important to them. Some wanted the logs so that they could stay compliant with regulations; others wished to monitor DNS querying behavior so they could spot security threats. Others simply wanted to troubleshoot application issues that were related to DNS. The team listened to our customers and have developed what I have found to be an elegant and easy to use solution.

From knowing very little about the Route 53 Resolver, I was able to configure query logging and have it working with barely a second glance at the documentation, which I assure you is a testament to the intuitiveness of the feature rather than me having any significant experience with Route 53 or DNS query logging.

You can choose to have the DNS query logs sent to one of three AWS services: Amazon CloudWatch Logs, Amazon Simple Storage Service (S3), and Amazon Kinesis Data Firehose. The target service you choose will depend mainly on what you want to do with the data. If you have compliance mandates (for example, Australia’s Information Security Registered Assessors Program), then maybe storing the logs in Amazon Simple Storage Service (S3) is a good option. If you plan to monitor and analyze DNS queries in real-time, or you integrate your logs with a third-party data analysis tool like Kibana or a SIEM tool like Splunk, then perhaps Amazon Kinesis Data Firehose is the option for you. For those of you who want an easy way to search, query, monitor metrics, or raise alarms, Amazon CloudWatch Logs is a great choice, and this is what I will show in the following demo.

Over in the Route 53 Console, near the Resolver menu section, I see a new item called Query logging. Clicking on this takes me to a screen where I can configure the logging.

The dashboard shows the current configurations that are set up. I click Configure query logging to get started.

The console asks me to fill out some necessary information, such as a friendly name; I’ve named mine demoNewsBlog.

I am now prompted to select the destination where I would like my logs to be sent. I choose the CloudWatch Logs log group and select the option to Create log group. I give my new log group the name /aws/route/demothebeebsnet.

Next, I need to select which VPCs I would like to log queries for. Any resource that sits inside the VPCs I choose here will have its DNS queries logged. You are also able to add tags to this configuration. I am in the habit of tagging anything that I use as part of a demo with the tag demo. This is so I can easily distinguish between demo resources and live resources in my account.

Finally, I press the Configure query logging button, and the configuration is saved. Within a few moments, the service has successfully enabled the query logging in my VPC.

After a few minutes, I log into the Amazon CloudWatch Logs console and can see that the logs have started to appear.

As you can see below, I was quickly able to start searching my logs and running queries using Amazon CloudWatch Logs Insights.
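
As a sketch of what you can do there, a query like the following surfaces the most frequent DNS names in the logs; field names such as query_name and srcaddr come from the Resolver query log record format:

fields @timestamp, query_name, query_type, srcaddr
| stats count(*) as queryCount by query_name
| sort queryCount desc
| limit 10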

There is a lot you can do with the Amazon CloudWatch Logs service, for example, I could use CloudWatch Metric Filters to automatically generate metrics or even create dashboards. While putting this demo together, I also discovered a feature inside of Amazon CloudWatch Logs called Contributor Insights that enables you to analyze log data and create time series that display top talkers. Very quickly, I was able to produce this graph, which lists out the most common DNS queries over time.

Route 53 Resolver Query Logs is available in all AWS Commercial Regions that support Route 53 Resolver Endpoints, and you can get started using either the API or the AWS Console. You do not pay for the Route 53 Resolver Query Logs, but you will pay for handling the logs in the destination service that you choose. So, for example, if you decide to use Amazon Kinesis Data Firehose, you will incur the regular charges for handling logs with the Amazon Kinesis Data Firehose service.

Happy Logging

— Martin

New EBS Volume Type (io2) – 100x Higher Durability and 10x More IOPS/GiB

We launched EBS Volumes with Provisioned IOPS way back in 2012. These volumes are a great fit for your most I/O-hungry and latency-sensitive applications because you can dial in the level of performance that you need, and then (with the launch of Elastic Volumes in 2017) change it later.

Over the years, we have increased the ratio of IOPS per gibibyte (GiB) of SSD-backed storage several times, most recently in August 2016. This ratio started out at 10 IOPS per GiB, and has grown steadily to 50 IOPS per GiB. In other words, the bigger the EBS volume, the more IOPS it can be provisioned to deliver, with a per-volume upper bound of 64,000 IOPS. This change in ratios has reduced storage costs by a factor of 5 for throughput-centric workloads.

Also, based on your requests and your insatiable desire for more performance, we have raised the maximum number of IOPS per EBS volume multiple times over the years.

The August 2014 change in the I/O request size made EBS 16x more cost-effective for throughput-centric workloads.

Bringing the various numbers together, you can think of Provisioned IOPS volumes as being defined by capacity, IOPS, and the ratio of IOPS per GiB. You should also think about durability, which is expressed in percentage terms. For example, io1 volumes are designed to deliver 99.9% durability, which is 20x more reliable than typical commodity disk drives.

Higher Durability & More IOPS
Today we are launching the io2 volume type, with two important benefits, at the same price as the existing io1 volumes:

Higher Durability – The io2 volumes are designed to deliver 99.999% durability, making them 2000x more reliable than a commodity disk drive, further reducing the possibility of a storage volume failure and helping to improve the availability of your application. By the way, in the past we expressed durability in terms of an Annual Failure Rate, or AFR. The new, percentage-based model is consistent with our other storage offerings, and also communicates expectations for success, rather than for failure.

More IOPS – We are increasing the IOPS per GiB ratio yet again, this time to 500 IOPS per GiB. You can get higher performance from your EBS volumes, and you can reduce or outright eliminate any over-provisioning that you might have done in the past to achieve the desired level of performance.

Taken together, these benefits make io2 volumes a perfect fit for your high-performance, business-critical databases and workloads. This includes SAP HANA, Microsoft SQL Server, and IBM DB2.

You can create new io2 volumes from the EC2 console, and you can easily change the type of an existing volume to io2, either in the console or with the AWS CLI:

$ aws ec2 modify-volume --volume-id vol-0b3c663aeca5aabb7 --volume-type io2

io2 volumes support all features of io1 volumes with the exception of Multi-Attach, which is on the roadmap.
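
Creating a new io2 volume from the CLI is just as simple; a sketch, with placeholder size, IOPS, and Availability Zone (io2 allows up to 500 IOPS per provisioned GiB):

$ aws ec2 create-volume \
  --volume-type io2 \
  --size 100 \
  --iops 5000 \
  --availability-zone us-east-1a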

Available Now
You can make use of io2 volumes in the US East (Ohio), US East (N. Virginia), US West (N. California), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Stockholm), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and Middle East (Bahrain) Regions today.

Jeff;

 

AWS announces WorldForge in AWS RoboMaker

What was announced?

Introducing AWS RoboMaker WorldForge 🤖, a new capability that makes robotics simulation easier! 🎉

AWS RoboMaker WorldForge makes it faster, simpler, and less expensive to create a multitude of 3D virtual worlds to simulate your robots in. Now robotics application developers and QA engineers can automatically create hundreds of user-defined, randomized 3D virtual environments that mimic real-world conditions without extra engineering investment or infrastructure management.

How it works and who it’s for…

Robotics companies are building increasingly sophisticated robots with autonomous capabilities and artificial intelligence. To develop these capabilities, developer teams need simulation to test and train their robots. Scaling simulation unlocks the ability to make testing more robust, reinforcement learning faster, and synthetic data generation (information that’s artificially manufactured rather than generated by real-world events) more affordable.

Until today, it was difficult for robotics teams to do simulation because building even just one 3D world was costly, time consuming, and required specialized skills in 3D modeling and knowledge of simulation engines. Given the investment needed to create a single simulation world, it was almost impossible to build enough worlds to effectively scale simulation. With AWS RoboMaker WorldForge, robotics developers can easily increase the scale and variance of simulation, improving the quality of production code and accelerating time to market.

Let’s see a demo…

Who wants to see a demo? I know I do. 😁

Let’s head over to the AWS Console and search for AWS RoboMaker. We click through to the AWS RoboMaker console and upon opening the side navigation menu, we notice a new section for AWS RoboMaker WorldForge capabilities.

Under Simulation WorldForge, we go to World templates and click the “Create template” button.

We have two options: start a world from scratch, or use one of the sample templates that RoboMaker WorldForge provides ready to use out of the box. Although we could start from scratch, for today’s blog post we’re going to use a sample template.

After we create this new template, we will be able to edit and customize it to create one or more random worlds.

AWS RoboMaker WorldForge provides four sample templates: a bedroom, a living room/dining room, a one-bedroom apartment, and a small house.

For this demo, we will choose the living room/dining room template to start with.

Do you notice the green cylinder in the world? If you’re thinking that looked a little out of place, you’re right! That’s not a typical piece of furniture you would have in your home! It’s actually an indicator of the starting position when you put your robot into the world for simulation.

This will create a new world template in your account. This is an interactive console that allows us to edit and customize our worlds and rooms.

But first, let’s go ahead and generate some worlds with as few as 4 clicks.

I am generating only one world to start with.

Now RoboMaker WorldForge starts a job that is going to create this world.

We can click through our new world and view it in detail.

As you can see, this world is now a resource in your account. We can use this for simulation.

Take note how this world is similar but not identical to the world in the sample template. It still has two rooms and furniture, but the furniture is in a different location. The materials are different for the floor and walls, and the furniture setup varies. We could also place our robot in these worlds, to test our robot application performance in each world.

Ok. Let’s go into the World template and take a detailed look at it. We’re going to edit our sample template that we started with, and we’re going to add new rooms.

We can also name our templates.

Now let’s edit our template. We’re starting with our floor plan dimensions. I am going to use a 1:3 ratio and make my ceiling taller too.

Perhaps we want to edit one of the rooms. I can edit the desired aspect ratio and area.

We proceed to make this room have a 1:3 ratio as well.

But we can also add even more rooms. In fact, we have 7 types of rooms to choose from.

Now we want to customize where these rooms sit in the layout, and that is what Connections is for. Connections helps you specify a preference for which rooms are adjacent to each other.

By default, AWS RoboMaker WorldForge will fully connect your floor plans. There are two types of connections: openings and doorways.

We can also customize the Interiors. Interiors are things such as walls and flooring materials, and you must select which specific rooms you wish to update.

We can also override the custom furniture that comes in each world template.

Ok. Let’s take a step back and recap what we’ve done so far.

We’ve decorated an interior, created a 6-room floor plan, customized the room connections, and saved this template. Now, QA and Robotics engineers can just as easily generate worlds for regression testing with these capabilities!

But now, we want to create more worlds. 🙌🏽

I click on the “Generate Worlds” button in the top right corner.

We can configure the number of worlds by specifying the number of floor plans and interior variations per floor plan.

You can request up to a total of 50 worlds! For today’s demo, we’re going to choose 10 floor plans and 3 variations, for a total of 30 worlds.

There we go! We just requested 30 worlds for our 6-room floor plan. This is the generation job page.
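
The same request can be made programmatically. Here is a rough AWS CLI sketch of the equivalent call, where the world template ARN is a placeholder for the one in your account:

$ aws robomaker create-world-generation-job \
  --template arn:aws:robomaker:us-east-1:123456789012:world-template/EXAMPLE \
  --world-count floorplanCount=10,interiorCountPerFloorplan=3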

Our worlds will stream in here as they are completed. Once completed, we’ll be able to see each of these worlds and their variations. You can then test your robot’s navigation, moving it through each of these new complex worlds and then compare the results.

This is just one example of how you can create a multitude of 3D virtual worlds to simulate your robots in with AWS RoboMaker WorldForge.

🌎Lastly…

AWS RoboMaker WorldForge is now Generally Available in the following regions: US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), EU (Frankfurt), Asia Pacific (Tokyo), and Asia Pacific (Singapore).

To start building with AWS RoboMaker WorldForge, visit the product landing page and the Developer Guide.

 

Thank you for your time!
~Alejandra 💁🏻‍♀️🤖 and Canela 🐾

New – AWS Fargate for Amazon EKS now supports Amazon EFS

AWS Fargate is a serverless compute engine for containers available with both Amazon Elastic Kubernetes Service (EKS) and Amazon Elastic Container Service (ECS). With Fargate, developers are able to focus on building applications, eliminating the undifferentiated heavy lifting of managing the underlying infrastructure.

Developers specify resources for each Kubernetes pod, and are charged only for provisioned compute resources. When using Fargate, each EKS pod runs in its own kernel runtime environment, and CPU, memory, storage, and network resources are never shared with other pods, providing workload isolation and increased security.

Containers are ephemeral in nature. They are dynamically scaled in and out, and their saved state or data is cleared on exit. We’ve had many requests from our customers for data persistence and shared storage for containerized applications since launching EKS support for Fargate in 2019, and we announced Amazon Elastic File System (EFS) support for Fargate on ECS in April 2020. Now many customers are operating stateful workloads on it, and others have requested support for EFS with Fargate when used with EKS. Today we are happy to announce this EFS support.

EFS provides a simple, scalable, and fully managed shared file system for use with AWS cloud services, and can also help Kubernetes applications be highly available because all data written to EFS is written to multiple AWS Availability Zones. EFS is built for on-demand petabyte growth without application interruption, and it automatically grows and shrinks as files are added and removed, eliminating the need to provision and manage capacity to accommodate growth. EFS is also ideal for security-sensitive workloads, as it can encrypt data in the file system and data in transit.

Kubernetes supports the Container Storage Interface (CSI), a standard for exposing block and file storage systems to containerized workloads. The EFS CSI driver makes it simple to configure elastic file storage for Kubernetes clusters, and before this update customers could use EFS via Amazon EC2 worker nodes connected to a cluster. Now customers can also configure their pods running on Fargate to access an EFS file system using standard Kubernetes APIs. With this update, customers can run stateful workloads that require highly available file systems, as well as workloads that require access to shared storage. Using the EFS CSI driver, all data in transit is encrypted by default.

We released a generally available version of the Amazon EFS CSI driver for EKS in July 2020. The Amazon EFS CSI driver makes it easy to configure elastic file storage for both EKS and self-managed Kubernetes clusters running on AWS using standard Kubernetes interfaces. If a Kubernetes pod is terminated and relaunched, the CSI driver reconnects the EFS file system, even if the pod is relaunched in a different AWS Availability Zone. When using standard EC2 worker nodes, the EFS CSI driver needs to be deployed as a set of pods and DaemonSets. With this new update, this step is not required for Fargate and you do not need to install the EFS CSI driver, as it is installed in the Fargate stack and support for EFS is provided out of the box. Customers can use EFS with Fargate for EKS without spending the time and resources to install and update the CSI driver.

How to configure the Fargate/EKS and EFS integration?

You need to use three Kubernetes objects to mount EFS on Fargate with EKS: StorageClass, PersistentVolume (PV), and PersistentVolumeClaim (PVC). Configuring the StorageClass and PVs are steps that an administrator (or similar) would take to make EFS file systems available for application developers. PVCs are used to allocate PVs from the pool of existing PVs as needed to deploy applications.

The StorageClass object provides a way for a Kubernetes administrator to register a specific storage type (e.g. EFS or EBS) and configuration (e.g. throughput, backup policy). Once a StorageClass is defined the PV object is used to create actual storage volumes inside that class. StorageClass and PV are the Kubernetes mechanisms that allow actual storage subsystems to be abstracted and decoupled from the way they are consumed by Kubernetes users. For example, while a Kubernetes administrator needs to know how exactly to configure a specific storage configuration from a particular storage service, Kubernetes users do not because they only see their volumes within abstract classes of storage.

The last step is the binding: Kubernetes users request access to said volumes via the PVC object and related API. These volumes can be created dynamically when the user requests them via the PVC, or they need to be statically pre-created by an administrator for later consumption by a Kubernetes user. The current implementation of the EFS CSI driver requires the volumes to be statically pre-created for the PVC binding to work.

If you are new to Kubernetes persistent volumes and want to know more about how they work, please refer to this page in the Kubernetes documentation that has all the details.

Let’s see this in action. First, you need to create your own EFS file system in the same AWS Region. If you are not familiar with EFS this EFS getting start guide is a good resource you can start with.

Once you create an EFS file system, you get your file system ID. You can configure the mount settings using a Kubernetes StorageClass and PersistentVolume. Here is an example of the YAML files:

CSIDriver Object

apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
  name: efs.csi.aws.com
spec:
  attachRequired: false

For now you need to add the EFS CSIDriver object shown above to your cluster so Kubernetes can discover the driver that Fargate automatically installs. In the future, this manifest will be added by default to EKS clusters.

Storage Class

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com

PersistentVolume(PV)

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: <EFS filesystem ID>

The volumeHandle is returned by the EFS service when you create a file system, and you need to use it to configure the CSI driver to create the PV. You can obtain the EFS file system ID from the AWS Management Console or with the following AWS CLI command:

aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text

Now that you have created a PV by applying the manifest above, you configure Kubernetes pods to access the EFS file system by including a PersistentVolumeClaim in the pod manifest. Here are two example manifests that do that:

PersistentVolumeClaim(PVC)

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi

Pod manifest

apiVersion: v1
kind: Pod
metadata:
  name: app1
spec:
  containers:
  - name: app1
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date -u) >> /data/out1.txt; sleep 5; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: efs-claim
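
With the manifests above saved to files (the file names below are placeholders), deploying and verifying the setup is the usual kubectl flow:

$ kubectl apply -f storageclass.yaml -f pv.yaml -f pvc.yaml -f pod.yaml
$ kubectl exec -it app1 -- tail /data/out1.txt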

Available Today

Today, this feature update is available for newly created EKS clusters with Kubernetes version 1.17, and we are planning to roll out support for this feature with additional Kubernetes versions on EKS in the coming weeks. This update is available in all AWS regions where Fargate with EKS is available. You can check our latest documentation for more detail.

– Kame;

AWS Online Tech Talks for August 2020

Join us for live, online presentations led by AWS solutions architects and engineers. AWS Online Tech Talks cover a range of topics and expertise levels, and feature technical deep dives, demonstrations, customer examples, and live Q&A with AWS experts.

Note – All sessions are free and in Pacific Time. Can’t join us live? Access webinar recordings and slides on our On-Demand Portal.

Tech talks this month are:

August 17, 2020 | 9:00 AM – 10:00 AM PT
Unlock the Power of Connected Vehicle Data with the AWS Connected Mobility Solution
Learn how to start building solutions for common connected mobility use cases in minutes.

August 17, 2020 | 11:00 AM – 12:00 PM PT
Build a Blockchain Track-and-Trace Application on AWS
Learn how to set up a blockchain network and build a track-and-trace application using Amazon Managed Blockchain.

August 18, 2020 | 9:00 AM – 10:00 AM PT
Cloud Financial Management: How Small Changes Can Accelerate Value Realization
Join 451 Research and AWS Cloud Financial Management (CFM) experts as they share findings from a recent study of 500 enterprise AWS customers conducted to understand how CFM helps organizations realize value beyond cost savings.

August 18, 2020 | 11:00 AM – 12:00 PM PT
Improve Data Science Team Productivity Using Amazon SageMaker Studio
Join us for a deep dive into Amazon SageMaker Studio, the first IDE for ML.

August 18, 2020 | 1:00 PM – 2:00 PM PT
What’s New With AWS Global Accelerator
Learn how to move user traffic onto the AWS network infrastructure, improving performance by up to 60% in minutes with AWS Global Accelerator.

August 19, 2020 | 11:00 AM – 12:00 PM PT
AWS Lambda Data Storage: Choosing Between S3, EFS, and Local Storage
Learn when to choose Amazon S3, Amazon EFS, or Lambda Layers for your application with AWS Lambda Data Storage.

August 19, 2020 | 1:00 PM – 2:00 PM PT
Optimizing Cluster Utilization with EKS and Fargate
Learn how Amazon EKS and Fargate can work together and why you should consider deploying your Kubernetes applications on Fargate instead of traditional Kubernetes worker nodes.

August 20, 2020 | 9:00 AM – 10:00 AM PT
Streaming Data Pipelines for Real-Time Analytics – Are You Ready?
Learn best practices for building streaming data pipelines so you can focus on your data instead of managing infrastructure.

August 20, 2020 | 11:00 AM – 12:00 PM PT
Build Resilient, Easy to Manage Continuous Integration Workflows with AWS CodeBuild and AWS Step Functions
Learn how to use AWS CodeBuild and AWS Step Functions to easily set up branching workflows with manual approval steps for your continuous integration processes or data-processing applications.

August 20, 2020 | 1:00 PM – 2:00 PM PT
Improving Business Continuity with Amazon Aurora Global Database
Learn how to place your database where your customers are for low-latency reads and global disaster recovery.

August 21, 2020 | 9:00 AM – 10:00 AM PT
Intro to AWS Service Management Connector for ServiceNow
Learn how to simplify cloud resource management with the AWS Service Catalog Connector for ServiceNow.

August 21, 2020 | 11:00 AM – 12:00 PM PT
How to Use AWS Wavelength to Deliver Applications that Require Ultra-Low Latency to 5G Mobile Users
Learn how to use AWS Wavelength to develop and deploy applications that need low-latency access to end-user mobile devices.

August 21, 2020 | 1:00 PM – 2:00 PM PT
Deep Dive on Migrating and Modernizing Middleware Applications
Learn migration and modernization strategy, and patterns for middleware applications.

August 24, 2020 | 9:00 AM – 10:00 AM PT
AWS IoT and Amazon Kinesis Video Streams for Connected Home Applications
Learn how you can simplify the development and management of connected cameras and video streaming solutions for smart home use cases.

August 24, 2020 | 11:00 AM – 12:00 PM PT
Introduction to Quantum Computing on AWS
Learn the best practices to get started with quantum computing with Amazon Braket.

August 24, 2020 | 1:00 PM – 2:00 PM PT
Embedding Analytics in Your Applications
Learn about new QuickSight capabilities for embedding analytics and how customers are leveraging QuickSight’s serverless architecture to easily build and scale their embedded analytics applications.

August 25, 2020 | 9:00 AM – 10:00 AM PT
DNS Design Using Amazon Route 53
Explore DNS design, the capabilities of Amazon Route 53 DNS, and architectural considerations for hybrid networks.

August 25, 2020 | 11:00 AM – 12:00 PM PT
Amazon EMR Deep Dive and Best Practices
Learn design patterns and architectural best practices to get the most from your big data with Amazon EMR.

August 25, 2020 | 1:00 PM – 2:00 PM PT
Simply and Seamlessly Set Up Secure File Transfers to Amazon S3 Over SFTP and Other Protocols
Learn how to simply and seamlessly set up secure file transfers to Amazon S3 using AWS Transfer Family.

August 26, 2020 | 9:00 AM – 10:00 AM PT
Enhanced CI/CD with AWS CDK
Learn how to use the AWS Cloud Development Kit and AWS CodePipeline to create a construct library that makes it even easier to set up continuous delivery pipelines for your CDK applications.

August 26, 2020 | 11:00 AM – 12:00 PM PT
Forrester Research Analyzes the Total Economic Impact™ of Amazon Connect
Learn about the Forrester framework and to evaluate the financial impact of working with Amazon Connect, covering ROI benefits, costs, risk, and flexibility.

August 26, 2020 | 1:00 PM – 2:00 PM PT
Best Practices for Data Protection on Amazon S3
Learn how to protect your data by leveraging S3 features to ensure your data meets compliance requirements.

August 27, 2020 | 9:00 AM – 10:00 AM PT
How to Optimize for Cost When Using Amazon DocumentDB (with MongoDB Compatibility)
Learn how to optimize for cost when using Amazon DocumentDB (with MongoDB compatibility).

August 27, 2020 | 11:00 AM – 12:00 PM PT
Accelerating Embedded Linux Solutions for the AWS Cloud
Learn how to accelerate Embedded Linux solutions for the AWS Cloud.

August 27, 2020 | 1:00 PM – 2:00 PM PT
SQL Server Cost Optimization on AWS
Learn how to optimize the cost of SQL Server workloads on AWS across EC2 and RDS SQL Server to diversify and optimize your current licensing investments, think strategically about licensing in the cloud, and bring your own licenses to AWS.

August 28, 2020 | 9:00 AM – 10:00 AM PT
How to Centrally Audit and Remediate VPC Security Groups Using AWS Firewall Manager
Learn how to use AWS Firewall Manager to centrally audit VPC security groups across your accounts and resources using different pre-configured and custom audit polices.

August 28, 2020 | 11:00 AM – 12:00 PM PT
Increase Employee Productivity and Delight Customers with Cognitive Search Using Amazon Kendra
Discover how Amazon Kendra provides highly accurate and easy to use enterprise search powered by machine learning.

August 28, 2020 | 1:00 PM – 2:00 PM PT
How to Maximize the Value of Your Analytics Stack with an Amazon Redshift Data Warehouse
Learn how to maximize the value of your analytics stack with an Amazon Redshift data warehouse.

Amazon ECS Now Supports EC2 Inf1 Instances

As machine learning and deep learning models become more sophisticated, hardware acceleration is increasingly required to deliver fast predictions at high throughput. Today, we’re very happy to announce that AWS customers can now use the Amazon EC2 Inf1 instances on Amazon ECS, for high performance and the lowest prediction cost in the cloud. For a few weeks now, these instances have also been available on Amazon Elastic Kubernetes Service.

A primer on EC2 Inf1 instances
Inf1 instances were launched at AWS re:Invent 2019. They are powered by AWS Inferentia, a custom chip built from the ground up by AWS to accelerate machine learning inference workloads.

Inf1 instances are available in multiple sizes, with 1, 4, or 16 AWS Inferentia chips, with up to 100 Gbps network bandwidth and up to 19 Gbps EBS bandwidth. An AWS Inferentia chip contains four NeuronCores. Each one implements a high-performance systolic array matrix multiply engine, which massively speeds up typical deep learning operations such as convolution and transformers. NeuronCores are also equipped with a large on-chip cache, which helps cut down on external memory accesses, saving I/O time in the process. When several AWS Inferentia chips are available on an Inf1 instance, you can partition a model across them and store it entirely in cache memory. Alternatively, to serve multi-model predictions from a single Inf1 instance, you can partition the NeuronCores of an AWS Inferentia chip across several models.

Compiling Models for EC2 Inf1 Instances
To run machine learning models on Inf1 instances, you need to compile them to a hardware-optimized representation using the AWS Neuron SDK. All tools are readily available on the AWS Deep Learning AMI, and you can also install them on your own instances. You’ll find instructions in the Deep Learning AMI documentation, as well as tutorials for TensorFlow, PyTorch, and Apache MXNet in the AWS Neuron SDK repository.
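
For TensorFlow 1.x, compiling a SavedModel comes down to a couple of lines. This is a minimal sketch, assuming the tensorflow-neuron package is installed and using placeholder input/output paths:

import tensorflow.neuron as tfn

# Compile a regular SavedModel into a Neuron-optimized SavedModel for Inf1.
tfn.saved_model.compile('bert_saved_model', 'bert_saved_model_neuron')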

In the demo below, I will show you how to deploy a Neuron-optimized model on an ECS cluster of Inf1 instances, and how to serve predictions with TensorFlow Serving. The model in question is BERT, a state of the art model for natural language processing tasks. This is a huge model with hundreds of millions of parameters, making it a great candidate for hardware acceleration.

Creating an Amazon ECS Cluster
Creating a cluster is the simplest thing: all it takes is a call to the CreateCluster API.

$ aws ecs create-cluster --cluster-name ecs-inf1-demo

Immediately, I see the new cluster in the console.

New cluster

Several prerequisites are required before we can add instances to this cluster:

  • An AWS Identity and Access Management (IAM) role for ECS instances: if you don’t have one already, you can find instructions in the documentation. Here, my role is named ecsInstanceRole.
  • An Amazon Machine Image (AMI) containing the ECS agent and supporting Inf1 instances. You could build your own, or use the ECS-optimized AMI for Inferentia. In the us-east-1 region, its id is ami-04450f16e0cd20356.
  • A Security Group, opening network ports for TensorFlow Serving (8500 for gRPC, 8501 for HTTP). The identifier for mine is sg-0994f5c7ebbb48270.
  • If you’d like to have ssh access, your Security Group should also open port 22, and you should pass the name of an SSH key pair. Mine is called admin.

We also need to create a small user data file in order to let instances join our cluster. This is achieved by storing the name of the cluster in an environment variable, itself written to the configuration file of the ECS agent.

#!/bin/bash
echo ECS_CLUSTER=ecs-inf1-demo >> /etc/ecs/ecs.config

We’re all set. Let’s add a couple of Inf1 instances with the RunInstances API. To minimize cost, we’ll request Spot Instances.

$ aws ec2 run-instances \
--image-id ami-04450f16e0cd20356 \
--count 2 \
--instance-type inf1.xlarge \
--instance-market-options '{"MarketType":"spot"}' \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=ecs-inf1-demo}]' \
--key-name admin \
--security-group-ids sg-0994f5c7ebbb48270 \
--iam-instance-profile Name=ecsInstanceRole \
--user-data file://user-data.txt

Both instances appear right away in the EC2 console.

Inf1 instances

A couple of minutes later, they’re ready to run tasks on the cluster.

Inf1 instances

Our infrastructure is ready. Now, let’s build a container storing our BERT model.

Building a Container for Inf1 Instances
The Dockerfile is pretty straightforward:

  • Starting from an Amazon Linux 2 image, we open ports 8500 and 8501 for TensorFlow Serving.
  • Then, we add the Neuron SDK repository to the list of repositories, and we install a version of TensorFlow Serving that supports AWS Inferentia.
  • Finally, we copy our BERT model inside the container, and we load it at startup.

Here is the complete file.

FROM amazonlinux:2
EXPOSE 8500 8501
RUN echo $'[neuron] \n\
name=Neuron YUM Repository \n\
baseurl=https://yum.repos.neuron.amazonaws.com \n\
enabled=1' > /etc/yum.repos.d/neuron.repo
RUN rpm --import https://yum.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB
RUN yum install -y tensorflow-model-server-neuron
COPY bert /bert
CMD ["/bin/sh", "-c", "/usr/local/bin/tensorflow_model_server_neuron --port=8500 --rest_api_port=8501 --model_name=bert --model_base_path=/bert/"]

Then, I build and push the container to a repository hosted in Amazon Elastic Container Registry. Business as usual.

$ docker build -t neuron-tensorflow-inference .

$ aws ecr create-repository --repository-name ecs-inf1-demo

$ aws ecr get-login-password | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

$ docker tag neuron-tensorflow-inference 123456789012.dkr.ecr.us-east-1.amazonaws.com/ecs-inf1-demo:latest

$ docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/ecs-inf1-demo:latest

Now, we need to create a task definition in order to run this container on our cluster.

Creating a Task Definition for Inf1 Instances
If you don’t have one already, you should first create an execution role, i.e. a role allowing the ECS agent to perform API calls on your behalf. You can find more information in the documentation. Mine is called ecsTaskExecutionRole.

The full task definition is visible below. As you can see, it holds two containers:

  • The BERT container that I built,
  • A sidecar container called neuron-rtd, that allows the BERT container to access NeuronCores present on the Inf1 instance. The AWS_NEURON_VISIBLE_DEVICES environment variable lets you control which ones may be used by the container. You could use it to pin a container on one or several specific NeuronCores.
{
  "family": "ecs-neuron",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "entryPoint": [
        "sh",
        "-c"
      ],
      "portMappings": [
        {
          "hostPort": 8500,
          "protocol": "tcp",
          "containerPort": 8500
        },
        {
          "hostPort": 8501,
          "protocol": "tcp",
          "containerPort": 8501
        },
        {
          "hostPort": 0,
          "protocol": "tcp",
          "containerPort": 80
        }
      ],
      "command": [
        "tensorflow_model_server_neuron --port=8500 --rest_api_port=8501 --model_name=bert --model_base_path=/bert"
      ],
      "cpu": 0,
      "environment": [
        {
          "name": "NEURON_RTD_ADDRESS",
          "value": "unix:/sock/neuron-rtd.sock"
        }
      ],
      "mountPoints": [
        {
          "containerPath": "/sock",
          "sourceVolume": "sock"
        }
      ],
      "memoryReservation": 1000,
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/ecs-inf1-demo:latest",
      "essential": true,
      "name": "bert"
    },
    {
      "entryPoint": [
        "sh",
        "-c"
      ],
      "portMappings": [],
      "command": [
        "neuron-rtd -g unix:/sock/neuron-rtd.sock"
      ],
      "cpu": 0,
      "environment": [
        {
          "name": "AWS_NEURON_VISIBLE_DEVICES",
          "value": "ALL"
        }
      ],
      "mountPoints": [
        {
          "containerPath": "/sock",
          "sourceVolume": "sock"
        }
      ],
      "memoryReservation": 1000,
      "image": "790709498068.dkr.ecr.us-east-1.amazonaws.com/neuron-rtd:latest",
      "essential": true,
      "linuxParameters": { "capabilities": { "add": ["SYS_ADMIN", "IPC_LOCK"] } },
      "name": "neuron-rtd"
    }
  ],
  "volumes": [
    {
      "name": "sock",
      "host": {
        "sourcePath": "/tmp/sock"
      }
    }
  ]
}

Finally, I call the RegisterTaskDefinition API to let the ECS backend know about it.

$ aws ecs register-task-definition --cli-input-json file://inf1-task-definition.json

We’re now ready to run our container, and predict with it.

Running a Container on Inf1 Instances
As this is a prediction service, I want to make sure that it’s always available on the cluster. Instead of simply running a task, I create an ECS Service that will make sure the required number of container copies is running, relaunching them should any failure happen.

$ aws ecs create-service --cluster ecs-inf1-demo \
--service-name bert-inf1 \
--task-definition ecs-neuron:1 \
--desired-count 1

A minute later, I see that both task containers are running on the cluster.

Running containers

Predicting with BERT on ECS and Inf1
The inner workings of BERT are beyond the scope of this post. This particular model expects a sequence of 128 tokens, encoding the words of two sentences we’d like to compare for semantic equivalence.

Here, I’m only interested in measuring prediction latency, so dummy data is fine. I build 100 prediction requests, each storing a sequence of 128 zeros. Using the IP address of the BERT container, I send them to the TensorFlow Serving endpoint via gRPC, and I compute the average prediction time.

Here is the full code.

import numpy as np
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
import time

if __name__ == '__main__':
    channel = grpc.insecure_channel('18.234.61.31:8500')
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'bert'
    i = np.zeros([1, 128], dtype=np.int32)
    request.inputs['input_ids'].CopyFrom(tf.contrib.util.make_tensor_proto(i, shape=i.shape))
    request.inputs['input_mask'].CopyFrom(tf.contrib.util.make_tensor_proto(i, shape=i.shape))
    request.inputs['segment_ids'].CopyFrom(tf.contrib.util.make_tensor_proto(i, shape=i.shape))

    latencies = []
    for i in range(100):
        start = time.time()
        result = stub.Predict(request)
        latencies.append(time.time() - start)
        print("Inference successful: {}".format(i))
    print ("Ran {} inferences successfully. Latency average = {}".format(len(latencies), np.average(latencies)))

For convenience, I’m running this code on an EC2 instance based on the Deep Learning AMI. It comes pre-installed with a Conda environment for TensorFlow and TensorFlow Serving, saving me from installing any dependencies.

$ source activate tensorflow_p36
$ python predict.py

On average, prediction took 56.5ms. As far as BERT goes, this is pretty good!

Ran 100 inferences successfully. Latency average = 0.05647835493087769

Getting Started
You can deploy Amazon Elastic Compute Cloud (EC2) Inf1 instances on Amazon ECS today in the US East (N. Virginia) and US West (Oregon) Regions. As Inf1 deployment progresses, you’ll be able to use them with Amazon ECS in more Regions.

Give this a try, and please send us feedback either through your usual AWS Support contacts, on the AWS Forum for Amazon ECS, or on the container roadmap on GitHub.

– Julien