Today I would like to talk about ways to make your website run faster and more efficiently. There are several benefits from doing this including better search engine rankings, a better customer experience, and an increase in sales or conversions.
You can always improve your architecture, tune your database, and optimize your code. In each case you locate a bottleneck, figure out a way around it, and then proceed to do so. You basically spend your time answering the question "What part of my application is running too slowly and how can I speed it up?" This is the classical approach to performance tuning; many of the techniques to measure and to improve performance date back to the time before the web (you do remember those days, don't you?).
In today's web world, "where" can be just as important as "what." Let's talk about that.
The distance between the user and the application is a major factor in the performance of a website or a web-based application. Latency increases with distance, and increased distance also means that each network packet will take more "hops" through intermediate routers and switches as it makes its way from browser to server and back. You probably can't move your users closer to your application, but you can often deploy your application closer to the user. AWS provides you with two different ways to do this:
Choose a Region - You can put your application into any of four different AWS Regions. You should choose the AWS region that's closest to the majority of your customers. As of this writing, you can choose to host your AWS-powered application on the east coast (Northern Virginia) or west coast (Northern California) of the United States, Europe (Ireland), or Asia Pacific (Singapore).
Use CloudFront Content Distribution - You can also speed up your application by using Amazon CloudFront to distribute web content (static HTML pages, CSS style sheets, JavaScript, images, file downloads, and video streams) stored in Amazon S3. Each user's requests will be served up from the nearest CloudFront edge location. There are currently fifteen such edge locations: eight in the United States, four in Europe, and three in Asia.
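If you'd like to put some numbers behind the "closest to the majority of your customers" advice, here's a minimal sketch in Python. Everything in it is an assumption for illustration: the round-trip times, the customer shares, and the short region labels are made up, and you would substitute your own measurements.

```python
# Hypothetical round-trip times (milliseconds) from each customer location
# to each candidate Region. These figures are invented for illustration.
RTT_MS = {
    "new_york":  {"us-east": 20,  "us-west": 80,  "eu-west": 90,  "ap-southeast": 250},
    "london":    {"us-east": 85,  "us-west": 150, "eu-west": 15,  "ap-southeast": 180},
    "singapore": {"us-east": 240, "us-west": 180, "eu-west": 190, "ap-southeast": 10},
}

# Hypothetical share of your customers at each location (sums to 1.0).
CUSTOMER_SHARE = {"new_york": 0.6, "london": 0.3, "singapore": 0.1}


def expected_latency(region):
    """Average round-trip time to `region`, weighted by customer share."""
    return sum(share * RTT_MS[loc][region] for loc, share in CUSTOMER_SHARE.items())


def best_region():
    """Pick the Region with the lowest customer-weighted latency."""
    regions = next(iter(RTT_MS.values())).keys()
    return min(regions, key=expected_latency)


print(best_region())  # us-east (for the made-up numbers above)
```

With these sample numbers most of the customers sit near the US east coast, so that Region wins even though it is the slowest choice for the Singapore users. Serving static content through CloudFront is one way to help the users that your chosen Region leaves far away.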
You might be thinking that you run a "small" or "unimportant" site and that you don't need or can't benefit from CloudFront. Given that a lot of the value of the web is found in the long tail content, I disagree. There's no harm (and plenty of potential benefit) from making your site quicker to load and more responsive. How do you think that the large, popular sites got that way? At least in part, they did so by being fast and responsive from the very beginning.
Or, you might think that CloudFront is somehow too expensive for you. Well, it's not. I've been serving up all of the images for this blog from CloudFront for a while. Here's my AWS account, reflecting an entire month of CloudFront usage:

That's right, less than $2 per month to make sure that readers all over the world are able to get to the images in the blog as quickly as possible. If nothing else, think of this as an inexpensive insurance policy that will protect you from overload in case your site shows up on SlashDot, Digg, Reddit, Hacker News, or TechCrunch one fine morning.
If you are interested in speeding up access to your application's web content, start out by reading our guide to Migrating to Amazon CloudFront. This guide will walk you through the 5 basic steps needed to sign up for CloudFront and S3, download and install the proper tools, upload your content to an S3 bucket, create a CloudFront distribution, and link to the content.
If you are using a CMS (Content Management System), take a look at these:
If you are doing video streaming, take a look at Use your Amazon CloudFront account, How to Get Started With Amazon CloudFront Streaming, and S3 News from the CloudFront: Private Streaming Video.
Encoding.com has put together a guide to Apple HTTP Streaming with Amazon CloudFront. The guide includes complete step-by-step directions and gives you all the information you'll need to stream video to an iPad or an iPhone.
Carson McDonald has also put some work into an HTTP Segmenter and Streamer for Apple's HTTP Live Streaming Protocol. Learn more in his series of three blog posts.
Last but not least, don't forget about Using the JW Player with Amazon Web Services. JW Player is a very popular open source video player that's also easy to embed and use. Use the JW Player test page to experiment with the configuration and setup options.
There are a number of good testing tools that you can use to measure the speed of your site while you are in the process of migrating to CloudFront. Here are two:
As an example of just how location affects perceived website performance, take a look at this chart from the BrowserMob test:

There's a 4:1 ratio between the fastest and the slowest location. I compared the detailed output for two separate objects: the blog's home page and the volcano picture that I posted last week. The results show that CloudFront produces results that are reasonably consistent, regardless of where the test is run from:
| Location | Home Page | Volcano |
| Washington, DC | 323 ms | 140 ms |
| Dublin | 631 ms | 227 ms |
| San Francisco | 45 ms | 210 ms |
| Dallas | 200 ms | 252 ms |
You can also see the amount of time it takes to look up each item's DNS address, the time until the first byte of data arrives, and the amount of time spent reading the data:
Still need more info? Take a look at some of these case studies:
One more thing, and then I'll shut up! We recently recorded a webinar, Increasing the Performance of your Website With Amazon CloudFront.

-- Jeff;
We have a whole new way for you to request access to Amazon EC2 processing power!
Using our new Spot Instances, you can bid for one or more EC2 instances at the price you are willing to pay. Your Spot Instance request consists of a number of parameters including the maximum bid that you are willing to pay per hour, the EC2 Region where you need the instances, the number and type of instances you want to run, and the AMI that you want to launch if your bid is successful.
As requests come in and unused capacity becomes available, we'll evaluate the open bids for each Region and compute a new Spot Price for each instance type. After that we'll terminate any Spot Instances with bids below the Spot Price, and launch instances for requests with bids higher than or at the new Spot Price. The instances will be billed at the then-current Spot Price regardless of the actual bid, which can mean a substantial potential cost savings versus the bid amount.
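Here's a small Python sketch of the rule described above: bids at or above the Spot Price run, bids below it don't, and everyone who runs is billed at the Spot Price rather than at their bid. The request names, bids, and Spot Price are made up, and the sketch says nothing about how EC2 actually computes the Spot Price.

```python
from dataclasses import dataclass


@dataclass
class SpotRequest:
    name: str
    max_bid: float  # most the requester will pay, in dollars per instance-hour


def split_by_price(requests, spot_price):
    """Return (fulfilled, unfulfilled) for a given Spot Price.

    Bids at or above the Spot Price run; bids below it do not.
    """
    fulfilled = [r for r in requests if r.max_bid >= spot_price]
    unfulfilled = [r for r in requests if r.max_bid < spot_price]
    return fulfilled, unfulfilled


def savings_per_hour(request, spot_price):
    """Gap between the bid and what is actually billed (the Spot Price)."""
    return request.max_bid - spot_price


# Made-up bids and a made-up Spot Price, purely for illustration.
requests = [
    SpotRequest("crawler", 0.03),
    SpotRequest("transcoder", 0.05),
    SpotRequest("analysis", 0.10),
]
spot_price = 0.05

fulfilled, unfulfilled = split_by_price(requests, spot_price)
print([r.name for r in fulfilled])    # ['transcoder', 'analysis']
print([r.name for r in unfulfilled])  # ['crawler']
```

In this example the "analysis" request bid 10 cents but would be billed at the 5 cent Spot Price, which is the potential savings mentioned above.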
You'll be able to track changes to the Spot Price over time using the EC2 API or the AWS Management Console. This means that you can now create intelligent, value-based scheduling tools to get the most value from EC2. I'm really looking forward to seeing what kinds of tools and systems emerge in this space.
From an architectural point of view, because EC2 will terminate instances whose bid price becomes lower than the Spot Price, you'll want to regularly checkpoint work in progress, perhaps using Amazon SimpleDB or an Elastic Block Store (EBS) volume. You could also architect your application so that it pulls work from an Amazon SQS Queue, counting on the SQS visibility timeout to return any unfinished work back to the queue if it is running on a Spot Instance that is terminated. Many types of work are suitable for this incremental, background processing model including web crawling, data analysis, and data transformation (e.g. media transcoding). It wouldn't make much sense to run a highly available application such as a web server or a database on a Spot Instance, though.
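To make the queue-based pattern a bit more concrete, here's a toy, in-memory model of a queue with a visibility timeout, written in Python. It is not the SQS API; the class and method names are hypothetical, and time is passed in explicitly so that the example is deterministic. The point is only to show why work held by a terminated Spot Instance comes back on its own.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._bodies = {}           # message id -> body
        self._invisible_until = {}  # message id -> time it becomes visible again
        self._next_id = 0

    def send(self, body):
        self._bodies[self._next_id] = body
        self._invisible_until[self._next_id] = 0.0
        self._next_id += 1

    def receive(self, now):
        """Hand out one visible message and hide it for the timeout period."""
        for msg_id, body in self._bodies.items():
            if self._invisible_until[msg_id] <= now:
                self._invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Called by a worker only after it has finished the work."""
        self._bodies.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("transcode video 17")

# Worker A (on a Spot Instance) takes the job at t=0, then is terminated
# before it can call delete().
first = queue.receive(now=0)

# While the message is invisible, nobody else can pick it up.
during = queue.receive(now=10)

# After the timeout the unfinished job is visible again; worker B finishes it.
second = queue.receive(now=31)
queue.delete(second[0])
after = queue.receive(now=100)

print(first, during, second, after)
```

The worker deletes a message only after the work is complete, so a termination at any earlier point simply lets the timeout expire. Your jobs need to be safe to run more than once for this to work well.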
You can use the Spot instances to make all sorts of time-vs-money-vs-value trade-offs. If you have some time sensitive work that is of high value, you can place a bid that's somewhat higher than the historical Spot Price and know that there's a higher likelihood that it will be fulfilled. If you have some time insensitive work, you can bid very low and have your work done when EC2 isn't overly busy, perhaps during nighttime hours for that Region. The trick will be to use the price history to understand what pricing environment to expect during the time frame that you plan to make a request for instances.
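As a rough sketch of how you might use the price history, the Python function below picks the lowest bid that would have been at or above the Spot Price for a chosen fraction of the observed samples. The function name, the `coverage` parameter, and the sample prices are all my own inventions for illustration, and past prices are of course no guarantee of future ones.

```python
import math


def suggest_bid(price_history, coverage):
    """Lowest historical price that was >= the Spot Price in at least
    `coverage` (0 < coverage <= 1) of the observed samples."""
    if not price_history:
        raise ValueError("price_history must not be empty")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    ordered = sorted(price_history)
    index = math.ceil(coverage * len(ordered)) - 1
    return ordered[index]


# Made-up hourly Spot Price samples (dollars per instance-hour).
history = [0.030, 0.031, 0.029, 0.035, 0.060, 0.032, 0.030, 0.045, 0.031, 0.030]

patient_bid = suggest_bid(history, coverage=0.5)  # time-insensitive work
urgent_bid = suggest_bid(history, coverage=0.9)   # time-sensitive work

print(patient_bid)  # 0.031
print(urgent_bid)   # 0.045
```

A higher coverage value trades money for a better chance of running, which is the time-vs-money-vs-value trade-off in miniature.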
Your requests can include a number of optional parameters for even more control:
Spot instances are supported by the EC2 API, the EC2 Command Line Tools, and the AWS Management Console. Here's a picture of the AWS Management Console in action:
Here's a good example of how the Spot Instances can be put to use.
The Protein Engineering group at Pfizer has been using AWS to model Antibody-Antigen interactions using a protein docking system. Their protocol utilizes a full stack of services including EC2, S3, SQS, SimpleDB and EC2 Spot instances (more info can be found in a recent article by BioTeam's Adam Kraut, a primary contributor to the implementation). BioTeam described this system as follows:
The most computationally intensive aspect of the protocol is an all-atom refinement of the docked complex resulting in more accurate models. This exploration of the solution space can require thousands of EC2 instances for several hours.
Here's what they do:
We have modified our pipeline to submit "must do" refinement jobs on standard EC2 instances and "nice to do" workloads to the Spot Instances. With large numbers of standard instances we want to optimize the time to complete the job. With the addition of Spot Instances to our infrastructure we can optimize for the price to complete jobs and cluster the results that we get back from spot. Not unlike volunteer computing efforts such as Rosetta@Home, we load the queue with tasks and then make decisions after we get back enough work units from the spot instances. If we're too low on the Spot bids we just explore less solution space. The more Spot Instances we acquire the more of the energy landscape we can explore.
Here is their architecture:
You can learn even more about Spot Instances by reading Werner Vogels' Expanding the Cloud - Amazon EC2 Spot Instances, Thorsten von Eicken's Bid for Your Instances, and the Introduction to Spot Instances.
So, what do you think? Is this cool, or what?
-- Jeff;
We've added some important new community features to our Public Data Sets and we've also added some new and intriguing data to our collection. I'm writing this post to bring you up to date on this unique AWS feature and thought I would also show you how to instantiate and use an actual public data set.
If the concept is new to you, allow me to give you a brief introduction. We have set up a centralized repository for large (tens or hundreds of gigabytes) public data sets, which we host at no charge. We currently have public data sets in a number of categories including Biology, Chemistry, Economics, Encyclopedic, Geographic, and Mathematics.
The data sets are stored in the form of EBS (Elastic Block Store) snapshots. These snapshots are used to create an EBS volume from scratch in a matter of seconds. Most data sets are available in formats suitable for use with both Linux and Windows. Once created, the volume is then mounted on an EC2 instance for processing. Once the processing is complete, the volume can be kept alive for further work, archived to S3, or simply deleted.
To make sure that you can get a lot of value from our Public Data Sets, we've added some new community features. Each set now has its own page within the AWS Resource Center. The page contains all of the information needed to start making use of the data, including submission information, creation date, update date, data source, and more. There's a dedicated discussion forum for each data set, and even (in classic Amazon style) room to enter a review and a rating.
We've also added a number of rich and intriguing data sets to our collection. Here's what's new:
We'll continue to add additional public data sets to our collection over the coming months. Please feel free to submit your own data sets for consideration, or to propose inclusion of data sets owned by others.
It is really easy to instantiate an instance of a public data set. I wanted to process the 2003-2006 US Economic Data. Here's what I need to do:
I hit the "Create" button, waited two seconds, and then hit "Refresh." The volume status changed from "creating" to "available" so I knew that my data was ready.
Once I am done I can simply unmount the volume, shut down the instance, and delete the volume. No fuss, no muss, and a total cost of 11 cents (10 cents for an hour of EC2 time and a penny or so for the actual EBS volume).
--Jeff;
Security is a top priority for Amazon Web Services. Providing a trustworthy infrastructure for you to develop and deploy applications is a responsibility we take very seriously. One important aspect of gaining your trust is being open and transparent about our security processes and continually working toward achieving industry-recognized certifications. Other important aspects include providing you with mechanisms for contacting us about potential security issues and enabling you to conduct security tests of the applications you deploy on AWS. I'm pleased to announce today two new policies: one that outlines our vulnerability reporting process and one that describes how to receive permission to conduct penetration tests of the applications running on your EC2 instances.
A new page in the AWS Security Center describes our vulnerability reporting process. The process is high-priority for us, it's human-driven, and is governed by a service level commitment. Like other technology providers, we believe in the concept of responsible disclosure: let's work together to protect everyone.
Another page in the Security Center describes our penetration testing procedure. Normally, conducting such tests violates our Acceptable Use Policy because these tests are often indistinguishable from real attacks. However, to ensure higher degrees of application security, external testing is an important phase of development and deployment. We put the procedure in place so that we won't respond to your testing as if your instances were under attack.
The e-mail address aws-security@amazon.com is your single point of contact for all things security-related. If you need to contact us about a particularly sensitive issue, you can encrypt your message with our PGP public key. And, of course, if you suspect abuse of EC2 or other AWS services, our abuse reporting process remains in place.
Finally, a small navigational change. We've moved the bulletins off the main page and onto a separate security bulletin list and changed the format so that all bulletins are displayed rather than just the most recent five.
As always, we welcome your comments and feedback. We're here to help you succeed!
> Steve <
AWS is not only a rich platform to build products and solutions but also a platform to build specialized platforms. The inherent flexibility of the AWS cloud enables businesses to use it as a platform in a variety of different ways. Some of these platforms are highlighted in my blog post titled The Cloud as a Platform for Platforms.
One such platform that is gaining a lot of steam in the financial services industry is MarketSimplified. They provide a mobile trading platform on top of AWS and specialize in making online brokerages fully mobile.
Customers of MarketSimplified not only get powerful features of the MarketSimplified Platform such as cross-device compatibility, support for multiple mobile OS, manageability, and on-demand analytics of transactions but also the scalability, elasticity and reliability of the AWS cloud. All this with no upfront capital expenditure or mobile application development overhead. Their SaaS Middleware platform combines the power of mobile and cloud computing.
So far, they are touting 11M+ Messages, and over $1B in Trade Value processed. They have powered mobile applications provided by TD Ameritrade, ChoiceTrade, IIFL, FXCM, OptionsXpress, PFGBEST, and tradeMonster.
If you would like to know more about them and their technology and how they leverage the AWS cloud, you can read our case study or meet them personally at SIFMA's 30th Annual Financial Services Technology Expo.
-- Jinesh
You can now launch Amazon EC2 instances from an AMI backed by Amazon EBS (Elastic Block Store). This new functionality enables you to launch an instance with an Amazon EBS volume that serves as the root device.
This new feature brings a number of important performance and operational benefits and also enables some really powerful new features:
Let's compare and contrast the original S3-based boot process and the new EBS-based process. Here's what happens when you boot from an AMI that references an image residing in S3:
Now, here's what happens when you boot from an AMI that references an image residing in EBS:
Up until this point the two processes are quite similar. However, the new model allows the instance to be stopped (shut down cleanly and the EBS volumes preserved) at any point and then rebooted later. Here's the process:
At this point the instance neither consumes nor possesses any compute hardware and is not accruing any compute hours. While the instance is stopped, the new ModifyInstanceAttribute function can be used to change instance attributes such as the instance type (small, medium, large, and so forth), the kernel, the user data, and so forth. The instance's Id remains valid while the instance is stopped, and can be used as the target of a start request. Here's what happens then:
When the instance is finally terminated, the EBS volumes will be deleted unless the deleteOnTermination flag associated with the volume was cleared prior to the termination request.
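The stop, modify, start, and terminate sequence described above can be sketched as a toy state machine in Python. This is not the EC2 API; the class, its method names, and the instance Id are hypothetical. The toy only allows the instance type to change while the instance is stopped, and it keeps any volume whose delete-on-termination flag was cleared.

```python
class ToyInstance:
    """Toy model of an EBS-backed instance's lifecycle (not the EC2 API)."""

    def __init__(self, instance_id, instance_type, volumes):
        self.instance_id = instance_id
        self.instance_type = instance_type
        self.state = "running"
        # volume name -> delete-on-termination flag
        self.volumes = dict(volumes)

    def stop(self):
        if self.state != "running":
            raise RuntimeError("only a running instance can be stopped")
        self.state = "stopped"  # volumes are preserved

    def modify_instance_type(self, new_type):
        if self.state != "stopped":
            raise RuntimeError("this toy only modifies stopped instances")
        self.instance_type = new_type

    def start(self):
        if self.state != "stopped":
            raise RuntimeError("only a stopped instance can be started")
        self.state = "running"  # same instance Id as before

    def terminate(self):
        """Terminate and return the names of the volumes that survive."""
        self.state = "terminated"
        return [name for name, delete in self.volumes.items() if not delete]


instance = ToyInstance(
    "i-example",  # hypothetical Id
    "small",
    {"root": True, "data": False},  # the data volume's flag was cleared
)
instance.stop()
instance.modify_instance_type("large")
instance.start()
survivors = instance.terminate()

print(instance.instance_id, instance.instance_type, survivors)
# i-example large ['data']
```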
We made a number of other additions and improvements along the way including a disableApiTermination flag on each instance to protect your instances from accidental shutdowns, a new Description field for each AMI, and a simplified AMI creation process (one that works for both Linux and Windows) based on the new CreateImage function.
Detailed information about all of the new features can be found in the EC2 documentation. You should also take a look at the new Boot from EBS Feature Guide. This handy document includes tutorials on Running an instance backed by Amazon EBS, stopping and starting an instance, and bundling an instance backed by Amazon EBS. It also covers some advanced options and addresses some frequently asked questions about this powerful new feature.
I recently spent some time using this new feature and I had an enjoyable (and very productive) time doing so. I built a scalable, distributed ray tracing system around the venerable POV-Ray program. I was able to test and fine-tune the startup behavior of my EC2 instance without the need to create a new AMI for each go-round. Once I had it working as desired, I created the AMI and then enjoyed quicker boot times as I brought additional EC2 instances online and into my "farm" of ray-tracing instances.
I'll be publishing an article with full details on what I built in the near future, so stay tuned!
-- Jeff;
We've added two new features to Amazon SimpleDB to make it even easier for you to implement several different data storage and retrieval scenarios.
The first new feature allows you to do a consistent read. Up until now, SimpleDB implemented eventually consistent reads. You now have the option to choose the type of read which best meets the needs of each part of your application. Before I dive into the specifics, here's a quick guide to the two types of reads:
SimpleDB's Select and GetAttributes functions now accept an optional ConsistentRead flag. This flag has a default value of false, so existing applications will continue to use eventually consistent reads. If the flag is set to true, SimpleDB will return a consistent read.
The second new feature allows you to issue SimpleDB PutAttributes and DeleteAttributes operations on a conditional basis. In other words, you can tell SimpleDB to perform the indicated operation if and only if a given single-valued attribute has the value specified in the PutAttributes or DeleteAttributes call. You can easily implement counters (the value itself is effectively the version number), delete accounts only if the current balance is zero, and insert an item only if it does not exist.
You can combine consistent reads and conditional operations to implement a form of optimistic concurrency control, or OCC. Let's say that your form-based web application allows users to update their accounts and that you built it with SimpleDB. If you store a version number with each SimpleDB item, you can keep the data consistent even if several users or applications attempt to edit the same record at the same time. You would retrieve the data for display using an eventually consistent read, and then display the form so that the user can edit it. You would also read the associated version number from SimpleDB and associate it with the form or the editing session. When the user modifies and then attempts to save the data, you would use a conditional PutAttributes call to make sure that the data hadn't been changed. If the update fails, you'd need to invoke some application-specific action to resolve the conflict before proceeding. OCC often obviates the need for long-term availability-impacting locks, transactions, timeouts, and other complex and sometimes messy programming constructs.
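Here's a minimal Python sketch of the version-number approach described above. It uses a toy in-memory store rather than the SimpleDB API, so the class, the `put_if` method, and the `ConditionFailed` exception are hypothetical names of my own. The shape of the logic is what matters: read the version, edit, and write only if the version is still the one you read.

```python
class ConditionFailed(Exception):
    """Raised when the expected attribute value does not match."""


class ToyStore:
    """In-memory stand-in for an item store with conditional puts."""

    def __init__(self):
        self._items = {}

    def get(self, item_name):
        return dict(self._items.get(item_name, {}))

    def put_if(self, item_name, attributes, expected_name, expected_value):
        """Write only if `expected_name` currently equals `expected_value`.

        An expected value of None means "the attribute must not exist yet".
        """
        current = self._items.get(item_name, {})
        if current.get(expected_name) != expected_value:
            raise ConditionFailed(item_name)
        self._items[item_name] = {**current, **attributes}


def save_edit(store, item_name, changes, version_seen):
    """Save a form edit, bumping the version, if nobody else saved first."""
    new_attributes = dict(changes, version=version_seen + 1)
    store.put_if(item_name, new_attributes, "version", version_seen)


store = ToyStore()
# Insert only if the item does not exist yet.
store.put_if("account-1", {"email": "old@example.com", "version": 1}, "version", None)

# Two users open the edit form and both see version 1.
seen_by_alice = store.get("account-1")["version"]
seen_by_bob = store.get("account-1")["version"]

save_edit(store, "account-1", {"email": "alice@example.com"}, seen_by_alice)

try:
    save_edit(store, "account-1", {"email": "bob@example.com"}, seen_by_bob)
    bob_saved = True
except ConditionFailed:
    bob_saved = False  # the application must now resolve the conflict

print(store.get("account-1"), bob_saved)
```

The second save fails because the stored version has moved on to 2, and the first user's change is preserved. What happens next (reload the form, merge the changes, or ask the user) is up to the application.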
You can read more about consistency models in Werner's newest blog post (he may be in mid-flight as I write this, but I am confident that his post will show up soon).
-- Jeff;
We're kicking off the third annual AWS Start-Up Challenge now.
We're looking for the hottest and coolest start-ups and start-up ideas. Developers and entrepreneurs in the United States, United Kingdom, Germany, and Israel are encouraged to enter for a chance to win $50,000 in cash, $50,000 in AWS credits, mentoring sessions from AWS technical experts, and AWS Premium Support Gold for one year.
To enter, fill out and submit the online application by August 26, 2009. The judging panel will review all of the applications and choose the seven best, based on originality and creativity, likelihood of long-term success, monetization strategy, quality of proposal, and effective use of AWS.
The finalists will be announced in October. At that time we will post a video of each finalist and invite the public to vote for their favorite. Then we'll fly all of the finalists to Silicon Valley where they'll present their ideas to the judges' panel during the day, and pitch them to a live audience of entrepreneurs and venture capitalists that night, where the winner will be chosen, announced, and feted.
All runner-up finalists will receive $5,000 in AWS service credits; all entrants with qualified submissions will receive $25 credits.
The Challenge finalist with the most creative monetization model using the Amazon Flexible Payments Service (FPS) or Simple Pay from Amazon Payments will win $10,000 in combined cash and Amazon Payments credits. All finalists using these services will receive $2,500 in Amazon Payments credits. Read more here.
Questions? Check out the contest rules, review the prizes, and scan the FAQ. You may also want to watch the videos we made for the 2007 and 2008 finalists.
-- Jeff;
We developed our first proof of concept using EC2 and S3 back in 2006.
From the financial point of view, AWS made prototyping in early stages and real world scenarios really affordable. From the technical point of view, AWS took care of the "hardware" part for us, thus allowing the developers to focus on development. We saved a lot of time and effort, and the "time to market" of our solution was considerably shorter.
From the experience of working for and communicating with established TV broadcasting companies, we knew that upscaling is and always has been a main issue. At the same time, we think that downscaling is also to be considered due to financial constraints. This is why we have developed a video load balancing solution which is able to decide how many streaming instances are needed. This also allows our customers to benefit from the AWS "pay as you go" payment model.
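(Editor's sketch: the interview doesn't describe how the load balancer makes its decision, so the Python below is only a guess at the simplest possible version of "decide how many streaming instances are needed." The function name, the per-instance capacity, and the viewer counts are all hypothetical.)

```python
import math


def instances_needed(concurrent_viewers, viewers_per_instance, minimum=1):
    """Smallest instance count that covers the current audience,
    never dropping below `minimum` so the service stays reachable."""
    if viewers_per_instance <= 0:
        raise ValueError("viewers_per_instance must be positive")
    return max(minimum, math.ceil(concurrent_viewers / viewers_per_instance))


# Made-up capacity: each streaming instance handles 200 viewers.
print(instances_needed(0, 200))     # 1  (scaled down, but never to zero)
print(instances_needed(450, 200))   # 3
print(instances_needed(2000, 200))  # 10
```

Scaling the count in both directions is what lets the "pay as you go" model pay off: quiet hours cost one instance, and a traffic spike costs more only while it lasts.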
Can you give us some more technical details?
Using high-CPU EC2 instances, the encoding cloud receives jobs from SQS, the Amazon Simple Queue Service; it then transcodes them into various formats and stores the results on S3. The videos are delivered by our playout cloud using Wowza Flash Streaming technology, which is also powered by Amazon EC2. Asset management is done using our IVMS (Incredible Video Management System) written in Django, persisting its data in a MySQL database on EBS (where regular snapshots on S3 save us backup trouble). To prevent spikes in server load and reduce latency, we deploy SvM-Video-Workflow static files and rendered JSON on S3 and use the content delivery service CloudFront where possible.
What significant benefits have you experienced?
In comparison to the main market solutions for film distribution on the web, we were able to save about 80% of the usual running costs. Using AWS has enabled us to develop our projects in small core teams and still deliver on time, with the advantage of saving a great deal of headaches for the admins.
I remember you told me about the day when you were featured on SPIEGEL online. How was it?
On May 8, 2009, just thirty minutes before launch, dctp.tv got featured on SPIEGEL online. Our phone rang, ordering us to change the entire imprint. With shaky hands, we found ourselves deploying a new version in a matter of minutes and praying the CloudFront edge servers updated on time. And yet, watching the server stats handling 50 hits per second and realizing that the load balancing module and the streaming cloud were working as they should was so enjoyable it made it all worthwhile.
All in all we had been working on our load balancing algorithms for many months without the opportunity of real world testing. The launch turned out to be our breakthrough, as everything worked as expected. And it has kept on doing just this ever since.
(comment from Simone: next time you need to load test your app, you should try Soasta.com)
Nikolai, do you have any suggestion for other AWS users?
Our experience is that the AWS team genuinely listens to the users' needs and requirements.
A lot of the AWS features that we use today came to tackle problems that had been reported by users when we first started. So good work, and keep in touch :)
I believe our readers will find this story inspiring. Thanks Nikolai.
- Simone (@simon)
I am very happy to announce my white paper on Cloud Architectures is now ready. This is one incarnation of the Emerging Cloud Service Architectures that Jeff wrote about a few weeks ago.
If you are new to the cloud, the first section of the paper will help you understand the benefits of building applications in-the-cloud. If you are using the cloud already, the second section of the paper will help you to use the cloud more effectively by utilizing some of the best practices.
In this paper, I discuss a new way to design architectures. Cloud Architectures are Services-Oriented Architectures that are designed to use On-demand infrastructure more effectively. Applications built on Cloud Architectures are such that the underlying computing infrastructure is used only when it is needed (for example to process a user request), draw the necessary resources on-demand (like compute servers or storage), perform a specific job, then relinquish the unneeded resources after the job is done. While in operation the application scales up or down elastically based on actual need for resources. Everything is automated and operates without any human intervention.
As an example of a Cloud Architecture, I discuss the GrepTheWeb application. This application runs a regular expression against millions of documents from the web and returns the filtered results which match the query. The architecture is interesting because it runs completely on-demand in an automated fashion. Triggered by a regex request, hundreds of Amazon EC2 instances are launched, a Hadoop cluster is started on them, transient messages are stored on Amazon SQS queues, statuses in Amazon SimpleDB, and all Map/Reduce jobs are run in parallel. Each map task fetches a file from Amazon S3 and runs the regular expression against it. The results are then aggregated and, once the Hadoop job has been processed, all of the infrastructure is released back into the cloud.
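To give a feel for the map and reduce steps, here is a tiny single-process sketch in Python. It is not the GrepTheWeb code and it does not use Hadoop, EC2, or S3; the function names and sample documents are made up, and the documents are held in an ordinary dictionary instead of being fetched from storage.

```python
import re
from collections import defaultdict


def map_document(doc_id, text, pattern):
    """Map step: emit (doc_id, line) for every line that matches."""
    for line in text.splitlines():
        if pattern.search(line):
            yield doc_id, line


def reduce_matches(pairs):
    """Reduce step: aggregate the matching lines per document."""
    results = defaultdict(list)
    for doc_id, line in pairs:
        results[doc_id].append(line)
    return dict(results)


def grep_documents(documents, regex):
    """Run the map step over every document, then reduce the output."""
    pattern = re.compile(regex)
    pairs = []
    for doc_id, text in documents.items():
        pairs.extend(map_document(doc_id, text, pattern))
    return reduce_matches(pairs)


# Made-up documents standing in for files stored elsewhere.
documents = {
    "a.html": "cloud computing\nplain text",
    "b.html": "nothing here",
    "c.html": "Hadoop on EC2\nEC2 and S3",
}

matches = grep_documents(documents, r"EC2|cloud")
print(matches)
# {'a.html': ['cloud computing'], 'c.html': ['Hadoop on EC2', 'EC2 and S3']}
```

Because each map call looks at one document and shares nothing with the others, the map step is the part that can be spread across hundreds of instances.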
GrepTheWeb is one of many applications built by Amazon that uses all our services (Amazon EC2, Amazon SimpleDB, Amazon SQS, Amazon S3) together.
A wide variety of applications can be built using this design approach, from nightly batch processing systems to media processing pipelines.
An excerpt:
Cloud Architectures address key difficulties surrounding large-scale data processing. In traditional data processing it is difficult to get as many machines as an application needs. Second, it is difficult to get the machines when one needs them. Third, it is difficult to distribute and co-ordinate a large-scale job on different machines, run processes on them, and provision another machine to recover if one machine fails. Fourth, it is difficult to auto-scale up and down based on dynamic workloads. Fifth, it is difficult to get rid of all those machines when the job is done. Cloud Architectures solve such difficulties.
Applications built on Cloud Architectures run in-the-cloud where the physical location of the infrastructure is determined by the provider. They take advantage of simple APIs of Internet-accessible services that scale on-demand, that are industrial-strength, where the complex reliability and scalability logic of the underlying services remains implemented and hidden inside-the-cloud. The usage of resources in Cloud Architectures is as needed, sometimes ephemeral or seasonal, thereby providing the highest utilization and optimum bang for the buck.
In the first section I discuss the advantages and business benefits of Cloud Architectures and how each service was used. In the second section, I discuss best practices for the various Amazon Web Services.
You can download the PDF version or access it on the AWS Resource Center.
I talked about this briefly at the Hadoop Summit 2008 and QCon 2007. I got some good reviews after the talk and hence I decided to put all my thoughts in this paper along with some Best Practices for the use of Amazon Web Services (Amazon EC2, Amazon SQS, Amazon S3 and Amazon SimpleDB together). Many developers from our community have been asking for a real-world example of a complex, large-scale application. I will be presenting this paper at the 2008 NSF Data-Intensive Scalable Computing Workshop at UW and 9th IEEE/NATEA Conference on Cloud Computing later this week.
I believe this new and emerging way of building applications, that run in-the-cloud, is going to change the way we do business.
-- Jinesh
Here are some good resources for current and potential users of our Elastic Load Balancing, Auto Scaling, and Amazon CloudWatch features:
Version 1.8a of the popular Boto library for AWS now supports all three of the new features. Written in Python, Boto provides access to Amazon EC2, Amazon S3, Amazon SQS, Amazon Mechanical Turk, Amazon SimpleDB, and Amazon CloudFront. The Elastician Blog has some more info.
The Elastician Blog also has a good article with a complete example of how to use CloudWatch from Boto. After creating the connection object, one call initiates the monitoring operation and two other calls provide access to the collected statistics.
The Paglo
monitoring system can now make use of the statistics collected by
CloudWatch. You will need to install the open source Paglo Crawler on your EC2 instances. More info on Paglo can be found here.
The IT Architects at The Server Labs have put together some great blog posts. The first one, Setting up a load-balanced Oracle Weblogic cluster in Amazon EC2, contains all of the information needed to set up a two node cluster. The second one, Full Weblogic Load-Balancing in EC2 with Amazon ELB, shows how to use the Elastic Load Balancer to front a pair of Apache servers which, in turn, direct traffic to a three node Weblogic cluster to increase scalability and availability.
Speaking of availability and durability, you should definitely check out the DZone reference card on the topic. The card provides a detailed yet concise introduction to the two topics in just 6 pages. Topics covered include horizontal scalability, vertical scalability, high availability, measurement, analysis, load balancing, application caching, web caching, clustering, redundancy, fault detection, and fault tolerance.
Author and blogger Ramesh Rajamani wrote a detailed paper on the topic of Dynamically Scaling Web Applications in Amazon EC2. Although the paper predates the release of the Elastic Load Balancer and Auto Scaling, the approach to scaling is still valid. Ramesh shows how to use Nginx and Nagios to build a scalable cluster.
The Serk Tools Blog has a post on Amazon Elastic Load Balancer Setup. The post includes an architectural review of the Elastic Load Balancer service, detailed directions to create an Elastic Load Balancer instance, information about how to set up a CNAME record in your DNS server, and directions on how to set up health checks.
Arfon Smith wrote a blog post detailing his experience moving the Galaxy Zoo from HAProxy to Elastic Load Balancing. He notes that it took him just 15 minutes to make the switch and that he's now saving $150 per month.
Update: After I wrote this post, two more good resources were brought to my attention!
Shlomo Swidler of MyDrifts.com wrote to tell me about his post. He covers the two-level elasticity of Elastic Load Balancing and describes some testing strategies. The first level of elasticity is provided by DNS when it maps the CNAME of an Elastic Load Balancer instance to the actual endpoint of the instance. Shlomo correctly points out that this allows inbound network traffic to scale. The second level is provided by the Elastic Load Balancer itself as it distributes traffic across multiple EC2 instances. The latter sections of the post provide a testing strategy for a system powered by one or more Elastic Load Balancer instances.
The Typica AWS library for Java has included CloudWatch support for a few months. You can read this post to learn more about enabling and fetching CloudWatch metrics through Typica.
I hope you find these resources to be helpful!
-- Jeff;
Cirrhus9 (mentioned yesterday) and Pfizer are co-sponsors of a roundtable discussion on the topic of cloud computing and biomedical research. Amazon CTO Werner Vogels will be in attendance at this unique event, where they'll discuss the emerging demands of biomedical research and how they can be met using cloud computing.
The roundtable will be held at 2 PM on January 14th at the Pfizer building in San Diego.
There are just 7 seats left so you'd better go ahead and register now.
-- Jeff;
I met Tom Lounibos, CEO of SOASTA, at the Palo Alto stop of the AWS Start-Up Tour. Tom gave the audience a good introduction to their CloudTest product, an on demand load testing solution which resides on and runs from Amazon EC2.
Tom wrote to me last week to tell me that they are now able to simulate over 500,000 users hitting a single web application. Testing at this level gives system architects the power to verify the scalability of sites, servers, applications, and networks in advance of a genuine surge in traffic.
Here are a few of their most recent success stories:
Based on this video, it looks like it is very easy to create a test, run it, and to process and analyze the results.
The first step is to record a new test consisting of one or more user scenarios. Next, the raw test is edited to generalize it and to specify test data, parameters, and variable substitutions. A drag and drop test creation tool is used to create real-world test scenarios on a multi-track timeline. The system under test can be monitored in various ways while the test is run. Once completed, the test results can be viewed and analyzed.
Pricing for CloudTest starts at $1000 per hour.
-- Jeff;
On February 1st, additional pricing tiers for high volume users of Amazon CloudFront go into effect. We've been working to reduce our costs and to pass our savings along to you, our customers. If you are in the top bandwidth tier you can deliver content to customers in the United States and Europe for just $0.050 per GB (one US nickel).
The existing tiers apply at the 10 TB, 50 TB, and 150 TB transfer levels. We've added new levels and corresponding price breaks at 250 TB, 500 TB, 750 TB, and 1 PB. You can visit the CloudFront home page to see all of the pricing tiers.
I would also like to call your attention to a number of useful CloudFront resources:
-- Jeff;
Many people have told me that they have used the ElasticFox extension for Firefox to get started with Amazon EC2. ElasticFox makes it easy to see the list of available AMIs (Amazon Machine Images), to launch any number of instances of those AMIs, and to monitor and manage the running instances:
We just released version 1.4 of this powerful tool. In addition to wiping out some bugs related to security groups and key management, ElasticFox now supports all of the features of the newest version of the EC2 API - Availability Zones, Elastic IPs, and user-selectable kernels. There are new tabs for kernels and ramdisks, Elastic IPs, and Availability Zones:
An IP address can be allocated and then attached to a running EC2 instance with a couple of clicks:

New instances can be launched in any availability zone, with full control of the kernel (AKI) and ramdisk (ARI):

Finally, you can now filter the AMI list using the box at the top right:

I added this feature myself because I had been spending too much time scrolling through the ever-expanding list of available AMIs during my conference and user group demos.
And that brings me to my last point: ElasticFox is an open source project hosted on SourceForge. It was easy to download the code to my desktop machine (I used TortoiseSVN), install FireBug, figure out how the code worked, and to make and test my changes.
We've got ideas for even more features, but there's no reason to wait for us. If you have some ideas of your own, grab the code, do your thing, and send us your code for review and checkin.
-- Jeff;
PS - We are planning to release a version of this extension which is compatible with version 3 of Firefox. This version is well under way, but we didn't want to hold up release of these great new features in anticipation of the production release of Firefox 3.
Update: If you are brave and somewhat fault-tolerant, you can download and try out the Firefox 3 version here. This version is reportedly faster, and also more responsive -- the UI doesn't freeze when the extension makes background calls to EC2. Please file bugs as you find them (you will need a SourceForge account in order to do so).
Amazon CloudFront was designed to make it really easy to distribute content to users at high speed with low latency. Here are some new tools which provide a nice end-user interface to CloudFront.
The newest Freeware release of the CloudBerry Explorer now includes CloudFront support. You can create and manage distributions, assign CNAMEs, and even automate the entire process using Windows PowerShell. CloudBerry Explorer also includes some powerful support for batched changes to S3 object Access Control Lists. There are a couple of helpful videos here.
StreamInCloud is a free FLV (Flash Video) encoder. You simply create an S3 bucket and give StreamInCloud permission to read and write it. It then monitors the bucket for new videos, encodes them into the FLV format, and places the encoded version in the bucket. Of course, if the bucket is part of a CloudFront distribution, the encoded content is then available worldwide at high speed with low latency.
StreamInCloud encodes the videos at 512kbps and leaves the size as-is. This service is free; an advanced version with additional features and options will be available later at an additional charge.
Cyberduck is a Mac OS X client for Amazon S3 and CloudFront, with added support for FTP, SFTP, WebDAV, and other online storage facilities. The product has a very long feature list, is scriptable via AppleScript, and, like CloudBerry Explorer, is Freeware.
Full source code is available as well.
As I noted earlier this week, Ylastic allows you to manage your CloudFront distributions from your iPhone. There's now support for the Google Android Phone as well. Watch the screencasts to learn more.
Affirma Consulting has developed the Manager For Amazon CloudFront in C#. The project is hosted on CodePlex and full source code is available. It supports direct streaming of data into S3 and uses multiple threads to manage simultaneous uploads, downloads, and live statistics.
On the surface, CloudBuddy looks like a free S3 bucket explorer tool with full support for CloudFront. However, there's quite a bit more beneath the surface. It is actually a platform with a highly refined architecture. All CloudBuddy operations are exposed as APIs.
The distribution includes a Microsoft Office plug-in to help you to manage your documents, workbooks, emails, presentations, and projects in the cloud. Source code is available.
Bucket Explorer also has a number of unique and very handy features including the ability to copy objects from one S3 account to another along with timed backups to S3. It is available for Windows, Mac, and Linux.
Enjoy, and let us know how you have put CloudFront to use.
-- Jeff;
I thought that it would be worthwhile to outline the steps needed to purchase an EC2 Reserved Instance. Here's what you need to do:
This blog post assumes that you have the latest version of the EC2 Command Line tools installed and that you have set the proper environment variables (JAVA_HOME, EC2_HOME, EC2_PRIVATE_KEY, and EC2_CERT). All commands are to be typed into a Windows Command (cmd.exe) window.
Choose a Region
Per the announcement, you can now purchase Reserved Instances in either the US or in Europe. If you already have an EC2 instance running in a particular region and you want to convert it to a reserved instance, then choose that region. Otherwise, choose the region that is best suited to your needs over the term (1 year or 3 year) of the Reserved Instance.
Based on your chosen region, set your EC2 endpoint appropriately:
US:
Europe:
Choose an Availability Zone
If you already have an On-Demand instance running and you want to convert it to a Reserved Instance, or if you have an EBS volume in a particular Availability Zone, then your choice is clear. You can use the ec2-describe-instances command to figure out the availability zone and instance type if necessary. In the screen shot below, I have highlighted the instance type in yellow and the availability zone in purple to make it clear where to find them:
Locate The Reserved Instance Offering
Now that you know the instance type and Availability Zone, you need to decide if you want to purchase a Reserved Instance for 1 year or for 3 years. You can consult the EC2 Pricing Chart and make a decision based on your needs. Considerations might include the expected lifetime of your site or application, plans for growth, degree of variability expected in your usage patterns, and so forth.
The next step is to run ec2-describe-reserved-instances-offerings and select the appropriate offering. Each offering is identified by an alphanumeric id such as e5a2ff3b-f6eb-4b4e-83f8-b879d7060257 (highlighted in yellow below):
You can also get fancy and run a search pipeline. Here's how I found an m1.small instance in us-east-1a with a 1 year term:
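A search pipeline of this kind boils down to keeping only the output lines that mention every term you care about. Here is a minimal Python sketch of that idea. The function name `filter_offerings`, the sample lines, and the `1y` term label are all made up for illustration; the real output of ec2-describe-reserved-instances-offerings may be laid out differently, so adjust the search terms to match what you actually see.

```python
def filter_offerings(lines, *required):
    """Keep only the lines that contain every required substring."""
    return [line for line in lines if all(term in line for term in required)]


# Hypothetical sample lines standing in for captured command output.
# The ids and column layout are placeholders, not real offerings.
sample_output = [
    "OFFERING  <offering-id-1>  us-east-1a  m1.small  1y",
    "OFFERING  <offering-id-2>  us-east-1a  m1.small  3y",
    "OFFERING  <offering-id-3>  us-east-1b  m1.large  1y",
]

# Same effect as chaining several text filters one after another.
matches = filter_offerings(sample_output, "m1.small", "us-east-1a", "1y")
for line in matches:
    print(line)
```

To use it against real data, save the command's output to a file and pass the file's lines in place of `sample_output`.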
Make the Purchase
The next step is to actually make the purchase using ec2-purchase-reserved-instances-offering. This command requires an offering id from the previous step and an instance count, allowing purchase of more than one reserved instance at a time. Needless to say, you should use this command with caution since you are spending hundreds or thousands of dollars! Here's what happened when I made the purchase:
Enjoy
Since I already had an instance running, all further instance hours that it consumes will be billed at the lower rate. As of this fall three of my five offspring will be in college (Washington, Maryland, and Rochester), so the extra pennies per hour will definitely come in handy!
-- Jeff;
We've been working to make it possible for you to run Windows or SQL Server in additional locations and to build highly available applications.
You now have the ability to launch EC2 running Windows or SQL Server in the EU-West region, in two separate Availability Zones. You can also launch EC2 running Windows or SQL Server in a second Availability Zone in the US-East region. With the addition of the new European region and the additional US zone you now have the tools needed to build Windows-based applications that are resilient against failure of an availability zone.
The AWS Management Console has been updated with full support for the EU-West region. After selecting the new region from the handy dropdown (shown at right), you can launch EC2 instances, create, attach and destroy EBS volumes, manage Elastic IP addresses, and more.
We've created new Windows AMIs with the French, German, Italian, and Spanish language packages installed. The Console even provides a new Language menu in the quick start list. Once launched, you simply set the locale in the Windows Control Panel. You can find step by step directions for launching AMIs in various languages here.
The popular ElasticFox tool now lets you tag running instances, EBS volumes, and EBS snapshots. The Image and Instance views have been assigned to distinct tabs and you can now specify a binary (non-text) file as instance data at launch time.
While I'm talking about all things European, I should mention two other items that may be of interest to you. First, Amazon CTO Werner Vogels will deliver a keynote at the Cebit conference in Germany later this week. Second, we have an opening in Luxembourg for an AWS Sales Representative.
-- Jeff;
As part of a trip to New York earlier this year, the folks at Strateer were kind enough to set up an informal meeting with some AWS users in the area.
Before lunch they pointed out a small conference room to me and told me that they had invited "a few local users." They wanted to make sure that I wouldn't be disappointed if just 2 or 3 folks showed up (I was fine). We returned from a very pleasant lunch to find the room jam-packed, with at least 15 people around the table!
For the next 90 minutes, they talked, and listened. I can't tell you how valuable and worthwhile it is to hear directly from our users: what they like about AWS, what they are doing with it, and what they don't like about it. Amazon's customer-oriented culture attaches a lot of value to the "voice of the customer"; accordingly, I am happy to do my part to collect it and to bring it to the attention of the entire AWS team.
One attendee at that meeting was Khanan Grauer, founder of AWS-powered Fingad (pictured at right). He told me that they were using our services to host their site and that it had worked out really well for them.
Fingad allows serious traders to share and review their trading strategies with their peers. As Khanan told me, "our mission is to provide investors points of view from other investors. This has resulted in very interesting knowledge, which is very different from what the news media & business analysts produce."
Users create accounts and can then publish information about themselves in their profile. This particular user wrote a very interesting post about the current state of the oil futures market. There are a number of social networking features including photo albums and reviews.
Users can also create, share, and track their own virtual investment portfolios.
The site was built using Ruby on Rails and the Lighttpd web server, and runs on Amazon EC2. All images are stored in Amazon S3. As Khanan says, "This has resulted in faster response times and improvement for our users." They created a Master AMI and are able to scale with ease when traffic spikes.
-- Jeff;
Even though Amazon SimpleDB is still a beta product, progressive developers are already learning about it and building highly scalable applications. In fact, we just released a pair of case studies.
ShareThis has been deployed to over 30,000 web sites. Faced with rapid growth, the team considered three storage options and chose SimpleDB for its responsiveness, reliability, zero software cost, minimal staff costs, and low barrier to development. They used EC2, SimpleDB, S3, and SQS to build a complete loosely coupled and fault tolerant system in the cloud, with an estimated savings of $200,000. Read all about it!
The Alexa Site Thumbnail Service uses SimpleDB to store intermediate status and log data, allowing them to store and deliver millions of thumbnails. They store over 12 million objects in SimpleDB and perform over 5 million queries every day. Read all about it!
-- Jeff;
We have just released a new code sample.
Written in Java, this new sample shows how Amazon SimpleDB can be used as a repository for metadata which describes objects stored in Amazon S3. The code was written to illustrate best practices for indexing S3 data and for getting the best indexing and query performance from SimpleDB.
Indexing is implemented at two levels. At the first level, multiple threads (implemented using the Java Executor) are used to ensure that a number of S3 reads and a number of SimpleDB writes are taking place simultaneously. At the second level, Amazon SQS is used to coordinate index tasks running on multiple systems, leading to an even higher degree of concurrency.
Bulk queries are implemented using a pair of thread pools. The first pool runs SimpleDB queries and the second retrieves SimpleDB attributes. With the proper balance between the two pools, a Small Amazon EC2 instance was able to make over 300 requests per second.
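The two-pool arrangement described above can be sketched in a few lines. The released sample is written in Java; this Python version is only an illustration of the pattern, not a port of it. The functions `run_query` and `get_attributes` and the in-memory `FAKE_DOMAIN` are hypothetical stand-ins so that the sketch runs without AWS credentials; in a real program they would call SimpleDB.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a SimpleDB domain: item name -> attributes (hypothetical data).
FAKE_DOMAIN = {
    "photo-1": {"bucket": "media", "size": "1024"},
    "photo-2": {"bucket": "media", "size": "2048"},
    "doc-1": {"bucket": "docs", "size": "512"},
}


def run_query(bucket):
    """Hypothetical stand-in for a SimpleDB query: returns matching item names."""
    return [name for name, attrs in FAKE_DOMAIN.items() if attrs["bucket"] == bucket]


def get_attributes(item_name):
    """Hypothetical stand-in for fetching one item's attributes."""
    return item_name, dict(FAKE_DOMAIN[item_name])


def bulk_query(buckets, query_workers=2, attribute_workers=4):
    """Run queries in one pool and attribute fetches in a second pool."""
    results = {}
    with ThreadPoolExecutor(max_workers=query_workers) as query_pool, \
         ThreadPoolExecutor(max_workers=attribute_workers) as attribute_pool:
        # Pool 1: submit every query so they run concurrently.
        query_futures = [query_pool.submit(run_query, b) for b in buckets]
        # Pool 2: hand each query's item names over for attribute retrieval.
        attribute_futures = []
        for future in query_futures:
            for item_name in future.result():
                attribute_futures.append(
                    attribute_pool.submit(get_attributes, item_name))
        # Collect the fetched attributes into a single dictionary.
        for future in attribute_futures:
            name, attrs = future.result()
            results[name] = attrs
    return results


results = bulk_query(["media", "docs"])
print(sorted(results))
```

The `query_workers` and `attribute_workers` parameters are where the balance between the two pools would be tuned; the defaults here are arbitrary.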
Check it out!
-- Jeff;
We've been working to drive down our costs and to pass the savings along to our customers. We've focused on bandwidth costs and are happy to announce that the cost of outbound bandwidth (for data transferred from within AWS to the outside world) has been reduced effective May 1, 2008. The old and new costs are as follows:
| Monthly Transfer | Old Price / GB | New Price / GB |
| First 10 TB | $0.180 | $0.170 |
| Next 40 TB | $0.160 | $0.130 |
| Next 100 TB | $0.130 | $0.110 |
| >=150 TB | $0.130 | $0.100 |
Note that there's an entirely new pricing tier, for customers with outbound monthly transfer in excess of 150 Terabytes.
As noted in the forum post, a customer with 50 TB of monthly transfer will save 16% and a customer with 500 TB of monthly transfer will save 26%. Earlier this year we let the world know that the total bandwidth consumed by Amazon EC2 and S3 is greater than that consumed by all of our global web sites put together.
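If you want to estimate the savings for your own transfer volume, the table can be turned into a small calculation. The Python sketch below makes two assumptions: the tiers are graduated (each price applies only to the data that falls within that tier), and 1 TB is counted as 1024 GB. The function names are illustrative. Under those assumptions it reproduces the roughly 16% figure for 50 TB; results for other volumes may differ from the forum post if that post used different assumptions.

```python
# (tier size in TB, old price per GB, new price per GB), from the table above.
# A size of None marks the open-ended top tier.
TIERS = [
    (10, 0.180, 0.170),
    (40, 0.160, 0.130),
    (100, 0.130, 0.110),
    (None, 0.130, 0.100),
]

GB_PER_TB = 1024  # assumption; the savings percentage does not depend on it


def monthly_cost(total_tb, price_index):
    """Cost of total_tb of outbound transfer; price_index 1 = old, 2 = new."""
    remaining = total_tb
    cost = 0.0
    for tier in TIERS:
        size = tier[0]
        # Amount of data billed at this tier's price.
        in_tier = remaining if size is None else min(remaining, size)
        cost += in_tier * GB_PER_TB * tier[price_index]
        remaining -= in_tier
        if remaining <= 0:
            break
    return cost


def savings_percent(total_tb):
    """Percentage saved under the new prices relative to the old prices."""
    old = monthly_cost(total_tb, 1)
    new = monthly_cost(total_tb, 2)
    return 100.0 * (old - new) / old


print(round(savings_percent(50)))
```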
We've also updated the AWS Simple Calculator Utility to reflect the new prices.
-- Jeff;
In the past we've blogged about Digital Bucket, which is a great app for storing files in Amazon S3.
I'm mentioning them again because they've added all sorts of new features. If you are a Windows user, I think you'll find this to be intuitive and seamless! A few of the new goodies are as follows:
-- Mike
There is lots of buzz about Hadoop and Amazon EC2, and of course there should be, given all the great projects such as the New York Times one, where they converted old articles into PDF files for $240.
There's a second environment you should know about, although the buzz level is a bit lower. (That might change.) Condor is a scheduling application that is commonly used in HPC and grid applications. It can also be used to manage Hadoop grids, and manages "jobs" in much the same manner as mainframes: you submit a job to Condor, along with metadata that describes the job's characteristics. Then Condor finds suitable resources to allocate for the job. Note that Condor and Hadoop approach these problems independently, with the result that they overlap in some ways, while doing unrelated things in others.
This week I attended Condor Week at the University of Wisconsin in Madison. Condor Week is an annual event that gives Condor collaborators and users the chance to exchange ideas and experiences, to learn about latest research, to experience live demos, and to influence our short and long term research and development directions.
If you are interested in large-scale grid computing, this approach is worth a serious look. There are two active projects that implement Condor on Amazon EC2, and of course that's why this blog entry is being posted.
Cycle Computing offers Amazon EC2 plus Condor as an integrated platform, in addition to supporting other underlying computing resources. Their software automates Condor grid management, including monitoring, configuration, version control, usage tracking, and more. At the conference Jason Stowe from Cycle Computing made a very strong case for using Amazon EC2 instead of a traditional grid environment. Jason's presentation is available for download at http://www.cs.wisc.edu/condor/CondorWeek2008/condor_presentations/stowe_cycle.pdf.
Red Hat's approach integrates EC2 directly into the Condor code base. The result is that an Amazon EC2 instance is the "Condor Job", and in that manner they are able to manage the entire life cycle of an EC2 instance. In some cases the entire Condor pool is running on EC2, and in other cases EC2 augments an existing pool. All of this work was done in collaboration between the University of Wisconsin (Jaeyoung Yoon, Fang Cao, and Jaime Frey) and Matt Farrellee from Red Hat. They plan to integrate Amazon S3 as a storage medium in the near future.
One thing seems certain: on-demand virtualization brightens the lights in Grid Computing City, because organizations that could not afford a grid suddenly find themselves with both affordable infrastructure and powerful tools to manage it.
-- Mike