We've added some important new community features to our Public Data Sets and we've also added some new and intriguing data to our collection. I'm writing this post to bring you up to date on this unique AWS feature and thought I would also show you how to instantiate and use an actual public data set.
If the concept is new to you, allow me to give you a brief
introduction.
We have set up a centralized repository for
large (tens or hundreds of gigabytes) public data sets, which
we host at no charge.
We currently have public data sets
in a number of categories including
Biology,
Chemistry,
Economics,
Encyclopedic,
Geographic, and
Mathematics.
The
data sets are stored in the form of EBS
(Elastic Block Store)
snapshots. These snapshots are used to create an EBS volume
from scratch in a matter of seconds. Most data sets are available
in formats suitable for use with both Linux and Windows.
Once created, the volume is then mounted on an
EC2
instance for processing. Once the processing is complete,
the volume can be kept alive for further work, archived to
S3, or simply deleted.
To make sure that you can get a lot of value from our Public
Data Sets, we've added some new community features. Each
set now has its own page within the
AWS Resource Center.
The page contains all of the information needed to start making
use of the data, including submission information, creation date,
update date, data source, and more. There's a dedicated discussion
forum for each data set, and even (in classic Amazon style) room
to enter a review and a rating.
We've also added a number of rich and intriguing data sets to
our collection. Here's what's new:
We'll continue to add additional public data sets to our collection over the coming months. Please feel free to submit your own data sets for consideration, or to propose inclusion of data sets owned by others.
It is really easy to create an instance of a public data set. I wanted to process the 2003-2006 US Economic Data. Here's what I needed to do:
I hit the "Create" button, waited two seconds, and then hit "Refresh." The
volume status changed from "creating" to "available" so I knew that my
data was ready.
Once I am done I can simply unmount the volume, shut down the instance, and delete the volume. No fuss, no muss, and a total cost of 11 cents (10 cents for an hour of EC2 time and a penny or so for the actual EBS volume).
--Jeff;
We've added two new features to Amazon SimpleDB to make it even easier for you to implement several different data storage and retrieval scenarios.
The first new feature allows you to do a consistent read. Up until now, SimpleDB implemented eventually consistent reads. You now have the option to choose the type of read which best meets the needs of each part of your application. Before I dive into the specifics, here's a quick guide to the two types of reads:
SimpleDB's Select and GetAttributes functions now accept an optional ConsistentRead flag. This flag has a default value of false, so existing applications will continue to use eventually consistent reads. If the flag is set to true, SimpleDB will return a consistent read.
The second new feature allows you to issue SimpleDB PutAttributes and DeleteAttributes operations on a conditional basis. In other words, you can tell SimpleDB to perform the indicated operation if and only if a given single-valued attribute has the value specified in the PutAttributes or DeleteAttributes call. You can easily implement counters (the value itself is effectively the version number), delete accounts only if the current balance is zero, and insert an item only if it does not exist.
You can combine consistent reads and conditional operations to implement a form of optimistic concurrency control, or OCC. Let's say that your form-based web application allows users to update their accounts and that you built it with SimpleDB. If you store a version number with each SimpleDB item, you can keep the data consistent even if several users or applications attempt to edit the same record at the same time. You would retrieve the data for display using an eventually consistent read, and then display the form so that the user can edit it. You would also read the associated version number from SimpleDB and associate it with the form or the editing session. When the user modifies and then attempts to save the data, you would use a conditional PutAttributes call to make sure that the data hadn't been changed. If the update fails, you'd need to invoke some application-specific action to resolve the conflict before proceeding. OCC often obviates the need for long-term availability-impacting locks, transactions, timeouts, and other complex and sometimes messy programming constructs.
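To make the pattern concrete, here is a minimal sketch of optimistic concurrency control built on a conditional put. It does not call SimpleDB; TinyStore is a hypothetical in-memory stand-in, and the names put_attributes, get_attributes, expected, and save_edit are assumptions chosen for illustration rather than the actual service API.

```python
class ConditionFailed(Exception):
    """Raised when the expected attribute value does not match the stored one."""


class TinyStore:
    """Hypothetical in-memory stand-in for a SimpleDB domain."""

    def __init__(self):
        self._items = {}

    def get_attributes(self, name):
        # Return a copy so callers cannot mutate stored state directly.
        return dict(self._items.get(name, {}))

    def put_attributes(self, name, attrs, expected=None):
        # expected is an (attribute, value) pair; the write happens only if
        # the stored item currently holds that value for that attribute.
        if expected is not None:
            attr, value = expected
            if self._items.get(name, {}).get(attr) != value:
                raise ConditionFailed(name)
        self._items.setdefault(name, {}).update(attrs)


def save_edit(store, name, changes, version_seen):
    """Apply changes only if the version is still the one the editor saw."""
    new_attrs = dict(changes, version=version_seen + 1)
    try:
        store.put_attributes(name, new_attrs, expected=("version", version_seen))
        return True
    except ConditionFailed:
        return False


store = TinyStore()
store.put_attributes("account-1", {"email": "a@example.com", "version": 1})

# Two editing sessions read the same version before either one saves.
seen_by_alice = store.get_attributes("account-1")["version"]
seen_by_bob = store.get_attributes("account-1")["version"]

alice_ok = save_edit(store, "account-1", {"email": "alice@example.com"}, seen_by_alice)
bob_ok = save_edit(store, "account-1", {"email": "bob@example.com"}, seen_by_bob)
```

In this sketch the first save succeeds and bumps the version, so the second save fails its condition and the application gets a chance to resolve the conflict.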
You can read more about consistency models in Werner's newest blog post (he may be in mid-flight as I write this, but I am confident that his post will show up soon).
-- Jeff;
We're kicking off the third annual AWS Start-Up Challenge now.
We're looking for the hottest and coolest start-ups and start-up ideas. Developers and entrepreneurs in the United States, United Kingdom, Germany, and Israel are encouraged to enter for a chance to win $50,000 in cash, $50,000 in AWS credits, mentoring sessions from AWS technical experts, and AWS Premium Support Gold for one year.
To enter, fill out and submit the online application by August 26, 2009. The judging panel will review all of the applications and choose the seven best, based on originality and creativity, likelihood of long-term success, monetization strategy, quality of proposal, and effective use of AWS.
The finalists will be announced in October. At that time we will post a video of each finalist and invite the public to vote for their favorite. Then we'll fly all of the finalists to Silicon Valley where they'll present their ideas to the judges' panel during the day, and pitch them to a live audience of entrepreneurs and venture capitalists that night, where the winner will be chosen, announced, and feted.
All runner-up finalists will receive $5,000 in AWS service credits; all entrants with qualified submissions will receive $25 credits.
The Challenge finalist with the most creative monetization model using the Amazon Flexible Payments Service (FPS) or Simple Pay from Amazon Payments will win $10,000 in combined cash and Amazon Payments credits. All finalists using these services will receive $2,500 in Amazon Payments credits. Read more here.
Questions? Check out the contest rules, review the prizes, and scan the FAQ. You may also want to watch the videos we made for the 2007 and 2008 finalists.
-- Jeff;
We developed our first proof of concept using EC2 and S3 back in 2006.
From the financial point of view, AWS made prototyping in the early stages and in real-world scenarios really affordable. From the technical point of view, AWS took care of the "hardware" part for us, thus allowing the developers to focus on development. We saved a lot of time and effort, and the "time to market" of our solution was considerably shorter.
From the experience of working for and communicating with established TV broadcasting companies, we knew that upscaling is and always has been a main issue. At the same time, we think that downscaling is also to be considered due to financial constraints. This is why we have developed a video load balancing solution which is able to decide how many streaming instances are needed. This also allows our customers to benefit from the AWS "pay as you go" payment model.
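As a rough illustration of the kind of decision such a load balancer has to make, the sketch below derives an instance count from the current number of viewers. The interview does not describe the actual algorithm; the function names, the per-instance capacity, and the minimum of one instance are all assumptions.

```python
import math


def streaming_instances_needed(viewers, viewers_per_instance, minimum=1):
    """Return how many streaming instances the current audience requires."""
    if viewers_per_instance <= 0:
        raise ValueError("viewers_per_instance must be positive")
    # Round up so a partially filled instance still counts as a whole one,
    # and never drop below the configured minimum.
    return max(minimum, math.ceil(viewers / viewers_per_instance))


def scaling_action(running, needed):
    """Positive result: launch that many instances. Negative: shut that many down."""
    return needed - running


# Hypothetical numbers: 450 viewers, 200 viewers per streaming instance.
needed = streaming_instances_needed(450, 200)
change = scaling_action(running=5, needed=needed)
```

Scaling down matters as much as scaling up here: a negative change is what lets the pay-as-you-go model reduce costs when the audience shrinks.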
Can you give us some more technical details?
Using high-CPU EC2 instances, the encoding cloud receives jobs from SQS, the Amazon Simple Queue Service; it then transcodes them into various formats and stores the results on S3. The videos are delivered by our playout cloud using Wowza Flash Streaming technology, which is also powered by Amazon EC2. Asset management is done using our IVMS (Incredible Video Management System) written in Django, persisting its data in a MySQL database on EBS (where regular snapshots on S3 save us backup trouble). To prevent spikes in server load and reduce latency, we deploy SvM-Video-Workflow static files and rendered JSON on S3 and use the content delivery service CloudFront where possible.
What significant benefits have you experienced?
In comparison to the main market solutions for film distribution on the web, we were able to save about 80% of the usual running costs. Using AWS has enabled us to develop our projects in small core teams and still deliver on time, with the advantage of saving a great deal of headaches for the admins.
I remember you told me about the day when you were featured on SPIEGEL online. How was it?
On May 8, 2009, just thirty minutes before launch, dctp.tv got featured on SPIEGEL online. Our phone rang, ordering us to change the entire imprint. With shaky hands, we found ourselves deploying a new version in a matter of minutes and praying the CloudFront edge servers updated on time. And yet, watching the server stats handling 50 hits per second and realizing that the load balancing module and the streaming cloud were working as they should was so enjoyable it made it all worthwhile.
All in all we had been working on our load balancing algorithms for many months without the opportunity of real world testing. The launch turned out to be our breakthrough, as everything worked as expected. And it has kept on doing just this ever since.
(comment from Simone: next time you need to load test your app, you should try Soasta.com)
Nikolai, do you have any suggestion for other AWS users?
Our experience is that the AWS team genuinely listens to the users' needs and requirements.
A lot of the AWS features that we use today came to tackle problems that had been reported by users when we first started. So good work, and keep in touch :)
I believe our readers will find this story inspiring. Thanks Nikolai.
- Simone (@simon)
You can now launch Amazon EC2 instances from an AMI backed by Amazon EBS (Elastic Block Store). This new functionality enables you to launch an instance with an Amazon EBS volume that serves as the root device.
This new feature brings a number of important performance and operational benefits and also enables some really powerful new features:
Let's compare and contrast the original S3-based boot process and the new EBS-based process. Here's what happens when you boot from an AMI that references an image residing in S3:
Now, here's what happens when you boot from an AMI that references an image residing in EBS:
Up until this point the two processes are quite similar. However, the new model allows the instance to be stopped (shut down cleanly and the EBS volumes preserved) at any point and then rebooted later. Here's the process:
At this point the instance neither consumes nor possesses any compute hardware and is not accruing any compute hours. While the instance is stopped, the new ModifyInstanceAttribute function can be used to change instance attributes such as the instance type (small, medium, large, and so forth), the kernel, the user data, and so forth. The instance's Id remains valid while the instance is stopped, and can be used as the target of a start request. Here's what happens then:
When the instance is finally terminated, the EBS volumes will be deleted unless the deleteOnTermination flag associated with the volume was cleared prior to the termination request.
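The lifecycle described above can be modeled in a few lines. This is a toy state machine, not the EC2 API: the Volume and Instance classes, their method names, and the rule that the instance type can only change while stopped are assumptions made for this sketch.

```python
class Volume:
    """Toy model of an EBS volume attached to an instance."""

    def __init__(self, volume_id, delete_on_termination=True):
        self.volume_id = volume_id
        self.delete_on_termination = delete_on_termination
        self.deleted = False


class Instance:
    """Toy model of an EBS-backed instance and its stop/start lifecycle."""

    def __init__(self, instance_id, instance_type, volumes):
        self.instance_id = instance_id
        self.instance_type = instance_type
        self.volumes = volumes
        self.state = "running"

    def stop(self):
        if self.state != "running":
            raise RuntimeError("only a running instance can be stopped")
        # The volumes are preserved; only the compute side goes away.
        self.state = "stopped"

    def modify_instance_type(self, new_type):
        # Assumption in this model: attributes change only while stopped.
        if self.state != "stopped":
            raise RuntimeError("stop the instance before modifying it")
        self.instance_type = new_type

    def start(self):
        if self.state != "stopped":
            raise RuntimeError("only a stopped instance can be started")
        # The instance id stays the same across a stop/start cycle.
        self.state = "running"

    def terminate(self):
        self.state = "terminated"
        for volume in self.volumes:
            if volume.delete_on_termination:
                volume.deleted = True


root = Volume("vol-root")
data = Volume("vol-data", delete_on_termination=False)
instance = Instance("i-example", "small", [root, data])

instance.stop()
instance.modify_instance_type("large")
instance.start()
instance.terminate()
```

After termination, the root volume in this example is deleted while the data volume survives because its flag was cleared.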
We made a number of other additions and improvements along the way including a disableApiTermination flag on each instance to protect your instances from accidental shutdowns, a new Description field for each AMI, and a simplified AMI creation process (one that works for both Linux and Windows) based on the new CreateImage function.
Detailed information about all of the new features can be found in the EC2 documentation. You should also take a look at the new Boot from EBS Feature Guide. This handy document includes tutorials on Running an instance backed by Amazon EBS, stopping and starting an instance, and bundling an instance backed by Amazon EBS. It also covers some advanced options and addresses some frequently asked questions about this powerful new feature.
I recently spent some time using this new feature and I had an enjoyable (and very productive) time doing so. I built a scalable, distributed ray tracing system around the venerable POV-Ray program. I was able to test and fine-tune the startup behavior of my EC2 instance without the need to create a new AMI for each go-round. Once I had it working as desired, I created the AMI and then enjoyed quicker boot times as I brought additional EC2 instances online and into my "farm" of ray-tracing instances.
I'll be publishing an article with full details on what I built in the near future, so stay tuned!
-- Jeff;
I am very happy to announce my white paper on Cloud Architectures is now ready. This is one incarnation of the Emerging Cloud Service Architectures that Jeff wrote about a few weeks ago.
If you are new to the cloud, the first section of the paper will help you understand the benefits of building applications in-the-cloud. If you are using the cloud already, the second section of the paper will help you to use the cloud more effectively by utilizing some of the best practices.
In this paper, I discuss a new way to design architectures. Cloud Architectures are Services-Oriented Architectures that are designed to use On-demand infrastructure more effectively. Applications built on Cloud Architectures are such that the underlying computing infrastructure is used only when it is needed (for example to process a user request), draw the necessary resources on-demand (like compute servers or storage), perform a specific job, then relinquish the unneeded resources after the job is done. While in operation the application scales up or down elastically based on actual need for resources. Everything is automated and operates without any human intervention.
As an example of a Cloud Architecture, I discuss the GrepTheWeb application. This application runs a regular expression against millions of documents from the web and returns the filtered results which match the query. The architecture is interesting because it runs completely on-demand in an automated fashion. Triggered by a regex request, hundreds of Amazon EC2 instances are launched, a Hadoop cluster is started on them, transient messages are stored on Amazon SQS queues, statuses are kept in Amazon SimpleDB, and all Map/Reduce jobs are run in parallel. Each map task fetches a file from Amazon S3 and runs the regular expression against it. The application then aggregates all the results and, once the Hadoop job has been processed, releases all of the infrastructure back into the cloud.
GrepTheWeb is one of many applications built by Amazon that uses all our services (Amazon EC2, Amazon SimpleDB, Amazon SQS, Amazon S3) together.
A wide variety of applications can be built using this design approach, from nightly batch processing systems to media processing pipelines.
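To give a feel for the map and reduce steps in an application like GrepTheWeb, here is a small single-process sketch. It is not the GrepTheWeb code: the function names and the sample documents are made up, and the map tasks run one after another here, whereas a Hadoop cluster would run them in parallel on many machines.

```python
import re
from collections import defaultdict


def map_task(doc_id, text, pattern):
    """Emit (doc_id, line) pairs for every line of one document that matches."""
    regex = re.compile(pattern)
    return [(doc_id, line) for line in text.splitlines() if regex.search(line)]


def reduce_task(pairs):
    """Group the matching lines by the document they came from."""
    grouped = defaultdict(list)
    for doc_id, line in pairs:
        grouped[doc_id].append(line)
    return dict(grouped)


def grep_documents(documents, pattern):
    """Run one map task per document, then aggregate the results."""
    pairs = []
    for doc_id, text in documents.items():
        pairs.extend(map_task(doc_id, text, pattern))
    return reduce_task(pairs)


# Made-up documents standing in for files fetched from storage.
documents = {
    "a.html": "hello cloud\nnothing here",
    "b.html": "plain text",
    "c.html": "Cloud computing\nelastic cloud",
}
results = grep_documents(documents, r"[Cc]loud")
```

Because each map task touches only its own document, the work splits cleanly across as many machines as are available, which is what makes the on-demand approach attractive.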
An excerpt:
Cloud Architectures address key difficulties surrounding large-scale data processing. In traditional data processing it is difficult to get as many machines as an application needs. Second, it is difficult to get the machines when one needs them. Third, it is difficult to distribute and co-ordinate a large-scale job on different machines, run processes on them, and provision another machine to recover if one machine fails. Fourth, it is difficult to auto-scale up and down based on dynamic workloads. Fifth, it is difficult to get rid of all those machines when the job is done. Cloud Architectures solve such difficulties.
Applications built on Cloud Architectures run in-the-cloud where the physical location of the infrastructure is determined by the provider. They take advantage of simple APIs of Internet-accessible services that scale on-demand, that are industrial-strength, where the complex reliability and scalability logic of the underlying services remains implemented and hidden inside-the-cloud. The usage of resources in Cloud Architectures is as needed, sometimes ephemeral or seasonal, thereby providing the highest utilization and optimum bang for the buck.
In the first section I discuss the advantages and business benefits of Cloud Architectures and how each service was used. In the second section, I discuss best practices for the various Amazon Web Services.
You can download the PDF version or access it on the AWS Resource Center.
I talked about this briefly at the Hadoop Summit 2008 and QCon 2007. I got some good reviews after the talk and hence I decided to put all my thoughts in this paper along with some Best Practices for the use of Amazon Web Services (Amazon EC2, Amazon SQS, Amazon S3 and Amazon SimpleDB together). Many developers from our community have been asking for a real-world example of a complex, large-scale application. I will be presenting this paper at the 2008 NSF Data-Intensive Scalable Computing Workshop at UW and 9th IEEE/NATEA Conference on Cloud Computing later this week.
I believe this new and emerging way of building applications that run in-the-cloud is going to change the way we do business.
-- Jinesh
Here are some good resources for current and potential users of our Elastic Load Balancing, Auto Scaling, and Amazon CloudWatch features:
Version 1.8a of the popular Boto library for AWS now supports all three of the new features. Written in Python, Boto provides access to
Amazon EC2,
Amazon S3,
Amazon SQS,
Amazon Mechanical Turk,
Amazon SimpleDB,
and
Amazon CloudFront. The
Elastician Blog has some more info.
The Elastician Blog also has a good article with a complete example of how to use CloudWatch from Boto. After creating the connection object, one call initiates the monitoring operation and two other calls provide access to the collected statistics.
The Paglo
monitoring system can now make use of the statistics collected by
CloudWatch. You will need to install the open source Paglo Crawler on your EC2 instances. More info on Paglo can be found here.
The IT Architects at The Server Labs have put together some great blog posts. The first one,
Setting up a load-balanced Oracle Weblogic cluster in Amazon EC2, contains all of the information needed to set up a two node cluster.
The second one,
Full Weblogic Load-Balancing in EC2 with Amazon ELB, shows how to use the Elastic Load Balancer to front a pair of Apache servers which, in turn, direct traffic to a three node Weblogic cluster to increase scalability and availability.
Speaking of scalability and availability, you should definitely check out the DZone reference card on the topic. The card provides a detailed yet concise introduction to the two topics in just 6 pages. Topics covered include horizontal scalability, vertical scalability, high availability, measurement, analysis, load balancing, application caching, web caching, clustering, redundancy, fault detection, and fault tolerance.
Author and blogger Ramesh Rajamani wrote a detailed paper on the topic of
Dynamically Scaling Web Applications in Amazon EC2. Although the paper predates the release of the Elastic Load Balancer and Auto Scaling, the approach to scaling is still valid. Ramesh shows how to use
Nginx and
Nagios to build a scalable cluster.
The Serk Tools Blog has a post on
Amazon Elastic Load Balancer Setup. The post includes an architectural review of the Elastic Load Balancer service, detailed directions to create an Elastic Load Balancer instance, information about how to set up a CNAME record in your DNS server, and directions on how to set up health checks.
Arfon Smith wrote a blog post detailing his experience moving the Galaxy Zoo from HAProxy to Elastic Load Balancing. He notes that it took him just 15 minutes to make the switch and that he's now saving $150 per month.
Update: After I wrote this post, two more good resources were brought to my attention!
Shlomo Swidler of MyDrifts.com wrote to tell me about his post. He covers the two-level elasticity of Elastic Load Balancing and describes some testing strategies. The first level of elasticity is provided by DNS when it maps the CNAME of an Elastic Load Balancer instance to the actual endpoint of the instance. Shlomo correctly points out that this allows inbound network traffic to scale. The second level is provided by the Elastic Load Balancer itself as it distributes traffic across multiple EC2 instances. The latter sections of the post provide a testing strategy for a system powered by one or more Elastic Load Balancer instances.
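The two levels can be pictured with a toy model. This sketch assumes simple round-robin selection at both levels, which is a simplification chosen for illustration and not a description of how Elastic Load Balancing actually picks endpoints or instances; the class name, addresses, and instance ids are all hypothetical.

```python
import itertools


class TwoLevelBalancer:
    """Toy model: name resolution picks an endpoint, the endpoint picks an instance."""

    def __init__(self, endpoints):
        # endpoints maps an endpoint address to the instance ids behind it.
        self._endpoint_cycle = itertools.cycle(sorted(endpoints))
        self._instance_cycles = {
            address: itertools.cycle(instances)
            for address, instances in endpoints.items()
        }

    def route(self):
        # Level one: the name resolves to one of the available endpoints.
        endpoint = next(self._endpoint_cycle)
        # Level two: that endpoint hands the request to one of its instances.
        instance = next(self._instance_cycles[endpoint])
        return endpoint, instance


balancer = TwoLevelBalancer({
    "10.0.0.1": ["i-a", "i-b"],
    "10.0.0.2": ["i-a", "i-b"],
})
routes = [balancer.route() for _ in range(4)]
```

Adding a third endpoint address scales the first level without touching the instances, and adding instances scales the second level without touching the name.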
The Typica AWS library for Java has included CloudWatch support for a few months. You can read this post to learn more about enabling and fetching CloudWatch metrics through Typica.
I hope you find these resources to be helpful!
-- Jeff;
Cirrhus9 (mentioned yesterday) and Pfizer are co-sponsors of a roundtable discussion on the topic of cloud computing and biomedical research. Amazon CTO Werner Vogels will be in attendance at this unique event, where they'll discuss the emerging demands of biomedical research and how they can be met using cloud computing.
The roundtable will be held at 2 PM on January 14th at the Pfizer building in San Diego.
There are just 7 seats left so you'd better go ahead and register now.
-- Jeff;
I met Tom Lounibos, CEO of SOASTA, at the Palo Alto stop of the AWS Start-Up Tour. Tom gave the audience a good introduction to their CloudTest product, an on demand load testing solution which resides on and runs from Amazon EC2.
Tom wrote to me last week to tell me that they are now able to simulate over 500,000 users hitting a single web application. Testing at this level gives system architects the power to verify the scalability of sites, servers, applications, and networks in advance of a genuine surge in traffic.
Here are a few of their most recent success stories:
Based on this video, it looks like it is very easy to create a test, run it, and to process and analyze the results.
The first step is to record a new test consisting of one or more user scenarios. Next, the raw test is edited to generalize it and to specify test data, parameters, and variable substitutions. A drag and drop test creation tool is used to create real-world test scenarios on a multi-track timeline. The system under test can be monitored in various ways while the test is run. Once completed, the test results can be viewed and analyzed.
Pricing for CloudTest starts at $1000 per hour.
-- Jeff;
Amazon Virtual Private Cloud (Amazon VPC) lets you create your own logically isolated set of Amazon EC2 instances and connect it to your existing network using an IPsec VPN connection. This new offering lets you take advantage of the low cost and flexibility of AWS while leveraging the investment you have already made in your IT infrastructure.
This cool new service is now in a limited beta and you can apply for admission here.
Here’s all you need to do to get started:
Once you have done this, all Internet-bound traffic generated by your Amazon EC2 instances within your VPC routes across the VPN connection, where it wends its way through your outbound firewall and any other network security devices under your control before exiting from your network.
IP addresses are specified using CIDR notation, where the value after the slash represents the number of bits in the routing prefix for the address. You’re currently limited to one VPC per AWS account. However, if you have a use case requiring more, let us know and we’ll see what we can do.
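If CIDR notation is new to you, Python's standard ipaddress module is a handy way to explore it. The sketch below uses 10.0.0.0/16 and /24 subnets purely as example values; they are assumptions for illustration and not addresses or sizes prescribed by Amazon VPC.

```python
import ipaddress

# Example address block; the value after the slash is the prefix length.
vpc_block = ipaddress.ip_network("10.0.0.0/16")

prefix_bits = vpc_block.prefixlen        # bits in the routing prefix
address_count = vpc_block.num_addresses  # 2 ** (32 - prefix_bits)

# Carve the block into /24 subnets and keep the first two.
first_subnets = list(vpc_block.subnets(new_prefix=24))[:2]

# Membership test: does a given host address fall inside the second subnet?
host = ipaddress.ip_address("10.0.1.25")
host_in_second_subnet = host in first_subnets[1]
```

A longer prefix leaves fewer host bits, so each /24 subnet here holds 256 addresses while the enclosing /16 block holds 65,536.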
Because the VPC subnets are used to isolate logically distinct functionality, we’ve chosen not to immediately support Amazon EC2 security groups. You can launch your own AMIs and most public AMIs, including Microsoft Windows AMIs. You can’t launch Amazon DevPay AMIs just yet, though.
The Amazon EC2 instances are on your network. They can access or be accessed by other systems on the network as if they were local. As far as you are concerned, the EC2 instances are additional local network resources -- there is no NAT translation. EC2 instances within a VPC do not currently have Internet-facing IP addresses.
Requirements to interoperate with our VPN implementation include:
Optional capabilities that we recommend include:
Amazon VPC functionality is accessible via the EC2 API and command-line tools. The ec2-create-vpc command creates a VPC and the ec2-describe-vpcs command lists your collection of VPCs. There are commands to create subnets, customer gateways, VPN gateways, and VPN connections. Once all of the requisite objects have been created, the ec2-attach-vpn-gateway command connects your VPC to your network and allows traffic to flow. While most organizations will likely leave the VPN connection (and VPC) up and running indefinitely, you can drop the connection, terminate the instances, and even delete the VPC if you would like.
You only pay for what you use. Pricing is on a pay-as-you-go basis. VPCs, subnets, customer gateways, and VPN gateways are free to create and to use. You simply pay an hourly charge for each VPN connection you create, and for the data transferred through those VPN connections. EC2 instances within your VPC are priced at the normal On-Demand rate. We’ll honor the hourly rate for any Reserved Instances that you have but during the beta we cannot guarantee that Reserved Instances will always be available for deployment within your VPC.
Imagine the many ways that you can now combine your existing on-premise static resources with dynamic resources from the Amazon VPC. You can expand your corporate network on a permanent or temporary basis. You can get resources for short-term experiments and then leave the instances running if the experiment succeeds. You can establish instances for use as part of a DR (Disaster Recovery) effort. You can even test new applications, systems, and middleware components without disturbing your existing versions.
As is the case with many of our betas, this one is launching in a single Availability Zone in the US-East region. You can use Amazon CloudWatch to monitor your instances, but you can’t use Elastic IP addresses, Auto Scaling or Elastic Load Balancing just yet.
Recall that all traffic from your instances routes through the VPN connection. For now, this includes traffic to other Amazon Web Services such as EC2 instances outside of your Amazon VPC, Amazon S3, Amazon SQS, and Amazon SimpleDB. You can create Elastic Block Store (EBS) volumes and attach them to your instances. EBS volumes created within your cloud can be moved to standard EC2 instances and vice-versa.
I do want to mention a few of the things on our road map as well. First, we're planning to let you directly reach the Internet from your VPC. In early discussions with potential users, we learned that most of them wanted to completely isolate their EC2 instances, routing all of the traffic back to their data center, so we gave this feature the highest priority. Later on, we'll let you decide if and how you want to expose your VPC to the Internet. Second, we're planning to let you specify the IP address of individual Amazon EC2 instances within a subnet. During this beta, Amazon EC2 instances are automatically assigned a random IP from the subnet's designated IP address range. Third, we're evaluating ways to allow you to filter traffic per subnet, kind of like how you might implement router ACLs. We're already working on these items and on other additions to the core functionality we're releasing today. If you have opinions on these items, or anything else you'd like to see in the service, e-mail us or post to the forum. This service is for you; we really need your feedback!
We think you can put Amazon VPC to immediate use and can’t wait to hear about new and imaginative use cases for it. Please feel free to leave a comment on this blog or to send us some email.
-- Jeff;
On February 1st, additional pricing tiers for high volume users of Amazon CloudFront go into effect. We've been working to reduce our costs and to pass our savings along to you, our customers. If you are in the top bandwidth tier you can deliver content to customers in the United States and Europe for just $0.050 per GB (one US nickel).
The existing tiers apply at the 10 TB, 50 TB, and 150 TB transfer levels. We've added
new levels and corresponding price breaks at 250 TB, 500 TB, 750 TB, and 1 PB. You
can visit the CloudFront home page
to see all of the pricing tiers.
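For readers who want to estimate a bill, here is a sketch of how tiered pricing can be computed. It rests on several assumptions: that the tiers are graduated (each rate applies only to the traffic that falls within its tier), that the listed levels are the upper bounds of the tiers, and that 1 TB equals 1,000 GB. Only the $0.050 top-tier rate comes from this post; every other rate in the table is a placeholder, so check the pricing page for real figures.

```python
from decimal import Decimal

GB_PER_TB = 1000  # assumption for this sketch

# (upper bound in GB, price per GB). All rates except the last are
# placeholders invented for illustration, not actual CloudFront prices.
TIERS = [
    (10 * GB_PER_TB, Decimal("0.170")),
    (50 * GB_PER_TB, Decimal("0.140")),
    (150 * GB_PER_TB, Decimal("0.110")),
    (250 * GB_PER_TB, Decimal("0.090")),
    (500 * GB_PER_TB, Decimal("0.080")),
    (750 * GB_PER_TB, Decimal("0.070")),
    (1000 * GB_PER_TB, Decimal("0.060")),
    (None, Decimal("0.050")),  # top tier rate quoted in the post
]


def transfer_cost(total_gb, tiers=TIERS):
    """Sum the cost of each tier's share of the total transfer."""
    if total_gb < 0:
        raise ValueError("total_gb must not be negative")
    cost = Decimal("0")
    lower = 0
    for upper, rate in tiers:
        if upper is None or total_gb < upper:
            # The remaining traffic ends inside this tier.
            cost += (total_gb - lower) * rate
            break
        # This tier is completely filled; charge it and move on.
        cost += (upper - lower) * rate
        lower = upper
    return cost


# Tiny two-tier table that is easy to check by hand:
# 10 GB at $1.00 plus 4 GB at $0.50.
simple_tiers = [(10, Decimal("1")), (None, Decimal("0.5"))]
example = transfer_cost(14, simple_tiers)
```

Decimal is used instead of float so that fractions of a cent do not pick up rounding noise as the tiers are summed.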
I would also like to call your attention to a number of useful CloudFront resources:
-- Jeff;
Many people have told me that they have used the ElasticFox extension for Firefox to get started with Amazon EC2. ElasticFox makes it easy to see the list of available AMIs (Amazon Machine Images), to launch any number of instances of those AMIs, and to monitor and manage the running instances:
We just released version 1.4 of this powerful tool. In addition to wiping out some bugs related to security groups and key management, ElasticFox now supports all of the features of the newest version of the EC2 API - Availability Zones, Elastic IPs, and user-selectable kernels. There are new tabs for kernels and ramdisks, Elastic IPs, and Availability Zones:
An IP address can be allocated and then attached to a running EC2 instance with a couple of clicks:

New instances can be launched in any availability zone, with full control of the kernel (AKI) and ramdisk (ARI):

Finally, you can now filter the AMI list using the box at the top right:

I added this feature myself because I had been spending too much time scrolling through the ever-expanding list of available AMIs during my conference and user group demos.
And that brings me to my last point: ElasticFox is an open source project hosted on SourceForge. It was easy to download the code to my desktop machine (I used TortoiseSVN), install FireBug, figure out how the code worked, and to make and test my changes.
We've got ideas for even more features, but there's no reason to wait for us. If you have some ideas of your own, grab the code, do your thing, and send us your code for review and checkin.
-- Jeff;
PS - We are planning to release a version of this extension which is compatible with version 3 of Firefox. This version is well under way, but we didn't want to hold up release of these great new features in anticipation of the production release of Firefox 3.
Update: If you are brave and somewhat fault-tolerant, you can download and try out the Firefox 3 version here. This version is reportedly faster, and also more responsive -- the UI doesn't freeze when the extension makes background calls to EC2. Please file bugs as you find them (you will need a SourceForge account in order to do so).
Amazon CloudFront was designed to make it really easy to distribute content to users at high speed with low latency. Here are some new tools which provide a nice end-user interface to CloudFront.
The newest Freeware release of the CloudBerry Explorer now includes CloudFront support. You can create and manage distributions, assign CNAMEs, and even automate the entire process using the Windows PowerShell. CloudBerry Explorer also includes some powerful support for batched changes to S3 object Access Control Lists. There are a couple of helpful videos here.
StreamInCloud is a free FLV (Flash Video) encoder. You simply create an S3 bucket and give StreamInCloud permission to read and write it. It then monitors the bucket for new videos, encodes them into the FLV format, and places the encoded version in the bucket. Of course, if the bucket is part of a CloudFront distribution, the encoded content is then available worldwide at high speed with low latency.
StreamInCloud encodes the videos at 512kbps and leaves the size as-is. This service is free; an advanced version with additional features and options will be available later at an additional charge.
Cyberduck is a Mac OS X client for Amazon S3 and CloudFront, with added support for FTP, SFTP, WebDAV, and other online storage facilities. The product has a very long feature list, is scriptable via AppleScript, and, like CloudBerry Explorer, is Freeware.
Full source code is available as well.
As I noted earlier this week, Ylastic allows you to manage your CloudFront distributions from your iPhone. There's now support for the Google Android Phone as well. Watch the screencasts to learn more.
Affirma Consulting has developed the Manager For Amazon CloudFront in C#. The project is hosted on CodePlex and full source code is available. It supports direct streaming of data into S3 and uses multiple threads to manage simultaneous uploads, downloads, and live statistics.
On the surface, CloudBuddy looks like a free S3 bucket explorer tool with full support for CloudFront. However, there's quite a bit more beneath the surface. It is actually a platform with a highly refined architecture. All CloudBuddy operations are exposed as APIs.
The distribution includes a Microsoft Office plug-in to help you to manage your documents, workbooks, emails, presentations, and projects in the cloud. Source code is available.
Bucket Explorer also has a number of unique and very handy features including the ability to copy objects from one S3 account to another along with timed backups to S3. It is available for Windows, Mac, and Linux.
Enjoy, and let us know how you have put CloudFront to use.
-- Jeff;
I thought that it would be worthwhile to outline the steps needed to purchase an EC2 Reserved Instance. Here's what you need to do:
This blog post assumes that you have the latest version of the EC2 Command Line tools installed and that you have set the proper environment variables (JAVA_HOME, EC2_HOME, EC2_PRIVATE_KEY, and EC2_CERT). All commands are to be typed into a Windows Command (cmd.exe) window.
Choose a Region
Per the announcement, you can now purchase Reserved Instances in either the US or in Europe. If you already have an EC2 instance running in a particular region and you want to convert it to a Reserved Instance, then choose that region. Otherwise, choose the region that is best suited to your needs over the term (1 year or 3 years) of the Reserved Instance.
Based on your chosen region, set your EC2 endpoint appropriately:
US:
Europe:
Choose an Availability Zone
If you already have an On-Demand instance running and you want to convert it to a Reserved Instance, or if you have an EBS volume in a particular Availability Zone, then your choice is clear. You can use the ec2-describe-instances command to figure out the availability zone and instance type if necessary. In the screen shot below, I have highlighted the instance type in yellow and the availability zone in purple to make it clear where to find them:
Locate The Reserved Instance Offering
Now that you know the instance type and Availability Zone, you need to decide if you want to purchase a Reserved Instance for 1 year or for 3 years. You can consult the EC2 Pricing Chart and make a decision based on your needs. Considerations might include the expected lifetime of your site or application, plans for growth, degree of variability expected in your usage patterns, and so forth.
The next step is to run ec2-describe-reserved-instances-offerings and select the appropriate offering. Each offering is identified by an alphanumeric id such as e5a2ff3b-f6eb-4b4e-83f8-b879d7060257 (highlighted in yellow below):
You can also get fancy and run a search pipeline. Here's how I found an m1.small instance in us-east-1a with a 1 year term:
Make the Purchase
The next step is to actually make the purchase using ec2-purchase-reserved-instances-offering. This command requires an offering id from the previous step and an instance count, allowing purchase of more than one reserved instance at a time. Needless to say, you should use this command with caution since you are spending hundreds or thousands of dollars! Here's what happened when I made the purchase:
Enjoy
Since I already had an instance running, all further instance hours that it consumes will be billed at the lower rate. As of this fall three of my five offspring will be in college (Washington, Maryland, and Rochester), so the extra pennies per hour will definitely come in handy!
-- Jeff;
We've been working to make it possible for you to run Windows or SQL Server in additional locations and to build highly available applications.
You now have the ability to launch EC2 running Windows or SQL Server in the EU-West region, in two separate Availability Zones. You can also launch EC2 running Windows or SQL Server in a second Availability Zone in the US-East region. With the addition of the new European region and the additional US zone, you now have the tools needed to build Windows-based applications that are resilient against failure of an Availability Zone.
The AWS Management Console has been updated with full support for the EU-West region. After selecting the new region from the handy dropdown (shown at right), you can launch EC2 instances, create, attach and destroy EBS volumes, manage Elastic IP addresses, and more.
We've created new Windows AMIs with the French, German, Italian, and Spanish language packages installed. The Console even provides a new Language menu in the quick start list. Once launched, you simply set the locale in the Windows Control Panel. You can find step by step directions for launching AMIs in various languages here.
The popular ElasticFox tool now lets you tag running instances, EBS volumes, and EBS snapshots. The Image and Instance views have been assigned to distinct tabs and you can now specify a binary (non-text) file as instance data at launch time.
While I'm talking about all things European, I should mention two other items that may be of interest to you. First, Amazon CTO Werner Vogels will deliver a keynote at the Cebit conference in Germany later this week. Second, we have an opening in Luxembourg for an AWS Sales Representative.
-- Jeff;
As part of a trip to New York earlier this year, the folks at Strateer were kind enough to set up an informal meeting with some AWS users in the area.
Before lunch they pointed out a small conference room to me and told me that they had invited "a few local users." They wanted to make sure that I wouldn't be disappointed if just 2 or 3 folks showed up (I was fine). We returned from a very pleasant lunch to find the room jam-packed, with at least 15 people around the table!
For the next 90 minutes, they talked, and listened. I can't tell you how valuable and worthwhile it is to hear directly from our users -- what they like about AWS, what they are doing with it, and what they don't like about it. Amazon's customer-oriented culture attaches a lot of value to the "voice of the customer"; accordingly, I am happy to do my part to collect it and to bring it to the attention of the entire AWS team.
One attendee at that meeting was Khanan Grauer, founder of AWS-powered Fingad (pictured at right). He told me that they were using our services to host their site and that it had worked out really well for them.
Fingad allows serious traders to share and review their trading strategies with their peers. As Khanan told me, "our mission is to provide investors points of view from other investors. This has resulted in very interesting knowledge, which is very different from what the news media & business analysts produce."
Users create accounts and can then publish information about themselves in their profile. This particular user wrote a very interesting post about the current state of the oil futures market. There are a number of social networking features including photo albums and reviews.
Users can also create, share, and track their own virtual investment portfolios.
The site was built using Ruby on Rails and the Lighttpd web server, and runs on Amazon EC2. All images are stored in Amazon S3. As Khanan says, "This has resulted in faster response times and improvement for our users." They created a Master AMI and are able to scale with ease when traffic spikes.
-- Jeff;
Even though Amazon SimpleDB is still a beta product, progressive developers are already learning about it and building highly scalable applications. In fact, we just released a pair of case studies.
ShareThis has been deployed to over 30,000 web sites. Faced with rapid growth, the team considered three storage options and chose SimpleDB for its responsiveness, reliability, zero software cost, minimal staff costs, and low barrier to development. They used EC2, SimpleDB, S3, and SQS to build a complete loosely coupled and fault tolerant system in the cloud, with an estimated savings of $200,000. Read all about it!
The Alexa Site Thumbnail Service uses SimpleDB to store intermediate status and log data, allowing them to store and deliver millions of thumbnails. They store over 12 million objects in SimpleDB and perform over 5 million queries every day. Read all about it!
-- Jeff;
We have just released a new code sample.
Written in Java, this new sample shows how Amazon SimpleDB can be used as a repository for metadata which describes objects stored in Amazon S3. The code was written to illustrate best practices for indexing S3 data and for getting the best indexing and query performance from SimpleDB.
Indexing is implemented at two levels. At the first level, multiple threads (implemented using the Java Executor) are used to ensure that a number of S3 reads and a number of SimpleDB writes are taking place simultaneously. At the second level, Amazon SQS is used to coordinate index tasks running on multiple systems, leading to an even higher degree of concurrency.
Bulk queries are implemented using a pair of thread pools. The first pool runs SimpleDB queries and the second retrieves SimpleDB attributes. With the proper balance between the two pools, a Small Amazon EC2 instance was able to make over 300 requests per second.
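The two-pool arrangement described above can be sketched in a few lines. The released sample is written in Java; what follows is only a minimal Python illustration of the same idea, with an in-memory dictionary standing in for a SimpleDB domain. The names `FAKE_DOMAIN`, `run_query`, `get_attributes`, and `bulk_query` are hypothetical and do not come from the sample.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a SimpleDB domain: item name -> attributes.
FAKE_DOMAIN = {
    "photo-1": {"bucket": "media", "type": "jpeg"},
    "photo-2": {"bucket": "media", "type": "jpeg"},
    "report-1": {"bucket": "docs", "type": "pdf"},
}


def run_query(attr, value):
    """Stand-in for a query call: return the names of matching items."""
    return [name for name, attrs in FAKE_DOMAIN.items() if attrs.get(attr) == value]


def get_attributes(name):
    """Stand-in for an attribute fetch: return (name, attributes)."""
    return name, dict(FAKE_DOMAIN[name])


def bulk_query(queries, query_workers=2, attr_workers=4):
    """Run queries in one pool and fetch attributes for the hits in a second pool."""
    results = {}
    with ThreadPoolExecutor(max_workers=query_workers) as query_pool, \
            ThreadPoolExecutor(max_workers=attr_workers) as attr_pool:
        # First pool: issue all queries concurrently.
        query_futures = [query_pool.submit(run_query, a, v) for a, v in queries]
        # Second pool: as each query completes, fan out one fetch per item name.
        attr_futures = []
        for query_future in query_futures:
            for name in query_future.result():
                attr_futures.append(attr_pool.submit(get_attributes, name))
        for attr_future in attr_futures:
            name, attrs = attr_future.result()
            results[name] = attrs
    return results
```

The two worker counts are the knobs that correspond to "the proper balance between the two pools"; the right values would have to be found by measurement against the real service.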
Check it out!
-- Jeff;
We've been working to drive down our costs and to pass the savings along to our customers. We've focused on bandwidth costs and are happy to announce that the cost of outbound bandwidth (for data transferred from within AWS to the outside world) has been reduced effective May 1, 2008. The old and new costs are as follows:
| Monthly Transfer | Old Price / GB | New Price / GB |
| --- | --- | --- |
| First 10 TB | $0.180 | $0.170 |
| Next 40 TB | $0.160 | $0.130 |
| Next 100 TB | $0.130 | $0.110 |
| >=150 TB | $0.130 | $0.100 |
Note that there's an entirely new pricing tier, for customers with outbound monthly transfer in excess of 150 Terabytes.
As noted in the forum post, a customer with 50 TB of monthly transfer will save 16% and a customer with 500 TB of monthly transfer will save 26%. Earlier this year we let the world know that the total bandwidth consumed by Amazon EC2 and S3 is greater than that consumed by all of our global web sites put together.
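The tiered prices in the table above can be turned into a small calculator. This is a minimal Python sketch rather than an official AWS tool: the function and constant names are made up for illustration, and it assumes 1 TB = 1,000 GB (the percentage saved comes out the same with 1,024).

```python
# Each tier is (size in TB, price in USD per GB); None means "everything above".
OLD_TIERS = [(10, 0.18), (40, 0.16), (100, 0.13), (None, 0.13)]
NEW_TIERS = [(10, 0.17), (40, 0.13), (100, 0.11), (None, 0.10)]

GB_PER_TB = 1000  # assumption, see the note above


def monthly_cost(transfer_tb, tiers):
    """Return the monthly outbound transfer cost in USD under tiered pricing."""
    remaining = transfer_tb
    cost = 0.0
    for size_tb, price_per_gb in tiers:
        if remaining <= 0:
            break
        # Bill as much of the remaining transfer as fits in this tier.
        used = remaining if size_tb is None else min(remaining, size_tb)
        cost += used * GB_PER_TB * price_per_gb
        remaining -= used
    return cost


def savings_percent(transfer_tb):
    """Return the percentage saved when moving from the old to the new prices."""
    old = monthly_cost(transfer_tb, OLD_TIERS)
    new = monthly_cost(transfer_tb, NEW_TIERS)
    return 100.0 * (old - new) / old
```

For 50 TB of monthly transfer this gives $8,200 at the old prices and $6,900 at the new ones, a saving of just under 16%.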
We've also updated the AWS Simple Calculator Utility to reflect the new prices.
-- Jeff;
In the past we've blogged about Digital Bucket, which is a great app for storing files in Amazon S3.
I'm mentioning them again because they've added all sorts of new features. If you are a Windows user, I think you'll find this to be intuitive and seamless! A few of the new goodies are as follows:
-- Mike
There is lots of buzz about Hadoop and Amazon EC2, and of course there should be, given all the great projects such as the New York Times one, where they converted old articles into PDF files for $240.
There's a second environment you should know about, although the buzz level is a bit lower. (That might change.) Condor is a scheduling application that is commonly used in HPC and grid applications. It can also be used to manage Hadoop grids, and manages "jobs" in much the same manner as mainframes -- that is, you submit a job to Condor, along with metadata that describes the job's characteristics. Then Condor finds suitable resources to allocate for the job. Note that Condor and Hadoop are trying to solve things in independent ways, with the result that they overlap in some ways, while doing unrelated things in some cases.
This week I attended Condor Week at the University of Wisconsin in Madison. Condor Week is an annual event that gives Condor collaborators and users the chance to exchange ideas and experiences, to learn about the latest research, to experience live demos, and to influence the project's short and long term research and development directions.
If you are interested in large-scale grid computing, this approach is worth a serious look. There are two active projects that implement Condor on Amazon EC2, and of course that's why this blog entry is being posted.
Cycle Computing offers Amazon EC2 plus Condor as an integrated platform, in addition to supporting other underlying computing resources. Their software automates Condor grid management, including monitoring, configuration, version control, usage tracking, and more. At the conference Jason Stowe from Cycle Computing made a very strong case for using Amazon EC2 instead of a traditional grid environment. Jason's presentation is available for download at http://www.cs.wisc.edu/condor/CondorWeek2008/condor_presentations/stowe_cycle.pdf.
Red Hat's approach integrates EC2 directly into the Condor code base. The result is that an Amazon EC2 instance is the "Condor Job", and in that manner they are able to manage the entire life cycle of an EC2 instance. In some cases the entire Condor pool is running on EC2, and in other cases EC2 augments an existing pool. All of this work was done in collaboration between the University of Wisconsin (Jaeyoung Yoon, Fang Cao, and Jaime Frey) and Matt Farrellee from Red Hat. They plan to integrate Amazon S3 as a storage medium in the near future.
One thing seems certain: on-demand virtualization brightens the lights in Grid Computing City, because organizations that could not afford a grid suddenly find themselves with both affordable infrastructure and powerful tools to manage it.
-- Mike
Developers who have found our cloud computing model attractive have been asking us to be a little bit more open about what we are planning to do in the future. To date we've simply announced new additions to the Amazon Web Services lineup, with immediate beta availability at the time of announcement.
Earlier this year we started to post specifications for new features along with requests for feedback. We did this for the Amazon S3 Copy feature and for Amazon S3 Post Support. We received a lot of helpful feedback in both cases.
Now it is time for the next step...
I am excited to be able to tell you about an entirely new feature, a feature so new that it doesn't even have a proper name, and that you can't use just yet. But you can read about it and you can start thinking about the best way to incorporate it into your system architecture.
If you have taken a close look at Amazon EC2, you know that the instances are ephemeral. The instances have anywhere from 160 GB to 1.7 TB of attached storage. The storage is there as long as the instance is running, but of course it disappears as soon as the instance is shut down. Applications with a need for persistent storage could store data in Amazon S3 or in Amazon SimpleDB, but they couldn't readily access either one as if it were an actual file system.
As you can read in our forum post, we've been working on addressing this.
In the same way that your running EC2 instances, your Elastic IP addresses, your S3 buckets and your SQS queues can be thought of as items contained within the scope of your AWS account, our forthcoming persistent storage feature will give you the ability to create reliable, persistent storage volumes for use with EC2. Once created, these volumes will be part of your account and will have a lifetime independent of any particular EC2 instance.
These volumes can be thought of as raw, unformatted disk drives which can be formatted and then used as desired (or even used as raw storage if you'd like). Volumes can range in size from 1 GB on up to 1 TB; you can create and attach several of them to each EC2 instance. They are designed for low latency, high throughput access from Amazon EC2. Needless to say, you can use these volumes to host a relational database.
You will also be able to perform "snapshot" backups of your volumes to Amazon S3. You can use these snapshots to create new volumes or to roll back your stored data to an earlier point in time.
The volumes are accessible via a new set of APIs, with functions like CreateVolume, DeleteVolume, AttachVolume, and CreateSnapshot. The same functionality is also available via the EC2 Command-Line tools.
I spent some time experimenting with this new feature on Saturday. In a matter of minutes I was able to create a pair of 512 GB volumes, attach them to an EC2 instance, create file systems on them with mkfs, and then mount them. When I was done I simply unmounted, detached, and then finally deleted them.
First I created the volumes from the command line of my Windows desktop:
U:\USER\Jeff\Amazon> ec2-create-volume -s 549755813888

Then I attached them to my EC2 instance:

U:\USER\Jeff\Amazon> ec2-attach-volume vol-4695702f -i i-6b3bfd02 -d /dev/sdb

Then I switched over to my instance, formatted and mounted them, and I was all set:

# yes | mkfs -t ext3 /dev/sdb

Perhaps I am biased, but the ability to requisition this much storage on an as-needed basis seems pretty cool.
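The number passed to -s in the ec2-create-volume command above appears to be a size in bytes: it matches 512 GB if a gigabyte is taken to be 2^30 bytes. That reading is an assumption based on the arithmetic, not something the tool's documentation is quoted as saying here. A tiny Python helper (the name `gib_to_bytes` is made up for illustration) shows the conversion:

```python
BYTES_PER_GIB = 1024 ** 3  # 2^30 bytes, the binary reading of "GB" assumed here


def gib_to_bytes(size_gib):
    """Convert a size in binary gigabytes to a byte count."""
    return size_gib * BYTES_PER_GIB
```

With this definition, `gib_to_bytes(512)` evaluates to 549755813888, the value used in the command.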
A few EC2 customers are already using these new volumes and we will be opening the feature up to a wider audience later this year. You should sign up now if you are interested in gaining access to this cool new feature. If you don't already have an Amazon Web Services account, get one today before you sign up for the waiting list.
We'll be releasing more information as soon as possible and I'll do my best to cover it here when we do.
Updated: Here is some additional coverage:
-- Jeff;
From the very beginning, we've always wanted to make sure that developers had all of the formal and informal support needed to build and to run their applications.
At first the task of monitoring the AWS Forums was a rotating part-time assignment. Members of the Amazon Web Services team would be tasked with checking the forums from time to time and providing answers to the best of our ability.
Later, as AWS became more popular and more complex, we began to hire people to dedicate to this task. If you spend any time on the AWS Forums you will see a variety of names flagged with the Amazon logo. These folks spend their working day scanning the forums for questions and trouble reports, researching and formulating answers, and also contributing to our Resource Center and to our Technical FAQs.
Increasingly, we see that organizations of all sizes are putting AWS to use in new, innovative, and mission-critical ways. These organizations have told us that they need a more direct and more discreet way to request assistance and to report problems.
Today we are rolling out a new AWS Premium Support channel for users of Amazon EC2, Amazon S3, and Amazon SQS. The new channel has two plans, Gold and Silver.
Both plans include fast and predictable response times, an unlimited number of support cases, and personalized support from our team of developer support engineers. Because it can be tricky to figure out exactly where a problem resides, developers with AWS Premium Support also have access to a set of client-side diagnostic tools.
The Gold plan also includes round-the-clock coverage (24 hours per day, 7 days per week, 365 days per year), telephone support, and a 1 hour maximum response time for issues designated as urgent.
Developers with Silver or Gold support can file cases (problem reports) using the new AWS Support Center:
Phone support is handled using Amazon's proven Click-to-Call technology -- click a button and we call back!
The Manage Your Cases option provides visibility into all of the cases associated with an account, with optional filtering on status.
Existing EC2, S3, and SQS customers can add support to their account at any time; new customers can choose to sign up for support when they sign up for the service. The Silver support plan provides for two named support contacts and the Gold support plan accommodates three.
Pricing is based on service usage, with minimum amounts for each plan. The Silver support plan is priced at $100 per month or $0.10 per dollar of monthly service usage (whichever is greater). The Gold support plan is priced at the greater of $400 per month or a charge of $0.20 per dollar for the first $10,000 of monthly service usage, $0.15 per dollar for the next $70,000 and $0.10 per dollar of everything over $80,000.
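The pricing rules above reduce to a "greater of" calculation, with a tiered usage charge for Gold. The following Python sketch works through it; the function names are hypothetical, and the code reflects only the numbers quoted in this post.

```python
def silver_monthly_fee(usage_usd):
    """Silver: the greater of $100 or $0.10 per dollar of monthly usage."""
    return max(100.0, 0.10 * usage_usd)


def gold_monthly_fee(usage_usd):
    """Gold: the greater of $400 or a tiered charge on monthly usage."""
    # (tier size in USD of usage, rate per dollar); None means "everything above".
    tiers = [(10_000, 0.20), (70_000, 0.15), (None, 0.10)]
    remaining = usage_usd
    usage_fee = 0.0
    for size, rate in tiers:
        if remaining <= 0:
            break
        # Charge this tier's rate on the portion of usage that falls in it.
        portion = remaining if size is None else min(remaining, size)
        usage_fee += portion * rate
        remaining -= portion
    return max(400.0, usage_fee)
```

For example, an account with $100,000 of monthly usage would pay $2,000 + $10,500 + $2,000 = $14,500 for Gold, while an account with $1,000 of usage would pay the $400 minimum.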
-- Jeff;
XML Hacker M. David Peterson has put together a really interesting article.
As part of his work at 3rd and Urban, he has implemented redundant, fault-tolerant, read-write disk storage on Amazon EC2 using a number of open source tools and applications including LVM, DRBD, NFS, Heartbeat, and VTUN.
Mark notes that "the primary focus of this paper is to present both a detailed overview as well as a working code base that will enable you to begin designing, building, testing, and deploying your EC2-based applications using a generalized persistent storage foundation, doing so today in both lieu of and in preparation for release of Amazon Web Services offering in this same space."
The article provides complete implementation details and links to source code for the scripts that Mark developed.
You can read the article, and you can also follow progress via the discussion group.
-- Jeff;