Amazon S3 Overview
What is S3cmd
S3cmd (s3cmd) is a free command line tool and client for uploading, retrieving and managing data in Amazon S3 and other cloud storage service providers that use the S3 protocol, such as Google Cloud Storage or DreamHost DreamObjects. It is best suited for power users who are familiar with command line programs. It is also ideal for batch scripts and automated backups to S3, triggered from cron, etc.
S3cmd is written in Python. It's an open source project available under the GNU General Public License v2 (GPLv2) and is free for both commercial and private use. You will only have to pay Amazon for using their storage.
Lots of features and options have been added to S3cmd since its very first release in 2008. We recently counted more than 60 command line options, including multipart uploads, encryption, incremental backup, s3 sync, ACL and Metadata management, S3 bucket size, bucket policies, and more!
What is Amazon S3
Amazon S3 provides a managed internet-accessible storage service where anyone can store any amount of data and retrieve it later again.
S3 is a paid service operated by Amazon. Before storing anything into S3 you must sign up for an "AWS" account (where AWS = Amazon Web Services) to obtain a pair of identifiers: Access Key and Secret Key. You will need to give these keys to S3cmd. Think of them as if they were a username and password for your S3 account.
Amazon S3 pricing explained
At the time of this writing the costs of using S3 are (in USD):
$0.03 per GB per month of storage space used
$0.00 per GB - all data uploaded
$0.000 per GB - first 1GB / month data downloaded
$0.090 per GB - up to 10 TB / month data downloaded
$0.085 per GB - next 40 TB / month data downloaded
$0.070 per GB - data downloaded / month over 50 TB
$0.005 per 1,000 PUT or COPY or LIST requests
$0.004 per 10,000 GET and all other requests
If for instance on 1st of January you upload 2GB of photos in JPEG from your holiday in New Zealand, at the end of January you will be charged $0.06 for using 2GB of storage space for a month, $0.0 for uploading 2GB of data, and a few cents for requests. That comes to slightly over $0.06 for a complete backup of your precious holiday pictures.
In February you don't touch it. Your data are still on S3 servers so you pay $0.06 for those two gigabytes, but not a single cent will be charged for any transfer. That comes to $0.06 as an ongoing cost of your backup. Not too bad.
In March you allow anonymous read access to some of your pictures and your friends download, say, 1500MB of them. As the files are owned by you, you are responsible for the costs incurred. That means at the end of March you'll be charged $0.06 for storage plus $0.045 for the download traffic generated by your friends (the first 1GB is free and the remaining 500MB is charged at $0.09 per GB).
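The three months above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it uses the prices quoted in this text, ignores the per-request charges, and assumes decimal units (1 GB = 1000 MB, 1 TB = 1000 GB). The names `download_cost` and `monthly_cost` are made up for this illustration and are not part of S3cmd.

```python
# Prices quoted in the text above (USD). They are a snapshot, not current rates.
STORAGE_PER_GB_MONTH = 0.03
UPLOAD_PER_GB = 0.00

# Download tiers as (cumulative upper bound in GB, price per GB).
# Assumption: decimal units, i.e. 1 GB = 1000 MB and 1 TB = 1000 GB.
DOWNLOAD_TIERS = [
    (1, 0.000),             # first 1 GB is free
    (10_000, 0.090),        # up to 10 TB
    (50_000, 0.085),        # next 40 TB
    (float("inf"), 0.070),  # everything over 50 TB
]


def download_cost(gb):
    """Cost of downloading `gb` gigabytes in one month, tier by tier."""
    cost = 0.0
    lower = 0.0
    for upper, price in DOWNLOAD_TIERS:
        if gb <= lower:
            break
        # Only the part of the traffic that falls inside this tier is billed here.
        cost += (min(gb, upper) - lower) * price
        lower = upper
    return cost


def monthly_cost(stored_gb, uploaded_gb=0.0, downloaded_gb=0.0):
    """Storage + upload + download cost for one month (request fees ignored)."""
    return (stored_gb * STORAGE_PER_GB_MONTH
            + uploaded_gb * UPLOAD_PER_GB
            + download_cost(downloaded_gb))


january = monthly_cost(2, uploaded_gb=2)     # upload 2GB and keep it
february = monthly_cost(2)                   # just keep it
march = monthly_cost(2, downloaded_gb=1.5)   # friends download 1500MB
print(f"Jan ${january:.3f}  Feb ${february:.3f}  Mar ${march:.3f}")
```

January and February come to $0.06 each, and March to $0.105, that is $0.06 for storage plus $0.045 for downloads.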
There is no minimum monthly contract or setup fee. What you use is what you pay for. In the beginning my bill used to be around US$0.03 or even nil.
That's the pricing model of Amazon S3 in a nutshell. Check the Amazon S3 homepage for more details.
Needless to say, all this money is charged by Amazon itself; there is obviously no payment for using S3cmd :-)
Amazon S3 basics
Files stored in S3 are called "objects" and their names are officially called "keys". Since this is sometimes confusing for the users we often refer to the objects as "files" or "remote files". Each object belongs to exactly one "bucket".
To describe objects in S3 storage we invented a URI-like schema in the following form:
s3://BUCKET
or
s3://BUCKET/OBJECT
Buckets are sort of like directories or folders with some restrictions:
- each user can only have 100 buckets at the most,
- bucket names must be unique amongst all users of S3,
- buckets can not be nested into a deeper hierarchy and
- a name of a bucket can only consist of basic alphanumeric characters plus dot (.) and dash (-). No spaces, no accented or UTF-8 letters, etc.
It is a good idea to use DNS-compatible bucket names. That for instance means you should not use upper case characters. While DNS compliance is not strictly required, some features described below are not available for buckets with DNS-incompatible names. Going one step further, you can use a fully qualified domain name (FQDN) as the bucket name, which has even more benefits.
- For example "s3://--My-Bucket--" is not DNS compatible.
- On the other hand "s3://my-bucket" is DNS compatible but is not FQDN.
- Finally "s3://my-bucket.s3tools.org" is DNS compatible and FQDN provided you own the s3tools.org domain and can create the domain record for "my-bucket.s3tools.org".
Look for "Virtual Hosts" later in this text for more details regarding FQDN named buckets.
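The naming rules above can be sketched as a small Python check. This is a rough illustration and not how S3cmd or Amazon validate names: the 3 to 63 character length limit and the rule that every dot-separated label starts and ends with a letter or digit are assumptions added here, and the function names are hypothetical. The FQDN test only looks at the shape of the name; it cannot tell whether you actually own the domain.

```python
import re

# One DNS label: lowercase letters, digits and dashes,
# starting and ending with a letter or digit (assumed rule).
_DNS_LABEL = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")


def is_dns_compatible(bucket):
    """Heuristic check that a bucket name could be used as a DNS host name."""
    if not 3 <= len(bucket) <= 63:  # assumed length limits
        return False
    # Every dot-separated part must be a valid label; empty parts fail too.
    return all(_DNS_LABEL.fullmatch(label) for label in bucket.split("."))


def looks_like_fqdn(bucket):
    """DNS compatible and dotted. Domain ownership is NOT checked."""
    return is_dns_compatible(bucket) and "." in bucket


for name in ("--My-Bucket--", "my-bucket", "my-bucket.s3tools.org"):
    print(name, is_dns_compatible(name), looks_like_fqdn(name))
```

With the three example names from the list above, the first fails both checks, the second passes only the DNS check, and the third passes both.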
Objects (files stored in Amazon S3)
Unlike for buckets there are almost no restrictions on object names. These can be any UTF-8 strings of up to 1024 bytes long. Interestingly enough the object name can contain the forward slash character (/), thus my/funny/picture.jpg is a valid object name. Note that there are no directories or buckets called my or funny - it is really a single object name called my/funny/picture.jpg and S3 does not care at all that it looks like a directory structure.
The full URI of such an image could be, for example:
s3://my-bucket/my/funny/picture.jpg
Public vs Private files
The files stored in S3 can be either Private or Public. The Private ones are readable only by the user who uploaded them while the Public ones can be read by anyone. Additionally the Public files can be accessed using HTTP protocol, not only using s3cmd or a similar tool.
The ACL (Access Control List) of a file can be set at the time of upload using the --acl-public or --acl-private options with the s3cmd put or s3cmd sync commands (see below).
Alternatively the ACL can be altered for existing remote files with the s3cmd setacl --acl-public (or --acl-private) command.
See the S3cmd HowTo for example usages of this S3-URI schema.
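To make the S3-URI schema concrete, here is a minimal Python sketch that splits such a URI into its bucket and object name, and builds an HTTP address for a Public file. It is not S3cmd's own parser. The helper names are hypothetical, and the `BUCKET.s3.amazonaws.com` host form is an assumption that only makes sense for DNS-compatible bucket names.

```python
from urllib.parse import quote


def parse_s3_uri(uri):
    """Split 's3://BUCKET/OBJECT' into (bucket, object_name).

    The object name is an empty string when the URI names only a bucket.
    """
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Only the first slash separates bucket from object; any further
    # slashes are simply part of the object name.
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {uri!r}")
    return bucket, object_name


def public_http_url(bucket, object_name):
    """HTTP address of a Public object (assumed host name form)."""
    # quote() percent-encodes unsafe characters but leaves '/' alone.
    return f"http://{bucket}.s3.amazonaws.com/{quote(object_name)}"


bucket, name = parse_s3_uri("s3://my-bucket/my/funny/picture.jpg")
print(bucket)
print(name)
print(public_http_url(bucket, name))
```

Splitting on the first slash only reflects the point made earlier: my/funny/picture.jpg is one object name, not a path through directories.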