Program s3cmd can transfer files to and from Amazon S3 in two basic modes:
- Unconditional transfer: all matching files are uploaded to S3 (put operation) or downloaded back from S3 (get operation). This is similar to the standard unix cp command, which also copies whatever it’s told to.
- Conditional transfer: only files that don’t exist at the destination in the same version are transferred by the s3cmd sync command. By default the MD5 checksum and the file size are compared. This is similar to the unix rsync command, with some exceptions outlined below.
The filename handling rules and some other options are common to both methods.
Filename handling rules
Sync, get and put all support multiple arguments for source files and one argument for the destination file or directory (optional in some cases of get). The source can be a single file or a directory, and multiple sources can be used in one command. Let’s have these files in our working directory:
~/demo$ find .
file0-1.msg
file0-2.txt
file0-3.log
dir1/file1-1.txt
dir1/file1-2.txt
dir2/file2-1.log
dir2/file2-2.txt
Obviously we can for instance upload one of the files to S3 and give it a different name:
~/demo$ s3cmd put file0-1.msg s3://s3tools-demo/test-upload.msg
file0-1.msg -> s3://s3tools-demo/test-upload.msg [1 of 1]
We can also upload a directory with --recursive parameter:
~/demo$ s3cmd put --recursive dir1 s3://s3tools-demo/some/path/
dir1/file1-1.txt -> s3://s3tools-demo/some/path/dir1/file1-1.txt [1 of 2]
dir1/file1-2.txt -> s3://s3tools-demo/some/path/dir1/file1-2.txt [2 of 2]
With directories there is one thing to watch out for – you can either upload the directory and its contents or just the contents. It all depends on how you specify the source.
To upload a directory and keep its name on the remote side specify the source without the trailing slash:
~/demo$ s3cmd put -r dir1 s3://s3tools-demo/some/path/
dir1/file1-1.txt -> s3://s3tools-demo/some/path/dir1/file1-1.txt [1 of 2]
dir1/file1-2.txt -> s3://s3tools-demo/some/path/dir1/file1-2.txt [2 of 2]
On the other hand, to upload just the contents, specify the directory with a trailing slash:
~/demo$ s3cmd put -r dir1/ s3://s3tools-demo/some/path/
dir1/file1-1.txt -> s3://s3tools-demo/some/path/file1-1.txt [1 of 2]
dir1/file1-2.txt -> s3://s3tools-demo/some/path/file1-2.txt [2 of 2]
Important: in both cases only the last part of the path name is taken into account. In the case of dir1 without a trailing slash (which would be the same as, say, ~/demo/dir1 in our case) the last part of the path is dir1, and that’s what’s used on the remote side, appended after s3://s3…/path/ to make s3://s3…/path/dir1/….
On the other hand, dir1/ (note the trailing slash), which would be the same as ~/demo/dir1/ (trailing slash again), is similar to saying dir1/*, i.e. it expands to the list of the files in dir1. In that case the last parts of the path names are the filenames (file1-1.txt and file1-2.txt) without the dir1/ directory name. So the final S3 paths are s3://s3…/path/file1-1.txt and s3://s3…/path/file1-2.txt respectively, both without the dir1/ member in them. I hope it’s clear enough; if not, ask on the mailing list or send me a better wording ;-)
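For readers who prefer code to prose, the trailing-slash rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not s3cmd's actual implementation; the function name remote_keys and its parameters are made up for this example, and the destination is assumed to end with a slash.

```python
import posixpath


def remote_keys(source, local_files, dest_prefix):
    """Map local files under `source` to remote paths.

    Hypothetical helper that mirrors the trailing-slash rule from the text;
    it is not taken from s3cmd. `dest_prefix` is assumed to end with "/".
    """
    # No trailing slash: keep the last path component (e.g. "dir1") remotely.
    keep_dir_name = not source.endswith("/")
    base = source.rstrip("/")
    top = posixpath.basename(base) if keep_dir_name else ""

    mapping = {}
    for path in local_files:
        if not path.startswith(base + "/"):
            raise ValueError("%r is not inside %r" % (path, source))
        # Part of the path below the source directory.
        relative = path[len(base) + 1:]
        mapping[path] = dest_prefix + posixpath.join(top, relative)
    return mapping
```

Calling remote_keys("dir1", ...) keeps dir1/ in the remote path, while remote_keys("dir1/", ...) drops it, which is the difference between the two put -r examples above.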
The above examples were built around the put command. A bit more powerful is sync; the path name handling is the same as was just explained. However, the important difference is that sync first checks the list and details of the files already present at the destination, compares them with the local files and only uploads the ones that either are not present remotely or have a different size or md5 checksum. If you ran all the above examples you’ll get output similar to the following from a sync:
~/demo$ s3cmd sync ./ s3://s3tools-demo/some/path/
dir2/file2-1.log -> s3://s3tools-demo/some/path/dir2/file2-1.log [1 of 2]
dir2/file2-2.txt -> s3://s3tools-demo/some/path/dir2/file2-2.txt [2 of 2]
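The per-file decision that sync makes (missing remotely, or different size, or different md5) can be sketched roughly as follows. This is a simplified illustration and not s3cmd's code: the names md5_hex and needs_upload are invented here, the file content is passed in as bytes for brevity, and the remote entry is assumed to be a small dict with "size" and "md5" keys.

```python
import hashlib


def md5_hex(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def needs_upload(local_data, remote_entry):
    """Decide whether a local file should be uploaded.

    `remote_entry` is None when the file is absent remotely, otherwise a
    dict with "size" and "md5" keys (an assumed shape for this sketch).
    """
    if remote_entry is None:
        return True  # not present at the destination
    if len(local_data) != remote_entry["size"]:
        return True  # cheap check first: sizes differ
    # Same size, so fall back to comparing checksums.
    return md5_hex(local_data) != remote_entry["md5"]
```

Checking the size before the checksum is just a shortcut in this sketch: if the sizes differ there is no need to hash anything.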
As you can see, only the files that we haven’t uploaded yet, that is those from dir2, were now sync’ed. Now modify for instance dir1/file1-2.txt and see what happens. In this run we’ll first check with --dry-run to see what would be uploaded. We’ll also add --delete-removed to get a list of files that exist remotely but are no longer present locally (or perhaps just have different names here):
~/demo$ s3cmd sync --dry-run --delete-removed ~/demo/ s3://s3tools-demo/some/path/
delete: s3://s3tools-demo/some/path/file1-1.txt
delete: s3://s3tools-demo/some/path/file1-2.txt
upload: ~/demo/dir1/file1-2.txt -> s3://s3tools-demo/some/path/dir1/file1-2.txt
WARNING: Exitting now because of --dry-run
So there are two files to delete – they’re those that were uploaded without dir1/ prefix in one of the previous examples. And also one file to be uploaded — dir1/file1-2.txt, the file that we’ve just modified.
Sometimes you don’t want to compare checksums and sizes of the remote vs local files and only want to upload those that are new. For that use the --skip-existing option:
~/demo$ s3cmd sync --dry-run --skip-existing --delete-removed ~/demo/ s3://s3tools-demo/some/path/
delete: s3://s3tools-demo/some/path/file1-1.txt
delete: s3://s3tools-demo/some/path/file1-2.txt
WARNING: Exitting now because of --dry-run
See? Nothing to upload in this case because dir1/file1-2.txt already exists in S3. With a different content, indeed, but --skip-existing only checks for the file presence, not the content.
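The effect of --skip-existing and --delete-removed on the two dry runs above can be modelled with a small planning function. Again this is only a sketch under assumptions: plan_sync is a made-up name, and both file lists are represented as dicts mapping a relative path to some fingerprint (for instance an md5 string).

```python
def plan_sync(local, remote, skip_existing=False, delete_removed=False):
    """Return (uploads, deletes) for a local-to-remote sync.

    `local` and `remote` map relative paths to a fingerprint such as an
    md5 string. Illustrative only; not s3cmd's implementation.
    """
    uploads = []
    for path in sorted(local):
        if path not in remote:
            uploads.append(path)  # new file, always uploaded
        elif not skip_existing and local[path] != remote[path]:
            uploads.append(path)  # exists remotely but content differs
    deletes = []
    if delete_removed:
        # Remote files with no local counterpart.
        deletes = sorted(p for p in remote if p not in local)
    return uploads, deletes
```

With a modified dir1/file1-2.txt and two stray remote files, the function schedules one upload and two deletes; with skip_existing=True the upload disappears because only the presence of the file is checked, not its content.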
Download from S3
Download from S3 with get and sync works pretty much along the same lines as explained above for upload. All the same rules apply and I’m not going to repeat myself. If in doubt, run your command with --dry-run. If still in doubt, ask on the mailing list for help :-)
Filtering with --exclude / --include rules
Once the list of source files is compiled it is filtered through a set of exclude and include rules, in this order. That’s quite a powerful way to fine-tune your uploads or downloads: you can for example instruct s3cmd to back up your home directory but not the JPG pictures (exclude pattern), except those whose names begin with a capital M and contain a digit. Those you do want to back up (include pattern).
S3cmd has one exclude list and one include list. Each can hold any number of filename match patterns. For instance, in the exclude list the first pattern could be “match all JPG files” and the second one “match all files beginning with letter A”, while the include list may hold just one pattern (or none, or two hundred) saying “match all GIF files”.
There are a number of options available for putting patterns into these lists.
- --exclude / --include: standard shell-style wildcards; enclose them in apostrophes to avoid their expansion by the shell. For example --exclude 'x*.jpg' will match x12345.jpg but not abcdef.jpg.
- --rexclude / --rinclude: regular expression version of the above. A much more powerful way to create match patterns. I realise most users have no clue about RegExps, which is sad. Anyway, if you’re one of them and can get by with shell-style wildcards just use --exclude/--include and don’t worry about --rexclude/--rinclude. Or read some tutorial on RegExps; such knowledge will come in handy one day, I promise ;-)
- --exclude-from / --rexclude-from / --(r)include-from: instead of having to supply all the patterns on the command line, write them into a file and pass that file’s name as a parameter to one of these options. For instance --exclude '*.jpg' --exclude '*.gif' is the same as --exclude-from pictures.exclude where pictures.exclude contains these three lines:
# Hey, comments are allowed here ;-)
*.jpg
*.gif
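To make the file format concrete, here is a rough sketch of how such a pattern file could be read. The function name read_patterns is made up, and the exact rules (blank lines skipped, a comment being any line that starts with #) are my assumptions for the example rather than a description of s3cmd's parser.

```python
def read_patterns(text):
    """Extract patterns from the text of an exclude/include file.

    Assumed rules for this sketch: one pattern per line, blank lines are
    ignored, and lines starting with "#" are comments.
    """
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        patterns.append(line)
    return patterns
```

Fed the three lines of pictures.exclude shown above, this returns the two patterns *.jpg and *.gif, the same as passing them with two --exclude options.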
All these parameters are equal in the sense that a file excluded by an --exclude-from rule can be put back into the game by, say, an --rinclude rule.
One example to demonstrate the theory…
~/demo$ s3cmd sync --dry-run --exclude '*.txt' --include 'dir2/*' . s3://s3tools-demo/demo/
exclude: dir1/file1-1.txt
exclude: dir1/file1-2.txt
exclude: file0-2.txt
upload: ./dir2/file2-1.log -> s3://s3tools-demo/demo/dir2/file2-1.log
upload: ./dir2/file2-2.txt -> s3://s3tools-demo/demo/dir2/file2-2.txt
upload: ./file0-1.msg -> s3://s3tools-demo/demo/file0-1.msg
upload: ./file0-3.log -> s3://s3tools-demo/demo/file0-3.log
WARNING: Exitting now because of --dry-run
The upload line for dir2/file2-2.txt shows a file that has a .txt extension, i.e. matches an exclude pattern, but because it also matches the 'dir2/*' include pattern it is still scheduled for upload.
This --exclude / --include filtering is available for put, get and sync. In the future del, cp and mv will support it as well.
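The exclude-then-include order can be sketched with Python's standard fnmatch module. This is an approximation for illustration only: apply_filters is an invented name, the patterns are matched against the relative path, and I'm assuming shell-style wildcards where * may also match a slash (which is how fnmatch behaves).

```python
from fnmatch import fnmatchcase


def apply_filters(paths, exclude=(), include=()):
    """Split `paths` into (kept, dropped) using exclude, then include.

    A path is dropped only if it matches an exclude pattern and no include
    pattern puts it back. Sketch only; not s3cmd's implementation.
    """
    kept, dropped = [], []
    for path in paths:
        excluded = any(fnmatchcase(path, pat) for pat in exclude)
        rescued = excluded and any(fnmatchcase(path, pat) for pat in include)
        if excluded and not rescued:
            dropped.append(path)
        else:
            kept.append(path)
    return kept, dropped
```

Run against the demo directory with exclude=['*.txt'] and include=['dir2/*'], the three .txt files outside dir2 are dropped while dir2/file2-2.txt is kept, matching the dry-run output above. Note that in this sketch an include pattern only matters for files that were excluded first.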





Joe wrote:
Has anyone figured a way to bypass the 2gb filesize limit yet?
And, can you have multiple specific excludes? How would they be formatted?
for instance
--exclude file1.tar.gz --exclude file2.tar.gz ?
(14 March 2009, 07:24 · #)
Tony Landis wrote:
Thanks, this is extremely helpful!
(18 June 2009, 21:13 · #)
SSL wrote:
Thanks, but does anyone know if there is a way to delete non-empty buckets recursively? As of now I think it’s a waste of time to go deleting files in a bucket one by one, so I have to use some Win application to simplify that task… Any way to do this with s3cmd?
( 4 July 2009, 08:47 · #)
Graham White wrote:
I want to know about multiple excludes, too. rsync allows those, so does s3cmd?
(20 July 2009, 13:19 · #)
Michal Ludvig wrote:
Hi Graham,
yes, you can use multiple excludes and multiple includes and you can combine --exclude, --rexclude, --exclude-from and --rexclude-from in one command. All the patterns will end up in a common list. Then you can fine tune this list with any number of --[r]include[-from] patterns.
Michal
(20 July 2009, 15:35 · #)
Ross Lai wrote:
I find a problem.
If I try s3cmd as:
./s3cmd sync --delete-removed --force s3://rossbkt/ test_case/
There will be some empty directories (these directories existed on my local system but not in the S3 bucket) in test_case/
Could I use s3cmd sync to remove these empty directories automatically?
(21 September 2009, 16:25 · #)
Christian Becker wrote:
Hi, great tool!
I ran into a small annoyance, however: --skip-existing does not seem to have any effect when syncing from S3 to a local directory.
(17 December 2009, 03:03 · #)
b14ck wrote:
Hi SSL, this response is for you:
To delete non-empty buckets recursively, you can do:
s3cmd rb -r s3://bucketname
Hope that helps! Cheers!
( 6 January 2010, 18:31 · #)
Saggi Malachi wrote:
Documentation says that by default s3cmd sync compares source and destination files using md5 checksum.
Is there any other way to compare source and destination files? I’d like it to compare by filename only to get a significant performance improvement. Is that possible?
(31 January 2010, 10:13 · #)
Saggi Malachi wrote:
Oh, sorry.
--skip-existing seems to do that.
(31 January 2010, 11:21 · #)
Rick T wrote:
Has anybody found a good solution for deleting a file that has been put to s3? I don’t want to simply use a batch command, because I don’t want to delete a file that failed to be put to s3.
( 5 February 2010, 04:57 · #)
Simone wrote:
Hi,
I’m trying to backup to S3 my whole home directory (let’s say /home/user/*)
how do I write the commandline in order to have some folders excluded from the backup?
i.e. I dont want to have
/home/user/Desktop/not_needed/* and
/home/user/no_no_no/*
I try this and other variations, but it does not work:
s3cmd sync --exclude /home/user/Desktop/not_needed/* --exclude /home/user/no_no_no/* --include "/home/user/*" /home/user s3://simone-backup-blackmagic
Can you point me in the right direction?
( 8 March 2010, 21:34 · #)
Tom Coady wrote:
Brilliant tool, thanks so much. I know it’s not the tool’s fault but it seems a bit mad to have ACL rights so restricted by default. Even with the sync command I need to manually set files to public access, unless there’s some clever way to set this within sync like
s3cmd sync --acl-public ./ s3://s3tools-demo/
I just tried that and it looks like it worked but I had nothing to sync..
(16 March 2010, 07:17 · #)
Mashan wrote:
Everything is so simple when you have clear instructions.
Thank you
(20 March 2010, 18:58 · #)
Barry wrote:
Hi,
Thanks for great tool, but I seem to have a problem syncing from S3 to a local directory. Each time I run the command, ALL files are downloaded again, so it seems that no comparison with existing files is done, or else the comparison is failing.
I’m running something like:
s3cmd sync s3://mybucket/ ./mylocalcopy/
Am I mis-understanding how this works or is this a bug?
thanks, barry
(25 March 2010, 07:34 · #)
Skyler Call wrote:
How can I include the directory /var/www/vhosts/ recursively but exclude the directory /var/www/vhosts/mydomain.tld/statistics/?
(31 March 2010, 05:26 · #)
Jerry wrote:
In Windows s3cmd does not convert \ to / in paths making recursive useless on this platform.
( 1 April 2010, 10:08 · #)
Brandon Simmons wrote:
As Christian Becker reported,
s3cmd sync --skip-existing s3://foo ./bar does not actually skip existing files. Instead it looks like it overwrites them.
(30 April 2010, 11:23 · #)
rintje wrote:
First things first: this is a very fine utility indeed!
One small issue: whenever i try to abort a (long…) sync operation using CTRL + C, the application seems to hang. The message ‘See ya!’ is shown but then it does not return to the commandline.
Any thoughts?
(12 June 2010, 02:33 · #)
rintje wrote:
never mind, it was shell script problem.
(12 June 2010, 02:57 · #)
digitalquill wrote:
I use s3cmd sync to back up a web server. The problem comes when I try to back up email: it errors out when a file that was present when it indexed the folder is missing by the time it tries to back it up, i.e. either the spam filter moved it or the user of the mailbox read or deleted the file.
is there any way to get round this?
( 8 July 2010, 02:44 · #)
mberman wrote:
I would love a way to do includes before excludes, which would enable some of the scenarios described above, as well as queries like, “Sync all the text files, except those in the following folders…”
Maybe a switch to flip the order, or another set of exclude switches (--exclude-after-include, etc.)?
(30 August 2010, 08:12 · #)
d wrote:
If I have a list of files I want to backup (and nothing else), should I first -exclude ‘#.#’ and then --include-from the list?
( 9 September 2010, 20:04 · #)
yann wrote:
s3cmd sync -v --delete-removed --rexclude
often fails with :
IOError: [Errno 2] No such file or directory: u’/root/XXXX/var/lib/XXXX/XXXXX/XX/XXXXX/67XXXXX57792XXXX.gz’
Is it because the file may have changed between the moment the list is built and the execution?
The environment is :
S3cmd: 0.9.8.4
Python: 2.5.1 (r251:54863, Jul 10 2008, 17:25:56) [GCC 4.1.2 20070925 (Red Hat 4.1.2-33)]
(11 November 2010, 11:36 · #)
Michal Ludvig wrote:
Hi yann,
That s3cmd version is very very old. This bug has already been fixed, please upgrade to s3cmd 1.0.0(-rc1). See http://s3tools.org/s3cmd-100rc1-released for more info.
Michal
(11 November 2010, 12:25 · #)
HSD wrote:
Great, thanks, I wouldn’t have done it without your help
(28 November 2010, 11:01 · #)
Joel wrote:
Having some issues with the sync option and lots of image files. I am syncing 2 directories every 5 mins and the jobs are backing up on themselves. Any way to speed the process up?
s3cmd sync --skip-existing --delete-remove /storage/ftp/ s3://bucket.images/
[root@image01 ftp]# find . -type f | wc -l
98137
[root@image01 ftp]# df -h /storage/
Filesystem Size Used Avail Use% Mounted on
/dev/sdf 500G 27G 473G 6% /storage
[root@image01 ftp]# rpm -q s3cmd
s3cmd-0.9.9.91-1.3
As time goes on I see this getting worse. s3cmd is taking up all the CPU going through the files, local and remote.
Thanks
~joel
(16 December 2010, 19:33 · #)
Simon wrote:
I’m trying to exclude directories from a sync but can’t get either exclude or rexclude to work for directories. They work fine for files only.
s3cmd sync --encoding UTF-8 --rexclude "/webroot/admin/*" /webroot/* s3://s3….
also tried
s3cmd sync --encoding UTF-8 --exclude "/webroot/admin/*" /webroot/* s3://s3….
(13 January 2011, 06:55 · #)
Adam J wrote:
Has anyone found a way to automatically upload or delete modified files? I would like to use my programming IDE (Coda) to save or delete images on my webserver in the /var/www/site/images/ folder and have the files automatically added/deleted to/from the s3 server.
(19 January 2011, 12:12 · #)
Martin Leonard wrote:
Great little tool to use in cron jobs this. Have my images folder in vBulletin syncing to S3/CloudFront for faster delivery. Previously was only able to do this for “rarely changing” files, but now even user uploaded files can be served from the cloud without my input.
Awesome, will have a beer coming your way soon.
( 1 February 2011, 13:11 · #)
David wrote:
Hi just checking the s3cmd log and I find that it has uploaded 5000 files of 35000 files – perhaps I’d better do a bit of house-keeping and delete some of them!
Is there a safe way to kill s3cmd or shall I just ‘kill 11089’? – I’d rather not have file fragments filling up my buckets if I can avoid it.
David
(16 February 2011, 02:36 · #)
Justin wrote:
Hi Michael,
Using s3cmd sync, I synced a directory and by default all its subdirectories up to Amazon S3. Now, when I delete it, it quickly reappears again on refresh. How do I permanently delete these directories? Is the s3cmd sync still running on my Linux box? How do I tell it to stop?
Thanks.
Justin
(25 April 2011, 00:04 · #)
Nagaraj wrote:
Hi,
please let me know why I’m getting this error ?
s3cmd sync --include 'Log/*/dt=20110421/' --recursive s3://mybucket/ /mnt/share/s3test/
Usage: s3cmd [options] COMMAND [parameters]
s3cmd: error: no such option: --include
Also I was not able to see the include and exclude options in s3cmd --help
Please help
Thanks !
(26 April 2011, 13:57 · #)
TaSK wrote:
Nice tool.
I’m in the process of uploading my 60.000 photos. I just wonder why it seems that all initial uploads fail? The script tries again after 2, 6 and 12 seconds before succeeding (error 11). With a throttle of 0.25 I get a meager 16 kB/s.
After 20 hours I am a mere 3% (!) into my repository.
Since I hope to use sync to maintain the S3 repository, I sincerely hope syncing is (much!) faster than uploading.
(26 April 2011, 19:29 · #)
Matt wrote:
This is a great tool! Thank you so much!
I am wondering, is there any logging event that I can look for to make sure that the script executed successfully and/or uploaded/downloaded files?
I ask because I am going to try to use this to do automated backups to S3 kicked off via cron.
Be nice to get some kind of indication that my data is being uploaded!
@TaSK: S3 is PAINFULLY slow. It took me days to backup my 50GB music collection.
( 8 May 2011, 17:39 · #)
wayne wrote:
Great tool, thanks.
However, the -r flag doesn’t work as expected.
using s3cmd sync -r <bucket> <local>
All files in a directory are downloaded if directory changed.
now using s3cmd sync <bucket>/dir/ <local>/dir/
Is what to expect?
(19 May 2011, 22:24 · #)
William Denniss wrote:
I am also finding that syncing from S3 is not comparing MD5 hashes, nor skipping existing files
Using s3cmd-1.0.1 on Mac OS X 10.7.4. I downloaded 11,147 files using sync, then interrupted and started it again. The bucket in question had versioning enabled. This is what I got:
$ s3cmd sync --verbose s3://mybucket myfolder
INFO: Compiling list of local files…
INFO: Retrieving list of remote files for s3://mybucket/ …
INFO: Found 30993 remote files, 11147 local files
INFO: Applying --exclude/--include
INFO: Verifying attributes…
INFO: Summary: 30993 remote files to download, 11147 local files to delete
As you can see it’s wanting to delete my already downloaded files!
--skip-existing does skip the existing local files if they exist, but this isn’t a perfect solution as it does not replace corrupted files (caused for example if you interrupt `s3cmd sync`), or out of date files.
This is easy to test, simply sync a remote bucket locally. Interrupt the transfer, and try it again. Theoretically s3cmd sync should only re-download new and corrupt files, not all previous ones.
(17 June 2011, 18:07 · #)
William Denniss wrote:
SOLVED! I worked out why the md5 comparison wasn’t being done in my previous post. It’s all to do with the trailing slash.
If you have:
$ s3cmd sync --verbose s3://mybucket myfolder
Change it to:
$ s3cmd sync --verbose s3://mybucket/ myfolder/
(note the trailing slashes).
Then, the MD5 hashes are compared and everything works correctly! --skip-existing works as well.
To recap, both --skip-existing and md5 checks won’t happen if you use the first command, and both work if you use the second (I made a mistake in my previous post, as I was testing with 2 different directories).
(17 June 2011, 19:35 · #)
Brandon wrote:
Are there special permissions the user needs to have for sync? Put works just fine (local file up to S3), but sync gives a 403 from S3 unless the user (via IAM) is an admin. Even giving the user s3:* doesn’t seem to work.
s3cmd put --recursive /mnt/incoming/ s3://bucket/incoming/
/mnt/incoming/file3.txt -> s3://bucket/incoming/file3.txt [1 of 1]
but
s3cmd sync --recursive /mnt/incoming/ s3://bucket/incoming/
ERROR: S3 error: 403 (AccessDenied): Access Denied
(24 June 2011, 11:05 · #)
Brandon wrote:
Follow up, it seems related to the bucket the user has permissions to?
So put will work, but sync will not, if the policy (via IAM) is "Resource": "arn:aws:s3:::bucket/incoming/*"
If you change it to your entire S3 bucket range, sync does work: "Resource": "arn:aws:s3:::*"
which, if true, is an issue since that opens things up way too far. Can anyone else confirm?
(24 June 2011, 11:24 · #)
Michal Ludvig wrote:
Hi Brandon,
That’s exactly true: the user needs r/w access to the bucket. Sync doesn’t read each and every file; instead it queries the bucket for the list of files and their attributes. For that it needs read access to the bucket. And obviously for pushing files to the bucket it needs write access.
As far as I know bucket-level access restricted to certain prefixes (e.g. only to /incoming/) is not possible in S3.
I suggest you create an “incoming bucket” for your 3rd party user and copy files from there to your private bucket with s3-to-s3 sync (well supported by s3cmd).
Michal
(24 June 2011, 11:44 · #)
Joseph wrote:
The s3cmd bucket sync takes a lot of system resources, which is not so good.
The way I use for buckets sync is mentioned here:
http://www.admon.org/sync-two-amazon-s3-buckets/
( 3 August 2011, 15:28 · #)
Paolo Ciccone wrote:
Thank you very much for making this available. It completely solved my backup to S3 problems. Very nice and generous of you.
(19 August 2011, 05:07 · #)
Ismael Olea wrote:
Question:
How is the md5 checksum of the upstream file obtained? I mean: is the md5 checksum of the upstream file computed locally, or does s3cmd only get the md5 signature from an S3 property/attribute?
The second way is obviously much faster, but I don’t know whether S3 maintains an accessible, automatically generated md5, or whether s3cmd stores it in the bucket.
( 9 September 2011, 23:11 · #)
Paris wrote:
Question:
Can it sync two S3 buckets? And if yes, does it go through via COPY object or get/put?
The reason I am asking is because COPYing an object from S3-to-S3 bucket has no bandwidth costs if both buckets are in the same region.
Thanks
( 4 October 2011, 22:56 · #)
zitian wrote:
Hi,
I have trouble in configure.
Test access failed and the error message is:
‘ERROR: Test failed: 403 (AccessDenied): Access Denied’
I am sure I entered the right keys in the configure.
Can you help please?
Thanks,
(11 November 2011, 07:21 · #)
zitian wrote:
FYI
My problem fixed.
This was because I only have access permission to a sub-dir of the bucket.
‘s3cmd ls’ does not work
‘s3cmd ls s3://path’ works
Thanks
(12 November 2011, 06:25 · #)
Tom wrote:
Thank you William Dennis! http://s3tools.org/s3cmd-sync#c001212
That was exactly what I was after… Now it’s not killing the system on every sync. You MUST use trailing slashes.
For others use as an example, I use this actively as a cron task, to sync an S3 bucket to an EC2 in another region.
Careful with cron, as the task can, and will take longer than expected sometimes, get around this with ‘lockrun’ unix utility:
Cron set to run every 30 mins, with no overlap if it overextends.
/usr/bin/lockrun --lockfile=/tmp/s3_backup.lockrun -- sh -c "(( /usr/bin/s3cmd sync --verbose --recursive s3://<production_bucket_name>/ /db/s3sync/<local file cache>/ ) &> /db/s3sync/logs/<production_bucket_name>.log)"
Hope that helps someone following the same path.
(23 November 2011, 13:55 · #)
paulwintech wrote:
Hi,
Is server-side encryption supported in s3cmd? Thanks
Paul
(10 December 2011, 22:49 · #)
William Smith wrote:
How do the MD5 signatures get calculated? Whenever I run this, a bunch of files are copied again and again, because their signatures don’t match.
Centos 5.5, kernel 2.6.18
I get, for instance:
WARNING: MD5 signatures do not match: computed=f449a48b7e78cea7453428976aaefe33, received="a510874e347435e483493931ee020b60-13"
s3://s3nas.compusmiths.com/Quicken/QuickenData/BACKUP/QDATA-2011-11-27.PM03.26.QDF-backup -> /mnt/NAS/Finances/Quicken/BACKUP/QDATA-2011-11-27.PM03.26.QDF-backup
and neither of those match md5sum:
7c0dd8d3632f1e32c9ee08dcc1896515 /mnt/NAS/Finances/Quicken/BACKUP/QDATA-2011-11-27.PM03.26.QDF-backup
or the md5 sum in the bucket:
s3cmd ls --list-md5 s3://s3nas.compusmiths.com/Quicken/QuickenData/BACKUP/QDATA-2011-11-27.PM03.26.QDF-backup
2012-01-13 13:19 67245381 3c82b0b6d43c98de8c4f3780ae74c4b7-13 s3://s3nas.compusmiths.com/Quicken/QuickenData/BACKUP/QDATA-2011-11-27.PM03.26.QDF-backup
Any thoughts as to what I’m doing wrong?
Thanks!
(14 January 2012, 07:05 · #)
William Smith wrote:
It seems to be only the “big” files, 50 and 20 MB in size, and the returned MD5 sum is
156ed2c9b987040c7cd890a78d76a3f7-10
or
f07c594c302c72a8f62daa924dab5a14-5
instead of
701a5e18ea211ff3db17c0cfcf8dd158
So maybe it’s something to do with the -xx at the end of the reported MD5?
(14 January 2012, 07:25 · #)
Zachary Roadhouse wrote:
Thanks for the tool! I got a bit stuck when trying to use the --include filter. Apparently it is only applied if you also set an --exclude filter (found by examining the FileLists.py source). This was a bit counterintuitive. I would expect that if I set an include filter, all other files are excluded.
(25 January 2012, 08:37 · #)
eric festinger wrote:
hello
My “s3sync exclude” file contains:
ache
.cache
[tT]rash
metafiles
.thumbnails
.gvfs
lost+found
tmp
no_backup
Thumbs.db*
but files like these are still uploaded :
INFO: Sending file ‘/home/eric/.local/share/Trash/files/20120131-194158-IMG_5986 (small).jpg’, please wait…
(It seems it does the job perfectly for the other excluded files)
Any thoughts or clues?
Thanks in advance
( 2 February 2012, 08:27 · #)
eric festinger wrote:
oops, bold appeared in my comment… The first line of my “s3sync exclude” file is:
(star)[cC]ache(star)
( 2 February 2012, 08:30 · #)
eric festinger wrote:
Got no answer :-( but managed to do it, adding a trailing “/*” to the folders I wanted to exclude.
(13 February 2012, 01:00 · #)
Russell wrote:
Any plans to add s3 to s3 bucket sync?
(15 February 2012, 07:59 · #)
Chandra wrote:
Hi,
I upload a tar.gz file using s3cmd-1.0.1beta version.
When i download it i am getting the following warning.
WARNING: MD5 signatures do not match: computed=9abc12977dd377b43e2e0a5ee17d17c8, received="50927b67b64276dbc72c431b219ea78c-4"
Does this affect the file??
Thanks
Chandra
(24 February 2012, 07:30 · #)
EricB wrote:
There’s a typo in the upload example. The text says “without” a trailing slash, yet the sample command includes the slash after “path”.
(quote)
To upload a directory and keep its name on the remote side specify the source without the trailing slash:
~/demo$ s3cmd put -r dir1 s3://s3tools-demo/some/path/
(unquote)
( 2 March 2012, 10:13 · #)
Eriks Goodwin-Pfister wrote:
I use this command to manually sync my Documents and Pictures directory on my CentOS 6 workstation:
s3cmd sync -r -v —exclude ‘‘ —include ‘/Documents/*’ —include ‘/Pictures/‘ /home s3://eriks_workstation/
By using the “exclude” first, it prevents all the crap cache files and other assorted trash from being copied. I also exclude my Downloads directories by default because I do not need backups of downloaded ISO images. :-)
The other advantage to this setup is that it copies all the Documents and Pictures directories for all the users on the machine who have directories in the /home/ structure.
I still need to figure out the best way to format the cron job for this…. Thoughts?
( 7 April 2012, 05:07 · #)
Eriks Goodwin-Pfister wrote:
Sorry, but a lot of asterisks got clipped by the comment system. Let me try typing it this way:
s3cmd sync -r -v —exclude ‘Z’ —include ‘Z/Documents/Z’ —include ‘Z/Pictures/Z’ /home s3://eriks_workstation/
Replaces all the Z characters with an asterisk ( * ). and, of course, replace the m-dashes with a double hyphen.
( 7 April 2012, 05:17 · #)
randeep wrote:
How can I create a folder inside a bucket using s3cmd.
s3cmd mb s3://bucket_name/dir_name
failed.
ERROR: Parameter problem: Expecting S3 URI with just the bucket name set instead of ‘s3://bucket_name/dir_name’
(16 April 2012, 16:53 · #)
Nicolas wrote:
Hi, I think that sync can consume too much time and traffic, so I made this script that only puts files added within the last day
find -mtime -1 -exec /usr/local/bin/s3cmd put \{\} s3://bucket/ \;
( 5 June 2012, 04:32 · #)
Trip Denton wrote:
Great tool! I highly recommend it.
(10 July 2012, 08:50 · #)
tribalvibes wrote:
Great tool, with a plethora of options. Unfortunately, it is astoundingly slow for our simple use of syncing two s3 buckets….
s3cmd sync --delete-removed s3://buck1 s3://buck2
This is taking like one second per 100k-1mb file in a directory of 5000 files…. so more than an hour to copy around 2gb of data, running on a fast ec2 instance in the same zone as the s3 buckets. A local copy would take seconds. What is this doing that is so slow?
( 8 October 2012, 12:58 · #)
tiger wrote:
Is there any simple way to keep timestamp, stats, etc of the files uploaded to and downloaded from s3?
(11 October 2012, 04:46 · #)
Akhor wrote:
I’ve been testing the sync and it does not seems to sync symbolic links.
For example I have the structure
/blah/dir1
/blah/symlinktodir1 (obviously the symlink )
the symlinktodir1 does not get synced when running
s3cmd sync --recursive /blah/ S3:/blah/
(18 October 2012, 05:08 · #)
Attila wrote:
I confirm Akhor’s remark. I was also expecting sync to work with symbolic links seamlessly.
( 7 December 2012, 05:20 · #)
Mary wrote:
The --delete-removed option seems to ignore the --exclude option.
Even though I have excluded a directory, --delete-removed will delete any files that are in S3 but are not in the directory.
In the command below I have excluded the consumer and merchant directories, however,
s3cmd still deletes files in those directories.
Am I specifying the command correctly?
s3cmd sync \
--delete-removed \
--dry-run \
--recursive \
--reduced-redundancy \
--guess-mime-type \
--acl-public \
--add-header='Cache-Control':'public,max-age=7200' \
--exclude '.svn*/*' \
--exclude '/.*' \
--exclude '/.*/' \
--exclude 'fonts/*' \
--exclude 'swf/*' \
--exclude 'images/consumer/*' \
--exclude 'images/merchant/*' \
'system/' \
's3://the-system/'
(11 December 2012, 03:19 · #)
Paul wrote:
I have the same issue as Mary. The --delete-removed flag does not respect any regex that you pass. My specific situation is that I have an S3 bucket which I want to sync to a branch in a local svn repository. When the --delete-removed flag is included, since there are no .svn/ directories in my S3 bucket they are removed from my local repository as well. This makes it impossible for me to commit my updated local svn checkout back to the remote repo.
(20 December 2012, 11:31 · #)
Nic wrote:
I’ve had the “WARNING: MD5 signatures do not match: “ message here just now on a 100MB file. I ran a diff against the file I uploaded and they are identical. Obviously this might not apply to anyone, but this message is a red herring for me at least.
(30 December 2012, 02:19 · #)
Leandro wrote:
Hi there,
Thanks for the great tool. I have a question about synching which I’d like to run by you before testing and potentially breaking something or incurring heavy costs.
If I have files in S3 with the Glacier storage class, will sync successfully calculate the target MD5 and thus avoid re-uploading?
(17 January 2013, 11:36 · #)
WyriHaximus wrote:
@Leandro Just checked it and both with and without MD5 check it tries to reupload the files already on Glacier.
(21 January 2013, 07:28 · #)
Matt wrote:
This is the first project where we use s3cmd on AWS. I have a quick question, if anyone could give a suggestion. Basically, I have a process that copies files from an S3 bucket to our auto-scale instances at boot, and sometimes the process fails because of a timeout. I’m not sure if the network is not ready or if it is something else. It also looks like it retried 5 times, with the wait growing by 3 seconds each time.
Here’s the log in my mail:
INFO: Compiling list of local files…
INFO: Retrieving list of remote files for s3://flw-china-website/ …
WARNING: Retrying failed request: / ([Errno 110] Connection timed out)
WARNING: Waiting 3 sec…
WARNING: Retrying failed request: / ([Errno 110] Connection timed out)
WARNING: Waiting 6 sec…
WARNING: Retrying failed request: / ([Errno 110] Connection timed out)
WARNING: Waiting 9 sec…
WARNING: Retrying failed request: / ([Errno 110] Connection timed out)
WARNING: Waiting 12 sec…
WARNING: Retrying failed request: / ([Errno 110] Connection timed out)
WARNING: Waiting 15 sec…
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
An unexpected error has occurred.
Please report the following lines to:
s3tools-bugs@lists.sourceforge.net
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Problem: S3RequestError: Request failed for: /
S3cmd: 1.0.1
Traceback (most recent call last):
  File "/usr/bin/s3cmd", line 2006, in <module>
    main()
  File "/usr/bin/s3cmd", line 1950, in main
    cmd_func(args)
  File "/usr/bin/s3cmd", line 1213, in cmd_sync
    return cmd_sync_remote2local(args)
  File "/usr/bin/s3cmd", line 929, in cmd_sync_remote2local
    remote_list = fetch_remote_list(args[:-1], recursive = True, require_attribs = True)
  File "/usr/bin/s3cmd", line 276, in fetch_remote_list
    objectlist = _get_filelist_remote(uri)
  File "/usr/bin/s3cmd", line 713, in _get_filelist_remote
    response = s3.bucket_list(remote_uri.bucket(), prefix = remote_uri.object(), recursive = recursive)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 186, in bucket_list
    response = self.bucket_list_noparse(bucket, prefix, recursive, uri_params)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 210, in bucket_list_noparse
    response = self.send_request(request)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 487, in send_request
    return self.send_request(request, body, retries - 1)
(22 January 2013, 12:35 · #)
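If the timeouts really do come from the network not being ready at boot (the log above does not confirm this), one possible workaround is to wrap the sync in an outer retry loop whose wait grows the same way as the 3, 6, 9 second pattern in the log. This is only a minimal sketch: retry_with_backoff and its arguments are hypothetical names and not part of s3cmd.

```shell
# Hypothetical helper (not part of s3cmd): run a command up to MAX_TRIES times,
# sleeping STEP, 2*STEP, 3*STEP ... seconds between failed attempts.
# Usage: retry_with_backoff MAX_TRIES STEP command [args...]
retry_with_backoff() {
    max_tries=$1
    step=$2
    shift 2
    attempt=1
    while true; do
        "$@" && return 0              # command succeeded: stop retrying
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        sleep $(( attempt * step ))   # wait grows with each failed attempt
        attempt=$(( attempt + 1 ))
    done
}
```

It could then be called from the boot script as, for example, retry_with_backoff 10 3 s3cmd sync s3://flw-china-website/ /path/to/local/dir/ where the local directory is a placeholder. Because s3cmd already retries internally, each outer attempt may itself take a while before it fails.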
Tom wrote:
I can confirm issue at #68.
My daily backups go to S3 and are meant to be overwritten regularly. They have the date in their filename, but the regular sync command works fine for this: it removes old ones when I remove them from the filesystem.
However, my monthly backups are meant to be stored in Glacier and are removed from local storage. They’re prefixed with “monthly”, so using a regular expression I should be able to exclude them from deletion after they’re gone from the filesystem, but this doesn’t happen.
I wish I could do
s3cmd sync --no-delete-removed backup s3://backup
s3cmd sync --delete-removed --exclude='monthly*' backup s3://backup
(28 January 2013, 00:16 · #)
Leandro wrote:
@WyriHaximus, actually I think it works fine.
I have a large number of files already synched to S3 with Glacier as storage class (by virtue of Lifecycle Management). My source files don’t change, so they’re either present in S3 or they’re not.
When I use s3cmd similar to the following, I’m explicitly ignoring the MD5, but it does detect correctly whether the files are in S3, and only uploads the delta:
s3cmd --no-encrypt --recursive --no-check-md5 --no-delete-removed sync /my/source/dir/ s3://my-bucket
Furthermore, I ran an “ls” command requesting the MD5 hash of some files in Glacier storage, and it returns a hash. I didn’t validate this hash, but I think it makes sense for AWS to expose the proper value.
In conclusion, I think my use case works just fine.
(28 January 2013, 07:32 · #)
Jonathan Leung wrote:
Is there a way to move the file to another directory after it is pushed to S3?
(15 April 2013, 09:14 · #)
Cheryl Achunghall wrote:
I’m running the following:
s3cmd -r -v --delete-removed --exclude="proc/*" --include="/proc/" sync / s3://mybucket/mydir/
However, s3cmd sync still attempts to traverse /proc.
INFO: Sending file ‘/proc/1/auxv’, please wait…
This results in the following:
Problem: IOErr: [Errno 29] Illegal seek
S3cmd: 1.1.0-beta1
How is this happening if I have the exclude in place? I’ve checked for symbolic links that point to /proc and there are none.
(17 April 2013, 00:03 · #)
Cheryl Achunghall wrote:
UPDATE
For the purpose of syncing top-level directories, I actually scripted around the exclude option not working.
# Scripting around failed exclusions issue in s3cmd
while read LINE
do s3cmd -r -v --delete-removed sync ${ROOTDIR}/${LINE}/ ${S3BUCKET}/${LINE}/
done <<< "$(ls ${ROOTDIR} | grep -vP 'dev|lost\+found|media|mnt|proc|selinux|srv|sys|tmp')"
This not only removed the need to exclude, but also seemed to index the individual top-level directories a lot quicker than when syncing the whole file system with exclusions.
Hope this helps someone!
(17 April 2013, 03:00 · #)
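A variation on the loop above, for anyone who would rather not parse the output of ls: note that the grep pattern there also drops any directory whose name merely contains one of the listed words (for example a directory called "devices"). The sketch below matches the skipped names exactly and only prints the commands, so they can be reviewed first. print_sync_commands and its three arguments are hypothetical names, and it assumes directory names without whitespace.

```shell
# Hypothetical helper: print one s3cmd sync command per top-level directory
# of ROOT, skipping the exact names listed in SKIP (space-separated).
# Nothing is executed; the commands are only printed for review.
# Usage: print_sync_commands ROOT BUCKET_URI "name1 name2 ..."
print_sync_commands() {
    root=$1
    bucket=$2
    skip=$3
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue        # glob matched nothing
        name=$(basename "$dir")
        case " $skip " in
            *" $name "*) continue ;;     # exact match on a skipped name
        esac
        echo "s3cmd -r -v --delete-removed sync $root/$name/ $bucket/$name/"
    done
}
```

For example, print_sync_commands /data s3://mybucket/mydir "proc sys tmp" prints one line for each directory under /data other than proc, sys and tmp (the /data path is a placeholder). Hidden directories are not matched by the glob and would need to be handled separately.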
Anju wrote:
Am getting this error. Please let me know the proper syntax.
s3cmd sync --delete-removed --include /home --include '/tmp/' s3://ec2-backup-us-east/arista3/
ERROR: Not enough paramters for command ‘sync’
(10 May 2013, 21:11 · #)
EnnPee wrote:
Hi, I have a Red Hat server running on AWS and want to back up my website to Amazon S3 every 4 hours. The size of my web server is around 30 GB. I tested a small directory upload using the following command:
"s3cmd sync /var/www s3://bucket_name" and it synced. What my auditor wants is to see the backups taken every 4 hours listed as separate backup images, and to be able to restore a specific point-in-time backup image.
"S3cmd sync S3://<bucket_name>" keeps syncing all new files created every 4 hours using a cron job. But how can I make each run available as a separate backup image without duplicating the full content, which is a waste of storage?
I know I could use the AMI tools provided by AWS to create volume snapshots every 4 hours to achieve this, but I am exploring whether s3cmd can be used for this purpose.
(26 May 2013, 20:49 · #)