I'm having trouble using * in the AWS CLI to select a subset of files from a certain bucket. Adding * to the path like this does not seem to work:
aws s3 cp s3://data/2016-08* .
To download multiple files from an S3 bucket to your current directory, you can use the --recursive, --exclude, and --include flags. The order of the parameters matters.
Example command:
aws s3 cp s3://data/ . --recursive --exclude "*" --include "2016-08*"
For more info on how to use these filters: http://docs.aws.amazon.com/cli/latest/reference/s3/#use-of-exclude-and-include-filters
The Order of the Parameters Matters
The --exclude and --include flags must be given in a specific order: first exclude, then include. The reverse order will not work.
aws s3 cp s3://data/ . --recursive --include "2016-08*" --exclude "*"
This copies nothing, because filters that appear later in the command take precedence over earlier ones: the trailing --exclude "*" excludes everything the --include had matched.
aws s3 cp s3://data/ . --recursive --exclude "*" --include "2016-08*"
This one works because we first excluded everything and then re-included only the keys matching 2016-08*.
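To make the ordering rule concrete, here is a minimal shell sketch that imitates how the filters are applied, assuming keys are matched relative to the source prefix. The is_selected function is a hypothetical helper written for this illustration; it is not part of the AWS CLI and only mimics the documented "later filters take precedence" behaviour with shell glob matching.

```shell
#!/bin/sh
# Hypothetical helper (not an AWS CLI command): returns success if KEY
# would still be selected after applying the filters in order.
# Usage: is_selected KEY [--exclude PATTERN | --include PATTERN]...
is_selected() {
  key=$1; shift
  selected=yes                 # by default, all files are included
  while [ "$#" -ge 2 ]; do
    flag=$1 pattern=$2
    shift 2
    case $key in
      $pattern)                # unquoted, so it is treated as a glob
        case $flag in
          --exclude) selected=no ;;
          --include) selected=yes ;;
        esac ;;
    esac
  done
  [ "$selected" = yes ]
}

# Compare both orderings on two made-up sample keys.
for key in 2016-08-01.csv 2016-07-31.csv; do
  if is_selected "$key" --exclude "*" --include "2016-08*"; then
    echo "exclude-then-include copies: $key"
  fi
  if is_selected "$key" --include "2016-08*" --exclude "*"; then
    echo "include-then-exclude copies: $key"
  fi
done
# prints only: exclude-then-include copies: 2016-08-01.csv
```

Each filter that matches a key overwrites the decision made by the ones before it, so whichever matching filter comes last wins. That is why a trailing --exclude "*" wipes out an earlier --include.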
Okay, I have to say the example is wrong and should be corrected as follows:
aws s3 cp . s3://data/ --recursive --exclude "*" --include "2006-08*" --exclude "*/*"
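The . needs to be right after the cp when the current directory is the source (with cp, the first path is the source and the second is the destination, so this form uploads from the current directory to the bucket). The final --exclude is to make sure that nothing is picked up from any subdirectories that are picked up by the --recursive (learned that one by mistake...)
This should work for anyone still struggling with this by the time they get here.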
If you get an error when using * in the path, you can use the --recursive, --exclude, and --include flags instead, like this:
aws s3 cp s3://myfiles/ . --recursive --exclude "*" --include "file*"
--exclude "*" isn't a typo. If you don't add it, the --include has no effect. As per the documentation: "Note that, by default, all files are included. This means that providing only an --include filter will not change what files are transferred. --include will only re-include files that have been excluded from an --exclude filter. If you only want to upload files with a particular extension, you need to first exclude all files, then re-include the files with the particular extension."
You can also use sync for a similar effect, which is recursive by default:
aws s3 sync s3://data/ . --exclude "*" --include "2016-08*"
Add the --dryrun flag to make sure the correct set of objects is selected by the wildcard.
With cp, the --recursive flag is required, or it will not work.