
AWS S3: how do I see how much disk space is in use?

I have an AWS account. I'm using S3 to store backups from different servers. Is there any information in the AWS console about how much disk space is in use in my S3 cloud?

You have to get all objects, then sum up all the file sizes. You can't do it in a single operation.
It's strange that there is no real solution to this problem. Going through all items and calculating is not a solution if you have tens of millions of files! In AWS's S3 UI you can easily see the usage under Management -> Metrics. Why isn't there a way to get this from the command line?

Asclepius

The command line tool gives a nice summary by running:

aws s3 ls s3://mybucket --recursive --human-readable --summarize
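With --summarize, the listing ends with the totals. The output looks roughly like this (the objects and sizes are illustrative):

2023-05-12 10:04:51    1.5 MiB backups/db.sql.gz
2023-05-12 10:05:02  250.0 MiB backups/files.tar.gz

Total Objects: 2
   Total Size: 251.5 MiB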

Yours and Christopher's are by far the best answers.
Much simpler solution than the accepted answer. Thanks!
This does not show the true size with versions. Is there a way to check the total size of the S3 bucket with all versions?
Print the total size of each of your buckets:

for b in $(aws s3 ls | awk '{ print $NF }'); do printf "$b "; aws s3 ls s3://$b --recursive --human-readable --summarize | tail -1; done
Mini John

Yippee - an update to the AWS CLI allows you to recursively ls through buckets...

aws s3 ls s3://<bucketname> --recursive  | grep -v -E "(Bucket: |Prefix: |LastWriteTime|^$|--)" | awk 'BEGIN {total=0}{total+=$3}END{print total/1024/1024" MB"}'

print total/1024/1024/1024*.03 gives a nice estimate for $ usage if you are under 1TB. @cudds awesomeness - thanks a ton!!!
You don't need the grep part if you ls a single bucket.
AWS Cloudwatch now has a metric for bucket size and number of objects that is updated daily. About time! aws.amazon.com/blogs/aws/…
Example:

aws cloudwatch get-metric-statistics --namespace AWS/S3 --start-time 2015-07-15T10:00:00 --end-time 2015-07-31T01:00:00 --period 86400 --statistics Average --region eu-west-1 --metric-name BucketSizeBytes --dimensions Name=BucketName,Value=toukakoukan.com Name=StorageType,Value=StandardStorage

Important: you must specify both StorageType and BucketName in the dimensions argument, otherwise you will get no results.
@SamMartin what does StorageType need to be? Also this answer takes a very long time to compute for buckets bigger than 100 GB
endriju

To find out the size of an S3 bucket using the AWS Console:

1. Click the S3 bucket name
2. Select the "Metrics" tab
3. You should see "Bucket metrics", which by default includes "Total bucket size"


This works faster when your bucket has TBs of data. The accepted answers take a lot of time to calculate over all the objects at that scale.
Note also that this will capture hanging incomplete uploads, which the ls-based solutions don't.
The fastest way to do it is this answer.
"Metrics" has its own tab for me. But yeah, this is the fastest way for me.
markusk

s3cmd can show you this by running s3cmd du, optionally passing the bucket name as an argument.
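For example (bucket name is a placeholder):

s3cmd du s3://mybucket

or, for human-readable sizes, s3cmd du -H s3://mybucket.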


FYI - I tried this and the AWS CLI version in cudds' answer. They both work fine, but s3cmd was significantly slower in the cases I tried, as of release 1.5.0-rc1.
@DougW: Thanks, useful info. AWS CLI 1.0.0 was released in September 2013, so it didn't exist at the time I wrote my answer.
s3cmd doesn't support AWS4 hashing so it won't work with any new regions, including the EU region "eu-central-1"
@Koen.: Thanks, I was not aware of this. Seems the s3cmd maintainer is looking into adding support for AWS4: github.com/s3tools/s3cmd/issues/402
@Koen.: s3cmd now supports AWS4 hashing as of 1.5.0, which was released 2015-01-12. See s3tools.org/news.
Christopher Hackett

The AWS CLI now supports the --query parameter, which takes JMESPath expressions.

This means you can sum the size values given by list-objects using sum(Contents[].Size) and count the objects with length(Contents[]).

This can be run using the official AWS CLI as below, and was introduced in Feb 2014:

 aws s3api list-objects --bucket BUCKETNAME --output json --query "[sum(Contents[].Size), length(Contents[])]"

I had to use double quotes around the query string in the Windows command line. Works like a champ though.
Beware: if the bucket is empty the command would fail with the following error: In function sum(), invalid type for value: None, expected one of: ['array-number'], received: "null" Otherwise the query works great!
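A possible workaround for the empty-bucket case (an untested sketch) is JMESPath's || fallback operator, which substitutes a zero-element array when Contents is null; note the single quotes, so the shell doesn't treat the backticks as command substitution:

aws s3api list-objects --bucket BUCKETNAME --output json --query '[sum(Contents[].Size || `[0]`), length(Contents[] || `[]`)]'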
AWS documentation indicates that this command works well in most cases when you need to get the size of a bucket. However, it's not suitable for automation: with buckets that have thousands or millions of objects, the command has to iterate through the complete list before it can report the bucket size.
JScoobyCed

On a Linux box that has Python (with pip), grep and awk, install the AWS CLI (command line tools for EC2, S3 and many other services):

sudo pip install awscli

then create a .awssecret file in your home folder with the content below (adjust key, secret and region as needed):

[default]
aws_access_key_id=<YOUR_KEY_HERE>
aws_secret_access_key=<YOUR_SECRET_KEY_HERE>
region=<AWS_REGION>

Make this file read-write to your user only:

sudo chmod 600 .awssecret

and export it to your environment

 export AWS_CONFIG_FILE=/home/<your_name>/.awssecret

then run in the terminal (this is a single command, split across lines with \ for readability):

aws s3 ls s3://<bucket_name>/foo/bar | \
grep -v -E "(Bucket: |Prefix: |LastWriteTime|^$|--)" | \
awk 'BEGIN {total=0}{total+=$3}END{print total/1024/1024" MB"}'

The aws part lists the bucket (or optionally a 'sub-folder').

The grep part removes (using -v) the lines that match the regular expression (using -E); ^$ matches blank lines and -- matches the separator lines in the output of aws s3 ls.

The last awk simply adds the 3rd column of the resulting output (the size in bytes) to total, then prints it in MB at the end.

NOTE: this command works for the current bucket or 'folder' only, not recursively; see the sketch below for a recursive variant.
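To cover the whole bucket, the same pipeline should work with --recursive added to the ls call, as in the earlier answer above:

aws s3 ls s3://<bucket_name> --recursive | \
grep -v -E "(Bucket: |Prefix: |LastWriteTime|^$|--)" | \
awk 'BEGIN {total=0}{total+=$3}END{print total/1024/1024" MB"}'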


rowelee

CloudWatch also allows you to create metrics for your S3 bucket. It shows metrics for size and object count. Services > Management Tools > CloudWatch: pick the region where your S3 bucket is, and the size and object count metrics will be among the available metrics.
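The same metrics are readable from the CLI as well; for example, to fetch the daily object count (bucket name and dates are placeholders; NumberOfObjects is reported under the AllStorageTypes storage type):

aws cloudwatch get-metric-statistics --namespace AWS/S3 --start-time 2015-07-15T10:00:00 --end-time 2015-07-31T01:00:00 --period 86400 --statistics Average --metric-name NumberOfObjects --dimensions Name=BucketName,Value=mybucket Name=StorageType,Value=AllStorageTypes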


Community

See https://serverfault.com/questions/84815/how-can-i-get-the-size-of-an-amazon-s3-bucket

Answered by Vic...

<?php
if (!class_exists('S3')) require_once 'S3.php';

// Instantiate the class
$s3 = new S3('accessKeyId', 'secretAccessKey');
S3::$useSSL = false;

// List your buckets:
echo "S3::listBuckets(): ";
echo '<pre>' . print_r($s3->listBuckets(), 1). '</pre>';

$totalSize = 0;
$objects = $s3->getBucket('name-of-your-bucket');
foreach ($objects as $name => $val) {
    // If you want to get the size of a particular directory, you can do
    // only that.
    // if (strpos($name, 'directory/sub-directory') !== false)
    $totalSize += $val['size'];
}

echo ($totalSize / 1024 / 1024 / 1024) . ' GB';
?>

Do you know if a gigabyte in this case is 1024^3 or 1000^3? I'm having a hard time finding a definitive S3 statement.
@dfrankow The line echo ($totalSize / 1024 / 1024 / 1024) . ' GB'; is right there at the bottom of the source code.
@MJD I don't remember what my thought was here. It was asking either about s3cmd or S3 use of the word "gigabyte", not this PHP code.
ruletkin

In addition to Christopher's answer.

If you need to count the total size of a versioned bucket, use:

aws s3api list-object-versions --bucket BUCKETNAME --output json --query "[sum(Versions[].Size)]"

It counts both Latest and Archived versions.
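If you also want the number of versions alongside the total size, the same pattern as Christopher's query should work (untested sketch):

aws s3api list-object-versions --bucket BUCKETNAME --output json --query "[sum(Versions[].Size), length(Versions[])]"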


gene_wood

Getting a large bucket's size via the API (either aws cli or s4cmd) is quite slow. Here's my HowTo explaining how to parse the S3 Usage Report using a bash one-liner (the TimedStorage-ByteHrs usage values are byte-hours, hence the division by 24 to get an average size in bytes for the day):

cat report.csv | awk -F, '{printf "%.2f GB %s %s \n", $7/(1024**3 )/24, $4, $2}' | sort -n

Geoff Appleford

The AWS console won't show you this, but you can use Bucket Explorer or CloudBerry Explorer to get the total size of a bucket. Both have free versions available.

Note: these products still have to get the size of each individual object, so it could take a long time for buckets with lots of objects.


I can only see trialware though. Has that offer been removed?
Evgeny Goldin

Based on @cudds's answer:

function s3size()
{
    for path in $*; do
        size=$(aws s3 ls "s3://$path" --recursive | grep -v -E "(Bucket: |Prefix: |LastWriteTime|^$|--)" | awk 'BEGIN {total=0}{total+=$3}END{printf "%.2fGb\n", (total/1024/1024/1024)}')
        echo "[s3://$path]=[$size]"
    done
}

...

$ s3size bucket-a bucket-b/dir
[s3://bucket-a]=[24.04Gb]
[s3://bucket-b/dir]=[26.69Gb]

Also, Cyberduck conveniently allows for calculation of size for a bucket or a folder.


user7191982

This is an old question, but since I was looking for the answer I ran across it. Some of the answers reminded me that I use S3 Browser to manage data. You can click on a bucket, hit Properties, and it shows you the total. Pretty simple. I highly recommend the browser: https://s3browser.com/default.aspx?v=6-1-1&fam=x64


Danny Schoemann

You asked: is there any information in the AWS console about how much disk space is in use in my S3 cloud?

I go to the Billing Dashboard and check the S3 usage in the current bill.

They give you the information month-to-date, in GB to 6 decimal places, in other words down to the KB level.

It's broken down by region, but adding them up (assuming you use more than one region) is easy enough.

BTW: You may need specific IAM permissions to get to the Billing information.


Community

Mini John's answer totally worked for me! Awesome... I had to add

--region eu-west-1 

since I'm in Europe though.


Yiannis Tsimalis

Well, you can also do it through an S3 client if you prefer a human-friendly UI.

I use CrossFTP, which is free and cross-platform. There you can right-click on the folder directory -> select "Properties..." -> click the "Calculate" button next to Size, and voila.


maksion

s3admin is an open-source app (UI) that lets you browse buckets, calculate total size, and show the largest/smallest files. It's tailored for getting a quick overview of your buckets and their usage.


JamesKn

So I am going to add Storage Lens from AWS on here with the default dashboard.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-lens-optimize-storage.html?icmpid=docs_s3_hp_storage_lens_dashboards

It is really super useful for identifying hidden storage costs like "incomplete multipart uploads".

It should probably now be the first port of call for answering this question, before you reach for the code.


jwadsack

I use Cloud Turtle to get the size of individual buckets. If the bucket size exceeds 100 GB, it takes some time to display the size. Cloud Turtle is freeware.


Be careful with this software. It installs extra Chrome extensions and seems to be rather spammy.