Upload Data to an Amazon S3 Bucket

This tutorial describes common and recommended methods for uploading data to an existing Amazon S3 bucket.

Typical use cases include:

  • Uploading model input data

  • Storing simulation outputs

  • Backing up data from EC2 or FSx for Lustre

  • Sharing data across AWS services

This guide assumes that the S3 bucket already exists.

Prerequisites

Before uploading data to S3, ensure that:

  • You have access to an existing S3 bucket

  • Your IAM user or role has permission to write to the bucket

  • The AWS CLI is installed and configured, if you plan to use the command-line methods
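The prerequisites above can be checked from a terminal before the first upload. The following is a minimal sketch, assuming a placeholder bucket named my-bucket; check_s3_access is a hypothetical helper name, not an AWS CLI command.

```shell
#!/usr/bin/env bash
# Sketch: confirm the CLI is installed, credentials resolve, and the bucket is reachable.
# check_s3_access is an illustrative helper name; the bucket name is a placeholder.
check_s3_access() {
    local bucket="$1"
    # Print the installed CLI version (fails if the CLI is missing).
    aws --version || return 1
    # Show which IAM user or role the current credentials map to.
    aws sts get-caller-identity || return 1
    # Exit status 0 means the bucket exists and the caller may access it.
    aws s3api head-bucket --bucket "$bucket" || return 1
}

# Usage: check_s3_access my-bucket
```

A successful run shows that the bucket is reachable with the current credentials. It does not prove write access, which is only confirmed by an actual upload.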

Upload Output Data from FSx for Lustre (Best for Large Outputs)

When FSx for Lustre is linked to S3 through a Data Repository Association (DRA) with automatic export enabled, data written to FSx is synchronized to S3 automatically.

Typical output layout on the FSx file system:

/fsx/
└── output/
    └── Test_Global_1day_c36s10/

With DRA configured:

  • Files written under the linked FSx path appear under the associated S3 bucket prefix

  • No explicit aws s3 cp command is required

This method is recommended for:

  • Large-scale model output

  • ParallelCluster workflows

  • Repeated data transfers

Note

Automatic import from S3 requires the FSx file system and the S3 bucket to be in the same AWS region. Keeping both in the same account and region is the simplest setup.
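To confirm that a DRA exists and see which events it exports automatically, the associations of a file system can be listed with the AWS CLI. This is a minimal sketch: show_dra is a hypothetical helper name and the file system ID is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: list the data repository associations of one FSx for Lustre file system.
# show_dra is an illustrative helper name; pass your own file system ID.
show_dra() {
    local fs_id="$1"
    # --query keeps only the linked paths, the lifecycle state, and the export events.
    aws fsx describe-data-repository-associations \
        --filters "Name=file-system-id,Values=${fs_id}" \
        --query 'Associations[].{FsxPath:FileSystemPath,S3Path:DataRepositoryPath,State:Lifecycle,AutoExport:S3.AutoExportPolicy.Events}' \
        --output table
}

# Usage: show_dra fs-0123456789abcdef0
```

An AutoExport column that lists events such as NEW or CHANGED indicates that those file changes are exported without further action. An empty value suggests that automatic export is not enabled for that association.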

Other Official Methods

Upload Data Using AWS CLI

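The AWS CLI is the preferred method for uploading data in most research and HPC workflows. It is scriptable, splits large files into multipart uploads automatically, and is suitable for large datasets. An interrupted directory upload can be resumed with aws s3 sync, which skips files already present in the bucket.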

Upload a Single File

Use aws s3 cp to upload an individual file:

aws s3 cp local_file.nc s3://my-bucket/path/local_file.nc

Example:

aws s3 cp emis_2020.nc s3://imi-gchp-test/emissions/emis_2020.nc

Upload a Directory Recursively

To upload an entire directory and preserve its structure:

aws s3 cp local_directory/ s3://my-bucket/path/ --recursive

Example:

aws s3 cp ExtData/ s3://acmg-input-data/ExtData/ --recursive

This method is suitable for initial uploads of structured datasets.
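A recursive upload can be restricted to certain files and previewed before anything is transferred. The sketch below assumes that only NetCDF files (*.nc) should be uploaded; upload_nc_files is a hypothetical helper name and the paths are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: recursive upload limited to *.nc files.
# upload_nc_files is an illustrative helper name; extra flags are passed through.
upload_nc_files() {
    local src="$1" dest="$2"
    shift 2
    # Filters are applied in order: exclude everything, then re-include *.nc.
    aws s3 cp "$src" "$dest" --recursive --exclude "*" --include "*.nc" "$@"
}

# Preview: upload_nc_files ExtData/ s3://my-bucket/ExtData/ --dryrun
# Upload:  upload_nc_files ExtData/ s3://my-bucket/ExtData/
```

With --dryrun the CLI prints the operations it would perform without transferring any data, which is a cheap way to check the source and destination paths first.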

Upload Data Using AWS Management Console

This method is suitable only for small files or one-off uploads.

Steps:

  1. Open the S3 service in the AWS Management Console

  2. Select the target bucket

  3. Click Upload to open the upload page

  4. Drag and drop files or select them manually

  5. Click Upload to start the transfer

Limitations:

  • Not suitable for large datasets

  • No automation

  • Browser-dependent

Common Permission Requirements

Uploading data to S3 typically requires the following IAM permissions:

s3:PutObject
s3:PutObjectAcl
s3:ListBucket

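Each permission applies to a different resource:

  • s3:ListBucket applies to the bucket itself (arn:aws:s3:::my-bucket)

  • s3:PutObject and s3:PutObjectAcl apply to the objects within the bucket (arn:aws:s3:::my-bucket/*)

s3:PutObjectAcl is needed only when an upload sets an object ACL, for example with the --acl option.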

If uploading from EC2 or ParallelCluster:

  • The instance IAM role must have these permissions

  • User permissions on your local machine do not apply
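The permissions listed above can be written as an IAM policy document. The following is a minimal sketch that prints such a policy for one bucket; print_upload_policy is a hypothetical helper name, my-bucket is a placeholder, and the statement IDs are arbitrary labels.

```shell
#!/usr/bin/env bash
# Sketch: print a minimal upload policy for a single bucket.
# print_upload_policy is an illustrative helper name; adapt the actions to your needs.
print_upload_policy() {
    local bucket="$1"
    # ListBucket targets the bucket ARN; the Put* actions target the object ARNs.
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Sid": "WriteObjects",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:PutObjectAcl"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

# Usage: print_upload_policy my-bucket > upload-policy.json
```

The resulting file can be attached to an IAM user or to the instance role used by EC2 or ParallelCluster nodes.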

Verification

To verify uploaded objects:

aws s3 ls s3://my-bucket/path/

To recursively list contents:

aws s3 ls s3://my-bucket/path/ --recursive

Note

  • Although S3 paths resemble Linux directories, S3 is not a filesystem. Object keys are flat strings, and operations such as mv or rename copy each object to a new key and delete the original rather than modifying directory metadata.

  • Include the trailing / when operating on a “folder”. With aws s3 ls, the trailing / lists the contents of that prefix; without it, the command matches every key that begins with the given string (for example, both path/ and path-old/) and shows the folder only as a PRE entry. Without --recursive, aws s3 cp treats the path as the key of a single object.

  • When applying an operation to all objects under a prefix, also include --recursive; otherwise, the command will not descend into the pseudo-directory.
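In scripts, the trailing / is easy to forget when a path comes from a variable. The sketch below normalizes a path so that it ends with exactly one /; as_prefix is a hypothetical helper name, and the input is assumed to include at least a bucket name.

```shell
#!/usr/bin/env bash
# Sketch: make an S3 "folder" path end with exactly one trailing slash.
# as_prefix is an illustrative helper name; it only edits the string.
as_prefix() {
    local path="$1"
    # Strip any trailing slashes, then append a single one.
    while [ "${path%/}" != "$path" ]; do
        path="${path%/}"
    done
    printf '%s/\n' "$path"
}

# Usage: aws s3 ls "$(as_prefix s3://my-bucket/path)" --recursive
```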

Checking S3 Bucket Size (for FSx Planning)

Before importing data from S3 into FSx, determine the total logical size of the S3 bucket or prefix. Use this value to size the FSx storage capacity.

Use the AWS console:

  1. Go to your S3 bucket

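  2. Open the Metrics tab

  3. Check Total bucket size

This metric covers the entire bucket and is updated about once a day, so recent uploads may not be reflected yet.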
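The console reports the size of the whole bucket. For a single prefix, the size can be computed with the AWS CLI instead. This is a minimal sketch: s3_prefix_bytes is a hypothetical helper name and the path is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: total size in bytes of all objects under one S3 prefix.
# s3_prefix_bytes is an illustrative helper name.
s3_prefix_bytes() {
    local prefix="$1"
    # --summarize appends "Total Objects:" and "Total Size:" lines; without
    # --human-readable the size is reported in bytes.
    aws s3 ls "$prefix" --recursive --summarize \
        | awk '/Total Size:/ {print $3}'
}

# Usage: s3_prefix_bytes s3://my-bucket/path/
```

This command lists every object under the prefix, so it can take a while on prefixes with very many objects.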

Notes and Best Practices

  • Prefer aws s3 sync for repeated uploads

  • Keep S3 buckets private by default

  • Upload from resources in the same AWS region as the bucket whenever possible

  • Use FSx DRA for high-throughput workflows

  • Avoid browser uploads for large or critical datasets
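For repeated uploads of the same directory, aws s3 sync transfers only files that are new or have changed compared with the destination. The following is a minimal sketch; sync_to_s3 is a hypothetical helper name and the paths are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: repeatable upload of a directory with aws s3 sync.
# sync_to_s3 is an illustrative helper name; extra flags are passed through.
sync_to_s3() {
    local src="$1" dest="$2"
    shift 2
    # sync skips files whose size and modification time match the destination.
    aws s3 sync "$src" "$dest" "$@"
}

# Preview: sync_to_s3 output/ s3://my-bucket/output/ --dryrun
# Upload:  sync_to_s3 output/ s3://my-bucket/output/
```

By default, sync does not remove objects from the bucket when files are deleted locally, so the same command can be run safely after each model run.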

Next Steps

After uploading data, you may want to:

  • Configure bucket access policies for collaborators

  • Associate the bucket with FSx for Lustre

  • Automate uploads in batch or workflow scripts