Uploading files#
The AWS SDK for Python provides a pair of methods to upload a file to an S3 bucket.
The upload_file method accepts a file name, a bucket name, and an object name. The method handles large files by splitting them into smaller chunks and uploading each chunk in parallel.
import logging
import boto3
from botocore.exceptions import ClientError
import os


def upload_file(file_name, bucket, object_name=None):
    """Upload a file to an S3 bucket

    :param file_name: File to upload
    :param bucket: Bucket to upload to
    :param object_name: S3 object name. If not specified then file_name is used
    :return: True if file was uploaded, else False
    """

    # If S3 object_name was not specified, use file_name
    if object_name is None:
        object_name = os.path.basename(file_name)

    # Upload the file; upload_file returns None, so success is signaled
    # by the absence of a ClientError
    s3_client = boto3.client('s3')
    try:
        s3_client.upload_file(file_name, bucket, object_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True
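The chunk size and degree of parallelism can be tuned through the optional Config parameter, which accepts a boto3.s3.transfer.TransferConfig. A minimal sketch follows; the threshold and concurrency values are illustrative, not recommendations.
import boto3
from boto3.s3.transfer import TransferConfig

# Illustrative tuning values; adjust for your workload
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,  # use multipart upload above 8 MB
    max_concurrency=4,                    # upload up to 4 parts in parallel
)

s3_client = boto3.client('s3')
s3_client.upload_file('FILE_NAME', 'amzn-s3-demo-bucket', 'OBJECT_NAME', Config=config)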
The upload_fileobj method accepts a readable file-like object. The file object must be opened in binary mode, not text mode.
s3 = boto3.client('s3')
with open("FILE_NAME", "rb") as f:
    s3.upload_fileobj(f, "amzn-s3-demo-bucket", "OBJECT_NAME")
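Because upload_fileobj only requires a binary, readable file-like object, it is not limited to files on disk. Below is a short sketch using an in-memory io.BytesIO buffer; the payload bytes are placeholders.
import io
import boto3

s3 = boto3.client('s3')

# Any readable binary file-like object works, including an in-memory buffer
buffer = io.BytesIO(b'example payload')
s3.upload_fileobj(buffer, "amzn-s3-demo-bucket", "OBJECT_NAME")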
The upload_file and upload_fileobj methods are provided by the S3 Client, Bucket, and Object classes. The method functionality provided by each class is identical. No benefits are gained by calling one class’s method over another’s. Use whichever class is most convenient.
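For example, assuming a bucket named amzn-s3-demo-bucket, the following three calls produce the same upload; the resource-based forms are shown for comparison.
import boto3

s3 = boto3.resource('s3')

# Client-, Bucket-, and Object-level calls are equivalent
s3.meta.client.upload_file('FILE_NAME', 'amzn-s3-demo-bucket', 'OBJECT_NAME')
s3.Bucket('amzn-s3-demo-bucket').upload_file('FILE_NAME', 'OBJECT_NAME')
s3.Object('amzn-s3-demo-bucket', 'OBJECT_NAME').upload_file('FILE_NAME')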
The ExtraArgs parameter#
Both upload_file and upload_fileobj accept an optional ExtraArgs parameter that controls how the object is stored, for example its metadata, ACL, storage class, or server-side encryption settings. The list of valid ExtraArgs settings is specified in the ALLOWED_UPLOAD_ARGS attribute of the S3Transfer object at boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS.
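To review the full list of accepted settings at a glance, the attribute can be printed directly:
from boto3.s3.transfer import S3Transfer

# Prints the names of all ExtraArgs settings accepted by the upload methods
print(S3Transfer.ALLOWED_UPLOAD_ARGS)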
The following ExtraArgs setting specifies metadata to attach to the S3 object.
s3.upload_file(
    'FILE_NAME', 'amzn-s3-demo-bucket', 'OBJECT_NAME',
    ExtraArgs={'Metadata': {'mykey': 'myvalue'}}
)
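One way to confirm the metadata was attached is a head_object call on the uploaded key; the sketch below assumes the same bucket and object names as the example above.
# User-defined metadata appears under the 'Metadata' key of the response
response = s3.head_object(Bucket='amzn-s3-demo-bucket', Key='OBJECT_NAME')
print(response['Metadata'])  # expected: {'mykey': 'myvalue'}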
The following ExtraArgs setting assigns the canned ACL (access control list) value 'public-read' to the S3 object.
s3.upload_file(
    'FILE_NAME', 'amzn-s3-demo-bucket', 'OBJECT_NAME',
    ExtraArgs={'ACL': 'public-read'}
)
The ExtraArgs parameter can also be used to set custom or multiple ACLs.
s3.upload_file(
    'FILE_NAME', 'amzn-s3-demo-bucket', 'OBJECT_NAME',
    ExtraArgs={
        'GrantRead': 'uri="http://acs.amazonaws.com/groups/global/AllUsers"',
        'GrantFullControl': 'id="01234567890abcdefg"',
    }
)
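The resulting grants can be inspected with get_object_acl; a sketch assuming the same bucket and object names as above.
# Lists the grantees and permissions applied to the object
acl = s3.get_object_acl(Bucket='amzn-s3-demo-bucket', Key='OBJECT_NAME')
print(acl['Grants'])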
The Callback parameter#
Both upload_file and upload_fileobj accept an optional Callback parameter. The parameter references a callable, such as a class instance that implements a __call__ method, that the Python SDK invokes intermittently during the transfer operation.
For each invocation, the callable is passed the number of bytes transferred in that batch. Accumulating these amounts across invocations yields a running total that can be used to implement a progress monitor.
The following Callback setting passes an instance of the ProgressPercentage class to the Python SDK. During the upload, the instance’s __call__ method will be invoked intermittently.
s3.upload_file(
    'FILE_NAME', 'amzn-s3-demo-bucket', 'OBJECT_NAME',
    Callback=ProgressPercentage('FILE_NAME')
)
An example implementation of the ProgressPercentage class is shown below.
import os
import sys
import threading


class ProgressPercentage(object):
    def __init__(self, filename):
        self._filename = filename
        self._size = float(os.path.getsize(filename))
        self._seen_so_far = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        # To simplify, assume this is hooked up to a single filename
        with self._lock:
            self._seen_so_far += bytes_amount
            percentage = (self._seen_so_far / self._size) * 100
            sys.stdout.write(
                "\r%s %s / %s (%.2f%%)" % (
                    self._filename, self._seen_so_far, self._size,
                    percentage))
            sys.stdout.flush()
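The Callback and ExtraArgs parameters can be combined in a single call. The sketch below reuses the placeholder names from the earlier examples.
import boto3

s3 = boto3.client('s3')

# Attach metadata and report progress during the same upload
s3.upload_file(
    'FILE_NAME', 'amzn-s3-demo-bucket', 'OBJECT_NAME',
    ExtraArgs={'Metadata': {'mykey': 'myvalue'}},
    Callback=ProgressPercentage('FILE_NAME'),
)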