# Copyright 2015 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
# https://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
"""Abstractions over S3's upload/download operations.

This module provides high level abstractions for efficient
uploads/downloads.  It handles several things for the user:

* Automatically switching to multipart transfers when
  a file is over a specific size threshold
* Uploading/downloading a file in parallel
* Progress callbacks to monitor transfers
* Retries.  While botocore handles retries for streaming uploads,
  it is not possible for it to handle retries for streaming
  downloads.  This module handles retries for both cases so
  you don't need to implement any retry logic yourself.

This module has a reasonable set of defaults.  It also allows you
to configure many aspects of the transfer process including:

* Multipart threshold size
* Max parallel downloads
* Socket timeouts
* Retry amounts

There is no support for s3->s3 multipart copies at this
time.


.. _ref_s3transfer_usage:

Usage
=====

The simplest way to use this module is:

.. code-block:: python

    client = boto3.client('s3', 'us-west-2')
    transfer = S3Transfer(client)
    # Upload /tmp/myfile to s3://bucket/key
    transfer.upload_file('/tmp/myfile', 'bucket', 'key')

    # Download s3://bucket/key to /tmp/myfile
    transfer.download_file('bucket', 'key', '/tmp/myfile')

The ``upload_file`` and ``download_file`` methods also accept
``**kwargs``, which will be forwarded through to the corresponding
client operation.
Here are a few examples using ``upload_file``::

    # Making the object public
    transfer.upload_file('/tmp/myfile', 'bucket', 'key',
                         extra_args={'ACL': 'public-read'})

    # Setting metadata
    transfer.upload_file('/tmp/myfile', 'bucket', 'key',
                         extra_args={'Metadata': {'a': 'b', 'c': 'd'}})

    # Setting content type
    transfer.upload_file('/tmp/myfile.json', 'bucket', 'key',
                         extra_args={'ContentType': "application/json"})

The ``S3Transfer`` class also supports progress callbacks so you can
provide transfer progress to users.  Both the ``upload_file`` and
``download_file`` methods take an optional ``callback`` parameter.
Here's an example of how to print a simple progress percentage
to the user:

.. code-block:: python

    class ProgressPercentage(object):
        def __init__(self, filename):
            self._filename = filename
            self._size = float(os.path.getsize(filename))
            self._seen_so_far = 0
            self._lock = threading.Lock()

        def __call__(self, bytes_amount):
            # To simplify we'll assume this is hooked up
            # to a single filename.
            with self._lock:
                self._seen_so_far += bytes_amount
                percentage = (self._seen_so_far / self._size) * 100
                sys.stdout.write(
                    "\r%s  %s / %s  (%.2f%%)" % (
                        self._filename, self._seen_so_far,
                        self._size, percentage))
                sys.stdout.flush()

    transfer = S3Transfer(boto3.client('s3', 'us-west-2'))
    # Upload /tmp/myfile to s3://bucket/key and print upload progress.
    transfer.upload_file('/tmp/myfile', 'bucket', 'key',
                         callback=ProgressPercentage('/tmp/myfile'))

You can also provide a TransferConfig object to the S3Transfer
object that gives you more fine grained control over the
transfer.  For example:

.. code-block:: python

    client = boto3.client('s3', 'us-west-2')
    config = TransferConfig(
        multipart_threshold=8 * 1024 * 1024,
        max_concurrency=10,
        num_download_attempts=10,
    )
    transfer = S3Transfer(client, config)
    transfer.upload_file('/tmp/foo', 'bucket', 'key')

"""

import logging
import threading
from os import PathLike, fspath, getpid

from botocore.compat import HAS_CRT
from botocore.exceptions import ClientError
from s3transfer.exceptions import (
    RetriesExceededError as S3TransferRetriesExceededError,
)
from s3transfer.futures import NonThreadedExecutor
from s3transfer.manager import TransferConfig as S3TransferConfig
from s3transfer.manager import TransferManager
from s3transfer.subscribers import BaseSubscriber
from s3transfer.utils import OSUtils

import boto3.s3.constants as constants
from boto3.exceptions import RetriesExceededError, S3UploadFailedError

if HAS_CRT:
    import awscrt.s3

    from boto3.crt import create_crt_transfer_manager

KB = 1024
MB = KB * KB

logger = logging.getLogger(__name__)


def create_transfer_manager(client, config, osutil=None):
    """Creates a transfer manager based on configuration

    :type client: boto3.client
    :param client: The S3 client to use

    :type config: boto3.s3.transfer.TransferConfig
    :param config: The transfer config to use

    :type osutil: s3transfer.utils.OSUtils
    :param osutil: The os utility to use

    :rtype: s3transfer.manager.TransferManager
    :returns: A transfer manager based on parameters provided
    """
    if _should_use_crt(config):
        crt_transfer_manager = create_crt_transfer_manager(client, config)
        if crt_transfer_manager is not None:
            logger.debug(
                f"Using CRT client. pid: {getpid()}, "
                f"thread: {threading.get_ident()}"
            )
            return crt_transfer_manager

    # If we don't resolve something above, fallback to the default.
    logger.debug(
        f"Using default client. pid: {getpid()}, "
        f"thread: {threading.get_ident()}"
    )
    return _create_default_transfer_manager(client, config, osutil)


def _should_use_crt(config):
    # This feature requires awscrt>=0.19.18
    if HAS_CRT and has_minimum_crt_version((0, 19, 18)):
        is_optimized_instance = awscrt.s3.is_optimized_for_system()
    else:
        is_optimized_instance = False
    pref_transfer_client = config.preferred_transfer_client.lower()

    if (
        is_optimized_instance
        and pref_transfer_client == constants.AUTO_RESOLVE_TRANSFER_CLIENT
    ):
        logger.debug(
            "Attempting to use CRTTransferManager. "
            "Config settings may be ignored."
        )
        return True

    logger.debug(
        "Opting out of CRT Transfer Manager. Preferred client: "
        f"{pref_transfer_client}, CRT available: {HAS_CRT}, "
        f"Instance Optimized: {is_optimized_instance}."
    )
    return False


def has_minimum_crt_version(minimum_version):
    """Not intended for use outside boto3."""
    if not HAS_CRT:
        return False

    crt_version_str = awscrt.__version__
    try:
        crt_version_ints = map(int, crt_version_str.split("."))
        crt_version_tuple = tuple(crt_version_ints)
    except (TypeError, ValueError):
        return False

    return crt_version_tuple >= minimum_version


def _create_default_transfer_manager(client, config, osutil):
    """Create the default TransferManager implementation for s3transfer."""
    executor_cls = None
    if not config.use_threads:
        executor_cls = NonThreadedExecutor
    return TransferManager(client, config, osutil, executor_cls)
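The version gate in ``has_minimum_crt_version`` works because Python compares tuples element-wise, so a parsed ``(0, 20, 0)`` is greater than the minimum ``(0, 19, 18)``. Below is a minimal standalone sketch of that parsing-and-comparison logic; the helper name and inputs are illustrative only, not part of boto3's API.

```python
# Minimal sketch of the version-gate logic used by has_minimum_crt_version,
# shown standalone so the tuple-comparison behavior is easy to verify.
# The helper name here is illustrative, not part of boto3's API.

def meets_minimum_version(version_str, minimum_version):
    """Return True if a dotted version string is >= a minimum version tuple."""
    try:
        # "0.19.18" -> (0, 19, 18); non-numeric parts make this fail safely.
        version_tuple = tuple(int(part) for part in version_str.split("."))
    except (TypeError, ValueError):
        return False
    # Python compares tuples element-wise, so (0, 20, 0) >= (0, 19, 18).
    return version_tuple >= minimum_version


print(meets_minimum_version("0.19.18", (0, 19, 18)))     # True
print(meets_minimum_version("0.19.17", (0, 19, 18)))     # False
print(meets_minimum_version("1.0.0.dev0", (0, 19, 18)))  # False (non-numeric)
```

Note the conservative failure mode: a pre-release string such as ``"1.0.0.dev0"`` does not parse as all-integers, so the check opts out rather than guessing.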
class TransferConfig(S3TransferConfig):
    ALIAS = {
        'max_concurrency': 'max_request_concurrency',
        'max_io_queue': 'max_io_queue_size',
    }

    def __init__(
        self,
        multipart_threshold=8 * MB,
        max_concurrency=10,
        multipart_chunksize=8 * MB,
        num_download_attempts=5,
        max_io_queue=100,
        io_chunksize=256 * KB,
        use_threads=True,
        max_bandwidth=None,
        preferred_transfer_client=constants.AUTO_RESOLVE_TRANSFER_CLIENT,
    ):
        """Configuration object for managed S3 transfers

        :param multipart_threshold: The transfer size threshold for which
            multipart uploads, downloads, and copies will automatically be
            triggered.

        :param max_concurrency: The maximum number of threads that will be
            making requests to perform a transfer. If ``use_threads`` is
            set to ``False``, the value provided is ignored as the transfer
            will only ever use the current thread.

        :param multipart_chunksize: The partition size of each part for a
            multipart transfer.

        :param num_download_attempts: The number of download attempts that
            will be retried upon errors with downloading an object in S3.
            Note that these retries account for errors that occur when
            streaming down the data from S3 (i.e. socket errors and read
            timeouts that occur after receiving an OK response from S3).
            Other retryable exceptions such as throttling errors and 5xx
            errors are already retried by botocore (this default is 5). This
            does not take into account the number of exceptions retried by
            botocore.

        :param max_io_queue: The maximum amount of read parts that can be
            queued in memory to be written for a download. The size of each
            of these read parts is at most the size of ``io_chunksize``.

        :param io_chunksize: The max size of each chunk in the io queue.
            Currently, this is the size used when ``read`` is called on the
            downloaded stream as well.

        :param use_threads: If True, threads will be used when performing
            S3 transfers. If False, no threads will be used in performing
            transfers; all logic will be run in the current thread.

        :param max_bandwidth: The maximum bandwidth that will be consumed
            in uploading and downloading file content. The value is an integer
            in terms of bytes per second.

        :param preferred_transfer_client: String specifying preferred transfer
            client for transfer operations.

            Current supported settings are:
              * auto (default) - Use the CRTTransferManager when calls
                are made with supported environment and settings.
              * classic - Only use the origin S3TransferManager with
                requests. Disables possible CRT upgrade on requests.
        """
        super().__init__(
            multipart_threshold=multipart_threshold,
            max_request_concurrency=max_concurrency,
            multipart_chunksize=multipart_chunksize,
            num_download_attempts=num_download_attempts,
            max_io_queue_size=max_io_queue,
            io_chunksize=io_chunksize,
            max_bandwidth=max_bandwidth,
        )
        # Some of the argument names are not the same as the inherited
        # S3TransferConfig so we add aliases so you can still access the
        # old version of the names.
        for alias in self.ALIAS:
            setattr(self, alias, getattr(self, self.ALIAS[alias]))
        self.use_threads = use_threads
        self.preferred_transfer_client = preferred_transfer_client

    def __setattr__(self, name, value):
        # If the alias name is used, make sure we set the name that it points
        # to as that is what actually is used in governing the TransferManager.
        if name in self.ALIAS:
            super().__setattr__(self.ALIAS[name], value)
        # Always set the value of the actual name provided.
        super().__setattr__(name, value)
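The ``ALIAS``/``__setattr__`` pattern above keeps a legacy attribute name (e.g. ``max_concurrency``) in sync with the renamed attribute the transfer machinery actually reads (``max_request_concurrency``). Here is a standalone sketch of the same mechanism with a toy ``Config`` class; the class and attribute set are hypothetical, chosen only to illustrate the pattern.

```python
# Standalone sketch of the alias pattern TransferConfig uses to keep
# legacy attribute names in sync with their renamed counterparts.
# This toy Config class is illustrative only, not part of boto3.

class Config:
    ALIAS = {'max_concurrency': 'max_request_concurrency'}

    def __init__(self, max_concurrency=10):
        # Set the canonical name first, then mirror it under the alias.
        self.max_request_concurrency = max_concurrency
        for alias, real_name in self.ALIAS.items():
            setattr(self, alias, getattr(self, real_name))

    def __setattr__(self, name, value):
        # Writing through the alias also updates the canonical attribute,
        # so downstream code reading the canonical name stays correct.
        if name in self.ALIAS:
            super().__setattr__(self.ALIAS[name], value)
        super().__setattr__(name, value)


config = Config()
config.max_concurrency = 20
print(config.max_request_concurrency)  # 20: alias writes propagate
```

Note the asymmetry this design accepts: writes go *through* the alias to the canonical name, but writing the canonical name directly does not refresh the alias.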
class S3Transfer:
    ALLOWED_DOWNLOAD_ARGS = TransferManager.ALLOWED_DOWNLOAD_ARGS
    ALLOWED_UPLOAD_ARGS = TransferManager.ALLOWED_UPLOAD_ARGS

    def __init__(self, client=None, config=None, osutil=None, manager=None):
        if not client and not manager:
            raise ValueError(
                'Either a boto3.Client or s3transfer.manager.TransferManager '
                'must be provided'
            )
        if manager and any([client, config, osutil]):
            raise ValueError(
                'Manager cannot be provided with client, config, '
                'nor osutil. These parameters are mutually exclusive.'
            )
        if config is None:
            config = TransferConfig()
        if osutil is None:
            osutil = OSUtils()
        if manager:
            self._manager = manager
        else:
            self._manager = create_transfer_manager(client, config, osutil)
    def upload_file(
        self, filename, bucket, key, callback=None, extra_args=None
    ):
        """Upload a file to an S3 object.

        Variants have also been injected into S3 client, Bucket and Object.
        You don't have to use S3Transfer.upload_file() directly.

        .. seealso::
            :py:meth:`S3.Client.upload_file`
            :py:meth:`S3.Client.upload_fileobj`
        """
        if isinstance(filename, PathLike):
            filename = fspath(filename)
        if not isinstance(filename, str):
            raise ValueError('Filename must be a string or a path-like object')

        subscribers = self._get_subscribers(callback)
        future = self._manager.upload(
            filename, bucket, key, extra_args, subscribers
        )
        try:
            future.result()
        # If a client error was raised, add the backwards compatibility layer
        # that raises a S3UploadFailedError. These specific errors were only
        # ever thrown for upload_parts but now can be thrown for any related
        # client error.
        except ClientError as e:
            raise S3UploadFailedError(
                "Failed to upload {} to {}: {}".format(
                    filename, '/'.join([bucket, key]), e
                )
            )
    def download_file(
        self, bucket, key, filename, extra_args=None, callback=None
    ):
        """Download an S3 object to a file.

        Variants have also been injected into S3 client, Bucket and Object.
        You don't have to use S3Transfer.download_file() directly.

        .. seealso::
            :py:meth:`S3.Client.download_file`
            :py:meth:`S3.Client.download_fileobj`
        """
        if isinstance(filename, PathLike):
            filename = fspath(filename)
        if not isinstance(filename, str):
            raise ValueError('Filename must be a string or a path-like object')

        subscribers = self._get_subscribers(callback)
        future = self._manager.download(
            bucket, key, filename, extra_args, subscribers
        )
        try:
            future.result()
        # This is for backwards compatibility where when retries are
        # exceeded we need to throw the same error from boto3 instead of
        # s3transfer's built in RetriesExceededError as current users are
        # catching the boto3 one instead of the s3transfer exception to do
        # their own retries.
        except S3TransferRetriesExceededError as e:
            raise RetriesExceededError(e.last_exception)
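The ``except S3TransferRetriesExceededError`` clause in ``download_file`` is an exception-translation layer: the internal s3transfer error is caught and re-raised as the boto3 exception existing callers already handle, carrying the original cause along. The following standalone sketch shows that pattern with hypothetical exception classes (they stand in for the real s3transfer/boto3 types).

```python
# Standalone sketch of the exception-translation pattern download_file
# uses for backwards compatibility: an internal library error is caught
# and re-raised as the public exception callers already catch.
# Both exception classes here are illustrative, not the real boto3 types.

class InternalRetriesExceededError(Exception):
    """Stand-in for s3transfer's RetriesExceededError."""
    def __init__(self, last_exception):
        super().__init__(last_exception)
        self.last_exception = last_exception


class PublicRetriesExceededError(Exception):
    """Stand-in for boto3.exceptions.RetriesExceededError."""
    def __init__(self, last_exception):
        super().__init__(last_exception)
        self.last_exception = last_exception


def download():
    # Simulate the transfer layer giving up after its retry budget.
    raise InternalRetriesExceededError(ConnectionError("socket timed out"))


def download_with_compat():
    try:
        return download()
    except InternalRetriesExceededError as e:
        # Surface the original cause under the public exception type,
        # so callers keep catching the exception they always have.
        raise PublicRetriesExceededError(e.last_exception)


try:
    download_with_compat()
except PublicRetriesExceededError as e:
    print(type(e.last_exception).__name__)  # ConnectionError
```

The key design point is preserving ``last_exception`` across the boundary: the public exception type changes, but the root cause stays inspectable.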
class ProgressCallbackInvoker(BaseSubscriber):
    """A back-compat wrapper to invoke a provided callback via a subscriber

    :param callback: A callable that takes a single positional argument for
        how many bytes were transferred.
    """

    def __init__(self, callback):
        self._callback = callback

    def on_progress(self, bytes_transferred, **kwargs):
        self._callback(bytes_transferred)
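``ProgressCallbackInvoker`` is an adapter: it lets a plain callable participate in s3transfer's subscriber protocol by forwarding only the byte count from ``on_progress`` and discarding the other keyword arguments the transfer machinery passes. A standalone sketch of that wiring, with ``BaseSubscriber`` stubbed out so it runs without s3transfer installed:

```python
# Standalone sketch of the subscriber-adapter pattern: a plain callable
# is adapted to an on_progress() hook, mirroring ProgressCallbackInvoker.
# BaseSubscriber is stubbed here so the example runs without s3transfer.

class BaseSubscriber:
    """Stand-in for s3transfer.subscribers.BaseSubscriber."""
    def on_progress(self, **kwargs):
        pass


class CallbackInvoker(BaseSubscriber):
    def __init__(self, callback):
        self._callback = callback

    def on_progress(self, bytes_transferred, **kwargs):
        # Forward only the byte count; any extra kwargs from the transfer
        # machinery are accepted and ignored.
        self._callback(bytes_transferred)


transferred = []
invoker = CallbackInvoker(transferred.append)
for chunk in (1024, 2048, 512):
    invoker.on_progress(bytes_transferred=chunk, extra="ignored")

print(sum(transferred))  # 3584
```

This is why ``upload_file``/``download_file`` can expose a simple ``callback=`` parameter while the manager underneath only speaks in subscribers.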