Firehose.Client
A low-level client representing Amazon Kinesis Firehose
Amazon Kinesis Data Firehose is a fully managed service that delivers real-time streaming data to destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Elasticsearch Service (Amazon ES), Amazon Redshift, and Splunk.
import boto3
client = boto3.client('firehose')
These are the available methods:
can_paginate()
close()
create_delivery_stream()
delete_delivery_stream()
describe_delivery_stream()
get_paginator()
get_waiter()
list_delivery_streams()
list_tags_for_delivery_stream()
put_record()
put_record_batch()
start_delivery_stream_encryption()
stop_delivery_stream_encryption()
tag_delivery_stream()
untag_delivery_stream()
update_destination()
can_paginate(operation_name)
Check if an operation can be paginated.
operation_name is the operation name, which is the same as the method name on the client. For example, if the method name is create_foo, and you'd normally invoke the operation as client.create_foo(**kwargs), then, if the create_foo operation can be paginated, you can use the call client.get_paginator("create_foo").
Returns True if the operation can be paginated, False otherwise.
close()
Closes underlying endpoint connections.
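As a rough sketch of how can_paginate and get_paginator fit together, the helper below collects every page when an operation is pageable and otherwise falls back to a single call. The name collect_pages is illustrative and not part of boto3; it relies only on the can_paginate and get_paginator client methods and on the paginate method of the returned paginator.

```python
def collect_pages(client, operation_name, **kwargs):
    """Return a list of response pages for one client operation.

    collect_pages is a hypothetical helper, not a boto3 API.
    """
    if client.can_paginate(operation_name):
        # Pageable: let the paginator follow the continuation tokens.
        paginator = client.get_paginator(operation_name)
        return list(paginator.paginate(**kwargs))
    # Not pageable: invoke the client method once and wrap the response.
    method = getattr(client, operation_name)
    return [method(**kwargs)]


# Possible usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   pages = collect_pages(boto3.client('firehose'), 'list_delivery_streams')
```

Either way the caller gets a list of response dictionaries, so downstream code does not need to know whether the operation was paginated.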
create_delivery_stream(**kwargs)
Creates a Kinesis Data Firehose delivery stream.
By default, you can create up to 50 delivery streams per AWS Region.
This is an asynchronous operation that immediately returns. The initial status of the delivery stream is CREATING. After the delivery stream is created, its status is ACTIVE and it now accepts data. If the delivery stream creation fails, the status transitions to CREATING_FAILED. Attempts to send data to a delivery stream that is not in the ACTIVE state cause an exception. To check the state of a delivery stream, use DescribeDeliveryStream.
If the status of a delivery stream is CREATING_FAILED, this status doesn't change, and you can't invoke CreateDeliveryStream again on it. However, you can invoke the DeleteDeliveryStream operation to delete it.
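Because creation is asynchronous, callers usually poll until the stream reaches ACTIVE. The following is a minimal sketch of such a loop, not an official waiter. The helper name wait_until_active, the poll interval, and the poll limit are illustrative choices, and the response keys DeliveryStreamDescription and DeliveryStreamStatus are assumed from the describe_delivery_stream response, which is not shown in this section.

```python
import time


def wait_until_active(client, name, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll describe_delivery_stream until the stream is ACTIVE.

    Hypothetical helper; the response keys used here are assumptions
    about the describe_delivery_stream response shape.
    """
    for _ in range(max_polls):
        response = client.describe_delivery_stream(DeliveryStreamName=name)
        status = response["DeliveryStreamDescription"]["DeliveryStreamStatus"]
        if status == "ACTIVE":
            return status
        if status == "CREATING_FAILED":
            # This status never changes; the stream has to be deleted.
            raise RuntimeError(
                f"delivery stream {name!r} failed to create; delete it before retrying"
            )
        sleep(poll_seconds)
    raise TimeoutError(f"delivery stream {name!r} not ACTIVE after {max_polls} polls")
```

The sleep function is injectable so the loop can be exercised in tests without real delays.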
A Kinesis Data Firehose delivery stream can be configured to receive records directly from providers using PutRecord or PutRecordBatch, or it can be configured to use an existing Kinesis stream as its source. To specify a Kinesis data stream as input, set the DeliveryStreamType parameter to KinesisStreamAsSource, and provide the Kinesis stream Amazon Resource Name (ARN) and role ARN in the KinesisStreamSourceConfiguration parameter.
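The two source-related parameters travel together, which a small builder can make explicit. This is a sketch only: kinesis_source_args is a hypothetical helper name, and the ARNs in the usage comment are placeholders.

```python
def kinesis_source_args(stream_arn, role_arn):
    """Build the keyword arguments that select a Kinesis data stream as source.

    Hypothetical helper; the keys mirror the Request Syntax for
    create_delivery_stream.
    """
    return {
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": stream_arn,
            "RoleARN": role_arn,
        },
    }


# Possible usage (placeholder ARNs; a destination configuration is still needed):
#   client.create_delivery_stream(
#       DeliveryStreamName='example-stream',
#       ExtendedS3DestinationConfiguration={...},
#       **kinesis_source_args('arn:aws:kinesis:...', 'arn:aws:iam::...'),
#   )
```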
To create a delivery stream with server-side encryption (SSE) enabled, include DeliveryStreamEncryptionConfigurationInput in your request. This is optional. You can also invoke StartDeliveryStreamEncryption to turn on SSE for an existing delivery stream that doesn't have SSE enabled.
A delivery stream is configured with a single destination: Amazon S3, Amazon ES, Amazon Redshift, or Splunk. You must specify only one of the following destination configuration parameters: ExtendedS3DestinationConfiguration, S3DestinationConfiguration, ElasticsearchDestinationConfiguration, RedshiftDestinationConfiguration, or SplunkDestinationConfiguration.
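A client-side check for the one-destination rule might look like the sketch below. The helper chosen_destination is hypothetical. Its key list is taken from the Request Syntax, which also includes AmazonopensearchserviceDestinationConfiguration and HttpEndpointDestinationConfiguration; treating those two as subject to the same rule is an assumption.

```python
DESTINATION_KEYS = (
    "S3DestinationConfiguration",
    "ExtendedS3DestinationConfiguration",
    "RedshiftDestinationConfiguration",
    "ElasticsearchDestinationConfiguration",
    "AmazonopensearchserviceDestinationConfiguration",
    "SplunkDestinationConfiguration",
    "HttpEndpointDestinationConfiguration",
)


def chosen_destination(kwargs):
    """Return the single destination key present in kwargs, or raise ValueError.

    Hypothetical pre-flight check; the service performs its own validation.
    """
    present = [key for key in DESTINATION_KEYS if key in kwargs]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one destination configuration, found {present}"
        )
    return present[0]
```

Running this before create_delivery_stream surfaces a configuration mistake locally instead of as a service-side error.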
When you specify S3DestinationConfiguration, you can also provide the following optional values: BufferingHints, EncryptionConfiguration, and CompressionFormat. By default, if no BufferingHints value is provided, Kinesis Data Firehose buffers data up to 5 MB or for 5 minutes, whichever condition is satisfied first. BufferingHints is a hint, so there are some cases where the service cannot adhere to these conditions strictly. For example, record boundaries might be such that the size is a little over or under the configured buffering size. By default, no encryption is performed. We strongly recommend that you enable encryption to ensure secure data storage in Amazon S3.
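To make the optional values concrete, here is a hedged sketch of a builder for an S3 destination dictionary. The function name s3_destination_config and its keyword names are illustrative; the defaults mirror the 5 MB and 5 minute (300 second) buffering described above, and the keys follow the Request Syntax.

```python
def s3_destination_config(role_arn, bucket_arn, *, size_mb=5, interval_seconds=300,
                          compression="UNCOMPRESSED", kms_key_arn=None):
    """Build an S3 destination dictionary with explicit optional values.

    Hypothetical helper. Passing kms_key_arn enables KMS encryption;
    leaving it as None states explicitly that no encryption is used.
    """
    if kms_key_arn is None:
        encryption = {"NoEncryptionConfig": "NoEncryption"}
    else:
        encryption = {"KMSEncryptionConfig": {"AWSKMSKeyARN": kms_key_arn}}
    return {
        "RoleARN": role_arn,
        "BucketARN": bucket_arn,
        "BufferingHints": {
            "SizeInMBs": size_mb,
            "IntervalInSeconds": interval_seconds,
        },
        "CompressionFormat": compression,
        "EncryptionConfiguration": encryption,
    }
```

Given the recommendation to enable encryption, callers would normally pass a KMS key ARN rather than rely on the unencrypted default.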
A few notes about Amazon Redshift as a destination:
An Amazon Redshift destination requires an S3 bucket as an intermediate location. Kinesis Data Firehose first delivers data to Amazon S3 and then uses COPY syntax to load data into an Amazon Redshift table. This is specified in the RedshiftDestinationConfiguration.S3Configuration parameter.
The compression formats SNAPPY or ZIP cannot be specified in RedshiftDestinationConfiguration.S3Configuration because the Amazon Redshift COPY operation that reads from the S3 bucket doesn't support these compression formats.
We strongly recommend that you use the user name and password you provide exclusively with Kinesis Data Firehose, and that the permissions for the account are restricted for Amazon Redshift INSERT permissions.
Kinesis Data Firehose assumes the IAM role that is configured as part of the destination. The role should allow the Kinesis Data Firehose principal to assume the role, and the role should have permissions that allow the service to deliver the data. For more information, see Grant Kinesis Data Firehose Access to an Amazon S3 Destination in the Amazon Kinesis Data Firehose Developer Guide.
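The compression restriction for Amazon Redshift destinations can be checked before the request is sent. The sketch below is illustrative: check_redshift_compression is a hypothetical helper, the comparison is case-insensitive because the Request Syntax spells the value 'Snappy', and HADOOP_SNAPPY is passed through only because the notes above do not mention it.

```python
REDSHIFT_UNSUPPORTED = {"SNAPPY", "ZIP"}


def check_redshift_compression(redshift_config):
    """Return the S3 compression format, or raise if Redshift COPY cannot read it.

    Hypothetical pre-flight check on a RedshiftDestinationConfiguration dict.
    """
    s3_config = redshift_config.get("S3Configuration", {})
    # UNCOMPRESSED is the documented default when no format is given.
    compression = s3_config.get("CompressionFormat", "UNCOMPRESSED")
    if compression.upper() in REDSHIFT_UNSUPPORTED:
        raise ValueError(
            f"CompressionFormat {compression!r} is not supported for Redshift destinations"
        )
    return compression
```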
See also: AWS API Documentation
Request Syntax
response = client.create_delivery_stream(
DeliveryStreamName='string',
DeliveryStreamType='DirectPut'|'KinesisStreamAsSource',
KinesisStreamSourceConfiguration={
'KinesisStreamARN': 'string',
'RoleARN': 'string'
},
DeliveryStreamEncryptionConfigurationInput={
'KeyARN': 'string',
'KeyType': 'AWS_OWNED_CMK'|'CUSTOMER_MANAGED_CMK'
},
S3DestinationConfiguration={
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
ExtendedS3DestinationConfiguration={
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'S3BackupMode': 'Disabled'|'Enabled',
'S3BackupConfiguration': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'DataFormatConversionConfiguration': {
'SchemaConfiguration': {
'RoleARN': 'string',
'CatalogId': 'string',
'DatabaseName': 'string',
'TableName': 'string',
'Region': 'string',
'VersionId': 'string'
},
'InputFormatConfiguration': {
'Deserializer': {
'OpenXJsonSerDe': {
'ConvertDotsInJsonKeysToUnderscores': True|False,
'CaseInsensitive': True|False,
'ColumnToJsonKeyMappings': {
'string': 'string'
}
},
'HiveJsonSerDe': {
'TimestampFormats': [
'string',
]
}
}
},
'OutputFormatConfiguration': {
'Serializer': {
'ParquetSerDe': {
'BlockSizeBytes': 123,
'PageSizeBytes': 123,
'Compression': 'UNCOMPRESSED'|'GZIP'|'SNAPPY',
'EnableDictionaryCompression': True|False,
'MaxPaddingBytes': 123,
'WriterVersion': 'V1'|'V2'
},
'OrcSerDe': {
'StripeSizeBytes': 123,
'BlockSizeBytes': 123,
'RowIndexStride': 123,
'EnablePadding': True|False,
'PaddingTolerance': 123.0,
'Compression': 'NONE'|'ZLIB'|'SNAPPY',
'BloomFilterColumns': [
'string',
],
'BloomFilterFalsePositiveProbability': 123.0,
'DictionaryKeyThreshold': 123.0,
'FormatVersion': 'V0_11'|'V0_12'
}
}
},
'Enabled': True|False
},
'DynamicPartitioningConfiguration': {
'RetryOptions': {
'DurationInSeconds': 123
},
'Enabled': True|False
}
},
RedshiftDestinationConfiguration={
'RoleARN': 'string',
'ClusterJDBCURL': 'string',
'CopyCommand': {
'DataTableName': 'string',
'DataTableColumns': 'string',
'CopyOptions': 'string'
},
'Username': 'string',
'Password': 'string',
'RetryOptions': {
'DurationInSeconds': 123
},
'S3Configuration': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'S3BackupMode': 'Disabled'|'Enabled',
'S3BackupConfiguration': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
ElasticsearchDestinationConfiguration={
'RoleARN': 'string',
'DomainARN': 'string',
'ClusterEndpoint': 'string',
'IndexName': 'string',
'TypeName': 'string',
'IndexRotationPeriod': 'NoRotation'|'OneHour'|'OneDay'|'OneWeek'|'OneMonth',
'BufferingHints': {
'IntervalInSeconds': 123,
'SizeInMBs': 123
},
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedDocumentsOnly'|'AllDocuments',
'S3Configuration': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'RoleARN': 'string',
'SecurityGroupIds': [
'string',
]
}
},
AmazonopensearchserviceDestinationConfiguration={
'RoleARN': 'string',
'DomainARN': 'string',
'ClusterEndpoint': 'string',
'IndexName': 'string',
'TypeName': 'string',
'IndexRotationPeriod': 'NoRotation'|'OneHour'|'OneDay'|'OneWeek'|'OneMonth',
'BufferingHints': {
'IntervalInSeconds': 123,
'SizeInMBs': 123
},
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedDocumentsOnly'|'AllDocuments',
'S3Configuration': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'RoleARN': 'string',
'SecurityGroupIds': [
'string',
]
}
},
SplunkDestinationConfiguration={
'HECEndpoint': 'string',
'HECEndpointType': 'Raw'|'Event',
'HECToken': 'string',
'HECAcknowledgmentTimeoutInSeconds': 123,
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedEventsOnly'|'AllEvents',
'S3Configuration': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
HttpEndpointDestinationConfiguration={
'EndpointConfiguration': {
'Url': 'string',
'Name': 'string',
'AccessKey': 'string'
},
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'RequestConfiguration': {
'ContentEncoding': 'NONE'|'GZIP',
'CommonAttributes': [
{
'AttributeName': 'string',
'AttributeValue': 'string'
},
]
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'RoleARN': 'string',
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedDataOnly'|'AllData',
'S3Configuration': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
}
},
Tags=[
{
'Key': 'string',
'Value': 'string'
},
]
)
[REQUIRED]
The name of the delivery stream. This name must be unique per AWS account in the same AWS Region. If the delivery streams are in different accounts or different Regions, you can have multiple delivery streams with the same name.
The delivery stream type. This parameter can be one of the following values:
DirectPut: Provider applications access the delivery stream directly.
KinesisStreamAsSource: The delivery stream uses a Kinesis data stream as a source.
When a Kinesis data stream is used as the source for the delivery stream, provide a KinesisStreamSourceConfiguration containing the Kinesis data stream Amazon Resource Name (ARN) and the role ARN for the source stream.
The ARN of the source Kinesis data stream. For more information, see Amazon Kinesis Data Streams ARN Format .
The ARN of the role that provides access to the source Kinesis data stream. For more information, see AWS Identity and Access Management (IAM) ARN Format .
Used to specify the type and Amazon Resource Name (ARN) of the KMS key needed for Server-Side Encryption (SSE).
If you set KeyType to CUSTOMER_MANAGED_CMK, you must specify the Amazon Resource Name (ARN) of the CMK. If you set KeyType to AWS_OWNED_CMK, Kinesis Data Firehose uses a service-account CMK.
Indicates the type of customer master key (CMK) to use for encryption. The default setting is AWS_OWNED_CMK. For more information about CMKs, see Customer Master Keys (CMKs). When you invoke CreateDeliveryStream or StartDeliveryStreamEncryption with KeyType set to CUSTOMER_MANAGED_CMK, Kinesis Data Firehose invokes the Amazon KMS operation CreateGrant to create a grant that allows the Kinesis Data Firehose service to use the customer managed CMK to perform encryption and decryption. Kinesis Data Firehose manages that grant.
When you invoke StartDeliveryStreamEncryption to change the CMK for a delivery stream that is encrypted with a customer managed CMK, Kinesis Data Firehose schedules the grant it had on the old CMK for retirement.
You can use a CMK of type CUSTOMER_MANAGED_CMK to encrypt up to 500 delivery streams. If a CreateDeliveryStream or StartDeliveryStreamEncryption operation exceeds this limit, Kinesis Data Firehose throws a LimitExceededException.
Warning
To encrypt your delivery stream, use symmetric CMKs. Kinesis Data Firehose doesn't support asymmetric CMKs. For information about symmetric and asymmetric CMKs, see About Symmetric and Asymmetric CMKs in the AWS Key Management Service developer guide.
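The relationship between KeyType and KeyARN can be captured in a small builder. This is a sketch under stated assumptions: encryption_input is a hypothetical helper, and omitting KeyARN for AWS_OWNED_CMK is a choice made here, based on the rule that the ARN is required only for CUSTOMER_MANAGED_CMK.

```python
def encryption_input(key_type="AWS_OWNED_CMK", key_arn=None):
    """Build a DeliveryStreamEncryptionConfigurationInput dictionary.

    Hypothetical helper. key_arn is used only for CUSTOMER_MANAGED_CMK.
    """
    if key_type not in ("AWS_OWNED_CMK", "CUSTOMER_MANAGED_CMK"):
        raise ValueError(f"unknown KeyType: {key_type!r}")
    if key_type == "CUSTOMER_MANAGED_CMK":
        if not key_arn:
            raise ValueError("KeyARN is required when KeyType is CUSTOMER_MANAGED_CMK")
        return {"KeyType": key_type, "KeyARN": key_arn}
    # AWS_OWNED_CMK: the service uses its own key, so no ARN is sent.
    return {"KeyType": key_type}
```

The result is meant to be passed as the DeliveryStreamEncryptionConfigurationInput argument of create_delivery_stream.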
[Deprecated] The destination in Amazon S3. You can specify only one destination.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
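The pairing rule and the sizing recommendation lend themselves to a short worked example. Both function names below are hypothetical, and rounding the recommendation up to a whole number is an assumption made because the Request Syntax shows SizeInMBs as an integer.

```python
import math


def buffering_hints(size_mb=None, interval_seconds=None):
    """Build BufferingHints, enforcing that both values are given together.

    Returns None when neither is given, meaning the key should be omitted
    so the documented defaults (5 and 300) apply.
    """
    if (size_mb is None) != (interval_seconds is None):
        raise ValueError("SizeInMBs and IntervalInSeconds must be specified together")
    if size_mb is None:
        return None
    return {"SizeInMBs": size_mb, "IntervalInSeconds": interval_seconds}


def recommended_min_size_mb(ingest_mib_per_second):
    """Smallest whole buffer size covering 10 seconds of typical ingest."""
    return math.ceil(ingest_mib_per_second * 10)
```

For the 1 MiB/sec example in the text, recommended_min_size_mb(1) gives 10, matching the stated guidance of 10 MiB or higher.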
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
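Since the log group and log stream names become required once logging is enabled, a builder can enforce that locally. The helper name cloudwatch_logging_options is illustrative, and the names in the usage comment are placeholders.

```python
def cloudwatch_logging_options(enabled, log_group=None, log_stream=None):
    """Build a CloudWatchLoggingOptions dictionary.

    Hypothetical helper: both names are required only when logging is enabled.
    """
    if not enabled:
        return {"Enabled": False}
    if not log_group or not log_stream:
        raise ValueError(
            "LogGroupName and LogStreamName are required when logging is enabled"
        )
    return {
        "Enabled": True,
        "LogGroupName": log_group,
        "LogStreamName": log_stream,
    }


# Possible usage with placeholder names:
#   options = cloudwatch_logging_options(True, 'example-group', 'example-stream')
```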
The destination in Amazon S3. You can specify only one destination.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The Amazon CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
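The nesting of processors and parameters is easier to see in a concrete dictionary. The sketch below builds a ProcessingConfiguration with a single Lambda processor. The helper name lambda_processing_config is hypothetical, and converting the retry count with str() reflects that ParameterValue is shown as a string in the Request Syntax.

```python
def lambda_processing_config(lambda_arn, retries=None):
    """Build a ProcessingConfiguration with one Lambda processor.

    Hypothetical helper; parameter names come from the Request Syntax.
    """
    parameters = [{"ParameterName": "LambdaArn", "ParameterValue": lambda_arn}]
    if retries is not None:
        # ParameterValue is a string, so numeric settings are stringified.
        parameters.append(
            {"ParameterName": "NumberOfRetries", "ParameterValue": str(retries)}
        )
    return {
        "Enabled": True,
        "Processors": [{"Type": "Lambda", "Parameters": parameters}],
    }
```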
The Amazon S3 backup mode. After you create a delivery stream, you can update it to enable Amazon S3 backup if it is disabled. If backup is enabled, you can't update the delivery stream to disable it.
The configuration for backup in Amazon S3.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The serializer, deserializer, and schema for converting data from the JSON format to the Parquet or ORC format before writing it to Amazon S3.
Specifies the AWS Glue Data Catalog table that contains the column information. This parameter is required if Enabled is set to true.
The role that Kinesis Data Firehose can use to access AWS Glue. This role must be in the same account you use for Kinesis Data Firehose. Cross-account roles aren't allowed.
Warning
If the SchemaConfiguration request parameter is used as part of invoking the CreateDeliveryStream API, then the RoleARN property is required and its value must be specified.
The ID of the AWS Glue Data Catalog. If you don't supply this, the AWS account ID is used by default.
Specifies the name of the AWS Glue database that contains the schema for the output data.
Warning
If the SchemaConfiguration request parameter is used as part of invoking the CreateDeliveryStream API, then the DatabaseName property is required and its value must be specified.
Specifies the AWS Glue table that contains the column information that constitutes your data schema.
Warning
If the SchemaConfiguration request parameter is used as part of invoking the CreateDeliveryStream API, then the TableName property is required and its value must be specified.
If you don't specify an AWS Region, the default is the current Region.
Specifies the table version for the output data schema. If you don't specify this version ID, or if you set it to LATEST, Kinesis Data Firehose uses the most recent version. This means that any updates to the table are automatically picked up.
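The required and optional SchemaConfiguration properties can be summarized in a builder. This is a sketch, not a definitive implementation: schema_configuration is a hypothetical helper, and sending VersionId as LATEST explicitly (rather than omitting it) is a choice made here; the text says both behave the same.

```python
def schema_configuration(role_arn, database_name, table_name, *,
                         catalog_id=None, region=None, version_id="LATEST"):
    """Build a SchemaConfiguration dictionary for create_delivery_stream.

    Hypothetical helper. RoleARN, DatabaseName, and TableName are required;
    CatalogId and Region are left out so the documented defaults apply.
    """
    required = {
        "RoleARN": role_arn,
        "DatabaseName": database_name,
        "TableName": table_name,
    }
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError(f"missing required SchemaConfiguration values: {missing}")
    config = dict(required, VersionId=version_id)
    if catalog_id is not None:
        config["CatalogId"] = catalog_id
    if region is not None:
        config["Region"] = region
    return config
```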
Specifies the deserializer that you want Kinesis Data Firehose to use to convert the format of your data from JSON. This parameter is required if Enabled is set to true.
Specifies which deserializer to use. You can choose either the Apache Hive JSON SerDe or the OpenX JSON SerDe. If both are non-null, the server rejects the request.
The OpenX SerDe. Used by Kinesis Data Firehose for deserializing data, which means converting it from the JSON format in preparation for serializing it to the Parquet or ORC format. This is one of two deserializers you can choose, depending on which one offers the functionality you need. The other option is the native Hive / HCatalog JsonSerDe.
When set to true, specifies that the names of the keys include dots and that you want Kinesis Data Firehose to replace them with underscores. This is useful because Apache Hive does not allow dots in column names. For example, if the JSON contains a key whose name is "a.b", you can define the column name to be "a_b" when using this option.
The default is false.
When set to true, which is the default, Kinesis Data Firehose converts JSON keys to lowercase before deserializing them.
Maps column names to JSON keys that aren't identical to the column names. This is useful when the JSON contains keys that are Hive keywords. For example, timestamp is a Hive keyword. If you have a JSON key named timestamp, set this parameter to {"ts": "timestamp"} to map this key to a column named ts.
The native Hive / HCatalog JsonSerDe. Used by Kinesis Data Firehose for deserializing data, which means converting it from the JSON format in preparation for serializing it to the Parquet or ORC format. This is one of two deserializers you can choose, depending on which one offers the functionality you need. The other option is the OpenX SerDe.
Indicates how you want Kinesis Data Firehose to parse the date and timestamps that may be present in your input data JSON. To specify these format strings, follow the pattern syntax of JodaTime's DateTimeFormat format strings. For more information, see Class DateTimeFormat. You can also use the special value millis to parse timestamps in epoch milliseconds. If you don't specify a format, Kinesis Data Firehose uses java.sql.Timestamp::valueOf by default.
Specifies the serializer that you want Kinesis Data Firehose to use to convert the format of your data to the Parquet or ORC format. This parameter is required if Enabled is set to true.
Specifies which serializer to use. You can choose either the ORC SerDe or the Parquet SerDe. If both are non-null, the server rejects the request.
A serializer to use for converting data to the Parquet format before storing it in Amazon S3. For more information, see Apache Parquet .
The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Kinesis Data Firehose uses this value for padding calculations.
The Parquet page size. Column chunks are divided into pages. A page is conceptually an indivisible unit (in terms of compression and encoding). The minimum value is 64 KiB and the default is 1 MiB.
The compression codec to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP, with the default being SNAPPY. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.
Indicates whether to enable dictionary compression.
The maximum amount of padding to apply. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 0.
Indicates the version of row format to output. The possible values are V1 and V2. The default is V1.
A serializer to use for converting data to the ORC format before storing it in Amazon S3. For more information, see Apache ORC .
The number of bytes in each stripe. The default is 64 MiB and the minimum is 8 MiB.
The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Kinesis Data Firehose uses this value for padding calculations.
The number of rows between index entries. The default is 10,000 and the minimum is 1,000.
Set this to true to indicate that you want stripes to be padded to the HDFS block boundaries. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is false.
A number between 0 and 1 that defines the tolerance for block padding as a decimal fraction of stripe size. The default value is 0.05, which means 5 percent of stripe size.
For the default values of 64 MiB ORC stripes and 256 MiB HDFS blocks, the default block padding tolerance of 5 percent reserves a maximum of 3.2 MiB for padding within the 256 MiB block. In such a case, if the available size within the block is more than 3.2 MiB, a new, smaller stripe is inserted to fit within that space. This ensures that no stripe crosses block boundaries and causes remote reads within a node-local task.
Kinesis Data Firehose ignores this parameter when OrcSerDe$EnablePadding is false.
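The padding arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `max_padding_bytes` is hypothetical and is not part of boto3 or the Firehose API.

```python
MIB = 1024 * 1024


def max_padding_bytes(stripe_size_bytes, padding_tolerance=0.05):
    """Return the space reserved for padding, as a fraction of the stripe size."""
    if not 0 <= padding_tolerance <= 1:
        raise ValueError("padding tolerance must be between 0 and 1")
    return stripe_size_bytes * padding_tolerance


# Defaults from the text: 64 MiB stripes and a 5 percent tolerance.
reserved_mib = max_padding_bytes(64 * MIB) / MIB  # about 3.2 MiB
```

With a larger stripe size or tolerance, the reserved space grows proportionally.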
The compression code to use over data blocks. The default is SNAPPY.
The column names for which you want Kinesis Data Firehose to create bloom filters. The default is null.
The Bloom filter false positive probability (FPP). The lower the FPP, the bigger the Bloom filter. The default value is 0.05, the minimum is 0, and the maximum is 1.
Represents the fraction of the total number of non-null rows. To turn off dictionary encoding, set this fraction to a number that is less than the number of distinct keys in a dictionary. To always use dictionary encoding, set this threshold to 1.
The version of the file to write. The possible values are V0_11 and V0_12. The default is V0_12.
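Because the server rejects a request in which both serializers are set, it can help to build the serializer mapping through a small guard. The sketch below is a hedged illustration: `build_output_format` is a hypothetical helper, and the key names (`Serializer`, `ParquetSerDe`, `OrcSerDe`, `BlockSizeBytes`, and so on) are assumed to match the request syntax for `create_delivery_stream`, so verify them against that syntax before use. The `EnableDictionaryCompression` value is an arbitrary choice, since its default is not stated here.

```python
MIB = 1024 * 1024


def build_output_format(parquet=None, orc=None):
    """Build a Serializer mapping, refusing the case where both SerDes are set."""
    if parquet is not None and orc is not None:
        raise ValueError("specify either ParquetSerDe or OrcSerDe, not both")
    serializer = {}
    if parquet is not None:
        serializer["ParquetSerDe"] = parquet
    if orc is not None:
        serializer["OrcSerDe"] = orc
    return {"Serializer": serializer}


# Values below restate the documented Parquet defaults explicitly.
parquet_serde = {
    "BlockSizeBytes": 256 * MIB,
    "PageSizeBytes": 1 * MIB,
    "Compression": "SNAPPY",
    "EnableDictionaryCompression": False,  # assumption: default not stated here
    "MaxPaddingBytes": 0,
    "WriterVersion": "V1",
}
output_format = build_output_format(parquet=parquet_serde)
```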
Defaults to true. Set it to false if you want to disable format conversion while preserving the configuration details.
The configuration of the dynamic partitioning mechanism that creates smaller data sets from the streaming data by partitioning it based on partition keys. Currently, dynamic partitioning is only supported for Amazon S3 destinations. For more information, see https://docs.aws.amazon.com/firehose/latest/dev/dynamic-partitioning.html
The retry behavior in case Kinesis Data Firehose is unable to deliver data to an Amazon S3 prefix.
The period of time during which Kinesis Data Firehose retries to deliver data to the specified Amazon S3 prefix.
Specifies that the dynamic partitioning is enabled for this Kinesis Data Firehose delivery stream.
The destination in Amazon Redshift. You can specify only one destination.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The database connection string.
The COPY command.
The name of the target table. The table must already exist in the database.
A comma-separated list of column names.
Optional parameters to use with the Amazon Redshift COPY command. For more information, see the "Optional Parameters" section of Amazon Redshift COPY command . Some possible examples that would apply to Kinesis Data Firehose are as follows:
delimiter '\t' lzop; - fields are delimited with "\t" (TAB character) and compressed using lzop.
delimiter '|' - fields are delimited with "|" (this is the default delimiter).
delimiter '|' escape - the delimiter should be escaped.
fixedwidth 'venueid:3,venuename:25,venuecity:12,venuestate:2,venueseats:6' - fields are fixed width in the source, with each width specified after every column in the table.
JSON 's3://mybucket/jsonpaths.txt' - data is in JSON format, and the path specified is the format of the data.
For more examples, see Amazon Redshift COPY command examples .
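The COPY command settings above can be assembled as a plain dictionary. The following is a minimal sketch, not the definitive form: `build_copy_command` is a hypothetical helper, the table and column names are placeholders, and the key names `DataTableName`, `DataTableColumns`, and `CopyOptions` are assumed from the request syntax for `create_delivery_stream`.

```python
def build_copy_command(table, columns=None, copy_options=None):
    """Assemble a CopyCommand mapping; the column list is comma-separated."""
    command = {"DataTableName": table}
    if columns:
        command["DataTableColumns"] = ",".join(columns)
    if copy_options:
        command["CopyOptions"] = copy_options
    return command


# Placeholder table and columns; the options string is one of the examples above.
copy_command = build_copy_command(
    "venue",
    columns=["venueid", "venuename"],
    copy_options="delimiter '|' escape",
)
```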
The name of the user.
The user password.
The retry behavior in case Kinesis Data Firehose is unable to deliver documents to Amazon Redshift. The default value is 3600 seconds (60 minutes).
The length of time during which Kinesis Data Firehose retries delivery after a failure, starting from the initial request and including the first attempt. The default value is 3600 seconds (60 minutes). Kinesis Data Firehose does not retry if the value of DurationInSeconds is 0 (zero) or if the first delivery attempt takes longer than the current value.
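The retry rule for DurationInSeconds can be restated as a small predicate. This is only a sketch of the rule as worded above; `will_retry` is a hypothetical name, and the real decision is made by the service, not by client code.

```python
def will_retry(duration_in_seconds, first_attempt_seconds):
    """Restate the documented rule for whether a failed delivery is retried."""
    if duration_in_seconds == 0:
        # A value of 0 (zero) disables retries.
        return False
    # No retry if the first attempt already took longer than the retry window.
    return first_attempt_seconds <= duration_in_seconds
```

For example, with the default of 3600 seconds a first attempt that fails after 10 seconds is retried, while a first attempt that takes 61 seconds against a 60-second window is not.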
The configuration for the intermediate Amazon S3 location from which Amazon Redshift obtains data. Restrictions are described in the topic for CreateDeliveryStream .
The compression formats SNAPPY or ZIP cannot be specified in RedshiftDestinationConfiguration.S3Configuration because the Amazon Redshift COPY operation that reads from the S3 bucket doesn't support these compression formats.
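A client-side check can catch this restriction before the request is sent. The sketch below assumes the intermediate S3 configuration is a dictionary with an optional `CompressionFormat` key (an assumption based on the request syntax); `check_redshift_s3_compression` is a hypothetical helper, and the comparison is case-insensitive because the exact casing of the accepted values is not stated here.

```python
UNSUPPORTED_FOR_REDSHIFT = {"SNAPPY", "ZIP"}


def check_redshift_s3_compression(s3_configuration):
    """Return the compression format, or raise if Redshift COPY cannot read it."""
    fmt = s3_configuration.get("CompressionFormat", "UNCOMPRESSED")
    if fmt.upper() in UNSUPPORTED_FOR_REDSHIFT:
        raise ValueError(
            f"{fmt} is not supported for the Redshift intermediate S3 location"
        )
    return fmt
```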
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
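The sizing recommendation and the rule that SizeInMBs and IntervalInSeconds must be given together can be sketched as follows. `build_buffering_hints` is a hypothetical helper, and keeping the size at no less than the default of 5 is a choice made for this example rather than a documented requirement.

```python
import math


def build_buffering_hints(ingest_mib_per_sec, interval_in_seconds=300):
    """Size the buffer to hold at least 10 seconds of typical ingest."""
    ten_seconds_of_data = math.ceil(ingest_mib_per_sec * 10)
    size_in_mbs = max(5, ten_seconds_of_data)  # 5 is the documented default
    # Both keys are always set together, as the parameters require.
    return {"SizeInMBs": size_in_mbs, "IntervalInSeconds": interval_in_seconds}


hints = build_buffering_hints(1)  # 1 MiB/sec, the example rate from the text
```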
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
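Since the log group and log stream names are required whenever logging is enabled, a small builder can enforce that before the request is made. This is a sketch under assumptions: `build_logging_options` is a hypothetical helper, and the keys `Enabled`, `LogGroupName`, and `LogStreamName` are assumed from the request syntax.

```python
def build_logging_options(enabled, log_group_name=None, log_stream_name=None):
    """Build CloudWatch logging options, requiring both names when enabled."""
    if enabled and not (log_group_name and log_stream_name):
        raise ValueError(
            "log group and log stream names are required when logging is enabled"
        )
    options = {"Enabled": enabled}
    if log_group_name:
        options["LogGroupName"] = log_group_name
    if log_stream_name:
        options["LogStreamName"] = log_stream_name
    return options
```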
The data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
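The nesting of processors and their parameters is easier to see in a concrete mapping. The sketch below assumes a Lambda processor: the `Type` value `"Lambda"`, the parameter name `"LambdaArn"`, and the key names are taken from the request syntax as best understood and are not stated in this section, and the ARN is a placeholder.

```python
def build_lambda_processing(lambda_arn):
    """Build a processing configuration with a single Lambda processor."""
    return {
        "Enabled": True,
        "Processors": [
            {
                "Type": "Lambda",
                "Parameters": [
                    {"ParameterName": "LambdaArn", "ParameterValue": lambda_arn},
                ],
            }
        ],
    }


# Placeholder ARN for illustration only.
processing = build_lambda_processing(
    "arn:aws:lambda:us-east-1:123456789012:function:example"
)
```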
The Amazon S3 backup mode. After you create a delivery stream, you can update it to enable Amazon S3 backup if it is disabled. If backup is enabled, you can't update the delivery stream to disable it.
The configuration for backup in Amazon S3.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The destination in Amazon ES. You can specify only one destination.
The Amazon Resource Name (ARN) of the IAM role to be assumed by Kinesis Data Firehose for calling the Amazon ES Configuration API and for indexing documents. For more information, see Grant Kinesis Data Firehose Access to an Amazon S3 Destination and Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the Amazon ES domain. The IAM role must have permissions for DescribeElasticsearchDomain, DescribeElasticsearchDomains, and DescribeElasticsearchDomainConfig after assuming the role specified in RoleARN. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Specify either ClusterEndpoint or DomainARN.
The endpoint to use when communicating with the cluster. Specify either this ClusterEndpoint or the DomainARN field.
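The either-or relationship between ClusterEndpoint and DomainARN can be enforced with a small guard. `es_target` is a hypothetical helper; it returns only the one key that was supplied, which can then be merged into the rest of the Amazon ES destination configuration.

```python
def es_target(domain_arn=None, cluster_endpoint=None):
    """Return exactly one of DomainARN or ClusterEndpoint as a mapping."""
    if (domain_arn is None) == (cluster_endpoint is None):
        raise ValueError("specify exactly one of DomainARN or ClusterEndpoint")
    if domain_arn is not None:
        return {"DomainARN": domain_arn}
    return {"ClusterEndpoint": cluster_endpoint}
```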
The Elasticsearch index name.
The Elasticsearch type name. For Elasticsearch 6.x, there can be only one type per index. If you try to specify a new type for an existing index that already has another type, Kinesis Data Firehose returns an error during run time.
For Elasticsearch 7.x, don't specify a TypeName.
The Elasticsearch index rotation period. Index rotation appends a timestamp to the IndexName to facilitate the expiration of old data. For more information, see Index Rotation for the Amazon ES Destination . The default value is OneDay.
The buffering options. If no value is specified, the default values for ElasticsearchBufferingHints are used.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300 (5 minutes).
Buffer incoming data to the specified size, in MBs, before delivering it to the destination. The default value is 5.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MB/sec, the value should be 10 MB or higher.
The retry behavior in case Kinesis Data Firehose is unable to deliver documents to Amazon ES. The default value is 300 seconds (5 minutes).
After an initial failure to deliver to Amazon ES, the total amount of time during which Kinesis Data Firehose retries delivery (including the first attempt). After this time has elapsed, the failed documents are written to Amazon S3. Default value is 300 seconds (5 minutes). A value of 0 (zero) results in no retries.
Defines how documents should be delivered to Amazon S3. When it is set to FailedDocumentsOnly, Kinesis Data Firehose writes any documents that could not be indexed to the configured Amazon S3 destination, with elasticsearch-failed/ appended to the key prefix. When set to AllDocuments, Kinesis Data Firehose delivers all incoming records to Amazon S3, and also writes failed documents with elasticsearch-failed/ appended to the prefix. For more information, see Amazon S3 Backup for the Amazon ES Destination . Default value is FailedDocumentsOnly.
You can't change this backup mode after you create the delivery stream.
The configuration for the backup Amazon S3 location.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
The Amazon CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The details of the VPC of the Amazon ES destination.
The IDs of the subnets that you want Kinesis Data Firehose to use to create ENIs in the VPC of the Amazon ES destination. Make sure that the routing tables and inbound and outbound rules allow traffic to flow from the subnets whose IDs are specified here to the subnets that have the destination Amazon ES endpoints. Kinesis Data Firehose creates at least one ENI in each of the subnets that are specified here. Do not delete or modify these ENIs.
The number of ENIs that Kinesis Data Firehose creates in the subnets specified here scales up and down automatically based on throughput. To enable Kinesis Data Firehose to scale up the number of ENIs to match throughput, ensure that you have sufficient quota. To help you calculate the quota you need, assume that Kinesis Data Firehose can create up to three ENIs for this delivery stream for each of the subnets specified here. For more information about ENI quota, see Network Interfaces in the Amazon VPC Quotas topic.
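The quota guidance above reduces to simple multiplication. The sketch below is a planning aid only: `eni_quota_needed` is a hypothetical helper, the subnet IDs are placeholders, and the figure of three ENIs per subnet is the planning assumption given in the text rather than a hard limit.

```python
MAX_ENIS_PER_SUBNET = 3  # planning figure from the text above


def eni_quota_needed(subnet_ids):
    """Estimate the ENI quota to reserve for one delivery stream."""
    # Duplicate subnet IDs are counted once.
    return MAX_ENIS_PER_SUBNET * len(set(subnet_ids))


quota = eni_quota_needed(["subnet-aaaa1111", "subnet-bbbb2222"])
```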
The ARN of the IAM role that you want the delivery stream to use to create endpoints in the destination VPC. You can use your existing Kinesis Data Firehose delivery role or you can specify a new role. In either case, make sure that the role trusts the Kinesis Data Firehose service principal and that it grants the following permissions:
ec2:DescribeVpcs
ec2:DescribeVpcAttribute
ec2:DescribeSubnets
ec2:DescribeSecurityGroups
ec2:DescribeNetworkInterfaces
ec2:CreateNetworkInterface
ec2:CreateNetworkInterfacePermission
ec2:DeleteNetworkInterface
If you revoke these permissions after you create the delivery stream, Kinesis Data Firehose can't scale out by creating more ENIs when necessary. You might therefore see a degradation in performance.
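The permissions listed above could be expressed as an IAM policy document along the following lines. This is a hedged sketch: the use of `"Resource": "*"` and a single statement are assumptions made for brevity, and you may prefer to scope the statement more tightly. The role's trust relationship with the Kinesis Data Firehose service principal is a separate document and is not shown.

```python
import json

# The eight EC2 actions listed in the text above.
VPC_ACTIONS = [
    "ec2:DescribeVpcs",
    "ec2:DescribeVpcAttribute",
    "ec2:DescribeSubnets",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeNetworkInterfaces",
    "ec2:CreateNetworkInterface",
    "ec2:CreateNetworkInterfacePermission",
    "ec2:DeleteNetworkInterface",
]

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": VPC_ACTIONS,
            "Resource": "*",  # assumption: narrow this where possible
        }
    ],
}

policy_json = json.dumps(policy_document, indent=2)
```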
The IDs of the security groups that you want Kinesis Data Firehose to use when it creates ENIs in the VPC of the Amazon ES destination. You can use the same security group that the Amazon ES domain uses or different ones. If you specify different security groups here, ensure that they allow outbound HTTPS traffic to the Amazon ES domain's security group. Also ensure that the Amazon ES domain's security group allows HTTPS traffic from the security groups specified here. If you use the same security group for both your delivery stream and the Amazon ES domain, make sure the security group inbound rule allows HTTPS traffic. For more information about security group rules, see Security group rules in the Amazon VPC documentation.
Describes the configuration of a destination in Amazon S3.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
Describes a data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
Describes the Amazon CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The details of the VPC of the Amazon ES destination.
The IDs of the subnets that you want Kinesis Data Firehose to use to create ENIs in the VPC of the Amazon ES destination. Make sure that the routing tables and inbound and outbound rules allow traffic to flow from the subnets whose IDs are specified here to the subnets that have the destination Amazon ES endpoints. Kinesis Data Firehose creates at least one ENI in each of the subnets that are specified here. Do not delete or modify these ENIs.
The number of ENIs that Kinesis Data Firehose creates in the subnets specified here scales up and down automatically based on throughput. To enable Kinesis Data Firehose to scale up the number of ENIs to match throughput, ensure that you have sufficient quota. To help you calculate the quota you need, assume that Kinesis Data Firehose can create up to three ENIs for this delivery stream for each of the subnets specified here. For more information about ENI quota, see Network Interfaces in the Amazon VPC Quotas topic.
The ARN of the IAM role that you want the delivery stream to use to create endpoints in the destination VPC. You can use your existing Kinesis Data Firehose delivery role or you can specify a new role. In either case, make sure that the role trusts the Kinesis Data Firehose service principal and that it grants the following permissions:
ec2:DescribeVpcs
ec2:DescribeVpcAttribute
ec2:DescribeSubnets
ec2:DescribeSecurityGroups
ec2:DescribeNetworkInterfaces
ec2:CreateNetworkInterface
ec2:CreateNetworkInterfacePermission
ec2:DeleteNetworkInterface
If you revoke these permissions after you create the delivery stream, Kinesis Data Firehose can't scale out by creating more ENIs when necessary. You might therefore see a degradation in performance.
The IDs of the security groups that you want Kinesis Data Firehose to use when it creates ENIs in the VPC of the Amazon ES destination. You can use the same security group that the Amazon ES domain uses or different ones. If you specify different security groups here, ensure that they allow outbound HTTPS traffic to the Amazon ES domain's security group. Also ensure that the Amazon ES domain's security group allows HTTPS traffic from the security groups specified here. If you use the same security group for both your delivery stream and the Amazon ES domain, make sure the security group inbound rule allows HTTPS traffic. For more information about security group rules, see Security group rules in the Amazon VPC documentation.
The destination in Splunk. You can specify only one destination.
The HTTP Event Collector (HEC) endpoint to which Kinesis Data Firehose sends your data.
This type can be either "Raw" or "Event".
This is a GUID that you obtain from your Splunk cluster when you create a new HEC endpoint.
The amount of time that Kinesis Data Firehose waits to receive an acknowledgment from Splunk after it sends data. At the end of the timeout period, Kinesis Data Firehose either tries to send the data again or considers it an error, based on your retry settings.
The retry behavior in case Kinesis Data Firehose is unable to deliver data to Splunk, or if it doesn't receive an acknowledgment of receipt from Splunk.
The total amount of time that Kinesis Data Firehose spends on retries. This duration starts after the initial attempt to send data to Splunk fails. It doesn't include the periods during which Kinesis Data Firehose waits for acknowledgment from Splunk after each attempt.
Defines how documents should be delivered to Amazon S3. When set to FailedEventsOnly, Kinesis Data Firehose writes any data that could not be indexed to the configured Amazon S3 destination. When set to AllEvents, Kinesis Data Firehose delivers all incoming records to Amazon S3, and also writes failed documents to Amazon S3. The default value is FailedEventsOnly.
You can update this backup mode from FailedEventsOnly to AllEvents. You can't update it from AllEvents to FailedEventsOnly.
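The one-way update rule for the Splunk backup mode can be captured in a small predicate. `can_update_backup_mode` is a hypothetical helper; treating an unchanged mode as allowed is an assumption of this sketch, since the text only describes transitions between the two values.

```python
SPLUNK_BACKUP_MODES = ("FailedEventsOnly", "AllEvents")


def can_update_backup_mode(current, requested):
    """Return True if the backup mode change is permitted by the rule above."""
    if current not in SPLUNK_BACKUP_MODES or requested not in SPLUNK_BACKUP_MODES:
        raise ValueError("unknown Splunk backup mode")
    if current == requested:
        return True  # assumption: leaving the mode unchanged is fine
    return current == "FailedEventsOnly" and requested == "AllEvents"
```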
The configuration for the backup Amazon S3 location.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
The Amazon CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
Enables configuring Kinesis Data Firehose to deliver data to any HTTP endpoint destination. You can specify only one destination.
The configuration of the HTTP endpoint selected as the destination.
The URL of the HTTP endpoint selected as the destination.
Warning
If you choose an HTTP endpoint as your destination, review and follow the instructions in the Appendix - HTTP Endpoint Delivery Request and Response Specifications .
The name of the HTTP endpoint selected as the destination.
The access key required for Kinesis Data Firehose to authenticate with the HTTP endpoint selected as the destination.
The buffering options that can be used before data is delivered to the specified destination. Kinesis Data Firehose treats these options as hints, and it might choose to use more optimal values. The SizeInMBs and IntervalInSeconds parameters are optional. However, if you specify a value for one of them, you must also provide a value for the other.
Buffer incoming data to the specified size, in MBs, before delivering it to the destination. The default value is 5.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MB/sec, the value should be 10 MB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300 (5 minutes).
Describes the Amazon CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The configuration of the request sent to the HTTP endpoint specified as the destination.
Kinesis Data Firehose uses the content encoding to compress the body of a request before sending the request to the destination. For more information, see Content-Encoding in MDN Web Docs, the official Mozilla documentation.
Describes the metadata sent to the HTTP endpoint destination.
Describes the metadata that's delivered to the specified HTTP endpoint destination.
The name of the HTTP endpoint common attribute.
The value of the HTTP endpoint common attribute.
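The request configuration for an HTTP endpoint combines a content encoding with a list of name and value pairs. The sketch below is illustrative: `build_request_configuration` is a hypothetical helper, and the key names and the `NONE` and `GZIP` encoding values are assumed from the request syntax rather than stated in this section.

```python
def build_request_configuration(common_attributes, content_encoding="NONE"):
    """Turn a plain dict of attributes into the nested request configuration."""
    if content_encoding not in ("NONE", "GZIP"):
        raise ValueError("content encoding must be NONE or GZIP")
    return {
        "ContentEncoding": content_encoding,
        "CommonAttributes": [
            {"AttributeName": name, "AttributeValue": value}
            for name, value in common_attributes.items()
        ],
    }


# Placeholder attribute for illustration only.
request_configuration = build_request_configuration({"env": "test"}, "GZIP")
```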
Describes a data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
Kinesis Data Firehose uses this IAM role for all the permissions that the delivery stream needs.
Describes the retry behavior in case Kinesis Data Firehose is unable to deliver data to the specified HTTP endpoint destination, or if it doesn't receive a valid acknowledgment of receipt from the specified HTTP endpoint destination.
The total amount of time that Kinesis Data Firehose spends on retries. This duration starts after the initial attempt to send data to the custom destination via HTTPS endpoint fails. It doesn't include the periods during which Kinesis Data Firehose waits for acknowledgment from the specified destination after each attempt.
Describes the S3 bucket backup options for the data that Kinesis Data Firehose delivers to the HTTP endpoint destination. You can back up all documents (AllData) or only the documents that Kinesis Data Firehose could not deliver to the specified HTTP endpoint destination (FailedDataOnly).
Describes the configuration of a destination in Amazon S3.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
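The sizing recommendation above (buffer at least as much data as you typically ingest in 10 seconds) can be sketched as a small calculation. The helper name is hypothetical, and rounding up to a whole MiB is an assumption of this sketch. The helper always returns both keys, since the two values must be specified together.

```python
import math


def recommended_buffering_hints(ingest_mib_per_sec, interval_seconds=300):
    """Suggest BufferingHints from a typical ingest rate in MiB per second."""
    # Cover at least 10 seconds of typical ingest, rounded up to a whole MiB.
    size_in_mbs = max(1, math.ceil(ingest_mib_per_sec * 10))
    return {"SizeInMBs": size_in_mbs, "IntervalInSeconds": interval_seconds}


buffering_hints = recommended_buffering_hints(1.0)
```

For the 1 MiB/sec example in the text, this yields a SizeInMBs of 10 with the default 300-second interval.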
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
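A client-side guard for the Redshift restriction above might look like the following. The function name is hypothetical, and the case-insensitive comparison is an assumption made because the response syntax spells the value as 'Snappy' while the prose uses SNAPPY. The service performs its own validation regardless.

```python
# Formats the text above says the Amazon Redshift COPY operation cannot read.
REDSHIFT_UNSUPPORTED_FORMATS = {"SNAPPY", "ZIP"}


def check_redshift_compression(compression_format="UNCOMPRESSED"):
    """Return the format unchanged, or raise if Redshift cannot use it."""
    if compression_format.upper() in REDSHIFT_UNSUPPORTED_FORMATS:
        raise ValueError(
            f"{compression_format} is not supported for Amazon Redshift destinations"
        )
    return compression_format
```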
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
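Because the log group and log stream names are required whenever logging is enabled, it can help to check this before sending a request. The sketch below is one way to do that. The helper name is hypothetical, and the keys follow the CloudWatchLoggingOptions structure shown in the response syntax.

```python
def cloudwatch_logging_options(enabled, log_group_name=None, log_stream_name=None):
    """Build a CloudWatchLoggingOptions dict, enforcing the required names."""
    if not enabled:
        return {"Enabled": False}
    if not log_group_name or not log_stream_name:
        raise ValueError(
            "LogGroupName and LogStreamName are required when logging is enabled"
        )
    return {
        "Enabled": True,
        "LogGroupName": log_group_name,
        "LogStreamName": log_stream_name,
    }
```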
A set of tags to assign to the delivery stream. A tag is a key-value pair that you can define and assign to AWS resources. Tags are metadata. For example, you can add friendly names and descriptions or other types of information that can help you distinguish the delivery stream. For more information about tags, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide.
You can specify up to 50 tags when creating a delivery stream.
Metadata that you can assign to a delivery stream, consisting of a key-value pair.
A unique identifier for the tag. Maximum length: 128 characters. Valid characters: Unicode letters, digits, white space, _ . / = + - % @
An optional string, which you can use to describe or define the tag. Maximum length: 256 characters. Valid characters: Unicode letters, digits, white space, _ . / = + - % @
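The tag limits above (up to 50 tags, keys up to 128 characters, values up to 256 characters) can be checked locally before calling the API. This is a sketch only: the helper name is hypothetical, the Key and Value dictionary keys follow the Tags request syntax (not shown in this section), and the allowed-character rule is not checked here.

```python
MAX_TAGS = 50
MAX_KEY_LENGTH = 128
MAX_VALUE_LENGTH = 256


def build_tags(tags):
    """Convert a plain dict into a Tags list, enforcing the documented limits."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags can be specified")
    result = []
    for key, value in tags.items():
        if not 1 <= len(key) <= MAX_KEY_LENGTH:
            raise ValueError(f"tag key must be 1-{MAX_KEY_LENGTH} characters: {key!r}")
        if len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"tag value exceeds {MAX_VALUE_LENGTH} characters: {key!r}")
        result.append({"Key": key, "Value": value})
    return result
```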
dict
Response Syntax
{
    'DeliveryStreamARN': 'string'
}
Response Structure
(dict) --
DeliveryStreamARN (string) --
The ARN of the delivery stream.
Exceptions
Firehose.Client.exceptions.InvalidArgumentException
Firehose.Client.exceptions.LimitExceededException
Firehose.Client.exceptions.ResourceInUseException
Firehose.Client.exceptions.InvalidKMSResourceException
delete_delivery_stream(**kwargs)
Deletes a delivery stream and its data.
To check the state of a delivery stream, use DescribeDeliveryStream. You can delete a delivery stream only if it is in one of the following states: ACTIVE, DELETING, CREATING_FAILED, or DELETING_FAILED. You can't delete a delivery stream that is in the CREATING state. While the deletion request is in process, the delivery stream is in the DELETING state.
While the delivery stream is in the DELETING state, the service might continue to accept records, but it doesn't make any guarantees with respect to delivering the data. Therefore, as a best practice, first stop any applications that are sending records before you delete a delivery stream.
See also: AWS API Documentation
Request Syntax
response = client.delete_delivery_stream(
    DeliveryStreamName='string',
    AllowForceDelete=True|False
)
DeliveryStreamName (string) --
[REQUIRED]
The name of the delivery stream.
AllowForceDelete (boolean) --
Set this to true if you want to delete the delivery stream even if Kinesis Data Firehose is unable to retire the grant for the CMK. Kinesis Data Firehose might be unable to retire the grant due to a customer error, such as when the CMK or the grant is in an invalid state. If you force deletion, you can then use the RevokeGrant operation to revoke the grant you gave to Kinesis Data Firehose. If a failure to retire the grant happens due to an AWS KMS issue, Kinesis Data Firehose keeps retrying the delete operation.
The default value is false.
dict
Response Syntax
{}
Response Structure
Exceptions
Firehose.Client.exceptions.ResourceInUseException
Firehose.Client.exceptions.ResourceNotFoundException
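The state rules for deletion can be sketched as a guard that checks the stream status before calling delete_delivery_stream. The function name and the allow_force argument are hypothetical. The client argument is expected to behave like boto3.client('firehose'), and the sketch does not handle the listed exceptions.

```python
# States from which the text above says a delivery stream can be deleted.
DELETABLE_STATES = {"ACTIVE", "DELETING", "CREATING_FAILED", "DELETING_FAILED"}


def delete_stream_if_allowed(client, name, allow_force=False):
    """Delete the stream if its status permits it; return True if requested."""
    response = client.describe_delivery_stream(DeliveryStreamName=name)
    status = response["DeliveryStreamDescription"]["DeliveryStreamStatus"]
    if status not in DELETABLE_STATES:
        # For example CREATING: deletion is not possible yet.
        return False
    client.delete_delivery_stream(
        DeliveryStreamName=name,
        AllowForceDelete=allow_force,
    )
    return True
```

As the text recommends, stop any applications that send records to the stream before calling a helper like this.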
describe_delivery_stream(**kwargs)
Describes the specified delivery stream and its status. For example, after your delivery stream is created, call DescribeDeliveryStream to see whether the delivery stream is ACTIVE and therefore ready for data to be sent to it.
If the status of a delivery stream is CREATING_FAILED, this status doesn't change, and you can't invoke CreateDeliveryStream again on it. However, you can invoke the DeleteDeliveryStream operation to delete it. If the status is DELETING_FAILED, you can force deletion by invoking DeleteDeliveryStream again with AllowForceDelete set to true.
See also: AWS API Documentation
Request Syntax
response = client.describe_delivery_stream(
    DeliveryStreamName='string',
    Limit=123,
    ExclusiveStartDestinationId='string'
)
DeliveryStreamName (string) --
[REQUIRED]
The name of the delivery stream.
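A common use of this operation is to poll until a newly created stream becomes ACTIVE. The sketch below shows one way to do that. The function name, the delay, and the attempt limit are arbitrary, hypothetical choices, and the client argument is expected to behave like boto3.client('firehose').

```python
import time


def wait_until_active(client, name, delay_seconds=10, max_attempts=30, sleep=time.sleep):
    """Poll describe_delivery_stream until the stream is ACTIVE.

    Returns True once the stream is ACTIVE and False if it is still
    CREATING after max_attempts polls. Raises RuntimeError for any other
    status, such as CREATING_FAILED, which never changes on its own.
    """
    for _ in range(max_attempts):
        response = client.describe_delivery_stream(DeliveryStreamName=name)
        status = response["DeliveryStreamDescription"]["DeliveryStreamStatus"]
        if status == "ACTIVE":
            return True
        if status != "CREATING":
            raise RuntimeError(f"delivery stream {name} is in state {status}")
        sleep(delay_seconds)
    return False
```

The sleep parameter exists only so the wait can be replaced in tests.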
dict
Response Syntax
{
'DeliveryStreamDescription': {
'DeliveryStreamName': 'string',
'DeliveryStreamARN': 'string',
'DeliveryStreamStatus': 'CREATING'|'CREATING_FAILED'|'DELETING'|'DELETING_FAILED'|'ACTIVE',
'FailureDescription': {
'Type': 'RETIRE_KMS_GRANT_FAILED'|'CREATE_KMS_GRANT_FAILED'|'KMS_ACCESS_DENIED'|'DISABLED_KMS_KEY'|'INVALID_KMS_KEY'|'KMS_KEY_NOT_FOUND'|'KMS_OPT_IN_REQUIRED'|'CREATE_ENI_FAILED'|'DELETE_ENI_FAILED'|'SUBNET_NOT_FOUND'|'SECURITY_GROUP_NOT_FOUND'|'ENI_ACCESS_DENIED'|'SUBNET_ACCESS_DENIED'|'SECURITY_GROUP_ACCESS_DENIED'|'UNKNOWN_ERROR',
'Details': 'string'
},
'DeliveryStreamEncryptionConfiguration': {
'KeyARN': 'string',
'KeyType': 'AWS_OWNED_CMK'|'CUSTOMER_MANAGED_CMK',
'Status': 'ENABLED'|'ENABLING'|'ENABLING_FAILED'|'DISABLED'|'DISABLING'|'DISABLING_FAILED',
'FailureDescription': {
'Type': 'RETIRE_KMS_GRANT_FAILED'|'CREATE_KMS_GRANT_FAILED'|'KMS_ACCESS_DENIED'|'DISABLED_KMS_KEY'|'INVALID_KMS_KEY'|'KMS_KEY_NOT_FOUND'|'KMS_OPT_IN_REQUIRED'|'CREATE_ENI_FAILED'|'DELETE_ENI_FAILED'|'SUBNET_NOT_FOUND'|'SECURITY_GROUP_NOT_FOUND'|'ENI_ACCESS_DENIED'|'SUBNET_ACCESS_DENIED'|'SECURITY_GROUP_ACCESS_DENIED'|'UNKNOWN_ERROR',
'Details': 'string'
}
},
'DeliveryStreamType': 'DirectPut'|'KinesisStreamAsSource',
'VersionId': 'string',
'CreateTimestamp': datetime(2015, 1, 1),
'LastUpdateTimestamp': datetime(2015, 1, 1),
'Source': {
'KinesisStreamSourceDescription': {
'KinesisStreamARN': 'string',
'RoleARN': 'string',
'DeliveryStartTimestamp': datetime(2015, 1, 1)
}
},
'Destinations': [
{
'DestinationId': 'string',
'S3DestinationDescription': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ExtendedS3DestinationDescription': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'S3BackupMode': 'Disabled'|'Enabled',
'S3BackupDescription': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'DataFormatConversionConfiguration': {
'SchemaConfiguration': {
'RoleARN': 'string',
'CatalogId': 'string',
'DatabaseName': 'string',
'TableName': 'string',
'Region': 'string',
'VersionId': 'string'
},
'InputFormatConfiguration': {
'Deserializer': {
'OpenXJsonSerDe': {
'ConvertDotsInJsonKeysToUnderscores': True|False,
'CaseInsensitive': True|False,
'ColumnToJsonKeyMappings': {
'string': 'string'
}
},
'HiveJsonSerDe': {
'TimestampFormats': [
'string',
]
}
}
},
'OutputFormatConfiguration': {
'Serializer': {
'ParquetSerDe': {
'BlockSizeBytes': 123,
'PageSizeBytes': 123,
'Compression': 'UNCOMPRESSED'|'GZIP'|'SNAPPY',
'EnableDictionaryCompression': True|False,
'MaxPaddingBytes': 123,
'WriterVersion': 'V1'|'V2'
},
'OrcSerDe': {
'StripeSizeBytes': 123,
'BlockSizeBytes': 123,
'RowIndexStride': 123,
'EnablePadding': True|False,
'PaddingTolerance': 123.0,
'Compression': 'NONE'|'ZLIB'|'SNAPPY',
'BloomFilterColumns': [
'string',
],
'BloomFilterFalsePositiveProbability': 123.0,
'DictionaryKeyThreshold': 123.0,
'FormatVersion': 'V0_11'|'V0_12'
}
}
},
'Enabled': True|False
},
'DynamicPartitioningConfiguration': {
'RetryOptions': {
'DurationInSeconds': 123
},
'Enabled': True|False
}
},
'RedshiftDestinationDescription': {
'RoleARN': 'string',
'ClusterJDBCURL': 'string',
'CopyCommand': {
'DataTableName': 'string',
'DataTableColumns': 'string',
'CopyOptions': 'string'
},
'Username': 'string',
'RetryOptions': {
'DurationInSeconds': 123
},
'S3DestinationDescription': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'S3BackupMode': 'Disabled'|'Enabled',
'S3BackupDescription': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ElasticsearchDestinationDescription': {
'RoleARN': 'string',
'DomainARN': 'string',
'ClusterEndpoint': 'string',
'IndexName': 'string',
'TypeName': 'string',
'IndexRotationPeriod': 'NoRotation'|'OneHour'|'OneDay'|'OneWeek'|'OneMonth',
'BufferingHints': {
'IntervalInSeconds': 123,
'SizeInMBs': 123
},
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedDocumentsOnly'|'AllDocuments',
'S3DestinationDescription': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'VpcConfigurationDescription': {
'SubnetIds': [
'string',
],
'RoleARN': 'string',
'SecurityGroupIds': [
'string',
],
'VpcId': 'string'
}
},
'AmazonopensearchserviceDestinationDescription': {
'RoleARN': 'string',
'DomainARN': 'string',
'ClusterEndpoint': 'string',
'IndexName': 'string',
'TypeName': 'string',
'IndexRotationPeriod': 'NoRotation'|'OneHour'|'OneDay'|'OneWeek'|'OneMonth',
'BufferingHints': {
'IntervalInSeconds': 123,
'SizeInMBs': 123
},
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedDocumentsOnly'|'AllDocuments',
'S3DestinationDescription': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'VpcConfigurationDescription': {
'SubnetIds': [
'string',
],
'RoleARN': 'string',
'SecurityGroupIds': [
'string',
],
'VpcId': 'string'
}
},
'SplunkDestinationDescription': {
'HECEndpoint': 'string',
'HECEndpointType': 'Raw'|'Event',
'HECToken': 'string',
'HECAcknowledgmentTimeoutInSeconds': 123,
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedEventsOnly'|'AllEvents',
'S3DestinationDescription': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'HttpEndpointDestinationDescription': {
'EndpointConfiguration': {
'Url': 'string',
'Name': 'string'
},
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'RequestConfiguration': {
'ContentEncoding': 'NONE'|'GZIP',
'CommonAttributes': [
{
'AttributeName': 'string',
'AttributeValue': 'string'
},
]
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'RoleARN': 'string',
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedDataOnly'|'AllData',
'S3DestinationDescription': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
}
}
},
],
'HasMoreDestinations': True|False
}
}
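The HasMoreDestinations flag, together with the Limit and ExclusiveStartDestinationId request parameters, suggests a paging loop such as the one below. This is a sketch under assumptions: it assumes ExclusiveStartDestinationId accepts the ID of the last destination already returned, which this section does not spell out, and the function name and page size are hypothetical.

```python
def list_all_destinations(client, name, page_size=10):
    """Collect every destination description for a delivery stream."""
    destinations = []
    kwargs = {"DeliveryStreamName": name, "Limit": page_size}
    while True:
        response = client.describe_delivery_stream(**kwargs)
        description = response["DeliveryStreamDescription"]
        page = description["Destinations"]
        destinations.extend(page)
        # Stop when the service reports no further pages (or returns nothing).
        if not description["HasMoreDestinations"] or not page:
            return destinations
        # Assumption: resume after the last destination seen so far.
        kwargs["ExclusiveStartDestinationId"] = page[-1]["DestinationId"]
```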
Response Structure
(dict) --
DeliveryStreamDescription (dict) --
Information about the delivery stream.
DeliveryStreamName (string) --
The name of the delivery stream.
DeliveryStreamARN (string) --
The Amazon Resource Name (ARN) of the delivery stream. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
DeliveryStreamStatus (string) --
The status of the delivery stream. If the status of a delivery stream is CREATING_FAILED, this status doesn't change, and you can't invoke CreateDeliveryStream again on it. However, you can invoke the DeleteDeliveryStream operation to delete it.
FailureDescription (dict) --
Provides details in case one of the following operations fails due to an error related to KMS: CreateDeliveryStream , DeleteDeliveryStream , StartDeliveryStreamEncryption , StopDeliveryStreamEncryption .
Type (string) --
The type of error that caused the failure.
Details (string) --
A message providing details about the error that caused the failure.
DeliveryStreamEncryptionConfiguration (dict) --
Indicates the server-side encryption (SSE) status for the delivery stream.
KeyARN (string) --
If KeyType is CUSTOMER_MANAGED_CMK, this field contains the ARN of the customer managed CMK. If KeyType is AWS_OWNED_CMK, DeliveryStreamEncryptionConfiguration doesn't contain a value for KeyARN.
KeyType (string) --
Indicates the type of customer master key (CMK) that is used for encryption. The default setting is AWS_OWNED_CMK. For more information about CMKs, see Customer Master Keys (CMKs).
Status (string) --
This is the server-side encryption (SSE) status for the delivery stream. For a full description of the different values of this status, see StartDeliveryStreamEncryption and StopDeliveryStreamEncryption. If this status is ENABLING_FAILED or DISABLING_FAILED, it is the status of the most recent attempt to enable or disable SSE, respectively.
FailureDescription (dict) --
Provides details in case one of the following operations fails due to an error related to KMS: CreateDeliveryStream , DeleteDeliveryStream , StartDeliveryStreamEncryption , StopDeliveryStreamEncryption .
Type (string) --
The type of error that caused the failure.
Details (string) --
A message providing details about the error that caused the failure.
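To show how the encryption fields fit together, the sketch below pulls them out of a DeliveryStreamDescription dictionary. The function name and the shape of the returned summary are hypothetical. Using .get() reflects that KeyARN is absent for AWS_OWNED_CMK, and the sketch assumes the whole block may be missing as well.

```python
def summarize_encryption(stream_description):
    """Extract SSE status, key details, and any failure from a description."""
    config = stream_description.get("DeliveryStreamEncryptionConfiguration", {})
    summary = {
        "status": config.get("Status"),
        "key_type": config.get("KeyType"),
        # Present only when KeyType is CUSTOMER_MANAGED_CMK.
        "key_arn": config.get("KeyARN"),
        "failure": None,
    }
    if summary["status"] in ("ENABLING_FAILED", "DISABLING_FAILED"):
        failure = config.get("FailureDescription", {})
        summary["failure"] = (failure.get("Type"), failure.get("Details"))
    return summary
```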
DeliveryStreamType (string) --
The delivery stream type. This can be one of the following values:
DirectPut: Provider applications access the delivery stream directly.
KinesisStreamAsSource: The delivery stream uses a Kinesis data stream as a source.
VersionId (string) --
Each time the destination is updated for a delivery stream, the version ID is changed, and the current version ID is required when updating the destination. This is so that the service knows it is applying the changes to the correct version of the delivery stream.
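Since the current version ID has to accompany a destination update, a typical pattern is to read it from the description first. The sketch below only prepares keyword arguments for update_destination. The helper name is hypothetical, the argument names CurrentDeliveryStreamVersionId and DestinationId come from the UpdateDestination request syntax (not shown in this section), and taking the first destination is an assumption.

```python
def update_destination_arguments(stream_description, destination_update):
    """Build keyword arguments for update_destination from a description."""
    return {
        "DeliveryStreamName": stream_description["DeliveryStreamName"],
        # Must match the stream's current version or the update is rejected.
        "CurrentDeliveryStreamVersionId": stream_description["VersionId"],
        # Assumption: the stream's first destination is the one to update.
        "DestinationId": stream_description["Destinations"][0]["DestinationId"],
        **destination_update,
    }
```

The result could then be passed as client.update_destination(**arguments), after which the stream reports a new VersionId.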
CreateTimestamp (datetime) --
The date and time that the delivery stream was created.
LastUpdateTimestamp (datetime) --
The date and time that the delivery stream was last updated.
Source (dict) --
If the DeliveryStreamType parameter is KinesisStreamAsSource, a SourceDescription object describing the source Kinesis data stream.
KinesisStreamSourceDescription (dict) --
The KinesisStreamSourceDescription value for the source Kinesis data stream.
KinesisStreamARN (string) --
The Amazon Resource Name (ARN) of the source Kinesis data stream. For more information, see Amazon Kinesis Data Streams ARN Format .
RoleARN (string) --
The ARN of the role used by the source Kinesis data stream. For more information, see AWS Identity and Access Management (IAM) ARN Format .
DeliveryStartTimestamp (datetime) --
Kinesis Data Firehose starts retrieving records from the Kinesis data stream starting with this timestamp.
Destinations (list) --
The destinations.
(dict) --
Describes the destination for a delivery stream.
DestinationId (string) --
The ID of the destination.
S3DestinationDescription (dict) --
[Deprecated] The destination in Amazon S3.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
BucketARN (string) --
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Prefix (string) --
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
ErrorOutputPrefix (string) --
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
BufferingHints (dict) --
The buffering option. If no value is specified, BufferingHints object default values are used.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
CompressionFormat (string) --
The compression format. If no value is specified, the default is UNCOMPRESSED.
EncryptionConfiguration (dict) --
The encryption configuration. If no value is specified, the default is no encryption.
NoEncryptionConfig (string) --
Specifically override existing encryption information to ensure that no encryption is used.
KMSEncryptionConfig (dict) --
The encryption key.
AWSKMSKeyARN (string) --
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
ExtendedS3DestinationDescription (dict) --
The destination in Amazon S3.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
BucketARN (string) --
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Prefix (string) --
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
ErrorOutputPrefix (string) --
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
BufferingHints (dict) --
The buffering option.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
CompressionFormat (string) --
The compression format. If no value is specified, the default is UNCOMPRESSED.
EncryptionConfiguration (dict) --
The encryption configuration. If no value is specified, the default is no encryption.
NoEncryptionConfig (string) --
Specifically override existing encryption information to ensure that no encryption is used.
KMSEncryptionConfig (dict) --
The encryption key.
AWSKMSKeyARN (string) --
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
ProcessingConfiguration (dict) --
The data processing configuration.
Enabled (boolean) --
Enables or disables data processing.
Processors (list) --
The data processors.
(dict) --
Describes a data processor.
Type (string) --
The type of processor.
Parameters (list) --
The processor parameters.
(dict) --
Describes the processor parameter.
ParameterName (string) --
The name of the parameter.
ParameterValue (string) --
The parameter value.
S3BackupMode (string) --
The Amazon S3 backup mode.
S3BackupDescription (dict) --
The configuration for backup in Amazon S3.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
BucketARN (string) --
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Prefix (string) --
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
ErrorOutputPrefix (string) --
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
BufferingHints (dict) --
The buffering option. If no value is specified, BufferingHints object default values are used.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
CompressionFormat (string) --
The compression format. If no value is specified, the default is UNCOMPRESSED.
EncryptionConfiguration (dict) --
The encryption configuration. If no value is specified, the default is no encryption.
NoEncryptionConfig (string) --
Specifically override existing encryption information to ensure that no encryption is used.
KMSEncryptionConfig (dict) --
The encryption key.
AWSKMSKeyARN (string) --
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
DataFormatConversionConfiguration (dict) --
The serializer, deserializer, and schema for converting data from the JSON format to the Parquet or ORC format before writing it to Amazon S3.
SchemaConfiguration (dict) --
Specifies the AWS Glue Data Catalog table that contains the column information. This parameter is required if Enabled is set to true.
RoleARN (string) --
The role that Kinesis Data Firehose can use to access AWS Glue. This role must be in the same account you use for Kinesis Data Firehose. Cross-account roles aren't allowed.
Warning
If the SchemaConfiguration request parameter is used as part of invoking the CreateDeliveryStream API, then the RoleARN property is required and its value must be specified.
CatalogId (string) --
The ID of the AWS Glue Data Catalog. If you don't supply this, the AWS account ID is used by default.
DatabaseName (string) --
Specifies the name of the AWS Glue database that contains the schema for the output data.
Warning
If the SchemaConfiguration request parameter is used as part of invoking the CreateDeliveryStream API, then the DatabaseName property is required and its value must be specified.
TableName (string) --
Specifies the AWS Glue table that contains the column information that constitutes your data schema.
Warning
If the SchemaConfiguration request parameter is used as part of invoking the CreateDeliveryStream API, then the TableName property is required and its value must be specified.
Region (string) --
If you don't specify an AWS Region, the default is the current Region.
VersionId (string) --
Specifies the table version for the output data schema. If you don't specify this version ID, or if you set it to LATEST, Kinesis Data Firehose uses the most recent version. This means that any updates to the table are automatically picked up.
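The three warnings above can be checked before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named missing_schema_keys; the ARN, database, and table values are placeholders.

```python
REQUIRED_SCHEMA_KEYS = ("RoleARN", "DatabaseName", "TableName")


def missing_schema_keys(schema_config):
    """Return the required SchemaConfiguration keys that are absent or empty."""
    return [key for key in REQUIRED_SCHEMA_KEYS if not schema_config.get(key)]


# Placeholder values; substitute your own role, database, and table.
schema_config = {
    "RoleARN": "arn:aws:iam::123456789012:role/example-firehose-glue-role",
    "DatabaseName": "example_db",
    "TableName": "example_table",
    "VersionId": "LATEST",
}

missing = missing_schema_keys(schema_config)
```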
InputFormatConfiguration (dict) --
Specifies the deserializer that you want Kinesis Data Firehose to use to convert the format of your data from JSON. This parameter is required if Enabled is set to true.
Deserializer (dict) --
Specifies which deserializer to use. You can choose either the Apache Hive JSON SerDe or the OpenX JSON SerDe. If both are non-null, the server rejects the request.
OpenXJsonSerDe (dict) --
The OpenX SerDe. Used by Kinesis Data Firehose for deserializing data, which means converting it from the JSON format in preparation for serializing it to the Parquet or ORC format. This is one of two deserializers you can choose, depending on which one offers the functionality you need. The other option is the native Hive / HCatalog JsonSerDe.
ConvertDotsInJsonKeysToUnderscores (boolean) --
When set to true, specifies that the names of the keys include dots and that you want Kinesis Data Firehose to replace them with underscores. This is useful because Apache Hive does not allow dots in column names. For example, if the JSON contains a key whose name is "a.b", you can define the column name to be "a_b" when using this option.
The default is false.
CaseInsensitive (boolean) --
When set to true, which is the default, Kinesis Data Firehose converts JSON keys to lowercase before deserializing them.
ColumnToJsonKeyMappings (dict) --
Maps column names to JSON keys that aren't identical to the column names. This is useful when the JSON contains keys that are Hive keywords. For example, timestamp is a Hive keyword. If you have a JSON key named timestamp, set this parameter to {"ts": "timestamp"} to map this key to a column named ts.
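To make the three OpenX options concrete, the sketch below imitates their effect on a single JSON record in plain Python. It is an illustration only, not the Firehose implementation: the function name to_columns is hypothetical, and the order in which the options are applied is an assumption.

```python
def to_columns(record, convert_dots=False, case_insensitive=True,
               column_to_json_key=None):
    """Approximate how the OpenX options above map JSON keys to column names."""
    # Invert {"column": "json_key"} so JSON keys can be looked up directly.
    json_key_to_column = {v: k for k, v in (column_to_json_key or {}).items()}
    columns = {}
    for key, value in record.items():
        if case_insensitive:
            key = key.lower()
        if key in json_key_to_column:
            key = json_key_to_column[key]
        elif convert_dots:
            key = key.replace(".", "_")
        columns[key] = value
    return columns


row = to_columns(
    {"a.b": 1, "Timestamp": "2020-01-01 00:00:00"},
    convert_dots=True,
    column_to_json_key={"ts": "timestamp"},
)
```

Here the key "a.b" ends up in column a_b and the key "Timestamp" ends up in column ts.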
HiveJsonSerDe (dict) --
The native Hive / HCatalog JsonSerDe. Used by Kinesis Data Firehose for deserializing data, which means converting it from the JSON format in preparation for serializing it to the Parquet or ORC format. This is one of two deserializers you can choose, depending on which one offers the functionality you need. The other option is the OpenX SerDe.
TimestampFormats (list) --
Indicates how you want Kinesis Data Firehose to parse the date and timestamps that may be present in your input data JSON. To specify these format strings, follow the pattern syntax of JodaTime's DateTimeFormat format strings. For more information, see Class DateTimeFormat . You can also use the special value millis to parse timestamps in epoch milliseconds. If you don't specify a format, Kinesis Data Firehose uses java.sql.Timestamp::valueOf by default.
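A deserializer fragment that lists timestamp formats might look like the sketch below. The Joda-Time pattern is only an example, and from_epoch_millis is a hypothetical helper that shows what an epoch-milliseconds value represents; it is not something Firehose exposes.

```python
from datetime import datetime, timezone

# Deserializer fragment; the second entry is an example Joda-Time pattern.
deserializer = {
    "HiveJsonSerDe": {
        "TimestampFormats": ["millis", "yyyy-MM-dd'T'HH:mm:ss"],
    }
}


def from_epoch_millis(value):
    """Interpret an integer as milliseconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)


parsed = from_epoch_millis(1_600_000_000_000)
```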
OutputFormatConfiguration (dict) --
Specifies the serializer that you want Kinesis Data Firehose to use to convert the format of your data to the Parquet or ORC format. This parameter is required if Enabled is set to true.
Serializer (dict) --
Specifies which serializer to use. You can choose either the ORC SerDe or the Parquet SerDe. If both are non-null, the server rejects the request.
ParquetSerDe (dict) --
A serializer to use for converting data to the Parquet format before storing it in Amazon S3. For more information, see Apache Parquet .
BlockSizeBytes (integer) --
The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Kinesis Data Firehose uses this value for padding calculations.
PageSizeBytes (integer) --
The Parquet page size. Column chunks are divided into pages. A page is conceptually an indivisible unit (in terms of compression and encoding). The minimum value is 64 KiB and the default is 1 MiB.
Compression (string) --
The compression code to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP, with the default being SNAPPY. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.
EnableDictionaryCompression (boolean) --
Indicates whether to enable dictionary compression.
MaxPaddingBytes (integer) --
The maximum amount of padding to apply. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 0.
WriterVersion (string) --
Indicates the version of row format to output. The possible values are V1 and V2. The default is V1.
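The Parquet size fields are expressed in bytes, while the defaults and minimums above are quoted in MiB and KiB. A minimal sketch of the conversion follows; check_parquet_sizes is a hypothetical helper, and the EnableDictionaryCompression value is an arbitrary choice for illustration.

```python
KIB = 1024
MIB = 1024 * 1024

parquet_serde = {
    "BlockSizeBytes": 256 * MIB,           # documented default
    "PageSizeBytes": 1 * MIB,              # documented default
    "Compression": "SNAPPY",               # documented default
    "EnableDictionaryCompression": False,  # illustrative choice
    "MaxPaddingBytes": 0,                  # documented default
    "WriterVersion": "V1",                 # documented default
}


def check_parquet_sizes(serde):
    """Raise ValueError if a size is below the documented minimum."""
    if serde["BlockSizeBytes"] < 64 * MIB:
        raise ValueError("BlockSizeBytes must be at least 64 MiB")
    if serde["PageSizeBytes"] < 64 * KIB:
        raise ValueError("PageSizeBytes must be at least 64 KiB")


check_parquet_sizes(parquet_serde)
```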
OrcSerDe (dict) --
A serializer to use for converting data to the ORC format before storing it in Amazon S3. For more information, see Apache ORC .
StripeSizeBytes (integer) --
The number of bytes in each stripe. The default is 64 MiB and the minimum is 8 MiB.
BlockSizeBytes (integer) --
The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Kinesis Data Firehose uses this value for padding calculations.
RowIndexStride (integer) --
The number of rows between index entries. The default is 10,000 and the minimum is 1,000.
EnablePadding (boolean) --
Set this to true to indicate that you want stripes to be padded to the HDFS block boundaries. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is false.
PaddingTolerance (float) --
A number between 0 and 1 that defines the tolerance for block padding as a decimal fraction of stripe size. The default value is 0.05, which means 5 percent of stripe size.
For the default values of 64 MiB ORC stripes and 256 MiB HDFS blocks, the default block padding tolerance of 5 percent reserves a maximum of 3.2 MiB for padding within the 256 MiB block. In such a case, if the available size within the block is more than 3.2 MiB, a new, smaller stripe is inserted to fit within that space. This ensures that no stripe crosses block boundaries and causes remote reads within a node-local task.
Kinesis Data Firehose ignores this parameter when OrcSerDe$EnablePadding is false.
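The 3.2 MiB figure above is the padding tolerance multiplied by the stripe size. The arithmetic can be sketched as follows; max_padding_bytes is a hypothetical helper name.

```python
MIB = 1024 * 1024


def max_padding_bytes(stripe_size_bytes=64 * MIB, padding_tolerance=0.05):
    """Padding budget as a fraction of the ORC stripe size."""
    return stripe_size_bytes * padding_tolerance


# With the default 64 MiB stripe and 0.05 tolerance, this is about 3.2 MiB.
budget_mib = max_padding_bytes() / MIB
```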
Compression (string) --
The compression code to use over data blocks. The default is SNAPPY.
BloomFilterColumns (list) --
The column names for which you want Kinesis Data Firehose to create bloom filters. The default is null.
BloomFilterFalsePositiveProbability (float) --
The Bloom filter false positive probability (FPP). The lower the FPP, the bigger the Bloom filter. The default value is 0.05, the minimum is 0, and the maximum is 1.
DictionaryKeyThreshold (float) --
Represents the fraction of the total number of non-null rows. To turn off dictionary encoding, set this fraction to a number that is less than the number of distinct keys in a dictionary. To always use dictionary encoding, set this threshold to 1.
FormatVersion (string) --
The version of the file to write. The possible values are V0_11 and V0_12. The default is V0_12.
Enabled (boolean) --
Defaults to true. Set it to false if you want to disable format conversion while preserving the configuration details.
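Because the service rejects a request in which both deserializers or both serializers are set, a client-side check can catch the mistake early. The sketch below assumes a hypothetical helper named chosen_serde; it also rejects the case where neither option is set, which is an assumption on top of the rule stated in this reference.

```python
def chosen_serde(container, options):
    """Return the single configured SerDe name, or raise ValueError."""
    present = [name for name in options if container.get(name) is not None]
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {options}, got {present}")
    return present[0]


# Skeleton configuration; SchemaConfiguration is omitted for brevity.
conversion = {
    "Enabled": True,
    "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
    "OutputFormatConfiguration": {"Serializer": {"ParquetSerDe": {}}},
}

deserializer = chosen_serde(
    conversion["InputFormatConfiguration"]["Deserializer"],
    ("OpenXJsonSerDe", "HiveJsonSerDe"),
)
serializer = chosen_serde(
    conversion["OutputFormatConfiguration"]["Serializer"],
    ("ParquetSerDe", "OrcSerDe"),
)
```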
DynamicPartitioningConfiguration (dict) --
The configuration of the dynamic partitioning mechanism that creates smaller data sets from the streaming data by partitioning it based on partition keys. Currently, dynamic partitioning is only supported for Amazon S3 destinations. For more information, see https://docs.aws.amazon.com/firehose/latest/dev/dynamic-partitioning.html
RetryOptions (dict) --
The retry behavior in case Kinesis Data Firehose is unable to deliver data to an Amazon S3 prefix.
DurationInSeconds (integer) --
The period of time during which Kinesis Data Firehose retries to deliver data to the specified Amazon S3 prefix.
Enabled (boolean) --
Specifies that the dynamic partitioning is enabled for this Kinesis Data Firehose delivery stream.
RedshiftDestinationDescription (dict) --
The destination in Amazon Redshift.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
ClusterJDBCURL (string) --
The database connection string.
CopyCommand (dict) --
The COPY command.
DataTableName (string) --
The name of the target table. The table must already exist in the database.
DataTableColumns (string) --
A comma-separated list of column names.
CopyOptions (string) --
Optional parameters to use with the Amazon Redshift COPY command. For more information, see the "Optional Parameters" section of Amazon Redshift COPY command . Some possible examples that would apply to Kinesis Data Firehose are as follows:
delimiter '\t' lzop; - fields are delimited with "\t" (TAB character) and compressed using lzop.
delimiter '|' - fields are delimited with "|" (this is the default delimiter).
delimiter '|' escape - the delimiter should be escaped.
fixedwidth 'venueid:3,venuename:25,venuecity:12,venuestate:2,venueseats:6' - fields are fixed width in the source, with each width specified after every column in the table.
JSON 's3://mybucket/jsonpaths.txt' - data is in JSON format, and the path specified is the format of the data.
For more examples, see Amazon Redshift COPY command examples .
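Expressed as a Python dict, a CopyCommand with the tab-delimiter option might look like the sketch below. The table and column names are hypothetical. Note the doubled backslash: the option string should contain the two characters \t literally rather than an actual tab.

```python
copy_command = {
    "DataTableName": "example_table",         # hypothetical table name
    "DataTableColumns": "venueid,venuename",  # hypothetical column list
    # "\\t" keeps a literal backslash followed by "t" in the option text.
    "CopyOptions": "delimiter '\\t' lzop;",
}
```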
Username (string) --
The name of the user.
RetryOptions (dict) --
The retry behavior in case Kinesis Data Firehose is unable to deliver documents to Amazon Redshift. Default value is 3600 (60 minutes).
DurationInSeconds (integer) --
The length of time during which Kinesis Data Firehose retries delivery after a failure, starting from the initial request and including the first attempt. The default value is 3600 seconds (60 minutes). Kinesis Data Firehose does not retry if the value of DurationInSeconds is 0 (zero) or if the first delivery attempt takes longer than the current value.
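The two no-retry conditions above can be restated as a small predicate. This is only a reading of the rule as written, under the assumption that a first attempt lasting exactly as long as the retry window still leaves room to retry; will_retry is a hypothetical name.

```python
def will_retry(duration_in_seconds, first_attempt_seconds):
    """Mirror the rule above: no retry if the window is zero or already spent."""
    if duration_in_seconds == 0:
        return False
    return first_attempt_seconds <= duration_in_seconds
```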
S3DestinationDescription (dict) --
The Amazon S3 destination.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
BucketARN (string) --
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Prefix (string) --
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
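To see what the "YYYY/MM/DD/HH" layout looks like for a given moment, the sketch below renders it with strftime. The helper time_prefix is hypothetical, and placing the time portion directly after a plain custom prefix with a trailing slash is an assumption made for illustration.

```python
from datetime import datetime, timezone


def time_prefix(moment, custom_prefix=""):
    """Render a YYYY/MM/DD/HH key prefix, optionally after a custom prefix."""
    return custom_prefix + moment.strftime("%Y/%m/%d/%H") + "/"


example = time_prefix(datetime(2021, 3, 4, 5, tzinfo=timezone.utc), "logs/")
```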
ErrorOutputPrefix (string) --
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
BufferingHints (dict) --
The buffering option. If no value is specified, BufferingHints object default values are used.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
CompressionFormat (string) --
The compression format. If no value is specified, the default is UNCOMPRESSED.
EncryptionConfiguration (dict) --
The encryption configuration. If no value is specified, the default is no encryption.
NoEncryptionConfig (string) --
Specifically override existing encryption information to ensure that no encryption is used.
KMSEncryptionConfig (dict) --
The encryption key.
AWSKMSKeyARN (string) --
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
ProcessingConfiguration (dict) --
The data processing configuration.
Enabled (boolean) --
Enables or disables data processing.
Processors (list) --
The data processors.
(dict) --
Describes a data processor.
Type (string) --
The type of processor.
Parameters (list) --
The processor parameters.
(dict) --
Describes the processor parameter.
ParameterName (string) --
The name of the parameter.
ParameterValue (string) --
The parameter value.
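The nested Processors and Parameters lists can be walked with two loops. The sketch below collects Lambda function ARNs from a ProcessingConfiguration dict; lambda_arns is a hypothetical helper and the ARN in the sample is a placeholder.

```python
def lambda_arns(processing_configuration):
    """Collect LambdaArn parameter values from enabled Lambda processors."""
    if not processing_configuration.get("Enabled"):
        return []
    arns = []
    for processor in processing_configuration.get("Processors", []):
        if processor.get("Type") != "Lambda":
            continue
        for parameter in processor.get("Parameters", []):
            if parameter.get("ParameterName") == "LambdaArn":
                arns.append(parameter["ParameterValue"])
    return arns


# Sample shaped like the structure described above; the ARN is a placeholder.
sample = {
    "Enabled": True,
    "Processors": [
        {
            "Type": "Lambda",
            "Parameters": [
                {
                    "ParameterName": "LambdaArn",
                    "ParameterValue": "arn:aws:lambda:us-east-1:123456789012:function:example",
                },
                {"ParameterName": "NumberOfRetries", "ParameterValue": "3"},
            ],
        }
    ],
}

arns = lambda_arns(sample)
```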
S3BackupMode (string) --
The Amazon S3 backup mode.
S3BackupDescription (dict) --
The configuration for backup in Amazon S3.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
BucketARN (string) --
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Prefix (string) --
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
ErrorOutputPrefix (string) --
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
BufferingHints (dict) --
The buffering option. If no value is specified, BufferingHints object default values are used.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
CompressionFormat (string) --
The compression format. If no value is specified, the default is UNCOMPRESSED.
EncryptionConfiguration (dict) --
The encryption configuration. If no value is specified, the default is no encryption.
NoEncryptionConfig (string) --
Specifically override existing encryption information to ensure that no encryption is used.
KMSEncryptionConfig (dict) --
The encryption key.
AWSKMSKeyARN (string) --
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
ElasticsearchDestinationDescription (dict) --
The destination in Amazon ES.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
DomainARN (string) --
The ARN of the Amazon ES domain. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Kinesis Data Firehose uses either ClusterEndpoint or DomainARN to send data to Amazon ES.
ClusterEndpoint (string) --
The endpoint to use when communicating with the cluster. Kinesis Data Firehose uses either this ClusterEndpoint or the DomainARN field to send data to Amazon ES.
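Since a destination description carries either ClusterEndpoint or DomainARN, code that reads it may want to handle both. A minimal sketch follows, assuming a hypothetical helper es_target; checking ClusterEndpoint first is an arbitrary choice, not documented behavior.

```python
def es_target(description):
    """Return (field_name, value) for whichever ES target field is set."""
    for key in ("ClusterEndpoint", "DomainARN"):
        if description.get(key):
            return key, description[key]
    raise ValueError("neither ClusterEndpoint nor DomainARN is set")


# Placeholder ARN for illustration.
target = es_target({"DomainARN": "arn:aws:es:us-east-1:123456789012:domain/example"})
```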
IndexName (string) --
The Elasticsearch index name.
TypeName (string) --
The Elasticsearch type name. This applies to Elasticsearch 6.x and lower versions. For Elasticsearch 7.x, there's no value for TypeName.
IndexRotationPeriod (string) --
The Elasticsearch index rotation period.
BufferingHints (dict) --
The buffering options.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300 (5 minutes).
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MBs, before delivering it to the destination. The default value is 5.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MB/sec, the value should be 10 MB or higher.
RetryOptions (dict) --
The Amazon ES retry options.
DurationInSeconds (integer) --
After an initial failure to deliver to Amazon ES, the total amount of time during which Kinesis Data Firehose retries delivery (including the first attempt). After this time has elapsed, the failed documents are written to Amazon S3. Default value is 300 seconds (5 minutes). A value of 0 (zero) results in no retries.
S3BackupMode (string) --
The Amazon S3 backup mode.
S3DestinationDescription (dict) --
The Amazon S3 destination.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
BucketARN (string) --
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Prefix (string) --
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
ErrorOutputPrefix (string) --
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
BufferingHints (dict) --
The buffering option. If no value is specified, BufferingHints object default values are used.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
CompressionFormat (string) --
The compression format. If no value is specified, the default is UNCOMPRESSED.
EncryptionConfiguration (dict) --
The encryption configuration. If no value is specified, the default is no encryption.
NoEncryptionConfig (string) --
Specifically override existing encryption information to ensure that no encryption is used.
KMSEncryptionConfig (dict) --
The encryption key.
AWSKMSKeyARN (string) --
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
ProcessingConfiguration (dict) --
The data processing configuration.
Enabled (boolean) --
Enables or disables data processing.
Processors (list) --
The data processors.
(dict) --
Describes a data processor.
Type (string) --
The type of processor.
Parameters (list) --
The processor parameters.
(dict) --
Describes the processor parameter.
ParameterName (string) --
The name of the parameter.
ParameterValue (string) --
The parameter value.
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
VpcConfigurationDescription (dict) --
The details of the VPC of the Amazon ES destination.
SubnetIds (list) --
The IDs of the subnets that Kinesis Data Firehose uses to create ENIs in the VPC of the Amazon ES destination. Make sure that the routing tables and inbound and outbound rules allow traffic to flow from the subnets whose IDs are specified here to the subnets that have the destination Amazon ES endpoints. Kinesis Data Firehose creates at least one ENI in each of the subnets that are specified here. Do not delete or modify these ENIs.
The number of ENIs that Kinesis Data Firehose creates in the subnets specified here scales up and down automatically based on throughput. To enable Kinesis Data Firehose to scale up the number of ENIs to match throughput, ensure that you have sufficient quota. To help you calculate the quota you need, assume that Kinesis Data Firehose can create up to three ENIs for this delivery stream for each of the subnets specified here. For more information about ENI quota, see Network Interfaces in the Amazon VPC Quotas topic.
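The quota estimate described above is a simple multiplication: up to three ENIs per subnet for this delivery stream. A sketch with a hypothetical helper name and placeholder subnet IDs:

```python
def eni_quota_estimate(subnet_ids, enis_per_subnet=3):
    """Upper-bound ENI count for one delivery stream, per the guidance above."""
    # Duplicate subnet IDs are counted once.
    return len(set(subnet_ids)) * enis_per_subnet


needed = eni_quota_estimate(["subnet-aaaa1111", "subnet-bbbb2222"])
```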
RoleARN (string) --
The ARN of the IAM role that the delivery stream uses to create endpoints in the destination VPC. You can use your existing Kinesis Data Firehose delivery role or you can specify a new role. In either case, make sure that the role trusts the Kinesis Data Firehose service principal and that it grants the following permissions:
ec2:DescribeVpcs
ec2:DescribeVpcAttribute
ec2:DescribeSubnets
ec2:DescribeSecurityGroups
ec2:DescribeNetworkInterfaces
ec2:CreateNetworkInterface
ec2:CreateNetworkInterfacePermission
ec2:DeleteNetworkInterface
If you revoke these permissions after you create the delivery stream, Kinesis Data Firehose can't scale out by creating more ENIs when necessary. You might therefore see a degradation in performance.
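The permissions listed above can be gathered into an IAM policy document. The sketch below builds one as a Python dict and serializes it to JSON. The use of "Resource": "*" is an assumption made for brevity; narrow it to suit your account, and note that the role's trust policy for the Kinesis Data Firehose service principal is not shown.

```python
import json

VPC_ACTIONS = [
    "ec2:DescribeVpcs",
    "ec2:DescribeVpcAttribute",
    "ec2:DescribeSubnets",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeNetworkInterfaces",
    "ec2:CreateNetworkInterface",
    "ec2:CreateNetworkInterfacePermission",
    "ec2:DeleteNetworkInterface",
]

policy = {
    "Version": "2012-10-17",
    "Statement": [
        # "Resource": "*" is an illustrative assumption, not a recommendation.
        {"Effect": "Allow", "Action": VPC_ACTIONS, "Resource": "*"},
    ],
}

policy_json = json.dumps(policy, indent=2)
```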
SecurityGroupIds (list) --
The IDs of the security groups that Kinesis Data Firehose uses when it creates ENIs in the VPC of the Amazon ES destination. You can use the same security group that the Amazon ES domain uses or different ones. If you specify different security groups, ensure that they allow outbound HTTPS traffic to the Amazon ES domain's security group. Also ensure that the Amazon ES domain's security group allows HTTPS traffic from the security groups specified here. If you use the same security group for both your delivery stream and the Amazon ES domain, make sure the security group inbound rule allows HTTPS traffic. For more information about security group rules, see Security group rules in the Amazon VPC documentation.
VpcId (string) --
The ID of the Amazon ES destination's VPC.
AmazonopensearchserviceDestinationDescription (dict) --
RoleARN (string) --
DomainARN (string) --
ClusterEndpoint (string) --
IndexName (string) --
TypeName (string) --
IndexRotationPeriod (string) --
BufferingHints (dict) --
RetryOptions (dict) --
S3BackupMode (string) --
S3DestinationDescription (dict) --
Describes a destination in Amazon S3.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
BucketARN (string) --
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Prefix (string) --
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
ErrorOutputPrefix (string) --
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
BufferingHints (dict) --
The buffering option. If no value is specified, BufferingHints object default values are used.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
CompressionFormat (string) --
The compression format. If no value is specified, the default is UNCOMPRESSED.
EncryptionConfiguration (dict) --
The encryption configuration. If no value is specified, the default is no encryption.
NoEncryptionConfig (string) --
Specifically override existing encryption information to ensure that no encryption is used.
KMSEncryptionConfig (dict) --
The encryption key.
AWSKMSKeyARN (string) --
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
ProcessingConfiguration (dict) --
Describes a data processing configuration.
Enabled (boolean) --
Enables or disables data processing.
Processors (list) --
The data processors.
(dict) --
Describes a data processor.
Type (string) --
The type of processor.
Parameters (list) --
The processor parameters.
(dict) --
Describes the processor parameter.
ParameterName (string) --
The name of the parameter.
ParameterValue (string) --
The parameter value.
CloudWatchLoggingOptions (dict) --
Describes the Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
VpcConfigurationDescription (dict) --
The details of the VPC of the Amazon ES destination.
SubnetIds (list) --
The IDs of the subnets that Kinesis Data Firehose uses to create ENIs in the VPC of the Amazon ES destination. Make sure that the routing tables and inbound and outbound rules allow traffic to flow from the subnets whose IDs are specified here to the subnets that have the destination Amazon ES endpoints. Kinesis Data Firehose creates at least one ENI in each of the subnets that are specified here. Do not delete or modify these ENIs.
The number of ENIs that Kinesis Data Firehose creates in the subnets specified here scales up and down automatically based on throughput. To enable Kinesis Data Firehose to scale up the number of ENIs to match throughput, ensure that you have sufficient quota. To help you calculate the quota you need, assume that Kinesis Data Firehose can create up to three ENIs for this delivery stream for each of the subnets specified here. For more information about ENI quota, see Network Interfaces in the Amazon VPC Quotas topic.
RoleARN (string) --
The ARN of the IAM role that the delivery stream uses to create endpoints in the destination VPC. You can use your existing Kinesis Data Firehose delivery role or you can specify a new role. In either case, make sure that the role trusts the Kinesis Data Firehose service principal and that it grants the following permissions:
ec2:DescribeVpcs
ec2:DescribeVpcAttribute
ec2:DescribeSubnets
ec2:DescribeSecurityGroups
ec2:DescribeNetworkInterfaces
ec2:CreateNetworkInterface
ec2:CreateNetworkInterfacePermission
ec2:DeleteNetworkInterface
If you revoke these permissions after you create the delivery stream, Kinesis Data Firehose can't scale out by creating more ENIs when necessary. You might therefore see a degradation in performance.
SecurityGroupIds (list) --
The IDs of the security groups that Kinesis Data Firehose uses when it creates ENIs in the VPC of the Amazon ES destination. You can use the same security group that the Amazon ES domain uses or different ones. If you specify different security groups, ensure that they allow outbound HTTPS traffic to the Amazon ES domain's security group. Also ensure that the Amazon ES domain's security group allows HTTPS traffic from the security groups specified here. If you use the same security group for both your delivery stream and the Amazon ES domain, make sure the security group inbound rule allows HTTPS traffic. For more information about security group rules, see Security group rules in the Amazon VPC documentation.
VpcId (string) --
The ID of the Amazon ES destination's VPC.
SplunkDestinationDescription (dict) --
The destination in Splunk.
HECEndpoint (string) --
The HTTP Event Collector (HEC) endpoint to which Kinesis Data Firehose sends your data.
HECEndpointType (string) --
This type can be either "Raw" or "Event".
HECToken (string) --
A GUID you obtain from your Splunk cluster when you create a new HEC endpoint.
HECAcknowledgmentTimeoutInSeconds (integer) --
The amount of time that Kinesis Data Firehose waits to receive an acknowledgment from Splunk after it sends it data. At the end of the timeout period, Kinesis Data Firehose either tries to send the data again or considers it an error, based on your retry settings.
RetryOptions (dict) --
The retry behavior in case Kinesis Data Firehose is unable to deliver data to Splunk or if it doesn't receive an acknowledgment of receipt from Splunk.
DurationInSeconds (integer) --
The total amount of time that Kinesis Data Firehose spends on retries. This duration starts after the initial attempt to send data to Splunk fails. It doesn't include the periods during which Kinesis Data Firehose waits for acknowledgment from Splunk after each attempt.
S3BackupMode (string) --
Defines how documents should be delivered to Amazon S3. When set to FailedDocumentsOnly
, Kinesis Data Firehose writes any data that could not be indexed to the configured Amazon S3 destination. When set to AllDocuments
, Kinesis Data Firehose delivers all incoming records to Amazon S3, and also writes failed documents to Amazon S3. Default value is FailedDocumentsOnly
.
S3DestinationDescription (dict) --
The Amazon S3 destination.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
BucketARN (string) --
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Prefix (string) --
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
ErrorOutputPrefix (string) --
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
BufferingHints (dict) --
The buffering option. If no value is specified, BufferingHints
object default values are used.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds
, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs
, and vice versa.
CompressionFormat (string) --
The compression format. If no value is specified, the default is UNCOMPRESSED
.
EncryptionConfiguration (dict) --
The encryption configuration. If no value is specified, the default is no encryption.
NoEncryptionConfig (string) --
Specifically override existing encryption information to ensure that no encryption is used.
KMSEncryptionConfig (dict) --
The encryption key.
AWSKMSKeyARN (string) --
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
ProcessingConfiguration (dict) --
The data processing configuration.
Enabled (boolean) --
Enables or disables data processing.
Processors (list) --
The data processors.
(dict) --
Describes a data processor.
Type (string) --
The type of processor.
Parameters (list) --
The processor parameters.
(dict) --
Describes the processor parameter.
ParameterName (string) --
The name of the parameter.
ParameterValue (string) --
The parameter value.
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
HttpEndpointDestinationDescription (dict) --
Describes the specified HTTP endpoint destination.
EndpointConfiguration (dict) --
The configuration of the specified HTTP endpoint destination.
Url (string) --
The URL of the HTTP endpoint selected as the destination.
Name (string) --
The name of the HTTP endpoint selected as the destination.
BufferingHints (dict) --
Describes buffering options that can be applied to the data before it is delivered to the HTTPS endpoint destination. Kinesis Data Firehose treats these options as hints, and it might choose to use more optimal values. The SizeInMBs
and IntervalInSeconds
parameters are optional. However, if you specify a value for one of them, you must also provide a value for the other.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MBs, before delivering it to the destination. The default value is 5.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MB/sec, the value should be 10 MB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300 (5 minutes).
CloudWatchLoggingOptions (dict) --
Describes the Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
RequestConfiguration (dict) --
The configuration of the request sent to the HTTP endpoint specified as the destination.
ContentEncoding (string) --
Kinesis Data Firehose uses the content encoding to compress the body of a request before sending the request to the destination. For more information, see Content-Encoding in MDN Web Docs, the official Mozilla documentation.
CommonAttributes (list) --
Describes the metadata sent to the HTTP endpoint destination.
(dict) --
Describes the metadata that's delivered to the specified HTTP endpoint destination.
AttributeName (string) --
The name of the HTTP endpoint common attribute.
AttributeValue (string) --
The value of the HTTP endpoint common attribute.
ProcessingConfiguration (dict) --
Describes a data processing configuration.
Enabled (boolean) --
Enables or disables data processing.
Processors (list) --
The data processors.
(dict) --
Describes a data processor.
Type (string) --
The type of processor.
Parameters (list) --
The processor parameters.
(dict) --
Describes the processor parameter.
ParameterName (string) --
The name of the parameter.
ParameterValue (string) --
The parameter value.
RoleARN (string) --
Kinesis Data Firehose uses this IAM role for all the permissions that the delivery stream needs.
RetryOptions (dict) --
Describes the retry behavior in case Kinesis Data Firehose is unable to deliver data to the specified HTTP endpoint destination, or if it doesn't receive a valid acknowledgment of receipt from the specified HTTP endpoint destination.
DurationInSeconds (integer) --
The total amount of time that Kinesis Data Firehose spends on retries. This duration starts after the initial attempt to send data to the custom destination via HTTPS endpoint fails. It doesn't include the periods during which Kinesis Data Firehose waits for acknowledgment from the specified destination after each attempt.
S3BackupMode (string) --
Describes the S3 bucket backup options for the data that Kinesis Data Firehose delivers to the HTTP endpoint destination. You can back up all documents (AllData
) or only the documents that Kinesis Data Firehose could not deliver to the specified HTTP endpoint destination (FailedDataOnly
).
S3DestinationDescription (dict) --
Describes a destination in Amazon S3.
RoleARN (string) --
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
BucketARN (string) --
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
Prefix (string) --
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
ErrorOutputPrefix (string) --
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
BufferingHints (dict) --
The buffering option. If no value is specified, BufferingHints
object default values are used.
SizeInMBs (integer) --
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds
, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
IntervalInSeconds (integer) --
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs
, and vice versa.
CompressionFormat (string) --
The compression format. If no value is specified, the default is UNCOMPRESSED
.
EncryptionConfiguration (dict) --
The encryption configuration. If no value is specified, the default is no encryption.
NoEncryptionConfig (string) --
Specifically override existing encryption information to ensure that no encryption is used.
KMSEncryptionConfig (dict) --
The encryption key.
AWSKMSKeyARN (string) --
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
CloudWatchLoggingOptions (dict) --
The Amazon CloudWatch logging options for your delivery stream.
Enabled (boolean) --
Enables or disables CloudWatch logging.
LogGroupName (string) --
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
LogStreamName (string) --
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
HasMoreDestinations (boolean) --
Indicates whether there are more destinations available to list.
Exceptions
Firehose.Client.exceptions.ResourceNotFoundException
get_paginator
(operation_name)¶Create a paginator for an operation.
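operation_name (string) -- The operation name. This is the same name as the method name on the client. For example, if the method name is create_foo
, and you'd normally invoke the
operation as client.create_foo(**kwargs)
, if the
create_foo
operation can be paginated, you can use the
call client.get_paginator("create_foo")
.
Raises OperationNotPageableError if the operation is not pageable. You can use the client.can_paginate
method to
check if an operation is pageable.
get_waiter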
(waiter_name)¶Returns an object that can wait for some condition.
list_delivery_streams
(**kwargs)¶Lists your delivery streams in alphabetical order of their names.
The number of delivery streams might be too large to return using a single call to ListDeliveryStreams
. You can limit the number of delivery streams returned, using the Limit
parameter. To determine whether there are more delivery streams to list, check the value of HasMoreDeliveryStreams
in the output. If there are more delivery streams to list, you can request them by calling this operation again and setting the ExclusiveStartDeliveryStreamName
parameter to the name of the last delivery stream returned in the last call.
See also: AWS API Documentation
Request Syntax
response = client.list_delivery_streams(
Limit=123,
DeliveryStreamType='DirectPut'|'KinesisStreamAsSource',
ExclusiveStartDeliveryStreamName='string'
)
Limit (integer) --
The maximum number of delivery streams to list.
DeliveryStreamType (string) --
The delivery stream type. This can be one of the following values:
DirectPut
: Provider applications access the delivery stream directly.
KinesisStreamAsSource
: The delivery stream uses a Kinesis data stream as a source.
This parameter is optional. If this parameter is omitted, delivery streams of all types are returned.
ExclusiveStartDeliveryStreamName (string) --
The list of delivery streams returned by this call to ListDeliveryStreams
starts with the delivery stream whose name comes alphabetically immediately after the name you specify in ExclusiveStartDeliveryStreamName
.
dict
Response Syntax
{
'DeliveryStreamNames': [
'string',
],
'HasMoreDeliveryStreams': True|False
}
Response Structure
(dict) --
DeliveryStreamNames (list) --
The names of the delivery streams.
HasMoreDeliveryStreams (boolean) --
Indicates whether there are more delivery streams available to list.
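The paging behavior described for list_delivery_streams can be sketched as a small generator. This is an illustration under assumptions, not part of the client API: the helper name iter_delivery_stream_names and its page_size and stream_type arguments are hypothetical, and the client is passed in so the loop can be exercised without AWS access.

```python
from typing import Iterator, Optional


def iter_delivery_stream_names(client, page_size: int = 10,
                               stream_type: Optional[str] = None) -> Iterator[str]:
    """Yield every delivery stream name, following HasMoreDeliveryStreams.

    `client` is expected to behave like boto3.client('firehose').
    """
    kwargs = {"Limit": page_size}
    if stream_type is not None:
        kwargs["DeliveryStreamType"] = stream_type
    while True:
        response = client.list_delivery_streams(**kwargs)
        names = response["DeliveryStreamNames"]
        yield from names
        # Stop when the service reports no further pages. The empty-page
        # check guards against looping forever on an unexpected response.
        if not response.get("HasMoreDeliveryStreams") or not names:
            break
        # Resume after the last name returned by the previous call.
        kwargs["ExclusiveStartDeliveryStreamName"] = names[-1]


# Usage (needs the boto3 package and AWS credentials):
#   import boto3
#   client = boto3.client("firehose")
#   for name in iter_delivery_stream_names(client, page_size=25):
#       print(name)
```

A generator keeps memory use flat and lets the caller stop early without fetching the remaining pages.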
list_tags_for_delivery_stream
(**kwargs)¶Lists the tags for the specified delivery stream. This operation has a limit of five transactions per second per account.
See also: AWS API Documentation
Request Syntax
response = client.list_tags_for_delivery_stream(
DeliveryStreamName='string',
ExclusiveStartTagKey='string',
Limit=123
)
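DeliveryStreamName (string) --
[REQUIRED]
The name of the delivery stream whose tags you want to list.
ExclusiveStartTagKey (string) --
The key to use as the starting point for the list of tags. If you set this parameter, ListTagsForDeliveryStream
gets all tags that occur after ExclusiveStartTagKey
.
Limit (integer) --
The number of tags to return. If this number is less than the total number of tags associated with the delivery stream, HasMoreTags
is set to true
in the response. To list additional tags, set ExclusiveStartTagKey
to the last key in the response.
dict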
Response Syntax
{
'Tags': [
{
'Key': 'string',
'Value': 'string'
},
],
'HasMoreTags': True|False
}
Response Structure
(dict) --
Tags (list) --
A list of tags associated with DeliveryStreamName
, starting with the first tag after ExclusiveStartTagKey
and up to the specified Limit
.
(dict) --
Metadata that you can assign to a delivery stream, consisting of a key-value pair.
Key (string) --
A unique identifier for the tag. Maximum length: 128 characters. Valid characters: Unicode letters, digits, white space, _ . / = + - % @
Value (string) --
An optional string, which you can use to describe or define the tag. Maximum length: 256 characters. Valid characters: Unicode letters, digits, white space, _ . / = + - % @
HasMoreTags (boolean) --
If this is true
in the response, more tags are available. To list the remaining tags, set ExclusiveStartTagKey
to the key of the last tag returned and call ListTagsForDeliveryStream
again.
Exceptions
Firehose.Client.exceptions.ResourceNotFoundException
Firehose.Client.exceptions.InvalidArgumentException
Firehose.Client.exceptions.LimitExceededException
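As a rough sketch of the HasMoreTags loop described above, the following helper collects all tags of a delivery stream into a dictionary. The function name get_all_tags and the page_size default are assumptions made for illustration; the client is any object that behaves like boto3.client('firehose').

```python
from typing import Dict


def get_all_tags(client, delivery_stream_name: str,
                 page_size: int = 10) -> Dict[str, str]:
    """Collect every tag on a delivery stream into a plain dict."""
    tags: Dict[str, str] = {}
    kwargs = {"DeliveryStreamName": delivery_stream_name, "Limit": page_size}
    while True:
        response = client.list_tags_for_delivery_stream(**kwargs)
        page = response["Tags"]
        for tag in page:
            # Value is optional, so fall back to an empty string.
            tags[tag["Key"]] = tag.get("Value", "")
        if not response.get("HasMoreTags") or not page:
            break
        # Continue with the tags that come after the last key returned.
        kwargs["ExclusiveStartTagKey"] = page[-1]["Key"]
    return tags
```

Because this operation is limited to five transactions per second per account, a caller that lists tags for many streams may want to space out the calls.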
put_record
(**kwargs)¶Writes a single data record into an Amazon Kinesis Data Firehose delivery stream. To write multiple data records into a delivery stream, use PutRecordBatch . Applications using these operations are referred to as producers.
By default, each delivery stream can take in up to 2,000 transactions per second, 5,000 records per second, or 5 MB per second. If you use PutRecord and PutRecordBatch , the limits are an aggregate across these two operations for each delivery stream. For more information about limits and how to request an increase, see Amazon Kinesis Data Firehose Limits .
You must specify the name of the delivery stream and the data record when using PutRecord . The data record consists of a data blob that can be up to 1,000 KiB in size and can contain any kind of data. For example, it can be a segment from a log file, geographic location data, website clickstream data, and so on.
Kinesis Data Firehose buffers records before delivering them to the destination. To disambiguate the data blobs at the destination, a common solution is to use delimiters in the data, such as a newline (\n
) or some other character unique within the data. This allows the consumer application to parse individual data items when reading the data from the destination.
The PutRecord
operation returns a RecordId
, which is a unique string assigned to each record. Producer applications can use this ID for purposes such as auditability and investigation.
If the PutRecord
operation throws a ServiceUnavailableException
, back off and retry. If the exception persists, it is possible that the throughput limits have been exceeded for the delivery stream.
Data records sent to Kinesis Data Firehose are stored for 24 hours from the time they are added to a delivery stream as it tries to send the records to the destination. If the destination is unreachable for more than 24 hours, the data is no longer available.
Warning
Don't concatenate two or more base64 strings to form the data fields of your records. Instead, concatenate the raw data, then perform base64 encoding.
See also: AWS API Documentation
Request Syntax
response = client.put_record(
DeliveryStreamName='string',
Record={
'Data': b'bytes'
}
)
DeliveryStreamName (string) --
[REQUIRED]
The name of the delivery stream.
Record (dict) --
[REQUIRED]
The record.
Data (bytes) --
The data blob, which is base64-encoded when the blob is serialized. The maximum size of the data blob, before base64-encoding, is 1,000 KiB.
dict
Response Syntax
{
'RecordId': 'string',
'Encrypted': True|False
}
Response Structure
(dict) --
RecordId (string) --
The ID of the record.
Encrypted (boolean) --
Indicates whether server-side encryption (SSE) was enabled during this operation.
Exceptions
Firehose.Client.exceptions.ResourceNotFoundException
Firehose.Client.exceptions.InvalidArgumentException
Firehose.Client.exceptions.InvalidKMSResourceException
Firehose.Client.exceptions.ServiceUnavailableException
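The put_record guidance above (newline delimiters, raw bytes rather than pre-encoded base64, back off and retry on ServiceUnavailableException) might be combined as in the sketch below. The helper name put_json_record, the JSON payload format, and the retry count and delays are all assumptions chosen for illustration, not recommended values.

```python
import json
import random
import time


def put_json_record(client, stream_name: str, payload: dict,
                    max_attempts: int = 5, base_delay: float = 0.2,
                    sleep=time.sleep) -> str:
    """Send one newline-delimited JSON record and return its RecordId."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Pass raw bytes. The blob is base64-encoded during serialization,
    # so the caller should not encode it again.
    data = (json.dumps(payload) + "\n").encode("utf-8")
    for attempt in range(max_attempts):
        try:
            response = client.put_record(
                DeliveryStreamName=stream_name,
                Record={"Data": data},
            )
            return response["RecordId"]
        except client.exceptions.ServiceUnavailableException:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff with jitter (illustrative delays).
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    raise RuntimeError("unreachable")
```

The sleep argument is injectable only so that the retry path can be tested without waiting.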
put_record_batch
(**kwargs)¶Writes multiple data records into a delivery stream in a single call, which can achieve higher throughput per producer than when writing single records. To write single data records into a delivery stream, use PutRecord . Applications using these operations are referred to as producers.
For information about service quota, see Amazon Kinesis Data Firehose Quota .
Each PutRecordBatch request supports up to 500 records. Each record in the request can be as large as 1,000 KB (before base64 encoding), up to a limit of 4 MB for the entire request. These limits cannot be changed.
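One way to stay within these per-request limits is to split the input on the client side before calling put_record_batch. The sketch below is an assumption-laden illustration: chunk_records is a hypothetical helper, and the byte constants read "1,000 KB" and "4 MB" as binary units, which may not match how the service counts.

```python
from typing import Iterable, Iterator, List

MAX_RECORDS_PER_BATCH = 500
MAX_RECORD_BYTES = 1000 * 1024      # assumption: "1,000 KB" read as KiB
MAX_BATCH_BYTES = 4 * 1024 * 1024   # assumption: "4 MB" read as MiB


def chunk_records(blobs: Iterable[bytes]) -> Iterator[List[dict]]:
    """Group raw byte blobs into lists shaped like the Records argument."""
    batch: List[dict] = []
    batch_bytes = 0
    for blob in blobs:
        if len(blob) > MAX_RECORD_BYTES:
            raise ValueError("record exceeds the per-record size limit")
        # Start a new batch when adding this blob would break either limit.
        if batch and (len(batch) == MAX_RECORDS_PER_BATCH
                      or batch_bytes + len(blob) > MAX_BATCH_BYTES):
            yield batch
            batch, batch_bytes = [], 0
        batch.append({"Data": blob})
        batch_bytes += len(blob)
    if batch:
        yield batch
```

Each yielded list can be passed directly as Records=... in a put_record_batch call.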
You must specify the name of the delivery stream and the data records when using PutRecordBatch . Each data record consists of a data blob that can be up to 1,000 KB in size and can contain any kind of data. For example, it could be a segment from a log file, geographic location data, website clickstream data, and so on.
Kinesis Data Firehose buffers records before delivering them to the destination. To disambiguate the data blobs at the destination, a common solution is to use delimiters in the data, such as a newline (\n
) or some other character unique within the data. This allows the consumer application to parse individual data items when reading the data from the destination.
The PutRecordBatch response includes a count of failed records, FailedPutCount
, and an array of responses, RequestResponses
. Even if the PutRecordBatch call succeeds, the value of FailedPutCount
may be greater than 0, indicating that there are records for which the operation didn't succeed. Each entry in the RequestResponses
array provides additional information about the processed record. It directly correlates with a record in the request array using the same ordering, from the top to the bottom. The response array always includes the same number of records as the request array. RequestResponses
includes both successfully and unsuccessfully processed records. Kinesis Data Firehose tries to process all records in each PutRecordBatch request. A single record failure does not stop the processing of subsequent records.
A successfully processed record includes a RecordId
value, which is unique for the record. An unsuccessfully processed record includes ErrorCode
and ErrorMessage
values. ErrorCode
reflects the type of error, and is one of the following values: ServiceUnavailableException
or InternalFailure
. ErrorMessage
provides more detailed information about the error.
If there is an internal server error or a timeout, the write might have completed or it might have failed. If FailedPutCount
is greater than 0, retry the request, resending only those records that might have failed processing. This minimizes the possible duplicate records and also reduces the total bytes sent (and corresponding charges). We recommend that you handle any duplicates at the destination.
If PutRecordBatch throws ServiceUnavailableException
, back off and retry. If the exception persists, it is possible that the throughput limits have been exceeded for the delivery stream.
Data records sent to Kinesis Data Firehose are stored for 24 hours from the time they are added to a delivery stream as it attempts to send the records to the destination. If the destination is unreachable for more than 24 hours, the data is no longer available.
Warning
Don't concatenate two or more base64 strings to form the data fields of your records. Instead, concatenate the raw data, then perform base64 encoding.
See also: AWS API Documentation
Request Syntax
response = client.put_record_batch(
DeliveryStreamName='string',
Records=[
{
'Data': b'bytes'
},
]
)
DeliveryStreamName (string) --
[REQUIRED]
The name of the delivery stream.
Records (list) --
[REQUIRED]
One or more records.
(dict) --
The unit of data in a delivery stream.
Data (bytes) --
The data blob, which is base64-encoded when the blob is serialized. The maximum size of the data blob, before base64-encoding, is 1,000 KiB.
dict
Response Syntax
{
'FailedPutCount': 123,
'Encrypted': True|False,
'RequestResponses': [
{
'RecordId': 'string',
'ErrorCode': 'string',
'ErrorMessage': 'string'
},
]
}
Response Structure
(dict) --
FailedPutCount (integer) --
The number of records that might have failed processing. This number might be greater than 0 even if the PutRecordBatch call succeeds. Check FailedPutCount
to determine whether there are records that you need to resend.
Encrypted (boolean) --
Indicates whether server-side encryption (SSE) was enabled during this operation.
RequestResponses (list) --
The results array. For each record, the index of the response element is the same as the index used in the request array.
(dict) --
Contains the result for an individual record from a PutRecordBatch request. If the record is successfully added to your delivery stream, it receives a record ID. If the record fails to be added to your delivery stream, the result includes an error code and an error message.
RecordId (string) --
The ID of the record.
ErrorCode (string) --
The error code for an individual record result.
ErrorMessage (string) --
The error message for an individual record result.
Exceptions
Firehose.Client.exceptions.ResourceNotFoundException
Firehose.Client.exceptions.InvalidArgumentException
Firehose.Client.exceptions.InvalidKMSResourceException
Firehose.Client.exceptions.ServiceUnavailableException
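The partial-failure handling described for put_record_batch (check FailedPutCount, then resend only the records whose entry in RequestResponses carries an ErrorCode) can be sketched as follows. The helper name put_batch_with_retry, the attempt count, and the delays are assumptions for illustration. Exceptions raised by the call itself, such as ServiceUnavailableException, are deliberately left to the caller.

```python
import time
from typing import List


def put_batch_with_retry(client, stream_name: str, records: List[dict],
                         max_attempts: int = 5, base_delay: float = 0.2,
                         sleep=time.sleep) -> List[dict]:
    """Send records, resending only failed ones.

    Returns the records that still failed after the last attempt.
    """
    pending = list(records)
    for attempt in range(max_attempts):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name, Records=pending)
        if response["FailedPutCount"] == 0:
            return []
        # RequestResponses is ordered like the request array, so zip
        # pairs each record with its own result.
        pending = [
            record
            for record, result in zip(pending, response["RequestResponses"])
            if result.get("ErrorCode")
        ]
        if not pending:
            return []
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return pending
```

Because a timed-out write might still have succeeded, this loop can produce duplicates, which is why handling duplicates at the destination is still advisable.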
start_delivery_stream_encryption
(**kwargs)¶Enables server-side encryption (SSE) for the delivery stream.
This operation is asynchronous. It returns immediately. When you invoke it, Kinesis Data Firehose first sets the encryption status of the stream to ENABLING
, and then to ENABLED
. The encryption status of a delivery stream is the Status
property in DeliveryStreamEncryptionConfiguration . If the operation fails, the encryption status changes to ENABLING_FAILED
. You can continue to read and write data to your delivery stream while the encryption status is ENABLING
, but the data is not encrypted. It can take up to 5 seconds after the encryption status changes to ENABLED
before all records written to the delivery stream are encrypted. To find out whether a record or a batch of records was encrypted, check the response elements PutRecordOutput$Encrypted and PutRecordBatchOutput$Encrypted , respectively.
To check the encryption status of a delivery stream, use DescribeDeliveryStream .
Even if encryption is currently enabled for a delivery stream, you can still invoke this operation on it to change the ARN of the CMK or both its type and ARN. If you invoke this method to change the CMK, and the old CMK is of type CUSTOMER_MANAGED_CMK
, Kinesis Data Firehose schedules the grant it had on the old CMK for retirement. If the new CMK is of type CUSTOMER_MANAGED_CMK
, Kinesis Data Firehose creates a grant that enables it to use the new CMK to encrypt and decrypt data and to manage the grant.
If a delivery stream already has encryption enabled and then you invoke this operation to change the ARN of the CMK or both its type and ARN and you get ENABLING_FAILED
, this only means that the attempt to change the CMK failed. In this case, encryption remains enabled with the old CMK.
If the encryption status of your delivery stream is ENABLING_FAILED
, you can invoke this operation again with a valid CMK. The CMK must be enabled and the key policy mustn't explicitly deny the permission for Kinesis Data Firehose to invoke KMS encrypt and decrypt operations.
You can enable SSE for a delivery stream only if it's a delivery stream that uses DirectPut
as its source.
The StartDeliveryStreamEncryption
and StopDeliveryStreamEncryption
operations have a combined limit of 25 calls per delivery stream per 24 hours. For example, you reach the limit if you call StartDeliveryStreamEncryption
13 times and StopDeliveryStreamEncryption
12 times for the same delivery stream in a 24-hour period.
See also: AWS API Documentation
Request Syntax
response = client.start_delivery_stream_encryption(
DeliveryStreamName='string',
DeliveryStreamEncryptionConfigurationInput={
'KeyARN': 'string',
'KeyType': 'AWS_OWNED_CMK'|'CUSTOMER_MANAGED_CMK'
}
)
DeliveryStreamName (string) --
[REQUIRED]
The name of the delivery stream for which you want to enable server-side encryption (SSE).
DeliveryStreamEncryptionConfigurationInput (dict) --
Used to specify the type and Amazon Resource Name (ARN) of the KMS key needed for Server-Side Encryption (SSE).
KeyARN (string) --
If you set KeyType
to CUSTOMER_MANAGED_CMK
, you must specify the Amazon Resource Name (ARN) of the CMK. If you set KeyType
to AWS_OWNED_CMK
, Kinesis Data Firehose uses a service-account CMK.
KeyType (string) --
Indicates the type of customer master key (CMK) to use for encryption. The default setting is AWS_OWNED_CMK
. For more information about CMKs, see Customer Master Keys (CMKs) . When you invoke CreateDeliveryStream or StartDeliveryStreamEncryption with KeyType
set to CUSTOMER_MANAGED_CMK, Kinesis Data Firehose invokes the Amazon KMS operation CreateGrant to create a grant that allows the Kinesis Data Firehose service to use the customer managed CMK to perform encryption and decryption. Kinesis Data Firehose manages that grant.
When you invoke StartDeliveryStreamEncryption to change the CMK for a delivery stream that is encrypted with a customer managed CMK, Kinesis Data Firehose schedules the grant it had on the old CMK for retirement.
You can use a CMK of type CUSTOMER_MANAGED_CMK to encrypt up to 500 delivery streams. If a CreateDeliveryStream or StartDeliveryStreamEncryption operation exceeds this limit, Kinesis Data Firehose throws a LimitExceededException
.
Warning
To encrypt your delivery stream, use symmetric CMKs. Kinesis Data Firehose doesn't support asymmetric CMKs. For information about symmetric and asymmetric CMKs, see About Symmetric and Asymmetric CMKs in the AWS Key Management Service developer guide.
dict
Response Syntax
{}
Response Structure
Exceptions
Firehose.Client.exceptions.ResourceNotFoundException
Firehose.Client.exceptions.ResourceInUseException
Firehose.Client.exceptions.InvalidArgumentException
Firehose.Client.exceptions.LimitExceededException
Firehose.Client.exceptions.InvalidKMSResourceException
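Since start_delivery_stream_encryption returns before the change takes effect, a caller typically polls describe_delivery_stream for the encryption status. The following is a minimal sketch under assumptions: the helper name wait_for_encryption_status, the polling interval, and the poll limit are hypothetical, and the status is read from DeliveryStreamDescription, then DeliveryStreamEncryptionConfiguration, then Status in the describe response.

```python
import time


def wait_for_encryption_status(client, stream_name: str,
                               target: str = "ENABLED",
                               failed: str = "ENABLING_FAILED",
                               poll_seconds: float = 5.0,
                               max_polls: int = 60,
                               sleep=time.sleep) -> str:
    """Poll DescribeDeliveryStream until the encryption status settles."""
    for _ in range(max_polls):
        description = client.describe_delivery_stream(
            DeliveryStreamName=stream_name)["DeliveryStreamDescription"]
        status = description.get(
            "DeliveryStreamEncryptionConfiguration", {}).get("Status")
        if status == target:
            return status
        if status == failed:
            raise RuntimeError(f"encryption change failed: {status}")
        sleep(poll_seconds)
    raise TimeoutError("encryption status did not settle in time")


# Usage (needs the boto3 package and AWS credentials; the stream name
# "my-stream" is a placeholder):
#   import boto3
#   client = boto3.client("firehose")
#   client.start_delivery_stream_encryption(
#       DeliveryStreamName="my-stream",
#       DeliveryStreamEncryptionConfigurationInput={"KeyType": "AWS_OWNED_CMK"},
#   )
#   wait_for_encryption_status(client, "my-stream")
```

The same helper can follow a stop_delivery_stream_encryption call by passing target="DISABLED". Note that when only the CMK of an already encrypted stream is changed, an ENABLING_FAILED result means the old CMK remains in use, so the caller may prefer to handle that case rather than treat it as fatal.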
stop_delivery_stream_encryption
(**kwargs)¶Disables server-side encryption (SSE) for the delivery stream.
This operation is asynchronous. It returns immediately. When you invoke it, Kinesis Data Firehose first sets the encryption status of the stream to DISABLING
, and then to DISABLED
. You can continue to read and write data to your stream while its status is DISABLING
. It can take up to 5 seconds after the encryption status changes to DISABLED
before all records written to the delivery stream are no longer subject to encryption. To find out whether a record or a batch of records was encrypted, check the response elements PutRecordOutput$Encrypted and PutRecordBatchOutput$Encrypted , respectively.
To check the encryption state of a delivery stream, use DescribeDeliveryStream .
If SSE is enabled using a customer managed CMK and then you invoke StopDeliveryStreamEncryption
, Kinesis Data Firehose schedules the related KMS grant for retirement and then retires it after it ensures that it is finished delivering records to the destination.
The StartDeliveryStreamEncryption
and StopDeliveryStreamEncryption
operations have a combined limit of 25 calls per delivery stream per 24 hours. For example, you reach the limit if you call StartDeliveryStreamEncryption
13 times and StopDeliveryStreamEncryption
12 times for the same delivery stream in a 24-hour period.
See also: AWS API Documentation
Request Syntax
response = client.stop_delivery_stream_encryption(
DeliveryStreamName='string'
)
DeliveryStreamName (string) --
[REQUIRED]
The name of the delivery stream for which you want to disable server-side encryption (SSE).
dict
Response Syntax
{}
Response Structure
Exceptions
Firehose.Client.exceptions.ResourceNotFoundException
Firehose.Client.exceptions.ResourceInUseException
Firehose.Client.exceptions.InvalidArgumentException
Firehose.Client.exceptions.LimitExceededException
tag_delivery_stream
(**kwargs)¶Adds or updates tags for the specified delivery stream. A tag is a key-value pair that you can define and assign to AWS resources. If you specify a tag that already exists, the tag value is replaced with the value that you specify in the request. Tags are metadata. For example, you can add friendly names and descriptions or other types of information that can help you distinguish the delivery stream. For more information about tags, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide .
Each delivery stream can have up to 50 tags.
This operation has a limit of five transactions per second per account.
See also: AWS API Documentation
Request Syntax
response = client.tag_delivery_stream(
DeliveryStreamName='string',
Tags=[
{
'Key': 'string',
'Value': 'string'
},
]
)
DeliveryStreamName (string) --
[REQUIRED]
The name of the delivery stream to which you want to add the tags.
Tags (list) --
[REQUIRED]
A set of key-value pairs to use to create the tags.
(dict) --
Metadata that you can assign to a delivery stream, consisting of a key-value pair.
Key (string) --
A unique identifier for the tag. Maximum length: 128 characters. Valid characters: Unicode letters, digits, white space, _ . / = + - % @
Value (string) --
An optional string, which you can use to describe or define the tag. Maximum length: 256 characters. Valid characters: Unicode letters, digits, white space, _ . / = + - % @
dict
Response Syntax
{}
Response Structure
Exceptions
Firehose.Client.exceptions.ResourceNotFoundException
Firehose.Client.exceptions.ResourceInUseException
Firehose.Client.exceptions.InvalidArgumentException
Firehose.Client.exceptions.LimitExceededException
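The limits described above can be checked on the client side before the request is sent. The following is a minimal sketch, not part of the boto3 API: `build_tags` and `tag_stream` are hypothetical helper names, the check covers only the tags in a single request (not tags already on the stream), and character-set validation is left out.

```python
MAX_TAGS = 50          # documented per-stream tag limit
MAX_KEY_LEN = 128      # documented maximum key length
MAX_VALUE_LEN = 256    # documented maximum value length


def build_tags(tags):
    """Convert {'key': 'value'} into the Tags list shape shown in the request syntax."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags are allowed, got {len(tags)}")
    result = []
    for key, value in tags.items():
        if len(key) > MAX_KEY_LEN:
            raise ValueError(f"tag key exceeds {MAX_KEY_LEN} characters: {key!r}")
        if len(value) > MAX_VALUE_LEN:
            raise ValueError(f"tag value exceeds {MAX_VALUE_LEN} characters for key {key!r}")
        result.append({'Key': key, 'Value': value})
    return result


def tag_stream(client, stream_name, tags):
    """Tag a stream; `client` is expected to come from boto3.client('firehose')."""
    return client.tag_delivery_stream(
        DeliveryStreamName=stream_name,
        Tags=build_tags(tags),
    )
```

Because an existing key is overwritten with the new value, calling `tag_stream` again with the same key acts as an update.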
untag_delivery_stream(**kwargs)
Removes tags from the specified delivery stream. Removed tags are deleted, and you can't recover them after this operation successfully completes.
If you specify a tag that doesn't exist, the operation ignores it.
This operation has a limit of five transactions per second per account.
See also: AWS API Documentation
Request Syntax
response = client.untag_delivery_stream(
    DeliveryStreamName='string',
    TagKeys=[
        'string',
    ]
)
DeliveryStreamName (string) -- [REQUIRED] The name of the delivery stream.
TagKeys (list) -- [REQUIRED] A list of tag keys. Each corresponding tag is removed from the delivery stream.
dict
Response Syntax
{}
Response Structure
Exceptions
Firehose.Client.exceptions.ResourceNotFoundException
Firehose.Client.exceptions.ResourceInUseException
Firehose.Client.exceptions.InvalidArgumentException
Firehose.Client.exceptions.LimitExceededException
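Since keys that don't exist are ignored, a caller does not need to list the current tags before removing them. The sketch below illustrates this; `unique_keys` and `untag_stream` are hypothetical helper names, and skipping the call for an empty key list is a choice made here rather than documented behavior.

```python
def unique_keys(tag_keys):
    """Drop duplicate tag keys while preserving their first-seen order."""
    return list(dict.fromkeys(tag_keys))


def untag_stream(client, stream_name, tag_keys):
    """Remove tags from a stream; `client` is expected to come from boto3.client('firehose')."""
    keys = unique_keys(tag_keys)
    if not keys:
        return {}  # nothing to remove, so no request is sent
    return client.untag_delivery_stream(
        DeliveryStreamName=stream_name,
        TagKeys=keys,
    )
```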
update_destination(**kwargs)
Updates the specified destination of the specified delivery stream.
Use this operation to change the destination type (for example, to replace the Amazon S3 destination with Amazon Redshift) or change the parameters associated with a destination (for example, to change the bucket name of the Amazon S3 destination). The update might not occur immediately. The target delivery stream remains active while the configurations are updated, so data writes to the delivery stream can continue during this process. The updated configurations are usually effective within a few minutes.
Switching between Amazon ES and other services is not supported. For an Amazon ES destination, you can only update to another Amazon ES destination.
If the destination type is the same, Kinesis Data Firehose merges the configuration parameters specified with the destination configuration that already exists on the delivery stream. If any of the parameters are not specified in the call, the existing values are retained. For example, in the Amazon S3 destination, if EncryptionConfiguration is not specified, then the existing EncryptionConfiguration is maintained on the destination.
If the destination type is not the same, for example, changing the destination from Amazon S3 to Amazon Redshift, Kinesis Data Firehose does not merge any parameters. In this case, all parameters must be specified.
Kinesis Data Firehose uses CurrentDeliveryStreamVersionId to avoid race conditions and conflicting merges. This is a required field, and the service updates the configuration only if the existing configuration has a version ID that matches. After the update is applied successfully, the version ID is updated, and can be retrieved using DescribeDeliveryStream. Use the new version ID to set CurrentDeliveryStreamVersionId in the next call.
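The version check described above amounts to a describe-then-update sequence. The following sketch assumes a delivery stream whose first destination is an extended S3 destination, and it reads `VersionId` and `DestinationId` from the `DeliveryStreamDescription` returned by `describe_delivery_stream`. The function name `update_s3_prefix` is hypothetical.

```python
def update_s3_prefix(client, stream_name, new_prefix):
    """Read the current version ID, then change only the S3 prefix of the first destination.

    `client` is expected to come from boto3.client('firehose').
    """
    description = client.describe_delivery_stream(
        DeliveryStreamName=stream_name
    )['DeliveryStreamDescription']
    return client.update_destination(
        DeliveryStreamName=stream_name,
        # Must match the version currently held by the service, or the update is rejected.
        CurrentDeliveryStreamVersionId=description['VersionId'],
        DestinationId=description['Destinations'][0]['DestinationId'],
        # Same destination type, so parameters left out here keep their existing values.
        ExtendedS3DestinationUpdate={'Prefix': new_prefix},
    )
```

If another writer updates the stream between the two calls, the version ID no longer matches; repeating the describe-then-update pair picks up the new version ID.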
See also: AWS API Documentation
Request Syntax
response = client.update_destination(
DeliveryStreamName='string',
CurrentDeliveryStreamVersionId='string',
DestinationId='string',
S3DestinationUpdate={
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
ExtendedS3DestinationUpdate={
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'S3BackupMode': 'Disabled'|'Enabled',
'S3BackupUpdate': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'DataFormatConversionConfiguration': {
'SchemaConfiguration': {
'RoleARN': 'string',
'CatalogId': 'string',
'DatabaseName': 'string',
'TableName': 'string',
'Region': 'string',
'VersionId': 'string'
},
'InputFormatConfiguration': {
'Deserializer': {
'OpenXJsonSerDe': {
'ConvertDotsInJsonKeysToUnderscores': True|False,
'CaseInsensitive': True|False,
'ColumnToJsonKeyMappings': {
'string': 'string'
}
},
'HiveJsonSerDe': {
'TimestampFormats': [
'string',
]
}
}
},
'OutputFormatConfiguration': {
'Serializer': {
'ParquetSerDe': {
'BlockSizeBytes': 123,
'PageSizeBytes': 123,
'Compression': 'UNCOMPRESSED'|'GZIP'|'SNAPPY',
'EnableDictionaryCompression': True|False,
'MaxPaddingBytes': 123,
'WriterVersion': 'V1'|'V2'
},
'OrcSerDe': {
'StripeSizeBytes': 123,
'BlockSizeBytes': 123,
'RowIndexStride': 123,
'EnablePadding': True|False,
'PaddingTolerance': 123.0,
'Compression': 'NONE'|'ZLIB'|'SNAPPY',
'BloomFilterColumns': [
'string',
],
'BloomFilterFalsePositiveProbability': 123.0,
'DictionaryKeyThreshold': 123.0,
'FormatVersion': 'V0_11'|'V0_12'
}
}
},
'Enabled': True|False
},
'DynamicPartitioningConfiguration': {
'RetryOptions': {
'DurationInSeconds': 123
},
'Enabled': True|False
}
},
RedshiftDestinationUpdate={
'RoleARN': 'string',
'ClusterJDBCURL': 'string',
'CopyCommand': {
'DataTableName': 'string',
'DataTableColumns': 'string',
'CopyOptions': 'string'
},
'Username': 'string',
'Password': 'string',
'RetryOptions': {
'DurationInSeconds': 123
},
'S3Update': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'S3BackupMode': 'Disabled'|'Enabled',
'S3BackupUpdate': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
ElasticsearchDestinationUpdate={
'RoleARN': 'string',
'DomainARN': 'string',
'ClusterEndpoint': 'string',
'IndexName': 'string',
'TypeName': 'string',
'IndexRotationPeriod': 'NoRotation'|'OneHour'|'OneDay'|'OneWeek'|'OneMonth',
'BufferingHints': {
'IntervalInSeconds': 123,
'SizeInMBs': 123
},
'RetryOptions': {
'DurationInSeconds': 123
},
'S3Update': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
AmazonopensearchserviceDestinationUpdate={
'RoleARN': 'string',
'DomainARN': 'string',
'ClusterEndpoint': 'string',
'IndexName': 'string',
'TypeName': 'string',
'IndexRotationPeriod': 'NoRotation'|'OneHour'|'OneDay'|'OneWeek'|'OneMonth',
'BufferingHints': {
'IntervalInSeconds': 123,
'SizeInMBs': 123
},
'RetryOptions': {
'DurationInSeconds': 123
},
'S3Update': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
SplunkDestinationUpdate={
'HECEndpoint': 'string',
'HECEndpointType': 'Raw'|'Event',
'HECToken': 'string',
'HECAcknowledgmentTimeoutInSeconds': 123,
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedEventsOnly'|'AllEvents',
'S3Update': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
},
HttpEndpointDestinationUpdate={
'EndpointConfiguration': {
'Url': 'string',
'Name': 'string',
'AccessKey': 'string'
},
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
},
'RequestConfiguration': {
'ContentEncoding': 'NONE'|'GZIP',
'CommonAttributes': [
{
'AttributeName': 'string',
'AttributeValue': 'string'
},
]
},
'ProcessingConfiguration': {
'Enabled': True|False,
'Processors': [
{
'Type': 'RecordDeAggregation'|'Lambda'|'MetadataExtraction'|'AppendDelimiterToRecord',
'Parameters': [
{
'ParameterName': 'LambdaArn'|'NumberOfRetries'|'MetadataExtractionQuery'|'JsonParsingEngine'|'RoleArn'|'BufferSizeInMBs'|'BufferIntervalInSeconds'|'SubRecordType'|'Delimiter',
'ParameterValue': 'string'
},
]
},
]
},
'RoleARN': 'string',
'RetryOptions': {
'DurationInSeconds': 123
},
'S3BackupMode': 'FailedDataOnly'|'AllData',
'S3Update': {
'RoleARN': 'string',
'BucketARN': 'string',
'Prefix': 'string',
'ErrorOutputPrefix': 'string',
'BufferingHints': {
'SizeInMBs': 123,
'IntervalInSeconds': 123
},
'CompressionFormat': 'UNCOMPRESSED'|'GZIP'|'ZIP'|'Snappy'|'HADOOP_SNAPPY',
'EncryptionConfiguration': {
'NoEncryptionConfig': 'NoEncryption',
'KMSEncryptionConfig': {
'AWSKMSKeyARN': 'string'
}
},
'CloudWatchLoggingOptions': {
'Enabled': True|False,
'LogGroupName': 'string',
'LogStreamName': 'string'
}
}
}
)
DeliveryStreamName (string) -- [REQUIRED] The name of the delivery stream.
CurrentDeliveryStreamVersionId (string) -- [REQUIRED] Obtain this value from the VersionId result of DeliveryStreamDescription. This value is required, and helps the service perform conditional operations. For example, if there is an interleaving update and this value is null, then the destination update fails. After the update is successful, the VersionId value is updated. The service then performs a merge of the old configuration with the new configuration.
DestinationId (string) -- [REQUIRED] The ID of the destination.
S3DestinationUpdate (dict) -- [Deprecated] Describes an update for a destination in Amazon S3.
  RoleARN (string) -- The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
  BucketARN (string) -- The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
  Prefix (string) -- The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects.
  ErrorOutputPrefix (string) -- A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects.
  BufferingHints (dict) -- The buffering option. If no value is specified, BufferingHints object default values are used.
    SizeInMBs (integer) -- Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
      We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
    IntervalInSeconds (integer) -- Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
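The sizing recommendation and the rule that both values are set together can be captured in a small helper. This is a sketch under stated assumptions: `buffering_hints` is a hypothetical name, using the documented default of 5 as a lower bound is a choice made here, and any service-side upper limits are not checked.

```python
import math

DEFAULT_SIZE_MIB = 5        # documented default for SizeInMBs
DEFAULT_INTERVAL_S = 300    # documented default for IntervalInSeconds


def buffering_hints(ingest_mib_per_sec, interval_seconds=DEFAULT_INTERVAL_S):
    """Build BufferingHints sized to hold at least 10 seconds of typical ingest."""
    ten_seconds_of_data = math.ceil(ingest_mib_per_sec * 10)
    return {
        # Both keys are always emitted together, as the descriptions above require.
        'SizeInMBs': max(DEFAULT_SIZE_MIB, ten_seconds_of_data),
        'IntervalInSeconds': interval_seconds,
    }
```

For the documented example rate of 1 MiB/sec, this yields a `SizeInMBs` of 10.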
  CompressionFormat (string) -- The compression format. If no value is specified, the default is UNCOMPRESSED.
    The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
  EncryptionConfiguration (dict) -- The encryption configuration. If no value is specified, the default is no encryption.
    NoEncryptionConfig (string) -- Specifically override existing encryption information to ensure that no encryption is used.
    KMSEncryptionConfig (dict) -- The encryption key.
      AWSKMSKeyARN (string) -- The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
  CloudWatchLoggingOptions (dict) -- The CloudWatch logging options for your delivery stream.
    Enabled (boolean) -- Enables or disables CloudWatch logging.
    LogGroupName (string) -- The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
    LogStreamName (string) -- The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
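The logging options carry a conditional requirement: the group and stream names are needed only when logging is enabled. A minimal sketch of that rule follows; the helper name `cloudwatch_logging_options` is hypothetical.

```python
def cloudwatch_logging_options(enabled, log_group=None, log_stream=None):
    """Build a CloudWatchLoggingOptions dict, enforcing the names when logging is on."""
    if not enabled:
        return {'Enabled': False}
    if not log_group or not log_stream:
        raise ValueError(
            "LogGroupName and LogStreamName are required when logging is enabled"
        )
    return {
        'Enabled': True,
        'LogGroupName': log_group,
        'LogStreamName': log_stream,
    }
```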
ExtendedS3DestinationUpdate (dict) -- Describes an update for a destination in Amazon S3.
  RoleARN (string) -- The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
  BucketARN (string) -- The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
  Prefix (string) -- The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects.
  ErrorOutputPrefix (string) -- A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects.
  BufferingHints (dict) -- The buffering option.
    SizeInMBs (integer) -- Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
      We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
    IntervalInSeconds (integer) -- Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
  CompressionFormat (string) -- The compression format. If no value is specified, the default is UNCOMPRESSED.
  EncryptionConfiguration (dict) -- The encryption configuration. If no value is specified, the default is no encryption.
    NoEncryptionConfig (string) -- Specifically override existing encryption information to ensure that no encryption is used.
    KMSEncryptionConfig (dict) -- The encryption key.
      AWSKMSKeyARN (string) -- The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
  CloudWatchLoggingOptions (dict) -- The Amazon CloudWatch logging options for your delivery stream.
    Enabled (boolean) -- Enables or disables CloudWatch logging.
    LogGroupName (string) -- The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
    LogStreamName (string) -- The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
  ProcessingConfiguration (dict) -- The data processing configuration.
    Enabled (boolean) -- Enables or disables data processing.
    Processors (list) -- The data processors.
      (dict) -- Describes a data processor.
        Type (string) -- The type of processor.
        Parameters (list) -- The processor parameters.
          (dict) -- Describes the processor parameter.
            ParameterName (string) -- The name of the parameter.
            ParameterValue (string) -- The parameter value.
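The nesting of processors and parameters is easier to see with a concrete value. The sketch below builds a ProcessingConfiguration with a single Lambda processor, using only the `Type` and `ParameterName` values listed in the request syntax. The helper name `lambda_processing_configuration` is hypothetical.

```python
def lambda_processing_configuration(lambda_arn, number_of_retries=None):
    """Build a ProcessingConfiguration dict with one Lambda processor."""
    parameters = [{'ParameterName': 'LambdaArn', 'ParameterValue': lambda_arn}]
    if number_of_retries is not None:
        parameters.append({
            'ParameterName': 'NumberOfRetries',
            # ParameterValue is a string in the request syntax, even for numbers.
            'ParameterValue': str(number_of_retries),
        })
    return {
        'Enabled': True,
        'Processors': [{'Type': 'Lambda', 'Parameters': parameters}],
    }
```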
  S3BackupMode (string) -- You can update a delivery stream to enable Amazon S3 backup if it is disabled. If backup is enabled, you can't update the delivery stream to disable it.
  S3BackupUpdate (dict) -- The Amazon S3 destination for backup.
    RoleARN (string) -- The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
    BucketARN (string) -- The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
    Prefix (string) -- The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects.
    ErrorOutputPrefix (string) -- A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects.
    BufferingHints (dict) -- The buffering option. If no value is specified, BufferingHints object default values are used.
      SizeInMBs (integer) -- Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
        We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
      IntervalInSeconds (integer) -- Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
    CompressionFormat (string) -- The compression format. If no value is specified, the default is UNCOMPRESSED.
      The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
    EncryptionConfiguration (dict) -- The encryption configuration. If no value is specified, the default is no encryption.
      NoEncryptionConfig (string) -- Specifically override existing encryption information to ensure that no encryption is used.
      KMSEncryptionConfig (dict) -- The encryption key.
        AWSKMSKeyARN (string) -- The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
    CloudWatchLoggingOptions (dict) -- The CloudWatch logging options for your delivery stream.
      Enabled (boolean) -- Enables or disables CloudWatch logging.
      LogGroupName (string) -- The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
      LogStreamName (string) -- The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
  DataFormatConversionConfiguration (dict) -- The serializer, deserializer, and schema for converting data from the JSON format to the Parquet or ORC format before writing it to Amazon S3.
    SchemaConfiguration (dict) -- Specifies the AWS Glue Data Catalog table that contains the column information. This parameter is required if Enabled is set to true.
      RoleARN (string) -- The role that Kinesis Data Firehose can use to access AWS Glue. This role must be in the same account you use for Kinesis Data Firehose. Cross-account roles aren't allowed.
        Warning: If the SchemaConfiguration request parameter is used as part of invoking the CreateDeliveryStream API, then the RoleARN property is required and its value must be specified.
      CatalogId (string) -- The ID of the AWS Glue Data Catalog. If you don't supply this, the AWS account ID is used by default.
      DatabaseName (string) -- Specifies the name of the AWS Glue database that contains the schema for the output data.
        Warning: If the SchemaConfiguration request parameter is used as part of invoking the CreateDeliveryStream API, then the DatabaseName property is required and its value must be specified.
      TableName (string) -- Specifies the AWS Glue table that contains the column information that constitutes your data schema.
        Warning: If the SchemaConfiguration request parameter is used as part of invoking the CreateDeliveryStream API, then the TableName property is required and its value must be specified.
      Region (string) -- If you don't specify an AWS Region, the default is the current Region.
      VersionId (string) -- Specifies the table version for the output data schema. If you don't specify this version ID, or if you set it to LATEST, Kinesis Data Firehose uses the most recent version. This means that any updates to the table are automatically picked up.
    InputFormatConfiguration (dict) -- Specifies the deserializer that you want Kinesis Data Firehose to use to convert the format of your data from JSON. This parameter is required if Enabled is set to true.
      Deserializer (dict) -- Specifies which deserializer to use. You can choose either the Apache Hive JSON SerDe or the OpenX JSON SerDe. If both are non-null, the server rejects the request.
        OpenXJsonSerDe (dict) -- The OpenX SerDe. Used by Kinesis Data Firehose for deserializing data, which means converting it from the JSON format in preparation for serializing it to the Parquet or ORC format. This is one of two deserializers you can choose, depending on which one offers the functionality you need. The other option is the native Hive / HCatalog JsonSerDe.
          ConvertDotsInJsonKeysToUnderscores (boolean) -- When set to true, specifies that the names of the keys include dots and that you want Kinesis Data Firehose to replace them with underscores. This is useful because Apache Hive does not allow dots in column names. For example, if the JSON contains a key whose name is "a.b", you can define the column name to be "a_b" when using this option.
            The default is false.
          CaseInsensitive (boolean) -- When set to true, which is the default, Kinesis Data Firehose converts JSON keys to lowercase before deserializing them.
          ColumnToJsonKeyMappings (dict) -- Maps column names to JSON keys that aren't identical to the column names. This is useful when the JSON contains keys that are Hive keywords. For example, timestamp is a Hive keyword. If you have a JSON key named timestamp, set this parameter to {"ts": "timestamp"} to map this key to a column named ts.
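The three OpenX options combine into one small dict. The sketch below uses the documented defaults and the `timestamp` example from the description; the helper name `openx_json_serde` is hypothetical.

```python
def openx_json_serde(column_to_json_key=None, convert_dots=False, case_insensitive=True):
    """Build an OpenXJsonSerDe dict using the documented defaults."""
    serde = {
        'ConvertDotsInJsonKeysToUnderscores': convert_dots,  # documented default: false
        'CaseInsensitive': case_insensitive,                 # documented default: true
    }
    if column_to_json_key:
        # Keys are column names; values are the JSON keys they are read from.
        serde['ColumnToJsonKeyMappings'] = dict(column_to_json_key)
    return serde


# Map the JSON key "timestamp" (a Hive keyword) to a column named "ts".
example = openx_json_serde({'ts': 'timestamp'})
```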
        HiveJsonSerDe (dict) -- The native Hive / HCatalog JsonSerDe. Used by Kinesis Data Firehose for deserializing data, which means converting it from the JSON format in preparation for serializing it to the Parquet or ORC format. This is one of two deserializers you can choose, depending on which one offers the functionality you need. The other option is the OpenX SerDe.
          TimestampFormats (list) -- Indicates how you want Kinesis Data Firehose to parse the dates and timestamps that may be present in your input data JSON. To specify these format strings, follow the pattern syntax of JodaTime's DateTimeFormat format strings. For more information, see Class DateTimeFormat. You can also use the special value millis to parse timestamps in epoch milliseconds. If you don't specify a format, Kinesis Data Firehose uses java.sql.Timestamp::valueOf by default.
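Because the server rejects a request in which both deserializers are set, it can help to build the Deserializer value through a function that allows only one. This is a sketch: `deserializer` is a hypothetical helper, and falling back to an empty HiveJsonSerDe when neither argument is given is a choice made here, not documented behavior.

```python
def deserializer(openx_json_serde=None, hive_timestamp_formats=None):
    """Return a Deserializer dict holding exactly one of the two SerDe options."""
    if openx_json_serde is not None and hive_timestamp_formats is not None:
        raise ValueError("choose either OpenXJsonSerDe or HiveJsonSerDe, not both")
    if openx_json_serde is not None:
        return {'OpenXJsonSerDe': openx_json_serde}
    hive = {}
    if hive_timestamp_formats:
        hive['TimestampFormats'] = list(hive_timestamp_formats)
    return {'HiveJsonSerDe': hive}


# A Joda-style pattern plus the special value for epoch milliseconds.
example = deserializer(hive_timestamp_formats=["yyyy-MM-dd'T'HH:mm:ss", 'millis'])
```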
    OutputFormatConfiguration (dict) -- Specifies the serializer that you want Kinesis Data Firehose to use to convert the format of your data to the Parquet or ORC format. This parameter is required if Enabled is set to true.
      Serializer (dict) -- Specifies which serializer to use. You can choose either the ORC SerDe or the Parquet SerDe. If both are non-null, the server rejects the request.
        ParquetSerDe (dict) -- A serializer to use for converting data to the Parquet format before storing it in Amazon S3. For more information, see Apache Parquet.
          BlockSizeBytes (integer) -- The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Kinesis Data Firehose uses this value for padding calculations.
          PageSizeBytes (integer) -- The Parquet page size. Column chunks are divided into pages. A page is conceptually an indivisible unit (in terms of compression and encoding). The minimum value is 64 KiB and the default is 1 MiB.
          Compression (string) -- The compression code to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP, with the default being SNAPPY. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.
          EnableDictionaryCompression (boolean) -- Indicates whether to enable dictionary compression.
          MaxPaddingBytes (integer) -- The maximum amount of padding to apply. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 0.
          WriterVersion (string) -- Indicates the version of row format to output. The possible values are V1 and V2. The default is V1.
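The Parquet sizes are given in MiB and KiB in the descriptions but are passed in bytes. The sketch below converts the documented defaults and enforces the documented minimums and allowed values; `parquet_serde` is a hypothetical helper name.

```python
KIB = 1024
MIB = 1024 * KIB


def parquet_serde(block_size_bytes=256 * MIB, page_size_bytes=1 * MIB,
                  compression='SNAPPY', writer_version='V1'):
    """Build a ParquetSerDe dict, checking the documented minimums and enum values."""
    if block_size_bytes < 64 * MIB:
        raise ValueError("BlockSizeBytes must be at least 64 MiB")
    if page_size_bytes < 64 * KIB:
        raise ValueError("PageSizeBytes must be at least 64 KiB")
    if compression not in ('UNCOMPRESSED', 'GZIP', 'SNAPPY'):
        raise ValueError(f"unsupported Compression: {compression!r}")
    if writer_version not in ('V1', 'V2'):
        raise ValueError(f"unsupported WriterVersion: {writer_version!r}")
    return {
        'BlockSizeBytes': block_size_bytes,
        'PageSizeBytes': page_size_bytes,
        'Compression': compression,
        'WriterVersion': writer_version,
    }
```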
        OrcSerDe (dict) -- A serializer to use for converting data to the ORC format before storing it in Amazon S3. For more information, see Apache ORC.
          StripeSizeBytes (integer) -- The number of bytes in each stripe. The default is 64 MiB and the minimum is 8 MiB.
          BlockSizeBytes (integer) -- The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Kinesis Data Firehose uses this value for padding calculations.
          RowIndexStride (integer) -- The number of rows between index entries. The default is 10,000 and the minimum is 1,000.
          EnablePadding (boolean) -- Set this to true to indicate that you want stripes to be padded to the HDFS block boundaries. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is false.
          PaddingTolerance (float) -- A number between 0 and 1 that defines the tolerance for block padding as a decimal fraction of stripe size. The default value is 0.05, which means 5 percent of stripe size.
            For the default values of 64 MiB ORC stripes and 256 MiB HDFS blocks, the default block padding tolerance of 5 percent reserves a maximum of 3.2 MiB for padding within the 256 MiB block. In such a case, if the available size within the block is more than 3.2 MiB, a new, smaller stripe is inserted to fit within that space. This ensures that no stripe crosses block boundaries and causes remote reads within a node-local task.
            Kinesis Data Firehose ignores this parameter when OrcSerDe$EnablePadding is false.
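The 3.2 MiB figure follows directly from multiplying the stripe size by the tolerance. The sketch below reproduces that arithmetic and the stated rule for when a smaller stripe is inserted. It is a local model of the description only, not the service's implementation, and both function names are hypothetical.

```python
def max_padding_mib(stripe_size_mib=64.0, padding_tolerance=0.05):
    """Space reserved for padding: the tolerance is a fraction of the stripe size."""
    if not 0 <= padding_tolerance <= 1:
        raise ValueError("PaddingTolerance must be between 0 and 1")
    return stripe_size_mib * padding_tolerance


def inserts_smaller_stripe(available_mib, stripe_size_mib=64.0, padding_tolerance=0.05):
    """True when the space left in the block exceeds the padding allowance."""
    return available_mib > max_padding_mib(stripe_size_mib, padding_tolerance)
```

With the defaults, `max_padding_mib()` evaluates to 64 x 0.05 = 3.2 MiB, so 10 MiB of remaining block space would receive a smaller stripe while 2 MiB would be padded.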
          Compression (string) -- The compression code to use over data blocks. The default is SNAPPY.
          BloomFilterColumns (list) -- The column names for which you want Kinesis Data Firehose to create Bloom filters. The default is null.
          BloomFilterFalsePositiveProbability (float) -- The Bloom filter false positive probability (FPP). The lower the FPP, the bigger the Bloom filter. The default value is 0.05, the minimum is 0, and the maximum is 1.
          DictionaryKeyThreshold (float) -- Represents the fraction of the total number of non-null rows. To turn off dictionary encoding, set this fraction to a number that is less than the number of distinct keys in a dictionary. To always use dictionary encoding, set this threshold to 1.
          FormatVersion (string) -- The version of the file to write. The possible values are V0_11 and V0_12. The default is V0_12.
    Enabled (boolean) -- Defaults to true. Set it to false if you want to disable format conversion while preserving the configuration details.
  DynamicPartitioningConfiguration (dict) -- The configuration of the dynamic partitioning mechanism that creates smaller data sets from the streaming data by partitioning it based on partition keys. Currently, dynamic partitioning is only supported for Amazon S3 destinations. For more information, see https://docs.aws.amazon.com/firehose/latest/dev/dynamic-partitioning.html
    RetryOptions (dict) -- The retry behavior in case Kinesis Data Firehose is unable to deliver data to an Amazon S3 prefix.
      DurationInSeconds (integer) -- The period of time during which Kinesis Data Firehose retries to deliver data to the specified Amazon S3 prefix.
    Enabled (boolean) -- Specifies that the dynamic partitioning is enabled for this Kinesis Data Firehose delivery stream.
RedshiftDestinationUpdate (dict) -- Describes an update for a destination in Amazon Redshift.
  RoleARN (string) -- The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
  ClusterJDBCURL (string) -- The database connection string.
  CopyCommand (dict) -- The COPY command.
    DataTableName (string) -- The name of the target table. The table must already exist in the database.
    DataTableColumns (string) -- A comma-separated list of column names.
    CopyOptions (string) -- Optional parameters to use with the Amazon Redshift COPY command. For more information, see the "Optional Parameters" section of Amazon Redshift COPY command. Some possible examples that would apply to Kinesis Data Firehose are as follows:
      delimiter '\t' lzop; - fields are delimited with "\t" (TAB character) and compressed using lzop.
      delimiter '|' - fields are delimited with "|" (this is the default delimiter).
      delimiter '|' escape - the delimiter should be escaped.
      fixedwidth 'venueid:3,venuename:25,venuecity:12,venuestate:2,venueseats:6' - fields are fixed width in the source, with each width specified after every column in the table.
      JSON 's3://mybucket/jsonpaths.txt' - data is in JSON format, and the path specified is the format of the data.
      For more examples, see Amazon Redshift COPY command examples.
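As an illustration of how the three CopyCommand fields fit together, the sketch below joins a list of column names into the comma-separated string the API expects and passes the options through unchanged. The helper name `copy_command` and the table and column names in the example are hypothetical.

```python
def copy_command(table_name, columns=None, copy_options=None):
    """Build a CopyCommand dict; optional fields are left out when not given."""
    command = {'DataTableName': table_name}
    if columns:
        # DataTableColumns is a single comma-separated string, not a list.
        command['DataTableColumns'] = ','.join(columns)
    if copy_options:
        command['CopyOptions'] = copy_options
    return command


# Pipe-delimited input with escaping, one of the option strings listed above.
example = copy_command('events', ['event_id', 'event_time'], "delimiter '|' escape")
```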
  Username (string) -- The name of the user.
  Password (string) -- The user password.
  RetryOptions (dict) -- The retry behavior in case Kinesis Data Firehose is unable to deliver documents to Amazon Redshift. The default value is 3600 seconds (60 minutes).
    DurationInSeconds (integer) -- The length of time during which Kinesis Data Firehose retries delivery after a failure, starting from the initial request and including the first attempt. The default value is 3600 seconds (60 minutes). Kinesis Data Firehose does not retry if the value of DurationInSeconds is 0 (zero) or if the first delivery attempt takes longer than the current value.
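The two conditions under which no retry happens can be written as a small predicate. This is a local model of the rule as described, not the service's implementation, and the names `retry_options` and `will_retry` are hypothetical.

```python
DEFAULT_RETRY_SECONDS = 3600  # documented default (60 minutes)


def retry_options(duration_in_seconds=DEFAULT_RETRY_SECONDS):
    """Build a RetryOptions dict; a duration of 0 turns retries off."""
    return {'DurationInSeconds': duration_in_seconds}


def will_retry(duration_in_seconds, first_attempt_seconds):
    """Model of the rule above: no retry for a zero duration or an overlong first attempt."""
    if duration_in_seconds == 0:
        return False
    return first_attempt_seconds <= duration_in_seconds
```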
  S3Update (dict) -- The Amazon S3 destination.
    The compression formats SNAPPY or ZIP cannot be specified in RedshiftDestinationUpdate.S3Update because the Amazon Redshift COPY operation that reads from the S3 bucket doesn't support these compression formats.
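A client-side guard can catch the unsupported formats before the request is sent. The sketch below compares case-insensitively because the request syntax spells the value `'Snappy'` while the description writes SNAPPY. The helper name is hypothetical, and `HADOOP_SNAPPY` is passed through because the description above does not mention it.

```python
REDSHIFT_UNSUPPORTED = {'SNAPPY', 'ZIP'}  # formats the description rules out for COPY


def check_redshift_compression(compression_format):
    """Return the format unchanged, or raise if it is ruled out for Redshift destinations."""
    if compression_format.upper() in REDSHIFT_UNSUPPORTED:
        raise ValueError(
            f"{compression_format!r} cannot be used in RedshiftDestinationUpdate.S3Update"
        )
    return compression_format
```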
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
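The sizing recommendation and the both-or-neither rule for the buffering hints can be sketched as follows. The helper names are hypothetical, and rounding up to a whole MiB is an assumption made for illustration.

```python
import math


def suggest_size_in_mbs(ingest_mib_per_sec):
    """Smallest whole MiB value covering 10 seconds of typical ingest."""
    return max(1, math.ceil(ingest_mib_per_sec * 10))


def buffering_hints(size_in_mbs=None, interval_in_seconds=None):
    """Return a BufferingHints dict, or None to rely on the service defaults."""
    if (size_in_mbs is None) != (interval_in_seconds is None):
        raise ValueError("specify SizeInMBs and IntervalInSeconds together, or neither")
    if size_in_mbs is None:
        return None  # omit BufferingHints entirely
    return {"SizeInMBs": size_in_mbs, "IntervalInSeconds": interval_in_seconds}


# 1 MiB/sec of typical ingest suggests a buffer of at least 10 MiB.
hints = buffering_hints(suggest_size_in_mbs(1), 300)
```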
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
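A minimal sketch of the two encryption choices described above. The key names (NoEncryptionConfig, KMSEncryptionConfig, AWSKMSKeyARN) and the "NoEncryption" value are assumed from the Firehose request syntax. The Region check is a local convenience that reads the Region field out of the key ARN, and the bucket Region must be supplied by the caller because an S3 bucket ARN does not carry one.

```python
def kms_key_region(kms_key_arn):
    """Extract the Region field from a KMS key ARN."""
    parts = kms_key_arn.split(":")
    if len(parts) < 6 or parts[0] != "arn" or parts[2] != "kms":
        raise ValueError(f"not a KMS key ARN: {kms_key_arn!r}")
    return parts[3]


def encryption_configuration(kms_key_arn=None, bucket_region=None):
    """Return an EncryptionConfiguration dict (sketch)."""
    if kms_key_arn is None:
        # Explicitly override any existing encryption settings.
        return {"NoEncryptionConfig": "NoEncryption"}
    if bucket_region is not None and kms_key_region(kms_key_arn) != bucket_region:
        raise ValueError("the KMS key must be in the same Region as the bucket")
    return {"KMSEncryptionConfig": {"AWSKMSKeyARN": kms_key_arn}}
```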
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
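The requirement that both names be present when logging is enabled can be checked before the request is sent. This is a sketch; the key names Enabled, LogGroupName, and LogStreamName are assumed from the Firehose request syntax, and the helper name is hypothetical.

```python
def cloudwatch_logging_options(enabled, log_group_name=None, log_stream_name=None):
    """Return a CloudWatchLoggingOptions dict, enforcing the documented rule."""
    if not enabled:
        return {"Enabled": False}
    if not log_group_name or not log_stream_name:
        raise ValueError("log group and log stream names are required when logging is enabled")
    return {
        "Enabled": True,
        "LogGroupName": log_group_name,
        "LogStreamName": log_stream_name,
    }
```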
The data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
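The nesting described above (a configuration holding processors, each holding name/value parameters) might look like the following. The "Lambda" processor type and the "LambdaArn" parameter name are assumptions drawn from the Firehose API rather than from this section, and the function ARN in the usage line is a placeholder.

```python
def lambda_processing_configuration(lambda_arn, enabled=True):
    """Return a ProcessingConfiguration dict with a single Lambda processor."""
    return {
        "Enabled": enabled,
        "Processors": [
            {
                "Type": "Lambda",  # assumed processor type
                "Parameters": [
                    # Each parameter is a name/value pair.
                    {"ParameterName": "LambdaArn", "ParameterValue": lambda_arn},
                ],
            }
        ],
    }


# Placeholder ARN for illustration only.
config = lambda_processing_configuration(
    "arn:aws:lambda:us-east-1:123456789012:function:example"
)
```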
You can update a delivery stream to enable Amazon S3 backup if it is disabled. If backup is enabled, you can't update the delivery stream to disable it.
The Amazon S3 destination for backup.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The Amazon CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
Describes an update for a destination in Amazon ES.
The Amazon Resource Name (ARN) of the IAM role to be assumed by Kinesis Data Firehose for calling the Amazon ES Configuration API and for indexing documents. For more information, see Grant Kinesis Data Firehose Access to an Amazon S3 Destination and Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the Amazon ES domain. The IAM role must have permissions for DescribeElasticsearchDomain, DescribeElasticsearchDomains, and DescribeElasticsearchDomainConfig after assuming the IAM role specified in RoleARN. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces.
Specify either ClusterEndpoint or DomainARN.
The endpoint to use when communicating with the cluster. Specify either this ClusterEndpoint or the DomainARN field.
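The either/or rule for DomainARN and ClusterEndpoint can be enforced locally before building the update. This sketch reads "either" as "exactly one", which is an interpretation; the helper name and the endpoint URL used in testing are hypothetical.

```python
def elasticsearch_target(domain_arn=None, cluster_endpoint=None):
    """Return the one target field to merge into the Amazon ES update dict."""
    if (domain_arn is None) == (cluster_endpoint is None):
        raise ValueError("specify exactly one of DomainARN or ClusterEndpoint")
    if domain_arn is not None:
        return {"DomainARN": domain_arn}
    return {"ClusterEndpoint": cluster_endpoint}
```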
The Elasticsearch index name.
The Elasticsearch type name. For Elasticsearch 6.x, there can be only one type per index. If you try to specify a new type for an existing index that already has another type, Kinesis Data Firehose returns an error during runtime.
If you upgrade Elasticsearch from 6.x to 7.x and don't update your delivery stream, Kinesis Data Firehose still delivers data to Elasticsearch with the old index name and type name. If you want to update your delivery stream with a new index name, provide an empty string for TypeName.
The Elasticsearch index rotation period. Index rotation appends a timestamp to IndexName to facilitate the expiration of old data. For more information, see Index Rotation for the Amazon ES Destination. Default value is OneDay.
The buffering options. If no value is specified, ElasticsearchBufferingHints object default values are used.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300 (5 minutes).
Buffer incoming data to the specified size, in MBs, before delivering it to the destination. The default value is 5.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MB/sec, the value should be 10 MB or higher.
The retry behavior in case Kinesis Data Firehose is unable to deliver documents to Amazon ES. The default value is 300 (5 minutes).
After an initial failure to deliver to Amazon ES, the total amount of time during which Kinesis Data Firehose retries delivery (including the first attempt). After this time has elapsed, the failed documents are written to Amazon S3. Default value is 300 seconds (5 minutes). A value of 0 (zero) results in no retries.
The Amazon S3 destination.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
Describes an update for a destination in Amazon S3.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
Describes a data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
Describes the Amazon CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
Describes an update for a destination in Splunk.
The HTTP Event Collector (HEC) endpoint to which Kinesis Data Firehose sends your data.
This type can be either "Raw" or "Event."
A GUID that you obtain from your Splunk cluster when you create a new HEC endpoint.
The amount of time that Kinesis Data Firehose waits to receive an acknowledgment from Splunk after it sends data. At the end of the timeout period, Kinesis Data Firehose either tries to send the data again or considers it an error, based on your retry settings.
The retry behavior in case Kinesis Data Firehose is unable to deliver data to Splunk or if it doesn't receive an acknowledgment of receipt from Splunk.
The total amount of time that Kinesis Data Firehose spends on retries. This duration starts after the initial attempt to send data to Splunk fails. It doesn't include the periods during which Kinesis Data Firehose waits for acknowledgment from Splunk after each attempt.
Specifies how you want Kinesis Data Firehose to back up documents to Amazon S3. When set to FailedEventsOnly, Kinesis Data Firehose writes any data that could not be indexed to the configured Amazon S3 destination. When set to AllEvents, Kinesis Data Firehose delivers all incoming records to Amazon S3, and also writes failed documents to Amazon S3. The default value is FailedEventsOnly.
You can update this backup mode from FailedEventsOnly to AllEvents. You can't update it from AllEvents to FailedEventsOnly.
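The one-way transition rule for the Splunk backup mode can be checked before calling the API. This is a local sketch of the stated rule only; the function name is hypothetical, and treating an unchanged mode as allowed is an assumption.

```python
SPLUNK_BACKUP_MODES = ("FailedEventsOnly", "AllEvents")


def can_change_backup_mode(current, requested):
    """Return True if the documented rule permits moving current -> requested."""
    if current not in SPLUNK_BACKUP_MODES or requested not in SPLUNK_BACKUP_MODES:
        raise ValueError(f"backup mode must be one of {SPLUNK_BACKUP_MODES}")
    # Leaving the mode unchanged is assumed to be fine; only the
    # AllEvents -> FailedEventsOnly direction is documented as disallowed.
    return current == requested or (current, requested) == ("FailedEventsOnly", "AllEvents")
```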
Your update to the configuration of the backup Amazon S3 location.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
The Amazon CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
Describes an update to the specified HTTP endpoint destination.
Describes the configuration of the HTTP endpoint destination.
The URL of the HTTP endpoint selected as the destination.
Warning
If you choose an HTTP endpoint as your destination, review and follow the instructions in the Appendix - HTTP Endpoint Delivery Request and Response Specifications .
The name of the HTTP endpoint selected as the destination.
The access key required for Kinesis Firehose to authenticate with the HTTP endpoint selected as the destination.
Describes buffering options that can be applied to the data before it is delivered to the HTTPS endpoint destination. Kinesis Data Firehose treats these options as hints, and it might choose to use more optimal values. The SizeInMBs and IntervalInSeconds parameters are optional. However, if you specify a value for one of them, you must also provide a value for the other.
Buffer incoming data to the specified size, in MBs, before delivering it to the destination. The default value is 5.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MB/sec, the value should be 10 MB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300 (5 minutes).
Describes the Amazon CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
The configuration of the request sent to the HTTP endpoint specified as the destination.
Kinesis Data Firehose uses the content encoding to compress the body of a request before sending the request to the destination. For more information, see Content-Encoding in MDN Web Docs, the official Mozilla documentation.
Describes the metadata sent to the HTTP endpoint destination.
Describes the metadata that's delivered to the specified HTTP endpoint destination.
The name of the HTTP endpoint common attribute.
The value of the HTTP endpoint common attribute.
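A sketch of how the request configuration and its common attributes might be assembled from a plain mapping. The key names (ContentEncoding, CommonAttributes, AttributeName, AttributeValue) and the "NONE"/"GZIP" values are assumed from the Firehose request syntax; the attribute names in the usage line are made up.

```python
def request_configuration(common_attributes, content_encoding="NONE"):
    """Return a RequestConfiguration dict for an HTTP endpoint destination."""
    if content_encoding not in ("NONE", "GZIP"):
        raise ValueError("content_encoding must be 'NONE' or 'GZIP'")
    return {
        "ContentEncoding": content_encoding,
        # One name/value entry per attribute, in the mapping's own order.
        "CommonAttributes": [
            {"AttributeName": name, "AttributeValue": value}
            for name, value in common_attributes.items()
        ],
    }


# Hypothetical attributes attached to every request.
request_config = request_configuration({"team": "ingest", "env": "test"}, "GZIP")
```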
Describes a data processing configuration.
Enables or disables data processing.
The data processors.
Describes a data processor.
The type of processor.
The processor parameters.
Describes the processor parameter.
The name of the parameter.
The parameter value.
Kinesis Data Firehose uses this IAM role for all the permissions that the delivery stream needs.
Describes the retry behavior in case Kinesis Data Firehose is unable to deliver data to the specified HTTP endpoint destination, or if it doesn't receive a valid acknowledgment of receipt from the specified HTTP endpoint destination.
The total amount of time that Kinesis Data Firehose spends on retries. This duration starts after the initial attempt to send data to the custom destination via HTTPS endpoint fails. It doesn't include the periods during which Kinesis Data Firehose waits for acknowledgment from the specified destination after each attempt.
Describes the S3 bucket backup options for the data that Kinesis Firehose delivers to the HTTP endpoint destination. You can back up all documents (AllData) or only the documents that Kinesis Data Firehose could not deliver to the specified HTTP endpoint destination (FailedDataOnly).
Describes an update for a destination in Amazon S3.
The Amazon Resource Name (ARN) of the AWS credentials. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The ARN of the S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The "YYYY/MM/DD/HH" time format prefix is automatically used for delivered Amazon S3 files. You can also specify a custom prefix, as described in Custom Prefixes for Amazon S3 Objects .
A prefix that Kinesis Data Firehose evaluates and adds to failed records before writing them to S3. This prefix appears immediately following the bucket name. For information about how to specify this prefix, see Custom Prefixes for Amazon S3 Objects .
The buffering option. If no value is specified, BufferingHints object default values are used.
Buffer incoming data to the specified size, in MiBs, before delivering it to the destination. The default value is 5. This parameter is optional, but if you specify a value for it, you must also specify a value for IntervalInSeconds, and vice versa.
We recommend setting this parameter to a value greater than the amount of data you typically ingest into the delivery stream in 10 seconds. For example, if you typically ingest data at 1 MiB/sec, the value should be 10 MiB or higher.
Buffer incoming data for the specified period of time, in seconds, before delivering it to the destination. The default value is 300. This parameter is optional, but if you specify a value for it, you must also specify a value for SizeInMBs, and vice versa.
The compression format. If no value is specified, the default is UNCOMPRESSED.
The compression formats SNAPPY or ZIP cannot be specified for Amazon Redshift destinations because they are not supported by the Amazon Redshift COPY operation that reads from the S3 bucket.
The encryption configuration. If no value is specified, the default is no encryption.
Specifically override existing encryption information to ensure that no encryption is used.
The encryption key.
The Amazon Resource Name (ARN) of the encryption key. Must belong to the same AWS Region as the destination Amazon S3 bucket. For more information, see Amazon Resource Names (ARNs) and AWS Service Namespaces .
The CloudWatch logging options for your delivery stream.
Enables or disables CloudWatch logging.
The CloudWatch group name for logging. This value is required if CloudWatch logging is enabled.
The CloudWatch log stream name for logging. This value is required if CloudWatch logging is enabled.
dict
Response Syntax
{}
Response Structure
Exceptions
Firehose.Client.exceptions.InvalidArgumentException
Firehose.Client.exceptions.ResourceInUseException
Firehose.Client.exceptions.ResourceNotFoundException
Firehose.Client.exceptions.ConcurrentModificationException
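One way to handle ConcurrentModificationException is to re-read the delivery stream's version ID and try the update again. The sketch below takes an existing Firehose client as an argument. The retry count, the choice of the first destination, and the helper name are assumptions made for illustration, and the response field names (DeliveryStreamDescription, VersionId, Destinations, DestinationId) are assumed from the DescribeDeliveryStream response.

```python
def update_with_retry(client, stream_name, destination_update, max_attempts=3):
    """Call update_destination, refreshing the version ID after a conflict.

    destination_update is a dict such as {"S3DestinationUpdate": {...}}.
    """
    for attempt in range(max_attempts):
        description = client.describe_delivery_stream(
            DeliveryStreamName=stream_name
        )["DeliveryStreamDescription"]
        try:
            return client.update_destination(
                DeliveryStreamName=stream_name,
                CurrentDeliveryStreamVersionId=description["VersionId"],
                # Assumption: the stream has a single destination.
                DestinationId=description["Destinations"][0]["DestinationId"],
                **destination_update,
            )
        except client.exceptions.ConcurrentModificationException:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
```

With a real client this might be called as update_with_retry(boto3.client('firehose'), 'my-stream', {...}), where 'my-stream' is a placeholder name. The other listed exceptions are left to propagate because retrying would not normally resolve them.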
The available paginators are: