update_column_statistics_for_partition

Glue.Client.update_column_statistics_for_partition(**kwargs)

Creates or updates partition statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is UpdatePartition.

See also: AWS API Documentation

Request Syntax

response = client.update_column_statistics_for_partition(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    PartitionValues=[
        'string',
    ],
    ColumnStatisticsList=[
        {
            'ColumnName': 'string',
            'ColumnType': 'string',
            'AnalyzedTime': datetime(2015, 1, 1),
            'StatisticsData': {
                'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                'BooleanColumnStatisticsData': {
                    'NumberOfTrues': 123,
                    'NumberOfFalses': 123,
                    'NumberOfNulls': 123
                },
                'DateColumnStatisticsData': {
                    'MinimumValue': datetime(2015, 1, 1),
                    'MaximumValue': datetime(2015, 1, 1),
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DecimalColumnStatisticsData': {
                    'MinimumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'MaximumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DoubleColumnStatisticsData': {
                    'MinimumValue': 123.0,
                    'MaximumValue': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'LongColumnStatisticsData': {
                    'MinimumValue': 123,
                    'MaximumValue': 123,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'StringColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'BinaryColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123
                }
            }
        },
    ]
)
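
As a rough illustration of the syntax above, the following sketch derives LongColumnStatisticsData from an in-memory list of values and assembles the keyword arguments for the call. The helper names (long_column_stats, build_request) and the sample database, table, partition, and column names are assumptions made for this example; only the parameter keys come from the request syntax. Whether 'bigint' is the right ColumnType depends on your table.

```python
from datetime import datetime, timezone


def long_column_stats(column_name, column_type, values):
    """Build one ColumnStatisticsList entry for an integer column.

    `values` is a hypothetical in-memory sample; None stands for NULL.
    """
    present = [v for v in values if v is not None]
    data = {
        "NumberOfNulls": len(values) - len(present),
        "NumberOfDistinctValues": len(set(present)),
    }
    # MinimumValue and MaximumValue are not marked [REQUIRED], so they
    # are omitted when the column holds no non-null values.
    if present:
        data["MinimumValue"] = min(present)
        data["MaximumValue"] = max(present)
    return {
        "ColumnName": column_name,
        "ColumnType": column_type,
        "AnalyzedTime": datetime.now(timezone.utc),
        "StatisticsData": {"Type": "LONG", "LongColumnStatisticsData": data},
    }


def build_request(database, table, partition_values, entries):
    """Assemble keyword arguments for the call.

    CatalogId is left out so the account ID is used by default.
    """
    return {
        "DatabaseName": database,
        "TableName": table,
        "PartitionValues": list(partition_values),
        "ColumnStatisticsList": list(entries),
    }


# Hypothetical names and values, for illustration only.
request = build_request(
    "sales_db",
    "orders",
    ["2024", "01"],
    [long_column_stats("quantity", "bigint", [3, 7, None, 3])],
)

# With a configured Glue client (for example boto3.client("glue")):
# response = client.update_column_statistics_for_partition(**request)
```

Only the statistics dict that matches Type is populated here; the other per-type dicts are simply left out of StatisticsData.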

Parameters

  • CatalogId (string) -- The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.
  • DatabaseName (string) --

    [REQUIRED]

    The name of the catalog database where the partitions reside.

  • TableName (string) --

    [REQUIRED]

    The name of the table that contains the partitions.

  • PartitionValues (list) --

    [REQUIRED]

    A list of partition values identifying the partition.

    • (string) --
  • ColumnStatisticsList (list) --

    [REQUIRED]

    A list of the column statistics.

    • (dict) --

      Represents the generated column-level statistics for a table or partition.

      • ColumnName (string) -- [REQUIRED]

        The name of the column that the statistics belong to.

      • ColumnType (string) -- [REQUIRED]

        The data type of the column.

      • AnalyzedTime (datetime) -- [REQUIRED]

        The timestamp of when column statistics were generated.

      • StatisticsData (dict) -- [REQUIRED]

        A ColumnStatisticsData object that contains the statistics data values.

        • Type (string) -- [REQUIRED]

          The type of column statistics data.

        • BooleanColumnStatisticsData (dict) --

          Boolean column statistics data.

          • NumberOfTrues (integer) -- [REQUIRED]

            The number of true values in the column.

          • NumberOfFalses (integer) -- [REQUIRED]

            The number of false values in the column.

          • NumberOfNulls (integer) -- [REQUIRED]

            The number of null values in the column.

        • DateColumnStatisticsData (dict) --

          Date column statistics data.

          • MinimumValue (datetime) --

            The lowest value in the column.

          • MaximumValue (datetime) --

            The highest value in the column.

          • NumberOfNulls (integer) -- [REQUIRED]

            The number of null values in the column.

          • NumberOfDistinctValues (integer) -- [REQUIRED]

            The number of distinct values in a column.

        • DecimalColumnStatisticsData (dict) --

          Decimal column statistics data.

          • MinimumValue (dict) --

            The lowest value in the column.

            • UnscaledValue (bytes) -- [REQUIRED]

              The unscaled numeric value.

            • Scale (integer) -- [REQUIRED]

              The scale that determines where the decimal point falls in the unscaled value.

          • MaximumValue (dict) --

            The highest value in the column.

            • UnscaledValue (bytes) -- [REQUIRED]

              The unscaled numeric value.

            • Scale (integer) -- [REQUIRED]

              The scale that determines where the decimal point falls in the unscaled value.

          • NumberOfNulls (integer) -- [REQUIRED]

            The number of null values in the column.

          • NumberOfDistinctValues (integer) -- [REQUIRED]

            The number of distinct values in a column.

        • DoubleColumnStatisticsData (dict) --

          Double column statistics data.

          • MinimumValue (float) --

            The lowest value in the column.

          • MaximumValue (float) --

            The highest value in the column.

          • NumberOfNulls (integer) -- [REQUIRED]

            The number of null values in the column.

          • NumberOfDistinctValues (integer) -- [REQUIRED]

            The number of distinct values in a column.

        • LongColumnStatisticsData (dict) --

          Long column statistics data.

          • MinimumValue (integer) --

            The lowest value in the column.

          • MaximumValue (integer) --

            The highest value in the column.

          • NumberOfNulls (integer) -- [REQUIRED]

            The number of null values in the column.

          • NumberOfDistinctValues (integer) -- [REQUIRED]

            The number of distinct values in a column.

        • StringColumnStatisticsData (dict) --

          String column statistics data.

          • MaximumLength (integer) -- [REQUIRED]

            The size of the longest string in the column.

          • AverageLength (float) -- [REQUIRED]

            The average string length in the column.

          • NumberOfNulls (integer) -- [REQUIRED]

            The number of null values in the column.

          • NumberOfDistinctValues (integer) -- [REQUIRED]

            The number of distinct values in a column.

        • BinaryColumnStatisticsData (dict) --

          Binary column statistics data.

          • MaximumLength (integer) -- [REQUIRED]

            The size of the longest bit sequence in the column.

          • AverageLength (float) -- [REQUIRED]

            The average bit sequence length in the column.

          • NumberOfNulls (integer) -- [REQUIRED]

            The number of null values in the column.
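
The MinimumValue and MaximumValue entries of DecimalColumnStatisticsData take an UnscaledValue (bytes) and a Scale (integer), which is easy to get wrong by hand. The sketch below converts a Python Decimal into that pair and back. It assumes that UnscaledValue is the big-endian two's-complement encoding of the unscaled integer (the layout Java's BigInteger.toByteArray produces); this page does not state the byte layout, so verify it against your catalog before relying on it. The function names are hypothetical.

```python
from decimal import Decimal


def decimal_to_decimal_number(value):
    """Convert a finite Decimal into an {'UnscaledValue', 'Scale'} dict.

    Assumption: UnscaledValue is a big-endian two's-complement integer.
    """
    sign, digits, exponent = value.as_tuple()
    unscaled = int("".join(map(str, digits)))
    if sign:
        unscaled = -unscaled
    if exponent > 0:
        # Fold a positive exponent into the integer so Scale stays >= 0.
        unscaled *= 10 ** exponent
        exponent = 0
    # A byte length that always leaves room for the sign bit.
    length = unscaled.bit_length() // 8 + 1
    return {
        "UnscaledValue": unscaled.to_bytes(length, "big", signed=True),
        "Scale": -exponent,
    }


def decimal_number_to_decimal(number):
    """Inverse of decimal_to_decimal_number, under the same assumption."""
    unscaled = int.from_bytes(number["UnscaledValue"], "big", signed=True)
    return Decimal(unscaled).scaleb(-number["Scale"])


# 123.45 is stored as the unscaled integer 12345 with a scale of 2.
encoded = decimal_to_decimal_number(Decimal("123.45"))
decoded = decimal_number_to_decimal(encoded)
```

Round-tripping through both helpers is a cheap way to check the encoding locally before sending statistics to the service.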

Return type

dict

Returns

Response Syntax

{
    'Errors': [
        {
            'ColumnStatistics': {
                'ColumnName': 'string',
                'ColumnType': 'string',
                'AnalyzedTime': datetime(2015, 1, 1),
                'StatisticsData': {
                    'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                    'BooleanColumnStatisticsData': {
                        'NumberOfTrues': 123,
                        'NumberOfFalses': 123,
                        'NumberOfNulls': 123
                    },
                    'DateColumnStatisticsData': {
                        'MinimumValue': datetime(2015, 1, 1),
                        'MaximumValue': datetime(2015, 1, 1),
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'DecimalColumnStatisticsData': {
                        'MinimumValue': {
                            'UnscaledValue': b'bytes',
                            'Scale': 123
                        },
                        'MaximumValue': {
                            'UnscaledValue': b'bytes',
                            'Scale': 123
                        },
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'DoubleColumnStatisticsData': {
                        'MinimumValue': 123.0,
                        'MaximumValue': 123.0,
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'LongColumnStatisticsData': {
                        'MinimumValue': 123,
                        'MaximumValue': 123,
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'StringColumnStatisticsData': {
                        'MaximumLength': 123,
                        'AverageLength': 123.0,
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'BinaryColumnStatisticsData': {
                        'MaximumLength': 123,
                        'AverageLength': 123.0,
                        'NumberOfNulls': 123
                    }
                }
            },
            'Error': {
                'ErrorCode': 'string',
                'ErrorMessage': 'string'
            }
        },
    ]
}

Response Structure

  • (dict) --

    • Errors (list) --

      A list of errors that occurred while updating column statistics data.

      • (dict) --

        Encapsulates a ColumnStatistics object that failed and the reason for failure.

        • ColumnStatistics (dict) --

          The ColumnStatistics of the column.

          • ColumnName (string) --

            The name of the column that the statistics belong to.

          • ColumnType (string) --

            The data type of the column.

          • AnalyzedTime (datetime) --

            The timestamp of when column statistics were generated.

          • StatisticsData (dict) --

            A ColumnStatisticsData object that contains the statistics data values.

            • Type (string) --

              The type of column statistics data.

            • BooleanColumnStatisticsData (dict) --

              Boolean column statistics data.

              • NumberOfTrues (integer) --

                The number of true values in the column.

              • NumberOfFalses (integer) --

                The number of false values in the column.

              • NumberOfNulls (integer) --

                The number of null values in the column.

            • DateColumnStatisticsData (dict) --

              Date column statistics data.

              • MinimumValue (datetime) --

                The lowest value in the column.

              • MaximumValue (datetime) --

                The highest value in the column.

              • NumberOfNulls (integer) --

                The number of null values in the column.

              • NumberOfDistinctValues (integer) --

                The number of distinct values in a column.

            • DecimalColumnStatisticsData (dict) --

              Decimal column statistics data.

              • MinimumValue (dict) --

                The lowest value in the column.

                • UnscaledValue (bytes) --

                  The unscaled numeric value.

                • Scale (integer) --

                  The scale that determines where the decimal point falls in the unscaled value.

              • MaximumValue (dict) --

                The highest value in the column.

                • UnscaledValue (bytes) --

                  The unscaled numeric value.

                • Scale (integer) --

                  The scale that determines where the decimal point falls in the unscaled value.

              • NumberOfNulls (integer) --

                The number of null values in the column.

              • NumberOfDistinctValues (integer) --

                The number of distinct values in a column.

            • DoubleColumnStatisticsData (dict) --

              Double column statistics data.

              • MinimumValue (float) --

                The lowest value in the column.

              • MaximumValue (float) --

                The highest value in the column.

              • NumberOfNulls (integer) --

                The number of null values in the column.

              • NumberOfDistinctValues (integer) --

                The number of distinct values in a column.

            • LongColumnStatisticsData (dict) --

              Long column statistics data.

              • MinimumValue (integer) --

                The lowest value in the column.

              • MaximumValue (integer) --

                The highest value in the column.

              • NumberOfNulls (integer) --

                The number of null values in the column.

              • NumberOfDistinctValues (integer) --

                The number of distinct values in a column.

            • StringColumnStatisticsData (dict) --

              String column statistics data.

              • MaximumLength (integer) --

                The size of the longest string in the column.

              • AverageLength (float) --

                The average string length in the column.

              • NumberOfNulls (integer) --

                The number of null values in the column.

              • NumberOfDistinctValues (integer) --

                The number of distinct values in a column.

            • BinaryColumnStatisticsData (dict) --

              Binary column statistics data.

              • MaximumLength (integer) --

                The size of the longest bit sequence in the column.

              • AverageLength (float) --

                The average bit sequence length in the column.

              • NumberOfNulls (integer) --

                The number of null values in the column.

        • Error (dict) --

          An error message with the reason for the failure of an operation.

          • ErrorCode (string) --

            The code associated with this error.

          • ErrorMessage (string) --

            A message describing the error.
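
Because per-column failures are reported in the Errors list rather than raised, a caller will usually want to inspect the response. The following is a minimal sketch of one way to do that. The summarize_errors helper is hypothetical, and sample_response is a hand-written value shaped like the response syntax above, not real service output (its error code and message are invented).

```python
def summarize_errors(response):
    """Return one (column_name, error_code, error_message) tuple per failure."""
    failures = []
    for item in response.get("Errors", []):
        column = item.get("ColumnStatistics", {}).get("ColumnName")
        error = item.get("Error", {})
        failures.append(
            (column, error.get("ErrorCode"), error.get("ErrorMessage"))
        )
    return failures


# Hand-written sample for illustration; the code and message are made up.
sample_response = {
    "Errors": [
        {
            "ColumnStatistics": {"ColumnName": "quantity", "ColumnType": "bigint"},
            "Error": {
                "ErrorCode": "ExampleErrorCode",
                "ErrorMessage": "example failure reason",
            },
        }
    ]
}

failures = summarize_errors(sample_response)
for column, code, message in failures:
    print(f"{column}: {code} ({message})")
```

An empty list from summarize_errors means that no column was reported as failed.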

Exceptions

  • Glue.Client.exceptions.EntityNotFoundException
  • Glue.Client.exceptions.InvalidInputException
  • Glue.Client.exceptions.InternalServiceException
  • Glue.Client.exceptions.OperationTimeoutException
  • Glue.Client.exceptions.GlueEncryptionException