get_column_statistics_for_partition

Glue.Client.get_column_statistics_for_partition(**kwargs)

Retrieves partition statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is GetPartition.

See also: AWS API Documentation

Request Syntax

response = client.get_column_statistics_for_partition(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    PartitionValues=[
        'string',
    ],
    ColumnNames=[
        'string',
    ]
)
Parameters:
  • CatalogId (string) – The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

  • DatabaseName (string) –

    [REQUIRED]

    The name of the catalog database where the partitions reside.

  • TableName (string) –

    [REQUIRED]

    The name of the table that contains the partitions.

  • PartitionValues (list) –

    [REQUIRED]

    A list of partition values identifying the partition.

    • (string) –

  • ColumnNames (list) –

    [REQUIRED]

    A list of the names of the columns whose statistics are to be retrieved.

    • (string) –
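
A minimal sketch of a call using the parameters above. It assumes a Glue client created elsewhere (for example with boto3.client('glue')). The helper name fetch_partition_column_stats and the database, table, and column names in the usage comment are hypothetical. CatalogId is omitted when it is not supplied, so the account ID default described above applies.

```python
def fetch_partition_column_stats(client, database, table, partition_values,
                                 column_names, catalog_id=None):
    """Call get_column_statistics_for_partition with the required parameters.

    `client` is expected to be a Glue client; this helper is illustrative only.
    """
    kwargs = {
        "DatabaseName": database,
        "TableName": table,
        # Values that identify the partition.
        "PartitionValues": list(partition_values),
        "ColumnNames": list(column_names),
    }
    # Only send CatalogId when the caller supplied one.
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id
    return client.get_column_statistics_for_partition(**kwargs)


# Usage with a real client (hypothetical resource names):
#   import boto3
#   glue = boto3.client("glue")
#   response = fetch_partition_column_stats(
#       glue, "sales_db", "orders", ["2024", "01"], ["order_id", "amount"])
```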

Return type:

dict

Returns:

Response Syntax

{
    'ColumnStatisticsList': [
        {
            'ColumnName': 'string',
            'ColumnType': 'string',
            'AnalyzedTime': datetime(2015, 1, 1),
            'StatisticsData': {
                'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                'BooleanColumnStatisticsData': {
                    'NumberOfTrues': 123,
                    'NumberOfFalses': 123,
                    'NumberOfNulls': 123
                },
                'DateColumnStatisticsData': {
                    'MinimumValue': datetime(2015, 1, 1),
                    'MaximumValue': datetime(2015, 1, 1),
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DecimalColumnStatisticsData': {
                    'MinimumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'MaximumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DoubleColumnStatisticsData': {
                    'MinimumValue': 123.0,
                    'MaximumValue': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'LongColumnStatisticsData': {
                    'MinimumValue': 123,
                    'MaximumValue': 123,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'StringColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'BinaryColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123
                }
            }
        },
    ],
    'Errors': [
        {
            'ColumnName': 'string',
            'Error': {
                'ErrorCode': 'string',
                'ErrorMessage': 'string'
            }
        },
    ]
}
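
The response shape above can be unpacked as in the following sketch. It assumes that the member of StatisticsData that carries values is the one matching Type, which the syntax suggests but this page does not state outright. The helper names typed_statistics and split_response are hypothetical.

```python
# Maps each documented Type value to the StatisticsData member for that type.
_DATA_KEY_BY_TYPE = {
    "BOOLEAN": "BooleanColumnStatisticsData",
    "DATE": "DateColumnStatisticsData",
    "DECIMAL": "DecimalColumnStatisticsData",
    "DOUBLE": "DoubleColumnStatisticsData",
    "LONG": "LongColumnStatisticsData",
    "STRING": "StringColumnStatisticsData",
    "BINARY": "BinaryColumnStatisticsData",
}


def typed_statistics(column_statistics):
    """Return the type-specific statistics dict for one ColumnStatistics entry."""
    data = column_statistics["StatisticsData"]
    # Assumption: only the member matching Type is populated.
    return data.get(_DATA_KEY_BY_TYPE[data["Type"]], {})


def split_response(response):
    """Split a response into ({column: statistics}, {column: error})."""
    stats = {
        entry["ColumnName"]: typed_statistics(entry)
        for entry in response.get("ColumnStatisticsList", [])
    }
    errors = {
        entry["ColumnName"]: entry["Error"]
        for entry in response.get("Errors", [])
    }
    return stats, errors
```

Checking both dictionaries matters because a call can return statistics for some columns and errors for others.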

Response Structure

  • (dict) –

    • ColumnStatisticsList (list) –

      List of ColumnStatistics that were retrieved.

      • (dict) –

        Represents the generated column-level statistics for a table or partition.

        • ColumnName (string) –

          The name of the column that the statistics belong to.

        • ColumnType (string) –

          The data type of the column.

        • AnalyzedTime (datetime) –

          The timestamp of when column statistics were generated.

        • StatisticsData (dict) –

          A ColumnStatisticData object that contains the statistics data values.

          • Type (string) –

            The type of column statistics data.

          • BooleanColumnStatisticsData (dict) –

            Boolean column statistics data.

            • NumberOfTrues (integer) –

              The number of true values in the column.

            • NumberOfFalses (integer) –

              The number of false values in the column.

            • NumberOfNulls (integer) –

              The number of null values in the column.

          • DateColumnStatisticsData (dict) –

            Date column statistics data.

            • MinimumValue (datetime) –

              The lowest value in the column.

            • MaximumValue (datetime) –

              The highest value in the column.

            • NumberOfNulls (integer) –

              The number of null values in the column.

            • NumberOfDistinctValues (integer) –

              The number of distinct values in a column.

          • DecimalColumnStatisticsData (dict) –

            Decimal column statistics data. Each UnscaledValue within is a Base64-encoded binary object storing the big-endian, two’s complement representation of the decimal’s unscaled value.

            • MinimumValue (dict) –

              The lowest value in the column.

              • UnscaledValue (bytes) –

                The unscaled numeric value.

              • Scale (integer) –

                The scale that determines where the decimal point falls in the unscaled value.

            • MaximumValue (dict) –

              The highest value in the column.

              • UnscaledValue (bytes) –

                The unscaled numeric value.

              • Scale (integer) –

                The scale that determines where the decimal point falls in the unscaled value.

            • NumberOfNulls (integer) –

              The number of null values in the column.

            • NumberOfDistinctValues (integer) –

              The number of distinct values in a column.

          • DoubleColumnStatisticsData (dict) –

            Double column statistics data.

            • MinimumValue (float) –

              The lowest value in the column.

            • MaximumValue (float) –

              The highest value in the column.

            • NumberOfNulls (integer) –

              The number of null values in the column.

            • NumberOfDistinctValues (integer) –

              The number of distinct values in a column.

          • LongColumnStatisticsData (dict) –

            Long column statistics data.

            • MinimumValue (integer) –

              The lowest value in the column.

            • MaximumValue (integer) –

              The highest value in the column.

            • NumberOfNulls (integer) –

              The number of null values in the column.

            • NumberOfDistinctValues (integer) –

              The number of distinct values in a column.

          • StringColumnStatisticsData (dict) –

            String column statistics data.

            • MaximumLength (integer) –

              The size of the longest string in the column.

            • AverageLength (float) –

              The average string length in the column.

            • NumberOfNulls (integer) –

              The number of null values in the column.

            • NumberOfDistinctValues (integer) –

              The number of distinct values in a column.

          • BinaryColumnStatisticsData (dict) –

            Binary column statistics data.

            • MaximumLength (integer) –

              The size of the longest bit sequence in the column.

            • AverageLength (float) –

              The average bit sequence length in the column.

            • NumberOfNulls (integer) –

              The number of null values in the column.

    • Errors (list) –

      Errors that occurred while retrieving column statistics data.

      • (dict) –

        Encapsulates a column name that failed and the reason for failure.

        • ColumnName (string) –

          The name of the column that failed.

        • Error (dict) –

          An error message with the reason for the failure of an operation.

          • ErrorCode (string) –

            The code associated with this error.

          • ErrorMessage (string) –

            A message describing the error.
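
The DecimalColumnStatisticsData values described above can be turned into Python Decimal objects as in this sketch. It assumes the client has already decoded the Base64 transport encoding, so that UnscaledValue arrives as raw bytes, as the (bytes) type in the structure indicates. The helper name decimal_from_glue is hypothetical.

```python
from decimal import Decimal


def decimal_from_glue(number):
    """Convert a {'UnscaledValue': bytes, 'Scale': int} dict to a Decimal."""
    # Big-endian, two's complement integer, as described for UnscaledValue.
    unscaled = int.from_bytes(number["UnscaledValue"], byteorder="big", signed=True)
    # Build the Decimal from (sign, digits, exponent) so that no context
    # rounding is applied, even for very large unscaled values.
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    return Decimal((sign, digits, -number["Scale"]))


# 0x3039 is 12345; with Scale 2 the value is 123.45.
print(decimal_from_glue({"UnscaledValue": b"\x30\x39", "Scale": 2}))
```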

Exceptions

  • Glue.Client.exceptions.EntityNotFoundException

  • Glue.Client.exceptions.InvalidInputException

  • Glue.Client.exceptions.InternalServiceException

  • Glue.Client.exceptions.OperationTimeoutException

  • Glue.Client.exceptions.GlueEncryptionException
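
A minimal sketch of handling one of the exceptions listed above through the client's exceptions attribute. Returning None when the entity is not found is an illustrative choice, not behavior defined by the API, and the helper name get_stats_or_none is hypothetical.

```python
def get_stats_or_none(client, **request):
    """Return the response, or None if the referenced entity does not exist.

    `client` is expected to be a Glue client; `request` holds the keyword
    arguments for get_column_statistics_for_partition.
    """
    try:
        return client.get_column_statistics_for_partition(**request)
    except client.exceptions.EntityNotFoundException:
        # The other listed exceptions are deliberately left to propagate.
        return None
```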