GetPartitions

class Glue.Paginator.GetPartitions
paginator = client.get_paginator('get_partitions')
paginate(**kwargs)

Creates an iterator that will paginate through responses from Glue.Client.get_partitions().

See also: AWS API Documentation

Request Syntax

response_iterator = paginator.paginate(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    Expression='string',
    Segment={
        'SegmentNumber': 123,
        'TotalSegments': 123
    },
    ExcludeColumnSchema=True|False,
    TransactionId='string',
    QueryAsOfTime=datetime(2015, 1, 1),
    PaginationConfig={
        'MaxItems': 123,
        'PageSize': 123,
        'StartingToken': 'string'
    }
)
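As a minimal sketch of the request syntax above, the helper below passes the two required arguments and an optional Expression to paginate() and yields each partition from every page. The helper name iter_partitions and the database, table, and partition key names in the usage comment are illustrative assumptions, not part of the Glue API. The paginator is passed in, so the function works with any object that exposes paginate(**kwargs).

```python
def iter_partitions(paginator, database_name, table_name, expression=None):
    """Yield every partition dict from all pages (hypothetical helper)."""
    kwargs = {"DatabaseName": database_name, "TableName": table_name}
    if expression is not None:
        # Expression uses SQL WHERE-like syntax over the partition keys.
        kwargs["Expression"] = expression
    for page in paginator.paginate(**kwargs):
        # Each page is a dict that carries a 'Partitions' list.
        yield from page.get("Partitions", [])


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   paginator = boto3.client("glue").get_paginator("get_partitions")
#   for p in iter_partitions(paginator, "sales_db", "orders", "year = '2024'"):
#       print(p["Values"])
```

Leaving Expression out of the call when it is not needed avoids sending an empty filter string.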
Parameters:
  • CatalogId (string) – The ID of the Data Catalog where the partitions in question reside. If none is provided, the Amazon Web Services account ID is used by default.

  • DatabaseName (string) –

    [REQUIRED]

    The name of the catalog database where the partitions reside.

  • TableName (string) –

    [REQUIRED]

    The name of the partitions’ table.

  • Expression (string) –

    An expression that filters the partitions to be returned.

    The expression uses SQL syntax similar to the SQL WHERE filter clause. The SQL statement parser JSQLParser parses the expression.

    Operators: The following are the operators that you can use in the Expression API call:

    =

    Checks whether the values of the two operands are equal; if yes, then the condition becomes true.

    Example: Assume ‘variable a’ holds 10 and ‘variable b’ holds 20.

    (a = b) is not true.

    < >

    Checks whether the values of two operands are equal; if the values are not equal, then the condition becomes true.

    Example: (a < > b) is true.

    >

    Checks whether the value of the left operand is greater than the value of the right operand; if yes, then the condition becomes true.

    Example: (a > b) is not true.

    <

    Checks whether the value of the left operand is less than the value of the right operand; if yes, then the condition becomes true.

    Example: (a < b) is true.

    >=

    Checks whether the value of the left operand is greater than or equal to the value of the right operand; if yes, then the condition becomes true.

    Example: (a >= b) is not true.

    <=

    Checks whether the value of the left operand is less than or equal to the value of the right operand; if yes, then the condition becomes true.

    Example: (a <= b) is true.

    AND, OR, IN, BETWEEN, LIKE, NOT, IS NULL

    Logical operators.

    Supported Partition Key Types: The following are the supported partition keys.

    • string

    • date

    • timestamp

    • int

    • bigint

    • long

    • tinyint

    • smallint

    • decimal

    If a type is encountered that is not valid, an exception is thrown.

    The following list shows the valid operators on each type. When you define a crawler, the partitionKey type is created as a STRING, to be compatible with the catalog partitions.

    Sample API Call:

  • Segment (dict) –

    The segment of the table’s partitions to scan in this request.

    • SegmentNumber (integer) – [REQUIRED]

      The zero-based index number of the segment. For example, if the total number of segments is 4, SegmentNumber values range from 0 through 3.

    • TotalSegments (integer) – [REQUIRED]

      The total number of segments.

  • ExcludeColumnSchema (boolean) – When true, the partition column schema is not returned. Useful when you are interested only in other partition attributes such as partition values or location. Omitting the schema keeps the response small because duplicate data is not returned.

  • TransactionId (string) – The transaction ID at which to read the partition contents.

  • QueryAsOfTime (datetime) – The point in time as of which to read the partition contents. If not set, the most recent transaction commit time is used. Cannot be specified along with TransactionId.

  • PaginationConfig (dict) –

    A dictionary that provides parameters to control pagination.

    • MaxItems (integer) –

      The total number of items to return. If more items are available than the value specified in MaxItems, a NextToken is provided in the output that you can use to resume pagination.

    • PageSize (integer) –

      The size of each page.

    • StartingToken (string) –

      A token to specify where to start paginating. This is the NextToken from a previous response.
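The Segment and PaginationConfig parameters described above can be combined to split a scan into independent requests, one per segment. The following is a sketch only: segment_requests is a hypothetical helper, and it does not enforce any service-side limit on TotalSegments. It builds one keyword-argument dict per zero-based segment, each ready to pass to paginate(**kwargs).

```python
def segment_requests(database_name, table_name, total_segments, page_size=None):
    """Build one paginate() kwargs dict per segment (hypothetical helper)."""
    if total_segments < 1:
        raise ValueError("total_segments must be at least 1")
    requests = []
    # SegmentNumber is zero-based: 0 through total_segments - 1.
    for segment_number in range(total_segments):
        kwargs = {
            "DatabaseName": database_name,
            "TableName": table_name,
            "Segment": {
                "SegmentNumber": segment_number,
                "TotalSegments": total_segments,
            },
        }
        if page_size is not None:
            # PageSize controls how many items each page holds.
            kwargs["PaginationConfig"] = {"PageSize": page_size}
        requests.append(kwargs)
    return requests


# Usage (each dict could be handed to a separate worker):
#   for kwargs in segment_requests("sales_db", "orders", 4, page_size=100):
#       for page in paginator.paginate(**kwargs):
#           ...
```

Building the request dicts up front keeps the segment arithmetic in one place and leaves the choice of threads, processes, or sequential calls to the caller.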

Return type:

dict

Returns:

Response Syntax

{
    'Partitions': [
        {
            'Values': [
                'string',
            ],
            'DatabaseName': 'string',
            'TableName': 'string',
            'CreationTime': datetime(2015, 1, 1),
            'LastAccessTime': datetime(2015, 1, 1),
            'StorageDescriptor': {
                'Columns': [
                    {
                        'Name': 'string',
                        'Type': 'string',
                        'Comment': 'string',
                        'Parameters': {
                            'string': 'string'
                        }
                    },
                ],
                'Location': 'string',
                'AdditionalLocations': [
                    'string',
                ],
                'InputFormat': 'string',
                'OutputFormat': 'string',
                'Compressed': True|False,
                'NumberOfBuckets': 123,
                'SerdeInfo': {
                    'Name': 'string',
                    'SerializationLibrary': 'string',
                    'Parameters': {
                        'string': 'string'
                    }
                },
                'BucketColumns': [
                    'string',
                ],
                'SortColumns': [
                    {
                        'Column': 'string',
                        'SortOrder': 123
                    },
                ],
                'Parameters': {
                    'string': 'string'
                },
                'SkewedInfo': {
                    'SkewedColumnNames': [
                        'string',
                    ],
                    'SkewedColumnValues': [
                        'string',
                    ],
                    'SkewedColumnValueLocationMaps': {
                        'string': 'string'
                    }
                },
                'StoredAsSubDirectories': True|False,
                'SchemaReference': {
                    'SchemaId': {
                        'SchemaArn': 'string',
                        'SchemaName': 'string',
                        'RegistryName': 'string'
                    },
                    'SchemaVersionId': 'string',
                    'SchemaVersionNumber': 123
                }
            },
            'Parameters': {
                'string': 'string'
            },
            'LastAnalyzedTime': datetime(2015, 1, 1),
            'CatalogId': 'string'
        },
    ],
}
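To illustrate how the response syntax above is typically consumed, the sketch below flattens an iterable of response pages into (values, location) pairs. The function name partition_locations is an assumption made for this example. It treats StorageDescriptor and Location as optional, so a partition without them yields None for the location.

```python
def partition_locations(pages):
    """Return (values, location) tuples from response pages (hypothetical helper)."""
    rows = []
    for page in pages:
        for partition in page.get("Partitions", []):
            # StorageDescriptor may be absent; fall back to an empty dict.
            descriptor = partition.get("StorageDescriptor", {})
            values = tuple(partition.get("Values", []))
            rows.append((values, descriptor.get("Location")))
    return rows


# Usage: rows = partition_locations(paginator.paginate(DatabaseName=..., TableName=...))
```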

Response Structure

  • (dict) –

    • Partitions (list) –

      A list of requested partitions.

      • (dict) –

        Represents a slice of table data.

        • Values (list) –

          The values of the partition.

          • (string) –

        • DatabaseName (string) –

          The name of the catalog database that contains the partition.

        • TableName (string) –

          The name of the database table that contains the partition.

        • CreationTime (datetime) –

          The time at which the partition was created.

        • LastAccessTime (datetime) –

          The last time at which the partition was accessed.

        • StorageDescriptor (dict) –

          Provides information about the physical location where the partition is stored.

          • Columns (list) –

            A list of the Columns in the table.

            • (dict) –

              A column in a Table.

              • Name (string) –

                The name of the Column.

              • Type (string) –

                The data type of the Column.

              • Comment (string) –

                A free-form text comment.

              • Parameters (dict) –

                These key-value pairs define properties associated with the column.

                • (string) –

                  • (string) –

          • Location (string) –

            The physical location of the table. By default, this takes the form of the warehouse location, followed by the database location in the warehouse, followed by the table name.

          • AdditionalLocations (list) –

            A list of locations that point to the path where a Delta table is located.

            • (string) –

          • InputFormat (string) –

            The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.

          • OutputFormat (string) –

            The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.

          • Compressed (boolean) –

            True if the data in the table is compressed, or False if not.

          • NumberOfBuckets (integer) –

            Must be specified if the table contains any dimension columns.

          • SerdeInfo (dict) –

            The serialization/deserialization (SerDe) information.

            • Name (string) –

              Name of the SerDe.

            • SerializationLibrary (string) –

              Usually the class that implements the SerDe. An example is org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.

            • Parameters (dict) –

              These key-value pairs define initialization parameters for the SerDe.

              • (string) –

                • (string) –

          • BucketColumns (list) –

            A list of reducer grouping columns, clustering columns, and bucketing columns in the table.

            • (string) –

          • SortColumns (list) –

            A list specifying the sort order of each bucket in the table.

            • (dict) –

              Specifies the sort order of a sorted column.

              • Column (string) –

                The name of the column.

              • SortOrder (integer) –

                Indicates that the column is sorted in ascending order (== 1), or in descending order (== 0).

          • Parameters (dict) –

            The user-supplied properties in key-value form.

            • (string) –

              • (string) –

          • SkewedInfo (dict) –

            The information about values that appear frequently in a column (skewed values).

            • SkewedColumnNames (list) –

              A list of names of columns that contain skewed values.

              • (string) –

            • SkewedColumnValues (list) –

              A list of values that appear so frequently as to be considered skewed.

              • (string) –

            • SkewedColumnValueLocationMaps (dict) –

              A mapping of skewed values to the columns that contain them.

              • (string) –

                • (string) –

          • StoredAsSubDirectories (boolean) –

            True if the table data is stored in subdirectories, or False if not.

          • SchemaReference (dict) –

            An object that references a schema stored in the Glue Schema Registry.

            When creating a table, you can pass an empty list of columns for the schema, and instead use a schema reference.

            • SchemaId (dict) –

              A structure that contains schema identity fields. Either this or the SchemaVersionId has to be provided.

              • SchemaArn (string) –

                The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.

              • SchemaName (string) –

                The name of the schema. One of SchemaArn or SchemaName has to be provided.

              • RegistryName (string) –

                The name of the schema registry that contains the schema.

            • SchemaVersionId (string) –

              The unique ID assigned to a version of the schema. Either this or the SchemaId has to be provided.

            • SchemaVersionNumber (integer) –

              The version number of the schema.

        • Parameters (dict) –

          These key-value pairs define partition parameters.

          • (string) –

            • (string) –

        • LastAnalyzedTime (datetime) –

          The last time at which column statistics were computed for this partition.

        • CatalogId (string) –

          The ID of the Data Catalog in which the partition resides.
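The Values list in each partition holds values only, without the partition key names. The sketch below pairs them with key names supplied by the caller, under the assumption that the values are ordered like the table's partition keys. Both the helper name partition_as_dict and the key names in the usage comment are hypothetical; this response does not include the key names.

```python
def partition_as_dict(partition, key_names):
    """Map partition key names to this partition's values (hypothetical helper)."""
    values = partition.get("Values", [])
    if len(values) != len(key_names):
        # A length mismatch means the key names do not describe this partition.
        raise ValueError(
            f"expected {len(key_names)} values, got {len(values)}"
        )
    return dict(zip(key_names, values))


# Usage, assuming the table is partitioned by year and month:
#   partition_as_dict({"Values": ["2024", "01"]}, ["year", "month"])
```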