GetTables

class Glue.Paginator.GetTables
paginator = client.get_paginator('get_tables')
paginate(**kwargs)

Creates an iterator that will paginate through responses from Glue.Client.get_tables().

See also: AWS API Documentation

Request Syntax

response_iterator = paginator.paginate(
    CatalogId='string',
    DatabaseName='string',
    Expression='string',
    TransactionId='string',
    QueryAsOfTime=datetime(2015, 1, 1),
    PaginationConfig={
        'MaxItems': 123,
        'PageSize': 123,
        'StartingToken': 'string'
    }
)
Parameters:
  • CatalogId (string) – The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.

  • DatabaseName (string) –

    [REQUIRED]

    The database in the catalog whose tables to list. For Hive compatibility, this name is entirely lowercase.

  • Expression (string) – A regular expression pattern. If present, only those tables whose names match the pattern are returned.

  • TransactionId (string) – The transaction ID at which to read the table contents.

  • QueryAsOfTime (datetime) – The point in time as of which to read the table contents. If not set, the most recent transaction commit time is used. Cannot be specified along with TransactionId.

  • PaginationConfig (dict) –

    A dictionary that provides parameters to control pagination.

    • MaxItems (integer) –

      The total number of items to return. If more items are available than the value specified in MaxItems, a NextToken is provided in the output that you can use to resume pagination.

    • PageSize (integer) –

      The size of each page.

    • StartingToken (string) –

      A token to specify where to start paginating. This is the NextToken from a previous response.
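
As a minimal sketch of how the parameters above fit together, the helpers below assemble the keyword arguments for paginate() and collect table names across pages. The names build_paginate_kwargs and iter_table_names are illustrative and not part of boto3; the paginator argument is assumed to be the object returned by client.get_paginator('get_tables').

```python
from datetime import datetime
from typing import Any, Dict, Iterator, Optional


def build_paginate_kwargs(
    database_name: str,
    expression: Optional[str] = None,
    transaction_id: Optional[str] = None,
    query_as_of_time: Optional[datetime] = None,
    max_items: Optional[int] = None,
    page_size: Optional[int] = None,
    starting_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for paginate(), leaving out unset values."""
    if transaction_id is not None and query_as_of_time is not None:
        # The parameter docs say QueryAsOfTime cannot be combined with TransactionId.
        raise ValueError("TransactionId and QueryAsOfTime are mutually exclusive")

    kwargs: Dict[str, Any] = {"DatabaseName": database_name}  # the only required key
    if expression is not None:
        kwargs["Expression"] = expression
    if transaction_id is not None:
        kwargs["TransactionId"] = transaction_id
    if query_as_of_time is not None:
        kwargs["QueryAsOfTime"] = query_as_of_time

    # Only include PaginationConfig entries that were actually provided.
    config = {
        key: value
        for key, value in (
            ("MaxItems", max_items),
            ("PageSize", page_size),
            ("StartingToken", starting_token),
        )
        if value is not None
    }
    if config:
        kwargs["PaginationConfig"] = config
    return kwargs


def iter_table_names(paginator: Any, **kwargs: Any) -> Iterator[str]:
    """Yield the Name of every table on every page the paginator returns."""
    for page in paginator.paginate(**kwargs):
        for table in page.get("TableList", []):
            yield table["Name"]
```

With a real client this might be called as iter_table_names(paginator, **build_paginate_kwargs('my_database', page_size=50)), where my_database is a placeholder database name.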

Return type:

dict

Returns:

Response Syntax

{
    'TableList': [
        {
            'Name': 'string',
            'DatabaseName': 'string',
            'Description': 'string',
            'Owner': 'string',
            'CreateTime': datetime(2015, 1, 1),
            'UpdateTime': datetime(2015, 1, 1),
            'LastAccessTime': datetime(2015, 1, 1),
            'LastAnalyzedTime': datetime(2015, 1, 1),
            'Retention': 123,
            'StorageDescriptor': {
                'Columns': [
                    {
                        'Name': 'string',
                        'Type': 'string',
                        'Comment': 'string',
                        'Parameters': {
                            'string': 'string'
                        }
                    },
                ],
                'Location': 'string',
                'AdditionalLocations': [
                    'string',
                ],
                'InputFormat': 'string',
                'OutputFormat': 'string',
                'Compressed': True|False,
                'NumberOfBuckets': 123,
                'SerdeInfo': {
                    'Name': 'string',
                    'SerializationLibrary': 'string',
                    'Parameters': {
                        'string': 'string'
                    }
                },
                'BucketColumns': [
                    'string',
                ],
                'SortColumns': [
                    {
                        'Column': 'string',
                        'SortOrder': 123
                    },
                ],
                'Parameters': {
                    'string': 'string'
                },
                'SkewedInfo': {
                    'SkewedColumnNames': [
                        'string',
                    ],
                    'SkewedColumnValues': [
                        'string',
                    ],
                    'SkewedColumnValueLocationMaps': {
                        'string': 'string'
                    }
                },
                'StoredAsSubDirectories': True|False,
                'SchemaReference': {
                    'SchemaId': {
                        'SchemaArn': 'string',
                        'SchemaName': 'string',
                        'RegistryName': 'string'
                    },
                    'SchemaVersionId': 'string',
                    'SchemaVersionNumber': 123
                }
            },
            'PartitionKeys': [
                {
                    'Name': 'string',
                    'Type': 'string',
                    'Comment': 'string',
                    'Parameters': {
                        'string': 'string'
                    }
                },
            ],
            'ViewOriginalText': 'string',
            'ViewExpandedText': 'string',
            'TableType': 'string',
            'Parameters': {
                'string': 'string'
            },
            'CreatedBy': 'string',
            'IsRegisteredWithLakeFormation': True|False,
            'TargetTable': {
                'CatalogId': 'string',
                'DatabaseName': 'string',
                'Name': 'string',
                'Region': 'string'
            },
            'CatalogId': 'string',
            'VersionId': 'string',
            'FederatedTable': {
                'Identifier': 'string',
                'DatabaseIdentifier': 'string',
                'ConnectionName': 'string'
            }
        },
    ],

}
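
To illustrate how this nested shape can be traversed, the sketch below flattens pages into (table, column, type) tuples. The function name iter_columns and the sample pages are made up for illustration; the code also assumes that StorageDescriptor or Columns may be absent from a given table, and treats them as empty in that case.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def iter_columns(pages: Iterable[Dict[str, Any]]) -> Iterator[Tuple[str, str, str]]:
    """Yield (table name, column name, column type) for each column on each page."""
    for page in pages:
        for table in page.get("TableList", []):
            # Assumption: a table may lack StorageDescriptor or Columns.
            descriptor = table.get("StorageDescriptor", {})
            for column in descriptor.get("Columns", []):
                yield table["Name"], column["Name"], column.get("Type", "")


# Hand-written sample pages shaped like the response syntax (not real data).
sample_pages = [
    {
        "TableList": [
            {
                "Name": "orders",
                "StorageDescriptor": {
                    "Columns": [
                        {"Name": "id", "Type": "bigint"},
                        {"Name": "total", "Type": "double"},
                    ]
                },
            }
        ]
    },
    {"TableList": [{"Name": "raw_events"}]},  # no StorageDescriptor here
]

columns = list(iter_columns(sample_pages))
```

Because the function accepts any iterable of page dictionaries, the response_iterator returned by paginate() could be passed in directly.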

Response Structure

  • (dict) –

    • TableList (list) –

      A list of the requested Table objects.

      • (dict) –

        Represents a collection of related data organized in columns and rows.

        • Name (string) –

          The table name. For Hive compatibility, this must be entirely lowercase.

        • DatabaseName (string) –

          The name of the database where the table metadata resides. For Hive compatibility, this must be all lowercase.

        • Description (string) –

          A description of the table.

        • Owner (string) –

          The owner of the table.

        • CreateTime (datetime) –

          The time when the table definition was created in the Data Catalog.

        • UpdateTime (datetime) –

          The last time that the table was updated.

        • LastAccessTime (datetime) –

          The last time that the table was accessed. This is usually taken from HDFS, and might not be reliable.

        • LastAnalyzedTime (datetime) –

          The last time that column statistics were computed for this table.

        • Retention (integer) –

          The retention time for this table.

        • StorageDescriptor (dict) –

          A storage descriptor containing information about the physical storage of this table.

          • Columns (list) –

            A list of the Columns in the table.

            • (dict) –

              A column in a Table.

              • Name (string) –

                The name of the Column.

              • Type (string) –

                The data type of the Column.

              • Comment (string) –

                A free-form text comment.

              • Parameters (dict) –

                These key-value pairs define properties associated with the column.

                • (string) –

                  • (string) –

          • Location (string) –

            The physical location of the table. By default, this takes the form of the warehouse location, followed by the database location in the warehouse, followed by the table name.

          • AdditionalLocations (list) –

            A list of locations that point to the path where a Delta table is located.

            • (string) –

          • InputFormat (string) –

            The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.

          • OutputFormat (string) –

            The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.

          • Compressed (boolean) –

            True if the data in the table is compressed, or False if not.

          • NumberOfBuckets (integer) –

            Must be specified if the table contains any dimension columns.

          • SerdeInfo (dict) –

            The serialization/deserialization (SerDe) information.

            • Name (string) –

              Name of the SerDe.

            • SerializationLibrary (string) –

              Usually the class that implements the SerDe. An example is org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.

            • Parameters (dict) –

              These key-value pairs define initialization parameters for the SerDe.

              • (string) –

                • (string) –

          • BucketColumns (list) –

            A list of reducer grouping columns, clustering columns, and bucketing columns in the table.

            • (string) –

          • SortColumns (list) –

            A list specifying the sort order of each bucket in the table.

            • (dict) –

              Specifies the sort order of a sorted column.

              • Column (string) –

                The name of the column.

              • SortOrder (integer) –

                Indicates that the column is sorted in ascending order (== 1), or in descending order (== 0).

          • Parameters (dict) –

            The user-supplied properties in key-value form.

            • (string) –

              • (string) –

          • SkewedInfo (dict) –

            The information about values that appear frequently in a column (skewed values).

            • SkewedColumnNames (list) –

              A list of names of columns that contain skewed values.

              • (string) –

            • SkewedColumnValues (list) –

              A list of values that appear so frequently as to be considered skewed.

              • (string) –

            • SkewedColumnValueLocationMaps (dict) –

              A mapping of skewed values to the columns that contain them.

              • (string) –

                • (string) –

          • StoredAsSubDirectories (boolean) –

            True if the table data is stored in subdirectories, or False if not.

          • SchemaReference (dict) –

            An object that references a schema stored in the Glue Schema Registry.

            When creating a table, you can pass an empty list of columns for the schema, and instead use a schema reference.

            • SchemaId (dict) –

              A structure that contains schema identity fields. Either this or the SchemaVersionId has to be provided.

              • SchemaArn (string) –

                The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.

              • SchemaName (string) –

                The name of the schema. One of SchemaArn or SchemaName has to be provided.

              • RegistryName (string) –

                The name of the schema registry that contains the schema.

            • SchemaVersionId (string) –

              The unique ID assigned to a version of the schema. Either this or the SchemaId has to be provided.

            • SchemaVersionNumber (integer) –

              The version number of the schema.

        • PartitionKeys (list) –

          A list of columns by which the table is partitioned. Only primitive types are supported as partition keys.

          When you create a table used by Amazon Athena and you do not specify any partition keys, you must at least set the value of PartitionKeys to an empty list. For example:

          "PartitionKeys": []

          • (dict) –

            A column in a Table.

            • Name (string) –

              The name of the Column.

            • Type (string) –

              The data type of the Column.

            • Comment (string) –

              A free-form text comment.

            • Parameters (dict) –

              These key-value pairs define properties associated with the column.

              • (string) –

                • (string) –

        • ViewOriginalText (string) –

          Included for Apache Hive compatibility. Not used in the normal course of Glue operations. If the table is a VIRTUAL_VIEW, this field contains certain Athena configuration encoded in base64.

        • ViewExpandedText (string) –

          Included for Apache Hive compatibility. Not used in the normal course of Glue operations.

        • TableType (string) –

          The type of this table. Glue will create tables with the EXTERNAL_TABLE type. Other services, such as Athena, may create tables with additional table types.

          Glue-related table types:

          • EXTERNAL_TABLE – Hive-compatible attribute; indicates a non-Hive managed table.

          • GOVERNED – Used by Lake Formation. The Glue Data Catalog understands GOVERNED.

        • Parameters (dict) –

          These key-value pairs define properties associated with the table.

          • (string) –

            • (string) –

        • CreatedBy (string) –

          The person or entity who created the table.

        • IsRegisteredWithLakeFormation (boolean) –

          Indicates whether the table has been registered with Lake Formation.

        • TargetTable (dict) –

          A TableIdentifier structure that describes a target table for resource linking.

          • CatalogId (string) –

            The ID of the Data Catalog in which the table resides.

          • DatabaseName (string) –

            The name of the catalog database that contains the target table.

          • Name (string) –

            The name of the target table.

          • Region (string) –

            Region of the target table.

        • CatalogId (string) –

          The ID of the Data Catalog in which the table resides.

        • VersionId (string) –

          The ID of the table version.

        • FederatedTable (dict) –

          A FederatedTable structure that references an entity outside the Glue Data Catalog.

          • Identifier (string) –

            A unique identifier for the federated table.

          • DatabaseIdentifier (string) –

            A unique identifier for the federated database.

          • ConnectionName (string) –

            The name of the connection to the external metastore.
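
As a rough sketch of how several of the fields described above might be read together, the helper below condenses one Table dictionary into a short summary. The name summarize_table, the summary keys, and the sample table are hypothetical. Treating the presence of TargetTable as a sign of a resource link is an assumption based on the field description, and any SortOrder value other than 1 or 0 is labeled "unknown".

```python
from typing import Any, Dict

# SortOrder is documented as 1 for ascending and 0 for descending.
SORT_ORDER_LABELS = {1: "ascending", 0: "descending"}


def summarize_table(table: Dict[str, Any]) -> Dict[str, Any]:
    """Condense a Table dict from TableList into a few commonly used fields."""
    descriptor = table.get("StorageDescriptor", {})
    return {
        "name": table.get("Name"),
        "database": table.get("DatabaseName"),
        "type": table.get("TableType"),
        "location": descriptor.get("Location"),
        "partition_keys": [key["Name"] for key in table.get("PartitionKeys", [])],
        "sort_columns": [
            (item["Column"], SORT_ORDER_LABELS.get(item["SortOrder"], "unknown"))
            for item in descriptor.get("SortColumns", [])
        ],
        # Assumption: a TargetTable entry indicates a resource link.
        "is_resource_link": "TargetTable" in table,
    }


# Hand-written sample table (not real data).
sample_table = {
    "Name": "orders",
    "DatabaseName": "sales",
    "TableType": "EXTERNAL_TABLE",
    "StorageDescriptor": {
        "Location": "s3://example-bucket/orders/",
        "SortColumns": [{"Column": "order_date", "SortOrder": 0}],
    },
    "PartitionKeys": [{"Name": "region", "Type": "string"}],
}

summary = summarize_table(sample_table)
```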