get_knowledge_base_documents

AgentsforBedrock.Client.get_knowledge_base_documents(**kwargs)

Retrieves specific documents from a data source that is connected to a knowledge base. For more information, see Ingest documents into a knowledge base in real-time in the Amazon Bedrock User Guide.

See also: AWS API Documentation

Request Syntax

response = client.get_knowledge_base_documents(
    dataSourceId='string',
    documentIdentifiers=[
        {
            'custom': {
                'id': 'string'
            },
            'dataSourceType': 'CUSTOM'|'S3',
            's3': {
                'uri': 'string'
            }
        },
    ],
    knowledgeBaseId='string'
)
Parameters:
  • dataSourceId (string) –

    [REQUIRED]

    The unique identifier of the data source that contains the documents.

  • documentIdentifiers (list) –

    [REQUIRED]

    A list of objects, each of which contains information to identify a document for which to retrieve information.

    • (dict) –

      Contains information that identifies the document.

      • custom (dict) –

        Contains information that identifies the document in a custom data source.

        • id (string) – [REQUIRED]

          The identifier of the document in the custom data source.

      • dataSourceType (string) – [REQUIRED]

        The type of data source connected to the knowledge base that contains the document.

      • s3 (dict) –

        Contains information that identifies the document in an S3 data source.

        • uri (string) – [REQUIRED]

          The location’s URI. For example, s3://my-bucket/chunk-processor/.

  • knowledgeBaseId (string) –

    [REQUIRED]

    The unique identifier of the knowledge base that is connected to the data source.
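
As a minimal sketch of how the parameters above fit together, the following builds one identifier per document and passes them to the call. The helper names (`s3_identifier`, `custom_identifier`, `fetch_document_details`) are hypothetical and only the dict shapes come from the request syntax. The client is passed in, so the sketch does not assume how credentials or regions are configured.

```python
def s3_identifier(uri):
    """Identify a document in an S3 data source by its URI."""
    return {"dataSourceType": "S3", "s3": {"uri": uri}}


def custom_identifier(document_id):
    """Identify a document in a custom data source by its ID."""
    return {"dataSourceType": "CUSTOM", "custom": {"id": document_id}}


def fetch_document_details(client, knowledge_base_id, data_source_id, identifiers):
    """Call get_knowledge_base_documents and return the documentDetails list.

    `client` is expected to be a boto3 "bedrock-agent" client (or any object
    exposing the same method); all three request parameters are required.
    """
    response = client.get_knowledge_base_documents(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        documentIdentifiers=list(identifiers),
    )
    return response["documentDetails"]
```

With a real client this might be used as `fetch_document_details(boto3.client("bedrock-agent"), kb_id, ds_id, [s3_identifier(uri)])`, where `kb_id`, `ds_id`, and `uri` are placeholders for your own values.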

Return type:

dict

Returns:

Response Syntax

{
    'documentDetails': [
        {
            'dataSourceId': 'string',
            'identifier': {
                'custom': {
                    'id': 'string'
                },
                'dataSourceType': 'CUSTOM'|'S3',
                's3': {
                    'uri': 'string'
                }
            },
            'knowledgeBaseId': 'string',
            'status': 'INDEXED'|'PARTIALLY_INDEXED'|'PENDING'|'FAILED'|'METADATA_PARTIALLY_INDEXED'|'METADATA_UPDATE_FAILED'|'IGNORED'|'NOT_FOUND'|'STARTING'|'IN_PROGRESS'|'DELETING'|'DELETE_IN_PROGRESS',
            'statusReason': 'string',
            'updatedAt': datetime(2015, 1, 1)
        },
    ]
}

Response Structure

  • (dict) –

    • documentDetails (list) –

      A list of objects, each of which contains information about a document that was retrieved.

      • (dict) –

        Contains the details for a document that was ingested or deleted.

        • dataSourceId (string) –

          The identifier of the data source connected to the knowledge base that the document was ingested into or deleted from.

        • identifier (dict) –

          Contains information that identifies the document.

          • custom (dict) –

            Contains information that identifies the document in a custom data source.

            • id (string) –

              The identifier of the document in the custom data source.

          • dataSourceType (string) –

            The type of data source connected to the knowledge base that contains the document.

          • s3 (dict) –

            Contains information that identifies the document in an S3 data source.

            • uri (string) –

              The location’s URI. For example, s3://my-bucket/chunk-processor/.

        • knowledgeBaseId (string) –

          The identifier of the knowledge base that the document was ingested into or deleted from.

        • status (string) –

          The ingestion status of the document. The following statuses are possible:

          • STARTING – You submitted the ingestion job containing the document.

          • PENDING – The document is waiting to be ingested.

          • IN_PROGRESS – The document is being ingested.

          • INDEXED – The document was successfully indexed.

          • PARTIALLY_INDEXED – The document was partially indexed.

          • METADATA_PARTIALLY_INDEXED – You submitted metadata for an existing document and it was partially indexed.

          • METADATA_UPDATE_FAILED – You submitted a metadata update for an existing document but it failed.

          • FAILED – The document failed to be ingested.

          • NOT_FOUND – The document wasn’t found.

          • IGNORED – The document was ignored during ingestion.

          • DELETING – You submitted the delete job containing the document.

          • DELETE_IN_PROGRESS – The document is being deleted.

        • statusReason (string) –

          The reason for the status. Appears alongside the status IGNORED.

        • updatedAt (datetime) –

          The date and time at which the document was last updated.
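
The response structure above can be summarized per status with a few lines of Python. This is an illustrative sketch, not part of the SDK: `document_key`, `group_by_status`, and `ignored_reasons` are hypothetical helper names, and the input is assumed to be the `documentDetails` list exactly as described above.

```python
from collections import defaultdict


def document_key(identifier):
    """Return the S3 URI or the custom ID that names the document."""
    if identifier["dataSourceType"] == "S3":
        return identifier["s3"]["uri"]
    return identifier["custom"]["id"]


def group_by_status(document_details):
    """Map each status value to the list of document keys reporting it."""
    grouped = defaultdict(list)
    for detail in document_details:
        grouped[detail["status"]].append(document_key(detail["identifier"]))
    return dict(grouped)


def ignored_reasons(document_details):
    """Map each IGNORED document to its statusReason (empty string if absent)."""
    return {
        document_key(detail["identifier"]): detail.get("statusReason", "")
        for detail in document_details
        if detail["status"] == "IGNORED"
    }
```

Grouping by status makes it easy to check, for example, whether any document is still `PENDING` or `IN_PROGRESS` before relying on query results.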

Exceptions

  • AgentsforBedrock.Client.exceptions.ThrottlingException

  • AgentsforBedrock.Client.exceptions.AccessDeniedException

  • AgentsforBedrock.Client.exceptions.ValidationException

  • AgentsforBedrock.Client.exceptions.InternalServerException

  • AgentsforBedrock.Client.exceptions.ResourceNotFoundException

  • AgentsforBedrock.Client.exceptions.ServiceQuotaExceededException
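
The exceptions listed above are exposed on the client object, so they can be caught without extra imports. The sketch below retries only on `ThrottlingException` with exponential backoff and lets every other exception propagate. The function name, the attempt count, and the delay values are assumptions chosen for illustration; the SDK's own retry configuration may already cover some of these cases.

```python
import time


def get_documents_with_retry(client, max_attempts=3, base_delay=1.0,
                             sleep=time.sleep, **request):
    """Call get_knowledge_base_documents, retrying when throttled.

    `request` holds the keyword arguments of the API call. `sleep` is
    injectable so the backoff can be observed or skipped in tests.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return client.get_knowledge_base_documents(**request)
        except client.exceptions.ThrottlingException:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            # Wait base_delay, then 2 * base_delay, then 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Errors such as `ValidationException` or `ResourceNotFoundException` usually indicate a problem with the request itself, which is why this sketch does not retry them.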