create_graph_using_import_task

NeptuneGraph.Client.create_graph_using_import_task(**kwargs)

Creates a new Neptune Analytics graph and imports data into it, either from Amazon Simple Storage Service (S3) or from a Neptune database or a Neptune database snapshot.

The data can be loaded from files in S3 that are in either the Gremlin CSV format or the openCypher load format.

Request Syntax

response = client.create_graph_using_import_task(
    graphName='string',
    tags={
        'string': 'string'
    },
    publicConnectivity=True|False,
    kmsKeyIdentifier='string',
    vectorSearchConfiguration={
        'dimension': 123
    },
    replicaCount=123,
    deletionProtection=True|False,
    importOptions={
        'neptune': {
            's3ExportPath': 'string',
            's3ExportKmsKeyId': 'string',
            'preserveDefaultVertexLabels': True|False,
            'preserveEdgeIds': True|False
        }
    },
    maxProvisionedMemory=123,
    minProvisionedMemory=123,
    failOnError=True|False,
    source='string',
    format='CSV'|'OPEN_CYPHER'|'PARQUET'|'NTRIPLES',
    parquetType='COLUMNAR',
    blankNodeHandling='convertToIri',
    roleArn='string'
)

Parameters:

graphName (string) –
[REQUIRED]

A name for the new Neptune Analytics graph to be created.

The name must contain from 1 to 63 letters, numbers, or hyphens, and its first character must be a letter. It cannot end with a hyphen or contain two consecutive hyphens. Only lowercase letters are allowed.
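
The naming rules above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the SDK: the helper name is_valid_graph_name is hypothetical, and the service remains the authority on what it accepts.

```python
import re

# One lowercase letter, then any number of "optional hyphen + lowercase
# letter or digit" groups. This rules out a leading digit or hyphen, a
# trailing hyphen, and two consecutive hyphens.
_GRAPH_NAME_PATTERN = re.compile(r"[a-z](?:-?[a-z0-9])*")


def is_valid_graph_name(name: str) -> bool:
    """Return True if name satisfies the documented graphName rules."""
    if not 1 <= len(name) <= 63:
        return False
    return _GRAPH_NAME_PATTERN.fullmatch(name) is not None
```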
tags (dict) –
Adds metadata tags to the new graph. These tags can also be used with cost allocation reporting, or used in a Condition statement in an IAM policy.
- (string) –
  - (string) –
publicConnectivity (boolean) – Specifies whether the graph can be reached over the internet (true to enable, false to disable). All access to graphs is IAM authenticated.
kmsKeyIdentifier (string) – Specifies a KMS key to use to encrypt data imported into the new graph.
vectorSearchConfiguration (dict) –
Specifies the number of dimensions for vector embeddings that will be loaded into the graph. The value is specified as dimension=value. Max = 65,535.
- dimension (integer) – [REQUIRED]
  
  The number of dimensions.
replicaCount (integer) –
The number of replicas in other AZs to provision on the new graph after import. Default = 0, Min = 0, Max = 2.

Warning
Additional charges equivalent to the m-NCUs selected for the graph apply for each replica.
deletionProtection (boolean) – Indicates whether or not to enable deletion protection on the graph (true or false). The graph can’t be deleted when deletion protection is enabled.
importOptions (dict) –
Contains options for controlling the import process. For example, if the failOnError key is set to false, the import skips problem data and attempts to continue. If it is set to true (the default) or omitted, the import operation halts immediately when an error is encountered.

Note
This is a Tagged Union structure. Only one of the following top level keys can be set: neptune.
- neptune (dict) –
  
  Options for importing data from a Neptune database.
  - s3ExportPath (string) – [REQUIRED]
    
    The path to an S3 bucket from which to import data.
  - s3ExportKmsKeyId (string) – [REQUIRED]
    
    The KMS key to use to encrypt data in the S3 bucket where the graph data is exported.
  - preserveDefaultVertexLabels (boolean) –
    
    Neptune Analytics supports label-less vertices and no labels are assigned unless one is explicitly provided. Neptune assigns default labels when none is explicitly provided. When importing the data into Neptune Analytics, the default vertex labels can be omitted by setting preserveDefaultVertexLabels to false. Note that if the vertex only has default labels, and has no other properties or edges, then the vertex will effectively not get imported into Neptune Analytics when preserveDefaultVertexLabels is set to false.
  - preserveEdgeIds (boolean) –
    
    Neptune Analytics currently does not support user defined edge ids. The edge ids are not imported by default. They are imported if preserveEdgeIds is set to true, and ids are stored as properties on the relationships with the property name neptuneEdgeId.
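
As an illustration of the importOptions structure described above, the sketch below assembles the neptune member from its two required keys and two optional flags. The helper name neptune_import_options and its argument names are hypothetical; only the dictionary keys come from the request syntax.

```python
def neptune_import_options(
    s3_export_path,
    s3_export_kms_key_id,
    preserve_default_vertex_labels=None,
    preserve_edge_ids=None,
):
    """Build the importOptions dict for an import from a Neptune database."""
    neptune = {
        "s3ExportPath": s3_export_path,
        "s3ExportKmsKeyId": s3_export_kms_key_id,
    }
    # Optional flags are sent only when the caller sets them, so the
    # service-side defaults apply otherwise.
    if preserve_default_vertex_labels is not None:
        neptune["preserveDefaultVertexLabels"] = preserve_default_vertex_labels
    if preserve_edge_ids is not None:
        neptune["preserveEdgeIds"] = preserve_edge_ids
    # importOptions is a tagged union: "neptune" is the only top-level key.
    return {"neptune": neptune}
```

The returned dictionary would be passed as the importOptions argument of create_graph_using_import_task.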
maxProvisionedMemory (integer) –
The maximum provisioned memory-optimized Neptune Capacity Units (m-NCUs) to use for the graph. Default: 1024, or the approved upper limit for your account.

If both the minimum and maximum values are specified, the final provisioned-memory will be chosen per the actual size of your imported data. If neither value is specified, 128 m-NCUs are used.
minProvisionedMemory (integer) – The minimum provisioned memory-optimized Neptune Capacity Units (m-NCUs) to use for the graph. Default: 16
failOnError (boolean) – If set to true, the task halts when an import error is encountered. If set to false, the task skips the data that caused the error and continues if possible.
source (string) –
[REQUIRED]

A URL identifying the location of the data to be imported. This can be an Amazon S3 path, or can point to a Neptune database endpoint or snapshot.
format (string) – Specifies the format of S3 data to be imported. Valid values are CSV, which identifies the Gremlin CSV format, OPEN_CYPHER, which identifies the openCypher load format, and NTRIPLES, which identifies the RDF n-triples format. The request syntax also lists PARQUET as an accepted value.
parquetType (string) – The parquet type of the import task.
blankNodeHandling (string) – The method to handle blank nodes in the dataset. Currently, only convertToIri is supported, meaning blank nodes are converted to unique IRIs at load time. Must be provided when format is NTRIPLES. For more information, see Handling RDF values.
roleArn (string) –
[REQUIRED]

The ARN of the IAM role that will allow access to the data that is to be imported.
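
To show how the required and optional parameters fit together, here is a minimal sketch that builds the keyword arguments and passes them to the client. The helper names build_import_request and start_import are hypothetical, and every value in the commented usage lines (bucket, role ARN, graph name) is a placeholder.

```python
def build_import_request(
    graph_name,
    source,
    role_arn,
    data_format=None,
    fail_on_error=None,
    min_memory=None,
    max_memory=None,
):
    """Assemble kwargs for create_graph_using_import_task."""
    # graphName, source, and roleArn are the three required parameters.
    request = {"graphName": graph_name, "source": source, "roleArn": role_arn}
    if data_format is not None:
        request["format"] = data_format
        # blankNodeHandling must be provided when the format is NTRIPLES,
        # and convertToIri is the only supported value.
        if data_format == "NTRIPLES":
            request["blankNodeHandling"] = "convertToIri"
    if fail_on_error is not None:
        request["failOnError"] = fail_on_error
    if min_memory is not None:
        request["minProvisionedMemory"] = min_memory
    if max_memory is not None:
        request["maxProvisionedMemory"] = max_memory
    return request


def start_import(client, **kwargs):
    """Call the API with a request built by build_import_request."""
    return client.create_graph_using_import_task(**build_import_request(**kwargs))


# Usage (placeholder values, requires AWS credentials):
#   import boto3
#   client = boto3.client("neptune-graph")
#   response = start_import(
#       client,
#       graph_name="my-graph",
#       source="s3://example-bucket/data/",
#       role_arn="arn:aws:iam::123456789012:role/ExampleImportRole",
#       data_format="CSV",
#   )
#   print(response["graphId"], response["taskId"])
```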

Return type:

dict

Returns:

Response Syntax

{
    'graphId': 'string',
    'taskId': 'string',
    'source': 'string',
    'format': 'CSV'|'OPEN_CYPHER'|'PARQUET'|'NTRIPLES',
    'parquetType': 'COLUMNAR',
    'roleArn': 'string',
    'status': 'INITIALIZING'|'EXPORTING'|'ANALYZING_DATA'|'IMPORTING'|'REPROVISIONING'|'ROLLING_BACK'|'SUCCEEDED'|'FAILED'|'CANCELLING'|'CANCELLED'|'DELETED',
    'importOptions': {
        'neptune': {
            's3ExportPath': 'string',
            's3ExportKmsKeyId': 'string',
            'preserveDefaultVertexLabels': True|False,
            'preserveEdgeIds': True|False
        }
    }
}

Response Structure

(dict) –
- graphId (string) –
  
  The unique identifier of the Neptune Analytics graph.
- taskId (string) –
  
  The unique identifier of the import task.
- source (string) –
  
  A URL identifying the location of the data to be imported. This can be an Amazon S3 path, or can point to a Neptune database endpoint or snapshot.
- format (string) –
  
  Specifies the format of S3 data to be imported. Valid values are CSV, which identifies the Gremlin CSV format, OPEN_CYPHER, which identifies the openCypher load format, and NTRIPLES, which identifies the RDF n-triples format.
- parquetType (string) –
  
  The parquet type of the import task.
- roleArn (string) –
  
  The ARN of the IAM role that will allow access to the data that is to be imported.
- status (string) –
  
  The status of the import task.
- importOptions (dict) –
  
  Contains options for controlling the import process. For example, if the failOnError key is set to false, the import skips problem data and attempts to continue. If it is set to true (the default) or omitted, the import operation halts immediately when an error is encountered.
  Note
  This is a Tagged Union structure. Only one of the following top level keys will be set: neptune. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
  'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
  - neptune (dict) –
    
    Options for importing data from a Neptune database.
    - s3ExportPath (string) –
      
      The path to an S3 bucket from which to import data.
    - s3ExportKmsKeyId (string) –
      
      The KMS key to use to encrypt data in the S3 bucket where the graph data is exported.
    - preserveDefaultVertexLabels (boolean) –
      
      Neptune Analytics supports label-less vertices and no labels are assigned unless one is explicitly provided. Neptune assigns default labels when none is explicitly provided. When importing the data into Neptune Analytics, the default vertex labels can be omitted by setting preserveDefaultVertexLabels to false. Note that if the vertex only has default labels, and has no other properties or edges, then the vertex will effectively not get imported into Neptune Analytics when preserveDefaultVertexLabels is set to false.
    - preserveEdgeIds (boolean) –
      
      Neptune Analytics currently does not support user defined edge ids. The edge ids are not imported by default. They are imported if preserveEdgeIds is set to true, and ids are stored as properties on the relationships with the property name neptuneEdgeId.
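
The response carries only the initial status of the import task, so a caller usually has to poll until the task finishes. The sketch below is one way to do that, under two assumptions that are not stated in this section: that the client exposes get_import_task with a taskIdentifier argument and a status field in its response, and that SUCCEEDED, FAILED, CANCELLED, and DELETED are the final statuses. The helper name wait_for_import is hypothetical.

```python
import time

# Assumption: these values from the status enum are final states.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELLED", "DELETED"}


def wait_for_import(client, task_id, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll the import task until it reaches a final status, then return it."""
    status = None
    for _ in range(max_polls):
        # Assumption: get_import_task(taskIdentifier=...) returns a dict
        # that includes the same 'status' field as the create response.
        status = client.get_import_task(taskIdentifier=task_id)["status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"import task {task_id} still {status} after {max_polls} polls")
```

The sleep argument is injectable so the loop can be exercised without real delays. The taskId returned by create_graph_using_import_task would be passed as task_id.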

Exceptions

NeptuneGraph.Client.exceptions.ServiceQuotaExceededException
NeptuneGraph.Client.exceptions.ThrottlingException
NeptuneGraph.Client.exceptions.ValidationException
NeptuneGraph.Client.exceptions.ConflictException
NeptuneGraph.Client.exceptions.InternalServerException
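
The exceptions listed above are exposed on the client's exceptions attribute, so they can be caught without extra imports. The following sketch retries only ThrottlingException with exponential backoff and lets the other exceptions propagate. The helper name create_with_retry, the attempt count, and the delays are illustrative choices, and the SDK's own retry configuration may already cover this case.

```python
import time


def create_with_retry(client, request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call create_graph_using_import_task, backing off when throttled."""
    for attempt in range(1, max_attempts + 1):
        try:
            return client.create_graph_using_import_task(**request)
        except client.exceptions.ThrottlingException:
            # Give up after the last attempt; otherwise wait 1x, 2x, 4x, ...
            # the base delay before trying again.
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

ValidationException, ConflictException, and ServiceQuotaExceededException are not retried here because repeating the same request is unlikely to change the outcome.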