Configuration

The lexical-graph provides a GraphRAGConfig object that allows you to configure the LLMs and embedding models used by the indexing and retrieval processes, as well as the parallel and batch processing behaviours of the indexing pipelines. (The lexical-graph doesn’t use the LlamaIndex Settings object: attributes configured in Settings have no effect on the graphrag-toolkit.)

The lexical-graph also allows you to set the logging level and apply logging filters from within your application.

GraphRAGConfig is a module-level singleton (not a class to instantiate). It is created once at import time (config.py) and shared across the process. Set attributes directly on the imported object:

from graphrag_toolkit.lexical_graph import GraphRAGConfig
GraphRAGConfig.aws_region = 'eu-west-1'
GraphRAGConfig.extraction_llm = 'anthropic.claude-3-5-sonnet-20241022-v2:0'

Setting aws_profile or aws_region automatically clears all cached boto3 clients.

The configuration includes the following parameters:

| Parameter | Description | Default Value | Environment Variable |
| --- | --- | --- | --- |
| extraction_llm | LLM used to perform graph extraction (see LLM configuration) | us.anthropic.claude-3-7-sonnet-20250219-v1:0 | EXTRACTION_MODEL |
| response_llm | LLM used to generate responses (see LLM configuration) | us.anthropic.claude-3-7-sonnet-20250219-v1:0 | RESPONSE_MODEL |
| embed_model | Embedding model used to generate embeddings for indexed data and queries (see Embedding model configuration) | cohere.embed-english-v3 | EMBEDDINGS_MODEL |
| embed_dimensions | Number of dimensions in each vector | 1024 | EMBEDDINGS_DIMENSIONS |
| extraction_num_workers | The number of parallel processes to use when running the extract stage | 2 | EXTRACTION_NUM_WORKERS |
| extraction_num_threads_per_worker | The number of threads used by each process in the extract stage | 4 | EXTRACTION_NUM_THREADS_PER_WORKER |
| extraction_batch_size | The number of input nodes to be processed in parallel across all workers in the extract stage | 4 | EXTRACTION_BATCH_SIZE |
| build_num_workers | The number of parallel processes to use when running the build stage | 2 | BUILD_NUM_WORKERS |
| build_batch_size | The number of input nodes to be processed in parallel across all workers in the build stage | 4 | BUILD_BATCH_SIZE |
| build_batch_write_size | The number of elements to be written in a bulk operation to the graph and vector stores (see Batch writes) | 25 | BUILD_BATCH_WRITE_SIZE |
| batch_writes_enabled | Determines whether, on a per-worker basis, to write all elements (nodes and edges, or vectors) emitted by a batch of input nodes as a bulk operation, or singly, to the graph and vector stores (see Batch writes) | True | BATCH_WRITES_ENABLED |
| include_domain_labels | Determines whether entities will have a domain-specific label (e.g. Company) as well as the graph model's __Entity__ label | False | INCLUDE_DOMAIN_LABELS |
| include_local_entities | Whether to include local-context entities in the graph | False | INCLUDE_LOCAL_ENTITIES |
| include_classification_in_entity_id | Whether to include an entity's classification in its graph node id | True | INCLUDE_CLASSIFICATION_IN_ENTITY_ID |
| enable_versioning | Whether to enable versioned updates (see Versioned Updates) | False | ENABLE_VERSIONING |
| enable_cache | Determines whether the results of LLM calls to models on Amazon Bedrock are cached to the local filesystem (see Caching Amazon Bedrock LLM responses) | False | ENABLE_CACHE |
| aws_profile | AWS CLI named profile used to authenticate requests to Bedrock and other services | None | AWS_PROFILE |
| aws_region | AWS region used to scope Bedrock service calls | Default boto3 session region | AWS_REGION |

The following parameters configure the rerankers used by query retrievers:

| Parameter | Description | Default | Environment Variable |
| --- | --- | --- | --- |
| reranking_model | Local reranker model (mixedbread-ai) | mixedbread-ai/mxbai-rerank-xsmall-v1 | RERANKING_MODEL |
| bedrock_reranking_model | Amazon Bedrock reranker model | cohere.rerank-v3-5:0 | BEDROCK_RERANKING_MODEL |
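For example, either reranker can be overridden from application code (the model identifiers below are illustrative; substitute models available in your environment):

from graphrag_toolkit.lexical_graph import GraphRAGConfig

# Illustrative values: a different local mixedbread-ai model, and the Bedrock reranker
GraphRAGConfig.reranking_model = 'mixedbread-ai/mxbai-rerank-base-v1'
GraphRAGConfig.bedrock_reranking_model = 'cohere.rerank-v3-5:0'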

The following parameter applies only when using Amazon OpenSearch Serverless as a vector store:

| Parameter | Description | Default | Environment Variable |
| --- | --- | --- | --- |
| opensearch_engine | OpenSearch kNN engine | nmslib | OPENSEARCH_ENGINE |
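For example, a minimal sketch that switches the kNN engine (the value shown is illustrative; use an engine your OpenSearch Serverless collection supports):

from graphrag_toolkit.lexical_graph import GraphRAGConfig

# 'faiss' is illustrative; the default engine is 'nmslib'
GraphRAGConfig.opensearch_engine = 'faiss'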

The following parameters configure local filesystem paths for container/EKS deployments:

| Parameter | Description | Default | Environment Variable |
| --- | --- | --- | --- |
| local_output_dir | Local staging directory for batch files and temporary extraction outputs | output | LOCAL_OUTPUT_DIR |
| log_output_dir | Directory prefix for log files (when filename is relative) | None | LOG_OUTPUT_DIR |
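For example, in a container where only a mounted volume is writable (the paths below are placeholders, not recommended locations):

from graphrag_toolkit.lexical_graph import GraphRAGConfig

# Placeholder paths: point staging and log output at a writable mount
GraphRAGConfig.local_output_dir = '/mnt/data/output'
GraphRAGConfig.log_output_dir = '/mnt/data/logs'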

To set a configuration parameter in your application code:

from graphrag_toolkit.lexical_graph import GraphRAGConfig
GraphRAGConfig.response_llm = 'anthropic.claude-3-haiku-20240307-v1:0'
GraphRAGConfig.extraction_num_workers = 4

You can also set any of these via environment variables using the variable names in the tables above.
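For example, a minimal sketch using os.environ, assuming the variables are set before the configuration module is first loaded (the defaults are read from the environment at that point):

import os

# Set before importing GraphRAGConfig so the values are picked up as defaults
os.environ['EXTRACTION_MODEL'] = 'us.anthropic.claude-3-7-sonnet-20250219-v1:0'
os.environ['EXTRACTION_NUM_WORKERS'] = '4'

from graphrag_toolkit.lexical_graph import GraphRAGConfig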

The extraction_llm and response_llm configuration parameters accept three different types of value:

  • You can pass an instance of a LlamaIndex LLM object. This allows you to configure the lexical-graph for LLM backends other than Amazon Bedrock.
  • You can pass the model id of an Amazon Bedrock model or inference profile. For example: anthropic.claude-3-7-sonnet-20250219-v1:0 (model id) or us.anthropic.claude-3-7-sonnet-20250219-v1:0 (inference profile).
  • You can pass a JSON string representation of a LlamaIndex BedrockConverse instance. For example:
{
  "model": "anthropic.claude-3-7-sonnet-20250219-v1:0",
  "temperature": 0.0,
  "max_tokens": 4096
}
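For example, the second and third options can be set directly from application code (the model identifiers are illustrative):

from graphrag_toolkit.lexical_graph import GraphRAGConfig

# Bedrock inference profile id
GraphRAGConfig.extraction_llm = 'us.anthropic.claude-3-7-sonnet-20250219-v1:0'

# JSON string representation of a BedrockConverse instance
GraphRAGConfig.response_llm = '{"model": "anthropic.claude-3-7-sonnet-20250219-v1:0", "temperature": 0.0, "max_tokens": 4096}'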

The embed_model configuration parameter accepts three different types of value:

  • You can pass an instance of a LlamaIndex BaseEmbedding object. This allows you to configure the lexical-graph for embedding backends other than Amazon Bedrock.
  • You can pass the model name of an Amazon Bedrock model. For example: amazon.titan-embed-text-v1.
  • You can pass a JSON string representation of a LlamaIndex BedrockEmbedding instance. For example:
{
  "model_name": "amazon.titan-embed-text-v2:0"
}

When configuring an embedding model, you must also set the embed_dimensions configuration parameter to match the model’s output dimensions. For example:

GraphRAGConfig.embed_model = '{"model_name": "amazon.titan-embed-text-v2:0"}'
GraphRAGConfig.embed_dimensions = 512

Amazon Nova 2 multimodal embedding models (amazon.nova-2-multimodal-embeddings-v1:0) use a different API format than standard Bedrock embedding models. To use Nova 2 models, you must explicitly import and instantiate the Nova2MultimodalEmbedding class.

Usage:

from graphrag_toolkit.lexical_graph import GraphRAGConfig
from graphrag_toolkit.lexical_graph.utils.bedrock_utils import Nova2MultimodalEmbedding
GraphRAGConfig.embed_model = Nova2MultimodalEmbedding('amazon.nova-2-multimodal-embeddings-v1:0')
GraphRAGConfig.embed_dimensions = 3072

API Format Differences:

Standard Bedrock embeddings (Titan, Cohere) use:

{"inputText": "text to embed"}

Nova 2 multimodal embeddings require:

{
  "taskType": "SINGLE_EMBEDDING",
  "singleEmbeddingParams": {
    "embeddingDimension": 3072,
    "embeddingPurpose": "TEXT_RETRIEVAL",
    "text": {
      "truncationMode": "END",
      "value": "text to embed"
    }
  }
}

Configuration Parameters:

| Parameter | Description | Default | Valid Values |
| --- | --- | --- | --- |
| embed_dimensions | Vector dimensions | 3072 | 1024, 3072 |
| embed_purpose | Embedding optimization purpose | TEXT_RETRIEVAL | TEXT_RETRIEVAL, GENERIC_RETRIEVAL, DOCUMENT_RETRIEVAL, CLASSIFICATION, CLUSTERING |
| truncation_mode | How to handle text exceeding max length | END | END, NONE |

Advanced Configuration:

To configure Nova 2 multimodal embeddings with custom parameters:

from graphrag_toolkit.lexical_graph import GraphRAGConfig
from graphrag_toolkit.lexical_graph.utils.bedrock_utils import Nova2MultimodalEmbedding
embedding = Nova2MultimodalEmbedding(
    model_name='amazon.nova-2-multimodal-embeddings-v1:0',
    embed_dimensions=3072,
    embed_purpose='TEXT_RETRIEVAL',
    truncation_mode='END'
)
GraphRAGConfig.embed_model = embedding
GraphRAGConfig.embed_dimensions = 3072

Features:

  • Handles Nova 2’s unique API format automatically
  • Includes retry logic for transient Bedrock errors
  • Custom pickle support for multiprocessing scenarios
  • Lazy client initialization using GraphRAGConfig.session
  • Empty text validation to prevent API errors

The lexical-graph uses microbatching to progress source data through the extract and build stages.

  • In the extract stage a batch of source nodes is processed in parallel by one or more workers, with each worker performing chunking, proposition extraction and topic/statement/fact/entity extraction over its allocated source nodes. For a given batch of source nodes, the extract stage emits a collection of chunks derived from those source nodes.
  • In the build stage, chunks from the extract stage are broken down into smaller indexable nodes representing sources, chunks, topics, statements and facts. These indexable nodes are then processed by the graph construction and vector indexing handlers.

The batch_writes_enabled configuration parameter determines whether all of the indexable nodes derived from a batch of incoming chunks are written to the graph and vector stores singly, or as a bulk operation. Bulk/batch operations tend to improve the throughput of the build stage, at the expense of some additional latency before this data becomes available to query.
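For example, a sketch of adjusting the batching behaviour (the values are illustrative starting points, not tuned recommendations):

from graphrag_toolkit.lexical_graph import GraphRAGConfig

# Illustrative values: keep bulk writes enabled, and increase the build batch sizes
GraphRAGConfig.batch_writes_enabled = True
GraphRAGConfig.build_batch_size = 8
GraphRAGConfig.build_batch_write_size = 50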

If you’re using Amazon Bedrock, you can use the local filesystem to cache and reuse LLM responses. Set GraphRAGConfig.enable_cache to True. LLM responses will then be saved in clear text to a cache directory. Subsequent invocations of the same model with the exact same prompt will return the cached response.
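For example:

from graphrag_toolkit.lexical_graph import GraphRAGConfig

# Cache Bedrock LLM responses to the local filesystem
GraphRAGConfig.enable_cache = True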

Note that streaming responses from the query engine are not cached.

The cache directory can grow very large, particularly if you are caching extraction responses for a very large ingest. The lexical-graph will not manage the size of this directory or delete old entries. If you enable the cache, ensure you clear or prune the cache directory regularly.
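A minimal housekeeping sketch, assuming the cache is written to a cache/ directory relative to the working directory (check your deployment for the actual location):

import shutil
from pathlib import Path

cache_dir = Path('cache')  # hypothetical location: adjust to where your cache actually lives
if cache_dir.exists():
    shutil.rmtree(cache_dir)  # deletes the whole cache; prune selectively if you prefer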

The graphrag_toolkit provides two methods for configuring logging in your application. These methods allow you to set logging levels, apply filters to include or exclude specific modules or messages, and customize logging behavior:

  • set_logging_config
  • set_advanced_logging_config

The set_logging_config method allows you to configure logging with a basic set of options, such as logging level and module filters. Wildcards are supported for module names, and you can pass either a single string or a list of strings for included or excluded modules. You can optionally provide a filename to write log output to a file in addition to stdout. For example:

from graphrag_toolkit.lexical_graph import set_logging_config
set_logging_config(
    logging_level='DEBUG',                                           # or logging.DEBUG
    debug_include_modules='graphrag_toolkit.lexical_graph.storage',  # single string or list of strings
    debug_exclude_modules=['opensearch', 'boto'],                    # single string or list of strings
    filename='output.log'                                            # optional: also write logs to a file
)

The set_advanced_logging_config method provides more advanced logging configuration options, including the ability to specify filters for included and excluded modules or messages based on logging levels. Wildcards are supported for module names and included messages, and you can pass either a single string or a list of strings for modules or messages. This method offers greater flexibility and control over the logging behavior.

| Parameter | Type | Description | Default Value |
| --- | --- | --- | --- |
| logging_level | str or int | The logging level to apply (e.g., 'DEBUG', 'INFO', logging.DEBUG, etc.). | logging.INFO |
| included_modules | dict[int, str or list[str]] | Modules to include in logging, grouped by logging level. Wildcards are supported. | None |
| excluded_modules | dict[int, str or list[str]] | Modules to exclude from logging, grouped by logging level. Wildcards are supported. | None |
| included_messages | dict[int, str or list[str]] | Specific messages to include in logging, grouped by logging level. Wildcards are supported. | None |
| excluded_messages | dict[int, str or list[str]] | Specific messages to exclude from logging, grouped by logging level. | None |
| filename | str | If provided, log output is also written to this file in addition to stdout. | None |

Here is an example of how to use set_advanced_logging_config:

import logging
from graphrag_toolkit.lexical_graph import set_advanced_logging_config
set_advanced_logging_config(
    logging_level=logging.DEBUG,
    included_modules={
        logging.DEBUG: 'graphrag_toolkit',  # single string or list of strings
        logging.INFO: '*',                  # wildcard supported
    },
    excluded_modules={
        logging.DEBUG: ['opensearch', 'boto', 'urllib'],  # single string or list of strings
        logging.INFO: ['opensearch', 'boto', 'urllib'],   # wildcard supported
    },
    excluded_messages={
        logging.WARNING: 'Removing unpickleable private attribute',  # single string or list of strings
    }
)

You can explicitly configure the AWS CLI profile and region to use when initializing Bedrock clients or other AWS service clients in GraphRAGConfig. This ensures compatibility across local development, EC2/ECS environments, or federated environments such as AWS SSO.

You may set the AWS profile and region in your application code:

from graphrag_toolkit.lexical_graph import GraphRAGConfig
GraphRAGConfig.aws_profile = 'padmin'
GraphRAGConfig.aws_region = 'us-east-1'

Alternatively, use environment variables:

export AWS_PROFILE=padmin
export AWS_REGION=us-east-1

If no profile or region is set explicitly, the system falls back to environment variables or the default AWS CLI configuration.

See Using AWS Profiles in GraphRAGConfig for more details on configuring and using AWS named profiles.

All boto3 clients created by GraphRAGConfig are wrapped in a ResilientClient (config.py:94). On ExpiredToken, RequestExpired, or InvalidClientTokenId errors the client is refreshed automatically and the call is retried.

When an AWS SSO profile is in use, the client wrapper also validates the SSO token age. If the token is more than one hour old, it runs aws sso login automatically before retrying. This is relevant for long-running indexing jobs and any environment where SSO sessions can expire mid-run.