External Properties
Overview
Added a flexible external properties feature that copies any business-specific properties from source document metadata onto chunk nodes in the graph database.
Changes Made
1. Configuration (lexical-graph/src/graphrag_toolkit/lexical_graph/config.py)
- Added `chunk_external_properties` property to `GraphRAGConfig`
- Accepts a dictionary mapping chunk property names to source metadata keys
- Supports environment variable: `CHUNK_EXTERNAL_PROPERTIES` (JSON format)
- Default: `None` (feature disabled)
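To illustrate the JSON format the environment variable expects, the sketch below parses `CHUNK_EXTERNAL_PROPERTIES` into a mapping. The helper name `load_chunk_external_properties` is hypothetical and is not the toolkit's actual implementation; it only assumes that the value is a JSON object whose keys are chunk property names and whose values are source metadata keys, and that an unset variable means the feature is disabled.

```python
import json
import os
from typing import Dict, Optional


def load_chunk_external_properties(
    env_var: str = "CHUNK_EXTERNAL_PROPERTIES",
) -> Optional[Dict[str, str]]:
    """Hypothetical helper: parse the env var into a property mapping."""
    raw = os.environ.get(env_var)
    if not raw:
        # Unset or empty: treat the feature as disabled (default of None).
        return None

    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError(f"{env_var} must be a JSON object")

    # Keys are chunk property names, values are source metadata keys.
    return {str(name): str(key) for name, key in parsed.items()}


if __name__ == "__main__":
    os.environ["CHUNK_EXTERNAL_PROPERTIES"] = (
        '{"article_code": "article_id", "document_type": "doc_type"}'
    )
    print(load_chunk_external_properties())
```

In a shell, the equivalent setting would look like `export CHUNK_EXTERNAL_PROPERTIES='{"article_code": "article_id"}'`.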
2. Chunk Node Builder (lexical-graph/src/graphrag_toolkit/lexical_graph/indexing/build/chunk_node_builder.py)
- Extracts multiple properties from validated source metadata when configured
- Iterates through the property mapping and adds each available property
- Adds to chunk metadata: `metadata['chunk']['metadata'][property_name]` (nested structure matching source metadata)
- Uses `_get_source_info_metadata()` to ensure only valid (non-collection-based) metadata is used
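The extraction steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `chunk_node_builder.py`: `extract_external_properties` is a hypothetical name, and the inline filter that drops collection values is only an assumed stand-in for what `_get_source_info_metadata()` does.

```python
from typing import Any, Dict, Mapping


def extract_external_properties(
    mapping: Mapping[str, str],
    source_metadata: Mapping[str, Any],
) -> Dict[str, Any]:
    """Hypothetical helper: copy mapped values out of source metadata."""
    # Assumed stand-in for validation: keep only non-collection values.
    valid = {
        key: value
        for key, value in source_metadata.items()
        if not isinstance(value, (list, dict, set, tuple))
    }

    extracted: Dict[str, Any] = {}
    for property_name, source_key in mapping.items():
        # Missing or filtered-out keys are skipped rather than raising.
        if source_key in valid:
            extracted[property_name] = valid[source_key]
    return extracted


# Nested structure matching the layout described above.
metadata: Dict[str, Any] = {"chunk": {"metadata": {}}}
metadata["chunk"]["metadata"].update(
    extract_external_properties(
        {"article_code": "article_id", "department": "dept_code"},
        {"article_id": "ART-2024-001", "tags": ["a", "b"]},
    )
)
print(metadata)  # {'chunk': {'metadata': {'article_code': 'ART-2024-001'}}}
```

Here `department` is omitted because `dept_code` is absent from the source metadata, and `tags` would be ignored even if mapped because it is a list.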
3. Chunk Graph Builder (lexical-graph/src/graphrag_toolkit/lexical_graph/indexing/build/chunk_graph_builder.py)
- Stores all external properties as properties on chunk nodes
- Reads from the nested `metadata['chunk']['metadata']` dictionary
- Dynamically generates SET statements for each property
- Uses: `SET chunk.property_name = params.property_name`
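As a sketch of how one SET statement per property could be generated, the snippet below builds the clauses from a list of property names. The function `build_set_clauses` is hypothetical, and the identifier check is an added assumption (property names are interpolated into the query text, so restricting them to simple identifiers seems prudent); the real builder may differ.

```python
import re
from typing import Iterable, List

# Assumption: property names are restricted to simple identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_set_clauses(property_names: Iterable[str]) -> List[str]:
    """Hypothetical helper: one SET clause per external property."""
    clauses = []
    for name in property_names:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"Invalid property name: {name!r}")
        # Follows the pattern: SET chunk.property_name = params.property_name
        clauses.append(f"SET chunk.{name} = params.{name}")
    return clauses


chunk_metadata = {"article_code": "ART-2024-001", "document_type": "research"}
print("\n".join(build_set_clauses(chunk_metadata)))
```

The property values themselves stay in the query parameters; only the property names end up in the statement text.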
    from graphrag_toolkit.lexical_graph import GraphRAGConfig
    from llama_index.core.schema import Document

    # Configure multiple properties
    GraphRAGConfig.chunk_external_properties = {
        'article_code': 'article_id',
        'document_type': 'doc_type',
        'department': 'dept_code'
    }

    # Create document with metadata
    doc = Document(
        text="Your content...",
        metadata={
            'article_id': 'ART-2024-001',
            'doc_type': 'research',
            'dept_code': 'ENG'
        }
    )

    # Build graph - chunks will have all configured properties

Query Examples
    // Find chunks by article code
    MATCH (chunk:__Chunk__ {article_code: 'ART-2024-001'})
    RETURN chunk

    // Find chunks by document type
    MATCH (chunk:__Chunk__ {document_type: 'research'})
    RETURN chunk

    // Complex multi-property query
    MATCH (chunk:__Chunk__)
    WHERE chunk.document_type = 'research' AND chunk.department = 'ENG'
    RETURN chunk

Files Modified
- lexical-graph/src/graphrag_toolkit/lexical_graph/config.py
- lexical-graph/src/graphrag_toolkit/lexical_graph/indexing/build/chunk_node_builder.py
- lexical-graph/src/graphrag_toolkit/lexical_graph/indexing/build/chunk_graph_builder.py
Key Features
- Flexible: Supports any number of properties
- Configurable: Dictionary-based mapping
- Graceful: Handles missing metadata keys; only properties available in the source metadata are added
- Backward Compatible: No breaking changes; the feature is disabled by default
- Safe: Uses validated source metadata to avoid write failures