Query Engine
The ByoKGQueryEngine is the central orchestrating component that coordinates graph connectors, retrievers, and LLMs to process natural language questions and generate answers from knowledge graphs. It handles the high-level flow of query processing while delegating LLM-specific tasks to the KG Linker.
Overview
The ByoKGQueryEngine orchestrates the interaction between multiple components to answer questions over knowledge graphs. It supports two main modes of operation:
- Cypher-based retrieval - Uses CypherKGLinker for direct query generation and execution
- Multi-strategy retrieval - Uses KGLinker with multiple retrieval strategies (agentic, path-based, query-based)
The engine can operate with either mode independently or combine both approaches for comprehensive question answering.
Architecture
The query engine integrates the following components:
- Graph Store - Provides access to graph data and schema information
- Entity Linker - Links natural language entities to graph nodes
- Triplet Retriever - Retrieves relevant triplets using agentic exploration
- Path Retriever - Finds and verbalizes paths between entities
- Graph Query Executor - Executes structured graph queries
- KG Linker - Handles LLM-based entity extraction and query understanding
- Cypher KG Linker - Specialized for Cypher query generation (optional)
Initialization
Basic initialization

    from graphrag_toolkit.byokg_rag.byokg_query_engine import ByoKGQueryEngine
    from graphrag_toolkit.byokg_rag.llm import BedrockGenerator

    # Initialize LLM
    llm_generator = BedrockGenerator(
        model_name='us.anthropic.claude-sonnet-4-6',
        region_name='us-east-1'
    )

    # Initialize query engine
    query_engine = ByoKGQueryEngine(
        graph_store=your_graph_store,
        llm_generator=llm_generator
    )

Full initialization with custom components

    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        entity_linker=custom_entity_linker,
        triplet_retriever=custom_triplet_retriever,
        path_retriever=custom_path_retriever,
        graph_query_executor=custom_query_executor,
        llm_generator=custom_llm,
        kg_linker=custom_kg_linker,
        cypher_kg_linker=custom_cypher_linker,
        direct_query_linking=False
    )

Default component initialization
When components are not provided, the engine initializes defaults where possible. Components that require an LLM (triplet_retriever, kg_linker) are only auto-created when llm_generator is explicitly provided:
Entity Linker: Uses FuzzyStringIndex with all graph nodes

    from indexing import FuzzyStringIndex
    from graph_retrievers import EntityLinker

    string_index = FuzzyStringIndex()
    string_index.add(graph_store.nodes())
    entity_retriever = string_index.as_entity_matcher()
    entity_linker = EntityLinker(entity_retriever)

Triplet Retriever: Uses AgenticRetriever with graph traversal

    from graph_retrievers import AgenticRetriever, GTraversal, TripletGVerbalizer

    graph_traversal = GTraversal(graph_store)
    graph_verbalizer = TripletGVerbalizer()
    triplet_retriever = AgenticRetriever(
        llm_generator=llm_generator,
        graph_traversal=graph_traversal,
        graph_verbalizer=graph_verbalizer
    )

Path Retriever: Uses PathRetriever with path verbalization

    from graph_retrievers import PathRetriever, GTraversal, PathVerbalizer

    graph_traversal = GTraversal(graph_store)
    path_verbalizer = PathVerbalizer()
    path_retriever = PathRetriever(
        graph_traversal=graph_traversal,
        path_verbalizer=path_verbalizer
    )

Query processing
Main query method

    def query(self, query: str, iterations: int = 2, cypher_iterations: int = 2) -> Tuple[List[str], List[str]]

The query method processes questions through the retrieval pipeline and returns retrieved context.
Parameters:
- query (str): The search query
- iterations (int): Number of retrieval iterations for multi-strategy approach (default: 2)
- cypher_iterations (int): Number of Cypher generation retries (default: 2)
Returns:
- Tuple of (retrieved context, final answers) as lists of strings
Query processing flow
- Initialize context - Set up empty context lists and entity tracking
- Direct query linking (optional) - Use semantic similarity for initial entity linking
- Cypher-based retrieval (if CypherKGLinker provided) - Generate and execute Cypher queries
- Multi-strategy retrieval (if KGLinker provided) - Use iterative entity extraction and retrieval
- Context aggregation - Combine results from all strategies
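The flow above can be sketched as a small pipeline that runs only the stages that are configured and then merges their output. This is a minimal illustration, not the engine's implementation: `run_pipeline`, the `Stage` callable type, and the two fake stages are hypothetical stand-ins, and the order-preserving deduplication in the last step is an assumption about what "aggregation" might involve.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in: a retrieval stage takes the question and the context
# gathered so far, and returns new context strings.
Stage = Callable[[str, List[str]], List[str]]


def run_pipeline(
    question: str,
    cypher_stage: Optional[Stage] = None,
    multi_strategy_stage: Optional[Stage] = None,
) -> List[str]:
    """Run whichever stages are configured and aggregate their context."""
    context: List[str] = []  # initialize context

    # Cypher-based retrieval runs only if a Cypher stage is configured.
    if cypher_stage is not None:
        context.extend(cypher_stage(question, context))

    # Multi-strategy retrieval runs only if that stage is configured.
    if multi_strategy_stage is not None:
        context.extend(multi_strategy_stage(question, context))

    # Aggregation (assumed): drop duplicates, keep first-seen order.
    return list(dict.fromkeys(context))


def fake_cypher(question: str, context: List[str]) -> List[str]:
    return ["aspirin -> TREATS -> headache"]


def fake_multi(question: str, context: List[str]) -> List[str]:
    return [
        "aspirin -> TREATS -> headache",
        "aspirin -> HAS_SIDE_EFFECT -> nausea",
    ]


result = run_pipeline("What does aspirin treat?", fake_cypher, fake_multi)
for line in result:
    print(line)
```

Passing only one of the two stages mirrors running the engine with just a cypher_kg_linker or just a kg_linker.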
Cypher-based retrieval
When a cypher_kg_linker is provided, the engine performs Cypher-based retrieval:
Process flow
- Generate Cypher response - Use CypherKGLinker to generate linking and query artifacts
- Execute linking queries - Process opencypher-linking artifacts for entity discovery
- Execute main queries - Process opencypher artifacts for answer retrieval
- Handle failures - Provide feedback for failed queries to improve subsequent iterations
- Iterate - Repeat for the specified number of cypher_iterations
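The generate, execute, and retry-with-feedback loop described above might look roughly like the sketch below. It is self-contained and does not call the real library: `cypher_retrieval`, `fake_generate`, and `fake_execute` are hypothetical names, the relationship labels are made up, and stopping at the first non-empty result is an assumption rather than confirmed engine behavior.

```python
from typing import Callable, List


def cypher_retrieval(
    question: str,
    generate: Callable[[str, List[str]], str],
    execute: Callable[[str], List[str]],
    cypher_iterations: int = 2,
) -> List[str]:
    """Generate a query, run it, and feed failures into the next attempt."""
    feedback: List[str] = []
    for _ in range(cypher_iterations):
        cypher = generate(question, feedback)
        try:
            answers = execute(cypher)
        except Exception as exc:  # a malformed query should not abort the loop
            feedback.append(f"Query failed: {cypher} ({exc})")
            continue
        if answers:
            return answers  # assumed: stop at the first non-empty result
        feedback.append(f"No results for: {cypher}")
    return []


def fake_generate(question: str, feedback: List[str]) -> str:
    # Pretend the model corrects the relationship name once it sees feedback.
    rel = "INTERACTS_WITH" if feedback else "INTERACTS"
    return f"MATCH (d:Drug)-[:{rel}]->(x) RETURN x"


def fake_execute(cypher: str) -> List[str]:
    # Pretend only the corrected relationship exists in the graph.
    return ["warfarin"] if "INTERACTS_WITH" in cypher else []


answers = cypher_retrieval(
    "Find all drugs that interact with aspirin", fake_generate, fake_execute
)
print(answers)
```

With cypher_iterations=1 the same call returns an empty list, because the corrected query is only produced after the first failure has been fed back.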
Error handling
The engine provides feedback for failed Cypher queries:

    if len(answers) == 0:
        cypher_context_with_feedback.append(
            "No executable results for the above. Please improve cypher generation "
            "in the future by focusing more on the given schema and the relations "
            "between node types."
        )

Multi-strategy retrieval
When a kg_linker is provided, the engine performs multi-strategy retrieval:
Iterative process
- Generate LLM response - Use KGLinker to extract entities, paths, and queries
- Link entities - Connect extracted entities to graph nodes
- Retrieve triplets - Use AgenticRetriever for contextual triplet extraction
- Process paths - Follow extracted metapaths between entities
- Execute queries - Run structured graph queries (Cypher, SPARQL)
- Check completion - Stop if task completion signal is detected
Task completion
The engine checks for completion signals in LLM responses:

    task_completion = parse_response(response, r"<task-completion>(.*?)</task-completion>")
    if "FINISH" in " ".join(task_completion):
        break

Iterative prompting
The engine uses different prompts for different iterations:
- First iteration: Uses standard task prompts
- Subsequent iterations: Uses iterative prompts that build on previous context
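To make the interaction between prompt selection and the completion check concrete, here is a minimal, self-contained sketch. The `parse_response` below is a stand-in built on `re.findall`, not the library's helper; `run_iterations`, `fake_llm`, the prompt wording, and the `<entities>` tag are all hypothetical and chosen only for illustration.

```python
import re
from typing import Callable, List


def parse_response(response: str, pattern: str) -> List[str]:
    """Stand-in helper: return every span captured by the pattern."""
    return re.findall(pattern, response, flags=re.DOTALL)


def run_iterations(llm: Callable[[str], str], question: str, iterations: int = 2) -> int:
    """Return how many iterations ran before FINISH or the limit was reached."""
    context: List[str] = []
    completed = 0
    for i in range(iterations):
        if i == 0:
            # First iteration: standard task prompt.
            prompt = f"Task: {question}"
        else:
            # Later iterations: prompt that builds on the context so far.
            prompt = f"Task: {question}\nContext so far:\n" + "\n".join(context)
        response = llm(prompt)
        completed += 1
        context.extend(parse_response(response, r"<entities>(.*?)</entities>"))
        task_completion = parse_response(
            response, r"<task-completion>(.*?)</task-completion>"
        )
        if "FINISH" in " ".join(task_completion):
            break
    return completed


def fake_llm(prompt: str) -> str:
    # Pretend the model finishes as soon as it is shown earlier context.
    if "Context so far" in prompt:
        return "<task-completion>FINISH</task-completion>"
    return "<entities>aspirin</entities>"


ran = run_iterations(fake_llm, "What are the side effects of aspirin?", iterations=5)
print(ran)
```

Although five iterations are allowed here, the loop ends after the second one because the fake model emits the FINISH signal once it sees prior context.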
Usage examples
Basic usage

    # Initialize with graph store and LLM
    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=llm_generator
    )

    # Process a question
    question = "What are the side effects of aspirin?"
    context = query_engine.query(question)

    print("Retrieved context:")
    for item in context:
        print(f"- {item}")

Cypher-focused usage
    from graph_connectors import CypherKGLinker

    # Initialize with Cypher support
    cypher_linker = CypherKGLinker(llm_generator, graph_store)
    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=llm_generator,
        cypher_kg_linker=cypher_linker
    )

    # Process question with Cypher iterations
    question = "Find all drugs that interact with aspirin"
    context = query_engine.query(question, cypher_iterations=3)

Multi-strategy with custom components
    # Custom entity linker with semantic similarity
    from indexing import SemanticIndex
    from graph_retrievers import EntityLinker

    semantic_index = SemanticIndex(embedding_model)
    semantic_index.add(graph_store.nodes())
    entity_linker = EntityLinker(semantic_index.as_entity_matcher())

    # Initialize with custom components
    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=llm_generator,
        entity_linker=entity_linker,
        direct_query_linking=True  # Enable semantic entity linking
    )

    # Process with multiple iterations
    context = query_engine.query(question, iterations=3)

Response generation
    # Generate final response from retrieved context
    question = "What causes headaches?"
    context = query_engine.query(question)

    # Generate answer using retrieved context
    answers, full_response = query_engine.generate_response(
        query=question,
        graph_context="\n".join(context)
    )

    print("Generated answers:")
    for answer in answers:
        print(f"- {answer}")

Combining both approaches
    # Initialize with both KG Linker and Cypher KG Linker
    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=llm_generator,
        kg_linker=kg_linker,
        cypher_kg_linker=cypher_linker
    )

    # The engine will first try Cypher-based retrieval,
    # then fall back to multi-strategy retrieval
    context = query_engine.query(question)

Configuration options
Direct query linking
Enable semantic similarity-based entity linking:

    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=llm_generator,
        direct_query_linking=True
    )

Custom LLM configuration
    from graphrag_toolkit.byokg_rag.llm import BedrockGenerator

    custom_llm = BedrockGenerator(
        model_name='us.anthropic.claude-sonnet-4-6',
        region_name='us-east-1'
    )

    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=custom_llm
    )

Iteration control
    # Fine-tune iteration counts for different strategies
    context = query_engine.query(
        question,
        iterations=2,         # Multi-strategy iterations
        cypher_iterations=2   # Cypher retry iterations
    )