FAQ
Overview
This document answers common questions about the byokg-rag library and provides guidance on troubleshooting, optimization, and best practices.
Common Questions
Which graph store should I choose?
Choose your graph store based on deployment requirements and scale:
Amazon Neptune Analytics is best for:
- Production workloads requiring fast analytical queries
- Applications needing native vector search for entity linking
- Serverless deployments without infrastructure management
- Integration with AWS analytics services
Amazon Neptune Database is best for:
- Transactional workloads requiring ACID guarantees
- Applications needing high availability with automatic failover
- Workloads requiring read replicas for scaling
- Mixed transactional and analytical queries
Local Graph Store is best for:
- Development and prototyping
- Testing with small datasets (< 10,000 nodes)
- Learning and experimentation
- Environments without AWS access
TIP: Start with the local graph store for development, then migrate to Neptune Analytics for production deployments.
How do I optimize query performance?
Optimize performance through these strategies:
1. Adjust iteration counts
Reduce iterations and cypher_iterations parameters to minimize LLM calls:
    context = query_engine.query(
        query="Your question",
        iterations=1,  # Reduce from default of 2
        cypher_iterations=1
    )
2. Limit retriever parameters
Reduce the scope of graph exploration:
    triplet_retriever = AgenticRetriever(
        llm_generator=llm_generator,
        graph_traversal=graph_traversal,
        graph_verbalizer=triplet_verbalizer,
        max_num_relations=3,   # Reduce from default of 5
        max_num_entities=2,    # Reduce from default of 3
        max_num_iterations=2,  # Reduce from default of 3
        max_num_triplets=30    # Reduce from default of 50
    )
3. Use appropriate indexes
Choose the fastest index type for your use case:
- Fuzzy string index: Fastest, no external dependencies
- Dense index: Slower but better semantic matching
- Graph-store index: Integrated with Neptune Analytics
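To give a feel for why a fuzzy string index can run without external dependencies, here is a minimal sketch of fuzzy entity matching built only on Python's standard-library difflib. It is not the library's index implementation; the function name fuzzy_link, its parameters, and the sample entity names are hypothetical and chosen for illustration.

```python
import difflib


def fuzzy_link(mention, entity_names, top_k=3, cutoff=0.6):
    """Return up to top_k entity names whose spelling is close to the mention."""
    # Compare case-insensitively, but return the names as originally written.
    lowered = {name.lower(): name for name in entity_names}
    matches = difflib.get_close_matches(
        mention.lower(), list(lowered), n=top_k, cutoff=cutoff
    )
    return [lowered[m] for m in matches]


# Hypothetical entity names standing in for node labels from a graph.
entities = ["Inception", "Interstellar", "Christopher Nolan"]
print(fuzzy_link("inceptoin", entities))  # a misspelled mention still links
```

String similarity of this kind handles typos and casing well, but it cannot connect paraphrases such as "the dream heist movie" to "Inception"; that gap is what a dense (embedding-based) index trades speed for.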
4. Enable direct query linking
Skip LLM-based entity extraction for simple queries:
    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=llm_generator,
        direct_query_linking=True  # Use semantic similarity directly
    )
5. Optimize LLM configuration
Use faster models or reduce token limits:
    llm_generator = BedrockGenerator(
        model_name="anthropic.claude-haiku-4-5-20251001-v1:0",  # Faster model
        max_tokens=2048  # Reduce from default of 4096
    )
What LLM models are supported?
The byokg-rag library supports Amazon Bedrock models through the BedrockGenerator class. For the latest model availability and lifecycle status, see the Amazon Bedrock model lifecycle documentation.
Active models (recommended):
Claude Sonnet 4.6 (Recommended)
- Model ID: anthropic.claude-sonnet-4-6
- Best balance of performance and cost
- Strong reasoning capabilities for KGQA
Claude Opus 4.6
- Model ID: anthropic.claude-opus-4-6-v1
- Highest capability for complex reasoning
- Highest cost and latency
Claude Haiku 4.5
- Model ID: anthropic.claude-haiku-4-5-20251001-v1:0
- Fastest and lowest cost
- Suitable for simple queries
Legacy models (available only to users who have actively used them in the last 15 days; new users are blocked):
- Claude 3.7 Sonnet: anthropic.claude-3-7-sonnet-20250219-v1:0 (EOL: Apr 28, 2026)
- Claude 3.5 Sonnet: anthropic.claude-3-5-sonnet-20240620-v1:0 (EOL: Jul 30, 2026)
- Claude 3 Haiku: anthropic.claude-3-haiku-20240307-v1:0 (EOL: Sep 10, 2026)
To use a different model:
    llm_generator = BedrockGenerator(
        model_name="anthropic.claude-haiku-4-5-20251001-v1:0",
        region_name="us-east-1"
    )
NOTE: Ensure the model is available in your AWS region. Check the Bedrock model availability documentation.
How do I handle authentication errors?
Authentication errors typically indicate IAM permission issues. Follow these steps:
1. Verify AWS credentials
Ensure your environment has valid AWS credentials:
    aws sts get-caller-identity
2. Check IAM permissions
Verify your IAM role or user has the required permissions:
- bedrock:InvokeModel for LLM access
- neptune-graph:ReadDataViaQuery for Neptune Analytics
- neptune-db:ReadDataViaQuery for Neptune Database
- s3:GetObject and s3:PutObject for S3 operations
3. Verify resource access
Ensure your credentials can access the specific resources:
    import boto3

    # Test Neptune Analytics access
    client = boto3.client('neptune-graph', region_name='<region>')
    response = client.get_graph(graphIdentifier='<graph-id>')
    print(response)

    # Test Bedrock access
    client = boto3.client('bedrock-runtime', region_name='<region>')
    # Creating the client makes no API call by itself; a missing permission
    # surfaces as an error on the first request (for example, a model invocation)
4. Check network connectivity
For Neptune Database, ensure your application runs in the correct VPC with appropriate security groups.
Can I use byokg-rag with my existing knowledge graph?
Yes, byokg-rag works with existing knowledge graphs. Requirements:
Graph Structure
- Property graph model (nodes and edges with properties)
- Compatible with openCypher query language (for Neptune)
Data Loading
For Neptune Analytics:
    graph_store = NeptuneAnalyticsGraphStore(
        graph_identifier="<existing-graph-id>",
        region="<region>"
    )
For Neptune Database:
    graph_store = NeptuneDBGraphStore(
        endpoint_url="https://<cluster-endpoint>:8182",
        region="<region>"
    )
For local development:
    graph_store = LocalKGStore()
    graph_store.read_from_csv(
        nodes_file="your_nodes.csv",
        edges_file="your_edges.csv"
    )
Schema Requirements
The graph store must provide schema information. Neptune Analytics and Neptune Database automatically expose schema. For custom graph stores, implement the get_schema() method.
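As a rough illustration of the custom-store case, the sketch below shows a toy in-memory store that derives a schema summary from its own data. The class name ToyGraphStore, its constructor arguments, and the dictionary shape returned by get_schema() are all assumptions made for illustration; consult the library's graph store base class for the exact interface and return format it expects.

```python
class ToyGraphStore:
    """Hypothetical custom store; it does not subclass anything from byokg-rag."""

    def __init__(self, nodes, edges):
        # nodes: {node_id: {"label": str, "properties": dict}}
        # edges: list of (source_id, relation, target_id) tuples
        self.nodes = nodes
        self.edges = edges

    def get_schema(self):
        """Summarize node labels, their property keys, and edge patterns."""
        node_props = {}
        for node in self.nodes.values():
            # Collect the property keys seen for each label.
            node_props.setdefault(node["label"], set()).update(node["properties"])
        edge_patterns = {
            (self.nodes[src]["label"], rel, self.nodes[dst]["label"])
            for src, rel, dst in self.edges
        }
        return {
            "node_labels": {label: sorted(keys) for label, keys in node_props.items()},
            "edge_patterns": sorted(edge_patterns),
        }


store = ToyGraphStore(
    nodes={
        "m1": {"label": "Movie", "properties": {"title": "Inception"}},
        "p1": {"label": "Person", "properties": {"name": "Christopher Nolan"}},
    },
    edges=[("p1", "DIRECTED", "m1")],
)
print(store.get_schema())
```

Whatever format the real interface requires, the idea is the same: the schema tells the LLM which node labels, properties, and relationship types it may reference when linking entities or generating queries.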
How many iterations should I configure?
The optimal iteration count depends on query complexity:
Simple queries (1 iteration)
- Direct fact lookup: “What is the capital of France?”
- Single-hop relationships: “Who directed Inception?”
Moderate queries (2 iterations, default)
- Two-hop reasoning: “What movies did the director of Inception also direct?”
- Multiple entity queries: “Which actors appeared in both Inception and Interstellar?”
Complex queries (3-5 iterations)
- Multi-hop reasoning: “What awards did actors from Christopher Nolan films win?”
- Aggregation queries: “How many Nobel Prize winners worked at the same institution?”
Trade-offs:
- More iterations: Better coverage, higher latency, higher cost
- Fewer iterations: Faster responses, lower cost, may miss relevant information
Start with the default (2 iterations) and adjust based on your query complexity and performance requirements.
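One way to make that adjustment empirical is to time the same question at several iteration counts and compare how much context comes back. The helper below is a minimal sketch: sweep_iterations and fake_query are hypothetical names, and the use of len() assumes the returned context is a sized collection, which may not match what query_engine.query() actually returns.

```python
import time


def sweep_iterations(run_query, question, counts=(1, 2, 3)):
    """Time run_query(question, n) for each iteration count n."""
    results = []
    for n in counts:
        start = time.perf_counter()
        context = run_query(question, n)
        elapsed = time.perf_counter() - start
        results.append(
            {"iterations": n, "seconds": elapsed, "context_items": len(context)}
        )
    return results


# Stand-in for the real engine so the sketch runs on its own. With a real engine,
# pass something like: lambda q, n: query_engine.query(query=q, iterations=n)
def fake_query(question, iterations):
    return [f"fact-{i}" for i in range(iterations * 2)]


for row in sweep_iterations(fake_query, "Who directed Inception?"):
    print(row)
```

If the amount of retrieved context stops growing between two settings while latency keeps climbing, the lower setting is probably sufficient for that class of question.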
What’s the difference between KGLinker and CypherKGLinker?
KGLinker (Multi-Strategy Retrieval)
- Uses multiple retrieval strategies: agentic, path-based, query-based
- Extracts entities from natural language using LLM
- Combines results from different retrieval approaches
- Best for: Complex queries requiring diverse retrieval strategies
CypherKGLinker (Cypher-Focused Retrieval)
- Specializes in generating and executing openCypher queries
- Iteratively refines queries based on results
- Focuses on structured query generation
- Best for: Queries that map well to graph patterns
Usage:
Multi-strategy retrieval:
    kg_linker = KGLinker(
        llm_generator=llm_generator,
        graph_store=graph_store
    )

    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=llm_generator,
        kg_linker=kg_linker
    )
Cypher-focused retrieval:
    cypher_linker = CypherKGLinker(
        llm_generator=llm_generator,
        graph_store=graph_store
    )

    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=llm_generator,
        cypher_kg_linker=cypher_linker
    )
Combined approach:
    query_engine = ByoKGQueryEngine(
        graph_store=graph_store,
        llm_generator=llm_generator,
        kg_linker=kg_linker,
        cypher_kg_linker=cypher_linker  # Tries Cypher first, falls back to multi-strategy
    )
Known Limitations
Retrieval Strategy Limitations
Agentic Retrieval
- Requires multiple LLM calls, increasing latency and cost
- May explore irrelevant paths in very large graphs
- Performance depends on LLM reasoning capabilities
Scoring-Based Retrieval
- Requires semantic similarity computation for all candidate triplets
- May be slow for graphs with high-degree nodes (many edges per node)
- Effectiveness depends on embedding quality
Path-Based Retrieval
- Requires explicit metapath specification from LLM
- May miss relevant paths not matching specified patterns
- Performance degrades with very long paths (> 5 hops)
Query-Based Retrieval
- Requires LLM to generate syntactically correct queries
- May fail on complex graph schemas with many node/edge types
- Query generation quality varies by LLM model
Graph Store Limitations
Neptune Analytics
- Vector search requires embeddings to be loaded as node properties
- Very complex queries may time out (default: 60 seconds)
- Regional availability varies (check AWS documentation)
- Bulk loading requires S3 staging
Neptune Database
- VPC-only access (no public endpoints)
- Schema refresh requires recreating graph store instance
- Concurrent query limits depend on instance size
- Read replicas needed for high query concurrency
Local Graph Store
- In-memory only, limited by available RAM
- No persistence across restarts
- No support for complex query languages
- Single-process access only
Performance Considerations
Large Graphs (> 1M nodes)
- Entity linking may be slow without proper indexing
- Consider using graph-store indexes for Neptune Analytics
- Limit exploration depth to avoid excessive traversal
High Query Volume
- LLM rate limits may throttle requests
- Consider caching frequently asked questions
- Use read replicas for Neptune Database
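The caching suggestion above can be sketched as a small wrapper that remembers results per normalized question. This is an illustrative assumption rather than a feature of the library: make_cached_query, max_entries, and fake_engine are hypothetical names, and the eviction policy (drop the oldest entry) is deliberately simple.

```python
def make_cached_query(run_query, max_entries=256):
    """Wrap run_query so repeated questions skip the LLM and graph calls."""
    cache = {}

    def cached(question):
        # Normalize case and whitespace so trivial variants share one entry.
        key = " ".join(question.lower().split())
        if key not in cache:
            if len(cache) >= max_entries:
                # Dicts preserve insertion order, so this evicts the oldest entry.
                cache.pop(next(iter(cache)))
            cache[key] = run_query(question)
        return cache[key]

    return cached


# Stand-in for the real engine; with byokg-rag you might wrap
# lambda q: query_engine.query(query=q) instead.
calls = []


def fake_engine(question):
    calls.append(question)
    return f"context for: {question}"


ask = make_cached_query(fake_engine)
ask("Who directed Inception?")
ask("who directed   inception?")  # same question after normalization
print(len(calls))  # number of times the underlying engine was actually called
```

Cached answers go stale when the graph changes, so a real deployment would also need an invalidation rule, such as clearing the cache after each data load.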
Long-Running Queries
- Queries with many iterations may time out
- Reduce iteration counts or exploration parameters
- Consider breaking complex queries into simpler sub-queries
Troubleshooting
Query returns empty results
Possible causes:
- Entity linking failed to find relevant entities
- Graph schema doesn’t match query expectations
- Insufficient iterations for multi-hop reasoning
Solutions:
- Enable debug logging to see entity linking results
- Verify graph schema matches your domain
- Increase iteration count for complex queries
- Try direct query linking: direct_query_linking=True
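For the debug-logging step, a minimal sketch follows. It assumes the library emits its diagnostics through Python's standard logging module, and the logger name "graphrag_toolkit" is a guess at the top-level package name; adjust it to match the package path you actually import byokg-rag from.

```python
import logging

# Keep third-party libraries at INFO so the output stays readable.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Assumed logger name (hypothetical): raise only this package to DEBUG.
logging.getLogger("graphrag_toolkit").setLevel(logging.DEBUG)
```

Raising the level for one package rather than the root logger avoids flooding the output with low-level HTTP and AWS SDK messages.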
LLM timeout errors
Possible causes:
- Input exceeds token limits
- Network connectivity issues
- Bedrock service throttling
Solutions:
- Reduce max_input_tokens parameter
- Reduce graph context size by limiting retrievers
- Implement exponential backoff retry logic
- Check AWS service health dashboard
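The exponential backoff suggestion above could look roughly like the following. This is a generic sketch, not code from the library: call_with_backoff, its parameters, and flaky_query are hypothetical, and the wrapper retries any callable you hand it.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn(), retrying failures with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter spreads out retries from concurrent clients.
            sleep(random.uniform(0, delay))


# Demo with a stand-in that fails twice, then succeeds. The sleep is stubbed out
# so the demo finishes instantly; omit the sleep argument in real use.
attempts = {"count": 0}


def flaky_query():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated throttling")
    return "context"


print(call_with_backoff(flaky_query, sleep=lambda seconds: None))
```

In practice, narrow retry_on to the errors that are worth retrying (for example, throttling or timeout exceptions raised by your AWS client) so that permission or validation errors fail immediately instead of being retried.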
High latency
Possible causes:
- Too many LLM calls (high iteration counts)
- Large graph traversals
- Slow entity linking
Solutions:
- Reduce iteration counts
- Limit retriever parameters (max_num_relations, max_num_entities)
- Use faster index types (fuzzy string vs. dense)
- Use faster LLM models (Claude Haiku)
Memory errors with local graph store
Possible causes:
- Graph too large for available RAM
- Too many triplets retained in context
Solutions:
- Use Neptune Analytics or Neptune Database instead
- Reduce max_num_triplets parameter
- Filter graph data to relevant subset
- Increase available memory
For additional support, refer to the example notebooks or consult the AWS documentation for Neptune and Bedrock services.