Skip to content

Hybrid Deployment

Hybrid deployment enables flexible deployment: high-throughput LLM inference via SageMaker and Bedrock, and cost-effective local development using containerized graph/vector stores.

The lexical-graph library depends on three backend systems: a graph store, a vector store, and a foundation model provider. The graph store enables storage and querying of a lexical graph built from unstructured, text-based sources. The vector store contains one or more indexes with embeddings for selected graph elements, which help identify starting points for graph queries. The foundation model provider hosts the Large Language Models (LLMs) used for extraction and embedding.

The library provides built-in support for:

This hybrid configuration enables flexible deployment: high-throughput LLM inference via SageMaker and Bedrock, and cost-effective local development using containerized graph/vector stores.

The lexical-graph library implements two high-level processes: indexing and querying. The indexing process ingests and extracts information from unstuctured, text-based source documents and then builds a graph and accompanying vector indexes. The query process retrieves content from the graph and vector indexes, and then supplies this content as context to an LLM to answer a user question.

Indexing is split into two pipeline stages: Extract and Build.

The Extract stage runs locally using Docker:

  • Loads and chunks documents

  • Performs two LLM-based extraction steps:

    • Proposition extraction: Converts chunked text into well-formed statements
    • Topic/entity/fact extraction: Identifies relations and concepts
  • Stores the extracted results in an AWS S3 bucket, serving as the transport medium between stages

The Build stage remains unchanged.

Indexing