LISA Langfuse Integration Guide

Langfuse is an open-source tool that provides tracing, evaluations (evals), and metrics for LLM applications. This guide provides step-by-step instructions for integrating Langfuse with LISA to enable tracing and monitoring of LLM interactions.

Prerequisites

First, ensure that Langfuse is properly deployed, either by self-hosting Langfuse or by using Langfuse Cloud.

Initial Langfuse Setup

After deploying Langfuse, complete the following setup steps:

  1. Navigate to the Langfuse web interface
  2. Create a user account
  3. Create an organization
  4. Create a project within the organization
  5. Generate API credentials (Public Key and Secret Key)

Retain the generated API credentials as they will be required for the LISA configuration in Step 1.
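
To confirm the credentials are valid before configuring LISA, you can run a quick check with the Langfuse Python SDK. This is a minimal sketch; the host and key values are placeholders for the values generated above.

python
from langfuse import Langfuse

# Placeholders: substitute the endpoint and keys generated in the Langfuse UI.
client = Langfuse(
    host="https://<your-langfuse-endpoint>",
    public_key="pk-<your-public-key>",
    secret_key="sk-<your-secret-key>",
)

# Returns True when the host and credentials are valid.
print(client.auth_check())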

Configuration Steps

Step 1: Update LiteLLM Configuration

Configure the LiteLLM integration by updating the litellmConfig section in the config-base.yaml file:

yaml
litellmConfig:
  callbacks: ["langfuse"]
  LANGFUSE_HOST: {YOUR_LANGFUSE_ENDPOINT}
  LANGFUSE_PUBLIC_KEY: pk-{YOUR_PUBLIC_KEY}
  LANGFUSE_SECRET_KEY: sk-{YOUR_SECRET_KEY}

Replace the placeholder values with the appropriate Langfuse endpoint and API credentials obtained during the initial setup.
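
For reference, the callbacks entry corresponds to LiteLLM's Langfuse callback, which reads the same three environment variables. A rough equivalent when using the LiteLLM Python SDK directly (shown only for illustration, not part of the LISA changes) looks like this:

python
import os

import litellm

# LiteLLM's Langfuse callback reads these environment variables.
os.environ["LANGFUSE_HOST"] = "https://<your-langfuse-endpoint>"
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-<your-public-key>"
os.environ["LANGFUSE_SECRET_KEY"] = "sk-<your-secret-key>"

# Roughly equivalent to callbacks: ["langfuse"] in config-base.yaml.
litellm.success_callback = ["langfuse"]

response = litellm.completion(
    model="<your-model-id>",
    messages=[{"role": "user", "content": "Hello"}],
)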

Step 2: Update Python Dependencies

Add the Langfuse Python package to the LISA Serve REST API dependencies by appending the following line to the requirements.txt file in lib/serve/rest-api/src/:

langfuse>=3.0.0
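
After rebuilding the REST API image, you can optionally confirm that the installed package satisfies the pin. A minimal check:

python
# Optional sanity check: confirm the installed langfuse version is 3.x or newer.
from importlib.metadata import version

print(version("langfuse"))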

Step 3: Configure Environment Variables

Update Configuration Schema

Modify the LiteLLMConfig schema in lib/schema/configSchema.ts to include the Langfuse environment variables:

typescript
const LiteLLMConfig = z.object({
    db_key: z.string().refine(
        ...
    ),
    general_settings: z.any().optional(),
    litellm_settings: z.any().optional(),
    router_settings: z.any().optional(),
    environment_variables: z.any().optional(),
    callbacks: z.array(z.string()).optional().describe('LiteLLM callbacks to enable (e.g., ["langfuse"])'),
    LANGFUSE_HOST: z.string().optional().describe('Langfuse host URL (e.g., https://us.cloud.langfuse.com)'),
    LANGFUSE_PUBLIC_KEY: z.string().optional().describe('Langfuse public key for authentication'),
    LANGFUSE_SECRET_KEY: z.string().optional().describe('Langfuse secret key for authentication'),
})

Update FastAPI Container Environment

Modify lib/api-base/fastApiContainer.ts to include the Langfuse environment variables in the baseEnvironment:

typescript
if (config.litellmConfig.LANGFUSE_HOST) {
    baseEnvironment.LANGFUSE_HOST = config.litellmConfig.LANGFUSE_HOST;
}
if (config.litellmConfig.LANGFUSE_PUBLIC_KEY) {
    baseEnvironment.LANGFUSE_PUBLIC_KEY = config.litellmConfig.LANGFUSE_PUBLIC_KEY;
}
if (config.litellmConfig.LANGFUSE_SECRET_KEY) {
    baseEnvironment.LANGFUSE_SECRET_KEY = config.litellmConfig.LANGFUSE_SECRET_KEY;
}
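
With these variables injected into the container environment, the Langfuse Python SDK can pick up its configuration without further code changes; when no arguments are supplied, it falls back to the LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST environment variables. A minimal sketch:

python
import os

from langfuse import Langfuse

# Fail fast if the expected variables were not injected into the container.
for var in ("LANGFUSE_HOST", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY"):
    assert os.environ.get(var), f"{var} is not set in the container environment"

# With no arguments, the client reads the variables above from the environment.
client = Langfuse()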

Step 4: Implement Langfuse

Decorate the LiteLLM Passthrough Function

The implementation in this section is designed for non-streamed responses. Streaming responses require additional implementation considerations for properly handling `StreamingResponse` objects.

Add the Langfuse observe decorator to the litellm_passthrough function located at lib/serve/rest-api/src/api/endpoints/v2/litellm_passthrough.py:

python
from langfuse import observe

@router.api_route("/{api_path:path}", methods=["GET", "POST", "OPTIONS", "PUT", "PATCH", "DELETE", "HEAD"])
@observe()
async def litellm_passthrough(request: Request, api_path: str) -> Response:
    ...

The decorator order is significant: because Python applies decorators bottom-up, the `@observe()` decorator must be positioned directly above the function definition so that the route decorator registers the already-instrumented function.

Verification and Monitoring

Deployment and Testing

After completing all configuration changes, redeploy LISA. Once the deployment is successful, interactions with models via LISA will automatically send telemetry data to Langfuse.
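
To generate a first trace, send a chat completion request through the LISA passthrough endpoint. The URL, token, and model ID below are placeholders for your deployment; only the chat/completions path segment matches the route handled by litellm_passthrough.

python
import requests

# Hypothetical placeholders for your deployment.
LISA_SERVE_URL = "https://<your-lisa-serve-endpoint>"
API_TOKEN = "<your-api-token>"

payload = {
    "model": "<your-model-id>",
    "messages": [{"role": "user", "content": "Hello"}],
}

# The passthrough forwards chat/completions to LiteLLM; the @observe decorator
# records the call as a trace in Langfuse.
response = requests.post(
    f"{LISA_SERVE_URL}/chat/completions",
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=60,
)
print(response.status_code)
print(response.json()["choices"][0]["message"]["content"])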

Access the Langfuse tracing interface to view collected traces.

Trace Structure

Non-Streamed Response Traces

Non-streamed responses generate traces with the following structure:

Input:

json
{
    "args": [],
    "kwargs": {
        "api_path": "chat/completions",
        "request": {}
    }
}

Output:

json
{
    "status_code": 200,
    "background": null,
    "body": {
        "id": "chatcmpl-4c0f3c88-12d9-4e4d-a38e-9c83fabfa9df",
        "created": 1758912436,
        "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
        "object": "chat.completion",
        "choices": [
            {
                "finish_reason": "stop",
                "index": 0,
                "message": {
                    "content": "Hi there! I notice we're just exchanging greetings. Is there something I can help you with today? I'm happy to assist with questions, provide information on a topic you're interested in, or help with a specific task. Just let me know what you need!",
                    "role": "assistant"
                }
            }
        ],
        "usage": {
            "completion_tokens": 60,
            "prompt_tokens": 1324,
            "total_tokens": 1384,
            "prompt_tokens_details": {
                "cached_tokens": 0
            },
            "cache_creation_input_tokens": 0,
            "cache_read_input_tokens": 0
        }
    },
    "raw_headers": [
        [
            "content-length",
            "674"
        ],
        [
            "content-type",
            "application/json"
        ]
    ]
}

Streamed Response Traces

Streamed responses have the same input structure as non-streamed responses, but the output differs.

By default, the trace output for a streamed response is:

text
<starlette.responses.StreamingResponse object at 0x7fd112d31b50>

Advanced Streaming Implementation

For enhanced observability into streaming responses, implement a custom transformation function for the generate_response function located at lib/serve/rest-api/src/api/endpoints/v2/litellm_passthrough.py:

python
def custom_transformer(lines):
    # Called once with all chunks yielded by the generator; join them into a
    # single string for the trace output.
    return "".join(f"{line}\n\n" for line in lines)

@observe(transform_to_string=custom_transformer)
def generate_response(iterator: Iterator[Union[str, bytes]]) -> Iterator[str]:
    ...

The transform_to_string parameter controls how the chunks yielded by the generator are combined into the single string that Langfuse records as the trace output.

Additional Resources

Langfuse Docs MCP

For enhanced troubleshooting and integration support, Langfuse provides a Docs MCP server that exposes its documentation to MCP-compatible tools and assistants.

Reference Documentation

LiteLLM Integration: see the Langfuse logging and callback documentation at https://docs.litellm.ai

Langfuse Documentation: https://langfuse.com/docs