Dimensions

dimensions

Classes defining different components of cost

A "dimension" is one aspect of the pricing for a deployed Foundation Model (FM) or application. Multiple factors usually contribute to the total cost of the FMs under test. For example, an API may charge separate rates for input and output tokens, while a self-managed cloud deployment may carry per-hour charges for compute as well as network bandwidth charges.

Here we provide implementations of some common cost dimensions, and define base classes you can use to build custom cost dimensions for your own cost models.

EndpointTime `dataclass`

EndpointTime(price_per_hour, granularity_secs=1)

Bases: RunCostDimensionBase

Run cost dimension to model per-deployment-hour costs with a flat charge rate

Parameters:

Name	Type	Description	Default
`price_per_hour`	`float`	Charge applied for each hour that a test run takes	required
`granularity_secs`	`float`	Minimum number of seconds billed per increment	`1`
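To make the two parameters concrete, here is a minimal sketch of the arithmetic a flat per-hour rate with a billing increment implies. The helper `endpoint_time_cost` is hypothetical and not part of the library, and rounding the run duration up to a whole number of increments is an assumption about how `granularity_secs` is applied.

```python
import math


def endpoint_time_cost(
    duration_secs: float, price_per_hour: float, granularity_secs: float = 1
) -> float:
    """Hypothetical helper (not a library function): flat hourly rate, billed in increments.

    Assumption: partial increments are rounded up to the next whole increment.
    """
    billed_increments = math.ceil(duration_secs / granularity_secs)
    billed_secs = billed_increments * granularity_secs
    return billed_secs * price_per_hour / 3600


# A 90.5 second run billed per minute is charged as 2 increments (120 seconds)
cost = endpoint_time_cost(90.5, price_per_hour=3.60, granularity_secs=60)
print(f"{cost:.4f}")
```

With the default `granularity_secs=1`, the same run would be billed as 91 seconds under this rounding assumption.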

IRequestCostDimension

Bases: ISerializable

Interface for one dimension of a per-request cost model

Per-request cost components are calculated independently for each invocation in a test run, and can be used to model factors like cost-per-request, cost-per-input-token, cost-per-request-duration, etc. They're typically most relevant for serverless deployments like Amazon Bedrock, or for estimating duration-based execution costs for AWS Lambda functions.

calculate `async`

calculate(response)

Calculate (this component of) the cost for an individual request/response

Source code in llmeter/callbacks/cost/dimensions.py

async def calculate(self, response: InvocationResponse) -> Optional[float]:
    """Calculate (this component of) the cost for an individual request/response"""
    ...
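As an illustration of the interface, the sketch below models a duration-based charge of the kind mentioned above. `RequestDuration` and `FakeResponse` are hypothetical names introduced here; `FakeResponse` stands in for `InvocationResponse`, and the `time_to_last_token` field is an assumption about what the real response object exposes. The serialization methods required by `ISerializable` are omitted for brevity.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResponse:
    """Stand-in for InvocationResponse (field name is an assumption)."""

    time_to_last_token: Optional[float] = None


@dataclass
class RequestDuration:
    """Hypothetical dimension: charge for each second a request takes."""

    price_per_second: float

    async def calculate(self, response) -> Optional[float]:
        # Return None when the duration is unknown, rather than guessing a cost
        if response.time_to_last_token is None:
            return None
        return response.time_to_last_token * self.price_per_second


dim = RequestDuration(price_per_second=0.0001)
cost = asyncio.run(dim.calculate(FakeResponse(time_to_last_token=2.5)))
missing = asyncio.run(dim.calculate(FakeResponse()))
```

Returning `None` for a response with no recorded duration matches the `Optional[float]` return type of `calculate()`.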

IRunCostDimension

Bases: ISerializable

Interface for one dimension of a per-Run cost model

Per-run cost components are notified before the start of a test run via before_run_start(), and then asked to calculate() at the end of the run. They're most relevant for deployments based on provisioned infrastructure, like Amazon SageMaker, where factors like a (request-independent) cost-per-hour are important.

before_run_start `async`

before_run_start(run_config)

Function called to notify the cost component that a test run is about to start

This method is called before the test run starts, and can be used to perform any initialization or setup required for the cost component. In general, we assume a dimension instance may be re-used for multiple test runs, but only one run at a time: Meaning before_run_start() should not be called again before calculate() is called for the previous run.

Source code in llmeter/callbacks/cost/dimensions.py

async def before_run_start(self, run_config: _RunConfig) -> None:
    """Function called to notify the cost component that a test run is about to start

    This method is called before the test run starts, and can be used to perform any
    initialization or setup required for the cost component. In general, we assume a dimension
    instance may be re-used for multiple test runs, but only one run at a time: Meaning
    `before_run_start()` should not be called again before `calculate()` is called for the
    previous run.
    """
    ...

calculate `async`

calculate(result)

Calculate (this component of) the cost for a completed test run

Dimensions that depend on before_run_start being called to return an accurate result should throw an error if this was not done. Dimensions that only need calculate() should silently ignore if before_run_start was not called.

Source code in llmeter/callbacks/cost/dimensions.py

async def calculate(self, result: Result) -> Optional[float]:
    """Calculate (this component of) the cost for a completed test run

    Dimensions that depend on `before_run_start` being called to return an accurate result
    should throw an error if this was not done. Dimensions that only need `calculate()` should
    silently ignore if `before_run_start` was not called.
    """
    ...
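The contract described above (raise an error if a required `before_run_start()` call was missed, and support reuse across runs one at a time) could be sketched as follows. `WallClockTime` is a hypothetical dimension written for illustration, not a library class; it ignores the contents of `run_config` and `result`, and omits the serialization methods.

```python
import asyncio
import time
from typing import Optional


class WallClockTime:
    """Hypothetical per-run dimension that times the run itself."""

    def __init__(self, price_per_hour: float):
        self.price_per_hour = price_per_hour
        self._started_at: Optional[float] = None

    async def before_run_start(self, run_config) -> None:
        # Record the start time for the run that is about to begin
        self._started_at = time.monotonic()

    async def calculate(self, result) -> Optional[float]:
        # This dimension cannot work without the start notification
        if self._started_at is None:
            raise RuntimeError("before_run_start() was not called for this run")
        elapsed_secs = time.monotonic() - self._started_at
        self._started_at = None  # reset so the instance can be reused
        return elapsed_secs * self.price_per_hour / 3600


async def demo():
    dim = WallClockTime(price_per_hour=1.0)
    try:
        await dim.calculate(result=None)
        raised = False
    except RuntimeError:
        raised = True
    await dim.before_run_start(run_config=None)
    cost = await dim.calculate(result=None)
    return raised, cost


raised, cost = asyncio.run(demo())
```

Resetting the stored start time inside `calculate()` is one way to make a second `calculate()` without a new `before_run_start()` fail loudly instead of returning a stale figure.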

InputTokens `dataclass`

InputTokens(price_per_million, granularity=1)

Bases: RequestCostDimensionBase

Request cost dimension to model per-input-token costs with a flat charge rate

Parameters:

Name	Type	Description	Default
`price_per_million`	`float`	Charge applied per million input (prompt) tokens sent to the Foundation Model	required
`granularity`	`int`	Minimum number of tokens billed per increment	`1`

OutputTokens `dataclass`

OutputTokens(price_per_million, granularity=1)

Bases: RequestCostDimensionBase

Request cost dimension to model per-output-token costs with a flat charge rate

Parameters:

Name	Type	Description	Default
`price_per_million`	`float`	Charge applied per million output (completion) tokens generated by the Foundation Model	required
`granularity`	`int`	Minimum number of tokens billed per increment	`1`
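The following sketch shows how separate input and output token rates might combine into the cost of a single request. The helper `token_cost` is hypothetical, the prices are made-up figures, and rounding token counts up to a multiple of `granularity` is an assumption rather than confirmed library behavior.

```python
import math


def token_cost(num_tokens: int, price_per_million: float, granularity: int = 1) -> float:
    """Hypothetical helper (not a library function): flat per-million-token rate.

    Assumption: token counts are rounded up to the next multiple of `granularity`.
    """
    billed_tokens = math.ceil(num_tokens / granularity) * granularity
    return billed_tokens * price_per_million / 1_000_000


# One request with 1,250 prompt tokens and 400 completion tokens (illustrative prices)
input_cost = token_cost(1_250, price_per_million=3.0)
output_cost = token_cost(400, price_per_million=15.0)
total_cost = input_cost + output_cost

# The same prompt billed in blocks of 1,000 tokens is charged as 2,000 tokens
coarse_input_cost = token_cost(1_250, price_per_million=3.0, granularity=1_000)
```

Keeping the two rates as separate dimensions means each can be reported on its own as well as summed into a total.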

RequestCostDimensionBase

Bases: ABC, JSONableBase

Base class for implementing per-request cost model dimensions

This class provides a default implementation of ISerializable and sets up an abstract method for calculate(). It's fine if you don't want to derive from it directly - just be sure to fully implement IRequestCostDimension!

calculate `abstractmethod` `async`

calculate(response)

Calculate (this component of) the cost for an individual request/response

Source code in llmeter/callbacks/cost/dimensions.py

@abstractmethod
async def calculate(self, response: InvocationResponse) -> Optional[float]:
    """Calculate (this component of) the cost for an individual request/response"""
    raise NotImplementedError(
        "Children of RequestCostDimensionBase must implement `calculate()`! At: %s"
        % (self.__class__,)
    )
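To show what deriving from an abstract base like this involves, here is a self-contained sketch. `RequestDimensionStandIn` is a hypothetical stand-in that mirrors only the abstract `calculate()` method; it does not reproduce the serialization support that the real base class provides. `FlatFee` and `Incomplete` are likewise illustrative names.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional


class RequestDimensionStandIn(ABC):
    """Stand-in for RequestCostDimensionBase (serialization omitted)."""

    @abstractmethod
    async def calculate(self, response) -> Optional[float]:
        raise NotImplementedError


class FlatFee(RequestDimensionStandIn):
    """Hypothetical dimension: the same charge for every request."""

    def __init__(self, price_per_request: float):
        self.price_per_request = price_per_request

    async def calculate(self, response) -> Optional[float]:
        return self.price_per_request


class Incomplete(RequestDimensionStandIn):
    """Forgets to implement calculate(), so it cannot be instantiated."""


try:
    Incomplete()
    instantiated = True
except TypeError:
    # ABC refuses to instantiate classes with unimplemented abstract methods
    instantiated = False

fee = asyncio.run(FlatFee(0.002).calculate(response=None))
```

Because the method is marked abstract, a subclass that omits `calculate()` fails at instantiation time, before any test run starts.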

RunCostDimensionBase

Bases: ABC, JSONableBase

Base class for implementing per-run cost model dimensions

This class provides a default implementation of ISerializable, a default empty before_run_start implementation, and abstract methods for the other requirements of the IRunCostDimension protocol. It's fine if you don't want to derive from it directly - just make sure you fully implement IRunCostDimension!

before_run_start `async`

before_run_start(run_config)

Function called to notify the cost component that a test run is about to start

This method is called before the test run starts, and can be used to perform any initialization or setup required for the cost component. In general, we assume a dimension instance may be re-used for multiple test runs, but only one run at a time: Meaning before_run_start() should not be called again before calculate() is called for the previous run.

The default implementation is a pass.

Source code in llmeter/callbacks/cost/dimensions.py

async def before_run_start(self, run_config: _RunConfig) -> None:
    """Function called to notify the cost component that a test run is about to start

    This method is called before the test run starts, and can be used to perform any
    initialization or setup required for the cost component.  In general, we assume a dimension
    instance may be re-used for multiple test runs, but only one run at a time: Meaning
    `before_run_start()` should not be called again before `calculate()` is called for the
    previous run.

    The default implementation is a pass.
    """
    pass

calculate `abstractmethod` `async`

calculate(result)

Calculate (this component of) the cost for a completed test run

Dimensions that depend on before_run_start being called to return an accurate result should throw an error if this was not done. Dimensions that only need calculate() should silently ignore if before_run_start was not called.

Source code in llmeter/callbacks/cost/dimensions.py

@abstractmethod
async def calculate(self, result: Result) -> Optional[float]:
    """Calculate (this component of) the cost for a completed test run

    Dimensions that depend on `before_run_start` being called to return an accurate result
    should throw an error if this was not done. Dimensions that only need `calculate()` should
    silently ignore if `before_run_start` was not called.
    """
    raise NotImplementedError(
        "Children of RunCostDimensionBase must implement `calculate()`! At: %s"
        % (self.__class__,)
    )
