
Dimensions

dimensions

Classes defining different components of cost

A "dimension" is one aspect of the pricing for a deployed Foundation Model (FM) or application. Multiple factors usually contribute to the total cost of the FMs under test. For example, an API may charge separate rates for input and output tokens, while a self-managed cloud deployment may carry per-hour compute charges as well as network bandwidth charges.

Here we provide implementations of some common cost dimensions, and define base classes you can use to implement custom cost dimensions for your own cost models.
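To make the idea of independent dimensions concrete, here is a minimal sketch of how per-dimension costs might be combined into a total. The `total_cost` helper and the dimension names are hypothetical and not part of this module; treating a `None` result as "no contribution" is also an assumption of this sketch.

```python
from typing import Dict, Optional


def total_cost(dimension_costs: Dict[str, Optional[float]]) -> float:
    """Sum per-dimension costs, skipping dimensions that reported None.

    Hypothetical helper: skipping None is an assumption of this sketch.
    """
    return sum(cost for cost in dimension_costs.values() if cost is not None)


# Hypothetical figures for illustration only, not real prices
example = {
    "input_tokens": 0.003,
    "output_tokens": 0.0075,
    "endpoint_time": None,  # e.g. a dimension that could not be calculated
}
print(total_cost(example))
```

Each entry stays separately inspectable, so you can see which dimension dominates the total.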

EndpointTime dataclass

EndpointTime(price_per_hour, granularity_secs=1)

Bases: RunCostDimensionBase

Run cost dimension to model per-deployment-hour costs with a flat charge rate

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| price_per_hour | float | Charge applied per hour a test run takes | required |
| granularity_secs | float | Minimum number of seconds billed per increment | 1 |
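The arithmetic behind a flat hourly rate with a billing granularity can be sketched as below. This is an illustration, not the library's implementation: the `endpoint_time_cost` function is hypothetical, and rounding the duration up to the next whole increment is an assumption about how `granularity_secs` is applied.

```python
import math


def endpoint_time_cost(
    duration_secs: float, price_per_hour: float, granularity_secs: float = 1
) -> float:
    """Hypothetical sketch of a per-hour charge billed in fixed increments."""
    # Assumption: partial increments are rounded UP to the next full increment.
    # Note: float division can misround durations that are not exactly
    # representable, so prefer integer-friendly granularities.
    billed_secs = math.ceil(duration_secs / granularity_secs) * granularity_secs
    return billed_secs / 3600 * price_per_hour


# A 90.5 second run at a hypothetical rate of 3.60 per hour
per_second = endpoint_time_cost(90.5, price_per_hour=3.6)  # bills 91 seconds
per_minute = endpoint_time_cost(90.5, price_per_hour=3.6, granularity_secs=60)  # bills 120 seconds
```

Under these assumptions, a coarser granularity never lowers the cost and can raise it noticeably for short runs.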

IRequestCostDimension

Bases: ISerializable

Interface for one dimension of a per-request cost model

Per-request cost components are calculated independently for each invocation in a test run, and can be used to model factors like cost-per-request, cost-per-input-token, cost-per-request-duration, etc. They're typically most relevant for serverless deployments like Amazon Bedrock, or for estimating duration-based execution costs for AWS Lambda functions.

calculate async

calculate(response)

Calculate (this component of) the cost for an individual request/response

Source code in llmeter/callbacks/cost/dimensions.py
async def calculate(self, response: InvocationResponse) -> Optional[float]:
    """Calculate (this component of) the cost for an individual request/response"""
    ...
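As a rough illustration of the interface, the sketch below models a duration-based per-request charge. It runs without the library installed, so `FakeResponse` is a hypothetical stand-in for `InvocationResponse`, and the `time_to_last_token` attribute name is an assumption. The serialization methods required by `ISerializable` are omitted here, so this is not a complete implementation of the interface.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResponse:
    """Hypothetical stand-in for InvocationResponse (attribute name assumed)."""

    time_to_last_token: Optional[float]


@dataclass
class RequestDuration:
    """Hypothetical per-request dimension: flat charge per second of duration."""

    price_per_second: float

    async def calculate(self, response) -> Optional[float]:
        duration = getattr(response, "time_to_last_token", None)
        if duration is None:
            # No duration recorded, so this dimension cannot be calculated
            return None
        return duration * self.price_per_second


cost = asyncio.run(RequestDuration(price_per_second=0.5).calculate(FakeResponse(2.0)))
```

Returning `None` rather than `0` keeps "cost unknown" distinguishable from "no cost".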

IRunCostDimension

Bases: ISerializable

Interface for one dimension of a per-run cost model

Per-run cost components are notified before the start of a test run via before_run_start(), and then requested to calculate() at the end of the run. They're most relevant for provisioned-infrastructure based deployments like Amazon SageMaker, where factors like a (request-independent) cost-per-hour are important.

before_run_start async

before_run_start(run_config)

Function called to notify the cost component that a test run is about to start

This method is called before the test run starts, and can be used to perform any initialization or setup required for the cost component. In general, we assume a dimension instance may be re-used for multiple test runs, but only one run at a time: Meaning before_run_start() should not be called again before calculate() is called for the previous run.

Source code in llmeter/callbacks/cost/dimensions.py
async def before_run_start(self, run_config: _RunConfig) -> None:
    """Function called to notify the cost component that a test run is about to start

    This method is called before the test run starts, and can be used to perform any
    initialization or setup required for the cost component. In general, we assume a dimension
    instance may be re-used for multiple test runs, but only one run at a time: Meaning
    `before_run_start()` should not be called again before `calculate()` is called for the
    previous run.
    """
    ...

calculate async

calculate(result)

Calculate (this component of) the cost for a completed test run

Dimensions that depend on before_run_start being called to return an accurate result should throw an error if this was not done. Dimensions that only need calculate() should silently ignore if before_run_start was not called.

Source code in llmeter/callbacks/cost/dimensions.py
async def calculate(self, result: Result) -> Optional[float]:
    """Calculate (this component of) the cost for a completed test run

    Dimensions that depend on `before_run_start` being called to return an accurate result
    should throw an error if this was not done. Dimensions that only need `calculate()` should
    silently ignore if `before_run_start` was not called.
    """
    ...
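The lifecycle described above (notify before the run, calculate afterwards, raise an error if a required notification was missed) can be sketched as follows. `WallClockCost` is a hypothetical class that does not inherit from the library's types, it ignores the contents of `run_config` and `result`, and it omits the `ISerializable` methods.

```python
import asyncio
import time
from typing import Optional


class WallClockCost:
    """Hypothetical per-run dimension charging for elapsed wall-clock time."""

    def __init__(self, price_per_hour: float):
        self.price_per_hour = price_per_hour
        self._started_at: Optional[float] = None

    async def before_run_start(self, run_config) -> None:
        # Record when the run started; run_config is unused in this sketch
        self._started_at = time.perf_counter()

    async def calculate(self, result) -> Optional[float]:
        if self._started_at is None:
            # This dimension depends on before_run_start(), so fail loudly
            raise RuntimeError("before_run_start() was not called for this run")
        elapsed_secs = time.perf_counter() - self._started_at
        self._started_at = None  # Reset so the instance can serve the next run
        return elapsed_secs / 3600 * self.price_per_hour
```

Resetting the start time in `calculate()` matches the one-run-at-a-time assumption: a second `calculate()` without a fresh `before_run_start()` raises instead of silently reusing stale state.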

InputTokens dataclass

InputTokens(price_per_million, granularity=1)

Bases: RequestCostDimensionBase

Request cost dimension to model per-input-token costs with a flat charge rate

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| price_per_million | float | Charge applied per million input (prompt) tokens sent to the Foundation Model | required |
| granularity | int | Minimum number of tokens billed per increment | 1 |
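A minimal sketch of the per-token arithmetic, assuming token counts are rounded up to the next multiple of `granularity` before the per-million rate is applied. The `input_token_cost` function is hypothetical and the rounding direction is an assumption, not confirmed library behavior.

```python
def input_token_cost(
    num_tokens: int, price_per_million: float, granularity: int = 1
) -> float:
    """Hypothetical sketch of a flat per-million-token charge."""
    # Integer ceiling division: round up to a whole number of increments
    increments = -(-num_tokens // granularity)
    billed_tokens = increments * granularity
    return billed_tokens * price_per_million / 1_000_000


# 1,234 prompt tokens at a hypothetical rate of 3.00 per million tokens
exact = input_token_cost(1234, price_per_million=3.0)  # bills 1,234 tokens
blocks = input_token_cost(1234, price_per_million=3.0, granularity=1000)  # bills 2,000 tokens
```

The same arithmetic would apply to output (completion) tokens with their own rate.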

OutputTokens dataclass

OutputTokens(price_per_million, granularity=1)

Bases: RequestCostDimensionBase

Request cost dimension to model per-output-token costs with a flat charge rate

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| price_per_million | float | Charge applied per million output (completion) tokens generated by the Foundation Model | required |
| granularity | int | Minimum number of tokens billed per increment | 1 |

RequestCostDimensionBase

Bases: ABC, JSONableBase

Base class for implementing per-request cost model dimensions

This class provides a default implementation of ISerializable and sets up an abstract method for calculate(). It's fine if you don't want to derive from it directly - just be sure to fully implement IRequestCostDimension!

calculate abstractmethod async

calculate(response)

Calculate (this component of) the cost for an individual request/response

Source code in llmeter/callbacks/cost/dimensions.py
@abstractmethod
async def calculate(self, response: InvocationResponse) -> Optional[float]:
    """Calculate (this component of) the cost for an individual request/response"""
    raise NotImplementedError(
        "Children of RequestCostDimensionBase must implement `calculate()`! At: %s"
        % (self.__class__,)
    )
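The effect of marking `calculate()` as abstract can be illustrated with a stand-in base class. `StandInRequestBase` below is hypothetical and only mimics the `ABC` plus `@abstractmethod` pattern shown above; it does not include the serialization behavior of the real base class.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional


class StandInRequestBase(ABC):
    """Hypothetical stand-in mimicking the abstract calculate() pattern."""

    @abstractmethod
    async def calculate(self, response) -> Optional[float]:
        raise NotImplementedError


class Incomplete(StandInRequestBase):
    """Forgets to implement calculate(), so it cannot be instantiated."""


class FlatFee(StandInRequestBase):
    """Charges the same hypothetical fee for every request."""

    def __init__(self, fee: float):
        self.fee = fee

    async def calculate(self, response) -> Optional[float]:
        return self.fee


try:
    Incomplete()
    instantiated = True
except TypeError:
    # Python refuses to instantiate a class with unimplemented abstract methods
    instantiated = False

fee = asyncio.run(FlatFee(0.01).calculate(response=None))
```

Because of this, a missing `calculate()` is caught when the dimension is constructed rather than partway through a test run.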

RunCostDimensionBase

Bases: ABC, JSONableBase

Base class for implementing per-run cost model dimensions

This class provides a default implementation of ISerializable, a default empty before_run_start implementation, and abstract methods for the other requirements of the IRunCostDimension protocol. It's fine if you don't want to derive from it directly - just make sure you fully implement IRunCostDimension!

before_run_start async

before_run_start(run_config)

Function called to notify the cost component that a test run is about to start

This method is called before the test run starts, and can be used to perform any initialization or setup required for the cost component. In general, we assume a dimension instance may be re-used for multiple test runs, but only one run at a time: Meaning before_run_start() should not be called again before calculate() is called for the previous run.

The default implementation is a pass.

Source code in llmeter/callbacks/cost/dimensions.py
async def before_run_start(self, run_config: _RunConfig) -> None:
    """Function called to notify the cost component that a test run is about to start

    This method is called before the test run starts, and can be used to perform any
    initialization or setup required for the cost component.  In general, we assume a dimension
    instance may be re-used for multiple test runs, but only one run at a time: Meaning
    `before_run_start()` should not be called again before `calculate()` is called for the
    previous run.

    The default implementation is a pass.
    """
    pass

calculate abstractmethod async

calculate(result)

Calculate (this component of) the cost for a completed test run

Dimensions that depend on before_run_start being called to return an accurate result should throw an error if this was not done. Dimensions that only need calculate() should silently ignore if before_run_start was not called.

Source code in llmeter/callbacks/cost/dimensions.py
@abstractmethod
async def calculate(self, result: Result) -> Optional[float]:
    """Calculate (this component of) the cost for a completed test run

    Dimensions that depend on `before_run_start` being called to return an accurate result
    should throw an error if this was not done. Dimensions that only need `calculate()` should
    silently ignore if `before_run_start` was not called.
    """
    raise NotImplementedError(
        "Children of RunCostDimensionBase must implement `calculate()`! At: %s"
        % (self.__class__,)
    )