Dimensions
dimensions
Classes defining different components of cost
A "dimension" is one aspect of the pricing for a deployed Foundation Model (FM) or application. Several factors usually contribute to the total cost of the FMs under test. For example, an API may charge separate rates for input and output tokens, while a self-managed cloud deployment may carry per-hour compute charges as well as network bandwidth charges.
Here we provide implementations for some common cost dimensions, and define base classes you can use to bring customized cost dimensions for your own cost models.
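As a rough illustration of how independent dimensions add up to a total, the sketch below sums two per-token charges in plain Python. It does not use the library itself. The dimension names, rates, and usage figures are all made up for the example and are not real prices.

```python
# Hypothetical per-million-token rates for two cost dimensions (not real prices).
rates_per_million = {"input_tokens": 3.0, "output_tokens": 15.0}

# Hypothetical usage observed for one request.
usage = {"input_tokens": 2_000, "output_tokens": 500}

# Each dimension is priced independently of the others...
cost_by_dimension = {
    name: usage[name] * rates_per_million[name] / 1_000_000 for name in usage
}

# ...and the total cost is the sum over all dimensions.
total_cost = sum(cost_by_dimension.values())
print(cost_by_dimension, total_cost)
```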
EndpointTime
dataclass
EndpointTime(price_per_hour, granularity_secs=1)
Bases: RunCostDimensionBase
Run cost dimension to model per-deployment-hour costs with a flat charge rate
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| price_per_hour | float | Charge applied per hour a test run takes | required |
| granularity_secs | float | Minimum number of seconds billed per increment (Default 1) | 1 |
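The arithmetic implied by these two parameters can be sketched as follows. This is a standalone illustration rather than the library's code. The helper name `endpoint_time_cost` is hypothetical, and rounding the duration up to whole billing increments is an assumption about how `granularity_secs` is applied.

```python
import math


def endpoint_time_cost(
    duration_secs: float, price_per_hour: float, granularity_secs: float = 1
) -> float:
    """Flat hourly charge over a run duration (hypothetical helper).

    Assumption: the duration is rounded UP to a whole number of billing
    increments of `granularity_secs` before the hourly rate is applied.
    """
    billed_secs = math.ceil(duration_secs / granularity_secs) * granularity_secs
    return billed_secs * price_per_hour / 3600


# A 90.5 second run billed in 60 second increments is charged as 120 seconds.
cost = endpoint_time_cost(90.5, price_per_hour=3.6, granularity_secs=60)
print(round(cost, 6))
```

With per-second granularity (the default), the same run would be billed as 91 seconds instead.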
IRequestCostDimension
Bases: ISerializable
Interface for one dimension of a per-request cost model
Per-request cost components are calculated independently for each invocation in a test run, and can be used to model factors like cost-per-request, cost-per-input-token, cost-per-request-duration, etc. They're typically most relevant for serverless deployments like Amazon Bedrock, or for estimating duration-based execution costs for AWS Lambda functions.
calculate
async
calculate(response)
Calculate (this component of) the cost for an individual request/response
Source code in llmeter/callbacks/cost/dimensions.py
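A minimal sketch of a custom per-request dimension is shown below, assuming only what is stated above: the dimension exposes an async `calculate(response)` that returns the cost of one request. The class `FlatPerRequest` and the helper `total_cost` are hypothetical. The sketch deliberately does not import the library, and it omits the serialization methods that ISerializable would also require.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FlatPerRequest:
    """Hypothetical dimension: a fixed charge for every request."""

    price_per_request: float

    async def calculate(self, response) -> float:
        # The response is ignored here because the charge does not depend on it.
        return self.price_per_request


async def total_cost(dimension, responses) -> float:
    # Each response is priced independently, then the results are summed.
    costs = await asyncio.gather(*(dimension.calculate(r) for r in responses))
    return sum(costs)


# Three placeholder responses stand in for real invocation results.
total = asyncio.run(total_cost(FlatPerRequest(0.002), [object(), object(), object()]))
print(round(total, 6))
```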
IRunCostDimension
Bases: ISerializable
Interface for one dimension of a per-Run cost model
Per-run cost components are notified before the start of a test run via before_run_start(),
and then requested to calculate() at the end of the run. They're most relevant for
provisioned-infrastructure based deployments like Amazon SageMaker, where factors like a
(request-independent) cost-per-hour are important.
before_run_start
async
before_run_start(run_config)
Function called to notify the cost component that a test run is about to start
This method is called before the test run starts, and can be used to perform any
initialization or setup required for the cost component. A dimension instance may be
reused across multiple test runs, but only for one run at a time: before_run_start()
should not be called again until calculate() has been called for the previous run.
Source code in llmeter/callbacks/cost/dimensions.py
calculate
async
calculate(result)
Calculate (this component of) the cost for a completed test run
Dimensions that depend on before_run_start() having been called to return an accurate
result should raise an error if it was not. Dimensions that only need calculate() should
silently tolerate before_run_start() not having been called.
Source code in llmeter/callbacks/cost/dimensions.py
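The lifecycle described above (notify with before_run_start(), then calculate() once the run ends, raising an error if the notification was skipped) can be sketched as below. `WallClockCharge` is a hypothetical class written for this example, not part of the library. It measures elapsed time itself rather than reading anything from `run_config` or `result`, whose contents are not assumed here, and it omits the ISerializable methods.

```python
import asyncio
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WallClockCharge:
    """Hypothetical per-run dimension charging for wall-clock time."""

    price_per_hour: float
    _started_at: Optional[float] = field(default=None, init=False, repr=False)

    async def before_run_start(self, run_config) -> None:
        # Remember when the run started; this state is needed by calculate().
        self._started_at = time.monotonic()

    async def calculate(self, result) -> float:
        # This dimension depends on before_run_start(), so fail loudly without it.
        if self._started_at is None:
            raise RuntimeError("before_run_start() was not called for this run")
        elapsed_secs = time.monotonic() - self._started_at
        self._started_at = None  # Reset so the instance can serve the next run.
        return elapsed_secs * self.price_per_hour / 3600


async def demo() -> float:
    dim = WallClockCharge(price_per_hour=1.0)
    await dim.before_run_start(run_config=None)
    await asyncio.sleep(0.01)  # Stand-in for the actual test run.
    return await dim.calculate(result=None)


demo_cost = asyncio.run(demo())
print(demo_cost)
```

Resetting the stored start time in calculate() is one way to respect the "one run at a time" assumption: a second calculate() without a new before_run_start() fails instead of reusing stale state.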
InputTokens
dataclass
InputTokens(price_per_million, granularity=1)
Bases: RequestCostDimensionBase
Request cost dimension to model per-input-token costs with a flat charge rate
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| price_per_million | float | Charge applied per million input (prompt) tokens sent to the Foundation Model | required |
| granularity | int | Minimum number of tokens billed per increment (Default 1) | 1 |
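To make the interaction between the per-million rate and the billing increment concrete, here is a standalone sketch. The helper name `token_cost` is hypothetical, and rounding the token count up to a whole number of increments is an assumption about how `granularity` is applied, not a statement of the library's behavior.

```python
import math


def token_cost(num_tokens: int, price_per_million: float, granularity: int = 1) -> float:
    """Flat per-token charge quoted per million tokens (hypothetical helper).

    Assumption: the token count is rounded UP to a whole number of billing
    increments of `granularity` tokens before the rate is applied.
    """
    billed_tokens = math.ceil(num_tokens / granularity) * granularity
    return billed_tokens * price_per_million / 1_000_000


# 1,234 prompt tokens billed in 1,000 token increments are charged as 2,000 tokens.
cost_coarse = token_cost(1_234, price_per_million=3.0, granularity=1_000)
# With the default granularity of 1, exactly 1,234 tokens are charged.
cost_exact = token_cost(1_234, price_per_million=3.0)
print(round(cost_coarse, 6), round(cost_exact, 6))
```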
OutputTokens
dataclass
OutputTokens(price_per_million, granularity=1)
Bases: RequestCostDimensionBase
Request cost dimension to model per-output-token costs with a flat charge rate
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| price_per_million | float | Charge applied per million output (completion) tokens generated by the Foundation Model | required |
| granularity | int | Minimum number of tokens billed per increment (Default 1) | 1 |
RequestCostDimensionBase
Bases: ABC, JSONableBase
Base class for implementing per-request cost model dimensions
This class provides a default implementation of ISerializable and sets up an abstract method
for calculate(). It's fine if you don't want to derive from it directly - just be sure to
fully implement IRequestCostDimension!
calculate
abstractmethod
async
calculate(response)
Calculate (this component of) the cost for an individual request/response
Source code in llmeter/callbacks/cost/dimensions.py
RunCostDimensionBase
Bases: ABC, JSONableBase
Base class for implementing per-run cost model dimensions
This class provides a default implementation of ISerializable, a default empty
before_run_start implementation, and abstract methods for the other requirements of the
IRunCostDimension protocol. It's fine if you don't want to derive from it directly - just
make sure you fully implement IRunCostDimension!
before_run_start
async
before_run_start(run_config)
Function called to notify the cost component that a test run is about to start
This method is called before the test run starts, and can be used to perform any
initialization or setup required for the cost component. A dimension instance may be
reused across multiple test runs, but only for one run at a time: before_run_start()
should not be called again until calculate() has been called for the previous run.
The default implementation does nothing.
Source code in llmeter/callbacks/cost/dimensions.py
calculate
abstractmethod
async
calculate(result)
Calculate (this component of) the cost for a completed test run
Dimensions that depend on before_run_start() having been called to return an accurate
result should raise an error if it was not. Dimensions that only need calculate() should
silently tolerate before_run_start() not having been called.
Source code in llmeter/callbacks/cost/dimensions.py