sagemaker
Utilities for automating cost modelling on Amazon SageMaker endpoints
SageMakerRTEndpointCompute
dataclass
SageMakerRTEndpointCompute(instance_type, instance_count=1, price_per_hour=None, region=None)
Bases: RunCostDimensionBase
Run cost dimension to estimate Amazon SageMaker real-time endpoint compute charges
NOTE: To auto-discover price_per_hour from instance type, you'll need IAM permissions to call
the pricing:GetProducts API. For more information, see:
https://docs.aws.amazon.com/service-authorization/latest/reference/list_awspricelist.html
Calculated rates do not currently include EBS storage volume costs. See
SageMakerRTEndpointStorage for estimating this.
See .fetch_sm_hosting_od_price() for more details on how price is looked up when not
explicitly provided. This lookup is provided on a best-effort basis, and may not accurately
reflect all possible scenarios.
Use .from_endpoint() to automatically discover compute cost dimensions from a deployed
SageMaker real-time inference endpoint.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| instance_type | str | Amazon SageMaker instance type, e.g. 'ml.g5.4xlarge' | required |
| instance_count | float | Number of instances running (default: 1) | 1 |
| price_per_hour | float \| None | Price per hour per instance (default: attempt to fetch from pricing API) | None |
| region | str \| None | AWS region where the endpoint is running (default: current region) | None |
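As a rough illustration of what a compute cost dimension like this estimates, the sketch below multiplies an hourly price by the instance count and the run duration. It is a stand-alone approximation, not LLMeter's implementation: the function name `estimate_compute_cost` and the price used are hypothetical placeholders.

```python
def estimate_compute_cost(
    price_per_hour: float, instance_count: float, duration_seconds: float
) -> float:
    """Estimate compute charges as hourly price x instances x hours run.

    Hypothetical helper for illustration only (not part of LLMeter).
    """
    hours = duration_seconds / 3600
    return price_per_hour * instance_count * hours


# Placeholder price, NOT a real AWS rate: 2 instances running for 30 minutes
cost = estimate_compute_cost(price_per_hour=2.0, instance_count=2, duration_seconds=1800)
print(f"Estimated compute cost: ${cost:.2f}")
```

As the notes above state, an estimate like this excludes EBS storage, which is modelled separately.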
fetch_sm_hosting_od_price
staticmethod
fetch_sm_hosting_od_price(instance_type, region)
Look up USD hourly rates for on-demand SM hosting instances from the AWS Price List API
This function assumes:
- You're using standard "on-demand" pricing - no savings plans or private pricing
- No free tier or volume discounts are applicable to this usage
- Your pricing is provided in USD
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| instance_type | str | Amazon SageMaker instance type, e.g. 'ml.g5.4xlarge' | required |
| region | str | AWS region where the endpoint is running, e.g. 'us-east-1' | required |
Returns:
| Name | Type | Description |
|---|---|---|
| price_per_hour | float | The standard, on-demand hourly price for the given instance type in USD |
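To make the assumptions above more concrete, here is a minimal sketch of pulling a USD on-demand rate out of a single AWS Price List API product entry. The API returns each product as a JSON string with an `OnDemand` terms section; the helper name `extract_usd_hourly_price`, the SKU-like keys, and the price in the sample are invented for illustration, and this is not necessarily how `fetch_sm_hosting_od_price` is implemented.

```python
import json


def extract_usd_hourly_price(price_list_item: str) -> float:
    """Read the USD rate from the first on-demand term of one product entry.

    Hypothetical helper: assumes a single on-demand term with a single
    price dimension, with no free tier or volume discount tiers.
    """
    product = json.loads(price_list_item)
    on_demand_terms = product["terms"]["OnDemand"]
    term = next(iter(on_demand_terms.values()))
    dimension = next(iter(term["priceDimensions"].values()))
    return float(dimension["pricePerUnit"]["USD"])


# Illustrative payload only: the keys and the price are placeholders
sample_item = json.dumps(
    {
        "terms": {
            "OnDemand": {
                "SKU.TERM": {
                    "priceDimensions": {
                        "SKU.TERM.RATE": {
                            "unit": "Hrs",
                            "pricePerUnit": {"USD": "1.2300000000"},
                        }
                    }
                }
            }
        }
    }
)
print(extract_usd_hourly_price(sample_item))
```

If a product listed several price dimensions (for example, tiered pricing), taking the first one as this sketch does would no longer be safe, which is one reason a lookup of this kind can only be best-effort.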
Source code in llmeter/callbacks/cost/providers/sagemaker.py
from_endpoint
classmethod
from_endpoint(endpoint_name, region=None)
Configure SageMakerRTEndpointCompute dimension(s) from an existing SageMaker endpoint
NOTE: You'll need IAM permissions to sagemaker:DescribeEndpoint and
sagemaker:DescribeEndpointConfig to use this method.
This function returns a dictionary rather than a single SageMakerRTEndpointCompute,
because different "variants" deployed behind an endpoint may be backed by clusters of
different instance types, and therefore need separate dimensions.
Instance counts are retrieved at the time this method is called, so the result may not reflect later changes if auto-scaling is enabled on your endpoint.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| endpoint_name | str | Name of the SageMaker endpoint | required |
| region | str \| None | AWS region where the endpoint is running (default: current region) | None |
Returns:
| Name | Type | Description |
|---|---|---|
| run_dims | dict[str, SageMakerRTEndpointCompute] | A dictionary containing one or more dimensions, to pass to your |
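The per-variant behaviour described above can be sketched with plain dictionaries shaped like the `DescribeEndpoint` and `DescribeEndpointConfig` responses. This is an assumption-laden illustration rather than LLMeter's code: the helper name `variants_to_compute_specs` and the sample values are hypothetical, and the real method returns SageMakerRTEndpointCompute objects rather than dicts.

```python
def variants_to_compute_specs(endpoint_desc: dict, config_desc: dict) -> dict[str, dict]:
    """Build one compute spec per production variant, keyed by variant name.

    Hypothetical helper: instance types come from the endpoint config,
    while point-in-time instance counts come from the endpoint description.
    """
    instance_types = {
        v["VariantName"]: v["InstanceType"] for v in config_desc["ProductionVariants"]
    }
    specs = {}
    for variant in endpoint_desc["ProductionVariants"]:
        name = variant["VariantName"]
        specs[name] = {
            "instance_type": instance_types[name],
            "instance_count": variant["CurrentInstanceCount"],
        }
    return specs


# Sample responses trimmed to the fields this sketch uses (values are made up)
endpoint_desc = {
    "ProductionVariants": [
        {"VariantName": "A", "CurrentInstanceCount": 2},
        {"VariantName": "B", "CurrentInstanceCount": 1},
    ]
}
config_desc = {
    "ProductionVariants": [
        {"VariantName": "A", "InstanceType": "ml.g5.4xlarge"},
        {"VariantName": "B", "InstanceType": "ml.g5.xlarge"},
    ]
}
specs = variants_to_compute_specs(endpoint_desc, config_desc)
print(specs)
```

Because the counts are a snapshot, re-running a lookup like this after a scaling event could yield different values.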
Source code in llmeter/callbacks/cost/providers/sagemaker.py
SageMakerRTEndpointStorage
dataclass
SageMakerRTEndpointStorage(gbs_provisioned, price_per_gb_hour=None, region=None)
Bases: RunCostDimensionBase
Run cost dimension to estimate EBS charges for Amazon SageMaker real-time endpoints
NOTE: To auto-discover price_per_gb_hour, you'll need IAM permissions to call the
pricing:GetProducts API. For more information, see:
https://docs.aws.amazon.com/service-authorization/latest/reference/list_awspricelist.html
See .fetch_sm_hosting_ebs_price() for more details on how price is looked up when not
explicitly provided. This lookup is provided on a best-effort basis, and may not accurately
reflect all possible scenarios.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gbs_provisioned | float | Total size of provisioned EBS volume(s) for the endpoint, in gigabytes | required |
| price_per_gb_hour | float \| None | Price per hour per GB (default: attempt to fetch from pricing API) | None |
| region | str \| None | AWS region where the endpoint is running (default: current region) | None |
fetch_sm_hosting_ebs_price
staticmethod
fetch_sm_hosting_ebs_price(region)
Look up hourly USD rate for SageMaker hosting EBS storage from the AWS Price List API
The API lists this rate per month, so we assume a 30-day month (30 days × 24 hours = 720 hours) to convert it to an hourly rate.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| region | str | AWS region where the endpoint is running, e.g. 'us-east-1' | required |
Returns:
| Name | Type | Description |
|---|---|---|
| price_per_gb_hour | float | The standard, on-demand price per GB-hour in USD |
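The monthly-to-hourly conversion described above amounts to dividing by 720. A minimal sketch follows, where the constant and function names are hypothetical and the monthly price is a placeholder rather than a real AWS rate:

```python
HOURS_PER_MONTH = 30 * 24  # assumes a 30-day month, i.e. 720 hours


def monthly_to_hourly(price_per_gb_month: float) -> float:
    """Convert a per-GB monthly storage price into a per-GB hourly price."""
    return price_per_gb_month / HOURS_PER_MONTH


# Placeholder monthly price per GB, for illustration only
hourly = monthly_to_hourly(0.72)
print(f"{hourly:.6f} USD per GB-hour")
```

In months with 31 days, an hourly rate derived this way slightly overstates the effective hourly charge, and in February it understates it.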
Source code in llmeter/callbacks/cost/providers/sagemaker.py
from_endpoint
classmethod
from_endpoint(endpoint_name, region=None, merge_variants=False)
Configure SageMakerRTEndpointStorage dimension(s) from an existing SageMaker endpoint
NOTE: You'll need IAM permissions to sagemaker:DescribeEndpoint and
sagemaker:DescribeEndpointConfig to use this method.
This function returns a dictionary rather than a single SageMakerRTEndpointStorage,
because different "variants" deployed behind an endpoint may be reported separately, if
more than one are present.
Instance counts are retrieved at the time this method is called, so the result may not reflect later changes if auto-scaling is enabled on your endpoint.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| endpoint_name | str | Name of the SageMaker endpoint | required |
| region | str \| None | AWS region where the endpoint is running (default: current region) | None |
| merge_variants | bool | Set | False |
Returns:
| Name | Type | Description |
|---|---|---|
| run_dims | dict[str, SageMakerRTEndpointStorage] | A dictionary containing zero or more dimensions, to pass to your |
Source code in llmeter/callbacks/cost/providers/sagemaker.py
cost_model_from_sagemaker_realtime_endpoint
cost_model_from_sagemaker_realtime_endpoint(endpoint_name, region=None)
Automatically infer an LLMeter CostModel from a deployed SageMaker real-time endpoint
This method builds a basic cost-estimation model for SageMaker real-time inference endpoints that includes compute and EBS storage costs but excludes data transfer costs. Standard on-demand pricing is used, without accounting for private pricing, tiers, savings plans, or similar arrangements.
NOTE: You'll need IAM permissions to pricing:GetProducts, sagemaker:DescribeEndpoint, and
sagemaker:DescribeEndpointConfig to use this method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| endpoint_name | str | Name of the deployed SageMaker endpoint | required |
| region | str \| None | AWS region where the endpoint is running (default: current region) | None |
Returns:
| Name | Type | Description |
|---|---|---|
| cost_model | CostModel | A |
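To show how compute and storage dimensions might add up to a single run estimate, the sketch below sums per-dimension hourly rates over a run's duration. It is a stand-alone approximation under stated assumptions, not the CostModel API: the function name `estimate_run_cost`, the dimension names, and all rates are hypothetical.

```python
def estimate_run_cost(hourly_rates: dict[str, float], duration_seconds: float) -> dict[str, float]:
    """Scale each dimension's hourly rate by the run duration and add a total.

    Hypothetical helper for illustration; data transfer is not included.
    """
    hours = duration_seconds / 3600
    costs = {name: rate * hours for name, rate in hourly_rates.items()}
    costs["total"] = sum(costs.values())
    return costs


# Placeholder rates, NOT real AWS prices:
# compute = 2 instances x $2.00/hour, storage = 100 GB x $0.001/GB-hour
hourly_rates = {"compute": 2 * 2.0, "storage": 100 * 0.001}
costs = estimate_run_cost(hourly_rates, duration_seconds=900)
print(costs)
```

Keeping the dimensions separate until the final sum makes it easy to see how much of an estimate comes from compute versus storage.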
Source code in llmeter/callbacks/cost/providers/sagemaker.py