Runner (dataclass)

Runner(endpoint=None, output_path=None, tokenizer=None, clients=1, n_requests=None, payload=None, run_name=None, run_description=None, timeout=60, callbacks=None, disable_per_client_progress_bar=True, disable_clients_progress_bar=True)

Bases: _RunConfig
Run one or more LLM test sets using a shared base configuration.

First create a Runner with the base configuration for your test(s), then call .run() with optional run-specific overrides. This pattern lets you group related runs together when organizing experiments (such as ramping load tests) that span more than one Run in total.

All attributes of this class may be left unset (you may choose to set them only at the Run level), but some are mandatory at either the Runner or the individual-run level, as described below.
Attributes:

| Name | Type | Description |
|---|---|---|
| endpoint | Endpoint \| dict \| None | The LLM endpoint to be tested. Must be set at either the Runner or specific Run level. |
| output_path | PathLike \| str \| None | The (cloud or local) base folder under which run outputs and configurations should be stored. By default, outputs will not be saved to file. |
| tokenizer | Tokenizer \| Any \| None | Optional tokenizer used to estimate input and output token counts for endpoints that don't report exact information. |
| clients | int | The number of concurrent clients to use for sending requests. Defaults to 1. |
| n_requests | int \| None | The number of LLM invocations to generate per client. |
| payload | dict \| list[dict] \| PathLike \| str \| None | The request data to send to the endpoint under test. You can provide a single JSON payload (dict), a list of payloads (list[dict]), or a path to one or more JSON/JSON-Lines files to be loaded. |
| run_name | str \| None | Name to use for a specific test Run. This is ignored if set at the Runner level, and should instead be set in run(). |
| run_description | str \| None | A natural-language description for the test Run. Can be set either at the Runner level (in which case the same description is shared across all Runs) or individually in run(). |
| timeout | int \| float | The maximum time (in seconds) to wait for each response from the endpoint. Defaults to 60 seconds. |
| callbacks | list[Callback] \| None | Optional callbacks to enable during the test Run. |
| disable_per_client_progress_bar | bool | Set True to disable the per-client progress bar. Defaults to True. |
| disable_clients_progress_bar | bool | Set True to disable the overall clients progress bar. Defaults to True. |
add_callback
add_callback(callback)
Add a callback to the runner's list of callbacks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
callback
|
Callback
|
The callback to be added. |
required |
Source code in llmeter/runner.py
677 678 679 680 681 682 683 684 685 686 687 | |
run (async)

run(*, endpoint=None, output_path=None, tokenizer=None, clients=None, n_requests=None, payload=None, run_name=None, run_description=None, timeout=None, callbacks=None, disable_per_client_progress_bar=None, disable_clients_progress_bar=None)

Run a test against an LLM endpoint.

This method tests the performance of the endpoint by sending multiple concurrent requests with the given payload(s). It measures the total time taken to complete the test, generates invocations for the given payload(s), and optionally saves the results and metrics.

For arguments that are not specified, the Runner's attributes are used by default.
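The fallback rule in the paragraph above can be sketched as a small merge function (resolve and its argument names are illustrative, not LLMeter's internals): any run() argument left as None resolves to the Runner attribute of the same name, and endpoint must end up set at one level or the other.

```python
def resolve(runner_attrs: dict, run_overrides: dict) -> dict:
    """Merge per-run overrides over runner-level defaults (sketch only)."""
    merged = {
        key: run_overrides[key]
        if run_overrides.get(key) is not None
        else value
        for key, value in runner_attrs.items()
    }
    # endpoint is mandatory at either the Runner or the Run level.
    if merged.get("endpoint") is None:
        raise ValueError("endpoint must be set at the Runner or the Run level")
    return merged


base = {"endpoint": None, "clients": 1, "timeout": 60}
print(resolve(base, {"endpoint": "my-endpoint", "timeout": 30}))
# {'endpoint': 'my-endpoint', 'clients': 1, 'timeout': 30}
```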
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| endpoint | Endpoint \| dict \| None | The LLM endpoint to be tested. Must be set at either the Runner or specific Run level. | None |
| output_path | PathLike \| str \| None | The (cloud or local) base folder under which run outputs and configurations should be stored. | None |
| tokenizer | Tokenizer \| Any \| None | Optional tokenizer used to estimate input and output token counts for endpoints that don't report exact information. | None |
| clients | int | The number of concurrent clients to use for sending requests. | None |
| n_requests | int \| None | The number of LLM invocations to generate per client. | None |
| payload | dict \| list[dict] \| PathLike \| str \| None | The request data to send to the endpoint under test. You can provide a single JSON payload (dict), a list of payloads (list[dict]), or a path to one or more JSON/JSON-Lines files to be loaded. | None |
| run_name | str \| None | Name to use for a specific test Run. By default, runs are named with the date and time they're requested. | None |
| run_description | str \| None | A natural-language description for the test Run. | None |
| timeout | int \| float | The maximum time (in seconds) to wait for each response from the endpoint. | None |
| callbacks | list[Callback] \| None | Optional callbacks to enable during the test Run. | None |
| disable_per_client_progress_bar | bool | Set True to disable the per-client progress bar. | None |
| disable_clients_progress_bar | bool | Set True to disable the overall clients progress bar. | None |
Returns:

| Name | Type | Description |
|---|---|---|
| Result | Result | An object containing the test results, including the generated response texts, total test time, total requests, number of clients, number of requests per client, and other relevant metrics. |
Raises:

| Type | Description |
|---|---|
| Exception | If there's an error during the test execution or if the endpoint cannot be reached. |
Note
- This method uses asyncio for concurrent processing.
- Progress is displayed using tqdm if not disabled.
- Responses are collected and processed asynchronously.
- If an output_path is provided, results are saved to files.
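The concurrency model in the note above can be sketched with plain asyncio (run_test and invoke are illustrative stand-ins, not the implementation in llmeter/runner.py): each of the configured clients issues its n_requests invocations sequentially, and all clients run concurrently via asyncio.gather.

```python
import asyncio


async def invoke(client_id: int, request_id: int) -> str:
    # Placeholder for the real HTTP call to the endpoint under test.
    await asyncio.sleep(0)
    return f"client{client_id}-req{request_id}"


async def run_test(clients: int, n_requests: int) -> list[str]:
    async def one_client(cid: int) -> list[str]:
        # Each client sends its requests sequentially.
        return [await invoke(cid, r) for r in range(n_requests)]

    # All clients run concurrently; responses are collected asynchronously.
    per_client = await asyncio.gather(*(one_client(c) for c in range(clients)))
    return [resp for batch in per_client for resp in batch]


responses = asyncio.run(run_test(clients=2, n_requests=3))
print(len(responses))  # 6
```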
Source code in llmeter/runner.py
process_before_invoke_callbacks (async)

process_before_invoke_callbacks(callbacks, payload)

Process the before_run callbacks for a Run.

This method is expected to be called exactly once after the _Run object is created. Attempting to re-use a _Run object may result in undefined behavior.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| callbacks | list[Callback] | The list of callbacks to process. | required |

Source code in llmeter/runner.py