Prepare Agent for RL

Preparing an Agent for RL training is straightforward. The side-by-side below shows a deployed Bedrock AgentCore agent and the RL-adapted version; all four examples in this repo (math, AppWorld, migration, OfficeBench) follow this pattern.

A math agent that uses Bedrock AgentCore Runtime and the Strands Agents framework.

from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands import Agent
from strands.models import BedrockModel
from strands_tools import calculator

app = BedrockAgentCoreApp()

@app.entrypoint
def invoke_agent(payload):
    model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0")
    agent = Agent(
        model=model,
        tools=[calculator],
        system_prompt="Solve the math problem step by step.",
    )
    user_input = payload.get("prompt")
    response = agent(user_input)
    return response.message["content"][0]["text"]

if __name__ == "__main__":
    app.run()
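
A reward function that subclasses RewardFunction from agentcore_rl_toolkit.

from agentcore_rl_toolkit import RewardFunction

class MyReward(RewardFunction):
    def __call__(self, **kwargs) -> float:
        # Compute a scalar from response_text, ground_truth,
        # test_tracker, repo_dir, or whatever context you need.
        reward = 0.0  # placeholder: replace with your own scoring logic
        return reward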
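
To make the reward skeleton concrete, here is a minimal sketch of an exact-answer reward for a math agent. It assumes the reward is called with `response_text` and `ground_truth` keyword arguments (two of the context names mentioned in the comment above); the class name `ExactAnswerReward` and the "last number in the response" rule are illustrative choices, not part of the toolkit. The small `RewardFunction` base defined here is only a stand-in so the snippet runs on its own; in a real project you would import it from agentcore_rl_toolkit instead.

```python
import re


# Stand-in for agentcore_rl_toolkit.RewardFunction so this sketch is
# self-contained. Assumption: the real base class is also invoked via __call__.
class RewardFunction:
    def __call__(self, **kwargs) -> float:
        raise NotImplementedError


class ExactAnswerReward(RewardFunction):
    """Hypothetical reward: 1.0 if the last number in the response matches."""

    _NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

    def __call__(self, **kwargs) -> float:
        # Assumed kwargs: response_text (agent output), ground_truth (label).
        response_text = kwargs.get("response_text", "")
        ground_truth = kwargs.get("ground_truth")
        if ground_truth is None:
            return 0.0

        # Drop thousands separators, then collect every number in the text.
        numbers = self._NUMBER.findall(response_text.replace(",", ""))
        if not numbers:
            return 0.0

        try:
            target = float(ground_truth)
        except (TypeError, ValueError):
            return 0.0

        # Treat the last number mentioned as the agent's final answer.
        return 1.0 if abs(float(numbers[-1]) - target) < 1e-6 else 0.0
```

A binary reward like this is easy to reason about; partial credit or format checks can be layered on later if the task calls for them.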

Once the code is adapted, deploy with the bedrock-agentcore-starter-toolkit CLI. A minimal deploy:

agentcore configure \
  --entrypoint rl_app.py \
  --name my_rl_agent \
  --requirements-file pyproject.toml \
  --deployment-type container \
  --non-interactive
agentcore deploy --agent my_rl_agent

A successful deploy prints the agent runtime ARN, which is the one thing the training backend needs to reach this agent. Pair it with your S3 bucket name (where rewards will land) and you have the full set of hand-off values for the training backend. For example, they drop directly into SlimeRunner (agentcore_rl_toolkit.backends.slime.SlimeRunner):

SlimeRunner(
    agent_runtime_arn="arn:aws:bedrock-agentcore:us-west-2:...",
    s3_bucket="your-bucket-name",
    ...,
).train(...)
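
Before passing the two hand-off values to a backend, it can help to sanity-check them. The sketch below is one possible way to do that; `handoff_values` is a hypothetical helper, not part of agentcore_rl_toolkit. It assumes the ARN follows the standard `arn:partition:service:region:account:resource` layout with the `bedrock-agentcore` service shown above, and that the backend wants a bare bucket name, so it strips an `s3://` prefix as a convenience.

```python
def handoff_values(agent_runtime_arn: str, s3_bucket: str) -> dict:
    """Validate the ARN and bucket name, then bundle them as keyword arguments.

    Hypothetical helper: the returned keys mirror the SlimeRunner arguments
    shown above (agent_runtime_arn, s3_bucket).
    """
    prefix = "arn:aws:bedrock-agentcore:"
    if not agent_runtime_arn.startswith(prefix):
        raise ValueError(f"expected an ARN starting with {prefix!r}")

    # Standard ARN layout: arn:partition:service:region:account-id:resource
    parts = agent_runtime_arn.split(":", 5)
    if len(parts) < 6 or not parts[3] or not parts[5]:
        raise ValueError("ARN is missing a region or resource segment")

    # Assumption: the backend expects a bare bucket name, not an s3:// URI.
    bucket = s3_bucket.strip()
    if bucket.startswith("s3://"):
        bucket = bucket[len("s3://"):]
    bucket = bucket.strip("/")
    if not bucket or "/" in bucket:
        raise ValueError("expected a bucket name without a key prefix")

    return {"agent_runtime_arn": agent_runtime_arn, "s3_bucket": bucket}
```

The resulting dictionary can be unpacked into the runner's constructor alongside its other arguments, so a copy-paste mistake in the ARN fails fast rather than midway through a training run.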

For a worked end-to-end deploy (IAM role, ECR push, first-time AWS setup), see the Strands Math Agent example. For containerized or advanced deploys (custom Dockerfiles, direct code deployment, non-CLI workflows), see AWS’s Get started with AgentCore Runtime and its custom / without-CLI path.

Pick a training backend: slime (self-hosted) · rllm (managed + self-hosted) · verl (self-hosted).