Strands Math Agent
An RL-trainable GSM8K
math agent using Bedrock AgentCore RL Toolkit. The agent solves
grade-school word problems by invoking a calculator tool
for arithmetic steps and emitting a final answer after ####.
This is the “hello world” for the toolkit — minimal prompt and single tool. Start here.
Quickstart
Section titled “Quickstart”cd examples/strands_math_agentuv venv --python 3.13 && source .venv/bin/activateuv pip install -e .
# Run the RL-adapted app as a local HTTP serverpython rl_app.pyDeploy the same entrypoint to AgentCore Runtime via the
Prepare agent for RL → Deploy
flow, then feed the resulting runtime ARN into
SlimeRunner
to train.
What’s in the example
Section titled “What’s in the example”basic_app.py— deployment-ready reference usingBedrockModel.rl_app.py— RL-adapted version; readsbase_url/model_idfrom_rollout, returns{"rewards": ...}.reward.py—GSM8KReward: extract the####answer, exact-match against ground truth.evaluate.py— batch-evaluate a deployed ACR agent viaRolloutClient.
For local setup (Bedrock credentials, local vLLM, S3 bucket,
sending curl requests), Docker builds, ECR push, IAM setup, and
full deploy instructions, see the
full README on GitHub.