Skip to content

Exercise 3: ReAct from Scratch (no framework)

Corresponds to Stage 3 — Tool Use & Agent Intro Exercise 3.

Why write it from scratch

ReAct (Reasoning + Acting) is the foundational pattern of modern agents:

while not done:
    thought     = LLM reads current context and verbalizes the next step
    action      = LLM calls a tool
    observation = tool result, fed back to the LLM

LangGraph / CrewAI hide this loop from you. Writing it once yourself is what teaches you: - Why the messages array keeps growing - How tool_use_id pairs with tool_result - Why stop_reason is tool_use vs end_turn - Why max_iter is a mandatory safety net

All of that is covered in 70 lines of Python.

How to run

pip install -r requirements.txt
export ANTHROPIC_API_KEY=sk-ant-...
python starter.py

Expected:

❓ Question: Divide 'Taipei population' by 'NYC population', 4 decimal places.
------------------------------------------------------------
[step 0] thought: Let me look up Taipei's population...
           tool: lookup_fact({'query': '台北人口'}) → 2602000
[step 1] thought: Now NYC's...
           tool: lookup_fact({'query': '紐約人口'}) → 8336000
[step 2] thought: Compute the ratio...
           tool: calculator({'expression': '2602000 / 8336000'}) → 0.3121...
[step 3] thought: The answer is 0.3122.
------------------------------------------------------------
✅ Final answer: Taipei / NYC ≈ 0.3122
   Took 4 rounds.
✅ Exercise 3 passed — the ReAct loop chained lookup_fact and calculator on its own.

Validate the logic without spending API credits

python test.py

test.py uses unittest.mock.MagicMock to replace the Anthropic client and feed canned responses, validating your loop logic. Expected:

✅ test_calculator_basic
✅ test_calculator_rejects_eval_injection
✅ test_lookup_fact
✅ test_react_loop_single_tool_call
✅ test_react_loop_multi_step
✅ test_react_loop_respects_max_iter

🎉 All tests passed — your ReAct loop logic is correct.

Program structure walkthrough

Section Lines What it does
tool_calculator ~30-40 Safe calculator (whitelist filter, avoids eval injection)
tool_lookup_fact ~42-50 Fake fact lookup (teaching-only, avoids external API dep)
TOOLS_SPEC ~52-75 Tool schema that the LLM sees
TOOL_IMPL ~77-80 name → callable dispatch table
react_loop ~85-130 Main loop, with max_iter safety, messages accumulation, tool_result wiring

Common pitfalls

  1. Forgetting to append the assistant response to messages — next round the LLM can't see what it just said, leading to infinite loops
  2. Not passing tool_use_id with tool_result — the LLM can't pair results to calls
  3. while True without max_iter — if a tool returns garbage the LLM may call it forever; safety net is mandatory
  4. Unfiltered evaleval(user_input) in calculator = RCE; use a whitelist or ast.literal_eval

Want smarter answers?

Default model is claude-haiku-4-5 (cheapest). Switch to Sonnet:

MODEL=claude-sonnet-4-6 python starter.py

Or change MODEL = ... in starter.py.

Extensions

  • Add more tools — append one entry each to TOOLS_SPEC + TOOL_IMPL
  • Add streaming — swap client.messages.create(...) for with client.messages.stream(...) as s:, print as it goes
  • Add prompt cache — pass cache_control={"type":"ephemeral"} on system= or tools= to save 90% on repeat calls
  • Plug into LangGraph or Pydantic AI to see how frameworks hide these 70 lines