Study Guide: AI Agent Foundations: Planning memory and tool use
Source: https://www.fatskills.com/ai-for-work/chapter/ai-agent-foundations-planning-memory-and-tool-use

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

Planning, Memory, and Tool Use in AI Agents

What This Is

Planning, memory, and tool use are core capabilities that turn AI from a reactive text generator into a proactive, context-aware agent. In real work, this means AI can break down complex tasks, remember past interactions, and use external tools (APIs, databases, calculators) to solve problems, much like a junior teammate who learns from experience and asks for help when needed.

Example: A customer support agent that remembers a user’s past issues, plans a multi-step troubleshooting workflow, and pulls real-time data from a CRM to resolve a ticket without human input.


Key Facts & Principles

  • Planning: The ability to decompose a goal into sub-tasks, sequence them, and adapt if steps fail. Example: An AI agent tasked with "prepare a quarterly report" might first gather data, then analyze trends, draft insights, and finally format the output—checking for errors at each step.
  • Memory (short-term vs. long-term):
    • Short-term memory (working memory): Retains context within a single session (e.g., chat history, intermediate calculations). Example: An agent remembers a user’s product preference from earlier in the conversation.
    • Long-term memory: Stores and retrieves information across sessions (e.g., user profiles, past decisions, or learned patterns). Example: An agent recalls a client’s preferred communication style from a prior interaction.
  • Tool use (function calling): The ability to interact with external tools (APIs, databases, calculators) to fetch data, perform actions, or verify information. Example: An agent uses a weather API to check conditions before suggesting outdoor meeting times.
  • ReAct (Reason + Act): A framework where agents interleave reasoning (planning/thinking) with acting (using tools or generating output), feeding the result of each action back into the next reasoning step. Example: An agent first reasons ("I need to check inventory levels"), then acts (calls an inventory API), then reasons again about what it observed ("Inventory is low; I should notify the supply team").
  • State management: Tracking the "state" of a task (e.g., progress, errors, dependencies) to avoid redundant work or loops. Example: An agent marks a step as "completed" after sending an email to avoid resending it.
  • Error recovery: Detecting failures (e.g., API timeouts, invalid inputs) and replanning. Example: If a database query fails, the agent retries with a fallback method or asks the user for clarification.
  • Context window limits: Every model has a finite context window (e.g., 8k or 32k tokens for the original GPT-4 variants; newer models are often larger), and it serves only as short-term memory. Long-term memory requires external storage (e.g., vector databases). Example: An agent summarizes and stores key points from a long meeting to avoid hitting token limits.
  • Tool selection bias: Agents may over-rely on familiar tools or misjudge when to use them. Example: An agent defaults to a calculator for simple math instead of using its built-in arithmetic, slowing down the process.
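
The ReAct and state-management ideas above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not a production agent: `scripted_policy` is a hypothetical stand-in for a model call, and `check_inventory` and `notify_supply_team` are made-up tools that return canned data.

```python
from typing import Callable

# Hypothetical tools: both return canned data instead of calling a real API.
def check_inventory(item: str) -> int:
    return {"widget": 3}.get(item, 0)

def notify_supply_team(item: str) -> str:
    return f"notified supply team about {item}"

TOOLS: dict[str, Callable] = {
    "check_inventory": check_inventory,
    "notify_supply_team": notify_supply_team,
}

def scripted_policy(state: dict) -> dict:
    """Stand-in for the model's reasoning step: pick the next action from state."""
    if "stock" not in state:
        return {"thought": "I need to check inventory levels.",
                "tool": "check_inventory", "args": {"item": state["item"]}}
    if state["stock"] < 5 and "notify_supply_team" not in state["completed"]:
        return {"thought": "Inventory is low; I should notify the supply team.",
                "tool": "notify_supply_team", "args": {"item": state["item"]}}
    return {"thought": "Nothing left to do.", "tool": None, "args": {}}

def run_agent(item: str, max_steps: int = 5) -> dict:
    state = {"item": item, "completed": [], "trace": []}
    for _ in range(max_steps):                        # hard cap guards against loops
        step = scripted_policy(state)                 # reason
        if step["tool"] is None:
            break
        result = TOOLS[step["tool"]](**step["args"])  # act
        state["trace"].append((step["thought"], step["tool"], result))  # observe
        state["completed"].append(step["tool"])       # state: do not repeat this step
        if step["tool"] == "check_inventory":
            state["stock"] = result
    return state

final = run_agent("widget")
print(final["completed"])  # ['check_inventory', 'notify_supply_team']
```

The `completed` list is what stops the notification from being sent twice, and `max_steps` bounds the loop even if the policy never returns a terminal step.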

Step-by-Step Application

  1. Define the goal and constraints
    • Write a clear, specific objective (e.g., "Generate a sales forecast for Q3 using CRM data and market trends").
    • List constraints (e.g., "Must use the company’s internal API," "Avoid sharing PII").

  2. Break the task into sub-tasks
    • Use a flowchart or bullet points to outline steps (e.g., "1. Fetch CRM data → 2. Clean data → 3. Run forecast model → 4. Generate report").
    • Assign tools to each step (e.g., "Step 1: Use CRM API; Step 3: Use Python script").

  3. Set up memory and state tracking
    • For short-term memory: Use the model’s context window (e.g., include prior steps in the prompt).
    • For long-term memory: Store key data in a database (e.g., user preferences, past decisions) and retrieve it with embeddings or SQL.
    • Track state: Use a status variable (e.g., {"step": "data_cleaning", "status": "in_progress"}).

  4. Implement tool use
    • Define tool schemas (e.g., {"name": "get_crm_data", "description": "Fetches customer data", "parameters": {"date_range": "string"}}).
    • Use function calling (e.g., OpenAI’s tools parameter) or a framework like LangChain to connect tools.
    • Add error handling (e.g., retry logic, fallback tools).

  5. Test and iterate
    • Run the agent on a small task (e.g., forecast for one region) and check:
      • Does it follow the plan?
      • Does it recover from errors?
      • Does it use tools efficiently?
    • Refine prompts, tool descriptions, or memory storage based on failures.

  6. Monitor and log
    • Log agent actions, tool calls, and errors for debugging.
    • Use observability tools (e.g., LangSmith, Weights & Biases) to track performance over time.
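
As a rough sketch of the tool-schema and state-tracking steps above: the schema below assumes the JSON Schema layout used by OpenAI-style function calling, which is more detailed than the shorthand shown in the steps. The body of `get_crm_data`, its canned return value, the `"2024-Q3"` argument, and the `dispatch` helper are all hypothetical, and no API call is made.

```python
import json

# Tool schema in the JSON Schema style used by OpenAI-style function calling.
GET_CRM_DATA_TOOL = {
    "type": "function",
    "function": {
        "name": "get_crm_data",
        "description": "Fetches customer data for a date range.",
        "parameters": {
            "type": "object",
            "properties": {
                "date_range": {"type": "string", "description": "e.g. 2024-Q3"},
            },
            "required": ["date_range"],
        },
    },
}

def get_crm_data(date_range: str) -> dict:
    # Hypothetical implementation: returns canned rows instead of querying a CRM.
    return {"date_range": date_range, "rows": [{"customer": "Acme", "revenue": 1200}]}

REGISTRY = {"get_crm_data": (GET_CRM_DATA_TOOL, get_crm_data)}

def dispatch(name: str, arguments_json: str, state: dict) -> dict:
    """Validate a model-proposed tool call, run it, and update task state."""
    if name not in REGISTRY:
        state.update(status="failed", error=f"unknown tool: {name}")
        return state
    schema, func = REGISTRY[name]
    args = json.loads(arguments_json)  # arguments typically arrive as a JSON string
    required = schema["function"]["parameters"]["required"]
    missing = [key for key in required if key not in args]
    if missing:
        state.update(status="failed", error=f"missing arguments: {missing}")
        return state
    state.update(step=name, status="in_progress")
    state["result"] = func(**args)
    state["status"] = "done"
    return state

state = dispatch("get_crm_data", '{"date_range": "2024-Q3"}',
                 {"step": None, "status": "pending"})
print(state["step"], state["status"])  # get_crm_data done
```

Checking required arguments before running the tool keeps a malformed call from reaching the CRM, and the `state` dict records where the task stands either way.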

Common Mistakes

  • Mistake: Assuming the agent will "just figure out" the plan without explicit decomposition. Correction: Break tasks into sub-tasks in the prompt (e.g., "First, do X. Then, if Y, do Z"). Agents perform better with structured guidance.

  • Mistake: Ignoring memory limits and letting the context window overflow. Correction: Summarize older context or offload it to external storage (e.g., vector DBs). Compress intermediate steps into short summaries before they re-enter the prompt; verbose step-by-step reasoning such as chain-of-thought adds tokens rather than saving them.

  • Mistake: Overloading the agent with too many tools or vague tool descriptions. Correction: As a rule of thumb, keep the tool set small (e.g., 3–5 tools per task) and write specific descriptions (e.g., "Use this tool to fetch real-time stock prices, not historical data").

  • Mistake: Not handling tool failures (e.g., API timeouts, rate limits). Correction: Add retry logic, fallback tools, or user prompts (e.g., "The CRM API is down. Should I use cached data or wait?").

  • Mistake: Treating the agent as a black box without logging. Correction: Log all tool calls, errors, and state changes. Use this to debug and improve the system.
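
The corrections about tool failures and logging can be combined in one small helper. The sketch below assumes a simple exponential backoff and a single fallback; `ToolError`, `flaky_crm_api`, and `cached_crm_data` are hypothetical names used only for demonstration.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

class ToolError(Exception):
    """Raised by a tool when a call fails (timeout, rate limit, bad input)."""

def call_with_retry(primary, fallback, *, retries=3, base_delay=0.01):
    """Try `primary` up to `retries` times with exponential backoff, then `fallback`."""
    for attempt in range(1, retries + 1):
        try:
            result = primary()
            log.info("primary succeeded on attempt %d", attempt)
            return result, "primary"
        except ToolError as exc:
            log.warning("attempt %d failed: %s", attempt, exc)
            time.sleep(base_delay * 2 ** (attempt - 1))  # 0.01s, 0.02s, 0.04s
    log.error("primary exhausted %d retries; using fallback", retries)
    return fallback(), "fallback"

# Hypothetical tools for demonstration only.
def flaky_crm_api():
    raise ToolError("CRM API timed out")

def cached_crm_data():
    return {"source": "cache", "rows": []}

data, used = call_with_retry(flaky_crm_api, cached_crm_data)
print(used)  # fallback
```

Returning which path was used lets the agent tell the user that the answer came from cached data, and every attempt leaves a log line for later debugging.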


Practical Tips

  • Start small: Build a single-step tool use case (e.g., "fetch weather data") before tackling multi-step planning.
  • Use "thinking" prompts: Add phrases like "Before answering, think step-by-step about what tools you need" to encourage ReAct behavior.
  • Validate tool outputs: Always check if a tool’s response is valid (e.g., "Does this CRM data include the required fields?") before proceeding.
  • Combine memory types: Use short-term memory for session context (e.g., chat history) and long-term memory for persistent data (e.g., user profiles).
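
One way to combine the two memory types is a short-term buffer with a size budget that offloads its oldest messages to long-term storage. This is a sketch under loud assumptions: `rough_token_count` uses whitespace-separated words as a crude proxy for tokens, and a plain Python list stands in for a database or vector store.

```python
from collections import deque

def rough_token_count(text: str) -> int:
    # Assumption: whitespace-separated words as a crude proxy for tokens.
    return len(text.split())

class SessionMemory:
    """Short-term buffer with a budget; overflow is offloaded to long-term storage."""

    def __init__(self, budget: int, long_term: list):
        self.budget = budget
        self.messages = deque()
        self.long_term = long_term  # stand-in for a database or vector store

    def add(self, role: str, text: str) -> None:
        self.messages.append((role, text))
        # Evict the oldest messages until the buffer fits the budget again.
        while self.total() > self.budget and len(self.messages) > 1:
            self.long_term.append(self.messages.popleft())

    def total(self) -> int:
        return sum(rough_token_count(text) for _, text in self.messages)

    def prompt_context(self) -> str:
        return "\n".join(f"{role}: {text}" for role, text in self.messages)

archive = []
memory = SessionMemory(budget=8, long_term=archive)
memory.add("user", "I prefer the blue model")   # 5 words
memory.add("agent", "Noted, blue it is")        # 4 more: over budget, oldest is evicted
memory.add("user", "What is the price")         # 4 more: total is 8, fits
print(len(archive), len(memory.messages))       # 1 2
```

A fuller version would summarize evicted messages before archiving them and use the model's real tokenizer for counting.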

Quick Practice Scenario

Scenario: You’re building an AI agent to help sales teams prepare for client meetings. The agent should:
1. Pull the client’s past interactions from the CRM.
2. Check the client’s industry trends using a market data API.
3. Draft a meeting agenda based on the above.

Question: What’s the first step to ensure the agent remembers the client’s name and past interactions across all three tasks?

Answer: Store the client’s name and CRM data in long-term memory (e.g., a database) and retrieve it at the start of each task.

Explanation: Short-term memory (context window) may not persist across all steps, especially if the tasks are complex or involve multiple tool calls.
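
A minimal sketch of that answer, assuming SQLite as the long-term store: the table name, the `remember` and `recall` helpers, and the client record are hypothetical, and the CRM data is canned rather than fetched.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path instead to persist across sessions
conn.execute("CREATE TABLE client_memory (client_id TEXT PRIMARY KEY, data TEXT)")

def remember(client_id: str, data: dict) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO client_memory (client_id, data) VALUES (?, ?)",
        (client_id, json.dumps(data)),
    )
    conn.commit()

def recall(client_id: str) -> dict:
    row = conn.execute(
        "SELECT data FROM client_memory WHERE client_id = ?", (client_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}

# Task 1: pull CRM data (canned here) and store it.
remember("c-001", {"name": "Acme Corp", "past_interactions": ["renewal call"]})

# Later tasks reload the same record instead of relying on the prompt.
def draft_agenda(client_id: str) -> str:
    client = recall(client_id)
    topics = ", ".join(client["past_interactions"])
    return f"Agenda for {client['name']}: follow up on {topics}"

agenda = draft_agenda("c-001")
print(agenda)  # Agenda for Acme Corp: follow up on renewal call
```

Because each task starts with `recall`, the client's name and history survive even if earlier steps have been trimmed from the context window.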


Last-Minute Cram Sheet

  1. Planning = Break goals into sub-tasks; adapt if steps fail.
  2. Short-term memory = Context window (e.g., chat history); resets after session.
  3. Long-term memory = External storage (e.g., vector DBs); persists across sessions.
  4. Tool use = Call APIs/databases; define schemas with specific descriptions.
  5. ReAct = Reason (plan) → Act (use tools) → Repeat.
  6. State management = Track progress (e.g., {"step": "X", "status": "done"}).
  7. Error recovery = Retry, fallback, or ask user if tools fail.
  8. Context window limits = Summarize or offload long conversations.
  9. Tool selection bias = Avoid over-relying on familiar tools; validate use cases.
  10. Log everything = Tool calls, errors, and state changes for debugging.