
Build a Conversational Agent with LangChain, Tools, and Chat History

Hoang Dang Tan Phat (Kane)

Feb 16, 2026

LangChain agents combine LLMs with tools — functions the model can call to fetch data, perform calculations, or interact with external APIs. This post walks through building a conversational agent that maintains chat history, so each response accounts for everything said before. We also add automatic retries with exponential backoff using tenacity to handle transient network failures and API rate limits.

Project Setup

We use uv for dependency management. Initialize the project and add dependencies:

uv init learn-langchain
cd learn-langchain
uv add langchain langchain-anthropic langchain-classic langchain-core python-dotenv requests tenacity wikipedia

Create a .env file with your Anthropic API key:

ANTHROPIC_API_KEY=sk-ant-...

Defining Tools

LangChain tools are plain Python functions decorated with @tool. The docstring becomes the tool description the LLM sees when deciding which tool to call. We stack @retry from tenacity on the network-dependent tools so transient failures are retried automatically with exponential backoff.

Calculator

A safe math evaluator using Python’s ast module — no eval():

from langchain_core.tools import tool

@tool
def calculate(expression: str) -> str:
    """Calculate a mathematical expression."""
    try:
        import ast
        import operator

        operators = {
            ast.Add: operator.add,
            ast.Sub: operator.sub,
            ast.Mult: operator.mul,
            ast.Div: operator.truediv,
            ast.Pow: operator.pow,
            ast.Mod: operator.mod,
        }
        unary_operators = {ast.UAdd: operator.pos, ast.USub: operator.neg}

        def eval_expr(node):
            if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                return node.value
            elif isinstance(node, ast.BinOp) and type(node.op) in operators:
                return operators[type(node.op)](
                    eval_expr(node.left), eval_expr(node.right)
                )
            elif isinstance(node, ast.UnaryOp) and type(node.op) in unary_operators:
                # Support signed numbers like "-5 + 3"
                return unary_operators[type(node.op)](eval_expr(node.operand))
            else:
                raise ValueError("Unsupported expression")

        result = eval_expr(ast.parse(expression, mode="eval").body)
        return str(result)

    except Exception as e:
        return f"Error: {e}"

This walks the AST and rejects anything that isn't a plain numeric expression. No arbitrary code execution.
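
The safety claim is easy to verify by exercising the same AST walk outside LangChain. A stdlib-only sketch (the names here are illustrative, not part of the tool):

```python
import ast
import operator

# Same whitelist idea as the tool: only these binary operators evaluate.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
}

def safe_eval(expression: str):
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("Unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)

print(safe_eval("2 + 2 * 3"))  # 8 -- precedence comes from the parser for free
try:
    safe_eval("__import__('os').system('ls')")
except ValueError as e:
    print(e)  # Unsupported expression -- Call nodes are rejected
```

Anything outside the whitelist (function calls, attribute access, names) hits the `raise` branch, which is what makes this safe to expose to an LLM.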

Wikipedia Search

@tool
@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=5), reraise=True)
def wikipedia_search(query: str) -> str:
    """Search Wikipedia for a query."""
    import wikipedia
    try:
        return wikipedia.summary(query, sentences=2)
    except wikipedia.exceptions.WikipediaException as e:
        # Application-level errors (disambiguation, missing page) become
        # text; network errors propagate so @retry can actually fire.
        return f"Error: {e}"

The @retry decorator retries up to 3 times with exponential backoff (1–5 seconds). Keep in mind that tenacity only retries when an exception escapes the function: a blanket try/except that converts every failure into a return string silently disables the retry, so catch only Wikipedia's application-level errors inside the tool and let network errors propagate.
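
To make the backoff policy concrete, here is a stdlib-only sketch of what stop_after_attempt plus wait_exponential amount to. The helper and the flaky function are ours, purely for illustration:

```python
import time

def retry_with_backoff(fn, attempts=3, min_wait=1.0, max_wait=5.0):
    """Call fn(), retrying on exception with exponentially growing waits."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # attempts exhausted: let the caller see the error
            # Double the wait each attempt, capped at max_wait.
            time.sleep(min(max_wait, min_wait * 2 ** (attempt - 1)))

calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(retry_with_backoff(flaky, min_wait=0.01))  # "ok" on the third attempt
```

Tenacity's real implementation adds jitter options, per-exception filters, and async support, but the control flow is the same.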

Current Time

@tool
def get_current_time() -> str:
    """Get the current date and time (server local time)."""
    from datetime import datetime
    return f"Current time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"

Crypto Price Checker

Fetches live prices from the Binance API. The retry decorator only retries on requests.RequestException — network errors such as timeouts and connection failures — not on application-level errors like an invalid symbol:

@tool
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(min=1, max=5),
    retry=retry_if_exception_type(requests.RequestException),
    reraise=True,
)
def check_token_price(token: str) -> str:
    """Check the price of a cryptocurrency token."""
    token = token.upper()
    url = f"https://api.binance.com/api/v3/ticker/price?symbol={token}USDC"
    response = requests.get(url, timeout=10)
    data = response.json()
    if "price" not in data:
        # Application-level error (e.g. invalid symbol): report, don't retry.
        return f"Error: {data.get('msg', 'unexpected response')}"
    return f"The current price of {token} is {data['price']} USDC"

Note token.upper() — Binance rejects lowercase symbols, and the LLM often passes eth instead of ETH. The requests import lives at the top of the file (see the full source below) rather than inside the function.
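
The tool has to handle two response shapes from the ticker endpoint. The success body matches Binance's documented format; the error body is illustrative of its application-level error shape, and format_price is our name for the formatting step pulled out on its own:

```python
import json

# Success: a flat object with the symbol and a string price.
ok_body = json.loads('{"symbol": "ETHUSDC", "price": "1989.59000000"}')
# Application-level failure: a code and a message, no "price" key.
err_body = json.loads('{"code": -1121, "msg": "Invalid symbol."}')

def format_price(token: str, data: dict) -> str:
    if "price" not in data:
        # No price key means an application-level error: report, don't retry.
        return f"Error: {data.get('msg', 'unexpected response')}"
    return f"The current price of {token} is {data['price']} USDC"

print(format_price("ETH", ok_body))
print(format_price("FOO", err_body))
```

Branching on the presence of "price" keeps invalid symbols out of the retry path, since no exception is raised for them.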

The ConversationAssistant Class

This is the core of the application. It wires together the LLM, tools, prompt template, and chat history.

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_classic.agents import AgentExecutor, create_tool_calling_agent
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type


class ConversationAssistant:
    def __init__(self):
        self.llm = ChatAnthropic(model_name="claude-haiku-4-5-20251001", timeout=10)
        self.tools = [calculate, wikipedia_search, get_current_time, check_token_price]
        self.chat_history: list[HumanMessage | AIMessage] = []

        prompt = ChatPromptTemplate.from_messages(
            [
                ("system", "You are a helpful assistant."),
                MessagesPlaceholder(variable_name="chat_history"),
                ("human", "{input}"),
                MessagesPlaceholder(variable_name="agent_scratchpad"),
            ],
        )

        agent = create_tool_calling_agent(self.llm, self.tools, prompt)
        self.agent_executor = AgentExecutor(
            agent=agent, tools=self.tools, verbose=False
        )

Key Design Decisions

Prompt structure matters. The prompt has four parts in order:

  1. System message — sets the assistant’s behavior
  2. chat_history — all previous turns injected here via MessagesPlaceholder
  3. Human input — the current user message
  4. agent_scratchpad — where the agent tracks its tool-calling reasoning

The order ensures the agent sees full conversation context before the current question, and has space to reason about tool calls after.

Chat history is a list of messages. We store HumanMessage and AIMessage objects. After each exchange, both get appended.

The agent_executor.invoke() call is wrapped in a retrying helper so transient API failures (rate limits, network blips) are retried up to 3 times with exponential backoff up to 10 seconds:

@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
def _invoke_agent(self, user_input: str) -> dict:
    return self.agent_executor.invoke(
        {"input": user_input, "chat_history": self.chat_history}
    )

def chat(self, user_input: str) -> str:
    try:
        result = self._invoke_agent(user_input)
        output = result["output"]

        if isinstance(output, list):
            output = "\n".join(
                block["text"] for block in output if block.get("text")
            )

        self.chat_history.append(HumanMessage(content=user_input))
        self.chat_history.append(AIMessage(content=output))

        return output
    except Exception as e:
        return f"Error: {e}"

The isinstance(output, list) check handles cases where the agent returns structured content blocks instead of a plain string — this happens with some Anthropic model responses.
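
The normalization step can be seen in isolation with mocked block shapes (the block dicts below are illustrative of the structured-content format):

```python
def normalize_output(output) -> str:
    # Anthropic responses sometimes arrive as a list of content blocks
    # rather than a plain string; join the text blocks in that case.
    if isinstance(output, list):
        return "\n".join(
            block["text"] for block in output
            if isinstance(block, dict) and block.get("text")
        )
    return output

print(normalize_output("plain string"))
print(normalize_output([
    {"type": "text", "text": "Hello"},
    {"type": "tool_use", "id": "toolu_123"},  # no "text" key: skipped
]))
```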

Interactive Terminal Loop

The run() method reads input in a loop until the user types quit, exit, or presses Ctrl+C:

def run(self):
    print("\n╔══════════════════════════════════════════════════╗")
    print("║         Conversation Assistant                   ║")
    print("║  Type 'quit' or 'exit' to stop                  ║")
    print("╚══════════════════════════════════════════════════╝\n")

    while True:
        try:
            user_input = input("You > ").strip()
        except (EOFError, KeyboardInterrupt):
            print("\n\nGoodbye!")
            break

        if not user_input:
            continue

        if user_input.lower() in ("quit", "exit"):
            print("\nGoodbye!")
            break

        response = self.chat(user_input)
        print(f"\nAssistant > {response}\n")
        print("-" * 50)

Running It

uv run python main.py
╔══════════════════════════════════════════════════╗
║         Conversation Assistant                   ║
║  Type 'quit' or 'exit' to stop                   ║
╚══════════════════════════════════════════════════╝

You > What is 2 + 2 * 3?

Assistant > The result of 2 + 2 * 3 is 8.

--------------------------------------------------
You > What about the previous result times 5?

Assistant > The previous result was 8, so 8 * 5 = 40.

--------------------------------------------------
You > Check the price of ETH

Assistant > The current price of ETH is 1989.59 USDC.

--------------------------------------------------

The second question works because chat history carries the context — the agent knows “the previous result” refers to 8.

How the Agent Loop Works

When you call agent_executor.invoke(), this happens internally:

  1. The LLM receives the full prompt (system + history + input + scratchpad)
  2. If the LLM decides to call a tool, it returns a tool call message
  3. The executor runs the tool function and appends the result to the scratchpad
  4. The LLM sees the tool result and either calls another tool or returns a final answer
  5. The loop continues until the LLM produces a final response

The same loop as a diagram:

flowchart TD
    A[User Input] --> B[Build Prompt]
    B --> C{LLM}
    C -->|Tool Call| D[Execute Tool]
    D --> E[Append Result to Scratchpad]
    E --> C
    C -->|Final Answer| F[Return Response]
    F --> G[Append to Chat History]

The agent_scratchpad placeholder is where this multi-step reasoning lives. You never populate it yourself — the AgentExecutor manages it.
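
The steps above can be re-created in miniature with a scripted "model". Every name here is illustrative; the real loop lives inside AgentExecutor and calls an actual LLM:

```python
def run_agent_loop(llm_step, tools, user_input):
    """Minimal agent loop: ask the model, run requested tools, feed
    results back via the scratchpad until a final answer comes out."""
    scratchpad = []  # accumulates (tool_name, result) pairs
    while True:
        action = llm_step(user_input, scratchpad)
        if action["type"] == "final":
            return action["text"]
        result = tools[action["tool"]](action["args"])
        scratchpad.append((action["tool"], result))

def scripted_llm(user_input, scratchpad):
    if not scratchpad:  # first pass: request a tool call
        return {"type": "tool", "tool": "calculate", "args": "2 + 2 * 3"}
    # second pass: the tool result is on the scratchpad, so answer
    return {"type": "final", "text": f"The result is {scratchpad[-1][1]}"}

tools = {"calculate": lambda expr: "8"}  # stand-in for the real tool
print(run_agent_loop(scripted_llm, tools, "What is 2 + 2 * 3?"))  # The result is 8
```

The scratchpad is exactly the state the agent_scratchpad placeholder carries between LLM calls.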

Gotchas

Binance API requires uppercase symbols. Always normalize user input with .upper() in tools that hit case-sensitive APIs.

Output format varies. Depending on the model and LangChain version, result["output"] can be a string or a list of content blocks. Always handle both.

Chat history grows unbounded. For production use, consider trimming old messages or summarizing history to stay within token limits. LangChain provides ConversationSummaryMemory and ConversationBufferWindowMemory for this.
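
A minimal window trim looks like this (plain Python; the function name is ours, and LangChain's memory classes do this more completely):

```python
def trim_history(history: list, max_messages: int = 20) -> list:
    """Keep only the most recent messages to bound prompt size."""
    if len(history) <= max_messages:
        return history
    return history[-max_messages:]

history = [f"msg-{i}" for i in range(30)]
trimmed = trim_history(history)
print(len(trimmed), trimmed[0])  # 20 msg-10
```

Trim in pairs (HumanMessage + AIMessage) in practice, so a turn is never split in half.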

Tool docstrings are prompts. The LLM reads them to decide when and how to call each tool. Vague docstrings lead to wrong tool selection. Be specific about what the tool does and what arguments it expects.
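
For example, compare a vague docstring with a specific one (both function bodies and the exact wording are illustrative):

```python
# Vague: the model can't tell what "price" means or what to pass.
def check_price_vague(token: str) -> str:
    """Check the price."""

# Specific: names the data source, the quote currency, and the
# expected argument format.
def check_price_specific(token: str) -> str:
    """Check the current spot price of a cryptocurrency token in USDC
    on Binance. `token` is a ticker symbol such as "ETH" or "BTC"
    (case-insensitive)."""

print(check_price_specific.__doc__.splitlines()[0])
```

The second version tells the model both when the tool applies (crypto spot prices) and how to call it (a bare ticker symbol).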

Retry decorator order matters. When stacking @tool and @retry, place @tool first (outermost) so LangChain sees the decorated function, while @retry wraps the inner implementation. Also note that tenacity only retries when an exception actually escapes the function; a blanket try/except that returns an error string disables the retry entirely. For the check_token_price tool, we only retry on requests.RequestException to avoid retrying on application-level errors like invalid symbols.

Full Source

import requests

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain_classic.agents import AgentExecutor, create_tool_calling_agent
from dotenv import load_dotenv
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type

load_dotenv()


@tool
def calculate(expression: str) -> str:
    """Calculate a mathematical expression."""
    try:
        import ast
        import operator

        operators = {
            ast.Add: operator.add,
            ast.Sub: operator.sub,
            ast.Mult: operator.mul,
            ast.Div: operator.truediv,
            ast.Pow: operator.pow,
            ast.Mod: operator.mod,
        }
        unary_operators = {ast.UAdd: operator.pos, ast.USub: operator.neg}

        def eval_expr(node):
            if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                return node.value
            elif isinstance(node, ast.BinOp) and type(node.op) in operators:
                return operators[type(node.op)](
                    eval_expr(node.left), eval_expr(node.right)
                )
            elif isinstance(node, ast.UnaryOp) and type(node.op) in unary_operators:
                # Support signed numbers like "-5 + 3"
                return unary_operators[type(node.op)](eval_expr(node.operand))
            else:
                raise ValueError("Unsupported expression")

        result = eval_expr(ast.parse(expression, mode="eval").body)
        return str(result)

    except Exception as e:
        return f"Error: {e}"


@tool
@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=5), reraise=True)
def wikipedia_search(query: str) -> str:
    """Search Wikipedia for a query."""
    import wikipedia
    try:
        return wikipedia.summary(query, sentences=2)
    except wikipedia.exceptions.WikipediaException as e:
        # Application-level errors (disambiguation, missing page) become
        # text; network errors propagate so @retry can actually fire.
        return f"Error: {e}"


@tool
def get_current_time() -> str:
    """Get the current date and time (server local time)."""
    from datetime import datetime
    return f"Current time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"


@tool
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(min=1, max=5),
    retry=retry_if_exception_type(requests.RequestException),
    reraise=True,
)
def check_token_price(token: str) -> str:
    """Check the price of a cryptocurrency token."""
    token = token.upper()
    url = f"https://api.binance.com/api/v3/ticker/price?symbol={token}USDC"
    response = requests.get(url, timeout=10)
    data = response.json()
    if "price" not in data:
        # Application-level error (e.g. invalid symbol): report, don't retry.
        return f"Error: {data.get('msg', 'unexpected response')}"
    return f"The current price of {token} is {data['price']} USDC"


class ConversationAssistant:
    def __init__(self):
        self.llm = ChatAnthropic(model_name="claude-haiku-4-5-20251001", timeout=10)
        self.tools = [calculate, wikipedia_search, get_current_time, check_token_price]
        self.chat_history: list[HumanMessage | AIMessage] = []

        prompt = ChatPromptTemplate.from_messages(
            [
                ("system", "You are a helpful assistant."),
                MessagesPlaceholder(variable_name="chat_history"),
                ("human", "{input}"),
                MessagesPlaceholder(variable_name="agent_scratchpad"),
            ],
        )

        agent = create_tool_calling_agent(self.llm, self.tools, prompt)
        self.agent_executor = AgentExecutor(
            agent=agent, tools=self.tools, verbose=False
        )

    @retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
    def _invoke_agent(self, user_input: str) -> dict:
        return self.agent_executor.invoke(
            {"input": user_input, "chat_history": self.chat_history}
        )

    def chat(self, user_input: str) -> str:
        try:
            result = self._invoke_agent(user_input)
            output = result["output"]

            if isinstance(output, list):
                output = "\n".join(
                    block["text"] for block in output if block.get("text")
                )

            self.chat_history.append(HumanMessage(content=user_input))
            self.chat_history.append(AIMessage(content=output))

            return output
        except Exception as e:
            return f"Error: {e}"

    def run(self):
        print("\n╔══════════════════════════════════════════════════╗")
        print("║         Conversation Assistant                   ║")
        print("║  Type 'quit' or 'exit' to stop                  ║")
        print("╚══════════════════════════════════════════════════╝\n")

        while True:
            try:
                user_input = input("You > ").strip()
            except (EOFError, KeyboardInterrupt):
                print("\n\nGoodbye!")
                break

            if not user_input:
                continue

            if user_input.lower() in ("quit", "exit"):
                print("\nGoodbye!")
                break

            response = self.chat(user_input)
            print(f"\nAssistant > {response}\n")
            print("-" * 50)


def main():
    assistant = ConversationAssistant()
    assistant.run()


if __name__ == "__main__":
    main()
Hoang Dang Tan Phat (Kane)

Full-stack developer with 8+ years experience. Building scalable systems with Go, TypeScript, and React.