In the previous post, we built a LangGraph agent with intent-based routing and conditional workflows. That implementation expected conversation_history from the client — fine for demos, but production systems need server-side memory.
This post adds a history_loader node that fetches recent messages from PostgreSQL before classification, enabling personalized responses like “Welcome back, Sarah!” when a user returns.
Why Server-Side Memory?
Client-side history has drawbacks:
- Token bloat: Sending full history on every request wastes bandwidth
- Inconsistency: Different clients may have different history states
- Security: Clients can manipulate history to trick the bot
- Persistence: History lost when user clears browser
Server-side memory solves these by making the backend the source of truth. The client sends only the current message; the server loads context from the database.
Architecture Change
```mermaid
flowchart LR
    A[API Request] --> B[History Loader]
    B -->|SELECT messages| C[(PostgreSQL)]
    C --> B
    B --> D[Classifier]
    D --> E[Rest of Graph...]
    style B fill:#4f46e5,color:#fff
    style C fill:#059669,color:#fff
```
The key change: history_loader runs before classification, populating conversation_history in the agent state.
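The state schema itself comes from the previous post; as a reminder of the fields this post relies on, here is a minimal sketch of what `AgentState` might look like (field names taken from the snippets below; the real schema may carry more):

```python
# agent/state.py -- hypothetical sketch; the actual schema may differ
from typing import TypedDict


class AgentState(TypedDict, total=False):
    message: str                # current user message from the API request
    conversation_id: str        # used by history_loader to query PostgreSQL
    user_id: str
    conversation_history: list  # populated by history_loader, read by handlers
    reply: str                  # set by handler nodes
    confidence: float
    needs_human: bool
```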
Database Models
Add Conversation and Message tables with SQLAlchemy:
```python
# models.py
from sqlalchemy import Column, String, Text, DateTime, Numeric, ForeignKey
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import DeclarativeBase, relationship


class Base(DeclarativeBase):
    pass


class Conversation(Base):
    __tablename__ = "Conversation"

    id = Column(String, primary_key=True)
    user_id = Column(String, nullable=False)
    status = Column(String, default="OPEN")  # OPEN, CLOSED, ESCALATED
    session = Column(String, default="BOT")  # BOT, HUMAN
    created_at = Column(DateTime)
    updated_at = Column(DateTime)

    messages = relationship("Message", back_populates="conversation")


class Message(Base):
    __tablename__ = "Message"

    id = Column(String, primary_key=True)
    conversation_id = Column(String, ForeignKey("Conversation.id"), nullable=False)
    role = Column(String, nullable=False)  # USER, ASSISTANT
    content = Column(Text, nullable=False)
    metadata_ = Column("metadata", JSONB)  # "metadata" is reserved on Declarative classes
    confidence = Column(Numeric)
    created_at = Column(DateTime)

    conversation = relationship("Conversation", back_populates="messages")
```
The Message table stores both user and assistant messages with timestamps for ordering.
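To see the schema in action without a running Postgres instance, here is a self-contained `sqlite3` sketch that mirrors the two tables (types simplified: `JSONB` and `Numeric` are Postgres-specific, so they're omitted here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Conversation (
    id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    status TEXT DEFAULT 'OPEN',
    session TEXT DEFAULT 'BOT',
    created_at TEXT
);
CREATE TABLE Message (
    id TEXT PRIMARY KEY,
    conversation_id TEXT NOT NULL REFERENCES Conversation(id),
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    created_at TEXT
);
""")
conn.execute("INSERT INTO Conversation (id, user_id) VALUES ('c1', 'u1')")
conn.executemany(
    "INSERT INTO Message (id, conversation_id, role, content, created_at) "
    "VALUES (?, 'c1', ?, ?, ?)",
    [
        ("m1", "USER", "Hi, I'm Sarah", "2024-01-01T10:00"),
        ("m2", "ASSISTANT", "Hello Sarah!", "2024-01-01T10:01"),
    ],
)
# History query: messages for one conversation, oldest first
rows = conn.execute(
    "SELECT role, content FROM Message "
    "WHERE conversation_id = 'c1' ORDER BY created_at"
).fetchall()
print(rows)  # both messages, oldest first
```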
History Loader Node
The history loader queries recent messages and formats them for the agent state:
```python
# agent/nodes/history_loader.py
import logging

from sqlalchemy import select

from db import async_session
from models import Message
from agent.config import MAX_HISTORY_MESSAGES  # Default: 10
from agent.state import AgentState

logger = logging.getLogger(__name__)


async def history_loader_node(state: AgentState) -> dict:
    """Load recent conversation history from the database."""
    conversation_id = state.get("conversation_id", "")
    if not conversation_id:
        logger.debug("No conversation_id, skipping history load")
        return {}

    try:
        async with async_session() as session:
            query = (
                select(Message)
                .where(Message.conversation_id == conversation_id)
                .order_by(Message.created_at.desc())
                .limit(MAX_HISTORY_MESSAGES)
            )
            result = await session.execute(query)
            rows = list(result.scalars().all())

        if not rows:
            return {}

        # Reverse to chronological order (query was DESC for LIMIT)
        rows.reverse()

        history = [
            {"role": row.role.lower(), "content": row.content}
            for row in rows
        ]
        logger.info("Loaded %d messages for conversation", len(history))
        return {"conversation_history": history}
    except Exception:
        logger.exception("Failed to load history")
        return {}
```
Key patterns:
- DESC + LIMIT + reverse: Fetch the N most recent messages efficiently, then reverse to chronological order
- Graceful fallback: On any error, return empty dict — don’t crash the workflow
- Role normalization: Convert `USER`/`ASSISTANT` to lowercase for LangChain compatibility
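The DESC + LIMIT + reverse pattern can be sanity-checked in plain Python: taking the N newest rows and reversing them yields the last N messages in chronological order, without ever scanning the whole conversation:

```python
# Simulated message table: 25 messages, oldest first
messages = [{"id": i, "content": f"msg {i}"} for i in range(1, 26)]

MAX_HISTORY_MESSAGES = 10

# What the SQL does: ORDER BY created_at DESC LIMIT 10
newest_first = sorted(messages, key=lambda m: m["id"], reverse=True)[:MAX_HISTORY_MESSAGES]

# Then reverse back to chronological order for the prompt
newest_first.reverse()

print([m["id"] for m in newest_first])  # ids 16 through 25, in order
```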
Updated Graph
Add history_loader as the new entry point:
```python
# agent/graph.py
from langgraph.graph import StateGraph, END

from agent.state import AgentState
from agent.nodes import (
    history_loader_node,
    classifier_node,
    greeting_handler_node,
    # ... other nodes
)


def build_graph():
    graph = StateGraph(AgentState)

    # Add history loader as first node
    graph.add_node("load_history", history_loader_node)
    graph.add_node("classifier", classifier_node)
    graph.add_node("greeting_handler", greeting_handler_node)
    # ... other nodes

    # Entry point: load history first
    graph.set_entry_point("load_history")
    graph.add_edge("load_history", "classifier")

    # Conditional routing after classification
    graph.add_conditional_edges("classifier", route_by_intent, {...})
    # ... rest of graph unchanged

    return graph.compile()
```
The workflow now flows: load_history → classifier → handlers → responder.
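`route_by_intent` was built in the previous post and isn't reproduced here; as a reminder of its shape, a minimal sketch, assuming the classifier writes an `intent` string into state (the `fallback_handler` name is purely illustrative):

```python
def route_by_intent(state: dict) -> str:
    """Return the name of the next node based on the classified intent."""
    intent = state.get("intent", "")
    routes = {
        "greeting": "greeting_handler",
        # ... other intents map to their handler nodes
    }
    return routes.get(intent, "fallback_handler")
```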
Personalized Handlers
Handlers can now use the loaded history for personalization:
```python
# agent/nodes/greeting_handler.py
from langchain_anthropic import ChatAnthropic

from agent.state import AgentState

GREETING_PROMPT = """You are Glow Assistant - customer support for Glow skincare shop.
Respond to the greeting warmly and briefly introduce what you can help with.

IMPORTANT: Use conversation history to personalize your response.
If the customer shared their name earlier, use it!

Recent conversation:
{history}

Customer message: {message}"""


def _format_history(state: AgentState) -> str:
    history = state.get("conversation_history", [])[-5:]
    if not history:
        return "(no history)"
    return "\n".join(
        f"{m.get('role', 'user')}: {m.get('content', '')}"
        for m in history
    )


async def greeting_handler_node(state: AgentState) -> dict:
    llm = ChatAnthropic(model="claude-haiku-4-20250514")
    response = await llm.ainvoke(
        GREETING_PROMPT.format(
            history=_format_history(state),
            message=state["message"],
        )
    )
    return {"reply": response.content, "confidence": 1.0}
```
Now when a returning user says “Hi!”, the bot responds: “Welcome back, Sarah!” because it loaded the history where Sarah introduced herself.
Simplified API
With server-side memory, the API no longer needs the history field:
```python
# main.py
from pydantic import BaseModel

# `app`, `agent`, and `ChatResponse` are defined as in the previous post


class ChatRequest(BaseModel):
    message: str
    conversationId: str
    userId: str
    # No more history field - loaded from DB


@app.post("/chat", response_model=ChatResponse)
async def chat(req: ChatRequest):
    result = await agent.ainvoke({
        "message": req.message,
        "conversation_id": req.conversationId,
        "user_id": req.userId,
        "conversation_history": [],  # Populated by history_loader
        "needs_human": False,
    })
    return ChatResponse(
        reply=result.get("reply", "Sorry, I could not process your request."),
        needs_human=result.get("needs_human", False),
        confidence=result.get("confidence", 0.0),
    )
```
The client just sends message, conversationId, and userId. The agent handles the rest.
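For illustration, the entire request body is now three fields; a quick sketch of what a client serializes (field names match the `ChatRequest` model above):

```python
import json

# The full payload a client sends per turn -- no history array anymore
payload = {
    "message": "Hi!",
    "conversationId": "conv-123",
    "userId": "user-456",
}
body = json.dumps(payload)
print(body)
```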
Configuration
Add a config for maximum history length:
```python
# agent/config.py
import os

MAX_HISTORY_MESSAGES = int(os.getenv("MAX_HISTORY_MESSAGES", "10"))
```
More history = better context but higher token usage. 10-20 messages is usually enough for customer support.
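A quick check of the fallback behavior: the `"10"` default applies only when the environment variable is unset, so operators can tune history depth without a code change:

```python
import os

os.environ.pop("MAX_HISTORY_MESSAGES", None)  # unset -> default applies
assert int(os.getenv("MAX_HISTORY_MESSAGES", "10")) == 10

os.environ["MAX_HISTORY_MESSAGES"] = "20"     # set -> override wins
assert int(os.getenv("MAX_HISTORY_MESSAGES", "10")) == 20
```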
Database Indexing
Add an index for efficient history queries:
```sql
CREATE INDEX idx_message_conversation_created
    ON "Message" (conversation_id, created_at DESC);
```
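To confirm the index is actually being used, inspect the query plan for the history query; the exact output varies by Postgres version and table size, but once the index exists you should see an index scan rather than a sequential scan:

```sql
EXPLAIN ANALYZE
SELECT * FROM "Message"
WHERE conversation_id = 'conv-123'
ORDER BY created_at DESC
LIMIT 10;
```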
Conclusion
Server-side conversation memory is essential for production LangGraph agents:
- History loader node runs first, populating state from database
- All downstream nodes access `conversation_history` for context
- API is simplified: clients send only the current message
- Bot truly remembers users across sessions
This pattern scales to any LangGraph workflow. Add the history loader as entry point, and every node gets context for free.