SERIES 101

AI Agent 101: LangChain Beginner-to-Builder Agent for Your App

A technical beginner-to-builder guide to LangChain, LangGraph, and LangSmith: what LangChain is, how agents use tools, how to build workflows, and how to deploy an AI agent into a real app.

1P · JUDY DUONG · JUNE 5, 2026 · 18 MIN READ

LangChain is a framework for building LLM-powered applications, especially applications where the model needs to interact with tools, retrieve data, follow workflows, return structured outputs, or behave like an agent.

The modern LangChain stack is no longer just prompt → model → answer. It is more like:

LangChain  = high-level framework for agents and LLM apps
LangGraph  = low-level orchestration/runtime layer for stateful agents
LangSmith  = observability, tracing, evaluation, monitoring

LangChain is best understood as a framework for connecting LLMs to tools, data, workflows, and production systems.

1. What LangChain actually is

At the simplest level, LangChain helps you connect:

User input
→ prompt
→ model
→ tools
→ data sources
→ memory/state
→ final output

But technically, LangChain is an application framework around LLMs.

It gives you reusable abstractions for:

LLM / chat model interfaces
Prompt templates
Message objects
Tool calling
Agents
Retrievers
RAG pipelines
Structured output
Middleware
Streaming
Human review
Tracing and observability

The reason LangChain exists is that real AI applications usually need more than one model call.

A normal chatbot might do this:

User: "Explain revenue recognition"
→ LLM
→ Answer

But a real AI agent may need this:

User: "Find the latest funding round, compare competitors, and draft a blog post"

→ classify task
→ search web
→ retrieve internal notes
→ call database
→ rank sources
→ generate draft
→ validate citations
→ ask human approval
→ create CMS draft
→ send Slack notification

That multi-step system is where LangChain becomes useful.

2. The modern LangChain stack

Think of LangChain as three layers.

┌──────────────────────────────────────────────┐
│                 Your App                     │
│  Slack bot, web app, CLI, CMS agent, etc.    │
└──────────────────────┬───────────────────────┘
                       │
┌──────────────────────▼───────────────────────┐
│                LangChain                     │
│  Agents, tools, prompts, structured output,  │
│  model abstraction, middleware               │
└──────────────────────┬───────────────────────┘
                       │
┌──────────────────────▼───────────────────────┐
│                LangGraph                     │
│  Stateful graph runtime, durable execution,  │
│  memory, persistence, human-in-the-loop       │
└──────────────────────┬───────────────────────┘
                       │
┌──────────────────────▼───────────────────────┐
│                LangSmith                     │
│  Tracing, debugging, evaluation, monitoring  │
└──────────────────────────────────────────────┘

LangChain is the easy entry point. It gives you prebuilt agent patterns and standard interfaces.

LangGraph is the lower-level runtime for complex stateful workflows. It is designed for long-running agents, durable execution, streaming, human-in-the-loop, persistence, and memory.

LangSmith is the observability and evaluation layer. It helps with tracing, debugging, monitoring, and evaluating LLM applications.

3. The core mental model

LangChain applications usually follow this loop:

Input
→ model thinks
→ model decides whether it needs a tool
→ tool executes
→ result goes back to model
→ model decides next step
→ final answer

Visualized:

                    ┌──────────────────────┐
                    │      User input      │
                    └──────────┬───────────┘
                               │
                               ▼
                    ┌──────────────────────┐
                    │      LLM / Agent     │
                    └──────────┬───────────┘
                               │
               ┌───────────────┴────────────────┐
               │                                │
               ▼                                ▼
     ┌──────────────────┐             ┌──────────────────┐
     │ Return answer    │             │ Call tool        │
     └──────────────────┘             └────────┬─────────┘
                                                │
                                                ▼
                                      ┌──────────────────┐
                                      │ Tool result      │
                                      └────────┬─────────┘
                                                │
                                                ▼
                                      Back to LLM / Agent

This is why LangChain is strongly associated with agents.

A plain LLM answers only from its training data and whatever is in the prompt. A LangChain agent can also decide to call external tools.

Example:

Question: "What is the current share price of NVIDIA and summarize today's movement?"

LLM alone:
- May hallucinate because it does not know live prices.

LangChain agent:
- Calls a finance/search tool.
- Gets current data.
- Summarizes based on the tool result.
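
The loop above can be sketched without any framework. This is a minimal illustration, not LangChain's implementation: `fake_model` is a scripted stand-in for an LLM, `get_share_price` is a hypothetical tool that returns stub data, and the step limit is an assumption added to keep the loop from running forever.

```python
from typing import Callable


def get_share_price(ticker: str) -> str:
    """Hypothetical tool; a real one would call a market data API."""
    return f"{ticker}: 123.45 USD (stub data)"


TOOLS: dict[str, Callable[[str], str]] = {"get_share_price": get_share_price}


def fake_model(messages: list[dict]) -> dict:
    """Scripted stand-in for an LLM: request a tool once, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"tool": "get_share_price", "argument": "NVDA"}
    return {"answer": f"Based on the tool result: {tool_results[-1]['content']}"}


def run_agent(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = fake_model(messages)
        if "answer" in decision:
            # The model chose to answer, so the loop ends.
            return decision["answer"]
        # The model chose a tool: run it and feed the result back.
        result = TOOLS[decision["tool"]](decision["argument"])
        messages.append({"role": "tool", "content": result})
    return "Stopped: step limit reached."


print(run_agent("What is the current share price of NVIDIA?"))
```

The important part is the shape: the model's output is treated as a decision, and tool results are appended to the message list before the model is called again.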

4. Key concepts you need to understand

4.1 Model

The model is the LLM itself.

Examples:

OpenAI GPT models
Anthropic Claude models
Google Gemini models
Mistral models
local models through Ollama

LangChain tries to make model usage provider-agnostic, meaning you can write similar code while switching model providers.

Example:

from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4.1")

Conceptually:

Your code
→ LangChain model interface
→ actual provider API
→ model response

4.2 Messages

Modern LLMs usually use chat messages, not one giant text string.

Typical message roles:

system    = high-level instruction
user      = user's message
assistant = model's previous response
tool      = result returned from a tool

Example:

messages = [
    {"role": "system", "content": "You are a helpful financial analyst."},
    {"role": "user", "content": "Explain EBITDA margin."}
]

response = model.invoke(messages)

The system message controls behavior. The user message contains the actual task.

4.3 Prompt template

A prompt template is a reusable prompt with variables.

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert {role}."),
    ("user", "Explain {topic} in a technical but clear way.")
])

formatted_prompt = prompt.invoke({
    "role": "AI engineer",
    "topic": "gradient descent"
})

4.4 Tool

A tool is a function the model can call.

Tools can do things like:

Search the web
Query a database
Read a PDF
Send a Slack message
Create a Sanity draft
Call an internal API
Run Python code
Retrieve customer records

Example:

from langchain.tools import tool

@tool
def calculate_revenue_growth(old_revenue: float, new_revenue: float) -> float:
    """Calculate revenue growth rate from old revenue to new revenue."""
    return (new_revenue - old_revenue) / old_revenue

The docstring matters because the LLM uses it to understand when to call the tool.

Tool calling flow:

User asks question
→ LLM sees available tools
→ LLM chooses tool
→ LangChain executes function
→ result goes back to LLM
→ LLM writes final response
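
To see why names, type hints, and docstrings matter, here is a rough sketch of how a function can be turned into a description the model reads. The `describe_tool` helper and its output format are hypothetical and much simpler than what LangChain's `@tool` decorator actually produces.

```python
import inspect


def calculate_revenue_growth(old_revenue: float, new_revenue: float) -> float:
    """Calculate revenue growth rate from old revenue to new revenue."""
    return (new_revenue - old_revenue) / old_revenue


def describe_tool(func) -> dict:
    """Build a simplified tool description from a plain function."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        # Map each parameter to the name of its annotated type.
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }


schema = describe_tool(calculate_revenue_growth)
print(schema)
```

Everything the model knows about the tool comes from this kind of description, so a vague name or a missing docstring leaves it guessing.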

4.5 Agent

An agent is an LLM system that can choose actions.

Example:

from langchain.agents import create_agent
from langchain.tools import tool

@tool
def search_company_database(company_name: str) -> str:
    """Search internal company database for a company profile."""
    return f"{company_name}: SaaS company, Series B, $20m ARR."

agent = create_agent(
    model="openai:gpt-4.1",
    tools=[search_company_database],
    system_prompt="You are a VC analyst. Use tools when needed."
)

result = agent.invoke({
    "messages": [
        {"role": "user", "content": "Find information about Acme AI and summarize it."}
    ]
})

Conceptually:

Agent = LLM + tools + instructions + control loop

4.6 Structured output

Structured output means the model returns data in a fixed schema rather than free-form prose.

Instead of:

"Apple is a public technology company with strong profitability..."

You may want:

{
  "company": "Apple",
  "sector": "Technology",
  "public_company": true,
  "risk_score": 2
}

Example with Pydantic:

from pydantic import BaseModel, Field
from langchain.agents import create_agent

class CompanyAnalysis(BaseModel):
    company_name: str = Field(description="Name of the company")
    product_type: str = Field(description="Software, hardware, marketplace, drug, service, etc.")
    target_customers: list[str]
    investment_view: str
    risk_score: int = Field(description="Risk score from 1 to 5")

agent = create_agent(
    model="openai:gpt-4.1",
    tools=[],
    response_format=CompanyAnalysis,
    system_prompt="You are a technical VC analyst."
)

result = agent.invoke({
    "messages": [
        {"role": "user", "content": "Analyze LangChain as a company/product."}
    ]
})

structured = result["structured_response"]

This is extremely important for production because downstream systems often need JSON, not essays.
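
The downstream side of this can be sketched with the standard library alone. The example below assumes the model returned the JSON shown earlier as a string. `CompanySummary` and `parse_company_summary` are illustrative names, and the 1 to 5 range mirrors the risk score description in the Pydantic example.

```python
import json
from dataclasses import dataclass


@dataclass
class CompanySummary:
    company: str
    sector: str
    public_company: bool
    risk_score: int


def parse_company_summary(raw: str) -> CompanySummary:
    """Parse model output and reject anything that does not fit the schema."""
    data = json.loads(raw)
    summary = CompanySummary(**data)  # TypeError on missing or unexpected keys
    if not isinstance(summary.risk_score, int) or not 1 <= summary.risk_score <= 5:
        raise ValueError("risk_score must be an integer from 1 to 5")
    return summary


raw_output = (
    '{"company": "Apple", "sector": "Technology", '
    '"public_company": true, "risk_score": 2}'
)
print(parse_company_summary(raw_output))
```

With a schema in place, a malformed response fails loudly at the boundary instead of corrupting whatever consumes it.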

4.7 Middleware

Middleware lets you intercept or modify agent behavior.

Middleware can be used for:

Add dynamic system prompts
Limit tool access
Summarize long conversations
Apply safety rules
Insert user profile context
Control model selection
Add human approval
Modify state before/after model calls
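
Conceptually, middleware is a wrapper around the model call. The sketch below shows that idea in plain Python and does not use LangChain's middleware API. `echo_model`, `with_system_prompt`, and `keep_last` are hypothetical names chosen for illustration.

```python
from typing import Callable

Messages = list[dict]
ModelCall = Callable[[Messages], str]


def echo_model(messages: Messages) -> str:
    """Stand-in for a model call: reports what it was given."""
    return f"model saw {len(messages)} messages"


def with_system_prompt(prompt: str) -> Callable[[ModelCall], ModelCall]:
    """Middleware that prepends a system message before the model call."""
    def middleware(next_call: ModelCall) -> ModelCall:
        def wrapped(messages: Messages) -> str:
            return next_call([{"role": "system", "content": prompt}, *messages])
        return wrapped
    return middleware


def keep_last(n: int) -> Callable[[ModelCall], ModelCall]:
    """Middleware that trims the history to the last n messages."""
    def middleware(next_call: ModelCall) -> ModelCall:
        def wrapped(messages: Messages) -> str:
            return next_call(messages[-n:])
        return wrapped
    return middleware


# The outermost wrapper runs first: trim history, then add the system prompt.
call = keep_last(2)(with_system_prompt("Be concise.")(echo_model))

history = [{"role": "user", "content": f"message {i}"} for i in range(10)]
print(call(history))
```

Each wrapper can change what goes in or what comes out without the model function knowing about it, which is what makes the pattern useful for safety rules and context injection.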

4.8 Human-in-the-loop

Human-in-the-loop means the agent pauses before doing sensitive actions.

Example risky actions:

Send email
Delete file
Run SQL write query
Publish article
Transfer money
Change production database

Architecture:

Agent proposes action
→ policy checks tool call
→ risky?
    yes → pause and ask human
    no  → execute tool
→ continue workflow

For a Sanity/news agent, this is exactly the pattern you want:

Draft article = okay automatically
Publish article = requires approval
Delete article = requires approval
Overwrite live post = requires approval
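
A minimal sketch of that policy check, assuming action names that match the list above. `ApprovalGate` is a hypothetical helper, not a LangChain or LangGraph class, and a real system would persist the pending queue rather than keep it in memory.

```python
from dataclasses import dataclass, field

# Actions from the Sanity example that must wait for a human.
REQUIRES_APPROVAL = {"publish_article", "delete_article", "overwrite_live_post"}


@dataclass
class ApprovalGate:
    pending: list[dict] = field(default_factory=list)

    def submit(self, action: str, payload: dict) -> str:
        """Run safe actions immediately; queue risky ones for review."""
        if action in REQUIRES_APPROVAL:
            self.pending.append({"action": action, "payload": payload})
            return "paused: waiting for human approval"
        return f"executed: {action}"

    def approve_next(self) -> str:
        """Called by a human reviewer to release the oldest pending action."""
        request = self.pending.pop(0)
        return f"executed after approval: {request['action']}"


gate = ApprovalGate()
print(gate.submit("create_draft", {"title": "AI funding roundup"}))
print(gate.submit("publish_article", {"draft_id": "example-draft-id"}))
print(gate.approve_next())
```

The key design choice is that the check lives in code outside the model, so a prompt injection or a confused agent cannot talk its way past it.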

5. How LangChain is different from just calling the OpenAI API

LangChain is genuinely useful — but it carries real costs that are worth knowing upfront so you can make an informed choice rather than hitting them by surprise.

What LangChain gives you

  • A consistent interface across LLM providers (swap Claude for GPT-4 by changing one line)
  • Pre-built chains, retrievers, and agent types that save significant boilerplate
  • LangSmith for tracing and debugging agent runs
  • A large ecosystem of integrations (100+ vector stores, tools, loaders)

For anything involving RAG, multi-step chains, or multi-tool agents, this is a genuine time save.

What it costs you

API churn. LangChain has historically moved fast and broken things. Imports have shifted multiple times (langchain → langchain_core → langchain_community), function signatures have changed, and chains that worked in 0.1 may need rewriting in 0.2+. If you copy a tutorial from 12 months ago, expect to debug import errors before you debug your actual problem.

Abstraction overhead. Every LangChain abstraction wraps something simpler. When things go wrong — and in agents, they will — you're debugging the abstraction, not just your logic. Beginners often find direct SDK calls easier to reason about and fix.

Hidden complexity. AgentExecutor and similar wrappers have many configuration knobs (early stopping, error handling, max iterations) that you often only discover when your agent loops infinitely or silently swallows errors.

A simple decision rule

| Use case | Recommendation |
|---|---|
| Single LLM call | Use the OpenAI / Anthropic SDK directly |
| A few chained steps | Use the OpenAI / Anthropic SDK directly |
| RAG pipeline | LangChain adds real value |
| Multi-tool agent | LangChain / LangGraph |
| Production with tracing | LangSmith + LangGraph |

6. Step-by-step: how to use LangChain

Step 0: Decide what you are building

Before writing code, classify the app.

Type A: Simple chatbot
Type B: RAG assistant over documents
Type C: Tool-calling agent
Type D: Workflow agent with approvals
Type E: Multi-agent / long-running system

LangChain is most useful for C, D, and E.

For example, an AI news/Sanity agent is Type D:

Search news
→ rank stories
→ write article
→ create Sanity draft
→ ask approval
→ maybe publish later

Step 1: Install packages

Typical setup:

pip install -U langchain langchain-openai langgraph langsmith python-dotenv

Depending on the provider, you may install more packages:

pip install -U langchain-anthropic
pip install -U langchain-google-genai
pip install -U langchain-community

Example .env:

OPENAI_API_KEY="your_key_here"
LANGSMITH_API_KEY="your_langsmith_key_here"
LANGSMITH_TRACING="true"
LANGSMITH_PROJECT="my-langchain-agent"

Step 2: Create your first model call

from dotenv import load_dotenv
from langchain.chat_models import init_chat_model

load_dotenv()

model = init_chat_model("openai:gpt-4.1")

response = model.invoke("Explain LangChain in one sentence.")
print(response.content)

Step 3: Use messages properly

messages = [
    {
        "role": "system",
        "content": "You are a senior AI engineer. Be precise and technical."
    },
    {
        "role": "user",
        "content": "Explain tool calling in LangChain."
    }
]

response = model.invoke(messages)
print(response.content)

Step 4: Add a prompt template

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a {role}. Your tone is {tone}."),
    ("user", "Explain this topic: {topic}")
])

chain = prompt | model

response = chain.invoke({
    "role": "technical AI educator",
    "tone": "precise but understandable",
    "topic": "LangChain agents"
})

print(response.content)

The | operator creates a chain:

PromptTemplate → Model
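
The `|` syntax is ordinary Python operator overloading. As a rough sketch of the idea (not LangChain's actual classes), a tiny `Step` type can define `__or__` so that the output of the left side becomes the input of the right side. `Step`, `format_prompt`, and `fake_model` are hypothetical names.

```python
class Step:
    """A tiny composable unit; `a | b` feeds a's output into b."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # Build a new step that runs self first, then other.
        return Step(lambda value: other.invoke(self.invoke(value)))


format_prompt = Step(lambda inputs: f"Explain this topic: {inputs['topic']}")
fake_model = Step(lambda prompt: f"model reply to: {prompt}")

chain = format_prompt | fake_model
print(chain.invoke({"topic": "LangChain agents"}))
```

Because the result of `|` is itself a step, chains of any length compose the same way.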

Step 5: Add output parsing

from langchain_core.output_parsers import StrOutputParser

chain = prompt | model | StrOutputParser()

result = chain.invoke({
    "role": "AI engineer",
    "tone": "technical",
    "topic": "RAG"
})

print(result)

Now the flow is:

Prompt → Model → String parser → clean text

7. Building a tool-calling agent

Step 1: Define tools

from langchain.tools import tool

@tool
def multiply(a: float, b: float) -> float:
    """Multiply two numbers together."""
    return a * b

@tool
def calculate_cagr(start_value: float, end_value: float, years: float) -> float:
    """Calculate compound annual growth rate."""
    return (end_value / start_value) ** (1 / years) - 1

Important: tools should have clear names, type hints, and docstrings.

Bad tool:

def calc(x, y):
    return x * y

Good tool:

@tool
def calculate_revenue_growth(old_revenue: float, new_revenue: float) -> float:
    """Calculate revenue growth rate between two revenue figures."""
    return (new_revenue - old_revenue) / old_revenue

Step 2: Create the agent

from langchain.agents import create_agent

agent = create_agent(
    model="openai:gpt-4.1",
    tools=[multiply, calculate_cagr],
    system_prompt="You are a financial analyst. Use tools for calculations."
)

Step 3: Invoke the agent

result = agent.invoke({
    "messages": [
        {
            "role": "user",
            "content": "Revenue grew from 20 to 45 over 3 years. What is the CAGR?"
        }
    ]
})

print(result["messages"][-1].content)

Internal flow:

User asks CAGR
→ Agent sends question + tool schemas to model
→ Model chooses calculate_cagr
→ LangChain executes calculate_cagr(20, 45, 3)
→ Tool result returns to model
→ Model writes final explanation
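
It is worth checking the arithmetic the tool performs for this question. The snippet below repeats the formula as a plain function (without the `@tool` decorator) and verifies the result by compounding it back.

```python
def calculate_cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


cagr = calculate_cagr(20, 45, 3)
print(f"CAGR: {cagr:.1%}")  # about 31%

# Sanity check: growing 20 at this rate for 3 years should give back 45.
assert abs(20 * (1 + cagr) ** 3 - 45) < 1e-9
```

Delegating this to a tool matters because language models are unreliable at fractional exponents, while the function is exact to floating-point precision.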

8. Building a research/news agent

Goal:

User asks:
"Find recent AI startup news and draft a blog post."

Agent should:
1. Search for news
2. Rank stories
3. Select strongest one
4. Draft article
5. Create Sanity draft
6. Return draft link
7. Never publish automatically

Tool definitions

from langchain.tools import tool
from typing import List, Dict

@tool
def search_news(query: str, hours: int = 72) -> List[Dict]:
    """Search recent news articles matching a query within the last N hours."""
    return [
        {
            "title": "Example AI startup raises Series A",
            "source": "Tech publication",
            "url": "https://example.com",
            "published_at": "2026-06-05",
            "summary": "The company raised funding to build AI infrastructure."
        }
    ]

@tool
def create_sanity_draft(title: str, body: str, tags: List[str]) -> str:
    """Create a draft article in Sanity CMS. This does not publish the article."""
    return "https://sanity.io/draft/example-draft-id"

Agent

agent = create_agent(
    model="openai:gpt-4.1",
    tools=[search_news, create_sanity_draft],
    system_prompt="""
You are an AI news research agent.

Rules:
- Search for relevant news before drafting.
- Prefer AI, startups, venture capital, deep tech, spacetech, life sciences, and fintech.
- Create Sanity drafts only.
- Never publish live posts.
- Always return source links used.
- Be concise but analytical.
"""
)

Invocation

result = agent.invoke({
    "messages": [
        {
            "role": "user",
            "content": "Find the hottest AI infrastructure news in the last 72 hours and create one Sanity draft."
        }
    ]
})

print(result["messages"][-1].content)

Architecture:

User
 │
 ▼
LangChain Agent
 │
 ├── search_news()
 │       │
 │       ▼
 │   list of articles
 │
 ├── model ranks stories
 │
 ├── model drafts article
 │
 ├── create_sanity_draft()
 │       │
 │       ▼
 │   draft URL
 │
 ▼
Final response to user

9. Adding structured output to the agent

For a news agent, you may want structured output like:

{
  "selected_story": "...",
  "why_it_matters": "...",
  "sector": "AI infrastructure",
  "draft_url": "...",
  "confidence": 0.88
}

Define schema:

from pydantic import BaseModel, Field
from typing import List

class NewsAgentOutput(BaseModel):
    selected_story_title: str
    sector: str
    source_urls: List[str]
    why_it_matters: str
    draft_url: str
    confidence_score: float = Field(ge=0, le=1)

Create agent:

agent = create_agent(
    model="openai:gpt-4.1",
    tools=[search_news, create_sanity_draft],
    response_format=NewsAgentOutput,
    system_prompt="You are a technical AI news agent. Create drafts, never publish."
)

Invoke:

result = agent.invoke({
    "messages": [
        {"role": "user", "content": "Draft a post about the top AI startup news today."}
    ]
})

print(result["structured_response"])

10. Where LangGraph comes in

LangChain agents are great for fast building.

But sometimes you need more control.

For example, this is too important to leave to a loose agent loop:

Search news
→ rank story
→ draft article
→ fact-check
→ ask human approval
→ create CMS draft
→ wait for approval
→ publish

That should be a graph.

Instead of:

"Agent, figure it out."

You write:

Do step 1.
Then step 2.
If confidence < 0.7, go to human review.
If approved, continue.
If rejected, revise.

Visual graph:

        ┌──────────────┐
        │ User request │
        └──────┬───────┘
               ▼
        ┌──────────────┐
        │ Classify task │
        └──────┬───────┘
               ▼
        ┌──────────────┐
        │ Search news   │
        └──────┬───────┘
               ▼
        ┌──────────────┐
        │ Rank stories  │
        └──────┬───────┘
               ▼
        ┌──────────────┐
        │ Draft article │
        └──────┬───────┘
               ▼
        ┌──────────────┐
        │ Validate      │
        └──────┬───────┘
       ┌───────┴────────┐
       ▼                ▼
┌─────────────┐   ┌──────────────┐
│ Human review │   │ Create draft │
└──────┬──────┘   └──────┬───────┘
       ▼                 ▼
┌─────────────┐   ┌──────────────┐
│ Revise      │   │ Return link  │
└─────────────┘   └──────────────┘
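
The routing rule described above (confidence below 0.7 goes to human review) can be sketched as a small state machine. This is a framework-free illustration and not LangGraph's API. The node functions, the `sources` field, and the fixed confidence values are assumptions made for the example.

```python
from typing import Callable, Optional

State = dict


def draft(state: State) -> State:
    return {**state, "draft": f"Draft about {state['topic']}"}


def validate(state: State) -> State:
    # Hypothetical score; a real validator would check sources and facts.
    return {**state, "confidence": 0.9 if state.get("sources") else 0.4}


def human_review(state: State) -> State:
    return {**state, "status": "needs_human_review"}


def create_draft(state: State) -> State:
    return {**state, "status": "draft_created"}


def route_after_validate(state: State) -> str:
    return "human_review" if state["confidence"] < 0.7 else "create_draft"


NODES: dict[str, Callable[[State], State]] = {
    "draft": draft,
    "validate": validate,
    "human_review": human_review,
    "create_draft": create_draft,
}

# Each edge picks the next node from the current state; None ends the run.
EDGES: dict[str, Callable[[State], Optional[str]]] = {
    "draft": lambda state: "validate",
    "validate": route_after_validate,
    "human_review": lambda state: None,
    "create_draft": lambda state: None,
}


def run(state: State, start: str = "draft") -> State:
    node: Optional[str] = start
    while node is not None:
        state = NODES[node](state)
        node = EDGES[node](state)
    return state


print(run({"topic": "AI infrastructure", "sources": ["https://example.com"]})["status"])
print(run({"topic": "AI infrastructure", "sources": []})["status"])
```

The model never chooses the path here. It only produces values that the routing function reads, which is what makes the workflow auditable.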

11. Agent vs graph: when to use which

Use a LangChain agent when:

The task is flexible.
The model can decide which tool to use.
The cost of mistakes is low to medium.
You want to build quickly.

Use LangGraph when:

The workflow has fixed steps.
You need approvals.
You need retries.
You need state.
You need auditability.
The agent may run for a long time.
The action can affect real systems.

| Need | LangChain Agent | LangGraph |
|---|---:|---:|
| Quick prototype | Strong | Medium |
| Tool calling | Strong | Strong |
| Fixed workflow | Medium | Strong |
| Stateful execution | Medium | Strong |
| Human approval | Good | Strong |
| Long-running process | Medium | Strong |
| Production control | Good | Strong |
| Simple chatbot | Fine | Overkill |

12. RAG with LangChain

RAG means Retrieval-Augmented Generation.

Instead of asking the model to answer from memory, you retrieve relevant documents and feed them into the prompt.

Flow:

User question
→ embed question
→ search vector database
→ retrieve relevant chunks
→ insert chunks into prompt
→ model answers using retrieved context

Visual:

                 ┌────────────────────┐
                 │ User question       │
                 └─────────┬──────────┘
                           │
                           ▼
                 ┌────────────────────┐
                 │ Embedding model     │
                 └─────────┬──────────┘
                           │
                           ▼
                 ┌────────────────────┐
                 │ Vector database     │
                 └─────────┬──────────┘
                           │
                           ▼
                 ┌────────────────────┐
                 │ Retrieved chunks    │
                 └─────────┬──────────┘
                           │
                           ▼
                 ┌────────────────────┐
                 │ Prompt + context    │
                 └─────────┬──────────┘
                           │
                           ▼
                 ┌────────────────────┐
                 │ LLM answer          │
                 └────────────────────┘
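
To make the retrieval step concrete, here is a toy version that runs with the standard library only. Word counts stand in for a real embedding model, a Python list stands in for the vector database, and the three chunks are made-up sample text.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "Acme AI sells a vector database for enterprise search.",
    "The company was founded in 2021 and is based in Berlin.",
    "Pricing starts with a free tier for small teams.",
]
question = "What product does Acme AI sell?"

context = "\n".join(retrieve(question, chunks, k=1))
prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion:\n{question}"
)
print(prompt)
```

Swapping the word counts for a real embedding model changes the quality of the match but not the structure of the pipeline.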

Basic LangChain RAG architecture:

from langchain.chat_models import init_chat_model
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Chat model used at the end of the chain (same one as in Step 2)
model = init_chat_model("openai:gpt-4.1")

loader = TextLoader("./data/company_notes.txt")
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=100
)
chunks = splitter.split_documents(docs)

embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings)

retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

prompt = ChatPromptTemplate.from_template("""
You are a precise analyst.

Answer the question using only the context below.

Context:
{context}

Question:
{question}
""")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {
        "context": retriever | format_docs,
        "question": lambda x: x
    }
    | prompt
    | model
    | StrOutputParser()
)

answer = rag_chain.invoke("What is the company's main product?")
print(answer)

This is not yet an agent. It is a retrieval chain.

13. Turning RAG into a tool

A powerful pattern is to wrap RAG as a tool.

@tool
def search_internal_knowledge_base(query: str) -> str:
    """Search internal knowledge base for relevant company, market, and product information."""
    return rag_chain.invoke(query)

Then give this tool to an agent:

agent = create_agent(
    model="openai:gpt-4.1",
    tools=[search_internal_knowledge_base, search_news, create_sanity_draft],
    system_prompt="""
You are a research agent.
Use internal knowledge when the user asks about previous posts, companies, or market notes.
Use news search for current events.
"""
)

Now the agent can decide:

Need internal context? → call RAG tool
Need current news? → call news search
Need CMS draft? → call Sanity tool

14. Memory and state

Do not confuse three things:

Chat history
Memory
State

Chat history

The previous messages in a conversation.

Memory

Information stored across turns or sessions.

Example:

User prefers VC-style explanations.
User is building a Sanity-based AI news blog.
User wants drafts but not auto-publishing.

State

Current workflow variables.

Example:

{
  "task": "draft news article",
  "sources_found": 12,
  "selected_source": "...",
  "draft_status": "pending_review",
  "approval": None
}

LangGraph is especially strong when you need state.
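
One way to keep state explicit is a typed dictionary that every step reads and returns. The sketch below uses the fields from the example above. `WorkflowState` and `record_approval` are illustrative names, and the status strings other than `pending_review` are assumptions.

```python
from typing import Optional, TypedDict


class WorkflowState(TypedDict):
    task: str
    sources_found: int
    selected_source: Optional[str]
    draft_status: str
    approval: Optional[bool]


def record_approval(state: WorkflowState, approved: bool) -> WorkflowState:
    """Return a new state instead of mutating the old one."""
    status = "approved" if approved else "needs_revision"
    return {**state, "approval": approved, "draft_status": status}


state: WorkflowState = {
    "task": "draft news article",
    "sources_found": 12,
    "selected_source": None,
    "draft_status": "pending_review",
    "approval": None,
}

updated = record_approval(state, approved=True)
print(updated["draft_status"], updated["approval"])
```

Returning a fresh dictionary from each step leaves the previous state intact, which makes it easier to checkpoint, replay, or inspect a run after the fact.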

15. Observability with LangSmith

When building agents, you need to see what happened inside.

Without observability, debugging is painful:

Why did the agent call the wrong tool?
Which prompt caused the bad answer?
How much did this run cost?
Which source did it use?
Where did hallucination enter?
How long did each step take?

A trace can show:

Run
├── prompt formatting
├── model call
├── tool call
│   └── tool output
├── second model call
├── parser
└── final answer
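
The idea behind a trace can be illustrated with a small decorator that records one span per step. This is only a sketch of the concept. LangSmith collects far more detail and does not work this way internally, and `traced`, `TRACE`, and the two stub functions are hypothetical.

```python
import time
from functools import wraps

TRACE: list[dict] = []


def traced(step_name: str):
    """Record the outcome and duration of each call as a span in TRACE."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            span = {"step": step_name, "ok": True}
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception as error:
                span["ok"] = False
                span["error"] = repr(error)
                raise
            finally:
                span["seconds"] = time.perf_counter() - start
                TRACE.append(span)
        return wrapper
    return decorator


@traced("search_news")
def search_news(query: str) -> list[str]:
    return [f"stub article about {query}"]


@traced("create_draft")
def create_draft(title: str) -> str:
    raise RuntimeError("CMS unavailable")  # simulated failure


search_news("AI infrastructure")
try:
    create_draft("Example title")
except RuntimeError:
    pass

for span in TRACE:
    print(span)
```

Even this crude version answers two of the questions above: which step failed, and how long each step took.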

For production, tracing is not just nice to have. It becomes necessary because LLM agents are probabilistic and hard to debug from logs alone.

16. Production architecture

A proper LangChain production app should not be:

Frontend directly calls agent with full permissions.

That is risky.

Better architecture:

┌────────────────────┐
│ Frontend / Slack   │
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│ Backend API        │
│ Auth + validation  │
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│ LangGraph workflow │
│ state + routing    │
└─────────┬──────────┘
          │
          ├──────────────► Read-only tools
          │                 - search
          │                 - retrieval
          │                 - database read
          │
          ├──────────────► Write tools
          │                 - create draft
          │                 - send message
          │
          ├──────────────► Restricted tools
          │                 - publish
          │                 - delete
          │                 - overwrite
          │
          ▼
┌────────────────────┐
│ Human approval     │
└─────────┬──────────┘
          ▼
┌────────────────────┐
│ External systems   │
│ Sanity, Slack, etc.│
└────────────────────┘

Tool permission design:

Low-risk:
- search web
- summarize public article
- retrieve internal note

Medium-risk:
- create draft
- send Slack message
- update non-public task

High-risk:
- publish article
- delete file
- send email externally
- run SQL write
- update production system

High-risk tools should require human approval.

17. Common design mistakes

Mistake 1: Giving the agent too many tools

Bad:

Agent has 40 tools.
Tool names are vague.
Descriptions overlap.
Model gets confused.

Better:

Give 5–8 tools.
Make names specific.
Use clear docstrings.
Split read tools and write tools.

Mistake 2: No structured output

Bad:

Agent returns a beautiful paragraph.
Your backend cannot reliably parse it.

Better:

Use Pydantic schema.
Return structured_response.
Validate fields.

Mistake 3: No human approval for write actions

Bad:

Agent can publish directly.

Better:

Agent can create draft.
Human approves publishing.

Mistake 4: Treating RAG as magic

Bad RAG:

Throw PDFs into vector DB.
Ask questions.
Trust answers.

a. Chunking strategy matters more than you'd expect

The default splitter most tutorials show — splitting by character count — produces chunks that cut across sentences mid-thought and hurt retrieval quality. Start here instead:

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,       # characters by default; use from_tiktoken_encoder() to count tokens
    chunk_overlap=50,     # overlap helps preserve context at boundaries
    separators=["\n\n", "\n", ".", " "]  # tries paragraph breaks first
)
chunks = splitter.split_documents(docs)

For structured content (docs with clear sections), consider MarkdownHeaderTextSplitter or HTMLHeaderTextSplitter so chunks stay semantically coherent.
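
To show what "tries paragraph breaks first" means in practice, here is a simplified splitter that packs whole paragraphs into chunks and never cuts inside one. It is a sketch of the principle only. `split_by_paragraph` is a hypothetical helper, and it omits the overlap and the fallback separators that a fuller splitter would use.

```python
def split_by_paragraph(text: str, chunk_size: int) -> list[str]:
    """Pack whole paragraphs into chunks of at most chunk_size characters.

    A single paragraph longer than chunk_size is kept whole rather than cut;
    a fuller splitter would fall back to sentence or word boundaries.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


text = (
    "First paragraph about pricing.\n\n"
    "Second paragraph about refunds.\n\n"
    "Third paragraph about support."
)
result = split_by_paragraph(text, chunk_size=70)
for chunk in result:
    print(repr(chunk))
```

With a 70-character limit the first two paragraphs share a chunk and the third starts a new one, so every chunk begins and ends on a paragraph boundary.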

b. Reranking: a two-stage retrieval upgrade

A vector similarity search retrieves the most similar chunks, not necessarily the most relevant ones. Adding a reranker as a second pass significantly improves answer quality — especially for longer documents.

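# In langchain>=1.0 this lives in the langchain-classic package; on 0.x import it from langchain.retrievers
from langchain_classic.retrievers import ContextualCompressionRetriever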
from langchain_cohere import CohereRerank

# Base retriever (vector search)
base_retriever = vectorstore.as_retriever(search_kwargs={"k": 20})

# Reranker — re-scores top 20 and returns top 5
compressor = CohereRerank(top_n=5)
retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=base_retriever
)

If you'd rather not add Cohere as a dependency, FlashrankRerank is a lightweight local alternative.
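
The two-stage idea can be shown with toy scorers. In the sketch below, the first pass is a cheap term-frequency count that is easily fooled by repetition, and the second pass is a stricter check on the shortlisted candidates. Both scorers and the sample documents are made up for illustration; a real reranker would be a cross-encoder or a hosted rerank model.

```python
import re


def words(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def first_pass(query: str, docs: list[str], k: int) -> list[str]:
    """Cheap recall stage: score by how often query words appear."""
    query_words = set(words(query))

    def score(doc: str) -> int:
        return sum(1 for word in words(doc) if word in query_words)

    return sorted(docs, key=score, reverse=True)[:k]


def rerank(query: str, candidates: list[str], top_n: int) -> list[str]:
    """Precision stage: toy scorer that rewards the exact query phrase."""
    phrase = query.lower()
    return sorted(candidates, key=lambda d: phrase in d.lower(), reverse=True)[:top_n]


docs = [
    "Refund terms and refund policy terms: see the policy on refund terms page.",
    "Our refund terms allow returns within 30 days.",
    "Shipping terms vary by region.",
    "Contact support for billing questions.",
]
query = "refund terms allow"

candidates = first_pass(query, docs, k=3)
best = rerank(query, candidates, top_n=1)
print("first pass top hit:", candidates[0])
print("after rerank:", best[0])
```

The first pass ranks the repetitive navigation text highest, and the second pass promotes the chunk that actually answers the query. That gap between "similar" and "relevant" is what a reranker is meant to close.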

c. Citing sources: surface the metadata

Retrieval without citation is a trust problem — your users can't verify what the agent found. Store source metadata at ingest time and surface it in the response:

from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI
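# In langchain>=1.0 this lives in the langchain-classic package; on 0.x import it from langchain.chains
from langchain_classic.chains import RetrievalQAWithSourcesChain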

# Store documents with source metadata
vectorstore = Chroma.from_documents(
    documents=chunks,   # each chunk should have doc.metadata["source"]
    embedding=embeddings
)

# Use the sources-aware chain variant
chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o"),
    retriever=vectorstore.as_retriever()
)

result = chain.invoke({"question": "What are the refund terms?"})
print(result["answer"])
print(result["sources"])  # returns the source filenames/URLs

Quick rule of thumb: retrieval quality degrades fast if your source documents are inconsistently formatted. Clean and normalise your data before indexing. It pays off more than any retrieval tuning.

Mistake 5: Not tracing runs

Bad:

Agent failed and nobody knows why.

Better:

Use traces.
Inspect prompts, tool calls, latency, token cost, errors.

18. Learning path

Stage 1: Basic LLM app

Build:

Prompt → Model → Output

Skills:

messages
system prompt
prompt template
output parser

Project:

Explain a technical concept in a clear, technical style.

Stage 2: Tool-calling agent

Build:

Agent + calculator/search tools

Skills:

@tool
create_agent
tool docstrings
agent.invoke()
tool result inspection

Project:

VC analyst agent that calculates CAGR, ARR multiple, and market size.

Stage 3: RAG

Build:

Documents → chunks → embeddings → vector DB → retriever → answer

Skills:

document loader
text splitter
embeddings
vectorstore
retriever
RAG prompt

Project:

Ask questions over your own blog posts or startup notes.

Stage 4: RAG as tool

Build:

Agent + internal knowledge search

Skills:

wrap retriever as tool
agent chooses when to retrieve
combine web/current info with internal memory

Project:

AI startup analyst that checks previous posts before drafting a new one.

Stage 5: LangGraph workflow

Build:

Explicit graph with state and routing

Skills:

StateGraph
nodes
edges
conditional routing
state updates
approval path

Project:

News research → rank → draft → validate → Sanity draft → approval.

Stage 6: Production

Build:

Backend + LangGraph + LangSmith + Slack/Sanity

Skills:

API endpoint
environment variables
tool permissions
human approval
logging
observability
evals
deployment

Project:

Slack command /news that creates Sanity drafts but never publishes automatically.

19. Practical project blueprint

A strong LangChain project looks like this:

User command:
"/news ai infrastructure 72h"

System:
1. Parse command
2. Search public sources
3. Retrieve old articles from internal knowledge base
4. Detect overlap
5. Rank news by relevance
6. Draft article
7. Validate source quality
8. Create Sanity draft
9. Send Slack confirmation
10. Wait for approval before publishing
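
Step 1 (parsing the command) is a good place to start, because it is plain code with no model involved. The sketch below assumes the command format shown above, with an optional trailing time window. `NewsCommand`, `parse_news_command`, and the 72-hour default are assumptions for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class NewsCommand:
    topic: str
    hours: int


def parse_news_command(text: str, default_hours: int = 72) -> NewsCommand:
    """Parse '/news <topic> [<N>h]'; the trailing time window is optional."""
    match = re.fullmatch(
        r"/news\s+(?P<topic>.+?)(?:\s+(?P<hours>\d+)h)?",
        text.strip(),
    )
    if match is None:
        raise ValueError("expected: /news <topic> [<hours>h]")
    hours = int(match.group("hours")) if match.group("hours") else default_hours
    return NewsCommand(topic=match.group("topic"), hours=hours)


print(parse_news_command("/news ai infrastructure 72h"))
print(parse_news_command("/news robotics"))
```

Validating input before it reaches the workflow keeps malformed commands from consuming model calls, and gives the backend a typed object to pass into the graph state.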

Technical stack:

Frontend/control: Slack
Backend: FastAPI or Next.js API route
Agent framework: LangChain
Workflow runtime: LangGraph
Knowledge layer: vector DB or LlamaIndex retriever
CMS: Sanity
Observability: LangSmith
Hosting: Vercel / Render / Railway

Architecture:

┌─────────────────────┐
│ Slack command       │
│ /news ai 72h        │
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ Backend API         │
│ validates request   │
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ LangGraph workflow  │
│ stateful execution  │
└──────────┬──────────┘
           │
           ├──────────► search_recent_news()
           ├──────────► retrieve_past_posts()
           ├──────────► rank_sources()
           ├──────────► draft_article()
           ├──────────► validate_article()
           ├──────────► create_sanity_draft()
           └──────────► send_slack_message()
           ▼
┌─────────────────────┐
│ LangSmith tracing   │
└─────────────────────┘

20. Final mental model

LangChain by itself helps you build fast:

model + tools + agent

LangGraph helps you make it reliable:

state + graph + durable execution + human approval

LangSmith helps you understand and improve it:

traces + debugging + evaluation + monitoring

The core idea:

LLM alone = brain in a box

LangChain = brain connected to tools

LangGraph = brain inside a controlled workflow

LangSmith = camera watching what the brain did

For a beginner, LangChain can look scary because there are many abstractions. But the whole thing becomes much easier if you remember the build order:

1. Model
2. Prompt
3. Tool
4. Agent
5. Structured output
6. RAG
7. Workflow graph
8. Human approval
9. Observability
10. Production deployment

LangChain is not just “a Python library.” It is closer to an agent application framework: the layer that lets your AI system reason, choose tools, call APIs, write drafts, ask for approval, and integrate with real software.

#LANGCHAIN #LANGGRAPH #AI AGENTS #LLM APPS #RAG #AGENT ENGINEERING