AI Agent 101: LangChain Beginner-to-Builder Agent for Your App
A technical beginner-to-builder guide to LangChain, LangGraph, and LangSmith: what LangChain is, how agents use tools, how to build workflows, and how to deploy an AI agent into a real app.

LangChain is a framework for building LLM-powered applications, especially applications where the model needs to interact with tools, retrieve data, follow workflows, return structured outputs, or behave like an agent.
The modern LangChain stack is no longer just prompt → model → answer. It is more like:
LangChain = high-level framework for agents and LLM apps
LangGraph = low-level orchestration/runtime layer for stateful agents
LangSmith = observability, tracing, evaluation, monitoring
LangChain is best understood as a framework for connecting LLMs to tools, data, workflows, and production systems.
1. What LangChain actually is
At the simplest level, LangChain helps you connect:
User input
→ prompt
→ model
→ tools
→ data sources
→ memory/state
→ final output
But technically, LangChain is an application framework around LLMs.
It gives you reusable abstractions for:
LLM / chat model interfaces
Prompt templates
Message objects
Tool calling
Agents
Retrievers
RAG pipelines
Structured output
Middleware
Streaming
Human review
Tracing and observability
The reason LangChain exists is that real AI applications usually need more than one model call.
A normal chatbot might do this:
User: "Explain revenue recognition"
→ LLM
→ Answer
But a real AI agent may need this:
User: "Find the latest funding round, compare competitors, and draft a blog post"
→ classify task
→ search web
→ retrieve internal notes
→ call database
→ rank sources
→ generate draft
→ validate citations
→ ask human approval
→ create CMS draft
→ send Slack notification
That multi-step system is where LangChain becomes useful.
2. The modern LangChain stack
Think of the LangChain stack as three layers beneath your app.
┌──────────────────────────────────────────────┐
│ Your App │
│ Slack bot, web app, CLI, CMS agent, etc. │
└──────────────────────┬───────────────────────┘
│
┌──────────────────────▼───────────────────────┐
│ LangChain │
│ Agents, tools, prompts, structured output, │
│ model abstraction, middleware │
└──────────────────────┬───────────────────────┘
│
┌──────────────────────▼───────────────────────┐
│ LangGraph │
│ Stateful graph runtime, durable execution, │
│ memory, persistence, human-in-the-loop │
└──────────────────────┬───────────────────────┘
│
┌──────────────────────▼───────────────────────┐
│ LangSmith │
│ Tracing, debugging, evaluation, monitoring │
└──────────────────────────────────────────────┘
LangChain is the easy entry point. It gives you prebuilt agent patterns and standard interfaces.
LangGraph is the lower-level runtime for complex stateful workflows. It is designed for long-running agents, durable execution, streaming, human-in-the-loop, persistence, and memory.
LangSmith is the observability and evaluation layer. It helps with tracing, debugging, monitoring, and evaluating LLM applications.
3. The core mental model
LangChain applications usually follow this loop:
Input
→ model thinks
→ model decides whether it needs a tool
→ tool executes
→ result goes back to model
→ model decides next step
→ final answer
Visualized:
              ┌──────────────────────┐
              │      User input      │
              └──────────┬───────────┘
                         │
                         ▼
              ┌──────────────────────┐
              │     LLM / Agent      │
              └──────────┬───────────┘
                         │
         ┌───────────────┴────────────────┐
         │                                │
         ▼                                ▼
┌──────────────────┐             ┌──────────────────┐
│  Return answer   │             │    Call tool     │
└──────────────────┘             └────────┬─────────┘
                                          │
                                          ▼
                                 ┌──────────────────┐
                                 │   Tool result    │
                                 └────────┬─────────┘
                                          │
                                          ▼
                                 Back to LLM / Agent
This is why LangChain is strongly associated with agents.
A normal LLM answers from its own context. A LangChain agent can decide to use external tools.
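The loop above can be sketched in plain Python without any framework. This is only an illustration of the control flow, not LangChain's implementation: `fake_model`, `get_share_price`, and the canned quote are hypothetical stand-ins for a real LLM and a real finance API.

```python
from typing import Callable

def get_share_price(ticker: str) -> str:
    """Hypothetical tool: returns a canned quote instead of calling a real API."""
    return f"{ticker}: 123.45 USD (stub data)"

# Registry of tools the "model" is allowed to call, keyed by name.
TOOLS: dict[str, Callable[[str], str]] = {"get_share_price": get_share_price}

def fake_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM: asks for a tool once, then answers from its result."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"type": "answer", "content": f"Based on the tool result: {last['content']}"}
    return {"type": "tool_call", "name": "get_share_price", "argument": "NVDA"}

def run_agent(user_input: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):  # cap iterations so the loop cannot run forever
        decision = fake_model(messages)
        if decision["type"] == "answer":
            return decision["content"]
        # The model asked for a tool: run it and feed the result back.
        tool = TOOLS[decision["name"]]
        result = tool(decision["argument"])
        messages.append({"role": "tool", "content": result})
    return "Stopped: step limit reached."

final_answer = run_agent("What is the current share price of NVIDIA?")
print(final_answer)
```

The step cap matters in practice: without it, a model that keeps requesting tools would never return.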
Example:
Question: "What is the current share price of NVIDIA and summarize today's movement?"
LLM alone:
- May hallucinate because it does not know live prices.
LangChain agent:
- Calls a finance/search tool.
- Gets current data.
- Summarizes based on the tool result.
4. Key concepts you need to understand
4.1 Model
The model is the LLM itself.
Examples:
OpenAI GPT models
Anthropic Claude models
Google Gemini models
Mistral models
local models through Ollama
LangChain tries to make model usage provider-agnostic, meaning you can write similar code while switching model providers.
Example:
from langchain.chat_models import init_chat_model
model = init_chat_model("openai:gpt-4.1")
Conceptually:
Your code
→ LangChain model interface
→ actual provider API
→ model response
4.2 Messages
Modern LLMs usually use chat messages, not one giant text string.
Typical message roles:
system = high-level instruction
user = user's message
assistant = model's previous response
tool = result returned from a tool
Example:
messages = [
{"role": "system", "content": "You are a helpful financial analyst."},
{"role": "user", "content": "Explain EBITDA margin."}
]
response = model.invoke(messages)
The system message controls behavior. The user message contains the actual task.
4.3 Prompt template
A prompt template is a reusable prompt with variables.
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
("system", "You are an expert {role}."),
("user", "Explain {topic} in a technical but clear way.")
])
formatted_prompt = prompt.invoke({
"role": "AI engineer",
"topic": "gradient descent"
})
4.4 Tool
A tool is a function the model can call.
Tools can do things like:
Search the web
Query a database
Read a PDF
Send a Slack message
Create a Sanity draft
Call an internal API
Run Python code
Retrieve customer records
Example:
from langchain.tools import tool
@tool
def calculate_revenue_growth(old_revenue: float, new_revenue: float) -> float:
    """Calculate revenue growth rate from old revenue to new revenue."""
    return (new_revenue - old_revenue) / old_revenue
The docstring matters because the LLM uses it to understand when to call the tool.
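To see why names, type hints, and docstrings matter, it may help to picture the kind of description a model receives for each tool. The sketch below is a plain-Python approximation, not LangChain's actual schema generation; `describe_tool` is a hypothetical helper and the output format is an assumption made for illustration.

```python
import inspect
from typing import get_type_hints

def describe_tool(func) -> dict:
    """Build a simple, hypothetical tool description from a function's metadata."""
    hints = get_type_hints(func)
    # Map each parameter name to the name of its annotated type.
    params = {
        name: hints.get(name, str).__name__
        for name in inspect.signature(func).parameters
    }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
    }

def calculate_revenue_growth(old_revenue: float, new_revenue: float) -> float:
    """Calculate revenue growth rate from old revenue to new revenue."""
    return (new_revenue - old_revenue) / old_revenue

schema = describe_tool(calculate_revenue_growth)
print(schema)
```

A function named `calc(x, y)` with no docstring would produce an almost empty description, which gives the model very little to base a tool choice on.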
Tool calling flow:
User asks question
→ LLM sees available tools
→ LLM chooses tool
→ LangChain executes function
→ result goes back to LLM
→ LLM writes final response
4.5 Agent
An agent is an LLM system that can choose actions.
Example:
from langchain.agents import create_agent
from langchain.tools import tool
@tool
def search_company_database(company_name: str) -> str:
    """Search internal company database for a company profile."""
    return f"{company_name}: SaaS company, Series B, $20m ARR."
agent = create_agent(
model="openai:gpt-4.1",
tools=[search_company_database],
system_prompt="You are a VC analyst. Use tools when needed."
)
result = agent.invoke({
"messages": [
{"role": "user", "content": "Find information about Acme AI and summarize it."}
]
})
Conceptually:
Agent = LLM + tools + instructions + control loop
4.6 Structured output
Structured output means the model returns data in a fixed schema, not free-form prose.
Instead of:
"Apple is a public technology company with strong profitability..."
You may want:
{
"company": "Apple",
"sector": "Technology",
"public_company": true,
"risk_score": 2
}
Example with Pydantic:
from pydantic import BaseModel, Field
from langchain.agents import create_agent
class CompanyAnalysis(BaseModel):
    company_name: str = Field(description="Name of the company")
    product_type: str = Field(description="Software, hardware, marketplace, drug, service, etc.")
    target_customers: list[str]
    investment_view: str
    risk_score: int = Field(description="Risk score from 1 to 5")
agent = create_agent(
model="openai:gpt-4.1",
tools=[],
response_format=CompanyAnalysis,
system_prompt="You are a technical VC analyst."
)
result = agent.invoke({
"messages": [
{"role": "user", "content": "Analyze LangChain as a company/product."}
]
})
structured = result["structured_response"]
This is extremely important for production because downstream systems often need JSON, not essays.
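The value of a schema shows up at the boundary where your backend consumes the model's output. The following is a minimal sketch using only the standard library; the `CompanySummary` dataclass is a hypothetical stand-in for a Pydantic model, and the 1 to 5 range for `risk_score` is taken from the schema description above.

```python
import json
from dataclasses import dataclass

@dataclass
class CompanySummary:
    company: str
    sector: str
    public_company: bool
    risk_score: int

def parse_company_summary(raw: str) -> CompanySummary:
    """Parse a JSON string and reject anything that does not match the schema."""
    data = json.loads(raw)
    summary = CompanySummary(**data)  # raises TypeError on missing or extra keys
    if not isinstance(summary.company, str) or not isinstance(summary.sector, str):
        raise ValueError("company and sector must be strings")
    if not isinstance(summary.public_company, bool):
        raise ValueError("public_company must be a boolean")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(summary.risk_score, bool) or not isinstance(summary.risk_score, int):
        raise ValueError("risk_score must be an integer")
    if not 1 <= summary.risk_score <= 5:
        raise ValueError("risk_score must be between 1 and 5")
    return summary

good = '{"company": "Apple", "sector": "Technology", "public_company": true, "risk_score": 2}'
parsed = parse_company_summary(good)
print(parsed)
```

Pydantic performs this kind of checking for you from the field declarations, which is why the agent examples in this guide pass a model class as the response format.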
4.7 Middleware
Middleware lets you intercept or modify agent behavior.
Middleware can be used to:
Add dynamic system prompts
Limit tool access
Summarize long conversations
Apply safety rules
Insert user profile context
Control model selection
Add human approval
Modify state before/after model calls
4.8 Human-in-the-loop
Human-in-the-loop means the agent pauses before doing sensitive actions.
Example risky actions:
Send email
Delete file
Run SQL write query
Publish article
Transfer money
Change production database
Architecture:
Agent proposes action
→ policy checks tool call
→ risky?
    yes → pause and ask human
    no → execute tool
→ continue workflow
For a Sanity/news agent, this is exactly the pattern you want:
Draft article = okay automatically
Publish article = requires approval
Delete article = requires approval
Overwrite live post = requires approval
5. How LangChain is different from just calling the OpenAI API
LangChain is genuinely useful — but it carries real costs that are worth knowing upfront so you can make an informed choice rather than hitting them by surprise.
What LangChain gives you
- A consistent interface across LLM providers (swap Claude for GPT-4 by changing one line)
- Pre-built chains, retrievers, and agent types that save significant boilerplate
- LangSmith for tracing and debugging agent runs
- A large ecosystem of integrations (100+ vector stores, tools, loaders)
For anything involving RAG, multi-step chains, or multi-tool agents, this is a genuine time-saver.
What it costs you
API churn. LangChain has historically moved fast and broken things. Imports have shifted multiple times (langchain → langchain_core → langchain_community), function signatures have changed, and chains that worked in 0.1 may need rewriting in 0.2+. If you copy a tutorial from 12 months ago, expect to debug import errors before you debug your actual problem.
Abstraction overhead. Every LangChain abstraction wraps something simpler. When things go wrong — and in agents, they will — you're debugging the abstraction, not just your logic. Beginners often find direct SDK calls easier to reason about and fix.
Hidden complexity. AgentExecutor and similar wrappers have many configuration knobs (early stopping, error handling, max iterations) that you often only discover when your agent loops infinitely or silently swallows errors.
A simple decision rule
| Use case | Recommendation |
|---|---|
| Single LLM call | Use the OpenAI / Anthropic SDK directly |
| A few chained steps | Use the OpenAI / Anthropic SDK directly |
| RAG pipeline | LangChain adds real value |
| Multi-tool agent | LangChain / LangGraph |
| Production with tracing | LangSmith + LangGraph |
6. Step-by-step: how to use LangChain
Step 0: Decide what you are building
Before writing code, classify the app.
Type A: Simple chatbot
Type B: RAG assistant over documents
Type C: Tool-calling agent
Type D: Workflow agent with approvals
Type E: Multi-agent / long-running system
LangChain is most useful for C, D, and E.
For example, an AI news/Sanity agent is Type D:
Search news
→ rank stories
→ write article
→ create Sanity draft
→ ask approval
→ maybe publish later
Step 1: Install packages
Typical setup:
pip install -U langchain langchain-openai langgraph langsmith python-dotenv
Depending on the provider, you may install more packages:
pip install -U langchain-anthropic
pip install -U langchain-google-genai
pip install -U langchain-community
Example .env:
OPENAI_API_KEY="your_key_here"
LANGSMITH_API_KEY="your_langsmith_key_here"
LANGSMITH_TRACING="true"
LANGSMITH_PROJECT="my-langchain-agent"
Step 2: Create your first model call
from dotenv import load_dotenv
from langchain.chat_models import init_chat_model
load_dotenv()
model = init_chat_model("openai:gpt-4.1")
response = model.invoke("Explain LangChain in one sentence.")
print(response.content)
Step 3: Use messages properly
messages = [
{
"role": "system",
"content": "You are a senior AI engineer. Be precise and technical."
},
{
"role": "user",
"content": "Explain tool calling in LangChain."
}
]
response = model.invoke(messages)
print(response.content)
Step 4: Add a prompt template
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
("system", "You are a {role}. Your tone is {tone}."),
("user", "Explain this topic: {topic}")
])
chain = prompt | model
response = chain.invoke({
"role": "technical AI educator",
"tone": "precise but understandable",
"topic": "LangChain agents"
})
print(response.content)
The | operator creates a chain:
PromptTemplate → Model
Step 5: Add output parsing
from langchain_core.output_parsers import StrOutputParser
chain = prompt | model | StrOutputParser()
result = chain.invoke({
"role": "AI engineer",
"tone": "technical",
"topic": "RAG"
})
print(result)
Now the flow is:
Prompt → Model → String parser → clean text
7. Building a tool-calling agent
Step 1: Define tools
from langchain.tools import tool
@tool
def multiply(a: float, b: float) -> float:
    """Multiply two numbers together."""
    return a * b
@tool
def calculate_cagr(start_value: float, end_value: float, years: float) -> float:
    """Calculate compound annual growth rate."""
    return (end_value / start_value) ** (1 / years) - 1
Important: tools should have clear names, type hints, and docstrings.
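Because the model, not your code, supplies the arguments, it can also help to validate inputs inside the tool. Here is a minimal sketch, assuming illustrative figures (20 growing to 45 over 3 years) and a hypothetical plain function `cagr` rather than a decorated tool:

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate, with basic input checks."""
    # Guard against arguments that would divide by zero or produce nonsense.
    if start_value <= 0 or end_value <= 0:
        raise ValueError("start_value and end_value must be positive")
    if years <= 0:
        raise ValueError("years must be positive")
    return (end_value / start_value) ** (1 / years) - 1

growth = cagr(20, 45, 3)
print(f"{growth:.2%}")
```

For these figures the result is roughly 31% per year, since (45 / 20)^(1/3) - 1 ≈ 0.310. Raising a clear error on bad input gives the agent something it can report or retry on, instead of a crash deep inside the tool.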
Bad tool:
def calc(x, y):
    return x * y
Good tool:
@tool
def calculate_revenue_growth(old_revenue: float, new_revenue: float) -> float:
    """Calculate revenue growth rate between two revenue figures."""
    return (new_revenue - old_revenue) / old_revenue
Step 2: Create the agent
from langchain.agents import create_agent
agent = create_agent(
model="openai:gpt-4.1",
tools=[multiply, calculate_cagr],
system_prompt="You are a financial analyst. Use tools for calculations."
)
Step 3: Invoke the agent
result = agent.invoke({
"messages": [
{
"role": "user",
"content": "Revenue grew from 20 to 45 over 3 years. What is the CAGR?"
}
]
})
print(result["messages"][-1].content)
Internal flow:
User asks CAGR
→ Agent sends question + tool schemas to model
→ Model chooses calculate_cagr
→ LangChain executes calculate_cagr(20, 45, 3)
→ Tool result returns to model
→ Model writes final explanation
8. Building a research/news agent
Goal:
User asks:
"Find recent AI startup news and draft a blog post."
Agent should:
1. Search for news
2. Rank stories
3. Select strongest one
4. Draft article
5. Create Sanity draft
6. Return draft link
7. Never publish automatically
Tool definitions
from langchain.tools import tool
from typing import List, Dict
@tool
def search_news(query: str, hours: int = 72) -> List[Dict]:
    """Search recent news articles matching a query within the last N hours."""
    # Placeholder result; a real implementation would call a news or search API.
    return [
        {
            "title": "Example AI startup raises Series A",
            "source": "Tech publication",
            "url": "https://example.com",
            "published_at": "2026-06-05",
            "summary": "The company raised funding to build AI infrastructure."
        }
    ]
@tool
def create_sanity_draft(title: str, body: str, tags: List[str]) -> str:
    """Create a draft article in Sanity CMS. This does not publish the article."""
    # Placeholder URL; a real implementation would call the Sanity API.
    return "https://sanity.io/draft/example-draft-id"
Agent
agent = create_agent(
model="openai:gpt-4.1",
tools=[search_news, create_sanity_draft],
system_prompt="""
You are an AI news research agent.
Rules:
- Search for relevant news before drafting.
- Prefer AI, startups, venture capital, deep tech, spacetech, life sciences, and fintech.
- Create Sanity drafts only.
- Never publish live posts.
- Always return source links used.
- Be concise but analytical.
"""
)
Invocation
result = agent.invoke({
"messages": [
{
"role": "user",
"content": "Find the hottest AI infrastructure news in the last 72 hours and create one Sanity draft."
}
]
})
print(result["messages"][-1].content)
Architecture:
User
│
▼
LangChain Agent
│
├── search_news()
│ │
│ ▼
│ list of articles
│
├── model ranks stories
│
├── model drafts article
│
├── create_sanity_draft()
│ │
│ ▼
│ draft URL
│
▼
Final response to user
9. Adding structured output to the agent
For a news agent, you may want structured output like:
{
"selected_story_title": "...",
"sector": "AI infrastructure",
"source_urls": ["..."],
"why_it_matters": "...",
"draft_url": "...",
"confidence_score": 0.88
}
Define schema:
from pydantic import BaseModel, Field
from typing import List
class NewsAgentOutput(BaseModel):
    selected_story_title: str
    sector: str
    source_urls: List[str]
    why_it_matters: str
    draft_url: str
    confidence_score: float = Field(ge=0, le=1)
Create agent:
agent = create_agent(
model="openai:gpt-4.1",
tools=[search_news, create_sanity_draft],
response_format=NewsAgentOutput,
system_prompt="You are a technical AI news agent. Create drafts, never publish."
)
Invoke:
result = agent.invoke({
"messages": [
{"role": "user", "content": "Draft a post about the top AI startup news today."}
]
})
print(result["structured_response"])
10. Where LangGraph comes in
LangChain agents are great for fast building.
But sometimes you need more control.
For example, this is too important to leave to a loose agent loop:
Search news
→ rank story
→ draft article
→ fact-check
→ ask human approval
→ create CMS draft
→ wait for approval
→ publish
That should be a graph.
Instead of:
"Agent, figure it out."
You write:
Do step 1.
Then step 2.
If confidence < 0.7, go to human review.
If approved, continue.
If rejected, revise.
Visual graph:
         ┌─────────────────┐
         │  User request   │
         └────────┬────────┘
                  ▼
         ┌─────────────────┐
         │  Classify task  │
         └────────┬────────┘
                  ▼
         ┌─────────────────┐
         │   Search news   │
         └────────┬────────┘
                  ▼
         ┌─────────────────┐
         │  Rank stories   │
         └────────┬────────┘
                  ▼
         ┌─────────────────┐
         │  Draft article  │
         └────────┬────────┘
                  ▼
         ┌─────────────────┐
         │    Validate     │
         └────────┬────────┘
        ┌─────────┴─────────┐
        ▼                   ▼
┌────────────────┐  ┌────────────────┐
│  Human review  │  │  Create draft  │
└───────┬────────┘  └───────┬────────┘
        ▼                   ▼
┌────────────────┐  ┌────────────────┐
│     Revise     │  │  Return link   │
└────────────────┘  └────────────────┘
11. Agent vs graph: when to use which
Use a LangChain agent when:
The task is flexible.
The model can decide which tool to use.
The cost of mistakes is low to medium.
You want to build quickly.
Use LangGraph when:
The workflow has fixed steps.
You need approvals.
You need retries.
You need state.
You need auditability.
The agent may run for a long time.
The action can affect real systems.
| Need | LangChain Agent | LangGraph |
|---|---:|---:|
| Quick prototype | Strong | Medium |
| Tool calling | Strong | Strong |
| Fixed workflow | Medium | Strong |
| Stateful execution | Medium | Strong |
| Human approval | Good | Strong |
| Long-running process | Medium | Strong |
| Production control | Good | Strong |
| Simple chatbot | Fine | Overkill |
12. RAG with LangChain
RAG means Retrieval-Augmented Generation.
Instead of asking the model to answer from memory, you retrieve relevant documents and feed them into the prompt.
Flow:
User question
→ embed question
→ search vector database
→ retrieve relevant chunks
→ insert chunks into prompt
→ model answers using retrieved context
Visual:
┌────────────────────┐
│ User question │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ Embedding model │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ Vector database │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ Retrieved chunks │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ Prompt + context │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ LLM answer │
└────────────────────┘
Basic LangChain RAG architecture:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
loader = TextLoader("./data/company_notes.txt")
docs = loader.load()
splitter = RecursiveCharacterTextSplitter(
chunk_size=800,
chunk_overlap=100
)
chunks = splitter.split_documents(docs)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
prompt = ChatPromptTemplate.from_template("""
You are a precise analyst.
Answer the question using only the context below.
Context:
{context}
Question:
{question}
""")
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
rag_chain = (
{
"context": retriever | format_docs,
"question": lambda x: x
}
| prompt
| model
| StrOutputParser()
)
answer = rag_chain.invoke("What is the company's main product?")
print(answer)
This is not yet an agent. It is a retrieval chain.
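To see what the embed, search, and retrieve steps do, here is a toy version in plain Python. It is a sketch under loud assumptions: word counts stand in for a learned embedding model, a Python list stands in for the vector database, and the sample chunks are made up.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts. Real systems use a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "The company's main product is a vector database for AI teams.",
    "The office is located in Berlin and has forty employees.",
    "Pricing for the main product starts with a free tier.",
]
question = "What is the company's main product?"
top = retrieve(question, chunks, k=2)

# Insert the retrieved chunks into the prompt, as in the flow above.
context = "\n\n".join(top)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The office chunk shares few words with the question, so it is left out of the prompt. A real embedding model does the same job but matches on meaning rather than exact words.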
13. Turning RAG into a tool
A powerful pattern is to wrap RAG as a tool.
@tool
def search_internal_knowledge_base(query: str) -> str:
    """Search internal knowledge base for relevant company, market, and product information."""
    return rag_chain.invoke(query)
Then give this tool to an agent:
agent = create_agent(
model="openai:gpt-4.1",
tools=[search_internal_knowledge_base, search_news, create_sanity_draft],
system_prompt="""
You are a research agent.
Use internal knowledge when the user asks about previous posts, companies, or market notes.
Use news search for current events.
"""
)
Now the agent can decide:
Need internal context? → call RAG tool
Need current news? → call news search
Need CMS draft? → call Sanity tool
14. Memory and state
Do not confuse three things:
Chat history
Memory
State
Chat history
The previous messages in a conversation.
Memory
Information stored across turns or sessions.
Example:
User prefers VC-style explanations.
User is building a Sanity-based AI news blog.
User wants drafts but not auto-publishing.
State
Current workflow variables.
Example:
{
"task": "draft news article",
"sources_found": 12,
"selected_source": "...",
"draft_status": "pending_review",
"approval": None
}
LangGraph is especially strong when you need state.
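The idea of explicit state plus routing can be sketched without any library. The code below is a plain-Python stand-in, not the LangGraph API: the node functions, the `NODES` and `EDGES` tables, and `run_graph` are all hypothetical names, and the 0.7 confidence threshold is an illustrative assumption.

```python
from typing import Callable

State = dict  # workflow state: a plain dict passed from node to node

def draft_article(state: State) -> State:
    return {**state, "draft": f"Draft about {state['task']}", "draft_status": "drafted"}

def validate(state: State) -> State:
    # Hypothetical check: in practice the score would come from a model or heuristic.
    return {**state, "confidence": state.get("confidence", 0.5)}

def human_review(state: State) -> State:
    return {**state, "draft_status": "pending_review"}

def create_draft(state: State) -> State:
    return {**state, "draft_status": "draft_created"}

def route_after_validate(state: State) -> str:
    # Conditional edge: low confidence goes to a human, otherwise continue.
    return "human_review" if state["confidence"] < 0.7 else "create_draft"

NODES: dict[str, Callable[[State], State]] = {
    "draft_article": draft_article,
    "validate": validate,
    "human_review": human_review,
    "create_draft": create_draft,
}
# Each entry names the next node, a routing function, or None to stop.
EDGES = {
    "draft_article": "validate",
    "validate": route_after_validate,
    "human_review": None,
    "create_draft": None,
}

def run_graph(state: State, start: str = "draft_article") -> State:
    node = start
    while node is not None:
        state = NODES[node](state)
        nxt = EDGES[node]
        node = nxt(state) if callable(nxt) else nxt
    return state

low = run_graph({"task": "AI infrastructure news", "confidence": 0.55})
high = run_graph({"task": "AI infrastructure news", "confidence": 0.9})
print(low["draft_status"], high["draft_status"])
```

The point of the design is that the path is decided by your code and the state, not by the model. A graph runtime adds persistence, pausing, and resuming on top of this basic shape.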
15. Observability with LangSmith
When building agents, you need to see what happened inside.
Without observability, debugging is painful:
Why did the agent call the wrong tool?
Which prompt caused the bad answer?
How much did this run cost?
Which source did it use?
Where did hallucination enter?
How long did each step take?
A trace can show:
Run
├── prompt formatting
├── model call
├── tool call
│ └── tool output
├── second model call
├── parser
└── final answer
For production, tracing is not just nice to have. It becomes necessary because LLM agents are probabilistic and hard to debug from logs alone.
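What a trace records can be illustrated with a small decorator. This is a sketch only: it does not use the LangSmith SDK, the in-memory `TRACE` list is a stand-in for a tracing backend, and the two stub functions are hypothetical.

```python
import functools
import time

TRACE: list[dict] = []  # in-memory stand-in for a tracing backend

def traced(step_name: str):
    """Record inputs, output, errors, and latency for each call of a step."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            record = {"step": step_name, "args": args, "kwargs": kwargs}
            try:
                record["output"] = func(*args, **kwargs)
                return record["output"]
            except Exception as exc:
                record["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so every call leaves a record.
                record["seconds"] = time.perf_counter() - started
                TRACE.append(record)
        return wrapper
    return decorator

@traced("tool:search_news")
def search_news_stub(query: str) -> list[str]:
    return [f"Stub headline about {query}"]

@traced("model:summarize")
def summarize_stub(headlines: list[str]) -> str:
    return f"{len(headlines)} headline(s) found."

summary = summarize_stub(search_news_stub("AI infrastructure"))
for entry in TRACE:
    print(entry["step"], f"{entry['seconds']:.6f}s")
```

A hosted tracing tool adds what this sketch lacks: nesting of steps into a run tree, token and cost accounting, and a UI for comparing runs.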
16. Production architecture
A proper LangChain production app should not be:
Frontend directly calls agent with full permissions.
That is risky.
Better architecture:
┌────────────────────┐
│ Frontend / Slack │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ Backend API │
│ Auth + validation │
└─────────┬──────────┘
│
▼
┌────────────────────┐
│ LangGraph workflow │
│ state + routing │
└─────────┬──────────┘
│
├──────────────► Read-only tools
│ - search
│ - retrieval
│ - database read
│
├──────────────► Write tools
│ - create draft
│ - send message
│
├──────────────► Restricted tools
│ - publish
│ - delete
│ - overwrite
│
▼
┌────────────────────┐
│ Human approval │
└─────────┬──────────┘
▼
┌────────────────────┐
│ External systems │
│ Sanity, Slack, etc.│
└────────────────────┘
Tool permission design:
Low-risk:
- search web
- summarize public article
- retrieve internal note
Medium-risk:
- create draft
- send Slack message
- update non-public task
High-risk:
- publish article
- delete file
- send email externally
- run SQL write
- update production system
High-risk tools should require human approval.
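One way to enforce these tiers is a small policy check that sits between the agent and the tools. The sketch below is framework-free and hypothetical throughout: `RISK_LEVELS`, `execute_tool_call`, and the approval callbacks are illustrative names, and a real system would ask a person through Slack or a review UI instead of calling a function.

```python
from typing import Callable

# Risk tier per tool name; anything not listed is treated as high risk.
RISK_LEVELS = {
    "search_web": "low",
    "create_draft": "medium",
    "publish_article": "high",
}

def execute_tool_call(
    tool_name: str,
    run_tool: Callable[[], str],
    approve: Callable[[str], bool],
) -> str:
    """Run a tool only if policy allows it; high-risk calls need approval."""
    risk = RISK_LEVELS.get(tool_name, "high")
    if risk == "high" and not approve(tool_name):
        return f"{tool_name}: blocked, approval denied"
    return run_tool()

def deny_all(tool_name: str) -> bool:
    return False

def approve_all(tool_name: str) -> bool:
    return True

draft_result = execute_tool_call("create_draft", lambda: "draft created", deny_all)
blocked_result = execute_tool_call("publish_article", lambda: "published", deny_all)
approved_result = execute_tool_call("publish_article", lambda: "published", approve_all)
print(draft_result, blocked_result, approved_result, sep="\n")
```

Defaulting unknown tools to high risk is a deliberate choice: a newly added tool stays gated until someone classifies it.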
17. Common design mistakes
Mistake 1: Giving the agent too many tools
Bad:
Agent has 40 tools.
Tool names are vague.
Descriptions overlap.
Model gets confused.
Better:
Give 5–8 tools.
Make names specific.
Use clear docstrings.
Split read tools and write tools.
Mistake 2: No structured output
Bad:
Agent returns a beautiful paragraph.
Your backend cannot reliably parse it.
Better:
Use Pydantic schema.
Return structured_response.
Validate fields.
Mistake 3: No human approval for write actions
Bad:
Agent can publish directly.
Better:
Agent can create draft.
Human approves publishing.
Mistake 4: Treating RAG as magic
Bad RAG:
Throw PDFs into vector DB.
Ask questions.
Trust answers.
a. Chunking strategy matters more than you'd expect
A naive fixed-size split by character count, which many tutorials show, produces chunks that cut sentences mid-thought and hurts retrieval quality. A recursive splitter that tries paragraph and sentence boundaries first is a better starting point:
from langchain_text_splitters import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(
chunk_size=500, # characters by default; use from_tiktoken_encoder() to size chunks in tokens
chunk_overlap=50, # overlap helps preserve context at boundaries
separators=["\n\n", "\n", ".", " "] # tries paragraph breaks first
)
chunks = splitter.split_documents(docs)
For structured content (docs with clear sections), consider MarkdownHeaderTextSplitter or HTMLHeaderTextSplitter so chunks stay semantically coherent.
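To make boundary-aware chunking concrete, here is a small plain-Python sketch that packs whole paragraphs into chunks. It is not how RecursiveCharacterTextSplitter is implemented; `chunk_paragraphs` and the sample text are hypothetical, and overlap is left out for brevity.

```python
def chunk_paragraphs(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole paragraphs into chunks no longer than max_chars where possible."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate  # paragraph still fits in the current chunk
        else:
            if current:
                chunks.append(current)
            current = para  # an oversized paragraph is kept whole in this sketch
    if current:
        chunks.append(current)
    return chunks

sample = "A" * 50 + "\n\n" + "B" * 50 + "\n\n" + "C" * 150
pieces = chunk_paragraphs(sample, max_chars=120)
print([len(p) for p in pieces])
```

The first two paragraphs fit together in one chunk, while the long third paragraph becomes its own chunk instead of being cut in the middle. A production splitter would fall back to sentence or word boundaries for paragraphs that exceed the limit.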
b. Reranking: a two-stage retrieval upgrade
A vector similarity search retrieves the most similar chunks, not necessarily the most relevant ones. Adding a reranker as a second pass significantly improves answer quality — especially for longer documents.
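# Import path for LangChain 0.x; on 1.x this class is expected to live in the langchain-classic package (langchain_classic.retrievers).
from langchain.retrievers import ContextualCompressionRetriever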
from langchain_cohere import CohereRerank
# Base retriever (vector search)
base_retriever = vectorstore.as_retriever(search_kwargs={"k": 20})
# Reranker — re-scores top 20 and returns top 5
compressor = CohereRerank(top_n=5)
retriever = ContextualCompressionRetriever(
base_compressor=compressor,
base_retriever=base_retriever
)
If you'd rather not add Cohere as a dependency, FlashrankRerank is a lightweight local alternative.
c. Citing sources: surface the metadata
Retrieval without citation is a trust problem — your users can't verify what the agent found. Store source metadata at ingest time and surface it in the response:
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI
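# Legacy chain, import path for LangChain 0.x; on 1.x it is expected to live in langchain_classic.chains.
from langchain.chains import RetrievalQAWithSourcesChain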
# Store documents with source metadata
vectorstore = Chroma.from_documents(
documents=chunks, # each chunk should have doc.metadata["source"]
embedding=embeddings
)
# Use the sources-aware chain variant
chain = RetrievalQAWithSourcesChain.from_chain_type(
llm=ChatOpenAI(model="gpt-4o"),
retriever=vectorstore.as_retriever()
)
result = chain.invoke({"question": "What are the refund terms?"})
print(result["answer"])
print(result["sources"]) # returns the source filenames/URLs
Quick rule of thumb: retrieval quality degrades fast if your source documents are inconsistently formatted. Clean and normalise your data before indexing; it pays off more than any retrieval tuning.
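If you assemble the prompt yourself instead of using a prebuilt chain, one simple approach is to number each retrieved chunk and keep a parallel list of sources. A minimal sketch, assuming a hypothetical `Doc` dataclass that mimics a retrieved document with `page_content` and `metadata`, and made-up file names:

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Minimal stand-in for a retrieved document with metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_docs_with_sources(docs: list[Doc]) -> tuple[str, list[str]]:
    """Number each chunk and collect its source so answers can cite [1], [2], ..."""
    lines, sources = [], []
    for i, doc in enumerate(docs, start=1):
        source = doc.metadata.get("source", "unknown")
        lines.append(f"[{i}] {doc.page_content}")
        sources.append(f"[{i}] {source}")
    return "\n\n".join(lines), sources

docs = [
    Doc("Refunds are available within 30 days.", {"source": "policies/refunds.md"}),
    Doc("Refunds are issued to the original payment method.", {"source": "faq.md"}),
]
context, sources = format_docs_with_sources(docs)
print(context)
print(sources)
```

The numbered context goes into the prompt with an instruction to cite the bracketed numbers, and the `sources` list is returned alongside the answer so users can check each claim.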
Mistake 5: Not tracing runs
Bad:
Agent failed and nobody knows why.
Better:
Use traces.
Inspect prompts, tool calls, latency, token cost, errors.
18. Learning path
Stage 1: Basic LLM app
Build:
Prompt → Model → Output
Skills:
messages
system prompt
prompt template
output parser
Project:
Explain a technical concept in a clear, technical style.
Stage 2: Tool-calling agent
Build:
Agent + calculator/search tools
Skills:
@tool
create_agent
tool docstrings
agent.invoke()
tool result inspection
Project:
VC analyst agent that calculates CAGR, ARR multiple, and market size.
Stage 3: RAG
Build:
Documents → chunks → embeddings → vector DB → retriever → answer
Skills:
document loader
text splitter
embeddings
vectorstore
retriever
RAG prompt
Project:
Ask questions over your own blog posts or startup notes.
Stage 4: RAG as tool
Build:
Agent + internal knowledge search
Skills:
wrap retriever as tool
agent chooses when to retrieve
combine web/current info with internal memory
Project:
AI startup analyst that checks previous posts before drafting a new one.
Stage 5: LangGraph workflow
Build:
Explicit graph with state and routing
Skills:
StateGraph
nodes
edges
conditional routing
state updates
approval path
Project:
News research → rank → draft → validate → Sanity draft → approval.
Stage 6: Production
Build:
Backend + LangGraph + LangSmith + Slack/Sanity
Skills:
API endpoint
environment variables
tool permissions
human approval
logging
observability
evals
deployment
Project:
Slack command /news that creates Sanity drafts but never publishes automatically.
19. Practical project blueprint
A strong LangChain project looks like this:
User command:
"/news ai infrastructure 72h"
System:
1. Parse command
2. Search public sources
3. Retrieve old articles from internal knowledge base
4. Detect overlap
5. Rank news by relevance
6. Draft article
7. Validate source quality
8. Create Sanity draft
9. Send Slack confirmation
10. Wait for approval before publishing
Technical stack:
Frontend/control: Slack
Backend: FastAPI or Next.js API route
Agent framework: LangChain
Workflow runtime: LangGraph
Knowledge layer: vector DB or LlamaIndex retriever
CMS: Sanity
Observability: LangSmith
Hosting: Vercel / Render / Railway
Architecture:
┌─────────────────────┐
│ Slack command │
│ /news ai 72h │
└──────────┬──────────┘
▼
┌─────────────────────┐
│ Backend API │
│ validates request │
└──────────┬──────────┘
▼
┌─────────────────────┐
│ LangGraph workflow │
│ stateful execution │
└──────────┬──────────┘
│
├──────────► search_recent_news()
├──────────► retrieve_past_posts()
├──────────► rank_sources()
├──────────► draft_article()
├──────────► validate_article()
├──────────► create_sanity_draft()
└──────────► send_slack_message()
▼
┌─────────────────────┐
│ LangSmith tracing │
└─────────────────────┘
20. Final mental model
LangChain by itself helps you build fast:
model + tools + agent
LangGraph helps you make it reliable:
state + graph + durable execution + human approval
LangSmith helps you understand and improve it:
traces + debugging + evaluation + monitoring
The core idea:
LLM alone = brain in a box
LangChain = brain connected to tools
LangGraph = brain inside a controlled workflow
LangSmith = camera watching what the brain did
For a beginner, LangChain can look scary because there are many abstractions. But the whole thing becomes much easier if you remember the build order:
1. Model
2. Prompt
3. Tool
4. Agent
5. Structured output
6. RAG
7. Workflow graph
8. Human approval
9. Observability
10. Production deployment
LangChain is not just “a Python library.” It is closer to an agent application framework: the layer that lets your AI system reason, choose tools, call APIs, write drafts, ask for approval, and integrate with real software.


