LangChain and LlamaIndex
LangChain: Simple Overview
LangChain is a powerful open-source framework that helps you build applications using Large Language Models (LLMs) like GPT-4 by connecting them with external data, tools, and memory. Think of it as the orchestrator that turns a language model into a full-fledged intelligent agent.
What Does LangChain Do?
| Capability | Description |
|---|---|
| Retrieval | Connects LLMs to vector databases (e.g., Pinecone, FAISS) for RAG pipelines |
| Memory | Adds short-term or long-term memory to chatbots |
| Tools | Lets LLMs use tools like search engines, calculators, or APIs |
| Chains | Combines multiple steps (e.g., retrieval → generation → summarization) into a workflow |
| Agents | Enables LLMs to reason and decide which tools to use dynamically |
Simple LangChain Example (RAG)
from langchain.chains import RetrievalQA
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
# Load vector store
embedding = OpenAIEmbeddings()
vectorstore = Chroma(persist_directory="db", embedding_function=embedding)
# Create RAG chain
qa_chain = RetrievalQA.from_chain_type(
llm=OpenAI(),
retriever=vectorstore.as_retriever()
)
# Ask a question
query = "What is a vector database?"
result = qa_chain.run(query)
print(result)
Core Building Blocks
| Component | Purpose |
|---|---|
| LLMs | GPT-4, Claude, LLaMA, etc. |
| Embeddings | Convert text into vectors |
| Vector Stores | Store and retrieve documents semantically |
| Chains | Combine multiple steps (e.g., input → retrieval → generation) |
| Agents | Let LLMs choose tools dynamically |
| Memory | Maintain context across conversations |
What Can You Build with LangChain?
- RAG-powered chatbots
- AI research assistants
- Document Q&A systems
- Code interpreters
- Workflow automation tools
Benefits of Using LangChain
There are several benefits of using LangChain to build applications powered by LLMs. These benefits include:
- Ease of use: LangChain makes it easier to use LLMs to build a variety of applications, even if the developer does not have any experience with artificial intelligence (AI) or machine learning
- Flexibility: LangChain is a flexible framework that can be used to build a wide variety of applications. Developers are not limited to any specific use case
- Scalability: LangChain is scalable to support applications of all sizes. Developers can use LangChain to build applications that serve millions of users
- Robustness: LangChain provides several features that make it easier to build robust and reliable applications. For example, LangChain supports caching and error handling
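Caching is easy to picture with a small sketch: memoize LLM calls so identical prompts never hit the model twice. This is a stdlib illustration of the idea only, not LangChain's actual cache API; `fake_llm` and `cached_llm` are hypothetical names.

```python
from functools import lru_cache

# Hypothetical stand-in for a real LLM call; LangChain provides this
# kind of caching out of the box (e.g., an in-memory LLM cache).
def fake_llm(prompt: str) -> str:
    fake_llm.calls += 1  # count how often the "model" is actually hit
    return f"answer to: {prompt}"
fake_llm.calls = 0

@lru_cache(maxsize=128)
def cached_llm(prompt: str) -> str:
    # Repeated prompts are served from the cache instead of the model
    return fake_llm(prompt)

cached_llm("What is LangChain?")
cached_llm("What is LangChain?")  # cache hit: no second model call
print(fake_llm.calls)  # 1
```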
LangChain Core Components: A Modular Breakdown
LangChain is built around a set of modular components that you can mix and match to build powerful LLM-based applications. Here's a breakdown of the key components and what each one does:
1. LLMs (Large Language Models)
Purpose: Interface with models like GPT-4, Claude, LLaMA, etc.
Examples:
- OpenAI(), ChatOpenAI(), HuggingFaceHub(), Anthropic()
2. Prompt Templates
Purpose: Standardize and structure prompts for LLMs.
Types:
- PromptTemplate → for single-turn prompts
- ChatPromptTemplate → for multi-turn chat-style prompts
Example:
from langchain.prompts import PromptTemplate
prompt = PromptTemplate.from_template("Translate '{text}' to French.")
3. Chains
Purpose: Combine multiple components into a pipeline.
Types:
- LLMChain → prompt + LLM
- RetrievalQA → retriever + LLM
- SequentialChain, SimpleSequentialChain → multi-step workflows
Example:
from langchain.chains import LLMChain
chain = LLMChain(llm=OpenAI(), prompt=prompt)
4. Memory
Purpose: Store and recall previous interactions (for chatbots).
Types:
- ConversationBufferMemory
- ConversationSummaryMemory
- VectorStoreRetrieverMemory
Example:
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
5. Agents
Purpose: Let LLMs choose tools dynamically to solve tasks.
Includes:
- Tool use (e.g., calculator, search)
- Planning and decision-making
- ReAct-style prompting
Example:
from langchain.agents import initialize_agent
agent = initialize_agent(tools, llm, agent="zero-shot-react-description")
6. Tools
Purpose: External functions the agent can call.
Examples:
- Web search
- Calculator
- Python REPL
- Custom APIs
7. Retrievers
Purpose: Fetch relevant documents from a vector store.
Examples:
- Chroma, FAISS, Pinecone, Weaviate (used in RAG pipelines)
Example:
retriever = vectorstore.as_retriever()
8. Document Loaders & Text Splitters
Purpose: Load and chunk documents for embedding and retrieval.
Examples:
- Loaders: TextLoader, PDFLoader, UnstructuredLoader
- Splitters: CharacterTextSplitter, RecursiveCharacterTextSplitter
9. Embeddings
Purpose: Convert text into dense vectors for semantic search.
Examples:
- OpenAIEmbeddings, HuggingFaceEmbeddings, CohereEmbeddings
10. Vector Stores
Purpose: Store and retrieve embeddings.
Examples:
- Chroma, FAISS, Pinecone, Qdrant, Weaviate
LangChain Model I/O: Inputs, Outputs, and Interfaces
LangChain’s Model I/O module is all about how you interact with language models: how you structure inputs, manage outputs, and control the flow of information between components like prompts, LLMs, and chains.
Let’s break it down:
1. Prompt Templates
Purpose: Structure and format inputs to LLMs.
Types:
- PromptTemplate: for single-turn prompts.
- ChatPromptTemplate: for multi-turn chat-style prompts.
Example:
from langchain.prompts import PromptTemplate
prompt = PromptTemplate.from_template("Translate '{text}' to French.")
formatted = prompt.format(text="Hello")
# Output: "Translate 'Hello' to French."
2. LLMs and Chat Models
Purpose: Interface with language models.
Types:
- LLM: for text completion models (e.g., GPT-3).
- ChatModel: for chat-based models (e.g., GPT-4, Claude).
Example:
from langchain.llms import OpenAI
llm = OpenAI()
response = llm("What is LangChain?")
3. Output Parsers
Purpose: Convert raw LLM output into structured formats (JSON, lists, etc.).
Types:
- StrOutputParser: returns plain strings.
- CommaSeparatedListOutputParser: parses comma-separated values.
- PydanticOutputParser: parses into Pydantic models.
Example:
from langchain.output_parsers import CommaSeparatedListOutputParser
parser = CommaSeparatedListOutputParser()
parsed = parser.parse("apples, bananas, oranges")
# Output: ['apples', 'bananas', 'oranges']
4. Output Fixing Parsers
Purpose: Automatically fix malformed outputs using LLMs.
Example:
from langchain.output_parsers import OutputFixingParser
parser = OutputFixingParser.from_llm(parser=parser, llm=llm)
5. PromptValue and LLMResult
Purpose: Internal representations of inputs and outputs.
- PromptValue: encapsulates a formatted prompt.
- LLMResult: encapsulates raw output from an LLM call.
These are mostly used under the hood but are important for advanced customization.
6. Runnable Interfaces
LangChain uses a unified interface called Runnable for all components (prompts, LLMs, chains).
Example:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
prompt = PromptTemplate.from_template("Tell me a joke about {topic}")
llm = OpenAI()
chain = prompt | llm  # Runnable composition
print(chain.invoke({"topic": "AI"}))
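The pipe composition can be demystified with a tiny stand-alone sketch. This is not LangChain's implementation; it only shows how an `__or__` method lets `a | b` build a pipeline of `invoke()` calls. `Template` and `Upper` are made-up stand-ins for a prompt and an LLM.

```python
# Minimal sketch of Runnable-style composition (illustrative only).
class Runnable:
    def __or__(self, other):
        # `a | b` returns a new step that runs a, then b
        return Pipeline(self, other)

class Pipeline(Runnable):
    def __init__(self, first, second):
        self.first, self.second = first, second
    def invoke(self, x):
        # Output of the first step feeds the second, like `prompt | llm`
        return self.second.invoke(self.first.invoke(x))

class Template(Runnable):
    def __init__(self, template):
        self.template = template
    def invoke(self, variables):
        return self.template.format(**variables)

class Upper(Runnable):  # stand-in for an LLM call
    def invoke(self, text):
        return text.upper()

chain = Template("Tell me a joke about {topic}") | Upper()
print(chain.invoke({"topic": "AI"}))  # TELL ME A JOKE ABOUT AI
```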
Summary Table
| Component | Role |
|---|---|
| PromptTemplate | Formats input text |
| LLM / ChatModel | Generates output |
| OutputParser | Parses or validates output |
| Runnable | Composable interface for chaining steps |
LangChain PromptTemplate: Structure Your Prompts Like a Pro
In LangChain, a PromptTemplate is a reusable, parameterized prompt that helps you format inputs for LLMs in a clean and consistent way. It’s one of the most fundamental building blocks in any LangChain application.
Why Use PromptTemplate?
✅ Avoid hardcoding prompts
✅ Reuse and customize prompts dynamically
✅ Maintain clean separation between logic and prompt text
✅ Combine with chains, agents, and tools
Basic Usage
from langchain.prompts import PromptTemplate
# Define a template with placeholders
template = "Translate the following sentence to French: {sentence}"
# Create a PromptTemplate object
prompt = PromptTemplate.from_template(template)
# Format the prompt with actual input
formatted_prompt = prompt.format(sentence="I love machine learning.")
print(formatted_prompt)
Output:
Translate the following sentence to French: I love machine learning.
Advanced Usage with Multiple Variables
template = """
You are a helpful assistant.
Summarize the following article in {num_words} words:
{article}
"""
prompt = PromptTemplate(
input_variables=["article", "num_words"],
template=template
)
formatted = prompt.format(
article="LangChain is a framework for building applications with LLMs...",
num_words="50"
)
Integration with Chains
PromptTemplates are often used with LLMChain:
from langchain.chains import LLMChain
from langchain.llms import OpenAI
llm = OpenAI()
chain = LLMChain(llm=llm, prompt=prompt)
response = chain.run(article="LangChain is...", num_words="30")
print(response)
ChatPromptTemplate (for Chat Models)
For chat-based models like GPT-4:
from langchain.prompts import ChatPromptTemplate
chat_prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant."),
("user", "Translate '{text}' to French.")
])
formatted = chat_prompt.format_messages(text="Good morning!")
Summary
| Feature | Benefit |
|---|---|
| PromptTemplate | For single-turn text prompts |
| ChatPromptTemplate | For multi-turn chat-style prompts |
| format() | Injects variables into the template |
| from_template() | Quick creation from a string |
1. Document Loaders
Purpose: Load raw data from various sources (text, PDFs, web pages, etc.) into LangChain-compatible Document objects.
Common Loaders:
| Loader | Source |
|---|---|
| TextLoader | Plain .txt files |
| PyPDFLoader | PDF documents |
| UnstructuredLoader | HTML, Word, PowerPoint, etc. |
| WebBaseLoader | Web pages via URL |
| DirectoryLoader | Bulk load files from a folder |
Example:
from langchain.document_loaders import TextLoader
loader = TextLoader("data/notes.txt")
documents = loader.load()
✂️ 2. Text Splitters
Purpose: Break large documents into smaller, overlapping chunks for better embedding and retrieval.
Common Splitters:
| Splitter | Description |
|---|---|
| CharacterTextSplitter | Splits by character count |
| RecursiveCharacterTextSplitter | Smart splitting by paragraph → sentence → word |
| TokenTextSplitter | Splits by token count (useful for LLM token limits) |
Example:
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)
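To make chunk_size and chunk_overlap concrete, here is a bare sliding-window chunker. The real splitters are smarter (RecursiveCharacterTextSplitter prefers paragraph and sentence boundaries first), so treat this only as an illustration of what the two parameters mean.

```python
# Illustrative chunker: a window of chunk_size characters advances by
# (chunk_size - chunk_overlap) each step, so consecutive chunks share
# chunk_overlap characters of context.
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    step = chunk_size - chunk_overlap  # how far the window advances
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```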
3. Text Embeddings
Purpose: Convert text chunks into dense vector representations for semantic similarity search.
Common Embedding Models:
| Model | Provider |
|---|---|
| OpenAIEmbeddings | OpenAI (e.g., text-embedding-ada-002) |
| HuggingFaceEmbeddings | Local or hosted transformer models |
| CohereEmbeddings | Cohere API |
| GooglePalmEmbeddings | Google Vertex AI |
Example:
from langchain.embeddings import OpenAIEmbeddings
embedding_model = OpenAIEmbeddings()
4. Vector Stores
Purpose: Store and index embeddings for fast similarity search.
Popular Vector Stores:
| Store | Type |
|---|---|
| Chroma | Lightweight, local, LangChain-native |
| FAISS | Open-source, fast, supports GPU |
| Pinecone | Fully managed, scalable cloud DB |
| Weaviate, Qdrant, Milvus | Open-source, production-grade vector DBs |
Example:
from langchain.vectorstores import Chroma
vectorstore = Chroma.from_documents(chunks, embedding_model)
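What a vector store does under the hood can be sketched in a few lines: keep an embedding per document and rank by cosine similarity at query time. The hand-written vectors below are toy values; real stores use model embeddings and fast approximate nearest-neighbor indexes.

```python
import math

# Toy top-k cosine-similarity search over a dict of text -> vector.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=2):
    # Rank stored documents by similarity to the query vector
    ranked = sorted(store, key=lambda doc: cosine(query_vec, store[doc]), reverse=True)
    return ranked[:k]

store = {
    "vector databases": [1.0, 0.1],
    "cooking recipes": [0.0, 1.0],
    "semantic search": [0.9, 0.2],
}
print(top_k([1.0, 0.0], store, k=2))  # ['vector databases', 'semantic search']
```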
5. Retrievers
Purpose: Interface that abstracts how documents are retrieved from a vector store.
Retriever Types:
| Retriever | Description |
|---|---|
| vectorstore.as_retriever() | Basic semantic search |
| MultiQueryRetriever | Uses multiple reformulated queries |
| ContextualCompressionRetriever | Compresses retrieved docs using an LLM |
| SelfQueryRetriever | Uses LLM to generate structured queries with filters |
Example:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})
Putting It All Together (Mini RAG Pipeline)
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
qa_chain = RetrievalQA.from_chain_type(
llm=OpenAI(),
retriever=retriever,
return_source_documents=True
)
query = "What are vector databases used for?"
result = qa_chain(query)
print("Answer:", result["result"])
LangChain Chains: Orchestrating LLM Workflows
In LangChain, a Chain is a modular pipeline that connects multiple components—like prompts, LLMs, retrievers, and tools—into a sequential or branching workflow. Chains are the backbone of LangChain applications, enabling you to build everything from simple Q&A bots to complex multi-step reasoning agents.
Types of Chains in LangChain
1. LLMChain
The most basic chain: combines a prompt and an LLM.
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain
prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
chain = LLMChain(llm=llm, prompt=prompt)
response = chain.run(text="Hello, how are you?")
2. RetrievalQA
Combines a retriever (e.g. vector store) with an LLM to build a RAG pipeline.
from langchain.chains import RetrievalQA
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
response = qa_chain.run("What is a vector database?")
3. Stuff, Map-Reduce, and Refine Chains
Used for summarization or document synthesis.
| Chain Type | Description |
|---|---|
| Stuff | Concatenates all docs into one prompt |
| MapReduce | Summarizes chunks individually, then combines |
| Refine | Iteratively builds a summary by refining previous output |
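The map-reduce strategy is easy to see with a mock summarizer standing in for the LLM: summarize each chunk (map), then summarize the concatenated summaries (reduce). `mock_summarize` is a toy function, not part of LangChain.

```python
# Mock "summarizer": keep only the first sentence of the input.
def mock_summarize(text: str) -> str:
    return text.split(".")[0] + "."

def map_reduce_summarize(chunks):
    partial = [mock_summarize(c) for c in chunks]  # map: per-chunk summaries
    return mock_summarize(" ".join(partial))       # reduce: combine summaries

docs = ["LangChain chains steps. It has agents.", "RAG grounds answers. It uses retrieval."]
print(map_reduce_summarize(docs))  # LangChain chains steps.
```

A real MapReduce chain runs an LLM call in both phases; Refine instead threads one running summary through the chunks, updating it at each step.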
4. SequentialChain
Executes multiple chains in a fixed order, passing outputs as inputs. Each sub-chain needs an output_key so its result can feed the next step.
from langchain.chains import SequentialChain
chain1 = LLMChain(llm=llm, prompt=PromptTemplate.from_template("Summarize: {text}"), output_key="summary")
chain2 = LLMChain(llm=llm, prompt=PromptTemplate.from_template("Translate to French: {summary}"), output_key="translation")
seq_chain = SequentialChain(chains=[chain1, chain2], input_variables=["text"], output_variables=["translation"])
5. SimpleSequentialChain
A simplified version of SequentialChain for linear flows with a single input and output per step.
from langchain.chains import SimpleSequentialChain
chain = SimpleSequentialChain(chains=[chain1, chain2])
6. MultiPromptChain
Routes input to different prompts based on topic or intent.
7. RouterChain
Dynamically selects a sub-chain based on input characteristics.
When to Use Chains
| Use Case | Recommended Chain |
|---|---|
| Basic prompt + LLM | LLMChain |
| RAG pipeline | RetrievalQA |
| Summarization | MapReduceChain, RefineChain |
| Multi-step workflows | SequentialChain, RouterChain |
| Topic-based routing | MultiPromptChain |
LCEL vs. Legacy Chains
| Feature | LCEL (LangChain Expression Language) | Chain Class (Legacy) |
|---|---|---|
| Style | Declarative, composable | Object-oriented |
| Composition | Uses the \| (pipe) operator | Uses class inheritance |
| Performance | Async, streaming, batch-ready | Less optimized |
| Flexibility | Easily mix LLMs, tools, retrievers | Harder to customize |
| Debugging | Transparent, introspectable | More opaque |
| Status | Modern, recommended | Legacy, still supported but not preferred |
1. LCEL (LangChain Expression Language)
LCEL is a new, functional-style API introduced in LangChain to make chains more:
- Composable
- Transparent
- Async-friendly
- Easier to debug and test
✅ Key Features:
- Uses the | (pipe) operator to chain components
- All components implement the Runnable interface
- Supports streaming, batch processing, and tracing
Example:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.schema.output_parser import StrOutputParser
prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
parser = StrOutputParser()
chain = prompt | llm | parser
result = chain.invoke({"text": "Good morning!"})
2. Chain Class (Legacy)
The legacy Chain classes (like LLMChain, SequentialChain, RetrievalQA) are object-oriented wrappers that encapsulate logic in a more rigid structure.
✅ Key Features:
- Easy to use for simple pipelines
- Good for beginners
- Still supported, but less flexible
Example:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain
prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
chain = LLMChain(prompt=prompt, llm=llm)
result = chain.run(text="Good morning!")
When to Use What?
| Use Case | Recommended |
|---|---|
| Prototyping or legacy code | Chain classes |
| Production-ready pipelines | LCEL |
| Async or streaming apps | LCEL |
| Complex workflows (tools, retrievers, memory) | LCEL |
| Multi-modal or multi-step chains | LCEL |
LangChain Agents: LLMs That Can Think and Act
LangChain Agents are one of the most powerful features in the framework. They allow a language model to dynamically decide what actions to take, such as calling tools, querying APIs, or performing calculations, based on the user’s input and the current context.
What Is an Agent?
An agent is an LLM-powered decision-maker that:
- Interprets the user’s query.
- Chooses the appropriate tool(s) to use.
- Executes actions in sequence.
- Synthesizes the final answer.
Think of it as an LLM with a brain and a toolbox.
Common Tools Agents Can Use
| Tool | Purpose |
|---|---|
| LLM Math | Solve math problems using Python |
| SerpAPI | Perform web searches |
| Python REPL | Run Python code |
| VectorStoreRetriever | Retrieve documents from a vector DB |
| Custom APIs | Call your own endpoints |
Agent Types in LangChain
| Agent Type | Description |
|---|---|
| zero-shot-react-description | Uses ReAct-style prompting to choose tools based on descriptions |
| chat-zero-shot-react-description | Same as above but optimized for chat models |
| structured-chat-zero-shot-react | Returns structured outputs |
| openai-functions | Uses OpenAI’s function calling API (if supported) |
Example: Agent with Math and Search Tools
from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(
tools=tools,
llm=llm,
agent="zero-shot-react-description",
verbose=True
)
agent.run("What is the square root of the year Einstein published the theory of relativity?")
How Agents Work (ReAct Loop)
1. Thought: "I need to look up the year Einstein published his theory."
2. Action: Search["Einstein theory of relativity year"]
3. Observation: "1905"
4. Thought: "Now I can calculate the square root."
5. Action: Calculator["sqrt(1905)"]
6. Observation: "43.65"
7. Final Answer: "The square root is approximately 43.65."
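The loop can be mocked in plain Python to show the shape of the control flow: pick an action, run the tool, feed the observation back. The scripted policy and canned search result below are stand-ins for the LLM's real, dynamic decisions.

```python
import math

# Tool registry: name -> callable, mirroring the agent's toolbox.
TOOLS = {
    "Search": lambda q: "1905",  # canned web-search result for the demo
    "Calculator": lambda expr: f"{math.sqrt(1905):.2f}",
}

# Scripted stand-in for the LLM's Thought -> Action decisions.
SCRIPT = [
    ("Search", "Einstein theory of relativity year"),
    ("Calculator", "sqrt(1905)"),
]

def react_loop():
    observation = None
    for tool, tool_input in SCRIPT:  # each step: Action -> Observation
        observation = TOOLS[tool](tool_input)
        print(f"Action: {tool}[{tool_input}] -> Observation: {observation}")
    return f"The square root is approximately {observation}."

print(react_loop())
```

In a real agent, the next action is chosen by the LLM after reading the previous observation, and the loop ends when the model emits a final answer instead of another action.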
When to Use Agents
| Scenario | Use Agents? |
|---|---|
| Static prompt + LLM | ❌ Use LLMChain |
| Dynamic tool use | ✅ Use Agent |
| Multi-step reasoning | ✅ Use Agent |
| API orchestration | ✅ Use Agent |
| Simple RAG | ❌ Use RetrievalQA |
One of LangChain's key features is prompt chaining: developers can combine multiple prompts to express more complex, nuanced requests. Another is its modular design: components can be reused across chains, which saves development time and makes chains easier to share and collaborate on.
LangChain offers a suite of tools, components, and interfaces that simplify building LLM-centric applications. Its LLM class provides a common interface to providers such as OpenAI, Cohere, and Hugging Face, so applications stay LLM-agnostic: switching models does not require rewriting application logic or handling vendor-specific details. This versatility, combined with integrations for a wide range of data sources, makes LangChain a comprehensive foundation for advanced language-model-powered applications.
The open-source framework is available for Python and JavaScript/TypeScript, and its core design principles are composition and modularity: by combining modules and components, you can quickly assemble complex LLM-based applications tailored to your users' needs. LangChain connects to external systems to access the information required to solve complex problems, provides abstractions for most of the functionality an LLM application needs, and ships integrations that can read and write data out of the box, reducing development time. Because LangChain applications are agnostic to the underlying language model, and support for new LLMs keeps expanding, the framework is well suited to continuous iteration.
LangSmith: Observability & Evaluation for LLM Apps
LangSmith is a developer platform built by the creators of LangChain to help you debug, test, evaluate, and monitor your LLM-powered applications. Think of it as the “LangChain DevTools”: a powerful companion for building reliable, production-grade AI systems.
Why Use LangSmith?
| Feature | Benefit |
|---|---|
| Debugging | Visualize every step in your chain or agent |
| Evaluation | Run automated or human-in-the-loop evaluations |
| Tracing | Inspect inputs, outputs, intermediate steps, and tool calls |
| Testing | Create test suites for prompts, chains, and agents |
| Monitoring | Track performance, latency, and failure rates in production |
Key Concepts
1. Traces
A trace is a full record of a chain or agent run, including:
- Inputs and outputs
- Intermediate steps (e.g., tool calls, LLM generations)
- Errors and retries
2. Datasets
Collections of test inputs and expected outputs used for:
- Regression testing
- Prompt tuning
- Model comparisons
3. Evaluators
Automated or manual scoring functions to assess:
- Accuracy
- Relevance
- Helpfulness
- Toxicity
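At its simplest, an evaluator is just a function from (prediction, reference) pairs to a score. Here is a minimal exact-match example; LangSmith's built-in evaluators are richer (LLM-as-judge, string distance, etc.), so this is only a sketch of the dataset-to-score shape.

```python
# Score a list of (model prediction, expected output) pairs by
# case-insensitive exact match.
def exact_match_evaluator(examples):
    correct = sum(
        1 for pred, ref in examples
        if pred.strip().lower() == ref.strip().lower()
    )
    return correct / len(examples)

dataset = [
    ("Paris", "paris"),        # match (case-insensitive)
    ("Berlin", "Berlin"),      # match
    ("42", "forty-two"),       # miss
]
print(exact_match_evaluator(dataset))  # 2 of 3 match -> 0.666...
```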
Example: Logging a Chain Run to LangSmith
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
# Enable LangSmith tracing
import os
os.environ["LANGCHAIN_API_KEY"] = "your-langsmith-api-key"
os.environ["LANGCHAIN_TRACING_V2"] = "true"
# Define a simple chain
prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
chain = LLMChain(llm=llm, prompt=prompt)
# Run with tracing
result = chain.run("Good morning!")
print(result)
You’ll see the full trace in your LangSmith dashboard.
When to Use LangSmith
| Scenario | Why LangSmith Helps |
|---|---|
| Building complex chains or agents | Visualize and debug each step |
| Evaluating prompt changes | Run A/B tests and regression checks |
| Deploying to production | Monitor performance and failures |
| Collaborating with teams | Share traces and test results |
Bonus: LangSmith + LangChain Expression Language (LCEL)
LangSmith works seamlessly with LCEL pipelines. Just set the environment variable and all Runnable components will be traced automatically.
LangServe: Serve LangChain Apps as APIs
LangServe is a lightweight framework built on top of FastAPI that allows you to deploy LangChain chains and agents as RESTful APIs—quickly and with minimal boilerplate.
It’s perfect for turning your LangChain workflows into production-ready services that can be called from web apps, mobile apps, or other backend systems.
Why Use LangServe?
| Feature | Benefit |
|---|---|
| FastAPI-based | High-performance, async-ready API server |
| Reusable | Serve any LangChain Runnable (LLMChain, RAG, Agent, etc.) |
| LangSmith-compatible | Automatically logs traces for observability |
| Secure | Add auth, rate limiting, and CORS easily |
| Deployable | Works with Docker, serverless, or cloud platforms |
Core Concept: Serve a Runnable Chain
LangServe exposes any LangChain Runnable (like a chain or agent) as a REST API with endpoints like:
- POST /invoke → run the chain
- POST /batch → run multiple inputs
- GET /config → view chain metadata
Example: Serve a Simple LLMChain
1. Project Structure
my_langserve_app/
├── app.py
├── chain.py
└── requirements.txt
2. chain.py
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain
prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
chain = LLMChain(prompt=prompt, llm=llm)
# Expose as a Runnable
from langchain.schema.runnable import Runnable
app_chain: Runnable = chain
3. app.py
from langserve import add_routes
from fastapi import FastAPI
from chain import app_chain
app = FastAPI()
add_routes(app, app_chain, path="/translate")
4. ▶️ Run the Server
uvicorn app:app --reload --port 8000
5. Test It
curl -X POST http://localhost:8000/translate/invoke \
-H "Content-Type: application/json" \
-d '{"input": {"text": "Good morning"}}'
Bonus Features
- ✅ Works with LCEL (| operator pipelines)
- ✅ Supports streaming responses
- ✅ Integrates with LangSmith for tracing
- ✅ Easily deployable with Docker or serverless platforms
LlamaIndex: The Data Framework for LLMs
LlamaIndex (formerly known as GPT Index) is a powerful open-source framework designed to help you connect large language models (LLMs) to your external data—like PDFs, databases, Notion docs, APIs, and more.
It’s purpose-built for building RAG (Retrieval-Augmented Generation) systems and semantic search applications with minimal friction.
Why Use LlamaIndex?
| Feature | Benefit |
|---|---|
| Data Connectors | Load data from files, APIs, SQL, Notion, etc. |
| Indexing | Organize and chunk data into searchable structures |
| Retrieval | Perform semantic search over your data |
| Query Engines | Ask questions and get grounded answers |
| Evaluation | Built-in tools for testing and refining pipelines |
Core Components of LlamaIndex
1. Data Connectors
Load data from:
- Files (PDF, Markdown, CSV, etc.)
- Web pages
- APIs
- SQL databases
- Notion, Google Docs, Airtable, etc.
from llama_index import SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
2. Node Parsers & Text Splitters
Break documents into manageable chunks (nodes) with metadata.
from llama_index.node_parser import SimpleNodeParser
parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)
3. Indexing
Build an index to organize and retrieve nodes efficiently.
| Index Type | Use Case |
|---|---|
| VectorStoreIndex | Semantic search (most common) |
| ListIndex | Ordered document traversal |
| TreeIndex | Hierarchical summarization |
| KeywordTableIndex | Keyword-based retrieval |
from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)
4. Retrievers
Query the index to retrieve relevant chunks.
retriever = index.as_retriever(similarity_top_k=3)
5. Query Engines
Combine retrievers with LLMs to generate answers.
query_engine = index.as_query_engine()
response = query_engine.query("What is LlamaIndex?")
print(response)
6. Storage & Persistence
Save and reload indexes for production use.
index.storage_context.persist("index_storage/")
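Conceptually, persistence is a serialize/reload round trip: write the index's documents and vectors to disk, then load them back instead of re-embedding everything. A stdlib sketch of the idea (the `persist`/`load` helpers here are illustrative, not LlamaIndex's API):

```python
import json
import os
import tempfile

# Stand-in for index persistence: dump a mapping of doc -> embedding to
# disk and load it back, which is the essence of persist()/reload.
def persist(index: dict, path: str) -> None:
    with open(path, "w") as f:
        json.dump(index, f)

def load(path: str) -> dict:
    with open(path) as f:
        return json.load(f)

index = {"doc1": [0.1, 0.2], "doc2": [0.3, 0.4]}
path = os.path.join(tempfile.mkdtemp(), "index.json")
persist(index, path)
print(load(path) == index)  # True
```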
LlamaIndex vs. LangChain
| Feature | LlamaIndex | LangChain |
|---|---|---|
| Focus | Data ingestion, indexing, retrieval | Workflow orchestration, agents, tools |
| Strength | RAG pipelines, document QA | Agents, tool use, multi-step chains |
| Integration | Works with LangChain, OpenAI, Hugging Face | Can use LlamaIndex as a retriever |
✅ Best of both worlds: Use LlamaIndex for retrieval and LangChain for orchestration.
Use Cases
- RAG-powered chatbots
- Enterprise document search
- Academic research assistants
- Legal/medical Q&A systems
- Personal knowledge bases
LlamaIndex Data Connectors: Bringing External Data to LLMs
Data connectors in LlamaIndex are modules that allow you to ingest data from a wide variety of sources—files, APIs, databases, cloud platforms, and more—so that you can build powerful RAG (Retrieval-Augmented Generation) systems grounded in your own knowledge base.
Categories of Data Connectors
1. File-Based Connectors
| Connector | Description |
|---|---|
| SimpleDirectoryReader | Loads all files from a local directory |
| PDFReader | Parses PDFs using PyMuPDF or pdfplumber |
| CSVReader | Loads structured data from CSV files |
| MarkdownReader | Parses .md files |
| DocxReader | Reads Microsoft Word .docx files |
| HTMLReader | Parses HTML content |
| ImageReader | Extracts text from images using OCR (e.g., Tesseract) |
Example:
from llama_index import SimpleDirectoryReader
documents = SimpleDirectoryReader("data/").load_data()
2. Web & Cloud Connectors
| Connector | Description |
|---|---|
| WebPageReader | Scrapes and parses content from URLs |
| NotionPageReader | Loads content from Notion pages |
| GoogleDocsReader | Connects to Google Docs via API |
| SlackReader | Ingests messages from Slack channels |
| GitHubRepositoryReader | Loads code and docs from GitHub repos |
| ConfluenceReader | Connects to Atlassian Confluence pages |
Example:
from llama_index.readers.web import SimpleWebPageReader
documents = SimpleWebPageReader().load_data(["https://example.com"])
3. Database & Structured Data Connectors
| Connector | Description |
|---|---|
| SQLDatabaseReader | Connects to SQL databases (PostgreSQL, MySQL, SQLite) |
| MongoDBReader | Loads documents from MongoDB collections |
| AirtableReader | Connects to Airtable bases |
| GoogleSheetsReader | Reads from Google Sheets |
Example:
from llama_index.readers.database import DatabaseReader
reader = DatabaseReader(uri="sqlite:///mydb.sqlite")
documents = reader.load_data(query="SELECT * FROM customers")
4. API & Custom Connectors
| Connector | Description |
|---|---|
| OpenAPIReader | Connects to OpenAPI-compatible APIs |
| RSSReader | Loads content from RSS feeds |
| CustomReader | Build your own connector using the BaseReader class |
Example:
from llama_index import Document
from llama_index.readers.base import BaseReader
class MyCustomReader(BaseReader):
    def load_data(self, **kwargs):
        # Fetch and return Document objects
        return [Document(text="Custom data here")]
Best Practices
- Use metadata (e.g., source, timestamp) to enhance retrieval quality
- Combine multiple connectors for hybrid knowledge bases
- Persist documents using LlamaIndex’s StorageContext for reuse
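The first tip is worth a concrete sketch: with metadata attached to each chunk, retrieval can filter candidates before or after semantic ranking. The field names below are illustrative, not a fixed LlamaIndex schema.

```python
# Toy documents with metadata alongside the text.
docs = [
    {"text": "Q3 revenue grew 12%.", "source": "finance.pdf", "year": 2023},
    {"text": "Onboarding checklist.", "source": "hr-wiki", "year": 2021},
    {"text": "Q4 revenue guidance.", "source": "finance.pdf", "year": 2023},
]

def filter_by_metadata(docs, **criteria):
    # Keep only documents whose metadata matches every criterion
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]

hits = filter_by_metadata(docs, source="finance.pdf", year=2023)
print([d["text"] for d in hits])
```

In a real pipeline this filtering step narrows the candidate pool so semantic ranking only considers chunks from the right source and time range.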
Bonus: Combine with Node Parsers
After loading data, use a NodeParser to chunk and structure it:
from llama_index.node_parser import SimpleNodeParser
parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)
Core Components of LlamaIndex
LlamaIndex is designed to help you build powerful Retrieval-Augmented Generation (RAG) systems by connecting LLMs to your external data. Its architecture is modular and consists of several core components that work together to ingest, index, retrieve, and query data.
1. Data Connectors (Loaders)
Purpose: Ingest data from various sources.
| Source Type | Examples |
|---|---|
| Files | PDFs, CSVs, Markdown, DOCX |
| Web | URLs, Notion, Confluence, GitHub |
| Databases | SQL, MongoDB, Airtable |
| APIs | OpenAPI, RSS, custom endpoints |
Example:
from llama_index import SimpleDirectoryReader
documents = SimpleDirectoryReader("data/").load_data()
✂️ 2. Node Parsers (Text Splitters)
Purpose: Break documents into smaller, manageable chunks called “nodes” with metadata.
| Parser | Description |
|---|---|
| SimpleNodeParser | Basic chunking by character count |
| SentenceWindowParser | Sentence-aware chunking |
| HierarchicalNodeParser | Multi-level chunking for tree-based indexes |
๐ Example:
from llama_index.node_parser import SimpleNodeParser
nodes = SimpleNodeParser().get_nodes_from_documents(documents)
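The character-count chunking that SimpleNodeParser performs can be sketched in plain Python (no LlamaIndex required; the chunk_size and overlap values are illustrative, not library defaults):

```python
def chunk_text(text, chunk_size=20, overlap=5):
    """Split text into fixed-size character chunks, with each chunk
    overlapping the previous one by `overlap` characters."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

text = "LlamaIndex splits documents into overlapping chunks."
chunks = chunk_text(text, chunk_size=20, overlap=5)
print(chunks)
```

The overlap preserves context across chunk boundaries, so a sentence cut in half by one chunk is still intact in its neighbor.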
๐ง 3. Embedding Models
Purpose: Convert text chunks (nodes) into dense vector representations for semantic search.
| Provider | Examples |
|---|---|
| OpenAI | text-embedding-ada-002 |
| Hugging Face | SentenceTransformers |
| Cohere | embed-english-v3.0 |
๐ Example:
from llama_index.embeddings import OpenAIEmbedding
embed_model = OpenAIEmbedding()
๐ฆ 4. Indexes
Purpose: Organize and store nodes for efficient retrieval.
| Index Type | Use Case |
|---|---|
| VectorStoreIndex | Semantic search (most common) |
| ListIndex | Ordered traversal (e.g., summarization) |
| TreeIndex | Hierarchical summarization |
| KeywordTableIndex | Keyword-based search |
๐ Example:
from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)
๐ 5. Retrievers
Purpose: Retrieve relevant nodes from an index based on a query.
| Retriever | Description |
|---|---|
| DefaultRetriever | Basic top-k similarity search |
| BM25Retriever | Keyword-based retrieval |
| HybridRetriever | Combines vector + keyword search |
| AutoMergingRetriever | Merges overlapping chunks for better context |
๐ Example:
retriever = index.as_retriever(similarity_top_k=3)
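Conceptually, similarity_top_k ranks every stored vector by cosine similarity to the query vector and keeps the k best matches. A standalone sketch of that ranking (plain Python with toy 3-dimensional vectors; real embeddings have hundreds of dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "index": node text -> embedding vector
index_nodes = {
    "Vector databases store embeddings": [0.9, 0.1, 0.0],
    "LLMs generate text":                [0.1, 0.9, 0.1],
    "Embeddings enable semantic search": [0.8, 0.2, 0.1],
}

def retrieve(query_vec, k=2):
    """Return the texts of the k nodes most similar to the query vector."""
    scored = sorted(index_nodes.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in scored[:k]]

query_vec = [1.0, 0.0, 0.0]  # pretend this embeds "what is a vector database?"
print(retrieve(query_vec, k=2))
```

The two database-related nodes score highest because their vectors point in nearly the same direction as the query vector; the unrelated node is left out of the top-k.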
๐ฌ 6. Query Engines
Purpose: Combine retrievers with LLMs to generate answers.
| Engine | Description |
|---|---|
| SimpleQueryEngine | Basic RAG |
| RetrieverQueryEngine | Custom retriever + LLM |
| SubQuestionQueryEngine | Breaks complex queries into sub-questions |
| SQLQueryEngine | Queries SQL databases using natural language |
๐ Example:
query_engine = index.as_query_engine()
response = query_engine.query("What is LlamaIndex?")
๐พ 7. Storage Context
Purpose: Persist and reload indexes, documents, and embeddings.
๐ Example:
index.storage_context.persist("storage/")
# Reload later without re-embedding
from llama_index import StorageContext, load_index_from_storage
storage_context = StorageContext.from_defaults(persist_dir="storage/")
index = load_index_from_storage(storage_context)
๐งช 8. Evaluation & Observability
Purpose: Test and debug your RAG pipeline.
| Tool | Use |
|---|---|
| LangSmith | Trace and evaluate runs |
| Built-in Evaluators | Accuracy, relevance, faithfulness |
| Dataset Generator | Create test sets from your data |
๐ง Summary Table
| Component | Role |
|---|---|
| Data Connectors | Load data from files, APIs, DBs |
| Node Parsers | Chunk and structure documents |
| Embeddings | Convert text to vectors |
| Indexes | Organize and store nodes |
| Retrievers | Fetch relevant chunks |
| Query Engines | Generate answers using LLMs |
| Storage | Save and reload pipelines |
| Evaluation | Test and debug performance |
๐งฑ Types of Indexes in LlamaIndex
In LlamaIndex, an index is a data structure that organizes your documents (or nodes) to enable efficient retrieval and interaction with LLMs. Each index type is optimized for a different use case—whether it's semantic search, summarization, or keyword lookup.
๐น 1. VectorStoreIndex (Most Common)
Purpose: Semantic search using vector similarity.
- Stores embeddings of document chunks (nodes)
- Supports top-k retrieval based on cosine similarity or other distance metrics
- Works with vector stores like FAISS, Pinecone, Chroma, Weaviate
๐ Use Case: Retrieval-Augmented Generation (RAG), semantic Q&A
from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)
๐น 2. ListIndex
Purpose: Ordered traversal of documents.
- Stores documents in a linear list
- Useful for summarization or sequential reading
- No semantic search—retrieves all documents in order
๐ Use Case: Document summarization, storytelling, walkthroughs
from llama_index import ListIndex
index = ListIndex.from_documents(documents)
๐น 3. TreeIndex
Purpose: Hierarchical summarization and reasoning.
- Builds a tree of summaries from document chunks
- Each parent node summarizes its children
- Enables recursive summarization and multi-level reasoning
๐ Use Case: Long document summarization, nested Q&A, outline generation
from llama_index import TreeIndex
index = TreeIndex.from_documents(documents)
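The parent-summarizes-children structure can be sketched with a stub summarizer (plain Python; in a real TreeIndex each summary is written by an LLM rather than string concatenation):

```python
def summarize(texts):
    # Stub: a real implementation would prompt an LLM here.
    return "SUMMARY(" + " + ".join(texts) + ")"

def build_tree(chunks, fanout=2):
    """Repeatedly summarize groups of `fanout` nodes until one root remains."""
    level = list(chunks)
    while len(level) > 1:
        level = [summarize(level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0]

root = build_tree(["chunk A", "chunk B", "chunk C", "chunk D"])
print(root)
```

Queries can then start at the root summary and descend only into the subtrees that look relevant, which is what makes the structure useful for very long documents.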
๐น 4. KeywordTableIndex
Purpose: Keyword-based retrieval (non-semantic).
- Extracts keywords from documents and builds an inverted index
- Fast keyword lookup without embeddings
- Lightweight and interpretable
๐ Use Case: Simple keyword search, fallback when embeddings are unavailable
from llama_index import KeywordTableIndex
index = KeywordTableIndex.from_documents(documents)
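The inverted index behind KeywordTableIndex maps each keyword to the documents that contain it; a minimal sketch (plain Python with naive whitespace tokenization; a real implementation would also strip punctuation and stopwords):

```python
from collections import defaultdict

docs = {
    "doc1": "vector databases store embeddings",
    "doc2": "keyword search is fast and interpretable",
    "doc3": "embeddings power semantic search",
}

# Build the inverted index: keyword -> set of document ids
inverted = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.lower().split():
        inverted[word].add(doc_id)

def keyword_lookup(query):
    """Return ids of documents containing any keyword from the query."""
    hits = set()
    for word in query.lower().split():
        hits |= inverted.get(word, set())
    return sorted(hits)

print(keyword_lookup("semantic search"))
```

Lookup is a plain dictionary access per keyword, which is why this style of index is fast and needs no embedding model at all.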
๐ง Summary Table
| Index Type | Retrieval Style | Best For |
|---|---|---|
| VectorStoreIndex | Semantic similarity | RAG, semantic search, Q&A |
| ListIndex | Sequential | Summarization, walkthroughs |
| TreeIndex | Hierarchical | Recursive summarization, long docs |
| KeywordTableIndex | Keyword match | Lightweight search, no embeddings |
๐ Connecting LlamaIndex with Different LLMs
LlamaIndex is model-agnostic—it supports a wide range of LLMs from different providers, allowing you to plug in the model that best fits your use case, whether it's hosted (like OpenAI) or local (like LLaMA or Mistral).
๐ง How LLMs Are Used in LlamaIndex
LLMs in LlamaIndex are used for:
- Generating answers (via Query Engines)
- Summarizing documents
- Refining responses
- Re-ranking retrieved results
- Evaluating outputs
๐น 1. OpenAI (GPT-3.5, GPT-4)
from llama_index.llms import OpenAI
llm = OpenAI(model="gpt-4", temperature=0.3)
✅ Requires OPENAI_API_KEY
✅ Great for RAG, summarization, and reasoning tasks
๐น 2. Anthropic (Claude)
from llama_index.llms import Anthropic
llm = Anthropic(model="claude-2", temperature=0.5)
✅ Requires ANTHROPIC_API_KEY
✅ Known for long context windows and safe outputs
๐น 3. Hugging Face (Hosted or Local)
from llama_index.llms import HuggingFaceLLM
llm = HuggingFaceLLM(
model_name="tiiuae/falcon-7b-instruct",
tokenizer_name="tiiuae/falcon-7b-instruct",
context_window=2048,
max_new_tokens=256
)
✅ Works with Hugging Face Hub or local models
✅ Ideal for open-source deployments
๐น 4. Google Vertex AI (PaLM, Gemini)
from llama_index.llms import VertexAI
llm = VertexAI(model="text-bison", temperature=0.2)
✅ Requires Google Cloud setup
✅ Good for enterprise and multilingual use cases
๐น 5. Cohere
from llama_index.llms import Cohere
llm = Cohere(model="command-xlarge-nightly", temperature=0.4)
✅ Requires COHERE_API_KEY
✅ Strong performance on command-following tasks
๐น 6. Local Models (LLaMA, Mistral, etc.)
Use with backends like:
- ๐ง Ollama
- ๐ง LM Studio
- ๐ง Hugging Face Transformers
- ๐ง vLLM or Text Generation Inference (TGI)
๐ Example with Ollama:
from llama_index.llms import Ollama
llm = Ollama(model="llama2")
๐ Example with Transformers:
from llama_index.llms import HuggingFaceLLM
llm = HuggingFaceLLM(model_name="tiiuae/falcon-7b-instruct")
๐งช Using the LLM in a Query Engine
from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("What is LlamaIndex?")
print(response)
๐ง Summary Table
| Provider | Class | Notes |
|---|---|---|
| OpenAI | OpenAI | GPT-3.5, GPT-4 |
| Anthropic | Anthropic | Claude 1/2 |
| Hugging Face | HuggingFaceLLM | Local or hosted models |
| Google | VertexAI | PaLM, Gemini |
| Cohere | Cohere | Command models |
| Ollama | Ollama | Local LLaMA, Mistral, etc. |
Building a simple RAG (Retrieval-Augmented Generation) pipeline using ๐ฆ LlamaIndex with:
- PDF/Text file ingestion
- Node parsing and vector indexing
- OpenAI for embeddings and LLM
- Semantic search and query answering
๐งช Full LlamaIndex RAG Pipeline (Step-by-Step)
✅ Prerequisites
Install the required packages:
pip install llama-index openai PyPDF2
Set your OpenAI API key:
export OPENAI_API_KEY=your-api-key
๐ Folder Structure
llamaindex_rag/
├── data/
│ └── example.pdf # or .txt
├── app.py
๐ง app.py
from llama_index import VectorStoreIndex, SimpleDirectoryReader
from llama_index.node_parser import SimpleNodeParser
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding
from llama_index.query_engine import RetrieverQueryEngine
# Step 1: Load documents from a folder
documents = SimpleDirectoryReader("data").load_data()
# Step 2: Parse documents into nodes (chunks)
parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)
# Step 3: Set up embedding model and LLM
embed_model = OpenAIEmbedding()
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.3)
# Step 4: Create a vector index from nodes
index = VectorStoreIndex(nodes, embed_model=embed_model)
# Step 5: Create a retriever and query engine
retriever = index.as_retriever(similarity_top_k=3)
query_engine = RetrieverQueryEngine.from_args(retriever=retriever, llm=llm)
# Step 6: Ask a question
query = "What is this document about?"
response = query_engine.query(query)
# Step 7: Print the answer
print("\n๐ง Answer:")
print(response)
๐ Example Output
๐ง Answer:
This document discusses the fundamentals of vector databases and their role in semantic search...
๐ง What This Pipeline Does
| Step | Purpose |
|---|---|
| ๐ Load | Ingests files from the data/ folder |
| ✂️ Parse | Splits documents into chunks (nodes) |
| ๐ง Embed | Converts chunks into vectors using OpenAI |
| ๐ฆ Index | Stores vectors in memory for semantic search |
| ๐ Retrieve | Finds top-k relevant chunks |
| ๐ฌ Generate | Uses GPT to synthesize an answer from context |