LangChain and LlamaIndex


🧠 LangChain: Simple Overview

LangChain is a powerful open-source framework that helps you build applications using Large Language Models (LLMs) like GPT-4 by connecting them with external data, tools, and memory. Think of it as the orchestrator that turns a language model into a full-fledged intelligent agent.


🔧 What Does LangChain Do?

  • 🔍 Retrieval: connects LLMs to vector databases (e.g., Pinecone, FAISS) for RAG pipelines
  • 🧠 Memory: adds short-term or long-term memory to chatbots
  • 🛠️ Tools: lets LLMs use tools like search engines, calculators, or APIs
  • 📚 Chains: combines multiple steps (e.g., retrieval → generation → summarization) into a workflow
  • 🧩 Agents: enables LLMs to reason and decide which tools to use dynamically

🧪 Simple LangChain Example (RAG)

from langchain.chains import RetrievalQA
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI

# Load vector store
embedding = OpenAIEmbeddings()
vectorstore = Chroma(persist_directory="db", embedding_function=embedding)

# Create RAG chain
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    retriever=vectorstore.as_retriever()
)

# Ask a question
query = "What is a vector database?"
result = qa_chain.run(query)
print(result)

🧱 Core Building Blocks

  • LLMs: GPT-4, Claude, LLaMA, etc.
  • Embeddings: convert text into vectors
  • Vector Stores: store and retrieve documents semantically
  • Chains: combine multiple steps (e.g., input → retrieval → generation)
  • Agents: let LLMs choose tools dynamically
  • Memory: maintain context across conversations

🚀 What Can You Build with LangChain?

  • RAG-powered chatbots
  • AI research assistants
  • Document Q&A systems
  • Code interpreters
  • Workflow automation tools


Benefits of Using LangChain
There are several benefits of using LangChain to build applications powered by LLMs. These benefits include:

  • Ease of use: LangChain makes it easier to use LLMs to build a variety of applications, even if the developer does not have any experience with artificial intelligence (AI) or machine learning
  • Flexibility: LangChain is a flexible framework that can be used to build a wide variety of applications. Developers are not limited to any specific use case
  • Scalability: LangChain is scalable to support applications of all sizes. Developers can use LangChain to build applications that serve millions of users
  • Robustness: LangChain provides several features that make it easier to build robust and reliable applications. For example, LangChain supports caching and error handling

🧱 LangChain Core Components: A Modular Breakdown

LangChain is built around a set of modular components that you can mix and match to build powerful LLM-based applications. Here's a breakdown of the key components and what each one does:


🔹 1. LLMs (Large Language Models)

Purpose: Interface with models like GPT-4, Claude, LLaMA, etc.
Examples:

  • OpenAI(), ChatOpenAI()
  • HuggingFaceHub(), Anthropic()

🔹 2. Prompt Templates

Purpose: Standardize and structure prompts for LLMs.
Types:

  • PromptTemplate → for single-turn prompts
  • ChatPromptTemplate → for multi-turn chat-style prompts

📌 Example:

from langchain.prompts import PromptTemplate
prompt = PromptTemplate.from_template("Translate '{text}' to French.")

🔹 3. Chains

Purpose: Combine multiple components into a pipeline.
Types:

  • LLMChain → prompt + LLM
  • RetrievalQA → retriever + LLM
  • SequentialChain, SimpleSequentialChain → multi-step workflows

📌 Example:

from langchain.chains import LLMChain
chain = LLMChain(llm=OpenAI(), prompt=prompt)

🔹 4. Memory

Purpose: Store and recall previous interactions (for chatbots).
Types:

  • ConversationBufferMemory
  • ConversationSummaryMemory
  • VectorStoreRetrieverMemory

📌 Example:

from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()

🔹 5. Agents

Purpose: Let LLMs choose tools dynamically to solve tasks.
Includes:

  • Tool use (e.g., calculator, search)
  • Planning and decision-making
  • ReAct-style prompting

📌 Example:

from langchain.agents import initialize_agent
# `tools` is a list of Tool objects (see the Tools section below)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description")

🔹 6. Tools

Purpose: External functions the agent can call.
Examples:

  • Web search
  • Calculator
  • Python REPL
  • Custom APIs

🔹 7. Retrievers

Purpose: Fetch relevant documents from a vector store.
Examples:

  • Backed by vector stores such as Chroma, FAISS, Pinecone, or Weaviate
  • Used in RAG pipelines

📌 Example:

retriever = vectorstore.as_retriever()

🔹 8. Document Loaders & Text Splitters

Purpose: Load and chunk documents for embedding and retrieval.
Examples:

  • TextLoader, PDFLoader, UnstructuredLoader
  • CharacterTextSplitter, RecursiveCharacterTextSplitter

🔹 9. Embeddings

Purpose: Convert text into dense vectors for semantic search.
Examples:

  • OpenAIEmbeddings, HuggingFaceEmbeddings, CohereEmbeddings

🔹 10. Vector Stores

Purpose: Store and retrieve embeddings.
Examples:

  • Chroma, FAISS, Pinecone, Qdrant, Weaviate


🔄 LangChain Model I/O: Inputs, Outputs, and Interfaces

LangChain’s Model I/O module is all about how you interact with language models—how you structure inputs, manage outputs, and control the flow of information between components like prompts, LLMs, and chains.

Let’s break it down:


🔹 1. Prompt Templates

Purpose: Structure and format inputs to LLMs.

Types:

  • PromptTemplate: for single-turn prompts.
  • ChatPromptTemplate: for multi-turn chat-style prompts.

📌 Example:

from langchain.prompts import PromptTemplate

prompt = PromptTemplate.from_template("Translate '{text}' to French.")
formatted = prompt.format(text="Hello")
# Output: "Translate 'Hello' to French."

🔹 2. LLMs and Chat Models

Purpose: Interface with language models.

Types:

  • LLM: for text completion models (e.g., GPT-3).
  • ChatModel: for chat-based models (e.g., GPT-4, Claude).

📌 Example:

from langchain.llms import OpenAI
llm = OpenAI()
response = llm("What is LangChain?")

🔹 3. Output Parsers

Purpose: Convert raw LLM output into structured formats (JSON, lists, etc.).

Types:

  • StrOutputParser: returns plain strings.
  • CommaSeparatedListOutputParser: parses comma-separated values.
  • PydanticOutputParser: parses into Pydantic models.

📌 Example:

from langchain.output_parsers import CommaSeparatedListOutputParser

parser = CommaSeparatedListOutputParser()
parsed = parser.parse("apples, bananas, oranges")
# Output: ['apples', 'bananas', 'oranges']

🔹 4. Output Fixing Parsers

Purpose: Automatically fix malformed outputs using LLMs.

📌 Example:

from langchain.output_parsers import OutputFixingParser

parser = OutputFixingParser.from_llm(parser=parser, llm=llm)

🔹 5. PromptValue and LLMResult

Purpose: Internal representations of inputs and outputs.

  • PromptValue: Encapsulates a formatted prompt.
  • LLMResult: Encapsulates raw output from an LLM call.

These are mostly used under the hood but are important for advanced customization.


🔹 6. Runnable Interfaces

LangChain uses a unified interface called Runnable for all components (prompts, LLMs, chains).

📌 Example:

from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

prompt = PromptTemplate.from_template("Tell me a joke about {topic}")
llm = OpenAI()

chain = prompt | llm  # Runnable composition
print(chain.invoke({"topic": "AI"}))

🧠 Summary Table

  • PromptTemplate: formats input text
  • LLM / ChatModel: generates output
  • OutputParser: parses or validates output
  • Runnable: composable interface for chaining steps


🧠 LangChain PromptTemplate: Structure Your Prompts Like a Pro

In LangChain, a PromptTemplate is a reusable, parameterized prompt that helps you format inputs for LLMs in a clean and consistent way. It’s one of the most fundamental building blocks in any LangChain application.


🔹 Why Use PromptTemplate?

✅ Avoid hardcoding prompts
✅ Reuse and customize prompts dynamically
✅ Maintain clean separation between logic and prompt text
✅ Combine with chains, agents, and tools


🔧 Basic Usage

from langchain.prompts import PromptTemplate

# Define a template with placeholders
template = "Translate the following sentence to French: {sentence}"

# Create a PromptTemplate object
prompt = PromptTemplate.from_template(template)

# Format the prompt with actual input
formatted_prompt = prompt.format(sentence="I love machine learning.")
print(formatted_prompt)

📤 Output:

Translate the following sentence to French: I love machine learning.

🔹 Advanced Usage with Multiple Variables

template = """
You are a helpful assistant.
Summarize the following article in {num_words} words:

{article}
"""

prompt = PromptTemplate(
    input_variables=["article", "num_words"],
    template=template
)

formatted = prompt.format(
    article="LangChain is a framework for building applications with LLMs...",
    num_words="50"
)

🔹 Integration with Chains

PromptTemplates are often used with LLMChain:

from langchain.chains import LLMChain
from langchain.llms import OpenAI

llm = OpenAI()
chain = LLMChain(llm=llm, prompt=prompt)

response = chain.run(article="LangChain is...", num_words="30")
print(response)

🔹 ChatPromptTemplate (for Chat Models)

For chat-based models like GPT-4:

from langchain.prompts import ChatPromptTemplate

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "Translate '{text}' to French.")
])

formatted = chat_prompt.format_messages(text="Good morning!")

🧠 Summary

  • PromptTemplate: for single-turn text prompts
  • ChatPromptTemplate: for multi-turn chat-style prompts
  • format(): injects variables into the template
  • from_template(): quick creation from a string



🧾 1. Document Loaders

Purpose: Load raw data from various sources (text, PDFs, web pages, etc.) into LangChain-compatible Document objects.

🔹 Common Loaders:

  • TextLoader: plain .txt files
  • PyPDFLoader: PDF documents
  • UnstructuredLoader: HTML, Word, PowerPoint, etc.
  • WebBaseLoader: web pages via URL
  • DirectoryLoader: bulk load files from a folder

📌 Example:

from langchain.document_loaders import TextLoader
loader = TextLoader("data/notes.txt")
documents = loader.load()

✂️ 2. Text Splitters

Purpose: Break large documents into smaller, overlapping chunks for better embedding and retrieval.

🔹 Common Splitters:

  • CharacterTextSplitter: splits by character count
  • RecursiveCharacterTextSplitter: smart splitting by paragraph → sentence → word
  • TokenTextSplitter: splits by token count (useful for LLM token limits)

📌 Example:

from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)

🧠 3. Text Embeddings

Purpose: Convert text chunks into dense vector representations for semantic similarity search.

🔹 Common Embedding Models:

  • OpenAIEmbeddings: OpenAI (e.g., text-embedding-ada-002)
  • HuggingFaceEmbeddings: local or hosted transformer models
  • CohereEmbeddings: Cohere API
  • GooglePalmEmbeddings: Google Vertex AI

📌 Example:

from langchain.embeddings import OpenAIEmbeddings
embedding_model = OpenAIEmbeddings()

📦 4. Vector Stores

Purpose: Store and index embeddings for fast similarity search.

🔹 Popular Vector Stores:

  • Chroma: lightweight, local, LangChain-native
  • FAISS: open-source, fast, supports GPU
  • Pinecone: fully managed, scalable cloud DB
  • Weaviate, Qdrant, Milvus: open-source, production-grade vector DBs

📌 Example:

from langchain.vectorstores import Chroma
vectorstore = Chroma.from_documents(chunks, embedding_model)

๐Ÿ” 5. Retrievers

Purpose: Interface that abstracts how documents are retrieved from a vector store.

๐Ÿ”น Retriever Types:

Retriever Description
vectorstore.as_retriever() Basic semantic search
MultiQueryRetriever Uses multiple reformulated queries
ContextualCompressionRetriever Compresses retrieved docs using an LLM
SelfQueryRetriever Uses LLM to generate structured queries with filters

๐Ÿ“Œ Example:

retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})

🧠 Putting It All Together (Mini RAG Pipeline)

from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    retriever=retriever,
    return_source_documents=True
)

query = "What are vector databases used for?"
result = qa_chain(query)

print("Answer:", result["result"])


🔗 LangChain Chains: Orchestrating LLM Workflows

In LangChain, a Chain is a modular pipeline that connects multiple components—like prompts, LLMs, retrievers, and tools—into a sequential or branching workflow. Chains are the backbone of LangChain applications, enabling you to build everything from simple Q&A bots to complex multi-step reasoning agents.


🧱 Types of Chains in LangChain

🔹 1. LLMChain

The most basic chain: combines a prompt and an LLM.

from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
chain = LLMChain(llm=llm, prompt=prompt)

response = chain.run(text="Hello, how are you?")

🔹 2. RetrievalQA

Combines a retriever (e.g., a vector store) with an LLM to build a RAG pipeline.

from langchain.chains import RetrievalQA
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
response = qa_chain.run("What is a vector database?")

🔹 3. Stuff, Map-Reduce, and Refine Chains

Used for summarization or document synthesis.

  • Stuff: concatenates all docs into one prompt
  • Map-Reduce: summarizes chunks individually, then combines the partial summaries
  • Refine: iteratively builds a summary by refining the previous output
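
The map-reduce pattern is easy to see in plain Python. Here `summarize` is a toy stand-in for an LLM summarization call (illustrative only, not a LangChain API):

```python
def summarize(text: str) -> str:
    # Stand-in for an LLM call: keep only the first sentence
    return text.split(".")[0].strip() + "."

def map_reduce_summary(docs: list[str]) -> str:
    # Map step: summarize each chunk independently (parallelizable)
    partial = [summarize(d) for d in docs]
    # Reduce step: combine the partial summaries and summarize once more
    return summarize(" ".join(partial))

docs = [
    "LangChain orchestrates LLM calls. It has many modules.",
    "LlamaIndex focuses on data ingestion. It builds indexes.",
]
print(map_reduce_summary(docs))
```

The stuff strategy corresponds to calling `summarize(" ".join(docs))` directly, which only works while all documents fit in the model's context window; map-reduce trades extra LLM calls for unbounded input size.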

🔹 4. SequentialChain

Executes multiple chains in a fixed order, passing outputs as inputs. Each chain's output_key must match the input variable of the next chain.

from langchain.chains import SequentialChain

chain1 = LLMChain(llm=llm, prompt=PromptTemplate.from_template("Summarize: {text}"), output_key="summary")
chain2 = LLMChain(llm=llm, prompt=PromptTemplate.from_template("Translate to French: {summary}"), output_key="translation")

seq_chain = SequentialChain(chains=[chain1, chain2], input_variables=["text"], output_variables=["translation"])

🔹 5. SimpleSequentialChain

A simplified version of SequentialChain for linear flows where each step has a single input and a single output.

from langchain.chains import SimpleSequentialChain

chain = SimpleSequentialChain(chains=[chain1, chain2])

🔹 6. MultiPromptChain

Routes input to different prompts based on topic or intent.


🔹 7. RouterChain

Dynamically selects a sub-chain based on input characteristics.


🧠 When to Use Chains

  • Basic prompt + LLM → LLMChain
  • RAG pipeline → RetrievalQA
  • Summarization → map-reduce or refine chains
  • Multi-step workflows → SequentialChain, RouterChain
  • Topic-based routing → MultiPromptChain


LCEL vs. Legacy Chains

  • 🧱 Style: LCEL is declarative and composable; legacy chain classes are object-oriented
  • 🔗 Composition: LCEL uses the | (pipe) operator; legacy chains use class-based composition
  • ⚡ Performance: LCEL is async-, streaming-, and batch-ready; legacy chains are less optimized
  • 🧠 Flexibility: LCEL easily mixes LLMs, tools, and retrievers; legacy chains are harder to customize
  • 🧪 Debugging: LCEL is transparent and introspectable; legacy chains are more opaque
  • 🛠️ Status: LCEL is modern and recommended; legacy chains are still supported but not preferred

🔹 1. LCEL (LangChain Expression Language)

LCEL is a new, functional-style API introduced in LangChain to make chains more:

  • Composable
  • Transparent
  • Async-friendly
  • Easier to debug and test

✅ Key Features:

  • Uses the | (pipe) operator to chain components
  • All components implement the Runnable interface
  • Supports streaming, batch processing, and tracing

🧪 Example:

from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.schema.output_parser import StrOutputParser

prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
parser = StrOutputParser()

chain = prompt | llm | parser
result = chain.invoke({"text": "Good morning!"})

🔹 2. Chain Class (Legacy)

The legacy Chain classes (like LLMChain, SequentialChain, RetrievalQA) are object-oriented wrappers that encapsulate logic in a more rigid structure.

✅ Key Features:

  • Easy to use for simple pipelines
  • Good for beginners
  • Still supported, but less flexible

🧪 Example:

from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
chain = LLMChain(prompt=prompt, llm=llm)

result = chain.run(text="Good morning!")

🧠 When to Use What?

  • Prototyping or legacy code → Chain classes
  • Production-ready pipelines → LCEL
  • Async or streaming apps → LCEL
  • Complex workflows (tools, retrievers, memory) → LCEL
  • Multi-modal or multi-step chains → LCEL



🧠 LangChain Agents: LLMs That Can Think and Act

LangChain Agents are one of the most powerful features in the framework. They allow a language model to dynamically decide what actions to take, such as calling tools, querying APIs, or performing calculations—based on the user’s input and the current context.


๐Ÿ” What Is an Agent?

An agent is an LLM-powered decision-maker that:

  1. Interprets the user’s query.
  2. Chooses the appropriate tool(s) to use.
  3. Executes actions in sequence.
  4. Synthesizes the final answer.

📌 Think of it as an LLM with a brain and a toolbox.


🧰 Common Tools Agents Can Use

  • LLM Math: solve math problems using Python
  • SerpAPI: perform web searches
  • Python REPL: run Python code
  • VectorStoreRetriever: retrieve documents from a vector DB
  • Custom APIs: call your own endpoints

🧠 Agent Types in LangChain

  • zero-shot-react-description: uses ReAct-style prompting to choose tools based on their descriptions
  • chat-zero-shot-react-description: same as above, but optimized for chat models
  • structured-chat-zero-shot-react-description: returns structured outputs
  • openai-functions: uses OpenAI's function-calling API (if supported)

🧪 Example: Agent with Math and Search Tools

from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent="zero-shot-react-description",
    verbose=True
)

agent.run("What is the square root of the year Einstein published the theory of relativity?")

🔄 How Agents Work (ReAct Loop)

  1. Thought: "I need to look up the year Einstein published his theory."
  2. Action: Search["Einstein theory of relativity year"]
  3. Observation: "1905"
  4. Thought: "Now I can calculate the square root."
  5. Action: Calculator["sqrt(1905)"]
  6. Observation: "43.65"
  7. Final Answer: "The square root is approximately 43.65."
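
The loop above can be sketched in plain Python with toy stand-ins for the Search and Calculator tools (illustrative only; in a real agent the LLM chooses each step, rather than the steps being hard-coded):

```python
import math

# Toy tools the "agent" can call; in LangChain these would be Tool objects
TOOLS = {
    "Search": lambda q: "1905" if "relativity" in q else "unknown",
    "Calculator": lambda expr: str(round(math.sqrt(float(expr)), 2)),
}

def react_loop() -> str:
    # Thought: find the year, then take its square root
    year = TOOLS["Search"]("Einstein theory of relativity year")  # Observation: "1905"
    result = TOOLS["Calculator"](year)                            # Observation: "43.65"
    # Final Answer
    return f"The square root is approximately {result}."

print(react_loop())
```

What makes a real agent interesting is that the Thought → Action → Observation sequence is generated by the LLM at runtime, so the same agent can solve questions requiring different tools and different numbers of steps.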

🧠 When to Use Agents

  • Static prompt + LLM → ❌ use LLMChain
  • Dynamic tool use → ✅ use an agent
  • Multi-step reasoning → ✅ use an agent
  • API orchestration → ✅ use an agent
  • Simple RAG → ❌ use RetrievalQA


One of the key features of LangChain is its support for chaining prompts. This means that developers can combine multiple prompts together to create more complex and nuanced requests. Another key feature of LangChain is its support for modular components. This means that developers can reuse components from different chains to create new chains. This can save developers a lot of time and effort, and it also makes it easier to share and collaborate on chains.


LangChain offers a suite of tools, components, and interfaces that simplify the construction of LLM-centric applications. It provides an LLM class for interfacing with various language model providers, such as OpenAI, Cohere, and Hugging Face. This makes it easier to build LLM-agnostic applications: developers can swap language models and focus on application logic without dealing with vendor-specific complexities. The versatility and flexibility of LangChain enable seamless integration with various data sources, making it a comprehensive solution for creating advanced language model-powered applications.


The open-source framework of LangChain is available for building applications in Python or JavaScript/TypeScript. Its core design principle is composition and modularity: by combining modules and components, one can quickly build complex LLM-based applications relevant to the interests and needs of the user. LangChain connects to external systems to access the information required to solve complex problems. It provides abstractions for most of the functionalities needed for building an LLM application and also has integrations that can readily read and write data, reducing the development time of the application. LangChain's framework allows for building applications that are agnostic to the underlying language model. With its ever-expanding support for various LLMs, LangChain offers a unique value proposition for building applications and iterating continuously.


🧪 LangSmith: Observability & Evaluation for LLM Apps

LangSmith is a developer platform built by the creators of LangChain to help you debug, test, evaluate, and monitor your LLM-powered applications. Think of it as the “LangChain DevTools”—a powerful companion for building reliable, production-grade AI systems.


๐Ÿ” Why Use LangSmith?

Feature Benefit
๐Ÿž Debugging Visualize every step in your chain or agent
๐Ÿ“Š Evaluation Run automated or human-in-the-loop evaluations
๐Ÿ” Tracing Inspect inputs, outputs, intermediate steps, and tool calls
๐Ÿงช Testing Create test suites for prompts, chains, and agents
๐Ÿ“ˆ Monitoring Track performance, latency, and failure rates in production

🧱 Key Concepts

🔹 1. Traces

A trace is a full record of a chain or agent run, including:

  • Inputs and outputs
  • Intermediate steps (e.g., tool calls, LLM generations)
  • Errors and retries

🔹 2. Datasets

Collections of test inputs and expected outputs used for:

  • Regression testing
  • Prompt tuning
  • Model comparisons

🔹 3. Evaluators

Automated or manual scoring functions to assess:

  • Accuracy
  • Relevance
  • Helpfulness
  • Toxicity

🧪 Example: Logging a Chain Run to LangSmith

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Enable LangSmith tracing via environment variables
import os
os.environ["LANGCHAIN_API_KEY"] = "your-langsmith-api-key"
os.environ["LANGCHAIN_TRACING_V2"] = "true"

# Define a simple chain
prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
chain = LLMChain(llm=llm, prompt=prompt)

# Run with tracing
result = chain.run("Good morning!")
print(result)

You’ll see the full trace in your LangSmith dashboard.


🧠 When to Use LangSmith

  • Building complex chains or agents: visualize and debug each step
  • Evaluating prompt changes: run A/B tests and regression checks
  • Deploying to production: monitor performance and failures
  • Collaborating with teams: share traces and test results

🚀 Bonus: LangSmith + LangChain Expression Language (LCEL)

LangSmith works seamlessly with LCEL pipelines. Just set the environment variable and all Runnable components will be traced automatically.


🚀 LangServe: Serve LangChain Apps as APIs

LangServe is a lightweight framework built on top of FastAPI that allows you to deploy LangChain chains and agents as RESTful APIs—quickly and with minimal boilerplate.

It’s perfect for turning your LangChain workflows into production-ready services that can be called from web apps, mobile apps, or other backend systems.


🔧 Why Use LangServe?

  • ⚡ FastAPI-based: high-performance, async-ready API server
  • 🔁 Reusable: serve any LangChain Runnable (LLMChain, RAG, agent, etc.)
  • 🧪 LangSmith-compatible: automatically logs traces for observability
  • 🔐 Secure: add auth, rate limiting, and CORS easily
  • 📦 Deployable: works with Docker, serverless, or cloud platforms

🧱 Core Concept: Serve a Runnable Chain

LangServe exposes any LangChain Runnable (like a chain or agent) as a REST API with endpoints like:

  • POST /invoke → Run the chain
  • POST /batch → Run multiple inputs
  • GET /config → View chain metadata

🧪 Example: Serve a Simple LLMChain

1. 📁 Project Structure

my_langserve_app/
├── app.py
├── chain.py
└── requirements.txt

2. 🧠 chain.py

from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

prompt = PromptTemplate.from_template("Translate '{text}' to French.")
llm = OpenAI()
chain = LLMChain(prompt=prompt, llm=llm)

# Expose as a Runnable
from langchain.schema.runnable import Runnable
app_chain: Runnable = chain

3. 🚀 app.py

from langserve import add_routes
from fastapi import FastAPI
from chain import app_chain

app = FastAPI()
add_routes(app, app_chain, path="/translate")

4. ▶️ Run the Server

uvicorn app:app --reload --port 8000

5. 🧪 Test It

curl -X POST http://localhost:8000/translate/invoke \
  -H "Content-Type: application/json" \
  -d '{"input": {"text": "Good morning"}}'

🧠 Bonus Features

  • ✅ Works with LCEL (| operator pipelines)
  • ✅ Supports streaming responses
  • ✅ Integrates with LangSmith for tracing
  • ✅ Easily deployable with Docker or serverless platforms




🦙 LlamaIndex: The Data Framework for LLMs

LlamaIndex (formerly known as GPT Index) is a powerful open-source framework designed to help you connect large language models (LLMs) to your external data—like PDFs, databases, Notion docs, APIs, and more.

It’s purpose-built for building RAG (Retrieval-Augmented Generation) systems and semantic search applications with minimal friction.


🧠 Why Use LlamaIndex?

  • 🔌 Data Connectors: load data from files, APIs, SQL, Notion, etc.
  • 🧾 Indexing: organize and chunk data into searchable structures
  • 🔍 Retrieval: perform semantic search over your data
  • 💬 Query Engines: ask questions and get grounded answers
  • 🧪 Evaluation: built-in tools for testing and refining pipelines

🧱 Core Components of LlamaIndex

1. Data Connectors

Load data from:

  • Files (PDF, Markdown, CSV, etc.)
  • Web pages
  • APIs
  • SQL databases
  • Notion, Google Docs, Airtable, etc.

from llama_index import SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()

2. Node Parsers & Text Splitters

Break documents into manageable chunks (nodes) with metadata.

from llama_index.node_parser import SimpleNodeParser
parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)

3. Indexing

Build an index to organize and retrieve nodes efficiently.

  • VectorStoreIndex: semantic search (most common)
  • ListIndex: ordered document traversal
  • TreeIndex: hierarchical summarization
  • KeywordTableIndex: keyword-based retrieval

from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)

4. Retrievers

Query the index to retrieve relevant chunks.

retriever = index.as_retriever(similarity_top_k=3)

5. Query Engines

Combine retrievers with LLMs to generate answers.

query_engine = index.as_query_engine()
response = query_engine.query("What is LlamaIndex?")
print(response)

6. Storage & Persistence

Save and reload indexes for production use.

index.storage_context.persist("index_storage/")

🔄 LlamaIndex vs. LangChain

  • Focus: LlamaIndex centers on data ingestion, indexing, and retrieval; LangChain on workflow orchestration, agents, and tools
  • Strength: LlamaIndex excels at RAG pipelines and document QA; LangChain at agents, tool use, and multi-step chains
  • Integration: LlamaIndex works with LangChain, OpenAI, and Hugging Face; LangChain can use LlamaIndex as a retriever

✅ Best of both worlds: Use LlamaIndex for retrieval and LangChain for orchestration.


🚀 Use Cases

  • RAG-powered chatbots
  • Enterprise document search
  • Academic research assistants
  • Legal/medical Q&A systems
  • Personal knowledge bases


🔌 LlamaIndex Data Connectors: Bringing External Data to LLMs

Data connectors in LlamaIndex are modules that allow you to ingest data from a wide variety of sources—files, APIs, databases, cloud platforms, and more—so that you can build powerful RAG (Retrieval-Augmented Generation) systems grounded in your own knowledge base.


🧾 Categories of Data Connectors

🔹 1. File-Based Connectors

  • SimpleDirectoryReader: loads all files from a local directory
  • PDFReader: parses PDFs using PyMuPDF or pdfplumber
  • CSVReader: loads structured data from CSV files
  • MarkdownReader: parses .md files
  • DocxReader: reads Microsoft Word .docx files
  • HTMLReader: parses HTML content
  • ImageReader: extracts text from images using OCR (e.g., Tesseract)

📌 Example:

from llama_index import SimpleDirectoryReader
documents = SimpleDirectoryReader("data/").load_data()

🔹 2. Web & Cloud Connectors

  • SimpleWebPageReader: scrapes and parses content from URLs
  • NotionPageReader: loads content from Notion pages
  • GoogleDocsReader: connects to Google Docs via API
  • SlackReader: ingests messages from Slack channels
  • GithubRepositoryReader: loads code and docs from GitHub repos
  • ConfluenceReader: connects to Atlassian Confluence pages

📌 Example:

from llama_index.readers.web import SimpleWebPageReader
documents = SimpleWebPageReader().load_data(urls=["https://example.com"])

🔹 3. Database & Structured Data Connectors

  • DatabaseReader: connects to SQL databases (PostgreSQL, MySQL, SQLite)
  • MongoDBReader: loads documents from MongoDB collections
  • AirtableReader: connects to Airtable bases
  • GoogleSheetsReader: reads from Google Sheets

📌 Example:

from llama_index.readers.database import DatabaseReader
reader = DatabaseReader(uri="sqlite:///mydb.sqlite")
documents = reader.load_data(query="SELECT * FROM customers")

🔹 4. API & Custom Connectors

  • OpenAPIReader: connects to OpenAPI-compatible APIs
  • RSSReader: loads content from RSS feeds
  • Custom readers: build your own connector by subclassing BaseReader

📌 Example:

from llama_index.readers.base import BaseReader
from llama_index.schema import Document

class MyCustomReader(BaseReader):
    def load_data(self, **kwargs):
        # Fetch and return Document objects
        return [Document(text="Custom data here")]

🧠 Best Practices

  • Use metadata (e.g., source, timestamp) to enhance retrieval quality
  • Combine multiple connectors for hybrid knowledge bases
  • Persist documents using LlamaIndex’s StorageContext for reuse
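
Why metadata helps, as a plain-Python sketch (the `chunks` structure and the scoring function are illustrative, not LlamaIndex APIs):

```python
# Each chunk keeps its text plus metadata about where it came from
chunks = [
    {"text": "Q3 revenue grew 12%.", "metadata": {"source": "finance.pdf", "year": 2024}},
    {"text": "Onboarding checklist for new hires.", "metadata": {"source": "hr.docx", "year": 2021}},
]

def retrieve(query_terms, source=None):
    # Filter on metadata first, then rank by a naive term-overlap score
    pool = [c for c in chunks if source is None or c["metadata"]["source"] == source]
    return sorted(pool, key=lambda c: -sum(t in c["text"].lower() for t in query_terms))

print(retrieve(["revenue"], source="finance.pdf")[0]["text"])
```

Real vector stores do the same thing at scale: metadata filters narrow the candidate pool before (or alongside) semantic similarity ranking, which both improves precision and lets answers cite their source.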

🧪 Bonus: Combine with Node Parsers

After loading data, use a NodeParser to chunk and structure it:

from llama_index.node_parser import SimpleNodeParser
parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)


🦙 Core Components of LlamaIndex

LlamaIndex is designed to help you build powerful Retrieval-Augmented Generation (RAG) systems by connecting LLMs to your external data. Its architecture is modular and consists of several core components that work together to ingest, index, retrieve, and query data.


🧱 1. Data Connectors (Loaders)

Purpose: Ingest data from various sources.

  • Files: PDFs, CSVs, Markdown, DOCX
  • Web: URLs, Notion, Confluence, GitHub
  • Databases: SQL, MongoDB, Airtable
  • APIs: OpenAPI, RSS, custom endpoints

📌 Example:

from llama_index import SimpleDirectoryReader
documents = SimpleDirectoryReader("data/").load_data()

✂️ 2. Node Parsers (Text Splitters)

Purpose: Break documents into smaller, manageable chunks called "nodes", each carrying metadata.

  • SimpleNodeParser: basic chunking by character count
  • SentenceWindowNodeParser: sentence-aware chunking
  • HierarchicalNodeParser: multi-level chunking for tree-based indexes

📌 Example:

from llama_index.node_parser import SimpleNodeParser
nodes = SimpleNodeParser().get_nodes_from_documents(documents)
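Conceptually, basic chunking is just a sliding window over the text. A dependency-free sketch of fixed-size character chunking with overlap (a simplified stand-in for what a basic node parser does, not LlamaIndex's actual implementation):

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list:
    """Split text into fixed-size character chunks, overlapping neighbours by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

print([len(c) for c in chunk_text("a" * 250, chunk_size=100, overlap=20)])  # [100, 100, 90]
```

The overlap ensures that a sentence cut at a chunk boundary still appears whole in at least one chunk, which is why real parsers default to a non-zero overlap.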

๐Ÿง  3. Embedding Models

Purpose: Convert text chunks (nodes) into dense vector representations for semantic search.

Provider Examples
OpenAI text-embedding-ada-002
Hugging Face SentenceTransformers
Cohere embed-english-v3.0

๐Ÿ“Œ Example:

from llama_index.embeddings import OpenAIEmbedding
embed_model = OpenAIEmbedding()

๐Ÿ“ฆ 4. Indexes

Purpose: Organize and store nodes for efficient retrieval.

Index Type Use Case
VectorStoreIndex Semantic search (most common)
ListIndex Ordered traversal (e.g., summarization)
TreeIndex Hierarchical summarization
KeywordTableIndex Keyword-based search

๐Ÿ“Œ Example:

from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)

๐Ÿ” 5. Retrievers

Purpose: Retrieve relevant nodes from an index based on a query.

Retriever Description
VectorIndexRetriever Default top-k similarity search
BM25Retriever Keyword-based retrieval
QueryFusionRetriever Fuses vector + keyword results for hybrid search
AutoMergingRetriever Merges overlapping chunks for better context

๐Ÿ“Œ Example:

retriever = index.as_retriever(similarity_top_k=3)
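At its core, top-k vector retrieval is cosine similarity plus a sort. A dependency-free sketch of the idea (toy 2-dimensional vectors and hypothetical helper names, not LlamaIndex's internals):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve_top_k(query_vec, indexed, k=2):
    """Rank (vector, text) pairs by similarity to the query vector; return top-k texts."""
    scored = sorted(indexed, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in scored[:k]]

indexed = [
    ([1.0, 0.0], "vector databases store embeddings"),
    ([0.0, 1.0], "llamas are domesticated animals"),
    ([0.9, 0.1], "semantic search uses vector similarity"),
]
print(retrieve_top_k([1.0, 0.05], indexed, k=2))
```

Real retrievers replace the toy vectors with embedding-model outputs and the linear scan with an approximate nearest-neighbour index, but `similarity_top_k` maps directly onto the `k` parameter here.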

๐Ÿ’ฌ 6. Query Engines

Purpose: Combine retrievers with LLMs to generate answers.

Engine Description
index.as_query_engine() Basic RAG with sensible defaults
RetrieverQueryEngine Custom retriever + LLM
SubQuestionQueryEngine Breaks complex queries into sub-questions
NLSQLTableQueryEngine Queries SQL databases using natural language

๐Ÿ“Œ Example:

query_engine = index.as_query_engine()
response = query_engine.query("What is LlamaIndex?")

๐Ÿ’พ 7. Storage Context

Purpose: Persist and reload indexes, documents, and embeddings.

๐Ÿ“Œ Example:

index.storage_context.persist("storage/")

๐Ÿงช 8. Evaluation & Observability

Purpose: Test and debug your RAG pipeline.

Tool Use
Tracing integrations (e.g., Arize Phoenix) Trace and debug runs
Built-in Evaluators Accuracy, relevance, faithfulness
Dataset Generator Create test sets from your data

๐Ÿง  Summary Table

Component Role
Data Connectors Load data from files, APIs, DBs
Node Parsers Chunk and structure documents
Embeddings Convert text to vectors
Indexes Organize and store nodes
Retrievers Fetch relevant chunks
Query Engines Generate answers using LLMs
Storage Save and reload pipelines
Evaluation Test and debug performance


๐Ÿงฑ Types of Indexes in LlamaIndex

In LlamaIndex, an index is a data structure that organizes your documents (or nodes) to enable efficient retrieval and interaction with LLMs. Each index type is optimized for a different use case—whether it's semantic search, summarization, or keyword lookup.


๐Ÿ”น 1. VectorStoreIndex (Most Common)

Purpose: Semantic search using vector similarity.

  • Stores embeddings of document chunks (nodes)
  • Supports top-k retrieval based on cosine similarity or other distance metrics
  • Works with vector stores like FAISS, Pinecone, Chroma, Weaviate

๐Ÿ“Œ Use Case: Retrieval-Augmented Generation (RAG), semantic Q&A

from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)

๐Ÿ”น 2. ListIndex

Purpose: Ordered traversal of documents.

  • Stores documents in a linear list
  • Useful for summarization or sequential reading
  • No semantic search—retrieves all documents in order

๐Ÿ“Œ Use Case: Document summarization, storytelling, walkthroughs

from llama_index import ListIndex
index = ListIndex.from_documents(documents)

๐Ÿ”น 3. TreeIndex

Purpose: Hierarchical summarization and reasoning.

  • Builds a tree of summaries from document chunks
  • Each parent node summarizes its children
  • Enables recursive summarization and multi-level reasoning

๐Ÿ“Œ Use Case: Long document summarization, nested Q&A, outline generation

from llama_index import TreeIndex
index = TreeIndex.from_documents(documents)
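The tree-building idea can be sketched without an LLM by using a trivial stand-in summarizer (here, just the opening words of each child; a real TreeIndex calls the LLM at every parent node). All names below are hypothetical:

```python
def fake_summarize(texts):
    """Stand-in for an LLM summary: join the first sentence fragment of each child."""
    return " / ".join(t.split(".")[0] for t in texts)

def build_tree(chunks, fanout=2):
    """Group `fanout` nodes at a time and summarize each group until one root remains."""
    level = list(chunks)
    levels = [level]
    while len(level) > 1:
        parents = [
            fake_summarize(level[i:i + fanout])
            for i in range(0, len(level), fanout)
        ]
        levels.append(parents)
        level = parents
    return levels  # levels[0] = leaf chunks, levels[-1] = [root summary]

leaves = ["Chapter one. Details...", "Chapter two. Details...",
          "Chapter three. Details...", "Chapter four. Details..."]
tree = build_tree(leaves)
print(tree[-1][0])
```

Querying such a tree starts at the root summary and descends only into the children that look relevant, which is what makes hierarchical indexes efficient on very long documents.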

๐Ÿ”น 4. KeywordTableIndex

Purpose: Keyword-based retrieval (non-semantic).

  • Extracts keywords from documents and builds an inverted index
  • Fast keyword lookup without embeddings
  • Lightweight and interpretable

๐Ÿ“Œ Use Case: Simple keyword search, fallback when embeddings are unavailable

from llama_index import KeywordTableIndex
index = KeywordTableIndex.from_documents(documents)
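A keyword table is essentially an inverted index: a map from each word to the documents containing it. A tiny dependency-free version of the idea (helper names are illustrative, not LlamaIndex's API):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def keyword_search(index, query):
    """Return ids of documents containing every query word (AND semantics)."""
    sets = [index.get(w, set()) for w in query.lower().split()]
    return sorted(set.intersection(*sets)) if sets else []

docs = ["LlamaIndex builds RAG pipelines",
        "Vector stores hold embeddings",
        "RAG pipelines use vector stores"]
index = build_inverted_index(docs)
print(keyword_search(index, "rag pipelines"))  # [0, 2]
```

Because lookups are simple set intersections, there is no embedding cost at all, which is exactly the lightweight, interpretable trade-off described above.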

๐Ÿง  Summary Table

Index Type | Retrieval Style | Best For
VectorStoreIndex | Semantic similarity | RAG, semantic search, Q&A
ListIndex | Sequential | Summarization, walkthroughs
TreeIndex | Hierarchical | Recursive summarization, long docs
KeywordTableIndex | Keyword match | Lightweight search, no embeddings


๐Ÿ”Œ Connecting LlamaIndex with Different LLMs

LlamaIndex is model-agnostic—it supports a wide range of LLMs from different providers, allowing you to plug in the model that best fits your use case, whether it's hosted (like OpenAI) or local (like LLaMA or Mistral).


๐Ÿง  How LLMs Are Used in LlamaIndex

LLMs in LlamaIndex are used for:

  • Generating answers (via Query Engines)
  • Summarizing documents
  • Refining responses
  • Re-ranking retrieved results
  • Evaluating outputs

๐Ÿ”น 1. OpenAI (GPT-3.5, GPT-4)

from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-4", temperature=0.3)

✅ Requires OPENAI_API_KEY
✅ Great for RAG, summarization, and reasoning tasks


๐Ÿ”น 2. Anthropic (Claude)

from llama_index.llms import Anthropic

llm = Anthropic(model="claude-2", temperature=0.5)

✅ Requires ANTHROPIC_API_KEY
✅ Known for long context windows and safe outputs


๐Ÿ”น 3. Hugging Face (Hosted or Local)

from llama_index.llms import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="tiiuae/falcon-7b-instruct",
    tokenizer_name="tiiuae/falcon-7b-instruct",
    context_window=2048,
    max_new_tokens=256
)

✅ Works with Hugging Face Hub or local models
✅ Ideal for open-source deployments


๐Ÿ”น 4. Google Vertex AI (PaLM, Gemini)

from llama_index.llms import Vertex

llm = Vertex(model="text-bison", temperature=0.2)

✅ Requires Google Cloud setup
✅ Good for enterprise and multilingual use cases


๐Ÿ”น 5. Cohere

from llama_index.llms import Cohere

llm = Cohere(model="command-xlarge-nightly", temperature=0.4)

✅ Requires COHERE_API_KEY
✅ Strong performance on command-following tasks


๐Ÿ”น 6. Local Models (LLaMA, Mistral, etc.)

Use with backends like:

  • ๐Ÿ”ง Ollama
  • ๐Ÿ”ง LM Studio
  • ๐Ÿ”ง Hugging Face Transformers
  • ๐Ÿ”ง vLLM or Text Generation Inference (TGI)

๐Ÿ“Œ Example with Ollama:

from llama_index.llms import Ollama

llm = Ollama(model="llama2")

๐Ÿ“Œ Example with Transformers: use the same HuggingFaceLLM class shown in section 3 above, pointing model_name at a locally downloaded model.

๐Ÿงช Using the LLM in a Query Engine

from llama_index import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(llm=llm)

response = query_engine.query("What is LlamaIndex?")
print(response)

๐Ÿง  Summary Table

Provider | Class | Notes
OpenAI | OpenAI | GPT-3.5, GPT-4
Anthropic | Anthropic | Claude 1/2
Hugging Face | HuggingFaceLLM | Local or hosted models
Google | Vertex | PaLM, Gemini
Cohere | Cohere | Command models
Ollama | Ollama | Local LLaMA, Mistral, etc.


Finally, let's build a simple RAG (Retrieval-Augmented Generation) pipeline using ๐Ÿฆ™ LlamaIndex with:

  • PDF/Text file ingestion
  • Node parsing and vector indexing
  • OpenAI for embeddings and LLM
  • Semantic search and query answering

๐Ÿงช Full LlamaIndex RAG Pipeline (Step-by-Step)

✅ Prerequisites

Install the required packages:

pip install llama-index openai pypdf

(Note: these examples use the pre-0.10 llama-index import paths; from llama-index 0.10 onward the same classes are imported from llama_index.core.)

Set your OpenAI API key:

export OPENAI_API_KEY=your-api-key

๐Ÿ“ Folder Structure

llamaindex_rag/
├── data/
│   └── example.pdf  # or .txt
├── app.py

๐Ÿง  app.py

from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.node_parser import SimpleNodeParser
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding
from llama_index.query_engine import RetrieverQueryEngine

# Step 1: Load documents from a folder
documents = SimpleDirectoryReader("data").load_data()

# Step 2: Parse documents into nodes (chunks)
parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)

# Step 3: Set up embedding model and LLM
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.3),
    embed_model=OpenAIEmbedding(),
)

# Step 4: Create a vector index from nodes
index = VectorStoreIndex(nodes, service_context=service_context)

# Step 5: Create a retriever and query engine
retriever = index.as_retriever(similarity_top_k=3)
query_engine = RetrieverQueryEngine.from_args(retriever, service_context=service_context)

# Step 6: Ask a question
query = "What is this document about?"
response = query_engine.query(query)

# Step 7: Print the answer
print("\n๐Ÿง  Answer:")
print(response)

๐Ÿ“„ Example Output

๐Ÿง  Answer:
This document discusses the fundamentals of vector databases and their role in semantic search...

๐Ÿง  What This Pipeline Does

Step Purpose
๐Ÿ“‚ Load Ingests files from the data/ folder
✂️ Parse Splits documents into chunks (nodes)
๐Ÿง  Embed Converts chunks into vectors using OpenAI
๐Ÿ“ฆ Index Stores vectors in memory for semantic search
๐Ÿ” Retrieve Finds top-k relevant chunks
๐Ÿ’ฌ Generate Uses GPT to synthesize an answer from context
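The six steps above can be chained end to end without any external services using crude stand-ins: word-overlap scoring in place of embeddings and a template in place of the LLM. Everything below is a hypothetical sketch, useful only for seeing the data flow:

```python
def score(query, chunk):
    """Word-overlap (Jaccard) score: a crude stand-in for embedding similarity."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0

def toy_rag(question, chunks, top_k=1):
    """Retrieve the best-matching chunks, then 'generate' with a template.
    A real pipeline embeds with OpenAIEmbedding and generates with an LLM."""
    ranked = sorted(chunks, key=lambda c: score(question, c), reverse=True)
    context = " ".join(ranked[:top_k])
    return f"Based on the retrieved context: {context}"

chunks = ["LlamaIndex connects LLMs to external data",
          "Bananas are rich in potassium"]
print(toy_rag("what does LlamaIndex connect", chunks))
```

Swapping `score` for real embeddings and the template for an LLM call is exactly what the app.py above does; the retrieve-then-generate shape stays the same.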








