PDF Chatbot with LangChain

A Magical Way to Talk to Your PDFs 📄✨

Have you ever wished there were an easy way to interact with your PDF documents? With LangChain, we can build a powerful PDF chatbot that answers questions directly from your documents. Let's walk through it step by step.

🔧 Prerequisites

First, we need to install a few libraries:

pip install langchain langchain-community langchain-openai pypdf python-dotenv openai chromadb
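
The OpenAI classes used in the later steps read the API key from the OPENAI_API_KEY environment variable, and python-dotenv's load_dotenv() can copy it there from a local .env file. As a minimal sketch, a check like the one below fails early with a readable message when the key is missing. The helper name get_openai_key is only an illustration and is not part of any library.

```python
import os


def get_openai_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error."""
    # `env` is any mapping; it defaults to the real process environment.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or add it to your .env file"
        )
    return key
```

If you keep the key in a .env file, call load_dotenv() first and then get_openai_key() before building the chain.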

Step 1: PDF Loader Setup 📚

To load the PDF, we'll use PyPDFLoader:

from langchain_community.document_loaders import PyPDFLoader

# Path to the PDF file
file_path = "your_document.pdf"

# Initialize the PDF loader
loader = PyPDFLoader(file_path)

# Load the document (one Document per page)
documents = loader.load()

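Explanation: PyPDFLoader reads the file, and load() returns a list of Document objects, one per page. Each Document keeps the page text in page_content and details such as the source path and page number in metadata.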
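
Before moving on, it can help to check what was actually loaded, since a PDF that contains only scanned images may produce pages with no text at all. The sketch below assumes each loaded item exposes page_content and a metadata dict with a "page" entry, as PyPDFLoader's output does. The helper names and the sample data are hypothetical stand-ins so the snippet runs without a real PDF.

```python
from types import SimpleNamespace


def page_report(documents):
    """Return a (page_number, character_count) pair for every loaded page."""
    return [(doc.metadata.get("page"), len(doc.page_content)) for doc in documents]


def empty_pages(documents):
    """Return the page numbers whose extracted text is blank."""
    return [
        doc.metadata.get("page")
        for doc in documents
        if not doc.page_content.strip()
    ]


# Stand-in objects with the same two attributes as a loaded page (made-up data).
sample_docs = [
    SimpleNamespace(page_content="Introduction to the report.", metadata={"page": 0}),
    SimpleNamespace(page_content="   ", metadata={"page": 1}),
]
print(page_report(sample_docs))
print(empty_pages(sample_docs))
```

With the real loader, you would pass the documents list from loader.load() instead of sample_docs.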

Step 2: Text Splitting ✂️

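Splitting the PDF text into chunks is important, because the model can only read a limited amount of text at once and retrieval works best on small, focused passages: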

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Create the text splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)

# Split the documents into chunks
texts = text_splitter.split_documents(documents)

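Explanation: chunk_size is the maximum length of each chunk in characters, and chunk_overlap is how many characters neighboring chunks share, so a sentence that falls on a boundary still appears in full in at least one chunk.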
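
To see what chunk_size and chunk_overlap mean in practice, here is a simplified sliding-window sketch. It is not how RecursiveCharacterTextSplitter works internally (that class prefers to break on separators such as paragraphs and newlines); it only illustrates how size and overlap interact. The function name chunk_text is an assumption made for this example.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows where neighbors share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window moves forward
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


# Tiny example so the overlap is easy to see by eye.
for chunk in chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2):
    print(chunk)
```

With chunk_size=1000 and chunk_overlap=200, each new window starts 800 characters after the previous one.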

Step 3: Generating Embeddings 🧠

Embeddings convert the meaning of the document text into vectors:

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# OpenAI embeddings (reads OPENAI_API_KEY from the environment)
embeddings = OpenAIEmbeddings()

# Create the vector store from the chunks
vectorstore = Chroma.from_documents(
    documents=texts,
    embedding=embeddings
)

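Explanation: OpenAIEmbeddings turns each chunk into a vector, and Chroma.from_documents embeds every chunk and stores the vectors so that chunks similar to a query can be looked up later.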
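
The idea behind a vector store lookup can be sketched in a few lines of plain Python. The example below ranks stored chunks by cosine similarity, which is one common measure; the metric a real store uses depends on its configuration. The three-dimensional vectors and the function names are made up for illustration, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=2):
    """Return the texts of the k stored entries most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy store of (text, vector) pairs with hand-written vectors.
toy_store = [
    ("cats purr", [1.0, 0.0, 0.0]),
    ("dogs bark", [0.0, 1.0, 0.0]),
    ("kittens meow", [0.9, 0.1, 0.0]),
]
print(top_k([1.0, 0.05, 0.0], toy_store))
```

The two cat-related entries rank highest because their vectors point in nearly the same direction as the query vector.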

Step 4: Retrieval Chain Setup 🔗

The retrieval chain helps pull the relevant information out of the document:

from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Initialize the chat model
llm = ChatOpenAI(model_name="gpt-3.5-turbo")

# Build the retrieval QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

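Explanation: vectorstore.as_retriever() wraps the vector store so the chain can fetch the chunks most similar to the question, and chain_type="stuff" places all retrieved chunks into a single prompt together with that question.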
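
The "stuff" strategy is simple enough to sketch by hand: take the retrieved chunks, join them into one context block, and append the question. The prompt wording and the function name build_stuff_prompt below are assumptions for illustration and do not reproduce LangChain's actual template.

```python
def build_stuff_prompt(question, chunks):
    """Join all retrieved chunks into one context block and append the question."""
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up chunks standing in for what the retriever would return.
prompt = build_stuff_prompt(
    "When was the report published?",
    ["The report was published in March.", "It covers the first quarter."],
)
print(prompt)
```

Because every chunk goes into one prompt, this approach suits a handful of short chunks; many long chunks could exceed the model's context window.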

Step 5: Chatbot Interaction 💬

Now you can interact with your PDF!

# Ask a question
query = "What is the main point of this document?"
result = qa_chain.invoke(query)

print(result['result'])

This returns an answer drawn from the relevant parts of your PDF 🎯
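
To ask several questions in a row, a small helper can loop over them and collect the answers. This is only a sketch: answer_all and fake_ask are hypothetical names, and fake_ask stands in for the real chain so the snippet runs without an API key.

```python
def answer_all(ask, questions):
    """Send each non-blank question to `ask` and return (question, answer) pairs."""
    results = []
    for question in questions:
        question = question.strip()
        if not question:
            continue  # skip empty input instead of calling the model
        results.append((question, ask(question)))
    return results


def fake_ask(question):
    """Stand-in for the real chain; returns a placeholder answer."""
    return f"(answer to: {question})"


for question, answer in answer_all(fake_ask, ["What is the summary?", "  ", "Who wrote it?"]):
    print(question, "->", answer)
```

With the chain from Step 4, ask could be something like lambda q: qa_chain.invoke(q)["result"].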

Advanced Features 🚀

1. Conversation Memory

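from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

# Store the chat history under the key the chain expects
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Combine the LLM and retriever from Step 4 with the memory
chat_chain = ConversationalRetrievalChain.from_llm(llm=llm, retriever=vectorstore.as_retriever(), memory=memory)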

Memory helps the chatbot remember earlier questions and context, so the conversation feels natural and the user experience improves.
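
What a buffer-style memory does can be sketched without any library: keep every turn in order and prepend the transcript to the next question, so a follow-up like "who wrote it?" has something to refer to. The ChatHistory class and with_history function are simplified stand-ins written for this example, not LangChain's implementation.

```python
class ChatHistory:
    """Keep every (question, answer) turn in the order it happened."""

    def __init__(self):
        self.turns = []

    def add_turn(self, question, answer):
        self.turns.append((question, answer))

    def as_text(self):
        """Render the turns as a plain transcript."""
        lines = []
        for question, answer in self.turns:
            lines.append(f"Human: {question}")
            lines.append(f"AI: {answer}")
        return "\n".join(lines)


def with_history(history, follow_up):
    """Prefix a follow-up question with the transcript so far, if there is one."""
    transcript = history.as_text()
    if not transcript:
        return follow_up
    return f"{transcript}\nHuman: {follow_up}"


history = ChatHistory()
history.add_turn("What is the document about?", "It is a quarterly report.")
print(with_history(history, "Who wrote it?"))
```

Keeping the full transcript is the simplest option; long conversations grow the prompt with every turn, which is the trade-off of a buffer approach.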

💡 Pro Tips
