Implementing Retrieval-Augmented Generation with LangChain, Pgvector and OpenAI

by Adithya Hebbar, System Analyst

Introduction

In the previous post, we explored how Retrieval-Augmented Generation (RAG) can augment the capabilities of GPT models. This post goes a step further and builds a system that creates embeddings from a document set with LangChain, stores them in Pgvector, and passes the retrieved passages to OpenAI's GPT so that its responses are grounded in relevant context.


1. The Role of Embeddings in RAG Systems

Embeddings represent text as dense numerical vectors that capture its semantic meaning, so texts with similar meanings end up close together in vector space. By storing these embeddings in PostgreSQL with the Pgvector extension, we can efficiently retrieve the passages most similar to a query and feed them into a model like GPT to improve its answers.
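
To make "close together in vector space" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical values, not produced by any real model)
paris = [0.9, 0.1, 0.0]
eiffel = [0.8, 0.2, 0.1]
banana = [0.0, 0.1, 0.9]

# Related texts should score higher than unrelated ones
print(cosine_similarity(paris, eiffel) > cosine_similarity(paris, banana))  # True
```

A vector store runs this kind of comparison between the query embedding and every stored embedding, usually with an index to avoid a full scan.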


2. Installation and Setup

Before diving into the code, let’s go over the installation steps to set up the environment with LangChain, Pgvector, and OpenAI.

Step 1: Install the Required Libraries

You’ll need the following Python libraries to get started:

bash

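pip install langchain openai tiktoken pgvector psycopg2-binary

  • langchain: A framework to work with LLMs and build AI applications.
  • openai and tiktoken: The OpenAI client and tokenizer that LangChain's OpenAI integrations rely on.
  • pgvector: The Python client for the pgvector Postgres extension, which adds vector storage and similarity search to PostgreSQL.
  • psycopg2-binary: A PostgreSQL adapter for Python to handle database interactions.

The import paths in this post follow the classic single-package LangChain layout. Recent releases move these integrations into the langchain-community and langchain-openai packages, so adjust the imports if you are on a newer version.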

Step 2: Set up Pgvector in PostgreSQL

To store embeddings, your PostgreSQL server needs the Pgvector extension installed. Once it is available on the server, enable it in the database you plan to use:

  1. Enable Pgvector in your database (this does nothing if it is already enabled):

bash

psql -d your_database -c "CREATE EXTENSION IF NOT EXISTS vector;"

Step 3: Set Up OpenAI API Access

You’ll need an OpenAI API key to generate embeddings and interact with GPT models.

  1. Go to OpenAI API to sign up or log in.
  2. Get your API key and store it in an environment variable:

bash

export OPENAI_API_KEY="your_openai_api_key"

With these installations in place, you’re ready to move on to loading, embedding, and querying your documents.
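
Before running the examples, it can help to fail early if the key is missing. The helper below is a small sketch; `require_env` is a hypothetical name, not part of LangChain or the OpenAI client, and the `env` parameter exists only so the function can be tried without touching the real environment.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable, or raise with a clear message."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the examples")
    return value


# Demonstration with a fake environment (placeholder value, not a real key)
print(require_env("OPENAI_API_KEY", env={"OPENAI_API_KEY": "sk-example"}))  # sk-example
```

In a real script you would call `require_env("OPENAI_API_KEY")` once at startup, so a missing key surfaces before any documents are loaded.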


3. Loading the Document Data

Python

from langchain.document_loaders import TextLoader

# Load the document from a local file; TextLoader opens the path itself,
# so there is no need to open the file first
loader = TextLoader("path_to_your_file.txt", encoding="utf-8")
documents = loader.load()

# Optional: Split large documents for better embedding granularity
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
texts = text_splitter.split_documents(documents)

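Splitting the documents into smaller chunks means each chunk is embedded individually, so a query can match the specific passage that answers it rather than a whole document. The small overlap between consecutive chunks keeps text that straddles a boundary from losing its surrounding context.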
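
To show what chunk_size and chunk_overlap mean, here is a simplified fixed-width sketch. It is not how RecursiveCharacterTextSplitter works internally (that splitter prefers to break at separators such as paragraph and line breaks); `chunk_text` is a hypothetical helper used only for illustration.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=50):
    """Split text into fixed-width chunks that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters
chunks = chunk_text(sample, chunk_size=12, chunk_overlap=4)
for chunk in chunks:
    print(repr(chunk))

# The last 4 characters of one chunk reappear at the start of the next
print(chunks[0][-4:] == chunks[1][:4])  # True
```

Larger chunks carry more context per embedding, while smaller chunks make matches more precise; the values above are only a starting point.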


4. Creating and Storing Embeddings

Once we have our text data, we can generate embeddings using LangChain and store them in Pgvector.

Python

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.pgvector import PGVector

# Set up the embeddings model; the key is read from the OPENAI_API_KEY environment variable
embeddings_model = OpenAIEmbeddings()

# Set up the Pgvector database
collection_name = "my_collection"
# SQLAlchemy-style URL; the bare "postgres://" scheme is rejected by SQLAlchemy 1.4 and later
connection_string = "postgresql+psycopg2://user:password@localhost:5432/mydb"

# Embed the chunks and save them to the Pgvector database.
# from_documents is a classmethod that returns the populated vector store.
pgvector = PGVector.from_documents(
    documents=texts,
    embedding=embeddings_model,
    collection_name=collection_name,
    connection_string=connection_string,
)
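
The connection string is parsed as a URL, so a password containing characters such as @, : or / needs to be percent-encoded. The sketch below shows one way to build the string safely; `build_connection_string` is a hypothetical helper, and the default host, port, and database name are placeholders.

```python
from urllib.parse import quote_plus


def build_connection_string(user, password, host="localhost", port=5432, database="mydb"):
    """Build a SQLAlchemy-style PostgreSQL URL with percent-encoded credentials."""
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


print(build_connection_string("user", "p@ss:word/1"))
# postgresql+psycopg2://user:p%40ss%3Aword%2F1@localhost:5432/mydb
```

In practice the credentials would come from environment variables or a secrets manager rather than from literals in the source.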

5. Querying the Database for Relevant Data

Now that the embeddings are stored, we can retrieve the most relevant documents based on a query.

Python

# Retrieve documents based on similarity to the query
query = "Tell me about the Eiffel Tower."

retriever = pgvector.as_retriever(search_type="similarity", search_kwargs={"k": 5})

# Get the top 5 most relevant documents
relevant_docs = retriever.get_relevant_documents(query)

def format_docs(docs):
    return "\n\n".join([d.page_content for d in docs])

context = format_docs(relevant_docs)
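
Conceptually, a similarity retriever with k=5 embeds the query, scores every stored chunk against it, and keeps the five best. The in-memory sketch below illustrates that ranking step with toy vectors; it is an assumption-laden illustration, not the SQL that Pgvector actually runs, and `top_k` and the sample data are hypothetical.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def top_k(query_vec, store, k=5):
    """Return the texts of the k stored (text, vector) pairs most similar to query_vec."""
    best = heapq.nlargest(k, store, key=lambda item: cosine(query_vec, item[1]))
    return [text for text, _ in best]


# Toy store: (chunk text, made-up embedding)
store = [
    ("The Eiffel Tower is in Paris.", [0.9, 0.1, 0.0]),
    ("Bananas are rich in potassium.", [0.0, 0.1, 0.9]),
    ("The tower was completed in 1889.", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # stands in for the embedded query

print(top_k(query_vec, store, k=2))
# ['The Eiffel Tower is in Paris.', 'The tower was completed in 1889.']
```

A higher k gives the model more context at the cost of a longer prompt and a greater chance of including loosely related chunks.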

6. Generating a Response with OpenAI

With the relevant context retrieved, we now use OpenAI’s GPT model to generate a response.

Python

from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI

# Prepare the template for the prompt
template = """You are an AI assistant. Here's the context for the question:\n{context}\nNow answer the question below:\n\nQuestion: {question}\nAI: """
prompt = ChatPromptTemplate.from_template(template)

# Set up the GPT-4 model; the key is read from the OPENAI_API_KEY environment variable
model = ChatOpenAI(model="gpt-4")

# Prepare the full chain
chain = prompt | model

# Generate the response
response = chain.invoke({
    "context": context,
    "question": query
})

# The chain returns a message object; the answer text is in its content attribute
print(response.content)
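
It can be useful to see the final prompt text the model receives. The sketch below fills the same template with plain string formatting and, as an added assumption, stops adding chunks once a character budget is reached so the prompt stays within the model's context window. `build_prompt` and `max_context_chars` are hypothetical names, and the budget value is arbitrary.

```python
TEMPLATE = (
    "You are an AI assistant. Here's the context for the question:\n"
    "{context}\n"
    "Now answer the question below:\n\n"
    "Question: {question}\n"
    "AI: "
)


def build_prompt(chunks, question, max_context_chars=4000):
    """Join chunks (most relevant first) up to a character budget and fill the template."""
    selected, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_context_chars:
            break  # keep the higher-ranked chunks and drop the rest
        selected.append(chunk)
        used += len(chunk)
    return TEMPLATE.format(context="\n\n".join(selected), question=question)


prompt_text = build_prompt(
    ["The Eiffel Tower is in Paris.", "The tower was completed in 1889."],
    "Tell me about the Eiffel Tower.",
)
print(prompt_text)
```

Printing the assembled prompt like this is a quick way to check that the retrieved context is relevant before spending tokens on a model call.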

Conclusion

In this guide, we’ve shown how to set up LangChain with Pgvector to create and store document embeddings, query them for relevant context, and use that context to generate a response with OpenAI’s GPT model. Because the model answers from passages retrieved from your own documents, its responses are more accurate and better grounded in context.

