Building a Context-Aware Q&A System with LangChain.js and Web Scraping

Posted on July 19, 2025 by Rohit Naik Kundaikar

In the rapidly evolving world of Large Language Models (LLMs), providing relevant context is key to getting accurate and helpful responses. While LLMs are incredibly powerful, they don’t inherently know everything about specific, niche topics or the latest information on a website. This is where Retrieval Augmented Generation (RAG) comes into play, allowing us to feed specific, up-to-date information to our LLMs.

This blog post will walk you through building a basic RAG system using LangChain.js to scrape content from a website, embed it, store it, and then use it to answer questions with a Google Gemini model.

Why RAG? The Problem of Context

Imagine asking an LLM about the specific features of a new product launched last week, or details from your company’s internal documentation. Without explicit context, the LLM might hallucinate, provide general information, or simply state it doesn’t know. RAG solves this by:

  1. Retrieving relevant information from a knowledge base (in our case, a scraped website).
  2. Augmenting the LLM’s prompt with this retrieved context.
  3. Generating a more informed and accurate answer.

The Tools We’ll Use

Our system leverages several powerful components from the LangChain.js ecosystem:

  • LangChain.js: A framework for developing applications powered by language models.
  • ChatGoogleGenerativeAI: Our chosen LLM, specifically Google’s Gemini 2.0 Flash model.
  • CheerioWebBaseLoader: A document loader that uses Cheerio to efficiently scrape HTML content from web pages.
  • RecursiveCharacterTextSplitter: Breaks down large documents into smaller, manageable chunks.
  • GoogleGenerativeAIEmbeddings: Converts text chunks into numerical vector representations (embeddings).
  • MemoryVectorStore: An in-memory database to store and search our text embeddings.
  • createStuffDocumentsChain & createRetrievalChain: LangChain utilities to orchestrate the flow of retrieving documents and combining them with our LLM prompt.
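
Before diving into the code, install the packages used below. A typical setup, assuming the current package layout on npm, looks like this:

npm install langchain @langchain/core @langchain/community @langchain/google-genai cheerio

Note that CheerioWebBaseLoader requires the cheerio package itself as a peer dependency.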

Step-by-Step Implementation

Let’s walk through the implementation step by step:

1. Initialize the Language Model

First, we set up our LLM. We’re using gemini-2.0-flash for its speed and cost-effectiveness, with a temperature of 0 for more deterministic (less creative) answers, which is often preferred for factual Q&A.

import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

const model = new ChatGoogleGenerativeAI({
    model: "gemini-2.0-flash",
    temperature: 0
});
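
ChatGoogleGenerativeAI needs a Google AI API key. By default it reads the GOOGLE_API_KEY environment variable, so set that before running; you can also pass the key explicitly. A minimal sketch, assuming the key is managed via an environment variable:

// Assumes GOOGLE_API_KEY is set in your environment before running.
const modelWithExplicitKey = new ChatGoogleGenerativeAI({
    model: "gemini-2.0-flash",
    temperature: 0,
    apiKey: process.env.GOOGLE_API_KEY
});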

2. Define the Prompt Template

A prompt template structures how the LLM will receive the retrieved context and the user’s question. This ensures the LLM understands its role: to answer based only on the provided content.

import { ChatPromptTemplate } from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromTemplate(
    "Answer users question based on the provided content.\n\n" +
    "Content: {context}\n\n" +
    "Question: {input}"
);
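
The {context} and {input} placeholders are filled in automatically by the retrieval chain we build later. If you want to see the rendered prompt for yourself, a quick sanity check (illustrative only, with made-up values) looks like this:

const rendered = await prompt.format({
    context: "Cheerio is a fast, lightweight HTML parser for Node.js.",
    input: "What is Cheerio?"
});
console.log(rendered); // the fully rendered prompt text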

3. Create the Document Combination Chain

This chain takes the retrieved documents and “stuffs” them into the prompt, preparing them for the LLM.

import { createStuffDocumentsChain } from "langchain/chains/combine_documents";

const chain = await createStuffDocumentsChain({
    llm: model,
    prompt: prompt
});

4. Load Data from a Website

Here’s where the web scraping happens! CheerioWebBaseLoader fetches the content from a specified URL.

import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";

const loader = new CheerioWebBaseLoader("https://js.langchain.com/docs/integrations/document_loaders/web_loaders/web_cheerio/");
const doc = await loader.load();
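
loader.load() resolves to an array of Document objects, each with a pageContent string (the extracted text) and a metadata object (including the source URL). A quick inspection, purely for illustration:

console.log(doc.length);                       // number of documents loaded
console.log(doc[0].metadata);                  // e.g. { source: "https://js.langchain.com/..." }
console.log(doc[0].pageContent.slice(0, 200)); // first 200 characters of scraped text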

5. Split the Document into Chunks

Large documents need to be broken down. The RecursiveCharacterTextSplitter intelligently splits text, ensuring chunks are of a manageable size for embedding and that some overlap exists to maintain context across splits.

import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 200, // Max characters per chunk
    chunkOverlap: 20 // Overlap to prevent loss of context
});
const splitDoc = await splitter.splitDocuments(doc);
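
A chunkSize of 200 characters is on the small side; values in the 500–1000 range are common in practice, depending on your content and embedding model. To see how the page was divided (an illustrative check):

console.log(`Split ${doc.length} document(s) into ${splitDoc.length} chunks`);
console.log(splitDoc[0].pageContent); // text of the first chunk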

6. Generate Embeddings

Embeddings are numerical representations of text that capture its semantic meaning. GoogleGenerativeAIEmbeddings transforms our text chunks into these vectors.

import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";

const embedding = new GoogleGenerativeAIEmbeddings();
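
The embeddings object is consumed by the vector store in the next step, but it can also be called directly. For example (illustrative only), embedding a single query returns a plain array of numbers:

const vector = await embedding.embedQuery("What is Cheerio?");
console.log(vector.length); // dimensionality of the embedding vector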

7. Create an In-Memory Vector Store

The MemoryVectorStore is where our embedded text chunks live. It allows us to quickly search for chunks that are semantically similar to a given query.

import { MemoryVectorStore } from "langchain/vectorstores/memory";

const vectorStore = await MemoryVectorStore.fromDocuments(splitDoc, embedding);
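
Although the retrieval chain will handle searching for us, MemoryVectorStore also exposes a direct similarity search, which is handy for debugging what actually gets retrieved:

const similar = await vectorStore.similaritySearch("What is Cheerio?", 2);
similar.forEach((d) => console.log(d.pageContent));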

8. Set Up the Retriever

The retriever’s job is to query the vectorStore and fetch the most relevant document chunks. We configure it to retrieve the top 2 (k: 2) most similar documents.

const retriever = vectorStore.asRetriever({
    k: 2,
});
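
You can also invoke the retriever on its own to verify which chunks it returns before wiring it into a chain. In recent LangChain.js versions retrievers are Runnables, so invoke works; older versions use getRelevantDocuments instead:

const relevantDocs = await retriever.invoke("What is Cheerio?");
console.log(relevantDocs.map((d) => d.pageContent));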

9. Build the Retrieval Chain

This is the final orchestration step. The createRetrievalChain combines our document combination chain with the retriever, creating an end-to-end pipeline for answering questions based on retrieved context.

import { createRetrievalChain } from "langchain/chains/retrieval";

const retrieverChain = await createRetrievalChain({
    combineDocsChain: chain,
    retriever: retriever
});

10. Invoke the Chain with a Question

Finally, we can ask our system a question! The retrieverChain will handle finding the relevant information and passing it to the LLM to generate an answer.

const response = await retrieverChain.invoke({
    input: "What is Cheerio",
});

console.log(response);

When you run this code, the response object contains the original input, the retrieved context documents (response.context), and the LLM’s answer (response.answer), which is derived from the content scraped from the LangChain.js documentation about Cheerio.

Conclusion

This example demonstrates the power of LangChain.js in building sophisticated LLM applications. By combining web scraping with vector embeddings and retrieval, you can create highly context-aware Q&A systems capable of answering questions based on specific, up-to-date information from virtually any website. This opens up a world of possibilities for custom chatbots, knowledge assistants, and more!

Tags: LangChain, LLM
