
A Large Language Model based Web Application for Contextual Document Conversation


Article Information

Title: A Large Language Model based Web Application for Contextual Document Conversation

Authors: Asad Khan, Abdul Haseeb Malik, Talha Ahsan, Qazi Ejaz Ali

Journal: International Journal of Innovations in Science & Technology

HEC Recognition History
Category  From        To
Y         2024-10-01  2025-12-31
Y         2023-07-01  2024-09-30
Y         2021-07-01  2022-06-30

Publisher: 50SEA JOURNALS (SMC-PRIVATE) LIMITED

Country: Pakistan

Year: 2024

Volume: 6

Issue: 4

Language: English

Keywords: Natural Language Processing, Application Learning, AI Bots, Large Language Model, Contextual Document Conversation

Abstract

The emergence of LLMs such as ChatGPT, Gemini, and Claude has ushered in a new era of natural language processing, enabling rich textual interaction with computers. Despite their capabilities, however, these models face significant challenges when queried about recent information or private data not included in their training sets. Retrieval Augmented Generation (RAG) addresses these problems by augmenting user queries with relevant context drawn from user-provided documents, thus grounding the model's responses in accurate source material. In research, RAG lets users engage with their documents interactively rather than reading through them manually. Users provide their documents to the system, which converts them into vector indices and uses these at retrieval time to inject contextual information into the user prompt. The augmented prompt then enables the language model to answer user queries in context. The research comprises a web application with an intuitive interface for interacting with Llama 3.2 1B, an open-source LLM. Users can upload their documents and chat with the LLM in the context of those documents.


Research Objective

To develop a web application that leverages Retrieval Augmented Generation (RAG) to enable users to converse with a Large Language Model (LLM) in the context of their uploaded documents, thereby improving document comprehension and information retrieval.


Methodology

The research proposes a web application with several components: Authentication (Auth), Chat, User Management, and Feedback. The system converts user documents into text chunks, generates embeddings, and stores them in a vector database. User queries are used to retrieve relevant text chunks, which are combined with the user's prompt and fed to an LLM (Llama 3.2 1B) for contextual responses. The development environment uses TypeScript, Langchain.js, Node.js, and Ollama, with Visual Studio Code as the IDE and Git for version control. The application is deployed on Google Cloud Platform (GCP) using Ubuntu, Nginx, and Express.js for the backend, and Netlify for the frontend. Testing covers unit, integration, and end-to-end tests for both the frontend and the backend.
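The paper's source code is not reproduced in this summary; the sketch below shows how the ingestion half of the described pipeline (chunking, embedding, vector storage) might look in the stack the authors name (TypeScript with Langchain.js and Ollama). The chunk sizes, the "nomic-embed-text" embedding model, and the in-memory vector store are assumptions standing in for details the summary does not specify.

// Ingestion sketch: document text -> chunks -> embeddings -> vector store.
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OllamaEmbeddings } from "@langchain/ollama";

// Chunk sizes are illustrative; the overlap preserves context across
// chunk boundaries so retrieved passages read coherently.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});

export async function indexDocument(text: string): Promise<MemoryVectorStore> {
  // 1. Split the uploaded document into overlapping text chunks.
  const chunks = await splitter.createDocuments([text]);
  // 2. Embed each chunk with a locally served Ollama embedding model
  //    ("nomic-embed-text" is an assumption; the paper does not name one)
  //    and store the vectors for similarity search at query time.
  const embeddings = new OllamaEmbeddings({ model: "nomic-embed-text" });
  return MemoryVectorStore.fromDocuments(chunks, embeddings);
}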

Methodology Flowchart
graph TD;
    A[User Uploads Document] --> B[Document Processing: Chunking & Embedding];
    B --> C[Store in Vector DB];
    D[User Query] --> E[Retrieve Relevant Context from Vector DB];
    E --> F[Augment Prompt with Context & Query];
    F --> G[LLM Generates Response];
    G --> H[Display Response to User];
    I[User Feedback] --> J[Feedback Storage];
    K[Admin Actions] --> L[User Management];
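Continuing the sketch above, the query-time half of the flowchart (retrieve, augment, generate) might look as follows. The prompt template and the top-k value are assumptions; "llama3.2:1b" is the usual Ollama tag for the Llama 3.2 1B model the paper uses.

// Query-time sketch: retrieve relevant chunks, augment the prompt, generate.
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { Ollama } from "@langchain/ollama";

// Llama 3.2 1B served locally by Ollama.
const llm = new Ollama({ model: "llama3.2:1b" });

export async function answer(store: MemoryVectorStore, query: string): Promise<string> {
  // Retrieve the chunks most similar to the query (k = 4 is illustrative).
  const docs = await store.similaritySearch(query, 4);
  const context = docs.map((d) => d.pageContent).join("\n---\n");
  // Augment the user prompt with the retrieved context so the answer is
  // grounded in the uploaded document rather than the model's weights.
  const prompt =
    "Answer the question using only the context below.\n\n" +
    `Context:\n${context}\n\nQuestion: ${query}`;
  return llm.invoke(prompt);
}

A request handler would call indexDocument once per upload and answer once per chat turn, with the vector store keyed to the user's session.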

Discussion

The paper discusses potential improvements to the RAG workflow, including advanced chunking methods, metadata integration, handling structured data, query rewriting and decomposition, query-to-document expansion, routing subsystems, re-ranking mechanisms, context consolidation, and multi-modality. Practical implications highlight improved user interaction with documents, efficient information retrieval, and enhanced user experience through summarization and generative tasks.
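As a concrete illustration of one of these ideas, a re-ranking stage that rescores retrieved chunks before prompt assembly might look like the sketch below. This is a generic example, not the paper's method: the over-fetch factor, the lexical-overlap heuristic, and the function names are all assumptions.

// Re-ranking sketch (generic illustration, not the paper's method):
// over-fetch candidates by vector similarity, then rescore them with a
// simple lexical-overlap heuristic so exact query terms get a boost.
import type { Document } from "@langchain/core/documents";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// Fraction of a chunk's words that appear in the query.
function lexicalOverlap(query: string, text: string): number {
  const terms = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  const words = text.toLowerCase().split(/\W+/);
  const hits = words.filter((w) => terms.has(w)).length;
  return hits / Math.max(words.length, 1);
}

export async function rerank(store: MemoryVectorStore, query: string, keep = 4): Promise<Document[]> {
  // Over-fetch 3x the needed chunks, then keep the best by lexical score.
  const candidates = await store.similaritySearch(query, keep * 3);
  return candidates
    .map((doc) => ({ doc, score: lexicalOverlap(query, doc.pageContent) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, keep)
    .map((s) => s.doc);
}

A production re-ranker would more likely use a cross-encoder model, but the structure (over-fetch, rescore, truncate) is the same.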


Key Findings

The developed web application successfully enables users to interact with LLMs in the context of their own documents, overcoming limitations of LLMs regarding recent information and private data. The RAG approach enhances accuracy by grounding responses in provided context and broadens LLM applicability by allowing interaction with private data.


Conclusion

The proposed LLM-based web application with RAG effectively addresses the limitations of traditional LLMs by providing contextual understanding of user-provided documents. Future work includes handling picture-rich PDFs, adding a download option for questions, and conducting tests with diverse user groups.


Fact Check

* The paper was published in the International Journal of Innovations in Science & Technology (IJIST), Vol. 06 Issue. 04 in December 2024. (Confirmed by citation and publication date)
* The application uses the Llama 3.2 1B LLM. (Confirmed in Methodology section)
* The development environment uses TypeScript, Langchain.js (version 0.3.7), Node.js (version 22.12.0), and Ollama (version 0.3.4). (Confirmed in Development Environment section)

