Simple_RAG

The Problem

When you ask a large language model for information, it can only draw on what it learned during training, which can lead to outdated answers or hallucinations. A RAG (Retrieval-Augmented Generation) system addresses this problem by matching the user's query against information stored in a database and giving the relevant results to the model along with the question.

Use instructions

This project was designed to run on a Mac with Apple's M-series chips. Other devices may not work well, or at all.

Install the packages: pip install chromadb mlx_embeddings mlx_lm
Download the most recent Wikipedia dump and its index file (use the multi-stream version)
Run the program: python rag-qa.py

The first run takes a few hours because the program needs to vectorize all of the article titles before the system can work.

When you are done asking questions, type "END" in all caps to end the conversation.

How it works:

Vectorize article names if the vector database is empty

I used all-MiniLM-L6-v2 with 4-bit quantization as my embedding model because it runs well on my Mac and because the article titles are only a few words long, so TF-IDF would produce very sparse vectors that don't carry enough information to be useful. The dense embeddings from this model give me more information to work with.
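
To illustrate the sparsity point, here is a small standard-library sketch. It is not taken from rag-qa.py: the helper names `tfidf_vectors` and `cosine` and the three example titles are made up for illustration. It shows that two short titles with no words in common get a TF-IDF similarity of exactly zero, even when they mean nearly the same thing.

```python
import math
from collections import Counter


def tfidf_vectors(titles):
    """Return one sparse TF-IDF vector (dict of term -> weight) per title."""
    docs = [Counter(t.lower().split()) for t in titles]
    n = len(docs)
    # Document frequency: in how many titles each term appears.
    df = Counter(term for d in docs for term in d)
    # Smoothed IDF so a term that appears in every title keeps a nonzero weight.
    idf = {term: math.log((1 + n) / (1 + c)) + 1 for term, c in df.items()}
    return [{term: tf * idf[term] for term, tf in d.items()} for d in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical example titles, chosen only to show the effect.
titles = ["Automobile", "Car", "Car rental"]
vecs = tfidf_vectors(titles)

sim_synonyms = cosine(vecs[0], vecs[1])  # no shared words
sim_shared = cosine(vecs[1], vecs[2])    # both contain "car"
print(sim_synonyms, sim_shared)
```

A dense embedding model does not depend on exact word overlap, which is the reason given above for preferring it on titles this short.
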
I only vectorized the titles because the fully decompressed article text is around 100 GB, and I don't have the storage or time to chunk and vectorize everything. I also couldn't just give the model all of the titles, because there are around 7 million articles on Wikipedia and running them through the model for every question would be wasteful.
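
For context on how article text can stay compressed on disk: the multi-stream dump is a series of independent bz2 streams, and each line of the index file has the form `offset:page_id:title`, where `offset` is the byte position of the stream that contains the page. The sketch below shows the general idea of decompressing a single stream on demand. It is an assumption about the mechanism, not a copy of what rag-qa.py does; `parse_index_line` and `read_stream` are hypothetical names, and the two-page "dump" is built in memory so the example is self-contained.

```python
import bz2
import io


def parse_index_line(line):
    """Split 'offset:page_id:title'; titles may themselves contain colons."""
    offset, page_id, title = line.rstrip("\n").split(":", 2)
    return int(offset), int(page_id), title


def read_stream(dump, offset, chunk_size=64 * 1024):
    """Decompress only the single bz2 stream that starts at `offset`."""
    dump.seek(offset)
    decomp = bz2.BZ2Decompressor()
    parts = []
    while not decomp.eof:  # stop at the end of this stream, not the file
        chunk = dump.read(chunk_size)
        if not chunk:
            break
        parts.append(decomp.decompress(chunk))
    return b"".join(parts).decode("utf-8")


# Build a toy two-stream "dump" and its index in memory (made-up pages).
pages = [
    "<page><title>Alpha</title><text>First article.</text></page>",
    "<page><title>Beta: A Story</title><text>Second article.</text></page>",
]
dump_bytes = b""
index_lines = []
for page_id, xml in enumerate(pages, start=1):
    title = xml.split("<title>")[1].split("</title>")[0]
    index_lines.append(f"{len(dump_bytes)}:{page_id}:{title}")
    dump_bytes += bz2.compress(xml.encode("utf-8"))

# Map each title to the byte offset of its stream, then fetch one page.
offsets = {title: off for off, _, title in map(parse_index_line, index_lines)}
beta_xml = read_stream(io.BytesIO(dump_bytes), offsets["Beta: A Story"])
print(beta_xml)
```

In the real dump each stream holds many pages rather than one, so after decompressing a stream the wanted page still has to be located inside it by title or page ID.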

Run in an infinite loop where every time a user asks a question:
1. The model takes the question and uses it to come up with titles of potentially useful Wikipedia articles
2. These titles are vectorized and then used to search for the most similar Wikipedia article titles
3. The metadata associated with these titles is used to index and find the corresponding articles
4. These articles and the user's question are fed back into the model and it answers the question
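
The four steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the code in rag-qa.py: the language model, the embedding model, the vector database, and the article lookup are all replaced by hypothetical stand-ins (`propose_titles`, `embed`, `title_vecs`, `fetch_article`, `generate`) so that the control flow can run on its own.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_titles(query_vec, title_vecs, k=1):
    """Return the k stored titles whose vectors are most similar to query_vec."""
    ranked = sorted(title_vecs, key=lambda t: cosine(query_vec, title_vecs[t]),
                    reverse=True)
    return ranked[:k]


def answer(question, propose_titles, embed, title_vecs, fetch_article,
           generate, k=1):
    guesses = propose_titles(question)                  # step 1: model guesses titles
    matches = []
    for guess in guesses:                               # step 2: embed and search
        for title in nearest_titles(embed(guess), title_vecs, k):
            if title not in matches:
                matches.append(title)
    articles = [fetch_article(t) for t in matches]      # step 3: look up articles
    return generate(question, articles)                 # step 4: answer with context


def chat(read_line, write_line, **deps):
    """Keep answering questions until the user types END."""
    while True:
        question = read_line()
        if question == "END":
            break
        write_line(answer(question, **deps))


# Toy stand-ins for the real components (all hypothetical).
articles = {"Paris": "Paris is the capital of France.",
            "Jupiter": "Jupiter is a gas giant."}
deps = dict(
    propose_titles=lambda q: ["Capital of France"],
    embed=lambda text: [0.9, 0.1] if "France" in text else [0.1, 0.9],
    title_vecs={"Paris": [1.0, 0.0], "Jupiter": [0.0, 1.0]},
    fetch_article=articles.__getitem__,
    generate=lambda q, docs: f"Based on {len(docs)} article(s): {docs[0]}",
)
questions = iter(["What is the capital of France?", "END"])
replies = []
chat(lambda: next(questions), replies.append, **deps)
print(replies)
```

For interactive use, `read_line` and `write_line` would be `input` and `print`, and the stand-ins would be replaced by calls into the actual model and vector database.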
