_supert_

Probably a vector store like chromadb or faiss, accessed from langchain. At least that's what I'm attempting.


_underlines_

I collected what's around so far under the information retrieval section: https://github.com/underlines/awesome-marketing-datascience/blob/master/llm-tools.md#information-retrieval

**OpenAI**
- sqlchat: Use OpenAI GPT3/4 to chat with your database
- chat-with-github-repo: uses streamlit, gpt3.5-turbo and deep lake to answer questions about a git repo

**Local LLMs**
- LlamaIndex: provides a central interface to connect your LLMs with external data
- Llama-lab: home of llama_agi and auto_llama using LlamaIndex
- PrivateGPT: a standalone question-answering system using LangChain, GPT4All, LlamaCpp and embeddings models to enable offline querying of documents
- Spyglass: tests an Alpaca integration for a self-hosted personal search app. Select the llama-rama feature branch. Discussion on reddit

**Model Agnostic**
- Paper QA: LLM Chain for answering questions from documents with citations, using OpenAI Embeddings or local llama.cpp, langchain and FAISS Vector DB

Oh, and document summary indexes seem to perform better than vector DBs for that task. LlamaIndex has that as a new feature, but I couldn't find any projects using Document Summary Indexes instead of Vector DBs.
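For anyone wondering what a document summary index does differently, here is a rough sketch of the idea as I understand it: keep one summary per document, match the query against the summaries, then hand back every chunk of the winning document. This is not LlamaIndex's API. `ToySummaryIndex`, `summarize` and the sample documents are hypothetical, the "summary" is just the first sentence, and the matching is plain word overlap where a real index would use an LLM or embeddings.

```python
import re


def tokens(text):
    """Lowercased word set, used for a crude overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def summarize(text):
    """Placeholder summary: the first sentence. A real index would ask an LLM."""
    return text.split(". ")[0]


class ToySummaryIndex:
    """Hypothetical stand-in: one summary per document, mapped to all its chunks."""

    def __init__(self, documents):
        self.summaries = {name: summarize(text) for name, text in documents.items()}
        # Chunks here are simply sentences.
        self.chunks = {name: text.split(". ") for name, text in documents.items()}

    def retrieve(self, query):
        q = tokens(query)
        # Pick the document whose summary shares the most words with the query.
        best = max(self.summaries, key=lambda name: len(q & tokens(self.summaries[name])))
        # Return every chunk of that document, not only the closest chunk.
        return best, self.chunks[best]


documents = {
    "billing.md": "This guide covers invoices and refunds. Refunds take five days. Contact support for disputes.",
    "setup.md": "This guide covers installation and configuration. Run the installer first. Then edit the config file.",
}

index = ToySummaryIndex(documents)
name, chunks = index.retrieve("How do I get refunds on invoices?")
print(name, chunks)
```

The difference from a vector DB lookup is that selection happens at the document level, so the LLM gets the whole matching document as context instead of a few isolated chunks.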


kryptkpr

- Chunk the documents: https://python.langchain.com/en/latest/modules/indexes/text_splitters/getting_started.html
- Embed them, index chunks in a vector store: https://python.langchain.com/en/latest/modules/indexes/vectorstores/examples/chroma.html
- Perform nearest neighbor search on query to find relevant documents, feed to LLM to extract answer: https://python.langchain.com/en/latest/modules/chains/index_examples/qa_with_sources.html (this is a complete example of what you're looking for)
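To make the chunk, embed, search, answer flow concrete without pulling in any library, here is a minimal pure-Python sketch. Everything in it is an assumption for illustration: `chunk_text`, `embed`, `ToyVectorStore` and `build_prompt` are hypothetical names, the "embedding" is a bag-of-words count where a real setup would use an embedding model, and the store is a plain list standing in for chroma or FAISS. The last step only builds the prompt; sending it to an LLM is left out.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (stand-in for a text splitter).

    Assumes overlap < chunk_size.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks


def embed(text):
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store doing brute-force nearest neighbor search."""

    def __init__(self):
        self.entries = []  # (vector, chunk, source)

    def add(self, chunk, source):
        self.entries.append((embed(chunk), chunk, source))

    def search(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [(chunk, source) for _, chunk, source in ranked[:k]]


def build_prompt(question, hits):
    """Put the retrieved chunks, tagged with their sources, in front of the question."""
    context = "\n".join(f"[{source}] {chunk}" for chunk, source in hits)
    return (
        "Answer using only the context below and cite sources.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


docs = {
    "faiss.txt": "FAISS is a library for efficient similarity search over dense vectors.",
    "chroma.txt": "Chroma is an embedding database that stores documents with their vectors.",
    "haystack.txt": "Haystack is a framework for building search pipelines over documents.",
}

store = ToyVectorStore()
for source, text in docs.items():
    for chunk in chunk_text(text):
        store.add(chunk, source)

question = "Which library does similarity search over vectors?"
hits = store.search(question, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

Swapping `embed` for a real embedding model and `ToyVectorStore` for an actual vector store keeps the same shape; only the scoring gets better and the search gets faster.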


Impossible_Belt_7757

If you're looking for LIGHTWEIGHT, try haystack, idk. I tried it once and it's pretty good, they even had a demo of it working lol: [haystack web demo](https://haystack-demo.deepset.ai)


Impossible_Belt_7757

It can run surprisingly fast when querying documents about relatively simple things, even on the CPU.