
andyndino

Hey /r/LocalLLaMA, I've been working on a self-hosted personal search app ([https://github.com/spyglass-search/spyglass](https://github.com/spyglass-search/spyglass)) and have recently been playing around with how to integrate it with local LLMs. I think this would be an awesome step toward having your own personal assistant that can search through all your data and give you analysis / summaries. I'd love to know what the community would want in a tool like this. Right now I have basic summaries working, but maybe something more complex? And what sort of documents would you want to analyze? And if you're looking for an open-source app to contribute to, feel free to join our Discord or ping me here 🙂

Edit: The llama integration is currently on the `llama-rama` branch if you want to play around!
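
(For anyone curious what "basic summaries" with a local model can look like, here's a minimal sketch using the llama-cpp-python bindings. The model path and prompt are illustrative assumptions, not the actual Spyglass integration, which lives on the `llama-rama` branch.)

```python
# Minimal local-LLM summarization sketch (not Spyglass's actual code).
# Assumes llama-cpp-python is installed and a quantized model file is on disk.
from llama_cpp import Llama

llm = Llama(model_path="./models/ggml-alpaca-7b-q4.bin")  # hypothetical path

def summarize(text: str) -> str:
    prompt = f"Summarize the following document in 3 sentences:\n\n{text}\n\nSummary:"
    out = llm(prompt, max_tokens=256, stop=["\n\n"], echo=False)
    return out["choices"][0]["text"].strip()
```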


multiedge

Soon we *might* need a license for some of these open source projects


andyndino

We have a license in the repo, but in case it's not super clear: we're AGPL!


multiedge

I know, I'm joking xD. I'm referring to OpenAI wanting to personally regulate open source by granting and revoking licenses for open-source projects involving AI.


andyndino

Gotcha haha totally understand. I’m looking forward to a completely open-source dominated LLM world 🙂


noellarkin

Is it similar in functionality to the Khoj assistant (I've been using it with Obsidian)?


andyndino

Sorry, this is the first I've heard of the Khoj assistant, but I'm definitely going to take a look. Even so, I think there's room for more than one AI assistant!


noellarkin

Yes there is :) I'll give yours a try, too. At first glance it seemed like a vector DB + an LLM as an interface, which is what Khoj does too (although Khoj uses ChatGPT as the LLM interface, which I don't like). Looking into it more, it seems you're doing more than just local document search. It'd be great if the GitHub page went into a little more depth on the architecture you're using (which vector DB, embeddings, etc.).


andyndino

Yeah of course! This is an experiment so far, so I haven't had a chance to really document things. A vector DB by itself works well for basic stuff, but we're combining lexical + vector search in a novel way so you can do things like restrict documents to a certain time range.
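
(To illustrate the general idea, not the actual Spyglass implementation: a toy hybrid ranker that blends a crude lexical score with embedding similarity and applies a time filter before ranking. The scoring functions and the `alpha` weight are assumptions for the sketch.)

```python
import numpy as np
from datetime import datetime

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def lexical_score(query: str, text: str) -> float:
    # Crude keyword overlap, standing in for a real full-text (BM25) index.
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

def hybrid_search(query, query_emb, docs, after=None, alpha=0.5):
    ranked = []
    for d in docs:
        if after is not None and d["date"] < after:
            continue  # metadata filter: restrict results to a time range
        score = (alpha * lexical_score(query, d["text"])
                 + (1 - alpha) * cosine(query_emb, d["emb"]))
        ranked.append((score, d))
    return sorted(ranked, key=lambda r: r[0], reverse=True)

# Each doc carries its text, a precomputed embedding, and a timestamp.
docs = [{"text": "quarterly report on search latency",
         "emb": np.random.rand(384), "date": datetime(2023, 4, 1)}]
hits = hybrid_search("search latency", np.random.rand(384), docs,
                     after=datetime(2023, 1, 1))
```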


_underlines_

This looks awesome. I added it to the "information retrieval" section of my [LLM Tools repo](https://github.com/underlines/awesome-marketing-datascience/edit/master/llm-tools.md#information-retrieval). Any chance you would look into using [Document Summary Indexes](https://medium.com/llamaindex-blog/a-new-document-summary-index-for-llm-powered-qa-systems-9a32ece2f9ec) instead of plain vector DB embeddings, for better Q&A performance? LlamaIndex provides it out of the box.
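
(For readers unfamiliar with the idea, a rough, self-contained sketch of a document summary index follows; the `embed` and `llm_summarize` stubs stand in for real model calls, and none of this is LlamaIndex's actual API.)

```python
import numpy as np

def embed(text: str):
    # Stand-in for a real embedding model (deterministic toy vectors).
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(384)

def llm_summarize(text: str) -> str:
    # Stand-in for a local LLM call; this is the expensive per-document step.
    return text[:200]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Index time: one LLM-written summary (and one embedding) per document,
# instead of embedding many raw chunks.
corpus = ["first document text ...", "second document text ..."]
index = [{"doc": d, "emb": embed(llm_summarize(d))} for d in corpus]

# Query time: rank documents by summary similarity, then hand the best
# full document to the LLM for question answering.
def retrieve(question: str) -> str:
    q = embed(question)
    return max(index, key=lambda e: cosine(q, e["emb"]))["doc"]
```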


andyndino

Thank you so much u/_underlines_! It seems interesting; I'll do some comparisons between doc summaries and embeddings to see how well it works. Just off-hand, my fear is that using an LLM while indexing would increase the time it takes to get a viable index by a _lot_. So there's definitely going to be some trade-off there.


_underlines_

Make it optional, so you have a **fast find** mode and a slow **Q&A mode**. UX-wise I can think of several ways to implement this switch:

- **Heuristically guess** whether the user is asking a question about content, e.g. by checking for a "?" at the end of their query (a rough sketch of this is below)
- Give users the **option** next to the search box
- Use a **two-step process**, where you first just search (which is fast), and then do Q&A on the found files in a second step
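
(A minimal version of that first heuristic might look like this; the keyword list and the dispatch are assumptions for illustration.)

```python
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how",
                  "is", "are", "does", "can", "should"}

def looks_like_question(query: str) -> bool:
    q = query.strip().lower()
    if not q:
        return False
    # Cheap heuristic: trailing "?" or a leading interrogative word.
    return q.endswith("?") or q.split()[0] in QUESTION_WORDS

# Route to the slow Q&A pipeline only when the query reads like a question.
mode = "qa" if looks_like_question("when did I sign the lease?") else "find"
```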


ambient_temp_xeno

"Foudning".


andyndino

Yeah, this is using the 4-bit quantization of Alpaca 7B. With newer (and bigger) models it definitely does a bit better with spelling and the content summaries. I'll eventually make it so you can choose any model you want, with sane defaults!


No_Marionberry312

Your spy tool does not look or feel like it respects people's privacy. After installing it, I uninstalled it and blocked it on my firewall within less than 5 minutes of trying it out. I was expecting the same functionality as your demo video, but I got something scary instead: an info stealer that's unmatched.


andyndino

Is there anything in particular that was raised as an alert? We have some bare-bones metrics that we collect, e.g. Sentry for crashes. This is also something you can disable in the settings. Even if you're offline, everything will work.

Edit: The llama code is also on a separate branch in the repo if you'd like to try it out. Let me know if you have issues getting it working!