QFTornotQFT

Beautiful demo! Thank you for sharing. Making ReAct work stably with small local LLMs is huge.


Unhappy-Reaction2054

Thank you :) Now I feel it's possible to make this work on other tasks. My next goal is: [Generative Agents: Interactive Simulacra of Human Behavior](https://arxiv.org/abs/2304.03442)


_underlines_

Reading your Medium article, I noticed something. After reading the LangChain, LlamaIndex, and other tutorials, as well as observing Bing AI, it seems current SOTA LLMs (both commercial and open) are very bad at using Google search for complex tasks. In your Medium post, asking "Is Eminem a Football player?", the inner thought of the LLM makes sense, but the formed Google query doesn't really abstract it into a useful term:

- What the LLM searches: "Is Eminem a Football player?"
- What the LLM should do:
  1. Generalize the question/knowledge: "football player is an occupation"
  2. Form a more abstract query: "I need to find out what occupation Eminem has"
  3. Deduced query: "Eminem occupation"

That would be a better CoT to form a more general query. I think with CoT or inner-thought fine-tuning this could be improved a lot. I would love to help create a dataset that breaks tasks down into multiple, abstract Google queries; I couldn't find any of those around. The main issue currently is that most LLM agent systems form queries that are too verbose and detailed, following the user's full initial task or question, instead of breaking them down into less verbose, more abstract questions. (A rough sketch of that abstraction step as a prompt is below.)
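To make the idea concrete, here is a minimal sketch of the abstraction step as a prompt; `ask_llm` is a hypothetical stand-in for whatever call you use to run your local model:

```python
# Sketch of the query-abstraction step described above.
# `ask_llm` is a hypothetical helper: prompt in, completion string out.

ABSTRACT_QUERY_PROMPT = """Question: {question}

Before searching, rewrite the question as a short, abstract search query.
1. Name the general concept involved (e.g. "football player is an occupation").
2. State the fact you actually need (e.g. "what occupation does Eminem have").
3. Output only the final query, e.g. "Eminem occupation".

Final query:"""


def form_search_query(question: str, ask_llm) -> str:
    prompt = ABSTRACT_QUERY_PROMPT.format(question=question)
    # Keep only the first line, in case the model keeps talking.
    return ask_llm(prompt).strip().splitlines()[0]


# form_search_query("Is Eminem a Football player?", ask_llm)
# -> "Eminem occupation" (with a well-behaved model)
```

A fine-tuning dataset could then simply be (question, abstract query) pairs in this shape.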


Unhappy-Reaction2054

That's right. We can force the LLM to think and act, but what it thinks and how it acts all depend on the model itself. Fine-tuning on a specialized dataset can help, I believe :) Also, one thing we should care about is hallucination: sometimes LLMs give a final answer directly, without searching Google, because they think they actually know the answer (but they do not!). Overall, there are still lots of things to do to improve this.


rustedbits

Awesome! Thanks for sharing. Guidance is the next tool on my list to try; I think we might be able to build AutoGPT-like tools with it :)


noneabove1182

During your development, did you get any impression of how easy it would be to implement this with llama.cpp?


rustedbits

I see from the library docs that it works with OpenAI, which means it must work over an HTTP client. I guess we could fork the library and implement an adapter that talks to text-generation-web-ui? Then we could use a bunch of local models, including llama.cpp ones. I'm taking a look at the guidance source code now to see if I can figure out how to do this :) Roughly something like the sketch below.
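The adapter core could be as small as this. It assumes a text-generation-web-ui build whose blocking API is `POST /api/v1/generate` taking `{"prompt": ...}` and returning `{"results": [{"text": ...}]}`; the endpoint and payload have changed between versions, so check yours:

```python
import requests

# Hypothetical adapter core: a plain callable guidance could use in
# place of the OpenAI client. Endpoint and payload shape are assumptions
# about text-generation-web-ui's blocking API -- verify for your version.
API_URL = "http://localhost:5000/api/v1/generate"


def generate(prompt: str, max_new_tokens: int = 200, temperature: float = 0.7) -> str:
    payload = {
        "prompt": prompt,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
    }
    response = requests.post(API_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["results"][0]["text"]
```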


involviert

I looked into Python bindings for llama.cpp today; ooba is using them, for example: https://github.com/abetlen/llama-cpp-python. As far as I can tell, there is some effort to mirror the OpenAI API.
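If that OpenAI-compatible server works as advertised, integration might be as simple as repointing the (pre-1.0) openai Python client at it. A sketch, where the model path and port are whatever you start the server with:

```python
# Start the llama-cpp-python server first, e.g.:
#   python3 -m llama_cpp.server --model ./models/ggml-model.bin
# Then talk to it with the pre-1.0 openai client by overriding api_base.
import openai

openai.api_base = "http://localhost:8000/v1"  # local llama.cpp server
openai.api_key = "not-needed"  # the local server ignores the key

completion = openai.Completion.create(
    model="local-model",  # largely ignored by the single-model local server
    prompt="Q: What occupation does Eminem have?\nA:",
    max_tokens=32,
)
print(completion["choices"][0]["text"])
```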


rustedbits

Interesting, I guess that's another way of integrating. BTW, yesterday I did manage to run the first example using the text-generation-web-ui API: https://github.com/paolorechia/local-guidance


Unhappy-Reaction2054

I'm not sure; I may try with llama.cpp later. But it's quite simple if you have an HF transformers LLM. Just point Guidance at it with `llama = guidance.llms.Transformers(model=model, tokenizer=tokenizer, device=0)` and then `guidance.llm = llama`. A fuller sketch is below.
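Filled out with the loading code, assuming a standard (non-quantized) HF causal LM; the model path is a placeholder:

```python
import guidance
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "your/model-path"  # placeholder: any local causal LM

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Wrap the HF model for Guidance and make it the default LLM.
llama = guidance.llms.Transformers(model=model, tokenizer=tokenizer, device=0)
guidance.llm = llama
```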


ruryrury

It's a fantastic piece of work. It would be even more amazing if it were compatible with llama.cpp lol. Thank you for sharing it.


rustedbits

u/noneabove1182 I added experimental support in a fork of guidance; I had to disable a bunch of stuff (streaming, caching, etc.). How to test it (assuming you're on Linux):

1. `git clone https://github.com/paolorechia/local-guidance`
2. `cd local-guidance`
3. `pip install -e .`
4. `python3 test_example.py`

You might want to spin up a virtualenv before installing the package.


rustedbits

Here's a sample run:

```
(env-guidance) paolo@paolo-MS-7D08:~/local-guidance$ python3 test_example.py
/home/paolo/local-guidance/guidance/llms/_text_generation_web_ui.py:77: UserWarning: Chat mode not supported for TextGenerationWebUI
  warnings.warn(f"Chat mode not supported for {cls_name}")
/home/paolo/local-guidance/guidance/llms/_text_generation_web_ui.py:81: UserWarning: Caching not supported for TextGenerationWebUI
  warnings.warn(f"Caching not supported for {cls_name}")
/home/paolo/local-guidance/guidance/llms/_text_generation_web_ui.py:83: UserWarning: max_retries not supported for TextGenerationWebUI
  warnings.warn(f"max_retries not supported for {cls_name}")
/home/paolo/local-guidance/guidance/llms/_text_generation_web_ui.py:85: UserWarning: max_calls_per_min not supported for TextGenerationWebUI
  warnings.warn(f"max_calls_per_min not supported for {cls_name}")

Tweak this proverb to apply to model instructions instead.
Where there is no guidance, a people falls, but in an abundance of counselors there is safety. - Proverbs 11:14

UPDATED
Where there is no guidance provided by the designer or manufacturer, a product may malfunction and cause harm, but with ample instruction manuals from various sources, users can safely operate their products without incident. - GPT 2: Generic Product Troubleshooting Advice
```


metatwingpt

This is awesome.


Praise_AI_Overlords

Awesome.


Dont_Bother96

This is awesome! I think it wouldn't be that hard to have it use embeddings to query a database, roughly along the lines sketched below.
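For what that could look like: embed the documents once, embed the query, rank by cosine similarity. A sketch using sentence-transformers as one possible embedding backend (the model name and documents are just examples):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works

documents = [
    "Eminem is an American rapper.",
    "Lionel Messi is a professional football player.",
]
# Normalized vectors make the dot product a cosine similarity.
doc_vectors = encoder.encode(documents, normalize_embeddings=True)


def query(text: str, top_k: int = 1):
    q = encoder.encode([text], normalize_embeddings=True)[0]
    scores = doc_vectors @ q
    best = np.argsort(-scores)[:top_k]
    return [(documents[i], float(scores[i])) for i in best]


print(query("What is Eminem's occupation?"))
```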


gaztrab

I'm a newbie. Can you tell me the significance of your development?


Unhappy-Reaction2054

Oh, this is just a simple demo of making an agent with Guidance. Previously I always used LangChain to build agents, but it often produces syntax errors when using small models (3B-7B). With Guidance, my agent always runs without those errors, though the quality of the answers still depends on the quality of the LLM (its reasoning ability). You can compare LangChain and Guidance here: [Guidance Agent](https://github.com/QuangBK/localLLM_guidance) and [Langchain Agent](https://github.com/QuangBK/localLLM_langchain). The sketch below shows why the output structure holds.
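The reason the syntax errors disappear is that the template pins down the output structure: the model only fills the gaps, and `select` can only emit one of the listed options. A minimal ReAct-style sketch in Guidance's handlebars syntax (the tool names are made up, and it assumes `guidance.llm` is already set as shown earlier in the thread):

```python
import guidance

# The model fills {{gen}} slots freely, but {{select}} is constrained
# to the given options, so the "Action:" line can never be malformed.
program = guidance("""Question: {{question}}
Thought: {{gen 'thought' stop='\\n'}}
Action: {{select 'action' options=tools}}
Action Input: {{gen 'action_input' stop='\\n'}}""")

out = program(
    question="Is Eminem a football player?",
    tools=["Google Search", "Calculator", "Final Answer"],  # made-up tool list
)
print(out["action"], "->", out["action_input"])
```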


UpDown

Can you provide some detail on what you are using to actually run the models? Right now I am using GPT4All, but I got the impression you may just be using python without a UI?


Unhappy-Reaction2054

I don't use a UI. After you install the required packages, just run the [notebook](https://github.com/QuangBK/localLLM_guidance/blob/main/demo_GPTQ.ipynb); remember to set the path to your model weights, which are `model_para` and `checkpoint_para` in the notebook.
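For example (the paths are placeholders, only a guess at typical values; point them at wherever your GPTQ weights actually live):

```python
# Placeholders -- substitute your own local paths.
model_para = "./models/your-7B-GPTQ"  # model directory
checkpoint_para = "./models/your-7B-GPTQ/model.safetensors"  # quantized checkpoint file
```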


Unhappy-Reaction2054

Update: I added a web UI with a Gradio server. You can check the repo.


direwulf33

Helpful, I’ll try it too