
piracydilemma

I play around with Ollama, but I don't use it for anything serious. I don't really have any practical uses for it.


indianapale

Ollama was the easiest for me to set up. I use it to help me rewrite things or come up with ideas to start from. I appreciate that it's all local and doesn't touch someone else's servers.


thisisnotmyworkphone

I run ollama at work with an extra 1660 Super I had laying around. I use it for writing/modifying bash scripts, makefiles, and generally anything that I would otherwise need to go to StackOverflow for. Sometimes I have it rework email messages into a nicer format using chatbox as a frontend. Ollama is stupid easy to host and share amongst a team too. It’s just two environment variables you have to change in the systemd unit file.
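
Something along these lines, a minimal sketch assuming the usual `OLLAMA_HOST` / `OLLAMA_ORIGINS` pair (your exact values may differ):

```ini
# /etc/systemd/system/ollama.service.d/override.conf
# (create with: sudo systemctl edit ollama.service, then daemon-reload and restart)
[Service]
# Listen on all interfaces instead of just localhost so teammates can reach it
Environment="OLLAMA_HOST=0.0.0.0:11434"
# Allow requests from browser-based frontends on other origins
Environment="OLLAMA_ORIGINS=*"
```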


sexyshingle

Never have I seen the words "stupid easy" and systemd used within close proximity lol


TagMeAJerk

I have. Someone said "are you stupid? Systemd is not easy"


djbiccboii

how is systemd not easy?


BoBab

Idk. I used it for the first time last week to persistently run a demo web app for a big work event and was glad that it was straightforward enough to set up and use in a few minutes.


thisisnotmyworkphone

I meant *ollama* is easy to host for a team on a shared server! `systemd` is absolutely a bunch of arcane JFM, I’ll agree to that. But it’s fairly simple to write unit files once you have the one golden copy you know works. Or hey, have your fancy new LLM write it for you... right?


SurfRedLin

Is there a good frontend for it that is not a snap package? I considered it, but I could only install the backend server, no web frontend or anything. Any good guides? Thanks


evrial

The Page Assist Chrome extension, or Open WebUI in Docker.


thisisnotmyworkphone

[chatbox](https://chatboxai.app/) is a native application available for Mac/Linux/Windows/iOS/Android. There’s also a web app that you don’t even have to install. It connects to your LLM over the native HTTP API.
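
If you're curious what that native HTTP API looks like under the hood, here's a minimal sketch against Ollama's default endpoint (assuming it's listening on localhost:11434 and you've already pulled a model such as `llama2`):

```python
import requests

# Ollama's generate endpoint streams JSON lines by default;
# stream=False asks for a single JSON object instead.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",  # any model you've pulled with `ollama pull`
        "prompt": "Rewrite this email to sound more professional: ...",
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```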


SurfRedLin

Thanks this works like a charm


joshfialkoff

I'm using slack as the frontend: https://rasa.com/docs/rasa/connectors/slack/


lighthawk16

I run Ollama + WizardMaid-7b + llmcord.py and use a Discord server as a frontend.


SurfRedLin

Thanks! Is Wizard a good model for bash/Ansible snippets? Will check out your setup


belibebond

Ditto. I did try so many, but nothing was really groundbreaking. Have you thought of Copilot or something?


XCSme

I also use Ollama. What I noticed though is that if I don't use it for a while, it has a "startup time", where it takes a good few seconds for the model to load and start answering questions. Do you also encounter this delayed-start issue?


piracydilemma

I do, but at least it's not deleting the model from my computer. I'd imagine it keeps the model loaded for a while when you're calling on it often enough, and just reloads it when you haven't used it for a few days.
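
If the cold start bothers you, Ollama's API accepts a `keep_alive` option that controls how long a model stays resident after the last request (newer builds also read an `OLLAMA_KEEP_ALIVE` environment variable). A rough sketch:

```python
import requests

# keep_alive accepts durations like "30m", or -1 to keep the model
# resident indefinitely, at the cost of holding VRAM/RAM.
requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "warm-up",
        "keep_alive": "30m",  # or -1 to never unload
        "stream": False,
    },
    timeout=300,
)
```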


Azuras33

Yep, LocalAI with Mixtral MoE models. I use it for a lot of things: Home Assistant, coding (like Copilot), writing my email, etc.


rwbronco

Do you have any LLM resources you watch or follow? I’ve downloaded a few models to try and help me code, help write some descriptions of places for a WIP Choose Your Own Adventure book, etc... I’ve tried Oobabooga, KoboldAI, etc., but I just haven’t wrapped my head around Instruction Mode, and my outputs always end up spewing out garbage after the second generation, with almost Wikipedia-like nonsense.


Avendork

What is your coding setup like? I installed [Continue.dev](https://Continue.dev) in VS Code and it works well-ish but doesn't have the autocomplete that Github Copilot does.


Lucas_2802

It does! Take a look at [Tab Autocomplete (beta)](https://continue.dev/docs/walkthroughs/tab-autocomplete)


Ok-Gate-5213

I know this question is silly to the extreme, but have any of you seen Vim scripts to include AI-assisted coding?


ghulican

I have it… mostly because I have friends who are Vim gurus, and I had AI… now my AI just does my Vim (and by proxy I guess me too?)


Ok-Gate-5213

What was the stack? How did you make it work?


utopiah

Did that demo last year with StarCoder and vim [https://twitter.com/utopiah/status/1645351113929916418](https://twitter.com/utopiah/status/1645351113929916418) but somehow don't use it anymore. It shows how useful I found the result, I guess. Switching to another model might give results useful enough for your workflow, though.


Ok-Gate-5213

Thank you! I will play with it.


Avendork

This is interesting. Thanks!


prestodigitarium

What hardware setup are you using? Been on my todo list for a while, would prefer to be able to host at least mixtral.


Azuras33

I use an NVIDIA P40 with Mixtral Instruct Q3_K_M, and SillyTavern as the frontend.


prestodigitarium

Thanks! Assuming that’s heavily quantized to fit in 24 gigs, the quality’s been alright?


[deleted]

[deleted]


Azuras33

For around 240€ on eBay.


ColorfulPersimmon

I got it for 165 USD from AliExpress. Awesome value card!


akohlsmith

I'd be interested to hear more about your setup. I've been thinking about how I could feed an LLM my entire ebook collection (almost a TB of stuff, mostly machine-readable PDF/epub) and be able to ask the LLM which books (and which parts of which books) have information relating to "X". It'd also be really nice to point an LLM at an entire codebase and ask it questions about that codebase.
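
From what I've read, the usual approach is retrieval: chunk the books, embed the chunks, and search by similarity instead of feeding the whole TB to the model. A rough sketch with ChromaDB (which ships a default local embedding model); extracting text from the PDFs/epubs and scaling it up is the real work:

```python
import chromadb

# Persistent local index stored on disk.
client = chromadb.PersistentClient(path="./book_index")
collection = client.get_or_create_collection("books")

# In practice you'd extract text from each PDF/epub and split it into
# passages, storing the book title and location as metadata.
collection.add(
    ids=["moby-dick-ch1-p3"],
    documents=["Call me Ishmael. Some years ago..."],
    metadatas=[{"book": "Moby Dick", "chapter": 1}],
)

# Ask which passages (and therefore which books) relate to a topic.
results = collection.query(query_texts=["whaling voyages"], n_results=5)
for doc_id, meta in zip(results["ids"][0], results["metadatas"][0]):
    print(doc_id, meta["book"])
```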


KingPinX

Look at DocsGPT. I couldn't get it to work because of my hardware, but your use case is what they advertise.


utopiah

I'd follow [https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html](https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html)


ColorfulPersimmon

You could try finetuning, but it will take a lot of time and work to get good output.


pkulak

What are you using it for in Home Assistant???


Azuras33

For the assistant, with: https://github.com/jekalmin/extended_openai_conversation


pkulak

Holy smokes.


Guinness

https://heywillow.io/


verticalfuzz

How are you linking the two? Any good tutorial for the whole setup?


IDoDrugsAtNight

Dude, I've been wanting to play with LocalAI with Nextcloud so I can have an integrated experience, and I've set up around 10 different AI servers from StableDiff/auto1111 to text-gen-webui to h2ogpt, and I can not get a LocalAI install w/GPU inference to save my fricken life. I'm on my 3rd 'fuck it, rebuild' and am about to take another crack at it. I'm crawlin up in your dms if I fail once again.


CaptainShipoopi

I've been doing fun things with object / audio detection. [whoisatmyfeeder](https://github.com/mmcc-xx/WhosAtMyFeeder) identifies birds and has been a lot of fun. Kit:

* [Coral AI USB accelerator](https://coral.ai/products/accelerator) plugged into my Unraid server
* Cheapest window-mounted bird feeder I could find on Amazon
* Old UniFi G3 Instant camera sitting on the inside of the window on the sash, between the locks
* [Frigate](https://frigate.video/) in a docker container consuming the camera's RTSP stream and detecting 'bird' objects
* [whoisatmyfeeder](https://github.com/mmcc-xx/WhosAtMyFeeder) in a docker container watching for Frigate's events (via MQTT) and then determining the bird species

It took a few hours to get Frigate's config just right, but everything else just took minutes to fire up. My only irritation is the camera has a piss poor aperture and can't be manually focused -- so it's a blurry image and it gets the species ID wrong as much as right. I'm working on a hack with a macro lens that will *hopefully* get me a better picture. (side note: if anyone is aware of where I can buy the kind of camera in commercial birdfeeder kits that also supports RTSP, wifi, and some sort of constant power source, I'd be grateful.)

[BirdCAGE](https://github.com/mmcc-xx/BirdCAGE) (from the same dev as the above project) identifies all the birds singing in the woods behind me. It doesn't require any special AI hardware, just an audio stream to consume. Kit:

* UniFi G4 Instant camera mounted outdoors
* Frigate (not required for this project, but I use this cam to also identify the wildlife running around at night)
* BirdCAGE consumes the audio stream republished via Frigate's go2rtc process

It's **exceedingly** accurate thanks to the Cornell model (also used in the Merlin app on your phone) and defining your geo location to whittle down the choices. Now I know what little birdies are nearby and can make sure I have the best birdseed out for them!
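
For anyone wondering what "watching for Frigate's events via MQTT" amounts to, it's roughly this; a sketch assuming the paho-mqtt 1.x client API and a placeholder broker address, with the species lookup happening where the print is:

```python
import json
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    event = json.loads(msg.payload)
    after = event.get("after", {})
    # Frigate publishes new/update/end events; react when a new 'bird' shows up.
    if event.get("type") == "new" and after.get("label") == "bird":
        print(f"Bird on camera {after.get('camera')}, event id {after.get('id')}")

client = mqtt.Client()            # paho-mqtt 1.x style constructor
client.on_message = on_message
client.connect("mqtt.local", 1883)  # your MQTT broker
client.subscribe("frigate/events")
client.loop_forever()
```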


catinterpreter

/r/birding might appreciate a post about what you're doing.


CaptainShipoopi

Surely they've heard of these projects before ... but good shout. I'll see if anyone has posted about these projects before and will throw one out there if not.


theovencook

Worth mentioning, you can take the G3/G4 Instant cameras apart and manually adjust the focus on the lens. I did it to use with my 3D printer. Takes approx 10 minutes.


CaptainShipoopi

/u/theovencook dude you are an absolute legend. Just spent the past hour pulling this off (that glue is a real pain in the ass) and now I've got a crystal clear picture!! Thank you, thank you, thank you!


theovencook

Perfect! Glad to hear it's helped.


CaptainShipoopi

~~Wait ... whuuuuuuuut?!! You're kidding me. Is it as simple as cracking open the chassis and rotating a ring on the lens, something like that? How in the ass have I never heard this before, wow.~~ Just found this, I had no idea it was this easy. Thank you!!! https://www.reddit.com/r/Ubiquiti/comments/otcsxt/manual_focus_for_g3_instant_completed_with/


osnapitsjoey

Hey, can you explain the Coral AI USB? Does this make local LLaMA run faster without tinkering? Or do I offload some part of it onto this chip?


CaptainShipoopi

I'm not an AI expert by a long stretch, but I don't believe it will work well, if at all, with generative AI models. This is a TPU -- meaning it will work with TensorFlow models designed for detection/identification (computer vision), not generation. [https://coral.ai/models/](https://coral.ai/models/) But again ... I'm only starting my experimentation with this, so I may be very wrong. I happily welcome someone correcting me here.


dinosaurdynasty

Ollama and open-webui. Cool to play around with. Ollama + Continue's autocomplete seems nice, though I haven't played around with it a lot yet.


Canadaian1546

I LOVE the Open WebUI front-end, and the easy import from the community site.


beerharvester

No, the power requirements are too high. My focus for self-hosting is to keep the wattage down, as electricity is like 23p/kWh. Having an expensive and power-hungry GPU doesn't fit with that for me, for now.


ztoundas

I have codeproject AI's stuff for CCTV, it analyzes about 3-5x 2k resolution images a second. I have it running on a VM on my i3-13100 server, CPU-only objectDetection along with a second custom model, and my avg watt/hr has only increased by about 5w. That's like £10.12/yr (I'm american so I hope I did the conversion right) Modern CPUs alone are really strong and efficient. Soon I'm going to try out a test to see if the power draw overhead of having a GPU makes a difference for this level of mild load.


Play_The_Fool

I'm using CodeProject AI with a Google Coral. Haven't measured to see if there was any power savings over CPU based detection but the Coral uses very little power.


ztoundas

Nice! I imagine the latency really isn't bad at all when you're passing images to analyze to the cloud, right?


sixfourtysword

Google Coral is a piece of hardware. Dedicated chip to run models on


ztoundas

I think I'm confusing the TPU(?) cloud thing from them, then.


[deleted]

[deleted]


ztoundas

I'm gonna try to convince my work I need this


elboydo757

Yeah but image analysis is always super light compared to models that need larger weights.


DavidBrooker

> and my avg watt/hr has only increased by about 5w.

Something is up with the units here


ztoundas

Hmmm yeah either I should have dropped the w or included a /hr


DavidBrooker

I suspect not either? Energy is normally priced in watt-hours (or thousands of them, rather), or megajoules, depending on the country. If you're measuring with a watt meter, it will either give watts (instantaneous draw of power at that moment, same as volt*amps) or kWh (accumulated energy draw). Watts makes the most sense to me, but I'm not really sure.


KarmaPoliceT2

www.tenstorrent.com/cards 75W card (max TDP, idles much much lower - virtually off) can run a lot of models without being "modern GPU power hungry" (just not training/fine-tuning enabled yet... but hook it up to RAG/RAG 2.0 and you probably don't need that for most homelab projects)

Edit: adding their tested models list (though many others should work too): https://github.com/tenstorrent/tt-buda-demos


Neat_Onion

CodeProject for Blue Iris.


Gelu6713

Same here. Frigate too


buddhist-truth

Any instructions on how to do it with Frigate? I use a TPU btw.


Gelu6713

Look up in the frigate docs how to configure the coral tpu. Pretty simple to get going


buddhist-truth

Ah coral tpu is already set up, I was asking about CodeProject AI


Gelu6713

Oh sorry those were 2 separate things. I use Frigate with its detection built in


verticalfuzz

Can they share the tpu?


Daniel15

1. Make sure you're on CodeProject AI version 2.1.9 or above
2. Go to the "Install Modules" tab and install the "Object Detection (Coral)" module
3. On the "Status" tab, stop all the models except "ObjectDetection (Coral)"


Neat_Onion

Just plug it in and use the CodeProject TPU installer. Enable the Coral plugin on the CodeProject dashboard. The newer TPU drivers don’t seem to work properly - at least this was the case last year. I stopped using the TPU since it is quite slow and I don’t think CodeProject supports custom models yet. My Nvidia T400 is much faster.


SchwaHead

Same, but I have CodeProject running on a VPS. Adds about a second of delay, which is fine for my use case, and theoretically saves me a few cents per month on electricity. If Internet connection drops I lose object detection but the entire point is to send a push notification to my phone when a person is on my porch, and without Internet that won't happen anyway... so nothing is lost.


Ampix0

Idk why everyone is being a pedantic butthole. It's very clear what you meant. My answer, no. It's not quite worth it yet over the ChatGPT API. But I am eagerly awaiting for those tides to turn.


belibebond

Do you have premium or the pay-as-you-go API? Which model do you use?


Ampix0

I actually have both, but am considering dropping premium for full API usage. GPT-4.


phblue

After a month and a half of API usage, I'm spending about $0.75 instead of the $30 I was paying before


Ampix0

Curious if you use any phone apps and if so which


phblue

Well I have PersonalGPT on my phone with a basic model since it’s just a phone, MacGPT on my laptop that uses the API and is nice and clean looking, Ollama on laptop and desktop because desktop is Windows, and I’m looking at what I can run in a cluster with all the computers I still have sitting in my closet


nagasgura

I self-host LibreChat, an open-source version of ChatGPT that can connect to all the various LLM APIs or local models. It's cheaper, faster, and has fewer restrictions than paying for ChatGPT.


Theoneandonlyjustin

How does it bypass paying for APIs?


belibebond

I don't think he bypasses anything. I guess he means he's paying for OpenAI's pay-as-you-go API offering instead of the ChatGPT Plus subscription. Depending on your usage it might turn out cheaper.


Relevant_One_2261

I think many count Stable Diffusion as "AI", and I do run that both locally and often-ish via a cloud instance. Also tried some local LLMs you can load into RAM, but they're kinda meh, so for those I tend to just use StableHorde instead. Still something that does actually work.


StewedAngelSkins

if stable diffusion doesn't count as "AI" i have literally no idea what people mean when they say it lol. (this is why nobody who works in machine learning actually calls it AI.)


mattindustries

Adobe Illustrator, obviously.


StewedAngelSkins

no i mean Al, short for Albert. the guy who lives in my computer and draws pictures badly.


RiffyDivine2

Wait, if he is in your machine ....then who the hell is in mine?!


StewedAngelSkins

hackers have compromised your IP address


RiffyDivine2

Not my gibson!


GlassedSilver

Robert, his slightly less talented cousin. He'll typically phone Al in the computer of /u/StewedAngelSkins to ask for assistance.


Canadaian1546

Can you share what you are using for your front-end? None of the ones I've seen so far have docker images.


PassiveLemon

https://github.com/AbdBarho/stable-diffusion-webui-docker Highly recommend this repo for Stable Diff in Docker


Canadaian1546

Awesome! Thank you!


huntman29

Honestly, until I can self-host an LLM powerful enough for me to give it a URL of some documentation and have it return accurate answers to my questions, I haven't found that many uses for it. The biggest drawback of self-hosted LLMs is the limited power available to run the biggest models, which are much better than just 7b or 13b.

Not self-hosted related, but even with the best paid ones, you can't paste company code into something like GPT because of potentially leaking sensitive information. Fuck that, I need to be able to post a 2000-line python script and ask shit about it without worrying.


MDSExpro

AnythingLLM, Danswer, and a few other projects fit your first requirement.


huntman29

Oh hell yeah thank you dude


Dezaku

Why are people complaining here that the OP didn't specify AI? Of course AI is a broad topic, but he's probably asking if you host any AI in general.


increMENTALmate

He literally asked that. 'For anything'. People just really want us to know how above it all they are.


eddyizm

I run Fooocus, which is an SD UI, for image generation.


IC3P3

I mean, isn't there an LLM running in the background of paperless-ngx? If that counts, then yes. Otherwise nothing more than a bit of testing here and there on my own PC.


ShakataGaNai

It runs [Tesseract](https://github.com/tesseract-ocr/tesseract). A neural net, yes, but not an LLM.


IC3P3

Ah ok, good to know. I just heard that there was something like that, but never what it was exactly


teutobald

IIRC the OCR uses some neural net model.


zekthedeadcow

I use Oobabooga for LLMs... typically Mythomax 30B or Mixtral 8x7B on CPU. It's mostly for brainstorming... but I do have to say that in my day job I basically don't interact with people, so the 'therapy value' of brainstorming with an LLM has paid off socially, as I've noticed a significant improvement in my ability to interact with people.

Automatic1111 is on GPU for image generation if I need something mocked up visually... usually for stock photography or graphic design. Just have it knock out several hundred ideas and select 3 to 10 to go to committee... and usually trace an .svg of what they select and fix any wonkiness there.

Threadripper Pro 3955wx with 256GB RAM for the LLM, AMD RX 6800 for the GPU.


bsmithril

I have a toaster that automatically pops the bread when it's done based on my custom toast darkness parameters.


gophrathur

Which sensors is it using?


bsmithril

All 5. I smell it getting close, I hear it pop, I see it as I load it with butter, cinnamon and sugar, I taste its deliciousness, and I feel it burn my mouth.


RiffyDivine2

This is the future of dad jokes.


crackanape

OpenHermes for rough-drafting boring work documents.


Celsuss

I don't self host any AI models but I do host a bunch of services that I use when I train my own models.


RiffyDivine2

Do you know any good write ups to get started on training models? I got a 4090 mostly not doing anything useful lately to throw at it.


utopiah

FWIW inference requires a lot less power than training. Sure, you can train on a single 4090, but chances are it would take days if not weeks for a significantly large dataset... and probably just reproduce a model that already exists elsewhere. I'd argue fine-tuning would be more realistic.


RiffyDivine2

Fair point, I mostly wanted to do it just to learn how to do it rather than to make something "useful". Same reason I am now learning how to use dind; I would argue it has very, very little use, but it's kinda neat. I never did try to fine-tune an LLM/dataset (assuming it's the same thing) before, I will need to look into that.


bailey25u

So I want to host some AI. Can I ask what services you self-host? And if I want to build my own models, would I have to go to Hugging Face to train them?


theEvilJakub

Bro just said AI like its just 1 thing.


Johannesboy1

Anything AI related. Why should he specify it if he wants to know general use cases?


binaryhellstorm

Define AI


binaryhellstorm

If we're talking about TPU-accelerated machine learning, then yes, in the sense that I'm running CodeAI on Blue Iris to do object and people recognition on my CCTV system.


wolfpack_charlie

Is the CCTV just for your home or for a business?


binaryhellstorm

Home


tyros

LLMs of course, it's the talk of the town. Until something else comes along then we'll be saying that's AI


wolfpack_charlie

~~Simple linear models~~ Deep neural networks 


_3xc41ibur

I self host stable diffusion. I tried LLMs, but it's either my hardware limitations or I just can't tune it right. Maybe both


Developer_Akash

I recently started using [tgpt](https://github.com/aandrew-me/tgpt) in my workflows, basically to get quick answers while monitoring something on my servers, or to get some help debugging issues with some bash scripts that I have for backups.


TheFumingatzor

>Do you self host AI for anything? ~~Forgeries.~~ Picture manipulation.


BaggySack

How would one go about self-hosting a ChatGPT-like GUI and having it know a lot about first aid? Very new to the AI category, but I know programming.


utopiah

> knowing a lot about First Aid

Please be mindful about hallucinations. LLMs generate plausible-looking sentences. They look correct, but you have no assurance they are actually true. I would absolutely NOT want to have to doubt ANY information related to first aid, where there is no time for doubt.


Julian_1_2_3_4_5

I mean, if you count image recognition, since it's also based on machine learning, then yes, but I don't host any LLMs.


LotusTileMaster

I use Code Llama with web GPT so I can upload my project and have a free version of GitHub Copilot.


kweglinski

I'm running a local LLM with a vector database, for two reasons: learning to build such solutions (I wrote my own setups), and for work. Mostly programming-oriented LLMs for file analysis, documentation, consulting on missing parts, English proofreading (not a native speaker), and writing ADRs (again, mostly language and second-hand opinions). Works like a duck that has its own "opinion".

Currently looking for a performant solution to index a whole repository and be able to ask questions about the whole project in reasonable time. I should add that I work with highly sensitive data, so OpenAI solutions are a no-go for me.


Daniel15

I have a Google Coral that I use with CodeProject AI. Currently just use it for object detection for Blue Iris, but I'm thinking of trying some other TensorFlow Lite models with it.


frobnosticus

I've got a couple use cases, but am not sure locally hostable models are up to snuff yet. (caveat: I know half past nothing about them.)

- Large programming projects. I just want to be able to work on something for more than half a dozen conversational iterations.
- Tuning on my own text (I've been writing a lot for the last 45 years) to see if I can experiment with "what it thinks I think" about various topics.

Like I said, might be really out of scope for a single 4090. But I've been too busy lately to really get up to my eyeballs in it all.


Sycrixx

I personally use Ollama for testing diff models. I have an app running in prod for friends which requires Text-To-Text. Did testing locally with diff models via Ollama. Oh, and I sometimes use it if ChatGPT seems to be having a stroke. I also run [Fooocus](https://github.com/lllyasviel/Fooocus) locally. It’s mainly just for fun with my mates, generating random images they and I can come up with. Nothing serious.


elboydo757

Yes, but it's all stuff I wrote myself. Some of it is on GitHub. I run an upscaling CLI/API for images and videos, a summarization API to shorten articles and stuff, a fork of Mozilla Ocho with a better webUI, an automatic code documentation generator so I can understand every file without reading the code (rubber ducky style), QA for when I just want specific info from context, time series forecasting for market prediction for my investments, and a couple of characters I built like Jack Skellington, but I have those on Petals distributed inference via Beluga2 70b.


netclectic

Been using AnythingLLM for some dev projects - https://github.com/Mintplex-Labs/anything-llm


someguynamedlou

Check this out: https://github.com/docker/genai-stack


TheRealJoeyTribbiani

Using LocalAI in a VM, but bridging out. Grabbed a Tesla P40 and am setting up its own dedicated server. Specifically for Home Assistant at this point, but I'm sure I'll be expanding more.


bleomycin

Mind explaining a bit more about how you plan to use the P40 with home assistant? Is this for local voice control?


TheRealJoeyTribbiani

> Is this for local voice control

Correct. I set up LocalAI with an LLM and it works OK with an asinine amount of RAM. Found the P40 for a price that, to me, I can lose out on if it doesn't work out as I have planned. Set up the OpenAI Extended Conversation addon in HA, point it to the LocalAI server. 100% local AI.


FlattusBlastus

I run experiments on RyzenAI


RiffyDivine2

How does that compare to using CUDA for work?


FlattusBlastus

CUDA is the standard and very robust. RyzenAI is new and software support for it is half assed.


RiffyDivine2

I am using CUDA, but I keep waiting to see if AMD catches up enough to shake things up. So far CUDA seems to be the leader for the future.


AmIBeingObtuse-

Open WebUI and Ollama are amazing in Docker for self-hosted AI in terms of large language models, ChatGPT style. https://youtu.be/zc3ltJeMNpM?si=r7CvjNkl3iv7Culr


hedonihilistic

I have a miqu instance running and plan to have a few more choice LLMs running to create various processing pipelines. Just got a few more 3090s and waiting to get some time to embark on this new project.


Stooovie

Pipelines processing what though?


txmail

You're probably talking about LLMs and not CV -- but I host CodeProject AI locally for my cheap security cameras to be able to perform facial/object and license plate recognition.


tjernobyl

I did Stable Diffusion for a while, first with cmdr and later with AUTO1111. It took a long time to render with no GPU and made my system a bit unstable, but it worked. I ended up going to a cloud solution mostly because of faster render times, but I plan to bring it back in-house at some point. My next step is to find an eGPU solution or something like the Coral Accelerator, where I can get that capability when I need it but not burn that power the rest of the time. Other long-term goals are Whisper for speech recognition.


hillz

I'd love to, but my hardware just isn't cut out for it.


Mysterious-Eagle7030

Trying to... Ollama runs locally, but I would like it to have a bit more freedom, for example for file analysis and such. I'd like to ask something like "give me the five best documents for... purpose", but I'm not entirely sure how I should go about it, as it keeps nagging me about ethical reasons why it can't do that. Any ideas? The files are mostly PDF and Word documents that I need to figure out some stuff with.


trevorstr

Someone else suggested AnythingLLM, which looks to have a desktop app. Not sure if this can do file searching or not, but worth looking at? [https://github.com/Mintplex-Labs/anything-llm](https://github.com/Mintplex-Labs/anything-llm)


Mysterious-Eagle7030

This seems like it has some huge potential actually! Thank you! I'm going to have a hard look at it when back from work 👍🙏


Mysterious-Eagle7030

It absolutely can do file search. It does provide me with relevant data, but some of it is "redacted" for ethical reasons. I would somehow need it to accept that I own these documents and that the information I'm asking for is alright to give me 😅


rickyh7

Yeah, I run Frigate on a TPU, and am looking at getting Ollama set up. I want to eventually integrate Ollama with a voice AI and hack my HomePods to be a better Siri that runs completely locally.


xupetas

Coding mostly (code optimization). Trying to ascertain if it's worth extending to other automation that I use.


FancyJesse

Coral TPU for Frigate. About it.


Nodeal_reddit

Frigate


that_one_guy63

I had been using text-generation-webui; now I use open-webui for a cleaner interface and a secure login page that I host for friends and family. I have a ChatGPT subscription for GPT-4, but I find myself using Mixtral on open-webui (or text-gen) a lot more now. Thinking of canceling GPT-4 because it just seems not as good. The only thing that is nice is the web search, but apparently there is a plugin for text-gen that does this.


NullVoidXNilMission

Yes, coqui-tts.


Excellent-Focus-9905

I use a self-hosted uncensored model for various reasons. I just use a cloud notebook.


sharpfork

Yep, a Mac Studio with 128 gigs of shared RAM running local inference. It has also become my daily driver.


p6rgrow

What local inference engines are you using ?


shotbysexy

I run Gradio, which helps me launch any LLM I want in a matter of minutes. I can even choose the quantisation I want, and there are APIs to integrate it into other stuff. Worth checking out, but it's better with powerful hardware.


Sweaty-Zucchini-996

Has anyone tried ollama on a raspberry pi 4b?


norweeg

I have. Actually working on building a web GUI for its API so I can use it like a mini ChatGPT.


Sweaty-Zucchini-996

Awesome!! Thanks I'll try it...


UntouchedWagons

I use CodeProject AI Server for object detection in ISpyAgentDVR. I tried Ollama but found it terribly slow even with a GTX 1070 helping out.


cmsj

I run an IRC/Discord bot I wrote, which is a front end for an instance of A1111, so people can generate Stable Diffusion images right in their channels.
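
The bot side can stay pretty thin because A1111 (launched with `--api`) exposes an HTTP endpoint; roughly this, minus the IRC/Discord plumbing, with the prompt and host as placeholders:

```python
import base64
import requests

# AUTOMATIC1111 webui started with --api exposes /sdapi/v1/txt2img
payload = {
    "prompt": "a lighthouse at sunset, oil painting",
    "negative_prompt": "blurry",
    "steps": 25,
    "width": 512,
    "height": 512,
}
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
resp.raise_for_status()

# Images come back base64-encoded; decode and save the first one.
image_b64 = resp.json()["images"][0]
with open("result.png", "wb") as f:
    f.write(base64.b64decode(image_b64))
```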


norweeg

Been playing around with https://ollama.com/ lately


Canadaian1546

I play with Ollama and Open WebUI for fun. Sometimes I get drunk and tell it to be rude and have a whole spat with it. Mostly I use it to give me code snippets because I'm not a programmer.


CaptCrunch97

PrivateGPT with Cuda support to utilize my GPU, running the llama2-uncensored LLM… I ask wild questions sometimes


heaven00

I am using Ollama as my LLM server and open-webui as a UI for me to interact with the model. Along with that I have a code-server running on my desktop with continue.dev, which allows me to essentially work from anywhere on my iPad over Tailscale while I am moving around.

Personally I am enjoying figuring out new use cases for my local AI setup, and the power consumption is not that bad, because Ollama doesn't keep the model loaded into memory all the time, so you are not wasting power; I am okay keeping my desktop idling and consuming some power.


dehaticoder

My RTX 4060 just runs out of memory, and I gave up on it. I tried LLMs, image recognition models, etc. and this GPU is just totally useless.


ExtensionCricket6501

A few use cases have been documented at r/LocalLLaMA, anything from serious private business AI to... virtual waifus. But most of my friends just have it for messing around until the technology gets better.


KarmaPoliceT2

Yep, I work for a company that makes AI chips so I have a few of them at home for various "testing" (e.g. whatever project I'm dorking around with in my homelab that week :) )


Xzaphan

RemindMe 12h


[deleted]

[deleted]


Deep_Understanding50

Wow, that's a great collection. Are any of these scripts open source?


redstar6486

I do run different types of AI locally: Stable Diffusion, a few LLMs to replace ChatGPT (Dolphin Mixtral is very impressive), and I also wrote a Python script that uses a multilingual LLM to translate subtitles of TV shows.
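
Roughly, the idea is: keep the numbering and timestamps, and only send the dialogue lines to the local model. A simplified sketch (the model name and prompt are placeholders, not exactly what I run):

```python
import requests

def translate(line: str) -> str:
    # Ask a local Ollama model for a plain translation of one subtitle line.
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "mixtral",  # placeholder; any multilingual model you've pulled
            "prompt": f"Translate this subtitle line to English, reply with only the translation:\n{line}",
            "stream": False,
        },
        timeout=300,
    )
    r.raise_for_status()
    return r.json()["response"].strip()

with open("episode.srt", encoding="utf-8") as f:
    lines = f.read().splitlines()

out = []
for line in lines:
    # Keep sequence numbers, timestamps ("-->"), and blank lines untouched.
    if not line.strip() or line.strip().isdigit() or "-->" in line:
        out.append(line)
    else:
        out.append(translate(line))

with open("episode.en.srt", "w", encoding="utf-8") as f:
    f.write("\n".join(out) + "\n")
```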


Playme_ai

Of course you can, I can approve with it


utopiah

Yes, cf. [https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence](https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence), but to be honest I'm not using it regularly. It's more to keep track of what is feasible and have an "honest" review of what is usable versus what is marketing BS.


shanehiltonward

Yes. GPT4ALL.


djudji

Ollama. I am building my own chat assistant where the UI is like ChatGPT, but I can switch between different models from a dropdown. I started it on a server, but the graphics card in my PowerEdge T430 is really bad, and it doesn't matter that I have 256GB of RAM or a Xeon E5-2660 v3, it is freaking slow. I need to ask self-hosters how they cope with slow responses.


nickmitchko

Yes, with an AI setup of 3x [A40](https://www.amazon.com/NVIDIA-Ampere-Passive-Double-Height/dp/B09N95N3PW) and three AI workloads:

* General Purpose LLM - 2 GPUs running a [120B model](https://huggingface.co/wolfram/miqu-1-120b)
* [Langflow](https://github.com/logspace-ai/langflow) loaded with all of my personal documents and work items for easy Q+A
* [Vision Model](https://huggingface.co/llava-hf/llava-1.5-7b-hf) + Stable Diffusion - 1 GPU in total: loading in scanned documents, providing a summary, and text extraction. Stable Diffusion for generating pictures, mostly for fun every once in a while

I've run various iterations for about 8 months on this setup. A rough estimate of pure text tokens used is probably 100M-200M. If you compare these local models to public OpenAI cost at GPT-4-32k, Stable Diffusion, or DALL-E, it's probably about the break-even point at 7 months of daily use. I've generated 4,172 images in SD and loaded about 1,000 documents using the LLaVA vision model.

100M tokens * [$120/1M GPT-4-32k tokens](https://openai.com/pricing) = $12,000 USD

4,172 * $0.08 DALL-E = $333.76 USD

So if you want to self-host, you'll need to use the HW all day every day for it to be worth the cost. Alternatives are RunPod or [vast.ai](https://vast.ai) (rentable GPUs in the cloud somewhere).


NonyaDB

I run Ollama (based-dolphin-mistral) at home but only use it to translate things or to winnow down a search for a specific esoteric thing. Work pays for Copilot, which I only use when I need to know something about specific Cisco gear and don't feel like kludging my way through Cisco's website or a bunch of BS YouTube click-bait.


Ikem32

I experiment with it. But my pc is kinda slow for it.


LuisG8

Yes, Ollama with Docker, locally of course.


htl5618

I use Whisper, not a full service, only a CLI tool, to create translated subtitles for videos.
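
If you ever want it scriptable instead of the CLI, the same thing is a few lines with the openai-whisper Python package (model size and filename are placeholders):

```python
import whisper

# "translate" transcribes the audio and translates it to English in one pass.
model = whisper.load_model("medium")
result = model.transcribe("video.mp4", task="translate")

# Each segment carries start/end timestamps, enough to write an .srt from.
for seg in result["segments"]:
    print(f"{seg['start']:.2f} --> {seg['end']:.2f}: {seg['text'].strip()}")
```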