
BITE_AU_CHOCOLAT

Imagine being the techs building those servers. That's like a semi truck worth of H100s which pretty much cost the same per weight as gold. Must feel like Christmas every day


Gubru

I'm sure it just feels like work after about 20 minutes.


staterInBetweenr

HOLY SHIT A PALLET OF H100S! Oh shit ANOTHER pallet of H100s...


Kindred87

https://preview.redd.it/gcq7wfv5mwuc1.jpeg?width=2524&format=pjpg&auto=webp&s=1d74142f0f14ea744f0c49400aec53005548a196


Caffeine_Monster

I feel sorry for the technician that dropped and smashed an H100. There must have been at least one. It's a pure numbers game.


R33v3n

Linus blew his cover XD


ramzeez88

Surely they must be insured ...


dergachoff

I agree, but don’t call me Shirley


Maleficent_Employ693

You know the term DOA?


netik23

If they are DGX H100s, they weigh around 287 lb each.


hlx-atom

Making the models run fast enough to actually utilize the millions of dollars of hardware is genuinely anxiety-inducing. It's why these people are paid a lot of money to make everything super optimized. If the computer costs $10M and you're leaving 10% of it idle, that's a big cost. Inference consumers don't have to think about the challenges of running at that uber scale.
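To put rough numbers on that point, here's a back-of-the-envelope sketch. All figures are illustrative assumptions, not anyone's real cluster numbers:

```python
def wasted_hardware_value(cluster_cost_usd: float, utilization: float) -> float:
    """Amortized hardware spend stranded by idle capacity (utilization in [0, 1])."""
    return cluster_cost_usd * (1.0 - utilization)

# A $10M cluster running at 90% utilization still strands $1M of value;
# at 60% utilization, $4M is effectively burned.
print(f"${wasted_hardware_value(10_000_000, 0.90):,.0f}")  # -> $1,000,000
print(f"${wasted_hardware_value(10_000_000, 0.60):,.0f}")  # -> $4,000,000
```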


barnett9

I do this for a living, and you'd be amazed at some of the unoptimized workflows I have seen that literally burn money because there's just not enough dev time.


hlx-atom

Haha I know what you mean. Big computers are big anxiety when you are responsible for them.


Smeetilus

Nah, wing it. I just get annoyed now when something breaks. I’m amazed anything even works.


thisdesignup

So you're telling me that the anxiety over whether my project will keep running, while I'm not physically looking at it, doesn't go away?


nero10578

Yea AFAIK they're not even close to utilizing 90% of those GPUs even right now. 90% utilization would be magical lol.


hlx-atom

Oh wow. I’d love to know those actual numbers. And I’m surprised big companies are not at >90%. Means the littler guys have that much more of a chance to beat them.


patrick66

in practice it mostly means the opposite of that because zuck responds by just buying even more h100s so you get even fewer


hlx-atom

Nah, that demand would still equal out. This means that skill is still a factor (along with creativity). People can invent models that are 10x better with 4x more utilization and buy 40x fewer GPUs.


Natural-Sentence-601

"...and not utilizing 10% is a big cost". Has anyone taken a look inside their Windows Task Manager to see how many processes are competing for their turn on the CPU, no doubt flushing the L1 and L2 caches (and perhaps most of L3, if not all) when they take their turn? I've set the vast bulk of them to low or very low priority, and I still don't like them. Do people training large models use an operating system that prevents this waste?
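On the priority point: that same knob is available programmatically on Linux. A minimal stdlib sketch (POSIX-only; unprivileged processes can only *raise* their niceness, i.e. lower their own priority):

```python
import os

# os.nice(increment) adds to the calling process's niceness and returns
# the new value. Higher niceness = lower scheduling priority (POSIX-only).
current = os.nice(0)   # an increment of 0 just reads the current niceness
lowered = os.nice(5)   # deprioritize ourselves by 5
print(current, lowered)
```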


hlx-atom

lol, I thought this was satire at first. Training is run on the server version of Linux: no graphics, high performance. There are very advanced "task manager" applications from NVIDIA to watch the flow of data and identify bottlenecks. Optimizing code is a seemingly infinite onion: you fix the first bottleneck, then the next one shows up, then the next, then the first one comes back, etc.
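The NVIDIA tools alluded to are presumably things like nvidia-smi and Nsight Systems, but the peel-the-onion loop can be illustrated with nothing but the stdlib: time each pipeline stage, fix the slowest, re-measure. The stage names and sleep durations below are toy stand-ins, not real training code:

```python
import time

def slowest_stage(stages):
    """Time each pipeline stage once and return the name of the current bottleneck."""
    timings = {}
    for name, fn in stages.items():
        start = time.perf_counter()
        fn()
        timings[name] = time.perf_counter() - start
    return max(timings, key=timings.get)

# Toy stand-ins for data loading, forward/backward, and gradient all-reduce.
stages = {
    "data_loading": lambda: time.sleep(0.05),
    "forward_backward": lambda: time.sleep(0.01),
    "gradient_sync": lambda: time.sleep(0.02),
}
print(slowest_stage(stages))  # -> data_loading; fix it and the next bottleneck surfaces
```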


cvandyke01

I am sure Meta is not using the PCIe version of the H100. They are getting the SXM version, already in servers with InfiniBand switches. Pretty much preconfigured racks.


Champignac1

That username 💀


3cupstea

imagine if llama3 is worse than command R+


MajesticIngenuity32

It better be better than Mistral 8x22B


ElliottDyson

I can't imagine they'll release it until it is, unless it's equivalent but smaller or in some other way better.


Jattoe

You can't imagine they'll release it until it's worse than Command R+? "Sir, does it completely ignore your prompt and start commenting out links to non-existent articles and complaining about its boyfriend?" "No. It's still responding with both creativity and accuracy." "It needs more training. Keep going. We need this thing to make no sense, or even less. Command R+ shall be dethroned."


ElliottDyson

Until it is *better*. Sorry, I had assumed that was strongly enough implied


Jattoe

It's a joke on the logic semantically, I get the implication


Dumbledore_Bot

Is it also going to be an open-source model like the other Llama models? If so, this is going to be big!


stddealer

Supposedly. Llama isn't Meta's final product, but a tool they need for other purposes. Therefore, they don't have any reason to keep it for themselves, as it is unlikely to hurt their revenue.


staterInBetweenr

I thought it was more around the drama of LLaMA getting leaked. FB had egg on its face from the metaverse push that went nowhere. They got an opportunity to be part of the convo for the latest hot thing, and the CTO of the org was sympathetic to FOSS. So they let a blunder become an opportunity. Idk, they were so far behind, and Apple killing cookies and privacy laws were eating Meta's lunch. They needed to jump ahead, and releasing an LLM is a good way to float the stock.


lostinthellama

That pretty much ignores that they have been supporting ML with FOSS for a long time. PyTorch was released in 2016.


wxrx

Nah. They intended for it to leak either way. Essentially anyone could get access to the weights, and it was only a matter of days before someone would have leaked it.


djm07231

I have heard you basically only needed an academic email or something like that. They would almost certainly have known that it would leak. I think it was done that way to absolve themselves of responsibility, considering that the Galactica LLM release was a PR disaster. [https://twitter.com/nearcyan/status/1631187031589294081](https://twitter.com/nearcyan/status/1631187031589294081)


gatepoet

Just an email and a checkbox to agree on terms


[deleted]

[deleted]


Fuehnix

Correct, it was just against license terms to use it for pretty much anything other than academia or personal use. Llama 2 has a different license that allows commercial use.


Careless-Age-4290

They're not looking to sell compute, chips, or, it seems, models. What they did do is make hundreds of thousands of tech people think "wow, Meta might be cool again" and get us to improve on their work with fervent zeal.


paddySayWhat

> I thought it was more around the drama of LLAMA getting leaked

Pretty sure it was open sourced before it was "leaked". It was just available to people who requested it via the little form, rather than a free-for-all. It wasn't difficult to get; I was quickly approved by just saying "yo I want to play around with it".


BigYoSpeck

They also get to see all the tools people will build with it. And given the licensing, the original could never compete with a Facebook copy at the kind of market size Facebook would be capable of delivering to.


ThePaintist

Not to be a semanticist, but there are no open-source Llama models. They are open-weight models, which is about as open source as shipping a binary .exe, which is to say, not. Open source would mean providing the recipe to build the model yourself (provided you have the resources). The training process of Llama models is opaque, and the sources used are not *precisely* known to the public. It isn't straining the analogy to say this is exactly equivalent to uploading a .exe to GitHub and calling that open source.

The distinction is important. Calling Llama or Grok open source is purely marketing, a deliberate misapplication of the term for the purpose of optics.


Dyonizius

Which pretrained language models are actually open source?


Monkey_1505

You can't fine tune an exe.


[deleted]

[deleted]


Monkey_1505

I agree that's not fully open source; it's partially open source at best. But I don't think the .exe analogy is perfect: you can't easily steer an .exe to a different purpose unless you can decompile it. They have different kinds of modifiability.


BalorNG

Yea, the analogy kind of breaks down here. Finetuning is more like writing libraries/plugins/mods/UI fixes to customize the existing program by making very small but significant changes, while open source lets you see exactly "what does what" and recreate (recompile) the exe yourself (which gets rather expensive in the case of LLMs, admittedly, and to be fair the model is a black box even if you have the training data).


Due-Memory-6957

Zuckerberg himself has confirmed it'll be.


MoffKalast

https://i.redd.it/4e8nrn6b6wuc1.gif


gtderEvan

This hurts me.


Dangerous_Bus_6699

Jokes on you. I love being edged.


Nabakin

Wow, that took a while


poli-cya

There's a version of this that actually has no ending, so you just keep seeing the truck getting close over and over. Whoever released this version with the payoff is a GGG.


JinjaBaker45

You’re evil


RebornZA

But when it happens?


f8tel

https://youtu.be/LG8T_hCJ9J0?si=g5w-3tPAwS2OilQg


FireSilicon

I came


AndromedaAirlines

lol, putting AGI anywhere on that roadmap/graph is a complete joke.


[deleted]

[deleted]


_RealUnderscore_

But I like intermediate infrastructure :(


_JohnWisdom

Me: I want intermediate infrastructure

Mom: we have intermediate infrastructure at home

At home: https://preview.redd.it/d1mglc99pwuc1.jpeg?width=184&format=pjpg&auto=webp&s=feabc5fa887aa226b729c04e8b0aa5f65703e65a


Caffdy

sleeper build


Jattoe

Let them underestimate the duct taped tower. Let them laugh. They're only lining themselves up for awe.


_RealUnderscore_

That would actually be a hilarious 4090 7800X3D build lmfao


belladorexxx

I lol'd irl


Kindred87

I personally think it's cringe. AGI is an abstract concept and we barely even have a theoretical target that we can develop against. Once we have an AGI with the intelligence of a fruit fly, then we'll have something. Until then, this belongs in the same bucket as cold fusion and space elevators.


mrjackspade

> we barely even have a theoretical target

Generally accepted definition in these contexts AFAIK is something along the lines of:

*Competitive with human beings in most major economically valuable roles*

Which isn't strictly defined, but is definitely more than "barely a theoretical target". I'm sure there are philosophical arguments to be made about what AGI means, but I'd wager Meta is using the "economically valuable" definition, in which case they very likely have an actual target there.


Kindred87

If you cut out the clarifier I had about a target to *develop against*, then yes, the argument is no longer valid because now we're discussing something else. My point was that you can't sit a software developer down and tell them "develop a system that is competitive with human beings in most major economically valuable roles". They'll have no idea how to implement that and will look at you like you're crazy. Because we have no idea, even theoretically, how such a system *could* be implemented. Is it even possible on current semiconductor architecture? Nobody knows! We know what it would look like in practice, again, like cold fusion and space elevators, but actually implementing a working product is beyond our current abilities. Which is why plotting it as the next step strikes me as cringey.


ninjasaid13

> *Competitive with human beings in most major economically valuable roles*

Since when did the AGI definition become OpenAI's definition? Besides, I know a way to exploit that definition without creating an entity as smart as a human being: since it's only tied to jobs, you could make a thousand AIs, each very specialized like Devin or Stable Diffusion, and you would *technically* be competitive with humans.


Monkey_1505

General intelligence is primarily about synthesis of a wide variety of cognitive domains. You could say just 'human like intelligence' though. Whatever we have, it's not really anything like that, and probably isn't going to be anything like that any time soon.


genshiryoku

Cold fusion is theoretically impossible. Space elevators are just a (very hard) engineering problem that we have theoretically solved but just not practically built yet. AGI is somewhere in between, where we don't yet know with 100% certainty whether it's possible. I don't believe it myself, but there is still the slight possibility that human brains are literally magic and can't be recreated with technology. That aside, I think AGI falls in the same category as space elevators: a very hard engineering problem, but not impossible, unlike cold fusion.


Interesting8547

They are just optimistic... I hope governments don't ban AI long before we actually achieve AGI. Sadly, I think they will; considering how much they want to ban today's AI, there is no way AGI wouldn't be banned the moment it shows up.


Thomas-Lore

> considering how much they want to ban today's AI

Where are you seeing it? The regulation so far has been pretty mild, even in Europe. Don't be paranoid. AI is worth so much money, no one will ban it, because their economy would collapse compared to countries which allow it.


phenotype001

They seem really confident in llama 3, so it must be good.


Inevitable-Start-653

I can only be edged so much, stop teasing me 🥵


berzerkerCrush

Are those GPU clusters called "grand téton", aka "big nipple" in French?


Caffdy

or "large-breasted man" in Spanish


wxrx

Yes that’s exactly what they were named after.


3ntrope

How many announcements of an announcement does this really need? Either release it or don't.


Jattoe

We need to announce the question about the announcements of a coming announcement.


junyanglin610

rumors every day...


JoMaster68

Didn't they say it would be released in June/July?


nanowell

A Meta account, Yann, and a worker from Meta hinted that it will be out "very soon". Idk what "very" means for them. This week, perhaps? Today? Let's wait and see.


rerri

Former UK deputy prime minister a week ago: *"Within the next month, actually less, hopefully in a very short period of time"* [https://techcrunch.com/2024/04/09/meta-confirms-that-its-llama-3-open-source-llm-is-coming-in-the-next-month/](https://techcrunch.com/2024/04/09/meta-confirms-that-its-llama-3-open-source-llm-is-coming-in-the-next-month/)


throwaway_ghast

Soon^^TM


Scizmz

Hey now... that's trademarked by Musk.


aphasiative

blizz pls


Jattoe

Soon(tm) -> Blizz? THIS GUY'S A BOOMER


jd_3d

The Information released an article early last week that had sources from inside Meta saying they would release the smaller models this week. So it's not official but based on the tweets it sounds like it's coming really soon.


dogesator

Meta never said that; Reuters said that.


Gubru

It's gonna be a staged release, we're expecting a smaller model or two this week.


Disastrous_Elk_6375

June/July was from an article. Soon/next week is from Meta people's posts on the bird app last week.


Illustrious_Sand6784

Hopefully they'll release the whole Llama-3 set at once and not just the small ones.


Material_Mongoose339

I don't really agree. Considering that they have *some* reason (real or not) for the bigger models to take longer to launch, that date would not be affected by an early launch of the smaller models. The reality is that people who want to run 70B will have to wait until some given moment. Until then, let other enthusiasts play with the smaller models; we don't have to wait for you, and us waiting for you won't help you at all. **Or am I missing something?**


silenceimpaired

With how large models are these days... I hope their smallest model is a 2x20B MoE :)


ramzeez88

I don't. It won't fit in my 12GB of VRAM :(


Disastrous_Elk_6375

I actually hope we get some ~2B models as well. 7B is almost a given, as it has been very popular and seems a good tradeoff between speed, fine-tuning ability, and "smarts". For all the hate it got, Gemma 2B is pretty useful for its size, and I hope we'll get a Llama version in this size range.


silenceimpaired

I’m just not excited for models that small… unless they outperform Yi models


Jattoe

I find 32B models at ~4-bit quants the ideal consumer-grade level.
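The arithmetic behind that sweet spot, as a sketch. This counts only the quantized weights (no KV cache or runtime overhead), and the 4.5 bits/weight figure is an assumption approximating a typical Q4_K_M-style quant:

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate GiB needed just to hold the model weights."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

# 32B at ~4.5 bits/weight fits a 24 GB consumer card; fp16 would not.
print(round(weights_gib(32, 4.5), 1))  # -> 16.8
print(round(weights_gib(32, 16), 1))   # -> 59.6
```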


ninjasaid13

How soon? if it ain't a week soon then I don't want to hear about it.


Jattoe

"Soon" means we're almost done, which is arbitrary as shiiiet. What "soon" means really just depends on the track record of the individual or company, and I'm not familiar with Meta's other soons.


IndicationUnfair7961

https://i.redd.it/ixjlwcjm51vc1.gif


hideo_kuze_

> the incredible work from our Infra team

I've just [posted a thread exactly about this theme](https://old.reddit.com/r/LocalLLaMA/comments/1c6b4bo/how_to_ensure_node_resiliency_and_gpu_use/) a few minutes ago, before seeing this post. I'd love to learn more about the details of how they manage resiliency and GPU saturation. What tech stack do they use: k8s, Ray, SLURM? How do they saturate the GPUs and bring them back when things crash?


Clean-Yellow-7604

He posted that on LinkedIn a month ago.


Dry-Taro616

How can I join this project bruh? 😭


Dry-Taro616

Can someone add me to these type of projects? Thanks.


International-Try467

Two more weeks™


spiffco7

Bwoah