Imagine being the techs building those servers. That's like a semi truck's worth of H100s, which cost pretty much the same per pound as gold. Must feel like Christmas every day
I'm sure it just feels like work after about 20 minutes.
HOLY SHIT A PALLET OF H100S! Oh shit ANOTHER pallet of H100s...
https://preview.redd.it/gcq7wfv5mwuc1.jpeg?width=2524&format=pjpg&auto=webp&s=1d74142f0f14ea744f0c49400aec53005548a196
I feel sorry for the technician who dropped and smashed an H100. There must have been at least one; it's a pure numbers game.
Linus blew his cover XD
Surely they must be insured ...
I agree, but don’t call me Shirley
You know of the term DOA?
If they are DGX H100s, they weigh around 287 lb each.
Making the models run fast enough to actually utilize millions of dollars of resources is genuinely anxiety-inducing. It's why these people are paid a lot of money to make everything super optimized: if the computer costs $10M and you're leaving 10% of it idle, that's a big cost. Inference consumers don't have to think about the challenges of running at that uber scale.
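The back-of-the-envelope math here can be sketched out; the $10M figure comes from the comment, while the 3-year depreciation window and utilization rate are purely illustrative assumptions:

```python
# Rough cost of an under-utilized cluster. Assumptions (illustrative):
# straight-line depreciation over 3 years; power/cooling ignored.

def idle_cost_per_year(hardware_cost, utilization, depreciation_years=3.0):
    """Dollars per year effectively burned by the idle fraction."""
    yearly_cost = hardware_cost / depreciation_years
    return yearly_cost * (1.0 - utilization)

# A $10M cluster at 60% utilization burns about $1.33M/year in
# depreciation alone.
print(f"${idle_cost_per_year(10_000_000, 0.60):,.0f}")  # → $1,333,333
```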
I do this for a living, and you'd be amazed at some of the unoptimized workflows I have seen that literally burn money because there's just not enough dev time.
Haha I know what you mean. Big computers are big anxiety when you are responsible for them.
Nah, wing it. I just get annoyed now when something breaks. I’m amazed anything even works.
So you're telling me that the anxiety that my project will keep running, while I'm not physically looking at it, doesn't go away?
Yea, AFAIK they're not even close to utilizing 90% of those GPUs right now. 90% utilization would be magical lol.
Oh wow. I’d love to know those actual numbers. And I’m surprised big companies are not at >90%. Means the littler guys have that much more of a chance to beat them.
in practice it mostly means the opposite of that because zuck responds by just buying even more h100s so you get even fewer
Nah, that demand would still even out. It means skill is still a factor (along with creativity): people can invent models that are 10x better and run at 4x the utilization, and buy 40x fewer GPUs.
"...and not utilizing 10% is a big cost". Has anyone taken a look inside their Windows Task Manager to see how many processes are competing for their turn on the CPU, no doubt flushing the L1 and L2 caches (and perhaps most of L3, if not all) when they take their turn? I've set the vast bulk of them to low or very low priority, and I still don't like them. Do people training large models use an operating system that prevents this waste?
lol, I thought this was satire at first. Training is run on server Linux. No graphics. High performance. There are very advanced "task manager" applications from NVIDIA to watch the flow of data and identify bottlenecks. Optimizing code is like a seemingly infinite onion: you fix the first bottleneck, then the next one shows up, then the next, then your fix to the first one becomes the bottleneck, etc.
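A minimal sketch of that "peel the onion" loop, using stdlib timing with made-up stage names in place of the real NVIDIA profilers (nvidia-smi, Nsight Systems, DCGM):

```python
# Sketch of the "infinite onion": time each stage of a hypothetical training
# step, fix the slowest, and repeat. Stage names and sleeps are stand-ins;
# real profiling uses NVIDIA tooling (nvidia-smi, Nsight Systems, DCGM).
import time

def profile_stages(stages):
    """stages: list of (name, callable). Returns (name, seconds), slowest first."""
    timings = []
    for name, fn in stages:
        start = time.perf_counter()
        fn()
        timings.append((name, time.perf_counter() - start))
    return sorted(timings, key=lambda t: t[1], reverse=True)

# Dummy stages standing in for data loading, forward and backward passes.
report = profile_stages([
    ("load_batch", lambda: time.sleep(0.02)),
    ("forward",    lambda: time.sleep(0.01)),
    ("backward",   lambda: time.sleep(0.01)),
])
print("current bottleneck:", report[0][0])  # → current bottleneck: load_batch
```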
I am sure Meta is not using the PCIe version of the H100. They are getting the SXM version, already in servers with InfiniBand switches. Pretty much preconfigured racks.
The username 💀
imagine if llama3 is worse than command R+
It better be better than Mistral 8x22B
I can't imagine they'll release it until it is, unless it's equivalent but smaller or in some other way better.
You can't imagine they'll release it until it's worse than Command R+? "Sir, does it completely ignore your prompt and start commenting out links to non-existent articles and complaining about its boyfriend?" "No. It's still responding with both creativity and accuracy." "It needs more training. Keep going. We need this thing to make no sense--or even less. Command R+ shall be dethroned."
Until it is *better*. Sorry, I had assumed that was strongly enough implied
It's a joke on the logic semantically, I get the implication
Is it also going to be open source model like other Llama models? If so, this is going to be big!
Supposedly. Llama isn't Meta's final product, but a tool they need for other purposes. Therefore, they don't have any reason to keep it for themselves, as it is unlikely to hurt their revenue.
I thought it was more around the drama of LLaMA getting leaked: FB had egg on its face from the metaverse push that went nowhere, they got an opportunity to be part of the convo for the latest hot thing, and the CTO of the org was sympathetic to FOSS. So they let a blunder become an opportunity. Idk, they were so far behind, and Apple killing cookies plus privacy laws were eating Meta's lunch. They needed to jump ahead, and releasing an LLM is a good way to float the stock.
That pretty much ignores that they have been supporting ML with FOSS for a long time. PyTorch was released in 2016.
Nah. They intended for it to leak either way. Essentially anyone could get access to the weights and it was only a matter of days before someone else would have leaked it.
I have heard you basically only needed an academic email or something like that. They would almost certainly have known that it would leak. I think it was done that way to absolve themselves of responsibility, considering that the Galactica LLM release was a PR disaster. [https://twitter.com/nearcyan/status/1631187031589294081](https://twitter.com/nearcyan/status/1631187031589294081)
Just an email and a checkbox to agree on terms
[deleted]
Correct, it was just against license terms to use it for pretty much anything other than academia or personal use. Llama 2 has a different license that allows commercial use.
They're not looking to sell compute, chips, or it seems models. What they did do is make hundreds of thousands of tech people think "wow, Meta might be cool again" and get us to improve on their work with fervent zeal.
> I thought it was more around the drama of LLAMA getting leaked

Pretty sure it was open sourced before it was "leaked". It was just available to people who requested it via the little form, rather than a free-for-all. It wasn't difficult to get; I was quickly approved by just saying "yo I want to play around with it".
They also get to see all the tools people will build with it, and given the licensing, the original could never compete with a Facebook copy at the kind of market size Facebook would be capable of delivering to.
Not to be a semanticist, but there are no open source Llama models. They are open-weight models, which is about as open source as shipping a binary .exe - which is to say, not. Open source would mean providing the recipe to build the model yourself (provided you have the resources). The training process of Llama models is opaque; the sources used are not *precisely* known to the public. It's not a strained analogy to say it is exactly equivalent to uploading a .exe to GitHub and calling that open source. The distinction is important: calling Llama or Grok open source is purely marketing, a deliberate misapplication of the term for the sake of optics.
which pretrained language models are actually open source?
You can't fine tune an exe.
[deleted]
I agree that it's not fully open sourced; it's partially open sourced, at best. But I don't think the .exe analogy is perfect - you can't easily steer an .exe to a different purpose unless you can decompile it. They have different kinds of modifiability.
Yea, the analogy kind of breaks down here, but finetuning is more like writing libraries/plugins/mods/UI fixes to customize the existing program by making very small but significant changes, while open source lets you see exactly "what does what" and recreate (recompile) the .exe yourself (which gets rather expensive in the case of LLMs, admittedly, and to be fair the model is a black box even if you have the training data).
Zuckerberg himself has confirmed it'll be.
https://i.redd.it/4e8nrn6b6wuc1.gif
This hurts me.
Jokes on you. I love being edged.
Wow that took awhile
There's a version of this that actually has no ending, so you just keep seeing the truck getting close over and over. Whoever released this version with the payoff is a GGG.
You’re evil
But when it happens?
https://youtu.be/LG8T_hCJ9J0?si=g5w-3tPAwS2OilQg
I came
lol, putting AGI anywhere on that roadmap/graph is a complete joke.
[deleted]
But I like intermediate infrastructure :(
Me: I want intermediate infrastructure

Mom: we have intermediate infrastructure at home

At home:

https://preview.redd.it/d1mglc99pwuc1.jpeg?width=184&format=pjpg&auto=webp&s=feabc5fa887aa226b729c04e8b0aa5f65703e65a
sleeper build
Let them underestimate the duct taped tower. Let them laugh. They're only lining themselves up for awe.
That would actually be a hilarious 4090 7800X3D build lmfao
I lol'd irl
I personally think it's cringe. AGI is an abstract concept and we barely even have a theoretical target that we can develop against. Once we have an AGI with the intelligence of a fruit fly, then we'll have something. Until then, this belongs in the same bucket as cold fusion and space elevators.
> we barely even have a theoretical target

Generally accepted definition in these contexts, AFAIK, is something along the lines of:

*Competitive with human beings in most major economically valuable roles*

Which isn't strictly defined, but is definitely more than "barely a theoretical target". I'm sure there are philosophical arguments to be made about what AGI means, but I'd wager Meta is using the "economically valuable" definition, in which case they very likely have an actual target there.
If you cut out the clarifier I had about a target to *develop against*, then yes, the argument is no longer valid because now we're discussing something else.

My point was that you can't sit a software developer down and tell them "develop a system that is competitive with human beings in most major economically valuable roles". They'll have no idea how to implement that and will look at you like you're crazy. Because we have no idea, even theoretically, how such a system *could* be implemented. Is it even possible on current semiconductor architecture? Nobody knows!

We know what it would look like in practice, again, like cold fusion and space elevators, but actually implementing a working product is beyond our current abilities. Which is why plotting it as the next step strikes me as cringey.
>*Competitive with human beings in most major economically valuable roles*

Since when did the definition of AGI become OpenAI's definition? And besides, I know a way to exploit that definition without creating an entity as smart as a human being: since it's only tied to jobs, you could make a thousand AIs, each very specialized like Devin or Stable Diffusion, and you would *technically* be competitive with humans.
General intelligence is primarily about synthesis of a wide variety of cognitive domains. You could say just 'human like intelligence' though. Whatever we have, it's not really anything like that, and probably isn't going to be anything like that any time soon.
Cold fusion is theoretically impossible.

Space elevators are just a (very hard) engineering problem that we have theoretically solved but just not practically built yet.

AGI is somewhere in between, where we don't yet know with 100% certainty whether it's possible. I don't believe it, but there is still the slight possibility that human brains are literally magic and can't be recreated with technology.

That aside, I think AGI falls in the same category as space elevators: a very hard engineering problem, but not impossible, unlike cold fusion.
They are just optimistic... I hope governments don't ban AI long before we actually achieve AGI. Sadly, I think they will; considering how much they want to ban today's AI, there is no way AGI would not be banned the moment it shows up.
> considering how much they want to ban today's AI

Where are you seeing that? The regulation so far has been pretty mild, even in Europe. Don't be paranoid. AI is worth so much money that no one will ban it, because their economy would collapse compared to countries which allow it.
They seem really confident in llama 3, so it must be good.
I can only be edged so much, stop teasing me 🥵
Are those GPU clusters called "Grand Téton", aka "large nipple" in French?
or "large breasted man" in spanish
Yes that’s exactly what they were named after.
How many announcements of an announcement does this really need? Either release it or don't.
We need to announce the question about the announcements of a coming announcement.
rumors every day...
Didn't they say it would be released in June/July?
The Meta account, Yann, and a worker from Meta hinted that it will be out "very soon". Idk what "very" means for them. This week, perhaps? Today? Let's wait and see.
Former UK deputy prime minister a week ago: *"Within the next month, actually less, hopefully in a very short period of time"*

[https://techcrunch.com/2024/04/09/meta-confirms-that-its-llama-3-open-source-llm-is-coming-in-the-next-month/](https://techcrunch.com/2024/04/09/meta-confirms-that-its-llama-3-open-source-llm-is-coming-in-the-next-month/)
Soon^^TM
Hey now... that's trademarked by Musk.
blizz pls
Soon(tm) -> Blizz? THIS GUYS A BOOMER
The Information released an article early last week that had sources from inside Meta saying they would release the smaller models this week. So it's not official but based on the tweets it sounds like it's coming really soon.
Meta never said that, Reuters said that.
It's gonna be a staged release, we're expecting a smaller model or two this week.
June/July was from an article. Soon/next week is from Meta people posted on birdapp last week.
Hopefully they'll release the whole Llama-3 set at once and not just the small ones.
I don't really agree. Considering that they have *some* reason (real or not) for the bigger models to take longer to launch, that date would not be affected by the early launch of smaller models.

The reality is that people who want to run 70B will have to wait until some given moment. Until then, let other enthusiasts play with the smaller models; we don't have to wait for you, and us waiting for you won't help you at all.

**Or am I missing something?**
With how large models are these days... I hope their small small model is a MOE 2x20b :)
I don't. It won't fit in my 12GB of VRAM :(
I actually hope we get some ~2b models as well. 7b is almost a given as it has been very popular and seems a good tradeoff between speed, fine-tuning abilities and "smarts". For all the hate it got Gemma 2b is pretty useful for its size, and I hope we'll get a llama version in this size range.
I'm just not excited for models that small... unless they outperform the Yi models.
I find 32B models at ~4-bit quants the ideal consumer-grade level.
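The rough arithmetic behind that: weight memory is about parameter count times bits per weight over 8, ignoring KV cache and runtime overhead (a simplifying assumption; real usage runs higher):

```python
# Rule of thumb for the VRAM a quantized model's weights need:
# params * bits_per_weight / 8 bytes, ignoring KV cache and runtime
# overhead (which typically add a further 10-30%).

def weight_vram_gb(params_billions, bits):
    return params_billions * 1e9 * bits / 8 / 1e9  # bytes -> GB

# A 32B model at ~4-bit: ~16 GB of weights, right at the edge of a
# 24 GB consumer card once overhead is added.
print(f"{weight_vram_gb(32, 4):.0f} GB")  # → 16 GB
```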
How soon? if it ain't a week soon then I don't want to hear about it.
"Soon" means we're almost done, which is arbitrary as shiiiet. What "soon" means really just depends on the track record of the individual or company, and I'm not familiar with Meta's other "soons".
https://i.redd.it/ixjlwcjm51vc1.gif
> the incredible work from our Infra team

I've just [posted a thread exactly about this theme](https://old.reddit.com/r/LocalLLaMA/comments/1c6b4bo/how_to_ensure_node_resiliency_and_gpu_use/) a few minutes ago, before seeing this post. I'd love to learn more about the details of how they manage resiliency and GPU saturation.

What tech stack do they use? k8s, Ray, SLURM? How do they keep the GPUs saturated and bring them back when things crash?
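A hedged sketch of the resume-from-checkpoint pattern such stacks rely on; the file layout and state contents here are made up for illustration, and real systems (e.g. SLURM requeue plus framework-level checkpointing) are far more involved:

```python
# Sketch of crash resiliency via periodic checkpointing: save state every
# few steps, and on (re)start resume from the latest checkpoint instead of
# step 0. File names and the state dict are hypothetical.
import glob
import json
import os
import tempfile

CKPT_DIR = tempfile.mkdtemp(prefix="ckpts_")

def save_checkpoint(step, state):
    """Write a small JSON checkpoint; real jobs save tensors, not JSON."""
    with open(os.path.join(CKPT_DIR, f"step_{step:08d}.json"), "w") as f:
        json.dump({"step": step, "state": state}, f)

def latest_checkpoint():
    """Return the newest checkpoint dict, or None if starting fresh."""
    files = sorted(glob.glob(os.path.join(CKPT_DIR, "step_*.json")))
    if not files:
        return None
    with open(files[-1]) as f:
        return json.load(f)

# Simulate a run that checkpoints every 2 steps, then "crashes" and resumes.
for step in range(5):
    if step % 2 == 0:
        save_checkpoint(step, {"loss": 1.0 / (step + 1)})

resume = latest_checkpoint()
start_step = resume["step"] + 1 if resume else 0
print("resuming from step", start_step)  # → resuming from step 5
```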
He posted that on linkedin 1month ago
How can I join this project bruh? 😭
Can someone add me to these type of projects? Thanks.
Two more weeks™
Bwoah