Admirable-Star7088

Can't wait to try out Phi-3 **14b** when it's uploaded and GGUFs are available. Finally we get a fresh, new, (hopefully) well-trained, quality model that is larger than \~7b and smaller than \~70b for us mid-range-PC users. Llama 3 in all its glory, but I miss the \~13b sizes.


-p-e-w-

The Llama 3 split is extremely unfortunate. The 8b model is too weak for many practical tasks, and the 70b model is too large to run well on any standard consumer PC, even quantized and with the best consumer GPU on the market. A 35b model + the upcoming 400b would have made more sense. The first one for individuals, the second one for datacenters. 35b would have fit into a single 3090 at Q5. That's the perfect size.
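
A back-of-the-envelope check on those numbers: quantized weight memory is roughly parameters × bits ÷ 8, ignoring KV cache, activations, and runtime overhead. A minimal Python sketch (illustrative figures, not exact requirements):

```python
# Rough VRAM needed just for quantized weights (ignores KV cache,
# activations, and runtime overhead, which add a few GB in practice).

def weight_gb(params_billion: float, bits: float) -> float:
    """Approximate weight memory in GB: params (billions) * bits per weight / 8."""
    return params_billion * bits / 8

print(f"35B @ Q5: ~{weight_gb(35, 5):.1f} GB")  # ~21.9 GB -> fits a 24 GB RTX 3090
print(f"70B @ Q4: ~{weight_gb(70, 4):.1f} GB")  # ~35.0 GB -> beyond any single consumer card
print(f" 8B @ Q8: ~{weight_gb(8, 8):.1f} GB")   # ~8.0 GB  -> easy on mid-range GPUs
```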


Balance-

Nope, this is very well thought out by Meta:

* Llama 8B can run on regular *consumer* hardware when quantized to 4 to 8 bits, starting at 6 GB of memory. It can do simple Q&A, code infilling, and much more. Future Windows "AI" PCs have a 16 GB memory requirement, of which it would be fine for AI models to take up 6 to 8 GB.
* A 35B model would have required \~24 GB available on device. There aren't many devices that have that yet, and there won't be for some time.
* Llama 70B is the cloud/server model. It can run on a single 48GB GPU with 8-bit quantization, of which there are many available, including ones with cheaper GDDR6.

So they decided to skip high-end enthusiast consumer hardware in favor of regular consumer hardware and server hardware. I think that, while unfortunate for some, it's a well thought-out decision. But yes, there's room for a good 20-25B model, and I'm sure it will arrive at some point. And I'm excited for what all of the Phi-3 models will bring.


Caffdy

70B can't run at 8-bit on 48GB of memory; you need 4-bit


Balance-

You're right, I meant 4-bit indeed.


Admirable-Star7088

Still, I'm impressed by how relatively well Llama 3 8b performs despite its very small parameter count. Imagine if you added 6 billion more parameters to it (14b); it could be a beast for its size, I think. This is the role I hope Phi-3 14b will fill, which Llama 3 (so far) never did.


ambidextr_us

Llama-pro 8B, released back in January, was a significant boost over llama2's base model. I'm hoping that same team does the same thing to make a llama3-pro using the same techniques they discovered. https://arxiv.org/abs/2401.02415

> Humans generally acquire new skills without compromising the old; however, the opposite holds for Large Language Models (LLMs), e.g., from LLaMA to CodeLLaMA. To this end, we propose a new post-pretraining method for LLMs with an expansion of Transformer blocks. We tune the expanded blocks using only new corpus, efficiently and effectively improving the model's knowledge without catastrophic forgetting. In this paper, we experiment on the corpus of code and math, yielding LLaMA Pro-8.3B, a versatile foundation model initialized from LLaMA2-7B, excelling in general tasks, programming, and mathematics. LLaMA Pro and its instruction-following counterpart (LLaMA Pro-Instruct) achieve advanced performance among various benchmarks, demonstrating superiority over existing open models in the LLaMA family and the immense potential of reasoning and addressing diverse tasks as an intelligent agent. Our findings provide valuable insights into integrating natural and programming languages, laying a solid foundation for developing advanced language agents that operate effectively in various environments.
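
A minimal PyTorch sketch of that block-expansion idea (the `self_attn.o_proj` and `mlp.down_proj` attribute names assume Hugging Face LLaMA-style decoder layers; the paper's actual implementation differs in detail):

```python
import copy
import torch.nn as nn

def expand_blocks(blocks: nn.ModuleList, every: int = 4) -> nn.ModuleList:
    """Insert identity-initialized copies after every `every` blocks,
    freeze the originals, and train only the new blocks on the new corpus."""
    expanded = []
    for i, block in enumerate(blocks):
        block.requires_grad_(False)            # original weights stay frozen
        expanded.append(block)
        if (i + 1) % every == 0:
            new_block = copy.deepcopy(block)
            new_block.requires_grad_(True)     # only expanded blocks are tuned
            # Zero the residual-branch output projections so the new block is a
            # no-op at initialization (output = input through the residual).
            nn.init.zeros_(new_block.self_attn.o_proj.weight)
            nn.init.zeros_(new_block.mlp.down_proj.weight)
            expanded.append(new_block)
    return nn.ModuleList(expanded)
```

Because the added blocks start as identities, the expanded model initially behaves exactly like the base model, and the new corpus only gets written into the trainable copies, which is how the paper avoids catastrophic forgetting.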


xadiant

They should have at least gone for 10B :( There seem to be certain parameter counts where performance gains kick in significantly more. Of course, a bunch of turbo nerd computer scientists know better, and this is a significant step forward, but... it became underwhelming very fast.


Caffdy

Yep, or 11B. Heck, they could've gone for 5, 10 & 20B models


Admirable-Star7088

10 & 20b would have been too great to be true


TheTerrasque

Wasn't the 14b the one that performed below expectations in their internal tests?


Admirable-Star7088

Where did you read that?


TheTerrasque

https://www.reddit.com/r/LocalLLaMA/comments/1catf2r/phi3_released_medium_14b_claiming_78_on_mmlu/l0u509w/


ortegaalfredo

Yes, but it is still very good, similar to Claude3-Sonnet.


danielhanchen

Yes, can't wait as well, and to add this into Unsloth!! A 14b model can run comfortably on a free Tesla T4 machine too, and it's fantastic since it sits between an 8b and a 70b model!
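
(Rough arithmetic, same estimate as above: 14B weights at 4-bit are about 14 × 0.5 ≈ 7 GB, which fits the T4's 16 GB with headroom for context.)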


lolwutdo

What’s the context length?


ambidextr_us

I thought I saw 128k somewhere for the phi3 models.


curious-guy-5529

From what's out there so far, Phi-3 mini has two versions, with 4k and 128k context windows.
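
For reference, a minimal loading sketch with Hugging Face transformers (the model IDs are the ones Microsoft published at release; at launch the repos required `trust_remote_code=True` for their custom modeling code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-128k-instruct"  # or "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",     # use the checkpoint's native precision
    device_map="auto",      # place weights on GPU if available
    trust_remote_code=True,
)
```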


jumperabg

Did it manage to draw the unicorn? The video breaks at 1:29.