They aren't going to be selling them at the volume they have been up until now. That's for sure.
Jesus Christ. Talk about missing the timeline of AI.
So cute
Yup. LLM is the end game. That's it. AI stops there. Was a good run. OpenAI and Stargate were just aiming for YouTube AI videos.
Subject to change. It doesn't even need to be AMD either by the way, but their tech will likely become far more viable as a result of this.
Best - inference - no - matter - the - vendor
DeepSeek-V3 is as dependent on inference speed as any model that came before it.
Oh amazing, using AMD's marketing, are we? How does their bandwidth compare to the bandwidth on Nvidia's solutions? That's where you'll discover my point, instead of twisting my words as if I'd said bandwidth is no longer a consideration.
AMD always claims higher bandwidth than Nvidia. That's the opposite of the DeepSeek claim you made, that you "no longer need to make sure you have access to the massive bandwidth Nvidia". What I'm discovering is that you don't know what you're talking about.
Jesus Christ. Nvidia currently has an 85%+ market share in the AI space; that doesn't equate to AMD selling anything.
Your maths is something else.
As for Triton, it's ~80% less efficient than CUDA; why do you think that is? What does it ultimately compile down to when running on an Nvidia GPU?
Less efficient for what, GHG? Which tasks are you trying to program or benchmark?
Last year Unsloth AI achieved 2.2x faster LLM inference with 70% less RAM usage by converting sequential code paths into parallel ones using Triton kernels.
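Roughly what that looks like in practice: a fused elementwise op written once in Python and JIT-compiled into a parallel GPU kernel. This is just a minimal sketch, not Unsloth's actual code; the kernel name, the operation, and the block size are made up for illustration.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def scale_add_kernel(x_ptr, y_ptr, out_ptr, scale, n_elements,
                     BLOCK_SIZE: tl.constexpr):
    # Each program instance processes one block of elements in parallel.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    # Fused multiply-add done in one pass, no intermediate tensors.
    tl.store(out_ptr + offsets, x * scale + y, mask=mask)


def scale_add(x: torch.Tensor, y: torch.Tensor, scale: float) -> torch.Tensor:
    out = torch.empty_like(x)
    n_elements = x.numel()
    grid = (triton.cdiv(n_elements, 1024),)
    scale_add_kernel[grid](x, y, out, scale, n_elements, BLOCK_SIZE=1024)
    return out


# Usage (hypothetical):
# x = torch.randn(1 << 20, device="cuda")
# out = scale_add(x, x, 0.5)
```

Triton compiles this down to PTX for Nvidia GPUs, which is the answer to the "what does it ultimately convert to" question above.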
Is your whole premise that Nvidia got its 85% market share because of CUDA? Look at the big tech firms. None of them are using CUDA. You're missing the whole reason Nvidia sells like it does: total cost of ownership, not CUDA. Even with AMD posting better theoretical numbers on paper, their racks and implementation at these big AI-farm scales are total shit.
DeepSeek is using its own low-level solution; it has no reliance on CUDA at any level - hence the efficiency gains.
That's nothing new, and it has nothing to do with inference speed or computational parallelism. They went to the metal with near-assembly language on the GPU. What does that have to do with the GPU vendor? There's a mountain of alternatives to CUDA.
People pick CUDA / PyTorch not for the best performance; they pick it for ease of implementation and ease of coding. Do you have any idea how hard it is to implement DeepSeek? No, you don't. It doesn't remove the need for broad, easy-to-use coding frameworks. This is equivalent to saying we should never have made an API because assembly exists. Sure, nothing beats it on raw performance, but good luck coding.
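To put the ease-of-implementation point in concrete terms: the same kind of fused elementwise op that takes a hand-written Triton or CUDA kernel is a one-liner at the PyTorch level. A trivial, hypothetical example (function and variable names are made up):

```python
import torch


def scale_add(x: torch.Tensor, y: torch.Tensor, scale: float) -> torch.Tensor:
    # PyTorch dispatches this to prebuilt vendor kernels (CUDA libraries on
    # Nvidia, ROCm on AMD); no GPU kernel code is written by hand.
    return x * scale + y


# Usage (hypothetical):
# x = torch.randn(1 << 20, device="cuda")
# out = scale_add(x, x, 0.5)
```

That convenience, not peak performance, is what most teams are actually buying.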
But yes, anyone who sees the benefit of what's been introduced here and understands the threat this poses to Nvidia's business specifically is a "midwit" who likes to make hentai porn in their basement. Perfect summary.
Yup
Also a business dimwit who has never heard of the Jevons paradox. It's quite cute.