
NVIDIA stock to lose $400 billion (US tech $1 trillion) after DeepSeek release

Fess

Member
My son works in a manufacturing plant and I can easily see his job being affected by the same scenario. I think these things can be useful to a point, but can also be incredibly shortsighted if the goal of all this is just to make more money for stockholders, everything else be damned.
Yeah, they need to have long-term plans or things will end badly.
And AI, while useful as you say, can make humanity at large plain stupid if we stop seeing the need to study and learn things properly. There can already be a knowledge gap between older and younger people in certain areas when it comes to knowing how something works from the ground up. AI will accelerate this.
 

PaintTinJr

Member
They are using NVIDIA chips -- they admit that openly in their paper.

But the chips are H800s, a bit slower and cheaper, instead of the cutting edge new chips that NVIDIA is hyping as the next obligatory hardware for the next wave of models. So it's just a matter of the market thinking "maybe chip scaling isn't the factor anymore." Even then, it's not really a logical reason to dump stock, given how dominant NVIDIA remains.

The H800s, by the way, are actually slower because we did that on purpose lol; the US required chips sold to China to be downscaled in their capabilities.


But DeepSeek shows this really doesn't matter... the highest powered chip isn't the prime factor for innovating or winning in AI research.
I think 'winning in AI research' is eventually going to be far more nuanced than comparative test scores, after trying DeepSeek R1 at 14B and giving it questions I would normally throw at CoPilot.

By comparison, DeepSeek seemed to make fewer mistakes with what it provided, but the overall quality of the interaction and the wider thinking seemed lacking, and it came across as pretty lazy in not actually doing the tasks as required when called out to do them.

CoPilot doesn't always do things either, but it rarely fails to do the task after being chastised in a follow-up question, even if it does it less than perfectly, whereas DeepSeek just comes across like a bullshitter getting called on something and then giving more of the "what you want to do now is..." (do the task yourself, in effect).

So I think DeepSeek wasn't very helpful at 14B and 7B for my usual AI questions, despite doing a great job of convincing me it was going to do great, right up until the end when it fell massively short.
 
Last edited:

Buggy Loop

Member


Walter White I Give Up GIF by Breaking Bad


What a bunch of wankers lol

This is 100% distilled

USA to close the AI pipeline from running unhindered in those regions for sure.
 

GHG

Gold Member


Walter White I Give Up GIF by Breaking Bad


What a bunch of wankers lol

This is 100% distilled

USA to close the AI pipeline from running unhindered in those regions for sure.


Download 32B and run it on your system. Ask it what Qwen is and who it's made by and let us know what it says.
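If you'd rather script it than type in the terminal, here's a minimal sketch using the ollama Python client. I'm assuming the distill is published under the deepseek-r1:32b tag; swap in whichever size fits your card:

import ollama  # pip install ollama; assumes the Ollama server is running locally

# The "deepseek-r1:32b" tag is an assumption; adjust to the distill size you want.
ollama.pull("deepseek-r1:32b")  # downloads the 32B distill on first use

reply = ollama.chat(
    model="deepseek-r1:32b",
    messages=[{"role": "user", "content": "What is Qwen and who is it made by?"}],
)
print(reply["message"]["content"])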
 

Buggy Loop

Member
Download 32B and run it on your system. Ask it what Qwen is and who it's made by and let us know what it says.

I know what Qwen is, it's Alibaba's; it still fucks up, GHG. You do not mess up this kind of thing. Claude doesn't say "I'm ChatGPT, made by OpenAI". These are trademarks to begin with, not something you just stumble upon while trying to make an answer.

A slip of the tongue, as humans say. I'm sure they're training it not to mention OpenAI again, but it's a bit too late.

No, I ain't moving stuff around the SSD for that model lol






[Screenshot: DeepSeek says it's a version of ChatGPT]


Sometimes DeepSeek R1 slips out that it's Claude or OpenAI

[Screenshot: DeepSeek says it's a version of ChatGPT]


[Screenshot: DeepSeek says it's a version of ChatGPT]

[Screenshot: asked DeepSeek if it had a mobile app]


But yea, continue GHG, doing an amazing job for DeepSeek lol

[Screenshot: DeepSeek says it's a version of ChatGPT]
 

ResurrectedContrarian

Suffers with mild autism
Just to be clear -- the versions of R1 that are small (e.g. 32B) are not the actual R1 model at all -- those are distillations on top of other models (Qwen, Llama, etc.). They're meant to show the downstream benefits of applying R1 to another model as a teacher, but they aren't the same thing as the actual model.
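Roughly: the big R1 writes out reasoning traces, and the smaller base model is fine-tuned on them. A toy sketch of that idea -- the model name, data, and hyperparameters below are purely illustrative, not DeepSeek's actual pipeline:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Student is a small open base model; the teacher traces would come from the full R1
# (e.g. via an API). Hard-coded here so the sketch stays self-contained.
student_name = "Qwen/Qwen2.5-7B"
tok = AutoTokenizer.from_pretrained(student_name)
student = AutoModelForCausalLM.from_pretrained(student_name)
opt = torch.optim.AdamW(student.parameters(), lr=1e-5)

pairs = [
    ("Is 17 prime?", "<think>Check divisors up to 4: neither 2 nor 3 divides 17.</think> Yes, 17 is prime."),
]

# Plain causal-LM fine-tuning on prompt + teacher trace (sequence-level distillation).
student.train()
for prompt, trace in pairs:
    batch = tok(prompt + "\n" + trace, return_tensors="pt")
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    opt.step()
    opt.zero_grad()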
 

Buggy Loop

Member
Just to be clear -- the versions of R1 that are small (e.g. 32B) are not the actual R1 model at all -- those are distillations on top of other models (Qwen, Llama, etc.). They're meant to show the downstream benefits of applying R1 to another model as a teacher, but they aren't the same thing as the actual model.

So when using the app or the web browser, it's the full model via the cloud, right?
 
Last edited:

ResurrectedContrarian

Suffers with mild autism
So when using the app or the web browser, it's the full model via the cloud, right?
Correct.

Some third-party providers also host the full model, since the full weights are openly provided. It just takes a ton of RAM across multiple GPUs to run (at least 400 GB of VRAM), unlike the small distilled versions, so you'll notice the pricing isn't anywhere near the low prices of inference on small models like Llama. For instance, Fireworks hosts the full model for API use: https://fireworks.ai/models/fireworks/deepseek-r1
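For what it's worth, their endpoint is OpenAI-compatible, so calling the full model looks roughly like this (the model ID string is my guess from their catalog naming -- verify it on the page above):

from openai import OpenAI

# Fireworks exposes an OpenAI-compatible API; the model ID below is assumed
# from their catalog naming and should be checked against the linked page.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",
)

resp = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-r1",
    messages=[{"role": "user", "content": "In one sentence, what are you?"}],
)
print(resp.choices[0].message.content)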
 
Last edited:

GHG

Gold Member
I know what Qwen is, it's Alibaba's; it still fucks up, GHG. You do not mess up this kind of thing. Claude doesn't say "I'm ChatGPT, made by OpenAI". These are trademarks to begin with, not something you just stumble upon while trying to make an answer.

A slip of the tongue, as humans say. I'm sure they're training it not to mention OpenAI again, but it's a bit too late.

No, I ain't moving stuff around the SSD for that model lol






[Screenshot: DeepSeek says it's a version of ChatGPT]


Sometimes DeepSeek R1 slips out that it's Claude or OpenAI

[Screenshot: DeepSeek says it's a version of ChatGPT]


[Screenshot: DeepSeek says it's a version of ChatGPT]

[Screenshot: asked DeepSeek if it had a mobile app]


But yea, continue GHG, doing an amazing job for DeepSeek lol

[Screenshot: DeepSeek says it's a version of ChatGPT]


Why are you so defensive?

I didn't ask you if you knew what Qwen was. Good grief.

You don't need to move anything around either. One single line via PowerShell (assuming you know how to use Ollama) and it's installed.

Just to be clear -- the versions of R1 that are small (e.g. 32B) are not the actual R1 model at all -- those are distillations on top of other models (Qwen, Llama, etc.). They're meant to show the downstream benefits of applying R1 to another model as a teacher, but they aren't the same thing as the actual model.

Yep, this is known. It's also important to note that people should be selecting the distillation appropriate for their hardware:

[Image: chart of distilled model sizes and hardware requirements]


The larger the model, the closer it will be to emulating the full model.
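As a rough rule of thumb for what fits: VRAM is roughly parameter count times bytes per weight, plus some margin for the KV cache and activations. A quick back-of-the-envelope calculation (approximations, not measured requirements):

# Rough VRAM estimate: parameters * bytes-per-weight, plus ~20% for KV cache/activations.
SIZES_B = [1.5, 7, 8, 14, 32, 70]                  # distill sizes, billions of parameters
BYTES_PER_WEIGHT = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}

for params_b in SIZES_B:
    cols = []
    for name, bpw in BYTES_PER_WEIGHT.items():
        gb = params_b * bpw * 1.2                  # billions of params * bytes each = GB, +20%
        cols.append(f"{name} ~{gb:.0f} GB")
    print(f"{params_b:>4}B  " + "  ".join(cols))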
 
Last edited:

sono

Gold Member
OpenAI and AI Czar David Sacks accuse DeepSeek of stealing their IP to train their new R1 model, citing hard evidence gained from Microsoft.

 