OpenAI hits back at DeepSeek with o3-mini reasoning model
OpenAI says faster, more accurate STEM-focused model will be free to all users.
Can I download their model and run it on my own hardware? No? Then it's inferior to DeepSeek.
In fairness, unless you have about 800GB of VRAM/HBM you're not running true DeepSeek yet. The smaller models are Llama or Qwen distilled from DeepSeek R1.
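That 800GB figure checks out with some back-of-envelope math. The sketch below is illustrative only (the parameter count and bytes-per-weight are assumptions, not official specs):

```python
# Rough VRAM estimate for holding a model's weights in memory.
# DeepSeek R1 is reported as a ~671B-parameter MoE released in FP8
# (~1 byte per weight); KV cache and activations push the practical
# requirement higher, toward the ~800GB mentioned above.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in gigabytes."""
    return num_params * bytes_per_param / 1e9

# ~671B parameters at FP8 -> ~671 GB for weights alone.
full_fp8 = weight_memory_gb(671e9, 1.0)

# By contrast, a 14B model at 4-bit quantization (~0.5 bytes/param)
# needs only ~7 GB and fits a 16GB GPU with room for context.
small_q4 = weight_memory_gb(14e9, 0.5)

print(f"671B @ FP8: ~{full_fp8:.0f} GB")
print(f" 14B @ Q4 : ~{small_q4:.0f} GB")
```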
I'm really hoping DeepSeek releases smaller models that I can fit on a 16GB GPU and try at home.
Qwen 2.5 is already amazing for a 14B, so I don't see how DeepSeek can improve that much with a new base model, even if they continue to train it.
Perhaps we need to meet in the middle: have quad-channel APUs like Strix Halo become more common, and release 40-80GB MoE models. Perhaps bitnet ones?
Or design them for asynchronous inference.
I just don't see how 20B-ish models can perform like ones an order of magnitude bigger without a paradigm shift.
Sure, but I can run the decensored quants of those distills on my PC. I don't even need to open the article to know that OpenAI isn't going to let me do that, so it isn't really relevant.
Dude, you made me laugh so much!
I'd like to see OpenAI compare themselves to other models aside from their own.
I wonder how much this puff piece cost OpenAI? Pretty cheap compared to the damage of being caught with their hand in the proverbial cookie jar.
Yeah ok we get it, they just release the latest checkpoint of their continuously trained model whenever convenient and make big headlines out of it.
Someone please write a virus that deletes all knowledge from LLMs.
Deleting data from them might not be feasible, but there are other tactics.
[...] trapping AI crawlers and sending them down an "infinite maze" of static files with no exit links, where they "get stuck" and "thrash around" for months, he tells users. Once trapped, the crawlers can be fed gibberish data, aka Markov babble, which is designed to poison AI models.
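"Markov babble" here means text generated from a Markov chain: statistically word-like but semantically empty. A minimal sketch of the idea (function names are illustrative, not from any particular tarpit tool):

```python
import random
from collections import defaultdict

def build_chain(text: str) -> dict:
    """Map each word to the list of words that follow it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for a, b in zip(words, words[1:]):
        chain[a].append(b)
    return chain

def babble(chain: dict, start: str, length: int, seed: int = 0) -> str:
    """Walk the chain randomly to emit plausible-looking gibberish."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the model reads the text and the crawler reads the model again"
chain = build_chain(corpus)
print(babble(chain, "the", 8))
```

Because the output preserves local word statistics while carrying no meaning, a crawler that scrapes it feeds its training corpus low-value noise.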
How this impacts me, someone who doesn't use AI: