cross-posted from: https://lemmy.sdf.org/post/29607342
Here is the data on Hugging Face.
A team of international researchers from leading academic institutions and tech companies upended the AI reasoning landscape on Wednesday with a new model that matched—and occasionally surpassed—one of China’s most sophisticated AI systems: DeepSeek.
OpenThinker-32B, developed by the Open Thoughts consortium, achieved a 90.6% accuracy score on the MATH500 benchmark, edging past DeepSeek’s 89.4%.
The model also outperformed DeepSeek on general problem-solving tasks, scoring 61.6 on the GPQA-Diamond benchmark compared to DeepSeek’s 57.6. On the LCBv2 benchmark, it hit a solid 68.9, showing strong performance across diverse testing scenarios.
…
I really feel a smaller, high-quality dataset would beat feeding it anything and everything.
Is this libre software?
Model weights, datasets, data generation code, evaluation code, and training code are all publicly available.
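If you want to poke at the training data yourself, here's a minimal sketch using the `datasets` library. The dataset id `open-thoughts/OpenThoughts-114k` is an assumption on my part; check the project's Hugging Face page for the exact name:

```python
# Minimal sketch: pull the Open Thoughts training data from the Hugging Face Hub.
# Assumption: the dataset lives at "open-thoughts/OpenThoughts-114k" -- verify
# the exact id on the project's Hugging Face page.
from datasets import load_dataset

ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
print(ds)     # dataset size and column names
print(ds[0])  # inspect one training example
```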
Hey, I came late to the party. I have a CS background, but I'm far from AI. Can you point me to a resource where I can learn how to use all of "those things" you mention?
Had the same problem, and someone pointed me to the Hugging Face docs/tutorials.
They are quite nice for getting a local model up and running, playing around with it, fine-tuning it, and connecting it with agents. Haven't tried much yet, but the articles were exactly what I was looking for.
Hope it helps you as well!

Thank you!
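For anyone who wants to try the model itself, here's a minimal sketch of running OpenThinker-32B locally with the `transformers` library. It assumes you have enough GPU memory for a 32B model (the model id comes from the Hugging Face page linked in the next comment):

```python
# Minimal sketch: run OpenThinker-32B locally with Hugging Face transformers.
# Assumption: enough GPU memory is available for a 32B model; device_map="auto"
# will shard it across available devices via accelerate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to cut memory use
    device_map="auto",
)

messages = [{"role": "user", "content": "What is 17 * 23? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```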
As @naeap@sopuli.xyz said, it's on their Hugging Face site (here's the link again: https://huggingface.co/open-thoughts/OpenThinker-32B); all the links are just below the first table.
But is it libre? Looks like it is all Apache 2.0, so yes.