cross-posted from: https://lemmy.sdf.org/post/29607342
Here is the data on Hugging Face.
A team of international researchers from leading academic institutions and tech companies upended the AI reasoning landscape on Wednesday with a new model that matched—and occasionally surpassed—one of China’s most sophisticated AI systems: DeepSeek.
OpenThinker-32B, developed by the Open Thoughts consortium, achieved a 90.6% accuracy score on the MATH500 benchmark, edging past DeepSeek’s 89.4%.
The model also outperformed DeepSeek on general problem-solving tasks, scoring 61.6 on the GPQA-Diamond benchmark compared to DeepSeek’s 57.6. On the LCBv2 benchmark, it hit a solid 68.9, showing strong performance across diverse testing scenarios.
…
I really feel a smaller, high-quality dataset would beat feeding it anything and everything.
Is this libre software?
Model weights, datasets, data generation code, evaluation code, and training code are all publicly available.
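If you want to poke at the training data yourself, here's a minimal sketch using the `datasets` library. The dataset id `open-thoughts/OpenThoughts-114k` is an assumption on my part; check the project's Hugging Face page for the exact name:

```python
# Minimal sketch: pull the Open Thoughts training data from the Hugging Face Hub.
# Assumption: the dataset lives at "open-thoughts/OpenThoughts-114k" -- verify
# the exact id on the project's Hugging Face page.
from datasets import load_dataset

ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
print(ds)     # dataset size and column names
print(ds[0])  # inspect one training example
```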
Hey, I came late to the party. I have a CS background, but I'm far from AI. Can you point me to a resource where I can learn how to use all of "those things" you mention?
Had the same problem, and someone pointed me to the Hugging Face docs/tutorials.
They are quite nice for getting a local model up and running, playing around with it, fine-tuning it, and connecting it with agents. Haven't tried much yet, but the articles were exactly what I was looking for.
Hope it helps you as well!

Thank you!
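For anyone who wants to try the model itself, here's a minimal sketch of running OpenThinker-32B locally with the `transformers` library. It assumes you have enough GPU memory for a 32B model (the model id comes from the Hugging Face page linked in the next comment):

```python
# Minimal sketch: run OpenThinker-32B locally with Hugging Face transformers.
# Assumption: enough GPU memory is available for a 32B model; device_map="auto"
# will shard it across available devices via accelerate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to cut memory use
    device_map="auto",
)

messages = [{"role": "user", "content": "What is 17 * 23? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```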
As @naeap@sopuli.xyz said, it's on their Hugging Face site (here's the link again: https://huggingface.co/open-thoughts/OpenThinker-32B); all the links are just below the first table.
But is it libre? Looks like it is all Apache 2.0, so yes.