- cross-posted to:
- technology@lemmy.zip
- technology@beehaw.org
- technology@lemmy.ml
cross-posted from: https://lemm.ee/post/55428692
Some of the distillations are trained on top of Qwen 2.5.
And in certain niches, FuseAI (a merge of several thinking models), Qwen Coder, EVA-Gutenberg Qwen, or other specialized models do a better job than the DeepSeek 32B distill.
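
If you want to try one of the Qwen-2.5-based distills yourself, here's a minimal sketch using Hugging Face transformers; the Hub model ID and generation settings are just assumptions for illustration, not something from the post above.

```python
# Minimal sketch: load a Qwen-2.5-based DeepSeek R1 distill and run one prompt.
# The model ID below is an assumed Hugging Face Hub ID; swap in whichever
# distill or merge (FuseAI, Qwen Coder, etc.) you actually want to compare.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"  # assumption

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Explain quicksort briefly."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and print only the newly produced tokens.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```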