Maybe so… but I feel like this is one of the biggest mistakes in the post-WWII order. Imagine what things would be like if the UN had more teeth (and no Security Council).
Why would people want to be on Twitter for posts like this?
It’s like joining a real-life club where the more of a jerk you are, the more speaking time you get.
“You know this war we’re slow-walking? That we keep almost negotiating a ceasefire to? Yeah, let’s keep sabotaging that, and start another war, just in case.”
I also kinda sympathize with American allies looking on from the outside now, as the U.S. is no doubt going to support this unconditionally, because… yeah.
I’m really shocked at how many people (apparently) shell out for OnlyFans subscriptions and such.
I swear I’m the sane one here, not the insane one.
How to turn your country into North Korea, 101.
Good! Try the IQ3_M, IQ3_XS, and IQ3_XXS quantizations as well, especially if you try a 14B, as they “squeeze” the model into less space better than the Q3_K quantizations.
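In case it helps, this is roughly how I pull one specific quant out of a GGUF repo with huggingface_hub. The repo and filename here are placeholders, not a specific upload; swap in whatever quant you actually want:

```python
# Hypothetical example: download just the IQ3_XXS file from a GGUF repo.
# repo_id and filename are placeholders for whatever upload you're using.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="someuser/Some-14B-Instruct-GGUF",   # placeholder repo
    filename="some-14b-instruct.IQ3_XXS.gguf",   # IQ3_XXS: the smallest of the three
)
print(path)  # local cache path you can point llama.cpp at
```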
Yeah, I’m liking the 32B as well. If you are looking for speed just for utilitarian Q/A, you might want to keep a DeepSeek Coder V2 Lite GGUF on hand, as it’s uber fast partially offloaded.
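For the “uber fast partially offloaded” part, this is roughly what I mean with llama-cpp-python. The filename and layer count are placeholders; tune n_gpu_layers until the model just fits in your VRAM:

```python
# Rough sketch, not exact settings: partial GPU offload with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="deepseek-coder-v2-lite.Q4_K_M.gguf",  # placeholder filename
    n_gpu_layers=20,  # placeholder: offload what fits, run the rest on CPU
    n_ctx=4096,       # a small context keeps the KV cache cheap for quick Q/A
)

out = llm("Q: How do I reverse a list in Python?\nA:", max_tokens=64)
print(out["choices"][0]["text"])
```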
I like how much it obsesses over your post’s engagement metrics. I feel like that creates toxic incentives, but… shrug. What do I know?
Same issue here.
I think it’s just a temporary issue, but I really will stop using Reddit if old.reddit.com goes down. There are niches that just aren’t on here… but that UI.
A Qwen 2.5 14B IQ3_M should completely fit in your VRAM, with longish context, with acceptable quality.
An IQ4_XS will just barely overflow but should still be fast at short context.
And while I have not tried it yet, the 14B is allegedly smart.
Also, what I do on my PC is hook up my monitor to the iGPU so the GPU’s VRAM is completely empty, lol.
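Anyway, a minimal sketch of what “completely fit in your VRAM, with longish context” looks like in llama-cpp-python. The filename and context length are assumptions; n_ctx is the first thing to shrink if you overflow:

```python
# Sketch: full GPU offload with a longish context. Filename is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen2.5-14B-Instruct-IQ3_M.gguf",  # placeholder filename
    n_gpu_layers=-1,   # -1 = offload every layer to the GPU
    n_ctx=16384,       # "longish" context; drop this first if you run out of VRAM
)
```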
Oh, and you HAVE to try the new Qwen 2.5 14B.
The whole lineup is freaking sick, with the 32B outscoring Llama 3.1 70B in a lot of benchmarks, and in personal use it feels super smart.
You can try a smaller IQ3 imatrix quantization to speed it up, but 22B is indeed tight for 8GB.
If someone comes out with an AQLM for it, it might completely fit in VRAM, but I’m not sure it would even work for a Pascal card TBH.
Especially if you’re mega rich.
“Well, one lesson I’ve learned is that just because I say something to a group and they laugh doesn’t mean it’s going to be all that hilarious as a post on X,” he said in a follow-up post early Monday. “Turns out that jokes are WAY less funny if people don’t know the context and the delivery is plain text.”
I’ve known people like this in real life, who’d say something horrible and follow it up with “It’s just a joke,” but only when they ‘lose’ and get called out on it.
They’re slimy jerks, and it’s utterly miserable to even be around them. And I don’t understand why so many would worship/follow Elon and dwell on Twitter for it.
It’s still everywhere in my news/internet diet.
It’s bleeding, for sure, but it’s big. It’s gone bad. But I think it’s premature to say its collapse is a good thing, because it just won’t go away.
It’s not dead, though; it’s still linked from everywhere, from big news to niche communities, because it still has that critical mass and inertia.
And I hate to be cynical about the Fediverse, but realistically, what replaces Twitter, at least here in the US? Discord? No, thanks; I’d at least rather have information be public.
I’m speaking as someone who has never used Twitter, but I can’t ignore it, as much as I’d like to.
The behavior is configurable just like it is on Linux; UAC can be set to require a password every time.
But I think it’s not set this way by default because many users don’t remember their passwords, lol. If you think I’m kidding, you should meet my family…
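For reference, this is the knob I mean, as far as I know; the key and value names here are from memory rather than official docs, so treat it as a sketch:

```python
# Sketch (key/value names from memory, not official docs): read the UAC
# admin elevation policy. 1 = prompt for credentials (a password every time,
# sudo-style); the usual consumer default is 5 (consent prompt, no password).
import winreg

KEY = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"
with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY) as k:
    value, _ = winreg.QueryValueEx(k, "ConsentPromptBehaviorAdmin")
print("ConsentPromptBehaviorAdmin =", value)
```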
Also, scripts can do plenty without elevation, on Linux or Windows.
The problem is that splitting models up over a network, even over LAN, is not super efficient. The entire set of weights needs to be run through for every token (half a word or so).
And the other problem is that Petals just can’t keep up with the crazy dev pace of the LLM community. Honestly they should dump it and fork or contribute to llama.cpp or exllama, as TBH no one wants to split up Llama 2 (or even Llama 3) 70B and sit a generation or two behind, on a base instruct model instead of a finetune.
Even the horde has very few hosts relative to users, even though hosting a small model on a 6GB GPU would get you lots of karma.
The diffusion community is very different, as the output is one image and even the largest open models are much smaller. LoRA usage is also standardized there, while it is not in LLM land.
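For anyone curious what Petals usage actually looks like, it’s roughly this. I’m going from memory of their README, and the model name is just a placeholder for whatever the public swarm happens to host:

```python
# Hedged sketch of Petals distributed inference, from memory of the README.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

name = "petals-team/StableBeluga2"  # placeholder: any model the swarm hosts
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoDistributedModelForCausalLM.from_pretrained(name)  # layers run on remote peers

inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)  # every token round-trips the swarm
print(tokenizer.decode(outputs[0]))
```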
TBH this is a great space for modding and local LLMs / LLM “hordes”.
Isn’t that a massive security risk?
Like, what if the U.S. was using Roscosmos satellite links in drones? I’d certainly be raising an eyebrow.