Claude? No. Cucumbers? Yes!

SuspiciousCarrot78@aussie.zone · 3 hours ago

I actually have a theory here…I think there’s a bare basement level of capability that a model needs to reach…and above that, deterministic tooling can do the rest. We’ve just been yeeting into a black box.

Why that matters is this - if you can make a 450M model do what a 7B model does…that has a huge set of implications (see above examples), not least of which is for us GPU poors.

I’m doing some smoke testing on this idea right now for what I’m calling an ‘expert system’, where the model is treated like a squawk box and the infrastructure around it provides the brains (not RAG, per se. More like sidecars or tool calling). I’m liking what I see so far but there’s lots of fucking work to go. There may yet be a cheat code for some of the NVIDIA tax, if we take the work outside of the magic parrot :)

SuspiciousCarrot78@aussie.zone · 3 hours ago

No? Just me then. How about this - 99% accurate COPD cough count…with an itty bitty convolutional model, on a $30 Arduino.

https://www.edgeimpulse.com/blog/ai-dont-like-the-sound-of-that-cough/

Why this might be cool. Different coughs correlate to different conditions (aka there is work going on in cough acoustics as a diagnostic signal / proxy for spirometry and breath sounds).

The above was trained on his coughs…it’s not far from there to “was that a healthy cough, wet cough, dry cough, wheeze? Is this a Blue Bloater or Pink Puffer?”

I’ve long suspected PoC (Point Of Care) systems could be adapted to use language models. Imagine - Qwen3.5-2B (with --mmproj) that lives on your phone…and you can point it at a mole or freckle and ask “hey…is this fucky or what” - and it actually KNOWS because it has access to DermNZ and can classify based on ABCDEs.

SuspiciousCarrot78@aussie.zone · 6 hours ago

Claude? No. Cucumbers? Yes!

SuspiciousCarrot78@aussie.zone · 6 hours ago

Agreed. I’m all in on home lab / local LLM stuff. And entirely OUT on microslop.

(Which reminds me, I need to turn my GitHub into a billboard for Codeberg and then strip GitHub. Watching the traffic count on GitHub, the only clear signal I see is “bots crawl this shit daily; enjoy”, despite being politely told no.)

SuspiciousCarrot78@aussie.zone · 12 hours ago

I’m with you on this I think.

I have no problem with anyone using an AI scribe (though I would prefer one that was on device rather than cloud based). I am aware of things like Lyrebird Health that integrate with EHR management software - frankly, anything that allows the practitioner to focus more solely on the patient is a good thing. After all, they are meant to be treating the patient in front of them, not the computer screen.

The prior point about legal liability is accurate IMHO. Medical health records are functionally a legal record, and should be treated as such. Responsibility for review, redress of inaccuracies etc cannot be waved away as “ChatGPT did it”. If the practitioner is willing to take the onus of that on, and treats the scribed document with the same fidelity, chain of provenance etc as other records, I’m probably ok with it.

Requiring patients to consent to cloud-based AI scribing as a condition of access is where it gets uncomfortable, and your point about local alternatives is exactly why. If deterministic, on-device transcription exists and does the job, the justification for mandating a cloud pipeline through a psychiatric service gets pretty dicey, pretty fast.

I think I can see a way to have Dragon Dictate record the audio, convert it to text and then have on device AI pull out relevant bits to populate a template. That doesn’t abrogate the need to actually LISTEN to the patient but it might fix that ‘capture’ part of the funnel.
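A rough sketch of that capture step, in Python. Everything in it is an assumption for illustration: the field names, the cue words and the `populate` helper are made up, and plain keyword matching is only standing in for whatever the on-device model would do. The bit worth keeping is that every filled field remembers which transcript sentence it came from, so the practitioner can review it against the source.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FilledField:
    text: str             # sentence copied verbatim from the transcript
    source_sentence: int  # index into the transcript, kept for review


# Hypothetical template: field name -> cue words (illustrative only).
TEMPLATE_CUES: Dict[str, List[str]] = {
    "sleep": ["sleep", "insomnia"],
    "medication": ["medication", "dose"],
}


def split_sentences(transcript: str) -> List[str]:
    """Naive sentence split on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", transcript)
    return [p.strip() for p in parts if p.strip()]


def populate(transcript: str) -> Dict[str, List[FilledField]]:
    """Fill each template field with the sentences that mention its cues."""
    filled: Dict[str, List[FilledField]] = {name: [] for name in TEMPLATE_CUES}
    for index, sentence in enumerate(split_sentences(transcript)):
        lowered = sentence.lower()
        for name, cues in TEMPLATE_CUES.items():
            if any(cue in lowered for cue in cues):
                filled[name].append(FilledField(sentence, index))
    return filled


if __name__ == "__main__":
    demo = "I have not been able to sleep. The new dose makes me dizzy. Work is fine."
    for name, hits in populate(demo).items():
        print(name, [(h.source_sentence, h.text) for h in hits])
```

Swapping the keyword match for a small local model changes one function; the template and the provenance index stay deterministic either way.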

SuspiciousCarrot78@aussie.zone · 3 days ago

I hear you; I’m not wildly enamored with reddit either…but that convo is a good springboard.

I see almost everyone chasing bigger GPUs, more parameters, more more more. I figure when 9 people say “go right”, there should be at least someone that can make the plausible case for “actually, here’s why go left works”.

Eg: I think there should be some discussion about watts per token vs tokens per second.
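To make that concrete, here is a back-of-envelope sketch in Python. Power in watts is joules per second, so dividing average draw by throughput gives joules per token. The two rigs and their numbers are invented for illustration, not measurements.

```python
def joules_per_token(avg_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token: power (J/s) divided by throughput (tokens/s)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return avg_watts / tokens_per_second


# Hypothetical rigs; wattage and speed are made-up example figures.
rigs = {
    "big_gpu_box": {"watts": 350.0, "tps": 70.0},
    "mini_pc_cpu": {"watts": 35.0, "tps": 10.0},
}

if __name__ == "__main__":
    for name, rig in rigs.items():
        energy = joules_per_token(rig["watts"], rig["tps"])
        print(f"{name}: {rig['tps']} tok/s, {energy} J/token")
```

With these made-up figures the slower box is 7x behind on tokens per second but uses less energy per token, which is the kind of trade-off the tokens-per-second number alone hides.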

I’m still re-writing the FAQ for my project - when it’s done (and if there’s interest) I will post it here.

SuspiciousCarrot78@aussie.zone · 3 days ago

On the broader topic of “the llm is the mouth, not the brain”, I just stumbled across this.

https://www.atomelm.com/index.html#what

https://www.atomelm.com/index.html#prototypes

Might turn out to be something yet, dunno. Web demo is a bit meh.

SuspiciousCarrot78@aussie.zone · 3 days ago

Thanks for that. I’ve been meaning to (re) disable Google Play services. I have a few older phones too that never had it to begin with. I wonder how/if Aurora Store will be impacted. Presumably, if you don’t have Google Play Services functioning, you don’t get the poison pill. But…given that Big Evil likes to just … do shit (cf the recent 4GB forced ingestion of their LLM with Chrome) I dunno.

In any case, step 1 is probably nuking that.

SuspiciousCarrot78@aussie.zone · 3 days ago

Granite is much more straight laced. Qwen is more expressive. Honestly, it reminds me a lot of early days with GPT 4 class models (and the benchmarks show it about matches that, too).

SuspiciousCarrot78@aussie.zone · 3 days ago

Cool. So what happens if I run a version of Android that doesn’t inherit Google security theater cruft? That is to say…what if the user simply…does not…upgrade the Android version to be affected by this (eg: uses an old phone or blocks OS version update?).

My phone is going on 7yrs old. Perfectly happy with it. When it breaks, I will get a phone of the same era (2nd hand or new-old stock) or investigate other options.

So, it seems to me, the winning move is not to play the game (in any one of 100 diff ways).

Or am I missing something here? Is there something that will prevent older tech from working? Because if so, I am happy to YOLO my phone and switch to a dumbphone if I have to.

SuspiciousCarrot78@aussie.zone · 3 days ago

Good man/woman. Nerd Valhalla awaits you :)

SuspiciousCarrot78@aussie.zone · 3 days ago

Hey, me too :) As my school teachers used to tell me “Great minds think alike (but fools seldom differ :)”

For me, I’m thinking of having an LLM as one layer / one container in a homelab that does some specific stuff:

queries against local docs / notes / manuals / PDFs / wiki material as the trusted knowledge layer
uses tools for search, file lookup, shell, git, Docker, Home Assistant, calendar, etc.
a local “Codex” / wiki layer that turns my own source material into an inspectable knowledge base
provenance and audit trails

I want to take a screenshot of something, drop it into Syncthing from my phone, then later ask “did I fuck the pins on this?” … and for it to look up the schematics, eyeball the pins and tell me. Or I say “hey, can you grab a copy of X for me, usual params” and have the LLM instruct Sonarr/Radarr/SABnzbd to do that. (That is, make your OWN “Alexa” with an Arduino ESP32, stick it in a room and then call it when you need it).

So instead of asking a 70B model to “know” why your media server is down, the system checks service status, logs, last config changes, prior notes, Docker state, network state, etc., then the LLM explains the result in human language. You can probably do that with a 4B (I’m testing that assumption now).

Same for “find that motherboard note,” “summarize this email thread,” “turn this into a task,” “compare this eBay listing to my saved hardware notes,” “what did I do last time this broke,” or “run the smoke test and tell me the first real failure.”

I think small models are the shit for this because if the model only has to classify intent, route the request, render structured evidence, and talk like a normal human…then it doesn’t need to be a giant oracle. The expensive (time wise) part becomes less “make the model smarter” and more “build a better control plane around it.”

Basically: local LLM as semantic HID; expert system/tool router underneath; user owns the data and the machine.
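Here is a minimal Python sketch of that shape, under loud assumptions: the check functions return canned results instead of calling Docker or reading logs, `classify_intent` is a keyword stand-in for the small model, and every name in it is hypothetical. It only shows the division of labour: deterministic code gathers the evidence, and the model is handed a prompt that contains nothing but that evidence.

```python
from typing import Callable, Dict, List

Evidence = Dict[str, str]


# Hypothetical deterministic checks. Real ones would query Docker, systemd,
# log files and so on; these return canned results so the sketch runs anywhere.
def check_service(target: str) -> Evidence:
    return {"check": "service_status", "target": target, "result": "inactive"}


def check_logs(target: str) -> Evidence:
    return {"check": "last_log_line", "target": target,
            "result": "bind: address already in use"}


# The control plane: which tools run for which intent.
ROUTES: Dict[str, List[Callable[[str], Evidence]]] = {
    "diagnose_service": [check_service, check_logs],
}


def classify_intent(text: str) -> str:
    """Keyword stand-in for the small model's intent classification."""
    lowered = text.lower()
    if "down" in lowered or "broken" in lowered:
        return "diagnose_service"
    return "unknown"


def gather_evidence(intent: str, target: str) -> List[Evidence]:
    """Run every tool registered for the intent; unknown intents run nothing."""
    return [tool(target) for tool in ROUTES.get(intent, [])]


def render_prompt(question: str, evidence: List[Evidence]) -> str:
    """Build the prompt the small model would turn into human language."""
    lines = [f"- {e['check']}({e['target']}): {e['result']}" for e in evidence]
    return ("Explain using ONLY this evidence:\n" + "\n".join(lines)
            + f"\nQuestion: {question}")


if __name__ == "__main__":
    question = "Why is my media server down?"
    intent = classify_intent(question)
    print(render_prompt(question, gather_evidence(intent, "jellyfin")))
```

The evidence list doubles as the audit trail: it can be logged as-is, regardless of what the model says about it.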

As always, ICBW…but fuck it, I’m gonna try.

PS: I have an idea of how to apply that to coding too…but that’s a project for much later. I’ve been cooking this shit for far too long. The next thing I wanna do is a fun project for myself (that is: ROM hack a parachute and grappling gun into Super Mario Sunshine, so I can basically play “What if Super Mario Sunshine but actually Just Cause 2” on my Wii with the kids).

SuspiciousCarrot78@aussie.zone · 3 days ago

I’m actually thinking of pivoting my router/orchestrator entirely. I think the way forward is to look at expert systems (yes, those ancient things from the long, long ago of…1980) but with modern tooling (that can be user updated), with a small LLM in the middle that the user can talk to. That is, de-emphasize the central role of the LLM entirely; rather, make it the user-facing NLP input/output and let the real programs, running on real silicon, do the work. I might have a different use case than most, but I bet not so different (that is to say, online LLM discussions seem to gravitate around users who use LLMs for coding; Anthropic and OAI internal reports say otherwise).

Ironically, I’m writing the blurb now while waiting for smoke test #90238472398 to finish.

SuspiciousCarrot78@aussie.zone · 3 days ago

"The cost of running LLMs is just too damn high"

SuspiciousCarrot78@aussie.zone · 7 hours ago

I’d ask why…but “because I fucking wanted to” is an entirely cromulent (and 100% valid) response. Just wish it had some screenshots or videos of it in action that we could geek out over.

EDIT: I need reading glasses, clearly

https://www.youtube.com/watch?v=eGS9su_inBY

The next step for the dev (are you here?) - get IE running and post from your N64 onto this Lemmy thread. I double dog dare you :)

SuspiciousCarrot78@aussie.zone · 3 days ago

What I did was this -

Lenovo M93p Tiny (i7-4785T, 8GB, no GPU): cost $50. I can do up to PS2 at 1.5x, AAA games up to 2014/5 and later indies
Offline (once art scraped by below etc)
Windows 8.1 install (era appropriate, correct drivers, offline, yadda yadda) + ClassicShell
Installed Xbox 360 dongle with drivers
Installed games I wanted / emulators (eg: Dolphin for Wii and GC, PCSX2 for PS2 etc)
Installed Playnite, set it to launch full screen
Defined scripts / launch conditions (e.g., getting AntiMicroX to launch when Luanti launches, so that it can be played with controllers instead of keyboard, then shut down cleanly on return to Playnite)
Replaced Explorer.exe as the default shell in Regedit
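
For anyone wanting to script that last step instead of clicking through Regedit, here is a hedged Python sketch that only builds the `reg add` command and prints it. Assumptions: it targets the per-user `Shell` value under the Winlogon key in HKCU (I did not say above which hive I edited, so treat that as one option), and the Playnite install path is a placeholder to swap for your own. Nothing is written to the registry unless you run the printed command yourself on the Windows box.

```python
from typing import List

# Per-user Winlogon key. The machine-wide equivalent lives under HKLM;
# which one to edit is an assumption of this sketch, not something stated above.
WINLOGON_KEY = r"HKCU\Software\Microsoft\Windows NT\CurrentVersion\Winlogon"

# Placeholder path: point this at wherever Playnite is actually installed.
PLAYNITE_FULLSCREEN = r"C:\Playnite\Playnite.FullscreenApp.exe"


def shell_command(shell_exe: str) -> List[str]:
    """Build a `reg add` command that sets the logon shell to shell_exe."""
    return ["reg", "add", WINLOGON_KEY,
            "/v", "Shell", "/t", "REG_SZ", "/d", shell_exe, "/f"]


def restore_command() -> List[str]:
    """Build the command that puts Explorer back as the shell."""
    return shell_command("explorer.exe")


if __name__ == "__main__":
    # Print only; review before running anything by hand.
    print(" ".join(shell_command(PLAYNITE_FULLSCREEN)))
    print(" ".join(restore_command()))
```

Keep the restore command somewhere handy before logging out, since a typo in the shell path leaves you at a black screen with only CTRL-ALT-DEL to get back.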

End result: turn on PC, boots into Windows (in about 2 seconds), launches Playnite (which is full controller / couch mode compatible). Additionally, I can fine tune things like EDID (fine grained control of display modes), ReShade (per game sharpening etc effects), to say nothing of the extra Win programs I can run.

With a bit of skill, you can make games look way better than they have any right to, even on low end hardware. I can dig up some screenshots of Just Cause 2 and FireWatch running in 540p for you if you’d like…you’d be hard pressed to tell it wasn’t much higher resolution (viewed on 75" tv from 8 feet away).

Reason I did it this way:

People will tell you Batocera is awesome (and it is) but…there are just some things that run better natively (e.g., Fallout 3 GOG Game of the Year Edition, Just Cause 2 etc). Windows lets you play windows shit natively and the emulation scene (Dolphin, PCSX2 etc) is mature. No need for Wine, Proton blah blah. It just … runs.

Playnite lets you “hide” games you don’t want the kiddies to run. Once you’re done with it, you can exit and return to desktop - you have a normal PC (though if you do the shell replacement I mentioned, you will have to exit, CTRL-ALT-DEL to get Task Manager, then run explorer.exe). I only set Playnite as default shell because I wanted ZERO flashes or indication this was a normal Windows PC on boot; if a small 2-3 second desktop flash doesn’t annoy you, then just set Playnite to launch at start, black screen desktop and go from there. It’s much easier for something that is multi-use. Also, because it’s just a front end, you should in theory just be able to make a shortcut to “Jellyfin.exe” and launch it as needed from Playnite (haven’t explored that myself tho).

Win 8.1 (with Classic Shell) launches fast, is lightweight, and doesn’t need hacking to get around log in permissions and shell replacement the way Win 10 and later might. You wouldn’t want to leave it hooked up to the net unsupervised, but on a HTPC being treated mostly as an offline appliance, the so-called security trade-offs are worth it to me (plus, I have firewall and other isolation in place).

PS: Controller-wise: Xbox 360 wireless + dongle for me. One $30 dongle can host up to 4 controllers and I already had two controllers :)

PPS: Can I be honest with you? After all this - the kids decided they just prefer the Wii. I had to laugh. Fine…we’ll use the Wii (even though I replicated everything on the M93p - INCLUDING upscale, making wii controllers etc work in Dolphin, bought a Dolphin bar etc. I even put the fucking wii music as the background in Playnite!). So much work … ignored LOL. Eh, I learned a lot doing it :)

PPPS: We have a Chromecast with Google TV dongle attached to the TV, so it can stream Jellyfin from the media server just fine. I really recommend those things (not the new one, the old hockey puck style one) or the off-label one you can get now (ONN I think?). Actually, come to think of it, I’m pretty sure Wii can stream Jellyfin now in glorious 480p too lol

SuspiciousCarrot78@aussie.zone · 4 days ago

Token Speed visualiser

SuspiciousCarrot78@aussie.zone · 4 days ago

Yeah, transcoding entirely off - directly stream stored 720/1080p files (downloaded like that, although I did use handbrake on the pi once to transcode Space 1999 season 1. Took about 2 days I think).

Someone else was just talking about Wyse thin clients. I’m fairly sure that a $40 Wyse thin client outperforms even the best Pi 4 (maybe 5 sometimes). If I can’t find a way to fix mine, I may have to buy a few for uh…science. IIRC, they idle at about the same as the Pi.

SuspiciousCarrot78@aussie.zone · 4 days ago

Oh man I love those Wyse thin clients. They can’t go for much more than $40 these days.

I hope people keep sleeping on em - I could use a Raspberry Pi replacement or two

SuspiciousCarrot78@aussie.zone · 4 days ago

It’s very ok, as long as you don’t throw multiple 4K streams at it.

I ran Jellyfin on a Pi 4 for about 3 or 4 yrs before it started acting up. So long as you don’t transcode, it works wonderfully well. I had it serving up to 4-5 x 720p streams at the same time. IIRC, it can just about do a single 4K, 60? Never tried - all my media is 1080p or less.

IIRC, mine is overclocked and undervolted using PiTools (and is in an Argon 40 case with an M.2). The Argon 40 case (I think) is causing it to short (something with the daughter-board? Dunno). Better options these days.

Paperless I don’t use but I don’t see why it shouldn’t be possible.

Don’t try Immich unless you like pain (or turn off the AI stuff)

SuspiciousCarrot78@aussie.zone · 4 days ago

I actually (just last night) abliterated a Qwen3.5-2B for this sort of purpose (well, more specifically, to fit neatly into a socket for a project). It’s fast and light, cooked for edge devices, and should have inherited all of base Qwen’s tricks (~200 languages, vision etc): polaris-heretic-Q4_K_M-GGUF

Try it and see if it works? I inadvertently made it really fucking love dotpoints (GPT-OSS 20B disease) so am trying to unfuck it right now.

Else - I can recommend something like Granite-4H or the old Qwen3-4B 2507 instruct

granite-4.1-3b-heretic.i1-Q4_K_M

Qwen3-4B 2507 instruct

SuspiciousCarrot78@aussie.zone · 4 days ago

It’s a more convenient method for some to pirate content, as it requires comparatively little set up. Think: Netflix but yaaar. You pay upkeep but they ensure content is there (as best as possible).

Other similar options include things like Flixify and FMovies.

It always surprised me that folks into self hosting prefer pirate streaming. That’s still someone else’s computer - I’d rather D/L it myself when possible. I get it though - some of the services are very good and near Netflix level convenient.

SuspiciousCarrot78@aussie.zone · 5 days ago

~~Don’t~~ Be Evil