• jsomae@lemmy.ml
    5 hours ago

    Obviously it depends on your GPU. With a crypto miner, you leave it running 24/7. On a recent MacBook, an LLM runs at several tokens per second, so yes, a long response could take more than a minute. But most people aren't going to run an LLM like that for hours on end. Even if they do, big deal: it's a single GPU, which is negligible compared to running your dishwasher, using your oven, or heating your house.
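
To put rough numbers on that comparison, here's a back-of-the-envelope sketch in Python. Every figure in it is an assumption picked for illustration (a 300 W GPU, a 2-minute response, roughly 1 kWh per dishwasher cycle), not a measurement, so swap in the numbers for your own hardware and appliances.

```python
# Back-of-the-envelope energy comparison.
# All wattages, durations, and appliance figures are illustrative
# assumptions, not measurements.

def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy in kilowatt-hours for a device drawing power_watts for hours."""
    return power_watts * hours / 1000.0

ASSUMED_GPU_WATTS = 300.0          # hypothetical GPU draw under load
ASSUMED_RESPONSE_MINUTES = 2.0     # hypothetical long LLM response
ASSUMED_DISHWASHER_KWH = 1.0       # hypothetical energy for one cycle

# One long LLM response on a single GPU.
llm_response_kwh = energy_kwh(ASSUMED_GPU_WATTS, ASSUMED_RESPONSE_MINUTES / 60.0)

# The same GPU running flat out for a full day, as a miner would.
miner_day_kwh = energy_kwh(ASSUMED_GPU_WATTS, 24.0)

print(f"One LLM response: {llm_response_kwh:.3f} kWh")
print(f"24 h of mining:   {miner_day_kwh:.1f} kWh")
print(f"Dishwasher cycle: {ASSUMED_DISHWASHER_KWH:.1f} kWh (assumed)")
```

Under those assumptions a single response is a tiny fraction of one dishwasher cycle, while the always-on miner is several cycles' worth per day. The gap comes from duty cycle, not from the GPU itself.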