The logical end of the ‘Solution to bad speech is better speech’ has arrived in the age of state-sponsored social media propaganda bots versus AI-driven bots arguing back

  • 👁️👄👁️@lemm.ee
    1 year ago

    Just a reminder: LLMs are not designed to provide truth, but rather to generate natural-sounding text.

    • tehmics@lemmy.world
      1 year ago

      We can certainly argue over what they’re designed to do, and I definitely agree that’s the goal of them. The reality, though, is that on some level it is impossible to separate assertions from the words that describe them. Language itself is designed to communicate ideas; you can’t really create language without also communicating ideas. Otherwise every sentence from an LLM would just look like

      “Has Anyone Really Been Far Even as Decided to Use Even Go Want to do Look More Like”

      They will readily cite information that was fed to them. Sometimes it is on point, sometimes not. That starts to become an ethical discussion about whether it is okay for them to paraphrase information they were fed without crediting it as the source.

      In a perfect world we should be able to expand a whole learning tree to trace back how the model pieced together each word and data point it is citing, kind of like an advanced Wikipedia article. Then you could take the typical synopsis that the model provides and dig into it to judge for yourself whether it’s accurate. From a research standpoint, I view info you collect from a language model as a step down from a secondary source, and we should be able to easily see how it gets to that info.

      • turmacar@lemmy.world
        1 year ago

        LLMs are at least a quaternary(?) source. They’re scraping secondary/tertiary sources. As such they’re little better than asking passersby on the street. You might get a general idea of what the zeitgeist is, but how true any particular statement actually is will vary wildly.

        Math itself is designed to describe relationships between things. That doesn’t mean you can’t mock up a ‘reasonable seeming’ equation that turns out to be absolute nonsense on further examination, but that a layman will take as ‘true enough’.

        LLMs don’t cite things. They provide an approximation of what a human might write. They don’t know what they’re writing or how it relates to the ‘real world’ any more than any other centerpiece of a Chinese Room.