cross-posted from: https://lemmy.world/post/11178564

Scientists Train AI to Be Evil, Find They Can’t Reverse It::How hard would it be to train an AI model to be secretly evil? As it turns out, according to Anthropic researchers, not very.

  • swlabr@awful.systems
    9 months ago

    To be read in the low-bit cadence of SF2 Guile: “AI doom!”

    It’s not a huge surprise that these AI models that indiscriminately inhale a bunch of ill-gotten inputs are prone to poisoning. Fingers crossed that it makes the number go down!