• BiNonBi@lemmy.blahaj.zone
    1 year ago

    NPR reported that a “top concern” is that ChatGPT could use The Times’ content to become a “competitor” by “creating text that answers questions based on the original reporting and writing of the paper’s staff.”

    That’s something that can currently be done by a human and is generally considered fair use. All a language model really does is drive the cost of doing that from tens or hundreds of dollars down to pennies.

    To defend its AI training models, OpenAI would likely have to claim “fair use” of all the web content the company sucked up to train tools like ChatGPT. In the potential New York Times case, that would mean proving that copying the Times’ content to craft ChatGPT responses would not compete with the Times.

    A fair use defense does not have to include noncompetition. That’s just one factor in a fair use defense, and the other factors may be enough on their own.

    I think it’ll come down to how “the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes” and “the amount and substantiality of the portion used in relation to the copyrighted work as a whole” are interpreted by the courts. Do we judge a language model by the model itself or by its output? Can a model itself be noninfringing and still be able to produce infringing content?

    • fuzzywolf23@beehaw.org
      1 year ago

      The model is intended for commercial use, uses the entire work, and creates derivative works based on it that are in direct competition.