OpenAI releases tool to detect AI-generated text, including from ChatGPT • TechCrunch

After telegraphing the move in media appearances, OpenAI has launched a tool that attempts to distinguish between human-written and AI-generated text — like the text produced by the company’s own ChatGPT and GPT-3 models. The classifier isn’t particularly accurate — OpenAI notes it correctly identifies only around 26% of AI-written text as “likely AI-generated” — but OpenAI argues that, used in tandem with other methods, it could help prevent AI text generators from being abused.

“The classifier aims to help mitigate false claims that AI-generated text was written by a human. However, it still has a number of limitations — so it should be used as a complement to other methods of determining the source of text instead of being the primary decision-making tool,” an OpenAI spokesperson told TechCrunch via email. “We’re making this initial classifier available to get feedback on whether tools like this are useful, and hope to share improved methods in the future.”

As the fervor around generative AI — particularly text-generating AI — grows, critics have called on the creators of these tools to take steps to mitigate their potentially harmful effects. Some of the U.S.’ largest school districts have banned ChatGPT on their networks and devices, fearing the impacts on student learning and the accuracy of the content that the tool produces. And sites including Stack Overflow have banned users from sharing content generated by ChatGPT, saying that the AI makes it too easy for users to flood discussion threads with dubious answers.

OpenAI’s classifier — aptly called OpenAI AI Text Classifier — is intriguing architecturally. Like ChatGPT, it’s an AI language model trained on many, many examples of publicly available text from the web. But unlike ChatGPT, it’s fine-tuned to predict how likely it is that a piece of text was generated by AI — not just by ChatGPT, but by any text-generating AI model.

More specifically, OpenAI trained the OpenAI AI Text Classifier on text from 34 text-generating systems from five different organizations, including OpenAI itself. This text was paired with similar (but not exactly similar) human-written text from Wikipedia, websites extracted from links shared on Reddit and a set of “human demonstrations” collected for a previous OpenAI text-generating system. (OpenAI admits in a support document, however, that it might’ve inadvertently misclassified some AI-written text as human-written “given the proliferation of AI-generated content on the internet.”)

Importantly, the OpenAI AI Text Classifier won’t work on just any text. It needs a minimum of 1,000 characters, or roughly 150 to 250 words. It doesn’t detect plagiarism — an especially unfortunate limitation considering that text-generating AI has been shown to regurgitate the text on which it was trained. And OpenAI says the classifier is more likely to get things wrong on text written by children or in a language other than English, owing to its English-forward data set.

The detector hedges its answer a bit when evaluating whether a given piece of text is AI-generated. Depending on its confidence level, it’ll label text as “very unlikely” AI-generated (less than a 10% chance), “unlikely” AI-generated (between a 10% and 45% chance), “unclear if it is” AI-generated (a 45% to 90% chance), “possibly” AI-generated (a 90% to 98% chance) or “likely” AI-generated (an over 98% chance).
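Assuming the probability bands above are checked in order (the exact inclusive/exclusive boundary handling is a guess — OpenAI doesn't specify it), the labeling step amounts to a simple threshold lookup, sketched here:

```python
def label_for_probability(p: float) -> str:
    """Map the classifier's AI-generated probability (0.0 to 1.0)
    to one of OpenAI's five published labels.

    The band edges follow the article; whether a boundary value like
    exactly 0.45 falls in the lower or upper band is an assumption.
    """
    if p < 0.10:
        return "very unlikely"
    elif p < 0.45:
        return "unlikely"
    elif p < 0.90:
        return "unclear if it is"
    elif p < 0.98:
        return "possibly"
    else:
        return "likely"
```

For example, a text scored at 0.99 would be labeled “likely” AI-generated, while one scored at 0.30 would be labeled “unlikely.”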

Out of curiosity, I fed some text through the classifier to see how it might manage. While it confidently, correctly predicted that several paragraphs from a TechCrunch article about Meta’s Horizon Worlds and a snippet from an OpenAI support page weren’t AI-generated, the classifier had a tougher time with article-length text from ChatGPT, ultimately failing to classify it altogether. It did, however, successfully spot ChatGPT output from a Gizmodo piece about — what else? — ChatGPT.

According to OpenAI, the classifier incorrectly labels human-written text as AI-written 9% of the time. That mistake didn’t happen in my testing, but I chalk that up to the small sample size.

Image: the OpenAI AI Text Classifier. Image Credits: OpenAI

On a practical level, I found the classifier not particularly useful for evaluating shorter pieces of writing. 1,000 characters is a difficult threshold to reach in the realm of messages — emails, for example (at least the ones I get on a regular basis). And the limitations give pause: OpenAI emphasizes that the classifier can be evaded by modifying some words or clauses in generated text.

That’s not to suggest the classifier is useless — far from it. But it certainly won’t stop committed fraudsters (or students, for that matter) in its current state.

The question is, will other tools? Something of a cottage industry has sprung up to meet the demand for AI-generated text detectors. GPTZero, developed by a Princeton University student, uses criteria including “perplexity” (the complexity of text) and “burstiness” (the variation of sentences) to detect whether text might be AI-written. Plagiarism detector Turnitin is developing its own AI-generated text detector. Beyond those, a Google search yields at least a half-dozen other apps that claim to be able to separate the AI-generated wheat from the human-generated chaff, to torture the metaphor.
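To make those two signals concrete, here is a toy, self-contained sketch of what “burstiness” and “perplexity” measure. It is only an illustration of the underlying formulas — GPTZero’s actual implementation scores text against a large language model, and the unigram stand-in below is an assumption made purely so the example runs without external dependencies.

```python
import math
import re
from collections import Counter


def burstiness(text: str) -> float:
    """Sample standard deviation of sentence lengths, in words.

    Human writing tends to mix long and short sentences, so a higher
    value loosely suggests human authorship; uniform sentence lengths
    score near zero.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    mean = sum(lengths) / len(lengths)
    variance = sum((n - mean) ** 2 for n in lengths) / (len(lengths) - 1)
    return math.sqrt(variance)


def unigram_perplexity(text: str) -> float:
    """Perplexity of the text under its own add-one-smoothed unigram
    word distribution: exp(-(1/N) * sum(log p(w))).

    A real detector would use a large language model's probabilities
    instead; this stand-in only demonstrates the formula. Repetitive,
    predictable text yields low perplexity, varied text yields higher.
    """
    words = text.lower().split()
    counts = Counter(words)
    n = len(words)
    vocab = len(counts)
    log_prob = sum(math.log((counts[w] + 1) / (n + vocab)) for w in words)
    return math.exp(-log_prob / n)
```

Under this toy model, a maximally repetitive string like `"a a a a"` scores a perplexity of 1.0 (perfectly predictable), while four distinct words score 4.0 — the intuition being that AI-generated text tends to sit at the low-perplexity, low-burstiness end.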

It’ll likely become a cat-and-mouse game. As text-generating AI improves, so will the detectors — a never-ending back-and-forth similar to that between cybercriminals and security researchers. And as OpenAI writes, while the classifiers might help in certain circumstances, they’ll never be a reliable sole piece of evidence in deciding whether text was AI-generated.

That’s all to say that there’s no silver bullet to solve the problems AI-generated text poses. Quite likely, there never will be.



