The short answer is probably: surprisingly good, and still not good enough. Yes, we can classify texts, sometimes better, sometimes worse, as human- or machine-generated. But is that even the right approach when even Google says: write FOR people, regardless of what the text is about or who wrote it?
Are our schools at a point where AI-generated texts can no longer be distinguished from those written by humans? AI text generation is entering the classroom, and AI detectors are arriving with it.
AI text detectors and their limitations
AI detectors work on principles similar to those of AI text generators like ChatGPT. They use metrics such as perplexity (how predictable a text is to a language model) and burstiness (how much that predictability varies from sentence to sentence) to estimate the probability that a text was created by an AI.
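To make the idea concrete, here is a minimal sketch of how these two metrics can be computed with an off-the-shelf language model. The choice of GPT-2 and of the sentence-level standard deviation as a burstiness measure are illustrative assumptions, not the internals of any particular commercial detector.

```python
# Minimal sketch of the perplexity/burstiness idea behind many AI detectors.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """How 'surprised' the model is by the text: exp of the average token loss."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return torch.exp(out.loss).item()

def burstiness(sentences: list[str]) -> float:
    """Spread of sentence-level perplexities; human writing tends to vary more."""
    scores = [perplexity(s) for s in sentences]
    mean = sum(scores) / len(scores)
    return (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5

print(perplexity("AI detectors estimate how predictable a text is to a language model."))
```

In this picture, text with low perplexity and low burstiness looks "machine-like"; real detectors combine such signals with further features and calibration before producing a verdict.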
As with any technology, AI detectors have their limits: they are promising, but not infallible. That makes the skills and understanding of the people involved crucial. Users must be able to monitor the AI, critically evaluate its verdicts and correct them where necessary.
The effectiveness of AI text detectors can vary greatly depending on the specific implementation and the text being analyzed. One analysis examined five AI content detectors developed by OpenAI, Writer, Copyleaks, GPTZero and CrossPlag, measuring how reliably each one classified text as either AI-generated or human-written. The results showed considerable variance across GPT-3.5, GPT-4 and human-written content: while the tools were generally more successful at identifying content generated by GPT-3.5, they struggled with content generated by GPT-4 and showed inconsistencies when analyzing control responses written by humans [1].
Some studies have found that AI text detectors are not reliable in practical scenarios. A particular type of attack known as a paraphrasing attack, in which a lightweight paraphrasing model is applied to the output of a large language model (LLM), can break a whole range of detectors [2]. For example, paraphrasing text generated by three large language models was found to successfully bypass multiple detectors, reducing the detection accuracy of DetectGPT from 70.3% to 4.6% [3].
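A paraphrasing attack needs nothing more than a second model that rewrites the text sentence by sentence, as the following sketch illustrates. The paraphraser model name is an illustrative assumption; the cited studies used their own paraphrasers and detectors.

```python
# Sketch of a paraphrasing attack: an AI-generated text is rewritten sentence by
# sentence before it reaches a detector. The meaning stays roughly the same, but
# the token-level statistics that detectors like DetectGPT rely on are disturbed.
from transformers import pipeline

# Illustrative model choice; any seq2seq paraphraser would do.
paraphraser = pipeline("text2text-generation", model="tuner007/pegasus_paraphrase")

def paraphrase(sentences: list[str]) -> list[str]:
    return [
        paraphraser(s, max_length=60, num_return_sequences=1)[0]["generated_text"]
        for s in sentences
    ]

original = [
    "AI detectors estimate how predictable a text is to a language model.",
    "Highly predictable texts are flagged as machine generated.",
]
print(paraphrase(original))  # the rewritten text is then scored by the detector again
```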
Another problem: while the tools were good at identifying human-written text (96% accuracy on average), they performed worse at recognizing AI-generated text, especially when techniques designed to fool the detectors were applied [4].
“The promises are high, but the real results of the technology are often sobering.” — Roger Basler de Roca
Several findings suggest that the performance of the detectors can vary significantly depending on the sophistication of the AI model used to generate the content. This has significant implications for plagiarism detection and highlights the need to continually improve detection tools to keep pace with evolving AI text generation capabilities.
Overall, these findings demonstrate that although AI detection tools can be a useful aid in identifying AI-generated content, they should not be used as the sole determinant in academic integrity cases. Instead, a holistic approach should be taken that includes manual review and consideration of contextual factors. This would ensure a fairer evaluation process and mitigate the ethical concerns surrounding the use of AI detection tools [1].
The AI watermark
OpenAI has long been working on a ‘watermarking system’ that, in theory, can clearly determine whether a text was created by an AI. But there are limits here too. The main motivation behind the development is to prevent misuse, such as third parties passing off AI-generated content as their own work [1].
This watermarking initiative aims to provide a clear distinction between human-generated and AI-generated content, which is particularly important in academic and professional contexts.
OpenAI’s watermarking technology works as a kind of “wrapper” over existing text generation systems, using a server-level cryptographic function to pseudorandomly select the next token. In theory, the generated text would still appear random to the viewer, but someone holding the “key” to the cryptographic function could uncover a watermark [2].
A cryptographic function is used to insert a recognizable signature into the words produced by OpenAI’s text generation AI [3].
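The following toy sketch illustrates the keyed idea: a secret key and the preceding token drive a pseudorandom function that nudges which token is chosen next, and the same function later reveals that bias to anyone holding the key. The names and the simple "preferred token" scheme are illustrative assumptions; this is not OpenAI's actual implementation.

```python
# Toy sketch of a keyed text watermark: generation is biased by a secret key,
# and only the key holder can measure that bias afterwards.
import hashlib
import random

SECRET_KEY = b"held-by-the-provider"  # only the key holder can verify the watermark

def keyed_bit(prev_token: str, candidate: str) -> int:
    """Pseudorandom 0/1 derived from the secret key and the token pair."""
    digest = hashlib.sha256(SECRET_KEY + prev_token.encode() + candidate.encode()).digest()
    return digest[0] & 1

def pick_next(prev_token: str, candidates: list[str]) -> str:
    """Generation: prefer candidates whose keyed bit is 1; the text still reads normally."""
    preferred = [c for c in candidates if keyed_bit(prev_token, c) == 1]
    return random.choice(preferred or candidates)

def watermark_score(tokens: list[str]) -> float:
    """Verification: with the key, watermarked text shows far more 1-bits than ~50%."""
    pairs = list(zip(tokens, tokens[1:]))
    hits = sum(keyed_bit(prev, cur) for prev, cur in pairs)
    return hits / len(pairs) if pairs else 0.0
```

On ordinary human text the score hovers around 0.5, while text produced with the biased sampler scores noticeably higher; that gap is what a verifier would look for.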
However, implementing such a watermark is not without challenges. Experts are divided on its effectiveness: some argue that it could be trivial for adversaries to circumvent, for example by substituting synonyms or rewriting the text.
Furthermore, embedding a watermark in each token could affect the fluency of the text if the watermark is too obvious, or raise doubts about the authenticity of the watermark if it is too subtle [4].
The better alternative: training and education of teachers and students
We have to ask ourselves whether relying on AI detectors alone is really the best option. It might make more sense to invest our resources in the education and training of teachers and students. A stronger emphasis on education, combined with a transparent approach to AI technologies, could be a far more sensible and effective safeguard.
In the context of using AI detectors and enabling learners to use these technologies effectively, it is important to develop a comprehensive plan based on the principles of empowerment, enthusiasm and careful supervision. Here are some measures that educational institutions at different levels can take to promote responsible use of AI while creating an environment of transparency and inclusion:
Elementary school:
Develop educational resources:
Developing age-appropriate learning resources to teach students the basics of AI and its applications. Organizing workshops and interactive sessions where children have the opportunity to engage with and explore AI technologies.
Involvement of Parents and Guardians:
Providing parents with information resources to raise awareness of the benefits and risks of AI technology. Regular parent meetings to listen to parents’ concerns and suggestions and jointly develop strategies for the safe use of AI in the learning process.
Promoting critical thinking:
Instructing students to ask questions and critically examine information, especially that generated by AI systems.
Middle school:
Further education programs:
Offering courses and workshops that focus on the ethical use of AI and understanding how it works. Encouraging students to participate in projects that involve the practical use of AI technologies.
Transparent communication:
Clear communication about the use of AI technologies in the learning environment and associated policies. Creation of forums for open discussions between teachers, students and parents about the use of AI and its impact on the learning process.
Promoting online security:
Teaching digital ethics and online safety to promote safe and responsible behavior in the digital space.
University:
Professional training:
Providing specialized courses and certifications in AI and related technologies. Partnering with industry to provide realistic insights into the application of AI and the challenges of implementing AI detectors.
Research and Development:
Promoting research projects focused on developing effective and ethical AI detectors. Creating platforms for interdisciplinary exchange and collaboration on AI ethics and governance.
Community involvement:
Organizing public lectures, workshops and discussions to raise awareness of the opportunities and risks of AI technology.
By implementing these measures, educational institutions can create an environment of empowerment, enthusiasm and careful support for students and teachers in their use of AI technologies. It is critical that everyone, from students to teachers and families, is actively involved in the process to create an inclusive and transparent learning environment that promotes the responsible use of AI technologies.
If that sounded exciting to you, then let’s talk — feel free to let us know if you have any questions.
How reliable are AI detectors?
The reliability of AI detectors can vary and depends heavily on the quality of the algorithm, the training data set and the specific characteristics of the AI text to be recognized. Advances in AI research and development are continually improving the ability of detectors to identify AI-generated text. However, very advanced AI models such as GPT-3 or GPT-4 can generate human-like texts, making recognition difficult.
How do AI detectors work?
AI detectors typically work by analyzing text features that are atypical for human authors or by comparing them to known patterns of AI-generated text. They can also be based on statistical discrepancies found in the texts. Sometimes they use machine learning to better understand how AI and human text differ and adjust their recognition methods accordingly.
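As a minimal sketch of the machine-learning route mentioned above, one could train a simple classifier on labeled examples. The two-item training set and the TF-IDF features below are placeholder assumptions; real detectors rely on large corpora and much richer signals.

```python
# Toy detector: learn to separate human-written from AI-generated text
# from labeled examples. The training data here is a two-item placeholder.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "scribbled this between two meetings, sorry for the typos!!",  # assumed human sample
    "In conclusion, it is important to note that the aforementioned factors are significant.",  # assumed AI-like sample
]
labels = [0, 1]  # 0 = human, 1 = AI-generated

detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(texts, labels)

print(detector.predict_proba(["It is important to note that this conclusion is significant."]))
```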
Can the use of AI be proven?
Yes, the use of AI can in principle be detected using certain forensic techniques and specialized software. These tools can identify patterns and anomalies that indicate the use of AI. However, proof becomes more difficult the more advanced the AI technology becomes.