Challenges of AIGC Detection in Academic Writing

Introduction

In late April, Ren Mingyu, an undergraduate at Northeast Electric Power University, submitted her thesis to PaperPass for an initial check. PaperPass is popular among students for its free detection of paper similarity and AIGC (Artificial Intelligence Generated Content) rates.

Ren’s paper showed a 5% similarity rate, meeting her school’s graduation requirements, but a 59.39% AIGC rate, exceeding the 30% threshold. The AIGC report highlighted large sections in red, marked as “high risk,” causing her significant anxiety.

The Rise of AIGC Detection

Previously, similarity checks were the only requirement for thesis submission. However, with the proliferation of AI tools, platforms like GeziDa, CNKI, and Weipu began offering AIGC detection features in 2023. By 2024, universities in China started implementing a dual-check system for graduation papers, requiring both similarity and AIGC rates to meet standards.

A similarity check compares a paper against existing literature in databases to assess originality, while AIGC detection identifies typical characteristics of AI-generated text. With both checks in place, many students find themselves in a cycle of rewriting and testing, sometimes even paying to lower their AIGC rates.

Conflicting Results Across Platforms

Faced with a nearly 60% AIGC rate, Ren paid 85 yuan for PaperPass’s rate-reduction service. The result was both laughable and frustrating: the original text “××占比50%” (“×× accounts for 50%”) was changed to “××占比半壁江山” (“×× accounts for half the realm”). As an engineering student, she had many national engineering standards, formulas, and technical terms in her paper, and she felt that the changes compromised academic accuracy.

When she checked her paper on the Weipu platform, the AIGC rate was only about 25%.

Not only do different platforms yield conflicting results, but the same platform can also give different results for the same paper. At the end of April, Yang Feng, an undergraduate at Central China Normal University, found that her paper’s AIGC rate was only about 10% on the first check. After the platform was upgraded, the same paper’s AIGC rate jumped to 44%, while her school required it to be below 25%.

Many users on social media reported similar issues, with AIGC rates fluctuating dramatically between checks.

The Impact on Academic Writing

Yang Feng acknowledged that some students do use AI to write papers, and AIGC detection mechanisms help maintain academic integrity. However, the inconsistent results left her puzzled. She admitted to using AI for organizing thoughts and data but insisted that most of the content was her own writing. She speculated that her paper’s structured language and use of phrases like “firstly, secondly, thirdly” might have led to misclassification as AI-generated.

To meet school requirements, some faculty and students began exploring different strategies. Chen Hongmin, a lecturer at Guizhou Minzu University, found several logical inconsistencies and immature writing in a student’s paper and rewrote three paragraphs herself. After running the paper through a detection system, the student reported that those sections had an AIGC suspicion rate of 68%. Chen, who has taught for over a decade, believes that current AIGC detection standards lack transparency and that the tools operate as an algorithmic black box.

She observed that some platforms were more likely to classify tightly structured, complex sentences filled with technical terms as AIGC risks, making it harder for better writers to avoid penalties. Consequently, she had to adjust her teaching approach, advising students not to write in too “sophisticated” or tightly structured a style and to use more colloquial expressions.

The Interplay of Similarity and AIGC Rates

Lu Dezheng, a second-year master’s student at Guizhou Minzu University, faced the challenge of balancing similarity and AIGC rates, with his school requiring a similarity rate below 20% and an AIGC rate below 15%. He found that lowering one rate often led to an increase in the other, forcing him to repeatedly modify and check his work.

While writing a paper on “quality education,” he underwent eight rounds of dual checks, totaling 16 tests. Initially, a free platform showed a 7% similarity rate and a 48% AIGC rate. After two nights of manual revisions, the AIGC rate decreased, but new sections were flagged, raising the similarity rate to 13%. Switching platforms resulted in an AIGC rate of 64% and a similarity rate of 16%.

As manual revisions proved unstable, he began using a paid AIGC reduction service. After processing, one report showed an AIGC rate of 27% and a similarity rate of 0%, while another platform reported an AIGC rate of 36% and a similarity rate of 9%.

After seeking help from friends, Lu found a manual AIGC reduction service on social media, costing 70 yuan and taking only 10 to 20 minutes. He discovered that results varied widely across platforms, with similarity rates ranging from 0% to 17%, while AIGC rates hovered around the acceptable threshold.

Ultimately, given the instability of AIGC rates, he decided to prioritize keeping the similarity rate compliant.
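The dual-check rule Lu faced can be written down as a minimal sketch. It assumes only the thresholds reported for his school (similarity below 20%, AIGC below 15%) and uses the paired figures quoted from his reports. The names `Report`, `passes_dual_check`, and `failed_checks` are hypothetical, introduced for illustration, and do not describe any platform’s actual logic.

```python
from dataclasses import dataclass

# Thresholds as reported for Lu's school: both rates must be strictly below.
SIMILARITY_LIMIT = 20.0  # percent
AIGC_LIMIT = 15.0        # percent


@dataclass(frozen=True)
class Report:
    label: str         # hypothetical label, for illustration only
    similarity: float  # similarity rate in percent
    aigc: float        # AIGC rate in percent


def failed_checks(report, similarity_limit=SIMILARITY_LIMIT, aigc_limit=AIGC_LIMIT):
    """Return the names of the checks a report fails (empty list = pass)."""
    failures = []
    if report.similarity >= similarity_limit:
        failures.append("similarity")
    if report.aigc >= aigc_limit:
        failures.append("aigc")
    return failures


def passes_dual_check(report, similarity_limit=SIMILARITY_LIMIT, aigc_limit=AIGC_LIMIT):
    """A paper passes only if neither check fails."""
    return not failed_checks(report, similarity_limit, aigc_limit)


# Paired figures quoted in the article for Lu's paper.
reports = [
    Report("free platform, first check", 7, 48),
    Report("after switching platforms", 16, 64),
    Report("after paid reduction, first report", 0, 27),
    Report("after paid reduction, second report", 9, 36),
]

for r in reports:
    failures = failed_checks(r)
    print(f"{r.label}: {'pass' if not failures else 'fails ' + ', '.join(failures)}")
```

Under these thresholds every one of the quoted reports clears the similarity check yet fails the AIGC check, which is consistent with Lu’s decision to secure the more stable of the two numbers first.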

The Cost of Compliance

Yang Yining, a third-year master’s student at Liaoning University, felt that this year’s graduation thesis process was exhausting. His school had strict dual-check requirements, leading many students to purchase services for both similarity and AIGC checks. A typical master’s thesis can be tens of thousands of words long, and these services often charge by character, costing hundreds of yuan for a single dual check.

To be safe, Yang ran his thesis through more than a dozen checks from draft to final version and also bought AIGC reduction services, spending over 800 yuan in total. This expense was significant for him, prompting him to cut back on unnecessary spending during those two months. He joked that the money saved on meals went towards paying for plagiarism detection software.

Yang found that free platforms often reported higher AIGC rates than paid ones.

The Emergence of AIGC Services

Services such as “AIGC reduction” and “AI polishing” have sprung up, with prices ranging from 3 yuan per thousand words to several hundred yuan for an entire paper, depending on length.

Hong Tao, a doctoral student at Zhejiang University’s Guanghua Law School, pointed out that more universities and journals are incorporating AIGC rates into their decision-making processes, but the reliability of these technologies is questionable. Users often find themselves repeatedly purchasing detection services, leading to substantial profits for the platforms.

AIGC Detection: An Algorithmic Black Box?

Journalists contacted several AIGC detection platforms to understand their logic. The customer service for CNKI’s personal plagiarism detection explained that the AIGC detection system relies on specific algorithms and is a dynamic process, with reports being time-sensitive.

GeziDa’s automated response listed reasons why original content might be flagged as “suspected AI,” including overly simplistic logic, overly formulaic structure, excessive empty phrases, and improper punctuation. PaperPass has published articles explaining its detection principles, stating that its AIGC detection algorithm learns from a vast amount of human and AI text to identify AI characteristics based on subtle differences in wording, sentence structure, and logical coherence.

Many academic journal editors are beginning to pay attention to AIGC detection issues. Earlier this year, Shen Xibin, head of the new media department at the Chinese Medical Association, tested how well several AIGC detection tools handled medical review abstracts. The test found that the consistency of detection across three Chinese tools was only 40%-80%, indicating that different tools can yield conflicting results. Shen explained that differing training models and corpora lead to varying “anomaly” signals, resulting in inconsistent or even contradictory determinations.

Additionally, these tools tend to have high detection rates for text written entirely by AI but often misclassify or fail to identify AI-polished text. Shen believes that current AIGC detection lacks a recognized “gold standard,” and single detection results should not be used as definitive evidence of academic misconduct.

Having worked as a journal editor for over 20 years, Shen noted that while similarity detection has a foundational database for comparison, AIGC detection operates as a “black box.” He believes that as AI generation capabilities improve, AI text increasingly resembles human writing, making it difficult to determine the likelihood of AI authorship based solely on text composition.
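To make the cross-tool consistency figure concrete, here is a minimal sketch of one way such a number could be computed. The article does not say how Shen’s test measured consistency; simple pairwise percent agreement is assumed here. The tool names and verdicts are invented for illustration and are not data from the test.

```python
from itertools import combinations


def agreement_rate(a, b):
    """Share of items on which two tools gave the same verdict."""
    if len(a) != len(b) or not a:
        raise ValueError("verdict lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pairwise_agreement(verdicts):
    """Map each pair of tool names to their agreement rate."""
    return {
        (first, second): agreement_rate(verdicts[first], verdicts[second])
        for first, second in combinations(sorted(verdicts), 2)
    }


# Invented verdicts for five abstracts (True = flagged as AI-generated).
verdicts = {
    "tool_a": [True, True, False, True, False],
    "tool_b": [True, False, False, True, True],
    "tool_c": [False, False, False, True, True],
}

for pair, rate in pairwise_agreement(verdicts).items():
    print(pair, f"{rate:.0%}")
```

With these invented verdicts, the three pairs agree on 60%, 40%, and 80% of the abstracts, showing how tools that each look plausible alone can still disagree on a large share of the same texts.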

Embracing AI in Academia

Shen believes that the purpose of AIGC detection is to curb AI ghostwriting and uphold academic integrity, serving as a tool for “assisted screening and risk warning.” Its core value lies in efficiently identifying high-risk manuscripts, thus saving editorial review efforts. Given the rapid evolution of AI technology, it is crucial to avoid a one-size-fits-all approach and establish appropriate standards and review mechanisms.

He shared that the Chinese Medical Association has a review logic for AIGC detection, where some editors set thresholds based on different paper types. When detection results reach a certain value, editors and editorial boards conduct manual assessments, and papers that cannot be clearly judged are sent to external reviewers, preserving the authors’ right to appeal.
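The review logic Shen describes can be sketched as a small triage function. This is an illustration under stated assumptions, not the association’s actual procedure: the paper types, the threshold values, and the names `THRESHOLDS`, `Action`, and `triage` are all hypothetical, and the article gives no real numbers.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    NORMAL_REVIEW = "continue normal review"
    MANUAL_ASSESSMENT = "manual assessment by editors and editorial board"
    EDITORIAL_DECISION = "editors decide based on the manual assessment"
    EXTERNAL_REVIEW = "send to external reviewers"


# Hypothetical AIGC-rate thresholds in percent, set per paper type.
THRESHOLDS = {"review": 40.0, "original_research": 30.0}
DEFAULT_THRESHOLD = 30.0  # hypothetical fallback for other paper types


def triage(paper_type: str, aigc_rate: float,
           clearly_judged: Optional[bool] = None) -> Action:
    """Decide the next step for a manuscript.

    clearly_judged is None before the manual assessment has happened,
    True if the editors reached a clear judgment, False if they could not.
    """
    threshold = THRESHOLDS.get(paper_type, DEFAULT_THRESHOLD)
    if aigc_rate < threshold:
        return Action.NORMAL_REVIEW
    if clearly_judged is None:
        return Action.MANUAL_ASSESSMENT
    if clearly_judged:
        return Action.EDITORIAL_DECISION
    return Action.EXTERNAL_REVIEW


print(triage("review", 25.0).value)
print(triage("original_research", 45.0).value)
print(triage("original_research", 45.0, clearly_judged=False).value)
```

The point of the design is that the detection score only routes a manuscript to people; no branch rejects a paper on the score alone.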

The academic community is beginning to regulate the boundaries of AIGC usage. Recent documents, such as the “Guidelines for Identifying AI-Generated Content in Research,” clearly state that researchers must not use AI to directly generate core research findings. When AI is used for assistance in organizing ideas, language polishing, or data management, human authors must rigorously verify the authenticity of generated content and disclose the use of AI tools and their processes in their papers.

The Committee on Publication Ethics (COPE) and several mainstream academic publishers have proposed that authors must publicly disclose their use of AI and take responsibility for the authenticity, accuracy, and originality of their paper’s content.

Nanjing University’s guidelines mention that AI tools can be used for assistance in collecting and organizing materials, optimizing language, audio-visual content, and charts, and handling complex data, provided that written consent is obtained from the course instructor or supervisor.

Yang Feng was pleased to learn from previous graduates that if a thesis’s AIGC rate exceeds school requirements, students who can present sufficient evidence may appeal to the relevant departments for a review, which offers this year’s graduates some reassurance.