This AI Can Tell If Text Was Written by a Human — And Which LLM Wrote It
As AI-generated text becomes increasingly indistinguishable from human writing, the need for reliable detection tools has never been greater. A new paper from researchers at the Indian Institute of Technology Guwahati introduces COT_Finetuned, a framework that not only detects AI-generated text but also identifies which large language model (LLM) wrote it — all while explaining its reasoning.
The AI Detection Arms Race
The paper, presented at AAAI 2025’s DEFACTIFY 4.0 workshop, tackles two critical tasks:
- Task A: Binary classification (human vs. AI-generated text)
- Task B: Multi-class classification (identifying the specific LLM behind AI-generated text)
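To make the two label types concrete, a single training record in such a setup might look like the following; the field names here are hypothetical, not the shared task's actual schema:

```python
# Hypothetical record combining both labels (field names are illustrative).
sample = {
    "text": "Renewable energy adoption has accelerated in recent years...",
    "is_ai_generated": True,    # Task A: binary human-vs-AI label
    "source_llm": "GPT-4.0",    # Task B: set only when is_ai_generated is True
}
```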
What sets COT_Finetuned apart is its use of Chain-of-Thought (CoT) reasoning, which forces the model to explain its classifications step-by-step. This isn’t just about accuracy — it’s about transparency in an era where AI-generated content floods everything from academic papers to news articles.
How It Works
The system fine-tunes pre-trained models (like BERT) on a dataset of 10,500 text samples, each labeled as human-written or as the output of a specific model, including GPT-4.0, DeBERTa, FalconMamba, and Phi-3.5. Key innovations:
- Dual-task architecture: Simultaneously classifies text origin (human/AI) and pinpoints the LLM if AI-generated.
- Explainable AI: Generates natural language explanations for its decisions (e.g., “This text lacks emotional depth and shows repetitive phrasing patterns characteristic of GPT-4”).
- Combined loss function: Optimizes for both classification accuracy and reasoning quality (see the sketch after this list).
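A minimal PyTorch sketch of how such a dual-task setup could look, assuming a shared BERT encoder feeding two classification heads. The head names, the equal 0.5 loss weighting, and the restriction of Task B to AI-labeled rows are illustrative assumptions; the paper's combined loss also scores the generated explanations, which an encoder-only sketch like this omits.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class DualTaskDetector(nn.Module):
    """Shared BERT encoder with one head per task (hypothetical layout)."""
    def __init__(self, num_llms: int, encoder_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.task_a_head = nn.Linear(hidden, 2)         # human vs. AI
        self.task_b_head = nn.Linear(hidden, num_llms)  # which LLM wrote it

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]            # [CLS] representation
        return self.task_a_head(pooled), self.task_b_head(pooled)

def combined_loss(logits_a, logits_b, labels_a, labels_b, alpha=0.5):
    """Weighted sum of the two classification losses. Task B is scored only
    on rows labeled AI-generated (assumption: human-written rows carry no
    LLM label). The paper's reasoning-quality term is omitted here."""
    loss_a = nn.functional.cross_entropy(logits_a, labels_a)
    ai_rows = labels_a == 1
    if ai_rows.any():
        loss_b = nn.functional.cross_entropy(logits_b[ai_rows], labels_b[ai_rows])
    else:
        loss_b = logits_b.sum() * 0.0  # keeps the graph connected on all-human batches
    return alpha * loss_a + (1 - alpha) * loss_b
```

Sharing one encoder across both heads is the standard multi-task pattern: Task B gradients only flow through AI-generated examples, while Task A trains on everything.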
Performance That Speaks Volumes
The results are compelling:
| Method     | Task A (Human vs. AI) F1 | Task B (LLM ID) F1 |
|------------|--------------------------|--------------------|
| RoBERTa    | 0.672                    | 0.143              |
| BERT       | 0.742                    | 0.249              |
| BERT + CoT | 0.898                    | 0.307              |
While Task B remains challenging (identifying LLMs is harder than just detecting AI), the CoT approach significantly boosts performance in both tasks. The explanations also help humans understand what gives away AI text — whether it’s unnatural fluency, lack of personal detail, or model-specific quirks.
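For readers who want to reproduce the comparison, both scores reduce to standard F1 computations. Here is a small scikit-learn sketch, assuming macro averaging for the multi-class task (the paper's exact averaging scheme is not stated here, and the labels below are toy stand-ins):

```python
from sklearn.metrics import f1_score

# Toy stand-ins for real test labels and predictions (1 = AI-generated).
y_true_a = [1, 0, 1, 1, 0]
y_pred_a = [1, 0, 1, 0, 0]
print(f1_score(y_true_a, y_pred_a))  # Task A: binary F1

# Task B: multi-class LLM IDs; macro averaging weights every model equally.
y_true_b = ["GPT-4.0", "Phi-3.5", "FalconMamba", "Phi-3.5"]
y_pred_b = ["GPT-4.0", "Phi-3.5", "FalconMamba", "GPT-4.0"]
print(f1_score(y_true_b, y_pred_b, average="macro"))
```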
Why This Matters for Business
- Content moderation: Platforms can better filter AI-generated spam or misinformation while preserving human expression.
- Academic integrity: Schools and publishers gain tools to detect AI-assisted work without blanket bans on LLM use.
- Model accountability: When AI-generated content causes harm, tracing it to specific models could inform better governance.
The Road Ahead
The researchers acknowledge limitations — sophisticated paraphrasing can still evade detection, and newer LLMs will require constant model updates. But by baking explainability into detection, COT_Finetuned offers a path toward more auditable AI systems.
As one author noted: “We’re not just building a better detector; we’re creating tools to understand how AI writes — and how that differs from human thought.” In an age where the line between human and machine blurs daily, that understanding might be priceless.