This AI Can Tell If Text Was Written by a Human — And Which LLM Wrote It
As AI-generated text becomes increasingly indistinguishable from human writing, the need for reliable detection tools has never been greater. A new paper from researchers at the Indian Institute of Technology Guwahati introduces COT_Finetuned, a framework that not only detects AI-generated text but also identifies which large language model (LLM) wrote it — all while explaining its reasoning.
The AI Detection Arms Race
The paper, presented at AAAI 2025’s DEFACTIFY 4.0 workshop, tackles two critical tasks:
- Task A: Binary classification (human vs. AI-generated text)
- Task B: Multi-class classification (identifying the specific LLM behind AI-generated text)
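To make the two label types concrete, a single training record in such a setup might look like the following; the field names here are hypothetical, not the shared task's actual schema:

```python
# Hypothetical record combining both labels (field names are illustrative).
sample = {
    "text": "Renewable energy adoption has accelerated in recent years...",
    "is_ai_generated": True,    # Task A: binary human-vs-AI label
    "source_llm": "GPT-4.0",    # Task B: set only when is_ai_generated is True
}
```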
What sets COT_Finetuned apart is its use of Chain-of-Thought (CoT) reasoning, which forces the model to explain its classifications step-by-step. This isn’t just about accuracy — it’s about transparency in an era where AI-generated content floods everything from academic papers to news articles.
How It Works
The system fine-tunes pre-trained models (like BERT) on a dataset of 10,500 text samples, each labeled as human-written or as the output of a specific model, including GPT-4.0, DeBERTa, FalconMamba, and Phi-3.5. Key innovations:
- Dual-task architecture: Simultaneously classifies text origin (human/AI) and pinpoints the LLM if AI-generated.
- Explainable AI: Generates natural language explanations for its decisions (e.g., “This text lacks emotional depth and shows repetitive phrasing patterns characteristic of GPT-4”).
- Combined loss function: Optimizes for both classification accuracy and reasoning quality (see the sketch after this list).
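A minimal PyTorch sketch of how such a dual-task setup could look, assuming a shared BERT encoder feeding two classification heads. The head names, the equal 0.5 loss weighting, and the restriction of Task B to AI-labeled rows are illustrative assumptions; the paper's combined loss also scores the generated explanations, which an encoder-only sketch like this omits.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class DualTaskDetector(nn.Module):
    """Shared BERT encoder with one head per task (hypothetical layout)."""
    def __init__(self, num_llms: int, encoder_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.task_a_head = nn.Linear(hidden, 2)         # human vs. AI
        self.task_b_head = nn.Linear(hidden, num_llms)  # which LLM wrote it

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]            # [CLS] representation
        return self.task_a_head(pooled), self.task_b_head(pooled)

def combined_loss(logits_a, logits_b, labels_a, labels_b, alpha=0.5):
    """Weighted sum of the two classification losses. Task B is scored only
    on rows labeled AI-generated (assumption: human-written rows carry no
    LLM label). The paper's reasoning-quality term is omitted here."""
    loss_a = nn.functional.cross_entropy(logits_a, labels_a)
    ai_rows = labels_a == 1
    if ai_rows.any():
        loss_b = nn.functional.cross_entropy(logits_b[ai_rows], labels_b[ai_rows])
    else:
        loss_b = logits_b.sum() * 0.0  # keeps the graph connected on all-human batches
    return alpha * loss_a + (1 - alpha) * loss_b
```

Sharing one encoder across both heads is the standard multi-task pattern: Task B gradients only flow through AI-generated examples, while Task A trains on everything.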
Performance That Speaks Volumes
The results are compelling:
| Method     | Task A (Human vs. AI) F1 | Task B (LLM ID) F1 |
|------------|--------------------------|--------------------|
| RoBERTa    | 0.672                    | 0.143              |
| BERT       | 0.742                    | 0.249              |
| BERT + CoT | 0.898                    | 0.307              |
While Task B remains challenging (identifying LLMs is harder than just detecting AI), the CoT approach significantly boosts performance in both tasks. The explanations also help humans understand what gives away AI text — whether it’s unnatural fluency, lack of personal detail, or model-specific quirks.
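For readers who want to reproduce the comparison, both scores reduce to standard F1 computations. Here is a small scikit-learn sketch, assuming macro averaging for the multi-class task (the paper's exact averaging scheme is not stated here, and the labels below are toy stand-ins):

```python
from sklearn.metrics import f1_score

# Toy stand-ins for real test labels and predictions (1 = AI-generated).
y_true_a = [1, 0, 1, 1, 0]
y_pred_a = [1, 0, 1, 0, 0]
print(f1_score(y_true_a, y_pred_a))  # Task A: binary F1

# Task B: multi-class LLM IDs; macro averaging weights every model equally.
y_true_b = ["GPT-4.0", "Phi-3.5", "FalconMamba", "Phi-3.5"]
y_pred_b = ["GPT-4.0", "Phi-3.5", "FalconMamba", "GPT-4.0"]
print(f1_score(y_true_b, y_pred_b, average="macro"))
```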
Why This Matters for Business
- Content moderation: Platforms can better filter AI-generated spam or misinformation while preserving human expression.
- Academic integrity: Schools and publishers gain tools to detect AI-assisted work without blanket bans on LLM use.
- Model accountability: When AI-generated content causes harm, tracing it to specific models could inform better governance.
The Road Ahead
The researchers acknowledge limitations — sophisticated paraphrasing can still evade detection, and newer LLMs will require constant model updates. But by baking explainability into detection, COT_Finetuned offers a path toward more auditable AI systems.
As one author noted: “We’re not just building a better detector; we’re creating tools to understand how AI writes — and how that differs from human thought.” In an age where the line between human and machine blurs daily, that understanding might be priceless.