2 min read

Does Feasibility Matter? How Synthetic Training Data Impacts AI Performance

With the rise of photorealistic diffusion models, synthetic data is increasingly used to train AI systems. But these models often generate unrealistic images—dogs floating in mid-air, cars with impossible textures—raising questions about how such "infeasible" data affects performance. A new study from researchers at the Technical University of Munich and Helmholtz Munich tackles this question head-on, introducing VariReal, a pipeline for generating minimal-change synthetic data with controlled feasible and infeasible attributes.

The Feasibility Question

The team defines feasibility as whether an attribute in a synthetic image could realistically exist in the real world. For example, a Yorkshire Terrier at a lakeshore is feasible; the same dog on an oil rig is not. Intuitively, one might assume infeasible data harms model generalization—but does it really?

To find out, the researchers fine-tuned CLIP-based classifiers on three fine-grained datasets (Oxford Pets, FGVC Aircraft, and Stanford Cars), testing three attribute categories (a brief fine-tuning sketch follows the list):

  1. Background (e.g., realistic vs. implausible settings)
  2. Color (e.g., natural vs. neon fur)
  3. Texture (e.g., realistic vs. elephant-skin coats)
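
For context, here is a minimal sketch of what fine-tuning a CLIP-based classifier can look like in PyTorch with Hugging Face's transformers library. The checkpoint, linear head, and hyperparameters are illustrative assumptions, not the paper's exact training recipe:

```python
import torch
import torch.nn as nn
from transformers import CLIPModel, CLIPImageProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Illustrative backbone; the paper's exact CLIP variant may differ.
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")

num_classes = 37  # Oxford Pets has 37 breed classes
head = nn.Linear(clip.config.projection_dim, num_classes).to(device)

optimizer = torch.optim.AdamW(
    list(clip.vision_model.parameters())
    + list(clip.visual_projection.parameters())
    + list(head.parameters()),
    lr=1e-5,  # illustrative learning rate
)
criterion = nn.CrossEntropyLoss()

def train_step(pil_images, labels):
    """One optimization step on a batch of PIL images and integer labels."""
    inputs = processor(images=pil_images, return_tensors="pt").to(device)
    feats = clip.get_image_features(**inputs)  # (batch, projection_dim)
    loss = criterion(head(feats), labels.to(device))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The same loop is run on real, feasible-synthetic, infeasible-synthetic, or mixed training splits to compare conditions.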

Key Findings

  1. Feasibility Barely Matters
  • Surprisingly, the difference in top-1 accuracy between models trained on feasible vs. infeasible data was negligible—often less than 0.3%.
  • Mixing feasible and infeasible data also had minimal impact (see the data-mixing sketch below), suggesting strict feasibility enforcement may be unnecessary.
  2. Background Edits Boost Performance
  • Modifying backgrounds—whether feasible or not—consistently improved classification accuracy.
  • This challenges prior work like ALIA, which restricted augmentations to feasible backgrounds only.
  3. Foreground Edits Are Tricky
  • Changing colors or textures (e.g., making a dog green) often hurt performance, even when the edits were feasible.
  • The team hypothesizes that foreground alterations disrupt class-relevant features more than background changes.
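
As a concrete illustration of the mixing condition, the real and synthetic splits can simply be concatenated into one training set. The stand-in datasets below are hypothetical placeholders, not the paper's actual data loading code:

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Stand-in (image, label) datasets; in practice these would be the real
# training split and the feasible/infeasible synthetic splits.
def dummy_split(n):
    return TensorDataset(torch.randn(n, 3, 224, 224), torch.randint(0, 37, (n,)))

real_train, feasible_synth, infeasible_synth = (
    dummy_split(100), dummy_split(50), dummy_split(50)
)

mixed_train = ConcatDataset([real_train, feasible_synth, infeasible_synth])
loader = DataLoader(mixed_train, batch_size=64, shuffle=True)
```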

The VariReal Pipeline

The study’s VariReal method edits real images with minimal changes, isolating one attribute at a time; a simplified sketch of the editing step appears after the list. Key innovations include:

  • LLM-generated prompts: GPT-4 produces feasible/infeasible attribute descriptions (e.g., "a dog with elephant-skin texture").
  • Prior-guided diffusion: Combines Stable Diffusion inpainting with ControlNet for precise edits.
  • Automatic filtering: Uses LLaVA-Next to discard off-target generations.
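
The paper's exact pipeline isn't reproduced here, but the core editing step—Stable Diffusion inpainting steered by a ControlNet prior—can be sketched with Hugging Face's diffusers library roughly as follows. The checkpoints, mask, and prompt are illustrative assumptions:

```python
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline
from diffusers.utils import load_image

# Illustrative checkpoints; VariReal's exact models and conditioning may differ.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

def make_inpaint_condition(image, mask):
    """Build the ControlNet conditioning image: masked pixels set to -1."""
    img = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    m = np.array(mask.convert("L")).astype(np.float32) / 255.0
    img[m > 0.5] = -1.0
    return torch.from_numpy(img[None].transpose(0, 3, 1, 2))

image = load_image("dog.png")              # hypothetical input photo
mask = load_image("background_mask.png")   # white = region to edit
control = make_inpaint_condition(image, mask)

# An "infeasible background" prompt in the spirit of the paper's examples.
edited = pipe(
    prompt="a Yorkshire Terrier standing on an offshore oil rig",
    image=image,
    mask_image=mask,
    control_image=control,
    num_inference_steps=30,
).images[0]
edited.save("dog_oil_rig.png")
```

In the full pipeline, a vision-language filter (LLaVA-Next in the paper) would then discard generations that miss the prompt or leak edits into the foreground.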

Why This Matters for Business

For companies leveraging synthetic data, these findings suggest:

  • Cost savings: Strict feasibility filtering may not be worth the computational overhead.
  • Better augmentation strategies: Background diversity helps; foreground tweaks may not.
  • Scalability: Minimal-change generation (like VariReal) can efficiently expand datasets without distorting core features.

The takeaway? Feasibility isn’t the bottleneck we thought it was—but attribute selection is. As synthetic data becomes ubiquitous, understanding these nuances will be key to training robust AI systems.

Code available: GitHub