Define-ML: A New Framework for Ideating Machine Learning-Enabled Systems

Machine learning (ML) is transforming software products across industries, but ideating ML-enabled systems comes with unique challenges. Traditional methods like Lean Inception weren’t designed to handle ML’s probabilistic nature, data dependencies, or the technical feasibility questions that arise when integrating AI into products. A new framework, Define-ML, aims to bridge this gap by extending Lean Inception with ML-specific activities—helping teams align business goals with technical realities from the start.

The Problem with Traditional Ideation for ML

ML-enabled systems aren’t just traditional software with an AI layer slapped on. They require careful consideration of data quality, model feasibility, and stakeholder expectations. Without structured guidance, teams risk proposing solutions that are either technically infeasible or misaligned with business needs. As noted in the paper, "Managing customer expectations and aligning requirements with data are among the main pain points of engineering ML-enabled systems."

Existing ideation methods like Lean Inception, Design Thinking, or Lean Startup don’t explicitly address these ML-specific concerns. They help prioritize features but don’t ask critical questions like:

  • What data is available, and is it usable?
  • Can ML realistically solve this problem?
  • How do we align probabilistic ML behavior with business objectives?

This is where Define-ML comes in.

What Define-ML Brings to the Table

Developed by researchers at Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Define-ML introduces three key activities to Lean Inception:

  1. Data Source Mapping – Teams identify and assess available data sources, distinguishing between public/private and governed/ungoverned data. Quality ratings (high, medium, low) help flag potential issues early.
  2. Feature-to-Data Source Mapping – This connects proposed features to the data required to build them, ensuring realistic scoping. If a feature relies on unavailable or poor-quality data, teams can pivot before development starts.
  3. ML Mapping – Inspired by the Mix & Match ML Toolkit, this activity helps teams match ML techniques to business needs. It involves classifying data types (e.g., text, images) and mapping them to feasible ML capabilities (e.g., categorization, recommendation).

These additions ensure that ideation isn’t just about what the product should do, but also how ML can feasibly support those goals, as the sketch below illustrates.
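To make the three activities more concrete, here is a minimal, purely illustrative sketch (not taken from the paper or its Miro templates) of how a team might record a Define-ML session's outputs as lightweight data structures in Python. All class names, fields, data sources, and the data-type-to-capability pairings are hypothetical and would vary by project.

```python
from dataclasses import dataclass, field
from enum import Enum


class Quality(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class DataSource:
    """One entry from the Data Source Mapping activity."""
    name: str
    public: bool          # public vs. private data
    governed: bool        # governed vs. ungoverned data
    quality: Quality      # high / medium / low quality rating


@dataclass
class Feature:
    """One entry from the Feature-to-Data Source Mapping activity."""
    name: str
    required_sources: list[str] = field(default_factory=list)


# ML Mapping: data types matched to candidate ML capabilities
# (hypothetical pairings, for illustration only).
ML_CAPABILITIES = {
    "tabular": ["forecasting", "categorization", "recommendation"],
    "text": ["categorization", "summarization"],
    "images": ["categorization", "object detection"],
}


def flag_risky_features(features: list[Feature],
                        sources: dict[str, DataSource]) -> list[str]:
    """Return features that depend on missing or low-quality data,
    so the team can pivot before development starts."""
    risky = []
    for feature in features:
        for source_name in feature.required_sources:
            source = sources.get(source_name)
            if source is None or source.quality is Quality.LOW:
                risky.append(feature.name)
                break
    return risky


if __name__ == "__main__":
    sources = {
        "sales_history": DataSource("sales_history", public=False,
                                    governed=True, quality=Quality.HIGH),
        "weather_feed": DataSource("weather_feed", public=True,
                                   governed=False, quality=Quality.LOW),
    }
    features = [
        Feature("demand_forecast", ["sales_history"]),
        Feature("weather_adjusted_forecast",
                ["sales_history", "weather_feed"]),
    ]
    print(flag_risky_features(features, sources))
    # ['weather_adjusted_forecast']
```

The point of the sketch is the same as the workshop's: a feature backed only by high-quality, available data passes quietly, while one that leans on a missing or low-quality source gets flagged before any development effort is spent.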

Validation: Does It Work in Practice?

The researchers validated Define-ML in two ways:

  • Static validation – A simulated workshop with industry practitioners (energy sector) working on a toy problem (loan approval system).
  • Dynamic validation – A real-world case study with a multinational energy drink company developing a retail demand forecasting tool.

Key takeaways from participants:

Data Source Mapping was praised for "clarifying data complexity" and "revealing gaps in data availability." One participant noted, "We easily created a blueprint for aligning data into a single source."

Feature-to-Data Source Mapping helped teams "understand which features were realistically supported by data." Some suggested more time should be allocated to this step.

ML Mapping was seen as valuable but slightly more technical. Some participants recommended a pre-workshop primer on ML basics to make it more accessible.

Overall, 100% of participants in the dynamic validation expressed intent to adopt Define-ML, citing its ability to "align functionalities, data, and expectations."

Why This Matters for Businesses

Define-ML doesn’t just help teams brainstorm—it forces them to confront feasibility early. Too many ML projects fail because:

  • Stakeholders overestimate what AI can do.
  • Teams underestimate data requirements.
  • Business goals and technical execution are misaligned.

By integrating data and ML considerations into ideation, Define-ML reduces wasted effort and increases the chances of building viable ML products. As one participant put it: "A really pleasant experience seeing ideas become concepts, then products."

The Road Ahead

The framework is openly available, with templates on Miro for teams to try. Future work may expand it to cover Generative AI and intelligent agents—areas where ideation challenges are even more pronounced.

For now, Define-ML offers a structured way to navigate the messy early stages of ML product development. If your team is wrestling with AI feasibility, data alignment, or stakeholder expectations, it might be worth a look.


Interested in trying Define-ML? Check out the open-access Miro template and the full paper for deeper insights.