Is your AI product actually working? How to develop the right metric system

Published April 27, 2025, by Juwan Chacko

In my first role as a machine learning (ML) product manager, a simple question sparked intense debate across departments and leadership: How do we know whether this product is actually working? The product I managed served both internal and external customers: it helped internal teams identify the main issues customers faced so they could prioritize the right solutions. With such a tangled web of internal and external customers, choosing the right metrics to measure the product's impact was crucial for steering it toward success.

Failing to monitor the effectiveness of your product is like trying to land a plane without air traffic control. Without knowing what is going well or badly, you cannot make informed decisions for your customers. Worse, if you do not define the metrics proactively, your team will invent their own. The danger of leaving an 'accuracy' or 'quality' metric open to interpretation is that everyone develops their own version, and the team stops pulling toward the same goal.

For instance, when I discussed my annual goal and the underlying metric with our engineering team, their immediate response was: “But this is a business metric, we already track precision and recall.”

First and foremost, determine what you want to learn about your AI product. Defining metrics gets more complex when a product serves multiple customers, as an ML product often does. How do you measure whether a model is effective? Measuring how internal teams use the model to prioritize releases may be too slow a signal, while measuring customer adoption of the solutions the model recommends can mislead, because adoption is a broad metric (what if the customer did not adopt the solution simply because they wanted to speak with a support agent?).

In the age of large language models (LLMs), where outputs include text answers, images, and music, the dimensions that need metrics multiply quickly: formats, customers, output types, and more.

When developing metrics for any of my products, my first step is to distill the impact on customers into key questions; asking the right questions makes choosing the appropriate metrics much easier. Here are some examples (a code sketch after this list shows how they might be computed):

1. Did the customer receive an output? → metric for coverage
2. How long did it take for the product to provide an output? → metric for latency
3. Did the user like the output? → metrics for customer feedback, customer adoption, and retention
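
To make this concrete, here is a minimal sketch of how the three metrics above might be computed from product event logs. The event schema (`session_id`, `served_output`, `latency_ms`, `thumbs_up`) is hypothetical, not from the article; adapt the field names to whatever your instrumentation actually records.

```python
from statistics import median

# Hypothetical event log: one record per user session (schema is an
# assumption, not from the article).
sessions = [
    {"session_id": "s1", "served_output": True,  "latency_ms": 420, "thumbs_up": True},
    {"session_id": "s2", "served_output": True,  "latency_ms": 980, "thumbs_up": False},
    {"session_id": "s3", "served_output": False, "latency_ms": None, "thumbs_up": None},
]

def coverage(events):
    """Q1: Did the customer receive an output? -> % of sessions with an output."""
    return 100 * sum(e["served_output"] for e in events) / len(events)

def latency_p50(events):
    """Q2: How long did it take? -> median time to produce an output (ms)."""
    return median(e["latency_ms"] for e in events if e["served_output"])

def feedback_rate(events):
    """Q3: Did the user like the output? -> % of served sessions with thumbs up."""
    served = [e for e in events if e["served_output"]]
    return 100 * sum(bool(e["thumbs_up"]) for e in served) / len(served)

print(f"coverage: {coverage(sessions):.0f}%")       # 67%
print(f"latency p50: {latency_p50(sessions)} ms")   # 700 ms
print(f"feedback: {feedback_rate(sessions):.0f}%")  # 50%
```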

After identifying key questions, the next step is to determine a set of sub-questions for ‘input’ and ‘output’ signals. Output metrics are lagging indicators that measure events that have already occurred. Input metrics, on the other hand, are leading indicators that can identify trends or predict outcomes. Not all questions need to have leading or lagging indicators.
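
One lightweight way to keep the input/output distinction visible is a small metric registry that tags each metric as leading or lagging. The structure below is illustrative, not something the article prescribes:

```python
from dataclasses import dataclass
from typing import Literal

@dataclass(frozen=True)
class Metric:
    name: str
    question: str                        # the customer question it answers
    kind: Literal["leading", "lagging"]  # leading = input, lagging = output

METRICS = [
    Metric("coverage", "Did the customer receive an output?", "lagging"),
    Metric("latency_p50", "How long did it take to provide an output?", "lagging"),
    Metric("thumbs_up_rate", "Did the user like the output?", "lagging"),
    Metric("rubric_quality", "Was the output good/fair?", "leading"),
]

# E.g., when triaging a quality regression, check leading indicators first:
leading = [m.name for m in METRICS if m.kind == "leading"]
print(leading)  # ['rubric_quality']
```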

The final step is to establish the method for collecting metrics. Most metrics are collected at scale through new instrumentation via data engineering. However, for ML-based products, especially for questions like the third example above, you have the option of manual or automated evaluations to assess model outputs. While automated evaluations are preferred, starting with manual evaluations for assessing the quality of outputs can lay the groundwork for a robust and tested automated evaluation process.
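
Before switching from manual to automated evaluation, it is worth measuring how often the automated grader agrees with human rubric labels on the same outputs. A minimal sketch, assuming a set of human labels and a candidate automated grader (both hypothetical):

```python
def agreement(human_labels, auto_labels):
    """% of outputs where the automated grader matches the human rubric label."""
    assert len(human_labels) == len(auto_labels)
    matches = sum(h == a for h, a in zip(human_labels, auto_labels))
    return 100 * matches / len(human_labels)

# One label per model output: 'good', 'fair', or 'bad' (rubric values assumed).
human = ["good", "fair", "bad", "good", "good"]
auto  = ["good", "fair", "good", "good", "bad"]

print(f"human/auto agreement: {agreement(human, auto):.0f}%")  # 60%
# A common pattern: keep manual evaluation as the source of truth until
# agreement is consistently high enough for your use case.
```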

Example use cases: AI search, listing descriptions

The framework outlined above can be applied to any ML-based product to identify the primary metrics for the product. Let’s consider search as an example.

| Question | Metric | Nature of metric |
| --- | --- | --- |
| Did the customer receive an output? → Coverage | % of search sessions with search results displayed to the customer | Output |
| How long did it take for the product to provide an output? → Latency | Time taken to display search results to the user | Output |
| Did the user indicate that the output is correct/incorrect? → Customer feedback, adoption, retention | % of search sessions with 'thumbs up' feedback on search results from the customer, or % of search sessions with clicks from the customer | Output |
| Was the output good/fair? | % of search results marked as 'good/fair' for each search term, according to a quality rubric | Input |
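
As a sketch of how the Output rows of this table might be computed at scale, here is a pandas aggregation over a hypothetical per-session search log (column names are assumptions, not from the article):

```python
import pandas as pd

# Hypothetical per-session search log; replace with real instrumentation.
df = pd.DataFrame({
    "session_id":    ["s1", "s2", "s3", "s4"],
    "results_shown": [True, True, False, True],
    "latency_ms":    [310.0, 540.0, None, 290.0],
    "thumbs_up":     [True, False, None, True],
    "clicked":       [True, True, None, False],
})

shown = df[df["results_shown"]]  # sessions where results were displayed
summary = {
    "coverage_pct":   100 * df["results_shown"].mean(),              # Output
    "latency_p50_ms": shown["latency_ms"].median(),                  # Output
    "thumbs_up_pct":  100 * shown["thumbs_up"].astype(bool).mean(),  # Output
    "click_pct":      100 * shown["clicked"].astype(bool).mean(),    # Output
}
print(summary)  # {'coverage_pct': 75.0, 'latency_p50_ms': 310.0, ...}
```

The Input row (% of results marked 'good/fair' against the quality rubric) would come from the evaluation process described above rather than from instrumentation.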

Now consider a product that generates descriptions for listings (e.g., menu items on DoorDash or product listings on Amazon).

| Question | Metric | Nature of metric |
| --- | --- | --- |
| Did the customer receive an output? → Coverage | % of listings with generated descriptions | Output |
| How long did it take for the product to provide an output? → Latency | Time taken to generate descriptions for the user | Output |
| Did the user indicate that the output is correct/incorrect? → Customer feedback, adoption, retention | % of listings with generated descriptions requiring edits from the technical content team/seller/customer | Output |
| Was the output good/fair? | % of listing descriptions marked as 'good/fair', according to a quality rubric | Input |

This approach extends to any ML-based product. I hope this framework helps you define the right metrics for your ML model.

Sharanya Rao is a group product manager at Intuit.
