Is your AI product actually working? How to develop the right metric system

Published April 27, 2025 By Juwan Chacko

In my first role as a machine learning (ML) product manager, a simple question sparked intense debate across departments and leadership: how can we tell whether this product is actually working? The product I oversaw served both internal and external customers: it helped internal teams identify the top issues customers faced so they could prioritize the right set of solutions. With such a complex web of relationships between internal and external customers, choosing the right metrics to capture the product's impact was crucial to steering it toward success.

Failing to track whether your product is working is like trying to land a plane without guidance from air traffic control. If you don't know what is going well or badly, you cannot make informed decisions for your customers. Moreover, if you don't define the metrics proactively, your team will invent their own alternatives. The danger of leaving an 'accuracy' or 'quality' metric open to interpretation is that each person develops their own version, and the team ends up not working toward the same goal.

For instance, when I discussed my annual goal and the underlying metric with our engineering team, their immediate response was: “But this is a business metric, we already track precision and recall.”

First and foremost, determine what you want to learn about your AI product. Defining metrics gets harder for a product that, like many ML products, serves multiple customers at once. How do you measure whether a model is effective? Measuring whether internal teams used our models to prioritize releases may be too slow a signal, and measuring customers' adoption of the solutions our model recommended risks drawing conclusions from an overly broad adoption metric (what if the customer skipped the recommended solution simply because they wanted to speak with a support agent?).


In the age of large language models (LLMs), where outputs span text answers, images, and music, the dimensions that need metrics multiply rapidly: output formats, customers, content types, and more.

When developing metrics for any of my products, my first step is to distill the product's impact on customers into a few key questions. Getting the questions right makes it much easier to choose the appropriate metrics. Here are some examples:

1. Did the customer receive an output? → metric for coverage
2. How long did it take for the product to provide an output? → metric for latency
3. Did the user like the output? → metrics for customer feedback, customer adoption, and retention
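
To make these three questions concrete, here is a minimal sketch of how they might be computed from a per-request event log. The schema (`output_returned`, `latency_ms`, `thumbs_up`) is a hypothetical example, not a prescribed one:

```python
from statistics import median

# Hypothetical event log: one record per product request (field names are illustrative).
events = [
    {"request_id": "r1", "output_returned": True, "latency_ms": 120, "thumbs_up": True},
    {"request_id": "r2", "output_returned": True, "latency_ms": 340, "thumbs_up": False},
    {"request_id": "r3", "output_returned": False, "latency_ms": None, "thumbs_up": None},
]

# 1. Coverage: did the customer receive an output?
coverage = sum(e["output_returned"] for e in events) / len(events)

# 2. Latency: how long did the product take to respond (over requests that returned an output)?
median_latency_ms = median(e["latency_ms"] for e in events if e["output_returned"])

# 3. Feedback: did the user like the output (thumbs-up rate over answered requests)?
answered = [e for e in events if e["output_returned"]]
thumbs_up_rate = sum(bool(e["thumbs_up"]) for e in answered) / len(answered)

print(f"coverage={coverage:.0%}, median latency={median_latency_ms} ms, thumbs-up rate={thumbs_up_rate:.0%}")
```

In practice these aggregates would be computed over instrumented production traffic rather than a hard-coded list, but the mapping from question to metric stays the same.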

After identifying the key questions, the next step is to break each one into sub-questions for 'input' and 'output' signals. Output metrics are lagging indicators: they measure events that have already occurred. Input metrics are leading indicators that can surface trends or predict outcomes. Not every question needs both leading and lagging indicators.

The final step is to establish the method for collecting metrics. Most metrics are collected at scale through new instrumentation via data engineering. However, for ML-based products, especially for questions like the third example above, you have the option of manual or automated evaluations to assess model outputs. While automated evaluations are preferred, starting with manual evaluations for assessing the quality of outputs can lay the groundwork for a robust and tested automated evaluation process.
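
As a sketch of what "starting manual" can look like, the snippet below assumes human reviewers label a sample of model outputs against a simple good/fair/poor rubric; the aggregate becomes the input metric, and the same rubric can later be handed to an automated evaluator. The rubric labels and field names are illustrative assumptions:

```python
# Hypothetical quality rubric for manual review of model outputs; labels are illustrative.
RUBRIC_LABELS = {"good", "fair", "poor"}

def pct_good_or_fair(labels: list[str]) -> float:
    """Share of sampled outputs that reviewers marked 'good' or 'fair'."""
    assert labels and all(label in RUBRIC_LABELS for label in labels)
    return sum(label in ("good", "fair") for label in labels) / len(labels)

# Labels collected from a manual review of ten sampled outputs.
manual_labels = ["good", "fair", "poor", "good", "good", "fair", "poor", "good", "fair", "good"]
print(f"% good/fair (manual sample): {pct_good_or_fair(manual_labels):.0%}")

# Once the rubric and sampling process are stable, the same aggregation can be fed
# by an automated evaluator (for example, a model grading outputs against the rubric)
# instead of human reviewers.
```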

Example use cases: AI search, listing descriptions

The framework outlined above can be applied to any ML-based product to identify the primary metrics for the product. Let’s consider search as an example.


1. Did the customer receive an output? → Coverage
   Metric: % of search sessions with search results displayed to the customer (Output)
2. How long did it take for the product to provide an output? → Latency
   Metric: time taken to display search results to the user (Output)
3. Did the user like the output? → Customer feedback, customer adoption, and retention
   Did the user indicate that the output is correct/incorrect? Metric: % of search sessions with 'thumbs up' feedback on the search results, or % of search sessions with clicks from the customer (Output)
   Was the output good/fair? Metric: % of search results marked 'good/fair' for each search term, according to the quality rubric (Input)
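
These metrics can also be captured directly as a small registry that keeps each question, its metric definition, and its nature (output vs. input) in one place. The structure below is an illustrative sketch, not a standard schema:

```python
# Illustrative metric registry for the AI search example.
search_metrics = [
    {
        "question": "Did the customer receive an output?",
        "metric": "% of search sessions with results displayed to the customer",
        "nature": "output",  # lagging indicator
    },
    {
        "question": "How long did it take for the product to provide an output?",
        "metric": "Time taken to display search results to the user",
        "nature": "output",
    },
    {
        "question": "Did the user indicate that the output is correct/incorrect?",
        "metric": "% of search sessions with a thumbs-up or a click on the results",
        "nature": "output",
    },
    {
        "question": "Was the output good/fair?",
        "metric": "% of search results marked 'good/fair' per the quality rubric",
        "nature": "input",  # leading indicator
    },
]

# For example, list only the leading (input) indicators to watch week over week.
leading_indicators = [m["metric"] for m in search_metrics if m["nature"] == "input"]
print(leading_indicators)
```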

Consider a product that generates descriptions for listings (e.g., menu items on Doordash or product listings on Amazon).

1. Did the customer receive an output? → Coverage
   Metric: % of listings with generated descriptions (Output)
2. How long did it take for the product to provide an output? → Latency
   Metric: time taken to generate descriptions for the user (Output)
3. Did the user like the output? → Customer feedback, customer adoption, and retention
   Did the user indicate that the output is correct/incorrect? Metric: % of listings with generated descriptions requiring edits from the technical content team, seller, or customer (Output)
   Was the output good/fair? Metric: % of listing descriptions marked 'good/fair', according to the quality rubric (Input)
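
One way to check that the input/output pairing above is useful is to confirm that the leading rubric metric actually tracks the lagging edit-rate metric over time. The weekly numbers below are purely illustrative, not real data:

```python
from statistics import correlation  # available in Python 3.10+

# Hypothetical weekly aggregates for the listing-description product (illustrative numbers).
# Input (leading): % of sampled descriptions marked 'good'/'fair' per the rubric.
rubric_pass_rate = [0.62, 0.68, 0.75, 0.81, 0.86]
# Output (lagging): % of published descriptions that later required edits.
edit_rate = [0.40, 0.37, 0.31, 0.24, 0.20]

# If the leading metric is informative, improvements in rubric pass rate should move
# together with (here, inversely to) the edit rate observed later.
print(f"correlation between rubric pass rate and edit rate: {correlation(rubric_pass_rate, edit_rate):.2f}")
```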

This approach can be extended to various ML-based products. I trust this framework will assist you in defining the appropriate metrics for your ML model.

Sharanya Rao is a group product manager at Intuit.
