Salesforce is addressing a prevalent challenge in artificial intelligence for business applications: the gap between an AI system’s raw intelligence and its ability to perform consistently in unpredictable enterprise environments, a gap known as “jagged intelligence.”
In a significant research announcement, Salesforce AI Research unveiled new benchmarks, models, and frameworks aimed at making future AI agents more intelligent, trustworthy, and versatile for enterprise use. The work targets both the capabilities and the reliability of AI systems, particularly when they operate as autonomous agents in complex business scenarios.
According to Silvio Savarese, Salesforce’s Chief Scientist and Head of AI Research, conventional AI systems may excel in standardized tests, intricate planning, and creative tasks like poetry generation, but they often struggle to deliver consistent task execution in dynamic enterprise environments.
The initiative reflects Salesforce’s commitment to “Enterprise General Intelligence” (EGI): AI tailored to the complexity of business, as opposed to the more theoretical pursuit of Artificial General Intelligence (AGI).
Savarese explained, “EGI refers to purpose-built AI agents optimized for business challenges, focusing not only on capability but also on consistency. Businesses are leveraging these foundational concepts to solve real-world problems at scale, rather than waiting for a distant vision of superintelligent machines.”
Addressing AI’s Consistency Challenge in Enterprise Settings
A central focus of the research is measuring and reducing the inconsistency of AI performance. Salesforce introduced the “SIMPLE dataset,” a benchmark of 225 straightforward reasoning questions designed to evaluate how jagged an AI system’s capabilities are.
Shelby Heinecke, Senior Manager of Research at Salesforce, highlighted the importance of measuring AI’s jaggedness using the SIMPLE benchmark. In enterprise applications, inconsistency in AI performance can have severe consequences, potentially disrupting operations, damaging customer trust, or causing financial losses.
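To make the idea of jaggedness concrete, here is a minimal sketch of one way to score how consistently a model answers the same simple questions across repeated attempts. It is not Salesforce’s method: the article does not describe how SIMPLE is scored, and the function names, scoring rule, and sample data below are all illustrative assumptions.

```python
from collections import Counter


def consistency(answers):
    """Fraction of attempts that agree with the most common answer."""
    if not answers:
        raise ValueError("need at least one answer")
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers)


def jaggedness_report(runs, gold):
    """Score each question on accuracy and consistency.

    runs: question id -> list of answers from repeated attempts
    gold: question id -> correct answer
    Returns: question id -> (accuracy, consistency)
    """
    report = {}
    for qid, answers in runs.items():
        agreement = consistency(answers)
        accuracy = sum(a == gold[qid] for a in answers) / len(answers)
        report[qid] = (accuracy, agreement)
    return report


# Hypothetical results: four attempts at each of two easy questions.
runs = {
    "q1": ["4", "4", "4", "4"],  # stable and correct
    "q2": ["7", "9", "7", "8"],  # answer changes between attempts
}
gold = {"q1": "4", "q2": "7"}

report = jaggedness_report(runs, gold)
for qid, (accuracy, agreement) in report.items():
    print(f"{qid}: accuracy={accuracy:.2f} consistency={agreement:.2f}")
```

A model that scores well on hard tasks but shows low consistency on questions like the hypothetical "q2" is the kind of uneven profile the term describes.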
Introducing CRMArena: A Virtual Testing Environment
An essential innovation is CRMArena, a benchmarking framework designed to simulate realistic customer relationship management scenarios. This framework enables comprehensive testing of AI agents in professional contexts, bridging the gap between academic benchmarks and actual business demands.
Savarese emphasized the significance of CRMArena in evaluating agent performance across different personas within a CRM environment. The framework aims to enhance agent capabilities and reliability through stress testing and learning from failure cases.
Enhanced Embedding Models for Enterprise Context
Salesforce introduced SFR-Embedding, a model focused on deeper contextual understanding that leads the Massive Text Embedding Benchmark (MTEB) across various datasets. Alongside it, the developer-oriented SFR-Embedding-Code is intended to improve code search and development efficiency.
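As a rough illustration of how embedding-based code search works in general, the sketch below ranks code snippets by cosine similarity to a query vector. It does not call SFR-Embedding or SFR-Embedding-Code; the three-dimensional vectors, snippet names, and `search` helper are hypothetical stand-ins for embeddings a real model would produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query_vec, index, top_k=2):
    """Return the names of the top_k snippets most similar to the query."""
    scored = [(cosine(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical embeddings; a real model would output far larger vectors.
index = {
    "parse_csv": [0.9, 0.1, 0.0],
    "send_email": [0.0, 0.2, 0.9],
    "read_tsv": [0.8, 0.3, 0.1],
}
query = [1.0, 0.1, 0.0]  # stand-in for the embedding of "load a delimited file"

results = search(query, index)
print(results)
```

The quality of such a search depends almost entirely on the embedding model, which is why benchmark results for the model matter to developers.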
Advantages of Smaller Action-Focused AI Models
The xLAM V2 (Large Action Model) family is designed to predict actions rather than generate text, and it offers a more efficient approach, with models starting at 1 billion parameters. These models excel at predicting and executing task sequences, which makes them valuable for autonomous agents that interact with enterprise systems.
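The difference between predicting an action and generating text can be sketched as follows: instead of prose, the model emits a structured action that software can validate and execute. The JSON shape, tool names, and `execute` helper here are assumptions made for illustration, not the xLAM output format or a Salesforce API.

```python
import json


def create_case(subject, priority="normal"):
    """Hypothetical tool: open a support case."""
    return {"status": "created", "subject": subject, "priority": priority}


def lookup_account(account_id):
    """Hypothetical tool: fetch an account record."""
    return {"account_id": account_id, "found": True}


# Registry of the actions the agent is allowed to take.
TOOLS = {"create_case": create_case, "lookup_account": lookup_account}


def execute(action_json):
    """Parse a predicted action and dispatch it to the matching tool."""
    action = json.loads(action_json)
    name = action.get("name")
    if name not in TOOLS:
        raise ValueError(f"unknown action: {name}")
    return TOOLS[name](**action.get("arguments", {}))


# Stand-in for what an action model might predict for "log a high-priority login issue".
predicted = (
    '{"name": "create_case", '
    '"arguments": {"subject": "Login failure", "priority": "high"}}'
)

result = execute(predicted)
print(result)
```

Because the output is structured, an unknown or malformed action can be rejected before it touches an enterprise system, which is one reason action prediction suits agent workloads.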
Ensuring Enterprise AI Safety
Salesforce introduced SFR-Guard, a set of models trained on both public and internal CRM data to strengthen the Trust Layer, ensuring AI agent behavior aligns with business needs and standards. Additionally, ContextualJudgeBench evaluates LLM-based judge models for accuracy and appropriateness in context.
Moreover, Salesforce unveiled TACO, a family of multimodal action models for addressing complex problems through chains of thought-and-action, enhancing AI’s ability to interpret and respond to diverse queries involving multiple media types.
Customer Co-Innovation in AI Development
Itai Asseo, Senior Director of Incubation and Brand Strategy at Salesforce AI Research, emphasized the role of customer feedback in shaping enterprise-ready AI solutions. According to the company, collaboration with customers has led to significant improvements in AI performance, which it presents as evidence that co-innovation is effective in addressing business challenges.
Future of Salesforce AI
Salesforce’s research initiatives align with the increasing demand for AI systems combining advanced capabilities with consistent performance in enterprise settings. The focus on addressing the consistency gap underscores Salesforce’s commitment to real-world business requirements over academic benchmarks.
The announced technologies will be progressively implemented, with SFR-Embedding being the first to debut in Data Cloud, while other innovations will power upcoming versions of Agentforce. Salesforce aims to lead the enterprise AI revolution by prioritizing consistency and reliability in AI applications.