Agentic AI Benchmarks: Proprietary- Open Source AI Agents & Performance
Agentic AI includes agents that execute complex tasks with minimal human supervision. We evaluated the most popular AI agents, open-source AI agent frameworks, customer service AI agents, and the performance of popular LLMs as AI agents.
Optimizing Agentic Coding: How to use Claude Code
AI coding tools have become indispensable for many development tasks. In our tests, popular AI coding tools like Cursor have been responsible for generating over 70% of the code required for tasks.
40+ Agentic AI Use Cases with Real-life Examples
Autonomous generative AI agents execute complex tasks with little or no human supervision. Agentic AI differs from chatbots and co-pilots. Unlike traditional AI, particularly generative AI, which often requires human intervention in complex workflows, agentic AI aims to autonomously navigate and optimize processes thanks to its decision-making capabilities and goal-directed behavior.
Authorization for AI Agents: Permit.io, Descope & more
I have been exploring agent identity and the authentication/authorization platforms that could support it, while also examining how standards like OAuth 2.0 and frameworks such as Keycloak might apply. Below, I list the best AI agent–specific platforms and features, categorized by their primary focus.
AI Identities: The Role of Agentic Systems in Governance
Agentic AI systems are rapidly emerging in enterprise environments. To govern them safely, each agent needs to be recognized as a first-class identity with its own credentials, permissions, and audit trail.
Agentic AI Architecture for Industrial Systems
Agentic AI allows natural language interaction with industrial systems, enabling users to query data and receive actionable insights. We will outline a reference architecture designed for industrial environments and describe how task-specific agents and tools can be orchestrated. We will also explore the current state of natural language interfaces (NLIs) in industrial systems.
Agentic Mesh: The Future of Scalable AI Collaboration
While much has been written about agent architectures, real-world production-grade implementations remain limited. Building on my earlier post about A2A fundamentals, this piece highlights the agentic AI mesh, a concept introduced in a recent McKinsey.
How we Moved from LLM Scorers to Agentic Evals?
Evaluating LLM applications primarily focuses on testing an application end-to-end to ensure it performs consistently and reliably. We previously covered traditional text-based LLM evaluation methods like BLEU or ROUGE. Those classical reference-based NLP metrics are useful for tasks such as translation or summarization, where the goal is simply to match a reference output.
AI Agents vs Agentic AI Systems
There’s been a lot of buzz around the terms “AI agents” and “Agentic AI systems” lately. While they’re often used interchangeably, they actually refer to slightly different concepts.
LCMs: From LLM Tokenization to Concept-level Representation
Large concept models (LCMs), as introduced by Meta in their work on “Large Concept Models,” represent a fundamental shift away from token-based prediction toward concept-level representation.
The 7 Layers of Agentic AI Stack
The rise of agentic AI has introduced a technology stack that extends well beyond simple calls to foundation-model APIs. Unlike traditional software stacks, where value often concentrates at the application tier, the agentic AI stack distributes value more unevenly. Some layers offer strong opportunities for differentiation and moat building, while others are rapidly becoming commoditized.