Şevval Alper
Research interests
Şevval focuses on AI coding tools, AI agents, and quantum technologies. She is part of the AIMultiple benchmark team, conducting assessments and providing insights to help readers understand various emerging technologies and their applications.
Professional experience
She contributed to organizing and guiding participants in three “CERN International Masterclasses - hands-on particle physics” events in Türkiye, working alongside faculty to facilitate learning.
Education
Şevval holds a Bachelor's degree in Physics from Middle East Technical University.
Latest Articles from Şevval
AI Agent Platforms Benchmark: Claude Managed Agents vs Google Vertex Agent Engine
We benchmarked 4 AI agent platforms across 3 dimensions: task completion (10 coding tasks × 3 runs), harness-specific capabilities (steering, reconnection, long-conversation recall, large-file handling), and cost.
MCP Benchmark: Top MCP Servers for Web Access
We benchmarked 8 MCP servers across web search and extraction, as well as browser automation tasks, by running 4 different tasks 5 times on all suitable MCPs. We also performed a load test involving 250 concurrent AI agents.
E-Commerce AI Video Maker Benchmark: Veo 3 vs Sora 2
Product visualization plays a crucial role in e-commerce success, yet creating high-quality product videos remains a significant challenge. Recent advancements in AI video generation technology offer promising solutions.
AI Code Review Tools Benchmark
With the increased use of AI coding tools, codebases have become more prone to vulnerabilities, which has increased the need for effective code reviews.
AGI Benchmark: Can AI Generate Economic Value
AI will have its greatest impact when AI systems start to create economic value autonomously. We benchmarked whether frontier models can generate economic value. We prompted them to build a new digital application (e.g., website or mobile app) that can be monetized with a SaaS or advertising-based model.
8 AI Code Models Benchmarked: LMC-Eval
More than 37% of tasks performed on AI models are about computer programming and maths.
OCR Benchmark: Text Extraction / Capture Accuracy
OCR accuracy is critical for many document processing tasks, and SOTA multi-modal LLMs are now offering an alternative to OCR.
Text-to-Video Generator Benchmark
A text-to-video generator is an AI system that turns written prompts into short videos by generating visuals, motion, and sometimes audio directly from natural language.
Code Execution with MCP: A New Approach to AI Agent Efficiency
Anthropic introduced a method in which AI agents interact with Model Context Protocol (MCP) servers by writing executable code rather than making direct calls to tools. The agent treats tools as files on a computer, finds what it needs, and uses them directly with code, so intermediate data doesn’t have to pass through the model’s memory.
LLM Parameters: GPT-5 High, Medium, Low and Minimal
New LLMs, such as OpenAI’s GPT-5 family, come in different versions (e.g., GPT-5, GPT-5-mini, and GPT-5-nano) and with various parameter settings, including high, medium, low, and minimal. Below, we explore the differences between these model versions by gathering their benchmark performance and the costs to run the benchmarks.