Şevval Alper
Şevval is an AI researcher at AIMultiple. She has research experience in pseudorandom number generation using chaotic systems.
Research focus
Şevval focuses on AI coding tools, AI agents, and quantum technologies.
She is part of the AIMultiple benchmark team, conducting evaluations and providing insights that help readers understand emerging technologies and their applications.
Professional experience
She helped organize three "CERN International Masterclasses - hands-on particle physics" events in Turkey and supervised their participants, working closely with the instructors to facilitate learning.
Education
Şevval holds a bachelor's degree in physics from Middle East Technical University.
Latest articles by Şevval
AGI Benchmark: Can AI Generate Economic Value?
AI will have its greatest impact when AI systems start to create economic value autonomously. We benchmarked whether frontier models can generate economic value. We prompted them to build a new digital application (e.g., website or mobile app) that can be monetized with a SaaS or advertising-based model.
Benchmark of 8 AI Code Models: LMC-Eval
More than 37% of tasks performed on AI models are about computer programming and maths.
OCR Benchmark: Text Extraction / Capture Accuracy
OCR accuracy is critical for many document processing tasks, and SOTA multi-modal LLMs are now offering an alternative to OCR.
Text-to-Video Generator Benchmark
A text-to-video generator is an AI system that turns written prompts into short videos by generating visuals, motion, and sometimes audio directly from natural language.
Code Execution with MCP: A New Approach to AI Agent Efficiency
Anthropic introduced a method in which AI agents interact with Model Context Protocol (MCP) servers by writing executable code rather than making direct calls to tools. The agent treats tools as files on a computer, finds what it needs, and uses them directly with code, so intermediate data doesn’t have to pass through the model’s memory.
Top 10 Google Colab Alternatives
Google Colaboratory is a popular platform for data scientists and machine learning scientists, but its limitations and pricing may not meet your needs. Several alternatives offer unique features and capabilities that cater to different data science needs and scenarios.
LLM Parameters: GPT-5 High, Medium, Low and Minimal
New LLMs, such as OpenAI’s GPT-5 family, come in different versions (e.g., GPT-5, GPT-5-mini, and GPT-5-nano) and with various parameter settings, including high, medium, low, and minimal. Below, we explore the differences between these model versions by gathering their benchmark performance and the costs to run the benchmarks.
AI Agents: Operator vs Browser Use vs Project Mariner
AI agents are increasingly marketed as end-to-end digital workers, but real-world performance can vary widely depending on the task, tools, and execution environment. To understand what these systems can genuinely deliver today, we conducted hands-on benchmarking across practical business scenarios.
Speech-to-Text Benchmark: Deepgram vs. Whisper
We benchmarked the leading speech-to-text (STT) providers, focusing specifically on healthcare applications. Our benchmark used real-world examples to assess transcription accuracy in medical contexts, where precision is crucial. Based on both word error rate (WER) and character error rate (CER) results, GPT-4o-transcribe demonstrates the highest transcription accuracy among all evaluated speech-to-text systems.
Vibe Coding: Great for MVPs, but Not Yet Production-Ready
Vibe coding is a new term that arrived with AI coding tools like Cursor. It means writing software by prompting alone. We ran several benchmarks on vibe coding tools and drew on that experience to prepare this detailed guide.