Berk Kalelioğlu
Research interests
Berk focuses on machine learning, agentic AI tools, and large and small language models (LLMs and SLMs). He is part of the AIMultiple benchmark team, where he runs evaluations and provides insights to help readers understand emerging technologies and their real-world applications.
Professional experience
He began his career as a technical project lead at ODTU IVME-R, where he led a project to build physical quantum and pseudorandom number generators. After IVME-R, he co-founded a video game development company and published a game on Steam. He then shifted his career toward artificial intelligence and joined AIMultiple as a researcher.
Education
Berk holds a degree in mathematics from Ankara University.
Latest articles by Berk
Best LLMs for Extended Context Windows
We ran a proprietary 32-message conversation test on 22 leading AI models to see how much of their advertised context windows actually work. The conversation includes synthesis tasks that require recalling information from earlier messages, not just parroting the last thing said.
AI Memory: The Most Popular AI Models with the Best Memory
Smarter models often have worse memory. We tested 26 popular large language models in a simulated 32-message business conversation with 43 questions to determine which ones actually retain information.
AI Hallucinations: Compare Top LLMs like GPT-5.2
AI models can generate answers that seem plausible but are incorrect or misleading, known as AI hallucinations. 77% of businesses are concerned about AI hallucinations.
Agentic CLI Tools: Codex vs Claude Code
Agentic CLI tools are AI coding tools that can create and delete files, run commands, and plan and carry out the coding of an entire project.
Tabular Models Benchmark: Performance on 19 Datasets
We benchmarked 7 widely used tabular learning models across 19 real-world datasets, covering ~260,000 samples and over 250 total features, with dataset sizes ranging from 435 to nearly 49,000 rows. Our goal was to understand top-performing model families for datasets of different sizes and structure (e.g. numeric vs.
Agentic LLM Benchmark: Comparison of Leading Models
We benchmarked the top LLMs across 10 software development tasks by using an agentic CLI tool. We executed ~3,500 automated validation steps per model across both API and UI layers. In the success rate comparison, each alias ran 3 times across 10 tasks (30 samples per alias, 230 cells per iteration).
VPS Comparison: Hetzner vs Digital Ocean
We benchmarked 6 Virtual Private Server (VPS) providers by running ~1,200 automated tests per server across CPU, memory, disk I/O, and network speed using sysbench, fio, and speedtest-cli. We also documented the full signup-to-SSH experience for each provider.
RL Environments: The Infrastructure Behind Agentic AI
Reinforcement learning environments are controlled environments where AI agents take actions, observe outcomes, and receive feedback. They are becoming more useful as models move from one-shot answers to multi-step work in coding, browser tasks, customer support, and business software. Some companies sell custom environments for coding, finance, enterprise workflows, or computer-use tasks.
OpenClaw (Moltbot/Clawdbot) Use Cases and Security
OpenClaw (formerly Moltbot and Clawdbot) is an open-source, self-hosted AI assistant designed to execute local computing tasks and interface with users through standard messaging platforms. Unlike traditional chatbots that function as advisors generating text, OpenClaw operates as an autonomous agent that can execute shell commands, manage files, and automate browser operations on the host machine.
Moltbook: Agent-Driven Social Media
The rapid growth of OpenClaw has triggered an unusual social experiment: Moltbook, a Reddit-like social platform where agents interact with each other. It launched on January 28, 2026, and quickly drew attention, reaching more than 1.5 million agents in its first week.