Berk Kalelioğlu
Research interests
Berk focuses on machine learning, agentic AI tools, and large and small language models (LLMs and SLMs). He is part of AIMultiple's benchmark team, running evaluations and providing insights to help readers understand emerging technologies and their real-world applications.
Professional experience
He began his career as a Technology Project Lead at ODTU IVME-R, where he led a project to build physical quantum and pseudorandom number generators. After IVME-R, he co-founded a game development company and released a game on Steam. He later shifted his career toward artificial intelligence and joined AIMultiple as a researcher.
Education
Berk holds a bachelor's degree in Mathematics from Ankara University.
Latest articles from Berk
Best LLMs for Extended Context Windows
We ran a proprietary 32-message conversation test on 22 leading AI models to see how much of their advertised context windows actually work. The conversation includes synthesis tasks that require recalling information from earlier messages, not just parroting the last thing said.
AI Memory: Most Popular AI Models with the Best Memory
Smarter models often have worse memory. We tested 26 popular large language models in a simulated 32-message business conversation with 43 questions to determine which actually retain information.
AI Hallucination: Compare Top LLMs like GPT-5.2
AI models can generate answers that seem plausible but are incorrect or misleading, known as AI hallucinations. 77% of businesses are concerned about AI hallucinations.
Agentic CLI Tools: Codex vs Claude Code
Agentic CLI tools are AI coding tools that can create and delete files, run commands, plan, and execute the coding of the entire project.
Tabular Models Benchmark: Performance Across 19 Datasets
We benchmarked 7 widely used tabular learning models across 19 real-world datasets, covering ~260,000 samples and over 250 total features, with dataset sizes ranging from 435 to nearly 49,000 rows. Our goal was to understand top-performing model families for datasets of different sizes and structure (e.g. numeric vs.
Agentic LLM Benchmark: Top Models Compared
We benchmarked the top LLMs across 10 software development tasks by using an agentic CLI tool. We executed ~3,500 automated validation steps per model across both API and UI layers. Each alias ran 3 times across 10 tasks (30 samples per alias, 230 cells per iteration).
VPS Benchmark: Hetzner vs Digital Ocean
We benchmarked 6 Virtual Private Server (VPS) providers by running ~1,200 automated tests per server across CPU, memory, disk I/O, and network speed using sysbench, fio, and speedtest-cli. We also documented the full signup-to-SSH experience for each provider.
RL Environments: The Infrastructure Behind Agentic AI
Reinforcement learning environments are controlled environments where AI agents take actions, observe outcomes, and receive feedback. They are becoming more useful as models move from one-shot answers to multi-step work in coding, browser tasks, customer support, and business software. Some companies sell custom environments for coding, finance, enterprise workflows, or computer-use tasks.
OpenClaw Use Cases and Security (Moltbot/Clawdbot)
OpenClaw (formerly Moltbot and Clawdbot) is an open-source, self-hosted AI assistant designed to execute local computing tasks and interface with users through standard messaging platforms. Unlike traditional chatbots that function as advisors generating text, OpenClaw operates as an autonomous agent that can execute shell commands, manage files, and automate browser operations on the host machine.
Moltbook: Agent-Driven Social Media
The rapid growth of OpenClaw has triggered an unusual social experiment: Moltbook, a Reddit-like social platform where agents interact with each other. It launched on January 28, 2026, and quickly attracted attention, reaching 1.5m+ agents in its first week.