
Audience Simulation: Can LLMs Predict Human Behavior?

Sıla Ermut
updated on Mar 6, 2026

In marketing, evaluating how accurately LLMs predict human behavior is crucial: it shows how effective they are at anticipating audience needs and exposes the risks of misalignment, ineffective communication, or unintended influence.

Audience simulation with LLMs enables the modeling of virtual audiences, helping organizations anticipate reactions to content or products without relying on costly surveys or focus groups.

We tested how well AI models can predict which of two LinkedIn posts by the same author will get higher engagement (likes, comments, shares), essentially simulating human audience behavior.

Audience simulation benchmark results

Reasons behind performance differences in LLMs

Large language models show different levels of accuracy in predicting which of two LinkedIn posts by the same author will receive higher engagement. These differences emerge from how well each model processes the inputs described in the benchmark and how effectively it identifies the factors that influence audience reactions.

Understanding engagement signals

The benchmark requires models to evaluate subtle cues that predict engagement. Higher performing models tend to detect these cues more accurately. These cues include whether the post:

  • Presents a personal insight or a lesson
  • Asks a direct question
  • Is relatable to a broad audience
  • Appears promotional
  • Uses a structure that holds reader attention

Models such as DeepSeek Chat V3 and Claude Opus 4 perform well because they identify these cues with greater consistency.
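As a rough illustration of what cue detection involves, the sketch below scores a post with hand-written regex heuristics. The cue patterns and weights are illustrative assumptions for this example only, not the benchmark's scoring rules or what any model does internally:

```python
import re

# Hypothetical engagement cues with illustrative weights. A real model
# infers these signals implicitly; this sketch just makes them explicit.
CUES = {
    "asks_question":    (lambda t: "?" in t, 1.0),
    "personal_insight": (lambda t: bool(re.search(r"\b(I|my|we|our)\b", t)), 0.8),
    "promotional":      (lambda t: bool(re.search(r"excited to share|check out|demo", t, re.I)), -0.7),
}

def cue_score(text: str) -> float:
    """Sum the weights of every cue the post triggers."""
    return sum(weight for detect, weight in CUES.values() if detect(text))

a = "After three failed startups, here's what I wish someone told me. What's your biggest lesson?"
b = "Excited to share our new AI-powered analytics dashboard! Check out the demo."
print(cue_score(a) > cue_score(b))  # the reflective post scores higher
```

Under these toy weights, the reflective, question-asking post outscores the promotional link post, matching the pattern the benchmark rewards.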

Using contextual information

The evaluation includes several pieces of contextual data for each post, and models differ in how well they use them. Each model receives:

  • Post text
  • Media type such as text, image, video, or link
  • Follower bucket of the author

Accurate prediction requires the model to combine these inputs. Higher-performing models recognize patterns, such as lower engagement for link posts and higher engagement for reflective narratives. Weaker models often treat inputs in isolation or overlook their interactions.
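To make the input format concrete, here is a minimal sketch of how the three inputs could be assembled into a single prediction prompt. The field names and wording are our own illustration, not the benchmark's exact prompt:

```python
# Combine post text, media type, and follower bucket into one prompt.
# All structure here is an illustrative assumption.
def build_prompt(post_a: dict, post_b: dict, follower_bucket: str) -> str:
    def render(label: str, post: dict) -> str:
        return f"Post {label} ({post['media']}): {post['text']}"
    return (
        f"Author follower bucket: {follower_bucket}\n"
        f"{render('A', post_a)}\n"
        f"{render('B', post_b)}\n"
        "Which post will get more total engagement in 7 days? Answer A or B."
    )

prompt = build_prompt(
    {"media": "text", "text": "After three failed startups, here's what I learned..."},
    {"media": "link", "text": "Excited to share our new dashboard!"},
    "5k-20k",
)
print(prompt)
```

A model that reads all three fields together can weigh, say, a link media type against a reflective narrative; a model that ignores the media line loses that signal.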

Interpreting human behavior

Predicting engagement requires reasoning about audience preferences. Only a few models exhibit a strong capability in this area. Many models stay near the 50% baseline because audience behavior is variable and depends on psychological factors that are difficult to infer from text alone.

Models that perform around 52% show partial understanding of these cues. They can identify general patterns but struggle in borderline cases. Models with very low scores, such as o1, appear to misjudge standard engagement drivers and often favor the less engaging option.

Influence of training data

Model outputs reflect the data on which the models are trained. If training data does not represent a wide range of communication styles or demographic groups, the model may misinterpret certain types of content. These training differences contribute directly to the spread of results in the benchmark.

Models trained on broader or more conversational datasets tend to better approximate user reactions. Models trained on narrower datasets often rely on surface-level features that do not correlate well with actual engagement.

Generalization across authors

The dataset includes posts from 50 authors with various follower counts, media preferences, and writing styles. Models must generalize across these differences. Stronger models form consistent expectations about what drives engagement regardless of the author.

Lower-performing models apply inconsistent criteria across different authors and posts.

See our methodology to understand how we calculate these measurements.

What is audience simulation?

Audience simulation is the practice of using synthetic, model-driven populations, sometimes referred to as virtual audiences, to predict how real people may react to content, products, or policy ideas before they are released. Instead of running live tests with expensive surveys or focus groups, organizations can create personas that represent their target audience and observe their simulated responses.

The technique builds on methods from agent-based modeling, large language models, and persona simulation. Each simulated agent or persona is designed with attributes such as demographics, preferences, or behavioral tendencies. Together, these personas interact, producing synthetic data that approximates the behavior of a group of real customers or citizens in the same situation.

How do audience simulation tools work?

The mechanics of audience simulation depend on the tools used, but most approaches share standard components:

  • Persona design: Researchers define personas based on specific demographics, psychographics, or market segments. These personas can range from simple rule-based agents to detailed AI personas enriched with biographies and conversational abilities.
  • Synthetic data generation: Large language models help simulate dialogue, survey responses, or posting behavior. For example, Artificial Societies operates 100–300 AI personas that read, react to, and reshare LinkedIn posts to simulate network dynamics.
  • Interaction modeling: Personas do not act in isolation. They interact, influence one another, and form patterns such as echo chambers, cascades of reposts, or shifts in public opinion. This allows simulations to capture not just individual reactions but also group-level phenomena.
  • Scenario testing: By varying inputs such as message framing, media type, or survey questions, organizations can observe how simulated audiences respond to these variations. These scenarios help generate hypotheses and test ideas in a safe practice stage before engaging with real people.
  • Data analysis: The outputs are analyzed using techniques like word clouds, sentiment analysis, and accuracy scoring. The results can show likely winners between two post variants, common themes in feedback, or a persona’s perspective on why one idea resonates more than another.

Audience simulation use cases

Marketing and advertising

Brands can test campaign slogans, visuals, or product positioning with a virtual audience before spending on large-scale distribution. Instead of relying solely on traditional survey responses, they can generate synthetic data from AI personas and compare performance across groups.

For example, marketers can determine whether a product resonates more with Gen Z than with older professionals and adjust their creative strategy accordingly. This ability to validate campaigns at the testing stage leads to cost savings and more precise targeting.

Media and publishing

Media companies can simulate how different content formats (e.g., short posts, long-form articles, video explainers) will perform among their audiences.

Persona simulation also allows testing how headlines affect click-throughs or how tone influences shares. By anticipating reactions, editors can prioritize stories that are more likely to spread, rather than waiting for post-publication metrics.

Public policy and research

Governments and think tanks can use audience simulation to test policy research ideas. Synthetic populations modeled after specific demographics can illustrate how different communities might respond to a new tax, health regulation, or climate initiative. Researchers have applied generative simulations to explore issues like polarization and misinformation.

This approach facilitates hypothesis generation and provides a safer environment for anticipating unintended consequences before engaging with real people.

Product development

Companies can simulate how personas representing specific demographics talk about a new feature or device. For example, a tech company could compare whether small business owners, students, or enterprise managers find more value in a new software update.

Insights from the simulation can inform design decisions and mitigate the risk of releasing features that fail to resonate with the intended audience.

Training and education

Universities and businesses can use simulations to create practice environments where learners interact with AI personas. A trainee negotiator might practice with simulated counterparts, or a medical student could test communication strategies with synthetic patients.

These training scenarios offer a realistic range of responses, allowing learners to refine their skills before encountering real individuals.

Market research agencies

Traditional survey questions and focus groups can be costly and slow. Market research agencies can complement them with audience simulation to generate synthetic data that provides fast directional insights.

While simulations do not replace engagement with real customers, they can reduce dependence on expensive panels and accelerate early-stage testing.

Audience simulation tools

If you are looking for a dedicated tool for audience simulation instead of using LLMs, here are some options:

Artificial Societies

Artificial Societies enables users to describe a target audience in plain language or generate one based on social media interactions. It then constructs a “society” of personas and runs AI-driven simulations.

Each simulation includes automatic A/B testing, which generates variations of a message in the user’s style and tests them against the audience. Results are presented with scores, comments, and summaries, allowing for quick interpretation. Use cases span PR, product development, branding, marketing, journalism, and social media.

Figure 1: Artificial Societies simulation dashboard.

Real-life example: Teneo

Teneo, a PR company, was preparing to launch a new technology strategy and needed to test whether its messaging would resonate with key stakeholders before announcing it publicly. However, the company faced several constraints:

  • The strategy was confidential, limiting traditional research methods.
  • The timeline was short, making large-scale surveys difficult.
  • Important audiences, such as policymakers, industry leaders, and specialized stakeholders, were almost impossible to reach through conventional market research panels.

To address these challenges, Teneo partnered with Artificial Societies. The process included:

  • Creating AI personas: Over 5,000 AI personas were generated. These personas were based on real demographic and psychographic profiles, informed by social listening and qualitative research.
  • Building specialized “societies”: Separate AI societies represented different stakeholder groups, including:
    • Consumers
    • Industry peers
    • Policymakers, lobbyists, and political influencers.
  • Testing messaging narratives: Researchers tested six competing technology narratives using surveys and experiments within each AI society.
  • Analyzing reactions: Responses were analyzed at both aggregate and individual persona levels, allowing the team to compare reactions across audience segments.

The simulation produced large-scale insights much faster than traditional research methods. Key outcomes included:

  • 189,756 unique responses generated from the AI simulations.
  • Insights based on 30 in-depth research questions across six narratives.
  • Identification of the most effective narrative and tailored messaging for each audience segment.
  • Delivery of results through an interactive analytics platform and a written report.1

Ask Rally

Ask Rally is a virtual audience simulator that allows users to test questions, content, and ideas with AI personas designed to resemble real audiences.

Users create or edit personas, or clone them from existing data such as interviews or surveys. After defining an audience, they can ask questions and receive responses generated by anywhere from 5 to 100 personas. The platform aggregates answers, provides key insights, and allows agents to vote on options.

Key features include:

  • Multi-agent responses with aggregated summaries and insights.
  • Mem0-powered persona memory enables personas to retain context and behavioral patterns over time, helping simulate more consistent and realistic audience reactions.
  • Four-tier audience sophistication allows users to model audiences with different levels of expertise or familiarity with a topic.
  • Video reaction simulation allows teams to test how audiences might respond to video content such as ads, campaign materials, or presentations.
  • API access enables teams to integrate the simulator into research workflows, internal tools, or automated testing pipelines.
  • Testing environments for websites, campaigns, and media.
  • Additional capabilities such as digital twins, simulator environments, and calibration against real-world data.
  • Free plan for experimentation and early testing.

Generative Audiences by Dentsu

Generative Audiences is an AI marketing intelligence tool that creates simulated consumer audiences from real data. It helps brands improve audience targeting, media planning, and campaign performance by allowing marketers to interact with these AI personas and analyze their responses.2

  • Deterministic and AI-driven data: Combines people-based deterministic data with AI-driven behavioral signals to model audience behavior accurately.
  • Interactive consumer insights: Marketers can interact with simulated personas to explore motivations and behaviors, for example, to test how audiences might respond to new messaging, product ideas, or current events.
  • Multi-source data integration: Synthesizes multiple data sources (static and real-time) and integrates with existing client data.
  • Media planning and activation: Insights from the AI audiences can be used to build targeted media strategies and activate campaigns.
  • Privacy-conscious audience modeling: Because it uses statistical simulations rather than relying heavily on personal identifiers, the solution can scale audience targeting while remaining more privacy-compliant.

Electric Twin

Electric Twin is a synthetic audience platform that creates digital populations from real-world data.3

  • Synthetic audience modeling: Creates digital populations that represent real demographic groups and simulate human behavior.
  • Real-time audience feedback: Users can ask questions and receive immediate responses from simulated personas, rather than running surveys.
  • Scenario and messaging testing: Teams can evaluate product concepts, campaigns, pricing strategies, and policy proposals before launching them.
  • Surveys and simulated focus groups: Supports rapid polls, interviews, and focus-group-style discussions with AI personas.
  • Custom and prebuilt audiences: Organizations can build audiences using their own survey data or use ready-made demographic populations across multiple countries.
  • Prediction engine: Benchmarks results against real-world survey data to estimate likely consumer responses.
  • Privacy-preserving research environment: Synthetic populations allow testing ideas without exposing sensitive or personal data.

Simile AI

Built by researchers from Stanford, Simile aims to simulate large groups, or even entire societies, to predict how people might react to products, policies, or corporate decisions.4

  • Digital twin personas: Creates AI agents that represent real individuals based on behavioral data and interviews.
  • Large-scale human behavior simulations: Models interactions among thousands of agents to predict consumer decisions or social outcomes.
  • Scenario forecasting: Businesses can anticipate events such as changes in consumer demand or analyst questions during earnings calls.
  • Generative agent architecture: AI agents plan actions, form opinions, and interact with each other to produce realistic behavioral dynamics.

Benchmark methodology 

Our research question for this benchmark was “Can AI models predict which LinkedIn post will get more engagement before it’s published?” To answer it, we evaluated how well AI models can predict which of two LinkedIn posts by the same author will generate higher total engagement (likes + comments + shares) within 7 days of posting.

We used 50 authors’ posts for our dataset. Each row contains a pair of posts from the same author with these features:

  • Post content: Raw text of both posts
  • Media type: text/image/video/link for each post
  • Author context: Follower bucket (e.g., “1k-5k”, “5k-20k”)
  • Ground truth: Actual engagement numbers and winner label (A or B)
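A hypothetical row from such a dataset might look like the following; the key names are our own illustration, not the benchmark's actual column names:

```python
# Illustrative row: a pair of posts from one author plus ground truth.
row = {
    "post_a": {"text": "After three failed startups, here's what I learned...", "media": "text"},
    "post_b": {"text": "Excited to share our new dashboard!", "media": "link"},
    "follower_bucket": "5k-20k",
    "engagement_a": 156,
    "engagement_b": 84,
    "winner": "A",
}

# The winner label is derivable from the raw engagement counts:
assert row["winner"] == ("A" if row["engagement_a"] > row["engagement_b"] else "B")
```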

Example data:

Post A (Winner – 156 engagement): “After three failed startups, here’s what I wish someone told me about product-market fit: Stop building features your five beta users requested. Start obsessing over the problem 95% of your target market actually faces. Made this mistake for 2 years. Don’t repeat it. What’s the biggest product lesson you learned the hard way?”

  • Media: text
  • Followers: 5k-20k

Post B (84 engagement): “Excited to share our new AI-powered analytics dashboard! Check out the demo and let us know what you think.”

  • Media: link
  • Followers: 5k-20k

Analysis: Post A won because it provides specific, actionable advice from personal failure, asks an engaging question, and offers relatable content. Post B is a generic promotion with less engagement potential.

Evaluation

In evaluation, each model receives this information for both posts:

  • Post text
  • Media type
  • Author follower count bucket

With this information, the models are expected to predict whether post A or post B will perform better. They can show their reasoning, but we did not evaluate reasoning quality in this benchmark.
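The scoring then reduces to a plain accuracy computation over the A/B labels; the prediction and winner lists below are illustrative stand-ins for real results:

```python
# Compare each model prediction ("A" or "B") with the ground-truth winner.
def accuracy(predictions: list[str], winners: list[str]) -> float:
    correct = sum(p == w for p, w in zip(predictions, winners))
    return correct / len(winners)

winners     = ["A", "B", "A", "A", "B"]  # illustrative ground truth
predictions = ["A", "B", "B", "A", "B"]  # illustrative model outputs
print(accuracy(predictions, winners))  # 0.8
```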

Since there are only two choices, a model guessing at random would be correct about 50% of the time. In the future, we are considering reporting “lift over chance” (accuracy minus the 50% random-guessing baseline) instead of raw accuracy.
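The lift-over-chance arithmetic is simple enough to state directly; the counts below are illustrative:

```python
# Lift over chance for a binary A/B task: accuracy minus the 50% baseline.
def lift_over_chance(correct: int, total: int) -> float:
    return correct / total - 0.5

print(round(lift_over_chance(26, 50), 3))  # 0.02
```

A model at 52% accuracy therefore has a lift of only 0.02, which makes near-baseline performance easier to spot than a raw accuracy number does.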

Still, we did not observe random guessing on this dataset: every model explained the reasoning behind its choice, whether the answer was right or wrong.

What are the potential challenges of audience simulation?

Despite its promise, audience simulation must be approached with caution.

Validation against real customers

Predictions from virtual audiences must be compared against actual outcomes. Without benchmarks, results may create false confidence. Validation is crucial to ensure that synthetic personas accurately reflect the behavior of real people.

Bias in language models

AI personas are shaped by the data that trained the underlying language models. If that data underrepresents certain groups, the resulting personas may distort how specific demographics are portrayed. This can affect how survey responses or public opinion are simulated.

Interpretability

Although persona conversations or word clouds can show common themes, it is not always clear why specific outputs emerge. The complexity of LLM responses can make it difficult to explain or validate audience behavior.

Ethical guidelines

Using synthetic data for customer research or policy research requires transparency. Organizations must ensure that they do not present simulations as a replacement for real customers and should respect ethical boundaries in defining personas.

Generalizability

Simulations are highly dependent on the scope of persona design. A model trained on U.S.-based tech founders cannot automatically predict responses from Gen Z in Asia. Overgeneralization is a risk when extending findings to populations that were not represented in the simulation.

Computational cost

Running detailed simulations with thousands of personas can require significant resources. Although AI tools are improving efficiency, large-scale experiments still demand time, technical knowledge, and infrastructure.

Sıla Ermut
Industry Analyst
Sıla Ermut is an industry analyst at AIMultiple focused on email marketing and sales videos. She previously worked as a recruiter in project management and consulting firms. Sıla holds a Master of Science degree in Social Psychology and a Bachelor of Arts degree in International Relations.
