From people to personas: A new era of synthetic research data?

The research industry is undergoing a profound technological transformation, driven by the demand for greater speed, scalability, and cost-efficiency. Artificial intelligence (AI) is reshaping every stage of the research process, from project design to recruitment, data collection, and analysis.
One of the most talked-about trends in the research industry at the moment is the rise of synthetic respondents. These are AI-generated profiles created using large language models (LLMs), such as those that power ChatGPT or Microsoft Copilot, designed to emulate the characteristics and behaviours of real-life consumers. Acting as a proxy for human participants, synthetic respondents can be deployed across a variety of research methods, from large-scale surveys to in-depth interviews and virtual focus groups.
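To make the idea concrete, the sketch below shows one common pattern: a persona description is supplied to an LLM as a system prompt, and the model is then asked survey questions in character. It is a minimal illustration using the OpenAI Python SDK; the persona, question, model choice, and settings are invented for the example and do not represent any particular vendor's approach.

```python
# Minimal sketch of a "synthetic respondent": an LLM prompted with a persona
# and asked a survey question in character. All details are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical persona standing in for a target consumer segment.
persona = (
    "You are a 34-year-old parent of two living in Manchester, shopping on a "
    "tight weekly budget and loyal to supermarket own-brand products. "
    "Answer survey questions in character, in one or two sentences."
)

question = "How do rising grocery prices affect which brands you buy?"

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model choice
    messages=[
        {"role": "system", "content": persona},
        {"role": "user", "content": question},
    ],
    temperature=0.9,  # higher temperature for more varied, human-like answers
)

print(response.choices[0].message.content)
```

Repeating this kind of call across many generated persona profiles yields a synthetic sample whose answers can then be tabulated and analysed much like ordinary survey data.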
Synthetic respondents promise a range of benefits. They can dramatically reduce fieldwork costs by minimising the need to recruit and incentivise real participants. They can respond to questions instantly and at scale, accelerating the delivery of insights. And they can help to simulate traditionally hard-to-reach populations, broadening potential reach and surfacing a more diverse range of perspectives. With such advantages, it is unsurprising that some predict synthetic data could account for more than half of all data collection within the next three years (Qualtrics 2025 Global Market Research Trends Report).
Leading research firms are pioneering this trend. Kantar, for example, has run controlled side-by-side tests of LLM-generated synthetic samples against human panellists to build a clearer picture of the quality of the data artificial respondents produce. Similarly, Qualtrics has developed a proprietary AI model that draws on its extensive consumer database to generate responses it says closely mirror shifting consumer behaviours.
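As a rough illustration of what side-by-side testing can involve (not a description of any firm's actual methodology), the sketch below compares how a human panel and a synthetic sample answer the same closed-ended question, using a chi-square test of independence. All counts are invented for the example.

```python
# Illustrative comparison of human vs. synthetic answer distributions on one
# closed-ended survey question. The counts below are invented.
import numpy as np
from scipy.stats import chi2_contingency

answer_options = ["Very likely", "Somewhat likely", "Unlikely"]

# Hypothetical answer counts to "How likely are you to switch brands?"
human_counts = np.array([120, 260, 120])     # n = 500 human panellists
synthetic_counts = np.array([180, 240, 80])  # n = 500 synthetic respondents

# Chi-square test of independence on the 2 x 3 contingency table.
table = np.vstack([human_counts, synthetic_counts])
chi2, p_value, dof, expected = chi2_contingency(table)

print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The synthetic sample answers this question differently from the human benchmark.")
else:
    print("No significant difference detected between the two samples.")
```

In practice such comparisons span many questions and segments, and statistical similarity alone does not guarantee that synthetic answers capture the reasoning behind real consumer choices.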
But are synthetic respondents all they’re cracked up to be? While they offer efficiency, they also come with real limitations. AI-generated responses often lack the authenticity and nuance that characterise human behaviour. Humans are complex, sometimes irrational, and deeply influenced by cultural and emotional context, factors that algorithms struggle to replicate. Moreover, because LLMs are trained on historical data and the patterns within it, they risk perpetuating stereotypes and biases, producing insights that feel generic or disconnected from reality.
Ultimately, synthetic respondents can play a valuable role in supporting the research process, but they are no substitute for human insight. When used transparently and in the right context, synthetic data can enhance decision-making, offering speed and scalability. However, the human researcher remains indispensable. It is their ability to interpret subtle cues, challenge assumptions, and uncover the story behind the data that transforms findings into meaningful, actionable insights. AI can accelerate and augment, but it is human expertise that ensures research delivers impact that is grounded in real human truth.