Imagine a world where you would never need to worry about not understanding what your customers need or want. A world where, at the press of a button, you could get answers to your most pressing questions without having to deal with the messiness and complexity of humanity. Wouldn’t that be something?
This is the world that silicon sampling and the concept of synthetic users try to hint at. And for us, it is to be approached with caution. This possible world requires multiple perspectives and deep consideration to properly process and doesn’t come without its dangers.
Silicon sampling refers to the use of generative AI tools like ChatGPT to simulate user responses. The AI is fed some basic prompts and information about a target audience, then generates synthetic text, survey responses, or other data meant to replicate real users. Synthetic users are the fictional people essentially dreamt up by the AI to serve as stand-ins for real-world customers, clients, or business targets.
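To make the mechanics concrete, here is a minimal sketch of how such a prompt might be put together. Everything in it is an assumption made for illustration: the `Persona` fields, the `build_prompt` helper and the wording of the prompt are hypothetical, and no real AI service is called. The point is how little information the "synthetic user" is actually built from.

```python
from dataclasses import dataclass


@dataclass
class Persona:
    """Hypothetical description of one target-audience member."""
    age: int
    occupation: str
    location: str


def build_prompt(persona: Persona, question: str) -> str:
    """Compose the text that would be sent to a language model.

    Assumption: a single plain-text prompt; real tools may structure this differently.
    """
    return (
        f"You are a {persona.age}-year-old {persona.occupation} "
        f"living in {persona.location}. "
        "Answer the following survey question in the first person.\n"
        f"Question: {question}"
    )


# Illustrative values only, not taken from any real study.
persona = Persona(age=34, occupation="nurse", location="Tampere")
prompt = build_prompt(persona, "What frustrates you about booking travel online?")
print(prompt)
```

Three attributes and one question are all the model gets here. Whatever else appears in the answer comes from the model's training data, not from an actual person.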
So the basic (utopian?) claim is that using large language model-based tools, such as ChatGPT, to simulate actual users saves you the pain, the time and the money of actual user research. But what are you losing by turning to silicon sampling? Is there something you actually stand to gain without significant risk?
Why should you be careful?
“GenAI not only perpetuates inequalities but amplifies them. Do we really want to contribute to this phenomenon?” – A Solitan
One shouldn’t go blindly into the night. Holding onto ethical principles, values and guidelines while choosing the tools and methods you work with is imperative for responsible research — and to understand all the nuances, you need a wide variety of expertise to determine what is possible.
Biases. Let’s be honest: at this point in time, the biases in algorithms and AI solutions should come as no surprise. They have been discussed through and through, and generative AI is no exception. It includes heavy biases that might not be immediately obvious to the user. It also hallucinates, meaning it produces pure nonsense that might look very convincing and legitimate. Using GenAI tools for customer research requires a lot from the user: to not take things at face value, to be critical, to check your facts. These are all things that we have to do when dealing with real people as well. Depending on our level of trust in technology, that trust can lead us to relax our standards when it comes to dealing with “data, not people”, as if that data would somehow be less fallible, less messy just because of the medium used.
There is also the risk of research data being poisoned by fake, machine-generated responses on crowdsourcing platforms: a glorious new age for troll factories wanting to influence the results (Hämäläinen, Tavast and Kunnari 2023).
Diversity. Using LLMs to substitute humans leaves a lot to be desired when it comes to diversity. By its very nature, generative AI is reductive and averaging. It fails to represent the spectrum of human diversity that inhabits our world, and it may even shape how we on the business side of things view people. Are they actual persons to be treated with dignity and respect, or something we just extract information from? Research ethics continues to be something that we can’t afford to ignore.
Emotion and empathy. If we go down the path of reducing people to ones and zeroes, what does it mean for our capability to learn from what they feel, from what they experience? From their joy or sadness? Large language models don’t feel. They don’t experience the world. Thus they cannot replicate or accurately represent humanity, nor understand the complex contexts that we spend our lives in. And by not exposing ourselves to the very messiness of people, we are growing more distant from them (us).
Some say this is already happening to our ability to have discussions in the physical world (Turkle 2015): when it’s so much easier and more convenient to have conversations online, why would we put in all the cognitive and emotional work and cultural learning of talking with someone face to face? This in turn erodes our capability for empathy. If this is true just by changing the medium of conversation, what would forgoing actual people altogether do to our ability to empathise? We shudder at the thought.
A lot of our understanding of each other takes place in the silent in-betweens, in the gestures, facial expressions, in the things left unsaid. This is something synthetic users cannot replicate.
Anthropomorphism involves ascribing human characteristics, emotions, or intentions to non-human entities. Even if a large language model generates “I think, Sebastian, therefore I am”, it’s crucial that we resist the temptation to perceive large language models as human in any guise. Instead, we should recognize them as mathematical and statistical manifestations of accessible written content: a representation of a representation of humanity and its rich tapestry of written culture. Think about how you converse with generative AI: are you writing simple prompts, or using a more conversational tone?
Putting googly eyes on a robot might make you less likely to mistreat it, but so far it doesn’t look like being extra nice to ChatGPT contributes to your survival odds in case of an uprising… Unless you believe in Roko’s basilisk.
Well, what can I do then?
Being careful of these solutions doesn’t mean that you should completely abandon all hope of making use of them. As long as you have a critical mindset, you can absolutely make use of generative AI when doing customer research.
Ideation and hypothesis formulation. You as an individual are, by default, biased. So are your colleagues. So are scientists and researchers. So are we all. Some might be better at confronting and dismantling those biases than others, but that doesn’t mean those biases don’t exist. And yet, we all formulate hypotheses about the world and about our customers. Asking a large language model to create some additional hypotheses might actually be an advantage and help us fight our own blind spots, as long as we test out those hypotheses in the environments and with the people who actually matter. By prompting for diversity and inclusion, and asking e.g. ChatGPT to consider an issue from those perspectives, you may gain insights your biases are blinding you to.
Testing interview structures and themes. Services like ChatGPT are quite good at looking at larger subject matters and helping you understand what kind of themes you could delve deeper into. They can also help you by suggesting interview structures and survey questions based on those themes, which you can amend and build upon.
Analysis assistant. Analysing interview data, especially large volumes of qualitative data such as free-text responses, can be extremely difficult or even impossible to do by hand. This is why many polls have been carefully designed to collect only quantitative responses, so that they can be analysed automatically. Generative AI provides effective tools for sentiment analysis, entity extraction, topic modelling, summarisation, and more, although traditional and specialised natural language processing methods, while demanding more of the user, might still be more reliable options. Common and respected analysis tools, like ATLAS.ti and MAXQDA, already offer GenAI-based assistance for qualitative data.
You can use GenAI to process materials and help you with analysing the results more comprehensively — keeping in mind that you shouldn’t delegate the whole analysis process, but use AI as a tool to help you. Not going through the data yourself and comparing your own findings to those provided by automated intelligent analysis can result in missing key insights and eventually even de-skilling yourself. Doing the work is how you learn, not only about the data but about your own ways of working.
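One way to make that comparison tangible is to code a sample of responses yourself, let the AI tool code the same sample, and then check how often the two agree. The sketch below is only an illustration under assumptions: the theme labels and the two lists are invented, and Cohen's kappa is our choice of agreement measure here, not something prescribed by any particular tool.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both coders must label the same, non-empty set of items.")
    n = len(labels_a)
    # Share of items where both coders chose the same theme.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Agreement expected by chance, given how often each coder uses each theme.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both coders used one identical label throughout.
    return (observed - expected) / (1 - expected)


# Invented example data: themes assigned to six free-text responses.
human_codes = ["price", "trust", "price", "usability", "trust", "price"]
ai_codes = ["price", "trust", "usability", "usability", "trust", "trust"]

kappa = cohen_kappa(human_codes, ai_codes)
# Responses where the codes differ are the ones worth re-reading yourself.
disagreements = [i for i, (h, a) in enumerate(zip(human_codes, ai_codes)) if h != a]

print(f"kappa = {kappa:.2f}")
print("re-read responses:", disagreements)
```

The number itself matters less than the list of disagreements: those are the responses where going back to the raw data is most likely to teach you something, about the material and about your own way of coding.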
Visualisation. Sometimes your thought process needs a little jolt. Not all of us are skilled at visualising the masses of information we are processing in our heads, and that is something that prompt-based AI image generators can help us with: creating images of scenes and situations, even abstract ones, is often a helpful addition to text-based analysis. Depending on the prompts, the results might sometimes be somewhat unexpected, but that isn’t always a bad thing. Just keep in mind that the biases discussed earlier appear in image generators just as they do in text generation: the images generated often amplify racial and gender stereotypes. Copyright issues around these services are also a morally grey area.
A brave new world?
Let’s look at this from an optimistic best-practices perspective. The responsible adoption of silicon sampling and synthetic users may well help us in our work of understanding people and behaviour. With proper governance, this technology can widen our perspectives, while still respecting human values. Companies have an obligation to augment, not replace, genuine human insights.
We maintain cautious optimism that one day, with advances in AI, synthetic users might capture more of the nuance that makes us human. They may assist where gathering direct human input is infeasible, for example for budgetary or geographical reasons. By ensuring diverse voices guide the technology’s development, we create hope for an inclusive future. With care, there is a possibility that silicon sampling can help some companies be more customer-centric, even if their customer research budget doesn’t rival that of the big brands or tech companies.
But we also know how easy it is to get overly excited about this. It is difficult not to humanise these models when they speak to us like they’re our online friends. It’s also natural to want to believe there are comprehensible solutions to things we find difficult to understand. For ages, we have found supernatural explanations for things beyond our comprehension: spirits, magic, and gods.
The honoured science fiction author Sir Arthur C. Clarke formulated three laws:
- When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
- The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
- Any sufficiently advanced technology is indistinguishable from magic.
Large language models are not magic, nor are they an easy solution to our every problem. Like magic, they can make us believe we can achieve something that would otherwise be inconceivable or out of reach. But if we peek behind the curtain, we find only rules and sequences masquerading as a man, not the mighty Wizard of Oz.
But with the right approach and due diligence, perhaps we can harness the capabilities of this particular technology, pushing the boundaries of what we currently deem improbable.
Do use generative AI to give you ideas, inspiration and interesting viewpoints that you might have otherwise missed. Formulate hypotheses, and try out how it can help you in coding and analysis. Experiment with how you can augment and enhance your work, while remembering to stay not only critical, but also curious.
Don’t replace your actual human insight with silicon sampling. We are gloriously messy and complex, and it’s our job to figure us out. That’s what makes us human.
At Solita, we’re a community that consists of a wide variety of experts from machine learning to social sciences, from design to data science. Even with this collective know-how and insight, we can’t be sure what will happen in the realm of artificial intelligence and what the consequences might be. But we’re trying to make sense of it, and we just might be able to help you as well.
Hämäläinen, P., Tavast, M. and Kunnari, A. (2023): Evaluating Large Language Models in Generating Synthetic HCI Research Data: a Case Study. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. doi: 10.1145/3544548.3580688
Turkle, S. (2015): Reclaiming Conversation: The Power of Talk in a Digital Age.