Data quality is everything when it comes to reliable insights. But are the old ways of checking for it still effective against modern challenges like bots and survey farms? We're always looking for ways to build greater confidence in data, so we decided to take advantage of our own panel and platform to look for insights and guidance on how we can help move the industry forward, together. In this article, we'll dive into the world of in-survey quality checks, exploring the limits of traditional "red herrings." We'll share initial findings from our own large-scale research into a new generation of more varied and engaging in-survey quality checks. This research looks at what makes them effective, how respondents actually feel about them, and how culture impacts performance. Stick around for actionable tips to help you balance data integrity with a better respondent experience.
A shared challenge: advancing data quality together
Let's be clear: ensuring survey data quality isn't just an aytm concern; it's a challenge that impacts the entire market research industry. From brands making critical decisions to agencies crafting strategies, and ultimately to the respondents sharing their valuable time and opinions, everyone benefits when insights are built on a foundation of trust and reliability. We believe that tackling this challenge requires transparency, innovation, and collaboration. At aytm, we're committed to not only refining our own methods but also sharing what we learn. The research detailed here is part of that commitment, offered to help move the industry forward. We're all in this together, and finding better solutions benefits the entire insights community.
Starting with trust: the quest for clean data
Gathering survey data you can truly trust is the bedrock of great market research. For a long time, we've leaned on "red herrings" or "attention checks." These are the little questions that can be slipped in to see if respondents are paying attention. But let's face it, the world has changed. We're not just looking out for wandering eyes anymore; we're navigating sophisticated bots, survey farms, and the real risk of frustrating our valuable respondents with checks that feel more like hurdles.
This evolving landscape is what drives our curiosity at aytm. We're always exploring ways to do things better. So, we started wondering: Can we design quality checks that are sharper at spotting unreliable data (including AI-generated responses) and create a more positive, engaging experience for the thoughtful humans taking our surveys? This question has sparked an exciting new area of exploration for us: developing, and conducting initial research on, a new generation of more engaging and varied in-survey quality checks. We're energized by what we're learning from this ongoing work and eager to share some of our early findings with you in this article. But first, let's dive into why this evolution in quality checks is so crucial right now.
Why old-school attention checks aren't always enough
The classic approach has its place, but it often misses the mark in today's complex environment. Simple attention checks might catch someone randomly clicking, but they can struggle against automated scripts or organized fraud. Plus, overly simplistic or trick questions can sometimes penalize legitimate, attentive respondents who simply interpret a poorly phrased question differently. This can lead to frustration and potentially skew your data by removing good participants. We knew there had to be a better way, and that's exactly what our research into newer, more nuanced quality checks aims to explore.
Asking better questions: our experiment with new quality check concepts
On our quest for a more nuanced and effective approach to data quality, we didn't want to just make guesses. So in typical aytm fashion, we dove into a large-scale research project. This project involved designing and testing 15 unique developmental approaches to these new quality checks. These went beyond the standard "select orange" prompt to include things like heatmap exercises on images, visual "odd one out" challenges, common knowledge tests, video comprehension checks, and more imaginative scenarios.
To see how these played out globally, we tested them with thousands of respondents across four diverse countries: the US, Brazil, Germany, and Japan. We used a blended sample, including participants from our own trusted PaidViewpoint panel. Each person saw 3 to 4 of these different experimental quality questions woven into a typical "Health and Wellness" survey. Crucially, we also asked the reliable respondents for their thoughts on the experimental checks they encountered.
Finding the balance: what makes a quality check work?
Okay, so how do we begin to understand if one of these new developmental quality checks is effective? We looked at it from two primary angles:
- Catching the bad: Did the experimental check successfully flag responses we identified as unreliable using standard methods (like checking for speeding, straight-lining, or other suspicious patterns)?
- Keeping the good: Did it avoid unfairly flagging attentive, reliable respondents?
We calculated a "Trade Off Ratio" to measure this balance. Simply put, we want quality checks that flag a high percentage of unreliable responses while flagging a very low percentage of reliable ones. A ratio of 3.0 or more is generally considered good in these types of explorations, with 4.0 or higher being ideal. Our classic attention check served as the benchmark.
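To make the arithmetic concrete, here's a minimal Python sketch of one plausible way to compute such a ratio, assuming it's simply the flag rate among unreliable respondents divided by the flag rate among reliable ones. The field names and exact definition are illustrative, not aytm's actual implementation.

```python
# Minimal sketch: a "Trade Off Ratio" computed as the share of unreliable
# responses a check flags, divided by the share of reliable responses it flags.
# Field names and the definition itself are assumptions for illustration.

def trade_off_ratio(responses: list[dict]) -> float:
    """Each response dict has 'reliable' (bool, from standard behavioral checks
    like speeding or straight-lining) and 'flagged' (bool, failed the check).
    Assumes both the reliable and unreliable groups are non-empty."""
    unreliable = [r for r in responses if not r["reliable"]]
    reliable = [r for r in responses if r["reliable"]]
    pct_bad_caught = sum(r["flagged"] for r in unreliable) / len(unreliable)
    pct_good_flagged = sum(r["flagged"] for r in reliable) / len(reliable)
    if pct_good_flagged == 0:
        return float("inf")  # the check never flags a reliable respondent
    return pct_bad_caught / pct_good_flagged

# Example: a check that catches 60% of unreliable responses while flagging only
# 15% of reliable ones yields 0.60 / 0.15 = 4.0 -- the "ideal" range noted above.
```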

Then we assessed respondents’ perceptions of the new quality checks they saw. We asked them how likable, easy, quick to answer, and fun the question was. We also asked whether such a question would impact their future survey-taking in any way. For a holistic view of respondent perceptions, we evaluated each metric along with a “Preference Average” for each experimental question type.
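As a simple illustration of rolling those perception metrics into a single score, here's a small sketch that averages the four ratings mentioned above into a "Preference Average" per question type. The 0-5 scale, the simple mean, and the example numbers are assumptions for illustration; the actual metric may be defined differently.

```python
# Sketch: combine perception ratings into one "Preference Average" per question type.
# Scale and example values are hypothetical.
from statistics import mean

def preference_average(ratings: dict[str, float]) -> float:
    """ratings: average scores for 'likable', 'easy', 'quick', and 'fun'."""
    return mean(ratings[k] for k in ("likable", "easy", "quick", "fun"))

heatmap_captcha = {"likable": 4.2, "easy": 4.5, "quick": 4.6, "fun": 4.1}
print(round(preference_average(heatmap_captcha), 2))  # 4.35
```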

What we learned from our initial research
Our exploration into these new developmental quality check concepts yielded some fascinating initial insights. Here’s a breakdown of what stood out:
Effectiveness is varied
Some of the newer, more complex quality check concepts (like identifying overlapping text categories or ranking items by cost) did catch a lot of bad data in the US, but they also tripped up too many good respondents, resulting in lower Trade Off Ratios. Conversely, other developmental concepts involving recognizing human emotions, writing image captions, estimating typical costs, checking answer consistency across the survey, and understanding short videos proved highly effective in our US tests at singling out unreliable data without penalizing the reliable crowd.
Culture is key
Our findings underscore that there's no universal "best" approach to these advanced quality checks. A question about a fictional scenario, for example, didn't test well in the US but performed strongly in Brazil, Germany, and Japan. This really highlights the need to think locally when designing global studies and choosing which types of quality checks to use.
Effective doesn't always equal enjoyable
Interestingly, some of the most effective new quality check concepts (like the "Human Condition" or "Typical Cost Test" in the US) weren't rated as the most liked or fun by respondents. The favorites often involved more visual interaction or felt more like a quick puzzle, such as the Heatmap Captcha, Cost Pair Test, and Visual Odd One Out concepts we explored.
Putting quality insights into practice: 4 tips for your next survey
So, how can you use these learnings in your own research design, regardless of the specific tools you use? Here are a few thoughts:
- Layer your defenses: Don't rely on a single type of check. A robust quality strategy uses multiple checkpoints. Combine back-end behavioral monitoring (like checking survey duration and answer patterns) with a couple of well-chosen, varied in-survey quality questions. Think beyond the basic "select orange." Consider checks that tap into common sense, consistency, or visual interpretation, tailored to be engaging but effective.
- Less can be more: Bombarding respondents with checks leads to fatigue and potential frustration. We generally recommend using a maximum of three different types of quality questions within a single survey. Consider setting your removal criteria based on failing two or more checks, rather than just one, to account for potential misunderstandings or minor errors (see the sketch after this list for one way to encode that rule).
- Think globally, act locally: The effectiveness and perception of different quality check approaches vary significantly by culture. Tailor your quality check selection and design based on the cultural context of your respondents. Our research provides clues that certain types of checks perform very differently across regions (e.g., fictional scenarios, common knowledge), suggesting a one-size-fits-all question may not be the best approach globally.
- Use visuals wisely: Engaging images or interactive elements can sometimes make quality checks less tedious and more interesting for respondents (like the principles behind heatmap or visual odd-one-out exercises we tested). However, ensure any visuals you incorporate are clear, load quickly, are universally understood, relevant, and don't inadvertently introduce bias, confusion, or negative emotional responses.
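To show how the layering and "two or more failures" ideas above might fit together, here's a minimal Python sketch. The specific checks, field names, and the 180-second duration threshold are hypothetical illustrations, not a prescribed standard.

```python
# Sketch: layered quality checks with removal only on multiple failures.
# Thresholds, field names, and the individual checks are assumptions.

MIN_DURATION_SECONDS = 180  # assumed minimum plausible completion time

def failed_checks(respondent: dict) -> int:
    checks = [
        respondent["duration_seconds"] < MIN_DURATION_SECONDS,  # back-end: speeding
        respondent["straight_lined"],                           # back-end: answer patterns
        not respondent["passed_in_survey_check_1"],             # in-survey quality question
        not respondent["passed_in_survey_check_2"],             # a second, varied check
    ]
    return sum(checks)

def should_remove(respondent: dict) -> bool:
    # Remove only on two or more failures, so one misread question doesn't
    # disqualify an otherwise attentive respondent.
    return failed_checks(respondent) >= 2
```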
Moving forward together: the future of data quality
Ensuring data quality is an ongoing journey, not a destination. As threats evolve, so must our methods. At aytm, we're committed to continuously exploring innovative solutions, and our initiative to develop more advanced, respondent-friendly quality checks is one part of that commitment; we plan to keep developing and refining these concepts. Our goal is to help everyone capture insights they can depend on, while always striving to create a better, more respectful experience for the people sharing their perspectives with you.
Finding that sweet spot between rigorous quality control and genuine respondent engagement is crucial, and we're excited to keep exploring it right alongside you. To that end, we’ve compiled a research report that goes into more detail about our initial findings from this ongoing work. We’d love to share it with you!