“There are three kinds of lies: lies, damn lies, and statistics.”
The source of the axiom is widely debated, but the sentiment is just as widely accepted. Survey research relies heavily on statistics, yet our team, both robot and human, can agree that a graph doesn’t always tell the whole story.
Polls and surveys rely on collecting accurate data from real, live human beings who understand our questions and respond with their real-life experience or opinions. But as we know, to err is human, and apparently, to err is also bot. Bot farms and professional survey takers have become increasingly prevalent in the internet age, giving those behind the screen a cheat code to instant rewards; they have infiltrated the panel industry so thoroughly that some estimates put the volume of fraudulent responses at 25% or more. While rewards and incentives are great motivators, participants often develop assumptions about survey objectives and answer with what they think companies want to hear, producing dishonest responses. Of course, the most obvious struggle in data collection is good, old-fashioned misunderstanding. Respondents are bound to click the wrong box, misinterpret a question, or lose focus partway through the survey. So how can we anticipate these user errors and increase our chances of collecting representative data?
We work with Lucid Marketplace, where experts behind the scenes monitor suspicious activity within their panel referrers. But we don’t stop there. We’ve implemented CleanID, a software solution from Opinion Route that works to eliminate irrelevant IP addresses, IP addresses from chronic survey-takers, and inconsistent data. Suspicious responses are automatically flagged, and each completed survey receives a fraud score from 1 to 100; we generally accept responses scoring between 1 and 25. While the program is not without flaws, the scoring system allows team members to override CleanID and helps us determine what factors might cause a valid respondent to be flagged. Finally, within our own system, post-processing automatically looks for and deletes “speeders” (those who complete surveys unrealistically fast) and “straight-liners” (those who hold down the 1 key or select the top answer on every question). Beyond this, we build bot traps into every survey: quality-control questions, scattered throughout the questionnaire, that are specifically designed to be impossible for a bot to answer correctly.
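To make the post-processing step concrete, here is a minimal sketch of how speeders, straight-liners, and high fraud scores might be caught in a results file. The column names, thresholds, and DataFrame layout are assumptions for illustration only, not CleanID’s actual output or our production rules, and the sketch flags records for review rather than deleting them outright so a human can make the final call.

```python
# Illustrative sketch only: field names and thresholds are assumptions,
# not CleanID's or Lucid's actual API.
import pandas as pd

FRAUD_SCORE_CUTOFF = 25          # accept scores of 1-25, flag the rest for review
MIN_PLAUSIBLE_SECONDS = 180      # "speeder" threshold; tune per questionnaire length

def flag_suspect_completes(responses: pd.DataFrame, rating_cols: list[str]) -> pd.DataFrame:
    """Mark speeders, straight-liners, and high fraud scores for human review."""
    out = responses.copy()

    # Speeders: finished implausibly fast for the length of the survey.
    out["is_speeder"] = out["duration_seconds"] < MIN_PLAUSIBLE_SECONDS

    # Straight-liners: gave the identical answer to every rating-scale question.
    out["is_straightliner"] = out[rating_cols].nunique(axis=1) == 1

    # CleanID-style fraud score: anything above the cutoff gets a second look.
    out["fraud_flag"] = out["fraud_score"] > FRAUD_SCORE_CUTOFF

    out["needs_review"] = out[["is_speeder", "is_straightliner", "fraud_flag"]].any(axis=1)
    return out
```

In practice the thresholds would be tuned to the length and difficulty of each questionnaire rather than fixed globally.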
From the therapist’s lips to the market researcher’s ears, communication really is key. Regular survey-takers have learned that a lack of familiarity with a product often leads to automatic termination, so they tend to say they use a product or know it well, even if they have only minimal awareness of the brand. This “yes-bias” can be hard or impossible to monitor automatically, but it can be countered by informing participants when they have completed all screening questions. A brief congratulations, telling the respondent that they have met the criteria and that there are no right or wrong answers, may deter any tendency toward false affirmations. When surveying about rare brands or products, we may verify the response by looping back and asking, “Are you sure?” Another helpful tool is adding a red herring in the form of a non-existent brand or option to serve as further quality control.
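As a rough illustration of the red-herring idea, the check itself can be very small. The fake brand name “Brandex” and the set-based answer format below are hypothetical, chosen only to show the logic.

```python
# Minimal sketch, assuming a fictitious brand has been inserted into the
# brand-awareness list; the name "Brandex" is hypothetical.
RED_HERRING_BRANDS = {"Brandex"}

def fails_red_herring_check(selected_brands: set[str]) -> bool:
    """A respondent who claims to use a brand that doesn't exist is likely yes-biased or a bot."""
    return bool(RED_HERRING_BRANDS & selected_brands)

# Example: this respondent claimed familiarity with the fake brand and would be flagged.
print(fails_red_herring_check({"Acme Cola", "Brandex"}))  # True
```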
Invalid responses are not always devious or intentional. Browser settings, short attention spans, and brand confusion can all contribute to less trustworthy data. For surveys using video, we add video checks to our screening process, asking follow-up questions that depend on visual and audio cues. This lets respondents confirm that their audio is working and that their browser settings allow video playback. When battling for attention, randomization and random flips are a researcher’s best friend. While options like “don’t know” or “other” remain in a fixed position, the changing order lets participants view lists and scales with fresh eyes. Brand confusion can also play a major role in data collection. If a brand name is similar to another product, it may be helpful to specify what you are not referring to. This can be hard to anticipate, but when caught, can make a world of difference.
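Anchored randomization of this kind is straightforward to express in code. The sketch below uses illustrative option labels and a simple shuffle; real survey platforms typically handle this natively, so treat it as a description of the behavior rather than an implementation of any particular tool.

```python
# A minimal sketch of anchored randomization; option labels are illustrative.
import random

ANCHORED = {"Other", "Don't know"}   # options that stay fixed at the bottom of the list

def randomized_options(options: list[str]) -> list[str]:
    """Shuffle answer options while keeping anchored choices in their fixed position."""
    shuffled = [o for o in options if o not in ANCHORED]
    random.shuffle(shuffled)
    return shuffled + [o for o in options if o in ANCHORED]

# Each respondent sees the brand list in a fresh order, but "Other" and
# "Don't know" always appear last.
print(randomized_options(["Brand A", "Brand B", "Brand C", "Other", "Don't know"]))
```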
Data checks and quality control are vital tools in the researcher’s toolbox. Used together, these methods can meaningfully improve your data quality. Automation is there to help, but human eyes and gut instincts often play an important final role in data collection. With a little humanity, our robot friends can go much further in assisting us with high-quality, accurate survey research.