Gallup's Approach to Opt-In Sampling

WASHINGTON, D.C. -- Opt-in panel sampling has become a popular and relatively inexpensive method for collecting survey data, but it's important to understand its benefits and challenges.

Opt-in panels (also known as nonprobability panels) typically have access to hundreds of thousands (if not millions) of participants who can take online surveys. Because of their size and online access, research can usually be fielded quickly and at a lower cost than for more traditional methods, such as telephone, mail or face-to-face data collection. These panels can also be used to collect large sample sizes or to reach low-incidence populations.

However, opt-in panels do not use random selection to recruit panelists; therefore, the probability of selection into the panel -- a key statistic in survey sampling and weighting -- is unknown. Opt-in panels use a variety of methods to recruit respondents, such as consumer lists (for example, a list of airline loyalty rewards members), professional membership lists or online advertisements.

Most opt-in panels also allow potential respondents to join the panel without an invitation, meaning participants can seek out panels and sign up to join them. Some panels also use river or intercept sampling, whereby people who are online are directed to a survey but are not recruited into a panel.¹

Opt-in recruiting and sampling are different from probability-based methods, which include address-based sampling, random-digit-dialing, random-route procedures and probability-based panels. In probability-based sampling, a sample frame is constructed for the population, meaning researchers create or access a list of everyone in the population, such as all household addresses. Information about the percentage of people in the population covered by the frame is known, and every unit in the frame has a known probability of selection.

Once the sample frame is constructed, potential respondents are randomly selected for the survey or are invited to join a probability-based panel. Therefore, unlike nonprobability samples, all members of the target population have a chance of being randomly selected and given the opportunity to participate. Despite some concerns about declining response rates from probability-based samples, research has consistently found that probability-based samples produce more accurate estimates than opt-in/nonprobability samples.²,³
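To make the role of a known selection probability concrete, here is a minimal Python sketch. It assumes a hypothetical frame of 1,000 households and a simple random sample; the function names and data are illustrative and are not taken from any Gallup procedure. Because each selected unit's probability of selection is known, each response can be weighted by the inverse of that probability (the Horvitz-Thompson estimator). With an opt-in sample, the `prob` value below is exactly what is missing.

```python
import random

def draw_sample(frame, n, seed=0):
    """Simple random sample without replacement: every unit has probability n / N."""
    rng = random.Random(seed)
    prob = n / len(frame)
    return [(value, prob) for value in rng.sample(frame, n)]

def horvitz_thompson_total(sample):
    """Estimate a population total by weighting each value by 1 / selection probability."""
    return sum(value / prob for value, prob in sample)

# Hypothetical frame: 1,000 households, 300 of which have some attribute (coded 1).
frame = [1] * 300 + [0] * 700
sample = draw_sample(frame, n=100)

estimated_total = horvitz_thompson_total(sample)
estimated_share = estimated_total / len(frame)

# The design weights (1 / prob) sum to the frame size, which is what lets
# the sample "stand in" for the whole population.
sum_of_weights = sum(1 / prob for _, prob in sample)
```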

Benefits of Opt-In Sampling

Opt-in online panels have existed in the United States for at least three decades,⁴ but they have continued to grow in popularity because of several potential benefits:

Cost-effective: Opt-in data collection is generally less expensive than probability-based methods such as face-to-face, telephone or probability-based online panels.
Timely: Opt-in panels can be fielded relatively quickly, making them attractive for projects with tight deadlines or those that need a quick pulse survey.
Large sample sizes: They have the capacity to produce large sample sizes.
Access to specific populations: Opt-in panels can be useful for reaching specific low-prevalence populations that might be difficult -- if not impossible -- to survey effectively using other methods. The use of an opt-in panel sample may mean that groups historically excluded from research can be measured and heard.
Online surveying: Some surveys are best suited to online, self-administered data collection, and in many countries, opt-in panels are the only viable option for conducting an online survey. Many countries do not have probability-based web panels, and recruiting respondents using probability-based methods and then directing them to a web survey is costly and time consuming.

Concerns About Opt-In Sampling

Recently, however, there have been renewed and growing concerns about the quality of opt-in sampling. Some studies using opt-in sampling have come under scrutiny for implausible or inaccurate results.⁵ Other organizations, such as Pew,⁶ have raised concerns about data quality and fraudulent responses when opt-in panels are used, though the analysis behind those findings did not apply methods to remove poor-quality responses.

Despite the many advantages of opt-in panels, these concerns cannot be dismissed. Potential error stems from many aspects of opt-in design:

Sampling error: Opt-in web panels do not have known selection probabilities and violate the statistical assumptions necessary to calculate traditional standard errors and confidence intervals. For projects that require modeling or complex analysis, alternative methods (such as Bayesian models) may be required. The American Association for Public Opinion Research (AAPOR) has outlined guidelines and calculation methods for reporting a margin of error or credibility intervals when an opt-in sample is used⁷ but cautions potential users about its limited value.

Coverage error and nonresponse error: Certain segments of the population are completely excluded from opt-in panels, and these exclusions are challenging to define or quantify and can vary by panel and how a given panel is recruited. Additionally, individuals without access to the internet are excluded. Topics or populations that would be expected to be highly correlated with internet access (for example, studies of low-income elderly populations) should not be conducted using online opt-in samples. While opt-in panels boast large numbers of members, few panel members are active participants, and these panels may have “professional respondents” who are highly motivated by incentives they receive for taking many surveys over a relatively short period of time.

Opt-in panels typically claim “representation” by producing samples that closely represent demographic targets (using quota sampling and/or weighting adjustments). The underlying assumption is that surveyed respondents represent nonsurveyed respondents who have similar observed characteristics like age, gender or race. However, individuals who are motivated to complete numerous surveys each month for nominal rewards may have different attitudes and beliefs from those who are not opt-in respondents and who share the same demographic characteristics.

It is important to mention that statistical adjustments, such as weighting, can help reduce the potential bias from coverage and nonresponse error.

Measurement error: Opt-in respondents may be motivated to complete a survey as quickly as possible to collect rewards. Because of this, these panels are more susceptible to respondents “speeding” through the questionnaire or “straight-lining” answers (giving the same answers to all questions without reading the question or answer choices). Additionally, respondents may be motivated to lie in their answers to screening or qualifying questions so that they can complete as many surveys as possible. “Bots” are also a growing concern, especially with more sophisticated and mainstream AI tools. Bots are becoming more prevalent and harder to detect with basic quality measures. Fraudulent and poor-quality responses are observed more often with opt-in panels than with probability-based samples, a difference that will be discussed in a future blog.
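As an illustration of the sampling-error point above, the following Python sketch shows the textbook margin-of-error calculation with Kish's approximate effective sample size for weighted data. It is not the AAPOR calculation referenced in the text, and the responses and weights are made-up values. The formula assumes random selection, so for an opt-in sample the result is at best a rough, model-dependent indication of precision, which is why its value is limited.

```python
import math

def kish_effective_n(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

def weighted_proportion(values, weights):
    """Weighted share of respondents coded 1."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def approx_margin_of_error(p, n_eff, z=1.96):
    """Half-width of an approximate 95% interval; assumes random selection."""
    return z * math.sqrt(p * (1 - p) / n_eff)

# Hypothetical yes/no responses (1 = yes) and their survey weights.
values = [1, 0, 1, 1, 0, 1, 0, 0]
weights = [1.0, 1.0, 2.0, 0.5, 1.5, 1.0, 0.5, 0.5]

n_eff = kish_effective_n(weights)          # smaller than 8 because weights vary
p = weighted_proportion(values, weights)
moe = approx_margin_of_error(p, n_eff)
```

Unequal weights shrink the effective sample size, so heavier weighting adjustments widen the interval even when the nominal sample is large.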

It is especially important to clean opt-in data to remove poor-quality respondents and to review the data for face-validity. Many recent examples of poor-quality opt-in studies were of poor quality partly because researchers treated these studies the same as probability-based panels, and did not clean out clearly fraudulent or poor-quality responses.
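A minimal Python sketch of two of the basic checks described above, speeding and straight-lining, is shown here. The record layout (`seconds`, `grid`) and the cutoff of 30% of the median completion time are assumptions chosen for illustration, not Gallup's cleaning rules; real projects would tune thresholds and add attention and consistency checks.

```python
from statistics import median

def flag_poor_quality(respondents, speed_fraction=0.3):
    """Return {respondent_id: [flags]} for speeders and straight-liners.

    speed_fraction is an illustrative cutoff: anyone finishing in less than
    this fraction of the median completion time is flagged as a speeder.
    """
    median_seconds = median(r["seconds"] for r in respondents)
    flagged = {}
    for r in respondents:
        flags = []
        if r["seconds"] < speed_fraction * median_seconds:
            flags.append("speeder")
        # Straight-lining: the same answer to every item in a grid question.
        if len(r["grid"]) > 1 and len(set(r["grid"])) == 1:
            flags.append("straight_liner")
        if flags:
            flagged[r["id"]] = flags
    return flagged

# Hypothetical completes: completion time in seconds and answers to a 5-item grid.
respondents = [
    {"id": "a", "seconds": 600, "grid": [4, 2, 5, 3, 4]},
    {"id": "b", "seconds": 90,  "grid": [3, 3, 3, 3, 3]},
    {"id": "c", "seconds": 540, "grid": [5, 5, 5, 5, 5]},
    {"id": "d", "seconds": 660, "grid": [2, 3, 2, 4, 1]},
]
flagged = flag_poor_quality(respondents)
```

Flags like these are best treated as inputs to a review step rather than automatic removals, since a uniform grid answer can occasionally be a sincere response.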

Gallup's “Fit for Purpose” Approach

At Gallup, we carefully evaluate every study to determine the best method for collecting data. While there are challenges with opt-in sampling, sometimes it is the best “fit for purpose,” given the research objectives. Gallup has been using online opt-in samples for research, including for some public release studies, for more than a decade, and we regularly conduct research experiments to develop methodologies, analytic approaches and implementation strategies that can improve the quality of opt-in data. We bring this expertise to every study that we field.

A Summary of Gallup’s Opt-in Panel Research Findings

Through our extensive research with opt-in samples and different panel providers around the world, we have learned that opt-in online data collection is uniquely different from probability-based data collection and must be treated as such. Our research has experimentally tested different methodological strategies for improving data quality and has found that some are more effective than others at mitigating the errors associated with opt-in samples. Over the coming months, Gallup will be releasing a series of methodology blogs that will dive deeper into our opt-in research and share what we have learned and what it may mean for your research. At a high level, our research has found that:

Nonprobability respondents do not behave like probability-based respondents. Nonprobability respondents are more likely to give positive responses to questions with Likert scales, say “Yes” to or indicate agreement with binary questions, and are much more likely to be flagged for quality issues, such as speeding, straight-lining and inconsistent or bogus responses. Question construction and data-quality procedures must take these differences into account.
Studies need to begin with a high-quality survey instrument. With any research study, writing a quality, unbiased survey that is easy for respondents to complete is crucial to reducing measurement error. It is also important to understand that not all questions or topics are a good fit for opt-in research, because of how the panels are recruited and how respondents approach the survey task. Question design principles are just as important, if not more so, with opt-in respondents. Question construction can help reduce known biases, help detect poor-quality or fraudulent responding, and minimize respondent errors.
Opt-in provider quality procedures may not be enough. While most opt-in sample providers have some quality procedures to improve their samples, a higher bar is needed for rigorous social and economic research. Additional procedures are needed to encourage quality responding, set appropriate quotas, closely manage fieldwork, vet the data and drop potentially bogus responses, and develop innovative approaches to data weighting.
Data quality can vary significantly by opt-in panel provider and by country. Each opt-in provider has its own unique methodology for recruiting and retaining panel respondents. Panel infrastructures also differ from country to country, and in some countries, the capabilities are limited or exclude large segments of the population. Many providers disclose little about their methods or keep their procedures closely held as proprietary. This can make it challenging to determine which panel providers will be able to meet the needs of a given study. Gallup has extensively tested providers in the U.S. and around the world, and we have identified preferred providers and countries where quality opt-in data collection may or may not be feasible. Results from some of this testing will be shared in a future blog.
Fieldwork must be carefully designed and closely monitored. Opt-in studies require unique sample management and data collection procedures. Demographic quotas are commonly used to ensure that the final composition of the sample closely approximates the demographic characteristics of the population. However, how and when those quotas are filled during data collection is important and can affect data quality. Decisions should also be made about the type of sample that can be directed to the survey -- for example, requiring respondents to have a certain panel tenure or quality score or discouraging the use of river sampling.

Given these issues, it is important to monitor the progression of quotas during a survey's field period to be sure the implementation of quotas does not introduce bias. It is common for studies to have all quotas open early in the field period. As quotas close, more people will begin to screen out. This leaves the hardest-to-reach groups open late in the field period and can lead to greater likelihood of fraudulent responses. Gallup takes alternative approaches to quota management.
Data cleaning procedures should be implemented. The complexity of the procedures will vary by project, but at a minimum, the dataset should be monitored during and after data collection to remove speeders and respondents who fail attention and quality checks. More extensive cleaning should be considered on a case-by-case basis and may include techniques such as logic or consistency checks (for example, cleaning out respondents who give an implausible combination of responses). The data should also be reviewed for reasonableness and face validity. Poor-quality responses are not an indication that the entire study should be thrown out, but they are an indication that the poor-quality responders should be removed.
Consider combining probability and nonprobability data collection. There may be times when probability-based data collection cannot reach certain groups of interest, or larger sample sizes are needed than what is possible. Supplementing with an opt-in panel sample can expand the study, and research has found that combining samples can help overcome some of the weaknesses of opt-in samples.⁸

The probability-based sample can serve as the gold-standard sample and is used to calibrate the nonprobability sample. The nonprobability sample can be used to increase sample sizes or oversample hard-to-reach populations that may be difficult to achieve with a probability-based sample alone. This is an approach Gallup has used on many studies, including Gallup’s Center on Black Voices tracking survey, which seeks to understand the attitudes and experiences of groups that often are not included in research in a way that allows their views to be reported.

Combining can be a good option but does introduce potential complexity into the weighting. Gallup has also observed measurement differences between opt-in and probability-based samples, which may need to be taken into account when combining sample sources.
The specific weighting approach is not as important as the variables used. Much of the research on opt-in panels to date has focused on statistical adjustments to improve the representativeness of the resulting samples. Many organizations have focused on developing proprietary adjustment methods. Our research has found that the specific calibration method is not as important as the inputs used in the weighting procedures and whether they are correlated with likelihood to respond to a panel survey and the survey variables of interest. It is also important to understand that weighting cannot adjust away quality issues such as fraudulent responses.
As with any other methodology or sample source, the use of opt-in samples should be clearly disclosed in reporting. Some organizations may describe opt-in samples in their methods statements as an “online, representative sample.” Gallup does not believe this meets the level of transparency or descriptiveness necessary for readers to properly interpret the survey methods. If data consumers are evaluating potential providers or are reading the results of a study and find a lack of transparency about the sampling methodology, fieldwork and data processing (such as weighting or removal of fraudulent cases), it should raise red flags.
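To illustrate the weighting finding above, here is a minimal Python sketch of raking (iterative proportional fitting), one common calibration method. It is offered as an example only and is not a description of Gallup's weighting procedure; the respondents, variable names and target shares are hypothetical. The sketch assumes every target category contains at least one respondent. Consistent with the finding that inputs matter more than the method, the consequential choice is which variables appear in `targets`, since any method can only balance the sample on the variables it is given.

```python
def rake(respondents, targets, iterations=50):
    """Adjust weights so weighted shares match each target margin in turn."""
    weights = [1.0] * len(respondents)
    for _ in range(iterations):
        for variable, shares in targets.items():
            total = sum(weights)
            # Current weighted total in each category of this variable.
            current = {category: 0.0 for category in shares}
            for r, w in zip(respondents, weights):
                current[r[variable]] += w
            # Rescale so each category's weighted share equals its target.
            weights = [
                w * shares[r[variable]] * total / current[r[variable]]
                for r, w in zip(respondents, weights)
            ]
    return weights

def weighted_share(respondents, weights, variable, category):
    """Weighted share of respondents in one category of one variable."""
    in_category = sum(w for r, w in zip(respondents, weights) if r[variable] == category)
    return in_category / sum(weights)

# Hypothetical completes and hypothetical population shares.
respondents = [
    {"age": "18-34", "educ": "college"},
    {"age": "18-34", "educ": "college"},
    {"age": "18-34", "educ": "no_college"},
    {"age": "35+", "educ": "college"},
    {"age": "35+", "educ": "no_college"},
    {"age": "35+", "educ": "no_college"},
]
targets = {
    "age": {"18-34": 0.30, "35+": 0.70},
    "educ": {"college": 0.35, "no_college": 0.65},
}
weights = rake(respondents, targets)
young_share = weighted_share(respondents, weights, "age", "18-34")
college_share = weighted_share(respondents, weights, "educ", "college")
```

Note that raking leaves fraudulent or careless cases in the data with a weight attached, so it is no substitute for cleaning.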

Bottom Line

For more than a decade, Gallup has worked extensively with opt-in panel providers and conducted innovative methodological research to ensure we have the facts needed to properly incorporate this methodology into our work. Our research, as well as that conducted by others in the industry, has shown that opt-in panels have unique challenges that can increase the potential for error. However, through this research, we have also found that there are times when opt-in sample is the best solution, given the research objectives, and we have developed methods for minimizing bias when possible.

Finally, while this article reflects Gallup’s current recommendations related to the use of opt-in samples, our guidance may evolve as new methodological innovations are uncovered or new challenges become barriers to conducting quality opt-in research. Gallup will continue to study all issues related to the use of opt-in panels and is committed to sharing our findings.

[1] For more information about nonprobability panels, refer to the AAPOR Online Task Force on Online Data Quality: https://aapor.org/wp-content/uploads/2023/02/Task-Force-Report-FINAL.pdf

[2] For example: Yeager, D., Krosnick, J., Chang, L., Javitz, H., Levendusky, M., Simpser, A., & Wang, R. (2011). “Comparing the Accuracy of RDD Telephone Surveys and Internet Surveys Conducted with Probability and Non-Probability Samples.” Public Opinion Quarterly, 75, 709-747.

[3] MacInnis, B., Krosnick, J.A., Ho, A.S., & Cho, M-J. “The Accuracy of Measurements with Probability and Nonprobability Survey Samples: Replication and Extension.” Public Opinion Quarterly, 82 (2018): 707-744.

[4] Callegaro, M., Baker, R., Bethlehem, J., Göritz, A.S., Krosnick, J.A., & Lavrakas, P.J. (2014). Online panel research: History, concepts, applications and a look at the future.

[5] https://www.sfchronicle.com/opinion/openforum/article/israel-american-support-poll-19484618.php

[6] https://www.pewresearch.org/short-reads/2024/03/05/online-opt-in-polls-can-produce-misleading-results-especially-for-young-people-and-hispanic-adults/

[7] See https://aapor.org/wp-content/uploads/2023/02/Task-Force-Report-FINAL.pdf and https://aapor.org/wp-content/uploads/2022/12/Margin-of-Sampling-Error-508.pdf

[8] Wiśniowski, A., Sakshaug, J.W., Perez Ruiz, D.A., & Blom, A.G. Integrating Probability and Nonprobability Samples for Survey Inference, Journal of Survey Statistics and Methodology, Volume 8, Issue 1, February 2020, 120-147. https://doi.org/10.1093/jssam/smz051

Author(s)

Jenny Marlar, Ph.D., is Director of Survey Research at Gallup.
