Accuracy of self-assessment of gestational duration among people seeking abortion

BACKGROUND: Mifepristone, used together with misoprostol, is approved by the United States Food and Drug Administration for medication abortion through 10 weeks' gestation. Although in-person ultrasound is frequently used to establish medication abortion eligibility, previous research demonstrates that people seeking abortion early in pregnancy can accurately self-assess gestational duration using the date of their last menstrual period. OBJECTIVE: In this study, we establish the screening performance of a broader set of questions for self-assessment of gestational duration among a sample of people seeking abortion at a wide range of gestations. STUDY DESIGN: We surveyed patients seeking abortion at 7 facilities before ultrasound and compared self-assessments of gestational duration using 11 pregnancy dating questions with measurements on ultrasound. For individual pregnancy dating questions and combined questions, we established screening performance focusing on metrics of diagnostic accuracy, defined as the area under the receiver operating characteristic curve, sensitivity (or the proportion of ineligible participants who correctly screened as ineligible for medication abortion), and proportion of false negatives (ie, the proportion of all participants who erroneously screened as eligible for medication abortion). We tested for differences in sensitivity across individual and combined questions using McNemar's test, and for differences in accuracy using the area under the receiver operating curve and Sidak adjusted P values. RESULTS: One-quarter (25%) of 1089 participants had a gestational duration of >70 days on ultrasound. 
Using the date of last menstrual period alone demonstrated 83.5% sensitivity (95% confidence interval, 78.4–87.9) in identifying participants with gestational durations of >70 days on ultrasound, with an area under the receiver operating characteristic curve of 0.82 (95% confidence interval, 0.79–0.85) and a proportion of false negatives of 4.0%. A composite measure of responses to questions on number of weeks pregnant, date of last menstrual period, and date they got pregnant demonstrated 89.1% sensitivity (95% confidence interval, 84.7–92.6) and an area under the receiver operating curve of 0.86 (95% confidence interval, 0.83–0.88), with 2.7% of false negatives. A simpler question set focused on being >10 weeks or >2 months pregnant or having missed 2 or more periods had comparable sensitivity (90.7%; 95% confidence interval, 86.6–93.9) and proportion of false negatives (2.3%), but with a slightly lower area under the receiver operating curve (0.82; 95% confidence interval, 0.79–0.84). CONCLUSION: In a sample representative of people seeking abortion nationally, broadening the screening questions for assessing gestational duration beyond the date of the last menstrual period resulted in improved accuracy and sensitivity of self-assessment at the 70-day threshold for medication abortion. Ultrasound assessment for medication abortion may not be necessary, especially when requiring ultrasound could increase COVID-19 risk or healthcare costs, restrict access, or limit patient choice.

Introduction
Medication abortion (MA) with the 2-drug regimen of mifepristone and misoprostol is currently approved by the US Food and Drug Administration (FDA) for use through 70 days' (or 10 weeks') gestation. Although in-person ultrasounds or physical exams are frequently used to establish gestational duration (GD) and eligibility for MA, there is growing adoption of models of care that forgo ultrasounds or physical exams and/or conduct eligibility screening virtually. 1–3 However, widespread uptake of these models is limited by state-level restrictions explicitly prohibiting telemedicine provision of abortion or requiring an in-person ultrasound or counseling visit. 4,5 Further, some providers and regulators remain concerned that most people seeking abortion need an ultrasound or physical exam to accurately determine the GD of their pregnancy and, by extension, their eligibility for MA.
Previous research suggests that people seeking abortion early in pregnancy are generally accurate in self-assessing GD using the date of their last menstrual period (LMP), particularly when the goal is minimizing the number of people who screen as eligible when they are not. 6 In the largest US-based study to date, 3.5% of participants who self-assessed their GD using the date of their LMP as ≤63 days (9 weeks) were determined to have a GD of >63 days on ultrasound. 7 Fewer participants (1.2%) had a GD of >70 days. 8 However, a 2014 review article found that the proportion of participants erroneously self-assessing their GD as making them eligible for MA ranged from 2.5% to 11.8% in different populations globally. 6 Further, larger studies were typically restricted to people seeking abortion in early pregnancy, and there is evidence in smaller samples that those later in gestation are more likely to underestimate GD using the date of their LMP. 9,10 Finally, reliance on the date of LMP alone may be unnecessarily restrictive. Studies of people seeking abortion suggest that 20% to 29% are not certain about the date of their LMP, 7,11,12 and up to 1 in 5 people report irregular periods or frequently spotting between periods, 11,13 both of which are associated with later presentation for abortion. 11,12,14
In this study, we examined the diagnostic accuracy of a broad set of pregnancy dating questions, alone and in combination, in identifying people eligible for MA on the basis of GD. We examined questions presented as they would be on a label for an over-the-counter (OTC) MA product and in an electronic questionnaire that could be completed online, in a pharmacy or clinic waiting room, or during a telehealth consultation. This study offers new tools to support effective methods for self-assessment of GD and ultimately expanded access to abortion care via telemedicine in the present and in a future OTC context.

Materials and Methods
We recruited participants from 7 freestanding abortion facilities in Alabama, California, Florida, Illinois, North Dakota, Texas, and Washington, District of Columbia between October 2019 and March 2020. Sites providing care beyond the first trimester were selected to achieve geographic and policy diversity.
A trained research assistant approached all patients in facility waiting rooms. Those interested in the study completed an eligibility screening tool; eligibility was restricted to those aged ≥15, able to speak and read English or Spanish, seeking an abortion (medication or surgical), and who had not yet had an ultrasound at that facility. There were no inclusion criteria regarding type of abortion or estimated gestation. To align with state or facility policies, parental consent was obtained for 15- to 17-year-olds in North Dakota and Texas, and only people aged >17 were eligible in Alabama.
After research assistants obtained verbal consent, they gave participants a tablet to complete the survey and a study index card. They instructed participants to give the index card to their ultrasound technician. Participants received a $25 Amazon gift card as remuneration. All study activities were approved by the University of California, San Francisco Institutional Review Board.

Measures
GD in weeks and days on ultrasound was recorded by ultrasound technicians. Transvaginal or transabdominal ultrasound was performed in the standard fashion at each facility, generally using crown-rump length up to 14 weeks' gestation and biparietal diameter, head circumference, and/or femur length after 14 weeks' gestation. When GD could not be determined, technicians recorded the reason. Dichotomous variables were created for GD of >56, >63, >70, and >77 days on ultrasound. Despite measurement error, which increases as pregnancy progresses, ultrasound is considered the standard of care for assessing pregnancy duration and serves as the reference standard in this study. 15
Survey questions were developed by the research team and refined on the basis of input from members of a community advisory board composed of medical and nonmedical experts and feedback from 11 cognitive interviews conducted with people seeking abortion. The survey included 2 modules assessing GD. The first included statements as they might appear on a hypothetical MA Drug Facts Label (DFL), with response options limited to "Yes," "No," or "Not sure." The second was structured as a self-administered questionnaire allowing for varied question formats and skip patterns (Table 1). The order in which participants saw the 2 modules was randomized.
LMP-based GD was calculated by subtracting the reported date of LMP from the date of the survey. GD based on date of fertilization was calculated by subtracting the reported date they got pregnant from the date of the survey, and then adding 14 days. When these calculations resulted in nonsensical values (eg, negative or ≤14 days), we applied a set of recoding rules (Supplement).
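The date arithmetic described above can be sketched as follows. This is a minimal illustration rather than the study's own code: the function names and the example dates are hypothetical, and the recoding rules for nonsensical values are not applied here.

```python
from datetime import date


def gd_from_lmp(survey_date: date, lmp_date: date) -> int:
    """Gestational duration in days: days elapsed from the reported LMP to the survey."""
    return (survey_date - lmp_date).days


def gd_from_fertilization(survey_date: date, pregnant_date: date) -> int:
    """Days elapsed from the reported date of getting pregnant to the survey, plus 14."""
    return (survey_date - pregnant_date).days + 14


# Hypothetical example dates, for illustration only.
survey = date(2020, 1, 15)
print(gd_from_lmp(survey, date(2019, 11, 20)))           # 56
print(gd_from_fertilization(survey, date(2019, 12, 4)))  # 56
```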

AJOG at a Glance
Why was this study conducted? In-person ultrasound is typically used to establish gestational duration (GD) and eligibility for medication abortion (MA), but there is interest in expanding the use of telemedicine screening and removing ultrasound requirements. In this study, we examined whether patients' self-assessment of GD using the date of their last menstrual period (LMP) and other questions results in accurate assessment of MA eligibility.

For most pregnancy dating questions, we dichotomized responses as >70 days (or ≥10 weeks if the question was phrased in week format) vs earlier. For certain items in the Questionnaire Module, we also created dichotomous measures of >56 days or ≥8 weeks, >63 days or ≥9 weeks, and >77 days or ≥11 weeks (vs earlier). We recoded nonresponse (skipped the question or selected "Not sure") as GD above the threshold or ineligible for MA.
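The dichotomization rule, including the handling of nonresponse, can be illustrated with a short sketch. The function name and the use of `None` to represent a skipped or "Not sure" answer are assumptions made for this example only.

```python
from typing import Optional


def screens_ineligible(reported_days: Optional[int], threshold_days: int = 70) -> bool:
    """Return True when a self-report screens as above the threshold (ineligible for MA).

    None stands for nonresponse (skipped or "Not sure"), which is treated
    as above the threshold, following the rule described in the text.
    """
    if reported_days is None:
        return True
    return reported_days > threshold_days


for days in (56, 70, 71, None):
    print(days, screens_ineligible(days))
```

Passing `threshold_days=56`, `63`, or `77` gives the alternative cut points mentioned for the Questionnaire Module.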
Both modules included items related to regularity of periods, hormonal contraceptive (HC) method use, and previous pregnancy dating (Table 1). Participants who selected "Yes" to the "periods about once a month" statement in the DFL Module or "every 4 weeks or about once a month" in the Questionnaire Module were classified as having regular periods; all other responses, including "Not sure," were considered irregular periods. In the DFL Module, participants who selected "Yes" to statements about injectable, pill, or implant use were classified as HC users; all others, including "Not sure," were considered nonusers. In the Questionnaire Module, participants who selected birth control pills, vaginal rings, patches, implants, or injectables were classified as HC users.
Demographics were assessed at the end of the survey and included age, race and ethnicity, place of birth (in or outside the United States), highest level of education, number of previous births, and household income and size. Household poverty level was calculated using 2019 thresholds. 16

Analysis
We presented descriptive statistics and compared the sample to people seeking abortion nationally using the Guttmacher Institute's 2014 Abortion Patient Survey 17 (for socio-demographics) and 2017 Abortion Provider Census 18 (for pregnancy duration). We examined concordance between GD estimated by date of LMP or date they got pregnant and GD measured by ultrasound by calculating and summarizing the difference between these 2 values and generating 2-way scatterplots.
We described nonresponse to each pregnancy dating question and examined the screening performance of questions, alone and then in combination, according to 6 indicators: sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), area under the receiver operating characteristic curve (AUC), and proportion of false negatives in the total sample. Indicators were derived using the 2×2 table and formulas presented in the Supplement; a positive result was defined as greater than the specified threshold (most often 70 days). FDA guidance on self-selection considers accuracy, sensitivity, and NPV to be most important when determining whether someone can self-select to take a medication, for example in an OTC environment. 19 Thus, although most previous research on GD self-assessment has focused on the proportion of false negatives, 6 in this study we also prioritized performance regarding sensitivity and AUC, which captures overall accuracy or balance between sensitivity and specificity. AUC also has the advantage of not varying with outcome prevalence. For AUC, we considered values of 0.7 to 0.8 as good, 0.8 to 0.9 as very good, and 0.9 to 1.0 as excellent in terms of diagnostic accuracy. 20 We compared AUC values of individual dating questions using Sidak adjusted P values to account for paired comparisons; Stata software's roccomp command was used for calculations. 21
We then combined responses using individual questions with AUC >0.75 or with a factor loading >0.60 in exploratory factor analysis (Supplement B provides details). We combined questions in 2 ways. First, in scaffolding, we layered responses according to question accuracy (from higher to lower), using responses to higher-accuracy questions unless they were missing, in which case we used responses to the next most accurate item. In the second, composite measure of responses, we recoded participants as ineligible if their response to any question classified them as such.
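The 6 screening indicators named in this section can be computed from the four cells of a 2×2 table, as in the sketch below. It assumes a positive result means "above the threshold" (ineligible) on both self-report and ultrasound, and it uses the single-cut-point form AUC = (Sensitivity + Specificity)/2 given in the Supplement. The function name and the counts are hypothetical and are not study data.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Screening indicators from a 2x2 table.

    tp: above threshold on self-report and on ultrasound
    fp: above threshold on self-report, at or below on ultrasound
    fn: at or below threshold on self-report, above on ultrasound
    tn: at or below threshold on both
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "auc": (sensitivity + specificity) / 2,
        # Share of the whole sample that screens as eligible in error.
        "false_negative_proportion": fn / total,
    }


# Hypothetical counts for illustration only.
metrics = screening_metrics(tp=90, fp=60, fn=10, tn=240)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the false-negative proportion uses the total sample as its denominator, so unlike sensitivity it depends on how common gestations above the threshold are in the sample.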
We tested the sensitivity and AUC of combinations of questions with the goal of arriving at 1 combination per module with the fewest items (ie, lowest participant burden) and highest sensitivity and AUC, and then contrasted performance between the best performing DFL and questionnaire combinations. Differences in sensitivity were assessed using McNemar's test for paired data. As mentioned previously, differences in AUC were assessed using receiver operating characteristic curves and Sidak adjusted P values.
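For readers unfamiliar with McNemar's test, the sketch below shows one common form of it. The text does not say whether the exact binomial or the chi-square version was used; this example assumes the exact two-sided version, and the function name and counts are hypothetical. When comparing sensitivity, only participants above the threshold on ultrasound are considered, and the test depends only on the discordant pairs: those flagged by one question set but not the other.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from the discordant pair counts.

    b: flagged as ineligible by question set A only
    c: flagged as ineligible by question set B only
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for illustration only.
print(mcnemar_exact(2, 12))
print(mcnemar_exact(5, 5))
```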
Once we identified the optimal combination by module, we tested whether sensitivity and AUC improved with the addition of ineligibility criteria of HC use and/or irregular periods. We also tested whether sensitivity and AUC improved by shifting self-reported GD to 56 and 63 days, keeping ultrasound at >70 days. In descriptive analysis, we calculated the number of false negative cases that were appropriately reclassified by moving from reliance on date of LMP alone to the 3-item composite of questionnaire items.
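The two ways of combining questions described in the Analysis (scaffolding and the composite measure), and the count of false negatives reclassified by a composite, can be sketched as follows. Several details are assumptions made for illustration: each response is coded `True` (screens as ineligible), `False` (screens as eligible), or `None` (missing); a participant with every response missing is treated as ineligible; and the composite ignores missing responses when at least one question was answered, which the text does not settle. The records are hypothetical.

```python
from typing import Optional, Sequence

Response = Optional[bool]  # True = ineligible, False = eligible, None = missing


def scaffold(responses: Sequence[Response]) -> bool:
    """Use the first non-missing response; responses are ordered from most to least accurate."""
    for response in responses:
        if response is not None:
            return response
    return True  # assumption: all missing counts as ineligible


def composite(responses: Sequence[Response]) -> bool:
    """Ineligible if any answered question classifies the participant as ineligible."""
    answered = [r for r in responses if r is not None]
    if not answered:
        return True  # assumption: all missing counts as ineligible
    return any(answered)


# Hypothetical records: (ultrasound >70 days, [weeks pregnant, date of LMP, date got pregnant])
records = [
    (True, [True, False, False]),    # missed by LMP alone, caught by the composite
    (True, [False, False, None]),    # missed by both
    (True, [None, True, True]),      # caught by LMP alone
    (False, [False, False, False]),  # correctly screened as eligible
]
LMP = 1  # position of the LMP question in each response list
lmp_false_negatives = [resp for over_70, resp in records if over_70 and resp[LMP] is False]
reclassified = sum(composite(resp) for resp in lmp_false_negatives)
print(len(lmp_false_negatives), reclassified)  # 2 1
```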
Finally, in sensitivity analyses, we examined whether performance was similar: (1) when shifting the threshold to 77 days; (2) among participants who reported no previous ultrasound; and (3) on the basis of the order in which participants saw the modules.
We sought to estimate whether the rate at which people falsely identify themselves as eligible (proportion of false negatives) for MA is reduced by at least 1.5% with the use of a combined question; a sample size of 1000 provides 86% power to detect this difference assuming, consistent with previous research, 7 that the proportion of false negatives using date of LMP alone is 3.5%.

Figure 1 (fragment): Consented (n=1177). The asterisk indicates that participants could be ineligible for multiple reasons.

Results
Of 1697 individuals approached, 1312 (77%) were interested in learning about the study. Most (n=1209) were eligible, and 1177 provided consent and initiated the survey. The final analytical sample included 1089 participants for whom GD measured by ultrasound was available; the most common reason for no reported GD was no visualization of intrauterine pregnancy (n=40) (Figure 1). Most (n=1039, 95.4%) participants reached the end of the survey. Median age was 26 years (interquartile range, 22–30). Approximately one-third of participants were Black (35.4%), another third White (31.0%), and 1 in 5 (18.0%) Hispanic or Latina/Latinx. Four in 10 participants (42.2%) were nulliparous (Table 2). The participants' demographic profiles and GD measured by ultrasound mirrored those of people seeking abortion nationally (Supplemental Table 1). Mean GD on ultrasound was 61.7 days, and median GD was 53 days, ranging from 28 to 224 days. One-quarter of participants (25%) had a GD >70 days.

Missingness and accuracy of Drug Facts Label Module items
Of the 5 DFL questions, having "missed >2 periods" produced the lowest nonresponse (6% vs 12%–23% for other questions). Nonresponse was highest for the statement of being ">2 months pregnant" (23%) (Table 3).

Missingness and accuracy of Questionnaire Module items
Of the 6 questions, "number of weeks pregnant" had the lowest nonresponse (3%). Questions focused on dates of LMP and date they got pregnant had the highest nonresponse at 14%. The question with the highest AUC was "number of weeks pregnant" (0.87; 95% CI, 0.84–0.89); this was higher than the AUC of the date of LMP question (0.82; 95% CI, 0.79–0.85; P=.007). The questions on date they got pregnant and number of weeks since LMP performed similarly to the question on date of LMP on AUC (Table 3).
Supplemental Figure 2, A and B depict general concordance between GD estimated by date of LMP/date they got pregnant and GD measured by ultrasound. The mean difference between GD estimated by date of LMP and GD measured by ultrasound was −0.10 (95% CI, −1.03 to 0.83); the mean difference between GD estimated by date they got pregnant and GD measured by ultrasound was −4.0 (95% CI, −4.90 to −3.00).

Adding hormonal contraceptive use or irregular periods
For both DFL and Questionnaire Module composite measures, the addition of HC use and irregular periods significantly increased sensitivity and reduced AUC (see "Combinations with hormonal contraceptive use and irregular period as additional layers" in Table 4).

Lowering the threshold of self-reported pregnancy duration
For the composite questionnaire measures, shifting self-reported GD to 56 or 63 days significantly improved sensitivity and reduced AUC. Using the 56-day self-reported cutoff resulted in the lowest false-negative proportion (0.65%) but also the lowest specificity (49.3%) and PPV (38.8%) (see "Lower dating threshold with 70-d gestational age on ultrasound (questionnaire only)" in Table 4).

False negatives
Using the date of LMP alone and a 70-day threshold, there were 42 false negatives. One-half (23 of 42) of these were appropriately reclassified as ineligible when using the composite questionnaire measure of number of weeks pregnant, date they got pregnant, and date of LMP (Figure 2).
The average ultrasound-measured GD of participants with false negatives (n=25) to the DFL Module composite items of ">10 weeks," "missed >2 periods," or ">2 months pregnant" was 83.3 days and ranged from 72 to 117 days. Two-thirds (64%, 16/25) of participants had a GD <84 days (not shown). The average ultrasound-measured GD of participants with false negatives (n=29) to the Questionnaire Module composite items of number of weeks pregnant, date of LMP, or date they got pregnant was 82.6 days and ranged from 71 to 126 days. Most (72%, 21/29) participants had a GD <84 days (not shown).

Sensitivity analysis
In subgroup analyses restricted only to participants who reported no previous ultrasound (n=867, 81%), the screening performance of individual items and the composite measures was largely comparable (Supplemental Table 3). There were no statistically significant differences in AUC by module randomization order (Questionnaire vs DFL), with P values ranging from .196 to .986 (Supplemental Table 4).

Principal findings and results in context
In this sample of people seeking abortion at facilities across the United States, multiple questions were effective in enabling patients to self-assess the duration of their pregnancy. Asking for a "Yes"/"No" response to a single statement, "You are >10 weeks pregnant," correctly identified 84% of participants with GD >70 days and would result in 4% screening as eligible for MA when they were >10 weeks' gestation on ultrasound. Adding 3 additional statements for evaluating whether the participants were >2 months pregnant, missed >2 periods, or had irregular periods before pregnancy correctly identified 93% of participants with GD of >70 days on ultrasound, and would result in 1.7% incorrectly screening as eligible. This represents a reduction in the proportion of participants incorrectly screened as eligible compared with previous studies, where this proportion was 3.3% or higher when using date of LMP alone. 7,23–26 Notably, these screening questions did not rely on date of LMP, the standard approach to pregnancy dating, 15 the focus of previous research, 6 and a measure that past studies have demonstrated as harder to recall for some, including young people and those with irregular periods. 13,27 Our study indicates that a set of 3 non-LMP-based questions can be used to self-assess pregnancy duration, particularly when the goal is minimizing the frequency of erroneous self-assessment of GD as <70 days (ie, when sensitivity is of paramount importance). These 3 questions could appear on a DFL for a future OTC MA product and enable very good self-assessment of GD.

TABLE 4. Screening performance of combinations of pregnancy dating items, overall and with inclusion of additional questions for hormonal contraceptive use and irregular periods and shifting of gestational duration thresholds
Still, our study findings do not suggest that LMP or date-based questions should be abandoned. If access to facility-based abortion care is restricted, whether because of COVID-19 28 or restrictive policies, 29–32 screening questions may need to prioritize balancing sensitivity and specificity, or overall accuracy, rather than sensitivity alone. Our study revealed that a composite measure of responses to 3 questions on date of LMP, date they think they got pregnant, and number of weeks pregnant had an AUC value of 0.86, reflecting very good sensitivity (89%) and specificity (82%), with a 70-day threshold. Notably, performance was comparable with a 77-day threshold, with AUC of 0.87, 86% sensitivity, and 89% specificity. Thus, adding just 2 additional questions beyond date of LMP expands the number of people who can respond, and with very good accuracy. Therefore, these questions could be well suited for an online tool for self-screening of eligibility for an OTC MA product, and could be effective for most people seeking abortion.
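As a rough consistency check on the figures quoted in this paragraph, the Supplement's formula AUC = (Sensitivity + Specificity)/2 can be applied to the rounded sensitivity and specificity values. Because the inputs are rounded percentages, the results only approximately match the reported AUCs of 0.86 and 0.87. The second helper rests on a general identity, assumed here rather than stated in the text: the proportion of false negatives in the whole sample equals prevalence × (1 − sensitivity). With the abstract's 25% prevalence and 89.1% sensitivity it gives about 2.7%, in line with the reported value. The function names are hypothetical.

```python
def binary_auc(sensitivity: float, specificity: float) -> float:
    """AUC for a single cut point: the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


def false_negative_proportion(prevalence: float, sensitivity: float) -> float:
    """Share of the whole sample that is above the threshold but screens as eligible."""
    return prevalence * (1 - sensitivity)


print(round(binary_auc(0.89, 0.82), 3))  # 70-day threshold: 0.855
print(round(binary_auc(0.86, 0.89), 3))  # 77-day threshold: 0.875
print(round(100 * false_negative_proportion(0.25, 0.891), 1))  # 2.7
```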

Clinical implications
As with most screening tools, no single question or combination of questions resulted in perfect eligibility classification. Thus, the decision to forgo ultrasound partly depends on the levels of risk tolerance of people in need of abortion care, providers, and policymakers regarding the use of MA at >70 or >77 days' gestation, taking into account that such use is rare. In this study, >70% of the 29 patients who erroneously screened as eligible but were actually >70 days on ultrasound had a GD of <84 days (12 weeks), a range in which existing evidence indicates that MA efficacy in ending a pregnancy remains high and that people receive appropriate follow-up care when needed, including for ectopic pregnancy. 33 However, there will be rare cases where people screen as eligible and take the medications later in gestation and may require additional counseling or treatment. Levels of risk tolerance can depend on other factors such as availability and, importantly, accessibility of facility-based abortion care to all patients. 32 In the context of COVID-19, another factor may be the desire to minimize risk of exposure to the virus. Of paramount importance is that informed consent and counseling practices evolve to center people seeking abortion in the decision-making on proceeding without ultrasound.

Strengths and limitations
This study has several notable strengths. First, because we included nearly all people seeking abortion at recruitment facilities, the sample was racially and socioeconomically diverse and generalizable to people seeking facility-based abortion with ultrasound in the United States. Further, our sample included many more people seeking abortion around the 70-day threshold for MA than those in previous research. Given that one-quarter of the sample had a GD of >10 weeks on ultrasound, this study offers evidence of the real-world performance of screening questions among people in need of abortion. Importantly, this study also establishes the screening performance of questions beyond date of LMP, offering clinicians and researchers an expanded set of tools for assessing duration of pregnancy.
This study has several limitations. First, we developed many of the dating questions and examined their performance around a 70-day threshold given current FDA mifepristone labeling. Further evaluation would be necessary to ensure that some items, particularly in the DFL Module, perform similarly at higher GD thresholds. Notably, in sensitivity analyses, a composite measure of date of LMP, date they got pregnant, and number of weeks pregnant performed similarly with a 77-day threshold, which is important because some providers have already started using this threshold. 1 Second, although the comparison of our study population data with Guttmacher's national survey data suggests that our study findings are generalizable to people seeking facility-based abortion nationally, additional research would be useful in ensuring that questions are appropriate for people with low literacy levels, which is prioritized by the FDA but was not assessed in this study. Similarly, <2% of participants chose to complete the study in Spanish, and thus we were unable to establish screening performance among Spanish speakers; a future study among Spanish speakers would help ensure equitable access to expanded screening models.

Conclusions
This study provides robust evidence that people seeking abortion can self-assess their GD, particularly around the 70- or 77-day threshold. Importantly, beyond relying primarily on date of LMP, our study offers alternative pregnancy dating questions that produce more accurate self-assessments and lower nonresponse. Our findings suggest that policies requiring in-person ultrasound or dispensing of medications, such as the Risk Evaluation and Mitigation Strategy for mifepristone, are not universally necessary to establish gestation-based eligibility for MA.

Supplemental Materials
Area under the receiver operating characteristic (ROC) curve (AUC) = (Sensitivity + Specificity)/2. 1
The US Food and Drug Administration guidance on self-selection considers accuracy, sensitivity, and NPV to be most important when determining whether someone can self-select to take a medication, for example in an over-the-counter environment. 2 Sensitivity reflects correct self-selection of those who cannot take the medication (true positive rate), whereas NPV captures correct self-selection of those who select to take the medication.
AUC is a summary measure of overall accuracy reflecting both sensitivity and specificity; AUC values of 1.0 represent a test that perfectly discriminates people eligible vs ineligible to take a medication, whereas an AUC of 0.5 reflects a test with no discriminatory ability. 3,4

Additional details on study methods

Rules for recoding gestational duration variables calculated on the basis of dates
To calculate GD using the date of LMP, we subtracted the LMP date from the date of the survey. This resulted in some nonsensical values, defined as ≤14 days. Values of 12 to 14 days (n=7) were left as is and gestational duration calculated accordingly; values between −19 and 12 were recoded to missing (n=10). Large, negative numbers (n=8) were reviewed in detail and determined to be the result of misreporting of year in the calendar; the programming of the survey defaulted to current calendar year and our survey period spanned 2019 to 2020. For these 8 cases, we replaced year with 2019, generating a plausible value of pregnancy duration ≤280 days.
A similar recoding approach was used for GD calculated using the date they got pregnant, defined as the number of days between the date they got pregnant and the survey date plus 14. Nonsensical values were defined as ≤28 days. Values of 20 to 28 (n=8) were left as is; values between −8 and 20 (n=8) were recoded to missing; large and negative values that likely resulted from mis-entry of year (n=14) were recoded using calendar year 2019.
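The LMP recoding rules above can be sketched in code. This is an illustration under stated assumptions, not the authors' procedure: the study reviewed large negative values case by case, whereas this sketch applies the year correction automatically; a value of exactly 12 days is assumed to fall in the "left as is" group; and a corrected value is accepted only if it is positive and no more than 280 days. The function name and dates are hypothetical.

```python
from datetime import date
from typing import Optional


def recode_lmp_gd(survey_date: date, lmp_date: date) -> Optional[int]:
    """Return LMP-based GD in days after recoding, or None when recoded to missing."""
    gd = (survey_date - lmp_date).days
    if gd >= 12:
        return gd  # plausible values, and values of 12 to 14 days, are left as is
    if gd >= -19:
        return None  # small or slightly negative values are recoded to missing
    # Large negative value: assume the year was mis-entered and retry with 2019.
    try:
        corrected = (survey_date - lmp_date.replace(year=2019)).days
    except ValueError:  # for example, 29 February does not exist in 2019
        return None
    return corrected if 0 < corrected <= 280 else None


survey = date(2020, 1, 10)  # hypothetical survey date
print(recode_lmp_gd(survey, date(2019, 11, 1)))   # 70 (plausible as entered)
print(recode_lmp_gd(survey, date(2019, 12, 28)))  # 13 (left as is)
print(recode_lmp_gd(survey, date(2020, 1, 5)))    # None (recoded to missing)
print(recode_lmp_gd(survey, date(2020, 11, 1)))   # 70 (year corrected to 2019)
```

The rules for GD based on the date they got pregnant follow the same pattern with different cut points and the added 14 days.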

Details on exploratory factor analysis
To reduce the number of items considered for combination and to confirm that the items were measuring only 1 construct, we ran 2 separate exploratory factor analyses using the iterated principal-factor method. The first model contained 5 candidate items from the Drug Facts Label module, whereas the second model contained 6 items from the Questionnaire module. Factor 1 eigenvalues for the Label and Questionnaire items were 2.75 and 3.15, respectively, each followed by a precipitous drop in eigenvalues (0.27 and 0.13, respectively), suggesting that they measure only 1 construct. We removed items with low factor loading (<0.60), preserving a 1-factor solution. The retained items that were then considered further in combination had factor loadings ranging from 0.63 to 0.75 for the Drug Facts Label (n=4) and 0.70 to 0.85 for the Questionnaire items (n=4).

SUPPLEMENTAL TABLE 1. Comparison of the sociodemographic and pregnancy characteristics of the study sample with Guttmacher surveys of abortion patients and providers