# Seroprevalence of SARS-CoV-2 antibody among the urban Iranian population: results of the second large population-based cross-sectional study | BMC Public Health

### Study design and participants

This population-based cross-sectional study was conducted in 16 cities in 15 Iranian provinces: Ardabil, Babol, Gorgan, Sari, Tabriz and Urmia in the northern provinces; Hamedan, Kermanshah, Mashhad, Qom, Tehran and Sanandaj in the central provinces; and Ahvaz, Kerman, Shiraz and Zahedan in the southern provinces (Fig. 1). The detailed sampling method was described in the first study phase [3]. Briefly, we randomly sampled the general population registered in Iran’s electronic health record (SIB) system based on their national ID numbers and invited them by phone to visit a health center for data collection. The SIB Network belongs to a prospective population-based cohort study in which demographic information and administrative health data of > 88% of Iranians (approximately 72 million people) are recorded [6]. We included people aged ≥ 10 years and excluded those who were unreachable or unwilling to participate in the study. Unlike our previous serological survey, we did not recruit high-risk individuals in the present study, i.e. we did not include high-risk occupational groups (such as health workers, etc.). We considered provincial capitals as clusters due to the heterogeneous pattern of COVID-19 dispersion in Iranian provinces, as well as factors such as population density, the high correlation of humidity in each province with the prevalence of COVID-19, and intra-city and intra-provincial movements, which could affect the prevalence of COVID-19 [7, 8].

### Sample size calculation

The sample size was calculated based on an estimated COVID-19 prevalence of 14.2% [9], a relative estimation error of 10%, a precision of 5%, a non-response rate of 10% and a design effect (Deff) of 1.75 to account for the nature of sampling, obtained from the following formula:

Deff = 1 + d(n − 1), where the intraclass correlation coefficient (d) was 0.05 and the number of clusters (n = 16) was the total number of cities. Based on this information, the total sample size for this study was 9010 individuals. The sample size formula was:

$$\mathrm{n}=\frac{\mathrm{z}_{1-\alpha/2}^{2}\cdot \mathrm{p}\cdot \left(1-\mathrm{p}\right)}{\mathrm{d}^{2}}$$
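As a rough illustration of the two expressions above, the following Python sketch evaluates the design effect from the stated intraclass correlation (0.05) and number of clusters (16), and then the simple-random-sample size formula. The function names are hypothetical, and the margin of error passed to `base_sample_size` is an assumed value chosen only for illustration. The non-response inflation is left out because the text does not state how it was applied.

```python
import math
from statistics import NormalDist


def design_effect(icc: float, n_clusters: int) -> float:
    """Deff = 1 + d(n - 1), with d the intraclass correlation and n as in the text."""
    return 1 + icc * (n_clusters - 1)


def base_sample_size(p: float, margin: float, alpha: float = 0.05) -> float:
    """n = z_{1-alpha/2}^2 * p * (1 - p) / d^2 for a simple random sample."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2 * p * (1 - p) / margin ** 2


# Values stated in the text: ICC of 0.05 and 16 clusters (cities).
deff = design_effect(icc=0.05, n_clusters=16)

# p = 0.142 is the expected prevalence from the text;
# margin = 0.02 is a HYPOTHETICAL absolute margin of error, not taken from the study.
n_srs = base_sample_size(p=0.142, margin=0.02)

# Inflate the simple-random-sample size by the design effect.
n_design = math.ceil(n_srs * deff)
print(f"Deff = {deff:.2f}, n (SRS) = {math.ceil(n_srs)}, n (with Deff) = {n_design}")
```

With the stated inputs, the design effect evaluates to 1.75, which agrees with the value reported in the text.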

### Procedures

After being referred to a collaborating center, participants were interviewed by trained research staff to complete questionnaires covering demographic details, medical history, COVID-19-related symptoms, and COVID-19-related exposures. After the required information had been collected, a 6 mL sample of venous blood was drawn from each participant by a trained laboratory technician into an EDTA-coated microtainer labeled with a unique participant identification number. Centrifuged plasma samples were then transported to a central laboratory on dry ice (−20 °C). Serum samples were evaluated for the presence of IgG and IgM antibodies to SARS-CoV-2 nucleocapsid protein, using SARS-CoV-2 ELISA kits approved by the Iranian Food and Drug Administration (Pishtaz Teb, Tehran, Iran) according to the manufacturer’s protocol [10]. The kits were designed based on an indirect method in which SARS-CoV-2-specific nucleocapsid protein was coated onto 96-well plates. The recombinant SARS-CoV-2 nucleocapsid protein, expressed in Baculovirus-insect cells, comprises amino acids 1–419 and has a predicted molecular mass of 47.08 kDa. Information on sample collection and ELISA kits has been presented in detail previously [3].

### Validation of tests

Because the ELISA kits used in the present study were similar to those used in our previous serological survey, their diagnostic performance and test validation were the same as previously described [3]. As before, we used two scenarios to adjust the seroprevalence rates in this study. The test performance of Scenario 1 (our own test validation data: 66.9% sensitivity and 98.2% specificity) was used as the primary test characteristic, and Scenario 2 (the manufacturer’s data combined with our test validation data: 71.8% sensitivity and 98.2% specificity) was used for comparison with the test-adjusted estimates from Scenario 1.

### Covariates

Demographic information included gender, age, and city of residence. Prior medical history included the following self-reported comorbidities: heart disease, hypertension, chronic lung disease, asthma, diabetes, obesity, and kidney disease. Symptoms related to COVID-19 included cough, fever, chills, sore throat, headache, dyspnea, diarrhea, anosmia, conjunctivitis, weakness, myalgia, arthralgia, altered level of consciousness and chest pain experienced within the past 12 weeks [3, 11]. Participants were then categorized as asymptomatic, paucisymptomatic (one to three symptoms), or symptomatic (four or more symptoms). We also asked participants about their recent contact (within the past 12 weeks) with a confirmed COVID-19 patient.
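The symptom-based grouping described above can be sketched in Python as follows. The function name and the way symptoms are encoded (a collection of strings per participant) are assumptions made for illustration. Only the thresholds (none, one to three, four or more) and the symptom list come from the text.

```python
# Symptoms listed in the text (experienced within the past 12 weeks).
SYMPTOMS = frozenset({
    "cough", "fever", "chills", "sore throat", "headache", "dyspnea",
    "diarrhea", "anosmia", "conjunctivitis", "weakness", "myalgia",
    "arthralgia", "altered level of consciousness", "chest pain",
})


def symptom_category(reported):
    """Classify a participant by the number of listed symptoms they reported."""
    count = len(SYMPTOMS & set(reported))  # ignore anything not on the list
    if count == 0:
        return "asymptomatic"
    if count <= 3:
        return "paucisymptomatic"
    return "symptomatic"


# Hypothetical participants, for illustration only.
print(symptom_category([]))
print(symptom_category(["cough", "fever"]))
print(symptom_category(["cough", "fever", "anosmia", "myalgia"]))
```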

### Statistical analyses

The statistical analyses have been explained in detail previously [3]. Briefly, the overall crude seroprevalence of SARS-CoV-2-specific antibodies was estimated as the proportion of positive tests in the total sample. Age-, gender-, and city-weighted rates were calculated in bootstrap samples using the 2016 Iran Population and Household Census as the standard population. Given the nature of participant selection, the bootstrap-weighted seroprevalence rate was calculated for each combination of city (Ahvaz, Ardabil, Babol, Gorgan, Hamedan, Kerman, Kermanshah, Mashhad, Qom, Sanandaj, Sari, Shiraz, Tabriz, Tehran, Urmia and Zahedan), age group (10–19, 20–29, 30–39, 40–49, 50–59, ≥ 60 years) and gender (male, female). Finally, to minimize the bias resulting from antibody tests with imperfect sensitivity and specificity, we adjusted the weighted (bootstrap-weighted) seroprevalence for test performance under scenarios 1 and 2, using the following formula proposed by Cassaniti et al. [12, 13], where AP denotes the adjusted prevalence, UP denotes the unadjusted (apparent) prevalence, Sp denotes the specificity of the test and Se denotes the sensitivity of the test:

$$\mathrm{AP}=\frac{\mathrm{UP}+\mathrm{Sp}-1}{\mathrm{Se}+\mathrm{Sp}-1}$$
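To make the adjustment concrete, the sketch below applies the formula above under the two test-performance scenarios given in the validation section. The function name, the clamping of the result to the range 0 to 1, and the apparent prevalence of 10% are assumptions for illustration only. They are not taken from the study's data or code.

```python
def adjusted_prevalence(up: float, se: float, sp: float) -> float:
    """AP = (UP + Sp - 1) / (Se + Sp - 1), clamped to [0, 1]."""
    denominator = se + sp - 1
    if denominator <= 0:
        raise ValueError("Se + Sp must exceed 1 for the adjustment to be meaningful")
    ap = (up + sp - 1) / denominator
    # Clamping is an assumption of this sketch; the text does not describe it.
    return min(max(ap, 0.0), 1.0)


# Sensitivity and specificity as reported in the validation section.
SCENARIOS = {
    "Scenario 1": {"se": 0.669, "sp": 0.982},
    "Scenario 2": {"se": 0.718, "sp": 0.982},
}

apparent = 0.10  # HYPOTHETICAL unadjusted prevalence, not a study result
for name, perf in SCENARIOS.items():
    ap = adjusted_prevalence(apparent, **perf)
    print(f"{name}: adjusted prevalence = {ap:.4f}")
```

Because Scenario 1 assumes a lower sensitivity, it gives a larger upward correction than Scenario 2 for the same apparent prevalence.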

The 95% confidence intervals (CIs) for unweighted seroprevalence were estimated using exact binomial models, and a bootstrap method was used to construct 95% CIs for weighted and adjusted estimates [14, 15]. Categorical variables were reported as frequencies and percentages. We calculated the total number of infections by multiplying the prevalence of infection by the total population of each province. We also assessed the distribution of SARS-CoV-2 seropositivity by sex, age, comorbidity, contact with COVID-19 patients, and symptoms, using the chi-square test. All statistical analyses were performed using Microsoft Excel and Stata version 14 (StataCorp, College Station, TX).
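The two kinds of interval mentioned above can be sketched in Python using only the standard library. This sketch assumes that "exact binomial" refers to the Clopper-Pearson interval, which is the usual meaning, and it uses a percentile bootstrap, which is one common choice; the text does not specify which bootstrap variant was used. All function names and the example counts are hypothetical.

```python
import math
import random


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _solve_p(j: int, n: int, target: float, iters: int = 60) -> float:
    """Bisection for p such that binom_cdf(j, n, p) == target (the CDF decreases in p)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if binom_cdf(j, n, mid) > target:
            lo = mid  # CDF still too high, so p must be larger
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for k positives out of n tests."""
    lower = 0.0 if k == 0 else _solve_p(k - 1, n, 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_p(k, n, alpha / 2)
    return lower, upper


def bootstrap_ci(outcomes, estimator, n_boot: int = 2000, alpha: float = 0.05, seed: int = 0):
    """Percentile bootstrap CI: resample with replacement, then take empirical quantiles."""
    rng = random.Random(seed)
    n = len(outcomes)
    stats = sorted(estimator(rng.choices(outcomes, k=n)) for _ in range(n_boot))
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# HYPOTHETICAL data: 30 seropositive results among 200 participants.
results = [1] * 30 + [0] * 170
print("Exact CI:", exact_binomial_ci(sum(results), len(results)))
print("Bootstrap CI:", bootstrap_ci(results, lambda s: sum(s) / len(s)))
```

In a fuller analysis, the `estimator` argument could apply the weighting and the test-performance adjustment to each resample before the quantiles are taken.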