The following is the established format for referencing this article:
Estien, C. O., E. J. Carlen, and C. J. Schell. 2024. Examining the influence of sociodemographics, residential segregation, and historical redlining on eBird and iNaturalist data disparities in three U.S. cities. Ecology and Society 29(3):16.
ABSTRACT
Ecologists often leverage contributory science, also referred to as citizen science, to answer large-scale spatial and temporal biodiversity questions. Contributory science platforms, such as eBird and iNaturalist, provide researchers with incredibly fine-scale data to track biodiversity. However, data generated by these platforms are spatially biased. Research has shown that factors like income, race, and historical redlining can influence spatial patterns of reported eBird and iNaturalist data. However, the role of contemporary residential segregation remains unclear. Additionally, we do not understand how these variables potentially relate to certain Census tracts having more or less biodiversity data than would be expected based on size or population density. To further understand the social factors that may contribute to spatial biases in eBird and iNaturalist data, we focused on three cities within the USA (Oakland, California; St. Louis, Missouri; and Baltimore, Maryland). We specifically investigated how income, race, segregation, and redlining via Home Owners’ Loan Corporation grades (grades A = best, B, C, and D = hazardous and “redlined”) are associated with the difference between reported and expected observations based on area and human population density. We find that Census tracts with higher income and more White people generally have more observations than expected. We only find segregation to influence differences in reported and expected observations in Baltimore, with more segregated Census tracts having more observations than expected. Lastly, we find that grades C and D consistently have fewer data than expected compared with grades A and B for both platforms in each city. Our results show that although each city has distinct societal and ecological features, societal inequity permeates each city to shape the uptake of data for two of the largest sources of biodiversity data.
INTRODUCTION
Examining the impact of global urbanization on flora and fauna is becoming increasingly urgent, with urban expansion projected to increase by 0.82–1.53 million km², threatening over 30,000 species globally (Nilon 2011, Simkin et al. 2022, Lambert and Schell 2023). Determining the appropriate scale, resolution, and depth of biological data collection is therefore essential for sufficiently deciphering ecological responses to rapid landscape transformation (Chandler et al. 2017, Callaghan et al. 2021, Blumstein et al. 2023), while simultaneously providing pivotal solutions for effective and equitable conservation strategies (Chapman et al. 2024). Specifically, fine-scale data that span large geographies with temporal depth will be crucial for asking large-scale questions concerning biodiversity in the era of climate change and rapid biodiversity loss (Theobald et al. 2015, Kelling et al. 2019, Perkins et al. 2023).
Contributory science—also referred to as citizen or community science—platforms yield immense data that have pertinent potential for exploring biodiversity conservation hotspots (McKinley et al. 2017). Specifically, data sources, such as eBird and iNaturalist, that collect observations of flora and fauna globally are viable tools for understanding biodiversity within cities (iNaturalist 2023, eBird 2023). eBird is a platform that produces semistructured data (i.e., count data with associated metadata on participant effort), whereas iNaturalist generally produces unstructured data (i.e., presence only and no information on participant effort) (Welvaert and Caley 2016). Data collected on platforms such as eBird and iNaturalist allow scientists to explore large-scale ecological questions by collecting data at vast spatial and temporal scales (Winton et al. 2018, Kirchhoff et al. 2021, Putman et al. 2021). Leveraging these data is useful due to the challenges of answering ecological questions on a continental or global scale. For example, contributory data have been used to understand the factors influencing the death of migratory birds (Yang et al. 2021) and the distribution of non-native species (Maistrello et al. 2016, Werenkraut et al. 2020, Calzada Preston and Pruett-Jones 2021). However, participant-led platforms may yield biases due to individual differences in space use and preferences.
The probability of an individual reporting data to eBird and iNaturalist can vary across space, often as a result of social and ecological factors (Gadsden et al. 2023, Perkins et al. 2023, Carlen et al. 2024). For data to be reported, individuals must have access to areas and physically be present, leading to spatial variation in reported data via road density, human density, and land cover type (Zhang 2020). Recent work has shown that eBird and iNaturalist data can also be influenced by a suite of social factors (Carlen et al. 2024). For instance, in eBird data, higher income and more White neighborhoods have more observations (Perkins 2020, Grade et al. 2022). Similarly, recent evidence suggests that historical redlining deployed by the Federal Housing Administration (FHA) and local lenders is strongly associated with the depth and distribution of eBird and iNaturalist observations. Historical redlining was a discriminatory lending practice used across the USA and institutionalized by the Home Owners’ Loan Corporation (HOLC) in the 1930s when HOLC appraisers ranked and mapped neighborhood quality to assess investment risk (Hillier 2003, Fishback et al. 2022). Home Owners’ Loan Corporation maps ranked neighborhoods on a four-letter scale: Grade A (i.e., most desirable and “greenlined” areas), which were mostly high-income and White populations, B (still desirable), C (definitely declining), and D (i.e., hazardous and “redlined” areas), which were mostly Black and/or other marginalized populations (Hillier 2003), creating maps that serve as a proxy for numerous racialized policies, including redlining, that led to and upheld disinvestment in these neighborhoods (Fishback et al. 2022, Pickett et al. 2023).
As a result of these racialized policies, these redlined neighborhoods have a higher concentration of poverty as well as diminished environmental quality, such as an overall higher concentration of environmental hazards (Appel and Nickerson 2016, Locke et al. 2021, Nardone et al. 2021, Estien et al. 2024b) and diminished biodiversity (Wood et al. 2023, Estien et al. 2024a). Further downstream, consequences of redlining are seen in bird biodiversity data across the USA, with redlined neighborhoods having lower sampling densities than greenlined neighborhoods (Ellis-Soto et al. 2023). Thus, social factors such as race, income, and redlining influence the interpretation of reported biodiversity data, providing an incomplete assessment of biodiversity and obscuring our ability to successfully tackle the crises at hand (Carlen et al. 2024).
Further investigating the potential biases in these data is crucial for identifying what variables ecologists must control for when modeling species distribution or ecology with these data. Contemporary residential segregation, which is a process and mechanism that drives the arrangement of different ethno-racial groups due to differences in labor markets and housing policies (Morello-Frosch 2002, Grove et al. 2018), may also influence where biodiversity data are reported. Segregation itself has been shown to drive disparities in environmental quality and human health outcomes, such that humans in extremely segregated cities face worse environmental hazard outcomes than those in less segregated cities, regardless of ethnicity (Morello-Frosch and Jesdale 2006, Jesdale et al. 2013, Casey et al. 2017). Thus, for contributory science data, although there may be spatial biases in data by race, disparities may be further exacerbated due to segregation.
With this study, we aim to fill several gaps with respect to eBird and iNaturalist data, two of the most used contributory science applications and largest sources of biodiversity data currently found in the Global Biodiversity Information Facility. First, prior works provide evidence suggesting that observations can vary by income, race, and HOLC grade (Perkins 2020, Grade et al. 2022, Ellis-Soto et al. 2023, Estien et al. 2024a), with wealthier, whiter, and higher HOLC-grade neighborhoods having more observations. However, we do not know if those same Census tracts have more observations than would be expected based on area or human population density. Thus, we ask the novel question of how the total number of reported observations differs from the number of expected observations. Second, only one study to our knowledge has examined how race is explicitly associated with biodiversity data, focusing only on eBird data (Grade et al. 2022). Therefore, we hold no understanding of the relationship between race and iNaturalist data, or how race is associated with eBird data beyond the cities investigated in Grade et al. (2022). Third, no work has sought to investigate the relationship between segregation and eBird and iNaturalist data, despite the potential link. Lastly, outside of a few studies (Perkins 2020, Grade et al. 2022, Ellis-Soto et al. 2023, Estien et al. 2024a), the majority of the literature pertaining to iNaturalist or eBird data focuses on a single city. Examining multiple cities at once allows for a deeper look at the nuances shaping city-level results as well as yielding potential generalizations.
To investigate how income, race, segregation, and historical redlining were associated with differences in reported and expected observations in eBird and iNaturalist, we focused on three North American cities for our analyses: Oakland, California; St. Louis, Missouri; and Baltimore, Maryland (Fig. 1; Append. 1: figs. A1-A3). We chose these three cities because they vary in social (e.g., politics, culture, and history) and ecological (e.g., canopy cover, green space) characteristics and are located in three distinct regions of the USA (West Coast, Midwest, and East Coast). We looked at both eBird and iNaturalist as we expected to see differences in the biases investigated due to differences in sampling techniques (semistructured vs. unstructured). We expected Census tracts that were previously greenlined or that had higher income, higher percentages of White people, and higher segregation indices to have more reported observations than expected. However, due to the semistructured nature of eBird, we expected to find weaker effects of the variables of interest on eBird data compared with iNaturalist.
METHODS
Study areas and data sets
We downloaded all observations from iNaturalist and eBird for Oakland, California, St. Louis, Missouri, and Baltimore, Maryland, from the first recorded observation in each database through 27 July 2022. These observations reflect the total number of reports to each platform, which may reflect multiple observations for the same species. Additionally, we used data from the 2020 United States Census Bureau (U.S. Census Bureau 2022) to examine race and income across our study area and HOLC maps from the Mapping Inequity Project to examine historical redlining (Nelson et al. 2020). We completed all analyses in R version 4.3.1.
Using the “tidycensus” package (Walker and Herman 2021), we extracted self-reported ethno-racial and income data at the Census tract level. Next, we calculated the percentage of White people within a Census tract. Additionally, we calculated segregation using ethno-racial identities to create a segregation score for each Census tract via a dissimilarity index, which focuses on multiple racial and ethnic groups (Iceland 2004). The dissimilarity index represents “the proportion of the racial group that would need to relocate to another Census tract to achieve an even distribution throughout a metropolitan area” (Morello-Frosch and Jesdale 2006). Lastly, to assign a Census tract a HOLC grade, we calculated the centroid of each Census tract using the “st_centroid” function in the “sf” package (Pebesma and Bivand 2023) and assigned the tract a HOLC grade (i.e., A, B, C, or D) based on where the centroid fell. Because not all Census tracts had a HOLC grade due to city development, Census tracts without a HOLC grade were removed from the data set for a separate HOLC grade analysis (see below).
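As an illustration of the quoted definition, the following is a minimal Python sketch of the classic two-group dissimilarity index. It is not the authors' R workflow, and the article does not give the exact multi-group formula used. The function name `dissimilarity_index`, the example tract counts, and the idea of reading each tract's term as a tract-level score are all assumptions made for illustration.

```python
def dissimilarity_index(group_a, group_b):
    """Two-group dissimilarity index across tracts.

    group_a, group_b: per-tract population counts for two groups
    (hypothetical inputs). Returns the city-wide index D in [0, 1]
    and each tract's contribution to D.
    """
    total_a = sum(group_a)
    total_b = sum(group_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each group needs a non-zero city-wide total")
    # Each tract contributes half the absolute gap between its share of
    # group A and its share of group B.
    contributions = [
        0.5 * abs(a / total_a - b / total_b)
        for a, b in zip(group_a, group_b)
    ]
    return sum(contributions), contributions


# Hypothetical three-tract city: the first tract holds most of group A.
d_index, per_tract = dissimilarity_index([80, 15, 5], [10, 30, 60])
```

Under this formulation, D is the share of either group that would need to move tracts for both groups to be evenly distributed; a multi-group version would generalize the same idea.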
Statistical analysis
We sought to understand if the number of reported observations per Census tract differed from the number of expected observations. We did this by calculating the expected number of observations for each city based on total area of the city as well as human population density across each city. To do this, we took the total observations within a city and divided it by the total area within the city. We then repeated this step for the total human population within a city. This approach yielded an expected amount per meter squared and per person, respectively. Furthermore, we used the size and population density of each Census tract to get the expected number of observations per Census tract for area and population density. Next, we used a Wilcoxon signed-rank test to determine if there were significant differences in the reported and expected observations in each city.
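The expected-count step described above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the authors' code: the function name `expected_counts` and the toy tract values are hypothetical, and tract population is used here as a stand-in for the population-based control.

```python
def expected_counts(reported, areas, populations):
    """Expected observations per tract under uniform city-wide rates.

    reported: observations reported in each tract
    areas: tract areas (any consistent unit)
    populations: tract human population counts
    """
    total_obs = sum(reported)
    rate_per_area = total_obs / sum(areas)          # observations per unit area
    rate_per_person = total_obs / sum(populations)  # observations per person
    expected_by_area = [rate_per_area * a for a in areas]
    expected_by_pop = [rate_per_person * p for p in populations]
    return expected_by_area, expected_by_pop


# Hypothetical three-tract city with 100 observations in total.
reported = [90, 10, 0]
exp_area, exp_pop = expected_counts(
    reported, areas=[1.0, 1.0, 2.0], populations=[100, 300, 600]
)
```

By construction, both expected series sum to the city total, so the reported and expected values share the same mean and differ only in how observations are spread across tracts. The paired reported and expected values could then be passed to a signed-rank test such as `scipy.stats.wilcoxon`.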
To understand if each social variable was associated with the difference between the reported and expected observations, we subtracted the expected number of observations from the reported observations while controlling for area and human population density to yield a mismatch value. We then ran generalized linear models on this mismatch value to examine (1) if the social factor had a significant effect on the mismatch value and (2) which social factor was most associated with the mismatch of observations (via model selection). We extracted the beta estimates (β) and p values from each model. A positive estimate value would indicate that there were more observations than expected, and a negative estimate value would indicate there were fewer observations than expected. We built five generalized linear models (GLMs) with a Gaussian distribution: (1) an income model, (2) a percentage of White people model, (3) a segregation model, (4) a global model with race, income, and segregation, and (5) a null model where fixed effects were omitted. In each model, our response variable was the mismatch of observations, and the fixed effect was the social variable of interest (i.e., percentage of White people, income, or segregation). We did not include Census tract area or human population density as an offset variable as we had controlled for area and human population density when calculating the mismatch in observations. We used AIC model selection to identify the best-performing results. When the ΔAIC between two or more models was <2, we used the “performance” package (Lüdecke et al. 2021) to generate a performance score and select the top model.
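A minimal Python sketch of the mismatch models follows, assuming a single predictor and treating the Gaussian GLM with an identity link as ordinary least squares. The function name `fit_gaussian_aic` and the five-tract income and mismatch values are hypothetical, and the AIC here counts the residual variance as an estimated parameter, which is one common convention rather than a confirmed detail of the authors' analysis.

```python
import math


def fit_gaussian_aic(y, x=None):
    """Fit y ~ 1 (null model) or y ~ x by least squares.

    Returns the slope (None for the null model) and the Gaussian AIC.
    """
    n = len(y)
    mean_y = sum(y) / n
    if x is None:
        slope = None
        fitted = [mean_y] * n
        n_coef = 1  # intercept only
    else:
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        fitted = [intercept + slope * xi for xi in x]
        n_coef = 2  # intercept and slope
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    # Gaussian log-likelihood at the maximum-likelihood variance rss / n.
    log_lik = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    k = n_coef + 1  # assumption: the residual variance counts as a parameter
    return slope, 2 * k - 2 * log_lik


# Hypothetical tract-level data: mismatch = reported - expected observations.
income = [30000, 50000, 70000, 90000, 110000]
mismatch = [-40, -10, 5, 20, 45]
beta_income, aic_income = fit_gaussian_aic(mismatch, income)
_, aic_null = fit_gaussian_aic(mismatch)
```

In this toy example the slope is positive, meaning higher-income tracts have more observations than expected, and the income model has the lower AIC, so it would be preferred over the null model.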
As most cities have expanded past their original HOLC maps, we re-ran the above models using only the Census tracts located within HOLC-graded areas and added a HOLC model to see if HOLC grade outperformed other social variables. To understand if there were differences in reported and expected observations per grade, we constructed a GLM where observation type (i.e., reported observations, expected observations per area, and expected observations per person) was the fixed effect, and our response variable was the number of observations. We then extracted the estimated marginal means for the reported, expected per area, and expected per person observations and performed a Tukey–Kramer post hoc test between each type of observation to investigate if there were significant differences. We report the β estimate and p value.
RESULTS
Oakland
For eBird data, we found significant differences between the number of reported observations (4219.164 ± 21232.60) compared with what was expected (Wilcoxon’s p < 0.001), after controlling for both area (17911.171 ± 61910.02) and human population density (17911.171 ± 7321.59) (Fig. 2). For Oakland’s eBird data, we found that income was our best-performing model when we controlled for area (Table 1), although this did not differ significantly from our null model (p = 0.080). Similarly, segregation was our best-performing model when we controlled for population density (Table 1), although this did not differ significantly from our null model (p = 0.129). When we re-ran models with Census tracts that fall within previously HOLC-graded neighborhoods, our HOLC model was the best-performing model when controlling for area (p < 0.05), whereas income was our best-performing model when controlling for population density (p < 0.001) (Table 1).
For income, we did not find significant differences between reported and expected observations when we controlled for area (β = -0.1800, p = 0.082) or population density (β = 0.0408, p = 0.282) (Fig. 2A; Table 2). For race, based on the percentage of White people, we did not find significant differences between the observed and expected, both for area (β = -195.63, p = 0.408) and population density (β = 125.30, p = 0.145) (Fig. 2B; Table 2). For segregation, we did not find significant differences between the observed and expected, both for area (β = 5033.3, p = 0.862) and population density (β = 15908, p = 0.131) (Fig. 2C; Table 2). Lastly, for HOLC grades, we did not find significant differences between observed and expected for grades A (area: β = -11954, p = 0.1410; population density: β = -8700, p = 0.335) or B (area: β = -4105, p = 0.815; population density: β = -11820, p = 0.199) (Fig. 2D; Table 2). We found that grade C had significantly fewer observations than expected when we controlled for area (β = -4844, p < 0.001) and population density (β = -18169, p < 0.001) (Fig. 2D; Table 2). Similarly, we found that grade D had significantly fewer observations than expected when we controlled for area (β = -7775, p < 0.001) and population density (β = -17580, p < 0.001) (Fig. 2D; Table 2).
For iNaturalist data, we found a significant difference between the number of reported observations (523.0959 ± 1544.027) compared with what was expected after controlling for human population density (527.5822 ± 1823.590) (Wilcoxon’s p < 0.001), but not when we controlled for area (527.5822 ± 215.661) (Wilcoxon’s p = 0.156) (Fig. 2). We found that our null model performed best when we controlled for area, whereas when we controlled for population density, race was our best-performing model (p < 0.01) (Table 1). When we re-ran models with Census tracts that fall within previously HOLC-graded neighborhoods, income was our best-performing model when we controlled for area and population density (p < 0.001) (Table 1).
For income, we did not find significant differences between reported and expected observations when we controlled for area (β = -0.0011, p = 0.768), but when we controlled for population density, we found that there were significantly more observations than expected (β = 0.0054, p < 0.05) (Fig. 2E; Table 2). For race, based on the percentage of White people, we did not find significant differences between reported and expected observations when we controlled for area (β = 6.102, p = 0.484), but when we controlled for population density, we found that there were significantly more observations than expected (β = 15.555, p < 0.01) (Fig. 2F; Table 2). For segregation, we did not find significant differences between the observed and expected, both for area (β = -396.4, p = 0.711) and population density (β = -76.03, p = 0.917) (Fig. 2G; Table 2). Lastly, for HOLC grades, we did not find significant differences between observed and expected for Grades A (area: β = 608.1, p = 0.493; population density: β = 704.0, p = 0.392), or B (area: β = 556, p = 0.477; population density: β = 329, p = 0.769) (Fig. 2H; Table 2). For grade C, we did not find significant differences between reported and expected observations when we controlled for area (β = 27.4, p = 0.726), but when we controlled for population density, we found that there were significantly fewer observations than expected (β = -365.1, p < 0.001) (Fig. 2H; Table 2). Additionally, for grade D, we did not find significant differences between reported and expected observations when we controlled for area (β = 11.6, p = 0.993), but when we controlled for population density, we found that there were significantly fewer observations than expected (β = -277.3, p < 0.05) (Fig. 2H; Table 2).
St. Louis
For eBird data, we found significant differences between the number of reported observations (3091.317 ± 15903.711) compared with what was expected (Wilcoxon’s p < 0.001), after controlling for both area (3091.163 ± 2529.654) and human population density (3091.163 ± 1291.985) (Fig. 3). We found that race was our best-performing model when controlling for area (p < 0.05) and population density (Table 1), though our population density model did not significantly differ from our null (p = 0.091). When we re-ran models with Census tracts that fall within previously HOLC-graded neighborhoods, race was still our best-performing model when controlling for area and population density (Table 1), although neither model differed significantly from our null model (area: p = 0.089; population density: p = 0.135).
For income, we did not find significant differences between the observed and expected when we controlled for area (β = 0.1525, p = 0.072) or population density (β = 0.1157, p = 0.193) (Fig. 3A; Table 2). For race, based on the percentage of White people, we found that there were significantly more observations than expected when we controlled for area (β = 98.51, p < 0.05), but not population density (β = 82.90, p = 0.094) (Fig. 3B; Table 2). For segregation, we did not find significant differences between the observed and expected, both for area (β = -4362, p = 0.489) and population density (β = -2725.6, p = 0.679) (Fig. 3C; Table 2). Lastly, for HOLC grades, we did not find significant differences between observed and expected for grades A (area: β = 5594, p = 0.7040; population density: β = 4151, p = 0.823) or B (area: β = 1177, p = 0.842; population density: β = 362, p = 0.984) (Fig. 3D; Table 2). We found that grade C had significantly fewer observations than expected when we controlled for area (β = -2219, p < 0.001) and population density (β = -2826, p < 0.001) (Fig. 3D; Table 2). We similarly found that grade D had significantly fewer observations than expected when we controlled for area (β = -3137, p < 0.001) and population density (β = -2791, p < 0.001) (Fig. 3D; Table 2).
For iNaturalist data, we found significant differences between the number of reported observations (333.8077 ± 1235.9405) compared with what was expected after controlling for both area (352.4423 ± 1235.9405) and human population density (352.4423 ± 147.3071) (Wilcoxon’s p < 0.001) (Fig. 3). We found that race was our best-performing model when controlling for area (p < 0.05) and population density (Table 1), although our population density model did not significantly differ from our null (p = 0.088). When we re-ran models with Census tracts that fall within previously HOLC-graded neighborhoods, race was still our best-performing model when we controlled for area (p < 0.01) and population density (p < 0.05) (Table 1).
For income, we found that there were significantly more observations than expected when we controlled for area (β = 124.4, p < 0.05), but not population density (β = 0.0144, p = 0.128) (Fig. 3E; Table 2). For race, based on the percentage of White people, we found that there were significantly more observations than expected when we controlled for area (β = 8.233, p < 0.05), but not for population density (β = 6.453, p = 0.091) (Fig. 3F; Table 2). For segregation, we did not find significant differences between the observed and expected, both for area (β = -577.2, p = 0.225) and population density (β = -390.7, p = 0.443) (Fig. 3G; Table 2). Lastly, for HOLC grades, we did not find significant differences between observed and expected for grades A (area: β = 336, p = 0.394; population density: β = 172, p = 0.775) or B (area: β = 30.6, p = 0.927; population density: β = -62.3, p = 0.732) (Fig. 3H; Table 2). We found that grade C had significantly fewer observations than expected when we controlled for area (β = -185.6, p < 0.001) and population density (β = -254.8, p < 0.001) (Fig. 3H; Table 2). Additionally, we found that grade D had significantly fewer observations than expected when we controlled for area (β = -171.2, p < 0.05), but not when we controlled for population density (β = -131.7, p = 0.088) (Fig. 3H; Table 2).
Baltimore
In Baltimore eBird data, we found significant differences between the number of reported observations (2855.465 ± 12858.78) compared with what was expected (Wilcoxon’s p < 0.001) after controlling for both area (11177.289 ± 12156.62) and human population density (11487.449 ± 5159.03) (Fig. 4). We found that income was our best-performing model when controlling for area (p < 0.01) and population density (p < 0.001) (Table 1). When we re-ran models with Census tracts that fall within previously HOLC-graded neighborhoods, our HOLC model was the best-performing model when we controlled for area and population density (p < 0.001) (Table 1).
For income, we found that there were significantly more observations than expected when we controlled for area (β = 0.0944, p < 0.01) and population density (β = 0.1056, p < 0.001) (Fig. 4A; Table 2). For race, based on the percentage of White people, we did not find significant differences between reported and expected observations when we controlled for area (β = 58.33, p = 0.164), but when we controlled for population density, we found that there were significantly more observations than expected (β = 111.27, p < 0.01) (Fig. 4B; Table 2). For segregation, we did not find significant differences between reported and expected observations when we controlled for area (β = -7276, p = 0.103), but when we controlled for population density, we found that there were significantly more observations than expected (β = 13454, p < 0.01) (Fig. 4C; Tables 1, 2). Lastly, for HOLC grades, we found that every grade had significantly fewer observations than expected. Grade A had significantly fewer observations than expected when we controlled for area (β = -14284, p < 0.001) and population density (β = -16801, p < 0.001) (Fig. 4D; Tables 1, 2). Grade B had significantly fewer observations than expected when we controlled for area (β = -9742, p < 0.001) and population density (β = -11007, p < 0.001) (Fig. 4D; Tables 1, 2). Grade C had significantly fewer observations than expected when we controlled for area (β = -8063, p < 0.001) and population density (β = -10066, p < 0.001) (Fig. 4D; Tables 1, 2). Similarly, grade D had significantly fewer observations than expected when we controlled for area (β = -4007, p < 0.001) and population density (β = -8028, p < 0.001) (Fig. 4D; Tables 1, 2).
For iNaturalist data, we found significant differences between the number of reported observations (188.4495 ± 460.46516) compared with what was expected (Wilcoxon’s p < 0.001), after controlling for area (206.5561 ± 224.65422) and human population density (212.2879 ± 95.33878) (Fig. 4). We found that income was our best-performing model when controlling for area (p < 0.001) and that race was our best-performing model when controlling for population density (p < 0.001) (Table 1). When we re-ran models with Census tracts that fall within previously HOLC-graded neighborhoods, race was the best-performing model when we controlled for area and population density (p < 0.001) (Table 1).
For income, we found that there were significantly more observations than expected when we controlled for area (β = 0.0038, p < 0.001) and population density (β = 0.0040, p < 0.001) (Fig. 4E; Tables 1, 2). For race, based on the percentage of White people, we found that there were significantly more observations than expected when we controlled for area (β = 4.292, p < 0.001) and population density (β = 5.270, p < 0.001) (Fig. 4F; Tables 1, 2). For segregation, we found that there were significantly more observations than expected when we controlled for area (β = 321.14, p < 0.05) and population density (β = 435.30, p < 0.01) (Fig. 4G; Tables 1, 2). Lastly, for HOLC grades, we did not find significant differences between observed and expected for grades A (area: β = -64.5, p = 0.634; population density: β = -111.0, p = 0.273), B (area: β = -84.3, p = 0.171; population density: β = -107.7, p = 0.059), and D (area: β = 90.0, p = 0.346; population density: β = 15.7, p = 0.968) (Fig. 4H; Tables 1, 2). Grade C had significantly fewer observations than expected when we controlled for area (β = -79.3, p < 0.05) and population density (β = -116.3, p < 0.001) (Fig. 4H; Tables 1, 2).
DISCUSSION
Our results provide additional empirical support suggesting that income, the percentage of White people, and historical redlining are associated with disparities in eBird and iNaturalist data (Perkins 2020, Grade et al. 2022, Ellis-Soto et al. 2023). Importantly, our study used an integrated approach, assessing the impacts of segregation, which more accurately reflects ethno-racial division in the USA, with other sociodemographic variables. Moreover, our multi-city approach integrated multiple social variables to examine both the differences between reported and expected data and the variable most associated with observed differences, a first among similar studies. This integrated, multi-city, and multi-factorial approach allowed us to disentangle the relationships among social factors and contributory biodiversity data. First, we found that income was uniformly associated across all three cities with differences between reported and expected observations in iNaturalist data, although this varied depending on whether we controlled for area or population density. Conversely, we only found significant differences between reported and expected eBird data in Baltimore when considering income. Second, we found variation at the city level in the relationship between the percentage of White people and the difference between reported and expected eBird and iNaturalist data. Third, we found the effect of segregation to be city dependent, with only Baltimore showing a significant relationship between segregation and the difference between reported and expected eBird and iNaturalist data. Lastly, we found an association between HOLC grades and the difference between reported and expected eBird and iNaturalist data, with grades C and D consistently having fewer reported observations than we would expect.
Our results demonstrate that city-level differences in histories and contemporary social demography are important for understanding disparities in data and the conclusions drawn when using these data to understand patterns of reported biodiversity.
Our results support previous conclusions in the literature that highlight the connections between income and disparities in eBird and iNaturalist data (Perkins 2020, Grade et al. 2022). Although eBird is structured for birders (Wood et al. 2011, Rosenblatt et al. 2022), who typically have relatively high household income (Eubanks et al. 2004), we only detected a significant relationship between reported and expected observations based on income in Baltimore. In contrast, although iNaturalist is structured for general users (Aristeidou et al. 2021a, b), we found that across all three cities, higher-income Census tracts had significantly more reported observations than expected. Participation in iNaturalist is often dependent on having the time, access, and resources to engage with and identify local biodiversity, similar to eBird. Moreover, higher-income Census tracts in the USA typically have more access to environmental amenities (e.g., more street trees and greenspaces that are typically larger in size) (Schwarz et al. 2015, Rigolon 2016, Fan et al. 2019), which can increase the spatial overlap between people and wildlife (Belcher et al. 2019, Magle et al. 2021). This overlap can provide more opportunities for individuals to sample using both the eBird and iNaturalist apps.
Differences between the number of reported and expected observations were strongly associated with the percentage of White people in a Census tract, similar to results presented by Blake et al. (2020) and Mahmoudi et al. (2022). However, this difference between expected and reported observations based on the percentage of White people varied between eBird and iNaturalist data. Gaps in data significantly differed across both data sets only in St. Louis and Baltimore, whereas in Oakland, we only found significant differences in iNaturalist data. This may be a result of differences in the history of segregation and racialized policies in each city. For example, in St. Louis, racial covenants, which prevented the rental or sale of houses in majority-White neighborhoods to Black people and other People of Color, were legally written into many house deeds until 1948, when they were outlawed (Gordon 2023). Despite the ban, racialized housing practices have kept the city racially divided (Salter 2022), leading to the Delmar Divide, a road that runs east–west and divides the city into the predominantly Black north city and the predominantly White south city. This line is evident in the spatial distribution of both eBird and iNaturalist data, with the southern half of the city having more observations than the northern half. Similarly, in Baltimore, a combination of racial violence and racialized policies has created highly segregated neighborhoods (Grove et al. 2018, Pickett et al. 2023), a legacy that is still evident today and known as “the Black Butterfly” because the concentration of Black residents in the northeastern and northwestern parts of the city causes the racial distribution to resemble butterfly wings (Brown 2021). We found that areas with fewer White people, i.e., the “Black Butterfly,” were reflected in the spatial distribution of eBird and iNaturalist data.
Throughout all three cities, racialized histories and policies have led to differences across races, particularly between Black and White people, in comfort within recreational outdoor spaces (Byrne 2012, Finney 2014). Moreover, race-based biases in governmental processes have led to further disparities in access to vegetation, green spaces, and tree canopy cover in these cities (Grove et al. 2018, Nesbitt et al. 2019, Estien et al. 2024b), all of which are vital environmental characteristics that can encourage sampling while promoting biodiversity.
Although Oakland, St. Louis, and Baltimore all visibly show ethno-racial clustering and are widely known for being segregated, we only found segregation to have a significant effect on the difference between reported and expected observations for eBird or iNaturalist data in Baltimore. In Oakland and St. Louis, sharp segregation may be dampening the relationship seen with the percentage of White people, as areas with “high” segregation may contain large concentrations of any ethnic group. Thus, in a city like St. Louis, which is highly segregated, the bulk of data in the southern portion of the city (i.e., predominantly White people) and the lack of data from the northern part of the city (i.e., predominantly Black people) are both considered “high” segregation. This arrangement of people and data likely dilutes the potential relationships between segregation as a metric and data disparities, even though clear disparities exist in all three cities where racial segregation is present. However, we did find a relationship between segregation and data disparities in Baltimore. Census tracts that are more segregated in Baltimore have significantly more observations than expected on both platforms, which is likely driven by high-income and White participants. Thus, to fully understand the role of segregation in contributory data, future research should compare Census tracts that are highly segregated and predominantly White with Census tracts that are highly segregated and predominantly non-White.
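The dilution argument above can be made concrete with a small sketch. It uses a deliberately simple, hypothetical segregation proxy (how far a tract's percentage of White people departs from an even 50/50 mix), which is an assumption for illustration only and not the segregation metric used in this study; the tract values are likewise invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical tracts: two predominantly Black, two predominantly White,
# and two evenly mixed (made-up numbers, not data from this study).
pct_white = [0.05, 0.10, 0.95, 0.90, 0.50, 0.50]
# Reported minus expected observations for the same tracts.
difference = [-40, -30, 40, 30, 0, 0]

# Hypothetical proxy: 0 for an even mix, close to 1 when one group dominates.
# Predominantly Black and predominantly White tracts both score "high".
segregation = [abs(p - 0.5) * 2 for p in pct_white]

r_race = pearson(pct_white, difference)          # strongly positive
r_segregation = pearson(segregation, difference)  # approximately zero

print(f"percentage White vs difference: r = {r_race:.2f}")
print(f"segregation vs difference:      r = {r_segregation:.2f}")
```

In this toy arrangement, highly segregated tracts sit at both extremes of the data gap, so their opposite signs cancel and the segregation proxy shows almost no linear relationship with the difference, even though the disparity by race is pronounced.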
In alignment with recent work highlighting the role of historical redlining via HOLC maps on eBird and iNaturalist data (Ellis-Soto et al. 2023, Estien et al. 2024a), we found that HOLC grades differed in their data disparities. Yet, when we restricted our analysis to Census tracts that were previously assigned HOLC grades, HOLC grade outperformed other social variables only in Oakland and Baltimore for eBird data. This result suggests that other contemporary factors, such as the percentage of White people and income, are more important in understanding data disparities in these three cities. Nevertheless, we found that in all three cities, grades C and D consistently had significantly fewer reported observations than expected. Surprisingly, for eBird data in Baltimore, we also found that grades A and B had significantly fewer reported observations than expected. This result may be attributed to the fact that although grades A and B typically have higher-income individuals (Appel and Nickerson 2016), in Baltimore, previous HOLC grades do not necessarily follow current patterns of income. For instance, neighborhoods formerly graded A and B include both high- and low-income Census tracts. As highlighted in our results, eBird data in Baltimore are associated with income, with higher-income areas having more observations than expected. Thus, although redlined neighborhoods are considered “coldspots” for bird biodiversity data (Ellis-Soto et al. 2023), in some cities such as Baltimore, greenlined neighborhoods may have similar data disparities as a result of more contemporary social processes, such as segregation, that drive the spatial distribution of residents and their associated income (Pickett et al. 2023).
Our analysis suggests which social variables may be important for understanding variation in eBird and iNaturalist data and which variables influence data disparities for these platforms. For instance, in St. Louis, the percentage of White people was consistently the best-performing model across both data sets when we controlled for area and human population density, suggesting that variation in data for these platforms can be attributed to the proportion of White people within a Census tract. However, we only detected a significant effect of the percentage of White people on the difference between reported and expected observations for St. Louis’ eBird data when controlling for area, suggesting that race is linked to landscape variables that could influence users’ opportunity to report bird biodiversity data (e.g., access to greenspace; Dai 2011, Kephart 2022). Furthermore, we detected a significant negative effect of poor HOLC grades on data disparities, with grades C and D having significantly fewer reported observations than expected when controlling for both area and human population density. Overall, this suggests that for St. Louis eBird data, outreach initiatives that reach non-White ethno-racial groups or neighborhoods that received poor HOLC grades may prove more effective than initiatives organized around income groups or segregation metrics. In contrast, for Oakland’s eBird data, the best-performing model differed depending on whether we controlled for area or population density. Yet, only HOLC grades C and D were significantly associated with Oakland’s eBird data regardless of which landscape variable we controlled for. These results suggest that for Oakland’s eBird data, tailoring outreach initiatives for Census tracts that were previously graded C or D may do more to reduce data disparities than targeting other variables, such as income or race.
While acknowledging that many factors interact to influence the spatial distribution of eBird and iNaturalist data (Carlen et al. 2024), our results highlight that contemporary and historical factors can have a strong influence on the mismatch between reported and expected observations. This can lead to specific areas within cities, such as low-income areas and communities of color, having more and/or larger gaps in biodiversity data (Chapman et al. 2024). However, these gaps in data can be overcome by local initiatives. For instance, Oakland has relatively fewer disparities between the amount of reported and expected observations than St. Louis and Baltimore. This may be due to local efforts in science education as well as intentional efforts to increase contributory science data collection across the city by local organizations, such as Rotary Nature Center, Oakland Shoreline Leadership Academy, and the California Academy of Sciences. Additionally, institutional efforts, such as the City Nature Challenge, which was created by the California Academy of Sciences and the Natural History Museum of Los Angeles County and engages community members in bioblitzes, can produce spatially rich biodiversity data while motivating community members to use contributory platforms outside of bioblitz events. Thus, to ameliorate these biases, the solution is not simply to sample in these areas, but rather to intentionally engage with communities via recruitment, education, or workshops that increase their understanding of these platforms (Perkins et al. 2023). Furthermore, community empowerment and reduction in societal inequities (e.g., disparities in income) are critical in providing individuals with the time, resources, and agency to engage with these data platforms and participate in local biodiversity sampling efforts.
CONCLUSION
Contributory data provide researchers with broad spatial and temporal coverage to ask powerful and relevant ecological questions. However, the participant-led nature of applications like eBird and iNaturalist can create spatial biases in biodiversity data. In this article, we demonstrated that the differences between reported and expected observations in eBird and iNaturalist are associated with income, race, segregation, and HOLC grades, with factors such as race and redlining consistently shaping data disparities across cities and platforms. We also show that the influence of social factors on data disparities can be city specific, as seen with segregation, as well as platform specific, as seen with income. Thus, despite differences in sampling protocols between these platforms, both are subject to biases as a result of societal inequities. Our results show that although each city has distinct societal and ecological features, societal inequity permeates each city to shape the uptake of two of the largest sources of biodiversity data. Understanding the role of societal features (e.g., socioeconomics, segregation) in biodiversity data, and how this varies by city and platform, is crucial for reducing the uneven biases present in these data. This work, along with other research highlighting gaps in contributory science data, emphasizes that the solution to these biases lies in locally built programs that aim to empower neighborhoods to collect their own biodiversity data.
AUTHOR CONTRIBUTIONS
COE and EJC conceived the manuscript. COE and EJC collected and analyzed the data. COE and EJC made the figures. COE and EJC wrote the manuscript. COE, EJC, and CJS edited the manuscript.
ACKNOWLEDGMENTS
We thank the Schell Lab and collaborators who engaged in conversations that inspired this paper. We thank the Tyson Research Center at Washington University in St. Louis, specifically E. Biro and S. Adalsteinsson and their students for early conversations about this manuscript. We would also like to thank L. Verde Arregoitia and J. Chang for their help with mapping and R code. We would like to thank the anonymous reviewers for suggestions that greatly improved this manuscript. COE was supported by the University of California, Berkeley’s Chancellor Fellowship and the National Science Foundation Graduate Research Fellowship under Grant No. DGE-2146752. EJC was supported by the National Science Foundation under Grant No. DBI-2109587 and the Living Earth Collaborative at Washington University in St. Louis. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
DATA AVAILABILITY
The data and code used in our manuscript are freely available at https://zenodo.org/records/7183392. Data were derived from the following public domains: (1) Home Owners’ Loan Corporation maps (Nelson et al. 2020), (2) race and income data (U.S. Census Bureau 2022), (3) iNaturalist observations (iNaturalist 2023), and (4) eBird observations (eBird 2023).
LITERATURE CITED
Appel, I., and J. Nickerson. 2016. Pockets of poverty: the long-term effects of redlining. Social Science Research Network (SSRN) preprint website. https://doi.org/10.2139/ssrn.2852856
Aristeidou, M., C. Herodotou, H. L. Ballard, L. Higgins, R. F. Johnson, A. E. Miller, A. N. Young, and L. D. Robinson. 2021a. How do young community and citizen science volunteers support scientific research on biodiversity? The case of iNaturalist. Diversity 13: 318. https://doi.org/10.3390/d13070318
Aristeidou, M., C. Herodotou, H. L. Ballard, A. N. Young, A. E. Miller, L. Higgins, and R. F. Johnson. 2021b. Exploring the participation of young citizen scientists in scientific research: the case of iNaturalist. PLoS One 16: e0245682. https://doi.org/10.1371/journal.pone.0245682
Belcher, R. N., K. R. Sadanandan, E. R. Goh, J. Y. Chan, S. Menz, and T. Schroepfer. 2019. Vegetation on and around large-scale buildings positively influences native tropical bird abundance and bird species richness. Urban Ecosystems 22:213-225. https://doi.org/10.1007/s11252-018-0808-0
Blake, C., A. Rhanor, and C. Pajic. 2020. The demographics of citizen science participation and its implications for data quality and environmental justice. Citizen Science: Theory and Practice 5(1): 21. https://doi.org/10.5334/cstp.320
Blumstein, D. T., M. Alberti, J. Beninde, R. V. Blakey, J. R. Burger, D. S. Cooper, C. A. Niesner, C. J. Schell, M. Soga, and K. Uchida. 2023. Global urban biodiversity and the importance of scale. Frontiers in Conservation Science 4: 1149088. https://doi.org/10.3389/fcosc.2023.1149088
Brown, L. T. 2021. The black butterfly: the harmful politics of race and space in America. JHU Press, Baltimore, Maryland, USA.
Byrne, J. 2012. When green is white: the cultural politics of race, nature and social exclusion in a Los Angeles urban national park. Geoforum 43:595-611. https://doi.org/10.1016/j.geoforum.2011.10.002
Callaghan, C. T., A. G. Poore, T. Mesaglio, A. T. Moles, S. Nakagawa, C. Roberts, J. J. Rowley, A. Vergés, J. H. Wilshire, and W. K. Cornwell. 2021. Three frontiers for the future of biodiversity research using citizen science data. BioScience 71:55-63. https://doi.org/10.1093/biosci/biaa131
Calzada Preston, C. E., and S. Pruett-Jones. 2021. The number and distribution of introduced and naturalized parrots. Diversity 13: 412. https://doi.org/10.3390/d13090412
Carlen, E. J., C. O. Estien, T. Caspi, D. Perkins, B. R. Goldstein, S. E. Kreling, Y. Hentati, T. D. Williams, L. A. Stanton, S. Des Roches, R. Johnson, A. N. Young, C. Cooper, and C. J. Schell. 2024. A framework for contextualizing social-ecological biases in contributory science data. People and Nature 6(2):377-390. https://doi.org/10.1002/pan3.10592
Casey, J. A., R. Morello-Frosch, D. J. Mennitt, K. Fristrup, E. L. Ogburn, and P. James. 2017. Race/ethnicity, socioeconomic status, residential segregation, and spatial variation in noise exposure in the contiguous United States. Environmental Health Perspectives 125: 077017. https://doi.org/10.1289/EHP898
Chandler, M., L. See, K. Copas, A. M. Bonde, B. C. López, F. Danielsen, J. K. Legind, S. Masinde, A. J. Miller-Rushing, and G. Newman. 2017. Contribution of citizen science towards international biodiversity monitoring. Biological Conservation 213:280-294. https://doi.org/10.1016/j.biocon.2016.09.004
Chapman, M., B. R. Goldstein, C. J. Schell, J. S. Brashares, N. H. Carter, D. Ellis-Soto, H. O. Faxon, J. E. Goldstein, B. S. Halpern, and J. Longdon. 2024. Biodiversity monitoring for a just planetary future. Science 383:34-36. https://doi.org/10.1126/science.adh8874
Dai, D. 2011. Racial/ethnic and socioeconomic disparities in urban green space accessibility: where to intervene? Landscape and Urban Planning 102(4):234-244. https://doi.org/10.1016/j.landurbplan.2011.05.002
eBird. 2023. Homepage. https://ebird.org/home
Ellis-Soto, D., M. Chapman, and D. H. Locke. 2023. Historical redlining is associated with increasing geographical disparities in bird biodiversity sampling in the United States. Nature Human Behaviour 7(11):1869-1877. https://doi.org/10.1038/s41562-023-01688-5
Estien, C. O., M. Fidino, C. E. Wilkinson, R. Morello-Frosch, and C. J. Schell. 2024a. Historical redlining is associated with disparities in wildlife biodiversity in four California cities. Proceedings of the National Academy of Sciences 121(25): e2321441121. https://doi.org/10.1073/pnas.2321441121
Estien, C. O., C. E. Wilkinson, R. Morello-Frosch, and C. J. Schell. 2024b. Historical redlining is associated with disparities in environmental quality across California. Environmental Science and Technology Letters 11:54-59. https://doi.org/10.1021/acs.estlett.3c00870
Eubanks, T. L. Jr., J. R. Stoll, and R. B. Ditton. 2004. Understanding the diversity of eight birder sub-populations: socio-demographic characteristics, motivations, expenditures and net benefits. Journal of Ecotourism 3(3):151-172. https://doi.org/10.1080/14664200508668430
Fan, C., M. Johnston, L. Darling, L. Scott, and F. H. Liao. 2019. Land use and socio-economic determinants of urban forest structure and diversity. Landscape and Urban Planning 181:10-21. https://doi.org/10.1016/j.landurbplan.2018.09.012
Finney, C. 2014. Black faces, white spaces: reimagining the relationship of African Americans to the great outdoors. University of North Carolina Press Books, Chapel Hill, North Carolina, USA. https://doi.org/10.5149/northcarolina/9781469614489.001.0001
Fishback, P., J. Rose, K. A. Snowden, and T. Storrs. 2022. New evidence on redlining by federal housing programs in the 1930s. Journal of Urban Economics 141: 103462. https://doi.org/10.1016/j.jue.2022.103462
Gadsden, G. I., N. Golden, and N. C. Harris. 2023. Place-based bias in environmental scholarship derived from social-ecological landscapes of fear. BioScience 73(1):23-35. https://doi.org/10.1093/biosci/biac095
Gordon, C. 2023. Dividing the city: race-restrictive covenants and the architecture of segregation in St. Louis. Journal of Urban History 49:160-182. https://doi.org/10.1177/0096144221999641
Grade, A. M., N. W. Chan, P. Gajbhiye, D. J. Perkins, and P. S. Warren. 2022. Evaluating the use of semi-structured crowdsourced data to quantify inequitable access to urban biodiversity: a case study with eBird. PLoS One 17: e0277223. https://doi.org/10.1371/journal.pone.0277223
Grove, M., L. Ogden, S. Pickett, C. Boone, G. Buckley, D. H. Locke, C. Lord, and B. Hall. 2018. The legacy effect: understanding how segregation and environmental injustice unfold over time in Baltimore. Annals of the American Association of Geographers 108:524-537. https://doi.org/10.1080/24694452.2017.1365585
Hillier, A. E. 2003. Redlining and the Home Owners’ Loan Corporation. Journal of Urban History 29:394-420. https://doi.org/10.1177/0096144203029004002
Iceland, J. 2004. Beyond black and white: metropolitan residential segregation in multi-ethnic America. Social Science Research 33:248-271. https://doi.org/10.1016/S0049-089X(03)00056-5
iNaturalist. 2023. Homepage. https://www.inaturalist.org/home
Jesdale, B. M., R. Morello-Frosch, and L. Cushing. 2013. The racial/ethnic distribution of heat risk-related land cover in relation to residential segregation. Environmental Health Perspectives 121:811-817. https://doi.org/10.1289/ehp.1205919
Kelling, S., A. Johnston, A. Bonn, D. Fink, V. Ruiz-Gutierrez, R. Bonney, M. Fernandez, W. M. Hochachka, R. Julliard, and R. Kraemer. 2019. Using semistructured surveys to improve citizen science data for monitoring biodiversity. BioScience 69:170-179. https://doi.org/10.1093/biosci/biz010
Kephart, L. 2022. How racial residential segregation structures access and exposure to greenness and green space: a review. Environmental Justice 15(4):204-213. https://doi.org/10.1089/env.2021.0039
Kirchhoff, C., C. T. Callaghan, D. A. Keith, D. Indiarto, G. Taseski, M. K. Ooi, T. D. Le Breton, T. Mesaglio, R. T. Kingsford, and W. K. Cornwell. 2021. Rapidly mapping fire effects on biodiversity at a large-scale using citizen science. Science of the Total Environment 755: 142348. https://doi.org/10.1016/j.scitotenv.2020.142348
Lambert, M., and C. Schell. 2023. Urban biodiversity and equity: justice-centered conservation in cities. Oxford University Press, Oxford, UK. https://doi.org/10.1093/oso/9780198877271.001.0001
Locke, D. H., B. Hall, J. M. Grove, S. T. A. Pickett, L. A. Ogden, C. Aoki, C. G. Boone, and J. P. M. O’Neil-Dunne. 2021. Residential housing segregation and urban tree canopy in 37 U.S. cities. NPJ Urban Sustainability 1: 15. https://doi.org/10.1038/s42949-021-00022-0
Lüdecke, D., M. Ben-Shachar, I. Patil, P. Waggoner, and D. Makowski. 2021. Performance: an R package for assessment, comparison and testing of statistical models. Journal of Open Source Software 6: 3139. https://doi.org/10.21105/joss.03139
Magle, S. B., M. Fidino, H. A. Sander, A. T. Rohnke, K. L. Larson, T. Gallo, C. A. M. Kay, E. W. Lehrer, M. H. Murray, S. A. Adalsteinsson, A. A. Ahlers, W. J. B. Anthonysamy, A. R. Gramza, A. M. Green, M. J. Jordan, J. S. Lewis, R. A. Long, B. MacDougall, M. E. Pendergast, K. Remine, K. Conrad Simon, C. C. St. Clair, C. J. Shier, T. Stankowich, C. J. Stevenson, A. J. Zellmer, and C. J. Schell. 2021. Wealth and urbanization shape medium and large terrestrial mammal communities. Global Change Biology 27:5446-5459. https://doi.org/10.1111/gcb.15800
Mahmoudi, D., C. L. Hawn, E. H. Henry, D. J. Perkins, C. B. Cooper, and S. M. Wilson. 2022. Mapping for whom? Communities of color and the citizen science gap. ACME: An International Journal for Critical Geographies 21(4):372-388.
Maistrello, L., P. Dioli, M. Bariselli, G. L. Mazzoli, and I. Giacalone-Forini. 2016. Citizen science and early detection of invasive species: phenology of first occurrences of Halyomorpha halys in southern Europe. Biological Invasions 18:3109-3116. https://doi.org/10.1007/s10530-016-1217-z
McKinley, D. C., A. J. Miller-Rushing, H. L. Ballard, R. Bonney, H. Brown, S. C. Cook-Patton, D. M. Evans, R. A. French, J. K. Parrish, and T. B. Phillips. 2017. Citizen science can improve conservation science, natural resource management, and environmental protection. Biological Conservation 208:15-28. https://doi.org/10.1016/j.biocon.2016.05.015
Morello-Frosch, R. A. 2002. Discrimination and the political economy of environmental inequality. Environment and Planning C: Politics and Space 20(4):477-496. https://doi.org/10.1068/c03r
Morello-Frosch, R., and B. M. Jesdale. 2006. Separate and unequal: residential segregation and estimated cancer risks associated with ambient air toxics in U.S. metropolitan areas. Environmental Health Perspectives 114:386-393. https://doi.org/10.1289/ehp.8500
Nardone, A., K. E. Rudolph, R. Morello-Frosch, and J. A. Casey. 2021. Redlines and greenspace: the relationship between historical redlining and 2010 greenspace across the United States. Environmental Health Perspectives 129(1): 017006. https://doi.org/10.1289/EHP7495
Nelson, R. K., L. Winling, R. Marciano, N. Connolly, and E. L. Ayers. 2020. Mapping inequality: redlining in New Deal America. American Panorama: An Atlas of United States History. University of Richmond: Digital Scholarship Lab 17, 19.
Nesbitt, L., M. J. Meitner, C. Girling, S. R. J. Sheppard, and Y. Lu. 2019. Who has access to urban vegetation? A spatial analysis of distributional green equity in 10 U.S. cities. Landscape and Urban Planning 181:51-79. https://doi.org/10.1016/j.landurbplan.2018.08.007
Nilon, C. H. 2011. Urban biodiversity and the importance of management and conservation. Landscape and Ecological Engineering 7:45-52. https://doi.org/10.1007/s11355-010-0146-8
Pebesma, E., and R. Bivand. 2023. Spatial data science: with applications in R. Chapman and Hall/CRC, New York, New York, USA. https://doi.org/10.1201/9780429459016
Perkins, D. J. 2020. Blind spots in citizen science data: implications of volunteer biases in eBird data. Dissertation, North Carolina State University, Raleigh, North Carolina, USA.
Perkins, D. J., L. M. Nichols, and R. R. Dunn. 2023. Participatory science for equitable urban biodiversity research and practice. Chapter 7 in M. Lambert and C. Schell, editors. Urban biodiversity and equity: justice-centered conservation in cities. Oxford University Press, Oxford, UK. https://doi.org/10.1093/oso/9780198877271.003.0007
Pickett, S. T., J. M. Grove, C. G. Boone, and G. L. Buckley. 2023. Resilience of racialized segregation is an ecological factor: Baltimore case study. Buildings and Cities 4:783-800. https://doi.org/10.5334/bc.317
Putman, B. J., R. Williams, E. Li, and G. B. Pauly. 2021. The power of community science to quantify ecological interactions in cities. Scientific Reports 11: 3069. https://doi.org/10.1038/s41598-021-82491-y
Rigolon, A. 2016. A complex landscape of inequity in access to urban parks: a literature review. Landscape and Urban Planning 153:160-169. https://doi.org/10.1016/j.landurbplan.2016.05.017
Rosenblatt, C. J., A. A. Dayer, J. N. Duberstein, T. B. Phillips, H. W. Harshaw, D. C. Fulton, N. W. Cole, A. H. Raedeke, J. D. Rutter, and C. L. Wood. 2022. Highly specialized recreationists contribute the most to the citizen science project eBird. Ornithological Applications 124: duac008. https://doi.org/10.1093/ornithapp/duac008
Salter, J. 2022. Realtors apologizing for past discrimination, urging change. AP News. https://apnews.com/article/business-discrimination-race-and-ethnicity-racial-injustice-0fc1b75c4659e5014a55706033a52963
Schwarz, K., M. Fragkias, C. G. Boone, W. Zhou, M. McHale, J. M. Grove, J. O’Neil-Dunne, J. P. McFadden, G. L. Buckley, D. Childers, L. Ogden, S. Pincetl, D. Pataki, A. Whitmer, and M. L. Cadenasso. 2015. Trees grow on money: urban tree canopy cover and environmental justice. PLoS One 10: e0122051. https://doi.org/10.1371/journal.pone.0122051
Simkin, R. D., K. C. Seto, R. I. McDonald, and W. Jetz. 2022. Biodiversity impacts and conservation implications of urban land expansion projected to 2050. Proceedings of the National Academy of Sciences 119: e2117297119. https://doi.org/10.1073/pnas.2117297119
Theobald, E. J., A. K. Ettinger, H. K. Burgess, L. B. DeBey, N. R. Schmidt, H. E. Froehlich, C. Wagner, J. HilleRisLambers, J. Tewksbury, and M. A. Harsch. 2015. Global change and local solutions: tapping the unrealized potential of citizen science for biodiversity research. Biological Conservation 181:236-244. https://doi.org/10.1016/j.biocon.2014.10.021
U.S. Census Bureau. 2022. Annual estimates of the resident population for the United States, regions, states, District of Columbia, and Puerto Rico: April 1, 2020 to July 1, 2022. https://www.census.gov/data/tables/time-series/demo/popest/2020s-state-total.html
Walker, K., and M. Herman. 2021. tidycensus: load U.S. census boundary and attribute data as “tidyverse” and “Sf”-ready data frames. R package version 1. https://doi.org/10.32614/CRAN.package.tidycensus
Welvaert, M., and P. Caley. 2016. Citizen surveillance for environmental monitoring: combining the efforts of citizen science and crowdsourcing in a quantitative data framework. SpringerPlus 5: 1890. https://doi.org/10.1186/s40064-016-3583-5
Werenkraut, V., F. Baudino, and H. E. Roy. 2020. Citizen science reveals the distribution of the invasive harlequin ladybird (Harmonia axyridis Pallas) in Argentina. Biological Invasions 22:2915-2921. https://doi.org/10.1007/s10530-020-02312-7
Winton, R. S., N. Ocampo-Peñuela, and N. Cagle. 2018. Geo-referencing bird-window collisions for targeted mitigation. PeerJ 6: e4215. https://doi.org/10.7717/peerj.4215
Wood, C., B. Sullivan, M. Iliff, D. Fink, and S. Kelling. 2011. eBird: engaging birders in science and conservation. PLoS Biology 9: e1001220. https://doi.org/10.1371/journal.pbio.1001220
Wood, E. M., S. Esaian, C. Benitez, P. J. Ethington, T. Longcore, and L. Y. Pomara. 2023. Historical racial redlining and contemporary patterns of income inequality negatively affect birds, their habitat, and people in Los Angeles, California. Ornithological Applications 126(10): 5. https://doi.org/10.1093/ornithapp/duad044
Yang, D., A. Yang, J. Yang, R. Xu, and H. Qiu. 2021. Unprecedented migratory bird die-off: a citizen-based analysis on the spatiotemporal patterns of mass mortality events in the western United States. GeoHealth 5: e2021GH000395. https://doi.org/10.1029/2021GH000395
Zhang, G. 2020. Spatial and temporal patterns in volunteer data contribution activities: a case study of eBird. ISPRS International Journal of Geo-Information 9(10): 597. https://doi.org/10.3390/ijgi9100597
Table 1. Best-performing model for each city’s eBird and iNaturalist data. Each model was run adjusted for area and for human population density. Cells with asterisks (*) indicate that the model was significantly different from the null model. Note that in Oakland’s iNaturalist data model with all Census tracts adjusted for area, the null model was the most well-supported model.
City | Platform | All Census tracts: area | All Census tracts: human population density | Census tracts within HOLC grades: area | Census tracts within HOLC grades: human population density
Oakland | eBird | income | segregation | HOLC* | income*
Oakland | iNaturalist | null | percentage of White people* | income* | income*
St. Louis | eBird | percentage of White people* | percentage of White people | percentage of White people | percentage of White people
St. Louis | iNaturalist | percentage of White people* | percentage of White people | percentage of White people* | percentage of White people*
Baltimore | eBird | income* | income* | HOLC* | HOLC*
Baltimore | iNaturalist | income* | percentage of White people* | percentage of White people* | percentage of White people*
Table 2. Differences in reported vs. expected observations based on income, percentage of White people, segregation, and HOLC grade (A–D), controlled by using area and human population density. Cell values are p-values. The directionality of significance is denoted with (+) (more than expected) and (-) (fewer than expected).
City and platform | Control variable | Income | Percentage of White people | Segregation | Grade A | Grade B | Grade C | Grade D
Oakland eBird | Area | 0.082 | 0.408 | 0.862 | 0.141 | 0.815 | p < 0.001 (-) | p < 0.001 (-)
Oakland eBird | Human population density | 0.282 | 0.145 | 0.131 | 0.335 | 0.199 | p < 0.001 (-) | p < 0.001 (-)
Oakland iNaturalist | Area | 0.768 | 0.484 | 0.711 | 0.493 | 0.477 | 0.726 | 0.993
Oakland iNaturalist | Human population density | p < 0.05 (+) | p < 0.01 (+) | 0.917 | 0.3923 | 0.769 | p < 0.001 (-) | p < 0.05 (-)
St. Louis eBird | Area | 0.082 | p < 0.05 (+) | 0.489 | 0.704 | 0.842 | p < 0.001 (-) | p < 0.001 (-)
St. Louis eBird | Human population density | 0.282 | 0.094 | 0.679 | 0.823 | 0.984 | p < 0.001 (-) | p < 0.001 (-)
St. Louis iNaturalist | Area | p < 0.05 (+) | p < 0.05 (+) | 0.225 | 0.394 | 0.927 | p < 0.001 (-) | p < 0.05 (-)
St. Louis iNaturalist | Human population density | 0.128 | 0.091 | 0.443 | 0.775 | 0.732 | p < 0.001 (-) | 0.088
Baltimore eBird | Area | p < 0.01 (+) | 0.164 | 0.103 | p < 0.001 (-) | p < 0.001 (-) | p < 0.001 (-) | p < 0.001 (-)
Baltimore eBird | Human population density | p < 0.001 (+) | p < 0.01 (+) | p < 0.01 (+) | p < 0.001 (-) | p < 0.001 (-) | p < 0.001 (-) | p < 0.001 (-)
Baltimore iNaturalist | Area | p < 0.001 (+) | p < 0.001 (+) | p < 0.05 (+) | 0.634 | 0.171 | p < 0.05 (-) | 0.346
Baltimore iNaturalist | Human population density | p < 0.001 (+) | p < 0.001 (+) | p < 0.01 (+) | 0.273 | 0.059 | p < 0.001 (-) | 0.968