Hypothesis 2

Change in US-specific racial and ethnic categories in biomedical abstracts

Next, we wanted to examine the growth of OMB Directive 15 and US Census terms. Initially, we posed this hypothesis:

H2: The use of US-specific racial and ethnic terms has increased in biomedical abstracts from 1990 to 2020.

To test this hypothesis, we compiled a dictionary to capture the main OMB/US Census terms, including Asian, White, Black, Native American, Pacific Islander, and Hispanic/Latinx, as well as more general terms like race and ethnicity. Because these terms can also refer to populations outside of the US, we recoded compound and hyphenated US-centric terms to ensure they were specific to the American context. We then used R’s tidytext package to unnest and count all of the terms in our corpus, both as raw counts and as proportions.
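The counting procedure described above can be sketched roughly as follows. This is a minimal Python stand-in for illustration only, not our R/tidytext pipeline: the term dictionary, the compound patterns, the helper names (`tokenize`, `count_by_year`), and the sample abstracts are all hypothetical, and the proportions here assume a denominator of all tokens in a given year.

```python
import re
from collections import Counter, defaultdict

# Hypothetical term-to-category dictionary (illustrative; not the actual term list).
TERM_TO_CATEGORY = {
    "race": "race",
    "ethnicity": "ethnicity",
    "black": "black",
    "white": "white",
    "asian": "asian",
    "hispanic": "hispanic_latinx",
    "latinx": "hispanic_latinx",
    "african_american": "black",
    "asian_american": "asian",
    "native_american": "native_american",
    "pacific_islander": "pacific_islander",
}

# Hypothetical compound/hyphenated phrases, recoded to single tokens
# so they survive word-level tokenization as one unit.
COMPOUNDS = {
    r"african[- ]americans?": "african_american",
    r"asian[- ]americans?": "asian_american",
    r"native[- ]americans?": "native_american",
    r"pacific[- ]islanders?": "pacific_islander",
}


def tokenize(text):
    """Lowercase, recode compound terms, then split into word tokens."""
    text = text.lower()
    for pattern, token in COMPOUNDS.items():
        text = re.sub(r"\b" + pattern + r"\b", token, text)
    return re.findall(r"[a-z_]+", text)


def count_by_year(abstracts):
    """Return raw category counts, per-year proportions, and token totals.

    `abstracts` is an iterable of (year, abstract_text) pairs.
    """
    raw = defaultdict(Counter)
    totals = Counter()
    for year, text in abstracts:
        tokens = tokenize(text)
        totals[year] += len(tokens)
        for tok in tokens:
            category = TERM_TO_CATEGORY.get(tok)
            if category is not None:
                raw[year][category] += 1
    props = {
        year: {cat: n / totals[year] for cat, n in cats.items()}
        for year, cats in raw.items()
    }
    return raw, props, totals


if __name__ == "__main__":
    # Made-up abstracts, used only to exercise the functions.
    sample = [
        (1990, "Race and ethnicity among African-American and white patients."),
        (2020, "Native Americans and Pacific Islanders; race."),
    ]
    raw, props, totals = count_by_year(sample)
    for year in sorted(raw):
        print(year, dict(raw[year]), props[year])
```

Recoding compounds before tokenizing means that a phrase like "Asian American" is counted once under its category rather than being split into a standalone "asian" token that could refer to populations outside the US.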

Figure 2A shows that the OMB/US Census terms are growing over time, but this growth is quite subdued compared to our previous results. For example, no category exceeds 3,100 uses in a given year. Of these category sets, race and ethnicity grow the most, followed by Black, White, Asian and Hispanic/Latinx. Notably, uses of “social diversity” fall between Hispanic/Latinx and Native American, but are much lower than the other terms just mentioned.

Figure 2B shows proportional variation in OMB/Census terms over time. While almost all of the categories rise gradually until the late 2000s, the terms white, black, race and ethnicity all taper off and, in some cases, decline thereafter. In contrast, the “diversity (social)” category continues to rise, though it does not eclipse most of the racial/ethnic categories. This suggests that racial/ethnic terms are less commonly used than a decade ago, and that interest in diversity is on the rise. It therefore seems that Berrey (2015) may be right about diversity becoming more prominent than racial/ethnic terms. However, this method cannot tell us with certainty that diversity is replacing racial/ethnic terms. While we eventually plan to explore this further through computational methods like Word2Vec and BERT, qualitative researchers may want to supplement this research by asking biomedical researchers how they conceptualize diversity relative to more traditional racial/ethnic terms, especially given that most STS scholarship has assumed that these types of terms are still on the rise. We now know that this is not necessarily the case.


Here is a list of the terms that were used to classify the OMB/US Census terms in the final two sets of analyses.