Open Access

Peer-reviewed

Research Article

Recent quantitative research on determinants of health in high income countries: A scoping review

Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Visualization, Writing – original draft, Writing – review & editing

Affiliation Centre for Health Economics Research and Modelling Infectious Diseases, Vaccine and Infectious Disease Institute, University of Antwerp, Antwerp, Belgium

Roles Conceptualization, Data curation, Funding acquisition, Project administration, Resources, Supervision, Validation, Visualization, Writing – review & editing

  • Vladimira Varbanova, 
  • Philippe Beutels

PLOS

  • Published: September 17, 2020
  • https://doi.org/10.1371/journal.pone.0239031

Identifying determinants of health and understanding their role in health production constitutes an important research theme. We aimed to document the state of recent multi-country research on this theme in the literature.

We followed the PRISMA-ScR guidelines to systematically identify, triage and review literature (January 2013 to July 2019). We searched for studies that performed cross-national statistical analyses aiming to evaluate the impact of one or more aggregate level determinants on one or more general population health outcomes in high-income countries. To assess in which combinations and to what extent individual (or thematically linked) determinants had been studied together, we performed multidimensional scaling and cluster analysis.

Sixty studies were selected, out of an original yield of 3686. Life expectancy and overall mortality were the most widely used population health indicators, while determinants came from the areas of healthcare, culture, politics, socio-economics, environment, labor, fertility, demographics, life-style, and psychology. The family of regression models was the predominant statistical approach. Results from our multidimensional scaling showed that a relatively tight core of determinants has received much attention, as main covariates of interest or controls, whereas the majority of other determinants were studied in very limited contexts. We consider findings from these studies regarding the importance of any given health determinant inconclusive at present. Across a multitude of model specifications, different country samples, and varying time periods, effects fluctuated between statistically significant and not significant, and between beneficial and detrimental to health.

Conclusions

We conclude that efforts to understand the underlying mechanisms of population health are far from settled, and the present state of research on the topic leaves much to be desired. It is essential that future research considers multiple factors simultaneously and takes advantage of more sophisticated methodology with regard to quantifying health as well as analyzing determinants’ influence.

Citation: Varbanova V, Beutels P (2020) Recent quantitative research on determinants of health in high income countries: A scoping review. PLoS ONE 15(9): e0239031. https://doi.org/10.1371/journal.pone.0239031

Editor: Amir Radfar, University of Central Florida, UNITED STATES

Received: November 14, 2019; Accepted: August 28, 2020; Published: September 17, 2020

Copyright: © 2020 Varbanova, Beutels. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting Information files.

Funding: This study (and VV) is funded by the Research Foundation Flanders (https://www.fwo.be/en/), FWO project number G0D5917N, award obtained by PB. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Identifying the key drivers of population health is a core subject in public health and health economics research. Between-country comparative research on the topic is challenging. In order to be relevant for policy, it requires disentangling different interrelated drivers of “good health”, each having different degrees of importance in different contexts.

“Good health”–physical and psychological, subjective and objective–can be defined and measured using a variety of approaches, depending on which aspect of health is the focus. A major distinction can be made between health measurements at the individual level or some aggregate level, such as a neighborhood, a region or a country. In view of this, a great diversity of specific research topics exists on the drivers of what constitutes individual or aggregate “good health”, including those focusing on health inequalities, the gender gap in longevity, and regional mortality and longevity differences.

The current scoping review focuses on determinants of population health. Stated as such, this topic is quite broad. Indeed, we are interested in the very general question of what methods have been used to make the most of increasingly available region or country-specific databases to understand the drivers of population health through inter-country comparisons. Existing reviews indicate that researchers thus far tend to adopt a narrower focus. Usually, attention is given to only one health outcome at a time, with further geographical and/or population [ 1 , 2 ] restrictions. In some cases, the impact of one or more interventions is at the core of the review [ 3 – 7 ], while in others it is the relationship between health and just one particular predictor, e.g., income inequality, access to healthcare, government mechanisms [ 8 – 13 ]. Some relatively recent reviews on the subject of social determinants of health [ 4 – 6 , 14 – 17 ] have considered a number of indicators potentially influencing health as opposed to a single one. One review defines “social determinants” as “the social, economic, and political conditions that influence the health of individuals and populations” [ 17 ] while another refers even more broadly to “the factors apart from medical care” [ 15 ].

In the present work, we aimed to be more inclusive, setting no limitations on the nature of possible health correlates, as well as making use of a multitude of commonly accepted measures of general population health. The goal of this scoping review was to document the state of the art in the recent published literature on determinants of population health, with a particular focus on the types of determinants selected and the methodology used. In doing so, we also report the main characteristics of the results these studies found. The materials collected in this review are intended to inform our (and potentially other researchers’) future analyses on this topic. Since the production of health is subject to the law of diminishing marginal returns, we focused our review on those studies that included countries where a high standard of wealth has been achieved for some time, i.e., high-income countries belonging to the Organisation for Economic Co-operation and Development (OECD) or Europe. Adding similar reviews for other country income groups is of limited interest to the research we plan to do in this area.

In view of its focus on data and methods, rather than results, a formal protocol was not registered prior to undertaking this review, but the procedure followed the guidelines of the PRISMA statement for scoping reviews [ 18 ].

We focused on multi-country studies investigating the potential associations between any aggregate level (region/city/country) determinant and general measures of population health (e.g., life expectancy, mortality rate).

Within the query itself, we listed well-established population health indicators as well as the six world regions, as defined by the World Health Organization (WHO). We searched only in the publications’ titles in order to keep the number of hits manageable, and the ratio of broadly relevant abstracts over all abstracts in the order of magnitude of 10% (based on a series of time-focused trial runs). The search strategy was developed iteratively between the two authors and is presented in S1 Appendix. The search was performed by VV in PubMed and Web of Science on the 16th of July, 2019, without any language restrictions, and with a start date set to the 1st of January, 2013, as we were interested in the latest developments in this area of research.

Eligibility criteria

Records obtained via the search methods described above were screened independently by the two authors. Consistency between inclusion/exclusion decisions was approximately 90% and the 43 instances where uncertainty existed were judged through discussion. Articles were included subject to meeting the following requirements: (a) the paper was a full published report of an original empirical study investigating the impact of at least one aggregate level (city/region/country) factor on at least one health indicator (or self-reported health) of the general population (the only admissible “sub-populations” were those based on gender and/or age); (b) the study employed statistical techniques (calculating correlations, at the very least) and was not purely descriptive or theoretical in nature; (c) the analysis involved at least two countries or at least two regions or cities (or another aggregate level) in at least two different countries; (d) the health outcome was not differentiated according to some socio-economic factor and thus studied in terms of inequality (with the exception of gender and age differentiations); (e) mortality, in case it was one of the health indicators under investigation, was strictly “total” or “all-cause” (no cause-specific or determinant-attributable mortality).

Data extraction

The following pieces of information were extracted in an Excel table from the full text of each eligible study (primarily by VV, consulting with PB in case of doubt): health outcome(s), determinants, statistical methodology, level of analysis, results, type of data, data sources, time period, countries. The evidence is synthesized according to these extracted data (often directly reflected in the section headings), using a narrative form accompanied by a “summary-of-findings” table and a graph.

Search and selection

The initial yield contained 4583 records, reduced to 3686 after removal of duplicates ( Fig 1 ). Based on title and abstract screening, 3271 records were excluded because they focused on specific medical condition(s) or specific populations (based on morbidity or some other factor), dealt with intervention effectiveness, with theoretical or non-health related issues, or with animals or plants. Of the remaining 415 papers, roughly half were disqualified upon full-text consideration, mostly due to using an outcome not of interest to us (e.g., health inequality), measuring and analyzing determinants and outcomes exclusively at the individual level, performing analyses one country at a time, employing indices that are a mixture of both health indicators and health determinants, or not utilizing potential health determinants at all. After this second stage of the screening process, 202 papers were deemed eligible for inclusion. This group was further dichotomized according to level of economic development of the countries or regions under study, using membership of the OECD or Europe as a reference “cut-off” point. Sixty papers were judged to include high-income countries, and the remaining 142 included either low- or middle-income countries or a mix of both these levels of development. The rest of this report outlines findings in relation to high-income countries only, reflecting our own primary research interests. Nonetheless, we chose to report our search yield for the other income groups for two reasons. First, to gauge the relative interest in applied published research for these different income levels; and second, to enable other researchers with a focus on determinants of health in other countries to use the extraction we made here.
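The record counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers reported above; the variable names are ours and purely illustrative.

```python
# Record counts as reported in the text for each screening stage.
initial_yield = 4583
after_deduplication = 3686
excluded_on_title_abstract = 3271
eligible_after_full_text = 202
high_income = 60
other_income = 142

# Quantities implied by the reported counts.
duplicates_removed = initial_yield - after_deduplication                  # 897
full_text_assessed = after_deduplication - excluded_on_title_abstract     # 415
excluded_on_full_text = full_text_assessed - eligible_after_full_text     # 213

# "Roughly half" of the full-text papers were disqualified.
share_excluded_on_full_text = excluded_on_full_text / full_text_assessed

# The two income groups should add up to the eligible set.
assert high_income + other_income == eligible_after_full_text
```

The implied share of papers dropped at the full-text stage is about 0.51, which agrees with the "roughly half" stated in the text.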

Fig 1. https://doi.org/10.1371/journal.pone.0239031.g001

Health outcomes

The most frequent population health indicator, life expectancy (LE), was present in 24 of the 60 studies. Apart from “life expectancy at birth” (representing the average life-span a newborn is expected to have if current mortality rates remain constant), also called “period LE” by some [ 19 , 20 ], we also encountered LE at 40 years of age [ 21 ], at 60 [ 22 ], and at 65 [ 21 , 23 , 24 ]. In two papers, the age-specificity of life expectancy (be it at birth or another age) was not stated [ 25 , 26 ].

Some studies considered male and female LE separately [ 21 , 24 , 25 , 27 – 33 ]. This consideration was also often observed with the second most commonly used health index [ 28 – 30 , 34 – 38 ]–termed “total”, or “overall”, or “all-cause”, mortality rate (MR)–included in 22 of the 60 studies. In addition to gender, this index was also sometimes broken down according to age group [ 30 , 39 , 40 ], as well as gender-age group [ 38 ].

While the majority of studies under review here focused on a single health indicator, 23 out of the 60 studies made use of multiple outcomes, although these outcomes were always considered one at a time, and sometimes not all of them fell within the scope of our review. An easily discernible group of indices that typically went together [ 25 , 37 , 41 ] was that of neonatal (deaths occurring within 28 days postpartum), perinatal (fetal or early neonatal / first-7-days deaths), and post-neonatal (deaths between the 29th day and completion of one year of life) mortality. More often than not, these indices were also accompanied by “stand-alone” indicators, such as infant mortality (deaths within the first year of life; our third most common index found in 16 of the 60 studies), maternal mortality (deaths during pregnancy or within 42 days of termination of pregnancy), and child mortality rates. Child mortality has conventionally been defined as mortality within the first 5 years of life, thus often also called “under-5 mortality”. Nonetheless, Pritchard & Wallace used the term “child mortality” to denote deaths of children younger than 14 years [ 42 ].

As previously stated, inclusion criteria did allow for self-reported health status to be used as a general measure of population health. Within our final selection of studies, seven utilized some form of subjective health as an outcome variable [ 25 , 43 – 48 ]. Additionally, the Health Human Development Index [ 49 ], healthy life expectancy [ 50 ], old-age survival [ 51 ], potential years of life lost [ 52 ], and disability-adjusted life expectancy [ 25 ] were also used.

We note that while in most cases the indicators mentioned above (and/or the covariates considered, see below) were taken in their absolute or logarithmic form, as a (typically annual) number, sometimes they were used in the form of differences, change rates, averages over a given time period, or even z-scores of rankings [ 19 , 22 , 40 , 42 , 44 , 53 – 57 ].

Regions, countries, and populations

Despite our decision to confine this review to high-income countries, some variation in the countries and regions studied was still present. Selection seemed to be most often conditioned on the European Union, or the European continent more generally, and the Organisation for Economic Co-operation and Development (OECD), though, typically, not all member nations (based on the instances where these were also explicitly listed) were included in a given study. Some of the stated reasons for omitting certain nations included data unavailability [ 30 , 45 , 54 ] or inconsistency [ 20 , 58 ], Gross Domestic Product (GDP) too low [ 40 ], differences in economic development and political stability with the rest of the sampled countries [ 59 ], and national population too small [ 24 , 40 ]. On the other hand, the rationales for selecting a group of countries included having similar above-average infant mortality [ 60 ], similar healthcare systems [ 23 ], and being randomly drawn from a social spending category [ 61 ]. Some researchers were interested explicitly in a specific geographical region, such as Eastern Europe [ 50 ], Central and Eastern Europe [ 48 , 60 ], the Visegrad (V4) group [ 62 ], or the Asia/Pacific area [ 32 ]. In certain instances, national regions or cities, rather than countries, constituted the units of investigation [ 31 , 51 , 56 , 62 – 66 ]. In two particular cases, a mix of countries and cities was used [ 35 , 57 ]. In another two [ 28 , 29 ], due to the long time periods under study, some of the included countries no longer exist. Finally, besides “European” and “OECD”, the terms “developed”, “Western”, and “industrialized” were also used to describe the group of selected nations [ 30 , 42 , 52 , 53 , 67 ].

As stated above, it was the health status of the general population that we were interested in, and during screening we made a concerted effort to exclude research using data based on a more narrowly defined group of individuals. All studies included in this review adhere to this general rule, albeit with two caveats. First, as cities (even neighborhoods) were the unit of analysis in three of the studies that made the selection [ 56 , 64 , 65 ], the populations under investigation there can be more accurately described as general urban, instead of just general. Second, because health indicators were often stratified by gender and/or age, we also admitted one study that, due to its specific research question, focused on men and women of early retirement age [ 35 ] and another that considered adult males only [ 68 ].

Data types and sources

A great diversity of sources was utilized for data collection purposes. The accessible reference databases of the OECD (https://www.oecd.org/), WHO (https://www.who.int/), World Bank (https://www.worldbank.org/), United Nations (https://www.un.org/en/), and Eurostat (https://ec.europa.eu/eurostat) were among the top choices. The other international databases included Human Mortality [ 30 , 39 , 50 ], Transparency International [ 40 , 48 , 50 ], Quality of Government [ 28 , 69 ], World Income Inequality [ 30 ], International Labor Organization [ 41 ], and International Monetary Fund [ 70 ]. A number of national databases were referred to as well, for example the US Bureau of Statistics [ 42 , 53 ], Korean Statistical Information Services [ 67 ], Statistics Canada [ 67 ], Australian Bureau of Statistics [ 67 ], and Health New Zealand Tobacco control and Health New Zealand Food and Nutrition [ 19 ]. Well-known surveys, such as the World Values Survey [ 25 , 55 ], the European Social Survey [ 25 , 39 , 44 ], the Eurobarometer [ 46 , 56 ], the European Value Survey [ 25 ], and the European Statistics of Income and Living Condition Survey [ 43 , 47 , 70 ] were used as data sources, too. Finally, in some cases [ 25 , 28 , 29 , 35 , 36 , 41 , 69 ], built-for-purpose datasets from previous studies were re-used.

In most of the studies, the level of the data (and analysis) was national. The exceptions were six papers that dealt with Nomenclature of Territorial Units for Statistics (NUTS2) regions [ 31 , 62 , 63 , 66 ], otherwise defined areas [ 51 ] or cities [ 56 ], and seven others that were multilevel designs and utilized both country- and region-level data [ 57 ], individual- and city- or country-level [ 35 ], individual- and country-level [ 44 , 45 , 48 ], individual- and neighborhood-level [ 64 ], and city-region- (NUTS3) and country-level data [ 65 ]. Parallel to that, the data type was predominantly longitudinal, with only a few studies using purely cross-sectional data [ 25 , 33 , 43 , 45 – 48 , 50 , 62 , 67 , 68 , 71 , 72 ], albeit in four of those [ 43 , 48 , 68 , 72 ] two separate points in time were taken (thus resulting in a kind of “double cross-section”), while in another the averages across survey waves were used [ 56 ].

In studies using longitudinal data, the length of the covered time periods varied greatly. Although this was almost always less than 40 years, in one study it covered the entire 20th century [ 29 ]. Longitudinal data, typically in the form of annual records, was sometimes transformed before usage. For example, some researchers considered data points at 5- [ 34 , 36 , 49 ] or 10-year [ 27 , 29 , 35 ] intervals instead of the traditional 1, or took averages over 3-year periods [ 42 , 53 , 73 ]. In one study concerned with the effect of the Great Recession all data were in a “recession minus expansion change in trends”-form [ 57 ]. Furthermore, there were a few instances where two different time periods were compared to each other [ 42 , 53 ] or when data was divided into 2 to 4 (possibly overlapping) periods which were then analyzed separately [ 24 , 26 , 28 , 29 , 31 , 65 ]. Lastly, owing to data availability issues, discrepancies between the time points or periods of data on the different variables were occasionally observed [ 22 , 35 , 42 , 53 – 55 , 63 ].

Health determinants

Together with other essential details, Table 1 lists the health correlates considered in the selected studies. Several general categories for these correlates can be discerned, including health care, political stability, socio-economics, demographics, psychology, environment, fertility, life-style, culture, labor. All of these, directly or implicitly, have been recognized as holding importance for population health by existing theoretical models of (social) determinants of health [ 74 – 77 ].

Table 1. https://doi.org/10.1371/journal.pone.0239031.t001

It is worth noting that in a few studies there was just a single aggregate-level covariate investigated in relation to a health outcome of interest to us. In one instance this was life satisfaction [ 44 ]; in others it was welfare system typology [ 45 ], gender inequality [ 33 ], austerity level [ 70 , 78 ], or deprivation [ 51 ]. Most often though, attention went exclusively to GDP [ 27 , 29 , 46 , 57 , 65 , 71 ]. It was often the case that research had a more particular focus. Among others, minimum wages [ 79 ], hospital payment schemes [ 23 ], cigarette prices [ 63 ], social expenditure [ 20 ], residents’ dissatisfaction [ 56 ], income inequality [ 30 , 69 ], and work leave [ 41 , 58 ] took center stage. Whenever variables outside of these specific areas were also included, they were usually identified as confounders or controls, moderators or mediators.

We visualized the combinations in which the different determinants have been studied in Fig 2 , which was obtained via multidimensional scaling and a subsequent cluster analysis (details outlined in S2 Appendix ). It depicts the spatial positioning of each determinant relative to all others, based on the number of times the effects of each pair of determinants have been studied simultaneously. When interpreting Fig 2 , one should keep in mind that determinants marked with an asterisk represent, in fact, collectives of variables.
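The pairwise counts underlying such a plot can be illustrated with a minimal sketch. The study sets below are hypothetical toy data, not the extraction used in this review, and the function names are ours. The exact procedure that turns these counts into the scaling solution is described in S2 Appendix and is not reproduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: the set of determinants each study modelled together.
studies = [
    {"GDP", "HE", "EDU"},
    {"GDP", "HE", "UR"},
    {"GDP", "smoking", "alcohol"},
    {"GDP"},  # a determinant studied in isolation contributes no pairs
]


def pair_counts(studies):
    """Count how many studies include each unordered pair of determinants."""
    counts = Counter()
    for determinants in studies:
        # Sorting gives every pair a single canonical key.
        for a, b in combinations(sorted(determinants), 2):
            counts[(a, b)] += 1
    return counts


def total_connections(counts):
    """Sum each determinant's pairings over all of its partners."""
    totals = Counter()
    for (a, b), n in counts.items():
        totals[a] += n
        totals[b] += n
    return totals


counts = pair_counts(studies)
totals = total_connections(counts)
```

In this toy example GDP and HE share two studies, and GDP accumulates the largest total because it co-occurs with every other determinant. A matrix of such counts, converted to dissimilarities, is the kind of input that multidimensional scaling operates on.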

Fig 2. Groups of determinants are marked by asterisks (see S1 Table in S1 Appendix). Diminishing color intensity reflects a decrease in the total number of “connections” for a given determinant. Noteworthy pairwise “connections” are emphasized via lines (solid-dashed-dotted indicates decreasing frequency). Grey contour lines encircle groups of variables that were identified via cluster analysis. Abbreviations: age = population age distribution, associations = membership in associations, AT-index = atherogenic-thrombogenic index, BR = birth rate, CAPB = Cyclically Adjusted Primary Balance, civilian-labor = civilian labor force, C-section = Cesarean delivery rate, credit-info = depth of credit information, dissatisf = residents’ dissatisfaction, distrib.orient = distributional orientation, EDU = education, eHealth = eHealth index at GP-level, exch.rate = exchange rate, fat = fat consumption, GDP = gross domestic product, GFCF = Gross Fixed Capital Formation/Creation, GH-gas = greenhouse gas, GII = gender inequality index, gov = governance index, gov.revenue = government revenues, HC-coverage = healthcare coverage, HE = health(care) expenditure, HHconsump = household consumption, hosp.beds = hospital beds, hosp.payment = hospital payment scheme, hosp.stay = length of hospital stay, IDI = ICT development index, inc.ineq = income inequality, industry-labor = industrial labor force, infant-sex = infant sex ratio, labor-product = labor production, LBW = low birth weight, leave = work leave, life-satisf = life satisfaction, M-age = maternal age, marginal-tax = marginal tax rate, MDs = physicians, mult.preg = multiple pregnancy, NHS = National Health System, NO = nitrous oxide emissions, PM10 = particulate matter (PM10) emissions, pop = population size, pop.density = population density, pre-term = pre-term birth rate, prison = prison population, researchE = research&development expenditure, school.ref = compulsory schooling reform, smoke-free = smoke-free places, SO = sulfur oxide emissions, soc.E = social expenditure, soc.workers = social workers, sugar = sugar consumption, terror = terrorism, union = union density, UR = unemployment rate, urban = urbanization, veg-fr = vegetable-and-fruit consumption, welfare = welfare regime, Wwater = wastewater treatment.

https://doi.org/10.1371/journal.pone.0239031.g002

Distances between determinants in Fig 2 are indicative of determinants’ “connectedness” with each other. While the statistical procedure called for higher dimensionality of the model, for demonstration purposes we show here a two-dimensional solution. This simplification unfortunately comes with a caveat. To use the factor smoking as an example, it would appear it stands at a much greater distance from GDP than it does from alcohol. In reality, however, smoking was considered together with alcohol consumption [ 21 , 25 , 26 , 52 , 68 ] in just as many studies as it was with GDP [ 21 , 25 , 26 , 52 , 59 ]: five. To compensate for this apparent shortcoming, we have emphasized the strongest pairwise links. Solid lines connect GDP with health expenditure (HE), unemployment rate (UR), and education (EDU), indicating that the effect of GDP on health, taking into account the effects of the other three determinants as well, was evaluated in between 12 and 16 of the 60 studies included in this review. Tracing the dashed lines, we can also tell that GDP appeared jointly with income inequality, and HE together with either EDU or UR, in between 8 and 10 of our selected studies. Finally, some weaker but still noteworthy “connections” between variables are displayed via the dotted lines.

The fact that all notable pairwise “connections” are concentrated within a relatively small region of the plot may be interpreted as low overall “connectedness” among the health indicators studied. GDP is the most widely investigated determinant in relation to general population health. Its total number of “connections” is disproportionately high (159) compared to its runner-up–HE (with 113 “connections”), and then subsequently EDU (with 90) and UR (with 86). In fact, all of these determinants could be thought of as outliers, given that none of the remaining factors have a total count of pairings above 52. This decrease in individual determinants’ overall “connectedness” can be tracked on the graph via the change of color intensity as we move outwards from the symbolic center of GDP and its closest “co-determinants”, to finally reach the other extreme of the ten indicators (welfare regime, household consumption, compulsory school reform, life satisfaction, government revenues, literacy, research expenditure, multiple pregnancy, Cyclically Adjusted Primary Balance, and residents’ dissatisfaction; in white) the effects on health of which were only studied in isolation.

Lastly, we point to the few small but stable clusters of covariates encircled by the grey bubbles on Fig 2 . These groups of determinants were identified as “close” by both statistical procedures used for the production of the graph (see details in S2 Appendix ).

Statistical methodology

There was great variation in the level of statistical detail reported. Some authors provided too vague a description of their analytical approach, necessitating some inference in this section.

The issue of missing data is a challenging reality in this field of research, but few of the studies under review (12/60) explain how they dealt with it. Among the ones that do, three general approaches to handling missingness can be identified, listed in increasing level of sophistication: case-wise deletion, i.e., removal of countries from the sample [ 20 , 45 , 48 , 58 , 59 ], (linear) interpolation [ 28 , 30 , 34 , 58 , 59 , 63 ], and multiple imputation [ 26 , 41 , 52 ].
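Two of the three approaches named above are simple enough to sketch. The example below is a minimal illustration on hypothetical toy series; the function names are ours, and it does not reproduce the procedure of any reviewed study. Multiple imputation is not shown, since it requires a model for the missing values.

```python
def interpolate_linear(series):
    """Fill interior gaps (None) in an evenly spaced series by linear interpolation.

    Leading and trailing gaps are left as None, because there is no second
    known value to interpolate towards.
    """
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


def drop_incomplete(panel):
    """Case-wise deletion: keep only countries with no missing values."""
    return {
        country: series
        for country, series in panel.items()
        if all(value is not None for value in series)
    }


# Hypothetical annual life expectancy for two countries.
panel = {
    "A": [78.0, None, None, 79.5],
    "B": [80.0, 80.2, 80.4, 80.6],
}

complete_cases = drop_incomplete(panel)           # only country "B" remains
interpolated_a = interpolate_linear(panel["A"])   # gaps filled on a straight line
```

Case-wise deletion shrinks the country sample, whereas interpolation keeps every country but assumes a straight-line path between the observed years.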

Correlations (Pearson, Spearman, or unspecified) were the only technique applied with respect to the health outcomes of interest in eight analyses [ 33 , 42 – 44 , 46 , 53 , 57 , 61 ]. Among the more advanced statistical methods, the family of regression models proved to be, by and large, predominant. Before examining this closer, we note the techniques that were, in a way, “unique” within this selection of studies: meta-analyses were performed (random and fixed effects, respectively) on the reduced form and 2-sample two stage least squares (2SLS) estimations done within countries [ 39 ]; difference-in-difference (DiD) analysis was applied in one case [ 23 ]; dynamic time-series methods, among which co-integration, impulse-response function (IRF), and panel vector autoregressive (VAR) modeling, were utilized in one study [ 80 ]; longitudinal generalized estimating equation (GEE) models were developed on two occasions [ 70 , 78 ]; hierarchical Bayesian spatial models [ 51 ] and spatial autoregressive regression [ 62 ] were also implemented.
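For readers less familiar with the two correlation coefficients named above, the following sketch computes both from first principles. The data are hypothetical toy values and the function names are ours; Spearman's coefficient is obtained here as Pearson's coefficient applied to ranks, with tied values receiving their average rank.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical monotone but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
```

Because the toy relationship is monotone but curved, the rank-based coefficient reaches 1 while the Pearson coefficient stays below it, which is one reason the choice between the two matters when it is left unspecified.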

Purely cross-sectional data analyses were performed in eight studies [ 25 , 45 , 47 , 50 , 55 , 56 , 67 , 71 ]. These consisted of linear regression (assumed ordinary least squares (OLS)), generalized least squares (GLS) regression, and multilevel analyses. However, six other studies that used longitudinal data in fact had a cross-sectional design, through which they applied regression at multiple time-points separately [ 27 , 29 , 36 , 48 , 68 , 72 ].

Apart from these “multi-point cross-sectional studies”, some other simplistic approaches to longitudinal data analysis were found, involving calculating and regressing 3-year averages of both the response and the predictor variables [ 54 ], taking the average of a few data-points (i.e., survey waves) [ 56 ] or using difference scores over 10-year [ 19 , 29 ] or unspecified time intervals [ 40 , 55 ].
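The two transformations mentioned here, period averages and difference scores, can be sketched as follows. This is an illustration on a hypothetical annual series with names of our choosing; in particular, treating the 3-year periods as consecutive and non-overlapping is an assumption, as the reviewed studies do not always state how periods were formed.

```python
def period_averages(values, width=3):
    """Average consecutive, non-overlapping blocks of `width` annual values.

    An incomplete final block is dropped (an assumption of this sketch).
    """
    usable = len(values) - len(values) % width
    return [sum(values[i:i + width]) / width for i in range(0, usable, width)]


def difference_scores(values, lag=10):
    """Change over a fixed lag: the value at year t minus the value at year t - lag."""
    return [values[t] - values[t - lag] for t in range(lag, len(values))]


# Hypothetical annual life expectancy series (seven years).
annual = [70.0, 71.0, 72.0, 73.0, 74.0, 75.0, 76.0]

averaged = period_averages(annual)              # two complete 3-year blocks
changes = difference_scores(annual, lag=5)      # 5-year changes for illustration
```

Both transformations reduce the number of usable data points considerably, which is part of why they are described as simplistic uses of longitudinal data.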

Moving further in the direction of more sensible longitudinal data usage, we turn to the methods widely known among (health) economists as “panel data analysis” or “panel regression”. Most often seen were models with fixed effects for country/region and sometimes also time-point (occasionally including a country-specific trend as well), with robust standard errors for the parameter estimates to take into account correlations among clustered observations [ 20 , 21 , 24 , 28 , 30 , 32 , 34 , 37 , 38 , 41 , 52 , 59 , 60 , 63 , 66 , 69 , 73 , 79 , 81 , 82 ]. The Hausman test [ 83 ] was sometimes mentioned as the tool used to decide between fixed and random effects [ 26 , 49 , 63 , 66 , 73 , 82 ]. A few studies considered the latter more appropriate for their particular analyses, with some further specifying that (feasible) GLS estimation was employed [ 26 , 34 , 49 , 58 , 60 , 73 ]. Apart from these two types of models, the first differences method was encountered once as well [ 31 ]. Across all, the error terms were sometimes assumed to come from a first-order autoregressive process (AR(1)), i.e., they were allowed to be serially correlated [ 20 , 30 , 38 , 58 – 60 , 73 ], and lags of (typically) predictor variables were included in the model specification, too [ 20 , 21 , 37 , 38 , 48 , 69 , 81 ]. Lastly, a somewhat different approach to longitudinal data analysis was undertaken in four studies [ 22 , 35 , 48 , 65 ] in which multilevel–linear or Poisson–models were developed.
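The core idea of a fixed-effects panel model can be shown with a one-regressor sketch. The data are hypothetical and constructed so that each country has its own baseline level, and the function names are ours. Robust standard errors, time effects, AR(1) errors, and the Hausman test are deliberately left out of this illustration.

```python
def within_slope(panel):
    """One-regressor fixed-effects (within) estimator.

    `panel` maps each unit to a list of (x, y) observations. Demeaning x and y
    within each unit removes any time-invariant unit effect before pooling.
    """
    sxy = sxx = 0.0
    for observations in panel.values():
        n = len(observations)
        mean_x = sum(x for x, _ in observations) / n
        mean_y = sum(y for _, y in observations) / n
        for x, y in observations:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx


def pooled_slope(panel):
    """Ordinary least squares slope on the pooled data, ignoring unit effects."""
    rows = [obs for observations in panel.values() for obs in observations]
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    return sxy / sxx


# Hypothetical panel: y = unit effect + 2 * x, with unit effects 10 (A) and 0 (B).
panel = {
    "A": [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0)],
    "B": [(4.0, 8.0), (5.0, 10.0), (6.0, 12.0)],
}

fe_estimate = within_slope(panel)      # recovers the within-country slope of 2
pooled_estimate = pooled_slope(panel)  # distorted by the differing baselines
```

In this constructed example the pooled estimate even has the wrong sign, because the country with the higher baseline happens to have lower values of x. Removing country-specific levels is what the fixed-effects specification is meant to achieve.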

Regardless of the exact techniques used, most studies included in this review presented multiple model applications within their main analysis. None attempted to formally compare models in order to identify the “best”, even if goodness-of-fit statistics were occasionally reported. As indicated above, many studies investigated women’s and men’s health separately [ 19 , 21 , 22 , 27 – 29 , 31 , 33 , 35 , 36 , 38 , 39 , 45 , 50 , 51 , 64 , 65 , 69 , 82 ], and covariates were often tested one at a time, including other covariates only incrementally [ 20 , 25 , 28 , 36 , 40 , 50 , 55 , 67 , 73 ]. Furthermore, there were a few instances where analyses within countries were performed as well [ 32 , 39 , 51 ] or where the full time period of interest was divided into a few sub-periods [ 24 , 26 , 28 , 31 ]. There were also cases where different statistical techniques were applied in parallel [ 29 , 55 , 60 , 66 , 69 , 73 , 82 ], sometimes as a form of sensitivity analysis [ 24 , 26 , 30 , 58 , 73 ]. However, the most common approach to sensitivity analysis was to re-run models with somewhat different samples [ 39 , 50 , 59 , 67 , 69 , 80 , 82 ]. Other strategies included different categorization of variables or adding (more/other) controls [ 21 , 23 , 25 , 28 , 37 , 50 , 63 , 69 ], using an alternative main covariate measure [ 59 , 82 ], including lags for predictors or outcomes [ 28 , 30 , 58 , 63 , 65 , 79 ], using weights [ 24 , 67 ] or alternative data sources [ 37 , 69 ], or using non-imputed data [ 41 ].
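The most common sensitivity check noted above, re-running a model on somewhat different samples, can be sketched as a leave-one-country-out loop. The data, the single-predictor model, and the function names are assumptions made for illustration only; the reviewed studies varied their samples in other ways as well.

```python
# Hypothetical cross-section: country -> (predictor, outcome); values invented.
data = {"A": (1.0, 1.0), "B": (2.0, 2.0), "C": (3.0, 3.0),
        "D": (4.0, 4.0), "E": (5.0, 0.0)}

def slope(points):
    """Least-squares slope of outcome on predictor (model with an intercept)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def leave_one_out(data):
    """Re-estimate the slope once per country, each time omitting that country."""
    return {left_out: slope([v for k, v in data.items() if k != left_out])
            for left_out in data}

full_sample = slope(list(data.values()))
by_omission = leave_one_out(data)
```

In this invented sample the full-sample slope is zero, while omitting country E yields a slope of one, which is the kind of instability such a check is meant to expose.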

As the methods and not the findings are the main focus of the current review, and because generic checklists cannot discern the underlying quality in this application field (see also below), we opted to pool all reported findings together, regardless of individual study characteristics or particular outcome(s) used, and speak generally of positive and negative effects on health. For this summary we have adopted the 0.05-significance level and only considered results from multivariate analyses. Strictly birth-related factors are omitted since these potentially only relate to the group of infant mortality indicators and not to any of the other general population health measures.

Starting with the determinants most often studied, higher GDP levels [ 21 , 26 , 27 , 29 , 30 , 32 , 43 , 48 , 52 , 58 , 60 , 66 , 67 , 73 , 79 , 81 , 82 ], higher health [ 21 , 37 , 47 , 49 , 52 , 58 , 59 , 68 , 72 , 82 ] and social [ 20 , 21 , 26 , 38 , 79 ] expenditures, higher education [ 26 , 39 , 52 , 62 , 72 , 73 ], lower unemployment [ 60 , 61 , 66 ], and lower income inequality [ 30 , 42 , 53 , 55 , 73 ] were found to be significantly associated with better population health on a number of occasions. In addition to that, there was also some evidence that democracy [ 36 ] and freedom [ 50 ], higher work compensation [ 43 , 79 ], distributional orientation [ 54 ], cigarette prices [ 63 ], gross national income [ 22 , 72 ], labor productivity [ 26 ], exchange rates [ 32 ], marginal tax rates [ 79 ], vaccination rates [ 52 ], total fertility [ 59 , 66 ], fruit and vegetable [ 68 ], fat [ 52 ] and sugar consumption [ 52 ], as well as bigger depth of credit information [ 22 ] and percentage of civilian labor force [ 79 ], longer work leaves [ 41 , 58 ], more physicians [ 37 , 52 , 72 ], nurses [ 72 ], and hospital beds [ 79 , 82 ], and also membership in associations, perceived corruption and societal trust [ 48 ] were beneficial to health. Higher nitrous oxide (NO) levels [ 52 ], longer average hospital stay [ 48 ], deprivation [ 51 ], dissatisfaction with healthcare and the social environment [ 56 ], corruption [ 40 , 50 ], smoking [ 19 , 26 , 52 , 68 ], alcohol consumption [ 26 , 52 , 68 ] and illegal drug use [ 68 ], poverty [ 64 ], higher percentage of industrial workers [ 26 ], Gross Fixed Capital creation [ 66 ] and older population [ 38 , 66 , 79 ], gender inequality [ 22 ], and fertility [ 26 , 66 ] were detrimental.

It is important to point out that the above-mentioned effects could not be considered stable either across or within studies. Very often, statistical significance of a given covariate fluctuated between the different model specifications tried out within the same study [ 20 , 49 , 59 , 66 , 68 , 69 , 73 , 80 , 82 ], testifying to the importance of control variables and multivariate research (i.e., analyzing multiple independent variables simultaneously) in general. Furthermore, conflicting results were observed even with regards to the “core” determinants given special attention, so to speak, throughout this text. Thus, some studies reported negative effects of health expenditure [ 32 , 82 ], social expenditure [ 58 ], GDP [ 49 , 66 ], and education [ 82 ], and positive effects of income inequality [ 82 ] and unemployment [ 24 , 31 , 32 , 52 , 66 , 68 ]. Interestingly, one study [ 34 ] differentiated between temporary and long-term effects of GDP and unemployment, alluding to possibly much greater complexity of the association with health. It is also worth noting that some gender differences were found, with determinants being more influential for males than for females, or only having statistically significant effects for male health [ 19 , 21 , 28 , 34 , 36 , 37 , 39 , 64 , 65 , 69 ].

The purpose of this scoping review was to examine recent quantitative work on the topic of multi-country analyses of determinants of population health in high-income countries.

Measuring population health via relatively simple mortality-based indicators still seems to be the state of the art. What is more, these indicators are routinely considered one at a time, instead of, for example, employing existing statistical procedures to devise a more general, composite, index of population health, or using some of the established indices, such as disability-adjusted life expectancy (DALE) or quality-adjusted life expectancy (QALE). Although strong arguments for their wider use were already voiced decades ago [ 84 ], such summary measures surface only rarely in this research field.

On a related note, the greater data availability and accessibility that we enjoy today does not automatically equate to data quality. Nonetheless, this is routinely assumed in aggregate level studies. We almost never encountered a discussion on the topic. The non-mundane issue of data missingness, too, goes largely underappreciated. With all recent methodological advancements in this area [ 85 – 88 ], there is no excuse for ignorance; and still, too few of the reviewed studies tackled the matter in any adequate fashion.

Much optimism can be gained considering the abundance of different determinants that have attracted researchers’ attention in relation to population health. We took a visual approach with regard to these determinants and presented a graph that links spatial distances between determinants with frequencies of being studied together. To facilitate interpretation, we grouped some variables, which resulted in some loss of finer detail. Nevertheless, the graph is helpful in exemplifying how many effects continue to be studied in a very limited context, if any. Since in reality no factor acts in isolation, this oversimplification practice threatens to render the whole exercise meaningless from the outset. The importance of multivariate analysis cannot be stressed enough. While there is no “best method” to be recommended and appropriate techniques vary according to the specifics of the research question and the characteristics of the data at hand [ 89 – 93 ], in the future, in addition to abandoning simplistic univariate approaches, we hope to see a shift from the currently dominating fixed effects to the more flexible random/mixed effects models [ 94 ], as well as wider application of more sophisticated methods, such as principal component regression, partial least squares, covariance structure models (e.g., structural equations), canonical correlations, time-series, and generalized estimating equations.

Finally, there are some limitations of the current scoping review. We searched the two main databases for published research in medical and non-medical sciences (PubMed and Web of Science) since 2013, thus potentially excluding publications and reports that are not indexed in these databases, as well as older indexed publications. These choices were guided by our interest in the most recent (i.e., the current state-of-the-art) and arguably the highest-quality research (i.e., peer-reviewed articles, primarily in indexed non-predatory journals). Furthermore, despite holding a critical stance with regards to some aspects of how determinants-of-health research is currently conducted, we opted out of formally assessing the quality of the individual studies included. The reason for that is two-fold. On the one hand, we are unaware of the existence of a formal and standard tool for quality assessment of ecological designs. And on the other, we consider trying to score the quality of these diverse studies (in terms of regional setting, specific topic, outcome indices, and methodology) undesirable and misleading, particularly since we would sometimes have been rating the quality of only a (small) part of the original studies—the part that was relevant to our review’s goal.

Our aim was to investigate the current state of research on the very broad and general topic of population health, specifically, the way it has been examined in a multi-country context. We learned that data treatment and analytical approach were, in the majority of these recent studies, ill-equipped or insufficiently transparent to provide clarity regarding the underlying mechanisms of population health in high-income countries. Whether due to methodological shortcomings or the inherent complexity of the topic, research so far fails to provide any definitive answers. It is our sincere belief that with the application of more advanced analytical techniques this continuous quest could come to fruition sooner.

Supporting information

S1 Checklist. Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist.

https://doi.org/10.1371/journal.pone.0239031.s001

S1 Appendix.

https://doi.org/10.1371/journal.pone.0239031.s002

S2 Appendix.

https://doi.org/10.1371/journal.pone.0239031.s003

  • 75. Dahlgren G, Whitehead M. Policies and Strategies to Promote Equity in Health. Stockholm, Sweden: Institute for Future Studies; 1991.
  • 76. Brunner E, Marmot M. Social Organization, Stress, and Health. In: Marmot M, Wilkinson RG, editors. Social Determinants of Health. Oxford, England: Oxford University Press; 1999.
  • 77. Najman JM. A General Model of the Social Origins of Health and Well-being. In: Eckersley R, Dixon J, Douglas B, editors. The Social Origins of Health and Well-being. Cambridge, England: Cambridge University Press; 2001.
  • 85. Carpenter JR, Kenward MG. Multiple Imputation and its Application. New York: John Wiley & Sons; 2013.
  • 86. Molenberghs G, Fitzmaurice G, Kenward MG, Verbeke G, Tsiatis AA. Handbook of Missing Data Methodology. Boca Raton: Chapman & Hall/CRC; 2014.
  • 87. van Buuren S. Flexible Imputation of Missing Data. 2nd ed. Boca Raton: Chapman & Hall/CRC; 2018.
  • 88. Enders CK. Applied Missing Data Analysis. New York: Guilford; 2010.
  • 89. Searle SR, Casella G, McCulloch CE. Variance Components. John Wiley & Sons, Inc.; 1992.
  • 90. Agresti A. Foundations of Linear and Generalized Linear Models. Hoboken, New Jersey: John Wiley & Sons Inc.; 2015.
  • 91. Leyland AH, Goldstein H, editors. Multilevel Modelling of Health Statistics. John Wiley & Sons Inc; 2001.
  • 92. Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G. Longitudinal Data Analysis. New York: Chapman and Hall/CRC; 2008.
  • 93. Härdle WK, Simar L. Applied Multivariate Statistical Analysis. Berlin, Heidelberg: Springer; 2015.



Submitted for publication January 8, 2018. Accepted for publication November 29, 2018.


John T. Ratelle , Adam P. Sawatsky , Thomas J. Beckman; Quantitative Research Methods in Medical Education. Anesthesiology 2019; 131:23–35 doi: https://doi.org/10.1097/ALN.0000000000002727


There has been a dramatic growth of scholarly articles in medical education in recent years. Evaluating medical education research requires specific orientation to issues related to format and content. Our goal is to review the quantitative aspects of research in medical education so that clinicians may understand these articles with respect to framing the study, recognizing methodologic issues, and utilizing instruments for evaluating the quality of medical education research. This review can be used both as a tool when appraising medical education research articles and as a primer for clinicians interested in pursuing scholarship in medical education.

Image: J. P. Rathmell and Terri Navarette.


There has been an explosion of research in the field of medical education. A search of PubMed demonstrates that more than 40,000 articles have been indexed under the medical subject heading “Medical Education” since 2010, which is more than the total number of articles indexed under this heading in the 1980s and 1990s combined. Keeping up to date requires that practicing clinicians have the skills to interpret and appraise the quality of research articles, especially when serving as editors, reviewers, and consumers of the literature.

While medical education shares many characteristics with other biomedical fields, substantial particularities exist. We recognize that practicing clinicians may not be familiar with the nuances of education research and how to assess its quality. Therefore, our purpose is to provide a review of quantitative research methodologies in medical education. Specifically, we describe a structure that can be used when conducting or evaluating medical education research articles.

Clarifying the research purpose is an essential first step when reading or conducting scholarship in medical education. 1   Medical education research can serve a variety of purposes, from advancing the science of learning to improving the outcomes of medical trainees and the patients they care for. However, a well-designed study has limited value if it addresses vague, redundant, or unimportant medical education research questions.

What is the research topic and why is it important? What is unknown about the research topic? Why is further research necessary?

What is the conceptual framework being used to approach the study?

What is the statement of study intent?

What are the research methodology and study design? Are they appropriate for the study objective(s)?

Which threats to internal validity are most relevant for the study?

What is the outcome and how was it measured?

Can the results be trusted? What is the validity and reliability of the measurements?

How were research subjects selected? Is the research sample representative of the target population?

Was the data analysis appropriate for the study design and type of data?

What is the effect size? Do the results have educational significance?

Fortunately, there are steps to ensure that the purpose of a research study is clear and logical. Table 1   2–5   outlines these steps, which will be described in detail in the following sections. We describe these elements not as a simple “checklist,” but as an advanced organizer that can be used to understand a medical education research study. These steps can also be used by clinician educators who are new to the field of education research and who wish to conduct scholarship in medical education.

Steps in Clarifying the Purpose of a Research Study in Medical Education


Literature Review and Problem Statement

A literature review is the first step in clarifying the purpose of a medical education research article. 2 , 5 , 6   When conducting scholarship in medical education, a literature review helps researchers develop an understanding of their topic of interest. This understanding includes both existing knowledge about the topic as well as key gaps in the literature, which aids the researcher in refining their study question. Additionally, a literature review helps researchers identify conceptual frameworks that have been used to approach the research topic. 2  

When reading scholarship in medical education, a successful literature review provides background information so that even someone unfamiliar with the research topic can understand the rationale for the study. Located in the introduction of the manuscript, the literature review guides the reader through what is already known in a manner that highlights the importance of the research topic. The literature review should also identify key gaps in the literature so the reader can understand the need for further research. This gap description includes an explicit problem statement that summarizes the important issues and provides a reason for the study. 2 , 4   The following is one example of a problem statement:

“Identifying gaps in the competency of anesthesia residents in time for intervention is critical to patient safety and an effective learning system… [However], few available instruments relate to complex behavioral performance or provide descriptors…that could inform subsequent feedback, individualized teaching, remediation, and curriculum revision.” 7  

This problem statement articulates the research topic (identifying resident performance gaps), why it is important (to intervene for the sake of learning and patient safety), and current gaps in the literature (few tools are available to assess resident performance). The researchers have now underscored why further research is needed and have helped readers anticipate the overarching goals of their study (to develop an instrument to measure anesthesiology resident performance). 4  

The Conceptual Framework

Following the literature review and articulation of the problem statement, the next step in clarifying the research purpose is to select a conceptual framework that can be applied to the research topic. Conceptual frameworks are “ways of thinking about a problem or a study, or ways of representing how complex things work.” 3   Just as clinical trials are informed by basic science research in the laboratory, conceptual frameworks often serve as the “basic science” that informs scholarship in medical education. At a fundamental level, conceptual frameworks provide a structured approach to solving the problem identified in the problem statement.

Conceptual frameworks may take the form of theories, principles, or models that help to explain the research problem by identifying its essential variables or elements. Alternatively, conceptual frameworks may represent evidence-based best practices that researchers can apply to an issue identified in the problem statement. 3   Importantly, there is no single best conceptual framework for a particular research topic, although the choice of a conceptual framework is often informed by the literature review and knowing which conceptual frameworks have been used in similar research. 8   For further information on selecting a conceptual framework for research in medical education, we direct readers to the work of Bordage 3   and Irby et al. 9  

To illustrate how different conceptual frameworks can be applied to a research problem, suppose you encounter a study to reduce the frequency of communication errors among anesthesiology residents during day-to-night handoff. Table 2 10 , 11   identifies two different conceptual frameworks researchers might use to approach the task. The first framework, cognitive load theory, has been proposed as a conceptual framework to identify potential variables that may lead to handoff errors. 12   Specifically, cognitive load theory identifies the three factors that affect short-term memory and thus may lead to communication errors:

Conceptual Frameworks to Address the Issue of Handoff Errors in the Intensive Care Unit


Intrinsic load: Inherent complexity or difficulty of the information the resident is trying to learn ( e.g. , complex patients).

Extraneous load: Distractions or demands on short-term memory that are not related to the information the resident is trying to learn ( e.g. , background noise, interruptions).

Germane load: Effort or mental strategies used by the resident to organize and understand the information he/she is trying to learn ( e.g. , teach back, note taking).

Using cognitive load theory as a conceptual framework, researchers may design an intervention to reduce extraneous load and help the resident remember the overnight to-do’s. An example might be dedicated, pager-free handoff times where distractions are minimized.

The second framework identified in table 2 , the I-PASS (Illness severity, Patient summary, Action list, Situational awareness and contingency planning, and Synthesis by receiver) handoff mnemonic, 11   is an evidence-based best practice that, when incorporated as part of a handoff bundle, has been shown to reduce handoff errors on pediatric wards. 13   Researchers choosing this conceptual framework may adapt some or all of the I-PASS elements for resident handoffs in the intensive care unit.

Note that both of the conceptual frameworks outlined above provide researchers with a structured approach to addressing the issue of handoff errors; one is not necessarily better than the other. Indeed, it is possible for researchers to use both frameworks when designing their study. Ultimately, we provide this example to demonstrate the necessity of selecting conceptual frameworks to clarify the research purpose. 3 , 8   Readers should look for conceptual frameworks in the introduction section and should be wary of their omission, as commonly seen in less well-developed medical education research articles. 14  

Statement of Study Intent

After reviewing the literature, articulating the problem statement, and selecting a conceptual framework to address the research topic, the final step in clarifying the research purpose is the statement of study intent. The statement of study intent is arguably the most important element of framing the study because it makes the research purpose explicit. 2   Consider the following example:

“This study aimed to test the hypothesis that the introduction of the BASIC Examination was associated with an accelerated knowledge acquisition during residency training, as measured by increments in annual ITE scores.” 15  

This statement of study intent succinctly identifies several key study elements including the population (anesthesiology residents), the intervention/independent variable (introduction of the BASIC Examination), the outcome/dependent variable (knowledge acquisition, as measured by In-training Examination [ITE] scores), and the hypothesized relationship between the independent and dependent variable (the authors hypothesize a positive correlation between the BASIC examination and the speed of knowledge acquisition). 6 , 14  

The statement of study intent will sometimes manifest as a research objective, rather than hypothesis or question. In such instances there may not be explicit independent and dependent variables, but the study population and research aim should be clearly identified. The following is an example:

“In this report, we present the results of 3 [years] of course data with respect to the practice improvements proposed by participating anesthesiologists and their success in implementing those plans. Specifically, our primary aim is to assess the frequency and type of improvements that were completed and any factors that influence completion.” 16  

The statement of study intent is the logical culmination of the literature review, problem statement, and conceptual framework, and is a transition point between the Introduction and Methods sections of a medical education research report. Nonetheless, a systematic review of experimental research in medical education demonstrated that statements of study intent are absent in the majority of articles. 14   When reading a medical education research article where the statement of study intent is absent, it may be necessary to infer the research aim by gathering information from the Introduction and Methods sections. In these cases, it can be useful to identify the following key elements 6 , 14 , 17   :

Population of interest/type of learner ( e.g. , pain medicine fellow or anesthesiology residents)

Independent/predictor variable ( e.g. , educational intervention or characteristic of the learners)

Dependent/outcome variable ( e.g. , intubation skills or knowledge of anesthetic agents)

Relationship between the variables ( e.g. , “improve” or “mitigate”)

Occasionally, it may be difficult to differentiate the independent study variable from the dependent study variable. 17   For example, consider a study aiming to measure the relationship between burnout and personal debt among anesthesiology residents. Do the researchers believe burnout might lead to high personal debt, or that high personal debt may lead to burnout? This “chicken or egg” conundrum reinforces the importance of the conceptual framework which, if present, should serve as an explanation or rationale for the predicted relationship between study variables.

Research methodology is the “…design or plan that shapes the methods to be used in a study.” 1   Essentially, methodology is the general strategy for answering a research question, whereas methods are the specific steps and techniques that are used to collect data and implement the strategy. Our objective here is to provide an overview of quantitative methodologies ( i.e. , approaches) in medical education research.

The choice of research methodology is made by balancing the approach that best answers the research question against the feasibility of completing the study. There is no perfect methodology because each has its own potential caveats, flaws and/or sources of bias. Before delving into an overview of the methodologies, it is important to highlight common sources of bias in education research. We use the term internal validity to describe the degree to which the findings of a research study represent “the truth,” as opposed to some alternative hypothesis or variables. 18   Table 3   18–20   provides a list of common threats to internal validity in medical education research, along with tactics to mitigate these threats.

Threats to Internal Validity and Strategies to Mitigate Their Effects


Experimental Research

The fundamental tenet of experimental research is the manipulation of an independent or experimental variable to measure its effect on a dependent or outcome variable.

True Experiment

True experimental study designs minimize threats to internal validity by randomizing study subjects to experimental and control groups. Through ensuring that differences between groups are—beyond the intervention/variable of interest—purely due to chance, researchers reduce the internal validity threats related to subject characteristics, time-related maturation, and regression to the mean. 18 , 19  
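As a small illustration of random assignment, the sketch below shuffles a list of learners and splits it into two equally sized groups. The resident identifiers, the fixed seed, and the function name are hypothetical; real trials typically use more elaborate schemes (for example, stratified or blocked randomization).

```python
import random

def randomize(subjects, seed=None):
    """Randomly split subjects into experimental and control groups."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(subjects)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

residents = [f"resident_{i:02d}" for i in range(1, 21)]  # hypothetical IDs
experimental, control = randomize(residents, seed=42)
```

Since group membership depends only on the shuffle, any learner characteristic is equally likely to end up in either group.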

Quasi-experiment

There are many instances in medical education where randomization may not be feasible or ethical. For instance, researchers wanting to test the effect of a new curriculum among medical students may not be able to randomize learners due to competing curricular obligations and schedules. In these cases, researchers may be forced to assign subjects to experimental and control groups based upon some other criterion beyond randomization, such as different classrooms or different sections of the same course. This process, called quasi-randomization, does not inherently lead to internal validity threats, as long as research investigators are mindful of measuring and controlling for extraneous variables between study groups. 19  

Single-group Methodologies

All experimental study designs compare two or more groups: experimental and control. A common experimental study design in medical education research is the single-group pretest–posttest design, which compares a group of learners before and after the implementation of an intervention. 21   In essence, a single-group pre–post design compares an experimental group ( i.e. , postintervention) to a “no-intervention” control group ( i.e. , preintervention). 19   This study design is problematic for several reasons. Consider the following hypothetical example: A research article reports the effects of a year-long intubation curriculum for first-year anesthesiology residents. All residents participate in monthly, half-day workshops over the course of an academic year. The article reports a positive effect on residents’ skills as demonstrated by a significant improvement in intubation success rates at the end of the year when compared to the beginning.

This study does little to advance the science of learning among anesthesiology residents. While this hypothetical report demonstrates an improvement in residents’ intubation success before versus after the intervention, it does not tell us why the workshop worked, how it compares to other educational interventions, or how it fits into the broader picture of anesthesia training.

Single-group pre–post study designs open themselves to a myriad of threats to internal validity. 20   In our hypothetical example, the improvement in residents’ intubation skills may have been due to other educational experience(s) ( i.e. , implementation threat) and/or improvement in manual dexterity that occurred naturally with time ( i.e. , maturation threat), rather than the airway curriculum. Consequently, single-group pre–post studies should be interpreted with caution. 18  

Repeated testing, before and after the intervention, is one strategy that can be used to reduce some of the inherent limitations of the single-group study design. Repeated pretesting can mitigate the effect of regression toward the mean, a statistical phenomenon whereby low pretest scores tend to move closer to the mean on subsequent testing (regardless of intervention). 20   Likewise, repeated posttesting at multiple time intervals can provide potentially useful information about the short- and long-term effects of an intervention ( e.g. , the “durability” of the gain in knowledge, skill, or attitude).
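Regression toward the mean can be demonstrated with a short simulation in which no intervention takes place at all. The sketch below assumes that each test score equals a stable skill level plus independent measurement noise; the score scale, sample size, and seed are arbitrary choices made for illustration.

```python
import random

def simulate_pre_post(n=2000, seed=0):
    """Simulate two noisy tests of a stable skill with NO intervention."""
    rng = random.Random(seed)
    ability = [rng.gauss(70, 10) for _ in range(n)]   # stable true skill
    pre = [a + rng.gauss(0, 10) for a in ability]     # noisy pretest
    post = [a + rng.gauss(0, 10) for a in ability]    # noisy posttest
    return pre, post

def mean(values):
    return sum(values) / len(values)

pre, post = simulate_pre_post()

# Select the lowest-scoring 10% on the pretest, as a remedial program might.
cutoff = sorted(pre)[len(pre) // 10]
low = [i for i, score in enumerate(pre) if score < cutoff]

low_pre = mean([pre[i] for i in low])
low_post = mean([post[i] for i in low])
print("low scorers, pretest mean: ", round(low_pre, 1))
print("low scorers, posttest mean:", round(low_post, 1))
```

The selected group scores higher on the second test although nothing was taught, which is why a single low pretest is a weak baseline for judging an intervention.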

Observational Research

Unlike experimental studies, observational research does not involve manipulation of any variables. These studies often involve measuring associations, developing psychometric instruments, or conducting surveys.

Association Research

Association research seeks to identify relationships between two or more variables within a group or groups (correlational research), or similarities/differences between two or more existing groups (causal–comparative research). For example, correlational research might seek to measure the relationship between burnout and educational debt among anesthesiology residents, while causal–comparative research may seek to measure differences in educational debt and/or burnout between anesthesiology and surgery residents. Notably, association research may identify relationships between variables, but does not necessarily support a causal relationship between them.
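For the correlational case, a minimal sketch of the Pearson correlation coefficient is given below. The debt and burnout values are invented, and the variable names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented data for five hypothetical residents.
debt = [50, 120, 200, 260, 310]          # educational debt, thousands of dollars
burnout = [2.1, 2.8, 3.0, 3.9, 4.2]      # score on a hypothetical burnout scale

r = pearson_r(debt, burnout)
r_swapped = pearson_r(burnout, debt)     # same value: r has no direction
```

The coefficient is identical whichever variable is listed first, which is one way to see that a correlation by itself says nothing about which variable, if either, influences the other.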

Psychometric and Survey Research

Psychometric instruments measure a psychologic or cognitive construct such as knowledge, satisfaction, beliefs, and symptoms. Surveys are one type of psychometric instrument, but many other types exist, such as evaluations of direct observation, written examinations, or screening tools. 22   Psychometric instruments are ubiquitous in medical education research and can be used to describe a trait within a study population ( e.g. , rates of depression among medical students) or to measure associations between study variables ( e.g. , association between depression and board scores among medical students).

Psychometric and survey research studies are prone to the internal validity threats listed in table 3 , particularly those relating to mortality, location, and instrumentation. 18   Additionally, readers must ensure that the instrument scores can be trusted to truly represent the construct being measured. For example, suppose you encounter a research article demonstrating a positive association between attending physician teaching effectiveness as measured by a survey of medical students, and the frequency with which the attending physician provides coffee and doughnuts on rounds. Can we be confident that this survey administered to medical students is truly measuring teaching effectiveness? Or is it simply measuring the attending physician’s “likability”? Issues related to measurement and the trustworthiness of data are described in detail in the following section on measurement and the related issues of validity and reliability.

Measurement refers to “the assigning of numbers to individuals in a systematic way as a means of representing properties of the individuals.” 23   Research data can only be trusted insofar as we trust the measurement used to obtain the data. Measurement is of particular importance in medical education research because many of the constructs being measured ( e.g. , knowledge, skill, attitudes) are abstract and subject to measurement error. 24   This section highlights two specific issues related to the trustworthiness of data: the validity and reliability of measurements.

Validity regarding the scores of a measurement instrument “refers to the degree to which evidence and theory support the interpretations of the [instrument’s results] for the proposed use of the [instrument].” 25   In essence, do we believe the results obtained from a measurement really represent what we were trying to measure? Note that validity evidence for the scores of a measurement instrument is separate from the internal validity of a research study. Several frameworks for validity evidence exist. Table 4 2 , 22 , 26   represents the most commonly used framework, developed by Messick, 27   which identifies sources of validity evidence—to support the target construct—from five main categories: content, response process, internal structure, relations to other variables, and consequences.

Sources of Validity Evidence for Measurement Instruments


Reliability

Reliability refers to the consistency of scores for a measurement instrument. 22 , 25 , 28   For an instrument to be reliable, we would anticipate that two individuals rating the same object of measurement in a specific context would provide the same scores. 25   Further, if the scores for an instrument are reliable between raters of the same object of measurement, then we can extrapolate that any difference in scores between two objects represents a true difference across the sample, and is not due to random variation in measurement. 29   Reliability can be demonstrated through a variety of methods such as internal consistency ( e.g. , Cronbach’s alpha), temporal stability ( e.g. , test–retest reliability), interrater agreement ( e.g. , intraclass correlation coefficient), and generalizability theory (generalizability coefficient). 22 , 29  
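As an illustration of internal consistency, Cronbach's alpha can be computed from item-level scores with the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The sketch below is a minimal version that assumes complete data (no missing responses). The function name and the example scores are hypothetical.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item; position i in every list
    belongs to the same respondent. Assumes no missing data.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    sum_item_variances = sum(statistics.variance(item) for item in items)
    # Total score per respondent, summed across items.
    totals = [sum(scores) for scores in zip(*items)]
    total_variance = statistics.variance(totals)
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical scores: three items answered by five respondents.
example_items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
print(f"alpha = {cronbach_alpha(example_items):.2f}")
```

Alpha approaches 1 when items rise and fall together across respondents, and falls when items vary independently of one another.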

Example of a Validity and Reliability Argument

This section provides an illustration of validity and reliability in medical education. We use the signaling questions outlined in table 4 to make a validity and reliability argument for the Harvard Assessment of Anesthesia Resident Performance (HARP) instrument. 7   The HARP was developed by Blum et al. to measure the performance of anesthesia trainees that is required to provide safe anesthetic care to patients. According to the authors, the HARP is designed to be used “…as part of a multiscenario, simulation-based assessment” of resident performance. 7  

Content Validity: Does the Instrument’s Content Represent the Construct Being Measured?

To demonstrate content validity, instrument developers should describe the construct being measured and how the instrument was developed, and justify their approach. 25   The HARP is intended to measure resident performance in the critical domains required to provide safe anesthetic care. As such, investigators note that the HARP items were created through a two-step process. First, the instrument’s developers interviewed anesthesiologists with experience in resident education to identify the key traits needed for successful completion of anesthesia residency training. Second, the authors used a modified Delphi process to synthesize the responses into five key behaviors: (1) formulate a clear anesthetic plan, (2) modify the plan under changing conditions, (3) communicate effectively, (4) identify performance improvement opportunities, and (5) recognize one’s limits. 7 , 30  

Response Process Validity: Are Raters Interpreting the Instrument Items as Intended?

In the case of the HARP, the developers included a scoring rubric with behavioral anchors to ensure that faculty raters could clearly identify how resident performance in each domain should be scored. 7  

Internal Structure Validity: Do Instrument Items Measuring Similar Constructs Yield Homogenous Results? Do Instrument Items Measuring Different Constructs Yield Heterogeneous Results?

Item-correlation for the HARP demonstrated a high degree of correlation between some items ( e.g. , formulating a plan and modifying the plan under changing conditions) and a lower degree of correlation between other items ( e.g. , formulating a plan and identifying performance improvement opportunities). 30   This finding is expected since the items within the HARP are designed to assess separate performance domains, and we would expect residents’ functioning to vary across domains.

Relationship to Other Variables’ Validity: Do Instrument Scores Correlate with Other Measures of Similar or Different Constructs as Expected?

As it applies to the HARP, one would expect that the performance of anesthesia residents will improve over the course of training. Indeed, HARP scores were found to be generally higher among third-year residents compared to first-year residents. 30  

Consequence Validity: Are Instrument Results Being Used as Intended? Are There Unintended or Negative Uses of the Instrument Results?

While investigators did not intentionally seek out consequence validity evidence for the HARP, unanticipated consequences of HARP scores were identified by the authors as follows:

“Data indicated that CA-3s had a lower percentage of worrisome scores (rating 2 or lower) than CA-1s… However, it is concerning that any CA-3s had any worrisome scores…low performance of some CA-3 residents, albeit in the simulated environment, suggests opportunities for training improvement.” 30  

That is, using the HARP to measure the performance of CA-3 anesthesia residents had the unintended consequence of identifying the need for improvement in resident training.

Reliability: Are the Instrument’s Scores Reproducible and Consistent between Raters?

The HARP was applied by two raters for every resident in the study across seven different simulation scenarios. The investigators conducted a generalizability study of HARP scores to estimate the variance in assessment scores that was due to the resident, the rater, and the scenario. They found little variance was due to the rater ( i.e. , scores were consistent between raters), indicating a high level of reliability. 7  

Sampling refers to the selection of research subjects ( i.e. , the sample) from a larger group of eligible individuals ( i.e. , the population). 31   Effective sampling leads to the inclusion of research subjects who represent the larger population of interest. Alternatively, ineffective sampling may lead to the selection of research subjects who are significantly different from the target population. Imagine that researchers want to explore the relationship between burnout and educational debt among pain medicine specialists. The researchers distribute a survey to 1,000 pain medicine specialists (the population), but only 300 individuals complete the survey (the sample). This result is problematic because the characteristics of those individuals who completed the survey and the entire population of pain medicine specialists may be fundamentally different. It is possible that the 300 study subjects may be experiencing more burnout and/or debt, and thus, were more motivated to complete the survey. Alternatively, the 700 nonresponders might have been too busy to respond and even more burned out than the 300 responders, which would suggest that the study findings were even more amplified than actually observed.

When evaluating a medical education research article, it is important to identify the sampling technique the researchers employed, how it might have influenced the results, and whether the results apply to the target population. 24  

Sampling Techniques

Sampling techniques generally fall into two categories: probability- or nonprobability-based. Probability-based sampling ensures that each individual within the target population has an equal opportunity of being selected as a research subject. Most commonly, this is done through random sampling, which should lead to a sample of research subjects that is similar to the target population. If significant differences between sample and population exist, those differences should be due to random chance, rather than systematic bias. The difference between data from a random sample and that from the population is referred to as sampling error. 24  

Nonprobability-based sampling involves selecting research participants such that inclusion of some individuals may be more likely than the inclusion of others. 31   Convenience sampling is one such example and involves selection of research subjects based upon ease or opportuneness. Convenience sampling is common in medical education research, but, as outlined in the example at the beginning of this section, it can lead to sampling bias. 24   When evaluating an article that uses nonprobability-based sampling, it is important to look for participation/response rate. In general, a participation rate of less than 75% should be viewed with skepticism. 21   Additionally, it is important to determine whether characteristics of participants and nonparticipants were reported and if significant differences between the two groups exist.
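The participation-rate check is simple arithmetic. The sketch below applies the 75% guideline to the pain medicine survey example from the beginning of this section. The function name and the constant are illustrative only.

```python
def response_rate(completed, invited):
    """Percentage of invited individuals who completed the survey."""
    if invited <= 0:
        raise ValueError("invited must be a positive number")
    return 100.0 * completed / invited


SKEPTICISM_THRESHOLD = 75.0  # guideline cited in the text above

rate = response_rate(completed=300, invited=1000)
print(f"Response rate: {rate:.1f}%")  # Response rate: 30.0%
if rate < SKEPTICISM_THRESHOLD:
    print("Below 75%: check whether responders and nonresponders were compared.")
```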

Interpreting medical education research requires a basic understanding of common ways in which quantitative data are analyzed and displayed. In this section, we highlight two broad topics that are of particular importance when evaluating research articles.

The Nature of the Measurement Variable

Measurement variables in quantitative research generally fall into three categories: nominal, ordinal, or interval. 24   Nominal variables (sometimes called categorical variables) involve data that can be placed into discrete categories without a specific order or structure. Examples include sex (male or female) and professional degree (M.D., D.O., M.B.B.S., etc.) where there is no clear hierarchical order to the categories. Ordinal variables can be ranked according to some criterion, but the spacing between categories may not be equal. Examples of ordinal variables may include measurements of satisfaction (satisfied vs. unsatisfied), agreement (disagree vs. agree), and educational experience (medical student, resident, fellow). As it applies to educational experience, it is noteworthy that even though education can be quantified in years, the spacing between years (i.e., educational “growth”) remains unequal. For instance, the difference in performance between second- and third-year medical students can differ dramatically from the difference between third- and fourth-year medical students. Interval variables can also be ranked according to some criterion, but, unlike ordinal variables, the spacing between variable categories is equal. Examples of interval variables include test scores and salary. However, the conceptual boundaries between these measurement variables are not always clear, as in the case where ordinal scales can be assumed to have the properties of an interval scale, so long as the data’s distribution is not substantially skewed. 32  

Understanding the nature of the measurement variable is important when evaluating how the data are analyzed and reported. Medical education research commonly uses measurement instruments with items that are rated on Likert-type scales, whereby the respondent is asked to assess their level of agreement with a given statement. The response is often translated into a corresponding number ( e.g. , 1 = strongly disagree, 3 = neutral, 5 = strongly agree). It is remarkable that scores from Likert-type scales are sometimes not normally distributed ( i.e. , are skewed toward one end of the scale), indicating that the spacing between scores is unequal and the variable is ordinal in nature. In these cases, it is recommended to report results as frequencies or medians, rather than means and SDs. 33  

Consider an article evaluating medical students’ satisfaction with a new curriculum. Researchers measure satisfaction using a Likert-type scale (1 = very unsatisfied, 2 = unsatisfied, 3 = neutral, 4 = satisfied, 5 = very satisfied). A total of 20 medical students evaluate the curriculum, 10 of whom rate their satisfaction as “satisfied,” and 10 of whom rate it as “very satisfied.” In this case, it does not make much sense to report an average score of 4.5; it makes more sense to report results in terms of frequency ( e.g. , half of the students were “very satisfied” with the curriculum, and half were not).
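The curriculum example above can be reproduced in a few lines. The sketch below uses only the 20 ratings described in the text. The variable names are illustrative.

```python
from collections import Counter
import statistics

LABELS = {
    1: "very unsatisfied",
    2: "unsatisfied",
    3: "neutral",
    4: "satisfied",
    5: "very satisfied",
}

# 10 students chose "satisfied" (4) and 10 chose "very satisfied" (5).
ratings = [4] * 10 + [5] * 10

counts = Counter(ratings)
for score in sorted(counts):
    share = 100 * counts[score] / len(ratings)
    print(f"{LABELS[score]}: {counts[score]} ({share:.0f}%)")

mean_score = statistics.mean(ratings)      # 4.5, a value no student selected
median_score = statistics.median(ratings)  # also 4.5 for this even split
print(f"mean = {mean_score}, median = {median_score}")
```

Because the responses split evenly between two adjacent categories, both the mean and the median land on 4.5, so the frequency table is the summary that describes what students actually reported.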

Effect Size and CIs

In medical education, as in other research disciplines, it is common to report statistically significant results (i.e., small P values) in order to increase the likelihood of publication. 34 , 35   However, a significant P value in itself does not necessarily represent the educational impact of the study results. A statement like “Intervention x was associated with a significant improvement in learners’ intubation skill compared to education intervention y (P < 0.05)” tells us that, if there were truly no difference between interventions x and y, a difference in improvement at least as large as the one observed would be expected less than 5% of the time. Yet that does not mean that the study intervention necessarily caused the nonchance results, or indicate whether the between-group difference is educationally significant. Therefore, readers should consider looking beyond the P value to effect size and/or CI when interpreting the study results. 36 , 37  

Effect size is “the magnitude of the difference between two groups,” which helps to quantify the educational significance of the research results. 37   Common measures of effect size include Cohen’s d (standardized difference between two means), risk ratio (compares binary outcomes between two groups), and Pearson’s r correlation (linear relationship between two continuous variables). 37   CIs represent “a range of values around a sample mean or proportion” and are a measure of precision. 31   While effect size and CI give more useful information than simple statistical significance, they are commonly omitted from medical education research articles. 35   In such instances, readers should be wary of overinterpreting a P value in isolation. For further information on effect size and CI, we direct readers to the work of Sullivan and Feinn 37   and Hulley et al. 31  
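To make these quantities concrete, the sketch below computes Cohen's d with a pooled standard deviation, and an approximate 95% CI for the difference between two group means. The scores are hypothetical. The CI uses the normal approximation (z = 1.96), which is an assumption made for brevity; with groups this small, a t-based interval would be wider.

```python
import math
import statistics


def cohens_d(a, b):
    """Standardized mean difference between groups a and b (pooled SD)."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


def mean_difference_ci(a, b, z=1.96):
    """Approximate CI for mean(a) - mean(b); z = 1.96 assumes normality."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return diff - z * se, diff + z * se


# Hypothetical intubation skill scores after intervention x and intervention y.
scores_x = [7, 8, 9, 10, 11]
scores_y = [5, 6, 7, 8, 9]

d = cohens_d(scores_x, scores_y)
low, high = mean_difference_ci(scores_x, scores_y)
print(f"Cohen's d = {d:.2f}; 95% CI for the difference = ({low:.2f}, {high:.2f})")
```

Here the standardized difference is large, but the interval is wide and its lower bound sits close to zero, which shows how imprecise an estimate from two groups of five can be.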

In this final section, we identify instruments that can be used to evaluate the quality of quantitative medical education research articles. To this point, we have focused on framing the study and research methodologies and identifying potential pitfalls to consider when appraising a specific article. This is important because how a study is framed and the choice of methodology require some subjective interpretation. Fortunately, there are several instruments available for evaluating medical education research methods and providing a structured approach to the evaluation process.

The Medical Education Research Study Quality Instrument (MERSQI) 21   and the Newcastle Ottawa Scale-Education (NOS-E) 38   are two commonly used instruments, both of which have an extensive body of validity evidence to support the interpretation of their scores. Table 5 21 , 39   provides more detail regarding the MERSQI, which includes evaluation of study design, sampling, data type, validity, data analysis, and outcomes. We have found that applying the MERSQI to manuscripts, articles, and protocols has intrinsic educational value, because this practice of application familiarizes MERSQI users with fundamental principles of medical education research. One aspect of the MERSQI that deserves special mention is the section on evaluating outcomes based on Kirkpatrick’s widely recognized hierarchy of reaction, learning, behavior, and results ( table 5 ; fig .). 40   Validity evidence for the scores of the MERSQI include its operational definitions to improve response process, excellent reliability, and internal consistency, as well as high correlation with other measures of study quality, likelihood of publication, citation rate, and an association between MERSQI score and the likelihood of study funding. 21 , 41   Additionally, consequence validity for the MERSQI scores has been demonstrated by its utility for identifying and disseminating high-quality research in medical education. 42  

Fig. Kirkpatrick’s hierarchy of outcomes as applied to education research. Reaction = Level 1, Learning = Level 2, Behavior = Level 3, Results = Level 4. Outcomes become more meaningful, yet more difficult to achieve, when progressing from Level 1 through Level 4. Adapted with permission from Beckman and Cook, 2007. 2  

The Medical Education Research Study Quality Instrument for Evaluating the Quality of Medical Education Research


The NOS-E is a newer tool to evaluate the quality of medical education research. It was developed as a modification of the Newcastle-Ottawa Scale 43   for appraising the quality of nonrandomized studies. The NOS-E includes items focusing on the representativeness of the experimental group, selection and compatibility of the control group, missing data/study retention, and blinding of outcome assessors. 38 , 39   Additional validity evidence for NOS-E scores includes operational definitions to improve response process, excellent reliability and internal consistency, and its correlation with other measures of study quality. 39   Notably, the complete NOS-E, along with its scoring rubric, can be found in the article by Cook and Reed. 39  

A recent comparison of the MERSQI and NOS-E found acceptable interrater reliability and good correlation between the two instruments. 39   However, noted differences exist between the MERSQI and NOS-E. Specifically, the MERSQI may be applied to a broad range of study designs, including experimental and cross-sectional research. Additionally, the MERSQI addresses issues related to measurement validity and data analysis, and places emphasis on educational outcomes. On the other hand, the NOS-E focuses specifically on experimental study designs, and on issues related to sampling techniques and outcome assessment. 39   Ultimately, the MERSQI and NOS-E are complementary tools that may be used together when evaluating the quality of medical education research.

Conclusions

This article provides an overview of quantitative research in medical education, underscores the main components of education research, and provides a general framework for evaluating research quality. We highlighted the importance of framing a study with respect to purpose, conceptual framework, and statement of study intent. We reviewed the most common research methodologies, along with threats to the validity of a study and its measurement instruments. Finally, we identified two complementary instruments, the MERSQI and NOS-E, for evaluating the quality of a medical education research study.

Bordage G: Conceptual frameworks to illuminate and magnify. Medical education. 2009; 43(4):312–9.

Cook DA, Beckman TJ: Current concepts in validity and reliability for psychometric instruments: Theory and application. The American journal of medicine. 2006; 119(2):166. e7–166. e116.

Fraenkel JR, Wallen NE, Hyun HH: How to Design and Evaluate Research in Education. 9th edition. New York, McGraw-Hill Education, 2015.

Hulley SB, Cummings SR, Browner WS, Grady DG, Newman TB: Designing clinical research. 4th edition. Philadelphia, Lippincott Williams & Wilkins, 2011.

Irby BJ, Brown G, Lara-Alecio R, Jackson S: The Handbook of Educational Theories. Charlotte, NC, Information Age Publishing, Inc., 2015

Standards for Educational and Psychological Testing (American Educational Research Association & American Psychological Association, 2014)

Swanwick T: Understanding medical education: Evidence, theory and practice, 2nd edition. Wiley-Blackwell, 2013.

Sullivan GM, Artino Jr AR: Analyzing and interpreting data from Likert-type scales. Journal of graduate medical education. 2013; 5(4):541–2.

Sullivan GM, Feinn R: Using effect size—or why the P value is not enough. Journal of graduate medical education. 2012; 4(3):279–82.

Tavakol M, Sandars J: Quantitative and qualitative methods in medical education research: AMEE Guide No 90: Part II. Medical teacher. 2014; 36(10):838–48.

Support was provided solely from institutional and/or departmental sources.

The authors declare no competing interests.


  • Research article
  • Open access
  • Published: 06 January 2021

Effects of the COVID-19 pandemic on medical students: a multicenter quantitative study

  • Aaron J. Harries   ORCID: orcid.org/0000-0001-7107-0995 1 ,
  • Carmen Lee 1 ,
  • Lee Jones 2 ,
  • Robert M. Rodriguez 1 ,
  • John A. Davis 2 ,
  • Megan Boysen-Osborn 3 ,
  • Kathleen J. Kashima 4 ,
  • N. Kevin Krane 5 ,
  • Guenevere Rae 6 ,
  • Nicholas Kman 7 ,
  • Jodi M. Langsfeld 8 &
  • Marianne Juarez 1  

BMC Medical Education volume 21, Article number: 14 (2021)


The COVID-19 pandemic disrupted the United States (US) medical education system with the necessary, yet unprecedented Association of American Medical Colleges (AAMC) national recommendation to pause all student clinical rotations with in-person patient care. This study is a quantitative analysis investigating the educational and psychological effects of the pandemic on US medical students and their reactions to the AAMC recommendation in order to inform medical education policy.

The authors sent a cross-sectional survey via email to medical students in their clinical training years at six medical schools during the initial peak phase of the COVID-19 pandemic. Survey questions aimed to evaluate students’ perceptions of COVID-19’s impact on medical education; ethical obligations during a pandemic; infection risk; anxiety and burnout; willingness and needed preparations to return to clinical rotations.

Seven hundred forty-one (29.5%) students responded. Nearly all students (93.7%) were not involved in clinical rotations with in-person patient contact at the time the study was conducted. Reactions to being removed were mixed, with 75.8% feeling this was appropriate, 34.7% guilty, 33.5% disappointed, and 27.0% relieved.

Most students (74.7%) agreed the pandemic had significantly disrupted their medical education, and believed they should continue with normal clinical rotations during this pandemic (61.3%). When asked if they would accept the risk of infection with COVID-19 if they returned to the clinical setting, 83.4% agreed.

Students reported the pandemic had moderate effects on their stress and anxiety levels with 84.1% of respondents feeling at least somewhat anxious. Adequate personal protective equipment (PPE) (53.5%) was the most important factor to feel safe returning to clinical rotations, followed by adequate testing for infection (19.3%) and antibody testing (16.2%).

Conclusions

The COVID-19 pandemic disrupted the education of US medical students in their clinical training years. The majority of students wanted to return to clinical rotations and were willing to accept the risk of COVID-19 infection. Students were most concerned with having enough PPE if allowed to return to clinical activities.


The COVID-19 pandemic has tested the limits of healthcare systems and challenged conventional practices in medical education. The rapid evolution of the pandemic dictated that critical decisions regarding the training of medical students in the United States (US) be made expeditiously, without significant input or guidance from the students themselves. On March 17, 2020, for the first time in modern US history, the Association of American Medical Colleges (AAMC), the largest national governing body of US medical schools, released guidance recommending that medical students immediately pause all clinical rotations to allow time to obtain additional information about the risks of COVID-19 and prepare for safe participation in the future. This decisive action would also conserve scarce resources such as personal protective equipment (PPE) and testing kits; minimize exposure of healthcare workers (HCWs) and the general population; and protect students’ education and wellbeing [ 1 ].

A similar precedent was set outside of the US during the SARS-CoV1 epidemic in 2003, where an initial cluster of infection in medical students in Hong Kong resulted in students being removed from hospital systems where SARS surfaced, including Hong Kong, Singapore and Toronto [ 2 , 3 ]. Later, studies demonstrated that the exclusion of Canadian students from those clinical environments resulted in frustration at lost learning opportunities and students’ inability to help [ 3 ]. International evidence also suggests that medical students perceive an ethical obligation to participate in pandemic response, and are willing to participate in scenarios similar to the current COVID-19 crisis, even when they believe the risk of infection to themselves to be high [ 4 , 5 , 6 ].

The sudden removal of some US medical students from educational settings has occurred previously in the wake of local disasters, with significant academic and personal impacts. In 2005, it was estimated that one-third of medical students experienced some degree of depression or post-traumatic stress disorder (PTSD) after Hurricane Katrina resulted in the closure of Tulane University School of Medicine [ 7 ].

Prior to the current COVID-19 pandemic, we found no studies investigating the effects of pandemics on the US medical education system or its students. The limited pool of evidence on medical student perceptions comes from two earlier global coronavirus surges, SARS and MERS, and studies of student anxiety related to pandemics are also limited to non-US populations [ 3 , 8 , 9 ]. Given the unprecedented nature of the current COVID-19 pandemic, there is concern that students may be missing out on meaningful educational experiences and months of clinical training with unknown effects on their current well-being or professional trajectory [ 10 ].

Our study, conducted during the initial peak phase of the COVID-19 pandemic, reports students’ perceptions of COVID-19’s impact on: medical student education; ethical obligations during a pandemic; perceptions of infection risk; anxiety and burnout; willingness to return to clinical rotations; and needed preparations to return safely. This data may help inform policies regarding the roles of medical students in clinical training during the current pandemic and prepare for the possibility of future pandemics.

We conducted a cross-sectional survey during the initial peak phase of the COVID-19 pandemic in the United States, from 4/20/20 to 5/25/20, via email sent to all clinically rotating medical students at six US medical schools: University of California San Francisco School of Medicine (San Francisco, CA), University of California Irvine School of Medicine (Irvine, CA), Tulane University School of Medicine (New Orleans, LA), University of Illinois College of Medicine (Chicago, Peoria, Rockford, and Urbana, IL), Ohio State University College of Medicine (Columbus, OH), and Zucker School of Medicine at Hofstra/Northwell (Hempstead, NY). Traditional undergraduate medical education in the US comprises 4 years of medical school with 2 years of primarily pre-clinical classroom learning followed by 2 years of clinical training involving direct patient care. Study participants were defined as medical students involved in their clinical training years at whom the AAMC guidance statement was directed. Depending on the curricular schedule of each medical school, this included intended graduation class years of 2020 (graduating 4th year student), 2021 (rising 4th year student), and 2022 (rising 3rd year student), exclusive of planned time off. Participating schools were specifically chosen to represent a broad spectrum of students from different regions of the country (West, South, Midwest, East) with variable COVID-19 prevalence. We excluded medical students not yet involved in clinical rotations. This study was deemed exempt by the respective Institutional Review Boards.

We developed a survey instrument modeled after a survey used in a previously published peer reviewed study evaluating the effects of the COVID-19 pandemic on Emergency Physicians, which incorporated items from validated stress scales [ 11 ]. The survey was modified for use in medical students to assess perceptions of the following domains: perceived impact on medical student education; ethical beliefs surrounding obligations to participate clinically during the pandemic; perceptions of personal infection risk; anxiety and burnout related to the pandemic; willingness to return to clinical rotations; and preparation needed for students to feel safe in the clinical environment. Once created, the survey underwent an iterative process of input and review from our team of authors with experience in survey methodology and psychometric measures to allow for optimization of content and validity. We tested a pilot of our preliminary instrument on five medical students to ensure question clarity, and confirm completion of the survey in approximately 10 min. The final survey consisted of 29 Likert, yes/no, multiple choice, and free response questions. Both medical school deans and student class representatives distributed the survey via email, with three follow-up emails to increase response rates. Data was collected anonymously.

For example, to assess the impact on students’ anxiety, participants were asked, “How much has the COVID-19 pandemic affected your stress or anxiety levels?” using a unipolar 7-point scale (1 = not at all, 4 = somewhat, 7 = extremely). To assess willingness to return to clinical rotations, participants were asked to rate on a bipolar scale (1 = strongly disagree, 2 = disagree, 3 = somewhat disagree, 4 = neither disagree nor agree, 5 = somewhat agree, 6 = agree, and 7 = strongly agree) their agreement with the statement: “to the extent possible, medical students should continue with normal clinical rotations during this pandemic.” (Survey Instrument, Supplemental Table  1 ).

Survey data was managed using Qualtrics hosted by the University of California, San Francisco. For data analysis we used STATA v15.1 (Stata Corp, College Station, TX). We summarized respondent characteristics and key responses as raw counts, frequency percent, medians and interquartile ranges (IQR). For responses to bipolar questions, we combined positive responses (somewhat agree, agree, or strongly agree) into an agreement percentage. To compare differences in medians we used a signed rank test with p value < 0.05 to show statistical difference. In a secondary analysis we stratified data to compare questions within key domains amongst the following sub-groups: female versus male, graduation year, local community COVID-19 prevalence (high, medium, low), and students on clinical rotations with in-person patient care. This secondary analysis used a chi square test with p value < 0.05 to show statistical difference between sub-group agreement percentages.
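The agreement-percentage and chi-square steps described above can be illustrated with a minimal Python sketch. This is not the authors' Stata code: the response values, the sub-group names, and the helper functions `agreement_percentage` and `chi_square_2x2` are hypothetical and serve only to show the arithmetic. The sketch assumes a Pearson chi-square test without continuity correction on a 2x2 table (1 degree of freedom).

```python
import math

# Hypothetical 7-point bipolar responses (1 = strongly disagree ... 7 = strongly agree)
# for two illustrative sub-groups. These are NOT the study's data.
group_a = [5, 6, 7, 4, 3, 5, 6, 2, 7, 5]
group_b = [3, 4, 2, 5, 1, 4, 3, 6, 2, 4]


def agreement_count(responses):
    """Count positive responses: 5, 6, or 7 (somewhat agree, agree, strongly agree)."""
    return sum(1 for r in responses if r >= 5)


def agreement_percentage(responses):
    """Share of respondents giving a positive response, in percent."""
    return 100.0 * agreement_count(responses) / len(responses)


def chi_square_2x2(a_yes, a_no, b_yes, b_no):
    """Pearson chi-square statistic and p value for a 2x2 table (1 df, no correction)."""
    n = a_yes + a_no + b_yes + b_no
    row_totals = (a_yes + a_no, b_yes + b_no)
    col_totals = (a_yes + b_yes, a_no + b_no)
    observed = ((a_yes, a_no), (b_yes, b_no))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


a_yes = agreement_count(group_a)
b_yes = agreement_count(group_b)
stat, p_value = chi_square_2x2(a_yes, len(group_a) - a_yes, b_yes, len(group_b) - b_yes)
print(f"Agreement: {agreement_percentage(group_a):.1f}% vs {agreement_percentage(group_b):.1f}%")
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

With samples this small the expected cell counts fall below 5, so an exact test would normally be preferred; the toy data are kept short only for readability.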

Of 2511 students contacted, we received 741 responses (29.5% response rate). Of these, 63.9% of respondents were female and 35.1% were male, with 1.0% reporting a different gender identity; 27.7% of responses came from the class of 2020, 53.5% from the class of 2021, and 18.7% from the class of 2022. (Demographics, Table 1 ).

Most student respondents (74.9%) had a clinical rotation that was cut short or canceled due to COVID-19 and 93.7% reported not being involved in clinical rotations with in-person patient contact at the time of the study. Regarding students’ perceptions of cancelled rotations (allowing for multiple reactions), 75.8% felt this was appropriate, 34.7% felt guilty for not being able to help patients and colleagues, 33.5% felt disappointed, and 27.0% felt relieved.

Most students (74.7%) agreed that their medical education had been significantly disrupted by the pandemic. Students also felt they were able to find meaningful learning experiences during the pandemic (72.1%). Free response examples included: taking a novel COVID-19 pandemic elective course, telehealth patient care, clinical rotations transitioned to virtual online courses, research or education electives, clinical and non-clinical COVID-19-related volunteering, and self-guided independent study electives. Students felt their medical schools were doing everything they could to help students adjust (72.7%). Overall, respondents felt the pandemic had interfered with their ability to develop skills needed to prepare for residency (61.4%), though fewer (45.7%) felt it had interfered with their ability to apply to residency. (Educational Impact, Fig.  1 ).

Fig. 1. Perceived educational impacts of the COVID-19 pandemic on medical students

A majority of medical students agreed they should be allowed to continue with normal clinical rotations during this pandemic (61.3%). Most students agreed (83.4%) that they accepted the risk of being infected with COVID-19, if they returned. When asked if students should be allowed to volunteer in clinical settings even if there is not a healthcare worker (HCW) shortage, 63.5% agreed; however, in the case of a HCW shortage only 19.5% believed students should be required to volunteer clinically. (Willingness to Participate Clinically, Fig.  2 ).

Fig. 2. Willingness to participate clinically during the COVID-19 pandemic

When asked if they perceived a moral, ethical, or professional obligation for medical students to help, 37.8% agreed that medical students have such an obligation during the current pandemic. This is in contrast to their perceptions of physicians: 87.1% of students agreed with a physician obligation to help during the COVID-19 pandemic. For both groups, students were asked if this obligation persisted without adequate PPE: only 10.9% of students believed medical students had this obligation, while 34.0% agreed physicians had this obligation. (Ethical Obligation, Fig.  3 ).

Fig. 3. Ethical obligation to volunteer during the COVID-19 pandemic

Given the assumption that there will not be a COVID-19 vaccine until 2021, students felt the single most important factor in a safe return to clinical rotations was having access to adequate PPE (53.3%), followed by adequate testing for infection (19.3%) and antibody testing for possible immunity (16.2%). Few students (5%) stated that nothing would make them feel comfortable until a vaccine is available. On a 1–7 scale (1 = not at all, 4 = somewhat, 7 = extremely), students felt somewhat prepared to use PPE during this pandemic in the clinical setting, median = 4 (IQR 4,6), and somewhat confident identifying symptoms most concerning for COVID-19, median = 4 (IQR 4,5). Students preferred to learn about PPE via video demonstration (76.7%), online modules (47.7%), and in-person or Zoom style conferences (44.7%).

Students believed they were likely to contract COVID-19 in general (75.6%), independent of a return to the clinical environment. Most respondents believed that missing some school or work would be a likely outcome (90.5%), and only a minority of students believed that hospitalization (22.1%) or death (4.3%) was slightly, moderately, or extremely likely.

On a 1–7 scale (1 = not at all, 4 = somewhat, and 7 = extremely), the median (IQR) reported effect of the COVID-19 pandemic on students’ stress or anxiety level was 5 (4, 6) with 84.1% of respondents feeling at least somewhat anxious due to the pandemic. Students’ perceived emotional exhaustion and burnout before the pandemic was a median = 2 (IQR 2,4) and since the pandemic started a median = 4 (IQR 2,5) with a median difference Δ = 2, p value < 0.001.
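A paired before/after comparison of this kind can be sketched as follows. This is an illustrative Python outline, not the authors' Stata analysis: the ratings are hypothetical, the function name `signed_rank_test` is an assumption, and the p value uses the large-sample normal approximation with a tie correction and no continuity correction, which may differ slightly from what a statistics package reports.

```python
import math
import statistics

# Hypothetical paired burnout ratings on the 1-7 scale. These are NOT the study's data.
before = [2, 2, 3, 1, 4, 2, 3, 2, 5, 2]
after = [4, 3, 5, 2, 4, 5, 4, 4, 6, 3]


def signed_rank_test(before, after):
    """Wilcoxon signed-rank test: returns (W+, two-sided p) via normal approximation."""
    # Zero differences are dropped, as in the classical form of the test.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    ordered = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[start]]):
            end += 1
        avg_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[ordered[k]] = avg_rank
        t = end - start + 1
        tie_term += t ** 3 - t
        start = end + 1
    # W+ is the sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    variance = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    z = (w_plus - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return w_plus, p_value


w_plus, p_value = signed_rank_test(before, after)
print("median before:", statistics.median(before))
print("median after:", statistics.median(after))
print(f"W+ = {w_plus}, p = {p_value:.4f}")
```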

Secondary analysis of key questions revealed statistical differences between sub-groups. Women were significantly more likely than men to agree that the pandemic had affected their anxiety. Several significant differences existed for the class of 2020 when compared to the classes of 2021 and 2022: they were less likely to report disruptions to their education, to prefer to return to rotations, and to report an effect on anxiety. There were no significant differences between students who were still involved with in-person patient care and those who were not. When areas with high COVID-19 prevalence at the time of the survey (New York and Louisiana) were compared with areas of medium (Illinois and Ohio) and low prevalence (California), students in high-prevalence areas were less likely to report that the pandemic had disrupted their education. Students in low prevalence areas were most likely to agree that medical students should return to rotations. There were no differences between prevalence groups in accepting the risk of infection to return, or subjective anxiety effects. (Stratification, Table 2).

The COVID-19 pandemic has fundamentally transformed education at all levels - from preschool to postgraduate. Although changes to K-12 and college education have been well documented [ 12 , 13 ], there have been very few studies to date investigating the effects of COVID-19 on undergraduate medical education [ 14 ]. To maintain the delicate balance between student safety and wellbeing, and the time-sensitive need to train future physicians, student input must guide decisions regarding their roles in the clinical arena. Student concerns related to the pandemic, paired with their desire to return to rotations despite the risks, suggest that medical students may take on emotional burdens as members of the patient care team even when not present in the clinical environment. This study offers insight into how best to support medical students as they return to clinical rotations, how to prepare them for successful careers ahead, and how to plan for their potential roles in future pandemics.

Previous international studies of medical student attitudes towards hypothetical influenza-like pandemics demonstrated a willingness (80%) [ 4 ] and a perceived ethical obligation to volunteer (77 and 70%), despite 40% of Canadian students in one study perceiving a high likelihood of becoming infected [ 5 , 6 ]. Amidst the current COVID-19 pandemic, our participants reported less agreement with a medical student ethical obligation to volunteer in the clinical setting at 37.8%, but believed in a higher likelihood of becoming infected at 75.6%. Their willingness to be allowed to volunteer freely (63.5%) may suggest that the stresses of an ongoing pandemic alter students’ perceptions of the ethical requirement more than their willingness to help. Students overwhelmingly agreed that physicians had an ethical obligation to provide care during the COVID-19 pandemic (87.1%), possibly reflecting how they view the ethical transition from student to physician, or differences between paid professionals and paying for an education.

At the time our study was conducted, there were widespread concerns for possible HCW shortages. It was unclear whether medical students would be called to volunteer when residents became ill, or even graduate early to start residency training immediately (as occurred at half of schools surveyed). This timing allowed us to capture a truly unique perspective amongst medical students, a majority of whom reported increased anxiety and burnout due to the pandemic. At the same time, students felt that their medical schools were doing everything possible to support them, perhaps driven by virtual town halls and daily communication updates.

Trends in secondary analysis show important differences in the impacts of the pandemic. Women were more likely to report increased anxiety as compared to men, which may reflect broader gender differences in medical student anxiety [ 15 ] but requires more study to rule out different pandemic stresses by gender. Graduating medical students (class of 2020) overall described less impact on medical education and anxiety, a decreased desire to return to rotations, but equal acceptance of the risk of infection in clinical settings, possibly reflecting a focus on their upcoming intern year rather than the remaining months of undergraduate medical education. Since this class’s responses decreased overall agreement on these questions, educational impacts and anxiety effects may have been even greater had they been assessed further from graduation. Interestingly, students from areas with high local COVID-19 prevalence (New York and Louisiana) reported a less significant effect of the pandemic on their education, a paradoxical result that may indicate that medical student tolerance for the disruptions was greater in high-prevalence areas, as these students were removed at the same, if not higher, rates as their peers. Our results suggest that in future waves of the current pandemic or other disasters, students may be more patient with educational impacts when they have more immediate awareness of strains on the healthcare system.

A limitation of our study was the survey response rate, which was anticipated given the challenges students were facing. Some may not have been living near campus; others may have stopped reading emails due to early graduation or limited access to email; and some would likely be dealing with additional personal challenges related to the pandemic. We attempted to increase response rates by having the study sent directly from medical school deans and leadership, as well as respective class representatives, and by sending reminders for completion. The survey was not incentivized, and a higher response rate in the class of 2021 across all schools may indicate that students who felt their education was most affected were most likely to respond. We addressed this potential source of bias in the secondary analysis, which showed no differences between 2021 and 2022 respondents. Another limitation, inherent to survey data collection, was that a small number of surveys had missing responses for some questions. This resulted in slight variability in the total responses received for certain questions; these differences were not statistically significant. To be transparent about this limitation, we presented our data by stating each total response and denominator in the Tables.

This initial study lays the groundwork for future investigations and next steps. With 72.1% of students agreeing that they were able to find meaningful learning in spite of the pandemic, future research should investigate novel learning modalities that were successful during this time. Educators should consider additional training on PPE use, given only moderate levels of student comfort in this area, which may be best received via video. It is also important to study the long-term effects of missing several months of essential clinical training and identifying competencies that may not have been achieved, since students perceived a significant disruption to their ability to prepare skills for residency. Next steps could be to study curriculum interventions, such as capstone boot camps and targeted didactic skills training, to help students feel more comfortable as they transition into residency. Educators must also acknowledge that some students may not feel comfortable returning to the clinical environment until a vaccine becomes available (5%) and ensure they are equally supported. Lastly, it is vital to further investigate the mental health effects of the pandemic on medical students, identifying subgroups with additional stressors, needs related to anxiety or possible PTSD, and ways to minimize these negative effects.

In this cross-sectional survey, conducted during the initial peak phase of the COVID-19 pandemic, we capture a snapshot of the effects of the pandemic on US medical students and gain insight into their reactions to the unprecedented AAMC national recommendation for removal from clinical rotations. Student respondents from across the US similarly recognized a significant disruption to their medical education, shared a desire to continue with in-person rotations, and were willing to accept the risk of infection with COVID-19. Our novel results provide a solid foundation to help shape medical student roles in the clinical environment during this pandemic and future outbreaks.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Association of American Medical Colleges. Interim Guidance on Medical Students’ Participation in Direct Patient Contact Activities: Principles and Guidelines. https://www.aamc.org/news-insights/press-releases/important-guidance-medical-students-clinical-rotations-during-coronavirus-covid-19-outbreak . Published March 17, 2020. Accessed April 1, 2020.

Clark J. Fear of SARS thwarts medical education in Toronto. BMJ. 2003;326(7393):784. https://doi.org/10.1136/bmj.326.7393.784/c .


Loh LC, Ali AM, Ang TH, Chelliah A. Impact of a spreading epidemic on medical students. Malays J Med Sci. 2006;13(2):30–6.


Mortelmans LJ, Bouman SJ, Gaakeer MI, Dieltiens G, Anseeuw K, Sabbe MB. Dutch senior medical students and disaster medicine: a national survey. Int J Emerg Med. 2015;8(1):77. https://doi.org/10.1186/s12245-015-0077-0 .

Huapaya JA, Maquera-Afaray J, García PJ, Cárcamo C, Cieza JA. Conocimientos, prácticas y actitudes hacia el voluntariado ante una influenza pandémica: estudio transversal con estudiantes de medicina en Perú [Knowledge, practices and attitudes toward volunteer work in an influenza pandemic: cross-sectional study with Peruvian medical students]. Medwave. 2015;15(4):e6136. Published 2015 May 8. https://doi.org/10.5867/medwave.2015.04.6136 .

Herman B, Rosychuk RJ, Bailey T, Lake R, Yonge O, Marrie TJ. Medical students and pandemic influenza. Emerg Infect Dis. 2007;13(11):1781–3. https://doi.org/10.3201/eid1311.070279 .

Kahn MJ, Markert RJ, Johnson JE, Owens D, Krane NK. Psychiatric issues and answers following hurricane Katrina. Acad Psychiatry. 2007;31(3):200–4. https://doi.org/10.1176/appi.ap.31.3.200 .

Al-Rabiaah A, Temsah MH, Al-Eyadhy AA, et al. Middle East respiratory syndrome-Corona virus (MERS-CoV) associated stress among medical students at a university teaching hospital in Saudi Arabia. J Infect Public Health. 2020;13(5):687–91. https://doi.org/10.1016/j.jiph.2020.01.005 .

Wong JG, Cheung EP, Cheung V, et al. Psychological responses to the SARS outbreak in healthcare students in Hong Kong. Med Teach. 2004;26(7):657–9. https://doi.org/10.1080/01421590400006572 .

Stokes DC. Senior medical students in the COVID-19 response: an opportunity to be proactive. Acad Emerg Med. 2020;27(4):343–5. https://doi.org/10.1111/acem.13972 .

Rodriguez RM, Medak AJ, Baumann BM, et al. Academic emergency medicine physicians’ anxiety levels, stressors, and potential mitigation measures during the acceleration phase of the COVID-19 pandemic. Acad Emerg Med. 2020;27(8):700–7. https://doi.org/10.1111/acem.14065 .

Sahu P. Closure of universities due to coronavirus disease 2019 (COVID-19): impact on education and mental health of students and academic staff. Cureus. 2020;12(4):e7541. Published 2020 Apr 4. https://doi.org/10.7759/cureus.7541 .

Reimers FM, Schleicher A. A framework to guide an education response to the COVID-19 pandemic of 2020: OECD. https://www.hm.ee/sites/default/files/framework_guide_v1_002_harward.pdf .

Choi B, Jegatheeswaran L, Minocha A, Alhilani M, Nakhoul M, Mutengesa E. The impact of the COVID-19 pandemic on final year medical students in the United Kingdom: a national survey. BMC Med Educ. 2020;20:206–16. https://doi.org/10.1186/s12909-020-02117-1 .

Dyrbye LN, Thomas MR, Shanafelt TD. Systematic review of depression, anxiety, and other indicators of psychological distress among U.S. and Canadian medical students. Acad Med. 2006;81(4):354–73. https://doi.org/10.1097/00001888-200604000-00009 .


Acknowledgments

The authors wish to thank Newton Addo, UCSF Statistician.

Author information

Authors and affiliations.

Department of Emergency Medicine, University of California San Francisco School of Medicine, San Francisco General Hospital, 1001 Potrero Avenue, Building 5, Room #6A4, San Francisco, California, 94110, USA

Aaron J. Harries, Carmen Lee, Robert M. Rodriguez & Marianne Juarez

University of California San Francisco School of Medicine, San Francisco, California, USA

Lee Jones & John A. Davis

Clinical Emergency Medicine, University of California Irvine School of Medicine, Irvine, CA, USA

Megan Boysen-Osborn

University of Illinois College of Medicine, Chicago, IL, USA

Kathleen J. Kashima

Deming Department of Medicine, Tulane University School of Medicine, New Orleans, Louisiana, USA

N. Kevin Krane

Basic Science Education, Tulane University School of Medicine, New Orleans, Louisiana, USA

Guenevere Rae

Emergency Medicine, Ohio State College of Medicine, Columbus, OH, USA

Nicholas Kman

Department of Science Education, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, USA

Jodi M. Langsfeld


Contributions

All authors made substantial contributions to the study and met the specific conditions listed in the BMC Medical Education editorial policy for authorship. All authors have read and approved the manuscript. AH as principal investigator contributed to study design, survey instrument creation, IRB submission for his respective medical school, acquisition of data and recruitment of other participating medical schools, data analysis, writing and editing the manuscript. CL contributed to background literature review, study design, survey instrument creation, acquisition of data, data analysis, writing and editing the manuscript. LJ contributed to study design, survey instrument creation, acquisition of data from his respective medical school and recruitment of other participating medical schools, data analysis, and editing the manuscript. RR contributed to study design, survey instrument creation, data analysis, writing and editing the manuscript. JD contributed to study design, survey instrument creation, recruitment of other participating medical schools, data analysis, and editing the manuscript. MBO contributed as individual site principal investigator obtaining IRB exemption acceptance and acquisition of data from her respective medical school along with editing the manuscript. KK contributed as individual site principal investigator obtaining IRB exemption acceptance and acquisition of data from her respective medical school along with editing the manuscript. NKK contributed as individual site co-principal investigator obtaining IRB exemption acceptance and acquisition of data from his respective medical school along with editing the manuscript. GR contributed as individual site co-principal investigator obtaining IRB exemption acceptance and acquisition of data from her respective medical school along with editing the manuscript. 
NK contributed as individual site principal investigator obtaining IRB exemption acceptance and acquisition of data from his respective medical school along with editing the manuscript. JL contributed as individual site principal investigator obtaining IRB exemption acceptance and acquisition of data from her respective medical school along with editing the manuscript. MJ contributed to study design, survey instrument creation, data analysis, writing and editing the manuscript.

Corresponding authors

Correspondence to Aaron J. Harries or Marianne Juarez.

Ethics declarations

Ethics approval and consent to participate.

This study was reviewed and deemed exempt by each participating medical school’s Institutional Review Board (IRB): University of California San Francisco School of Medicine, IRB# 20–30712, Reference# 280106; Tulane University School of Medicine, Reference # 2020–331; University of Illinois College of Medicine, IRB Protocol # 2012–0783; Ohio State University College of Medicine, Study ID# 2020E0463; Zucker School of Medicine at Hofstra/Northwell, Reference # 20200527-SOM-LAN-1; University of California Irvine School of Medicine, submitted self-exemption IRB form. In accordance with the IRB exemption approval, each survey participant received an email consent describing the study and their optional participation.

Consent for publication

This manuscript does not contain any individualized person’s data, therefore consent for publication was not necessary according to the IRB exemption approval.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Table S1.

Survey Instrument

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Harries, A.J., Lee, C., Jones, L. et al. Effects of the COVID-19 pandemic on medical students: a multicenter quantitative study. BMC Med Educ 21, 14 (2021). https://doi.org/10.1186/s12909-020-02462-1


Received: 29 July 2020

Accepted: 16 December 2020

Published: 06 January 2021

DOI: https://doi.org/10.1186/s12909-020-02462-1


  • Undergraduate medical education
  • COVID-19 pandemic
  • Medical student anxiety

BMC Medical Education

ISSN: 1472-6920



Published on 20.8.2020 in Vol 22, No 8 (2020): August

Digital Inequality During a Pandemic: Quantitative Study of Differences in COVID-19–Related Internet Uses and Outcomes Among the General Population


Original Paper

  • Alexander JAM van Deursen, Prof Dr  

University of Twente, Enschede, Netherlands

Corresponding Author:

Alexander JAM van Deursen, Prof Dr

University of Twente

Drienerlolaan 5

Enschede, 7500AE

Netherlands

Phone: 31 622942142

Fax: 31 534893200

Email: [email protected]

Background: The World Health Organization considers coronavirus disease (COVID-19) to be a public emergency threatening global health. During the crisis, the public’s need for web-based information and communication is a subject of focus. Digital inequality research has shown that internet access is not evenly distributed among the general population.

Objective: The aim of this study was to provide a timely understanding of how different people use the internet to meet their information and communication needs and the outcomes they gain from their internet use in relation to the COVID-19 pandemic. We also sought to reveal the extent to which gender, age, personality, health, literacy, education, economic and social resources, internet attitude, material access, internet access, and internet skills remain important factors in obtaining internet outcomes after people engage in the corresponding uses.

Methods: We used a web-based survey to draw upon a sample collected in the Netherlands. We obtained a dataset with 1733 respondents older than 18 years.

Results: Men are more likely to engage in COVID-19–related communication uses. Age is positively related to COVID-19–related information uses and negatively related to information and communication outcomes. Agreeableness is negatively related to both outcomes and to information uses. Neuroticism is positively related to both uses and to communication outcomes. Conscientiousness is not related to any of the uses or outcomes. Introversion is negatively related to communication outcomes. Finally, openness relates positively to all information uses and to both outcomes. Physical health has negative relationships with both outcomes. Health perception contributes positively to information uses and both outcomes. Traditional literacy has a positive relationship with information uses and both outcomes. Education has a positive relationship with information and communication uses. Economic and social resources played no roles. Internet attitude is positively related to information uses and outcomes but negatively related to communication uses and outcomes. Material access and internet access contributed to all uses and outcomes. Finally, several of the indicators and outcomes became insignificant after accounting for engagement in internet uses.

Conclusions: Digital inequality is a major concern among national and international scholars and policy makers. This contribution aimed to provide a broader understanding in the case of a major health pandemic by using the ongoing COVID-19 crisis as a context for empirical work. Several groups of people were identified as vulnerable, such as older people, less educated people, and people with physical health problems, low literacy levels, or low levels of internet skills. Generally, people who are already relatively advantaged are more likely to use the information and communication opportunities provided by the internet to their benefit in a health pandemic, while less advantaged individuals are less likely to benefit. Therefore, the COVID-19 crisis is also reinforcing existing inequalities.

Introduction

The World Health Organization considers coronavirus disease (COVID-19) to be a public emergency threatening global health [ 1 ]. Governments worldwide have taken stringent action, including requiring social distancing, closing public services, schools and universities, and canceling cultural events [ 2 , 3 ]. People are being advised or ordered to stay at home and socially isolate themselves to avoid being infected [ 4 ]. The ongoing pandemic represents an outbreak of an unparalleled scale, and it has induced widespread fear and uncertainty.

In this paper, we focus on the role of the internet during the crisis. The internet has become a crucial source for the general public, as it provides access to general information, the latest national and international developments, and guidelines on behavioral norms during the crisis. In this respect, the internet plays an important role in the great challenges facing governments regarding the transfer of knowledge and guidelines to the population at large. When individuals understand the need and rationale behind government-enforced measures, they are more motivated to comply and even adopt measures voluntarily [ 5 , 6 ]. In addition to informational purposes, the internet enables individuals to share news and experiences with people they cannot meet face-to-face, remain in contact with friends and family, seek support, and ask questions of official agencies, including health agencies. Further, the internet enables people to take initiatives such as raising money or preparing packaged meals for people in need, such as health workers or people who have lost their jobs. In sum, the internet plays a vital role for people of all social strata and backgrounds during a time of worldwide crisis. All people should thus be able to use the internet as a source of information and communication.

However, digital inequality research has shown that internet access is not evenly distributed among the general population [ 7 , 8 ]. The basic idea of digital inequality stems from a comparative perspective of social and information inequality, as there are benefits associated with internet access and negative consequences of lack of access [ 9 ]. Calamities are often a story of inequality [ 10 ]; therefore, in this paper, we aimed to gain a deeper and broader understanding of the differences in how people use the internet to cope during the COVID-19 crisis. Van Dijk’s resources and appropriation theory [ 8 ] explains differences or inequalities of internet access by considering personal and positional categories of individuals and the individuals’ resources. Internet access itself is considered to be a process of appropriation involving attitudinal access, material access, skills access, and in the final stage, usage access. The latter entails differences in the type of activities that people perform on the internet. The consequences of the process are the outcomes of internet use. These outcomes in turn reinforce personal and positional inequalities and an unequal distribution of resources [ 8 ] ( Figure 1 ). The first goal of this paper is to provide a timely understanding of how different people use the internet and the outcomes they gain from it in relation to the COVID-19 pandemic.

Internet use and outcome differences between groups of people are likely to have profound consequences for how people manage a crisis. For example, older people are most in danger of being infected with the virus and most likely to die from the infection [ 11 ], and they also use the internet less and have the fewest internet outcomes [ 12 ]. The latter may further endanger their precarious situation, as limited internet use and outcomes may result in a lack of critical information or necessary support.

COVID-19–Related Internet Uses and Outcomes

To study differences in internet uses and outcomes during the COVID-19 pandemic, it is necessary to understand the types of uses and outcomes that are at play. Typically, uses and outcomes are studied by following conceptual classifications that distinguish different domains, such as economic, social, cultural, or personal domains [ 13 ]. Here, we take the COVID-19 pandemic as the domain of interest. Within this domain, we consider two main and conceptually different types of uses and outcomes: information and communication [ 14 , 15 ]. Information internet uses involve searching for information on all aspects of COVID-19. Potential information outcomes include becoming better informed about the disease, understanding why certain measures are necessary, and limiting the risk of becoming infected by developing greater awareness of one’s own behavior. Communication internet uses include talking to friends about the crisis, asking questions on social media or online fora, giving advice, or offering support to others. Communication outcomes include finding people on the internet who can offer support or share concern, being less lonely, and protecting others from potential COVID-19 risks. Studying both types of uses and outcomes is important, as prior research has shown that communication uses can compensate for information uses to attain beneficial internet outcomes [ 16 ].

Determinants of COVID-19–Related Internet Uses and Outcomes

Digital inequality research suggests that the vast amount of web-based information and communication possibilities around the COVID-19 pandemic are likely to be difficult to grasp and conceptualize for sections of the general population [ 7 ]. Some frequently observed personal categorical inequalities are gender, age, personality, and health [ 7 ]. Earlier research revealed that men and women differ in their internet activities; women are more likely to use email and social media, whereas men are more likely to use the internet to obtain information [ 17 , 18 ]. Age in general has a negative impact on all types of internet uses and outcomes [ 7 ]. In the COVID-19 crisis, older people are especially vulnerable; therefore, it is very important for them to know how to behave and be safe. We hypothesize that (H1) men are more likely to be involved in information-related COVID-19 internet uses and outcomes, while women are more likely to be involved in communication-related COVID-19 internet uses and outcomes. We also hypothesize that (H2) age contributes negatively to COVID-19–related internet uses and outcomes.

An individual’s personality may hinder or stimulate their engagement in certain COVID-19–related activities. Cognitive appraisal theory suggests that individuals complete two types of cognitive appraisal processes in a crisis [ 19 ]. The process starts with an evaluation of the crisis as a potential source of danger or life disruption. If the crisis is not determined to be dangerous, it is not considered a stressor and does not require intervention. If the crisis is determined to be relevant, it is considered a stressor and must be evaluated further by balancing the demands of the crisis and the person’s resources [ 20 ]. At this point, personality enters the equation [ 20 ]. There is a general consensus regarding the Big Five model when personality traits are studied. This model proposes five personality traits: agreeableness, neuroticism, conscientiousness, introversion, and openness [ 21 ]. However, there is no agreement as to whether these traits contribute to or detract from resisting disturbance [ 20 ]. There is also no consensus on how the Big Five personality traits relate to internet use [ 7 , 22 ]. For example, conscientiousness relates to people who abide by rules. On one hand, one might argue that this would result in a greater need for information on how to behave. On the other hand, the internet is unstructured, and rules and policies are absent to a large extent. When linking personality traits to internet use for psychological adjustments to the COVID-19 crisis, it is not evident whether these traits will support or hinder COVID-19–related internet uses and outcomes. We hypothesize that (H3a) agreeableness, (H3b) neuroticism, (H3c) conscientiousness, (H3d) introversion, and (H3e) openness are related to COVID-19–related internet uses and outcomes.

An individual’s health may play an important role in how they approach COVID-19. To gain an elaborate understanding of how health relates to COVID-19–related internet uses and outcomes, we followed earlier research that distinguishes between three health aspects [ 23 ]: physical functioning (the degree to which a person’s health currently interferes with activities such as sports, carrying groceries, climbing stairs, and walking), mental health (psychological distress and well-being), and health perception (a person’s general rating of their own health). During a crisis, we expect that people with health issues are more likely to turn to the internet for comfort and reassurance. We hypothesize that (H4a) physical functioning, (H4b) mental health, and (H4c) health perception contribute negatively to COVID-19–related internet uses and outcomes.

The final type of personal inequality considered in this study is traditional literacy, which is known to have a substantial impact on how the internet is used [ 24 , 25 ]. We consider literacy to be the ability to read, write, and understand text, which is also framed under the umbrella terms functional literacy or fundamental literacy [ 24 ]. Functional or traditional literacy can be considered as the basic dimension of all literacy concepts [ 26 ]. Considering the crucial role the internet is playing in the COVID-19 crisis, a low level of literacy is a potentially large inhibitor of understanding information and being involved in web-based communication. We hypothesize that (H5) traditional literacy contributes positively to COVID-19–related internet uses and outcomes.

Education is the most frequently observed positional categorical inequality in digital divide research, and it is likely to play a role in the current context. People with higher levels of education are better equipped to comprehend web-based information and benefit from internet use [ 7 ]. We hypothesize that (H6) education contributes positively to COVID-19–related internet uses and outcomes.

When studying differences in internet uses and outcomes, the resources people can access are often derived from Pierre Bourdieu’s capital theory [ 27 ], which stresses the importance of including not only economic but also social and cultural resources to determine one’s status and position in society. In the COVID-19 pandemic, economic and social resources are likely to be important, as earlier research has shown that people with greater economic resources—mostly operationalized as income in digital inequality research—are known to use the internet more efficaciously and productively [ 7 , 28 ]. People with more social resources are more likely to have access to family, friends, or other contacts on the internet [ 29 ]. We hypothesize that (H7a) economic and (H7b) social resources contribute positively to COVID-19–related internet uses and outcomes.

The Internet Appropriation Process

The core of the resources and appropriation theory is access to technology, which is considered as a process of appropriation involving attitudinal, material, skills, and usage access. Attitudinal access concerns a person’s attitude towards the internet; according to theories of technology adoption, this type of access is crucial for using the internet [ 30 ]. Material access can be defined in terms of the different devices that people use to access the internet and all other web-based resources, including desktop computers, laptop computers, tablets, smartphones, game consoles, and interactive televisions [ 31 ]. Skills access concerns the skills necessary to use the internet, ranging from operational and information skills to social and content creation skills [ 32 ]. Prior research has revealed that all three types of internet access directly affect internet uses and outcomes [ 16 ]. We hypothesize that (H8a) attitudinal internet access, (H8b) material internet access, and (H8c) skills internet access contribute positively to COVID-19–related internet uses and outcomes.

The Effects of COVID-19–Related Internet Uses on Their Corresponding Outcomes

A recent multifaceted consideration of digital inequality revealed a strong effect of internet uses on outcomes [ 12 ]. Further, people’s internet activities appeared to be more important than their personal characteristics with regard to inequalities in outcomes of internet use. This suggests that the variables discussed in the prior sections will become less important for obtaining information outcomes when people are involved in COVID-19–related internet information uses. This is also true for COVID-19–related communication uses and outcomes. The second goal of this paper is to reveal the extent to which the indicators discussed remain important for obtaining internet outcomes after people are involved in the corresponding uses.

Recruitment

This study used a web-based survey and drew upon a sample collected in the Netherlands. To obtain a representative sample of the population, we used PanelClix, a professional organization for market research, to provide a panel of approximately 110,000 people. Members of the panel received a small incentive for every survey they completed. In the Netherlands, 98% of the population uses the internet; therefore, the internet user population is very closely representative of the general population in terms of its sociodemographic makeup. The panel included novice and advanced internet users. We aimed to obtain a dataset with approximately 1700 respondents over the age of 18, and we collected 1733 responses over a 1-week period in April 2020. During the data collection period, three amendments to the sampling frame were made to ensure that the sample was representative of the Dutch population. Accordingly, the analyses revealed that the gender, age, and formal education of our respondents largely matched official census data. As a result, only very small post hoc corrections were needed.

The web-based survey used software that checked for missing responses and prompted users to respond. The survey was pilot-tested with 10 internet users over two rounds. Amendments were made based on the feedback provided. No major comments were provided in the second round. The average time required to complete the survey was 20 minutes.

We initially developed 11 survey items pertaining to COVID-19–related internet use. Respondents were asked to indicate the extent to which they used the internet for various activities in the past month using a 5-point scale (“not” to “multiple times a day”) as an ordinal-level measure. Principal component analysis with varimax rotation was used to determine two underlying usage clusters, one related to information and one to communication. Items were retained if their factor loading was 0.4 or higher [ 33 ]. In total, 8 items (3 for information and 5 for communication) were retained in a two-factor structure with eigenvalues over 1.0, together accounting for 76% of the total variance.
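The retention rules described above can be sketched in a few lines of Python. This is an illustration only, not the analysis script used in the study: the function names, the example loadings, and the decision to drop items that load on more than one component are assumptions. Only the 0.4 loading cutoff and the eigenvalue-over-1.0 criterion come from the text.

```python
def retained_components(eigenvalues, cutoff=1.0):
    """Kaiser criterion: keep components whose eigenvalue exceeds the cutoff."""
    return [ev for ev in eigenvalues if ev > cutoff]


def variance_explained(eigenvalues, n_items, cutoff=1.0):
    """Share of total variance carried by the retained components.

    For a PCA on a correlation matrix, total variance equals the number of items.
    """
    return sum(retained_components(eigenvalues, cutoff)) / n_items


def assign_items(loadings, threshold=0.4):
    """Map each item to the one component on which it loads at or above the threshold.

    Items with no such loading, or with more than one, are dropped.
    The cross-loading rule is an assumption; the paper does not state it.
    """
    assigned = {}
    for item, values in loadings.items():
        strong = [i for i, v in enumerate(values) if abs(v) >= threshold]
        if len(strong) == 1:
            assigned[item] = strong[0]
    return assigned


# Hypothetical rotated loadings for four items on two components.
example = {
    "search_info": [0.82, 0.10],
    "consult_agencies": [0.77, 0.15],
    "give_advice": [0.12, 0.88],
    "ambiguous_item": [0.45, 0.52],
}
print(assign_items(example))
```

In this made-up example, the fourth item loads on both components and is therefore dropped, while the other three are assigned to a single component each.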

For COVID-19-related information and communication internet outcomes, we developed 14 items mapped onto the use items. A 5-point agreement scale as an ordinal level measure was used. Principal component analysis with varimax rotation resulted in a structure that matched the conceptual definition of information outcomes (4 items) and communication outcomes (4 items). The two factors showed eigenvalues over 1.0 and explained 65% of the variance.

Gender was included as a dichotomous variable, and age was directly asked (mean 50.2, SD 17.0).

Personality was measured with the Quick Big Five personality questionnaire [ 34 ], which consists of 30 adjectives reflecting a valid and reliable measure of the Big Five traits. Participants were asked to rate the extent to which a particular adjective applied to them on a 7-point scale, ranging from completely untrue to completely true. The Cronbach α values for the five traits were .89 for agreeableness, .88 for neuroticism, .88 for conscientiousness, .87 for introversion, and .81 for openness.
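For readers unfamiliar with the reliability coefficient reported here, the following is a minimal sketch of the standard Cronbach α formula, α = k/(k−1) × (1 − Σ item variances / variance of the sum score). The function name and the toy data are illustrative assumptions; the paper does not describe how its α values were computed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' sum scores.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: three respondents answering two items.
print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 3))
```

Values close to 1 indicate that the items vary together, which is why the coefficients of .81 to .89 reported above are read as good internal consistency.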

Physical health, mental health, and health perception were measured with the Dutch version of the Medical Outcomes Study (MOS) Short-Form General Health Survey (SF-20) [ 35 ]. This instrument enables respondents to assess their general health and generates composite summary scores representing different types of health. We normalized the scales, with higher scores representing better functioning. Physical health was measured with 5 items (2-point scale; α=.89; mean 1.75, SD 0.34), mental health with 5 items (5-point scale; α=.85; mean 3.65, SD 0.77), and health perception with 5 items (5-point scale; α=.86; mean 3.39, SD 0.85).

To measure traditional literacy, we used the validated 11-item Diagnostic Illiteracy Scale [ 36 ]. Sample items included “I have difficulties with reading and understanding information from my municipality” and “I find it difficult to read and understand my telephone bill.” A 5-point agreement scale was used. Scores on the scale exhibited high internal consistency. Items were recoded so that higher scores corresponded with higher levels of literacy (α=.94; mean 4.33, SD 0.71).
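The recoding step mentioned above can be illustrated as follows. This is a hedged sketch: it assumes that every item is worded as a difficulty (as the two sample items are), that agreement is scored 1 to 5, and that the scale score is the mean of the recoded items. The function names are hypothetical.

```python
def reverse_code(score, low=1, high=5):
    """Flip a Likert score so that the highest agreement becomes the lowest value."""
    if not low <= score <= high:
        raise ValueError("score outside the scale range")
    return low + high - score


def literacy_score(responses):
    """Mean of reverse-coded items; higher values indicate higher literacy."""
    recoded = [reverse_code(r) for r in responses]
    return sum(recoded) / len(recoded)


# A respondent who mostly disagrees with the difficulty statements.
print(literacy_score([1, 2, 1]))
```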

To assess education, data regarding degrees earned were collected and used to create three groups: low (primary), middle (secondary), and high (tertiary) educational achievement.

Economic resources were measured by asking for the annual family income over the last 12 months. Twelve categories were recoded into three categories: low for <€30,000 (<US $35,503.50), middle for €30,000 to €70,000 (US $35,503.50 to $82,841.50), and high for >€70,000 (>US $82,841.50). For social resources, we used the MOS Social Support Survey [ 37 ]. Respondents completed 18 items covering emotional support (eg, “Someone you can count on to listen when you need to talk”), informational support (eg, “Someone to give you good advice about a crisis”), and tangible support (eg, “Someone to help you if you were confined to bed”). All items were rated on a 5-point Likert scale with anchors of none of the time (1) and most of the time (5). We computed an aggregate measure of support availability (α=.96; mean 3.83, SD 0.85).
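The three income bands can be written as a small lookup. Note the assumption: the survey collected income in twelve categories rather than as an exact amount, so a function that takes a euro value is a simplification for illustration, and its name is hypothetical.

```python
def income_band(annual_income_eur):
    """Classify annual family income into the three bands used in the analysis."""
    if annual_income_eur < 30000:
        return "low"
    if annual_income_eur <= 70000:
        return "middle"
    return "high"


for amount in (25000, 30000, 70000, 85000):
    print(amount, income_band(amount))
```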

Attitudinal internet access was measured by three items adapted from the Digital Motivation Scale [ 38 ]. A 5-point agreement scale was used, and all items were balanced for the direction of response (α=.74; mean 4.10, SD 0.70). An example statement is “Technologies such as the internet and mobile phones make life easier.” To measure material internet access, we considered 7 devices used to connect to the internet (mean 3.43, SD 1.53). Included were desktop computer, laptop computer, tablet, smartphone, smart TV, game console, and smart device (eg, activity tracker). Finally, skills internet access was adapted from the conceptual idea behind the Internet Skills Scale [ 32 ]. We proposed 30 items reflecting operational, information navigation, social, and creative internet skills. A 20-item single skills construct resulted from the principal component analysis. All items were scored on a 5-point scale that ranged from “not at all true of me” to “very true of me” and exhibited high internal consistency (α=.96; mean 3.67, SD 0.97). Example items are “I know how to open downloaded files,” “I find it hard to decide what the best keywords are to use for online searches,” and “I know which information I should and shouldn’t share online.”

Statistical Analysis

To test the hypotheses and account for the sequentiality between COVID-19–related internet uses and outcomes, hierarchical regression analyses were used. In the first model, we tested our hypotheses by analyzing the significant determinants for the two types of COVID-19–related internet uses and the two corresponding outcomes. In the second model, we sought to determine the changes in the significance of the determinants after the internet uses were added to the models.
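The two-step comparison can be made concrete with the textbook F statistics for ordinary least squares. This is a sketch under stated assumptions: the paper does not report its software or whether weights were applied, the function names are hypothetical, and the formulas are the standard ones for an overall model test and for the gain in explained variance when predictors are added.

```python
def model_f(r2, n, k):
    """Overall F statistic for a model with k predictors fitted to n cases."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def f_change(r2_reduced, r2_full, n, k_full, m):
    """F statistic for the R-squared gain from adding m predictors.

    k_full is the number of predictors in the larger model.
    """
    return ((r2_full - r2_reduced) / m) / ((1 - r2_full) / (n - k_full - 1))


# Communication-use model: reported R-squared of .23, 1733 respondents,
# and 17 determinants (the predictor count is taken from the list of determinants).
print(round(model_f(0.23, 1733, 17), 2))
```

With these inputs the overall F comes out at about 30.13, in line with the value reported for that model. The same function applied to rounded R² values will not always reproduce reported statistics exactly, because rounding and any weighting affect the result.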

Table 1 provides an overview of the sample of people surveyed in the study.

Table 2 shows the mean scores of the survey questions related to internet uses and internet outcomes.

The first goal of this paper was addressed in the first model, as presented in Table 3 , where several significant determinants for COVID-19 uses and outcomes are revealed.

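Table 1. Overview of the sample (N=1733).

| Characteristic | Value, n (%) |
| --- | --- |
| Gender | |
| Male | 874 (50.4) |
| Female | 859 (49.6) |
| Age (years) | |
| 18-30 | 280 (16.2) |
| 31-40 | 271 (15.6) |
| 41-50 | 293 (16.9) |
| 51-60 | 338 (19.5) |
| 61-70 | 324 (18.7) |
| >70 | 227 (13.1) |
| Education^a | |
| Low | 519 (29.9) |
| Middle | 602 (34.7) |
| High | 612 (35.3) |

^a Low: primary; middle: secondary; high: tertiary.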

Table 2. Mean scores of the survey questions related to internet uses and internet outcomes.

| Category and questions | α | Mean (SD) |
| --- | --- | --- |
| COVID-19^a–related informational internet uses | .80 | 3.13 (1.53) |
| Search the internet for information about COVID-19 | | 3.76 (1.91) |
| Consult websites of public agencies (eg, RIVM^b, municipality, hospital, or government) | | 3.21 (1.83) |
| Search the internet for measures to prevent the further spread of COVID-19 | | 2.44 (1.71) |
| COVID-19–related communication internet uses | .92 | 1.56 (1.13) |
| Provide advice on COVID-19 to others via social media | | 1.56 (1.31) |
| Starting an action against COVID-19 via the internet (eg, collecting money, offering help) | | 1.41 (1.17) |
| Ask questions about COVID-19 on forums or social media | | 1.54 (1.30) |
| Comment on the internet on COVID-19 discussions (eg, on social media) | | 1.58 (1.34) |
| Offering help online to people who need it now | | 1.70 (1.41) |
| COVID-19–related informational internet outcomes | .80 | 3.17 (0.95) |
| The internet makes me better informed about COVID-19 | | 3.58 (1.13) |
| The internet makes me understand the measures against COVID-19 better | | 3.25 (1.15) |
| The internet helps me to reduce the risk of getting COVID-19 | | 3.15 (1.16) |
| Information about COVID-19 on the internet has made me more aware of my own behavior | | 2.70 (1.26) |
| COVID-19–related communication internet outcomes | .80 | 1.91 (0.89) |
| Through the internet I found someone who can help me in this time of COVID-19 | | 1.67 (1.04) |
| Through the internet I have found people with whom I can share my concerns about COVID-19 | | 1.83 (1.10) |
| Via the internet I contributed to the COVID-19 crisis (eg, collecting money, helping people) | | 1.83 (1.13) |
| The internet makes me less lonely now | | 2.29 (1.25) |

^a COVID-19: coronavirus disease.

^b RIVM: Rijksinstituut voor Volksgezondheid en Milieu.

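Table 3. Determinants of COVID-19–related internet uses and outcomes (Model 1).

| Characteristic | Information use: β | P value | Information outcome: β | P value | Communication use: β | P value | Communication outcome: β | P value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Gender (male or female) | .01 | .61 | .00 | .98 | –.08 | <.001 | .01 | .83 |
| Age | .08 | .01 | –.03 | .35 | –.08 | .003 | –.11 | <.001 |
| Agreeableness | –.07 | .03 | –.01 | .75 | –.13 | <.001 | –.08 | .003 |
| Neuroticism | .15 | <.001 | .15 | <.001 | .05 | .20 | .08 | .02 |
| Conscientiousness | .01 | .60 | –.02 | .52 | .01 | .54 | –.04 | .14 |
| Introversion | –.04 | .11 | .02 | .56 | –.09 | <.001 | –.06 | .02 |
| Openness | .08 | .004 | .03 | .30 | .14 | <.001 | .15 | <.001 |
| Physical health | –.04 | .15 | –.03 | .31 | –.15 | <.001 | –.10 | <.001 |
| Mental health | –.06 | .15 | .03 | .41 | –.06 | .11 | –.01 | .82 |
| Health perception | .07 | .05 | .04 | .31 | .16 | <.001 | .10 | <.001 |
| Traditional literacy | .09 | <.001 | .10 | .18 | .31 | <.001 | .33 | <.001 |
| Education | .08 | .002 | .02 | .36 | .07 | .003 | .02 | .34 |
| Economic resources | .03 | .23 | .04 | .13 | –.01 | .57 | –.03 | .23 |
| Social resources | .02 | .40 | .00 | .91 | –.02 | .36 | –.01 | .69 |
| Attitudinal access | .14 | <.001 | .29 | <.001 | –.06 | .01 | –.04 | .08 |
| Material access | .10 | <.001 | .06 | .02 | .08 | <.001 | .07 | .008 |
| Skills access | .08 | .006 | .08 | .008 | .09 | <.001 | .09 | <.001 |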

Table 3 shows that men are more likely to be involved in COVID-19–related communication uses. Age is positively related to COVID-19–related information uses and negatively related to COVID-19 communication uses and outcomes. Concerning personality traits, agreeableness is negatively related to COVID-19–related information and communication uses and to communication outcomes. Neuroticism is positively related to both uses and to communication outcomes.

Conscientiousness is not related to any of the uses or outcomes. Introversion is negatively related to COVID-19–related communication uses and outcomes, suggesting that extraverted people engage more in these communication activities. Finally, openness relates positively to information uses and to both outcomes.

The results further show that concerning the three health indicators, physical health is negatively related to communication uses and outcomes. Mental health did not contribute to any uses or outcomes. Health perception contributes positively to information uses and to both outcomes.

Traditional literacy has a positive relationship with information-type uses and with both outcomes, and education has a positive relationship with COVID-19–related information and communication uses. Economic and social resources were not related to any COVID-19 uses or outcomes.

Attitudinal internet access is positively related to information uses and outcomes but is negatively related to communication uses and outcomes. Material internet access contributes positively to all uses and outcomes, and skills access has a positive relationship with all uses and outcomes. Table 4 provides an overview of the hypotheses.

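Table 4. Overview of the hypotheses.

| Number | Hypothesis | Information uses | Information outcomes | Communication uses | Communication outcomes | Validation |
| --- | --- | --- | --- | --- | --- | --- |
| H1 | Gender (male or female) | ns^a | ns | –^b | ns | R^c |
| H2 | Age | +^d | ns | – | – | PS^e |
| H3a | Agreeableness | – | ns | – | – | PS |
| H3b | Neuroticism | + | + | ns | + | PS |
| H3c | Conscientiousness | ns | ns | ns | ns | R |
| H3d | Introversion | ns | ns | – | – | PS |
| H3e | Openness | + | ns | + | + | PS |
| H4a | Physical health | ns | ns | – | – | PS |
| H4b | Mental health | ns | ns | ns | ns | R |
| H4c | Health perception | + | ns | + | + | R |
| H5 | Traditional literacy | + | ns | + | + | PS |
| H6 | Education | + | ns | + | ns | PS |
| H7a | Economic resources | ns | ns | ns | ns | R |
| H7b | Social resources | ns | ns | ns | ns | R |
| H8a | Attitudinal access | + | + | – | – | PS |
| H8b | Material access | + | + | + | + | S^f |
| H8c | Skills access | + | + | + | + | S |

^a ns: no significant contribution.

^b –: significant negative contribution.

^c R: reject.

^d +: significant positive contribution.

^e PS: partial support.

^f S: support.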

Finally, to address the second goal of the study, we tested what would happen to the contribution of the outcome determinants when the corresponding uses were added to the analyses (Model 2: see Tables 5 and 6 ). Adding the uses significantly increased the explained variance; also, several of the relationships between personal and positional categories and between resources and outcomes became insignificant. The relationships that remained significant for information outcomes were age, health perception, and traditional literacy. Furthermore, attitudinal internet access remained significant. For communication outcomes, the relationships that remained significant were age, openness, and traditional literacy.

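Table 5. Determinants of COVID-19–related internet outcomes after adding the corresponding uses (Model 2).

| Characteristic | Information outcomes: β | P value | Communication outcomes: β | P value |
| --- | --- | --- | --- | --- |
| Gender (male or female) | –.01 | .71 | .04 | .05 |
| Age | –.07 | .003 | –.08 | <.001 |
| Agreeableness | .03 | .25 | –.02 | .38 |
| Neuroticism | .06 | .04 | .06 | .06 |
| Conscientiousness | –.02 | .26 | –.04 | .05 |
| Introversion | .04 | .08 | –.02 | .40 |
| Openness | –.02 | .49 | .08 | <.001 |
| Physical health | .07 | .05 | –.03 | .26 |
| Mental health | .01 | .79 | .02 | .59 |
| Health perception | –.05 | .02 | .03 | .34 |
| Traditional literacy | .05 | .02 | .19 | <.001 |
| Education | –.02 | .33 | –.01 | .67 |
| Economic resources | .02 | .30 | –.02 | .28 |
| Social resources | –.01 | .67 | .00 | .99 |
| Attitudinal access | .21 | <.001 | –.02 | .49 |
| Material access | .01 | .74 | .03 | .21 |
| Skills access | .03 | .17 | .05 | .06 |
| Information uses | .55 | <.001 | N/A^a | N/A |
| Communication uses | N/A | N/A | .45 | <.001 |

^a N/A: not applicable.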

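Table 6. Model statistics for Models 1 and 2.

| Model and measures | Information use | Information outcome | Communication use | Communication outcome |
| --- | --- | --- | --- | --- |
| Model 1 | | | | |
| R² | .09 | .13 | .23 | .21 |
| F | 22.15 | 15.05 | 30.13 | 26.54 |
| Model 2 | | | | |
| R² | N/A | .41 | N/A | .37 |
| R² change | N/A | .28 | N/A | .16 |
| F | N/A | 63.71 | N/A | 54.72 |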

Principal Results

This paper aims to provide a comprehensive examination of digital inequality in the case of an unprecedented health pandemic. The first goal of the study was to reveal how inequality manifests itself in COVID-19–related internet information and communication uses and outcomes. The findings revealed several relationships between the background variables and the two types of internet uses and outcomes.

Older people were found to be less equipped to use the internet for information and communication during a time of crisis. However, they were more likely to engage in information-type COVID-19–related internet uses, possibly because they are at greatest risk from the disease [ 11 ]. This did not result in more beneficial information outcomes. Internet skills play an important role in translating internet uses into beneficial internet outcomes [ 39 ], and prior research has shown that older people have lower internet skill levels in general [ 32 ]. The finding that older people are less likely to perform communication activities or obtain communication-related outcomes is in line with prior studies [ 15 ]; however, these outcomes are important, as older people are more at risk of having severe complications when diagnosed with COVID-19. Regarding gender, contrary to general internet use, men were found to be more likely to engage in communication-type COVID-19–related internet uses during the crisis than women. A possible explanation is that men and women may respond to crisis news in different ways [ 40 ].

The positive effect of neuroticism suggests that a tendency to experience negative emotions such as anger, anxiety, or depression fuels the need to turn to the internet for COVID-19–related information and communication. People who score higher on the neuroticism scale may be more in need of guidelines on how to mitigate risks or may need more support from others to be comforted. Also, the openness trait supports both information and communication internet use and outcomes. A possible explanation is that a major crisis triggers adventure, unconventional ideas, imagination, awareness of feelings, curiosity, or a variety of experiences, all of which are aspects linked to high openness [ 21 ]. The negative contribution of agreeableness raises questions. A possible explanation is that agreeable people are less frequently sought out for communication activities. However, the internet may also be a very inviting environment for less agreeable people. Conscientiousness did not appear to be a significant determinant. People who are more stubborn and focused or more flexible and spontaneous both appear to be involved in information- and communication-type COVID-19–related internet uses and outcomes. Extroversion emerged as a trait that supports using the internet for communication uses and outcomes; this can be expected, as extroversion is marked by pronounced engagement with the external world [ 21 ].

Although we expected that psychological distress would play a role in the current context, as there would be a relatively high need for information and support from others, mental health did not surface as a significant contributor. Furthermore, we did find that physical health problems appear to encourage web-based COVID-19–related communication uses and outcomes. The most likely explanation is that people with underlying health problems are more at risk (and thus more bound to their homes) and thus have higher needs for communication with friends and family. A possible reason for the positive effect of health perception is that people who believe their personal health to be good may feel better equipped to support others during the COVID-19 pandemic.

As expected, traditional literacy played an important role. A lack of general ability to read, write, and understand text further disadvantages individuals in the case of the COVID-19 pandemic, as they have less access to information and communication sources. COVID-19 is a new, unknown, and complicated disease with characteristics that are often described in difficult medical language that is not easy to read. Similar findings were found for educational attainment. Research has long shown that education is one of the most prominent positional variables in digital divide research [ 7 ]. However, our results suggest that when less educated individuals are involved in information and communication internet uses, they are as likely to achieve the corresponding outcomes as people with higher levels of education. This is an important finding for designing interventions for those of lower levels of education.

An effect of economic resources did not emerge in relation to COVID-19–related internet uses and outcomes. The participants’ income did not make a difference in obtaining information and communication COVID-19–related internet outcomes. Earlier research often showed that income is especially important to consumptive and work-related internet uses [ 17 ], topics that are not considered here. Unexpectedly, social resources were not found to be influential. Apparently, a person who has an offline support network will not necessarily turn more to web-based information and communication support during a crisis.

Concerning internet access, we can first conclude that a person’s internet attitude is important for engaging in information uses and gaining information outcomes. Unexpectedly, there was a negative contribution of internet attitude to communication uses and outcomes, suggesting that individuals who have a negative evaluation of the internet in general are more likely to engage in communication uses in the event of a major crisis. Both material and skills internet access played important roles in achieving all uses and outcomes. Using a higher diversity of devices was related to higher COVID-19–related internet use and to more outcomes. The opportunities devices offer are known to be related to inequalities in internet uses and outcomes. As each device offers its own specific characteristics and advantages, a higher diversity of devices supports a larger range of use activities and outcomes [ 31 ]. Furthermore, internet skills play a fundamental role in COVID-19–related uses and in obtaining beneficial outcomes [ 12 ].

In this paper, several indicators surfaced for people’s web-based COVID-19–related uses and outcomes. The variety of important indicators raises the question of whether general policies to address digital inequalities in a time of crisis will be effective. The complex relationships between the different indicators on one hand and internet uses and outcomes on the other hand demand more focused policies, such as those related to health indicators and the need for information to enhance health outcomes. This study reveals that the greater an individual’s existing advantages, the more they benefit from the internet at a time of crisis; the converse is true as well. Marginalized people are likely to have fewer types of access available to take actions, behave as requested, or be comforted by help, creating a vicious cycle where already marginalized groups are further marginalized in a time of crisis.

To end on a positive note, the situation may become slightly less complex when we address the second goal of this paper. When people engage in information and communication internet uses in a crisis situation, their personal characteristics become less important to achieving the corresponding outcomes. This suggests that to achieve information and communication outcomes, policy or research should especially focus on encouraging people to engage in the corresponding internet uses, as we can assume to some extent that engagement with information and communication-related COVID-19 uses is the best way to achieve beneficial outcomes at a time when they are most needed.

Limitations

The current study was conducted in the Netherlands, a country whose citizens have very high household internet penetration and high levels of educational attainment. Although differences in educational background and income are present and were taken into consideration, the observed inequalities may be even stronger in countries with a less homogeneous population. Given that the greatest burden of deaths has been in countries with very diverse populations, race and associated factors are likely to play a major role.

The aim of this study was to provide a broader picture of inequality in relation to how the internet is used in the case of a major global health crisis. A broad range of determinants was considered, and the relative importance of these indicators was revealed. However, a deeper understanding and further investigation to reveal the exact underlying mechanisms that cause these indicators to play a role would provide additional explanations. This suggests that further qualitative research is needed not only to obtain in-depth understanding of the mechanisms but also to understand the consequences of the observed inequalities to complement the findings of the current quantitative approach.

Conclusions

Digital inequality is a major concern among national and international scholars and policy makers. In this paper, we aimed to provide a broader understanding in the case of a major health pandemic by using the ongoing COVID-19 crisis as a context for empirical work. Several groups of people were identified as vulnerable, such as older people and people with lower levels of education, physical health problems, higher levels of neuroticism, low literacy levels, and low levels of trust. The general conclusion is that people who are already relatively advantaged are more likely to use the information and communication opportunities provided by the internet to their benefit in a health pandemic, while more disadvantaged individuals are less likely to benefit. Therefore, the COVID-19 crisis is also an enforcer of existing inequalities.

Conflicts of Interest

None declared.

Abbreviations

COVID-19: coronavirus disease
MOS: Medical Outcomes Study

Edited by G Eysenbach; submitted 11.05.20; peer-reviewed by S LaValley, T Hale; comments to author 13.07.20; revised version received 16.07.20; accepted 03.08.20; published 20.08.20

©Alexander JAM van Deursen. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 20.08.2020.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.

Quantitative Research Methods in Medical Education

Affiliation.

  • 1 From the Division of Hospital Internal Medicine (J.T.R.) Division of General Internal Medicine (A.P.S., T.J.B.), Mayo Clinic College of Medicine and Science, Department of Medicine, Mayo Clinic, Rochester, Minnesota.
  • PMID: 31045900
  • DOI: 10.1097/ALN.0000000000002727

There has been a dramatic growth of scholarly articles in medical education in recent years. Evaluating medical education research requires specific orientation to issues related to format and content. Our goal is to review the quantitative aspects of research in medical education so that clinicians may understand these articles with respect to framing the study, recognizing methodologic issues, and utilizing instruments for evaluating the quality of medical education research. This review can be used both as a tool when appraising medical education research articles and as a primer for clinicians interested in pursuing scholarship in medical education.

Research Methodologies in Health Professions Education Publications: Breadth and Rigor

Han, Heeyoung PhD 1 ; Youm, Julie PhD 2 ; Tucker, Constance PhD 3 ; Teal, Cayla R. PhD, MA 4 ; Rougas, Steven MD, MS 5 ; Park, Yoon Soo PhD 6 ; Mooney, Christopher J. PhD, MPH 7 ; Hanson, Janice L. PhD, EdS 8 ; Berry, Andrea MPA 9

1 H. Han is associate professor and director of postdoctoral programs, Department of Medical Education, Southern Illinois University School of Medicine, Springfield, Illinois; ORCID: https://orcid.org/0000-0002-7286-2473 .

2 J. Youm is associate dean of education compliance and quality, University of California, Irvine School of Medicine, Irvine, California.

3 C. Tucker is associate professor, Vice Provost of Educational Improvement and Innovation, Academic Affairs, Oregon Health & Science University, Portland, Oregon; ORCID: https://orcid.org/0000-0002-6507-8832 .

4 C.R. Teal is associate professor and associate dean of assessment and evaluation, Office of Medical Education, University of Kansas School of Medicine, Kansas City, Kansas; ORCID: https://orcid.org/0000-0002-2138-4926 .

5 S. Rougas is associate professor of emergency medicine and medical science and director of the doctoring program, The Warren Alpert Medical School of Brown University, Providence, Rhode Island; ORCID: https://orcid.org/0000-0003-2225-9657 .

6 Y.S. Park is associate professor, Harvard Medical School, and director of health professions education research, Massachusetts General Hospital, Boston, Massachusetts; ORCID: http://orcid.org/0000-0001-8583-4335 .

7 C.J. Mooney is assistant professor of medicine and director of assessment, University of Rochester School of Medicine and Dentistry, Rochester, New York; ORCID: https://orcid.org/0000-0003-2881-2169 .

8 J.L. Hanson is professor of medicine, Department of Medicine and Office of Education, Washington University, St. Louis, Missouri; ORCID: https://orcid.org/0000-0001-7051-8225 .

9 A. Berry is executive director of faculty life, University of Central Florida College of Medicine, Orlando, Florida.

Supplemental digital content for this article is available at https://links.lww.com/ACADMED/B318 .

Funding/Support: None reported.

Other disclosures: None reported.

Ethical approval: Reported as not applicable.

Previous presentations: The abstract of this paper will be presented at the Association for Medical Education in Europe (AMEE) annual meeting, August 2022, Lyon, France.

Correspondence should be addressed to Heeyoung Han, 913 N Rutledge St., Springfield, IL 62684; telephone: (217) 545-8536; email: [email protected] .

Purpose 

Research methodologies represent assumptions about knowledge and ways of knowing. Diverse research methodologies and methodological standards for rigor are essential in shaping the collective set of knowledge in health professions education (HPE). Given this relationship between methodologies and knowledge, it is important to understand the breadth of research methodologies and their rigor in HPE research publications. However, there are limited studies examining these questions. This study synthesized current trends in methodologies and rigor in HPE papers to inform how evidence is gathered and collectively shapes knowledge in HPE.

Method 

This descriptive quantitative study used stepwise stratified cluster random sampling to analyze 90 papers from 15 HPE journals published in 2018 and 2019. Using a research design codebook, the authors conducted group coding processes for fidelity, response process validity, and rater agreement; an index quantifying methodological rigor was developed and applied for each paper.

Results 

Over half of the papers used quantitative methodologies (51%), followed by qualitative (28%) and mixed methods (20%). No quantitative or mixed methods papers reported an epistemological approach. All qualitative papers that reported an epistemological approach (48%) used social constructivism. Most papers included participants from North America (49%) and Europe (20%). The majority of papers did not specify participant sampling strategies (56%) or a rationale for sample size (80%). Among those reported, most studies (81%) collected data within 1 year.

The average rigor score of the papers was 56% (SD = 17). Rigor scores varied by journal categories and research methodologies. Rigor scores differed between general HPE journals and discipline-specific journals. Qualitative papers had significantly higher rigor scores than quantitative and mixed methods papers.

Conclusions 

This review of methodological breadth and rigor in HPE papers raises awareness in addressing methodological gaps and calls for future research on how the authors shape the nature of knowledge in HPE.

Research is a scientific social process that systematically synthesizes evidence to create knowledge. Foundational to research practices are assumptions underlying the nature of reality (ontology) that lead to assumptions about the nature of knowing (epistemology), which in turn, influence decisions about the nature of inquiry (methodologies). 1–3 Given the sequential alignment among ontology, epistemology, and methodologies of research, it is imperative that researchers be explicit regarding their assumptions. Yet researcher assumptions are often not explicitly stated in health professions education (HPE) research. 4 Indeed, many studies take ontological and epistemological assumptions for granted and excessively focus on methodology. 3 Further, empirical research in HPE is often limited to a specific subset of methodologies.

The habitual use of research methodologies leads to questions regarding whether our research methodologies shape the nature of knowledge in HPE rather than the reverse, and if so, how research methodologies have shaped evidence and body of knowledge in this field. Thomas and colleagues 5 argued that research methodologies and researchers’ epistemologies shape the nature of knowledge, as methodologies determine what kind of knowledge is possible, legitimate, and trustworthy. Similarly, Balmer and colleagues reported that the nature of knowledge described within longitudinal qualitative papers depends on researchers’ assumptions around the nature of time as fluid or static. 6 In addition, in an HPE context, the prevalence of a limited subset of research methodologies and our inclination to focus on specific methodologies potentially narrow the scope of our knowledge. Biesta and van Braak 7 criticized the limited medical education research methodologies, noting that currently used methodologies fail to recognize dynamic and complex educational practices. They encourage researchers to expand their methodologies to capture what happens in learning interactions. While there are expectations that the research question should guide the choice of a study methodology, 8 , 9 the use of specific study methodologies can also shape research questions. In this sense, perhaps there is a more complex relationship between methodology and knowledge. Therefore, it becomes relevant to consider how research methodologies have shaped knowledge in HPE.

Reviewing the history of HPE provides evidence on how researchers’ methodologies have shaped knowledge and practice in HPE. Informed by the milieu of medical education in the mid-1900s, Ham 10 described the trends and evolution of medical education research and curricula 60 years ago in the United States. The medical education research approach at that time heavily relied on a quantitative methodology that was grounded in empiricism and the scientific method of investigation. In the 1980s, qualitative research paradigms that originated in other disciplines such as anthropology and sociology were introduced into medical education. 2 In their commentary in the mid-1990s, Colliver and Verhulst 11 argued against a qualitative research methodology, claiming it provided weak subjective evidence. They concluded by advocating positivist research methodologies and stated, “Research should be driven by research questions, not research methods, and any attempt to legislate the use of a particular method or combination of methods is a threat to the creativity and viability of scientific research.” 11 (p211) Within this rigid atmosphere that exclusively promoted quantitative methodologies, HPE research favored positivism-oriented methodologies, even when using qualitative methods, and this limited focus continues to affect opinions about research quality and rigor in HPE research. 12 , 13

The field’s preference for quantitative methodologies has been changing over the last 2 decades as qualitative research became better appreciated as another legitimate research methodology. 2 , 14–16 Increasing uptake and support for methodological diversities has changed the landscape of HPE literature with greater acceptance of subjective and interpretive constructivism paradigms. Since HPE research topics are broad, 17 the field needs such openness to diverse ontological and epistemological assumptions and methodologies. 18 Educational research is a dynamic process where researchers merge existing methodologies (between and within different types of methods) and reconceptualize ways of asking and answering questions. 19 Researchers must remain nimble, as they may need to apply different paradigms depending on the nature of knowledge about the phenomenon under study. 20–22

As HPE research methodologies evolved from positivism to include social constructivism and critical inquiry, and as methodological pluralism emerged, the education research community increasingly questioned the notion of research methodology rigor. 18 Medical educators recognized that diverse methodologies including varying epistemological assumptions and their relevant standards of rigor were needed to advance the field. 12–15 , 18 , 20 , 21 In the foreword to the 2020 Research in Medical Education supplement of Academic Medicine, Park and colleagues 23 noted the importance of scientific rigor in the diverse research approaches in HPE research that comprises methodological processes including epistemological assumptions, research designs, and research methods (data collection, analysis, and interpretation). Although they noted the variety of standards of rigor based on the elements of research methodologies, they argued that standards of rigor should be implemented and reported in research papers.

Diverse research methodologies and the implementation of methodological standards for rigor are essential in shaping the collective sets of knowledge we create in the field of HPE. Given the importance of a comprehensive repertoire of epistemological and methodological approaches in creating collective sets of knowledge in HPE, it is critical to understand the breadth of research methodologies and their rigor in HPE research publications. To date, however, there are limited studies examining these questions. By answering the following 2 research questions, we aim to describe current trends in research methodologies and level of rigor in the field of HPE: (1) What is the breadth of research designs reported by current HPE publications? (2) What is the level of methodological rigor described by current HPE publications?

We adopted a descriptive quantitative study using a positivist epistemology to understand the observable breadth of research designs and corresponding methodological rigor in a sample of HPE empirical research papers.

Sampling/data collection

We conducted stepwise stratified cluster random sampling of articles, where we first randomly selected journals, then issues, and then research articles. As a starting point, we used the Association of American Medical Colleges (AAMC) Group on Educational Affairs (GEA) Medical Education Scholarship Research and Evaluation (MESRE) Annotated Bibliography of Journals of Educational Scholarship in 2019. We (1) selected journals focusing on HPE (n = 44); (2) categorized the selected journals into 3 categories: general medical education (e.g., Academic Medicine), medical education in specific domains (e.g., Journal of Cancer Education), and other HPE (e.g., Journal of Nursing Education); (3) randomly selected 5 journals from each category (n = 15, 34%) and chose publications from a 2-year period between January 2018 and December 2019; and then (4) randomly selected an issue in each year and 3 research articles in each selected issue. We did not include publications in 2020 and 2021 as there were unusual patterns in submissions and publications due to the COVID-19 pandemic. This resulted in 90 articles for analysis. We decided on a sample size of 30 articles in each group (general medical education, domain-specific medical education, and other HPE journals) to enable the comparison of group differences, assuming a sufficient number in each research approach based on a sample size calculation in IBM SPSS Statistics (version 27) with an estimated power of 0.80. We included empirical studies and original research from education-focused journals. We excluded journals publishing general medical research without a specific focus on education due to feasibility concerns, as we would have had to review all papers for sampling to determine whether each article was eligible for our study. We also excluded papers that were innovation studies, case reports, conceptual and literature reviews, meta-analyses, and perspectives. The data collection was conducted in March 2021.
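The staged selection described above (journals within each category, then one issue per year, then articles within the issue) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the catalog is synthetic, the journal names, issue counts, and article identifiers are hypothetical, and every article is treated as eligible, whereas the study screened out non-empirical papers.

```python
import random

CATEGORIES = [
    "general medical education",
    "domain-specific medical education",
    "other HPE",
]
YEARS = (2018, 2019)


def build_toy_catalog(journals_per_category=10, issues_per_year=4, articles_per_issue=8):
    """Build a synthetic catalog: category -> journal -> year -> issue -> articles.

    All sizes and names are hypothetical placeholders.
    """
    catalog = {}
    for c, category in enumerate(CATEGORIES):
        catalog[category] = {}
        for j in range(journals_per_category):
            journal = f"journal_{c}_{j}"
            catalog[category][journal] = {
                year: {
                    issue: [
                        f"{journal}/{year}/{issue}/article_{a}"
                        for a in range(articles_per_issue)
                    ]
                    for issue in range(1, issues_per_year + 1)
                }
                for year in YEARS
            }
    return catalog


def draw_sample(catalog, rng, n_journals=5, n_articles=3):
    """Draw journals per category, one issue per year, then articles per issue."""
    sample = []
    for category, journals in catalog.items():
        for journal in rng.sample(sorted(journals), n_journals):
            for year, issues in journals[journal].items():
                issue = rng.choice(sorted(issues))
                for article in rng.sample(issues[issue], n_articles):
                    sample.append((category, article))
    return sample


sample = draw_sample(build_toy_catalog(), random.Random(2019))
print(len(sample))  # 3 categories x 5 journals x 2 years x 3 articles = 90
```

With 3 categories, 5 journals per category, 2 years, and 3 articles per issue, the sketch yields 30 articles per category and 90 in total, matching the sample size reported above.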

Coding structure

We used audit methodology to analyze the sample articles. 24 An audit procedure has been introduced in social science research to investigate the quality of studies. 25 , 26 We applied this methodology to determine whether the articles that we sampled met specific standards of methodological rigor. For the audit process, we developed a coding structure based on the literature around research design and rigor. 12 , 13 , 16 , 20 , 21 , 23 , 27–39 The coding structure included subcategories: methodological philosophy, research design, and research methods, which we elaborated into epistemology, 40 population, sampling, data sources, data collection, and data analysis. These subcategories included specific coding questions based on 3 different research approaches: qualitative, quantitative, and mixed methods. We also included general questions such as population, sampling and recruitment methods, rationale for sample size, and data collection duration.

Questions addressing standards of rigor were added for each research methodology. For qualitative papers, these questions related to the reporting of specific qualitative study approaches, sampling strategy, source of data, data analysis process, reflexivity, and trustworthiness. 14 , 20 , 21 , 27 , 33–36 For quantitative research methodologies, the literature suggested that we tailor our rigor questions to the specific research design. 14 , 20 , 21 , 28–32 For example, for papers using a survey measurement for either relational or descriptive studies, we added questions regarding common method bias that could introduce validity threat in the measurement, 28 which could be addressed by statistically controlling or using different measures rather than relying on a single survey measurement. For studies using a causal inference design, we included a question regarding the use of an active control group rather than an inactive control group. 32 , 41 For validation or measurement development studies, we included a question inquiring about a pilot study before the main study. Last, for mixed methods studies, our rigor questions asked about a rationale for this specific design, discussing the data mixing/integration approach, and sharing insights from mixing the methods. 37 , 38

With a set of preliminary coding questions, we formed 3 coding groups based on study team members’ expertise: qualitative, quantitative, and mixed methods. Each group reviewed the coding questions to improve the fidelity of the coding tool and response process validity. The group coding development process included reviewing the questions, keeping notes about the questions and responses, providing corrections and clarifications on the questions, individual coding, reviewing group agreement and disagreement, reconciling discrepancies, and recommending changes in the questions. The coding questions development process was iterative: we implemented changes in the questionnaire, coded additional articles using the updated questions, and discussed further changes until no additional changes were needed. We completed 3 cycles of this codebook improvement process from March through July 2021.

Table 1 includes the overall question structure to measure rigor reported in the papers and the breadth of research methodologies in each category. Actual questions are attached in Supplemental Digital Appendix 1 at https://links.lww.com/ACADMED/B318 . We had 10 questions exclusively for the descriptions of the breadth of research design (annotated as footnote “a” in Table 1 ). There were 33 questions to measure rigor for mixed method papers, which included 8 general questions, 3 mixed method specific questions, 10 qualitative research questions, and 12 quantitative research questions. Qualitative papers had 18 questions including 8 general rigor questions. Quantitative papers had 20 rigor questions. We implemented the coding questions using SurveyMonkey.

Coding process

After developing the aforementioned coding structure, we completed individual coding of all 90 articles by working group: quantitative, qualitative, and mixed methods. Each group had 3 members to enable 3 coding pairs for each round of group coding. When the pair disagreed on codes, the members discussed and negotiated consensus codes by providing rationale and evidence. By doing this, the groups were able to develop consensus on all codes. Each group went through 4–6 group coding processes until 9 pairs achieved > 80% agreement, 42 which resulted in completing 38 articles through the group coding processes. Once we met > 80% rater agreement, we conducted individual coding to complete the remaining 52 articles from July to September 2021.
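The > 80% agreement criterion can be illustrated with a simple percent-agreement check for one coding pair. The text does not say which agreement statistic was computed, so plain percent agreement is an assumption here, and the function names and example codes are hypothetical.

```python
def percent_agreement(codes_a, codes_b):
    """Return the share of items (in %) on which two raters assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both raters must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


def meets_threshold(codes_a, codes_b, threshold=80.0):
    """Check the strict '> 80%' criterion (threshold value taken from the text)."""
    return percent_agreement(codes_a, codes_b) > threshold


# Hypothetical codes from two raters for ten coding questions on one article.
rater_1 = ["yes", "yes", "no", "yes", "n/a", "yes", "no", "yes", "yes", "no"]
rater_2 = ["yes", "yes", "no", "yes", "n/a", "yes", "no", "yes", "yes", "yes"]
print(percent_agreement(rater_1, rater_2), meets_threshold(rater_1, rater_2))
```

Because the criterion is strict, a pair agreeing on exactly 80% of codes would not yet meet it under this reading.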

Data analysis

The coding questions were designed to provide answers to the stated research questions focused on the breadth of methodologies and the level of methodological rigor in HPE publication. For the breadth of methodologies, we used descriptive statistics, relying on frequencies and response counts. For example, one question was, “What qualitative research design did the authors use?” It offered 6 qualitative research design options (narrative, grounded theory, phenomenology, case study, ethnography, and conversation analysis) plus “other” and “not explicitly specified.” We calculated the frequencies of each option to understand the breadth of research designs.
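As a minimal sketch of this frequency tabulation, assuming each paper receives exactly one design code, the counts and percentages could be computed as follows. The coded responses are hypothetical, and the variable names are illustrative only.

```python
from collections import Counter

# Answer options for the qualitative research design question.
DESIGN_OPTIONS = [
    "narrative", "grounded theory", "phenomenology", "case study",
    "ethnography", "conversation analysis", "other",
    "not explicitly specified",
]

# Hypothetical codes for 10 papers (not study data).
coded_designs = (
    ["grounded theory"] * 3
    + ["phenomenology"] * 2
    + ["ethnography"]
    + ["not explicitly specified"] * 4
)

counts = Counter(coded_designs)
n_papers = len(coded_designs)

# Map each option to (count, rounded percentage), including unused options.
frequency_table = {
    option: (counts[option], round(100 * counts[option] / n_papers))
    for option in DESIGN_OPTIONS
}
```

Listing every option, including those with a count of zero, keeps absent designs visible in the resulting table.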

For methodological rigor, we created an index (the rigor score described below) from the coding responses. To measure the level of rigor, we assigned a score of 1 to each question unless it was marked “No” or “Not explicitly specified.” Questions (annotated as footnote “a” in Table 1 ) that did not offer “No” or “Not explicitly specified” as answer options were not counted toward the level of rigor but were used to describe the breadth of research design. In addition, as each paper had a different research design that required different methodological rigor standards, there were “Not Applicable (N/A)” responses, which we removed in calculating a rigor score. Therefore, we calculated the level of rigor of each paper using the formula below:

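$$\text{Rigor score (\%)} = \frac{\text{Gained score}^{\S}}{\text{Sum total of items measuring rigor}^{\dagger\dagger}} \times 100$$

††Sum total of items measuring rigor = the number of rigor questions in each method, excluding the number of N/A questions

§Gained score = the paper-specific sum of possible rigor score, excluding “No” or “Not Explicitly Stated” responses.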
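The two definitions above can be turned into a small worked example. This is a sketch under stated assumptions: the rigor score is taken to be the gained score divided by the number of applicable rigor questions, expressed as a percentage, and the response labels and example paper are hypothetical.

```python
# Responses that earn no rigor point (labels assumed from the text).
NOT_MET = {"No", "Not explicitly specified"}
NOT_APPLICABLE = "N/A"


def rigor_score(responses):
    """Percentage rigor score for one paper.

    N/A responses are dropped from the denominator; every remaining
    response other than "No" / "Not explicitly specified" earns 1 point.
    """
    applicable = [r for r in responses if r != NOT_APPLICABLE]
    if not applicable:
        raise ValueError("no applicable rigor questions")
    gained = sum(1 for r in applicable if r not in NOT_MET)
    return 100.0 * gained / len(applicable)


# Hypothetical paper with 18 rigor questions, 2 of them not applicable.
example_paper = (
    ["Yes"] * 9
    + ["No"] * 3
    + ["Not explicitly specified"] * 4
    + ["N/A"] * 2
)
score = rigor_score(example_paper)  # 9 points out of 16 applicable questions
```

Dividing by the applicable questions only is what makes scores comparable across papers whose designs call for different rigor standards.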

To minimize bias in rigor scoring, we followed the coding structure based on existing literature and group consensus. The rigor score reinforced the study’s positivist epistemology and philosophical framework, which assumes one objective reality. To view aspects of methodology as one objective reality, the qualitative research coding group developed a concrete, agreed-upon definition of each item on the checklist, which served as our translation of the one objective reality. The possible rigor scores ranged from 15 points to 19 points for qualitative papers, from 8 to 19 for quantitative papers, and from 18 to 28 for mixed methods papers. Because the possible score range differed by paper, we used percentage scores and conducted ANOVA to test for group differences. We did not pursue IRB approval.

Researchers

The research team is composed of 9 current and past elected members of the MESRE section of the GEA of the AAMC. Based on our expertise in research across paradigms, our MESRE group work included promoting and improving medical education research through facilitating regional conference abstract review processes, grant proposal reviews, and national workshops on scholarship. Many team members are editorial members of journals with substantial manuscript review experience. Ethical approval was reported as not applicable.

Research question 1: What is the breadth of research design reported by current HPE publications?

Data analysis demonstrated that most research methodologies reported in the papers were quantitative (n = 46, 51%), followed by qualitative (n = 25, 28%), and mixed methods (n = 18, 20%) ( Figure 1 ). Only one paper, a Delphi study, was categorized as “other” as it did not fit well into the defined rigor standards. None of the quantitative and mixed methods studies reported an epistemological approach. About half of the qualitative papers (n = 13, 52%) did not report one either. Those that did report an epistemological approach (n = 12, 48%) used social constructivism including postmodernism, phenomenology, and interpretivist epistemology. No papers explicitly used other epistemological approaches, such as critical theory.

Most papers included study participant populations from North America (n = 44, 49%) and Europe (n = 18, 20%). Smaller numbers of studies were conducted in other locations ( Figure 2 ). Study participants included medical students (n = 23, 26%) and students in other HPE programs, such as nursing, dental, or veterinary medicine programs (n = 35, 39%), faculty including community physicians involved in teaching (n = 18, 20%), residents/fellows (n = 8, 9%), patients (n = 7, 8%), and staff (n = 3, 3%). Other types of data sources (n = 20, 22%) included electronic medical records, archives, websites, and community members, including policymakers and patients’ families.

More than half of the papers did not specify their participant sampling strategies (n = 50, 56%) or a rationale for the sample size (n = 72, 80%). Most studies (n = 58, 64%) occurred at a single institution or site, while 23 papers (26%) recruited participants from multiple sites. Fifty-four (60%) studies collected data within 1 year, while 7 (8%) studies collected data within 1–3 years and 6 (7%) had a data collection timeline of more than 3 years. Almost all papers (n = 82, 91%) analyzed in this study discussed the limitations of their research methodologies.

Qualitative research papers.

In the sample selected for this study, over half of the qualitative research papers specified research design approaches (n = 14, 56%). These approaches included grounded theory (n = 6, 24%) or phenomenology (n = 4, 16%) more frequently than ethnography (n = 1, 4%), other (action research, n = 1), narrative (n = 0), case study (n = 0), or conversation analysis (n = 0).

A majority of the qualitative papers explicitly used purposeful (n = 18, 72%) and/or convenience sampling (n = 9, 40%) methods to recruit participants. The dominant sources of data were interviews (n = 14, 56%) and focus groups (n = 7, 28%). While not as prominent, some papers used documents or websites (n = 3, 12%), participant and nonparticipant observations (n = 2, 8%), or surveys (n = 1, 4%).

Most of the qualitative papers analyzed data using an inductive approach (n = 21, 84%) and/or thematic analysis (n = 14, 56%). In addition, most papers explicitly described using techniques to improve trustworthiness including triangulation (n = 14, 56%), additional reviewers to confirm findings (n = 6, 24%), member checking (n = 5, 20%), prolonged observation (n = 3, 12%), or audits (n = 2, 8%). Three papers (12%) did not specify methods to improve trustworthiness.

Quantitative research papers.

Half of the quantitative research papers used a relational study design (n = 23, 50%). Other designs were causal inference (n = 17, 37%), descriptive/observational (n = 9, 20%), and validation (n = 4, 9%). Among papers with relational and/or descriptive cross-sectional designs, most did not address common method bias (n = 25, 89%). Among papers with a causal inference design, 10 (59%) used an active control group and 9 (53%) used a pretest. Only one paper (25%) among the validation studies conducted a pilot study.

Data collection occurred at one time point cross-sectionally (n = 24, 52%), longitudinally with the same group (n = 17, 37%), or longitudinally with different cohorts (n = 7, 15%). Most data were collected (n = 38, 83%) prospectively rather than retrospectively (n = 9, 20%). Data sources included self-reports/perceptions (n = 32, 70%) or knowledge assessment (n = 15, 33%). There were few papers with observed behaviors/performance (n = 3, 7%).

Some papers did not report a rationale for statistical tests (n = 13, 28%) or the quality of statistical analysis (n = 19, 41%), such as effect size or confidence interval. Only some papers using relational design incorporated analysis of control/confounding variables (n = 14, 30%) and moderating/mediating variables (n = 6, 29%) when expected. The papers used parametric (n = 35, 76%) and nonparametric (n = 19, 41%) data analysis techniques. Only a few papers provided discussions of missing data (n = 10, 22%), measurement validity (n = 9, 20%), and reliability (n = 14, 30%) evidence when expected.

Mixed methods papers.

Of the 18 mixed methods papers, a majority did not report a rationale for their use of a mixed methods approach (n = 16, 89%). Most of the mixed method papers used quantitative methodologies (n = 10, 56%) as their dominant inquiry approach, while only one paper (6%) had a dominant qualitative inquiry approach. There were 7 papers (39%) that used balanced quantitative and qualitative methods. Mixed methods papers mostly used triangulation design (n = 14, 78%) that complemented each data set on the same topic. 38 However, most triangulation design papers used a survey that included quantitative measures and open-ended questions rather than conducting independent qualitative and quantitative data collection methods. There were only 2 papers that used explanation design, 1 paper that used embedded design, and 1 that used exploration design. 38 Nearly all mixed methods papers did not specify a qualitative research design (n = 17, 94%). For the quantitative data component, most of the papers used descriptive/observational design (n = 14, 78%). Other quantitative components included causal inference (n = 4, 22%), relational designs (n = 1, 6%), and consensus research using a modified Delphi technique (n = 1, 6%). Only 5 studies (28%) explicitly discussed how the quantitative and qualitative approaches were linked and merged.

Research Question 2: What is the level of methodological rigor described by current HPE publications?

Rigor scores were analyzed for descriptive statistics and group differences by journal categories and research methodologies. The Kolmogorov–Smirnov and Shapiro–Wilk normality tests did not indicate a departure from a normal distribution, as the P values were greater than .05.

The average rigor score of the papers analyzed in the sample was 56% (SD = 17), ranging from 19% to 94% ( Table 2 ). The rigor scores varied by journal categories and research methodologies. ANOVA showed that the group differences were statistically significant by journal categories (F [2, 87] = 5.82, P = .004, η² = .12) and research methodologies (F [2, 86] = 32.68, P < .001, η² = .43). Tukey HSD revealed group differences between general medical education journals (M = 62.92, SD = 18.94) and discipline-specific medical education journals (M = 48.78, SD = 14.04) ( Table 3 ). For research methodologies, the post hoc analysis showed that qualitative papers had statistically significantly higher rigor scores than quantitative and mixed methods papers. There was no statistically significant difference between quantitative and mixed methods papers.
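As a consistency check on the reported effect sizes, eta squared for a one-way ANOVA can be recovered from the F statistic and its degrees of freedom, because η² = SS_between / (SS_between + SS_within) = (F × df_between) / (F × df_between + df_within). The sketch below applies that identity to the F values reported above; the function name is illustrative and is not taken from the study.

```python
def eta_squared(f_value, df_between, df_within):
    """Eta squared for a one-way ANOVA, derived from F and its df."""
    if df_between <= 0 or df_within <= 0:
        raise ValueError("degrees of freedom must be positive")
    explained = f_value * df_between
    return explained / (explained + df_within)


# F statistics reported in the text for the two grouping variables.
journal_eta = eta_squared(5.82, 2, 87)   # journal categories
method_eta = eta_squared(32.68, 2, 86)   # research methodologies
```

Rounded to 2 decimals, both values agree with the effect sizes given in the text (.12 and .43).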

Although HPE has made progress in embracing diverse research paradigms, 3 , 4 , 43 this study found that there is still limited variation in the field. HPE research continues to show only modest diversity in epistemological approaches, populations, sampling, and nature of data (e.g., short-term cross-sectional, perceptions based via survey or interviews), as well as limited rigor in the research methodologies reported, which may contribute to a limited body of knowledge in the field. Quantitative methodologies are more prevalent in HPE papers, including serving as the dominant approach to inquiry in mixed methods designs. Within qualitative methodologies, the most commonly reported are grounded theory and phenomenology. This finding is not surprising, as these 2 methods have extensive literature regarding the processes of their use. 33

The absence of reported epistemological frameworks among all the quantitative and mixed methods manuscripts is noteworthy; perhaps researchers inherently assume that quantitative approaches imply a positivist framework, but that is not a conclusion we can draw. Approximately half of the sampled qualitative papers reported a social constructivism framework without specific details of epistemological assumptions. It is striking that there were no subjectivist epistemologies and philosophical frameworks highlighting critical inquiry, including feminism, or postmodernism among any papers in the sample. Critical theory and related ideological views have a dialectic methodological nature. 1 These epistemological approaches focus on transforming historically shaped misapprehension into consciousness through developing and resolving contradictions. 1 Postmodernism is opposed to any epistemological stances based on grand narratives and pursues continuous reevaluation of practices and theories through deconstruction processes. 44 Diverse epistemological approaches, including critical theory and postmodernism, can provide different and unique understandings of HPE practices. However, the findings showed limited epistemological variation in our sample, even with calls to make room for other ways of knowing in HPE literature. 45 , 46

Reflecting on the absences

According to the findings, the HPE papers reviewed in this study were predominantly shaped by the perceptions of students from Western countries, gathered with a survey at one point in time, within 1 year, at a single institution. What does it mean that we rarely saw studies done in Africa or Asia, the use of ethnography, or longitudinal data? Paton and colleagues 47 recently published an insightful paper that investigated the absences in HPE research.

Despite our field’s compelling need to cite evidence, serious voids in the literature remain: areas in HPE where the literature fails to support educational practices. These voids are no mere “gap”; they are absences. They are important in their effects on how we construct the field of health professions education research: what we include or exclude, what we count or not, what we believe to be true or false, what we do or do not read, who speaks and who is silenced. 47 (p6)

If HPE research should be varied with respect to epistemological and methodological approaches, these study findings suggest that much work remains to be done. Further, reflection is needed on the meaning of the absence of stated epistemologies and key methodological details such as participant population, sampling strategies, or rationale for design and analytic choices. Research methodology has symbolic power, as it determines the legitimacy of who is included in studies and how, and implicitly gives value to ways of knowing. The absences in HPE papers loudly voice the missed opportunities to broaden exploration and innovation in HPE. We call for future studies investigating and addressing the absences to advance how we construct the collective knowledge in HPE.

Lack of reported rigor

Despite the prevalence of quantitative methodologies, there is significant room for improvement regarding methodological rigor in quantitative papers. Discipline-specific medical education journals predominantly published quantitative and mixed methods papers with lower rigor scores compared with other journals. Surprisingly, almost half of the papers with an experimental design were published without reporting the use of an active control group and/or pretest. Practices for rigor, such as controlling confounding variables, analyzing moderating or mediating variables, or providing measurement validity and reliability when expected, were rarely and variably reported in the papers. Given the historical dominance of the quantitative research paradigm and its emphasis on rigor standards, this is counterintuitive yet consistent with prior studies in HPE and other fields. 28 , 32 , 41 It may be that for qualitative research papers, which use a relatively newer group of methodologies, reviewers more often required specific terminology related to methodological rigor, whereas rigor was assumed for papers based on a quantitative methodology. 47 The application of standards for methodological rigor should be emphasized and reinforced, even for those with arguably better understood quantitative methodological approaches.

Relationships between epistemology, research design, and methodologies

Scientific research involves an interplay among research questions, research approaches, and research answers. 23 As a research approach is also based on the dynamic relationship between epistemology, research design, and methodologies, we should not focus on one domain but rather on the interaction between them in building a collective set of knowledge in HPE. Prior literature 48 describes how common approaches to mixed methods studies focus too heavily on the methods and data, rather than the relationship between the research question and the logic of inquiry. This is consistent with our finding that the mixed methods papers reviewed in our study lacked a clear rationale. This reinforces the call for mixed methods researchers to provide a clear description and justification for design choices emphasizing the integration of data and findings from both components. 49 Additionally, it was concerning that the majority of mixed methods papers were described as using triangulation design, often based on data from a quantitative survey with several open-ended qualitative questions. This potentially oversimplifies the nuance that triangulation requires, such as distinctions between within-methods and between-methods triangulation. 50 This raises the question of whether mixed methods is a research approach selected for convenience, rather than a deliberate paradigm unto itself.

Guidelines for methodological rigor

While the guidelines for rigor do exist and should be followed, reporting of rigor standards in publications is varied in HPE papers. This may be caused by different levels of attention to the reporting standards and understanding of rigor in HPE literature. While there has been an explicit effort to communicate reporting standards for qualitative 27 , 33 and quantitative research, 14 , 20 , 21 there are relatively few studies that establish specific guidelines for the rigor of mixed methods studies in HPE. 51 Of those noted in the literature, 13 many rely on quantitative factors (such as statistical power) or oversimplify mixing strategies 52 without clear reference to the philosophical reasoning for the choices of which methods are used. 48 Since no clear guidelines for rigor in mixed methods exist among HPE researchers, it is not surprising that mixed methods papers had the lowest rigor rating among the 3 research approaches. As Creswell and colleagues note, 38 mixed methods research is a research paradigm unto itself, not simply a mixing or joining of 2 different paradigms. The components of rigor, therefore, must also be unique to mixed methods papers, not simply a mixing or joining of quantitative and qualitative rigor. The logic and need for combining is one consideration, but not the only consideration. There should be further considerations about the way in which mixed methods methodological rigor can be determined beyond looking at the qualitative and quantitative components as simply additive.

One quagmire in this exploration is how to appropriately situate Delphi studies in our analysis. Depending on the study, 53 Delphi studies may be seen as mixed methods or qualitative, 54 with little consensus on the definition and characteristics that discern this research approach. As such, there are few studies 55 examining criteria for rigor. 56 Given the wide array of methodological variants on the Delphi technique, it is difficult to situate this research approach in our current work. Future studies are needed to examine the breadth of research design and level of rigor specific to the Delphi technique.

Limitations

The current study has several limitations. First, the current study focused on the rigor of research methodologies explicitly stated in the papers, not the rigor of the entire studies reported in the research papers. For example, it was not within the scope of the study to evaluate whether the selected methodology (e.g., experimental design) was adequate to answer the study’s research questions. Instead, we focused on how the chosen methodology was appropriately implemented and reported as guided in the literature (e.g., use of active control group, pretest, measurement validity and reliability, and control variables). Second, it is likely that the rigor score questions were not exhaustive to include all rigor standards in each design. We coded based on whether the authors explicitly reported that they used a particular method or not, but often disagreed on whether the remainder of the article described using the method in an appropriate and rigorous way (e.g., whether the authors stated a specific sampling design but later contradicted that design with additional language in the Method section). A future study is needed to investigate research rigor in a more holistic way, especially by adopting a qualitative or mixed methods research approach. In particular, an approach is needed that integrates the role of interpretation in the assessment of methodological rigor. There is also an opportunity to study trends on the breadth and rigor of research methodologies in HPE literature by adopting a longitudinal research design.

Conclusions

Research methodologies in the published papers in HPE journals demonstrated limited variation in epistemological approaches and research designs. The methodological rigor reported in the published papers was limited, which calls for improvement, especially in reporting quantitative and mixed methods research. The problems reported in this paper call for awareness and reflection on the part of HPE journals, editors, and reviewers as well as individual researchers, because published research papers are the artifacts of co-constructed knowledge production processes. If specific epistemologies and methodologies get rejected at higher rates among HPE journals, and if reviewers are not equipped with guidelines for the rigor of each methodology, researchers will follow the limited institutional expectations, which would create a vicious cycle. The limited methodological breadth and rigor in HPE papers observed in the current study are significant and call for future research and reflection on how our habitual inquiry languages, that is, research methodologies, shape the nature of knowledge in HPE.

J Korean Med Sci, v.38(37); 2023 Sep 18, PMC10506897

Conducting and Writing Quantitative and Qualitative Research

Edward Barroga

1 Department of Medical Education, Showa University School of Medicine, Tokyo, Japan.

Glafera Janet Matanguihan

2 Department of Biological Sciences, Messiah University, Mechanicsburg, PA, USA.

Atsuko Furuta

Makiko Arima, Shizuma Tsuchiya, Chikako Kawahara, Yusuke Takamiya

Comprehensive knowledge of quantitative and qualitative research systematizes scholarly research and enhances the quality of research output. Scientific researchers must be familiar with both research types and skilled in conducting their investigations within the frames of their chosen research type. When conducting quantitative research, scientific researchers should describe an existing theory, generate a hypothesis from the theory, test their hypothesis in novel research, and re-evaluate the theory. Thereafter, they should take a deductive approach in writing the testing of the established theory based on experiments. When conducting qualitative research, scientific researchers raise a question, answer the question by performing a novel study, and propose a new theory to clarify and interpret the obtained results. Thereafter, they should take an inductive approach to writing the formulation of concepts based on collected data. When scientific researchers combine the whole spectrum of inductive and deductive research approaches using both quantitative and qualitative research methodologies, they apply mixed-method research. Familiarity and proficiency with these research aspects facilitate the construction of novel hypotheses, development of theories, or refinement of concepts.


INTRODUCTION

Novel research studies are conceptualized by scientific researchers first by asking excellent research questions and developing hypotheses, then answering these questions by testing their hypotheses in ethical research. 1 , 2 , 3 Before they conduct novel research studies, scientific researchers must possess considerable knowledge of both quantitative and qualitative research. 2

In quantitative research, researchers describe existing theories, generate and test a hypothesis in novel research, and re-evaluate existing theories deductively based on their experimental results. 1 , 4 , 5 In qualitative research, scientific researchers raise and answer research questions by performing a novel study, then propose new theories by clarifying their results inductively. 1 , 6

RATIONALE OF THIS ARTICLE

When researchers have a limited knowledge of both research types and how to conduct them, this can result in substandard investigation. Researchers must be familiar with both types of research and skilled to conduct their investigations within the frames of their chosen type of research. Thus, meticulous care is needed when planning quantitative and qualitative research studies to avoid unethical research and poor outcomes.

Understanding the methodological and writing assumptions 7 , 8 underpinning quantitative and qualitative research, especially by non-Anglophone researchers, is essential for their successful conduct. Scientific researchers, especially in the academe, face pressure to publish in international journals 9 where English is the language of scientific communication. 10 , 11 In particular, non-Anglophone researchers face challenges related to linguistic, stylistic, and discourse differences. 11 , 12 Knowing the assumptions of the different types of research will help clarify research questions and methodologies, easing these challenges.

SEARCH FOR RELEVANT ARTICLES

To identify articles relevant to this topic, we adhered to the search strategy recommended by Gasparyan et al. 7 We searched through PubMed, Scopus, Directory of Open Access Journals, and Google Scholar databases using the following keywords: quantitative research, qualitative research, mixed-method research, deductive reasoning, inductive reasoning, study design, descriptive research, correlational research, experimental research, causal-comparative research, quasi-experimental research, historical research, ethnographic research, meta-analysis, narrative research, grounded theory, phenomenology, case study, and field research.

AIMS OF THIS ARTICLE

This article aims to provide a comparative appraisal of qualitative and quantitative research for scientific researchers. At present, there is still a need to define the scope of qualitative research, especially its essential elements. 13 Consensus on the critical appraisal tools to assess the methodological quality of qualitative research remains lacking. 14 Framing and testing research questions can be challenging in qualitative research. 2 In the healthcare system, it is essential that research questions address increasingly complex situations. Therefore, research has to be driven by the kinds of questions asked and the corresponding methodologies to answer these questions. 15 The mixed-method approach also needs to be clarified as this would appear to arise from different philosophical underpinnings. 16

This article also aims to discuss how particular types of research should be conducted and how they should be written in adherence to international standards. In the US, Europe, and other countries, responsible research and innovation was conceptualized and promoted with six key action points: engagement, gender equality, science education, open access, ethics and governance. 17 , 18 International ethics standards in research 19 as well as academic integrity during doctoral trainings are now integral to the research process. 20

POTENTIAL BENEFITS FROM THIS ARTICLE

This article would be beneficial for researchers in further enhancing their understanding of the theoretical, methodological, and writing aspects of qualitative and quantitative research, and their combination.

Moreover, this article reviews the basic features of both research types and overviews the rationale for their conduct. It imparts information on the most common forms of quantitative and qualitative research, and how they are carried out. These aspects would be helpful for selecting the optimal methodology to use for research based on the researcher’s objectives and topic.

This article also provides information on the strengths and weaknesses of quantitative and qualitative research. Such information would help researchers appreciate the roles and applications of both research types and how to gain from each or their combination. As different research questions require different types of research and analyses, this article is anticipated to help researchers better recognize the questions answered by quantitative and qualitative research.

Finally, this article would help researchers to have a balanced perspective of qualitative and quantitative research without considering one as superior to the other.

TYPES OF RESEARCH

Research can be classified into two general types, quantitative and qualitative. 21 Both types of research entail writing a research question and developing a hypothesis. 22 Quantitative research involves a deductive approach to prove or disprove the hypothesis that was developed, whereas qualitative research involves an inductive approach to create a hypothesis. 23 , 24 , 25 , 26

In quantitative research, the hypothesis is stated before testing. In qualitative research, the hypothesis is developed through inductive reasoning based on the data collected. 27 , 28 For types of data and their analysis, qualitative research usually includes data in the form of words instead of numbers more commonly used in quantitative research. 29

Quantitative research usually includes descriptive, correlational, causal-comparative / quasi-experimental, and experimental research. 21 On the other hand, qualitative research usually encompasses historical, ethnographic, meta-analysis, narrative, grounded theory, phenomenology, case study, and field research. 23 , 25 , 28 , 30 A summary of the features, writing approach, and examples of published articles for each type of qualitative and quantitative research is shown in Table 1 . 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43

| Research | Type | Methodology feature | Research writing pointers | Example of published article |
|---|---|---|---|---|
| Quantitative | Descriptive research | Describes status of identified variable to provide systematic information about phenomenon | Explain how a situation, sample, or variable was examined or observed as it occurred without investigator interference | Östlund AS, Kristofferzon ML, Häggström E, Wadensten B. Primary care nurses’ performance in motivational interviewing: a quantitative descriptive study. 2015;16(1):89. |
| | Correlational research | Determines and interprets extent of relationship between two or more variables using statistical data | Describe the establishment of reliability and validity, converging evidence, relationships, and predictions based on statistical data | Díaz-García O, Herranz Aguayo I, Fernández de Castro P, Ramos JL. Lifestyles of Spanish elders from supervened SARS-CoV-2 variant onwards: A correlational research on life satisfaction and social-relational praxes. 2022;13:948745. |
| | Causal-comparative/Quasi-experimental research | Establishes cause-effect relationships among variables | Write about comparisons of the identified control groups exposed to the treatment variable with unexposed groups | Sharma MK, Adhikari R. Effect of school water, sanitation, and hygiene on health status among basic level students in Nepal. Environ Health Insights 2022;16:11786302221095030. [The study applies a causal-comparative research design] |
| | | Uses non-randomly assigned groups where it is not logically feasible to conduct a randomized controlled trial | Provide clear descriptions of the causes determined after making data analyses and conclusions, and known and unknown variables that could potentially affect the outcome | Tuna F, Tunçer B, Can HB, Süt N, Tuna H. Immediate effect of Kinesio taping® on deep cervical flexor endurance: a non-controlled, quasi-experimental pre-post quantitative study. 2022;40(6):528-35. |
| | Experimental research | Establishes cause-effect relationship among group of variables making up a study using scientific method | Describe how an independent variable was manipulated to determine its effects on dependent variables | Hyun C, Kim K, Lee S, Lee HH, Lee J. Quantitative evaluation of the consciousness level of patients in a vegetative state using virtual reality and an eye-tracking system: a single-case experimental design study. 2022;32(10):2628-45. |
| | | | Explain the random assignments of subjects to experimental treatments | |
| Qualitative | Historical research | Describes past events, problems, issues, and facts | Write the research based on historical reports | Silva Lima R, Silva MA, de Andrade LS, Mello MA, Goncalves MF. Construction of professional identity in nursing students: qualitative research from the historical-cultural perspective. 2020;28:e3284. |
| | Ethnographic research | Develops in-depth analytical descriptions of current systems, processes, and phenomena or understandings of shared beliefs and practices of groups or culture | Compose a detailed report of the interpreted data | Gammeltoft TM, Huyền Diệu BT, Kim Dung VT, Đức Anh V, Minh Hiếu L, Thị Ái N. Existential vulnerability: an ethnographic study of everyday lives with diabetes in Vietnam. 2022;29(3):271-88. |
| | Meta-analysis | Accumulates experimental and correlational results across independent studies using statistical method | Specify the topic, follow reporting guidelines, describe the inclusion criteria, identify key variables, explain the systematic search of databases, and detail the data extraction | Oeljeklaus L, Schmid HL, Kornfeld Z, Hornberg C, Norra C, Zerbe S, et al. Therapeutic landscapes and psychiatric care facilities: a qualitative meta-analysis. 2022;19(3):1490. |
| | Narrative research | Studies an individual and gathers data by collecting stories for constructing a narrative about the individual’s experiences and their meanings | Write an in-depth narration of events or situations focused on the participants | Anderson H, Stocker R, Russell S, Robinson L, Hanratty B, Robinson L, et al. Identity construction in the very old: a qualitative narrative study. 2022;17(12):e0279098. |
| | Grounded theory | Engages in inductive ground-up or bottom-up process of generating theory from data | Write the research as a theory and a theoretical model. | Amini R, Shahboulaghi FM, Tabrizi KN, Forouzan AS. Social participation among Iranian community-dwelling older adults: a grounded theory study. 2022;11(6):2311-9. |
| | | | Describe data analysis procedure about theoretical coding for developing hypotheses based on what the participants say | |
| | Phenomenology | Attempts to understand subjects’ perspectives | Write the research report by contextualizing and reporting the subjects’ experiences | Green G, Sharon C, Gendler Y. The communication challenges and strength of nurses’ intensive corona care during the two first pandemic waves: a qualitative descriptive phenomenology study. 2022;10(5):837. |
| | Case study | Analyzes collected data by detailed identification of themes and development of narratives written as in-depth study of lessons from case | Write the report as an in-depth study of possible lessons learned from the case | Horton A, Nugus P, Fortin MC, Landsberg D, Cantarovich M, Sandal S. Health system barriers and facilitators to living donor kidney transplantation: a qualitative case study in British Columbia. 2022;10(2):E348-56. |
| | Field research | Directly investigates and extensively observes social phenomenon in natural environment without implantation of controls or experimental conditions | Describe the phenomenon under the natural environment over time | Buus N, Moensted M. Collectively learning to talk about personal concerns in a peer-led youth program: a field study of a community of practice. 2022;30(6):e4425-32. |

QUANTITATIVE RESEARCH

Deductive approach.

The deductive approach is used to prove or disprove the hypothesis in quantitative research. 21 , 25 Using this approach, researchers 1) make observations about an unclear or new phenomenon, 2) investigate the current theory surrounding the phenomenon, and 3) hypothesize an explanation for the observations. Afterwards, researchers will 4) predict outcomes based on the hypotheses, 5) formulate a plan to test the prediction, and 6) collect and process the data (or revise the hypothesis if the original hypothesis was false). Finally, researchers will then 7) verify the results, 8) make the final conclusions, and 9) present and disseminate their findings ( Fig. 1A ).

[Figure: jkms-38-e291-g001.jpg]

Types of quantitative research

The common types of quantitative research include (a) descriptive, (b) correlational, (c) experimental, and (d) causal-comparative/quasi-experimental research. 21

Descriptive research is conducted and written by describing the status of an identified variable to provide systematic information about a phenomenon. A hypothesis is developed and tested after data collection, analysis, and synthesis. This type of research attempts to factually present comparisons and interpretations of findings based on analyses of the characteristics, progression, or relationships of a certain phenomenon by manipulating the employed variables or controlling the involved conditions. 44 Here, the researcher examines, observes, and describes a situation, sample, or variable as it occurs without investigator interference. 31 , 45 To be meaningful, the systematic collection of information requires careful selection of study units by precise measurement of individual variables 21 often expressed as ranges, means, frequencies, and/or percentages. 31 , 45 Descriptive statistical analysis using ANOVA, Student’s t-test, or the Pearson coefficient method has been used to analyze descriptive research data. 46
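As a minimal sketch of the summary measures named above (ranges, means, frequencies, and percentages), the following Python example uses only the standard library. The sample values and the names `ages`, `smoking_status`, `describe_numeric`, and `describe_categorical` are hypothetical and serve only as illustration; they are not drawn from any of the cited studies.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample: age (years) and smoking status of 8 participants.
ages = [34, 45, 29, 52, 41, 38, 47, 34]
smoking_status = ["never", "former", "never", "current",
                  "never", "former", "never", "never"]


def describe_numeric(values):
    """Return the range (minimum, maximum) and mean of a numeric variable."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def describe_categorical(values):
    """Return the frequency and percentage of each category."""
    counts = Counter(values)
    total = len(values)
    return {category: {"n": n, "percent": 100 * n / total}
            for category, n in counts.items()}


age_summary = describe_numeric(ages)
status_summary = describe_categorical(smoking_status)
print(age_summary)
print(status_summary)
```

Numeric variables are summarized here by range and mean, and categorical variables by counts and percentages, which mirrors how descriptive results are commonly tabulated.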

Correlational research is performed by determining and interpreting the extent of a relationship between two or more variables using statistical data. This involves recognizing data trends and patterns without necessarily proving their causes. The researcher studies only the data, relationships, and distributions of variables in a natural setting, but does not manipulate them. 21 , 45 Afterwards, the researcher establishes reliability and validity, provides converging evidence, describes relationships, and makes predictions. 47
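One common way to quantify the extent of a linear relationship between two variables is the Pearson correlation coefficient. The sketch below computes it from first principles in Python. The function name `pearson_r` and the data (`exercise_hours`, `satisfaction`) are hypothetical illustrations, and a coefficient close to 1 or -1 describes an association only; it does not establish its cause.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(var_x * var_y)


# Hypothetical observations, recorded without manipulating either variable.
exercise_hours = [1, 2, 3, 4, 5, 6]
satisfaction = [52, 55, 61, 60, 68, 70]

r = pearson_r(exercise_hours, satisfaction)
print(f"r = {r:.3f}")
```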

Experimental research is usually referred to as true experimentation. The researcher establishes the cause-effect relationship among a group of variables making up a study using the scientific method or process. This type of research attempts to identify the causal relationships between variables through experiments by arbitrarily controlling the conditions or manipulating the variables used. 44 The scientific manuscript would include an explanation of how the independent variable was manipulated to determine its effects on the dependent variables. The write-up would also describe the random assignments of subjects to experimental treatments. 21
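The random assignment of subjects to experimental treatments can be sketched as follows. This is a minimal illustration that shuffles the subject list and deals subjects to the arms in turn so that arm sizes stay balanced; the function `randomize`, the arm labels, the subject identifiers, and the seed are all hypothetical, and real trials typically rely on dedicated, audited randomization procedures.

```python
import random


def randomize(subject_ids, arms=("treatment", "control"), seed=None):
    """Randomly assign subjects to arms of (near-)equal size."""
    rng = random.Random(seed)      # a seeded generator makes the allocation reproducible
    shuffled = list(subject_ids)   # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    allocation = {arm: [] for arm in arms}
    for position, subject in enumerate(shuffled):
        # Deal the shuffled subjects to the arms in turn.
        allocation[arms[position % len(arms)]].append(subject)
    return allocation


# Hypothetical subject identifiers S01..S10.
subjects = [f"S{i:02d}" for i in range(1, 11)]
allocation = randomize(subjects, seed=2024)
for arm, members in allocation.items():
    print(arm, members)
```

Recording the seed alongside the allocation allows the assignment to be reproduced and described in the write-up.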

Causal-comparative/quasi-experimental research closely resembles true experimentation in that it is conducted to establish cause-effect relationships among variables. It may also be conducted to establish the cause or consequences of differences that already exist between or among groups of individuals. 48 This type of research compares outcomes between intervention groups in which participants are not randomized to their respective interventions because of ethics- or feasibility-related reasons. 49 As in true experiments, the researcher identifies and measures the effects of the independent variable on the dependent variable. However, unlike true experiments, the researchers do not manipulate the independent variable.

In quasi-experimental research, naturally formed or pre-existing groups that are not randomly assigned are used, particularly when an ethical, randomized controlled trial is not feasible or logical. 50 The researcher identifies control groups as those which have been exposed to the treatment variable, and then compares these with the unexposed groups. The causes are determined and described after data analysis, after which conclusions are made. The known and unknown variables that could still affect the outcome are also included. 7

QUALITATIVE RESEARCH

Inductive approach.

Qualitative research involves an inductive approach to develop a hypothesis. 21 , 25 Using this approach, researchers answer research questions and develop new theories, but they do not test hypotheses or previous theories. The researcher seldom examines the effectiveness of an intervention, but rather explores the perceptions, actions, and feelings of participants using interviews, content analysis, observations, or focus groups. 25 , 45 , 51

Distinctive features of qualitative research

Qualitative research seeks to elucidate the lives of people, including their lived experiences, behaviors, attitudes, beliefs, personality characteristics, emotions, and feelings. 27 , 30 It also explores societal, organizational, and cultural issues. 30 This type of research provides a good story mimicking an adventure which results in a “thick” description that puts readers in the research setting. 52

The qualitative research questions are open-ended, evolving, and non-directional. 26 The research design is usually flexible and iterative, commonly employing purposive sampling. The sample size depends on theoretical saturation, and data is collected using in-depth interviews, focus groups, and observations. 27

In various instances, excellent qualitative research may offer insights that quantitative research cannot. Moreover, qualitative research approaches can describe the ‘lived experience’ perspectives of patients, practitioners, and the public. 53 Interestingly, recent developments have looked into the use of technology in shaping qualitative research protocol development, data collection, and analysis phases. 54

Qualitative research employs various techniques, including conversational and discourse analysis, biographies, interviews, case-studies, oral history, surveys, documentary and archival research, audiovisual analysis, and participant observations. 26

Conducting qualitative research

To conduct qualitative research, investigators 1) identify a general research question, 2) choose the main methods, sites, and subjects, and 3) determine methods of data documentation and access to subjects. Researchers also 4) decide on the various aspects for collecting data (e.g., questions, behaviors to observe, issues to look for in documents, and how much [number of questions, interviews, or observations]), 5) clarify researchers’ roles, and 6) evaluate the study’s ethical implications in terms of confidentiality and sensitivity. Afterwards, researchers 7) collect data until saturation, 8) interpret data by identifying concepts and theories, and 9) revise the research question if necessary and form hypotheses. In the final stages of the research, investigators 10) collect and verify data to address revisions, 11) complete the conceptual and theoretical framework to finalize their findings, and 12) present and disseminate findings ( Fig. 1B ).

Types of qualitative research

The different types of qualitative research include (a) historical research, (b) ethnographic research, (c) meta-analysis, (d) narrative research, (e) grounded theory, (f) phenomenology, (g) case study, and (h) field research. 23 , 25 , 28 , 30

Historical research is conducted by describing past events, problems, issues, and facts. The researcher gathers data from written or oral descriptions of past events and attempts to recreate the past without interpreting the events and their influence on the present. 6 Data is collected using documents, interviews, and surveys. 55 The researcher analyzes these data by describing the development of events and writes the research based on historical reports. 2

Ethnographic research is performed by observing everyday life details as they naturally unfold. 2 It can also be conducted by developing in-depth analytical descriptions of current systems, processes, and phenomena or by understanding the shared beliefs and practices of a particular group or culture. 21 The researcher collects extensive narrative non-numerical data based on many variables over an extended period, in a natural setting within a specific context. To do this, the researcher uses interviews, observations, and active participation. These data are analyzed by describing and interpreting them and developing themes. A detailed report of the interpreted data is then provided. 2 The researcher immerses himself/herself into the study population and describes the actions, behaviors, and events from the perspective of someone involved in the population. 23 As examples of its application, ethnographic research has helped to understand a cultural model of family and community nursing during the coronavirus disease 2019 outbreak. 56 It has also been used to observe the organization of people’s environment in relation to cardiovascular disease management in order to clarify people’s real expectations during follow-up consultations, possibly contributing to the development of innovative solutions in care practices. 57

Meta-analysis is carried out by accumulating experimental and correlational results across independent studies using a statistical method. 21 The report is written by specifying the topic and meta-analysis type. In the write-up, reporting guidelines are followed, which include description of inclusion criteria and key variables, explanation of the systematic search of databases, and details of data extraction. Meta-analysis offers in-depth data gathering and analysis to achieve deeper inner reflection and phenomenon examination. 58

Narrative research is performed by collecting stories for constructing a narrative about an individual’s experiences and the meanings attributed to them by the individual. 9 It aims to hear the voice of individuals through their account or experiences. 17 The researcher usually conducts interviews and analyzes data by storytelling, content review, and theme development. The report is written as an in-depth narration of events or situations focused on the participants. 2 , 59 Narrative research weaves together sequential events from one or two individuals to create a “thick” description of a cohesive story or narrative. 23 It facilitates understanding of individuals’ lives based on their own actions and interpretations. 60

Grounded theory is conducted by engaging in an inductive ground-up or bottom-up strategy of generating a theory from data. 24 The researcher incorporates deductive reasoning when using constant comparisons. Patterns are detected in observations and then a working hypothesis is created which directs the progression of inquiry. The researcher collects data using interviews and questionnaires. These data are analyzed by coding the data, categorizing themes, and describing implications. The research is written as a theory and theoretical models. 2 In the write-up, the researcher describes the data analysis procedure (i.e., theoretical coding used) for developing hypotheses based on what the participants say. 61 As an example, a qualitative approach has been used to understand the process of skill development of a nurse preceptor in clinical teaching. 62 A researcher can also develop a theory using the grounded theory approach to explain the phenomena of interest by observing a population. 23

Phenomenology is carried out by attempting to understand the subjects’ perspectives. This approach is pertinent in social work research where empathy and perspective are keys to success. 21 Phenomenology studies an individual’s lived experience in the world. 63 The researcher collects data by interviews, observations, and surveys. 16 These data are analyzed by describing experiences, examining meanings, and developing themes. The researcher writes the report by contextualizing and reporting the subjects’ experience. This research approach describes and explains an event or phenomenon from the perspective of those who have experienced it. 23 Phenomenology understands the participants’ experiences as conditioned by their worldviews. 52 It is suitable for a deeper understanding of non-measurable aspects related to the meanings and senses that individuals attribute to their lived experiences. 60

Case study is conducted by collecting data through interviews, observations, document content examination, and physical inspections. The researcher analyzes the data through a detailed identification of themes and the development of narratives. The report is written as an in-depth study of possible lessons learned from the case. 2

Field research is performed using a group of methodologies for undertaking qualitative inquiries. The researcher goes directly to the social phenomenon being studied and observes it extensively. In the write-up, the researcher describes the phenomenon under the natural environment over time with no implantation of controls or experimental conditions. 45

DIFFERENCES BETWEEN QUANTITATIVE AND QUALITATIVE RESEARCH

Scientific researchers must be aware of the differences between quantitative and qualitative research in terms of their working mechanisms to better understand their specific applications. This knowledge will be of significant benefit to researchers, especially during the planning process, to ensure that the appropriate type of research is undertaken to fulfill the research aims.

In terms of quantitative research data evaluation, four well-established criteria are used: internal validity, external validity, reliability, and objectivity. 23 The respective correlating concepts in qualitative research data evaluation are credibility, transferability, dependability, and confirmability. 30 Regarding write-up, quantitative research papers are usually shorter than their qualitative counterparts, which allows the latter to pursue a deeper understanding and thus produce the so-called “thick” description. 29

Interestingly, a major characteristic of qualitative research is that the research process is reversible and the research methods can be modified. This is in contrast to quantitative research in which hypothesis setting and testing take place unidirectionally. This means that in qualitative research, the research topic and question may change during literature analysis, and that the theoretical and analytical methods could be altered during data collection. 44

Quantitative research focuses on natural, quantitative, and objective phenomena, whereas qualitative research focuses on social, qualitative, and subjective phenomena. 26 Quantitative research answers the questions “what?” and “when?”, whereas qualitative research answers the questions “why?”, “how?”, and “how come?” 64

Perhaps the most important distinction between quantitative and qualitative research lies in the nature of the data being investigated and analyzed. Quantitative research focuses on statistical, numerical, and quantitative aspects of phenomena, and employs the same data collection and analysis, whereas qualitative research focuses on the humanistic, descriptive, and qualitative aspects of phenomena. 26 , 28

Structured versus unstructured processes

The aims and types of inquiries determine the difference between quantitative and qualitative research. In quantitative research, statistical data and a structured process are usually employed by the researcher. Quantitative research usually suggests quantities (i.e., numbers). 65 On the other hand, researchers typically use opinions, reasons, verbal statements, and an unstructured process in qualitative research. 63 Qualitative research is more related to quality or kind. 65

In quantitative research, the researcher employs a structured process for collecting quantifiable data. Often, a closed-ended questionnaire is used, with the response categories for each question designed so that values can be assigned and analyzed quantitatively using a common scale. 66 Quantitative research data are processed sequentially, from data management to data analysis and finally to data interpretation. Data should be free from errors and missing values. In data management, variables are defined and coded. In data analysis, statistics (e.g., descriptive, inferential) as well as measures of central tendency (i.e., mean, median, mode), spread (e.g., standard deviation), and parameter estimation (e.g., confidence intervals) are used. 67
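The sequence described above (coding responses to a common scale, checking for invalid or missing values, then computing central tendency, spread, and a confidence interval) can be sketched in Python. The coding scheme `CODES`, the `responses`, and the helper names are hypothetical. Treating the coded scale as numeric and using a normal-approximation interval are simplifying assumptions made for illustration; for a small sample such as this one, a t-based interval would usually be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, median, mode, stdev

# Hypothetical coding scheme for one closed-ended (Likert-type) question.
CODES = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
         "agree": 4, "strongly agree": 5}

# Hypothetical responses from 10 participants.
responses = ["agree", "neutral", "agree", "strongly agree", "disagree",
             "agree", "neutral", "agree", "strongly agree", "agree"]


def code_responses(raw):
    """Data management: map each response to its numeric code, rejecting bad values."""
    invalid = [r for r in raw if r not in CODES]
    if invalid:
        raise ValueError(f"uncodable or missing responses: {invalid}")
    return [CODES[r] for r in raw]


def summarize(coded, confidence=0.95):
    """Data analysis: central tendency, spread, and a normal-approximation CI."""
    n = len(coded)
    m = mean(coded)
    sd = stdev(coded)                               # sample standard deviation
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * sd / sqrt(n)
    return {"n": n, "mean": m, "median": median(coded), "mode": mode(coded),
            "sd": sd, "ci_low": m - half_width, "ci_high": m + half_width}


summary = summarize(code_responses(responses))
print(summary)
```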

In qualitative research, the researcher uses an unstructured process for collecting data. These non-statistical data may be in the form of statements, stories, or long explanations. Various responses according to respondents may not be easily quantified using a common scale. 66

Composing a qualitative research paper resembles writing a quantitative research paper. Both papers consist of a title, an abstract, an introduction, objectives, methods, findings, and discussion. However, a qualitative research paper is less regimented than a quantitative research paper. 27

Quantitative research as a deductive hypothesis-testing design

Quantitative research can be considered as a hypothesis-testing design as it involves quantification, statistics, and explanations. It flows from theory to data (i.e., deductive), focuses on objective data, and applies theories to address problems. 45 , 68 It collects numerical or statistical data; answers questions such as how many, how often, how much; uses questionnaires, structured interview schedules, or surveys 55 as data collection tools; analyzes quantitative data in terms of percentages, frequencies, statistical comparisons, graphs, and tables showing statistical values; and reports the final findings in the form of statistical information. 66 It uses variable-based models from individual cases and findings are stated in quantified sentences derived by deductive reasoning. 24

In quantitative research, a phenomenon is investigated in terms of the relationship between an independent variable and a dependent variable which are numerically measurable. The research objective is to statistically test whether the hypothesized relationship is true. 68 Here, the researcher studies what others have performed, examines current theories of the phenomenon being investigated, and then tests hypotheses that emerge from those theories. 4
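To make the idea of statistically testing a hypothesized relationship concrete, the sketch below applies an exact two-sided permutation test to the difference in means between two groups. This particular test is chosen here only for illustration because it needs nothing beyond the standard library; the text above does not prescribe a specific test. The grouping (independent variable), the measured outcome (dependent variable), the data, and the function name `permutation_test` are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_test(group_a, group_b):
    """Exact two-sided permutation test for a difference in group means.

    Returns the proportion of all possible regroupings of the pooled data
    whose absolute mean difference is at least as large as the observed one.
    Intended for small samples, since every regrouping is enumerated.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:   # small tolerance for floating-point ties
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical outcome measurements for an unexposed and an exposed group.
unexposed = [118, 121, 115, 119, 117]
exposed = [128, 131, 126, 133, 129]

p_value = permutation_test(unexposed, exposed)
print(f"p = {p_value:.4f}")
```

The resulting p-value would then be compared with a significance level fixed before data collection, consistent with stating the hypothesis before testing.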

Quantitative hypothesis-testing research has certain limitations. These limitations include (a) problems with selection of meaningful independent and dependent variables, (b) the inability to reflect subjective experiences as variables since variables are usually defined numerically, and (c) the need to state a hypothesis before the investigation starts. 61

Qualitative research as an inductive hypothesis-generating design

Qualitative research can be considered as a hypothesis-generating design since it involves understanding and descriptions in terms of context. It flows from data to theory (i.e., inductive), focuses on observation, and examines what happens in specific situations with the aim of developing new theories based on the situation. 45 , 68 This type of research (a) collects qualitative data (e.g., ideas, statements, reasons, characteristics, qualities), (b) answers questions such as what, why, and how, (c) uses interviews, observations, or focus group discussions as data collection tools, (d) analyzes data by discovering patterns of changes, causal relationships, or themes in the data, and (e) reports the final findings as descriptive information. 61 Qualitative research favors case-based models from individual characteristics, and findings are stated using context-dependent existential sentences that are justifiable by inductive reasoning. 24

In qualitative research, texts and interviews are analyzed and interpreted to discover meaningful patterns characteristic of a particular phenomenon. 61 Here, the researcher starts with a set of observations and then moves from particular experiences to a more general set of propositions about those experiences. 4

Qualitative hypothesis-generating research involves collecting interview data from study participants regarding a phenomenon of interest, and then using what they say to develop hypotheses. It involves the process of questioning more than obtaining measurements; it generates hypotheses using theoretical coding. 61 When large interview teams are used, a balance between autonomy and collaboration is key to team cohesion, high-level qualitative research, and successful research outcomes. 69

Qualitative data may also include observed behavior, participant observation, media accounts, and cultural artifacts. 61 Focus group interviews are usually conducted, audiotaped or videotaped, and transcribed. Afterwards, the transcript is analyzed by several researchers.

Qualitative research also involves scientific narratives and the analysis and interpretation of textual or numerical data (or both), mostly from conversations and discussions. Such an approach uncovers meaningful patterns that describe a particular phenomenon. 2 Thus, qualitative research requires skills in grasping and contextualizing data, as well as communicating data analysis and results in a scientific manner. The reflective process of the inquiry underscores the strengths of a qualitative research approach. 2

Combination of quantitative and qualitative research

When both quantitative and qualitative research methods are used in the same research, mixed-method research is applied. 25 This combination provides a complete view of the research problem and achieves triangulation to corroborate findings, complementarity to clarify results, expansion to extend the study’s breadth, and explanation to elucidate unexpected results. 29

Moreover, quantitative and qualitative findings are integrated to address the weaknesses of both research methods 29 , 66 and to gain a more comprehensive understanding of the phenomenon spectrum. 66

For data analysis in mixed-method research, real non-quantitized qualitative data and quantitative data must both be analyzed. 70 The data obtained from quantitative analysis can be further expanded and deepened by qualitative analysis. 23

In terms of assessment criteria, Hammersley 71 opined that qualitative and quantitative findings should be judged using the same standards of validity and value-relevance. Both approaches can be mutually supportive. 52

Quantitative and qualitative research must be carefully studied and conducted by scientific researchers to avoid unethical research and inadequate outcomes. Quantitative research involves a deductive process wherein a research question is answered with a hypothesis that describes the relationship between independent and dependent variables, followed by testing of the hypothesis. This investigation can be aptly termed hypothesis-testing research, involving the analysis of hypothesis-driven experimental studies resulting in a test of significance. Qualitative research involves an inductive process wherein a research question is explored to generate a hypothesis, which then leads to the development of a theory. This investigation can be aptly termed hypothesis-generating research. When the whole spectrum of inductive and deductive research approaches is combined using both quantitative and qualitative research methodologies, mixed-method research is applied, and this can facilitate the construction of novel hypotheses, development of theories, or refinement of concepts.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Conceptualization: Barroga E, Matanguihan GJ.
  • Data curation: Barroga E, Matanguihan GJ, Furuta A, Arima M, Tsuchiya S, Kawahara C, Takamiya Y, Izumi M.
  • Formal analysis: Barroga E, Matanguihan GJ, Furuta A, Arima M, Tsuchiya S, Kawahara C.
  • Investigation: Barroga E, Matanguihan GJ, Takamiya Y, Izumi M.
  • Methodology: Barroga E, Matanguihan GJ, Furuta A, Arima M, Tsuchiya S, Kawahara C, Takamiya Y, Izumi M.
  • Project administration: Barroga E, Matanguihan GJ.
  • Resources: Barroga E, Matanguihan GJ, Furuta A, Arima M, Tsuchiya S, Kawahara C, Takamiya Y, Izumi M.
  • Supervision: Barroga E.
  • Validation: Barroga E, Matanguihan GJ, Furuta A, Arima M, Tsuchiya S, Kawahara C, Takamiya Y, Izumi M.
  • Visualization: Barroga E, Matanguihan GJ.
  • Writing - original draft: Barroga E, Matanguihan GJ.
  • Writing - review & editing: Barroga E, Matanguihan GJ, Furuta A, Arima M, Tsuchiya S, Kawahara C, Takamiya Y, Izumi M.


  8. Effectiveness of mRNA Covid-19 Vaccine among U.S. Health Care Personnel

    Phase 3 clinical trials showed the safety and efficacy of the mRNA vaccines, 7,8 and early data from observational studies 9-11 have supported the clinical trial results. Real-world data on ...

  9. Quantitative research methods in medical education

    Quantitative research methods in medical education. Geoff Norman, Geoff Norman. Clinical Epidemiology and Biostatistics, McMaster University, Canada. Search for more papers by this author. Kevin W Eva, Kevin W Eva. Centre for Health Education Scholarship, University of British Columbia, Canada ... First published: 22 October 2013. https://doi ...

  10. Conducting Quantitative Medical Education Research: From ...

    Abstract. Rigorous medical education research is critical to effectively develop and evaluate the training we provide our learners. Yet many clinical medical educators lack the training and skills needed to conduct high-quality medical education research. We offer guidance on conducting sound quantitative medical education research.

  11. Quantitative research: Designs relevant to nursing and healthcare

    This paper gives an overview of the main quantitative research designs relevant to nursing and healthcare. It outlines some strengths and weaknesses of the designs, provides examples to illustrate the different designs and examines some of the relevant statistical concepts.

  12. Quantitative Research Methods in Medical Education

    Summary The past three decades of research have seen substantial advances in medical education, ... Quantitative Research Methods in Medical Education. Geoff Norman. Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada ... Search for more papers by this author. First published: 05 October 2018. https://doi.org ...

  13. Quantitative medicine: Tracing the transition from holistic to

    Quantitative medicine is a paradigm shift in the practice of medicine that emphasizes the use of quantitative data and mathematical ... (DT). In 2012, the NASA and the U.S. Air Force jointly published a paper about the DT, which stated the DT was the key technology for future vehicles. ... the number of research studies on DT in aerospace has ...

  14. Quantitative Research Methods in Medical Education

    Quantitative Research Methods in Medical Education. April 2019. Anesthesiology Publish Ahead of Print (&NA;):&NA; DOI: 10.1097/aln.0000000000002727. Authors: John T. Ratelle. Mayo Foundation for ...

  15. Powerful numbers: Exemplary quantitative studies of science that had

    Abstract. Much scientometric research aims to be relevant to policy, but such research only rarely has a notable policy impact. In this paper, we examine four exemplary cases of policy impact from quantitative studies of science. The cases are analyzed in light of lessons learned about the use of evidence in policy making in health services, which provides very thorough explorations of the ...

  16. Appraising Quantitative Research in Health Education: Guidelines for

    Greenhalgh T, Taylor R. How to read a paper: Papers that go beyond numbers (qualitative research) British Medical Journal. 1997; 315:740-743. [PMC free article] [Google Scholar] Greenhalgh T. How to read a paper: Assessing the methodological quality of published papers. British Medical Journal. 315:305-308.

  17. Journal of Medical Internet Research

    Background: The World Health Organization considers coronavirus disease (COVID-19) to be a public emergency threatening global health. During the crisis, the public's need for web-based information and communication is a subject of focus. Digital inequality research has shown that internet access is not evenly distributed among the general population.

  18. Quantitative Research Methods in Medical Education

    Associate Professor, Department of Medicine Education Researcher, Center for Faculty Educators. School of Medicine, University of California, San Francisco, CA, USA. Search for more papers by this author

  19. Quantitative Research Excellence: Study Design and Reliable and Valid

    All subjects Allied Health Cardiology & Cardiovascular Medicine Dentistry Emergency Medicine & Critical Care Endocrinology & Metabolism Environmental Science General Medicine Geriatrics Infectious Diseases Medico-legal ... Article first published online: June 9, 2021. Issue ... Quantitative Research for the Qualitative Researcher. 2014. SAGE ...

  20. Quantitative Research Methods in Medical Education

    Evaluating medical education research requires specific orientation to issues related to format and content. Our goal is to review the quantitative aspects of research in medical education so that clinicians may understand these articles with respect to framing the study, recognizing methodologic issues, and utilizing instruments for evaluating ...

  21. Academic Medicine

    ologies and knowledge, it is important to understand the breadth of research methodologies and their rigor in HPE research publications. However, there are limited studies examining these questions. This study synthesized current trends in methodologies and rigor in HPE papers to inform how evidence is gathered and collectively shapes knowledge in HPE. Method This descriptive quantitative ...

  22. Quantitative Research in Human Biology and Medicine

    Description. Quantitative Research in Human Biology and Medicine reflects the author's past activities and experiences in the field of medical statistics. The book presents statistical material from a variety of medical fields. The text contains chapters that deal with different aspects of vital statistics. It provides statistical surveys of ...

  23. Conducting and Writing Quantitative and Qualitative Research

    INTRODUCTION. Novel research studies are conceptualized by scientific researchers first by asking excellent research questions and developing hypotheses, then answering these questions by testing their hypotheses in ethical research.1,2,3 Before they conduct novel research studies, scientific researchers must possess considerable knowledge of both quantitative and qualitative research.2