Shapiro Library

Higher Education Administration (HEA) Guide

A review of quantitative and qualitative analysis.

Need a refresher on Quantitative and Qualitative Analysis? Click below to get a review of both research methodologies.

  • Quantitative and Qualitative Analysis

Program Evaluation and Planning

Image: close-up of a hand writing numbered plans on paper (Kelly Sikkema, via Unsplash)

From data analysis to program management methods and more, evaluating and planning for the success of each program is a crucial aspect of Higher Education Administration. Below you will find some useful articles and reports to help bring context to this important element of higher education leadership. 

Useful Articles

Below you will find a sample of reports, case studies and articles that outline the process of program evaluation, planning and analysis. Click through and read on for more information. 

  • The Feasibility of Program-Level Accountability in Higher Education: Guidance for Policymakers (Research Report). Policymakers have expressed increased interest in program-level higher education accountability measures as a supplement to, or in place of, institution-level metrics. But it is unclear what these measures should look like. In this report, the authors assess the ways program-level data could be developed to facilitate federal accountability.
  • Improving Institutional Evaluation Methods: Comparing Three Evaluations Using PSM, Exact and Coarsened Exact Matching Policymakers and institutional leaders in higher education too often make decisions based on descriptive data analyses or even anecdote when better analysis options could produce more nuanced and more valuable results. Employing the setting of higher education program evaluation at a midwestern regional public university, for this study we compared analysis approaches using basic descriptive analyses, regression, standard propensity score matching (PSM), and a mixture of PSM with continuous variables, coarsened exact matching, and exact matching on categorical variables. We used three examples of program evaluations: a freshman seminar, an upper division general education program intended to improve cultural awareness and respect for diverse groups, and multiple living learning communities. We describe how these evaluations were conducted, compare the different results for each type of method employed, and discuss the strengths and weaknesses of each in the context of program evaluation.
  • Data-Informed Policy Innovations in Tennessee: Effective Use of State Data Systems. Analysis of student-level data to inform policy and promote student success is a core function of executive higher education agencies. Postsecondary data systems have expanded their collection of data elements for use by policymakers, institutional staff, and the general public. State coordinating and governing boards use these data systems for strategic planning, to allocate funding, establish performance metrics, evaluate academic programs, and inform students and their families. This report discusses efforts at the Tennessee Higher Education Commission to support policy innovation with data and information resources.




  • Open access
  • Published: 02 March 2021

Research impact evaluation and academic discourse

  • Marta Natalia Wróblewska, ORCID: orcid.org/0000-0001-8575-5215

Humanities and Social Sciences Communications, volume 8, Article number: 58 (2021)


  • Language and linguistics
  • Science, technology and society

The introduction of ‘impact’ as an element of assessment constitutes a major change in the construction of research evaluation systems. While various protocols of impact evaluation exist, the most articulated one was implemented as part of the British Research Excellence Framework (REF). This paper investigates the nature and consequences of the rise of ‘research impact’ as an element of academic evaluation from the perspective of discourse. Drawing from linguistic pragmatics and Foucauldian discourse analysis, the study discusses shifts related to the so-called Impact Agenda in four stages, in chronological order: (1) the ‘problematization’ of the notion of ‘impact’, (2) the establishment of an ‘impact infrastructure’, (3) the consolidation of a new genre of writing, the impact case study, and (4) academics’ positioning practices towards the notion of ‘impact’, theorized here as the triggering of new practices of ‘subjectivation’ of the academic self. The description of the basic functioning of the ‘discourse of impact’ is based on the analysis of two corpora: case studies submitted by a selected group of academics (linguists) to REF2014 (n = 78) and interviews (n = 25) with their authors. Linguistic pragmatics is particularly useful in analyzing linguistic aspects of the data, while Foucault’s theory helps draw together findings from the two datasets in a broader analysis based on a governmentality framework. This approach allows for more general conclusions on the practices of governing (academic) subjects within evaluation contexts.


Introduction

The introduction of ‘research impact’ as an element of evaluation constitutes a major change in the construction of research evaluation systems. ‘Impact’, understood broadly as the influence of academic research beyond the academic sphere, including areas such as business, education, public health, policy, public debate, culture etc., has been progressively implemented in various systems of science evaluation—a trend observable worldwide (Donovan, 2011; Grant et al., 2009; European Science Foundation, 2012). Salient examples of attempts to systematically evaluate research impact include the Australian Research Quality Framework (RQF) (Donovan, 2008) and the Dutch Standard Evaluation Protocol (VSNU, Association of Universities in the Netherlands, 2016; see ‘societal relevance’).

The most articulated system of impact evaluation to date was implemented in the British cyclical ex post assessment of academic units, the Research Excellence Framework (REF), as part of a broader governmental policy—the Impact Agenda. REF is the most-studied and probably the most influential impact evaluation system to date, and it has been used as a model for analogous evaluations in other countries. These include the Norwegian Humeval exercise for the humanities (Research Council of Norway, 2017, pp. 36–37; Wróblewska, 2019) and ensuing evaluations of other fields (Research Council of Norway, 2018, pp. 32–34; Wróblewska, 2019, pp. 12–16). REF has also directly inspired impact evaluation protocols in Hong Kong (Hong Kong University Grants Committee, 2018) and Poland (Wróblewska, 2017). This study is based on data collected in the context of the British REF2014, but it advances a description of the ‘discourse of impact’ that can be generalized and applied to other national and international contexts.

Although impact evaluation is a new practice, a body of literature has already been produced on the topic. This includes policy documents on the first edition of REF in 2014 (HEFCE, 2015; Stern, 2016) and related reports, be it commissioned (King’s College London and Digital Science, 2015; Manville et al., 2014, 2015) or conducted independently (National Co-ordinating Centre for Public Engagement, 2014). There also exists a scholarly literature which reflects on the theoretical underpinnings of impact evaluations (Gunn and Mintrom, 2016, 2018; Watermeyer, 2012, 2016) and the observable consequences of the exercise for academic practice (Chubb and Watermeyer, 2017; Chubb et al., 2016; Watermeyer, 2014). While these reports and studies mainly draw on the methods of philosophy, sociology and management, many of them also allude to changes related to language.

Several publications on impact drew attention to the process of meaning-making around the notion of ‘impact’ in the early stages of its existence. Manville et al. flagged up the necessity for the policy-maker to facilitate the development of a common vocabulary to enable a broader ‘cultural shift’ (2015, pp. 16, 26, 37–38, 69). Power wrote of an emerging ‘performance discourse of impact’ (2015, p. 44), while Derrick (2018) looked at the collective process of defining and delimiting “the ambiguous object” of impact at the stage of panel proceedings. The present paper picks up these observations, bringing them together within a single discursive perspective.

Drawing from linguistic pragmatics and Foucauldian discourse analysis, the paper presents shifts related to the introduction of ‘impact’ as an element of evaluation in four stages. These are, in chronological order: (1) the ‘problematisation’ of the notion of ‘impact’ in policy and its appropriation on a local level, (2) the creation of an impact infrastructure to orchestrate practices around impact, (3) the consolidation of a new genre of writing, the impact case study, and (4) academics’ uptake of the notion of impact and its progressive inclusion in their professional positioning.

Each of these stages is described using theoretical concepts grounded in empirical data. The first stage has to do with the process of ‘problematization’ of a previously non-regulated area, i.e., the process of casting research impact as a ‘problem’ to be addressed and regulated by a set of policy measures. The second stage took place when, in rapid response to government policy, new procedures and practices were created within universities, giving rise to an impact ‘infrastructure’ (or ‘apparatus’ in the Foucauldian sense). The third stage is the emergence of a crucial element of this infrastructure: a new genre of academic writing, the impact case study. I argue that engaging with the new genre and learning to write impact case studies was key to incorporating ‘impact’ into scholars’ narratives of ‘academic identity’. Hence, the paper presents new practices of ‘subjectivation’ as the fourth stage of the incorporation of ‘impact’ into academic discourse. The four stages of the introduction of ‘impact’ into academic discourse are mutually interlinked—each step paves the way for the next.

Of the four stages described, only stage three focuses on a classical linguistic task: the description of a new genre of text. The remaining three take a broader view informed by sociology and philosophy, focusing on discursive practices, i.e., language used in social context. Other descriptions of the emergence of impact are possible—note for instance Power’s four-fold structure (Power, 2015), at points analogous to this study.

Theoretical framework and data

This study builds on a constructivist approach to social phenomena in assuming that language plays a crucial role in establishing and maintaining social practice. In this approach, ‘discourse’ is understood as the production of social meaning—or the negotiation of social, political or cultural order—through the means of text and talk (Fairclough, 1989, 1992; Fairclough et al., 1997; Gee, 2015).

Linguistic pragmatics and Foucauldian approaches to discourse are used to account for the changes related to the rise of ‘impact’ as an element of evaluation and discourse on the macro and micro scale. In looking at the micro scale of everyday linguistic practices, the analysis makes use of linguistic pragmatics, in particular the concepts of positioning (Davies and Harré, 1990), stage (Goffman, 1969; Robinson, 2013) and metaphor (Cameron et al., 2009; Musolff, 2004, 2012), as well as genre analysis (Swales, 1990, 2011). In analyzing the macro scale, i.e., the establishment of the concept of ‘impact’ in policy and the creation of an impact infrastructure, it draws on selected concepts of Foucauldian governmentality theory (crucially ‘problematisation’, ‘apparatus’, ‘subjectivation’) (Foucault, 1980, 1988, 1990; Rose, 1999, pp. ix–xiii).

While the toolbox of linguistic pragmatics is particularly useful in analyzing linguistic aspects of the datasets, Foucault’s governmentality framework helps bring together findings from the two datasets in a broader analysis, allowing more general conclusions on the practices of governing (academic) subjects within evaluation frameworks. Both the pragmatic and the Foucauldian traditions of discourse analysis have been productively applied in the study of higher education contexts (e.g., Fairclough, 1993; Gilbert and Mulkay, 1984; Hyland, 2009; Myers, 1985, 1989; for an overview see Wróblewska and Angermuller, 2017).

The analysis builds on an admittedly heterogeneous set of concepts, hailing from different traditions and disciplines. This approach allows for a suitably nuanced description of a broad phenomenon—the discourse of impact—studied here on the basis of two different datasets. To make the argument easier to follow, individual theoretical and methodological concepts are defined where they are applied in the analysis.

The studied corpus consists of two datasets, one written and one oral. The written corpus includes 78 impact case studies (CSs) submitted to REF2014 in the discipline of linguistics. Linguistics was selected as a discipline straddling the social sciences and humanities (SSH). SSH are arguably most challenged by the practice of impact evaluation, as they have traditionally resisted subjection to economization and social accountability (Benneworth et al., 2016; Bulaitis, 2017).

The CSs were downloaded in PDF form from REF’s website: https://www.ref.ac.uk/2014/. The documents have an identical structure, featuring basic information (name of institution, unit of assessment, title of CS) and core content divided into five sections: (1) summary of impact, (2) underpinning research, (3) references to the research, (4) details of impact, (5) sources to corroborate impact. Each CS is about 4 pages long (~2400 words). The written dataset (with a word count of 173,474) was analyzed qualitatively using MAXQDA software, with a focus on the generic aspect of the documents.

The oral dataset is composed of semi-structured interviews with authors of the studied CSs (n = 20) and other actors involved in the evaluation, including two policy-makers and three academic administrators. In total, the 25 interviews, each around 60 min long, add up to around 25 h of recordings. The interviews were analyzed in two ways. Firstly, they were coded for themes and topics related to the evaluation process; this was useful for the description of the impact infrastructure presented in step 2 of the analysis. Secondly, they were considered as a linguistic performance and coded for discursive devices (irony, distancing, metaphor etc.); this was the basis for findings related to the presentation of one’s ‘academic self’, which are the object of the fourth step of the analysis. The written corpus allows for an analysis of the functioning of the notion of ‘impact’ in the official, administrative discourse of academia, looking at the emergence of an impact infrastructure and the genre created for the description of impact. The oral dataset in turn sheds light on how academics relate to the notion of impact in informal settings, by focusing on metaphors and pragmatic markers of stage.
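The basic quantitative preliminaries of such a corpus analysis (word counts per document, frequencies of tracked vocabulary) can be sketched in a few lines of code. The following Python fragment is a hypothetical illustration only: the file names, toy texts and the term list are invented for the sketch and are not drawn from the studied corpus, where coding was done qualitatively in MAXQDA.

```python
import re
from collections import Counter

# Invented term list for the sketch: vocabulary one might track
# across plain-text versions of impact case studies.
IMPACT_TERMS = ["reach", "significance", "underpinning", "beneficiaries"]

def word_count(text: str) -> int:
    """Count word tokens using a simple regex tokenizer."""
    return len(re.findall(r"\b\w+\b", text))

def term_frequencies(text: str, terms=IMPACT_TERMS) -> Counter:
    """Count case-insensitive occurrences of each tracked term."""
    tokens = re.findall(r"\b\w+\b", text.lower())
    counts = Counter(tokens)
    return Counter({t: counts[t] for t in terms})

# Toy stand-in for the written corpus (invented texts).
corpus = {
    "cs_01.txt": "The underpinning research achieved wide reach and significance.",
    "cs_02.txt": "Beneficiaries reported significance beyond the academic sphere.",
}

total_words = sum(word_count(text) for text in corpus.values())
totals = Counter()
for text in corpus.values():
    totals += term_frequencies(text)

print(total_words)            # 15
print(totals["significance"])  # 2
```

On the real dataset, `total_words` would correspond to the reported corpus size (173,474 tokens) and the frequency table would feed the qualitative coding rather than replace it.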

The discourse of impact

Problematization of impact

The introduction of ‘impact’, a new element of evaluation accounting for 20% of the final result, came as a surprise and marked a significant change with respect to the previous model of evaluation, the Research Assessment Exercise (Warner, 2015). The outline of an approach to impact evaluation in REF was developed on the government’s recommendation after a review of international practice in impact assessment (Grant et al., 2009). The adopted approach was inspired by the previously created (but never implemented) Australian RQF framework (Donovan, 2008). A pilot evaluation exercise run in 2010 confirmed the viability of the case-study approach to impact evaluation. In July 2011 the Higher Education Funding Council for England (HEFCE) published guidelines regulating the new assessment (HEFCE, 2011). The deadline for submissions was set for November 2013.

In the period between July 2011 and November 2013, HEFCE engaged in broad communication and training activities across universities, with the aim of explaining the concept of ‘impact’ and the rules which would govern its evaluation (Power, 2015, pp. 43–48). Knowledge of the new element of evaluation was articulated and passed down to particular departments, academic administrative staff and individual researchers in a trickle-down process, as explained by a HEFCE policymaker in an account of the run-up to REF2014:

There was no master blue print! There were some ideas, which indeed largely came to pass. But in order to understand where we [HEFCE] might be doing things that were unhelpful and might have adverse outcomes, we had to listen. I was in way over one hundred meetings and talked to thousands of people! (…) [The Impact Agenda] is something that we are doing to universities. Actually, what we wanted to say is: ‘we are doing it with you, you’ve got to own it’.
Int20, policymaker, example 1

Due to the importance attributed to the exercise by managers of academic units and the relatively short time for preparing submissions, institutions were responsive to the policy developments. In fact, they actively contributed to the establishment and refinement of concepts related to impact. Institutional learning occurred largely in parallel with the consolidation of the policy and the refinement of the concepts and definitions related to impact. The initially open, undefined nature of ‘impact’ (“there was no master blue-print”) is described also in accounts of academics who participated in the many rounds of meetings and consultations. See example 2 below:

At that time, they [HEFCE] had not yet come up with this definition [of impact], not yet pinned it down, but they were trying to give an idea of what it was, to get feedback, to get a grip on it. (…) And we realised (…) they didn’t have any more of an idea of this than we did! It was almost like a fishing expedition. (…) I got a sense very early on of, you know, groping.
Int1, academic, example 2

The “pinning down” of an initially fuzzy concept and the defining of the rules which would come to govern its evaluation was just one aim of the process. The other was to engage academics and affirm their active role in the policy-making. From an idea which came from outside the British academic community (from the government, the research councils) and originally from outside the UK (the Australian RQF exercise), a concept which was imposed on academics (“it is something that we are doing to universities”), the Impact Agenda was to become an accepted, embedded element of academic life (“you’ve got to own it”). In this sense, the laboriousness of the process, both for the policy-makers and the academics involved, was a necessary price to be paid for a feeling of “ownership” among the academic community. Attitudes of academics, initially quite negative (Chubb et al., 2016; Watermeyer, 2016), changed progressively as the concept of impact became familiarized and adapted to the pre-existing realities of academic life, as recounted by many of the interviewees:

I think the resentment died down relatively quickly. There was still some resistance. And that was partly academics recognising that they had to [take part in the exercise], they couldn’t ignore it. Partly, the government and the research council has been willing to tweak, amend and qualify the initial very hard-edged guidelines and adapt them for the humanities. So, it was two-way process, a dialogue.
Int16, academic, example 3

The announcement of the final REF regulations (HEFCE, 2011) was the climax of the long process of making ‘impact’ into a thinkable and manageable entity. The last iteration of the regulations constituted a co-creation of various actors (the initial Australian policymakers of the RQF, HEFCE employees, academics, impact professionals, universities, professional organizations) who had contributed to it at different stages (in many rounds of consultations, workshops, talks and sessions across the country). ‘Impact’ as a notion was ‘talked into being’ in a polyphonic process (Angermuller, 2014a, 2014b) of debate, critique, consultation (“listening”, “getting feedback”) and adaptation (“tweaking”, “changing”, “amending hard-edged guidelines”), also in view of the pre-existing conditions of academia, such as the friction between the ‘soft’ and ‘hard’ sciences (as mentioned in example 3). In effect, impact was constituted as an object of thought, and an area of academic activity began to emerge around it.

The period of defining ‘impact’ as a new, important notion in academic discourse in the UK, roughly between July 2011 and November 2013, can be conceptualized in terms of the Foucauldian notion of ‘problematization’. This concept describes how spaces, areas of activity, persons, behaviors or practices become targeted by government, separated from others, and cast as ‘problems’ to be addressed with a set of techniques and regulations. ‘Problematisation’ is the moment when a notion “enters into the play of true and false, (…) is constituted as an object of thought (whether in the form of moral reflection, scientific knowledge, political analysis, etc.)” (Foucault, 1988, p. 257), when it “enters into the field of meaning” (Foucault, 1984, pp. 84–86). The problematization of an area triggers not only the establishment of new notions and objects but also of new practices and institutions. In consequence, the areas in question become subjugated to a new (political, administrative, financial) domination. This eventually shapes the way in which social subjects conceive of their world and of themselves. But a ‘problematisation’, however influential, cannot persist on its own. It requires an overarching structure in the form of an ‘apparatus’ which will consolidate and perpetuate it.

Impact infrastructure

Soon after the publication of the evaluation guidelines for REF2014, and still during the phase of ‘problematisation’ of impact, universities started collecting data on ‘impactful’ research conducted in their departments and recruiting authors of potential CSs which could be submitted for evaluation. The winding and iterative nature of the process of problematization of ‘impact’ made it difficult for research managers and researchers to keep track of the emerging knowledge around impact (official HEFCE documentation, results of the pilot evaluation, FAQs, workshops and sessions organized around the country, writings published in print and online). At the stage of collecting drafts of CSs it was still unclear what would ‘count’ as impact and what evidence would be required. Hence, there emerged a need for specific procedures and specialized staff to prepare the REF submissions.

At most institutions, specific posts were created for employees preparing impact submissions for REF2014. These were both secondment positions, such as ‘impact lead’ or ‘impact champion’, and full-time ones, such as ‘impact officer’ or ‘impact manager’. These professionals soon started organizing among themselves at meetings and workshops. Administrative units focused on impact (such as centers for impact and engagement, or offices for impact and innovation) were created at many institutions. A body of knowledge on impact evaluation was soon consolidated, along with a specific vocabulary (‘a REF-able piece of research’, ‘pathways to impact’, ‘REF-readiness’ etc.) and sets of resources. Impact evaluation gave rise to a new type of specialized university employee, who in turn contributed to turning the ‘generation of impact’, as well as the collection and presentation of related data, into a veritable field of professional expertise.

In order to ensure timely delivery of CSs to REF2014, institutions established fixed procedures related to the new practice of impact evaluation (periodic monitoring of impact, reporting on impact-related activities), frames (schedules, document templates), forms of knowledge transfer (workshops on impact generation or on writing in the CS genre), data systems and repositories for logging and storing impact-related data, and finally awards and grants for those with achievements (or potential) related to impact. Consultancy companies started offering commercial services focused on research impact, catering to universities and university departments but also to governments and research councils outside the UK looking at solutions for impact evaluation. There is even an online portal with a specific focus on showcasing researchers’ impact (Impact Story).

In consequence, impact became institutionalized as yet another “box to be ticked” on the list of academic achievements, another component of “academic excellence”. Alongside the burdens connected to reporting on impact and following regulations in the area, there came also rewards. The rise of impact as a new (or newly-problematised) area of academic life opened up uncharted areas to be explored and opportunities for those who wished to prove themselves. These included jobs for those who had acquired (or could claim) expertise in the area of impact (Donovan, 2017, p. 3) and research avenues for those studying higher education and evaluation (after all, entirely new evaluation practices rarely emerge, as stressed by Power, 2015, p. 43). While much writing on the Impact Agenda highlights negative attitudes towards the exercise (Chubb et al., 2016; Sayer, 2015), equally worth noting are the opportunities that the establishment of a new element of the exercise opened up. It is the energy of all those who engage with the concept (even in a critical way) that contributes to making it visible, real and robust.

The establishment of a specialized vocabulary, of formalized requirements and procedures, the creation of dedicated impact-related positions and departments, etc. contribute to the establishment of what can be described as an ‘impact infrastructure’ (comp. Power, 2015, p. 50) or, in terms of Foucauldian governmentality theory, as an ‘apparatus’. In Foucault’s terminology, ‘apparatus’ refers to a formation which encompasses the entirety of organizing practices (rituals, mechanisms, technologies) but also assumptions, expectations and values. It is the system of relations established between discursive and non-discursive elements as diverse as “institutions, architectural forms, regulatory decisions, laws, administrative measures, scientific statements, philosophical, moral and philanthropic propositions” (Foucault, 1980, p. 194). An apparatus serves a specific strategic function, responding to an urgent need which arises at a concrete time in history—for instance, regulating the behavior of a population.

There is a crucial discursive element to all the elements of the ‘impact apparatus’. While the creation of organizational units and jobs, the establishment of procedures and regulations, participation in meetings and workshops are no doubt ‘hard facts’ of academic life, they are nevertheless brought about and made real in discursive acts of naming, defining, delimiting and evaluating. The aim of the apparatus was to support the newly-established problematization of impact. It did so by operating on many levels: first of all, and most visibly, newly-established procedures enabled a timely and organized submission to the upcoming REF. Secondly, the apparatus guided the behavior of social actors. It did so not only through directive methods (enforcing impact-related requirements) but also through nurturing attitudes and dispositions which are necessary for the notion of impact to take root in academia (for instance via impact training delivered to early-career scholars).

Interviewed actors involved in implementing the policy in institutions recognized their role in orchestrating collective learning. An interviewed impact officer stated:

My feeling is that ultimately my post should not exist. In ten or fifteen years’ time, impact officers should have embedded the message [about impact] firmly enough that they [researchers] don’t need us anymore.
Int7, impact officer, example 4

A similar vision was evoked by a HEFCE policymaker who was asked if the notion of impact had become embedded in academic institutions:

I hope [after the next edition of REF] we will be able to say that it has become embedded. I think the question then will be “have we done enough in terms of case studies? Do we need something very much lighter-touch?” “Do we need anything at all?”—that’s a question. (…) If [impact] is embedded you don’t need to talk about it.
Int20, policy-maker, example 5

Rather than being an aim in itself, the Impact Agenda is a means of altering academic culture so that institutions and individual researchers become more mindful of the societal impacts of their research. The instillment of a “new impact culture” (see Manville et al., 2014 , pp. 24–29) would ensure that academic subjects consider the question of ‘impact’ even outside of the framework of REF. The “culture shift” is to occur not just within institutions but ultimately within the subjects—it is in them that the notion of ‘impact’ has to become embedded. Hence, the final purpose of the apparatus would be to obscure the origins of the notion of ‘impact’ and the related practices, neutralizing the notion itself, and giving a guise of necessity to an evaluative reality which in fact is new and contingent.

The genre of impact case study as element of infrastructure

In this section two questions are addressed: (1) what are the features of the genre (what is it like?) and (2) what are the functions of the genre (what does it do? what vision of research does it instil?). In addressing the first question, I look at narrative patterns, as well as lexical and grammatical features of the genre. This part of the study draws on classical genre analysis (Bhatia, 1993; Swales, 1998). The second question builds on the recognition, present in discourse studies since the 1970s, that genres are not merely classes of texts with similar properties, but also veritable ‘dispositives of communication’. A genre is a means of articulation of legitimate speech; it does not just represent facts or reflect ideologies, it also acts on and alters the context in which it operates (Maingueneau, 2010, pp. 6–7). This awareness has engendered broader sociological approaches to genre which include their pragmatic functioning in institutional realities (Swales, 1998).

The genre of CS differs from other academic genres in that it did not emerge organically, but was established with a set of guidelines and a document template at a precise moment in time. The genre is partly reproductive, as it recycles existing patterns of academic texts, such as the journal article, grant application and annual review, as well as case study templates applied elsewhere. The studied corpus is strikingly uniform, testifying to an established command of the genre amongst submitting authors. Identical expressions are used to describe impact across the corpus. Only very rarely is non-standard vocabulary used (e.g., “horizontal” and “vertical” impact rather than “reach” and “significance” of impact). This coherence can be contrasted with a much more diversified corpus of impact CSs submitted in Norway to an analogous exercise (Wróblewska, 2019). The rapid consolidation of the genre in British academia can be attributed to the perceived importance of the impact evaluation exercise, which led to the establishment of an impact infrastructure, with dedicated employees tasked with instilling the ‘culture of impact’.

By its nature, the CS is a performative, persuasive genre—its purpose is to convince the ‘ideal readers’ (the evaluators) of the quality of the underpinning research and the ‘breadth and significance’ of the described impact. The main characteristics of the genre stem directly from its persuasive aim. These are discussed below in terms of narrative patterns, and grammatical and lexical features.

Narrative patterns

On the level of narrative, there is an observable reliance on a generic pattern of story-telling frequent in fiction genres, such as myths or legends, namely the Situation–Problem–Response–Evaluation (SPRE) structure (also known as the Problem–Solution pattern, see Hoey, 1994, 2001, pp. 123–124). Consider a well-known narrative that follows the SPRE pattern: a mountain ruled by a dragon (situation) which threatens the neighboring town (problem) is besieged by a group of heroes (response), leading to a happy ending or a new adventure (evaluation). Compare this to an example of the SPRE pattern in a sample impact narrative from the studied corpus:

Mosetén is an endangered language spoken by approximately 800 indigenous people (…) (SITUATION). Many Mosetén children only learn the majority language, Spanish (PROBLEM). Research at [University] has resulted in the development of language materials for the Mosetenes. (…) (RESPONSE). It has therefore had a direct influence in avoiding linguistic and cultural loss. (EVALUATION).
CS40828

The SPRE pattern is complemented by patterns of Further Impact and Further Corroboration. The first allows the narrative to be elaborated, e.g., by showing additional (positive) outcomes, so that the impact is not presented as an isolated event, but rather as the beginning of a series of collaborations, e.g.:

The research was published in [outlet] (…). This led to an invitation from the United Nations Environment Programme for [researcher](FURTHER IMPACT).

Patterns of ‘further impact’ are often built around linking words, such as: “X led to” (n = 78), “as a result” (n = 31), “leading to” (n = 24), “resulting in” (n = 13), “followed” (“X followed Y”, n = 14). Figure 1 below shows a ‘word tree’ for the frequent linking structure “led to”. The size of the terms in the diagram represents their frequencies in the corpus. Reading the word tree from left to right enables following typical sentence structures built around the ‘led to’ phrase: research led to an impact (fundamental change/development/establishment/production of…); impact “led to” further impact.

Fig. 1: Word tree with string ‘led to’, prepared with MaxQDA software. It visualises a frequent sentence structure in which research led to impact (fundamental change/development/establishment/production of…) or impact “led to” further impact.
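The phrase frequencies reported above can be reproduced, in principle, with a simple corpus tally. The following is a minimal sketch in Python; the two corpus snippets and the phrase list are invented for illustration only (the study itself used MaxQDA, whose workflow is not reproduced here):

```python
import re
from collections import Counter

def phrase_counts(texts, phrases):
    """Count whole-phrase occurrences of each linking expression across a corpus."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for phrase in phrases:
            # \b anchors ensure we match the phrase as whole words, not substrings
            counts[phrase] += len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
    return counts

# Illustrative mini-corpus standing in for the case-study texts
corpus = [
    "The findings led to a policy change, resulting in new guidance.",
    "Dissemination led to an invitation, leading to further collaboration.",
]

print(phrase_counts(corpus, ["led to", "as a result", "leading to", "resulting in"]))
```

A word tree is then essentially a visualisation of the right-hand contexts of one such phrase, grouped by shared continuations.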

The ‘Further Corroboration’ pattern provides additional information which strengthens the previously provided corroborative material:

(T)he book has been used on the (…) course rated outstanding by Ofsted, at the University [Name](FURTHER CORROBORATION).

Grammatical and lexical features

Both on a grammatical and a lexical level, there is a visible focus on numbers and size. In making the point on the breadth and significance of impact, CS authors frequently showcase (high) numbers related to the research audience (numbers of copies sold, audience sizes, downloads but also, increasingly, tweets, likes, Facebook friends and followers). Adjectives used in the CSs appear frequently in the superlative or with modifiers which intensify them: “Professor [name] undertook a major ESRC-funded project”; “[the database] now hosts one of the world’s largest and richest collections (…) of corpora”; “work which meets the highest standards of international lexicographical practice”; “this experience (…) is extremely empowering for local communities”; “Reach: Worldwide and huge”.

Use of ‘positive words’ constitutes part of the same phenomenon. These appear often in the main narrative on research and impact, and even more frequently in quoted testimonials. Research is described in the CSs as being new, unique and important with the use of words such as “innovative” (n = 29), “influential” (n = 16), “outstanding” (n = 12), “novel” (n = 10), “excellent” (n = 8), “ground-breaking” (n = 7), “tremendous” (n = 4), “path-breaking” (n = 2), etc. The same qualities are also rendered descriptively, with the use of words that can be qualified as boosters, e.g., “[the research] has enabled a complete rethink of the relationship between [areas]”; “vitally important [research]”.

Novelty of research is also frequently highlighted with the adjective “first”, appearing in the corpus 70 times. While in itself “first” is not positive or negative, it carries a strong charge in the academic world, where primacy of discovery is key. Authors often boast about having for the first time produced a type of research—“this was the first handbook of discourse studies written”…, studied a particular area—“This is the first text-oriented discourse analytic study”…, compiled a type of data—“[We] provid[ed] for the first time reliable nationwide data”; “[the] project created the first on-line database of…”, or proven a thesis: “this research was the first to show that”…

Another striking lexical characteristic of the CSs is the presence of fixed expressions in the narrative on research impact. I refer to these as ‘impact speak’. There are several collocations with ‘impact’, the most frequent being “impact on” (n = 103), followed by the ‘type’ of impact achieved (impact on knowledge), area/topic (impact on curricula) or audience (Impact on Professional Interpreters). This collocation often includes qualifiers of impact such as “significant”, “wide”, “primary”, “secondary”, “broader”, “key”, and boosters: great, positive, wide, notable, substantial, worldwide, major, fundamental, immense, etc. Impact featured in the corpus also as a verb (n = 22) in the forms “impacted” and “impacting”—e.g., “[research] has (…) impacted on public values and discourse”. This is interesting, as use of ‘impact’ as a verb is still often considered colloquial. Verb collocations with ‘impact’ are connected to achieving influence (“lead to…”, “maximize…”, “deliver impact”) and proving the existence and quality of impact (“to claim”, “to corroborate”, “to vouch for”, “to confirm” impact, to “give evidence” for impact). Another salient collocation is “pathways to impact” (n = 14), an expression describing channels of interacting with the public, in the corpus occasionally shortened to just “pathways”, e.g., “The pathways have been primarily via consultancy”. This phrase has most likely made its way into the genre of CS from the Research Councils UK ‘Pathways to Impact’ format introduced as part of grant applications in 2009 (discontinued in early 2020).

On a syntactic level, CSs are rich in parallel constructions of enumeration, for instance: “(t)ranslators, lawyers, schools, colleges and the wider public of Welsh speakers are among (…) users [of research]”; “the research has benefited a broad, international user base including endangered language speakers and community members, language activists, poets and others”; [the users of the research come] “from various countries including India, Turkey, China, South Korea, Venezuela, Uzbekistan, and Japan”. Listing, alongside providing figures, is one of the standard ways of signaling the breadth and significance of impact. Both lists and superlatives support the persuasive function of the genre. In terms of verbal forms, passive verbs are clearly favored and personal pronouns (“I”, “we”) are avoided: “research was conducted”, “advice was provided”, “contracts were undertaken”.

Vision of research promoted by the genre of CS

Impact CS is a new, influential genre which affects its academic context by celebrating and inviting a particular vision of successful research and impact. It sets a standard for capturing and describing a newly-problematized academic object. This standard will be a point of reference for future authors of CSs. Hence, it is worth taking a look at the vision of research it instils.

The SPRE pattern used in the studied CSs favors a vision of research that is linear: work proceeds from research question to results without interference. The Situation and Problem elements are underplayed in favor of elaborate descriptions of the researchers’ ‘Responses’ (research and outreach/impact activities) and flattering ‘Evaluations’ (descriptions of effects of the research and data supporting these claims). Most narratives are devoid of challenges (the ‘Problem’ element is underplayed, and possible drawbacks and failures in the research process are mentioned only sporadically). Furthermore, narratives are clearly goal-oriented: impact is shown as included in the research design from the beginning (e.g., impact is frequently mentioned already in section 2, ‘Underpinning research’, rather than in the later section ‘Details of the impact’). Elements of chance, luck and serendipity in the research process are erased—this is reinforced by the presence of the patterns of ‘further impact’ and ‘further corroboration’. As such, the bulk of the studied CSs channel a vision of what is referred to in Science Studies as ‘normal’ (deterministic, linear) science (Kuhn, 1970, pp. 10–42). From a purely literary perspective this makes for rather dull narratives: “fairy-tales of researcher-heroes… but with no dragons to be slain” (Selby, 2016).

The few CSs which do discuss obstacles in the research process or in securing impact stand out as strikingly different from the rest of the corpus. Paradoxically, while apparently ‘weakening’ the argumentation, they render it more engaging and convincing. This effect has also been observed in an analogous corpus of Norwegian CSs, which tend to problematize the pathway from research to impact to a much higher degree (Wróblewska, 2019, pp. 34–35).

The lexical and grammatical features of the CSs—the proliferation of ‘positive words’, including superlatives, and the adjective “first”—contribute to an idealization of the research process. The documents channel a vision of academia where there is no place for simply ‘good’ research—all CSs seem based on ‘excellent’ and ‘ground-breaking’ projects. The quality of research underpinning impact is recognized in CSs in a straightforward, simplistic way (citation counts, peer-reviewed papers, publications in top journals, submission to REF), which contributes to normalizing the view of research quality as easily measurable. Similarly, testimonials related to impact are not all equal. Sources of corroboration cited in CSs were carefully selected to appear prestigious and trustworthy. Testimonials and statements from high-ranking officials (but also ‘celebrities’ such as famous intellectuals or political leaders) were particularly sought-after. The end effect reinforces a solidified vision of a hierarchy of worth and trustworthiness in academia.

The prevalence of impersonal verbal forms suggests a de-personalized vision of the research process (“work was conducted”, “papers were published”, “evidence was given…”), in which individual factors such as personal aspirations, constraints or ambitions are effaced. The importance given to numbers contributes to a strengthening of a ‘quantifiable’ idea of impact. This is in line with a trend observed in academic writing in general: the inflation of ‘positive words’ (boosters and superlatives) (Vinkers et al., 2015). This tendency is amplified in the genre of CS, particularly in its British iteration. In a Norwegian corpus, claims to excellence of research and to breadth and significance of impact were significantly more modest (Wróblewska, 2019, pp. 28–30).

The genre of impact CS is a core binding component of the impact infrastructure: all the remaining elements of this formation are mutually connected by a common aim, the generation of CSs. While the CS genre, together with the encompassing impact infrastructure, is vested with a seductive/coercive force, the subjects whose work it represents and who produce it take different positions towards it.

Academics’ positioning towards the Impact Agenda

Academics position themselves towards the concept of impact in many explicit and implicit ways. ‘Positioning’ is understood here as performance-based claims to identity and subjectivity (Davies and Harré, 1990; Harré and Van Langenhove, 1998). Rejecting the idea of stable “inherent” identities, positioning theorists stress how different roles are invoked and enacted in a continuous game of positioning (oneself) and being positioned (by others). Positioning in academic contexts may take the form of indexing identities such as “professor”, “linguist”, “research manager”, “SSH scholar”, “intellectual”, “maverick”, etc. (Angermuller, 2013; Baert, 2012; Hamann, 2016; Hah, 2019, 2020). Many daily interactions which do not include explicit identity claims also involve subject positioning, as they carry value judgments, thereby also evoking counter-statements and colliding social contexts (Tirado and Galvaz, 2008, pp. 32–45).

My analysis draws attention to the process of incorporating impact into academic subjectivities. I look firstly at the mechanics of academics’ positioning towards impact: the game of opposite discursive acts of distancing and endorsement. Academics reject the notion of ‘impact’ through ironizing, stage management and the use of metaphors. Conversely, they may actively incorporate impact into their presentation of the academic ‘self’. This discursive engagement with the notion of impact can be described as ‘subjectivation’, i.e., the process whereby subjects (re)establish themselves in relation to the grid of power/knowledge in which they function (in this case the emergent ‘impact infrastructure’).

The relatively high response rate of this study (~50%) and the visible eagerness of respondents to discuss the question of impact suggest that academics respond emotionally to the topic of impact evaluation. Yet respondents visibly struggled with the notion of ‘impact’, often distancing themselves from it through discursive devices, the most salient being ironizing, the use of metaphors and stage management.

Ironizing the notion of impact

In many cases, before proceeding to explain their attitude to impact, interviewed academics elaborated on the notion of impact, explaining how the notion applied to their discipline or field and what it meant for them personally. This often meant rejecting the official definition of impact or redefining the concept. In excerpt 6, the interviewee picks up the notion:

Impact… I don’t even like the word! (…) It sounds [like] a very aggressive word, you know, impact, impact! I don’t want to impact! What you want, and what has happened with [my research] really is… more of a dialogue.
Int21, academic, example 6

Another respondent brought up the notion of impact when discussing ethical challenges arising from public dissemination of research.

When you manage to go through that and navigate successfully, and keep producing research, to be honest, that’s impact for me.
Int9, academic, example 7

An analogous distinction was made by a third respondent, who discussed the effect of his work on an area of professional activity. While, as he explained, this application of his research had been a source of personal satisfaction, he refused to describe his work in terms of ‘impact’. He stressed that the type of influence he aims for does not lend itself to producing a CS (is not ‘REF-able’):

That’s not impact in the way this government wants it! Cause I have no evidence. I just changed someone’s view. Is that impact? Yes, for me it is. But it is not impact as understood by the bloody REF.
Int3, academic, example 8

These are but three examples of many in the studied corpus where speakers take up the notion of impact to redefine or nuance it, often juxtaposing it with adjacent notions of public engagement, dissemination, outreach, social responsibility, activism etc. A previous section highlighted how the definition of impact was collectively constructed by a community in a process of problematization. The above-cited examples illustrate the reverse of this phenomenon—namely, how individual social actors actively relate to an existing notion in a process of denying, re-defining, and delimiting.

These opposite tendencies of narrowing down and widening a definition are in line with the theory of the double role of descriptions in discourse. Definitions are both constructions and constructive—while they are effects of discourse, they can also become ‘building blocks’ for ideas, identities and attitudes (Potter, 1996, p. 99). By participating in impact-related workshops, academics ‘reify’ the existing, official definition by enacting it within the impact infrastructure. The fragments cited above exemplify the opposite strategy of undermining the adequacy of the description, or ‘ironizing’ the notion (ibid., p. 107). The tension between reifying and ironizing points to the winding, conflictual nature of the process of accepting and endorsing the new ‘culture of impact’. A recognition of the multiple meanings given to the notion of ‘impact’ by policy-makers, academic managers and scholars should make us cautious about studies of attitudes towards impact which take the notion at face value.

Respondents also nuanced the notion of impact through the use of metaphors. In discourse analysis, metaphors are seen not just as stylistic devices but as vehicles for attitudes and values (Musolff, 2004, 2012). Many of the respondents made remarks on the ‘realness’ or ‘seriousness’ of the exercise, emphasizing its conventional, artificial nature. Interviewees admitted that claims made in the CSs tend to be exaggerated. At the same time, they stressed that this was in line with the convention of the genre, the nature of which was clear to authors and panelists alike. The practice of impact evaluation was frequently represented metaphorically as a game. See excerpt 9 below:

To be perfectly honest, I view the REF and all of this sort of regulatory mechanisms as something of a game that everybody has to play. The motivation [to submit to REF] was really: if they are going to make us jump through that hoop, we are clever enough to jump through any hoops that any politician can set.
Int14, academic, example 9

Regarding the relation of the narratives in the CSs to truth, see example 10:

[A CS] is creative stuff. Given that this is anonymous, I can say that it’s just creative fiction. I wouldn’t say we [authors of CSs] lie, because we don’t, but we kind of… spin. We try to show a reality which, by some stretch of imagination is there. (It’s) a truth. I’m not lying. Can it be shown in different ways? Yes, it can, and then it would be possibly less. But I choose, for obvious reasons, to say that my external funding is X million, which is a truth.
Int3, academic, example 10

The metaphors of “playing a game” and “jumping through hoops” suggest a competition which one does not enter voluntarily (“everybody has to play it”), while those of “creative fiction”, “spinning” and presenting “a truth” point to an element of power struggle over defining the rules of the game. Doing well in the exercise can mean outsmarting those who establish the framework (politicians) by “performing” particularly well. This can be achieved by eagerly fulfilling the requirements of the genre of CS while at the same time maintaining a disengaged position towards the “regulatory mechanism” of the impact infrastructure.

Stage management

Academics’ positioning towards impact also plays out through management of the ‘stage’ of discursive performance, often taking the form of frontstage and backstage markers (in the sense of Goffman’s dramaturgy; 1969, pp. 92–122). For instance, references to the confidential nature of the interview (see example 10 above) or the expression “to be perfectly honest” (example 9) are backstage markers. Most of the study’s participants have authored narratives about their work in the strict, formalized genre of CS, thereby performing on the Goffmanian ‘front stage’ for an audience composed of senior management, REF panelists and, ultimately, perhaps “politicians” or “the government”. However, when speaking in the ‘back stage’ context of an anonymous interview, many researchers actively reject the accuracy of the submitted CSs as representations of their work. Many express a nuanced, often critical, view on impact.

Respondents frequently differentiate between the way they perceive ‘impact’ on different ‘levels’, or from the viewpoint of their different ‘roles’ (scholar, research manager, citizen…). One academic can hold different (even contradictory) views on the assessment of impact. Someone who strongly criticizes the Impact Agenda as an administrative practice might be supportive of ‘impact’ on a personal level or vice versa. See the answer of a linguist asked whether ‘impact’ enters into play when he assesses the work of other academics:

When I look at other people’s work as a linguist, I don’t worry about that stuff. (…) As an administrator, I think that linguistics, like many sciences, has neglected the public. (…) At some point, when we would be talking about promotion (…) I would want to take a look at the impact of their work. (…) And that would come into my thinking in different times.
Int13, academic, example 11

Interestingly, in the studied corpus there isn’t a simple correlation between conducting research which easily ‘lends itself to impact’ and a positive overall attitude to impact evaluation.

Subjectivation

The most interesting data excerpts in this study are perhaps the ones where respondents wittingly or unwittingly expose their hesitations, uncertainties and struggles in positioning themselves towards the concept of impact. In theoretical terms, these can be interpreted as symptoms of an ongoing process of ‘subjectivation’.

‘Subjectivation’ is another concept rooted in Foucauldian governmentality theory. According to Foucault, individuals come to the ‘truth’ about their subjectivity by actively relating to a pre-existing set of codes, patterns, rules and rituals suggested by their culture or social group (Castellani, 1999, pp. 257–258; Foucault, 1988, p. 11). The term ‘subjectivation’ refers to the process in which individuals establish themselves in relation to the grid of power/knowledge in which they function. This includes actions subjects take on their performance, competences, attitudes, self-esteem, desires, etc. in order to improve, regulate or reform themselves (Dean, 1999, p. 20; Lemke, 2002; Rose, 1999, p. xii).

Academics often distance themselves from the assessment exercise, as shown in previous sections. And yet, the data hints that having taken part in the evaluation and engaged with the impact infrastructure was not without influence on the way they present their research, also in non-official, non-evaluative contexts, such as the research interview. This effect is visible in vocabulary choices—interviewees routinely spoke about ‘pathways to impact’, ‘impact generation’, ‘REF-ability’, etc. ‘Impact speak’ has made its way into everyday, casual academic conversations. Beyond changes to vocabulary, there is a deeper-running process—the discursive work of reframing one’s research in view of the evaluation exercise and in its terms. Many respondents seemed to adjust the presentation of their research, its focus and aims, when the topic of REF surfaced in the exchange. Interestingly, such shifts occurred even in the case of respondents who did not submit to the exercise, for instance because they were already retired, or because they refused to take part in it. For those who did submit CSs to REF, the experience of having re-framed the narrative of their research in this new genre often had a tremendous effect.

Presented below is the example of a scholar who did not initially volunteer to submit a CS, and was reluctant to take part when encouraged by a supervisor. During the interview the respondent distanced herself from the exercise and the concept of impact through the discursive devices of ironizing, metaphors, stage management and humor. The respondent was consistently critical towards impact in the course of the interview. The researcher therefore expected a firm negative answer to the final question: “did the exercise affect your perception of your work?”. See excerpt 13 below for the respondent’s somewhat surprising answer.

Do you know what? It did, it did, it did. Almost a kind of a massive influence it had. Maybe this is the answer that you didn’t see coming ((laughing)). (…) It did [have an influence] but maybe from a different route as for people who were signed up for [the REF submission] from the outset. (…) When I saw this [CS narrative] being shaped up and people [who gave testimonies] I kind of thought: goodness me! And there were other moving things.
Int21, academic, example 13

Through the preparation of the CS and particularly through familiarizing herself with the underpinning testimonials, the respondent gained greater awareness of an area of practice which was influenced by her research. The interviewee’s attitude changed not only in the course of the evaluation exercise, but also—as if mirroring this process—during the interview. In both cases, elements which were up to that moment implicit (the response of end-users of the work, the researcher’s own emotional response to the exercise and to the written-up narrative of her impact) were made explicit. It is the process of recounting one’s story in a different framework, according to other norms and values (and in a different genre) that triggers the process of subjectivation. This example of a change of attitude in an initially reluctant subject demonstrates the difficulty in opposing the overwhelming force of the impact infrastructure, particularly in view of the (sometimes unexpected) rewards that it offers.

Many respondents found taking part in the REF submission—including the discursive work on the narrative of their research—an exhausting experience. In some cases however, the process of reshaping one’s academic identity triggered by the Agenda was a welcome development. Several interviewees claimed that the exercise valorized their extra-academic involvement which previously went unnoticed at their department. These scholars embraced the genre of CS as an opportunity to present their impact-related activities as an inherent part of their academic work. One academic stated:

At last, I can take my academic identity and my activist identity and roll them up into one.
Int11, academic, example 14

Existing studies have focused on situating academics’ attitudes towards the Impact Agenda on a positive–negative scale (e.g., Chubb et al., 2016), and on divergences depending on career stage, disciplinary affiliation, etc. (Chikoore, 2016; Chikoore and Probets, 2016; Weinstein et al., 2019). My data shows that there are many dimensions to each academic’s view of impact. Scholars have complex (sometimes even contradictory) views on ‘impact’, and the discursive work of incorporating impact into a coherent academic ‘self’ is ongoing. While an often overwhelming ‘impact infrastructure’ looms over professional discursive positioning practices, academic subjects are by no means passive recipients of governmental new-managerial policies. On the contrary, they are agents actively involved in accepting, rejecting and negotiating them on a local level—both in front-stage and back-stage contexts.

Looking at the front stage, most CSs seem compliant in their eagerness to demonstrate impact in all its breadth and significance. The documents showcase large numbers and data once considered trivial in the academic context (Facebook likes, Twitter followers, endorsement of celebrities…) and faithfully follow the policy documents in adopting ‘impact speak’. Interviews with academics paint a different picture: the respondents may be playing according to the rules of the evaluation “game”, but they are playing consciously, often in an emotionally detached, distanced manner. Other scholars adjust to the regulations not in the name of compliance, but in view of an alignment between the goals of the Agenda and their personal ones. Finally, some academics perceive the evaluation of impact as an opportunity to re-position themselves professionally or to re-claim areas of activity long considered non-essential for an academic career, such as public engagement, outreach and activism.

Concluding remarks

The initial, dynamic phases of the introduction of impact to British academia represent, in terms of Foucauldian theory, the phase of ‘emergence’. This notion draws attention to the moment when discursive concepts (‘impact’, ‘impact case study’…) surface and consolidate. It is in these terms that the previously non-regulated area of academic activity will thereafter be described, assessed and evaluated. New notions, definitions and procedures related to impact and the genre of CS will continue to circulate, emerging in other evaluation exercises, at other institutions, in other countries.

The stage of emergence is characterized by a struggle of forces, an often violent conflict between opposing ideas—“it is their eruption, the leap from the wings to centre stage” (Foucault, 1984, p. 84). The shape that an emergent idea will eventually take is the effect of clashes of these forces, and it does not fully depend on any one of them. Importantly, emergence is merely “the entry of forces” (p. 84), “not the final term of historical development” (p. 83). For Foucault, a concept, at its inception, is essentially an empty word which addresses the needs of a field that is being problematized and satisfies the powers which target it. A problematization (of an object, practice or area of activity) is a response to particular desires or problems—these constitute an instigation, but do not determine the shape of the problematization. As Foucault puts it, “to one single set of difficulties, several responses can be made” (2003, p. 24).

With the emergence of the Impact Agenda, an area of activity which has always existed (the collaboration of academics with the non-academic world) was targeted, delimited and described with new notions in a process of problematization. The notion of ‘impact’, together with the genre created for capturing it, became the core of an administrative machinery—the impact infrastructure. This was a new reality that academics had to come to terms with quickly, positioning themselves towards it in a process of subjectivation.

The run-up to REF2014 was a crucial and defining phase, but it was only the first stage of a longer process—the emergence of the concept of ‘impact’ and the establishment of the basic rules which would govern its generation, documentation and evaluation. Let us recall Foucault’s argument that “rules are empty in themselves, violent and unfinalized; they are impersonal and can be bent to any purpose. The successes of history belong to those who are capable of seizing these rules”… (pp. 85–86). The rules embodied in the REF guidelines, the new genre of CS, the principles of ‘impact speak’ were in the first instance still “empty and unfinalized”. It was up to those subject to the rules to fill them with meaning.

The data analyzed in this study shows that despite dealing with a new powerful problematization and functioning in the framework of a complex infrastructure, academics continue to be active and highly reflective subjects, who discursively negotiate key concepts of the impact infrastructure and their own position within it. It will be fascinating to study the emergence of analogous evaluation systems in other countries and institutions. ‘Impact infrastructure’ and ‘genre’ are two excellent starting points for an analysis of ensuing changes to academic realities and subjectivities.

Data availability

The interview data analyzed in this paper is not publicly available due to its confidential nature. It can be made available by the corresponding author in anonymised form on reasonable request. The cited case studies were sourced from the REF database (https://www.ref.ac.uk/2014/) and may be consulted online. The coded dataset is considered part of the analysis (and hence protected by copyright), but may be made available on reasonable request.

Most of the studied documents—71 CSs—were submitted to Unit of Assessment (UoA) 28—Linguistics and Modern Languages; the remaining seven were submitted to five different UoAs but fall within the field of linguistics.

Some interviewees were involved in REF in more than one role. ‘Authors’ of CSs authored the documents to differing degrees; some (n = 5) were also engaged in the evaluation process in managerial roles.

Words underlined in interview excerpts were stressed by the speaker.

When citing interview data, I give the number attributed to the individual interview in the corpus, the type of interviewee and the number of the cited example.

‘Apparatus’ is one of the existing translations of the French ‘dispositif’; others are ‘historical construct’ (Sembou, 2015, p. 38) and ‘grid of intelligibility’ (Dreyfus and Rabinow, 1983, p. 121). The French original is also sometimes used in English texts. In this paper, I use ‘apparatus’ and ‘infrastructure’, as the notion of ‘infrastructure’ has already become current in referring to resources dedicated to impact generation at universities, both in scholarly literature (Power, 2015) and in managerial ‘impact speak’.

A full version of the analysis may be found in Wróblewska (2018).

CS numbers are those found in the REF impact case study base: https://impact.ref.ac.uk/casestudies/ . I only provide CS numbers for cited fragments of one sentence or longer; exact sources for cited phrases may be given on request or easily identified in the CS database.

The figures given for appearances of certain elements of the genre in the studied corpus are drawn from the computer-assisted qualitative analysis conducted with MaxQDA software. They serve to illustrate the relative frequency of particular elements for the reader, but since they are not the result of a rigorous corpus-analytical study of a larger body of CSs, no claim to statistical relevance is made.

Words underlined in CS excerpts are emphasized by the author of the analysis.

Number of occurrences of the string ‘the first’ in the context of quality of research, excluding phrases like “the first workshop took place…” etc.

Angermuller J (2013) How to become an academic philosopher: academic discourse as a multileveled positioning practice. Sociol Hist 2:263–289


Angermuller J (2014a) Poststructuralist discourse analysis. Subjectivity in enunciative pragmatics. Palgrave Macmillan, Houndmills/Basingstoke

Angermuller J (2014b) Subject positions in polyphonic discourse. In: Angermuller J, Maingueneau D, Wodak R (eds) The Discourse Studies Reader. Main currents in theory and analysis. John Benjamins Publishing Company, Amsterdam/Philadelphia, pp 176–186

Baert P (2012) Positioning theory and intellectual interventions. J Theory Soc Behav 42(3):304–324


Benneworth P, Gulbrandsen M, Hazelkorn E (2016) The impact and future of arts and humanities research. Palgrave Macmillan, London


Bhatia VK (1993) Analysing genre: language use in professional settings. Longman, London

Bulaitis Z (2017) Measuring impact in the humanities: Learning from accountability and economics in a contemporary history of cultural value. Pal Commun 3(7). https://doi.org/10.1057/s41599-017-0002-7

Cameron L, Maslen R, Todd Z, Maule J, Stratton P, Stanley N (2009) The discourse dynamics approach to metaphor and metaphor-led discourse analysis. Metaphor Symbol 24(2):63–89. https://doi.org/10.1080/10926480902830821

Castellani B (1999) Michel Foucault and symbolic interactionism: the making of a new theory of interaction. Stud Symbolic Interact 22:247–272

Chikoore L (2016) Perceptions, motivations and behaviours towards ‘research impact’: a cross-disciplinary perspective. Loughborough University. Loughborough University Institutional Repository. https://dspace.lboro.ac.uk/2134/22942 . Accessed 30 Dec 2020

Chikoore L, Probets S (2016) How are UK academics engaging the public with their research? a cross-disciplinary perspective. High Educ Q 70(2):145–169. https://doi.org/10.1111/hequ.12088

Chubb J, Watermeyer R, Wakeling P (2016) Fear and loathing in the Academy? The role of emotion in response to an impact agenda in the UK and Australia. High Educ Res Dev 36(3):555–568. https://doi.org/10.1080/07294360.2017.1288709

Chubb J, Watermeyer R (2017) Artifice or integrity in the marketization of research impact? Investigating the moral economy of (pathways to) impact statements within research funding proposals in the UK and Australia. Stud High Educ 42(12):2360–2372

Davies B, Harré R (1990) Positioning: the discursive production of selves. J Theory Soc Behav 20(1):43–63

Dean MM (1999) Governmentality: power and rule in modern society. SAGE Publications, Thousand Oaks, California

Derrick G (2018) The evaluators’ eye: Impact assessment and academic peer review. Palgrave Macmillan, London

Donovan C (2008) The Australian Research Quality Framework: A live experiment in capturing the social, economic, environmental, and cultural returns of publicly funded research. New Dir for Eval 118:47–60. https://doi.org/10.1002/ev.260

Donovan C (2011) State of the art in assessing research impact: introduction to a special issue. Res. Eval. 20(3):175–179. https://doi.org/10.3152/095820211X13118583635918

Donovan C (2017) For ethical ‘impactology’. J Responsible Innov 6(1):78–83. https://doi.org/10.1080/23299460.2017.1300756

Dreyfus HL, Rabinow P (1983) Michel Foucault: beyond structuralism and hermeneutics. University of Chicago Press, Chicago

European Science Foundation (2012) The Challenges of Impact Assessment. Working Group 2: Impact Assessment. ESF Archives. http://archives.esf.org/index.php?eID=tx_nawsecuredl&u=0&g=0&t=1609373495&hash=08da8bb115e95209bcea2af78de6e84c0052f3c8&file=/fileadmin/be_user/CEO_Unit/MO_FORA/MOFORUM_Eval_PFR__II_/Publications/WG2_new.pdf . Accessed 30 Dec 2020

Fairclough N (1989) Language and power. Longman, London/New York

Fairclough N (1992) Discourse and social change. Polity Press, Cambridge, UK/Cambridge

Fairclough N (1993) Critical discourse analysis and the marketization of public discourse: The Universities. Discourse Soc 4(2):133–168

Fairclough N, Mulderrig J, Wodak R (1997) Critical discourse analysis. In: Van Dijk TA (ed) Discourse studies: a multidisciplinary introduction. SAGE Publications Ltd, New York, pp. 258–284

Foucault M (1980) The confession of the flesh. In: Gordon C (ed) Power/knowledge: selected interviews and other writings 1972–1977. Vintage Books, New York

Foucault M (1984) Nietzsche, genealogy, history. In: Rabinow P (ed) The Foucault Reader. Pantheon Books, New York

Foucault M (1988) Politics, philosophy, culture: Interviews and other writings, 1977–1984. Routledge, New York

Foucault M (1990) The use of pleasure. The history of sexuality, vol. 2. Vintage Books, New York

Gee J (2015) Social linguistics and literacies ideology in discourses. Taylor and Francis, Florence

Gilbert GN, Mulkay M (1984) Opening Pandora’s Box: a sociological analysis of scientists’ discourse. Cambridge University Press, Cambridge

Goffman E (1969) The presentation of self in everyday life. Allen Lane/The Penguin Press, London

Grant J, Brutscher PB, Kirk S, Butler L, Wooding S (2009) Capturing research impacts. A review of international practice. RAND Corporation, RAND Europe. https://www.rand.org/content/dam/rand/pubs/documented_briefings/2010/RAND_DB578.pdf . Accessed 30 Dec 2020

Gunn A, Mintrom M (2016) Higher education policy change in Europe: academic research funding and the impact agenda. Eur Educ 48(4):241–257. https://doi.org/10.1080/10564934.2016.1237703

Gunn A, Mintrom M (2018) Measuring research impact in Australia. Aust Universit Rev 60(1):9–15

Hah S (2019) Disciplinary positioning struggles: perspectives from early career academics. J Appl Linguist Prof Pract 12(2). https://doi.org/10.1558/jalpp.32820

Hah S (2020) Valuation discourses and disciplinary positioning struggles of academic researchers–a case study of ‘maverick’ academics. Pal Commun 6(1):1–11. https://doi.org/10.1057/s41599-020-0427-2

Hamann J (2016) “Let us salute one of our kind.” How academic obituaries consecrate research biographies. Poetics 56:1–14. https://doi.org/10.1016/j.poetic.2016.02.005

Harré R, Van Langenhove L (1998) Positioning theory: moral contexts of international action. Wiley-Blackwell, Chichester

HEFCE (2015) Research Excellence Framework 2014: Manager’s report. HEFCE. https://www.ref.ac.uk/2014/media/ref/content/pub/REF_managers_report.pdf . Accessed 30 Dec 2020

HEFCE (2011) Assessment framework and guidance on submissions. HEFCE: https://www.ref.ac.uk/2014/media/ref/content/pub/assessmentframeworkandguidanceonsubmissions/GOS%20including%20addendum.pdf . Accessed 30 Dec 2020

Hoey M (1994) Signalling in discourse: A functional analysis of a common discourse pattern in written and spoken English. In: Coulthard M (ed) Advances in written text analysis. Routledge, London

Hoey M (2001) Textual interaction: an introduction to written discourse analysis. Routledge, London

Hong Kong University Grants Committee (2018) Research Assessment Exercise 2020. Draft General Panel Guidelines. UGC. https://www.ugc.edu.hk/doc/eng/ugc/rae/2020/draft_gpg_feb18.pdf Accessed 30 Dec 2020

Hyland K (2009) Academic discourse English in a global context. Continuum, London

King’s College London and Digital Science (2015) The nature, scale and beneficiaries of research impact: an initial analysis of Research Excellence Framework (REF) 2014 impact case studies. Dera: http://dera.ioe.ac.uk/22540/1/Analysis_of_REF_impact.pdf . Accessed 30 Dec 2020

Kuhn TS (1970) The structure of scientific revolutions. University of Chicago Press, Chicago

Lemke T (2002) Foucault, governmentality, and critique. Rethink Marx 14(3):49–64

Maingueneau D (2010) Le discours politique et son « environnement ». Mots. Les langages du politique 94. https://doi.org/10.4000/mots.19868

Manville C, Jones MM, Frearson M, Castle-Clarke S, Henham ML, Gunashekar S, Grant J (2014) Preparing impact submissions for REF 2014: An evaluation. Findings and observations. RAND Corporation: https://www.rand.org/pubs/research_reports/RR726.html . Accessed 30 Dec 2020

Manville C, Guthrie S, Henham ML, Garrod B, Sousa S, Kirtley A, Castle-Clarke S, Ling T (2015) Assessing impact submissions for REF 2014: an evaluation. Rand Corporation. https://www.rand.org/content/dam/rand/pubs/research_reports/RR1000/RR1032/RAND_RR1032.pdf . Accessed 30 Dec 2020

Myers G (1985) Texts as knowledge claims: the social construction of two biology articles. Soc Stud Sci 15(4):593–630

Myers G (1989) The pragmatics of politeness in scientific articles. Appl Linguist 10(1):1–35


Musolff A (2004) Metaphor and political discourse. Analogical reasoning in debates about Europe. Palgrave Macmillan, Basingstoke

Musolff A (2012) The study of metaphor as part of critical discourse analysis. Crit. Discourse Stud. 9(3):301–310. https://doi.org/10.1080/17405904.2012.688300

National Co-ordinating Centre For Public Engagement (2014) After the REF-Taking Stock: summary of feedback. NCCFPE. https://www.publicengagement.ac.uk/sites/default/files/publication/nccpe_after_the_ref_write_up_final.pdf . Accessed 30 Dec 2020

Potter J (1996) Representing reality: discourse, rhetoric and social construction. Sage, London

Power M (2015) How accounting begins: object formation and the accretion of infrastructure. Account Org Soc 47:43–55. https://doi.org/10.1016/j.aos.2015.10.005

Research Council of Norway (2017) Evaluation of the Humanities in Norway. Report from the Principal Evaluation Committee. The Research Council of Norway. Evaluation Division for Science. RCN. https://www.forskningsradet.no/siteassets/publikasjoner/1254027749230.pdf . Accessed 30 Dec 2020

Research Council of Norway (2018) Evaluation of the Social Sciences in Norway. Report from the Principal Evaluation Committee. The Research Council of Norway. Division for Science and the Research System. RCN. https://www.forskningsradet.no/siteassets/publikasjoner/1254035773885.pdf . Accessed 30 Dec 2020

Robinson D (2013) Introducing performative pragmatics. Routledge, London/New York

Rose N (1999) Governing the soul: the shaping of the private self. Free Association Books, Sidmouth

Sayer D (2015) Rank hypocrisies: the insult of the REF. Sage, Thousand Oaks

Selby J (2016) Critical IR and the Impact Agenda. Paper presented at the Pais Impact Conference, Warwick University, Coventry, 22–23 November 2016

Sembou E (2015) Hegel’s Phenomenology and Foucault’s Genealogy. Routledge, New York

Stern N (2016) Building on Success and Learning from Experience. an Independent Review of the Research Excellence Framework. Department for Business, Energy and Industrial Strategy. Assets Publishing Service. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/541338/ind-16-9-ref-stern-review.pdf . Accessed 30 Dec 2020

Swales JM (1998) Other floors, other voices: a textography of a small university building. Routledge, London/New York

Swales JM (1990) Genre analysis: English in academic and research settings. Cambridge University Press, Cambridge

Swales JM (2011) Aspects of Article Introductions. University of Michigan Press, Ann Arbor

Tirado F, Gálvez A (2008) Positioning theory and discourse analysis: some tools for social interaction analysis. Historical Social Res 8(2):224–251

Vinkers CH, Tijdink JK, Otte WM (2015) Use of positive and negative words in scientific PubMed abstracts between 1974 and 2014: retrospective analysis. BMJ 351:h6467. https://doi.org/10.1136/bmj.h6467


VSNU–Association of Universities in the Netherlands (2016) Standard Evaluation Protocol (SEP). Protocol for Research Assessments in the Netherlands. VSNU. https://vsnu.nl/files/documenten/Domeinen/Onderzoek/SEP2015-2021.pdf . Accessed 30 Dec 2020

Watermeyer R (2012) From engagement to impact? Articulating the public value of academic research. Tertiary Educ Manag 18(2):115–130. https://doi.org/10.1080/13583883.2011.641578

Watermeyer R (2014) Issues in the articulation of ‘impact’: the responses of UK academics to ‘impact’ as a new measure of research assessment. Stud High Educ 39(2):359–377. https://doi.org/10.1080/03075079.2012.709490

Watermeyer R (2016) Impact in the REF: issues and obstacles. Stud High Educ 41(2):199–214. https://doi.org/10.1080/03075079.2014.915303

Warner M (2015) Learning my lesson. London Rev Books 37(6):8–14

Weinstein N, Wilsdon J, Chubb J, Haddock G (2019) The Real-time REF review: a pilot study to examine the feasibility of a longitudinal evaluation of perceptions and attitudes towards REF 2021. SocArXiv: https://osf.io/preprints/socarxiv/78aqu/ . Accessed 30 Dec 2020

Wróblewska MN, Angermuller J (2017) Dyskurs akademicki jako praktyka społeczna. Zwrot dyskursywny i społeczne badania szkolnictwa wyższego. Kultura–Społeczeństwo–Edukacja 12(2):105–128. https://doi.org/10.14746/kse.2017.12.5

Wróblewska MN (2017) Ewaluacja wpływu społecznego nauki. Przykład REF 2014 a kontekst polski. Nauka i Szkolnictwo Wyższe 49(1):79–104. https://doi.org/10.14746/nisw.2017.1.5

Wróblewska MN (2018) The making of the Impact Agenda. A study in discourse and governmentality. Unpublished doctoral dissertation. Warwick University

Wróblewska MN (2019) Impact evaluation in Norway and in the UK: A comparative study, based on REF 2014 and Humeval 2015-2017. ENRESSH working paper series 1. University of Twente Research Information. https://ris.utwente.nl/ws/portalfiles/portal/102033214/ENRESSH_01_2019.pdf . Accessed 30 Dec 2020


Acknowledgements

I wish to thank Prof. Johannes Angermuller, the supervisor of the doctoral dissertation in which many of the ideas discussed in this paper were first presented. Prof. Angermuller’s guidance and support were essential for the development of my understanding of the importance of discourse in evaluative contexts. I also thank the reviewers of the aforementioned thesis, Prof. Jo Angouri and Prof. Srikant Sarangi, for their feedback, which helped me develop and clarify the concepts which I use in my analysis, as well as its presentation. Any errors or omissions are of course my own. The research presented in this paper received funding from the European Research Council (DISCONEX project 313172). The underpinning research was also facilitated by the author’s membership in the EU COST Action “European Network for Research Evaluation in the Social Sciences and the Humanities” (ENRESSH CA15137-E). The advice and encouragement received from the late Prof. Paul Benneworth were particularly invaluable.

Author information

Authors and Affiliations

University of Warwick, Coventry, UK

Marta Natalia Wróblewska

National Centre for Research and Development–NCBR, Warsaw, Poland


Corresponding author

Correspondence to Marta Natalia Wróblewska .

Ethics declarations

Competing interests.

The author declares no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Wróblewska, M.N. Research impact evaluation and academic discourse. Humanit Soc Sci Commun 8, 58 (2021). https://doi.org/10.1057/s41599-021-00727-8


Received : 12 May 2020

Accepted : 11 January 2021

Published : 02 March 2021

DOI : https://doi.org/10.1057/s41599-021-00727-8



Ensuring bachelor’s thesis assessment quality: a case study at one Dutch research university

Higher Education Evaluation and Development

ISSN : 2514-5789

Article publication date: 30 May 2023

In the Netherlands, thesis assessment quality is a growing concern for the national accreditation organization due to increasing student numbers and supervisor workload. However, the accreditation framework lacks guidance on how to meet quality standards. This study aims to address these issues by sharing our experience, identifying problems and proposing guidelines for quality assurance for a thesis assessment system.

Design/methodology/approach

This study has two parts. The first part is a narrative literature review conducted to derive guidelines for thesis assessment based on observations made at four Dutch universities. The second part is a case study conducted in one bachelor’s psychology-related program, where the assessment practitioners and the vice program director analyzed the assessment documents based on the guidelines developed from the literature review.

The findings of this study include a list of guidelines based on the four standards. The case study results showed that the program meets most of the guidelines, as it has a comprehensive set of thesis learning outcomes, peer coaching for novice supervisors, clear and complete assessment information and procedures for both examiners and students, and a concise assessment form.

Originality/value

This study is original in that it demonstrates how to holistically ensure the quality of thesis assessments by considering the context of the program and paying more attention to validity (e.g. program curriculum and assessment design), transparency (e.g. integrating assessment into the supervision process) and the assessment expertise of teaching staff.

  • Quality assurance
  • Accreditation
  • Thesis assessment

Hsiao, Y.-P. (A.), van de Watering, G., Heitbrink, M., Vlas, H. and Chiu, M.-S. (2023), "Ensuring bachelor’s thesis assessment quality: a case study at one Dutch research university", Higher Education Evaluation and Development, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/HEED-08-2022-0033

Emerald Publishing Limited

Copyright © 2023, Ya-Ping (Amy) Hsiao, Gerard van de Watering, Marthe Heitbrink, Helma Vlas and Mei-Shiu Chiu

Published in Higher Education Evaluation and Development . Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode

Introduction

According to data from the Universities of the Netherlands, the number of bachelor’s students at Dutch research universities steadily increased from 2015 to 2021 [1], leading to an increased workload for teaching staff due to the need for greater supervision of students [2]. This increased supervision is particularly evident in the supervision of students’ final projects. In the Netherlands, students can begin working on their final projects in the final year of their program’s curriculum once they pass the first-year diploma (the so-called Propaedeutic phase, based on a positive binding study advice, BSA), earn a required number of European Credit Transfer and Accumulation System (ECTS) credits and meet other requirements. A bachelor’s degree is awarded when a student has “demonstrated by the results of tests, the final projects, and the performance of graduates in actual practice or in postgraduate programmes” (The Accreditation Organisation of the Netherlands and Flanders [Nederlands-Vlaamse Accreditatieorganisatie], hereinafter abbreviated as the NVAO, 2018, p. 34).

The Bachelor’s thesis is the culmination of the Bachelor’s programme. A Bachelor’s thesis is carried out in the form of a research project within a department. It is an opportunity to put the knowledge learned during the programme into practice. The Bachelor’s thesis is used to assess the student’s initiative and their ability to plan, report and present a project. The difficulty level of the thesis is described by the attainment targets of the programme and the modules followed up until that moment. Students work independently on a Bachelor’s thesis or Individual Assignment (IOO) under the guidance of a supervisor.

This definition highlights the pedagogical value of the thesis (i.e. the opportunity to carry out an independent project) and the purpose of thesis assessment (i.e. to determine the extent to which the intended learning outcomes have been achieved). While this definition acknowledges the importance of a bachelor’s thesis, relatively little research has examined the quality of undergraduate thesis assessment (Hand and Clewes, 2000; Shay, 2005; Webster et al., 2000; Todd et al., 2004), let alone in the Dutch context, where thesis supervisors and examiners of bachelor’s students are experiencing an increasing workload.

In recent years, the Dutch government has placed increasing emphasis on assessment quality in higher education (Inspectorate of Education [Inspectie van het Onderwijs], 2016). The NVAO has established the Assessment Framework for the Higher Education Accreditation System of the Netherlands (hereinafter abbreviated as the Framework, NVAO, 2018). The standards for the accreditation of initial and existing study programs emphasize whether a program has established an adequate student assessment system that appropriately assesses the intended learning outcomes (NVAO, 2018). According to the quality standards of the Framework, thesis assessment should be valid, reliable, transparent and independent. Assessment literature in the higher education context has defined these criteria as follows (e.g. Biggs and Tang, 2007; Bloxham and Boyd, 2007). Validity refers to the extent to which an assessment accurately measures what it is intended to measure. Reliability refers to the consistency of the assessment results, or how well they accurately reflect a student’s actual achievement level. Transparency is the clarity and specificity with which assessment information is communicated to both students and examiners. Independence is a necessary condition for ensuring the validity and reliability of an assessment, as it requires that examiners remain objective in the assessment process.

Despite the inclusion of these standards in the Framework (NVAO, 2018), official guidance on establishing a quality system for assessing graduation projects that test achievement of the exit level of a study program at Dutch research universities is limited. As assessment practitioners (the first four authors of this article), we have found that it is often unclear to a program’s curriculum and/or management team how to establish appropriate thesis assessment procedures at the undergraduate level that meet the NVAO’s quality standards. We hope that our experience can provide valuable insights and guidance for programs seeking to ensure quality assurance for thesis assessment.

Aims and research questions

The purpose of this study is to share our experience and the challenges we faced during internal and external quality assurance processes of thesis assessment. Based on these challenges, we conducted a narrative literature review to develop a set of guidelines for ensuring thesis assessment quality that aligns with the four standards outlined in the Framework (NVAO, 2018): (1) intended learning outcomes, (2) teaching and learning environment, (3) assessment and (4) achieved learning outcomes. To illustrate the application of these guidelines, we present a case study of bachelor’s thesis assessment practices at one Dutch research university.

(1) What are the guidelines for ensuring the quality of thesis assessment procedures that meet the standards specified in the Framework?

(2) How can these guidelines be applied to evaluate the quality of thesis assessment in a study program?

It is important to note that this study is limited to the context of four Dutch research universities, where we encountered common issues during internal quality assurance processes of thesis assessment. Our goal is to share our experience and offer insights that could be useful to other institutions seeking to ensure the quality of thesis assessment. We do not intend to assume that these problems are present at all Dutch research universities.

Problems and guidelines in meeting the four standards

According to the didactic principle of constructive alignment (Biggs and Tang, 2007), which is commonly used in Dutch higher education, the three educational processes (teaching, learning and assessment) should be aligned with the intended learning outcomes. We begin with Standards 1 and 2, which set out the conditions under which thesis assessment takes place, and then place more emphasis on Standards 3 and 4, which focus on the quality criteria for thesis assessment.

Standard 1: intended learning outcomes

To ensure that a study program meets Standard 1 of the Dutch Qualification Framework (NLQF, 2008), the intended learning outcomes for graduates in specific subject areas and qualifications are typically developed using the Dublin Descriptors (Bologna Working Group, 2005), which provide generic statements of competencies and attributes. However, it is often assumed that a thesis should assess all of these program learning outcomes (PLOs), since it is intended to evaluate the achieved learning outcomes at the exit level. Unfortunately, these PLOs can be global and unclear, which can confuse students and hinder them from understanding the expectations for thesis assessment. Our observation is that programs often utilize PLOs as thesis learning outcomes (TLOs), although a thesis is not equivalent to the entire program curriculum.

According to Biggs and Tang (2007), it is important for teachers to first clearly define the learning outcomes before designing instructional activities to guide students toward achieving them. In addition, the outcomes at the program and course levels (i.e. a thesis is also a course) should also be constructively aligned, and the course-level outcomes should be specific to the context of the course. Therefore, to design effective thesis activities (such as supervision) and develop assessment criteria, it would be more pedagogically valuable to formulate thesis-specific learning outcomes and explain how they contribute to the PLOs and Dublin Descriptors, rather than directly using the PLOs for thesis assessment.

In addition, a thesis course often involves most of the teaching staff in the program. Therefore, it is important to establish clear and specific expectations for what students should achieve at the end of a bachelor’s thesis course (Willison and O'Regan, 2006; Todd et al., 2004), such as the scope and type of research (e.g. scaffolded or self-initiated), integrating disciplinary knowledge and research skills from earlier program curriculum, demonstrating critical thinking through well-supported arguments and developing independent learning skills for future work (Willison and O'Regan, 2006).

Standard 2: teaching-learning environment

According to Standard 2 of the Dutch Qualification Framework (NLQF, 2008), the teaching and learning environment should be designed to help students achieve the intended learning outcomes of the program curriculum. However, our experience has revealed problems in this area. In informal discussions with thesis supervisors, we have found that students often report feeling unprepared for a bachelor’s thesis because academic and research skills such as communication, information seeking and methodology have not been adequately taught or practiced. Conversely, many teachers in the program believe they have covered these skills in their courses. Furthermore, during thesis calibration sessions, we have observed that novice examiners lack expertise due to insufficient experience in research education, a lack of training as thesis examiners and unclear instructions on thesis assessment procedures.

To meet Standard 2, we recommend the following two guidelines. First, as suggested by research on curriculum alignment ( Wijngaards-de Meij and Merx, 2018 ) and research skills development ( Willison, 2012 ; Reguant et al ., 2018 ), the program-level curriculum design should arrange domain-specific subjects in a logical order and gradually develop students’ research, communication and independent learning skills so that they are well prepared to work on the thesis. At the same time, universities should focus on converting teaching staff’s research experience into research education expertise ( Maxwell and Smyth, 2011 ) for the long term.

Second, the program should ensure the quality of the teaching staff because examiners’ practices are crucial for the quality of thesis assessment ( Golding et al ., 2014 ; Kiley and Mullins, 2004 ; Mullins and Kiley, 2002 ). According to the literature, thesis examiners should receive sufficient instructions and training on how to grade a thesis ( Hand and Clewes, 2000 ; Kiley and Mullins, 2004 ). In addition, the university should provide teaching staff with written instructions to regulate and communicate thesis assessment procedures for supervisors, examiners and students, as well as assessment training on using the assessment forms and holding calibration sessions to achieve consistency in interpreting criteria and grade points. The literature on how supporting teaching staff in assessment practices contributes to consistency is discussed further in the section on Reliability.

Standards 3 and 4: student assessment and achieved learning outcomes

Ensuring validity starts with clearly defining what the assessment is intended to measure. According to the definition of validity and the principle of constructive alignment (Biggs and Tang, 2007), thesis assessment should be aligned with the learning outcomes.

We have identified two problems in this regard. The first is the use of a generic assessment form with a uniform set of criteria across different programs within the same department or school. We believe this practice does not follow the principle of constructive alignment (Biggs and Tang, 2007). In particular, the same assessment form cannot be used directly for the different degree levels (i.e. Bachelor, Master and PhD) distinguished by the Dublin Descriptors: a generic form would struggle to assess the different levels of cognitive demand and skill required at each level. For example, the concept of “originality” is defined very differently at each degree level, and this should be reflected in the assessment criteria.

The second problem is the quality of the assessment form itself. We have observed the following issues: (1) some criteria are not always directly relevant to the TLOs, (2) the assessment form only lists the names of criteria without defining them or providing specific indicators for each criterion, (3) it is unclear whether different criteria are given equal weight and (4) it is unclear how the final grade is determined (e.g. whether each criterion must be “sufficient” or “passing”).

To address these problems, we recommend the following guidelines. The assessment criteria listed in the form should align with the TLOs and should describe the characteristics of student work that provide relevant, representative and important evidence of attainment of the learning outcomes (Brookhart, 2013, 2018; Walvoord and Anderson, 2011). Beyond aligning the criteria with the outcomes, the quality of the criteria also affects what is actually being assessed. The criteria should avoid vagueness that invites multiple interpretations of quality indicators (Biggs and Tang, 2007; Bloxham et al., 2011; Hand and Clewes, 2000; Webster et al., 2000). To ensure that the assessment measures what it is intended to measure, the criteria should meet five requirements (Brookhart, 2013, 2018; Walvoord and Anderson, 2011): they should be definable, observable, distinct from one another, complete and able to support descriptions along a continuum of quality.

Another important aspect of validity is the weighting of multiple assessment criteria. The weighting should reflect the relative importance of the criteria based on the disciplinary focus of the study program. For example, the criterion of “method and data analysis” might carry more weight in psychology than it would in philosophy.
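Disciplinary weighting of this kind can be made explicit and auditable. The following is a minimal sketch; the criterion names, weights and 1-10 rating scale are hypothetical illustrations, not taken from any program discussed in this study.

```python
# Hypothetical sketch: combining criterion ratings into a weighted thesis
# grade. Criteria, weights and the 1-10 scale are illustrative only.

def weighted_grade(ratings, weights):
    """Return the weighted mean of criterion ratings."""
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same criteria")
    total_weight = sum(weights.values())
    return sum(ratings[c] * weights[c] for c in ratings) / total_weight

# A psychology program might weight "method and data analysis" more
# heavily than a philosophy program would.
psychology_weights = {"theory": 0.25, "method_and_analysis": 0.45, "writing": 0.30}
ratings = {"theory": 7, "method_and_analysis": 8, "writing": 6}

print(round(weighted_grade(ratings, psychology_weights), 2))
```

Publishing such a weighting scheme alongside the assessment form would let examiners and students verify exactly how criterion ratings translate into a grade.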

Reliability and independence

Reliability is a necessary condition for validity and refers to the consistency of assessment results. It is important because it allows us to interpret assessment results with confidence as reflecting students’ true performance on a thesis.

Independence between examiners is necessary to ensure the reliability (or objectivity) of the assessment process, as it prevents examiners from influencing each other’s judgment. Independent grading is often specified in the Education and Examination Regulations of an institution.

Intra-rater reliability refers to the consistency of a single examiner’s grading process over time. Inconsistencies may occur due to internal influences rather than true differences in student performance. We have observed inconsistencies in completed assessment forms, including discrepancies between comments and scores given by the same examiner across different student theses.

In practice, examiners may arrive at a thesis grade through one of three grading approaches:

Analytical: Examiners assign a rating to each criterion and then determine a thesis grade based on the grading guidelines.

Analytical and then holistic: Examiners assign a rating to each criterion and then determine a thesis grade based on the grading guidelines. If the thesis grade does not match the holistic judgment, examiners adjust the ratings of the criteria.

Holistic and then analytical: Examiners first form an initial grade in their mind based on holistic judgment. They then assign a rating to each criterion and determine a thesis grade based on the grading guidelines. If the thesis grade differs from the initial grade, examiners adjust the ratings of the criteria so that the two grades match.

To ensure intra-rater reliability, it is essential to clearly define each criterion to prevent multiple interpretations by examiners. Additionally, examiners should be provided with bias-reduction training ( Wylie and Szpara, 2004 ) to make them aware of potential biases, such as supervisor bias ( Bettany-Saltikov et al ., 2009 ; McQuade et al ., 2020 ; Nyamapfene, 2012 ), and to take actions to prevent them. During the grading process, examiners should also consistently revisit the established criteria and level descriptors to maintain consistency.

To improve inter-rater reliability, the literature suggests establishing standard assessment procedures and improving examiners’ assessment practices ( Hand and Clewes, 2000 ; Kiley and Mullins, 2004 ; Pathirage et al ., 2007 ). Standard assessment procedures should clearly outline the process for considering the relative importance of multiple criteria and the relative importance of various indicators within a criterion ( Hand and Clewes, 2000 ; Bloxham et al ., 2016a ; Pathirage et al ., 2007 ; Webster et al ., 2000 ). To improve examiners’ assessment practices, common approaches include providing examiners with the following three processes ( Sadler, 2013 ):

Prior to grading, examiners should develop a shared understanding of the expectations for each criterion and score level. This can be achieved through the use of anchor or exemplar theses, which are previously graded theses that illustrate the characteristics of each score level (Osborn Popp et al., 2009). Examiners can refer to these anchor theses as they grade to ensure that they are accurately distinguishing between the different score levels. It should also be clear to examiners how to complete the grading form and whether they are allowed to discuss with other examiners during the grading process (Pathirage et al., 2007; Dierick et al., 2002).

During the grading process, moderation refers to the process of two examiners arriving at a collective thesis grade ( Bloxham et al ., 2016b ). It is important to have clear instructions on how to control evaluative judgments and stay within reasonable limits during the moderation process. Examiners should also be informed of score resolution methods in case of large discrepancies between their scores, as averaging the scores may not be sufficient in such cases ( Johnson et al ., 2005 ; Sadler, 2013 ). If a third examiner is involved in the moderation process, it should be clear who is qualified for this task and how their results are used to determine the final thesis grade ( Johnson et al ., 2005 ).

As a “post-judgment” process, calibration is the act of ensuring that examiners grade student work against the agreed quality criteria and “how a particular level of quality should be represented” ( Sadler, 2013 , p. 6). It can be helpful to think of calibration as similar to checking the accuracy of a weighing scale by comparing it to a standard and making adjustments to bring it into alignment. In a similar vein, the thesis assessment form (including criteria and score-level descriptors) and examiners’ assessment practices should be calibrated, particularly when there are significant changes in thesis assessment procedures. As noted by Sadler (2013) , high-quality evaluative judgments also require the development of “calibrated” academics who serve not only as custodians of quality criteria and level standards but also as consultants for novice and short-term examiners. Calibration can be implemented alongside the normal grading period as part of an internal quality assurance system ( Andriessen and Manders, 2013 ; Bergwerff and Klaren, 2016 ).
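The score-resolution step described above can be sketched as a simple decision rule. The tolerance and the 1-10 scale below are hypothetical illustrations, not values from Johnson et al. (2005) or from any specific regulation; the point is that a program can state its resolution policy precisely enough to be mechanically checkable.

```python
# Hypothetical score-resolution rule: small discrepancies are averaged
# after moderation; large ones are escalated to a qualified third
# examiner, since averaging alone may not be defensible there.

def resolve_scores(first, second, max_gap=1.0):
    """Resolve two examiners' grades on a 1-10 scale.

    Returns (grade, action): the averaged grade when the gap is within
    max_gap, otherwise no grade plus an escalation flag.
    """
    if abs(first - second) <= max_gap:
        return (first + second) / 2, "averaged"
    return None, "escalate_to_third_examiner"

print(resolve_scores(7.0, 7.5))  # small gap: averaging is acceptable
print(resolve_scores(5.0, 8.0))  # large gap: escalate instead of averaging
```

Writing the policy down in this form also forces the program to answer the questions raised above: who qualifies as the third examiner, and how their score enters the final grade.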

Transparency

Transparency in assessment has received increasing attention in higher education in recent years ( Bamber, 2015 ; Bell et al ., 2013 ; O'Donovan et al ., 2004 ; Price, 2005 ). It refers to making the perceptions and expectations of assessors, including requirements, standards and assessment criteria, known and understood by all participants, particularly students ( O'Donovan et al ., 2004 ).

To ensure transparency in thesis assessment, it is not enough to simply provide students with assessment forms and instructions on assessment procedures. Our observations indicate that without discussing the deeper meaning of criteria and standards, there is a risk that examiners and students will interpret them differently.

To address this issue, it is important to foster shared understanding and promote assessment for learning and feedback on progress. This can be achieved by helping students develop their understanding of the quality criteria and standards through observation, discussion and imitation of good-quality theses ( Malcolm, 2020 ). Using anchor theses ( Orsmond et al ., 2002 ; Sadler, 1987 ) and involving students in peer review and grading of each other’s theses using the criteria ( O'Donovan et al ., 2004 ; Rust et al ., 2003 ) can be effective ways to do this.

To ensure transparency, supervisors should use the assessment form not only for thesis examination but also during supervising activities, and should clearly explain the criteria and score levels to their students using anchor theses for illustration ( O'Donovan et al ., 2004 ; Rust et al ., 2003 ).

Overview of guidelines

Standard 1 – intended learning outcomes

Formulate program-specific TLOs.

Standard 2 – teaching-learning environment

Thesis assessment should be appropriate for the program curriculum and assessment plan.

The program should ensure examiners’ assessment expertise by providing training or instructions.

Standards 3 and 4 – student assessment and achieved learning outcomes

TLOs, thesis supervision and thesis assessment should be constructively aligned.

The assessment criteria should be clearly defined and meet quality requirements. The weighting of multiple criteria should reflect the relative importance of TLOs.

Reliability and independence

Intra-rater reliability: Examiners should revisit the established criteria to ensure consistency and strive to prevent any possible assessor bias.

Inter-rater reliability:

○ The program should make assessment procedures consistent across examiners.

○ The program should improve examiners’ assessment practices through the use of anchor or exemplary theses, moderation prior to and during assessment practices, and calibration after thesis assessment.

Transparency

The program should inform students of what is expected of them and how their thesis will be assessed.

The program should instruct supervisors to explicitly use the criteria during supervising activities.

To illustrate the application of these guidelines, we present a case study of a psychology-related bachelor’s program at a Dutch research university. We chose to focus on this program because all of the authors have experience in quality assurance at various psychology programs. The documents for this case study were provided by one of the co-authors, who played a significant role in the quality assurance of assessment at the program. These documents include the program’s learning outcomes, a thesis handbook, a thesis assessment form, grading instructions for examiners and a self-assessment report (which includes reflections on the four standards of the Framework and is required to be submitted to the NVAO before a site visit).

Four of the authors and the vice program director (as a self-reflection exercise) examined these documents and answered open-ended questions derived from the guidelines in Box 1 . The findings were then structured based on the guidelines in Box 1 .

Motivation for participating in this study

Improving the quality of the assessment criteria to prevent multiple interpretations by examiners.

Clearly defining the roles, tasks and responsibilities of supervisors (as the first examiner) and the second examiner.

The vice program director indicated that the assessment form is still in development and that it is a dynamic improvement process, based on examiners’ accumulated experience and feedback from supervisors, examiners, students and assessment specialists.

Brief course description of the bachelor’s thesis

In this thesis course, students perform a study that covers the entire empirical research cycle, from developing a specific research question to using theory to answer the question and testing the theory through data collection. They integrate knowledge from various disciplines and practice conducting research on a technology-related problem. Students may collaborate in groups for literature search or data collection, but they must formulate a specific question to be answered in their individually written bachelor’s thesis.

Standard 1 – intended learning outcomes

PLO1 – Competent in scientific disciplines

PLO2 – Competent in doing research

PLO3 – Competent in designing

PLO4 – Use of a scientific approach

PLO5 – Basic intellectual skills

PLO6 – Competent in cooperating and communicating

PLO7 – Take into account the temporal, technological and social context

TLO1 – formulate a research question fitted to the problem and relevant scholarly literature (PLO1,2)

TLO2 – conduct a literature search (PLO1,2,3,4,6)

TLO3 – apply and modify relevant scientific theory in order to solve a technology-related problem (PLO1,2,4,5,7)

TLO4 – make an adequate research design for empirical research (PLO2,3,4)

TLO5 – apply relevant scientific methods for empirical research (PLO1,2,3,4,5)

TLO6 – relate interpretation of data to theory and to design and/or policy recommendations (PLO1,2,3,4,5,7)

TLO7 – individually write a scientific report (PLO5,6)

TLO8 – reflect and think systematically (PLO5,6,7)

We conclude that TLOs contribute to the development of all seven competences outlined in the PLOs, as well as the five components of the Dublin Descriptors.
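This coverage claim can be checked mechanically. The sketch below encodes the TLO-to-PLO mapping exactly as listed above and verifies that every one of the seven PLOs is supported by at least one TLO; the check is illustrative tooling, not part of the program's own procedures.

```python
# Coverage check for the TLO-to-PLO mapping listed above: every one of
# the seven PLOs should be supported by at least one TLO.

tlo_to_plos = {
    "TLO1": {1, 2},
    "TLO2": {1, 2, 3, 4, 6},
    "TLO3": {1, 2, 4, 5, 7},
    "TLO4": {2, 3, 4},
    "TLO5": {1, 2, 3, 4, 5},
    "TLO6": {1, 2, 3, 4, 5, 7},
    "TLO7": {5, 6},
    "TLO8": {5, 6, 7},
}

covered = set().union(*tlo_to_plos.values())
missing = set(range(1, 8)) - covered
print(sorted(covered), missing)  # all seven PLOs covered, none missing
```

The same mapping could be re-run whenever TLOs are revised, so that a PLO is never silently left without supporting thesis outcomes.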

Standard 2 – teaching-learning environment

The bachelor’s thesis builds upon the knowledge and skills developed in previous courses. According to the curriculum and program assessment plan, student skills progress from year 1 to year 3 and are assessed through various types of assessment, such as presentations, reports and reflective writing. However, no specific learning trajectory for academic and research skills is available.

To ensure student readiness for working independently on their thesis, students must have passed the propaedeutic phase and obtained a required number of ECTS upon enrolment in the bachelor’s thesis course. They must also have passed the two methods courses.

Written instructions, including a detailed explanation of assessment procedures, criteria and rubrics, are provided in a thesis handbook for supervisors, examiners and students.

The program requires novice examiners to go through an “examiner internship” with senior examiners (mentors). They are guided and monitored by their mentors when assessing graduation theses in their first year of practice. They can directly approach mentors when encountering problems during supervision and assessment.

C1 – Abstract (TLO7,8)

C2 – Introduction/Theory (TLO1,2,3,8)

C3 – Method and results (TLO2,4,5,6)

C4 – Discussion (TLO1,2,3,6,8)

C5 – Writing style (TLO7)

C6 – Process/Work attitude (TLO7,8)

Each criterion on the assessment form includes a short definition and a number of indicators, which are graded using a five-point rating scale (Poor–Insufficient–Sufficient–Good–Very good). It is required that qualitative comments be added to all of the criteria.

It is not clear how each criterion is weighted.

It is not clear how the ratings of multiple indicators and criteria are aggregated to determine the total grade.

Although a rating scale is provided, score-level descriptors are not available. It is not clear whether the indicators describe the “Very good” or “Sufficient” score level.

These issues correspond to areas that the program is currently working to improve, as mentioned at the beginning of this section.
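One way to remove the ambiguities noted above is to make the aggregation rule explicit. The sketch below is a hypothetical illustration, assuming an invented point mapping for the program's five-point scale and an "every criterion at least Sufficient" pass requirement; neither choice is the program's actual rule.

```python
# Hypothetical explicit aggregation rule for the five-point rating scale.
# The point values and the pass requirement are illustrative choices.

SCALE = {"Poor": 1, "Insufficient": 2, "Sufficient": 3, "Good": 4, "Very good": 5}

def aggregate(ratings):
    """Average the criterion ratings, and fail the thesis outright if
    any criterion is rated below Sufficient."""
    points = [SCALE[r] for r in ratings.values()]
    passed = all(p >= SCALE["Sufficient"] for p in points)
    return sum(points) / len(points), passed

ratings = {"C1": "Good", "C2": "Very good", "C3": "Sufficient",
           "C4": "Good", "C5": "Sufficient", "C6": "Good"}
print(aggregate(ratings))
```

Stating the rule at this level of precision answers, in one place, how criteria are weighted, how ratings are combined and what each score level must contribute to a passing grade.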

Reliability

New examiners receive one day of training, in which they practice assessing theses with the rubric and discuss their practice results with senior examiners. They also receive guidance on how to use the criteria during the supervision process.

The first and second examiners assess the thesis independently using the same rubric and register their initial grading results separately in the administration system.

It is obligatory for both examiners to hold a moderation meeting in order to arrive at collective grading results. In this meeting, they go through each criterion and discuss the differences. Then they register the collective results in the administration system, which generates the thesis grade.

When the discrepancies between the two examiners cannot be resolved during the meeting, both examiners register their results in the administration system. A subcommittee of the Examination Board is then informed and carries out additional grading. The members of the subcommittee are senior examiners, often the mentors assigned to novice examiners during the examiner internship.

There are no institution-wide guidelines on the moderation and calibration process. These quality assurance processes are organized by study programs. How they are implemented depends on the available resources, assessment expertise and time per study program.

Although no formal calibration procedure has been established, the subcommittee regularly regrades a sample of borderline theses around the fail/pass boundary, theses with a resit, and theses for which the two examiners differed substantially in their initial grading. In addition, the subcommittee holds regular plenary meetings to discuss assessment practices and reports its findings regularly to the Examination Board.

After the assessment, both examiners and students are asked to fill out a survey evaluating the use of the rubric and the assessment procedures. The results are used to improve the quality of the rubric.

These procedures are in line with most of our guidelines. Still, we suggest that the subcommittee systematically analyse the findings of its regrading practices and act on them in order to complete the quality assurance cycle. In addition, drawing on lessons learned at one university, we highly recommend that the Examination Board or the program carry out a regular review of completed assessment forms to detect assessor bias and thus safeguard intra-rater reliability.

The program has established clear guidelines on how to ensure transparency. At the beginning of the final project, an information session is organized to explain the supervision and assessment procedures and rules to students. It is made clear what the roles, tasks and responsibilities of the supervisor, examiner and student are, in what way the thesis is assessed, and what is assessed (i.e. the criteria in the rubric). The criteria, and the indicators per criterion, are explained in detail in this session.

The program also makes it clear that the criteria should be used from the beginning and during the supervision activities, as well as in the assessment process. Supervisors are instructed to formulate feedback based on the criteria.

To sum up, this case study shows that the program’s thesis assessment practices apply most of the guidelines suggested in this study.

Conclusion and discussion

This study presents problems encountered from a practitioner’s perspective and derives guidelines from the literature to address them. These guidelines cover the entire education process, taking the context of the program into account. They not only explain how to meet the quality criteria of validity, reliability, transparency and independence but also describe the conditions that increase the likelihood of meeting these criteria, such as examiners’ assessment expertise and how the institution should facilitate its development. The case study demonstrates how these guidelines were applied to examine thesis assessment practices in a psychology-related bachelor’s program at a Dutch research university.

Our experience highlights the importance of applying the didactic principle of constructive alignment at the exit level: despite its widespread use in course-level instructional design, it is not always clear to teaching staff what constructive alignment means in the context of thesis assessment or how it can be used to meet the four standards of the Framework. As noted by Webster et al. (2000), this has led to a focus on reliability, for example revising thesis assessment forms and ensuring consistency among examiners. Our study aims to draw program teams’ attention to validity by considering the program’s curriculum and assessment design and the didactic purpose of using a thesis as a graduation project.

While other studies have focused on specific thesis assessment quality criteria such as reliability (e.g. Pathirage et al., 2007), transparency (e.g. Malcolm, 2020) and independence (e.g. Todd et al., 2004; Nyamapfene, 2012), our case study shows how to ensure all of these criteria and carry out a complete quality assurance process. This does not mean that a program needs to address all of them at the same time. Instead, we want to emphasize the importance of research education in a bachelor’s program and recommend that the program align its thesis assessment design with its curriculum design for research education (i.e. as a learning trajectory) and its overall assessment design. Improving thesis assessment alone is not sufficient for students to achieve the intended learning outcomes of the program.

A final, and perhaps the most important, aspect to consider is how to effectively use limited resources to improve teaching staff’s assessment expertise so that they can continuously contribute to the improvement of thesis assessment practices. The guidelines presented in this study can be further developed or adapted as training materials for teaching staff.

Limitations

We would like to acknowledge two limitations of this study. First, unlike more traditional research methods such as surveys and interviews, the problems we reported here were compiled from various sources at four Dutch research universities. Without a more rigorous synthesis of these sources, it is possible that there may be some subjectivity and selection bias present. Second, the guidelines we derived from a narrative review of these problem topics may not include all relevant references.

It is important to note that our use of a single psychology-related bachelor’s program for the case study does not allow us to generalize our findings to all bachelor’s psychology programs at Dutch research universities. Rather, our aim is to share our experience and research-informed guidelines and to examine thesis assessment quality from a practitioner perspective. In line with the goals of Koris and Pello’s (2022) article, we intend to gradually find solutions appropriate for our context through several subsequent iterations.

https://www.universiteitenvannederland.nl/en_GB/f_c_ingeschreven_studenten.html

https://www.universiteitenvannederland.nl/en_GB/reduce-work-pressure#eerste

Andriessen, D. and Manders, P. (2013), Beoordelen Is Mensenwerk [Evaluation Is Human Work], Vereniging Hogescholen, Den Haag.

Bamber, M. (2015), “The impact on stakeholder confidence of increased transparency in the examination assessment process”, Assessment and Evaluation in Higher Education, Vol. 40, pp. 471-487.

Bell, A., Mladenovic, R. and Price, M. (2013), “Students' perceptions of the usefulness of marking guides, grade descriptors and annotated exemplars”, Assessment and Evaluation in Higher Education, Vol. 38, pp. 769-788.

Bergwerff, M. and Klaren, M. (2016), Kwaliteitsborging Toetsing: Een Handreiking Voor Examencommissies [Quality Assurance of Assessment: A Guide for Examination Boards], Leiden University, Leiden.

Bettany-Saltikov, J., Kilinc, S. and Stow, K. (2009), “Bones, boys, bombs and booze: an exploratory study of the reliability of marking dissertations across disciplines”, Assessment and Evaluation in Higher Education, Vol. 34, pp. 621-639.

Biggs, J. and Tang, C. (2007), Teaching for Quality Learning at University, Open University Press, New York, NY.

Bloxham, S. and Boyd, P. (2007), Developing Effective Assessment in Higher Education: A Practical Guide, McGraw-Hill Education, New York, NY.

Bloxham, S., Boyd, P. and Orr, S. (2011), “Mark my words: the role of assessment criteria in UK higher education grading practices”, Studies in Higher Education, Vol. 36, pp. 655-670.

Bloxham, S., den-Outer, B., Hudson, J. and Price, M. (2016a), “Let's stop the pretence of consistent marking: exploring the multiple limitations of assessment criteria”, Assessment and Evaluation in Higher Education, Vol. 41, pp. 466-481.

Bloxham, S., Hughes, C. and Adie, L. (2016b), “What's the point of moderation? A discussion of the purposes achieved through contemporary moderation practices”, Assessment and Evaluation in Higher Education, Vol. 41, pp. 638-653.

Bologna Working Group (2005), A Framework for Qualifications of the European Higher Education Area. Bologna Working Group Report on Qualifications Frameworks, Danish Ministry of Science, Technology and Innovation, Copenhagen.

Brookhart, S.M. (2013), How to Create and Use Rubrics for Formative Assessment and Grading, Association for Supervision and Curriculum Development (ASCD), Alexandria, VA.

Brookhart, S.M. (2018), “Appropriate criteria: key to effective rubrics”, Frontiers in Education, Vol. 3, available at: https://www.frontiersin.org/article/10.3389/feduc.2018.00022 (accessed 10 April 2018).

Dierick, S., van de Watering, G.A. and Muijtjens, A. (2002), “De actuele kwaliteit van assessment: ontwikkelingen in de edumetrie” [The current quality of assessment: developments in edumetrics], in Dochy, F., Heylen, L. and van de Mosselaer, H. (Eds), Assessment in onderwijs: Nieuwe toetsvormen en examinering in studentgericht onderwijs en competentiegericht onderwijs [Assessment in education: new forms of testing and examination in student-centred and competence-based education], Boom Lemma Uitgevers, Amsterdam.

Golding, C., Sharmini, S. and Lazarovitch, A. (2014), “What examiners do: what thesis students should know”, Assessment and Evaluation in Higher Education, Vol. 39, pp. 563-576.

Hand, L. and Clewes, D. (2000), “Marking the difference: an investigation of the criteria used for assessing undergraduate dissertations in a business school”, Assessment and Evaluation in Higher Education, Vol. 25, pp. 5-21.

Hsiao, Y.P. and Verhagen, M. (2018), “Examining the quality and use of the grading form to assess undergraduate theses”, 9th Biennial Conference of EARLI SIG 1: Assessment and Evaluation, Helsinki, Finland.

Inspectorate of Education [Inspectie van het Onderwijs] (2016), De kwaliteit van de toetsing in het hoger onderwijs [The Assessment Quality in Higher Education], Ministry of Education, Culture and Science [Ministerie van Onderwijs, Cultuur en Wetenschap], Utrecht.

Johnson, R.L., Penny, J., Gordon, B., Shumate, S.R. and Fisher, S.P. (2005), “Resolving score differences in the rating of writing samples: does discussion improve the accuracy of scores?”, Language Assessment Quarterly, Vol. 2, pp. 117-146.

Kiley, M. and Mullins, G. (2004), “Examining the examiners: how inexperienced examiners approach the assessment of research theses”, International Journal of Educational Research, Vol. 41, pp. 121-135.

Koris, R. and Pello, R. (2022), “We cannot agree to disagree: ensuring consistency, transparency and fairness across bachelor thesis writing, supervision and evaluation”, Assessment and Evaluation in Higher Education, pp. 1-12.

Malcolm, M. (2020), “The challenge of achieving transparency in undergraduate honours-level dissertation supervision”, Teaching in Higher Education, Vol. 28 No. 1, pp. 1-17.

Maxwell, T.W. and Smyth, R. (2011), “Higher degree research supervision: from practice toward theory”, Higher Education Research and Development, Vol. 30, pp. 219-231.

McQuade, R., Kometa, S., Brown, J., Bevitt, D. and Hall, J. (2020), “Research project assessments and supervisor marking: maintaining academic rigour through robust reconciliation processes”, Assessment and Evaluation in Higher Education, Vol. 45, pp. 1181-1191.

Mullins, G. and Kiley, M. (2002), “'It's a PhD, not a Nobel Prize': how experienced examiners assess research theses”, Studies in Higher Education, Vol. 27, pp. 369-386.

NLQF (2008), The Higher Education Qualifications Framework in the Netherlands, a Presentation for Compatibility with the Framework for Qualifications of the European Higher Education Area, Dutch Qualifications Framework (NLQF), ’s-Hertogenbosch, The Netherlands.

NVAO (2008), The Higher Education Qualifications Framework in the Netherlands: A Presentation for Compatibility with the Framework for Qualifications of the European Higher Education Area, Accreditation Organisation of the Netherlands and Flanders (NVAO), Den Haag.

NVAO (2018), Assessment Framework for the Higher Education Accreditation System of the Netherlands, Accreditation Organisation of the Netherlands and Flanders (NVAO), Den Haag.

Nyamapfene, A. (2012), “Involving supervisors in assessing undergraduate student projects: is double marking robust?”, Engineering Education, Vol. 7, pp. 40-47.

O'Donovan, B., Price, M. and Rust, C. (2004), “Know what I mean? Enhancing student understanding of assessment standards and criteria”, Teaching in Higher Education, Vol. 9, pp. 325-335.

Orsmond, P., Merry, S. and Reiling, K. (2002), “The use of exemplars and formative feedback when using student derived marking criteria in peer and self-assessment”, Assessment and Evaluation in Higher Education, Vol. 27, pp. 309-323.

Osborn Popp, S.E., Ryan, J.M. and Thompson, M.S. (2009), “The critical role of anchor paper selection in writing assessment”, Applied Measurement in Education, Vol. 22, pp. 255-271.

Pathirage, C., Haigh, R., Amaratunga, D. and Baldry, D. (2007), “Enhancing the quality and consistency of undergraduate dissertation assessment: a case study”, Quality Assurance in Education, Vol. 15, pp. 271-286.

Price, M. (2005), “Assessment standards: the role of communities of practice and the scholarship of assessment”, Assessment and Evaluation in Higher Education, Vol. 30, pp. 215-230.

Reguant, M., Martínez-Olmo, F. and Contreras-Higuera, W. (2018), “Supervisors' perceptions of research competencies in the final-year project”, Educational Research, Vol. 60, pp. 113-129.

Rust, C., Price, M. and O'Donovan, B. (2003), “Improving students' learning by developing their understanding of assessment criteria and processes”, Assessment and Evaluation in Higher Education, Vol. 28, pp. 147-164.

Sadler, D.R. (1987), “Specifying and promulgating achievement standards”, Oxford Review of Education, Vol. 13, pp. 191-209.

Sadler , D.R. ( 2013 ), “ Assuring academic achievement standards: from moderation to calibration ”, Assessment in Education: Principles, Policy and Practice , Vol.  20 , pp.  5 - 19 .

Shay , S. ( 2005 ), “ The assessment of complex tasks: a double reading ”, Studies in Higher Education , Vol.  30 , pp.  663 - 679 .

Todd , M. , Bannister , P. and Clegg , S. ( 2004 ), “ Independent inquiry and the undergraduate dissertation: perceptions and experiences of final-year social science students ”, Assessment and Evaluation in Higher Education , Vol.  29 , pp.  335 - 355 .

University of Twente ( 2019 ), The Bachelor's Thesis [Online] . University of Twente , Enschede, The Netherlands , available at: https://www.utwente.nl/en/bee/programme/thesis/ (accessed 8 September 2019) .

Walvoord , B.E. and Anderson , V.J. ( 2011 ), Effective Grading: A Tool for Learning and Assessment in College , Jossey-Bass , San Francisco, CA .

Webster , F. , Pepper , D. and Jenkins , A. ( 2000 ), “ Assessing the undergraduate dissertation ”, Assessment and Evaluation in Higher Education , Vol.  25 , pp.  71 - 80 .

Wijngaards-de Meij , L. and Merx , S. ( 2018 ), “ Improving curriculum alignment and achieving learning goals by making the curriculum visible ”, International Journal for Academic Development , Vol.  23 , pp.  219 - 231 .

Willison , J.W. ( 2012 ), “ When academics integrate research skill development in the curriculum ”, Higher Education Research and Development , Vol.  31 , pp.  905 - 919 .

Willison , J.W. and O'Regan , K. ( 2006 ), Research Skill Development Framework [Online] , The University of Adelaide , Adelaide , available at: https://www.adelaide.edu.au/rsd/ ( accessed 19th November 2019 ).

Wylie , E.C. and Szpara , M.Y. ( 2004 ), National Board for Professional Teaching Standards Bias-Reduction Training: Impact on Assessors’ Awareness , ETS Research Report Series , Princeton, NJ , Vol.  2004 , p. i - 55 .

Acknowledgements

The authors are grateful to the reviewers for their thorough review and valuable feedback, which allowed the authors to improve the quality of the manuscript. The authors appreciate the time and effort they put into the review process.

Funding: This work was supported by National Chengchi University (DZ15-B4). The funder provided financial support only and did not influence the research process, from study design to submission. The authors are fully responsible for the content of the paper.

About the authors

Ya-Ping (Amy) Hsiao is an assessment specialist and teacher trainer at Tilburg University. Her current research focuses on reflection, portfolios and performance assessment in graduation projects.

Gerard van de Watering is a policy advisor at Eindhoven University of Technology. His research and development interests focus on assessment and evaluation, student-centred learning environments, independent learning and study skills. He is also the founder of a network of assessment specialists in academic higher education in the Netherlands.

Marthe Heitbrink is a testing and assessment coordinator at the Psychology department of the University of Amsterdam.

Helma Vlas is an educational consultant, teacher trainer/assessor and assessment specialist at the University of Twente. She is stationed at the Centre of Expertise in Learning and Teaching. She is coordinator of the Senior Examination Qualification trajectory at the University of Twente.

Mei-Shiu Chiu is a full professor of Education at National Chengchi University in Taiwan. Her research interests focus on interactions between emotion/affect, cognition and culture for diverse knowledge domains (e.g. mathematics, science and energy) in relation to teaching, assessment and large-scale databases.


E-assessment challenges during e-learning in higher education: A case study

  • Published: 06 January 2024

  • Yazid Meftah Ali Wahas (ORCID: orcid.org/0000-0002-6646-5279) &
  • Akbar Joseph A. Syed

Technology has become a fundamental means of enabling reliable and more effective assessment. Rapid technological developments have led to the widespread use of digital platforms and devices in all aspects of life. Educational institutions worldwide had to take advantage of this technological leap during the COVID-19 pandemic, which changed the shape of higher education and prompted institutions around the globe to adopt online learning as a new form of teaching. E-learning has ushered in a revolution in the educational process, and e-assessment has become a significant e-learning tool in many parts of the world, serving as an alternative means of evaluating students’ performance. E-assessment has many advantages: it is reliable, flexible, and accessible from many devices. However, it is unfamiliar to both teachers and students and vulnerable to piracy, cheating, and impersonation. This study therefore investigates the challenges of e-assessment faced by teachers and students during e-learning at Aligarh Muslim University (AMU), India. The theory of planned behavior (TPB) was applied to test participants’ attitudes toward implementing e-assessment during online learning. The study used a quantitative method; an online questionnaire was delivered to 120 participants. The survey addressed three domains: (1) technological and technical challenges, (2) teachers’ challenges, and (3) students’ challenges. The findings showed that both teachers and students were unfamiliar with this type of assessment, as they were using it for the first time.
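The domain-level analysis of such a questionnaire can be sketched in a few lines: Likert-scale responses are grouped by the survey's three domains and aggregated into a mean score per domain. All item scores and respondent counts below are invented for illustration; they are not the study's data.

```python
from statistics import mean

# Hypothetical 1-5 Likert responses: each inner list is one respondent's
# item scores for that domain (all values invented for illustration).
responses = {
    "technological": [[4, 5, 4], [3, 4, 4], [5, 4, 3]],
    "teachers":      [[2, 3, 3], [3, 3, 4], [2, 2, 3]],
    "students":      [[4, 4, 5], [5, 4, 4], [3, 4, 4]],
}

def domain_means(data):
    """Mean agreement score per survey domain across all respondents."""
    return {
        domain: round(mean(s for respondent in items for s in respondent), 2)
        for domain, items in data.items()
    }

means = domain_means(responses)  # one summary number per domain
```

A real analysis would also handle reverse-coded items and missing responses, but the aggregation step is the same.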



Data availability

The data used in this study are available from the corresponding author on reasonable request.


Author information

Authors and affiliations

Hajjah University, Hajjah City, Yemen

Yazid Meftah Ali Wahas

Aligarh Muslim University, Aligarh City, UP, India

Akbar Joseph A. Syed


Corresponding author

Correspondence to Yazid Meftah Ali Wahas.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Wahas, Y.M.A., Syed, A.J.A. E-assessment challenges during e-learning in higher education: A case study. Educ Inf Technol (2024). https://doi.org/10.1007/s10639-023-12421-0

Received : 22 May 2023

Accepted : 12 December 2023

Published : 06 January 2024

DOI : https://doi.org/10.1007/s10639-023-12421-0


  • E-assessment
  • COVID-19 pandemic
  • Aligarh Muslim University
  • Open access
  • Published: 03 April 2024

Application of flipped classroom teaching method based on ADDIE concept in clinical teaching for neurology residents

  • Juan Zhang,
  • Hong Chen,
  • Xie Wang,
  • Xiaofeng Huang &
  • Daojun Xie

BMC Medical Education, volume 24, Article number: 366 (2024)


As an important medical personnel training system in China, standardized residency training plays an important role in enriching residents’ clinical experience and improving both their ability to communicate with patients and their clinical expertise. The difficulty of teaching neurology lies in the many types of diseases, complicated conditions, and strong specialization involved, which places higher demands on residents’ independent learning ability, the cultivation of critical thinking, and learning outcomes. Based on the ADDIE (Analysis-Design-Development-Implementation-Evaluation) concept, this study combines the theory and clinical practice of the flipped classroom teaching method and evaluates its teaching effect, so as to provide a basis and reference for implementing the flipped classroom in future neurology residency training.

The participants were 90 neurology residents undergoing standardized training in our hospital in the classes of 2019 and 2020. The 90 residents were divided into a control group and an observation group of 45 each using the random number table method. The control group used traditional teaching methods, including problem-based learning (PBL), case-based learning (CBL), and lecture-based learning (LBL). The observation group adopted the flipped classroom teaching method based on the ADDIE teaching concept. A unified assessment of the residents’ learning outcomes was conducted before they left the department in the fourth week, covering theoretical and skill knowledge, independent learning ability, critical thinking ability, and clinical practice ability. Finally, the overall quality of teaching was assessed.

The theoretical and clinical skills assessment scores of the observation group were significantly higher than those of the control group (P < 0.001). The observation group’s scores for independent learning ability and critical thinking ability were also better than the control group’s (P < 0.001). The observation group outperformed the control group on all Mini-CEX indicators (P < 0.05). In addition, teaching quality was better in the observation group than in the control group (P < 0.001).
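A two-group score comparison of this kind can be sketched with Welch's t statistic, a common choice for independent samples with possibly unequal variances; the paper does not state which test produced its P values, and the scores below are invented for illustration.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples with possibly
    unequal variances (one common choice; the study does not specify
    which test it used)."""
    se2 = variance(a) / len(a) + variance(b) / len(b)  # squared standard error of the difference
    return (mean(a) - mean(b)) / se2 ** 0.5

# Hypothetical skill-assessment scores, not the study's data:
observation_scores = [88, 91, 85, 90, 87, 92]
control_scores     = [80, 78, 83, 79, 81, 77]
t = welch_t(observation_scores, control_scores)  # large positive t favours the observation group
```

Converting t to a P value additionally requires the Welch-Satterthwaite degrees of freedom and a t-distribution, which a statistics package would supply.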

The flipped classroom teaching method based on the ADDIE concept can effectively improve the teaching effect of standardized training for neurology residents, and it had a positive effect on residents’ autonomous learning ability, critical thinking ability, theoretical knowledge, and comprehensive clinical ability.


Introduction

As an important medical education system, the standardized residency training system is of great significance in China’s clinical medical training system [1, 2]. To continuously improve the clinical medical talent training system and build a training system with clinical characteristics, China began implementing the standardized residency training system in 2014. Under the standardized clinical teaching plan, residents can meet the multidisciplinary training requirements and objectives of the primary professional title through rotational learning and clinical teaching evaluation across departments [3]. The implementation of the system not only greatly improves the professional ability of clinical medical staff but also effectively saves medical resources and costs. However, neurological diseases are relatively abstruse and complex, with many critical conditions and strong specialization, which requires physicians to have better autonomous learning ability, a richer knowledge reserve, and the ability to solve clinical emergencies.

The ADDIE model consists of five components: analysis, design, development, implementation, and evaluation [4]. As a new teaching theory, ADDIE focuses on students’ needs and goals. It makes the teacher the decision maker for learning [5], setting and developing the necessary learning steps and implementing them effectively by analysing students’ main learning objectives while taking students’ own circumstances into account. Learning effectiveness is checked through appropriate clinical teaching practice sessions to assess whether the learning requirements have been met, which helps students deepen their understanding of the learning content. It improves not only the educator’s ability to teach but, most importantly, the effectiveness of students’ learning. Gagné’s instructional design method consists of nine learning events, such as gaining attention, informing the learner of objectives, stimulating recall of prior learning, presenting the stimulus, and providing learning guidance [6]. Compared with Gagné’s method, the ADDIE model has the advantages of simple steps and easy implementation and is often used in medical education design. Lucia et al. [7] used the ADDIE model to develop a basic life support course related to adult cardiac arrest; under the guidance of this theory, it not only achieved technical innovation and systematization in cardiopulmonary resuscitation education but also had positive significance for medical education. Maya et al. [8] developed and implemented a COVID-19 elective course for pediatric residents using the ADDIE teaching concept; as an effective teaching method, the course provided necessary disaster response and flexible education for pediatric residents. The teaching concept therefore plays an important role in medical education.

The flipped classroom [9] was first popularised in the United States, where homework was advocated as a replacement for the in-class learning format, and it has gradually been applied to medical education in recent years [10]. Unlike traditional teaching, this emerging mode advocates a student-centred approach: the teacher prepares teaching videos or materials on an online platform and distributes them to the students, who then arrange their own study plan and time [11, 12]. The model is therefore not limited by time or place, and students can learn according to their own situation and at their own pace. When encountering difficult points, students can rewatch the video, interact and discuss with other students, or collect their questions and send them to the teacher to be answered one by one.

Therefore, the flipped classroom teaching method based on the ADDIE concept can be used to formulate and implement learning and training plans suited to the clinical teaching needs of standardized neurology residency training and the actual situation at this stage. It encourages students to arrange their own learning time and gives them the initiative in learning, overcoming the disadvantages of traditional medical teaching: tight classroom time, heavy tasks, and students’ inability to study and think deeply. This has a positive effect on cultivating students’ autonomous learning ability, forming critical thinking ability, and improving professional knowledge and comprehensive clinical ability. The Mini-CEX (Mini-Clinical Evaluation Exercise) is considered an effective method for evaluating residents’ clinical ability and teaching [13]. In this study, theoretical and technical knowledge, autonomous learning ability, and critical thinking ability were evaluated and scored, and residents’ comprehensive clinical ability was evaluated with the Mini-CEX, so as to provide a comprehensive and objective evaluation of clinical teaching results. This study explores a mode of clinical medical education in order to provide a reference for the clinical teaching of standardized residency training.

Materials and methods

Study design

This study used a prospective controlled experimental design.

Participants

The participants were 90 residents of the classes of 2019 and 2020 undertaking standardized residency training in the Department of Neurology of our hospital. A random number table was used to divide the 90 residents into a control group and an observation group of 45 residents each. The control group comprised 21 males and 24 females aged 23–28 (25.40 ± 2.78) years; the observation group comprised 23 males and 22 females aged 22–27 (24.37 ± 2.59) years. All subjects signed an informed consent form. Comparison of the general data of the two groups showed no statistically significant differences ( p  > 0.05).

Training methods

Both groups of residents underwent one month of standardized residency training in the Department of Neurology. During the training period, the instructors trained the residents according to the standardized residency training syllabus, which mainly included theoretical learning and skills operation. Teachers were randomly assigned to the two groups, and the quality of teaching was monitored by the department head.

Control group

This group adopted traditional teaching methods: problem-based learning (PBL), case-based learning (CBL) and lecture-based learning (LBL). PBL is a problem-oriented method in which students seek solutions around problems [ 14 ]. CBL is a case-based method in which cases are designed according to the teaching objectives and the teacher leads students to think, analyse and discuss [ 15 ]. LBL is the traditional lecture format [ 16 ]. In the first week, teachers conducted a unified entrance assessment, enrolment education and an introduction to the basics of neurology. The second week relied mainly on LBL, covering common diseases of the Department of Neurology through ward rounds, bedside physical examination and auxiliary examination analysis, with residents proposing the diagnostic basis and treatment plan. The third week mainly used CBL, consolidating the knowledge learned through case study. The fourth week mainly used PBL, promoting learning and understanding through asking and answering questions. Learning outcomes were evaluated before the residents left the department four weeks later. The detailed process is shown in Fig.  1 .

Fig. 1 Flow chart of the resident training process for the two groups

Observation group

This group adopted the flipped classroom teaching method based on the ADDIE concept. The training content of the first week was the same as that of the control group. From the second to the fourth week, the flipped classroom method based on the ADDIE concept was adopted, for a total of 38 class hours. By analysing the syllabus and the actual situation of the subjects, we designed, developed and implemented a targeted teaching programme, and conducted a unified assessment of learning outcomes before the residents left the department in the fourth week. The concrete programme is shown in Table  1 .

Step 1: composition of the teaching team

The teaching team consisted of the department head, 10 neurology lead teachers, and two non-neurology ADDIE specialists. The department head is responsible for overseeing the overall quality of teaching, the instructors for teaching all students and assessing their outcomes, and the ADDIE experts for integrating ADDIE concepts into the clinical curriculum plan of the standardised residency training according to the specific arrangement and actual situation of the curriculum.

Step 2: setting of teaching objectives

The teaching objectives of standardised training for neurology residents mainly include: (1) understanding and mastering common neurological diseases and their diagnosis and treatment processes, such as migraine, tension headache, benign paroxysmal positional vertigo, peripheral facial palsy, Parkinson's disease, posterior circulation ischemia, cerebral infarction, cerebral hemorrhage, subarachnoid hemorrhage and epilepsy; (2) understanding and mastering methods of systematic neurological physical examination; (3) proficiency in skilled operations related to neurological diseases, including lumbar puncture; (4) familiarity with the management of common neurological emergencies, including acute-phase cerebral infarction, acute-phase cerebral haemorrhage and status epilepticus; and (5) improvement of residents' ability to communicate and collaborate within the team, communicate with patients, and handle emergent problems.

Step 3: concrete teaching plan

With the unanimous agreement and sustained efforts of the teaching team, the curriculum and methodology for the flipped-classroom standardised residency training based on the ADDIE concept were finalised. The teaching plan was carried out in five steps, as shown in Table  1 .

Step 4: implementation of flipped classroom teaching method based on ADDIE teaching philosophy

Project analysis

The final teaching task of this training comprised two aspects: (1) completing all the teaching objectives set above; and (2) improving the residents' comprehensive clinical ability in the process. Before the start of training, a questionnaire on the residents' base of neurological specialty knowledge provided an initial assessment, which helped us understand the students' current learning situation and tailor the teaching accordingly. At the same time, the main teaching tasks and objectives were combined to analyse the specific form and content of the project, so as to develop a more practical and targeted programme.

Project design

The project mainly comprised: (1) Admission assessment: after admission to the department, all residents took a unified entrance assessment and received an introduction to the basics of neurology. (2) Flipped classroom teaching: before class, the lead teacher analysed and sorted the common neurological diseases and their diagnosis and treatment processes by disease type according to the requirements of the syllabus, prepared a teaching plan, and covered one disease type at a time. Teachers posted teaching resources, including slides, videos, cases and literature, to the social platform, stated the content and requirements to be mastered, and posed 3–5 questions for students to think about in line with the teaching focus. Students could arrange their own study time, form groups and hold discussions to try to solve the problems, ask the teaching staff questions through the social platform at any time, and go to the library or search the literature online to expand their knowledge. In this session, knowledge transfer was completed. (3) Bedside practice teaching: the teacher communicated with the patient in advance so that the students could conduct bedside history-taking, physical examination, and auxiliary examination and analysis, then propose the diagnosis and its basis, with the teacher observing and assisting throughout.

Project development

After the theoretical learning and practical teaching, the teacher asks targeted questions, pointing out what the students did well and what needs improvement in questioning and treating patients. At the same time, specific learning tasks are assigned to different students. Students are encouraged to report to the teacher on the patient's condition and treatment plan and to propose their own treatment ideas, and they may bring the teacher any questions or problems they cannot solve during the consultation. This teaching method is valuable for mastering the theoretical knowledge of diseases and cultivating clinical thinking.

Project implementation

Following the teaching team's specific and detailed programme, methods such as the entrance examination, flipped classroom teaching, bedside practical teaching and special case discussions were adopted. When encountering problems, students first consulted the literature or solved them independently through group discussion; if a problem could not be solved, they sought help from the teachers. This practises students' independent learning, teamwork, and clinical diagnosis and treatment thinking.

Programme assessment

At the end of the programme, students' theoretical and professional skills knowledge is assessed. Their independent learning ability, critical thinking ability and clinical practice ability are assessed with the relevant instruments, and the overall teaching quality is assessed last, after which the teacher comments on and summarises the assessment results.

Observation indicators

Theory and skill knowledge assessment

This assessment comprised two parts: theory and skill operation. The theoretical assessment covered basic neurology and the diagnosis, treatment and medication of common neurological diseases. The skill operation part covered lumbar puncture, thoracentesis, abdominal puncture, cardiopulmonary resuscitation, and other required items. Each part was worth 50 points, for a total of 100 points. The teachers conducted unified assessment and grading.

Self-directed learning ability assessment scale

After the fourth week of training, the self-directed learning ability assessment scale [ 17 ] was used to assess residents' self-directed learning ability. It covers self-motivation beliefs and objective behaviour. Self-motivation beliefs comprise self-motivation (5 items) and learning beliefs (3 items); objective behaviour comprises setting learning goals and plans (4 items), self-monitoring and adjustment (7 items), obtaining and processing information (4 items), and communication and cooperation (7 items). A 5-point Likert response format [ 18 ] is used, with the levels "completely non-compliant", "basically non-compliant", "average", "basically compliant" and "completely compliant" scored 1 to 5 points, for a total score of up to 150 points. Higher scores indicate stronger autonomous learning ability. The Cronbach's alpha coefficient was 0.929, the split-half reliability 0.892, and the content validity index 0.970, indicating good internal consistency, reliability and validity.
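The subscale structure described above (30 Likert items in six subscales, total 30–150 points) can be expressed as a small scoring sketch. This is an illustrative reconstruction, not the study's own code; the subscale keys are paraphrased from the text and the function name is hypothetical.

```python
# Item counts per subscale, taken from the scale description in the text.
# Keys are paraphrased labels, not the scale's official item names.
SUBSCALES = {
    "self_motivation": 5,
    "learning_beliefs": 3,
    "goals_and_plans": 4,
    "self_monitoring": 7,
    "information_processing": 4,
    "communication_cooperation": 7,
}

def score_scale(responses):
    """responses: dict mapping subscale key -> list of Likert ratings (1-5).

    Returns the total score (30-150) and per-subscale sums.
    """
    total = 0
    per_subscale = {}
    for name, n_items in SUBSCALES.items():
        items = responses[name]
        if len(items) != n_items:
            raise ValueError(f"{name} expects {n_items} items")
        if not all(1 <= r <= 5 for r in items):
            raise ValueError("Likert ratings must be between 1 and 5")
        per_subscale[name] = sum(items)
        total += per_subscale[name]
    return total, per_subscale
```

For example, a resident answering "average" (3) on every item would score 90 of the possible 150 points.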

Critical thinking skills assessment scale

The Critical Thinking Skills Assessment Scale [ 19 ] was used at the end of the fourth week of training. It consists of seven dimensions, including truth-seeking, open-mindedness, analytical ability and systematisation, with 10 items per dimension. A 6-point response format is used, ranging from "strongly disagree" to "strongly agree" and scored 1 to 6, with reverse scoring for negatively worded items. The total score ranges from 70 to 420: ≤ 210 indicates negative performance, 211–279 neutral performance, 280–349 positive performance, and ≥ 350 strong critical thinking skills. The Cronbach's alpha coefficient was 0.90, the content validity index 0.89, and the reliability 0.90, indicating good internal consistency, reliability and validity.
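The score banding above is a simple threshold mapping; a minimal sketch (illustrative only, with a hypothetical function name) makes the cut-offs explicit:

```python
def classify_critical_thinking(total: int) -> str:
    """Map a total scale score (70-420) to the band given in the text."""
    if not 70 <= total <= 420:
        raise ValueError("total must lie in the scale's range 70-420")
    if total <= 210:
        return "negative"
    if total <= 279:
        return "neutral"
    if total <= 349:
        return "positive"
    return "strong"   # >= 350
```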

Clinical practice competence assessment

Clinical practice competence was assessed at the end of the fourth week of training using the mini-CEX scale [ 20 ], which included the following seven aspects: medical interview, physical examination, humanistic care, clinical diagnosis, communication skills, organisational effectiveness, and overall performance. Each aspect is rated from 1 to 9: 1 to 3 as “unqualified”; 4 to 6 as “qualified”; and 7 to 9 as “excellent”. The Cronbach’s alpha coefficient of the scale was 0.780, and the split-half reliability coefficient was 0.842, indicating that the internal consistency and reliability of the scale were relatively high.

Teaching quality assessment

Teaching quality was assessed at the end of the fourth week of training using the teaching quality assessment scale [ 21 ], which covers five aspects: teaching attitude, teaching method, teaching content, teaching characteristics and teaching effect. A 5-point Likert format was used, with higher ratings indicating higher teaching quality. The Cronbach's alpha coefficient was 0.85 and the reliability 0.83, indicating good reliability and validity.

Data analysis

SPSS 23.0 statistical software was used to analyse the data. Measurement data are expressed as mean ± standard deviation ( \( \bar x \pm S \) ), and the t-test was used for comparisons between groups. Categorical data were compared between the two groups using the χ2 test or Fisher's exact test. A p-value < 0.05 was considered statistically significant.
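The same comparisons can be reproduced outside SPSS. The sketch below, using scipy rather than the authors' workflow, runs an independent-samples t-test on measurement data and a χ2 (or Fisher's exact) test on categorical data. The score arrays are hypothetical; only the 2×2 sex distribution is taken from the Participants section.

```python
import numpy as np
from scipy import stats

# Hypothetical theory scores for two groups of residents (not study data)
control = np.array([38.5, 40.2, 36.8, 41.0, 39.3])
observation = np.array([44.1, 46.3, 43.8, 45.0, 47.2])

# Independent-samples t-test for measurement data
t_stat, p_val = stats.ttest_ind(observation, control)

# 2x2 sex distribution from the Participants section
# (rows: control/observation; columns: male/female)
table = np.array([[21, 24], [23, 22]])
chi2, p_chi, dof, expected = stats.chi2_contingency(table)

# Fisher's exact test is preferred when expected cell counts are small
odds_ratio, p_fisher = stats.fisher_exact(table)
```

With the sex table above, the χ2 test returns a large p-value, matching the paper's report that the groups' general data did not differ significantly.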

Results

The theory and skill assessment scores, self-directed learning ability scores, critical thinking ability scores and their statistical analyses for the two groups are shown in Table 2. The Mini-CEX assessment results and statistical analysis are shown in Table 3, and the teaching quality assessment results in Table 4.

Discussion

Standardised residency training is an important medical personnel training system in China and a key link in cultivating high-quality residents: clinicians need not only solid clinical expertise but also strong professional ethics to serve patients well in outpatient and inpatient work. In recent years, with the continuous development of China's economy, people's demand for health care has also been increasing. Neurological diseases are diverse, and some, such as acute cerebrovascular disease, epilepsy, central nervous system infections, acute disseminated encephalomyelitis and Guillain-Barré syndrome, have an acute onset and change rapidly, requiring neurology residents to accurately identify and manage neurological emergencies and severe illnesses at an early stage. This places higher demands on the basic competence of neurology residents and brings new challenges to the clinical teaching of standardised neurology residency training. Traditional teaching methods can therefore no longer meet the teaching requirements of the new situation and new policies; only by continuously improving and innovating clinical teaching methods and raising teaching quality can the professional competence and training quality of residents be improved [ 22 ].

This study found that, after four weeks of teaching, the theoretical and clinical skills assessment scores of the observation group were significantly higher than those of the control group ( P  < 0.001). The observation group's scores for autonomous learning ability and critical thinking ability were also better than the control group's ( P  < 0.001). In the Mini-CEX assessment, the observation group outperformed the control group both in medical interview and physical examination ( P  < 0.01) and in humanistic care, clinical diagnosis, communication skills, organisational effectiveness and overall performance ( P  < 0.05). The observation group also rated the quality of teaching higher than the control group did ( P  < 0.001). Previous studies have shown that the ADDIE concept can be applied to the design of clinical ethics education programmes and can be an effective tool for healthcare education, providing an established structure for developing educational programmes [ 23 ]. Saeidnia et al. [ 24 ] used the ADDIE model to develop an educational application for COVID-19 self-prevention and self-care, helping people learn self-care skills at home during isolation, which can serve as an effective tool against COVID-19 to some extent. To reduce postoperative complications of breast cancer, Aydin et al. [ 25 ] designed a mobile application supporting self-care after breast cancer surgery based on the ADDIE model, which provides professional medical guidance and advice for postoperative patients and is used in both education and clinical settings. The ADDIE model has therefore achieved good outcomes not only in the design of medical education but also in disease prevention guidance and postoperative care.

As a flexible, targeted and effective new teaching method, the flipped classroom has been studied by many scholars in basic medical and clinical education. Paul et al. [ 26 ] compared flipped and online-only course delivery and found the flipped classroom more effective for teaching clinical reasoning. Du et al. [ 27 ] found that a fully online flipped classroom increased classroom participation and student-faculty interaction in distance education and improved overall medical student exam pass rates during the COVID-19 pandemic. Sierra-Fernández et al. [ 28 ] found that the flipped classroom achieved better outcomes in a cardiology residency training programme, with higher acceptance among participants and teachers and improved physicians' assessment scores compared with traditional and virtual teaching models. Meanwhile, this study used the Mini-CEX to assess residents' overall clinical competence. As a formative assessment, it not only provides a more accurate and comprehensive assessment of physicians' clinical competence but also effectively promotes their learning and growth [ 29 , 30 ]. The objective structured clinical examination (OSCE), which evaluates students' comprehensive clinical ability, understanding and application by simulating clinical scenarios, is widely used in pre-internship training of undergraduates' clinical practice skills [ 31 ]. Compared with the OSCE, the Mini-CEX is not limited by site or time and is time-saving, simple and comprehensive, allowing a more systematic and complete evaluation of students' comprehensive clinical ability [ 32 , 33 ]; it was therefore selected as the main clinical evaluation method in this study. Khalafi et al. [ 34 ] found that using the Mini-CEX as a formative assessment significantly improved the clinical skills of nurse anaesthesia students. Shafqat et al. [ 35 ] adopted the Mini-CEX as a direct-observation assessment in an undergraduate medical curriculum to evaluate its validity and feasibility, and found it effective in measuring student competence, improving medical students' clinical and diagnostic skills, and enhancing teacher-student interaction.

This study found that with the ADDIE concept combined with the flipped classroom method, residents' autonomous learning ability, critical thinking ability, theoretical knowledge and comprehensive clinical ability all improved. Possible reasons are as follows. ADDIE, as a comprehensive instructional design concept, comprises five dimensions: analysis, design, development, implementation and evaluation. It first systematically analyses the specific clinical teaching needs and combines them with the students' actual situation, and on this basis flexibly sets the teaching plan. Combined with the flipped classroom, it is student-centred, which differs markedly from the teacher-centred concept of traditional teaching. The method encourages students to study independently in their spare time using the text and video materials distributed through the teaching platform, meeting each student's individual needs. At the same time, students actively explore the questions posed by teachers and encountered in practice, which not only stimulates interest in learning but also greatly improves autonomous learning and independent thinking. Furthermore, collaborative discussion among students and in-depth explanation by teachers promote the formation of critical thinking, improve learning outcomes and classroom efficiency, and enhance comprehensive clinical ability.

Limitations and recommendations

Although this study has some clinical teaching value, it also has several shortcomings. First, the limited number of residency trainees resulted in a small sample size, which may affect the results. Second, because of the limitations of the residency training syllabus and policy, the training lasted only one month, whereas specialty training and talent development often need more time. Third, only the Mini-CEX was used to assess residents' comprehensive clinical competence; this single-instrument choice may affect the validity of the assessment. In the future, we will expand the sample size, allow more reasonable and sufficient time for teaching and for knowledge digestion and assimilation, and use multiple scales for in-depth assessment of various aspects, with a view to obtaining more reliable and persuasive results that can serve as a reference for specialised clinical medical teaching.

Conclusions

This study applied the ADDIE concept combined with the flipped classroom method in residency training and found that, compared with traditional teaching, the new approach effectively improved neurology residents' autonomous learning ability, critical thinking ability, theoretical knowledge and comprehensive clinical ability, with better teaching quality. In clinical medical education, we should actively embrace modern teaching ideas and, on the basis of traditional teaching, integrate new concepts and methods, making full use of the advantages of different teaching approaches to continuously improve teaching efficiency and quality.

Data availability

The datasets used and/or analysed in this study are available from the corresponding author upon reasonable request.

Hongxing L, Yan S, Lianshuang Z, et al. The practice of professional degree postgraduate education of clinical medicine under the background of the reform of the medical education Cooperatio. Contin Med Educ. 2018;32(12):16–8.


Shilin F, Chang G, Guanlin L, et al. The investigation of the training model in four-in-one professional Master’s degree of medicine. China Contin Med Educ. 2018;10(35):34–7.

Man Z, Dayu S, Lulu Z, et al. Study on the evaluation indeses system of clinical instructors by clinical professional postgraduates under the dualtrack system mode. Med Educ Res Prac. 2018;26(6):957–61.

Boling E, Easterling WV, Hardré PL, Howard CD, Roman TA. ADDIE: perspectives in transition. Educational Technology. 2011;51(5):34–8.

Hsu T, Lee-Hsieh J, Turton MA, Cheng S. Using the ADDIE model to develop online continuing education courses on caring for nurses in Taiwan. J Contin Educ Nurs. 2014;45(3):124–131. https://doi.org/10.3928/00220124-20140219-04

Woo WH. Using Gagne’s instructional model in phlebotomy education. Adv Med Educ Pract. 2016;7:511–6. https://doi.org/10.2147/AMEP.S1103 . Published 2016 Aug 31.


Tobase L, Peres HHC, Almeida DM, Tomazini EAS, Ramos MB, Polastri TF. Instructional design in the development of an online course on basic life support. Rev Esc Enferm USP. 2018;51:e03288. Published 2018 Mar 26. https://doi.org/10.1590/S1980-220X2016043303288

MS, Lo CB, Scherzer DJ, et al. The COVID-19 elective for pediatric residents: learning about systems-based practice during a pandemic. Cureus. 2021;13(2):e13085. https://doi.org/10.7759/cureus.13085 . Published 2021 Feb 2.

Pierce R, Fox J. Vodcasts and active-learning exercises in a flipped classroom model of a renal pharmacotherapy module. Am J Pharm Educ. 2012;76(10):196.

Bergmann J, Sams A. Remixing chemistry class. Learn Lead Technol. 2008;36(4):24–7.

Mehta NB, Hull AL, Young JB, Stoller JK. Just imagine: new paradigms for medical education. Acad Med. 2013;88(10):1418–23.

Ramnanan CJ, Pound LD. Advances in medical education and practice: student perceptions of the flipped classroom. Adv Med Educ Pract. 2017;8:63–73.

Norcini JJ, Blank LL, Duffy FD, Fortna GS. The mini-CEX: a method for assessing clinical skills. Ann Intern Med. 2003;138:476–81. https://doi.org/10.7326/0003-4819-138-6-200303180-00012

Zhang J, Xie DJ, Bao YC et al. The application of goal setting theory combined with PBL teaching mode in clinical teaching of neurology [J]. J Clin Chin Med. 2017;29(06):946–8. https://doi.org/10.16448/j.cjtcm.2017.0316 (Chinese).

Zhang J, Xie DJ, Huang XF, et al. The application of PAD combined with CBL teaching method in neurology teaching [J]. Chin Med Records. 2023;24(06):98–101. (Chinese).

Liu CX, Ouyang WW, Wang XW, Chen D, Jiang ZL. Comparing hybrid problem-based and lecture learning (PBL + LBL) with LBL pedagogy on clinical curriculum learning for medical students in China: a meta-analysis of randomized controlled trials. Med (Baltim). 2020;99(16):e19687. https://doi.org/10.1097/MD.0000000000019687

Wang Xiaodan T, Gangqin W, Suzhen, et al. Construction of the self-study ability assessment scale for medical students [J]. Chin J Health Psychol. 2014;22(7):1034–7. (Chinese).

Fang Bao. Analysis of the influencing factors on the effectiveness of the likert rating scale survey results [J]. J Shiyan Vocat Tech Coll. 2009;22(2):25–8. (Chinese).

Meici P, Guocheng W, Jile C, et al. Research on the reliability and validity of the critical thinking ability measurement scale [J]. Chin J Nurs. 2004;39(9):7–10.

Yusuf L, Ahmed A, Yasmin R. Educational impact of mini-clinical evaluation exercise: a game changer [J]. Pak J Med Sci. 2018;34(2):405–11.

Zhou Tong X, Ling W, Dongmei et al. The application of the teaching model based on CDIO concept in the practice teaching of cardiovascular nursing. Chin Gen Med. 2022;(09):1569–72.

Li Q, Shuguang L. Analysis and reflection on the standardized training of 24 hour responsible physicians [J]. China Health Manage. 2016;33(5):374–6. (Chinese).

Kim S, Choi S, Seo M, Kim DR, Lee K. Designing a clinical ethics education program for nurses based on the ADDIE model. Res Theory Nurs Pract. 2020;34(3):205–222. https://doi.org/10.1891/RTNP-D-19-00135

Saeidnia HR, Kozak M, Ausloos M, et al. Development of a mobile app for self-care against COVID-19 using the analysis, design, development, implementation, and evaluation (ADDIE) model: methodological study. JMIR Form Res. 2022;6(9):e39718. Published 2022 Sep 13. https://doi.org/10.2196/39718

Aydin A, Gürsoy A, Karal H. Mobile care app development process: using the ADDIE model to manage symptoms after breast cancer surgery (step 1). Discov Oncol. 2023;14(1):63. https://doi.org/10.1007/s12672-023-00676-5 . Published 2023 May 9.

Paul A, Leung D, Salas RME, et al. Comparative effectiveness study of flipped classroom versus online-only instruction of clinical reasoning for medical students. Med Educ Online. 2023;28(1):2142358. https://doi.org/10.1080/10872981.2022.2142358

Du J, Chen X, Wang T, Zhao J, Li K. The effectiveness of the fully online flipped classroom for nursing undergraduates during the COVID-19: historical control study. Nurs Open. 2023;10(8):5766–5776. https://doi.org/10.1002/nop2.1757

Sierra-Fernández CR, Alejandra HD, Trevethan-Cravioto SA, Azar-Manzur FJ, Mauricio LM, Garnica-Geronimo LR. Flipped learning as an educational model in a cardiology residency program. BMC Med Educ. 2023;23(1):510. Published 2023 Jul 17. https://doi.org/10.1186/s12909-023-04439-2

Jamenis SC, Pharande S, Potnis S, Kapoor P. Use of mini clinical evaluation exercise as a tool to assess the orthodontic postgraduate students. J Indian Orthod Soc. 2020;54(1):39–43.

Devaprasad PS. Introduction of mini clinical evaluation exercise as an assessment tool for M.B.B.S. Interns in the Department of Orthopaedics. Indian J Orthop. 2023;57(5):714–717. Published 2023 Apr 9. https://doi.org/10.1007/s43465-023-00866-x

Hatala R, Marr S, Cuncic C, et al. Modifcation of an OSCE format to enhance patient continuity in a high-stakes assessment of clinical performance. BMC Med Educ. 2011;11:23.

Niu L, Mei Y, Xu X et al. A novel strategy combining Mini-CEX and OSCE to assess standardized training of professional postgraduates in department of prosthodontics. BMC Med Educ. 2022;22(1):888. Published 2022 Dec 22. https://doi.org/10.1186/s12909-022-03956-w

Lörwald AC, Lahner FM, Mooser B, et al. Influences on the implementation of Mini-CEX and DOPS for postgraduate medical trainees’ learning: a grounded theory study. Med Teach. 2019;41(4):448–56. https://doi.org/10.1080/0142159X.2018.1497784

Khalafi A, Sharbatdar Y, Khajeali N, Haghighizadeh MH, Vaziri M. Improvement of the clinical skills of nurse anesthesia students using mini-clinical evaluation exercises in Iran: a randomized controlled study. J Educ Eval Health Prof. 2023;20(12). https://doi.org/10.3352/jeehp.2023.20.12

Shafqat S, Tejani I, Ali M, Tariq H, Sabzwari S. Feasibility and effectiveness of mini-clinical evaluation exercise (Mini-CEX) in an undergraduate medical program: a study from Pakistan. Cureus. 2022;14(9):e29563. Published 2022 Sep 25. https://doi.org/10.7759/cureus.29563


Acknowledgements

The authors would like to thank all the faculty members of the Department of Neurology of the First Affiliated Hospital of Anhui University of Traditional Chinese Medicine for their support of the clinical teaching programme for standardized residency training.

This study was funded by the National Natural Science Foundation of China (Grant No. 82274493) and the Scientific Research Project of Higher Education Institutions in Anhui Province (Grant No. 2023AH050791).

Author information

Authors and affiliations

Department of Neurology, The First Affiliated Hospital of Anhui University of Traditional Chinese Medicine, 117 Meishan Road, Hefei, Anhui, China

Juan Zhang, Xiaofeng Huang & Daojun Xie

The First Clinical Medical College of Anhui University of Chinese Medicine, Hefei, China

Hong Chen & Xie Wang


Contributions

JZ wrote the manuscript. JZ and HC collected the data. HC, XW and XH obtained and analysed the data. DX revised the manuscript for intellectual content. JZ confirmed the authenticity of all original data. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Juan Zhang.

Ethics declarations

Ethical approval and consent to participate

All procedures performed in the study involving human participants were in accordance with institutional and/or national research council ethical standards and in accordance with the 1964 Declaration of Helsinki and its subsequent amendments or similar ethical standards. All participants signed an informed consent form. All experimental protocols were approved by the Ethics Committee of the First Affiliated Hospital of Anhui University of Traditional Chinese Medicine.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Zhang, J., Chen, H., Wang, X. et al. Application of flipped classroom teaching method based on ADDIE concept in clinical teaching for neurology residents. BMC Med Educ 24, 366 (2024). https://doi.org/10.1186/s12909-024-05343-z


Received: 26 September 2023

Accepted: 23 March 2024

Published: 03 April 2024

DOI: https://doi.org/10.1186/s12909-024-05343-z


Keywords

  • ADDIE teaching model
  • Flipped classroom
  • Standardized training for residents

BMC Medical Education

ISSN: 1472-6920



Related Articles

  1. Program Evaluation and Planning

    Below you will find a sample of reports, case studies and articles that outline the process of program evaluation, planning and analysis. Employing the setting of higher education program evaluation at a midwestern regional public university, the study compared analysis approaches using basic descriptive analyses, regression, standard propensity score matching (PSM), and a mixture of PSM with coarsened exact matching and exact matching.

  2. Time to Revisit Existing Student's Performance Evaluation Approach in

    This article presents a case study using an experiment-based approach to observe whether ChatGPT can provide fully worked solutions to the assortment of assessment tools used for developing students' skills in critical thinking, problem solving, communication, knowledge application, research, teamwork, ethics, and so on.

  3. PDF Program evaluation in higher education: A case study

    A dissertation presented to the Faculty of the School of Education at The College of William and Mary in Virginia, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, by Elizabeth Delavan Steele, April 1999.

  4. (PDF) Curriculum assessment practices that incorporate learning

    Reviews terms referring to curriculum assessment and evaluation in higher education, and reports the frequency of use of assessed sources for curriculum assessment in 13 reviewed case studies.

  5. Research impact evaluation and academic discourse

    A pilot evaluation exercise run in 2010 confirmed the viability of the case-study approach to impact evaluation. In July 2011 the Higher Education Funding Council for England (HEFCE) published guidelines.

  6. Academic Evaluation in Higher Education

    'Academic Evaluation in Higher Education', published in The International Encyclopedia of Higher Education Systems and Institutions, discusses how editors judge articles in cases of minor and major errors that need to be met with errata or retractions.

  7. Higher Education Evaluation and Development

    Higher Education Evaluation and Development (HEED) is a scholarly refereed journal that aims to encourage research in higher education evaluation and development, raising the standard of evaluation research and sharing discoveries worldwide. HEED is receptive to critical and phenomenological as well as positivistic studies.

  8. PDF Towards Quality Monitoring and Evaluation Methodology: Higher Education

    Argues that a monitoring and evaluation (M&E) information system for higher education has to support data collection methods such as reviews of official records, questionnaires and surveys of students and graduates, and web mining.

  9. Exploring graduate and undergraduate course evaluations administered on

    A case study from Assessment & Evaluation in Higher Education (Volume 38, Issue 1, 2013) by Jamis J. Perrett (Statistics, Texas A&M University, College Station, TX), exploring graduate and undergraduate course evaluations administered on paper and online.

  10. The Value of Assessing Higher Education Student Learning Outcomes

    Two articles focus on the challenges associated with creating nation- or systemwide assessment systems. Martin and colleagues present a case study that reflects on development of the field in Australia. It discusses insights from a review of institutional websites and a survey of leaders regarding learning outcomes identified by institutions.

  11. Ensuring bachelor's thesis assessment quality: a case study at one

    The second part is a case study conducted in one bachelor's psychology-related program, where the assessment practitioners and the vice program director analyzed the assessment documents based on the guidelines developed from the literature review. The findings of this study include a list of guidelines based on the four standards.

  12. Peer evaluation: A case study

    Peer evaluation is the process whereby students critique the performances of other students. A peer evaluation format emphasizes skills, encourages involvement, focuses on learning, establishes a reference, promotes excellence, provides increased feedback, fosters attendance, and teaches responsibility. The article explains the process of peer evaluation and specifies the criteria and training involved.

  13. PDF Student perspectives: evaluating a higher education administration program

    The Ph.D. program in Higher Education Administration is primarily for those applicants whose interests lie in gaining a tenure-track position in a Higher Education department at a college or university. Furthermore, the Ph.D. program prepares students to conduct research both inside and outside of college/university settings.

  14. Case study: suggesting choice: inclusive assessment processes

    This case-study paper is based on a university research project, 'Suggesting Choice: Inclusive Assessment Processes', which aimed to explore staff and student opinions on the introduction of choice in assessment methods in a School of Social Sciences, drawing upon the principles of Inclusive Pedagogy and Universal Design.

  15. The case method evaluated in terms of higher education research: A

    This study evaluates the case method in terms of higher education research. The case method is much more effective than lectures in fostering understanding, and the study offers a validation of the case method in terms of known measures of course quality. It also suggests new ways to evaluate and improve management education courses.

  16. Evaluation Strategies For True Learning In Higher Education

    Evaluation is an essential part of the learning process in higher education. It allows instructors to determine whether students learn the material and achieve the intended learning outcomes. However, traditional evaluation methods, such as multiple-choice tests and quizzes, often measure only memorization and recall of information.

  17. PDF Learning Online: A Case Study Exploring Student Perceptions and ...

    This study explored the perceptions and experiences of a group of students enrolled in an online course in Economic Evaluation. A mixed methods approach was adopted for the data collection, and thematic analysis was used to synthesize the data collected and highlight key findings.

  18. Implementing summative assessment with a formative flavour: a case

    The aim of this paper is to show how formative assessment elements can be integrated in very large cohorts of undergraduate students, under the common cost and time constraints of the Australian higher education sector. The following case study is based on a subject with an annual enrolment of more than 2100 students.

  19. Sustainability Assessment of Higher Education Institutions: A ...

    This study examined higher education sustainability evaluation literature for sustainability assessment methods and case studies. This article adds case studies on sustainability evaluation at HEIs to earlier sustainability assessment research. Overall, 88 peer-reviewed articles are examined.

  20. E-assessment challenges during e-learning in higher education: A case study

    Technology has become a fundamental means to encourage reliable and more effective assessments. Rapid technological developments have led to the widespread use of digital platforms and devices in all aspects of life. Educational institutions worldwide had to take advantage of this technological leap during pandemics such as COVID-19, which changed the shape of higher education.

  21. Mathematics

    The assessment of knowledge and skills acquired by the student at each academic stage is crucial for every educational process. This paper proposes and tests an approach based on a structured assessment test for mathematical competencies in higher education, together with methods for statistical evaluation of the test, and presents a case study on the assessment of knowledge and skills.

  22. Application of flipped classroom teaching method based on ADDIE concept

    Based on the ADDIE concept (Analysis, Design, Development, Implementation, Evaluation), this study combines the theory and clinical practice of the flipped classroom teaching method to evaluate the teaching effect, providing a basis and reference for implementing the flipped classroom in future neurology residency training.

  23. The self‐assessed portfolio: a case study: Assessment & Evaluation in

    Rima Bahous reports on a successful attempt to use the portfolio as the sole assessment tool for an upper-level language arts course at an English-medium university in Lebanon. Over four consecutive years in the spring semester, the teacher/researcher devised a special syllabus based on the teaching and learning of text discourses.