Gamification and active learning in higher education: is it possible to match digital society, academia and students' interests?

This study examines whether it is possible to match digital society, academia and students' interests in higher education by testing to what extent the introduction of gamification into active learning setups affects the skills development demanded by the workplace of the digital society of the twenty-first century, the academic achievement standards claimed by academia, and the satisfaction with the learning process required by students. Our results provide statistically significant empirical evidence that the generation of a co-creative and empowered gameful experience that supports students' overall value creation yields satisfactory active learning setups without any loss of academic achievement, while allowing students to develop a series of skills especially relevant for twenty-first century professionals.


Introduction
Digital technology pervades the society of the twenty-first century. The successive technological innovations we are witnessing today make up a digital society in a continuous process of change, with a labor market that demands flexible and creative people with the capacity to reinvent themselves and be direct protagonists of their lifelong learning (Longmore et al. 2018). The new professionals must get used to working in multidisciplinary teams and environments, where over-specialization in a specific subject is valued less than the initiative to learn from an open and holistic perspective (Muduli 2018; Zhu et al. 2019). In this context, the university of the twenty-first century is configured as the field of practice in which to simulate this work scenario through active learning strategies that, while promoting quality technical training, also allow the development of the skills demanded by the actual workplace.
Active learning encourages the students' autonomy and participation in their learning process, giving them a leading role and placing the teacher not as a mere transmitter of knowledge but as a facilitator or guide of that learning (Bonwell and Eison 1991). Active learning promotes their creativity, helping them to develop the skills that increasingly determine their future employability and personal development (Daellenbach 2018; Hayter and Parker 2019; Pang et al. 2019). The Bologna Process and the European Higher Education Area (EHEA; Zahavi and Friedman 2019), the Partnership for 21st Century Learning (P21; van Laar et al. 2017) or the Assessment and Teaching of 21st Century Skills (ATC21S; Care et al. 2018) are examples of international conceptual learning frameworks that highlight the usefulness of active learning for the development of skills associated with content-knowledge learning required for students to succeed in the fast-changing digital society of the twenty-first century.
Murillo-Zamorano et al. Int J Educ Technol High Educ (2021) 18:15
Despite these benefits, active learning, especially in the field of higher education, is still not sufficiently implemented, and in fact it has many detractors in academia (Kalms 2019; Robertson 2018; Tharayil et al. 2018). To a great extent, university teachers continue to emphasize passive scenarios using the master class as the primary mechanism for students' learning (Guerrero-Roldán and Noguera 2018). They view with distrust and incredulity the idea that students could acquire this knowledge autonomously through their experiential participation in the learning process. Many of them are also wary of active learning because it could mean a loss of time and thus an obstacle to their students' academic achievement.
On the other hand, from the students' point of view, moving from a passive role to being the protagonists of their own learning requires a greater workload and degree of commitment. We must also bear in mind that new generations of students are closely tied to the immediate gratification of consumed experiences (Sackin 2018). Taken together, these two factors make it necessary to generate a high degree of satisfaction with the active learning experience in order to reach today's university students. Otherwise, no matter how many advantages it offers in terms of academic achievement and skills development, it will be doomed to failure.
Against this background, and taking into account the interests, doubts and demands of the three stakeholders of the higher education system (digital society, academia and students), the objective of this article is to analyze whether it is possible to create active learning experiences in higher education that allow the development of the skills demanded by the workplace of the digital society of the twenty-first century without affecting either the quality training and learning standards required by academia or the satisfaction generated in the students by involving them in the active creation of their own knowledge. Is it possible to match digital society, academia and students' interests? That is the question. To the best of our knowledge, this is the first time that such a question has been formally addressed by the literature.
In doing so, the first contribution of this research is to show, through the presentation of a real case, that the implementation of satisfactory active learning setups in higher education is feasible without any loss of students' academic achievement, while allowing students to develop a series of skills especially relevant for future twenty-first century professionals: ability to work in groups, ability to listen to others' opinions, self-learning ability, ability to apply knowledge in practice, analytical ability, and ability to synthesize information. Our aim is for this learning experience setup to be generalizable to other university contexts that might be interested in developing active and satisfactory learning environments.
But where is the holy grail? Our results indicate that the key piece to square the circle and accommodate the interests of the three higher education system stakeholders can be found in the use of gamification and, more specifically, in the generation of gameful active learning experiences. The most widely accepted definition of gamification is "the use of game design elements in non-game contexts", proposed by Deterding et al. (2011, p. 10), to attract attention, modify behavior, or solve problems (Kapp 2012; Seaborn and Fels 2015; Werbach and Hunter 2012; Yildrim 2017). It has received and continues to receive significant attention in the media and the specialized research literature (Kasurinen and Knutas 2018).
Its final success lies in the generation of intrinsic motivation that permanently modifies the behavior of individuals (Alsawaier 2018;Hamari et al. 2018). This is not an easy task and depends largely on the design of the gamification experience (Cechetti et al. 2019;Diefenbach and Müssig 2018). Its applications are growing and numerous in different fields, including education (Bozkurt and Durak 2018). In this area, recent meta-analyses conclude that gamification has a positive and significant effect on students' learning outcomes (Bai et al. 2020;Sailer and Homner 2020;Yildirim and Şen 2019). Other benefits of this technique are increased motivation and engagement of students and the development of their autonomous learning skills and critical thinking skills (Zainuddin et al. 2020).
Focusing on higher education, Subhash and Cudney (2018) point out after their systematic review that gamification "has an overwhelming support for a number of benefits to both teachers and students" (p. 204). These benefits include the improvement of performance, learning outcomes, and average scores, as well as the reduction of failure rates. Empirical studies such as those by Tsay et al. (2018) and Diaz-Ramirez (2020) conclude that gamification assessments improve students' final grades. Other studies suggest that it improves both grades and student satisfaction (Fuster-Guilló et al. 2019) and motivation (Jurgelaitis et al. 2019). Guardia et al. (2019) conclude that students positively value gamification and that it can have greater potential than traditional methods to develop skills such as teamwork, practical training, leadership and oral communication skills, the ability to learn and act in new situations, and the ability to generate new ideas and solutions. It should also be noted that Zainuddin (2018) analyzes the effect of gamification in a flipped class environment in secondary school and finds that it improves students' scores, competence beliefs, and motivation.
However, in the specific field of active learning environments in higher education, the literature is still quite scarce, which would corroborate the difficulty of generating successful experiences in this context (Huang and Hew 2018; Huang et al. 2019). This is the first study investigating how gamification affects students' skills, academic achievement and satisfaction in a higher education active learning setup.
In terms of game design elements, most studies follow a classical gamification approach incorporating the so-called PBL triad: points, badges, and leaderboards (Buckley et al. 2018;Seaborn and Fels 2015;Werbach and Hunter 2012). This approach generates engagement and extrinsic motivation but not necessarily satisfaction, which in the medium-long term may lead to user abandonment (Bogost 2015;Mekler et al. 2017;Sanchez et al. 2020). A more recent view defines gamification as "a process of enhancing a service with affordances for gameful experiences in order to support users' overall value creation" (Huotari and Hamari 2017, p. 25). According to this approach, gamification must be able to create experiences that, like games, are intrinsically motivating and satisfactory, achieving a permanent change in individuals' behavior (Koivisto and Hamari 2019). This definition emphasizes not so much the game design elements used but the emergence of gameful experiences by providing the user with the mechanisms necessary to participate in the co-creation thereof.
Following this approach, in this research we develop The Econplus Champions League, a satisfactory gameful active learning setup combining flipped learning (Murillo-Zamorano et al. 2019; Bergman and Sams 2012; Lage et al. 2000), cooperative learning (Aronson 1978, 2002; Hänze 2009, 2016) and the use of rubrics (Azizan et al. 2018; Gallavan and Kottler 2009; Panadero and Jonsson 2013; Zhang et al. 2019). The Econplus Champions League is designed as a competition contested by students' teams, with the ultimate goal of creating a gameful experience surrounding the totality of our active learning setup. Students were empowered by allowing them to participate in the design of the competition itself. Its final purpose was to enable the students to co-create their own didactic material and, in doing so, be the protagonists of their own learning. To the best of our knowledge, this is the first time that such a gamification approach is employed in a higher education context.
We use a quasi-experimental design of natural groups with a gamified active learning instructional condition (experimental group) and a non-gamified active learning instructional condition (control group). In the experimental group, the role of the gamification provider (teacher) was to support the users' (students') processes by offering them resources to co-produce an academic output and enjoy the gameful experience, and by having them participate in the design of the gamification setup.
The academic output consisted of the co-creation by groups of students of multiple-choice question tests (MCQTs). The construction of good MCQTs requires precision, technical adequacy and plausibility (Haladyna et al. 2002). These characteristics give the activity a high added value both in terms of skills and the assimilation of technical contents (Yu et al. 2015). The creation of these tests allows for the self-learning of students, fostering their ability to apply knowledge in practice and enabling them to generate new knowledge from what they have previously learned (Kurtz et al. 2019; Rosenshine et al. 1996). Team co-creation of MCQTs promotes the students' ability to listen to others' opinions and to work in groups, as well as the development of their analysis and synthesis capacities. As far as we know, this is also the first time that this proactive gamified approach to the co-creation of multiple-choice questions is examined in the literature.
Given, as noted above, the resistance and doubts that this type of gamified enriched active learning experience can arouse among some of the higher education system stakeholders, a further contribution of this research is to offer relevant and statistically significant empirical evidence on their particular interests (Fig. 1). To do so, we address three research questions: a first one concerning the skills development demanded by the workplace of the digital society of the twenty-first century (RQ1), a second one related to the academic achievement standards claimed by academia (RQ2), and a third one about the students' satisfaction with their learning process (RQ3). More precisely, these three research questions are stated as:
• RQ1: Does the use of gamification in an active learning setup affect the students' skills?
• RQ2: Does the use of gamification in an active learning setup affect the students' academic achievement?
• RQ3: Does the use of gamification in an active learning setup affect the students' satisfaction?
The rest of the paper is organized as follows. First, the "Methodology" section develops the methodology of this research by defining the participants, study design, instructional activity, sample and data collection procedure, and measurement scales. The "Results" section presents the results and answers the three research questions stated in the introduction. Finally, the "Discussion and conclusion" section discusses and concludes the paper, identifies the study's limitations, and points out future research directions.

Methodology
The higher education module on which we centered our teaching and learning experience was a Macroeconomics module consisting of 60 teaching hours distributed over 15 weeks in 30 2-h sessions. During this time, a syllabus made up of eight topics and an introductory topic zero was covered. Topic zero was devoted to familiarizing the students with the active learning methodology to be implemented and with the information and communication technologies (ICTs) that they would need to follow the course appropriately.

Participants
The teaching and learning experience was conducted at the Faculty of Business and Economics of the University of Extremadura (Spain). The participants were 132 students enrolled in the Macroeconomics module taught in the second semester of the first year of studies in two existing groups of the Degree in Business Administration & Management (Group 1 and Group 2). Group 1 (control group, active learning instructional condition) was made up of 65 students with an average age of 20.17 years and a standard deviation of 2.58. Group 2 (experimental group, gamified active learning instructional condition) consisted of 67 students with an average age of 19.97 years and a standard deviation of 1.60. Following the generational cohorts definition of Seppanen and Gualtieri (2012), 1 all the students participating in this experience were millennials belonging to the Y generation of students and sharing the same set of generational characteristics. The size of the groups of students in our research follows the criteria of natural academic groups according to the registration process established by the university. 2 Of the total number of registered undergraduates in both groups, 59.09% (78) were female and 40.91% (54) were male. The majority (125 or 94.69%) were 18 to 23 years old, 4.54% (6) were between 24 and 25 years old, and just 2 (1.51%) were older than 29. The experience was developed in two non-consecutive academic years, leaving a full academic year between the two. This gap was left to avoid the bias that the transmission of information from one student to another would have introduced had the experience been carried out in the same year or in two immediately consecutive years. In addition, this temporal separation is also intended to avoid the teacher's bias towards one or the other application of the experience.
The instructor in charge of developing the experience was the same in the two participating groups in order to minimize the effect of the "unobserved teacher characteristic in the students' academic performance" (Rivkin et al. 2005; Rockoff 2004). This is an experienced instructor with more than 20 years of lecturing in higher education who has been awarded a recognition of teaching excellence.

Design
We use a quasi-experimental design of natural groups with a gamified active learning instructional condition (experimental group) and a non-gamified active learning instructional condition (control group). We compared these two active learning instructional conditions in terms of skills, academic achievement and students' satisfaction to assess the influence of gamification. By following this approach, we are using a selective control in which "the goal is to assign subjects to the experimental groups in a manner which ensures uniform distribution of extraneous variables among groups. When the distribution of the extraneous variables is the same from group to group, the effects of extraneous variables cancel out across groups" (Street 1995, p. 171).
1 Seppanen and Gualtieri (2012) use the birth years of 1980 to 1999 to define the Millennial cohort in their piece of research published by the U.S. Chamber of Commerce.
2 At the time the experience was developed, one of the students enrolled in Group 2 was 46 years old and another student, in this case enrolled in Group 1, was 37 years old. In order to safeguard the homogeneity and consistency of the analysis groups used in this study, both students were excluded from the analysis, since by year of birth they do not belong to generation Y (millennials) and therefore do not share the generational characteristics of the rest of their classmates.

Instructional activity
The instructional activity was developed using an active learning setup combining flipped learning, cooperative learning and the use of rubrics. The final product of this enriched active learning setup was that students elaborated their own didactic material through the co-creation of an MCQT for any of the content topics of the Macroeconomics module. The instructional activity was carried out, in both the control group (Group 1) and the experimental group (Group 2), on four occasions (T1, T2, T3, T4) distributed throughout the course. In what follows, and according to Fig. 2, we explain the active learning setup that served as the basis for the development of the experience in both groups, as well as the specific gamification practice that was developed with the experimental group in order to compare the effects of its use on the final results of the experience.

Flipped learning: the 4D_FLIPPED classroom
In this research, we adopted the 4D_FLIPPED classroom active learning setup proposed and tested by Murillo-Zamorano et al. (2019). This approach is specifically designed for its use in higher education and has a positive and direct effect on students' knowledge, skills and satisfaction, which is particularly relevant to the development of our experience. The 4D_FLIPPED classroom consists of four dimensions: out-of-class activities; feedback; in-class activities; and the use of technology.
In terms of out-of-class activities, one week before the start of each of the module's topics, the students had access to a series of YouTube videos about the main contents of the topic, which the teacher uploaded to the TES Blendspace online platform and the Virtual Learning Environment (VLE) system of the course. Students were allowed a week to watch and summarize the videos, and to answer an online questionnaire about their main contents and the aspects that they had found the easiest and hardest to understand. The instructor collected and analyzed all this feedback. Subsequently, during class time and following a two-way feedback approach, he commented on this information, answered some of the students' questions, and explained to them how he would adjust the lectures and in-class activities to develop in depth the concepts that the students had identified as being the most complex. The 4D_FLIPPED classroom enhances the entire learning process with the use of technology. In our case, out-of-class activities included the use of the Google cloud services, the course VLE system and multimedia sharing facilities such as TES Blendspace and YouTube. For in-class activities, the essential technology elements were the combined use of mobile devices, social networks, and cloud-computing applications. Figure 3 gathers some of the platforms, tools and technological apps used to implement this active learning setup.

Cooperative learning: the jigsaw classroom with rubric
After the students had carried out the flipped learning out-of-class activities for a particular topic and the teacher had explained its contents, the students proceeded to the in-class activity, which consisted of the co-creation in teams of an MCQT on the contents of that topic. To elaborate it, they used cooperative learning techniques and the scoring rubric presented in Table 1, which was specifically designed by the teacher for this purpose. The use of rubrics facilitates the elaboration of MCQTs by helping students to perceive clearly the most relevant dimensions, the evaluation standards associated with each dimension, and the importance given to each one of them (Chan and Ho 2019; Chowdhury 2019; Cockett and Jackson 2018; Gallavan and Kottler 2009; Jonsson and Swingby 2007; Panadero and Jonsson 2013).
In terms of cooperative learning techniques, we used the Jigsaw classroom approach (Aronson 1978, 2002; O'Leary et al. 2019; Sanaie et al. 2019), forming jigsaw teams of 4 students each. Following this technique, each topic was divided by the teacher into four large blocks and the jigsaw teams appointed an expert in each of them. Each student elaborated two multiple-choice questions (MCQs) about his/her block following the rubric guidelines. After that, he/she met at work tables with other experts in the same block to share their questions and choose the best three by consensus. Then, they uploaded them to a collaborative document located on Google Drive. Subsequently, the initial jigsaw teams met again, and each member contributed his/her three questions. Each team chose, from the twelve available questions, the five that were considered the best. Then, they uploaded the final version of their test to the VLE system of the course.

Table 1 Scoring rubric

Technical adequacy (30%)
Criteria: the items in the MCQT contain "productive" and not simply "reproductive" questions:
• The asked question requires a process of reflection and deduction on the part of the reader before being answered
• The questions and answers do not conform to the mere literality of class notes
• Justification of the correct answer requires employing several cause-effect relationships in a chained fashion
Points: all items, 30 points; 4 of the items, 24 points; between 2 and 3 of the items, 18 points; less than 2 of the items, 0 points

Accuracy (20%)
Criteria: the items in the MCQT are characterized by:
• Include all the relevant information for the resolution of the question, but without resulting in an excessively extensive formulation that makes it difficult to understand
• Avoid answers such as "all of the above" or "none of the above"
• Avoid answers with words of ambiguous meaning such as "normally", "typically", "may be", etc. or very specific determinants such as "always", "never", "all", "none", etc.
Points: all items, 20 points; 4 of the items, 16 points; between 2 and 3 of the items, 12 points; less than 2 of the items, 0 points

Structure (15%)
Criteria: the items in the MCQT are characterized by:
• Include four answers, of which one and only one is correct
• Sort the answers in a logical sequence
• Identify the correct answer and provide an explanation to justify its choice
Points: all items, 15 points; 4 of the items, 12 points; between 2 and 3 of the items, 9 points; less than 2 of the items, 0 points
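The point levels of the rubric dimensions listed in Table 1 follow a simple pattern (full points, 80%, 60% or zero, depending on how many items meet the criteria), which can be illustrated with a short sketch. It assumes that each test holds five items, in line with the five questions each jigsaw team selected, and it covers only the three dimensions reproduced above. The function and variable names are hypothetical and are not part of the original study.

```python
# Illustrative sketch of the rubric point levels; all names are hypothetical.
# Weights and point levels come from the three dimensions shown in Table 1.
ITEMS_PER_TEST = 5  # assumption: a team test holds the five selected questions

# Maximum points per dimension (equal to its percentage weight).
MAX_POINTS = {"technical_adequacy": 30, "accuracy": 20, "structure": 15}


def dimension_points(dimension: str, items_meeting_criteria: int) -> int:
    """Points for one dimension, given how many items meet its criteria."""
    if not 0 <= items_meeting_criteria <= ITEMS_PER_TEST:
        raise ValueError("items_meeting_criteria must be between 0 and 5")
    top = MAX_POINTS[dimension]
    if items_meeting_criteria == ITEMS_PER_TEST:
        return top               # all items: 30, 20 or 15 points
    if items_meeting_criteria == 4:
        return top * 4 // 5      # 4 items: 24, 16 or 12 points
    if items_meeting_criteria >= 2:
        return top * 3 // 5      # 2 or 3 items: 18, 12 or 9 points
    return 0                     # fewer than 2 items: 0 points


def partial_score(counts: dict) -> int:
    """Sum the points over the dimensions supplied in `counts`."""
    return sum(dimension_points(name, n) for name, n in counts.items())
```

For example, a test in which all five items meet the technical adequacy criteria, three meet the accuracy criteria and one meets the structure criteria would obtain 30 + 12 + 0 points on these three dimensions.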

Gamification: the Econplus Champions League
Gamification was introduced into our experimental active learning group of students (Group 2) to balance the increase in students' workload derived from the active learning setup. Specifically, we created a gameful active learning experience aimed at incentivizing their participation and engagement and involving them in the design of the experience itself (Huotari and Hamari 2017; Koivisto and Hamari 2019): The Econplus Champions League.
Relying on the emotional connection of our students with football club competitions, the Econplus Champions League was designed as a competition contested by a set of "top-division macroeconomics clubs" (jigsaw teams in the preceding section), with the ultimate goal of creating a gameful experience surrounding the totality of our active learning setup. Students also participated in defining the competition rules, chose their teams' names, and designed their teams' attire with the online Owayo ® 3D kit designer (Fig. 4).
The Econplus Champions League was structured around three phases: (i) warming-up, (ii) group stage, and (iii) grand finale. The warming-up (phase 1) followed the cooperative and flipped learning procedures, times and steps of the active learning setup described in previous sections. The jigsaw teams' final tests were named Econplus Quizzes, and each team uploaded its Econplus Quiz to the social game-based learning platform Kahoot! 3 After that, in the group stage (phase 2), teams competed against one another through that platform. The nine teams that performed best in the group stage went on to the grand finale (phase 3), where they competed again to determine the Econplus Champions League team of the year.
The warming-up phase and the group stage took place in four 2-h regular league sessions throughout the course, one for each of the four module content topics for which the students created Econplus Quizzes. The grand finale was a 2-h session in the antepenultimate week of the semester, before the final module examination, and covered the whole set of topic contents studied during the course.
The entire competition process was organized around a set of challenges, the Champions Battles, characterized by increasing knowledge acquisition and structured into four progressive apprentice levels: (i) intermediate, (ii) advanced, (iii) higher, and (iv) grade. The warming-up phase and the group stage took place within the first three apprentice levels, while participants in the grand finale accessed the grade level.
In the intermediate level, the Champions Battles were played in pairs, which were randomly selected by all the jigsaw teams formed in the warming-up phase. To do this, first one team and then the other projected their Econplus Quiz on their laptops to the
At the end of each of the four regular sessions of the group stage, the Kahoot points of the intermediate and advanced levels were added and the total score was written down in the Econplus Champions League leaderboard. The points obtained in the higher level were not counted in this total, but gave teams the opportunity to get extra points for their final module marks and were also written down in the leaderboard, in a separate column. The Econplus Champions League leaderboard was made available to students on the bulletin board of the classroom as well as on the VLE system of the course, so that the students had a record of their position in the ranking, day by day (Fig. 5).
When the group stage concluded, the nine teams with the highest scores on the Econplus Champions League leaderboard moved up to the grade level of competition and went on to the grand finale. In it, teams faced two final quizzes prepared by the teacher.
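The leaderboard bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, ties are left in their input order because the text does not describe a tie-breaking rule, and the conversion of higher-level points into extra marks is not modelled because it is not specified.

```python
from dataclasses import dataclass

FINALISTS = 9  # teams that moved on to the grand finale


@dataclass
class TeamRecord:
    """One leaderboard row (hypothetical structure)."""
    name: str
    league_points: int = 0  # intermediate + advanced Kahoot points
    extra_points: int = 0   # higher-level points, kept in a separate column


def record_session(team: TeamRecord, intermediate: int, advanced: int,
                   higher: int) -> None:
    """Add the results of one regular league session to a team's row."""
    team.league_points += intermediate + advanced
    team.extra_points += higher  # not counted in the league total


def finalists(teams: list) -> list:
    """Names of the teams with the highest league totals."""
    # sorted() is stable, so tied teams keep their input order (assumption).
    ranked = sorted(teams, key=lambda t: t.league_points, reverse=True)
    return [t.name for t in ranked[:FINALISTS]]
```

Calling `record_session` once per team after each of the four regular sessions and then `finalists` at the end of the group stage reproduces the ranking logic described in the text.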
Teams could obtain badges throughout the Econplus Champions League. These badges covered every component of the gamified active learning setup developed in this experience. There were three types of badges, each of which could be gold, silver or bronze. Two of them could be earned in each of the four sessions of the Econplus Champions League: flipped classroom badges, depending on the number of members of the team that completed all online questionnaires of the out-of-class activity, and cooperative learning badges, depending on the number of members of the team that had actively participated in the co-creation of the Econplus Quizzes. Gamification badges were obtained by the teams with the highest number of Kahoot points in each of the two quizzes in the grand finale. The badges designed can be seen in Fig. 6. Gold, silver and bronze badges were worth 3, 2 and 1 reward points, respectively, which teams accumulated from session to session. In the grand finale, all these reward points were summed up. This was followed by a short awards ceremony where the third and second place finishers were announced, and the winner was proclaimed the Econplus Champions League team of the year.
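The accumulation of reward points from badges can be sketched in a few lines. The point values (3, 2 and 1) come from the text; the function name and the representation of a badge as a (type, level) pair are hypothetical, and the thresholds that decide whether a team earns gold, silver or bronze are not modelled because they are not given.

```python
# Reward points per badge level, as stated in the text.
BADGE_POINTS = {"gold": 3, "silver": 2, "bronze": 1}


def reward_points(badges) -> int:
    """Total reward points for a team.

    `badges` is an iterable of (badge_type, level) pairs, for example
    ("flipped classroom", "gold"); this representation is an assumption.
    """
    return sum(BADGE_POINTS[level] for _badge_type, level in badges)
```

For instance, a team holding one gold flipped classroom badge, one silver cooperative learning badge and one bronze gamification badge would accumulate 3 + 2 + 1 = 6 reward points.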

Sample, data collection and questionnaire administration
We designed the sample, data collection and questionnaire administration following established protocols (Churchill 1979;Diamantopoulos 1994;Dillman 2011;Rudd et al. 2008). The questionnaire addressed the students' experience with their active learning setup. Its items, answered on a 7-point Likert scale, resulted from a careful review of the literature and a pre-test performed under the above-mentioned protocols. Specifically, we carried out in-depth interviews with four well-informed senior scholars on the topics under study. These scholars provided useful information to confirm the conceptual domain of the latent variables under investigation and to clarify the wording of some items.
The students completed the questionnaire through the VLE system once the lectures were over and before the final exam, and were told that their responses would remain strictly confidential and be used only in aggregate form (Dillman 2011). This procedure was carried out for both the control group and the experimental group. The administration of the questionnaire concluded with a total of 132 valid responses (Table 2): 65 students for the control group (Group 1) and 67 students for the experimental group (Group 2). To control for common method variance, we performed both ex-ante and ex-post procedures (Podsakoff et al. 2012). Ex ante, the study design provided a psychological separation between the variables in the questionnaire and stressed to respondents the importance of answering truthfully. Ex post, from a statistical point of view, we used Harman's single factor test to check that no single factor accounted for the majority of the variance. In this way, we verified that common method variance was not a problem in this study. Additionally, we compared the control group (Group 1) and the experimental group (Group 2) according to the criteria employed in Table 2: the gender, the average grade of the rest of the subjects, and the highest enrolled course. The chi-squared tests revealed, at a significance level of p ≤ 0.01, that there were no significant differences between the control group and the experimental group in terms of the gender (χ² = 0.063, df = 1, p = 0.802), the average grade of the rest of the subjects (χ² = 3.946, df = 4, p = 0.267) and the highest enrolled course (χ² = 0.711, df = 2, p = 0.021).
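The group-equivalence check above relies on chi-squared tests of independence. As a minimal sketch of how such a test is computed, the following standard-library code derives the Pearson statistic and its p-value from a contingency table. The counts in the example are hypothetical (only the group sizes of 65 and 67 come from the text), and no continuity correction is applied, which is an assumption about the procedure.

```python
import math


def chi2_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a table.

    `table` is a list of rows of observed counts. No continuity
    correction is applied (an assumption of this sketch).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even degrees of freedom.
        series = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * series
    # Closed form for odd degrees of freedom.
    series = sum(
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series


if __name__ == "__main__":
    # Hypothetical gender-by-group counts (rows: Group 1, Group 2).
    counts = [[30, 35], [32, 35]]
    stat, df = chi2_statistic(counts)
    print(stat, df, chi2_sf(stat, df))
```

A p-value above the chosen significance level indicates no detectable difference between the groups on that criterion.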

Measurement scales
In the Appendix, we present the questionnaire used to capture the students' perceptions of their active learning setup. Following Murillo-Zamorano et al. (2019), we conceived two blocks of questions for this study. Block 1 collected information on the students' skills, while Block 2 focused on the students' level of satisfaction. Specifically, Block 1 asks whether the teaching methodology employed in the module helped them enhance the ability to work in groups (SKI1), the ability to listen to others' opinions (SKI2), self-learning ability (SKI3), the ability to apply knowledge in practice (SKI4), the ability to analyze (SKI5), and the ability to synthesize (SKI6). The questionnaire also asked the students in Block 2 about their level of satisfaction with the lecturer (SAT1) and with the module (SAT2). We also employed an objective measure, the students' academic achievement in terms of final marks. This variable refers to the mark obtained by the student in the final exam of the subject, on a scale from 0 to 10. This practice provides data from non-perceptual sources, i.e., we combined the students' survey responses (the students' skills and the students' satisfaction) with objective data in the form of the students' final marks. This procedure helps to increase the confidence in the results of our study and to control for common method variance (Podsakoff et al. 2012).

Results
This section is organized into two subsections: first, we examine the measurement models, where appropriate, of the scales defined in the previous subsection: the students' skills (perceptual measure), the students' academic achievement (objective measure), and the students' satisfaction (perceptual measure), in order to determine whether they are reliable and valid scales following established procedures (Hair et al. 2017a, b). Second, we carry out the statistical tests to provide an answer to the three research questions raised in the introduction of the paper.

Measurement models
This subsection examines the measurement models for the scales considered in our study. The reliability and (convergent and discriminant) validity analysis revealed that our first-order measurement models are correct. We used partial least squares structural equation modeling (PLS-SEM) within the SmartPLS software package (Ringle et al. 2015). The basis for employing PLS-SEM, instead of those methods that use covariance structures, is that (Hair et al. 2017a): (i) a normal distribution is not required when using PLS; and (ii) there is no need to work with a high number of observations in PLS. This approach has several differences compared to covariance-based structural equation modeling (CB-SEM) (Hair et al. 2017a). For example, for PLS-SEM, which is the case of our study, there is no need for multivariate normality as "PLS-SEM is a nonparametric statistical method. Different from maximum likelihood (ML)-based CB-SEM, it does not require the data to be normally distributed" (Hair et al. 2017a, pp. 61-62).
Regarding the sample size requirements for the multi-group analysis in PLS-SEM, we first note that "compared with its covariance-based counterpart, PLS-SEM has higher levels of statistical power in situations with complex model structures or smaller sample sizes" (Hair et al. 2017a, p. 24), and furthermore "PLS can be applied in many instances of small samples when other methods fail" (Henseler et al. 2014, p. 199). That said, we employed the G*Power 3 statistical software to carry out a power analysis, which revealed that for both Group 1 and Group 2 in the multi-group analysis the power value is above the recommended threshold of 0.80 (Cohen 1988).
In Table 3, we can observe that for the first-order measurement models of the students' skills and the students' satisfaction, all the loadings were above 0.6 in all cases, i.e., for the control group and experimental group (n = 132), for the control group (n = 65), and for the experimental group (n = 67). We employed 5000 subsamples with the same number of cases as in the original sample to compute the significance levels of the t-values associated with the loadings (Hair et al. 2017b). All the t-tests of statistical significance were satisfactory (Anderson and Gerbing 1988). The average variance extracted (AVE) ranged from 0.601 to 0.867, and the composite reliability index (CRI) ranged from 0.899 to 0.950; these values show an adequate level of reliability for the scales (Bagozzi and Yi 1989). Discriminant validity is also supported, as the square root of the AVE is greater than the correlation between the students' skills and the students' satisfaction in all cases, i.e., for the control group and experimental group (n = 132), for the control group (n = 65), and for the experimental group (n = 67). Additionally, the satisfactory results of the measurement models of the students' skills and the students' satisfaction were also confirmed by the multi-group analysis. We examined whether the loadings in the control group (Group 1) differ significantly from those in the experimental group (Group 2). We used a multiple methods approach under PLS-SEM: PLS-MGA, parametric test, and Welch-Satterthwaite t-test. Table 4 shows the results of the multi-group analysis and allows us to assert that there are no significant differences between the loadings that make up the measurement models of the students' skills and the students' satisfaction across the two groups: control group (Group 1) and experimental group (Group 2).
These results reflect that the scales employed are reliable and valid across the groups considered in our study. Table 5 presents the correlation matrix, mean, standard deviation, and square root of the AVE for each of the latent variables.
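To make the reliability and validity criteria above concrete, the following sketch computes the AVE and the composite reliability from a set of standardized loadings using the usual textbook formulas, and applies the square-root-of-AVE comparison for discriminant validity. The loadings and the correlation in the example are hypothetical, not values from Table 3 or Table 5, and the functions are a simplified stand-in for what SmartPLS reports.

```python
import math


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings.

    Each indicator's error variance is taken as 1 - loading**2, which
    assumes standardized indicators.
    """
    total = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return total * total / (total * total + error)


def discriminant_validity_ok(ave_a, ave_b, correlation):
    """Check that sqrt(AVE) of both constructs exceeds their correlation."""
    return min(math.sqrt(ave_a), math.sqrt(ave_b)) > abs(correlation)


if __name__ == "__main__":
    # Hypothetical loadings for two constructs (not taken from Table 3).
    skills = [0.78, 0.81, 0.74, 0.80, 0.83, 0.79]
    satisfaction = [0.92, 0.94]
    ave_skills = average_variance_extracted(skills)
    ave_sat = average_variance_extracted(satisfaction)
    print(ave_skills, composite_reliability(skills))
    print(ave_sat, composite_reliability(satisfaction))
    print(discriminant_validity_ok(ave_skills, ave_sat, 0.55))
```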
Regarding the objective measure used in our study, the students' academic achievement in terms of final marks, there is no need to carry out a reliability and validity assessment procedure as it is not a latent variable. In Table 5, apart from the correlation matrix, the basic descriptive statistics of such variable show: (i) for the control group and the experimental group (n = 132), an average of 4.867 and a standard

Statistical tests to provide an answer to the three research questions
In light of the above results on the students' skills, the students' academic achievement, and the students' satisfaction, we are now in a position to give an answer to the three research questions. First, we checked if the data followed a normal distribution in order to choose the most suitable type of test to answer the three research questions. To this end, we performed the Shapiro-Wilk test, together with the Kolmogorov-Smirnov test with the correction of Lilliefors, on all the items from the control group (Group 1, n = 65) and the experimental group (Group 2, n = 67) (Farrel and Stewart 2006;Razali and Wah 2011). The results revealed that neither group followed a normal distribution for the perceptual measures: the students' skills and the students' satisfaction. However, for the objective measure, the students' academic achievement in terms of final marks, we found that it followed a normal distribution.
Regarding the first research question, RQ1: Does the use of gamification in an active learning setup affect the students' skills? As the data did not follow a normal distribution for the students' skills, we employed the Mann-Whitney U test, which does not depend on the data distribution (Nachar 2008). In Table 6, we show a disaggregated analysis for the students' skills measure (1-7 Likert type scale; lower limit 1 and upper limit 7), to compare the experimental group with the control group. A disaggregated analysis of the items that make up the scale allows us to develop a broader view of how the different skills behave under different circumstances in an active learning setup.
Table 5 Descriptive statistics and correlations. Correlation coefficients were computed by means of the average scores of the indicators included in each of the latent variables. The square root of AVE is reported in italics along the diagonal. ns non-significant; NA not applicable; ***p = 0.00; **p < 0.01; *p < 0.05. Control group (Group 1) and experimental group (Group 2) (n = 132).
Examining the results, we found that there are significant differences between the level of skill improvement of the experimental group and the control group. The asymptotic (bilateral) significance of the Mann-Whitney U test is p < 0.01 for the ability to work in groups (SKI1), the ability to listen to others' opinions (SKI2), the ability to apply knowledge in practice (SKI4), the ability to analyze (SKI5), and the ability to synthesize (SKI6), and p < 0.05 for self-learning ability (SKI3). The mean of the skills for the experimental group ranged from 5.64 to 5.91, while the mean of the skills for the control group ranged from 4.55 to 5.23. Consequently, the means of the skills in the experimental group are higher than, and significantly different from, the means of the skills in the control group. Therefore, we can suggest that with a gamified active learning setup it is possible to create better active learning experiences in higher education that allow developing higher levels of the skills demanded by the workplace in the digital society of the twenty-first century, for this specific context of research (comparison of the two groups).
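The item-level comparison above uses the asymptotic two-sided Mann-Whitney U test. A minimal standard-library sketch of that test is given below, using midranks and a tie-corrected normal approximation, which matters for Likert data where ties are frequent. The response vectors are hypothetical, and the sketch omits the continuity correction that some statistical packages apply by default, so its p-values may differ slightly from theirs.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided asymptotic Mann-Whitney U test with tie correction.

    Returns (U for sample x, p-value). Uses the normal approximation
    without continuity correction (an assumption of this sketch).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted((value, group) for group, sample in enumerate((x, y))
                    for value in sample)
    rank_sum_x = 0.0
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all observations identical
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical 1-7 Likert responses for one skill item.
    control = [4, 5, 5, 3, 6, 4, 5, 4]
    experimental = [6, 6, 5, 7, 6, 5, 7, 6]
    print(mann_whitney_u(control, experimental))
```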
Concerning the second research question, RQ2: Does the use of gamification in an active learning setup affect the students' academic achievement? In this case, the data from the students' academic achievement in terms of final marks (0-10 numeric scale; lower limit 0 and upper limit 10) followed a normal distribution. We also checked the homogeneity of variances using Levene's test (Gastwirth et al. 2009). As the p-value of this test was above 0.05, equal variances can be assumed, so we performed a one-way ANOVA. The F-test of the one-way ANOVA yielded p = 0.15, which exceeds the 0.05 threshold, i.e., there was no significant difference in mean final marks between the experimental group and the control group (Table 7). The mean of the final marks is 5.06 for the experimental group and 4.66 for the control group. As a result, we can affirm that the introduction of gamification in the active learning experiences carried out in the present study increases the final marks, but not in a statistically significant way, when we compare the experimental and control groups employed in our study; in other words, gamification in this specific context of research (comparison of the two groups) does not harm the final marks and does not affect the quality and learning standards demanded by the academia.
With reference to the third research question, RQ3: Does the use of gamification in an active learning setup affect the students' satisfaction? Again, we used the Mann-Whitney U test because the data from the students' satisfaction did not follow a normal distribution (Nachar 2008).
The disaggregated analysis of the students' satisfaction measure (1-7 Likert type scale; lower limit 1 and upper limit 7) by the items that make up the scale allows us to establish from a broader perspective how the students' satisfaction with the lecturer and with the module behave under different circumstances in an active learning setup (Table 6).
The disaggregated analysis of the students' satisfaction measure revealed no significant differences between the experimental group and the control group. The asymptotic (bilateral) significance of the Mann-Whitney U test is p = 0.23 for the level of satisfaction with the lecturer and p = 0.70 for the level of satisfaction with the module. The mean of the satisfaction with the lecturer is 5.88 for the experimental group and 6.03 for the control group, while the mean of the satisfaction with the module is 5.51 for the experimental group and 5.45 for the control group. Consequently, we cannot state that the means of the satisfaction (with the lecturer and with the module) in the experimental group are significantly different from those in the control group. The results above, for this specific context of research (comparison of the two groups), lead us to maintain that a gamified active learning setup creates neither better nor worse active learning experiences in higher education in terms of generating higher levels of satisfaction in the new generation of students by involving them in the creation of their own knowledge.
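For the comparison of final marks under RQ2, the F statistic of a one-way ANOVA can be sketched as follows. The marks in the example are hypothetical, and the code stops at the F statistic and its degrees of freedom; the p-value would then be read from the F distribution with those degrees of freedom (for instance with `scipy.stats.f.sf`, if SciPy is available).

```python
def one_way_anova_f(*groups):
    """F statistic of a one-way ANOVA and its degrees of freedom.

    Returns (F, df_between, df_within). Assumes at least two groups
    and some within-group variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Residual variation inside each group.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical final marks on the 0-10 scale.
    control = [4.0, 5.5, 3.5, 6.0, 4.5]
    experimental = [5.0, 6.5, 4.0, 5.5, 4.5]
    print(one_way_anova_f(control, experimental))
```

With only two groups, this F statistic equals the square of the equal-variance two-sample t statistic, so the two tests lead to the same conclusion.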

Discussion and conclusion
In the new and challenging context of the digital society of the twenty-first century, the labor market increasingly demands flexible, creative professionals with a broad background in skills and competencies (van Laar et al. 2018). The university cannot remain oblivious to these changes, as it must adapt to them and become the simulation field where future professionals practice and develop these skills. Aware of the undoubted benefits of active learning for the skills development demanded by the actual workplace, and of the vertigo that this change in educational paradigm can produce among students and the academia in higher education, this article contributes to the literature by providing relevant and statistically significant empirical evidence on the particular interests of any of these three higher education system stakeholders.
Through the presentation of a real experience defined by an enriched active learning setup that combines flipped learning, cooperative learning and the use of rubrics, we address the three research questions stated in the introduction of our study. Our purpose is that this learning setup can be generalizable to other university contexts that might be interested in developing active and satisfactory learning environments with potential for the acquisition of the academic standards and the skills development necessary to face the workplace successfully. This is the first time that such a global objective has been formally addressed by the literature.
In doing so, we first use a set of measurement scales to capture the students' perceptions about their active setup in terms of skills and their level of satisfaction with the lecturer and the module. These perceptual measures were employed in combination with objective measures such as the students' academic achievement in terms of final marks, which helps to increase the confidence in the results of our study and to control for common method variance (Podsakoff et al. 2012). This is the first time that a real experience that addresses the use of an enriched active learning setup, together with flipped learning, cooperative learning, and the use of rubrics, is tested in higher education with a multiple methods approach under PLS-SEM to carry out a multi-group analysis, allowing us to identify key theoretical and managerial implications.
The results indicate that gamification favors the development of the skills demanded by the current workplace in the active learning context described. There are significant differences between the results of the group that attended the gamified active learning setup and those of the group that attended the non-gamified one. These skills are the ability to work in groups, the ability to listen to others' opinions, self-learning ability, the ability to apply knowledge in practice, analytical ability, and the ability to synthesize information. In this way, gamification represents an educational tool capable of satisfying the interests of the digital society.
However, the differences between the final marks and satisfaction of the two groups of students were not significant. This does not imply that the gamified active learning setup described fails to match the interests of the academia and the students. The results indicate that the introduction of gamification in the active learning setup does not harm the academic results, which is one of the fears that a sector of university teachers holds against the adoption of this technique. As for satisfaction, it can be affirmed that gamification creates neither better nor worse active learning experiences in higher education in terms of generating higher levels of it for this specific context of research.
Also, it should be noted that both groups (gamified and non-gamified) performed an active learning setup based on the 4D_FLIPPED classroom active learning setup proposed and tested by Murillo-Zamorano et al. (2019), which has a positive and direct effect on students' knowledge, skills, and satisfaction. However, the difference in effects has not been analyzed with a group that followed a traditional non-active learning approach. This may be why gamification has not had a significant effect on students' grades and satisfaction as in the studies by Fuster-Guilló et al. (2019), Jurgelaitis et al. (2019) and Tsay et al. (2018).
With reference to the three research questions stated in the introduction of our study, we conclude that it is possible to match digital society, academia and students' interests by generating satisfactory active learning setups without any loss of students' academic achievement. Specifically, our results point out that the crucial piece to accommodate their interests in higher education can be found in the introduction of gamification into active learning setups. Despite the growing and numerous uses and applications of gamification in different fields, including education (Kasurinen and Knutas 2018;Rapp et al. 2018), the literature about gamification and active learning setups in higher education is practically non-existent. To the best of our knowledge, this is the first research in higher education that formally states and tests how gamification affects students' skills, academic achievement and satisfaction in an enriched active learning setup. Additionally, the literature approaching the general concept of gamification has not yet presented a definitive classification of which game design elements to use within an effective gamified experience (Landers et al. 2018;Sailer et al. 2017). The most recurrent within higher education are points, badges, and league tables (Alomari et al. 2019;Subhash and Cudney 2018). Likewise, there are multiple studies on the effects of gamification that only make use of the Kahoot platform (e.g., Bicen and Kocakoyun 2018;Göksün and Gürsoy 2019;Pertegal-Felices et al. 2020). In them, the teacher creates the quizzes and the students just respond.
This research goes beyond these works, creating a gameful experience that is characterized by: (i) encompassing the totality of our active learning setup, (ii) empowering students by allowing them to participate in the design of the gamification setup itself, and (iii) supporting students' overall value creation through the production of academic contents with which they actively construct their own meaning. This approach moves away from "pointsification", that is, using only points, badges, and league tables, which can lose effectiveness in the long run once the novelty effect wears off (Tsay et al. 2020). This gameful experience seems able to generate not only extrinsic but also intrinsic motivation, which translates into a satisfactory student experience with active learning, promoting the development of skills and competencies without harming final marks or affecting the quality and learning standards demanded by the academia.
With reference to limitations, we should consider the following. First, the study employs perceptual measurement scales together with objective measures, which is not yet common practice in the higher education literature but enhances the robustness of our results. Furthermore, common method variance was not a problem in this research. Second, the theoretical framework was tested using a sample of 132 students, and we understand that future research should test our results in different settings, especially within internationalized environments with heterogeneous and culturally diverse student groups. Third, the use of active learning setups will probably increase the workloads of students and lecturers. Fourth, we did not have pre-test data before implementing our intervention, and we acknowledge this circumstance as a limitation of the study.
Finally, with regard to future lines of research, there is no doubt that it would be very useful to be able to define a specific gamification measurement scale for higher education. Such a scale would contribute to standardize the application of gamification in the university context, facilitating its application and encouraging the emergence of more relevant research in this field. In line with the generation of more empirical evidence on the usefulness of active learning approaches in university classrooms, it would also be desirable to incorporate into the agenda for future research the exploration of the existing connections between active learning, gameful experiences and flow theory. The analysis of the flow mechanisms that most propitiate the full involvement and enjoyment of the student with enriched active learning setups as the one presented in this study, would contribute to enhance its beneficial effects, to favor its application in the university context, and so to prepare better the new generation of students for the challenges of the digital society of the twenty-first century. The use of artificial intelligence in the context above would also enrich our knowledge of approaching the teaching-learning process from a complementary perspective that deserves attention and it has the potential to change education, as it is understood today.

In your opinion, the teaching methodology used in the module has helped you to… (1: Completely disagree - 7: Completely agree)
Block of questions: Block 1: Skills (SKI)
Item  Definition
SKI1  Improve the ability to work in groups
SKI2  Improve the ability to listen to others' opinions
SKI3  Improve self-learning ability
SKI4  Improve the ability to apply knowledge in practice
SKI5  Improve the ability to analyze (ability to distinguish and separate the parts of a whole)
SKI6  Improve the ability to synthesize (ability to form a whole from its elements)