
Empowering ChatGPT with guidance mechanism in blended learning: effect of self-regulated learning, higher-order thinking skills, and knowledge construction

Abstract

In the evolving landscape of higher education, challenges such as the COVID-19 pandemic have underscored the necessity for innovative teaching methodologies. These challenges have catalyzed the integration of technology into education, particularly in blended learning environments, to bolster self-regulated learning (SRL) and higher-order thinking skills (HOTS). However, increased autonomy in blended learning can lead to learning disruptions if issues are not promptly addressed. In this context, OpenAI's ChatGPT, known for its extensive knowledge base and immediate feedback capability, emerges as a significant educational resource. Nonetheless, there are concerns that students might become excessively dependent on such tools, potentially hindering their development of HOTS. To address these concerns, this study introduces the Guidance-based ChatGPT-assisted Learning Aid (GCLA). This approach modifies the use of ChatGPT in educational settings by encouraging students to attempt problem-solving independently before seeking ChatGPT assistance. When engaged, the GCLA provides guidance through hints rather than direct answers, fostering an environment conducive to the development of SRL and HOTS. A randomized controlled trial (RCT) was employed to examine the impact of the GCLA compared to traditional ChatGPT use in a foundational chemistry course within a blended learning setting. This study involved 61 undergraduate students from a university in Taiwan. The findings reveal that the GCLA enhances SRL, HOTS, and knowledge construction compared to traditional ChatGPT use. These results align directly with the research objective of improving learning outcomes by having ChatGPT provide guidance rather than answers. In conclusion, the introduction of the GCLA has not only facilitated more effective learning experiences in blended learning environments but also ensured that students engage more actively in their educational journey. The implications of this study highlight the potential of ChatGPT-based tools in enhancing the quality of higher education, particularly in fostering essential skills such as self-regulation and HOTS. Furthermore, this research offers insights into the more effective use of ChatGPT in education.

Introduction

The onset of COVID-19 accelerated the adoption of blended learning in higher education, merging traditional and online teaching methodologies. This approach has been crucial in meeting the complex demands of university courses, ensuring educational continuity, facilitating access to resources, enabling virtual collaboration, and combining online and in-person problem-solving strategies (Mali & Lim, 2021; Menon & Azam, 2021). Moreover, blended learning emphasizes the importance of self-regulated learning (SRL), as proposed in the model by Zimmerman (1990), which is essential for students in complex academic environments. Higher-order thinking skills (HOTS) are also vital, fostering critical analysis, idea synthesis, and innovative problem-solving, skills imperative in academia and professional practice (Hwang et al., 2019). However, the autonomous nature of blended learning can pose challenges in maintaining consistent teacher support, which may impact the development of students' SRL and HOTS (Rasheed et al., 2020).

The introduction of ChatGPT by OpenAI in late 2022 has transformed the blended learning landscape. Incorporating ChatGPT offers both opportunities and challenges (Labadze et al., 2023). It provides immediate feedback and a plethora of information, enhancing learning efficiency and allowing for personalized educational paths (Stojanov, 2023; Wu et al., 2023a). Nevertheless, it could also foster a dependency that might impede the development of HOTS. The ease of accessing information and problem-solving assistance through tools like ChatGPT could discourage students from engaging in thorough thinking or independent problem-solving (Chan & Hu, 2023; Ding et al., 2023). White and Gunstone (2014) underscored the importance of prediction in knowledge acquisition, suggesting that learners should initially hypothesize and develop their own solutions and viewpoints, then validate their assumptions through observation and analysis.

Although ChatGPT's capabilities have revolutionized pedagogical methods and learning dynamics at various educational levels, including higher education, the absence of structured guidelines may limit the development of HOTS, potentially affecting the efficacy of blended learning. To address this, this study introduces the guidance-based ChatGPT-assisted learning aid (GCLA), designed specifically for higher education. The GCLA requires students to articulate their initial thoughts and perspectives before consulting ChatGPT. Then, they iteratively refine their responses using ChatGPT's feedback until a well-substantiated answer is developed. This approach promotes deeper engagement with the subject matter, thus enhancing knowledge construction and educational outcomes in higher education blended learning environments.

This study aims to tackle the critical issue of students' excessive reliance on ChatGPT in higher education blended learning settings, and its impact on their SRL, HOTS, and knowledge construction. It also evaluates the effectiveness of the GCLA, as compared to traditional ChatGPT usage, in enhancing these aspects among learners. The study seeks to answer the following research questions:

  1. How does the GCLA, compared to traditional ChatGPT use, affect the SRL of higher education students in blended learning environments?

  2. How does the GCLA, compared to traditional ChatGPT use, influence the development of HOTS in these students?

  3. How does the GCLA, compared to traditional ChatGPT use, impact knowledge construction in higher education students in these environments?

Literature review

Self-regulated learning in blended learning

Self-regulated learning (SRL) is a framework that empowers learners to autonomously steer their educational paths. Originally delineated by Zimmerman (1990), the SRL model is segmented into three pivotal phases: forethought, performance, and self-reflection. The forethought phase is the earliest and arguably the most consequential phase of the learning process: learners analyze the learning task, set goals, and develop methods to achieve them, often adjusting their learning motivation along the way to ensure they have enough drive to complete the task. The performance phase is where learners actually engage in learning activities; they must participate actively while monitoring and adjusting their own learning behavior to ensure that the expected learning goals are met. Finally, in the self-reflection phase, learners review and evaluate their own effectiveness and learning outcomes in order to gain further learning experience and knowledge (Pintrich, 2000; Zimmerman, 2008).

In blended learning environments, which integrate face-to-face and online instruction, learners are granted the autonomy to tailor their learning objectives and pace (Rasheed et al., 2020; Snodin, 2013). These environments are well-aligned with SRL, particularly facilitating the performance phase by providing opportunities for learners to practice and apply their strategies. Students adeptly navigate their learning journeys, employing the most suitable strategies and resources for their needs (Wu et al., 2023a). Within such settings, as noted by Rasheed et al. (2020), SRL necessitates significant self-discipline and initiative, especially online where learners choose their engagement levels, a dynamic essential for the self-reflection phase, as discussed by Hood et al. (2015) and grounded in Zimmerman's (1990) work.

Motivation, deeply intertwined with the forethought phase, drives individuals to engage fully with the SRL process and reach their goals. This link between motivation and goal-setting is supported by a robust body of research highlighting motivation's central role in self-regulation (Wu et al., 2023a; Zhu et al., 2020). Engagement, critical in the performance phase, stems from active involvement in learning tasks and is closely tied to the strategic application of learning techniques and interaction with educational content and peers. Research has shown that such engagement is vital in reinforcing SRL and educational dedication (Hershcovits et al., 2020; Wu et al., 2023a). Finally, self-efficacy, particularly relevant in the self-reflection phase, reflects one's confidence in successfully completing learning tasks and is strongly correlated with self-regulatory skills and academic outcomes. This sense of self-efficacy fosters the ability to embrace challenges and navigate the self-regulatory process effectively, a stance that is widely recognized in academic literature (Salah Dogham et al., 2022; Wu et al., 2023a).

In summary, motivation, engagement, and self-efficacy are vital pillars of SRL, corresponding to its phases and collectively nurturing learner autonomy and proactive behavior. These factors not only amplify the learners’ intrinsic drive but also solidify their learning proficiency, ensuring adept management of learning experiences within the rich contexts of blended learning environments.

Higher-order thinking skills

In recent years, higher-order thinking skills (HOTS) have increasingly become a focal point in higher education on a global scale (Lu et al., 2021a, 2021b). The discourse in this sector has evolved to consider HOTS indispensable for navigating the complexities of modern society (Lu et al., 2021a, 2021b). These skills, which go beyond basic memory and comprehension, include advanced cognitive processes such as critical thinking, problem-solving, and creativity (Hwang et al., 2018).

In the realm of higher education, critical thinking is essential for students to objectively analyze and evaluate information, leading to informed decisions (Brookhart, 2010; Lu et al., 2021a, 2021b). This level of scrutiny encourages learners to not just passively accept information but to actively engage in questioning and appraising its validity and utility (Krathwohl, 2002).

Problem-solving, a key component of HOTS, is particularly relevant in higher education as it involves identifying complex issues, gathering and scrutinizing data, proposing potential solutions, and selecting the most effective ones (Hwang & Lai, 2017; Lu et al., 2021a, 2021b). In an evolving global landscape, the ability to address new and unprecedented challenges is paramount, equipping students for real-life and workplace scenarios where standard solutions may not apply.

Creativity, integral to HOTS, is championed in higher education as a means of thinking outside the norm and generating innovative, impactful ideas (Hwang & Lai, 2017; Lu et al., 2021a, 2021b). Encouraging creativity allows students to surpass conventional thinking, leading to flexible strategies in problem-solving and pioneering advances (Sternberg, 2003).

When these elements are integrated into the higher education experience, a rich and varied learning environment is created (Cheng et al., 2020). This not only nurtures the development of HOTS through educator and peer interaction but also promotes autonomous and personalized learning pathways (Chen et al., 2023; Jansen & Möller, 2022). Such integration is vital for preparing learners for the twenty-first century, emphasizing the importance of applying HOTS in everyday life and professional pursuits. Consequently, developing appropriate pedagogical approaches and tools to enhance higher-order thinking within the context of blended learning in higher education remains an area of significant interest and activity.

ChatGPT in education

In recent years, the rapid evolution of large language models has profoundly influenced the domain of natural language processing (NLP). Foremost among these innovations, the Transformer model has surpassed earlier frameworks such as LSTM and RNN in prominence and efficacy (Vaswani et al., 2017). This pivotal shift catalyzed the development of notable pre-trained models like BERT (Ettinger, 2020) and GPT (Radford et al., 2018). In particular, GPT-3, with its unprecedented scale, set benchmarks by leveraging billions of parameters to capture intricate language nuances (Dale, 2021; Zhang & Li, 2021). Building on this foundation, GPT-3.5 introduced directive learning and Reinforcement Learning from Human Feedback (RLHF) to further optimize model performance (Abramski et al., 2023; Wu et al., 2023a).

Built upon GPT-3.5, ChatGPT has been increasingly adopted within the educational sector. Investigations by Kasneci et al. (2023) elucidate how ChatGPT engenders engagement and fosters a more interactive learning paradigm. Moreover, Jeon and Lee (2023) scrutinized the synergy between ChatGPT and teachers, emphasizing their mutual complementarity in the educational arena and probing how ChatGPT might support teachers in classroom facilitation. Furthermore, the utility of ChatGPT extends to linguistic instruction, as illustrated by Kohnke et al. (2023). Its efficacy as a supplementary educational tool has been further underscored through self-assessment studies (Stojanov, 2023). Innovations like CILA, tailored for blended learning environments, harness the capabilities of ChatGPT to furnish students with precise, on-demand answers—a clear departure from the vast yet unspecific information of traditional search engines like Google (Wu et al., 2023a).

However, the unparalleled potential of ChatGPT in education is not without challenges (Adeshola & Adepoju, 2023). Prominent among these is the risk of students becoming overly reliant on its responses, which could potentially stifle their critical thinking and independent problem-solving acumen (Cooper, 2023). Such dependence could dilute the depth and richness of their educational experiences and compromise their acquisition of vital problem-solving techniques (Montenegro-Rueda et al., 2023). Consequently, it becomes imperative for educators to exercise prudence in integrating ChatGPT into curricula. Implementing comprehensive and pragmatic guidelines for its usage is essential to ensure that digital learning continues to evolve responsibly. With judicious management, researchers can harness the full potential of ChatGPT while curtailing its potential pitfalls, fostering a robust, effective, and accountable educational landscape.

The design of guidance-based ChatGPT-assisted learning aid (GCLA)

This study introduces GCLA, an innovative educational tool designed for blended learning environments, and highlights its potential applications and benefits. The tool combines ChatGPT's broad response capabilities with the device-level automation of Apple's Shortcuts. GCLA harnesses ChatGPT's comprehensive knowledge to address a broad array of student inquiries, and its integration with Shortcuts provides fluid interaction across Apple devices and ensures prompt, personalized responses. A key aspect of GCLA is its 'learning log file,' which records past inquiries to support reflective learning. Rather than giving direct answers, GCLA prompts students to formulate their own, offering insightful hints to aid problem-solving. This cultivates deep engagement with the material, fostering HOTS. Figure 1 depicts the workings of GCLA, showing its components and their interconnections.

Fig. 1 The procedure of GCLA

Figure 2 visually depicts the GCLA workflow, illustrating how learners interact with the tool. This study aims to underscore the unique attributes and advantages of GCLA in blended learning environments, providing insight into its functionality and its potential to improve existing educational practice.

Fig. 2 The workflow of GCLA

The implementation of GCLA

The GCLA represents an innovative step in educational assistance, fostering a more interactive relationship between learners and technology. Unlike traditional tools that merely provide answers, GCLA prompts users to first articulate their own solutions through Apple's Shortcuts. This method ensures active engagement with the subject matter, fostering deeper comprehension and enhancing HOTS. The GCLA service, designed to run on Apple devices with iOS 12 or above, streamlines the learning process with automated workflows and personalized interactions.

Central to GCLA is Apple's Shortcuts, a user-friendly platform that enables custom automation of tasks. The Shortcuts interface, illustrated in Fig. 3, allows learners to easily create workflows for a variety of functions, such as messaging, weather updates, or voice commands through Siri—Apple's virtual assistant—all without needing extensive programming knowledge.

Fig. 3 The development interface of Shortcuts

GCLA also incorporates the ChatGPT engine, specifically the "gpt-3.5-turbo-16k" model, to manage queries. This study set precise parameters for ChatGPT, detailed in Table 1. These settings include a max_tokens limit of 4000 to allow detailed responses, a temperature of 0.6 for varied replies, and a presence_penalty of 0.2 to reduce repetition and promote novelty.

Table 1 The parameters of ChatGPT in GCLA

Learners wishing to use GCLA's ChatGPT feature must obtain an authentication key from OpenAI's official website. After registration, they can submit questions and their own proposed answers through the GCLA interface. This step is crucial as it compels the learner to engage with the content before receiving feedback. The input is processed by ChatGPT, which, once authenticated, provides customized guidance according to the preset parameters.
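Although GCLA is assembled from Apple Shortcuts actions rather than hand-written code, the request it sends to the ChatGPT engine can be sketched in Python for clarity. The parameter values below follow Table 1; the system prompt wording, the request_hint helper, and the use of the official openai Python SDK are assumptions made for illustration rather than implementation details reported here.

```python
from openai import OpenAI  # official OpenAI Python SDK (v1.x interface)

client = OpenAI(api_key="YOUR_OPENAI_KEY")  # learners supply their own authentication key

# Hypothetical system prompt encoding GCLA's rule of giving hints, not answers.
GUIDANCE_PROMPT = (
    "You are a learning aid for a first-year chemistry course. "
    "The student sends a problem and their own attempted answer. "
    "Judge whether the attempt is correct. If it is not, do NOT give the "
    "final answer; give one concrete hint that moves the student forward."
)

def request_hint(problem: str, student_attempt: str) -> str:
    """Send the problem plus the student's attempt and return ChatGPT's guidance."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-16k",   # engine named in this study
        max_tokens=4000,             # Table 1: room for detailed guidance
        temperature=0.6,             # Table 1: some variety in replies
        presence_penalty=0.2,        # Table 1: discourage repetition
        messages=[
            {"role": "system", "content": GUIDANCE_PROMPT},
            {"role": "user", "content": f"Problem: {problem}\nMy answer: {student_attempt}"},
        ],
    )
    return response.choices[0].message.content

# Example interaction mirroring the propane problem discussed later in this paper.
hint = request_hint(
    "Balance the combustion of propane: C3H8 + O2 -> CO2 + H2O",
    "C3H8 + O2 -> CO2 + H2O",
)
print(hint)
```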

Additionally, GCLA includes a learning log file to record all questions and answers, supporting self-regulation and reflection. This log acts as a useful tool for revisiting material, aiding in concept recall and memory retention, and establishing a basis for post-class reflection. In essence, GCLA is an integrated learning solution that combines the adaptability of Apple's Shortcuts with ChatGPT's intelligent query processing, delivering a tool that not only demystifies complex learning but also considerably enriches the educational journey.
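The learning log can likewise be pictured as a simple append-only record. The JSON Lines layout and field names below are illustrative assumptions; the study only specifies that questions, answers, and guidance are stored for later reflection.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

LOG_FILE = Path("gcla_learning_log.jsonl")  # hypothetical file name

def log_interaction(problem: str, attempt: str, guidance: str) -> None:
    """Append one question/attempt/guidance triple for post-class reflection."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "problem": problem,
        "student_attempt": attempt,
        "guidance": guidance,
    }
    with LOG_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```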

The example of using GCLA

To demonstrate the practical application of GCLA, this example showcases a student using it while studying chemical reactions. Confronted with a problem they could not solve on their own, the student employed GCLA for guidance and to deepen their understanding. The problem was: "What is the balanced equation for the combustion of propane (C3H8) in oxygen (O2) to produce carbon dioxide (CO2) and water (H2O)?" (refer to Fig. 4).

Fig. 4 Inputting the problem into GCLA

Upon accessing the GCLA application on their iPad, the student entered the problem and attempted an initial solution, drawing on prior knowledge. Their response was: "C3H8 + O2 → CO2 + H2O" (see Fig. 5).

Fig. 5 Submitting an initial answer in GCLA

Following this, GCLA processed the input and interfaced with the ChatGPT engine, subsequently presenting a hint: “Your answer is not correct. However, I will give you a hint on how…” (illustrated in Fig. 6).

Fig. 6 The guidance from GCLA

Upon reviewing this hint, the student recognized the necessity of atom counting. They identified three carbon atoms, eight hydrogen atoms, and two oxygen atoms on the left, contrasted with one carbon atom, two hydrogen atoms, and three oxygen atoms on the right. Observing the disparity, the student revised their answer to: "C3H8 + 5O2 → 3CO2 + 4H2O".

Re-evaluating the oxygen atoms, the student noted an equal count of ten on both sides, affirming the accuracy of their revised answer. This was corroborated by the app, which displayed: “Your idea is correct! You have…” (refer to Fig. 7).
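For completeness, the atom bookkeeping behind this exchange can be written out as a single balanced equation with its tally; this is a restatement of the arithmetic above, not output generated by GCLA.

```latex
% Balanced combustion of propane with an explicit atom tally
\begin{aligned}
  &\mathrm{C_3H_8 + 5\,O_2 \;\longrightarrow\; 3\,CO_2 + 4\,H_2O}\\[2pt]
  &\text{C: } 3 = 3 \times 1, \qquad
   \text{H: } 8 = 4 \times 2, \qquad
   \text{O: } 5 \times 2 = 3 \times 2 + 4 \times 1 = 10.
\end{aligned}
```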

Fig. 7 Refining the answer with GCLA guidance until correct

Through the feedback and hints from the GCLA app, the student not only resolved the problem but also engaged in reflective learning. By reviewing the learning log, which documented the problem, responses, and received guidance, the student assessed their learning process, pinpointed strengths and weaknesses, and set future objectives, such as tackling more complex problems and expanding their knowledge of various chemical reactions.

The comparison of ChatGPT on iOS

In May 2023, OpenAI launched a ChatGPT mobile application for iOS (with an Android version following later that year), aiming to provide a seamless user experience for applying ChatGPT to everyday challenges. Because GCLA is available only within the iOS ecosystem, this study focuses exclusively on comparing it with ChatGPT on iOS. The user interface of this application is illustrated in Fig. 8.

Fig. 8 The interface of ChatGPT on iOS

A notable feature at the top of the application allows users to toggle between the GPT-3.5 and GPT-4 models, the latter being accessible only to those with an active subscription. Like GCLA, the application lets learners relay their inquiries through either textual or voice input. Because GPT-3.5 is the most widely used model and was the engine used during GCLA's development, this study relied on the GPT-3.5 version of ChatGPT on iOS. Upon query submission, the selected GPT model returns an immediate response. This interactive mechanism mirrors that of GCLA, with one distinction: GCLA requires learners to offer their preliminary solutions after submitting a question, before any guidance is given.

Methodology

Research design

A randomized controlled trial (RCT) is a scientific study design that involves allocating participants into different groups using randomization (Stanley, 2007). This method is widely regarded as the 'gold standard' for evaluating the efficacy of new interventions or treatments. In an RCT, participants are randomly assigned to either a treatment group, which receives the intervention, or a control group, which receives a standard treatment or a placebo. This random allocation helps to minimize biases and ensures that the groups are comparable at the start of the study. The outcomes of these groups are then compared to determine the effectiveness of the intervention (Stolberg et al., 2004).

In this study, we employed an RCT to assess the impact of the GCLA on student performance in a blended learning environment. The study was conducted in a first-year university chemistry course in southern Taiwan, involving 61 students, comprising 31 males and 30 females. These participants were randomly divided into two groups: the Treatment Group (TG), consisting of 16 males and 15 females, and the Control Group (CG), with 15 males and 15 females. The TG interacted with the GCLA, a tool developed for this study to enhance blended learning. Unlike the traditional ChatGPT, the GCLA does not provide direct answers but rather encourages students to propose their solutions first. It then offers guidance to promote critical thinking and the iterative improvement of their answers. In contrast, the CG used the standard ChatGPT application on iOS. To ensure consistency in terms of hardware, all students were provided with iPads for the duration of the experiment. Both groups were taught by the same instructor, used identical teaching materials, and were in the same learning environment. The only variable was the use of either GCLA or traditional ChatGPT. This experiment was seamlessly integrated into the existing chemistry curriculum without requiring additional courses or incentives. All participants were fully informed about the study's objectives, methods, potential risks, and benefits through detailed consent forms. They were also assured that they could withdraw from the study at any time without any penalty or negative consequences.
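As a concrete illustration of this allocation step, the sketch below shows one way a gender-balanced random split of 61 students into a TG of 31 and a CG of 30 could be produced. It is a minimal example with an arbitrary seed, not the authors' actual randomization procedure.

```python
import random

def stratified_assign(student_ids, genders, seed=42):
    """Randomly split students into TG and CG, balancing gender (illustrative only)."""
    rng = random.Random(seed)
    tg, cg = [], []
    for gender in set(genders):
        group = [s for s, g in zip(student_ids, genders) if g == gender]
        rng.shuffle(group)
        half = (len(group) + 1) // 2   # odd strata give the extra student to TG
        tg.extend(group[:half])
        cg.extend(group[half:])
    return tg, cg

# 31 males and 30 females, as in the study sample
students = list(range(1, 62))
genders = ["M"] * 31 + ["F"] * 30
tg, cg = stratified_assign(students, genders)
print(len(tg), len(cg))  # 31 and 30, i.e. 16M+15F versus 15M+15F
```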

Population

The population of this study consisted of first-year undergraduate students enrolled in a chemistry course at a university in southern Taiwan. The course was a compulsory general education course for College of Science students, regardless of their major. The course covered basic concepts and principles of chemistry, such as atomic structure, the periodic table, chemical bonding, chemical reactions, and stoichiometry.

Sample size and sampling technique

The sample size of this study was 61 students, which was determined by the availability of the participants and the feasibility of the experiment. The sampling technique used in this study was convenience sampling, which is a non-probability sampling method that selects participants based on their accessibility and willingness to participate (Creswell & Creswell, 2017). Convenience sampling was chosen because it was the most practical and economical way to recruit participants for this study, given the time and resource constraints.

Measurement

In this study, pre-tests, post-tests, and delayed tests were utilized to measure participants' advancement in chemistry knowledge. The researchers utilized a multiple-choice questionnaire consisting of 20 items, each valued at 5 points. This questionnaire was developed collaboratively by two expert chemistry educators, both possessing over a decade of teaching experience. 'Knowledge Construction', as defined in this study, is based on the premise that individuals progressively enhance their understanding of a subject (van Kesteren & Meeter, 2020), evolving from simple information acquisition to in-depth comprehension through active cognitive engagement, leading to new insights that integrate with existing cognitive structures (Gan et al., 2020). The delayed test was particularly revealing regarding the long-term retention of knowledge. By administering pre-, post-, and delayed tests, researchers can more accurately assess how learners acquire and retain chemistry knowledge, thus mapping the progression of their knowledge construction more clearly.

The higher-order thinking skills scale developed by Hwang et al. (2018) was utilized for a comprehensive assessment of higher-order thinking skills. This scale includes eleven items that correspond to three core dimensions: critical thinking, problem-solving, and creativity. Critical thinking involves reflective thinking and informed judgment-making, as defined by Hwang et al. (2018). Problem-solving focuses on the thorough gathering and analysis of information to overcome challenges effectively. The third dimension, creativity, highlights the ability to generate and develop new ideas. The importance of this instrument, built upon the framework by Hwang et al. (2018), is significant. It equips both researchers and educators with a powerful tool to explore and enhance individual competencies in these crucial domains.

The method proposed by Wu et al. (2023a) was applied to measure SRL, focusing on motivation, engagement, and self-efficacy, each aligning with the forethought, performance, and self-reflection phases of SRL, respectively. Wu et al. (2023a) underline the importance of sustaining motivation in the forethought phase (Pintrich, 2000; Zimmerman, 2000, 2008), actively engaging in the performance phase (Bernardo et al., 2022; Doo & Bonk, 2020), and enhancing self-efficacy during self-reflection (Rabin et al., 2020; Stephen et al., 2020). The Situational Motivation Scale (SIMS) by Guay et al. (2000) was employed, which differentiates motivation into intrinsic motivation, identified regulation, external regulation, and amotivation. Intrinsic motivation is driven by personal interest, identified regulation by a recognition of relevance, external regulation by rewards or pressures, and amotivation reflects a lack of motivation. For engagement, the Math and Science Engagement Scales proposed by Wang et al. (2016) were adapted, dividing engagement into cognitive, behavioral, and emotional aspects. Cognitive engagement involves self-regulation and strategy use, behavioral engagement involves participation and positive conduct, and emotional engagement involves positive feelings towards the educational environment. The New General Self-Efficacy Scale proposed by Chen et al. (2001) was adapted, defining self-efficacy as confidence in mobilizing resources to meet situational demands, particularly in an academic context. The SRL scale, comprising three primary dimensions and nine sub-dimensions, uses a five-point Likert scale for responses. Its reliability and validity were confirmed in prior studies (Chen et al., 2001; Guay et al., 2000; Wang et al., 2016).

Reliability of the measurement

The reliability of the instruments used in this study was established through various methods. For the chemistry knowledge questionnaire, the content validity was ensured by the expert judgment of two experienced chemistry educators, who reviewed the items and provided feedback on their clarity, relevance, and difficulty. The reliability was assessed by calculating the Cronbach's alpha coefficient, which yielded a value of 0.79, indicating an acceptable level of internal consistency.
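For reference, Cronbach's alpha can be computed from a respondents-by-items score matrix in a few lines; the following is a generic sketch with made-up demonstration data, not the analysis script used in the study.

```python
import numpy as np

def cronbach_alpha(scores) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                                  # number of items
    item_variances = scores.var(axis=0, ddof=1).sum()    # sum of item variances
    total_variance = scores.sum(axis=1).var(ddof=1)      # variance of total scores
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Example: hypothetical Likert responses (5 students x 4 items)
demo = np.array([[4, 5, 4, 5],
                 [3, 3, 4, 3],
                 [5, 5, 5, 4],
                 [2, 3, 2, 3],
                 [4, 4, 5, 4]])
print(round(cronbach_alpha(demo), 2))
```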

For the higher-order thinking skills scale, the content validity was based on the theoretical framework proposed by Hwang et al. (2018), which identified three core dimensions of higher-order thinking skills: critical thinking, problem-solving, and creativity. Following the translation into Chinese, a reliability analysis was performed, as detailed in Table 2. The reliability was measured by computing the Cronbach's alpha coefficients for each dimension, which ranged from 0.72 to 0.81, demonstrating an acceptable level of internal consistency.

Table 2 Reliability analysis of higher-order thinking scale

For the self-regulated learning scale, the content validity was derived from the model of SRL proposed by Wu et al. (2023a), which aligned the three phases of SRL with three key factors: motivation, engagement, and self-efficacy. Following the translation into Chinese, a reliability analysis was performed, as detailed in Table 3. The reliability was evaluated by calculating the Cronbach's alpha coefficients for each factor, which ranged from 0.75 to 0.88, indicating an acceptable level of internal consistency.

Table 3 The reliability of the self-regulated learning scale

Details of pre-test and post-test

To assess the impact of GCLA on students' knowledge construction, self-regulated learning (SRL), and higher-order thinking skills (HOTS), this study employed a series of tests: pre-tests, post-tests, and delayed tests. These tests served as evaluative instruments. The pre-tests, conducted before the intervention, and the post-tests, administered immediately after the intervention, each had a duration of one hour. The delayed tests, taking place two weeks following the intervention, were 20 min long. All tests were conducted online using Google Forms.

The tests evaluating knowledge construction consisted of 20 multiple-choice questions, each worth 5 points, focusing on fundamental concepts and principles of chemistry, resulting in a total possible score of 100 points for each test. The format of the questions in the pre-tests, post-tests, and delayed tests remained consistent, with variations only in the questions, options, and numerical values in the questions.

The SRL assessment comprised 46 items across three domains: motivation (20 items), engagement (18 items), and self-efficacy (8 items). These items were adapted from the framework of Wu et al. (2023a), integrating scales developed by Guay et al. (2000), Wang et al. (2016), and Chen et al. (2001). Responses were recorded on a five-point Likert scale, ranging from 1 (strongly disagree) to 5 (strongly agree).

The HOTS assessment included 11 items, aligned with three dimensions: critical thinking, problem-solving, and creativity, adapted from the scale developed by Hwang et al. (2018). Responses were also recorded using a five-point Likert scale.

Instruction methodology

The instruction methodology was based on the self-regulated learning (SRL) model proposed by Zimmerman (1990), which consists of three phases: forethought, performance, and self-reflection. The course involved learners in three-hour weekly classes for ten weeks, following the SRL framework; the overall experimental procedure is shown in Fig. 9.

  • Week 1: Forethought phase

The instructor covered core chemistry concepts and conducted a pre-test to evaluate learners' preliminary understanding. Learners established goals like mastering the periodic table and chemical bonding, coordinating these with their other duties.

  • Weeks 2 to 9: Performance phase

Learners dedicated these weeks to achieving their academic objectives, utilizing online tools and resources. Concurrently, weekly three-hour in-person classes promoted teamwork in problem-solving and permitted real-time learning strategy modifications.

  • Week 10: Self-reflection phase

Learners appraised the knowledge gained throughout the course. A post-test assessed their comprehension, alongside a survey examining their SRL, HOTS, and chemistry knowledge. A delayed test in Week 15 was designed to evaluate the persistence of their acquired knowledge. The combined results of the pre-test, post-test, and delayed test provide a comprehensive measure of the learners' knowledge construction. Figure 9 details the entire experimental protocol.

Fig. 9 Experimental procedure

The TG used GCLA to navigate the performance phase's challenges, with learning logs aiding Week 10's reflective activities. Conversely, the CG addressed problems using ChatGPT on iOS during the same phase and relied on memory for reflective tasks in the final week. Table 4 delineates the variations in SRL dynamics between the TG and CG throughout the study's duration.

Table 4 Differences between groups in SRL at different phases

Variables

The variables in this study included the independent variable, the dependent variables, and the covariates. The independent variable was the type of learning tool used by the students: GCLA or ChatGPT. The dependent variables were the students’ scores on the SRL, HOTS, and chemistry knowledge tests. The covariates were the students’ scores on the pre-tests, which were used to control for the initial differences between the groups.

Methods of analysis and statistical tools

The methods of analysis and statistical tools used in this study were as follows:

  • Descriptive statistics: To describe the sample characteristics and the scores of the dependent variables for each group.

  • Analysis of covariance (ANCOVA): To compare the mean scores of the dependent variables between the groups, adjusting for the effects of the covariates.

  • Effect size: To measure the magnitude of the difference between the groups, using partial eta-squared (η²) as the index.

  • Statistical software: To perform the data analysis, using JAMOVI version 2.4 (The jamovi project, 2023); an illustrative Python approximation of these analyses is sketched after this list.
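The analyses themselves were run in JAMOVI; for readers who prefer a scriptable equivalent, the same checks can be approximated in Python as sketched below. The file name and column labels are assumptions for illustration, not the authors' data layout.

```python
import pandas as pd
import pingouin as pg
from scipy.stats import levene

# Hypothetical layout: one row per student with a group label and test scores
df = pd.read_csv("scores.csv")  # assumed columns: group ("TG"/"CG"), pretest, posttest

# Levene's test for homogeneity of variances on the post-test scores
tg_scores = df.loc[df["group"] == "TG", "posttest"]
cg_scores = df.loc[df["group"] == "CG", "posttest"]
print(levene(tg_scores, cg_scores))  # p > .05 supports the equal-variance assumption

# ANCOVA: post-test as dependent variable, pre-test as covariate, group as the factor;
# pingouin reports partial eta-squared in the "np2" column
ancova_table = pg.ancova(data=df, dv="posttest", covar="pretest", between="group")
print(ancova_table[["Source", "F", "p-unc", "np2"]])
```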

Results

The impact of GCLA on self-regulated learning (SRL)

To investigate the potential impact of introducing GCLA on students' SRL in a blended learning setting, Analysis of Covariance (ANCOVA) was employed in this study. The SRL scores from the pre-tests served as covariates, and the post-test SRL scores were treated as the dependent variables. Prior to the main analysis, the assumption of homogeneity of variances was confirmed using Levene's test. The results are outlined in Table 5. As seen in Table 5, all sub-dimensions of SRL have p-values exceeding 0.05. These findings support the equal-variance assumption and confirm the appropriateness of using ANCOVA for the analysis.

Table 5 The Levene's test results for SRL

Tables 6 and 7 respectively present the descriptive statistics and ANCOVA analysis results for SRL. As evidenced in Table 7, statistically significant differences were observed in intrinsic motivation, amotivation, cognitive engagement, behavioral engagement, and self-efficacy. From the descriptive statistics in Table 6, it can be discerned that the CG notably outperformed the TG in terms of intrinsic motivation (M_EG = 18.9 > M_CG = 16.1) and amotivation (M_EG = 8.53 < M_CG = 12.0, this being an adverse indicator). Conversely, the TG surpassed the CG in cognitive engagement (M_EG = 34.5 > M_CG = 27.4), behavioral engagement (M_EG = 20.1 > M_CG = 17.4), and self-efficacy (M_EG = 31.2 > M_CG = 25.8). Thus, in addressing Research Question 1, while the introduction of GCLA may not bolster students' motivational aspect in SRL (Forethought phase), it does indeed foster positive effects on their engagement level (Performance phase) and self-efficacy (Self-reflection phase).

Table 6 Descriptive results for SRL
Table 7 ANCOVA results for SRL

The impact of GCLA on higher-order thinking skills (HOTS)

To investigate the potential impact of introducing GCLA on students' HOTS in a blended learning setting, ANCOVA was employed in this study. The HOTS scores from the pre-tests served as covariates, and the post-test HOTS scores were treated as the dependent variables. Prior to the main analysis, the assumption of homogeneity of variances was confirmed using Levene's test. The results are outlined in Table 8. As seen in Table 8, all dimensions of HOTS have p-values exceeding 0.05. These findings support the equal-variance assumption and confirm the appropriateness of using ANCOVA for the analysis.

Table 8 The Levene's test results for HOTS

Tables 9 and 10 provide the descriptive statistics and ANCOVA analysis results for HOTS, respectively. As seen in Table 10, there were statistically significant differences in the dimensions of critical thinking, problem-solving, and creativity. From the descriptive statistics in Table 9, it can be discerned that the TG notably outperformed the CG in terms of critical thinking (M_EG = 17.3 > M_CG = 13.8), problem-solving (M_EG = 15.9 > M_CG = 13.1), and creativity (M_EG = 11.0 > M_CG = 10.1). Thus, in addressing Research Question 2, the introduction of GCLA can be deduced to enhance the HOTS of learners, manifesting in significant improvements in critical thinking, problem-solving, and creativity.

Table 9 Descriptive results for HOTS
Table 10 ANCOVA results for HOTS

The impact of GCLA on knowledge construction

To gauge learners' knowledge construction within a blended learning environment, chemistry comprehension was assessed at three distinct intervals: a pre-test (prior to experimental activity), a post-test (following the experimental activity), and a delayed test (two weeks post-experimental activity). Two ANCOVAs were employed to discern variations in knowledge over these time frames, thereby evaluating students' knowledge construction. The initial ANCOVA treated post-test scores as the dependent variable, with pre-test scores serving as the covariate. This was to discern the disparity in chemistry understanding between the TG and CG subsequent to the experimental intervention. The subsequent ANCOVA utilized delayed test scores as the dependent variable and post-test scores as the covariate, aiming to gauge the retention of knowledge after a two-week span. Prior to the ANCOVA implementation, the Levene's test was executed to confirm the homogeneity of variances, with the results delineated in Table 11. As evident from Table 11, both post-test and delayed test yielded p-values surpassing 0.05, reinforcing the validity of the equal variance assumption and solidifying the justification for ANCOVA in the assessment.

Table 11 The Levene's test results for knowledge construction

To ensure that there were no initial differences in learners' prior knowledge, the study initially conducted an ANOVA on the pre-test scores for chemistry knowledge. The results indicated no significant difference between the TG and the CG in the pre-test scores (F = 0.231, p = 0.632 > 0.05). This suggests that there were no significant differences in prior knowledge between the two groups at the onset of the study. Such a finding is crucial as it establishes a baseline equivalence between the TG and CG, thereby allowing for a more accurate assessment of the GCLA intervention's impact on chemistry comprehension and knowledge retention.

Tables 12 and 13 present the descriptive statistics for the pre-test, post-test, and delayed test, along with the outcomes of two distinct ANCOVAs. As elucidated in Table 13, statistically significant differences exist between the TG and the CG concerning both post-test and delayed test scores. A closer analysis of Table 12 shows that the scores attained by the TG in both the post-test (M_EG = 79.7 > M_CG = 74.7) and delayed test (M_EG = 75.9 > M_CG = 69.3) substantially surpass those of the CG. Addressing Research Question 3, the GCLA intervention markedly enhances learners' comprehension of chemistry and the subsequent retention of this knowledge. Consequently, it is plausible to deduce that GCLA plays a pivotal role in augmenting learners' knowledge construction.

Table 12 Descriptive results for knowledge construction
Table 13 ANCOVA results for knowledge construction

Discussion

The impact of GCLA on self-regulated learning (SRL)

SRL, fundamental to blended learning, plays a crucial role in higher education, where hybrid models are increasingly prevalent. Zimmerman (1990) articulated SRL's phases as Forethought, Performance, and Self-Reflection. Wu et al. (2023a) explored these stages within the context of higher education, evaluating their impact on motivation, engagement, and self-efficacy. The investigation, aligning with these scholarly works, specifically examines the differential effects of GCLA versus traditional ChatGPT on SRL among higher education students, as Tables 6 and 7 illustrate. The data show that those utilizing GCLA exhibited greater cognitive and behavioral engagement, alongside heightened self-efficacy, compared to peers using ChatGPT.

In the realm of higher education, where independent critical thinking is paramount, GCLA's design, which necessitates that students engage thoughtfully with content prior to receiving answers, proved advantageous. This strategy encourages the deployment of advanced cognitive strategies and the swift integration of complex concepts, key in fostering cognitive engagement at this level. This methodology is supported by the "Probing understanding" concept advocated by White and Gunstone (2014) and the inquiry-based framework proposed by Pedaste et al. (2015). Such an approach is particularly effective in higher education, as confirmed by Al Mamun and Lawrie (2023) and Mamun et al. (2020), and aligns with GCLA’s problem-solving orientation.

Enhanced behavioral engagement with GCLA is also critical in higher education settings. The platform's design requires more intensive student interaction, reflecting the "learning-by-doing" pedagogy that is vital in higher learning environments. Students in these settings must not only understand theoretical concepts but also apply them in practical scenarios (Lee et al., 2023a, 2023b; Wu et al., 2023b). This is consistent with the findings of Dellatola et al. (2020) and Kuo et al. (2020), who reported that active engagement practices, such as collaborative problem-solving and peer feedback, improved students' learning outcomes and satisfaction in blended learning courses. Therefore, GCLA can be seen as a valuable tool to facilitate such practices and enhance students' behavioral engagement.

Furthermore, GCLA's record-keeping feature improves self-efficacy among higher education learners by enabling them to monitor their queries and reflect on their learning journey. This reflective practice is vital for students to develop confidence in their academic abilities, which is especially important in higher education where learners are expected to take greater ownership of their learning (Hsia & Hwang, 2020; Menon & Azam, 2021).

However, in terms of intrinsic motivation and amotivation, traditional ChatGPT holds a slight edge, particularly in higher education contexts. Its ability to provide immediate feedback seems to foster motivation and stave off feelings of helplessness, underscoring the impact of timely support on student motivation—a critical factor for student success in higher education, as emphasized by Wu et al. (2023a).

The impact of GCLA on higher-order thinking skills (HOTS)

The increasing acknowledgment of HOTS as indispensable for success in the twenty-first century has been a prominent theme in the field of higher education research (Conklin, 2011). In stark contrast to the passive nature of traditional lecture-based instruction prevalent in higher education, blended learning paradigms afford students a high level of autonomy (Snodin, 2013). This shift necessitates a more deliberate focus on cultivating advanced cognitive skills that enable students to critically analyze educational content, appraise their learning journeys, and craft tailored learning plans. Evidence from Tables 9 and 10 reinforces that GCLA developed in this study notably advances HOTS within higher education's blended learning frameworks, specifically targeting critical thinking, problem-solving, and creativity.

In terms of critical thinking within the higher education context, GCLA systematically provides hints rather than outright solutions, prompting students to iteratively refine their reasoning to reach the correct conclusion. This approach aligns with the dialogic and inquiry-based models that are becoming increasingly prevalent in higher education pedagogy (Al Mamun & Lawrie, 2023; Kuo et al., 2020). It encourages a deeper engagement with material, thus leading students in the TG to develop more robust critical thinking skills than their counterparts in the CG. These findings dovetail with multiple studies that emphasize the importance of discussion-based learning in fostering critical thinking within higher education settings (Al-Husban, 2020; Giacumo & Savenye, 2020; O'Riordan et al., 2021).

When addressing problem-solving abilities, GCLA's structured hinting mechanism compels students to actively engage with problems, reflecting on and revising their approaches. This is particularly valuable in the higher education landscape, where problem-solving is a key learning outcome. The technique employed by GCLA mimics the iterative process fundamental to Problem-Based Learning (PBL), a method well-established and valued in higher education for its efficacy in cultivating problem-solving prowess (Aslan, 2021; Phungsuk et al., 2017; Valentine et al., 2017).

Lastly, regarding creativity, a skill increasingly sought after in higher education graduates, GCLA's methodology nurtures this by offering hints that encourage a multiplicity of perspectives. This contrasts with the more deterministic approach of traditional ChatGPT, fostering a learning environment in higher education where students are prompted to think divergently and conceive innovative solutions. The potential for GCLA to stimulate more creative outcomes is in line with research advocating for the role of brainstorming and creative thinking in higher education (Göçmen & Coşkun, 2022; Gong et al., 2022).

The impact of GCLA on knowledge construction

Knowledge construction is fundamental in higher education, where evaluating students' learning outcomes is paramount. This study employed a multi-stage assessment approach comprising pre-tests, post-tests, and delayed tests to track students' knowledge acquisition in higher education settings. As delineated in Tables 12 and 13, higher education learners using the GCLA displayed superior performance in both post-test and delayed test evaluations when compared to peers using conventional ChatGPT tools.

The GCLA framework insists that learners in higher education contexts propose their own responses before receiving any assistance, offering hints instead of outright answers. This process promotes active engagement and allows students to develop, assess, and hone their problem-solving techniques, leading to more effective learning and review. This is a practical application of metacognitive strategies which are particularly relevant in the context of higher education (Schraw & Moshman, 1995). Moreover, by expressing their initial reasoning and iteratively refining their understanding, students experience a form of inquiry-based learning that is pivotal for higher education (Pedaste et al., 2015; White & Gunstone, 2014). The enhancement in knowledge construction and retention within the higher education cohort using GCLA is further substantiated by existing research on the benefits of metacognitive and inquiry-based learning strategies (Carvalho & Santos, 2022; Tawfik et al., 2020; Zhou & Lam, 2019).

Implications

This study has several implications for both theory and practice in the field of education.

Theoretical implications

This study contributes to the existing literature on the use of ChatGPT in education, especially in blended learning environments. It introduces the GCLA, a novel approach that modifies the use of ChatGPT by providing guidance rather than direct answers, and evaluates its impact on students’ SRL, HOTS, and knowledge construction. The findings reveal that the GCLA enhances these aspects compared to traditional ChatGPT use, demonstrating the potential of ChatGPT-based tools to foster essential skills such as self-regulation and higher-order thinking. Furthermore, this study offers insights into the more effective use of ChatGPT in education, highlighting the importance of encouraging students to attempt problem-solving independently before seeking ChatGPT assistance, and providing feedback through hints rather than solutions. These insights can inform the design and development of future ChatGPT-based educational tools and interventions.

Practical implications

This study also has practical implications for educators and learners in higher education, particularly in blended learning settings. It suggests that the GCLA can be a valuable tool to supplement blended learning, as it can provide timely and personalized guidance to students, enhance their engagement and self-efficacy, and improve their learning outcomes. The GCLA can also support educators in facilitating blended learning, as it can reduce their workload in providing feedback and assistance, and allow them to monitor students’ progress and performance through the learning log file. Moreover, the GCLA can be easily integrated into existing blended learning platforms and curricula, as it is compatible with Apple devices and can be customized according to different learning objectives and contexts.

Conclusion

This study aimed to address the issue of students’ excessive reliance on ChatGPT in higher education blended learning settings, and its impact on their SRL, HOTS, and knowledge construction. It also evaluated the effectiveness of the GCLA, a guidance-based ChatGPT-assisted learning aid, compared to traditional ChatGPT use, in enhancing these aspects among learners. The study involved 61 undergraduate students from a university in Taiwan, who were randomly assigned to either the treatment group (TG) or the control group (CG). The TG used the GCLA, while the CG used the traditional ChatGPT application on iOS, to assist their learning in a foundational chemistry course within a blended learning setting. The study employed a Randomized Controlled Trial (RCT) and used pre-tests, post-tests, delayed tests, and surveys to measure the impact of the GCLA on students’ SRL, HOTS, and knowledge construction.

The results showed that the GCLA had a significant positive effect on students' cognitive and behavioral engagement, self-efficacy, critical thinking, problem-solving, creativity, and knowledge construction, compared to traditional ChatGPT use. However, the GCLA did not have a significant effect on students' intrinsic motivation and amotivation, which were higher in the CG than in the TG. These results suggest that the GCLA can effectively enhance students' learning experiences and outcomes in blended learning environments, by providing guidance rather than answers, and fostering an environment conducive to the development of SRL and HOTS. The results also indicate that the GCLA can help students overcome the challenges of blended learning, such as maintaining consistent teacher support, managing learning autonomy, and engaging in complex academic tasks.

Limitations

This study has several limitations that should be acknowledged and addressed in future research. First, the sample size of this study was relatively small, and the participants were from a single university in Taiwan. Therefore, the generalizability of the findings may be limited, and further studies with larger and more diverse samples are needed to validate the results. Second, the study only focused on one subject area, namely chemistry, and one blended learning setting. Thus, the applicability of the GCLA to other subject areas and learning settings may vary, and more studies are needed to explore the effects of the GCLA in different domains and contexts. Third, the study only measured the short-term and medium-term effects of the GCLA on students’ learning, using post-tests and delayed tests. The long-term effects of the GCLA on students’ learning, such as retention, transfer, and application of knowledge and skills, were not assessed, and future studies should include more longitudinal measures to evaluate the lasting impact of the GCLA.

Future directions

Based on the findings and limitations of this study, several directions for future research can be suggested. First, future studies can extend the scope of this study by investigating the effects of the GCLA on other aspects of students’ learning, such as motivation, interest, satisfaction, and attitude. These aspects are also important for students’ learning success and well-being, and can provide a more comprehensive picture of the impact of the GCLA. Second, future studies can explore the underlying mechanisms and factors that mediate or moderate the effects of the GCLA on students’ learning. For example, how does the GCLA influence students’ cognitive processes, metacognitive strategies, and affective states? How do students’ prior knowledge, learning styles, and preferences affect their use and perception of the GCLA? These questions can help to explain the reasons and conditions for the effectiveness of the GCLA, and provide more insights for its improvement and optimization. Third, future studies can compare the GCLA with other ChatGPT-based tools or interventions, such as those that provide different types or levels of feedback, scaffolding, or personalization. These comparisons can help to identify the strengths and weaknesses of the GCLA, and provide more evidence for its relative advantages and disadvantages.

Availability of data and materials

The datasets used or analyzed during the current study are available from the corresponding author on reasonable request.

References


Acknowledgements

We are extremely grateful to the research assistants and students who participated in this study.

Funding

This project was funded by the National Science and Technology Council (NSTC) of the Republic of China under contract numbers NSTC 110-2511-H-006-008-MY3 and NSTC 112-2410-H-006-053-MY3.

Author information

Authors and Affiliations

Authors

Contributions

H-YL is the leader of this research; he is in charge of the research design, conducting the teaching and learning experiment, and data analysis. P-HC is responsible for assisting in the conduct of experiments and surveying related literature. W-SW is responsible for assisting in the conduct of experiments. T-TW is responsible for assisting in the conduct of experiments, surveying related literature, and proofreading the manuscript. Y-MH is responsible for designing research experiments, providing fundamental education theories and comments to this research, and revising the manuscript. All authors spent more than 2 months discussing and analyzing the data. The author(s) read and approved the final manuscript.

Corresponding author

Correspondence to Yueh-Min Huang.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Lee, HY., Chen, PH., Wang, WS. et al. Empowering ChatGPT with guidance mechanism in blended learning: effect of self-regulated learning, higher-order thinking skills, and knowledge construction. Int J Educ Technol High Educ 21, 16 (2024). https://doi.org/10.1186/s41239-024-00447-4


Keywords