“Connecting concepts helps put main ideas together”: cognitive load and usability in learning biology with an AI-enriched textbook

Rapid developments in educational technology in higher education are intended to make learning more engaging and effective. At the same time, cognitive load theory stresses limitations of human cognitive architecture and urges educational developers to design learning tools that optimise learners’ mental capacities. In a 2-month study we investigated university students’ learning with an AI-enriched digital biology textbook that integrates a 5000-concept knowledge base and algorithms offering the possibility to ask questions and receive answers. The study aimed to shed more light on differences between three sub-types (intrinsic, germane and extraneous) of cognitive load and their relationship with learning gain, self-regulated learning and usability perception while students interacted with the AI-enriched book during an introductory biology course. We found that students displayed a beneficial learning pattern, with germane cognitive load significantly higher than both intrinsic and extraneous loads, showing that they were engaged in meaningful learning throughout the study. A significant correlation between germane load and accessing linked suggested questions available in the AI-book indicates that the book may support deep learning. Additionally, results showed that perceived non-optimal design, which deflects cognitive resources away from meaningful processing, accompanied lower learning gains. Nevertheless, students reported substantially more favourable than unfavourable opinions of the AI-book. The findings provide new approaches for investigating cognitive load types in relation to learning with emerging digital tools in higher education. The findings also highlight the importance of optimally aligning educational technologies and human cognitive architecture.


Introduction
Educational technology aims to make learning more effective, accessible, and engaging for learners. Digital learning environments offer support for learning through glossaries, hyperlinks, multimedia resources and different types of feedback (e.g., Aleven et al., 2003). Ideally, digital learning environments should serve to support learners' cognitive processes by reducing the consumption of cognitive resources while promoting retention and meaningful learning (Bates et al., 2020). However, since learning is impossible without engaging cognitive resources, educational interventions should focus on optimising the overall cognitive effort attributed to making learning more efficient (Eitel et al., 2020; Sweller & Chandler, 1991). In this regard, emerging research is exploring how the integration of artificial intelligence (AI) in educational technology may provide opportunities for improving learning through efficient use of cognitive resources (Zawacki-Richter et al., 2019). Effective learning with digital tools requires optimally channelling cognitive effort (Klepsch et al., 2017) as well as providing students with self-regulated learning skills (Eitel et al., 2020; Ibili & Billinghurst, 2019). To explore how to do so, more fine-grained approaches are needed that examine how cognitive load, usability perception and self-regulated learning skills impact learning.
Our previous work investigated university students' learning and interaction with an AI-enriched digital biology book versus a traditional E-book, during a short educational intervention (cf. Koć-Januchta et al., 2020). One outcome of the study was identifying the need for analysing the specific relationships between cognitive load and usability when engaging with emerging textbook technologies. In response, the current study explores students' differential cognitive effort over a longer period while learning biology with the AI-enriched book. Integration of the AI-book as a learning resource in a real university course setting allowed for an ecologically valid context. Our goal was to explore changes in three sub-types of cognitive load, namely intrinsic, germane, and extraneous load, and their relationship with usability perception and self-regulated learning while students learned with the AI-book. In this pursuit, we aim to contribute to knowledge on systematically differentiating between cognitive load types, specifically in relation to students' learning with digital environments in higher education.

Cognitive load theory and learning
Since the 1980s, cognitive load theory (CLT; Sweller & Chandler, 1991) has been established as one of the most applied theories for considering relationships between instructional design and mental problem-solving resources. The theory is concerned with defining the overall mental effort (cognitive load) attributed to working memory resources delegated to accomplishing a task (Kalyuga & Liu, 2015). Cognitive load (CL) comprises three sub-types (van Merriënboer & Sweller, 2005), namely intrinsic cognitive load (ICL), extraneous cognitive load (ECL), and germane cognitive load (GCL).
Intrinsic cognitive load (ICL) is a result of cognitive activities needed for understanding the information inherent in a task and depends largely on the complexity of this information. A high level of ICL is caused by a large number of "task elements" which must be processed simultaneously during learning, and is also determined by the level of the learner's knowledge (Sweller, 2010). Extraneous cognitive load (ECL) is experienced by learners when they are forced to invest their cognitive resources in activities that are not immediately relevant to the learning task at hand. The main source of ECL is non-optimal or flawed instructional design, such as an unnecessarily complex layout of a digital learning interface (Klepsch et al., 2017). The final sub-type, germane cognitive load (GCL), results from constructing schemas (Sweller et al., 1998) or mental models (Paas et al., 2004) during meaningful learning processes. An example of an activity that can increase GCL is integrating new information with knowledge the learner already has. Therefore, high levels of GCL can be interpreted as a sign of meaningful learning (Klepsch et al., 2017).
From a CLT perspective, learning is defined as constructing and automating schemas in long-term memory (Paas et al., 2004), which involves all three sub-types of cognitive load to some extent. To optimise learning, the sum of the load types should not exceed the learner's limited working memory capacity. Hence, assuming that ICL is inherent in the nature of the task, optimising learning should focus on minimising ECL and increasing GCL (see Klepsch et al., 2017). Although the cognitive load construct provides insight into the usefulness and effectiveness of learning with new educational technologies (e.g., Kalyuga & Liu, 2015), it remains challenging to measure (e.g., Klepsch et al., 2017). The literature contains multiple cognitive load measures that have emerged over time, which range from (the most widely used) self-rating techniques to recent physiological measures (e.g., pupillary responses). From an instructional design standpoint, it is crucial to deduce ways to measure CL differentially by exploring the relative impact of all three load sub-types during learning (Ibili & Billinghurst, 2019; Klepsch et al., 2017; Mutlu-Bayraktar et al., 2019).

Interactive educational technology and cognitive load
Technology-enhanced learning is becoming increasingly apparent in higher education, bringing with it both hopes and challenges. Amongst the hopes, digital resources that integrate interventions such as AI may help support the learning of complex scientific knowledge such as biology (e.g., Corbett et al., 2010). Interactive technology can also help students to learn more efficiently by offering multimedia resources, interactive glossaries, prompts, answers to questions, help in constructing models and even personalised suggestions for further learning (Aleven et al., 2003; Koć-Januchta et al., 2020; Linn et al., 2014). Nevertheless, new technological opportunities for learning are also associated with multiple challenges. From a motivational perspective, students may experience decreased motivation when learning on their own from a digital learning environment (e.g., DeVore et al., 2017). Consequently, learning with digital tools often requires advanced skills in independent learning, self-regulation and learning strategies (Glover et al., 2016; Means et al., 2009). Additionally, learners may experience cognitive overload when learning with digital technology (Aleven et al., 2003).
As elucidated previously, cognitive overload is often caused by high levels of extraneous cognitive load (Klepsch et al., 2017). While ICL and GCL concern processing of learning elements and promoting meaningful learning, respectively, extraneous load arises mainly from the way information is conveyed. Poorly designed digital learning tools may increase ECL to such an extent that it impairs learning (Moreno & Mayer, 2007). Therefore, where possible, ECL should be reduced by optimising the design of the learning environment. Design elements such as the range and complexity of implemented digital features must be carefully considered, since a complicated interface can cause cognitive overload (Scheiter & Gerjets, 2007; Sweller et al., 1998). Lowering extraneous cognitive load frees cognitive resources that can be directed to germane load and thus stimulate deeper learning (Klepsch et al., 2017). As part of our previous work (Koć-Januchta et al., 2020), we presented students' opinions suggesting that ECL may increase over time when using an AI-enriched book. As part of that study, where we compared an AI-enriched book and a traditional E-book, students from a research university were interviewed about their experiences of using both types of digital book. They pointed out several advantages of the AI-enriched book over the traditional E-book (e.g., obtaining pop-up definitions to terms in real time) but also reported growing dissatisfaction with the AI-enriched book as usage time progressed. We observed that a longer usage of the book revealed potential design-related disadvantages (e.g., AI-functionalities were sometimes confusing or contained too much information, or made one unsure of their learning). At the same time, the more difficulty students perceived in learning with the AI-enriched book, the less positively they assessed its usability (Koć-Januchta et al., 2020).

Usability perception and cognitive load in educational technology
Usability is an essential measure when exploring user experiences of digital educational technologies (Diefenbach et al., 2014). The concept includes subjective and objective components, which consist of perceived usability or satisfaction (how comfortable it is to use a digital tool) and efficiency (the time and effort cost in using the digital tool), respectively (Lewis, 2018). One of the most popular measures of perceived usability is the System Usability Scale (SUS) questionnaire developed by Brooke (1996). The SUS is suitable for measuring satisfaction (e.g., meeting expectations) and ease of using the learning tool.
Many studies show that perceiving a learning system as useful is associated with a reduced cognitive load (e.g., Pantano et al., 2017), whereas feeling confused when using a system leads to increased cognitive load (Kılıç, 2007). Moreover, Costley and Lange (2017) found that an increase in users' intention to use a tool is influenced by effective instructional design. Additionally, optimal instructional design correlates with increased germane load, indicating deep learning (Costley & Lange, 2017). Notably, Ibili and Billinghurst (2019) have stated that perceived usefulness (perceiving a learning tool as improving learning) and perceived ease of use (perceiving a tool as easy to learn with) were strongly correlated with all three types of cognitive load (ICL, GCL and ECL). Specifically, usefulness was negatively correlated with ICL (for females) and with ECL (for males). At the same time, both usefulness and ease of use were strongly positively correlated with germane cognitive load (Ibili & Billinghurst, 2019). Lastly, cognitive load is strongly connected with self-regulated learning. For example, high cognitive load might originate from students' insufficient self-control skills and low willingness to learn (de Bruin et al., 2020; Eitel et al., 2020).

Self-regulated learning and cognitive strategies
Acquiring self-regulation skills is important for learning and a research topic of high interest when it comes to individual learning with digital tools (Steffens, 2006). Zimmerman (2011) relates self-regulated learning to the degree to which learners participate actively in their own learning at the metacognitive, motivational, and behavioral level. In addition, Paris and Winograd (2003) describe self-regulated learning as a process in which learners approach problems, apply strategies, monitor their performance, and assess the results of their efforts. Self-regulated learners are more likely to improve their academic achievements by selecting and controlling cognitive processes involved in learning (Pintrich & De Groot, 1990). To learn deeply, one should be able to elaborate and organise information and monitor one's learning process (Pintrich & De Groot, 1990; Soenens et al., 2012). In this regard, technology-enhanced learning environments offer an opportunity to support self-regulated learning by helping students to plan, monitor and evaluate the cognitive, motivational, and affective components of their own learning (Steffens, 2006).

Learning biology: conceptual knowledge and cognitive skills
Biology is a natural science concerned with studying structures and processes associated with living organisms (e.g., Sadava et al., 2017). Learning biology involves building a conceptual understanding of the structure of the (bio)molecules of life that include proteins, enzymes, carbohydrates, lipids, and nucleic acids. This learning also includes developing core knowledge about the "unit of life" (the cell) and the plethora of cellular processes such as DNA replication, mitosis, meiosis, and gene expression. In turn, such knowledge must be integrated with understanding physiological functions such as photosynthesis, muscle contraction, neural and endocrine control. Furthermore, all these aspects of biology contribute to understanding populations, ecosystems, and evolution. Moreover, constructing biological knowledge involves making links to other scientific disciplines and reasoning at various levels of spatial and temporal scale. Cognitive skills associated with successful biology learning include: retaining biological knowledge, integrating knowledge with other concepts (while transitioning different levels of biological organisation), transferring learnt knowledge to novel tasks, as well as reasoning both "locally" and "globally" about a biological concept (Anderson & Schönborn, 2008).

Aims of the study
The objectives of this study are to investigate:
1. Any differences in how students experience the three types of cognitive load (ICL, GCL and ECL) while learning with an AI-enriched biology book at the beginning and the end of the study.
2. Relationships between the three types of cognitive load, usability, self-regulation, cognitive strategy use, and learning gain while interacting with the AI-enriched book.

Study setting: a reference point and main study
The collaboration described in this paper began in 2018, in planning toward a research intervention involving an AI-enriched digital biology textbook for 2019 at Harvard University. In 2018, we obtained initial information about cognitive load experienced by students learning biology. We asked 32 students (53.1% female; age M = 27.53; SD = 6.63), who were attending an introductory biology course and used traditional hardcopy and E-books, to answer two questions on their experienced cognitive load (Paas, 1992). Doing so generated a reference point for a 2019 main study that investigated cognitive load experienced when students participated in the same course while engaging with an AI-enriched digital textbook.

Data collection and participants: main study
The study was conducted from September to October 2019 while students attended the course "Introduction to Molecular and Cellular Biology" at Harvard University. During this time, students responded to several questionnaires including a written pre-test and demographic questions, followed by two online surveys, and ending with a written post-test. Demographic questions included gender, age, Grade Point Average (GPA), native language and previously completed science courses. The pre-test (7 multiple choice questions) assessed students' biological knowledge and was answered during the first lecture. Subsequently, students were provided access to an AI-enriched book integrated in an iPad platform that covered the first ten chapters of the original hardcopy biology textbook (Sadava et al., 2017), and which corresponded to the first seven lectures of the course.
During the study, students had unlimited individual access to the AI-enriched book both on campus and at home. Students could use the book as they wished, while preparing for class, tests, and examinations. During the course, the lecturer provided the students with a pre-class and a post-class study guide for each lecture as supplementary material. The pre-class study guide specified the reading that students were expected to do before class, and the post-class study guide provided the intended learning goals and outcomes of that class. Each study guide also included a set of questions aligned with the learning outcomes for students to assess their knowledge. Students were also provided with explanation materials on how best to use the features of the AI-enriched book in the context of the learning goals (see Appendix). While using the AI-enriched book, students responded to two surveys consisting of cognitive load, cognitive strategy use, self-regulation, and usability scales (the latter included open-ended questions). Altogether, 42 students participated in the study, of which 69% were female (age range 17 to 44 years, M = 26.28 and SD = 4.87). Although the study consisted of several measurement points, not all students participated in all data collections. Figure 1 depicts the study timeline, showing the number of participants at each data collection point and associated questionnaires.

Specific instruments and measures
As part of the five measuring points obtained in the study (Fig. 1), Table 1 provides further detailed information about data gathering dates and respective instruments and measurements.

The AI-enriched digital biology textbook
The learning resource in focus is an example of an intelligent textbook that incorporates AI elements. The AI-enriched book is based on a widely used international hardcopy biology textbook (Sadava et al., 2017), and integrated into an iPad. The AI-book has typical electronic book features, such as the possibility to highlight text, enlarge figures, and make notes. However, in applying natural language processing techniques and formal knowledge representation, the AI-enriched book also offers a 5000-concept knowledge base and algorithms that provide the possibility to ask questions and receive answers. Figure 2 demonstrates three different ways of generating/asking questions. Firstly, students can input a question by tapping the "magnifying glass" icon (see 1, Fig. 2). Secondly, students can tap on an underlined term ("dotted word") to view its short definition and access further information on the topic (e.g., suggested questions referring to the term) by tapping on the button "MORE" placed near the definition (see 2, Fig. 2). Thirdly, highlighting the text produces a note card with suggested, most relevant questions about the highlighted text (see 3, Fig. 2). The AI-enriched book includes multiple elements of artificial intelligence that incorporate knowledge-acquisition and knowledge presentation processes. Specific AI elements applied in the AI-enriched book comprise a formal knowledge representation of book content, natural language processing to interpret a student's inputted or selected suggested questions, and natural language mechanisms for generating answers (Chaudhri et al., 2013).

Page 8 of 22 Koć-Januchta et al. Int J Educ Technol High Educ (2022) 19:11

Analytical procedure
Ethical approval for the study was obtained and data analysed only from participants that provided informed consent. Quantitative analyses included descriptive statistics and statistical comparisons. Cronbach's alpha was used to calculate the internal consistency and reliability of the applied measures. We applied t-tests and a General Linear Model (GLM) Repeated Measures procedure to calculate any significant differences between variables. In reducing the risk of a Type I error when using t-tests, we applied a conservative Bonferroni correction. The GLM Repeated Measures procedure provided analysis of variance results when administering the same measurements to the same participants on several occasions (Field, 2013), and was used to calculate changes between three types of experienced cognitive load (ICL, GCL, ECL) over time (first and second measurements, respectively). We also applied Pillai's trace as a statistic robust to violations of analysis of variance assumptions (Finch, 2005).

Reliabilities of applied instruments and measures
The reliability (internal consistency) of the administered scales (Table 1), calculated with Cronbach's alpha, is reported in Table 2. As shown in Table 2, almost all scales applied in the study had a good or very good reliability, apart from the GCL scale, where reliability was rather low (0.54). However, a similar reliability score was obtained in the original development of the scale (Klepsch et al., 2017).
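As an illustration of the internal-consistency statistic reported here, the following minimal Python sketch computes Cronbach's alpha from a respondents-by-items matrix. The ratings are hypothetical and are not data from this study, and the function name is an assumption introduced only for this example.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(ratings[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*ratings)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: 5 respondents x 3 items (NOT study data).
hypothetical_ratings = [
    [5, 6, 5],
    [3, 4, 4],
    [6, 6, 7],
    [2, 3, 2],
    [4, 5, 5],
]
print(round(cronbach_alpha(hypothetical_ratings), 2))
```

Alpha rises as the items vary together across respondents, which is why a short scale with loosely related items, such as a brief GCL scale, can yield a modest value.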

Reference point study: comparison of cognitive load between different book versions
Since the main goal of the study was to investigate cognitive load, we commenced our analyses by establishing a reference point for cognitive load measurement. In this regard, we compared cognitive load experienced by the students during 2018 (when using the hardcopy and/or E-book) with 2019 (using the AI-enriched book) in terms of difficulty and mental effort when learning (Paas, 1992). Figure 3 displays cognitive load levels (in terms of difficulty and mental effort) when students used the hardcopy/E-book version versus the AI-enriched version of the biology textbook. Figure 3 shows that both difficulty and mental effort when learning were significantly lower for the AI-enriched book. Table 3 below provides results from a t-test comparison of the two student groups.
An independent samples t-test showed significantly higher perceived cognitive load when learning with a hardcopy and/or E-book in comparison to the AI-book. According to Sweller and Chandler (1991), the optimal level of cognitive load depends on the differential type of cognitive load a learner experiences at different points of learning. To specifically investigate such differential cognitive load when learning with the AI-book, the main study focused on exploring the three types of cognitive load (ICL, GCL, ECL) over time.
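To make the comparison procedure concrete, the sketch below computes an independent-samples t statistic and a Bonferroni-adjusted significance threshold in plain Python. It assumes the pooled-variance (Student) form of the test, which the text does not specify, and assumes two comparisons (difficulty and mental effort). The ratings and function names are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Student's independent-samples t statistic (pooled variance) and its df."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


def bonferroni_alpha(alpha, n_comparisons):
    """Per-test significance threshold after a Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical mental-effort ratings for two independent groups (NOT study data).
hardcopy_ebook = [6, 7, 5, 6, 7, 6]
ai_book = [4, 5, 4, 3, 5, 4]
t_value, df = independent_t(hardcopy_ebook, ai_book)
print(f"t({df}) = {t_value:.2f}")

# Assumption: two outcomes are tested, so each test is judged against alpha / 2.
print(bonferroni_alpha(0.05, 2))
```

A positive t here means the first group reported the higher load. The p-value would then be read from a t distribution with the returned degrees of freedom and compared with the adjusted threshold.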

Main study: identifying and comparing three types of cognitive load
The main study compared three types of cognitive load experienced by students at the beginning and close of the study. To compare levels of three cognitive load types at both measuring points, we analysed the data within a General Linear Model using a repeated measures design framework (Field, 2013). A significant main effect on cognitive load types was revealed, F(2, 17) = 52.96; p < 0.001; η² = 0.86 (Pillai's Trace), showing that the perceived level of cognitive load differed significantly due to its type (intrinsic, germane or extraneous) (Fig. 4). Detailed pairwise comparisons with Bonferroni correction revealed that:
• At the beginning of the study, all three types of cognitive load levels differed significantly. GCL1 was significantly higher than both ICL1 and ECL1. At the same time, ICL1 and ECL1 differed significantly (differences between ICL1, GCL1, ECL1 are depicted in grey in Fig. 4).
• At the end of the study there were significant differences between germane load (GCL2) and the two other cognitive load types (differences between ICL2, ECL2 are depicted in black in Fig. 4). In contrast with the beginning of the study, there was no significant difference between ICL2 and ECL2 (p > 0.05) at the end of the study.
The overall mean level of cognitive load did not differ significantly at the beginning and at the end of the study, and the main effect of the cognitive load level was not statistically significant, F(1, 18) = 1.53; p = 0.233; η² = 0.078 (measurement 1: M = 3.68, SD = 0.79; measurement 2: M = 3.82, SD = 0.86).
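The two effect sizes reported above can be cross-checked from the F values and their degrees of freedom. The sketch below assumes that the reported η² is a partial eta squared, which the text does not state explicitly. Under that assumption, the standard conversion reproduces both reported values.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of cognitive load type: F(2, 17) = 52.96
print(round(partial_eta_squared(52.96, 2, 17), 2))  # 0.86

# Main effect of measurement point: F(1, 18) = 1.53
print(round(partial_eta_squared(1.53, 1, 18), 3))  # 0.078
```

The function name is introduced only for this illustration. The agreement with the reported 0.86 and 0.078 suggests, but does not prove, that the values were computed this way.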

Fig. 4
Significant differences between cognitive load sub-type levels. "1" and "2" indicate measuring points (start and close of the study). *Difference is sig. at the 0.05 level (2-tailed); ***difference is sig. at the 0.001 level (2-tailed)

Cognitive load: correlations with other variables
Pearson's correlation analyses were conducted to discover any relationships between study variables. Figure 5 depicts significant correlations between the three cognitive load types and usability at the beginning ("1") and close of the study ("2"), with AI-book features, cognitive strategy use, self-regulation and learning gain.
Correlations at the beginning of the study (measurement "1") reveal that germane load (GCL1) correlates positively with both Cognitive Strategy Use (r = 0.50*) and Usability1 (r = 0.51**). High germane load indicates that a learner directs effort to learning deeply, suggesting that, at the start of the study, the more advanced students' skills in learning strategically (Cognitive Strategy Use), the higher the mental effort directed to learning deeply with the AI-book. Moreover, a more positive perception of AI-book usability was linked to higher mental effort directed to deep learning. Extraneous load (ECL1) correlated negatively with Usability1 (r = −0.69**), which indicates that higher perceived extraneous cognitive load ("undesirable" load arising from "struggling" with a learning environment) accompanied a lower usability perception at the start of the study.
At the close of the study (measurement "2") intrinsic load (ICL2) correlated negatively with Usability2 (r = −0.50**), which implies that lower perceived mental effort when learning with the AI-enriched book accompanies a more positive usability perception (Usability2). Furthermore, germane load (GCL2) correlates positively with Usability2 (r = 0.42*), which suggests that the more positive the usability perception (Usability2), the higher the mental effort directed to learning deeply with the AI-enriched book. Extraneous load (ECL2) correlates negatively with Usability2 (r = −0.76**), showing that a higher perceived extraneous load is linked to a lower usability perception. Lastly, results revealed a significant positive correlation between germane load at the end of the study (GCL2) and the number of linked suggested questions accessed, for example by tapping the "MORE" button (see Fig. 2 (2)) (r = 0.45*). This result indicates that the more often students accessed suggested questions to explore further knowledge, the more mental effort they invested in learning deeply from the AI-enriched book (GCL2).
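For readers less familiar with the statistic behind these results, the following minimal sketch shows how a Pearson correlation coefficient and its associated t statistic are obtained. The paired scores are hypothetical and are not study data, and both function names are assumptions made for this illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def t_from_r(r, n):
    """t statistic (df = n - 2) used to test whether r differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Hypothetical paired scores for six students (NOT study data):
# extraneous load ratings and usability ratings.
extraneous_load = [2.0, 3.5, 4.0, 5.5, 6.0, 3.0]
usability = [6.5, 5.5, 5.0, 3.5, 3.0, 6.0]
r = pearson_r(extraneous_load, usability)
print(f"r = {r:.2f}, t({len(usability) - 2}) = {t_from_r(r, len(usability)):.2f}")
```

A negative r, as in this invented example, corresponds to the pattern described above, where higher extraneous load accompanies lower usability ratings.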

Fig. 5
Statistically significant correlations between variables investigated in the study. "1" and "2" indicate measuring points (start and close of the study). Positive correlations in green and negative correlations in red.

Learning gain and usability: correlations with other variables
A t-test comparison between post- and pre-test revealed a significant learning gain, t(30) = 6.45, p < 0.001 (pre-test: M = 3.00, SD = 1.15; post-test: M = 4.74, SD = 1.06). Learning gain correlated negatively with extraneous load (ECL2; r = −0.39*), which implies that high levels of ECL might have impeded meaningful learning with the AI-book system. A t-test comparison between Usability1 and Usability2 revealed a near-significant decrease in usability perception at the study close, t(18) = 2.09, p = 0.051 (Usability1: M = 5.82, SD = 1.15; Usability2: M = 5.11, SD = 1.28). Additionally, usability at the beginning of the study (Usability1) correlated positively with Cognitive Strategy Use (r = 0.47*), which shows that the more advanced skills in learning strategically (Cognitive strategy use) that students have, the more positive their usability perception of the AI-book.
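The reported t statistics can be turned into a standardised effect size as a rough guide to magnitude. The sketch below assumes that both comparisons were paired-samples t-tests, so that n = df + 1. The text does not state this explicitly, and the effect sizes are not reported in the study itself. Under that assumption, Cohen's d_z equals t divided by the square root of n.

```python
from math import sqrt


def cohens_dz(t_value, df):
    """Cohen's d_z for a paired-samples t-test, assuming n = df + 1 pairs."""
    return t_value / sqrt(df + 1)


# Learning gain, t(30) = 6.45 (pre-test M = 3.00, post-test M = 4.74).
print(round(4.74 - 3.00, 2))            # 1.74 (raw mean gain in test points)
print(round(cohens_dz(6.45, 30), 2))    # 1.16

# Change in usability perception, t(18) = 2.09.
print(round(cohens_dz(2.09, 18), 2))    # 0.48
```

These derived values are illustrative only and depend on the paired-design assumption. The function name is hypothetical.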

Students' perceived usability of the AI-book
Thirty-three students rated AI-book usability through Likert scale statements and by being asked to express two positive and two negative aspects of using the book. Generated comments were grouped in relation to the ICL, GCL or ECL definitions in Klepsch et al. (2017) (Table 4). Specifically, we categorized comments regarding mental effort originating from reading, decoding, and memorizing the content as ICL-related. Comments mentioning understanding and combining mental information into knowledge were categorized as GCL-related. Responses referring to design of the environment were categorized as ECL-related. Overall, 12 positive statements referred to ICL, 9 to GCL and 30 to ECL. Four negative comments referred to ICL, 7 to GCL and 33 to ECL. There were also 20 positive and 5 negative unclassified statements that referred to more than one sub-type of cognitive load, which we did not group or analyse in relation to cognitive load.
When describing the usability of the AI-book, students provided 71 positive and 49 negative comments, a weighting toward a more positive usability perception. Students were very satisfied with the pop-up definitions, explanations, and associated concept maps, which were deemed to provide the meaning of a term without interrupting reading. These positively viewed aspects of the AI-book might help in decoding and memorising information and optimising experienced intrinsic cognitive load. Students also appreciated the search function that allows them to put ideas together, compare terms, and connect concepts to build understanding, and offers answers to complex questions. Such features of the AI-book may promote germane load, since "putting ideas together" underpins the construction of mental models of biological knowledge. Most of the positive comments were classified as design-related, referring to ECL. For instance, students pointed out that the AI-book offered multiple informative and interactive visual features affording possibilities for zooming, rapid navigation and glossary accessibility, as well as ease of use and portability.
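The comment counts reported above can be tallied as a quick consistency check. The sketch below uses only the numbers stated in the text. The dictionary layout and variable names are assumptions made for this illustration.

```python
# Comment counts as reported in the text, grouped by cognitive load type.
positive = {"ICL": 12, "GCL": 9, "ECL": 30, "unclassified": 20}
negative = {"ICL": 4, "GCL": 7, "ECL": 33, "unclassified": 5}

total_positive = sum(positive.values())
total_negative = sum(negative.values())
print(total_positive, total_negative)  # 71 49


def ecl_share(counts):
    """Share of ECL-related comments among the classified comments only."""
    classified = counts["ICL"] + counts["GCL"] + counts["ECL"]
    return counts["ECL"] / classified


print(round(ecl_share(positive), 2))  # 0.59
print(round(ecl_share(negative), 2))  # 0.75
```

The category counts add up to the reported totals, and design-related (ECL) comments make up the largest share of both the positive and the negative classified comments.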
When referring to negative aspects of the book in relation to ICL, students mentioned that there were too few pop-up definitions, too much to remember to use the book effectively, and that chapter names did not always match contained content. Among usability concerns in relation to GCL, students noted that the AI-based features did not answer basic (or all) posed questions, with some participants expressing a need for more in-depth answers. Usability in connection with ECL mostly included mechanisms in how to ask the book questions. For example, students were unsure why one needed to input complete question phrases rather than single terms alone to generate questions. Students also mentioned delayed loading of answers, problems with the highlighting functionality, lack of page numbers, book availability being constrained to iPads alone, and no possibility to access the book offline.

Table 4 Examples of verbatim positive and negative usability comments related to intrinsic, germane and extraneous cognitive load (CL) types

ICL-related
Positive
• "I really liked that whenever I forgot the definition of something, I could just click on the word and be reminded of the definition. That saved me a hell of a lot of time while reading […]"
• "I like that the vocabulary is underlined and if you forget a word, you can click on it and be reminded"
• "Being able to click on various terms while reading to get definitions, explanations, etc. was really helpful"
Negative
• "Not enough inline [dotted] definitions"
• "Memorizability is still worse with this compared to a paper book due to lack of sense of location when reading"
• "Too many things to remember while using the AI-book"

GCL-related
Positive
• "The connection of concepts really helps put main ideas together"
• "I also really like the feature where you can ask the AIbook to compare topics. This was really helpful […] when I got confused on the similar terminology"
• "Connections of concepts to other chapters to build (or review) understanding to maintain learning foundation"
Negative
• "The AI is not good enough at answering basic questions related to the content"
• "I think the search engine, while amazing, could be further tweaked to really break down the topics, specifically with the comparison feature"
• "Connections between concepts not always made apparent. More depth would be useful"

ECL-related
Positive
• "the immense amount of graphs/figures provided"
• "Media hyperlinks are easier than having to go to a separate device to watch videos"
• "Easy to use"
Negative
• "The search is [sic] works in a poor and frustrating manner: forcing me to use complete sentences […] (my habit is for words over sentences as this is the convention of all internet search nowadays […])"
• "It seemed to take a long time for the answers to load after i [sic] submitted a query"
• "Only available on iPads and not online/Android"

Discussion
Research suggests that students tend to perceive and learn from print and digital resources in similar ways (Sage et al., 2019; Koć-Januchta et al., 2020). However, this does not mean that the cognitive load associated with these media is similar. For example, studies show that interactively advanced learning tools may cause cognitive overload (Aleven et al., 2003; Scheiter & Gerjets, 2007; van Merriënboer & Sweller, 2005) by imposing design-related, extraneous load (Klepsch et al., 2017). In the reported reference point study, we established that university student participants experienced significantly lower cognitive load when learning from an AI-enriched book in comparison to a traditional hardcopy and E-book. However, that result did not explain which cognitive load types were lower when learning from the AI-enriched book.
Since cognitive load concerns both the constraints and the capabilities of human information-processing architecture, cognitive load levels can influence learning both positively and negatively. While ECL emerges from design constraints of the learning matter and should be minimised, GCL and ICL are integral parts of the learning process (Moreno & Mayer, 2007; Sweller et al., 1998), where GCL is indicative of meaningful learning and should be promoted (Klepsch et al., 2017).
In the main study, and in contrast with shorter interventions, we measured three types of cognitive load twice, once at the beginning and once at the close of the study after students used the AI-book for 1.5 months. Giving students a longer unrestricted interaction with the AI-book revealed relationships between cognitive load, usability perception, self-regulation, cognitive strategy use, learning gain and book features. Interaction with the book over a longer period during a biology course in an authentic setting also contributed to ecological validity. Overall, students achieved significant learning gains (M = 1.74) during the study.
Our previous research suggested that as usage time increased, learners perceived more ECL-related disadvantages of using the AI-book (Koć-Januchta et al., 2020). Indeed, the literature documents a strong relationship between cognitive load and tool usability (e.g., Kılıç, 2007; Pantano et al., 2017), where difficulty in using the AI-enriched book correlated negatively with usability perception (Koć-Januchta et al., 2020). At the same time, a higher ECL level may result from underdeveloped self-regulated learning skills (e.g., de Bruin et al., 2020; Eitel et al., 2020). In reference to these findings, we aimed to investigate differences between the three types of cognitive load and their relationships with usability, self-regulation, cognitive strategy use, and learning gain when students engaged with the AI-book over an extended period in an ecologically valid setting. We discuss our findings in relation to the two major aims of the study.

Students' experiences of the three cognitive load types while learning with the AI-book
Results showed that GCL was significantly higher than ICL and ECL throughout the study, indicating that participants tried to learn deeply (Klepsch et al., 2017) for the entire course. At the beginning of the study intrinsic load was higher than extraneous load, whereas at the close the two did not differ significantly. While the overall level of the three types of cognitive load did not change significantly, a possible increase in extraneous load from the first measurement (ECL1) to the second (ECL2) could have occurred at the cost of both ICL and GCL. The second measurement revealed the cognitive load pattern after students had learned with the AI-book for a longer time.
We assume that with more time, students had more possibilities to experience the affordances of the interface design and may have directed their cognitive resources toward such extraneous processing instead of knowledge integration (Mutlu-Bayraktar et al., 2019). In this regard, students identified more ECL-related disadvantages (33) than advantages (30) of the AI-book, and directed most usability comments toward ECL (63) in comparison with ICL (16) or GCL (16).
Since user experience quality impacts engagement with digital educational tools and can induce effective learning (e.g., Bikowski & Casal, 2018), it is crucial to identify elements that might require modification or improvement. In our study students were dissatisfied with some AI-book features, such as ECL-related usability around the question-asking functions, with some experiencing them as inconvenient and frustrating. Moreover, they flagged long processing times for answer generation, problems with highlighting and insufficient visual icon size. While improving "straight-forward" technological and design-related shortcomings can be viewed as a subsequent step for AI-book developers, incorporating changes in response to students' observed critique should also be considered. For example, students were often dissatisfied with the procedure required to ask questions, namely, being required to "use complete sentences" instead of inputting single words. While many conventional search tools do indeed operate with single words, accessing the higher level of questioning offered by the book environment requires specifying the sought relationship as a more descriptive entry. In this regard, the AI-book question answering interface is designed as a "concept calculator". In the same sense that a numerical calculator performs numerical operations on numbers, the AI-based answering architecture performs describe, compare, and relate "calculations" on concepts. In turn, by posing questions in terms of such "concept calculations", students learn to deploy meaningful questions for accessing, unpacking, and integrating biological knowledge.
Students also noted many positive aspects of the AI-book such as ease of use, rapid accessibility to features, engaging visuals, and portability. Students often expressed a desire to use the AI-enriched book for a longer period and across different hardware platforms. In line with our previous study (Koć-Januchta et al., 2020), the AI-book seems to afford possibilities for deep, meaningful learning, as evidenced by the significantly higher presence of GCL in comparison with the other load types. At the same time, the AI-enriched book technology reveals students' expectations of interactive digital tools, which appear heavily shaped by experiences with conventional searching tools.

Relationships between the three types of cognitive load, usability, self-regulation, cognitive strategy use, and learning gain while interacting with the AI-book
Significant correlations between cognitive load types and other studied variables were revealed both at the beginning and close of students' interaction with the tool (de Bruin et al., 2020; Eitel et al., 2020; Ibili & Billinghurst, 2019). In elaborating upon our previous work (Koć-Januchta et al., 2020), this study sheds more light on relationships between cognitive load and usability when students use the AI-enriched technology to learn.
At the beginning of the study, ICL did not correlate significantly with other variables, but it correlated negatively with usability at the close. Since ICL originates from the number of interrelated elements in the content to be learned (e.g., Klepsch et al., 2017), we assume that demands to learn more complex topics, especially when one has a low level of pre-knowledge, may relate to less favourable perceptions of book usability. The finding that GCL correlated positively with usability throughout the study (e.g., Ibili & Billinghurst, 2019) and with cognitive strategy use (e.g., de Bruin et al., 2020; Eitel et al., 2020) implies that an increase in germane cognitive load is positively related to favourable perceptions of AI-book usability. Additionally, higher self-regulated learning skills, such as the deployment of organizational, rehearsal and elaboration learning strategies, are positively correlated with deep, meaningful learning. Students appear to appreciate the support offered by the AI-book in constructing mental models needed to acquire knowledge in biology.
Several AI-book features yielded a positive response from students, with the access to pop-up definitions being particularly salient. Students appreciated the possibility to rapidly check the meanings of terms while continuing to read without interruption, which appeared to be a very useful support for understanding biological terminology (Zukswert et al., 2019). In addition, better skills in learning strategically (higher scores on cognitive strategy use) are linked to meaningful, deeper learning (de Bruin et al., 2020; Eitel et al., 2020). ECL showed only negative correlations: with usability and, most notably, with learning gain. The exhibited link between ECL and usability perception of the tool confirms the nature of ECL (Ibili & Billinghurst, 2019). Given that ECL is a consequence of expending mental resources on design elements, grappling with the features or technological affordances of the tool may impair learning and decrease satisfaction. Accessing linked suggested questions, for example by tapping on the "MORE" button (Fig. 2 (2)), correlated significantly with GCL. Herein, "digging deeper" with the AI-book to actively seek and connect biological phenomena is likely linked to meaningful learning. In extension of our previous work (Koć-Januchta et al., 2020), the significant correlation between GCL and the questions feature exposed here calls for further investigation into what it is about the nature of the question feature that promotes meaningful learning.

Limitations of the study
Restrictions on the transfer of the findings to other contexts include the following. Firstly, as was also the case in Klepsch et al.'s (2017) original study, the reliability of the GCL scale in the cognitive load questionnaire was unfavourable. We nevertheless adopted the questionnaire since it still manifested good overall item characteristics. Secondly, we acknowledge that the revealed learning gain is unlikely to be attributable solely to use of the AI-enriched book, and we thus only provide evidence of relationships rather than causality. Also, the correlation between deep learning with the AI-enriched book and self-regulation might not be attributable to the AI-enriched book exclusively. Students with higher cognitive strategy use would likely tend to learn more deeply regardless of textbook resource. Lastly, we did not compare students' use of E-book and AI versions of the resource since the authentic study setting required all students to have equivalent learning conditions.

Conclusions and implications
The initial reference point study showed that the AI-book was associated with lower cognitive load in comparison to hardcopy and E-book counterparts. In the main study that focused on the AI-book, both measurement points (beginning and end) revealed that GCL was always significantly higher than ICL and ECL. This pattern indicates that the AI-enriched book is associated with less mental effort and more meaningful learning. The same result could also be a sign of high learning competences and commitment from the participants, and may further support the positive correlation between GCL and self-regulation skills (Steffens, 2006). Furthermore, the significant correlation between GCL and students' interaction with the linked suggested question feature suggests that the book supports deep learning. While our results show that the AI-book may be less resource-consuming than traditional books in supporting meaningful learning, users' expected inputting mechanisms seem to be shaped by experiences with existing internet search tools. Any conflict with an anticipated way of searching may cause dissatisfaction and decreased engagement with the AI-book. Hence, it is important to reduce ECL by improving the design of the functional features of the book environment. Most importantly, this study showed GCL to be positively connected with cognitive strategy use, usability, and with accessing linked suggested questions. This suggests that deep learning is related to curiosity and to seeking additional information, and that such AI-enriched books are enthusiastically welcomed (e.g., "I hope this is the future of textbooks").
In moving toward future investigations, the results imply that higher education students need support in acquiring sufficient digital literacy (perhaps from a young age) and self-regulated learning skills to optimally benefit from emerging AI-based tools. The implication that AI-textbook technology can support deeper learning requires expanded investigation, including further learning gain measures, physiological measures of cognitive load, and controls for pre-knowledge.