There are broadly three streams of research within educational analytics. Learning Analytics (LA) focuses on learners, and its primary concern is optimizing teaching and learning processes. Educational Data Mining (EDM), on the other hand, seeks to develop methods for exploring educational data in order to better understand learners and to extract insights about them and about educational systems. Academic Analytics (AA) draws on the insights gained from educational data to support strategic decision-making, resolve academic issues such as retention, and improve marketing strategies.
These three streams intersect at various points and share much of the underlying data, and although they could all be grouped under the umbrella of Educational Data Science, they differ in the stakeholders they target. EDM tends to target both teachers and learners, while LA primarily addresses the needs of learners; institutional administrators, managers and educational policymakers, by contrast, are the key stakeholders of AA applications. The three streams also operate at different levels of the educational system. LA is linked to course-level granularity and to department-level concerns within institutions, while EDM spans departmental through to faculty and institutional-level concerns (Nguyen et al., 2020). AA, meanwhile, affects universities at the institutional level and has implications for policy making, and thus spans regional, national and possibly international levels.
Challenges for building LADs
While there are some differences between LA, AA and EDM, they all share common challenges. Numerous studies have reported implementation details of LA products; however, a recent study by Leitner et al. (2020) pointed out that these studies rarely provide comprehensive descriptions of the challenges faced in productionizing such systems. The study shortlisted seven general challenges for deploying LA initiatives:
1. Purpose and Gain: managing expectations of different stakeholders.
2. Representation and Actions: facilitation of actionable insights by LA products.
3. Data: communication to students regarding what is being done with their data, and formulating suitable policies to manage data processes.
4. IT Infrastructure: balancing the pros and cons of opting to use internal or external service providers for implementing and running the LA products.
5. Development and Operation: planning and implementation of the process of developing and operating an LA initiative.
6. Privacy: ensuring both security of learners' data and compliance with increasingly stringent legal requirements worldwide.
7. Ethics: ensuring that LA products do not bring harm and provide learners with the ability to opt out.
The above challenges are generic and broadly applicable to all LA projects. Drawing on recent literature, we expand on two of them (2 and 7) and tailor them to the difficulties that specifically relate to LAD projects. In addition, with supporting literature, we posit an additional challenge, Agility, alongside the original seven identified by Leitner et al. (2020).
Representation and actions
Dashboard visualization is more of a science than an art. The dashboard designer must possess a degree of understanding of how the human visual cortex perceives various visual cues in order to optimally match different data types to suitable visual representations. Some data are quantitative, while others are ordinal or categorical in their attributes. The values of each data type are best represented by different cues, which could comprise contrasting colors, differing spatial positions, or variations in symbol length, size, shape and orientation, amongst others. The designer also needs to possess domain expertise in learning theories and paradigms, as well as technical capabilities in developing dashboards (Klerkx et al., 2017).
Choosing the correct visualization technique can present difficulties, largely due to the increasing amounts of available data and the candidate variables/indicators that can be incorporated (Leitner et al., 2019). Ensuring that dashboards are informative without overwhelming the user is a challenging balancing act. From an aesthetic perspective, Tufte (2001) cautions against the use of 'non-data-ink' and 'chartjunk' in graphs; that is, he maintains that excessive use of colors, patterns or gridlines can clutter a graph and impede the recipient's comprehension. Bera (2016) specifically examines the overuse and misuse of color in business dashboards and its effect on users' decision-making abilities. Bera's research finds that contrasting colors vie for the user's attention and, unless necessary, distract from and impair the decision-making process. Using eye-tracking technology, the study demonstrated that the cognitive overload associated with the misuse of color in dashboards leads to longer fixation periods on irrelevant aspects of dashboards and delays users' comprehension of the information.
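To make the data-ink principle concrete, the following is a minimal Python/matplotlib sketch of our own, with purely illustrative data and variable names, showing how a dashboard chart of weekly VLE activity might be stripped of non-data ink and competing colors:

```python
import matplotlib.pyplot as plt

# Illustrative data only: weekly VLE logins for one learner (hypothetical values)
weeks = list(range(1, 13))
logins = [14, 11, 9, 12, 7, 5, 8, 6, 4, 7, 3, 5]

fig, ax = plt.subplots(figsize=(6, 3))

# Encode the quantitative series with a single muted color rather than competing hues
ax.plot(weeks, logins, marker="o", color="#4c72b0")

# Remove non-data ink: superfluous spines and gridlines that add no information
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.grid(False)

ax.set_xlabel("Week of semester")
ax.set_ylabel("VLE logins")
ax.set_title("Weekly VLE activity")

plt.tight_layout()
plt.show()
```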
Use of predictive modelling is becoming more prominent within LA (Bergner, 2017), and these techniques are emerging more frequently within dashboards. A recent study (Baneres et al., 2021) into developing LA technologies that act as early warning systems for identifying at-risk learners highlighted the need to move beyond 'old-fashioned' dashboards that rely solely on descriptive analytics and instead to orient efforts towards incorporating predictive analytics amongst other advanced features. However, building highly accurate and reliable predictive models is not trivial. Firstly, it requires considerable technical expertise which is not always easy to acquire. Secondly, predicting outcomes based on human behavior is a non-deterministic problem. Further, for scalability reasons, we ideally require generic predictive models which can predict student outcomes across widely disparate courses. However, since courses have different attributes, styles of delivery and assessment types, it is a considerable challenge to create a single generic predictive model that works optimally across diverse courses. On the other hand, developing tailored predictive models for each course creates technical resource overheads. Tailored models are also likely to perform badly in many instances due to scarcity of data leading to overfitting, since individual courses may have small class sizes or limited historical data. In a recent systematic literature review on the current state of prediction of student performance within LA, Namoun and Alshanqiti (2020) found that the potential of predictive modeling of student outcomes is not fully exploited and warrants further work. The study found that not only does the accuracy of existing models have room for improvement, but more robust testing of their validity, portability (i.e., generic models) and overall generalizability needs to be conducted. In a recent study, Umer et al. (2021) concluded that many datasets used to build predictive models in this domain were small, often having fewer than 10% of the overall data points for certain class labels, leading to unreliable predictive accuracies, especially when course-tailored predictive models are being created. The study also calls for enhancing the scope of engagement data to cover learner interaction data from forum messages, identifying pedagogically meaningful features and developing dashboard visualizations that have some underlying pedagogical intent.
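To illustrate the class-imbalance issue described above, the following is a minimal sketch of our own (not drawn from the cited studies) of training a simple at-risk classifier with scikit-learn on synthetic engagement features; with only a small minority of 'at-risk' examples, raw accuracy is misleading, so class weighting and a stratified, minority-class-focused evaluation are used:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(42)

# Synthetic engagement features for 200 learners (all hypothetical):
# [weekly VLE logins, forum posts, assignments submitted]
X = rng.poisson(lam=[20.0, 5.0, 3.0], size=(200, 3)).astype(float)

# Synthetic labels: learners with low engagement are flagged "at risk" (1),
# giving a small, imbalanced positive class, as is typical in practice
y = (X[:, 0] + 2 * X[:, 2] + rng.normal(0, 3, size=200) < 18).astype(int)

# class_weight="balanced" re-weights the rare "at risk" class so the model is
# not rewarded for simply predicting "not at risk" for everyone
model = LogisticRegression(class_weight="balanced", max_iter=1000)

# Stratified folds preserve the class ratio in each split; F1 on the minority
# class is far more informative than overall accuracy under heavy imbalance
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
f1_scores = cross_val_score(model, X, y, cv=cv, scoring="f1")

print("Minority-class F1 per fold:", np.round(f1_scores, 2))
```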
Developing accurate classifiers is further complicated by the negative effects of concept drift (Lu et al., 2018). Concept drift describes the degradation in the accuracy of predictive models over time as the data used to build them becomes disconnected from current real-life data. This can occur when learners' study patterns gradually or abruptly change (as in the case of pandemic responses), and current digital footprints no longer correlate with previous patterns in the historic record. For example, the gradual shift towards the use of virtual learning environments (VLE) over the last 10–15 years represents a concept drift. Learners' study patterns prior to this period bear little resemblance to those of learners today, and thus data from that historic period will likely degrade the predictive accuracy of models applied to current students. Concept drift can also happen suddenly; indeed, the abrupt migration to fully online learning during the recent pandemic crisis brought into play additional technologies and different digital patterns and footprints that students leave behind. The relationship between the independent and dependent variables in the historic data used to train predictive models, with the independent variables serving as inputs to predict the outcomes of current students, is constantly evolving. This phenomenon represents a technical and capability challenge for universities, as concept drift needs to be detected and accounted for, while the mechanisms for achieving this effectively are still being researched (Lu et al., 2018).
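While dedicated drift-detection methods remain an active research area, a simple first check, sketched below with assumed synthetic data of our own (and flagging only a shift in the input distribution of a single feature, which is one symptom of drift), is to compare a feature's distribution between the historical training cohort and the current cohort using a two-sample Kolmogorov-Smirnov test:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical feature: weekly VLE logins for the historical training cohort
historical_logins = rng.normal(loc=12, scale=4, size=1000)

# The same feature for the current cohort, whose study patterns have shifted
# (e.g., after a sudden move to fully online delivery)
current_logins = rng.normal(loc=20, scale=6, size=300)

# A very small p-value indicates the two samples are unlikely to come from the
# same distribution, i.e., a candidate signal that the model should be re-examined
statistic, p_value = ks_2samp(historical_logins, current_logins)

if p_value < 0.01:
    print(f"Possible drift in 'weekly logins' (KS={statistic:.2f}, p={p_value:.1e})")
else:
    print("No evidence of drift in 'weekly logins'")
```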
The above challenges are considerable. However, even if they can all be addressed, it is no longer sufficient to deploy predictive models and simply display their outputs without explaining to learners how a model arrived at a given prediction. It is also becoming more apparent that learners will engage with a LAD only if they understand how the displayed values are generated (Rets et al., 2021). Liu and Koedinger (2017) argue for the importance of interpretability, which in turn leads to actionability. Models need to possess explanatory characteristics so that learners understand why a model produced a given prediction, what the underlying driving factors are, and importantly, what insights can be derived from these explanations in order to trigger actionable behavioral adjustments. Not only should the interpretability of models and the explainability of their individual predictions be provided to learners, but also counterfactuals, which explicitly demonstrate alternative outcomes for the learner if a behavioral change were to take place in specific areas. Recent studies of LADs (Rets et al., 2021; Valle et al., 2021) have highlighted the necessity of integrating insights which are prescriptive and take the form of recommendations to guide students in their learning. Producing such rich and sophisticated outputs is a challenge, because extracting simplified representations of black-box predictive models and their reasoning is complex. Few available tools have sufficient maturity to support this functionality, and they again require a high level of expertise to implement and leverage.
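As a minimal illustration of the counterfactual idea, and not a substitute for mature explainability tooling, the sketch below (with a hypothetical feature layout of our own) probes how an already-fitted scikit-learn classifier's predicted risk for one learner would change if a single modifiable behavior, such as forum activity, were increased:

```python
import numpy as np

def counterfactual_probe(model, x, feature_index, new_values):
    """Re-predict one learner's at-risk probability after substituting
    alternative values into a single (assumed modifiable) feature."""
    baseline = model.predict_proba(x.reshape(1, -1))[0, 1]
    probes = []
    for value in new_values:
        x_cf = x.copy()
        x_cf[feature_index] = value   # hypothetical behavioral change
        probes.append((value, model.predict_proba(x_cf.reshape(1, -1))[0, 1]))
    return baseline, probes

# Example usage, assuming `model` is a fitted binary classifier trained on
# features [weekly VLE logins, forum posts, assignments submitted]:
# learner = np.array([6.0, 1.0, 2.0])
# base, probes = counterfactual_probe(model, learner, feature_index=1, new_values=[2, 4, 8])
# for posts, risk in probes:
#     print(f"If forum posts rose to {posts}: predicted risk {risk:.0%} (baseline {base:.0%})")
```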
Ethics
The challenges surrounding the ethical use of data within LA products are generally well understood and accepted. They center around questions of what personal data should be collected and processed by these systems, what insights should be extracted and with whom they should be shared. Additional concerns exist around the possible consequences for learners when conveying personalized information; institutions therefore need to be aware of intrusive advising or inappropriate labelling that may lead to learner resentment or demotivation (Campbell et al., 2007). As such, avoidance of harm to learners, alongside compliance with legal requirements, is paramount.
Given the importance of practical ethical underpinnings when using LA systems, it is acknowledged that robust and clear policies need to be formulated on what empirical data is permitted to be used for analytical purposes and to what end (Kitto & Knight, 2019). The study argues that these policies must be communicated to learners, together with the purported educational benefits that such systems claim to bring and the potential risks. A key concern, however, is the uncertainty regarding how the benefits are distributed, which may not be the same for everyone (Rubel & Jones, 2016); hence, institutions are encouraged to create a sense of transparency around LA systems by including statements on their data practices and limitations.
Beyond the well accepted dilemmas of LA systems listed above, predictive models used in LADs bring with them other acute challenges. Predictive models naturally embody the process of generalization: as machine learning algorithms learn and induce predictive models, they move from individual, specific examples to more general descriptors of the data. With this natural induction process, errors are invariably introduced. The ethical concern and challenge come into play when we consider both incorrect and correct classifications and the effects they might have on learners. If a student is misclassified as being "at-risk", this might discourage them and eventuate in a self-fulfilling prophecy, despite the fact that they were originally on track to complete successfully. Equally, using oversimplified classification labels can diminish the predictive value and in turn reduce the trustworthiness of the analytical approach. This challenge will always remain, since learners are not deterministic and predictive models in non-deterministic domains are inherently imperfect. Likewise, Bowker and Star (2000) note that even with correct predictions, some learners may treat the prediction as an incentive if they are already motivated and capable of positively adjusting their course in order to alter the predicted outcome, while for others the prediction may only serve to deflate them further.
Agility
Agility is the ability to rapidly adapt to changing requirements, to be flexible and to seize new opportunities. Universities are more resistant to change than industrial entities (Menon & Suresh, 2020); they are typically considered to be fractured and decentralized (Bunton, 2017), while possessing complex and non-standard business processes (Mukerjee, 2014a). However, financial constraints, coupled with competitive pressure arising from the unfolding digital revolution, have put universities on high alert to engage with new technologies (Mukerjee, 2014a). It is recognized that organizational agility is a crucial capability for universities at such times (Mukerjee, 2014b). Both the use of data insights and analytics and the development of these projects place immediate demands for agility on the organization operationalizing them. Agility is therefore a key challenge for universities attempting to productionize LADs.
The requirement for agility arises at different levels with respect to LADs. Translating LADs into products that genuinely improve learning outcomes requires constant monitoring and analysis of their usage patterns and user feedback, and ultimately the gathering of evidence of their efficacy. The consequence is an increase in resource costs for maintenance and continuous refinement of the LADs. Continuing support from institutions and willingness to provide ongoing, long-term refinements need to be secured ahead of time. Sun et al. (2019) point out that improvement of these types of systems needs to go beyond the pilot and deployment stages, and that the underlying assumptions used to develop them need to be re-assessed as adjustments are made to enhance their design or functionality. For best results, the design of dashboards should be iterative, with continuous feedback from learners, in order to ensure that an operationalized product is actually useful. This is time and resource intensive and requires agility.
From a data-oriented point of view, agility and the ability to integrate new data streams into LADs are paramount. Universities are rapidly incorporating modern technologies for course delivery and for improving the learning experience. These technologies sometimes augment what is already in place, while at other times they completely replace legacy processes and systems. This process has been accelerating and will continue to do so. The consequence is that new and more diverse digital footprints will continue to be generated by learners, especially with the increased demand for online education in the aftermath of COVID-19. Therefore, adaptability and rapid responsiveness in integrating new data sources are required in order to identify new features that can improve the predictive power of deployed models.
Finally, profound insights are compelling; they demand action if negligence is to be avoided. Deep insights can be game-changers and often call for swift action even when this is inconvenient. For example, if the predictive models powering the LADs identify certain qualifications within an institution's portfolio as key drivers of poor completion rates, then this would need to trigger action, and possibly advice on changes, that may neither be convenient for the institution nor align with its overarching strategic goals. With the deployment of LADs, therefore, comes the responsibility of asking the tough questions and adapting to the suggested changes in order to achieve a better institutional impact.