Predicting students at risk of academic failure using ensemble model during pandemic in a distance learning system

Table 4 Grid-search parameters for each candidate algorithm

Algorithms	Hyperparameter space to improve algorithm performance
DT	Measure of impurity = ['gini', 'entropy'] & Split strategy = ['best', 'random'] & Max depth = [None, 3] & Max features = ['auto', 'sqrt', 'log2'] & Class weight = ['balanced', None]
RF	Measure of impurity = ['gini', 'entropy'] & Bootstrap = [False, True] & Max depth = [None, 3] & Warm start = [False, True] & Class weight = ['balanced', 'balanced_subsample'] & Number of trees = [100, 200, 300, 400]
ET	Measure of impurity = ['gini', 'entropy'] & Max depth = [None, 3] & Warm start = [False, True] & Class weight = ['balanced', 'balanced_subsample'] & Number of trees = [100, 200, 300, 400]
LR	Optimization algorithm = ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'] & Inverse of regularization strength = [0.1, 1, 10, 100] & Class weight = ['balanced', None] & Dual = [False, True] & Fit intercept = [False, True] & Tol = [0.001, 0.01] & Warm start = [False, True]
GB	Measure of impurity = ['friedman_mse', 'mse', 'mae'] & Max depth = [None, 3] & Number of trees = [100, 200, 300]
ANN	Alpha = [0.1, 0.01, 0.001] & Activation = ['relu', 'logistic'] & Early stopping = [True, False] & Number of hidden layers = [200, 300, 400]
QDA	*No parameter settings available in Scikit-learn implementation

Abbreviations: Decision Tree (DT), Random Forest (RF), Extra Trees (ET), Logistic Regression (LR), Gradient Boosting (GB), Artificial Neural Network (ANN), and Quadratic Discriminant Analysis (QDA).
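To give a sense of how large each search space in Table 4 is, the grids can be written as plain dictionaries and expanded exhaustively. The sketch below is illustrative only. The keyword names (`criterion`, `splitter`, `n_estimators`, `solver`, `C`, and so on) are an assumed mapping of the table's labels onto scikit-learn parameter names. Reading "Number of hidden layers" as the width of a single hidden layer is also an assumption. The helpers `expand_grid` and `grid_size` are hypothetical names, not part of any library.

```python
from itertools import product

# Table 4 search spaces. The keys are an ASSUMED mapping of the table's
# labels onto scikit-learn keyword names; the values are copied from the table.
PARAM_GRIDS = {
    "DT": {
        "criterion": ["gini", "entropy"],
        "splitter": ["best", "random"],
        "max_depth": [None, 3],
        "max_features": ["auto", "sqrt", "log2"],
        "class_weight": ["balanced", None],
    },
    "RF": {
        "criterion": ["gini", "entropy"],
        "bootstrap": [False, True],
        "max_depth": [None, 3],
        "warm_start": [False, True],
        "class_weight": ["balanced", "balanced_subsample"],
        "n_estimators": [100, 200, 300, 400],
    },
    "ET": {
        "criterion": ["gini", "entropy"],
        "max_depth": [None, 3],
        "warm_start": [False, True],
        "class_weight": ["balanced", "balanced_subsample"],
        "n_estimators": [100, 200, 300, 400],
    },
    "LR": {
        "solver": ["newton-cg", "lbfgs", "liblinear", "sag", "saga"],
        "C": [0.1, 1, 10, 100],
        "class_weight": ["balanced", None],
        "dual": [False, True],
        "fit_intercept": [False, True],
        "tol": [0.001, 0.01],
        "warm_start": [False, True],
    },
    "GB": {
        "criterion": ["friedman_mse", "mse", "mae"],
        "max_depth": [None, 3],
        "n_estimators": [100, 200, 300],
    },
    "ANN": {
        "alpha": [0.1, 0.01, 0.001],
        "activation": ["relu", "logistic"],
        "early_stopping": [True, False],
        # ASSUMPTION: the table's values are read as one hidden layer of this width.
        "hidden_layer_sizes": [(200,), (300,), (400,)],
    },
    # The table lists no tunable settings for QDA, so its grid is empty.
    "QDA": {},
}


def expand_grid(grid):
    """Yield one dict per combination of the values in a parameter grid."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_size(grid):
    """Count the candidate configurations an exhaustive search would visit."""
    return sum(1 for _ in expand_grid(grid))


if __name__ == "__main__":
    for algorithm, grid in PARAM_GRIDS.items():
        print(algorithm, grid_size(grid))
```

Under these assumptions the grids contain 48 (DT), 128 (RF), 64 (ET), 640 (LR), 18 (GB), and 36 (ANN) candidate configurations, and the empty QDA grid expands to the single default configuration. Not every combination is necessarily valid for the underlying estimator (for example, some solver and `dual` pairings in LR), and some listed values such as 'auto', 'mse', and 'mae' may be deprecated or removed in newer scikit-learn releases, so the grids may need adjusting before use.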