Data, data, data. It’s the new oil and the new driver of capitalism, war, and politics, so inevitably its role in education has come to the fore. Interest in analytics is driven by the increased amount of time that students spend in online learning environments, particularly LMS and MOOC, but also by the increased data available across a university, including library usage, attendance, demographic data, and so on. Sclater, Peasgood, and Mullan (2016) defined learning analytics as “the measurement, collection, analysis and reporting of data about the progress of learners and the contexts in which learning takes place” (p. 4).
Learning analytics grew as a field from around 2011, when George Siemens hosted the first Learning Analytics conference. By 2014, it had emerged as a field of its own, combining elements of statistics, computer science, and education. Although not a direct consequence, there is a definite synergy and similarity between MOOC and learning analytics, not least through the presence of George Siemens as an early and prominent voice in both areas. MOOC generated a lot of interest, partly because they created large datasets and partly because, removed from the constraints of formal education, they were vehicles for conducting A/B testing and quantitative analysis. Both approaches brought new people into educational technology, particularly from the computer science field. They brought with them new methods and concepts to apply to educational analysis. If the knowledge exchange is reciprocal, then this evolving nature of ed tech could be one of its strengths.
The positive side of learning analytics is that for distance education, in particular, it provides the equivalent of responding to subtle signals in the face-to-face environment: the puzzled expression, the yawn, or the whispering between students seeking clarity. Every good face-to-face educator will respond to these signals and adjust their behaviour. In an online environment, these cues are absent, and analytics provides some proxy for them. As Siemens and Long (2011) have put it, “Learning analytics can penetrate the fog of uncertainty around how to allocate resources, develop competitive advantages, and most important, improve the quality and value of the learning experience” (p. 40). Bodily, Nyland, and Wiley (2017) proposed the use of analytics to address particular problems, using the RISE (Resource Inspection, Selection, and Enhancement) framework. In this, a 2 × 2 grid of outcome versus engagement was proposed, with a student’s grade on assessment on the y-axis, and engagement on the x-axis. By using analytics, educators were able to assess what the authors suggested was the particularly valuable area for intervention — that of high engagement and low attainment.
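The grid logic can be made concrete with a minimal sketch. This is not the authors’ implementation; the function name, threshold values, and quadrant labels are all assumptions for illustration, showing only how a student might be placed in the outcome-versus-engagement grid and the high-engagement/low-attainment quadrant flagged for intervention.

```python
def rise_quadrant(grade, engagement, grade_cutoff=70, engagement_cutoff=50):
    """Place a student in a 2x2 grid of assessment grade (outcome, y-axis)
    versus engagement (x-axis). Cutoff values are illustrative only."""
    high_grade = grade >= grade_cutoff
    high_engagement = engagement >= engagement_cutoff
    if high_engagement and not high_grade:
        # The quadrant Bodily, Nyland, and Wiley highlight as the most
        # valuable target for intervention.
        return "high engagement, low attainment (intervene)"
    if high_engagement and high_grade:
        return "high engagement, high attainment"
    if high_grade:
        return "low engagement, high attainment"
    return "low engagement, low attainment"

rise_quadrant(40, 80)  # a struggling but engaged student: flag for support
```

In practice the engagement measure would itself be a composite of analytics data (logins, clicks, forum posts), which is where much of the difficulty lies.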
FIGURE 2. The learning analytics cycle, after Clow (2011).
The basic model for analytics is that issues are identified and then some form of effective intervention is implemented. Clow (2012) proposed a learning analytics cycle, shown in Figure 2. In this model, learners generated data that was then processed into metrics or analytics, such as dashboards. Clow stated that this metrics stage was “the heart of most learning analytics projects and has been the focus of great innovation in tools, methods and methodologies — e.g. dashboards, predictive modelling, social network analysis, recommenders, and so on” (p. 135). For analytics to be effective, however, intervention was required that would have some effect on the behaviour of learners.
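The cycle can be sketched as a simple pipeline: learners generate data, data is processed into metrics, and metrics drive an intervention that feeds back to the learners. Everything here is illustrative; the function names, the login-count metric, and the below-average flagging rule are invented for the example, not part of Clow’s model.

```python
def analytics_cycle(learners, collect, process, intervene):
    """One pass of Clow's cycle: learners -> data -> metrics -> intervention."""
    data = collect(learners)              # learners generate data
    metrics = process(data)               # data becomes metrics (dashboards, models)
    return intervene(learners, metrics)   # metrics inform an intervention

# Toy example: flag learners whose login count is below the group average.
learners = [{"name": "A", "logins": 2}, {"name": "B", "logins": 10}]

def collect(ls):
    return {l["name"]: l["logins"] for l in ls}

def process(data):
    average = sum(data.values()) / len(data)
    return {name: count < average for name, count in data.items()}

def intervene(ls, flags):
    return [l["name"] for l in ls if flags[l["name"]]]

analytics_cycle(learners, collect, process, intervene)  # -> ["A"]
```

The point of the cycle, as Clow stressed, is the final step: metrics that never close the loop back to learner behaviour are not effective analytics.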
Sclater and Mullan (2017) reported on a range of such interventions, usually targeting at-risk students across different institutions, which improved grades in the range of 2 to 12% and increased retention rates. However, analytics can also be used for more long-term analysis. For example, Purdue’s Course Signals approach used a traffic light system to predict student performance, based on demographic characteristics, academic history, and interaction with the LMS. Students were sent a personalized email from the faculty member that indicated their “traffic signal colour.” Their results showed improved retention and generally high levels of student satisfaction (Arnold & Pistilli, 2012). However, the validity of some of these results was called into question. Caulfield (2013b) highlighted that with more Course Signals courses in existence, students who persisted would inevitably take more such courses: “Students are taking more . . . Signals courses because they persist, rather than persisting because they are taking more Signals courses” (para. 7). This highlights two issues with analytics: firstly, that their claims are often difficult for non-experts to verify, and secondly, that there is a persistent problem of distinguishing correlation from causation. It is no surprise that students who perform better tend to spend more time in the library or contribute more in the LMS. These are attributes of studying, so “good” students tend to study a lot. But making other students spend more time in the LMS, for example, may not lead to an improvement in performance.
Rienties (2018) used analysis of different large data sets at the Open University to highlight “6 myths” or commonly held beliefs about student behaviour:
- Open University (OU) students love to work together.
- Student satisfaction is positively related to success.
- Student performance improves over time (e.g., from first to third level courses).
- The grades students achieve are mostly related to what they do.
- Student engagement in the VLE is mostly determined by the student.
- Most OU students follow the schedule when studying.
The data reveals that all these preconceptions are, to some extent, false. What this analysis reveals could be deemed as concerns, but it also highlights positive behaviour. For example, that student behaviour is largely determined by what is set out in the course (number 5) can be interpreted as an effective outcome of good learning design, particularly for distance education students. Similarly, while students don’t slavishly follow the course schedule (number 6), with many studying ahead or just behind the calendar, this can be framed as part of the flexible (and accessible) design. What this type of analysis highlights is the value in questioning our assumptions about student behaviour. Just as an author may have their ideal reader in mind, a course designer may have an ideal student, but this analysis reveals that some of those assumptions may not be valid.
The downsides to learning analytics are that they can reduce students to data and that ownership of this data becomes a commodity in itself. The use of data surveillance has only just begun and with scandals around Facebook and Cambridge Analytica (see, for example, Cadwalladr & Graham-Harrison, 2018), the issues involved in how data is used are only just becoming apparent. Higher education has a duty to increase understanding about these data-harvesting activities, and it should not be seen to be partaking in data surveillance and accustoming students to this way of working unquestioningly. Analytics puts the institution in the central watchtower of the panopticon, potentially observing all student interaction (Land & Bayne, 2005). The ed tech field needs to avoid the mistakes of data capitalism and should embed learner agency and ethics in the use of data, and it should deploy that data sparingly. Nelson and Harfield (2017) claimed that it is essential for students to be involved in the discussions about analytics, stating that the primary aim of a university education is “to ethically develop and realize both individual and socio-cultural potentialities . . . that can only happen when students are involved in making sense of their own data” (para. 2).
Another implication is that a data-driven approach is essentially quantitative, but education is largely a qualitative endeavour, dealing with real students, in something that is of great emotional significance. It can sometimes be easy to forget that the nodes on a data plot are students. We are only at the beginning of the use of analytics in education, and as the quantity of data and the sophistication of the analysis increases, the danger is that instead of analytics supporting education, analytics becomes education.
In order to realize a moral implementation of learning analytics and address some of these issues, Slade and Prinsloo (2013) proposed six principles:
- Learning analytics as moral practice — their first principle is to appreciate that learning analytics is a moral undertaking and should not only focus on what is effective, but also “function primarily as a moral practice resulting in understanding rather than measuring” (p. 12).
- Students as agents — in line with Nelson and Harfield (2017), they propose that institutions should “engage students as collaborators and not as mere recipients of interventions and services” (p. 12).
- Student identity and performance are temporal dynamic constructs — students’ identities will change over the course of their studies; indeed, education is often portrayed as an identity-changing experience. Analytics and data need to take this into account.
- Student success is a complex and multidimensional phenomenon — student success and behaviour is a result of more than can be measured through data.
- Transparency — institutions should be transparent regarding what data is gathered and how it will be used.
- Higher education cannot afford to not use data — however, they stress that it is part of an institution’s responsibility to make moral and effective use of data.
As Slade and Prinsloo (2013) have pointed out, there are serious ethical issues raised by analytics, which current legislation and systems may be ill-equipped to deal with. Let’s imagine a scenario where a researcher has created a very accurate predictive analytics model that can foretell whether a student will drop out or complete a course with something approaching 90% accuracy. For this scenario, let us put aside debate about whether this is possible, although Agnihotri and Ott (2014) reported a 75% accurate predictive model for students who do not return. The researcher’s intentions are entirely noble — the researcher wants to allow the university to target extra support for these students to increase their chances of success. This, however, immediately raises an ethical problem: Should the students be told? Would this make it a self-fulfilling prophecy? Clow (2013) summarized it thus:
What is the ethical thing to do when your predictive algorithm says there’s very little chance that a would-be student will pass your course? Is it right to take their time, effort and money (or that of whoever is subsidising their place), when it will almost certainly come to very little? But on the other hand, is it right to block them from study? (para. 4)
Moving beyond this immediate concern, let’s assume this algorithm gets adopted in a learning analytics system taken up by universities worldwide. Any such algorithm will likely incorporate elements of class and race, or at least proxies for these; for example, Sclater (2014) reported on a university’s use of analytics to specifically offer tailored support for black and minority ethnic students. For many universities in our scenario, rather than being a means of offering extra support, it allows them to more accurately filter out students who are expensive to support and more likely to fail. When universities are judged on their completion and continuation rates (for example, continuation is one of the metrics in the teaching excellence framework in the UK), then such action becomes more likely.
Our intrepid researcher, who started out wanting to increase the support for disadvantaged students, is now the cause of a global system that is reinforcing privilege and creating an elitist education system, which systematically excludes certain groups. Algorithms are not apolitical. While there are, of course, many assumptions and oversimplifications in this scenario that could be challenged, its function is to highlight how even well-intentioned applications of analytics can quickly raise very complex ethical questions.
One of the benefits of considering analytics might simply be better communication with students. Navigating the peculiar, often idiosyncratic, world of higher education with its rules and regulations can be daunting and confusing. Designing useful dashboards, for instance, surfaces some of this complexity. Bennett (2018), for example, reported how the items most valued by students were clear graphics showing attendance and predicted degree grade. The latter was not based on behavioural analytics but rather a calculation based on their scores in modules so far. With different weightings, substitutions, and averaging, it is often difficult for a student to know what degree classification they are on track for, and what improvements they need to make in terms of scores in order to adjust this. This highlights how institutions can do a lot to simplify and communicate their processes to students.
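A hypothetical sketch of the kind of calculation such a dashboard might surface: a credit-weighted average of module scores projected against classification boundaries. The boundary values and the simple weighted average are assumptions for illustration; real institutional rules add substitutions, best-credit rules, and level weightings that make the manual calculation genuinely hard for students.

```python
def projected_classification(module_scores, boundaries=None):
    """module_scores: list of (score, credit_weight) pairs for completed
    modules. Returns (weighted average, current classification band).
    Boundaries below are illustrative UK-style bands, not any
    institution's actual rules."""
    if boundaries is None:
        boundaries = [(70, "first"), (60, "upper second"),
                      (50, "lower second"), (40, "third")]
    total_credits = sum(weight for _, weight in module_scores)
    average = sum(score * weight for score, weight in module_scores) / total_credits
    for cutoff, label in boundaries:
        if average >= cutoff:
            return average, label
    return average, "fail"

# Three modules of 30, 30, and 60 credits:
projected_classification([(65, 30), (72, 30), (58, 60)])  # -> (63.25, "upper second")
```

Even this toy version shows why a clear dashboard graphic is valued: the answer is a few lines of arithmetic, but it is arithmetic most students never see laid out.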
Analytics and data are in the early stages of their adoption, and as Slade and Prinsloo (2013) proposed, institutions cannot afford not to use them. It is difficult to argue that you make education more effective by knowing less about your students, but the usage of analytics comes with a host of issues that are complex to navigate. Probably more than any other ed tech application, learning analytics necessitates a moral philosopher or social scientist in the room alongside the developers.