Resources & FAQs
Learning engineering is the use of computer science to pursue rapid experimentation and continuous improvement with the goal of improving student outcomes.
The learning engineering approach is critical because the current process to test and establish the efficacy of new ideas is too long and too expensive. Learning science research remains slow, small-scale, and data-poor compared to other fields. The result is that teachers and administrators often have neither the proven tools nor the research they need to make informed pedagogical decisions. Learning engineering aims to solve this problem using the tools of computer science.
For individual platforms, the learning engineering approach is important because it enables rapid experimentation and continuous improvement. In other words, learning engineering allows platforms to quickly understand whether an approach works, for whom, and at what time. This is central to scaling an effective product.
Opportunities for Learning Engineering
Want to know in which areas of education learning engineering has the greatest potential for impact? Read this joint report from the Penn Center for Analytics and the Learning Agency.
IES Practice Guide
Learning engineering is already offering insight into how educators can improve instruction. This guide gives teachers concrete recommendations for how to optimize student learning and memory.
Report on Managing Instructional Complexity
Drawing on a school-researcher partnership, this report unearths factors and strategies that educators can use to fine-tune their instructional decision-making to improve student learning.
Far too often, education research proves to be a frustrating process. Experiments often take years. Costs are high, sometimes many millions per study. Quality is also uneven, and many studies have small sample sizes and lack rigorous controls. Similarly, the field lacks high-quality datasets that can spark better research and a richer understanding of student learning.
Part of the issue is that learning is a complicated domain that takes place in highly varied contexts. Another issue is that the subjects of the studies are typically young people and so there are heightened concerns around privacy.
But the consequences of weak research processes are clear: in education, experts often don't know much about what works, why it works, for whom it works, and in what contexts.
Take the example of interleaved practice, or mixing up problem sets while learning. Research into middle school math has established that students learn better when their practice is interleaved, meaning students practice a mix of new concepts and concepts from earlier lessons. But it’s an open research question how far this principle extends. Does interleaved practice work equally well for reading comprehension or social studies? Does it work for younger math students too? Does the type of student (high-achieving versus behind) matter?
This lack of knowledge has important consequences, and far too much money, time, and energy is wasted on unproven educational theories and strategies.
Learning engineering, at its core, is really about three processes: (1) systematically collecting data as users interact with a platform, tool, or procedure while protecting student privacy, (2) analyzing the collected data to make increasingly educated guesses about what’s leading to better learning, and (3) iterating based on these data to improve the platform, tool, or procedure for better learning outcomes. Some but not all platforms will partner with researchers to better learn what’s working best for students. These findings can then be shared with the community at large to help improve learner outcomes everywhere.
Instrumentation means building out a digital learning platform so that many external researchers can use it to conduct research. To be more exact, the platform is offering its data as an “instrument” to do research. In this sense, instrumentation is central to learning engineering; it is the process by which a platform turns its data into a research tool.
One primary way to instrument is by building a way for external researchers to run A/B experiments. Several platforms have created systems that allow outside researchers to run their research trials on digital platforms. In other words, these platforms have “opened up” to outside researchers. They facilitate large-scale A/B trials and offer open-source trial tools, as well as tools that teachers themselves can use to conduct their own experiments.
When it comes to building A/B instrumentation within a platform, the process usually begins with identifying key data flows and ways in which there could be splits within the system. Platforms will also have to address issues of consent, privacy, and sample size. For instance, the average classroom does not provide a large enough sample size, and so platforms will need to think about ways to coordinate across classrooms. A number of platforms have also found success building “templates” to make it easier for researchers to run studies at scale.
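To make the sample-size point concrete, here is a rough sketch of a standard power calculation for a two-arm A/B test. It assumes student-level randomization, a continuous outcome, and the usual normal approximation; the function name `students_per_arm` and the default thresholds are illustrative choices, not something any platform named here prescribes. It also ignores clustering of students within classrooms, which would push the required numbers higher.

```python
import math
from statistics import NormalDist


def students_per_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate students needed in EACH arm of a two-arm A/B test.

    Normal-approximation formula for comparing two means:
        n = 2 * ((z_(1 - alpha/2) + z_power) / effect_size) ** 2
    where effect_size is the standardized mean difference (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = standard_normal.inv_cdf(power)          # desired statistical power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.1, 0.2, 0.5):
        print(f"effect size {d}: about {students_per_arm(d)} students per arm")
```

Under these assumptions, detecting a small effect of 0.2 standard deviations takes roughly 400 students per arm, which is why a single classroom falls short and platforms need to pool across classrooms.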
One example of this approach is the ETRIALS testbed created by the ASSISTments team. As co-founder Neil Heffernan has argued, ETRIALS “allows researchers to examine basic learning interventions by embedding RCTs within students’ classwork and homework assignments. The shared infrastructure combines student-level randomization of content with detailed log files of student- and class-level features to help researchers estimate treatment effects and understand the contexts within which interventions work.”
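As a rough illustration of what student-level randomization paired with log records can look like, the sketch below assigns each student to a condition by hashing the student and experiment identifiers. This is not the ETRIALS implementation; the function names, field names, and the hashing approach are all assumptions chosen for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def assign_condition(student_id: str, experiment_id: str, conditions: list[str]) -> str:
    """Deterministically map a student to one condition of an experiment.

    Hashing (experiment_id, student_id) gives a stable, roughly uniform
    assignment without storing a lookup table, and a student's assignment
    in one experiment does not determine their assignment in another.
    """
    digest = hashlib.sha256(f"{experiment_id}:{student_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(conditions)
    return conditions[bucket]


def log_exposure(student_id: str, class_id: str, experiment_id: str,
                 conditions: list[str]) -> str:
    """Build one JSON log line recording which condition a student saw."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "experiment_id": experiment_id,
        "student_id": student_id,  # hypothetical field; pseudonymize before sharing
        "class_id": class_id,      # class-level context for later analysis
        "condition": assign_condition(student_id, experiment_id, conditions),
    }
    return json.dumps(record)


if __name__ == "__main__":
    print(log_exposure("student-001", "class-7B", "hint-wording-v1", ["control", "treatment"]))
```

Because the assignment depends only on the two identifiers, a student sees the same condition on every visit, and students in the same classroom can land in different arms.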
To date, the ETRIALS tool has been used by almost two dozen researchers to conduct more than 100 studies, and these studies have yielded useful insights into student learning. For example, Neil Heffernan has shown that crowdsourcing “hints” from teachers has a statistically significant positive effect on student outcomes. The platform is currently expanding to increase the number of researchers by a factor of ten over the next three years.
Other examples of platforms that have “opened up” in this way include Canvas, Zearn, and Carnegie Learning.
Carnegie Learning created the powerful Upgrade tool to help ed tech platforms conduct A/B tests. This project is designed to be a “fully open source platform and aims to provide a common resource for learning scientists and educational software companies.” Using Carnegie Learning’s Upgrade, the Playpower Labs team found that adding “gamification” actually reduces learner engagement by 15 percent.
A second way that learning platforms can contribute to the field of learning engineering is to produce large shareable datasets. Sharing large datasets that have been anonymized (stripped of all personally identifiable markers, to protect student privacy) is a big catalyst for progress in the field as a whole.
In the field of machine learning for image recognition, there is a widely used, openly available dataset of more than 14 million labeled images called “ImageNet”. The creation and open release of this dataset has allowed researchers to build better and better image recognition algorithms, raising the standard of the entire field. We need similar datasets in the field of education.
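A minimal sketch of one step in preparing such a dataset is shown below: dropping direct identifiers and replacing the student ID with a keyed hash. The field names, the `pseudonymize` helper, and the 16-character token length are hypothetical. Strictly speaking this is pseudonymization, not full anonymization; a real release would also need to handle indirect identifiers (for example, rare combinations of school, grade, and demographics) and follow the applicable privacy rules.

```python
import hashlib
import hmac

# Hypothetical names of fields that directly identify a student.
DIRECT_IDENTIFIERS = {"name", "email", "date_of_birth"}


def pseudonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record without direct identifiers.

    The student ID is replaced by an HMAC-SHA256 token, so rows from the
    same student can still be linked, but the original ID cannot be
    recovered without the secret key (which is never shared).
    """
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "student_id"
    }
    token = hmac.new(secret_key, str(record["student_id"]).encode("utf-8"),
                     hashlib.sha256).hexdigest()
    cleaned["student_token"] = token[:16]  # shortened for readability
    return cleaned


if __name__ == "__main__":
    row = {"student_id": 1042, "name": "Example Student",
           "email": "student@example.com", "grade": 7, "score": 0.83}
    print(pseudonymize(row, secret_key=b"replace-with-a-real-secret"))
```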
An example of this approach is the development of a dataset aimed at improving assisted feedback on writing. Called the “Feedback Prize,” this effort will build on the Automated Student Assessment Prize (ASAP) that occurred in 2012 and support educators in their efforts to give feedback to students on their writing.
To date, the project has developed a dataset of nearly 400,000 essays from more than a half-dozen different platforms. The data are currently being annotated for discourse features (e.g., evidence and claims) and will be released as part of a data science competition. More on the project here.
Another example of an organization that has created a shared dataset is CommonLit, which uses algorithms to determine the readability of texts. CommonLit has shared its corpus of 3,000 level-assessed reading passages for grades 6-12. This will allow researchers to create open-source readability formulas and applications.
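For readers unfamiliar with readability formulas, the sketch below implements one classic example, the Flesch-Kincaid grade level. It is not CommonLit's method, and the syllable counter is a crude vowel-group heuristic assumed here for illustration; a leveled corpus like the one described above is the kind of data that lets researchers test and improve on formulas like this.

```python
import re


def count_syllables(word: str) -> int:
    """Very rough syllable estimate: count runs of vowels (assumed heuristic)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
        0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59


if __name__ == "__main__":
    simple = "The cat sat on the mat. The dog ran."
    dense = ("Photosynthesis converts atmospheric carbon dioxide into "
             "carbohydrates utilizing electromagnetic radiation.")
    print(round(flesch_kincaid_grade(simple), 2))
    print(round(flesch_kincaid_grade(dense), 2))
```

The formula relies only on sentence length and syllable counts, so it can misjudge short texts with rare vocabulary or long texts with simple ideas.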
Yet another platform that has created useful large-scale datasets is Infinite Campus. Their student information system (SIS) and learning management system (LMS) dataset includes demographic, enrollment, program, behavior, health, schedule, attendance, curriculum, and assessment information. Using these data with the proper permissions, the company facilitates partnerships between research organizations, education agencies, and research funders to ask questions at scale about what works for student learning.
For the Learning Engineering Tools Competition 2021, a dataset alone would not make a highly competitive proposal. Teams with a compelling dataset are encouraged to partner with a researcher or developer who will design a tool or an algorithm based on the data.
For a list of researchers, please email email@example.com. We have a large and growing network of researchers who can assist platforms with:
- how best to instrument a platform in ways that would serve the field,
- determining what data a platform is able to collect and how best to collect it,
- using the data and related research to answer questions of interest.
We are also happy to make connections to researchers through individual requests or broader networking listservs and events.