Chapter 1 Introduction

1.1 Background

The mlr3 (Lang et al. 2019) package and ecosystem provide a generic, object-oriented, and extensible framework for classification, regression, survival analysis, and other machine learning tasks for the R language (R Core Team 2019). This unified interface provides functionality to extend and combine existing machine learning algorithms (learners), intelligently select and tune the most appropriate technique for a specific machine learning task, and perform large-scale comparisons that enable meta-learning. Examples of this advanced functionality include hyperparameter tuning and feature selection. Parallelization of many operations is natively supported.

1.2 Target Users

We assume that users of mlr3 have the equivalent knowledge of an introductory machine learning course and some experience in R. A background in computer science or statistics will provide a strong basis for understanding the advanced functionality described in the later chapters of this book. “An Introduction to Statistical Learning” provides a comprehensive introduction for those getting started in machine learning. mlr3 is suitable for complex projects that utilize the high degree of control as well as the highly abstracted “syntactic sugar” to mock-up specific tasks.

mlr3 provides a domain-specific language for machine learning in R. We target both practitioners who want to quickly apply machine learning algorithms and researchers who want to implement, benchmark, and compare their new methods in a structured environment.

1.3 Development

mlr (Bischl et al. 2016) was first released to CRAN in 2013, with the core design and architecture dating back much further. Over time, the addition of many features has led to a considerably more complex design that made it harder to build, maintain, and extend than we had hoped for. With hindsight, we saw that some design and architecture choices in mlr made it difficult to support new features, in particular with respect to pipelines. Furthermore, the R ecosystem as well as helpful packages such as data.table have undergone major changes in the meantime.

Bischl, Bernd, Michel Lang, Lars Kotthoff, Julia Schiffner, Jakob Richter, Erich Studerus, Giuseppe Casalicchio, and Zachary M. Jones. 2016. “mlr: Machine Learning in R.” Journal of Machine Learning Research 17 (170): 1–5. http://jmlr.org/papers/v17/15-066.html. It would have been nearly impossible to integrate all of these changes into the original design of mlr. Instead, we decided to start working on a reimplementation in 2018, which resulted in the first release of mlr3 on CRAN in July 2019. The new design and the integration of further and newly-developed R packages (especially R6, future, and data.table) makes mlr3 much easier to use, maintain, and in many regards more efficient compared to its predecessor mlr.