forked from whirlsyu/mlr3
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
110 lines (84 loc) · 5.69 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
---
output: github_document
---
# mlr3 <img src="man/figures/logo_navbar.png" align="right" />
Efficient, object-oriented programming on the building blocks of machine learning.
Successor of [mlr](https://github.com/mlr-org/mlr).
[![Build Status](https://travis-ci.org/mlr-org/mlr3.svg?branch=master)](https://travis-ci.org/mlr-org/mlr3)
[![CRAN](https://www.r-pkg.org/badges/version/mlr3)](https://cran.r-project.org/package=mlr3)
[![codecov](https://codecov.io/gh/mlr-org/mlr3/branch/master/graph/badge.svg)](https://codecov.io/gh/mlr-org/mlr3)
[![StackOverflow](https://img.shields.io/badge/stackoverflow-mlr3-orange.svg)](https://stackoverflow.com/questions/tagged/mlr3)
[![Dependencies](https://tinyverse.netlify.com/badge/mlr3)](https://cran.r-project.org/package=mlr3)
## Resources
* We _started_ writing a [book manual](https://mlr3book.mlr-org.com/), but it is still in early stages.
* [Reference Manual](https://mlr3.mlr-org.com/reference/)
* [Extension packages](https://github.com/mlr-org/mlr3/wiki/Extension-Packages)
* [useR!2019 talks](https://github.com/mlr-org/mlr-outreach/tree/master/2019_useR)
* [Blog](https://mlr-org.com/) about _mlr_ and _mlr3_
## Installation
```{r, eval = FALSE}
remotes::install_github("mlr-org/mlr3")
```
## Example
### Constructing Learners and Tasks
```{r}
library(mlr3)
set.seed(1)
# create learning task
task_iris = TaskClassif$new(id = "iris", backend = iris, target = "Species")
task_iris
# load learner and set hyperparamter
learner = lrn("classif.rpart", cp = 0.01)
```
### Basic train + predict
```{r}
# train/test split
train_set = sample(task_iris$nrow, 0.8 * task_iris$nrow)
test_set = setdiff(seq_len(task_iris$nrow), train_set)
# train the model
learner$train(task_iris, row_ids = train_set)
# predict data
prediction = learner$predict(task_iris, row_ids = test_set)
# calculate performance
prediction$confusion
measure = msr("classif.acc")
prediction$score(measure)
```
### Resample
```{r}
# automatic resampling
resampling = rsmp("cv", folds = 3L)
rr = resample(task_iris, learner, resampling)
rr$performance(measure)
rr$aggregate(measure)
```
## Why a rewrite?
[mlr](https://github.com/mlr-org/mlr) was first released to [CRAN](https://cran.r-project.org/package=mlr) in 2013.
Its core design and architecture date back even further.
The addition of many features has led to a [feature creep](https://en.wikipedia.org/wiki/Feature_creep) which makes [mlr](https://github.com/mlr-org/mlr) hard to maintain and hard to extend.
We also think that while mlr was nicely extensible in some parts (learners, measures, etc.), other parts were less easy to extend from the outside.
Also, many helpful R libraries did not exist at the time [mlr](https://github.com/mlr-org/mlr) was created, and their inclusion would result in non-trivial API changes.
## Design principles
* Only the basic building blocks for machine learning are implemented in this package.
* Focus on computation here. No visualization or other stuff. That can go in extra packages.
* Overcome the limitations of R's [S3 classes](https://adv-r.hadley.nz/s3.html) with the help of [R6](https://cran.r-project.org/package=R6).
* Embrace [R6](https://cran.r-project.org/package=R6), clean OO-design, object state-changes and reference semantics. This might be less "traditional R", but seems to fit `mlr` nicely.
* Embrace [`data.table`](https://cran.r-project.org/package=data.table) for fast and convenient data frame computations.
* Combine `data.table` and `R6`, for this we will make heavy use of list columns in data.tables.
* Be light on dependencies. `mlr3` requires the following packages:
- [`backports`](https://cran.r-project.org/package=backports): Ensures backward compatibility with older R releases. Developed by members of the `mlr` team. No recursive dependencies.
- [`checkmate`](https://cran.r-project.org/package=checkmate): Fast argument checks. Developed by members of the `mlr` team. No extra recursive dependencies.
- [`mlr3misc`](https://github.com/mlr-org/mlr3misc) Miscellaneous functions used in multiple mlr3 [extension packages](https://github.com/mlr-org/mlr3/wiki/Extension-Packages). Developed by the `mlr` team. No extra recursive dependencies.
- [`paradox`](https://github.com/mlr-org/paradox): Descriptions for parameters and parameter sets. Developed by the `mlr` team. No extra recursive dependencies.
- [`R6`](https://cran.r-project.org/package=R6): Reference class objects. No recursive dependencies.
- [`data.table`](https://cran.r-project.org/package=data.table): Extension of R's `data.frame`. No recursive dependencies.
- [`digest`](https://cran.r-project.org/package=digest): Hash digests. No recursive dependencies.
- [`lgr`](https://github.com/s-fleck/lgr): Logging facility. No extra recursive dependencies.
- [`Metrics`](https://cran.r-project.org/package=Metrics): Package which implements performance measures. No recursive dependencies.
- [`mlbench`](https://cran.r-project.org/package=mlbench): A collection of machine learning data sets. No dependencies.
* [Reflections](https://en.wikipedia.org/wiki/Reflection_%28computer_programming%29): Objects are queryable for properties and capabilities, allowing you to programm on them.
* Additional functionality that comes with extra dependencies:
- For parallelization, `mlr3` utilizes the [`future`](https://cran.r-project.org/package=future) and [`future.apply`](https://cran.r-project.org/package=future.apply) packages.
- To capture output, warnings and exceptions, [`evaluate`](https://cran.r-project.org/package=evaluate) and [`callr`](https://cran.r-project.org/package=callr) can be used.
# Talks, Workshops, etc.
[mlr-outreach](https://github.com/mlr-org/mlr-outreach) holds all outreach activities related to _mlr_ and _mlr3_.