In MLJ, loss functions, scoring rules, confusion matrices, sensitivities, etc., are collectively referred to as measures. These measures are provided by the package StatisticalMeasures.jl but are immediately available to the MLJ user. Here's a simple example of direct application of the log_loss measure to compute a training loss:
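using MLJ
X, y = @load_iris
# load the model type (the DecisionTree.jl interface package must be installed)
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
tree = DecisionTreeClassifier(max_depth=2)
# bind the model to the data and train it
mach = machine(tree, X, y) |> fit!
# probabilistic predictions on the training data
yhat = predict(mach, X)
# aggregated log loss of those predictions, i.e. the training loss
log_loss(yhat, y)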
For more examples of direct measure usage, see the StatisticalMeasures.jl tutorial.
A list of all measures, ready to use after running using MLJ or using StatisticalMeasures, is here. Alternatively, call measures() (experimental) to generate a dictionary keyed on available measure constructors, with measure metadata as values.
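As a minimal sketch of querying that dictionary, the following looks up the metadata for one constructor. The `aliases` field is the one used later on this page; the variable names `d` and `meta` are illustrative only, and since `measures()` is experimental its output may change.

```julia
using StatisticalMeasures

# Dictionary keyed on measure constructors, with metadata as values
d = measures()

# Metadata entry for the root mean squared error constructor
meta = d[RootMeanSquaredError]

# Instance aliases registered for that constructor
meta.aliases
```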
Any measure-like object with appropriate calling behavior can be used with MLJ. To quickly build custom measures, we recommend using the package StatisticalMeasuresBase.jl, which provides this tutorial. Note, in particular, that an "atomic" measure can be transformed into a multi-target measure using this package.
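To illustrate what "appropriate calling behavior" can look like in the simplest case, here is a sketch of a hand-written measure defined as a plain function of predictions and ground truth. The name `my_mae` and the sample data are hypothetical; the built-in `mae` is called only for comparison. A plain function like this carries no trait declarations (orientation, weight support, and so on), which is what StatisticalMeasuresBase.jl is recommended for.

```julia
using StatisticalMeasures

# Hypothetical custom measure: mean absolute error as a plain function
my_mae(ŷ, y) = sum(abs.(ŷ .- y)) / length(y)

ŷ = [1.0, 2.0, 4.0]   # predictions
y = [1.0, 3.0, 2.0]   # ground truth

my_mae(ŷ, y)   # absolute errors are 0, 1 and 2, so the mean is 1.0
mae(ŷ, y)      # the built-in measure, for comparison
```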
In MLJ, measures are specified:
- when evaluating model performance using evaluate!/evaluate - see Evaluating Model Performance
- when wrapping models using TunedModel - see Tuning Models
- when wrapping iterative models using IteratedModel - see Controlling Iterative Models
- when generating learning curves using learning_curve - see Learning Curves
and elsewhere.
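For the first case in the list, a minimal sketch of passing measures to evaluate might look as follows. It reuses the iris data and decision tree from the opening example; the resampling settings (three folds, a fixed rng) are arbitrary choices made for illustration, and the DecisionTree.jl interface package is assumed to be installed.

```julia
using MLJ

X, y = @load_iris
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
tree = DecisionTreeClassifier(max_depth=2)

# Estimate out-of-sample performance with two measures at once
e = evaluate(
    tree, X, y;
    resampling=CV(nfolds=3, shuffle=true, rng=123),
    measures=[log_loss, accuracy],
    verbosity=0,
)

# One aggregated measurement per measure, in the order given
e.measurement
```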
In previous versions of MLJ, measures from LossFunctions.jl were also available. Now measures from that package must be explicitly imported and wrapped, as described here.
A related performance evaluation tool provided by StatisticalMeasures.jl, and hence by MLJ, is the roc_curve method:
StatisticalMeasures.roc_curve
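A hedged sketch of calling roc_curve on probabilistic predictions for a binary problem is given below. The synthetic data from make_blobs, the choice of decision tree and the variable names are assumptions made for illustration; the three returned vectors are taken here to be false positive rates, true positive rates and thresholds, in that order.

```julia
using MLJ

# Synthetic two-class problem (illustrative data only)
X, y = make_blobs(100, 2; centers=2, rng=123)

DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
mach = machine(DecisionTreeClassifier(max_depth=2), X, y)
fit!(mach, verbosity=0)

# Probabilistic predictions are required for an ROC curve
ŷ = predict(mach, X)

# Data for plotting the curve
fprs, tprs, thresholds = roc_curve(ŷ, y)
```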
Prior to MLJBase.jl 1.0 (respectively, MLJ.jl version 0.19.6) measures were defined in MLJBase.jl (a dependency of MLJ.jl), but now they are provided by the MLJ.jl dependency StatisticalMeasures.jl. Effects on users are detailed below:
- If using MLJBase without MLJ, then, in Julia 1.9 or higher, StatisticalMeasures must be explicitly imported to use measures that were previously part of MLJBase. If using MLJ, then all previous measures are still available, with the exception of those corresponding to LossFunctions.jl (see below).
- All measures return a single aggregated measurement. In other words, measures previously reporting a measurement per-observation (previously subtyping Unaggregated) no longer do so. To get per-observation measurements, use the new method StatisticalMeasures.measurements(measure, ŷ, y[, weights, class_weights]).
- The default measure for regression models (used in evaluate/evaluate! when measures is unspecified) is changed from rms to l2=LPLoss(2) (mean sum of squares).
- MeanAbsoluteError has been removed and instead mae is an alias for LPLoss(p=1).
- Measures that previously skipped NaN values will now (at least by default) propagate those values. Missing value behavior is unchanged, except some measures that previously did not support missing now do.
- Aliases for measure types have been removed. For example, RMSE (alias for RootMeanSquaredError) is gone. Aliases for instances, such as rms and cross_entropy, persist. The exception is precision, for which ppv can be used in its place. (This is to avoid conflict with Base.precision, which was previously pirated.)
- info(measure) has been decommissioned; query docstrings or access the new measure traits individually instead. These traits are now provided by StatisticalMeasures.jl and are not exported. For example, to access the orientation of the measure rms, do import StatisticalMeasures as SM; SM.orientation(rms).
- Behavior of the measures() method, to list all measures and associated traits, has changed. It now returns a dictionary instead of a vector of named tuples; measures(predicate) is decommissioned, but measures(needle) is preserved. (This method, owned by StatisticalMeasures.jl, has some other search options, but is experimental.)
- Measures that were wraps of losses from LossFunctions.jl are no longer exposed by MLJBase or MLJ. To use such a loss, you must explicitly import LossFunctions and wrap the loss appropriately. See Using losses from LossFunctions.jl for examples.
- Some user-defined measures working in previous versions of MLJBase.jl may not work without modification, as they must conform to the new StatisticalMeasuresBase.jl API. See this tutorial on how to define new measures.
- Measures with a "feature argument" X, as in some_measure(ŷ, y, X), are no longer supported. See What is a measure? for allowed signatures in measures.
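Two of the points above (aggregated return values and the new aliases) can be sketched on a small regression example. The data is made up for illustration, and `SM` is just a local alias for the qualified calls described in the list.

```julia
using StatisticalMeasures
import StatisticalMeasures as SM

y = [1.0, 2.0, 3.0, 4.0]   # ground truth
ŷ = [1.5, 2.0, 2.0, 6.0]   # predictions

# Calling a measure now always returns a single aggregated value
l2(ŷ, y)

# Per-observation values must be requested explicitly
per_obs = SM.measurements(l2, ŷ, y)

# mae is now an alias for an LPLoss instance with p=1
mae(ŷ, y)
LPLoss(p=1)(ŷ, y)
```

Here the squared errors are 0.25, 0, 1 and 4, so the aggregated l2 value is their mean, and rms is the square root of that mean.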
The migration of measures is not expected to require any changes to the source code in packages providing implementations of the MLJ model interface (MLJModelInterface.jl), such as MLJDecisionTreeInterface.jl and MLJFlux.jl, and this is confirmed by extensive integration tests. However, some current tests will fail if they use MLJBase measures. The following should generally suffice to adapt such tests:
- Add StatisticalMeasures as a test dependency, and add using StatisticalMeasures to your runtests.jl (and/or included submodules).
- If measures are qualified, as in MLJBase.rms, then the qualification must be removed or changed to StatisticalMeasures.rms, etc.
- Be aware that the default measure used in methods such as evaluate!, when measure is not specified, is changed from rms to l2 for regression models.
- Be aware that all measures now report a single aggregated measurement, and never one measurement per observation. See the second point above.
- The abstract measure types Aggregated, Unaggregated, Measure have been decommissioned. (A measure is now defined purely by its calling behavior.)
- What were previously exported as measure types are now only constructors.
- target_scitype(measure) is decommissioned. Related is StatisticalMeasures.observation_scitype(measure), which declares an upper bound on the allowed scitype of a single observation.
- prediction_type(measure) is decommissioned. Instead use StatisticalMeasures.kind_of_proxy(measure).
- The trait reports_each_observation is decommissioned. Related is StatisticalMeasures.can_report_unaggregated; if false, the new measurements method simply returns n copies of the aggregated measurement, where n is the number of observations provided, instead of individual observation-dependent measurements.
- aggregation(measure) has been decommissioned. Instead use StatisticalMeasures.external_mode_of_aggregation(measure).
- instances(measure) has been decommissioned; query docstrings for measure aliases, or follow this example: aliases = measures()[RootMeanSquaredError].aliases.
- is_feature_dependent(measure) has been decommissioned. Measures consuming feature data are no longer supported; see above.
- distribution_type(measure) has been decommissioned.
- docstring(measure) has been decommissioned.
- Behavior of aggregate has changed.
- The following traits, previously exported by MLJBase and MLJ, cannot be applied to measures: supports_weights, supports_class_weights, orientation, human_name. Instead use the traits with these names provided by StatisticalMeasures.jl (they will need to be qualified, as in import StatisticalMeasures; StatisticalMeasures.orientation(measure)).
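As a rough illustration of the replacement traits named in this list, the sketch below queries several of them on built-in measures. The local alias `SM` and the choice of `rms` and `log_loss` as subjects are illustrative; return values are deliberately not shown, since they are best checked against the StatisticalMeasures.jl documentation.

```julia
using StatisticalMeasures
import StatisticalMeasures as SM

# Traits are not exported, so they are qualified with the package name
SM.orientation(rms)               # replaces the old exported orientation
SM.kind_of_proxy(log_loss)        # replaces prediction_type
SM.observation_scitype(rms)       # related to the old target_scitype
SM.can_report_unaggregated(rms)   # related to reports_each_observation
SM.supports_weights(rms)
SM.human_name(rms)

# Aliases, formerly obtained with instances(measure)
aliases = measures()[RootMeanSquaredError].aliases
```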