Parameter transformations inside ParamSet #215
Comments
Hmm. I wanted to find the discussion we already had with @berndbischl about this topic. Maybe someone can link it if he finds it. We also came to the conclusion that we want to allow to add a parameter like |
Maybe you mean the discussion in mlr-org/mlr3pipelines#24 and Bernd's comment in particular?
If we have to attach the parameter transformation with a certain I would say the |
Sometimes it would be useful to specify how parameters are changed inside a `Learner`/`PipeOp`, e.g. as in mlr-org/mlr3pipelines#24. A typical example is the `mtry` parameter of a random forest, which should range from 1 to `task$ncol`. It would be nice if one could introduce an `mtry.pexp` parameter ranging from 0 to 1, so that the actual `mtry` is set to `round(task$ncol ^ mtry.pexp)`.

The `$trafo` function, as it currently stands, is not a good fit for this, because it (1) operates before the `Learner` even sees the `Task`, so wouldn't know about `task$ncol`, and (2) would not be able to introduce a new parameter `mtry.pexp`, it would only be able to re-scale the present `mtry`, which is an integer between 1 and `Inf`, not a real number between 0 and 1.

I think the following UI would be quite nice:
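As a rough illustration of the `mtry.pexp` idea described above, here is a minimal sketch in plain R. It does not use the paradox API: `mtry_trafo` is a made-up name, the `function(x, env)` signature follows the form proposed in this issue, and `task` is just a list standing in for a real `Task`.

```r
# Hypothetical trafo: maps the user-facing `mtry.pexp` in [0, 1] to the
# integer `mtry` that the learner expects, using information from the task.
mtry_trafo <- function(x, env) {
  stopifnot(x$mtry.pexp >= 0, x$mtry.pexp <= 1)
  list(mtry = round(env$task$ncol ^ x$mtry.pexp))
}

# Stand-in for a Task: only the `ncol` field is needed here (assumption).
task <- list(ncol = 100)

lower  <- mtry_trafo(list(mtry.pexp = 0),   env = list(task = task))  # mtry = 1
middle <- mtry_trafo(list(mtry.pexp = 0.5), env = list(task = task))  # mtry = sqrt(ncol)
upper  <- mtry_trafo(list(mtry.pexp = 1),   env = list(task = task))  # mtry = ncol
```

The point of passing `env` is that the transformation can only be evaluated once the task is known, which is exactly what the current `$trafo` cannot do.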
This would change the `lrn$param_set` to "look and feel" like the `ps` constructed / modified before, but internally the `Learner` (or e.g. a `PipeOp`) would get the parameter values as produced by the `$trafo` function.

A way to implement this would be the following:
- `private$.learnerside = NULL` slot that points to the `ParamSet` that the `Learner`/`PipeOp` should see.
- `$has_interface` active binding.
- `self$learnerside(last = TRUE)` function that gives the `ParamSet` that the `Learner`/`PipeOp` should see. Because `private$.learnerside` could point to a `ParamSet` that itself has a `private$.learnerside` set, it should be recursive if `last` is `TRUE`, and only give the "next" `learnerside` if `last` is `FALSE`.
- `private$copy_param_set()` helper function. It copies all relevant items from its argument to the `ParamSet` itself, to turn the `self` into an effective copy of that argument.
- `$add_interface()` function.
- `$remove_interface()` function.
- How does the `Learner`/`PipeOp` get its value out of this? There probably should be a `$get_values()` function that gets the values for the operation, which should also have the filter functionality that `ids` currently has.
- Extend the `trafo` active binding to also accept functions of the form `function(x, env)`.
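The pieces listed above can be sketched with plain R environments. This is a simplified stand-in, not the paradox implementation: `new_param_set`, the free functions, and the field names are assumptions, and `add_interface()` here simply returns the outer set rather than swapping contents in place via a `copy_param_set()` helper as proposed.

```r
# Minimal stand-in for a ParamSet: holds ids, values, an optional trafo,
# and an optional pointer to the set the learner should see.
new_param_set <- function(ids) {
  ps <- new.env()
  ps$ids <- ids
  ps$values <- list()
  ps$trafo <- NULL        # function(x, env); only used on interface sets
  ps$learnerside <- NULL  # next set in the chain, or NULL
  ps
}

has_interface <- function(ps) !is.null(ps$learnerside)

# Follow the chain one step (last = FALSE) or to its end (last = TRUE).
learnerside <- function(ps, last = TRUE) {
  if (!has_interface(ps)) return(ps)
  if (last) learnerside(ps$learnerside, last = TRUE) else ps$learnerside
}

# Put `interface` in front of `inner`; `trafo` maps interface values to
# values of `inner`.
add_interface <- function(inner, interface, trafo) {
  interface$trafo <- trafo
  interface$learnerside <- inner
  interface
}

# Transform values down the chain, storing them in each set on the way,
# and return the values of the innermost set.
get_values <- function(ps, env = list()) {
  if (!has_interface(ps)) return(ps$values)
  inner <- ps$learnerside
  vals <- ps$trafo(ps$values, env)
  stopifnot(all(names(vals) %in% inner$ids))  # crude feasibility check
  inner$values <- vals
  get_values(inner, env)
}

# One interface in front of a set that only knows `mtry`.
rf <- new_param_set("mtry")
iface <- add_interface(
  rf, new_param_set("mtry.pexp"),
  function(x, env) list(mtry = round(env$task$ncol ^ x$mtry.pexp))
)
iface$values$mtry.pexp <- 0.5
result <- get_values(iface, env = list(task = list(ncol = 100)))
```

Because `get_values()` recurses through `learnerside`, a second interface can be placed in front of `iface` in the same way without the caller needing to know how many layers exist.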
This implementation has the advantage that multiple interfaces can be "stacked" on top of each other: A user who gets a `Learner` does not need to know or care if something put an interface in front of its `ParamSet`. When the user sets a parameter using `param_set$values$param = x`, the value gets checked against the constraints of the interface parameter set. When he calls `lrn$train()`, the `train()` function calls `get_values(tags = "train", learnerside = TRUE, env = list(task = task))`, which recurses through the different interfaces that were added, and sets `$values` in each one of them after transforming. This automatically checks that the trafo function returns a feasible value for the original `ParamSet`.

This change would also be completely transparent to everything `ParamSet` is doing so far.

Things that I am not sure about:
- The `env` parameter would depend on what kind of object the `ParamSet` belongs to: Some `PipeOp`s (e.g. `PipeOpModelAvg`) have parameters in a different context, where no `task` is present (and instead maybe a `prediction`). One would probably want to agree on an interface (always `task` in a `Learner` / preprocessing `PipeOp`, always `prediction` in a "post-processing" `PipeOp`, other contexts..?)
- `"train"`/`"predict"` tags from the outside, e.g. maybe a tuning algorithm wants to train a model with one set of `"train"` parameters and then evaluate these with different `"predict"` parameters to get multiple performance datapoints with only a single `train()` call for efficiency. In that case it would be nice if the `trafo` could also respect the `"train"`/`"predict"` tags and work when only a subset of parameter values is present. In that case, the `get_values` would need to be adapted to only give `self$values[intersect(names(self$values), set$ids(...tags = tags))]` to `self$trafo`.
- `ParamSetCollection`. Maybe a `GraphLearner` would want to have an interface as well? I wouldn't know what the UI for that would look like, however. In that case it would probably be easiest to intervene with the individual `PipeOp`s' `ParamSet`.
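The tag filtering mentioned in the second point could look roughly like the following plain-R sketch. The tag table, the parameter names `num.threads` and `predict.all`, and both helper functions are hypothetical; the subsetting mirrors the `intersect(names(values), ids(tags = tags))` expression above, and the trafo is written so that it still works when `mtry.pexp` is absent.

```r
# Hypothetical tag table: parameter id -> tags.
param_tags <- list(
  mtry.pexp   = "train",
  num.threads = c("train", "predict"),
  predict.all = "predict"
)

# Ids whose tags include any of the requested tags.
ids_with_tags <- function(param_tags, tags) {
  keep <- vapply(param_tags, function(t) any(t %in% tags), logical(1))
  names(param_tags)[keep]
}

# Only values that are both set and carry a requested tag reach the trafo.
values_for_tags <- function(values, param_tags, tags) {
  values[intersect(names(values), ids_with_tags(param_tags, tags))]
}

# A trafo that tolerates a subset of values: it only rewrites `mtry.pexp`
# when that value is actually present.
subset_trafo <- function(x, env) {
  if (!is.null(x[["mtry.pexp"]])) {
    x[["mtry"]] <- round(env$task$ncol ^ x[["mtry.pexp"]])
    x[["mtry.pexp"]] <- NULL
  }
  x
}

values <- list(mtry.pexp = 0.5, predict.all = TRUE)
env <- list(task = list(ncol = 100))

train_values   <- subset_trafo(values_for_tags(values, param_tags, "train"), env)
predict_values <- subset_trafo(values_for_tags(values, param_tags, "predict"), env)
```

With this split, a tuner could call the `"train"` path once and the `"predict"` path several times with different values, as long as each trafo only touches the parameters it is given.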