AutoMl¶
AutoMl implements a hyperparameter search algorithm to automatically find the best model (BERT, ALBERT, RoBERTa, etc.) and parameters (learning rate, batch size, etc.) without the need for any manual configuration. This drastically reduces the amount of work users have to do to find a good hyperparameter setup for training high-quality models.
You can find a complete code example of how to use the AutoMl class in the tutorials.
- class autonlu.AutoMl(study_name, study_db=None, path='', models_to_try=None, lr_range=None, model_arguments={}, train_arguments={}, seed=42, *, hyperparameters=None)¶
This class automatically searches for the best hyperparameters to train a model.
To automatically train a model, you first have to define the dataset through the load_dataset() function. Afterwards, you can create a model using automatic HPO by simply calling the create() function. A search for hyperparameters is called a "study" and a single test (i.e. a specific hyperparameter setup) is called a "trial".
- Parameters
  - study_name (str) – Each hyperparameter search runs within a so-called study. If stored in a db (optional), this name helps to identify a study and to continue it if it is cancelled. It also allows storing multiple different studies in a single db by simply specifying different study names.
  - study_db (Optional[str]) – If a study db is given, all trials are stored in this db. This allows you to (1) continue a study that was cancelled and (2) use multiple nodes in parallel. If no db is given, everything is stored in-memory and dropped after the study is finished. Multi-node execution or continuation after cancellation is not supported in this case.
  - path (str) – If a path is given, the study_db as well as the trained model will be stored there. Otherwise, the relative execution path will be used.
  - model_arguments (Dict) – A dictionary of keyword arguments to be passed to the Model constructor for each new model constructed during the hyperparameter search. This can, for example, be used to set the standard_label for class label tasks.
  - train_arguments (Dict) – A dictionary of keyword arguments to be passed to the Model.train() function for each training run during the hyperparameter search. This can, for example, be used to set do_early_stopping etc. for class label tasks.
  - hyperparameters (Optional[List[HyperparameterChoice]]) – A list of Categorical, FloatRange, and IntRange instances specifying which hyperparameters should be optimized. Hyperparameters are automatically routed to either the Model constructor or the Model.train() function (depending on their name).
  - seed (int) – Random seed that is set so the sampler behaves in a deterministic way.
  - models_to_try (Optional[List[str]]) – A list of base models that should be tried during hyperparameter optimization. If None, "bert-base-uncased", "roberta-base", and "albert-base-v2" will be used. This is part of an old interface; please use the hyperparameters argument instead for more flexibility.
  - lr_range (Optional[Tuple[float, float]]) – A tuple specifying a range of learning rates to try during hyperparameter optimization. If None, (1e-7, 1e-3) will be used. This is part of an old interface; please use the hyperparameters argument instead for more flexibility.
Example
The example shows how to configure an AutoMl run, load a dataset to optimize over, and run the hyperparameter search.
>>> automl = AutoMl(study_name="the_study", study_db="the_study_db_dir",
>>>     hyperparameters=[
>>>         Categorical("model_folder", choices=["roberta-base", "bert-base-uncased", "albert-base-v2"]),
>>>         FloatRange("learning_rate", low=1e-6, high=1e-3, log=True),
>>>         Categorical("decay_func_name", ["linear", "exp", "exp_sqr"]),
>>>         IntRange("nb_opti_steps", low=len(X)//32, high=len(X)//32*2, log=True),
>>>         FloatRange("total_lr_decay", low=1e-5, high=1, log=True)
>>>     ],
>>>     model_arguments={"standard_label": "NONE"},
>>>     train_arguments={"do_early_stopping": False}
>>> )
>>> automl.load_dataset(X, Y)
>>> model = automl.create(timeout=60*10)
- load_dataset(X, Y, valX=None, valY=None, valsplit=0.1)¶
Load a dataset to prepare the hyperparameter search.
- Parameters
  - X (List[Union[str, Tuple[str, str]]]) – Input samples. Either a list of strings for text classification or a list of pairs of strings for text pair classification.
  - Y (List[str]) – Training target. List containing the correct labels as strings.
  - valX (Optional[List[Union[str, Tuple[str, str]]]]) – Input samples used for validation of the model during training, e.g. for stopping training early if there is no progress anymore or for reporting the current score via the score_callback. Same format as X. If None, a part of X will be split off.
  - valY (Optional[List[str]]) – Training target used for validation of the model during training, e.g. for stopping training early if there is no progress anymore or for reporting the current score via the score_callback. Same format as Y. If None, a part of Y will be split off.
  - valsplit (float) – If valX or valY is not given, specifies how much of the training data should be split off for validation. Default is 10%.
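A minimal usage sketch with illustrative data (assuming an AutoMl instance automl as configured above; any lists in the formats described here work):
>>> X = ["The battery lasts all day", "The screen cracked after a week"]  # illustrative samples
>>> Y = ["positive", "negative"]                                          # illustrative labels
>>> automl.load_dataset(X, Y)                  # 10% of the data is split off for validation by default
>>> automl.load_dataset(X, Y, valsplit=0.2)    # split off 20% instead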
- create(n_trials=None, timeout=None, retrain=False, verbose=False, epochs=2000)¶
- Starts the hyperparameter optimization and returns the best model that could be found.
The best model is also stored in the path that was provided to the constructor (the relative execution path is used by default) and can be loaded like any other model. create() can be called multiple times if the search for better hyperparameters should be continued.
- Parameters
  - n_trials (Optional[int]) – The number of trials that are executed to find the best hyperparameters. Note: timeout is recommended over n_trials to find good solutions. If this argument is set to None, the timeout parameter will limit the study. If timeout is also set to None, the study continues to create trials until it receives a termination signal. If both n_trials and timeout are set, whichever is triggered first will limit the study.
  - timeout (Optional[int]) – Stop the study after the given number of seconds. Note: timeout is recommended over n_trials to find good solutions. If this argument is set to None, the n_trials parameter will limit the study. If n_trials is also set to None, the study continues to create trials until it receives a termination signal. If both n_trials and timeout are set, whichever is triggered first will limit the study.
  - retrain (bool) – Specifies whether the model should be trained from scratch again using the best hyperparameters found, or whether the model from the HPO search should be used directly.
  - verbose (bool) – Show additional log messages containing all selected hyperparameters.
  - epochs (int) – Defines an upper bound on the number of epochs to train for each trial.
- Returns: A Model trained using the best hyperparameters found through automatic HPO.
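A sketch of the different stopping criteria, continuing from the automl instance configured in the example above:
>>> model = automl.create(n_trials=50)                  # stop after 50 trials
>>> model = automl.create(timeout=60*60)                # stop after one hour
>>> model = automl.create(timeout=60*60, retrain=True)  # retrain from scratch with the best setup found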
- class autonlu.automl.Categorical(name, choices)¶
Class for specifying that a hyperparameter should be selected from a list of options.
- Parameters
  - name (str) – Name of the hyperparameter to sample a value for.
  - choices (Sequence[Union[None, bool, int, float, str]]) – A list of hyperparameter candidates.
Examples:
>>> Categorical("model_folder", ["bert-base-uncased", "roberta-base", "albert-base-v2"])
- class autonlu.automl.IntRange(name, *, low, high, step=1, log=False)¶
Class for specifying that a hyperparameter should be selected from a range of integer numbers.
- Parameters
  - name (str) – Name of the hyperparameter to sample a value for.
  - low (int) – Lower endpoint of the range of suggested values. low is included in the range.
  - high (int) – Upper endpoint of the range of suggested values. high is included in the range.
  - step (int) – A step of discretization. Note that high is modified if the range is not divisible by step. The sequence from which a value will be selected is low, low+step, low+2*step, …, low+k*step ≤ high. step != 1 and log == True cannot be used at the same time.
  - log (bool) – If log is true, the value is sampled from the range in the log domain. Otherwise, the value is sampled from the range in the linear domain. step and log cannot be used at the same time.
Examples:
>>> IntRange("nb_opti_steps", low=1000, high=10000, step=100)
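For ranges spanning several orders of magnitude, sampling in the log domain (leaving step at its default of 1) is often more suitable; a sketch with illustrative bounds:
>>> IntRange("nb_opti_steps", low=1000, high=100000, log=True)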
- class autonlu.automl.FloatRange(name, *, low, high, step=None, log=False)¶
Class for specifying that a hyperparameter should be selected from a range of floating point numbers.
- Parameters
  - name (str) – Name of the hyperparameter to sample a value for.
  - low (float) – Lower endpoint of the range of suggested values. low is included in the range.
  - high (float) – Upper endpoint of the range of suggested values. high is excluded from the range, except if step is also used, in which case both low and high are included.
  - step (Optional[float]) – A step of discretization. The sequence from which a value will be selected is low, low+step, low+2*step, …, low+k*step ≤ high. step and log cannot be used at the same time.
  - log (bool) – If log is true, the value is sampled from the range in the log domain. Otherwise, the value is sampled from the range in the linear domain. step and log cannot be used at the same time.
Examples:
>>> FloatRange("learning_rate", low=1e-6, high=1e-2, log=True)
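With step set, values are drawn from a discrete grid that includes both endpoints, as described above; an illustrative sketch using a hyperparameter from the earlier example:
>>> FloatRange("total_lr_decay", low=0.1, high=1.0, step=0.1)  # grid: 0.1, 0.2, ..., 1.0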