AutoMl

AutoMl implements a hyperparameter search algorithm to automatically find the best model (BERT, ALBERT, RoBERTa, etc.) and training parameters (learning rate, batch size, etc.) without the need for any manual configuration. This drastically reduces the work users must do to find a good hyperparameter setup for training high-quality models.

You can find a complete code example of how to use the AutoMl class in the tutorials.

class autonlu.AutoMl(study_name, study_db=None, path='', models_to_try=None, lr_range=None, model_arguments={}, train_arguments={}, seed=42, *, hyperparameters=None)

This class automatically searches for the best hyperparameters to train a model.

To automatically train a model, you first have to define the dataset through the load_dataset() function. Afterwards, you can create a model using automatic HPO by simply calling the create() function. A search for hyperparameters is called a “study” and a single test (i.e. a specific hyperparameter setup) is called a “trial”.

Parameters
  • study_name (str) – Each hyperparameter search runs within a so-called study. If the study is stored in a db (optional), this name identifies it so it can be continued after being canceled. It also allows storing multiple different studies in a single db by simply specifying different study names.

  • study_db (Optional[str]) – If a study db is given, all trials are stored in this db. This allows you to (1) continue a study that was canceled and (2) use multiple nodes in parallel. If no db is given, everything is stored in-memory and dropped after the study is finished. Multi-node execution and continuation after cancellation are not supported in this case.

  • path (str) – If a path is given, the study_db and the trained model will be stored there. Otherwise, the relative execution path will be used.

  • model_arguments (Dict) – A dictionary of keyword arguments to be passed into the Model constructor for each new model constructed during the hyperparameter search. This can for example be used to set the standard_label for class label tasks.

  • train_arguments (Dict) – A dictionary of keyword arguments to be passed into the Model.train() function for each training during the hyperparameter search. This can for example be used to set do_early_stopping etc. for class label tasks.

  • hyperparameters (Optional[List[HyperparameterChoice]]) – A list of instances of Categorical, FloatRange, and IntRange objects, specifying which hyperparameters should be optimized. Hyperparameters will automatically be routed to either the Model constructor or the Model.train() function (depending on the name).

  • seed (int) – Random seed that is set in the sampler so it behaves in a deterministic way.

  • models_to_try (Optional[List[str]]) – A list of base models that should be tried during hyperparameter optimization. If None, "bert-base-uncased", "roberta-base", and "albert-base-v2" will be used. This is part of an old interface; please use the hyperparameters argument instead for more flexibility.

  • lr_range (Optional[Tuple[float, float]]) – A tuple specifying a range of learning rates to try during hyperparameter optimization. If None, (1e-7, 1e-3) will be used. This is part of an old interface; please use the hyperparameters argument instead for more flexibility.

Example

The example shows how to configure an AutoMl run, load a dataset to optimize over, and run the hyperparameter search.

>>> automl = AutoMl(study_name="the_study", study_db="the_study_db_dir",
...                 hyperparameters=[
...                     Categorical("model_folder", choices=["roberta-base", "bert-base-uncased", "albert-base-v2"]),
...                     FloatRange("learning_rate", low=1e-6, high=1e-3, log=True),
...                     Categorical("decay_func_name", ["linear", "exp", "exp_sqr"]),
...                     IntRange("nb_opti_steps", low=len(X)//32, high=len(X)//32*2, log=True),
...                     FloatRange("total_lr_decay", low=1e-5, high=1, log=True),
...                 ],
...                 model_arguments={"standard_label": "NONE"},
...                 train_arguments={"do_early_stopping": False})
>>> automl.load_dataset(X, Y)
>>> model = automl.create(timeout=60*10)

load_dataset(X, Y, valX=None, valY=None, valsplit=0.1)

Load a dataset to prepare the hyperparameter search.

Parameters
  • X (List[Union[str, Tuple[str, str]]]) – Input samples. Either a list of strings for text classification or a list of pairs of strings for text pair classification.

  • Y (List[str]) – Training target. List containing the correct labels as strings.

  • valX (Optional[List[Union[str, Tuple[str, str]]]]) – Input samples used for validation of the model during training, e.g. for stopping training early when there is no more progress, or for reporting the current score via the score_callback. Same format as X. If None, a part of X will be split off.

  • valY (Optional[List[str]]) – Training target used for validation of the model during training, e.g. for stopping training early when there is no more progress, or for reporting the current score via the score_callback. Same format as Y. If None, a part of Y will be split off.

  • valsplit (float) – If valX or valY is not given, specifies how much of the training data should be split off for validation. Default is 10%.
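
For illustration, here is a minimal sketch of the two accepted input formats; the example texts and labels below are made-up placeholders, not part of the library:

>>> # Text classification: X is a list of strings
>>> X = ["Great battery life", "The screen cracked after a week"]
>>> Y = ["positive", "negative"]
>>> automl.load_dataset(X, Y)
>>> # Text pair classification: X is a list of pairs of strings;
>>> # validation data can also be passed in explicitly
>>> automl.load_dataset([("A phone", "A mobile device")], ["similar"],
...                     valX=[("A phone", "A tree")], valY=["dissimilar"])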

create(n_trials=None, timeout=None, retrain=False, verbose=False, epochs=2000)

Starts the hyperparameter optimization and returns the best model that could be found.

The best model is also stored in the path that was provided to the constructor (the relative execution path is used by default) and can be loaded like any other model. create() can be called multiple times if the search for better hyperparameters should be continued.

Parameters
  • n_trials (Optional[int]) – The number of trials that are executed to find the best hyperparameters. Note: timeout is recommended over n_trials for finding good solutions. If this argument is set to None, the timeout parameter will limit the study. If timeout is also None, the study continues to create trials until it receives a termination signal. If both n_trials and timeout are set, whichever is triggered first limits the study.

  • timeout (Optional[int]) – Stop the study after the given number of seconds. Note: timeout is recommended over n_trials for finding good solutions. If this argument is set to None, the n_trials parameter will limit the study. If n_trials is also None, the study continues to create trials until it receives a termination signal. If both n_trials and timeout are set, whichever is triggered first limits the study.

  • retrain (bool) – Specifies whether the model should be trained from scratch again using the best hyperparameters found, or whether the model from the HPO search should be used directly.

  • verbose (bool) – Show additional log messages containing all selected hyperparameters.

  • epochs (int) – Defines an upper bound on the number of epochs to train for each trial.

Returns: A Model trained using the best hyperparameters found through automatic HPO.
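
For example, a study can be bounded by a trial count and a wall-clock budget at the same time; the concrete numbers below are arbitrary:

>>> # Stop after 50 trials or 2 hours, whichever is reached first,
>>> # then retrain from scratch with the best hyperparameters found
>>> model = automl.create(n_trials=50, timeout=60*60*2, retrain=True)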

class autonlu.automl.Categorical(name, choices)

Class for specifying that a hyperparameter should be selected from a list of options.

Parameters
  • name (str) – Name of the hyperparameter to sample a value for

  • choices (Sequence[Union[None, bool, int, float, str]]) – A list of hyperparameter candidates

Examples:

>>> Categorical("model_folder", ["bert-base-uncased", "roberta-base", "albert-base-v2"])

class autonlu.automl.IntRange(name, *, low, high, step=1, log=False)

Class for specifying that a hyperparameter should be selected from a range of integer numbers.

Parameters
  • name (str) – Name of the hyperparameter to sample a value for

  • low (int) – Lower endpoint of the range of suggested values. low is included in the range.

  • high (int) – Upper endpoint of the range of suggested values. high is included in the range.

  • step (int) – A step of discretization. Note that high is modified if the range is not divisible by step. The sequence from which a value will be selected is low, low+step, low+2*step, …, low+k*step ≤ high. step != 1 and log == True cannot be used at the same time.

  • log (bool) – If log is true, the value is sampled from the range in the log domain; otherwise, it is sampled in the linear domain. step and log cannot be used at the same time.

Examples:

>>> IntRange("nb_opti_steps", low=1000, high=10000, step=100)

class autonlu.automl.FloatRange(name, *, low, high, step=None, log=False)

Class for specifying that a hyperparameter should be selected from a range of floating point numbers.

Parameters
  • name (str) – Name of the hyperparameter to sample a value for

  • low (float) – Lower endpoint of the range of suggested values. low is included in the range.

  • high (float) – Upper endpoint of the range of suggested values. high is excluded from the range, unless step is also used, in which case both low and high are included.

  • step (Optional[float]) – A step of discretization. The sequence from which a value will be selected is low, low+step, low+2*step, …, low+k*step ≤ high. step and log cannot be used at the same time.

  • log (bool) – If log is true, the value is sampled from the range in the log domain; otherwise, it is sampled in the linear domain. step and log cannot be used at the same time.

Examples:

>>> FloatRange("learning_rate", low=1e-6, high=1e-2, log=True)