Changelog¶
To update your installation of AutoNLU, please make sure that you have set the environment variable `PIP_PULL` to your Gemfury token:

```shell
export PIP_PULL=<your Gemfury token>
# Example (not a valid token):
# export PIP_PULL=jGxWW-4qKlz8ARZOhJgG9BIVuxsU9231
```

Then execute the following command:

```shell
pip install autonlu --upgrade --extra-index-url=https://${PIP_PULL}:@pypi.fury.io/deepopinion
```
1.6.0 (2022-07-07)¶
- Python 3.9 is now officially supported.
- Topic modelling has been added with the class `TopicModel`. Have a look at the corresponding tutorial to see how to use this feature.
- Label tasks no longer use early stopping by default, and the hyperparameter settings have been optimized for this.
- Some smaller bugfixes.
1.5.0 (2022-03-01)¶
- Data cleaning: With the `DataCleaner` class, we added functionality to select which samples of a dataset should be checked by a human, for detecting label errors as well as samples with generally bad data quality (e.g. nonsensical sentences). Have a look at the corresponding tutorial to see how to use it in practice.
- Automatic hyperparameter tuning with the class `AutoMl` has become more flexible. Have a look at the corresponding tutorial to see the new interface in action.
1.4.0 (2021-12-09)¶
- Active learning, via `autonlu.Model.select_to_label()`, is now also correctly supported for OMI models.
1.3.0 (2021-11-30)¶
- AutoNLU now supports two completely new tasks: “word labeling” (which can be used for e.g. named entity recognition, information extraction, …) and question answering. Have a look at the two new tutorials to see how they are used.
- The F1 binary metric was added to `autonlu.Model.evaluate()` where applicable.
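The “F1 binary” metric scores only the positive class of a two-class task. As an illustration of the standard definition (a self-contained sketch, not AutoNLU's actual implementation; `f1_binary` is a hypothetical helper name):

```python
def f1_binary(y_true, y_pred, positive_label=1):
    """F1 score for the positive class of a binary task.
    F1 is the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive_label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive_label and p != positive_label)
    if tp == 0:
        return 0.0  # no true positives: precision/recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, with `y_true = [1, 1, 0, 1, 0]` and `y_pred = [1, 0, 0, 1, 1]` there are 2 true positives, 1 false positive, and 1 false negative, giving precision = recall = 2/3 and F1 = 2/3.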
1.2.0 (2021-11-16)¶
- If only one epoch will be run (e.g. during prediction), no separate progress bar for the epochs is shown.
- The default `mindatasetsize` for label tasks has been changed from `0` to `4000` for better performance on small datasets. This will increase training time for small datasets; training time of bigger datasets is unaffected.
- Fixes a bug where in-memory studies using `autonlu.AutoMl` started the hyperparameter optimization from scratch when additional experiments were performed.
- Removes many warnings from internal packages.
- The `all_labels` argument of `Model.distill()` was removed since it was not actually used.
1.1.0 (2021-09-28)¶
- A `TypeError` exception is now thrown if labels or classes provided for training are not strings.
- Entropies are now returned by `autonlu.Model.predict()` if `return_extras=True`.
- The argument `return_probabilities` of `autonlu.Model.predict()` has been renamed to `return_extras` in preparation for also returning additional information.
- Fixed a memory leak when using `autonlu.AutoMl`.
- The argument `return_logits` has been removed from `autonlu.SimpleModel.predict()`. The logits are now always included in the returned dictionary.
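The entropies mentioned above are a common measure of prediction uncertainty. AutoNLU's exact computation is not shown here; as a self-contained sketch of the usual Shannon-entropy definition (`prediction_entropy` is an illustrative helper, not part of the AutoNLU API):

```python
import math

def prediction_entropy(probabilities):
    """Shannon entropy (in nats) of a class-probability distribution.
    Higher values indicate a less certain prediction."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)
```

A confident prediction such as `[0.99, 0.01]` has entropy close to 0, while a maximally uncertain `[0.5, 0.5]` reaches the two-class maximum of ln 2 ≈ 0.693.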
1.0.1 (2021-09-13)¶
- PyTorch 1.9.0 is now supported.
1.0.0 (2021-08-27)¶
- Adds `autonlu.AutoMl` to automatically search for hyperparameters and to automatically select the best model for a given task.
- Adds a new form of training for OMI models and, with that, some new optional parameters which can be set when calling `autonlu.Model.train()`.
0.16.0 (2021-06-18)¶
- Adds model distillation via `autonlu.Model.distill()`. Also see the tutorial “Reduce inference time by distilling your model”.
- Fixes a bug where the reported learning rate in the TensorBoard logs was not always correct when early stopping was turned off.
0.15.2 (2021-05-28)¶
- Internal bugfixes.
0.15.1 (2021-05-07)¶
- TensorBoard logs for language model finetuning are now written to the same directory (by default `tensorboard_logs`) as all the other logs.
- Some smaller bugfixes.
0.15.0 (2021-05-03)¶
- Added support for pruning models (reducing the size of models with a minimal impact on model performance) with the functions `autonlu.SimpleModel.prune()`, `autonlu.SimpleModel.auto_prune()`, `autonlu.Model.prune()`, and `autonlu.Model.auto_prune()`. There is also a new tutorial demonstrating this feature: “Increase the speed and reduce the memory consumption by pruning layers of models”.
- Added the option to use dynamic quantization for `autonlu.SimpleModel.predict()` and `autonlu.Model.predict()` with the parameter `dynamic_quantization`, to speed up prediction when using the CPU.
0.14.0 (2021-04-23)¶
- Added support for macOS.
- Fixes a bug where models temporarily saved by the `BestModelKeeper` were not always deleted when the program crashed or was interrupted during training.
- Fixes CUDA out-of-memory errors which can occur when training bigger models (e.g. RoBERTa large).
0.13.1 (2021-04-19)¶
- Saving a model in `autonlu.SimpleModel` and `autonlu.Model` is more robust now. Non-existing parent directories are automatically generated, and an `autonlu.core.ModelSaveException` is thrown if saving fails.
- Fixed a bug where `autonlu.DocumentModel.evaluate()` would not work for certain tasks.
0.13.0 (2021-04-16)¶
- Added a new argument `use_samplehash` to the constructors of `autonlu.SimpleModel` and `autonlu.Model`, which can be used to deactivate the sample hash to reduce memory consumption and processing time when a lot of training data is used.
- Fixed a problem where some Studio interactions failed silently when the given product key was not correct (e.g. for `autonlu.list_models()`). Now, a `LoginException` is thrown.
- Fixed a bug that could crash training when metrics were written to TensorBoard in an unexpected format.
0.12.0 (2021-04-15)¶
- `autonlu.DocumentModel.evaluate()` now supports special metrics for all three tasks (class, label, and classlabel).
0.11.1 (2021-04-01)¶
- Fixes a bug in `autonlu.split_dataset()` where both `xVal` and `yVal` contained the y values of the dataset.
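The corrected behavior can be illustrated with a minimal, self-contained split helper. This is only a sketch of the general technique; `split_xy` is a hypothetical stand-in and not the actual signature of `autonlu.split_dataset()`:

```python
import random

def split_xy(x, y, val_fraction=0.2, seed=42):
    """Split aligned x/y data into train and validation parts.
    Crucially, the validation samples come from x and the validation
    targets from y, so inputs and targets stay correctly paired."""
    assert len(x) == len(y), "x and y must have the same length"
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)  # deterministic shuffle
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    x_train = [x[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    x_val = [x[i] for i in val_idx]
    y_val = [y[i] for i in val_idx]
    return x_train, y_train, x_val, y_val
```

Indexing both lists with the same shuffled index list is what keeps each validation input paired with its own target, which is exactly the invariant the fixed bug had violated.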
0.11.0 (2021-04-01)¶
- Adds `autonlu.data_dependence.DataDependence` to visualize the effect of the training set size on the model accuracy.
- Adds translation capabilities to AutoNLU in the form of the `autonlu.Translator` class.
- Fixes a bug where the `all_labels` argument of the constructors of `autonlu.Model` and `autonlu.SimpleModel` was not used.
0.10.0 (2021-03-22)¶
- A bug in the metric calculation of `autonlu.DocumentModel` was fixed.
- A bug in `autonlu.DocumentModel` was fixed that led to a crash in certain situations.
0.9.0 (2021-03-18)¶
- Additional arguments to influence the training (especially early stopping) have been added to `autonlu.SimpleModel.train()` and `autonlu.Model.train()`.
- Better default settings for class and label tasks have been introduced, which should provide a good balance between model accuracy and training time.
0.8.0 (2021-03-17)¶
- Makes finetuning of language models with `autonlu.SimpleModel.finetune()` and `autonlu.Model.finetune()` more flexible by adding additional losses.
0.7.1 (2021-03-12)¶
- Added Windows support.
0.7.0 (2021-03-12)¶
- Fixes a bug with tokenization in finetuning.
- The training loss is now also logged.
0.6.1 (2021-02-25)¶
- The standard logging mode was changed from DEBUG to WARN.
- The default learning rate for label tasks was increased to 2e-4. This should improve training time as well as final accuracy in most cases.
- The default `mindatasetsize` for label tasks was decreased to 0. This should improve training time.
0.6.0 (2021-02-22)¶
- A new function `autonlu.get_model_by_id()` was added that allows downloading models by their model ID. This method even works for models that are not returned by `autonlu.list_models()` (e.g. custom trained models inside projects).
- The function `autonlu.list_models()` now returns a list of tuples containing `(model name, display name, model id)`. The argument `get_ids` was removed.
- `autonlu.SimpleModel` and `autonlu.Model` now have a `baseurl` argument (i.e. a different address of a Studio instance can be specified). This is mainly interesting for on-premise Studio instances.
- Added functionality to upload models to Studio with `autonlu.SimpleModel.upload()`, `autonlu.Model.upload()`, and `autonlu.studio.upload_model()`.
0.5.0 (2021-02-15)¶
- The functions `autonlu.Model.evaluate()` and `autonlu.SimpleModel.evaluate()` were added, which offer a way to easily determine some evaluation metrics of models (accuracy, F1 score, precision, and recall).
- A bug in the function `autonlu.list_models()` has been fixed which led to a crash of the function call if specific models were present on Studio.
0.4.0 (2021-02-11)¶
- `autonlu.DocumentModel` now also supports class and label tasks.
- The function `autonlu.get_product_key()` has been added to be able to retrieve the current valid product key from Studio using a user's login information.
- The function `autonlu.login()` has been added as a convenient way to use AutoNLU by simply providing the Studio login information (user name and password).
- `autonlu.Model` and `autonlu.DocumentModel` now correctly support per-segment class lists (i.e. for class and classlabel tasks it is possible to specify different `all_classes` for individual samples, which is useful for e.g. active learning).
- Removed the need to set `DO_PUBLIC_KEY` manually. This is now only needed for on-premise solutions with separate user management.