Changelog¶
To update your installation of AutoNLU, please make sure that you have set the environment variable PIP_PULL to your Gemfury token:
export PIP_PULL="<your Gemfury token>"
# Example (not a valid token):
# export PIP_PULL=jGxWW-4qKlz8ARZOhJgG9BIVuxsU9231
Then execute the following command:
pip install autonlu --upgrade --extra-index-url=https://${PIP_PULL}:@pypi.fury.io/deepopinion
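When the upgrade is scripted, it can help to fail early if the token is missing instead of letting pip hit an authentication error. The following is a minimal sketch and not part of AutoNLU: the helper name `build_upgrade_command` is hypothetical, and the index URL layout is copied from the command above.

```python
import sys
from typing import List


def build_upgrade_command(token: str) -> List[str]:
    """Return the pip upgrade command from above with the token filled in.

    Hypothetical helper for illustration; it only assembles the arguments
    and does not run pip itself.
    """
    token = token.strip()
    if not token:
        raise ValueError("PIP_PULL is empty; set it to your Gemfury token first")
    index_url = f"https://{token}:@pypi.fury.io/deepopinion"
    return [
        sys.executable, "-m", "pip", "install", "autonlu",
        "--upgrade", f"--extra-index-url={index_url}",
    ]
```

Passing `os.environ.get("PIP_PULL", "")` as the token and handing the returned list to `subprocess.run(..., check=True)` would then perform the same upgrade as the shell command.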
1.6.0 (2022-07-07)¶
Python 3.9 is now officially supported
Topic modelling has been added with the class TopicModel. Have a look at the corresponding tutorial to see how to use this feature.
Label tasks have been changed to not use early stopping by default and hyperparameter settings have been optimized for this
Some smaller bugfixes
1.5.0 (2022-03-01)¶
Data cleaning: The new DataCleaner class selects which samples of a dataset a human should check, in order to detect label errors as well as samples with generally bad data quality (e.g. nonsensical sentences). Have a look at the corresponding tutorial to see how to use it in practice.
Automatic hyperparameter tuning with the class AutoMl has become more flexible. Have a look at the corresponding tutorial to see the new interface in action.
1.4.0 (2021-12-09)¶
Active learning, via autonlu.Model.select_to_label(), is now also correctly supported for OMI models
1.3.0 (2021-11-30)¶
AutoNLU now supports two completely new tasks with “word labeling” (can be used for e.g. named entity recognition, information extraction, …) and question answering. Have a look at the two new tutorials to see how it is used.
The F1 binary metric was added to autonlu.Model.evaluate() if applicable.
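For readers unfamiliar with the binary F1 metric, here is a minimal sketch of how it is commonly computed: the harmonic mean of precision and recall for a single positive class. This is not AutoNLU's implementation. The function name `binary_f1` and the `positive` parameter are illustrative assumptions, and the changelog does not state how autonlu.Model.evaluate() chooses the positive class or decides when the metric is applicable.

```python
from typing import Sequence


def binary_f1(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """F1 score for one positive class; every other label counts as negative."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    if tp == 0:
        # Precision or recall is zero (or undefined), so F1 is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```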
1.2.0 (2021-11-16)¶
If only one epoch is run (e.g. during prediction), no separate progress bar for the epochs is shown.
The default mindatasetsize for label tasks has been changed from 0 to 4000 for better performance on small datasets. This will increase training time for small datasets. Training time of bigger datasets is unaffected.
Fixes a bug where in-memory studies using autonlu.AutoMl started the hyperparameter optimization from scratch when additional experiments were performed
Removes many warnings from internal packages
The all_labels argument from Model.distill() was removed since it was not actually used
1.1.0 (2021-09-28)¶
A TypeError exception is now thrown if labels or classes provided for training are not strings
Entropies are now returned for autonlu.Model.predict() if return_extras = True
The argument return_probabilities of autonlu.Model.predict() has been renamed to return_extras in preparation for also returning additional information
Fixed a memory leak when using autonlu.AutoMl
The argument return_logits has been removed from autonlu.SimpleModel.predict(). The logits are now always in the returned dictionary
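As background for the entropies mentioned in this release: the entropy of a predicted probability distribution is a common uncertainty measure, close to zero for a confident prediction and largest when all classes are equally likely. The sketch below computes the Shannon entropy in nats. It is an illustration only; the changelog does not state which logarithm base or normalization AutoNLU uses, and the function name `shannon_entropy` is an assumption.

```python
import math
from typing import Sequence


def shannon_entropy(probabilities: Sequence[float]) -> float:
    """Entropy in nats of a discrete distribution.

    Zero-probability entries contribute nothing to the sum.
    """
    total = sum(probabilities)
    if any(p < 0 for p in probabilities) or not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError("probabilities must be non-negative and sum to 1")
    return -sum(p * math.log(p) for p in probabilities if p > 0)
```

With two classes, a prediction of `[0.5, 0.5]` gives the maximum value `math.log(2)`, while `[1.0, 0.0]` gives zero.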
1.0.1 (2021-09-13)¶
PyTorch 1.9.0 is now supported.
1.0.0 (2021-08-27)¶
Adds autonlu.AutoMl to automatically search for hyperparameters and to automatically select the best model for a given task.
Adds a new form of training for OMI models and with that some new optional parameters, which can be set when calling autonlu.Model.train().
0.16.0 (2021-06-18)¶
Adds model distillation via autonlu.Model.distill(). Also see the tutorial “Reduce inference time by distilling your model”
Fixes a bug where the reported learning rate in the TensorBoard logs was not always correct when early stopping was turned off
0.15.2 (2021-05-28)¶
Internal bugfixes
0.15.1 (2021-05-07)¶
TensorBoard logs for language model finetuning are now written to the same directory (by default tensorboard_logs) as all the other logs
Some smaller bugfixes
0.15.0 (2021-05-03)¶
Added support for pruning models (reducing the size of models with a minimal impact on model performance) with the functions autonlu.SimpleModel.prune(), autonlu.SimpleModel.auto_prune(), autonlu.Model.prune(), and autonlu.Model.auto_prune(). There is also a new tutorial demonstrating this feature: “Increase the speed and reduce the memory consumption by pruning layers of models.”
Added the option to use dynamic quantization for autonlu.SimpleModel.predict() and autonlu.Model.predict() with the parameter dynamic_quantization to speed up prediction times when using the CPU.
0.14.0 (2021-04-23)¶
Added support for macOS
Fixes a bug where models temporarily saved by the BestModelKeeper were not always deleted when the program crashed or was interrupted during training
Fixes CUDA out of memory errors which could occur when training bigger (e.g. RoBERTa large) models
0.13.1 (2021-04-19)¶
Saving a model in autonlu.SimpleModel and autonlu.Model is more robust now. Non-existing parent directories will be automatically generated and an autonlu.core.ModelSaveException is thrown if saving fails.
Fixed a bug where autonlu.DocumentModel.evaluate() would not work for certain tasks
0.13.0 (2021-04-16)¶
Added a new argument use_samplehash for the constructor of autonlu.SimpleModel and autonlu.Model, which can be used to deactivate the sample hash to reduce memory consumption and processing time when a lot of training data is being used
Fixed a problem where some Studio interactions failed silently when the given product key was not correct (e.g. for autonlu.list_models()). Now, a LoginException is thrown.
Fixed a bug that could crash training when writing metrics in an unexpected format to TensorBoard
0.12.0 (2021-04-15)¶
autonlu.DocumentModel.evaluate() now supports special metrics for all three tasks (class, label, and classlabel)
0.11.1 (2021-04-01)¶
Fixes a bug in autonlu.split_dataset() where xVal and yVal both contained y values of the Y dataset.
0.11.0 (2021-04-01)¶
Adds autonlu.data_dependence.DataDependence to visualize the effect of the training set size on the model accuracy
Adds translation capabilities to AutoNLU in the form of the autonlu.Translator class
Fixes a bug where the all_labels argument of the constructors of autonlu.Model and autonlu.SimpleModel was not used
0.10.0 (2021-03-22)¶
A bug in the metric calculation of autonlu.DocumentModel was fixed
A bug in autonlu.DocumentModel was fixed that led to a crash in certain situations
0.9.0 (2021-03-18)¶
Additional arguments to influence the training (especially early stopping) have been added to autonlu.SimpleModel.train() and autonlu.Model.train().
Better default settings for class and label tasks have been introduced which should provide a good balance between model accuracy and training time.
0.8.0 (2021-03-17)¶
Makes finetuning of language models with autonlu.SimpleModel.finetune() and autonlu.Model.finetune() more flexible by adding additional losses
0.7.1 (2021-03-12)¶
Added Windows support
0.7.0 (2021-03-12)¶
Fixes a bug with tokenization in finetuning
The training loss is now also logged
0.6.1 (2021-02-25)¶
The standard logging mode was changed from DEBUG to WARN
The default learning rate for label tasks was increased to 2e-4. This should improve training time as well as final accuracy in most cases.
The default mindatasetsize for label tasks was decreased to 0. This should improve training time.
0.6.0 (2021-02-22)¶
A new function autonlu.get_model_by_id() was added that allows the download of models using the model ID. This method even works for models that are not returned by autonlu.list_models() (e.g. custom trained models inside projects)
The function autonlu.list_models() now returns a list of tuples containing (model name, display name, model id). The argument get_ids was removed.
autonlu.SimpleModel and autonlu.Model have an argument baseurl now (i.e. a different address of a Studio instance). This is mainly interesting for on-premise Studio instances.
Added functionality to upload models to Studio with autonlu.SimpleModel.upload(), autonlu.Model.upload(), and autonlu.studio.upload_model()
0.5.0 (2021-02-15)¶
The functions autonlu.Model.evaluate() and autonlu.SimpleModel.evaluate() were added, which offer a way to easily determine some evaluation metrics of models (accuracy, F1 score, precision, and recall)
A bug in the function autonlu.list_models() has been fixed which led to a crash of the function call if specific models were present on Studio
0.4.0 (2021-02-11)¶
autonlu.DocumentModel now also supports class and label tasks
The function autonlu.get_product_key() has been added to retrieve the current valid product key from Studio using a user's login information
The function autonlu.login() has been added as a convenient way to use AutoNLU by simply providing the Studio login information (user name and password)
autonlu.Model and autonlu.DocumentModel now correctly support per segment class lists (i.e. for class and classlabel tasks it is possible to specify different all_classes for individual samples, which is useful for e.g. active learning)
Removed the need to set the DO_PUBLIC_KEY manually. This is now only needed for on-premise solutions with separate user management.