============
Changelog
============

To update your installation of AutoNLU, please make sure that you have set the
environment variable ``PIP_PULL`` to your Gemfury token:

.. code-block:: sh

    export PIP_PULL="<your Gemfury token>"
    # Example (not a valid token):
    # export PIP_PULL=jGxWW-4qKlz8ARZOhJgG9BIVuxsU9231

Then execute the following command:

.. code-block:: sh

    pip install autonlu --upgrade --extra-index-url=https://${PIP_PULL}:@pypi.fury.io/deepopinion

1.6.0 (2022-07-07)
-------------------

* Python 3.9 is now officially supported
* Topic modelling has been added with the class :class:`~autonlu.TopicModel`.
  Have a look at the corresponding tutorial to see how to use this feature.
* **Label tasks have been changed to not use early stopping by default and
  hyperparameter settings have been optimized for this**
* Some smaller bugfixes

1.5.0 (2022-03-01)
-------------------

* Data cleaning: The new :class:`~autonlu.DataCleaner` class selects which
  samples of a dataset should be checked by a human, both for label errors and
  for generally poor data quality (e.g. nonsensical sentences). Have a look at
  the corresponding tutorial to see how to use it in practice.
* Automatic hyperparameter tuning with the class :class:`~autonlu.AutoMl` has
  become more flexible. Have a look at the corresponding tutorial to see the
  new interface in action.

1.4.0 (2021-12-09)
-------------------

* Active learning, via :func:`autonlu.Model.select_to_label`, is now also
  correctly supported for OMI models

1.3.0 (2021-11-30)
-------------------

* AutoNLU now supports two completely new tasks: "word labeling" (usable for
  e.g. named entity recognition or information extraction) and question
  answering. Have a look at the two new tutorials to see how they are used.
* The F1 binary metric was added to :func:`autonlu.Model.evaluate` and is
  reported where applicable.

1.2.0 (2021-11-16)
-------------------

* If only one epoch is run (e.g.
  during prediction), no separate progress bar for the epochs will be shown.
* The default ``mindatasetsize`` for label tasks has been changed from ``0``
  to ``4000`` for better performance on small datasets. This will increase
  training time for small datasets. Training time of bigger datasets is
  unaffected.
* Fixes a bug where in-memory studies using :class:`autonlu.AutoMl` started
  the hyperparameter optimization from scratch when additional experiments
  were performed
* Removes many warnings from internal packages
* The ``all_labels`` argument of :func:`autonlu.Model.distill` was removed
  since it was not actually used

1.1.0 (2021-09-28)
-------------------

* A ``TypeError`` exception is now thrown if labels or classes provided for
  training are not strings
* Entropies are now returned by :func:`autonlu.Model.predict` if
  ``return_extras=True``
* The argument ``return_probabilities`` of :func:`autonlu.Model.predict` has
  been renamed to ``return_extras`` in preparation for also returning
  additional information
* Fixed a memory leak when using :class:`autonlu.AutoMl`
* The argument ``return_logits`` has been removed from
  :func:`autonlu.SimpleModel.predict`. The logits are now always in the
  returned dictionary

1.0.1 (2021-09-13)
-------------------

* PyTorch 1.9.0 is now supported.

1.0.0 (2021-08-27)
-------------------

* Adds :class:`autonlu.AutoMl` to automatically search for hyperparameters and
  to automatically select the best model for a given task.
* Adds a new form of training for OMI models and with that some new optional
  parameters, which can be set when calling :func:`autonlu.Model.train`.

0.16.0 (2021-06-18)
-------------------

* Adds model distillation via :func:`autonlu.Model.distill`.
  Also see the tutorial "Reduce inference time by distilling your model"
* Fixes a bug where the reported learning rate in the TensorBoard logs was not
  always correct when early stopping was turned off

0.15.2 (2021-05-28)
-------------------

* Internal bugfixes

0.15.1 (2021-05-07)
-------------------

* TensorBoard logs for language model finetuning are now written to the same
  directory (by default ``tensorboard_logs``) as all the other logs
* Some smaller bugfixes

0.15.0 (2021-05-03)
-------------------

* Added support for pruning models (reducing the size of models with a minimal
  impact on model performance) with the functions
  :func:`autonlu.SimpleModel.prune`, :func:`autonlu.SimpleModel.auto_prune`,
  :func:`autonlu.Model.prune`, and :func:`autonlu.Model.auto_prune`. There is
  also a new tutorial demonstrating this feature: *"Increase the speed and
  reduce the memory consumption by pruning layers of models."*
* Added the option to use dynamic quantization for
  :func:`autonlu.SimpleModel.predict` and :func:`autonlu.Model.predict` with
  the parameter ``dynamic_quantization`` to speed up prediction when using the
  CPU.

0.14.0 (2021-04-23)
-------------------

* Added support for macOS
* Fixes a bug where models temporarily saved by the BestModelKeeper were not
  always deleted when the program crashed or was interrupted during training
* Fixes CUDA out of memory errors which can occur when training bigger (e.g.
  RoBERTa large) models

0.13.1 (2021-04-19)
-------------------

* Saving a model in :class:`autonlu.SimpleModel` and :class:`autonlu.Model` is
  more robust now. Non-existing parent directories will be automatically
  generated and a :class:`autonlu.core.ModelSaveException` is thrown if saving
  fails.
* Fixed a bug where :func:`autonlu.DocumentModel.evaluate` would not work for
  certain tasks

0.13.0 (2021-04-16)
-------------------

* Added a new argument ``use_samplehash`` for the constructor of
  :class:`autonlu.SimpleModel` and :class:`autonlu.Model`, which can be used
  to deactivate the sample hash to reduce memory consumption and processing
  time when a lot of training data is being used
* Fixed a problem where some Studio interactions failed silently when the
  given product key was not correct (e.g. for :func:`autonlu.list_models`).
  Now, a ``LoginException`` is thrown.
* Fixed a bug that could crash training when writing metrics in an unexpected
  format to TensorBoard

0.12.0 (2021-04-15)
-------------------

* :func:`autonlu.DocumentModel.evaluate` now supports special metrics for all
  three tasks (class, label, and classlabel)

0.11.1 (2021-04-01)
-------------------

* Fixes a bug in :func:`autonlu.split_dataset` where ``xVal`` and ``yVal``
  both contained y values of the Y dataset.

0.11.0 (2021-04-01)
-------------------

* Adds :class:`autonlu.data_dependence.DataDependence` to visualize the effect
  of the training set size on the model accuracy
* Adds translation capabilities to AutoNLU in the form of the
  :class:`autonlu.Translator` class
* Fixes a bug where the ``all_labels`` argument of the constructors of
  :class:`autonlu.Model` and :class:`autonlu.SimpleModel` was not used

0.10.0 (2021-03-22)
-------------------

* A bug in the metric calculation of :class:`autonlu.DocumentModel` was fixed
* A bug in :class:`autonlu.DocumentModel` was fixed that led to a crash in
  certain situations

0.9.0 (2021-03-18)
------------------

* Additional arguments to influence the training (especially early stopping)
  have been added to :func:`autonlu.SimpleModel.train` and
  :func:`autonlu.Model.train`.
* Better default settings for class and label tasks have been introduced which
  should provide a good balance between model accuracy and training time.

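
As background for the early stopping arguments mentioned above, the following
is a generic sketch of patience-based early stopping. It is not AutoNLU's
implementation, and the names ``stop_epoch``, ``patience``, and ``min_delta``
are hypothetical; they do not correspond to actual AutoNLU arguments.

.. code-block:: python

    def stop_epoch(val_losses, patience=3, min_delta=0.0):
        """Return the epoch index at which training would stop, or None.

        Hypothetical helper for illustration only (not part of AutoNLU).
        Training stops once the validation loss has failed to improve by
        more than ``min_delta`` for ``patience`` consecutive epochs.
        """
        best = float("inf")
        epochs_without_improvement = 0
        for epoch, loss in enumerate(val_losses):
            if loss < best - min_delta:
                # New best validation loss: reset the counter.
                best = loss
                epochs_without_improvement = 0
            else:
                epochs_without_improvement += 1
                if epochs_without_improvement >= patience:
                    return epoch
        return None

A larger ``patience`` in such a scheme trades longer training time for a lower
risk of stopping before the model has converged.
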

0.8.0 (2021-03-17)
------------------

* Makes finetuning of language models with
  :func:`autonlu.SimpleModel.finetune` and :func:`autonlu.Model.finetune` more
  flexible by adding additional losses

0.7.1 (2021-03-12)
------------------

* Added Windows support

0.7.0 (2021-03-12)
------------------

* Fixes a bug with tokenization in finetuning
* The training loss is now also logged

0.6.1 (2021-02-25)
------------------

* The standard logging mode was changed from DEBUG to WARN
* The default learning rate for label tasks was increased to 2e-4. This should
  improve training time as well as final accuracy in most cases.
* The default ``mindatasetsize`` for label tasks was decreased to 0. This
  should improve training time.

0.6.0 (2021-02-22)
------------------

* A new function :func:`autonlu.get_model_by_id` was added that allows the
  download of models using the model ID. This method even works for models
  that are not returned by :func:`autonlu.list_models` (e.g. custom trained
  models inside projects).
* The function :func:`autonlu.list_models` now returns a list of tuples
  containing ``(model name, display name, model id)``. The argument
  ``get_ids`` was removed.
* :class:`autonlu.SimpleModel` and :class:`autonlu.Model` now have an argument
  ``baseurl`` (i.e. a different address of a Studio instance). This is mainly
  interesting for on-premise Studio instances.
* Added functionality to upload models to Studio with
  :func:`autonlu.SimpleModel.upload`, :func:`autonlu.Model.upload`, and
  :func:`autonlu.studio.upload_model`

0.5.0 (2021-02-15)
------------------

* The functions :func:`autonlu.Model.evaluate` and
  :func:`autonlu.SimpleModel.evaluate` were added, which offer a way to easily
  determine some evaluation metrics of models (accuracy, F1 score, precision,
  and recall)
* A bug in the function :func:`autonlu.list_models` has been fixed which led
  to a crash of the function call if specific models were present on Studio

0.4.0 (2021-02-11)
------------------

* :class:`autonlu.DocumentModel` now also supports class and label tasks
* The function :func:`autonlu.get_product_key` has been added to be able to
  retrieve the current valid product key from Studio using a user's login
  information
* The function :func:`autonlu.login` has been added as a convenient way to use
  AutoNLU by simply providing the Studio login information (user name and
  password)
* :class:`autonlu.Model` and :class:`autonlu.DocumentModel` now correctly
  support per segment class lists (i.e. for class and classlabel tasks it is
  possible to specify different ``all_classes`` for individual samples, which
  is useful for e.g. active learning)
* Removed the need to set the ``DO_PUBLIC_KEY`` manually. This is now only
  needed for on-premise solutions with separate user management.