======================================
Overview of the Software Architecture
======================================

The information in this document is not needed to *use* AutoNLU. It describes the layers and states that AutoNLU is built upon and is mainly intended for developers working on AutoNLU itself. It might nonetheless be interesting to users of AutoNLU who want to understand certain design decisions.

------------------------------
Bottom-Up Hierarchy of AutoNLU
------------------------------

- Huggingface ``transformers`` is used for the low-level language models.
- ``autonlu.core`` contains our wrapper around these language models. Its ``Classifier`` class implements functionality like splitting data into batches, dynamically adjusting the batch size to react to out-of-memory errors on the GPU, loading and saving models, calculating losses for training, epoch handling, etc. Together with the ``CallbackHandler`` class, it also offers a system to change the behaviour of the whole system using callback classes (called **Modules** in AutoNLU).
- ``SimpleModel`` is the first layer that is intended to be used by customers. It is built on top of ``Classifier`` and a set of standard modules that are needed for a fully functioning system which is able to solve practical problems. It also adds extra functionality like loading a model directly from Studio, language model fine-tuning, active learning, etc. ``SimpleModel`` also maintains state and ensures that different API calls work together (e.g. training twice, or fine-tuning and then training). It also ensures that a model which does not support a certain state transition (e.g. a CNN model, which does not support fine-tuning) throws a descriptive exception. The states of a ``SimpleModel`` are shown in the image below:

  .. image:: ../../media/states.png
      :width: 400px
      :align: center
      :alt: States of SimpleModel

- ``Model`` is **the main class to be used by customers**. It inherits from ``SimpleModel`` and offers easier-to-use interfaces for the different tasks.
- ``DocumentModel`` is a class that offers a document-centric interface. E.g. texts to be analyzed can be given as a hierarchical dictionary, and results are returned as the same dictionary with additional annotation key/value pairs added. It is mainly intended as an interface for DO-internal software (Studio, ctl-flow).

-----------------
The Module System
-----------------

Modules
-------

A module is a class which inherits from ``Callback`` (you can use ``autonlu/core/modules/00_moduletemplate.py`` as a starting point). Each module implements a number of callback member functions that are called at specific times during training or inference by ``Classifier``. E.g. ``on_epoch_begin`` is called when a new epoch is started, ``on_model_save`` is called before a model is saved, etc. Quite a number of modules for all sorts of tasks are already implemented and can be found in ``autonlu/core/modules``.
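For illustration, a minimal module that only observes the training loop could look like the following sketch (the import path of ``Callback`` and the exact set of available callbacks are assumptions here; consult ``00_moduletemplate.py`` for the authoritative list)::

    from autonlu.core.modules import Callback  # assumed import path

    class LoggingModule(Callback):
        """Toy module that prints a message at two callback points."""

        def on_epoch_begin(self, **kwargs):
            # Called by ``Classifier`` whenever a new epoch starts
            print("A new epoch has started")

        def on_model_save(self, **kwargs):
            # Called right before a model is saved
            print("The model is about to be saved")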
The Individual Callbacks
~~~~~~~~~~~~~~~~~~~~~~~~

The individual callbacks have a way to get information about the whole system and also to update the state of the system. Information can be obtained by specifying arguments with the correct name; ``00_moduletemplate.py`` documents which arguments are available.

For example: in a module, we would like to lowercase all sentences from the current batch before it is sent to the machine learning model. We select the callback ``on_batch_begin``, which is executed before the batch is sent through the system. The corresponding member function from ``00_moduletemplate.py`` looks like this::

    def on_batch_begin(self, **kwargs):
        # Updated in internal state: x, y
        # Written back: x, y
        pass

If we want to have access to the sentences of the batch (``x``) in our function, we can simply mention it in the signature::

    def on_batch_begin(self, x, **kwargs):
        # Updated in internal state: x, y
        # Written back: x, y
        lower_x = [ex.lower() for ex in x]

All arguments that are mentioned in earlier callbacks can also be used in later callbacks. So we could, for example, also request ``modeltype``, which was introduced in ``before_model_load``.

So how do we update which sentences are actually used for the next batch by the system? From each callback member function we can return a dictionary that contains updates to specific values. Which values will be updated is also mentioned as a comment in ``00_moduletemplate.py`` (``# Written back:``). In general, arbitrary dictionaries can be returned, but only the mentioned values will be updated immediately for the ``Classifier``. Other modules with the same callback will receive the updated values immediately, though. So our callback becomes::

    def on_batch_begin(self, x, **kwargs):
        # Updated in internal state: x, y
        # Written back: x, y
        return {"x": [ex.lower() for ex in x]}

A callback function can also return arbitrary key/value pairs that are not yet used for anything. This can be used as a loosely coupled way of communication between modules. For example, we could give other modules the information that our module lowered all the sentences by registering a new value::

    def on_batch_begin(self, x, **kwargs):
        # Updated in internal state: x, y
        # Written back: x, y
        return {"x": [ex.lower() for ex in x], "was_lowered": True}

Other modules can then request this value and react to it accordingly. When requesting non-standard information (which might not be there if a specific module is not currently used), it is generally a good idea to give a default value of ``None`` and check for it. For example, another module might want to know in ``on_batch_end`` (after the batch has been sent through the model) whether the sentences were lowered, so we could write::

    def on_batch_end(self, was_lowered=None, **kwargs):
        # Updated in internal state: x, y, result, nbiterations, nbseensamples
        # Written back: x, y, result, nbiterations, nbseensamples
        if was_lowered is not None:
            print("We just processed lowered sentences")

Value Callbacks
~~~~~~~~~~~~~~~

We have just been talking about callbacks which are used at specific points during training/inference/model loading, etc., but there is also a second form of callbacks we have named **value callbacks**. A value callback is called if a specific value has been updated by a module (even if it was updated to the same value it had before). E.g. our module might want to react to a change in the batch size. This can be achieved by registering functions in the ``self.val_callbacks`` dictionary::

    self.val_callbacks["batchsize"] = self.batchsize_change_callback
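Put together, a module reacting to batch-size changes might look like the following sketch (the registration point in ``__init__`` and the callback receiving the new value as its argument are assumptions for this illustration)::

    from autonlu.core.modules import Callback  # assumed import path

    class BatchsizeWatcher(Callback):
        """Toy module that logs every update of the "batchsize" value."""

        def __init__(self):
            super().__init__()
            # Fires whenever "batchsize" is updated, even if the new
            # value is identical to the old one
            self.val_callbacks["batchsize"] = self.batchsize_change_callback

        def batchsize_change_callback(self, batchsize):
            print(f"Batch size is now {batchsize}")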
--------------------
The Callback Handler
--------------------

The ``CallbackHandler`` has two functions:

1. It holds a global state of the system (in the ``state`` dict)
2. It orchestrates the calling of the callback functions in modules

.. image:: ../../media/architecture.png
    :width: 400px
    :align: center
    :alt: Callback Handler architecture overview

1) In ``Classifier``, a certain callback is requested to be executed by the callback handler. The internal ``state`` dict is updated with the information given by ``Classifier``.
2) The callback handler calls the corresponding callback of the first module.
3) The module reports back information which should be updated in, or added to, the ``state`` dict, and the callback handler integrates this information.
4) The correct callback of each of the modules is called in the appropriate order. Modules are called either in the order they were given when the callback handler was created, or in reverse, depending on the specific callback. Generally, callbacks which are executed before sending information through the machine learning model are called in order, and callbacks that are called after are called in reverse order.
5) The returned information of the last module is integrated into the ``state`` dict.
6) The new state is returned to the ``Classifier``, which updates its internal variables.

In practice, the calls in ``Classifier`` look like this::

    # ############# On Model Load #############
    self.model, self.device, self.fp16 = self.ch.on_model_load(
        model=self.model, device=self.device, fp16=self.fp16)
    # ##############################################

``self.ch`` is the callback handler. Values that should be updated in the internal ``state`` dict are passed as named arguments to the callback. The updated values are returned in the same order as a tuple. Although only three values are passed into the callback, the callbacks of the modules can still request all information that is in the ``state`` dict.

------------------------------
Advantage of this architecture
------------------------------

Most of the functionality of AutoNLU is implemented in such modules. Even the actual training step (calculating the gradient and updating the weights using an optimizer) is implemented in modules and is not part of AutoNLU or the ``Classifier`` itself. This makes it very easy to replace components. Other examples of modules:

- Sorting sentences by token length for faster inference
- Evaluating the model at regular intervals
- Keeping the best model and reloading it after training is finished
- Tensorboard logging
- Encryption/decryption of models on save/load
- ...

This architecture makes the core of AutoNLU extremely flexible and allows new ideas and research results to be implemented without having to change and increase the complexity of the core. This keeps the actual system that is used in production very clean, even if hundreds of different pre-processing steps, data augmentation schemes, training improvements, etc., are implemented and can be used (also in combination) at a moment's notice.
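To make the orchestration described above concrete, here is a deliberately simplified, hypothetical re-implementation of the dispatch idea (the real ``CallbackHandler`` differs in its details, e.g. in how it decides on the calling order and in exposing each callback as its own method)::

    class MiniCallbackHandler:
        """Toy dispatcher illustrating the state dict and write-back."""

        def __init__(self, modules):
            self.modules = modules
            self.state = {}  # global state of the system

        def call(self, callbackname, **updates):
            # Integrate the information given by the caller into the state
            self.state.update(updates)
            for module in self.modules:
                callback = getattr(module, callbackname, None)
                if callback is None:
                    continue
                # The full state is passed; ``**kwargs`` in the module
                # swallows everything a callback did not explicitly request
                result = callback(**self.state)
                if result:
                    # Written-back values are integrated immediately, so
                    # later modules already see them
                    self.state.update(result)
            # Return the updated values in the order they were passed in
            return tuple(self.state[key] for key in updates)

A call such as ``handler.call("on_batch_begin", x=sentences, y=labels)`` would then mirror the ``self.ch.on_model_load(...)`` example above, except that in the real system each callback is a dedicated method on the handler.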