======================================
Overview of the Software Architecture
======================================

The information in this document is not needed to *use* AutoNLU. It describes the layers and states that AutoNLU is built upon and is mainly intended for developers working on AutoNLU itself. It might nonetheless be interesting to users of AutoNLU who want to understand certain design decisions.

------------------------------
Bottom-Up Hierarchy of AutoNLU
------------------------------

- Huggingface ``transformers`` is used for the low-level language models.
- ``autonlu.core`` contains our wrapper around these language models. Its ``Classifier`` class implements functionality like splitting data into batches, dynamically adjusting the batch size to react to out-of-memory errors on the GPU, loading and saving models, calculating losses for training, epoch handling, etc. Together with the ``CallbackHandler`` class, it also offers a system to change the behaviour of the whole system using callback classes (called **Modules** in AutoNLU).
- ``SimpleModel`` is the first layer that is intended to be used by customers. It is built on top of ``Classifier`` and a set of standard modules that are needed for a fully functioning system which is able to solve practical problems. It also adds extra functionality like loading a model directly from Studio, language model fine-tuning, active learning, etc. ``SimpleModel`` also maintains state and ensures that different API calls work together (e.g. training twice, or fine-tuning and then training). It also ensures that a model which does not support a certain state transition (e.g. a CNN model, which does not support fine-tuning) throws a descriptive exception. The states of a ``SimpleModel`` are shown in the image below:

  .. image:: ../../media/states.png
      :width: 400px
      :align: center
      :alt: States of SimpleModel

- ``Model`` is **the main class to be used by customers**. It inherits from ``SimpleModel`` and offers easier-to-use interfaces for the different tasks.
- ``DocumentModel`` is a class that offers a document-centric interface. E.g. texts to be analyzed can be given as a hierarchical dictionary, and results are returned as the same dictionary with additional annotation key/value pairs added. It is mainly intended as an interface for DO-internal software (Studio, ctl-flow).

-----------------
The Module System
-----------------

Modules
-------

A module is a class which inherits from ``Callback`` (you can use ``autonlu/core/modules/00_moduletemplate.py`` as a starting point). Each module implements a number of callback member functions that are called at specific times during training or inference by ``Classifier``. E.g. ``on_epoch_begin`` is called when a new epoch is started, ``on_model_save`` is called before a model is saved, etc. Quite a number of modules for all sorts of tasks are already implemented and can be found in ``autonlu/core/modules``.
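For illustration, a minimal module that only observes the training loop could look like the following sketch (the import path of ``Callback`` and the exact set of available callbacks are assumptions here; consult ``00_moduletemplate.py`` for the authoritative list)::

    from autonlu.core.modules import Callback  # assumed import path

    class LoggingModule(Callback):
        """Toy module that prints a message at two callback points."""

        def on_epoch_begin(self, **kwargs):
            # Called by ``Classifier`` whenever a new epoch starts
            print("A new epoch has started")

        def on_model_save(self, **kwargs):
            # Called right before a model is saved
            print("The model is about to be saved")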
The Individual Callbacks
~~~~~~~~~~~~~~~~~~~~~~~~

The individual callbacks have a way to get information about the whole system and also to update the state of the system. Information can be obtained by specifying arguments with the correct name; ``00_moduletemplate.py`` documents which arguments are available.

For example: in a module, we would like to lowercase all sentences from the current batch before it is sent to the machine learning model. We select the callback ``on_batch_begin``, which is executed before the batch is sent through the system. The corresponding member function from ``00_moduletemplate.py`` looks like this::

    def on_batch_begin(self, **kwargs):
        # Updated in internal state: x, y
        # Written back: x, y
        pass

If we want to have access to the sentences of the batch (``x``) in our function, we can simply mention it in the signature::

    def on_batch_begin(self, x, **kwargs):
        # Updated in internal state: x, y
        # Written back: x, y
        lower_x = [ex.lower() for ex in x]

All arguments that are mentioned in earlier callbacks can also be used in later callbacks. So we could, for example, also request ``modeltype``, which was introduced in ``before_model_load``.

So how do we update which sentences are actually used for the next batch by the system? From each callback member function we can return a dictionary that contains updates to specific values. Which values will be updated is also mentioned as a comment in ``00_moduletemplate.py`` (``# Written back:``). In general, arbitrary dictionaries can be returned, but only the mentioned values will be updated immediately for the ``Classifier``. Other modules with the same callback will receive the updated values immediately, though. So our callback becomes::

    def on_batch_begin(self, x, **kwargs):
        # Updated in internal state: x, y
        # Written back: x, y
        return {"x": [ex.lower() for ex in x]}

A callback function can also return arbitrary key/value pairs that are not yet used for anything. This can be used as a loosely coupled way of communication between modules. For example, we could give other modules the information that our module lowered all the sentences by registering a new value::

    def on_batch_begin(self, x, **kwargs):
        # Updated in internal state: x, y
        # Written back: x, y
        return {"x": [ex.lower() for ex in x], "was_lowered": True}

Other modules can then request this value and react to it accordingly. When requesting non-standard information (which might not be there if a specific module is not currently used), it is generally a good idea to give a default value of ``None`` and check for it. For example, another module might want to know in ``on_batch_end`` (after the batch has been sent through the model) whether the sentences were lowered, so we could write::

    def on_batch_end(self, was_lowered=None, **kwargs):
        # Updated in internal state: x, y, result, nbiterations, nbseensamples
        # Written back: x, y, result, nbiterations, nbseensamples
        if was_lowered is not None:
            print("We just processed lowered sentences")

Value Callbacks
~~~~~~~~~~~~~~~

We have just been talking about callbacks which are used at specific points during training/inference/model loading, etc., but there is also a second form of callbacks we have named **value callbacks**. A value callback is called if a specific value has been updated by a module (even if it was updated to the same value it had before). E.g. our module might want to react to a change in the batch size. This can be achieved by registering functions in the ``self.val_callbacks`` dictionary::

    self.val_callbacks["batchsize"] = self.batchsize_change_callback
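Put together, a module reacting to batch-size changes might look like the following sketch (the registration point in ``__init__`` and the callback receiving the new value as its argument are assumptions for this illustration)::

    from autonlu.core.modules import Callback  # assumed import path

    class BatchsizeWatcher(Callback):
        """Toy module that logs every update of the "batchsize" value."""

        def __init__(self):
            super().__init__()
            # Fires whenever "batchsize" is updated, even if the new
            # value is identical to the old one
            self.val_callbacks["batchsize"] = self.batchsize_change_callback

        def batchsize_change_callback(self, batchsize):
            print(f"Batch size is now {batchsize}")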
--------------------
The Callback Handler
--------------------

The ``CallbackHandler`` has two functions:

1. It holds a global state of the system (in the ``state`` dict)
2. It orchestrates the calling of the callback functions in modules

.. image:: ../../media/architecture.png
    :width: 400px
    :align: center
    :alt: Callback Handler architecture overview

1) In ``Classifier``, a certain callback is requested to be executed by the callback handler. The internal ``state`` dict is updated with the information given by ``Classifier``.
2) The callback handler calls the corresponding callback of the first module.
3) The module reports back information which should be updated in, or added to, the ``state`` dict, and the callback handler integrates this information.
4) The correct callback of each of the modules is called in the appropriate order. Modules are called either in the order they were given when the callback handler was created, or in reverse, depending on the specific callback. Generally, callbacks which are executed before sending information through the machine learning model are called in order, and callbacks that are called after are called in reverse order.
5) The returned information of the last module is integrated into the ``state`` dict.
6) The new state is returned to the ``Classifier``, which updates its internal variables.

In practice, the calls in ``Classifier`` look like this::

    # ############# On Model Load #############
    self.model, self.device, self.fp16 = self.ch.on_model_load(
        model=self.model, device=self.device, fp16=self.fp16)
    # ##############################################

``self.ch`` is the callback handler. Values that should be updated in the internal ``state`` dict are passed as named arguments to the callback. The updated values are returned in the same order as a tuple. Although only three values are passed into the callback, the callbacks of the modules can still request all information that is in the ``state`` dict.

------------------------------
Advantage of this architecture
------------------------------

Most of the functionality of AutoNLU is implemented in such modules. Even the actual training step (calculating the gradient and updating the weights using an optimizer) is implemented in modules and is not part of AutoNLU or the ``Classifier`` itself. This makes it very easy to replace components. Other examples of modules:

- Sorting sentences by token length for faster inference
- Evaluating the model at regular intervals
- Keeping the best model and reloading it after training is finished
- Tensorboard logging
- Encryption/decryption of models on save/load
- ...

This architecture makes the core of AutoNLU extremely flexible and allows new ideas and research results to be implemented without having to change and increase the complexity of the core. This keeps the actual system that is used in production very clean, even if hundreds of different pre-processing steps, data augmentation schemes, training improvements, etc., are implemented and can be used (also in combination) at a moment's notice.
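To make the orchestration described above concrete, here is a deliberately simplified, hypothetical re-implementation of the dispatch idea (the real ``CallbackHandler`` differs in its details, e.g. in how it decides on the calling order and in exposing each callback as its own method)::

    class MiniCallbackHandler:
        """Toy dispatcher illustrating the state dict and write-back."""

        def __init__(self, modules):
            self.modules = modules
            self.state = {}  # global state of the system

        def call(self, callbackname, **updates):
            # Integrate the information given by the caller into the state
            self.state.update(updates)
            for module in self.modules:
                callback = getattr(module, callbackname, None)
                if callback is None:
                    continue
                # The full state is passed; ``**kwargs`` in the module
                # swallows everything a callback did not explicitly request
                result = callback(**self.state)
                if result:
                    # Written-back values are integrated immediately, so
                    # later modules already see them
                    self.state.update(result)
            # Return the updated values in the order they were passed in
            return tuple(self.state[key] for key in updates)

A call such as ``handler.call("on_batch_begin", x=sentences, y=labels)`` would then mirror the ``self.ch.on_model_load(...)`` example above, except that in the real system each callback is a dedicated method on the handler.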