* Add support for multi-dimensional model input and making the x, y interface symmetrical on the model / training interface: (#707)
  - x/y_transform extract x, y as numpy arrays or floats out of the record
  - x/y_translate convert the numpy arrays of floats into tf-readable dictionaries used in tf data.
* Simplify model interface by implementing output_types() directly in the base class using output_shapes() dictionary.
* Adding developer guide for own model development
* Updated donkey command documentation
* Improve asserts and type hints in keras.py
* Added missing __init__.py in parts module.
* Add cool ascii text for donkey init and update yml and setup files including mypy
* Remove model training test from Travis and change the test to relative convergence. This avoids random fall overs in CI.
* Added test of tf.data as used in the training pipeline through re-implementation of data transformation from tub records to tf expected dictionaries, for all currently supported models.
autorope · Dec 21, 2020 · ff35d5f · ff35d5f
1 parent d70ee60
commit ff35d5f
Showing 15 changed files with 530 additions and 87 deletions.
diff --git a/docs/dev_guide/model.md b/docs/dev_guide/model.md
@@ -0,0 +1,305 @@
+# How to build your own model
+
+* [Overview](model.md#overview)
+* [Constructor](model.md#constructor)
+* [Training interface](model.md#training-interface)
+* [Parts interface](model.md#parts-interface)
+* [Example](model.md#example)
+
+## Overview
+
+You might want to write your own model:
+
+* If you find the models that ship with donkey not sufficient, and you want to
+  experiment with your own model infrastructure
+* If you want to add more input data to the model because your car has more
+  sensors
+
+## Constructor
+
+Models are located in `donkeycar/parts/keras.py`. Your own model needs to
+inherit from `KerasPilot` and initialize your model:
+
+```python
+class KerasSensors(KerasPilot):
+    def __init__(self, input_shape=(120, 160, 3), num_sensors=2):
+        super().__init__()
+        self.num_sensors = num_sensors
+        self.model = self.create_model(input_shape)
+```
+Here, you implement the [keras model](https://www.tensorflow.org/guide/keras/sequential_model)
+in the member function `create_model()`. The model needs to have labelled input
+and output tensors. These are required for the training to work.
+
+
+## Training interface
+
+For your model to work, it needs to implement the following functions:
+
+```python
+def compile(self):
+    self.model.compile(optimizer=self.optimizer, metrics=['accuracy'],
+                       loss={'angle_out': 'categorical_crossentropy',
+                             'throttle_out': 'categorical_crossentropy'},
+                       loss_weights={'angle_out': 0.5, 'throttle_out': 0.5})
+```
+
+The `compile` function tells keras which loss function to use for training.
+We are using the `KerasCategorical` model as an example. The loss function here
+explicitly references the output tensors of the model (`angle_out`,
+`throttle_out`).
+
+```python
+def x_transform(self, record: TubRecord):
+    img_arr = record.image(cached=True)
+    return img_arr
+```
+
+In this function you define how to extract the input data from your
+recorded data. This data is usually called `X` in machine learning frameworks.
+We have shown the implementation in the base class, which works for all models
+that have only the image as input.
+
+The function returns a single data item if the model has only one input. You
+need to return a tuple if your model uses more input data.
+**Note:** _If your model has more inputs, the tuple needs to have the image in
+the first place._
+
+```python
+def y_transform(self, record: TubRecord):
+    angle: float = record.underlying['user/angle']
+    throttle: float = record.underlying['user/throttle']
+    return angle, throttle
+```
+In this function you specify how to extract the `y` values (i.e. target
+values) from your recorded data.
+
+
+```python
+def x_translate(self, x: XY) -> Dict[str, Union[float, np.ndarray]]:
+    return {'img_in': x}
+```
+This function translates the `X` value that you extracted above into the form
+that is fed into `tf.data`. Note that `tf.data` expects a dictionary if the
+model has more than one input variable, so we have chosen to use dictionaries
+in the one-argument case too, for consistency. Above we have shown the
+implementation in the base class, which works for all models that have only the
+image as input. You don't have to override either `x_transform` or
+`x_translate` if your model only uses the image as input data.
+**Note:** _The keys of the dictionary must match the names of the **input**
+layers in the model._
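
A minimal sketch of how the two steps fit together for a model with a second input. It assumes a plain dictionary stands in for the `TubRecord` and nested lists stand in for numpy arrays; the names `FakeRecord`, `sensor` and `sensor_in` are hypothetical and only serve as illustration.

```python
from typing import Any, Dict, Tuple

# Hypothetical stand-in for TubRecord: a plain dict holding one recorded sample.
FakeRecord = Dict[str, Any]


def x_transform(record: FakeRecord) -> Tuple[Any, Any]:
    """Extract the raw model inputs from the record, image first."""
    img_arr = record['cam/image_array']
    sensor_arr = record['sensor']
    return img_arr, sensor_arr


def x_translate(x: Tuple[Any, Any]) -> Dict[str, Any]:
    """Map the extracted inputs to the names of the model's input layers."""
    if not isinstance(x, tuple):
        raise TypeError('Expected tuple')
    return {'img_in': x[0], 'sensor_in': x[1]}


# Nested lists stand in for the image array to keep the sketch dependency-free.
record = {'cam/image_array': [[0.1, 0.2], [0.3, 0.4]],
          'sensor': [0.5, 0.7],
          'user/angle': 0.25,
          'user/throttle': 0.6}
x_dict = x_translate(x_transform(record))
print(sorted(x_dict))
```

Keeping extraction and naming in separate functions means a derived model only has to change the part that actually differs.
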
+
+```python
+def y_translate(self, y: XY) -> Dict[str, Union[float, np.ndarray]]:
+    if isinstance(y, tuple):
+        angle, throttle = y
+        return {'angle_out': angle, 'throttle_out': throttle}
+    else:
+        raise TypeError('Expected tuple')
+```
+Similar to the above, this provides the translation of the `y` data into the
+dictionary required for `tf.data`. This example shows the implementation of
+`KerasLinear`.
+**Note:** _the keys of the dictionary must match the name of the **output**
+layers in the model._
+
+```python
+def output_shapes(self):
+    # need to cut off None from [None, 120, 160, 3] tensor shape
+    img_shape = self.get_input_shape()[1:]
+    shapes = ({'img_in': tf.TensorShape(img_shape)},
+              {'angle_out': tf.TensorShape([15]),
+               'throttle_out': tf.TensorShape([20])})
+    return shapes
+```
+This function returns a tuple of _two_ dictionaries that tell tensorflow which
+shapes are used in the model. We have shown the example of the
+`KerasCategorical` model here.
+**Note 1:** _The keys of the two dictionaries must match the names of the
+**input** and **output** layers in the model._
+**Note 2:** _Where the model returns scalar numbers, the corresponding
+shape has to be `tf.TensorShape([])`._
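
Mismatched keys are an easy mistake to make, so it can help to compare the dictionaries before training. The sketch below is not part of donkeycar: `check_keys` is a hypothetical helper, and plain tuples stand in for `tf.TensorShape` objects.

```python
from typing import Any, Dict, Tuple

Shapes = Tuple[Dict[str, Any], Dict[str, Any]]


def check_keys(x_dict: Dict[str, Any], y_dict: Dict[str, Any],
               shapes: Shapes) -> None:
    """Raise ValueError if translated data and declared shapes disagree."""
    x_shapes, y_shapes = shapes
    if set(x_dict) != set(x_shapes):
        raise ValueError(f'input keys {sorted(x_dict)} != {sorted(x_shapes)}')
    if set(y_dict) != set(y_shapes):
        raise ValueError(f'output keys {sorted(y_dict)} != {sorted(y_shapes)}')


# Shapes as declared for the categorical example above: an image input and
# two binned outputs. An empty tuple would stand for a scalar output.
shapes = ({'img_in': (120, 160, 3)},
          {'angle_out': (15,), 'throttle_out': (20,)})

# Passes silently because the keys agree with the declared shapes.
check_keys({'img_in': None}, {'angle_out': None, 'throttle_out': None}, shapes)
```
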
+
+
+## Parts interface
+
+In the car application the model is called through the `run()` function. That
+function is already provided in the base class, where the normalisation of the
+input image happens centrally. The derived classes therefore do not implement
+`run()` but `inference()`, which works on the normalised data. If you have
+additional data that needs to be normalised, too, you might want to override
+`run()` as well.
+```python
+def inference(self, img_arr, other_arr):
+    img_arr = img_arr.reshape((1,) + img_arr.shape)
+    outputs = self.model.predict(img_arr)
+    steering = outputs[0]
+    throttle = outputs[1]
+    return steering[0][0], throttle[0][0]
+```
+Here we are showing the implementation of the linear model. Please note that
+the input tensor shape always contains the batch dimension in the first
+place, hence the shape of the input image is adjusted from
+`(120, 160, 3) -> (1, 120, 160, 3)`.
+**Note:** _If you are passing another array in the `other_arr` variable, you
+will have to do a similar re-shaping._
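
The shape arithmetic behind the re-shaping can be checked without a model. This is a small sketch in which plain tuples stand in for numpy array shapes; `add_batch_dim` is a hypothetical helper name.

```python
from typing import Tuple


def add_batch_dim(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Return the shape with a batch dimension of size 1 in first place."""
    return (1,) + tuple(shape)


img_shape = (120, 160, 3)      # a single camera image
sensor_shape = (2,)            # a vector of two sensor readings

print(add_batch_dim(img_shape))     # (1, 120, 160, 3)
print(add_batch_dim(sensor_shape))  # (1, 2)
```
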
+
+
+## Example
+Let's build a new donkey model which is based on the standard linear model
+but has the following changes w.r.t. input data and network design:
+
+1. The model takes an additional vector of input data that represents a set
+   of values from distance sensors which are attached to the front of the car.
+
+2. The model adds a couple more feed-forward layers to combine the CNN
+   layers of the vision system with the distance sensor data.
+
+### Building the model using keras
+Here is the example model:
+```python
+class KerasSensors(KerasPilot):
+    def __init__(self, input_shape=(120, 160, 3), num_sensors=2):
+        super().__init__()
+        self.num_sensors = num_sensors
+        self.model = self.create_model(input_shape)
+
+    def create_model(self, input_shape):
+        drop = 0.2
+        img_in = Input(shape=input_shape, name='img_in')
+        x = core_cnn_layers(img_in, drop)
+        x = Dense(100, activation='relu', name='dense_1')(x)
+        x = Dropout(drop)(x)
+        x = Dense(50, activation='relu', name='dense_2')(x)
+        x = Dropout(drop)(x)
+        # up to here, this is the standard linear model, now we add the
+        # sensor data to it
+        sensor_in = Input(shape=(self.num_sensors, ), name='sensor_in')
+        y = sensor_in
+        z = concatenate([x, y])
+        # here we add two more dense layers
+        z = Dense(50, activation='relu', name='dense_3')(z)
+        z = Dropout(drop)(z)
+        z = Dense(50, activation='relu', name='dense_4')(z)
+        z = Dropout(drop)(z)
+        # two outputs for angle and throttle
+        outputs = [
+            Dense(1, activation='linear', name='n_outputs' + str(i))(z)
+            for i in range(2)]
+
+        # the model needs to specify the additional input here
+        model = Model(inputs=[img_in, sensor_in], outputs=outputs)
+        return model
+
+    def compile(self):
+        self.model.compile(optimizer=self.optimizer, loss='mse')
+
+    def inference(self, img_arr, other_arr):
+        img_arr = img_arr.reshape((1,) + img_arr.shape)
+        sens_arr = other_arr.reshape((1,) + other_arr.shape)
+        outputs = self.model.predict([img_arr, sens_arr])
+        steering = outputs[0]
+        throttle = outputs[1]
+        return steering[0][0], throttle[0][0]
+
+    def x_transform(self, record: TubRecord) -> XY:
+        img_arr = super().x_transform(record)
+        # for simplicity we assume the sensor data here is normalised
+        sensor_arr = np.array(record.underlying['sensor'])
+        # we need to return the image data first
+        return img_arr, sensor_arr
+
+    def x_translate(self, x: XY) -> Dict[str, Union[float, np.ndarray]]:
+        assert isinstance(x, tuple), 'Requires tuple as input'
+        # the keys are the names of the input layers of the model
+        return {'img_in': x[0], 'sensor_in': x[1]}
+
+    def y_transform(self, record: TubRecord):
+        angle: float = record.underlying['user/angle']
+        throttle: float = record.underlying['user/throttle']
+        return angle, throttle
+
+    def y_translate(self, y: XY) -> Dict[str, Union[float, np.ndarray]]:
+        if isinstance(y, tuple):
+            angle, throttle = y
+            # the keys are the names of the output layers of the model
+            return {'n_outputs0': angle, 'n_outputs1': throttle}
+        else:
+            raise TypeError('Expected tuple')
+
+    def output_shapes(self):
+        # need to cut off None from [None, 120, 160, 3] tensor shape
+        img_shape = self.get_input_shape()[1:]
+        # the keys need to match the model's input/output layers
+        shapes = ({'img_in': tf.TensorShape(img_shape),
+                   'sensor_in': tf.TensorShape([self.num_sensors])},
+                  {'n_outputs0': tf.TensorShape([]),
+                   'n_outputs1': tf.TensorShape([])})
+        return shapes
+```
+We could have inherited from `KerasLinear` which would provide the 
+implementation of `y_transform(), y_translate(), compile()`. The model 
+requires the sensor data to be an array in the TubRecord with key `"sensor"`.
+
+### Creating a tub
+
+Because we don't have a tub with sensor data, let's create one with fake 
+sensor entries:
+```python
+import os
+import tarfile
+import numpy as np
+from donkeycar.parts.tub_v2 import Tub
+from donkeycar.pipeline.types import TubRecord
+from donkeycar.config import load_config
+
+
+if __name__ == '__main__':
+    # put your path to your car app
+    my_car = os.path.expanduser('~/mycar')
+    cfg = load_config(os.path.join(my_car, 'config.py'))
+    # put your path to donkey project
+    tar = tarfile.open(os.path.expanduser(
+        '~/Python/donkeycar/donkeycar/tests/tub/tub.tar.gz'))
+    tub_parent = os.path.join(my_car, 'data2/')
+    tar.extractall(tub_parent)
+    tub_path = os.path.join(tub_parent, 'tub')
+    tub1 = Tub(tub_path)
+    tub2 = Tub(os.path.join(my_car, 'data2/tub_sensor'),
+               inputs=['cam/image_array', 'user/angle', 'user/throttle',
+                       'sensor'],
+               types=['image_array', 'float', 'float', 'list'])
+
+    for record in tub1:
+        t_record = TubRecord(config=cfg,
+                             base_path=tub1.base_path,
+                             underlying=record)
+        img_arr = t_record.image(cached=False)
+        record['sensor'] = list(np.random.uniform(size=2))
+        record['cam/image_array'] = img_arr
+        tub2.write_record(record)
+```
+
+### Making the model available
+We don't have a dynamic factory yet, so we need to add the new model into the 
+function `get_model_by_type()` in the module `donkeycar/utils.py`:
+```python
+...
+elif model_type == 'sensor':
+    kl = KerasSensors(input_shape=input_shape)
+...
+```
+
+### Go train
+In your car app folder the following should now work:
+`donkey train --tub data2/tub_sensor --model models/pilot.h5 --type sensor`
+Because of the random values in the data the model will not converge quickly;
+the goal here is to get it working in the framework.
+
+
+## Support and discussions
+Please join the [Discord](https://discord.gg/dpvYHhpV2w) Donkey Car group for 
+support and discussions.
+
+
+
diff --git a/docs/utility/donkey.md b/docs/utility/donkey.md
@@ -65,6 +65,16 @@ donkey tubclean <folder containing tubs>
 * Opens the web server to delete bad data.
 * Hit `Ctrl + C` to exit
 
+## Train the model
+**Note:** _This section only applies to version 4.x_
+This command trains the model.
+```bash
+donkey train --tub=<tub_path> [--config=<config.py>] [--model=<model path>] [--model_type=(linear|categorical|inferred)] 
+```
+The `createcar` command still creates a `train.py` file for backward
+compatibility, but it is not needed; training can be run with the command above.
+
+
 ## Make Movie from Tub
 
 This command allows you to create a movie file from the images in a Tub.

diff --git a/donkeycar/__init__.py b/donkeycar/__init__.py
@@ -1,11 +1,14 @@
-__version__ = '4.1.0-dev'
+import sys
+from pyfiglet import Figlet
 
-print('using donkey v{} ...'.format(__version__))
+__version__ = '4.1.0-dev'
+f = Figlet(font='speed')
 
-import sys
+print(f.renderText('Donkey Car'))
+print(f'using donkey v{__version__} ...')
 
-if sys.version_info.major < 3:
-    msg = 'Donkey Requires Python 3.6 or greater. You are using {}'.format(sys.version)
+if sys.version_info.major < 3 or sys.version_info.minor < 6:
+    msg = f'Donkey Requires Python 3.6 or greater. You are using {sys.version}'
     raise ValueError(msg)
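
Note that the combined check `major < 3 or minor < 6` also rejects any interpreter whose minor version is below 6 regardless of its major version, for instance a hypothetical 4.0. Comparing version tuples avoids that. A minimal sketch, where `is_supported` is a hypothetical helper name and not part of this commit:

```python
import sys


def is_supported(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if not is_supported():
    raise ValueError(
        f'Donkey Requires Python 3.6 or greater. You are using {sys.version}')
```

Tuples compare element by element, so `(3, 5)` is rejected while `(3, 6)` and `(4, 0)` are accepted.
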
 
 # The default recursion limits in CPython are too small.

diff --git a/donkeycar/parts/__init__.py b/donkeycar/parts/__init__.py