Minor changes for 4.1 in tub conversion script and developer doc #708

Merged · 4 commits · Dec 23, 2020
41 changes: 30 additions & 11 deletions docs/dev_guide/model.md
@@ -1,9 +1,14 @@
# How to build your own model

---
**Note:** _This requires version >= 4.1.X_

---

* [Overview](model.md#overview)
* [Constructor](model.md#constructor)
* [Training Interface](model.md#training-interface)
* [Parts Interface](model.md#parts-interface)
* [Example](model.md#example)

## Overview
@@ -62,6 +67,8 @@ only the image as input.

The function returns a single data item if the model has only one input. You
need to return a tuple if your model uses more input data.


**Note:** _If your model has more inputs, the tuple needs to have the image in
the first place._
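As an illustration, a pilot that combines the camera image with an extra sensor array might implement `x_transform` like the toy sketch below. The record keys and the class name are made up for illustration and are not part of the donkeycar API:

```python
import numpy as np

class SensorPilotSketch:
    """Illustrative pilot that uses an image plus extra sensor data."""

    def x_transform(self, record):
        # The key names here are hypothetical; a real TubRecord is
        # accessed through its own API.
        img_arr = record['cam/image_array']
        sensor_arr = np.array(record['sensors'])
        # The image must come first in the tuple.
        return img_arr, sensor_arr

pilot = SensorPilotSketch()
record = {'cam/image_array': np.zeros((120, 160, 3)),
          'sensors': [0.1, 0.2]}
x = pilot.x_transform(record)
print(type(x), x[0].shape)
```

The returned tuple is then turned into a dictionary by `x_translate`.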

@@ -84,9 +91,11 @@ be fed into `tf.data`. Note, `tf.data` expects a dictionary if the model has
more than one input variable, so we have chosen to use dictionaries also in the
one-argument case for consistency. Above we have shown the implementation in the
base class which works for all models that have only the image as input. You
don't have to override either `x_transform` or `x_translate` if your
model only uses the image as input data.


**Note:** _the keys of the dictionary must match the name of the **input**
layers in the model._
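For instance, for a model with input layers named `img_in` and `sensor_in` (donkeycar's image input layer is conventionally called `img_in`; the sensor layer name is an assumption for this sketch), a matching translation could look like:

```python
import numpy as np

def x_translate(x):
    # x is the tuple produced by x_transform, image first
    img_arr, sensor_arr = x
    # The keys must match the names of the model's input layers
    return {'img_in': img_arr, 'sensor_in': sensor_arr}

batch = x_translate((np.zeros((120, 160, 3)), np.array([0.5, 0.1])))
print(sorted(batch.keys()))
```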

@@ -100,6 +109,8 @@
```python
def y_translate(self, y: XY) -> Dict[str, Union[float, np.ndarray]]:
    ...
```
Similar to the above, this provides the translation of the `y` data into the
dictionary required for `tf.data`. This example shows the implementation of
`KerasLinear`.


**Note:** _the keys of the dictionary must match the name of the **output**
layers in the model._
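As a sketch, for the two output layers `n_outputs0` and `n_outputs1` that `KerasLinear` uses for angle and throttle, the translation boils down to:

```python
def y_translate(y):
    # y holds (angle, throttle) as produced by y_transform
    angle, throttle = y
    # The keys must match the names of the model's output layers
    return {'n_outputs0': angle, 'n_outputs1': throttle}

d = y_translate((0.25, 0.5))
print(d)
```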

@@ -115,8 +126,12 @@ def output_shapes(self):
This function returns a tuple of _two_ dictionaries that tells tensorflow which
shapes are used in the model. We have shown the example of the
`KerasCategorical` model here.


**Note 1:** _As above, the keys of the two dictionaries must match the name
of the **input** and **output** layers in the model._


**Note 2:** _Where the model returns scalar numbers, the corresponding
type has to be `tf.TensorShape([])`._

@@ -141,6 +156,8 @@ Here we are showing the implementation of the linear model. Please note that
the input tensor shape always contains the batch dimension in the first
place, hence the shape of the input image is adjusted from
`(120, 160, 3) -> (1, 120, 160, 3)`.


**Note:** _If you are passing another array in the `other_arr` variable, you will
have to do a similar re-shaping._
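The batch dimension can be prepended with plain `numpy`, for example:

```python
import numpy as np

img_arr = np.zeros((120, 160, 3))       # a single input image
other_arr = np.array([0.1, 0.2, 0.3])   # hypothetical extra sensor array

# prepend the batch dimension that the model expects at inference time
img_batch = img_arr.reshape((1,) + img_arr.shape)        # (1, 120, 160, 3)
other_batch = other_arr.reshape((1,) + other_arr.shape)  # (1, 3)
print(img_batch.shape, other_batch.shape)
```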

@@ -177,7 +194,7 @@ class KerasSensors(KerasPilot):
```python
        sensor_in = Input(shape=(self.num_sensors, ), name='sensor_in')
        y = sensor_in
        z = concatenate([x, y])
        # here we add two more dense layers
        z = Dense(50, activation='relu', name='dense_3')(z)
        z = Dropout(drop)(z)
        z = Dense(50, activation='relu', name='dense_4')(z)
```
@@ -237,9 +254,11 @@ class KerasSensors(KerasPilot):
```python
                   'n_outputs1': tf.TensorShape([])})
        return shapes
```
We could have inherited from `KerasLinear` which already provides the
implementation of `y_transform(), y_translate(), compile()`. However, to
make it explicit for the general case we have implemented all functions here.
The model requires the sensor data to be an array in the TubRecord with key
`"sensor"`.

### Creating a tub

101 changes: 98 additions & 3 deletions docs/utility/donkey.md
@@ -66,13 +66,13 @@ donkey tubclean <folder containing tubs>
* Hit `Ctrl + C` to exit

## Train the model
**Note:** _This section only applies to version >= 4.1_
This command trains the model.
```bash
donkey train --tub=<tub_path> [--config=<config.py>] [--model=<model path>] [--model_type=(linear|categorical|inferred)]
```
The `createcar` command still creates a `train.py` file for backward
compatibility, but it's not required for training.


## Make Movie from Tub
@@ -95,6 +95,52 @@ donkey makemovie --tub=<tub_path> [--out=<tub_movie.mp4>] [--config=<config.py>]
* optional `--start` and/or `--end` can specify a range of frame numbers to use.
* `--scale` will cause the output image to be scaled by this amount

## Check Tub

This command allows you to see how many records are contained in any/all tubs. It will also open each record and ensure that the data is readable and intact. If not, it will allow you to remove corrupt records.

> Note: This should be moved from manage.py to donkey command

Usage:

```bash
donkey tubcheck <tub_path> [--fix]
```

* Run on the host computer or the robot
* It will print a summary of the record count and the channels recorded for each tub
* It will print the records that throw an exception while reading
* The optional `--fix` will delete records that have problems

## Augment Tub

This command allows you to perform data augmentation on a tub or set of tubs directly. The same augmentation is also available in training via the `--aug` flag. Preprocessing the tub can speed up training, since augmentation can take some time. You can also train with the unmodified tub and the augmented tub joined together.

Usage:

```bash
donkey tubaugment <tub_path> [--inplace]
```

* Run on the host computer or the robot
* The optional `--inplace` flag replaces the original tub images. Otherwise `tub_XX_YY-MM-DD` is copied to a new tub `tub_XX_aug_YY-MM-DD` and the original data remains unchanged
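The actual augmentations donkeycar applies are configured elsewhere; purely to illustrate the kind of transform involved, a minimal brightness augmentation on an image array could look like this sketch:

```python
import numpy as np

def augment_brightness(img_arr, factor):
    """Scale pixel intensities, clipping to the valid 0-255 range."""
    out = img_arr.astype(np.float32) * factor
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.full((120, 160, 3), 100, dtype=np.uint8)
brighter = augment_brightness(img, 1.5)
print(brighter[0, 0, 0])  # 100 * 1.5 = 150
```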


## Histogram

This command will show a pop-up window showing the histogram of record values in a given tub.

> Note: This should be moved from manage.py to donkey command

Usage:

```bash
donkey tubhist <tub_path> --rec=<"user/angle">
```

* Run on the host computer

* When the `--tub` is omitted, it will check all tubs in the default data dir
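Under the hood this amounts to collecting one field from every record and binning the values; a self-contained sketch with made-up records (the real command reads the tub and plots the result):

```python
import numpy as np

# hypothetical records, each holding a steering angle in [-1, 1]
records = [{'user/angle': a} for a in (-0.9, -0.2, 0.0, 0.1, 0.8, 0.9)]
values = [r['user/angle'] for r in records]

# bin the values exactly as a histogram plot would
counts, edges = np.histogram(values, bins=4, range=(-1.0, 1.0))
print(counts)
```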

## Plot Predictions

@@ -113,6 +159,55 @@ donkey tubplot <tub_path> [--model=<model_path>]
* Will show a pop-up window showing the plot of steering values in a given tub compared to NN predictions from the trained model
* When the `--tub` is omitted, it will check all tubs in the default data dir

## Continuous Rsync

This command uses rsync to copy files from your pi to your host. It does so in a loop, continuously copying files. By default, it will also delete any files
on the host that are deleted on the pi. This allows your PS3 Triangle edits to affect the files on both machines.

Usage:

```bash
donkey consync [--dir=<data_path>] [--delete=<y|n>]
```

* Run on the host computer
* First copy your public key to the pi so you don't need a password for each rsync:

```bash
cat ~/.ssh/id_rsa.pub | ssh pi@<your pi ip> 'cat >> .ssh/authorized_keys'
```

* If you don't have an `id_rsa.pub`, then google how to make one
* Edit your config.py and make sure the fields `PI_USERNAME`, `PI_HOSTNAME`, `PI_DONKEY_ROOT` are set up. Only on Windows do you need to set `PI_PASSWD`.
* This command may be run from `~/mycar` dir
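Conceptually, the loop just re-runs an `rsync` invocation like the one composed below (host, paths, and the exact flags are illustrative assumptions, not the precise command donkeycar builds):

```python
import shlex

def build_rsync_cmd(user, host, remote_dir, local_dir, delete=True):
    """Compose one rsync invocation of the kind consync runs in a loop."""
    cmd = ['rsync', '-aW', '--progress']
    if delete:
        # mirror deletions from the pi onto the host
        cmd.append('--delete')
    cmd += [f'{user}@{host}:{remote_dir}', local_dir]
    return cmd

cmd = build_rsync_cmd('pi', 'donkeypi.local', '~/mycar/data/', './data/')
print(shlex.join(cmd))
```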

## Continuous Train

This command starts the Keras training in a mode where it continuously looks for new data at the end of every epoch.

Usage:

```bash
donkey contrain [--tub=<data_path>] [--model=<path to model>] [--transfer=<path to model>] [--type=<linear|categorical|rnn|imu|behavior|3d>] [--aug]
```

* This command may be run from `~/mycar` dir
* Run on the host computer
* First copy your public key to the pi so you don't need a password for each rsync:

```bash
cat ~/.ssh/id_rsa.pub | ssh pi@<your pi ip> 'cat >> .ssh/authorized_keys'
```

* If you don't have an `id_rsa.pub`, then google how to make one
* Edit your config.py and make sure the fields `PI_USERNAME`, `PI_HOSTNAME`, `PI_DONKEY_ROOT` are set up. Only on Windows do you need to set `PI_PASSWD`.
* Optionally it can send the model file to your pi when it achieves a best loss. In config.py set `SEND_BEST_MODEL_TO_PI = True`.
* Your pi drive loop will autoload the weights file when it changes. This works best if the car is started with `.json` weights like:

```bash
python manage.py drive --model models/drive.json
```

## Joystick Wizard

This command line wizard will walk you through the steps to create a custom controller.
17 changes: 9 additions & 8 deletions donkeycar/parts/tub_v2.py
@@ -9,10 +9,11 @@


```python
class Tub(object):
    """
    A datastore to store sensor data in a key, value format. \n
    Accepts str, int, float, image_array, image, and array data types.
    """

    def __init__(self, base_path, inputs=[], types=[], metadata=[],
                 max_catalog_len=1000, read_only=False):
        self.base_path = base_path
```
@@ -28,15 +29,15 @@ def __init__(self, base_path, inputs=[], types=[], metadata=[],
```python
        if not os.path.exists(self.images_base_path):
            os.makedirs(self.images_base_path, exist_ok=True)

    def write_record(self, record=None):
        """
        Can handle various data types including images.
        """
        contents = dict()
        for key, value in record.items():
            if value is None:
                continue
            elif key not in self.input_types:
                continue
            else:
                input_type = self.input_types[key]
```
@@ -99,9 +100,9 @@ def _image_file_name(cls, index, key, extension='.jpg'):


```python
class TubWriter(object):
    """
    A Donkey part, which can write records to the datastore.
    """
    def __init__(self, base_path, inputs=[], types=[], metadata=[],
                 max_catalog_len=1000):
        self.tub = Tub(base_path, inputs, types, metadata, max_catalog_len)
```
34 changes: 29 additions & 5 deletions scripts/convert_to_tub_v2.py
@@ -22,29 +22,53 @@


```python
def convert_to_tub_v2(paths, output_path):
    empty_record = {'__empty__': True}

    if type(paths) is str:
        paths = [paths]
    legacy_tubs = [LegacyTub(path) for path in paths]
    output_tub = None
    print(f'Total number of tubs: {len(legacy_tubs)}')

    for legacy_tub in legacy_tubs:
        if not output_tub:
            # add input and type for empty records recording
            inputs = legacy_tub.inputs + ['__empty__']
            types = legacy_tub.types + ['boolean']
            output_tub = Tub(output_path, inputs, types,
                             list(legacy_tub.meta.items()))

        record_paths = legacy_tub.gather_records()
        bar = IncrementalBar('Converting', max=len(record_paths))

        previous_index = None
        for record_path in record_paths:
            try:
                contents = Path(record_path).read_text()
                record = json.loads(contents)
                image_path = record['cam/image_array']
                current_index = int(image_path.split('_')[0])
                image_path = os.path.join(legacy_tub.path, image_path)
                image_data = Image.open(image_path)
                record['cam/image_array'] = image_data
                # first record or they are continuous, just append
                if not previous_index or current_index == previous_index + 1:
                    output_tub.write_record(record)
                    previous_index = current_index
                # otherwise fill the gap with dummy records
                else:
                    # Skipping over previous record here because it has
                    # already been written.
                    previous_index += 1
                    # Adding empty record nodes, and marking them deleted
                    # until the next valid record.
                    while previous_index < current_index:
                        idx = output_tub.manifest.current_index
                        output_tub.write_record(empty_record)
                        output_tub.delete_record(idx)
                        previous_index += 1
                bar.next()
            except Exception as exception:
                print(f'Ignoring record path {record_path}\n', exception)
                traceback.print_exc()
```
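The gap-filling logic of the conversion script can be illustrated in isolation: whenever consecutive record indices have a hole, a placeholder is written for every missing index. A toy model of that behaviour (not the real `Tub` datastore):

```python
def fill_gaps(indices):
    """Return the written sequence: real indices, with '__empty__'
    placeholders standing in for every missing index in between."""
    written = []
    previous = None
    for current in indices:
        if previous is None or current == previous + 1:
            # first record, or continuous with the last one
            written.append(current)
        else:
            # fill the hole with dummy records, then the real one
            for _missing in range(previous + 1, current):
                written.append('__empty__')
            written.append(current)
        previous = current
    return written

print(fill_gaps([1, 2, 5, 6]))  # [1, 2, '__empty__', '__empty__', 5, 6]
```

This keeps record indices and image file names aligned in the converted tub, at the cost of a few deleted dummy records.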

