[REVIEW] Creating dedicated loader submodule to build TF async data loader (#224)

* adding tensorflow example stuff

* getting workflow working

* training of both workflows works

* notebook updates and adding image from run

* updating workflow for nightly tf build

* Create dummy.txt

* Add files via upload

* Delete dummy.txt

* adding tensorflow example stuff

* getting workflow working

* training of both workflows works

* notebook updates and adding image from run

* adding root Dockerfile

* updating root build for 2.3 rc1

* updating Dockerfile for tf 2.3-rc1 and filling out notebook

* updating throughput curves in README

* moving dlrm-train

* cleaning up notebook and layers code, adding cupti symlink to Dockerfile

* getting rid of modprobe install in Dockerfile

* playing with requirements

* updating for tf 2.3 full release

* updating notebook

* removing old Dockerfiles, updating environment and README and finishing example notebook

* removing old images

* consolidating data loading code

* cleaning up and blackening

* finished separating loader code

* adding fixed Dockerfile

* getting tf data loading running

* blackening

* fixing bug in torch loader

* applying isort fixes

* isort fixes

* ironing out data loaders

* creating parent dataloader class

* playing with thread safe iteration

* small change

* moving tensoritr loop into asynciterator

* fixing syntax error

* debugging iter issues

* fixing generator issues

* cleaning up backend code

* got torch data loader working

* working out tf missing gradient issues

* working on gradient issues

* reformatting loader backend to use only 2 classes

* undoing changes really quick

* backend changes

* getting tf dataloader working

* trying out tensor y

* tf data loader working

* undoing some testing changes to Tensorflow

* rerunning tf example for checks

* updating tests

* blackening

* blackening

* fixing dataloader bench bug

* fixing unused variables

* isort fixes

* adding qsize to chunkedbuffer

* fixed typo in backend

* simplifying and updating DataLoader

* updating dataloader backend

* trying new async scheme

* got new implementation working

* cleaning up

* blackening

* fixing bugs

* updating wait time

* isort fixes

* minor aesthetic change

* merging upstream changes

* bug fixes

* trying to update examples

* adding custom validation callback

* got examples working

* blackening

* fixing bug and documenting

* getting criteo most of the way through

* rearranging and adding checks

* adding proper torch documentation

* documenting and blackening

* remove trailing whitespace

* updating tests

* changing cat and cont defaults to empty lists and including checks

* updating TF example notebook

* adding PARTS_PER_CHUNK to criteo example

* adding tf config changes

* fixing tf unit tests

* blackening

* fixed tf_util bug

* fixing tf_utils bug

* blackening

* blackening

* fixing bug in loader backend

* tests passing

* blackening

* updating rossmann notebook test

* Fix cupy device errors

Co-authored-by: Alec Gunny <agunny@nvidia.com>
Co-authored-by: Ben Frederickson <github@benfrederickson.com>
3 people authored Aug 27, 2020
1 parent c0356a3 commit 7f9c780
Showing 19 changed files with 2,148 additions and 1,467 deletions.
12 changes: 12 additions & 0 deletions .dockerignore
@@ -1,5 +1,17 @@
.git
.dockerignore

# ignore hidden directories created
# by RAPIDS libs
**/.cupy
**/.nv
**/.python_history
**/.cache
**/.config
**/.local

# ignore any local files created
# by examples notebooks
examples/tensorflow/logs
examples/tensorflow/docker/Dockerfile*
examples/tensorflow/.*