Commit

replace pkl with csv
williamgilpin committed Jul 29, 2020
1 parent 8473ad2 commit 944272a
Showing 28 changed files with 40 additions and 42 deletions.
17 changes: 10 additions & 7 deletions README.md
@@ -41,14 +41,17 @@ Test that everything is working:

A great summary of the work in this repository, and the broader topic, has been written by Sigrid Keydana [on the RStudio blog](https://blogs.rstudio.com/ai/posts/2020-06-24-deep-attractors/). The post includes an R implementation of the fnn regularizer.

Some functions used for baselines in this repository have been adapted from code in other repositories. We have included these files here directly, in order to reduce dependencies. However, if using these portions of this code in future work, please heed their licenses and attribution requirements:
+ The file `tica.py` is a standalone version of the tICA implementation in [MSMBuilder](https://github.com/msmbuilder/msmbuilder)
In order to provide a baseline embedding technique for comparison, the file `tica.py` has been extracted and modified from the molecular dynamics suite [MSMBuilder](https://github.com/msmbuilder/msmbuilder). We include the modified file directly, in order to reduce dependencies. If using tICA in any work, please provide attribution to the original MSMBuilder authors and repository.

# Datasets

The folder `datasets` contains abridged versions of several time series datasets used for testing and evaluating the code. We summarize these files, and provide their original sources, here:
+ `geyser_train_test.pkl` corresponds to detrended temperature readings from the main runoff pool of the Old Faithful geyser in Yellowstone National Park, downloaded from the [GeyserTimes database](https://geysertimes.org/). Temperature measurements start on April 13, 2015 and occur in one-minute increments.
+ `electricity_train_test.pkl` corresponds to average power consumption by 321 Portuguese households between 2012 and 2014, in units of kilowatts consumed in fifteen minute increments. This dataset is from the [UCI machine learning database](http://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014).
+ `pendulum_train.pkl` and `pendulum_test.pkl` correspond to two different double pendulum experiments, taken from a series of experiments by [Asseman et al.](https://developer.ibm.com/exchanges/data/all/double-pendulum-chaotic/). In Asseman et al.'s original study, pendula were filmed, and the $(x,y)$ positions of centroids were detected. Here, we have converted the dataset into canonical Hamiltonian coordinates $(\theta_1, \theta_2, \dot\theta_1, \dot\theta_2)$.
+ `ecg_train.pkl` and `ecg_test.pkl` correspond to ECG measurements for two different patients, taken from the [PhysioNet QT database](https://physionet.org/content/qtdb/1.0.0/).
+ `mouse.pkl` is a time series of spiking rates for a neuron in a mouse thalamus. Raw spike data was obtained from [CRCNS](http://crcns.org/data-sets/thalamus/th-1/about-th-1) and processed with the authors' code in order to generate a spike rate time series.
+ `geyser_train_test.csv` corresponds to detrended temperature readings from the main runoff pool of the Old Faithful geyser in Yellowstone National Park, downloaded from the [GeyserTimes database](https://geysertimes.org/). Temperature measurements start on April 13, 2015 and occur in one-minute increments.
+ `electricity_train_test.csv` corresponds to average power consumption by 321 Portuguese households between 2012 and 2014, in units of kilowatts consumed in fifteen minute increments. This dataset is from the [UCI machine learning database](http://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014).
+ `pendulum_train.csv` and `pendulum_test.csv` correspond to two different double pendulum experiments, taken from a series of experiments by [Asseman et al.](https://developer.ibm.com/exchanges/data/all/double-pendulum-chaotic/). In Asseman et al.'s original study, pendula were filmed, and the $(x,y)$ positions of centroids were detected. Here, we have converted the dataset into canonical Hamiltonian coordinates $(\theta_1, \theta_2, \dot\theta_1, \dot\theta_2)$.
+ `ecg_train.csv` and `ecg_test.csv` correspond to ECG measurements for two different patients, taken from the [PhysioNet QT database](https://physionet.org/content/qtdb/1.0.0/).
+ `mouse.csv` is a time series of spiking rates for a neuron in a mouse thalamus. Raw spike data was obtained from [CRCNS](http://crcns.org/data-sets/thalamus/th-1/about-th-1) and processed with the authors' code in order to generate a spike rate time series.
+ `roaming_worm1.csv` and `dwelling_worm1.csv` are time series of the first five principal components of C. elegans body curvature during crawling, taken from [Ahamed et al 2019](https://www.biorxiv.org/content/10.1101/827535v1).
+ `gait_marker_trackers_patient1_speed1.csv` and `gait_force_patient1_speed1.csv` are marker positions and force recordings for a patient running on a treadmill, from [the GaitPhase database](https://www.mad.tf.fau.de/research/activitynet/gaitphase-database/).
+ `accelerometer_subject1.csv` contains smartphone accelerometer recordings of a walking individual, taken from [Vajdi et al 2019](https://arxiv.org/abs/1905.03109).
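
The README entries above name `.csv` files, while the files added in this commit carry a `.csv.gz` extension. As a minimal sketch of how such a file might be read using only the Python standard library: the helper name `load_csv_gz` and its `skip_header` flag are hypothetical, and the assumption that every field is numeric is ours, since this diff does not document the column layout or header conventions of the real files.

```python
import csv
import gzip


def load_csv_gz(path, skip_header=False):
    """Read a gzip-compressed CSV file into a list of rows of floats.

    Hypothetical helper: assumes every field is numeric. The column
    layout of the repository's datasets is not documented in this diff.
    """
    rows = []
    # Mode "rt" decompresses and decodes to text in one step.
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle)
        if skip_header:
            next(reader, None)  # discard the first row if it holds column names
        for record in reader:
            if record:  # csv.reader yields [] for blank lines; skip them
                rows.append([float(value) for value in record])
    return rows
```

A call such as `load_csv_gz("datasets/geyser_train_test.csv.gz")` would then return one list per row, provided the file matches the all-numeric assumption; pass `skip_header=True` if the first row turns out to hold column names.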

2 changes: 1 addition & 1 deletion compare.ipynb
@@ -25,7 +25,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
"2.4.0-dev20200727\n"
]
}
],
Binary file added datasets/accelerometer_subject1.csv.gz
Binary file not shown.
Binary file added datasets/accelerometer_subject3.csv.gz
Binary file not shown.
Binary file added datasets/dwelling_worm1.csv.gz
Binary file not shown.
Binary file added datasets/dwelling_worm2.csv.gz
Binary file not shown.
Binary file added datasets/ecg_test.csv.gz
Binary file not shown.
Binary file removed datasets/ecg_test.pkl
Binary file not shown.
Binary file added datasets/ecg_train.csv.gz
Binary file not shown.
Binary file removed datasets/ecg_train.pkl
Binary file not shown.
Binary file added datasets/electricity_train_test.csv.gz
Binary file not shown.
Binary file removed datasets/electricity_train_test.pkl
Binary file not shown.
Binary file added datasets/gait_force_patient1_speed1.csv.gz
Binary file not shown.
Binary file added datasets/gait_force_patient1_speed2.csv.gz
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added datasets/geyser_train_test.csv.gz
Binary file not shown.
Binary file removed datasets/geyser_train_test.pkl
Binary file not shown.
Binary file added datasets/mouse.csv.gz
Binary file not shown.
Binary file removed datasets/mouse.pkl
Binary file not shown.
Binary file added datasets/pendulum_test.csv.gz
Binary file not shown.
Binary file removed datasets/pendulum_test.pkl
Binary file not shown.
Binary file added datasets/pendulum_train.csv.gz
Binary file not shown.
Binary file removed datasets/pendulum_train.pkl
Binary file not shown.
Binary file added datasets/roaming_worm1.csv.gz
Binary file not shown.
Binary file added datasets/roaming_worm2.csv.gz
Binary file not shown.
21 changes: 9 additions & 12 deletions demos.ipynb

Large diffs are not rendered by default.

42 changes: 20 additions & 22 deletions exploratory.ipynb

Large diffs are not rendered by default.
