added support for multivariate ts, added unit tests
williamgilpin committed May 8, 2020
1 parent 9521ac5 commit 6b11634
Showing 9 changed files with 671 additions and 504 deletions.
21 changes: 16 additions & 5 deletions README.md
@@ -1,16 +1,18 @@
## FNN
## fnn

Embed complex time series using autoencoders and a loss function based on penalizing false-nearest-neighbors.
Embed univariate or multivariate time series using autoencoders with a loss function that penalizes false-nearest-neighbors.

This package also includes alternative embedding methods: time-delay embedding with the lag chosen by average mutual information, eigen-time-delay coordinates (ETD), and time-lagged independent component analysis (tICA).
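
For intuition, eigen-time-delay coordinates amount to a PCA/SVD of a time-delay (Hankel) matrix. The sketch below is an editorial illustration of that idea with hand-picked `lag`, `dim`, and `n_components`; it is not the implementation used in this package:

    import numpy as np

    def etd_coordinates(x, lag=1, dim=10, n_components=3):
        # Stack lagged copies of the 1D series x into a delay matrix of shape (n, dim)
        n = len(x) - (dim - 1) * lag
        delays = np.stack([x[i * lag : i * lag + n] for i in range(dim)], axis=1)
        delays = delays - delays.mean(axis=0)
        # Project the delay vectors onto their leading principal directions
        _, _, vt = np.linalg.svd(delays, full_matrices=False)
        return delays @ vt[:n_components].T  # (n, n_components) embedding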

![Schematic of approach](resources/fig_github.jpg)

For more information about the technique, please see the following reference. If using this code, please consider citing the paper.

> Gilpin, William. "Deep learning of dynamical attractors from time series measurements" 2020. [https://arxiv.org/abs/2002.05909](https://arxiv.org/abs/2002.05909)
> Gilpin, William. "Learning strange attractors from time series" 2020. [https://arxiv.org/abs/2002.05909](https://arxiv.org/abs/2002.05909)
# Requirements
# Installation

This library requires the following packages

+ Numpy
+ Scipy
@@ -19,6 +21,15 @@ For more information about the technique, please see the following reference. If
+ Matplotlib (for demos)
+ Jupyter Notebook (for demos)

To use this repository, directly download the source:

git clone https://github.com/williamgilpin/fnn

Test that everything is working:

python tests/test_models.py


# Usage

+ `demos.ipynb` shows the step-by-step process of constructing embeddings of the Lorenz attractor, experimental measurements of a double pendulum, a quasiperiodic torus, the Rössler attractor, and a high-dimensional chaotic ecosystem.
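
A minimal programmatic sketch of the same workflow (an editorial illustration: the synthetic signal, windowing, and hyperparameter values are assumptions, and the import assumes the repository root is on the Python path):

    import numpy as np
    from fnn.models import train_autoencoder

    # Toy quasiperiodic signal; any univariate series of shape (N,) works
    t = np.linspace(0, 200, 10010)
    x = np.sin(t) + np.sin(np.sqrt(2) * t)

    # Stack overlapping windows so X_train has shape (n_samples, time_window, 1)
    time_window = 10
    X_train = np.stack([x[i:i + time_window] for i in range(len(x) - time_window)])[..., None]

    enc, dec = train_autoencoder(X_train, network_type='lstm', time_window=time_window,
                                 num_hidden=10, lambda_latent=0.1, train_time=50)
    embedding = enc.predict(X_train, batch_size=100)  # latent coordinates, (n_samples, num_hidden)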
@@ -34,5 +45,5 @@ The folder `datasets` contains abridged versions of several time series datasets
+ `ecg_train.pkl` and `ecg_test.pkl` correspond to ECG measurements for two different patients, taken from the [PhysioNet QT database](https://physionet.org/content/qtdb/1.0.0/)
+ `mouse.pkl` contains a time series of spiking rates for a neuron in a mouse thalamus. Raw spike data was obtained from [CRCNS](http://crcns.org/data-sets/thalamus/th-1/about-th-1) and processed with the authors' code to generate a spike rate time series.
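
Each file can be loaded with the standard library; a minimal sketch (assuming, as an editorial guess, that each pickle holds a NumPy array; see `demos.ipynb` for the exact loading and preprocessing steps):

    import pickle

    with open("datasets/ecg_train.pkl", "rb") as f:
        ecg_train = pickle.load(f)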

Some functions used for baselines in this repository have been adapted from code other repositories. We have included these files here directly, in order to reduce dependencies. However, if using this code in future work, please heed licenses and attribute those libraries if using these components:
Some functions used for baselines in this repository have been adapted from code in other repositories. We have included these files here directly, in order to reduce dependencies. However, if using these portions of this code in future work, please heed their licenses and attribution requirements:
+ The file `tica.py` is a standalone version of the tICA implementation in [MSMBuilder](https://github.com/msmbuilder/msmbuilder)
20 changes: 11 additions & 9 deletions compare.ipynb

Large diffs are not rendered by default.

545 changes: 191 additions & 354 deletions demos.ipynb

Large diffs are not rendered by default.

125 changes: 57 additions & 68 deletions exploratory.ipynb

Large diffs are not rendered by default.

164 changes: 127 additions & 37 deletions fnn/models.py
@@ -3,29 +3,54 @@
"""
import tensorflow as tf


def train_autoencoder(X_train, network_type='lstm', learning_rate=1e-3, lambda_latent=0.0,
time_window=10, num_hidden=10, batch_size=100, random_seed=0,
verbose=0, train_time=200):
def train_autoencoder(X_train, network_type='lstm',
time_window=10, num_hidden=10, n_features=1,
lambda_latent=0.0, learning_rate=1e-3, batch_size=100, train_time=200,
verbose=0, random_seed=0, return_history=False):
"""
This is a helper function that captures some of the boilerplate code for constructing
and training an autoencoder (with default architecture)
and training an autoencoder (with default architecture).
Parameters
----------
- network_type : 'lstm' or 'mlp'
Whether to use an LSTM or a feedforward network (equiv. to a time-delay neural network)
- n_features : int
The dimensionality of a point in the time series
- time_window : int
The number of past timesteps to use for embedding
- num_hidden : int
The number of latent variables
- lambda_latent : float
The relative weight of the false-neighbors loss during training
- learning_rate : float
The learning rate to use for training
- batch_size : int
The number of samples in each training batch
- train_time : int
The number of training epochs
- verbose : int
The amount of detail to record during training
- random_seed : int
The seed used for random initialization of the network
- return_history : bool
Whether to return the training history with the trained models
"""

tf.random.set_seed(random_seed)

if network_type=='lstm':
enc, dec = enc_dec_lstm(time_window, 1, num_hidden,
enc, dec = enc_dec_lstm(time_window, n_features, num_hidden,
rnn_opts={'activation': None,
'batch_size': batch_size})
elif network_type=='mlp':
enc, dec = enc_dec_tdnn(time_window, 1, num_hidden,
enc, dec = enc_dec_tdnn(time_window, n_features, num_hidden,
rnn_opts={'activation': None,
'batch_size': batch_size})
else:
warnings.warn("Network type not recognized")

inp = tf.keras.layers.Input(shape=(time_window, 1))
inp = tf.keras.layers.Input(shape=(time_window, n_features))
code = enc(inp)
reconstruction = dec(code)
autoencoder = tf.keras.models.Model(inputs=inp, outputs=reconstruction)
@@ -35,6 +60,7 @@ def train_autoencoder(X_train, network_type='lstm', learning_rate=1e-3, lambda_l
autoencoder.compile(optimizer=tf.keras.optimizers.Adam(lr=learning_rate),
loss=loss_latent(code,batch_size,
lam_latent=lambda_latent),
metrics=[mse_loss],
experimental_run_tf_function=False)


@@ -43,6 +69,8 @@ def train_autoencoder(X_train, network_type='lstm', learning_rate=1e-3, lambda_l
epochs=train_time,
batch_size=batch_size,
verbose=verbose)
if return_history:
return enc, dec, train_history

return enc, dec
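
A minimal sketch of calling this function on a multivariate series, exercising the new `n_features` and `return_history` arguments (an editorial illustration; the synthetic data and hyperparameter values are assumptions, and the import assumes the repository root is on the Python path):

    import numpy as np
    from fnn.models import train_autoencoder

    X = np.random.randn(5010, 3)  # a three-dimensional time series
    windows = np.stack([X[i:i + 10] for i in range(len(X) - 10)])  # (5000, 10, 3)
    enc, dec, history = train_autoencoder(windows, network_type='lstm',
                                          time_window=10, n_features=3,
                                          num_hidden=10, lambda_latent=0.1,
                                          train_time=20, return_history=True)
    latent = enc.predict(windows, batch_size=100)  # (5000, num_hidden)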

@@ -52,26 +80,40 @@ def enc_dec_lstm(time_window, n_features, n_latent, hidden=[10], rnn_opts=dict()
activation_func=tf.keras.layers.ELU(alpha=1.0)):
#activation_func=tf.keras.activations.tanh):
"""
Shallow LSTM autoencoder
Build a one-layer LSTM autoencoder
activation_func = tf.keras.activations.softplus
Parameters
----------
- activation_func : function
The nonlinearity to apply to all layers. Try tf.keras.activations.relu
or tf.keras.activations.softplus
"""
enc = tf.keras.Sequential()
enc.add(tf.keras.layers.GaussianNoise(0.5)) # smooths the output
enc.add(tf.keras.layers.GaussianNoise(0.5)) # smooths the output

enc.add(tf.keras.layers.LSTM(n_latent, input_shape=(time_window, n_features),
return_sequences=False, **rnn_opts))
enc.add(
tf.keras.layers.LSTM(
n_latent,
input_shape=(time_window, n_features),
return_sequences=False,
**rnn_opts
)
)
enc.add(tf.keras.layers.BatchNormalization())
#enc.add(tf.keras.layers.Activation(activation_func))

# enc.add(tf.keras.layers.Activation(activation_func))

## Latent ##

dec = tf.keras.Sequential()
#dec.add(tf.keras.layers.BatchNormalization())
#dec.add(tf.keras.layers.Activation(activation_func))
# dec.add(tf.keras.layers.BatchNormalization())
# dec.add(tf.keras.layers.Activation(activation_func))
dec.add(tf.keras.layers.RepeatVector(time_window))
dec.add(tf.keras.layers.GaussianNoise(0.5))
dec.add(tf.keras.layers.LSTM(n_latent, return_sequences=True, go_backwards=True,
**rnn_opts))
dec.add(
tf.keras.layers.LSTM(
n_latent, return_sequences=True, go_backwards=True, **rnn_opts
)
)
dec.add(tf.keras.layers.BatchNormalization())
dec.add(tf.keras.layers.Activation(activation_func))
dec.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(n_features)))
@@ -81,7 +123,7 @@ def enc_dec_lstm(time_window, n_features, n_latent, hidden=[10], rnn_opts=dict()
def enc_dec_tdnn(time_window, n_features, n_latent, hidden=None, rnn_opts=dict(),
activation_func=tf.keras.layers.ELU(alpha=1.0)):
"""
timedelay-NN (not recurrent)
Build a time-delay neural network (a non-stateful architecture)
"""
if not hidden:
hidden = [time_window, time_window]
@@ -105,7 +147,7 @@ def enc_dec_tdnn(time_window, n_features, n_latent, hidden=None, rnn_opts=dict()

enc.add(tf.keras.layers.Reshape((n_latent,)))


## Latent ##

dec = tf.keras.Sequential()
dec.add(tf.keras.layers.Flatten())
@@ -120,48 +162,94 @@ def enc_dec_tdnn(time_window, n_features, n_latent, hidden=None, rnn_opts=dict()
dec.add(tf.keras.layers.Activation(activation_func))


dec.add(tf.keras.layers.Dense(n_latent, **rnn_opts))
dec.add(tf.keras.layers.Dense(time_window*n_features, **rnn_opts))
dec.add(tf.keras.layers.BatchNormalization())
dec.add(tf.keras.layers.Activation(activation_func))

dec.add(tf.keras.layers.Reshape((time_window, n_features)))

return enc, dec

def enc_dec_rnn(time_window, n_features, n_latent, hidden=None, rnn_opts=dict(),
activation_func=tf.keras.layers.ELU(alpha=1.0)):
"""
Build a two-layer vanilla RNN autoencoder
"""

if not hidden:
hidden = [time_window]

enc = tf.keras.Sequential()
enc.add(tf.keras.layers.GaussianNoise(0.5)) # smooths the output

enc.add(tf.keras.layers.SimpleRNN(hidden[0],
input_shape=(time_window, n_features),
return_sequences=True,
**rnn_opts))
enc.add(tf.keras.layers.BatchNormalization())
enc.add(tf.keras.layers.Activation(activation_func))

enc.add(tf.keras.layers.SimpleRNN(n_latent,
return_sequences=False,
**rnn_opts))
enc.add(tf.keras.layers.BatchNormalization())


dec = tf.keras.Sequential()
dec.add(tf.keras.layers.RepeatVector(time_window))
dec.add(tf.keras.layers.GaussianNoise(0.5))
dec.add(tf.keras.layers.SimpleRNN(n_latent,
return_sequences=True,
go_backwards=True,
**rnn_opts))
dec.add(tf.keras.layers.BatchNormalization())
dec.add(tf.keras.layers.Activation(activation_func))
dec.add(tf.keras.layers.SimpleRNN(hidden[0],
return_sequences=True,
go_backwards=True,
**rnn_opts))
dec.add(tf.keras.layers.BatchNormalization())
dec.add(tf.keras.layers.Activation(activation_func))
dec.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(n_features)))

return enc, dec
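
Since `train_autoencoder` above only exposes the `'lstm'` and `'mlp'` architectures, this RNN variant has to be wired up by hand. A minimal sketch mirroring the pattern used in `train_autoencoder`, assuming the names defined in this module are in scope (shapes and hyperparameter values are editorial assumptions):

    time_window, n_features, batch_size = 10, 3, 100
    enc, dec = enc_dec_rnn(time_window, n_features, n_latent=10,
                           rnn_opts={'activation': None, 'batch_size': batch_size})
    inp = tf.keras.layers.Input(shape=(time_window, n_features))
    code = enc(inp)
    autoencoder = tf.keras.models.Model(inputs=inp, outputs=dec(code))
    autoencoder.compile(optimizer=tf.keras.optimizers.Adam(lr=1e-3),
                        loss=loss_latent(code, batch_size, lam_latent=0.1),
                        metrics=[mse_loss],
                        experimental_run_tf_function=False)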

###------------------------------------###
#
#
# SCRATCH : Testing models and losses
# Losses and Analysis Functions
#
#
###------------------------------------###

def loss_latent(latent, batch_size, lam_latent=1.0):
"""
Build a custom loss function that includes layer terms, etc
Does the covariance loss get summed over the batch?
models : list of keras.Sequential() models
Build a custom loss function that keras.compile will accept.
Parameters
----------
- latent : tensor of shape (B x T x L)
A batch of latent activations
- batch_size : int
The expected batch size
- lam_latent : float
The relative weight of the fnn regularizer. Expressed relative to the
standard autoencoder reconstruction loss, which has constant weight 1.0
Returns
-------
- loss : function
A keras-friendly loss function that takes two batches of labels as arguments
"""
@tf.function
def loss(y_true, y_pred):
## first term has shape (batch, lookback), do we really want to flatten it to just be (batch,)?
## can avoid by increasing dimensionality of last term. the grad wrt
"""Loss function generated automatically by loss_latent()"""
total_loss = tf.reduce_mean(tf.keras.losses.mean_squared_error(y_pred, y_true), axis=1)
total_loss += lam_latent*loss_false(latent, batch_size=batch_size)
return total_loss

return loss

###------------------------------------###
#
#
# Losses and Analysis Functions
#
#
###------------------------------------###

@tf.function
def loss_false(code_batch, batch_size=1, k=None):
"""
@@ -174,7 +262,7 @@ def loss_false(code_batch, batch_size=1, k=None):
The encoded inputs, with dimensionality L
- batch_size : int
The batch size. For some reason, this has to be passed manually
due to a cryptic bug
due to a bug on keras' end
- k : int
DEPRECATED. The number of nearest neighbors to use to compute
neighborhoods. Right now this is set to a constant, since it doesn't
@@ -249,3 +337,5 @@ def mse_loss(y_true, y_pred):
The mean squared error loss between observed and true labels
"""
return tf.reduce_mean(tf.keras.losses.mean_squared_error(y_pred, y_true), axis=1)

