
CNN-RNNs not learning anything #39

Open
kren1 opened this issue Dec 3, 2015 · 7 comments

kren1 commented Dec 3, 2015

@eadward I've now done the CNN-RNN approach discussed in the meeting.

It is behaving very similarly to the subject-independent RNN approach, meaning it is not learning anything.

For example here is a training trace:

Train on 10721 samples, validate on 10410 samples
Epoch 1/15
10721/10721 [==============================] - 56s - loss: 0.9948 - val_loss: 1.5181
Epoch 2/15
10721/10721 [==============================] - 56s - loss: 0.9000 - val_loss: 1.4525
Epoch 3/15
10721/10721 [==============================] - 55s - loss: 0.8459 - val_loss: 1.4802
Epoch 4/15
10721/10721 [==============================] - 55s - loss: 0.7673 - val_loss: 1.4254
Epoch 5/15
10721/10721 [==============================] - 55s - loss: 0.7109 - val_loss: 1.4720
Epoch 6/15
10721/10721 [==============================] - 55s - loss: 0.7307 - val_loss: 1.4763
Epoch 7/15
10721/10721 [==============================] - 55s - loss: 0.6144 - val_loss: 1.4643
Epoch 8/15
10721/10721 [==============================] - 54s - loss: 0.5660 - val_loss: 1.4857

#Final result on validation set: 
Model r:  (0.044959106007683075, 4.4554423862127374e-06)
Model rmse:  1.2189017562850681


#Final result on test set:
Model r:  (-0.0059427638053850106, 0.48934621677575696)
Model rmse:  1.3999194428042123

The loss (root mean squared error in this case) keeps decreasing, but the training has no effect on the validation-set performance. This is exactly what happens with the direct RNN approach.
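For context on the metrics above: the `Model r` pairs look like `scipy.stats.pearsonr` output, i.e. (correlation coefficient, two-sided p-value) — that is an assumption, the thread does not say which function produced them. A minimal pure-Python sketch of both metrics:

```python
from math import sqrt

def pearson_r(xs, ys):
    # sample Pearson correlation coefficient
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rmse(pred, true):
    # root mean squared error, the loss used in this thread
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))
```

An r near 0 with a large RMSE, as on the validation and test sets above, means the predictions carry essentially no signal about the targets.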

The architecture of the network is below (simplified Keras syntax for clarity); the shape comments give the tensor shape at that point:

    # shape (n_images, frequencies, time)
    # shape (1,200,158)

    model.add(Convolution2D(16,in_shape[1],5))  # 1D convolutions on the time axis with filter size 5
    model.add(MaxPooling2D((1,2)))

    # shape (16,1,77)
    model.add(Convolution2D(32,1,10))
    model.add(MaxPooling2D((1,2)))


    # shape (32,1,34)
    model.add(Reshape(dims=(...)))
    #shape (32,34)
    model.add(Permute((2,1)))
    #shape (time, features)
    #(34,32)
    model.add(LSTM(200)) 

    model.add(Dense(1))
    model.add(Activation('linear'))
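As a sanity check, the time-axis lengths in the shape comments above can be reproduced with 'valid' convolution and pooling arithmetic (a quick sketch, assuming no padding and stride-1 convolutions, which matches the Keras defaults used here):

```python
def conv_valid(length, kernel):
    # output length of a 'valid' (no padding) convolution
    return length - kernel + 1

def pool(length, size):
    # output length of non-overlapping max pooling
    return length // size

t = 158                 # time steps in the input spectrogram
t = conv_valid(t, 5)    # Convolution2D(16, in_shape[1], 5) -> 154
t = pool(t, 2)          # MaxPooling2D((1, 2)) -> 77
assert t == 77
t = conv_valid(t, 10)   # Convolution2D(32, 1, 10) -> 68
t = pool(t, 2)          # MaxPooling2D((1, 2)) -> 34
```

So the LSTM sees a sequence of 34 time steps with 32 features each, consistent with the `(34, 32)` comment.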
kren1 commented Dec 3, 2015

https://docs.google.com/spreadsheets/d/1ZWiciyp-fRBzmOj5dewdBBrCnttDqPFTo8_iuCU8VF0/edit#gid=0&vpid=A1 a google spreadsheet of some of the runs


kren1 commented Dec 3, 2015

This is the subject-dependent case of training. I will update the table with full details once the run is finished.

Train on 15819 samples, validate on 5272 samples
Epoch 1/15
15819/15819 [==============================] - 47s - loss: 1.0041 - val_loss: 0.9244
Epoch 2/15
15819/15819 [==============================] - 47s - loss: 0.9064 - val_loss: 0.8358
Epoch 3/15
15819/15819 [==============================] - 47s - loss: 0.8438 - val_loss: 0.8319
Epoch 4/15
15819/15819 [==============================] - 47s - loss: 0.8235 - val_loss: 0.7711
Epoch 5/15
15819/15819 [==============================] - 47s - loss: 0.7947 - val_loss: 0.7190
Epoch 6/15
15819/15819 [==============================] - 47s - loss: 0.7778 - val_loss: 0.8233
Epoch 7/15
15819/15819 [==============================] - 47s - loss: 0.7511 - val_loss: 0.6501
Epoch 8/15
15819/15819 [==============================] - 47s - loss: 0.7112 - val_loss: 0.6251
Epoch 9/15
15819/15819 [==============================] - 47s - loss: 0.6967 - val_loss: 0.6696
Epoch 10/15
15819/15819 [==============================] - 47s - loss: 0.6733 - val_loss: 0.6607
Epoch 11/15
15819/15819 [==============================] - 47s - loss: 0.6539 - val_loss: 0.6705
Epoch 12/15
15819/15819 [==============================] - 47s - loss: 0.6369 - val_loss: 0.5782
Epoch 13/15
15819/15819 [==============================] - 47s - loss: 0.6498 - val_loss: 0.5720
Epoch 14/15
15819/15819 [==============================] - 47s - loss: 0.6290 - val_loss: 0.6046
Epoch 15/15
15819/15819 [==============================] - 47s - loss: 0.6116 - val_loss: 0.5938
Model r:  (0.65146763812233133, 0.0)
Model rmse:  0.77059348390805

It behaves exactly the same as the pure RNN approach; I will get solid data for that as well at some point.


eadward commented Dec 7, 2015

Hi Timotej,

Sorry for the slow reply but I’ve been sick.

Without changing more parameters it is a bit hard to tell. Usually this is the key to working with NNs, hence my recurring suggestion to document all the tests and parameters used. The sequences that the LSTM receives are quite short, and sometimes LSTMs do not work well that way. Also, you are using 200 units, which may be a little too much considering that you only have 34 input features. Try using 20.
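The suggestion to shrink the LSTM can be motivated by parameter counts; a back-of-the-envelope sketch using the standard LSTM parameter formula (four gates, each with an input kernel, a recurrent kernel, and a bias — the layer sizes are from the thread, the formula is a general one, not something the library reports here):

```python
def lstm_param_count(input_dim, units):
    # 4 gates x (input kernel + recurrent kernel + bias)
    return 4 * (input_dim * units + units * units + units)

big = lstm_param_count(32, 200)    # LSTM(200) on 32 features
small = lstm_param_count(32, 20)   # the suggested LSTM(20)
```

The 200-unit layer has roughly 44x the parameters of the 20-unit one, on only ~10k training sequences of 34 steps, which is one plausible reason it memorises the training set without generalising.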

Is the RMSE computed on the standardised outputs? If not, it actually looks quite low (an error of 1.5 BPM is very low). Can you plot the predictions for training, validation and test?

Cheers,
E>



eadward commented Dec 7, 2015

Please send as attachment or add to github.
E.



eadward commented Dec 7, 2015

So, both the RNN and the CNN+RNN work similarly for the subject-dependent case, right? Make sure that you report this. Make sure that you train a model on a single subject (1 male and 1 female). In the meantime let us try to solve the independent folds issue: are you sure that there is no error in the data preparation? E.
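One way to set up the single-subject runs suggested here could be a simple filter by subject id before training; a sketch under the assumption that each sample is tagged with its subject (the `(subject_id, features, heart_rate)` format is hypothetical — the thread does not show the real data structures):

```python
# hypothetical sample format: (subject_id, features, heart_rate)
samples = [
    ("s01_male", [0.1, 0.2], 72.0),
    ("s02_female", [0.3, 0.1], 81.0),
    ("s01_male", [0.0, 0.4], 70.0),
]

def single_subject(samples, subject_id):
    # keep only one subject's recordings for a subject-dependent run
    return [s for s in samples if s[0] == subject_id]

train = single_subject(samples, "s01_male")
```

Training and evaluating on one male and one female subject separately, as suggested, would show whether the model can fit an individual at all before worrying about cross-subject generalisation.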


kren1 commented Dec 7, 2015

@eadward They also work similarly (badly) for the subject-independent model.

I'm not sure there is no error in the data preparation, but I have no idea how to test it.


eadward commented Dec 7, 2015

I assume that there are no errors then, as you are using the same scripts for the other tests. Have you checked whether you have gender-balanced folds and also whether the HR distributions for the various folds are similar? A male/female imbalance could be a problem, and if the HR distributions in each fold are very different, even worse. E.
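A quick way to check the fold-balance questions raised here is to compute per-fold summary statistics; a sketch with a hypothetical fold layout (the `(gender, heart_rate)` tuples and fold names are illustrative, not the project's real format):

```python
# hypothetical folds: a list of (gender, heart_rate) samples per fold
folds = {
    "fold_a": [("m", 72.0), ("f", 80.0), ("m", 68.0)],
    "fold_b": [("f", 75.0), ("f", 90.0), ("m", 70.0)],
}

def fold_stats(samples):
    # male fraction plus mean/std of heart rate for one fold
    hrs = [hr for _, hr in samples]
    mean = sum(hrs) / len(hrs)
    var = sum((h - mean) ** 2 for h in hrs) / len(hrs)
    male_frac = sum(1 for g, _ in samples if g == "m") / len(samples)
    return {"male_frac": male_frac, "hr_mean": mean, "hr_std": var ** 0.5}

for name, samples in folds.items():
    print(name, fold_stats(samples))
```

If the male fraction or the HR mean/std differ sharply between folds, the subject-independent split itself could explain the flat validation loss.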
