pickle.dump gives SystemError #8
Comments
Thanks for the report! Could you please provide more details? What are you trying to run?
I got this running nnpcr.py. This is the error message: I had 10k positive / 10k negative images. While googling I found only one solution: run the same script on Python 3. It took me a while, but there is no error message anymore. I have one more question. I see that this is set to a constant number: should this represent the size of the training set?
Seems like this dataset is too large to fit into the in-memory cache. You can comment out line 114 (
No (at least not directly). But you can try setting different numbers here. If
BTW, what accuracy do you get with your dataset? Could you share your model (or how you gathered it)?
Didn't get over 80% yet. Looking for a better dataset. Basically I aim to train it to recognize photos which are not suitable for advertisement networks, so even minor nudity is not accepted here. Should I train it only with photos of people, or should I provide all kinds of different samples as negatives? Do photo dimensions matter, or can I collect photos with a smaller resolution? At the final stage I hope I can use this script to go through 20 million photos on my website and mark which ones are not suited for showing advertisements. I am collecting data samples by crawling Reddit, so I could gather huge datasets, even hundreds of thousands of images.
Not only people - just arbitrary images. The more, the better.
Currently all images are converted to 128x128. You could try smaller ones; they will be upscaled (see the resizing sketch below).
Crawling Reddit is a good idea, maybe I'll try it too later.
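A minimal sketch of the kind of resizing described above, assuming OpenCV (`cv2`) is used for image loading; the function name and paths here are illustrative, not taken from nnpcr.py:

```python
import cv2  # assumed image library

IMAGE_SIZE = 128  # the constant discussed above


def load_image(path, size=IMAGE_SIZE):
    """Read an image from disk and resize it to size x size pixels.

    Smaller images are upscaled and larger ones downscaled, so mixed
    source resolutions are fine.
    """
    img = cv2.imread(path)
    if img is None:
        raise IOError("could not read %s" % path)
    return cv2.resize(img, (size, size))
```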
Would it be smart to double the image size? Would it be enough to change this constant? Must positive and negative data samples be the same size?
It's not that simple - you also need to change the architecture of the neural network, e.g. add additional conv & max_pool layers (a rough sketch follows below).
Yep.
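For illustration only, here is a rough sketch of what one additional conv + max_pool stage might look like in TensorFlow 1.x-style code when the input is doubled to 256x256; the variable names, filter sizes, and channel counts are assumptions, not the repository's actual layers:

```python
import tensorflow as tf  # TensorFlow 1.x-style API, as used at the time of this thread

# Stand-in for the output of the last existing pooling layer (shape assumed).
pool3 = tf.placeholder(tf.float32, [None, 32, 32, 64])

# One extra conv + max_pool stage halves the spatial resolution again, so a
# 256x256 input ends up with the same feature-map size that 128x128 had before.
W_conv4 = tf.Variable(tf.truncated_normal([5, 5, 64, 64], stddev=0.1))
b_conv4 = tf.Variable(tf.constant(0.1, shape=[64]))
conv4 = tf.nn.relu(tf.nn.conv2d(pool3, W_conv4,
                                strides=[1, 1, 1, 1], padding='SAME') + b_conv4)
pool4 = tf.nn.max_pool(conv4, ksize=[1, 2, 2, 1],
                       strides=[1, 2, 2, 1], padding='SAME')
```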
Is this enough for a batch of 60K training samples?
Not sure. Try increasing it to 3K and check whether quality is better or worse than with 1.5K.
It really takes a lot of RAM. I could only run a 40k sample on 32 GB of RAM. I will try 80k on 64 GB tomorrow. My model size always ends up being 13 MB, is this right? I still can't get accuracy over 80%.
Seems like it is currently not optimized for large datasets. Right now it loads the whole dataset into memory; working with the dataset needs to be fixed (one possible workaround is sketched below).
The current network architecture is not very complex - maybe you should try more complicated architectures, e.g. Inception. You can also try the following:
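On the memory point above: one way to avoid holding the whole dataset in RAM is to read and decode images in batches from disk with a generator. A minimal sketch, where the directory layout, label mapping, and helper name are assumptions rather than nnpcr's actual API:

```python
import os
import random

import cv2
import numpy as np


def batch_generator(image_dir, labels, batch_size=64, size=128):
    """Yield (images, labels) batches read from disk on demand,
    so only one batch needs to be held in memory at a time.

    `labels` is assumed to map file names to 0/1 class labels.
    """
    names = list(labels.keys())
    random.shuffle(names)
    for start in range(0, len(names), batch_size):
        chunk = names[start:start + batch_size]
        images = [cv2.resize(cv2.imread(os.path.join(image_dir, name)), (size, size))
                  for name in chunk]
        yield (np.array(images, dtype=np.float32) / 255.0,
               np.array([labels[name] for name in chunk]))
```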
By the way, in the new TensorFlow library you have to call
like this, otherwise you will get an error. I'll test with all those different parameters. Also I get these warnings:
Which line is for the pre-output layer?
Haven't ported to the new version yet. Will do it soon.
I got the best results with this network, but accuracy didn't go over 93%. I ran 80 epochs. It is giving me pretty good results in real life, but I got curious to see its predictions, so I checked what softmax is returning for me, and the result wasn't good. It is usually 1, or very close to 1 or 0. It shouldn't be like this. What do you think my mistake was? Is the neural network too big for 128x128 pixels, or is a 64k data batch too small for such a big network?
Sorry, I can't understand what the problem is. 93% is rather good accuracy. The whole dataset is split into a train (80%) and a test (20%) set, and accuracy is calculated over the test set. So if you have 93% accuracy, the same accuracy should hold in real life too.
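For context, the 80/20 split described above can be reproduced with something like the following sketch (assuming scikit-learn is available; nnpcr may implement the split differently):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for the real data: 1000 flattened images with binary labels.
X = np.random.rand(1000, 128 * 128 * 3).astype(np.float32)
y = np.random.randint(0, 2, size=1000)

# 80% train / 20% test; accuracy is then reported on the held-out test set only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
print(len(X_train), len(X_test))  # 800 200
```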
No, the problem, I think, is with how confident the results that softmax returns are. My guess is that the results are too dense, but I'm new to this field, as you can see; I just learnt a lot in a couple of weeks. If I output what softmax returns, it gives me numbers around 1.00 / 0.00 or 0.9999 / 0.0001, something along those lines. I'm just curious why my neural network is so confident in its results: should the probability ever go to 100%, and in most cases be between 99-100%? One more question: would it be practical to add a second fully connected layer on top of the fully connected layer, add another dropout, and only then retrieve the 2 final classes? I also played with different kernel sizes; it didn't give me any effect, just slowed down my training. AdamOptimizer gave me quite an improvement in results. Just happy to share what I've learnt.
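A rough sketch of the second fully connected layer plus dropout asked about above, in TensorFlow 1.x style; all variable names and layer sizes here are hypothetical, not the thread author's actual network:

```python
import tensorflow as tf

# Stand-ins for the existing first fully connected layer (after dropout)
# and the dropout keep-probability placeholder.
fc1_drop = tf.placeholder(tf.float32, [None, 1024])
keep_prob = tf.placeholder(tf.float32)

# Second fully connected layer with its own dropout.
W_fc2 = tf.Variable(tf.truncated_normal([1024, 256], stddev=0.1))
b_fc2 = tf.Variable(tf.constant(0.1, shape=[256]))
fc2 = tf.nn.relu(tf.matmul(fc1_drop, W_fc2) + b_fc2)
fc2_drop = tf.nn.dropout(fc2, keep_prob)

# Final 2-class output; softmax is applied on top of these logits.
W_out = tf.Variable(tf.truncated_normal([256, 2], stddev=0.1))
b_out = tf.Variable(tf.constant(0.1, shape=[2]))
logits = tf.matmul(fc2_drop, W_out) + b_out
probabilities = tf.nn.softmax(logits)
```

Whether this helps is an empirical question; near-certain softmax outputs like 0.9999 / 0.0001 can also simply reflect the well-known tendency of deep networks to be overconfident (poorly calibrated) rather than a bug in the model.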
pickle.dump gives me a SystemError if I try a larger batch of images. I thought it was a memory issue, so I tried the same batch on a 32 GB RAM machine; same error.
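A likely relevant detail, consistent with the Python 3 workaround mentioned in the thread: pickle protocols below 4 cannot serialize objects larger than 4 GB, and on Python 2 this often surfaces as a bare SystemError. On Python 3.4+ one can request protocol 4 explicitly; a minimal sketch (the file name and array here are just placeholders):

```python
import pickle
import numpy as np

# Stand-in for the large in-memory image cache that pickle.dump was failing on.
dataset = np.zeros((1000, 128, 128, 3), dtype=np.float32)

# Protocol 4 (Python 3.4+) supports objects larger than 4 GB,
# which the older default protocols cannot handle.
with open('cache.pickle', 'wb') as f:  # 'cache.pickle' is just an example name
    pickle.dump(dataset, f, protocol=4)
```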