pickle.dump gives SystemError #8
Comments
Thanks for the report! Could you please provide more details? What are you trying to run?
I got this running nnpcr.py. This is the error message: I had 10k positive / 10k negative images. While googling I found only one solution: run the same script on Python 3. It took me a while, but there is no error message anymore. I have one more question. I see that this is set to a constant number: should this represent the size of the training set?
Seems like this dataset is too large to fit into the in-memory cache. You can comment out line 114 (
No (at least not directly). But you can try setting different numbers here. If
BTW, what accuracy do you get with your dataset? Could you share your model (or how you gathered it)?
Didn't get over 80% yet. Looking for a better dataset. Basically I aim to train it to recognize photos which are not suitable for advertisement networks, so even minor nudity is not accepted here. Should I train it only with photos of people, or should I provide all kinds of different samples as negatives? Do photo dimensions matter, or can I collect photos with a smaller resolution? At the final stage I hope I can use this script to go through 20 million photos on my website and mark which ones are not suited for showing advertisements. I am collecting data samples by crawling Reddit, so I could gather huge datasets, even hundreds of thousands of images.
Not only people - just arbitrary images. The more, the better.
Currently all images are converted to 128x128. You could try smaller ones; they will be upscaled (see the resizing sketch below).
Crawling Reddit is a good idea, maybe I'll try it too later.
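A minimal sketch of the kind of resizing described above, assuming OpenCV (`cv2`) is used for image loading; the function name and paths here are illustrative, not taken from nnpcr.py:

```python
import cv2  # assumed image library

IMAGE_SIZE = 128  # the constant discussed above


def load_image(path, size=IMAGE_SIZE):
    """Read an image from disk and resize it to size x size pixels.

    Smaller images are upscaled and larger ones downscaled, so mixed
    source resolutions are fine.
    """
    img = cv2.imread(path)
    if img is None:
        raise IOError("could not read %s" % path)
    return cv2.resize(img, (size, size))
```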
Would it be smart to double the image size? Would it be enough to change this constant? Must positive and negative data samples be the same size?
It's not that simple - you also need to change the architecture of the neural network, e.g. add additional conv & max_pool layers (a rough sketch follows below).
Yep.
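For illustration only, here is a rough sketch of what one additional conv + max_pool stage might look like in TensorFlow 1.x-style code when the input is doubled to 256x256; the variable names, filter sizes, and channel counts are assumptions, not the repository's actual layers:

```python
import tensorflow as tf  # TensorFlow 1.x-style API, as used at the time of this thread

# Stand-in for the output of the last existing pooling layer (shape assumed).
pool3 = tf.placeholder(tf.float32, [None, 32, 32, 64])

# One extra conv + max_pool stage halves the spatial resolution again, so a
# 256x256 input ends up with the same feature-map size that 128x128 had before.
W_conv4 = tf.Variable(tf.truncated_normal([5, 5, 64, 64], stddev=0.1))
b_conv4 = tf.Variable(tf.constant(0.1, shape=[64]))
conv4 = tf.nn.relu(tf.nn.conv2d(pool3, W_conv4,
                                strides=[1, 1, 1, 1], padding='SAME') + b_conv4)
pool4 = tf.nn.max_pool(conv4, ksize=[1, 2, 2, 1],
                       strides=[1, 2, 2, 1], padding='SAME')
```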
Is this enough for a batch of 60K training samples?
Not sure. Try increasing it to 3K and check whether quality is better or worse than with 1.5K.
It really takes a lot of RAM. I could only run a 40k sample on 32 GB of RAM. I will try 80k on 64 GB tomorrow. My model size always ends up being 13 MB, is this right? I still can't get accuracy over 80%.
Seems like it is currently not optimized for large datasets. Right now it loads the whole dataset into memory; working with the dataset needs to be fixed (one possible workaround is sketched below).
The current network architecture is not very complex - maybe you should try more complicated architectures, e.g. Inception. You can also try the following:
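On the memory point above: one way to avoid holding the whole dataset in RAM is to read and decode images in batches from disk with a generator. A minimal sketch, where the directory layout, label mapping, and helper name are assumptions rather than nnpcr's actual API:

```python
import os
import random

import cv2
import numpy as np


def batch_generator(image_dir, labels, batch_size=64, size=128):
    """Yield (images, labels) batches read from disk on demand,
    so only one batch needs to be held in memory at a time.

    `labels` is assumed to map file names to 0/1 class labels.
    """
    names = list(labels.keys())
    random.shuffle(names)
    for start in range(0, len(names), batch_size):
        chunk = names[start:start + batch_size]
        images = [cv2.resize(cv2.imread(os.path.join(image_dir, name)), (size, size))
                  for name in chunk]
        yield (np.array(images, dtype=np.float32) / 255.0,
               np.array([labels[name] for name in chunk]))
```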
By the way, in the new TensorFlow library you have to call
like this, otherwise you will get an error. I'll test with all those different parameters. Also I get these warnings:
Which line is for the pre-output layer?
Haven't ported to the new version yet. Will do it soon.
I got the best results with this network, but accuracy didn't go over 93%. I ran 80 epochs. It is giving me pretty good results in real life, but I got curious to see its predictions, so I checked what softmax is returning for me, and the result wasn't good. It is usually 1, or very close to 1 or 0. It shouldn't be like this. What do you think my mistake was? Is the neural network too big for 128x128 pixels, or is a 64k data batch too small for such a big network?
Sorry, I can't understand what the problem is. 93% is rather good accuracy. The whole dataset is split into a train (80%) and a test (20%) set, and accuracy is calculated over the test set. So if you have 93% accuracy, the same accuracy should hold in real life too.
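For context, the 80/20 split described above can be reproduced with something like the following sketch (assuming scikit-learn is available; nnpcr may implement the split differently):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for the real data: 1000 flattened images with binary labels.
X = np.random.rand(1000, 128 * 128 * 3).astype(np.float32)
y = np.random.randint(0, 2, size=1000)

# 80% train / 20% test; accuracy is then reported on the held-out test set only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
print(len(X_train), len(X_test))  # 800 200
```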
No, the problem, I think, is with how confident the results that softmax returns are. My guess is that the results are too dense, but I'm new to this field, as you can see; I just learnt a lot in a couple of weeks. If I output what softmax returns, it gives me numbers around 1.00 / 0.00 or 0.9999 / 0.0001, something along those lines. I'm just curious why my neural network is so confident in its results: should the probability ever go to 100%, and in most cases be between 99-100%? One more question: would it be practical to add a second fully connected layer on top of the fully connected layer, add another dropout, and only then retrieve the 2 final classes? I also played with different kernel sizes; it didn't give me any effect, just slowed down my training. AdamOptimizer gave me quite an improvement in results. Just happy to share what I've learnt.
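A rough sketch of the second fully connected layer plus dropout asked about above, in TensorFlow 1.x style; all variable names and layer sizes here are hypothetical, not the thread author's actual network:

```python
import tensorflow as tf

# Stand-ins for the existing first fully connected layer (after dropout)
# and the dropout keep-probability placeholder.
fc1_drop = tf.placeholder(tf.float32, [None, 1024])
keep_prob = tf.placeholder(tf.float32)

# Second fully connected layer with its own dropout.
W_fc2 = tf.Variable(tf.truncated_normal([1024, 256], stddev=0.1))
b_fc2 = tf.Variable(tf.constant(0.1, shape=[256]))
fc2 = tf.nn.relu(tf.matmul(fc1_drop, W_fc2) + b_fc2)
fc2_drop = tf.nn.dropout(fc2, keep_prob)

# Final 2-class output; softmax is applied on top of these logits.
W_out = tf.Variable(tf.truncated_normal([256, 2], stddev=0.1))
b_out = tf.Variable(tf.constant(0.1, shape=[2]))
logits = tf.matmul(fc2_drop, W_out) + b_out
probabilities = tf.nn.softmax(logits)
```

Whether this helps is an empirical question; near-certain softmax outputs like 0.9999 / 0.0001 can also simply reflect the well-known tendency of deep networks to be overconfident (poorly calibrated) rather than a bug in the model.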
pickle.dump gives me a SystemError if I try a larger batch of images. I thought it was a memory issue, so I tried the same batch on a 32 GB RAM machine; same error.
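A likely relevant detail, consistent with the Python 3 workaround mentioned in the thread: pickle protocols below 4 cannot serialize objects larger than 4 GB, and on Python 2 this often surfaces as a bare SystemError. On Python 3.4+ one can request protocol 4 explicitly; a minimal sketch (the file name and array here are just placeholders):

```python
import pickle
import numpy as np

# Stand-in for the large in-memory image cache that pickle.dump was failing on.
dataset = np.zeros((1000, 128, 128, 3), dtype=np.float32)

# Protocol 4 (Python 3.4+) supports objects larger than 4 GB,
# which the older default protocols cannot handle.
with open('cache.pickle', 'wb') as f:  # 'cache.pickle' is just an example name
    pickle.dump(dataset, f, protocol=4)
```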