PyTorch/Tensorflow #2

Open
ClashLuke opened this issue Nov 24, 2019 · 14 comments
Labels: architectural change, enhancement, good first issue, performance

@ClashLuke

Hello there,
I recently stumbled upon this repository and was interested in trying out your code. However, single-threaded sklearn doesn't seem efficient to me compared to GPU-optimized PyTorch or TF.
Do you have any plans to move to those frameworks, or would you accept a pull request implementing them?
Regards,
Luke

alessiosavi added the enhancement and good first issue labels Nov 25, 2019
@alessiosavi (Owner) commented Nov 25, 2019

Hi Clash,

Right now I'm trying to understand which type of neural network is best suited to recognize the 68 points extracted from the face. So the work you find here is for test/study purposes only, for me and for anyone who needs a base codebase.

I'm currently replacing the KNN with an MLP classifier, which is obviously more efficient (in terms of precision) for this purpose.

Of course, pull requests are welcome!
I've played very little with PyTorch, and I think Tensorflow would be the preferable choice.

@alessiosavi (Owner) commented Nov 25, 2019

I've run some tests.
During the prediction phase, the most time-consuming step is the face encoding.

[image: timing screenshot]

As you can see, encoding two faces costs ~3s on my hardware (GeForce 940MX).
This is because the jitter parameter used during the training/tuning phase has to be equal to the one used when making predictions, and I chose 300 in order to increase the variety of distortions applied to the photos before training/prediction.

Are you talking about the tuning/training phase or prediction?

Be sure to pull from master; I've migrated to an MLP classifier, which is more precise during prediction.

alessiosavi added the architectural change and performance labels Nov 25, 2019
alessiosavi pushed a commit that referenced this issue Nov 25, 2019
Enhancements

Issue #2
 - Model saving mechanism rewritten from scratch (using a timestamp as the name)
  - Every model will now be saved in a different directory
  - All data related to the model (dataset + configuration) will be saved in the same folder
  - Configuration file changed due to the new implementation of the model folder
  - dump_model (dataset) rewritten and migrated to utils
  - dump_model (classifier) rewritten in order to be compliant with the new folder architecture
 - Parallelism migrated from "different person" to "different image, same person"
 - Enabled progress bar during face analysis
 - Response constructor will now accept parameters

Issue #4
 - Created a function to retrieve the dataset from the input HTML form and return it to the tune/train function
 - Standardized and refactored the train/tune logic

BugFix
 - Dump the real classifier (grid.best_estimator_)
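
The "dump the real classifier" fix refers to scikit-learn's grid search: what gets persisted should be the refitted `grid.best_estimator_`, not the `GridSearchCV` wrapper. A minimal sketch, with illustrative parameters and synthetic data standing in for the real face encodings:

```python
# Minimal sketch: persist the estimator refitted by the grid search,
# not the GridSearchCV wrapper itself. Parameters and data are illustrative.
import joblib
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic 128-dim "encodings" standing in for the real training data.
X_train, y_train = make_classification(n_samples=60, n_features=128,
                                       n_informative=10, n_classes=3)

param_grid = {"hidden_layer_sizes": [(128,), (256, 128)], "alpha": [1e-4, 1e-3]}
grid = GridSearchCV(MLPClassifier(max_iter=500), param_grid, cv=3)
grid.fit(X_train, y_train)

# The bug fix: dump the refitted best estimator, not `grid` itself,
# so loading the model does not drag the whole search object along.
joblib.dump(grid.best_estimator_, "model.clf")
```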
@ClashLuke (Author)

First of all, thank you very much for taking the time to reply to this issue regarding training optimization.

Second, you pointed out that most of the time is spent encoding faces. This step could largely be skipped by using a couple of Keras features, such as the ImageDataGenerator.
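
For context, a minimal sketch of what using Keras's ImageDataGenerator for augmentation could look like; the parameter values and the placeholder batch are illustrative, not tuned:

```python
# Minimal sketch: on-the-fly augmentation with Keras instead of dlib's
# jitter-based re-sampling. Parameter values are illustrative only.
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=15,        # small random rotations
    width_shift_range=0.1,    # horizontal translation
    height_shift_range=0.1,   # vertical translation
    zoom_range=0.1,           # random zoom
    horizontal_flip=True,     # mirror faces
)

faces = np.random.rand(8, 150, 150, 3)   # placeholder batch of face crops
labels = np.arange(8)

# Each call yields a freshly distorted batch; the labels are untouched.
batches = datagen.flow(faces, labels, batch_size=8)
augmented_faces, same_labels = next(batches)
```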

Why does the jitter have to be equal during training and prediction? Isn't it normally used as a regularization technique, and therefore left out during inference, or am I thinking of something else here?

Lastly, I'd love to know whether you'd be fine with a (backwards-compatible) switch to a CNN, so that we could compare the performance of Inception-v4 against the MLP.

@alessiosavi (Owner) commented Nov 27, 2019

Hi Sir,

Thank you for the interest in the project!
You were completely right!

The problem with the jitter parameter was caused by the KNN, which was not able to recognize faces when trained with a high number of distortions. Unless the values correspond quite strictly between the training and prediction phases, the network is not very precise (I think this would have to be investigated in the architecture of the KNN, but it's out of scope).
With the MLP architecture (which has increased the prediction confidence by a huge factor), this little oddity is no longer relevant, and we can use a different number of jitters during the training and prediction phases.

Of course, the jitter parameter causes a lot of time to be spent on image distortion (from the documentation, jitter=300 -> 300x the time used), so using a different approach to create distortions would be a great performance improvement.
We have to work out which ImageDataGenerator parameters are necessary to preserve the quality of prediction (jitter averages over the augmented data) and increase the speed of the face encodings.

From the architectural POV, we can play as much as we want with the code. So we can try lots of different types of networks, using the same dataset to compare the results.

I expect that (with a higher amount of data) a CNN/RNN will perform better. I suspect, instead, that with very few photos (and the majority of the identities in this dataset have fewer than 10), the MLP will perform slightly better.

While changing the NN base code (Classifier.py), it's important to keep the ability to recognize multiple faces in the same photo.

Before migrating to Tensorflow, I think there is some work on my side to clean the code and standardize the return functions.

alessiosavi added four commits that referenced this issue Jan 3, 2020 (same changelog as above)
@ClashLuke (Author)

Since the ImageDataGenerator has a lot of parameters, it might be better to switch to the new system instead. Unfortunately, I got lost in another repository while trying to figure out what the jitter parameter does. Could you explain it real quick?
I also have to agree that MLPs might perform better than CNNs on tiny datasets. Luckily, the ImageDataGenerator can alleviate this issue quite a bit.

@alessiosavi (Owner) commented Jan 12, 2020

Hi @ClashLuke,

The num_jitters parameter is the number of times to re-sample the face when calculating the encoding.
If num_jitters > 1, each face is randomly jittered slightly num_jitters times, each jittered copy is run through the 128D projection, and the average is used as the face descriptor.
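
For illustration, this is roughly how the parameter is exposed by the face_recognition API (the image path is a placeholder):

```python
# Minimal sketch of the num_jitters parameter in the face_recognition API.
# The image path is a placeholder.
import face_recognition

image = face_recognition.load_image_file("person.jpg")
locations = face_recognition.face_locations(image)

# num_jitters=300 re-samples each face 300 times and averages the 128D
# encodings, which is why encoding dominates the prediction time.
encodings = face_recognition.face_encodings(
    image, known_face_locations=locations, num_jitters=300
)
```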

After some tests, I've realized that cv2 is better at "finding" faces in photos when the image has low quality or the person in the photo has a "not centred" face angle.

I think the first step is to migrate the face detection from dlib (face_recognition uses dlib internally) to cv2.dnn.readNetFromCaffe. This will translate into an increase in face-detection quality. In contexts like CCTV cameras or low-resolution/low-quality photos, we can be sure to have a near-optimal face detection tool. Then we can move on to generating some augmented data that is helpful for the train/tune process. I think I can start working on the cv2 migration next month. I've lost the jupyter-notebook where I started developing the PoC of the migration.
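
A minimal sketch of what the cv2.dnn.readNetFromCaffe detector could look like; the prototxt/caffemodel names are OpenCV's commonly used SSD face-detector sample files and the image path is a placeholder, none of which are part of this repo yet:

```python
# Minimal sketch: face detection with OpenCV's DNN module instead of dlib.
# Model files are OpenCV's standard SSD face-detector samples (assumption).
import cv2

net = cv2.dnn.readNetFromCaffe("deploy.prototxt",
                               "res10_300x300_ssd_iter_140000.caffemodel")

image = cv2.imread("person.jpg")                      # placeholder path
h, w = image.shape[:2]

# The model expects 300x300 BGR input with mean subtraction.
blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0,
                             (300, 300), (104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()

for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.5:                              # illustrative threshold
        box = detections[0, 0, i, 3:7] * [w, h, w, h]
        x1, y1, x2, y2 = box.astype(int)
        print(f"face at ({x1},{y1})-({x2},{y2}), confidence {confidence:.2f}")
```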

@ClashLuke (Author)

I don't think I understand. You sample the jitter parameter randomly from a uniform distribution u = U[-P, P], where P is a parameter you set somewhere, correct?
Doesn't that mean the sum of 300 jitters follows an Irwin-Hall-style distribution, i.e. approximately a normal distribution with zero mean and σ = sqrt(300 · P²/3) = 10P? Why jitter so many times?

Another thing I don't quite understand is where opencv and dlib come from. I assumed there was an MLP involved in this process?
Lastly, do you know why it's better at figuring out what the jittered faces contain? Are the labels jittered as well?

@ClashLuke (Author)

I can finally say that I know what you're doing when training the model.
We can definitely keep the initial pipeline, even though it would be nice to jitter while training.
Should I take a shot at rewriting the Classifier in a new Tensorflow branch/fork?

I'd have to rewrite the hyperparameter search, though. While I'm at it, I'd also change the architecture to a DenseNet-style one, as dense connectivity is insanely powerful for MLPs (a sketch follows below).
Would that be an issue for you?
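
For concreteness, a minimal sketch of a densely connected MLP in Keras, where every layer receives the concatenation of all previous activations; the depth, widths, and the 128-dim input are illustrative assumptions:

```python
# Minimal sketch: a DenseNet-style MLP in Keras, where every layer
# receives the concatenation of all previous activations.
# Layer count/sizes and the 128-dim input are illustrative.
from tensorflow.keras import Input, Model, layers

def dense_mlp(input_dim=128, num_classes=10, hidden=64, depth=4):
    inputs = Input(shape=(input_dim,))
    features = [inputs]
    for _ in range(depth):
        # Each block sees all earlier outputs (dense connectivity).
        x = layers.Concatenate()(features) if len(features) > 1 else features[0]
        features.append(layers.Dense(hidden, activation="relu")(x))
    x = layers.Concatenate()(features)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return Model(inputs, outputs)

model = dense_mlp()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```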

@alessiosavi (Owner)

Hi @ClashLuke, thank you for the interest and sorry for the late response. I'm very busy these days and can only contribute on weekends.

Of course, you can rewrite every part of the code that you are confident with (:

> Another thing I don't quite understand is where opencv and dlib come from. I assumed there was an MLP involved in this process?

dlib is used (through the face_recognition high-level API) to extract the points related to the face. It uses a custom version of the library, but now we can use the one present in the master branch of the project.
The MLP is delegated to "link" the face encodings to the labels.

> Lastly, do you know why it's better at figuring out what the jittered faces contain? Are the labels jittered as well?

From my understanding, no. The jitter performs data augmentation on the image, so the label stays the same.

I've created a gitter channel to discuss the future changes/roadmap of the project:
https://gitter.im/PyRecognizer/PyRecognizer-Development

Thank you once again for the interest in the project.

@ClashLuke (Author)

Are the hyperparameter search and the architecture search important? If not, I'd postpone them for now. The basic model already exists. What's next are the training loop and regularization.

@alessiosavi (Owner)

Hi Clash!

Thank you for the effort you put into the analysis! I'm here if you need explanations or tips on the code.

Of course, we can tune the hyperparameters in the next phase :D

@ClashLuke (Author) commented Apr 22, 2020

Pretty sure I've got model tuning in a testable state now here. I had to remove balanced accuracy and precision for now, as I wasn't keen on calculating accuracy in buckets.
Now, how would I go about testing this unit?

@ClashLuke (Author)

Any news?

@alessiosavi (Owner)

Hi Clash, I'm going to rewrite the "backend engine" from scratch using dlib and tensorflow. I'm going to update the repo in the next month.

I'm testing the neural network and it has ~97% accuracy on the validation dataset!
I'm completely changing the architecture of the code.
I think the repository will be split into two different parts:

Python NeuralNetwork, a webserver that runs on localhost, delegated to:

  • Load the image
  • Recognize the face bounds
  • Perform data augmentation using ImageDataGenerator
  • Encode the face using shape_predictor_68_face_landmarks.dat and dlib_face_recognition_resnet_model_v1.dat instead of the face_recognition wrapper, in order to get more accuracy (see the sketch after this list)
  • Use a Tensorflow Dense network

Go webservice:

  • A new Go frontend delegated to talking with the Python daemon in order to expose the predict functionality.

Initially, training will run without HTTP interaction, so scripts will be released in order to train the network "offline".
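
A minimal sketch of the direct dlib encoding path mentioned in the list above; the two .dat files are dlib's standard pretrained models, and the image path is a placeholder:

```python
# Minimal sketch: encoding a face with dlib's pretrained models directly,
# bypassing the face_recognition wrapper. Model/image paths are placeholders.
import dlib

detector = dlib.get_frontal_face_detector()
shape_predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

image = dlib.load_rgb_image("person.jpg")

for detection in detector(image, 1):               # 1 = upsample once
    landmarks = shape_predictor(image, detection)  # the 68 facial landmarks
    # compute_face_descriptor returns the 128D encoding; the third
    # argument is the num_jitters discussed earlier in this thread.
    descriptor = encoder.compute_face_descriptor(image, landmarks, 10)
    print(list(descriptor)[:5])                    # first few dims of the encoding
```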
