
Performance problem with the combination of SSG + SPJ #10

Open
MargauxParentAubay opened this issue Apr 25, 2022 · 6 comments

Comments


MargauxParentAubay commented Apr 25, 2022

Hey,

My team and I are facing an issue with the combination of SSG and SPJ. We trained the SSG and the SPJ, and their performance is quite good taken separately. But as soon as we test the full pipeline end to end, performance drops sharply: we get 0.55 and 0.87 precision and recall for the SSG and a 0.89 F1 score for the SPJ, but only 0.131 for the combination of SSG + SPJ. Based on the table in the Neural Databases article, we expected better results. Do you have any idea why this is happening?

Also, why can't we predict the other types of questions with SSG + SPJ?

[Four screenshots of evaluation results attached]

Thanks :)

j6mes (Contributor) commented Apr 26, 2022

It looks like your SPJ is not being trained with random negative sampling in this instance. The model reported in the paper is the spj_rand model in the scripts.
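For anyone hitting the same issue: random negative sampling here means pairing each query with facts that do not support it, so the SPJ learns to emit a null answer rather than hallucinating one. Below is a minimal sketch of the idea, assuming each training example is a dict with query, facts, and answer keys; the actual spj_rand data pipeline in this repo may be structured differently, and the NULL_ANSWER label is an assumed placeholder:

```python
import random

def add_random_negatives(examples, num_negatives=1, seed=0):
    """Sketch of random negative sampling for SPJ training data.

    Assumes each example is a dict with 'query', 'facts', and 'answer'
    keys; the real spj_rand pipeline may represent data differently.
    """
    rng = random.Random(seed)
    augmented = list(examples)
    for example in examples:
        # Candidate sources of negative facts: every other example.
        others = [e for e in examples if e is not example]
        if not others:
            continue
        for _ in range(num_negatives):
            # Pair the query with facts from a different example;
            # these facts should not answer the query.
            negative = rng.choice(others)
            augmented.append({
                "query": example["query"],
                "facts": negative["facts"],
                "answer": "NULL_ANSWER",  # assumed null label
            })
    return augmented
```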

To replicate the results from the paper exactly: the table there reports an average over 5 different random initialisations of the model. That said, I trained with SEED=1 using the scripts from the repo last night and got these numbers when testing on the SSG predictions.

You can download my model checkpoint from this Google Drive folder: https://drive.google.com/drive/folders/1Goyx1KST01nritrm4CoAOl1KCJYOc43P?usp=sharing

| model | generator | retriever | lr | steps | all_ (mean) | type_bool (mean) | type_count (mean) | type_minmax (mean) | type_set (mean) |
|-------|-----------|-----------|------|-------|-------------|------------------|-------------------|--------------------|-----------------|
| t5 | spj_rand | | 1e-4 | 1 | 67.077511 | 77.97619 | 50.609756 | 73.227743 | 64.840929 |

@PierreGAubay

Hi James,
I am on the team with Margaux. Thanks for replying to us so quickly.
We found out that our problem was linked to the conversion of the final predictions. In some conda environments, the actual label (the true answer) was being printed as the token NULL_ANSWERS, which obviously ruined our metrics. We fixed it by modifying the convert_spj_to_predictions.py file.
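If you want to check whether your environment is affected, a quick diagnostic along these lines works. This is only a sketch: it assumes the converted predictions are a JSONL file with an actual field holding the gold answer, which is a guess at the format rather than the repo's documented schema:

```python
import json
import sys

def count_null_actuals(path: str) -> int:
    """Count gold labels that contain the NULL_ANSWERS token.

    A non-zero count means the conversion step is corrupting the
    evaluation, the symptom we saw in some conda environments.
    """
    with open(path) as f:
        return sum(
            "NULL_ANSWERS" in str(json.loads(line).get("actual", ""))
            for line in f
            if line.strip()
        )

if __name__ == "__main__":
    print(count_null_actuals(sys.argv[1]))
```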

@andreabac3

Hi @PierreGAubay,
could you please share the modified version of convert_spj_to_predictions.py?

Andrea

@PierreGAubay

Hi @andreabac3,
Unfortunately I can't send you the complete file. The code I changed belongs to the company I work for.
However, the changes my team and I made to this specific file do not break the fundamental behaviour of its functions.
During the process, we realized two things. First, under some question_type conditions, certain assert statements seemed too restrictive, such as `assert "[SEP]" not in` (we do not fully understand why it is there); it causes the program to drop those lines from prediction, which degrades performance globally. Second, we had to copy the answer from the predictions into the derivations field of the JSON output file. Why? Because at some point in spj_generator.py, the output yielded by the _process_query function is the derivation_tokens.
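To make that concrete, here is a minimal sketch of both changes. The record structure and the prediction field name are assumptions for illustration; the real convert_spj_to_predictions.py is organised differently:

```python
import json

def convert_record(record: dict) -> dict:
    prediction = record["prediction"]  # assumed field name

    # (1) Instead of asserting that "[SEP]" never appears in the decoded
    # text, which makes the script drop these records entirely, warn and
    # keep the record in the evaluation.
    if "[SEP]" in prediction:
        print(f"Warning: [SEP] found in prediction: {prediction!r}")

    # (2) spj_generator.py yields derivation_tokens from _process_query,
    # so the predicted answer must be mirrored into the derivations
    # field of the JSON output for the evaluation to score it.
    record["derivations"] = [prediction]
    return record

def convert_file(in_path: str, out_path: str) -> None:
    # Assumes one JSON record per line (JSONL).
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            fout.write(json.dumps(convert_record(json.loads(line))) + "\n")
```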

Hope it helps!

@PierreGAubay

Hi again @andreabac3,
I have convinced my manager.
Here is the txt version of it. Sorry if the code is a bit messy.
convert_spj_to_predictions.txt

@andreabac3

Hi @PierreGAubay,
sorry for my late response.

Thank you very much for your help and kindness.

Sincerely,
Andrea
