Non-reproducibility in TrackerPhase2OTL1Track #47071
Comments
assign l1, dqm, upgrade
New categories assigned: l1,dqm,upgrade @aloeliger,@antoniovagnerini,@epalencia,@Moanwar,@rseidita,@srimanob,@subirsarkar you have been requested to review this Pull request/Issue and eventually sign? Thanks
cms-bot internal usage
A new Issue was created by @makortel. @Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here
@tomalin FYI
Hi @makortel, I am confused why these are all showing as failures. If I look at the actual histograms, they all look fine (e.g. https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_15_0_X_2025-01-09-1100+9e6aa1/66377/29634.911_TTbar_14TeV+Run4D110_DD4hep/TrackerPhase2OTL1Track_Tracks_HQ.html). There is a one-entry difference between the two sets (246 vs 247). Is that what is causing all of these to be flagged as red?
Probably? (The technical question would be for @cms-sw/pdmv-l2, whose histogram comparison infrastructure is being used in PR tests.) There is a "clear" difference between blue and red in the 4-5 bin (probably by 1).
The original sin there is that the comparison is performed via the ... Now the question would be: do we want to spot these discrepancies? Maybe this case is a bit pathological (and the test could, e.g., take into account the histogram population), but in general I think it would be interesting to be aware of these irreproducibilities, given that we run on exactly the same events.
OK, of course all the PR comparisons are BinToBin, while for RelMon we usually use Chi2. And actually the threshold is way higher: 0.999999999999.
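To illustrate why a single extra entry can turn a bin-to-bin comparison red while a chi-square style comparison barely moves, here is a minimal Python sketch. It is not the RelMon code: the definition of the bin-to-bin statistic (fraction of bins with identical content), the simplified chi-square sum, the function names, and the bin contents are all assumptions made for illustration. Only the threshold value and the 246 vs 247 entry counts come from the thread.

```python
# Illustrative sketch only: the statistic definitions below are assumptions,
# not the RelMon implementation.

BIN_TO_BIN_THRESHOLD = 0.999999999999  # threshold quoted in the thread


def bin_to_bin_fraction(ref, new):
    """Fraction of bins whose contents are exactly equal (assumed definition)."""
    if len(ref) != len(new):
        raise ValueError("histograms must have the same binning")
    if not ref:
        return 1.0
    same = sum(1 for a, b in zip(ref, new) if a == b)
    return same / len(ref)


def passes_bin_to_bin(ref, new, threshold=BIN_TO_BIN_THRESHOLD):
    """True if the fraction of identical bins reaches the threshold."""
    return bin_to_bin_fraction(ref, new) >= threshold


def chi2_statistic(ref, new):
    """Simplified two-histogram chi-square sum over non-empty bins."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(ref, new) if a + b > 0)


# Hypothetical bin contents, chosen to give 246 vs 247 entries
# with a difference of one entry in a single bin.
reference = [100, 80, 40, 20, 6, 0]
candidate = [100, 80, 40, 20, 7, 0]

print(sum(reference), sum(candidate))             # 246 247
print(passes_bin_to_bin(reference, candidate))    # False
print(chi2_statistic(reference, candidate))       # 1/13, about 0.077
```

With a threshold this close to 1, any single differing bin fails the bin-to-bin check under this assumed definition, regardless of how small the difference is relative to the histogram population.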
So far we have (in practice, at least) required CPU code to be fully reproducible within the same x86 microarchitecture and CPU vendor when running on 1 thread. In all cases so far the cause for non-reproducibility has been a bug somewhere.
Is this issue an extension of #45505?
OK, thanks. So, looking at the list of workflows, the issue affects not only the DD4hep workflow (as in #35109) but also the DB one (with DDD).
Tests of PRs unrelated to L1T show differences in workflows 29634.911 and 29834.999 in TrackerPhase2OTL1Track, TrackerPhase2OTL1TrackV, and L1T folders. In #47051 (comment)