-
Notifications
You must be signed in to change notification settings - Fork 42
Introduce multi-slice NuGraph2 inference and filtering #868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Tag for integration release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this executable and the other one to run on grid are meant to setup tritonserver on the grid/interactively respectibely, but I wanted to understand better what's the scope of both scripts and perhaps I'd recommend adding a short description to both in case people want to use them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was actually introduced in the previous PR. These scripts are still under development so it's a bit premature to document them, but if needed we can add a brief description of the intended usage (but maybe not necessarily for this PR as it does not make any change to the scripts?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they were already there then my bad for not remembering, I agree on the proposal: once they will be in their final shape documentation could be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I'll clarify this on my own after checking the rest of the code, but I wanted to make sure what some of the labels refer to:
nuslhitsCryoE/W are the hits before nu-graph filtering
NuGraphCryoE/W I guess is a "global object/container" for all nu-graph related reconstruction
NCCSlicesCryoE/W is the "not clear cosmic" single slice considered as the neutrino candidate slice based on the trigger/optical flash
NGMultiSliceCryoE/W corresponds to nu-graph being applied to all non clear cosmic slices in a certain event
ngfilteredhitsCryoE/W corresponds to post-nu-graph-filtering hits
is this interpretation correct?
Perhaps, especially as far as NCCSlicesCryoE/W and NGMultiSliceCryoE/W are concerned at least, I'd add a comment to clarify/specify some details in the fhicl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed chain uses:
NCCSlicesCryoW/E: a collection of slices that are not tagged as clear cosmic by Pandora;NGMultiSliceCryoW/E: this runs NuGraph2 on multiple slices (here, the ones fromNCCSlicesCryoW/E) and provides objects and associations with NuGraph2 predictions;ngfilteredhitsCryoW/E: this filters the hits in each slice using the NuGraph2 filter score, and creates a new hit collection we then run Pandora over (the second-pass Pandora).
Thanks, we will add some comments and perhaps get rid of labels that we do not use anymore (nuslhitsCryoE/W and NuGraphCryoE/W were used to select the "triggering" slice and run NuGraph2 on a single slice).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification: from the context and while checking carefully/talking to you it was clear but it doesn't hurt to put a few more comments for the future/people not expert enough who want to make sure they are doing the right thing/using the correct objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify how these two fhicl parameters set to false change the settings in pandora?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bottom line is that, once PandoraPFA/larpandora PR #35 and PandoraPFA/LArContent PR #256 are merged in Pandora/LArSoft, there will be two fcl parameters (CollectHitPredictions and UseHitPredictions) to collect the information from NuGraph2 and inject it into Pandora.
More specifically, we added a new interface in Pandora (larpandora) to collect the hit-by-hit predictions from NuGraph2 at the second Pandora stage, after NuGraph2 was ran, and inject them into the hit object within Pandora (specifically, a LArCaloHit).
CollectHitPredictions steers whether we expect NuGraph2 information from the input we give to Pandora, and UseHitPredictions adds the information to the LArCaloHit.
This operation just adds more information that can be used in Pandora, but does not change the reconstruction (for that, we would need to actually use the information in any algorithm, and change the relevant XML).
I kept the parameters around just to remind us that there is this option, but they are commented out right now since the Pandora PRs are not yet merged. Specifically, false means that we do not inject the NuGraph2 information into Pandora (even if available): this can and will be changed once everything is ready.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the detailed explanation. Maybe a small comment summarising the purpose of this (commented) set of options could be useful (also for you in the future)?
Something like "Enable to inject NuGraph2 hit-by-hit predictions in Pandora"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fhicl changes with respect to stage1_run2_icarus.fcl are clear, I wonder why the (correct) label change for tracks and showers after nu-graph filtering + second Pandora iteration is different for tracks and showers?
tracks: physics.producers.pandoraTrackGausCryoX.PFParticleLabel: "pandoraGausNuGraphRecoCryoX"
showers: physics.producers.SBNShowerGausCryoX.PFParticleLabel: "pandoraGausNuGraphRecoCryoX"
Maybe it was always there and I never realised...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I think they were (maybe implicitly?) always there. Those labels steer which PFPs we use to run the post-pattern-recognition track/shower algos, so they needed to be aware of the post-NG2 Pandora products
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the reason for the addition of this explicit statement was clear, I guess I never realised the labelling was different.
As a further confirmation, this change is applied only in the fhicls with "_NuGraphReco" because for the default ones nu-graph runs, creates new variables/info to be stored to caf, but the ark vs shower classification is done using the first and unique Pandora reconstruction pass, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's exactly right!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't @sequence::icarus_tpcpmtbarycentermatch appear after the second pandora iteration as for stage1_run2_icarus_NuGraphReco? I guess in this case the labels for the tpc - pmt matching should rely on the second pass of Pandora (post nu-graph filtering)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, @sequence::icarus_tpcpmtbarycentermatch runs the TPC-PMT barycenter-based matching, and relies on slices from a per-cryo Pandora label -- you're right: things were OK in similar fcls, but I did forgot to swap things in this one. Thanks a lot!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification and I see you did the swap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above for the non _MC case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above for the data case on the different labels for tracks and showers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it could be useful for people willing to do the training in the future and given the confusion we have seen in the recent past related to 1D --> 2D change, I'd recommend adding a comment before L94, indicating the correct SimChannelModuleLabel to use in the 2 cases or at least highlighting the configuration used in the present file version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. I think @cerati might have the 2D-version of this fcl ready, since he and Giulia did all the 2D training work. Also, we have to remember to add the 2D NG2 model to the 2D fcls. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rtriozzi, not sure if @cerati wants to do the change in the present PR or keep everything consistently referring to the training based on 1D-deconv. files. For sure for the upcoming icaruscode release I think it would be useful to have both options consistently applied (and thus all the needed fhicls).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question on the parameters settings maybe I'll clarify on my own looking at the slides references, but is it wanted that avg and dev parameters are the same for the two cryostats?
Thinking ahead of people who might use this tool, but not sure if it is costumary I'd ask to add a comment on the choice of parameters indicated here (maybe highlighting the last time this decision was made to keep track?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm wrong (and confused after all the tests with 2d deconv we have done), but after 2D shouldn't this parameter services.BackTrackerService.BackTracker.SimChannelModuleLabel: "daq:simpleSC
be "merge"?
Edit: talking with @rtriozzi (and as the name suggests) it seems this is not something that needs to be updated since it was used for (rapid) tests of the model before updates to have it in the current shape so I guess you can ignore my comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very pedantic suggestion: comment at L=45 should be //2535 is the last part of the wires in cryos 0 and 1 before the cut in z=0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If understand the workflow and based on the naming convention adpoted, both this fhicl and cafmakerjob_icarus_detsim2d_NuGraphReco.fcl are to be used for cases where stage1 includes two Pandora iterations and the second takes place after nugraph filtering and is the one we want to save to caf files (together with semantic info from nugraph itself).
However, the alternative should be to run Pandora only once (since we don't want to alter the reco based on nugraph) and save the outcome of nugraph into CAF files so shouldn't another fhicl to create caf with such info appear here?
Edit: @rtriozzi explained me that the default CAF maker is already compliant with the new nu-graph features. It was only me not realising things will appear in the related sbncode PR. Ignore this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few questions:
- information and attributes of the "spacepoint" are referred to the output of cluster3d step, correct?
- the energyDepNtuple includes only true info (MC)?
- can you clarify the comment at L=311? is this referred to the possibility of multiple v interactions in the same event?
Also, maybe I'm misunderstanding, but if the input in terms of spacepoints is cluster3d and this has the mythical points setting (2 planes are sufficient to find a match in 3d) isn't the case at L=360 indicated as an error possibly happening?
L417: what does this energy deposit map include? I understand this option is not available for ICARUS, but I was curious to understand. Never mind, I responded myself this is something used in uB.
L447-9: I'm not sure I'm following. It is said negative indicates no associated particle and thus one should search for a possible match track id <--> particle before taking the abs, but when doing this check the abs is done, is this wanted? I went to check a bit the TrackIdToParticle_P function but I'm not expert on how this operates so want to make sure this is what was intended here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
L459: I'm not sure I'm getting "invisible" particles, why invisible?
L523: maybe it's not an issue, but here a vector of float is declared whereas the optical flash has a vector of double for the photoelectrons per channel (https://github.com/SBNSoftware/lardataobj/blob/7c9c4f84cddf3692803f92e06ae17f4dcc7e0cb0/lardataobj/RecoBase/OpFlash.h#L88)
Especially on this part related to optical flashes and in genera PMT-related reconstruction, I must say I'm not an expert so I trust your choices, but please do consider contacting an expert if needed.
This PR integrates additional NuGraph2 modules in ICARUS, including some that were already introduced in Giuseppe's icaruscode PR #815, on which this PR is based, as well as new modules designed to:
Along with the modules, a set of FHiCL files is provided. They are all based on the standard
v10_06_00_01p01configurations. Following the review of this PR, similar FHiCL files based ondevelopwill be provided.More information on NuGraph2 and its applications can be found in SBN DocDB #40585, #44208, Indico.
A companion PR to sbncode is available for dealing with CAF-making after the NG2 filter module (sbncode PR #616).
Associated PRs
Review
Tagging for review @acampani and @PetrilloAtWork as reconstruction and infrastructure experts. Thanks a lot!