How it works#

In this article we go over how edep2supera works focusing on 5 points below.

  1. Instantiate and configure the module

  2. Prepare the input

  3. Execute the main functions

  4. Interpreting the input

  5. Interpreting the output

The first three items show the flow of processes to run Supera with edep-sim files as input. Then the last two items demonstrate how a user can access the input to and the output from Supera.

Instantiate and Configure#

Import edep2supera as well as an example module.

import edep2supera
from edep2supera import config
Welcome to JupyROOT 6.26/06

Instantiate and configure SuperaDriver for execution of edep2supera main functions.

# Instantiate
e2s_driver = edep2supera.edep2supera.SuperaDriver()

# Configure the driver
e2s_driver.ConfigureFromFile(config.get_config('tutorial'))
   [WARNING]  <SuperaDriver::Configure> starting
   [WARNING]  <SuperaDriver::Configure> BBox config...
   [WARNING]  <SuperaDriver::Configure> Label config...

What’s going on?

  1. SuperaDriver is the driver of executing main functions of edep2supera

    • A technical detail: it is a wrapper (i.e. C++ inheritance) of supera::Driver defined in SuperaAtomic.

  2. SuperaDriver needs three configurations including

    • one for itself (SuperaDriver.Configure), and two algorithms:

    • one for creating the meta data of the 3D image volume to be recorded

    • another for interpreting the input and creating necessary information to optimize lartpc_mlreco3d

  3. Configuration is done by passing either a configuration file for string text (i.e. the contents of a configuration file) in a yaml format.

    • In the above, we use config module that can provide an example configuration for this tutorial.

One can add a new configuration with a .yaml extension in the data directory:

edep2supera/python/edep2supera/config_data

and it will be automatically available through a config module.

To list the list of available configuration files:

config.list_config()
['tutorial']

The contents of a tutorial configuration looks like this:

print(open(config.get_config('tutorial'),'r').read())
# ===============
# tutorial.yaml
# ===============

LogLevel:        WARNING
ActiveDetectors: ["TPCActive_shape"]
MaxSegmentSize:  0.03

BBoxAlgorithm: BBoxInteraction
BBoxConfig:
    LogLevel:   WARNING
    Seed:       -1
    BBoxSize:   [2048,2048,2048]
    VoxelSize:  [0.5,0.5,0.5]
    BBoxBottom: [-1000,-1000,-1000]
    #WorldBoundMax: [-1.e20,-1.e20,-1.e20]
    #WorldBoundMin: [ 1.e20, 1.e20, 1.e20]
    
LabelAlgorithm: LArTPCMLReco3D
LabelConfig:
    LogLevel: WARNING
    DeltaSize:     3
    ComptonSize:  10
    LEScatterSize: 2
    TouchDistance: 1
    StoreLEScatter:   True
    SemanticPriority: [1,0,2,3,4]
    EnergyDepositThreshold: 0.0
    #WorldBoundMax: [-1.e20,-1.e20,-1.e20]
    #WorldBoundMin: [ 1.e20, 1.e20, 1.e20]
    

We don’t go over the full details of the configuration parameters nor the details of the algorithms. For those, refer to the documentation of SuperaAtomic.

Reading and Preparing Input#

Download an example dataset for this tutorial.

! wget https://web.stanford.edu/~kterao/edep-sim-example.root
--2022-11-16 08:24:27--  https://web.stanford.edu/~kterao/edep-sim-example.root
Resolving web.stanford.edu (web.stanford.edu)... 171.67.215.200, 2607:f6d0:0:925a::ab43:d7c8
Connecting to web.stanford.edu (web.stanford.edu)|171.67.215.200|:443... connected.
HTTP request sent, awaiting response... 
200 OK
Length: 6604072 (6.3M)
Saving to: 'edep-sim-example.root'


edep-sim-example.ro   0%[                    ]       0  --.-KB/s               
edep-sim-example.ro  21%[===>                ]   1.34M  6.68MB/s               
edep-sim-example.ro  50%[=========>          ]   3.16M  7.85MB/s               
edep-sim-example.ro  63%[===========>        ]   4.01M  5.21MB/s               
edep-sim-example.ro  84%[===============>    ]   5.35M  5.52MB/s               
edep-sim-example.ro 100%[===================>]   6.30M  5.61MB/s    in 1.1s    

2022-11-16 08:24:29 (5.61 MB/s) - 'edep-sim-example.root' saved [6604072/6604072]

This file is generated by (=is an output of) edep-sim, which can be easily accessible through ROOT.TChain. If there are multiple input files, they can be sequentially added using TChain.Add function. To run Supera, we need to:

  1. Read one event input from the edep-sim output file

  2. Interpret edep-sim data and generate a formatted input data to run Supera

    • This (= being an interface to edep-sim file) is exactly why edep2supera exists.

# Use TChain to read edep-sim
from ROOT import TChain 
ch=TChain("EDepSimEvents")
ch.AddFile('edep-sim-example.root')

# Access the first (0-th) input data sample
ch.GetEntry(0)

# Interpret the edep-sim input by handing TG4Event containing all info for this sample.
input_data = e2s_driver.ReadEvent(ch.Event)

Running Supera#

… is done over two steps.

  • First, GenerateImageMeta generates an image meta data.

  • Second, GenerateLabel generates the main output, infomration necessary for optimizing lartpc_mlreco3d.

# Generate image meta data
e2s_driver.GenerateImageMeta(input_data)

# Generate infomration necessary for lartpc_mlreco3d
e2s_driver.GenerateLabel(input_data)

# Access the image meta and output of Supera
meta  = e2s_driver.Meta()
label = e2s_driver.Label()

Accessing and Interpreting the Input#

Let’s take a brief look at the input_data which is of supera::EventInput type (C++ class defined in SuperaAtomic). In short, input_data is an array of particle information.

print('Input contains',input_data.size(),'particles')
Input contains 2236 particles

Each particle has two essential data elements:

  1. Information about this particle

  2. A list of energy deposition 3D points with energy deposited and dE/dX at each point

Let’s access the former first.

# Print the track ID and PDG code for the first 10 particles stored in the list
for i, p in enumerate(input_data):
    if i == 10: break
    print('Index',i,'Track ID',p.part.trackid,'PDG',p.part.pdg)
    
# Dump the details of the first particle
print('\nDumping information about the first particle...\n')
print(input_data[0].part.dump())
Index 0 Track ID 0 PDG 11
Index 1 Track ID 1 PDG 22
Index 2 Track ID 2 PDG 11
Index 3 Track ID 3 PDG 22
Index 4 Track ID 4 PDG 22
Index 5 Track ID 5 PDG 22
Index 6 Track ID 6 PDG 11
Index 7 Track ID 7 PDG 22
Index 8 Track ID 8 PDG 22
Index 9 Track ID 9 PDG 22

Dumping information about the first particle...

      Particle  (PdgCode,TrackID) = (11,0) 
      ... with Parent (11,0)
      ... + Ancestor  (9223372036854775807,kINVALID_TRACKID)
      Vertex   (x, y, z, t) = (32.0045,-70.0098,665.004,0.911358)
      Momentum (px, py, pz) = (-1237.63,-580.933,-618.506)
      Initial  Energy  = 1500.59
      Deposit  Energy  = 0
      Creation Process = 0::0
      Instance ID      = 0
      Group ID         = kINVALID_INSTANCEID
      Interaction ID   = kINVALID_INSTANCEID
      Type     = 3
      Shape    = 6
      Children = 

Note that not all information may be filled at this point, in particular various “ID” (except for Track ID which is aleady determined by the previous simulation step edep-sim).

Next, let’s access energy deposition points.

import plotly.graph_objs as go
import numpy as np

particle = input_data[0]
print('There are',particle.pcloud.size(),'points')

# Create a list of points
points = [[pt.x, pt.y, pt.z, pt.e, pt.dedx] for pt in particle.pcloud]
# ... and turn that into a numpy array (so we can slice)
points = np.array(points)

# Create a scatter plot
trace = go.Scatter3d(x=points[:,0], y=points[:,1], z=points[:,2],
                     mode='markers',
                     marker=dict(size=1, color=points[:,3], opacity=0.3),
                    )
fig = go.Figure(data=trace)
fig.show()
There are 1297 points

… or plot all points togetehr

points_array = []

for particle in input_data:

    points = [[pt.x, pt.y, pt.z, pt.e, pt.dedx] for pt in particle.pcloud]
    
    if len(points) < 1: continue
    
    points_array.append(np.array(points))

# Concatenate a list of points for all particles
points = np.concatenate(points_array)

print('Total:',len(points),'points',np.sum(points[:,3]),' MeV for',input_data.size(),'particles')

# Create a scatter plot with associated energy in MeV as hovertext
trace = go.Scatter3d(x=points[:,0], y=points[:,1], z=points[:,2],
                     mode='markers',
                     marker=dict(size=1, color=points[:,3], opacity=0.3),
                     hovertext=['%f MeV' % pt[3] for pt in points]                      
                    )
fig = go.Figure(data=trace)
fig.show()
Total: 20246 points 1343.614154319449  MeV for 2236 particles

… or distinguishing each particle by a color? Let’s only plot particles with some size.

import plotly
colors = plotly.colors.qualitative.Light24

traces = []
minimum_point_count = 100

for particle in input_data:

    points = [[pt.x, pt.y, pt.z, pt.e, pt.dedx] for pt in particle.pcloud]
    
    if len(points) < minimum_point_count: continue
        
    points = np.array(points)
    color  = colors[particle.part.trackid % len(colors)]
        
    traces.append(go.Scatter3d(x=points[:,0],y=points[:,1],z=points[:,2],
                               mode='markers',
                               name='TID %d PDG %d' % (particle.part.trackid,particle.part.pdg),
                               marker=dict(size=1,color=color,opacity=0.3),
                               hovertext=['EDep %f' % pt[3] for pt in points],
                              )
                 )
    
fig = go.Figure(data=traces)
fig.show()

Accessing and Interpreting the Ouput#

The output is defined and created by SuperaAtomic and there is nothing specific about edep2supera. But we can look at a few examples.

# Overall energy deposition is stored in _energies attribute
data = label._energies

# data is VoxelSet type, essentially an array of voxel ID and value = energy.
# We need to take an extra step to translate each voxel ID into XYZ coordinate
xyz = [meta.position(vox.id()) for vox in data.as_vector()]
xyz = [[pt.x, pt.y, pt.z] for pt in xyz]
xyz = np.array(xyz)

energy = [vox.value() for vox in data.as_vector()]
energy = np.array(energy)

trace = go.Scatter3d(x=xyz[:,0],y=xyz[:,1],z=xyz[:,2],
                     mode='markers',
                     marker=dict(size=1,color=energy,opacity=0.3),
                    )
fig = go.Figure(data=trace)
fig.show()

Perhaps let’s access something unique to the output. An example is a semantic label, which is discrete categorization of pixel types. In the below example we ignore pixels of a type kShapeLEScatter which represents many small fragments of energy depositions. This allows us to more easily see the trunk of a shower.

from ROOT import supera

# semantic data is stored in an attribute _semanticLabels
# It is of VoxelSet type just like _energies.
data = label._semanticLabels

semantic = [vox.value() for vox in label._semanticLabels.as_vector()]
semantic = np.array(semantic).astype(np.int32)

# Create a pixel mask
mask = np.where(~(semantic==supera.kShapeLEScatter))
# Use the same points (xyz and energy) but apply a mask
trace = go.Scatter3d(x=xyz[mask][:,0],y=xyz[mask][:,1],z=xyz[mask][:,2],
                     mode='markers',
                     marker=dict(size=1,color=energy[mask],opacity=0.3),
                    )
fig = go.Figure(data=trace)
fig.show()

We can construct the same view from individual particle information that is also stored in the output. Each particle also stores a semantic type.

traces = []

for particle in label.Particles():
    
    if particle.part.shape == supera.kShapeLEScatter:
        continue
        
    color = colors[particle.part.id % len(colors)]        
    data  = particle.energy
    
    xyz = [meta.position(vox.id()) for vox in data.as_vector()]
    xyz = [[pt.x, pt.y, pt.z] for pt in xyz]
    xyz = np.array(xyz)

    energy = [vox.value() for vox in data.as_vector()]
    energy = np.array(energy)

    traces.append(go.Scatter3d(x=xyz[:,0],y=xyz[:,1],z=xyz[:,2],
                               mode='markers',
                               name='ID %d PDG %d' % (particle.part.id,particle.part.pdg),
                               marker=dict(size=1,color=color,opacity=0.3),
                              )
                 )

fig = go.Figure(data=traces)
fig.show()