How it works
Contents
How it works#
In this article we go over how edep2supera
works focusing on 5 points below.
Instantiate and configure the module
Prepare the input
Execute the main functions
Interpreting the input
Interpreting the output
The first three items show the flow of processes to run Supera
with edep-sim
files as input. Then the last two items demonstrate how a user can access the input to and the output from Supera
.
Instantiate and Configure#
Import edep2supera
as well as an example module.
import edep2supera
from edep2supera import config
Welcome to JupyROOT 6.26/06
Instantiate and configure SuperaDriver
for execution of edep2supera
main functions.
# Instantiate
e2s_driver = edep2supera.edep2supera.SuperaDriver()
# Configure the driver
e2s_driver.ConfigureFromFile(config.get_config('tutorial'))
[WARNING] <SuperaDriver::Configure> starting
[WARNING] <SuperaDriver::Configure> BBox config...
[WARNING] <SuperaDriver::Configure> Label config...
What’s going on?
SuperaDriver
is the driver of executing main functions ofedep2supera
A technical detail: it is a wrapper (i.e. C++ inheritance) of
supera::Driver
defined inSuperaAtomic
.
SuperaDriver
needs three configurations includingone for itself (
SuperaDriver.Configure
), and two algorithms:one for creating the meta data of the 3D image volume to be recorded
another for interpreting the input and creating necessary information to optimize
lartpc_mlreco3d
Configuration is done by passing either a configuration file for string text (i.e. the contents of a configuration file) in a yaml format.
In the above, we use
config
module that can provide an example configuration for this tutorial.
One can add a new configuration with a .yaml
extension in the data directory:
edep2supera/python/edep2supera/config_data
and it will be automatically available through a config
module.
To list the list of available configuration files:
config.list_config()
['tutorial']
The contents of a tutorial configuration looks like this:
print(open(config.get_config('tutorial'),'r').read())
# ===============
# tutorial.yaml
# ===============
LogLevel: WARNING
ActiveDetectors: ["TPCActive_shape"]
MaxSegmentSize: 0.03
BBoxAlgorithm: BBoxInteraction
BBoxConfig:
LogLevel: WARNING
Seed: -1
BBoxSize: [2048,2048,2048]
VoxelSize: [0.5,0.5,0.5]
BBoxBottom: [-1000,-1000,-1000]
#WorldBoundMax: [-1.e20,-1.e20,-1.e20]
#WorldBoundMin: [ 1.e20, 1.e20, 1.e20]
LabelAlgorithm: LArTPCMLReco3D
LabelConfig:
LogLevel: WARNING
DeltaSize: 3
ComptonSize: 10
LEScatterSize: 2
TouchDistance: 1
StoreLEScatter: True
SemanticPriority: [1,0,2,3,4]
EnergyDepositThreshold: 0.0
#WorldBoundMax: [-1.e20,-1.e20,-1.e20]
#WorldBoundMin: [ 1.e20, 1.e20, 1.e20]
We don’t go over the full details of the configuration parameters nor the details of the algorithms. For those, refer to the documentation of SuperaAtomic
.
Reading and Preparing Input#
Download an example dataset for this tutorial.
! wget https://web.stanford.edu/~kterao/edep-sim-example.root
--2022-11-16 08:24:27-- https://web.stanford.edu/~kterao/edep-sim-example.root
Resolving web.stanford.edu (web.stanford.edu)... 171.67.215.200, 2607:f6d0:0:925a::ab43:d7c8
Connecting to web.stanford.edu (web.stanford.edu)|171.67.215.200|:443... connected.
HTTP request sent, awaiting response...
200 OK
Length: 6604072 (6.3M)
Saving to: 'edep-sim-example.root'
edep-sim-example.ro 0%[ ] 0 --.-KB/s
edep-sim-example.ro 21%[===> ] 1.34M 6.68MB/s
edep-sim-example.ro 50%[=========> ] 3.16M 7.85MB/s
edep-sim-example.ro 63%[===========> ] 4.01M 5.21MB/s
edep-sim-example.ro 84%[===============> ] 5.35M 5.52MB/s
edep-sim-example.ro 100%[===================>] 6.30M 5.61MB/s in 1.1s
2022-11-16 08:24:29 (5.61 MB/s) - 'edep-sim-example.root' saved [6604072/6604072]
This file is generated by (=is an output of) edep-sim
, which can be easily accessible through ROOT.TChain
. If there are multiple input files, they can be sequentially added using TChain.Add
function. To run Supera, we need to:
Read one event input from the
edep-sim
output fileInterpret
edep-sim
data and generate a formatted input data to runSupera
This (= being an interface to
edep-sim
file) is exactly whyedep2supera
exists.
# Use TChain to read edep-sim
from ROOT import TChain
ch=TChain("EDepSimEvents")
ch.AddFile('edep-sim-example.root')
# Access the first (0-th) input data sample
ch.GetEntry(0)
# Interpret the edep-sim input by handing TG4Event containing all info for this sample.
input_data = e2s_driver.ReadEvent(ch.Event)
Running Supera#
… is done over two steps.
First,
GenerateImageMeta
generates an image meta data.Second,
GenerateLabel
generates the main output, infomration necessary for optimizinglartpc_mlreco3d
.
# Generate image meta data
e2s_driver.GenerateImageMeta(input_data)
# Generate infomration necessary for lartpc_mlreco3d
e2s_driver.GenerateLabel(input_data)
# Access the image meta and output of Supera
meta = e2s_driver.Meta()
label = e2s_driver.Label()
Accessing and Interpreting the Input#
Let’s take a brief look at the input_data
which is of supera::EventInput
type (C++ class defined in SuperaAtomic
). In short, input_data
is an array of particle information.
print('Input contains',input_data.size(),'particles')
Input contains 2236 particles
Each particle has two essential data elements:
Information about this particle
A list of energy deposition 3D points with energy deposited and dE/dX at each point
Let’s access the former first.
# Print the track ID and PDG code for the first 10 particles stored in the list
for i, p in enumerate(input_data):
if i == 10: break
print('Index',i,'Track ID',p.part.trackid,'PDG',p.part.pdg)
# Dump the details of the first particle
print('\nDumping information about the first particle...\n')
print(input_data[0].part.dump())
Index 0 Track ID 0 PDG 11
Index 1 Track ID 1 PDG 22
Index 2 Track ID 2 PDG 11
Index 3 Track ID 3 PDG 22
Index 4 Track ID 4 PDG 22
Index 5 Track ID 5 PDG 22
Index 6 Track ID 6 PDG 11
Index 7 Track ID 7 PDG 22
Index 8 Track ID 8 PDG 22
Index 9 Track ID 9 PDG 22
Dumping information about the first particle...
Particle (PdgCode,TrackID) = (11,0)
... with Parent (11,0)
... + Ancestor (9223372036854775807,kINVALID_TRACKID)
Vertex (x, y, z, t) = (32.0045,-70.0098,665.004,0.911358)
Momentum (px, py, pz) = (-1237.63,-580.933,-618.506)
Initial Energy = 1500.59
Deposit Energy = 0
Creation Process = 0::0
Instance ID = 0
Group ID = kINVALID_INSTANCEID
Interaction ID = kINVALID_INSTANCEID
Type = 3
Shape = 6
Children =
Note that not all information may be filled at this point, in particular various “ID” (except for Track ID which is aleady determined by the previous simulation step edep-sim
).
Next, let’s access energy deposition points.
import plotly.graph_objs as go
import numpy as np
particle = input_data[0]
print('There are',particle.pcloud.size(),'points')
# Create a list of points
points = [[pt.x, pt.y, pt.z, pt.e, pt.dedx] for pt in particle.pcloud]
# ... and turn that into a numpy array (so we can slice)
points = np.array(points)
# Create a scatter plot
trace = go.Scatter3d(x=points[:,0], y=points[:,1], z=points[:,2],
mode='markers',
marker=dict(size=1, color=points[:,3], opacity=0.3),
)
fig = go.Figure(data=trace)
fig.show()
There are 1297 points
… or plot all points togetehr
points_array = []
for particle in input_data:
points = [[pt.x, pt.y, pt.z, pt.e, pt.dedx] for pt in particle.pcloud]
if len(points) < 1: continue
points_array.append(np.array(points))
# Concatenate a list of points for all particles
points = np.concatenate(points_array)
print('Total:',len(points),'points',np.sum(points[:,3]),' MeV for',input_data.size(),'particles')
# Create a scatter plot with associated energy in MeV as hovertext
trace = go.Scatter3d(x=points[:,0], y=points[:,1], z=points[:,2],
mode='markers',
marker=dict(size=1, color=points[:,3], opacity=0.3),
hovertext=['%f MeV' % pt[3] for pt in points]
)
fig = go.Figure(data=trace)
fig.show()
Total: 20246 points 1343.614154319449 MeV for 2236 particles
… or distinguishing each particle by a color? Let’s only plot particles with some size.
import plotly
colors = plotly.colors.qualitative.Light24
traces = []
minimum_point_count = 100
for particle in input_data:
points = [[pt.x, pt.y, pt.z, pt.e, pt.dedx] for pt in particle.pcloud]
if len(points) < minimum_point_count: continue
points = np.array(points)
color = colors[particle.part.trackid % len(colors)]
traces.append(go.Scatter3d(x=points[:,0],y=points[:,1],z=points[:,2],
mode='markers',
name='TID %d PDG %d' % (particle.part.trackid,particle.part.pdg),
marker=dict(size=1,color=color,opacity=0.3),
hovertext=['EDep %f' % pt[3] for pt in points],
)
)
fig = go.Figure(data=traces)
fig.show()
Accessing and Interpreting the Ouput#
The output is defined and created by SuperaAtomic
and there is nothing specific about edep2supera
. But we can look at a few examples.
# Overall energy deposition is stored in _energies attribute
data = label._energies
# data is VoxelSet type, essentially an array of voxel ID and value = energy.
# We need to take an extra step to translate each voxel ID into XYZ coordinate
xyz = [meta.position(vox.id()) for vox in data.as_vector()]
xyz = [[pt.x, pt.y, pt.z] for pt in xyz]
xyz = np.array(xyz)
energy = [vox.value() for vox in data.as_vector()]
energy = np.array(energy)
trace = go.Scatter3d(x=xyz[:,0],y=xyz[:,1],z=xyz[:,2],
mode='markers',
marker=dict(size=1,color=energy,opacity=0.3),
)
fig = go.Figure(data=trace)
fig.show()
Perhaps let’s access something unique to the output. An example is a semantic label, which is discrete categorization of pixel types. In the below example we ignore pixels of a type kShapeLEScatter
which represents many small fragments of energy depositions. This allows us to more easily see the trunk of a shower.
from ROOT import supera
# semantic data is stored in an attribute _semanticLabels
# It is of VoxelSet type just like _energies.
data = label._semanticLabels
semantic = [vox.value() for vox in label._semanticLabels.as_vector()]
semantic = np.array(semantic).astype(np.int32)
# Create a pixel mask
mask = np.where(~(semantic==supera.kShapeLEScatter))
# Use the same points (xyz and energy) but apply a mask
trace = go.Scatter3d(x=xyz[mask][:,0],y=xyz[mask][:,1],z=xyz[mask][:,2],
mode='markers',
marker=dict(size=1,color=energy[mask],opacity=0.3),
)
fig = go.Figure(data=trace)
fig.show()
We can construct the same view from individual particle information that is also stored in the output. Each particle also stores a semantic type.
traces = []
for particle in label.Particles():
if particle.part.shape == supera.kShapeLEScatter:
continue
color = colors[particle.part.id % len(colors)]
data = particle.energy
xyz = [meta.position(vox.id()) for vox in data.as_vector()]
xyz = [[pt.x, pt.y, pt.z] for pt in xyz]
xyz = np.array(xyz)
energy = [vox.value() for vox in data.as_vector()]
energy = np.array(energy)
traces.append(go.Scatter3d(x=xyz[:,0],y=xyz[:,1],z=xyz[:,2],
mode='markers',
name='ID %d PDG %d' % (particle.part.id,particle.part.pdg),
marker=dict(size=1,color=color,opacity=0.3),
)
)
fig = go.Figure(data=traces)
fig.show()