Numerical optimization for Artificial Retina Algorithm

09/25/2017
by   Maxim Borisyak, et al.
Higher School of Economics

High-energy physics experiments rely on reconstruction of the trajectories of particles produced at the interaction point. This is a challenging task, especially in the high track multiplicity environment generated by p-p collisions at LHC energies. A typical event includes hundreds of signal examples (interesting decays) and a significant amount of noise (uninteresting examples). This work describes a modification of the Artificial Retina algorithm for fast track finding: numerical optimization methods are adopted for fast local track search. This approach allows a considerable reduction of the total computational time per event. Test results on a simplified simulated model of the LHCb VELO (VErtex LOcator) detector are presented. The approach is also well suited to parallel implementation, for example on GPGPUs, which looks very attractive in the context of upcoming detector upgrades.


1 Introduction

Track reconstruction naturally arises in many high-energy physics experiments: events produced by p-p collisions at LHC energies typically include hundreds of signal examples (interesting decays) and a significant amount of noise (uninteresting examples). This makes track reconstruction a challenging task. The substantial increase in collision energy, which increases the number of produced tracks, calls for more sophisticated event selection and reconstruction techniques that rely heavily on track finding procedures. The high computational cost of event reconstruction methods gives an advantage to algorithms designed for massively parallel architectures (e.g. GPU or custom hardware). One such algorithm is the Artificial Retina [1], a pattern-matching algorithm inspired by the structure of receptive fields in the low-level visual recognition areas of mammals [2]. One of the advantages of the algorithm is its extremely high parallelization capacity, which makes it well suited to track finding in high track multiplicity environments [3].

In this work, we study a modification of the Artificial Retina algorithm: it is reformulated as an optimization problem, and well-known methods for global optimization in continuous space are adopted. This approach allows a more flexible trade-off between computational cost and track finding performance and leads to a considerable reduction of the total computational time.

A grid-based method and the proposed method are compared on a simplified model of the LHCb VELO (VErtex LOcator) detector. The model qualitatively reflects the physics in the VELO; its parameters are inspired by those of the Monte-Carlo simulation of the Run 3 upgrade design of the VELO [4].

2 Artificial Retina algorithm

The Artificial Retina (AR) algorithm was proposed as a fast, massively parallel track reconstruction method [1], inspired by the low-level structure of the line and edge detection areas of the visual cortex in the mammalian brain [2]. The algorithm introduces a grid of units (or, continuing the biological analogy, ‘cells’ or ‘neurons’), each corresponding to a particular track configuration (pattern), such as a position and an angle. For each new observation (a set of hits), each cell computes a measure of correspondence between its pattern and the observation.

In the simple case of straight-line detection in 2D space, a line can be represented by two parameters. The units of AR can thus be arranged into a two-dimensional grid, with each unit corresponding to a pattern with its own parameter values. Given the coordinates of a set of hits, the activation (response) of a unit is typically defined as:

(1)
(2)

where:

  • — the distance between hit and the line defined by parameters ;

  •  — bandwidth parameter.

As can be seen from (2), the activation roughly corresponds to the number of hits that agree with the pattern’s parameters. Typically, a set of hits aligned along a line activates a cluster of units, with the maximal activation in the unit whose pattern parameters are closest to the track’s parameters.
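With a Gaussian kernel, the usual choice for retina-type algorithms, a response consistent with the roles of (1) and (2) can be written as below. The symbols are notation assumed here rather than taken from the text: $\theta$ for the pattern parameters, $x_i$ for the coordinates of hit $i$, $\rho$ for the distance and $\sigma$ for the bandwidth. The exact scaling inside the exponent is likewise an assumption.

```latex
\begin{align*}
R(\theta) &= \sum_{i} r(\theta, x_i), \\
r(\theta, x_i) &= \exp\!\left(-\frac{\rho^{2}(\theta, x_i)}{\sigma^{2}}\right).
\end{align*}
```

In this form each hit lying exactly on the pattern contributes 1 to the sum, and a hit several bandwidths away contributes almost nothing.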

The bandwidth parameter controls the smoothness of the response function (2) and should be set according to the grid step and the hit position errors, e.g. . As shown in figure 2, high values of the bandwidth may merge two close tracks, whereas, when the hit coordinates carry uncertainties, low values may lead to several clusters being activated by a single track.

Figure 1: An example of Artificial Retina response: (a) an event with two tracks (10 hits each) and 20 uniformly distributed noise hits; hits are denoted by dots, true tracks by dashed lines, detector planes by solid lines; (b) response of the Artificial Retina grid for , where the track is parametrized by its angle to the horizontal line and its offset; the two local maxima are close to the true track parameters. The distance between the track and a hit is defined as the Euclidean distance in the corresponding detector plane.
Figure 2: Examples of two close tracks: (a) input event with two tracks (20 hits each), with noise applied to the x-coordinate of the hits; response of the Artificial Retina: (b) for a small bandwidth; (c) for a bandwidth comparable to the noise dispersion; (d) for a high bandwidth, where the tracks are merged.

Most of the known AR algorithms rely on computing the response of the whole grid. The usual steps of such algorithms are:

  1. define track model, parameter grid, distance measure;

  2. compute responses (2) for each unit of the grid;

  3. select clusters of activated units;

  4. for each cluster estimate track parameters.
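The four steps above can be sketched for the 2D straight-line case. This is a minimal illustration under stated assumptions, not the implementation used in the study: the line model `x = k*z + b` with hits given as `(z, x)` pairs, the Gaussian response, the function names, and the local-maximum rule used in place of cluster selection are all hypothetical choices.

```python
import math

def response(k, b, hits, sigma):
    """Retina response of the line x = k*z + b to hits given as (z, x) pairs."""
    return sum(math.exp(-((x - (k * z + b)) ** 2) / sigma ** 2) for z, x in hits)

def grid_search(hits, k_values, b_values, sigma, threshold):
    """Evaluate every grid unit and keep local maxima above the threshold."""
    # Step 2: response of each unit of the (k, b) grid.
    grid = [[response(k, b, hits, sigma) for b in b_values] for k in k_values]
    found = []
    for i, k in enumerate(k_values):
        for j, b in enumerate(b_values):
            r = grid[i][j]
            if r < threshold:
                continue
            # Steps 3-4 (simplified): a unit is reported when no neighbouring
            # unit has a higher response; its parameters are the estimate.
            neighbours = [
                grid[p][q]
                for p in range(max(i - 1, 0), min(i + 2, len(k_values)))
                for q in range(max(j - 1, 0), min(j + 2, len(b_values)))
                if (p, q) != (i, j)
            ]
            if all(r >= n for n in neighbours):
                found.append((k, b, r))
    return found

# Toy event (hypothetical): one noiseless track x = 0.5*z + 1 crossing 10 planes.
hits = [(float(z), 0.5 * z + 1.0) for z in range(10)]
k_values = [i * 0.1 for i in range(-10, 11)]   # step 1: parameter grid
b_values = [i * 0.25 for i in range(-8, 9)]
tracks = grid_search(hits, k_values, b_values, sigma=0.5, threshold=5.0)
```

The cost is one response evaluation per grid unit, which is the quantity the optimization-based variant tries to reduce.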

One may notice that the AR algorithm can be reformulated as an optimization problem: finding all local maxima of the response function (1) with respect to the track parameters. From this perspective, the AR algorithms described above employ a brute-force approach, namely grid-search in parameter space. In this work we examine one family of methods that can be substituted for grid-search: first- and second-order optimization procedures. One crucial observation is that computing the gradient and Hessian matrix of (1) with respect to the track parameters imposes a relatively small overhead, which may bring significant benefits to methods that can use this information, e.g. gradient descent.
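To see why the derivatives are cheap, consider a Gaussian response of the form $r_i = \exp(-\rho_i^2/\sigma^2)$ summed over hits. This form and the symbols $\theta$, $\rho_i$, $\sigma$ are assumptions made here for illustration. Differentiating with respect to the track parameters gives:

```latex
\begin{align*}
\nabla_\theta R &= \sum_i r_i \left(-\frac{2\rho_i}{\sigma^2}\right) \nabla_\theta \rho_i, \\
\nabla^2_\theta R &= \sum_i r_i \left[
  \left(\frac{4\rho_i^2}{\sigma^4} - \frac{2}{\sigma^2}\right)
  \nabla_\theta \rho_i \, \nabla_\theta \rho_i^{\top}
  - \frac{2\rho_i}{\sigma^2} \nabla^2_\theta \rho_i \right].
\end{align*}
```

Every term reuses the same $r_i$ and $\rho_i$, so the exponential, typically the most expensive operation, is evaluated only once per hit.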

However, the problem of finding all local maxima of the response function (1) is intrinsically non-convex and non-local, hence global optimization strategies must be adopted. One such strategy is the multi-start algorithm [5], which allocates a number of initial guesses drawn from a prior distribution and then sequentially updates each of them. In this study the number of updates for each initial guess is fixed.

Pseudo-code for the proposed method is shown in Algorithm 1, where the function step denotes the selected optimization procedure.

function accelerated-artificial-retina(hits, N, T)
     for i = 1, ..., N do
         draw initial guess θ_i from prior distribution
     end for
     for i = 1, ..., N do
         for t = 1, ..., T do
              compute response R(θ_i), gradient ∇R(θ_i) and Hessian H(θ_i)
              θ_i ← step(θ_i, R(θ_i), ∇R(θ_i), H(θ_i))
         end for
     end for
     cluster solutions
     for each cluster select solution with highest response and above threshold
     return selected solutions
end function
Algorithm 1 Accelerated Artificial Retina algorithm
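The structure of Algorithm 1 can be sketched as below. Several choices are assumptions made for illustration rather than details of the study: the 2D line model `x = k*z + b`, the Gaussian response, the uniform prior on [-1, 1], and greedy merging by distance as a stand-in for clustering. Plain gradient ascent replaces the optimization procedure here, so only the gradient is used and no Hessian is computed.

```python
import math
import random

def response_and_gradient(k, b, hits, sigma):
    """Response R(k, b) of the line x = k*z + b and its gradient (dR/dk, dR/db)."""
    value, grad_k, grad_b = 0.0, 0.0, 0.0
    for z, x in hits:
        d = x - (k * z + b)                 # signed residual in the plane at z
        r = math.exp(-d * d / sigma ** 2)   # Gaussian contribution of this hit
        value += r
        grad_k += r * 2.0 * d * z / sigma ** 2
        grad_b += r * 2.0 * d / sigma ** 2
    return value, grad_k, grad_b

def multi_start(hits, sigma, n_starts, n_steps, lr, threshold, merge_radius, rng):
    """Multi-start gradient ascent followed by greedy merging of solutions."""
    solutions = []
    for _start in range(n_starts):
        # Initial guess drawn from the (assumed uniform) prior.
        k, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        for _step in range(n_steps):        # fixed number of updates per guess
            _, gk, gb = response_and_gradient(k, b, hits, sigma)
            k, b = k + lr * gk, b + lr * gb
        value, _, _ = response_and_gradient(k, b, hits, sigma)
        solutions.append((value, k, b))
    # Visit solutions from highest to lowest response; keep one per neighbourhood.
    solutions.sort(reverse=True)
    selected = []
    for value, k, b in solutions:
        if value < threshold:
            break
        if all(math.hypot(k - k0, b - b0) > merge_radius for _, k0, b0 in selected):
            selected.append((value, k, b))
    return selected

# Toy event (hypothetical): one noiseless track x = 0.5*z + 0.2, 10 planes in [0, 1].
hits = [(i / 9.0, 0.5 * (i / 9.0) + 0.2) for i in range(10)]
found = multi_start(hits, sigma=0.3, n_starts=40, n_steps=500, lr=0.003,
                    threshold=5.0, merge_radius=0.1, rng=random.Random(0))
```

Initial guesses that start far from every hit see an almost flat response and barely move, which is why the threshold on the final response is needed. A second-order method would typically need far fewer steps per guess than the 500 used in this sketch.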

3 Simplified LHCb VELO model and experiment

We illustrate the application of the Artificial Retina to tracking on a simplified model of the LHCb Vertex Locator (VELO) detector. This simplified model (sVELO) is inspired by the VELO upgrade Technical Design Report [4] and aims to capture all VELO details that are essential from the tracking point of view.

At LHCb, two crossing beams result in proton-proton collisions that produce numerous secondary particles. The collision point (called the primary vertex) is far from the magnet, so the trajectories of secondary particles can be considered straight lines. The VELO detector surrounds the interaction region and consists of layers perpendicular to the beam axis (z-axis). Each layer consists of a number of silicon detectors (pixels) that react to charged particles crossing the material.

An event consists of the coordinates of triggered pixels (hits), each either activated by a secondary particle or produced by noise.

In sVELO we assume that:

  • number of layers ;

  • each layer is a disk with radii: outer , inner ;

  • layers are equally spaced within 700 mm along the z-axis;

  • particles are travelling in straight lines;

  • pseudo-rapidity of particles is distributed uniformly ;

  • angle in the transverse plane is distributed uniformly: ;

  • each particle has a probability of interacting with a detector layer;

  • particles that leave less than hits are considered as undetectable and their hits are marked as noise;

  • errors of coordinate measurements are distributed normally: ;

  • uniformly distributed hits are introduced in each event with .

In the experiment, the number of reconstructible tracks was varied between 50 and 350.

The track is parametrized by two angles and :

The distance function is defined as the Euclidean distance between the hit and the intersection of the track with the hit’s detector layer.
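This distance can be sketched as follows. It is an illustration under assumptions the text does not spell out: the track starts at the origin, the two angles are taken to be the polar angle `theta` measured from the beam (z) axis and the azimuthal angle `phi` in the transverse plane, and a hit is an `(x, y, z)` triple. The helper names are hypothetical.

```python
import math

def pseudorapidity_to_theta(eta):
    """Polar angle for a given pseudo-rapidity, from eta = -ln(tan(theta / 2))."""
    return 2.0 * math.atan(math.exp(-eta))

def track_intersection(theta, phi, z_layer):
    """(x, y) point where a straight track from the origin crosses the layer at z_layer."""
    radius = z_layer * math.tan(theta)
    return radius * math.cos(phi), radius * math.sin(phi)

def hit_distance(theta, phi, hit):
    """Euclidean distance, within the hit's layer, between the hit and the track."""
    x, y, z = hit
    track_x, track_y = track_intersection(theta, phi, z)
    return math.hypot(x - track_x, y - track_y)

# A hit displaced by (3, 4) from the track within its layer lies 5 units away.
theta, phi = 0.1, 0.3
cross_x, cross_y = track_intersection(theta, phi, 100.0)
offset_hit = (cross_x + 3.0, cross_y + 4.0, 100.0)
distance = hit_distance(theta, phi, offset_hit)
```

Because the distance is measured inside the layer plane, it needs only one tangent and one sine/cosine pair per track, which can be shared across all layers.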

A track is considered to be reconstructed if the algorithm reports an estimate within radians of the true track’s parameters (which is comparable to the angular size of a VELO pixel).

In the experiment we found that, for this distance function, the routine computing the AR response together with its gradient and Hessian matrix takes less than 3 times longer than computing the response alone (); a normalization constant is used to account for the time differences between routines. To show the reduction in total computational time relative to plain grid-search, we set the number of initial guesses so that the computational resources are a fixed fraction of those required by grid-search:

(3)

where:

  • — number of cells required for plain grid-search to provide resolution,

  • number of optimization steps,

  • — time required by each optimization step normalized by .

Among all methods examined during the experiment, the Truncated Newton method [6, 7] was found to yield the best results. For this method the normalization constant . During the experiments we found that a slight improvement of the results can be achieved by updating the bandwidth parameter with each optimization step; sequence was used in this study.
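The budget relation (3) can be illustrated with a short calculation. The reading used here is an assumption: the cost of all optimization steps over all initial guesses, expressed in units of one response evaluation, equals a chosen fraction of the grid-search cost. The function name, argument names and numbers are hypothetical.

```python
import math

def initial_guess_budget(n_grid_cells, n_steps, step_cost, fraction):
    """Number of initial guesses that fits in `fraction` of the grid-search budget.

    One response evaluation is the unit of cost: grid-search spends
    n_grid_cells units, each initial guess spends n_steps * step_cost units.
    """
    return math.floor(fraction * n_grid_cells / (n_steps * step_cost))

# Hypothetical numbers, chosen only for illustration.
n_guesses = initial_guess_budget(n_grid_cells=10_000, n_steps=5,
                                 step_cost=3.0, fraction=1 / 3)
```

With these numbers a grid of 10,000 cells leaves room for 222 initial guesses, so the number of tracks that can be found is bounded well below the grid size.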

Figure 3: Experimental results, with efficiency shown in panels (a) and (b) for two different settings. The horizontal axis corresponds to the number of reconstructible tracks, the vertical axis to the efficiency (the fraction of reconstructed tracks).

Results are shown in figure 3. Generally, the efficiency of the algorithm is high. Figure 3(b) shows clearly that the efficiency decreases as the number of initial seeds approaches the number of tracks. Nevertheless, with enough initial seeds (figure 3(a)) the efficiency is close to 1, while the whole procedure requires only one third of the resources required by the grid-search method.

4 Conclusion

In this work we examined a modification of the Artificial Retina algorithm that adopts continuous-space optimization methods and a multi-start procedure. The high convergence rates of these methods outweigh the computational cost of the gradient and Hessian matrix, which results in an overall reduction of the total computation time.

Experiments on a simplified model of the LHCb VErtex LOcator detector showed that it is possible to keep the track reconstruction efficiency above 95% while the proposed method reduces the computational time by a factor of 3 compared to the grid-search based Artificial Retina algorithm.

References

  • [1] Ristori L 2000 Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 453 425–429
  • [2] Hubel D H and Wiesel T N 1962 The Journal of physiology 160 106–154
  • [3] Abba A, Bedeschi F, Citterio M, Caponio F, Cusimano A, Geraci A, Marino P, Morello M, Neri N, Punzi G et al. 2015 Journal of Instrumentation 10 C03008
  • [4] LHCb Collaboration and others 2013 LHCb VELO upgrade technical design report Tech. rep.
  • [5] Martí R 2003 Multi-start methods Handbook of metaheuristics (Springer) pp 355–368
  • [6] Nocedal J and Wright S J 2006 Numerical Optimization 2nd ed (Springer)
  • [7] Nash S G 1984 SIAM Journal on Numerical Analysis 21 770–788