1 Introduction
Track reconstruction arises naturally in many high-energy physics experiments: events produced by pp collisions at LHC energies typically include hundreds of signal examples (interesting decays) and a significant amount of noise (uninteresting examples), which makes track reconstruction a challenging task. The substantial increase in collision energy, which leads to an increase in the number of produced tracks, calls for more sophisticated event selection and reconstruction techniques that rely heavily on track finding procedures. The high computational cost of event reconstruction methods gives an advantage to algorithms designed for massively parallel architectures (e.g. GPUs or custom hardware). One such algorithm is the Artificial Retina [1], a pattern-matching algorithm inspired by the structure of low-level visual recognition areas in mammalian receptive fields [2]. One of the advantages of the algorithm is its extremely high parallelization capacity, which makes it well suited to track finding in high track multiplicity environments [3].
In this work, we study a modification of the Artificial Retina algorithm: it is reformulated as an optimization problem, and well-known methods for global optimization in continuous space are adopted. This approach allows a more flexible trade-off between computational cost and track finding performance and leads to a considerable reduction of total computational time.
The grid-based and the proposed methods are compared on a simplified model of the LHCb VELO (VErtex LOcator) detector. The model qualitatively reflects the physics in the VELO; its parameters are inspired by those of the Monte Carlo simulation of the Run 3 upgrade design of the VELO [4].
2 Artificial Retina algorithm
The Artificial Retina (AR) algorithm was proposed as a fast, massively parallel track reconstruction method [1], inspired by the low-level structure of the line and edge detection areas of the visual cortex in the mammalian brain [2]. The algorithm introduces a grid of units (or, continuing the biological analogy, ‘cells’ or ‘neurons’), each corresponding to a particular track configuration (pattern), such as position and angle. For each new observation (a set of hits), each cell computes a measure of correspondence between its pattern and the observation.
In the simple case of straight line detection in 2D space, a line can be represented by two parameters. Thus, the units of the AR can be arranged into a two-dimensional grid, with each unit corresponding to a pattern with its own pair of parameter values. Given the coordinates of a set of hits, the activation (response) of a unit is typically defined as:
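$$R(\theta) = \sum_i r(\theta, x_i) \qquad (1)$$
$$r(\theta, x_i) = \exp\left(-\frac{\rho^2(\theta, x_i)}{\sigma^2}\right) \qquad (2)$$
where:
$\theta$ denotes the pattern parameters of the unit and $x_i$ the coordinates of the $i$-th hit;
$\rho(\theta, x_i)$ is the distance between hit $x_i$ and the line defined by parameters $\theta$;
$\sigma$ is the bandwidth parameter.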
As can be seen from (2), the activation roughly corresponds to the number of hits that agree with the pattern’s parameters. Typically, a set of hits aligned along a line activates a cluster of units, with the maximal activation in the unit whose pattern parameters are closest to the track’s parameters.
The bandwidth parameter controls the smoothness of the response function (2) and should be set according to the grid step and the hit position errors. As shown in figure 1(a), high bandwidth values may result in two close tracks merging, whereas low values, when the hit coordinates are uncertain, may lead to several clusters being activated by a single track.
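As an illustration, the grid-based response computation can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation used in the study: it assumes the Gaussian form of the unit response commonly used for the Artificial Retina, a line parametrized by a slope and an intercept, and the vertical residual as the hit-to-line distance. All function names are hypothetical.

```python
import math

def residual(slope, intercept, hit):
    """Assumed distance: vertical offset of hit (x, y) from y = slope * x + intercept."""
    x, y = hit
    return y - (slope * x + intercept)

def response(slope, intercept, hits, sigma):
    """Response of one unit: sum over hits of exp(-distance^2 / sigma^2)."""
    return sum(math.exp(-residual(slope, intercept, h) ** 2 / sigma ** 2)
               for h in hits)

def grid_response(hits, slopes, intercepts, sigma):
    """Response of every unit on a rectangular (slope, intercept) grid."""
    return [[response(a, b, hits, sigma) for b in intercepts] for a in slopes]

def strongest_unit(grid):
    """Indices (i, j) of the unit with the largest response."""
    return max(((i, j) for i, row in enumerate(grid) for j in range(len(row))),
               key=lambda ij: grid[ij[0]][ij[1]])
```

With hits lying exactly on a line whose parameters coincide with a grid node, every hit contributes 1 to that unit, so its response equals the number of hits. Neighbouring units receive smaller responses that decay with the bandwidth `sigma`.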
Figure 1. (a) An event with two tracks (10 hits each) and 20 uniformly distributed noise hits; hits are denoted by dots, true tracks by dashed lines, and detector planes by solid lines. (b) Response of the Artificial Retina grid; the track is parametrized by its angle to the horizontal line and its offset; the two local maxima are close to the true track parameters; the distance between the track and a hit is defined as the Euclidean distance in the corresponding detector plane.
Most of the known AR algorithms rely on computing the response of the whole grid. The usual steps of such algorithms are:
One may notice that the AR algorithm can be reformulated as an optimization problem: finding all local maxima of the response function (1) with respect to the track parameters. From this perspective, the AR algorithms described above employ a brute-force approach, namely grid search in parameter space. In this work we examine one family of methods that can be substituted for grid search: first- and second-order optimization procedures. One crucial observation is that computing the gradient and the Hessian matrix of (1) with respect to the track parameters imposes a relatively small overhead, which may bring significant benefits to methods that can use this information, e.g. gradient descent.
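To make the overhead argument concrete, the derivatives can be written out under an assumed form of the response. The sketch below assumes a Gaussian unit response $r_i = \exp(-\rho_i^2/\sigma^2)$ and a differentiable (signed) distance $\rho_i(\theta) = \rho(\theta, x_i)$; it is a worked derivation under these assumptions rather than the exact expressions used in the study.

```latex
% Assumption: r_i(\theta) = \exp(-\rho_i^2(\theta) / \sigma^2), R(\theta) = \sum_i r_i(\theta).
\begin{align*}
\nabla_\theta R &= -\frac{2}{\sigma^2} \sum_i r_i \, \rho_i \, \nabla_\theta \rho_i, \\
\nabla^2_\theta R &= \sum_i r_i \left[ \frac{4 \rho_i^2}{\sigma^4} \,
    \nabla_\theta \rho_i \, \nabla_\theta \rho_i^{\top}
  - \frac{2}{\sigma^2} \left( \nabla_\theta \rho_i \, \nabla_\theta \rho_i^{\top}
  + \rho_i \, \nabla^2_\theta \rho_i \right) \right].
\end{align*}
```

Every term reuses the exponential $r_i$ that is already evaluated for the response itself, so in this sketch the additional work is limited to the derivatives of the distance function.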
However, the problem of finding all local maxima of the response function (1) is intrinsically non-convex and non-local, hence global optimization strategies must be adopted. One such strategy is the multistart algorithm [5], which draws a number of initial guesses from a prior distribution and then sequentially updates each of them. In this study, the number of updates for each initial guess is fixed.
Pseudocode for the proposed method is shown in listing 1, where the update function denotes the selected optimization procedure.
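The multistart scheme described above can also be sketched in Python for the 2D straight-line case. This is an illustrative sketch only: plain gradient ascent stands in for the selected optimization procedure, the response is assumed to be Gaussian in the vertical residual, the prior is assumed to be uniform, and all names and default values are hypothetical.

```python
import math
import random

def response_and_gradient(params, hits, sigma):
    """Response R and its gradient w.r.t. (slope, intercept), sharing the exp terms."""
    a, b = params
    total, grad_a, grad_b = 0.0, 0.0, 0.0
    for x, y in hits:
        d = y - (a * x + b)                    # signed residual of this hit
        r = math.exp(-d * d / sigma ** 2)      # assumed Gaussian weight
        total += r
        grad_a += r * 2.0 * d * x / sigma ** 2
        grad_b += r * 2.0 * d / sigma ** 2
    return total, (grad_a, grad_b)

def ascend(start, hits, sigma, n_steps, step_size):
    """Apply a fixed number of gradient-ascent updates to one initial guess."""
    a, b = start
    for _ in range(n_steps):
        _, (ga, gb) = response_and_gradient((a, b), hits, sigma)
        a, b = a + step_size * ga, b + step_size * gb
    return (a, b)

def draw_initial_guesses(n, rng, low=-1.0, high=1.0):
    """Assumed prior: independent uniform slope and intercept in [low, high]."""
    return [(rng.uniform(low, high), rng.uniform(low, high)) for _ in range(n)]

def multistart(initial_guesses, hits, sigma, n_steps, step_size):
    """Refine every initial guess; return a list of (parameters, response) pairs."""
    results = []
    for start in initial_guesses:
        params = ascend(start, hits, sigma, n_steps, step_size)
        value, _ = response_and_gradient(params, hits, sigma)
        results.append((params, value))
    return results
```

Guesses that start far from any track see an almost flat response and barely move within a fixed number of steps, which is why the number of initial guesses matters. Candidates with a high final response can then be kept as track estimates.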
3 Simplified LHCb VELO model and experiment
We illustrate the application of the Artificial Retina to tracking using a simplified model of the LHCb Vertex Locator (VELO) detector. This simplified model (sVELO) is inspired by the VELO upgrade Technical Design Report [4] and aims to capture all the VELO details that are essential from the tracking point of view.
At LHCb, two crossing beams result in proton-proton collisions, which produce numerous secondary particles. The collision point (called the primary vertex) is far from the magnet, so the trajectories of secondary particles can be considered straight lines. The VELO detector surrounds the interaction region and consists of layers perpendicular to the beam axis (z-axis). Each layer consists of a number of silicon detectors (pixels) that respond to charged particles crossing the material.
An event consists of the coordinates of the triggered pixels (hits), each activated either by a secondary particle or by noise.
In sVELO we assume that:

number of layers ;

each layer is a disk with radii: outer , inner ;

layers are equally spaced within 700 mm along the beam axis;

particles are travelling in straight lines;

pseudorapidity of particles is distributed uniformly ;

the angle in the transverse plane is distributed uniformly: ;

each particle has a probability of interacting with each detector layer;
particles that leave too few hits are considered undetectable, and their hits are marked as noise;

errors of coordinate measurements are distributed normally: ;

uniformly distributed hits are introduced in each event with .
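The assumptions above can be turned into a toy event generator. The sketch below is only an illustration of the listed rules: every numeric value in `PARAMS` except the 700 mm span is a hypothetical placeholder rather than a setting from the study, tracks are assumed to start at the origin, and the first layer is assumed to sit one spacing step away from it.

```python
import math
import random

# Hypothetical placeholder values (only z_span comes from the text).
PARAMS = {
    "n_layers": 20,
    "z_span": 700.0,                  # mm
    "r_inner": 5.0, "r_outer": 50.0,  # mm
    "eta_range": (2.0, 5.0),
    "hit_prob": 0.98,
    "min_hits": 3,
    "sigma_xy": 0.01,                 # mm
    "n_noise": 30,
}

def generate_event(n_tracks, rng, p=PARAMS):
    """Return hits as (x, y, z, label); label is a track index, or -1 for noise."""
    step = p["z_span"] / p["n_layers"]
    layers = [step * (k + 1) for k in range(p["n_layers"])]
    hits = []
    for track in range(n_tracks):
        eta = rng.uniform(*p["eta_range"])
        phi = rng.uniform(0.0, 2.0 * math.pi)
        tan_theta = math.tan(2.0 * math.atan(math.exp(-eta)))
        track_hits = []
        for z in layers:
            x = z * tan_theta * math.cos(phi)
            y = z * tan_theta * math.sin(phi)
            inside = p["r_inner"] <= math.hypot(x, y) <= p["r_outer"]
            if inside and rng.random() < p["hit_prob"]:
                track_hits.append((x + rng.gauss(0.0, p["sigma_xy"]),
                                   y + rng.gauss(0.0, p["sigma_xy"]), z))
        # Tracks with too few hits are undetectable: their hits count as noise.
        label = track if len(track_hits) >= p["min_hits"] else -1
        hits.extend((x, y, z, label) for x, y, z in track_hits)
    for _ in range(p["n_noise"]):
        # Uniform in area over the annulus of a randomly chosen layer.
        r = math.sqrt(rng.uniform(p["r_inner"] ** 2, p["r_outer"] ** 2))
        angle = rng.uniform(0.0, 2.0 * math.pi)
        hits.append((r * math.cos(angle), r * math.sin(angle),
                     rng.choice(layers), -1))
    return hits
```

Passing a seeded generator, for example `generate_event(50, random.Random(0))`, makes an event reproducible.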
In the experiment, the number of reconstructible tracks was varied between 50 and 350.
The track is parametrized by two angles and :
The distance function is defined as the Euclidean distance between the hit and the intersection of the track with the hit’s detector layer.
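A distance function of this kind can be sketched as follows. The sketch assumes that a track starts at the origin and is described by a polar angle `theta` (measured from the beam axis) and an azimuthal angle `phi`; this parametrization and the function names are assumptions made for illustration and may differ from the one used in the study.

```python
import math

def layer_intersection(theta, phi, z_layer):
    """Point (x, y) where a straight track from the origin crosses the layer at z_layer."""
    r = z_layer * math.tan(theta)
    return r * math.cos(phi), r * math.sin(phi)

def hit_distance(theta, phi, hit):
    """Euclidean distance, within the hit's layer, between the hit and the track."""
    x, y, z = hit
    track_x, track_y = layer_intersection(theta, phi, z)
    return math.hypot(x - track_x, y - track_y)
```

Because the distance is measured inside the layer that recorded the hit, no closest-point search along the track is needed: each hit requires a single intersection computation.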
A track is considered reconstructed if the algorithm reports an estimate within radians of the true track’s parameters (which is comparable to the angular size of a VELO pixel).
In the experiment we found that, for this distance function, the routine computing the AR response together with its gradient and Hessian matrix takes less than 3 times longer than computing the response alone; a normalization constant is used to account for the time difference between routines. To show the reduction in total computational time relative to plain grid search, we set the number of initial guesses so that the computational resources are a fraction of those required by grid search:
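$$n_{\mathrm{init}} = \alpha \, \frac{N_{\mathrm{grid}}}{n_{\mathrm{steps}} \, c} \qquad (3)$$
where:
$n_{\mathrm{init}}$ is the number of initial guesses and $\alpha$ is the fraction of the grid search resources;
$N_{\mathrm{grid}}$ is the number of cells required for plain grid search to provide the required resolution;
$n_{\mathrm{steps}}$ is the number of optimization steps;
$c$ is the time required by each optimization step, normalized by the time required to compute the response alone.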
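The budget arithmetic can be illustrated with a short Python helper. It assumes that the total cost of the multistart procedure is the number of initial guesses times the number of steps times the normalized per-step cost, and that this cost is capped at a chosen fraction of the grid size. The function name and the example numbers are hypothetical.

```python
def n_initial_guesses(n_grid_cells, n_steps, step_cost, fraction):
    """Number of starts whose total cost stays within fraction * n_grid_cells.

    Cost is counted in units of one response evaluation: each start performs
    n_steps updates, and each update costs step_cost evaluations.
    """
    return int(fraction * n_grid_cells / (n_steps * step_cost))
```

For example, with a hypothetical grid of 90000 cells, 10 steps per guess, a per-step cost of 3 response evaluations and half of the grid search budget, the helper allows 1500 initial guesses.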
Among all the methods examined in the experiment, the Truncated Newton method [6, 7] was found to yield the best results. For this method the normalization constant . During the experiments we found that a slight improvement of the results can be achieved by updating the bandwidth parameter at each optimization step; a predefined sequence of bandwidth values was used in this study.
Results are shown in figure 3. In general, the efficiency of the algorithm is high. Figure 2(b) clearly shows that the efficiency decreases as the number of initial seeds approaches the number of tracks. Nevertheless, with enough initial seeds (figure 2(a)), the efficiency is close to 1, while the whole procedure requires only one third of the resources required by the grid search method.
4 Conclusion
In this work we examined a modification of the Artificial Retina algorithm that adopts continuous-space optimization methods and a multistart procedure. The high convergence rates of these methods outweigh the computational cost of the gradient and Hessian matrix, which results in an overall reduction of total computation time.
Experiments on a simplified model of the LHCb VErtex LOcator detector showed that it is possible to keep the track reconstruction efficiency above 95% while, thanks to the proposed method, reducing the computational time by a factor of 3 compared to the grid-search-based Artificial Retina algorithm.
References
 [1] Ristori L 2000 Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 453 425–429
 [2] Hubel D H and Wiesel T N 1962 The Journal of physiology 160 106–154
 [3] Abba A, Bedeschi F, Citterio M, Caponio F, Cusimano A, Geraci A, Marino P, Morello M, Neri N, Punzi G et al. 2015 Journal of Instrumentation 10 C03008
 [4] LHCb Collaboration and others 2013 LHCb VELO upgrade technical design report Tech. rep.
 [5] Martí R 2003 Multistart methods Handbook of metaheuristics (Springer) pp 355–368
 [6] Nocedal J and Wright S J 2006 Numerical optimization 2nd ed (Springer)
 [7] Nash S G 1984 SIAM Journal on Numerical Analysis 21 770–788