A Wide-area, Low-latency, and Power-efficient 6-DoF Pose Tracking System for Rigid Objects

by Young-Ho Kim, et al.
Siemens Healthineers

Position sensitive detectors (PSDs) offer the possibility of tracking a single active marker's two- (or three-) degree-of-freedom (DoF) position with high accuracy, fast response time, high update frequency, and low latency, all using a very simple signal processing circuit. However, they are not particularly suitable for 6-DoF object pose tracking due to the lack of orientation measurement, limited tracking range, and sensitivity to environmental variation. We propose a novel 6-DoF pose tracking system for rigid objects that requires only a single active marker in view. The proposed system uses a stereo-based PSD pair and multiple inertial measurement units (IMUs). This is done with a practical approach to identify and control the power of infrared light-emitting diode (IR-LED) active markers, with the aim of increasing the tracking work space and reducing power consumption. The proposed tracking system is validated with three different work-space sizes, and its static and dynamic positional accuracy is evaluated using a robotic arm manipulator with three different dynamic motion patterns. The results show a static position root-mean-square (RMS) error of 0.6 mm, a dynamic position RMS error of 0.7-0.9 mm, and an orientation RMS error between 0.04 and 0.9 degrees across the varied dynamic motions. Overall, our proposed tracking system is capable of tracking a rigid object's pose with sub-millimeter accuracy in the mid range of the work space and sub-degree accuracy over the entire work space in a lab setting.




I Introduction

Position sensitive detectors (PSDs) have become one of the most important components in 2- or 3-degree-of-freedom (DoF) position tracking systems for various applications [lee10psd, ivan12psd2d, qu19psd, yang17psd] (e.g., primarily laser-based measurement of distance, displacement, and vibration), because they provide high position resolution and fast response time while being cost-effective and requiring only simple signal conditioning circuits.

Despite these advantages, PSDs are not generally used for 6-DoF object pose tracking (position and orientation), for three main reasons: 1) a single marker's orientation cannot be captured using a PSD alone; 2) the multi-marker recognition required for orientation estimation is not readily feasible, due to the markers' line-of-sight constraints and the extra computation needed to identify the markers and compute the pose; and 3) the positional accuracy of the active light source, usually a light-emitting diode (LED), is sensitive to interference such as ambient light.

In this paper, we propose a new 6-DoF pose tracking system that accurately tracks low-power active markers within a large work space at a high update rate and with low latency. More specifically, the system has the following characteristics and novelties:

  • The proposed system comprises two main components: (1) a tracking-base unit (TBU), housing two PSDs arranged with a fixed baseline to allow for position triangulation, and an inertial measurement unit (IMU) to establish a reference coordinate system; (2) a compact tracked-object unit (TOU), which is a set of active markers consisting of multiple infrared light-emitting diodes (IR-LEDs) along with an IMU.

  • A practical IR-LED identification methodology is proposed for low-latency tracking and noise filtering. This addresses the limitation that a PSD can sense the position of only one active marker at a time and is vulnerable to external IR ambient noise.

  • An adaptable incident power control method is introduced to ensure optimal power consumption for the active markers and increase the tracking range.

  • An overall calibration method using a highly accurate positioning system, such as a commercial robotic manipulator, is provided to properly fuse the rotational and translational information read from the PSDs and IMUs.

  • We finally integrate our tracking system with an ultrasound machine in order to track ultrasound probes, and demonstrate pose tracking performance in a lab environment.

Fig. 1: The proposed system consists of the tracking-base unit (TBU) and the tracked-object unit (TOU). (a)(b) The TOU consists of multiple IR-LEDs (centroid wavelength 940 nm) along with one IMU. The orientation information from the TOU's IMU is transferred to the TBU via a wireless connection. (c)(d) The TBU has two PSDs as a pair, with a band-pass filter and transimpedance circuits. The IR-LEDs' incident light positions on each PSD are digitized by an ADC and passed to a micro-controller via an analog signal processing circuit.

II Background and Related Works

There are three main categories of contact-less technologies for 6-DoF rigid object pose tracking: 1) electromagnetic tracking systems (ETS), 2) optical tracking systems (OTS), and 3) other sensor fusions.

An electromagnetic tracking system (ETS) mainly comprises a stationary magnetic field generator, coil sensors attached to tracked objects, and a control unit. The field generator creates a magnetic field and establishes a reference coordinate system. The coil sensors are attached to the tracked object, and the coils induce voltages due to the magnetic field. The control unit operates the field generator, infers 6-DoF pose information from the coil sensors' voltages, and transfers the information to the host system [chen20tracking]. ETS uses small coil sensors (e.g., 1 mm in diameter and less than 10 mm in length) and does not require line-of-sight clearance. However, the tethered connection between the coil sensors and the control unit is cumbersome, and ETS accuracy is adversely affected by the presence of ferromagnetic objects within the magnetic field of the field generator. Moreover, the tracking distance is limited, and the performance depends on the distance from the field generator [franz14EM, andria20EM].

There exist two typical optical tracking systems (OTS): infrared (IR)-based and video-metric-based tracking systems. Depending on the type of fiducial markers used, IR-based tracking systems can be categorized as either passive (i.e., using retro-reflective material) or active (i.e., IR-emitting). The principle of tracking is based on triangulation and registration of a marker cluster, which is a set of fiducial markers with a known geometry fixed to the tracked object. The 3-DoF location of each marker is estimated via triangulation, while the 3-DoF orientation is determined via registration to the known marker cluster geometry, requiring high-resolution, low-latency cameras and non-trivial computation [chen20tracking].

The number and spatial distribution of the fiducial markers significantly influence the tracking accuracy of an OTS. At least three visible and non-collinear markers are required to uniquely determine the 6-DoF pose of the object. Moreover, the lighting conditions, including natural environmental disturbances (e.g., background illumination), may influence the tracking performance. The fiducial markers forming a marker cluster are usually spatially distributed to provide a large lever arm and thus good rotational accuracy. Marker cluster size and the required direct line-of-sight clearance are among the disadvantages of OTS [glossop09optical, sorriento20opticalEM].

Video-metric-based tracking systems rely primarily on feature detection and image registration techniques. For example, a wearable 6-DoF hand pose tracking system was proposed for virtual reality applications, using blob detection and tracking image processing algorithms for 3-DoF position tracking and an IMU for 3-DoF orientation [andualem17vr]. garon17deep introduced a temporal 6-DoF pose tracking system using deep learning algorithms with data augmentation. deng21posetracking used a particle filter with deep learning methods to estimate the 6-DoF poses of targeted objects from camera images. In another example, dong20tracking demonstrated data-driven methods to estimate a tracked object's pose using a convolutional neural network. These methods do not require markers; however, training samples from the tracked object are needed to properly train the pose inference model. Moreover, their tracking accuracy is still limited compared to OTS and ETS, which are capable of tracking objects' poses with sub-millimeter and sub-degree accuracy at low latency under ideal settings.

Additionally, multiple sensor-fusion-based 6-DoF pose tracking systems have been proposed in the literature. For example, han10ar proposed a mobile robot pose tracking system for an augmented reality application, which integrated pose information from a radiofrequency-based tracking system with that from a vision-based tracking system. Electromagnetic tracking (5-DoF) and IMU (3-DoF) data were fused to arrive at full 6-DoF pose tracking in [dai18magnet]. esslinger20optoIMU proposed a pose tracking system based on an opto-acoustic system and an IMU, in which the fusion is done using a particle filtering approach.

A PSD provides a continuous position measurement of the incident light spot on a surface, featuring a monolithic PIN photodiode with electrodes placed along each side of the square active area near the sensor boundary [hamamatsu]. The currents collected from the electrodes directly provide the incident light position with simple circuitry, offering nanoscale position and time resolutions that lead to a fast response and highly accurate position tracking. lee10psd used two PSD sensors to track 3-DoF position as a stereo vision system. PSDs have been utilized for visual servoing and control applications [ivan12psd2d]. yang17psd used a PSD to measure string vibration, primarily due to its fast response time. qu19psd applied a PSD-based position detection system for closed-loop control of a solar-tracking mobile robot. There is also work in the literature characterizing the performance of PSD-based systems. rodriguez16calibrationLPS proposed a mathematical model and a calibration method for accurate measurements using PSDs. PSD errors are analyzed in terms of component tolerances, temperature variations, signal-to-noise ratio, operational amplifier parameters, and analog-to-digital converter quantization in [rodriguez16analysis]. In addition, Lu19spot presented a quantitative analysis of the position error caused by changes in the light spot diameter and the distance.

III Materials and Methods

Our objectives for the proposed system are to achieve a) high accuracy over a large work space, b) a high update rate along with low latency, c) power efficiency, so that the sensors can run on lithium-ion batteries, and d) cost efficiency, by utilizing readily available components whose research and development costs are already amortized.

Our proposed system consists of two parts, a tracking-base unit (TBU) and a tracked-object unit (TOU), shown in Figure 1. The TOU can have multiple light sources (i.e., active LEDs) along with an inertial measurement unit (IMU). The objective here is to make the TOU as small and as low-power as possible so that it can run on a battery. Furthermore, we included a micro-controller unit with wireless capability to transmit orientation parameters and to receive timing information and other commands from the TBU in real time.

The TBU establishes the reference coordinate system and tracks one or more light spots emitted from the TOU using a pair of PSDs through triangulation. The TBU also contains one IMU that provides reference orientation information for the relative orientation reported wirelessly from the TOU. Specifically, the unique combination of each IR-LED and the IMUs ensures that position is measured by the stereo PSD system in the TBU, while orientation is measured by the IMU of the TOU in reference to the IMU in the TBU. These two IMU-based orientations are coupled and provide a reference coordinate system at the TBU for all measurements.

We made three design decisions as follows:

  • Two PSD sensors are adopted as a stereo vision system instead of two CCD camera modules requiring image processing. PSDs accurately detect the locations of light spots on their surfaces, and the two 2-D detector locations can be triangulated into a three-dimensional position in the stereo reference coordinate system.

  • The estimated active IR-LED position is combined with orientation measurements from the two IMUs in the TOU and TBU: the TBU's IMU establishes the relative orientation in the reference coordinate system, while the rigid object's orientation is measured by the IMU in the TOU.

  • In order to maintain the required line of sight and to track an object rotating fully about its axis, we need to be able to track multiple markers. Since an IR-LED's emitting angle is limited, we instrument the tracked object with a number of IR-LEDs to cover the rotational range, so that at least one IR-LED is in the TBU's line of sight at every angle.

The TBU details are as follows: (1) Two PSD sensors (S5991, Hamamatsu) with optical camera lenses and a band-pass filter are used as a stereo PSD system. (2) Analog signal processing circuits, consisting of transimpedance, summation, addition, and subtraction amplifiers for the four current signals of each PSD (i.e., eight current signals from the two PSDs), are described in Figure 1(d). We chose rail-to-rail, low-noise op-amps with low input bias current, such as the OPA192 and OPA657. We used a large feedback resistor to achieve high PSD sensitivity; as a consequence, the system requires a hardware filter to remove external infrared noise. An analog-to-digital converter (ADC) circuit (AD7606, 6-channel simultaneous sampling) is employed to digitize the analog signals, and we designed an event capture circuit to detect incident values without lag. (3) An IMU (BNO055, Bosch) provides orientation measurements. (4) One micro-controller (CC2650, Texas Instruments) gathers and processes the signals and sends the results to the host computer in real time.

The TOU consists of four components: (1) six IR-LEDs (SFH4775S, Osram); (2) an N-channel power MOSFET (0.5 A) for each IR-LED; (3) one IMU (BNO055, Bosch) rigidly attached to the IR-LEDs; and (4) a micro-controller and the circuitry necessary to feed the LEDs with a pulse-width-modulated (PWM), variable-amplitude constant-voltage source.

Each IMU reports an absolute orientation measurement with respect to the Earth's gravity and magnetic field, obtained by fusing 9-axis data (accelerometer, gyroscope, and magnetometer). The BNO055 has internal fusion software that combines all three sensors with fast calculation, a high output data rate, and robustness to magnetic field distortions other than the Earth's field.

We need to address the following four specific challenges for the proposed system.

  • Our proposed system has multiple IR-LEDs surrounding the TOU, of which at least one must be tracked at any time. However, it is critical to recognize which single IR-LED is emitting from the TOU, because each tracked IR-LED position will be integrated with IMU information to estimate the 6-DoF pose. One possible solution is a closed-loop control system that synchronizes TOU LED firing and TBU reading with a proper hand-shaking mechanism, implemented through either a wireless or a wired connection. However, this takes extra time due to the limited bandwidth for transmitted packets, which is a bottleneck for achieving low latency. Lastly, it is desirable not to have additional wiring between the TOU and TBU.

  • A PSD has a wide spectral response range from 320 to 1100 nm, even though its peak sensitivity is around 960 nm. Thus, the incident light spot always includes unknown natural environmental disturbance, which leads to inaccurate position measurements.

  • An IR-LED's illumination excites a spot on the PSD's 2-D sensor; a transimpedance circuit converts the resulting current to a voltage, which is digitized by an ADC. For large-work-space tracking, three elements matter: the IR-LED's emitted light intensity, the PSD sensitivity level, and the ADC input range. To track the incident light across a large work space, the light intensity must grow quadratically with distance (inverse-square law), but this increases IR-LED power consumption and temperature. The PSD sensitivity level can also be increased with distance, but this makes the system vulnerable to external IR ambient noise. Finally, an increased ADC input range results in decreased ADC resolution.

  • To achieve a 6-DoF pose tracking system using PSDs with IMUs, three calibration stages are required: (1) a stereo PSD calibration to compute depth from the two 2-D PSD positions; (2) alignment of the TOU IMU coordinate system with the TBU reference coordinate system; (3) calibration between the stereo PSD reference coordinate system, assumed to be at the center of the left (or right) PSD sensor, and the IMU coordinate system in the TBU.

Fig. 2: Nomenclature of proposed 6-DoF pose tracking system

The nomenclature of the proposed tracking system is described in Figure 2. Let $\mathbf{p}_i \in \mathbb{R}^3$ represent the 3-DoF position of the $i$-th IR-LED in the TBU reference coordinate system (i.e., the left PSD coordinate system). The two PSDs' 2-DoF states $(x_L, y_L)$ and $(x_R, y_R)$ are the 2-DoF positions on the left and right PSD sensors, respectively. Let $I_j$ be the photocurrent from electrode $j$ of a PSD, where $j \in \{1, 2, 3, 4\}$ for the four corners. $x$ and $y$ can be computed as follows:

$$x = \frac{L_x}{2}\,\frac{(I_2 + I_3) - (I_1 + I_4)}{I_1 + I_2 + I_3 + I_4}, \qquad y = \frac{L_y}{2}\,\frac{(I_2 + I_4) - (I_1 + I_3)}{I_1 + I_2 + I_3 + I_4}, \quad (1)$$

where $L_x$ and $L_y$ are the resistance lengths of the PSD, provided by the manufacturer.
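The centroid computation above can be sketched in Python. The corner-electrode numbering and the 9 mm resistance length are illustrative assumptions, not the paper's exact convention (the actual pin mapping is in the sensor datasheet):

```python
def psd_position(i1, i2, i3, i4, lx=9.0, ly=9.0):
    """Estimate the 2-DoF light-spot position on a tetra-lateral PSD
    from its four corner photocurrents.  The electrode-to-corner
    assignment here is an assumption for illustration; lx and ly are
    the resistance lengths of the active area in mm."""
    total = i1 + i2 + i3 + i4
    if total <= 0.0:
        raise ValueError("no incident light detected")
    x = (lx / 2.0) * ((i2 + i3) - (i1 + i4)) / total
    y = (ly / 2.0) * ((i2 + i4) - (i1 + i3)) / total
    return x, y
```

Equal currents on all four corners place the spot at the sensor center, as expected from the symmetry of the formula.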

Let $R^g_o$ and $R^g_b$ be the absolute rotation matrices with respect to the Earth's gravity ($g$) and magnetic field, coming from the IMU in the object ('o', TOU) and the IMU in the base ('b', TBU), respectively. Similarly, let $R^g_O$ and $R^g_B$ be the absolute orientations of the TOU and TBU bodies, respectively.

Let $\mathbf{t}^O_i$ be the relative pivoting 3-DoF position from the $i$-th IR-LED position to the pivot point of the tracked object, while $\mathbf{t}^B$ is the relative pivoting 3-DoF position with respect to the reference coordinate system in the TBU (i.e., the left PSD coordinate system).

III-A Identification of Multiple IR-LEDs

There exist analog and digital multiplexing methodologies (i.e., frequency-division multiplexing (FDM) and time-division multiplexing (TDM)) to track multiple active markers simultaneously. FDM makes it easy to synchronize multiple signals; however, it requires complex circuitry to handle highly sensitive analog signals and wide enough bandwidth channels to obtain highly accurate signals without crosstalk. Instead, we adopt a simple TDM scheme with a trigger signal to recognize which IR-LED is excited.

We propose a pattern-based LED identification method between the TBU and TOU, based on a pre-defined pattern for each LED initiated from the TOU. Thus, it requires no hand-shake and no corresponding communication. We define the pattern signal period as the inverse of the desired update frequency. One pattern cycle consists of one generic trigger signal, indicating the start of the cycle, followed by the IR-LED-specific signals, as shown in Figure 3. More specifically, for a 6-LED TOU we have seven signals of even width fitting within one cycle period. The trigger signal at the start of the pattern cycle is designed as a pulse-width-modulation (PWM) signal with a duty cycle above a set threshold, exciting all IR-LEDs at the same time. Following that, the IR-LED-specific signals are designed as PWM signals with duty cycles below that threshold, applied sequentially to the IR-LEDs in a specific, known, consistent order.

One exemplary full pattern signal is captured and depicted in Figure 3. The TOU continuously emits a pattern signal consisting of the trigger and IR-LED-specific patterns. As a result, the light pulses emitted by the IR-LEDs are detected by the PSDs in the TBU through the analog signal processing circuits (purple in Figure 3). We devised an event capturing circuit to detect the rising edges of the detected pulses (green in Figure 3).

We perform two ADC conversions synced to the rising edge with a fixed interval (cyan in Figure 3). After each ADC conversion, the result is read by the micro-controller and used for further processing. The trigger signal is easily detected in the TBU because its two ADC conversion values are nearly equal and positive, due to its larger power and the fixed reading interval (see the first two rising edges (cyan) and the trigger signal (first pulse of purple) in Figure 3). Following this, since the IR-LED patterns follow a sequential order, the timing of the next non-zero ADC value determines the index of the excited IR-LED.

In the circular arrangement of IR-LEDs, only one or two LEDs are detected at a time within a single pattern cycle. This pattern-based approach provides synchronous data processing in real time without a physical wired or wireless communication channel between the TOU and TBU.
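The slot-timing decode described above can be sketched as follows; `led_index_from_timing`, the slot width, and the timestamp units are hypothetical names and parameters for illustration, not the firmware's actual interface:

```python
def led_index_from_timing(t_edge, t_trigger, slot_us, n_leds=6):
    """Map a detected rising edge to an IR-LED index, assuming the
    pattern cycle starts with a trigger pulse in slot 0 followed by
    n_leds equal-width slots of slot_us microseconds each.  Returns
    None when the edge falls outside the LED slots."""
    dt = t_edge - t_trigger
    slot = int(dt // slot_us)       # slot 0 is the trigger itself
    if slot < 1 or slot > n_leds:
        return None                 # edge outside the LED slots
    return slot - 1                 # 0-based LED index
```

A real implementation would work off hardware timer captures; the arithmetic, however, is just this slot lookup.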

III-B Cancellation of natural environmental disturbance

Most tracking environments are not ideal and include unknown environmental noise, which affects the performance of the multiplexing method. Thus, we perform two ADC conversions (cyan in Figure 3) for each signal of one pattern (one trigger and multiple IR-LED signals).

Based on the double ADC reads for each IR-LED, we have two conversion values. The first represents the pure IR-LED signal plus any ambient and other noises. The second captures the external noise alone, primarily ambient stray light, sampled close in time to the main signal in the first readout. Using these two, we can cancel out the background illumination by subtracting the second value from the first. This feature proved essential in reducing the effect of ambient stray light and other noises in an operating environment (e.g., where sunlight or other IR sources may be present). The PSD output signal after the circuitry consists of three analog signals: two subtractions and one summation. We apply the proposed methodology to these three analog signals in real time (i.e., a 50 µs capturing delay for each signal).
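A minimal sketch of the double-read cancellation, assuming the second conversion lands after the LED pulse so it holds ambient light only (channel names are illustrative):

```python
def cancel_ambient(first, second):
    """Subtract the ambient-only second ADC read from the first read
    (signal + ambient) for each analog channel (here: the two
    subtraction channels and the summation channel).  Assumes the
    second conversion occurs after the LED pulse has ended."""
    return {ch: first[ch] - second[ch] for ch in first}
```

Applying this per channel and per pulse removes any background component that is stable over the ~50 µs between the two conversions.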

Overall, the advantages of this method are: (1) no time-consuming hand-shake with back-and-forth communication is required; (2) external IR ambient noise can be alleviated; and (3) the centralized control of the IR-LEDs facilitates power adjustment and control for a large work space, as explained in the next section.

Fig. 3: Pattern-based IR-LED identification: we demonstrate one exemplary pattern signal (a trigger signal + multiple IR-LED signals) to explain the overall process. In reality, only one or two IR-LEDs are detected due to geometry constraints.

III-C Active light intensity controls

IR-LED power needs to be continuously adjusted to provide a better signal and overall accuracy, specifically in scenarios where the distance between the TBU and TOU is large or a steep orientation of the TOU decreases the signal from the IR-LEDs. The basic idea is to construct a closed-loop power controller with a look-up table to adjust the power range depending on distance. Given a fixed PSD sensitivity level and ADC input range, the operational maximum power is a non-linear function of the distance between the source and the detector. We construct the function as $\mathrm{LUT}(d, \theta)$, where $d$ is the sensor-based disparity (inversely proportional to the distance between the source and the detector), and $\theta$ is the orientation of the TOU in the TBU coordinate system (the detailed computation is addressed in Section III-D).

INPUT: the look-up table for power limits LUT(d, θ); the summed currents of the left/right PSDs, S_L and S_R; the x-axis positions of the left/right PSDs, x_L and x_R; the orientation θ
1 initialize the IR-LED power level;
2 while PSD pose tracking system is OPERATIONAL do
3       d ← x_L − x_R;  P_ref ← LUT(d, θ);
4       error ← P_ref − (S_L + S_R)/2;  transmit error to TOU;

The active power controller is presented in Algorithm 1. The inputs are the two summation values from the two PSDs, the x-axis positions, and the orientation. During operation, the look-up table returns the reference power level. The discrepancy between the reference and measured values, denoted as the error, is transferred to the TOU wirelessly, allowing the power level of the IR-LEDs to be controlled in real time.
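One controller iteration consistent with the description above might look like this; the LUT contents, the proportional gain, and the use of the mean of the two summation values are our assumptions, not the paper's exact design:

```python
def power_step(lut, sum_left, sum_right, x_left, x_right, theta, current_level):
    """One iteration of the active power controller (sketch).
    Disparity x_left - x_right is inversely proportional to distance;
    the LUT returns the reference incident-power level for that
    disparity and TOU orientation.  The error is what would be sent
    wirelessly to the TOU to adjust the IR-LED PWM amplitude."""
    disparity = x_left - x_right
    reference = lut(disparity, theta)
    measured = 0.5 * (sum_left + sum_right)  # mean incident power on the two PSDs
    error = reference - measured
    gain = 0.1                               # illustrative proportional gain
    return current_level + gain * error
```

In the real system the update happens on the TOU side after receiving the error; a simple proportional step like this keeps the incident power near the LUT reference as the distance and orientation change.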

III-D Calibration Procedure

We have two main calibration challenges: 1) PSD-based stereo camera calibration within the TBU; 2) 3-DoF position and 3-DoF orientation calibration within a unified coordinate system for the TOU with respect to the TBU.

III-D1 Stereo PSD calibration

Fig. 4: Stereo PSD calibration for IR-LEDs: (a) A 4x3 IR-LED checkerboard is designed and used for stereo calibration. The IR band is not visible to the eye, so a single optical camera without an IR-filter lens is used to show how the IR-LEDs operate; they are emitted periodically. (b) The 12 points detected by each PSD (left and right) are plotted in the 2-DoF plane; red for the left PSD and blue for the right PSD.

Two PSDs are set up as a stereo camera system, where each PSD gives the 2-D image point of the LED point source (Figure 4(b)). Therefore, similar to stereo vision calibration, the PSD stereo calibration can be resolved using intrinsic and extrinsic calibration steps. The intrinsic calibration parameters consist of the focal length, the principal point, the skew angle, and the distortion parameters. Aside from the distortion parameters, the focal length, skew angle, and principal point are directly analogous to those of commonly used cameras.

To obtain projected points, we use a grid of IR-LEDs (i.e., to simulate an optical checkerboard) that is fabricated to a certain tolerance to project points in space (Figure 4(a)). The checkerboard is moved around to cover the entire work-space area. Both the standard optical system parameters and the distortion coefficients are obtained using an iterative algorithm that projects known points, de-warps them, and recasts them in 3-D space. Once the individual PSDs have been calibrated, we proceed with the calibration of the extrinsic parameters of the stereo rig, formulated as the relative orientation and translation of the right PSD sensor with respect to the left PSD sensor. The extrinsic calibration parameters are computed using an iterative approach detailed in [Bouguet01CameraCT].
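Once the rig is calibrated, the triangulation step can be illustrated with an idealized rectified model; this is a sketch only (the paper's pipeline uses the full intrinsic/extrinsic model, and `fx`/`baseline` here are placeholder parameters):

```python
import numpy as np

def triangulate(pl, pr, fx, baseline):
    """Triangulate a 3-D point (left-PSD frame) from rectified 2-D
    detections on the left/right PSDs, assuming an ideal rectified
    rig with focal length fx and a purely horizontal baseline.
    Depth follows z = fx * baseline / disparity."""
    xl, yl = pl
    xr, _ = pr
    disparity = xl - xr
    z = fx * baseline / disparity
    x = xl * z / fx
    y = yl * z / fx
    return np.array([x, y, z])
```

With fx = 2, baseline = 0.1, a point at depth 1.0 directly between the sensors projects to xl = 0.1 and xr = -0.1, and the function recovers it exactly.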

The distortion of a PSD differs from that of the CMOS/CCD sensors used in computer vision. The typical biconvex lenses used to focus light produce barrel distortion, whereas the sensor itself produces pincushion distortion.

Ideally, if the electrical center of the PSD and the optical center of the lens coincide, the two can almost cancel out. However, this may not be achieved in practice. Therefore, we use two-dimensional Bernstein basis polynomials of degree $n$ to model the inverse of the distortion. The formulation is as follows:

$$\hat{x}(u, v) = \sum_{i=0}^{n}\sum_{j=0}^{n} c^x_{ij}\, B_{i,n}(u)\, B_{j,n}(v), \qquad \hat{y}(u, v) = \sum_{i=0}^{n}\sum_{j=0}^{n} c^y_{ij}\, B_{i,n}(u)\, B_{j,n}(v), \quad (2)$$

where $B_{i,n}(t) = \binom{n}{i} t^i (1-t)^{n-i}$ and $c^x_{ij}, c^y_{ij}$ are coefficients fitted during calibration. To obtain true 3-D positions of the IR-LED points, we apply the inverse of the distortion modeled in Equation (2) to the PSD-measured pixels $(x_L, y_L)$, $(x_R, y_R)$ prior to 3-D reconstruction of the point $\mathbf{p}_i$.
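The Bernstein-surface correction can be evaluated as below; the coefficient grids and the degree are placeholders for the values fitted during calibration, not the paper's numbers:

```python
import math

def bernstein(i, n, t):
    """Bernstein basis polynomial B_{i,n}(t) on [0, 1]."""
    return math.comb(n, i) * t**i * (1.0 - t)**(n - i)

def undistort(u, v, cx, cy, n=3):
    """Evaluate a 2-D Bernstein surface at normalized PSD coordinates
    (u, v) in [0,1]^2.  cx and cy are (n+1)x(n+1) coefficient grids
    fitted during calibration (hypothetical here).  Returns the
    distortion-corrected coordinates."""
    xu = sum(cx[i][j] * bernstein(i, n, u) * bernstein(j, n, v)
             for i in range(n + 1) for j in range(n + 1))
    yv = sum(cy[i][j] * bernstein(i, n, u) * bernstein(j, n, v)
             for i in range(n + 1) for j in range(n + 1))
    return xu, yv
```

A useful sanity check is the linear-precision property of Bernstein surfaces: coefficients c[i][j] = i/n reproduce the identity map in that axis, so an undistorted sensor yields near-identity coefficients.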

III-D2 Calibration of IMU and PSD coordinate system

The 3-D position of the IR-LED source (i.e., attached to the tracked object) with respect to the left PSD coordinate system is given by $\mathbf{p}_i$. The 3-D orientation of the IMU in the TOU is provided by $R^g_o$ with respect to gravity and the magnetic north. If the fixed relative orientation between the tracked object body and the attached IMU is given by $R^o_O$, then the 3-D orientation of the body with respect to gravity and magnetic north follows:

$$R^g_O = R^g_o\, R^o_O. \quad (3)$$
Likewise, the relative orientation of the IMU attached to the TBU is given by $R^g_b$, and the fixed relative orientation between the left PSD coordinate system and the attached IMU on the TBU is given by $R^b_B$. Thus, the left PSD orientation (i.e., tracking base) with respect to gravity and magnetic north is given by

$$R^g_B = R^g_b\, R^b_B. \quad (4)$$
Based on these, the 3-DoF orientation of the tracked-object unit with respect to the left PSD coordinate system is given by

$$R^B_O = (R^g_B)^{-1} R^g_O. \quad (5)$$
The combination of $R^B_O$ and $\mathbf{p}_i$ provides the full 6-DoF transformation of tracked-object points into the tracking base. For example, any point in the $i$-th IR-LED coordinate system, such as the pivot point $\mathbf{t}^O_i$, has the following relationship to the same point denoted in the tracking-base coordinate system, $\mathbf{t}^B$:

$$\mathbf{t}^B = R^B_O\, \mathbf{t}^O_i + \mathbf{p}_i. \quad (6)$$
Based on these, the three fixed quantities to be derived or calibrated for the PSD system are: (i) the relative orientation between the tracked-object IMU and the tracked-object coordinate system, $R^o_O$; (ii) the relative orientation between the IMU and the tracking-base coordinate system, $R^b_B$; (iii) the relative position of a common point, such as the pivot point, denoted $\mathbf{t}^O_i$ within the TOU coordinate system and $\mathbf{t}^B$ within the TBU coordinate system.
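The orientation composition and point transform described in this subsection reduce to a few matrix operations; variable names mirror the text's symbols ($R^g_O$, $R^g_B$, $\mathbf{p}_i$, the LED-to-pivot offset):

```python
import numpy as np

def pivot_in_base(R_gO, R_gB, p_i, t_O_i):
    """Transform the pivot point from the tracked-object frame into
    the tracking-base (left PSD) frame.  For rotation matrices the
    inverse is the transpose, so R^B_O = (R^g_B)^T R^g_O, and the
    pivot in the base frame is R^B_O t^O_i + p_i."""
    R_BO = R_gB.T @ R_gO
    return R_BO @ t_O_i + p_i
```

With both absolute orientations equal (object aligned with base), the result is simply the LED position offset by the LED-to-pivot vector.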

Calibration of (i)
$R^o_O$ stems from the misalignment between the IMU reference frame and the tracked-object reference coordinate system. In a perfect scenario, this orientation transformation would be the identity, but in reality it is not. To estimate $R^o_O$, we have to measure the reference orientation of the tracked rigid object along the TOU coordinate system, which may require a specialized tool (e.g., a robotic manipulator). Let us have $N$ samples of the reference measurement $R^g_{O,k}$ and of the TOU's raw IMU measurement $R^g_{o,k}$, $k = 1, \dots, N$. Then, using Equation (3), we construct an overdetermined linear system as follows:

$$\begin{bmatrix} R^g_{o,1} \\ \vdots \\ R^g_{o,N} \end{bmatrix} R^o_O = \begin{bmatrix} R^g_{O,1} \\ \vdots \\ R^g_{O,N} \end{bmatrix}, \quad (7)$$

where $R^o_O$ is what we want to compute. To minimize the residual error, we use the pseudo-inverse to estimate $R^o_O$. A more detailed least-squares solution with the pseudo-inverse is described in [strang06linear].
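A sketch of the stacked least-squares solve for the fixed IMU-to-body offset; the SVD projection back onto SO(3) is our addition (a raw pseudo-inverse solution is generally not exactly a rotation matrix), while the text itself only states a pseudo-inverse:

```python
import numpy as np

def estimate_R_oO(R_go_samples, R_gO_samples):
    """Least-squares estimate of the fixed offset X = R^o_O from
    paired samples satisfying R^g_{O,k} = R^g_{o,k} X.  Stacking the
    sample rotations gives an overdetermined (3N x 3) linear system,
    solved via pseudo-inverse, then projected to the nearest
    rotation matrix with an SVD."""
    A = np.vstack(R_go_samples)          # (3N, 3) stacked IMU rotations
    B = np.vstack(R_gO_samples)          # (3N, 3) stacked reference rotations
    X, *_ = np.linalg.lstsq(A, B, rcond=None)
    U, _, Vt = np.linalg.svd(X)          # nearest rotation (orthogonal Procrustes)
    R = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt
    return R
```

With noise-free samples the stacked solve already returns the exact offset; with noisy IMU data the SVD step restores orthonormality.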

Calibration of (ii) and (iii)
We have to estimate $R^b_B$, which is also mostly due to fabrication tolerance. Moreover, in our system, the relative position of each LED to the pivot point must be determined in practice. We propose an extended pivot calibration method to estimate both the tolerance error and each relative pivoting position together.

The pivot calibration method basically involves rotating and swinging the tracked object about a fixed point while measuring the pose in both the local (i.e., TOU) and global (i.e., TBU) coordinate systems. The pivot point is constant within both the local and global coordinate systems, and this is the basis for creating an over-complete set of equations.

In our system, each IR-LED needs an individual calibration of $\mathbf{t}^O_i$, the location of each IR-LED in a coordinate system established at the pivot point, whereas $\mathbf{t}^B$ is common to all IR-LEDs. We extend the pivot calibration method over each IR-LED's 3-D position to compute both quantities jointly.

We design an iterative least-squares approximation method. Algorithm 2 consists of three parts: (1) estimation of the per-LED quantities for the i-th LED, (2) estimation of the quantity shared by all LEDs, and (3) computation of the overall error; steps (1) to (3) are then iterated.

First, we have the i-th IR-LED 3-D position with an orientation based on the current shared estimate, which is initially an identity matrix. Let each sample denote a set of position and orientation measurements for the i-th LED, with a known total number of data sets per LED. Then, we can derive an overdetermined linear system for the per-LED unknowns based on Equation (6).


where the system size is determined by the number of possible sample pairs. Then, we use the pseudo-inverse to estimate the per-LED unknowns.

Second, we update the shared estimate. Given the per-LED estimates, the shared quantity is defined as


where the shared quantity is what we want to estimate.

Then, we can construct another overdetermined linear system as follows:


where the system size is determined by the number of possible sample pairs. Then, we again use the pseudo-inverse to estimate the shared quantity.

Finally, we compute the overall error of our estimation and keep iterating the three steps until the error falls below a threshold or the maximum number of iterations is reached. As a result, we obtain all three calibrated quantities.

INPUT: number of samples for each i-th LED
Initialization;
while iterations < MAX-Iteration and errors > threshold do
       // First: per-LED estimation
       for each LED i do
             for each sample pair of the i-th LED do
                   accumulate the rows of the overdetermined system;
             solve the system by pseudo-inverse;
       // Second: estimation of the quantity shared by all LEDs
       for each LED i do
             for each sample do
                   accumulate the rows of the linear system;
       solve the system by pseudo-inverse;
       // Third: compute overall errors
       for each LED i do
             for each sample pair do
                   accumulate the residual into errors;
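Under the simplifying assumption that the shared orientation correction is identity, the multi-LED extension reduces to one joint linear system: every LED i contributes equations R_ij d_i + t_ij = g, with a per-LED offset d_i and one common pivot point g. A sketch of that reduced joint solve (the iterative orientation refinement of Algorithm 2 is omitted, and all names are illustrative):

```python
import numpy as np

def multi_led_pivot(datasets):
    """Jointly estimate one common pivot point g (tracker frame) and a
    per-LED offset d_i (pivot frame) from all LEDs' pose samples, by
    stacking R_ij @ d_i + t_ij = g into a single linear system.
    datasets[i] is a list of (R, t) pairs for the i-th LED."""
    n_led = len(datasets)
    rows = sum(len(s) for s in datasets) * 3
    A = np.zeros((rows, 3 * n_led + 3))
    b = np.zeros(rows)
    r = 0
    for i, samples in enumerate(datasets):
        for R, t in samples:
            A[r:r + 3, 3 * i:3 * i + 3] = R   # block multiplying d_i
            A[r:r + 3, -3:] = -np.eye(3)      # block multiplying g
            b[r:r + 3] = -t
            r += 3
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return [x[3 * i:3 * i + 3] for i in range(n_led)], x[-3:]
```

Because all LEDs share g, samples from every LED jointly constrain the pivot, which is the key benefit of solving the system in one shot rather than per LED.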

IV Experiment and Results

IV-A System setup

Figure 5 shows our proposed system integrated and utilized within several applications (e.g., ultrasound probes, a laparoscopic tool, a stylus pen, a needle biopsy tool, etc.). The TBU, with dimensions (L x W x H) of , can be mounted to a stationary frame while maintaining a clear line of sight to at least one IR-LED. The maximum frame rate is , and the distance of the IR-LEDs to the tracking base can vary from to . The object sensing dimensions (L x W x H, in mm) are , which depend on the IR-LED ring size.

We use a UR5 robotic manipulator (Figure 4) to calibrate and to measure the final tracking error for performance analysis. The UR5 is a 6-axis robot, which provides movement accuracy and repeatability of in a working radius of with the pivot tool. We tested our proposed system in a typical lab setting, in which we measured an interfering luminous flux ranging between and  lux due to sunlight from the windows.

Fig. 5: Our tracking module is compact and can be attached to any rigid object. (a) The tracking module is attached to the handle of the transducer, and the 6-DoF pose information is fused with ultrasound images. (b) Our proposed system is also integrated with an angled ultrasound probe, a laparoscopic tool, and a stylus-type needle biopsy tool. The 6-DoF pose is displayed in real-time using QT tools. (c) An exemplary demonstration of pose tracking fused with pre-op images using a stylus needle tool.

IV-B Calibration setup

Calibration of (i): We designed a holder for the TOU, which we mounted to the end-effector of the UR5.

The UR5 provides the relative pose of the attached tool tip, i.e., the tool center point (TCP), in the robot coordinate system. By attaching the tracked object in lieu of a tool to the end-effector, we obtain a highly accurate pose of the tracked object, which we use as the ground truth.

To estimate the misalignment, we collected 15 measurements along each axis of the object, for a total of 45 across the x-, y-, and z-axes. Each data point pairs the ground-truth orientation from the UR5 with the raw orientation from the IMU. Then, we used a least-squares approximation method with the pseudo-inverse, as shown in Equation (7). The estimated orientation is expressed as a quaternion.
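One common way to estimate such a fixed misalignment from orientation pairs is to average the per-sample relative rotations and project the result back onto SO(3) via SVD. This is a sketch under our own assumptions (the composition order gt ≈ imu · R and all names are ours; the paper solves Equation (7) with a pseudo-inverse):

```python
import numpy as np

def estimate_misalignment(imu_rots, gt_rots):
    """Estimate the fixed rotation R mapping IMU readings to ground
    truth, assuming gt_i ~= imu_i @ R (composition order is our
    assumption). Sums the per-sample estimates imu_i^T @ gt_i and
    projects the accumulated matrix onto SO(3) via SVD."""
    M = sum(Ri.T @ Rg for Ri, Rg in zip(imu_rots, gt_rots))
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt
    if np.linalg.det(R) < 0:   # enforce a proper rotation, not a reflection
        U[:, -1] *= -1
        R = U @ Vt
    return R
```

The SVD projection is the standard solution to the orthogonal Procrustes problem, which makes the estimate robust to measurement noise in the individual samples.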

Calibration of and

Using the UR5, we can freely rotate the TOU about a pivot point (Figure 6(a)). We collected 15 measurements from each of the six IR-LEDs, rotating about a fixed pivot point. We then moved the UR5 and changed the pivot point to cover a large work space. Finally, we applied the six data sets to Algorithm 2.

The estimated orientation is expressed as a quaternion, and the estimated position is reported together with its standard deviation among the LEDs.

Fig. 6: (a) A holder is attached to the end-effector of the UR5, so we can control the tool center point directly, which provides the ground truth. (b) A set of collected samples is plotted with position and orientation. The proposed iterative method estimates the calibration quantities from this set.

IV-C Accuracy assessment and interpretation

We analyze the accuracy of our proposed system in the following three scenarios:

  • Static accuracy of the 6-DoF pose over three different work space sizes.

  • Dynamic accuracy of the 6-DoF pose at a fixed test distance.

  • Power consumption and temperature over distance

To evaluate the proposed method, we used the root-mean-square (RMS), mean, and confidence interval (CI) of the error, as has been done, for example, for assessing OTS errors in [wiles04accuracy].
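As a sketch of how such statistics might be computed (the percentile-based CI bound is our assumption; [wiles04accuracy] may define the confidence interval differently):

```python
import numpy as np

def error_stats(errors, ci=95):
    """Summary statistics used to assess tracker accuracy:
    RMS, mean, and a one-sided confidence bound taken as the
    ci-th percentile of the error distribution."""
    e = np.asarray(errors, dtype=float)
    rms = float(np.sqrt(np.mean(e ** 2)))
    mean = float(np.mean(e))
    bound = float(np.percentile(e, ci))
    return rms, mean, bound
```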

Fig. 7: The overall boundaries of the evaluated workspaces are displayed. The orange area represents the large but narrow area. The red area shows the sweet-spot. The green area shows the large area. The purple area shows the OTS (Vicra, NDI).

IV-C1 Static accuracy over the three different work space sizes

To characterize the dependency of system accuracy on work space size, we measured the error for three different hypothetical work spaces: (a) the large area, with a volume of () and depth from to ; (b) the large but narrow area, with a volume of () and a similar depth to the large area; (c) the sweet-spot, defined as (), with depth from to .

The overall work space is shown in Figure 7, which additionally depicts the overlaid work space of an OTS (Polaris Vicra, NDI [ndi]). To measure the static accuracy, the UR5 robot is programmed to cover the entire defined work space at a fixed interval along each dimension in 3D, while we collect the data. The ground-truth distance between any two samples can be computed directly from the actual Cartesian coordinates of the UR5 end-effector.
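The ground-truth pairwise distances can be computed directly from the robot's Cartesian sample coordinates, for example (a minimal sketch; names are ours):

```python
import numpy as np

def pairwise_distances(points):
    """Ground-truth distances between all sample pairs, computed
    directly from the Cartesian coordinates of the end-effector.
    Returns an (n, n) matrix of Euclidean distances."""
    P = np.asarray(points, dtype=float)
    diff = P[:, None, :] - P[None, :, :]   # broadcasted pairwise differences
    return np.linalg.norm(diff, axis=-1)
```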

Table I shows the static position error, computed by Equation (6). As the static orientation error does not change over distance, we separately report the static orientation accuracy in the last row. Overall, the position errors ( CI) of our proposed system are computed to be , , and for the three work space sizes, respectively, and our orientation error of (0.059) is superior to that of an OTS using passive markers. Based on the report [wiles04accuracy], the static position error of the Polaris Vicra is ( CI, Table 2 of [wiles04accuracy]), while the orientation error is ( CI) in an ideal setting.

Regions RMS Mean Error CI
The large 1.369  1.237  2.254 
The large+narrow 1.171  0.846  1.934 
The sweet-spot 0.56  0.434  0.925 
3D orientation 0.043 0.040 0.059
TABLE I: Static accuracy of PSD tracker, position/orientation error statistics

IV-C2 Dynamic accuracy of 6-DoF pose at a fixed test distance

To show the dynamic accuracy of the proposed approach, we moved the TOU following a pre-defined sway motion using the UR5, which repeatedly moves the TOU by changing its translation and rotation simultaneously within ranges of and , respectively. We created three cycles of this trajectory with three different velocities and accelerations: (1) Slow, (2) Moderate, and (3) Fast. The test distance between the TBU and the TOU is about . We collected the position and the orientation from the system, while the ground truth was acquired from the UR5 application programming interface (API) in real-time. Figure 8 depicts the overlaid pose changes over time.

Fig. 8: Dynamic pose change of the PSD tracker over time. The position changes of the three dynamic tests are shown in (a), (b), and (c), while the orientation changes are shown in (d), (e), and (f). The red line shows the ground truth from the UR5 API for the actual Cartesian coordinates of the tool. The blue line shows the pose changes of our tracker.
Condition RMS Mean Error CI
(mm, degree) (mm, degree) (mm, degree)
Slow (0.743, 0.135) (0.725, 0.159) (1.222, 0.222)
Moderate (0.825, 0.246) (0.988, 0.327) (1.357, 0.405)
Fast (0.906, 0.323) (1.052, 0.413) (1.49, 0.532)
TABLE II: Dynamic pose accuracy of PSD tracker, distance/orientation error

The overall dynamic performance evaluation is described in Table II. The position errors remain sub-millimeter and the orientation errors sub-degree for all testing conditions, with the dynamic position error increasing by less than a millimeter compared to the static case. Unfortunately, dynamic measurement errors are not reported in much of the related work, making direct comparison difficult. Overall, the results demonstrate that the proposed system is competitive, in terms of accuracy for tracking rigid tools, with other commercially available OTS, ETS, and video-metric tracking systems.

IV-C3 Power and temperature measurement

We evaluated the power consumption and the temperature of the unit during nominal operation. The TOU was moved in a straight line from to away from the TBU. We used a FLIR One thermal imaging sensor to measure the surface temperature. The TOU was kept stationary for one minute at each point along the path, while we gathered the saturated temperature and power consumption, as shown in Figure 9. The supplied voltage for the TOU was , and the current and temperature at the minimum distance were and , respectively, whereas at the maximum distance they were measured to be and .

V Discussion

  • We demonstrated that our proposed pattern-based IR-LED identification method with active power control performs well in terms of both accuracy and power consumption. We believe there is a possibility to increase tracking performance by decreasing the PWM duty cycle, which could translate to a higher IR-LED voltage, increased LED luminance, and overall improved accuracy. Moreover, with a reduced PWM duty cycle, we could also decrease the overall power consumption, leading to decreased temperature. Our current design has multiple LEDs in the TOU; to further optimize the power consumption, only the specific IR-LEDs with a clear line of sight to the tracking base could be illuminated. This could be achieved by analyzing the relative orientation of the TOU with respect to the TBU in real-time and changing the LED firing pattern accordingly. Overall, these additional options could contribute to further miniaturization of the sensing unit into a lightweight module supplied by a small lithium-ion battery.

  • Our proposed system also requires a line-of-sight condition similar to OTS. However, our system has multiple IR-LEDs covering a wide angular range; therefore, we could potentially use multiple TBUs so that at least one tracking base can always track one IR-LED. In this scenario, all TBU stations can be calibrated at once in the working area. Moreover, the base coordinate system can be transferred to any fixed reference frame of one of the TBU systems in the same space by using Equation (4).

  • The pattern-based identification method cannot track a large number of TOUs due to a bandwidth limitation arising from the minimum PWM duty cycle required by the IR-LED identification pattern. However, this problem could potentially be addressed by using time-division multiplexing, frequency multiplexing, or a combination of both. These topics are beyond the scope of this paper and could be considered as extensions of this work.

Fig. 9: The x-axis is the distance between the TBU and the TOU. The left y-axis is the current consumption (mA) at the supplied voltage. The right y-axis is the temperature measured by the FLIR One.

VI Conclusions

We proposed a novel 6-DoF pose tracking system that uses stereo-based PSDs and multiple IMUs, both of which are cost-efficient components widely used in many other applications. We devised a practical IR-LED identification methodology with efficient power control to provide tracking accuracy within a large tracking work space. High refresh rates for both PSDs and IMUs allow the overall 6-DoF pose tracking system to have a high update rate and low latency. Furthermore, the proposed tracking sensors could be manufactured in a small form factor, which makes the system favourable for a variety of applications.

The results demonstrated that the proposed tracking system can serve as a wide-area, low-latency, and power-efficient 6-DoF pose tracking system and as an inexpensive alternative to OTS and EMT.


The concepts and information presented in this paper are based on research results that are not commercially available. Future availability cannot be guaranteed.