Motion model transitions in GPS-IMU sensor fusion for user tracking in augmented reality

by Erkan Bostanci et al.
Ankara University

Finding the position of the user is an important processing step for augmented reality (AR) applications. This paper investigates the use of different motion models in order to choose the most suitable one, and eventually reduce the Kalman filter errors in sensor fusion for such applications where the accuracy of user tracking is crucial. A Deterministic Finite Automaton (DFA) was employed using the innovation parameters of the filter. Results show that the approach presented here reduces the filter error compared to a static model and prevents filter divergence. The approach was tested on a simple AR game in order to justify the accuracy and performance of the algorithm.



1 Introduction

Integration of data from Global Positioning System (GPS) and Inertial Measurement Unit (IMU) sensors has been well-studied [1, 2, 3] in order to improve upon the robustness of the individual sensors against a number of problems related to accuracy or drift. The Kalman filter (KF) [4] is the most widely-used filter due to its simplicity and computational efficiency [5] especially for real-time user tracking applications such as AR.

Attempts have been made to improve the accuracy of the filter using adaptive values for the state and measurement covariance matrices based on the innovation [6], and more recently fuzzy logic has been applied to this task [7, 8]. Other studies [9, 10] used dynamic motion parameters to decide on the dominance of individual sensors in the final estimate.

Alternative approaches suggest using different motion models for recognizing the type of the motion [11, 12, 13, 14, 15]. Some of these studies (e.g. [12, 13]) use a Bayesian framework to define a scoring scheme for selecting a motion model, while others (see [15]) apply different motion models concurrently and select one of them according to a probabilistic approach.

This paper presents the selection and use of different motion models according to a DFA model [16] in order to reduce the filter error and ensure faster filter convergence. The rest of the paper is structured as follows: Section 2 presents the methods used for obtaining positional estimates from individual sensors. The fusion filter which uses these motion estimates in order to produce a single output is presented in Section 3, and the DFA-based model transition scheme is described in Section 4. Results are given in Section 5 and an AR application is presented in Section 6. Finally, the paper is concluded in Section 7.

2 Finding Position Estimates

Before describing the details of the fusion filter and the DFA approach, it is important to present the calculations used for obtaining individual measurements from the GPS (Phidgets 1040) and IMU sensors (Phidgets Spatial 1056), both low-cost sensors with reasonable accuracy.

2.1 GPS position estimate

The data obtained from the GPS is in the well-known NMEA format and includes the position on the Earth's surface, the number of visible satellites and detailed satellite information, as shown in Figure 1.

Figure 1: GPS position parameters in latitude (φ), longitude (λ) and altitude (h), and X, Y and Z in ECEF. Following [17].

Using this information, the GPS coordinates can be converted from geodetic latitude (φ), longitude (λ) and altitude (h) notation to ECEF Cartesian coordinates X, Y and Z as:

X = (N + h) cos φ cos λ
Y = (N + h) cos φ sin λ     (1)
Z = (N(1 − e²) + h) sin φ

where

N = a / √(1 − e² sin² φ)     (2)

and a is the WGS84 [18] ellipsoid constant for the equatorial earth radius (6,378,137 m), while e corresponds to the eccentricity of the earth [5]. The calculated values form the measurements from the GPS sensor as (X, Y, Z).
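The conversion above can be sketched as follows (a minimal Python sketch using the standard WGS84 constants; the function name is our own):

```python
import math

# WGS84 constants: equatorial radius a (m) and first eccentricity squared e^2
A = 6378137.0
E2 = 6.69437999014e-3

def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Convert geodetic latitude/longitude (degrees) and altitude (m) to ECEF (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature N from (2)
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - E2) + h) * math.sin(lat)
    return x, y, z
```

At the equator with zero altitude this returns X equal to the equatorial radius, and at the poles Z approaches the polar radius, which is a quick sanity check for the constants.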

2.2 IMU position estimate

Finding the position estimate from the IMU is performed by double-integrating the accelerometer outputs over several samples; the current implementation uses four samples. The first integration, which finds the velocity, integrates the accelerations:

v_t = v_{t−1} + a_t Δt     (3)

Since multiple samples are taken, Δt is the time elapsed for each one of them. The next step is to integrate the velocities from (3) to find the position:

p_t = p_{t−1} + v_t Δt     (4)

These calculated positions (p) are used as the measurements from the IMU.
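The double integration above can be sketched as follows (assuming a fixed sample interval Δt and three-axis accelerometer readings; the function name and argument layout are our own):

```python
def integrate_imu(accels, dt, v0=(0.0, 0.0, 0.0), p0=(0.0, 0.0, 0.0)):
    """Double-integrate accelerometer samples (m/s^2) taken every dt seconds.

    Returns the position and velocity after processing all samples,
    e.g. the four consecutive samples used in the paper's implementation.
    """
    v = list(v0)
    p = list(p0)
    for a in accels:
        for i in range(3):
            v[i] += a[i] * dt   # velocity update, as in (3)
            p[i] += v[i] * dt   # position update, as in (4)
    return tuple(p), tuple(v)
```

For a constant 1 m/s² acceleration along one axis over four 20 ms samples, this yields a velocity of 0.08 m/s and a displacement of 4 mm, matching the discrete sums above.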

3 Fusion Filter

The filter designed for integrating the two sensors uses a state consisting of positional data (x, y, z) and linear velocities (v_x, v_y, v_z):

x = [x  y  z  v_x  v_y  v_z]^T     (5)

A simple state of six elements yields better runtime performance than a larger one. At each iteration, the predict–measure–update cycle of the KF is executed in order to produce a single output from the several sensors as the filter output.

In the first stage, i.e. prediction, a transition matrix (F of (6)) is applied to the state in order to obtain the predicted position:

F = [ I₃  Δt·I₃ ]
    [ 0₃  I₃    ]     (6)

where I₃ is the 3×3 identity matrix and Δt is the time between two prediction stages.

Measurements are obtained from the GPS and the IMU using the values obtained as described in Section 2 and are combined to create a measurement vector:

z = [x_G + x_I  y_G + y_I  z_G + z_I]^T     (7)
Here, the IMU measurements for position are used as offsets to the position obtained from the most recent GPS fix.
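A minimal sketch of one predict–measure–update cycle for this constant-velocity state is given below. The process and measurement noise values are illustrative placeholders (the paper does not list them); the 2.5 m figure echoes the GPS accuracy quoted later in the results.

```python
import numpy as np

def make_f(dt):
    """6x6 constant-velocity transition matrix of (6): p' = p + v*dt, v' = v."""
    f = np.eye(6)
    f[:3, 3:] = dt * np.eye(3)
    return f

H = np.hstack([np.eye(3), np.zeros((3, 3))])  # only position is measured

def kf_step(x, p_cov, z, dt, q=1e-3, r=2.5**2):
    """One KF cycle; returns the new state, covariance and the innovation."""
    f = make_f(dt)
    x = f @ x                                  # predict
    p_cov = f @ p_cov @ f.T + q * np.eye(6)
    innov = z - H @ x                          # innovation, used by the DFA logic
    s = H @ p_cov @ H.T + r * np.eye(3)
    k = p_cov @ H.T @ np.linalg.inv(s)         # Kalman gain
    x = x + k @ innov                          # update
    p_cov = (np.eye(6) - k @ H) @ p_cov
    return x, p_cov, innov
```

The innovation returned here is exactly the quantity the next section feeds into the DFA-based model selection.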

4 DFA based Model Transitions

The difference between the measurements (z) and the prediction (ẑ), omitting the subscripts indicating time, is defined as the innovation (ν):

ν = z − ẑ     (8)

The innovation vector has 3 components for the position elements: ν_x, ν_y and ν_z. The DFA model presented here uses the magnitude of these to define the filter divergence ξ as

ξ = √(ν_x² + ν_y² + ν_z²)     (9)

and uses threshold rules to assign the values of ξ to one of three classes, corresponding to small, medium and large innovation magnitudes.
A DFA consists of several elements which can be listed as states, input symbols and transition rules [19, 20]. The states of the DFA defined in Figure 2 correspond to different motion models.

Figure 2: DFA model for the model transitions

These classes are considered as the input symbols of the DFA model. Finally, the transitions between states model the selection mechanism presented in this paper.

When a model (P_i) is selected, the value of i is used as a velocity coefficient in the transition function for the position. For instance, P0 indicates a stationary transition model where the current values for position will remain unchanged in the predicted state, whereas P2 indicates a motion model where the position is predicted with twice the current positional velocities (2v) in order to adapt to sudden changes in the estimated position.
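The selection logic can be sketched as below. The thresholds T1 and T2 are hypothetical values (the paper does not state them), and the transition rule is one plausible reading of the DFA in Figure 2, stepping one model at a time; it is an illustration, not the paper's exact transition table.

```python
# Hypothetical thresholds partitioning the innovation magnitude xi
T1, T2 = 1.0, 3.0

def classify(xi):
    """Map the divergence xi of (9) to a DFA input symbol (0, 1 or 2)."""
    if xi < T1:
        return 0
    if xi < T2:
        return 1
    return 2

def next_model(current, symbol):
    """Illustrative transition: move one model at a time toward the symbol.

    Models P0, P1, P2 are encoded as 0-2; the actual transitions are
    defined by the DFA of Figure 2.
    """
    if symbol > current:
        return current + 1
    if symbol < current:
        return current - 1
    return current
```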

During the experiments it was observed that in some cases the selected model could change very often. A sliding window filter was applied to the output of the model selection logic in order to prevent frequent transitions between different motion models: in the implementation, the five most recent models were averaged to obtain the final motion model, as illustrated in Figure 3.

Figure 3: Sliding window for preventing frequent model transitions
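The smoothing step can be sketched as follows. The paper averages the five most recent model selections; the sketch below uses a majority vote over the same window instead, since that keeps the output a valid model index — this substitution is our own choice, not the paper's exact rule.

```python
from collections import Counter, deque

class ModelWindow:
    """Keep the most recent model selections and return the dominant one."""

    def __init__(self, size=5):
        self.window = deque(maxlen=size)

    def update(self, model):
        """Add the latest selected model and return the smoothed model."""
        self.window.append(model)
        # Most common model in the window wins
        return Counter(self.window).most_common(1)[0][0]
```

A short burst of a different model (e.g. a single P2 among P1 selections) is thus ignored until it persists across the window.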

5 Results

Experiments were conducted using low-cost GPS and IMU sensors mounted on a cycle helmet, worn by a user walking at varying speed. The IMU sampling interval was set to 20 milliseconds (50 Hz) and a GPS fix was received every second.

Figures 4 to 8 show the paths estimated by integrating the two sensors while employing different motion models. Portions of the estimated paths are coloured differently in order to indicate the type of motion model used for estimation. It is important to note that the static model used in the results corresponds to P1 and is hence drawn in the same colour.

Figure 4: Trajectory results for Dataset 1. (a) Static motion model (b) DFA models
Figure 5: Trajectory results for Dataset 2. (a) Static motion model (b) DFA models

Dataset 3 was acquired while the sensors were completely stationary. In this case the accuracy of the sensor fusion was found to be 1.5 m, which is, indeed, better than the accuracy of the GPS used in the experiments (given as 2.5 m in the product specification), a benefit of sensor fusion. The DFA model selection logic reduced this error even further, since the motion model was correctly recognized as P0 (see Figure 6).

Figure 6: Trajectory results for Dataset 3. (a) Static motion model (b) DFA models
Figure 7: Trajectory results for Dataset 4. (a) Static motion model (b) DFA models
Figure 8: Trajectory results for Dataset 5. (a) Static motion model (b) DFA models

Filter errors are presented in Figures 9 to 13. Note that these errors are an indicator of the difference between the filter predictions and the actual values of the measurements. It can be seen that the filter error is reduced when the DFA models are employed.

Figure 9: Filter errors for Dataset 1
Figure 10: Filter errors for Dataset 2
Figure 11: Filter errors for Dataset 3
Figure 12: Filter errors for Dataset 4
Figure 13: Filter errors for Dataset 5

6 An AR Game – Treasure Hunt

This section presents an AR game that uses the DFA-based sensor fusion algorithm described earlier in the paper. The aim of the game is to collect items, directing the user so as to test the accuracy of the approach without the user being aware of the test.

The game presents an egocentric view of the environment, as in First Person Shooter (FPS) games. The rules of the game are quite simple: the user needs to reach and collect all the reward items available as quickly as possible. When he or she reaches an item, the score is incremented by an amount that depends on the type of item encountered. The game provides three types of items: small coins, large coins and a chest (Figure 14), with rewards of 10, 30 and 50 points respectively.

(a) Chest
(b) Coin
Figure 14: Models used in the AR game

After the game is initialized and the positions of all items are set, the game loop starts. The coin models are animated and rotate about their axes, while the chest models remain static.

At each frame, the position of the user is checked against the item positions by calculating the distance between them. If this distance is less than a threshold value (allowing some tolerance against positioning inaccuracies), then the score is updated, the item is marked as 'hit' and a sound file is played. Collected items simply disappear. A timer is used for two purposes. First, it is constantly updated on the display to provide feedback to the user. It is also used to decay the score:

s = r − κt

where s is the final score to be added, r corresponds to the rewards mentioned above and t is the elapsed time. The constant κ is selected arbitrarily. This forces the user to collect the game tokens quickly.
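The per-frame check can be sketched as below. The hit radius, decay constant and data layout are illustrative assumptions (the paper gives neither the threshold nor the constant's value); the decayed reward follows the linear scheme described in the text.

```python
import math

HIT_RADIUS = 2.5  # metres; assumed tolerance against positioning inaccuracies

def check_items(user_pos, items, elapsed, score, kappa=0.1):
    """Mark items within HIT_RADIUS of the user as hit and update the score.

    `items` is a list of dicts with 'pos', 'reward' and 'hit' keys;
    `elapsed` is the game time in seconds, `kappa` the decay constant.
    """
    for item in items:
        if item['hit']:
            continue
        if math.dist(user_pos, item['pos']) < HIT_RADIUS:
            item['hit'] = True
            score += max(0.0, item['reward'] - kappa * elapsed)  # decayed reward
    return score
```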

A view from the AR game is presented in Figure 15. The game has an interface which displays the score and the time passed, making the game more challenging and hence more interesting. Note that the frame rate of the game is, indeed, very close to video rate (22 frames per second), an indicator of the speed of the filter.

Figure 15: A view from the AR game

7 Conclusion

This paper presented a DFA design for motion model selection in GPS–IMU sensor fusion. The results show that multiple-motion-model sensor fusion can be achieved by utilising the Kalman filter innovation together with a DFA-based model selection scheme. It was observed that the use of different motion models can reduce the filter error and prevent divergence. Choosing the appropriate motion model depending on the user's speed clearly improves the accuracy of the Kalman filter for tracking applications.

A sample AR game was used to test the defined approach, and it was observed that the filter is accurate and fast enough to collect all the reward items in the game.

Future work will delve into further analysis of different motion models and a machine learning approach appears to be a promising research direction.


  • [1] P.D. Groves. Principles of GNSS, inertial, and multi-sensor integrated navigation systems. GNSS technology and applications series. Artech House, 2008.
  • [2] T. L. Grigore, R. M. Botez, D. G. Sandu, and O. Grigorie. Experimental testing of a data fusion algorithm for miniaturized inertial sensors in redundant configurations. In Proceedings of the 2014 International Conference on Mathematical Methods, Mathematical Models and Simulation in Science and Engineering, pages 116–122, 2014.
  • [3] V. Barrile and G. Bilotta. Automated surveys and integrated auto-location by laser scanner and gps. In Proceedings of the 2014 International Conference on Communications, Signal Processing and Computers, pages 65–72, 2013.
  • [4] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME – Journal of Basic Engineering, (82 (Series D)):35–45, 1960.
  • [5] E.D. Kaplan and C.J. Hegarty. Understanding GPS: Principles and Applications. Artech House mobile communications series. Artech House, 2005.
  • [6] A. Almagbile, J. Wang, and W. Ding. Evaluating the performances of adaptive kalman filter methods in GPS/INS integration. Journal of Global Positioning Systems, 9(1):33–40, 2010.
  • [7] C. Tseng, C. Chang, and D. Jwo. Fuzzy adaptive interacting multiple model nonlinear filter for integrated navigation sensor fusion. Sensors, 11:2090–2111, 2011.
  • [8] Jeffrey Kramer and Abraham Kandel. On accurate localization and uncertain sensors. International Journal of Intelligent Systems, 27(5):429–456, 2012.
  • [9] L. Ojeda and J. Borenstein. Flexnav: fuzzy logic expert rule-based position estimation for mobile robots on rugged terrain. In Robotics and Automation, 2002. Proceedings. ICRA ’02. IEEE International Conference on, volume 1, pages 317–322 vol.1, 2002.
  • [10] Sung Kyung Hong. Fuzzy logic based closed-loop strapdown attitude system for unmanned aerial vehicle (uav). Sensors and Actuators A: Physical, 107(2):109 – 118, 2003.
  • [11] J. Chen and A. Pinz. Structure and motion by fusion of inertial and vision-based tracking. In Austrian Association for Pattern Recognition, volume 179, pages 75–82, 2004.
  • [12] P.H.S. Torr. Bayesian model estimation and selection for epipolar geometry and generic manifold fitting. International Journal of Computer Vision, 50(1):35–61, 2002.
  • [13] K. Kanatani. Uncertainty modeling and model selection for geometric inference. IEEE Trans. Pattern Anal. Mach. Intell., 26(10):1307–1319, 2004.
  • [14] K. Schindler and D. Suter. Two-view multibody structure-and-motion with outliers through model selection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(6):983–995, 2006.
  • [15] J. Civera, A.J. Davison, and J. M. M. Montiel. Interacting multiple model monocular slam. In International Conference on Robotics and Automation, pages 3704–3709, 2008.
  • [16] E. Bostanci. A DFA approach for motion model selection in sensor fusion. In Proceedings of the 2014 International Conference on Neural Networks - Fuzzy Systems, pages 53–57, 2014.
  • [17] S. H. Stovall. Basic inertial navigation. Technical report, Naval Air Warfare Center Weapons Division, 2008.
  • [18] National Imagery and Mapping Agency. Department of Defense World Geodetic System 1984: its definition and relationships with local geodetic systems. Technical report, National Imagery and Mapping Agency, 2000.
  • [19] J.E. Hopcroft, R. Motwani, and J.D. Ullman. Introduction to automata theory, languages, and computation. Pearson/Addison Wesley, 2007.
  • [20] Y. Uchida, T. Ito, M. Sakamoto, R. Katamune, K. Uchida, H. Furutani, M. Kono, S. Ikeda, and T. Yoshinaga. Path-bounded finite automata on four-dimensional input tapes. International Journal of Computers, 1(5):58–65, 2011.