An IR-based Approach Towards Automated Integration of Geo-spatial Datasets in Map-based Software Systems

06/13/2019
by Nima Miryeganeh, et al.

Data is arguably the most valuable asset of the modern world, and the success of any data-intensive solution relies on the quality of the data that drives it. Among the vast amounts of data that are captured, managed, and analyzed every day, geospatial data is one of the most interesting classes: it holds geographical information about real-world phenomena and can be visualized as digital maps. Geospatial data is the source of many enterprise solutions that provide local information and insights. To increase the quality of such solutions, companies continuously aggregate geospatial datasets from various sources. However, the lack of a global standard model for geospatial datasets makes merging and integrating them difficult and error-prone. Traditionally, domain experts manually validate the data integration process, checking merged new data sources and/or new versions of existing data for conflicts and other requirement violations. However, this approach does not scale and hinders rapid releases when dealing with frequently changing big datasets. Thus, more automated approaches that require limited interaction with domain experts are needed. As a first step toward tackling this problem, in this paper we leverage Information Retrieval (IR) and geospatial search techniques to propose a systematic and automated conflict identification approach. To evaluate our approach, we conduct a case study in which we measure its accuracy in several real-world scenarios and interview software developers at Localintel Inc. (our industry partner) to get their feedback.

