Semantically Enhanced Time Series Databases in IoT-Edge-Cloud Infrastructure
Many IoT systems are data intensive and are for the purpose of monitoring for fault detection and diagnosis of critical systems. A large volume of data steadily come out of a large number of sensors in the monitoring system. Thus, we need to consider how to store and manage these data. Existing time series databases (TSDBs) can be used for monitoring data storage, but they do not have good models for describing the data streams stored in the database. In this paper, we develop a semantic model for the specification of the monitoring data streams (time series data) in terms of which sensor generated the data stream, which metric of which entity the sensor is monitoring, what is the relation of the entity to other entities in the system, which measurement unit is used for the data stream, etc. We have also developed a tool suite, SE-TSDB, that can run on top of existing TSDBs to help establish semantic specifications for data streams and enable semantic-based data retrievals. With our semantic model for monitoring data and our SE-TSDB tool suite, users can retrieve non-existing data streams that can be automatically derived from the semantics. Users can also retrieve data streams without knowing where they are. Semantic based retrieval is especially important in a large-scale integrated IoT-Edge-Cloud system, because of its sheer quantity of data, its huge number of computing and IoT devices that may store the data, and the dynamics in data migration and evolution. With better data semantics, data streams can be more effectively tracked and flexibly retrieved to help with timely data analysis and control decision making anywhere and anytime.
READ FULL TEXT