Developments in the Internet of Things (IoT) and sensor technologies drive advances in modern manufacturing. Industrial manufacturing enterprises recognize this technological progress and use the new Industry 4.0 capabilities to generate added value. For example, a single sensor on a General Electric gas turbine engine can produce 500 GB of data per day. Injection molding machines, a common type of manufacturing device, can even generate multiple terabytes of sensor data per day.
However, these new possibilities also pose unique challenges, e.g., regarding data integration, as the characteristics of IoT and business data differ. Linking these two kinds of data is the key to unlocking the full potential of the collected data. In contrast to horizontal integration, which describes the integration of business data along the value chain, vertical integration refers to the connection between business and sensor data. Whereas horizontal integration only requires merging homogeneous business data, vertical integration has to combine data with differing characteristics.
In the presented demo, which is available online at https://github.com/Gnni/DemoDataIntegration, we tackle the challenge of understanding the complex data relations in an Industry 4.0 setting. We present an approach for simulating different types and numbers of sensors in the context of an industrial manufacturing company, which also produces business data as part of its regular activities. Furthermore, we enable issuing ad-hoc SQL queries on the collected data in an easy-to-use fashion. This flexibility allows analyzing and combining all kinds of available data, and thus supports horizontal as well as vertical integration in this synthetic and configurable scenario.
2 Developed Demo System
The system is developed using the Play framework with Scala as the programming language. The demo is realized as a single-page application (SPA) that allows controlling data generation for both business data and sensor data. As the application is preconfigured with the default settings of a fictional engine-producing factory employing IoT sensors, it can be run and explored immediately. The number and characteristics of the sensors that send data can be adapted using on-screen controls. Two live-updating line charts visualize the data ingestion rate for both kinds of data. The data is inserted into a columnar in-memory database to enable real-time query execution.
The data model, visualized in Figure 1, is inspired by the schemas of real Enterprise Resource Planning (ERP) systems. In particular, the idea of having a head and an item table for, e.g., sales orders is adopted in order to be as close to real-world scenarios as possible.
IoT data is stored in a separate table with the columns ID, WORKPLACE_ID, SENSOR_ID, and DATE, as well as columns related to specific sensor measurements, namely TEMPERATURE_VALUE, TEMPERATURE_UNIT, NOISE_VALUE, NOISE_UNIT, VIBRATION_VALUE, and VIBRATION_UNIT. As there are only three kinds of sensors, the last six columns are specific to these. When, e.g., a new temperature value is sent and inserted, the columns storing information about the other two sensor types stay empty for that row.
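The sparse row layout described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than the columnar in-memory database of the demo. The table name SENSOR_DATA, the column types, and the helper insert_measurement are assumptions made for illustration; only the column names are taken from the text.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the demo's database (assumption).
conn = sqlite3.connect(":memory:")

# Table name and column types are hypothetical; column names follow the text.
conn.execute("""
    CREATE TABLE SENSOR_DATA (
        ID INTEGER PRIMARY KEY,
        WORKPLACE_ID INTEGER NOT NULL,
        SENSOR_ID INTEGER NOT NULL,
        DATE TEXT NOT NULL,
        TEMPERATURE_VALUE REAL,
        TEMPERATURE_UNIT TEXT,
        NOISE_VALUE REAL,
        NOISE_UNIT TEXT,
        VIBRATION_VALUE REAL,
        VIBRATION_UNIT TEXT
    )
""")

SENSOR_KINDS = ("TEMPERATURE", "NOISE", "VIBRATION")


def insert_measurement(conn, workplace_id, sensor_id, date, kind, value, unit):
    """Insert one measurement; only the columns of the given kind are filled."""
    if kind not in SENSOR_KINDS:
        raise ValueError(f"unknown sensor kind: {kind}")
    # kind is checked against a fixed whitelist, so building the column
    # names with an f-string is safe here.
    conn.execute(
        f"INSERT INTO SENSOR_DATA "
        f"(WORKPLACE_ID, SENSOR_ID, DATE, {kind}_VALUE, {kind}_UNIT) "
        f"VALUES (?, ?, ?, ?, ?)",
        (workplace_id, sensor_id, date, value, unit),
    )


insert_measurement(conn, 1, 42, "2018-11-06 10:00:00", "TEMPERATURE", 71.5, "C")

row = conn.execute(
    "SELECT TEMPERATURE_VALUE, NOISE_VALUE, VIBRATION_VALUE FROM SENSOR_DATA"
).fetchone()
print(row)  # (71.5, None, None)
```

The noise and vibration columns of the inserted row are NULL, which corresponds to the empty columns mentioned in the text.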
Horizontal integration is achieved using IDs, whereas vertical integration uses a time-based approach. In particular, a link between sensor data and ERP data can be created because PRODUCTION_ORDER_POSITION stores when a product entered or left a certain workplace. Since sensor data also carries a timestamp, a connection between measurements and workplaces, and thus between IoT data and products, can be established.
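A minimal sketch of this time-based link, again using Python's sqlite3 module as a stand-in for the demo's database, could look as follows. The table name SENSOR_DATA and the columns PRODUCT_ID, ENTRY_DATE, and EXIT_DATE of PRODUCTION_ORDER_POSITION are hypothetical, as the text does not name them. All rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- PRODUCT_ID, ENTRY_DATE, and EXIT_DATE are assumed column names.
    CREATE TABLE PRODUCTION_ORDER_POSITION (
        ID INTEGER PRIMARY KEY,
        PRODUCT_ID INTEGER,
        WORKPLACE_ID INTEGER,
        ENTRY_DATE TEXT,
        EXIT_DATE TEXT
    );
    -- Reduced sensor table (hypothetical name), temperature columns only.
    CREATE TABLE SENSOR_DATA (
        ID INTEGER PRIMARY KEY,
        WORKPLACE_ID INTEGER,
        SENSOR_ID INTEGER,
        DATE TEXT,
        TEMPERATURE_VALUE REAL
    );
    INSERT INTO PRODUCTION_ORDER_POSITION VALUES
        (1, 100, 1, '2018-11-06 10:00:00', '2018-11-06 10:10:00'),
        (2, 101, 1, '2018-11-06 10:10:00', '2018-11-06 10:20:00');
    INSERT INTO SENSOR_DATA VALUES
        (1, 1, 7, '2018-11-06 10:05:00', 70.0),
        (2, 1, 7, '2018-11-06 10:08:00', 72.0),
        (3, 1, 7, '2018-11-06 10:15:00', 80.0),
        (4, 2, 8, '2018-11-06 10:05:00', 20.0);
""")

# A measurement belongs to a product if it was taken at the same workplace
# while the product was there. ISO 8601 timestamps stored as text compare
# correctly as strings.
matches = conn.execute("""
    SELECT p.PRODUCT_ID, s.DATE, s.TEMPERATURE_VALUE
    FROM SENSOR_DATA AS s
    JOIN PRODUCTION_ORDER_POSITION AS p
      ON s.WORKPLACE_ID = p.WORKPLACE_ID
     AND s.DATE >= p.ENTRY_DATE
     AND s.DATE < p.EXIT_DATE
    ORDER BY s.DATE
""").fetchall()

for match in matches:
    print(match)
```

The measurement taken at workplace 2 has no matching production order position and therefore does not appear in the result. The half-open interval (entry inclusive, exit exclusive) is a design choice of this sketch that prevents a measurement from being assigned to two consecutive products.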
3 Features and Demo Scenario
The demo shown in Figure 2 allows simulating an industrial manufacturing company, which, from a data perspective, produces business as well as sensor data.
Ad-hoc queries combining sensor data and business data can be executed. All query results are presented in the form of a table. Furthermore, two sample queries are provided, which answer the following questions:
- What are the average temperature and noise at the cutting machine workplace for my recently manufactured products?
- What are the average vibrations at the assembly workplace, depending on the supplier?
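To give an impression of how the second question combines both integration types, the following sketch joins business tables via IDs (horizontal integration) and attaches sensor data via time windows (vertical integration). It is not the query shipped with the demo. The tables SUPPLIER and PRODUCT, the table name SENSOR_DATA, the columns PRODUCT_ID, SUPPLIER_ID, ENTRY_DATE, and EXIT_DATE, the workplace ID of the assembly workplace, and all rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, heavily reduced schema; not the data model of Figure 1.
    CREATE TABLE SUPPLIER (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE PRODUCT (ID INTEGER PRIMARY KEY, SUPPLIER_ID INTEGER);
    CREATE TABLE PRODUCTION_ORDER_POSITION (
        ID INTEGER PRIMARY KEY, PRODUCT_ID INTEGER, WORKPLACE_ID INTEGER,
        ENTRY_DATE TEXT, EXIT_DATE TEXT
    );
    CREATE TABLE SENSOR_DATA (
        ID INTEGER PRIMARY KEY, WORKPLACE_ID INTEGER, SENSOR_ID INTEGER,
        DATE TEXT, VIBRATION_VALUE REAL, VIBRATION_UNIT TEXT
    );
    INSERT INTO SUPPLIER VALUES (1, 'Supplier A'), (2, 'Supplier B');
    INSERT INTO PRODUCT VALUES (100, 1), (101, 2), (102, 1);
    INSERT INTO PRODUCTION_ORDER_POSITION VALUES
        (1, 100, 2, '2018-11-06 10:00:00', '2018-11-06 10:10:00'),
        (2, 101, 2, '2018-11-06 10:10:00', '2018-11-06 10:20:00'),
        (3, 102, 2, '2018-11-06 10:20:00', '2018-11-06 10:30:00');
    INSERT INTO SENSOR_DATA VALUES
        (1, 2, 7, '2018-11-06 10:05:00', 1.0, 'mm/s'),
        (2, 2, 7, '2018-11-06 10:15:00', 3.0, 'mm/s'),
        (3, 2, 7, '2018-11-06 10:25:00', 2.0, 'mm/s'),
        (4, 1, 8, '2018-11-06 10:05:00', 9.0, 'mm/s');
""")

ASSEMBLY_WORKPLACE_ID = 2  # assumed ID of the assembly workplace

avg_vibration_by_supplier = conn.execute("""
    SELECT sup.NAME, AVG(s.VIBRATION_VALUE)
    FROM SUPPLIER AS sup
    -- Horizontal integration: business tables are linked via IDs.
    JOIN PRODUCT AS prod ON prod.SUPPLIER_ID = sup.ID
    JOIN PRODUCTION_ORDER_POSITION AS pop ON pop.PRODUCT_ID = prod.ID
    -- Vertical integration: sensor rows are linked via workplace and time.
    JOIN SENSOR_DATA AS s
      ON s.WORKPLACE_ID = pop.WORKPLACE_ID
     AND s.DATE >= pop.ENTRY_DATE
     AND s.DATE < pop.EXIT_DATE
    WHERE pop.WORKPLACE_ID = ?
      AND s.VIBRATION_VALUE IS NOT NULL
    GROUP BY sup.NAME
    ORDER BY sup.NAME
""", (ASSEMBLY_WORKPLACE_ID,)).fetchall()

print(avg_vibration_by_supplier)  # [('Supplier A', 1.5), ('Supplier B', 3.0)]
```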
At the top of Figure 2, there are three buttons. The one in the upper-left corner starts and stops data generation. The two diagrams at the top visualize the input rate of business and IoT data. The sensor characteristics are defined in a JSON file. Clicking the button in the middle opens the sensor configuration area, which allows, e.g., adding certain kinds of sensors to workplaces that produce data at a definable input rate. This area is not shown in Figure 2.
Below the upper two diagrams, there are two more buttons, each triggering the execution of one of the two predefined queries mentioned above. The lower diagram shows the result of the first query, i.e., average temperature and noise values for the most recently manufactured products at the cutting machine. Not shown in Figure 2 are the result table, which presents the raw data belonging to issued queries, and the query formulation area, where the predefined queries can be adapted or any ad-hoc query can be entered and executed.
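As the format of the JSON file is not described here, the following sketch only illustrates the general idea of a sensor configuration that assigns sensors of a certain kind to workplaces and defines their input rate. All field names (workplaceId, sensorId, type, unit, ratePerSecond, min, max), the values, and the function simulate are hypothetical and not taken from the demo.

```python
import json
import random

# Hypothetical sensor configuration; the demo's actual JSON layout may differ.
CONFIG_JSON = """
[
  {"workplaceId": 1, "sensorId": 1, "type": "TEMPERATURE", "unit": "C",
   "ratePerSecond": 2, "min": 60.0, "max": 90.0},
  {"workplaceId": 2, "sensorId": 2, "type": "VIBRATION", "unit": "mm/s",
   "ratePerSecond": 5, "min": 0.0, "max": 4.0}
]
"""


def simulate(config, seconds, rng):
    """Generate uniformly distributed readings for each configured sensor."""
    readings = []
    for sensor in config:
        count = sensor["ratePerSecond"] * seconds
        for i in range(count):
            readings.append({
                "workplaceId": sensor["workplaceId"],
                "sensorId": sensor["sensorId"],
                "type": sensor["type"],
                # Readings are spaced evenly according to the input rate.
                "offsetSeconds": i / sensor["ratePerSecond"],
                "value": rng.uniform(sensor["min"], sensor["max"]),
                "unit": sensor["unit"],
            })
    return readings


config = json.loads(CONFIG_JSON)
readings = simulate(config, seconds=3, rng=random.Random(0))
print(len(readings))  # 21 readings: 2 * 3 + 5 * 3
```

Passing a seeded random.Random instance keeps the generated values reproducible, which is convenient when comparing different sensor configurations.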
4 Conclusion and Future Work
The presented tool allows experiencing horizontal and vertical integration in scenarios with a real-world character. IoT data can be configured and its influence on, e.g., performance can be analyzed. In addition to predefined queries that answer valuable questions, any ad-hoc query on the collected data can be executed. Results of the predefined queries are visualized in a diagram. To the best of our knowledge, the presented demo application is the first of its kind, i.e., a program providing an explorable Industry 4.0 environment with a focus on scenarios close to real-world systems and use cases. Experiments regarding data integration strategies, data volumes, and their impact on query performance can be conducted easily. As the code is provided, adaptations and further developments are possible.
References
- Play - The High Velocity Web Framework for Java and Scala. https://www.playframework.com, accessed: 2018-11-06
- Davenport, T.H., Dyché, J.: Big Data in Big Companies. http://docs.media.bitpipe.com/io_10x/io_102267/item_725049/Big-Data-in-Big-Companies.pdf (May 2013), accessed: 2017-02-28
- Hesse, G., Reissaus, B., Matthies, C., Lorenz, M., Kraus, M., Uflacker, M.: Senska - Towards an Enterprise Streaming Benchmark. In: Performance Evaluation and Benchmarking for the Analytics Era - TPC Technology Conference. pp. 25–40 (2017)
- Huber, M.F., Voigt, M., Ngomo, A.N.: Big Data Architecture for the Semantic Analysis of Complex Events in Manufacturing. In: 46. Jahrestagung der Gesellschaft für Informatik, Informatik. pp. 353–360 (2016)
- Plattner, H.: A common database approach for OLTP and OLAP using an in-memory column database. In: ACM SIGMOD International Conference on Management of Data. pp. 1–2 (2009)