Integrating Structural Description of Data Format Information into Programming to Auto-generate File Reading Programs

10/11/2021
by Xinghua Cheng, et al.

File reading is the basis for data sharing and scientific computing. However, writing file reading code by hand is labour-intensive and time-consuming, because data formats are heterogeneous and complex. To address this issue, this study proposes a novel approach for automatically generating file reading programs from structured, self-described data format information. The approach supports two reading modes: sequential and random. It proceeds in three steps. First, the file data format is described in Data Format Markup Language (DFML), producing DFML documents. Second, data type sequences are formed by parsing those DFML documents. Third, programs for sequential or random reading are generated from the data type sequences and the general programming rules of the target programming language. A tool named DFML Editor was developed for generating and editing DFML documents. Case studies on binary files (ESRI point shapefiles) and plain text files (input files of the Storm Water Management Model) were conducted with the software developed for automatic program generation and file reading. Experimental results show that the proposed approach is effective for automatically generating programs that read files. The idea in this study may also be helpful for automatically writing files.
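The idea of driving a reader from a format description can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use real DFML syntax: `RECORD_DESCRIPTION`, `TYPE_CODES`, `build_struct`, `read_sequential`, and `read_random` are hypothetical names, the description is a plain list standing in for a parsed DFML document, and fixed-size little-endian binary records are assumed. The sketch also interprets the description at run time rather than emitting source code, which is a simplification of the program generation step described in the abstract.

```python
import io
import struct

# Hypothetical stand-in for a parsed format description: an ordered list of
# (field name, type name) pairs. This is NOT real DFML syntax.
RECORD_DESCRIPTION = [("id", "int32"), ("x", "float64"), ("y", "float64")]

# Assumed mapping from type names to struct format codes.
TYPE_CODES = {"int32": "i", "float64": "d"}


def build_struct(description, byte_order="<"):
    """Turn the description into a data type sequence (a struct format)."""
    fmt = byte_order + "".join(TYPE_CODES[type_name] for _, type_name in description)
    return struct.Struct(fmt)


def read_sequential(stream, description):
    """Yield records one after another until the stream is exhausted."""
    record = build_struct(description)
    names = [name for name, _ in description]
    while True:
        chunk = stream.read(record.size)
        if len(chunk) < record.size:
            break  # end of data, or a trailing partial record
        yield dict(zip(names, record.unpack(chunk)))


def read_random(stream, description, index):
    """Jump directly to the record at the given zero-based index."""
    record = build_struct(description)
    names = [name for name, _ in description]
    stream.seek(index * record.size)
    chunk = stream.read(record.size)
    if len(chunk) < record.size:
        raise IndexError("record index out of range")
    return dict(zip(names, record.unpack(chunk)))


if __name__ == "__main__":
    # Build three sample records in memory instead of reading a real file.
    data = b"".join(struct.pack("<idd", i, float(i), float(i) * 2) for i in range(3))
    for rec in read_sequential(io.BytesIO(data), RECORD_DESCRIPTION):
        print(rec)
    print(read_random(io.BytesIO(data), RECORD_DESCRIPTION, 2))
```

Random access works here only because every record has the same size, so a record's offset is its index times the record size. Formats with variable-length records would need an index or offset table instead.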


research
12/24/2018

Neural Fuzzing: A Neural Approach to Generate Test Data for File Format Fuzzing

This article is aimed at the design and implementation of a file format ...
research
12/04/2018

Using Binary File Format Description Languages for Documenting, Parsing, and Verifying Raw Data in TAIGA Experiment

The paper is devoted to the issues of raw binary data documenting, parsi...
research
06/21/2021

ciftiTools: A package for reading, writing, visualizing and manipulating CIFTI files in R

Surface- and grayordinate-based analysis of MR data has well-recognized ...
research
10/11/2021

Parsing Data Formats of the Inputs and Outputs of Geographic Models with Code Analysis

Model web services provide an approach for implementing and facilitating...
research
12/18/2017

An anthropological account of the Vim text editor: features and tweaks after 10 years of usage

The Vim text editor is very rich in capabilities and thus complex. This ...
research
04/19/2021

Inferring Drop-in Binary Parsers from Program Executions

We present BIEBER (Byte-IdEntical Binary parsER), the first system to mo...
research
10/26/2020

5W1H-based Expression for the Effective Sharing of Information in Digital Forensic Investigations

Digital forensic investigation is used in various areas related to digit...
