MLReal: Bridging the gap between training on synthetic data and real data applications in machine learning

09/11/2021
by Tariq Alkhalifah, et al.

Among the biggest challenges in using neural networks trained on waveform data (e.g., seismic, electromagnetic, or ultrasound) is applying them to real data. The requirement for accurate labels forces us to develop solutions using synthetic data, where labels are readily available. However, synthetic data often do not capture the reality of the field or real experiment, so the trained neural network (NN) performs poorly at the inference stage. We describe a novel approach to enhance supervised training on synthetic data with real data features (domain adaptation). Specifically, for tasks in which the absolute values of the vertical axis (time or depth) of the input data are not crucial, like classification, or can be corrected afterward, like velocity model building using a well log, we suggest a series of linear operations on the input so that the training and application data have similar distributions. This is accomplished by applying two operations to the input data of the NN model: 1) the crosscorrelation of the input data (e.g., a shot gather or seismic image) with a fixed reference trace from the same dataset; 2) the convolution of the resulting data with the mean (or a random sample) of the autocorrelated data from the other domain. In the training stage, the input data are from the synthetic domain and the autocorrelated data are from the real domain, with random samples from the real data drawn at every training epoch. In the inference/application stage, the input data are from the real data subset and the mean of the autocorrelated sections is from the synthetic data subset. Example applications on passive seismic data, for locating microseismic event sources, and on active seismic data, for predicting low frequencies, demonstrate how this approach improves the applicability of trained models to real data.
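The two operations described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration only, not the authors' implementation. The function names, the `(n_samples, n_traces)` array layout, the choice of the first trace as the reference, the `mode="same"` truncation, and the final amplitude normalization are all assumptions made for the example.

```python
import numpy as np


def crosscorrelate_with_reference(section, ref_index=0):
    """Step 1: cross-correlate every trace with one reference trace.

    `section` is assumed to have shape (n_samples, n_traces); the reference
    is taken from the same section (an assumption of this sketch).
    """
    ref = section[:, ref_index]
    traces = [np.correlate(section[:, i], ref, mode="same")
              for i in range(section.shape[1])]
    return np.stack(traces, axis=1)


def mean_autocorrelation(sections):
    """Mean of the trace-by-trace autocorrelations over a list of sections."""
    acs = [np.correlate(tr, tr, mode="same")
           for sec in sections for tr in sec.T]
    return np.mean(acs, axis=0)


def mlreal_transform(section, other_domain_sections, ref_index=0):
    """Apply both steps: cross-correlation, then convolution with the
    other domain's mean autocorrelation."""
    xc = crosscorrelate_with_reference(section, ref_index)
    ac = mean_autocorrelation(other_domain_sections)
    out = np.stack([np.convolve(xc[:, i], ac, mode="same")
                    for i in range(xc.shape[1])], axis=1)
    # Scale to unit peak amplitude (an assumption, not stated in the abstract).
    return out / (np.max(np.abs(out)) + 1e-12)


# Toy usage with random arrays standing in for synthetic and real gathers.
rng = np.random.default_rng(0)
synthetic = rng.standard_normal((64, 8))
real = [rng.standard_normal((64, 8)) for _ in range(3)]
train_input = mlreal_transform(synthetic, real)        # training stage
infer_input = mlreal_transform(real[0], [synthetic])   # inference stage
```

During training, the abstract calls for a random sample of the real autocorrelated data at every epoch rather than a fixed mean; in this sketch that would correspond to passing a different randomly chosen subset of `real` on each epoch.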


