Data preprocessing is a crucial step in the machine learning process tha...
Graph data augmentation has proven to be effective in enhancing the gene...
Entity matching (EM) refers to the problem of identifying pairs of data...
Multimodal electronic health record (EHR) data are widely used in clinic...
Domain generalization (DG) aims at generalizing a classifier trained on...
As machine learning becomes prevalent, mitigating any unfairness present...
To reduce human annotation effort, the programmatic weak supervisio...
In healthcare prediction tasks, it is essential to exploit the correlati...
Estimating the number of distinct values (NDV) in a column is useful for...
Entity matching (EM) refers to the problem of identifying tuple pairs in...
Machine learning (ML) is increasingly being used to make decisions in ou...
Fuzzy similarity join is an important database operator widely used in p...
Machine learning (ML) applications have been thriving recently, largely...
Entity resolution (ER) refers to the problem of identifying records in o...
It is widely recognized that data quality affects machine learning (...
Generating large labeled training data is becoming the biggest bottlenec...
Drug-drug interactions (DDIs) are a major cause of preventable hospitali...
Time series prediction is of great significance in many applications and...