
The PCP-like Theorem for Sublinear Time Inapproximability
In this paper we propose the PCP-like theorem for sublinear time inappr...
Sublinear Time Nearest Neighbor Search over Generalized Weighted Manhattan Distance
Nearest Neighbor Search (NNS) over generalized weighted distance is fund...
Efficient Approximate Nearest Neighbor Search for Multiple Weighted l_p≤2 Distance Functions
Nearest neighbor search is fundamental to a wide range of applications. ...
Sampling Based Approximate Skyline Calculation on Big Data
The existing algorithms for processing skyline queries cannot adapt to b...
A Sublinear Time Algorithm for Approximating k-Nearest-Neighbor with Full Quality Guarantee
In this paper we propose an algorithm for the approximate k-Nearest-Neig...
Efficient Trajectory Compression and Queries
Nowadays, GPS sensors are ubiquitous in various devices col...
Complexity and Efficient Algorithms for Data Inconsistency Evaluating and Repairing
Data inconsistency evaluating and repairing are major concerns in data q...
ExperienceThinking: Hyperparameter Optimization with Budget Constraints
The problem of hyperparameter optimization exists widely in the real lif...
AutoModel: Utilizing Research Papers and HPO Techniques to Deal with the CASH problem
In many fields, a mass of algorithms with completely different hyperpara...
Recognizing the Tractability in Big Data Computing
Due to the limitation on computational power of existing computers, the ...
Autoregressive-Model-Based Methods for Online Time Series Prediction with Missing Values: an Experimental Evaluation
Time series prediction with missing values is an important problem of ti...
A True O(n log n) Algorithm for the All-k-Nearest-Neighbors Problem
In this paper we examined an algorithm for the All-k-Nearest-Neighbor pr...
Regular Expression Matching on Billion-Node Graphs
In many applications, it is necessary to retrieve pairs of vertices with...
An Algorithm for Reducing Approximate Nearest Neighbor to Approximate Near Neighbor with O(log n) Query Time
This paper proposes a new algorithm for reducing Approximate Nearest Nei...
On the Fairness of Quality-based Data Markets
For data pricing, data quality is a factor that must be considered. To k...
Mining CFD Rules on Big Data
Current conditional functional dependencies (CFDs) discovery algorithms ...
Schema Integration on Massive Data Sources
As the fundamental phase of collecting and analyzing data, data integra...
Diversification on Big Data in Query Processing
Recently, in the area of big data, some popular applications such as web...
Improve3C: Data Cleaning on Consistency and Completeness with Currency
Data quality plays a key role in big data management today. With the exp...
MISS: Finding Optimal Sample Sizes for Approximate Analytics
Nowadays, sampling-based Approximate Query Processing (AQP) is widely re...
QuickIM: Efficient, Accurate and Robust Influence Maximization Algorithm on Billion-Scale Networks
The Influence Maximization (IM) problem aims at finding k seed vertices ...
Impacts of Dirty Data: an Experimental Evaluation
Data quality issues have attracted widespread attention due to the negat...
An Iterative Scheme for Leverage-based Approximate Aggregation
Currently data explosion poses great challenges to approximate aggregati...
Optimal Scheduling of Friendly Jammers for Securing Wireless Communication
Wireless communication systems, such as wireless sensor networks and RFI...
Discovery of Paradigm Dependencies
Missing and incorrect values often cause serious consequences. To deal w...
Diversified Coherent Core Search on Multi-Layer Graphs
Mining dense subgraphs on multilayer graphs is an interesting problem, ...
