Chi Zhang

is this you? claim profile


  • Deep Learning Methods for Parallel Magnetic Resonance Image Reconstruction

    Following the success of deep learning in a wide range of applications, neural network-based machine learning techniques have received interest as a means of accelerating magnetic resonance imaging (MRI). A number of ideas inspired by deep learning techniques from computer vision and image processing have been successfully applied to non-linear image reconstruction in the spirit of compressed sensing for both low dose computed tomography and accelerated MRI. The additional integration of multi-coil information to recover missing k-space lines in the MRI reconstruction process, is still studied less frequently, even though it is the de-facto standard for currently used accelerated MR acquisitions. This manuscript provides an overview of the recent machine learning approaches that have been proposed specifically for improving parallel imaging. A general background introduction to parallel MRI is given that is structured around the classical view of image space and k-space based methods. Both linear and non-linear methods are covered, followed by a discussion of recent efforts to further improve parallel imaging using machine learning, and specifically using artificial neural networks. Image-domain based techniques that introduce improved regularizers are covered as well as k-space based methods, where the focus is on better interpolation strategies using neural networks. Issues and open problems are discussed as well as recent efforts for producing open datasets and benchmarks for the community.

    04/01/2019 ∙ by Florian Knoll, et al. ∙ 42 share

    read it

  • Semantic Sentence Embeddings for Paraphrasing and Text Summarization

    This paper introduces a sentence to vector encoding framework suitable for advanced natural language processing. Our latent representation is shown to encode sentences with common semantic information with similar vector representations. The vector representation is extracted from an encoder-decoder model which is trained on sentence paraphrase pairs. We demonstrate the application of the sentence representations for two different tasks -- sentence paraphrasing and paragraph summarization, making it attractive for commonly used recurrent frameworks that process text. Experimental results help gain insight how vector representations are suitable for advanced language embedding.

    09/26/2018 ∙ by Chi Zhang, et al. ∙ 14 share

    read it

  • Dissociable neural representations of adversarially perturbed images in deep neural networks and the human brain

    Despite the remarkable similarities between deep neural networks (DNN) and the human brain as shown in previous studies, the fact that DNNs still fall behind humans in many visual tasks suggests that considerable differences still exist between the two systems. To probe their dissimilarities, we leverage adversarial noise (AN) and adversarial interference (AI) images that yield distinct recognition performance in a prototypical DNN (AlexNet) and human vision. The evoked activity by regular (RE) and adversarial images in both systems is thoroughly compared. We find that representational similarity between RE and adversarial images in the human brain resembles their perceptual similarity. However, such representation-perception association is disrupted in the DNN. Especially, the representational similarity between RE and AN images idiosyncratically increases from low- to high-level layers. Furthermore, forward encoding modeling reveals that the DNN-brain hierarchical correspondence proposed in previous studies only holds when the two systems process RE and AI images but not AN images. These results might be due to the deterministic modeling approach of current DNNs. Taken together, our results provide a complementary perspective on the comparison between DNNs and the human brain, and highlight the need to characterize their differences to further bridge artificial and human intelligence research.

    12/22/2018 ∙ by Chi Zhang, et al. ∙ 14 share

    read it

  • STNReID : Deep Convolutional Networks with Pairwise Spatial Transformer Networks for Partial Person Re-identification

    Partial person re-identification (ReID) is a challenging task because only partial information of person images is available for matching target persons. Few studies, especially on deep learning, have focused on matching partial person images with holistic person images. This study presents a novel deep partial ReID framework based on pairwise spatial transformer networks (STNReID), which can be trained on existing holistic person datasets. STNReID includes a spatial transformer network (STN) module and a ReID module. The STN module samples an affined image (a semantically corresponding patch) from the holistic image to match the partial image. The ReID module extracts the features of the holistic, partial, and affined images. Competition (or confrontation) is observed between the STN module and the ReID module, and two-stage training is applied to acquire a strong STNReID for partial ReID. Experimental results show that our STNReID obtains 66.7 accuracies on partial ReID and partial iLIDS datasets, respectively. These values are at par with those obtained with state-of-the-art methods.

    03/17/2019 ∙ by Hao Luo, et al. ∙ 14 share

    read it

  • Two Birds with One Network: Unifying Failure Event Prediction and Time-to-failure Modeling

    One of the key challenges in predictive maintenance is to predict the impending downtime of an equipment with a reasonable prediction horizon so that countermeasures can be put in place. Classically, this problem has been posed in two different ways which are typically solved independently: (1) Remaining useful life (RUL) estimation as a long-term prediction task to estimate how much time is left in the useful life of the equipment and (2) Failure prediction (FP) as a short-term prediction task to assess the probability of a failure within a pre-specified time window. As these two tasks are related, performing them separately is sub-optimal and might results in inconsistent predictions for the same equipment. In order to alleviate these issues, we propose two methods: Deep Weibull model (DW-RNN) and multi-task learning (MTL-RNN). DW-RNN is able to learn the underlying failure dynamics by fitting Weibull distribution parameters using a deep neural network, learned with a survival likelihood, without training directly on each task. While DW-RNN makes an explicit assumption on the data distribution, MTL-RNN exploits the implicit relationship between the long-term RUL and short-term FP tasks to learn the underlying distribution. Additionally, both our methods can leverage the non-failed equipment data for RUL estimation. We demonstrate that our methods consistently outperform baseline RUL methods that can be used for FP while producing consistent results for RUL and FP. We also show that our methods perform at par with baselines trained on the objectives optimized for either of the two tasks.

    12/18/2018 ∙ by Karan Aggarwal, et al. ∙ 10 share

    read it

  • Re-ID Driven Localization Refinement for Person Search

    Person search aims at localizing and identifying a query person from a gallery of uncropped scene images. Different from person re-identification (re-ID), its performance also depends on the localization accuracy of a pedestrian detector. The state-of-the-art methods train the detector individually, and the detected bounding boxes may be sub-optimal for the following re-ID task. To alleviate this issue, we propose a re-ID driven localization refinement framework for providing the refined detection boxes for person search. Specifically, we develop a differentiable ROI transform layer to effectively transform the bounding boxes from the original images. Thus, the box coordinates can be supervised by the re-ID training other than the original detection task. With this supervision, the detector can generate more reliable bounding boxes, and the downstream re-ID model can produce more discriminative embeddings based on the refined person localizations. Extensive experimental results on the widely used benchmarks demonstrate that our proposed method performs favorably against the state-of-the-art person search methods.

    09/18/2019 ∙ by Chuchu Han, et al. ∙ 8 share

    read it

  • MEBF: a fast and efficient Boolean matrix factorization method

    Boolean matrix has been used to represent digital information in many fields, including bank transaction, crime records, natural language processing, protein-protein interaction, etc. Boolean matrix factorization (BMF) aims to find an approximation of a binary matrix as the Boolean product of two low rank Boolean matrices, which could generate vast amount of information for the patterns of relationships between the features and samples. Inspired by binary matrix permutation theories and geometric segmentation, we developed a fast and efficient BMF approach called MEBF (Median Expansion for Boolean Factorization). Overall, MEBF adopted a heuristic approach to locate binary patterns presented as submatrices that are dense in 1's. At each iteration, MEBF permutates the rows and columns such that the permutated matrix is approximately Upper Triangular-Like (UTL) with so-called Simultaneous Consecutive-ones Property (SC1P). The largest submatrix dense in 1 would lies on the upper triangular area of the permutated matrix, and its location was determined based on a geometric segmentation of a triangular. We compared MEBF with other state of the art approaches on data scenarios with different sparsity and noise levels. MEBF demonstrated superior performances in lower reconstruction error, and higher computational efficiency, as well as more accurate sparse patterns than popular methods such as ASSO, PANDA and MP. We demonstrated the application of MEBF on both binary and non-binary data sets, and revealed its further potential in knowledge retrieving and data denoising.

    09/09/2019 ∙ by Changlin Wan, et al. ∙ 4 share

    read it

  • An Integration of Bottom-up and Top-Down Salient Cues on RGB-D Data: Saliency from Objectness vs. Non-Objectness

    Bottom-up and top-down visual cues are two types of information that helps the visual saliency models. These salient cues can be from spatial distributions of the features (space-based saliency) or contextual / task-dependent features (object based saliency). Saliency models generally incorporate salient cues either in bottom-up or top-down norm separately. In this work, we combine bottom-up and top-down cues from both space and object based salient features on RGB-D data. In addition, we also investigated the ability of various pre-trained convolutional neural networks for extracting top-down saliency on color images based on the object dependent feature activation. We demonstrate that combining salient features from color and dept through bottom-up and top-down methods gives significant improvement on the salient object detection with space based and object based salient cues. RGB-D saliency integration framework yields promising results compared with the several state-of-the-art-models.

    07/04/2018 ∙ by Nevrez Imamoglu, et al. ∙ 2 share

    read it

  • MetaStyle: Three-Way Trade-Off Among Speed, Flexibility, and Quality in Neural Style Transfer

    An unprecedented booming has been witnessed in the research area of artistic style transfer ever since Gatys et al. introduced the neural method. One of the remaining challenges is to balance a trade-off among three critical aspects---speed, flexibility, and quality: (i) the vanilla optimization-based algorithm produces impressive results for arbitrary styles, but is unsatisfyingly slow due to its iterative nature, (ii) the fast approximation methods based on feed-forward neural networks generate satisfactory artistic effects but bound to only a limited number of styles, and (iii) feature-matching methods like AdaIN achieve arbitrary style transfer in a real-time manner but at a cost of the compromised quality. We find it considerably difficult to balance the trade-off well merely using a single feed-forward step and ask, instead, whether there exists an algorithm that could adapt quickly to any style, while the adapted model maintains high efficiency and good image quality. Motivated by this idea, we propose a novel method, coined MetaStyle, which formulates the neural style transfer as a bilevel optimization problem and combines learning with only a few post-processing update steps to adapt to a fast approximation model with satisfying artistic effects, comparable to the optimization-based methods for an arbitrary style. The qualitative and quantitative analysis in the experiments demonstrates that the proposed approach achieves high-quality arbitrary artistic style transfer effectively, with a good trade-off among speed, flexibility, and quality.

    12/13/2018 ∙ by Chi Zhang, et al. ∙ 2 share

    read it

  • SOT for MOT

    In this paper we present a robust tracker to solve the multiple object tracking (MOT) problem, under the framework of tracking-by-detection. As the first contribution, we innovatively combine single object tracking (SOT) algorithms with multiple object tracking algorithms, and our results show that SOT is a general way to strongly reduce the number of false negatives, regardless of the quality of detection. Another contribution is that we show with a deep learning based appearance model, it is easy to associate detections of the same object efficiently and also with high accuracy. This appearance model plays an important role in our MOT algorithm to correctly associate detections into long trajectories, and also in our SOT algorithm to discover new detections mistakenly missed by the detector. The deep neural network based model ensures the robustness of our tracking algorithm, which can perform data association in a wide variety of scenes. We ran comprehensive experiments on a large-scale and challenging dataset, the MOT16 benchmark, and results showed that our tracker achieved state-of-the-art performance based on both public and private detections.

    12/04/2017 ∙ by Qizheng He, et al. ∙ 0 share

    read it

  • AlignedReID: Surpassing Human-Level Performance in Person Re-Identification

    In this paper, we propose a novel method called AlignedReID that extracts a global feature which is jointly learned with local features. Global feature learning benefits greatly from local feature learning, which performs an alignment/matching by calculating the shortest path between two sets of local features, without requiring extra supervision. After the joint learning, we only keep the global feature to compute the similarities between images. Our method achieves rank-1 accuracy of 94.0 outperforming state-of-the-art methods by a large margin. We also evaluate human-level performance and demonstrate that our method is the first to surpass human-level performance on Market1501 and CUHK03, two widely used Person ReID datasets.

    11/22/2017 ∙ by Xuan Zhang, et al. ∙ 0 share

    read it