Kuanquan Wang

  • Focus On What's Important: Self-Attention Model for Human Pose Estimation

    Human pose estimation is an essential yet challenging task in computer vision. One reason for this difficulty is that images contain many redundant regions. In this work, we propose a convolutional network architecture combined with a novel attention model, which we name the attention convolutional neural network (ACNN). ACNN learns to focus on specific regions of different input features. It is a multi-stage architecture: early stages filter out the "nothing-regions", such as background and redundant body parts, and then pass the important regions, which contain the joints of the human body, to the following stages to obtain a more accurate result. Moreover, it does not require extra manual annotations, and self-learning is one of our intentions. We separately trained the network because the attention learning task and the pose estimation task are not independent. State-of-the-art performance is obtained on the MPII benchmarks.

    09/22/2018 ∙ by Guanxiong Sun, et al.

  • Detail-preserving and Content-aware Variational Multi-view Stereo Reconstruction

    Accurate recovery of 3D geometrical surfaces from calibrated 2D multi-view images is a fundamental yet active research area in computer vision. Despite the steady progress in multi-view stereo reconstruction, most existing methods are still limited in recovering fine-scale details and sharp features while suppressing noise, and may fail to reconstruct regions with little texture. To address these limitations, this paper presents a Detail-preserving and Content-aware Variational (DCV) multi-view stereo method, which reconstructs the 3D surface by alternating between reprojection error minimization and mesh denoising. In reprojection error minimization, we propose a novel inter-image similarity measure, which is effective at preserving fine-scale details of the reconstructed surface and builds a connection between guided image filtering and image registration. In mesh denoising, we propose a content-aware ℓ_p-minimization algorithm that adaptively estimates the p value and regularization parameters based on the current input. It suppresses noise while preserving sharp features much better than conventional isotropic mesh smoothing. Experimental results on benchmark datasets demonstrate that our DCV method is capable of recovering more surface details, and obtains cleaner and more accurate reconstructions than state-of-the-art methods. In particular, our method achieves the best results among all published methods on the Middlebury dino ring and dino sparse ring datasets in terms of both completeness and accuracy.

    05/03/2015 ∙ by Zhaoxin Li, et al.

  • Multi-views Fusion CNN for Left Ventricular Volumes Estimation on Cardiac MR Images

    Left ventricular (LV) volume estimation is a critical procedure for cardiac disease diagnosis. The objective of this paper is to address the direct LV volume prediction task. Methods: In this paper, we propose a direct volume prediction method based on end-to-end deep convolutional neural networks (CNN). We study the end-to-end LV volume prediction method in terms of data preprocessing, network structure, and multi-view fusion strategy. The main contributions of this paper are as follows. First, we propose a new data preprocessing method for cardiac magnetic resonance (CMR). Second, we propose a new network structure for end-to-end LV volume estimation. Third, we explore the representational capacity of different slices, and propose a fusion strategy to improve the prediction accuracy. Results: The evaluation results show that the proposed method outperforms other state-of-the-art LV volume estimation methods on the openly accessible benchmark datasets. The clinical indexes derived from the predicted volumes agree well with the ground truth (EDV: R2=0.974, RMSE=9.6ml; ESV: R2=0.976, RMSE=7.1ml; EF: R2=0.828, RMSE =4.71 useful for LV volumes prediction task. Significance: The proposed method not only has application potential for cardiac disease screening on large-scale CMR data, but can also be extended to other medical image research fields.

    04/09/2018 ∙ by Gongning Luo, et al.
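The abstract above reports agreement for three clinical indexes: EDV, ESV, and EF. As a rough illustration, the sketch below uses the conventional definition EF = (EDV − ESV) / EDV and computes RMSE and R2 for a handful of made-up volumes. The function names and numbers are hypothetical and are not taken from the paper, and R2 is assumed here to mean the coefficient of determination (it could also denote a squared correlation coefficient).

```python
import math

def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent from end-diastolic and end-systolic volumes."""
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml

def rmse(pred, truth):
    """Root-mean-square error between predictions and ground truth."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def r_squared(pred, truth):
    """Coefficient of determination (assumed meaning of R2)."""
    mean_t = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, truth))
    ss_tot = sum((t - mean_t) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot

# Hypothetical EDV values in ml, for illustration only.
true_edv = [100.0, 120.0, 140.0]
pred_edv = [102.0, 118.0, 141.0]
print(ejection_fraction(120.0, 50.0))
print(rmse(pred_edv, true_edv), r_squared(pred_edv, true_edv))
```

Because EF is a ratio of two predicted volumes, errors in EDV and ESV can compound, which is one plausible reason an EF agreement score could be lower than the scores for the volumes themselves.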

  • VoxelAtlasGAN: 3D Left Ventricle Segmentation on Echocardiography with Atlas Guided Generation and Voxel-to-voxel Discrimination

    3D left ventricle (LV) segmentation on echocardiography is very important for the diagnosis and treatment of cardiac disease. This is not only because echocardiography is a real-time imaging technology that is widespread in clinical application, but also because LV segmentation on 3D echocardiography can provide more complete volume information about the heart than LV segmentation on 2D echocardiography. However, 3D LV segmentation on echocardiography is still an open and challenging task owing to the lower contrast, higher noise and data dimensionality, and limited annotation of 3D echocardiography. In this paper, we propose a novel real-time framework, VoxelAtlasGAN, for 3D LV segmentation on 3D echocardiography. This framework makes three contributions: 1) It is based on voxel-to-voxel conditional generative adversarial nets (cGAN). For the first time, cGAN is used for 3D LV segmentation on echocardiography, and it advantageously fuses substantial 3D spatial context information from 3D echocardiography through a self-learned structured loss; 2) For the first time, it embeds the atlas into an end-to-end optimization framework, which uses the 3D LV atlas as powerful prior knowledge to improve the inference speed and to address the lower contrast and limited annotation problems of 3D echocardiography; 3) It combines the traditional discrimination loss with the newly proposed consistent constraint, which further improves the generalization of the proposed framework. VoxelAtlasGAN was validated on 3D echocardiography from 60 subjects and achieved satisfactory segmentation results and high inference speed. The mean surface distance is 1.85 mm, the mean Hausdorff surface distance is 7.26 mm, the mean Dice is 0.953, the correlation of EF is 0.918, and the mean inference speed is 0.1s. These results demonstrate that our proposed method has great potential for clinical application.

    06/10/2018 ∙ by Suyu Dong, et al.
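The surface-distance metrics quoted in the abstract above can be illustrated with a small sketch. It assumes each surface is given as a list of 3D points and uses one common pair of definitions: the symmetric Hausdorff distance, and a mean surface distance taken as the average of the two directed mean distances. The exact definitions used in the paper are not stated in the abstract, and the function names and points are hypothetical.

```python
import math

def directed_distances(src, dst):
    """For each point in src, the distance to its nearest point in dst."""
    return [min(math.dist(a, b) for b in dst) for a in src]

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(max(directed_distances(a, b)), max(directed_distances(b, a)))

def mean_surface_distance(a, b):
    """Average of the two directed mean nearest-point distances (one common convention)."""
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    return 0.5 * (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba))

# Two tiny hypothetical "surfaces" in mm.
pred_surface = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
true_surface = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(hausdorff_distance(pred_surface, true_surface))
print(mean_surface_distance(pred_surface, true_surface))
```

The Hausdorff distance is driven by the single worst-matched point, so it is typically much larger than the mean surface distance, as is the case for the values reported in the abstract.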

  • Multi-step Cascaded Networks for Brain Tumor Segmentation

    Automatic brain tumor segmentation plays an extremely important role in the whole process of brain tumor diagnosis and treatment. In this paper, we propose a multi-step cascaded network which takes the hierarchical topology of the brain tumor substructures into consideration and segments the substructures from coarse to fine, i.e., each step of the multi-step network is responsible for the segmentation of a specific substructure of the tumor, such as the whole tumor, tumor core and enhancing tumor, and the result of the former step is used as prior information for the next step to guide the finer segmentation process. The whole network is trained in an end-to-end fashion. Besides, to alleviate the vanishing gradient issue and reduce overfitting, we added several auxiliary outputs as a kind of deep supervision for each step and introduced several data augmentation strategies, respectively, which proved to be quite effective for brain tumor segmentation. Lastly, focal loss is used to address the remarkable imbalance between the tumor regions and the background. Our model is tested on the BraTS 2019 validation dataset; the preliminary mean Dice coefficients are 0.882, 0.797, and 0.753 for the whole tumor, tumor core and enhancing tumor, respectively. Code will be available at https://github.com/JohnleeHIT/Brats2019

    08/16/2019 ∙ by Xiangyu Li, et al.
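The abstract above names focal loss as the remedy for the imbalance between tumor and background. The sketch below shows the standard binary focal loss for a single voxel, FL = −α_t (1 − p_t)^γ log(p_t), as a rough illustration only. The function name, the default γ, and the optional α weighting are assumptions for this example; the abstract does not state the settings or the multi-class form used in the paper.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=None, eps=1e-7):
    """Focal loss for one voxel.

    p: predicted foreground probability, y: ground-truth label (0 or 1).
    gamma and alpha are illustrative defaults, not the paper's settings.
    """
    p = min(max(p, eps), 1.0 - eps)       # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p        # probability assigned to the true class
    weight = (1.0 - p_t) ** gamma         # down-weights well-classified voxels
    if alpha is not None:
        weight *= alpha if y == 1 else 1.0 - alpha
    return -weight * math.log(p_t)

# An easy background voxel contributes far less than a hard tumor voxel.
easy = binary_focal_loss(0.05, 0)   # background predicted with low tumor probability
hard = binary_focal_loss(0.10, 1)   # tumor voxel predicted with low tumor probability
print(easy, hard)
```

With gamma set to 0 and alpha left unset, the expression reduces to ordinary cross-entropy, which makes the effect of the modulating factor easy to compare.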

  • Automatic Detection of ECG Abnormalities by using an Ensemble of Deep Residual Networks with Attention

    Heart disease is one of the most common diseases causing morbidity and mortality. The electrocardiogram (ECG) has been widely used for diagnosing heart diseases because of its simplicity and non-invasive nature. Automatic ECG analysis technologies are expected to reduce human workload and increase diagnostic efficacy. However, there are still some challenges to be addressed to achieve this goal. In this study, we develop an algorithm to identify multiple abnormalities from 12-lead ECG recordings. In the algorithm pipeline, several preprocessing methods are first applied to the ECG data for denoising, augmentation, and balancing the number of recordings across classes. For efficiency and consistency of data length, the recordings are padded or truncated to a medium length, where the padding/truncating time windows are selected randomly to suppress overfitting. Then, the ECGs are used to train deep neural network (DNN) models with a novel structure that combines a deep residual network with an attention mechanism. Finally, an ensemble model is built from these trained models to make predictions on the test data set. Our method is evaluated on the test set of the First China ECG Intelligent Competition dataset using the F1 metric, which is the harmonic mean of precision and recall. The resulting overall F1 score of the algorithm is 0.875, showing promising performance and potential for practical use.

    08/27/2019 ∙ by Yang Liu, et al.
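Two steps described in the abstract above lend themselves to a short sketch: bringing recordings to a fixed length with a randomly placed window, and scoring with F1 as the harmonic mean of precision and recall. The code is a minimal illustration on a single-lead list of samples. The function names and the use of zero padding are assumptions, and the competition's exact F1 averaging across classes is not described in the abstract.

```python
import random

def random_fix_length(signal, target_len, rng):
    """Crop or zero-pad a 1D signal to target_len using a random window position."""
    n = len(signal)
    if n >= target_len:
        start = rng.randint(0, n - target_len)    # random crop offset (inclusive bounds)
        return signal[start:start + target_len]
    pad_total = target_len - n
    left = rng.randint(0, pad_total)              # random split of the padding
    return [0.0] * left + list(signal) + [0.0] * (pad_total - left)

def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

rng = random.Random(0)                            # seeded for repeatability
recording = [float(i) for i in range(10)]
print(len(random_fix_length(recording, 4, rng)))  # truncated
print(len(random_fix_length(recording, 15, rng))) # padded
print(f1_score(0.5, 1.0))
```

Drawing a new window position each time a recording is sampled means the network rarely sees exactly the same input twice, which is the overfitting-suppression effect the abstract refers to.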

  • An Automatic Cardiac Segmentation Framework based on Multi-sequence MR Image

    LGE CMR is an efficient technology for detecting infarcted myocardium. An efficient and objective ventricle segmentation method for LGE can help locate the infarcted myocardium. In this paper, we propose an automatic framework for LGE image segmentation. Only 5 labeled LGE volumes are available, each with about 15 slices. We applied histogram matching, an invariant of rotation registration method, to the other labeled modalities to achieve effective augmentation of the training data. A CNN segmentation model was trained on the augmented training data using a leave-one-out strategy. The model's prediction was followed by a connected component analysis for each class, retaining the largest connected component as the final segmentation result. Our model was evaluated in the 2019 Multi-sequence Cardiac MR Segmentation Challenge. The mean results over 40 testing volumes for Dice score, Jaccard score, surface distance, and Hausdorff distance are 0.8087, 0.6976, 2.8727mm, and 15.6387mm, respectively. The experimental results show satisfactory performance of the proposed framework. Code is available at https://github.com/Suiiyu/MS-CMR2019.

    09/12/2019 ∙ by Yashu Liu, et al.
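The post-processing and overlap metrics described in the abstract above can be sketched in a few lines. The example keeps the largest connected foreground component of a 2D binary mask and then computes Dice and Jaccard scores against a reference mask. It is a simplified illustration: 4-connectivity, a single class, 2D nested lists, and the function names are all assumptions, and the linked repository may implement these steps differently.

```python
from collections import deque

def largest_component(mask):
    """Keep only the largest 4-connected foreground region of a 2D 0/1 mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, comp = deque([(r, c)]), []
                while queue:                      # breadth-first flood fill
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    out = [[0] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = 1
    return out

def dice_and_jaccard(pred, truth):
    """Dice = 2|A∩B| / (|A|+|B|), Jaccard = |A∩B| / |A∪B| for 2D 0/1 masks."""
    inter = sum(1 for pr, tr in zip(pred, truth) for p, t in zip(pr, tr) if p and t)
    n_pred = sum(1 for row in pred for p in row if p)
    n_truth = sum(1 for row in truth for t in row if t)
    if n_pred + n_truth == 0:
        return 1.0, 1.0                           # both empty: treated as perfect (a convention)
    return 2.0 * inter / (n_pred + n_truth), inter / (n_pred + n_truth - inter)

prediction = [[1, 1, 0, 0],
              [1, 0, 0, 1],                       # isolated pixel at the right is spurious
              [0, 0, 0, 0]]
reference = [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 0, 0]]
cleaned = largest_component(prediction)
print(cleaned)
print(dice_and_jaccard(cleaned, reference))
```

Dropping small disconnected regions removes isolated false positives, which mainly helps boundary-sensitive metrics such as the Hausdorff distance.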