Hui Huang

is this you? claim profile

0 followers

  • LOGAN: Unpaired Shape Transform in Latent Overcomplete Space

    We present LOGAN, a deep neural network aimed at learning generic shape transforms from unpaired domains. The network is trained on two sets of shapes, e.g., tables and chairs, but there is neither a pairing between shapes in the two domains to supervise the shape translation nor any point-wise correspondence between any shapes. Once trained, LOGAN takes a shape from one domain and transforms it into the other. Our network consists of an autoencoder to encode shapes from the two input domains into a common latent space, where the latent codes encode multi-scale shape features in an overcomplete manner. The translator is based on a generative adversarial network (GAN), operating in the latent space, where an adversarial loss enforces cross-domain translation while a feature preservation loss ensures that the right shape features are preserved for a natural shape transform. We conduct various ablation studies to validate each of our key network designs and demonstrate superior capabilities in unpaired shape transforms on a variety of examples over baselines and state-of-the-art approaches. We show that our network is able to learn what shape features to preserve during shape transforms, either local or non-local, whether content or style, etc., depending solely on the input domain pairs.

    03/25/2019 ∙ by Kangxue Yin, et al. ∙ 31 share

    read it

  • Pair-wise Exchangeable Feature Extraction for Arbitrary Style Transfer

    Style transfer has been an important topic in both computer vision and graphics. Gatys et al. first prove that deep features extracted by the pre-trained VGG network represent both content and style features of an image and hence, style transfer can be achieved through optimization in feature space. Huang et al. then show that real-time arbitrary style transfer can be done by simply aligning the mean and variance of each feature channel. In this paper, however, we argue that only aligning the global statistics of deep features cannot always guarantee a good style transfer. Instead, we propose to jointly analyze the input image pair and extract common/exchangeable style features between the two. Besides, a new fusion mode is developed for combining content and style information in feature space. Qualitative and quantitative experiments demonstrate the advantages of our approach.

    11/26/2018 ∙ by Zhijie Wu, et al. ∙ 10 share

    read it

  • Active Scene Understanding via Online Semantic Reconstruction

    We propose a novel approach to robot-operated active understanding of unknown indoor scenes, based on online RGBD reconstruction with semantic segmentation. In our method, the exploratory robot scanning is both driven by and targeting at the recognition and segmentation of semantic objects from the scene. Our algorithm is built on top of the volumetric depth fusion framework (e.g., KinectFusion) and performs real-time voxel-based semantic labeling over the online reconstructed volume. The robot is guided by an online estimated discrete viewing score field (VSF) parameterized over the 3D space of 2D location and azimuth rotation. VSF stores for each grid the score of the corresponding view, which measures how much it reduces the uncertainty (entropy) of both geometric reconstruction and semantic labeling. Based on VSF, we select the next best views (NBV) as the target for each time step. We then jointly optimize the traverse path and camera trajectory between two adjacent NBVs, through maximizing the integral viewing score (information gain) along path and trajectory. Through extensive evaluation, we show that our method achieves efficient and accurate online scene parsing during exploratory scanning.

    06/18/2019 ∙ by Lintao Zheng, et al. ∙ 6 share

    read it

  • Specular-to-Diffuse Translation for Multi-View Reconstruction

    Most multi-view 3D reconstruction algorithms, especially when shape-from-shading cues are used, assume that object appearance is predominantly diffuse. To alleviate this restriction, we introduce S2Dnet, a generative adversarial network for transferring multiple views of objects with specular reflection into diffuse ones, so that multi-view reconstruction methods can be applied more effectively. Our network extends unsupervised image-to-image translation to multi-view "specular to diffuse" translation. To preserve object appearance across multiple views, we introduce a Multi-View Coherence loss (MVC) that evaluates the similarity and faithfulness of local patches after the view-transformation. Our MVC loss ensures that the similarity of local correspondences among multi-view images is preserved under the image-to-image translation. As a result, our network yields significantly better results than several single-view baseline techniques. In addition, we carefully design and generate a large synthetic training data set using physically-based rendering. During testing, our network takes only the raw glossy images as input, without extra information such as segmentation masks or lighting estimation. Results demonstrate that multi-view reconstruction can be significantly improved using the images filtered by our network. We also show promising performance on real world training and testing data.

    07/14/2018 ∙ by Shihao Wu, et al. ∙ 2 share

    read it

  • Non-Stationary Texture Synthesis by Adversarial Expansion

    The real world exhibits an abundance of non-stationary textures. Examples include textures with large-scale structures, as well as spatially variant and inhomogeneous textures. While existing example-based texture synthesis methods can cope well with stationary textures, non-stationary textures still pose a considerable challenge, which remains unresolved. In this paper, we propose a new approach for example-based non-stationary texture synthesis. Our approach uses a generative adversarial network (GAN), trained to double the spatial extent of texture blocks extracted from a specific texture exemplar. Once trained, the fully convolutional generator is able to expand the size of the entire exemplar, as well as of any of its sub-blocks. We demonstrate that this conceptually simple approach is highly effective for capturing large-scale structures, as well as other non-stationary attributes of the input exemplar. As a result, it can cope with challenging textures, which, to our knowledge, no other existing method can handle.

    05/11/2018 ∙ by Yang Zhou, et al. ∙ 2 share

    read it

  • Faster gradient descent and the efficient recovery of images

    Much recent attention has been devoted to gradient descent algorithms where the steepest descent step size is replaced by a similar one from a previous iteration or gets updated only once every second step, thus forming a faster gradient descent method. For unconstrained convex quadratic optimization these methods can converge much faster than steepest descent. But the context of interest here is application to certain ill-posed inverse problems, where the steepest descent method is known to have a smoothing, regularizing effect, and where a strict optimization solution is not necessary. Specifically, in this paper we examine the effect of replacing steepest descent by a faster gradient descent algorithm in the practical context of image deblurring and denoising tasks. We also propose several highly efficient schemes for carrying out these tasks independently of the step size selection, as well as a scheme for the case where both blur and significant noise are present. In the above context there are situations where many steepest descent steps are required, thus building slowness into the solution procedure. Our general conclusion regarding gradient descent methods is that in such cases the faster gradient descent methods offer substantial advantages. In other situations where no such slowness buildup arises the steepest descent method can still be very effective.

    08/12/2013 ∙ by Hui Huang, et al. ∙ 0 share

    read it

  • P2P-NET: Bidirectional Point Displacement Network for Shape Transform

    We introduce P2P-NET, a general-purpose deep neural network which learns geometric transformations between point-based shape representations from two domains, e.g., meso-skeletons and surfaces, partial and complete scans, etc. The architecture of the P2P-NET is that of a bi-directional point displacement network, which transforms a source point set to a target point set with the same cardinality, and vice versa, by applying point-wise displacement vectors learned from data. P2P-NET is trained on paired shapes from the source and target domains, but without relying on point-to-point correspondences between the source and target point sets. The training loss combines two uni-directional geometric losses, each enforcing a shape-wise similarity between the predicted and the target point sets, and a cross-regularization term to encourage consistency between displacement vectors going in opposite directions. We develop and present several different applications enabled by our general-purpose bidirectional P2P-NET to highlight the effectiveness, versatility, and potential of our network in solving a variety of point-based shape transformation problems.

    03/25/2018 ∙ by Kangxue Yin, et al. ∙ 0 share

    read it

  • P2P-NET: Bidirectional Point Displacement Net for Shape Transform

    We introduce P2P-NET, a general-purpose deep neural network which learns geometric transformations between point-based shape representations from two domains, e.g., meso-skeletons and surfaces, partial and complete scans, etc. The architecture of the P2P-NET is that of a bi-directional point displacement network, which transforms a source point set to a target point set with the same cardinality, and vice versa, by applying point-wise displacement vectors learned from data. P2P-NET is trained on paired shapes from the source and target domains, but without relying on point-to-point correspondences between the source and target point sets. The training loss combines two uni-directional geometric losses, each enforcing a shape-wise similarity between the predicted and the target point sets, and a cross-regularization term to encourage consistency between displacement vectors going in opposite directions. We develop and present several different applications enabled by our general-purpose bidirectional P2P-NET to highlight the effectiveness, versatility, and potential of our network in solving a variety of point-based shape transformation problems.

    03/25/2018 ∙ by Kangxue Yin, et al. ∙ 0 share

    read it

  • Full 3D Reconstruction of Transparent Objects

    Numerous techniques have been proposed for reconstructing 3D models for opaque objects in past decades. However, none of them can be directly applied to transparent objects. This paper presents a fully automatic approach for reconstructing complete 3D shapes of transparent objects. Through positioning an object on a turntable, its silhouettes and light refraction paths under different viewing directions are captured. Then, starting from an initial rough model generated from space carving, our algorithm progressively optimizes the model under three constraints: surface and refraction normal consistency, surface projection and silhouette consistency, and surface smoothness. Experimental results on both synthetic and real objects demonstrate that our method can successfully recover the complex shapes of transparent objects and faithfully reproduce their light refraction properties.

    05/09/2018 ∙ by Bojian Wu, et al. ∙ 0 share

    read it

  • D-finite Numbers

    D-finite functions and P-recursive sequences are defined in terms of linear differential and recurrence equations with polynomial coefficients. In this paper, we introduce a class of numbers closely related to D-finite functions and P-recursive sequences. It consists of the limits of convergent P-recursive sequences. Typically, this class contains many well-known mathematical constants in addition to the algebraic numbers. Our definition of the class of D-finite numbers depends on two subrings of the field of complex numbers. We investigate how difference choices of these two subrings affect the class. Moreover, we show that D-finite numbers over the Gaussian rational field are essentially the same as the values of D-finite functions at non-singular algebraic number arguments. This result makes it easier to recognize certain numbers as D-finite.

    11/17/2016 ∙ by Hui Huang, et al. ∙ 0 share

    read it

  • New Bounds for Hypergeometric Creative Telescoping

    Based on a modified version of Abramov-Petkovšek reduction, a new algorithm to compute minimal telescopers for bivariate hypergeometric terms was developed last year. We investigate further in this paper and present a new argument for the termination of this algorithm, which provides an independent proof of the existence of telescopers and even enables us to derive lower as well as upper bounds for the order of telescopers for hypergeometric terms. Compared to the known bounds in the literature, our bounds are sometimes better, and never worse than the known ones.

    04/27/2016 ∙ by Hui Huang, et al. ∙ 0 share

    read it