
LOGAN: Unpaired Shape Transform in Latent Overcomplete Space
We present LOGAN, a deep neural network aimed at learning generic shape transforms from unpaired domains. The network is trained on two sets of shapes, e.g., tables and chairs, but there is neither a pairing between shapes in the two domains to supervise the shape translation nor any point-wise correspondence between any shapes. Once trained, LOGAN takes a shape from one domain and transforms it into the other. Our network consists of an autoencoder to encode shapes from the two input domains into a common latent space, where the latent codes encode multi-scale shape features in an overcomplete manner. The translator is based on a generative adversarial network (GAN), operating in the latent space, where an adversarial loss enforces cross-domain translation while a feature preservation loss ensures that the right shape features are preserved for a natural shape transform. We conduct various ablation studies to validate each of our key network designs and demonstrate superior capabilities in unpaired shape transforms on a variety of examples, over baselines and state-of-the-art approaches. We show that our network is able to learn which shape features to preserve during shape transforms, whether local or non-local, content or style, etc., depending solely on the input domain pairs.
03/25/2019 ∙ by Kangxue Yin, et al.
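A minimal sketch of the translator objective described above: an adversarial term pushes the translated latent code toward the target domain while an L1 feature-preservation term keeps the multi-scale code close to the input. The least-squares adversarial form, the L1 choice, and the weight are illustrative assumptions, not the paper's exact formulation.

```python
def translator_loss(z_in, z_out, disc_score, lam=20.0):
    """Toy latent-space translator objective.

    z_in:  latent code of the input shape (list of floats)
    z_out: translated latent code
    disc_score: discriminator output for z_out (1.0 = looks like target domain)
    """
    # adversarial term: push the translated code into the target domain
    adv = (disc_score - 1.0) ** 2
    # feature-preservation term: keep the overcomplete code close to the input
    feat = sum(abs(a - b) for a, b in zip(z_in, z_out)) / len(z_in)
    return adv + lam * feat
```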

Pairwise Exchangeable Feature Extraction for Arbitrary Style Transfer
Style transfer has been an important topic in both computer vision and graphics. Gatys et al. first showed that deep features extracted by a pre-trained VGG network represent both the content and style of an image and hence that style transfer can be achieved through optimization in feature space. Huang et al. then showed that real-time arbitrary style transfer can be done by simply aligning the mean and variance of each feature channel. In this paper, however, we argue that aligning only the global statistics of deep features cannot always guarantee a good style transfer. Instead, we propose to jointly analyze the input image pair and extract common/exchangeable style features between the two. In addition, a new fusion mode is developed for combining content and style information in feature space. Qualitative and quantitative experiments demonstrate the advantages of our approach.
11/26/2018 ∙ by Zhijie Wu, et al.
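The per-channel mean/variance alignment of Huang et al. that the paper argues against fits in a few lines. This pure-Python sketch operates on channels given as plain lists of activations; the `eps` stabilizer is a conventional assumption.

```python
def adain(content, style, eps=1e-5):
    """Align each content channel's mean/variance to the style channel's."""
    out = []
    for c, s in zip(content, style):
        mc = sum(c) / len(c)                          # content channel mean
        vc = sum((x - mc) ** 2 for x in c) / len(c)   # content channel variance
        ms = sum(s) / len(s)                          # style channel mean
        vs = sum((x - ms) ** 2 for x in s) / len(s)   # style channel variance
        # normalize the content channel, then rescale to the style statistics
        out.append([(x - mc) / (vc + eps) ** 0.5 * (vs + eps) ** 0.5 + ms
                    for x in c])
    return out
```

Only two scalars per channel survive this transfer, which is exactly the limitation the abstract points at: global statistics carry no spatial or pairwise structure.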

Active Scene Understanding via Online Semantic Reconstruction
We propose a novel approach to robot-operated active understanding of unknown indoor scenes, based on online RGB-D reconstruction with semantic segmentation. In our method, the exploratory robot scanning is both driven by and targeted at the recognition and segmentation of semantic objects in the scene. Our algorithm is built on top of a volumetric depth fusion framework (e.g., KinectFusion) and performs real-time voxel-based semantic labeling over the online reconstructed volume. The robot is guided by an online estimated discrete viewing score field (VSF) parameterized over the 3D space of 2D location and azimuth rotation. For each grid cell, the VSF stores the score of the corresponding view, which measures how much it reduces the uncertainty (entropy) of both geometric reconstruction and semantic labeling. Based on the VSF, we select the next best view (NBV) as the target for each time step. We then jointly optimize the traverse path and camera trajectory between two adjacent NBVs by maximizing the integral viewing score (information gain) along the path and trajectory. Through extensive evaluation, we show that our method achieves efficient and accurate online scene parsing during exploratory scanning.
06/18/2019 ∙ by Lintao Zheng, et al.
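The greedy NBV step and the path scoring described above can be sketched as follows. The VSF is represented here as a plain dict over (x, y, azimuth) grid cells, and the scores and candidate paths are made-up stand-ins for the online-estimated entropy reductions; the real system optimizes a continuous trajectory rather than picking from a fixed candidate set.

```python
def next_best_view(vsf):
    # vsf maps (x, y, azimuth_deg) -> viewing score (estimated entropy
    # reduction of reconstruction + labeling); the NBV is its argmax
    return max(vsf, key=vsf.get)

def best_path(candidate_paths, vsf):
    # score each candidate traverse path by the integral (here: a sum) of
    # viewing scores along it, and keep the highest-scoring path
    return max(candidate_paths,
               key=lambda path: sum(vsf.get(v, 0.0) for v in path))
```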

Specular-to-Diffuse Translation for Multi-View Reconstruction
Most multi-view 3D reconstruction algorithms, especially when shape-from-shading cues are used, assume that object appearance is predominantly diffuse. To alleviate this restriction, we introduce S2Dnet, a generative adversarial network for transferring multiple views of objects with specular reflection into diffuse ones, so that multi-view reconstruction methods can be applied more effectively. Our network extends unsupervised image-to-image translation to multi-view "specular to diffuse" translation. To preserve object appearance across multiple views, we introduce a Multi-View Coherence (MVC) loss that evaluates the similarity and faithfulness of local patches after the view transformation. Our MVC loss ensures that the similarity of local correspondences among multi-view images is preserved under the image-to-image translation. As a result, our network yields significantly better results than several single-view baseline techniques. In addition, we carefully design and generate a large synthetic training data set using physically based rendering. During testing, our network takes only the raw glossy images as input, without extra information such as segmentation masks or lighting estimation. Results demonstrate that multi-view reconstruction can be significantly improved using the images filtered by our network. We also show promising performance on real-world training and testing data.
07/14/2018 ∙ by Shihao Wu, et al.
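A toy version of the coherence idea behind the MVC loss, namely that pairwise similarities between local patches should survive the translation, can be written as below. The negative-squared-distance similarity and the absolute-difference penalty are illustrative choices, not the paper's actual patch measure.

```python
def mvc_loss(src_patches, out_patches):
    """Penalize changes in pairwise patch similarity under translation.

    src_patches / out_patches: corresponding local patches (flat float lists)
    from the input views and the translated views.
    """
    def sim(a, b):
        # toy similarity: negative squared distance between patches
        return -sum((x - y) ** 2 for x, y in zip(a, b))

    loss = 0.0
    n = len(src_patches)
    for i in range(n):
        for j in range(i + 1, n):
            # similarity of a patch pair before vs. after translation
            loss += abs(sim(src_patches[i], src_patches[j])
                        - sim(out_patches[i], out_patches[j]))
    return loss
```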

Non-Stationary Texture Synthesis by Adversarial Expansion
The real world exhibits an abundance of non-stationary textures. Examples include textures with large-scale structures, as well as spatially variant and inhomogeneous textures. While existing example-based texture synthesis methods can cope well with stationary textures, non-stationary textures still pose a considerable challenge, which remains unresolved. In this paper, we propose a new approach for example-based non-stationary texture synthesis. Our approach uses a generative adversarial network (GAN), trained to double the spatial extent of texture blocks extracted from a specific texture exemplar. Once trained, the fully convolutional generator is able to expand the size of the entire exemplar, as well as of any of its sub-blocks. We demonstrate that this conceptually simple approach is highly effective for capturing large-scale structures, as well as other non-stationary attributes of the input exemplar. As a result, it can cope with challenging textures which, to our knowledge, no other existing method can handle.
05/11/2018 ∙ by Yang Zhou, et al.

Faster gradient descent and the efficient recovery of images
Much recent attention has been devoted to gradient descent algorithms where the steepest descent step size is replaced by a similar one from a previous iteration, or gets updated only once every second step, thus forming a faster gradient descent method. For unconstrained convex quadratic optimization these methods can converge much faster than steepest descent. But the context of interest here is application to certain ill-posed inverse problems, where the steepest descent method is known to have a smoothing, regularizing effect, and where a strict optimization solution is not necessary. Specifically, in this paper we examine the effect of replacing steepest descent by a faster gradient descent algorithm in the practical context of image deblurring and denoising tasks. We also propose several highly efficient schemes for carrying out these tasks independently of the step size selection, as well as a scheme for the case where both blur and significant noise are present. In the above context there are situations where many steepest descent steps are required, thus building slowness into the solution procedure. Our general conclusion regarding gradient descent methods is that in such cases the faster gradient descent methods offer substantial advantages. In other situations where no such slowness build-up arises, the steepest descent method can still be very effective.
08/12/2013 ∙ by Hui Huang, et al.
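For a convex quadratic with a diagonal Hessian, the "update the step size only once every second step" idea reads as follows. This is a minimal sketch under that simplifying assumption; in the paper's deblurring setting the diagonal matrix is replaced by a blur operator, and convergence is nonmonotone.

```python
def lagged_sd(A_diag, b, x0, iters=100):
    """Minimize 0.5 * x^T A x - b^T x for diagonal SPD A (given as A_diag).

    The Cauchy (steepest descent) step size is recomputed only on even
    iterations and reused on the next one, i.e., a lagged step size.
    """
    n = len(b)
    x = list(x0)
    alpha = None
    for k in range(iters):
        r = [b[i] - A_diag[i] * x[i] for i in range(n)]  # residual = -gradient
        if k % 2 == 0 or alpha is None:
            num = sum(ri * ri for ri in r)               # r^T r
            den = sum(A_diag[i] * r[i] * r[i] for i in range(n))  # r^T A r
            if den == 0.0:
                break                                    # residual is zero
            alpha = num / den                            # Cauchy step size
        x = [x[i] + alpha * r[i] for i in range(n)]      # reused on odd k
    return x
```

Reusing the step breaks the slow zig-zag pattern of exact steepest descent on ill-conditioned quadratics, which is where the speedup discussed above comes from.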

P2P-NET: Bidirectional Point Displacement Network for Shape Transform
We introduce P2P-NET, a general-purpose deep neural network which learns geometric transformations between point-based shape representations from two domains, e.g., meso-skeletons and surfaces, partial and complete scans, etc. The architecture of the P2P-NET is that of a bidirectional point displacement network, which transforms a source point set to a target point set with the same cardinality, and vice versa, by applying point-wise displacement vectors learned from data. P2P-NET is trained on paired shapes from the source and target domains, but without relying on point-to-point correspondences between the source and target point sets. The training loss combines two unidirectional geometric losses, each enforcing a shape-wise similarity between the predicted and the target point sets, and a cross-regularization term to encourage consistency between displacement vectors going in opposite directions. We develop and present several different applications enabled by our general-purpose bidirectional P2P-NET to highlight the effectiveness, versatility, and potential of our network in solving a variety of point-based shape transformation problems.
03/25/2018 ∙ by Kangxue Yin, et al.
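A shape-wise geometric loss that needs no point-to-point correspondence, as described above, is typically a Chamfer-style distance between point sets. A brute-force sketch (quadratic in the number of points; real implementations use spatial data structures):

```python
def chamfer(P, Q):
    """Symmetric Chamfer distance between point sets P and Q (tuples)."""
    def d2(p, q):
        # squared Euclidean distance between two points
        return sum((a - b) ** 2 for a, b in zip(p, q))
    # average distance from each point to its nearest neighbor in the other set
    p_to_q = sum(min(d2(p, q) for q in Q) for p in P) / len(P)
    q_to_p = sum(min(d2(q, p) for p in P) for q in Q) / len(Q)
    return p_to_q + q_to_p
```

The cross-regularization term mentioned in the abstract would then additionally penalize disagreement between the forward and backward displacement fields.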

Full 3D Reconstruction of Transparent Objects
Numerous techniques have been proposed for reconstructing 3D models of opaque objects over the past decades. However, none of them can be directly applied to transparent objects. This paper presents a fully automatic approach for reconstructing complete 3D shapes of transparent objects. We position the object on a turntable and capture its silhouettes and light refraction paths under different viewing directions. Then, starting from an initial rough model generated by space carving, our algorithm progressively optimizes the model under three constraints: surface and refraction normal consistency, surface projection and silhouette consistency, and surface smoothness. Experimental results on both synthetic and real objects demonstrate that our method can successfully recover the complex shapes of transparent objects and faithfully reproduce their light refraction properties.
05/09/2018 ∙ by Bojian Wu, et al.

D-finite Numbers
D-finite functions and P-recursive sequences are defined in terms of linear differential and recurrence equations with polynomial coefficients. In this paper, we introduce a class of numbers closely related to D-finite functions and P-recursive sequences: it consists of the limits of convergent P-recursive sequences. Typically, this class contains many well-known mathematical constants in addition to the algebraic numbers. Our definition of the class of D-finite numbers depends on two subrings of the field of complex numbers, and we investigate how different choices of these two subrings affect the class. Moreover, we show that D-finite numbers over the Gaussian rational field are essentially the same as the values of D-finite functions at non-singular algebraic number arguments. This result makes it easier to recognize certain numbers as D-finite.
11/17/2016 ∙ by Hui Huang, et al.
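A concrete instance of such a limit (a standard example, not taken from the paper): the partial sums s_n = sum_{k<=n} 1/k! are P-recursive, since s_{n+1} - s_n = 1/(n+1)! and s_n - s_{n-1} = 1/n! give the polynomial-coefficient recurrence (n+1) s_{n+1} - (n+2) s_n + s_{n-1} = 0. Their limit, Euler's number e, is therefore a D-finite number, and the recurrence can be iterated directly:

```python
def e_from_precursive(n=20):
    """Iterate the P-recurrence (k+1)*s_{k+1} - (k+2)*s_k + s_{k-1} = 0.

    With s_0 = 1 and s_1 = 2, s_n is the n-th partial sum of sum 1/k!,
    a convergent P-recursive sequence whose limit is e.
    """
    s_prev, s_cur = 1.0, 2.0          # s_0, s_1
    for k in range(1, n):
        # solve the recurrence for s_{k+1}
        s_prev, s_cur = s_cur, ((k + 2) * s_cur - s_prev) / (k + 1)
    return s_cur                      # s_n, close to e for modest n
```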

New Bounds for Hypergeometric Creative Telescoping
Based on a modified version of Abramov-Petkovšek reduction, a new algorithm for computing minimal telescopers for bivariate hypergeometric terms was recently developed. In this paper we investigate it further and present a new argument for the termination of this algorithm, which provides an independent proof of the existence of telescopers and even enables us to derive lower as well as upper bounds for the order of telescopers for hypergeometric terms. Compared to the known bounds in the literature, ours are sometimes better and never worse.
04/27/2016 ∙ by Hui Huang, et al.
Hui Huang