1 Introduction
In eukaryotic cells, the total length of the DNA molecule exceeds by far the diameter of the nucleus. To fit in the nucleus, DNA is carefully packaged around specific proteins, forming a complex called chromatin. Despite a high degree of compaction, DNA, in its uncompressed form, must be rapidly accessible to the protein machineries that regulate the essential functions of life. Recent studies have revealed that chromatin is non-randomly organized within the cell nucleus, and have linked chromatin folding to many vital cellular functions, such as gene regulation, differentiation, DNA replication and genome stability maintenance (Dekker, 2008; Therizols et al., 2014; Misteli, 2004). Hence, understanding the three-dimensional (3D) chromatin conformation is essential for decoding the functions of the genome and can provide a mechanistic explanation of various biological processes and their links to human disease (Misteli, 2007; Mitelman et al., 2007). Furthermore, since many biological processes involving DNA are dynamic, there is a need for methodologies that can elucidate the evolution of chromatin conformation over time.
Traditionally, the structure of the genome has been studied using microscopy techniques, such as fluorescent in situ hybridization (FISH) (van Steensel and Dekker, 2010) or, more recently, stochastic optical reconstruction microscopy (STORM; Rust et al., 2006) and photoactivated localization microscopy (PALM; Gaietta et al., 2002). Despite these advancements, microscopy approaches are limited to a small number of genomic locations and do not support a comprehensive analysis of the complete genome structure (Bonev and Cavalli, 2016). In recent years, advancements in chromosome conformation capture (3C) (Dekker et al., 2002) have paved the way for the systematic analysis of the 3D structure of chromatin. 3C methods provide measurements of the physical interaction frequencies between fragments of consecutive chromatin loci of a certain resolution, commonly referred to as chromatin bins. Lieberman-Aiden and Berkum (2009) proposed Hi-C, a higher-throughput, higher-resolution, 3C-based method that quantifies intra- and inter-chromosomal interaction frequencies at a whole-genome scale. Chromatin interactions captured by Hi-C are represented as a contact matrix, where each entry records the frequency of interactions between a pair of genomic bins in a population of cells. Therefore, one of the main applications of Hi-C data is to reconstruct the 3D chromatin structure from the Hi-C contact matrix.
Related Work
Most methods developed thus far for this task can be classified into optimization-based and modeling-based methods. Since physical distance and contact frequency are inversely correlated, optimization methods model this relationship with a specific transfer function that maps the contact frequencies to distances, yielding a distance matrix. An optimization problem is then formulated to minimize the difference between the distances in this matrix and the ones computed from the inferred structures. In practice, this translates into performing multidimensional scaling (MDS) (Kruskal, 1964) on the distance matrix. Examples of this approach are Duan et al. (2010); Lesne et al. (2014); Wang (2012); Zhang et al. (2013); Trieu and Cheng (2014); Rieber and Mahony (2017). Modeling-based methods (Rousseau et al., 2011; Hu et al., 2013; Varoquaux et al., 2014; Zou et al., 2016; Wang et al., 2015; Park and Lin, 2016; Oluwadare et al., 2018) formulate the relationship between contact frequencies and distances in a probabilistic fashion and perform inference through maximum likelihood estimation or via Bayesian approaches. GEM (Zhu et al., 2018), a more recent method, adopts a modified t-SNE algorithm (Van Der Maaten and Hinton, 2008) to perform manifold learning.
Nevertheless, several limitations reduce the usability of the existing methods. First, most methods necessitate the use of a parametric transfer function, which requires making assumptions about the relationship between distances and contact frequencies. One exception is GEM (Zhu et al., 2018), which, however, fails to preserve the order of the bins in the chromosomes. Second, most methods do not scale with the resolution of recent Hi-C experiments. High resolution is necessary for accurate, fine-grained structure inference. MiniMDS (Rieber and Mahony, 2017) is the only method designed to address this issue: it infers high-resolution structures for subregions of the chromosomes and connects them using a low-resolution global structure. Lastly, none of the existing methods can incorporate time-course information and perform dynamic analysis of the chromatin structure. In this work we propose a novel manifold learning method, REcurrent Autoencoders for CHromatin 3D structure prediction (REACH3D), to infer the dynamic 3D chromatin structure from Hi-C data.
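The optimization-based pipeline described above (transfer function, then distance fitting) can be sketched as follows. This is a minimal illustration, not the implementation of any cited method: the power-law transfer function, its exponent `alpha`, and the plain gradient-descent stress minimization are assumptions chosen for brevity.

```python
import math
import random

def contacts_to_distances(contacts, alpha=1.0):
    """Map contact frequencies to target distances with d = c ** (-alpha).

    The power law and the value of alpha are assumptions of this sketch.
    Diagonal entries and zero contacts are mapped to None (no constraint).
    """
    n = len(contacts)
    return [[(contacts[i][j] ** -alpha) if i != j and contacts[i][j] > 0 else None
             for j in range(n)] for i in range(n)]

def stress(coords, targets):
    """Sum of squared differences between structure and target distances."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            if targets[i][j] is not None:
                total += (math.dist(coords[i], coords[j]) - targets[i][j]) ** 2
    return total

def mds(targets, dim=3, steps=2000, lr=0.01, seed=0):
    """Metric MDS by gradient descent on the stress, from a random start."""
    rng = random.Random(seed)
    n = len(targets)
    coords = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    for _ in range(steps):
        grads = [[0.0] * dim for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                t = targets[i][j]
                if t is None:
                    continue
                d = math.dist(coords[i], coords[j]) or 1e-12  # avoid division by zero
                coef = 2.0 * (d - t) / d
                for k in range(dim):
                    g = coef * (coords[i][k] - coords[j][k])
                    grads[i][k] += g
                    grads[j][k] -= g
        for i in range(n):
            for k in range(dim):
                coords[i][k] -= lr * grads[i][k]
    return coords
```

The choice of `alpha` is exactly the kind of parametric assumption that the transfer-function limitation refers to: a different exponent yields a different target distance matrix and hence a different structure.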
2 Methods
REACH3D addresses the major challenges that come with Hi-C data and the limitations of existing methods. Our solution exploits manifold learning as a means to reduce the dimensionality of Hi-C data and infer the 3D chromatin structure. To apply manifold learning to the problem at hand, our method first assumes that the 3D coordinates of the chromatin bins lie on an embedded, non-linear manifold. The manifold lives in a high-dimensional space, which the Hi-C experiment represents as a contact matrix. Our goal is to map the Hi-C data to 3D Euclidean coordinates, corresponding to the intrinsic dimensionality of the Hi-C data, i.e., the coordinates of the chromatin bins.
The architecture of REACH3D is inspired by the sequence-to-sequence models introduced by Sutskever et al. (2014), frequently used in natural language processing for translation or sentence completion. In contrast to a sequence-to-sequence architecture, we encode each element, i.e., each bin in the genomic sequence, into a fixed 3D vector, which is in turn decoded to reconstruct the original element. To encode the whole chromatin sequence, REACH3D consists of a sequence of autoencoders, where each autoencoder is matched to one chromatin bin, thus ensuring that the genomic order of the bins is preserved. As illustrated in Figure 1, REACH3D is designed as a network with recurrent units, specifically the commonly employed Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber, 1997) units; the autoencoders are connected and pass information onward as the sequence progresses. The input to each encoder cell is the feature vector of each bin, representing the interactions of that bin with all other bins in the genome. The sequence length and the number of features are equal in this case, and both are given by the number of bins in the chromatin structure.
Encoder
For the encoder we use an LSTM neural network. The input to each encoder LSTM cell is the feature vector of the corresponding bin, whose entries are the contact frequencies between that bin and each of the other bins, together with the hidden state of the previous encoder cell. The output of the encoder cell is the fixed low-dimensional embedding of that bin:
(1)
Decoder
Similarly, the decoder is also an LSTM neural network. The input to each decoder LSTM cell is the embedding of the corresponding bin, together with the hidden state of the previous decoder cell. The output of the decoder cell is the reconstruction of that bin's contact frequencies:
(2) 
The resulting sequence of embeddings in 3D space, one per bin, represents the coordinates of the bins in the predicted chromatin structure, as illustrated in Figure 1.
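The encoder and decoder chain described above can be sketched as a forward pass in plain Python. This is an illustrative simplification under stated assumptions, not the authors' TensorFlow implementation: the 3D embedding is taken to be the encoder hidden state, the reconstruction is taken to be the decoder hidden state, the weight ranges are arbitrary, and the function names (`init_lstm`, `lstm_step`, `forward`) are hypothetical.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def init_lstm(n_in, n_hidden, rng):
    """Random weights and zero biases for the four LSTM gates i, f, o, g."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden)]
                for _ in range(n_hidden)]
    return {gate: (mat(), [0.0] * n_hidden) for gate in "ifog"}

def lstm_step(params, x, h, c):
    """One LSTM cell update; returns the new hidden state and cell state."""
    xh = x + h  # concatenate the input with the previous hidden state

    def gate(name, act):
        w, b = params[name]
        return [act(sum(wi * vi for wi, vi in zip(row, xh)) + bi)
                for row, bi in zip(w, b)]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def forward(contacts, enc, dec, dim=3):
    """Encode each bin's contact row to `dim` values, then decode it back."""
    n = len(contacts)
    h_e, c_e = [0.0] * dim, [0.0] * dim
    h_d, c_d = [0.0] * n, [0.0] * n
    embeddings, recon = [], []
    for row in contacts:  # bins are visited in genomic order
        h_e, c_e = lstm_step(enc, list(row), h_e, c_e)
        h_d, c_d = lstm_step(dec, h_e, h_d, c_d)
        embeddings.append(h_e)
        recon.append(h_d)
    return embeddings, recon
```

Because the same `enc` and `dec` parameters are reused at every bin, the number of weights does not grow with the sequence position. Since the LSTM outputs in this sketch lie strictly between -1 and 1, the contact matrix would need to be normalized to that range before a reconstruction error is meaningful.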
Loss Function
The loss function in Equation (3) is composed of two terms: a main reconstruction loss and a distance loss. The reconstruction loss is defined as the standard mean squared error between the input Hi-C matrix and its reconstruction (Equation (4)). The distance loss, inspired by biological priors, acts as a regularizer on the lower and upper bounds of the Euclidean distance between two consecutive bins and safeguards against unreasonably high or low distances. Furthermore, the loss formulation in Equation (3) is similar to a Lagrangian expression, and the multiplier on the distance loss can be seen as a Lagrange multiplier. Specifically, to model the folding behavior of the chromosomes, we introduce two bounds: a lower bound, which represents a fully-packed folding, and an upper bound, which represents a fully-extended folding of the chromosome bins. Hence, Equation (5) can be interpreted as the result of two forces (the deviations from the lower and the upper bound) pulling in opposite directions. This does not imply that the distances deviate equally from the two bounds, since the reconstruction cost can impose a preference towards one of them. The lower and upper bound values defining the packing ratio across the bins are independent of the resolution. Changes in the resolution are equivalent to scaling the whole structure, thus obtaining the same effect as proportionally changing the bounds.
(3)
(4) 
(5) 
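As a rough illustration of how the two loss terms combine, the sketch below assumes one plausible form of the distance loss: the squared deviation of each consecutive-bin distance from both the lower and the upper bound, matching the "two opposing forces" reading above. The exact expression of Equation (5) is not reproduced here, so this form, the averaging, and the names `d_min`, `d_max` and `multiplier` are assumptions.

```python
import math

def reconstruction_loss(contacts, recon):
    """Mean squared error between the input matrix and its reconstruction."""
    n = len(contacts)
    return sum((contacts[i][j] - recon[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

def distance_loss(coords, d_min, d_max):
    """Assumed form: mean squared deviation of consecutive-bin distances
    from the lower bound plus that from the upper bound."""
    total = 0.0
    for a, b in zip(coords, coords[1:]):
        d = math.dist(a, b)
        total += (d - d_min) ** 2 + (d - d_max) ** 2
    return total / (len(coords) - 1)

def total_loss(contacts, recon, coords, d_min, d_max, multiplier):
    """Reconstruction loss plus the multiplier-weighted distance loss."""
    return (reconstruction_loss(contacts, recon)
            + multiplier * distance_loss(coords, d_min, d_max))
```

Under this assumed form the distance term alone is smallest when a consecutive-bin distance sits midway between the bounds; the reconstruction term can then shift the optimum towards either bound, and a larger multiplier reduces that freedom.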
In our model, the multiplier is a hyperparameter that we have to optimize. Its value is assessed on a few factors: validation against FISH distances, loss values, and accuracy on simulated structures. All the steps of the inference process are presented in Algorithm 1.
Comparison to Related Work
REACH3D differs fundamentally from existing methods, although it shares a few concepts with GEM (Zhu et al., 2018). The model is specifically tailored to the problem of 3D chromatin prediction, and directly addresses several limitations of prior approaches, as summarized below:

Transfer function assumption: We perform the inference directly on the contact frequency matrix, without requiring any assumptions on the mapping between distances and contacts.

Scale with the resolution: In our model, we employ shared weights in the network architecture, leading to a significantly smaller number of trainable parameters and a shorter network training time.

Sparse contact matrices: Since dimensionality reduction results in the minimum number of dimensions that best describe the data, sparsity does not affect the structure, provided that the initial dimensions accurately explain the variations in the data.

Sequence preservation: We introduce recurrent neural units in our model to describe the sequential relationship between the elements in our structure, i.e., the bins in the genome.

Simulating folding mechanism: Lower and upper bounds of the distance between consecutive bins prohibit non-realistic solutions in terms of folding, and can be parameterized to allow REACH3D flexibility with respect to different organisms.

Global structure recovery: The autoencoder we propose is able both to learn the local structure through its high expressiveness and to preserve the global structure through the memory of the LSTM cells.
Since each Hi-C matrix represents measurements over a population of cells, we adopt an ensemble prediction strategy, where we predict a set of potential structures representative of that population. To achieve this, we simply use different weight initializations and repeat the learning process.
Time-dependent Analysis
Hi-C data is a snapshot of the interactions between and within the chromosomes. Nevertheless, the chromosome folding process is dynamic and depends on other biological processes, such as the cell cycle or the differentiation state. Although the focus of the Hi-C community is shifting towards acquiring data and establishing protocols for 4D analysis (Dekker et al., 2017), existing methods are limited to 3D analysis. Applying them to 4D data would be equivalent to obtaining independent 3D structures for each of the time points, thus disregarding the continuous nature of the data. One of the main novelties of our model is that it incorporates information about the structure at previous time points in the inference process. The simplest way of achieving this is via weight sharing: for each time point, we initialize the weights of the network with the corresponding values of the previous time point. In this way, we directly model the evolution of the chromatin conformation over time.
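The weight-sharing scheme across time points amounts to a warm start, which can be sketched as below. The function `infer_time_course` and the stand-in trainer `toy_train` are hypothetical names for illustration; in the actual model the trainer would fit the recurrent autoencoder, whereas the toy version here only moves each weight towards a row mean so that the example stays self-contained.

```python
def infer_time_course(matrices, train, init_weights):
    """Fit one model per time point, each initialized from the previous fit."""
    weights = init_weights
    history = []
    for contacts in matrices:  # matrices are ordered by time
        weights = train(contacts, weights)  # warm start from the previous solution
        history.append(weights)
    return history

def toy_train(contacts, weights, steps=10, lr=0.1):
    """Stand-in trainer (assumption): gradient descent pulling each weight
    towards the mean of the corresponding row of the contact matrix."""
    targets = [sum(row) / len(row) for row in contacts]
    w = list(weights)
    for _ in range(steps):
        w = [wi - lr * 2.0 * (wi - t) for wi, t in zip(w, targets)]
    return w
```

With a fixed training budget per time point, a warm-started fit ends closer to its optimum than a fit restarted from scratch whenever consecutive contact matrices are similar, which is the property the time-dependent analysis relies on.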
3 Experiments and Results
The inference of 3D chromatin structure falls under the unsupervised learning paradigm. In contrast to protein folding, which can be assessed using X-ray crystallography or spectroscopy-based techniques, no method exists that can experimentally determine the folding of the chromatin. Therefore, the lack of ground truth requires alternative methods for evaluating the inferred 3D structures. We first tune the multiplier hyperparameter by looking at the value of the loss (and implicitly, at the reconstruction error) and the presence of the desired biological and physical properties in the structure. To evaluate REACH3D and compare it with the state-of-the-art methods, we use synthetic data and, where available, pairwise distances measured by 3D FISH microscopy.
Datasets
Numerous experimental Hi-C datasets covering different organisms, cell types and resolutions exist in the literature. After reviewing existing datasets, we have selected the following:

Synthetic Data: The only synthetic contact frequency matrix in the literature was developed by Trussart et al. (2015). To create a synthetic Hi-C contact matrix, the authors created 100 3D toy models of a single, hypothetical chromosome of 1 Mb length and aggregated them to derive one single contact frequency matrix, representative of the population.

Fission Yeast: A second dataset used to evaluate the performance of REACH3D comes from fission yeast, and includes Hi-C measurements at different time points of the fission yeast's cell cycle (Tanizawa et al., 2017).

Human: Last, we experimented on Hi-C maps of all human chromosomes from a lymphoblastoid cell line (GM12878) (Rao et al., 2014), chosen due to its high-resolution data.
Experimental Setup
REACH3D is implemented in TensorFlow 1.9 (Abadi et al., 2015). We use the Adam optimizer of Kingma and Ba (2014), for which we set the learning rate, the exponential decay rates for the first and second moment estimates, and the epsilon value. We also apply gradient clipping. To initialize the weights of the network we use the Xavier initializer (Glorot and Bengio, 2010). The number of iterations necessary for convergence varies between 1000 and 5000 epochs, depending on the dataset and resolution. The hyperparameter of the REACH3D model is the multiplier on the distance loss. The experiments were run on an IBM Power System S822LC.
Hyperparameter Tuning
To assess the inferred structures, we initially visually examine biological properties expected from prior knowledge, namely the preservation of genomic bin sequence, the existence of chromosome territories, i.e., subnuclear compartments where each chromosome is localized, and the presence of intra- and inter-chromosomal interactions and long-range loops. Resulting structures for the whole fission yeast genome and various values of the multiplier are shown in Figure 2. For larger values of the multiplier, the genomic bin sequence is not fully preserved and the chromosome territories are not clearly visible. As the multiplier decreases, the structures become more consistent with the biological prior expectations. At the same time, the lowest total loss, and thus the best reconstruction, is obtained for one particular value of the multiplier.
3.1 Synthetic Data Validation
To evaluate the method, we generate an ensemble of 100 3D structures and compare them to the 100 ground-truth ones. Since there is no one-to-one correspondence between the structures in the two ensembles, we use a probabilistic approach, described in the following. For each pair of bins, we compute the Euclidean distance between the two bins in every one of the 100 ground-truth structures, and treat the resulting values as samples from the ground-truth distance distribution of that pair. We repeat the same process on the ensemble of predicted structures to obtain the predicted distance distribution of each pair. If the predicted structures match the ground truth, then the distance between the two distributions should be small. To quantify this, we estimate the Wasserstein distance (Vallender, 1974) between the two distributions for every pair of bins. The Wasserstein distance was chosen as an appropriate metric since it does not require that the two distributions have the same support.
We perform the analysis on the synthetic data at 5 Kb resolution and compare the results of REACH3D, at a fixed value of the multiplier, with those of GEM. MiniMDS cannot infer an ensemble of structures and was thus omitted from the comparison. Figure 4(a) shows the distribution of the distances for a random pair of bins; REACH3D obtains a distance distribution very similar to the ground truth, whereas the distribution obtained by GEM has a markedly different shape, mean and variance. Figure 4(b) visualizes the distribution of Wasserstein distances across all pairs of bins. The distribution of Wasserstein distances obtained using REACH3D is right-skewed with a heavy tail, implying that the majority of pairs exhibit a small distance between the ground-truth and predicted distributions, whereas Wasserstein distances computed from GEM are visibly larger. A comparison of the two distributions via the Mann–Whitney U test indicates that they differ to a statistically significant degree.
Finally, we compare the structures obtained by the different algorithms in Figure 4, and observe that REACH3D recovers most of the sought properties. Firstly, the sequence of the bins in the chromosome is best illustrated by the REACH3D structure. MiniMDS recovers the sequence of the bins in the chromosome only partially, and GEM results in non-ordered bin positions. In REACH3D, the long-range interactions are clearly observable, whereas in GEM the bins seem to follow a random behavior and in miniMDS they are recovered only partially.
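The ensemble comparison used in this subsection can be sketched in a few lines. The sketch assumes that each structure is a list of 3D coordinates indexed by bin and that both ensembles contain the same number of structures, in which case the one-dimensional Wasserstein distance reduces to the mean absolute difference between sorted samples. The function names are hypothetical, and the significance test is not included.

```python
import math

def pair_distances(ensemble, i, j):
    """Euclidean distance between bins i and j in every structure of an ensemble."""
    return [math.dist(s[i], s[j]) for s in ensemble]

def wasserstein_1d(u, v):
    """First Wasserstein distance between two equal-size samples on the line."""
    if len(u) != len(v):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)

def ensemble_wasserstein(truth, predicted):
    """Wasserstein distance between distance distributions, per pair of bins."""
    n = len(truth[0])
    return {(i, j): wasserstein_1d(pair_distances(truth, i, j),
                                   pair_distances(predicted, i, j))
            for i in range(n) for j in range(i + 1, n)}
```

Sorting the samples removes any dependence on the order of the structures within each ensemble, which is why no one-to-one correspondence between predicted and ground-truth structures is needed.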
3.2 Evaluation on Experimental Data
Fission Yeast
In the case of the fission yeast dataset, 3D FISH measurements are available from Tanizawa et al. (2010), quantifying pairwise distances in 3D between a limited number of fluorescently-tagged genomic loci. The distances serve as a sparse set of labels of intra- and inter-chromosomal distances, on which the inferred 3D structures can be independently validated. Out of the 18 pairs of loci, 11 are intra-chromosomal and 7 are inter-chromosomal. We compared the predicted structures of REACH3D, miniMDS and GEM by computing the Pearson correlation coefficients between FISH distances and Euclidean distances computed from the inferred structures. REACH3D obtains the highest Pearson correlation coefficient, followed by miniMDS and GEM. The results are also shown graphically in Figure 6, where we observe similar correlation patterns for both intra- and inter-chromosomal distances. The visual comparison of the inferred structures by REACH3D, miniMDS and GEM is presented in subsection 3.3.
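The FISH validation described above can be sketched as follows, assuming that each FISH-tagged locus has already been mapped to a bin index and that the inferred structure is a list of 3D coordinates indexed by bin. The names `model_distances` and `pearson` are hypothetical.

```python
import math

def model_distances(coords, pairs):
    """Euclidean distances in the inferred structure for the given bin pairs."""
    return [math.dist(coords[i], coords[j]) for i, j in pairs]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Because the Pearson coefficient is invariant to shifting and positive scaling, the inferred structure does not need to be expressed in the physical units of the FISH measurements for this comparison.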
Human
We illustrate the structures of chromosome 22 of the GM12878 cell line at 10 Kb resolution and compare the results of REACH3D with miniMDS in Figure 6. Due to the high resolution of the data, GEM was unable to yield results in reasonable computational time. The sequence of the genomic bins is preserved by both methods. Intra-chromosomal interactions, long-range interactions and chromosome looping can be clearly observed in the REACH3D structure and, to a lesser extent, in the miniMDS structure.
3.3 Time-dependent Analysis
Last, we perform time series prediction and compare the structures inferred at eight sampled cell cycle time points with those of miniMDS and GEM, which, as previously mentioned, obtain 3D structures for each time point independently. The results of GEM are shown in the first row of Figure 7. The transition between time points cannot be easily assessed, since most of the expected biological and physical properties and, notably, the bin sequence, are not preserved. Therefore, it is hard to distinguish the chromosomes, their interactions and the global 3D chromatin structure. The results of miniMDS are shown in the middle row of Figure 7. MiniMDS preserves the bin sequence in all time points. Nevertheless, the structures are characterized by a zigzag pattern, and chromosome territories, intra- and inter-chromosomal interactions and long-range loops are not clearly visible in all time points. For example, there are a few structures where the chromosomes are superposed (Figure 7 (l) and (o)), and cases where a few genomic bins stick out of the structure. More importantly, changes in the structures are rather drastic from one time point to the next, since the prediction of each time point does not exploit the prior information about the already observed structures.
The results of REACH3D with weight sharing are shown in the last row of Figure 7. The structures inferred by REACH3D best exhibit the expected biological and physical properties, since at all time points the bin sequence is preserved and chromosome territories, intra- and inter-chromosomal interactions and long-range loops are clearly visible. This time, there seems to be a continuum of progression between the structures at different time points, in agreement with gradual changes in chromatin conformation over time. Starting from early M phase, we observe that the structures at 20 and 30 minutes (Figure 7 (q) and (r)) are very similar. At late M phase and before the M-to-G1 phase transition (40 minutes, Figure 7 (s)), more drastic changes occur: the left arm of the first chromosome expands and chromosomes two and three condense. After M phase and for the remainder of the cell cycle, the structure evolves dynamically. In agreement with observations by Tanizawa et al. (2017), the M phase patterns do not disappear abruptly, but rather diminish gradually until the next cell cycle.
4 Conclusions
In this work, we explore the idea of manifold learning for the inference of the 3D chromatin structure and present a novel method, REcurrent Autoencoders for CHromatin 3D structure prediction (REACH3D). Our framework addresses the limitations of existing methods by using autoencoders with recurrent neural units to reconstruct the chromatin structure. In comparison to state-of-the-art methods, REACH3D recovers most faithfully the expected biological properties of the chromatin structure, and obtains the highest correlation with microscopy-based distances and the highest reconstruction accuracy on synthetic data. In addition, REACH3D enables us to perform time series analysis, and thus model the dynamic conformation of chromatin across the cell cycle. Beyond these methodological advancements, new experimental measurements, such as single-cell Hi-C (Nagano et al., 2013) and microscopy measurements, could enable a more direct validation of the results and open the door to other machine learning techniques applicable in the context of supervised learning. Such models have the potential to advance our understanding of chromatin structure, its various biological functions and their links to human disease.
References
 Tanizawa et al. [2010] H. Tanizawa, O. Iwasaki, A. Tanaka, J. R. Capizzi, P. Wickramasinghe, M. Lee, Z. Fu, and K.-i. Noma. Mapping of long-range associations throughout the fission yeast genome reveals global genome organization linked to transcriptional regulation. Nucleic Acids Research, 38(22):8164–8177, 12 2010. ISSN 03051048. doi: 10.1093/nar/gkq955.
 Abadi et al. [2015] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. URL https://www.tensorflow.org/. Software available from tensorflow.org.
 Bonev and Cavalli [2016] B. Bonev and G. Cavalli. Organization and function of the 3D genome. Nature Reviews Genetics, 17(11):661–678, 2016. ISSN 14710064. doi: 10.1038/nrg.2016.112.
 Dekker [2008] J. Dekker. Gene regulation in the third dimension. Science (New York, N.Y.), 319(5871):1793–4, 3 2008. ISSN 10959203. doi: 10.1126/science.1152850.
 Dekker et al. [2002] J. Dekker, K. Rippe, M. Dekker, and N. Kleckner. Capturing chromosome conformation. Science (New York, N.Y.), 295(5558):1306–11, 2 2002. ISSN 10959203. doi: 10.1126/science.1067799.
 Dekker et al. [2017] J. Dekker, A. S. Belmont, M. Guttman, V. O. Leshyk, J. T. Lis, S. Lomvardas, L. A. Mirny, C. C. O’Shea, P. J. Park, B. Ren, J. C. R. Politz, J. Shendure, S. Zhong, and t. D. N. Network. The 4D nucleome project. Nature, 549(7671):219–226, 9 2017. ISSN 00280836. doi: 10.1038/nature23884.
 Duan et al. [2010] Z. Duan, M. Andronescu, K. Schutz, S. McIlwain, Y. J. Kim, C. Lee, J. Shendure, S. Fields, C. A. Blau, and W. S. Noble. A three-dimensional model of the yeast genome. Nature, 465(7296):363–7, 5 2010. ISSN 14764687. doi: 10.1038/nature08973.
 Gaietta et al. [2002] G. Gaietta, T. J. Deerinck, S. R. Adams, J. Bouwer, O. Tour, D. W. Laird, G. E. Sosinsky, R. Y. Tsien, and M. H. Ellisman. Multicolor and Electron Microscopic Imaging of Connexin Trafficking. Science, 296(5567):503–507, 4 2002. ISSN 00368075. doi: 10.1126/science.1068793.

 Glorot and Bengio [2010] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the thirteenth international conference on artificial intelligence and statistics, pages 249–256, 2010.
 Hochreiter and Schmidhuber [1997] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
 Hu et al. [2013] M. Hu, K. Deng, Z. Qin, J. Dixon, S. Selvaraj, J. Fang, B. Ren, and J. S. Liu. Bayesian Inference of Spatial Organizations of Chromosomes. PLoS Computational Biology, 9(1), 2013. ISSN 1553734X. doi: 10.1371/journal.pcbi.1002893.
 Kingma and Ba [2014] D. P. Kingma and J. Ba. Adam: A Method for Stochastic Optimization. 12 2014.
 Kruskal [1964] J. B. Kruskal. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29(1):1–27, 3 1964. ISSN 00333123. doi: 10.1007/BF02289565.
 Lesne et al. [2014] A. Lesne, J. Riposo, P. Roger, A. Cournac, and J. Mozziconacci. 3D genome reconstruction from chromosomal contacts. Nature Methods, 11(11):1141–1143, 11 2014. ISSN 15487091. doi: 10.1038/nmeth.3104.
 Lieberman-Aiden and Berkum [2009] E. Lieberman-Aiden and N. v. Berkum. Comprehensive mapping of long range interactions reveals folding principles of the human genome. Science, 326(5950):289–293, 2009. ISSN 10959203. doi: 10.1126/science.1181369.
 Misteli [2004] T. Misteli. Spatial positioning; a new dimension in genome function. Cell, 119(2):153–6, 10 2004. ISSN 00928674. doi: 10.1016/j.cell.2004.09.035.
 Misteli [2007] T. Misteli. Beyond the sequence: cellular organization of genome function. Cell, 128(4):787–800, 2 2007. ISSN 00928674. doi: 10.1016/j.cell.2007.01.028.
 Mitelman et al. [2007] F. Mitelman, B. Johansson, and F. Mertens. The impact of translocations and gene fusions on cancer causation. Nature Reviews Cancer, 7(4):233–245, 4 2007. ISSN 1474175X. doi: 10.1038/nrc2091.
 Nagano et al. [2013] T. Nagano, Y. Lubling, T. J. Stevens, S. Schoenfelder, E. Yaffe, W. Dean, E. D. Laue, A. Tanay, and P. Fraser. Single-cell Hi-C reveals cell-to-cell variability in chromosome structure. Nature, 502(7469):59, 2013.
 Oluwadare et al. [2018] O. Oluwadare, Y. Zhang, and J. Cheng. A maximum likelihood algorithm for reconstructing 3D structures of human chromosomes from chromosomal contact data. BMC genomics, 19(1):161, 2018. ISSN 14712164. doi: 10.1186/s1286401845468.
 Park and Lin [2016] J. Park and S. Lin. Impact of data resolution on three-dimensional structure inference methods. BMC Bioinformatics, 17(1):70, 12 2016. ISSN 14712105. doi: 10.1186/s128590160894z.
 Rao et al. [2014] S. S. P. Rao, M. H. Huntley, N. C. Durand, E. K. Stamenova, I. D. Bochkov, J. T. Robinson, A. L. Sanborn, I. Machol, A. D. Omer, E. S. Lander, and E. Lieberman Aiden. A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping. Cell, 159:1665–1680, 2014. doi: 10.1016/j.cell.2014.11.021.
 Rieber and Mahony [2017] L. Rieber and S. Mahony. MiniMDS: 3D structural inference from high-resolution Hi-C data. Bioinformatics, 33(14):i261–i266, 2017. ISSN 14602059. doi: 10.1093/bioinformatics/btx271.

 Rousseau et al. [2011] M. Rousseau, J. Fraser, M. A. Ferraiuolo, J. Dostie, and M. Blanchette. Three-dimensional modeling of chromatin structure from interaction frequency data using Markov chain Monte Carlo sampling. BMC Bioinformatics, 12(1):414, 10 2011. ISSN 14712105. doi: 10.1186/1471210512414.
 Rust et al. [2006] M. J. Rust, M. Bates, and X. Zhuang. Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM). Nature Methods, 3(10):793–796, 10 2006. ISSN 15487091. doi: 10.1038/nmeth929.
 Sutskever et al. [2014] I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to Sequence Learning with Neural Networks. 9 2014.
 Tanizawa et al. [2017] H. Tanizawa, K.-D. Kim, O. Iwasaki, and K.-i. Noma. Architectural alterations of the fission yeast genome during the cell cycle. Nature Structural & Molecular Biology, 24(11):965–976, 10 2017. ISSN 15459993. doi: 10.1038/nsmb.3482.
 Therizols et al. [2014] P. Therizols, R. S. Illingworth, C. Courilleau, S. Boyle, A. J. Wood, and W. A. Bickmore. Chromatin decondensation is sufficient to alter nuclear organization in embryonic stem cells. Science (New York, N.Y.), 346(6214):1238–42, 12 2014. ISSN 10959203. doi: 10.1126/science.1259587.
 Trieu and Cheng [2014] T. Trieu and J. Cheng. Large-scale reconstruction of 3D structures of human chromosomes from chromosomal contact data. Nucleic acids research, 42(7):e52, 4 2014. ISSN 13624962. doi: 10.1093/nar/gkt1411.
 Trussart et al. [2015] M. Trussart, F. Serra, D. Baù, I. Junier, L. Serrano, and M. A. Marti-Renom. Assessing the limits of restraint-based 3D modeling of genomes and genomic domains. Nucleic Acids Research, 43(7):3465–3477, 4 2015. ISSN 13624962. doi: 10.1093/nar/gkv221.
 Vallender [1974] S. Vallender. Calculation of the wasserstein distance between probability distributions on the line. Theory of Probability & Its Applications, 18(4):784–786, 1974.
 Van Der Maaten and Hinton [2008] L. Van Der Maaten and G. Hinton. Visualizing Data using t-SNE. Journal of Machine Learning Research, 9:2579–2605, 2008.
 van Steensel and Dekker [2010] B. van Steensel and J. Dekker. Genomics tools for unraveling chromosome architecture. Nature biotechnology, 28(10):1089–95, 10 2010. ISSN 15461696. doi: 10.1038/nbt.1680.
 Varoquaux et al. [2014] N. Varoquaux, F. Ay, W. S. Noble, and J. P. Vert. A statistical approach for inferring the 3D structure of the genome. Bioinformatics, 30(12):i26–i33, 2014. ISSN 14602059. doi: 10.1093/bioinformatics/btu268.

 Wang [2012] J. Wang. Classical Multidimensional Scaling. In Geometric Structure of High-Dimensional Data and Dimensionality Reduction, pages 115–129. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012. doi: 10.1007/9783642274978_6.
 Wang et al. [2015] S. Wang, J. Xu, and J. Zeng. Inferential modeling of 3D chromatin structure. Nucleic acids research, 43(8):e54, 2015. ISSN 13624962. doi: 10.1093/nar/gkv100.
 Zhang et al. [2013] Z. Zhang, G. Li, K.-C. Toh, and W.-K. Sung. 3D Chromosome Modeling with Semi-Definite Programming and Hi-C Data. Journal of Computational Biology, 20(11):831–846, 2013. ISSN 10665277. doi: 10.1089/cmb.2013.0076.
 Zhu et al. [2018] G. Zhu, W. Deng, H. Hu, R. Ma, S. Zhang, J. Yang, J. Peng, T. Kaplan, and J. Zeng. Reconstructing spatial organizations of chromosomes through manifold learning. Nucleic Acids Research, 2 2018. ISSN 03051048. doi: 10.1093/nar/gky065.
 Zou et al. [2016] C. Zou, Y. Zhang, and Z. Ouyang. HSA: integrating multi-track Hi-C data for genome-scale reconstruction of 3D chromatin structure. Genome Biology, 17(1):40, 12 2016. ISSN 1474760X. doi: 10.1186/s1305901608961.