DeepAdversaries: Examining the Robustness of Deep Learning Models for Galaxy Morphology Classification

12/28/2021
by   Aleksandra Ćiprijanović, et al.
Fermilab

Data processing and analysis pipelines in cosmological survey experiments introduce data perturbations that can significantly degrade the performance of deep learning-based models. Given the increased adoption of supervised deep learning methods for processing and analysis of cosmological survey data, the assessment of data perturbation effects and the development of methods that increase model robustness are increasingly important. In the context of morphological classification of galaxies, we study the effects of perturbations in imaging data. In particular, we examine the consequences of using neural networks when training on baseline data and testing on perturbed data. We consider perturbations associated with two primary sources: 1) increased observational noise as represented by higher levels of Poisson noise and 2) data processing noise incurred by steps such as image compression or telescope errors as represented by one-pixel adversarial attacks. We also test the efficacy of domain adaptation techniques in mitigating the perturbation-driven errors. We use classification accuracy, latent space visualizations, and latent space distance to assess model robustness. Without domain adaptation, we find that processing pixel-level errors easily flip the classification into an incorrect class and that higher observational noise makes the model trained on low-noise data unable to classify galaxy morphologies. On the other hand, we show that training with domain adaptation improves model robustness and mitigates the effects of these perturbations, improving the classification accuracy by 23% on data with higher observational noise. Domain adaptation also increases by a factor of about 2.3 the latent space distance between the baseline and the incorrectly classified one-pixel perturbed image, making the model more robust to inadvertent perturbations.

1 Introduction

The success of deep learning models across a broad range of science applications is in part driven by their inherent flexibility. For example, deep neural networks can be trained to use features that represent a wide variety of patterns in the data. However, the features these neural networks contain are often incomprehensible by humans, and the models they produce can be brittle, especially when applied outside the intended circumstances. One such change in circumstances happens when trained models are applied to data that contain perturbations, which can be intentional or accidental in origin. The effective use of deep learning tools requires a detailed exploration and accounting of failure modes amidst possible data perturbations.

Adversarial attacks are inputs specifically crafted to confuse susceptible neural networks (Szegedy et al., 2013; Yuan et al., 2019). Often, adversarial examples are thought to arise from non-robust features that can easily be learned by overly-parameterized models (Ilyas et al., 2019). Some attacks rely on access to network information, such as network architecture, trained weights, and internal gradients (Szegedy et al., 2013; Goodfellow et al., 2014). One of the most well-known examples of this type of attack is a correctly classified image of a panda that is flipped to the class “gibbon”, with very high probability, after the addition of imperceptible but well-crafted noise, produced by the “fast gradient sign method” (Goodfellow et al., 2014). Alternatively, black-box attacks do not require information about the trained model (Nitin Bhagoji et al., 2017; Chen et al., 2017). For example, analysis of the widely used benchmark datasets CIFAR-10 (Krizhevsky et al.) and ImageNet (Deng et al., 2009) shows that a non-negligible fraction of the images in each dataset can be flipped to an incorrect class by changing or “attacking” just one pixel of the image (Su et al., 2019). Furthermore, perturbations of real-world objects can cause significant problems; for example, well-placed stickers on traffic signs have caused autonomous vehicles to misclassify stop signs (Eykholt et al., 2017).

Beyond the extreme of adversarial attacks, readily occurring or accidental data perturbations—including image compression, blurring (via the point spread function), and the addition of observational (often simple Gaussian or Poisson) noise, instrument readout errors, dead camera pixels—can significantly degrade or imperil model performance (Gide et al., 2016; Dodge and Karam, 2016, 2017; Ford et al., 2019) in astronomy applications. Obtaining a deeper understanding of model performance and robustness in the context of these perturbations is crucial for successful implementation in astronomy experiments, in particular for real-time data acquisition and processing.

Deep learning is used with increasing frequency for a variety of tasks in cosmology, from science analysis to data processing. For example, convolutional neural networks (CNNs) have been used to classify/identify a variety of objects and patterns, such as: low surface brightness galaxies (Tanoglidis et al., 2021), merging galaxies (Ćiprijanović et al., 2020), post-mergers (Bickley et al., 2021), galaxy morphology (Cheng et al., 2021), radio galaxies (Aniyan and Thorat, 2017), fast radio bursts and radio frequency interference candidates (Agarwal et al., 2020), or cosmology via weak lensing (Perraudin et al., 2019). Residual neural networks, which are more complex than generic CNNs, have also proven efficient for searches of many objects, such as galaxy-galaxy strong lenses (Lanusse et al., 2018), Sunyaev-Zel’dovich galaxy clusters (Lin et al., 2021), and Lyα-emitting lenses (Li et al., 2018). Furthermore, deep learning has often been used for regression tasks such as measuring galaxy properties from maps (Prelogović et al., 2022), predicting galaxy metallicity from optical images (Wu and Boada, 2019), and constraining cosmological parameters from weak lensing maps (Fluri et al., 2019). Finally, deep learning can also be used to automate multiple tasks in large astronomical surveys, including telescope survey scheduling (Naghib et al., 2019; Alba Hernandez, 2019), cleaning astronomical data sets of ghosts and scattered-light artifacts (Tanoglidis et al., 2021), image denoising (Gheller and Vazza, 2022), and data processing and storing (La Plante et al., 2021). The use of deep learning is likely to grow commensurately with the size and complexity of modern and next-generation cosmic surveys, such as the Dark Energy Survey (DES; Dark Energy Survey Collaboration et al., 2016), the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP; Aihara et al., 2018), the Rubin Observatory Legacy Survey of Space and Time (LSST; Ivezić et al., 2019), Euclid (https://www.cosmos.esa.int/web/euclid), the Nancy Grace Roman Space Telescope (https://roman.gsfc.nasa.gov), the Subaru Prime Focus Spectrograph (PFS; Sugai et al., 2015), and the Dark Energy Spectroscopic Instrument (DESI; DESI Collaboration et al., 2016a, b).

Most approaches to defend from adversarial attacks (Hendrycks and Dietterich, 2019) can be divided into: 1) reactive measures, which focus on detecting the attack after the model is built (Lu et al., 2017; Metzen et al., 2017; Feinman et al., 2017), cleaning the attacked image (Gu and Rigazio, 2014), or verifying the network properties (Katz et al., 2017); and 2) proactive measures, which aim to increase model robustness before adversarial attacks are produced (Yuan et al., 2019). In the sciences, where adversarial attacks are not targeted but can accrue as a natural part of the data acquisition and storage process, the second group of the defense strategies is more relevant. Some of the methods in this group include network distillation (Papernot et al., 2016), adversarial (re)training (Goodfellow et al., 2014; Madry et al., 2018; Deng et al., 2020), and probabilistic modeling to provide uncertainty quantification (Bradshaw et al., 2017; Abbasi and Gagné, 2017; Wicker et al., 2021). More recently, it has also been shown that viewing a neural architecture as a dynamical system (referred to as implicit neural networks) and incorporating higher-order numerical schemes can improve robustness to adversarial attacks (Li et al., 2020).

Domain adaptation (Csurka, 2017; Wang and Deng, 2018; Wilson and Cook, 2020) comprises another group of methods that could prove useful to increase the robustness of deep learning models against these naturally occurring image perturbations. These techniques are useful when training models that need to perform well on multiple datasets at the same time. In contrast with the previously mentioned approaches, domain adaptation enables the model to learn domain-invariant features, which are present in multiple datasets and therefore more generalizable, thus improving models’ robustness to inadvertent perturbations. Domain adaptation techniques can be categorized into: 1) distance-based methods such as Maximum Mean Discrepancy (MMD; Gretton et al., 2007, 2012), Deep Correlation Alignment (CORAL; Sun and Saenko, 2016), and Central Moment Discrepancy (CMD; Zellinger et al., 2019); and 2) adversarial-based methods such as Domain Adversarial Neural Networks (DANN; Ganin et al., 2016) and Conditional Domain Adversarial Networks (CDAN; Long et al., 2017).

In the context of astronomical observations, the different domains may be simulated and observed data, or data from multiple telescopes. With domain adaptation, the model can be guided to ignore discrepancies across datasets, including different signal-to-noise levels, noise models, and PSFs. In Ćiprijanović et al. (2020), the authors show that a simple algorithm trained to distinguish merging and non-merging galaxies is rendered useless after the inclusion of observational noise. In Ćiprijanović et al. (2020, 2021) the authors study domain adaptation as a way to draw discrepant astronomical data distributions closer together, thereby increasing model robustness. Using domain adaptation, the authors were able to create a model trained on simulated images of merging galaxies from the Illustris-1 cosmological simulation (Vogelsberger et al., 2014), which also performs well on simulated data that includes observational noise. Furthermore, by using domain adaptation the authors were able to bridge the gap between simulated and observed data and create a model trained on simulated Illustris-1 data that performs well on the real Sloan Digital Sky Survey images (SDSS; Lintott et al. (2008, 2010); Darg et al. (2010)).

We posit this robustness is also directly applicable to combating inadvertent pixel-level perturbations coming from image compression or telescope errors. Domain adaptation methods are well suited for astronomy applications since they allow one to utilize previous observations or simulated data to increase the robustness of the model for new datasets. More importantly, domain adaptation methods can even be used when one of the datasets does not include labels. Several relevant cases include working with newly observed unlabeled data, which cannot directly be used to train a model, or fine-tuning the weights (via transfer learning) of a model previously trained on old observations or simulations (Tuccillo et al., 2018; Domínguez Sánchez et al., 2019; Tanoglidis et al., 2021).

In this work, we use simulated data to explore the effects of inadvertent image perturbations that can arise in complex scientific data processing pipelines, including those that will be used by the Vera C. Rubin Observatory’s LSST (Ivezić et al., 2019). As the context for our tests, we use the problem of galaxy morphology classification (spiral, elliptical, and merging galaxies), using images and catalog data of galaxy morphology from the large-volume cosmological magneto-hydrodynamical simulation IllustrisTNG100 (Nelson et al., 2019). We emulate LSST processing and observations in our images: our baseline dataset representing low-noise observations is generated by applying an exposure time equivalent to ten years of observing. For the first perturbation to the data, we explore the effects of larger observational noise by creating the high-noise observations, which correspond to one year of observing. We also explore pixel-level perturbations—representing effects such as data compression, instrument readout errors, cosmic rays, and dead camera pixels—which are produced through optimized one-pixel attacks (Su et al., 2017). We train our networks—a simple few-layer CNN and a more complex ResNet18 (He et al., 2016)—on baseline data and then test on noisy and one-pixel attacked data. During model training, we employ domain adaptation, to investigate the potential benefits of these methods for increasing model robustness to image perturbations, compared to regular training without domain adaptation. Furthermore, we analyze the network latent spaces to assess the robustness of our models due to these data perturbations and training procedures.

In Section 2, we describe the simulation and how we create our datasets, as well as details about the image perturbations we explore. In Section 3, we describe the deep learning models we use, and in Section 3.2, we introduce domain adaptation and how it is implemented in our experiments. In Section 4, we introduce visualization methods that are used to explore the latent space of our models. We present our results in Section 5, with a discussion and conclusion in Section 6.

2 Data

When creating our dataset, we use IllustrisTNG100 (Marinacci et al., 2018; Naiman et al., 2018; Springel et al., 2018; Pillepich et al., 2018; Nelson et al., 2019), a state-of-the-art cosmological magneto-hydrodynamical simulation that includes gas, stars, dark matter, supermassive black holes, and magnetic fields. We extract galaxy images in the ($g$, $r$, $i$) filters (we use database filter keys psi_g, psi_r, and psi_i) from two snapshots, 95 and 99, which correspond to two different redshifts. Finally, we convert all data to a single effective redshift, to create a larger single-redshift dataset.


Figure 1: The $G$–$M_{20}$ “bulge statistic” plot for our entire dataset made from the two IllustrisTNG100 snapshots. The solid gray line distinguishes between merging (dark blue stars) and non-merging systems. The latter includes spiral (orange circles) and elliptical (violet plus signs) galaxies, which are separated by a dashed gray line.

2.1 Labeling classes

To produce the labels for our experiments, we use the IllustrisTNG100 morphology catalogs (Rodriguez-Gomez et al., 2019), which include non-parametric morphological diagnostics, such as the relative distribution of the galaxy pixel flux values (Gini coefficient $G$; Glasser, 1962), the second-order moment of the brightest 20 percent of the galaxy’s flux ($M_{20}$; Lotz et al., 2004), the concentration–asymmetry–smoothness (CAS) statistics (Conselice et al., 2003; Lotz et al., 2004), and 2D Sérsic fits (Sérsic, 1963).

We follow Lotz et al. (2004) and Snyder et al. (2015), and use the $G$–$M_{20}$ “bulge statistic” to label spiral, elliptical, and merging galaxies. Figure 1 presents the $G$–$M_{20}$ diagram of our dataset, with the intersecting lines representing the boundaries between the three classes. The solid line separates merging galaxies from non-mergers (spirals and ellipticals). Among non-mergers, the dashed line sets a threshold on the Gini coefficient: elliptical (spiral) galaxies have a Gini coefficient greater (lesser) than this threshold. The boundaries between the three classes intersect at a single point in the diagram.

From two IllustrisTNG100 snapshots (95 and 99), we extract spiral, elliptical, and merging galaxies. To generate more data and to increase parity amongst the classes, we augment mergers with horizontal and vertical flips and with rotations, producing 12,710 merger images. We then divide the full set of 35,000 images into training, validation, and test datasets. For example images, see Figure 2.

2.2 Perturbation: Noise

To create our data, which emulates LSST observations, we use the GalSim package (Rowe et al., 2015) and follow the same procedure as in Sanchez et al. (2021).

We create two sets of survey-emulating images, a high-noise one-year survey (“Y1”) and a low-noise ten-year survey (“Y10”), by applying an exposure time corresponding to one year or ten years of observations directly to the raw images (two of the filters share one per-year exposure value, while the third filter uses a different one). This procedure simplifies data handling and obtains similar results to co-adding, where multiple single-epoch (30s) exposures are combined to yield the final results. (Typical co-adding strategies consist of adding images using inverse variance weighting; in our case, where the variance follows a perfectly known Poisson distribution, co-adding and simulating the full exposure are equivalent procedures.) Furthermore, we apply PSF blurring using both atmospheric and optical PSF models. The images are simulated using a constant sky-background corresponding to the median sky level tabulated in Ivezić et al. (2019). This background signal is subtracted from the final images containing the simulated Illustris sources, following the typical procedure used for real astronomical images. Thus, for the empty regions (without the simulated Illustris galaxy) on these images, we expect that the pixel levels follow a Poisson distribution centered at 0 with a variance equal to the original mean background level.
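The expected statistics of empty regions can be checked with a minimal sketch. The sky level (500 counts) and image size below are hypothetical values chosen for illustration, and NumPy's Poisson sampler stands in for the GalSim-based pipeline used in the paper.

```python
import numpy as np

def add_sky_noise(source_counts, sky_level, rng):
    """Draw Poisson counts for source + constant sky, then subtract the sky mean."""
    expected = source_counts + sky_level            # expected counts per pixel
    observed = rng.poisson(expected).astype(float)  # Poisson realization
    return observed - sky_level                     # background-subtracted image

rng = np.random.default_rng(0)
sky_level = 500.0                 # hypothetical mean background (counts)
empty = np.zeros((200, 200))      # region without any simulated galaxy
noisy = add_sky_noise(empty, sky_level, rng)

print("mean of empty region:    ", noisy.mean())  # expected near 0
print("variance of empty region:", noisy.var())   # expected near sky_level
```

Under these assumptions, the sample mean sits near zero and the sample variance near the original background level, matching the statement above.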

We then process the images to make the details of galaxies more apparent, by clipping the pixel values at a low and a high percentile, which removes a very small number of outlying pixels. We then perform arcsinh stretching to make fainter objects more apparent while preserving the original color ratios in each pixel, by scaling each of the three filters with the same arcsinh-based factor, which includes a constant used to scale the outputs to a fixed range.
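One possible form of this clipping and stretching step is sketched below. The percentile limits, the softening constant `a`, and the specific color-preserving formula (one shared factor per pixel, applied to all three filters) are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def clip_and_stretch(image, low_pct=0.1, high_pct=99.9, a=0.1):
    """Clip to percentiles, then apply a color-preserving arcsinh stretch.

    image: array of shape (3, H, W), one plane per filter.
    low_pct, high_pct, a: hypothetical values, not the paper's settings.
    """
    lo, hi = np.percentile(image, [low_pct, high_pct])
    shifted = np.clip(image, lo, hi) - lo        # non-negative after clipping
    intensity = shifted.mean(axis=0)             # per-pixel mean over filters
    safe = np.where(intensity > 0, intensity, 1.0)
    # One factor per pixel, shared by all filters, keeps color ratios intact.
    factor = np.arcsinh(intensity / a) / safe
    stretched = shifted * factor
    peak = stretched.max()
    return stretched / peak if peak > 0 else stretched  # rescale to [0, 1]

img = np.random.default_rng(1).gamma(2.0, 1.0, size=(3, 16, 16))
out = clip_and_stretch(img)
```

Because every filter in a pixel is multiplied by the same factor, the ratio between any two filters in that pixel is unchanged by the stretch.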


Figure 2: Example images from our datasets. The left, middle, and right column show examples of spiral, elliptical, and merging galaxies, respectively, with emulations of ten-year observational noise (Y10, top row), and one-year observational noise (Y1, bottom row).

2.3 Perturbation: One-pixel attacks

Multiple processes in astronomy data pipelines can change a small number of pixels, including image (de)compression, errors in charge-coupled device (CCD) detectors, detector readout, and cosmic rays. We use the one-pixel attack as a proxy for these pixel-level perturbations.

To model one-pixel attacks, we represent the original image as a tensor $x$ and its classification score as $f(x)$. An attack is optimized to find the additive perturbation vector $e(x)$ that maximizes the score of the image for an incorrect class. The length of the perturbation vector must be bounded by a prescribed maximum: $\|e(x)\|_0 \leq d$, where $d = 1$ for a one-pixel attack (Su et al., 2017).

Creating an optimal attack is typically performed through differential evolution (Storn and Price, 1997; Das and Suganthan, 2011; Su et al., 2017), a population-based optimization algorithm. In each iteration of the algorithm, a set of candidate pixels (children) is generated according to the current population (parents). To maintain population diversity, children are only compared to their corresponding parent and are kept if they possess a higher fitness value. For adversarial attacks, fitness is measured by the increase of the classification score for the desired incorrect class. The number of iterations required to find the optimal pixel-level perturbation corresponds to the susceptibility of a model to an attack.
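The parent-versus-child selection described above can be sketched as follows. This is a toy illustration, not the attack code used in the paper: the classifier `toy_scores` is a hypothetical linear-softmax stand-in on single-channel 8x8 images, and the population size, iteration count, and mutation scale are assumed values.

```python
import numpy as np

_W = np.random.default_rng(42).normal(size=(3, 64))  # hypothetical toy weights

def toy_scores(image):
    """Stand-in classifier: softmax over a fixed linear map of an 8x8 image."""
    logits = _W @ image.ravel()
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def apply_pixel(image, cand):
    """Return a copy of image with one pixel set from cand = (row, col, value)."""
    h, w = image.shape
    r = int(np.clip(np.rint(cand[0]), 0, h - 1))
    c = int(np.clip(np.rint(cand[1]), 0, w - 1))
    out = image.copy()
    out[r, c] = np.clip(cand[2], 0.0, 1.0)
    return out

def one_pixel_attack(image, score_fn, target, pop_size=40, iters=30,
                     scale=0.5, seed=0):
    """Differential evolution over (row, col, value) candidates."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    pop = np.column_stack([rng.uniform(0, h - 1, pop_size),
                           rng.uniform(0, w - 1, pop_size),
                           rng.uniform(0.0, 1.0, pop_size)])
    # Fitness is the classification score of the desired incorrect class.
    fitness = np.array([score_fn(apply_pixel(image, p))[target] for p in pop])
    for _ in range(iters):
        parents = pop.copy()
        for i in range(pop_size):
            a, b, c = parents[rng.choice(pop_size, size=3, replace=False)]
            child = a + scale * (b - c)              # mutation, no crossover
            child_fit = score_fn(apply_pixel(image, child))[target]
            if child_fit > fitness[i]:               # compare only to own parent
                pop[i], fitness[i] = child, child_fit
    best = int(np.argmax(fitness))
    return apply_pixel(image, pop[best]), float(fitness[best])

image = np.random.default_rng(1).uniform(0.0, 1.0, size=(8, 8))
true_class = int(np.argmax(toy_scores(image)))
target_class = (true_class + 1) % 3
attacked, target_score = one_pixel_attack(image, toy_scores, target_class)
print("target-class score after attack:", target_score)
```

Each child replaces only its own parent, which is the diversity-preserving rule mentioned in the text; whether the class actually flips depends on the model and the image.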

3 Networks and Experiments

We study the effects of perturbations in astronomical images in the context of two neural networks that represent distinct levels of network complexity and sophistication. Furthermore, we also explore the efficacy of domain adaptation for improving the performance and robustness of each of these networks.

3.1 Network architectures

For a relatively simple model, we use a CNN that has three convolutional layers (with each layer followed by ReLU activation, batch normalization, and max pooling) and two dense layers; hereafter we refer to this model as ConvNet. Details of the ConvNet architecture are shown in Table 5. For a more complex model, we use one of the smallest standard off-the-shelf residual neural networks, ResNet18, which has four residual blocks (each containing convolutional layers), followed by two dense layers (He et al., 2016). Both networks have a latent space (layer immediately following the last convolution layer) of dimension 256, followed by an output layer with three neurons, one neuron corresponding to each of three classes: spiral, elliptical, and merging galaxies.

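ConvNet (ResNet18) has 1.2M (11.2M) trainable parameters. Training is performed by minimizing the weighted cross-entropy (CE) loss

$$\mathcal{L}_{\rm CE} = -\sum_{m=1}^{M} w_m \, y_m \log \hat{y}_m , \qquad (1)$$

where $y_m$ and $\hat{y}_m$ are the true label and the predicted probability for class $m$, and the weight $w_m$ (distinct from the network weight parameters) for each class is calculated as $w_m = N / (M n_m)$, where $n_m$ is the number of images in class $m$, $M$ is the total number of classes, and $N$ is the total number of images in the training dataset.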
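A small worked example may help make the class weighting concrete. The weight formula used here, total images divided by (number of classes times images in the class), is an assumption consistent with the quantities defined above; the label counts are made up for illustration.

```python
import math
from collections import Counter

def class_weights(labels):
    """Assumed weighting: N / (M * n_m) for each class m."""
    counts = Counter(labels)
    n_total = len(labels)        # N: total number of images
    n_classes = len(counts)      # M: total number of classes
    return {c: n_total / (n_classes * n) for c, n in counts.items()}

def weighted_cross_entropy(probs, label, weights):
    """Weighted CE for one image with a one-hot true label."""
    return -weights[label] * math.log(probs[label])

# Hypothetical, deliberately imbalanced toy label set.
labels = ["spiral"] * 6 + ["elliptical"] * 3 + ["merger"] * 1
weights = class_weights(labels)
print(weights)

probs = {"spiral": 0.2, "elliptical": 0.3, "merger": 0.5}
print(weighted_cross_entropy(probs, "merger", weights))
```

With this choice, rarer classes receive larger weights, so a misclassified merger contributes more to the loss than a misclassified spiral.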

3.2 Domain adaptation

Domain Adaptation (DA) techniques help align the latent data distributions, allowing a model to learn the features shared between the two data domains and to perform well in both (Csurka, 2017; Wang and Deng, 2018; Wilson and Cook, 2020). To align latent data distributions, we use Maximum Mean Discrepancy (MMD), which is a distance-based DA method that minimizes the non-parametric distance between mean embeddings of two probability distributions (Smola et al., 2007; Gretton et al., 2012; Ćiprijanović et al., 2021). Generally, it is difficult to compare two probability distributions that are not completely known, but only sampled. To address this, in practice, kernel methods are used to map probability distributions into the higher-dimensional reproducing kernel Hilbert space. This preserves the statistical features of the original probability distributions, while allowing one to compare and manipulate distributions using Hilbert space operations, such as the inner product.

We follow Zhang et al. (2020) and implement MMD as in Ćiprijanović et al. (2021), by using a combination of multiple Gaussian radial basis function kernels, $k(x, x') = \exp\left(-\|x - x'\|^2 / 2\sigma^2\right)$, where $\|x - x'\|$ is the Euclidean distance norm, $x$ and $x'$ are samples from any of the two latent data distributions, and $\sigma$ is the free parameter that determines the width of the kernel $k$, which measures similarity between two arguments $x$ and $x'$. In this work, we minimize the MMD distance between the Y10 and Y1 latent data distributions. We express the MMD loss as

$$\mathcal{L}_{\rm MMD} = \frac{1}{N(N-1)} \sum_{i \neq j}^{N} \left[ k(x_s(i), x_s(j)) - k(x_s(i), x_t(j)) - k(x_t(i), x_s(j)) + k(x_t(i), x_t(j)) \right] , \qquad (2)$$

where $x_s$ and $x_t$ denote samples from the Y10 and Y1 latent data distributions, respectively, and $N$ is the total number of training samples from Y10 or from Y1 latent data distribution (in our dataset, both distributions have the same number of samples). For more details about the MMD distance calculation, see Smola et al. (2007); Gretton et al. (2012); Ćiprijanović et al. (2021).

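When using DA, the total loss is composed of the MMD and the CE loss:

$$\mathcal{L}_{\rm TOT} = \mathcal{L}_{\rm CE} + \lambda \, \mathcal{L}_{\rm MMD} , \qquad (3)$$

where $\lambda$ controls the relative contribution of the MMD loss. The minimization of the MMD loss requires the maximization of the kernels $k(x_s, x_t)$ and $k(x_t, x_s)$ that describe cross-similarities between the two data distributions (here $x_s$ and $x_t$ are Y10 and Y1 latent embeddings, respectively). This results in the model being forced to find domain-invariant features, which make cross-similarities large.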
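The MMD term and its combination with the CE loss can be sketched in NumPy as follows. The kernel widths, the `2 * sigma**2` normalization, the unbiased off-diagonal estimator, and the value of `lam` are assumptions for illustration; the paper's implementation may differ in these details.

```python
import numpy as np

def multi_rbf_kernel(x, y, sigmas=(1.0, 2.0, 4.0)):
    """Sum of Gaussian RBF kernels between rows of x and rows of y."""
    sq_dist = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return sum(np.exp(-sq_dist / (2.0 * s ** 2)) for s in sigmas)

def mmd_loss(source, target, sigmas=(1.0, 2.0, 4.0)):
    """MMD estimate over off-diagonal pairs; both sets must have N samples."""
    n = source.shape[0]
    assert target.shape[0] == n
    k_ss = multi_rbf_kernel(source, source, sigmas)
    k_tt = multi_rbf_kernel(target, target, sigmas)
    k_st = multi_rbf_kernel(source, target, sigmas)   # cross-similarities
    off = ~np.eye(n, dtype=bool)                      # keep only i != j terms
    total = (k_ss[off].sum() + k_tt[off].sum()
             - k_st[off].sum() - k_st.T[off].sum())
    return total / (n * (n - 1))

def total_loss(ce_loss, mmd, lam):
    """Total loss: CE plus lam times MMD (lam is a hypothetical setting)."""
    return ce_loss + lam * mmd

rng = np.random.default_rng(0)
y10_latent = rng.normal(size=(64, 8))            # toy source embeddings
y1_aligned = rng.normal(size=(64, 8))            # same distribution
y1_shifted = rng.normal(size=(64, 8)) + 3.0      # shifted distribution

print("MMD, aligned domains:", mmd_loss(y10_latent, y1_aligned))
print("MMD, shifted domains:", mmd_loss(y10_latent, y1_shifted))
print("total loss example:  ", total_loss(0.7, mmd_loss(y10_latent, y1_shifted), lam=0.05))
```

The shifted pair yields the larger MMD because the cross-similarity terms shrink, which is the quantity that training with domain adaptation pushes back up.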

During training, the CE loss requires labeled images. In our experiments (both regular training without domain adaptation and with domain adaptation), networks are trained using our baseline low-noise Y10 images and labels. On the other hand, the MMD loss uses only latent space image embeddings from both Y10 and Y1 datasets and does not require labels. This feature of the MMD loss is particularly valuable in cases when one of the datasets is unlabeled.

3.3 Hyperparameters and training

We use the Adam optimizer (Kingma and Ba, 2014) with a weight decay (L2 penalty) that is set separately for regular training and for domain adaptation training. All our experiments share the same initial learning rate. We use fixed batch sizes of 128 during training and 64 during validation and testing. The training length is set to 100 epochs, but we use early stopping to prevent overfitting. When using domain adaptation, we set the weight of the MMD loss term through experimentation with various values. When shuffling images and initializing network weights for training, we ensured consistency of results by setting one fixed random seed for all experiments. Training was performed on an Nvidia Tesla V100 GPU in Google Cloud.

4 Assessing Model Robustness

We assess the robustness of trained neural networks when they are presented with data that has been perturbed. First, we employ the simple network classification accuracy and other standard performance metrics. Next, we study the distributions of the distances between original and perturbed data in the latent space. Then, we visualize the trained latent space using two techniques: church window plots that show specific directions in the latent space; and isomaps that show lower-dimensional projections of the latent space.

4.1 Distance metrics

Perturbations to images move their positions within the network’s trained latent space, which can cause an object to cross a decision boundary from the region corresponding to the correct class to a region corresponding to the incorrect class. If a method increases the model robustness to perturbations, crossing the decision boundary and entering the wrong class region will require the image to move further from its origin. In other words, the region of the wrong class will become further away from correctly classified images.

We select a 150-image sub-sample of our test dataset on which to apply one-pixel perturbations. This sample is large enough for statistically significant characterization of distances between the perturbed and unperturbed data distributions (see Figure 6), and it is small enough to generate a one-pixel attack and run visual inspection on all the images. We then choose the images that were successfully flipped for both regular and domain adaptation training, which amounts to 136 for ResNet18.

We use two distance metrics to quantify the sensitivity of our models to perturbations and compare latent spaces of models trained without (regular training) and with domain adaptation. First, for each image in the baseline dataset and its perturbed counterpart, we calculate the Euclidean distance between the latent space positions of the baseline and the perturbed images. Next, we calculate the Jensen-Shannon (JS) distance (Lin, 1991) between the distributions of Euclidean distances for models trained using regular training and training with domain adaptation. The JS distance is the square root of the JS divergence, which is a measure of similarity between two probability distributions (Lin, 1991).
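Both metrics can be sketched in a few lines of NumPy. Estimating the two distance distributions with shared histogram bins, the number of bins, and the base-2 logarithm are assumptions for illustration; the latent vectors below are random stand-ins.

```python
import numpy as np

def latent_distances(base, perturbed):
    """Euclidean distance between each baseline embedding and its perturbed pair."""
    return np.linalg.norm(perturbed - base, axis=1)

def js_distance(sample_a, sample_b, bins=20):
    """Jensen-Shannon distance between two 1D samples via shared histograms."""
    lo = min(sample_a.min(), sample_b.min())
    hi = max(sample_a.max(), sample_b.max())
    p, _ = np.histogram(sample_a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(sample_b, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(u, v):
        mask = u > 0                      # 0 * log(0) terms contribute nothing
        return np.sum(u[mask] * np.log2(u[mask] / v[mask]))

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return float(np.sqrt(max(divergence, 0.0)))  # distance = sqrt(divergence)

rng = np.random.default_rng(0)
base = rng.normal(size=(136, 256))                       # toy baseline embeddings
pert_regular = base + 0.5 * rng.normal(size=base.shape)  # stand-in: regular training
pert_da = base + 1.0 * rng.normal(size=base.shape)       # stand-in: with adaptation

d_regular = latent_distances(base, pert_regular)
d_da = latent_distances(base, pert_da)
print("JS distance between distance distributions:", js_distance(d_regular, d_da))
```

With a base-2 logarithm the JS distance lies between 0 (identical distributions) and 1 (non-overlapping distributions), which makes the two training procedures easy to compare.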

4.2 Perturbation direction: Church window plots

We also seek to investigate how a perturbation in an image affects its latent space representation and thus classification. Church window plots, named after the often-colorful stained glass windows, visualize the latent space regions for classes in the proximity of a given image (Goodfellow et al., 2014; Warde-Farley and Goodfellow, 2017; Ford et al., 2019).

First, in a plot, we place the latent space embedding of the unperturbed baseline image at the origin. Then, we subtract the unperturbed image’s latent embedding from that of the perturbed image, yielding the latent space representation of the perturbation vector. We chose to orient the plane such that the horizontal axis lies along the one-pixel perturbation direction, and the vertical axis lies along the noisy direction; in principle any perturbation direction can be chosen. In our plots, we take a slice of the entire latent space, motivated by the desire to visualize the model behavior in the direction of perturbations we chose for basis vectors.

Next, the perturbation vectors are discretized into small steps in each direction. All possible combinations of these perturbations are added to the baseline image embedding to create new perturbed image embeddings. These new embeddings are then passed into a truncated network consisting of only the dense layers of our original trained model: a 256-dimensional layer and an output layer with three neurons. This truncated network necessarily shares the same weights as the dense layers of the original network, and outputs the classification result of the given perturbed image embedding. This classification determines the color of that pixel on the plot.
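The grid construction can be sketched as follows. The dense head here is a hypothetical random linear layer rather than trained weights, and the axis range of -1 to 1 in units of each perturbation vector and the 21 steps per axis are assumed choices.

```python
import numpy as np

rng = np.random.default_rng(0)
W_out = rng.normal(size=(3, 256))   # hypothetical output-layer weights
b_out = np.zeros(3)

def dense_head(z):
    """Stand-in for the truncated network: 256-d embedding -> 3 class scores."""
    return W_out @ z + b_out

def church_window(base, dir_x, dir_y, head, steps=21, extent=1.0):
    """Classify every combination of steps along two perturbation directions."""
    coeffs = np.linspace(-extent, extent, steps)
    classes = np.empty((steps, steps), dtype=int)
    for i, cy in enumerate(coeffs):          # vertical axis: noisy direction
        for j, cx in enumerate(coeffs):      # horizontal axis: one-pixel direction
            classes[i, j] = int(np.argmax(head(base + cx * dir_x + cy * dir_y)))
    return classes

base = rng.normal(size=256)                          # baseline embedding
one_pixel_embedding = base + rng.normal(size=256)    # toy perturbed embeddings
noisy_embedding = base + rng.normal(size=256)

window = church_window(base,
                       one_pixel_embedding - base,   # one-pixel perturbation vector
                       noisy_embedding - base,       # noise perturbation vector
                       dense_head)
print(window)
```

Each entry of `window` is a class index that would be mapped to a color; the central entry corresponds to the unperturbed baseline image.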

A church window plot shows relative distance; each axis is normalized based on the latent space representation for that image. Therefore, it is difficult to use such plots to compare church window representations for different images. We deviate slightly from traditional church window plot applications, e.g. Warde-Farley and Goodfellow (2017), wherein the authors oriented the horizontal axis with the adversarial (perturbation) direction, while the other axis is calculated to be orthonormal; we instead have two perturbation directions. Also, traditionally, the color white is used to designate the correct class.

4.3 Low-dimensional projections with isomaps

Next, we project our high-dimensional latent spaces to two and three dimensions, which is nontrivial. Linear projections, such as those generated by Principal Component Analysis (PCA; Pearson, 1901), often miss important non-linear structures in the data. Alternatively, manifold learning respects non-linear data patterns; some example algorithms are t-distributed stochastic neighbor embedding (tSNE; van der Maaten and Hinton, 2008), locally linear embedding (Roweis and Saul, 2000), and the isomap (Tenenbaum et al., 2000).

In this work, we use the isomap, which is a lower-dimensional embedding of a network latent space, such that geodesic distances in the original higher-dimensional space are also respected in the lower-dimensional space. The isomap-generation algorithm has three major stages. First, a weighted neighborhood graph G over all data points is constructed, either by connecting all neighboring points that are within some chosen radius or by connecting each data point to its nearest neighbors. We used the scikit-learn implementation of isomaps, which has an “auto” option that instructs the algorithm to select the optimal method for graph construction (Pedregosa et al., 2011). Within graph G, the edge weights are set to the distances between neighboring points. Next, the geodesic distances between all pairs of points on the manifold are estimated as their shortest-path distances in the graph G. Finally, the lower d-dimensional embedding that best preserves the manifold’s estimated intrinsic geometry (a low-dimensional representation of the data in which distances closely match those in the original high-dimensional space) is produced by applying classical Metric Multidimensional Scaling (MDS; Borg and Groenen, 2005) to the matrix of shortest-path graph distances. Most commonly, d is 2 or 3, to facilitate visualization.
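The three stages described above can be sketched with scikit-learn's Isomap class. This is only an illustrative sketch: the toy latent vectors (a curved 2D sheet embedded in 256 dimensions), the choice of 10 neighbors, and all variable names are assumptions and are not taken from the paper. The "auto" values shown are scikit-learn's defaults; which setting the authors used is not specified here.

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)

# Toy stand-in for network latent vectors (assumption, not the paper's data):
# a curved 2D sheet embedded linearly in a 256-dimensional space.
n_points, latent_dim = 250, 256
t = rng.uniform(1.5 * np.pi, 4.5 * np.pi, n_points)
height = rng.uniform(0.0, 10.0, n_points)
sheet = np.column_stack([t * np.cos(t), height, t * np.sin(t)])
latent = sheet @ rng.normal(size=(3, latent_dim))

# Stage 1: build a neighborhood graph (here from the 10 nearest neighbors).
# Stage 2: estimate geodesic distances as shortest paths in that graph.
# Stage 3: embed the shortest-path distance matrix in d = 2 dimensions.
iso = Isomap(
    n_neighbors=10,
    n_components=2,
    neighbors_algorithm="auto",
    path_method="auto",
)
embedding_2d = iso.fit_transform(latent)

# Matrix of estimated geodesic distances between all pairs of points.
geodesic = iso.dist_matrix_
print(embedding_2d.shape, geodesic.shape)
```

Setting n_components=3 instead would give the kind of 3D projection used for rotating views.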

5 Results

We assess the performance and robustness of the two deep learning models, each trained without (regular training) and with domain adaptation. We train on data in one domain (Y10) and test on data that has been perturbed in one of two ways: with a one-pixel attack (1P) representing data processing errors, or with inclusion of higher observational noise (Y1).

5.1 Classification accuracy and other network performance metrics

We focus here on ResNet18; the setup and results for the simpler ConvNet model are discussed in Appendix A. While the more complex ResNet18 network achieved higher classification accuracy, we find that both models respond similarly to image perturbations and to domain adaptation during training.

Without domain adaptation, the ResNet18 model achieves an accuracy of () when tested on Y10 (Y1) images. When we use domain adaptation, the accuracy is () when tested on Y10 (Y1) images. Using domain adaptation prevented the more complex model from overfitting, which helps increase the accuracy in the final epoch on the baseline Y10 images; this was also observed in Ćiprijanović et al. (2021). Domain adaptation also helped increase the accuracy on noisy Y1 data by . In Table 1, we report the accuracy, as well as weighted precision, recall, and F1 score for ResNet18 regular training and training with domain adaptation. Weighted metrics are calculated for each of the three class labels, and then their average is found and weighted by the number of true instances for each label.

Training Metric Y10 Y1
Reg Accuracy
Precision
Recall
F1 Score
DA Accuracy
Precision
Recall
F1 Score
Table 1: Performance metrics for ResNet18 on Y10 and Y1 test data for regular training (top row) and training with domain adaptation (bottom row). The table shows the accuracy and weighted precision, recall, and F1 scores. Domain adaptation increases performance in all metrics for both Y10 and Y1 data.
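The weighted averaging used for the precision, recall, and F1 score can be illustrated with a small sketch. The labels below are toy values, not results from the paper, and the use of scikit-learn's average="weighted" option is an assumption about tooling; it computes each metric per class and averages them weighted by the number of true instances of each class.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy labels (0 = spiral, 1 = elliptical, 2 = merger); not data from the paper.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])
y_pred = np.array([0, 0, 1, 0, 1, 1, 2, 2, 0, 0])

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)

# The same weighted precision by hand: per-class precision, then an average
# weighted by the number of true instances (support) of each class.
# Every class is predicted at least once in this toy example.
classes, support = np.unique(y_true, return_counts=True)
per_class_precision = np.array(
    [np.mean(y_true[y_pred == c] == c) for c in classes]
)
manual_precision = np.sum(per_class_precision * support) / support.sum()

print(accuracy, precision, recall, f1, manual_precision)
```

Note that with this weighting the weighted recall always equals the overall accuracy, so the two rows of a table like Table 1 carry the same information for that metric.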

5.2 Case studies: Latent space visualizations of perturbed data

Next, we investigate the classification of a single spiral galaxy image in three forms—the baseline and the two perturbations—by visualizing the network latent space representation of each form. Figure 3 presents church window plots and 2D isomaps of latent space representations given a ResNet18 network with regular training (top) and with DA training (bottom). The church window panel shows that with regular training, the one-pixel attack moved the latent space representation into the elliptical region, while the noise moved the representation to the merger region.

The isomap panel shows a 2D projection of the latent space representation for 250 randomly selected objects in our test dataset, as well as the three forms of our single example galaxy: Y10 (“”), Y1 (star), 1P (triangle). The filled (empty) circles show the Y10 (Y1) latent representation of the randomly selected galaxies (we pick the same galaxies from both Y10 and Y1 data). For the baseline Y10 dataset, which the model was trained on, examples are clearly separated into three classes—spiral (orange), elliptical (violet), and merger (navy blue)—for both regular and domain adaptation training.

In the case of regular training, most of the Y1 data is incorrectly classified as mergers (empty navy blue circles). Using domain adaptation training produces a clear class separation in the Y1 data as well, leading to good overlap between the Y1 and Y10 classes and a common decision boundary, as we can see on the isomap in the bottom row. Both the church window plot and the corresponding isomap show that with domain adaptation the example Y1 image from the triplet is correctly classified as a spiral galaxy. The one-pixel attack still manages to flip the Y10 image to elliptical, but because the incorrect class region is now further away, more iterations of differential evolution were needed (see Section 5.3 for details).

To understand how the data moves in the latent space after domain adaptation is employed, Figure 4 shows illustrative 3D isomaps of Y10 and Y1 test data. (To further illustrate the differences between the Y10 and Y1 latent data distributions, we show videos of rotating 3D isomaps, made from 50 randomly chosen test set images due to memory constraints, as supplementary online material, as well as on our GitHub page.) Here, we can clearly see that without domain adaptation (top row), the noisy Y1 data does not overlap with the Y10 data. In fact, the Y1 data is concentrated in a small region of the plot. On the other hand, with the inclusion of domain adaptation (bottom row), the Y10 and Y1 data distributions follow the same trend and overlap quite well.


Figure 3: Church window plots (left) and isomaps (right) of an example triplet of a baseline Y10 image (“”) with each of the two perturbations: noisy Y1 (star) and one-pixel attack (triangle). On the top isomap, the star symbol is plotted with a cyan border to make it more visible, since it is located in a region with many other navy blue points. The top row of images corresponds to the model trained without domain adaptation (regular training), while the bottom row shows the same triplet of images for a model trained with domain adaptation. In all plots, classification into spiral galaxies is shown in orange, elliptical in violet, and merger in navy blue. Isomaps are constructed from 250 randomly selected images from our test set, with Y10 images shown as filled circles, and Y1 as empty circles. Each church window plot is labeled with the true class of the Y10 image and the class targeted by the one-pixel attack: “true class targeted class”. Most noisy Y1 images are incorrectly classified as mergers when training without domain adaptation, and thus most Y1 points in the top isomap are navy blue. Adding domain adaptation improves the overlap between the Y10 and Y1 data distributions, which leads to correct classification of all three classes in both the Y10 and Y1 datasets.

Figure 5 shows additional ResNet18 church window plots for several examples from the 136-image test sub-sample, for which we performed the one-pixel attack. The top and bottom rows show the same examples for regular and domain adaptation training, respectively. The first row shows examples of baseline Y10 images that were correctly classified and then successfully flipped to a different class with a one-pixel attack. As we can see in the top row of images, the one-pixel perturbed, the noisy Y1, and the baseline Y10 image can belong to three different classes. Domain adaptation leads to more robustness and higher output probabilities, which means that the one-pixel attack needs to move the image further in the latent space in order to reach the region of the wrong class, so more iterations of differential evolution are needed to find such a pixel. We limit the differential evolution procedure, which seeks the adversarial pixel, to 80 iterations. Within the maximum number of iterations only 136 images (out of the 150-image test set sub-sample) were successfully flipped to the wrong class after the inclusion of domain adaptation. Domain adaptation helps distinguish between classes in the noisy Y1 domain, so the church window plots often only show two of the three possible class regions.
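The search for an adversarial pixel with a capped number of differential evolution iterations can be sketched as follows. This is a simplified illustration and not the paper's attack code: the toy linear softmax classifier, the 8×8 image, the population size, and the untargeted objective (lowering the probability of the originally predicted class) are all assumptions. Only the 80-iteration limit comes from the text. The sketch uses SciPy's differential_evolution optimizer.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
size, n_classes = 8, 3

# Toy channel-first image and toy linear classifier (assumptions).
image = rng.uniform(0.0, 1.0, size=(3, size, size))
weights = rng.normal(size=(n_classes, image.size))


def predict_proba(img):
    """Softmax class probabilities of the toy classifier."""
    logits = weights @ img.ravel()
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


def apply_pixel(params, img):
    """Return a copy of img with one pixel (x, y) set to the given RGB value."""
    x, y = int(round(params[0])), int(round(params[1]))
    perturbed = img.copy()
    perturbed[:, y, x] = params[2:5]
    return perturbed


true_class = int(np.argmax(predict_proba(image)))


def objective(params):
    # Probability of the original class after the one-pixel change;
    # the optimizer tries to make this as small as possible.
    return predict_proba(apply_pixel(params, image))[true_class]


# Search over pixel position (x, y) and pixel value (r, g, b).
bounds = [(0, size - 1), (0, size - 1), (0, 1), (0, 1), (0, 1)]
result = differential_evolution(
    objective, bounds, maxiter=80, popsize=20, seed=0, polish=False
)

adversarial = apply_pixel(result.x, image)
flipped = int(np.argmax(predict_proba(adversarial))) != true_class
print(result.nit, result.fun, flipped)
```

Whether the toy image actually flips depends on the random toy weights. A more robust model leaves a larger margin to overcome, so the search is more likely to exhaust its iteration budget without finding a successful pixel.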

Figure 4: 3D isomap projections of the 256-dimensional latent space, derived from 200 randomly sampled data points from test sets, in four camera orientations (regular training in top row, domain adaptation in bottom row). The baseline Y10 images are connected with an orange plane, while the noisy Y1 images are connected with a blue plane. Domain adaptation helps increase the overlap of the Y10 and Y1 distributions.

Figure 5: Example church window plots for ResNet18. The top (bottom) row shows three examples for the model using regular (domain adaptation) training. We selected examples for which the Y10 image was correctly classified after training, and for which the one-pixel attack is successful. Each church window plot title includes the true class of the baseline Y10 image and the targeted incorrect class for the pixel-attacked image: “true class targeted class.”

Figure 6: Distributions of Euclidean distances between Y10 and noisier Y1 images (left), and between Y10 and one-pixel (1P) perturbed images (right), for the ResNet18 model. Distances for the regular training are plotted in navy blue, with mean values (given in Table 2) plotted as dashed lines. Distances for the model trained with domain adaptation are plotted in orange, with means plotted as solid lines. On each plot we also report the JS distance as a measure of the difference between the two distributions. Domain adaptation increases the distances to all types of perturbations, an indicator of increased model robustness.

5.3 Distance metrics: Characterizing network robustness with latent space distances

We use Euclidean distances between baseline Y10 images and their perturbed counterparts in the network latent space to assess model robustness.

First, we estimate the mean and standard errors of the distribution of distances between baseline Y10 images and their perturbed counterparts in the latent space of the ResNet18 model in Table 2; these distance metrics are then used as a measure of robustness of the networks. For both types of perturbations (noisy and one-pixel attack), we observe that domain adaptation increases the distance between the baseline and perturbed images.
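A minimal sketch of how such a mean distance and its standard error could be computed from paired latent vectors is given below. The latent vectors here are random stand-ins, and the array names and noise scale are assumptions; only the sub-sample size (136 images) and the latent dimension (256) follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, latent_dim = 136, 256

# Random stand-ins for latent vectors of baseline images and of their
# perturbed counterparts (row i of each array belongs to the same galaxy).
latent_baseline = rng.normal(size=(n_images, latent_dim))
latent_perturbed = latent_baseline + rng.normal(
    scale=0.1, size=(n_images, latent_dim)
)

# Euclidean distance between each baseline image and its perturbed version.
distances = np.linalg.norm(latent_perturbed - latent_baseline, axis=1)

mean_distance = distances.mean()
# Standard error of the mean, using the sample standard deviation.
standard_error = distances.std(ddof=1) / np.sqrt(n_images)

print(mean_distance, standard_error)
```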

For Y10-Y1 distances, this happens because domain adaptation allows the Y1 data to align with all three classes and to be correctly classified, instead of being concentrated in one region with no class distinction. Furthermore, the mean distance also increases between Y10 and the one-pixel perturbed images (see Figure 6 and Table 2). With domain adaptation, images that were successfully flipped to the wrong class needed to move farther to cross the class boundary and end up in the wrong class. For the better performing ResNet18 model, this mean distance increases by a factor of about 2.3, which means that the one-pixel attack is less likely to work.

We also compare the distributions of Euclidean distances between baseline and perturbed images. First, we normalize these distributions to sum to 1; we illustrate the ResNet18 distributions in Figure 6. We then calculate the JS distance between the regular and domain adaptation distributions of Euclidean distances. As with the distribution means, the success of domain adaptation in increasing the distance to the one-pixel perturbed images for ResNet18 results in the larger JS distance of .
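The comparison of two normalized distance distributions can be sketched as follows. The distance samples are synthetic, and the binning and the use of natural logarithms are assumptions (the logarithm base only rescales the result). The JS distance is taken here as the square root of the Jensen-Shannon divergence.

```python
import numpy as np


def js_distance(p, q):
    """Jensen-Shannon distance between two histograms (natural log)."""
    p = np.asarray(p, dtype=float) / np.sum(p)  # normalize to sum to 1
    q = np.asarray(q, dtype=float) / np.sum(q)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return np.sqrt(max(divergence, 0.0))


rng = np.random.default_rng(0)
# Synthetic stand-ins for latent space distances under the two trainings.
dist_regular = rng.gamma(shape=4.0, scale=1.0, size=136)
dist_da = rng.gamma(shape=4.0, scale=2.0, size=136)

# Histogram both samples on a common set of bins before comparing them.
upper = max(dist_regular.max(), dist_da.max())
bins = np.linspace(0.0, upper, 31)
hist_regular, _ = np.histogram(dist_regular, bins=bins)
hist_da, _ = np.histogram(dist_da, bins=bins)

js = js_distance(hist_regular, hist_da)
print(js)
```

With natural logarithms the distance lies between 0 (identical histograms) and the square root of ln 2 (histograms with no overlap).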

Because the latent spaces of the two networks are different, we cannot directly compare the distance metrics for the ConvNet and ResNet18 models. For this reason, we look at the overall behavior and the changes introduced by domain adaptation, in combination with church window plots and isomaps, to get a better understanding of the effects of image perturbations on the model performance. These results show the power of domain adaptation as a tool for increasing model robustness.

Perturbation Training
Y10 – Y1 Reg
DA
Y10 – 1P Reg
DA
Table 2: Means and standard errors of Euclidean distances in the latent space of ResNet18 for the 136-image sub-sample of our test set of images. Domain adaptation increases the mean distance to the one-pixel perturbed image, making the model more robust against this kind of attack (we also plot histograms of all distances in Figure 6).

6 Discussion and Conclusion

In this paper, we explored how data perturbations that arise from astronomical processing and analysis pipelines can degrade the capacity of deep learning models to classify objects. We then explored the efficacy of particular visualization techniques (church window plots and isomaps) in assessing model behavior and robustness in these classification tasks. Finally, we tested the use of domain adaptation for mitigating model performance degradation.

Our work focuses on the effects of two types of perturbations: observational noise and image processing error (represented by the one-pixel attack). We demonstrated that the performance of standard deep learning models can be significantly degraded by changing a single pixel in the image. Additionally, images with different noise levels (even if the noise model is the same) are also incorrectly classified if the model is only trained on one of the noise realizations. Even larger discrepancies between data distributions can arise if the two datasets include noise that cannot be described with the same model.

We illustrated how training on multiple datasets with the inclusion of domain adaptation leads to extraction of more robust features that can substantially improve performance on both datasets (Y1 and Y10). Furthermore, the added benefit of this type of training is increased robustness to inadvertent one-pixel (1P) perturbations that can arise in astronomical data pipelines. In other words, older high-noise (Y1) data can be used in combination with domain adaptation during training to increase performance and robustness of models intended to work with newer low-noise data (Y10). We showed that the inclusion of domain adaptation during the training of ResNet18 increases the classification accuracy for Y10 data by , while the accuracy in the noisy Y1 domain (which could not be classified at all without domain adaptation) increases by . Furthermore, to successfully flip an image to the wrong class using the one-pixel attack, the image needs to move times further in the neural network’s latent space after the inclusion of domain adaptation.

Domain adaptation methods can help bring discrepant data distributions closer together even if the differences between the datasets are quite large. Still, the best results are achieved when the datasets are preprocessed to be as similar as possible and include a large number of images for training. This is particularly important when one of the datasets contains simulated images, since one can work to make the simulations as realistic as possible, closer to the real data that the model is intended to be used on.

Even though MMD has proven to be very successful in bridging the gap between astronomical datasets, it should not be used for very complex problems. MMD is a method that is not class-aware and hence tries to align entire data distributions. This property can be problematic when one of the datasets contains a new or unknown class that should not be aligned with the other domain. Our future work will focus on leveraging more sophisticated class-aware DA methods such as Contrastive Adaptation Networks (CAN; Kang et al., 2019), or Domain Adaptive Neighborhood Clustering via Entropy optimization (DANCE; Saito et al., 2020), which can successfully perform domain alignment in more complex experiments.
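To make the "not class-aware" property concrete, the following sketch computes a simple (biased) estimate of the squared MMD between two sets of features. No class labels enter the computation, so whole distributions are compared. The Gaussian kernel, the fixed bandwidth, and the toy features are assumptions for illustration and need not match the MMD implementation used in the paper.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd2(source, target, bandwidth=2.0):
    """Biased estimate of the squared MMD; uses features only, no labels."""
    k_ss = gaussian_kernel(source, source, bandwidth).mean()
    k_tt = gaussian_kernel(target, target, bandwidth).mean()
    k_st = gaussian_kernel(source, target, bandwidth).mean()
    return k_ss + k_tt - 2.0 * k_st


rng = np.random.default_rng(0)
source = rng.normal(size=(100, 8))                # toy source-domain features
target_same = rng.normal(size=(100, 8))           # same distribution
target_shifted = rng.normal(size=(100, 8)) + 2.0  # shifted distribution

mmd_same = mmd2(source, target_same)
mmd_shifted = mmd2(source, target_shifted)
print(mmd_same, mmd_shifted)
```

Because the estimate depends only on the pooled features of each domain, a class present in just one domain would still be pulled toward the other domain when this quantity is minimized, which is the limitation discussed above.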

Although we adopted a generic ResNet18 to carry out our experiments with domain adaptation for robustness, other adversarial robustness approaches, such as those based on architecture improvement, data augmentation, and probabilistic modeling, can be used alongside domain adaptation. This is a future direction we will pursue.

In astronomy, new insights about astrophysical objects often come from our ability to simultaneously learn from multiple datasets: simulated and observed, observations from different telescopes and at different wavelengths, or using the same observations but with different observing times. Domain adaptation techniques are ideally suited in cases where deep learning models need to work in multiple domains and can even work when one of the domains is unlabeled. In scientific applications, where data perturbations are typically not targeted, but rather occur naturally, using domain adaptation can simultaneously help a) increase robustness to these small perturbations and b) realize the gains when information comes from multiple datasets. Future developments and implementations of adversarial robustness and domain adaptation methods in astronomical pipelines will open doors for many more uses of deep learning models.

Acknowledgments

This manuscript has been supported by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy (DOE), Office of Science, Office of High Energy Physics. This research has been partially supported by the High Velocity Artificial Intelligence grant as part of the DOE High Energy Physics Computational HEP program. This research has been partially supported by the DOE Office of Science, Office of Advanced Scientific Computing Research, applied mathematics and SciDAC programs under Contract No. DE-AC02-06CH11357. This research used resources of the Argonne Leadership Computing Facility at Argonne National Laboratory, which is a user facility supported by the DOE Office of Science.

The authors of this paper have committed themselves to performing this work in an equitable, inclusive, and just environment, and we hold ourselves accountable, believing that the best science is contingent on a good research environment. We acknowledge the Deep Skies Lab as a community of multi-domain experts and collaborators who have facilitated an environment of open discussion, idea-generation, and collaboration. This community was important for the development of this project.

We are very thankful to Nic Ford for useful discussion regarding church window plots.

Author Contributions

A. Ćiprijanović: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Visualization, Writing of original draft; D. Kafkes: Formal analysis, Investigation, Methodology, Resources, Software, Visualization, Writing of original draft; S. Madireddy: Conceptualization, Methodology, Resources, Software, Supervision, Writing (review & editing); B. Nord: Conceptualization, Methodology, Supervision, Writing (review & editing); K. Pedro: Conceptualization, Methodology, Project administration, Resources, Software, Supervision, Writing (review & editing); G. N. Perdue: Conceptualization, Methodology, Project administration, Resources, Software, Supervision, Writing (review & editing); F. J. Sánchez: Data curation, Methodology, Writing (review & editing); G. F. Snyder: Conceptualization, Data curation, Methodology, Writing (review & editing); S. M. Wild: Conceptualization, Methodology, Writing (review & editing).

Data and Code Availability

All simulated LSST images are available on Zenodo: https://doi.org/10.5281/zenodo.5514180. For code that was used to perform the experiments in this paper see our GitHub repository: https://github.com/AleksCipri/DeepAdversaries.

Appendix A ConvNet Results

Our simpler ConvNet model architecture is presented in Table 5.

ConvNet reaches slightly lower accuracies compared to ResNet18, but exhibits similar behavior when trained with and without domain adaptation. With the regular training, the one-pixel attack more easily flips the image to an incorrect class, and most of the noisy Y1 images are incorrectly classified. When domain adaptation is employed, noisy images are classified correctly, and successful one-pixel attacks are harder to find (more iterations of differential evolution are needed and the successfully attacked images are further away from the baseline Y10 image). Table 3 provides detailed metrics for the performance of ConvNet on the Y10 and Y1 test data.

Furthermore, in Table 4 we give means and standard errors of Euclidean distances between baseline Y10 images and noisy Y1 or one-pixel perturbed (1P) images, calculated for the 134-image sub-sample of the test set of images (images that were successfully flipped for both regular and domain adaptation training). In Figure 7, we plot distributions of these Euclidean distances and give the JS distance as a measure of the difference between the regular and domain adaptation distributions. Similar to ResNet18, the simpler ConvNet also exhibits improved robustness when trained with domain adaptation, which is reflected in the larger Euclidean distances between baseline Y10 images and their perturbed counterparts and the increased classification accuracy on noisy Y1 data.

Distances in spaces of different dimensionality cannot be compared directly, because objects become more distant as the dimensionality grows; we therefore require that both ResNet18 and the simpler ConvNet have the same 256-dimensional latent space. Consequently, any different behavior of data points in this space can be attributed to the different features the two networks find important and exploit to build their latent spaces. It is important to keep in mind that these differences are a consequence of the vastly different model sizes (number of tunable parameters), as well as the type of model (one a regular CNN and the other containing residual blocks).

Training Metric Y10 Y1
Reg Accuracy
Precision
Recall
F1 score
DA Accuracy
Precision
Recall
F1 score
Table 3: Performance metrics for ConvNet on Y10 and Y1 test data for regular training (top row) and training with domain adaptation (bottom row). The table shows the accuracy and weighted precision, recall, and F1 scores.
Perturbation Training
Y10 – Y1 Reg
DA
Y10 – 1P Reg
DA
Table 4: Means and standard errors of Euclidean distances in the ConvNet latent space. Values are calculated for the 134-image sub-sample of our test set of images. Domain adaptation increases the mean distance to the one-pixel perturbed image, making the model more robust against this kind of attack (we also plot histograms of all distances in Figure 7).
Layers Properties Stride Padding Output Shape Parameters
Input (we use the “channel first” image data format) (3, 100, 100) 0
Convolution (2D) Filters: 8 (8, 100, 100) 608
Kernel:
Activation: ReLU
Batch Normalization (8, 100, 100) 16
MaxPooling Kernel: 0 (8, 50, 50) 0
Convolution (2D) Filters: 16 (16, 50, 50) 1168
Kernel:
Activation: ReLU
Batch Normalization (16, 50, 50) 32
MaxPooling Kernel: 0 (16, 25, 25) 0
Convolution (2D) Filters: 32 1 (32, 25, 25) 4640
Kernel:
Activation: ReLU
Batch Normalization (32, 25, 25) 64
MaxPooling Kernel: 0 (32, 12, 12) 0
Flatten (4608)
Bottleneck (256) 1179904
Fully connected Activation: Softmax (3) 771
Total number of trainable parameters:
Table 5: The architecture of the ConvNet CNN used in this paper.
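The parameter counts listed in Table 5 can be cross-checked with a short calculation. The kernel sizes used below (5×5 for the first convolution, 3×3 for the other two) are not read from the table, whose kernel entries are missing in this copy; they are inferred as the sizes that reproduce the listed parameter counts, and should be treated as an assumption. The helper function names are hypothetical.

```python
def conv2d_params(in_channels, filters, kernel):
    """Weights plus one bias per filter for a square 2D convolution."""
    return filters * (in_channels * kernel * kernel + 1)


def batchnorm_params(channels):
    """Trainable scale and shift per channel."""
    return 2 * channels


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_out * (n_in + 1)


flattened = 32 * 12 * 12  # output of the last pooling layer, (32, 12, 12)

layer_params = {
    "conv1": conv2d_params(3, 8, kernel=5),      # table lists 608
    "bn1": batchnorm_params(8),                  # table lists 16
    "conv2": conv2d_params(8, 16, kernel=3),     # table lists 1168
    "bn2": batchnorm_params(16),                 # table lists 32
    "conv3": conv2d_params(16, 32, kernel=3),    # table lists 4640
    "bn3": batchnorm_params(32),                 # table lists 64
    "bottleneck": dense_params(flattened, 256),  # table lists 1179904
    "fully_connected": dense_params(256, 3),     # table lists 771
}

total_params = sum(layer_params.values())
print(layer_params)
print(total_params)
```

The total printed at the end is simply the sum of the Parameters column under these assumptions.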

Figure 7: Distributions of Euclidean distances between baseline Y10 and noisy Y1 images (left), and between Y10 and one-pixel (1P) perturbed images (right), for the ConvNet model. Distances for the regular training are plotted in navy blue, with mean values (see Table 4) plotted as dashed lines. Distances for the model trained with domain adaptation are plotted in orange, with means plotted as solid lines. Each plot also contains the JS distance as a measure of the difference between the two distributions. Domain adaptation increases the distances between baseline and the two types of perturbations, indicating an increase in model robustness.

References

  • M. Abbasi and C. Gagné (2017) Robustness to adversarial examples through an ensemble of specialists. arXiv 1702.06856. Cited by: §1.
  • D. Agarwal, K. Aggarwal, S. Burke-Spolaor, D. R. Lorimer, and N. Garver-Daniels (2020) FETCH: A deep-learning based classifier for fast transient classification. MNRAS 497 (2), pp. 1661–1674. External Links: Document, 1902.06343 Cited by: §1.
  • H. Aihara, N. Arimoto, R. Armstrong, S. Arnouts, N. A. Bahcall, S. Bickerton, J. Bosch, K. Bundy, and et al. (2018) The Hyper Suprime-Cam SSP Survey: Overview and survey design. PASJ 70, pp. S4. External Links: Document, 1704.05858 Cited by: §1.
  • A. F. Alba Hernandez (2019) Sky Surveys Scheduling Using Reinforcement Learning. Master’s Thesis, Northern Illinois University. Cited by: §1.
  • A. K. Aniyan and K. Thorat (2017) Classifying Radio Galaxies with the Convolutional Neural Network. ApJS 230 (2), pp. 20. External Links: Document, 1705.03413 Cited by: §1.
  • R. W. Bickley, C. Bottrell, M. H. Hani, S. L. Ellison, H. Teimoorinia, K. M. Yi, S. Wilkinson, S. Gwyn, and M. J. Hudson (2021) Convolutional neural network identification of galaxy post-mergers in UNIONS using IllustrisTNG. MNRAS 504 (1), pp. 372–392. External Links: Document, 2103.09367 Cited by: §1.
  • I. Borg and P. Groenen (2005) Modern multidimensional scaling: theory and applications (springer series in statistics). External Links: Document Cited by: §4.3.
  • J. Bradshaw, A. G. d. G. Matthews, and Z. Ghahramani (2017) Adversarial examples, uncertainty, and transfer testing robustness in Gaussian process hybrid deep networks. arXiv e-prints, pp. . External Links: 1707.02476 Cited by: §1.
  • P. Chen, H. Zhang, Y. Sharma, J. Yi, and C. Hsieh (2017) ZOO: zeroth order optimization based black-box attacks to deep neural networks without training substitute models. arXiv e-prints, pp. . External Links: 1708.03999 Cited by: §1.
  • T. Cheng, C. J. Conselice, A. Aragón-Salamanca, M. Aguena, S. Allam, F. Andrade-Oliveira, J. Annis, A. F. L. Bluck, and et al. (2021) Galaxy morphological classification catalogue of the Dark Energy Survey Year 3 data with convolutional neural networks. MNRAS 507 (3), pp. 4425–4444. External Links: Document, 2107.10210 Cited by: §1.
  • A. Ćiprijanović, D. Kafkes, K. Downey, S. Jenkins, G. N. Perdue, S. Madireddy, T. Johnston, G. F. Snyder, and B. Nord (2021) DeepMerge - II. Building robust deep learning algorithms for merging galaxy identification across domains. MNRAS 506 (1), pp. 677–691. External Links: Document, 2103.01373 Cited by: §1, §3.2, §3.2, §5.1.
  • A. Ćiprijanović, D. Kafkes, S. Jenkins, K. Downey, G. N. Perdue, S. Madireddy, T. Johnston, and B. Nord (2020) Domain adaptation techniques for improved cross-domain study of galaxy mergers. arXiv e-prints, pp. arXiv:2011.03591. External Links: 2011.03591 Cited by: §1.
  • A. Ćiprijanović, G. F. Snyder, B. Nord, and J. E. G. Peek (2020) DeepMerge: Classifying high-redshift merging galaxies with deep neural networks. Astronomy and Computing 32, pp. 100390. External Links: Document, 2004.11981 Cited by: §1, §1.
  • C. J. Conselice, M. A. Bershady, M. Dickinson, and C. Papovich (2003) A Direct Measurement of Major Galaxy Mergers at z ≲ 3. AJ 126 (3), pp. 1183–1207. External Links: Document, astro-ph/0306106 Cited by: §2.1.
  • G. Csurka (2017) A comprehensive survey on domain adaptation for visual applications. In Domain Adaptation in Computer Vision Applications, pp. 1–35. External Links: ISBN 978-3-319-58347-1, Document Cited by: §1, §3.2.
  • D. W. Darg, S. Kaviraj, C. J. Lintott, K. Schawinski, M. Sarzi, S. Bamford, J. Silk, R. Proctor, and et al. (2010) Galaxy Zoo: the fraction of merging galaxies in the SDSS and their morphologies. Monthly Notices of the Royal Astronomical Society 401 (2), pp. 1043–1056. External Links: ISSN 0035-8711, Document Cited by: §1.
  • Dark Energy Survey Collaboration, T. Abbott, F. B. Abdalla, J. Aleksić, S. Allam, A. Amara, D. Bacon, E. Balbinot, M. Banerji, and et al. (2016) The Dark Energy Survey: more than dark energy - an overview. MNRAS 460 (2), pp. 1270–1299. External Links: Document, 1601.00329 Cited by: §1.
  • S. Das and P. N. Suganthan (2011) Differential evolution: a survey of the state-of-the-art. IEEE Transactions on Evolutionary Computation 15 (1), pp. 4–31. External Links: Document Cited by: §2.3.
  • J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei (2009) Imagenet: a large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255. Cited by: §1.
  • Z. Deng, C. Dwork, J. Wang, and L. Zhang (2020) Interpreting robust optimization via adversarial influence functions. In International Conference on Machine Learning, pp. 2464–2473. Cited by: §1.
  • DESI Collaboration, A. Aghamousa, J. Aguilar, S. Ahlen, S. Alam, L. E. Allen, C. Allende Prieto, J. Annis, S. Bailey, and et al. (2016a) The DESI Experiment Part I: Science,Targeting, and Survey Design. arXiv e-prints, pp. arXiv:1611.00036. External Links: 1611.00036 Cited by: §1.
  • DESI Collaboration, A. Aghamousa, J. Aguilar, S. Ahlen, S. Alam, L. E. Allen, C. Allende Prieto, J. Annis, S. Bailey, and et al. (2016b) The DESI Experiment Part II: Instrument Design. arXiv e-prints, pp. arXiv:1611.00037. External Links: 1611.00037 Cited by: §1.
  • S. Dodge and L. Karam (2016) Understanding How Image Quality Affects Deep Neural Networks. arXiv e-prints, pp. . External Links: Cited by: §1.
  • S. Dodge and L. Karam (2017) A study and comparison of human and deep learning recognition performance under visual distortions. arXiv e-prints, pp. . External Links: 1705.02498 Cited by: §1.
  • H. Domínguez Sánchez, M. Huertas-Company, M. Bernardi, S. Kaviraj, J. L. Fischer, T. M. C. Abbott, F. B. Abdalla, J. Annis, and et al. (2019) Transfer learning for galaxy morphology from one survey to another. MNRAS 484 (1), pp. 93–100. External Links: Document, 1807.00807 Cited by: §1.
  • K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, and D. Song (2017) Robust physical-world attacks on deep learning models. arXiv e-prints, pp. . External Links: 1707.08945 Cited by: §1.
  • R. Feinman, R. R. Curtin, S. Shintre, and A. B. Gardner (2017) Detecting adversarial samples from artifacts. arXiv e-prints, pp. . External Links: 1703.00410 Cited by: §1.
  • J. Fluri, T. Kacprzak, A. Lucchi, A. Refregier, A. Amara, T. Hofmann, and A. Schneider (2019) Cosmological constraints with deep learning from KiDS-450 weak lensing maps. Phys. Rev. D 100 (6), pp. 063514. External Links: Document, 1906.03156 Cited by: §1.
  • N. Ford, J. Gilmer, N. Carlini, and D. Cubuk (2019) Adversarial examples are a natural consequence of test error in noise. arXiv e-prints, pp. . External Links: 1901.10513 Cited by: §1, §4.2.
  • Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. March, and V. Lempitsky (2016) Domain-adversarial training of neural networks. Journal of Machine Learning Research 17 (59), pp. 1–35. External Links: Link Cited by: §1.
  • C. Gheller and F. Vazza (2022) Convolutional deep denoising autoencoders for radio astronomical images. MNRAS 509 (1), pp. 990–1009. External Links: Document, 2110.08618 Cited by: §1.
  • M. S. Gide, S. F. Dodge, and L. J. Karam (2016) The effect of distortions on the prediction of visual attention. arXiv e-prints, pp. . External Links: 1604.03882 Cited by: §1.
  • G. J. Glasser (1962) Variance formulas for the mean difference and coefficient of concentration. Journal of the American Statistical Association 57 (299), pp. 648–654. External Links: Document Cited by: §2.1.
  • I. J. Goodfellow, J. Shlens, and C. Szegedy (2014) Explaining and harnessing adversarial examples. arXiv e-prints, pp. . External Links: 1412.6572 Cited by: §1, §1, §4.2.
  • A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. Smola (2012) A kernel two-sample test. Journal of Machine Learning Research 13 (25), pp. 723–773. External Links: Link Cited by: §1, §3.2, §3.2.
  • A. Gretton, KM. Borgwardt, M. Rasch, B. Schölkopf, and A. Smola (2007) A kernel method for the two-sample-problem. In Advances in Neural Information Processing Systems 19, pp. 513–520. Cited by: §1.
  • S. Gu and L. Rigazio (2014) Towards deep neural network architectures robust to adversarial examples. arXiv preprint arXiv:1412.5068. Cited by: §1.
  • K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vol. , pp. 770–778. External Links: Document Cited by: §1, §3.1.
  • D. Hendrycks and T. Dietterich (2019) Benchmarking neural network robustness to common corruptions and perturbations. Proceedings of the International Conference on Learning Representations. Cited by: §1.
  • A. Ilyas, S. Santurkar, D. Tsipras, L. Engstrom, B. Tran, and A. Madry (2019) Adversarial examples are not bugs, they are features. In Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (Eds.), Vol. 32. External Links: Link Cited by: §1.
  • Ž. Ivezić, S. M. Kahn, J. A. Tyson, B. Abel, E. Acosta, R. Allsman, D. Alonso, Y. AlSayyad, et al. (2019) LSST: from science drivers to reference design and anticipated data products. ApJ 873 (2), pp. 111. External Links: Document, 0805.2366 Cited by: §1, §1, §2.2.
  • G. Kang, L. Jiang, Y. Yang, and A. G. Hauptmann (2019) Contrastive Adaptation Network for Unsupervised Domain Adaptation. arXiv e-prints, pp. arXiv:1901.00976. External Links: 1901.00976 Cited by: §6.
  • G. Katz, C. Barrett, D. Dill, K. Julian, and M. Kochenderfer (2017) Reluplex: an efficient SMT solver for verifying deep neural networks. arXiv e-prints. External Links: 1702.01135 Cited by: §1.
  • D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. arXiv 1412.6980. Cited by: §3.3.
  • A. Krizhevsky, V. Nair, and G. Hinton. CIFAR-10 (Canadian Institute for Advanced Research). External Links: Link Cited by: §1.
  • P. La Plante, P. K. G. Williams, M. Kolopanis, J. S. Dillon, A. P. Beardsley, N. S. Kern, M. Wilensky, Z. S. Ali, et al. (2021) A Real Time Processing system for big data in astronomy: Applications to HERA. Astronomy and Computing 36, pp. 100489. External Links: Document, 2104.03990 Cited by: §1.
  • F. Lanusse, Q. Ma, N. Li, T. E. Collett, C. Li, S. Ravanbakhsh, R. Mandelbaum, and B. Póczos (2018) CMU DeepLens: deep learning for automatic image-based galaxy-galaxy strong lens finding. MNRAS 473 (3), pp. 3895–3906. External Links: Document, 1703.02642 Cited by: §1.
  • M. Li, L. He, and Z. Lin (2020) Implicit Euler skip connections: enhancing adversarial robustness via numerical stability. In International Conference on Machine Learning, pp. 5874–5883. Cited by: §1.
  • R. Li, Y. Shu, J. Su, H. Feng, G. Zhang, J. Wang, and H. Liu (2018) Using deep Residual Networks to search for galaxy-Lyα emitter lens candidates based on spectroscopic selection. Monthly Notices of the Royal Astronomical Society 482 (1), pp. 313–320. External Links: ISSN 0035-8711, Document, Link Cited by: §1.
  • J. Lin (1991) Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory 37 (1), pp. 145–151. External Links: Document Cited by: §4.1.
  • Z. Lin, N. Huang, C. Avestruz, W. L. K. Wu, S. Trivedi, J. Caldeira, and B. Nord (2021) DeepSZ: identification of Sunyaev-Zel’dovich galaxy clusters using deep learning. MNRAS 507 (3), pp. 4149–4164. External Links: Document, 2102.13123 Cited by: §1.
  • C. J. Lintott, K. Schawinski, A. Slosar, K. Land, S. Bamford, D. Thomas, M. J. Raddick, R. C. Nichol, et al. (2008) Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 389 (3), pp. 1179–1189. External Links: Document Cited by: §1.
  • C. Lintott, K. Schawinski, S. Bamford, A. Slosar, K. Land, D. Thomas, E. Edmondson, K. Masters, et al. (2010) Galaxy Zoo 1: data release of morphological classifications for nearly 900 000 galaxies. Monthly Notices of the Royal Astronomical Society 410 (1), pp. 166–178. External Links: ISSN 0035-8711, Document, Link, https://academic.oup.com/mnras/article-pdf/410/1/166/18442057/mnras0410-0166.pdf Cited by: §1.
  • M. Long, Z. Cao, J. Wang, and M. I. Jordan (2017) Conditional adversarial domain adaptation. arXiv e-prints. External Links: 1705.10667 Cited by: §1.
  • J. M. Lotz, J. Primack, and P. Madau (2004) A new nonparametric approach to galaxy morphological classification. AJ 128 (1), pp. 163–182. External Links: Document, astro-ph/0311352 Cited by: §2.1, §2.1.
  • J. Lu, T. Issaranon, and D. Forsyth (2017) SafetyNet: detecting and rejecting adversarial examples robustly. In 2017 IEEE International Conference on Computer Vision (ICCV), pp. 446–454. External Links: Document Cited by: §1.
  • A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu (2018) Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations. Cited by: §1.
  • F. Marinacci, M. Vogelsberger, R. Pakmor, P. Torrey, V. Springel, L. Hernquist, D. Nelson, R. Weinberger, et al. (2018) First results from the IllustrisTNG simulations: radio haloes and magnetic fields. MNRAS 480 (4), pp. 5113–5139. External Links: Document, 1707.03396 Cited by: §2.
  • J. H. Metzen, T. Genewein, V. Fischer, and B. Bischoff (2017) On detecting adversarial perturbations. In Proceedings of 5th International Conference on Learning Representations (ICLR). External Links: Link Cited by: §1.
  • E. Naghib, P. Yoachim, R. J. Vanderbei, A. J. Connolly, and R. L. Jones (2019) A Framework for Telescope Schedulers: With Applications to the Large Synoptic Survey Telescope. AJ 157 (4), pp. 151. External Links: Document, 1810.04815 Cited by: §1.
  • J. P. Naiman, A. Pillepich, V. Springel, E. Ramirez-Ruiz, P. Torrey, M. Vogelsberger, R. Pakmor, D. Nelson, et al. (2018) First results from the IllustrisTNG simulations: a tale of two elements - chemical evolution of magnesium and europium. MNRAS 477 (1), pp. 1206–1224. External Links: Document, 1707.03401 Cited by: §2.
  • D. Nelson, V. Springel, A. Pillepich, V. Rodriguez-Gomez, P. Torrey, S. Genel, M. Vogelsberger, R. Pakmor, et al. (2019) The IllustrisTNG simulations: public data release. Computational Astrophysics and Cosmology 6 (1), pp. 2. External Links: Document, 1812.05609 Cited by: §1, §2.
  • A. Nitin Bhagoji, W. He, B. Li, and D. Song (2017) Exploring the space of black-box attacks on deep neural networks. arXiv e-prints. External Links: 1712.09491 Cited by: §1.
  • N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami (2016) Distillation as a defense to adversarial perturbations against deep neural networks. 2016 IEEE Symposium on Security and Privacy (SP), pp. 582–597. Cited by: §1.
  • K. Pearson (1901) LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 2 (11), pp. 559–572. External Links: Document Cited by: §4.3.
  • F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, et al. (2011) Scikit-learn: machine learning in Python. Journal of Machine Learning Research 12, pp. 2825–2830. Cited by: §4.3.
  • N. Perraudin, M. Defferrard, T. Kacprzak, and R. Sgier (2019) DeepSphere: Efficient spherical convolutional neural network with HEALPix sampling for cosmological applications. Astronomy and Computing 27, pp. 130. External Links: Document, 1810.12186 Cited by: §1.
  • A. Pillepich, D. Nelson, L. Hernquist, V. Springel, R. Pakmor, P. Torrey, R. Weinberger, S. Genel, et al. (2018) First results from the IllustrisTNG simulations: the stellar mass content of groups and clusters of galaxies. MNRAS 475 (1), pp. 648–675. External Links: Document, 1707.03406 Cited by: §2.
  • D. Prelogović, A. Mesinger, S. Murray, G. Fiameni, and N. Gillet (2022) Machine learning astrophysics from 21 cm lightcones: impact of network architectures and signal contamination. MNRAS 509 (3), pp. 3852–3867. External Links: Document, 2107.00018 Cited by: §1.
  • V. Rodriguez-Gomez, G. F. Snyder, J. M. Lotz, D. Nelson, A. Pillepich, V. Springel, S. Genel, R. Weinberger, et al. (2019) The optical morphologies of galaxies in the IllustrisTNG simulation: a comparison to Pan-STARRS observations. MNRAS 483 (3), pp. 4140–4159. External Links: Document, 1809.08239 Cited by: §2.1.
  • B. T. P. Rowe, M. Jarvis, R. Mandelbaum, G. M. Bernstein, J. Bosch, M. Simet, J. E. Meyers, T. Kacprzak, et al. (2015) GALSIM: The modular galaxy image simulation toolkit. Astronomy and Computing 10, pp. 121–150. External Links: Document, 1407.7676 Cited by: §2.2.
  • S. T. Roweis and L. K. Saul (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290 (5500), pp. 2323–2326. External Links: Document Cited by: §4.3.
  • K. Saito, D. Kim, S. Sclaroff, and K. Saenko (2020) Universal Domain Adaptation through Self Supervision. arXiv e-prints, pp. arXiv:2002.07953. External Links: 2002.07953 Cited by: §6.
  • J. Sanchez, I. Mendoza, D. P. Kirkby, P. R. Burchat, and LSST Dark Energy Science Collaboration (2021) Effects of overlapping sources on cosmic shear estimation: Statistical sensitivity and pixel-noise bias. J. Cosmology Astropart. Phys. 2021 (7), pp. 043. External Links: Document, 2103.02078 Cited by: §2.2.
  • J. L. Sérsic (1963) Influence of the atmospheric and instrumental dispersion on the brightness distribution in a galaxy. Boletin de la Asociacion Argentina de Astronomia La Plata Argentina 6, pp. 41–43. Cited by: §2.1.
  • A. Smola, A. Gretton, L. Song, and B. Schölkopf (2007) A Hilbert space embedding for distributions. In Algorithmic Learning Theory, Lecture Notes in Computer Science 4754, pp. 13–31. Cited by: §3.2, §3.2.
  • G. F. Snyder, P. Torrey, J. M. Lotz, S. Genel, C. K. McBride, M. Vogelsberger, A. Pillepich, D. Nelson, et al. (2015) Galaxy morphology and star formation in the Illustris Simulation at z = 0. Monthly Notices of the Royal Astronomical Society 454 (2), pp. 1886–1908. External Links: Document Cited by: §2.1.
  • V. Springel, R. Pakmor, A. Pillepich, R. Weinberger, D. Nelson, L. Hernquist, M. Vogelsberger, S. Genel, et al. (2018) First results from the IllustrisTNG simulations: matter and galaxy clustering. MNRAS 475 (1), pp. 676–698. External Links: Document, 1707.03397 Cited by: §2.
  • R. Storn and K. Price (1997) Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization 11 (4), pp. 341–359. External Links: Document Cited by: §2.3.
  • J. Su, D. V. Vargas, and K. Sakurai (2019) One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation 23 (5), pp. 828–841. External Links: Document Cited by: §1.
  • J. Su, D. Vasconcellos Vargas, and K. Sakurai (2017) One pixel attack for fooling deep neural networks. arXiv e-prints, pp. arXiv:1710.08864. External Links: 1710.08864 Cited by: §1, §2.3, §2.3.
  • H. Sugai, N. Tamura, H. Karoji, A. Shimono, N. Takato, M. Kimura, Y. Ohyama, A. Ueda, et al. (2015) Prime Focus Spectrograph for the Subaru telescope: massively multiplexed optical and near-infrared fiber spectrograph. Journal of Astronomical Telescopes, Instruments, and Systems 1, pp. 035001. External Links: Document, 1507.00725 Cited by: §1.
  • B. Sun and K. Saenko (2016) Deep CORAL: correlation alignment for deep domain adaptation. In ECCV Workshops. Cited by: §1.
  • C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus (2013) Intriguing properties of neural networks. arXiv e-prints. External Links: 1312.6199 Cited by: §1.
  • D. Tanoglidis, A. Ćiprijanović, and A. Drlica-Wagner (2021) DeepShadows: Separating low surface brightness galaxies from artifacts using deep learning. Astronomy and Computing 35, pp. 100469. External Links: Document, 2011.12437 Cited by: §1, §1.
  • D. Tanoglidis, A. Ćiprijanović, A. Drlica-Wagner, B. Nord, M. H. L. S. Wang, A. J. Amsellem, K. Downey, S. Jenkins, et al. (2021) DeepGhostBusters: Using Mask R-CNN to Detect and Mask Ghosting and Scattered-Light Artifacts from Optical Survey Images. arXiv e-prints, pp. arXiv:2109.08246. External Links: 2109.08246 Cited by: §1.
  • J. B. Tenenbaum, V. de Silva, and J. C. Langford (2000) A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science 290 (5500), pp. 2319–2323. External Links: Document Cited by: §4.3.
  • D. Tuccillo, M. Huertas-Company, E. Decencière, S. Velasco-Forero, H. Domínguez Sánchez, and P. Dimauro (2018) Deep learning for galaxy surface brightness profile fitting. MNRAS 475 (1), pp. 894–909. External Links: Document, 1711.03108 Cited by: §1.
  • L. van der Maaten and G. Hinton (2008) Visualizing data using t-SNE. Journal of Machine Learning Research 9 (86), pp. 2579–2605. External Links: Link Cited by: §4.3.
  • M. Vogelsberger, S. Genel, V. Springel, P. Torrey, D. Sijacki, D. Xu, G. Snyder, D. Nelson, and L. Hernquist (2014) Introducing the Illustris Project: simulating the coevolution of dark and visible matter in the Universe. MNRAS 444 (2), pp. 1518–1547. External Links: Document, 1405.2921 Cited by: §1.
  • M. Wang and W. Deng (2018) Deep visual domain adaptation: a survey. Neurocomputing 312, pp. 135–153. External Links: ISSN 0925-2312, Document Cited by: §1, §3.2.
  • D. Warde-Farley and I. Goodfellow (2017) Adversarial perturbations of deep neural networks. In Perturbations, Optimization, and Statistics, T. Hazan, G. Papandreou, and D. Tarlow (Eds.), pp. 311–342. External Links: Document Cited by: §4.2, §4.2.
  • M. Wicker, L. Laurenti, A. Patane, Z. Chen, Z. Zhang, and M. Kwiatkowska (2021) Bayesian inference with certifiable adversarial robustness. In International Conference on Artificial Intelligence and Statistics, pp. 2431–2439. Cited by: §1.
  • G. Wilson and D. J. Cook (2020) A survey of unsupervised deep domain adaptation. ACM Transactions on Intelligent Systems and Technology 11 (5). External Links: ISSN 2157-6904, Document Cited by: §1, §3.2.
  • J. F. Wu and S. Boada (2019) Using convolutional neural networks to predict galaxy metallicity from three-colour images. MNRAS 484 (4), pp. 4683–4694. External Links: Document, 1810.12913 Cited by: §1.
  • X. Yuan, P. He, Q. Zhu, and X. Li (2019) Adversarial examples: attacks and defenses for deep learning. IEEE Transactions on Neural Networks and Learning Systems 30 (9), pp. 2805–2824. External Links: Document Cited by: §1, §1.
  • W. Zellinger, B. A. Moser, T. Grubinger, E. Lughofer, T. Natschläger, and S. Saminger-Platz (2019) Robust unsupervised domain adaptation for neural networks via moment alignment. Information Sciences 483, pp. 174–191. External Links: Document Cited by: §1.
  • Y. Zhang, Y. Zhang, Y. Wei, K. Bai, Y. Song, and Q. Yang (2020) Fisher deep domain adaptation. In Proceedings of the 2020 SIAM International Conference on Data Mining (SDM), pp. 469–477. External Links: Document Cited by: §3.2.