A Probabilistic Approach for Predicting Landslides by Learning a Self-Aligned Deep Convolutional Model

11/12/2019 ∙ by Ainaz Hajimoradlou, et al. ∙ 30

Landslides are the movement of soil and rock under the influence of gravity. They are common phenomena that cause significant human and economic losses every year. To reduce the impact of landslides, experts have developed tools to identify areas that are more likely to generate landslides. We propose a novel statistical approach for predicting landslides using deep convolutional networks. Using a standardized dataset of georeferenced images consisting of slope, elevation, land cover, lithology, rock age, and rock family as inputs, we deliver a landslide susceptibility map as output. We call our model a Self-Aligned Convolutional Neural Network, SACNN, as it follows the ground surface at multiple scales to predict possible landslide occurrence for a single point. To validate our method, we compare it to several baselines, including linear regression, a neural network, and a convolutional network, using log-likelihood error and Receiver Operating Characteristic curves on the test set. We show that our model performs better than the other proposed baselines, suggesting that such deep convolutional models are effective on heterogeneous datasets for improving landslide susceptibility maps, which has the potential to reduce the human and economic cost of these events.


Code Repositories

LandslidePrediction

Classification task for predicting landslides based on GIS maps using locally aligned convolutional neural networks. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.



VenetoItaly

A repository for creating a hdf5 dataset of GIS maps. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.



1 Introduction

Landslides, the downslope movement of Earth materials under the influence of gravity, are common and destructive phenomena. Despite the number of studies focusing on landslide mapping [10] and landslide spatial and temporal probability prediction [21, 18], real-world applications are scarce and landslides cause significant loss of life and economic damage every year [19]. There are three approaches to landslide susceptibility mapping: expert-based, physical-based, and statistical. Expert-based methods rely on the qualitative judgment of a domain expert. Physical-based approaches model the stability of a slope given physical parameters such as geotechnical rock and soil properties and calculate the equilibrium between destabilizing factors and slope strength, but they often require more information than is available at scale. Statistical models rely on the statistical analysis of large landslide databases and their relation to landscape attributes. Landscape attributes typically include internal (e.g. slope angle, rock type, etc.) and external (e.g. rainfall) properties of the slope. These data are then used to map the spatial and/or temporal probability of slope failure [18]. The spatial probability of landslide occurrence is usually referred to as the susceptibility map. When the magnitude and the temporal component (e.g. frequency and triggers) are also considered, it is referred to as a hazard map [18].

The use of statistical approaches for predicting landslides has increased significantly in recent years. However, these studies mostly apply models such as linear logistic regression, Support Vector Machines (SVM), or neural networks [21]. In this study, we propose a novel convolutional model, which we call a Self-Aligned Convolutional Neural Network, SACNN, for producing susceptibility maps. Convolutional Neural Networks, CNNs, form a category of neural network models with tied parameters [12]. CNNs with pooling layers can capture both local and global features of an image, which has proven extremely useful in many vision tasks such as object recognition, image classification, and object detection.

We are interested in predicting the landslide probability for each point on the ground. The output of our model is a probability map with the same resolution as the input features. We use a fully convolutional model [24] for this purpose. These models have been widely used for image segmentation [22, 17] and usually consist of down-sampling and up-sampling stages. One of the popular models in this category is UNet [22], on which our architecture is also based. The down-sampling stage consists of convolutions with pooling layers and tries to create a set of compact features capturing both local and global properties of the input features. The up-sampling stage typically consists of transposed convolution layers, which essentially perform the inverse of pooling but with learnable parameters. We do not use transposed convolution layers in our model, as they produce checkerboard artifacts in the final image [2]. Instead, we use interpolation for up-sampling. Adding skip connections to a fully convolutional model has been shown to improve its performance [8, 14]. As short skip connections have been shown to work only in very deep networks, we only apply long connections in our model.

To produce good susceptibility maps for landslides, we are interested in learning filters that can follow the ground surface and extract features in the up-hill direction. For this to work, we need the CNN model to preserve the orientation of landslides relative to each other, but this is not possible with traditional techniques, where the filters align themselves up, down, left, and right, corresponding to north, south, east, and west. Capsule networks [23, 1, 20] have recently been proposed to address this issue; however, they are not suitable for the task of landslide prediction. We add a pre-processing stage to our CNN model to find the best directions for each pixel at multiple scales and then learn hidden features according to those directions. We call this model a Self-Aligned CNN, as the model first aligns itself to a specific set of orientations and then learns a classifier.

Figure 1: The process of finding aligned features at multiple scales. The red point shows the point of interest where we want to find the up-hill directions. Each blue circle shows a set of neighboring points at a specific range. The green point in each circle is the detected point with the highest elevation at that distance, from which the aligned features will be extracted.

The contributions of our paper are:

  • We provide an open-source dataset with a standard set of features. The dataset consists of several input features such as the slope, elevation, rock types with age and family, and land cover, along with the ground truth in the shape of landslide polygons which can be used in both a supervised and unsupervised learning framework.

  • We propose a novel statistical approach for predicting landslides using deep convolutional networks. We develop a model that can capture each pixel’s orientation at three different ranges to classify a landslide. We use ranges of 30, 100, and 300 meters in our model. These scales can also be optimized using cross-validation.

  • We define several baseline models for comparison. We provide five different baselines against which we compare our model's performance: a Random model, a linear logistic regression (LLR), a neural network (NN), a convolutional neural network (CNN), and a self-aligned neural network (SANN) model without any convolutions.

  • We provide a way to use CNN models with heterogeneous datasets for predicting landslides rather than only using images in our models.

2 Related Work

Producing susceptibility maps with statistical approaches is not new in the landslide community. Many authors have used models such as logistic regression, SVMs, and random forests. [6] used random forests to generate susceptibility maps, emphasizing sensitivity and scaling issues. [15] and [28] also used random forest models to predict landslides in Switzerland and in the Wadi Tayyah Basin in Saudi Arabia. Some have developed software packages using random forests for susceptibility mapping [5]. [16] generate several susceptibility maps using SVMs, random forests, and AdaBoost. [3], [4], and [7] focus on linear regression for predicting landslides due to its simplicity and easy training procedure. A number of approaches formulate the problem in a probabilistic framework such as Bayesian networks [11, 13]. Neural networks and convolutional models are among the more recent approaches to susceptibility mapping. [27] and [25] use neural networks to assess mine landslide susceptibility and to predict shallow landslide hazards. [9] and [26] use a CNN model for detecting landslides from satellite images. However, most of these models are quite simple and do not have a large receptive field. Moreover, they learn a model to recognize landslides from satellite images, whereas we are interested in predicting them given geospatial data.

3 Dataset

Figure 2: The self-aligned CNN architecture used for predicting landslides. Each conv2d uses a kernel of size 3 with stride 1 and each MaxPool unit uses a kernel of size 2. Upsample units interpolate the image with a scale factor of 2 using bilinear interpolation.

The dataset used for predicting landslides is from an Italian open-source database. The dataset contains both continuous and categorical features in the form of rasters and vector files, respectively. Continuous features, including slope and the DEM (Digital Elevation Model), contain out-of-range values, while categorical features such as rock type, land cover, rock age, and rock family have several no-data points. To use such data in a CNN, we converted each vector map to a raster after removing no-data samples. We then prepared a new dataset of rasters with no out-of-range values or no-data points.

As we wanted to propose a baseline framework for this type of problem, we needed to come up with a standard set of features for our categorical data. Therefore, we decided to choose 44 rock types, 5 land covers, 5 rock families, and 38 rock ages, based on the INSPIRE terminology, as the one-hot encoding for our categorical data. INSPIRE (Infrastructure for Spatial Information in Europe: https://inspire.ec.europa.eu) is an open-source project for standardizing spatial data across countries in Europe. Using the INSPIRE terminology, we ended up with 94 standard input features. Anyone using the INSPIRE terminology should be able to compare their results with our proposed baselines and further use our prepared dataset.
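
The feature count can be checked with a few lines of arithmetic. The sketch below assumes that the 94 features are the one-hot channels of the four categorical maps plus the two continuous rasters (slope and elevation); that split is our reading of the numbers above, and the helpers `one_hot` and `pixel_features` are hypothetical names used only for illustration.

```python
# Category counts taken from the text; the "+ 2 continuous rasters" split is an assumption.
CATEGORY_COUNTS = {"rock_type": 44, "land_cover": 5, "rock_family": 5, "rock_age": 38}
CONTINUOUS = ["slope", "elevation"]


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1 at position `index`."""
    if not 0 <= index < size:
        raise ValueError("category index out of range")
    return [1 if i == index else 0 for i in range(size)]


def pixel_features(slope, elevation, categories):
    """Concatenate continuous values and one-hot categorical channels for one pixel."""
    features = [slope, elevation]
    for name, size in CATEGORY_COUNTS.items():
        features.extend(one_hot(categories[name], size))
    return features


n_features = len(CONTINUOUS) + sum(CATEGORY_COUNTS.values())
```

Under this reading, every pixel carries two continuous values and exactly four active one-hot channels, one per categorical map.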

Each pixel in our prepared dataset has a resolution of 10 meters and the images are 21005 x 19500 pixels, resulting in an area approximately 210 km wide and 195 km high. This area is Veneto, a region of Italy. We used this region since it extends over both mountains and flat zones close to the sea. The ratio of landslides in this region is below 1%, which makes the dataset extremely imbalanced. The landslides in Veneto occur in both mountainous and less steep areas, which is good for training our model. Unfortunately, the landslide records do not usually contain information about the date of occurrence. All of these characteristics make this dataset challenging from the machine learning point of view.

The rasters in the dataset are too large to fit into memory when training. Instead, we divide each raster, an input feature, into smaller images of size x, which we call patches. We further feed mini-batches of these patches into our model for training. Since we want to produce a coherent probability map for the whole region, we use patches that overlap each other. For this purpose, we pad each patch with 64 pixels on each side, resulting in x images. This padding number is used to ensure that the overlap between patches is bigger than the receptive field of our networks. We partitioned these patches into training, testing, and validation sets.
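
The overlapping patches can be sketched as follows. Only the 64-pixel padding comes from the text; the patch size of 500, the toy raster size, the function name `padded_windows`, and the choice to clip windows at the raster border are assumptions made for illustration.

```python
def padded_windows(height, width, patch, pad=64):
    """Yield (top, bottom, left, right) bounds of padded patches, clipped to the raster."""
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            yield (
                max(top - pad, 0),
                min(top + patch + pad, height),
                max(left - pad, 0),
                min(left + patch + pad, width),
            )


# Hypothetical 1000 x 1000 raster cut into 500-pixel patches.
windows = list(padded_windows(1000, 1000, patch=500))
```

With this layout, two horizontally adjacent patches share a band that is twice the padding wide, so predictions near a patch border can be taken from the interior of the neighboring patch.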

4 Self-Aligned Deep Network

The slope is considered one of the main conditioning factors in causing landslides. The LLR baseline that we learned also confirms this claim as the slope’s weight is among the top 5 learned weights. Traditional CNN filters are oriented vertically in an image, but the important orientation is up-hill and down-hill for landslides. Based on this, we propose a Self-Aligned CNN model with filters that align themselves according to the up-hill direction and extract features alongside that direction. As illustrated in Figure 1, we find the highest elevation value for each pixel in the image at three different ranges and extract relevant features at those located points. Because space is at a premium for batch size, we selected a subset of 22 features for this purpose. These features are chosen based on our trained LLR baseline. If we define the LLR’s weight vector by , each feature is chosen such that . The code is available under https://github.com/ainazHjm/Landslide.
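
A minimal sketch of the up-hill search illustrated in Figure 1 is given below. It assumes that the neighbors at a given range are sampled at a fixed number of angles on a circle, rounded to the nearest pixel, and clipped at the raster border; the function names, the number of angles, and the toy terrain are hypothetical and are not taken from the released code.

```python
import math

PIXEL_SIZE_M = 10            # raster resolution stated in the text
RANGES_M = (30, 100, 300)    # looking distances stated in the text


def highest_on_circle(elevation, row, col, radius_px, n_angles=16):
    """Return (row, col) of the highest sampled point on a circle around (row, col)."""
    n_rows, n_cols = len(elevation), len(elevation[0])
    best = None
    for k in range(n_angles):
        angle = 2.0 * math.pi * k / n_angles
        r = min(max(row + round(radius_px * math.sin(angle)), 0), n_rows - 1)
        c = min(max(col + round(radius_px * math.cos(angle)), 0), n_cols - 1)
        if best is None or elevation[r][c] > elevation[best[0]][best[1]]:
            best = (r, c)
    return best


def aligned_points(elevation, row, col):
    """Up-hill point for each looking distance, converted from meters to pixels."""
    return [highest_on_circle(elevation, row, col, m // PIXEL_SIZE_M) for m in RANGES_M]


# Toy terrain that rises toward larger column indices.
dem = [[float(c) for c in range(81)] for _ in range(81)]
points = aligned_points(dem, 40, 40)
```

The aligned features for a pixel would then be read from the selected feature maps at the returned coordinates, one set per looking distance.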

4.1 Architecture

Our Self-Aligned CNN architecture consists of a preprocessing module and four layers of down-sampling and up-sampling, as in Figure 2. The preprocessing module takes the elevation map along with other input features from the dataset as inputs and outputs 22 aligned features for each looking distance. We use 30, 100, and 300 meters as looking distances in our experiments, but the looking distance can also be treated as a hyper-parameter and optimized using cross-validation. In total, the preprocessing module outputs 66 aligned features, which we feed into the convolutional network along with the original 94 features. As in UNet, we apply long skip connections between the sampling layers of our SACNN architecture. Each down-sampling layer consists of two convolution layers, each followed by a ReLU non-linearity, and a max-pooling layer. Every up-sampling layer includes an up-sampling module that interpolates the data, followed by convolutions and ReLU. In the end, we apply a sigmoid function to the output of the model to obtain probabilities.
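
The channel and size bookkeeping implied by this description can be sketched as below. The sketch assumes that the convolutions are padded so that they preserve the spatial size (the text gives only kernel size and stride) and uses a hypothetical patch size of 128; `input_channels` and `trace_sizes` are illustrative names.

```python
N_ORIGINAL = 94       # input features from the dataset
N_ALIGNED = 22        # aligned features per looking distance
N_DISTANCES = 3       # 30, 100, and 300 meters
N_LEVELS = 4          # down-sampling / up-sampling layers


def input_channels():
    """Channels fed to the convolutional network: original plus aligned features."""
    return N_ORIGINAL + N_ALIGNED * N_DISTANCES


def trace_sizes(size, levels=N_LEVELS):
    """Spatial size after each 2x max-pool, then after each 2x up-sampling."""
    down = [size]
    for _ in range(levels):
        down.append(down[-1] // 2)      # MaxPool with kernel 2 halves the size
    up = [down[-1]]
    for _ in range(levels):
        up.append(up[-1] * 2)           # Upsample with scale factor 2 doubles it
    return down, up


down, up = trace_sizes(128)
```

Under these assumptions, a long skip connection can concatenate the two branches directly only when the sizes match at every level, which holds when the input size is divisible by 16; other sizes would need cropping or resizing.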

4.2 Training

We partition image patches after shuffling such that 80% of the data is used for training, 10% for testing, and the other 10% for validation. We use the negative log-likelihood loss to train our model. However, as the training data is extremely imbalanced, we use oversampling to balance the data to some extent. As landslides are quite rare, we don’t want the model to become overconfident when predicting them. Since we want to train our model on patches and preserve the spatial relation between pixels, we don’t oversample landslide pixels but rather patches that have at least one positive label. By oversampling those patches, we are oversampling both landslides and non-landslide pixel points. After doing this, the distribution of landslides stays below 1%. This oversampling technique can also be seen as a type of data augmentation which provides more training data. We use an oversampling ratio of 5 in our experiments.
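
The patch-level oversampling can be sketched as follows. We assume here that an oversampling ratio of 5 means each patch with at least one positive label appears five times in the training list; the dictionary layout of a patch and the function name `oversample` are hypothetical.

```python
import random


def oversample(patches, ratio=5, seed=0):
    """Repeat every patch that contains a landslide pixel `ratio` times, then shuffle."""
    sampled = []
    for patch in patches:
        has_landslide = any(any(row) for row in patch["label"])
        sampled.extend([patch] * (ratio if has_landslide else 1))
    random.Random(seed).shuffle(sampled)
    return sampled


# Three toy patches with 2 x 2 label masks; only patch 1 contains a landslide pixel.
patches = [
    {"id": 0, "label": [[0, 0], [0, 0]]},
    {"id": 1, "label": [[0, 1], [0, 0]]},
    {"id": 2, "label": [[0, 0], [0, 0]]},
]
train = oversample(patches)
```

Because whole patches are repeated, the non-landslide pixels inside a positive patch are repeated as well, which is why the overall landslide ratio stays low.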

MODEL   OPTIMIZER   LR       EPOCHS   BS   DECAY   PATIENCE
LLR     Adam        0.125    10       15   0.001   2
NN      Adam        0.125    10       13   0.001   2
SANN    Adam        0.0156   15       10   0.001   2
CNN     SGD         0.125    20       12   0.001   2
SACNN   Adam        0.001    30       9    0.001   2
Table 1: Training hyper-parameters. LR and BS denote the learning rate and the batch size, respectively.
Figure 3: This image shows the data partitioning used for the whole region of Veneto. Light purple, blue, and green colors are used to represent train, validation, and test sets respectively. The white background shows no-data points.

We propose several baselines against which to compare our model, including linear logistic regression, NN, CNN, and a Self-Aligned NN model that combines the preprocessing with a neural network. Table 1 shows the hyper-parameters used for training each of these models. We optimized the learning rate and the optimizer with 5-fold cross-validation for one epoch. The batch size is chosen such that the maximum number of samples fits in the 12 GB memory of a Titan Xp GPU. The number of epochs is chosen to fully train each model. We validate our models at each epoch and, to avoid overfitting, reduce the learning rate if the validation error keeps increasing for a number of epochs given by the patience.
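
The patience rule can be sketched as a small scheduler. This is an illustration under assumptions, not the training code: the shrink factor of 0.1 is hypothetical (the text does not state by how much the learning rate is reduced), and the validation errors below are made-up numbers.

```python
class PatienceSchedule:
    """Shrink the learning rate after `patience` consecutive epochs of rising validation error."""

    def __init__(self, lr, patience=2, factor=0.1):
        self.lr = lr
        self.patience = patience
        self.factor = factor        # hypothetical shrink factor, not given in the text
        self.previous = None
        self.rising = 0

    def step(self, val_error):
        """Record one epoch's validation error and return the learning rate to use next."""
        if self.previous is not None and val_error > self.previous:
            self.rising += 1
        else:
            self.rising = 0
        self.previous = val_error
        if self.rising >= self.patience:
            self.lr *= self.factor
            self.rising = 0
        return self.lr


schedule = PatienceSchedule(lr=0.001, patience=2)
history = [schedule.step(err) for err in (0.060, 0.055, 0.056, 0.058, 0.057)]
```

In this toy run the validation error rises for two epochs in a row (0.056, then 0.058), so the learning rate is reduced once, after the fourth epoch.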

5 Results

Figure 4: (a) SACNN probability map at 21005x19500 resolution for Veneto and (b) its corresponding ground truth. Red regions correspond to higher probabilities and blue regions are areas with probabilities close to zero. Red polygons represent observed landslides in the Veneto region.
Figure 5: Probability maps for a small region in Veneto obtained by (a) linear logistic regression, (b) the neural network, (c) the Self-Aligned NN, (d) the CNN, and (e) the Self-Aligned CNN, together with (f) the corresponding ground truth. The resolution of the images is 2500x2500 pixels.

Since no time scale is provided for the landslides, the output probabilities refer to an undefined time period and should therefore only be interpreted on a relative scale. Figure 4(a) shows the probability distribution map (susceptibility map) obtained by our model on the whole region of Veneto. The corresponding ground truth is shown in Figure 4(b). Since the prediction map is too big and it is hard to differentiate between the outputs of the baselines, we show a smaller region of the map to compare the various outputs against each other and against their corresponding ground truth. The chosen area of interest, shown in Figure 5, includes both landslide polygons and non-landslide areas. We chose this region as it has a variety of terrain. We also illustrate the patches that were used for training, validation, and testing in the whole region in Figure 3.

5.1 Evaluation

Figure 6: ROC curves from all models on the test set.

We define $p_0$ as the ratio of negative/zero labels and $p_1$ as the ratio of positive/one labels in the training set. We propose a baseline model called Random that predicts 0.001 a fraction $p_0$ of the time and predicts 0.999 a fraction $p_1$ of the time. Given $p_0$ and $p_1$, we can easily calculate the expected negative log-likelihood on the training set, which is approximately equal to 0.18. We can find the expected error on the test set with the same calculation. We compare the test and training errors of our other baselines with those of the Random model to make sure that the learned models perform better than Random, as shown in Table 2.
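
The expected error of the Random model can be sketched as below, writing `p_pos` for the ratio of positive labels. The sketch assumes that the Random prediction is drawn independently of the true label, which is our reading of the description; the function name `expected_nll` is hypothetical, and the exact label ratios of the dataset are not given in the text, so `p_pos` is left as a parameter.

```python
import math


def expected_nll(p_pos, low=0.001, high=0.999):
    """Expected negative log-likelihood of the Random baseline.

    Labels are 1 with probability p_pos; independently, the model outputs
    `high` with probability p_pos and `low` otherwise.
    """
    p_neg = 1.0 - p_pos

    def nll(label, prob):
        return -math.log(prob if label == 1 else 1.0 - prob)

    return (
        p_neg * p_neg * nll(0, low)      # negative label, low prediction
        + p_neg * p_pos * nll(0, high)   # negative label, high prediction
        + p_pos * p_neg * nll(1, low)    # positive label, low prediction
        + p_pos * p_pos * nll(1, high)   # positive label, high prediction
    )
```

The two mismatched cases dominate the sum, since a confident wrong prediction costs -log(0.001) while a confident correct one costs only -log(0.999), so the expected error grows quickly with the ratio of positive labels.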

METHOD   TEST ERR   TRAIN ERR
Random   0.16       0.18
LLR      0.055      0.057
NN       0.052      0.055
SANN     0.048      0.052
CNN      0.047      0.051
SACNN    0.046      0.050
Table 2: Negative log-likelihood loss.
Figure 7: Negative log-likelihood error of all models on the validation set in the first 10 epochs.

We evaluate our models using the Receiver Operating Characteristic (ROC) curve and the negative log-likelihood error on the test set to see how well they can differentiate between the two classes. The Area Under the Curve (AUC) is also reported in Figure 6. Our model, SACNN, achieves the best result on all metrics, as shown in Figure 6 and Table 2, confirming the significance of using aligned features for predicting landslides. Although SANN has a higher test error than the CNN baseline, it obtains the same AUC as CNN on the test set, suggesting that alignment alone can improve model performance to a great extent. However, the best result is obtained by using both convolutions and alignment, as in our proposed model, SACNN. The validation curves of all models are provided in Figure 7.
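
For reference, the AUC can be computed directly from per-pixel labels and predicted probabilities. The sketch below uses the rank-based (Mann-Whitney) formulation rather than integrating the ROC curve; it is a generic illustration with made-up inputs and a hypothetical function name, not the evaluation code used for Figure 6.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation.

    Equals the probability that a random positive scores higher than a random
    negative, counting ties as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(positives) * len(negatives))


auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of pixels, so it is only suitable for small samples; a sorting-based computation would be needed for a full raster. Because AUC depends only on the ranking of the scores, it is unaffected by the class imbalance that inflates or deflates the log-likelihood.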

6 Conclusion

Landslides are the movement of ground under the force of gravity. They are common phenomena that can cause significant casualties. There have been many approaches to produce susceptibility maps to reduce the impact of landslides including expert-based, physics-based, and statistical methods. All of these methods have their flaws and lack a standard set of features. We provide a standardized open-source dataset with the same terminology as INSPIRE so that anyone who uses the INSPIRE terminology can compare their results to our proposed baselines. We also propose a novel statistical approach for predicting landslides using machine learning. We introduce a deep convolutional model, called SACNN, that can follow the ground surface and align itself with the ground contour lines to extract relevant features. We evaluate our model by ROC curves and negative log-likelihood error and show that it can achieve the best results on the test set among all the baselines. Our results suggest that this type of statistical approach is effective for generating susceptibility maps which in turn has the potential to alleviate human and financial losses caused by landslides.

References

  • [1] A. Ahmad, B. Kakillioglu, and S. Velipasalar (2018) 3D capsule networks for object classification from 3D model data. 2018 52nd Asilomar Conference on Signals, Systems, and Computers, pp. 2225–2229.
  • [2] A. P. Aitken, C. Ledig, L. Theis, J. Caballero, Z. Wang, and W. Shi (2017) Checkerboard artifact free sub-pixel convolution: a note on sub-pixel convolution, resize convolution and convolution resize. CoRR abs/1707.02937.
  • [3] P. M. Atkinson and R. Massari (1998-05-15) Generalised linear modelling of susceptibility to landsliding in the central Apennines, Italy. Computers and Geosciences 24 (4), pp. 373–385. ISSN 0098-3004.
  • [4] L. Ayalew and H. Yamagishi (2005-02) The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko mountains, central Japan. Geomorphology 65, pp. 15–31.
  • [5] P. Behnia and A. Blais-Stevens (2017-11) Landslide susceptibility modelling using the quantitative random forest method along the northern portion of the Yukon Alaska highway corridor, Canada. Natural Hazards.
  • [6] F. Catani, D. Lagomarsino, S. Segoni, and V. Tofani (2013) Landslide susceptibility estimation by random forests technique: sensitivity and scaling issues. Natural Hazards and Earth System Sciences 13 (11), pp. 2815–2831.
  • [7] J. Davis, C. Chung, and G. Ohlmacher (2006-10) Two models for evaluating landslide hazards. Computers & Geosciences 32, pp. 1120–1127.
  • [8] M. Drozdzal, E. Vorontsov, G. Chartrand, S. Kadoury, and C. Pal (2016) The importance of skip connections in biomedical image segmentation. In Deep Learning and Data Labeling for Medical Applications, G. Carneiro, D. Mateus, L. Peter, A. Bradley, J. M. R. S. Tavares, V. Belagiannis, J. P. Papa, J. C. Nascimento, M. Loog, Z. Lu, J. S. Cardoso, and J. Cornebise (Eds.), Cham, pp. 179–187. ISBN 978-3-319-46976-8.
  • [9] O. Ghorbanzadeh, T. Blaschke, K. Gholamnia, S. Meena, D. Tiede, and J. Aryal (2019-01) Evaluation of different machine learning methods and deep-learning convolutional neural networks for landslide detection. Remote Sensing 11, pp. 21.
  • [10] F. Guzzetti, A. Mondini, M. Cardinali, F. Fiorucci, M. Santangelo, and K. Chang (2012-03) Landslide inventory maps: new tools for an old problem. Earth-Science Reviews 112, pp. 42–66.
  • [11] T. Heckmann, W. Schwanghart, and J. D. Phillips (2015) Graph theory - recent developments of its application in geomorphology.
  • [12] Y. LeCun, P. Haffner, L. Bottou, and Y. Bengio (1999) Object recognition with gradient-based learning. In Shape, Contour and Grouping in Computer Vision, London, UK, pp. 319–. ISBN 3-540-66722-9.
  • [13] L. Lombardo, T. Opitz, and R. Huser (2018-02) Point process-based modeling of multiple debris flow landslides using INLA: an application to the 2009 Messina disaster. Stochastic Environmental Research and Risk Assessment.
  • [14] X. Mao, C. Shen, and Y. Yang (2016) Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In Advances in Neural Information Processing Systems 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.), pp. 2802–2810.
  • [15] N. Micheletti, L. Foresti, S. Robert, M. Leuenberger, A. Pedrazzini, M. Jaboyedoff, and M. Kanevski (2013-12) Machine learning feature selection methods for landslide susceptibility mapping. Mathematical Geosciences 46.
  • [16] N. Micheletti, L. Foresti, S. Robert, M. Leuenberger, A. Pedrazzini, M. Jaboyedoff, and M. Kanevski (2013-12) Machine learning feature selection methods for landslide susceptibility mapping. Mathematical Geosciences 46.
  • [17] H. Noh, S. Hong, and B. Han (2015-12) Learning deconvolution network for semantic segmentation. In The IEEE International Conference on Computer Vision (ICCV).
  • [18] D. (. Ottowitz (2012) SafeLand deliverable D2.8: recommended procedures for validating landslide hazard and risk models and maps. Living with landslide risk in Europe: assessment, effects of global change, and risk management strategies. European Project SafeLand.
  • [19] D. Petley (2012-10) Global patterns of loss of life from landslides. Geology 40 (10), pp. 927–930. ISSN 0091-7613.
  • [20] S. Ramasinghe, C. D. Athuraliya, and S. H. Khan (2018) A context-aware capsule network for multi-label classification. ArXiv abs/1810.06231.
  • [21] P. Reichenbach, M. Rossi, B. D. Malamud, M. Mihir, and F. Guzzetti (2018) A review of statistically-based landslide susceptibility models. Earth-Science Reviews 180 (March), pp. 60–91. ISBN 0755014413, ISSN 00128252.
  • [22] O. Ronneberger, P. Fischer, and T. Brox (2015) U-Net: convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), LNCS, Vol. 9351, pp. 234–241. (Available on arXiv:1505.04597 [cs.CV].)
  • [23] S. Sabour, N. Frosst, and G. E. Hinton (2017) Dynamic routing between capsules. CoRR abs/1710.09829.
  • [24] E. Shelhamer, J. Long, and T. Darrell (2017-04) Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39 (4), pp. 640–651. ISSN 0162-8828.
  • [25] D. Tien Bui, T. Tuan, H. Klempe, B. Pradhan, and I. Revhaug (2015-01) Spatial prediction models for shallow landslide hazards: a comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides, pp. 1–18.
  • [26] Y. Wang, Z. Fang, and H. Hong (2019-05) Comparison of convolutional neural networks for landslide susceptibility mapping in Yanshan County, China. Science of The Total Environment 666, pp. 975–993.
  • [27] L. X (2019) Mine landslide susceptibility assessment using ivm, ann and svm models considering the contribution of affecting factors.
  • [28] Dr. A. Youssef, M. Al-Kathery, and B. Pradhan (2014-07) Landslide susceptibility mapping at Al-Hasher area, Jizan (Saudi Arabia) using GIS-based frequency ratio and index of entropy models. Geosciences Journal 19.