CNN Profiler on Polar Coordinate Images for Tropical Cyclone Structure Analysis

10/28/2020 ∙ by Boyo Chen, et al.

Convolutional neural networks (CNNs) have achieved great success in analyzing tropical cyclones (TCs) from satellite images in several tasks, such as TC intensity estimation. In contrast, TC structure, which is conventionally described by a few parameters estimated subjectively by meteorology specialists, remains hard to profile objectively and routinely. This study applies a CNN to satellite images to create entire TC structure profiles, covering all the structural parameters. By using meteorological domain knowledge to construct TC wind profiles from historical structure parameters, we provide valuable training labels in our newly released benchmark dataset. With this dataset, we hope to attract more attention from data scientists to this crucial issue. Meanwhile, a baseline is established with a specialized convolutional model operating on polar coordinates. We found that, given a TC's rotational and spiral nature, it is more feasible and physically reasonable to extract structural information on polar coordinates than on Cartesian coordinates. Experimental results on the released benchmark dataset verify the robustness of the proposed model and demonstrate the potential of deep learning techniques for this barely developed yet important topic.




1 Introduction

A tropical cyclone (TC), also called a hurricane or typhoon, is a rotating storm that forms over tropical oceans; it is characterized by a low-pressure center (i.e., the “eye”), an eyewall associated with deep convective clouds and strong winds, and spiral rainbands outside the eyewall. This severe weather system often causes serious damage to human society through gusty winds, torrential rainfall, high waves, and storm surges.

Although recent improvements in TC forecasting ensure fairly good prediction of the track and primary rainfall distribution of a TC, there is still room for improvement in the ability to predict TC structure Knaff et al.; Sampson and Knaff; Sampson et al. The TC structure, in terms of its 2-D surface wind field, is closely related to the potential TC damage, the area affected by gale-force winds, and the magnitude of storm surges Powell and Reinhold (2007). Moreover, a better TC structure analysis serving as the initial data of numerical weather prediction models is critical to improving the prediction accuracy for TCs Tallapragada (2015); Bender et al. (2016).

It is not easy to analyze the structure of a TC accurately, because TCs spend most of their lifetime over the open ocean, where meteorological observation is severely limited. Therefore, meteorologists rely heavily on satellite remote sensing to estimate the TC surface wind field, the TC radial wind profile, and structural parameters (e.g., intensity, the radius of maximum wind, and size; please refer to section 2).

The most straightforward way to analyze TC structure is to use spaceborne radar surface wind observations (fig. 1), such as those from the Advanced Scatterometer (ASCAT, Figa-Saldaña et al. (2002); Knaff et al. (2011)). Although ASCAT provides high-quality surface wind observations outside the TC inner core (i.e., beyond a radius of approximately 80-150 km), it provides only two scans of a TC per day. Sometimes, only a portion of the TC is observed because of ASCAT’s limited swath width.

To estimate TC structure at a higher frequency, other kinds of satellite observations have to be used, such as microwave sounders on low-Earth-orbit satellites Knaff et al. (2011); Demuth et al. (2004); Wimmers et al. (2019) and images from geostationary satellites Velden et al. (2006); Knaff et al. (2014); Chen et al. (2019). Infrared images that observe cloud features associated with a TC can be used to estimate several important TC structural parameters, including intensity and size. For instance, Knaff et al. (2014) developed a TC size estimation technique based on feature engineering. Their model used principal component analysis of the azimuthally averaged radial profile of the infrared brightness temperature, together with linear regression, to estimate TC size. With the structural parameters retrieved from satellite images, the TC radial wind profile can be constructed by a physically based parametric wind model Knaff et al. (2016); Morris and Ruf (2017). However, it is difficult to analyze the TC structure with such a simple parametric model.

Figure 1: The ASCAT surface wind observation (colored vectors, kt) of Typhoon Soulik (2018) at 11:48 UTC on August 22, 2018. The raw ASCAT data can be downloaded from

Although satellite remote sensing provides various observational data for TC structure analysis, conventional statistical methods face limitations in analyzing multi-channel, high-dimensional, and spatio-temporally heterogeneous satellite data. Meanwhile, deep learning has achieved great success in analyzing satellite remote-sensing images of TCs, in tasks such as TC intensity estimation Chen et al. (2019), predicting TC intensification Bai et al. (2019); Yang et al. (2020), and anticipating TC formation Matsuoka et al. (2018). In these studies, convolutional neural networks (CNN, Krizhevsky et al. (2012)) successfully extracted features that were previously difficult to quantify from high-dimensional data and used them to complete classification or regression tasks. These deep learning models provide more efficient, stable, and objective guidance for TC forecasting, and their performance is comparable to, but does not significantly exceed, that of state-of-the-art meteorological techniques.

The goal of this study is to explore the potential of deep learning for this necessary but not yet well-tackled topic in meteorology. To remove the dependence on any simple parametric model and analyze TC structure directly from satellite images, we construct and release a new benchmark dataset, in which TC wind profiles are constructed from meteorological domain knowledge to provide valuable data labels.

Furthermore, we propose a novel specialized CNN model operating on polar coordinates. Several loss function compositions and model structures are explored and discussed in the following sections. With a properly designed model, the experimental results show a promising future for deep learning techniques in this new topic.

This paper is organized as follows. Section 2 defines TC structure and structural parameters and describes how meteorologists conventionally estimate them. Section 3 describes our newly released dataset: Dataset of Tropical Cyclone Structural Analysis. Section 4 proposes the CNN architecture on polar coordinates, suitable for processing TC satellite imagery that is rotationally invariant. Section 5 presents the experimental results. Section 6 is the conclusion.

2 Background Knowledge

2.1 Definition of TC structure and structural parameters

As a cyclonically rotating system, a TC’s structure usually refers to the characteristics of the storm-centered surface wind field, which is closely related to the potential TC impacts. However, because a TC is fairly axis-symmetric about its center and its tangential wind component is much larger than its radial wind component (fig. 1), it is more practical to describe a TC’s structure by the azimuthally averaged radial surface wind profile Holland and Merrill (1984); Knaff et al. (2016).

With such a profile (fig. 2a, green line, as an example), several important structural parameters can be defined. TC intensity ($V_{max}$) is conventionally defined as the maximum wind near the TC center, and the radius of maximum wind (RMW) indicates where $V_{max}$ occurs. TC size is usually defined as the radial extent from the center of a certain wind threshold, such as 5, 34, 50, or 64 kt. The 34-kt wind radius ($R_{34}$) is considered the most practical TC size parameter, as it strongly relates to a TC’s impact. These three parameters are the most critical to estimate in real-time TC forecasting at operational weather prediction centers. Furthermore, previous studies Chan and Chan (2012); Weatherford and Gray (1988) have shown that intensity is not strongly related to size. This implies that knowing the intensity, which is the easiest of the three parameters to estimate, is not sufficient to determine the structure of a TC.

2.2 Conventional method to estimate TC structure

A space-borne scatterometer (e.g., ASCAT) provides high-quality surface wind observations Figa-Saldaña et al. (2002); Knaff et al. (2011). ASCAT is a C-band radar that measures ocean roughness and uses it to retrieve surface winds under approximately 30 m/s. Thus, the subjective analysis by forecasters based on scatterometer observations is considered one of the best references for analyzing TC structure Sampson et al. (2017, 2018). However, the sampling frequency of ASCAT (twice per day) is not enough for operational TC structure analysis, which ideally requires a higher frequency (intervals of less than six hours).

The “Multiplatform Tropical Cyclone Surface Winds Analysis” (MTCSWA) method Knaff et al. (2011) estimates TC surface wind fields every 6 h using observations from multiple satellite platforms and satellite-based wind retrieval techniques. MTCSWA uses a variational data-fitting method to merge satellite observations that are spatio-temporally heterogeneous. Although this method produces wind estimates with generally smaller errors than any single raw input, the analysis quality may be unstable when some important input data (e.g., ASCAT) are not available.

Another approach to estimating an axis-symmetric TC structure is to estimate key structural parameters. Several studies Knaff et al. (2014) have applied CNNs to estimate TC intensity from satellite imagery. On the other hand, Knaff et al. (2014) related storm-centered satellite infrared imagery to TC size, in terms of the radius of azimuthally averaged 5-kt winds. They created a multivariable linear regression equation based only on the first three principal components of the azimuthally averaged radial profile of the infrared brightness temperature.

Estimating these structural parameters relies strongly on extracting high-level features from satellite images. However, some of the methods are subjective and depend on the judgment of weather forecasters, while other objective statistical methods can handle only a limited number of features or predictors. As there may be great potential to extract more useful features with deep learning, we are motivated to apply deep learning to estimate not only a single structural parameter but, hopefully, the entire radial surface wind profile (i.e., profiles such as those shown in fig. 2a).

3 Dataset: TCSA

A new dataset for Tropical Cyclone Structure Analysis (TCSA) is released along with this research. TCSA can be used to develop deep learning models that estimate TC structural parameters (e.g., intensity, size, size asymmetry) and, more importantly, the axis-symmetric wind profile of the storm. Link to the dataset repository will be provided here after the double blind review.

As an extension of another open dataset, the Dataset of Tropical Cyclone for Image-to-intensity Regression (TCIR) Chen et al. (2018), TCSA contains 76835 TC images collected from 2004 to 2018, covering 1407 TCs in every basin over the globe. For each TC, images are collected once every 3 hours. The TC center is always placed at the center of the image.

Four satellite channels are included in every image: (1) infrared, (2) water vapor, (3) visible light, and (4) passive microwave rain rate (Figure 3(a)).

Four labels are provided: (1) intensity ($V_{max}$, defined as the maximum wind velocity), (2) size ($R_{34}$, defined as the mean of the radii of 34-knot wind in the four quadrants, in kilometers), (3) radius of maximum wind speed (RMW), and, most important of all, (4) the wind profile.

Figure 2: (a) A good wind profile and an uncertain wind profile. For the good profile (green), $V_{max}$, RMW, and $R_{34}$ are indicated. For the uncertain profile (brown), the best-track RMW (orange "x") and the calculated RMW (brown "x") differ from each other. (b) The scatter plot of RMW difference vs. $V_{max}$. The horizontal lines indicate the interval of 2 standard deviations on the Y-axis. The red triangle indicates the position of the uncertain sample shown in (a).

3.1 Wind Profile Label

In the TCSA, we apply the parametric wind model of Morris and Ruf (2017) to calculate the TC wind profile for every available sample. The radial wind profile of a TC can be described by

$$V(r) = \frac{2r\left(R_m V_m + \frac{1}{2} f R_m^2\right)}{R_m^2 + a r^b} - \frac{f r}{2}$$

where $R_m$ is the radius of maximum wind speed (RMW), $V_m$ is the maximum wind speed, $r$ is the radial distance from the storm center, and $f$ is the Coriolis parameter. Here, we use $V_{max}$, RMW, and $R_{34}$ to approach the most probable wind profile of the TC, with parameters a and b calculated by iteration. This wind model assumes that the TC is symmetric, and adjusting a and b allows the wind speed profile to be fitted better. According to meteorological domain knowledge, $V_{max}$ and $R_{34}$ are fixed in our calculation because they are more reliable than RMW. $V_{max}$ and $R_{34}$ are also more critical in assessing TC impact in operational TC forecasting. Consequently, RMW is allowed to be adjusted during the iteration. However, there is sometimes a large difference between the original RMW and the calculated RMW (fig. 2a), especially for weaker TCs. In such cases, we question the correctness of the calculated profile.
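The parametric profile described above can be sketched as follows. This is a minimal illustration, not the dataset's actual labelling code: it assumes the functional form V(r) = 2r(R_m V_m + 0.5 f R_m^2) / (R_m^2 + a r^b) - f r / 2, which is our reading of the Morris and Ruf (2017) model, and it works in SI units (meters, m/s) rather than knots. The function names, the example latitude, and the default values a = 1 and b = 2 are illustrative assumptions; the iterative adjustment of a, b, and RMW is not reproduced.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s


def coriolis(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude), in 1/s."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def wind_profile(r, rmw, vmax, f, a=1.0, b=2.0):
    """Assumed parametric wind speed (m/s) at radius r (m).

    rmw is the radius of maximum wind (m) and vmax the maximum wind (m/s).
    a and b are the shape parameters that the text adjusts by iteration;
    the defaults here are illustrative only.
    """
    numerator = 2.0 * r * (rmw * vmax + 0.5 * f * rmw ** 2)
    return numerator / (rmw ** 2 + a * r ** b) - 0.5 * f * r


# Sample the profile every 5 km out to 510 km for a storm at 20 degrees latitude.
f = coriolis(20.0)
radii_km = [5 * i for i in range(103)]
profile = [wind_profile(1000.0 * r, 40e3, 50.0, f) for r in radii_km]
```

With a = 1 and b = 2, the profile passes exactly through vmax at r = rmw, which is a convenient sanity check before a and b are adjusted.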

Although we collected over 76000 images, only 46% of the data (35310 images) can be equipped with a valid wind profile label. This is usually because a sample’s $R_{34}$ does not exist when its intensity is weaker than 34 knots.

3.2 Profile Quality Analysis

Note that, even among valid wind profiles, a portion of the data shows a large difference between the original RMW label and the calculated RMW. We consider a wind profile uncertain if the distance between the original RMW label and the calculated RMW is more than two standard deviations (fig. 2b). Fig. 2(a) shows a good example and an uncertain example. The green line shows a good TC wind profile, with $V_{max}$ at the maximum wind speed, RMW at the radius of $V_{max}$, and $R_{34}$ at the radius where the wind speed equals 34 knots. In contrast, the brown line is an uncertain profile. Although $V_{max}$ and $R_{34}$ always fit the best-track data, the calculated RMW moves far outward, and the calculated outer wind speed may be over-estimated.

As shown in fig. 2(b), the RMW of most samples shifts only slightly to fit the wind model: 91.6% of the data have an RMW difference within two standard deviations (17.4 km). Most of the data with significant differences are weak TCs, since the parametric wind model was developed for mature TCs.
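The two-standard-deviation rule can be sketched in a few lines. This is a hedged illustration rather than the procedure actually used for TCSA: the function name is hypothetical, the population standard deviation is used, and the interval is assumed to be measured from zero difference, which the text does not state explicitly.

```python
import statistics


def flag_uncertain(original_rmw, calculated_rmw, n_std=2.0):
    """Return one boolean per sample: True if the profile is flagged as uncertain.

    A sample is flagged when the difference between the calculated and the
    original RMW exceeds n_std standard deviations of all differences.
    Assumption: the interval is centered on zero difference.
    """
    diffs = [calc - orig for orig, calc in zip(original_rmw, calculated_rmw)]
    threshold = n_std * statistics.pstdev(diffs)
    return [abs(d) > threshold for d in diffs]


# Toy example (values in km): only the last sample shifts far outward.
original = [40] * 10
calculated = [41, 39, 42, 40, 38, 41, 40, 39, 41, 90]
flags = flag_uncertain(original, calculated)
```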

While the wind profile label we calculated could hardly be perfect, especially with the assumption that the TCs are perfectly symmetric, we still believe that these labels are valuable in tackling the important topic of TC structure.

4 Proposed Method

Figure 3: Selected TC images on (a) Cartesian coordinates and (b) polar coordinates.
Figure 4: A schematic showing convolution kernel working on (a) Cartesian coordinates and (b) polar coordinates.

Because of their spiral nature, TCs are generally axis-symmetric or point-symmetric about the center. Therefore, we propose a unique CNN model that operates on polar coordinates with respect to the TC center. Before training, we project all the TC images, originally 128x128 on Cartesian coordinates, to 180x103 images on polar coordinates (fig. 3). Using polar coordinates brings us three benefits:

  1. They provide more explainable dimensions than those on Cartesian coordinates, allowing us to interpret the model better. Each index in the first dimension (180 points) represents 2 degrees of the directional angle, while each point in the second dimension (103 points) represents 5 kilometers of the radius.

  2. As proposed in Chen et al. (2018), the spiral characteristic of TCs enables us to obtain better results by blending the predictions for an image rotated by several different angles. The effectiveness of this method is also supported by the follow-up work Chen et al. (2019). Rotating an image on Cartesian coordinates requires interpolation, and probably cropping if we do not want black corners. On polar coordinates, the only thing we need to do is roll the image.

  3. On polar coordinates, a convolution kernel covers a sector, instead of a square, with its vertex pointing to the TC center. The sector mask can further highlight the spiral structure that grows outward from the cyclone eye (fig. 4). We will discuss the efficiency of convolution masks in different coordinate systems and with different shapes later in section 5.1.
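The projection and the roll-based rotation described in the list above can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the paper's preprocessing code: the function name `to_polar`, the value `km_per_pixel=8.0`, the use of nearest-neighbour sampling, and the convention of measuring angles counter-clockwise from east are all assumptions introduced for illustration. Only the 180x103 output size, the 2-degree angular step, and the 5-km radial step come from the text.

```python
import numpy as np


def to_polar(image, km_per_pixel=8.0, n_angles=180, n_radii=103,
             deg_step=2.0, km_step=5.0):
    """Resample a storm-centered square image onto an (angle, radius) grid.

    Nearest-neighbour sampling keeps the sketch short; km_per_pixel is a
    hypothetical resolution, not a value given in the text.
    """
    height, width = image.shape
    center_row, center_col = (height - 1) / 2.0, (width - 1) / 2.0
    theta = np.deg2rad(np.arange(n_angles) * deg_step)[:, None]      # (n_angles, 1)
    radius_px = (np.arange(n_radii) * km_step / km_per_pixel)[None, :]  # (1, n_radii)
    rows = np.rint(center_row - radius_px * np.sin(theta)).astype(int)
    cols = np.rint(center_col + radius_px * np.cos(theta)).astype(int)
    rows = np.clip(rows, 0, height - 1)
    cols = np.clip(cols, 0, width - 1)
    return image[rows, cols]


def rotate_polar(polar_image, degrees, deg_step=2.0):
    """Rotate a polar image about the TC center by rolling the angle axis."""
    return np.roll(polar_image, int(round(degrees / deg_step)), axis=0)
```

Rotating by 90 degrees is then a shift of 45 rows along the angle axis, with no interpolation or cropping involved.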

We stack IR1 and PMW (2 of the 4 channels) into 180x103x2 images before passing them into our CNN model. This combination of IR1 and PMW was shown to perform best in Chen et al. (2019).

4.1 Profiling a TC

Figure 5: A schematic showing two different ways to obtain $V_{max}$ and $R_{34}$ from the model. Since method (a) might produce contradictory results, method (b) is recommended.

As suggested in section 2, TC structure is conventionally represented by several parameters: $V_{max}$, RMW, and $R_{34}$. In this work, we aim to go further and predict the entire wind profile (fig. 2a). Such a wind profile covers the information provided by all of the above parameters and gives a more concrete picture of a TC’s structure.

However, for simplicity and for ease of comparison with other works, we want the model to also output $V_{max}$ and $R_{34}$ in addition to the wind profile.

On the other hand, as mentioned in section 3, only 46% of the data have profiles. Meanwhile, if the $V_{max}$ of a TC is lower than 34 knots, its $R_{34}$ will naturally be 0. In other words, while every sample is guaranteed to have $V_{max}$, not every sample has a profile and an $R_{34}$ label.

Therefore, to make good use of every sample, the loss of the $V_{max}$ prediction is also added to the loss function during training. Since every sample has a $V_{max}$ label, we can ensure that there is a loss to optimize for each sample, even if it has no profile or $R_{34}$ label. To output $V_{max}$ and $R_{34}$ along with the profile, we have two approaches:

A naive way is to let the model output three predictions at the same time: $V_{max}$, $R_{34}$, and the profile. Nevertheless, even if the three outputs share most of the layers, the results may contradict each other. For example, as shown in fig. 5(a), the model may output a nonzero $R_{34}$ while $V_{max}$ is lower than 34 knots.

It is worth noting that there are direct links between the profile and the other two labels. Thus, we suggest first obtaining the profile and then inferring $V_{max}$ and $R_{34}$ from the determined profile, as shown in fig. 5(b).

In the following section, a profile will be denoted as $P$, while the i-th element in the profile will be denoted as $P_i$.

Inferring $V_{max}$ ($T_{V_{max}}$)

By definition, $V_{max}$ is the maximum wind speed in the profile, which can be calculated simply by the transformation $T_{V_{max}}$:

$$T_{V_{max}}(P) = \max_i P_i$$

Inferring $R_{34}$ ($T_{R_{34}}$)

We first find the largest index at which the wind speed in the profile is greater than 34 knots and, since the distance between adjacent points is 5 kilometers, multiply that index by 5 to obtain the inferred $R_{34}$ (in km) from the profile:

$$T_{R_{34}}(P) = 5 \cdot \max\{\, i : P_i > 34 \,\}$$

4.2 Training Objective

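For a batch of data $X$ and our model $M$, we first obtain wind profiles using the model,

$$\hat{P}_j = M(X_j)$$

where $X_j$ and $\hat{P}_j$ stand for the j-th sample and the j-th profile prediction in the batch, respectively.

Profile loss ($L_{profile}$)

We calculate the point-wise mean squared error (MSE) between the output profile $\hat{P}_j$ and the profile label $P_j$, averaged over the batch:

$$L_{profile} = \mathrm{MSE}\left(\hat{P}_j, P_j\right)$$

Note that the profile loss is optimized only for samples that have a profile label.

Intensity loss ($L_{V_{max}}$)

We first infer the $V_{max}$ prediction from the profile prediction using the transformation $T_{V_{max}}$ (section 4.1), then calculate the MSE between this prediction and the label $V_{max,j}$:

$$L_{V_{max}} = \mathrm{MSE}\left(T_{V_{max}}(\hat{P}_j), V_{max,j}\right)$$

Size loss ($L_{R_{34}}$)

We infer the $R_{34}$ prediction from the profile prediction using the transformation $T_{R_{34}}$ (section 4.1), then calculate the MSE between this prediction and the label $R_{34,j}$:

$$L_{R_{34}} = \mathrm{MSE}\left(T_{R_{34}}(\hat{P}_j), R_{34,j}\right)$$

Finally, as mentioned in section 4.1, we optimize $L_{profile}$, $L_{V_{max}}$, and $L_{R_{34}}$ simultaneously. The total loss is formulated as:

$$L = L_{profile} + \alpha L_{V_{max}} + \beta L_{R_{34}} \tag{7}$$

$\alpha$ and $\beta$ are the factors of the intensity loss and the size loss, respectively. The factors used in the experiments are provided in table 2. The details of the model structure, including every layer and the blending methods, are listed in the appendix.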
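How the three terms combine can be illustrated with a framework-free sketch. It is only an evaluation of the loss value under stated assumptions, not the training code: the dictionary keys, the function names, averaging each term over the samples that carry the corresponding label, and the default factors 0.3 and 0.1 (one of the combinations listed in table 2) are illustrative choices. How gradients pass through the inferred size is not described in the text and is not addressed here.

```python
def mean_squared_error(pred, label):
    """Point-wise mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(pred, label)) / len(pred)


def infer_vmax(profile):
    """Maximum wind speed (knots) in the profile."""
    return max(profile)


def infer_r34(profile, threshold=34.0, km_per_index=5.0):
    """Radius (km) of the outermost point above the threshold, else 0.0."""
    above = [i for i, speed in enumerate(profile) if speed > threshold]
    return km_per_index * above[-1] if above else 0.0


def average(values):
    """Mean of a list, or 0.0 when no sample carries the label."""
    return sum(values) / len(values) if values else 0.0


def total_loss(pred_profiles, samples, alpha=0.3, beta=0.1):
    """Profile loss + alpha * intensity loss + beta * size loss.

    Every sample contributes to the intensity term; the profile and size
    terms only use samples whose label is present (not None).
    """
    profile_terms, vmax_terms, r34_terms = [], [], []
    for pred, sample in zip(pred_profiles, samples):
        vmax_terms.append((infer_vmax(pred) - sample["vmax"]) ** 2)
        if sample.get("profile") is not None:
            profile_terms.append(mean_squared_error(pred, sample["profile"]))
        if sample.get("r34") is not None:
            r34_terms.append((infer_r34(pred) - sample["r34"]) ** 2)
    return (average(profile_terms)
            + alpha * average(vmax_terms)
            + beta * average(r34_terms))
```

A sample without a profile label still produces a nonzero training signal through the intensity term, which is the reason given above for including that term.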

5 Experiments and Analysis

In this section, our experiments with convolution kernel sizes and loss functions are presented first. After that, we look into several actual cases before comparing our proposed model’s performance with that of competitive models.

All models are trained with 2004-2014 TCs, validated with 2015-2016 TCs, and tested with 2017-2018 TCs.
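A year-based split of this kind can be sketched as follows. The record layout (a dictionary with a `year` field) and the function name are hypothetical; only the year ranges come from the text.

```python
def split_by_year(samples):
    """Assign each sample to train, valid, or test according to its TC year.

    Samples outside 2004-2018 are ignored. The "year" key is an assumed
    field name, not part of any documented dataset schema.
    """
    splits = {"train": [], "valid": [], "test": []}
    for sample in samples:
        year = sample["year"]
        if 2004 <= year <= 2014:
            splits["train"].append(sample)
        elif 2015 <= year <= 2016:
            splits["valid"].append(sample)
        elif 2017 <= year <= 2018:
            splits["test"].append(sample)
    return splits
```

Splitting by year rather than by image keeps consecutive frames of the same storm, which are highly correlated, out of different splits in most cases.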

5.1 Kernel Size Experiments

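| Coordinate | Kernel shape (angle, radial) | Profile RMSE (knots) | $V_{max}$ RMSE (knots) | $R_{34}$ RMSE (km) | Selected Epoch |
|---|---|---|---|---|---|
| Cartesian | (2, 2) | 14.85 | 12.51 | 70.07 | 20 |
| Cartesian | (3, 3) | 14.71 | 10.66 | 70.55 | 35 |
| Cartesian | (4, 4) | 14.89 | 11.32 | 69.27 | 20 |
| Polar | (2, 2) | 14.92 | 11.35 | 68.15 | 60 |
| Polar | (2, 3) | 14.88 | 10.78 | 66.33 | 60 |
| Polar | (2, 4) | 14.67 | 12.24 | 67.09 | 35 |
| Polar | (3, 2) | 14.53 | 12.08 | 69.33 | 30 |
| Polar | (3, 3) | 14.63 | 11.00 | 66.70 | 45 |
| Polar | (3, 4) | 14.82 | 11.21 | 68.79 | 55 |
| Polar | (4, 2) | 14.43 | 11.16 | 66.71 | 20 |
| Polar | (4, 3) | 14.28 | 11.07 | 66.48 | 40 |
| Polar | (4, 4) | 14.45 | 11.76 | 70.62 | 40 |
| Polar | (6, 3) | 14.21 | 11.22 | 69.49 | 55 |
| Polar | (8, 3) | 13.84 | 12.15 | 70.97 | 55 |

Table 1: Comparison of different kernel shapes. Since the scores fluctuate across epochs, we select the epoch based on the profile RMSE on the validation data. (4, 3) is selected as the kernel shape of the convolution layers in the final model.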

Since the proposed model is designed to be used on polar coordinates, the shape of the convolution kernel has a more specific meaning. We compared the performance of convolution kernels of different shapes. For simplicity, every model uses the same strides and the same number of convolution layers. Moreover, all convolution layers in a single model share one kernel shape. One could mix different kernel shapes and strides in a model for better performance, but for the sake of simplicity this is beyond the scope of this work.

Table 1 shows the performance of different convolution kernels. The experimental results also show that images on Cartesian coordinates provide decent $V_{max}$ estimates but fall short in predicting the profile and $R_{34}$.

Meanwhile, we can observe that the performance in estimating $V_{max}$ and $R_{34}$ is related to the kernel’s coverage along the radius. The experimental results show that a kernel covering 3 grid points along the radius performs best. In contrast, predicting the profile is more related to how large an angle the kernel covers: the larger the angle covered, the better the profile prediction. However, we also found that as the angle covered becomes larger, the model over-fits more easily, and thus the accuracy of predicting $V_{max}$ and $R_{34}$ is degraded. In the end, we choose (4, 3) as the kernel shape in the proposed model.
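To make the physical meaning of a kernel shape concrete, the sketch below converts a kernel shape in grid indices into the sector it covers, using the 2-degree and 5-km grid spacing stated in section 4. The function name is hypothetical, each index is treated as one grid cell, and the calculation describes a single first-layer kernel only; strides and stacked layers are ignored.

```python
import math


def kernel_footprint(kernel_shape, inner_radius_km,
                     deg_per_index=2.0, km_per_index=5.0):
    """Return (angle in degrees, radial extent in km, inner arc km, outer arc km).

    kernel_shape is (angle indices, radial indices). The arc lengths show
    that the same kernel spans a wider arc the farther it sits from the
    TC center, which is what makes its footprint a sector.
    """
    n_angle, n_radius = kernel_shape
    angle_deg = n_angle * deg_per_index
    radial_km = n_radius * km_per_index
    angle_rad = math.radians(angle_deg)
    arc_inner_km = inner_radius_km * angle_rad
    arc_outer_km = (inner_radius_km + radial_km) * angle_rad
    return angle_deg, radial_km, arc_inner_km, arc_outer_km


# The selected (4, 3) kernel placed 100 km from the center.
footprint = kernel_footprint((4, 3), 100.0)
```

For the (4, 3) kernel this gives an 8-degree, 15-km sector whose arc grows from roughly 14 km to roughly 16 km between its inner and outer edges at that distance.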

5.2 Loss Function Combinations

| Loss | $\alpha$ | $\beta$ | Profile RMSE (knots) | $V_{max}$ RMSE (knots) | $R_{34}$ RMSE (km) | Selected Epoch |
|---|---|---|---|---|---|---|
| Profile | 0 | 0 | 15.93 | 13.55 | 76.95 | 15 |
| Profile + $L_{R_{34}}$ | 0 | 0.1 | 15.87 | 13.19 | 72.74 | 35 |
| Profile + $L_{V_{max}}$ | 0.3 | 0 | 14.18 | 11.32 | 70.60 | 65 |
| Profile + $L_{V_{max}}$ + $L_{R_{34}}$ | 0.3 | 0.1 | 14.37 | 11.31 | 69.68 | 30 |

Table 2: Comparison of different factor combinations in the loss function. While adding the $R_{34}$ loss to the loss function provides limited improvement, optimizing $V_{max}$ at the same time helps the model learn much better. $\alpha$ and $\beta$ stand for the coefficients mentioned in eq. 7. The performance is calculated on the validation data.

We then compare the performance of various combinations of loss functions. As mentioned in section 4.1, we want the proposed model to provide high-quality profile, $V_{max}$, and $R_{34}$ predictions at the same time. Table 2 lists the performance of various combinations of these three goals in the loss function. The alpha and beta here correspond to the coefficients mentioned in eq. 7.

We can observe that, compared with the model optimizing only $L_{profile}$, the model with the additional $L_{V_{max}}$ in the loss function improves noticeably. In other words, guiding the model to place the highest point of the curve at the correct height gives the model a clear direction for fitting the whole curve, and thus greatly enhances its ability to predict not only $V_{max}$ but also the profile and $R_{34}$.

In contrast, adding $L_{R_{34}}$ to the loss function hardly improves the model. Our explanation is that, for the CNN model, the point on the curve where the wind speed equals 34 knots is much harder to grasp than the highest point of the curve. Therefore, it is better to concentrate on fitting the profile curve and let $R_{34}$ fit naturally.

According to the above results, we combine $L_{profile}$ and $L_{V_{max}}$ in the loss function of the proposed model.

5.3 Case Study

Figure 6 shows a representative case in which we compare the profile label with the predictions of (1) our best model and (2) a model optimized with the profile loss only, as well as with (3) the ASCAT observation. From the line chart, we can see that adding $L_{V_{max}}$ to the loss function helps the model fit the peak ($V_{max}$) of the predicted profile better.

On the other hand, the ASCAT observations are restricted by the limitations of the instrument. Therefore, when the wind speed is very high (i.e., in the TC inner core), ASCAT tends to under-estimate the wind speed. In this case (Figure 6), our model (green line) did a good job both in accurately estimating the high winds in the inner core compared with the best track (i.e., the maximum of the red line) and in adequately estimating the TC outer winds compared with the ASCAT profile within a radius of 100-300 km.

More cases and interesting observations will be provided in the appendix (and on GitHub after the double-blind review).

Figure 6: Comparison of the predicted profiles based on different loss functions (green and blue lines) with the profile label (red line) and the ASCAT observation (gold line, corresponding to the right panel). This figure is generated with the testing data.

5.4 Performance

Table 3: Comparing our model to the state-of-the-art methods in both intensity and size estimation.

Table 3 shows the comparison between our proposed model and the state-of-the-art models in TC intensity and size estimation, respectively.

In estimating intensity, the model proposed by Chen et al. (2019) is the state of the art to the best of our knowledge. Its performance can be further improved by smoothing the output. For the sake of fairness, we compare the performance without smoothing the output. If necessary, smoothing techniques can also be applied to our proposed CNN model for better performance.

In estimating TC size, the model of Sampson et al. (2018) obtained the best results by blending six independent models. These six models have their pros and cons and are not always available. However, simply blending the available estimates of these models with an equally-weighted average leads to better performance. On the other hand, our model can not only systematically estimate the TC size, but also be comparable in performance to the best single model Sampson et al. (2018) used for blending.
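The equally weighted blending of whichever estimates happen to be available can be illustrated with a few lines. This is a generic sketch of an equal-weight consensus, not the procedure of Sampson et al. (2018); the function name and the use of `None` to mark an unavailable model are assumptions.

```python
def blend_available(estimates):
    """Equally weighted average of the estimates that are available.

    Unavailable estimates are passed as None; returns None if no model
    produced an estimate.
    """
    available = [value for value in estimates if value is not None]
    if not available:
        return None
    return sum(available) / len(available)


# Six hypothetical size estimates (km), three of which are unavailable.
consensus = blend_available([200.0, None, 220.0, 180.0, None, None])
```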

The comparison results suggest that our proposed model can simultaneously predict $V_{max}$ and $R_{34}$ with performance comparable to state-of-the-art techniques. Moreover, our model provides the radial wind profile, giving us a more concrete picture of TC structure that no other model can provide.

6 Conclusion

This paper focuses on an influential but underdeveloped task: systematically analyzing TC structure in terms of its entire radial wind profile. A new, organized dataset with valuable labels for this task is published to support future research by data scientists. By working on polar coordinates instead of ordinary Cartesian coordinates, we proposed a specialized CNN model that uses rectangular convolution kernels instead of standard square kernels. We also found that optimizing the losses of intensity estimation and structure estimation at the same time noticeably improved our model.

With a properly designed model structure and a carefully composed loss function, our proposed model provides comparable predictions of a TC’s size, intensity, and wind profile simultaneously. Most importantly, the prediction is achieved systematically and objectively using high-availability data, which leads to a more reliable and timely (every 3 h, compared with longer than 6 h before) TC forecasting system.


  • C. Bai, B. Chen, and H. Lin (2019) Attention-based deep tropical cyclone rapid intensification prediction. arXiv preprint arXiv:1909.11616. Cited by: §1.
  • M. Bender, M. Morin, K. Emanuel, J. Knaff, C. Sampson, I. Ginis, and B. Thomas (2016) Impact of storm structure and the environmental conditions in the rapid intensification of hurricanes katrina and patricia. In 32nd Conf. on Hurricanes and Tropical Meteorology, Cited by: §1.
  • K. T. Chan and J. C. Chan (2012) Size and strength of tropical cyclones as inferred from quikscat data. Monthly weather review 140 (3), pp. 811–824. Cited by: §2.1.
  • B. Chen, B. Chen, and H. Lin (2018) Rotation-blended cnns on a new open dataset for tropical cyclone image-to-intensity regression. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 90–99. Cited by: §A.2, §3, item 2.
  • B. Chen, B. Chen, H. Lin, and R. L. Elsberry (2019) Estimating tropical cyclone intensity by satellite imagery utilizing convolutional neural networks. Weather and Forecasting 34 (2), pp. 447–465. Cited by: §A.1, Appendix A, §1, §1, item 2, §4, §5.4.
  • J. L. Demuth, M. DeMaria, J. A. Knaff, and T. H. Vonder Haar (2004) Evaluation of advanced microwave sounding unit tropical-cyclone intensity and size estimation algorithms. Journal of Applied Meteorology 43 (2), pp. 282–296. Cited by: §1.
  • J. Figa-Saldaña, J. J. Wilson, E. Attema, R. Gelsthorpe, M. R. Drinkwater, and A. Stoffelen (2002) The advanced scatterometer (ascat) on the meteorological operational (metop) platform: a follow on for european wind scatterometers. Canadian Journal of Remote Sensing 28 (3), pp. 404–412. Cited by: §1, §2.2.
  • G. J. Holland and R. T. Merrill (1984) On the dynamics of tropical cyclone structural changes. Quarterly Journal of the Royal Meteorological Society 110 (465), pp. 723–745. Cited by: §2.1.
  • J. A. Knaff, M. DeMaria, D. A. Molenar, C. R. Sampson, and M. G. Seybold (2011) An automated, objective, multiple-satellite-platform tropical cyclone surface wind analysis. Journal of applied meteorology and climatology 50 (10), pp. 2149–2166. Cited by: §1, §1, §2.2, §2.2.
  • J. A. Knaff, S. P. Longmore, and D. A. Molenar (2014) An objective satellite-based tropical cyclone size climatology. Journal of Climate 27 (1), pp. 455–476. Cited by: §1, §2.2.
  • J. A. Knaff, C. J. Slocum, K. D. Musgrave, C. R. Sampson, and B. R. Strahl (2016) Using routinely available information to estimate tropical cyclone wind structure. Monthly Weather Review 144 (4), pp. 1233–1247. Cited by: §1, §1, §2.1.
  • A. Krizhevsky, I. Sutskever, and G. E. Hinton (2012) Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097–1105. Cited by: §1.
  • D. Matsuoka, M. Nakano, D. Sugiyama, and S. Uchida (2018) Deep learning approach for detecting tropical cyclones and their precursors in the simulation by a cloud-resolving global nonhydrostatic atmospheric model. Progress in Earth and Planetary Science 5 (1), pp. 80. Cited by: §1.
  • M. Morris and C. S. Ruf (2017) Determining tropical cyclone surface wind speed structure and intensity with the cygnss satellite constellation. Journal of Applied Meteorology and Climatology 56 (7), pp. 1847–1865. Cited by: §1, §3.1.
  • M. D. Powell and T. A. Reinhold (2007) Tropical cyclone destructive potential by integrated kinetic energy. Bulletin of the American Meteorological Society 88 (4), pp. 513–526. Cited by: §1.
  • C. R. Sampson, E. M. Fukada, J. A. Knaff, B. R. Strahl, M. J. Brennan, and T. Marchok (2017) Tropical cyclone gale wind radii estimates for the western north pacific. Weather and Forecasting 32 (3), pp. 1029–1040. Cited by: §2.2.
  • C. R. Sampson, J. S. Goerss, J. A. Knaff, B. R. Strahl, E. M. Fukada, and E. A. Serra (2018) Tropical cyclone gale wind radii estimates, forecasts, and error forecasts for the western north pacific. Weather and Forecasting 33 (4), pp. 1081–1092. Cited by: §1, §2.2, §5.4.
  • C. R. Sampson and J. A. Knaff (2015) A consensus forecast for tropical cyclone gale wind radii. Weather and Forecasting 30 (5), pp. 1397–1403. Cited by: §1.
  • V. Tallapragada (2015) Hurricane weather research and forecasting (HWRF) model: 2015 scientific documentation. NCAR Developmental Testbed Center, Boulder, CO. Cited by: §1.
  • C. Velden, B. Harper, F. Wells, J. L. Beven, R. Zehr, T. Olander, M. Mayfield, C. Guard, M. Lander, R. Edson, et al. (2006) The Dvorak tropical cyclone intensity estimation technique: a satellite-based method that has endured for over 30 years. Bulletin of the American Meteorological Society 87 (9), pp. 1195–1210. Cited by: §1.
  • C. L. Weatherford and W. M. Gray (1988) Typhoon structure as revealed by aircraft reconnaissance. part ii: structural variability. Monthly Weather Review 116 (5), pp. 1044–1056. Cited by: §2.1.
  • A. Wimmers, C. Velden, and J. H. Cossuth (2019) Using deep learning to estimate tropical cyclone intensity from satellite passive microwave imagery. Monthly Weather Review 147 (6), pp. 2261–2282. Cited by: §1.
  • Q. Yang, C. Lee, and M. K. Tippett (2020) A long short-term memory model for global rapid intensification prediction. Weather and Forecasting 35 (4), pp. 1203–1220. Cited by: §1.

Appendix A Implementation details

This appendix briefly explains two techniques proposed in previous work, auxiliary features and rotation blending, and then details the model structure. Please refer to Chen et al. (2019) for more details.

A.1 Auxiliary Features

In addition to the output of the convolution layers, auxiliary features are appended before the fully-connected layers. These auxiliary features have been shown to improve the precision of the estimation (Chen et al. (2019)). They provide clues such as (1) day of year, which carries seasonal information; (2) local time; and, most influential of all, (3) one-hot encoded region codes, where each code is one of {WPAC, EPAC, CPAC, ATLN, IO, SH}, representing 6 different basins.
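As a rough illustration of how such a feature vector could be assembled, the sketch below builds one from the three kinds of clues listed above. The sine/cosine encoding of day of year and local time is an assumption made here for illustration and is not stated in this appendix; it happens to yield 10 values, the same count that table 4 reports for the additional features, but the authors' exact encoding may differ. The function and constant names are hypothetical.

```python
import math

# Basin codes as listed in the text; their order here is an arbitrary choice.
REGIONS = ["WPAC", "EPAC", "CPAC", "ATLN", "IO", "SH"]


def cyclic(value, period):
    """Encode a periodic quantity as (sin, cos) so both ends of the cycle meet.

    ASSUMPTION: the paper does not say how day of year or local time are encoded.
    """
    angle = 2.0 * math.pi * value / period
    return [math.sin(angle), math.cos(angle)]


def auxiliary_features(day_of_year, local_hour, region):
    """Return a flat list: 2 seasonal values, 2 local-time values, 6 one-hot values."""
    if region not in REGIONS:
        raise ValueError(f"unknown region code: {region}")
    one_hot = [1.0 if region == code else 0.0 for code in REGIONS]
    return cyclic(day_of_year, 365.25) + cyclic(local_hour, 24.0) + one_hot


features = auxiliary_features(day_of_year=250, local_hour=6.0, region="WPAC")
```

In a model of this kind, the resulting vector would be concatenated with the flattened convolution output before the first linear layer.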

A.2 Rotation Blending

Because a TC is a rotating weather system, TC data are approximately rotation invariant. That is, rotating an image about the TC center usually does not affect the estimate of the TC intensity. Chen et al. (2018) demonstrated that using rotation for augmentation leads to a significant improvement in performance.

During the training phase, each image is rotated by a random angle before being fed into our model. At inference, each image is rotated by ten evenly spaced angles between 0° and 360° to collect 10 different estimates. Afterward, these estimates are blended to obtain the final estimate.

Note that rotating an image in polar coordinates amounts to merely rolling the image upward (fig. 7). In this work, transforming the images from Cartesian coordinates to polar coordinates therefore greatly reduces the computational load.

Figure 7: A schematic showing how images on polar coordinates are rotated.
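The rolling operation and the inference-time blending can be sketched as follows. This is a minimal sketch under stated assumptions: the azimuth is taken to run along the rows of the polar image (axis 0), the blend is taken to be a plain mean of the ten estimates (the text only says they are "blended"), and `predict` is a hypothetical stand-in for the trained model.

```python
import numpy as np


def rotate_polar(image, angle_deg):
    """Rotate a polar-coordinate image by rolling it along the azimuth axis.

    ASSUMPTION: axis 0 is the azimuth, covering 0 to 360 degrees uniformly.
    A negative shift moves rows upward, matching "rolling the image upward".
    """
    n_angles = image.shape[0]
    shift = int(round(angle_deg / 360.0 * n_angles))
    return np.roll(image, -shift, axis=0)


def blended_estimate(predict, image, n_rotations=10):
    """Average the model output over evenly spaced rotations of one image.

    ASSUMPTION: blending is a simple mean; `predict` is any callable that
    maps an image to a scalar or an array.
    """
    angles = np.arange(n_rotations) * (360.0 / n_rotations)
    outputs = [predict(rotate_polar(image, a)) for a in angles]
    return np.mean(outputs, axis=0)
```

Because the rotation is a cyclic shift of rows, no interpolation or resampling is needed, unlike rotating an image on a Cartesian grid.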

A.3 Model Structure

The model structure for the CNN-profiler is detailed in table 4.

Operation    Kernel  Strides  Dim.  BN  Activation
BN           -       -        -     Y   -
conv         4x3     2x2      16    Y   relu
conv         4x3     2x2      32    Y   relu
conv         4x3     2x2      64    Y   relu
conv         4x3     2x2      128   Y   relu
conv         4x3     2x2      256   Y   relu
conv         4x3     2x2      512   Y   relu
concatenate 10 additional features
linear       -       -        256   Y   relu
linear       -       -        64    Y   relu
linear       -       -        151   N   -
Table 4: Model structure of the CNN-profiler. The first batch normalization layer serves as z-score normalization. After the convolution layers, the 10-dimensional auxiliary features described in section A.1 are passed into the linear layers along with the convolution layers' output.
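To see how the six strided convolutions in table 4 shrink the feature map, the sketch below traces the spatial shape layer by layer and computes the width of the vector entering the first linear layer. The input size (180 x 103) and the padding mode are hypothetical: table 4 states neither, so both are assumptions chosen only to make the arithmetic concrete.

```python
import math

CONV_CHANNELS = [16, 32, 64, 128, 256, 512]  # "Dim." column of table 4
KERNEL = (4, 3)                              # kernel height x width
STRIDES = (2, 2)
N_AUX = 10                                   # auxiliary features concatenated after the convs


def conv_out(size, kernel, stride, padding):
    """Output length of one convolution along a single axis."""
    if padding == "same":    # zero-padded so that output = ceil(size / stride)
        return math.ceil(size / stride)
    if padding == "valid":   # no padding
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding mode: {padding}")


def trace_shapes(height, width, padding="same"):
    """Return the (height, width, channels) shape after each conv layer."""
    shapes = []
    for channels in CONV_CHANNELS:
        height = conv_out(height, KERNEL[0], STRIDES[0], padding)
        width = conv_out(width, KERNEL[1], STRIDES[1], padding)
        if height < 1 or width < 1:
            raise ValueError("input too small for six strided convolutions")
        shapes.append((height, width, channels))
    return shapes


def linear_input_dim(height, width, padding="same"):
    """Flattened conv output plus the auxiliary features."""
    h, w, c = trace_shapes(height, width, padding)[-1]
    return h * w * c + N_AUX


# ASSUMPTION: a hypothetical 180 x 103 polar input with "same" padding.
for shape in trace_shapes(180, 103):
    print(shape)
```

With stride 2 in both directions, each layer roughly halves the height and width, so the input must be large enough (or padded) to survive six such reductions.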

Appendix B Extended Case Study

In fig. 8, we provide 6 cases from different TCs to compare the predicted profiles with the ASCAT observations. We make several observations:

  1. If the Vmax loss is added to the loss function, the model tends to 'tap' the maximum velocity with a sharp peak (the middle right case). In contrast, if the Vmax loss is not included in the loss function, the model produces much smoother predictions.

  2. In every case, the model with the additional Vmax loss produces higher curves, which in most cases are more similar to the corresponding profile labels. However, there are still sporadic exceptions (the upper left case).

  3. ASCAT, although the most reliable tropical cyclone size estimation technique so far, is likely to underestimate the velocity when the actual velocity is extreme. In contrast, our model provides reliable estimates in both the inner and outer core of tropical cyclones.
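The first observation can be illustrated with a toy loss. The exact form of the Vmax loss is defined in the main text rather than in this appendix, so the sketch below is only an assumption: it takes the profile loss to be a mean squared error over the profile points and the Vmax term to be the squared difference between the predicted and labeled peak wind, combined with a hypothetical weight `vmax_weight`.

```python
def profile_loss(pred, label):
    """Mean squared error over all points of the wind profile (assumed form)."""
    return sum((p - q) ** 2 for p, q in zip(pred, label)) / len(label)


def vmax_loss(pred, label):
    """Squared error between predicted and labeled peak wind (assumed form)."""
    return (max(pred) - max(label)) ** 2


def total_loss(pred, label, vmax_weight=0.0):
    """Profile loss plus an optional, hypothetically weighted Vmax term."""
    return profile_loss(pred, label) + vmax_weight * vmax_loss(pred, label)


# A smooth prediction that matches the label everywhere except at the peak.
label = [10.0, 50.0, 20.0]
smooth = [10.0, 40.0, 20.0]

without_vmax = total_loss(smooth, label)
with_vmax = total_loss(smooth, label, vmax_weight=0.5)
```

Under these assumptions, a prediction that flattens the peak incurs a small profile error averaged over all points but a large Vmax error, which is consistent with the sharper peaks seen when the Vmax term is included.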

Figure 8: An extended version of fig. 6. This figure compares the predicted profiles based on different loss functions to the profile label and the ASCAT observation using 6 cases from different TCs.