Segmentation Network with Compound Loss Function for Hydatidiform Mole Hydrops Lesion Recognition

Pathological morphology diagnosis is the standard method for diagnosing hydatidiform mole. Hydatidiform mole is a disease with malignant potential, and the hydrops lesions visible in hydatidiform mole sections are an important basis for diagnosis. Because lesions are incompletely developed, early hydatidiform mole is difficult to distinguish, which leads to low accuracy in clinical diagnosis. Image semantic segmentation networks, a prominent machine learning technology, have been used in many medical image recognition tasks. We developed a hydatidiform mole hydrops lesion segmentation model based on a novel loss function and training method. The model consists of different networks that segment the section image at the pixel and lesion levels. Our compound loss function assigns weights to the segmentation results of the two levels to calculate the loss. We then propose a stagewise training method to combine the advantages of various loss functions at different levels. We evaluate our method on a hydatidiform mole hydrops dataset. Experiments show that the proposed model with our loss function and training method has good recognition performance under different segmentation metrics.


1 Introduction

Hydatidiform mole (HM) is one of the most common gestational trophoblastic diseases (GTD), occurring in about 1 in 500-1000 pregnancies [1]-[2]. HM has a certain probability of developing into invasive HM or choriocarcinoma, and most HM fetuses are unviable, or the HM grows into a teratoma [3]-[6]. HM is commonly found in women under age 17 or over age 35, and can be partial or complete. Although moles can be identified using different types of diagnostic methods, the final diagnosis must be confirmed by pathologists. Pathological diagnosis of tissue sections is the gold standard for the diagnosis of HM. Pathologists generally observe multiple sections from a subject under a microscope at two magnifications and make a comprehensive diagnosis according to experience and the morphology of the sections. Fig. 1 shows a well-developed complete HM under a microscope. Pathologists mainly diagnose by observing the villi characteristics of HM in the sections [5]. With partial HM, there is local hydrops of the villi stroma and local trophoblast hyperplasia at the edge of the villi; with complete HM, there is hydrops of the entire villi stroma and diffuse hyperplasia of trophoblastic cells at the edge of the villi [6]. Therefore, hydrops lesions are an important basis for HM diagnosis.

In actual pathological work, HM before 12 weeks of pregnancy is often morphologically confused with non-HM pregnancy and other diseases because of incomplete development [7]-[8]. Pathologists must therefore spend substantial time on the diagnosis, which lowers diagnostic efficiency. There is a need for an auxiliary diagnosis system for HM that improves diagnostic accuracy and reduces missed diagnoses and misdiagnoses.

Computer-aided diagnosis has been widely adopted in clinical practice in recent years, especially neural-network-based medical image recognition algorithms [28]-[37]. Image semantic segmentation networks based on deep learning have been shown to increase the accuracy and efficiency of diagnosis in many medical imaging lesion segmentation tasks. However, there is little research on intelligent diagnosis of HM lesions based on deep learning [3]. To address these issues, we propose an intelligent auxiliary diagnosis method that can identify HM lesions under a microscope in real time. The contributions of this paper are as follows: (1) A semantic segmentation model for HM hydrops lesion segmentation is constructed. HM hydrops lesion segmentation models with different networks, feature-extraction networks, and loss functions are tested and evaluated. Experimental results confirm that the network model segments the hydrops dataset well. (2) A compound loss function that combines pixel- and lesion-level loss for multiple evaluation metrics is constructed. Compared with traditional loss functions, it shows good performance on both pixel- and lesion-level evaluation metrics. We also propose a stagewise training method with multiple loss functions, which we show experimentally to improve segmentation results.

Beyond assisting the diagnosis of HM, this model can be generalized, since its components are not dedicated to recognizing HM lesions alone but are designed to extract the features needed for the corresponding diagnostic process. Our proposed method can therefore also assist in diagnosing other diseases by extracting the corresponding pathological features.

Fig. 1: Complete HM under microscope.

2 Related Work

Here we discuss semantic segmentation neural networks, which have been widely applied to medical images, and pathological section image segmentation.

2.1 Semantic Segmentation Network

Semantic segmentation is pixel-level classification. Before the advent of deep learning, traditional classifiers were generally designed for a single category and were greatly limited by their features [9]-[13]. Deep learning has simplified the semantic segmentation pipeline and obtained better segmentation results. The fully convolutional network (FCN) [14] extracts features and upsamples high-level semantic features to the specified dimensions to obtain the final prediction results, which naturally forms an encoder-decoder framework. Classical algorithms, such as U-Net [15], the Feature Pyramid Network (FPN) [16], SegNet [17], PSPNet [18], and LinkNet [19], all have an encoder-decoder structure. The DeepLab series introduces atrous convolution, which controls the receptive field size of a model and can therefore obtain feature information over different ranges [20]-[22]. DeepLabv3+ [22] combines spatial pyramid pooling with an encoder-decoder structure, and also builds a more powerful network by using a modified aligned Xception and atrous separable convolution.

2.2 Pathological Section Image Segmentation

Most early lesion identification methods for pathological sections are unsupervised or semi-supervised. These methods use the lesion's own features in the section images as descriptors and apply thresholding, clustering, similarity measures, and other methods to segment the lesion [23]-[27]. For example, one machine learning approach employs fuzzy C-means clustering in the hue, saturation, and value color space [27] to delineate the hydatidiform mole villi area correctly. Wavelet-based texture features [23] were used to segment lesions in hyperspectral human colon tissue cell images. MLSeg [25] defines a new set of high-level texture descriptors to represent prior knowledge of colon tissue and uses them in unsupervised multi-level segmentation algorithms. Semi-supervised learning for both spectral dimension reduction and hierarchical pixel clustering [24] was applied to hyperspectral images of tissue samples and successfully segmented the pixels of different cell types. Although traditional lesion recognition methods have been applied successfully in pathological practice, they are limited by the weak expressive ability of their feature operators, which costs recognition accuracy and generalization performance, the same disadvantage that manual features have in image processing.

Image segmentation networks based on deep learning have strong learning ability and greatly improve the accuracy of lesion recognition in pathological images. Segmentation networks commonly used for assisting diagnosis include CNNs [31]-[33], U-Net [15][29][30], and the DeepLab series [34]. Building on these networks and on the characteristics of pathological image datasets, more targeted network models have been proposed. DRA-Net, combined with a new computer-aided cancer diagnosis framework based on Session Histopathological Image Recommendation (SHIR) [28], successfully learned the pathological knowledge needed for lesion recognition using only WSI tags. A human-like automatic diagnostic network [35] uses a structure comprising a scanning network (s-net), a diagnostic network (d-net), and an aggregation network (a-net) for urothelial carcinoma biopsy data from bladder cancer subjects.

Neural networks have already produced preliminary results in the pathological diagnosis of hydatidiform mole. P. Pal et al. [3] classified pathological sections of hydatidiform mole villi into normal, PHM, or CHM categories based on various characteristics of hydatidiform mole using three fully connected networks. The overall accuracy on the validation dataset reaches 86.1%. However, this method relies on hydatidiform mole section images with complete structures and edges, and it can only extract superficial image features and classify the whole image. The algorithm has a simple structure and cannot locate and segment the lesion within the whole image. In this paper, our method uses an efficient semantic segmentation network to accurately segment the input pathological images and provide doctors with more convincing results.

3 Proposed Method

3.1 Dataset and Labeling

The data used in this paper are from the Third Affiliated Hospital of Zhengzhou University. The data usage was approved by the Ethics Committee of the Third Affiliated Hospital of Zhengzhou University. We selected 157 HM sections from 59 subjects and scanned them with a Motic section scanner to form the main section dataset. Subjects' ages ranged from 25 to 38, and the amenorrhea time ranged from 30 to 92 days. Most sections had complete or partial HM, and the remaining sections had several diseases that are easily confused with HM.

With the help of pathologists, we labeled hydrops, hyperplasia, and villi on the scanned section images. All labeling results were examined and approved by clinical pathologists. We used the Motic digital section assistant system to obtain section images and annotation files. After data conversion and image processing, we obtained the HM section images and the corresponding hydrops lesion label images shown in Fig. 2.

Fig. 2: Scan and label mask of HM. Left: HM section scan under microscope; Right: Label mask of HM.

3.2 Diagnostic Deep Network

Mainstream semantic segmentation network structures include DeepLabv3+, U-Net, and FPN, all of which have an encoder-decoder architecture on a macro level. We use encoder-decoder architecture to construct the hydrops lesion segmentation model. As shown in Fig. 3(a), the input is the preprocessed HM section, and the output is the label map of the hydrops lesion of the HM section.

In the decoder in Fig. 3(a), convolution and upsampling bring the feature maps to the same size, and the features P1, P2, and P3 obtained at different scales by the feature-extraction network are merged. Networks such as DeepLabv3+, U-Net, and FPN use different merge methods.

Fig. 3: Overview of the hydrops lesion segmentation model.(a) Framework of hydrops lesion segmentation model. (b) U-Net. (c) FPN. (d) LinkNet. (e) PSPNet.

U-Net is widely used for lesion segmentation in medical images. Its structure is shown in Fig. 3(b). The encoder extracts multiscale features, and each decoder stage produces output features of the same size as the corresponding encoder stage. The feature maps of the encoder and decoder are then combined by concatenation. U-Net compensates for the information lost in upsampling by merging feature maps of the same resolution between the encoder and decoder, and it provides multiscale information for semantic segmentation by retaining both deep and shallow feature information.

Unlike U-Net, FPN fuses two feature maps by adding them directly, and the fused feature map is upsampled to obtain multiscale prediction results. The segmentation results obtained from upsampling each layer are combined to obtain the final result.

The pyramid scene parsing network (PSPNet) constructs a pyramid pooling module to obtain multiscale information. As shown in Fig. 3(e), PSPNet feeds the feature map to the pyramid pooling module. The pooled feature maps are convolved and upsampled to the same scale as the input feature map of the pyramid pooling layer. The four resulting feature maps, which come from different scales, are then merged with the input feature map of the pyramid pooling layer. LinkNet has a structure similar to U-Net, but the features output by the encoder at different scales are added directly to the corresponding feature maps of the decoder, which is similar to FPN. LinkNet also optimizes the network structure, so it obtains better semantic segmentation results without adding parameters (see Fig. 3(d)).

Fig. 4: (a) Structure of DeepLabv3+. (b) DeepLabv3+ network structure on HM lesion segmentation.

DeepLabv3+ is the latest version of DeepLab and is the main network used in this paper. The network structure is shown in Fig. 4. DeepLabv3+ uses atrous convolution, in which the kernel is applied not to adjacent feature points but to nine feature points spaced at an interval given by the rate. Convolution at different scales is achieved with different rates. In particular, when the rate is 1, atrous convolution is the same as traditional convolution. Atrous convolution can expand the receptive field and capture multiscale feature information, as shown in Fig. 4(a). DeepLab runs atrous convolutions at multiple rates in parallel and combines their outputs. Since the rate can be chosen freely, the network can be adapted to lesions of different scales during parameter adjustment, which is of great significance for the segmentation of HM hydrops lesions.
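The effect of the rate can be illustrated with a minimal one-dimensional sketch. It is not the DeepLabv3+ implementation: the function names `atrous_conv1d` and `receptive_field` are hypothetical, and the example uses a 3-tap kernel on a list instead of a 3x3 kernel on a feature map. It only shows that the same weights cover a wider span as the rate grows, and that rate 1 reduces to ordinary convolution.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Valid 1-D atrous (dilated) correlation: kernel taps are `rate` samples apart."""
    span = (len(kernel) - 1) * rate + 1  # input samples covered by one output value
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + k * rate] for k, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size, rate):
    """Width covered by a kernel of `kernel_size` taps at the given rate."""
    return (kernel_size - 1) * rate + 1


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

dense = atrous_conv1d(signal, kernel, rate=1)    # identical to ordinary convolution
dilated = atrous_conv1d(signal, kernel, rate=2)  # same 3 weights, 5-sample span
```

With three taps per axis, a rate of 2 covers a span of 5, so a 3x3 kernel would see a 5x5 neighbourhood without any additional weights.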

3.3 Compound Loss Function

We conducted experiments on the commonly used loss functions BCELoss, DiceLoss, IoULoss, and IoULoss-based FocalLoss. Although IoULoss and BCELoss both perform well on the dataset, they are insensitive to lesion-level evaluation metrics. Detailed experiments and results are given in the Experiment section. In addition, in clinical practice, doctors focus on different sections, so the model requires a higher recall rate. Therefore, we propose a compound loss function that takes into account not only pixel- and lesion-level loss but also both recall loss and precision loss:




where a weight factor balances the two levels; PixelLoss and LesionLoss are pixel- and lesion-level losses, respectively; a focal loss term reduces the impact of category imbalance; and the remaining symbols are constants. The proposed compound loss function takes into account the recall loss and precision loss, which are weighted in LesionLoss and PixelLoss. The latter is defined as


where a weighting factor balances the pixel-level precision loss and recall loss. The pixel-level terms are computed from the number of pixels in the intersection of the prediction and the ground truth, and the lesion-level terms from the number of lesions accurately predicted. LesionLoss is defined as


The compound loss function combines pixel- and lesion-level loss, which ensures the improvement of the segmentation results of two levels. The calculation of lesion-level loss is based on the connected domain of a single lesion, where small and large lesions have the same weight.

By adding the two weighting factors, the compound loss function allows models to be tailored to their use. A model trained with greater emphasis on recall loss has a better recall rate and provides more potential lesions for pathologists. When precision loss and recall loss are weighted equally, the model balances recall and precision and has more comprehensive performance. In practical applications, the intersection of the two models' predictions represents an area with a greater probability of hydrops lesions, and their difference represents an area with a certain probability of hydrops lesions.
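The structure described in this section can be sketched as follows. This is an illustrative reading, not the exact formula of the paper: the names `level_loss`, `compound_loss`, `lesion_weight`, and `precision_weight` are hypothetical, the linear weighting of each pair of terms is an assumption, and the focal term is omitted. Each level is summarized by three counts (correct, predicted, actual), in pixels for the pixel level and in connected lesions for the lesion level.

```python
def level_loss(true_positive, n_predicted, n_actual, precision_weight, eps=1e-7):
    """Weighted sum of precision loss and recall loss at one level (assumed form)."""
    precision_loss = 1.0 - (true_positive + eps) / (n_predicted + eps)
    recall_loss = 1.0 - (true_positive + eps) / (n_actual + eps)
    return precision_weight * precision_loss + (1.0 - precision_weight) * recall_loss


def compound_loss(pixel_counts, lesion_counts, lesion_weight=0.5, precision_weight=0.5):
    """Combine pixel- and lesion-level losses; each counts tuple is (tp, predicted, actual)."""
    pixel_loss = level_loss(*pixel_counts, precision_weight)
    lesion_loss = level_loss(*lesion_counts, precision_weight)
    return (1.0 - lesion_weight) * pixel_loss + lesion_weight * lesion_loss


# 80 of 100 predicted pixels are correct out of 90 lesion pixels;
# 3 of 4 predicted lesions are correct out of 5 real lesions.
example = compound_loss((80, 100, 90), (3, 4, 5))
```

In this sketch, a precision weight of 0 leaves only the recall term, which a model can minimize by predicting lesions everywhere. In actual training the counts would have to be computed from soft network outputs so that the loss remains differentiable.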

3.4 Stagewise Training Strategy

To perform well on a variety of evaluation metrics, we propose a stagewise training strategy that combines the advantages of multiple loss functions, as shown in Fig. 5.

Fig. 5: Flow diagram of stagewise training strategy.

In the first stage, IoULoss or BCEWithLogitsLoss is used to train the model, and only the model with the best pixel-level IoU on the validation set is saved, so as to ensure pixel-level performance. In the second stage, the model is trained with the compound loss function, and again only the model with the best pixel-level IoU on the validation set is saved, so as to maintain pixel-level performance while improving lesion-level performance. Experimental results show that this training method effectively integrates the advantages of different loss functions and improves the performance of the model under multiple evaluation metrics.
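A minimal sketch of this two-stage procedure is given below. The callables `train_epoch` and `validate_iou` are hypothetical placeholders for one epoch of training and for pixel-level IoU on the validation set. The learning rates and the 50 epochs per stage follow Experiment 1 in Section 4.4. Resuming the second stage from the best first-stage checkpoint, and keeping that checkpoint if the second stage never improves on it, are assumptions.

```python
def train_stage(state, train_epoch, validate_iou, loss_name, lr, epochs, best=None):
    """Run one stage and keep the state with the best validation IoU seen so far."""
    best_iou, best_state = best if best is not None else (float("-inf"), state)
    for _ in range(epochs):
        state = train_epoch(state, loss_name, lr)
        iou = validate_iou(state)
        if iou > best_iou:  # checkpoint only on improvement
            best_iou, best_state = iou, state
    return best_iou, best_state


def stagewise_training(initial_state, train_epoch, validate_iou, epochs_per_stage=50):
    """Stage 1: pixel-level loss. Stage 2: compound loss at a lower learning rate."""
    best = train_stage(initial_state, train_epoch, validate_iou,
                       "iou", 1e-4, epochs_per_stage)
    # Assumption: stage 2 starts from the best stage-1 checkpoint.
    best = train_stage(best[1], train_epoch, validate_iou,
                       "compound", 1e-5, epochs_per_stage, best=best)
    return best
```

Because the selection criterion in both stages is pixel-level IoU, the second stage can only replace the first-stage model with one that is at least as good at the pixel level.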

4 Experiment

Forty-two sections with typical characteristics of hydrops lesions were selected from 157 sections as the section dataset.

4.1 Preprocessing

4.1.1 Data cleaning

A sliding-window method was used to crop HM sections and expand the dataset. The window size is set to 9000 pixels according to the visual field size under the microscope. At this size, the whole hydrops area of a villus is generally visible within the cropped window. The sliding step is half the window size. Each cropped image (called "image" below) of the HM and its label mask constitute a set of training data. Since lesions account for a small area of HM sections, it is difficult to train a suitable hydrops lesion segmentation model when all images are included in the dataset.
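The cropping scheme can be sketched as below, with a 9000-pixel window and a step of half the window. The function names are hypothetical, and dropping windows that would extend past the section border is an assumption, since the handling of borders is not described.

```python
def window_origins(length, window, step):
    """Start coordinates along one axis for windows that fit entirely inside."""
    if length < window:
        return []
    return list(range(0, length - window + 1, step))


def crop_boxes(width, height, window=9000):
    """(left, top, right, bottom) boxes for a sliding window with 50% overlap."""
    step = window // 2
    return [(x, y, x + window, y + window)
            for y in window_origins(height, window, step)
            for x in window_origins(width, window, step)]


# A hypothetical 27000 x 18000 pixel scan gives 5 columns x 3 rows of windows.
boxes = crop_boxes(27000, 18000)
```

The same boxes would be applied to the section scan and to its label mask so that each image stays aligned with its mask.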

To reduce the class imbalance of the dataset, cleaning yielded 3078 images with hydrops areas and 3724 images without hydrops areas as candidates for the training and validation dataset. Images without hydrops areas were then randomly selected with probability 0.025, which retained 96 of them. The resulting 3174 images were split into training and validation sets at a ratio of 9:1.
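The selection and split can be sketched as follows. The function name, the fixed seed, and the random shuffle before splitting are assumptions. As a consistency check on the reported numbers, 3724 x 0.025 is about 93 expected background images, close to the 96 retained, and 3078 + 96 = 3174.

```python
import random


def subsample_and_split(lesion_patches, background_patches, keep_prob=0.025,
                        train_fraction=0.9, seed=0):
    """Keep every lesion patch, keep each background patch with `keep_prob`, split 9:1."""
    rng = random.Random(seed)
    kept_background = [p for p in background_patches if rng.random() < keep_prob]
    patches = list(lesion_patches) + kept_background
    rng.shuffle(patches)  # assumption: the split is random
    n_train = round(len(patches) * train_fraction)
    return patches[:n_train], patches[n_train:]


# Placeholder patch identifiers standing in for the cropped images.
lesion = [("hydrops", i) for i in range(3078)]
background = [("background", i) for i in range(3724)]
train, valid = subsample_and_split(lesion, background)
```

With 3174 images, a 9:1 ratio corresponds to roughly 2857 training images and 317 validation images.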

To evaluate model performance on new sections, the test images come from subjects different from those in the training and validation dataset. The testing dataset, cleaned in the same way, contains 330 images.

Dataset      Class                    Hydrops and non-hydrops   Hydrops and 2.5% non-hydrops
Train&Valid  Hydrops area ratio       5.2%                      11.1%
             Non-hydrops area ratio   94.8%                     88.9%
Test         Hydrops area ratio       4.3%                      15.6%
             Non-hydrops area ratio   95.7%                     84.4%
TABLE I: Image ratio of hydrops area and non-hydrops area

The dataset retains part of the background so that the hydrops lesion segmentation model is more robust and better able to discriminate, which helps prevent misdiagnosis. Data cleaning increases the proportion of hydrops area to more than 10% (see Table 1), which reduces the impact of class imbalance.

4.1.2 Data Augmentation

Data augmentation expands the training set, with methods including horizontal flipping, rotation, scaling, and translation. Each method corresponds to a situation that can occur in real sections. The final image transformation combines multiple transformations, each applied with a probability of 0.5. Random online data augmentation is performed on each batch in every round of model training.
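The combination rule can be sketched on small nested lists as below. Only flipping and a 90-degree rotation are shown, as simplified stand-ins for the four transformations named above, and the function names are hypothetical. The point of the sketch is that each transformation is drawn independently with probability 0.5 and that the image and its label mask always receive the same transformation.

```python
import random


def hflip(grid):
    """Mirror each row left to right."""
    return [row[::-1] for row in grid]


def rot90(grid):
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def augment(image, mask, rng, p=0.5, transforms=(hflip, rot90)):
    """Apply each transform with probability p, identically to image and mask."""
    for transform in transforms:
        if rng.random() < p:  # one draw per transform, shared by image and mask
            image, mask = transform(image), transform(mask)
    return image, mask


rng = random.Random(0)
image = [[1, 2], [3, 4]]
mask = [[0, 1], [0, 0]]
aug_image, aug_mask = augment(image, mask, rng)
```

Calling `augment` on every batch in every epoch gives the online behaviour described above, since a fresh random draw is made each time.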

4.1.3 Metrics

There are pixel- and lesion-level metrics. A lesion-level evaluation indicator takes into account the needs of the pathological diagnosis of HM, as pathologists mainly consider a single lesion for diagnosis. The metrics are intersection over union (IoU), recall (Rec), and precision (Pre). IoU is for the comprehensive evaluation of lesion segmentation performance. Rec can evaluate the severity of a missed diagnosis, and Pre measures the amount of misdiagnosis.
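The three metrics can be computed from true-positive, false-positive, and false-negative counts, as in the sketch below. The function names are hypothetical. At the pixel level the counts are pixels; at the lesion level they would be connected lesions, and `count_lesions` illustrates that unit with 4-connected components. The rule for deciding when a predicted lesion matches a real one is not specified here, so lesion matching is left out.

```python
from collections import deque


def metrics_from_counts(tp, fp, fn):
    """IoU, recall, and precision from TP, FP, and FN counts."""
    def ratio(num, den):
        return num / den if den else 1.0  # convention for empty cases (assumption)
    return {"iou": ratio(tp, tp + fp + fn),
            "rec": ratio(tp, tp + fn),
            "pre": ratio(tp, tp + fp)}


def pixel_counts(pred, truth):
    """Count TP, FP, and FN pixels for two binary masks of equal shape."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def count_lesions(mask):
    """Number of 4-connected foreground components in a binary mask."""
    if not mask:
        return 0
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    lesions = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                lesions += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one lesion
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return lesions


truth = [[1, 1, 0, 0],
         [0, 0, 0, 1]]
pred = [[1, 0, 0, 0],
        [0, 0, 1, 1]]
pixel_metrics = metrics_from_counts(*pixel_counts(pred, truth))
```

For count-based metrics of this kind, IoU = 1 / (1/Pre + 1/Rec - 1), so IoU is never higher than either precision or recall.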

4.2 Segmentation Model Result

4.2.1 Segmentation network

In the experiment, the input HM section image was RGB. The Adam optimizer was used to train the network model. The initial learning rate was 0.0001, and the batch size was 8. All models used mean pooling, and the dropout parameter was set to 0.5. Random online data augmentation was applied to all the training data. During training, only the model with the best lesion segmentation performance was saved, so as to prevent overfitting. The saved model was used to predict the training, validation, and testing datasets to evaluate the performance of the hydrops lesion segmentation network.

We used DeepLabv3+, U-net, FPN, LinkNet, PSPNet, and pyramid attention network (PAN) [38] image segmentation networks. Since ResNet has excellent image feature-extraction ability and is commonly used in deep convolution networks, we used ResNet50 as the feature-extraction network.

Table 2 presents the parameters of different networks (M denotes million). The six networks have similar numbers of parameters. DeepLabv3+ and PSPNet are excellent in terms of time consumption for image processing. All the networks can basically meet the requirements of real-time detection.

Model        Params(M)   Model Size(M)   Time(ms)
DeepLabv3+   26.7        102.1           82.3
FPN          26.1        99.9            90.7
LinkNet      31.2        119.3           89.0
PAN          24.3        92.8            89.7
PSPNet       24.3        93.0            78.5
U-net        24.6        94.3            95.7
TABLE II: Evaluation of different models

Fig. 6 and Table 3 show the performance of hydrops lesion segmentation in different networks on the test set. It can be seen that DeepLabv3+ performs better than the other networks.

Fig. 6: Examples of hydrops lesions segmentation in different networks

In summary, considering time consumption, model size, hydrops lesion segmentation performance, and other factors, DeepLabv3+ is used as the main network in this paper.

Model        Pixel-level                 Lesion-level
             IoU(%)   Rec(%)   Pre(%)    IoU(%)   Rec(%)   Pre(%)
DeepLabv3+   75.9     82.8     90.1      64.8     75.4     82.1
FPN          74.5     85.5     85.2      61.8     76.6     76.2
LinkNet      75.3     88.1     83.8      60.9     83.1     69.5
PAN          67.1     72.5     90.0      64.0     76.8     79.3
PSPNet       71.9     82.3     85.1      60.5     85.9     67.2
U-net        75.6     90.7     82.0      61.7     85.8     68.8
TABLE III: Evaluation performance of different models on hydrops lesion segmentation

4.3 Compound Loss Function Analysis

We conducted experiments on the commonly used loss functions BCELoss, DiceLoss, IoULoss, and IoULoss-based FocalLoss. In each case, the model output is passed through a sigmoid function, and the loss is calculated between the result and the real label.

The BCELoss can be calculated using


The DiceLoss can be calculated using


The IoULoss can be calculated using


Here the losses are written in terms of the number of pixels in the label images, the ground truth, and the prediction; the hydrops lesion area in the ground truth and the predicted hydrops lesion area; the number of pixels in an area; the sigmoid function; and a nonzero constant.
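For reference, the commonly used textbook forms of these three losses are sketched below on flat lists of logits and binary targets. This is not necessarily the exact formulation used in the experiments: the function names, the soft (probability-weighted) intersection, and the placement of the constant `eps` are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce_loss(logits, targets, clip=1e-12):
    """Mean binary cross-entropy between sigmoid(logits) and binary targets."""
    total = 0.0
    for x, y in zip(logits, targets):
        p = min(max(sigmoid(x), clip), 1.0 - clip)  # avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(targets)


def _overlap(logits, targets):
    """Soft intersection, predicted area, and ground-truth area."""
    probs = [sigmoid(x) for x in logits]
    inter = sum(p * y for p, y in zip(probs, targets))
    return inter, sum(probs), sum(targets)


def dice_loss(logits, targets, eps=1e-7):
    inter, pred_area, true_area = _overlap(logits, targets)
    return 1.0 - (2.0 * inter + eps) / (pred_area + true_area + eps)


def iou_loss(logits, targets, eps=1e-7):
    inter, pred_area, true_area = _overlap(logits, targets)
    return 1.0 - (inter + eps) / (pred_area + true_area - inter + eps)


uncertain = [0.0, 0.0]    # sigmoid gives 0.5 for both pixels
confident = [20.0, -20.0]
labels = [1, 0]
```

On the uncertain prediction, BCE equals ln 2 while Dice and IoU losses are about 0.5 and 0.67; all three approach 0 as the prediction approaches the label.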

In this experiment, DeepLabv3+ is used as the backbone network, and ResNet50 is used throughout as the feature-extraction network. The IoU curves under different loss functions are shown in Fig. 7.

Fig. 7: Loss curve of different loss functions.

The DiceLoss curve fluctuates in the last several rounds of training, whereas the other loss functions tend to converge after 20 epochs. BCEWithLogitsLoss has not fully converged, indicating that the model still has room for improvement. Table 4 shows the evaluation results of hydrops lesion recognition with different loss functions on the test set.

Loss        Pixel-level                 Lesion-level
            IoU(%)   Rec(%)   Pre(%)    IoU(%)   Rec(%)   Pre(%)
BCELoss     75.9     82.4     90.7      66.4     75.1     85.2
DiceLoss    75.9     82.8     90.1      64.8     75.4     82.1
IoULoss     76.9     84.7     89.3      64.5     72.9     84.8
FocalLoss   73.6     79.4     90.1      62.1     68.3     87.3
TABLE IV: Network performance with different loss functions

BCELoss and IoULoss are slightly superior to the other two loss functions. BCELoss reaches 66.4% lesion-level IoU, and IoULoss reaches 76.9% pixel-level IoU. Each of the four loss functions has its own advantages and disadvantages, but no single loss function achieves satisfactory results on both pixel- and lesion-level evaluation metrics.

Moving on to our proposed compound loss function, we tested different weight coefficients in the compound loss function, with results as shown in Table 5.

Weight   Pixel-level                 Lesion-level
         IoU(%)   Rec(%)   Pre(%)    IoU(%)   Rec(%)   Pre(%)
0        72.4     80.4     87.9      60.4     81.8     69.7
0.2      74.6     85.1     85.8      62.4     80.7     73.3
0.4      74.7     88.2     83.0      62.4     85.2     70.1
0.5      76.1     84.8     88.1      67.1     79.0     81.6
0.6      74.2     84.1     86.3      63.8     77.0     78.7
0.8      75.1     84.4     87.1      61.9     76.3     76.6
1.0      19.5     96.7     19.6      81.1     100.0    81.1
TABLE V: Network performance with different lesion-level loss weights

As the lesion-level loss weight increases from 0, lesion- and pixel-level IoU both improve, indicating that adding lesion-level loss to the loss function improves lesion-level performance and also benefits the pixel level. Among models that retain a pixel-level term, the lesion- and pixel-level IoU are maximized when the weight is 0.5. When the weight increases beyond 0.5, both decrease. Hence, provided that the pixel-level loss still fully affects the segmentation accuracy of each pixel, an appropriate amount of lesion-level loss improves the segmentation accuracy of the model. When the weight is 1.0, the pixel-level IoU drops sharply, whereas the lesion-level IoU rises sharply. When the loss function consists only of lesion-level loss, the predicted label tends to be an entirely white map, which illustrates the necessity of pixel-level loss in the loss function.

Weight   Pixel-level                 Lesion-level
         IoU(%)   Rec(%)   Pre(%)    IoU(%)   Rec(%)   Pre(%)
0        40.5     99.8     40.5      48.4     100.0    48.4
0.1      69.0     96.3     70.8      56.3     96.3     57.5
0.3      72.8     91.2     78.3      56.1     89.0     60.3
0.5      76.1     84.8     88.1      67.1     79.0     81.6
0.6      72.1     78.6     89.8      61.0     72.9     78.8
0.7      60.3     62.4     94.8      52.3     58.1     83.9
0.9      43.2     45.7     88.5      47.0     54.0     78.5
TABLE VI: Network performance with different precision/recall weighting factors

A similar experiment evaluated different values of the weight coefficient that balances precision loss against recall loss. Table 6 shows that the lesion- and pixel-level IoU are maximized when this weight is 0.5, which means that optimizing IoU requires Pre and Rec losses to be considered equally. Moreover, as the weight deviates from 0.5, IoU decreases markedly. As the weight increases, the predicted lesion area becomes smaller, the precision of the lesion area becomes larger, and the recall rate becomes smaller. In actual pathological diagnosis, the requirement on the recall rate is stricter, because a missed diagnosis is more severe than a misdiagnosis. Therefore, a model trained with a weight below 0.5 can be kept as a standby, which suits pathologists who demand a high recall rate.

Fig. 8: Examples of hydrops lesions segmentation in different stagewise training method

4.4 Stagewise Training Method Result

We performed four experiments to verify the effect of stagewise training based on multiple loss functions:

a) Experiment 1: learning rate is 1e-4, and IoULoss is used to train the model for 50 epochs. The learning rate is changed to 1e-5, and the compound loss function with and is used for another 50 epochs;

b) Experiment 2: learning rate is 1e-4, and IoULoss is used to train the model for 50 epochs. The learning rate is changed to 1e-5, and IoULoss is used for another 50 epochs;

c) Experiment 3: learning rate is 1e-4, and BCEWithLogitsLoss is used to train the model for 50 epochs. The learning rate is changed to 1e-5, and the compound loss function with and is used for another 50 epochs;

d) Experiment 4: learning rate is 1e-4, and BCEWithLogitsLoss is used to train the model for 50 epochs. The learning rate is changed to 1e-5, and BCEWithLogitsLoss is used for another 50 epochs. The results using DeepLabv3+ are shown in Table 7 and Fig. 8.

Experiment   Pixel-level                 Lesion-level
             IoU(%)   Rec(%)   Pre(%)    IoU(%)   Rec(%)   Pre(%)
E1           77.0     88.1     86.0      70.2     86.2     79.1
E2           76.1     88.6     84.3      68.2     79.6     82.6
E3           76.1     87.9     85.0      67.4     81.5     79.6
E4           75.4     87.9     84.2      66.4     80.4     79.2
TABLE VII: Performance evaluation of multi-loss function model by stagewise training method

Comparing Experiment 2 (called E2 below) with E4 shows that IoULoss is superior to BCEWithLogitsLoss on the HM hydrops dataset. In both the E1-E2 pair and the E3-E4 pair, using the compound loss function in the second stage improves lesion-level IoU and recall, and pixel-level IoU and precision also increase. This shows that stagewise training based on multiple loss functions can further improve the model on multiple metrics.

5 Conclusion

We constructed an HM hydrops lesion detection model based on a semantic segmentation network to improve the efficiency of HM diagnosis and reduce misdiagnosis.

We collected and annotated the HM hydrops lesion dataset and built the dataset for the hydrops lesion segmentation model by using a sliding window to crop sections into image patches, followed by data cleaning and online data augmentation. Models with different networks, feature-extraction networks, and loss functions were tested and evaluated, with DeepLabv3+ as the segmentation network and se_resnet50 as the feature-extraction module to segment lesions. Most importantly, we proposed a compound loss function for multiple evaluation metrics, which comparative experiments showed to be superior to traditional loss functions on multiple evaluation metrics. The performance of the model was further improved by the proposed stagewise training method with multiple loss functions.

Although the labeled dataset included hydrops, hyperplasia, and villi, model training was conducted only on the hydrops dataset. In the future, we hope to train models on the hyperplasia dataset as well, so that hydrops and hyperplasia lesions can be displayed in real time to assist pathologists in the diagnosis of HM slices.

The uses of our method are not limited to assisting HM diagnosis. Despite the morphological diversity of tissues and organs, the basic idea of pathological diagnosis is the same: find the corresponding pathological manifestation, the lesion. Most lesions have pathological features that differ from those of normal tissues and organs, and extracting those features can make a difference in the diagnostic process and workflow. Our method has shown its feature-extraction capability in this paper, and none of its components is specific to HM. Therefore, the model we propose can help not only with HM diagnosis but also with other pathological diagnoses by extracting the corresponding pathological features.


The authors would like to thank the Third Affiliated Hospital of Zhengzhou University for providing the data source.