Microorganisms are tiny living organisms that can appear as unicellular, multicellular, and acellular types Madigan-1997-BBM . We provide some examples in Fig. 1. Some microorganisms are beneficial: for example, Lactobacteria can decompose substances to give nutrients to plants Kulwa-2019-ASSM , Actinophrys can digest the organic waste in sludge and improve the quality of freshwater Zhang-2020-AMCF , and Rhizobium leguminosarum can help soybean to fix nitrogen and supply food to human beings Bagyaraj-2007-AM . However, there are also many harmful microorganisms: for example, Mycobacterium tuberculosis can lead to disease and death Gillespie-2012-MMIA , and the novel coronavirus disease 2019 (COVID-19) constitutes a global public health emergency Rahaman-2020-ICSC . Therefore, microorganism research plays a vital role in pollution monitoring, environmental management, medical diagnosis, agriculture, and food production Kulwa-2019-ASSM ; Li-2019-ASAC , and the analysis of microorganisms is an essential step for related research and applications Li-2020-MAMR .
In general, microorganism analysis methods can be summarised into four categories: chemical (e.g., chemical component analysis), physical (e.g., spectrum analysis), molecular biological (e.g., DNA and RNA analysis), and morphological (e.g., manual observation under a microscope) methods Li-2019-ASAC . Their main advantages and disadvantages are compared in Table 1. The chemical method is highly accurate but often causes secondary pollution from chemical reagents Li-2019-ASAC . The physical method also has high accuracy, but it requires expensive equipment Li-2019-ASAC . The molecular biological method distinguishes microorganisms by sequence analysis of the genome Yamaguchi-2015-SDCR . This strategy needs expensive equipment, plenty of time, and professional researchers. The morphological method is the most direct and simplest approach, where a microorganism is observed under a microscope and recognized manually based on its shape Pepper-2011-EM . It is the most cost-effective of the above methods, but it is still laborious, tedious, and time-consuming Zhang-2020-AMCF . Besides, the objectivity of this manual analysis is unstable, since it depends heavily on the experience, workload, and mood of the biologist.
Table 1: Main advantages and disadvantages of microorganism analysis methods.
|Method|Advantages|Disadvantages|
|Chemical method|High accuracy|Secondary pollution of chemical reagent|
|Physical method|High accuracy|Expensive equipment|
|Morphological method|Short testing time|Long training time for skillful operators|
In recent years, Artificial Intelligence (AI) technology has developed rapidly Zhou-2020-ACRB . It achieves outstanding performance in many fields of image analysis and processing, such as autonomous driving Sallab-2017-DRLF ; Chen-2017-M3OD ; Wang-2019-PVDELiu-2017-SDHE ; Wang-2018-CLMC ; Deng-2019-AAAM , and disease diagnosis Sun-2020-GHIS ; Li-2020-FFDO ; Rahaman-2020-ICSC . AI can take over laborious and time-consuming work and quickly extract valuable information from image data. Therefore, AI shows potential in Microorganism Image Analysis (MIA). Besides, AI analyses images objectively and can avoid the subjective differences caused by manual analysis. To some extent, the misjudgement of biologists can be reduced, and the efficiency can be improved.
As an essential part of AI technology, the Artificial Neural Network (ANN) is originally designed according to the biological neuron Mcculloch-1943-ALCI . Due to the limitation of computer performance, the difficulty of training, and the popularization of the Support Vector Machine (SVM), early ANN development once stagnated Yamashita-2016-AIGD . Later, with the improvement of computer performance, the Convolutional Neural Network (CNN) shows an overwhelming advantage in image recognition Krizhevsky-2017-ICDC , and ANNs attract attention again and develop rapidly. Through our investigation, we find that ANNs are widely used in MIA because they can learn useful patterns from enormous data and features.
1.1 Artificial Intelligence Methods for Microorganism Image Analysis
AI is an umbrella term encompassing the techniques that allow a machine to mimic or go beyond human intelligence, primarily in cognitive capabilities Robertson-2018-DIAB . Fig. 2 provides the structure of AI technology Linnosmaa-2020-MLSC ; Zhou-2020-ACRB . AI has several important sub-domains, such as Machine Learning (ML) Li-2020-MAMR ; Zhang-2020-AMCF , Herpesvirus detection Devan-2019-DHCT , and Tuberculosis Bacilli (TB) classification Rulaningtyas-2011-ACTB ; Osman-2012-OSEL .
As shown in Fig. 2, ML can be grouped into conventional methods and ANNs. Among conventional methods, SVM, $k$-Nearest Neighbor (KNN), Random Forest (RF), and other methods have been applied to the MIA task. For example, the work Li-2015-ACIA proposes an automatic EM classification system based on content-based image analysis techniques. Four features (histogram descriptor, geometric feature, Fourier descriptor, and internal structure histogram) are extracted from the segmentation result, which is based on the Sobel edge detector, to train an SVM for the classification task. For the ten EM classes tested in this work, the mean of the average precisions obtained by the system is 94.75% Kulwa-2019-ASSM . An automatic identification approach for TB is proposed in Forero-2003-AITT . The Canny edge detector is applied for edge detection, followed by non-maxima suppression and hysteresis thresholding. After that, a morphological closing operation is applied. Then compactness and eccentricity features are extracted in one branch, and the same segmented images are passed through $k$-means clustering in the second branch. Each branch independently feeds a nearest neighbour classifier. The average sensitivities obtained are 93.30% and 100% for the first and second branches, respectively Kulwa-2019-ASSM . In Di-2011-UZAS , a content-based MIA work based on the ZooImage automated system is introduced, where 1437 objects of 11 taxa are classified using four shape features and an RF classifier, and an overall classification accuracy of 83.92% is achieved Li-2016-CMIA .
ANNs also play a vital role in the MIA task. Fig. 2 shows that they include classical neural networks and deep neural networks. In the early years, due to the limitation of computer performance, classical neural networks represented by the Multilayer Perceptron (MLP), Radial Basis Function Neural Network (RBF), Probabilistic Neural Network (PNN), etc. are applied to MIA tasks. For example, in Culverhouse-1996-ACFD , the performance of human experts in identifying Dinoflagellates is compared with that of two ANN classifiers (MLP and RBF) and two other statistical techniques, KNN and Quadratic Discriminant Analysis (QDA). The data set used for training and testing comprises a collection of 60 features extracted from the specimen images. The result shows that the ANN classifiers outperform the classical statistical techniques. Extended trials show that the human experts achieve 85% accuracy, while the RBF achieves the best automatic performance of 83%, the MLP 66%, KNN 60%, and QDA 56%. The work Kumar-2010-RDMU uses PNN to select the best identification parameters from the features extracted from the microorganism images. PNN is then used to classify the microorganisms with 100% accuracy using nine identification parameters.
Later, with the significant improvement of computer performance, the development of neural network theory, and the proposal of the CNN, deep neural networks show an overwhelming advantage in image analysis, including microbial image analysis. For example, the work Matuszewski-2018-MATS uses U-Net to perform Rift Valley virus segmentation and achieves a Dice score of 90% and an Intersection over Union (IoU) of 83.1%. In Wahid-2019-DCNN , transfer learning based on Xception is applied to bacterial classification. Seven varieties of bacteria that might be lethal to humans are chosen for the experiment. The performance is evaluated on 230 bacteria images of the seven varieties from the test dataset, and the method reaches approximately 97.5% prediction accuracy.
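The two segmentation scores quoted above can be computed from a predicted mask and a ground-truth mask. The following is a minimal sketch using NumPy; the function name `dice_and_iou` and the toy masks are illustrative assumptions and are not taken from any of the cited works.

```python
import numpy as np

def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks are empty: treat this as perfect agreement.
        return 1.0, 1.0
    dice = 2.0 * intersection / total
    iou = intersection / union
    return float(dice), float(iou)

# Toy 4x4 masks (made-up data for illustration only).
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
target = np.array([[0, 1, 1, 0],
                   [0, 1, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, 0]])

dice, iou = dice_and_iou(pred, target)
print(dice, iou)
```

For a single pair of masks the two scores are linked by Dice = 2 IoU / (1 + IoU), so the Dice score is never lower than the IoU.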
In conclusion, conventional machine learning methods and classical neural network methods have similar workflows in the MIA task. They typically rely heavily on feature engineering Al-2015-PEHC . These workflows usually contain image acquisition, image preprocessing, image segmentation, feature extraction, classifier design, and evaluation. The accuracy depends on the design and extraction of features Al-2015-PEHC . In recent years, with the development and popularization of the CNN, one of the most important types of deep neural networks, the MIA task no longer depends on manual feature engineering. Compared with the classical ANN methods, a CNN can directly extract useful features from the image through its convolutional kernels. This ability makes the research and application of CNNs in MIA increase rapidly and gain overwhelming advantages.
1.2 Motivation of This Review
This paper focuses on the development and application of ANNs in the MIA task. A comprehensive overview of techniques for microorganism image analysis using classical neural networks and deep neural networks is presented. The motivation is to clarify the development history of ANNs and to understand the popular technologies and trends of ANN applications in the MIA field. Besides, this paper also discusses potential ANN techniques for microorganism image analysis. As far as we know, there are some review papers that summarize research related to the MIA task, for example, Kulwa-2019-ASSM ; Li-2020-ARCM ; Li-2019-ASAC ; Li-2018-ABRC . In the following part, we go through these review papers.
The review Kulwa-2019-ASSM comprehensively analyses the various studies focused on microorganism image segmentation methods from 1989 to 2018. These methods include classical segmentation methods (e.g., edge-based, threshold-based, and region-based segmentation methods) and machine learning-based methods (supervised and unsupervised machine learning methods). About 85 related papers are summarized in this review. The ANN-based segmentation method is one part of this review.
In the review Li-2020-ARCM , to clarify the potential and application of different clustering techniques in MIA, the related works from 1997 to 2017 are summarized, pointing out the specific challenges of each work (area) together with the corresponding suitable clustering algorithm. More than 20 related research papers are summarized in this review.
In the review Li-2019-ASAC , the development history of microorganism classification using content-based microscopic image analysis approaches is summarized. This review introduces the classification methods from different application domains, including agricultural, food, environmental, industrial, medical, water-borne, and scientific microorganisms. Around 240 related works are summarized. The classification methods discussed in this review contain not only ANNs but also many other methods such as SVM, KNN, and RF.
In our previous work Li-2018-ABRC , we present a brief review of content-based microorganism image analysis using classical and deep neural networks. This review briefly summarizes 55 related papers from 1992 to 2017.
To sum up, the review Li-2020-ARCM only summarizes the MIA methods based on clustering techniques. The review papers Kulwa-2019-ASSM ; Li-2019-ASAC focus on the segmentation and classification tasks, respectively. Although the methods discussed in these reviews include ANN methods, ANN is not central to the discussions of these two reviews. Besides, our previous review Li-2018-ABRC only briefly discusses the ANN technique in the MIA task, and it covers work only up to 2017. Hence, to clarify the development history of ANNs and to understand the popular technologies and trends of ANN applications in the MIA field, we summarize the related papers and provide a detailed discussion of the research motivation, contribution, dataset, workflow, and result of each work in this review.
1.3 Paper Searching and Screening
Fig. 3 illustrates the collection process of the related papers. Our previous review provides 55 related papers from 1992 to 2017. We collect a further 51 papers from several databases, including Google Scholar, IEEE, Elsevier, Springer, and ACM. After careful reading, 16 papers on other topics are excluded, and six related papers are added. Finally, a total of 96 research papers are retained for our review. The popularity and trend of ANNs for microorganism image analysis are shown in Fig. 4.
1.4 Structure of This Review
This review is structured as follows: Sec. 2 introduces the development of ANNs and some representative networks used in MIA tasks; Sec. 3 introduces the MIA work using classical ANN methods; Sec. 4 summarizes state-of-the-art deep ANN methods applied to MIA tasks; Sec. 5 presents the method analysis; Sec. 6 concludes this paper.
2 Artificial Neural Networks
In this section, to better understand the development of ANN in the MIA task, we briefly introduce the evolution history of ANN. Besides, some representative ANN structures in MIA tasks are introduced.
2.1 Evolution of Artificial Neural Networks
The development of ANN has a long history. As shown in Fig. 5, its course can be divided into three stages Yamashita-2016-AIGD . ANN research can be traced back to the M-P Neuron, designed according to the biological neuron and proposed in Mcculloch-1943-ALCI in 1943. This model is a physical model made up of elements such as resistors. The perceptron, whose learning rule is based on the original M-P Neuron, is proposed in Rosenblatt-1958-PAPM . Through training, the perceptron can determine the connection weights of neurons Yamashita-2016-AIGD . With it, the first upsurge of ANN research starts. However, Minsky et al. point out in Minsky-2017-PAIC that the perceptron cannot solve the XOR problem (a linearly inseparable problem), which sends ANN research into a trough.
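The XOR limitation can be reproduced in a few lines. The sketch below is an illustrative NumPy implementation of perceptron-style error-correction learning, not the original formulation from the cited works; the function names, the zero initialisation, and the learning rate are assumptions made for the example. The same rule learns the linearly separable AND function but can never classify all four XOR patterns correctly.

```python
import numpy as np

def train_perceptron(X, y, epochs=100, lr=1.0):
    """Error-correction learning for a single unit with a step activation."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            update = lr * (yi - pred)   # zero when the prediction is correct
            if update != 0:
                w += update * xi
                b += update
                errors += 1
        if errors == 0:                 # every pattern classified correctly
            break
    return w, b

def accuracy(X, y, w, b):
    preds = (X @ w + b > 0).astype(int)
    return float((preds == y).mean())

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1])   # linearly separable
y_xor = np.array([0, 1, 1, 0])   # not linearly separable

acc_and = accuracy(X, y_and, *train_perceptron(X, y_and))
acc_xor = accuracy(X, y_xor, *train_perceptron(X, y_xor))
print(acc_and, acc_xor)
```

Because a single unit of this kind always draws one straight decision boundary, no choice of weights separates the XOR patterns, whatever the number of training epochs.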
In the second stage, with the proposal of the Hopfield network Hopfield-1982-NNPS in 1982, the study of ANN attracts attention again. After that, with the proposal of the Back-Propagation (BP) algorithm Rumelhart-1986-LRBE , the XOR problem can be solved by training a Multilayer Perceptron (MLP) with the BP algorithm. Besides, LeCun et al. propose the CNN by introducing the convolutional layer, inspired by the biological primary visual cortex, into the neural network Lecun-1989-BAHZ ; Lecun-1998-GLAD . However, limited by the computer performance and neural network theory at that time, although BP enables the CNN to be trained, problems such as overly long training time and a tendency to over-fit remain. With the popularization of SVM, ANN research falls into a trough again.
Although ANN research falls into a trough, the related research by Hinton et al. Hinton-2002-TPEM ; Hinton-2006-AFLA ; Hinton-2006-RDDN ; Nair-2010-RLUI ; Salakhutdinov-2009-DBM and Bengio et al. Bengio-2009-LDAA ; Bengio-2007-GLTD ; Glorot-2010-UDTD ; Le-2008-RPRB does not stop. Benefiting from their progress, ANNs show overwhelming advantages in speech and image recognition Krizhevsky-2017-ICDC . With this, the third rise of ANN begins. Different from the second stage, computer performance is significantly improved, so training a deep neural network no longer takes as long as before. Besides, with the popularization of the internet, more and more data can be used for training, which reduces over-fitting.
The ANN based MIA research also follows the development trend of ANN. In the following part, some representative ANN structures in MIA tasks are introduced.
2.2 Representative Artificial Neural Networks
ANN plays an essential role in the MIA task. By investigating related research papers, we find that early MIA tasks based on classical neural networks rely on the “Feature Engineering + Classifier” workflow. They usually extract effective features from images based on existing experience or a feature selection method and then use them to train classifiers for the corresponding tasks. Among these classifiers, MLP is the most widely used one. With the widespread use of CNNs, the MIA task no longer relies on feature engineering. Among these CNNs, AlexNet Krizhevsky-2017-ICDC , VGGNet Simonyan-2014-VDCN , ResNet He-2016-DRLI , and Inception Szegedy-2015-GDC based methods are widely used in the microorganism image classification task, U-Net Ronneberger-2015-UCNB and its improved variants are widely used for the microorganism segmentation task, and YOLO Redmon-2016-YOLO is widely used in the microorganism detection task. To understand the characteristics of these networks, we provide a brief description of MLP, AlexNet, VGGNet, ResNet, Inception, U-Net, and YOLO below.
2.2.1 Multilayer Perceptron
As shown in Fig. 6, an MLP consists of at least three layers: an input layer, a hidden layer, and an output layer. The early MLP is a class of feed-forward ANN. Like the perceptron, the early MLP can determine the connection weights between two layers by error correction learning, which adjusts the connection weights according to the error between the expected output and the actual output. However, error correction learning cannot work across multiple layers. For this reason, the early MLP uses random numbers to set the connection weights between the input layer and the hidden layer, and error correction learning is used to adjust only the connection weights between the hidden layer and the output layer. With the proposal of the BP algorithm, an MLP can adjust the connection weights layer by layer.
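To make the layer-by-layer weight adjustment concrete, the following is a minimal NumPy sketch of a one-hidden-layer MLP trained with BP on the XOR patterns. The hidden-layer width of four, the sigmoid activations, the squared-error loss, the learning rate, and the random seed are all illustrative assumptions rather than settings taken from the reviewed works.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer MLP with sigmoid units."""
    h = sigmoid(X @ W1 + b1)       # hidden activations
    out = sigmoid(h @ W2 + b2)     # network output
    return h, out

def bp_step(X, y, W1, b1, W2, b2, lr=0.5):
    """One full-batch BP update (in place) for the loss mean((out - y)^2) / 2."""
    h, out = forward(X, W1, b1, W2, b2)
    n = X.shape[0]
    # Error signal at the output layer.
    delta2 = (out - y) * out * (1.0 - out) / n
    # Error signal propagated back to the hidden layer (uses W2 before its update).
    delta1 = (delta2 @ W2.T) * h * (1.0 - h)
    W2 -= lr * (h.T @ delta2)
    b2 -= lr * delta2.sum(axis=0)
    W1 -= lr * (X.T @ delta1)
    b1 -= lr * delta1.sum(axis=0)

def mse(X, y, params):
    return float(np.mean((forward(X, *params)[1] - y) ** 2))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])
rng = np.random.default_rng(0)
params = [rng.normal(size=(2, 4)), np.zeros(4),
          rng.normal(size=(4, 1)), np.zeros(1)]

loss_before = mse(X, y, params)
for _ in range(2000):
    bp_step(X, y, *params)
loss_after = mse(X, y, params)
print(loss_before, loss_after)
```

Unlike the early scheme with fixed random input-to-hidden weights, both weight matrices are updated here, because the output error is propagated back through the hidden layer.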
AlexNet adopts an architecture with consecutive convolutional layers. It is the first ANN to win the ILSVRC, which it does in 2012 Rahaman-2020-ASCC . After its triumphant performance, the winning architectures of the following years are all deep CNNs. As shown in Fig. 7(a), AlexNet consists of eight layers: five convolutional layers and three fully connected layers. The significance of AlexNet lies in its use of the Rectified Linear Unit (ReLU) Rahaman-2020-ASCC . AlexNet is trained on multiple GPUs, which cuts down the training time. Besides, different from conventional pooling methods, overlapping pooling is introduced in AlexNet. AlexNet has 60 million parameters Russakovsky-2015-ILSV . To counter the overfitting problem, data augmentation and dropout are employed.
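The difference between conventional and overlapping pooling can be shown on a small array. The sketch below assumes the window size of 3 and stride of 2 reported in the AlexNet paper for overlapping pooling; the helper `max_pool2d` and the toy input are illustrative and not part of any library.

```python
import numpy as np

def max_pool2d(x, size, stride):
    """Max pooling over a 2D array without padding."""
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * stride:i * stride + size,
                       j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

x = np.arange(25).reshape(5, 5)   # toy feature map with values 0..24

# Conventional pooling: the stride equals the window size, windows do not overlap.
non_overlapping = max_pool2d(x, size=2, stride=2)
# Overlapping pooling: the stride is smaller than the window size.
overlapping = max_pool2d(x, size=3, stride=2)
print(non_overlapping)
print(overlapping)
```

With a window of 3 and a stride of 2, neighbouring windows share one row or column, so each output summarises a slightly larger neighbourhood than in the non-overlapping case.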
Simonyan et al. propose a CNN model named VGGNet, which wins second place in ILSVRC 2014 Simonyan-2014-VDCN . VGGNet is characterized by its simplicity, using only the $3 \times 3$ convolutional filter, which is the smallest size that captures the spatial notions of left/right, up/down, and center. The number of layers in VGGNet can be 16 or 19. As shown by the architecture of VGG-16 in Fig. 7(b), the network consists of 13 convolutional layers, 5 max pooling layers, 3 fully connected layers, and a softmax layer. VGG-19 has 3 more convolutional layers than VGG-16. VGG-16 and VGG-19 comprise 138 and 144 million parameters, respectively Simonyan-2014-VDCN . The significance of VGGNet is the use of $3 \times 3$ convolutional filters, a stack of which obtains the same receptive field as the bigger convolutional filters used in AlexNet. Besides, this stack has fewer parameters than the bigger convolutional filters used in AlexNet, which allows VGGNet to have more weight layers and thus improved performance Rahaman-2020-ASCC .
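The receptive-field and parameter argument can be checked with simple arithmetic. The sketch below assumes stride-1 convolutions with $3 \times 3$ filters, as used in the VGGNet paper, and an illustrative channel width of 64; the helper names are hypothetical and biases are ignored.

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)

def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one convolutional layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels

C = 64  # illustrative channel width, assumed equal for input and output

rf_two_3x3 = stacked_receptive_field(3, 2)      # matches one 5x5 filter
rf_three_3x3 = stacked_receptive_field(3, 3)    # matches one 7x7 filter

params_two_3x3 = 2 * conv_weights(3, C, C)
params_one_5x5 = conv_weights(5, C, C)
params_three_3x3 = 3 * conv_weights(3, C, C)
params_one_7x7 = conv_weights(7, C, C)

print(rf_two_3x3, params_two_3x3, params_one_5x5)
print(rf_three_3x3, params_three_3x3, params_one_7x7)
```

For equal channel widths, two stacked $3 \times 3$ layers need 18 weights per channel pair against 25 for a single $5 \times 5$ layer, and three stacked layers need 27 against 49 for a $7 \times 7$ layer, while also adding extra non-linearities between the layers.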
Szegedy et al. propose GoogleNet, which wins first place in ILSVRC 2014 Szegedy-2015-GDC . As shown in Fig. 7(c), it consists of 22 convolutional layers and 5 pooling layers. Nevertheless, this network has only 7 million parameters Rahaman-2020-ASCC . The significance of GoogleNet is that it first introduces the Inception structure, shown in Fig. 7(e). This structure uses the $1 \times 1$ convolutional filter to reduce the channel number of the previous feature map, which reduces the total number of parameters of the network. Besides, the structure uses convolutional filters of different sizes to obtain multi-level features and improve the performance Zhang-2020-AMCF .
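The saving obtained by reducing the channel number before a larger filter can be illustrated with a short calculation. The sketch assumes that the reduction is done by a $1 \times 1$ convolution, as in the Inception design, and uses illustrative channel counts (192 input, 16 reduced, 32 output) chosen for the example; biases are ignored.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one convolutional layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels

c_in, c_reduced, c_out = 192, 16, 32   # illustrative channel counts

# A 5x5 convolution applied directly to the input feature map.
direct = conv_weights(5, c_in, c_out)

# A 1x1 convolution first reduces the channels, then the 5x5 convolution follows.
reduced = conv_weights(1, c_in, c_reduced) + conv_weights(5, c_reduced, c_out)

print(direct, reduced, direct / reduced)
```

In this example the reduced branch needs roughly a tenth of the weights of the direct one, which is consistent with the small overall parameter count of the network despite its depth.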
In Inception-V2, batch normalization is added, and a stack of two $3 \times 3$ convolutional filters is employed instead of a $5 \times 5$ convolutional filter to increase the network depth and reduce the parameters Ioffe-2015-BNAD . In Inception-V3, a stack of $1 \times n$ and $n \times 1$ convolutional filters is used to replace the $n \times n$ convolutional filter Szegedy-2016-RIAC . In Inception-V4, the idea of the residual learning block is incorporated Szegedy-2017-IIIR .
From AlexNet (8 layers), VGGNet (16 or 19 layers), to GoogleNet (22 layers), the structure of the network is getting deeper and deeper. This is because deeper networks allow more complex features to be extracted, which in theory leads to better results. However, adding depth to the network by simply adding layers can lead to gradient vanishing or gradient explosion problems Bengio-1994-LLDG ; Glorot-2010-UDTD . To solve these problems, He et al. propose ResNet, whose structure is shown in Fig. 7(d). ResNet proposes a novel approach called identity mapping so that the network depth can increase for improved performance. Fig. 7(f) shows the residual learning block based on identity mapping. In this block, $H(x)$ denotes an underlying mapping to be fit by the stacked layers, and the input feature map of these layers is denoted as $x$. The residual function can be denoted as $F(x) = H(x) - x$, so the block outputs $F(x) + x$. The underlying idea is that a deeper network can be constructed from a shallow one by copying the weight layers of the shallow network and setting the added layers to identity mappings, so that the deeper network should perform no worse than its shallow counterpart.
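The role of the identity shortcut can be sketched in a few lines. Following the ResNet formulation, the stacked layers fit a residual $F(x) = H(x) - x$ and the block outputs $F(x) + x$. In the NumPy sketch below, dense layers stand in for the convolutional layers, which is a simplifying assumption, and the function names and input values are illustrative.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    """Compute relu(F(x) + x) with F(x) = relu(x @ W1) @ W2.

    Dense layers replace the convolutional layers of a real ResNet block
    to keep the sketch short.
    """
    residual = relu(x @ W1) @ W2   # F(x), learned by the stacked layers
    return relu(residual + x)      # identity shortcut adds the input back

x = np.array([[0.5, 1.0, 2.0]])    # toy non-negative feature vector

# With all weights at zero the residual F(x) vanishes and the block
# reduces to the identity mapping on non-negative inputs.
zero = np.zeros((3, 3))
y_identity = residual_block(x, zero, zero)

# With non-zero weights the block learns a correction on top of x.
rng = np.random.default_rng(0)
y_learned = residual_block(x, rng.normal(size=(3, 3)), rng.normal(size=(3, 3)))
print(y_identity, y_learned)
```

The zero-weight case shows why extra blocks need not hurt: a block only has to push its residual towards zero to behave like an identity mapping.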
U-Net is a CNN that is initially used to perform the microscopic image segmentation task Zhang-2021-LANL . As the structure is shown in Fig. 8, U-Net is symmetrical, consisting of a contracting path (left side) and an expansive path (right side), which gives it the u-shaped architecture. The training strategy of U-Net relies on the strong use of data augmentation to make more effective use of the available annotated samples Ronneberger-2015-UCNB . Besides, the end-to-end structure of U-Net can retrieve the shallow information of the network Ronneberger-2015-UCNB .
Redmon et al. propose a novel framework called YOLO, which uses the whole topmost feature map to predict both confidences for multiple categories and bounding boxes Zhao-2019-ODDL . The main idea of YOLO is to divide the input image into an $S \times S$ grid, where each grid cell is responsible for detecting the objects centered in that cell Redmon-2016-YOLO . Each grid cell predicts $B$ bounding boxes and confidence scores for those boxes Redmon-2016-YOLO . These confidence scores reflect how confident the model is that a box contains an object and how accurate it thinks the predicted box is Redmon-2016-YOLO . Each grid cell also predicts $C$ conditional class probabilities. It should be noted that only the contribution from the grid cell containing an object is calculated Zhao-2019-ODDL . The structure of YOLO is shown in Fig. 9. It consists of 24 convolutional layers and 2 fully connected layers. Instead of the Inception structure used by GoogleNet, it uses $1 \times 1$ reduction layers followed by $3 \times 3$ convolutional layers.
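The grid-based prediction can be made concrete with the symbols of the YOLO paper: an $S \times S$ grid, $B$ boxes per cell, and $C$ classes, where each box carries four coordinates and one confidence score. The sketch below uses $S = 7$, $B = 2$, and $C = 20$, the values reported in the YOLO paper for PASCAL VOC; the helper names and the example object centre are illustrative assumptions.

```python
def output_shape(grid, boxes, classes):
    """Shape of the prediction tensor: each box has x, y, w, h, confidence."""
    return (grid, grid, boxes * 5 + classes)

def responsible_cell(cx, cy, img_w, img_h, grid):
    """(row, col) of the grid cell that contains the object centre."""
    col = min(int(cx / img_w * grid), grid - 1)
    row = min(int(cy / img_h * grid), grid - 1)
    return row, col

S, B, C = 7, 2, 20   # values reported for PASCAL VOC in the YOLO paper

shape = output_shape(S, B, C)

# Hypothetical object centred at pixel (224, 100) in a 448 x 448 image.
cell = responsible_cell(224, 100, 448, 448, S)
print(shape, cell)
```

Only the cell returned by `responsible_cell` is expected to predict this object, which is what makes a single forward pass over the grid sufficient for detection.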
3 Classical Neural Network for Microorganism Image Analysis
This section gives an overview of the MIA task using classical neural networks. We divide the tasks into two categories: classification and other tasks. In the classification part, we mainly introduce the works related to classification and identification tasks. The other tasks, including segmentation, feature generation, and enumeration, are discussed in the following part. Besides, we summarize the characteristics of MIA based on classical neural networks and provide a table to help readers find relevant research works conveniently.
3.1 Classification Tasks
In Balfoort-1992-AIAN , to discriminate and identify algal species with the Optical Plankton Analyser (OPA), a multilayer feed-forward network is applied. In the experiment, a total of eight species of algae are cultivated in monoculture. From each monoculture, six-parameter logarithmic data generated by OPA is used to train the network. The results show that the network can distinguish Cyanobacteria from other algae with 99% accuracy. The identification of species is performed with less accuracy but is generally more than 90%.
In Culverhouse-1994-ACFS , to achieve automatic taxonomic classification in ecological research, a back-propagation of error network is used to classify Cymatocylis, covering five species: Cymatocylis calyciformis, C. drygalskii, C. vanhöffeni, C. convallaria, and C. parva. In the experiment, the images are first digitized, binarised, and edited by hand to remove large debris. Then, the network is trained on the image data after the Fourier transformation. The results show that 28% of the 299 trials achieve correct categorization rates of more than 70% on the data used in the training sets. The best network shows error rates of 11% on the training dataset and 18% on previously unseen data.
In Culverhouse-1996-ACFD , to decrease the cost and improve the efficiency of monitoring noxious and toxic algae and other parameters in coastal waters, human experts, two ANN methods (MLP and RBF), and two ML methods (KNN and QDA) are compared. The dataset contains 23 species of toxic and noxious Dinoflagellates. In the experiment, images are pre-processed to segment the specimen from the background, debris, and clutter. Then, these images are analyzed by six functions, namely the 2D Fast Fourier Transform of the object, the Discrete Fourier Transform of the object’s profile, its second-order statistics, a Sobel edge descriptor, a junction descriptor, and a texture metric, to generate a multiplicity of low-resolution parameters. These parameters are fed into the automatic classifiers for training. The experimental results show that the human experts achieve an average identification rate of 85%, while RBF achieves the best automatic performance of 83%, MLP 66%, KNN 60%, and QDA 56%.
Tuberculosis is largely treatable with early diagnosis and subsequent monitoring. To detect TB in sputum smears, an automatic method is proposed in Veropoulos-1998-IPNC . The image data used in the experiment are created from sputum smears taken at the South African Institute for Medical Research, Cape Town. In total, 1147 objects (267 TB and 880 other objects) are obtained. Of these, 1000 examples are used for training and 147 for testing. In the experiment, the pre-processing steps, including edge detection, region labelling and removal, edge pixel linking, and boundary tracing, are first applied one by one to extract shape features with the discrete Fourier transform. Then, in the classifier training phase, the first 15 coefficients of the discrete Fourier transform are found to be sufficient for training. During this process, classifiers from statistical discriminant methods and neural networks are compared. The results show that the MLP based on the BP algorithm performs best, achieving an accuracy of 97.9%.
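Shape features of this kind are commonly computed as Fourier descriptors of the traced boundary. The sketch below is a generic NumPy illustration and not the procedure of the cited work: it keeps the magnitudes of the first 15 non-constant coefficients, and the normalisation for translation and scale, the function name, and the synthetic elliptical boundary are assumptions made for the example.

```python
import numpy as np

def fourier_descriptors(boundary, num_coeffs=15):
    """Normalised magnitudes of low-frequency DFT coefficients of a closed boundary.

    `boundary` is an (N, 2) array of ordered (x, y) points traced
    counter-clockwise, with N larger than `num_coeffs`.
    """
    pts = np.asarray(boundary, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]            # boundary as a complex signal
    coeffs = np.fft.fft(z)
    mags = np.abs(coeffs[1:num_coeffs + 1])   # drop the DC term (translation)
    return mags / mags[0]                     # divide by first harmonic (scale)

# Synthetic elliptical boundary with 64 points (made-up data).
theta = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
ellipse = np.stack([3.0 * np.cos(theta), 1.0 * np.sin(theta)], axis=1)

# The same shape after scaling and shifting.
moved = 2.5 * ellipse + np.array([10.0, -4.0])

d_original = fourier_descriptors(ellipse)
d_moved = fourier_descriptors(moved)
print(np.allclose(d_original, d_moved))
```

Feature vectors of this form have a fixed length regardless of the object size or position, which makes them convenient inputs for an MLP.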
In Gerlach-1998-IRSM , an MLP based on the BP algorithm is proposed to perform the classification task for investigating the influence of reactors on fungal morphology. These reactors are the shaking flask, the stirred tank, and the airlift tower loop reactor. The image data used in the experiment are prepared by the authors themselves. Five morphological features, including object area, eccentricity, circularity, length of the skeleton, and branching frequency, are extracted from the image data. These five features and the cultivation time are used as the inputs of the network. A clustering method is used to find the best combination of features. The experimental results show that the best classification results (better than 90%) are obtained with cultivations in shaking flasks with baffles and in a stirred tank reactor.
The microscopic examination used to determine concentrations and biovolumes of microorganisms is time-consuming and tedious. Automation can effectively improve this situation. In Blackburn-1998-RDBA , a classification method based on ANN is proposed, and it can effectively classify individual bacteria and non-bacteria objects. The private dataset used in the experiment is based on samples collected at several stations in the Baltic Sea. A total of 690 images are taken for the experiment, each containing about 200 bacteria. In the experiment, edge detection is used to perform image segmentation, and then shape features are extracted for network training. The network used in the experiment is an MLP based on the BP algorithm, which contains 16 input nodes, five hidden nodes, and three output nodes. Experimental results show that this method can effectively perform the classification task.
Pathogen identification plays an important role in disease control and prevention. Simple and quick methods are essential in the pathogen identification task. In Kay-1999-TAAS , a classification system based on statistical methods successfully discriminates closely related species within the same host. Five closely related species of Gyrodactylus are used to test the system performance. Two kinds of image data are used in the experiment: one is prepared under the scanning electron microscope, and the other is prepared under the light microscope. In the experiment, features are extracted from these images to train the classifiers, which include linear discriminant analysis, nearest neighbours, a feed-forward neural network, and projection pursuit regression. The experimental results show that nearest neighbours and linear discriminant analysis give the best results on the light microscope data, and that nearest neighbours and the feed-forward neural network achieve absolute detection of Gyrodactylus salaris from just a single sclerite, the marginal hook, in scanning electron microscope images.
Classifying marine dinoflagellates manually is a time-consuming task. However, the analysis of seawater samples is very important for marine ecology and fishery health. Therefore, using automatic methods to assist ecologists in analysing seawater samples has great significance. A system called DiCANN is proposed for this purpose in Culverhouse-2000-DAMV . This system consists of ANNs, an Internet distributed database, and image analysis techniques. The image data used in the experiment are derived from a Distributed Image Database (DIB) on the Internet. As shown by the workflow of the system in Fig. 10, the input image data are pre-processed first, and then the features are extracted for training or testing the classifiers. The classifiers include the back-propagation of error feed-forward Perceptron (BPN), KNN, RBF, and QDA. The experimental results are shown in Fig. 11. As the number of species increases, human performance declines, while the performance of BPN rises most obviously.
In Avci-2002-CECB , four kinds of ANNs are applied to the classification task of Escherichia coli, including MLP, RBF, General Regression Neural Network (GRNN), and PNN. A total of 336 Escherichia coli instances are used in the experiment. There are eight classes in the dataset. The experimental results show that PNN is the most suitable model for Escherichia coli classification. Using this neural network, the benchmark result using the ad hoc structured probability model is improved to 82.97% correct classification rate.
Counting phytoplankton using a microscope is a time-consuming process. Therefore, it is of great significance to develop an automatic counting system. In Embleton-2003-ACPP , a phytoplankton identification and counting system based on computer image analysis and pattern recognition is proposed. The system consists of ANNs and simple rule-based procedures. The image data used in the experiment is taken directly from water samples from Lough Neagh in Northern Ireland. In the experimental process, the input image is pre-processed and analysed first, then 74 parameters describing size, shape, colour and grey level distribution are extracted for training the network, and finally, the test data are used for testing. The experimental results show that the counts of the automatic system are close to those obtained by manual analysis.
The identification of Cryptosporidium parvum and Giardia lamblia, which can infect humans and cause deadly gastroenteritis, is often influenced by the experience of the human expert. ANNs have been widely used in biomedical image identification in recent years. Therefore, methods based on ANNs are proposed to perform the identification task of these two protozoa in Widmer-2005-UANN . In the experiments, the image data is first clipped to fit the input of the networks (40 by 40 and 95 by 95 pixels for Cryptosporidium parvum and Giardia lamblia, respectively). The ANNs used for Cryptosporidium parvum and Giardia lamblia have 1600 and 9025 input neurons, respectively. Both of them have five hidden and two output neurons. In the experiments, 1586 and 2431 images are used for training the Cryptosporidium oocyst ANN and the Giardia cyst ANN, respectively. After the training process, 100 images are used to select the ANNs that show the best performance. Finally, the Cryptosporidium oocyst ANN achieves a 91.8% correct identification rate on 500 unseen images, and the Giardia cyst ANN achieves a 99.6% correct identification rate on 282 unseen images.
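To make the link between the clipped image sizes and the network sizes concrete, the following minimal sketch counts the inputs and trainable parameters of the two networks. It assumes a plain fully connected architecture with one bias per hidden and output neuron, which is not stated in the text; the function name `count_parameters` is hypothetical.

```python
def count_parameters(layer_sizes, with_bias=True):
    """Count the weights (and optional biases) of a fully connected network.

    layer_sizes lists the number of neurons per layer, input layer first.
    Assumption: every neuron is connected to every neuron of the next layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # connection weights between two layers
        if with_bias:
            total += n_out         # one bias per neuron of the next layer
    return total


# One input neuron per pixel of the clipped grey-level image.
crypto_layers = [40 * 40, 5, 2]    # 1600 inputs, five hidden, two outputs
giardia_layers = [95 * 95, 5, 2]   # 9025 inputs, five hidden, two outputs

print(crypto_layers[0], count_parameters(crypto_layers))
print(giardia_layers[0], count_parameters(giardia_layers))
```

Under these assumptions, almost all parameters sit between the input and hidden layers, so the parameter count grows roughly in proportion to the number of pixels in the clipped image.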
The analysis of sedimentary organic matter can be used in geochronological, biostratigraphical, paleoecological and paleoenvironmental analysis. However, traditional microscope analysis is time-consuming and tedious, and automatic technology can improve its efficiency. Thus, in Weller-2005-SCSO , a semi-automatic system is presented to perform the classification task of sedimentary organic matter. The image data used in the experiments are from Paleogene and Holocene samples. The data can be divided into two orders. In the first order, there are 3266 samples belonging to three second order classes: 501 for second order amorphous, 1475 for second order palynodebris, and 1290 for second order palynomorphs. In the experiments, after image capture, pre-processing is applied to segment the foreground, and then 194 original features, including morphological and textural features, are extracted. After that, the Exhaustive CHi-square Automatic Interaction Detector classification tree algorithm is used to determine the effective features for training each ANN. The ANNs used here are back-propagation MLPs. The Gamma test is used to prevent overtraining. The workflow of the system is shown in Fig. 12. The classification contains two stages: in the first stage, the test image is classified into a second order category, and in the second stage, it is classified into a subcategory. The experimental results show that the system achieves an average correct classification rate of 87%.
Image recognition methods lag behind the wide application of optical imaging samplers in plankton ecology research, and most methods need manual post-processing to correct their results. To improve this situation, a dual-classification method is proposed in Hu-2006-AAQT . As shown in the workflow of the dual-classification system in Fig. 13, four kinds of shape features, including moment invariants, morphological measurements, Fourier descriptors, and granulometry curves, are used to train a Learning Vector Quantization Neural Network (LVQ-NN) for the first classification, and texture-based features (co-occurrence matrix) are used to train an SVM for the second classification. A plankton image is assigned to a class only when the two classifiers give the same result, which effectively reduces the false positive rate. The image data used in the experiments are collected from the Great South Channel off Cape Cod, Massachusetts. It contains seven categories and a total of about 20000 images. The experimental results show that, compared with previous methods, the dual-classification system after correction can effectively achieve abundance estimation and reduce the error by 50% to 100%.
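The agreement rule of the dual-classification scheme can be sketched in a few lines. The sketch below only illustrates the decision rule: the two label arguments stand in for the outputs of the shape-based LVQ-NN and the texture-based SVM, and the reject label "unknown" and both function names are assumptions made for illustration.

```python
from collections import Counter


def dual_classify(shape_label, texture_label, reject_label="unknown"):
    """Accept a class only when both classifiers predict the same label."""
    if shape_label == texture_label:
        return shape_label
    return reject_label


def abundance(label_pairs, reject_label="unknown"):
    """Count accepted labels over (shape_label, texture_label) pairs."""
    counts = Counter(
        dual_classify(shape, texture, reject_label)
        for shape, texture in label_pairs
    )
    return dict(counts)


# Hypothetical predictions for four plankton images.
pairs = [
    ("copepod", "copepod"),
    ("copepod", "diatom"),
    ("diatom", "diatom"),
    ("copepod", "copepod"),
]
print(abundance(pairs))
```

Rejecting images on which the two classifiers disagree trades some recall for fewer false positives, which is the effect described above.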
It is a time-consuming and laborious task to analyse protozoa and metazoa by observing through the microscope. It is of great significance to apply digital image analysis technology in this analysis process. In Ginoris-2006-RPMU , the authors develop an image analysis program to perform the semi-automatic recognition of several protozoa and metazoa commonly used in sewage treatment. The program uses around 40 morphological parameters to train Discriminant Analysis and an ANN to identify and classify protozoan or metazoan images. The experimental results show that this program can effectively distinguish amoebas, sessile ciliates, crawling ciliates, large flagellates and free swimming ciliates.
Protozoa and metazoa are good indicators in sludge treatment. Manual analysis methods are time-consuming and require trained operators. Therefore, digital image analysis has great potential in this task. In Ginoris-2007-RPMU , a semi-automatic image analysis program is proposed for protozoa and metazoa classification. The image data used in the experiments cover 22 classes of protozoa and metazoa collected from aeration tanks of WWTPs in Nancy (France) and Portugal treating domestic and industrial effluents. In the program’s workflow, the image data is pre-processed (filtering, segmentation, noise reduction, etc.) to generate the region of interest. Then the morphological features are extracted for training the classifiers, which include discriminant analysis, a neural network, and a decision tree. The neural network used here is a two-layer feed-forward neural network without a hidden layer based on the BP algorithm. The experimental results show that discriminant analysis and the neural network are more suitable for this task, while the decision tree is not.
It is tedious and time-consuming work to identify, count and measure individual cells manually, and image analysis methods are effective tools for this purpose. In Xiaojuan-2007-ANBC , the authors present a novel method for bacterial image classification. In this method, image pre-processing is applied first, in which iterative threshold segmentation and mathematical morphology are used to realize bacteria image edge detection, and then 20 original features are extracted. After the feature extraction, eight effective features are selected by the Bayesian model, and Principal Component Analysis (PCA) is used. These processed features are used to train the ANN, a three-layer feed-forward neural network, to perform the classification. The data used in the experiments contains eight bacteria classes from the National Microbe Culture Resource (NMCR) database. There are 60 images used as a training set and 20 images for testing. The recognition accuracy in the experiments is 82.3%.
In Xiaojuan-2007-ANBR , six effective features are selected from 15 original features (four morphological features, seven invariant moments, and four texture features) by their variance contribution. Under the same experimental conditions as Xiaojuan-2007-ANBC , the recognition accuracy in Xiaojuan-2007-ANBR is 86.3%.
As an extension of the work in Weller-2005-SCSO , a new version of the classification system is proposed in Weller-2007-TSNN . Because the ANNs used in the previous system show poor performance on the first order and second order amorphous classification tasks, RBF is used to replace the back-propagation MLP in this work. The novel system is shown in Fig. 14. The experimental results show that in the best case, the correct recognition rate is improved by 4% to just over 91%.
The determination of bacterial abundance, biovolume and morphology in wastewater by manual analysis through a microscope is a time-consuming task. Therefore, in Xiaojuan-2008-ANWB ; Cunshe-2008-ANWB ; Xiaojuan-2009-AIBN , a bacteria image recognition system is proposed to solve this problem. In this system, adaptive and enhanced edge detection is first applied for image pre-processing. Then, the original features, including contour invariant moments and morphological features, are extracted, six effective features are selected from them, and PCA is applied to reduce the feature dimension. Finally, the processed features are used to train an ANN to perform the classification task. The ANN used here is a three-layer feed-forward neural network based on an adaptive accelerated BP algorithm. The data used in the experiments are from the CECC database. In the experiments, the adaptive accelerated BP algorithm accelerates the training process by five times. Thirty images from the CECC database are used to test the system’s performance, and the result shows that the recognition rate is 85.5%.
Because the manual method of activated sludge screening is time-consuming and laborious, it cannot be widely applied. In Amaral-2008-SPII , a semi-automatic method is proposed to perform the recognition task of stalked protozoa species in sewage. In this method, the acquired images are segmented first, and then the features, which include geometrical, morphological and signature descriptors, are extracted. Finally, discriminant analysis and neural networks trained on these features are used to identify stalked protozoa. The data used in the experiments are collected from aeration tanks of WWTPs in Nancy (France) and Braga (Portugal). The main goal of this study is to find the most effective feature class. The experimental results show that geometrical features are the most important.
The development of a simple and reliable computer-aided microscopy method to analyse complex microbial communities is a major challenge in microbial ecology. In Hiremath-2010-AICB , a method for the classification of different cell growth phases is introduced. In this method, adaptive global thresholding is applied to segment the cell images first, and then five geometric features, including circularity, compactness, eccentricity, tortuosity and length-width ratio, are extracted for training the classifiers. The classifiers include 3σ, KNN, neural network, and fuzzy classifiers. The neural network used in this study is RBF. The dataset used here has three phases: normal, grown-up, and about-to-divide. Each phase of the bacilli bacterial cell has 100 colour images. The experimental results show that the fuzzy classifier performs best, with classification accuracies of 100%, 98% and 98% on the three phases, respectively, while the neural network ranks second, with classification accuracies of 100%, 96% and 95%, respectively.
In Hiremath-2010-DIAC ; Hiremath-2011-ICCB , an image analysis method for cocci bacterial cell classification is proposed. In this method, image segmentation is performed by the active contour method, and then five geometric features, namely circularity, compactness, eccentricity, tortuosity and length-width ratio, are extracted for training and testing the classifiers. Three classifiers, the 3σ, KNN, and neural network classifiers, are used in this study. The neural network is RBF. A total of 350 images are used in the experiment, with six categories: cocci, diplococci, streptococci, tetrad, sarcinae, and staphylococci. The experimental results show that the neural network achieves the best performance. According to Hiremath-2010-DIAC , the classification accuracies achieved by the neural network on the six classes are 100%, 99%, 98%, 99%, 98%, and 99%, respectively.
The traditional microorganism detection and identification method is laborious and time-consuming. In Kumar-2010-RDMU , a rapid and cost-effective method based on PNN, whose structure is shown in Fig. 15, is applied to the microorganism image classification task. In this method, the features, including various geometrical, optical, and textural parameters, are extracted from the pre-processed images to train the network. In the experiments, the dataset includes five categories: Bacillus thuringiensis, Escherichia coli K12, Lactobacillus brevis, Listeria innocua, and Staphylococcus epidermidis. The experimental results show that the PNN based on nine kinds of features, which are run length non-uniformity, width, shape factor, horizontal run length non-uniformity, mean grey level intensity, ten percentile values of the grey level histogram, 99 percentile values of the grey level histogram, sum entropy, and entropy, can classify the microorganisms with 100% accuracy.
In environmental bacteria image acquisition, low-quality images are easily generated, which hinders the subsequent analysis. Therefore, a computer-aided method based on ANN is introduced in Zeder-2010-AQAA to classify images into different quality categories. The experimental data is private, and it contains 25000 images in three categories: high quality, medium quality, and low quality. As shown in the workflow in Fig. 16, the input image is divided into nine sub-images, and three features, including mean grey value (MGV), background inhomogeneity (BGI), and cell density measure (CDM), are extracted from each sub-image. Then the ANN is trained on the measured features after normalization and sorting. The network used in this study consists of 27 input, 90 hidden, and three output neurons. The experimental results show that the optimal ANN achieves a correct identification rate of 94%.
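The 27 input neurons follow from nine sub-images times three features. The sketch below shows one way such an input vector could be assembled; it is an illustration under stated assumptions, not the implementation of Zeder-2010-AQAA . Only the mean grey value is computed as named in the text. Because the definitions of BGI and CDM are not given here, `grey_range` and `bright_fraction` are hypothetical stand-ins, and scaling to [0, 1] and sorting per feature is one assumed reading of "normalization and sorting".

```python
def split_into_tiles(image, grid=3):
    """Split a grey-level image (list of rows) into grid x grid sub-images.

    Pixels left over when the size is not divisible by grid are dropped.
    """
    tile_h, tile_w = len(image) // grid, len(image[0]) // grid
    tiles = []
    for gr in range(grid):
        for gc in range(grid):
            rows = image[gr * tile_h:(gr + 1) * tile_h]
            tiles.append([row[gc * tile_w:(gc + 1) * tile_w] for row in rows])
    return tiles


def mean_grey_value(tile, max_grey=255):
    """MGV of one sub-image, scaled to [0, 1]."""
    pixels = [p for row in tile for p in row]
    return sum(pixels) / (len(pixels) * max_grey)


def grey_range(tile, max_grey=255):
    """Hypothetical stand-in for BGI: grey-level spread within the tile."""
    pixels = [p for row in tile for p in row]
    return (max(pixels) - min(pixels)) / max_grey


def bright_fraction(tile, threshold=128):
    """Hypothetical stand-in for CDM: fraction of pixels above a threshold."""
    pixels = [p for row in tile for p in row]
    return sum(1 for p in pixels if p > threshold) / len(pixels)


def build_input_vector(image, feature_fns=(mean_grey_value, grey_range, bright_fraction)):
    """Concatenate the sorted per-tile values of each feature (9 x 3 = 27)."""
    tiles = split_into_tiles(image)
    vector = []
    for fn in feature_fns:
        vector.extend(sorted(fn(tile) for tile in tiles))
    return vector


# Synthetic 6 x 6 grey-level image, used only to exercise the sketch.
image = [[(r * 6 + c) * 7 for c in range(6)] for r in range(6)]
print(len(build_input_vector(image)))
```

Sorting the nine values of each feature makes the vector independent of where in the image a poor-quality region occurs, which is one plausible motivation for the sorting step.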
The conventional manual method for detecting Mycobacterium tuberculosis is an inefficient but necessary part of diagnosing tuberculosis. In Osman-2010-DMTZ , a method based on image analysis and ANN is introduced to detect Mycobacterium tuberculosis in tissue sections. Fifteen tissue slides are collected in the Department of Pathology, USM Hospital, Kelantan, and each slide is captured to generate 30 to 50 images. A total of 607 objects, comprising 302 definite TB and 305 possible TB, are obtained. In the experiments, moving k-means clustering is applied to segment the images, and geometrical features based on Zernike moments are extracted. Then, the Hybrid Multilayered Perceptron (HMLP) is used to test the performance of different feature combinations in the detection task. Fig. 17 provides an example of an HMLP with one hidden layer. The experimental results show that the HMLP with the best feature combination can achieve an accuracy of 98.07%, a sensitivity of 100%, and a specificity of 96.19%.
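Accuracy, sensitivity, and specificity are used repeatedly in the detection studies reviewed here. As a reminder of how the three relate, the following sketch computes them from the four entries of a binary confusion matrix using the standard definitions. The counts in the example are hypothetical and are not taken from Osman-2010-DMTZ .

```python
def detection_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) for a binary detector.

    tp/fn: positive objects detected / missed.
    tn/fp: negative objects correctly rejected / wrongly accepted.
    """
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall correct rate
    return accuracy, sensitivity, specificity


# Hypothetical counts: every positive is found, two negatives are mislabelled.
accuracy, sensitivity, specificity = detection_metrics(tp=50, fn=0, tn=48, fp=2)
print(accuracy, sensitivity, specificity)
```

A sensitivity of 100% with a lower specificity, as in the hypothetical example, means that no positive object is missed and that all remaining errors are false alarms.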
Similar to Hiremath-2010-AICB ; Hiremath-2010-DIAC , the same workflow is applied in Hiremath-2011-DMIA to classify three types of spiral bacterial cells: vibrio, spirillum, and spirochete. As in Hiremath-2010-AICB , 3σ, KNN, neural network, and fuzzy classifiers are applied in this study. The experimental results show that the 3σ, KNN (K = 5), neural network (RBF), and fuzzy classifiers all achieve 100% classification accuracy on the test data of the three categories. This indicates that the neural network provides good classification ability. Besides, in Hiremath-2012-SBCI , a neuro fuzzy classifier is used to replace the fuzzy classifier in Hiremath-2011-DMIA , and the results show that it achieves the same performance as the fuzzy classifier.
TB detection in tissue is more complex and challenging than detection in sputum. In Osman-2011-TBDZ , a method based on image processing and ANN is introduced to detect and classify TB in tissue. The slides used in this work are collected from the Pathology Department, Hospital Universiti Sains Malaysia, Kelantan. A total of 1603 objects in three categories, TB, overlapped TB, and non-TB, are collected from 150 tissue slide images. As shown in the workflow in Fig. 18, the captured image is segmented first, and then the features are extracted to train the network to perform the classification task. The network is a single-layer feed-forward neural network trained by Extreme Learning Machine. The experimental results show that the network obtains a classification accuracy of 77.25%.
As an extension of Osman-2011-TBDZ , a new method based on image processing and ANN is proposed in Osman-2011-HMPN . The data and workflow used in this study are the same as in Osman-2011-TBDZ . The network used here is the HMLP network trained by integrating both the Modified Recursive Prediction Error algorithm and Extreme Learning Machine. The results show that this network achieves the highest testing accuracy of 77.33% and an average testing accuracy of 74.62% with 35 hidden nodes. Besides, similar work is presented in Osman-2012-OSEL . The data used in this study contains 1600 objects consisting of TB, overlapped TB, and non-TB, which are collected from 500 tissue slide images. A single hidden layer feed-forward network trained by Online Sequential Extreme Learning Machine is applied in this study. The experimental results show that the network trained on geometrical features achieves classification scores above 90.00% on the three categories.
Analysing sputum specimens manually is essential but inefficient, and digital image analysis can improve this situation. In Rulaningtyas-2011-ACTB , a classification method based on ANN is introduced. A total of 100 images used in this study are binary images directly taken from Forero-2004-ITBB . There are 75 and 25 images for training and testing, respectively. In the workflow of this method, geometric features, namely circularity, compactness, eccentricity, and tortuosity, are extracted from the binary images for training and testing the classifier. The network used here is a BP-based MLP. The experimental results show that the ANN can accurately classify whether an object is a tuberculosis bacterium, with a mean square error of 0.000368.
Biomonitoring is important for ecosystem research. However, the manual identification method is inefficient. Therefore, Kiranyaz-2011-CRMI focuses on the automatic classification and retrieval of macroinvertebrate images. In this study, SVM, Bayesian model, feed-forward ANN, and RBF are analysed and compared. The data used in this study consists of 1350 images in eight categories. The classifier training process is shown in Fig. 19. The experimental results show that SVM and the Bayesian model achieve training and test classification errors lower than 10%. The performance of ANNs depends on the network configuration and the data partition. The best classification performance is obtained by MLP, which is used in the subsequent retrieval part.
The identification of algae is an essential problem in environmental research. The traditional identification method using a microscope is time-consuming, and automatic computer-aided methods can improve this situation. In Mosleh-2012-APSA , a system based on image processing and ANN is introduced to recognize and classify several common algae. The algae used in this study are collected from Putrajaya Lake, Malaysia. The image data consists of five genera, and each genus has 100 images (40 images for training and 60 images for testing). The workflow of the system is shown in Fig. 20. The images are pre-processed first, and then shape and texture features are extracted from the segmented images for training and testing the MLP classifier. The experimental results show that this system can effectively identify the algae with an overall accuracy of 93%.
Sputum examination by microscope is widely used in the diagnosis of tuberculosis. However, the manual method is inefficient and time-consuming. In Siena-2012-DATB , image processing technology is developed to assist the ANN in identifying TB. The sputum sample images used in this study are taken from the Public Health Image Library of the Centers for Disease Control and Prevention, Atlanta, GA, USA. The workflow of the identification process is shown in Fig. 21. Colour segmentation is applied first, and then morphological processing, including dilation and erosion, is used. After that, the features, namely eccentricity and compactness, are extracted for training and testing the ANN classifier. The ANN used in this study has two input, 15 hidden, and two output neurons. The experimental results show that the method can achieve an accuracy of about 88%.
In Danping-2013-IPMS , an approach is introduced to perform the identification task of powdery mildew spores, which is essential to related research. The workflow of this approach is shown in Fig. 22. The powdery mildew spore image is pre-processed first. The pre-processing operations include illumination compensation, greying, and image enhancement. After pre-processing, image segmentation and feature extraction are applied. The features used in this study include perimeter, area, roundness, shape complexity, and four Hu invariant moments of the connected domain. Finally, these features are used for training and testing the BP neural network, which has seven input, three hidden, and one output neurons. The correct rate is 95.5% on 155 training images and 63.6% on 89 test images. The large gap between the training and test results suggests that there is still room to improve this method.
Phytoplankton community analysis is an important tool in the determination of freshwater quality. In Schulze-2013-PAAS , an automatic microscope image analysis system is developed to recognize phytoplankton. The diagram of the system is shown in Fig. 23. The system contains the segmentation, features extraction, and classification functions. The experimental data has ten categories in this study. After testing, the best network configuration has two hidden layers (the first layer has 50 neurons, and the second layer has 30 neurons). The experimental results show that the average recognition and error rates are 94.7% and 5.5%, respectively.
The classification of algae is one of the essential problems in water resources management. In Coltelli-2014-WMAR , an automatic real-time microalgae identification and enumeration method based on image analysis is introduced. This method integrates segmentation, shape features extraction, pigment signature determination, and ANN classification. The ANN used here is a Self-Organizing Map (SOM), whose details can be found in Rissino-2009-RSTC ; Sap-2008-HSOM ; Silva-2007-AHPS . The neuron number of SOM used in this study is fixed as the estimate of category number, and the neurons are interconnected through neighbourhood relations. The data shown in the experiment is private. The experimental results show that the accuracy of this method is 98.6% on 23 categories of 53869 images.
It is a time-consuming task to diagnose tuberculosis by manual sputum examination under a microscope. In Priya-2015-AITO , a method for identifying TB in digital sputum smears using a neural network and a neuro fuzzy inference system is proposed. The dataset used in this paper consists of 1537 objects in 100 images, of which 1278 are target objects and 259 are outliers. 60% of the bacilli objects and 80% of the outliers are used for training, and the others are used for testing. In this method, an active contour segmentation method using the level set formulation and the Mumford-Shah technique is applied first, then the geometric features are extracted, and finally, these features are classified by using RBF, the Adaptive Neuro Fuzzy Inference System (ANFIS), which is an MLP-based fuzzy system, and the Complex-valued Adaptive Neuro Fuzzy Inference System (CANFIS), which extends its predecessor ANFIS to any number of input-output pairs. The experimental results show that this segmentation method along with CANFIS is effective for identifying tuberculosis objects.
The identification of soil microorganisms is of great significance for agricultural production. In Kruk-2015-CCSI , a method based on digital image processing is introduced to identify soil microorganisms. As shown in the workflow in Fig. 24, the main steps of this method are segmentation, feature extraction, feature selection, and classification. The original features extracted from the segmented images have 78 components: cluster centroids, colour characterization, and histogram. Then, the fast correlation-based filter Yu-2003-FSHD is used to select 21 features for classification. Five classifiers, SVM, MLP, RBF, KNN, and RF, are used in this study. The soil samples are taken from the Agrotechnical Department, Research Institute of Horticulture, Skierniewice, Poland. The image data used here has 441 images in 12 categories. As shown in the experimental results in Tab. 2, the RF classifier achieves the best average accuracy of 98.41% with the selected features.
|Classifier||All Features||Selected Features|
In Priya-2016-AOIL , a digital tuberculosis object-level and image-level classification method based on an MLP activated by the SVM learning algorithm is proposed. In this method, the active contour method is applied to segment TB objects, and then 15 Fourier descriptors are used to describe the boundary of the segmented objects. Besides, fuzzy entropy measures are used to select the prominent Fourier descriptors for training the network. For comparison, a BP-based MLP is applied. The sputum smear slides used here are collected from the South African National Health Laboratory Services, Groote Schuur Hospital in Cape Town. The experimental results show that the proposed network achieves an accuracy of 91.3% for object-level classification and 92.5% for image-level classification.
3.2 Other Tasks
Continuous monitoring of cells during the fermentation process can significantly increase the fermentation yield. In Shabtai-1996-MMMC , a prototype Self-Tuning Vision System (STVS) is developed to monitor morphologic changes of the yeast-like fungus Aureobasidium pullulans during the fermentation of pullulan polysaccharide. The workflow of this system contains several steps: pre-processing, segmentation, intermediate processing, feature extraction, and classification. A self-organizing multilayer neural network is employed for image segmentation. The experimental results show that the system is reliable, accurate and versatile.
In Wit-1998-AAAN , a feed-forward MLP, which contains 81 input neurons, ten hidden neurons, and one output neuron, is proposed for the image enhancement and subsequent enumeration of microorganisms adhering to solid substrata. Unlike most classical ANNs, this network does not rely on feature engineering. As shown in Fig. 25, the network’s input directly employs grey values from pixel sub-images taken from the microscope image. These grey values are processed to yield an output value for every pixel in the original image, after which microorganisms are indicated with a marking circle. For high-contrast images, nearly all of the adhering organisms are detected correctly. For lower-quality images on metal or silicone rubber substrata, image enhancement by the ANN yields enumeration of the adhering bacteria with an accuracy of 93%-98%. After ANN enhancement, 98% of the yeast cells adhering to silicone rubber substrata are enumerated correctly. It can be concluded that ANNs perform well even on low-quality and complicated images.
The manual method for bacteria classification is complex and inefficient. In Zhu-2010-BCUN , a bacteria image classification method is introduced that is based on features generated by a Pulse Coupled Neural Network (PCNN), a biologically inspired ANN based on the work in Eckhorn-1989-ANNF . The workflow of this method is shown in Fig. 26. The acquired images are first pre-processed by cropping the image and saving the region of interest. Then, the entropy sequence features are generated by the PCNN. In the end, the classification is done by a classifier based on Euclidean distance. The experimental results show that this method is feasible and efficient.
Ziehl-Neelsen tissue slide image segmentation is a key part of computer-aided tuberculosis diagnosis. A method based on HMLP is introduced to perform the segmentation task in Osman-2010-STBZ . In the experiments, 5608 data are used for training, and 2268, 1634, and 2800 data from samples A, B, and C, respectively, are used for testing. In the segmentation process, the hue and saturation components of each pixel in a kernel of pixels are used as the inputs of the HMLP network. The sliding process of the kernel in an image is shown in Fig. 27. The centre pixel of the kernel is used as the output of the HMLP network. If the network decides that the pixel belongs to the segmentation object, the centre pixel is assigned ’1’; otherwise, it is assigned ’0’. The experimental results show that this method obtains accuracies of 98.72%, 99.45%, and 97.75% for samples A, B, and C, respectively.
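The sliding-kernel idea can be sketched as follows. This is a minimal illustration and not the HMLP of Osman-2010-STBZ : the kernel size `k=3` is an assumption, since the size is not given in the text, `toy_classifier` is a hypothetical stand-in for the trained network, and border pixels are simply left as background.

```python
import colorsys


def kernel_inputs(rgb_image, row, col, k=3):
    """Collect hue and saturation of every pixel in the k x k kernel
    centred on (row, col). Returns a flat list of 2 * k * k values."""
    half = k // 2
    inputs = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            red, green, blue = rgb_image[r][c]
            hue, saturation, _ = colorsys.rgb_to_hsv(red / 255, green / 255, blue / 255)
            inputs.extend([hue, saturation])
    return inputs


def segment(rgb_image, classify_pixel, k=3):
    """Slide the kernel over the image and label each centre pixel 1 or 0."""
    rows, cols = len(rgb_image), len(rgb_image[0])
    half = k // 2
    mask = [[0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            mask[r][c] = 1 if classify_pixel(kernel_inputs(rgb_image, r, c, k)) else 0
    return mask


def toy_classifier(inputs):
    """Hypothetical stand-in for the trained network: the centre pixel is
    labelled as object when its saturation exceeds 0.5."""
    centre_saturation = inputs[len(inputs) // 2]
    return centre_saturation > 0.5


# Synthetic 5 x 5 image: grey background with one saturated red pixel.
image = [[(128, 128, 128)] * 5 for _ in range(5)]
image[2][2] = (255, 0, 0)
for mask_row in segment(image, toy_classifier):
    print(mask_row)
```

Because the classifier sees the whole neighbourhood but labels only the centre pixel, the output mask has the same resolution as the input image.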
The conventional microscope-based tuberculosis diagnosis method is widely used in developing countries. Nevertheless, it is time-consuming. Therefore, in Costa-2015-AITM , a novel method based on image processing techniques is proposed to assist the diagnosis. It contains three steps: image acquisition, segmentation and post-processing. In the segmentation part, SVM and ANN (a three-layer feed-forward neural network) are used. The input variables of the classifiers are combinations of pixel colour features selected from four colour spaces. The best features are selected by a scalar feature selection technique. The output of the segmentation part determines whether a pixel belongs to a bacillus. The function of the post-processing step is to remove the non-bacilli parts. The training data used in the segmentation part consists of 1200 bacilli pixels and 1200 background pixels from 120 images. The experimental results show that SVM achieves the best sensitivity of 96.80% with an error rate of 3.38%, while the ANN classifier achieves a best sensitivity of 91.53% with an error rate of 5.20%.
From the above discussion, we find that MIA based on classical neural networks has the following characteristics. First, most of the work uses private datasets. Second, in the classification tasks, almost all the work includes three steps: pre-processing, feature extraction and classification, which indicates that microorganism image classification based on classical neural networks is very dependent on feature engineering. Third, in the other tasks, there are examples of directly applying classical neural networks to process image data, which differ from the CNNs that appear later.
Besides, to help readers understand the relevant works more quickly and find papers of interest, we briefly summarize the above papers on MIA based on classical neural networks in Tab. 3, in which years, application tasks, references, datasets, object species, class problem, methods, and results are provided.
|Year||Task||Reference||Dataset||Object species||Classes||Method||Result|
|1992||C||Balfoort-1992-AIAN||Private||Algae||8||MLP||Average identification rate >93%|
|1994||C||Culverhouse-1994-ACFS||Private||Cymatocylis||5||MLP||Identification rate of 28% trials>70%|
|1998||C||Veropoulos-1998-IPNC||Private||Tuberculosis||2||MLP||Accuracy = 97.9%|
|1998||C||Gerlach-1998-IRSM||Private||Aspergillus awamori||4||MLP||Best classification results >90%|
|1999||C||Kay-1999-TAAS||Private||Gyrodactylus||5||MLP||Detection rate = 100%|
|2002||C||Avci-2002-CECB||Ecoli Dataset||Escherichia coli||8||PNN||Correct classification rate = 82.97%|
|2003||C||Embleton-2003-ACPP||Private||Phytoplankton||4||MLP||Within 10% of manual analysis|
|2005||C||Widmer-2005-UANN||Private||Cryptosporidium parvum, Giardia lamblia||2||MLP||
|2005||C||Weller-2005-SCSO||Private||Sedimentary organic matter||11||MLP||Average classification rate = 87%|
|2006||C||Hu-2006-AAQT||Private||Plankton||7||LVQ-NN||Reduce the error by 50% to 100%|
|2006||C||Ginoris-2006-RPMU||Private||Protozoa and metazoa||5||MLP||\|
|2007||C||Ginoris-2007-RPMU||Private||Protozoa and metazoa||22||MLP||Overall recognition rate >80%|
|2007||C||Xiaojuan-2007-ANBC||NMCR||Bacteria||8||MLP||Accuracy = 82.3%|
|2007||C||Xiaojuan-2007-ANBR||NMCR||Bacteria||8||MLP||Accuracy = 86.3%|
|2007||C||Weller-2007-TSNN||Private||Sedimentary organic matter||11||
|Average classification rate = 91%|
|2008||C||Xiaojuan-2008-ANWB||CECC||Bacteria||4||MLP||Accuracy = 85.5%.|
|2008||C||Cunshe-2008-ANWB||CECC||Bacteria||4||MLP||Accuracy = 85.5%.|
|2009||C||Xiaojuan-2009-AIBN||CECC||Bacteria||4||MLP||Accuracy = 85.5%.|
|2010||C||Hiremath-2010-AICB||Private||Bacterial cell growth phases||3||RBF||Average accuracy = 97%|
|2010||C||Hiremath-2010-DIAC||Private||Cocci bacteria||6||RBF||Average accuracy = 98.83%|
|2010||C||Kumar-2010-RDMU||Private||Microorganism||5||PNN||Accuracy = 100%|
|2010||C||Zeder-2010-AQAA||Private||Environmental bacteria||3||MLP||Identification rate = 94%.|
|2010||S||Osman-2010-STBZ||Private||Tuberculosis||3||HMLP||Overall accuracy = 98.64%|
|2011||C||Hiremath-2011-ICCB||Private||Cocci bacteria||6||RBF||Average accuracy = 98.5%|
|2011||C||Hiremath-2011-DMIA||Private||Spiral bacteria||3||RBF||Accuracy = 100%|
|2011||C||Osman-2011-TBDZ||Private||Tuberculosis||3||Perceptron||Accuracy = 77.25%|
|2011||C||Osman-2011-HMPN||Private||Tuberculosis||3||HMLP||Accuracy = 74.62%|
|2011||C||Rulaningtyas-2011-ACTB||Forero-2004-ITBB||Tuberculosis||2||MLP||Mean square error = 0.000368|
|MLP performs best|
|2012||C||Osman-2012-OSEL||Private||Tuberculosis||3||Perceptron||Accuracy = 91.33%|
|2012||C||Mosleh-2012-APSA||Private||Algae||5||MLP||Overall accuracy = 93%|
|2012||C||Hiremath-2012-SBCI||Private||Spiral bacteria||3||RBF||Accuracy = 100%|
|2012||C||Siena-2012-DATB||Private||Tuberculosis||2||MLP||Accuracy about 88%|
|2013||C||Danping-2013-IPMS||Private||Powdery mildew spores||2||MLP||Correct rate = 63.6%|
|CANFIS is effective|
4 Deep Neural Network for Microorganism Image Analysis
An overview of MIA tasks based on deep neural networks is given in this section. We divide the tasks into two categories: classification and other tasks. In the classification part, we mainly introduce the works related to classification and identification tasks. In the other tasks part, the papers related to segmentation, feature generation, counting, and data augmentation are reviewed. In the end, a summary of the characteristics of MIA based on deep neural networks is provided, together with a table that helps readers find relevant papers conveniently.
4.1 Classification Tasks
Recognition systems for phytoplankton or other natural objects are challenging to design, mainly because most systems rely on feature engineering. When the recognition task involves multiple categories, selecting and designing practical features is a tough challenge. To address this problem, a recognition system called SYRACO2 is proposed to recognize coccoliths in Dollfus-1999-FNNR . A convolutional neural network, whose structure is shown in Fig. 28, is designed in this system. Unlike traditional ANNs, the network can generate features itself instead of relying on feature engineering, thus reducing pre-processing to rotation and translation normalisation. The performance of the system is verified on coccolith and face recognition. The coccolith data used in the experiment contains 13 species and one non-coccolith class. Each coccolith species has about 100 images, and the non-coccolith class has about 600 images. The average effective recognition rate of the system over the 13 species is 86%.
The previous SYRACO systems identify individual species of coccoliths with high reliability, but they overlook many coccoliths. In SYRACO2, a pre-processing algorithm is used to translate or rotate the input image, but it cannot meet the requirements. To solve these problems, a novel convolutional neural network, named parallel neural network, is proposed in Beaufort-2004-ARCD . The structure is shown in Fig. 29. Based on the network proposed in SYRACO2 Dollfus-1999-FNNR , motor modules are introduced in the parallel neural network. Through training, the motor modules can dynamically adjust the translation, rotation, dilatation, contrast and symmetry of the images. Besides, as shown in the workflow in Fig. 30, the novel system introduces a secondary classification step, which achieves a more detailed classification. The experimental results show that the system recognizes approximately 96% of coccoliths belonging to 11 Pleistocene taxa during routine work.
Bacteria identification plays an essential role in disease diagnosis. However, it is not sufficient to identify bacteria in many applications. In Nie-2015-ADFB , a framework for segmentation and classification is proposed. Unlike classical neural networks, this framework does not rely on artificially designed features but utilizes deep learning technology to extract high-level features automatically. In this framework, the original images are cropped into patches, and a Convolutional Deep Belief Network (CDBN) is used to extract the patch-level features for training an SVM, which performs the patch-level segmentation (patch classification). After the patch-level segmentation, the foreground patches are classified by a CNN, whose structure is shown in Fig. 31. The patch labels predicted by the CNN then vote for the final bacterial category. The private data used in this study consists of 862 images of 17 species. The experimental results show that the SVM based on the features extracted by CDBN achieves a mean accuracy of 97.14% in patch-level segmentation, and the framework achieves an accuracy of 62.10%, a precision of 83.76%, and a recall of 82.16% on the whole classification task.
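The final voting step of such a patch-based framework can be sketched as a simple majority vote over the predicted patch labels. The function name `vote_image_label`, the species names, and the tie-breaking rule (the label seen first wins) are assumptions made for illustration; the summary of Nie-2015-ADFB does not state how ties are resolved.

```python
from collections import Counter


def vote_image_label(patch_labels):
    """Return the most frequent patch label; ties go to the label seen first."""
    if not patch_labels:
        raise ValueError("no foreground patches to vote with")
    # Counter.most_common orders equal counts by first occurrence.
    return Counter(patch_labels).most_common(1)[0][0]


# Hypothetical CNN predictions for the foreground patches of one image.
patches = ["E. coli", "S. aureus", "E. coli", "E. coli", "S. aureus"]
print(vote_image_label(patches))
```

Voting makes the image-level decision robust to a minority of misclassified patches, which is one reason patch-level and image-level scores can differ noticeably.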
The traditional methods for plankton identification based on feature engineering are not universal. In Dai-2016-AHCN , a hybrid CNN model is proposed for plankton classification. The diagram of the model is shown in Fig. 32. Multi-source data composed of original images, global feature images and local feature images are used to train the network. This network consists of three substructures based on AlexNet. Fully connected layers are used to connect these substructures, and softmax is used for classification. The dataset used here contains a total of 30,000 images in 30 classes from WHOI-Plankton. The experimental results show that the single AlexNet achieves an accuracy of 94.75%, and the hybrid CNN based on AlexNet achieves 95.83%. Besides, GoogLeNet is applied for comparison: the single network achieves an accuracy of 95.2%, and the hybrid CNN based on GoogLeNet achieves 96.3%.
WHOI-Plankton consists of 3.4 million expert-labelled plankton images in 103 classes. Nevertheless, this dataset is seriously imbalanced. To solve this problem, a CNN-based method using transfer learning is introduced in Lee-2016-PCIL . First, a balanced sub-dataset is built by setting a threshold and randomly selecting images from the original dataset, and it is used to pre-train the network. Then, the original imbalanced dataset is used to fine-tune the network to perform the classification task. The experimental results show that this method achieves the best accuracy and F1-score of 92.80% and 33.39%, respectively.
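The construction of the balanced sub-dataset can be sketched as capping every class at a fixed number of randomly selected images. This is one plausible reading of the thresholding step, not the authors' exact procedure; the function name `balanced_subset`, the `(image_id, label)` sample format, and the seed handling are all assumptions.

```python
import random
from collections import defaultdict


def balanced_subset(samples, threshold, seed=0):
    """Keep at most `threshold` randomly chosen samples per class.

    samples: list of (image_id, label) pairs.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for image_id, label in samples:
        by_class[label].append(image_id)

    subset = []
    for label, ids in by_class.items():
        # Small classes are kept whole; large classes are randomly capped.
        chosen = ids if len(ids) <= threshold else rng.sample(ids, threshold)
        subset.extend((image_id, label) for image_id in chosen)
    return subset


# Hypothetical imbalanced data: 100 images of one class, 5 of another.
data = [(i, "common") for i in range(100)] + [(i, "rare") for i in range(100, 105)]
pretrain_set = balanced_subset(data, threshold=10)
print(len(pretrain_set))
```

The network would then be pre-trained on `pretrain_set` and fine-tuned on the full `data`, following the two-stage schedule described above.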
Traditional methods of monitoring microorganism populations are time-consuming. In Py-2016-PCDC , a CNN model for plankton classification using translational and rotational symmetry is introduced. In the design of the network, there are two constraints: First, for each convolutional layer, the ability to learn more complex patterns is guaranteed; Second, the receptive field of the topmost layer should be no larger than the image region. Besides, the structure like Inception, which is shown in Fig. 33, is used to adapt to the multiple sizes of the input images. The dataset used in this study is Plankton Set 1.0. The results show that the best model can achieve a softmax loss of 0.613.
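The second design constraint can be checked with the standard receptive-field recurrence for stacked convolution and pooling layers: each layer with kernel size k enlarges the receptive field by (k - 1) times the product of the strides of all preceding layers. The sketch below assumes un-dilated layers described by `(kernel_size, stride)` pairs; the layer configuration shown is hypothetical and is not the architecture of Py-2016-PCDC .

```python
def receptive_field(layers):
    """Receptive-field size (in input pixels) of one unit in the top layer.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def fits_image(layers, image_size):
    """True if the topmost receptive field is no larger than the image."""
    return receptive_field(layers) <= image_size


# Hypothetical stack: conv3, pool2 (stride 2), conv3, pool2 (stride 2), conv3.
stack = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]
print(receptive_field(stack), fits_image(stack, image_size=32))
```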
Automatic technology is of great significance for diatom classification. In Pedraza-2017-ADCA , a diatom classification method based on CNN is proposed, and the performance of the network trained on different forms of data is studied. In this study, the original image data is pre-processed by cropping to generate individual samples, and the labels are proofread. Then, data augmentation by rotation at different angles and flipping is applied to obtain 69350 samples in 80 categories. After that, the augmented dataset is segmented and normalized to obtain the segmented dataset and the normalized dataset, respectively. The classification method used in the experiment is transfer learning based on AlexNet pre-trained on ImageNet. The experimental results show that when the original data are combined with the normalized data, the network achieves the best average accuracy of 99.51%.
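Rotation and flipping augmentation can be illustrated on a tiny image stored as a list of rows. The sketch restricts rotation to multiples of 90 degrees so that no interpolation is needed; this restriction is an assumption for illustration, since the summary above only states that rotations at different angles are used.

```python
def rotate90(img):
    """Rotate a 2D list-of-rows image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def augment(img):
    """Return 8 variants: 4 rotations, each with and without a horizontal flip."""
    variants = []
    current = [list(row) for row in img]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


sample = [[1, 2],
          [3, 4]]
for variant in augment(sample):
    print(variant)
```

All variants keep the label of the source image, so one annotated sample yields eight training samples.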
The Plankton Dataset, released for a Kaggle data science competition in 2015, contains 30336 labelled images in 121 categories, and an automatic classification method is urgently needed. In Yan-2017-AMEC , to find the relationship between the network structure and the classification accuracy, classical CNN models, including CaffeNet, VGG-19, and ResNet with different numbers of layers, are tested, and a novel network is proposed, whose structure is shown in Fig. 34. The experimental results are shown in Tab. 4: while achieving similar performance, the proposed network significantly reduces the model size and improves the frame rate.
|Top-1 accuracy (%)||76.4||77.7||78.5||76.2||76.8||74.9|
|Top-5 accuracy (%)||96.1||96.1||95.4||96.1||96.8||95.8|
|Model size (MB)||1.5||224||44||93||173.4||3.1|
|Frame rate (fps)||32.2||16.9||12.4||2.7||0.9||18.4|
The manual plankton classification method is ineffective, and the accuracy of recent image analysis methods often depends on feature engineering. In Al-2018-IPIC , a CNN based on the idea of VGG-16 is proposed, and its structure is shown in Fig. 35. To evaluate the network’s performance, three sub-datasets from SIPPER are tested. The experimental results are shown in Tab. 5.
Rapid identification of microbial pathogens is of great significance in the treatment of infection. A method to distinguish bacterial species using 3D refractive index images is presented in Kim-2018-AIBU . In this study, Densely Connected Convolutional Network (DenseNet) Huang-2017-DCCN and Wide Residual Network (WRN) Zagoruyko-2016-WRN are compared. The data used in the experiment consisted of seven categories, each containing more than a thousand images. The experimental results show that the highest accuracy obtained by WRN is 73.2%, and the highest accuracy obtained by DenseNet is 85%.
Plankton image classification is of great significance to the study of plankton. In Liu-2018-DPRN , a CNN called PyramidNet for plankton image classification is proposed. The dataset used in this study is WHOI-Plankton. The experimental results show that PyramidNet performs better than AlexNet, GoogLeNet, VGG-16, and ResNet. It gets the best performance with 86.30% accuracy and 0.4164 F1-score.
The manual plankton classification method is ineffective. An automatic algorithm using CNN is proposed to perform plankton classification in Luo-2018-APIA . Images collected by the In Situ Ichthyoplankton Imaging System (ISIIS) Cowen-2008-ISII in the northern Gulf of Mexico are used to test the performance. The workflow is shown in Fig. 36. First, the original image is segmented, and the segmented image data is divided into training and test datasets. The whole image dataset is classified by the classifier trained on the training dataset, and the classification results are spot-checked manually to evaluate the classifier’s performance. The test dataset images are then used to generate a threshold for post-processing to further improve the method’s performance. The results show that, after the post-processing step removes images with low classification scores, a fully random classification assessment yields an average accuracy of 84% and a recall of 40% over all groups. It is challenging to classify rare biological classes reliably, so after excluding the 12 rarest taxa, the classification accuracy of the remaining biological groups reaches more than 90%.
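The score-based post-processing step can be sketched as a filter that discards predictions whose classification score falls below a threshold. Using one threshold per class with a shared default is an assumption made here for illustration; the summary of Luo-2018-APIA does not state how the thresholds are parameterized, and all names and values below are hypothetical.

```python
def filter_by_score(predictions, thresholds, default=0.5):
    """Keep predictions whose score reaches the threshold of their class.

    predictions: list of (image_id, label, score) tuples.
    thresholds:  dict mapping label -> minimum accepted score.
    """
    kept = []
    for image_id, label, score in predictions:
        if score >= thresholds.get(label, default):
            kept.append((image_id, label, score))
    return kept


# Hypothetical classifier outputs and per-class thresholds.
preds = [
    ("img_001", "copepod", 0.97),
    ("img_002", "copepod", 0.41),
    ("img_003", "larval fish", 0.66),
]
print(filter_by_score(preds, {"copepod": 0.8, "larval fish": 0.6}))
```

Raising a threshold removes uncertain predictions, which tends to improve the accuracy of what remains at the cost of recall; this trade-off matches the pattern of results reported above.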
Bacterial classification is of great significance in medical application. In Wahid-2018-CMIB , a deep learning technique is proposed to realize bacterial classification. This method uses the modified deep CNN model based on Inception-V1, whose structure is shown in Fig. 37. The network is pre-trained by a million images, and then the network is re-trained by using the dataset in this study. The dataset used here has five bacteria species selected from several online resources such as HOWMED, PIXNIO, Microbiology-in-Pictures and so on. Results show that the network achieves an accuracy of about 95%.
Plankton classification plays a vital role in marine ecology and economy. In Cui-2018-TSIF , a novel texture feature extraction method and a hybrid CNN based on AlexNet are proposed for plankton classification. As the workflow is provided in Fig. 38, the original image is used to generate texture and shape images, and the three kinds of images are concatenated, which is used in the subsequent AlexNet training. The dataset used in this study is WHOI-Plankton, and the experimental results show that the network with three inputs obtained the best accuracy of 96.58% in 30 categories and 94.32% in 103 categories.
In Wang-2018-TPCN , a transferred parallel neural network is proposed for large-scale imbalanced plankton dataset classification. As shown in Fig. 39, the network consists of two parts: a model pre-trained on the small classes and an untrained model. During training, the data is fed into the whole network, and the pre-trained model is used as a feature extractor to enhance the features of the small classes and thereby improve the classification ability on the imbalanced dataset. The data used in this experiment is WHOI-Plankton, and the experimental results show that the transferred parallel neural network based on VGG-16 achieves the best accuracy of 94.98% and F1-score of 54.44%.
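The large gap between accuracy and F1-score on imbalanced data can be reproduced with a toy example. The sketch below assumes that the F1-score is macro-averaged over classes, which the summaries above do not state explicitly; the labels and counts are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label is correct."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical test set: 95 common samples, 5 rare ones, and a classifier
# that always predicts the common class.
y_true = ["common"] * 95 + ["rare"] * 5
y_pred = ["common"] * 100
print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Because every class contributes equally to the macro average, ignoring the rare class is heavily penalized even though the overall accuracy stays high.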
Recognizing bacteria under a microscope is a part of the daily work of doctors and scientists, and automatic methods can help in the process. In Polap-2019-BSCU , a model based on region covariance and CNN is proposed to classify bacteria. The schematic diagram of the model is shown in Fig. 40. First, region covariance with texture comparison is used to segment the input image. The resulting segments are then fed into a CNN for binary classification. The data used in the experiment consists of 63 images in three categories: rod-shaped bacteria, round or nearly round bacteria, and mixed bacteria strains. The experimental results show that the correct classification rates for the rod-shaped bacteria and the round or nearly round bacteria are over 91% and over 78%, respectively.
Different species of bacteria have different effects on humans, so it is essential to distinguish them. In Rujichan-2019-BCUI , a deep CNN is used to classify 33 bacteria species in the DIBaS dataset. In this work, to improve the classification accuracy, some pre-processing operations, including colour masking and image augmentation, are applied first, and then MobileNetV2 with ImageNet pre-trained weights is used as the base model for classification. The experimental results show that the model achieves an average accuracy of 95.09%.
As an extension of work Wahid-2018-CMIB ; Ahmed-2019-CDCN , the Xception-based bacteria classification method is tested in Wahid-2019-DCNN . Xception is pre-trained by ImageNet and then fine-tuned with experimental data to make it suitable for bacterial classification. The dataset used here is collected from several online datasets. A total of 740 images in seven categories are collected. The experimental results show that the Xception-based method achieves an accuracy of 97.5%.
In Balagurusamy-2019-DDBD , a smartphone optical device is built. It can be combined with the CNN-based application to detect bacteria and analyze their motion. The private data used in this study includes two categories, E.coli and B.subtilis. The CNN used here is a binary classification network with two convolution layers and two fully connected layers. The classification layer consists of a Softmax layer. The experimental results show that the CNN trained with the size-preserved image can achieve an accuracy of 83%.
The analysis of bacteria is vital to human health. In Bliznuks-2020-ENNS , an automatic system for microorganism growth analysis in the laboratory is proposed. The system uses a CNN to analyze laser speckle images to identify microbial growth. The image data used in this study is private, and it contains 435 raw images. The CNN used in this study is a 3D convolutional network architecture, which can encode the spatial speckle variance and its changes over time. The experimental results show that the network reaches an accuracy of 0.95.
Analysis of viruses using transmission electron microscopy images is an essential step in understanding the formation mechanism of infectious virions. In Devan-2019-DHCT , a CNN trained from scratch is compared with pre-trained CNN models as well as existing image analysis methods. The private data used in this study contains 190 images for training and verification and 21 images for testing. The experimental results show that the pre-trained ResNet50 fine-tuned on the virus image data obtains the best performance with an accuracy of 95.44% and an F1-score of 95.22%.
Although CNNs have shown excellent performance in many fields in recent years, the many training iterations make it easy to get trapped in local optima. To address this problem, hybrid CNN models based on autoencoders, particle swarm optimization, and genetic algorithms are compared in Chopra-2020-NMBC . The data used in this study consists of 1833 bacteria images in four categories. In the experiments, five models are tested, namely a simple CNN, a simple CNN optimized by particle swarm optimization, a CNN pre-trained by autoencoders, a CNN pre-trained by autoencoders and optimized by particle swarm optimization, and a CNN optimized by genetic algorithms. The classification results show that the CNN pre-trained by autoencoders and optimized by particle swarm optimization performs best, with an accuracy and F1-score of 94.9% and 95.6%, respectively.
Food-borne pathogenic bacteria are a severe threat to the food industry. Traditional detection methods are inefficient and time-consuming, so it is necessary to have an efficient detection method of pathogenic bacteria. In Kang-2020-CFBU , a classification method of food-borne pathogens using hyperspectral microscope imaging and CNN is proposed. The data used in the experiment contains 50 slides of five species. Hyperspectral microscope imaging technology can obtain both spatial and spectral information of food-borne pathogenic bacteria. There are two CNN models used in this work: U-Net and One-dimensional CNN (1D CNN). U-Net is first applied to segment the regions of interest. Then 1D CNN is used to classify the bacteria using the spectral information extracted from the regions of interest. The experimental results show that U-Net achieves the average accuracy of 96% and the average mIOU of 88%, and 1D CNN obtains the classification accuracy of 90%.
Using a microscope to analyse bacterial images needs an experienced and skilful operator. In Mhathesh-2020-A3CN , a method based on CNN is presented to perform 3D bacterial image classification. The data used in the experiment is bacterial image data collected from the intestine of larval zebrafish. The CNN structure used in the experiment is shown in Fig. 41, which is used to achieve binary classification. The experimental results showed that the presented CNN obtains an accuracy of 95%.
Mycobacterium tuberculosis is the pathogenic bacterium that causes tuberculosis. In Swetha-2020-CNNB , the purpose is to use advanced image processing techniques to achieve automatic and rapid detection of tuberculosis in sputum images. The data used in this study is collected from an infected person. The workflow used in this paper is shown in Fig. 42. Firstly, the image is pre-processed by applying noise reduction and intensity modification, and then the segmentation is done by the Channel Area Thresholding (CAT). After that, the HOG and SURF features are extracted from the segmented images. Finally, CNN is trained by the HOG, SURF, and features extracted by itself. The experimental results showed that the proposed method achieves an accuracy of 99.5%, a sensitivity of 94.7%, and a specificity of 99%.
ML methods can reduce the time requirement for humans to analyse microorganism and eliminate human error. The possibility of using image classification and deep learning method to recognise the standard and high-resolution bacteria and yeast images is studied in Treebupachatsakul-2020-MIRB . The standard resolution data used in this study is provided by the authors and includes three classes of bacteria and one class of yeast, each with more than 200 images. High-resolution images are taken from similar bacteria and yeasts in Zielinski-2017-ALAB . The network used in the experiment is LeNet. The experimental results show that more than 80% accuracy can be obtained on the standard resolution data.
The detection of microorganisms is of great significance for human health. In Zawadzki-2020-DLAC , several popular CNN models with or without pre-trained weights are compared and analysed in the microorganism (bacteria and fungi) classification task. Three datasets are used in this study: DIBaS, DIFaS, and a private dataset. Each dataset contains four categories. The CNN models used here are Xception, ResNet-50, Inception-V3, MobileNetV2, DenseNet201, VGG-16, and VGG-19. The experimental results are provided in Tab. 6. Xception and ResNet-50 perform well, and using the pre-trained weights of CNN models is not always beneficial.
Conventional bacteria sub-population classification and counting are achieved by manual microscopy. In Tamiev-2020-ACBC , a method to classify and enumerate bacterial cell sub-populations based on CNN is proposed. Besides, a pre-processing algorithm for augmenting fluorescent microscope images is developed. A total of 1000 fluorescent microscope images of B. subtilis are collected in this study. In the experimental process, the original image is first processed with a binary segmentation algorithm and annotated manually, and then all images are unified into the same size and augmented. These images are fed into the CNN for training and testing. The experimental results show that the proposed CNN can achieve 86% accuracy when trained on a relatively small dataset (81 images). By summing the classified cells together, the algorithm provides the same count as manual counting.
The presence of pathogenic microorganisms in food is a significant threat to consumers and the food industry. In Kang-2020-SCFP , a high-throughput hyperspectral microscope imaging technology with a hybrid deep learning framework defined as “Fusion-Net” is proposed for rapid classification of foodborne bacteria at the single-cell level. The dataset used here contains five classes. In the experimental process, the first step is data acquisition, and then three features, including morphological features, intensity images, and spectral profiles, are extracted from the image data. These features are used to train three networks: LSTM, ResNet, and 1D CNN. The schematic diagram is shown in Fig. 43. The experimental results show that the three networks achieve classification accuracies of 92.2%, 93.8%, and 96.2%, respectively. After the fusion, the classification accuracy is increased to 98.4%.
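To illustrate why combining several networks can outperform each one alone, the sketch below averages the class probabilities of three models before taking the most likely class. This score-level averaging is a generic late-fusion strategy chosen for illustration; it is an assumption and not the fusion mechanism of Fusion-Net in Kang-2020-SCFP , which the summary above does not describe. All probabilities are hypothetical.

```python
def fuse_probabilities(prob_lists, weights=None):
    """Weighted average of per-model class-probability vectors."""
    n_models = len(prob_lists)
    weights = weights or [1.0 / n_models] * n_models
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists))
        for c in range(n_classes)
    ]


def predict(prob_lists, weights=None):
    """Index of the class with the highest fused probability."""
    fused = fuse_probabilities(prob_lists, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical outputs of three networks for one cell and two classes.
morphology_net = [0.6, 0.4]
intensity_net = [0.3, 0.7]
spectral_net = [0.2, 0.8]
print(predict([morphology_net, intensity_net, spectral_net]))
```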
4.2 Other Tasks
The manual methods to classify plankton are time-consuming and tedious. In Al-2015-PEHC , a hybrid plankton classification algorithm based on CNN is proposed. The dataset used in this study, obtained from the SIPPER dataset provided by the University of South Florida (USF), Tampa, FL, USA, has 3119 images in seven classes. An illustration of the hybrid algorithm is shown in Fig. 44. First, the plankton image data is used to train the CNN, which has three hidden layers. Then, the features generated by each hidden layer are combined with different classification methods, including RF and SVM. The experimental results show that the SVM trained on the features generated by the first hidden layer achieves the best performance with an accuracy of 96.70%.
In Hung-2017-AFRO , Faster Region-based Convolutional Neural Network (Faster R-CNN) is applied to the detection of malaria parasites in images. The data used in this study comes from ex vivo samples of P. vivax infected patients in Brazil. The method used in this paper includes two stages: detection and classification. In the detection stage, Faster R-CNN (based on AlexNet) is used to detect the targets, and it only distinguishes whether a target is a red blood cell or not. In the classification stage, AlexNet is used to classify the targets labelled as non-red blood cells in the first stage. All of the deep learning networks used in the experiment are pre-trained on ImageNet and fine-tuned on the malaria parasite image data. The experimental results show that an accuracy of 59% is achieved in the detection stage, and an accuracy of 98% is obtained in the classification stage.
Accurate segmentation in cell microscopy is one of the critical steps in cell analysis. Nevertheless, manual methods are inefficient. In Aydin-2017-CBTC , a CNN-based segmentation method is proposed and applied to the multi-modal fluorescent microscopy image data of yeast cells. The data used in the experiment is the fluorescence microscope images of yeast cell division, consisting of 6000 training samples, 1200 validation samples, and 1200 test samples. The CNN used in the experiment is based on SegNet. Experimental results show that the method achieves a mIOU of 71.72%. In addition, the addition of extra channels may further improve the segmentation performance.
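The mIOU metric used to evaluate segmentation can be sketched as follows: for each class, the intersection over union between the predicted and the ground-truth pixel sets is computed, and the per-class values are averaged. The sketch assumes that classes absent from both masks are skipped, which is a common convention but is not stated in the summary above; the masks are hypothetical.

```python
def mean_iou(true_mask, pred_mask, num_classes):
    """Mean intersection over union across the classes present in either mask."""
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for t_row, p_row in zip(true_mask, pred_mask):
            for t, p in zip(t_row, p_row):
                inter += (t == c and p == c)
                union += (t == c or p == c)
        if union:  # skip classes that appear in neither mask
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Hypothetical 2x2 masks with background (0) and cell (1) pixels.
truth = [[0, 0],
         [1, 1]]
pred = [[0, 1],
        [1, 1]]
print(f"mIOU = {mean_iou(truth, pred, num_classes=2):.4f}")
```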
Environmental microorganisms play an essential role in pollution control and management. Traditional microorganism classification methods are ineffective, so a classification framework based on deep neural networks and CRF, which is shown in Fig. 45, is proposed in Kosov-2018-EMCU . The dataset used in the study is the Environmental Microorganism Data Set 4th Version (EMDS-4), which contains 400 images in 20 categories. The framework combines global and local features to build the CRF model. The local features are generated by DeepLab-VGG-16, which is a reorganized network of DeepLab and pre-trained VGG-16. DeepLab-VGG-16 is trained on EMDS-4 to generate a feature vector for each pixel. Then, these feature vectors are fed into an RF, whose output serves as the unary potential of the CRF. The results show that the DeepLab-VGG-16 features perform better than SIFT and Simple features.
Making labels for plankton images is a time-consuming and laborious task. In Rodrigues-2018-ETLS , transfer learning technology is used to remedy the problem of small dataset. Two public datasets and one private dataset are used in this study. As the feature extraction process shown in Fig. 46, the ISIIS is used to pre-train DeepSea and AlexNet, while ImageNet is used to pre-train AlexNet. These pre-trained networks are used as feature extractors to obtain the image features of the private LAPSDS dataset, and the SVM is used to test these features. The results showed that DeepSea with ISIIS performs best, achieving 84% classification accuracy.
Making ground truth images in segmentation tasks is a time-consuming and laborious task. In Matuszewski-2018-MATS , a CNN trained using the minimum annotation is proposed to perform the segmentation task. The dataset used in this study is the Rift Valley virus dataset Kylberg-2012-SVPC , which contains 143 TEM images. The minimal annotation method used in the experiment requires only simple annotation of the virus centre, and then the ground truth images are generated by dilating operation. The network based on U-Net, whose structure is shown in Fig. 47, is trained by the augmented data. The experimental results show that the network achieves a Dice of 90% and an IOU of 83.1%.
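The minimal annotation idea can be sketched as follows: each annotated virus centre is dilated into a filled disc, and the union of the discs serves as the ground truth mask. The disc-shaped structuring element, the fixed radius, and the function name `mask_from_centres` are assumptions for illustration; the summary of Matuszewski-2018-MATS only states that a dilation operation is applied.

```python
def mask_from_centres(height, width, centres, radius):
    """Binary mask with a filled disc of `radius` pixels around each centre.

    centres: list of (row, col) annotations, one per object.
    """
    mask = [[0] * width for _ in range(height)]
    for cr, cc in centres:
        # Only visit the bounding box of the disc, clipped to the image.
        for r in range(max(0, cr - radius), min(height, cr + radius + 1)):
            for c in range(max(0, cc - radius), min(width, cc + radius + 1)):
                if (r - cr) ** 2 + (c - cc) ** 2 <= radius ** 2:
                    mask[r][c] = 1
    return mask


# Hypothetical 7x7 image with two annotated centres.
for row in mask_from_centres(7, 7, centres=[(2, 2), (5, 5)], radius=1):
    print("".join("#" if v else "." for v in row))
```

With this scheme the annotator only clicks once per object, and the radius controls how much of the surrounding region is treated as foreground.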
Fungi are harmful to food, archives, and human health, so the detection of fungi is of great significance. In Tahir-2018-AFSD , the authors propose a new fungus dataset and develop a CNN-based method for fungus detection. The fungus dataset contains 40800 images of five classes of fungi and one extra class of dirt. The data is divided into 30000 training images and 10800 test images. The CNN proposed in this paper is shown in Fig. 48, and it can perform not only the detection of fungi but also the classification of fungi. The experimental results show that the detection accuracy is 94.8%.
The detection of diatom is a challenging task for biologists and computer scientists. In Pedraza-2018-LPCN , a comparison is conducted to test whether the latest deep learning network, including RCNN and YOLO, can adapt to the detection of diatoms. The data used here is private and contains nearly 11000 images in ten categories. The experimental results show that YOLO is more effective with an F1-score of 72%, while the RCNN is only 10%.
The applications of plankton imaging systems increase quickly in marine science, but the processing of large image data is still a significant challenge due to the wide variation among different marine environments. In Cheng-2019-ECNN , an end-to-end framework for plankton image identification and enumeration is presented. As shown in the workflow in Fig. 49, an algorithm is first proposed to extract and enhance the Region of Interest (ROI) from the input image. This algorithm can effectively extract and segment the potential targets from the image data, and then the local grayscale values are used to enhance the local features of the ROIs. After that, a CNN is used to extract features from the enhanced ROIs. These features are fed into an SVM for multi-class classification. In the experiments, CNNs including AlexNet, VGGNet, GoogLeNet, and ResNet are compared. The private data used in this study contains six plankton categories and one ‘other’ category. The experimental results show that, compared with CNNs alone, the methods combining CNN and SVM achieve better results. Among these methods, ResNet50 with SVM obtains the best accuracy (94.13%) and recall (94.52%).
The classification of plankton is of great significance for related research. In Rawat-2019-ADLB , a generic framework for the classification of plankton in the ocean is proposed. The dataset used in this study is collected from the Internet, which contains 235 images in five categories. In this experiment, Inception-V3, VGG-16, and VGG-19 are used to extract common features, and then CNN, Logistic Regression, SVM, and KNN are used as the classifiers to test the performance of different features. The results show that the CNN with Inception-V3 feature extractor achieves the best accuracy of 99.5%.
Plankton is one of the essential components in marine ecosystems. In Lumini-2019-DLTL , different transfer learning methods based on CNNs are investigated, aiming to design an ensemble plankton classifier based on their diversity. The transfer learning methods include one round tuning, two rounds tuning, and pre-processing tuning. The datasets used in this study include three public datasets (WHOI, ZooScan, and Kaggle) and one dataset used in two rounds tuning. In the experiments, the three transfer learning methods are tested, together with an SVM trained on the features extracted by the one round tuning models. The experimental results show that the ensemble of models generated by one round tuning and two rounds tuning obtains the best performance. It achieves an accuracy of 0.9527 on the WHOI dataset, 0.8826 on the ZooScan dataset, and 0.9413 on the Kaggle dataset.
As an extension of the work in Wahid-2018-CMIB , a bacteria classification system based on CNN and SVM is presented in Ahmed-2019-CDCN . The data used in this study consists of seven categories collected from several public sources, including Howmed, Microbiology-in-Pictures, and Pixnio, among others. In the system, the pre-trained Inception-V3 is fine-tuned with about 800 training images. After that, an SVM that uses the fine-tuned Inception-V3 as its feature extractor performs the classification task. The experimental results show that the system achieves an accuracy of around 96%.
Automatic and accurate identification of organisms is essential for real-time monitoring of marine ecology and further water quality assessment. Deep learning technologies can assist this process. However, since the convolution module is usually translation invariant but not rotation invariant, the network may fail to recognize a target that has been rotated by a certain angle. In Cheng-2020-MTCN , a method combining translational and rotational features is proposed to address this problem. In this method, the original images (images in Cartesian coordinates) are first transformed into polar images (images in polar coordinates). As shown in Fig. 50, the CNN models trained on the polar images and on the original images are used as feature extractors, and the classification is performed by SVM. The proposed method is tested on the in situ plankton dataset and the CIFAR-10 dataset. The experimental results show that the Densenet201+Polar+SVM model obtains the highest classification accuracy (97.989%) and recall rate (97.986%) on the in situ plankton dataset. On the CIFAR-10 dataset, it obtains the highest classification accuracy (94.91%) and the highest recall rate (94.76%).
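The Cartesian-to-polar conversion described above can be illustrated with a minimal sketch. The function name `to_polar`, the nearest-neighbour sampling, and the grid sizes are assumptions made for illustration; the exact transform used in Cheng-2020-MTCN may differ. The sketch shows why the polar representation helps: a rotation of the input about its centre becomes a circular shift along the angle axis.

```python
import math


def to_polar(image, n_radii, n_angles):
    """Resample a grayscale image (list of rows) onto a polar grid.

    Output row r corresponds to a radius, output column a to an angle.
    Nearest-neighbour sampling around the image centre (hypothetical choice).
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)  # largest radius that stays inside the image
    polar = []
    for ri in range(n_radii):
        radius = max_r * ri / max(n_radii - 1, 1)
        row = []
        for ai in range(n_angles):
            theta = 2.0 * math.pi * ai / n_angles
            y = int(round(cy + radius * math.sin(theta)))
            x = int(round(cx + radius * math.cos(theta)))
            row.append(image[y][x])
        polar.append(row)
    return polar


# A 5x5 test image and the same image rotated by 90 degrees about its centre.
img = [[r * 5 + c for c in range(5)] for r in range(5)]
rot = [[img[4 - x][y] for x in range(5)] for y in range(5)]
p = to_polar(img, 3, 4)
q = to_polar(rot, 3, 4)
```

With four angular samples, every row of `q` is the corresponding row of `p` shifted by one position, so a feature extractor that tolerates shifts along the angle axis sees the rotated object as nearly the same input.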
Tuberculosis is one of the top 10 causes of death worldwide. In Serrao-2020-ABDL , a method based on CNN and a mosaic image approach is proposed. The data used in this study is provided by the UFAM Pattern Recognition and Optimization Research Group and includes positive and negative patch categories. These patches are used to generate a total of 5000 mosaic images. Each image is composed of 100 patches, about half of which are negative and the other half positive. Fig. 51 provides an example of a mosaic image. In the experiments, three CNNs are proposed to segment the bacilli, so that bacillus detection and counting can be performed. Fig. 52 provides the structures of these networks. The network performance is evaluated by counting the number of segmented bacilli. The experimental results show that the deepest network, CNN1, obtains the best performance with an accuracy of 99.665%.
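As a rough illustration of the mosaic idea, the sketch below tiles 100 patches (half positive, half negative) into one image. The function name `build_mosaic`, the 10 x 10 grid layout, and the random placement are assumptions for illustration only; Serrao-2020-ABDL may arrange the patches differently.

```python
import random


def build_mosaic(pos_patches, neg_patches, grid=10, seed=0):
    """Tile grid*grid equally sized patches into one mosaic image.

    Returns (mosaic, label_grid), where label_grid[i][j] is 1 when the
    patch placed at grid cell (i, j) is positive and 0 otherwise.
    """
    rng = random.Random(seed)
    n = grid * grid
    labels = [1] * (n // 2) + [0] * (n - n // 2)  # half positive, half negative
    rng.shuffle(labels)
    cells = [rng.choice(pos_patches if lab else neg_patches) for lab in labels]
    patch_h = len(cells[0])
    mosaic = []
    for gi in range(grid):
        for py in range(patch_h):
            row = []
            for gj in range(grid):
                row.extend(cells[gi * grid + gj][py])  # append one patch row
            mosaic.append(row)
    label_grid = [labels[i * grid:(i + 1) * grid] for i in range(grid)]
    return mosaic, label_grid


# Toy 2x2 patches: positives are all ones, negatives all zeros.
pos = [[[1, 1], [1, 1]]]
neg = [[[0, 0], [0, 0]]]
mosaic, label_grid = build_mosaic(pos, neg)
```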
Environmental microorganism image datasets are usually small. In Xu-2020-AEFG , an enhanced framework of GANs is proposed to perform the environmental microorganism image data augmentation task. The dataset used in the experiment is EMDS-5, which contains 21 classes of microorganisms, with 20 images and their corresponding ground truth images for each class Li-2021-EMDS . Two problems arise. First, because the microorganisms appear in different directions, it is challenging to generate images in various directions directly through a GAN. Second, because the dataset is small, the images directly generated by a GAN miss many details. As shown in Fig. 53, to address these problems, the framework aligns the original images to the same direction by combining them with the ground truth images. Besides, to increase the amount of training data and enable the GAN to generate more details, colour space transformation is performed. To evaluate the framework’s effectiveness, VGG-16 is trained with the generated images for classification, and the results show that the data generated by the proposed framework can effectively improve the classification performance.
Microorganism image segmentation plays a vital role in microorganism analysis. In Li-2020-MAMR , a Multiple Receptive Field U-Net (MRFU-Net) is developed to perform the environmental microorganism image segmentation task. The dataset used in this study is EMDS-5. Due to the various sizes and shapes of microorganisms and the limitation of U-Net’s receptive field, a novel block structure, shown in Fig. 54, is proposed to optimize the original U-Net. This block, inspired by the Inception structure, uses filters of different sizes to obtain multiple receptive fields. Based on this block, MRFU-Net is proposed to improve the segmentation performance. Experimental results show that the Dice, Jaccard, recall, accuracy, and VOE obtained by MRFU-Net are 87.23%, 79.74%, 87.65%, 97.30%, and 20.26%, respectively.
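For readers less familiar with these segmentation metrics, the following sketch computes them from two binary masks using their standard definitions. The function name `segmentation_scores` is hypothetical. Note that the reported Jaccard (79.74%) and VOE (20.26%) sum to 100%, which is consistent with the usual definition VOE = 1 - Jaccard used here.

```python
def segmentation_scores(pred, truth):
    """Return Dice, Jaccard, recall, accuracy and VOE for two binary masks.

    pred and truth are equally sized 2-D lists of 0/1 values. The masks
    must contain at least one foreground pixel to avoid division by zero.
    """
    tp = fp = fn = tn = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    jaccard = tp / (tp + fp + fn)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "jaccard": jaccard,
        "recall": tp / (tp + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "voe": 1.0 - jaccard,  # volumetric overlap error
    }


scores = segmentation_scores([[1, 1, 0, 0]], [[1, 0, 1, 0]])
```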
As an extension of the work in Li-2020-MAMR , a CNN-CRF framework is proposed to perform the environmental microorganism image segmentation task in Zhang-2020-AMCF . Fig. 55 provides the details of this framework. The framework includes two parts: pixel-level segmentation and patch-level segmentation. The pixel-level segmentation is performed by mU-Net-B3, an optimized U-Net structure. This network not only shows better performance but also requires less than one third of the memory of U-Net. Besides, Dense CRF is used as post-processing to further improve the pixel-level segmentation results. In the patch-level segmentation, transfer learning based on VGG-16 is used to perform patch classification. The predicted labels are used to reconstruct the patch-level segmentation results, which helps overcome partial under-segmentation in the pixel-level segmentation. EMDS-5 is the dataset used in this study. The experimental results show that the Dice, Jaccard, recall, accuracy, and VOE obtained in pixel-level segmentation are 87.13%, 79.74%, 87.12%, 96.91%, and 20.26%, respectively.
The recognition of cell boundaries in microscopic images is a bottleneck for large-scale experiments. In Dietler-2020-ACNN , a method for boundary identification of irregular yeast cells is proposed. The data used in this study consists of 384 images and their corresponding manual annotation masks. In the experiment, U-Net is used to separate cell pixels from background and cell-cell border pixels, and then a distance transform is calculated for each cell to find its internal point. After that, the putative cell regions are predicted by applying the watershed algorithm to each internal point. Finally, the CNN score of the boundary pixels of each putative cell is calculated, and the regions separated by erroneous boundaries are merged. The experimental results show that the proposed method achieves a mean accuracy of 94%.
Traditional methods of algal classification and cell counting are considered time-consuming, labour-intensive and subjective. In Baek-2020-IECS , a method based on Fast RCNN and CNN is proposed to classify and quantify five species of cyanobacteria. The dataset used in this study contains a total of 1250 images, which are captured from water samples taken at the Haman weir of the Nakdong River and the Baekje weir of the Geum River. Two networks are used in the experiment, and their structures are shown in Fig. 56. First, Fast RCNN is used to detect and classify targets, and then CNN is used to count. The experimental results show that the classification accuracy values of Fast RCNN for the five species of cyanobacteria are 0.929, 0.973, 0.829, 0.890, and 0.890, and CNN obtained the value of 0.85 and RMSE of 23 cells in the counting task.
After the above review, we find that, compared with the studies based on classical neural networks, MIA based on deep neural networks has the following characteristics: first, these works no longer rely on feature engineering; second, several public microorganism image datasets are available for related research and applications; third, studies that focus on optimizing and improving deep neural networks are more abundant; fourth, the related tasks cover more fields, such as image segmentation, object detection, feature extraction, and data augmentation.
To help readers understand relevant works more quickly and find related researches for their works, we briefly summarize the papers mentioned above on MIA based on deep neural network in Tab. 7, in which years, application tasks, references, datasets, object species, class problem, methods, and results are provided.
|Year||Task||Reference||Dataset||Object species||Classes||Method||Result|
|1999||C||Dollfus-1999-FNNR||\||Plankton||13||CNN||Recognition rate = 86%|
|2004||C||Beaufort-2004-ARCD||Private||Plankton||11||CNN||Recognition rate = 96%|
|2015||FE||Al-2015-PEHC||SIPPER||Plankton||7||CNN+SVM||CNN+SVM accuracy = 96.70%|
|2016||C||Dai-2016-AHCN||WHOI-Plankton||Plankton||30||Hybrid CNN based on AlexNet||Accuracy = 95.83%|
|2016||C||Lee-2016-PCIL||WHOI-Plankton||Plankton||103||Transfer learning based on CNN|| |
|2016||C||Py-2016-PCDC||Plankton Set 1.0||Plankton||\||CNN||Softmax loss = 61.30%|
|2017||S||Aydin-2017-CBTC||Private||Yeast||2||CNN based on SegNet||mIOU = 71.72%|
|2017||C||Pedraza-2017-ADCA||Private||Diatom||80||Transfer learning based on AlexNet||Average accuracy = 99.51%|
|2018||C||Al-2018-IPIC||SIPPER||Plankton||77||CNN based on VGG-16||Accuracy = 80.54%|
|2018||C||Cui-2018-TSIF||WHOI-Plankton||Plankton||103||Hybrid CNN based on AlexNet||Accuracy = 94.32%|
|2018||S||Matuszewski-2018-MATS||Rift Valley virus||Rift Valley virus||2||CNN based on U-Net|| |
|2018||FE||Rodrigues-2018-ETLS||Private||Plankton||20||Transfer learning based on DeepSea||Accuracy = 84%|
|2018||D||Tahir-2018-AFSD||Private||Fungus||6||CNN||Accuracy = 94.8%|
|2018||C||Wahid-2018-CMIB||Multiple sources||Bacteria||5||CNN based on Inception-V1||Accuracy ≈ 95%|
|2019||FE||Ahmed-2019-CDCN||Multiple sources||Bacteria||7||Inception-V3 + SVM||Accuracy ≈ 96%|
|2019||C||Balagurusamy-2019-DDBD||Private||Bacteria||2||CNN||Accuracy = 83%|
|2019||C||Bliznuks-2020-ENNS||Private||Microorganism||2||3D CNN||Accuracy = 95%|
|2019||FE||Cheng-2019-ECNN||Private||Plankton||7||ResNet50 + SVM|| |
|2019||C||Devan-2019-DHCT||Private||Herpesvirus||2||Transfer learning based on ResNet|| |
| || || || || || ||Ensemble of CNN models|| |
|2019||FE||Rawat-2019-ADLB||from Internet||Plankton||5||Inception-V3 + CNN||Accuracy = 99.5%|
|2019||C||Rujichan-2019-BCUI||DIBaS||Bacteria||33||MobileNetV2||Average accuracy = 95.09%|
|2019||C||Wahid-2019-DCNN||Multiple sources||Bacteria||7||CNN based on Xception||Accuracy = 97.5%|
| || || || ||Plankton||\||Densenet201 + Polar + SVM|| |
|2020||S||Dietler-2020-ACNN||Private||Yeast||2||U-Net||Accuracy = 94%|
| || || || || || || ||Accuracy = 98.4%|
|2020||C||Mhathesh-2020-A3CN||Private||Bacteria||2||CNN||Accuracy = 95%|
|2020||S||Serrao-2020-ABDL||Private||Tuberculosis||2||CNN||Accuracy = 99.665%|
|2020||C||Tamiev-2020-ACBC||Private||B. subtilis||11||CNN||Accuracy = 86%|
| || || || || || || ||Xception and ResNet-50 perform well|
5 Methodology Analysis and Potential Direction
Various research papers for MIA based on classical and deep neural networks are briefly reviewed in Sec. 3 and Sec. 4. To further understand the characteristics of these works, a corresponding in-depth analysis is provided in the following subsections. Besides, we also discuss the potential development directions.
5.1 Analysis of Methods Based on Classical Neural Networks
After the brief reviews of works related to MIA based on classical neural networks in Sec. 3, these tasks can be divided into classification tasks and other tasks. The generic workflow of these works can be summarized as three parts: image pre-processing, image feature extraction and ANN-based analysis.
In the classification tasks, the methods used in image pre-processing usually include noise reduction, image enhancement, and segmentation. For example, the pre-processing operations used in Culverhouse-1994-ACFS ; Ginoris-2007-RPMU ; Mosleh-2012-APSA contain noise reduction, which is carried out either manually or by image processing algorithms. Image enhancement technology is used in the pre-processing of Danping-2013-IPMS . The segmentation methods applied in the pre-processing steps of Blackburn-1998-RDBA ; Weller-2005-SCSO ; Xiaojuan-2007-ANBC ; Hiremath-2010-AICB include edge detection, iterative thresholding, and adaptive global thresholding. Image segmentation is the most commonly used of these pre-processing operations because it is a vital step for subsequent feature extraction. In feature extraction, morphological and texture features are the most commonly used. For example, morphological features of images are extracted in Veropoulos-1998-IPNC ; Gerlach-1998-IRSM ; Hu-2006-AAQT ; Xiaojuan-2007-ANBR , in which Fourier descriptors are extracted in Veropoulos-1998-IPNC ; Hu-2006-AAQT , area, eccentricity, circularity, and other geometrical features are used in Gerlach-1998-IRSM , and the invariant moment is extracted in Hu-2006-AAQT ; Xiaojuan-2007-ANBR . The texture features of images are extracted for training classifiers in Culverhouse-1996-ACFD ; Hu-2006-AAQT ; Xiaojuan-2007-ANBR ; Mosleh-2012-APSA .
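Of the segmentation methods listed above, iterative thresholding is simple enough to sketch in a few lines. The version below is the common isodata-style scheme (the threshold is repeatedly set to the midpoint of the two class means); it is shown only as an assumption about the general technique, since the exact variants used in the cited works are not specified here. The function name `iterative_threshold` and the tolerance `tol` are hypothetical.

```python
def iterative_threshold(pixels, tol=0.5):
    """Isodata-style iterative threshold for a flat list of gray values."""
    t = sum(pixels) / len(pixels)  # start from the global mean
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            return t  # image is (nearly) uniform, nothing to separate
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2.0
        if abs(new_t - t) < tol:
            return new_t
        t = new_t


# Dark background around 10, bright objects around 200.
threshold = iterative_threshold([10, 12, 11, 200, 210, 205])
mask = [1 if p > threshold else 0 for p in [10, 12, 11, 200, 210, 205]]
```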
In the other tasks, the workflow is different from that of the classification tasks. Image segmentation is usually performed by conventional image processing algorithms or manual methods in the classification tasks, but in the segmentation tasks, it is performed by ANNs. For example, the self-organizing multilayer neural network is applied to segment yeast-like fungus images in Shabtai-1996-MMMC , and HMLP is used to perform the Ziehl-Neelsen tissue slide image segmentation in Osman-2010-STBZ . Besides, unlike the morphological and texture features used in the classification tasks, in the feature extraction tasks ANNs are used to generate the features for the subsequent analysis. For example, in Zhu-2010-BCUN , PCNN is used to extract the entropy sequence features for the classification based on Euclidean distance.
After the discussion of pre-processing and feature extraction, statistics on the classical neural networks used in MIA tasks are provided. As shown in Fig. 57, MLP and RBF are the most widely used networks. The basic information and structure of MLP are provided in Sec. 2.2.1. RBF is an ANN that uses the radial basis function as the activation function. The output of RBF is a linear combination of radial basis functions of the inputs and neuron parameters.
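The RBF output described above can be written out as a short sketch. A Gaussian basis function is assumed here because it is the most common choice; the reviewed works may use other radial functions. The function name `rbf_forward` and its parameters are hypothetical.

```python
import math


def rbf_forward(x, centers, widths, weights, bias=0.0):
    """Single-output RBF network: weighted sum of Gaussian basis functions.

    Each hidden neuron has a centre c and a width s; its activation is
    exp(-||x - c||^2 / (2 s^2)), and the output is a linear combination.
    """
    out = bias
    for c, s, w in zip(centers, widths, weights):
        dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        out += w * math.exp(-dist2 / (2.0 * s * s))
    return out


# The input sits exactly on the first centre and far from the second,
# so the output is dominated by the first weight.
y = rbf_forward([0.0, 0.0], [[0.0, 0.0], [10.0, 10.0]], [1.0, 1.0], [2.0, 3.0])
```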
In conclusion, the MIA tasks based on classical neural networks depend heavily on feature engineering, and the experimental results depend not only on the selected neural network but also on the feature selection or design. This characteristic makes it difficult to directly transfer the methods developed in these works to different microorganism analysis tasks. However, there are also cases where a classical neural network is applied directly to the microorganism image without any feature engineering. For example, the feed-forward MLP used in Wit-1998-AAAN does not rely on feature engineering but is directly applied to the image for training and classification. This innovation provides a new idea for the subsequent microorganism image analysis based on CNN.
5.2 Analysis of Methods Based on Deep Neural Network
From the overview of works related to MIA based on deep neural networks in Sec. 4, we find that the analysis tasks include classification, segmentation, detection, counting, feature extraction, and data augmentation. Unlike the works related to MIA based on classical neural networks, the deep neural networks used here are mainly CNNs, which can directly extract potentially effective features from the image data with convolutional filters. That means MIA based on deep neural networks does not rely on feature engineering. Therefore, we focus on the CNN models and their related innovations in the following discussion.
Statistics on the deep neural networks used in MIA tasks are shown in Fig. 58. In these statistics, self-designed CNNs are assigned to the CNN category, and a network optimized from a public network is assigned to the category of that public network. As Fig. 58 shows, CNNs are the most widely used. Among them, some characteristic self-designed networks are proposed. For example, in Beaufort-2004-ARCD , the motor modules proposed in the parallel neural network can, through training, dynamically perform translation, rotation, dilatation, contrast, and symmetry operations on the images, and in Tahir-2018-AFSD , a CNN is proposed to perform not only the detection but also the classification of fungi.
Among the public networks, the five most widely used are AlexNet, VGGNet, Inception, ResNet, and U-Net. The characteristics of these widely used public networks are provided in Sec. 2.2. There are some characteristic optimized networks based on these public networks. For example, a network called LCU-Net, which optimizes the original U-Net by increasing the diversity of the receptive field and decreasing the memory requirement of the network, is proposed to perform the microorganism image segmentation task in Zhang-2020-AMCF ; Zhang-2021-LANL , and a transferred parallel neural network, which combines a pre-trained deep learning model used as a feature extractor with an untrained model, is introduced in Wang-2018-TPCN .
In addition to optimising these networks, there are also some other innovative points in the MIA tasks based on deep neural networks. For instance, in Cui-2018-TSIF , the original plankton images are converted into texture and shape images to improve the feature diversity. These three kinds of images (original, texture, and shape) are concatenated for the subsequent AlexNet training. Besides, in Cheng-2020-MTCN , translational and rotational features are combined to address the recognition failure caused by changes in the target angle. The rotational features are extracted by CNN from the polar images, which are obtained by transforming the original images from Cartesian to polar coordinates.
Besides, transfer learning also plays an essential role in the MIA tasks based on deep neural networks. According to our statistics, more than 15 papers involve transfer learning. Transfer learning focuses on storing knowledge learned from one task and applying it to a different but related task West-2007-SRPA ; Doodfellow-2016-DL . Transfer learning has many applications: for example, it can be used for tasks that have limited training data Zhang-2020-AMCF , and it can serve as a feature extractor for potential high-level features Wang-2018-TPCN . Among the MIA tasks based on transfer learning, there are some characteristic works. For example, in Lee-2016-PCIL , to address the class imbalance problem, a balanced sub-dataset is built from the original dataset for pre-training the CNN, and then the original dataset is used for fine-tuning the network. In Rodrigues-2018-ETLS , the plankton dataset ISIIS and the public dataset ImageNet are used to pre-train the deep learning models to determine the influence of different pre-training datasets.
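The balanced sub-dataset idea can be sketched as follows. This is only an illustration under stated assumptions: the function name `balanced_subset`, the per-class cap `per_class`, and random sampling are hypothetical, and the selection rule actually used in Lee-2016-PCIL is not described in this review.

```python
import random
from collections import defaultdict


def balanced_subset(samples, per_class, seed=0):
    """Pick at most per_class items from every class.

    samples is a list of (item, label) pairs. Classes with fewer than
    per_class items contribute all of their items.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)
    subset = []
    for label in sorted(by_label):
        items = by_label[label]
        chosen = rng.sample(items, min(per_class, len(items)))
        subset.extend((item, label) for item in chosen)
    return subset


# An imbalanced toy dataset: 100 samples of class "a", 5 of class "b".
data = [(i, "a") for i in range(100)] + [(i, "b") for i in range(5)]
pretrain_set = balanced_subset(data, per_class=5)
```

In a two-stage schedule of this kind, a network would first be trained on `pretrain_set` and then fine-tuned on the full, imbalanced `data`.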
5.3 Potential Direction
In this part, we present three potential directions from the perspectives of fusion of existing microorganism image analysis methods, dataset characteristics, and advanced methods in other fields.
5.3.1 Potential Direction Based on Existing Methods
Among existing methods, the enhanced framework of GANs in Xu-2020-AEFG is proposed to augment microorganism image data, but it is affected by the different directions of the objects. To solve this problem, it uses the corresponding ground truth images to unify the directions of these objects. This method has a significant limitation: making the labels requires a large amount of labour, and for datasets without ground truth images the framework cannot work well. The work Cheng-2020-MTCN faces the same problem but provides another way to handle it. Because the images in polar coordinates do not have directions, the method used in Cheng-2020-MTCN converts the original images (images in Cartesian coordinates) into polar images (images in polar coordinates) to reduce the influence of the object direction. Therefore, this idea can be applied to further improve the framework used in Xu-2020-AEFG . Besides, in Pedraza-2017-ADCA ; Li-2020-MAMR ; Zhang-2020-AMCF , data augmentation is one of the steps in the workflow, and the methods used are rotation and flipping. The enhanced framework of GANs Xu-2020-AEFG may be helpful in these works.
5.3.2 Potential Direction Based on Dataset Characteristics
From the perspective of datasets, we find that most microorganism image datasets used in MIA tasks based on classical neural networks are private, and a few open-access datasets are used in MIA tasks based on deep neural networks. We also find that some datasets, such as the EMDS series, are few-shot. Existing methods based on the EMDS series usually apply data augmentation to solve the few-shot problem. In recent years, the few-shot problem has been an important topic at several major computer vision conferences such as CVPR, ICCV, and ECCV. Few-shot learning aims to develop methods from the perspectives of data (augmenting the training set with prior knowledge), model (constraining the hypothesis space with prior knowledge), and algorithm (altering the search strategy for the parameters of the best hypothesis in the hypothesis space with prior knowledge) to achieve good performance with limited training data Wang-2020-GAFE . Considering that collecting enough microorganism image data is a big challenge, the few-shot problem in microorganism image analysis can also be a research topic in the future.
In Ling-2020-FSPR , a few-shot learning method is proposed to perform pill image recognition. The workflow of the proposed method is shown in Fig. 59. This method contains two steps: pill segmentation and recognition. An optimized U-Net is applied for pill segmentation. Then, the original image is used to generate contour and texture images. Besides, the imprinted text is extracted to increase the number of effective features. All pill images are used to train the RGB, contour, and texture streams. Batch-hard images selected in the batch-all stage are used for training the following fully connected layers. This method proves useful for pill recognition based on limited data. It may be a potential direction for methods based on limited microorganism datasets.
5.3.3 Potential Direction Based on Advanced Methods
In recent years, the transformer based on the self-attention mechanism Vaswani-2017-AIAY , which is widely used in the field of natural language processing, has become a new hot spot in the field of computer vision. This is because CNNs focus on detecting certain features and do not consider their positions relative to each other. Besides, the pooling operations used in CNNs lose a lot of valuable information, such as the precise location of an effective feature descriptor. In contrast, the transformer based on the self-attention mechanism has a more robust ability to represent global information. Besides, compared with CNN, the transformer has fewer parameters and lower computational cost but performs well in image analysis tasks. Therefore, it can also be a development direction in MIA tasks.
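The global nature of self-attention can be seen in a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d)) V. The learned query, key, and value projections and the multi-head structure of a full transformer are omitted, and the function name `self_attention` is hypothetical. Every output vector is a weighted mixture of all value vectors, regardless of how far apart the corresponding positions are.

```python
import math


def self_attention(q, k, v):
    """Scaled dot-product attention over lists of vectors (no projections)."""
    d = len(k[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]  # softmax over all positions
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


# Two identical keys receive equal attention, so the output is the
# average of the two value vectors.
attended = self_attention([[1.0, 0.0]],
                          [[1.0, 1.0], [1.0, 1.0]],
                          [[0.0, 2.0], [4.0, 6.0]])
```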
Considering the advantages of the transformer, Vision Transformer (ViT) is proposed to perform the image classification task in Dosovitskiy-2020-AIIW . It is one of the most remarkable visual transformer methods, and it directly takes sequences of image patches (with position information) as input. The ViT projects the patches and passes them to the original transformer encoder, which classifies the images with a multi-head attention mechanism, as in natural language processing tasks. Fig. 60 provides the architecture of ViT. It is a potential method for microorganism image classification.
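The first step of this pipeline, turning an image into a sequence of patches with position information, can be sketched as below. The function name `image_to_patches` is hypothetical, and an integer position index stands in for the learned position embeddings and linear projection that ViT applies to each patch.

```python
def image_to_patches(image, patch):
    """Split a 2-D image into non-overlapping patch x patch tiles.

    Returns a list of (position_index, flattened_patch) pairs in row-major
    order. Image height and width must be multiples of patch.
    """
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be a multiple of the patch size")
    seq = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            flat = [image[top + dy][left + dx]
                    for dy in range(patch) for dx in range(patch)]
            seq.append((len(seq), flat))
    return seq


# A 4x4 image becomes a sequence of four 2x2 patches.
img4 = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = image_to_patches(img4, 2)
```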
In addition to the classification task, transformer also performs well in the image detection task. In Carion-2020-EODT , a novel network named DEtection TRansformer (DETR) is proposed for object detection. As shown in the architecture of DETR in Fig. 61, it consists of a CNN backbone for feature extraction, an encoder-decoder transformer, and a detection prediction network based on a feed-forward network. Experiment results on COCO indicate that the results of DETR and Faster R-CNN are comparable. It means DETR has the potential to be employed in microorganism image detection.
In this paper, we conduct a review of microorganism image analysis based on classical and deep neural networks. A total of 96 papers are collected and reviewed. In Sec. 1, we first introduce the background of microorganism image analysis based on artificial neural networks and the motivation of this review, and then describe the process of literature collection and the organization of this paper. In Sec. 2, we introduce the development trend of artificial neural networks. We divide the development process of neural networks into three stages. The first stage is from the proposal of the perceptron to the proposal of the XOR problem; the second stage is from the proposal of the Hopfield network and the solution of the XOR problem to the wide use of SVM; the third stage is from the proposal of DBN and AlexNet to the present. Besides, we introduce some representative networks related to the subsequent analysis in this section, including perceptron, VGGNet, Inception series, ResNet, U-Net, and YOLO.
In Sec. 3, we introduce the microorganism image analysis based on the classical neural network. We introduce this section from the perspectives of classification, counting, segmentation, and feature extraction. In the end, we provide a table that summarizes the characteristics of all the related papers in this section. In Sec. 4, we also introduce the microorganism microscopic image analysis based on the deep neural network from the perspectives of different tasks, including classification, segmentation, detection, counting, feature extraction, and data augmentation. We also provide a summary table at the end of this section.
In Sec. 5, we make statistics of microorganism image analysis based on classical neural networks and deep neural networks, respectively. We find that the most widely used network in the classical methods is MLP, the most widely used network in the deep methods is CNN (including some self-designed networks), and the top five widely used public networks are AlexNet, VGGNet, Inception, ResNet, and U-Net. Besides, the transfer learning technique is also widely used in microorganism image analysis based on deep neural networks. At the end of this section, we discuss the potential development directions from the perspectives of fusion of existing methods, characteristics of datasets, and advanced methods in other fields.
Acknowledgements. This work is supported by the “National Natural Science Foundation of China” (No. 61806047) and the “Fundamental Research Funds for the Central Universities” (No. N2019003). We also thank Miss Zixian Li and Mr. Guoxian Li for their important discussion in this work. Chen Li is both the co-first author and corresponding author of this paper.
- (1) Ahmed, T., Wahid, M.F., Hasan, M.J.: Combining deep convolutional neural network with support vector machine to classify microscopic bacteria images. In: 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE), pp. 1–5. IEEE (2019)
- (2) Al-Barazanchi, H., Verma, A., Wang, S.X.: Intelligent plankton image classification with deep learning. International Journal of Computational Vision and Robotics 8(6), 561–571 (2018)
- (3) Al-Barazanchi, H.A., Verma, A., Wang, S.: Performance evaluation of hybrid cnn for sipper plankton image calssification. In: 2015 Third International Conference on Image Information Processing (ICIIP), pp. 551–556. IEEE (2015)
- (4) Amaral, A., Ginoris, Y.P., Nicolau, A., Coelho, M., Ferreira, E.: Stalked protozoa identification by image analysis and multivariable statistical techniques. Analytical and bioanalytical chemistry 391(4), 1321–1325 (2008)
- (5) Avci, M., Yildirim, T.: Classification of escherichia coli bacteria by artificial neural networks. In: Proceedings First International IEEE Symposium Intelligent Systems, vol. 3, pp. 13–16. IEEE (2002)
- (6) Aydin, A.S., Dubey, A., Dovrat, D., Aharoni, A., Shilkrot, R.: Cnn based yeast cell segmentation in multi-modal fluorescent microscopy data. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 753–759. IEEE (2017)
- (7) Baek, S.S., Pyo, J., Pachepsky, Y., Park, Y., Ligaray, M., Ahn, C.Y., Kim, Y.H., Chun, J.A., Cho, K.H.: Identification and enumeration of cyanobacteria species using a deep neural network. Ecological Indicators 115, 106395 (2020)
- (8) Bagyaraj, D., Rangaswami, G.: Agricultural microbiology. PHI Learning Pvt. Ltd. (2007)
- (9) Balagurusamy, V., Siu, V., Kumar, A.D., Dureja, S., Ligman, J., Kudva, P., Tong, M., Dillenberger, D.: Detecting and discriminating between different types of bacteria with a low-cost smartphone based optical device and neural network models. In: Biosensing and Nanomedicine XII, vol. 11087, p. 110870E. International Society for Optics and Photonics (2019)
- (10) Balfoort, H., Snoek, J., Smiths, J., Breedveld, L., Hofstraat, J., Ringelberg, J.: Automatic identification of algae: neural network analysis of flow cytometric data. Journal of Plankton Research 14(4), 575–589 (1992)
- (11) Beaufort, L., Dollfus, D.: Automatic recognition of coccoliths by dynamical neural networks. Marine Micropaleontology 51(1-2), 57–73 (2004)
- (12) Bengio, Y.: Learning deep architectures for AI. Now Publishers Inc (2009)
- (13) Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: Advances in neural information processing systems, pp. 153–160 (2007)
- (14) Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE transactions on neural networks 5(2), 157–166 (1994)
- (15) Blackburn, N., Hagström, Å., Wikner, J., Cuadros-Hansson, R., Bjørnsen, P.K.: Rapid determination of bacterial abundance, biovolume, morphology, and growth by neural network-based image analysis. Applied and Environmental Microbiology 64(9), 3246–3255 (1998)
- (16) Bliznuks, D., Chizhov, Y., Bondarenko, A., Uteshev, D., Liepins, J., Zolins, S., Lihachev, A., Lihacova, I.: Embedded neural network system for microorganisms growth analysis. In: Saratov Fall Meeting 2019: Optical and Nano-Technologies for Biology and Medicine, vol. 11457, p. 1145720. International Society for Optics and Photonics (2020)
- (17) Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229. Springer (2020)
- (18) Chen, X., Ma, H., Wan, J., Li, B., Xia, T.: Multi-view 3d object detection network for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1907–1915 (2017)
- (19) Cheng, K., Cheng, X., Wang, Y., Bi, H., Benfield, M.C.: Enhanced convolutional neural network for plankton identification and enumeration. PloS one 14(7), e0219570 (2019)
- (20) Cheng, X., Ren, Y., Cheng, K., Cao, J., Hao, Q.: Method for training convolutional neural networks for in situ plankton image recognition and classification based on the mechanisms of the human eye. Sensors 20(9), 2592 (2020)
- (21) Chopra, C., Verma, R.: Novel methods based on cnn for improved bacteria classification. In: Proceedings of Fifth International Congress on Information and Communication Technology, pp. 1–16. Springer
- (22) Coltelli, P., Barsanti, L., Evangelista, V., Frassanito, A.M., Gualtieri, P.: Water monitoring: automated and real time identification and classification of algae using digital microscopy. Environmental Science: Processes & Impacts 16(11), 2656–2665 (2014)
- (23) Costa Filho, C.F.F., Levy, P.C., Xavier, C.d.M., Fujimoto, L.B.M., Costa, M.G.F.: Automatic identification of tuberculosis mycobacterium. Research on biomedical engineering 31(1), 33–43 (2015)
- (24) Cowen, R.K., Guigand, C.M.: In situ ichthyoplankton imaging system (isiis): system design and preliminary results. Limnology and Oceanography: Methods 6(2), 126–132 (2008)
- (25) Cui, J., Wei, B., Wang, C., Yu, Z., Zheng, H., Zheng, B., Yang, H.: Texture and shape information fusion of convolutional neural network for plankton image classification. In: 2018 OCEANS-MTS/IEEE Kobe Techno-Oceans (OTO), pp. 1–5. IEEE (2018)
- (26) Culverhouse, P., Ellis, R., Simpson, R., Williams, R., Pierce, R., Turner, J.: Automatic categorisation of five species of cymatocylis (protozoa, tintinnida) by artificial neural network. Marine Ecology Progress Series pp. 273–280 (1994)
- (27) Culverhouse, P., Herry, V., Parisini, T., Williams, R., Reguera, B., Gonzalez-Gil, S., Fonda, S., Cabrini, M.: Dicann: a machine vision solution to biological specimen categorisation. In: Proceedings of the EurOCEAN 2000 Conference, pp. 239–240 (2000)
- (28) Culverhouse, P.F., Simpson, R., Ellis, R., Lindley, J., Williams, R., Parisini, T., Reguera, B., Bravo, I., Zoppoli, R., Earnshaw, G., et al.: Automatic classification of field-collected dinoflagellates by artificial neural network. Marine Ecology Progress Series 139, 281–287 (1996)
- (29) Cunshe, C., Xiaojuan, L.: A new wastewater bacteria classification with microscopic image analysis. In: Proceedings of the 12th WSEAS international conference on Computers, pp. 915–921 (2008)
- (30) Dai, J., Yu, Z., Zheng, H., Zheng, B., Wang, N.: A hybrid convolutional neural network for plankton classification. In: Asian Conference on Computer Vision, pp. 102–114. Springer (2016)
- (31) Danping, W., Botao, W., Yue, Y.: The identification of powdery mildew spores image based on the integration of intelligent spore image sequence capture device. In: 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 177–180. IEEE (2013)
- (32) Deng, J., Guo, J., Xue, N., Zafeiriou, S.: Arcface: Additive angular margin loss for deep face recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4690–4699 (2019)
- (33) Devan, K.S., Walther, P., von Einem, J., Ropinski, T., Kestler, H.A., Read, C.: Detection of herpesvirus capsids in transmission electron microscopy images using transfer learning. Histochemistry and cell biology 151(2), 101–114 (2019)
- (34) Di Mauro, R., Cepeda, G., Capitanio, F., Viñas, M.: Using zooimage automated system for the estimation of biovolume of copepods from the northern argentine sea. Journal of Sea Research 66(2), 69–75 (2011)
- (35) Dietler, N., Minder, M., Gligorovski, V., Economou, A.M., Joly, D.A.H.L., Sadeghi, A., Chan, C.H.M., Koziński, M., Weigert, M., Bitbol, A.F., et al.: A convolutional neural network segments yeast microscopy images with high accuracy. Nature communications 11(1), 1–8 (2020)
- (36) Dollfus, D., Beaufort, L.: Fat neural network for recognition of position-normalised objects. Neural Networks 12(3), 553–560 (1999)
- (37) Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
- (38) Eckhorn, R., Reitbock, H.J., Arndt, M., Dicke, P.: A neural network for feature linking via synchronous activity. Canadian Journal of Microbiology 46(8), 759–763 (1989)
- (39) Embleton, K., Gibson, C., Heaney, S.: Automated counting of phytoplankton by pattern recognition: a comparison with a manual counting method. Journal of Plankton Research 25(6), 669–681 (2003)
- (40) Forero, M., Cristobal, G., Alvarez-Borrego, J.: Automatic identification techniques of tuberculosis bacteria. In: Applications of digital image processing XXVI, vol. 5203, pp. 71–81. International Society for Optics and Photonics (2003)
- (41) Forero, M.G., Sroubek, F., Cristóbal, G.: Identification of tuberculosis bacteria based on shape and color. Real-time imaging 10(4), 251–262 (2004)
- (42) Gerlach, S., Siedenberg, D., Gerlach, D., Schügerl, K., Giuseppin, M., Hunik, J.: Influence of reactor systems on the morphology of aspergillus awamori. application of neural network and cluster analysis for characterization of fungal morphology. Process biochemistry 33(6), 601–615 (1998)
- (43) Gillespie, S., Bamford, K.: Medical microbiology and infection at a glance. John Wiley & Sons (2012)
- (44) Ginoris, Y., Amaral, A., Nicolau, A., Coelho, M., Ferreira, E.: Recognition of protozoa and metazoa using image analysis tools, discriminant analysis, neural networks and decision trees. Analytica Chimica Acta 595(1-2), 160–169 (2007)
- (45) Ginoris, Y., Amaral, A., Nicolau, A., Ferreira, E., Coelho, M.: Recognition of protozoa and metazoa using image analysis tools, discriminant analysis and neural network (2006)
- (46) Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics, pp. 249–256 (2010)
- (47) Goodfellow, I., Bengio, Y., Courville, A.: Deep learning. MIT press (2016)
- (48) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778 (2016)
- (49) Hinton, G.E.: Training products of experts by minimizing contrastive divergence. Neural computation 14(8), 1771–1800 (2002)
- (50) Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural computation 18(7), 1527–1554 (2006)
- (51) Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. science 313(5786), 504–507 (2006)
- (52) Hiremath, P., Bannigidad, P.: Automatic identification and classification of bacilli bacterial cell growth phases. IJCA Special Issue on Recent Trends in Image Processing and Pattern Recognition 1(2), 48–52 (2010)
- (53) Hiremath, P., Bannigidad, P.: Digital image analysis of cocci bacterial cells using active contour method. In: 2010 International Conference on Signal and Image Processing, pp. 163–168. IEEE (2010)
- (54) Hiremath, P., Bannigidad, P.: Digital microscopic image analysis of spiral bacterial cell groups. In: International conference on intelligent systems & data processing, pp. 209–213 (2011)
- (55) Hiremath, P., Bannigidad, P.: Identification and classification of cocci bacterial cells in digital microscopic images. International journal of computational biology and drug design 4(3), 262–273 (2011)
- (56) Hiremath, P., Bannigidad, P.: Spiral bacterial cell image analysis using active contour method. Int. J. Comput. Appl 37(8), 5–9 (2012)
- (57) Hopfield, J.J.: Neural networks and physical systems with emergent collective computational abilities. In: Proceedings of the national academy of sciences, vol. 79, pp. 2554–2558. National Acad Sciences (1982)
- (58) Hu, Q., Davis, C.: Accurate automatic quantification of taxa-specific plankton abundance using dual classification with correction. Marine Ecology Progress Series 306, 51–61 (2006)
- (59) Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708 (2017)
- (60) Hung, J., Carpenter, A.: Applying faster r-cnn for object detection on malaria images. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp. 56–61 (2017)
- (61) Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning, pp. 448–456. PMLR (2015)
- (62) Kang, R., Park, B., Eady, M., Ouyang, Q., Chen, K.: Classification of foodborne bacteria using hyperspectral microscope imaging technology coupled with convolutional neural networks. Applied Microbiology and Biotechnology 104(7), 3157–3166 (2020)
- (63) Kang, R., Park, B., Eady, M., Ouyang, Q., Chen, K.: Single-cell classification of foodborne pathogens using hyperspectral microscope imaging coupled with deep learning frameworks. Sensors and Actuators B: Chemical 309, 127789 (2020)
- (64) Kay, J.W., Shinn, A., Sommerville, C.: Towards an automated system for the identification of notifiable pathogens: using gyrodactylus salaris as an example. Parasitology Today 15(5), 201–206 (1999)
- (65) Kim, G., Jo, Y., Cho, H., Choi, G., Kim, B.S., Min, H.S., Park, Y.: Automated identification of bacteria using three-dimensional holographic imaging and convolutional neural network. In: 2018 IEEE Photonics Conference (IPC), pp. 1–2. IEEE (2018)
- (66) Kiranyaz, S., Ince, T., Pulkkinen, J., Gabbouj, M., Ärje, J., Kärkkäinen, S., Tirronen, V., Juhola, M., Turpeinen, T., Meissner, K.: Classification and retrieval on macroinvertebrate image databases. Computers in biology and medicine 41(7), 463–472 (2011)
- (67) Kosov, S., Shirahama, K., Li, C., Grzegorzek, M.: Environmental microorganism classification using conditional random fields and deep convolutional neural networks. Pattern Recognition 77, 248–261 (2018)
- (68) Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Communications of the ACM 60(6), 84–90 (2017)
- (69) Kruk, M., Kozera, R., Osowski, S., Trzciński, P., Paszt, L.S., Sumorok, B., Borkowski, B.: Computerized classification system for the identification of soil microorganisms. In: AIP conference proceedings, vol. 1648, p. 660018. AIP Publishing LLC (2015)
- (70) Kulwa, F., Li, C., Zhao, X., Cai, B., Xu, N., Qi, S., Chen, S., Teng, Y.: A state-of-the-art survey for microorganism image segmentation methods and future potential. IEEE Access 7, 100243–100269 (2019)
- (71) Kumar, S., Mittal, G.S.: Rapid detection of microorganisms using image processing parameters and neural network. Food and Bioprocess Technology 3(5), 741–751 (2010)
- (72) Kylberg, G., Uppström, M., Hedlund, K.O., Borgefors, G., Sintorn, I.M.: Segmentation of virus particle candidates in transmission electron microscopy images. Journal of microscopy 245(2), 140–147 (2012)
- (73) Le Roux, N., Bengio, Y.: Representational power of restricted boltzmann machines and deep belief networks. Neural computation 20(6), 1631–1649 (2008)
- (74) LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., Jackel, L.D.: Backpropagation applied to handwritten zip code recognition. Neural computation 1(4), 541–551 (1989)
- (75) LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. In: Proceedings of the IEEE, vol. 86, pp. 2278–2324. Ieee (1998)
- (76) Lee, H., Park, M., Kim, J.: Plankton classification on imbalanced large scale database via convolutional neural networks with transfer learning. In: 2016 IEEE international conference on image processing (ICIP), pp. 3713–3717. IEEE (2016)
- (77) Li, C.: Content-based microscopic image analysis, vol. 39. Logos Verlag Berlin GmbH (2016)
- (78) Li, C., Kulwa, F., Zhang, J., Li, Z., Xu, H., Zhao, X.: A review of clustering methods in microorganism image analysis. In: Information Technology in Biomedicine, pp. 13–25. Springer (2020)
- (79) Li, C., Shirahama, K., Grzegorzek, M.: Application of content-based image analysis to environmental microorganism classification. Biocybernetics and Biomedical Engineering 35(1), 10–21 (2015)
- (80) Li, C., Wang, K., Xu, N.: A survey for the applications of content-based microscopic image analysis in microorganism classification domains. Artificial Intelligence Review 51(4), 577–646 (2019)
- (81) Li, C., Xu, N., Jiang, T., Qi, S., Han, F., Qian, W., Zhao, X.: A brief review for content-based microorganism image analysis using classical and deep neural networks. In: International Conference on Information Technologies in Biomedicine, pp. 3–14. Springer (2018)
- (82) Li, C., Zhang, J., Zhao, X., Kulwa, F., Li, Z., Xu, H., Li, H.: Mrfu-net: A multiple receptive field u-net for environmental microorganism image segmentation. In: Information Technology in Biomedicine, pp. 27–40. Springer (2020)
- (83) Li, X., Li, C., Kulwa, F., Rahaman, M.M., Zhao, W., Wang, X., Xue, D., Yao, Y., Cheng, Y., Li, J., et al.: Foldover features for dynamic object behaviour description in microscopic videos. IEEE Access 8, 114519–114540 (2020)
- (84) Li, Z., Li, C., Yao, Y., Zhang, J., Rahaman, M.M., Xu, H., Kulwa, F., Lu, B., Zhu, X., Jiang, T.: Emds-5: Environmental microorganism image dataset fifth version for multiple image analysis tasks. Plos one 16(5), e0250631 (2021)
- (85) Ling, S., Pastor, A., Li, J., Che, Z., Wang, J., Kim, J., Callet, P.L.: Few-shot pill recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9789–9798 (2020)
- (86) Linnosmaa, J., Tikka, P., Suomalainen, J., Papakonstantinou, N.: Machine learning in safety critical industry domains. VTT Technical Research Centre of Finland (2020)
- (87) Liu, J., Du, A., Wang, C., Yu, Z., Zheng, H., Zheng, B., Zhang, H.: Deep pyramidal residual networks for plankton image classification. In: 2018 OCEANS-MTS/IEEE Kobe Techno-Oceans (OTO), pp. 1–5. IEEE (2018)
- (88) Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., Song, L.: Sphereface: Deep hypersphere embedding for face recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 212–220 (2017)
- (89) Lumini, A., Nanni, L.: Deep learning and transfer learning features for plankton classification. Ecological informatics 51, 33–43 (2019)
- (90) Luo, J.Y., Irisson, J.O., Graham, B., Guigand, C., Sarafraz, A., Mader, C., Cowen, R.K.: Automated plankton image analysis using convolutional neural networks. Limnology and Oceanography: Methods 16(12), 814–827 (2018)
- (91) Madigan, M.T., Martinko, J.M., Parker, J., et al.: Brock biology of microorganisms, vol. 11. Prentice hall Upper Saddle River, NJ (1997)
- (92) Matuszewski, D.J., Sintorn, I.M.: Minimal annotation training for segmentation of microscopy images. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 387–390. IEEE (2018)
- (93) McCulloch, W.S., Pitts, W.: A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics 5(4), 115–133 (1943)
- (94) Mhathesh, T., Andrew, J., Sagayam, K.M., Henesey, L.: A 3d convolutional neural network for bacterial image classification. In: Intelligence in Big Data Technologies—Beyond the Hype, pp. 419–431. Springer (2020)
- (95) Minsky, M., Papert, S.A.: Perceptrons: An introduction to computational geometry. MIT press (2017)
- (96) Mosleh, M.A., Manssor, H., Malek, S., Milow, P., Salleh, A.: A preliminary study on automated freshwater algae recognition and classification system. BMC Bioinformatics 13(Suppl 17), S25 (2012)
- (97) Nair, V., Hinton, G.E.: Rectified linear units improve restricted boltzmann machines. In: ICML (2010)
- (98) Nie, D., Shank, E.A., Jojic, V.: A deep framework for bacterial image segmentation and classification. In: Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics, pp. 306–314 (2015)
- (99) Osman, M., Mashor, M., Jaafar, H.: Hybrid multilayered perceptron network trained by modified recursive prediction error-extreme learning machine for tuberculosis bacilli detection. In: 5th Kuala Lumpur International Conference on Biomedical Engineering 2011, pp. 667–673. Springer (2011)
- (100) Osman, M., Mashor, M., Jaafar, H.: Tuberculosis bacilli detection in ziehl-neelsen-stained tissue using affine moment invariants and extreme learning machine. In: 2011 IEEE 7th International Colloquium on Signal Processing and its Applications, pp. 232–236. IEEE (2011)
- (101) Osman, M.K., Mashor, M.Y., Jaafar, H.: Detection of mycobacterium tuberculosis in ziehl-neelsen stained tissue images using zernike moments and hybrid multilayered perceptron network. In: 2010 IEEE International Conference on Systems, Man and Cybernetics, pp. 4049–4055. IEEE (2010)
- (102) Osman, M.K., Mashor, M.Y., Jaafar, H.: Segmentation of tuberculosis bacilli in ziehl-neelsen tissue slide images using hibrid multilayered perceptron network. In: 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010), pp. 365–368. IEEE (2010)
- (103) Osman, M.K., Mashor, M.Y., Jaafar, H.: Online sequential extreme learning machine for classification of mycobacterium tuberculosis in ziehl-neelsen stained tissue. In: 2012 International Conference on Biomedical Engineering (ICoBE), pp. 139–143. IEEE (2012)
- (104) Pedraza, A., Bueno, G., Deniz, O., Cristóbal, G., Blanco, S., Borrego-Ramos, M.: Automated diatom classification (part b): a deep learning approach. Applied Sciences 7(5), 460 (2017)
- (105) Pedraza, A., Bueno, G., Deniz, O., Ruiz-Santaquiteria, J., Sanchez, C., Blanco, S., Borrego-Ramos, M., Olenici, A., Cristobal, G.: Lights and pitfalls of convolutional neural networks for diatom identification. In: Optics, Photonics, and Digital Technologies for Imaging Applications V, vol. 10679, p. 106790G. International Society for Optics and Photonics (2018)
- (106) Pepper, I.L., Gerba, C.P., Gentry, T.J., Maier, R.M.: Environmental microbiology. Academic press (2011)
- (107) Połap, D., Woźniak, M.: Bacteria shape classification by the use of region covariance and convolutional neural network. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–7. IEEE (2019)
- (108) Priya, E., Srinivasan, S.: Automated identification of tuberculosis objects in digital images using neural network and neuro fuzzy inference systems. Journal of Medical Imaging and Health Informatics 5(3), 506–512 (2015)
- (109) Priya, E., Srinivasan, S.: Automated object and image level classification of tb images using support vector neural network classifier. Biocybernetics and Biomedical Engineering 36(4), 670–678 (2016)
- (110) Py, O., Hong, H., Zhongzhi, S.: Plankton classification with deep convolutional neural networks. In: 2016 IEEE Information Technology, Networking, Electronic and Automation Control Conference, pp. 132–136. IEEE (2016)
- (111) Rahaman, M.M., Li, C., Wu, X., Yao, Y., Hu, Z., Jiang, T., Li, X., Qi, S.: A survey for cervical cytopathology image analysis using deep learning. IEEE Access 8, 61687–61710 (2020)
- (112) Rahaman, M.M., Li, C., Yao, Y., Kulwa, F., Rahman, M.A., Wang, Q., Qi, S., Kong, F., Zhu, X., Zhao, X.: Identification of covid-19 samples from chest x-ray images using deep learning: A comparison of transfer learning approaches. Journal of X-ray Science and Technology 28(5), 821–839 (2020)
- (113) Rawat, S.S., Bisht, A., Nijhawan, R.: A deep learning based cnn framework approach for plankton classification. In: 2019 Fifth International Conference on Image Information Processing (ICIIP), pp. 268–273. IEEE (2019)
- (114) Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: Unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788 (2016)
- (115) Rissino, S., Lambert-Torres, G.: Rough set theory—fundamental concepts, principals, data extraction, and applications. In: Data mining and knowledge discovery in real life applications. IntechOpen (2009)
- (116) Robertson, S., Azizpour, H., Smith, K., Hartman, J.: Digital image analysis in breast pathology—from image processing techniques to artificial intelligence. Translational Research 194, 19–35 (2018)
- (117) Rodrigues, F.C.M., Hirata, N.S., Abello, A.A., Leandro, T., La Cruz, D., Lopes, R.M., Hirata Jr, R.: Evaluation of transfer learning scenarios in plankton image classification. In: VISIGRAPP (5: VISAPP), pp. 359–366 (2018)
- (118) Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention, pp. 234–241. Springer (2015)
- (119) Rosenblatt, F.: The perceptron: a probabilistic model for information storage and organization in the brain. Psychological review 65(6), 386 (1958)
- (120) Rujichan, C., Vongserewattana, N., Phasukkit, P.: Bacteria classification using image processing and deep convolutional neural network. In: 2019 12th Biomedical Engineering International Conference (BMEiCON), pp. 1–4. IEEE (2019)
- (121) Rulaningtyas, R., Suksmono, A.B., Mengko, T.L.: Automatic classification of tuberculosis bacteria using neural network. In: Proceedings of the 2011 International Conference on Electrical Engineering and Informatics, pp. 1–4. IEEE (2011)
- (122) Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. nature 323(6088), 533–536 (1986)
- (123) Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: Imagenet large scale visual recognition challenge. International journal of computer vision 115(3), 211–252 (2015)
- (124) Salakhutdinov, R., Hinton, G.: Deep boltzmann machines. In: Artificial intelligence and statistics, pp. 448–455 (2009)
- (125) Sallab, A.E., Abdou, M., Perot, E., Yogamani, S.: Deep reinforcement learning framework for autonomous driving. Electronic Imaging 2017(19), 70–76 (2017)
- (126) Sap, M., Mohebi, E.: Hybrid self organizing map for overlapping clusters. International Journal of Signal Processing, Image Processing and Pattern Recognition 1(1), 11–20 (2008)
- (127) Schulze, K., Tillich, U.M., Dandekar, T., Frohme, M.: Planktovision-an automated analysis system for the identification of phytoplankton. BMC bioinformatics 14(1), 1–10 (2013)
- (128) Serrão, M., Costa, M., Fujimoto, L., Ogusku, M., Costa Filho, C.: Automatic bacillus detection in light field microscopy images using convolutional neural networks and mosaic imaging approach. In: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pp. 1903–1906. IEEE (2020)
- (129) Shabtai, Y., Ronen, M., Mukmenev, I., Guterman, H.: Monitoring microbial morphogenetic changes in a fermentation process by a self-tuning vision system (stvs). Computers & chemical engineering 20, S321–S326 (1996)
- (130) Siena, I., Adi, K., Gernowo, R., Mirnasari, N.: Development of algorithm tuberculosis bacteria identification using color segmentation and neural networks. International Journal of Video and Image Processing and Network Security 12(4), 9–13 (2012)
- (131) Silva, B., Marques, N.: A hybrid parallel som algorithm for large maps in data-mining. New Trends in Artificial Intelligence (2007)
- (132) Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
- (133) Sun, C., Li, C., Zhang, J., Rahaman, M.M., Ai, S., Chen, H., Kulwa, F., Li, Y., Li, X., Jiang, T.: Gastric histopathology image segmentation using a hierarchical conditional random field. Biocybernetics and Biomedical Engineering 40(4), 1535–1555 (2020)
- (134) Swetha, K., Sankaragomathi, B., Thangamalar, J.B.: Convolutional neural network based automated detection of mycobacterium bacillus from sputum images. In: 2020 International Conference on Inventive Computation Technologies (ICICT), pp. 293–300. IEEE (2020)
- (135) Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, inception-resnet and the impact of residual connections on learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017)
- (136) Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1–9 (2015)
- (137) Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2818–2826 (2016)
- (138) Tahir, M.W., Zaidi, N.A., Rao, A.A., Blank, R., Vellekoop, M.J., Lang, W.: A fungus spores dataset and a convolutional neural network based approach for fungus detection. IEEE transactions on nanobioscience 17(3), 281–290 (2018)
- (139) Tamiev, D., Furman, P.E., Reuel, N.F.: Automated classification of bacterial cell sub-populations with convolutional neural networks. PloS one 15(10), e0241200 (2020)
- (140) Treebupachatsakul, T., Poomrittigul, S.: Microorganism image recognition based on deep learning application. In: 2020 International Conference on Electronics, Information, and Communication (ICEIC), pp. 1–5. IEEE (2020)
- (141) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Advances in neural information processing systems, pp. 5998–6008 (2017)
- (142) Veropoulos, K., Campbell, C., Learmonth, G.: Image processing and neural computing used in the diagnosis of tuberculosis. In: IEE Colloquium on Intelligent Methods in Healthcare and Medical Applications (Digest No. 1998/514), pp. 8–1. IET (1998)
- (143) Wahid, M.F., Ahmed, T., Habib, M.A.: Classification of microscopic images of bacteria using deep convolutional neural network. In: 2018 10th International Conference on Electrical and Computer Engineering (ICECE), pp. 217–220. IEEE (2018)
- (144) Wahid, M.F., Hasan, M.J., Alom, M.S.: Deep convolutional neural network for microscopic bacteria image classification. In: 2019 5th International Conference on Advances in Electrical Engineering (ICAEE), pp. 866–869. IEEE (2019)
- (145) Wang, C., Zheng, X., Guo, C., Yu, Z., Yu, J., Zheng, H., Zheng, B.: Transferred parallel convolutional neural network for large imbalanced plankton database classification. In: 2018 OCEANS-MTS/IEEE Kobe Techno-Oceans (OTO), pp. 1–5. IEEE (2018)
- (146) Wang, H., Wang, Y., Zhou, Z., Ji, X., Gong, D., Zhou, J., Li, Z., Liu, W.: Cosface: Large margin cosine loss for deep face recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5265–5274 (2018)
- (147) Wang, Y., Chao, W.L., Garg, D., Hariharan, B., Campbell, M., Weinberger, K.Q.: Pseudo-lidar from visual depth estimation: Bridging the gap in 3d object detection for autonomous driving. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8445–8453 (2019)
- (148) Wang, Y., Yao, Q., Kwok, J.T., Ni, L.M.: Generalizing from a few examples: A survey on few-shot learning. ACM Computing Surveys (CSUR) 53(3), 1–34 (2020)
- (149) Weller, A.F., Corcoran, J., Harris, A.J., Ware, J.A.: The semi-automated classification of sedimentary organic matter in palynological preparations. Computers & geosciences 31(10), 1213–1223 (2005)
- (150) Weller, A.F., Harris, A.J., Ware, J.A.: Two supervised neural networks for classification of sedimentary organic matter images from palynological preparations. Mathematical geology 39(7), 657–671 (2007)
- (151) West, J., Ventura, D., Warnick, S.: Spring research presentation: A theoretical foundation for inductive transfer. Brigham Young University, College of Physical and Mathematical Sciences 1(08) (2007)
- (152) Widmer, K.W., Srikumar, D., Pillai, S.D.: Use of artificial neural networks to accurately identify cryptosporidium oocyst and giardia cyst images. Applied and environmental microbiology 71(1), 80–84 (2005)
- (153) Wit, P., Busscher, H.: Application of an artificial neural network in the enumeration of yeasts and bacteria adhering to solid substrata. Journal of microbiological methods 32(3), 281–290 (1998)
- (154) Xiaojuan, L., Cunshe, C.: A novel bacteria recognition method based on microscopic image analysis. New Zealand Journal of Agricultural Research 50(5), 697–703 (2007)
- (155) Xiaojuan, L., Cunshe, C.: A novel wastewater bacteria recognition method based on microscopic image analysis. In: WSEAS International Conference. Proceedings. Mathematics and Computers in Science and Engineering, 7. World Scientific and Engineering Academy and Society (2008)
- (156) Xiaojuan, L., Cunshe, C.: An improved bp neural network for wastewater bacteria recognition based on microscopic image analysis. WSEAS Transactions on computers 8(2), 237–247 (2009)
- (157) Xiaojuan, L., Cunshe, C., Huimei, Y.: A novel bacteria classification scheme based on microscopic image analysis. WSEAS Transactions on Systems 6(8), 1250 (2007)
- (158) Xu, H., Li, C., Rahaman, M.M., Yao, Y., Li, Z., Zhang, J., Kulwa, F., Zhao, X., Qi, S., Teng, Y.: An enhanced framework of generative adversarial networks (ef-gans) for environmental microorganism image augmentation with limited rotation-invariant training data. IEEE Access 8, 187455–187469 (2020)
- (159) Yamaguchi, T., Kawakami, S., Hatamoto, M., Imachi, H., Takahashi, M., Araki, N., Yamaguchi, T., Kubota, K.: In situ dna-hybridization chain reaction (hcr): a facilitated in situ hcr system for the detection of environmental microorganisms. Environmental Microbiology 17(7), 2532–2541 (2015)
- (160) Yamashita, T.: An illustrated guide to deep learning. Kodansha Ltd. (2016)
- (161) Yan, J., Li, X., Cui, Z.: A more efficient cnn architecture for plankton classification. In: CCF Chinese Conference on Computer Vision, pp. 198–208. Springer (2017)
- (162) Yu, L., Liu, H.: Feature selection for high-dimensional data: A fast correlation-based filter solution. In: Proceedings of the 20th international conference on machine learning (ICML-03), pp. 856–863 (2003)
- (163) Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint arXiv:1605.07146 (2016)
- (164) Zawadzki, P.: Deep learning approach to the classification of selected fungi and bacteria. In: 2020 IEEE 21st International Conference on Computational Problems of Electrical Engineering (CPEE), pp. 1–4. IEEE (2020)
- (165) Zeder, M., Kohler, E., Pernthaler, J.: Automated quality assessment of autonomously acquired microscopic images of fluorescently stained bacteria. Cytometry Part A: The Journal of the International Society for Advancement of Cytometry 77(1), 76–85 (2010)
- (166) Zhang, J., Li, C., Kosov, S., Grzegorzek, M., Shirahama, K., Jiang, T., Sun, C., Li, Z., Li, H.: Lcu-net: A novel low-cost u-net for environmental microorganism image segmentation. Pattern Recognition 115, 107885 (2021)
- (167) Zhang, J., Li, C., Kulwa, F., Zhao, X., Sun, C., Li, Z., Jiang, T., Li, H., Qi, S.: A multiscale cnn-crf framework for environmental microorganism image segmentation. BioMed Research International 2020, Article ID 4621403 (2020)
- (168) Zhao, Z.Q., Zheng, P., Xu, S.T., Wu, X.: Object detection with deep learning: A review. IEEE transactions on neural networks and learning systems 30(11), 3212–3232 (2019)
- (169) Zhou, X., Li, C., Rahaman, M.M., Yao, Y., Ai, S., Sun, C., Wang, Q., Zhang, Y., Li, M., Li, X., et al.: A comprehensive review for breast histopathology image analysis using classical and deep neural networks. IEEE Access 8, 90931–90956 (2020)
- (170) Zhu, Y., Wang, Z., Zhou, J., Wang, Z.: Bacteria classification using neural network. In: 2010 Sixth International Conference on Natural Computation, vol. 3, pp. 1199–1203. IEEE (2010)
- (171) Zieliński, B., Plichta, A., Misztal, K., Spurek, P., Brzychczy-Włoch, M., Ochońska, D.: Deep learning approach to bacterial colony classification. PloS one 12(9), e0184554 (2017)