DeepAI
Log In Sign Up

A Conditional Random Field Model for Context Aware Cloud Detection in Sky Images

06/18/2019
by   Vijai T. Jayadevan, et al.
0

A conditional random field (CRF) model for cloud detection in ground based sky images is presented. We show that very high cloud detection accuracy can be achieved by combining a discriminative classifier and a higher order clique potential in a CRF framework. The image is first divided into homogeneous regions using a mean shift clustering algorithm and then a CRF model is defined over these regions. The various parameters involved are estimated using training data and the inference is performed using Iterated Conditional Modes (ICM) algorithm. We demonstrate how taking spatial context into account can boost the accuracy. We present qualitative and quantitative results to prove the superior performance of this framework in comparison with other state of the art methods applied for cloud detection.

READ FULL TEXT VIEW PDF
01/16/2012

Image Labeling and Segmentation using Hierarchical Conditional Random Field Model

The use of hierarchical Conditional Random Field model deal with the pro...
01/05/2015

A Deep-structured Conditional Random Field Model for Object Silhouette Tracking

In this work, we introduce a deep-structured conditional random field (D...
03/19/2021

Masked Conditional Random Fields for Sequence Labeling

Conditional Random Field (CRF) based neural models are among the most pe...
11/30/2020

A CRF-based Framework for Tracklet Inactivation in Online Multi-Object Tracking

Online multi-object tracking (MOT) is an active research topic in the do...
12/18/2014

Deep Structured Output Learning for Unconstrained Text Recognition

We develop a representation suitable for the unconstrained recognition o...

I Introduction

Ground based sky imaging (GBSI) systems have become very popular these days for the task of making cloud observations. For instance GBSI systems are employed extensively for predicting intermittency due to clouds in the field of intra-hour solar power forecasting [1, 2, 3]. The Whole Sky Imager (WSI) developed by the Scripps Institution of Oceanography at the University of California, San Diego [4] and the Total Sky Imager (TSI) developed by Yankee Environmental Systems, Inc. [5] are two popular GBSI systems. We [3] like many other research groups around the world [5, 6, 7, 8] have developed our own GBSI system. But unlike most other GBSI systems which are stationary and use either an upward looking camera fitted with a fish eye lens or a camera looking down on to a curved mirror to obtain a complete view of the sky, we use a low cost camera fitted with a fish eye lens which tracks the sun. Higher resolution around the sun (which is preferred since we care about predicting solar power intermittency due to clouds) is one of the advantages of this set up. Since it tracks the sun the position of the sun on the image remains constant and this makes occluding the sun (to prevent saturation of the image) easier. Also the relative area occupied by this occlusion is small in comparison to other imaging systems which is another advantage of this set up.

Our main goal is to predict intermittency due to clouds for which cloud detection is an important step. One of the earliest works used an empirically derived fixed threshold on the red to blue channel (RB) ratio for cloud detection [9, 10, 4]. Another fixed thresholding scheme was suggested by Souza-Echer et al. where they apply a fixed threshold on the saturation component of the IHS colorspace [11]. Neto et al.[12] used multidimensional Euclidean geometric distance (EGD) and Bayesian methods to classify sky and cloud patterns based on the observation that sky and cloud patterns occupy different loci on the RGB colorspace. Recently Ghonima et al. [13] proposed the use of the difference in RB ratio between the pixel to be classified and the corresponding pixel in the clear sky library for cloud opacity measurements. They described a method to compensate for the variations in aerosol optical depth by using a haze correction factor derived with the help of a clear sky library. Yamashita et. al [14] and Li et al. [15] proposed the use of normalized blue to red channel (NBR) ratio (Yamashita et al. called it sky index) for cloud classification and detection. NBR ratio is defined as

where and

represents the blue and red channel intensities of the pixel respectively. Li et al. also proposed a hybrid algorithm combining fixed and adaptive thresholding schemes. They apply a threshold on the standard deviation of NBR ratio to decide whether to use a fixed thresholding scheme or an adaptive thresholding scheme. If the standard deviation of NBR ratio is below a threshold a fixed threshold determined statistically from training data is used for cloud detection. Otherwise an adaptive thresholding algorithm based on cross entropy minimization is applied to obtain the threshold for the image.

]]. In [16] we showed that NSV ratio could serve as a contrast enhancing feature suitable for adaptive thresholding. The NSV ratio is defined as

where

is the value component and is the saturation component in the HSV colorspace. As discussed in [16, 15] and as depicted in the Fig. 1 there is considerable overlap between cloud and sky pixels in the various feature spaces. Please note that unlike in [16] the histograms shown here corresponds to a set of images used as a training set and not of a single image. These histograms depict the problem associated with a fixed thresholding scheme. Whatever be the threshold that is picked it will inevitably misclassify some pixels.

On the other hand in the case of adaptive thresholding it is only necessary to ensure that the classes do not overlap in the feature space on an image by image basis. In [16]

though we showed that the NSV ratio provides a strong separation between the two classes (cloud and sky) in the feature space for many images, we also mentioned that this is not always true. The dependency of the NSV ratio on the value component of the pixels (in the HSV space) causes the NSV ratio for some dark clouds to overlap with that of sky pixels. We also showed that an adaptive thresholding scheme like cross entropy minimization might pick a wrong threshold if the intra-class variance is high, which is a likely scenario in the case of sky images. Minimum cross entropy thresholding and Otsu’s thresholding can be viewed as a surface fitting problem, and the best fit may not always correspond to the best segmentation

[16].

To overcome these issues, we propose a cloud detection scheme based on the CRF framework. We explain how this model addresses each of the issues mentioned above. Quantitative and qualitative results are presented, and possible extensions to this work are discussed towards the end.

Fig. 1: Normalized Histograms in various feature spaces.

Ii Proposed Method

Probabilistic graphical models like Markov random fields (MRF) and conditional random fields (CRF) have been extensively applied for the task of contextual image segmentation [17, 18, 19, 20, 21, 22, 23]. Li et al. [20]

proposed an MRF model for detecting thin clouds. Though MRF models have proven to be successful for computer vision tasks like segmentation and image denoising, they have some drawbacks. Being a generative model, MRF based approaches model the joint density

, where represents the input image data, where is the set of all sites. A site can correspond to a single pixel or a group of pixels (region). And , represents the class labels corresponding to the sites (y is often referred to as a label configuration) and

is the set of all possible labels/classes. In the MAP (maximum a posteriori) MRF framework, Bayes’ rule is employed to derive the posterior probability of the labels given the data:

In many cases such as in the case of cloud detection x is always observed (i.e. during training as well as while performing inference or classifying) and hence it is not necessary to model which might not be a simple function. Another drawback of MRF modeling is that often for tractability the likelihood is assumed to have a factorized form, i.e. = . This assumption is too restrictive as complex relationships usually exist between data at neighboring sites. And finally the MRF model imposes label consistency (modeled as the prior ) uniformly over the entire image without taking into consideration the observed data.

CRF models, on the other hand, are discriminative models which directly model , the probability of a label configuration given the data. The CRF model therefore does not attempt to model the distribution of input data . The conditional independence assumption of the likelihood model given the labels is relaxed in the case of CRFs. CRFs also allow us to model data-dependent label interactions as the clique potentials, or the interaction potentials are functions of both the labels and the data. This property, as we will see, plays a crucial role in cloud detection. Due to the above-mentioned factors, CFRs outperform MRF models for various computer vision tasks including image segmentation [24, 22].

Fig. 2: a) Class conditional distribution for NBR ratio obtained using training data and b) Logistic function, , with parameters = 6.072 and = -37.001 estimated using training data.

Ii-a CRF Model

As mentioned before, we combine a discriminative classifier and a higher order clique potential in a CRF framework. Again represents the input image data, where is the set of all sites. We divide the image into homogeneous regions using mean shift clustering [25], hence each region defines a site. The labels corresponding to the sites are , with where 0 is the label for sky and 1 is the label for cloud. Following the discriminative random field approach [22], assuming y to obey the Markov property conditioned on the data x (i.e., , where represents the neighborhood of site ), and using the Hammersley Clifford theorem [26] the posterior distribution of the label configuration is defined as

(1)

where

(2)
(3)

and

(4)

and represent the NBR ratio and NSV ratio at site , respectively; are the parameters that need to be estimated. is the partition function and serves as a normalization factor. It is computed by summing the model over all possible configurations of y.

Both the NBR and NSV ratios are utilized in our CRF model. As can be seen in Fig. 2, the class conditional density for cloud is almost one for all NBR values below 0.11, and it is almost zero for all values above 0.2. It is mainly in the range [0.11, 0.20] that reliance on NBR alone would result in misclassification, and outside this range we can make cloud detection decisions almost unambiguously. Thus, there are a considerable number of regions which we can classify as cloud or sky with very high confidence. And for the remaining regions where there are ambiguities, we can look at the spatial context to make decisions. As we showed in [16] the NSV ratio exhibits enhanced contrast between cloud and sky, so it could serve as an effective, local feature for testing spatial consistency. But due to the dependency of this ratio on the value component of the HSV space of the image, it makes more sense to use the NBR ratio as a global feature.

Fig. 3: Figure illustrating how the interaction potential is defined.

This idea can be modeled very well using a CRF framework. The function

is nothing but a logistic regression classifier. This classifier gives a well calibrated probability value indicating whether the site under consideration is cloud or sky just based on the NBR ratio at that site. As shown in Fig. 2, the logistic function derived from the training data is a smoother version of the class conditional density for the cloud. In the DRF framework terminology

[22] the logistic function is the association potential. Now in order to take the spatial context into account, we define the interaction potential function . Before defining the interaction potential we need to define the neighborhood for the site . In our case, we define as all regions which are within a 200-pixel radius of the centroid of site (the site is not part of the neighborhood). It is not necessary that a region be entirely within this radius to be considered a neighbor; even if it is partially within this range, it will be considered a neighbor. The interaction potential can now be understood with the help of Fig. 3. and are the average NSV values of sky neighbors and cloud neighbors, respectively. Clouds usually have higher NSV values in comparison to sky regions, hence . The interaction potential has been defined in such a manner that and serve as reference points. I.e., the farther the NSV ratio of a site is to the left of the higher the probability that it is a sky region, and the farther it is to the right of the higher the probability that it is cloud. If the NSV ratio of a site is in between and , then the interaction potential is negative, and the magnitude will depend on its proximity to the two reference points.

As stated earlier, the mean shift clustering algorithm [25] is used to form homogeneous regions in an image. The parameters including range bandwidth, color bandwidth, minimum number of pixels in a region, etc., are set manually to produce the best qualitative results. Defining the CRF model over regions instead of pixels has two main advantages. First, it would help in speeding up the inference algorithm (the algorithm used to find the most probable labeling configuration given a new image) and secondly it will help combat noise as we will demonstrate in the results section.

Ii-B Parameter Estimation and Inference

The parameters in CRF are usually estimated using maximum likelihood estimation (MLE) [27, 28, 24]

. MLE finds the parameters that maximize the conditional likelihood of the true labels given the training data. Stochastic gradient descent (SGD) is usually employed to find the MLE estimates

[24]. In SGD the partial derivatives with respect to each parameter are evaluated. These partial derivatives involve the computation of the expected value of the feature function (for e.g. interaction potential in our case) over all possible label configurations [27, 28]. Since this computation is intractable, the expectation is approximated by coming up with an approximation for . Here, represents the parameter set {}. Note that

is the probability density over all possible label configurations given an image and the parameters, and it needs to be calculated for each image. Loopy belief propagation, Markov chain Monte Carlo (MCMC) and Contrastive divergence are some methods used for approximate training

[27, 28]. Being approximations, these methods may not estimate the parameters well [21]. Furthermore, for methods like belief propagation, the time complexity is exponential in the size of the largest clique. And in our case, since the CRF model involves higher order cliques, the largest clique might have a size of 80, hence loopy belief propagation cannot be employed.

An alternative, practical approach is piecewise training [29, 21, 23]. In piecewise training each piece of the CRF model is learned independently. In our case, parameters for association potential and interaction potential are learned independently. The training methodology is similar to that of [23]. Our training data set consists of eight manually labeled images. We first estimate the association potential parameters and using four images from the training set. We use the R project for statistical computing [30] to estimate the parameters. The glm function in the stats core package is used for this purpose. This function uses the iteratively reweighted least squares (IWLS) method to obtain the fit. Once the parameters for association potential are estimated, we estimate the parameter for the interaction potential by keeping the association potential parameters constant. The second half of the training dataset (four images) which was not used to train the association potential is used for this purpose. The interaction potential parameter, , which minimized the pixel wise classification error (i.e. the total number of misclassified sky pixels and cloud pixels) was picked.

ATS ATS CRF
(NSV ratio) (NBR ratio)
Accuracy 0.7979   0.0963 0.7173   0.1654 0.9346   0.0269
Precision 0.7782   0.1552 0.7324   0.1954 0.9561   0.0560
Recall 0.9047   0.0838 0.9022   0.1196 0.9022   0.0471
TABLE I:

Accuracy, Precision and Recall Values with 99.9% Confidence Interval for Adaptive Thresholding Schemes (ATS) and CRF Based Method

Fixed Thresholding CRF
Accuracy 0.8892   0.0359 0.9436   0.0181
Precision 0.9420  0.0378 0.9597   0.0341
Recall 0.8236   0.0540 0.9095   0.0309
TABLE II: Accuracy, Precision and Recall Values with 98% Confidence Interval for Fixed Thresholding Scheme and CRF Based Method

Once we have estimated all the parameters, we need an inference algorithm which would find the most probable labeling configuration given an image (data) and the parameters. I.e., we want to find such that

Due to the intractability of exact inference , we use the local search alg, iterative conditional modes (ICM), proposed by Besag [31]. ICM is an iterative procedure that maximizes the local conditional probability. For each site in an image we find

This procedure is iterated until convergence. ICM requires an initial labeling configuration to begin with. The output of the logistic regression classifier is used as the initial guess.

During training and inference, there may be situations where a particular site has only one class of sites around it, such that all its neighbors are cloud or all are sky. In such scenarios we use the average sky and cloud NSV values of the entire image instead of the corresponding values of the neighbors. In order to calculate the average sky and cloud NSV values of the entire image, the labeling configuration from the previous ICM iteration is used.

Iii Results

In order to evaluate the performance of the proposed method, two separate test datasets were used. One of these datasets (Set C) contains 22 images and was used to compare the performance of the CRF-based method with the minimum cross entropy method proposed by Li et al.[15]. Li’s method uses a hybrid scheme combining fixed thresholding and adaptive thresholding. The decision to use an adaptive threshold or fixed threshold is based on the standard deviation of the NBR ratio. But we observed that this decision rule does not always work well because some clear sky and completely cloudy images have a standard deviation greater than that of some images containing both sky and cloud. Hence, for a fair comparison we have only included images containing both sky and cloud in the first dataset. The second dataset (Set D) contains 26 images in total and was used to compare the performance of the proposed CRF method with that of fixed thresholding proposed by Long et al. [5]. This data set contains all the 22 images in the first dataset with an addition of two completely cloudy and two clear sky images.

Replace[[

Accuracy, precision and recall computed based on the confusion matrix are employed as metrics for performance evaluation

[32]. The metrics are defined below:

Here, TP (true positive) is the number of cloud pixels classified correctly, TN (true negative) is the number of sky pixels classified correctly, FP (false positive) is the number of cloud pixels that got misclassified as sky pixels and FN (false negative) is the number of sky pixels which got misclassified as cloud pixels.]]

Fig. 4: a) Original image and segmentation results using adaptive thresholding with b) NBR Ratio and c) NSV Ratio as features.
Fig. 5: a) Original image and segmentation results using the CRF model b) with = 0 and c) with = 0.95 (the estimated value).
Fig. 6: a) Original image, b) pixel-wise classification result of logistic regression and c) segmentation result using the CRF model with = 0.
Fig. 7: Segmentation results obtained using the CRF model for some images from the test database.

Table I shows the results for two adaptive thresholding schemes and CRF method using Set C. Both adaptive thresholding schemes are based on cross entropy minimization, the difference being that one uses NSV as the feature while the other uses NBR. The accuracy of the CRF-based method is much higher in comparison with the other two methods, and the accuracy values do not overlap even at 99.9% confidence. The NSV thresholding performed better than NBR thresholding. The reason for the poor performance of adaptive thresholding is that the best fit does not always correspond to the best segmentation/classification , especially when there is high intra-class variance. For instance, as shown in Fig. 4, the sky region at the top portion of the image is dark in comparison to the sky region at the bottom of the image. And when the adaptive thresholding algorithm tries to fit a binary surface that minimizes the cross entropy, it ends up misclassifying the entire bottom portion of the image.

The results of comparing the performance of our method with the fixed thresholding scheme is provided in Table II. Though the fixed thresholding scheme does better than adaptive thresholding, the CRF method provides considerable improvement in accuracy. The accuracy values do not overlap at 98% confidence level, so we can be 98% confident that the CRF method is better. Here, taking the spatial context into consideration leads to better accuracy. Fig. 5 shows that the inclusion of interaction potential helps in correcting some mistakes made in segmentation using the association potential alone. Looking at the local context is really helpful in regions where the histograms of sky and cloud regions overlap in the feature space.

Forming regions using mean shift segmentation and then performing region-based classification helps in dealing with noise. Fig. 6 shows that using logistic regression (association potential) to classify the image on a pixel-by-pixel basis yields noisier results than classifying regions formed by mean shift segmentation. Hence, defining the CRF model over regions not only speeds up computation but also helps in combating noise. However, in cases where ta cloud has an unusual shape, segmentation may fail to find the proper regions. But for most cases, a region based approach works well. Fig. 7 shows the segmentation results for some images obtained using our CRF method.

Iv Conclusion

We have presented a CRF model for cloud detection on ground-based sky images which outperforms Li’s adaptive thresholding algorithm and Long’s fixed thresholding algorithm on our images. Though our model uses only two features (NSV ratio and NBR ratio) it is very flexible, and more features (if found useful) can be added very easily. For instance, adding more features into the association potential is trivial. Similarly, the interaction potential could be replaced by one or more features that may help take the local context into account in a better way. Since each imaging system is different, it is possible to find features different from what we have used which are better suited for different imaging setups. But our aim here is to demonstrate that the idea of combining a discriminative classifier with a higher-order clique potential in a CRF framework is a really powerful scheme for cloud detection.

Acknowledgment

We would like to acknowledge Tucson Electric Power (TEP), The Arizona Research Institute for Solar Energy Research (AZRISE) and Arizona TRIF program for supporting this work.

References

  • [1] C. W. Chow, B. Urquhart, M. Lave, A. Dominguez, J. Kleissl, J. Shields, and B. Washom, “Intra-hour forecasting with a total sky imager at the UC San Diego solar energy testbed,” Solar Energy, vol. 85, no. 11, pp. 2881–2893, 2011.
  • [2] R. Marquez and C. F. Coimbra, “Intra-hour DNI forecasting based on cloud tracking image analysis,” Solar Energy, vol. 91, no. 0, pp. 327 – 336, 2013. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0038092X1200343X
  • [3] V. T. Jayadevan, J. J. Rodriguez, V. P. Lonij, and A. D. Cronin, “Forecasting solar power intermittency using ground-based cloud imaging,” in Proc. 41st ASES Annual Conference, vol. 3, 2012, pp. 2100–2106.
  • [4] J. Shields, M. Karr, T. Tooman, D. Sowle, and S. Moore, “The whole sky imager–a year of progress,” in Eighth Atmospheric Radiation Measurement (ARM) Science Team Meeting, Tucson, Arizona, 1998, pp. 23–27.
  • [5] C. N. Long, J. Sabburg, J. Calbó, and D. Pages, “Retrieving cloud characteristics from ground-based daytime color all-sky images,” Journal of Atmospheric and Oceanic Technology, vol. 23, no. 5, pp. 633–652, 2006.
  • [6] G. Pfister, R. McKenzie, J. Liley, A. Thomas, B. Forgan, and C. N. Long, “Cloud coverage based on all-sky imaging and its impact on surface solar irradiance,” Journal of Applied Meteorology, vol. 42, no. 10, pp. 1421–1434, 2003.
  • [7] A. Cazorla, F. Olmo, and L. Alados-Arboledas, “Development of a sky imager for cloud cover assessment,” JOSA A, vol. 25, no. 1, pp. 29–39, 2008.
  • [8] G. Seiz, J. Shields, U. Feister, E. Baltsavias, and A. Gruen, “Cloud mapping with ground-based photogrammetric cameras,” International Journal of Remote Sensing, vol. 28, no. 9, pp. 2001–2032, 2007.
  • [9] R. W. Johnson, W. S. Hering, and J. E. Shields, “Automated visibility & cloud cover measurements with a solid state imaging system,” DTIC Document, Tech. Rep., 1989.
  • [10] J. Shields, R. Johnson, and T. Koehler, “Automated whole sky imaging systems for cloud field assessment,” in Fourth Symposium on Global Change Studies of the American Meteorological Society, Boston, Massachusetts, 1993.
  • [11] M. Souza-Echer, E. Pereira, L. Bins, and M. Andrade, “A simple method for the assessment of the cloud cover state in high-latitude regions by a ground-based digital camera,” Journal of Atmospheric and Oceanic Technology, vol. 23, no. 3, pp. 437–447, 2006.
  • [12] S. L. Mantelli Neto, A. von Wangenheim, E. B. Pereira, and E. Comunello, “The use of euclidean geometric distance on rgb color space for the classification of sky and cloud patterns,” Journal of Atmospheric and Oceanic Technology, vol. 27, no. 9, pp. 1504–1517, 2010.
  • [13] M. Ghonima, B. Urquhart, C. Chow, J. Shields, A. Cazorla, and J. Kleissl, “A method for cloud detection and opacity classification based on ground based sky imagery,” Atmospheric Measurement Techniques Discussions, vol. 5, no. 4, pp. 4535–4569, 2012.
  • [14] M. Yamashita, M. Yoshimura, and T. Nakashizuka, “Cloud cover estimation using multitemporal hemisphere imageries,” in Proc. XXth Congress of the International Society for Photogrammetry and Remote Sensing (ISPRS04).   Citeseer, 2004, pp. 818–821.
  • [15] Q. Li, W. Lu, and J. Yang, “A hybrid thresholding algorithm for cloud detection on ground-based color images,” Journal of atmospheric and oceanic technology, vol. 28, no. 10, pp. 1286–1296, 2011.
  • [16] V. T. Jayadevan, J. J. Rodriguez, and A. D. Cronin, “A new contrast enhancing feature for cloud detection in ground based sky images,” Journal of atmospheric and oceanic technology, Submitted. [Online]. Available: http://uapv.physics.arizona.edu/Publications/Submitted/NewRatio.pdf
  • [17] D. Panjwani and G. Healey, “Markov random field models for unsupervised segmentation of textured color images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 10, pp. 939–954, 1995.
  • [18] K. Held, E. Kops, B. Krause, W. Wells, R. Kikinis, and H. Muller-Gartner, “Markov random field segmentation of brain mr images,” IEEE Transactions on Medical Imaging, vol. 16, no. 6, pp. 878–886, 1997.
  • [19] F. Melgani and S. Serpico, “A markov random field approach to spatio-temporal contextual image classification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 41, no. 11, pp. 2478–2487, 2003.
  • [20] Q. Li, W. Lu, J. Yang, and J. Z. Wang, “Thin cloud detection of all-sky images using markov random fields,” IEEE Geoscience and Remote Sensing Letters, vol. 9, no. 3, pp. 417–421, 2012.
  • [21] J. Shotton, J. Winn, C. Rother, and A. Criminisi, “Textonboost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation,” in Computer Vision ECCV 2006, ser. Lecture Notes in Computer Science, A. Leonardis, H. Bischof, and A. Pinz, Eds.   Springer Berlin Heidelberg, 2006, vol. 3951, pp. 1–15.
  • [22] S. Kumar and M. Hebert, “Discriminative random fields: a discriminative framework for contextual interaction in classification,” in Proceedings. Ninth IEEE International Conference on Computer Vision, 2003, pp. 1150–1157 vol.2.
  • [23] P. Kohli, L. Ladick , and P. Torr, “Robust higher order potentials for enforcing label consistency,” International Journal of Computer Vision, vol. 82, no. 3, pp. 302–324, 2009.
  • [24] X. He, R. Zemel, and M. Carreira-Perpindn, “Multiscale conditional random fields for image labeling,” in

    Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition

    , vol. 2, 2004, pp. II–695–II–702 Vol.2.
  • [25] D. Comaniciu and P. Meer, “Mean shift: a robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603–619, 2002.
  • [26] S. Z. Li, Markov random field modeling in image analysis.   Springer, 2009.
  • [27] C. Elkan, “Log-linear models and conditional random fields,” Tutorial notes at CIKM, vol. 8, 2008.
  • [28] C. Sutton and A. McCallum, “An introduction to conditional random fields for relational learning,” Introduction to statistical relational learning, vol. 93, pp. 142–146, 2007.
  • [29] ——, “Piecewise training for undirected models,” in

    Proceedings of the Twenty-First Conference Annual Conference on Uncertainty in Artificial Intelligence (UAI-05)

    .   AUAI Press, 2005, pp. 568–575.
  • [30] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2012, ISBN 3-900051-07-0. [Online]. Available: http://www.R-project.org/
  • [31] J. Besag, “On the statistical analysis of dirty pictures,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 259–302, 1986.
  • [32] R. Kohavi and F. Provost, “Glossary of terms,” Machine Learning, vol. 30, no. 2-3, pp. 271–274, 1998.