Slipping through the net: can data science approaches help target clean cooking policy interventions?

Reliance on solid biomass cooking fuels in India has negative health and socio-economic consequences for households, yet policies aimed at promoting uptake of LPG for cooking have not always been effective at promoting sustained transition to cleaner cooking amongst intended beneficiaries. This paper uses a two-step approach combining predictive and descriptive analyses of the IHDS panel dataset to identify different groups of households that switched stove between 2004/5 and 2011/12. A tree-based ensemble machine learning predictive analysis identifies key determinants of a switch from biomass to non-biomass stoves. A descriptive clustering analysis is used to identify groups of stove-switching households that follow different transition pathways. There are three key findings of this study. Firstly, non-income determinants do not have a linear effect on stove switching, in particular variables on time of use and appliance ownership, which offer a proxy for household energy practices. Secondly, location-specific factors including region, infrastructure availability, and dwelling quality are found to be key determinants, and as a result policies must be tailored to take into account local variations. Thirdly, clean cooking interventions must enact a range of measures to address the barriers faced by households on different energy transition pathways.






  • Policies promoting cleaner cooking do not reach all intended beneficiaries.

  • Descriptive analytics identify household groups with distinct transition pathways.

  • Non-income factors do not all linearly affect the probability of switching stove.

  • Policies must be tailored to take into account local socio-economic variations.

  • Cooking interventions must address the range of needs of different groups.

1 Introduction

Worldwide there are almost 3 billion people who do not have access to clean cooking fuel, and in India just under half the population still face limited access to clean cooking fuels (International Energy Agency et al., 2019). Reliance on solid fuels has negative consequences including the health impacts of household air pollution, environmental impacts of local deforestation, and negative socio-economic effects arising from the practices surrounding the use of such biomass fuels (Smith and Sagar, 2014). These socio-economic effects disproportionately impact women and children of the household, for example the time spent collecting fuel by female members of the household negatively impacts their livelihoods and empowerment (Rahut et al., 2016). While in the past there were attempts to improve biomass stove efficiency to reduce negative health impacts of air pollution from biomass fuel use, there has been growing recognition that solving the wider negative socio-economic and health impact of solid fuel use requires a transition towards cleaner alternatives such as gas and electricity (Batchelor et al., 2019).

In recent years there has been a concerted effort in India to promote the uptake of Liquefied Petroleum Gas (LPG) for cooking to reduce the use of solid fuels and tackle the associated negative health and development consequences of their use. Most recently the flagship Pradhan Mantri Ujjwala Yojana (PMUY) programme achieved its target of providing 80 million low-income households with a gas connection. The PMUY programme provided financial support to poor households, covering the cost of LPG connection and subsidising the first LPG cylinder, thus removing the initial cost barrier (Sharma et al., 2019). However, studies have found that while the programme successfully enabled many households to acquire their first cylinder, many of those households have not necessarily transitioned to sustained LPG use. They continue to use solid fuels for part or all of their needs (Kar et al., 2019). In addition, findings suggest that the programme may not have managed to reach its intended beneficiaries equally in all regions, and benefited some households that would likely have transitioned without the incentive from the programme (Sharma et al., 2019; Sankhyayan and Dasgupta, 2019). These are outcomes also seen in other top-down clean energy interventions in India and the Global South more widely (Sehjpal et al., 2014; Silver and Marvin, Simon, 2017; Kebede et al., 2002). Whilst many energy poor households may benefit from such policies, there are always those who do not benefit as expected, or ’slip through the net’ and miss out altogether (Rao, 2012; Batchelor et al., 2019).

PMUY is the most recent in a long line of policies aimed at reducing costs of LPG for poor households and improving access. In the last two decades there have been a range of other programmes and initiatives to promote the uptake of LPG including the ’Vitrak Yojana’ from 2009 which aimed to increase LPG distributorship through improving infrastructure and supply chains, while a range of subsidies have existed at national and local levels (Sankhyayan and Dasgupta, 2019). Recently there has been an initiative in place alongside PMUY to encourage wealthier households to voluntarily give up their subsidy if they do not need it so that it may benefit a poorer household (Sharma et al., 2019).

A key assumption underpinning these policies is that use of a cleaner fuel, in this case LPG, is desired by all households and that the barrier that prevents them from using this fuel is the upfront cost of switching (Kar et al., 2019). This reduces lack of access to clean cooking to an issue of household income, which understates the complexity of barriers to clean cooking transitions (Sankhyayan and Dasgupta, 2019). As Gould and Urpelainen (2018) show, for many rural households in India the upfront cost is only one of many barriers to sustained LPG use, with other notable barriers including the lump-sum nature of monthly payments for LPG cylinders (as opposed to the actual cost) as well as the time and difficulty of transporting the cylinders.

There is a substantial body of literature investigating the determinants of household energy use, which shows that while income is an important determinant of energy transition, it is but one of many drivers of LPG and electricity use (Ekholm et al., 2010; Sehjpal et al., 2014). Studies in India have shown that there is a hierarchy of preferred fuels, and while income is an important driver of use of cleaner cooking fuels such as LPG (Ahmad and Puppim de Oliveira, 2015), factors beyond income play a role in determining uptake and transition (Farsi et al., 2007; Kemmler, 2007). As a result, transition pathways can be more complex than assumed by traditional ’energy ladder’ conceptualisations of energy transition (van der Kroon et al., 2013).

Lack of access to clean cooking can be a form of energy poverty, and as described by Sadath and Acharya (2017) problems of energy poverty are multidimensional and should not be simply confused with income poverty. Khandker et al. (2012) showed that income non-poor households were not necessarily energy non-poor, and the effect of non-income variables on energy decision making plays a key role in determining the energy poverty of a household. Sankhyayan and Dasgupta (2019) discuss how both accessibility and affordability become important in understanding energy poverty, and overcoming barriers of access and affordability requires an understanding of the socio-economic and cultural circumstances of households and their energy practices.

There has been growing interest in the role of urban data analytics for improving energy provision (Bibri and Krogstie, 2017), and these are particularly relevant given the multidimensional nature of energy poverty. Such studies use techniques from the data sciences to process large socio-economic and/or demographic datasets to inform better policy interventions. This involves three broad categories of analysis: descriptive analysis, which is concerned with understanding the data; predictive analysis which is concerned with extrapolating the trends found in the data; and prescriptive analysis which is concerned with using the data to identify the interventions likely to achieve desired outcomes (Wang et al., 2019). The majority of studies make use of regression models, which constitute a form of predictive analysis. However, prediction models based on regression assume that all variables considered in the analysis are independent and influence a given quantity of interest in a similar manner. This restricts the understanding of variations of features that influence uptake of clean fuels across different types of households. Descriptive analysis overcomes these limitations, thus enhancing the understanding of the composition of characteristics that govern energy transitions. Together with predictive analysis, it can yield a better understanding of the multidimensional nature of household energy decisions and practices, and the different scales at which key determinants act.

To demonstrate our proposed modelling approach we employ ensemble machine learning and clustering algorithms to conduct a combined predictive and descriptive analysis of the panel data from the Indian Human Development Survey between 2004/5 and 2011/12. Through the combined analysis, we demonstrate that groups of households follow different cooking transition pathways. Such an approach can be used to target specific policy interventions to address the needs and challenges of a particular group of households that might currently be under-served by cost-centric policies. The remainder of this paper is structured as follows: section 2 provides a review of literature on technical-economic and social conceptualisations of energy transitions, section 3 discusses the features and handling of the dataset, section 4 describes our analytical methods. Section 5 presents results of two different regression models and compares the performance of these, leading on to section 6 which presents the results of a clustering analysis of households that did switch to a non-biomass stove discussing the existence of different types of switching household, presenting conclusions and policy implications in section 7.

2 Background

Identifying and characterising clean cooking transition pathways requires an understanding of the concepts and phenomena involved in energy transitions. The popular ’energy ladder’ concept of energy transition put forward by Leach (1992) offers a macro scale conceptualisation of transition. It assumes a preferential hierarchy of fuels, with solid fuels at the bottom and gas and electricity at the top. This model assumes that all households would prefer to use gas and electricity, and that once they can afford to they will switch to these cleaner, preferred fuels, in the process abandoning use of their previously used fuel. However, subsequent work has challenged the assumptions of this model, with studies including those by Masera et al. (2000) and Heltberg (2004) finding that households do not switch entirely from one fuel to another, but rather adopt the new fuel while continuing use of the old fuel in a behaviour known as ’fuel stacking’. Differences at a micro, or household-level, scale are missed by the energy ladder concept.

Empirical studies in India have supported some of the assumptions of the energy ladder while demonstrating the importance of non-income determinants. Kemmler (2007), in a study of rural Indian households, found that beyond expenditure, community-wide electrification, education of household members, and type of employment were significant determinants of access to electricity. Farsi et al. (2007) showed that while there was an observed hierarchy of preference of fuels in urban Indian households, as anticipated by the energy ladder model, socio-economic factors in addition to income such as education and sex of head of the household were significant determinants. Dhanaraj et al. (2018) found that lack of female education in the household hindered uptake of refrigerators, and a study by Rao and Ummel (2017) used the IHDS and an ensemble machine learning analysis to show that dwelling quality, hours of electricity supply, and education played an important role in determining appliance uptake, particularly amongst the lower income households. In a study of rural households in Madhya Pradesh, Sehjpal et al. (2014) found that profession of the head of household, land ownership, fuel prices, and electricity access were significant determinants of cooking fuel. Ahmad and Puppim de Oliveira (2015), who also used the IHDS, found that in urban households greater education level and piped water access were drivers of ’modern stove’ uptake. On the other hand, belonging to lower castes or having a larger household hindered modern stove uptake. More recently Sankhyayan and Dasgupta (2019) explored determinants of LPG consumption and appliance uptake, finding that education and transport infrastructure were drivers of uptake in rural areas, while in urban areas there was a significant positive effect of female literacy on LPG consumption.

The ’energy ladder’ is rooted in a technical-economic view of energy transition which focuses on cost and performance of different alternatives and assumes that households behave as rational consumers. This view has proven useful for quantitatively understanding energy consumption and appliance ownership trends at a macro level. van der Kroon et al. (2013) make the case that identifying different energy transition pathways requires a better understanding of the decision-making and external context of a household. This requires understanding the social aspects of energy use, through preferences, practices, and decisions of households which act at a local scale.

Social practice theory (SPT) provides a lens through which energy transitions at a local scale can be analysed. While there have been a variety of formulations of practice theory since it was put forward by Schatzki in 1996, SPT approaches used in energy research follow the view put forth by Shove et al. (2019), which views people as ’practitioners’ who combine materials, competences or know-how, and meanings to create practices (Bisaga and Parikh, 2018; Khalid and Sunikka-Blank, 2017). Unlike technical-economic approaches interested in macro-analyses of socio-economic data to understand trends in energy demand, Shove and Walker (2014) explain that practice theory views it as a matter of understanding how social practice develops and influences energy use and other household practices. Essentially this provides a shift in perspective away from resource-based systems thinking towards more individual enquiry into what energy is actually used for.

Several recent studies have applied an SPT approach to the study of energy use in the Global South. In their work on middle class households in Pakistan, Khalid and Sunikka-Blank (2017) found that household practices shaped around cultural norms and socio-cultural dynamics explained the peculiar nature of energy demand of these households. Bisaga and Parikh (2018) conducted a study of solar home system users in Rwanda comparing and contrasting the insights of energy ladder and social practice approaches. They show that understanding the practices surrounding dominant electricity uses (lighting and mobile phone charging) helped to explain energy transitions in rural-off grid communities. Lighting and mobile phone charging are at present the two most common electricity uses for these households so understanding when and why villagers used these appliances helped explain decisions concerning adoption of electricity by households. A recent study by Debnath et al. (2019) provides an Indian context using a quantitative approach grounded in SPT to explore the influence of non-income factors on appliance ownership in rehabilitated slums in Mumbai, finding that the change in built environment following the rehabilitation of slums led to a change in practices of those households which translated into changes of appliance ownership and usage. Understanding key household practices related to energy can offer a rationale for otherwise peculiar energy demand, and help anticipate responses to interventions. However the population samples in all of these studies were relatively homogeneous in their socio-economic characteristics, and differences in practices played out on a household to household level. Galvin and Sunikka-Blank (2016) explain that socio-economic causality in energy consumption studies can be a blind spot for SPT approaches, and households in such studies are often relatively uniform in their socio-economic profiles.

The macro scale perspective of the energy ladder as well as the local scale perspective of SPT approaches informed the selection of variables as well as the approach for this paper. Our study takes a wide range of variables covering socio-economic characteristics as well as more household specific variables on fuel choices, time of use, and appliance ownership. In addition, this study combines predictive modelling to identify macro-scale trends with descriptive modelling to understand smaller scale variations between groups of households. Findings at a macro-scale may point to the existence and prevalence of certain trends governing uptake of modern fuel use. For example, Rao and Ummel (2017) found that Sikh households in India were more likely to own a refrigerator. However, correctly interpreting the causes and implications of such trends sometimes requires taking account of local practices, behaviours, and decisions that define energy use. For example, Bisaga and Parikh (2018) found common electricity demand profiles amongst villages in rural Rwanda and used qualitative data to explain this in terms of daily routines and mobile phone charging demands. The approach taken by this study aims to identify and characterise clean cooking transition pathways by drawing upon the strengths of analysis at these different scales. We do so by first identifying key variables influencing clean cooking transitions across a large sample of households. Findings from the first step are then used to examine the different combinations of features that define groups of households that did switch to clean cooking.

3 Data

Our study uses household level survey data from the publicly available and nationally representative Indian Human Development Survey (IHDS). The first IHDS was conducted in 2004-2005 (referred to as IHDS-I) (Desai et al., 2010) with a second follow-up survey in 2011-2012 (IHDS-II) (Desai and Vanneman, 2015), which returned to survey the same households originally surveyed for IHDS-I. The surveys were conducted by means of two one-hour interviews with the whole household or the head of the household, and comprised of a nationally representative sample of 41,554 urban and rural households across all Indian territories excluding the Andaman Isles and Lakshadweep. This sample included 1503 villages and 971 urban city blocks across 383 districts in 33 different states. IHDS-II covered 85 percent of the original households, with those households not surveyed the second time either having split, been unreachable, or struck by natural disaster (Desai and Vanneman, 2015). For our analysis, we use only the 32,922 households which were surveyed both in the IHDS-I and IHDS-II.

As pointed out by Khandker et al. (2012) and Ahmad and Puppim de Oliveira (2015), the energy related questions in the IHDS are more comprehensive than those in comparable studies including the Living Standards Measurement Studies coordinated by the World Bank, and the NSS Surveys of Consumer Expenditure. The IHDS dataset is disaggregated by housing type and various demographic features such as gender, religion, caste, occupation, and education (Desai and Vanneman, 2015; Desai et al., 2010). Additionally the IHDS includes some information on time spent carrying out certain energy related practices in the household, including time spent watching television, time spent collecting firewood, and hours of stove usage. Recommendations from the authors of the dataset were followed (Desai and Vanneman, 2015) with regards to weightings and variable selection. All weightings used were the ’SWeights’ specified for the households in the IHDS-I, and values for relatively unchanging variables (e.g. Caste and Religion) were taken from the IHDS-II.

The main dependent variable of interest was a binary variable indicating whether the household had switched from primarily using a biomass stove in 2004/5 to a non-biomass stove in 2011/12. This was constructed using the variables indicating the main stove used for cooking in the IHDS-I and IHDS-II respectively. The stove options included 3 types of biomass solid fuel stoves, and a general ’modern stove’ category which could represent Kerosene, LPG, or Electric Stoves. In the IHDS panel dataset (training and test subsets combined) 5358 households switched from using a biomass stove as their primary stove in 2004-5 to using a ’modern’ non-biomass stove in 2011-12, representing 16.27% of households (14.94% when adjusted by sampling weights).
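
The construction of the switching indicator and the weight-adjusted share can be illustrated with a short sketch. This is a minimal Python illustration, not the authors' code (the study was carried out in R): the stove category labels, the field names stove_2005, stove_2012 and weight, and the four example households are all hypothetical.

```python
# Hypothetical labels for the three biomass stove types; not the IHDS codes.
BIOMASS_STOVES = {"open_fire", "traditional_chulha", "improved_chulha"}


def switched_to_modern(stove_2005, stove_2012):
    """Return 1 if the main stove moved from biomass to non-biomass, else 0."""
    return int(stove_2005 in BIOMASS_STOVES and stove_2012 not in BIOMASS_STOVES)


def switching_shares(households):
    """Return the unweighted and the weight-adjusted share of switchers."""
    flags = [switched_to_modern(h["stove_2005"], h["stove_2012"]) for h in households]
    weights = [h["weight"] for h in households]
    unweighted = sum(flags) / len(flags)
    weighted = sum(f * w for f, w in zip(flags, weights)) / sum(weights)
    return unweighted, weighted


# Made-up households with made-up sampling weights.
households = [
    {"stove_2005": "open_fire", "stove_2012": "modern", "weight": 1.0},
    {"stove_2005": "traditional_chulha", "stove_2012": "traditional_chulha", "weight": 3.0},
    {"stove_2005": "modern", "stove_2012": "modern", "weight": 2.0},
    {"stove_2005": "improved_chulha", "stove_2012": "modern", "weight": 2.0},
]
unweighted, weighted = switching_shares(households)
print(unweighted, weighted)
```

Households that already used a modern stove in the first survey are coded 0 here, since only a move away from biomass counts as a switch. The gap between the two shares mirrors the difference between the unweighted and weight-adjusted percentages reported above.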

Other variables were constructed from the dataset either to make variables more comparable, to create a dummy variable for a particular characteristic, or to characterise change in a variable between surveys. Energy consumption values in the IHDS are given in units of cost (INR) as opposed to units of energy, which makes comparisons difficult. These values were converted to estimated energy consumption in kWh using local price data available in the IHDS and collected from government sources (Government of India Planning Commission, 2012). In addition, appliance ownership was grouped according to associated household activity: cooking (Pressure Cooker, Mixer/Grinder, Microwave, Refrigerator), and IT (Television, Telephone, Mobile Telephone, Computer, Laptop).
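
A hedged sketch of these two derived variables follows, again in Python rather than the R used in the study. The conversion assumes that expenditure is divided by a local unit price and multiplied by an energy content per unit; the price and energy content values below are illustrative placeholders, not figures from the IHDS or the Planning Commission. Treating the appliance variable as the share of a group's appliances that a household owns is also an assumption, chosen because the reported variable ranges from 0 to 1.

```python
def spend_to_kwh(spend_inr, price_inr_per_unit, kwh_per_unit):
    """Convert monthly fuel expenditure (INR) to estimated energy use (kWh)."""
    if price_inr_per_unit <= 0:
        raise ValueError("price must be positive")
    units_bought = spend_inr / price_inr_per_unit
    return units_bought * kwh_per_unit


def ownership_share(owned, group):
    """Fraction of the appliances in `group` that the household owns."""
    return len(set(owned) & set(group)) / len(group)


# Cooking appliance group as listed in the text.
COOKING = ["pressure_cooker", "mixer_grinder", "microwave", "refrigerator"]

# Placeholder numbers: 150 INR spent at 15 INR per litre, 10 kWh per litre.
kerosene_kwh = spend_to_kwh(spend_inr=150.0, price_inr_per_unit=15.0, kwh_per_unit=10.0)

# A household owning a pressure cooker and a television owns 1 of 4 cooking items.
cooking_index = ownership_share(["pressure_cooker", "television"], COOKING)
print(kerosene_kwh, cooking_index)
```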

4 Methods

Studies on energy transition typically use some form of logit or probit regression model to perform a predictive analysis identifying the trends and effect of a given set of variables on appliance ownership, fuel use, or adoption rates of electricity or LPG. Recently Rao and Ummel (2017) used a form of ensemble technique called a Boosted Regression Tree (BRT) model to analyse the effect of a range of household characteristics on the uptake of so-called ’white good’ appliances. A comparison of the predictive capability of these two modelling approaches found that the BRT model on the whole outperformed the logit model in predicting appliance ownership (Rao and Ummel, 2017).

In this study we seek to provide a greater level of descriptive or explanatory analysis to identify the different transition pathways, and use a two-stage approach to achieve this. The first stage involves predictive modelling using an ensemble machine learning technique to identify factors that are determinants of clean cooking transition and assess performance of the model. The second stage focuses on descriptive modelling using hierarchical clustering, where the key determinants identified in the predictive modelling are used to identify the different groups of households that did switch stove and the different combinations of features that characterize each group. The first stage of the analysis uses a training subset of 25,000 of the 32,922 households from the IHDS to identify the influence of variables on the propensity of a household to switch from a solid fuel biomass stove to a cleaner ’modern stove’ as their main cooking stove. The predictive performance of the ensemble learning regression and a conventional probit regression are assessed and compared using the remainder of the dataset not used to train the model. The second stage of analysis uses agglomerative hierarchical clustering to cluster the 5,358 households that did switch from biomass to a ’modern’ non-biomass stove. By comparing the effect of key determinants identified by the predictive modelling and the defining characteristics of the clusters of stove-switching households, it is possible to identify the different combinations of key determinants enabling stove transition in each cluster.

Variable selection was carried out using both correlation and random forest analysis to identify the most relevant variables. Given the inter-related nature of the socio-economic and cultural variables of interest in the dataset it was important to identify and address any significant multi-collinearity in the dataset before performing any analysis. A Farrar-Glauber test was conducted to identify and address any multi-collinearity. In particular, fuels used exclusively for cooking showed cross-dependent correlation with one another, so redundant fuels were removed from the selected variables. In addition, the number of different region categories was reduced by reassigning households in states in the center region to the neighbouring eastern region, as there was little distinction between these two. The descriptive statistics of the resulting independent variables are shown in table 1 (except profession, caste, and region, which are non-continuous and non-binary).

Independent variable                        Mean      Median    Min.       Max.
Income per capita (INR/month)               2401      1363      0          346750
Urban                                       0.330     -         0          1
Time in Place (years)                       78.41     90.00     0.00       90.00
Female Education (years)                    5.395     5.000     0.000      16.000
Permanent House                             0.704     -         0          1
Flush Toilet                                0.392     -         0          1
Piped Water Availability (hours/day)        1.845     0.000     0.000      24.000
Dairy Spend (INR/month)                     196.40    100.00    0.00       8600
Electricity Availability (hours/day)        13.11     14.00     0.00       24.00
Electricity Consumption (kWh/month)         93.68     54.50     0.00       1977.40
Kerosene Consumption (kWh/month)            28.01     24.79     0.00       587.00
Change in fuel collection time (min)        -3.475    0.000     -320.000   450.000
Cooking appliance ownership                 0.291     0.250     0.000      1.000
IT appliance ownership                      0.310     0.429     0.000      1.000
Change in Female TV Time (hours/day)        0.687     1.000     -12.000    14.000

Table 1: Descriptive statistics for variables

The BRT is a tree-based ensemble learning technique that combines a large number of simple categorisation trees, using gradient boosting to build ensembles of decision trees that are fit to the remaining model residuals. Unlike a probit model, there is no a priori specification of the functional form, and the BRT analyses the influence of the variables capturing non-linear effects and complex interactions. A challenge of the BRT model is the specification of the hyper-parameters, which include the number of trees, the learning rate, and the tree complexity. We used n-fold cross validation to determine the optimum number of trees, and followed the recommendations of Elith et al. (2008) to optimise the remaining parameters to produce an accurate model and minimise risk of over-fitting. For this model we used a tree complexity of 5 and a learning rate of 0.01, with 4100 trees fitted. We implemented the BRT using the gbm and dismo packages in the R programming language.
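
The boosting idea, in which each new tree is fit to the residuals left by the ensemble so far and added with a small learning rate, can be sketched in a few lines. This is a minimal Python illustration and not the gbm/dismo implementation used in the study. It assumes a single feature, one-split trees (stumps, i.e. a tree complexity of 1 rather than 5), squared-error loss rather than a loss suited to binary outcomes, and made-up data. The function names fit_stump, boost and predict are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single split on a 1-D feature that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_trees, learning_rate):
    """Fit stumps sequentially, each to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the shrunken contributions of all stumps for one observation."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps)


# Made-up data with a threshold response: the outcome switches on above x = 4.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = boost(xs, ys, n_trees=200, learning_rate=0.1)
print(predict(model, 2), predict(model, 7))
```

The example recovers a step-shaped response without any functional form being specified in advance, which is the property that lets a BRT capture threshold effects that a linear index would miss.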

A probit regression was carried out for comparison with the BRT, as this is a commonly used model for studies on energy transition concerned with a binary outcome. Assuming that the individual’s decision to switch from a biomass stove to a non-biomass stove is based on a latent variable $y^*$ which represents some measure of utility, then this variable can be defined as a linear function of the independent variables, as shown in equation 1, where $x$ is a vector of all the independent variables for an individual household, $\beta$ is a vector of coefficients, and $\varepsilon$ captures the uncertainty.

$$y^* = x^{\top}\beta + \varepsilon \qquad (1)$$
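
Under the usual probit assumption that the uncertainty term is standard normal, the probability of a switch is the standard normal distribution function evaluated at the linear combination of the independent variables. The following Python sketch illustrates this calculation; the coefficient values and the household vector are hypothetical and are not estimates from the study.

```python
import math


def standard_normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(x, beta):
    """Probability of switching given variables x and coefficients beta."""
    linear_index = sum(xi * bi for xi, bi in zip(x, beta))
    return standard_normal_cdf(linear_index)


# Hypothetical coefficients: intercept plus two illustrative variables.
beta = [-1.0, 0.5, 0.8]

# Hypothetical household: constant term 1.0, then two variable values.
household = [1.0, 2.0, 0.0]
p = probit_probability(household, beta)

# Raising the third variable by one unit raises the linear index by 0.8.
p_higher = probit_probability([1.0, 2.0, 1.0], beta)
print(p, p_higher)
```

Because every variable enters through a single linear index, each one shifts the probability monotonically, which is the restriction the BRT relaxes.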


The binary outcome we are interested in with these models is not unlike the binary outcomes in medical models assessing patient outcomes (although in our study the outcome is a switch from biomass to non-biomass or not, instead of life and death), and in both cases there is a need for the models not only to perform well on average but also to perform well in distinguishing borderline cases. In the field of medicine, when assessing models for patient outcomes it is good practice to report the calibration and discriminatory ability of the model (Steyerberg et al., 2010). The Brier Score is an overall performance measure of calibration and discrimination for binary outcomes, whose scoring rule is shown in equation 2, where $N$ is the number of instances, $f_i$ is the outcome from the model for instance $i$, and $o_i$ is the actual outcome. The concordance statistic c, identical to the area under the Receiver Operating Characteristic (ROC) curve for binary outcomes, offers a measure of how well the model distinguishes outcomes. Both of these measures were calculated for each model using base packages in R.

$$BS = \frac{1}{N}\sum_{i=1}^{N}\left(f_i - o_i\right)^2 \qquad (2)$$
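
Both performance measures are simple to compute directly. The sketch below is a minimal Python illustration with made-up forecasts and outcomes (the study itself used base R). The c statistic is computed as the share of switcher and non-switcher pairs in which the switcher receives the higher predicted probability, with ties counted as one half.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between predicted probabilities and outcomes."""
    n = len(outcomes)
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n


def c_statistic(forecasts, outcomes):
    """Share of (switcher, non-switcher) pairs ranked correctly; ties count half."""
    positives = [f for f, o in zip(forecasts, outcomes) if o == 1]
    negatives = [f for f, o in zip(forecasts, outcomes) if o == 0]
    concordant = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Made-up predicted probabilities and observed outcomes (1 = switched).
forecasts = [0.8, 0.3, 0.4, 0.1]
outcomes = [1, 1, 0, 0]

bs = brier_score(forecasts, outcomes)
c = c_statistic(forecasts, outcomes)
print(bs, c)
```

A lower Brier score indicates better overall performance, while a c statistic of 0.5 corresponds to a model that ranks households no better than chance.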


For the second stage of the analysis hierarchical clustering was used. This is an unsupervised machine learning method that can be used to identify subsets within a dataset that have similar characteristics based on the connectivity between data points. A benefit of hierarchical clustering algorithms for such descriptive analysis is that the iterative process produces a clear tree-like structure of clusters, which offers a more intuitive view of the clustering process and easier analysis of results, although the iterative nature of the algorithm makes it inefficient for extremely large datasets (Kassambra, 2017). We used an agglomerative hierarchical clustering algorithm with the Gower distance measure for categorical variables, as it produced a clear and distinct cluster structure. All analysis was performed in R using base packages, as well as the ’dendextend’ and ’fpc’ packages.
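
The combination of a Gower distance with agglomerative clustering can be sketched as follows. This is a minimal Python illustration, not the R workflow used in the study. The Gower distance averages a range-scaled absolute difference for numeric variables and a simple mismatch indicator for categorical ones. The linkage rule is not stated in the text, so average linkage is assumed here. The three variables and the four households are made up.

```python
def gower_distance(a, b, ranges):
    """Mean per-variable dissimilarity; ranges[k] is None for categorical variables."""
    total = 0.0
    for value_a, value_b, value_range in zip(a, b, ranges):
        if value_range is None:
            # Categorical: 0 if the categories match, 1 otherwise.
            total += 0.0 if value_a == value_b else 1.0
        elif value_range > 0:
            # Numeric: absolute difference scaled by the variable's range.
            total += abs(value_a - value_b) / value_range
    return total / len(ranges)


def agglomerate(rows, ranges, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of row indices."""
    dist = [[gower_distance(a, b, ranges) for b in rows] for a in rows]
    clusters = [[i] for i in range(len(rows))]

    def linkage(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average distance.
        _, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up households: (cooking appliance share, region, urban dummy).
rows = [
    (0.9, "south", 1),
    (0.8, "south", 1),
    (0.1, "east", 0),
    (0.2, "east", 0),
]
ranges = [1.0, None, None]  # the dummy is treated as categorical here
clusters = agglomerate(rows, ranges, n_clusters=2)
print(clusters)
```

Recomputing every pairwise linkage at each merge keeps the sketch short but scales poorly, which reflects the inefficiency for very large datasets noted above.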

5 Predictive Modelling Results

5.1 Boosted regression tree model

From the BRT analysis we obtain both the relative importance of variables shown in figure 1 and the marginal effects of the independent variables shown in figures 2, 3, 4. Figure 1 shows all independent variables were found to have non-zero relative influence ranging from 1-12%. Use of kerosene and electricity both have an influence of around 11%, while cooking equipment ownership shows an 8.5% influence, and IT appliance ownership a 7.1% influence. The region a household is in has a 10.7% influence and the profession of the head of the household has an influence of 9.3%. Income per capita of the household does have an influence of 8.5% but the BRT shows it is not the dominant determinant of a household’s switch to non-biomass stoves. The marginal effects shown in figures 2, 3 for each of the variables exhibit one of three different types of response: either a constant response (for categorical variables), a threshold response, or a multiple threshold (multiple regime) response.

Figure 1: Relative influence of variables in BRT Model

The constant marginal effects observed for categorical variables show that these variables will be key determinants of modern stove switching for only some households. For example, region is one of the more influential variables: North-Eastern states are associated with a markedly higher probability of switching stove, while households in the South have a slightly higher chance of switching than households in the East, North and West, where region is a determinant of minor influence. This difference could be the result of local policy or climate differences; for example the southern states are typically wealthier relative to the national average, and southern states such as Tamil Nadu and Karnataka have led development in renewable energy infrastructure in India (Schmid, 2012). North-Eastern states have lower incomes and historically lower access to infrastructure (Ghosh and De, 1998), and the geography of this region also results in greater local availability of, and dependency on, biomass fuel compared to other regions (Bhatt et al., 2016). LPG distribution infrastructure development under ’Vitrak Yojana’ between 2005 and 2011 benefited many poorly serviced settlements in North-Eastern states. Sankhyayan and Dasgupta (2019) did not find a significant relationship between region and LPG use; however, the coefficients from their model are compatible with the marginal effects from our analysis.

The profession of the head of the household was also found to be of greater relative influence, although the marginal effects were only significant for some professions, as shown in figure 2. Those in skilled trades, artisans, salaried employment, or collecting pensions or rent all had a greater probability of switching, whereas those in agricultural wage labour and unskilled work were less likely to switch. Kemmler (2007) found that more labour intensive and ’daily wage’ type employment was associated with lower electricity use, and Sehjpal et al. (2014) found that, in rural India, households whose head was in more formal employment had a greater likelihood of transitioning to clean cooking. This may be related to the frequency of payment, with the former group of jobs being associated with regular monthly or weekly pay, whereas income can be more erratic for the latter group.

A measure of household infrastructure is provided through variables measuring permanent house construction and availability of flush toilets, shown in figure 2; both show a small positive increase in marginal effect on the switch to a modern stove with greater levels of access. Rao and Ummel (2017) similarly found that better dwelling quality had a positive relationship with ownership of refrigerators and TVs, and Ahmad and Puppim de Oliveira (2015) showed that access to piped water was associated with clean cooking. Permanent housing, while having the lowest relative influence of the variables in the dataset, did have a positive marginal effect on the switch to a non-biomass stove. These findings suggest that access to public utilities and the quality of the household’s immediate built environment are important, as Debnath et al. (2019) found in their study of rehabilitated slum housing in Mumbai.

Figure 2: Marginal effect of constant effect independent variables on probability of a household switching from Biomass to LPG

Figure 3 shows the marginal effects of variables which exhibit a threshold response, namely hours of electricity supply and years of education of the head female of the household. The marginal effect of hours of electricity supply on switching behaviour shows a constant effect up until 15 hours of electricity supply per day, after which the marginal effect increases with hours of electricity. Rao and Ummel (2017) similarly found that hours of electricity supply had a positive relationship with ownership of refrigerators and TVs, and Ahmad and Puppim de Oliveira (2015) showed that access to electricity was associated with clean cooking. The threshold observed at 15 hours could be indicative of the added convenience or reliability of having electricity available for two thirds of the day, encouraging investment in appliances or changing household practices related to cooking.

Education of the head female of the household also displays a threshold response, as seen in figure 3. Households whose head female has 10 or more years of schooling, i.e. has completed some level of secondary or tertiary education, have a greater probability of switching to a ’modern stove’. A recent study by Sharma et al. (2019) found a significant relationship between education and LPG uptake for households in the eastern states of Chhattisgarh and Jharkhand, while Ahmad and Puppim de Oliveira (2015) found female education to be a significant determinant of non-biomass cooking in non-slum households. In their study, Sankhyayan and Dasgupta (2019) found that in urban areas there was a stronger positive association between female literacy and LPG use, especially for households where the female head of the household had more than 9 years of schooling, and they suggest this difference is a result of female literacy not translating into female empowerment as effectively in rural households.
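To make the idea of a threshold response concrete, the sketch below computes a partial dependence curve, one common way of obtaining marginal effect plots from a fitted model: the variable of interest is set to each value on a grid for every household and the predictions are averaged. The predictor `toy_predict` is a hypothetical stand-in with a built-in threshold at 15 hours, not the BRT estimated in this paper, and the household values are invented.

```python
def partial_dependence(predict, rows, variable, grid):
    """Average prediction over the data with `variable` forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)
            modified[variable] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_predict(row):
    """Hypothetical stand-in for a fitted tree ensemble: flat up to 15 hours
    of electricity supply per day, rising beyond that."""
    base = 0.10 + 0.01 * row["female_education_years"]
    extra_hours = max(0.0, row["electricity_hours"] - 15.0)
    return min(1.0, base + 0.02 * extra_hours)


# Hypothetical households.
rows = [
    {"electricity_hours": 6.0, "female_education_years": 4.0},
    {"electricity_hours": 18.0, "female_education_years": 10.0},
    {"electricity_hours": 22.0, "female_education_years": 8.0},
]
grid = [0.0, 5.0, 10.0, 15.0, 20.0, 24.0]
curve = partial_dependence(toy_predict, rows, "electricity_hours", grid)
for hours, value in zip(grid, curve):
    print(f"{hours:4.0f} h/day -> mean predicted probability {value:.3f}")
```

The curve is flat up to the threshold and rises afterwards, which is the shape described for hours of electricity supply.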

Figure 3: Marginal effect of threshold response independent variables on probability of a household switching from Biomass to LPG
Figure 4: Marginal effect of multiple threshold response independent variables on probability of a household switching from Biomass to LPG

Figure 4 shows the marginal effects of variables with multiple thresholds, or different regimes, where the marginal effect follows different trends within given ranges. LPG and biomass fuels are used fairly exclusively for cooking. In contrast, electricity and kerosene have a range of different end uses. Use of these fuels can indicate a transition to cleaner energy for other household activities, which offers an explanation for the high relative influence of these variables. In figure 4 we can see that low levels of electricity consumption are associated with a negative marginal effect on the probability of a household switching, but this marginal effect increases to a positive level with increasing electricity consumption up to a level of 500 kWh/month. Beyond this, electricity has a negligible effect on the probability of switching, as households using more electricity than that have almost certainly transitioned to clean cooking, with over 80% of households using no biomass fuel at all. We similarly see that kerosene use up to 200 kWh leads to a greater probability of a household switching, whereas above 200 kWh the marginal effect is negative, indicating a reduced chance of switching. This could be because households using more than 200 kWh of kerosene are likely using it for cooking, and not necessarily using a modern stove. The noisy behaviour between 350 kWh and 500 kWh is likely due to households switching from a biomass stove to a kerosene one, which counts as a ’modern stove’ switch in the IHDS. The different marginal effect thresholds show how related energy practices of the household shape the observed energy consumption and how these practices have inter-dependencies, as Bisaga and Parikh (2018) found in their study.

Appliance ownership can serve as a proxy for energy use by a household, as appliances are used to deliver a particular energy service. Figure 4 shows how increasing ownership of IT and cooking appliances increases the probability of a household having switched from a biomass stove to a cleaner stove. Rao and Ummel (2017) found that refrigerator and television ownership was associated with greater LPG use by a household, which suggests clean cooking facilities. Greater appliance ownership could also signal better access to markets or shops, as well as better availability of electricity. However, there are two thresholds: the marginal effect plateaus for households with average ownership, and drops off at high ownership levels, as households with very high levels of appliance ownership are more likely to already use LPG. The greatest marginal probability of switching therefore occurs for households with middling levels (40-60%) of ownership.

Time spent collecting fuel and watching TV in a household, shown in figure 4, offers some quantification of household practices as a measure of time allocation to given practices. A decrease in time spent collecting fuel of up to 130 minutes is associated with a greater probability of a switch to a ’modern stove’, and decreases in time spent collecting fuel beyond 130 minutes have a relatively low marginal effect on the chance of a household transitioning. An increase of up to 50 minutes is associated with a decreasing probability of switching, and increases in fuel collection time above 50 minutes see the lowest probability of switching. Similarly, the change in the number of hours spent watching TV by the adult women of the household has a small positive association for small decreases and increases, but larger increases beyond 5 hours of TV viewing are associated with a lower probability of a household switching stove. The marginal effects of changes in energy practices surrounding energy use and clean cooking transitions are characterised by multiple thresholds. Additionally, the marginal effects of these two variables show quantitatively that there is a change in the time allocated to energy related practices in a household that switches stove. This is important as it implies that characteristics of the stove and its usage have an impact on the practices of a household. Debnath et al. (2019) found that characteristics of household appliances in Mumbai slums had a significant effect on the practices of the household.

5.2 Probit model

The coefficients of the probit regression model are shown in table 2. A key difference between the outputs of the probit and BRT models is that while the BRT provides relative importance and marginal effect plots, the probit model provides coefficients, standard errors, and confidence intervals denoted by statistical significance levels, which can make the process of evaluating the model more straightforward. Comparing the coefficients in table 2 with the relative importance and marginal effect plots from the BRT model in figures 2, 3 and 4, we can see that the coefficients and marginal effects for many of the categorical variables, such as region, permanent housing, profession, and flush toilet availability, show compatibility with respect to influence on stove switching.

| Independent variable | Coefficient | Standard error |
|---|---|---|
| RegionNorth | 0.008 | (0.046) |
| RegionNorth East | 1.078 | (0.089) |
| RegionSouth | 0.494 | (0.045) |
| RegionWest | 0.091 | (0.045) |
| Income.pc | 0.00000 | (0.00000) |
| | 0.003 | (0.001) |
| | 0.003 | (0.004) |
| ProfessionAgricultural wage labourer | 0.811 | (0.478) |
| ProfessionArtisan/Skilled | 1.036 | (0.483) |
| ProfessionPension/Rent | 0.847 | (0.477) |
| ProfessionPetty shop | 1.038 | (0.477) |
| ProfessionSalaried | 0.949 | (0.477) |
| ProfessionWage labourer | 0.958 | (0.478) |
| | 0.352 | (0.037) |
| Flush.toilet | 0.266 | (0.035) |
| Water.piped.hours | 0.007 | (0.003) |
| Dairy.spend | -0.00004 | (0.0001) |
| Electricity.Hours | 0.010 | (0.002) |
| Electricity | 0.00001 | (0.0001) |
| Kerosene | 0.001 | (0.0004) |
| Fuel.distance.change | -0.002 | (0.0004) |
| Cooking.apps | 0.311 | (0.085) |
| Ict.apps | 0.916 | (0.111) |
| TV.hours.women.change | -0.010 | (0.008) |
| Constant | -2.817 | (0.540) |
| Pseudo R² | 0.115 | |

Note: *p<0.1; **p<0.05; ***p<0.01
Table 2: Coefficients and standard errors of the probit regression model (dependent variable: switch from a biomass to a non-biomass stove)

However, there are some key differences between the outputs, particularly for variables which have a non-linear effect in the BRT model. For example, while the BRT identified the use of complementary fuels as being significant, the probit regression does not find any significant effect. If we look at the marginal effect plots for electricity use in figure 4, we can see that the marginal effects vary with the level of fuel use. This non-linear relationship cannot be captured by the probit regression. Conversely, while the probit regression correctly identifies significant effects for variables such as cooking and IT appliance ownership, distance travelled for fuel, and hours of electricity supply, it does not capture the thresholds identified by the BRT beyond which the marginal effects of these variables are reduced or negligible.
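The reason a probit model cannot reproduce a threshold is visible in its functional form: the predicted probability is the standard normal cumulative distribution function applied to a linear index, so each covariate can only push the probability smoothly in one direction. The sketch below illustrates this using the Constant and Electricity.Hours coefficients from table 2. Holding every other covariate at zero is a purely illustrative assumption, so the probabilities themselves should not be read as predictions for any real household.

```python
import math


def standard_normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(constant, coefficient, x):
    """Probit prediction with a single covariate: Phi(constant + coefficient * x)."""
    return standard_normal_cdf(constant + coefficient * x)


CONSTANT = -2.817          # Constant from table 2
ELECTRICITY_HOURS = 0.010  # Electricity.Hours coefficient from table 2

# All other covariates are held at zero (illustrative assumption only).
hours_grid = [0, 6, 12, 15, 18, 24]
probabilities = [probit_probability(CONSTANT, ELECTRICITY_HOURS, h) for h in hours_grid]
steps = [b - a for a, b in zip(probabilities, probabilities[1:])]

for hours, p in zip(hours_grid, probabilities):
    print(f"{hours:2d} h/day -> predicted probability {p:.5f}")
```

Every step in the curve is positive and there is no flat segment below 15 hours, in contrast to the threshold response seen in the BRT marginal effect plot for the same variable.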

5.3 Comparison of predictive performance of BRT and Probit Models

Using the test subset of the dataset as inputs to each of the two models, predictions of whether a household would switch to a non-biomass ’modern stove’ or not were calculated and compared to the actual stove switching outcome in the dataset. Table 3 shows the classification tallies of each model as well as three measures of predictive performance: the percentage of correctly classified households (a higher score indicates better predictive ability), the AUC score indicating the discriminative ability of the model (a higher score indicates better predictive ability), and the Brier score, which is an indication of both the calibration and the discriminative ability of the model (a lower score indicates better predictive ability).

| | BRT Model | Probit Model |
|---|---|---|
| Correct classification | 84.9% | 83.5% |
| AUC | 0.823 | 0.731 |
| Brier Score | 0.108 | 0.126 |
| True positive | 214 | 103 |
| False negative | 1138 | 1249 |
| False positive | 89 | 116 |
| True negative | 6866 | 6839 |

Note: test subset of 7922 households
Table 3: Results of indicators for comparison of predictive performance of BRT and Probit Model

The BRT model outperforms the probit model on all three measures, particularly on its discriminative ability, although the results are comparable. This reflects the ability of the tree-based ensemble method to model non-linear effects. Indeed, many of the independent variables had non-linear marginal effects. Thresholds for non-zero effects are a reflection of the non-linear nature of practices and decision making concerning household energy use. Figure 5 demonstrates this difference between the probit and BRT models using the example of cooking appliance ownership. As shown, both models follow the same positive trend with greater appliance ownership and have similar marginal effects at the mean. However, for specific households the probit either under- or overestimates the effect of appliance ownership compared to the BRT model. The probit regression offers the benefit of simplicity, which can make communicating results to a non-technical audience straightforward. Additionally, the assessment of compatibility of results via statistical significance can help validate and compare results. However, the outputs of the BRT offer a visual and intuitive way of conveying the variation in marginal effect and the existence of threshold levels.

Figure 5: Comparison of Marginal Effect of Cooking Appliance Ownership from Probit and BRT Models

While our measures of performance provide a metric for the calibration and discriminatory ability of each model, the rates of true and false positives and negatives for each model, shown in the bottom half of table 3, point to a problem with such models. For both the probit and BRT models we find that the number of false negatives, that is the households that the model predicted would not switch but did in reality switch, accounts for over 84% of switching households under the BRT model and 92% of switching households under the probit model. This suggests that while these models are good at predicting households that did not switch (true negatives compared with false positives), they perform poorly at predicting households that do transition. Households that transition against the expectation of the model point to the existence of alternative transition pathways not captured by either model, defined by characteristics that individually would ordinarily not be drivers of transition, but when present in specific combinations can allow households to overcome other barriers.
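The false negative shares quoted above follow directly from the classification tallies in table 3, as the short calculation below shows. The helper function and its key names are illustrative; only the four counts per model are taken from the table.

```python
def switcher_rates(true_positive, false_negative, false_positive, true_negative):
    """Rates among actual switchers and actual non-switchers."""
    switchers = true_positive + false_negative
    non_switchers = false_positive + true_negative
    return {
        "sensitivity": true_positive / switchers,      # switchers correctly predicted
        "miss_rate": false_negative / switchers,       # switchers predicted not to switch
        "specificity": true_negative / non_switchers,  # non-switchers correctly predicted
    }


# Classification tallies from table 3.
brt = switcher_rates(true_positive=214, false_negative=1138,
                     false_positive=89, true_negative=6866)
probit = switcher_rates(true_positive=103, false_negative=1249,
                        false_positive=116, true_negative=6839)

for name, rates in (("BRT", brt), ("Probit", probit)):
    print(f"{name}: miss rate {rates['miss_rate']:.1%}, "
          f"specificity {rates['specificity']:.1%}")
```

Both models classify nearly all non-switching households correctly, while identifying only a small share of the households that actually switched.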

6 Descriptive Modelling Results

Using the variables shown in table 1, a divisive hierarchical clustering analysis was conducted on the subset of households that did switch their main stove from solid fuel biomass stoves to a clean non-biomass stove between 2004-5 and 2011-12. The clustering analysis identified nine distinct clusters of households, all of which had transitioned away from primarily using a biomass stove but with different combinations of defining characteristics. The resulting dendrogram is shown in figure 6, and the mean characteristics of each cluster are shown in table 4.

Figure 6: Dendrogram of Hierarchical Clustering with IHDS Biomass to LPG switching households

| | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 |
|---|---|---|---|---|---|---|---|---|---|
| Region (most represented) | North | South | South | North | North | North East | East | West | South |
| Income (INR) | 55058.20 | 33054.90 | 27915.67 | 46401.94 | 29561.38 | 48916.25 | 40480.51 | 35244.72 | 37380.36 |
| Caste | Fwd/Gen | OBC | OBC | OBC | OBC | Fwd/Gen | Fwd/Gen | OBC | OBC |
| Time in Place | 85.18 | 82.87 | 79.91 | 64.58 | 70.67 | 75.81 | 63.07 | 84.60 | 71.86 |
| Urban (proportion) | 0.01 | 0.02 | 0.38 | 0.99 | 0.96 | 0.54 | 0.98 | 0.01 | 0.48 |
| Female Education | 8.52 | 5.89 | 5.57 | 7.96 | 5.98 | 10.05 | 8.46 | 6.92 | 8.51 |
| Pucca House | 0.98 | 1.00 | 0.00 | 0.99 | 0.96 | 0.78 | 1.00 | 0.94 | 0.98 |
| Flush Toilet | 0.99 | 0.02 | 0.41 | 1.00 | 0.06 | 0.79 | 0.57 | 0.44 | 0.99 |
| Water Piped Hours | 2.53 | 1.70 | 1.98 | 2.22 | 4.91 | 0.35 | 3.04 | 1.86 | 3.23 |
| Monthly spend on dairy | 461.24 | 220.90 | 162.93 | 301.77 | 194.66 | 355.35 | 186.90 | 185.29 | 154.35 |
| Electricity Hours | 15.14 | 12.55 | 12.96 | 15.79 | 15.56 | 6.67 | 19.61 | 16.59 | 17.98 |
| Electricity | 166.48 | 74.36 | 80.94 | 228.55 | 130.85 | 96.90 | 167.94 | 95.86 | 110.81 |
| LPG | 172.18 | 136.10 | 132.90 | 181.68 | 141.11 | 215.12 | 145.49 | 128.91 | 130.25 |
| Biomass | 466.67 | 522.18 | 423.21 | 159.73 | 327.45 | 360.39 | 131.21 | 598.30 | 380.47 |
| Kerosene | 18.00 | 21.64 | 31.77 | 22.61 | 29.11 | 47.52 | 32.84 | 34.54 | 16.43 |
| Change in Fuel Distance | -6.46 | -9.67 | -9.78 | -2.30 | -4.32 | 9.86 | -4.50 | -21.76 | -5.06 |
| Cooking Appliances | 0.54 | 0.36 | 0.32 | 0.50 | 0.37 | 0.43 | 0.46 | 0.41 | 0.52 |
| IT Appliances | 0.45 | 0.38 | 0.34 | 0.42 | 0.38 | 0.43 | 0.43 | 0.37 | 0.46 |
| Female TV Viewing Hours | 0.91 | 0.85 | 0.85 | 0.63 | 0.42 | 1.55 | 0.26 | 0.98 | 0.21 |
| Correct BRT Prediction | 12.8% | 8.1% | 10.0% | 4.7% | 13.7% | 76.6% | 3.6% | 12.6% | 28.2% |
| Correct Probit Prediction | 0.6% | 0.6% | 0.0% | 1.0% | 1.6% | 64.1% | 0.0% | 0.0% | 24.2% |

Table 4: Mean characteristics of clean cooking transition clusters
(a) Electricity Use
(b) Piped Water Supply
(c) Region
(d) Rural - Urban Split
Figure 7: Key explanatory variables by cluster for households that have switched from Biomass to LPG

The diversity of characteristics between clusters is notable, as it suggests that no single combination of determinants results in a transition to clean cooking fuels, and points to the different and complex transition pathways that van der Kroon et al. (2013) discussed. A comparison of clusters 1 and 2, detailed in table 4 and shown in figure 7, serves to illustrate a rural case of such different transition pathways. Households in cluster 1 have a mean income of 55,058 INR and are nearly all Northern rural households. They have good provision of water and electricity, with near ubiquity of flush toilets and permanent housing, above average appliance ownership and electricity use, and above average levels of female education. This group represents households that score highly on most of the key determinants, and a higher proportion of these households were correctly predicted to have switched stove by the BRT model. In contrast, households in cluster 2 have a lower mean per capita income of 33,054 INR, lower female education levels, lower prevalence of flush toilets, fewer hours of piped water and electricity access, lower electricity consumption, higher average biomass consumption and lower appliance ownership. However, households in cluster 2 all have permanent housing, have been settled for over 80 years and still have better than average availability of electricity and water. This suggests that despite their lower income these households still have access to a better than average level of physical infrastructure, but their high biomass use relative to cluster 1 suggests that there is a higher prevalence of fuel stacking in households of cluster 2.

The existence of different transition pathways can also be observed between urban clusters 4 and 5. Cluster 4 represents above average income households, with a per capita annual income of 46,401 INR and above average education of the head female of the household, access to flush toilets, hours of electricity, and appliance ownership. Cluster 4 also largely represents northern urban households. Households in cluster 5 are also urban, but have a markedly lower mean per capita annual income of 29,561 INR, a low prevalence of flush toilets, lower electricity consumption, lower levels of head female education and appliance ownership, and mean biomass consumption double that of cluster 4. Cluster 5 has a high proportion of households employed in stable jobs, with equally good availability of water and electricity as households in cluster 4, as well as being settled in their current neighbourhood for longer and containing more Southern households. These longer established households with steady employment are likely to have stronger communities with good ’social infrastructure’, with better relationships and sharing of information between neighbours. A greater proportion of households in cluster 5 were correctly predicted to transition by the BRT model, as they score highly on key determinants such as region, profession, and change in fuel distance while not lagging too far behind the mean on other key determinants. These households are likely to have a higher prevalence of fuel stacking, as evidenced by the higher mean biomass use, where biomass fuels may offer a back up fuel when LPG is not available, or in months when household income needs to be spent on other priorities.

It is interesting to note the uneven distribution of correct model predictions across the clusters, that is the rate of true positives in the test subset of the dataset present in each cluster. The probit model fails to predict a significant proportion of transitions in any cluster but 6 and 9, where it correctly predicted 64.1% and 24.2% of stove transitions respectively. These are the clusters which score highly on nearly all the determinants and are easy identification targets for the model. The BRT model does correctly identify a low percentage of stove switching in several other clusters but similarly performs best at identifying stove switching households in clusters 6 and 9. The clusters other than 6 and 9 do not score highly on nearly all determinants, but rather score highly on specific combinations of key determinants. Levels of access to physical infrastructure - indicated by variables including housing quality, hours of electricity, piped water availability, and flush toilet availability - and/or social infrastructure - indicated by variables including years since migration and caste - seem to be important to the transition to clean cooking. As we have shown, households with different non-income characteristics can still have similarly suitable levels of physical and social infrastructure to enable a clean cooking transition, even though not all of these combinations of characteristics would be identified by a predictive model.

It is notable that there is a greater share of biomass use in households which are more income poor, or have poorer access to infrastructure, both social and physical, even though they have switched their main stove to a clean non-biomass stove. The use of fuel stacking to manage energy services in the household, as described by van der Kroon et al. (2013), was widespread amongst households that switched stove. Understanding the energy practices and decisions leading to such fuel stacking behaviours requires an understanding at a household level, such as that demonstrated by Khalid and Sunikka-Blank (2017), in order to enable policy interventions to promote greater uptake of sustained clean cooking among such households.

7 Limitations and Future Work

This analysis faces a limitation due to the nature of the IHDS dataset, which is representative at the national level. It serves to make some crucial comparisons between regions and states, but differences in the non-income drivers that determine clean cooking transitions, and the interaction between physical and social infrastructure and household energy practices, all take place at a local scale. Larger sample size surveys at a city scale could be used to identify and characterise the different transition pathways of different groups of households. Additional data on the current fuels used, different energy end uses within a household and time of use, as well as aspirations of households, would be invaluable. In addition, such detailed surveys could include some qualitative interviews with households discussing their energy practices and decisions to provide context to the data. For example, this could provide an understanding of the non-monetary trade-offs considered by households when switching to LPG.

The authors note that, promisingly, a number of recent studies, including those by Debnath et al. (2019) and Sharma et al. (2019) in this journal, have carried out local case studies exploring the influence of non-income drivers on changes in energy practices, appliance ownership, and fuel use. Further work with larger and more widely representative samples of such local data is needed, embracing alternative analytical tools such as ensemble methods and clustering analyses alongside qualitative approaches, which can help identify the complex action of non-income factors and identify different pathways to transition.

8 Conclusions and Policy Implications

This study has used machine learning methods in a two stage analysis: predictive modelling to characterise the non-income determinants of a switch from a biomass to a non-biomass stove by Indian households, and descriptive modelling to identify groups of households with similar energy transition pathways. Using the panel IHDS dataset, with over 32,000 households surveyed in 2004/5 and 2011/12, this study combines ensemble machine learning predictive modelling and descriptive clustering analysis to identify households that are missed by current policy interventions.

North-eastern and southern households had a greater probability of switching from a biomass to a non-biomass stove, as did those whose head of household was employed in non-manual labour professions. Several determinants displayed a threshold relationship with stove switching, and were only influential determinants of stove switching beyond a given value. For example, availability of electricity above 15 hours a day was associated with an increasing probability of stove switching, and a similar increase was observed where the head female of the household had more than 10 years of education. The influence of other determinants was characterised by multiple thresholds or regimes. For example, ownership of both cooking and IT appliances had a plateau of greatest marginal effect for households with ownership between 10 and 50%, with a slightly lower probability of fuel switching for households with higher appliance ownership and a negligible chance of switching below this range.

Our study found that the BRT model performed better than the probit model in predicting whether households switched; however, both models performed relatively poorly in identifying the households that did switch compared to those that did not. The clustering analysis showed that there were nine clearly distinguishable groups of households that had switched. Each cluster is defined by different combinations of key determinants. However, nearly all the households correctly identified by the predictive models were grouped in only two of the clusters. The other groups of households represent those typically missed out by predictive models and policies informed by such models. The two stage approach in this study provided additional insight over simple predictive models by determining not only the trends in the data but also the latent groups of households within the sample which followed different cooking transition pathways.

There are two major implications from this study for policy interventions aiming to alleviate energy poverty and promote transition to sustained use of cleaner cooking fuels. Firstly, local, regional and city-scale variation must be taken into account in the design of policies so as to target policy to the energy needs of local households. This could, for example, involve accounting for regional variations in cooking practices, such as a preference for bread over rice. This adds to previous studies in India showing that income alone is not the best metric for targeting interventions for clean cooking transition (Sehjpal et al., 2014), and supports a conclusion of Kebede et al. (2002) that local variations must be factored into the design and tailoring of policies.

The second key implication is that households follow different energy transition pathways, even within the same region or city, and each will be responsive to different incentives; a single policy measure will therefore not be effective at promoting clean cooking transition for all households. Effective policy needs to enact a range of interventions, beyond fuel subsidies, to help overcome the barriers to transition faced by households on these different transition pathways. In some cases this may be linked to financial barriers, but it may also require addressing infrastructure, legal issues, or even education and community barriers. Targeted data collection with clearly designed survey instruments could offer a means to tailor analysis leveraging a combination of data science techniques, thus maximising the information gained to support the design of more effective policy to address these barriers.

9 Acknowledgements

The authors are grateful for EPSRC support through the CDT in Future Infrastructure and Built Environment (EP/L016095/1) and to Indian Institute of Human Settlements, India. AP Neto-Bradley is supported by The Leathersellers’ Company and Fitzwilliam College, Cambridge.

10 Data Availability

Datasets related to this article, the IHDS-I dataset and the IHDS-II dataset, can be found in an open-source online data repository hosted by the Inter-university Consortium for Political and Social Research (ICPSR) (Desai et al. 2010).



  • S. Ahmad and J. A. Puppim de Oliveira (2015) Fuel switching in slum and non-slum households in urban India. Journal of Cleaner Production 94, pp. 130–136. External Links: ISSN 0959-6526, Link, Document Cited by: §1, §2, §3, §5.1, §5.1, §5.1.
  • S. Batchelor, E. Brown, N. Scott, and J. Leary (2019) Two Birds, One Stone—Reframing Cooking Energy Policies in Africa and Asia. Energies 12 (9), pp. 1591 (en). External Links: Link, Document Cited by: §1, §1.
  • B. P. Bhatt, S. S. Rathore, M. Lemtur, and B. Sarkar (2016) Fuelwood energy pattern and biomass resources in Eastern Himalaya. Renewable Energy 94, pp. 410–417 (en). External Links: ISSN 0960-1481, Link, Document Cited by: §5.1.
  • S. E. Bibri and J. Krogstie (2017) Smart sustainable cities of the future: An extensive interdisciplinary literature review. Sustainable Cities and Society 31, pp. 183–212 (en). External Links: ISSN 2210-6707, Link, Document Cited by: §1.
  • I. Bisaga and P. Parikh (2018) To climb or not to climb? Investigating energy use behaviour among Solar Home System adopters through energy ladder and social practice lens. Energy Research & Social Science 44, pp. 293–303. External Links: ISSN 2214-6296, Link, Document Cited by: §2, §2, §2, §5.1.
  • R. Debnath, R. Bardhan, and M. Sunikka-Blank (2019) How does slum rehabilitation influence appliance ownership? A structural model of non-income drivers. Energy Policy 132, pp. 418–428. External Links: ISSN 0301-4215, Link, Document Cited by: §2, §5.1, §5.1, §7.
  • S. Desai, R. Vanneman, and N. D. National Council Of Applied Economic Research (2010) India Human Development Survey (IHDS), 2005: Version 12. Inter-University Consortium for Political and Social Research (eng). Note: type: dataset External Links: Link, Document Cited by: §3, §3.
  • S. Desai and R. Vanneman (2015) India Human Development Survey-II (IHDS-II), 2011-12: Version 6. Inter-University Consortium for Political and Social Research. Dataset.
  • S. Dhanaraj, V. Mahambare, and P. Munjal (2018) From Income to Household Welfare: Lessons from Refrigerator Ownership in India. Journal of Quantitative Economics 16 (2), pp. 573–588. ISSN 2364-1045.
  • T. Ekholm, V. Krey, S. Pachauri, and K. Riahi (2010) Determinants of household energy consumption in India. Energy Policy 38 (10), pp. 5696–5707. ISSN 0301-4215.
  • J. Elith, J. R. Leathwick, and T. Hastie (2008) A working guide to boosted regression trees. Journal of Animal Ecology 77 (4), pp. 802–813.
  • M. Farsi, M. Filippini, and S. Pachauri (2007) Fuel choices in urban Indian households. Environment and Development Economics 12 (6), pp. 757–774. ISSN 1469-4395, 1355-770X.
  • R. Galvin and M. Sunikka-Blank (2016) Schatzkian practice theory and energy consumption research: Time for some philosophical spring cleaning?. Energy Research & Social Science 22, pp. 63–68. ISSN 2214-6296.
  • B. Ghosh and P. De (1998) Role of Infrastructure in Regional Development: A Study over the Plan Period. Economic and Political Weekly 33 (47/48), pp. 3039–3048. ISSN 0012-9976.
  • C. F. Gould and J. Urpelainen (2018) LPG as a clean cooking fuel: Adoption, use, and impact in rural India. Energy Policy 122, pp. 395–408. ISSN 0301-4215.
  • Government of India Planning Commission (2012) Annual Report 2011-2012. Government of India, New Delhi.
  • R. Heltberg (2004) Fuel switching: evidence from eight developing countries. Energy Economics 26 (5), pp. 869–887. ISSN 0140-9883.
  • International Energy Agency, International Renewable Energy Agency, United Nations Statistics Division, World Bank, and World Health Organization (2019) Main Report. Technical Report 136961, The World Bank.
  • A. Kar, S. Pachauri, R. Bailis, and H. Zerriffi (2019) Using sales data to assess cooking gas adoption and the impact of India’s Ujjwala programme in rural Karnataka. Nature Energy 4 (9), pp. 806–814. ISSN 2058-7546.
  • A. Kassambara (2017) Practical Guide to Cluster Analysis in R. Unsupervised Machine Learning (Multivariate Analysis I), Vol. 1, STHDA.
  • B. Kebede, A. Bekele, and E. Kedir (2002) Can the urban poor afford modern energy? The case of Ethiopia. Energy Policy 30 (11), pp. 1029–1045. ISSN 0301-4215.
  • A. Kemmler (2007) Factors influencing household access to electricity in India. Energy for Sustainable Development 11 (4), pp. 13–20. ISSN 0973-0826.
  • R. Khalid and M. Sunikka-Blank (2017) Homely social practices, uncanny electricity demands: Class, culture and material dynamics in Pakistan. Energy Research & Social Science 34, pp. 122–131. ISSN 2214-6296.
  • S. R. Khandker, D. F. Barnes, and H. A. Samad (2012) Are the energy poor also income poor? Evidence from India. Energy Policy 47, pp. 1–12. ISSN 0301-4215.
  • G. Leach (1992) The energy transition. Energy Policy 20 (2), pp. 116–123. ISSN 0301-4215.
  • O. R. Masera, B. D. Saatkamp, and D. M. Kammen (2000) From Linear Fuel Switching to Multiple Cooking Strategies: A Critique and Alternative to the Energy Ladder Model. World Development 28 (12), pp. 2083–2103. ISSN 0305-750X.
  • D. B. Rahut, B. Behera, and A. Ali (2016) Patterns and determinants of household use of fuels for cooking: Empirical evidence from sub-Saharan Africa. Energy 117 (Part 1), pp. 93–104. ISSN 0360-5442.
  • N. D. Rao and K. Ummel (2017) White goods for white people? Drivers of electric appliance growth in emerging economies. Energy Research & Social Science 27, pp. 106–116. ISSN 2214-6296.
  • N. D. Rao (2012) Kerosene subsidies in India: When energy policy fails as social policy. Energy for Sustainable Development 16 (1), pp. 35–43. ISSN 0973-0826.
  • A. C. Sadath and R. H. Acharya (2017) Assessing the extent and intensity of energy poverty using Multidimensional Energy Poverty Index: Empirical evidence from households in India. Energy Policy 102, pp. 540–550. ISSN 0301-4215.
  • P. Sankhyayan and S. Dasgupta (2019) ‘Availability’ and/or ‘Affordability’: What matters in household energy access in India?. Energy Policy 131, pp. 131–143. ISSN 0301-4215.
  • G. Schmid (2012) The development of renewable energy power in India: Which policies have been effective?. Energy Policy 45, pp. 317–326. ISSN 0301-4215.
  • R. Sehjpal, A. Ramji, A. Soni, and A. Kumar (2014) Going beyond incomes: Dimensions of cooking energy transitions in rural India. Energy 68, pp. 470–477. ISSN 0360-5442.
  • A. Sharma, J. Parikh, and C. Singh (2019) Transition to LPG for cooking: A case study from two states of India. Energy for Sustainable Development 51, pp. 63–72. ISSN 0973-0826.
  • S. V. Sharma, P. Han, and V. K. Sharma (2019) Socio-economic determinants of energy poverty amongst Indian households: A case study of Mumbai. Energy Policy 132, pp. 1184–1190. ISSN 0301-4215.
  • E. Shove, M. Pantzar, and M. Watson (2019) The Dynamics of Social Practice: Everyday Life and How it Changes. London.
  • E. Shove and G. Walker (2014) What Is Energy For? Social Practice and Energy Demand. Theory, Culture & Society 31 (5), pp. 41–58. ISSN 0263-2764.
  • J. Silver and S. Marvin (2017) Powering sub-Saharan Africa’s urban revolution: An energy transitions approach. Urban Studies 54 (4), pp. 847–861.
  • K. R. Smith and A. Sagar (2014) Making the clean available: Escaping India’s Chulha Trap. Energy Policy 75, pp. 410–414. ISSN 0301-4215.
  • E. W. Steyerberg, A. J. Vickers, N. R. Cook, T. Gerds, M. Gonen, N. Obuchowski, M. J. Pencina, and M. W. Kattan (2010) Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiology (Cambridge, Mass.) 21 (1), pp. 128–138. ISSN 1044-3983.
  • B. van der Kroon, R. Brouwer, and P. J. H. van Beukering (2013) The energy ladder: Theoretical myth or empirical truth? Results from a meta-analysis. Renewable and Sustainable Energy Reviews 20 (Supplement C), pp. 504–513. ISSN 1364-0321.
  • Y. Wang, Q. Chen, T. Hong, and C. Kang (2019) Review of Smart Meter Data Analytics: Applications, Methodologies, and Challenges. IEEE Transactions on Smart Grid 10 (3), pp. 3125–3148. ISSN 1949-3061.