Emissions to exposure: modeling approaches and performance for estimating
personal exposure to household air pollution
Final Report
Michael Johnson1, Ricardo Piedrahita1, Ajay Pillarisetti2, Matthew Shupler3, Madeleine Rossanese1, Samantha Delapeña1, Ryan Chartier4, Elisa Puzzolo3,5, Diana Menya6, Daniel Pope3
1 Berkeley Air Monitoring Group, Berkeley, California, USA 2 Emory University, Atlanta, Georgia, USA 3 Dept of Public Health and Policy, University of Liverpool, Liverpool, UK 4 RTI International, North Carolina, USA 5 Global LPG Partnership, New York/London
6 Dept. of Epidemiology and Biostatistics, School of Public Health, College of Health Sciences, Moi University, Kenya
1 Executive Summary
Background and objectives: This study assessed the performance of modeling approaches to estimate
personal exposure in Kenyan homes where household fuel combustion contributes substantially to household
air pollution (HAP). This work had two primary objectives: first to evaluate the models used for setting
emissions performance targets by the World Health Organization (WHO) and International Organization for
Standardization (ISO), and second to adapt and develop models that predict exposures to household air
pollution.
Approach: To address these objectives we collected data on a subsample of homes participating in the CLEAN-Air
(Africa) study (see footnote 1), which is evaluating the potential impacts of transitioning from biomass fuels to liquefied
petroleum gas. Within this subsample, we measured emissions (PM2.5, black carbon, CO); household air
pollution (PM2.5, CO); personal exposure (PM2.5, CO); stove use; and behavior, socioeconomic, and
environmental (e.g. ventilation and kitchen volume) characteristics. This data was then used to assess and
develop the modeling approaches: the single-zone model used by ISO and the WHO for stove performance
targets; indirect exposure models that combine person-location and area-level measurements; and
regression-based models that predict exposure based on a set of predictors such as fuel type, room volume,
and other relatively easily measured parameters.
Key findings:
- The measured stove performance, kitchen concentrations, and personal exposures were all in
alignment with previous work based on fuel type. LPG had the lowest emissions, with corresponding
low HAP concentrations and personal exposures, while wood and charcoal stoves had substantially
higher emissions and exposures. This anticipated trend and variability in conditions provided a good
dataset to test and develop the models.
1 The CLEAN-Air (Africa) project is being conducted by the University of Liverpool (UoL) and Moi University in Eldoret, Kenya. The CLEAN-Air (Africa) Global Health Research Group is working directly with government ministries in Cameroon, Ghana, and Kenya to assess the impacts of the slated expansion of LPG use in the countries.
- The WHO and ISO single-zone model was reasonably well correlated with measured kitchen
concentrations of PM2.5 and CO (R2=0.45), but lacked precision, with relatively large standard errors.
The model also overestimated measured kitchen concentrations by several fold.
- The combination of the single-zone model’s correlation with measured kitchen concentrations and
systematic overestimation suggests it is a reasonable tool for setting performance targets as it
provides a conservative approach for linking emission rates with indoor air quality. Given that the
model overestimates kitchen concentrations, by potentially up to an order of magnitude, there may
be room to adjust the modeling approach such that it provides more reasonable estimates of kitchen
concentrations while still erring on the side of conservativeness.
- Normalizing the model response to measured kitchen concentrations, incorporating daily stove use
patterns of all stoves used in the home, and applying the ratios of kitchen concentrations to personal
exposures generated distributions of modeled PM2.5 exposures for the LPG and wood user groups
with similar medians (LPG: modeled 19 μg/m3 vs measured 29 μg/m3; wood: modeled 207 μg/m3 vs
measured 182 μg/m3) and substantially overlapping interquartile ranges (LPG: modeled 10-41 μg/m3
vs measured 27-46 μg/m3; wood: modeled 113-386 μg/m3 vs measured 104-292 μg/m3). This
agreement suggests the adapted model can produce group-level estimates of HAP exposure that are
reasonable, but given the large standard errors, interpreting individual level estimates would be
problematic.
- Regression models were made using the entire dataset and separately for LPG and biomass-using
households. The best performing model for the entire dataset used a combination of survey-based
data and measurements. The model performed well, with an R2 of 0.76 and a root mean-squared error
of 85 µg/m3.
- The survey-based regression model was able to predict PM2.5 exposures with an R2 of 0.51, nominally
better than the single-zone model. A reliable and accurate survey-based model for estimating
exposure would be an extremely valuable tool for researchers and program evaluators, as it would
mitigate the need for relatively expensive and technical exposure measurements. While promising,
the substantial exposure contrast between the LPG and biomass user groups was largely responsible
for the relatively good performance of the simple model, with fuel type being the most important
predictor in the model. This caveat implies that the model performance may rely on those large
contrasts, which are not always evident, an issue that has been previously reported in HAP exposure
modeling studies.
Conclusions: Overall, both the single-zone and regression-based models show substantial promise for
predicting personal exposures to HAP. The models were aided by large exposure contrasts between fuel use
groups, which helped explain much of the variability in exposure. All of the modeling approaches also had
substantial uncertainty associated with specific data points, suggesting that they are best applied towards
group-level estimates rather than being used as a tool to understand exposure in any given home. The
modeling approaches were also developed using data from a single context, and some of the assumptions or
predictors may have differential impacts in other locations. Testing them in different contexts (fuels,
geographies, stove use patterns, housing characteristics) would help characterize how robustly they operate
and/or the degree to which they may need to be tuned to those specific conditions.
Conducting this work in Kenya was especially timely, as the country has a relatively strong market for
modern biomass cookstoves and clean cooking energy sources, as well as innovative consumer finance programs
such as pay-as-you-go systems and microfinance. Kenya’s transition towards cleaner stoves and fuels is being
aided by active involvement in the development and adoption of ISO performance standards, which is part of
the country’s work with the World Health Organization to expedite the energy transition. The models
developed here are clearly most applicable to the Kenyan context, and ideally will be used to help characterize
the HAP exposure implications of energy transitions, as well as strengthen the implementation of Kenya’s
cookstove standards framework.
2 Table of Contents
Executive Summary
Table of Contents
Acknowledgements
Background and Rationale
Methods
5.1 Study design and field site
5.2 Data collection and analysis methods
5.2.1 Emissions measurement techniques
5.2.2 Ventilation rate determination
5.2.3 Stove usage monitoring
5.2.4 Ambient monitoring
5.2.5 Personal exposure monitoring
5.2.6 Household air pollution monitoring
5.2.7 Beacon based time-activity monitoring
5.2.8 Behavioral factors
5.3 Modeling approaches
5.4 Data Analysis
5.4.1 Sensor data
5.4.2 Mass-balance single-zone model
5.4.3 Regression models
Results
6.1 Single-zone model
6.2 Regression-based models
Discussion
7.1 Model application considerations
References
Supplementary Information
S1 Emissions performance
S2 Housing characteristics and socioeconomic status
S3 PM2.5 and CO household air pollution (HAP) and personal exposure concentrations for the study population
S4 Characterization of long-term stove use patterns
S4.1 Average events per day by group
S4.2 Average minutes per day by group
S4.3 Temporal trends
S4.4 Usage fraction by group
S4.5 Stove stacking
S4.6 SUMs placement
S4.7 Ambient monitoring
S4.8 Intensive monitoring
3 Acknowledgements
We would foremost like to thank the kind and gracious study participants from the Eldoret area for
participating in the study, without whom it would not have been possible.
This work was made possible by the research team led by Dr. Diana Menya at Moi University, including Judi
Mangeni, Edna Sang, Gilbert Nyauke, Mary Lydia Kiano, Bernard Bosire, Anabwani Menya, Noelle Sutton, Seth
Owiti, Sharon Cherono, Rachel Samoei, Ruth Jepchirchir, Joan Chepng’eno, Joseck Erambo, Zipporah Mageto,
and Mondesta Malemo.
Thank you to the entire University of Liverpool team, who supported the field work training, project
management, and data cleaning, especially Iva Cukic, Rachel Anderson de Cuevas, Sara Ronzi, and Emily Nix.
Thank you to Dr. Daniel Wilson at Geocene for assistance in processing the stove usage monitor and exposure
data. Thank you to Parker Alex Matthews for her assistance in data cleaning, organization, and analysis.
This work was funded by the United Nations Office for Project Services (UNOPS; RFP/2017/2592), a subsidiary
organ of the United Nations, through the Climate and Clean Air Coalition, and managed by the Clean Cooking
Alliance. Funding for a substantial portion of the measurements and field work that were complementary to
this work was provided by the CLEAN-Air (Africa) project. CLEAN-Air (Africa) was funded by the National
Institute for Health Research (NIHR) (ref: 17/63/155) using UK aid from the UK government to support global
health research. The views expressed in this publication are those of the author(s) and not necessarily those of
the NIHR, UK Department of Health and Social Care, UNOPS, the Climate and Clean Air Coalition, or the Clean
Cooking Alliance.
4 Background and Rationale
This study was a collaborative effort with a CLEAN-Air (Africa) project conducted by the University of Liverpool
(UoL) and Moi University in Eldoret, Kenya. The CLEAN-Air (Africa) Global Health Research Group worked
directly with government ministries in Cameroon, Ghana, and Kenya that have recently made ambitious
commitments to scale up household access to LPG for a significant proportion of their populations. As part of
the effort to estimate impacts of scaled LPG adoption, the CLEAN-Air (Africa) project collected rapid survey
data on fuel usage and household characteristics from over 2000 homes. In ~100 Kenyan homes using biomass
only, biomass and LPG, or LPG only, the project also conducted in-depth surveys, personal exposure and
household-level measurements of particulate matter under 2.5 µm (PM2.5) and carbon monoxide (CO), and
stove use monitoring. The additional measurements for the UNOPS study were made in a subset of those homes. The
most common biomass stoves in the study area were traditional three-stone-fires, jiko-style charcoal stoves,
and handmade mud stoves (Chepkube stoves) (see Figure 2).
Data on emissions and exposure in Kenya are limited. To our knowledge, the only published field
emission performance studies on stove interventions have been carried out by Berkeley Air on charcoal and
kerosene stoves (Garland et al. 2017; Johnson et al. 2009, 2019). The data on personal exposures is also
limited, with available studies reporting only carbon monoxide (CO) or associated with only wood-fueled
stoves (Ochieng et al. 2013; Yip et al. 2017). Based on our information, there have yet to be studies that
report PM2.5 exposures from a survey of the current common user groups in Kenya, i.e. users of traditional
biomass, mixed biomass and LPG, and exclusive LPG users.
In this paper we report on efforts to: 1) develop and refine current physical modeling approaches used by the
WHO and ISO to predict personal exposures to PM2.5 and CO from emissions measurements (ISO 2018;
Johnson et al. 2014) and 2) develop multivariate models that predict personal PM2.5 and CO exposures based
on physical and behavioral predictors of PM2.5 and CO emissions. Figure 1 illustrates the emissions to exposure
pathways, which form the basis of the modeling approaches.
Figure 1. Diagram of the emissions to exposure pathways and the basic modeling approaches employed in this validation
study. (Figure credit: Ajay Pillarisetti, Nick Lam)
Additional results include characterization of: 1) in-field PM2.5, CO, and black carbon (BC) emissions
performance of wood, charcoal, and LPG stoves; 2) distributions of kitchen volumes, air exchange rates, and
cooking times (key WHO/ISO physical model inputs); 3) PM2.5 and CO HAP and personal exposure
concentrations for the study population; 4) indirect PM2.5 and CO exposure modeling results, wherein indoor
location and household air pollution are used to model personal exposure; and 5) characterization of
long-term (2-4 months) stove use patterns for each study group (biomass as the primary fuel, charcoal as
primary, and LPG as primary).
Assessment and improvement of physical models for household and personal exposure concentrations is an
important goal. Field studies of personal exposure to PM2.5 are some of the most difficult to conduct as they
require costly equipment, highly trained technicians, and participant compliance with PM monitors. Models
that could accurately estimate exposure based on known performance of technologies combined with more
easily collected data, such as ventilation rates, kitchen volumes, and simple measures of user behavior, would
be highly valuable. While not a substitute for direct field studies of exposure, these models could help guide
programmatic decisions toward the most effective household energy solutions. Here we present models that
aim to predict personal exposure to household air pollution using various approaches and data types within
the context of cooking, though they could be applied to other source types as well.
The data set collected in this study is unique due to the number of samples collected and the wide range of
measurements performed at each home. Consequently, there are many interesting relationships to assess
and analyses to perform that are outside the primary aims of the work, and it is likely that future researchers
would find additional analyses to conduct with this data set.
5 Methods
5.1 Study design and field site
Fifty-seven study households were selected from Turbo and Kesses, rural and peri-urban communities in Western
Kenya, near Eldoret. These households were selected for household air pollution (HAP) and personal exposure
(PE) monitoring from a subset of the CLEAN-Air (Africa) study homes, and were split into groups primarily
using LPG (n=32) and wood (n=32), in addition to 7 primary charcoal users (see footnote 2). Household selection and
assignment to the study groups was conducted with feedback from local partners, though a rapid survey
(n=2248) provided the most critical contextual information showing which types of users are representative of
the CLEAN-Air (Africa) target population. Careful sampling ensured a large emissions performance and
exposure gradient for modelling purposes. Field measurements for this work were conducted from October
2019 to January 2020, encompassing only the dry season. Although the climate throughout the year is generally
temperate, and cooking occurs primarily indoors, it is conceivable that ventilation and fuel usage patterns
change between seasons, which could influence model performance.
Figure 2. Typical stoves encountered in the study, with the LPG and charcoal stoves shown at left (with SUMs installed), a
traditional Chepkube stove at center, and a 4-burner LPG stove at right.
In all study homes, emissions, HAP, personal exposure, stove use, and user behavior (sensor and survey-based)
were measured (Table 1). Ambient sampling was conducted during all personal samples. 28 homes were
selected for more intensive sampling. In these homes, we measured stove usage for up to four months and
HAP for up to four days (details in Table S4). During emissions sampling at these homes, stratified kitchen CO
and PM2.5 concentrations were measured at heights of 1m and 2m, in addition to the typical 1.5m height. The
specific measurement techniques for these measures are presented in Table 1 below.
2 These were the total number of home monitoring samples conducted, but the final amount of valid samples available
for modeling was lower for the different modeling approaches, due to data quality assurance protocols filtering out some
of the data.
Table 1. Measurement overview. Measurements were conducted at multiple time scales, from the measurement of
single cooking events, to multi-day household air pollution monitoring.
| Measurement | Emissions measurements during a cooking event | 24h measurements | Intensive measurements (1-4 days) |
| Personal exposure | | MicroPEM (PM2.5); Lascar CO (CO); Berkeley Air Beacons (participant location) | Berkeley Air Beacons (participant location) |
| Kitchen HAP | | MicroPEM (PM2.5); Lascar CO (CO) | Berkeley Air PATS+ (PM2.5); Lascar CO (CO) |
| Secondary HAP | | Berkeley Air PATS+ (PM2.5); Lascar CO (CO) | Berkeley Air PATS+ (PM2.5); Lascar CO (CO) |
| PM2.5 emissions | UPAS (PM2.5); Berkeley Air PATS+ (PM2.5, placed at 1m, 1.5m, 2m) | - | - |
| CO/CO2 emissions | TSI (CO); Lascar CO (CO, placed at 1m, 1.5m, 2m) | - | - |
| Participant behavior | Observation | Berkeley Air Beacon Loggers (participant location); Survey (time-activity, fuel-use, socioeconomic status) | Berkeley Air Beacon Loggers (participant location); Survey (time-activity, fuel-use, socioeconomic status) |
Stove use: Geocene SUMS (temperature loggers)
Environment: Kitchen volume, air exchange rate, housing characteristics
Ambient: UPAS (PM2.5), PATS+ (PM2.5), Lascar CO (CO)
5.2 Data collection and analysis methods
5.2.1 Emissions measurement techniques
Emissions samples were collected during uncontrolled cooking tests in participants’ homes, where the cooks
were instructed to prepare a meal as they normally would, without altering stove operation or cooking
techniques. The emissions sample was collected with a multi-port probe suspended in the smoke plume
(Figure 3), and the sample stream was drawn through a Teflon filter after a PM2.5 size cut cyclone to determine
PM2.5 mass deposition. Carbon dioxide (CO2) and CO were measured with real-time instrumentation (TSI IAQ
CALC 7545). Background concentrations of CO2, CO, and PM2.5 were measured before and after each sampling
event, and subtracted from those measured in the emissions plume. Samples with more than 15% of the CO or
CO2 readings above the instrument maximum measurable value were removed from analysis.
If real-time estimates of background PM2.5 concentrations exceeded 50 µg/m3, testing was delayed until a lower
background concentration was observed. Filter analyses were performed at Colorado State University (Fort
Collins, CO, USA) using an electronic microbalance (Mettler Toledo, USA) with 0.1 µg resolution in a
temperature and humidity-controlled chamber. Mass depositions were determined by weighing the filters
before and after sampling, and correcting for contamination using the median mass deposition of collected
blank filters (n=20). Limit of detection (LoD) was calculated as three times the standard deviation of the mass
deposition on the blank filters (Shrivastava et al., 2011). Black carbon on filters was analysed optically by
transmittance (before and after sampling) using a SootScan OT21 analyser (Magee Scientific), and adjusted
using calibration factors as reported in Garland et al. 2017.
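The blank correction and limit of detection described above can be sketched as follows. This is a minimal illustration, not the laboratory's actual procedure: the function name is hypothetical, and comparing the blank-corrected mass against the LoD is an assumption about how the LoD was applied.

```python
import statistics

def blank_correct(sample_deposits_ug, blank_deposits_ug):
    """Blank-correct filter mass depositions and flag values below the LoD.

    Correction uses the median blank deposition; the LoD is three times the
    sample standard deviation of the blank depositions. Flagging the
    corrected mass against the LoD is an assumption made for illustration.
    """
    blank_median = statistics.median(blank_deposits_ug)
    lod_ug = 3 * statistics.stdev(blank_deposits_ug)
    corrected = []
    for mass in sample_deposits_ug:
        net = mass - blank_median
        corrected.append((net, net >= lod_ug))  # (net mass, above LoD?)
    return lod_ug, corrected
```

For example, with blank depositions of 1, 2, and 3 µg, the median correction is 2 µg and the LoD is 3 µg, so a 10 µg deposit is retained at 8 µg while a 4 µg deposit falls below the LoD.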
Figure 3. The emissions measurement system set-up to measure an LPG cooking event at left, and a biomass cooking event on a Chepkube stove at right. Note the stratified samples hanging in the room as part of an intensive sample.
Emission factors were determined using the carbon balance approach, as has been done in previous studies of
stove emissions and as is described in the WBT 4.2.3 protocol (Johnson et al. 2008; Roden et al. 2006; WBT
Technical Committee 2014). Emission rates were calculated by dividing the total emissions during a sampled
stove use event by the amount of time the event lasted.
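As a rough illustration of the two calculations above, the sketch below computes a carbon-balance emission factor and then an event-average emission rate. It is a simplified sketch under stated assumptions, not the protocol's implementation: the function names are hypothetical, all fuel carbon is assumed to be emitted as CO2 or CO, and the background-subtracted concentrations are assumed to be already expressed in grams per cubic meter (grams of carbon for CO2 and CO).

```python
def carbon_balance_emission_factor(delta_pollutant_g_m3, delta_co2_gc_m3,
                                   delta_co_gc_m3, fuel_carbon_fraction):
    """Emission factor in g pollutant per kg fuel (simplified carbon balance).

    Assumes all fuel carbon leaves as CO2 or CO, so the ratio of pollutant
    mass to carbon mass in the plume equals the ratio per unit fuel carbon.
    """
    pollutant_per_g_carbon = delta_pollutant_g_m3 / (delta_co2_gc_m3 + delta_co_gc_m3)
    return pollutant_per_g_carbon * fuel_carbon_fraction * 1000.0

def emission_rate_mg_per_min(emission_factor_g_per_kg, fuel_used_kg, event_minutes):
    """Total emissions during the event divided by the event duration."""
    total_emissions_mg = emission_factor_g_per_kg * fuel_used_kg * 1000.0
    return total_emissions_mg / event_minutes
```

With these illustrative inputs, an emission factor of 1 g/kg, 1.5 kg of fuel, and a 60-minute event give an emission rate of 25 mg/min.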
Observations and measurements of operational conditions, which may affect emissions performance, were also recorded for the duration of each cooking event, such as lighting techniques, pot types, and fuel conditions.
5.2.2 Ventilation rate determination
Ventilation was measured via the tracer gas method, according to the standard WHO protocols specifically designed for the single-zone box model. Briefly, CO levels were elevated in the cooking area due to the emissions source, and the rate at which the gas concentration decreased at the end of the cooking event was converted into a ventilation rate (Cowlin 2005). This was calculated primarily using data from the CO monitor placed at 1.5m height, but we also assessed it using data from the monitors at 1m and 2m for homes that had intensive sampling. In cases where the kitchen monitor did not provide valid data, the data collected by the emissions monitoring system were used to estimate the ventilation rate.
Additionally, we analyzed the rates of decrease from peak gas and/or PM2.5 concentrations measured during longer-term HAP sampling to compare these with the conventional approach and examine the stability of ventilation rates over time. This approach is presented in Carter et al. (2016).
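The decay-based ventilation estimate can be illustrated with a log-linear fit: if the tracer concentration above background decays exponentially, the slope of ln(concentration) against time is the negative of the air exchange rate. The sketch below is a minimal example under that first-order decay assumption; the function name and the ordinary least-squares fit are illustrative choices, not necessarily those of the cited protocols.

```python
import math

def air_exchange_rate_per_hour(minutes, co_ppm, background_ppm=0.0):
    """Estimate air changes per hour from a CO decay curve.

    Fits ln(C - background) versus time by ordinary least squares; the
    air exchange rate is the negative slope. Every concentration must
    exceed the background value for the logarithm to be defined.
    """
    logs = [math.log(c - background_ppm) for c in co_ppm]
    n = len(minutes)
    mean_t = sum(minutes) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(minutes, logs))
    sxx = sum((t - mean_t) ** 2 for t in minutes)
    slope_per_min = sxy / sxx
    return -slope_per_min * 60.0
```

Applying the same fit to decay periods from different days is one way to compare ventilation estimates over time.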
5.2.3 Stove usage monitoring
Stove usage was directly measured at 5-minute intervals on all stoves used more than once per week in the
study homes with stove use monitors (SUMs) and participant surveys. Geocene SUMs were selected for this
measurement due to their ability to measure high temperatures with thermocouples, the ease of device
launching and data management, and unobtrusive design (Wilson et al. 2020). SUM data was assessed on-site
to ensure proper placement, and immediate corrections were applied to address any issues. Placement of the
SUMs was piloted on any stove type that had not been previously encountered by the team to ensure
successful data collection (see Figure S10).
One week of stove usage monitoring, planned to coincide with the exposure assessment period, was already
included as part of the CLEAN-Air (Africa) study, which allowed for the collection of a wealth of short-term
data. Additionally, we conducted 2-6 months of SUMs monitoring on a subset of 48 homes (over 55 stoves in
total) to better characterize long-term usage trends and assess day-to-day variability, which both impact
household air pollution and personal exposure.
To generate cooking events from the SUMS temperature time series, this project used the FireFinder
algorithm (Wilson et al. 2020) from Geocene (Geocene, Berkeley, California). Two versions of the FireFinder
algorithm were deployed, the default version for time series in which the temperature exceeded 250 °C, and a
sensitive version for time series in which the maximum temperature was below 250 °C. The FireFinder sensitive
algorithm used a primary threshold parameter of 31 °C (the 95th percentile of all indoor temperature
measurements collected by PATS+ monitors) and a minimum event temperature of 24 °C (the 75th percentile of
outdoor temperature values). As in the default FireFinder algorithm, the minimum event duration was 5
minutes, and any events within 10 minutes of each other were grouped together into a single event. An
example time series is shown in the Supplementary Information in Figure S3. The time cooked with each stove
type per day was a direct input into the WHO single-zone box model and was used to predict kitchen-level
concentrations, as well as personal exposure. Longer term stove usage trends for the UNOPS and Clean Air
(Africa) homes are presented separately in the Supplementary Information.
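The event-grouping rules mentioned above (a temperature threshold, a minimum event duration, and merging of events that fall within 10 minutes of each other) can be illustrated with a simple threshold detector. This is not the FireFinder algorithm, whose internals are not described here; it is a hypothetical sketch that uses a single threshold and assumes evenly spaced 5-minute samples.

```python
def detect_events(temps_c, threshold_c=31.0, interval_min=5,
                  merge_gap_min=10, min_duration_min=5):
    """Return (start_min, end_min) cooking events from a temperature series.

    A sample at or above the threshold is treated as 'stove on'. Events
    separated by a gap of merge_gap_min or less are merged, and events
    shorter than min_duration_min are dropped.
    """
    events = []
    start = None
    for i, temp in enumerate(temps_c):
        t = i * interval_min
        if temp >= threshold_c and start is None:
            start = t                      # event begins
        elif temp < threshold_c and start is not None:
            events.append([start, t])      # event ends
            start = None
    if start is not None:                  # series ended while stove was on
        events.append([start, len(temps_c) * interval_min])

    merged = []
    for event in events:
        if merged and event[0] - merged[-1][1] <= merge_gap_min:
            merged[-1][1] = event[1]       # close enough: extend previous event
        else:
            merged.append(event)
    return [(s, e) for s, e in merged if e - s >= min_duration_min]
```

Summing the durations of the returned events for each stove gives a daily cooking time of the kind used as a model input.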
5.2.4 Ambient monitoring
Ambient monitoring of gravimetric and nephelometric PM2.5 and CO was carried out during most emissions
sample collection periods. An ambient monitoring station was set up in a rural background location in
the Kesses region (0°25'07.7"N 35°19'24.4"E). Gravimetric and black carbon PM2.5 measurements were
collected (UPAS, Access Sensor Technologies, Fort Collins, CO), alongside real-time PM2.5 (PATS+) and CO
(Lascar EL-USB-300). Instrument inlets were placed at a height of 6m, and away from trees, buildings, or other
obstructions, and there were no substantial nearby air pollution sources except for one biomass burning
kitchen ~50m from the site (see Figure 4).
Figure 4. Details of the ambient monitors and the installation site.
5.2.5 Personal exposure monitoring
Personal exposure was measured for the primary cook using the MicroPEM or Early Childhood MicroPEM (ECM)
monitors (RTI, Research Triangle Park, North Carolina, USA), which combine gravimetric and real-time
nephelometric PM2.5 measurement. MicroPEM and ECM filter samples were weighed at RTI and the
blank-corrected mass concentrations were used to correct the nephelometer readings, which are sensitive to
aerosol properties that may vary by source type and atmospheric conditions. Clean filter handling and
MicroPEM calibration protocols were applied to ensure high data quality throughout the study. The built-in
accelerometers were used to assess compliance, and a 20-minute rolling average was applied to the
magnitude of the composite acceleration. Compliance with wearing the monitors was evaluated considering the
fraction of the day that participants typically spend sleeping, and the amount of time they may be sitting with
the sampling vest adjacent to them. Compliance fractions between 0.25 and 0.75 were considered good, but with
the current model implementations we retained all data regardless of compliance.
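The accelerometer-based compliance check can be sketched as a rolling average over the composite acceleration magnitude. The sketch below is illustrative only: the movement threshold, the trailing window, and the assumption that inputs are per-minute, gravity-removed acceleration components are all hypothetical, since the report does not specify them.

```python
import math

def compliance_fraction(accel_xyz_per_min, window=20, movement_threshold=0.02):
    """Fraction of minutes in which the monitor appears to be worn.

    accel_xyz_per_min: per-minute (x, y, z) acceleration values, assumed to
    have gravity removed. A minute counts as 'worn' when the trailing rolling
    mean of the composite magnitude exceeds movement_threshold (hypothetical).
    """
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in accel_xyz_per_min]
    worn_minutes = 0
    for i in range(len(magnitudes)):
        lo = max(0, i - window + 1)
        rolling_mean = sum(magnitudes[lo:i + 1]) / (i + 1 - lo)
        if rolling_mean > movement_threshold:
            worn_minutes += 1
    return worn_minutes / len(magnitudes)
```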
Personal carbon monoxide exposure monitoring was performed using Lascar EL-USB-CO300 monitors (Lascar
Electronics, UK). All CO sensors used in the study underwent two-point calibrations before the study, with
certified calibration standards in Berkeley Air Monitoring Group’s California laboratory. Linear calibration
corrections were applied to the data. The data were then manually reviewed as a sense check, and filtered in
cases of clear instrument malfunctions. Personal PM2.5 and CO data were used as the dependent variables
in the exposure models, at minute and 24-hr average time scales.
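The two-point linear calibration correction applied to the CO sensors can be written as a line through two (raw reading, reference concentration) pairs. This is a generic sketch with hypothetical names; the actual calibration concentrations and procedure are not given here.

```python
def two_point_calibration(raw_low, ref_low, raw_high, ref_high):
    """Return a function mapping raw sensor readings to corrected values.

    The correction is the straight line through the two calibration points
    (raw_low, ref_low) and (raw_high, ref_high).
    """
    slope = (ref_high - ref_low) / (raw_high - raw_low)
    intercept = ref_low - slope * raw_low

    def correct(raw_value):
        return slope * raw_value + intercept

    return correct
```

For instance, a sensor that reads 2 ppm in zero air and 52 ppm in a 50 ppm standard would have 2 ppm subtracted from every reading.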
5.2.6 Household air pollution monitoring
Household air pollution was also measured with the MicroPEM as part of the CLEAN-Air study. These data
were supplemented with additional real-time PM and CO instrumentation (PATS+ and Lascar CO). In a subset,
PATS+CO (Berkeley Air Monitoring Group, Berkeley, California, USA) were used rather than the standard
PATS+, as its integrated high-precision CO electrochemical sensor allowed for comparison with the Lascars. A
sampling pack containing a MicroPEM, PATS+, and Lascar CO monitor was installed in the kitchen area for 24
hours, including the emissions sampling event. PATS+ and Lascar CO monitors were also placed in a separate
room where the participants reported spending the most time (generally the living area). In a subset of one
fourth of households, a set of three evenly spaced PATS+ and Lascar CO monitors was also hung between the
ceiling and floor in a stratified sampling configuration to capture the spatial variability of pollutants in the
kitchen space, which informs the variability of the ventilation conditions within the room, and in turn the
variability of kitchen to personal exposure estimation methods. In the 28 intensive homes, PATS+ and Lascar
CO monitors in the kitchen and living areas were left installed for up to five days to assess day to day
variability.
5.2.7 Beacon based time-activity monitoring
The Berkeley Air Beacon Logger System is a time-activity monitoring system specially designed for household
energy applications (Piedrahita et al. 2019, Liao et al. 2019). The system is composed of two components, a
poker-chip sized Bluetooth Beacon, which safely emits a unique ID multiple times a second over Bluetooth
Low Energy, and a Beacon Logger, which records the address and the strength of the Beacon’s emitted signal.
The system components are low in cost, power consumption, and maintenance efforts, especially in
comparison with personal exposure monitors.
Beacon Loggers were installed in all kitchens and living areas, and Beacons were given to the primary cooks to
wear during the 24-hour monitoring period around which emissions data was captured. Users generally wore
the two Beacons on a necklace or in the pocket alongside the exposure monitors. A minute-wise time series of
presence in each microenvironment was generated for the user by associating the signal strength of their
Beacons with the fixed locations of the Beacon Loggers (the primary kitchen and the living area where they
spent most of their time during the day). Presence in a room was determined using two different approaches:
1. In the ‘kitchen threshold’ approach, if the mean signal strength of a user’s Beacons in a given minute
is stronger than -70 (RSSI units, associated with an open field distance of ~5m), then the location for
the given minute is classified as ‘kitchen’, and if not, it is classified as ‘living area’. If the kitchen Beacon
Logger does not measure any signal for a given minute, it is assumed the user is away from the house,
and the location is then classified as ‘ambient’.
2. In the ‘nearest logger’ approach, the user’s location for a given minute is classified as the room whose
Beacon Logger records the highest mean signal strength. If neither
logger records a signal, the location is again classified as ‘ambient’.
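The two classification rules above can be sketched per minute as follows. The function names and the use of None for "no signal recorded" are illustrative assumptions; the 'equidistant' label for ties follows the walkthrough results reported in this section.

```python
def classify_kitchen_threshold(kitchen_rssi, threshold=-70.0):
    """'Kitchen threshold' rule for one minute.

    kitchen_rssi: mean signal strength seen by the kitchen Beacon Logger,
    or None if it recorded no signal in that minute.
    """
    if kitchen_rssi is None:
        return "ambient"
    return "kitchen" if kitchen_rssi > threshold else "living area"

def classify_nearest_logger(kitchen_rssi, living_rssi):
    """'Nearest logger' rule for one minute; None means no signal recorded."""
    if kitchen_rssi is None and living_rssi is None:
        return "ambient"
    if living_rssi is None:
        return "kitchen"
    if kitchen_rssi is None:
        return "living area"
    if kitchen_rssi == living_rssi:
        return "equidistant"
    return "kitchen" if kitchen_rssi > living_rssi else "living area"
```

Applying either function to each minute of the monitoring period yields the minute-wise location time series described above.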
A performance check of the system, called a walkthrough, was carried out at each home before the start of
the deployment. This entailed leaving the equipment in each area for a 5-minute
period to determine whether the classification was correct. The walkthrough results indicated that
with the ‘nearest logger’ algorithm, the classification was correct 83.0% of the time when the equipment was
in the kitchen, and was incorrect 15.5% of the time, when it classified the location as the other area in which a
logger was installed (0.6% of the time, it was classified as equidistant from both loggers, and 0.9% of the time
it was classified as not being near either logger, termed ‘ambient’). Similarly for the other area (typically the
living room), a correct classification was made 83.2% of the time, an incorrect prediction that the equipment
was in the kitchen was made 15.4% of the time, and the equidistant and ambient classifications each accounted
for 0.7% of the time. In the subset of intensive households, the Beacons and Beacon Loggers remained in place
alongside PATS+ and Lascar CO microenvironment monitors for a period of up to five days. This longer-term
period allowed us to assess participant acceptability of protocols, model performance, and compare
day-to-day within-person variability to between-person variability.
5.2.8 Behavioral factors
Time-activity, fuel use, and cooking habit data were collected using standard questionnaires that have been
used both in this part of Africa and in other countries, including India, Mongolia, Laos, and Cambodia.
Additional questions were added based on their potential utility to contribute explanatory power to statistical
models and included parameters such as trash-burning, animal fodder preparation, and smoking habits.
Additional information on socioeconomic and educational status was also collected. Socioeconomic status
was assessed using principal components analysis (PCA) on asset ownership and home characteristic variables
as per Vyas and Kumaranayake (2006) to generate a 5-category index. The index is generated and households
are assigned to a category using the prediction from the first principal component of the analysis. The first
index category was associated with low ownership of assets (such as cars, cell phones, and computers),
higher use of biomass, and outdoor sanitation facilities, while the fifth was characterized by high ownership
rates of those assets, indoor sanitation facilities, and access to water indoors (Table S2). Survey data was
collected with Mobenzi (Cape Town, South Africa), a tablet-based data entry system that has been used
extensively in similar studies and minimizes the likelihood of transcription errors and data loss.
5.3 Modeling approaches
The measurements collected allowed a thorough evaluation of model performance and helped us to
determine the measurable factors most critical for accurate exposure estimation.
The first modeling approach was the single-zone, mass-based model currently employed by ISO and WHO to
estimate kitchen concentrations and derive emissions performance targets (Johnson et al., 2014; WHO 2015,
ISO 2018). The model predicts room concentrations of pollutants using input distributions of emission rates
and usage times of the sources (in this case, stoves); a room’s ventilation rate and volume; the fraction of
emissions from the sources that enter the room (important for chimney stoves); and the background/ambient
concentration. The mathematical description is provided below and can also be found in Johnson et al. 2014.
Equation 1
$$C(t) = \frac{q_1 f_1 + q_2 f_2 + q_3 f_3 + \dots + q_n f_n}{\alpha V}\left(1 - e^{-\alpha t}\right) + C_o\,e^{-\alpha t} + C_b$$
where
$C(t)$ = pollutant concentration for a given time point
$q_x$ = emission rate for source x (mass/min)
$f_x$ = fraction of emissions from source x that enters the kitchen environment
$\alpha$ = first order loss rate (nominal air exchange rate) (changes/min)
$V$ = kitchen volume (m3)
$t$ = time interval (1 min)
$C_o$ = concentration from preceding time interval (unit mass/m3)
$C_b$ = background concentration
The model produces 24 hours of minute-by-minute concentration estimates, where the emission rates for the
respective sources are inputs for three discrete, evenly spaced cooking times. The sum of these periods is the
device usage time, which is also a model input. To calculate the predicted 24-hour mean concentration in the
kitchen (Ck), the concentrations for each time point are summed over the day and divided by the number of
minutes in a day. To estimate exposure, the 24-hour mean concentration Ck was multiplied by a Kitchen
Exposure Factor (KEF), as shown in Equation 2 below. The exposure ratios are ideally location-specific (as
was possible here), though global averages have also been applied, such as those from the Global Burden of
Disease Study (0.742 for women, 0.628 for young children, and 0.450 for men; Smith et al. 2014).
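The minute-by-minute calculation described above can be sketched as below. This is a hedged illustration rather than the ISO/WHO implementation: the function name, the choice to start the first event at minute 0, and the starting concentration of zero are assumptions. The sketch also reads C_o as the preceding concentration net of background, so that C_b is added once per time point instead of accumulating.

```python
import math

def single_zone_24h(q_mg_min, f, aer_per_hr, volume_m3, use_min,
                    c_b=0.0, n_events=3):
    """Return 1440 minute-wise kitchen concentrations (ug/m3) for one stove.

    q_mg_min: emission rate (mg/min); f: fraction entering the kitchen;
    aer_per_hr: air exchange rate (1/hr); use_min: total daily stove use (min),
    split into n_events equal events spaced evenly across the day.
    """
    alpha = aer_per_hr / 60.0            # air changes per minute
    decay = math.exp(-alpha)             # e^(-alpha * t) with t = 1 min
    event_len = use_min / n_events       # duration of each cooking event
    spacing = 1440 / n_events            # minutes between event starts
    # Steady-state concentration while the stove is on (mg/m3 -> ug/m3)
    c_ss = 1000.0 * q_mg_min * f / (alpha * volume_m3)
    c_indoor, series = 0.0, []
    for minute in range(1440):
        stove_on = (minute % spacing) < event_len
        source = c_ss * (1.0 - decay) if stove_on else 0.0
        c_indoor = source + c_indoor * decay   # Equation 1, net of background
        series.append(c_indoor + c_b)
    return series

# Illustrative (hypothetical) inputs: 15 mg/min, 20 air changes/hr, 25 m3, 180 min/day
series = single_zone_24h(15, 1.0, 20, 25, 180)
c_k = sum(series) / len(series)          # predicted 24-hour mean kitchen concentration
```

When each event decays fully before the day ends, the 24-hour mean reduces to the steady-state concentration scaled by the fraction of the day the stove is in use, which is a convenient sanity check on the loop.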
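Equation 2
$$E_r = \mathrm{KEF} \times C_k$$
where $E_r$ is the estimated personal exposure and $C_k$ is the 24-hour mean kitchen concentration.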
We calculated KEFs for each sample and by stove-fuel group. The predictive power of average KEFs was
evaluated using k-fold cross-validation (k depends on the total number of samples obtained). Previous work
(Hill et al. 2019) has shown that KEFs alone have poor predictive power in some contexts. The predictive
power of other ratios (of pollutants or pollutants and locations) was evaluated.
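A minimal sketch of how the predictive power of an average KEF could be checked with k-fold cross-validation is given below. The function name, the random fold assignment, and the use of RMSE as the held-out error metric are assumptions made for illustration; the study's own procedure may differ.

```python
import random

def kfold_mean_kef(exposures, kitchen, k=5, seed=0):
    """Cross-validate a mean-KEF exposure predictor.

    exposures / kitchen: paired personal exposures and kitchen concentrations.
    Returns the RMSE of held-out predictions, where each prediction is
    (mean KEF of the training folds) * (kitchen concentration).
    """
    pairs = list(zip(exposures, kitchen))
    random.Random(seed).shuffle(pairs)
    folds = [pairs[i::k] for i in range(k)]          # k roughly equal folds
    sq_err = []
    for i, test_fold in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        mean_kef = sum(e / c for e, c in train) / len(train)
        sq_err += [(mean_kef * c - e) ** 2 for e, c in test_fold]
    return (sum(sq_err) / len(sq_err)) ** 0.5
```

A large held-out RMSE relative to the mean exposure would indicate that a single average ratio does not transfer well between households.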
In addition to the physical, mass-based modeling of the first approach, a second approach was developed to
estimate personal exposure to PM2.5 and CO using linear regression. Covariates included both sensor-based
(indoor location, stove usage patterns, kitchen pollutant measurements) and survey (characteristics of the
home, kitchen, fuel, etc.) data. We created models with different sets of covariates, beginning with those that
are easiest to collect and most crude (survey data) and continuing with increasingly complex data streams. In
doing so, we identified both minimal and maximal predictive powers. We also compared models using both
sensor-based measurements and modeled estimates of kitchen concentrations. Previous work has shown that
some statistical variation in exposure can be explained through data easily obtained via questionnaire
(Balakrishnan et al. 2013; Baumgartner et al. 2011; Clark et al. 2010; Hill et al. 2019).
Models were evaluated using standard regression fit metrics, including RMSE, R2, AIC, and BIC, and tests were
performed to check that regression assumptions were met. We compared single-zone model estimates with
those from linear statistical models that included a parsimonious set of variables, including survey-assessed
and sensor-based measures, and reduced sets of measurements that could be reasonably deployed as part of
large-scale assessments. Statistical modeling approaches for estimating pollutant concentrations followed
those described by Balakrishnan et al (2013).
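The fit metrics named above can be computed from observed and predicted values as in the sketch below. The function name is hypothetical, and the AIC and BIC use the Gaussian-likelihood form with additive constants dropped, so absolute values will differ from those reported by R's `AIC()` and `BIC()` although model rankings on the same data are unaffected.

```python
import math

def fit_metrics(observed, predicted, n_params):
    """Fit metrics for a linear model.

    n_params counts every estimated coefficient, including the intercept.
    """
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    mean_obs = sum(observed) / n
    tss = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)
    # Gaussian log-likelihood terms with constants dropped (assumption)
    aic = n * math.log(rss / n) + 2 * n_params
    bic = n * math.log(rss / n) + n_params * math.log(n)
    return {"rmse": math.sqrt(rss / n), "r2": r2,
            "adj_r2": adj_r2, "aic": aic, "bic": bic}
```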
5.4 Data Analysis
5.4.1 Sensor data
Data for this work were collected with a variety of devices, on multiple time scales. The general approach for
analysis of this type of data consists of tethering the results to non-identifiable user identification numbers
and sample collection dates, which were then used to merge the sensor and survey data streams. Each
instrument had its own pre-processing code to import the raw instrument data, adjust per protocol (e.g. by
calibration factors), quality check it, and generate merged minute-wise time series for each deployment. The
data was then used to perform exposure modeling at various resolutions, including minute and daily levels.
5.4.2 Mass-balance single-zone model
Distributions of input data for the single-zone model (emission rates, ventilation rates, kitchen volumes, and
stove use times) were generated to determine the mean, maximum, and minimum values needed for the model
(Equation 1). The model was run for PM2.5 using Monte Carlo simulations with the Risk Analyzer software
package and generated an output of room concentration distributions. These distributions were compared
against measured PM2.5 kitchen concentrations (both the standard single kitchen measure and the integrated
concentration estimate based on stratified measurements). The resultant distribution of exposure ratios,
calculated as the ratio of personal exposure to room concentration, was compared against
previously published ratios. We also present distributions of proximity between the participants and
microenvironment pollution monitors, and joint probability distributions between stove usage-by-proximity
categories and exposure, which helped to generate a modified exposure ratio, accounting for proximity to the
pollution source.
5.4.3 Regression models
Statistical analysis and model creation were performed in R (versions 3.6.2 and 4.0.2). We used univariate and
multivariate linear regression (MLR) to model PM2.5 exposures among primary cooks. Fifty households provided
data that passed quality checks for use in modeling. We imputed missing covariate data by using the
population-wide median values. The dependent variable – the cook’s measured exposure to PM2.5 – was
log-transformed to meet normality requirements. Predictor variables were not transformed. Models were run
separately for the entire dataset and for biomass and LPG users.
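The preprocessing steps described above (median imputation of missing covariates and log-transformation of the dependent variable) can be sketched as follows, together with a one-predictor least-squares fit. The study used R; this Python version and its function names are illustrative assumptions only.

```python
import math
from statistics import median

def impute_median(values):
    """Replace missing covariate values (None) with the median of observed values."""
    observed = [v for v in values if v is not None]
    m = median(observed)
    return [m if v is None else v for v in values]

def fit_log_linear(x, exposure):
    """Fit ln(exposure) = intercept + slope * x by ordinary least squares."""
    log_y = [math.log(e) for e in exposure]      # log-transform the outcome only
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, log_y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

Because only the outcome is transformed, a fitted slope is interpreted as a multiplicative change in exposure per unit change in the predictor.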
Univariate models were fit to assess the relationship between exposure proxies and measured concentrations
and exposures. The use of kitchen concentrations and ratios of concentrations to exposure were assessed with
measures of correlations (Pearson’s r). Multivariate models were used to assess the relationship between
personal exposure and sociodemographic characteristics, stove-fuel energy use patterns, household
characteristics, and other physical measurements (time activity, stove use, etc.) in the home. Variable
selection occurred using multiple modalities. First, we used an automatic variable selection algorithm (from
the “leaps” R package) to pick predictors that optimized model comparison criteria, including
adjusted R2, Bayesian Information Criterion (BIC), and Mallows’ Cp (shortened to Cp). Models identified using
the automatic variable selection algorithm were further screened using 10-fold cross validation. The model
that minimized RMSE during 10-fold cross validation was selected for further evaluation. Second, based
on our prior knowledge and a review of the literature, we evaluated sets of predictors that we anticipate
would be easier to collect in the field using surveys, less intrusive monitoring devices placed in a kitchen, or
less intrusive personal monitors. Model performance was compared using adjusted R2 and RMSE. Finally, we
estimated a commonly used ratio known as the ‘Kitchen-Exposure Factor’ (KEF), the ratio of personal exposure
to kitchen concentrations, and compared it to a commonly used global KEF of 0.742. Estimates of global
KEF-derived exposures were compared to measured exposures using Wilcoxon Rank Sum tests.
6 Results
6.1 Single-zone model
Table 2 presents the summary statistics for the input parameters measured during the cooking events for the
single zone model, as well as the corresponding kitchen concentrations. The data are generally in line with
previously reported estimates (see footnote 3). Kitchen concentrations of PM2.5 and CO during cooking events
are all reasonable given the ranges of 24-hour exposures for these user groups (Balakrishnan et al. 2014;
Johnson et al. 2018; Pillarisetti et al. 2019; Pope et al. 2017). Emission rates are also in line with estimates
for wood and charcoal stoves (Johnson et al. 2019; Piedrahita et al. 2020). Our estimated limit of detection
(LoD) for PM2.5 emission rates was approximately 5 mg/min, which was greater than the rates we measured here
for LPG. We therefore used the PM2.5 emission rates for LPG reported by Weyant et al. (2019) and Johnson et al.
(2019), both of which were able to measure them in the field. Overall, the data provided the anticipated
variability, ranging from very clean (LPG) to relatively high emissions from the wood stoves, with charcoal in
between. Ambient measurements of PM2.5 and CO were also made (see Table S3) and showed consistently low
levels (6.8±5.4 µg/m3 for PM2.5, 0.9±2.7 ppm for CO). Note that stove/fuel performance metrics not directly used in the modeling
efforts, including combustion efficiency, emission factors (PM2.5, CO, and black carbon), and firepower can be
found in Table S1 of the supporting information.
3 https://berkeleyair.shinyapps.io/who_input_data_v2/
Table 2. Summary statistics for cooking event measurements used in the single zone model.
Mean Median SD Range n
Kitchen PM2.5 (μg/m3)
LPG 135 92 130 10--531 28
Charcoal 855 642 970 28--2442 6
Wood 2048 1580 2870 71--16,161 29
Kitchen CO (ppm)
LPG 4.8 2.6 5.1 0.0--15.6 27
Charcoal 70.4 63.8 73.9 5.9--150.2 4
Wood 44.3 38.2 39.8 0.4--196.1 28
PM2.5 emissions rate (mg/min)
LPG 1* NA 0.5* 0.1*--2.5* NA
Charcoal 15 15 10 3--30 7
Wood 159 147 65 69--343 29
CO emissions rate (mg/min)
LPG 0.0 2.9 3.8 0.2--15.9 30
Charcoal 15.3 14.5 0.9 1.1--3.1 7
Wood 1.7 1.5 0.8 0.5--3.6 29
Ventilation (air changes/ hour)
LPG 14.3 12.2 8.3 5.5--40.0 30
Charcoal 27.0 19.0 21.4 12.1--72.6 7
Wood 18.3 17.5 7.9 7.1--38.7 32
Kitchen volume (m3)
LPG 16.5 13.2 12.2 5.4--51.9 32
Charcoal 24.8 23.2 9.6 12.8--41.5 6
Wood 25.0 22.3 11.46 11.1--49.6 32
Event Duration (minutes)
LPG 45 45 28 7--125 31
Charcoal 54 50 16 29--81 7
Wood 58 54 26 21--116 30
*Assumed from Weyant et al. 2019 and Johnson et al. 2019
Figure 5. Relationship between the modeled and measured kitchen concentrations (PM2.5 top, CO below) during the
sampled cooking events.
The relationships between modeled and measured estimates of kitchen concentrations for PM2.5 and CO are
shown in Figure 5. There are clear positive correlations between the modeled and measured estimates,
following the anticipated trend of lower pollutant concentrations during LPG use and higher concentrations
when using the biomass stoves; however, there is considerable scatter (PM2.5 model RMSE = 767 μg/m3; CO model
RMSE = 30 ppm), with the model explaining 45% of the variability in the measured event concentrations of both
PM2.5 and CO (PM2.5: R2=0.45, p<0.01; CO: R2=0.45, p<0.01). The model also overestimates the measured kitchen
concentrations (~10-fold for PM2.5 and ~6-fold for CO). This bias is similar to what was reported by Piedrahita
et al. 2019 and Johnson et al. 2011, who both found the model to overestimate measured concentrations in
the kitchen. There are several potential reasons for this bias, the most likely being the model
assumption that all emitted pollutants instantaneously and perfectly mix throughout the room. It is likely
that a substantial fraction of emissions escapes through windows, eaves, or other openings before mixing
throughout the room, and that mixing is incomplete, with higher concentrations pooling higher in the room. The
variability in mixing and stratification of pollutants also likely contributes substantially to the amount of scatter
in the plots. It is also evident that modeled PM2.5 emissions are clustered near the y-axis, potentially due to
setting the LPG emissions rate to 1 mg/min (LoD). This potential artifact may be shifting the overall
relationship between the modeled and measured concentrations, resulting in a high intercept.
Our stratified CO samples show that mean concentrations increased sequentially from 14 ppm at 1 meter above the
ground, to 20 ppm at 1.5 meters above the ground (HAP standard protocol height), to 28 ppm at 2 meters.
Stratified samples of CO by Johnson et al. 2011 suggested a similar pattern, and MacCarty et al. 2020 did a
systematic investigation of the model’s performance in a test kitchen, showing that PM2.5 concentrations
increased in an S-shaped curve, pooling at the ceiling. These two studies also found that a height of ~1.5
meters was likely the best proxy height to capture the average room concentration and/or exposure of a
standing adult.
To model 24-hour PM2.5 exposures, we applied simple correction factors (ratios of measured to modeled
means: 0.07, 0.24, and 0.50 for wood, charcoal, and LPG, respectively) to normalize model response to the
measured kitchen concentrations, then applied the measured KEFs (0.69, 0.85, and 0.83 for wood, charcoal, and
LPG, respectively). The model was run through a Monte Carlo simulation (10,000 iterations) for each of these
fuel user groups, with each household assigned to the fuel it used most during the day on which exposure was
measured. All stoves used within the house on the given day were included.
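The exposure simulation described above can be sketched for a single fuel group as follows. This is an assumption-laden illustration, not the Risk Analyzer workflow used in the study: the triangular input distributions, the function name, a single stove with all emissions entering the kitchen (f = 1), and the shortcut of using the closed-form 24-hour mean of the single-zone model (valid when each cooking event decays fully) are all choices made for the example.

```python
import random

def simulate_exposures(q_tri, aer_tri, vol_tri, use_tri,
                       correction, kef, n_iter=10_000, seed=1):
    """Monte Carlo sketch of 24-hour PM2.5 exposure (ug/m3) for one fuel group.

    Each *_tri argument is a (low, mode, high) tuple for a triangular
    distribution: emission rate (mg/min), air exchange rate (1/hr),
    kitchen volume (m3), and daily stove use (min).
    """
    rng = random.Random(seed)
    exposures = []
    for _ in range(n_iter):
        # random.triangular takes (low, high, mode)
        q = rng.triangular(q_tri[0], q_tri[2], q_tri[1])
        alpha = rng.triangular(aer_tri[0], aer_tri[2], aer_tri[1]) / 60.0  # 1/min
        vol = rng.triangular(vol_tri[0], vol_tri[2], vol_tri[1])
        use = rng.triangular(use_tri[0], use_tri[2], use_tri[1])
        # 24-hour mean kitchen concentration: steady state scaled by time in use
        c_k = 1000.0 * q * use / (alpha * vol * 1440.0)
        # Normalize for model bias, then convert kitchen level to exposure
        exposures.append(c_k * correction * kef)
    return exposures

# Wood-stove ranges and means from Table 2; the daily use range is hypothetical
wood = simulate_exposures((69, 159, 343), (7.1, 18.3, 38.7), (11.1, 25.0, 49.6),
                          (60, 180, 360), correction=0.07, kef=0.69)
```

The resulting list can be summarized by its mean, median, and interquartile range for comparison with measured exposures.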
Figure 6. Modeled and measured probability distributions (fitted) of PM2.5 exposures.
The modeled distributions generally compared well with the measured 24-hour PM2.5 exposures. Figure 6
shows the fitted distributions, illustrating the overlap between the modeled and measured estimates (LPG in
blue, charcoal in grey/black, wood in red/pink). The LPG and wood distributions compare most favorably, with
their interquartile ranges overlapping substantially, and mean and medians reasonably close (see Table 3).
The modeled exposures for charcoal were lower than the measured exposures (mean and median
values were ~70% and ~40% of the measured estimates), though this comparison is the most tenuous as only
seven samples were available for analysis.
Table 3. Comparison of modeled and measured 24 hour PM2.5 exposures (μg/m3).
Modeled Measured
LPG Mean 41 43
Median 19 29
25th-75th percentile 10-41 27-46
n 10,000 19
Charcoal Mean 80 115
Median 43 110
25th-75th percentile 21-92 50-121
n 10,000 7
Wood Mean 296 225
Median 207 182
25th-75th percentile 113-386 104-292
n 10,000 21
Overall, this approach shows promise that the model can be applied to estimate distributions of PM2.5
exposures, though care needs to be taken to ensure that inputs for normalizing the model account for bias.
Given the scatter in the relationship of individual estimates, we also do not recommend using the model to
predict exposures for specific households, but rather as a tool for understanding how group-level exposures may be
impacted by changes in stove use, stove performance, environmental conditions or other parameters that
may change over time.
6.2 Regression-based models
A number of model specifications were tested, starting with simple relationships, building up to more complex
models, and finally selecting parsimonious and physically reasonable models with the best fits. As with the
single-zone model, we focused on predicting PM2.5 due to the importance of its association with health
impacts. We first present the summary statistics for the variables included in model selection in Table 4. Note
that these values are slightly different from those from the single-zone modeling exercise, as the data
completeness changed with the exclusion of the direct emissions-related measurements. Summary findings
for the regression models are shown in Table 5. Among the models evaluated by the variable-identifying
algorithm, models with between 6 and 7 predictors out of the >20 evaluated offered an optimal compromise
between RMSE, adjusted R2, and other model selection criteria.
Table 4. Summary statistics for the 24hr datasets included in the regression models (mean, standard deviation, minimum, 25th percentile, median, 75th percentile, maximum, and number of valid samples).
variable mean SD min 25th %-tile median 75th %-tile max n
Cook’s personal PM2.5 exposure (µg/m3) 139 150 13 43 86 156 687 50
Compliance (fraction of 24hr period monitors in motion) 0.42 0.19 0.03 0.28 0.42 0.62 2 50
Kitchen PM2.5 (µg/m3) 492 673 26 49 192 695 3819 50
Secondary Area PM2.5 (µg/m3) 60 117 10 14 22 31 689 47
Ambient PM2.5 (µg/m3) 6 3 3 4 7 9 10 30
Cook’s personal CO exposure (ppm) 4.7 5.7 0.0 1.5 2.7 5.0 32 44
Kitchen CO (ppm) 16.5 22.6 0.0 2.9 8.9 20.3 131 48
Secondary Area CO (ppm) 5.0 7.7 0.0 0.0 1.1 8.7 38 49
Ambient CO (ppm) 1.2 2.9 0.0 0.0 0.0 0.2 10 38
Traditional charcoal stove (minutes) 78 192 0 0 0 0 881 50
3-stone fire (minutes) 84 170 0 0 0 16 570 50
Chepkube (minutes) 51 86 0 0 0 79 410 50
Total cooking time using all stoves (minutes) 214 219 0 15 115 361 881 50
Beacon threshold algorithm PM2.5 indirect exposure estimate (µg/m3) 255 514 7 37 73 401 3483 50
Beacon nearest algorithm CO indirect exposure estimate (ppm) 9.4 12.1 0 2.3 6.1 13.8 76.0 50
Beacon threshold algorithm CO indirect exposure estimate (ppm) 5.1 5.6 0 0.6 2.9 8.0 24.0 50
Number of walls in the kitchen with open eaves 0.22 0.65 0 0 0 0 3 50
Kitchen volume (m3) 22.5 12.8 5.4 13.1 20.5 27.2 52.0 50
Open door area (m2) 2.1 1.3 0 1.7 1.9 2.6 6.0 50
Socioeconomic status index 1.56 2.9 -2.35 -1.05 0.93 4.19 7 50
Air exchange rate (1/hr) 17.4 8.2 0.1 11.4 17.1 20.8 40.0 48
For the overall dataset, survey data alone (consisting of a measure of kitchen volume, the primary stove type,
and a socioeconomic index composed of assets, housing characteristics, and other variables) had an adjusted
R2 of 0.51 and a root mean squared error of 130 µg/m3. The most predictive model (Figure 7) for the overall dataset
included stove type, measures of CO (personal, kitchen, and living room), kitchen volume and ventilation, and
a microenvironmental PM estimate. The adjusted R2 of this model was 0.75 and the RMSE was 85 µg/m3 (RMSE/mean of
measured exposures [nRMSE] = 0.61). Of note, and unsurprisingly, all predictive models poorly predicted
overall dispersion in biomass households, a phenomenon consistent with the literature.
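As a quick check on the normalization reported above, the nRMSE follows from the model RMSE and the mean measured exposure (139 µg/m3, Table 4):

```latex
\mathrm{nRMSE} = \frac{\mathrm{RMSE}}{\bar{E}_{\mathrm{measured}}} = \frac{85}{139} \approx 0.61
```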
Figure 7. The relationship between predicted and measured personal exposure to PM2.5. The dotted line is a 1:1 line; dots represent individual datapoints; the blue line is a linear model including terms for personal, living room, and kitchen CO levels; kitchen volume and ventilation; and an estimate of PM concentrations derived from a microenvironmental location monitoring system.
Although the microenvironmental estimate using the beacon system (model 4) significantly improved model
performance over just kitchen PM2.5 data from model 2, it did not significantly improve the model fit over the
survey data (model 1), likely due to the strong predictive ability of the survey data in this data set. Similarly,
adding kitchen PM2.5 to the survey data (as in model 5) did not significantly improve model fit over the
survey-only model.
Table 5. Model performance and fit statistics for the personal PM2.5 exposure estimation models.
Overall (n = 50)
Model Adj R2 RMSE Mean SD p°
Observed - - 139 151 -
1 Survey-type data* 0.51 130 110 65 0.2
2 Kitchen PM2.5 0.32 206 115 180 0.47
3 Kitchen PM2.5 + CO 0.43 139 117 137 0.45
4 1 + Microenvironmental PM2.5 Estimate 0.53 137 113 88 0.29
5 1 + 2 0.52 131 112 80 0.27
6 1 + 3 0.59 112 118 106 0.42
7 Stove Type + Personal, Kitchen, Living Room CO + Kitchen Volume + Microenvironmental PM Estimate‡ 0.75 85 124 110 0.57
* includes primary stove type, a socioeconomic index, and kitchen volume
° comparing predicted and measured values
‡ identical to the model selected by the variable selection algorithm
Another common method of estimating exposure involves using a global ratio of exposure to kitchen
concentrations. The commonly accepted value of the so-called “Kitchen Exposure Factor” (KEF) is 0.742 (Smith
et al. 2014). For the current set of measurements, KEF was a poor predictor of exposure. The global KEF was
significantly different from the study-specific KEF (Wilcoxon rank sum test, p < 0.001), which exhibited wide
variability within and between primary stove types.
The distribution of KEFs by stove type is shown in Figure 8. The relatively high number of KEFs exceeding 1 in
LPG households indicates that exposure may have been driven by sources outside of the kitchen, consistent
with what one would expect for households that rely on LPG as a cooking fuel.
Figure 8. Distribution of the KEF – the ratio of personal exposure to kitchen concentration – in the current study. A KEF
less than one indicates that personal exposure is less than kitchen concentration, while a KEF greater than one indicates
that personal exposure exceeds kitchen concentration. In the current study, many LPG users had KEFs > 1, indicating their
exposure may be derived outside of the kitchen. “Traditional” stove class includes charcoal and wood stoves.
7 Discussion
7.1 Model application considerations
Collecting personal PM2.5 exposure data is challenging due to cost, technical requirements, and logistical
considerations, making investigation of predictive models an important exercise, especially for HAP given the
health implications. To date, HAP exposure models have not been widely applied and performance has varied.
Several past works have found that survey data has provided statistically significant predictive power for
exposure estimation (Baumgartner et al., 2011; Clark et al., 2010), while quite a few have reported
comparisons of kitchen and personal PM2.5 with varying success (Balakrishnan et al., 2004; Piedrahita et al.,
2017; Liao et al. 2019), and others have looked to personal CO exposure (an easier measurement to conduct)
to estimate personal PM2.5 exposure, but found it to be an unreliable surrogate in many settings (Carter et al.,
2017). Dionisio et al. (2012) assessed personal exposure modeling performance of CO in children using
survey-type data, and though not directly comparable to our work, due to using CO as the dependent variable
rather than PM2.5, they found a model RMSE of 0.86 ppm (nRMSE = 0.74). Dionisio et al. (2008) also assessed
the performance of a model in predicting child PM2.5 exposure (n=31) from personal CO, survey data, and
kitchen PM2.5 concentrations, but did not find strong relationships in any of the model permutations (r<0.01).
Hill et al. (2019) applied regression models and machine learning models to estimate personal PM2.5 exposure
(n=36) in a rural area of Lao PDR, but reported adjusted R2 values below 0.3 and an RMSE of 40.0 µg/m3 (nRMSE = 0.39).
The HAP exposure models presented here showed promise and potential improvements in performance over
similar past works. The better performance may be the result of factors that vary between sites (e.g. higher
exposure contrasts), or possibly improvements in measurement techniques in more recent times. This comparison
with past work also sounds a note of caution: even with carefully conducted studies, good model performance is
not assured. Still, the simple survey-based model, which predicted PM2.5 exposures with an adjusted R2 of 0.51,
represents a potentially important advancement, as surveys are the simplest data collection tool available. The
best performing model for the entire dataset used a combination of survey-based data and measurements, resulting
in an adjusted R2 of 0.75 and a root mean squared error of 85 µg/m3. Should those predictive capacities prove
robust, these modeling approaches would provide substantial value in reducing the need for, or extent of, costly
and complicated exposure studies.
While not a focus of this study, the large exposure contrasts which aided this modeling effort also suggest that
LPG use was having an important impact on exposure to HAP. Median exposure for the LPG users in our
subsample was 29 µg/m3, below the WHO annual interim target 1 of 35 µg/m3, and median longer-term
kitchen concentrations were 58 µg/m3 (see Table S4). A full analysis and write-up on the implications of LPG
on exposure will be conducted and presented by the CLEAN-Air(Africa) Global Health Research Group.
Finally, we found that the single-zone model used by WHO and ISO for setting emission targets did reasonably
well predicting kitchen concentrations and exposures, once its systematic overestimation was adjusted for.
The combination of the single-zone model’s correlation with measured kitchen concentrations and systematic
overestimation suggests it is a reasonable tool for setting performance targets as it provides a conservative
approach for linking emission rates with indoor air quality. Given that the model overestimates kitchen
concentrations, by potentially up to an order of magnitude, there may be room to adjust the modeling
approach such that it provides more reasonable estimates of kitchen concentrations while still erring on the
side of conservativeness.
7.2 Limitations
There is potential to reduce the burden of data collection on participants for large-scale projects, as the
performance of the exposure estimation regression models that used survey data or less intrusive
concentration measurements moderately explained exposure variability. With the ability to collect survey,
stove usage and household air pollution data over multiple days, there are scenarios where this approach
should be considered, including instances where long-term trends are of interest. While the models from this
work generally performed better than past efforts, it is not clear that this performance is repeatable in different contexts.
We opted to use our resources for conducting a relatively comprehensive set of measurements on potential
predictors, which resulted in somewhat small sample sizes. While this tradeoff seemed appropriate given our
goal of exploring new modeling approaches for their potential utility, the smaller sample sizes limited our
power to evaluate with more certainty which predictors and approaches were strongest.
Although the sample sizes were relatively small, data management for this study still presented a substantial
challenge given the number and types of measurements involved and could likewise be problematic if several
data streams are required for various model inputs. Finding the right balance in terms of expected predictive
ability and data collection cost and analysis complexity is difficult, and likely will differ with the technical
capacity of the group conducting the work. Of course, as improved methods and equipment continue to reach
a wider audience, future analyses can become more streamlined.
It should be noted that any use of these types of models must be considered exploratory unless model
validation is performed. There are many idiosyncrasies related to specific contexts that can affect predictive
models, such as the intensity of neighbors cooking with dirty fuels, community-level use of polluting fuels,
ambient air pollution, temperature, and housing characteristics. In this study, the low housing density and low
ambient air pollution levels (Table S3) provided relatively low variability in environmental conditions, allowing
the majority of air pollution exposure to be assumed as a result of cooking. Thus, fuel type indicators may not
perform as strongly in other contexts.
7.3 Recommendations
We recommend testing these models in different geographies or fuel use scenarios as they were developed
from a single study community. Continuing to build and test HAP exposure models in different contexts
(cooking fuels, geographies, stove use patterns, housing characteristics) would enable more robust evaluation
of how they can be extended to other contexts. For clean cooking standards, additional characterization of the
single-zone model’s bias (more regions, fuel types, housing types, etc.) would help support potential
modifications to the model’s application for deriving performance targets. Future modeling efforts would also
benefit from machine-learning approaches, including both supervised and unsupervised methods. These
approaches have shown some promise in generating reasonable predictive power, especially when combined
with traditional statistical modeling approaches. De-aggregated real-time data may enhance machine-learning
predictive power, by identifying data features that may be predictive of mean PM2.5 exposures. We also
recommend exploration of additional predictors that may be relevant for exposure prediction and considering
inclusion of the single-zone model based exposure estimates in machine learning models as predictors.
Conducting this work in Kenya was especially timely, as the country has a relatively strong market for
improved biomass cookstoves and clean cooking energy sources, as well as innovative consumer finance programs
such as pay-as-you-go systems and microfinance. Kenya’s transition towards cleaner stoves and fuels is being
aided by active involvement in the development and adoption of ISO performance standards, which is part of
the country’s work with the World Health Organization to expedite the energy transition. The models
developed here are clearly most applicable to the Kenyan context, and ideally can be used to help characterize
the HAP exposure implications of energy transitions, as well as strengthen the implementation of Kenya’s
cookstove standards framework.
References
Balakrishnan K, Mehta S, Ghosh S, Johnson MA, Brauer M, Naeher L, et al. 2014. WHO Guidelines for
Indoor Air Quality: Household Fuel Combustion - Population levels of household air pollution and
exposures.
Balakrishnan K, Ghosh S, Ganguli B, Sambandam S, Bruce N, Barnes DF, et al. 2013. Modeling national
average household concentrations of PM2.5 from solid cookfuel use for the global burden of
disease -2010 assessment: results from cross-sectional assessments in India. Environmental Health
12:77; doi:10.1186/1476-069X-12-77.
Baumgartner J, Schauer JJ, Ezzati M, Lu L, Cheng C, Patz J, et al. 2011. Patterns and predictors of personal
exposure to indoor air pollution from biomass combustion among women and children in rural
China. Indoor Air; doi:10.1111/j.1600-0668.2011.00730.x.
Carter E, Archer-Nicholls S, Ni K, Lai AM, Niu H, Secrest MH, et al. 2016. Seasonal and Diurnal Air Pollution
from Residential Cooking and Space Heating in the Eastern Tibetan Plateau. Environ Sci Technol
50:8353–8361; doi:10.1021/acs.est.6b00082.
Carter, E., Norris, C., Dionisio, K. L., Balakrishnan, K., Checkley, W., Clark, M. L., Ghosh, S., Jack, D. W.,
Kinney, P. L., Marshall, J. D., Naeher, L. P., Peel, J. L., Sambandam, S., Schauer, J. J., Smith, K. R.,
Wylie, B. J., & Baumgartner, J. (2017). Assessing Exposure to Household Air Pollution: A Systematic
Review and Pooled Analysis of Carbon Monoxide as a Surrogate Measure of Particulate Matter.
Environmental Health Perspectives, 125(7); doi:10.1289/EHP767
Clark ML, Peel JL, Balakrishnan K, Breysse PN, Chillrud SN, Naeher LP, et al. 2013. Health and Household Air
Pollution from Solid Fuel Use: The Need for Improved Exposure Assessment. Environmental Health
Perspectives; doi:10.1289/ehp.1206429.
Clark ML, Reynolds SJ, Burch JB, Conway S, Bachand AM, Peel JL. 2010. Indoor air pollution, cookstove
quality, and housing characteristics in two Honduran communities. Environ Res 110:12–18;
doi:10.1016/j.envres.2009.10.008.
Cowlin SC. 2005. Tracer Decay for Determining Kitchen Ventilation Rates in San Lorenzo, Guatemala.
Maxwell Student Projects, Max-04-4, EHS, School of Public Health, University of California,
Berkeley 1: 2.
Dionisio, K. L., Howie, S., Fornace, K. M., Chimah, O., Adegbola, R. A., & Ezzati, M. (2008). Measuring the
exposure of infants and children to indoor air pollution from biomass fuels in The Gambia. Indoor
Air, 18(4), 317–327. https://doi.org/10.1111/j.1600-0668.2008.00533.x
Dionisio, K. L., Howie, S. R. C., Dominici, F., Fornace, K. M., Spengler, J. D., Donkor, S., Chimah, O.,
Oluwalana, C., Ideh, R. C., Ebruke, B., Adegbola, R. A., & Ezzati, M. (2012). The exposure of infants
and children to carbon monoxide from biomass fuels in The Gambia: A measurement and
modeling study. Journal of Exposure Science & Environmental Epidemiology, 22(2), 173–181.
doi:10.1038/jes.2011.47
Garland C, Delapena S, Prasad R, L’Orange C, Alexander D, Johnson M. 2017. Black carbon cookstove
emissions: A field assessment of 19 stove/fuel combinations. Atmospheric Environment
169:140–149; doi:10.1016/j.atmosenv.2017.08.040.
Hill LD, Pillarisetti A, Delapena S, Garland C, Pennise D, Pelletreau A, et al. 2019. Machine-learned
modeling of PM2.5 exposures in rural Lao PDR. Science of The Total Environment 676:811–822;
doi:10.1016/j.scitotenv.2019.04.258.
IHME. 2018. GBD Compare | IHME Viz Hub. Available: http://vizhub.healthdata.org/gbd-compare
[accessed 23 May 2018].
ISO. 2018. Technical Report 19867-3: Clean cookstoves and clean cooking solutions — Harmonized
laboratory test protocols — Part 3: Voluntary performance targets for cookstoves based on
laboratory testing.
Johnson M, Piedrahita R, Garland C, Pillarisetti A, Sambandam S, Gurusamy T, et al. 2018. Exposures to
PM2.5 associated with LPG stove and fuel interventions: Pilot results from the HAPIN Trial.
Johnson, M., Lam, N., Brant, S., Gray, C., & Pennise, D. (2011). Modeling indoor air pollution from
cookstove emissions in developing countries using a Monte Carlo single-box model. Atmospheric
Environment, 45(19), 3237–3243; doi:10.1016/j.atmosenv.2011.03.044
Johnson M, Edwards R, Alatorre Frenk C, Masera O. 2008. In-field greenhouse gas emissions from
cookstoves in rural Mexican households. Atmospheric Environment 42:1206–1222;
doi:10.1016/j.atmosenv.2007.10.034.
Johnson M, Lam N, Wofchuck T, Edwards R, Pennise D. 2009. In-field charcoal stove emission factors and
indoor air pollution in Nairobi, Kenya.
Johnson M, Smith K, Edwards R, Morawska L, Nicas M. 2014. WHO Guidelines for Indoor Air Quality:
Household Fuel Combustion - Model for linking household energy use with indoor air quality.
Johnson MA, Garland CR, Jagoe K, Edwards R, Ndemere J, Weyant C, et al. 2019. In-Home Emissions
Performance of Cookstoves in Asia and Africa. Atmosphere 10; doi:10.3390/atmos10050290.
Liao J, McCracken JP, Piedrahita R, Thompson L, Mollinedo E, Canuz E, et al. 2019. The use of bluetooth
low energy Beacon systems to estimate indirect personal exposure to household air pollution.
Journal of Exposure Science & Environmental Epidemiology 1–11; doi:10.1038/s41370-019-0172-z.
MacCarty, N., Bentson, S., Cushman, K., Au, J., Li, C., Murugan, G., & Still, D. (2020). Stratification of
particulate matter in a kitchen: A comparison of empirical to predicted concentrations and
implications for cookstove emissions targets. Energy for Sustainable Development, 54, 14–24.
https://doi.org/10.1016/j.esd.2019.09.006
Ochieng CA, Vardoulakis S, Tonne C. 2013. Are rocket mud stoves associated with lower indoor carbon
monoxide and personal exposure in rural Kenya? Indoor Air 23:14–24;
doi:10.1111/j.1600-0668.2012.00786.x.
Piedrahita, R., Kanyomse, E., Coffey, E., Xie, M., Hagar, Y., Alirigia, R., Agyei, F., Wiedinmyer, C., Dickinson,
K. L., Oduro, A., & Hannigan, M. (2017). Exposures to and origins of carbonaceous PM2.5 in a
cookstove intervention in Northern Ghana. Science of The Total Environment, 576, 178–192.
https://doi.org/10.1016/j.scitotenv.2016.10.069
Piedrahita R, Coffey ER, Hagar Y, Kanyomse E, Verploeg K, Wiedinmyer C, et al. Attributing Air Pollutant
Exposure to Emission Sources with Proximity Sensing. Atmosphere. 2019 Jul 13;10(7):395.
Piedrahita, R., Johnson, M., Bilsback, K. R., L’Orange, C., Kodros, J. K., Eilenberg, S. R., Naluwagga, A., Shan,
M., Sambandam, S., Clark, M., Pierce, J. R., Balakrishnan, K., Robinson, A. L., & Volckens, J. (2020).
Comparing regional stove-usage patterns and using those patterns to model indoor air quality
impacts. Indoor Air, n/a(n/a). https://doi.org/10.1111/ina.12645
Pillarisetti, A., Carter, E., Rajkumar, S., Young, B. N., Benka-Coker, M. L., Peel, J. L., Johnson, M., & Clark, M. L. (2019). Measuring personal exposure to fine particulate matter (PM2.5) among rural Honduran women: A field evaluation of the Ultrasonic Personal Aerosol Sampler (UPAS). Environment International, 123, 50–53. https://doi.org/10.1016/j.envint.2018.11.014
Pillarisetti, A., Ghorpade, M., Madhav, S., Dhongade, A., Roy, S., Balakrishnan, K., Sankar, S., Patil, R., Levine, D. I., Juvekar, S., & Smith, K. R. (2019). Promoting LPG usage during pregnancy: A pilot study in rural Maharashtra, India. Environment International, 127, 540–549. https://doi.org/10.1016/j.envint.2019.04.017
Pope, D., Bruce, N., Dherani, M., Jagoe, K., & Rehfuess, E. (2017). Real-life effectiveness of ‘improved’
stoves and clean fuels in reducing PM2.5 and CO: Systematic review and meta-analysis.
Environment International, 101, 7–18; doi:10.1016/j.envint.2017.01.012.
Roden CA, Bond TC, Conway S, Pinel ABO. 2006. Emission factors and real-time optical properties of
particles emitted from traditional wood burning cookstoves. Environ Sci Technol 40: 6750–6757.
Senelwa K. 2016. Kenya to subsidise cost of gas cylinders. The East African.
Shrivastava, Alankar. (2011). Methods for the determination of limit of detection and limit of quantitation
of the analytical methods. Chronicles of Young Scientists. 2. 21-25. 10.4103/2229-5186.79345.
Smith KR, Bruce N, Balakrishnan K, Adair-Rohani H, Balmes J, Chafe Z, et al. 2014. Millions Dead: How Do
We Know and What Does It Mean? Methods Used in the Comparative Risk Assessment of
Household Air Pollution. Annual Review of Public Health 35:185–206;
doi:10.1146/annurev-publhealth-032013-182356.
United Nations Statistics Division. 2018. UNdata | record view | Population using solid fuels, percentage.
Available: http://data.un.org/Data.aspx?d=MDG&f=seriesRowID%3A712 [accessed 23 May 2018].
Vyas, S., & Kumaranayake, L. (2006). Constructing socio-economic status indices: How to use principal
components analysis. Oxford University Press; doi:10.1093/heapol/czl029
WBT Technical Committee. 2014. Water Boiling Test Protocol: Version 4.2.3.
Weyant, C. L., Thompson, R., Lam, N. L., Upadhyay, B., Shrestha, P., Maharjan, S., Rai, K., Adhikari, C., Fox,
M. C., & Pokhrel, A. (2019). In-Field Emission Measurements from Biogas and Liquified Petroleum
Gas (LPG) Stoves. Atmosphere, 10(12), 729; doi:10.3390/atmos10120729
World Health Organization (WHO). Disease Burden and Mortality Estimates. 2000-2015. World Health
Organization, Health Statistics and Mortality Estimates. Retrieved from:
http://www.who.int/healthinfo/globalburdendisease/estimates/en/index1.html. (2015).
Wilson DL, Williams KN, Pillarisetti A. 2020. An Integrated Sensor Data Logging, Survey, and Analytics
Platform for Field Research and Its Application in HAPIN, a Multi-Center Household Energy
Intervention Trial. Sustainability 12:1805; doi:10.3390/su12051805.
Yip F, Christensen B, Sircar K, Naeher L, Bruce N, Pennise D, et al. 2017. Assessment of traditional and
improved stove use on household air pollution and personal exposures in rural western Kenya.
Environment International 99:185–191; doi:10.1016/j.envint.2016.11.015.
Yuchi, W., Gombojav, E., Boldbaatar, B., Galsuren, J., Enkhmaa, S., Beejin, B., Naidan, G., Ochir, C., Legtseg, B., Byambaa, T., Barn, P., Henderson, S. B., Janes, C. R., Lanphear, B. P., McCandless, L. C., Takaro, T. K., Venners, S. A., Webster, G. M., & Allen, R. W. (2019). Evaluation of random forest regression and multiple linear regression for predicting indoor fine particulate matter concentrations in a highly polluted city. Environmental Pollution, 245, 746–753. https://doi.org/10.1016/j.envpol.2018.11.034
Supplementary Information
S1 Emissions performance
While the focus of this work was on developing and evaluating models to predict exposure to household air
pollution, stove performance metrics were also calculated and are presented below in Table S1. LPG had very
high modified combustion efficiency (CO2/[CO2+CO], molar basis), as expected, indicating that almost all fuel
carbon was being converted into CO2. Charcoal stoves had the highest CO emissions, as is common given the
surface oxidation combustion process for the fuel. Wood stoves had the highest PM2.5 and black carbon
emission factors. Wood also had a higher BC/PM2.5 ratio, suggesting its aerosol emissions were potentially
more warming, but the climate impacts are difficult to characterize based on the limited set of point source
emissions, especially as the majority of emissions for charcoal are generated during its production.
Table S1. Stove/fuel performance from measurements during cooking events.
| Metric | LPG | Charcoal | Wood |
|---|---|---|---|
| Modified combustion efficiency (%) | 99.1±0.8 (30) | 80.8±0.1 (7) | 94.0±2.4 (29) |
| Firepower (kW) | 1.60±0.64 (32) | 2.53±0.58 (7) | 7.15±1.77 (32) |
| PM2.5 emission factor (g/kg) | BDL | 3.17±2.18 (7) | 6.70±2.96 (29) |
| BC emission factor (g/kg) | BDL | 0.26±0.23 (7) | 0.87±0.51 (29) |
| CO emission factor (g/kg) | 17.7±15.8 (30) | 373.2±110.0 (7) | 67.9±27.7 (29) |
| BC/PM2.5 | BDL | 0.11±0.11 (7) | 0.15±0.13 (29) |

BDL = below detection limit
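The modified combustion efficiency in Table S1 is the molar ratio CO2/(CO2+CO). As a minimal sketch of that calculation, assuming background-subtracted mixing ratios in ppm (the function name and the example values are illustrative and are not taken from the study data):

```python
def modified_combustion_efficiency(co2_ppm, co_ppm):
    """Return MCE (%) = CO2 / (CO2 + CO) on a molar basis.

    Inputs are assumed to be background-subtracted mixing ratios in ppm.
    """
    total = co2_ppm + co_ppm
    if total <= 0:
        raise ValueError("CO2 + CO must be positive")
    return 100.0 * co2_ppm / total


# Hypothetical plume values: 1500 ppm CO2 and 100 ppm CO above background.
mce = modified_combustion_efficiency(1500.0, 100.0)
print(f"MCE = {mce:.2f}%")
```

Because the ratio is dimensionless, any consistent molar unit can be used for the two inputs.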
S2 Housing characteristics and socioeconomic status
Figure S1 shows the distributions of the air exchange rates, room volumes, and cooking event
durations for monitored cooking events throughout the sample. These characteristics are key inputs
for the WHO and ISO physical models. The mean, standard deviation, and sample size are noted.
Figure S1. Distributions of kitchen volumes, air exchange rates, and cooking times (key WHO/ISO physical model
inputs)
Table S2 shows the socioeconomic index results for the full sample. For each index category, the table gives
the average fraction of homes possessing a given characteristic that contributes to the index, along with the
standard deviation of home responses, which shows the distribution of that characteristic within the category.
Table S2. Socioeconomic index results. The average fraction of homes columns show the fraction of homes owning an asset or possessing some characteristic, grouped by the index categorization. The standard deviation columns show the distribution of the occurrences of those assets for the given category.
Mean = average fraction of homes; SD = standard deviation of home responses. Categories run from 1 (poorest quintile) to 5 (wealthiest quintile).

| Ownership or possession | Mean 1 (poorest) | Mean 2 | Mean 3 | Mean 4 | Mean 5 (wealthiest) | SD 1 | SD 2 | SD 3 | SD 4 | SD 5 |
|---|---|---|---|---|---|---|---|---|---|---|
| Own the land/home they live in | 0.97 | 0.82 | 0.79 | 0.77 | 0.67 | 0.16 | 0.38 | 0.41 | 0.42 | 0.47 |
| Animal(s) (cows, sheep, etc.) | 0.87 | 0.78 | 0.75 | 0.73 | 0.60 | 0.34 | 0.42 | 0.43 | 0.44 | 0.49 |
| Cellphone | 0.94 | 0.88 | 0.78 | 0.72 | 0.59 | 0.24 | 0.32 | 0.42 | 0.45 | 0.49 |
| Smartphone | 0.04 | 0.18 | 0.46 | 0.77 | 0.87 | 0.19 | 0.38 | 0.50 | 0.42 | 0.33 |
| Radio | 0.60 | 0.69 | 0.74 | 0.84 | 0.84 | 0.49 | 0.46 | 0.44 | 0.37 | 0.36 |
| Hi-Fi/CD-player | 0.00 | 0.01 | 0.04 | 0.26 | 0.58 | 0.05 | 0.10 | 0.19 | 0.44 | 0.49 |
| Solar connection | 0.42 | 0.30 | 0.28 | 0.15 | 0.15 | 0.49 | 0.46 | 0.45 | 0.36 | 0.35 |
| Electricity connection | 0.00 | 0.24 | 0.57 | 0.81 | 0.90 | 0.05 | 0.43 | 0.50 | 0.39 | 0.30 |
| TV | 0.05 | 0.22 | 0.54 | 0.84 | 0.85 | 0.22 | 0.42 | 0.50 | 0.37 | 0.36 |
| Satellite TV | 0.01 | 0.19 | 0.29 | 0.47 | 0.65 | 0.09 | 0.39 | 0.46 | 0.50 | 0.48 |
| Refrigerator/fridge/freezer | 0.00 | 0.00 | 0.00 | 0.02 | 0.41 | 0.00 | 0.00 | 0.05 | 0.14 | 0.49 |
| Shower/bath within house | 0.00 | 0.00 | 0.02 | 0.05 | 0.54 | 0.00 | 0.07 | 0.12 | 0.23 | 0.50 |
| Land | 0.93 | 0.79 | 0.77 | 0.78 | 0.72 | 0.26 | 0.41 | 0.42 | 0.41 | 0.45 |
| Bicycle | 0.08 | 0.11 | 0.19 | 0.23 | 0.33 | 0.26 | 0.31 | 0.39 | 0.42 | 0.47 |
| Moped/Motorcycle | 0.06 | 0.11 | 0.16 | 0.17 | 0.11 | 0.24 | 0.31 | 0.36 | 0.38 | 0.32 |
| Pick-up truck | 0.00 | 0.01 | 0.02 | 0.04 | 0.07 | 0.00 | 0.09 | 0.15 | 0.19 | 0.25 |
| Car | 0.00 | 0.01 | 0.03 | 0.10 | 0.43 | 0.00 | 0.11 | 0.17 | 0.30 | 0.50 |
| Computer | 0.00 | 0.00 | 0.01 | 0.03 | 0.21 | 0.00 | 0.07 | 0.08 | 0.17 | 0.41 |
| Washing machine | 0.00 | 0.00 | 0.01 | 0.01 | 0.03 | 0.00 | 0.00 | 0.08 | 0.09 | 0.17 |
| Tractor | 0.01 | 0.01 | 0.03 | 0.04 | 0.08 | 0.08 | 0.09 | 0.17 | 0.19 | 0.28 |
| Septic or flushing toilet inside | 0.00 | 0.00 | 0.02 | 0.09 | 0.63 | 0.00 | 0.00 | 0.12 | 0.29 | 0.48 |
| Latrine in compound | 1.00 | 0.97 | 0.99 | 0.99 | 0.86 | 0.00 | 0.17 | 0.10 | 0.11 | 0.35 |
| Use LPG | 0.01 | 0.28 | 0.53 | 0.80 | 0.95 | 0.11 | 0.45 | 0.50 | 0.40 | 0.22 |
| PCA score | -2.17 | -1.40 | -0.51 | 0.70 | 3.38 | 0.21 | 0.25 | 0.31 | 0.38 | 1.51 |
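An asset-based index of this kind is commonly built by taking the first principal component of standardized ownership indicators and splitting the scores into quintiles. The sketch below illustrates that general approach on synthetic data. It is not the study's code: the function name, the sign convention, and the simulated households are all assumptions made for illustration.

```python
import numpy as np


def wealth_index(assets):
    """First-principal-component score and quintile (1-5) per household.

    assets: 2-D array, rows = households, columns = 0/1 ownership indicators.
    """
    x = np.asarray(assets, dtype=float)
    sd = x.std(axis=0)
    keep = sd > 0                          # drop indicators with no variation
    z = (x[:, keep] - x[:, keep].mean(axis=0)) / sd[keep]
    # Rows of vt are the principal axes of the standardized data.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    score = z @ vt[0]
    # The sign of a principal component is arbitrary; orient it (assumption)
    # so that households with more assets receive higher scores.
    if np.corrcoef(score, x.sum(axis=1))[0, 1] < 0:
        score = -score
    edges = np.quantile(score, [0.2, 0.4, 0.6, 0.8])
    quintile = np.searchsorted(edges, score, side="right") + 1
    return score, quintile


# Synthetic households: 12 hypothetical assets driven by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
noise = rng.normal(size=(200, 12))
demo = (latent[:, None] + noise > 0).astype(int)
score, quintile = wealth_index(demo)
print(np.bincount(quintile)[1:])           # households per quintile
```

Because binary indicators produce tied scores, the quintiles are not guaranteed to contain exactly equal numbers of households.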
S3 PM2.5 and CO household air pollution (HAP) and personal exposure concentrations
for the study population
Figure S2 shows a typical 24-hr monitoring period time series, with all plots showing by-minute data. The top
frame shows the PM2.5 time series for the cook’s personal exposure (red) and kitchen concentrations (teal)
from the MicroPEM devices; the second frame shows the PM2.5 concentration data from the PATS+ devices
placed in the kitchen (directly adjacent to the kitchen MicroPEM for inter-comparability and redundancy); the
third frame shows the indirect exposure estimates using the three different Beacon localization methods and
the associated concentrations from the PATS+ monitors in the given rooms; the fourth frame shows the
localization assignment using the three different localization approaches explained previously (color indicates
room assignment); the fifth frame indicates stove usage (teal signifies periods of cooking with the LPG
stove, and red signifies periods of not cooking).
Figure S2. A typical 24-hr monitoring period time series for a single household, with all plots showing by-minute data for all instruments used.
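The indirect exposure estimates in the third frame pair each minute's room assignment with the concentration measured in that room. A minimal sketch of that pairing is given below. The function name, room labels, and values are hypothetical, and skipping minutes with no location or concentration is an assumption rather than the study's stated rule.

```python
def indirect_exposure(locations, room_conc):
    """Mean of room concentrations along a per-minute location series.

    locations: room label per minute (None where localization is missing).
    room_conc: dict mapping room label -> list of per-minute concentrations.
    Minutes without a location or a concentration are skipped (assumption).
    """
    matched = []
    for minute, room in enumerate(locations):
        series = room_conc.get(room)
        if series is None or minute >= len(series) or series[minute] is None:
            continue
        matched.append(series[minute])
    if not matched:
        raise ValueError("no minutes with both location and concentration")
    return sum(matched) / len(matched)


# Hypothetical 6-minute record (µg/m3); room names are illustrative.
locations = ["kitchen", "kitchen", "living", "living", None, "kitchen"]
room_conc = {
    "kitchen": [300, 320, 310, 200, 150, 100],
    "living": [40, 45, 50, 60, 55, 50],
}
print(indirect_exposure(locations, room_conc))  # 166.0
```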
S4 Characterization of long-term stove use patterns
Stove usage data was collected at 91 households, for durations ranging from 48 hr to 6 months. Results are
presented by study group (biomass as the primary fuel, charcoal as primary, and LPG as primary). Below, an
example time series is presented for a home, showing the diurnal temperature trends typical in SUMs
measurements and the peaks produced by cooking events. This time series shows one week of stove usage
for a home primarily using charcoal. The household appears to use both stoves on the days that it cooks, and
no cooking is observed on November 3-5.
Figure S3. A typical temperature trace for a single home, showing the diurnal temperature trends typical in SUMs measurements, and the peaks produced by cooking events on two different stoves.
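This section does not describe how cooking events were identified from the temperature traces. As an illustration only, the sketch below flags an event wherever the temperature stays at or above a fixed threshold for a minimum duration. The threshold, the minimum duration, and the function name are all assumptions; a real analysis would also need to account for the diurnal baseline.

```python
def detect_cooking_events(temps_c, threshold_c=60.0, min_minutes=5):
    """Return (start, end) minute indices of runs at or above the threshold.

    end is exclusive. Runs shorter than min_minutes are discarded.
    The threshold and minimum duration are illustrative assumptions.
    """
    events, start = [], None
    for i, t in enumerate(temps_c):
        if t >= threshold_c and start is None:
            start = i
        elif t < threshold_c and start is not None:
            if i - start >= min_minutes:
                events.append((start, i))
            start = None
    # Close an event that is still running at the end of the trace.
    if start is not None and len(temps_c) - start >= min_minutes:
        events.append((start, len(temps_c)))
    return events


# Synthetic by-minute trace (deg C): two real events and one 3-minute spike.
trace = [25] * 10 + [80] * 20 + [30] * 15 + [90] * 3 + [26] * 10 + [75] * 12
print(detect_cooking_events(trace))  # [(10, 30), (58, 70)]
```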
S4.1 Average events per day by group
In households that reported biomass as their primary fuel (n=34), SUMs analysis showed that there were an
average of 2.4 Chepkube stove use events per day (n=19), 1.9 charcoal stove use events per day (n=12), 1.5
LPG stove use events per day (n=2), and 1 stove use event per day on the three-stone fire in the sample (n=1).
In households that reported charcoal as their primary fuel (n=4), we found an average of 1.6 charcoal stove
use events per day (n=3), 1.6 Chepkube stove use events per day (n=1), 1.5 three-stone fire stove use events
per day (n=1), and no kerosene stove use events in the single kerosene stove included in the sample (n=1). In
households that reported LPG as their primary fuel (n=53), we found an average of 2.8 LPG stove use events
per day (n=51), 2.9 Chepkube stove use events per day (n=2), and 1.2 charcoal stove use events per day
(n=15).
Figure S4. Average stove use events per day, divided by stove group. The labelled central point shows the mean, the line within the box plot shows the median, and the upper and lower bounds of the box show the upper and lower quartiles. The n’s are shown below each box for each stove type within the primary group.
S4.2 Average minutes per day by group
In households that reported biomass as their primary fuel (n=34), SUMs analysis showed there was an average
of 208.8 minutes of cooking on a Chepkube stove per day (n=19), an average of 210 minutes of cooking on a
charcoal stove per day (n=12), an average of 32.4 minutes of cooking on an LPG stove per day (n=2), and an
average of 200 minutes of cooking per day on the single three-stone fire included in the group (n=1). In
households that reported charcoal as their primary fuel (n=4), we found an average of 355 and 333.3 minutes
of cooking each on the single three-stone fire and Chepkube stoves included in the sample respectively, and a
negligible amount of cooking on the single kerosene stove included in the sample. In households that reported
LPG as their primary fuel (n=53), we found an average of 92.4 minutes of cooking on an LPG stove per day
(n=52), 161.4 minutes of cooking on a Chepkube stove per day (n=2), and 77.5 minutes of cooking on a
charcoal stove per day (n=15).
Figure S5. Average minutes of use per day for each stove, divided by stove group. The labelled central point
shows the mean, the line within the box plot shows the median, and the upper and lower bounds of the box show the
upper and lower quartiles. The n’s are shown below each box for each stove type within the primary group.
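The per-stove averages in S4.1 and S4.2 can be derived from a list of detected events and the number of days monitored. A minimal sketch, assuming each event is stored as a (start, end) pair of timestamps (the function name and the example events are hypothetical):

```python
from datetime import datetime


def daily_use_summary(events, n_days):
    """Return (events per day, minutes per day) for one stove.

    events: list of (start, end) datetime pairs for detected cooking events.
    n_days: number of days the stove was monitored.
    """
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    total_minutes = sum(
        (end - start).total_seconds() / 60 for start, end in events
    )
    return len(events) / n_days, total_minutes / n_days


# Hypothetical events over two monitored days.
events = [
    (datetime(2020, 1, 1, 7, 0), datetime(2020, 1, 1, 7, 45)),
    (datetime(2020, 1, 1, 18, 0), datetime(2020, 1, 1, 19, 30)),
    (datetime(2020, 1, 2, 12, 0), datetime(2020, 1, 2, 12, 45)),
]
print(daily_use_summary(events, n_days=2))  # (1.5, 90.0)
```

Dividing by days monitored rather than days with cooking means that days without any use pull both averages down.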
S4.3 Temporal trends
Temporal stove usage trends were assessed by comparing distributions of the total daily cooking time (the
sum of usage from all stoves in a given household) by day of week. No clear patterns emerged from this,
indicating that the usage did not vary by day of week, as it does in some regions.
Figure S6. Distribution of minutes that indicated use on each stove throughout each day of the week, divided by stove group. A lack of a clear pattern within the days of the week indicated that there were no variations in usage based on day of the week.
The usage patterns by time of day showed clear increases in usage at typical breakfast, lunch, and dinner
periods. The area under each curve is normalized so that the figure does not demonstrate use quantitatively,
but rather shows when use occurs on each stove type on average over the course of 24 hours. Charcoal and
kerosene patterns had relatively low sample sizes so the patterns are expected to be less representative of the
regional patterns.
Figure S7. Usage patterns on each stove type within primary stove groupings over the course of 24 hours.
S4.4 Usage fraction by group
In households that reported biomass as their primary fuel (n=34), SUMs analysis showed 56% of all stove
usage that occurred in this group took place on a Chepkube stove, 35% of all stove usage took place on a
charcoal stove, 6% took place on an LPG stove, and 3% took place on a three-stone fire. In households that
reported charcoal as their primary fuel (n=4), 53% of all stove use took place on a charcoal stove, 25% of all
stove use took place on a three-stone fire, and 22% took place on a Chepkube stove. In households that
reported LPG as their primary fuel (n=53), 84% of all stove usage took place on an LPG stove, 14% took place
on a charcoal stove, and 2% took place on a Chepkube stove.
Figure S8. Usage fractions on each stove type within the primary stove groupings. Total stove use for each primary group is normalized to 1.0 and the stove use fractions are divided amongst stove types.
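The normalization described in the caption amounts to dividing each stove's usage by the group total. A minimal sketch with hypothetical minute totals (not study data):

```python
def usage_fractions(minutes_by_stove):
    """Normalize total minutes of use per stove so the fractions sum to 1.0."""
    total = sum(minutes_by_stove.values())
    if total <= 0:
        raise ValueError("total stove use must be positive")
    return {stove: minutes / total for stove, minutes in minutes_by_stove.items()}


# Hypothetical total minutes of use within one primary-fuel group.
demo_minutes = {"LPG": 600.0, "charcoal": 300.0, "Chepkube": 100.0}
print(usage_fractions(demo_minutes))
```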
S4.5 Stove stacking
Stove stacking behavior is shown in the plot below, which shows the percentage of days on which each
individual stove and each pair of stoves was used in a household. Though some groupings are sparse due to
low coverage, basic trends can be observed to understand household energy use over the long term. For
example, in the group that primarily uses LPG for cooking, LPG stoves were used on average on 73% of days,
and on 38% of days both the LPG and charcoal stoves were used. This is a relatively high rate of combined
use, which may point to either personal user preferences or financial decision making.
Figure S9. Stove stacking by percent-days used by each stove type within the primary groupings. The labelled central
point shows the mean, the line within the box plot shows the median, and the upper and lower bounds of the box show
the upper and lower quartiles. The n’s are shown below each box for each stove type or combination of stove types
within the primary group.
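The percent-days metric can be computed for a single household by counting the days on which each stove, and each pair of stoves, shows at least one use event. A minimal sketch, assuming a per-day record of which stoves were used (the function name and the example week are hypothetical):

```python
from itertools import combinations


def percent_days_used(use_by_day, n_days):
    """Percent of monitored days each stove and each stove pair was used.

    use_by_day: dict mapping a day label -> set of stoves used that day.
    Keys of the result are 1-tuples (single stove) or sorted 2-tuples (pair).
    """
    counts = {}
    for stoves in use_by_day.values():
        for stove in stoves:
            counts[(stove,)] = counts.get((stove,), 0) + 1
        for pair in combinations(sorted(stoves), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return {key: 100.0 * n / n_days for key, n in counts.items()}


# Hypothetical five-day record for one household.
week = {
    "Mon": {"LPG"},
    "Tue": {"LPG", "charcoal"},
    "Wed": set(),
    "Thu": {"charcoal"},
    "Fri": {"LPG", "charcoal"},
}
result = percent_days_used(week, n_days=5)
print(result[("LPG",)], result[("LPG", "charcoal")])  # 60.0 40.0
```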
S4.6 SUMs placement
The figure below shows the thermocouples for the stove use monitoring devices deployed on several stove
types. For those stoves that were stationary, the logger of the monitor was affixed above or adjacent to the
stove, while the thermocouple was threaded to an appropriate distance from the combustion zone to detect
cooking events in temperature traces. For those stoves that were portable, the logger of the monitor was
affixed to the body of the stove, while the thermocouple was situated appropriately at the combustion zone,
so that the stove and monitor could be moved as the participant wished with no disruption of
monitoring.
Figure S10. Photos showing SUMs installation on various stove types. On the portable stoves, the logger can be
observed affixed to the stove body, while on stationary stoves, the thermocouple is shown threaded to the zone of
combustion.
S4.7 Ambient monitoring
Table S3. Ambient measurement results for PM2.5 and CO
| Statistic | PM2.5 (µg/m3) | CO (ppm) |
|---|---|---|
| Mean | 6.83 | 0.91 |
| SD | 4.52 | 2.71 |
| Min | 1.26 | 0 |
S4.8 Intensive monitoring
Table S4. Intensive sample summary statistics. These data will be further analyzed in future work, to assess the
day-to-day variability of the household air pollution measurements, and compliance of the Beacons.
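| Stove group | Parameter | PM2.5 Kitchen | PM2.5 Kitchen sampling duration | PM2.5 Living Room | PM2.5 Living Room sampling duration | Kitchen CO (ppm) | CO Kitchen sampling duration | Living Room CO (ppm) | CO Living Room sampling duration |
|---|---|---|---|---|---|---|---|---|---|
| Charcoal | Mean | 86.6 | 4897 | 54.2 | 3907 | 10.6 | 4177 | 8.5 | 4892 |
| Charcoal | SD | 22.0 | 1229 | 42.0 | 1783 | 3.1 | 1471 | 4.6 | 1230 |
| Charcoal | Min | 66.2 | 3306 | 11.3 | 2081 | 8.4 | 2772 | 2.6 | 3300 |
| Charcoal | q25 | 68.9 | 4268 | 30.3 | 2685 | 8.7 | 3167 | 7.3 | 4262 |
| Charcoal | Median | 84.7 | 5103 | 47.9 | 3737 | 9.4 | 3930 | 8.9 | 5100 |
| Charcoal | q75 | 102.3 | 5731 | 71.8 | 4960 | 11.3 | 4939 | 10.1 | 5730 |
| Charcoal | Max | 110.8 | 6074 | 109.9 | 6074 | 15.1 | 6074 | 13.7 | 6069 |
| Charcoal | n | 4.0 | 4 | 4.0 | 4 | 4.0 | 4 | 4.0 | 4 |
| Chepkube | Mean | 1027.4 | 7101 | 49.2 | 7059 | 11.9 | 5056 | 1.5 | 6619 |
| Chepkube | SD | 534.7 | 718 | 18.6 | 728 | 4.0 | 2037 | 2.4 | 92 |
| Chepkube | Min | 410.0 | 6686 | 27.8 | 6639 | 9.6 | 3880 | 0.1 | 6513 |
| Chepkube | q25 | 873.0 | 6686 | 43.9 | 6639 | 9.6 | 3880 | 0.1 | 6593 |
| Chepkube | Median | 1336.1 | 6686 | 60.0 | 6639 | 9.6 | 3880 | 0.1 | 6672 |
| Chepkube | q75 | 1336.1 | 7308 | 60.0 | 7270 | 13.0 | 5645 | 2.2 | 6672 |
| Chepkube | Max | 1336.1 | 7930 | 60.0 | 7900 | 16.5 | 7409 | 4.2 | 6672 |
| Chepkube | n | 3.0 | 3 | 3.0 | 3 | 3.0 | 3 | 3.0 | 3 |
| LPG | Mean | 119.2 | 5098 | 65.2 | 4508 | 8.4 | 4485 | 5.6 | 4719 |
| LPG | SD | 158.9 | 1543 | 67.1 | 2403 | 7.1 | 1923 | 5.4 | 2051 |
| LPG | Min | 10.7 | 2718 | 13.8 | 0 | 0.2 | 1767 | 0.3 | 0 |
| LPG | q25 | 29.3 | 4657 | 20.2 | 3944 | 3.6 | 3060 | 0.5 | 3900 |
| LPG | Median | 58.2 | 4727 | 37.5 | 4726 | 5.7 | 4621 | 2.9 | 4670 |
| LPG | q75 | 111.7 | 5763 | 82.8 | 5762 | 13.8 | 4761 | 7.9 | 5748 |
| LPG | Max | 543.3 | 8092 | 221.9 | 8043 | 21.6 | 8020 | 16.7 | 7999 |
| LPG | n | 14.0 | 14 | 12.0 | 14 | 14.0 | 14 | 13.0 | 14 |
| Trad Biomass | Mean | 630.1 | 4429 | 25.5 | 4429 | 11.1 | 3587 | 1.6 | 4289 |
| Trad Biomass | SD | 463.9 | 3097 | 10.5 | 3096 | 6.4 | 1898 | 2.7 | 3192 |
| Trad Biomass | Min | 131.6 | 2006 | 17.3 | 2006 | 3.4 | 1936 | 0.0 | 1922 |
| Trad Biomass | q25 | 263.8 | 2858 | 19.3 | 2857 | 6.4 | 2424 | 0.1 | 2424 |
| Trad Biomass | Median | 473.3 | 2951 | 21.9 | 2951 | 9.9 | 2925 | 0.3 | 2941 |
| Trad Biomass | q75 | 975.1 | 4952 | 26.6 | 4952 | 15.3 | 4233 | 1.7 | 4945 |
| Trad Biomass | Max | 1328.0 | 10428 | 47.7 | 10426 | 20.9 | 6934 | 7.2 | 10420 |
| Trad Biomass | n | 7.0 | 7 | 7.0 | 7 | 7.0 | 7 | 7.0 | 7 |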
Figure S12. Typical CO and CO2 emissions time series, showing the initial background period, the cooking period, and
final background period, in addition to the data points identified to be associated with the decay that can be used to
calculate the kitchen ventilation rate.
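A common way to obtain a ventilation rate from such decay points is a log-linear fit, which assumes a well-mixed room with no active source, so that the excess concentration above background decays exponentially. The sketch below illustrates that standard approach on synthetic data. It is not the study's code, and the function name and input values are assumptions.

```python
import numpy as np


def air_exchange_rate(minutes, conc, background):
    """Air changes per hour from a log-linear fit to a decay period.

    Assumes a well-mixed room with no source during the decay, so that
    C(t) - background = (C0 - background) * exp(-k * t), with k in 1/h.
    """
    t_hr = np.asarray(minutes, dtype=float) / 60.0
    excess = np.asarray(conc, dtype=float) - background
    if np.any(excess <= 0):
        raise ValueError("decay concentrations must exceed background")
    slope, _ = np.polyfit(t_hr, np.log(excess), 1)
    return -slope


# Synthetic CO decay (ppm): 2 ppm background, 12 air changes per hour.
minutes = np.arange(0, 30, 1)
conc = 2.0 + 50.0 * np.exp(-12.0 * minutes / 60.0)
print(round(air_exchange_rate(minutes, conc, background=2.0), 2))  # 12.0
```

With field data, the decay points would be restricted to the period after the stove is off and before concentrations approach the background level.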