Emissions to exposure: modeling approaches and performance for estimating

personal exposure to household air pollution

Final Report

Michael Johnson¹, Ricardo Piedrahita¹, Ajay Pillarisetti², Matthew Shupler³, Madeleine Rossanese¹, Samantha Delapeña¹, Ryan Chartier⁴, Elisa Puzzolo³,⁵, Diana Menya⁶, Daniel Pope³

¹ Berkeley Air Monitoring Group, Berkeley, California, USA

² Emory University, Atlanta, Georgia, USA

³ Dept. of Public Health and Policy, University of Liverpool, Liverpool, UK

⁴ RTI International, North Carolina, USA

⁵ Global LPG Partnership, New York/London

⁶ Dept. of Epidemiology and Biostatistics, School of Public Health, College of Health Sciences, Moi University, Kenya

1 Executive Summary

Background and objectives: This study assessed the performance of modeling approaches to estimate

personal exposure in Kenyan homes where household fuel combustion contributes substantially to household

air pollution (HAP). This work had two primary objectives: first to evaluate the models used for setting

emissions performance targets by the World Health Organization (WHO) and International Organization for

Standardization (ISO), and second to adapt and develop models that predict exposures to household air

pollution.

Approach: To address these objectives we collected data on a subsample of homes participating in the Clean

Air (Africa) study, which is evaluating the potential impacts of transitioning from biomass fuels to liquefied

petroleum gas.¹ Within this subsample, we measured emissions (PM2.5, black carbon, CO); household air

pollution (PM2.5, CO); personal exposure (PM2.5, CO); stove use; and behavior, socioeconomic, and

environmental (e.g. ventilation and kitchen volume) characteristics. This data was then used to assess and

develop the modeling approaches: the single-zone model used by ISO and the WHO for stove performance

targets; indirect exposure models that combine person-location and area-level measurements; and

regression-based models that predict exposure based on a set of predictors such as fuel type, room volume,

and other relatively easily measured parameters.

Key findings:

- The measured stove performance, kitchen concentrations, and personal exposures were all in

alignment with previous work based on fuel type. LPG had the lowest emissions, with corresponding

low HAP concentrations and personal exposures, while wood and charcoal stoves had substantially

higher emissions and exposures. This anticipated trend and variability in conditions provided a good

dataset to test and develop the models.

¹ The CLEAN-Air (Africa) project is being conducted by the University of Liverpool (UoL) and Moi University in Eldoret, Kenya. The CLEAN-Air (Africa) Global Health Research Group is working directly with government ministries in Cameroon, Ghana, and Kenya to assess the impacts of the slated expansion of LPG use in the countries.

- The WHO and ISO single zone model was reasonably well correlated with measured kitchen

concentrations of PM2.5 and CO (R2=0.45), but lacked precision, with relatively large standard errors.

The model also overestimated measured kitchen concentrations by several fold.

- The combination of the single-zone model’s correlation with measured kitchen concentrations and

systematic overestimation suggests it is a reasonable tool for setting performance targets as it

provides a conservative approach for linking emission rates with indoor air quality. Given that the

model overestimates kitchen concentrations, by potentially up to an order of magnitude, there may

be room to adjust the modeling approach such that it provides more reasonable estimates of kitchen

concentrations while still erring on the side of conservativeness.

- Normalizing the model response to measured kitchen concentrations, incorporating daily stove use

patterns of all stoves used in the home, and applying the ratios of kitchen concentrations to personal

exposures generated distributions of modeled PM2.5 exposures for the LPG and wood user groups

with similar medians (LPG modeled = 19 μg/m3 vs measured = 29 μg/m3; wood modeled = 207 μg/m3 vs

measured = 182 μg/m3) and substantially overlapping interquartile ranges (LPG modeled = 10-41 μg/m3

vs measured = 27-46 μg/m3; wood modeled = 113-386 μg/m3 vs measured = 104-292 μg/m3). This

agreement suggests the adapted model can produce group-level estimates of HAP exposure that are

reasonable, but given the large standard errors, interpreting individual level estimates would be

problematic.

- Regression models were made using the entire dataset and separately for LPG and biomass-using

households. The best performing model for the entire dataset used a combination of survey-based

data and measurements. The model performed well, with an R2 of 0.76 and a root mean-squared error

of 85 µg/m3.

- The survey-based regression model was able to predict PM2.5 exposures with an R2 of 0.51, nominally

better than the single-zone model. A reliable and accurate survey-based model for estimating

exposure would be an extremely valuable tool for researchers and program evaluators, as it would

mitigate the need for relatively expensive and technical exposure measurements. While promising,

the substantial exposure contrast between the LPG and biomass user groups was largely responsible

for the relatively good performance of the simple model, with fuel type being the most important

predictor in the model. This caveat implies that the model performance may rely on those large

contrasts, which are not always evident, an issue that has been previously reported in HAP exposure

modeling studies.

Conclusions: Overall, both the single zone and regression based models show substantial promise for

predicting personal exposures to HAP. The models were aided by large exposure contrasts between fuel use

groups, which helped explain much of the variability in exposure. All of the modeling approaches also had

substantial uncertainty associated with specific data points, suggesting that they are best applied towards

group-level estimates rather than being used as a tool to understand exposure in any given home. The

modeling approaches were also developed using data from a single context, and some of the assumptions or

predictors may have differential impacts in other locations. Testing them in different contexts (fuels,

geographies, stove use patterns, housing characteristics) would help characterize how robustly they operate

and/or the degree to which they may need to be tuned to those specific conditions.

Conducting this work in Kenya was especially timely, as the country has a relatively strong market for

modern biomass cookstoves and clean cooking energy sources, as well as innovative consumer finance programs

such as pay-as-you-go systems and microfinance. Kenya’s transition towards cleaner stoves and fuels is being

aided by active involvement in the development and adoption of ISO performance standards, which is part of

the country’s work with the World Health Organization to expedite the energy transition. The models

developed here are clearly most applicable to the Kenyan context, and ideally will be used to help characterize

the HAP exposure implications of energy transitions, as well as strengthen the implementation of Kenya’s

cookstove standards framework.

2 Table of Contents

Executive Summary

Table of Contents

Acknowledgements

Background and Rationale

Methods

5.1 Study design and field site

5.2 Data collection and analysis methods

5.2.1 Emissions measurement techniques

5.2.2 Ventilation rate determination

5.2.3 Stove usage monitoring

5.2.4 Ambient monitoring

5.2.5 Personal exposure monitoring

5.2.6 Household air pollution monitoring

5.2.7 Beacon based time-activity monitoring

5.2.8 Behavioral factors

5.3 Modeling approaches

5.4 Data Analysis

5.4.1 Sensor data

5.4.2 Mass-balance single-zone model

5.4.3 Regression models

Results

6.1 Single-zone model

6.2 Regression-based models

Discussion

7.1 Model application considerations

References

Supplementary Information

S1 Emissions performance

S2 Housing characteristics and socioeconomic status

S3 PM2.5 and CO household air pollution (HAP) and personal exposure concentrations for the study population

S4 Characterization of long-term stove use patterns

S4.1 Average events per day by group

S4.2 Average minutes per day by group

S4.3 Temporal trends

S4.4 Usage fraction by group

S4.5 Stove stacking

S4.6 SUMs placement

S4.7 Ambient monitoring

S4.8 Intensive monitoring

3 Acknowledgements

We would foremost like to thank the kind and gracious study participants from the Eldoret area for

participating in the study, without whom it would not have been possible.

This work was made possible by the research team led by Dr. Diana Menya at Moi University, including Judi

Mangeni, Edna Sang, Gilbert Nyauke, Mary Lydia Kiano, Bernard Bosire, Anabwani Menya, Noelle Sutton, Seth

Owiti, Sharon Cherono, Rachel Samoei, Ruth Jepchirchir, Joan Chepng’eno, Joseck Erambo, Zipporah Mageto,

and Mondesta Malemo.

Thank you to the entire University of Liverpool team, who supported the field work training, project

management, and data cleaning, especially Iva Cukic, Rachel Anderson de Cuevas, Sara Ronzi, and Emily Nix.

Thank you to Dr. Daniel Wilson at Geocene for assistance in processing the stove usage monitor and exposure

data. Thank you to Parker Alex Matthews for her assistance in data cleaning, organization, and analysis.

This work was funded by the United Nations Office for Project Services (UNOPS; RFP/2017/2592), a subsidiary

organ of the United Nations, through the Climate and Clean Air Coalition, and managed by the Clean Cooking

Alliance. Funding for a substantial portion of the measurements and field work that were complementary to

this work was provided by the CLEAN-Air (Africa) project. CLEAN-Air (Africa) was funded by the National

Institute for Health Research (NIHR) (ref: 17/63/155) using UK aid from the UK government to support global

health research. The views expressed in this publication are those of the author(s) and not necessarily those of

the NIHR, UK Department of Health and Social Care, UNOPS, the Climate and Clean Air Coalition, or the Clean

Cooking Alliance.

4 Background and Rationale

This study was a collaborative effort with the CLEAN-Air (Africa) project conducted by the University of Liverpool

(UoL) and Moi University in Eldoret, Kenya. The CLEAN-Air (Africa) Global Health Research Group worked

directly with government ministries in Cameroon, Ghana, and Kenya that have recently made ambitious

commitments to scale up household access to LPG for a significant proportion of their populations. As part of

the effort to estimate impacts of scaled LPG adoption, the CLEAN-Air (Africa) project collected rapid survey

data on fuel usage and household characteristics from over 2000 homes. In ~100 Kenyan homes using biomass only,

biomass and LPG, or LPG only, the project also conducted in-depth surveys, personal exposure and household-level

measurements of particulate matter under 2.5 µm (PM2.5) and carbon monoxide (CO), and stove use monitoring.

The additional measurements for the UNOPS study were conducted in a subset of those homes. The

most common biomass stoves in the study area were traditional three-stone-fires, jiko-style charcoal stoves,

and handmade mud stoves (Chepkube stoves) (see Figure 2).

There is also limited data on emissions and exposure in Kenya. To our knowledge, the only published field

emission performance studies on stove interventions have been carried out by Berkeley Air on charcoal and

kerosene stoves (Garland et al. 2017; Johnson et al. 2009, 2019). The data on personal exposures is also

limited, with available studies reporting only carbon monoxide (CO) or associated with only wood-fueled

stoves (Ochieng et al. 2013; Yip et al. 2017). Based on our information, there have yet to be studies that

report PM2.5 exposures from a survey of the current common user groups in Kenya, i.e. users of traditional

biomass, mixed biomass and LPG, and exclusive LPG users.

In this paper we report on efforts to: 1) develop and refine current physical modeling approaches used by the

WHO and ISO to predict personal exposures to PM2.5 and CO from emissions measurements (ISO 2018;

Johnson et al. 2014) and 2) develop multivariate models that predict personal PM2.5 and CO exposures based

on physical and behavioral predictors of PM2.5 and CO emissions. Figure 1 illustrates the emissions to exposure

pathways, which form the basis of the modeling approaches.

Figure 1. Diagram of the emissions to exposure pathways and the basic modeling approaches employed in this validation

study. (Figure credit: Ajay Pillarisetti, Nick Lam)

Additional results include characterization of: 1) in-field PM2.5, CO, and black carbon (BC) emissions

performance of wood, charcoal, and LPG stoves; 2) distributions of kitchen volumes, air exchange rates, and

cooking times (key WHO/ISO physical model inputs); 3) PM2.5 and CO HAP and personal exposure

concentrations for the study population; 4) indirect PM2.5 and CO exposure modeling results, wherein indoor

location and household air pollution are used to model personal exposure; and 5) characterization of

long-term (2-4 months) stove use patterns for each study group (biomass as the primary fuel, charcoal as

primary, and LPG as primary).
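The indirect exposure approach mentioned in item 4 can be illustrated with a minimal sketch: each minute, the person is assigned the concentration measured in the microenvironment where they are located, and the results are averaged. The function name, the location labels, and the example values are assumptions made for illustration, not data or code from the study.

```python
from statistics import mean

def indirect_exposure(locations, concentrations):
    """Average the concentration of whichever microenvironment is occupied each minute.

    locations: one microenvironment label per minute (hypothetical labels)
    concentrations: label -> per-minute concentrations (e.g. PM2.5 in ug/m3)
    """
    per_minute = [concentrations[loc][i] for i, loc in enumerate(locations)]
    return mean(per_minute)

# Illustrative four-minute series, not study data.
locations = ["kitchen", "kitchen", "living area", "ambient"]
concentrations = {
    "kitchen": [200.0, 180.0, 150.0, 120.0],
    "living area": [60.0, 55.0, 50.0, 45.0],
    "ambient": [20.0, 20.0, 20.0, 20.0],
}
estimate = indirect_exposure(locations, concentrations)
```

In practice the per-minute location series would come from a time-activity record and the concentrations from area monitors; the sketch only shows how the two are combined.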

Assessment and improvement of physical models for household and personal exposure concentrations is an

important goal. Field studies of personal exposure to PM2.5 are some of the most difficult to conduct as they

require costly equipment, highly trained technicians, and participant compliance with PM monitors. Models

that could accurately estimate exposure based on known performance of technologies combined with more

easily collected data, such as ventilation rates, kitchen volumes, and simple measures of user behavior, would

be highly valuable. While not a substitute for direct field studies of exposure, these models could help guide

programmatic decisions toward the most effective household energy solutions. Here we present models that

aim to predict personal exposure to household air pollution using various approaches and data types within

the context of cooking, though they could be applied to other source types as well.

The data set collected in this study is unique due to the number of samples collected and a wide range of

measurements performed at each home. Consequently, there are many interesting relationships to assess

and analyses to perform that are outside the primary aims of the work, and it is likely that future researchers

would find additional analyses to conduct with this data set.

5 Methods

5.1 Study design and field site

57 study households were selected from Turbo and Kesses, rural and peri-urban communities in Western

Kenya, near Eldoret. These households were selected for household air pollution (HAP) and personal exposure

(PE) monitoring from a subset of the CLEAN-Air (Africa) study homes, and were split into groups primarily

using LPG (n=32) and wood (n=32), in addition to 7 primary charcoal users.² Household selection and

assignment to the study groups was conducted with feedback from local partners, though a rapid survey

(n=2248) provided the most critical contextual information showing which types of users are representative of

the CLEAN-Air (Africa) target population. Careful sampling ensured a large emissions performance and

exposure gradient for modelling purposes. Field measurements for this work were conducted from October

2019-January 2020, encompassing only the dry season. Although the climate throughout the year is generally

temperate, and cooking occurs primarily indoors, it is conceivable that ventilation and fuel usage patterns may

change across seasons, influencing model performance.

Figure 2. Typical stoves encountered in the study, with the LPG and charcoal stoves shown at left (with SUMs installed), a

traditional Chepkube stove at center, and a 4-burner LPG stove at right.

In all study homes, emissions, HAP, personal exposure, stove use, and user behavior (sensor and survey-based)

were measured (Table 1). Ambient sampling was conducted during all personal samples. 28 homes were

selected for more intensive sampling. In these homes, we measured stove usage for up to four months and

HAP for up to four days (details in Table S4). During emissions sampling at these homes, stratified kitchen CO

and PM2.5 concentrations were measured at heights of 1m and 2m, in addition to the typical 1.5m height. The

specific measurement techniques for these measures are presented in Table 1 below.

² These were the total number of home monitoring samples conducted, but the final number of valid samples available

for modeling was lower for the different modeling approaches, because data quality assurance protocols filtered out some

of the data.

Table 1. Measurement overview. Measurements were conducted at multiple time scales, from the measurement of

single cooking events to multi-day household air pollution monitoring.

Column headings: Emissions measurements during a cooking event | 24h measurements | Intensive measurements (1-4 days)

Personal exposure

MicroPEM (PM2.5) Lascar CO (CO) Berkeley Air Beacons (participant location)

Berkeley Air Beacons (participant location)

Kitchen HAP MicroPEM (PM2.5) Lascar CO (CO)

Berkeley Air PATS+ (PM2.5) Lascar CO (CO)

Secondary HAP Berkeley Air PATS+ (PM2.5) Lascar CO (CO)

Berkeley Air PATS+ (PM2.5) Lascar CO (CO)

PM2.5 Emissions

UPAS (PM2.5) Berkeley Air PATS+ (PM2.5, placed at 1m, 1.5m, 2m)

- -

CO/CO2 Emissions

TSI (CO) Lascar CO (CO, placed at 1m, 1.5m, 2m)

- -

Participant Behavior

Observation Berkeley Air Beacon Loggers (participant location) Survey (time-activity, fuel-use, socioeconomic status)

Berkeley Air Beacon Loggers (participant location) Survey (time-activity, fuel-use, socioeconomic status)

Stove use Geocene SUMS (temperature loggers)

Environment Kitchen volume, air exchange rate, housing characteristics

Ambient UPAS (PM2.5), PATS+ (PM2.5), Lascar CO (CO)

5.2 Data collection and analysis methods

5.2.1 Emissions measurement techniques

Emissions samples were collected during uncontrolled cooking tests in participants’ homes, where the cooks

were instructed to prepare a meal as they normally would, without altering stove operation or cooking

techniques. The emissions sample was collected with a multi-port probe suspended in the smoke plume

(Figure 3), and the sample stream was drawn through a Teflon filter after a PM2.5 size cut cyclone to determine

PM2.5 mass deposition. Carbon dioxide (CO2) and CO were measured with real-time instrumentation (TSI IAQ

CALC 7545). Background concentrations of CO2, CO, and PM2.5 were measured before and after each sampling

event, and subtracted from those measured in the emissions plume. Samples with more than 15% of the CO or

CO2 readings above the instrument maximum measurable value were removed from analysis.

If real-time estimates of background PM2.5 concentrations exceeded 50 µg/m3, testing was delayed until a lower

background concentration was observed. Filter analyses were performed at Colorado State University (Fort

Collins, CO USA) using an electronic microbalance (Mettler Toledo, USA) with 0.1µg resolution in a

temperature and humidity-controlled chamber. Mass depositions were determined by weighing the filters

before and after sampling, and correcting for contamination using the median mass deposition of collected

blank filters (n=20). Limit of detection (LoD) was calculated as three times the standard deviation of the mass

deposition on the blank filters (Shrivastava et al., 2011). Black carbon on filters was analysed optically by

transmittance (before and after sampling) using a SootScan OT21 analyser (Magee Scientific), and adjusted

using calibration factors as reported in Garland et al. 2017.
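The blank correction and limit of detection described above amount to two short calculations, sketched here. The sample (n-1) standard deviation is an assumption, since the text does not say which form was used, and the filter masses are made-up values for illustration.

```python
from statistics import median, stdev

def blank_corrected_mass(pre_ug, post_ug, blank_depositions_ug):
    """Mass gained by a sample filter minus the median mass gained by the blanks."""
    return (post_ug - pre_ug) - median(blank_depositions_ug)

def limit_of_detection(blank_depositions_ug):
    """Three times the standard deviation of the blank mass depositions.

    Assumption: sample (n-1) standard deviation.
    """
    return 3 * stdev(blank_depositions_ug)

# Illustrative blank depositions in micrograms, not study data.
blanks = [1.0, 2.0, 3.0, 2.0, 2.0]
net_mass = blank_corrected_mass(100000.0, 100050.0, blanks)
lod = limit_of_detection(blanks)
```

A sample whose corrected mass falls below `lod` would typically be flagged; how such samples were handled in this study is not stated in this section.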

Figure 3. The emissions measurement system set-up to measure an LPG cooking event at left, and a biomass cooking event on a Chepkube stove at right. Note the stratified samples hanging in the room as part of an intensive sample.

Emission factors were determined using the carbon balance approach, as has been done in previous studies of

stove emissions and as is described in the WBT 4.2.3 protocol (Johnson et al. 2008; Roden et al. 2006; WBT

Technical Committee 2014). Emission rates were calculated by dividing the total emissions during a sampled

stove use event by the amount of time the event lasted.
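As a rough illustration of the carbon balance approach and the emission rate calculation, the sketch below relates background-subtracted plume concentrations to fuel carbon. It is a simplified version under stated assumptions: only carbon in CO2 and CO is counted, gas concentrations are converted at 25 °C and 1 atm, and the fuel carbon fraction, fuel mass, and concentrations are illustrative values rather than study inputs.

```python
MOLAR_VOLUME_M3 = 0.02445    # m3 per mol of gas at 25 C and 1 atm (assumed conditions)
CARBON_G_PER_MOL = 12.011

def carbon_concentration(co2_ppm, co_ppm):
    """Carbon mass concentration (g C/m3) carried by CO2 and CO in the plume."""
    moles_per_m3 = (co2_ppm + co_ppm) * 1e-6 / MOLAR_VOLUME_M3
    return moles_per_m3 * CARBON_G_PER_MOL

def emission_factor(pm_ug_m3, co2_ppm, co_ppm, fuel_carbon_fraction):
    """PM2.5 emission factor in g per kg of fuel burned.

    All concentrations are assumed to be background-subtracted.
    """
    pm_g_m3 = pm_ug_m3 * 1e-6
    ratio = pm_g_m3 / carbon_concentration(co2_ppm, co_ppm)  # g PM per g carbon
    return ratio * fuel_carbon_fraction * 1000.0

def emission_rate(ef_g_per_kg, fuel_kg, duration_min):
    """Total emissions over the event divided by its duration, in mg/min."""
    return ef_g_per_kg * fuel_kg * 1000.0 / duration_min

# Illustrative inputs: 2000 ug/m3 PM2.5, 1000 ppm CO2, 50 ppm CO, 50% fuel carbon.
ef = emission_factor(2000.0, 1000.0, 50.0, fuel_carbon_fraction=0.5)
rate = emission_rate(ef, fuel_kg=1.0, duration_min=60.0)
```

Published implementations of the carbon balance method may also account for carbon in particles and hydrocarbons, which this sketch leaves out.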

Observations and measurements of operational conditions, which may affect emissions performance, were also recorded for the duration of each cooking event, such as lighting techniques, pot types, and fuel conditions.

5.2.2 Ventilation rate determination

Ventilation was measured via the tracer gas method, according to the standard WHO protocols specifically designed for the single zone box model. Briefly, CO levels were elevated in the cooking area due to the emissions source, and the rate at which the gas concentration decreased at the end of the cooking event was converted into a ventilation rate (Cowlin 2005). Primarily, this was calculated using data from the CO monitor placed at 1.5m height, but we also assessed it using data from the monitors at 1m and 2m for homes that had intensive

sampling. In cases where the kitchen monitor did not provide valid data, the data collected by the emissions monitoring system were used to estimate the ventilation rate.
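One common way to turn a tracer decay curve into a ventilation rate is to fit a straight line to the logarithm of the concentration, as sketched below. This assumes a well-mixed room, no remaining source during the decay period, and first-order decay; the function name and the synthetic data are illustrative, and the study's exact fitting procedure is not described in this section.

```python
import math

def air_exchange_rate(times_h, co_ppm, background_ppm=0.0):
    """Air exchange rate (per hour) from the slope of ln(concentration) vs time.

    Assumes C(t) - background = C0 * exp(-rate * t) after the source stops.
    """
    ys = [math.log(c - background_ppm) for c in co_ppm]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, ys))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    return -sxy / sxx  # the fitted slope is negative for a decay

# Synthetic decay at 10 air changes per hour, sampled every minute for 20 minutes.
times_h = [i / 60 for i in range(20)]
co_ppm = [50.0 * math.exp(-10.0 * t) for t in times_h]
rate_per_hour = air_exchange_rate(times_h, co_ppm)
```

Readings at or below the background level must be excluded before fitting, since their logarithm is undefined.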

Additionally, we analyzed the rates of decrease from peak gas and/or PM2.5 concentrations measured during longer term HAP sampling to compare these with the conventional approach and examine the stability of ventilation rates over time. This approach is presented in Carter et al. (2016).

5.2.3 Stove usage monitoring

Stove usage was directly measured at 5-minute intervals on all stoves used more than once per week in the

study homes with stove use monitors (SUMs) and participant surveys. Geocene SUMs were selected for this

measurement due to their ability to measure high temperatures with thermocouples, the ease of device

launching and data management, and unobtrusive design (Wilson et al. 2020). SUM data was assessed on-site

to ensure proper placement, and immediate corrections were applied to address any issues. Placement of the

SUMs was piloted on any stove type that had not been previously encountered by the team to ensure

successful data collection (see Figure S10).

One week of stove usage monitoring, planned to coincide with the exposure assessment period, was already

included as part of the CLEAN-Air (Africa) study, which allowed for the collection of a wealth of short-term

data. Additionally, we conducted 2-6 months of SUMs monitoring on a subset of 48 homes (over 55 stoves in

total) to better characterize long-term usage trends and assess day-to-day variability, which both impact

household air pollution and personal exposure.

To generate cooking events from the SUMS temperature time series, this project used the FireFinder

algorithm (Wilson et al. 2020) from Geocene (Geocene, Berkeley, California). Two versions of the FireFinder

algorithm were deployed, the default version for time series in which the temperature exceeded 250 °C, and a

sensitive version for time series in which the maximum temperature was below 250 °C. The FireFinder sensitive

algorithm used a primary threshold parameter of 31 °C (the 95th percentile of all indoor temperature

measurements collected by PATS+ monitors) and a minimum event temperature of 24 °C (the 75th percentile of

outdoor temperature values). As with the default FireFinder algorithm, the minimum event duration was 5

minutes, and any events within 10 minutes of each other were grouped together into a single event. An

example time series is shown in the Supplementary Information in Figure S3. The time cooked with each stove

type per day was a direct input into the WHO single-zone box model and was used to predict kitchen-level

concentrations, as well as personal exposure. Longer term stove usage trends for the UNOPS and Clean Air

(Africa) homes are presented separately in the Supplementary Information.
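The event rules stated above (a temperature threshold, a 5-minute minimum duration, and merging of events within 10 minutes of each other) can be illustrated with a simple threshold detector. This is not the FireFinder algorithm, whose internals are not described here; it is a simplified stand-in that ignores the minimum event temperature parameter, and the function name and example temperatures are assumptions.

```python
def detect_events(temps_c, threshold_c=31.0, interval_min=5,
                  min_duration_min=5, merge_gap_min=10):
    """Return (start_min, end_min) cooking events from evenly spaced temperatures."""
    # 1) Find contiguous runs of samples above the threshold.
    runs, start = [], None
    for i, temp in enumerate(temps_c):
        if temp > threshold_c and start is None:
            start = i
        elif temp <= threshold_c and start is not None:
            runs.append((start * interval_min, i * interval_min))
            start = None
    if start is not None:
        runs.append((start * interval_min, len(temps_c) * interval_min))

    # 2) Merge runs separated by no more than the allowed gap.
    merged = []
    for begin, end in runs:
        if merged and begin - merged[-1][1] <= merge_gap_min:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((begin, end))

    # 3) Drop events shorter than the minimum duration.
    return [(b, e) for b, e in merged if e - b >= min_duration_min]

# Illustrative 5-minute readings, not study data.
temps = [22, 40, 45, 25, 38, 22, 22, 22, 22, 35, 22]
events = detect_events(temps)
cooking_minutes = sum(end - begin for begin, end in events)
```

Summing event durations per day, as in the last line, gives the kind of daily cooking time that the text describes as an input to the single-zone box model.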

5.2.4 Ambient monitoring

Ambient monitoring of gravimetric and nephelometric PM2.5 and CO was carried out during most emissions

sample collection periods. An ambient monitoring station was set up in a rural background location in

the Kesses region (0°25'07.7"N 35°19'24.4"E). Gravimetric and black carbon PM2.5 measurements were

collected (UPAS, Access Sensor Technologies, Fort Collins, CO), alongside real-time PM2.5 (PATS+) and CO

(Lascar EL-USB-300). Instrument inlets were placed at a height of 6m, and away from trees, buildings, or other

obstructions, and there were no substantial nearby air pollution sources except for one biomass burning

kitchen ~50m from the site (see Figure 4).

Figure 4. Details of the ambient monitors and the installation site.

5.2.5 Personal exposure monitoring

Personal exposure was measured for the primary cook using the MicroPEM or Early Childhood MicroPEM

monitors (RTI, Research Triangle Park, North Carolina, USA), combined gravimetric and real-time

nephelometric PM2.5 monitors. MicroPEM and ECM filter samples were weighed gravimetrically at RTI and the

blank-corrected mass concentrations were used to correct the nephelometer readings, which are sensitive to

aerosol properties that may vary by source type and atmospheric conditions. Clean filter handling and

MicroPEM calibration protocols were applied to ensure high data quality throughout the study. The built-in

accelerometers were used to assess compliance, and a 20-minute rolling average was applied to the

magnitude of the composite acceleration. Compliance with wearing the monitors was evaluated considering the

fraction of the day that participants typically spend sleeping, and the amount of time they may be sitting with

the sampling vest adjacent to them. Compliance fractions between 0.25 and 0.75 were considered good, but with

the current model implementations we retained all data regardless of compliance.
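The compliance calculation can be sketched as a rolling average followed by a threshold. The text specifies only the 20-minute rolling average; the trailing window, the movement threshold, and the assumption that the input is a per-minute movement magnitude with gravity already removed are all hypothetical choices made for this illustration.

```python
def rolling_mean(values, window=20):
    """Trailing rolling mean; early points average over the samples available."""
    out, running = [], 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out

def compliance_fraction(magnitudes, window=20, threshold=0.02):
    """Fraction of minutes whose smoothed movement exceeds a (hypothetical) threshold."""
    smoothed = rolling_mean(magnitudes, window)
    worn = sum(1 for s in smoothed if s > threshold)
    return worn / len(smoothed)

# Toy series: four still minutes followed by four moving minutes.
fraction = compliance_fraction([0, 0, 0, 0, 1, 1, 1, 1], window=4, threshold=0.5)
```

Because sleep and seated periods produce little movement, a fraction well below 1 can still indicate good compliance, which is consistent with the 0.25 to 0.75 range described above.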

Personal carbon monoxide exposure monitoring was performed using Lascar EL-USB-CO300 monitors (Lascar

Electronics, UK). All CO sensors used in the study underwent two-point calibrations before the study, with

certified calibration standards in Berkeley Air Monitoring Group’s California laboratory. Linear calibration

corrections were applied to the data. The data were then manually reviewed as a sense check, and filtered in

cases of clear instrument malfunctions. Personal PM2.5 and CO data were used as the dependent variables

in the exposure models, at minute and 24-hr average time scales.
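A two-point linear calibration correction of the kind described above can be written as follows. The sketch assumes the two points are a zero gas and a single span gas; the reference concentrations and sensor readings shown are illustrative, not the study's calibration values.

```python
def two_point_calibration(zero_reading, span_reading, span_ppm):
    """Build a linear correction from a zero-gas and a span-gas reading (assumed setup)."""
    slope = span_ppm / (span_reading - zero_reading)

    def correct(raw_ppm):
        # Remove the zero offset, then rescale to the reference concentration.
        return (raw_ppm - zero_reading) * slope

    return correct

# Illustrative sensor that reads 2 ppm in zero air and 52 ppm in 50 ppm span gas.
correct = two_point_calibration(zero_reading=2.0, span_reading=52.0, span_ppm=50.0)
corrected = [correct(raw) for raw in (2.0, 27.0, 52.0)]
```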

5.2.6 Household air pollution monitoring

Household air pollution was also measured with the MicroPEM as part of the CLEAN-Air study. These data

were supplemented with additional real-time PM and CO instrumentation (PATS+ and Lascar CO). In a subset,

PATS+CO monitors (Berkeley Air Monitoring Group, Berkeley, California, USA) were used rather than the standard

PATS+, as their integrated high-precision CO electrochemical sensor allowed for comparison with the Lascars. A

sampling pack containing a MicroPEM, PATS+, and Lascar CO monitor was installed in the kitchen area for 24

hours, including the emissions sampling event. PATS+ and Lascar CO monitors were also placed in a separate

room where the participants reported spending the most time (generally the living area). In a subset of one

fourth of households, a set of three evenly spaced PATS+ and Lascar CO monitors was also hung between the

ceiling and floor in a stratified sampling configuration to capture the spatial variability of pollutants in the

kitchen space, which informs the variability of the ventilation conditions within the room, and in turn the

variability of kitchen to personal exposure estimation methods. In the 28 intensive homes, PATS+ and Lascar

CO monitors in the kitchen and living areas were left installed for up to five days to assess day to day

variability.

5.2.7 Beacon based time-activity monitoring

The Berkeley Air Beacon Logger System is a time-activity monitoring system specially designed for household

energy applications (Piedrahita et al. 2019, Liao et al. 2019). The system is composed of two components, a

poker-chip sized Bluetooth Beacon, which safely emits a unique ID multiple times a second over Bluetooth

Low Energy, and a Beacon Logger, which records the address and the strength of the Beacon’s emitted signal.

The system components are low in cost, power consumption, and maintenance efforts, especially in

comparison with personal exposure monitors.

Beacon Loggers were installed in all kitchens and living areas, and Beacons were given to the primary cooks to

wear during the 24-hour monitoring period around which emissions data was captured. Users generally wore

the two Beacons on a necklace or in the pocket alongside the exposure monitors. A minute-wise time series of

presence in each microenvironment was generated for the user by associating the signal strength of their

Beacons with the fixed locations of the Beacon Loggers (the primary kitchen and the living area where they

spent most of their time during the day). Presence in a room was determined using two different approaches:

1. In the ‘kitchen threshold’ approach, if the mean signal strength of a user’s Beacons in a given minute

is stronger than -70 (RSSI units, associated with an open field distance of ~5m), then the location for

the given minute is classified as ‘kitchen’, and if not, it is classified as ‘living area’. If the kitchen Beacon

Logger does not measure any signal for a given minute, it is assumed the user is away from the house,

and the location is then classified as ‘ambient’.

2. In the ‘nearest logger’ approach, the user’s location for a given minute is classified as the room of

whichever of the two Beacon Loggers in the home records the highest mean signal strength. If neither

logger records a signal, the location is again classified as ‘ambient’.
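The two classification rules can be expressed as short functions, sketched below. Representing a minute with no recorded signal as `None` and returning an ‘equidistant’ label for tied signal strengths are assumptions made for illustration; the function names are hypothetical.

```python
def kitchen_threshold(kitchen_rssi, threshold=-70):
    """Classify one minute from the kitchen logger's mean RSSI (None = no signal)."""
    if kitchen_rssi is None:
        return "ambient"
    return "kitchen" if kitchen_rssi > threshold else "living area"

def nearest_logger(kitchen_rssi, living_rssi):
    """Classify one minute by whichever logger records the stronger mean RSSI."""
    if kitchen_rssi is None and living_rssi is None:
        return "ambient"
    if living_rssi is None:
        return "kitchen"
    if kitchen_rssi is None:
        return "living area"
    if kitchen_rssi == living_rssi:
        return "equidistant"  # assumed handling of ties
    return "kitchen" if kitchen_rssi > living_rssi else "living area"

# Illustrative minute-wise mean RSSI pairs (kitchen logger, living area logger).
minutes = [(-60, -80), (-85, -65), (None, None), (-75, None)]
by_threshold = [kitchen_threshold(k) for k, _ in minutes]
by_nearest = [nearest_logger(k, l) for k, l in minutes]
```

The last example minute shows how the two rules can disagree: a weak kitchen signal with no living area signal is ‘living area’ under the threshold rule but ‘kitchen’ under the nearest logger rule.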

A performance check of the system, called a walkthrough, was carried out at each home before the start of

the deployment. This entailed leaving the equipment in each area for a 5-minute period to determine whether

the classification was correct. The walkthrough results indicated that with the ‘nearest logger’ algorithm,

the classification was correct 83.0% of the time when the equipment was in the kitchen, and incorrectly

assigned the equipment to the other area in which a logger was installed 15.5% of the time (0.6% of the time

it was classified as equidistant from both loggers, and 0.9% of the time it was classified as not being near

either logger, termed ‘ambient’). Similarly for the other area (typically the living room), a correct

classification was made 83.2% of the time, an incorrect prediction that the equipment was in the kitchen

was made 15.4% of the time, and the remaining 1.4% was split evenly (0.7% each) between equidistant and ambient

classifications. In the subset of intensive households, the Beacons and Beacon Loggers remained in place

alongside PATS+ and Lascar CO microenvironment monitors for a period of up to five days. This longer-term

period allowed us to assess participant acceptability of protocols, model performance, and compare

day-to-day within-person variability to between-person variability.


5.2.8 Behavioral factors

Time-activity, fuel use, and cooking habit data were collected using standard questionnaires that have been

used both in this part of Africa and in other countries, including India, Mongolia, Laos, and Cambodia.

Additional questions were added based on their potential utility to contribute explanatory power to statistical

models and included parameters such as trash-burning, animal fodder preparation, and smoking habits.

Information on socioeconomic and educational status was also collected. Socioeconomic status

was assessed using principal components analysis (PCA) on asset ownership and home characteristic variables

as per Vyas and Kumaranayake (2006) to generate a 5-category index. Households were assigned to a

category using their predicted score on the first principal component. The first

index category was associated with low ownership of assets (such as cars, cell phones, and computers),

higher use of biomass, and outdoor sanitation facilities, while the fifth was characterized by high ownership

rates of those assets, indoor sanitation facilities, and access to water indoors (Table S2). Survey data was

collected with Mobenzi (Cape Town, South Africa), a tablet-based data entry system that has been used

extensively in similar studies and minimizes the likelihood of transcription errors and data loss.

5.3 Modeling approaches

The measurements collected allowed a thorough evaluation of model performance and helped us to

determine the measurable factors most critical for accurate exposure estimation.

The first modeling approach was the single-zone, mass-based model currently employed by ISO and WHO to

estimate kitchen concentrations and derive emissions performance targets (Johnson et al., 2014; WHO 2015,

ISO 2018). The model predicts room concentrations of pollutants using input distributions of emission rates

and usage times of the sources (in this case, stoves); a room’s ventilation rate and volume; the fraction of

emissions from the sources that enter the room (important for chimney stoves); and the background/ambient

concentration. The mathematical description is provided below and can also be found in Johnson et al. 2014.

Equation 1

$$C(t) = \frac{q_1 f_1 + q_2 f_2 + q_3 f_3 + \dots + q_n f_n}{\alpha V}\left(1 - e^{-\alpha t}\right) + C_o\, e^{-\alpha t} + C_b$$

where

C(t) = pollutant concentration for a given time point

q_x = emission rate for source x (mass/min)

f_x = fraction of emissions from source x that enters the kitchen environment

α = first-order loss rate (nominal air exchange rate) (changes/min)

V = kitchen volume (m3)

t = time interval (1 min)

C_o = concentration from the preceding time interval (mass/m3)

C_b = background concentration

The model produces 24 hours of minute-by-minute concentration estimates, where the emission rates for the

respective sources are inputs for three discrete, evenly spaced cooking times. The sum of these periods is the

device usage time, which is also a model input. To calculate the predicted 24-hour mean concentration in the


kitchen (Ck), the concentrations for each time point are summed over the day and divided by the number of

minutes in a day. To estimate exposure, the 24-hour mean concentration Ck was multiplied by a Kitchen

Exposure Factor (KEF), as shown in Equation 2 below. The exposure ratios are ideally location-specific (as was

possible here), though global averages have also been applied, such as those from the Global Burden of Disease

Study: 0.742 for women, 0.628 for young children, and 0.450 for men (Smith et al. 2014).

Equation 2

$$E_r = \mathrm{KEF} \times C_k$$
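The single-zone calculation and the KEF scaling can be sketched as follows. This is a minimal illustration under stated assumptions, not the ISO/WHO implementation: the function names are hypothetical, emission rates are assumed to be in mg/min, a single stove is modeled, cooking events are assumed to start at evenly spaced intervals from midnight, and the background concentration is added once to the indoor-generated component rather than carried forward through the preceding-interval term.

```python
import math


def simulate_kitchen_concentration(q_mg_per_min, f, ach_per_hour, volume_m3,
                                   cooking_minutes_per_day,
                                   background_ug_m3=0.0, n_events=3):
    """Minute-by-minute single-zone concentrations (ug/m3) over 24 hours."""
    alpha = ach_per_hour / 60.0            # loss rate per minute; must be > 0
    event_len = cooking_minutes_per_day // n_events
    spacing = 1440 // n_events             # minutes between event start times
    decay = math.exp(-alpha * 1.0)         # one-minute time step
    indoor = 0.0                           # indoor-generated concentration
    series = []
    for minute in range(1440):
        burning = (minute % spacing) < event_len
        q_ug = q_mg_per_min * 1000.0 * f if burning else 0.0
        indoor = q_ug / (alpha * volume_m3) * (1.0 - decay) + indoor * decay
        series.append(indoor + background_ug_m3)
    return series


def estimate_exposure(series, kef):
    """Scale the 24-hour mean kitchen concentration by the KEF."""
    ck = sum(series) / len(series)
    return kef * ck
```

For example, `estimate_exposure(simulate_kitchen_concentration(15, 1.0, 27.0, 24.8, 162), 0.742)` applies the global KEF for women to an illustrative charcoal-like input set.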

We calculated KEFs for each sample and by stove-fuel group. The predictive power of average KEFs was

evaluated using k-fold cross-validation (k depends on the total number of samples obtained). Previous work

(Hill et al. 2019) has shown that KEFs alone have poor predictive power in some contexts. The predictive

power of other ratios (of pollutants or pollutants and locations) was evaluated.
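A sketch of how the predictive power of an average KEF might be checked with k-fold cross-validation is shown below. The function name, the default `k=5`, and the random fold assignment are assumptions made for illustration; the report does not specify these details.

```python
import math
import random


def kfold_kef_rmse(kitchen, personal, k=5, seed=0):
    """Cross-validated RMSE of exposure predicted as mean KEF x kitchen level."""
    idx = list(range(len(kitchen)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    sq_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Average of the per-sample KEFs in the training folds only
        mean_kef = sum(personal[i] / kitchen[i] for i in train) / len(train)
        for i in held_out:
            sq_errors.append((mean_kef * kitchen[i] - personal[i]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))
```

Because the mean KEF is recomputed without the held-out samples, the resulting RMSE reflects how well an average ratio transfers to households that were not used to derive it.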

In addition to the physical, mass-based modeling of the first approach, a second approach was developed to

estimate personal exposure to PM2.5 and CO using linear regression. Covariates included both sensor-based

(indoor location, stove usage patterns, kitchen pollutant measurements) and survey (characteristics of the

home, kitchen, fuel, etc.) data. We created models with different sets of covariates, beginning with those that

are easiest to collect and most crude (survey data) and continuing with increasingly complex data streams. In

doing so, we identified both minimal and maximal predictive powers. We also compared models using both

sensor-based measurements and modeled estimates of kitchen concentrations. Previous work has shown that

some statistical variation in exposure can be explained through data easily obtained via questionnaire

(Balakrishnan et al. 2013; Baumgartner et al. 2011; Clark et al. 2010; Hill et al. 2019).

Models were evaluated using standard regression fit metrics, including RMSE, R2, AIC, and BIC, and tests were

performed to check that regression assumptions were met. We compared single-zone model estimates with

those from linear statistical models that included a parsimonious set of variables, including survey-assessed

and sensor-based measures, and reduced sets of measurements that could be reasonably deployed as part of

large-scale assessments. Statistical modeling approaches for estimating pollutant concentrations followed

those described by Balakrishnan et al. (2013).
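For reference, the fit metrics named above can be computed from observed and predicted values as sketched here. The function name is hypothetical, and AIC and BIC are written using the Gaussian log-likelihood up to an additive constant, so values may differ from those reported by statistical software by a constant that does not change model ranking on the same data.

```python
import math


def fit_metrics(observed, predicted, n_params):
    """RMSE, R2, adjusted R2, AIC and BIC for a least-squares fit.

    n_params counts the estimated coefficients, including the intercept.
    Assumes the residual sum of squares is greater than zero.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - rss / tss
    return {
        "rmse": math.sqrt(rss / n),
        "r2": r2,
        "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - n_params),
        "aic": n * math.log(rss / n) + 2 * n_params,
        "bic": n * math.log(rss / n) + n_params * math.log(n),
    }
```

Adjusted R2, AIC, and BIC all penalize additional predictors, which is why they are preferred over plain R2 when comparing models with different numbers of covariates.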

5.4 Data Analysis

5.4.1 Sensor data

Data for this work were collected with a variety of devices, on multiple time scales. The general approach for

analysis of this type of data consisted of tethering the results to non-identifiable user identification numbers

and sample collection dates, which were then used to merge the sensor and survey data streams. Each

instrument had its own pre-processing code to import the raw instrument data, adjust per protocol (e.g. by


calibration factors), quality check it, and generate merged minute-wise time series for each deployment. The

data was then used to perform exposure modeling at various resolutions, including minute and daily levels.

5.4.2 Mass-balance single-zone model

Distributions of input data for the single-zone model (emission rates, ventilation rates, kitchen volumes, and

stove use times) were generated to determine the means, maxima, and minima needed for the model

(Equation 1). The model was run for PM2.5 using Monte Carlo simulations with the Risk Analyzer software

package and generated an output of room concentration distributions. These distributions were compared

against measured PM2.5 kitchen concentrations (both the standard single kitchen measure and the integrated

concentration estimate based on stratified measurements). The resulting distribution of exposure ratios,

calculated as the ratio of personal exposure to room concentration, was compared against

previously published ratios. We also present distributions of proximity between the participants and

microenvironment pollution monitors, and joint probability distributions between stove usage-by-proximity

categories and exposure, which helped to generate a modified exposure ratio, accounting for proximity to the

pollution source.

5.4.3 Regression models

Statistical analysis and model creation were performed in R (versions 3.6.2 and 4.0.2). We used univariate and

multivariate linear regression (MLR) to model PM2.5 exposures among primary cooks. Fifty households provided

data that passed quality checks for use in modeling. We imputed missing covariate data by using the

population-wide median values. The dependent variable – the cook’s measured exposure to PM2.5 – was

log-transformed to meet normality requirements. Predictor variables were not transformed. Models were run

separately for the entire dataset and for biomass and LPG users.
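The preprocessing steps described above (median imputation of covariates and a log-transformed dependent variable) can be illustrated with a single-predictor sketch. The analysis itself was done in R; the Python below is only an illustration, the function names are hypothetical, and the back-transformation shown is an assumption, since the report does not state how predictions were returned to the original scale.

```python
import math
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    med = statistics.median(v for v in values if v is not None)
    return [med if v is None else v for v in values]


def fit_log_linear(x, y):
    """Ordinary least squares of ln(y) on x; returns (intercept, slope)."""
    log_y = [math.log(v) for v in y]
    mean_x = sum(x) / len(x)
    mean_ly = sum(log_y) / len(log_y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, log_y))
    slope = sxy / sxx
    return mean_ly - slope * mean_x, slope


def predict_exposure(intercept, slope, x_new):
    """Back-transform a log-scale prediction to the exposure scale."""
    return math.exp(intercept + slope * x_new)
```

Note that exponentiating a log-scale prediction estimates the median (geometric mean) of exposure rather than its arithmetic mean, which matters when comparing predicted and measured means.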

Univariate models were fit to assess the relationship between exposure proxies and measured concentrations

and exposures. The use of kitchen concentrations and ratios of concentrations to exposure were assessed with

measures of correlations (Pearson’s r). Multivariate models were used to assess the relationship between

personal exposure and sociodemographic characteristics, stove-fuel energy use patterns, household

characteristics, and other physical measurements (time activity, stove use, etc.) in the home. Variable

selection occurred using multiple modalities. First, we used an automatic variable selection algorithm (from

the “leaps” R package) to pick parameters that optimized model comparison criteria, including

adjusted R2, Bayesian Information Criterion (BIC), and Mallows’ Cp (shortened to Cp). Models identified using

the automatic variable selection algorithm were further screened using 10-fold cross validation. The model

that minimized RMSE during 10-fold cross validation was selected for further evaluation. Second, based

on our prior knowledge and a review of the literature, we evaluated sets of predictors that we anticipate

would be easier to collect in the field using surveys, less intrusive monitoring devices placed in a kitchen, or

less intrusive personal monitors. Model performance was compared using adjusted R2 and RMSE. Finally, we

estimated a commonly used ratio known as the ‘Kitchen-Exposure Factor’ (KEF), the ratio of personal exposure

to kitchen concentrations, and compared it to a commonly used global KEF of 0.742. Estimates of global

KEF-derived exposures were compared to measured exposures using Wilcoxon Rank Sum tests.


6 Results

6.1 Single-zone model

Table 2 presents the summary statistics for the input parameters measured during the cooking events for the

single zone model, as well as the corresponding kitchen concentrations. The data are generally in line with

previously reported estimates (see footnote 3). Kitchen concentrations of PM2.5 and CO during cooking events are all reasonable

given the ranges of 24-hour exposures for these user groups (Balakrishnan et al. 2014; Johnson et al. 2018;

Pillarisetti et al. 2019; Pope et al. 2017). Emission rates are also in line with estimates for wood and charcoal

stoves (Johnson et al. 2019; Piedrahita et al. 2020). Our estimated LoD for PM2.5 emission rates was

approximately 5 mg/min, which was greater than the rates we measured here for LPG. We therefore used

the PM2.5 emission rates for LPG reported by Weyant et al. (2019) and Johnson et al. (2019), both of

which were able to measure them in the field. Overall, the data provided the anticipated variability, ranging

from very low emissions (LPG) to relatively high emissions from the wood stoves, with charcoal in between. Ambient

measurements of PM2.5 and CO were also made (see Table S3) and showed consistently low levels (6.8±5.4

µg/m3 for PM2.5 and 0.9±2.7 ppm for CO). Note that stove/fuel performance metrics not directly used in the modeling

efforts, including combustion efficiency, emission factors (PM2.5, CO, and black carbon), and firepower can be

found in Table S1 of the supporting information.

3 https://berkeleyair.shinyapps.io/who_input_data_v2/


Table 2. Summary statistics for cooking event measurements used in the single zone model.

| Parameter | Fuel | Mean | Median | SD | Range | n |
|---|---|---|---|---|---|---|
| Kitchen PM2.5 (μg/m3) | LPG | 135 | 92 | 130 | 10–531 | 28 |
| | Charcoal | 855 | 642 | 970 | 28–2442 | 6 |
| | Wood | 2048 | 1580 | 2870 | 71–16,161 | 29 |
| Kitchen CO (ppm) | LPG | 4.8 | 2.6 | 5.1 | 0.0–15.6 | 27 |
| | Charcoal | 70.4 | 63.8 | 73.9 | 5.9–150.2 | 4 |
| | Wood | 44.3 | 38.2 | 39.8 | 0.4–196.1 | 28 |
| PM2.5 emissions rate (mg/min) | LPG | 1* | NA | 0.5* | 0.1*–2.5* | NA |
| | Charcoal | 15 | 15 | 10 | 3–30 | 7 |
| | Wood | 159 | 147 | 65 | 69–343 | 29 |
| CO emissions rate (mg/min) | LPG | 0.0 | 2.9 | 3.8 | 0.2–15.9 | 30 |
| | Charcoal | 15.3 | 14.5 | 0.9 | 1.1–3.1 | 7 |
| | Wood | 1.7 | 1.5 | 0.8 | 0.5–3.6 | 29 |
| Ventilation (air changes/hour) | LPG | 14.3 | 12.2 | 8.3 | 5.5–40.0 | 30 |
| | Charcoal | 27.0 | 19.0 | 21.4 | 12.1–72.6 | 7 |
| | Wood | 18.3 | 17.5 | 7.9 | 7.1–38.7 | 32 |
| Kitchen volume (m3) | LPG | 16.5 | 13.2 | 12.2 | 5.4–51.9 | 32 |
| | Charcoal | 24.8 | 23.2 | 9.6 | 12.8–41.5 | 6 |
| | Wood | 25.0 | 22.3 | 11.46 | 11.1–49.6 | 32 |
| Event duration (minutes) | LPG | 45 | 45 | 28 | 7–125 | 31 |
| | Charcoal | 54 | 50 | 16 | 29–81 | 7 |
| | Wood | 58 | 54 | 26 | 21–116 | 30 |

*Assumed from Weyant et al. 2019 and Johnson et al. 2019


Figure 5. Relationship between the modeled and measured kitchen concentrations (PM2.5 top, CO below) during the

sampled cooking events.

The relationships between modeled and measured estimates of kitchen concentrations for PM2.5 and CO are

shown in Figure 5. There are clear positive correlations between the modeled and measured estimates,

following the anticipated trend of lower pollutant concentrations during LPG use, and higher when using the

biomass stoves; however, there is considerable scatter (PM2.5 model RMSE = 767 μg/m3; CO model RMSE =

30 ppm), with the model explaining 45% of the variability in the measured event concentrations of both PM2.5


and CO (PM2.5: R2 = 0.45, p < 0.01; CO: R2 = 0.45, p < 0.01). The model also overestimates the measured kitchen

concentrations (~10-fold for PM2.5 and ~6-fold for CO). This bias is similar to what was reported by Piedrahita

et al. 2019 and Johnson et al. 2011, who both found the model to overestimate measured concentrations in

the kitchen. There are several potential reasons for this bias, the most likely being the model’s

assumption that all emitted pollutants instantaneously and perfectly mix throughout the room. It is likely

that a substantial fraction of emissions escapes through windows, eaves, or other openings before mixing

throughout the room, and that mixing is incomplete, with higher concentrations pooling higher in the room. The

variability in mixing and stratification of pollutants also likely contributes substantially to the amount of scatter

in the plots. It is also evident that modeled PM2.5 emissions are clustered near the y-axis, potentially due to

setting the LPG emissions rate to 1 mg/min (LoD). This potential artifact may be shifting the overall

relationship between the modeled and measured concentrations, resulting in a high intercept.

Our stratified CO samples show mean concentrations sequentially increased from 14ppm at 1 meter above the

ground, to 20ppm at 1.5 meters above the ground (HAP standard protocol height), to 28ppm at 2 meters.

Stratified samples of CO by Johnson et al. 2011 suggested a similar pattern, and MacCarty et al. 2020 did a

systematic investigation of the model’s performance in a test kitchen, showing that PM2.5 concentrations

increased in an S-shaped curve, pooling at the ceiling. These two studies also found that a height of ~1.5

meters was likely the best proxy height to capture the average room concentration and/or exposure of a

standing adult.

To model 24-hour PM2.5 exposures, we applied simple correction factors (ratio of measured to modeled

means: 0.07, 0.24, and 0.50 for wood, charcoal, and LPG, respectively) to normalize model response to the

measured kitchen concentrations, then applied the measured KEFs (0.69, 0.85, and 0.83 for wood, charcoal, and

LPG, respectively). The model was run through a Monte Carlo simulation (10,000 iterations) for each of these

fuel user groups, defined by the fuel used most during the day on which exposure was measured.

All stoves used within the house for the given day were included.
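The sequence described above (draw inputs, run the single-zone model, apply the correction factor, then the KEF) can be sketched as below for the wood group. This is an illustration only and not the Risk Analyzer setup used in the study: the function names are hypothetical, the triangular distributions and the use of the Table 2 minimum, maximum, and mean as their low, high, and mode are assumptions, only one stove is simulated, and three evenly spaced cooking events per day are assumed.

```python
import math
import random


def daily_mean_concentration(q_mg_min, ach, volume_m3, minutes_per_event,
                             n_events=3):
    """24-hour mean of the single-zone model (background omitted for brevity)."""
    alpha = ach / 60.0                 # loss rate per minute
    decay = math.exp(-alpha)           # one-minute time step
    spacing = 1440 // n_events
    conc, total = 0.0, 0.0
    for minute in range(1440):
        burning = (minute % spacing) < minutes_per_event
        q_ug = q_mg_min * 1000.0 if burning else 0.0
        conc = q_ug / (alpha * volume_m3) * (1.0 - decay) + conc * decay
        total += conc
    return total / 1440


def simulate_exposures(n_iter, correction, kef, seed=1):
    """Draw inputs, run the model, then apply the bias correction and the KEF."""
    rng = random.Random(seed)
    exposures = []
    for _ in range(n_iter):
        # Assumed triangular(low, high, mode) inputs from the wood rows of Table 2
        q = rng.triangular(69, 343, 159)        # PM2.5 emission rate, mg/min
        ach = rng.triangular(7.1, 38.7, 18.3)   # air changes per hour
        vol = rng.triangular(11.1, 49.6, 25.0)  # kitchen volume, m3
        dur = rng.triangular(21, 116, 58)       # minutes per cooking event
        kitchen = daily_mean_concentration(q, ach, vol, dur)
        exposures.append(kitchen * correction * kef)
    return exposures
```

A call such as `simulate_exposures(10_000, correction=0.07, kef=0.69)` mirrors the iteration count and wood-group factors quoted in the text; the resulting distribution depends on the assumed input shapes and should not be expected to reproduce Table 3.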


Figure 6. Modeled and measured probability distributions (fitted) of PM2.5 exposures.

The modeled distributions generally compared well with the measured 24-hour PM2.5 exposures. Figure 6

shows the fitted distributions, illustrating the overlap between the modeled and measured estimates (LPG in

blue, charcoal in grey/black, wood in red/pink). The LPG and wood distributions compare most favorably, with

their interquartile ranges overlapping substantially, and mean and medians reasonably close (see Table 3).

The modeled exposures for charcoal were lower than the measured exposures (mean and median

values were ~70% and ~40% of the measured estimates), though this comparison is the most tenuous as only

seven samples were available for analysis.

Table 3. Comparison of modeled and measured 24 hour PM2.5 exposures (μg/m3).

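| Fuel | Statistic | Modeled | Measured |
|---|---|---|---|
| LPG | Mean | 41 | 43 |
| | Median | 19 | 29 |
| | 25th–75th percentile | 10–41 | 27–46 |
| | n | 10,000 | 19 |
| Charcoal | Mean | 80 | 115 |
| | Median | 43 | 110 |
| | 25th–75th percentile | 21–92 | 50–121 |
| | n | 10,000 | 7 |
| Wood | Mean | 296 | 225 |
| | Median | 207 | 182 |
| | 25th–75th percentile | 113–386 | 104–292 |
| | n | 10,000 | 21 |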

Overall, this approach shows promise that the model can be applied to estimate distributions of PM2.5

exposures, though care needs to be taken to ensure that inputs for normalizing the model account for bias.

Given the scatter in the relationship of individual estimates, we also do not recommend using the model to

predict exposures for specific households, but rather as a tool for understanding how group-level exposures may be

impacted by changes in stove use, stove performance, environmental conditions or other parameters that

may change over time.

6.2 Regression-based models

A number of model specifications were tested, starting with simple relationships, building up to more complex

models, and finally selecting parsimonious and physically reasonable models with the best fits. As with the

single-zone model, we focused on predicting PM2.5 due to the importance of its association with health

impacts. We first present the summary statistics for the variables included in model selection in Table 4. Note

that these values are slightly different from those from the single-zone modeling exercise, as the data


completeness changed with the exclusion of the direct emissions-related measurements. Summary findings

for the regression models are shown in Table 5. Among the models evaluated by the variable-identifying

algorithm, models with 6 or 7 predictors out of the >20 evaluated offered an optimal compromise

between RMSE, adjusted R2, and other model selection criteria.


Table 4. Summary statistics for the 24hr datasets included in the regression models (mean, standard deviation, minimum, 25th percentile, median, 75th percentile, maximum, and number of valid samples).

| Variable | Mean | SD | Min | 25th %-tile | Median | 75th %-tile | Max | n |
|---|---|---|---|---|---|---|---|---|
| Cook’s personal PM2.5 exposure (µg/m3) | 139 | 150 | 13 | 43 | 86 | 156 | 687 | 50 |
| Compliance (fraction of 24hr period monitors in motion) | 0.42 | 0.19 | 0.03 | 0.28 | 0.42 | 0.62 | 2 | 50 |
| Kitchen PM2.5 (µg/m3) | 492 | 673 | 26 | 49 | 192 | 695 | 3819 | 50 |
| Secondary Area PM2.5 (µg/m3) | 60 | 117 | 10 | 14 | 22 | 31 | 689 | 47 |
| Ambient PM2.5 (µg/m3) | 6 | 3 | 3 | 4 | 7 | 9 | 10 | 30 |
| Cook’s personal CO exposure (ppm) | 4.7 | 5.7 | 0.0 | 1.5 | 2.7 | 5.0 | 32 | 44 |
| Kitchen CO (ppm) | 16.5 | 22.6 | 0.0 | 2.9 | 8.9 | 20.3 | 131 | 48 |
| Secondary Area CO (ppm) | 5.0 | 7.7 | 0.0 | 0.0 | 1.1 | 8.7 | 38 | 49 |
| Ambient CO (ppm) | 1.2 | 2.9 | 0.0 | 0.0 | 0.0 | 0.2 | 10 | 38 |
| Traditional charcoal stove (minutes) | 78 | 192 | 0 | 0 | 0 | 0 | 881 | 50 |
| 3-stone fire (minutes) | 84 | 170 | 0 | 0 | 0 | 16 | 570 | 50 |
| Chepkube (minutes) | 51 | 86 | 0 | 0 | 0 | 79 | 410 | 50 |
| Total cooking time using all stoves (minutes) | 214 | 219 | 0 | 15 | 115 | 361 | 881 | 50 |
| Beacon threshold algorithm PM2.5 indirect exposure estimate (µg/m3) | 255 | 514 | 7 | 37 | 73 | 401 | 3483 | 50 |
| Beacon nearest algorithm CO indirect exposure estimate (ppm) | 9.4 | 12.1 | 0 | 2.3 | 6.1 | 13.8 | 76.0 | 50 |
| Beacon threshold algorithm CO indirect exposure estimate (ppm) | 5.1 | 5.6 | 0 | 0.6 | 2.9 | 8.0 | 24.0 | 50 |
| Number of walls in the kitchen with open eaves | 0.22 | 0.65 | 0 | 0 | 0 | 0 | 3 | 50 |
| Kitchen volume (m3) | 22.5 | 12.8 | 5.4 | 13.1 | 20.5 | 27.2 | 52.0 | 50 |
| Open door area (m2) | 2.1 | 1.3 | 0 | 1.7 | 1.9 | 2.6 | 6.0 | 50 |
| Socioeconomic status index | 1.56 | 2.9 | -2.35 | -1.05 | 0.93 | 4.19 | 7 | 50 |
| Air exchange rate (1/hr) | 17.4 | 8.2 | 0.1 | 11.4 | 17.1 | 20.8 | 40.0 | 48 |


For the overall dataset, survey data alone – consisting of a measure of kitchen volume, the primary stove type,

and a socioeconomic index composed of assets, housing characteristics, and other variables – had an adjusted

R2 of 0.51 and a root mean squared error of 130 µg/m3. The most predictive model (Figure 7) for the overall dataset

included stove type, measures of CO (personal, kitchen, and living room), kitchen volume and ventilation, and

a microenvironmental PM estimate. The R2 of this model was 0.75 and the RMSE was 85 (RMSE/mean of

measured exposures [nRMSE] = 0.61). Of note, and unsurprisingly, all predictive models poorly predicted

overall dispersion in biomass households, a phenomenon consistent with the literature.
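As a quick check on the normalization, the nRMSE quoted for the most predictive model follows from dividing its RMSE by the mean measured exposure of 139 µg/m3 reported in Table 4:

```latex
\mathrm{nRMSE} = \frac{\mathrm{RMSE}}{\bar{E}_{\mathrm{measured}}} = \frac{85}{139} \approx 0.61
```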

Figure 7. The relationship between predicted and measured personal exposure to PM2.5. The dotted line is a 1:1 line; dots represent individual datapoints; the blue line is a linear model including terms for personal, living room, and kitchen CO levels; kitchen volume and ventilation; and an estimate of PM concentrations derived from a microenvironmental location monitoring system.

Although the microenvironmental estimate using the beacon system (model 4) significantly improved model

performance over just kitchen PM2.5 data from model 2, it did not significantly improve the model fit over the

survey data (model 1), likely due to the strong predictive ability of the survey data in this data set. Similarly,

adding kitchen PM2.5 to the survey data (as in model 5) did not significantly improve model fit over the

survey-only model.


Table 5. Model performance and fit statistics for the personal PM2.5 exposure estimation models.

Overall (n = 50)

| Model | Description | Adj R2 | RMSE | Mean | SD | p° |
|---|---|---|---|---|---|---|
| Observed | | | | 139 | 151 | |
| 1 | Survey-type data* | 0.51 | 130 | 110 | 65 | 0.2 |
| 2 | Kitchen PM2.5 | 0.32 | 206 | 115 | 180 | 0.47 |
| 3 | Kitchen PM2.5 + CO | 0.43 | 139 | 117 | 137 | 0.45 |
| 4 | 1 + Microenvironmental PM2.5 Estimate | 0.53 | 137 | 113 | 88 | 0.29 |
| 5 | 1 + 2 | 0.52 | 131 | 112 | 80 | 0.27 |
| 6 | 1 + 3 | 0.59 | 112 | 118 | 106 | 0.42 |
| 7 | Stove Type + Personal, Kitchen, Living Room CO + Kitchen Volume + Microenvironmental PM Estimate‡ | 0.75 | 85 | 124 | 110 | 0.57 |

* includes primary stove type, a socioeconomic index, and kitchen volume

° comparing predicted and measured values

‡ identical to the model selected by the variable selection algorithm

Another common method of estimating exposure involves using a global ratio of exposure to kitchen

concentrations. The commonly accepted value of the so-called “Kitchen Exposure Factor” (KEF) is 0.742 (Smith

et al. 2014). For the current set of measurements, KEF was a poor predictor of exposure. The global KEF was

significantly different from the study-specific KEF (Wilcoxon rank sum test, p < 0.001), which exhibited wide

variability within and between primary stove types.

The distribution of KEFs by stove type is shown in Figure 8. The relatively high number of KEFs exceeding 1 in

LPG households indicates that exposure may have been driven by sources outside of the kitchen, consistent

with what one would expect for households that rely on LPG as a cooking fuel.


Figure 8. Distribution of the KEF – the ratio of personal exposure to kitchen concentration – in the current study. A KEF

less than one indicates that personal exposure is less than kitchen concentration, while a KEF greater than one indicates

that personal exposure exceeds kitchen concentration. In the current study, a relatively high number of LPG users had KEFs > 1, indicating their

exposure may be derived outside of the kitchen. “Traditional” stove class includes charcoal and wood stoves.

7 Discussion

7.1 Model application considerations

Collecting personal PM2.5 exposure data is challenging due to cost, technical requirements, and logistical

considerations, making investigation of predictive models an important exercise, especially for HAP given the

health implications. To date, HAP exposure models have not been widely applied and performance has varied.

Several past studies have found that survey data provide statistically significant predictive power for

exposure estimation (Baumgartner et al., 2011; Clark et al., 2010). Others have reported

comparisons of kitchen and personal PM2.5 with varying success (Balakrishnan et al., 2004; Piedrahita et al.,

2017; Liao et al. 2019), and still others have looked to personal CO exposure (an easier measurement to conduct)

to estimate personal PM2.5 exposure, but found it to be an unreliable surrogate in many settings (Carter et al.,

2017). Dionisio et al. (2012) assessed personal exposure modeling performance of CO in children using

survey-type data and found a model RMSE of 0.86 ppm (nRMSE = 0.74), though this is not directly comparable to

our work because CO rather than PM2.5 was the dependent variable. Dionisio et al. (2008) also assessed

the performance of a model in predicting child PM2.5 exposure (n=31) from personal CO, survey data, and

kitchen PM2.5 concentrations, but did not find strong relationships in any of the model permutations (r<0.01).

Hill et al. (2019) applied regression models and machine learning models to estimate personal PM2.5 exposure

(n=36) in a rural area of Laos, but reported adjusted R2 values below 0.3 and an RMSE of 40.0 µg/m3 (nRMSE = 0.39).


The HAP exposure models presented here showed promise and potential improvements in performance over

similar past works. The better performance may be the result of factors that vary between sites (e.g. higher

exposure contrasts), or possibly of recent improvements in measurement techniques. The varied performance

across studies also sounds a note of caution: even with carefully conducted studies, good model performance is not

assured. Still, the simple survey-based model, which predicted PM2.5 exposures with an adjusted R2 of 0.51, represents a

potentially important advancement, as surveys are the simplest data collection tool available. The best

performing model for the entire dataset used a combination of survey-based data and measurements, resulting

in an R2 of 0.75 and a root mean squared error of 85 µg/m3 (Table 5). Should those predictive capacities be robust,

these modeling approaches would provide substantial value in reducing the need for, or the extent of, costly and

complicated exposure studies.

While not a focus of this study, the large exposure contrasts which aided this modeling effort also suggest that

LPG use was having an important impact on exposure to HAP. Median exposure for the LPG users in our

subsample was 29 µg/m3, below the WHO annual interim target 1 of 35 µg/m3, and median longer-term

kitchen concentrations were 58 µg/m3 (see Table S4). A full analysis and write-up on the implications of LPG

on exposure will be conducted and presented by the CLEAN-Air(Africa) Global Health Research Group.

Finally, we found that the single-zone model used by WHO and ISO for setting emission targets did reasonably

well predicting kitchen concentrations and exposures, once its systematic overestimation was adjusted for.

The combination of the single-zone model’s correlation with measured kitchen concentrations and systematic

overestimation suggests it is a reasonable tool for setting performance targets as it provides a conservative

approach for linking emission rates with indoor air quality. Given that the model overestimates kitchen

concentrations, by potentially up to an order of magnitude, there may be room to adjust the modeling

approach such that it provides more reasonable estimates of kitchen concentrations while still erring on the

side of conservativeness.

7.2 Limitations

There is potential to reduce the burden of data collection on participants for large-scale projects, as the

performance of the exposure estimation regression models that used survey data or less intrusive

concentration measurements moderately explained exposure variability. With the ability to collect survey,

stove usage and household air pollution data over multiple days, there are scenarios where this approach

should be considered, including instances where long-term trends are of interest. While the models from this

work generally performed better than past efforts, it is not clear that the results are repeatable in different contexts.

We opted to use our resources for conducting a relatively comprehensive set of measurements on potential

predictors, which resulted in somewhat small sample sizes. While this tradeoff seemed appropriate given our

goal of exploring new modeling approaches for their potential utility, the smaller sample sizes limited our

power to evaluate with more certainty which predictors and approaches were strongest.

Although the sample sizes were relatively small, data management for this study still presented a substantial

challenge given the number and types of measurements involved and could likewise be problematic if several

data streams are required for various model inputs. Finding the right balance in terms of expected predictive


ability and data collection cost and analysis complexity is difficult, and likely will differ with the technical

capacity of the group conducting the work. Of course, as improved methods and equipment continue to reach

a wider audience, future analyses can become more streamlined.

It should be noted that any use of these types of models must be considered exploratory unless model

validation is performed. There are many idiosyncrasies related to specific contexts that can affect predictive

models, such as the intensity of neighbors cooking with dirty fuels, community-level use of polluting fuels,

ambient air pollution, temperature, and housing characteristics. In this study, the low housing density and low

ambient air pollution levels (Table S3) provided relatively low variability in environmental conditions, allowing

the majority of air pollution exposure to be assumed as a result of cooking. Thus, fuel type indicators may not

perform as strongly in other contexts.

7.3 Recommendations

We recommend testing these models in different geographies or fuel use scenarios as they were developed

from a single study community. Continuing to build and test HAP exposure models in different contexts

(cooking fuels, geographies, stove use patterns, housing characteristics) would enable more robust evaluation

of how they can be extended to other contexts. For clean cooking standards, additional characterization of the

single-zone model’s bias (more regions, fuel types, housing types, etc.) would help support potential

modifications to the model’s application for deriving performance targets. Future modeling efforts would also

benefit from machine-learning approaches, including both supervised and unsupervised methods. These

approaches have shown some promise in generating reasonable predictive power, especially when combined

with traditional statistical modeling approaches. De-aggregated real-time data may enhance machine-learning

predictive power, by identifying data features that may be predictive of mean PM2.5 exposures. We also

recommend exploration of additional predictors that may be relevant for exposure prediction and considering

inclusion of the single-zone model based exposure estimates in machine learning models as predictors.

Conducting this work in Kenya was especially timely, as the country has a relatively strong market for

improved biomass cookstoves and clean cooking energy sources, as well as innovative consumer finance programs

such as pay-as-you-go systems and microfinance. Kenya’s transition towards cleaner stoves and fuels is being

aided by active involvement in the development and adoption of ISO performance standards, which is part of

the country’s work with the World Health Organization to expedite the energy transition. The models

developed here are clearly most applicable to the Kenyan context, and ideally can be used to help characterize

the HAP exposure implications of energy transitions, as well as strengthen the implementation of Kenya’s

cookstove standards framework.


References

Balakrishnan K, Mehta S, Ghosh S, Johnson MA, Brauer M, Naeher L, et al. 2014. WHO Guidelines for

Indoor Air Quality: Household Fuel Combustion - Population levels of household air pollution and

exposures.

Balakrishnan K, Ghosh S, Ganguli B, Sambandam S, Bruce N, Barnes DF, et al. 2013. Modeling national

average household concentrations of PM2.5 from solid cookfuel use for the global burden of

disease-2010 assessment: results from cross-sectional assessments in India. Environmental Health

12:77; doi:10.1186/1476-069X-12-77.

Baumgartner J, Schauer JJ, Ezzati M, Lu L, Cheng C, Patz J, et al. 2011. Patterns and predictors of personal

exposure to indoor air pollution from biomass combustion among women and children in rural

China. Indoor Air; doi:10.1111/j.1600-0668.2011.00730.x.

Carter E, Archer-Nicholls S, Ni K, Lai AM, Niu H, Secrest MH, et al. 2016. Seasonal and Diurnal Air Pollution

from Residential Cooking and Space Heating in the Eastern Tibetan Plateau. Environ Sci Technol

50:8353–8361; doi:10.1021/acs.est.6b00082.

Carter, E., Norris, C., Dionisio, K. L., Balakrishnan, K., Checkley, W., Clark, M. L., Ghosh, S., Jack, D. W.,

Kinney, P. L., Marshall, J. D., Naeher, L. P., Peel, J. L., Sambandam, S., Schauer, J. J., Smith, K. R.,

Wylie, B. J., & Baumgartner, J. (2017). Assessing Exposure to Household Air Pollution: A Systematic

Review and Pooled Analysis of Carbon Monoxide as a Surrogate Measure of Particulate Matter.

Environmental Health Perspectives, 125(7); doi:10.1289/EHP767

Clark ML, Peel JL, Balakrishnan K, Breysse PN, Chillrud SN, Naeher LP, et al. 2013. Health and Household Air

Pollution from Solid Fuel Use: The Need for Improved Exposure Assessment. Environmental Health

Perspectives; doi:10.1289/ehp.1206429.

Clark ML, Reynolds SJ, Burch JB, Conway S, Bachand AM, Peel JL. 2010. Indoor air pollution, cookstove

quality, and housing characteristics in two Honduran communities. Environ Res 110:12–18;

doi:10.1016/j.envres.2009.10.008.

Cowlin SC. 2005. Tracer Decay for Determining Kitchen Ventilation Rates in San Lorenzo, Guatemala.

Maxwell Student Projects, Max-04-4, EHS, School of Public Health, University of California,

Berkeley 1: 2.

Dionisio, K. L., Howie, S., Fornace, K. M., Chimah, O., Adegbola, R. A., & Ezzati, M. (2008). Measuring the

exposure of infants and children to indoor air pollution from biomass fuels in The Gambia. Indoor

Air, 18(4), 317–327. https://doi.org/10.1111/j.1600-0668.2008.00533.x

Dionisio, K. L., Howie, S. R. C., Dominici, F., Fornace, K. M., Spengler, J. D., Donkor, S., Chimah, O.,

Oluwalana, C., Ideh, R. C., Ebruke, B., Adegbola, R. A., & Ezzati, M. (2012). The exposure of infants

and children to carbon monoxide from biomass fuels in The Gambia: A measurement and


modeling study. Journal of Exposure Science & Environmental Epidemiology, 22(2), 173–181.

doi:10.1038/jes.2011.47

Garland C, Delapena S, Prasad R, L’Orange C, Alexander D, Johnson M. 2017. Black carbon cookstove

emissions: A field assessment of 19 stove/fuel combinations. Atmospheric Environment

169:140–149; doi:10.1016/j.atmosenv.2017.08.040.

Hill LD, Pillarisetti A, Delapena S, Garland C, Pennise D, Pelletreau A, et al. 2019. Machine-learned

modeling of PM2.5 exposures in rural Lao PDR. Science of The Total Environment 676:811–822;

doi:10.1016/j.scitotenv.2019.04.258.

IHME. 2018. GBD Compare | IHME Viz Hub. Available: http://vizhub.healthdata.org/gbd-compare

[accessed 23 May 2018].

ISO. 2018. Technical Report 19867-3: Clean cookstoves and clean cooking solutions — Harmonized

laboratory test protocols — Part 3: Voluntary performance targets for cookstoves based on

laboratory testing.

Johnson M, Piedrahita R, Garland C, Pillarisetti A, Sambandam S, Gurusamy T, et al. 2018. Exposures to

PM2.5 associated with LPG stove and fuel interventions: Pilot results from the HAPIN Trial.

Johnson, M., Lam, N., Brant, S., Gray, C., & Pennise, D. (2011). Modeling indoor air pollution from

cookstove emissions in developing countries using a Monte Carlo single-box model. Atmospheric

Environment, 45(19), 3237–3243; doi:10.1016/j.atmosenv.2011.03.044

Johnson M, Edwards R, Alatorre Frenk C, Masera O. 2008. In-field greenhouse gas emissions from

cookstoves in rural Mexican households. Atmospheric Environment 42:1206–1222;

doi:10.1016/j.atmosenv.2007.10.034.

Johnson M, Lam N, Wofchuck T, Edwards R, Pennise D. 2009. In-field charcoal stove emission factors and

indoor air pollution in Nairobi, Kenya.

Johnson M, Smith K, Edwards R, Morawska L, Nicas M. 2014. WHO Guidelines for Indoor Air Quality:

Household Fuel Combustion - Model for linking household energy use with indoor air quality.

Johnson MA, Garland CR, Jagoe K, Edwards R, Ndemere J, Weyant C, et al. 2019. In-Home Emissions

Performance of Cookstoves in Asia and Africa. Atmosphere 10; doi:10.3390/atmos10050290.

Liao J, McCracken JP, Piedrahita R, Thompson L, Mollinedo E, Canuz E, et al. 2019. The use of bluetooth

low energy Beacon systems to estimate indirect personal exposure to household air pollution.

Journal of Exposure Science & Environmental Epidemiology 1–11; doi:10.1038/s41370-019-0172-z.

MacCarty, N., Bentson, S., Cushman, K., Au, J., Li, C., Murugan, G., & Still, D. (2020). Stratification of

particulate matter in a kitchen: A comparison of empirical to predicted concentrations and

implications for cookstove emissions targets. Energy for Sustainable Development, 54, 14–24.

https://doi.org/10.1016/j.esd.2019.09.006


Ochieng CA, Vardoulakis S, Tonne C. 2013. Are rocket mud stoves associated with lower indoor carbon

monoxide and personal exposure in rural Kenya? Indoor Air 23:14–24;

doi:10.1111/j.1600-0668.2012.00786.x.

Piedrahita, R., Kanyomse, E., Coffey, E., Xie, M., Hagar, Y., Alirigia, R., Agyei, F., Wiedinmyer, C., Dickinson,

K. L., Oduro, A., & Hannigan, M. (2017). Exposures to and origins of carbonaceous PM2.5 in a

cookstove intervention in Northern Ghana. Science of The Total Environment, 576, 178–192.

https://doi.org/10.1016/j.scitotenv.2016.10.069

Piedrahita R, Coffey ER, Hagar Y, Kanyomse E, Verploeg K, Wiedinmyer C, et al. Attributing Air Pollutant

Exposure to Emission Sources with Proximity Sensing. Atmosphere. 2019 Jul 13;10(7):395.

Piedrahita, R., Johnson, M., Bilsback, K. R., L’Orange, C., Kodros, J. K., Eilenberg, S. R., Naluwagga, A., Shan,

M., Sambandam, S., Clark, M., Pierce, J. R., Balakrishnan, K., Robinson, A. L., & Volckens, J. (2020).

Comparing regional stove-usage patterns and using those patterns to model indoor air quality

impacts. Indoor Air, n/a(n/a). https://doi.org/10.1111/ina.12645

Pillarisetti, A., Carter, E., Rajkumar, S., Young, B. N., Benka-Coker, M. L., Peel, J. L., Johnson, M., & Clark, M. L. (2019). Measuring personal exposure to fine particulate matter (PM2.5) among rural Honduran women: A field evaluation of the Ultrasonic Personal Aerosol Sampler (UPAS). Environment International, 123, 50–53. https://doi.org/10.1016/j.envint.2018.11.014

Pillarisetti, A., Ghorpade, M., Madhav, S., Dhongade, A., Roy, S., Balakrishnan, K., Sankar, S., Patil, R., Levine, D. I., Juvekar, S., & Smith, K. R. (2019). Promoting LPG usage during pregnancy: A pilot study in rural Maharashtra, India. Environment International, 127, 540–549. https://doi.org/10.1016/j.envint.2019.04.017

Pope, D., Bruce, N., Dherani, M., Jagoe, K., & Rehfuess, E. (2017). Real-life effectiveness of ‘improved’

stoves and clean fuels in reducing PM2.5 and CO: Systematic review and meta-analysis.

Environment International, 101, 7–18; doi:10.1016/j.envint.2017.01.012.

Roden CA, Bond TC, Conway S, Pinel ABO. 2006. Emission factors and real-time optical properties of

particles emitted from traditional wood burning cookstoves. Environ Sci Technol 40: 6750–6757.

Senelwa K. 2016. Kenya to subsidise cost of gas cylinders. The East African.

Shrivastava, Alankar. (2011). Methods for the determination of limit of detection and limit of quantitation

of the analytical methods. Chronicles of Young Scientists. 2. 21-25. 10.4103/2229-5186.79345.

Smith KR, Bruce N, Balakrishnan K, Adair-Rohani H, Balmes J, Chafe Z, et al. 2014. Millions Dead: How Do

We Know and What Does It Mean? Methods Used in the Comparative Risk Assessment of

Household Air Pollution. Annual Review of Public Health 35:185–206;

doi:10.1146/annurev-publhealth-032013-182356.

United Nations Statistics Division. 2018. UNdata | record view | Population using solid fuels, percentage.

Available: http://data.un.org/Data.aspx?d=MDG&f=seriesRowID%3A712 [accessed 23 May 2018].


Vyas, S., & Kumaranayake, L. (2006). Constructing socio-economic status indices: How to use principal

components analysis. Oxford University Press; doi:10.1093/heapol/czl029

WBT Technical Committee. 2014. Water Boiling Test Protocol: Version 4.2.3.

Weyant, C. L., Thompson, R., Lam, N. L., Upadhyay, B., Shrestha, P., Maharjan, S., Rai, K., Adhikari, C., Fox,

M. C., & Pokhrel, A. (2019). In-Field Emission Measurements from Biogas and Liquified Petroleum

Gas (LPG) Stoves. Atmosphere, 10(12), 729; doi:10.3390/atmos10120729

World Health Organization (WHO). Disease Burden and Mortality Estimates. 2000-2015. World Health

Organization, Health Statistics and Mortality Estimates. Retrieved from:

http://www.who.int/healthinfo/globalburdendisease/estimates/en/index1.html. (2015).

Wilson DL, Williams KN, Pillarisetti A. 2020. An Integrated Sensor Data Logging, Survey, and Analytics

Platform for Field Research and Its Application in HAPIN, a Multi-Center Household Energy

Intervention Trial. Sustainability 12:1805; doi:10.3390/su12051805.

Yip F, Christensen B, Sircar K, Naeher L, Bruce N, Pennise D, et al. 2017. Assessment of traditional and

improved stove use on household air pollution and personal exposures in rural western Kenya.

Environment International 99:185–191; doi:10.1016/j.envint.2016.11.015.

Yuchi, W., Gombojav, E., Boldbaatar, B., Galsuren, J., Enkhmaa, S., Beejin, B., Naidan, G., Ochir, C., Legtseg, B., Byambaa, T., Barn, P., Henderson, S. B., Janes, C. R., Lanphear, B. P., McCandless, L. C., Takaro, T. K., Venners, S. A., Webster, G. M., & Allen, R. W. (2019). Evaluation of random forest regression and multiple linear regression for predicting indoor fine particulate matter concentrations in a highly polluted city. Environmental Pollution, 245, 746–753. https://doi.org/10.1016/j.envpol.2018.11.034


Supplementary Information

S1 Emissions performance

While the focus of this work was on developing and evaluating models to predict exposure to household air

pollution, stove performance metrics were calculated and are presented below in Table S1. LPG had very

high modified combustion efficiency (CO2/[CO2+CO] molar) as expected, indicating that almost all fuel

carbon was being converted into CO2. Charcoal stoves had the highest CO emissions, as is common given the

surface-oxidation combustion process of the fuel. Wood stoves had the highest PM2.5 and black carbon

emission factors. Wood also had a higher BC/PM2.5 ratio, suggesting its aerosol emissions were potentially

more warming, but the climate impacts are difficult to characterize based on the limited set of point source

emissions, especially as the majority of emissions for charcoal are generated during its production.
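The modified combustion efficiency defined above (CO2/[CO2+CO] on a molar basis) can be illustrated with a minimal sketch. It assumes background-subtracted CO2 and CO mixing ratios in ppm, which are molar quantities; the function name and the input values are hypothetical and are not taken from the study's analysis code.

```python
def modified_combustion_efficiency(co2_ppm, co_ppm):
    """Return MCE (%) from background-subtracted CO2 and CO mixing ratios.

    Mixing ratios in ppm are molar quantities, so their ratio is molar.
    """
    total = co2_ppm + co_ppm
    if total <= 0:
        raise ValueError("CO2 + CO must be positive")
    return 100.0 * co2_ppm / total


# Illustrative (made-up) plume enhancements above background, in ppm.
mce_percent = modified_combustion_efficiency(co2_ppm=990.0, co_ppm=10.0)
print(f"MCE = {mce_percent:.1f}%")
```

An MCE near 100% indicates that nearly all emitted carbon leaves as CO2, which is the pattern reported for LPG in Table S1.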

Table S1. Stove/fuel performance from measurements during cooking events.

| Metric | LPG | Charcoal | Wood |
|---|---|---|---|
| Modified combustion efficiency (%) | 99.1±0.8 (30) | 80.8±0.1 (7) | 94.0±2.4 (29) |
| Firepower (kW) | 1.60±0.64 (32) | 2.53±0.58 (7) | 7.15±1.77 (32) |
| PM2.5 emission factor (g/kg) | BDL | 3.17±2.18 (7) | 6.70±2.96 (29) |
| BC emission factor (g/kg) | BDL | 0.26±0.23 (7) | 0.87±0.51 (29) |
| CO emission factor (g/kg) | 17.7±15.8 (30) | 373.2±110.0 (7) | 67.9±27.7 (29) |
| BC/PM2.5 | BDL | 0.11±0.11 (7) | 0.15±0.13 (29) |

BDL = below detection limit

S2 Housing characteristics and socioeconomic status

The figure below shows the distributions of the air exchange rates, room volumes, and cooking event

durations for monitored cooking events throughout the sample. These characteristics are key inputs

for the WHO and ISO physical models. The mean, standard deviation, and sample size are noted.
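To show how these three inputs interact, the following is a minimal sketch of a generic well-mixed single-zone mass balance, dC/dt = fG/V - aC, solved for a constant emission rate. It is offered as an illustration only and is not necessarily the exact formulation used in the WHO or ISO models; the function name, the parameter names, and the example values are assumptions.

```python
import math


def box_concentration(emission_rate_mg_per_min, volume_m3, aer_per_hr,
                      minutes, c0=0.0, fraction=1.0):
    """Concentration (mg/m3) after `minutes` of constant emission.

    Assumes a single well-mixed zone and a positive air exchange rate.
    `fraction` is the share of emissions entering the room (hypothetical).
    """
    alpha = aer_per_hr / 60.0  # air exchange rate per minute
    steady = fraction * emission_rate_mg_per_min / (alpha * volume_m3)
    # Exponential approach from the initial concentration to steady state.
    return steady + (c0 - steady) * math.exp(-alpha * minutes)


# Illustrative (made-up) inputs: 10 mg/min source, 30 m3 kitchen, 20 air
# changes per hour, evaluated at the end of a 60-minute cooking event.
print(box_concentration(10.0, 30.0, 20.0, minutes=60))
```

In this form a larger kitchen volume or a higher air exchange rate lowers the steady-state concentration, while a longer cooking event brings the room closer to that steady state.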


Figure S1. Distributions of kitchen volumes, air exchange rates, and cooking times (key WHO/ISO physical model

inputs)

The table below shows the socioeconomic index results for the full sample. The table is split into two parts: the average

fraction of homes possessing a given characteristic that contributes to the index, and the standard deviation of home

responses, which shows the distribution of that characteristic for a given category.

Table S2. Socioeconomic index results. The average fraction of homes columns show the fraction of homes owning an asset or possessing some characteristic, grouped by the index categorization. The standard deviation columns show the distribution of the occurrences of those assets for the given category.

Columns 1 to 5 are the index categories, from the poorest quintile (category 1) to the wealthiest quintile (category 5).

| Ownership or possession | Average fraction of homes: 1 | 2 | 3 | 4 | 5 | Standard deviation of home responses: 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|---|---|---|---|
| Own the land/home they live in | 0.97 | 0.82 | 0.79 | 0.77 | 0.67 | 0.16 | 0.38 | 0.41 | 0.42 | 0.47 |
| Animal(s) (cows, sheep, etc.) | 0.87 | 0.78 | 0.75 | 0.73 | 0.60 | 0.34 | 0.42 | 0.43 | 0.44 | 0.49 |
| Cellphone | 0.94 | 0.88 | 0.78 | 0.72 | 0.59 | 0.24 | 0.32 | 0.42 | 0.45 | 0.49 |
| Smartphone | 0.04 | 0.18 | 0.46 | 0.77 | 0.87 | 0.19 | 0.38 | 0.50 | 0.42 | 0.33 |
| Radio | 0.60 | 0.69 | 0.74 | 0.84 | 0.84 | 0.49 | 0.46 | 0.44 | 0.37 | 0.36 |
| Hi-Fi/CD-player | 0.00 | 0.01 | 0.04 | 0.26 | 0.58 | 0.05 | 0.10 | 0.19 | 0.44 | 0.49 |
| Solar connection | 0.42 | 0.30 | 0.28 | 0.15 | 0.15 | 0.49 | 0.46 | 0.45 | 0.36 | 0.35 |
| Electricity Connection | 0.00 | 0.24 | 0.57 | 0.81 | 0.90 | 0.05 | 0.43 | 0.50 | 0.39 | 0.30 |
| TV | 0.05 | 0.22 | 0.54 | 0.84 | 0.85 | 0.22 | 0.42 | 0.50 | 0.37 | 0.36 |
| Satellite TV | 0.01 | 0.19 | 0.29 | 0.47 | 0.65 | 0.09 | 0.39 | 0.46 | 0.50 | 0.48 |
| Refrigerator/fridge/freezer | 0.00 | 0.00 | 0.00 | 0.02 | 0.41 | 0.00 | 0.00 | 0.05 | 0.14 | 0.49 |
| Shower/bath within house | 0.00 | 0.00 | 0.02 | 0.05 | 0.54 | 0.00 | 0.07 | 0.12 | 0.23 | 0.50 |
| Land | 0.93 | 0.79 | 0.77 | 0.78 | 0.72 | 0.26 | 0.41 | 0.42 | 0.41 | 0.45 |
| Bicycle | 0.08 | 0.11 | 0.19 | 0.23 | 0.33 | 0.26 | 0.31 | 0.39 | 0.42 | 0.47 |
| Moped/Motorcycle | 0.06 | 0.11 | 0.16 | 0.17 | 0.11 | 0.24 | 0.31 | 0.36 | 0.38 | 0.32 |
| Pick-up truck | 0.00 | 0.01 | 0.02 | 0.04 | 0.07 | 0.00 | 0.09 | 0.15 | 0.19 | 0.25 |
| Car | 0.00 | 0.01 | 0.03 | 0.10 | 0.43 | 0.00 | 0.11 | 0.17 | 0.30 | 0.50 |
| Computer | 0.00 | 0.00 | 0.01 | 0.03 | 0.21 | 0.00 | 0.07 | 0.08 | 0.17 | 0.41 |
| Washing machine | 0.00 | 0.00 | 0.01 | 0.01 | 0.03 | 0.00 | 0.00 | 0.08 | 0.09 | 0.17 |
| Tractor | 0.01 | 0.01 | 0.03 | 0.04 | 0.08 | 0.08 | 0.09 | 0.17 | 0.19 | 0.28 |
| Septic or Flushing Toilet Inside | 0.00 | 0.00 | 0.02 | 0.09 | 0.63 | 0.00 | 0.00 | 0.12 | 0.29 | 0.48 |
| Latrine in Compound | 1.00 | 0.97 | 0.99 | 0.99 | 0.86 | 0.00 | 0.17 | 0.10 | 0.11 | 0.35 |
| Use LPG | 0.01 | 0.28 | 0.53 | 0.80 | 0.95 | 0.11 | 0.45 | 0.50 | 0.40 | 0.22 |
| pca_score | -2.17 | -1.40 | -0.51 | 0.70 | 3.38 | 0.21 | 0.25 | 0.31 | 0.38 | 1.51 |
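The pca_score row suggests that the index is the first principal component of the asset indicators, in the spirit of Vyas & Kumaranayake (2006). The sketch below shows one way such a score could be computed, assuming 0/1 asset indicators that are standardized before extraction; it uses plain power iteration rather than a statistics library, and the function name and the four example households are hypothetical. The study's actual procedure may differ.

```python
def first_pc_scores(rows, iters=200):
    """Score each household on the first principal component.

    `rows` is a list of equal-length lists of 0/1 asset indicators.
    Columns are standardized, so this is PCA on the correlation matrix.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) ** 0.5
           for j in range(p)]
    # Standardize; columns with no variation contribute nothing.
    z = [[(r[j] - means[j]) / sds[j] if sds[j] > 0 else 0.0
          for j in range(p)] for r in rows]
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    # Power iteration converges to the leading eigenvector (the loadings).
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [sum(zr[j] * v[j] for j in range(p)) for zr in z]


# Four made-up households with increasing asset ownership.
households = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
scores = first_pc_scores(households)
print([round(s, 2) for s in scores])
```

Households would then be ranked by score and cut into quintiles to form categories 1 to 5.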

S3 PM2.5 and CO household air pollution (HAP) and personal exposure concentrations

for the study population

This figure shows a typical 24-hr monitoring period time series, with all plots showing by-minute data. The top

frame shows the PM2.5 time series for the cook’s personal exposure (red), and kitchen concentrations (teal)

from the MicroPEM devices; the second frame shows the PM2.5 concentration data from the PATS+ devices

placed in the kitchen (directly adjacent to the kitchen MicroPEM for inter-comparability and redundancy); the

third frame shows the indirect exposure estimates using three different Beacon localization methods and


the associated concentrations from the PATS+ monitors in the given rooms; the fourth frame shows the

localization assignment using the three different localization approaches explained previously (color indicates

room assignment); the fifth frame indicates stove usage (teal signifies the periods of cooking with the LPG

stove, and red signifies the periods without cooking).

Figure S2. A typical 24-hr monitoring period time series for a single household, with all plots showing by-minute data for all instruments used.

S4 Characterization of long-term stove use patterns

Stove usage data was collected at 91 households, for durations ranging from 48 hours to 6 months. Results are

presented by study group (biomass as the primary fuel, charcoal as primary, and LPG as primary). Below, an

example time series is presented for a home, showing the diurnal temperature trends typical in SUMs


measurements, and the peaks produced by cooking events. This time series shows one week of stove usage

for a home primarily using charcoal. The household appears to use both stoves on days when it cooks; on

November 3-5, no cooking is observed.

Figure S3. A typical temperature trace for a single home, showing the diurnal temperature trends typical in SUMs measurements, and the peaks produced by cooking events on two different stoves.

S4.1 Average events per day by group

In households that reported biomass as their primary fuel (n=34), SUMS analysis showed that there were an

average of 2.4 Chepkube stove use events per day (n=19), 1.9 charcoal stove use events per day (n=12), 1.5

LPG stove use events per day (n=2), and 1 stove use event per day on the three-stone fire in the sample (n=1).

In households that reported charcoal as their primary fuel (n=4), we found an average of 1.6 charcoal stove

use events per day (n=3), 1.6 Chepkube stove use events per day (n=1), 1.5 three-stone fire stove use events

per day (n=1), and no kerosene stove use events in the single kerosene stove included in the sample (n=1). In

households that reported LPG as their primary fuel (n=53), we found an average of 2.8 LPG stove use events

per day (n=51), 2.9 Chepkube stove use events per day (n=2), and 1.2 charcoal stove use events per day

(n=15).
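Events per day and minutes of use per day can both be derived from a SUMs temperature trace once periods of use are flagged. The sketch below uses a simple fixed temperature threshold, which is an assumption made for illustration; this section does not describe the event-detection algorithm actually applied, and the function name, the threshold, and the example trace are hypothetical.

```python
def summarize_stove_use(temps_c, threshold_c=50.0, minutes_per_sample=1):
    """Return (number_of_events, minutes_in_use) for one temperature trace.

    A sample counts as "in use" when it is at or above the threshold, and
    an event is an unbroken run of in-use samples.
    """
    events = 0
    minutes = 0
    in_event = False
    for temp in temps_c:
        hot = temp >= threshold_c
        if hot:
            minutes += minutes_per_sample
            if not in_event:
                events += 1  # a new run of hot samples starts here
        in_event = hot
    return events, minutes


# Made-up by-minute trace (degrees C) containing two short cooking events.
trace = [25, 25, 80, 90, 85, 30, 25, 70, 75, 28]
print(summarize_stove_use(trace))
```

Averaging these daily values over the monitoring period for each stove gives quantities comparable to the per-stove means reported here.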


Figure S4. Average stove use events per day, divided by stove group. The labelled central point shows the mean, the line within the box plot shows the median, and the lower and upper bounds of the box show the lower and upper quartiles. The n’s are shown below each box for each stove type within the primary group.

S4.2 Average minutes per day by group

In households that reported biomass as their primary fuel (n=34), SUMS analysis showed there was an average

of 208.8 minutes of cooking on a Chepkube stove per day (n=19), an average of 210 minutes of cooking on a

charcoal stove per day (n=12), an average of 32.4 minutes of cooking on an LPG stove per day (n=2) and an

average of 200 minutes of cooking per day on the single three stone fire included in the group (n=1). In

households that reported charcoal as their primary fuel (n=4), we found averages of 355 and 333.3 minutes

of cooking on the single three-stone fire and the single Chepkube stove included in the sample, respectively, and a

negligible amount of cooking on the single kerosene stove included in the sample. In households that reported

LPG as their primary fuel (n=53), we found an average of 92.4 minutes of cooking on an LPG stove per day

(n=52), 161.4 minutes of cooking on a Chepkube stove per day (n=2), and 77.5 minutes of cooking on a

charcoal stove per day (n=15).


Figure S5. Average minutes of indicated use per day for each stove, divided by stove group. The labelled central point

shows the mean, the line within the box plot shows the median, and the lower and upper bounds of the box show the

lower and upper quartiles. The n’s are shown below each box for each stove type within the primary group.

S4.3 Temporal trends

Temporal stove usage trends were assessed by comparing distributions of the total daily cooking time (the

sum of usage from all stoves in a given household) by day of week. No clear patterns emerged from this,

indicating that the usage did not vary by day of week, as it does in some regions.


Figure S6. Distribution of minutes of indicated use on each stove for each day of the week, divided by stove group. The lack of a clear pattern across days of the week indicates that usage did not vary by day of the week.

The usage patterns by time of day showed clear increases in usage at typical breakfast, lunch, and dinner

periods. The area under each curve is normalized so that the figure does not demonstrate use quantitatively,

but rather shows when use occurs on each stove type on average over the course of 24 hours. Charcoal and

kerosene patterns had relatively low sample sizes so the patterns are expected to be less representative of the

regional patterns.


Figure S7. Usage patterns on each stove type within primary stove groupings over the course of 24 hours.

S4.4 Usage fraction by group

In households that reported biomass as their primary fuel (n=34), SUMS analysis showed 56% of all stove

usage that occurred in this group took place on a Chepkube stove, 35% of all stove usage took place on a

charcoal stove, 6% took place on an LPG stove, and 3% took place on a three-stone fire. In households that

reported charcoal as their primary fuel (n=4), 53% of all stove use took place on a charcoal stove, 25% of all

stove use took place on a three-stone fire, and 22% took place on a Chepkube stove. In households that

reported LPG as their primary fuel (n=53), 84% of all stove usage took place on an LPG stove, 14% took place

on a charcoal stove, and 2% took place on a Chepkube stove.
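The usage fractions above are each stove type's share of total stove use within a primary-fuel group. A minimal sketch of that normalization follows, assuming use is measured in minutes; the function name and the minute totals are made up for illustration and are not study data.

```python
def usage_fractions(minutes_by_stove):
    """Return each stove type's share of total use within one group."""
    total = sum(minutes_by_stove.values())
    if total == 0:
        return {stove: 0.0 for stove in minutes_by_stove}
    return {stove: minutes / total
            for stove, minutes in minutes_by_stove.items()}


# Made-up total minutes of use by stove type in one primary-fuel group.
group_minutes = {"LPG": 840, "charcoal": 140, "Chepkube": 20}
for stove, share in usage_fractions(group_minutes).items():
    print(f"{stove}: {share:.0%}")
```

Because each group is normalized to 1.0, the fractions describe how use is split across stoves rather than how much cooking occurs.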


Figure S8. Usage fractions on each stove type within the primary stove groupings. Total stove use for each primary group is normalized to 1.0 and the stove use fractions are divided amongst stove types.

S4.5 Stove stacking

Stove stacking behavior is shown in the plot below, which shows the percentage of days on which each individual

stove and each pair of stoves was used in a household. Though some groupings are sparse due to low coverage, basic trends

can be observed to understand the energy consumption process on a long-term by-household level. For

example, in the group that primarily uses LPG for cooking, LPG stoves were used on an average of 73% of days,

and on 38% of days households used both the LPG stove and the charcoal stove. This is a relatively high usage rate,

which may point to either personal user preferences or financial decision making.
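The percent-days metric can be sketched as follows, assuming that each monitored day has already been reduced to the set of stoves used that day. The function name and the four example days are hypothetical.

```python
from itertools import combinations


def percent_days_used(daily_stoves):
    """Percent of monitored days on which each stove, and each pair of
    stoves, was used. `daily_stoves` holds one set of stove names per day.
    """
    n_days = len(daily_stoves)
    if n_days == 0:
        return {}
    counts = {}
    for stoves in daily_stoves:
        for stove in stoves:
            counts[(stove,)] = counts.get((stove,), 0) + 1
        # Sorting gives each pair one canonical key regardless of order.
        for pair in combinations(sorted(stoves), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return {key: 100.0 * count / n_days for key, count in counts.items()}


# Four made-up monitoring days for one household.
days = [{"LPG"}, {"LPG", "charcoal"}, set(), {"charcoal"}]
print(percent_days_used(days))
```

Days with no detected use stay in the denominator, so the percentages reflect all monitored days and not only cooking days.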


Figure S9. Stove stacking by percent-days used by each stove type within the primary groupings. The labelled central

point shows the mean, the line within the box plot shows the median, and the lower and upper bounds of the box show

the lower and upper quartiles. The n’s are shown below each box for each stove type or combination of stove types

within the primary group.

S4.6 SUMs placement

The figure below shows the thermocouples for the stove use monitoring devices deployed on several stove

types. For those stoves that were stationary, the logger of the monitor was affixed above or adjacent to the

stove, while the thermocouple was threaded to an appropriate distance from the combustion zone to detect

cooking events in temperature traces. For those stoves that were portable, the logger of the monitor was

affixed to the body of the stove, while the thermocouple was situated appropriately at the combustion zone,

so that the stove and monitor were able to be moved as the participant wished with no disruption of

monitoring.


Figure S10. Photos showing SUMs installation on various stove types. On the portable stoves, the logger can be

observed affixed to the stove body, while on stationary stoves, the thermocouple is shown threaded to the zone of

combustion.

S4.7 Ambient monitoring

Table S3. Ambient measurement results for PM2.5 and CO

| Statistic | PM2.5 (µg/m3) | CO (ppm) |
|---|---|---|
| Mean | 6.83 | 0.91 |
| SD | 4.52 | 2.71 |
| Min | 1.26 | 0 |
| q25 | 3.64 | 0 |
| Median | 6.46 | 0 |
| q75 | 10 | 0 |
| Max | 293.11 | 117.5 |
| n | 55563 | 79744 |


Figure S11. Ambient data, divided by monitoring instrument, over the course of the study period.


S4.8 Intensive monitoring

Table S4. Intensive sample summary statistics. These data will be further analyzed in future work, to assess the

day-to-day variability of the household air pollution measurements, and compliance of the Beacons.

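| Stove group | Parameter | PM2.5 Kitchen | PM2.5 Kitchen sampling duration | PM2.5 Living Room | PM2.5 Living Room sampling duration | Kitchen CO (ppm) | CO Kitchen sampling duration | Living Room CO (ppm) | CO Living Room sampling duration |
|---|---|---|---|---|---|---|---|---|---|
| Charcoal | Mean | 86.6 | 4897 | 54.2 | 3907 | 10.6 | 4177 | 8.5 | 4892 |
| Charcoal | SD | 22.0 | 1229 | 42.0 | 1783 | 3.1 | 1471 | 4.6 | 1230 |
| Charcoal | Min | 66.2 | 3306 | 11.3 | 2081 | 8.4 | 2772 | 2.6 | 3300 |
| Charcoal | q25 | 68.9 | 4268 | 30.3 | 2685 | 8.7 | 3167 | 7.3 | 4262 |
| Charcoal | Median | 84.7 | 5103 | 47.9 | 3737 | 9.4 | 3930 | 8.9 | 5100 |
| Charcoal | q75 | 102.3 | 5731 | 71.8 | 4960 | 11.3 | 4939 | 10.1 | 5730 |
| Charcoal | Max | 110.8 | 6074 | 109.9 | 6074 | 15.1 | 6074 | 13.7 | 6069 |
| Charcoal | n | 4.0 | 4 | 4.0 | 4 | 4.0 | 4 | 4.0 | 4 |
| Chepkube | Mean | 1027.4 | 7101 | 49.2 | 7059 | 11.9 | 5056 | 1.5 | 6619 |
| Chepkube | SD | 534.7 | 718 | 18.6 | 728 | 4.0 | 2037 | 2.4 | 92 |
| Chepkube | Min | 410.0 | 6686 | 27.8 | 6639 | 9.6 | 3880 | 0.1 | 6513 |
| Chepkube | q25 | 873.0 | 6686 | 43.9 | 6639 | 9.6 | 3880 | 0.1 | 6593 |
| Chepkube | Median | 1336.1 | 6686 | 60.0 | 6639 | 9.6 | 3880 | 0.1 | 6672 |
| Chepkube | q75 | 1336.1 | 7308 | 60.0 | 7270 | 13.0 | 5645 | 2.2 | 6672 |
| Chepkube | Max | 1336.1 | 7930 | 60.0 | 7900 | 16.5 | 7409 | 4.2 | 6672 |
| Chepkube | n | 3.0 | 3 | 3.0 | 3 | 3.0 | 3 | 3.0 | 3 |
| LPG | Mean | 119.2 | 5098 | 65.2 | 4508 | 8.4 | 4485 | 5.6 | 4719 |
| LPG | SD | 158.9 | 1543 | 67.1 | 2403 | 7.1 | 1923 | 5.4 | 2051 |
| LPG | Min | 10.7 | 2718 | 13.8 | 0 | 0.2 | 1767 | 0.3 | 0 |
| LPG | q25 | 29.3 | 4657 | 20.2 | 3944 | 3.6 | 3060 | 0.5 | 3900 |
| LPG | Median | 58.2 | 4727 | 37.5 | 4726 | 5.7 | 4621 | 2.9 | 4670 |
| LPG | q75 | 111.7 | 5763 | 82.8 | 5762 | 13.8 | 4761 | 7.9 | 5748 |
| LPG | Max | 543.3 | 8092 | 221.9 | 8043 | 21.6 | 8020 | 16.7 | 7999 |
| LPG | n | 14.0 | 14 | 12.0 | 14 | 14.0 | 14 | 13.0 | 14 |
| Trad Biomass | Mean | 630.1 | 4429 | 25.5 | 4429 | 11.1 | 3587 | 1.6 | 4289 |
| Trad Biomass | SD | 463.9 | 3097 | 10.5 | 3096 | 6.4 | 1898 | 2.7 | 3192 |
| Trad Biomass | Min | 131.6 | 2006 | 17.3 | 2006 | 3.4 | 1936 | 0.0 | 1922 |
| Trad Biomass | q25 | 263.8 | 2858 | 19.3 | 2857 | 6.4 | 2424 | 0.1 | 2424 |
| Trad Biomass | Median | 473.3 | 2951 | 21.9 | 2951 | 9.9 | 2925 | 0.3 | 2941 |
| Trad Biomass | q75 | 975.1 | 4952 | 26.6 | 4952 | 15.3 | 4233 | 1.7 | 4945 |
| Trad Biomass | Max | 1328.0 | 10428 | 47.7 | 10426 | 20.9 | 6934 | 7.2 | 10420 |
| Trad Biomass | n | 7.0 | 7 | 7.0 | 7 | 7.0 | 7 | 7.0 | 7 |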

Figure S12. Typical CO and CO2 emissions time series, showing the initial background period, the cooking period, and

final background period, in addition to the data points identified to be associated with the decay that can be used to

calculate the kitchen ventilation rate.
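A ventilation rate can be estimated from the decay points by fitting a straight line to the logarithm of the background-subtracted concentration against time, since a well-mixed room gives C(t) - C_bg = (C_0 - C_bg) exp(-at). The sketch below assumes that single-exponential decay and a known, constant background; the function name and the synthetic data are hypothetical and do not reproduce the study's analysis.

```python
import math


def air_exchange_rate_per_hour(times_min, conc_ppm, background_ppm=0.0):
    """Estimate the air exchange rate (per hour) from a tracer decay.

    Fits ln(C - background) = intercept - a * t by ordinary least squares
    and returns a converted from per minute to per hour.
    """
    xs, ys = [], []
    for t, c in zip(times_min, conc_ppm):
        excess = c - background_ppm
        if excess > 0:  # the log is undefined at or below background
            xs.append(t)
            ys.append(math.log(excess))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points above background")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -(sxy / sxx) * 60.0


# Synthetic decay: 1 ppm background plus 50 ppm decaying at 0.2 per minute.
times = list(range(10))
conc = [1.0 + 50.0 * math.exp(-0.2 * t) for t in times]
print(round(air_exchange_rate_per_hour(times, conc, background_ppm=1.0), 2))
```

With field data the fit would be restricted to the points identified as belonging to the decay, after the stove is off and before the concentration returns to background.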
