
International Journal of Scientific and Education Research

Vol. 2, No. 04; 2018

http://ijsernet.org/

www.ijsernet.org Page 81

LOGISTIC REGRESSION ANALYSIS OF INFANT MORTALITY:

EVIDENCE FROM GHANA

Felix Atanga Adongo (1), Richmond Essieku (2), John Amo Jr. Lewis (3), John Boamah (4)

(1) University of Mines and Technology, Ghana; (2) University of Cape Coast, Ghana; (3) University of Liberia, Liberia; (4) University of Cape Coast, Ghana

ABSTRACT

Aware of the appreciable level of infant mortality in the Bongo district and its toll on the health and general well-being of the inhabitants, the study adopted an econometric tool, modelled the risk factors of the phenomenon and made recommendations to ameliorate the problem. Binary logistic regression was used to analyse data from questionnaires administered in five community clinics of the district. The study considered risk factors at four levels: the mother, the child, the environment and the medical attendants. At the mother level, nutritional status and antenatal care were significant predictors of infant deaths. At the child level, which included the sex and size of the child, the model was not informative, given its poor fit and unreliable estimates as evidenced by the likelihood ratio and Hosmer-Lemeshow tests. The level of sun in the area where the pregnant woman lives was a significant contributor to infant mortality in the district. At the medical attendants' level, care immediately after delivery was also a significant contributor to infant deaths in the district. The study also analysed data on infant deaths from the BDH to estimate the chance of infant survival in the hospital. The results revealed a substantial likelihood of infant survival, with at most a 3% chance of a baby dying after birth.

Keywords: Binary Logistic Regression, Risk Factors, Model, Infant Mortality, Likelihood Ratio,

Hosmer-Lemeshow

Abbreviation: BDH: Bongo District Hospital

Introduction

The loss of a child after birth remains a sad reality, as it exacts a toll on the health and well-being of the immediate family and of society. The infant mortality rate is often used as an indicator of the health and well-being of an economy, as its occurrence taints the outlook of the nation. It is thus a priority of every government to fight the incidence of infant mortality. According to Oestergaard et al. (2011), 7.7 million children below the age of one year died worldwide, of whom 3.1 million died in the first month after birth. More than five million children under one year of age die every year in Africa, half of them within the first four weeks of birth (Kwara, 2012). According to WHO (2016), 4.2 million deaths, representing 75% of all deaths below five years, were recorded in the first year of life in 2016. Infant death is highest in Africa as


it records about 52 deaths per 1,000 live births. This is roughly six times higher than that of the

European region which is about 8 deaths per every 1,000 live births (WHO, 2016).

Infant deaths have not declined substantially over the past few years, despite advances in health technology and increased attention to parental care. While most would expect the rate to be declining appreciably, it has instead remained fairly steady since the early 2000s. Several risk factors are pertinent to infant deaths in our societies today. Sudden Infant Death Syndrome (SIDS), the sudden and unexplained death of a baby, has claimed the lives of many children below one year of age. In 2005, 2,234 infants died due to SIDS (WHO, 2006).

The environment that accommodates the child after birth contributes hugely to the health and survival of the child, yet society underestimates this effect. Examples include excessively hot weather during pregnancy, which can hamper the survival of infants; the quality of drinking water in the environment; households that forbid a woman from delivering at the hospital rather than at home; and households for which delivering at a hospital is impossible because of their remoteness. These and many other environmental issues have an appreciable impact on infant deaths and should be brought into the limelight when addressing the problem.

This research focuses on the Bongo district in the Upper East Region of Ghana, where infant mortality is widespread. During the dry season, from the start of February to the close of April, the district becomes excessively hot, with an average temperature of about 38 °C, which is not congenial for pregnant women. Preterm births and stillbirths are predominant in the district during these times, and this could be attributed to the hostile weather conditions. There has been little to no progress in research relevant to mitigating child deaths in the first year of life in the district. Consequently, this paper identifies the major risk factors of infant deaths in the district's context and models these factors at four different levels using the logistic regression model.

Kwara (2012) modelled the risk factors of neonatal mortality in Ghana but did not consider risk factors at the level of the medical or traditional birth attendants, which are often disregarded but appear to have a serious impact on infant deaths. Infant deaths can be triggered by birth complications such as breech presentation, in which case much of the blame falls on the doctors and midwives for exhibiting incompetence. Data obtained from questionnaires administered to workers in five community clinics, who have fairly good knowledge of health and child mortality, were used for this analysis. The study also made use of secondary data on infant deaths from the district hospital from 2006 to 2014. These data were analysed to predict the chance of a baby surviving given that the mother delivers at the BDH.

The rest of the paper is organized as follows: Section II reviews the related literature, Section III presents the methodology and conceptual framework adopted in the study, Section IV presents the analysis of results and findings, Section V discusses the findings, and Section VI concludes and provides recommendations and gaps of the research.

Literature Review

This section reviews literature on infant deaths and their major causes worldwide. According to Kwara (2012), a noticeable level of transformation is taking place in areas relevant to maternal and child health in order to realize international declarations and country commitments. The need for evaluation and information on child mortality has thus become progressively obvious. This section presents a review of infant mortality and its relationship with stillbirth, perinatal mortality, neonatal mortality, post-neonatal mortality and their associated risk factors. Conscious of the negative effects of infant mortality on immediate families, communities and the nation as a whole, a good number of scholars and researchers have examined the phenomenon and its related risk factors and have reported the following findings.

Preterm Delivery and Infant Mortality

Khashu et al. (2009) compared the mortality and morbidity of late preterm infants with those of infants born at term. Data were collected from the British Columbia Perinatal Registry (BCPR) and covered all singleton births between 33 and 40 weeks of gestation from April 1999 to March 2002 in the province of British Columbia, Canada. The birth cohort was divided into late preterm (33-36 weeks, n = 6,381) and term (37-40 weeks, n = 88,867) groups. The results show that stillbirth, perinatal, neonatal and infant mortality rates were significantly higher in the late preterm group than in the term group.

BMI of Mother and Infant Mortality

Chen et al. (2009) studied maternal obesity and the risk of infant death in the United States. The aim was to examine the effects of maternal obesity on neonatal and postnatal death separately, and to examine the causes of infant death associated with maternal obesity. The association between maternal obesity and the risk of infant death was assessed using data from the 1988 US National Maternal and Infant Health Survey (NMIHS). A case-control analysis of 4,265 infant deaths and 7,293 controls was conducted. Self-reported pregnancy BMI and weight gain were used in the primary analysis, whereas weight variables from medical records were used in a subset of 4,308 women. Compared with normal-weight women who gained 0.66 to 0.97 lb/wk during pregnancy, obese women had a significantly increased risk of neonatal death and overall infant death.

Maternal Diabetes and Infant Mortality

Dunne et al. (2009) evaluated pregnancy outcomes in pre-gestational diabetes along the Atlantic seaboard in 2006-2007. The Atlantic Diabetes in Pregnancy group, representing five antenatal centres across a wide geographical area in Ireland, was established in 2005. All women who had had diabetes for more than 6 months before the index pregnancy were included. Prospective information was obtained from 104 singleton pregnancies in 2006-2007, and the pregnancy outcomes were compared with the rates in the background population. Significant associations were found with stillbirth and perinatal mortality (PM), whose rates were 5.0 and 3.5 times those of the background population respectively.

Alcohol Consumption, Smoking and Infant Mortality

Rasch (2003) studied the association between cigarette, alcohol and caffeine consumption and the occurrence of spontaneous abortion. The study population consisted of 330 women with spontaneous abortion and 1,168 pregnant women receiving antenatal care. A case-control design was used: cases were defined as women with a spontaneous abortion in gestational week 6-16 and controls as women with a live foetus in gestational week 6-16. The variables studied include age, parity, occupational situation, and cigarette, alcohol and caffeine consumption. The study found a significant association between alcohol consumption (5 or more units of alcohol per week) during pregnancy and spontaneous abortion (OR: 4.8; CI: 2.9-8.2).

Methodology

With advances in technology, many researchers tackle problems using statistical software packages while having only slight knowledge of the methods involved. Such packages have indeed helped in the analysis of data; as a result, however, researchers rarely engage with the conceptual framework of the methods they use. This section therefore presents the methods adopted in analysing the data in the subsequent section.

The Econometrics of Logistic Regression

In econometrics, logistic regression is a type of non-linear probabilistic regression whose link function is the logit, the inverse of the logistic CDF. It is used to predict a categorical response given one or more predictor variables. Examples in the binary case are whether or not a customer takes a solar panel offer, whether or not a student passes an accountancy test, and whether a child survives or dies after birth. Since the dependent variable is not continuous, we cannot predict a numerical value for it; instead we predict the chance that the response occurs. Unlike the Linear Probability Model (LPM), the logistic regression model can take any input from negative to positive infinity while its output always lies between zero and unity, and so is interpretable as a probability. The logistic regression model can first be understood by looking at the logistic function defined below:


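$$F(t) = \frac{e^{t}}{e^{t} + 1} = \frac{1}{1 + e^{-t}} \tag{3.1}$$

The Logit Function

Considering the logistic function in equation (3.1), let $t$ be a linear function of an explanatory variable $x$, say $t = \beta_0 + \beta_1 x$, where $\beta_0$ and $\beta_1$ are constants; then, dividing the numerator and the denominator by $e^{t}$, the logistic function is given by:

$$F(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}} \tag{3.2}$$

The inverse of the logistic function can be seen as:

$$g(F(x)) = \ln\left(\frac{F(x)}{1 - F(x)}\right) = \beta_0 + \beta_1 x \tag{3.3}$$

$g(F(x))$ is the natural logarithm of the odds and equivalently:

$$\frac{F(x)}{1 - F(x)} = e^{\beta_0 + \beta_1 x}$$

where $F(x) / (1 - F(x))$ is the odds.

Generally, $g(F(x))$ is the logit function of some linear combination of the predictors. The equation for $g(F(x))$ in (3.3) illustrates that the logit (natural logarithm of the odds) is equivalent to the linear regression expression. $F(x)$ is the probability that the dependent variable equals a case, which is coded 1 rather than 0, given some linear combination of predictors.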
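The relationship between the logistic function and its inverse, the logit, can be illustrated with a minimal Python sketch. The coefficient values `beta0` and `beta1` below are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math


def logistic(t: float) -> float:
    """Map any real number t to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))


def logit(p: float) -> float:
    """Natural logarithm of the odds p / (1 - p); the inverse of logistic()."""
    return math.log(p / (1.0 - p))


# Hypothetical intercept and slope of the linear expression.
beta0, beta1 = -1.5, 0.8

for x in (0.0, 1.0, 2.0, 3.0):
    linear = beta0 + beta1 * x
    p = logistic(linear)
    # logit(p) recovers the linear expression up to floating-point error.
    print(f"x={x:.0f}  linear={linear:+.2f}  p={p:.4f}  logit(p)={logit(p):+.2f}")
```

Whatever the value of the linear expression, the predicted probability stays between zero and one, and applying the logit to that probability returns the linear expression.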

The Odds Ratio

The odds of the dependent variable equalling a case are equivalent to the exponential function of

the linear regression expression. So we can define odds of the dependent variable equalling a case

(given some linear combination of the predictors) as follows:


$$\text{odds}(x) = \frac{F(x)}{1 - F(x)} = e^{\beta_0 + \beta_1 x} \tag{3.4}$$

where $F(x)$ is the probability that the dependent variable equals a case and $\beta_0 + \beta_1 x$ is the linear regression expression.

From equation (3.4), the odds ratio (OR) can be defined as the ratio of the odds after a one-unit increase in the independent variable to the odds before the increase, which is given by:

$$OR = \frac{\text{odds}(x + 1)}{\text{odds}(x)} = \frac{e^{\beta_0 + \beta_1 (x + 1)}}{e^{\beta_0 + \beta_1 x}} = e^{\beta_1} \tag{3.5}$$

Hence the odds ratio is given by $e^{\beta_1}$. This factor is the OR for the independent variable $x$, and it gives the relative amount by which the odds of the outcome increase (OR greater than 1) or decrease (OR less than 1) when the value of the independent variable is increased by 1 unit. For example, suppose the variable INFANT MORTALITY is coded as 0 = (Infant Survival) and 1 = (Infant Mortality), and the odds ratio for an independent variable, say antenatal care, is 3.2. This means that, in the model, the odds of the outcome (INFANT MORTALITY) are multiplied by 3.2 for each one-unit increase in antenatal care, holding the other predictors constant.
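To make the example concrete, the following sketch shows how an odds ratio of 3.2 changes the odds and the probability of the outcome. The baseline probability of 0.20 is a hypothetical value picked for illustration only.

```python
def odds_from_probability(p: float) -> float:
    """Convert a probability into odds."""
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Convert odds back into a probability."""
    return odds / (1.0 + odds)


odds_ratio = 3.2             # value used in the example in the text
baseline_probability = 0.20  # hypothetical starting probability

baseline_odds = odds_from_probability(baseline_probability)
# A one-unit increase in the predictor multiplies the odds by the odds ratio.
new_odds = baseline_odds * odds_ratio
new_probability = probability_from_odds(new_odds)

print(f"odds: {baseline_odds:.2f} -> {new_odds:.2f}")
print(f"probability: {baseline_probability:.3f} -> {new_probability:.3f}")
```

The odds ratio multiplies the odds, not the probability, so the change in probability depends on the baseline.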

Assumptions of the Logistic Regression Model

Logistic regression does not make many of the key assumptions of linear regression and general linear models that are based on ordinary least squares algorithms, particularly regarding linearity, normality, homoscedasticity and measurement level (Anon., 2014).

The key assumptions of the logistic regression model are the following:

- Binary logistic regression requires the dependent variable to be binary.
- Since logistic regression treats P(Y = 1) as the probability of the event occurring, the dependent variable must be coded accordingly; that is, factor level 1 of the dependent variable should represent the desired outcome.
- The model should be fitted correctly; that is, only the meaningful variables should be included.
- The model should have little or no multicollinearity.
- Logistic regression assumes linearity between the independent variables and the log odds.
- Logistic regression requires quite a large sample size; the reliability of the estimates declines as fewer observations are used.

Tests and Goodness of Fit Measures Adopted in the Study

The Hosmer-Lemeshow Test

The Hosmer-Lemeshow test is a statistical test for goodness of fit used in logistic regression

modelling. The data are divided into approximately ten groups defined by increasing order of


estimated risk. The observed and expected numbers of cases in each group are calculated, and a Chi-squared statistic is computed using the formula:

$$H = \sum_{g=1}^{G} \frac{(O_g - E_g)^2}{E_g \left(1 - E_g / N_g\right)} \tag{3.6}$$

with $O_g$, $E_g$ and $N_g$ being the observed events, expected events and number of observations for the $g$-th risk decile group respectively, and $G$ the number of groups. The test statistic follows a Chi-squared distribution with $G - 2$ degrees of freedom. A large Chi-squared value (with a small p-value, below the 0.05 level of significance) indicates poor fit, while a small Chi-squared value with a large p-value (closer to 1) indicates a good logistic regression model fit.
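The grouping and summation behind the Hosmer-Lemeshow statistic can be sketched in Python as follows. This is a minimal illustration, not the SPSS routine used in the study: the function name, the equal-size grouping rule and the toy data (`risks`, `outcomes`) are assumptions made for the example.

```python
def hosmer_lemeshow(outcomes, predicted_risks, groups=10):
    """Return (statistic, degrees of freedom) for the Hosmer-Lemeshow test.

    Observations are sorted by predicted risk and split into `groups`
    chunks of nearly equal size. Assumes every group's mean predicted
    risk lies strictly between 0 and 1.
    """
    pairs = sorted(zip(predicted_risks, outcomes))
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one observation per group")
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(outcome for _, outcome in chunk)  # observed events
        expected = sum(risk for risk, _ in chunk)        # expected events
        mean_risk = expected / size
        statistic += (observed - expected) ** 2 / (size * mean_risk * (1.0 - mean_risk))
    return statistic, groups - 2


# Hypothetical data: 20 observations with four distinct predicted risks.
risks = [0.1] * 5 + [0.3] * 5 + [0.6] * 5 + [0.9] * 5
outcomes = [0, 0, 0, 0, 1,  0, 1, 0, 0, 1,  1, 0, 1, 1, 0,  1, 1, 1, 1, 1]

h, df = hosmer_lemeshow(outcomes, risks, groups=4)
print(f"H = {h:.3f} on {df} degrees of freedom")
```

The statistic grows when the observed number of events in a risk group departs from the number the model expects, which is why large values signal poor fit.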

The Likelihood Ratio Test

The likelihood ratio test is performed by estimating two models and comparing the fit of one model to the fit of the other. Removing predictor variables from a model will almost always make the model fit less well (that is, the model will have a lower log likelihood), but it is necessary to test whether the observed difference in model fit is statistically significant. The likelihood ratio test does this by comparing the log likelihoods of the two models. If the difference is statistically significant (p-value less than 0.05), then the less restrictive model (the one with more predictors) is said to fit the data significantly better than the more restrictive model (the one with only the constant or fewer predictors). Given the log likelihoods of the two models, the likelihood ratio test is fairly easy to calculate. The formula for the likelihood ratio test statistic is:

$$LR = -2 \ln\left(\frac{L(m_1)}{L(m_2)}\right) = 2\left[\ln L(m_2) - \ln L(m_1)\right] \tag{3.7}$$

where $L(m_1)$ and $L(m_2)$ are the likelihoods of the more restrictive and the less restrictive model respectively. The statistic is compared with a Chi-squared distribution whose degrees of freedom equal the number of additional predictors in the less restrictive model.
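The calculation can be sketched in Python using only the standard library. The log-likelihood values below are hypothetical, and `chi2_sf` is a helper written for this example (an upper-tail Chi-squared probability for integer degrees of freedom); a statistics package would normally supply it.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a Chi-squared variable with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half)), 1  # closed form for df = 1
    else:
        tail, k = math.exp(-half), 2             # closed form for df = 2
    while k < df:
        # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


def likelihood_ratio_test(loglik_restricted: float, loglik_full: float, df: int):
    """Return the LR statistic and its p-value.

    df is the number of extra predictors in the less restrictive model.
    """
    statistic = 2.0 * (loglik_full - loglik_restricted)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log likelihoods: constant-only model versus a model with 3 predictors.
lr, p_value = likelihood_ratio_test(loglik_restricted=-60.0, loglik_full=-52.0, df=3)
print(f"LR = {lr:.3f}, p-value = {p_value:.4f}")
```

A p-value below 0.05 would lead to the conclusion that the model with predictors fits significantly better than the constant-only model.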

Wald Estimator

The Wald statistic can be used to assess the individual predictors in a particular model. Whereas linear regression uses the t statistic to assess the significance of the coefficients, logistic regression uses the Wald statistic to assess the contribution of each predictor to the model. The Wald statistic is the ratio of the square of the regression coefficient to the square of the asymptotic standard error of the coefficient, and it is asymptotically distributed as a Chi-squared distribution with one degree of freedom. The Wald statistic can be obtained by:

$$W = \frac{\beta^2}{(\text{A.S.E.})^2} \tag{3.8}$$

where A.S.E. is the asymptotic standard error of the regression coefficient $\beta$. The significance of a variable depends largely on the Wald statistic: the larger the Wald statistic, the smaller the p-value and the stronger the evidence that the variable contributes to the model.
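As a minimal sketch, the Wald statistic and its p-value on one degree of freedom can be computed as below. The coefficient and standard error in the usage lines are hypothetical; the function names are chosen for this example.

```python
import math


def wald_statistic(beta: float, std_error: float) -> float:
    """Square of the coefficient divided by the square of its standard error."""
    return (beta / std_error) ** 2


def wald_p_value(statistic: float) -> float:
    """Upper-tail probability of a Chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical coefficient and asymptotic standard error.
w = wald_statistic(beta=1.2, std_error=0.5)
print(f"Wald = {w:.3f}, p-value = {wald_p_value(w):.4f}")
```

Because the statistic depends on the ratio of the coefficient to its standard error, a large coefficient with an equally large standard error still yields a small Wald value.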


The Pseudo R Squares

In linear regression, the squared multiple correlation, R squared, is used to assess goodness of fit, as it represents the proportion of variability in the outcome that is explained by the model. In logistic regression analysis there is no agreed-upon analogous measure, but there are several competing measures, each with limitations. The Cox and Snell R squared is an alternative index of goodness of fit related to the R squared from linear regression. The Cox and Snell index is problematic because its maximum value is below unity (at most 0.75 for a binary outcome), so it can never indicate more than three quarters of the variability in the outcome as explained. The Nagelkerke R squared provides a correction to the Cox and Snell R squared so that the maximum value is equal to unity. Yet a higher R squared value does not guarantee a good model fit, as it increases with the number of predictor variables. Fitting the model with a lot of independent variables inflates the R squared yet is not very informative, as the reliability of the estimates declines and inferences based on such estimates may be unsound. So in fitting a model we take great care over the trade-off between parsimony and goodness of fit.
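The two indices can be written in terms of the log likelihoods of the constant-only model and the fitted model. The sketch below uses the standard definitions; the sample size and log-likelihood values are hypothetical and serve only to show how the Nagelkerke correction rescales the Cox and Snell index.

```python
import math


def cox_snell_r2(loglik_null: float, loglik_model: float, n: int) -> float:
    """Cox and Snell index: 1 - (L_null / L_model)^(2/n)."""
    return 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)


def nagelkerke_r2(loglik_null: float, loglik_model: float, n: int) -> float:
    """Cox and Snell index divided by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell_r2(loglik_null, loglik_model, n) / max_cox_snell


# Hypothetical example: 100 observations, half of them events, so the
# constant-only model has log likelihood 100 * ln(0.5).
n = 100
loglik_null = n * math.log(0.5)
loglik_model = -55.0  # hypothetical fitted-model log likelihood

print(f"Cox and Snell = {cox_snell_r2(loglik_null, loglik_model, n):.3f}")
print(f"Nagelkerke    = {nagelkerke_r2(loglik_null, loglik_model, n):.3f}")
# Even a perfect model (log likelihood 0) cannot push Cox and Snell to 1.
print(f"Cox and Snell ceiling = {cox_snell_r2(loglik_null, 0.0, n):.3f}")
```

In this balanced case the ceiling of the Cox and Snell index is 0.75, which is the limitation the Nagelkerke index is designed to remove.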

Data Collection and Analysis

This section analyses data collected from questionnaires administered in five community clinics in the district and data on infant deaths collected from the BDH for 2006 to 2014. A total of 118 out of 125 administered questionnaires were returned, a non-response rate of 5.6%. Thus, 118 observations from the questionnaires were used in the first part of the analysis (modelling the risk factors of infant mortality at various levels). The questionnaire divided the risk factors into four levels: the mother, the child, the environment and the medical attendants. The factors included at the mother level were the nutritional status of the mother, antenatal care and the educational level of the mother. At the child level, the factors included were the sex and the size of the child. At the environmental level, the only factor included, given the context of the district, was the level of sun. At the medical attendants' level, the factors considered were the handling of complications and the care for the child immediately after delivery (postnatal care). The data collected from the district hospital contained the number of infant deaths and the number of births from 2006 to 2014. These secondary data are used in the second part of the analysis (predicting the chances of infant survival in the district hospital).

The administered questionnaire was divided into six broad sections. The first section covered the demographic and socio-economic characteristics of the respondent: gender, age bracket, educational level, occupation, marital status and community of residence. The second section solicited the respondent's knowledge about infant mortality in the district. It asked respondents whether there are ever infant mortalities in the community clinic where they work and, if so, about the current level of infant mortality in the clinic. The former question had a “Yes”/“No” response category and supplied the dependent variable “Infant Mortality”, which in our case is binary: “Yes” was coded as 1 for infant mortality and “No” was coded as 0 for infant survival. The last four sections of the questionnaire were designed in the same way and were captioned: determinants of infant mortality from the mother level, determinants of infant mortality from the child level, determinants of infant mortality from the environment level and determinants of infant mortality from the medical attendants' level. All questions in these four sections were answered on a six-point Likert scale comprising strongly disagree, disagree, somewhat disagree, somewhat agree, agree and strongly agree. Respondents were asked whether, if there was any infant mortality in their community clinic of work or in other clinics, hospitals or homes within the district, the listed risk factors were perhaps the cause. Strongly disagree was coded as 1, and so on in the order above through to 6 for strongly agree. The questionnaire additionally allowed respondents to give their views on risk factors not captured in the questionnaire. Most information about the respondents is captured in the introduction.
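The coding scheme described above can be sketched as a small Python mapping. The field names in the example response (such as `infant_mortality` and `antenatal_care`) are hypothetical labels invented for this illustration; only the numeric codes come from the text.

```python
# Codes as described in the text: a six-point Likert scale and a binary outcome.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "somewhat disagree": 3,
    "somewhat agree": 4,
    "agree": 5,
    "strongly agree": 6,
}
OUTCOME_CODES = {"yes": 1, "no": 0}  # 1 = infant mortality, 0 = infant survival


def encode_response(response: dict) -> dict:
    """Turn one questionnaire response into numeric codes.

    The key "infant_mortality" is a hypothetical name for the Yes/No item;
    every other key is treated as a Likert-scale risk-factor item.
    """
    encoded = {}
    for field, answer in response.items():
        key = answer.strip().lower()
        if field == "infant_mortality":
            encoded[field] = OUTCOME_CODES[key]
        else:
            encoded[field] = LIKERT_CODES[key]
    return encoded


example = {
    "infant_mortality": "Yes",
    "antenatal_care": "Strongly agree",
    "level_of_sun": "Somewhat disagree",
}
print(encode_response(example))
```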

The following table reports the collinearity (multicollinearity) check of the risk factors considered in the modelling process. There are a total of 8 independent variables, modelled at four different levels. The software used in the analysis is the Statistical Package for the Social Sciences (SPSS).

Collinearity and Diagnostic Test

| Independent Variable | Tolerance | VIF |
|---|---|---|
| Nutritional Status of Mother | 0.676 | 1.476 |
| Attendance of Antenatal Care | 0.990 | 1.010 |
| Level of Mother’s Education | 0.994 | 1.006 |
| Sex of Child | 0.972 | 1.029 |
| Size of Child | 0.936 | 1.068 |
| Level of Sun | 0.920 | 1.087 |
| Handling of Complications | 0.967 | 1.034 |
| Care Immediately after Delivery | 0.999 | 1.001 |

Prior to running logistic regression models, a collinearity (multicollinearity) test must be run to ensure there are no issues of perfect collinearity or multicollinearity, since the estimates of logistic regression models are sensitive to multicollinearity. Note that the Variance Inflation Factor (VIF) values are all less than 5 (see for example Rogerson, 2005). This implies that there is no strong correlation among the variables, and therefore they can all be included in the logistic regression models.
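The screening rule can be expressed as a short Python check. The tolerance and VIF values are copied from the collinearity table; the threshold of 5 is the rule of thumb cited above, and the function name is chosen for this example.

```python
# (tolerance, VIF) pairs copied from the collinearity table.
COLLINEARITY = {
    "Nutritional Status of Mother": (0.676, 1.476),
    "Attendance of Antenatal Care": (0.990, 1.010),
    "Level of Mother's Education": (0.994, 1.006),
    "Sex of Child": (0.972, 1.029),
    "Size of Child": (0.936, 1.068),
    "Level of Sun": (0.920, 1.087),
    "Handling of Complications": (0.967, 1.034),
    "Care Immediately after Delivery": (0.999, 1.001),
}
VIF_THRESHOLD = 5.0  # rule of thumb cited in the text


def flag_collinear(table: dict, threshold: float = VIF_THRESHOLD) -> list:
    """Return the variables whose VIF reaches the threshold."""
    return [name for name, (_, vif) in table.items() if vif >= threshold]


for name, (tolerance, vif) in COLLINEARITY.items():
    # Tolerance is the reciprocal of VIF, so the two columns should agree.
    print(f"{name:32s} tolerance={tolerance:.3f}  1/VIF={1.0 / vif:.3f}")

print("Variables flagged:", flag_collinear(COLLINEARITY))
```

Since tolerance is the reciprocal of VIF, the loop also serves as a consistency check on the reported table.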

Results of the Mother Level Risk Factors

Overall Model Evaluation

| Test | Chi-Squared | DF | p-value |
|---|---|---|---|
| Likelihood Ratio Test | 15.839 | 3 | 0.001 |
| Goodness of Fit: Hosmer-Lemeshow Test | 1.951 | 5 | 0.856 |

R Squared: Cox and Snell’s R Squared = 0.165; Nagelkerke R Squared = 0.269

| Predictor | β (std. error) | Wald | DF | p-value | Odds Ratio | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|
| Nutritional Status of Mother | 2.065 (0.798) | 6.689 | 1 | 0.010 | 7.883 | 1.649 | 37.689 |
| Antenatal Care | -2.209 (0.752) | 8.637 | 1 | 0.003 | 0.110 | 0.025 | 0.479 |
| Level of Mother’s Education | 1.625 (0.868) | 3.502 | 1 | 0.061 | 5.077 | 0.926 | 27.839 |
| Constant | -2.318 (3.124) | 0.550 | 1 | 0.458 | 0.099 | | |

From the first table, the Nagelkerke R squared indicates that the logistic regression model accounts for 26.9% of the variability in the dependent variable. The p-value of the likelihood ratio test is less than 0.05, implying that the model with predictors fits significantly better than the model with only the constant (the more restrictive model). The Hosmer-Lemeshow test has a p-value greater than 0.05, indicating that the model with three predictors fits the data well. Looking at the Wald criterion and odds ratio table, we observe that all variables are significant at the 5% level except the mother's level of education, which is only marginally significant (at the 10% alpha level). This shows that the model from the mother level is a statistically stable model. The coefficient on antenatal care is -2.209, which means that infant mortality is less likely to occur when antenatal care increases. The odds ratio for the nutritional status of the mother is 7.883. This shows that the odds of infant mortality are multiplied by 7.883 for each one-category increase in the mother's nutritional status, holding the other variables constant. This is counterintuitive and contrary to the expectations of the study, since it implies that a woman who feeds very well has a higher chance of her baby dying after birth.
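The odds ratios and confidence limits in the mother-level table can be reproduced from the reported coefficients and standard errors. The sketch below assumes Wald-type 95% intervals, exp(β ± 1.96 × std. error); the source does not state how SPSS computed the limits, so this is an assumption, although the results agree with the table up to rounding.

```python
import math


def odds_ratio_with_ci(beta: float, std_error: float, z: float = 1.96):
    """Return (odds ratio, lower limit, upper limit) for a Wald-type interval."""
    return (
        math.exp(beta),
        math.exp(beta - z * std_error),
        math.exp(beta + z * std_error),
    )


# Coefficients and standard errors copied from the mother-level table.
MOTHER_LEVEL = {
    "Nutritional Status of Mother": (2.065, 0.798),
    "Antenatal Care": (-2.209, 0.752),
    "Level of Mother's Education": (1.625, 0.868),
}

for name, (beta, std_error) in MOTHER_LEVEL.items():
    odds_ratio, lower, upper = odds_ratio_with_ci(beta, std_error)
    print(f"{name:30s} OR={odds_ratio:.3f}  95% CI=({lower:.3f}, {upper:.3f})")
```

An interval that contains 1, as for the mother's level of education, corresponds to a predictor that is not significant at the 5% level.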


Results of the Child Level Risk Factors

Overall Model Evaluation

Test                     Chi-Squared   DF   p-value
Likelihood Ratio Test    0.475         2    0.796

Goodness of Fit
Hosmer-Lemeshow Test     8.371         4    0.079

R Squared
Cox and Snell’s R Squared = 0.042
Nagelkerke R Squared = 0.078

Predictor       β (std. error)    Wald    DF   p-value   Odds Ratio   95% CI Lower   95% CI Upper
Sex of Child    0.464 (0.743)     0.391   1    0.532     1.591        0.371          6.821
Size of Child   0.056 (0.189)     0.087   1    0.768     1.057        0.730          1.531
Constant        -1.497 (0.695)    4.634   1    0.031     0.224

Nagelkerke’s R squared, which rescales Cox and Snell’s R squared, indicates that the model explains only 7.8% of the variability in the dependent variable. For the child level factors, the first table shows that the model with predictors did not fit significantly better than the intercept-only model, given the large p-value of the likelihood ratio test, 0.796. The Hosmer-Lemeshow p-value of 0.079, although above 0.05, is far from unity, which the study takes as an indication that the model with the two predictors did not fit the data well. The second table confirms the output of the first, because neither of the two variables in the model is significant. Size of child nevertheless has a positive coefficient, indicating a higher likelihood of infant mortality as the size of the baby increases. The coefficient on sex of child is not very informative on its own, since the variable is categorical. Its odds ratio is 1.591, meaning that the odds of infant mortality are 1.591 times higher for male infants than for female infants. The standard errors of both predictors are substantially larger than their coefficients, indicating a highly unstable model. We therefore retain the constant-only model at this level, since a logistic regression should be fitted only with relevant and significant variables.
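The effect of a standard error that exceeds its coefficient can be seen directly in the Wald statistics. As an illustrative sketch, assuming the Wald statistic is the squared ratio of the coefficient to its standard error and is referred to a chi-squared distribution with one degree of freedom, the code below recomputes the statistic and its p-value for the two child level predictors. The helper name `wald_test` is ours, and the results agree with the table only up to rounding of the published inputs.

```python
import math

def wald_test(beta, se):
    """Return (Wald chi-squared, p-value) for a single coefficient.

    With 1 degree of freedom, P(chi2 > z^2) = erfc(|z| / sqrt(2)).
    """
    z = beta / se
    return z * z, math.erfc(abs(z) / math.sqrt(2))

# Coefficients and standard errors as reported for the child level model.
child_level = {
    "Sex of Child": (0.464, 0.743),
    "Size of Child": (0.056, 0.189),
}

wald_results = {name: wald_test(b, se) for name, (b, se) in child_level.items()}

for name, (wald, p_value) in wald_results.items():
    print(f"{name}: Wald = {wald:.3f}, p-value = {p_value:.3f}")
```

Whenever the standard error is larger than the coefficient in absolute value, the Wald statistic is below one and the p-value is well above any conventional significance level.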

Results of the Environmental Risk Factor

Overall Model Evaluation

Test                     Chi-Squared   DF   p-value
Likelihood Ratio Test    16.516        1    0.000

Goodness of Fit
Hosmer-Lemeshow Test     0.730         3    0.866

R Squared
Cox and Snell’s R Squared = 0.169
Nagelkerke R Squared = 0.262

Predictor      β (std. error)    Wald     DF   p-value   Odds Ratio   95% CI Lower   95% CI Upper
Level of Sun   1.309 (0.440)     8.851    1    0.003     3.702        1.563          8.767
Constant       -6.676 (1.957)    11.642   1    0.001     0.001

As with the mother level risk factors, the likelihood ratio test (p-value reported as 0.000) shows that the model with the predictor Level of Sun fits better than the constant-only model, and the Hosmer-Lemeshow test (p-value 0.866) gives no evidence of lack of fit. The model obtained from the second table of this level is therefore a well-fitting and statistically stable one, since the Wald statistic is fairly large and the standard error is low. Level of sun is statistically significant, and its positive coefficient shows that a higher category of the level of sun increases the likelihood of infant death, which is expected and conforms to the situation in the district. The odds ratio of 3.702 implies that each one-category increase in the level of sun multiplies the odds of infant death by 3.702. Nagelkerke’s R squared indicates that the model explains 26.2% of the variability in the dependent variable.
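A p-value printed as 0.000 is a rounded value rather than an exact zero. The sketch below assumes that the likelihood ratio statistic follows a chi-squared distribution under the null hypothesis and recovers the tail probability for the reported statistic of 16.516 on 1 degree of freedom. For comparison it also evaluates the statistic of 7.016 on 2 degrees of freedom reported for the medical attendants' level. The function `chi2_sf` is a hypothetical helper written for this illustration; it covers only 1 and 2 degrees of freedom, where simple closed forms are available in the Python standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi2_df > x), for df = 1 or df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")

# Likelihood ratio statistics as reported in the model evaluation tables.
p_environment = chi2_sf(16.516, df=1)   # environmental level
p_medical = chi2_sf(7.016, df=2)        # medical attendants' level

print(f"environmental level: p = {p_environment:.6f}")
print(f"medical attendants' level: p = {p_medical:.3f}")
```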

Results of the Medical Attendants’ Level Risk Factors

Overall Model Evaluation

Test                     Chi-Squared   DF   p-value
Likelihood Ratio Test    7.016         2    0.030

Goodness of Fit
Hosmer-Lemeshow Test     1.323         5    0.933

R Squared
Cox and Snell’s R Squared = 0.077
Nagelkerke R Squared = 0.121

Predictor                β (std. error)    Wald    DF   p-value   Odds Ratio   95% CI Lower   95% CI Upper
Handling Complications   0.540 (0.299)     3.256   1    0.071     1.716        0.954          3.086
Postnatal Care           -1.221 (0.576)    4.494   1    0.034     0.295        0.095          0.912
Constant                 -2.697 (1.201)    5.044   1    0.025     0.067

Following the same reasoning as for the previous levels, we have a well-fitting model for the last level risk factors, given the 0.030 p-value of the likelihood ratio test and the 0.933 p-value of the Hosmer-Lemeshow test. Nagelkerke’s R squared indicates that the model accounts for 12.1% of the variability in the dependent variable. The odds ratio for handling complications is 1.716: a one-category increase in handling complications at birth multiplies the odds of infant mortality by 1.716. This result is quite ambiguous. The negative coefficient on postnatal care indicates that the likelihood of infant mortality decreases as the category of postnatal care increases, which is intuitive. Postnatal care is significant at the 5% level, handling complications is marginally significant at the 10% level, and the model overall is statistically stable given the fairly low standard errors.

Data from the BDH

To fulfil one of the objectives of this study, namely predicting the chances of infant survival in the BDH, a binary logistic regression was fitted to the births recorded in the district hospital from 2006 to 2014, with the dependent variable coded as 1 for infant survival (desired) and 0 for infant mortality (undesired). The results are summarized in the three tables that follow. The likelihood ratio test has a p-value of 0.023, indicating that the model with the predictor year group fits significantly better than the restrictive intercept-only model. The Hosmer-Lemeshow test has a p-value of 0.205, which is greater than 0.05 though not close to one, so there is no evidence that the model fits the data poorly. Nagelkerke’s R squared indicates that the model explains only 7.6% of the variability in the dependent variable. Such a low R squared is not very informative for prediction, because a large share of the variability is left unexplained by the model. However, from the second table, the odds ratio for the predictor year group is 1.730, indicating that each one-unit increase in the year group multiplies the odds of the desired outcome, infant survival, by 1.730. This agrees with the third table, where the predicted probabilities of infant survival in the last column increase with the years. In other words, the odds of an infant surviving in the district hospital in, say, 2008 are 1.730 times the odds in 2007. The model we used for predicting the chances of an infant surviving at the hospital, with the coefficients reported in the second table, is given below:

\[
\ln\left(\frac{P(IS)}{1 - P(IS)}\right) = 3.089 + 0.548 \times \text{Year Group}
\]

where IS is the Infant Survival.

Summary of tests and goodness of fit of infant survival from the Bongo District Hospital

Overall Model Evaluation

Test                     Chi-Squared   DF   p-value
Likelihood Ratio Test    5.204         1    0.023

Goodness of Fit
Hosmer-Lemeshow Test     1.607         1    0.205

R Squared
Cox and Snell’s R Squared = 0.030
Nagelkerke R Squared = 0.076

Predictor    β (std. error)   Wald     DF   p-value   Odds Ratio   95% CI Lower   95% CI Upper
Year Group   0.548 (0.244)    5.056    1    0.025     1.730        1.073          2.788
Constant     3.089 (0.484)    40.712   1    0.000     21.963

Summary Results of IS from the BDH (overall)

Year   Year Group (Coded)   Infant Mortality   Number of Births   Probability of Infant Survival   Odds p/(1-p)   Logit (p)   Predicted Probabilities
2006   1                    13                 450                0.9711                           37.9777        3.6370      0.9743
2007   2                    5                  528                0.9905                           65.6935        4.1850      0.9850
2008   3                    8                  750                0.9893                           113.6360       4.7330      0.9913
2009   4                    16                 1067               0.9850                           196.5663       5.2810      0.9949
2010   5                    13                 1081               0.9880                           340.0185       5.8290      0.9971
2011   6                    15                 414                0.9638                           588.1606       6.3770      0.9983
2012   7                    8                  525                0.9848                           1017.3943      6.9250      0.9988
2013   8                    13                 541                0.9760                           1759.8784      7.4730      0.9994
2014   9                    6                  706                0.9915                           3044.2200      8.0231      0.9996
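The columns of the summary table can be reproduced from the fitted coefficients. The sketch below assumes the fitted model is logit(p) = 3.089 + 0.548 × Year Group, with both values taken from the coefficient table, and sets the resulting predicted probabilities beside the observed survival proportions, computed as 1 − deaths/births. The variable names are ours, and small differences from the printed table are due to rounding.

```python
import math

INTERCEPT, SLOPE = 3.089, 0.548  # constant and Year Group coefficient as reported

# (year, coded year group, infant deaths, number of births) from the BDH table
records = [
    (2006, 1, 13, 450), (2007, 2, 5, 528), (2008, 3, 8, 750),
    (2009, 4, 16, 1067), (2010, 5, 13, 1081), (2011, 6, 15, 414),
    (2012, 7, 8, 525), (2013, 8, 13, 541), (2014, 9, 6, 706),
]

rows = []
for year, group, deaths, births in records:
    observed = 1 - deaths / births       # observed proportion surviving
    logit = INTERCEPT + SLOPE * group    # linear predictor
    odds = math.exp(logit)               # p / (1 - p)
    predicted = odds / (1 + odds)        # predicted probability of survival
    rows.append((year, observed, odds, logit, predicted))

for year, observed, odds, logit, predicted in rows:
    print(f"{year}: observed={observed:.4f} odds={odds:.4f} "
          f"logit={logit:.4f} predicted={predicted:.4f}")
```

Because year group is the only predictor, the predicted probabilities rise monotonically with the year, while the observed proportions fluctuate from year to year.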

Discussion

At the mother level, two of the three factors were significant at the 5% level: the nutritional status of the mother and attendance of antenatal care are contributing factors to infant deaths in the district. Because of financial constraints, most women in the district cannot afford the well-balanced meals they ought to consume during the gestation period. They are therefore obliged to consume whatever meal is at their disposal, at the expense of the health of their unborn children (extracts from the questionnaires). It is not uncommon in the district to see a child born with infections, most of which result from the poor nutritional status of the mother. Also, because the Bongo district has inadequate hospitals, clinics and CHPS compounds, women living in the remote areas often find it inconvenient to travel several miles to seek antenatal care (extracts from the questionnaires). The consequences of missing antenatal care can be grave for the woman and the entire household.

The next set of factors was the child level risk factors, which included the sex and the size of the child. These factors are mostly not directly influenced by the family once the woman becomes pregnant. They are largely determined by nature, except for the size of the child, which could be related to the nutritional status of the mother (Kwara, 2012). Any infant deaths they cause are therefore largely unavoidable, and the study’s findings are consistent with this, since neither variable showed a meaningful association with infant deaths: both factors were found insignificant as causes of infant death in the district.

The next factor the study looked at was the environmental level risk factor, which captured only the level of sun in the region of the pregnant woman. This factor was a significant contributor to infant mortality in the district. The result points to a strong link between infant deaths and weather conditions, with infant deaths rising in the district during the hot weather from late February until the end of April.

The last set of factors the study looked at were the medical attendants’ level risk factors, which included the handling of complications during birth and the care for the child immediately after delivery, or postnatal care. Here, care can come either from a medical attendant or from a traditional birth attendant. The study found that care for the child immediately after delivery, whether by medical attendants at the hospitals or by traditional birth attendants at home, was a significant contributor to infant deaths in the district.

Conclusion

The risk factors at all four levels were modelled; however, the model for the child level involved only the intercept, since the two factors considered were not significant contributors to infant deaths in the district. Evidence from the likelihood ratio and Hosmer-Lemeshow tests, together with the individual significance of the variables under the Wald criterion, confirms that for the child level the intercept-only model is preferable to one that includes the so-called risk factors. This agrees with the work of Kwara (2012), who modelled the neonatal mortality rate in Ghana at three different levels, excluding the medical attendants’ level. Thus, the logistic regression model is efficient when only the significant and meaningful variables are fitted. At the mother level, the mother’s educational level was not significant at the benchmark 5% level, and at the medical attendants’ level, handling complications at birth was not significant at the 5% level. Hence we can explicitly write our models as:

(6.1)

(6.2)

(6.3)

From the analysis of data from the BDH, the chances of infants surviving in the district hospital were very high. From 2006 to 2014, the predicted probability that an infant survived in the hospital was between 0.9743 and 0.9997. This corresponds to at most approximately 3% infant deaths in the hospital over the period, that is, a maximum of 3 infant deaths per 100 live births or 30 infant deaths per 1,000 live births in the district hospital, ceteris paribus. This rate is lower than the 38.47 infant deaths per 1,000 live births recorded for Ghana in 2014. It suggests that the alarming rate of infant mortality observed in the district is largely attributable to home delivery, where mothers on the brink of delivery are attended to by traditional birth attendants who may lack the formal and requisite training in delivering and handling babies. These deaths could perhaps also be due to delivery in other community clinics of the district that are not studied in this paper.

Recommendations of the study: It is recommended that the nutritional status of every pregnant woman be improved and that the nutrition section of every health facility, where one exists, be strengthened to advise pregnant women on the importance of feeding well during pregnancy and after delivery. Women should be encouraged to deliver at the hospitals, given the vast difference between the number of deaths that occur at home or go unaccounted for and the number of deaths in the Bongo District Hospital. It is also recommended that a nursing college or midwifery institution be built in the district to properly train nurses, midwives and obstetricians to handle and take good care of mothers and their newborns until they are discharged from hospital.

Lapses of the study: Predicted probabilities of infant survival in the district hospital could be spurious given the very low Nagelkerke’s R squared value, which we attribute to the small sample size. The approach of Long (1997) can be used to determine an appropriate sample size for the population and the entity of study. It is thus recommended that further research on infant mortality in the district consider several other risk factors and use a reasonably large sample. The BDH could also keep rigorous records of infant deaths and their associated major risk factors at all levels for prospective detailed studies of the occurrence. This is because measuring these risk factors with questionnaires introduces measurement errors that are inherent and impossible to sidestep, and consequently yields biased and inconsistent estimates of the true parameters of interest. Several control factors could be included in the models to address potential issues of endogeneity or omitted variable bias. Alternatively, to cater for any possibility of endogeneity, instrumental variable estimation of the parameters of non-linear models (logit or probit models) could be studied in the framework of child mortality (see Charbonneau, 2013); this could remedy the potential endogeneity and perhaps inconsistency problems.

References

Anon. (2014), “Assumptions of the Logistic Regression”, www.statisticsolutions.com/assumptions_of_logistic_regression, Accessed: February 2015.

Ananth, C.V. and Basso, O. (2010), “Impact of pregnancy-induced hypertension on stillbirth and neonatal mortality”, Epidemiology, January; 21(1):118-23.

Blair, P.S., Fleming, P.J., Bentley, D., Smith, I., Bacon, C., Taylor, E., Berry, J., Golding, J. and Tripp, J. (1996), “Smoking and the sudden infant death syndrome: results from 1993-5 case-control study for confidential inquiry into stillbirths and deaths in infancy”, BMJ, July 27; 313(7051):195-8.

Cedergren, M.I. (2004), “Maternal morbid obesity and risk of adverse pregnancy outcome”, Obstetrics & Gynecology, February; 103(2):219-24.

Charbonneau, K.B. (2013), “Multiple fixed effects in theoretical and applied econometrics”, PhD thesis, Princeton University.

Chen, A., Feresu, S.A., Fernandez, C. and Rogan, W.J. (2009), “Maternal obesity and the risk of infant deaths in the United States”, Epidemiology, January; 20(1):78-81.

Clausen, T.D., Mathiesen, E., Ekbom, P., Hellmuth, E., Mandrup-Poulsen, T. and Damm, P. (2005), “Poor pregnancy outcome in women with type 2 diabetes”, Diabetes Care, February; 28(2):323-8.

Dunne, F.P., Avalos, G., Durkan, M., Mitchell, Y., Gallacher, T., Keenan, M., Hogan, M., Carmody, L.A. and Gaffney, G. (2009), “Pregnancy outcome for women with pregestational diabetes along the Irish Atlantic seaboard”, Diabetes Care, July; 32(7):1205-6.

Khashu, M., Narayanan, M., Bhargava, S. and Osiovich, H. (2009), “Perinatal outcomes associated with preterm birth at 33 to 36 weeks’ gestation: a population-based cohort study”, Pediatrics, January; 123(1):109-13.

Kwara, K. (2012), “Modeling the risk factors of neonatal mortality in Ghana using logistic regression”, June 2012, pp. 1-8, 36-70.

Long, J.S. (1997), “Regression Models for Categorical and Limited Dependent Variables”, Thousand Oaks, CA: Sage Publications.

Oestergaard, M.Z., Inoue, M., Yoshida, S., Mahanani, W.R., Gore, F.M., et al. (2011), “Neonatal mortality levels for 193 countries in 2009 with trends since 1990: a systematic analysis of progress, projections and priorities”, PLoS Medicine, 8: e1001080.

Rasch, V. (2003), “Cigarette, alcohol, and caffeine consumption: risk factors for spontaneous abortion”, Acta Obstetricia et Gynecologica Scandinavica, February; 82(2):182-8.

Rogerson, P.A. (2001), “Statistical Methods for Geography”, London: Sage.

World Health Organization (WHO) (2016), “Child mortality and causes of death”, Global Health Observatory (GHO) data.

World Health Organization (WHO) (2006), “Infant Mortality”, Country, Regional and Global Estimates.

