GEOGRAPHICAL EPIDEMIOLOGY OF CARDIOVASCULAR DISEASE IN INDIA: AN EXPLORATORY STUDY
by
Prem kumar Mony
A thesis submitted in conformity with the requirements for the degree of Master of Science
Graduate Department of Institute of Medical Sciences University of Toronto
© Copyright by Prem kumar Mony 2009
ABSTRACT
Geographical Epidemiology of Cardiovascular Disease in India: An exploratory study
By Prem kumar Mony Master of Science (2009)
Institute of Medical Sciences University of Toronto
Cardiovascular Diseases (CVD) have become the leading cause of death in India and other
developing countries. The aims of this study were to: (1) describe the geographical epidemiology of CVD
in India, (2) provide a graphical display of CVD risk factors and mortality outcomes, and (3) describe the
sources of bias. Five large, nationally-representative datasets from India were studied.
Cardiovascular death rates were 308/100,000 among males and 198/100,000 among females in
middle-age (30-69 years). Wide variations between states were noted in the distribution of risk factors and
mortality. The selected risk factors explained 49% and 43% of the variation among males and females
respectively. Ecologic analysis revealed death rates at state-level were associated with rates of overweight
and vegetarianism among males; no such association was found among females. This study has
implications for identification of areas with high burden, formulation of hypotheses, and assessing needs
for disease control at national/regional levels.
ACKNOWLEDGEMENT
A thesis is supposed to be an independent piece of work, but in reality this is hardly so. Over the last two years there have been so many people without whose support, guidance and patience, this work would not have been completed. It is to them I owe my deepest gratitude.
I have been fortunate to have worked and learned under the supervision of Dr Prabhat Jha, a dedicated epidemiologist and an able mentor. His wisdom, knowledge and commitment to the highest standards have inspired and motivated me. Thanks Prabhat, you’ve truly been a guru. My thanks and appreciation go to Dr Richard Glazier and Dr Jim Dunn, for being on my program committee and helping me shape this thesis with their critical inputs and interesting perspectives. I wish to thank my examination committee members Dr Jack Tu, Dr David Alter and Dr Rajeev Gupta for the insightful and constructive comments.
Thanks also go out to my colleagues Wilson Suraweera and Paul Arora for timely assistance with my statistics queries and to Ashleigh Sullivan and Brent Harris for advice with the geographic software. On a different note, a quiet thanks also to all the anonymous study respondents who have collectively taught me more than they would ever realize.
Finally and importantly I would like to thank my wife Lolita and my daughter Pritika for their encouragement in supporting me in my aspirations. I thank my parents, Kamala and Daniel, and my sister Premalatha, for their support and unwavering love.
TABLE OF CONTENTS
Abstract
Acknowledgements
List of tables
List of figures
List of appendices
1 INTRODUCTION
1.1 India – current scenario
1.2 Cardiovascular disease in India
1.2.1 Cardiovascular disease mortality
1.2.2 Prevalence of Cardiovascular Disease
1.2.3 Cardiovascular risk factors
1.2.4 Geography and Cardiovascular Disease
1.3 Health Information Visualization
1.4 In summation
1.5 Objectives
2 LITERATURE REVIEW
2.1 Cardiovascular risk factors
2.1.1 Smoking
2.1.2 Body mass
2.1.3 Dietary factors and diabetes
2.1.4 Other risk factors
2.2 Cardiovascular mortality
2.2.1 Assessment of cause-of-death
2.3 Geographical epidemiology
2.4 Data visualization
3 METHODOLOGY
3.1 Study setting
3.2 Study design
3.3 Data sources
3.3.1 Special Fertility and Mortality Survey
3.3.2 National Family Health Surveys
3.3.3 Sample Registration System
3.3.4 Million Death Study
3.4 Database management
3.4.1 Abstraction of relevant variables from the 5 databases
3.4.2 Compilation of data dictionaries
3.4.3 Data quality assessment
3.4.4 Exploratory data analysis
3.5 Data analysis
3.5.1 Conceptual framework
3.5.2 Standardization
3.5.3 Statistical analysis
3.5.4 Geographical analysis
3.6 Ethical approval
4 RESULTS
4.1 Survey characteristics and descriptives of study population
4.1.1 Survey characteristics
4.1.2 Demographic characteristics
4.1.3 Crude prevalence of selected CVD determinants
4.2 Smoking
4.2.1 Smoking prevalence among males and females
4.2.2 Smoking among all males in SFMS-1998
4.2.3 Smoking among middle-aged adults
4.2.4 Spatial heterogeneity
4.3 Body mass
4.3.1 Overweight/obesity
4.3.2 Geographic mapping of overweight prevalence by states
4.3.3 Spatial heterogeneity
4.4 Diet and self-reported diabetes
4.4.1 Vegetarianism
4.4.2 Fruit intake
4.4.3 Diabetes
4.5 Ecologic association
4.5.1 Cardiovascular mortality
4.5.2 Ranking of states
4.5.3 Univariate regression analysis
4.5.4 Multivariate regression
4.6 Biases & limitations
4.6.1 Assessment of representativeness of surveys
4.6.2 Integrity of surveys
4.6.3 Study characteristics
4.6.4 Differences in sociodemographic characteristics
4.6.5 Limitations
5 DISCUSSION
5.1 Summary of key findings
5.2 Smoking
5.3 Overweight and obesity
5.4 Dietary factors and self-reported diabetes
5.5 Cardiovascular mortality
5.6 Study implications
5.7 Future directions for research
5.8 Conclusions
6 REFERENCES
7 APPENDIX
LIST OF TABLES
Table 1.1 Top dozen causes of death among middle-aged adults (ages 30-69 years), India, 2001-03
Table 1.2 Factors influencing risk of myocardial infarction (INTERHEART study)
Table 3.1 Description of the 5 databases, study periods, sample sizes and study populations
Table 3.2 Variables used from the 5 surveys (4 risk factor surveys & 1 mortality outcome survey)
Table 4.1 Descriptive analysis of baseline characteristics in the selected surveys
Table 4.2 Crude prevalence of CVD determinants in selected surveys, India
Table 4.3 Smoking among young, middle-aged and older adults by sex and residence, 1998
Table 4.4 Pearson correlation coefficients comparing state-level smoking prevalence across different surveys for males, ages 45-59 years
Table 4.5 Prevalence of overweight/obesity in NFHS-3 survey, 2005-06
Table 4.6 Prevalence proportions of overweight by residence and sex, NFHS-3 survey, 2005-06
Table 4.7 Vegetarianism among adults aged 15 years and over in India from selected surveys
Table 4.8 Reported fruit intake (at least weekly) in NFHS-3 survey, 2005-06
Table 4.9 Self-reported diabetes prevalence in NFHS-3 survey, 2005-06
Table 4.10(a) Ranking of states by outcome (CVD death rate, highest to lowest) for middle-aged males, India, 2006
Table 4.10(b) Ranking of states by outcome (CVD death rate, highest to lowest) for middle-aged females, India, 2006
Table 4.11 Correlations between male vs female ranks for 29 states
Table 4.12 Correlations between selected risk factors and CVD mortality by sex, in 29 states of India
Table 4.13 Pearson correlation coefficients between variables at state level, males and females
Table 4.14(a) Multiple linear regression of cardiovascular death rates among males at state level
Table 4.14(b) Multiple linear regression of cardiovascular death rates among females at state level
Table 4.15 Sex-ratios (no. of females per 1000 males) in selected surveys in comparison to the census 2001 population
Table 4.16 Potential sources of bias based on characteristics of respondents, survey instruments and interviewers in the four surveys
Table 5.1 Profile of cardiovascular disease and its risk factors in rural and urban population in southern India, PURE study
LIST OF FIGURES
Figure 1.1 Life and death in 20th century
Figure 3.1 Political map of India showing states and union territories
Figure 3.2 Process flow of the Million Death Study
Figure 3.3 Conceptual framework for the geographical epidemiological analysis of CVD in India
Figure 4.1 Crude prevalence of current tobacco smoking among males and females, aged 15 years and over, from selected surveys in India
Figure 4.2(a) Age-specific prevalence (bars) and cumulative prevalence (line) of current smoking among all males, ages ≥ 15 years, SFMS 1998
Figure 4.2(b) Cumulative prevalence (line) and percent increase in smoking above younger age-group (bars) of current smoking among all males, ages ≥ 15 years, SFMS 1998
Figure 4.3 Types of tobacco smoked by level of education among males, ages 15 years and over, SFMS 1998
Figure 4.4 Mean age of initiation of smoking by type of tobacco used among male smokers, SFMS 1998
Figure 4.5 Mean age of initiation of smoking among males for different types of tobacco by level of education, SFMS 1998
Figure 4.6 Type of tobacco smoked by place of residence among middle-aged male smokers, 1998
Figure 4.7 Types of tobacco smoked and proportion of beedi smokers among middle-aged (30-69 years) males in states of India, 1998
Figure 4.8 Prevalence of smoking among males in different states of India, 1998
Figure 4.9 Proportion of different types of tobacco smoked in different states, SFMS 1998
Figure 4.10 Maps of smoking prevalence among rural males, ages 45-59 years, across the four selected surveys
Figure 4.11 Correlation between smoking prevalence in states between SFMS and NFHS-2
Figure 4.12 Ratio of ex:current smokers among males, ages 45-59 years, SRS 2004
Figure 4.13 Scatterplot of global spatial autocorrelation (for males) and LISA maps showing local spatial clustering of smoking (for males and females), NFHS-3 [2005-06]
Figure 4.14 Boxplot showing distribution of body mass index by sex, 2005-06
Figure 4.15 Boxplot showing distribution of body mass index by gender and residence, 2005-06
Figure 4.16(a) Boxplots of distribution of body mass index by gender and age (males), 2005-06
Figure 4.16(b) Boxplots of distribution of body mass index by gender and age (females), 2005-06
Figure 4.17 Mapping of proportions of adults, ages 30-49 years, overweight by state, 2005-06
Figure 4.18 Mapping of proportions of adults, ages 20-29 years, overweight by state, 2005-06
Figure 4.19 Mapping of proportions of adults, ages 15-19 years, overweight by state, 2005-06
Figure 4.20 LISA maps showing local spatial clustering of overweight/obesity among males and females, NFHS-3 [2005-06]
Figure 4.21 Prevalence of lacto-vegetarianism in the states among adults in the NFHS-3 survey, 2005-06
Figure 4.22 LISA maps showing local spatial clustering of lacto-vegetarianism, NFHS-3 [2005-06]
Figure 4.23 Reported fruit intake (at least weekly) in various states by sex and residence, 2005-06
Figure 4.24 LISA maps showing local spatial clustering of fruit intake among males and females, NFHS-3 [2005-06]
Figure 4.25 Prevalence of self-reported diabetes in different states among adults aged 30 years & over by sex and residence, NFHS-3 survey, 2005-06
Figure 4.26 LISA maps showing local spatial clustering of self-reported diabetes among males and females, NFHS-3 [2005-06]
Figure 4.27 Age-standardized vascular death rate per 100,000 males and females, ages 30-69 years, in states of India [2006]
Figure 4.28(a) Plots of vascular death rate per 100,000 males against predictor variables
Figure 4.28(b) Plots of vascular death rate per 100,000 females against predictor variables
Figure 4.29 Age-sex pyramids of selected surveys in comparison with the census 2001 population
Figure 4.30 Comparison of proportions of adult males, ages 15-54 years, smoking different types of tobacco in selected surveys
Figure 4.31 Relationship of reporting bias for smoking with type of respondent in selected surveys in India
Figure 4.32 Prevalence of various risk factors by urban-rural residence
Figure 4.33 Prevalence of risk factors by education
Figure 4.34 Distribution of CVD determinants by residence-sex-education groups in India
LIST OF APPENDICES
Table 7.1 Poisson regression (males and females)
1 INTRODUCTION
1.1 India – current scenario
India’s population currently totals 1.13 billion or 17% of the world population. Between the first
and last decades of the 20th century, the crude death rate fell by nearly four-fifths and life expectancy
at birth nearly tripled from around 22 years to over 61 years (Figure 1.1) [1,2]. India has thus seen marked
reductions in death rates at young ages and more modest reductions in death rates in middle-age in
the 20th century.
Figure 1.1 Life and death in 20th century India
Source: [1,2]
Of the 10 million deaths that currently occur every year in India, about 3.4 million deaths
occur in the age-group 0-34 years; these are mostly due to acute infectious conditions. The 6.6 million
deaths that occur in those aged 35 years and over are mostly due to chronic conditions: 3.8 million occur
during middle-age (35-69 years) and 2.8 million occur during old-age (70 years and over) [3,4].
Evidence over the last two centuries from around the world suggests that while death in old age
(after age 70 years) is inevitable, death at young ages (below age 35 years) could become a rare
occurrence, and death in middle age (age 35-69 years) need not be common [5].
India is currently going through multiple transitions – demographic, socioeconomic and health
transitions. Together with the on-going demographic transition associated with improving survival
and increasing urbanization [6], there has also been a dramatic socioeconomic transition over the
last few decades [7]. Concurrently, it has witnessed a ‘risk transition’, characterized by changes in
tobacco and alcohol consumption, nutrition, and other lifestyles leading to changing patterns of
disease, disability and death called ‘epidemiologic or health transition’ [8,9]. Tobacco smoking,
physical inactivity, obesity, hypertension, high blood glucose and dyslipidemia are the conventional
cardiovascular risk factors identified from seminal studies such as the Framingham [10,11,12],
Whitehall [13] and Ni-Hon-San [14] field epidemiological studies as well as risk factor trials such
as the MRFIT [15] and North Karelia [16] trials. More recently, novel risk factors such as
lipoprotein(a), homocysteine, and high-sensitivity C-reactive protein have also been suggested to
influence development of atherosclerosis [17]. Other possible drivers of this epidemiologic
transition have also been proposed. Low birth weight and poor childhood growth have been linked
to increased susceptibility to cardiovascular disease in later life. Gene-environment interaction due
to the presence of a ‘thrifty gene’ in south Asians has also been proposed as a determinant of
early, excess and extensive cardiovascular disease in this population, mediated through the
development of hyperlipidemia, insulin resistance and abdominal obesity [18,19]. India thus now
faces a ‘double burden’: the task of tackling both the ‘unfinished agenda’ of
communicable, nutritional, maternal and child deaths and the ‘emerging epidemic’ of chronic,
non-communicable diseases such as obesity, hypertension, diabetes and cardiovascular diseases
[20,21,22]. Cardiovascular Diseases (CVD) such as coronary heart disease (CHD) and strokes are
already the leading cause of death in India and other south Asian countries [23].
1.2 Cardiovascular disease in India
1.2.1 Cardiovascular disease mortality
According to the Global Burden of Diseases Study, there were 1.60 million CHD deaths and
0.60 million stroke deaths during the year 2002 in India [24]. Mortality from these conditions is
predicted to rise rapidly in the future with the absolute numbers of CHD cases in India exceeding
those of the established market economies and China combined. Available information in India
comes from the Medical Certification of Cause of Death (MCCD) system which is a hospital-based
cause-of-death assignment system. According to the MCCD, which registered 14.5% of all deaths
in the country for the year 2000, about 20% of deaths in the age-group 15-54 years were due to
cardiovascular diseases [25]. The reliability of this mortality data has however been questioned on
issues of poor coverage, and with regard to poor compliance with guidelines for cause of death
reporting, coding and classification [26]. Hence, a nationally-representative cause-of-death
assessment is currently being pursued by the Office of the Registrar General of India through the
Million Death Study using a validated verbal autopsy (VA) technique [27]. Preliminary results on
mortality in middle-age (ages 30-69 years) for the period 2001-03 are shown in Table 1.1 [28].
Table 1.1 Top dozen causes of death among middle-aged adults (ages 30-69 years), India, 2001-03

      Males                                          Females
Rank  Cause of death              %     cum. %       Cause of death            %     cum. %
1     Cardiovascular disease      27.3  27.3         Cardiovascular disease    23.4  23.4
2     Tuberculosis                11.0  38.3         Neoplasms                 12.3  35.7
3     Chronic lung disease        10.7  49.0         Chronic lung disease      10.9  46.6
4     Neoplasms                    8.1  57.1         Tuberculosis               7.8  54.4
5     Unintentional injuries       7.7  64.8         Ill-defined/unknown        6.7  61.1
6     Digestive diseases           7.6  72.4         Diarrhoeal diseases        6.5  67.6
7     Ill-defined/unknown cause    4.5  76.9         Digestive diseases         4.9  72.5
8     Diarrhoeal diseases          4.1  81.0         Unintentional injuries     4.8  77.3
9     Intentional injuries         3.7  84.7         Malaria                    3.4  80.7
10    Genito-urinary disease       2.8  87.5         Genito-urinary disease     2.6  83.3
11    Malaria                      2.3  89.8         Fever of unknown origin    2.5  85.8
12    Diabetes mellitus            2.1  91.9         Diabetes mellitus          2.2  88.0
cum. = cumulative
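The cumulative columns in Table 1.1 are running sums of the cause-specific percentages. A minimal Python sketch, using the male percentages as printed in the table (the variable and function names are illustrative and not part of the thesis), reproduces them:

```python
from itertools import accumulate

# Cause-specific percentages for males, copied from Table 1.1 (rank 1 to 12).
male_pct = [27.3, 11.0, 10.7, 8.1, 7.7, 7.6, 4.5, 4.1, 3.7, 2.8, 2.3, 2.1]

def cumulative_percent(values):
    """Return the running total of percentages, rounded to one decimal place."""
    return [round(total, 1) for total in accumulate(values)]

male_cum = cumulative_percent(male_pct)
# The last entry is the share of deaths covered by the top dozen causes.
print(male_cum[-1])
```

The same function applied to the female column should give the female cumulative figures, which is a quick way to check a transcribed table for slips.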
Smaller population-based studies from urban Tamil Nadu and rural Andhra Pradesh using
verbal autopsy instruments have recently yielded cause-specific mortality information [29,30].
In urban Chennai, analyses of a total of 66,777 deaths revealed that cardiovascular diseases were the
largest group with 18,680 deaths (28%) [29]. In rural Andhra Pradesh, 34%
of deaths among males and 30% of deaths among females were reported to be due to diseases of the
cardiovascular system, among 1354 deaths occurring in a year in a community of about 150,000
persons followed prospectively [30].
1.2.2 Prevalence of Cardiovascular Disease
In the absence of reliable nationwide prospectively collected morbidity data, estimates of the
prevalence of CHD have been based on indicators from population-based, cross-sectional surveys.
Multiple epidemiological studies have been performed in urban and rural populations in India over
the past few decades. Comparisons of these studies are limited by inadequate
sample sizes, variable response rates, lack of age-standardization, unstandardized diagnostic criteria
such as medical history and non-specific electrocardiographic changes like abnormal ST-T waves,
and inadequate reporting of results. A review of a subset of high-quality studies that used broadly
similar recruitment procedures, study methods and diagnostic criteria (known CHD, Rose
questionnaire angina and/or electrocardiographic Q-ST-T changes) offers a perspective on
secular trends. A higher prevalence of CHD was consistently seen in urban communities (6.6%-12.5%)
as compared to rural communities (2.1%-4.3%); the relative risk was about 3.0. There were
also significantly increasing trends in urban (r² = 0.60) and rural (r² = 0.31) regions over the last
four decades [22].
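The relative risk of about 3.0 can be checked roughly against the reported prevalence ranges. The sketch below assumes that comparing the endpoints of the two ranges is an adequate approximation; the review in [22] may have derived its figure differently, and the names used are illustrative.

```python
# Reported CHD prevalence ranges (percent) in urban and rural communities.
urban = (6.6, 12.5)
rural = (2.1, 4.3)

def prevalence_ratio(exposed, reference):
    """Ratio of two prevalences; a crude stand-in for relative risk."""
    return exposed / reference

low_ratio = prevalence_ratio(urban[0], rural[0])   # lower ends of the ranges
high_ratio = prevalence_ratio(urban[1], rural[1])  # upper ends of the ranges
print(f"{low_ratio:.2f} {high_ratio:.2f}")
```

Both ratios fall close to 3, which is consistent with the summary figure quoted in the text.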
Stroke prevalence studies are very limited in India and the available studies share the
multiple biases seen in studies of CHD. The crude and age-standardized prevalence rates of stroke
appear to be higher in urban populations than in rural subjects. However, evaluation of secular trends
in stroke in India is not possible owing to the small number of studies [22].
1.2.3 Cardiovascular risk factors
The INTERHEART study, a 52-country case-control study involving 15,152 cases of first
myocardial infarction and 14,820 age- and sex-matched hospital/community controls, found that
over 90% of cases of acute myocardial infarction could be attributed to nine well-known coronary
risk factors – smoking, low fruit and vegetable consumption, low physical activity, alcohol
consumption, psychosocial stress, abdominal obesity, diabetes, hypertension, and abnormal lipids
(Table 1.2) [31]. The risk factors found to be important in the overall
INTERHEART study were also found to operate within the south Asian subset [32]. A key
finding of this study was that the south Asians presenting with a first myocardial infarction were younger
than those from other regions.
Table 1.2 Factors influencing risk of acute myocardial infarction (INTERHEART study)

Risk factor                                       OR (99% CI)        PAR (99% CI)
Increased risk (harmful)
Apo B:apo A1 ratio (highest vs. lowest decile)    3.25 (2.81-3.76)   49.2% (43.8-54.5)
Smoking (current vs. never)                       2.87 (2.58-3.19)   35.7% (32.5-39.1)
Psychosocial factors                              2.67 (2.21-3.22)   32.5% (25.1-40.8)
Diabetes                                          2.37 (2.07-2.71)    9.9% (8.5-11.5)
Hypertension                                      1.97 (1.74-2.10)   17.9% (15.7-20.4)
Abdominal obesity (highest vs. lowest quartile)   1.62 (1.45-1.80)   20.1% (15.3-26.0)
Decreased risk (protective)
Alcohol consumption (3 times/week)                0.91 (0.82-1.02)    6.7% (2.0-20.2)
Regular physical activity                         0.86 (0.76-0.97)   12.2% (5.5-25.1)
Fruit and vegetable consumption (daily)           0.70 (0.62-0.79)   13.7% (9.9-18.6)
OR = odds ratio adjusted for all other risk factors; CI = confidence interval; PAR = population attributable risk
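For orientation, a population attributable risk can be approximated from a relative risk and the prevalence of exposure using Levin's formula, PAR = p(RR − 1) / (1 + p(RR − 1)). The sketch below is illustrative only: INTERHEART reported adjusted PARs from its own models, the odds ratio is used here as a stand-in for relative risk, and the exposure prevalence of 0.30 is a hypothetical value rather than a figure from the study.

```python
def levin_par(prevalence, relative_risk):
    """Levin's (unadjusted) population attributable risk for one exposure."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)

# Odds ratio for current smoking from Table 1.2; the prevalence is hypothetical.
smoking_or = 2.87
assumed_prevalence = 0.30

par = levin_par(assumed_prevalence, smoking_or)
print(f"PAR = {par:.1%}")
```

The formula shows why a common exposure with a moderate odds ratio can account for a larger share of cases than a rare exposure with a high one.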
Stroke is a clinical entity caused by either extracranial/intracranial vascular atherothrombotic
pathology or intracranial haemorrhagic conditions. Risk factors differ for each type of stroke –
atherosclerosis risk factors (as in CHD) predominate in the former, whereas hypertension and
smoking are common in the latter. Leading stroke risk factors in low- and middle-income countries
include raised blood pressure, smoking, low fruit and vegetable intake, low physical activity, and
alcohol excess [33]. The relationship between salt intake and elevated blood pressure was
established by the INTERSALT study, a 32-country study involving 10,079 men and women aged
20-59 years, in whom 24-hour urinary sodium was positively correlated with blood
pressure adjusted for age and sex, after taking account of body mass index and alcohol intake
as confounders [34]. Further, the Prospective Studies Collaboration meta-analysis of 61 prospective
studies with over one million individuals from industrialized countries reported a direct relationship
between blood pressure and stroke after correction for regression-dilution; throughout middle and
old age (40-89 years), usual blood pressure was strongly and directly related to stroke death rate,
with no evidence of a threshold down to at least 115/75 mm Hg [35].
1.2.4 Geography and Cardiovascular Disease
Age-standardized cardiovascular death rates (per 100,000) in middle-aged subjects (30–69
years) are high in India (428) and other low- and middle-income countries such as Brazil (330),
China (290), Pakistan (425), Nigeria (452) and Russia (688), while they are lower in industrialized
countries such as Canada (140) and Britain (182) [36]. Moreover, in India about 50% of CHD-related
deaths occur in people younger than 70 years compared with only 22% in the West [36,37].
Studies in emigrants have indicated that South Asians had higher rates of CHD [38], and genetic factors
were suggested as an explanation. These studies however suffered from multiple biases, the major one being the "healthy
survivor" bias, as survivors of acute coronary events who reached the study hospitals were younger,
more educated, more affluent and had risk factors that were not considered significant with the
knowledge available at the time [22].
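The death rates quoted at the start of this subsection are age-standardized, which removes differences in age structure between the populations being compared. A minimal sketch of direct standardization follows; the age bands, age-specific rates and standard-population weights are all hypothetical placeholders chosen for illustration, not values from [36] or from this thesis.

```python
def directly_standardized_rate(age_specific_rates, standard_weights):
    """Weighted average of age-specific rates using standard-population weights."""
    if age_specific_rates.keys() != standard_weights.keys():
        raise ValueError("rates and weights must cover the same age bands")
    total_weight = sum(standard_weights.values())
    weighted = sum(age_specific_rates[band] * standard_weights[band]
                   for band in age_specific_rates)
    return weighted / total_weight

# Hypothetical CVD death rates per 100,000 by age band (30-69 years).
rates = {"30-39": 60.0, "40-49": 180.0, "50-59": 450.0, "60-69": 1100.0}
# Hypothetical standard-population weights for the same bands.
weights = {"30-39": 0.35, "40-49": 0.30, "50-59": 0.20, "60-69": 0.15}

asr = directly_standardized_rate(rates, weights)
print(round(asr, 1))
```

Applying the same weights to every country or state is what makes the resulting rates comparable; changing the standard population changes the absolute values but usually not the ranking.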
Globally, the bulk of modern descriptive research has focused on time and person,
with little consideration of the implications of place [39]. In India too, there are limited data on
regional variations in CVD. About a decade ago, Gupta et al [40] linked hospital-based
mortality data from the Medically Certified Cause of Death (MCCD) database to risk factor data
from the second National Family Health Survey and the India Nutrition Profile Study. They found
that cardiovascular death rates among middle-aged adults in different Indian states varied from a
low of 75-100 per 100,000 individuals in sub-Himalayan states of Nagaland, Meghalaya, Himachal
Pradesh and Sikkim to a high of 360-430 per 100,000 in Andhra Pradesh, Tamil Nadu, Punjab and
Goa. Such large variations in cardiovascular disease mortality in different Indian states could at
least partly be attributed to differences in dietary consumption of fats, milk, sugars and green-leafy
vegetables, as well as the prevalence of obesity [40].
Urban-rural differences in the distribution of cardiovascular disease have also been
documented: a higher prevalence of coronary heart disease is seen in urban (6.6%-12.5%)
than in rural populations (2.1%-4.3%). There is also evidence of a significantly increasing time trend
in urban and rural areas over the last few decades [22]. Furthermore, the underlying risk factors
such as smoking, dietary patterns, obesity, diabetes and hypertension [21,22,41,42] also show urban-rural
differences, with cigarette smoking, adverse diet, obesity, diabetes and hypertension
more common in urban areas and smoking of beedis more common in rural areas.
1.3 Health Information Visualization
One of the challenges of the current century is to improve the widespread uptake and use of
health policies that take account of up-to-date epidemiological evidence. Risk communication
is a critical component of this process of knowledge dissemination, with the objective of presenting
scientific outputs in ways that are understandable to scientists and non-scientists. One of the
mechanisms for enhancing the transparency and widespread understanding of scientific evidence is
to use visual methods of presentation in order to make fairly abstract quantitative results easier to
comprehend [43,44]. Visual methods such as tables, graphs, pie-charts, boxplots, histograms and
maps facilitate our understanding of health issues by summarizing complex survey data, allowing
for visualization, and stimulating thought leading to new ideas and solutions [43]. Such
visualization in the field of CVD epidemiology is useful in two separate domains [45,46]: (1) the
academic domain, in which the use of such visualization to explore data would help in
understanding the distribution of cardiovascular diseases in India and generating hypotheses; and
(2) the public domain, in which a graphical report using enhanced visualization enables
professionals to present the “visual thinking” on the geographic distribution of cardiovascular
diseases to other concerned stakeholders for appropriate action.
This type of data visualization is especially important for chronic diseases such as
cardiovascular diseases which are a neglected epidemic in many low- and middle-income countries
[47]. The hypothesized reasons for this neglect include: lack of up-to-date evidence on disease
burden in the hands of decision makers; strong beliefs that chronic diseases affect only the affluent;
beliefs that the control of chronic disease is not cost-effective and hence should wait until infectious
diseases are controlled; and the orientation of health systems toward acute care [48].
Visualization of the geographical epidemiology of cardiovascular disease in India has the potential
to enhance understanding for stakeholders such as global agencies, governments, academia and
research groups, donors, health professionals and the private sector.
1.4 In summation
Cardiovascular diseases have become the leading cause of death and disease burden in India.
Conventional risk factors explain much of the CVD burden though genetic factors and possibly fetal
programming of adulthood chronic diseases have been proposed as relevant to the south Asian
context. The unique epidemiological context of the CVD epidemic in India with regard to smoking
(of beedis and cigarettes, and a later age of initiation of smoking), low overall rates of obesity but
with presence of central obesity, increasing prevalence of diabetes (especially in urban areas), and
high levels of CVD mortality that vary across different regions of the country point to the need for a
focused descriptive geographical epidemiological study of cardiovascular disease in India. Visual
methods of presentation of study findings offer a mechanism to enable a wider understanding of the
epidemiologic evidence regarding cardiovascular diseases for greater action regarding their control.
Such a study would not only help appreciate the health situation within India but could also
contribute to a better understanding of global health.
In this thesis, first I review the relevant literature on the descriptive epidemiology of CVD
risk factors in India including features that are similar to global findings and also features that are
unique to India. Then in the methods section, I describe the datasets used and outline the statistical
and geographical analyses undertaken. Subsequently, the results of this analysis in the form of
descriptive and geographical epidemiology of key CVD risk factors (smoking, body mass, dietary
factors and self-reported diabetes) and cardiovascular mortality outcomes are presented. This is
followed by a discussion of study findings along with interpretation of biases and limitations of the
datasets and analytic methods. Finally, the significant conclusions from this study are listed along
with possible directions for future research.
1.5 Objectives
1.5.1 General Objective
1.5.1.1 To describe the geographical epidemiology of cardiovascular disease in India
1.5.2 Specific Objectives
1.5.2.1 To describe the geographical epidemiology of cardiovascular risk factors (smoking, body
mass, diet and self-reported diabetes) and cardiovascular mortality in India
1.5.2.2 To present a graphical display of cardiovascular risk factors and mortality outcomes by
age, sex, education, region and residence in India
1.5.2.3 To describe the biases in the study datasets
2 REVIEW OF LITERATURE
In this section, I first review current knowledge regarding the various cardiovascular risk
factors and mortality for an understanding of the CVD epidemic; in particular, features that are unique
to the epidemiological context within India are highlighted. Then I describe applications of
geographical epidemiology in the field of cardiovascular disease from literature; this was available
predominantly from industrialized countries. Lastly, I look at the utility of data visualization using
graphical displays in bringing to the attention of policymakers, academics and other stakeholders, the
burden associated with cardiovascular diseases.
2.1 CVD risk factors
The following is a description of key determinants of cardiovascular mortality including smoking,
obesity, dietary factors and diabetes. The strongest evidence to date comes from systematic reviews of
the effects of smoking and obesity on heart disease. Smoking leads to a loss of 10 years of life [49].
Obesity leads to a loss of three years of life [50]. While the risk factors in the Indian population are
similar to those in the global population [38], there are however some important differences in the
distribution and nature of these risk factors in the Indian population that provide a unique
epidemiological context vis-à-vis the evolving epidemic of cardiovascular diseases.
2.1.1 Smoking
India has a unique variation in the types of tobacco smoked [51]. Beedis and cigarettes are the
most common forms of smoking. A beedi consists of 0.2-0.3 grams of sun-cured tobacco loosely
packed and rolled in a rectangular piece of dried leaf (temburni leaf) and tied with a cotton thread.
Beedis may allow two to three times as many puffs as an ordinary cigarette. Because of the low
porosity of their wrappers and their poor combustibility, beedis must be puffed frequently to be kept
alight and so they deliver a relatively higher dose of tar to the smoker [52]. A less common form of
smoking is water-pipe smoking known as hookah. Hookah pipes involve smoking of tobacco from
cured leaves or leaves fermented in molasses, honey or fruit juices then covered with glowing
charcoal. This is seen in some parts of the country. Also, cheroots (chutta), which resemble cigars, are
smoked in a few regions of the country. Reverse chutta smoking (with the lit end inside the mouth) is
also seen in some districts.
There are an estimated 100 million adult smokers (95 million males and 7 million females) in
India. Importantly, cessation rates among smokers are low. About 2% of men are ex-smokers, and
many of them quit due to disease. In contrast, male ex-smoking rates are about 40% in many
industrialized countries where the risks of smoking are better known [53].
Some features of the tobacco epidemiology that are specific to India and could therefore have
a unique impact on the epidemiology of the evolving CVD epidemic in India are listed below:
o Unlike in all high-income countries and in China where tobacco consumption is mostly
of cigarettes, in India tobacco smoking consists of the use of beedi (predominantly) and
cigarettes [51]
o Analysis of age-specific prevalence proportions of smoking among males in India
[54,55] revealed interesting differences in comparison to global and American data. For
example, the peak smoking rate (50%) among Indian males was seen in the 40-49 year age-group;
this was older than the peak use (36%) in ages 30-39 years seen globally [56] and
the peak use (39%) seen in the age-group of 21-25 year old males in America [57]
o Amount of smoking is generally low in India as compared to other countries. Mean
number of beedis or cigarettes smoked per day by male smokers aged 30-69 years was
4.0 in India [58] compared to about 14 cigarettes per day in U.S.A [59]
o Wide variations in prevalence of smoking among males have been noted across different
states [41,54]
Further, Jha et al (2008) have recently used a case-control study design with data from the first
phase of the Million Deaths Study to conclude that in persons between the ages of 30 and 69 years
smoking was responsible for about 1 in 20 deaths of women and 1 in 5 deaths of men [58].
2.1.2 Body mass
The practical methods applicable for assessing large populations are body mass index (BMI),
waist circumference (WC) and the waist:hip ratio (WHR) as these are the commonly used measures in
epidemiological studies. Of these, the most widely used measure of overall obesity in adults is the
BMI (Quetelet index), a measure of weight adjusted for height, calculated as weight (kg)/[height (m)]².
BMI is closely correlated with more sophisticated measures of obesity and, as such, is a useful
screening tool. It has been widely used in population studies and predicts the future development of
diabetes. The definitions of overweight and obesity have varied in different studies. The WHO
definition for pre-obesity is BMI = 25.0-29.9 kg/m² and for obesity is BMI ≥ 30 kg/m². Obesity is
further classified as Class I (BMI = 30.0-34.9 kg/m²), Class II (BMI = 35.0-39.9 kg/m²), and Class III
or morbid obesity (BMI ≥ 40 kg/m²).
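The BMI calculation and the WHO cut-offs quoted above can be written out as a short sketch. This is an illustration only and not code used in the thesis; the function names are hypothetical, and values below 25 kg/m² are grouped under a single label because the text does not subdivide them.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index (Quetelet index): weight (kg) / [height (m)] squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def who_category(bmi_value: float) -> str:
    """Map a BMI value to the WHO categories quoted in the text.

    Values below 25 are not subdivided in the text, so they are returned
    under one catch-all label (an assumption of this sketch).
    """
    if bmi_value >= 40.0:
        return "obesity class III"
    if bmi_value >= 35.0:
        return "obesity class II"
    if bmi_value >= 30.0:
        return "obesity class I"
    if bmi_value >= 25.0:
        return "pre-obesity"
    return "below 25 (not classified here)"


if __name__ == "__main__":
    # Hypothetical adult: 95 kg and 1.75 m tall
    value = bmi(95, 1.75)
    print(round(value, 1), who_category(value))
```

Checking from the highest threshold downward keeps each boundary value (25, 30, 35, 40) in the higher category, which matches the "≥" signs in the WHO definitions.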
Among the CVD risk factors, poor nutritional habits and physical inactivity are seen to be
contributing to the epidemic of obesity sweeping the world. Body fat distribution, especially visceral
adipose tissue accumulation, has been found to be a significant correlate of a cluster of diabetogenic
and atherogenic abnormalities [60]. The most recent update on the association between body mass and
cardiovascular mortality comes from the Prospective Studies Collaboration meta-analysis of 57
prospective studies with 894,576 participants, mostly from western Europe and North America [50].
After adjusting for age, sex and smoking status, it was found that at BMI of 30-35 kg/m², median
survival was reduced by 2-4 years; at 40-45 kg/m², it was reduced by 8-10 years (which is comparable
with the effects of smoking).
Characteristics relating to body mass that are unique to India include:
o Overall rates of obesity are low in the Indian population [61,62]
o Central obesity (abdominal or visceral obesity) is however more common in south
Asians than in Caucasians. Factors contributing to this phenotype of obesity may include
genotype, fetal growth, appetite, physical activity and body composition [62]
o Overweight/obesity is more likely to co-exist with undernutrition as twin burdens in
rapidly developing economies with income inequalities such as India [63]
2.1.3 Dietary factors and Diabetes
Dietary factors
Diet and nutrition have been extensively investigated as risk factors for cardiovascular
diseases such as coronary heart disease (CHD) and stroke. Their links to other cardiovascular risk
factors like diabetes, high blood pressure and obesity have also been established according to a recent
comprehensive review by Reddy and Katan [64]. However, considerable practical and methodological
issues have been recognized in many of these studies, pertaining to the
measurement of exposures, definition of health outcomes, multitude of research designs and the need
for careful consideration when inferring causality. There is sufficient evidence from a variety of
studies linking several nutrients, food groups and dietary patterns with an increased or decreased risk
of CVD. There is also substantial evidence showing that vegetarians have a lower mortality from
ischaemic heart disease than non-vegetarians; however, cancer mortality and total mortality do not
differ. Vegetarianism can be subdivided into lacto-ovo-vegetarianism (a diet with dairy products and eggs
but without meat and fish) and veganism (a diet without any animal foods whatsoever, including dairy
products and eggs). Dietary fats such as trans-fats and saturated fats are associated with an increased
risk of CHD while polyunsaturated fats are known to reduce risk. Dietary sodium is associated with
elevation of blood pressure, while dietary potassium protects against hypertension and stroke. Regular,
frequent intake of fruits and vegetables is cardio-protective. Composite and prudent diets
appear to reduce the risk of CHD and stroke by being preventative and therapeutic [64].
Diabetes
Much of the evidence related to high propensity of diabetes among south Asians comes from
expatriates in Britain and America. This high incidence of diabetes and its complications does not have a
single explanation. The early incidence of diabetes and its link with coronary heart disease may be
partially explained by the central adiposity-insulin resistance syndrome [62]. Predisposition to this
may be genetic but exacerbated by other factors such as diet and physical activity levels in a rapidly
changing socio-cultural milieu [62,65].
Factors relating to diet and diabetes that are peculiar to the Indian context are listed below:
Diet
o The Indian diet is predominantly vegetarian with a recent shift towards higher intake of
fats and added sugars [66]
o Trans-fats or hydrogenated fats (e.g. vanaspathi) are commonly used, especially in
several parts of urban India [67]
Diabetes
o South Asians appear to have a high risk of developing diabetes. It is hypothesized that
impaired sensing of glucose, reduced insulin secretion, or increased insulin resistance
leads to the development of impaired glucose tolerance and diabetes mellitus [62].
Glucose intolerance, abdominal obesity and the metabolic syndrome features appear to
be important factors associated with the development of CHD in south Asians [62,65]
o Evidence of increasing diabetes prevalence from within the country is mostly anecdotal,
based on rising clinical disease burden in hospital-based settings. Population-based
studies are limited in number with some evidence that diabetes may be rising in urban
areas of the country [68]
2.1.4 Other risk factors -- Hypertension and dyslipidemia
The Prospective Studies Collaboration meta-analysis of 61 prospective studies with over one
million individuals from industrialized countries reported a direct relationship between blood
pressure and stroke throughout middle and old age (40-89 years) -- usual blood pressure was
strongly and directly related to stroke death rate [35]. Blood cholesterol is also a major risk factor
for cardiovascular morbidity and mortality [13,15,69]. Total cholesterol was positively associated
with IHD mortality (but not stroke) in both middle and old age and at all blood pressure levels in a
meta-analysis of the above 61 studies [69]. Information from nationally-representative surveys is
however lacking from India on these two risk factors.
2.2 CVD mortality
The age-standardized cardiovascular disease death rate in middle-aged adults (30–69 years) is
estimated to be high in India (428 per 100,000) [36]. Hospital-based [40] and population-based
[29,30] studies indicate that CVDs are already the leading cause of death in this age-group.
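For readers unfamiliar with age standardization, the sketch below shows the direct method, in which age-specific death rates are weighted by a standard population. The source of the estimate quoted above is not described here, so the choice of the direct method, the four age bands, the death counts and the weights are all assumptions made purely for illustration; none of the numbers are study data.

```python
def age_standardized_rate(deaths, person_years, standard_weights, per=100_000):
    """Directly standardized rate: weighted sum of age-specific rates.

    deaths, person_years and standard_weights are parallel sequences,
    one entry per age band. Weights are normalized to sum to 1.
    """
    if not (len(deaths) == len(person_years) == len(standard_weights)):
        raise ValueError("all inputs must have one entry per age band")
    total_weight = sum(standard_weights)
    if total_weight <= 0:
        raise ValueError("standard weights must sum to a positive number")
    rate = 0.0
    for d, py, w in zip(deaths, person_years, standard_weights):
        rate += (d / py) * (w / total_weight)
    return rate * per


if __name__ == "__main__":
    # Hypothetical age bands 30-39, 40-49, 50-59, 60-69 (not study data)
    deaths = [10, 30, 60, 100]
    person_years = [100_000, 100_000, 100_000, 100_000]
    standard_weights = [4, 3, 2, 1]  # hypothetical standard population
    print(age_standardized_rate(deaths, person_years, standard_weights))
```

Because the weights come from a fixed standard population, rates computed this way can be compared between states with different age structures.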
2.2.1 Assessment of cause-of-death
Assessment and attribution of cardiovascular mortality is, however, challenging in the Indian
context because most deaths occur outside of the hospital setting. This demands that there be a system
of reliable ascertainment and validation of causes of death in ‘at-risk’ age-groups. Hence, a new and
enhanced “verbal autopsy” (VA) instrument [27,29] has been developed, piloted, and implemented in
certifying 125,000 deaths within the Sample Registration System, India’s flagship fertility and
mortality monitoring system.
The verbal autopsy method has three main components. First is data collection on
circumstances surrounding death by non-medical staff via household interview. Second is re-sampling
of the field work, and other quality control checks. Third, there is central medical adjudication of the
field reports to arrive at a final cause of death. Household assessment of the cause of death involves an
investigation of the train of events and/or circumstances at the onset and during the course of the
terminal illness, through an interview of relatives/associates of the deceased. VA can be of substantial
help in assessing the “underlying cause of death” [70]. Verbal autopsy is now of established value in
helping to classify the broad categories of mortality in young and middle ages (0 – 69 years) although
there is variation in the sensitivity and specificity for certain diseases.
These VA methods, study instruments, training material, coding manuals, and quality control
checks are now freely available [27] and are being increasingly used in settings with poor death
registration and certification systems.
Despite the substantial misclassification that is inevitable, results obtained by the use of VA
provide much better evidence than was earlier available on cause-specific mortality rates for India as a
whole, and on the geographic variation in those mortality patterns.
2.3 Geographical epidemiology
The concept of “place” in human health is probably a surrogate for the interplay between
genetic factors, environment, lifestyle and society. From time immemorial, ‘place’ as a determinant of
health has been acknowledged in scientific enquiries, as in ‘On Airs, Waters and Places’ by
Hippocrates circa 400 BCE. Systematic interest in the field has emerged over the last 50
years, with the perspective and methodology of geography being applied to the study of health and
disease over the last few decades. The emergence of a systematic interest in geography and health
can be seen from the first report of the Commission on Medical Geography (Ecology) of Health and
Disease of the International Geographic Union in 1952 [71]. Subsequently, interest in the field
diffused around the world as evidenced by focused enquiries on geography and health documented
from many countries. These studies led to the development of health geography consisting of the
fields of medical geography and geographical epidemiology with slightly differing foci -- medical
geography maintained an ecologic perspective to study disease patterns while geographical
epidemiology maintained a significant focus on study design and analytic methods. The primary
difference appears to be the focus of medical geography on the spatial context of health-related
issues—an aspect that epidemiology recognizes but rarely explicitly considers [39]. Health
geography is thus both an ancient perspective and a modern specialization.
Over the last decade, however, advances in geotechnology and analytic techniques have
given a major impetus to the fields of medical geography, geographical epidemiology and public
health sciences. Modern advances such as geographic information systems (GIS) have helped advance
the science of health geography greatly. A GIS is a system of hardware, software, and procedures for:
capture, management, manipulation, analysis, modeling and display of spatially referenced data for
solving complex planning and management problems [72]. GIS technology and the term originated in
Canada. The first fully operational GIS was the Canada Geographic Information System (CGIS)
developed at Natural Resources Canada in late 1963 [73].
At the international level over the last decade, geographical epidemiologic outputs on health
topics in the form of atlases have been published to convey messages to a wider audience. These
include the World Atlas of Health, the Global Child Health Atlas, and other disease-specific atlases on
Diabetes, Tobacco, Cancers, and Heart Disease and Stroke. Such atlases have been constructed for
use at the continent level [74], at national level in U.S.A [75] and Canada [76] or at subnational level
for a province such as the Ontario Diabetes Atlas [77] or for a city such as the Toronto Diabetes Atlas
[78].
Within the field of cardiovascular disease (CVD) epidemiology, mapping the prevalence of
cardiovascular risk factors and disease burden reveals an interesting 2-to-3 fold, west-to-east gradient
within Canada [76]. Similarly, geographical differences in prevalence are also seen in the Toronto
diabetes atlas with a low prevalence of diabetes in the high-income area of central Toronto and a high
prevalence of diabetes in the eastern and western suburbs, which have a large population of south Asian
immigrants [78].
Spatial analysis of cardiovascular mortality in the United States has revealed a west-to-east
gradient in coronary heart disease mortality with clustering in some states [75]; the clustering of
stroke mortality was also common in these states as well as in some other areas [79]. Geographic
disparities (urban-rural differences and inter-state differences) in cardiovascular health and underlying
social determinants have also been identified [80]. In Canada [81] and Germany [82] as well, regional
differences in coronary heart disease and in personal and regional risk factors have been
documented. A recent review of the status of cardiovascular disease in the Commonwealth
countries revealed that CVD death rates were higher in south Asian and sub-Saharan countries than in
European, North American or Australasian countries [83]. Further, even within a single population
subgroup such as among Native Indians within U.S.A., there are geographic variations in
cardiovascular disease and risk factors depending on residence in different states [84].
In India too, the descriptive epidemiology of mortality and risk factors, be it for communicable or
noncommunicable diseases, has traditionally emphasized person and time characteristics but
not place. Exceptions to this include the limited mapping of maternal and child health status (such as
maternal mortality ratios, infant mortality ratios and childhood vaccination coverage) that is available
for the entire country. Mapping of HIV prevalence among women attending public antenatal clinics in
115 districts of the four high-risk southern states (of the total 35 states) in India has recently become available
[85].
Epidemiology of coronary heart disease has been documented in India from the 1960s [86].
Recent reviews on distributions of hypertension, [21] coronary heart disease, stroke and their risk
factors [22] have also focused on person and time predominantly. There are however some
exceptions. Wide variations in tobacco use across various states in the country have been documented
in nationally-representative surveys [41,54]. State-level differences in dietary intake and
cardiovascular disease mortality rates have also been described [40].
2.4 Data visualization
Graphical displays help to disclose complex structure in data [44,87]. From this point of view,
data visualization may not only create interest and attract the attention of the viewer but also provide a
way of discovering unexpected trends or patterns.
Maps by themselves have been used since time immemorial in ancient civilizations to present
visual information but an atlas or a collection of maps is a more recent phenomenon. An atlas is
typically a collection of maps of the earth or a region of the earth. Abraham Ortelius is credited with
issuing the first ‘modern’ atlas of 53 maps in 1570 in Antwerp, Belgium. However, the word
"atlas" did not come into use for a bound collection of maps until 1595, when it was first used by
Gerardus Mercator. Map-making or Cartography is both a science and an art. Unlike general
cartography which involves maps that are constructed for a general audience, thematic cartography
(statistical maps) involves maps of specific themes oriented toward specific audiences. The intent of
maps is to illustrate in a manner in which the ‘percipient’ acknowledges its purpose in an accurate,
comprehensible and timely fashion [88].
Graphical displays with maps, boxplots, line graphs, pie charts and other visualization
methods are part of a ‘periodic table of visualization methods’ that may be used to translate
knowledge to multiple stakeholders. They have the ability to blend science and art into a document
that provides relevance and meaning to the communication process [43]. These visual methods thus
enable us to bridge the gap in information sharing and the decision-making process by reducing detail
and complexity to a simple visual representation that can be more easily understood by a variety of
professionals and other interested individuals. This could be of relevance in developing countries
where noncommunicable disease control is not yet on the radar of policymakers and other
stakeholders due to inadequate information [48].
3 METHODOLOGY
In this section, following a brief description of the study setting and study design, I explain in
detail the 5 secondary datasets and study variables that I used in my study. Subsequently there is
information on data management. In data analysis, I cover univariate and multivariate statistical
analyses undertaken. Lastly, there is an outline of geographical analytic methods used.
3.1 Study setting
Lying entirely in the northern hemisphere, India extends between 8° 4' and 37° 6' latitudes
north of the equator, and between 68° 7' and 97° 25' longitudes east of the prime meridian. Its
population totals 1.13 billion with a median age of 23.8 years. It has 2.2% of world’s land mass and
16.6% of world population. India is a federal union of 35 administrative provinces; of these, 19 are
large states (>10 million population), 10 are small states (<10 million population) & six are union
territories (UT) (Figure 3.1). As of 2008, these 35 provinces were subdivided into a total of 611
districts. There is wide variation in the size, structure and composition of states.
Figure 3.1 Political map of India showing states and union territories
Large states: (1) Jammu & Kashmir, (2) Punjab, (3) Haryana, (4) Delhi, (5) Rajasthan, (6) Uttar Pradesh, (7) Bihar, (8) Jharkhand, (9) Assam, (10) West Bengal, (11) Orissa, (12) Chattisgarh, (13) Madhya Pradesh, (14) Gujarat, (15) Maharashtra, (16) Andhra Pradesh, (17) Karnataka, (18) Kerala, (19) Tamil Nadu
Small states: (1) Goa, (2) Himachal Pradesh, (3) Uttarakhand, (4) Arunachal Pradesh, (5) Meghalaya, (6) Manipur, (7) Mizoram, (8) Nagaland, (9) Sikkim, (10) Tripura
Union territories: (1) Chandigarh, (2) Dadra, Nagar & Haveli, (3) Daman & Diu, (4) Pondicherry, (5) Lakshadweep, (6) Andaman & Nicobar Islands
3.2 Study Design
This thesis was a descriptive study of the geographical epidemiology of cardiovascular
disease in India. An ecologic analysis was undertaken on the secondary data from nationally-
representative health surveys in India.
3.3 Data sources
All nation-wide health surveys conducted in the last decade were potentially eligible for
review. Six surveys were considered – India Nutrition Profile (1994-96) [89], Special Fertility and
Mortality Survey (SFMS, 1998) [90], National Family Health Survey (NFHS-2, 1998-99) [91],
Million Death Study – Phase I (MDS, 2001-03) [27], Sample Registration System (SRS, 2004) [92]
and National Family Health Survey (NFHS-3, 2005-06) [93]. Of these, the India Nutrition Profile
was found to be an aggregate of two independent surveys (National Nutrition Monitoring Bureau
survey 1994 and the District Nutrition Profile 1995-96). The NNMB survey, however, covered only
the rural regions of 8 states; the DNP covered 18 states – some with both urban and rural
regions and some with rural regions only. Large and populous states such as Uttar Pradesh and West
Bengal were among the six states not covered in either of the surveys. Hence the India Nutrition
Profile was considered not nationally representative and was excluded.
All the other five surveys were included for the secondary data analyses. SFMS, SRS and
MDS were conducted by the Registrar General of India (RGI), Ministry of Home Affairs, Govt of
India, New Delhi, while NFHS-2 & 3 were organized by the International Institute for Population
Sciences (IIPS), Mumbai. MDS was a mortality outcome survey and the other four were considered
as risk factor surveys. All five surveys followed sampling design procedures to obtain health
indicators at the national and state levels. Some key characteristics of the five surveys are shown in
Table 3.1 and the details of the data sources are described below.
3.3.1 Special Fertility and Mortality Survey (SFMS)
This was a one-time special survey conducted in 1998 covering 3.7 million individuals aged
15 years and over from 1.1 million households [90]. This survey collected information at
community level (social/community facilities), household level (religion, caste, household
characteristics) and individual level (demographics, fertility and mortality). It was conducted within
the Sample Registration System (SRS) framework – India’s flagship fertility and mortality
monitoring system since 1971. Field supervisors interviewed heads of household who provided
proxy information for other family members. Relevant key variables included sociodemographic
information and some health information.
Table 3.1 Description of the 5 databases, study periods, sample sizes and study respondents
No. | Database | Year(s) | Description | Sample Size | Respondent
1 | SFMS (Special Fertility and Mortality Survey) | 1998 | Fertility and mortality survey of a nationally representative sample | 3.7 M ind. aged 15+ from 1.1 M households (6671 sample units) | Head of household
2 | NFHS-2 (National Family Health Survey-2) | 1998-99 | Demographic and Health Survey in 26 states | 334,486 ind. aged 15+ from 91,196 households (168,517 ♂ & 165,969 ♀) | Household respondent
3 | MDS (Million Death Study)-Phase I | 2001-03 | Mortality surveillance system (using ICD-10) in all states | 123,905 ind. aged 15+ from 1.1 M households (82,383 ♂ & 66,498 ♀) | VA respondent
4 | SRS (Sample Registration System) | 2004 | Demographic surveillance system covering 6.3 M persons in all states | 4.5 M ind. aged 15+ from 1.3 M households (7597 sample units) | Head of household
5 | NFHS-3 (National Family Health Survey-3) | 2005-06 | Demographic and Health Survey in 29 states | 198,754 ind. from 109,041 households (74,369 ♂ 15-54 yrs & 124,385 ♀ 15-49 yrs) | Self
Ind. = individuals; M = million; VA = verbal autopsy
3.3.2 National Family Health Surveys (NFHS)
These were cross-sectional demography and health surveys of a representative sample of
households in India conducted to provide estimates of indicators of population, health, and nutrition
by sociodemographic characteristics at the national and state levels. NFHS-1 survey was conducted
in 1992-1993 (but not included in the present study), NFHS-2 in 1998-1999 and NFHS-3 in 2005-2006.
There were differences in study design and study population between NFHS-2 and NFHS-3
that are detailed below.
NFHS-2 (1998–1999) was a nationally representative cross-sectional study of 92,447
households [91]. Trained data-collectors interviewed an adult member in each selected household to
obtain socio-demographic and health information about the household and its family members,
obtaining a household response rate of 98%. From these households, the data-collectors interviewed
90,303 ever-married women aged 15–49 in face-to-face interviews obtaining an individual response
rate of 96%. These women were located in 3204 primary sampling units in 26 of the 32 states. In
rural areas, these primary sampling units were villages or village clusters, and in urban areas these were
census enumeration blocks which were contiguous areas created to be as demographically
homogenous as possible.
NFHS-3 interviewed 124,385 women aged 15-49 and 74,369 men aged 15-54 to obtain
information on population, health, and nutrition in India covering 29 of the 35 states [93]. Complex
multi-stage sampling procedures were undertaken: two-stage (village/primary sampling unit and
household) procedure in rural areas and three-stage (urban ward/census enumeration
block/household) procedure in urban areas. Study design also involved stratification by geographic
(district, area size, etc) and sociodemographic characteristics (percent of males in non-agricultural
sector, percent of population belonging to scheduled castes/tribes, and female literacy). In addition,
the study sampling frame over-sampled urban residents to represent larger metropolises and slums
and also over-sampled female participants from within households. Household response rate was
98% and interview response rate was 92%.
3.3.3 Sample Registration System (SRS)
This survey was the baseline study of the 2004-2014 sampling frame conducted within the
SRS, a large, routine demographic survey serving as the primary system for collection of fertility
and mortality data since 1971. The latest SRS sample frame covered about 7.6 million people
(including 4.5 million adults aged ≥15 yrs) in all 28 states (except rural Nagaland) and seven union
territories of India [92]. A total of 7,597 sample units (4,433 rural and 3,164 urban) were selected
from the 2001 census. SRS sample units were randomly selected to be representative of the
population at the state level. The sample design was a uni-stage, stratified, simple random sample
without replacement. The sample size was based on infant mortality rates. Within the SRS, selected
households were continuously monitored for vital events by two independent surveyors. The heads
of households were identified to obtain proxy information regarding all family members.
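The sample design described above (single-stage, stratified, simple random sampling without replacement) can be illustrated with a minimal sketch. It is not the SRS selection procedure itself: the stratum labels, unit identifiers and sample sizes are hypothetical, and the actual allocation rule (based on infant mortality rates) is not reproduced.

```python
import random


def stratified_sample(units_by_stratum, n_by_stratum, seed=None):
    """Draw a simple random sample without replacement within each stratum.

    units_by_stratum: dict mapping stratum label -> list of sample-unit ids
    n_by_stratum:     dict mapping stratum label -> number of units to draw
    """
    rng = random.Random(seed)
    selected = {}
    for stratum, units in units_by_stratum.items():
        n = n_by_stratum[stratum]
        if n > len(units):
            raise ValueError(f"stratum {stratum!r} has fewer than {n} units")
        # random.Random.sample draws without replacement
        selected[stratum] = rng.sample(units, n)
    return selected


if __name__ == "__main__":
    # Hypothetical frame: 100 rural and 50 urban units in one state
    frame = {
        "rural": [f"R{i:03d}" for i in range(100)],
        "urban": [f"U{i:03d}" for i in range(50)],
    }
    picked = stratified_sample(frame, {"rural": 10, "urban": 5}, seed=1)
    print(len(picked["rural"]), len(picked["urban"]))
```

Sampling separately within each stratum guarantees that both rural and urban units are represented in every draw, which a single unstratified draw would not.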
3.3.4 Million Death Study (MDS)
This is the world’s largest prospective study that is being conducted (1998-2014) to provide
quantification and epidemiological evidence of cause-of-death in India. The overall study is being
conducted within two sampling frames of the Sample Registration System (SRS) framework:
sampling frame 1 (1998-2003) with 6.3 million individuals and sampling frame 2 (2004-2014) with
7.6 million individuals, yielding a total sample size of 14 million individuals in 2.4 million
households. There is continuous (on-going) collection of data and monitoring of vital status. This
yielded about 300,000 deaths for the period 1998-2003; another 700,000 deaths are estimated from
the second period 2004-2014 to make a total of one million deaths [27]. Deaths (numbering
123,905) that occurred during the period 2001-2003 and that were studied in depth using the verbal
autopsy (VA) method were included in this thesis.
3.3.4.1 Verbal autopsy method
A validated Verbal Autopsy (VA) instrument was used to record and validate the
cause-of-death [27]. The VA instrument followed a hybrid (open/closed) format. In the ‘closed’
section, there was provision for collection of socioeconomic and demographic characteristics of the
respondent and the deceased. There were also “filter” questions used to screen for the presence/
absence of specific symptoms. If the filter question was positive, then subsequent questions on
severity, duration, or other characteristics of these symptoms were asked of the respondent. In
addition, there was a symptom list to aid in the drafting of a written narrative, the ‘open’ section.
This written narrative detailed the following information: associated symptoms in chronological
order; duration; onset of illness; type of treatment received (if any); details on hospitalization;
history of past episodes; and abstracted information relating to the terminal illness from available
investigation slips, discharge summaries or death certificate. Each interview lasted about 30-45
minutes. Forms were available in English/ Hindi, with the narrative written in the vernacular
language (such as Tamil, Punjabi, Gujarati, etc).
3.3.4.2 Random re-sampling of field interviews
To ensure high quality fieldwork, a specialist re-sample team (RST) reporting directly to the
study investigators re-interviewed 5-10% of randomly chosen households. A review of about
3,500 field reports from several states found a high correlation between the random audit team
and the RGI supervisors in the overall distribution of causes of death [27].
3.3.4.3 Ascertainment of cause of death by trained physicians
Previous validation results for adult deaths have suggested that the central adjudication by a
trained panel of physician coders yielded consistently higher sensitivity for most mortality
outcomes than an algorithm-based approach [27]. Before assigning cause of death, a panel of about
120 physician coders were trained to carefully screen all documentation provided, noting all of the
positive/negative evidence, and use clinical judgment in assigning the underlying cause of death. In
order to reduce inter-observer variation, two coders independently examined each report and
determined a probable underlying cause of death with ICD-10 codes. They then provided an
underlying cause of death in words (e.g., “myocardial infarction”), the corresponding ICD-10 code
(e.g., I21) and the key words used to support their decision. If two physicians did not agree on an
underlying cause of death, a web-based system assigned to each physician their own original report
and the ICD-10 code of the other physician (without revealing their identity). The physician coders
were then required to use the additional information (ICD-10 and key words) provided
anonymously by the other physician to reach an agreement on the underlying cause of death. An
expert panel of senior physician coders assigned a final cause of death where two physicians did not
agree on a cause of death after one reconciliation attempt. Physicians were drawn from across India
to ensure valid cross-region comparisons.
For my analysis, I focussed on the subset of ICD-10 codes comprising cardiovascular
deaths [including ischaemic heart disease (I20-25, I44, I46, I70), hypertensive heart disease (I10-15),
heart failure (I50), cerebrovascular disease (I60-70, G45, G81-83) and sudden deaths (R55, R96)].
The G-codes (transient ischaemic attacks or plegias) and the R-codes (sudden deaths or syncope &
collapse) were included because they were assumed to have been assigned by the physician coders in
place of the underlying cause of death. This assumption was based on my review of a subset of the
written narratives for deaths occurring in middle-aged adults, from which I concluded that these
were probably cardiovascular deaths based on the clinical information available in the narratives.
The process flow of VA methods within the MDS is depicted in Figure 3.2.
Figure 3.2 Process flow of the Million Death Study
[Flow diagram. A. Field activities: continuous recording of births & deaths; collection of circumstances of death and complete narrative; 5%-10% re-sampling of deaths. Field staff shown: part-time enumerators, RGI surveyors, Resample Team surveyors. B. Cause of death assignment: 2 physician coders assign cause of death using ICD-10; reconciliation & adjudication.]
For this analysis, I used a subset of this study from the years 2001-2003 (called Phase I)
covering a total of 123,905 deaths, of which about 22,000 were cardiovascular deaths.
The demographic and geographic variables, cardiovascular risk factors, and fatal
outcomes drawn from these five databases are shown in Table 3.2.
Table 3.2 Variables used from the 5 surveys (4 risk factor surveys & 1 mortality outcome survey)
Variable   Values   Survey: SFMS   NFHS-2   MDS   SRS   NFHS-3
Time
Year   1998, 1999, 2001-06   1998   1998-99   2001-03   2004   2005-06
Demographic
Age (≥15)   Years
Sex   Male, Female
Education   Illiterate, up to Grade 5, Grade 6 to 10, Grade 10 & above
Geographic
Residence   Urban, Rural
State   29 States, 6 union territories
CVD determinants
Smoking   Yes, No   .
Age of initiation   Years   . . . .
Type of tobacco   Beedi, Cigarette, Other   . . . .
Body mass   10.0-49.0 kg/m2   . . . .
Diet   Lacto-vegetarian – Yes, No   . . .
Fruit intake   Daily, Weekly, Rarely/never   . . . .
Self-reported diabetes   Yes, No   . . . .
CVD outcome
CVD-specific deaths   ICD-10 codes (I10-15, I20-25, I44, I46, I50, I60-70, G45, G81-83, R55, R96)   . . . .
3.4 Database Management
3.4.1 Abstraction of relevant variables from the 5 databases
Variables of interest were based on risk factors that account for the bulk of the disease burden in
adults and that are amenable to surveillance, prevention and control as documented in global health
documents such as the World Health Report [94], Global Burden of Disease [23], the WHO-STEPS
[95] and the Commission on Social Determinants of Health [96].
1. Sociodemographic variables – age, sex, education, place of residence (urban/rural), state
2. CVD determinants – smoking, body mass, diet (lacto-vegetarianism, fruit intake), self-reported
diabetes
For mortality outcome, age-standardized cardiovascular death rates were obtained from the
on-going Million Death Study (age-standardization explained below).
The relevant variables were then identified in the various study questionnaires for
compilation of data dictionaries and for abstraction from the 5 databases.
3.4.2 Compilation of data dictionaries
Data dictionaries containing descriptions of data and the data fields were compiled for all
five databases. Each data dictionary comprised the following items at the minimum:
column (variable) ID
variable description
variable type
variable category
variable length
variable length for decimal places
minimum value of the variable
maximum value of the variable
other values of the variable (such as unknown, missing, etc.)
3.4.3 Data quality assessment
Ascertainment of quality of data was done by assessing coverage of surveys and by studying
internal validity of survey information.
1) Internal quality control measures in surveys were reviewed to assess coverage: (a) It was
noted that 10% post-enumeration checks were routinely carried out in Census surveys in
India [97]; (b) it has been reported that completeness of reporting of vital events (births
and deaths) averaged around 85% in the Sample Registration System in India [98].
2) A literature review of accuracy of self-reporting was undertaken to assess internal validity:
(a) regarding smoking – past evidence indicates that proxy-reported smoking status was an
accurate and effective means of monitoring population-wide smoking prevalence of adults
[99]. Self-reporting was however found to under-estimate current smoking when
compared to metabolic markers such as carboxyhaemoglobin or cotinine measurements
[100]. Agreement between self-reported and proxy-reported smoking status was found to
be dependent on ethnicity; Cohen’s kappa was 0.82 for Asian Americans and found to be
intermediate between that for Whites/Blacks (0.91) and that for Hispanics (0.76). It was
also dependent on age, with lower agreement for younger ages [101]; (b) regarding
self-reported morbidities – it was noted that self-reported morbidity was related to self-rated
health across the social gradient [102] and was thus acceptable for large-scale epidemiological
surveys.
Further, for geographical analysis, the quality of data from secondary sources was assessed for
geographic coverage and completeness of interview information [103].
3.4.4 Exploratory data analysis
Exploratory data analysis was undertaken initially to detect errors such as breaks in the sequence
of serial numbers, duplication of data, range errors and inconsistencies. Data were cleaned after
errors were detected; for example, implausible values and outliers were omitted from the
final analysis.
3.5 Data analysis
3.5.1 Conceptual framework
The conceptual framework for the data analysis is shown in Figure 3.3 [104].
Figure 3.3. Conceptual framework for the geographical epidemiological analysis of CVD in India
[Diagram boxes: Database management; Standardization; Feature data (geographic areal data); Attribute data (disease data); Statistical analysis; Geographical analysis; Validation]
3.5.2 Standardization
3.5.2.1 Risk factor estimates
All the risk factor surveys were age-standardized to the Census 2001 population. Direct
standardization was used for three (SFMS, NFHS-2 and NFHS-3) of the four risk factor surveys
which had individual-level data and indirect standardization was used for the SRS survey for which
only group-level information was available [105].
3.5.2.2 Mortality rates
Outcome estimates require knowledge of the numbers of people at risk. In India, such data for
the entire country come from two sources: the decennial Census
(the latest being the 2001 Census) and the Sample Registration System (SRS). Both have their own
set of problems. Firstly, because the census is decennial, population estimates are not
routinely available for the years between consecutive censuses and need to be computed.
This is done by a ‘roll-forward’ geometric progression method that starts from the base year’s
estimates and takes into account current births, deaths and net migration for the country. Secondly,
deaths are under-counted by up to 20% within the Sample Registration System [98].
The approach to the latter issue has been to recalculate the death rates based on the higher absolute
number of deaths for the country as per the UN/WHO estimates.
The Million Death Study 2001-03 has two limitations. Firstly, it estimates only the
numerator, that is, the absolute number of cardiovascular deaths. Secondly, it has no appropriate
denominator since it covers an inter-censal time period.
Therefore, computation of cardiovascular mortality rates required two procedures for
standardization to the year 2006. Firstly, the proportion of deaths due to cardiovascular causes
from the MDS was applied to the death rates (average for the years 2005 and 2006) from the SRS,
corrected upward (by about 15%) using the UN mortality figures for the country, to arrive at
estimates of absolute deaths for the year 2006. Secondly, UN population estimates for 2006 were
used in conjunction with the proportion of the population in the various states from the Indian
Census to arrive at state totals for the year 2006. This
enabled computation of age-standardized cardiovascular death rates for various states for the year
2006 as the outcome measure.
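As a rough illustration of the two procedures described above, the following Python sketch (Python is used for illustration only; the thesis analyses were run in SAS) derives a crude cardiovascular death rate for a single state. Every input figure is a hypothetical placeholder rather than an MDS, SRS or UN value, the function names are invented for this example, and the upward correction is assumed to act as a simple multiplier on all-cause deaths. Age-standardization is left out to keep the arithmetic visible.

```python
def estimate_cvd_deaths(population, all_cause_rate_per_1000, cvd_fraction,
                        correction=1.15):
    """Estimate absolute cardiovascular deaths for one state and year.

    population              -- state population (hypothetical UN/Census-based total)
    all_cause_rate_per_1000 -- all-cause death rate per 1,000 (SRS-style)
    cvd_fraction            -- proportion of deaths that are cardiovascular (MDS-style)
    correction              -- assumed multiplier for under-counted deaths (about 15%)
    """
    if not 0.0 <= cvd_fraction <= 1.0:
        raise ValueError("cvd_fraction must lie between 0 and 1")
    all_cause_deaths = population * all_cause_rate_per_1000 / 1000.0
    return all_cause_deaths * correction * cvd_fraction


def rate_per_100k(deaths, population):
    """Convert a death count and a population into a rate per 100,000."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100_000.0


# Hypothetical state: 10 million people, 8 deaths per 1,000, 25% of deaths cardiovascular.
example_deaths = estimate_cvd_deaths(10_000_000, 8.0, 0.25)
example_rate = rate_per_100k(example_deaths, 10_000_000)
print(round(example_deaths), round(example_rate, 1))
```

Keeping the correction factor as an explicit argument makes it easy to see how sensitive the final rate is to the assumed level of under-counting.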
3.5.3 Statistical analysis
3.5.3.1 Univariate analysis
Univariate analysis of tobacco smoking, dietary behaviours, obesity and diabetes was done
to calculate prevalence proportions, means and rates per 100,000 persons respectively. Following
this step, age-standardized estimates [105] were computed using the following formula:
\[ p_r = \sum_a w_a \, \frac{d^{+}_{ar}}{d_{ar}} \tag{1} \]
where
$p_r$ = age-adjusted disease (or risk factor) prevalence in region r
$d^{+}_{ar}$ = number with disease (or risk factor) in age-group a in region r
$d_{ar}$ = number in the denominator for age-group a in region r
$w_a$ = a numeric weight for the age-group a
The value $w_a$ is derived from the 2001 Indian Census population (reference population) as:
\[ w_a = \frac{N_a}{N_t} \tag{2} \]
where $N_a$ is the total number of persons in age stratum a in the reference population and $N_t$
is the total number of persons in the reference population. 99% confidence intervals for the
age-adjusted prevalence estimates were calculated using the following formulae:
\[ \mathrm{Variance}_r = \sum_a w_a^{2} \, \frac{p_{ar} \, q_{ar}}{d_{ar}} \tag{3} \]
\[ SE_r = \sqrt{\mathrm{Variance}_r} \tag{4} \]
\[ 99\%\ \text{CI for } p_r = p_r \pm 2.576 \, SE_r \tag{5} \]
where
$\mathrm{Variance}_r$ = variance for the age-adjusted disease prevalence in region r
$p_{ar}$ = age-group specific prevalence in age-group a in region r
$q_{ar} = 1 - p_{ar}$
$d_{ar}$ = number in the denominator for age-group a in region r
$SE_r$ = standard error for the age-adjusted disease prevalence in region r
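The calculation in equations (1) to (5) can be traced with a short Python sketch (illustrative only; the thesis used SAS and MS-Excel). The age groups, counts and reference population below are invented for the example, and the function name is hypothetical.

```python
import math


def age_standardized_prevalence(cases, totals, ref_pop, z=2.576):
    """Direct age standardization with a normal-approximation interval.

    cases   -- number with the risk factor in each age group of the region
    totals  -- number surveyed (denominator) in each age group of the region
    ref_pop -- reference population counts for the same age groups
    z       -- normal quantile; 2.576 gives a 99% confidence interval
    """
    if not (len(cases) == len(totals) == len(ref_pop)):
        raise ValueError("all inputs need one entry per age group")
    n_ref = sum(ref_pop)
    weights = [n / n_ref for n in ref_pop]              # equation (2)
    rates = [c / t for c, t in zip(cases, totals)]      # age-specific prevalence
    p = sum(w * r for w, r in zip(weights, rates))      # equation (1)
    variance = sum(w * w * r * (1.0 - r) / t            # equation (3)
                   for w, r, t in zip(weights, rates, totals))
    se = math.sqrt(variance)                            # equation (4)
    return p, (p - z * se, p + z * se)                  # equation (5)


# Hypothetical region with three age groups.
cases = [50, 120, 90]
totals = [1000, 800, 300]
ref_pop = [5000, 3000, 2000]
p_adj, (ci_low, ci_high) = age_standardized_prevalence(cases, totals, ref_pop)
print(f"{p_adj:.3f} ({ci_low:.3f} to {ci_high:.3f})")
```

Because the weights come only from the reference population, two regions with different age structures can be compared on the same footing.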
Subsequently, odds ratios with 99% confidence intervals were calculated as measures of
association. The Pearson (product-moment) correlation coefficient was computed to compare
smoking prevalence proportions across the various surveys.
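For readers unfamiliar with these two measures, a minimal Python sketch follows (illustrative only, not the SAS code used for the thesis). The text does not state how the confidence interval for the odds ratio was obtained; the sketch assumes the common Woolf (log-scale) method. The 2x2 counts and the state-level prevalences are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=2.576):
    """Odds ratio for a 2x2 table with a Woolf (log-scale) confidence interval.

    a, b -- exposed with / without the outcome
    c, d -- unexposed with / without the outcome
    z    -- normal quantile; 2.576 gives a 99% interval
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 2x2 table and hypothetical state-level smoking prevalences (%).
print(odds_ratio_ci(40, 60, 20, 80))
survey_a = [22.0, 31.5, 27.0, 40.2, 18.4]
survey_b = [24.1, 33.0, 25.5, 43.0, 20.0]
print(round(pearson_r(survey_a, survey_b), 3))
```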
Results were reported using tables and graphical displays such as line graphs, bar graphs,
pie charts and box plots wherever possible.
3.5.3.2 Multivariate analysis
Multivariate analysis was carried out by multiple linear regression [106] and Poisson
regression [107]. I regressed cardiovascular death rates for males and females in each state (as
outcome variable) on the following set of eight predictor variables:
Percent urbanization – from the 2001 Census
Smoking prevalence %, lacto-vegetarianism prevalence %, regular fruit intake prevalence %,
overweight prevalence % and diabetes prevalence % – from the NFHS-3 survey.
For multivariate linear regression [106], the data were first plotted, prior to modeling, to
assess whether the predictors had linear relationships with the outcome. As a second
step, associations between the study variables were examined by looking at the correlations
between them. The appropriate model was then built using these variables.
I then tested the assumptions of linear regression. Firstly, I used the SPEC option of the
MODEL statement in PROC REG to check for heteroscedasticity (error terms without identical
distributions) and dependence of the error terms. Because the null hypothesis of the SPEC test is
the condition one hopes to retain, a non-significant p-value indicates no evidence that the error
variances are unequal or that the error terms are
dependent. Secondly, the Durbin-Watson (D-W) statistic was obtained by using the DW option in
PROC REG. This tested for first-order correlation of the error terms. The D-W statistic ranges from
0 to 4.0. Generally a value close to 2.0 indicates that the error terms are uncorrelated, while a low
value (<1.6) indicates positive correlation and a large value indicates negative correlation. Lastly,
I examined the residuals of the model in two steps: PROC REG was used to output the residuals,
which were then tested for normality with PROC UNIVARIATE.
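PROC REG DATA=DEATHS_RISKS;
/* DW = Durbin-Watson statistic; SPEC = specification (heteroscedasticity) test */
MODEL DEATHS = X Y Z / DW SPEC;
/* save the residuals as RES in the dataset RESIDS */
OUTPUT OUT=RESIDS R=RES;
RUN;
/* normality tests and plots for the saved residuals */
PROC UNIVARIATE DATA=RESIDS NORMAL PLOT;
VAR RES;
RUN;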
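To make the Durbin-Watson statistic concrete, the following Python sketch (illustrative only; the thesis obtained the statistic from the DW option in SAS) computes it from a list of residuals. The residual values are hypothetical, and the statistic depends on the order in which observations are listed, which for state-level data is an arbitrary choice.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# Hypothetical residuals: a smooth drift gives a low value (positive correlation),
# while alternating signs give a high value (negative correlation).
drifting = [2.0, 1.0, -1.0, -2.0]
alternating = [1.0, -1.0, 1.0, -1.0]
print(durbin_watson(drifting), durbin_watson(alternating))
```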
I also looked for multicollinearity (whether the predictor variables were correlated with one
another) by obtaining the Variance Inflation Factor (VIF) with the VIF option of the MODEL statement
in PROC REG. A VIF above a cut-off of 10 was taken to indicate that the model was unstable. Lastly,
I used the R option to generate Cook’s D statistic to look for
outliers that could exert a large influence on the overall outcome.
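The variance inflation factor can be illustrated with a small, dependency-free Python sketch (not the SAS procedure used in the thesis). It relies on the standard identity that the VIF of each predictor is the corresponding diagonal element of the inverse of the predictors' correlation matrix. The predictor columns and all function names are hypothetical.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length predictor columns."""
    n = len(columns[0])
    unit = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        unit.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in unit] for ci in unit]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """VIF for each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical state-level predictors (% smoking, % overweight, % urban).
smoking = [30.0, 25.0, 41.0, 22.0, 35.0, 28.0]
overweight = [10.0, 14.0, 8.0, 18.0, 9.0, 12.0]
urban = [25.0, 33.0, 20.0, 45.0, 22.0, 30.0]
for name, vif in zip(["smoking", "overweight", "urban"],
                     variance_inflation_factors([smoking, overweight, urban])):
    print(name, round(vif, 2), "above cut-off of 10" if vif > 10 else "below cut-off")
```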
Subsequently I re-tested my model using Poisson regression [107] with PROC
GENMOD and DIST=POISSON. To correct for overdispersion, I used the PSCALE (Pearson)
option to obtain corrected chi-square statistics.
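The idea behind a Pearson-based overdispersion correction can be sketched in Python as follows (an illustration under stated assumptions, not a reproduction of PROC GENMOD). The sketch assumes the usual approach in which the dispersion is estimated as the Pearson chi-square divided by the residual degrees of freedom and standard errors are inflated by its square root. The observed counts, fitted means and standard errors are hypothetical.

```python
import math


def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    df = len(observed) - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / df


def rescale_standard_errors(std_errors, dispersion):
    """Inflate model standard errors by the square root of the dispersion."""
    scale = math.sqrt(dispersion)
    return [se * scale for se in std_errors]


# Hypothetical death counts and Poisson-fitted means from a 2-parameter model.
observed = [12, 30, 7, 45, 20]
fitted = [15.0, 25.0, 10.0, 40.0, 22.0]
phi = pearson_dispersion(observed, fitted, n_params=2)
print(round(phi, 3), rescale_standard_errors([0.10, 0.20], phi))
```

A dispersion close to 1 suggests the Poisson variance assumption is reasonable; values well above 1 indicate overdispersion and wider corrected intervals.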
All statistical analyses and visualization were done using SAS 9.1 and MS-Excel 2003.
3.5.4 Geographical analysis
The two types of geographical analysis undertaken were visualization and exploration
[104]. For visualization, thematic mapping with choropleth maps (regional statistical maps) was
undertaken. Choropleth maps are vector maps depicting how a measurement is distributed across a
geographic area. They use estimates of unadjusted or standardized disease rates. Choropleth maps
may be simple or conditioned choropleth (CC) maps. Conditioned choropleth maps are special
choropleth maps showing the distribution of a dependent variable along two dimensions or conditions
[108].
For exploration, the variations across states were studied for spatial heterogeneity [109].
Spatial dependence of the study variables was assessed by spatial autocorrelation, a measure of
similarity between neighbouring areas based on the values of a variable and a matrix identifying
each region’s neighbours. Spatial autocorrelation was explored by means of both
global and local Moran’s statistics. Global testing was done with a Moran scatterplot, in which
the value of a variable ‘x’ for each state is plotted against its spatial lag ‘w_x’, a weighted
average of the neighbouring states’ values. The slope of the regression line corresponds to Moran’s
I statistic. The value of this statistic ranges approximately from -1 to +1, where -1 denotes strong
negative autocorrelation; 0 denotes random distribution of values; and +1 denotes strong positive
autocorrelation. The local univariate Moran statistic was used for the LISA (local indicator of
spatial association) significance maps, which are maps of statistically significant differences
between the disease risk in one state and the overall risk in the neighbouring states. This enabled
classification of states into five categories: ‘clusters’ (high-high, meaning a state with a high
value that also has neighbours with high values; similarly low-low), ‘outliers’ (low-high, meaning a
low-value state surrounded by high-value states; and the opposite, high-low) and ‘not significant’
regions. Sensitivity analysis with up to 9999 permutations was performed in the generation of LISA
maps, with significance set at p<0.05. Spatial weights were constructed from the k nearest states
(with k=6).
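The global Moran statistic with k-nearest-neighbour weights can be sketched in plain Python as below. This is an illustration only: the thesis used GeoDa, and the sketch does not attempt to reproduce GeoDa's implementation or the local (LISA) statistics. The 4 x 4 grid of "centroids", the gradient values and the function names are hypothetical, and the pseudo p-value is a simple one-sided permutation estimate.

```python
import math
import random


def knn_weights(coords, k):
    """Row-standardised weights: each area's k nearest neighbours get weight 1/k."""
    n = len(coords)
    if not 0 < k < n:
        raise ValueError("k must be between 1 and n - 1")
    weights = []
    for i, (xi, yi) in enumerate(coords):
        by_distance = sorted((math.hypot(xi - xj, yi - yj), j)
                             for j, (xj, yj) in enumerate(coords) if j != i)
        row = [0.0] * n
        for _, j in by_distance[:k]:
            row[j] = 1.0 / k
        weights.append(row)
    return weights


def morans_i(values, weights):
    """Global Moran's I; with row-standardised weights this equals the slope of
    the spatial lag regressed on the centred variable (Moran scatterplot)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    lag = [sum(w * zj for w, zj in zip(row, z)) for row in weights]
    return sum(zi * li for zi, li in zip(z, lag)) / sum(zi * zi for zi in z)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value from reshuffling the values across areas."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            as_extreme += 1
    return (as_extreme + 1) / (n_perm + 1)


# Hypothetical 4 x 4 grid of area centroids with a west-to-east gradient.
grid = [(x, y) for x in range(4) for y in range(4)]
gradient = [float(x) for x, _ in grid]
w6 = knn_weights(grid, k=6)
observed_i = morans_i(gradient, w6)
p_value = permutation_p_value(gradient, w6)
print(round(observed_i, 3), p_value)
```

A smooth gradient such as this one produces a clearly positive statistic, which is the pattern the high-high and low-low clusters on a LISA map are meant to pick out locally.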
All maps were created using the 2001 Census state boundaries. LISA maps were restricted
to the NFHS-3 dataset because this was the dataset that had information on all risk factors within
one study. All mapping was done using good cartographic principles with particular attention to
recommended map elements, typography and colour schemes.
3.5.4.1 General cartographic principles
Geographic locations linked to other spatially referenced data, and aggregated into larger
geographical units as desired, are suitable for thematic mapping [110].
Mapping metrics
A key question in disease mapping exercises is what to map. Two types of metric for areal data
were used for mapping: standardized rates, and statistically significant differences
between the disease risk in one area and the overall risk in the contiguous or total area [111].
Choice of mapping regions
The state was the geographic unit of study in India for three reasons. Firstly, it was an
optimal choice of mapping region as a trade-off between making it large enough to have stable
outcome estimates, and small enough to cover regions that were homogeneous in nature [103]. If the
regions are too small, mapping may reveal spurious geographic patterns because of random
variations in the small numbers of events [112,113]. Secondly, the variables of interest for this
study were available from across multiple nation-wide surveys at the state level predominantly.
Lastly, it is the political administrative unit for macro-level health policy, planning and action and
this harmonized with my study objective to create a report that would be of use to stakeholders for
action in the field of cardiovascular disease control [46].
No. of classes for choropleth maps
Choosing the optimum number of classes for maps is an important part of map design. Too
few classes may fail to show the variation in the data. Increasing the number of classes
may yield a data-rich presentation by decreasing the amount of generalization, but it also has its
demerits [114,115]: (a) too many classes may overwhelm the map reader with information and
distract them from noting the general trend in the distribution; and (b) map legibility may
decrease, because more classes require more colours, which become increasingly
difficult to tell apart. I therefore chose 4 to 5 classes for most maps, based on
recommendations from the literature and after repeated testing.
Legend types [116]
I used two legend types in mapping: (1) Sequential schemes – for ordered data that progressed
from low to high with light colours for low range data and darker colours for the upper range data;
and (2) Diverging schemes – where the mid-range values had light colours and the extremes had
bright colours with contrasting hues.
All geographical analyses and presentation were done using ArcView 9.0 (ESRI, Redlands,
CA) and the public domain software GeoDa 0.9.5-i beta (Spatial Analysis Lab, University of
Illinois).
3.6 Ethics approval
Research ethics approval was obtained from St Michael’s Hospital (REB# 09-021C,
2/13/2009) and administrative approval was obtained from the University of Toronto (ORE# 23964,
3/30/2009) for the thesis. This thesis involved secondary analysis of data previously collected for each
of the five surveys. All are publicly available datasets (SFMS 1998, MDS 2001-03 & SRS
2004 from the Office of the Registrar General of India, New Delhi; NFHS-2 & NFHS-3 from the
International Institute for Population Sciences, Mumbai). SRS-2004 had only group-level data; the other
four datasets, with individual-level data, were anonymized and made available to the Centre for Global
Health Research, St Michael’s Hospital (SMH). The databases are securely housed on the SMH
server. I therefore had access to these datasets without any personal information, and it is almost
impossible to match observations in the study datasets to any personal identifiers.
4 RESULTS
In this section, I first look at the characteristics of the various surveys and provide the crude
prevalence estimates for the various risk factors. Subsequently, I present in detail the age-adjusted
results for each risk factor starting with smoking (because it takes away about 10 years of life)[49],
then overweight (which takes away about 3 years of life)[50], followed by diet (about which
evidence is less clear-cut in terms of life-years lost) and self-reported diabetes (about which
minimal information was available in my study dataset).
4.1 Survey characteristics and descriptive analysis of study population
The characteristics of the five surveys, such as survey timing, geographic extent covered,
age-group of interest, survey response rates, total number of individuals studied and their
demographic characteristics, are listed in Table 4.1.
4.1.1 Survey characteristics
The risk factor surveys were conducted over a 9-year period (1998-2006) and covered almost
the entire country in terms of the number of states covered. The total number of states and union
territories in India before the year 2000 was 32; in the year 2000, three large states were bifurcated
for administrative ease, increasing the total to 35. The surveys carried out by the Registrar General
of India (SFMS, SRS and MDS-Phase I) were intended to cover all geographic regions.
However, due to civil unrest, rural Nagaland was not covered in any of these three surveys; the state
of Jammu & Kashmir was in addition excluded from the SFMS survey in the year 1998 for a similar
reason. Both the National Family Health Surveys covered all existing states but left out the smaller
union territories. All five surveys nevertheless covered about 99% or more of the country’s population.
The risk factor surveys achieved high overall survey response rates (over 90%) and studied large
populations ranging from 0.2 million to about 4.5 million adults aged 15 years and over. The
mortality outcome survey, which covered a base population of 6.4 million individuals followed up
over a study period of 3 years, had a relatively lower response rate of 85%. This study yielded a total
of 123,905 deaths, of which 50,336 deaths were in the age-group 30 to 69 years.
Table 4.1: Descriptive analysis of baseline characteristics in the selected surveys
                                  Risk factor surveys                                 Mortality survey
Variables                         SFMS         NFHS-2       SRS          NFHS-3       MDS (Ph I)
Survey characteristics
  Survey year(s)                  1998         1998-99      2004         2005-06      2001-03
  No. of states surveyed          30.5/32      26/32        34.5/35      29/35        34.5/35
    Large (>10 M pop.)            16/17        17/17        19/19        19/19        19/19
    Small (<10 M pop.)            8.5/9        9/9          9.5/10       10/10        9.5/10
    Union territories             6/6          --           6/6          --           6/6
  Country population covered      99%          >99%         >99%         >99%         >99%
  Survey response rate            n/a          93%          n/a          92%          88%
  Age-group analyzed              ≥ 15 yrs     ≥ 15 yrs     ≥ 15 yrs     ♂: 15-54 yrs 30-69 yrs
                                                                         ♀: 15-49 yrs
  Nos. studied                    3,870,872    334,486      4.5 M        198,754      50,336
Demographic characteristics
  Sex
    Male                          51.0%        50.4%        n/a          37.4%        60.1%
    Female                        49.0%        49.6%        n/a          62.6%        39.9%
    missing                       0.0%         0.0%         n/a          0.0%         0.09%
  Age (yrs), overall
    Mean/median                   35.8/33.0    35.6/32.0    n/a          29.8/29.0
  Age-groups (in young & middle age: 15-69 yrs)
    15-29                         44.7%        46.0%        n/a          52.3%        --
    30-44                         31.3%        29.6%        n/a          37.2%        21.8%
    45-59                         17.0%        16.7%        n/a          10.5%        36.0%
    60-69                         7.0%         7.7%         n/a          n/a          42.2%
  Education
    Illiterate                    38.4%        35.5%        n/a          25.4%        51.5%
    Up to Grade 5                 27.8%        17.4%        n/a          14.7%        20.1%
    Up to Grade 10                26.4%        31.9%        n/a          37.0%        15.5%
    Grade 11 & over               7.4%         15.2%        n/a          22.9%        5.0%
    missing                       0.5%         0.05%        n/a          0.02%        7.9%
  Residence
    Urban                         23.5%        33.4%        n/a          47.9%        17%
    Rural                         76.5%        66.6%        n/a          52.1%        83%
    missing                       0.0%         0.0%         n/a          0.0%         0.0%
MDS (Ph I) = MDS phase I; M = million; States surveyed, 0.5 = urban only; n/a = Data not available; -- = age-group not included in this study. All percentages are calculated excluding missing observations.
4.1.2 Demographic characteristics
The SRS 2004 survey had no individual-level data on demographic characteristics. In the
other three risk factor surveys with individual-level information, data were almost complete, with less
than 1% of values missing for the four demographic variables. In the MDS, missing values were low at
about 1% or less, except for the education field, for which about 8% of the deceased had missing
observations.
Sex
The SFMS and NFHS-2 had a slight preponderance of males over females while NFHS-3 had
nearly two-thirds of participants as females. The MDS had about 60% male subjects.
Age
The overall mean (and median) ages in the SFMS and NFHS-2 surveys were similar;
the NFHS-3 survey mean was about 6 years lower.
By age-group of interest, that is among young and middle-aged adults (15 to 69 years), the
SFMS and NFHS-2 had similar age distributions with about 45% of study population being young
adults (in the 15 to 29 years age-group) and about 55% being middle-aged adults. This was seen to
be inverted in NFHS-3 with about 52% in the 15 to 29 years age-group and about 47% being
middle-aged adults. In the MDS, the age-group of 30 to 69 years was the study-group of interest.
Education
The proportions that were illiterate in SFMS, NFHS-2 and NFHS-3 were 38%, 36% and
25% respectively. In MDS, this was about 52% because of higher death rates among illiterates.
Residence
In the SFMS, about a quarter of participants were from an urban area with the remainder
residing in a rural area. The proportions living in an urban area in NFHS-2 and NFHS-3 were
higher at 33% and 48% respectively while the proportion was lower at 17% in the MDS.
4.1.3 Crude prevalence of selected CVD determinants in the four surveys
The crude prevalence of selected CVD determinants (smoking, fruit consumption,
vegetarianism, overweight and diabetes) along with male-female differences and urban-rural
variations are shown in Table 4.2.
Table 4.2 Crude prevalence of CVD determinants in selected surveys, India
Variables                               SFMS         NFHS-2       SRS          NFHS-3
Survey year(s)                          1998         1998-99      2004         2005-06
Age-group studied                       ≥ 15 yrs     ≥ 15 yrs     ≥ 15 yrs     ♂: 15-54 yrs; ♀: 15-49 yrs
Current smoking prevalence %
  ≥ 15 years
    Male                                27.3%        30.4%        26.1%        33.2%
    Female                              1.6%         3.1%         2.4%         2.2%
  ≥ 15 years
    Urban male                          21.5%        23.6%        20.6%        29.8%
    Rural male                          28.8%        33.8%        28.3%        36.8%
  ≥ 30 years
    Urban male                          31.8%        33.4%        28.9%        36.5%
    Rural male                          42.9%        46.1%        39.0%        46.2%
Overweight %
    Male                                n/a          n/a          n/a          11.8%
    Female                              n/a          n/a          n/a          15.1%
    Urban                               n/a          n/a          n/a          20.9%
    Rural                               n/a          n/a          n/a          7.8%
Vegetarianism %*
    Male                                n/a          n/a          24.1%        27.9%
    Female                              n/a          n/a          26.0%        36.8%
    Urban                               n/a          n/a          n/a          32.2%
    Rural                               n/a          n/a          n/a          34.5%
Fruit consumption % (at least weekly)
    Male                                n/a          n/a          n/a          55.5%
    Female                              n/a          n/a          n/a          47.7%
    Urban                               n/a          n/a          n/a          62.6%
    Rural                               n/a          n/a          n/a          39.5%
Diabetes %
    Male                                n/a          n/a          n/a          2.7%
    Female                              n/a          n/a          n/a          1.9%
    Urban                               n/a          n/a          n/a          3.2%
    Rural                               n/a          n/a          n/a          1.5%
* = Vegetarianism in SRS survey and lacto-vegetarianism in NFHS-3 survey; n/a = not available
Smoking
The crude prevalence of current smoking among all males aged 15 years and over was found
to be between 26.1% and 33.2% in the four surveys; this was about 10-fold or more higher than the
prevalence among females, which was between 1.6% and 3.1%. Among males in this age-group,
prevalence of smoking was around 21 to 30% in urban males and 28 to 37% in rural males. In the
age-group of 30 years and over, prevalence was higher at 29 to 37% among urban males and
39 to 46% among rural males. The rural:urban ratio of smoking prevalence was between about 1.2 and
1.4 across all surveys in the age-groups considered.
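The rural:urban comparison can be checked directly from the male smoking figures (ages 15 years and over) reported in Table 4.2. The short Python sketch below (illustrative only; the variable and function names are invented for this example) recomputes the ratio for each survey.

```python
# Crude male smoking prevalence (%) by residence, ages >= 15 years, from Table 4.2.
urban_male = {"SFMS": 21.5, "NFHS-2": 23.6, "SRS": 20.6, "NFHS-3": 29.8}
rural_male = {"SFMS": 28.8, "NFHS-2": 33.8, "SRS": 28.3, "NFHS-3": 36.8}


def prevalence_ratio(numerator, denominator):
    """Ratio of two prevalence percentages, rounded to two decimals."""
    if denominator <= 0:
        raise ValueError("denominator prevalence must be positive")
    return round(numerator / denominator, 2)


rural_urban_ratio = {survey: prevalence_ratio(rural_male[survey], urban_male[survey])
                     for survey in urban_male}
for survey, ratio in rural_urban_ratio.items():
    print(survey, ratio)
```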
Body mass
Only NFHS-3 had data on body mass for both males and females. The proportions of males and
females who were overweight were 12% and 15% respectively. By residence, the urban:rural ratio
was nearly three-fold, with nearly 21% of urban residents
being overweight or obese as compared to about 8% of rural residents.
Vegetarianism
About 26% of females considered themselves vegetarians in the SRS survey while nearly
37% considered themselves lacto-vegetarians in the NFHS-3 survey; these proportions were higher
than the 24 to 28% among males in the SRS and NFHS-3 surveys. A slightly higher proportion of rural
residents (35%) than of urban residents (32%) reported being lacto-vegetarians in the
NFHS-3 survey. Data by residence were not available in the SRS survey.
Fruit consumption
Data on frequency of fruit consumption were available only from NFHS-3. About 56% of
males and 48% of females reported at least weekly fruit consumption. This male:female ratio of 1.2
was less marked than the urban:rural ratio of 1.6, with 63% of urban residents as compared to
40% of rural residents reporting fruit consumption at least once a week.
Self-reported Diabetes
Self-reported diabetes prevalence among those aged 30 years and over was 2.7% among
males and 1.9% among females. While the sex ratio was 1.5:1.0, the urban: rural ratio was 2.1:1.0
since diabetes prevalence was 3.2% among urban residents and 1.5% among their rural
counterparts.
4.2 Smoking
In this section, I start with the results of smoking prevalence in both males and females
across the different surveys. I then present details of smoking among all males (aged
15 years and over) since smoking among females was less common. Finally, I focus on middle-aged
males (ages 30-69 years) since peak smoking was seen in this age-group and because cardiovascular
mortality was studied in this age-group.
4.2.1 Smoking prevalence among males and females
The crude prevalence of current tobacco smoking among all males and females aged 15
years and over is graphed in Figure 4.1. Male smoking prevalence ranged between 26 and 30% and
female smoking prevalence ranged between 1.6 and 3.1% in the first three surveys, where a family
member (usually the male head of household) reported the smoking history on behalf of all family
members. In the NFHS-3, prevalence was calculated from self-reporting by both males and
females and was found to be 33% and 2.2% respectively. Across all surveys, however, female
smoking was very low in the country and male smoking prevalence was about 10-fold or more that of
females.
Figure 4.1 Crude prevalence of current tobacco smoking among males and females, aged 15 years and over, from selected surveys in India
[Bar chart. x-axis: Survey (year); y-axis: Prevalence %. Male: SFMS [1998] 27.3, NFHS-2 [1998-99] 30.4, SRS [2004] 26.1, NFHS-3 [2005-06] 33.2. Female: 1.6, 3.1, 2.4 and 2.2 respectively.]
4.2.2 Smoking among all males in SFMS 1998
This subsection takes a detailed look at male smoking behaviour in SFMS 1998 survey since
it was a large survey and contained relatively more details about smoking.
4.2.2.1 Smoking prevalence
Figure 4.2 panel (a) shows, on two y-axes, the proportions smoking within each 5-year age
group and the cumulative prevalence of smoking among males. Among young adults aged 15-24
years, less than 10% were smoking. The proportion then rose through the 25-29 and 30-34 year
age-groups to reach a peak of about 45% in the late 40s and 50s, after which the proportion
smoking within each age group declined. The cumulative prevalence increased from a low of about
1.4% in the youngest age group to about 28% overall.
Panel (b) represents the same findings from a different perspective. It shows, on two y-axes,
the cumulative prevalence and the change in cumulative prevalence for each 5-year age-group as
compared to the previous 5-year age-group.
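The two quantities plotted in panels (a) and (b) can be illustrated with a short calculation. This is only a minimal sketch: it assumes that cumulative prevalence at a given age-group is the number of smokers divided by the number of respondents, both pooled over that age-group and all younger ones, which is not spelled out above. The function names and the counts are hypothetical and are not SFMS data.

```python
def cumulative_prevalence(smokers, respondents):
    """Cumulative prevalence (%) per age-group, pooling each group with all
    younger groups (assumed definition, see lead-in)."""
    cumulative = []
    smokers_so_far = 0
    respondents_so_far = 0
    for s, n in zip(smokers, respondents):
        smokers_so_far += s
        respondents_so_far += n
        cumulative.append(100.0 * smokers_so_far / respondents_so_far)
    return cumulative


def percent_change(series):
    """Percent change of each value relative to the preceding (younger) group."""
    return [100.0 * (curr - prev) / prev for prev, curr in zip(series, series[1:])]


# Hypothetical counts for three consecutive age-groups (illustration only).
smokers = [10, 90, 200]
respondents = [1000, 1000, 1000]

cum = cumulative_prevalence(smokers, respondents)
change = percent_change(cum)
print(cum, change)
```

By construction the percent-change series has one value fewer than the cumulative series, since the youngest age-group has no younger group to compare against.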
Figure 4.2(a) Age-specific prevalence (bars) and cumulative prevalence (line) of current smoking among all males, ages ≥ 15 years, SFMS 1998
[Bar and line chart; x-axis: Age-group (yrs); left y-axis: Age-specific prevalence; right y-axis: Cumulative prevalence]
Age-group (yrs)    Age-specific prevalence    Cumulative prevalence
15-19              1%                         1%
20-24              9%                         5%
25-29              21%                        10%
30-34              33%                        15%
35-39              33%                        18%
40-44              40%                        21%
45-49              44%                        23%
50-54              45%                        25%
55-59              46%                        26%
60-64              44%                        27%
65-69              42%                        27%
70-74              38%                        28%
Figure 4.2(b) Cumulative prevalence (line) and percent increase in smoking prevalence above younger age-group (bars) of current smoking among all males, ages ≥ 15 years, SFMS 1998
[Bar and line chart; x-axis: Age-group; left y-axis: % increase in smoking prevalence above younger age-group; right y-axis: Cumulative prevalence %]
Age-group    Cumulative prevalence (%)    % change in cumulative prevalence
15-19        1
20-24        5                            249%
25-29        10                           94%
30-34        15                           49%
35-39        18                           22%
40-44        21                           16%
45-49        23                           11%
50-54        25                           7%
55-59        26                           5%
60-64        27                           3%
65-69        27                           2%
70-74        28                           0%
4.2.2.2 Types of tobacco smoked by level of education
Figure 4.3 depicts the proportion of males smoking different types of tobacco by level of
education. Overall smoking prevalence tended to decrease with increasing educational attainment.
The picture differed by type of tobacco smoked. Beedi smoking prevalence decreased with
increasing levels of education, from a high of 29.2% among illiterates to a low of 4.9% among
those with post-secondary education (p<0.001). Cigarette smoking, on the other hand, increased
with increasing levels of education, from a low of 3.0% among illiterates to a high of 13.2%
among those with post-secondary or graduate education (p<0.001).
Figure 4.3 Types of tobacco smoked by level of education among males, ages 15 years and over, SFMS 1998
[Bar chart; x-axis: Education level; y-axis: Prevalence %; series: cigarette, beedi, all smokers]
Education level    Cigarette    Beedi    All smokers
Illiterate         3.0          29.2     38.0
Upto Grade 5       4.5          18.3     25.5
Grades 6-8         5.7          13.2     20.7
Grades 9-10        7.1          9.4      18.0
Grades 11-12       8.7          7.4      17.9
Post-secondary     13.2         4.9      19.2
4.2.2.3 Age at initiation of smoking
The mean and median ages at initiation of smoking among males in India were 21.0 and 20.0
years respectively. Figure 4.4 shows box plots of age at initiation of smoking for cigarette
and beedi smokers, along with the univariate statistics. Mean age at initiation
of smoking was lower for beedis, at 20 years, than for cigarettes, at 22 years. Of smokers, 90%
had started the habit by 25 years of age for beedi smoking and by 28 years of age for cigarette
smoking.
Figure 4.4 Mean age at initiation of smoking by type of tobacco used among male smokers, SFMS 1998
[Box plots of age at initiation of smoking by type of smoker: Cigarette, Beedi]
                    Type of smoker
Variable            All smokers       Cigarette         Beedi
Total (N)           534,697           102,196           368,841
No. (%) analyzed    528,873 (99%)     100,875 (99%)     365,279 (99%)
Mean                20.6              22.0              20.2
S.D.                4.5               4.5               4.5
S.E.                0.01              0.01              0.01
Variance            21.0              20.1              19.8
IQR                 5.0               6.0               4.0
95%                 30.0              30.0              30.0
Q3                  23.0              25.0              22.0
Median              20.0              20.0              20.0
Q1                  18.0              19.0              18.0
5%                  15.0              15.0              16.0
Figure 4.5 shows the mean age of initiation of smoking by level of education for males
smoking different types of tobacco. Those who were illiterate started smoking about 2.0 years
earlier than those with post-secondary or graduate education. Beedi smokers who were illiterate
started smoking at 20.1 years of age, about 1.5 years before cigarette smokers with similar
education. Cigarette smokers with graduate education started smoking at 23.2 years of age, 2.0
years later than beedi smokers with similar education.
Figure 4.5 Mean age of initiation of smoking among males for different types of tobacco by level of education, SFMS 1998
[Line chart; x-axis: Education; y-axis: Age (yrs); series: Cigarette, All smokers, Beedi]
Number of smokers by education and tobacco type
Tobacco type    Illiterate    Upto Grade 5    Grades 6-8    Grades 9-10    Grades 11-12    Post-secondary
Cigarette       16,731        28,810          23,744        14,508         9,308           10,608
Beedi           160,810       122,233         54,628        19,106         7,884           3,905
All*            209,465       167,648         85,872        36,713         19,178          15,403
* All = cigarette, beedi & others
4.2.3 Smoking among middle-aged adults
This section deals with middle-aged adults (ages 30-69 years) with a special emphasis on males.
4.2.3.1 Smoking prevalence
Given that the mean age of initiation of smoking was late in the SFMS-1998 survey with
most men taking up smoking only during the later phase of young adult life, the proportions
smoking tobacco were assessed using three broad age-categories of adulthood – young adults (15-
29 years), middle-aged adults (30-69 years) and older adults (70 years and over). This is illustrated
in table 4.3. While overall 27% of adult males were smokers, in the younger age-group this was less
than 10%; but in middle-age over 40% of males were smokers. Rural residents (43.4%) were more
likely to be current smokers than their urban counterparts (32.4%) in middle-age also.
Table 4.3: Smoking among young, middle-aged and older adults by sex and residence, 1998
                 Smokers in region
Age-group        Rural No. (%)       Urban No. (%)     Total No. (%)
Males
  15-29 years    657,003 (10.7)      197,796 (7.3)     854,799 (9.9)
  30-69 years    764,472 (43.4)      256,996 (32.4)    1,051,468 (40.7)
  ≥ 70 years     52,957 (35.1)       15,450 (21.9)     68,407 (32.1)
  Total          1,504,432 (28.8)    470,242 (21.5)    1,974,674 (27.1)
Females
  15-29 years    624,192 (0.8)       188,965 (0.5)     813,157 (0.7)
  30-69 years    777,188 (2.4)       233,117 (1.5)     1,010,305 (2.2)
  ≥ 70 years     55,507 (4.0)        17,229 (3.1)      72,736 (3.8)
  Total          1,456,887 (1.8)     439,311 (1.2)     1,896,198 (1.6)
4.2.3.2 Geographical analysis of tobacco use by place of residence
Type of tobacco used by middle-aged male smokers according to the place of residence is
shown in Figure 4.6. Overall, about 70% of smokers were beedi users and only one in five were
cigarette smokers. Cigarette smoking was more common in urban areas than in rural areas (43% vs
14% respectively).
Figure 4.6 Type of tobacco smoked by place of residence among middle-aged male smokers, 1998
[Pie charts by place of residence]
Residence    Beedi    Cigarette    Other
Total        70%      19%          11%
Rural        73%      14%          13%
Urban        51%      43%          6%
4.2.3.3 Types of tobacco use in different states
The proportion of different types of tobacco smoked in the various states is shown in figure
4.7. Overall, there was a 6-fold variation in current smoking between states; this varied for beedi
smoking (which had a 5-fold variation) and for cigarette smoking (which had a 15-fold variation).
Beedi was the most common form of tobacco smoked in the country; overall, it was nearly four
times more commonly used than the cigarette. There were however wide variations in
beedi:cigarette use ratio between different states ranging from a low of 1:1 (in Delhi) and 1.2:1.0 (in
Kerala and northeastern states) to a high of 31:1 (in Gujarat). Other forms of tobacco smoking such
as cheroot and chutta were seen only in some districts of a few northern states (Bihar, Jharkhand,
Uttar Pradesh, Uttaranchal and Haryana) and in some regions of a northeastern state (Mizoram) and
a southern state (Andhra Pradesh).
Figure 4.7 Types of tobacco smoked and proportion of beedi smokers among middle-aged (30-69 years) males in states of India, 1998
[Horizontal stacked bar chart; x-axis: Prevalence % (0% to 70%); series: Beedi, Cigarette, Other; side column: Proportion beedi smokers (%). Bars shown for: Punjab & Chandigarh; Maharashtra, Goa, Daman & Diu; Delhi; Orissa; Tamil Nadu & Puducherry; Kerala & Lakshadweep; Gujarat (incl. DNH); Karnataka; Bihar & Jharkhand; INDIA; Madhya Pradesh & Chhatisgarh; Andhra Pradesh; Uttar Pradesh & Uttaranchal; Northeastern states; Rajasthan; Himachal Pradesh; Assam; Haryana; West Bengal & AN Islands]
Northeastern states include: Meghalaya, Nagaland, Manipur, Mizoram, Sikkim, Arunachal Pradesh, Tripura
4.2.3.4 Geographic mapping of smoking variation by residence
The prevalence of smoking among males aged 30-69 years residing in urban and rural
locations in SFMS is shown in figure 4.8. Smoking in rural areas was more common than in
urban areas. Further, smoking in rural regions was more prevalent in north-western states
(Rajasthan, Haryana, Himachal Pradesh, Uttarakhand and Uttar Pradesh), the eastern/north-eastern
states (West Bengal, Assam, Meghalaya, Manipur, Mizoram and Tripura) and the
southern state of Andhra Pradesh. North-eastern states had high prevalence of smoking in urban
areas as well. Smoking was least prevalent in urban Maharashtra and urban and rural Punjab.
Figure 4.8 Prevalence of smoking among males in different states of India, 1998
4.2.3.5 Smoking variation across states by type of tobacco smoked
Figure 4.9 shows the geographical distribution of types of tobacco smoked in different
states. This is shown using pie-charts of the proportions of adult male smokers who used beedis,
cigarettes or other forms of tobacco (like cheroot, chutta, etc.). Cigarette smoking was relatively
more common in the north-eastern and southern states and less common in northern, western and
central parts of India.
Figure 4.9 Proportion of different types of tobacco smoked in different states, Special Fertility and Mortality Survey (SFMS) 1998
4.2.3.6 Variations across time
Figure 4.10 shows the time trends in smoking among rural males in the stable age-group
of 45-59 years across four different surveys. Smoking prevalence was mapped in four categories:
low (< 20%), medium (20 to 33%), high (34 to 50%) and very high (>50%). From the maps, it appears
that there was a rise in smoking prevalence between SFMS (1998) and NFHS-2 (1998-99),
followed by a dip in SRS (2004) and then again an increase in NFHS-3 (2005-06). The states of
Punjab and Maharashtra consistently had low smoking prevalence (consistent with the
findings in figures 4.7 and 4.8). The temporal variations for urban residents aged 45 to 59 years
and for adults in the age-group of 30 to 44 years also followed a similar pattern (not shown).
Figure 4.10 Maps of smoking prevalence among rural males, ages 45-59 years, across the four selected surveys
4.2.3.7 Correlation between states across time
Correlation between state-level smoking prevalence among 45-59 year old adult males
across various surveys was very high. Table 4.4 displays Pearson correlation statistics for pairs
of analysis variables across different surveys.
Table 4.4 Pearson correlation coefficients comparing state-level smoking prevalence across different surveys for males, ages 45-59 years
Surveys    SFMS    NFHS-2          SRS             NFHS-3
SFMS       1.00    0.87* (n=28)    0.83* (n=34)    0.77* (n=28)
NFHS-2             1.00            0.92* (n=29)    0.93* (n=29)
SRS                                1.00            0.88* (n=29)
NFHS-3                                             1.00
* p < 0.001; n = number of comparison states
An example of a scatterplot comparing the values of different states in two different
surveys (SFMS 1998 and NFHS-2) is shown in Figure 4.11. The positive correlation between the
values for the states was consistent across the four surveys in a similar manner (not shown).
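The pairwise comparison of state-level prevalence between two surveys can be sketched as follows. This is an illustrative calculation of the Pearson correlation coefficient only; the function name and the prevalence values are hypothetical and are not taken from the study datasets.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical state-level smoking prevalence (%) for the same four states
# in two surveys (illustration only).
survey_a = [20.0, 35.0, 50.0, 28.0]
survey_b = [24.0, 33.0, 55.0, 30.0]
print(round(pearson_r(survey_a, survey_b), 2))
```

Only states present in both surveys can enter such a comparison, which is why the number of comparison states differs between survey pairs.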
Figure 4.11 Correlation between smoking prevalence in states between SFMS and NFHS-2
[Scatterplot; axes: SFMS 1998 and NFHS-2 1998-99; labelled states: Mizoram, Assam, Punjab, Tamil Nadu]
Though the overall correlation coefficients were high, there were a few
exceptions: some states had low values in some surveys and high values in others (e.g.
Assam, Tamil Nadu).
4.2.3.8 Geographic mapping of quitting smoking
From the SRS 2004 survey, which had a question identifying ex-smokers, it was possible to
calculate the ratio of ex:current smokers among males aged 45-59 years. The national average
was 4.8. There were differences between states: most states had ratios lower than 4.0 and a few
states had ratios greater than 8.0. The former were generally states with high prevalence of
smoking. The latter included some states with low prevalence of smoking (e.g. Maharashtra in
central India) and also some states with high prevalence of smoking (e.g. Kerala in the south).
This is seen in the accompanying figure 4.12.
Figure 4.12 Ratio of ex:current smokers among males, ages 45-59 years, Sample Registration System (SRS) 2004
4.2.4 Spatial heterogeneity
Spatial heterogeneity in smoking prevalence between states was tested by looking for
spatial autocorrelation. This is depicted in figure 4.13. The scatterplot of smoking prevalence
plotted against its ‘spatial lag’ (i.e., the weighted average of neighbouring values) gave a
global Moran’s I statistic of 0.314 for males (shown below) and 0.02 for
females (not shown), revealing minimal clustering of smoking behaviour in the NFHS-3 survey.
The accompanying LISA map for males revealed ‘high-high’ clustering in the
northeastern states, ‘low-low’ clustering in the western states of Gujarat and Maharashtra and a
‘high-low’ outlier in the state of Kerala in the south. No such ‘high-high’ clustering was noted
for females.
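The relationship between a variable, its spatial lag and global Moran's I can be illustrated with a small calculation. This is a minimal sketch under stated assumptions: neighbours are defined by a simple adjacency list and weights are row-standardised (each neighbour of a region gets equal weight), which may differ from the weights specification used for the analysis above. The function name, region labels and values are hypothetical.

```python
def morans_i(values, neighbours):
    """Global Moran's I with row-standardised weights.

    values:     dict mapping region -> observed value
    neighbours: dict mapping region -> list of adjacent regions (non-empty)
    """
    n = len(values)
    mean = sum(values.values()) / n
    # Deviation of each region from the overall mean.
    z = {region: v - mean for region, v in values.items()}
    # Spatial lag: average deviation among each region's neighbours.
    lag = {
        region: sum(z[j] for j in neighbours[region]) / len(neighbours[region])
        for region in values
    }
    # With row-standardised weights the sum of all weights equals n, so
    # I reduces to sum(z_i * lag_i) / sum(z_i ** 2).
    numerator = sum(z[r] * lag[r] for r in values)
    denominator = sum(d * d for d in z.values())
    return numerator / denominator


# Four hypothetical regions arranged in a line: A - B - C - D.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
clustered = {"A": 1.0, "B": 1.0, "C": 5.0, "D": 5.0}
alternating = {"A": 1.0, "B": 5.0, "C": 1.0, "D": 5.0}
print(morans_i(clustered, adjacency), morans_i(alternating, adjacency))
```

Positive values arise when similar values sit next to each other, negative values when neighbours are dissimilar, and values near zero when there is no spatial pattern. The sketch does not assess statistical significance, which is usually done by permutation.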
Figure 4.13 Scatterplot of global spatial autocorrelation (for males) and LISA maps showing local spatial clustering of smoking (for males and females), NFHS-3 [2005-06]
LISA map of Male Smoking, 15-54 yrs LISA map of Female Smoking, 15-49 yrs
4.3 Body mass
Increased body mass index (BMI ≥ 25 kg/m2), which is a major risk factor in terms of
life-years lost, is the second CVD risk factor described in this section. It is described in two ways
using data available in the NFHS-3 (2005-06) survey: in terms of the proportions who had increased
BMI and in terms of mean BMI.
4.3.1 Overweight/obesity
4.3.1.1 Prevalence of overweight/obesity
The prevalence of overweight/obesity (BMI ≥ 25 kg/m2) was 13.9% (11.8% among males
and 15.1% among females). The prevalence of overweight/obesity by residence, sex and
education is shown in table 4.5. The prevalence in urban and rural areas was 20.9% and 7.8%
respectively. Peak prevalence was seen among urban females (23.3%), almost 3-fold higher than
among rural females and almost 4-fold higher than among rural males. Overweight/obesity was also
more common among those with higher education: those who had completed grade 10 had a
prevalence of 21.5%, about 2.5 times that among those who were illiterate.
Table 4.5 Prevalence of overweight/obesity in NFHS-3 survey, 2005-06
Overweight/Obesity (BMI ≥ 25), NFHS-3 (2005-06)
Characteristic                       %        Crude O.R. (99% CI)
Total (n=187,886)                    13.9%
Sex
  Male (n=69,198)                    11.8%    0.75 (0.73-0.78)
  Female (n=118,727)                 15.1%    1.00
Residence
  Urban (n=95,160)                   20.9%    3.11 (3.03-3.20)
  Rural (n=103,594)                  7.8%     1.00
Residence & sex
  Urban  Male (n=34,646)             17.2%    0.69 (0.66-0.71)
         Female (n=53,171)           23.3%    1.00
  Rural  Male (n=34,552)             6.5%     0.74 (0.71-0.78)
         Female (n=65,556)           8.5%     1.00
Education
  Illiterate (n=48,204)              8.5%     0.34 (0.29-0.39)
  Upto Grade 5 (n=27,631)            11.3%    0.47 (0.43-0.52)
  Grade 6 – 10 (n=69,950)            14.1%    0.60 (0.56-0.64)
  Grade 11 & over (n=42,101)         21.5%    1.00
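The crude odds ratios in the table relate the odds of overweight/obesity in one group to the odds in the reference group. A minimal sketch of that arithmetic, using the rounded prevalences from table 4.5, is given below; the function name is illustrative, and small differences from the tabulated values are to be expected because the published odds ratios were presumably computed from unrounded counts.

```python
def crude_odds_ratio(p_group, p_reference):
    """Crude odds ratio from two prevalence proportions (each between 0 and 1)."""
    odds_group = p_group / (1.0 - p_group)
    odds_reference = p_reference / (1.0 - p_reference)
    return odds_group / odds_reference


# Males (11.8%) versus females (15.1%, reference): about 0.75.
print(round(crude_odds_ratio(0.118, 0.151), 2))
# Urban (20.9%) versus rural (7.8%, reference): about 3.12.
print(round(crude_odds_ratio(0.209, 0.078), 2))
```

The confidence intervals in the table cannot be recovered from the prevalences alone, since they also depend on the number of persons in each cell.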
Additional analysis of categories of overweight by residence and sex is shown in table
4.6. About 80% of those who had BMI ≥ 25 were in the category of pre-obesity (BMI=25.0-
29.9); the rest were in the category of W.H.O. Class I & II obesity (BMI=30.0-34.9 and
BMI=35.0-39.9 respectively), with <1% being morbidly obese (BMI ≥40.0). While prevalence of
overweight (pre-obesity) among rural males and females was 5.8% and 7.1% respectively, among urban
males and females it was much higher at 14.6% and 17.3% respectively. Class I & II obesity
was uncommon among males but more common among females, especially in urban areas where it
was 5.7%. Morbid obesity or class III obesity was less than 0.25% in all sub-groups.
Table 4.6 Prevalence proportions of overweight by residence and sex, NFHS-3, 2005-06
Variable                                       Urban     Rural
Total
  BMI ≥ 25                                     20.9%     7.8%
By sex
Males & females
  Overweight/pre-obesity (BMI=25.0-29.9)       16.2%     6.6%
  Obese Class I & II (BMI=30.0-39.9)           4.5%      1.1%
  Obese Class III (BMI ≥ 40)                   0.1%      0.03%
Male
  Overweight/pre-obesity (BMI=25.0-29.9)       14.6%     5.8%
  Obese Class I & II (BMI=30.0-39.9)           2.6%      0.7%
  Obese Class III (BMI ≥ 40)                   0.01%     0.02%
Female
  Overweight/pre-obesity (BMI=25.0-29.9)       17.3%     7.1%
  Obese Class I & II (BMI=30.0-39.9)           5.7%      1.4%
  Obese Class III (BMI ≥ 40)                   0.2%      0.04%
4.3.1.2 Distribution of BMI by sex and residence
About 7% of males and 4% of females had missing or implausible BMI values; these
observations were excluded and the remainder were retained for the analysis. The mean (and
99% confidence limits) of BMI for males was 20.8 (20.7-20.8) and marginally higher for females
equaling 21.0 (21.0-21.1) kg/m2. The boxplots of BMI distributions for males and females are
shown in figure 4.14 and by sex and residence are shown in figure 4.15 along with the statistical
estimates. While the mean BMI in rural areas was about 20 kg/m2 in both sexes, it was
significantly higher at 21.5 for urban males and 22.1 for urban females (p<0.001).
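The width of the 99% confidence limits around a mean BMI can be illustrated from the summary statistics alone. The sketch below assumes a normal approximation and simple random sampling, so it ignores any survey design effects that may have been accounted for in the reported limits; the function name is illustrative and the inputs are the rounded values for males in figure 4.14.

```python
import math


def mean_confidence_limits(mean, sd, n, z=2.576):
    """Lower and upper confidence limits of a mean (normal approximation).

    z = 2.576 corresponds to a two-sided 99% interval.
    """
    standard_error = sd / math.sqrt(n)
    return mean - z * standard_error, mean + z * standard_error


# Rounded summary statistics for males: mean 20.8, S.D. 3.5, n = 69,198.
lower, upper = mean_confidence_limits(20.8, 3.5, 69198)
print(round(lower, 2), round(upper, 2))
```

With samples of this size the standard error is about 0.01 kg/m2, which is why the confidence limits lie within about 0.1 kg/m2 of the mean.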
Figure 4.14 Boxplot showing distribution of body mass index by sex, 2005-06
[Box plots of BMI for males and females]
Statistic           Male                  Female
Total no.           74369                 124385
No. (%) analyzed    69198 (93%)           118727 (96%)
99%                 31.2                  33.6
95%                 27.4                  28.8
Q3                  22.8                  23.1
Median              20.2                  20.3
Q1                  18.3                  18.2
5%                  16.1                  16.0
1%                  14.8                  14.6
Mean (LCL-UCL)      20.8 (20.7-20.8)      21.0 (21.0-21.1)
S.D. (LCL-UCL)      3.5 (3.5-3.6)         4.0 (4.0-4.1)
S.E.                0.01                  0.01
Variance            12.4                  16.3
t-test = -15.9; p-value < 0.001
LCL=lower confidence limit; UCL=upper confidence limit
Figure 4.15 Boxplot showing distribution of body mass index by gender and residence, 2005-06
[Box plots of BMI by residence; panels: Males, Females]
                    Males                                   Females
Statistic           Urban               Rural               Urban               Rural
No. of persons      34646               34552               53171               65556
99%                 32.4                29.2                35.4                30.9
95%                 28.4                25.6                30.5                26.5
Q3                  23.8                21.6                24.7                21.8
Median              21.0                19.6                21.3                19.6
Q1                  18.7                18.0                18.8                17.8
5%                  16.2                16.0                16.2                15.8
1%                  14.8                14.7                14.7                14.5
Mean (LCL-UCL)      21.5 (21.4-21.5)    20.0 (20.0-20.1)    22.1 (22.1-22.1)    20.2 (20.1-20.2)
S.D. (LCL-UCL)      3.8 (3.8-3.9)       3.0 (3.0-3.0)       4.5 (4.5-4.5)       3.4 (3.3-3.4)
S.E.                0.02                0.02                0.02                0.01
Males: t-test = 55.7, p-value < 0.001
Females: t-test = 82.6, p-value < 0.001
LCL=lower confidence limit; UCL=upper confidence limit
4.3.1.3 Distribution of BMI by sex and age-group
The boxplots of BMI distributions by age-group are depicted in figure 4.16(a) for males and
figure 4.16(b) for females. The mean BMI for males in the age-groups 15-19 yrs, 20-29 yrs and 30-
49 yrs was 18.7, 20.5 and 21.6 respectively (p<0.001). It displayed a similar increase with age for
females also with values of 19.3, 20.5 and 22.1 in the three age-groups respectively (p<0.001).
Figure 4.16(a) Boxplots of distribution of Body Mass Index by gender and age (males), 2005-06
[Box plots of BMI by age-group, males]
Statistic       15-19 yrs    20-29 yrs    30-49 yrs
No. of males    12183        21783        30990
99%             27.3         29.8         32.2
95%             23.2         26.1         28.4
Q3              20.0         22.1         23.9
Median          18.4         20.0         21.1
Q1              17.0         18.3         18.9
5%              15.2         16.4         16.6
1%              14.0         15.2         15.2
Mean            18.7         20.5         21.6
S.D.            2.6          3.1          3.7
S.E.            0.02         0.02         0.02
Variance        6.8          9.3          13.8
IQR             3.1          3.8          5.0
F-value (Pr > F): 3469 (< 0.001)
* post-hoc comparison of means between all 3 groups are statistically significant
Figure 4.16(b) Boxplots of distribution of body mass index by gender and age (females), 2005-06
[Box plots of BMI by age-group, females]
Statistic         15-19 yrs    20-29 yrs    30-49 yrs
No. of females    22813        41362        54552
99%               27.8         31.5         35.4
95%               24.1         27.0         30.6
Q3                20.8         22.2         24.8
Median            19.0         19.9         21.4
Q1                17.5         18.1         18.8
5%                15.6         16.0         16.1
1%                14.4         14.7         14.7
Mean              19.3         20.5         22.1
S.D.              2.7          3.5          4.5
S.E.              0.02         0.02         0.02
Variance          7.3          12.0         20.6
IQR               3.3          4.1          6.1
F-value (Pr > F): 4825 (< 0.001)
* post-hoc comparison of means between all 3 groups are statistically significant
4.3.2 Geographic mapping of overweight prevalence by states
Conditioned choropleth maps of prevalence of overweight along two axes – sex and place of
residence – in the age-group of 30 to 49 year old adults in different states of the country are shown
in figure 4.17. The prevalence of overweight among males varied 13-fold, ranging from 2.9% in
rural Chhattisgarh to 38% in urban Punjab, and among females varied 22-fold, ranging from 2.5% in
rural Jharkhand to 55.9% in urban Punjab in this age-group. Overweight prevalence was high in
rural areas of only a few states, such as Punjab and Gujarat in the north and Kerala, Andhra Pradesh
and Tamil Nadu in the south. In the urban areas, however, overweight was more
widespread across many states, especially among females. Figures 4.18 and 4.19 depict proportions that
were overweight in the age-groups of 20-29 years and 15-19 years respectively. Prevalence of
overweight was much lower in these two age-groups. In the 20 to 29 years age-group, it ranged
from 1.2% in rural Meghalaya to 20.4% in urban Punjab among males and from 0.6% in rural
Jharkhand to 25.7% in urban Tamil Nadu among females. In the 15 to 19 years age-group, the range
was narrower still: 0% to 12.1% among males and 0% to 14.3% among females. Interestingly, overweight
prevalence in these younger age-groups was high in the same states that had high prevalence in the
older age-group (Punjab, Kerala and Tamil Nadu, to name a few), especially in urban areas.
Figure 4.17 Mapping of proportions of adults, ages 30-49 years, overweight by state, National Family Health Survey-3 (2005-06)
Figure 4.18 Mapping of proportions of adults, ages 20-29 years, overweight by state, National Family Health Survey-3 (2005-06)
Figure 4.19 Mapping of proportions of adults, ages 15-19 years, overweight by state, National Family Health Survey-3 (2005-06)
4.3.3 Spatial heterogeneity
Global Moran’s I for overweight/obesity was 0.14 among males and 0.32 among females.
This revealed that there was no significant spatial autocorrelation for males and moderate spatial
autocorrelation for females. LISA maps of proportions with increased BMI in the different states
(shown in figure 4.20) revealed ‘low-low’ clustering in the northeastern states among males and
females and ‘high-high’ clustering in the northern states among females.
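The cluster labels used in LISA maps ('high-high', 'low-low', 'high-low', 'low-high') come from comparing each state's deviation from the mean with the average deviation of its neighbours. The sketch below shows that classification together with the local Moran statistic, assuming row-standardised adjacency weights; it omits the permutation-based significance test that determines which states are actually shaded on a LISA map. The function name, region labels and values are hypothetical.

```python
def lisa_quadrants(values, neighbours):
    """Local Moran's I and quadrant label for each region.

    values:     dict mapping region -> observed value
    neighbours: dict mapping region -> list of adjacent regions (non-empty)
    Returns a dict mapping region -> (local_i, label).
    """
    n = len(values)
    mean = sum(values.values()) / n
    z = {region: v - mean for region, v in values.items()}
    # Second moment of the deviations, used to scale the local statistic.
    m2 = sum(d * d for d in z.values()) / n
    result = {}
    for region in values:
        nbrs = neighbours[region]
        lag = sum(z[j] for j in nbrs) / len(nbrs)  # row-standardised spatial lag
        local_i = z[region] * lag / m2
        if z[region] >= 0 and lag >= 0:
            label = "high-high"
        elif z[region] < 0 and lag < 0:
            label = "low-low"
        elif z[region] >= 0:
            label = "high-low"   # high value surrounded by low values (outlier)
        else:
            label = "low-high"   # low value surrounded by high values (outlier)
        result[region] = (local_i, label)
    return result


# Four hypothetical regions arranged in a line: A - B - C - D.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
prevalence = {"A": 1.0, "B": 2.0, "C": 8.0, "D": 9.0}
for region, (local_i, label) in lisa_quadrants(prevalence, adjacency).items():
    print(region, round(local_i, 2), label)
```

Under these weights the average of the local statistics equals the global Moran's I, so the local values can be read as each state's contribution to the global figure.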
Figure 4.20 LISA maps showing local spatial clustering of overweight/obesity among males and females, National Family Health Survey-3 (NFHS-3) [2005-06]
Males >30yrs Females >30yrs
Global Moran’s I = 0.14 Global Moran’s I = 0.32
4.4 Dietary factors & Self-reported Diabetes
In this section, I present some results available on other factors like diet (vegetarianism and
fruit intake) and self-reported diabetes.
4.4.1 Vegetarianism
4.4.1.1 Prevalence of vegetarianism
The prevalence of vegetarianism is detailed in table 4.7. About one in four males and females
reported being vegetarian in the SRS 2004 survey. In the NFHS-3 survey, about one-third of the
population (27.9% of males and 36.8% of females) defined themselves as lacto-vegetarians. By
urban/rural location, the lowest prevalence was among urban males (27.5%) and the highest
among rural females (37.9%). Prevalence was significantly higher among those who had completed
Grade 10 than in those with lower grades of education (χ² for trend=22.4; p<0.0001).
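A chi-square test for trend across ordered education categories can be sketched as follows. This is one common form of the test (the Cochran-Armitage statistic with one degree of freedom and equally spaced scores); whether this exact variant was used for the results above is an assumption, and the function name and counts are hypothetical.

```python
def chi_square_trend(cases, totals, scores=None):
    """Chi-square statistic for linear trend in proportions (1 degree of freedom).

    cases:  number with the characteristic in each ordered category
    totals: number of persons in each ordered category
    scores: category scores; defaults to 0, 1, 2, ... (equally spaced)
    """
    if scores is None:
        scores = list(range(len(cases)))
    grand_total = sum(totals)
    overall_p = sum(cases) / grand_total
    # Score-weighted sum of (observed - expected) cases.
    t = sum(s * (r - n * overall_p) for s, r, n in zip(scores, cases, totals))
    # Variance of t under the null hypothesis of no trend.
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    variance = overall_p * (1.0 - overall_p) * (sum_ns2 - sum_ns ** 2 / grand_total)
    return t * t / variance


# Hypothetical counts in three ordered education categories (illustration only).
print(chi_square_trend(cases=[10, 20, 30], totals=[100, 100, 100]))
```

The statistic is compared against a chi-square distribution with one degree of freedom; values above 3.84 correspond to p<0.05.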
Table 4.7 Vegetarianism among adults aged 15 years and over in India from selected surveys
                          Vegetarianism         Lacto-vegetarianism
                          SRS 2004              NFHS-3, 2005-06
                          (n=4.5 million)       (n=198,754)
Characteristic            %                     %         Crude O.R. (99% CI)
Total                     -                     33.4%
Sex
  Male                    24.1%                 27.9%     0.67 (0.65-0.68)
  Female                  26.0%                 36.8%     1.00
Residence
  Urban                   -                     32.2%     0.90 (0.88-0.92)
  Rural                   -                     34.5%     1.00
Residence & sex
  Urban  Male             -                     27.5%     0.69 (0.67-0.72)
         Female           -                     35.4%     1.00
  Rural  Male             -                     28.2%     0.65 (0.62-0.67)
         Female           -                     37.9%     1.00
Education
  Illiterate              -                     32.4%     0.73 (0.70-0.76)
  Upto Grade 5            -                     29.1%     0.62 (0.59-0.65)
  Grade 6 – 10            -                     31.9%     0.71 (0.67-0.75)
  Grade 11 & over         -                     40.7%     1.00
- = data not available; O.R. (99% C.I.) = odds ratio (99% confidence interval)
There was no association of vegetarianism with age (χ² for trend = -0.72; p<0.23) (not shown).
4.4.1.2 Geographic mapping of prevalence of vegetarianism by state
The prevalence of lacto-vegetarianism among adults aged 15 years and over was estimated
from the NFHS-3 survey. Figure 4.21 shows the conditioned choropleth map of lacto-vegetarianism
in the states along two dimensions, sex and residence. All four maps show an east-west gradient in
reported vegetarianism. In urban areas, the lowest prevalence among males was 2.7% in eastern
Arunachal Pradesh and the highest prevalence was 67.7% in western Rajasthan; among females the
lowest prevalence was 4.3% in eastern Nagaland and highest prevalence was 76% in the
northwestern states of Punjab, Gujarat and Rajasthan. In rural areas, even greater differences were
seen: among males, a 60-fold difference was noted with Nagaland having a prevalence of 1.2% and
Rajasthan having a prevalence of 83%; among females, a 50-fold difference was noted with the
prevalence ranging from 1.2% in Nagaland to 92.9% in Haryana.
4.4.1.3 Spatial heterogeneity
Global Moran’s I for spatial autocorrelation in prevalence of lacto-vegetarianism was 0.75
among males and 0.77 among females. This indicated significant spatial autocorrelation for males
and females. LISA maps of lacto-vegetarianism in the different states (shown in figure 4.22)
revealed statistically significant ‘high-high’ clustering for males and females in the northern and
western states and significant ‘low-low’ clustering in the northeastern states of India.
Figure 4.21 Prevalence of Lacto-vegetarianism in the states among adults in the National Family Health Survey-3 survey, 2005-06
Figure 4.22 LISA maps showing local spatial clustering of lacto-vegetarianism, National Family Health Survey (NFHS-3) [2005-06]
Males >15yrs (Global Moran’s I = 0.75); Females >15yrs (Global Moran’s I = 0.77)
4.4.2 Fruit intake
4.4.2.1 Prevalence of at least weekly fruit consumption
Prevalence of at least weekly fruit consumption is shown in table 4.8. It was reported to be
55.5% among males and 47.7% among females in the NFHS-3 survey. In urban areas it was 62.6%
and in rural areas it was 39.5%. Fruit intake increased from about 28% among illiterate to nearly
75% among those who had completed grade 10 (χ² for trend=146.8; p<0.0001).
Table 4.8 Reported fruit intake (at least weekly) in National Family Health Survey (NFHS-3) [2005-06]
Fruit intake (at least weekly), NFHS-3 2005-06
Characteristic                       %        Crude O.R. (99% CI)
Total (n=198,754)                    50.4%
Sex
  Male (n=74,369)                    55.5%    1.36 (1.33-1.39)
  Female (n=124,385)                 47.7%    1.00
Residence
  Urban (n=95,160)                   62.6%    2.58 (2.52-2.64)
  Rural (n=103,594)                  39.5%    1.00
Residence & sex
  Urban  Male (n=38,199)             64.5%    1.16 (1.12-1.20)
         Female (n=56,961)           36.6%    1.00
  Rural  Male (n=36,170)             45.3%    1.47 (1.42-1.52)
         Female (n=67,424)           36.0%    1.00
Education
  Illiterate (n=50,772)              27.7%    0.13 (0.09-0.17)
  Upto Grade 5 (n=28,923)            40.7%    0.24 (0.18-0.30)
  Grade 6 – 10 (n=73,587)            55.2%    0.43 (0.38-0.48)
  Grade 11 & over (n=44,742)         75.6%    1.00
4.4.2.2 Geographic mapping of fruit intake by state
Figure 4.23 shows the conditioned choropleth map of reported fruit intake by state along two
dimensions, sex and residence. In urban areas, the lowest prevalence among males was 34% in
Orissa and the highest prevalence was 87% in Karnataka; among females the lowest prevalence was
33% in Orissa and the highest prevalence was 82% in Karnataka. In rural areas, the lowest prevalence
among males was 12% in Orissa and the highest was 78% in Kerala; among females, a 10-fold
difference was noted with the prevalence ranging
from 8% in Orissa to 81% in Goa.
Figure 4.23 Reported fruit intake (at least weekly) in various states by sex and residence, National Family Health Survey (NFHS-3) [2005-06]
4.4.2.3 Spatial heterogeneity
Global Moran’s I for spatial autocorrelation in prevalence of regular fruit intake was 0.41
among males and 0.27 among females. This indicated that there was moderate spatial
autocorrelation for males and no significant spatial autocorrelation for females. LISA maps of
reported fruit intakes in the different states (shown in figure 4.24) revealed some statistically
significant ‘high-high’ clustering predominantly in the southern states for males and females and
‘low-low’ clustering in the north central states.
Figure 4.24 LISA maps showing local spatial clustering of fruit intake among males and females, National Family Health Survey (NFHS-3) [2005-06]
Males >15yrs
Females >15 yrs
4.4.3 Diabetes
4.4.3.1 Prevalence of diabetes
The prevalence of diabetes among those aged 30 years and over is shown in table 4.9. The
prevalence among males was 2.81% and that among females was 2.03%. The prevalence among
urban males and females (3.84% and 2.88% respectively) was higher than that among rural
males and females (1.77% and 1.28% respectively). Prevalence was directly associated with the
level of education (χ² for trend = 270; p<0.0001), probably indicating an awareness bias.
Table 4.9 Self-reported diabetes prevalence in National Family Health Survey (NFHS-3) [2005-06]
Self-reported diabetes, NFHS-3 (2005-06)
Characteristic                       %       O.R. (99% CI)
Total (n=93,213)                     2.3%
Sex
  Male (n=37,064)                    2.8%    1.40 (1.25-1.56)
  Female (n=56,149)                  2.0%    1.00
Residence
  Urban (n=45,064)                   3.3%    2.27 (2.02-2.56)
  Rural (n=48,149)                   1.5%    1.00
Residence & sex
  Urban  Male (n=18,743)             3.8%    1.35 (1.18-1.54)
         Female (n=26,321)           2.9%    1.00
  Rural  Male (n=18,321)             1.8%    1.38 (1.14-1.68)
         Female (n=29,828)           1.3%    1.00
Education
  Illiterate (n=31,287)              1.4%    0.38 (0.22-0.54)
  Upto Grade 5 (n=15,178)            2.1%    0.58 (0.30-0.76)
  Grade 6 – 10 (n=28,241)            2.8%    0.80 (0.68-0.92)
  Grade 11 & over (n=18,486)         3.5%    1.00
4.4.3.2 Geographic mapping of diabetes prevalence in states
The distribution of diabetes prevalence across different states is shown as a conditioned
choropleth map by sex and residence in Figure 4.25. Diabetes prevalence was found to be low
among rural residents and high among urban residents, in both males and females, especially in the
southern states.
Figure 4.25 Prevalence of self-reported diabetes in different states among adults aged 30 years & over by sex and residence, National Family Health Survey (NFHS-3) [2005-06]
4.4.3.3 Spatial heterogeneity
Global Moran’s I for spatial autocorrelation in prevalence of self-reported diabetes was 0.20
among males and 0.18 among females. This indicated that there was no significant spatial
autocorrelation among males or females. LISA maps of self-reported diabetes in the different states
(shown in figure 4.26) revealed statistically significant ‘high-high’ clustering predominantly in the
southern states for males and females and a ‘high-low’ outlier in the northern state of Punjab for
males.
Figure 4.26 LISA maps showing local spatial clustering of self-reported diabetes among males and females, National Family Health Survey (NFHS-3) [2005-06]
Males >30yrs (Global Moran’s I = 0.20); Females >30yrs (Global Moran’s I = 0.18)
4.5 Ecologic association between selected risk factors and CVD mortality
I had shown the importance of individual risk factors in cardiovascular mortality in the
introduction and literature review sections. Subsequently, the study datasets and analytic techniques
were outlined in the methodology section. Earlier in the results section, I described the distribution
and correlates of selected risk factors (smoking, body mass and to a lesser extent, diet and self-reported
diabetes). In this subsection, I proceed to explore the association between these selected
risk factors and cardiovascular mortality at the ecologic level.
4.5.1 Cardiovascular mortality
4.5.1.1 Geographic mapping of cardiovascular death rate
The mean cardiovascular death rate among males was 308 per 100,000 and the mean
cardiovascular death rate among females was 198 per 100,000. The rates among males ranged from
about 180 per 100,000 in Mizoram to over 400 per 100,000 in Tamil Nadu and Andhra Pradesh.
Among females, the rates varied from below 100 per 100,000 in Mizoram and Haryana to about 240
per 100,000 in Punjab and Andhra Pradesh. The mapping of these vascular death rates is shown in
figure 4.27.
Most southern states (such as Andhra Pradesh, Kerala and Tamil Nadu), and a few in the
east (West Bengal) and north (Punjab) had relatively higher vascular death rates among both males
and females.
Figure 4.27 Age-standardized vascular death rate per 100,000 males and females, ages 30-69 years, in states of India [Million Death Study, 2006]
Panels: Males; Females
4.5.2 Ranking of states
States are ranked by cardiovascular death rate (in descending order), alongside their ranks
for the selected risk factors, for males in table 4.10(a) and for females in table
4.10(b).
Table 4.10(a). Ranking of states by outcome (CVD death rate, highest to lowest) for middle-aged males, India, 2006
                    Outcome           Risk factors
State               CVD deaths-males  Smoking  Overweight  Vegetarianism  Fruit intake  Diabetes
Andhra Pradesh      1                 22       3           15             8             4
Tamil Nadu          2                 20       7           17             2             7
Punjab              3                 26       1           5              5             10
Kerala              4                 16       2           20             1             2
Goa                 5                 29       8           18             3             1
Assam               6                 8        25          24             19            23
West Bengal         7                 4        20          23             26            6
Karnataka           8                 23       12          11             4             14
Maharashtra         9                 27       17          12             13            11
Gujarat             10                25       10          3              16            22
Madhya Pradesh      11                14       23          6              17            27
Haryana             12                6        4           2              10            15
Manipur             13                10       18          25             6             12
Delhi               14                19       28          8              11            8
Chhattisgarh        15                9        11          13             25            13
Tripura             16                3        15          27             22            3
Nagaland            17                11       24          29             23            19
Jharkhand           18                28       26          19             28            17
Himachal Pradesh    19                17       6           4              7             24
Orissa              20                21       13          21             29            9
Arunachal Pradesh   21                18       22          28             21            26
Jammu & Kashmir     22                5        16          10             14            25
Uttaranchal         23                13       19          9              12            20
Bihar               24                24       27          14             24            18
Rajasthan           25                7        14          1              27            29
Uttar Pradesh       26                12       9           7              18            21
Meghalaya           27                2        29          22             9             16
Sikkim              28                15       5           16             15            5
Mizoram             29                1        21          26             20            28
For males, the five states with the highest (or lowest) CVD death rates tended to have
correspondingly high (or low) prevalence of overweight and diabetes and low (or high) levels of
vegetarianism; on the other hand, they also had correspondingly high (or low) levels of fruit intake
and low (or high) levels of smoking. For females, states with high (or low) CVD
death rates had somewhat correspondingly high (or low) levels of overweight and diabetes prevalence;
however, smoking and dietary factors were not clearly linked.
Table 4.10(b). Ranking of states by outcome (CVD death rate, highest to lowest) for middle-aged females, India, 2006
                    Outcome             Risk factors
State               CVD deaths-females  Smoking  Overweight  Vegetarianism  Fruit intake  Diabetes
Nagaland            1                   26       27          29             16            20
Andhra Pradesh      2                   24       4           15             9             6
West Bengal         3                   18       8           24             23            4
Punjab              4                   25       1           2              21            11
Goa                 5                   23       14          16             1             3
Arunachal Pradesh   6                   9        25          27             20            28
Maharashtra         7                   19       22          12             8             18
Orissa              8                   15       18          19             29            23
Tamil Nadu          9                   29       5           18             6             1
Assam               10                  21       15          26             22            27
Jharkhand           11                  17       26          20             27            12
Jammu & Kashmir     12                  6        7           10             12            25
Madhya Pradesh      13                  16       19          6              17            17
Gujarat             14                  22       10          5              13            10
Delhi               15                  13       16          7              3             5
Manipur             16                  7        17          21             4             19
Tripura             17                  3        20          22             18            7
Karnataka           18                  27       13          11             2             24
Bihar               19                  5        24          14             19            8
Kerala              20                  28       2           23             5             2
Chhattisgarh        21                  2        21          13             24            22
Uttaranchal         22                  12       12          9              14            15
Uttar Pradesh       23                  11       11          8              26            26
Himachal Pradesh    24                  20       3           4              10            13
Rajasthan           25                  8        23          3              28            29
Meghalaya           26                  14       29          28             7             14
Sikkim              27                  4        9           17             11            9
Haryana             28                  10       6           1              25            16
Mizoram             29                  1        28          25             15            21
Pearson correlation coefficients between male and female ranks for the 29 states that had
information on death rates and risk factors are shown in table 4.11.
Table 4.11 Correlations between male vs female ranks for 29 states
Variable         Correlation (male rank vs female rank)
Smoking          0.65*
Overweight       0.87*
Vegetarianism    0.99*
Fruits           0.80*
Diabetes         0.76*
CVD death        0.44**
* p < 0.001; ** p < 0.01
Correlations between male and female ranks in the states were high for the selected risk
factors (ranging from 0.65 for smoking to 0.99 for vegetarianism). The correlation between state
ranks for cardiovascular death rate was moderate at 0.44.
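As an illustration of how a correlation between two sets of ranks is obtained, the sketch below applies the Pearson formula to rank values (which, for ranks without ties, is equivalent to Spearman's rank correlation). The five pairs of ranks are hypothetical and are not taken from tables 4.10(a) and 4.10(b).

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ranks of five states on the same variable, by sex
male_ranks = [1, 2, 3, 4, 5]
female_ranks = [2, 1, 4, 3, 5]

rank_correlation = pearson(male_ranks, female_ranks)
print(round(rank_correlation, 2))
```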
4.5.3 Univariate regression analysis
The ecologic association between selected risk factors (for adults aged 15 years and over)
and the outcome variable of cardiovascular death rate (for adults aged 30 to 69 years from the
Million Death Study) was analyzed across 29 states. Cardiovascular death rates for males and
females were regressed on the following set of six predictor variables:
Percent urbanization (from Census 2001)
Smoking prevalence, lacto-vegetarianism prevalence, regular fruit intake prevalence,
overweight prevalence and diabetes prevalence (from the NFHS-3 survey)
The univariate analyses for males and females are shown in Table 4.12.
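The univariate step described above amounts to fitting, for each predictor in turn, a straight line of state-level death rate on state-level prevalence. A minimal sketch of that calculation is given below; the five data points are hypothetical, and the function name is introduced here purely for illustration.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1.0 - ss_resid / ss_total
    return slope, intercept, r_squared


# Hypothetical state-level data: overweight prevalence (%) and
# cardiovascular death rate per 100,000
overweight_pct = [5.0, 10.0, 15.0, 20.0, 25.0]
death_rate = [250.0, 300.0, 310.0, 360.0, 380.0]

slope, intercept, r_squared = simple_linear_regression(overweight_pct, death_rate)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R2={r_squared:.2f}")
```

The slope is read as the change in death rate per 100,000 for each one-percentage-point increase in the predictor, and R2 as the share of between-state variance explained by that predictor alone.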
Table 4.12 Correlation between selected risk factors and CVD mortality by sex, in 29 states of India
Outcome: cardiovascular death rate per 100,000 (ages 30-69 yrs)
                          Males                            Females
Predictors                Regression coefficient    R2     Regression coefficient    R2
                          (p-value)                        (p-value)
Census 2001
  % Urbanization          0.79 (0.23)               0.05   -0.11 (0.84)              0.01
NFHS-3 survey 2005-06
  Smoking                 0.12 (0.89)               0.00   -6.50 (0.01)*             0.20
  Overweight              5.81 (0.03)*              0.17   3.49 (0.02)*              0.21
  Vegetarianism           -0.68 (0.14)              0.08   0.01 (0.98)               0.01
  Regular fruit intake    1.13 (0.09)               0.11   0.08 (0.89)               0.00
  Self-reported diabetes  17.38 (0.01)*             0.29   17.80 (0.03)*             0.17
* statistically significant
Among males, a higher vascular death rate was positively correlated with the
prevalence of overweight and diabetes. The percent of variance explained was lowest for
smoking (0%) and highest for self-reported diabetes (29%). Among females, states with high
vascular death rates had significantly lower female smoking prevalence and significantly higher prevalence of
overweight and diabetes. The percent of variance explained was lowest for fruit intake (0%)
and highest for overweight (21%).
4.5.4 Multivariate Regression
Multivariate Linear Regression
Prior to modeling, the data were examined by way of plotting to conclude that the data had
linear relationships between risk factors and CVD death rates for males (figure 4.28a) and females
(figure 4.28b). As a second step, associations between various parameters were examined by
observing the correlations between the study variables for males and females (table 4.13). Since all
the plots were suggestive of mostly a linear relationship and there were no Pearson correlation
coefficients > 0.60, I used all the parameters to create the model. The outputs from this modeling
are shown in table 4.14a (for males) and table 4.14b (for females).
The study variables explained 49% of the variation in cardiovascular death rates among
males (R2 = 0.487) and 43% of the variation among females (R2 = 0.431). Among males, prevalence
of overweight in states was positively associated with cardiovascular death rates and the level of
vegetarianism was negatively associated with the cardiovascular death rates; the former was
statistically significant (p<0.03) and the latter tended towards significance (p<0.09). Among
females, vascular deaths were not significantly associated with the study variables at the state level.
The analysis of variance showed Prob>F-value to be significant for males (0.014) and not
significant for females (0.057).
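The adjusted R-square and overall F values reported in tables 4.14a and 4.14b can be cross-checked from the R-square values quoted above using the standard formulas for a model with 29 observations (states) and 6 predictors. The sketch below does this arithmetic; the function names are introduced here for illustration only, and small differences from the tabulated values are to be expected because the R-square inputs are rounded to three decimals.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R-square: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def overall_f_statistic(r2, n_obs, n_predictors):
    """Overall model F: (R2 / p) / ((1 - R2) / (n - p - 1))."""
    residual_df = n_obs - n_predictors - 1
    return (r2 / n_predictors) / ((1.0 - r2) / residual_df)


N_STATES = 29      # observations in the ecologic analysis
N_PREDICTORS = 6   # predictors in the multiple regression model

for sex, r2 in (("males", 0.487), ("females", 0.431)):
    adj = adjusted_r_squared(r2, N_STATES, N_PREDICTORS)
    f_value = overall_f_statistic(r2, N_STATES, N_PREDICTORS)
    print(f"{sex}: adjusted R2 = {adj:.3f}, F = {f_value:.2f}")
```

For males this reproduces the tabulated Adj R-Sq of 0.347 and F value of 3.48; for females the results agree with the tabulated 0.275 and 2.77 to within rounding.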
I was able to validate the assumptions of linear regression. The SPEC option yielded
Pr>Chi-sq of 0.85 for males and 0.56 for females, suggesting that the error terms were independent
and identically distributed. The Durbin-Watson statistic was between 1.6 and 2.4 (2.2 for males and
1.7 for females), indicating that the residuals were not serially correlated. The Shapiro-Wilk
test for normality showed p-values of 0.57 for males and 0.36 for females, indicating that the error
terms were consistent with a normal distribution. In both cases, the probability plots also confirmed the same.
The variance inflation factors were all less than the cut-off of 10, showing that the predictor variables
were not strongly collinear. Lastly, the Cook’s D statistic values were all <2, confirming that there were no
influential outliers.
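Two of the diagnostics named above have simple closed forms, sketched below under stated assumptions: the Durbin-Watson statistic is computed from residuals taken in the order in which the observations are stored (so for cross-sectional state data its value depends on that ordering), and the variance inflation factor for a predictor is computed from the R-square obtained when that predictor is regressed on all the others. The residuals and R-square values used here are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences of the
    residuals divided by the sum of squared residuals. Values near 2 suggest
    no first-order serial correlation."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def variance_inflation_factor(r2_on_other_predictors):
    """VIF for one predictor, given the R-square from regressing that
    predictor on all the other predictors in the model."""
    return 1.0 / (1.0 - r2_on_other_predictors)


# Hypothetical residuals from a fitted model, in storage order
example_residuals = [12.0, -8.0, 5.0, -3.0, 9.0, -15.0]
print(round(durbin_watson(example_residuals), 2))

# A predictor that is 50% explained by the others has a VIF of 2;
# the conventional cut-off of 10 corresponds to an R-square of 0.9.
print(variance_inflation_factor(0.5))
```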
Visual inspection of outliers carried out on scatterplots revealed that no single state appeared
to be an outlier for males; for females however, Mizoram appeared to be an outlier: it had high
female smoking prevalence (over 15%) and a high level of urbanization (over 40%) but a relatively
low vascular death rate (~50/100,000). Exclusion of Mizoram from the ecologic analysis however
did not significantly alter the model outcome.
Poisson Regression
Exploring the ecologic association by Poisson regression also revealed the same results (tables
shown in appendix). Among males, prevalence of overweight in states was positively associated
with cardiovascular death rates and the level of vegetarianism was negatively associated with the
cardiovascular death rates; the former was statistically significant (p<0.02) and the latter tended
towards significance (p<0.07). Among females, vascular deaths were not significantly associated
with the study variables at the state level. I used the PSCALE option to adjust for overdispersion
(wherein variance was greater than mean) and thereby obtain a better fit of the model (wherein the
SCALED DEVIANCE equaled 1.0).
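The overdispersion adjustment mentioned above can be illustrated with the Pearson-based dispersion estimate, which divides the Pearson chi-square by the residual degrees of freedom; a value well above 1 indicates that the variance exceeds the mean. The sketch below is a generic illustration of that calculation rather than a reproduction of the software output, and the observed and fitted counts are hypothetical.

```python
import math


def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom for a
    Poisson model, where the variance of each count equals its fitted mean."""
    chi_square = sum((o - m) ** 2 / m for o, m in zip(observed, fitted))
    residual_df = len(observed) - n_params
    return chi_square / residual_df


# Hypothetical observed death counts and fitted means from a Poisson model
# with two estimated parameters (intercept and one slope)
observed_deaths = [30, 55, 41, 80, 62, 25]
fitted_deaths = [38.0, 47.0, 45.0, 70.0, 66.0, 27.0]

dispersion = pearson_dispersion(observed_deaths, fitted_deaths, n_params=2)
scale = math.sqrt(dispersion)  # standard errors are multiplied by this factor
print(f"dispersion = {dispersion:.2f}, scale = {scale:.2f}")
```

Scaling leaves the regression coefficients unchanged and only widens (or narrows) their standard errors, which is consistent with the Poisson results above matching the linear regression findings in direction.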
Figure 4.28a Plots of vascular death rate per 100,000 males against predictor variables (checking for linearity of relationship)
Each panel plots vascular death rate per 100,000 (vertical axis, 0 to 450) against one predictor, with the fitted line:
% Urban: y = 0.794x + 278.77
Male smoking prevalence %: y = 0.1183x + 296.78
Lacto-vegetarianism prevalence %: y = -0.6757x + 319.26
% Regular fruit intake: y = 1.1292x + 242.07
% Males with BMI≥25: y = 5.8056x + 230.39
Male diabetes prevalence %: y = 17.384x + 254.2
Figure 4.28b Plots of vascular death rate per 100,000 females against predictor variables (checking for linearity of relationship)
Each panel plots vascular death rate per 100,000 (vertical axis, 0 to 450) against one predictor, with the fitted line:
% Urban: y = -0.1076x + 176.16
Female smoking prevalence %: y = -6.5572x + 192.27
Lacto-vegetarianism prevalence %: y = 0.0092x + 172.82
% Regular fruit intake: y = 0.0797x + 169.32
% Females with BMI≥25: y = 3.4845x + 112.49
Female diabetes prevalence %: y = 17.798x + 138.57
Table 4.13. Pearson correlation coefficients between variables at state level (to check for multicollinearity), males and females
Males              % Urban  ♂Smoking %  ♂Lacto-veg %  ♂Fruit intake %  ♂Overweight %  ♂Diabetes %
% Urban            1.00     -0.11       0.15          0.36             -0.04          0.19
♂Smoking %                  1.00        -0.20         -0.22            -0.31          -0.24
♂Lacto-veg %                            1.00          0.16             0.31           -0.37
♂Fruit intake %                                       1.00             0.47           0.40
♂Overweight %                                                          1.00           0.42
♂Diabetes %                                                                           1.00

Females            % Urban  ♀Smoking %  ♀Lacto-veg %  ♀Fruit intake %  ♀Overweight %  ♀Diabetes %
% Urban            1.00
♀Smoking %         -0.01    1.00
♀Lacto-veg %       0.18     -0.19       1.00
♀Fruit intake %    0.47     -0.19       -0.16         1.00
♀Overweight %      0.03     -0.41       0.43          0.14             1.00
♀Diabetes %        0.34     -0.23       -0.24         0.50             0.41           1.00
Table 4.14(a). Multiple linear regression of cardiovascular death rates among males at state level
Parameter Estimates
Variable                 DF   Parameter estimate   Pr > |t|   Variance inflation
Intercept                1    160.12               0.005      0
Smoking                  1    0.83                 0.279      1.19
Overweight               1    7.50                 0.029      2.20
Veg_%                    1    -0.96                0.086      2.05
Fruit intake             1    0.04                 0.958      1.63
Self-reported diabetes   1    5.16                 0.501      2.37
Urban_%                  1    1.03                 0.116      1.40
Analysis of Variance
F value = 3.48 (Pr > F = 0.014); Model (degrees of freedom) = 6
R-Square = 0.487; Adj R-Sq = 0.347
Test of first and second moment specification: Chi-Square = 21.3 (p = 0.85)
Durbin-Watson statistic = 2.2
Tests for Normality
Shapiro-Wilk statistic (W) = 0.97 (p = 0.57); Skewness = 0.42; Kurtosis = 0.15
Normal Probability Plot [plot not reproduced]
Validation of assumptions of linear regression:
1) Variance inflation factors (for multicollinearity): values < 10, hence variables were not strongly collinear
2) SPEC option (Pr>Chi-sq of 0.85 for ♂ and 0.56 for ♀) -- error terms independent and identically distributed
3) Durbin-Watson statistic: values b/n 1.6 and 2.4 indicate data were independent, not correlated & adequate sample size
4) Shapiro-Wilk test (p-values of 0.57 for ♂ and 0.36 for ♀) and normal probability plot -- error terms from a normal distribution
5) Cook’s D statistic values <2 -- no outliers
Table 4.14(b). Multiple linear regression of cardiovascular death rates among females at state level
Parameter Estimates
Variable                 DF   Parameter estimate   Pr > |t|   Variance inflation
Intercept                1    195.36               0.002      0
Smoking                  1    -4.76                0.083      1.50
Overweight               1    2.29                 0.281      2.57
Veg_%                    1    -0.25                0.527      2.19
Fruit intake             1    -0.62                0.355      1.89
Self-reported diabetes   1    14.0                 0.219      2.38
Urban_%                  1    0.11                 0.858      2.04
Analysis of Variance
F value = 2.77 (Pr > F = 0.057); Model (degrees of freedom) = 6
R-Square = 0.431; Adj R-Sq = 0.275
Test of first and second moment specification: Chi-Square = 27.3 (p = 0.56)
Durbin-Watson statistic = 1.9
Tests for Normality
Shapiro-Wilk statistic (W) = 0.96 (p = 0.36); Skewness = 0.76; Kurtosis = 1.08
Normal Probability Plot [plot not reproduced]
4.6 Biases & Limitations
Having shown the results of the descriptive geographical epidemiology of cardiovascular
risk factors and mortality, I proceed in this section to study the biases and limitations in this study.
4.6.1 Assessment of representativeness of surveys
All surveys were assessed by comparing the sex ratios (number of females per 1000 males)
and age-sex distributions to an external comparison, the 2001 census population. NFHS-3 was
excluded in the comparisons because it over-sampled women by intent in its study design.
4.6.1.1 Sex ratio
The sex ratios in selected surveys are tabulated in table 4.15. The two risk factor surveys
(SFMS & NFHS-2) had sex ratios of 946 and 965 per 1000 that were relatively more favourable to
females in the entire population as compared to the census sex ratio of 933. These ratios were even
higher in those aged 15 years and over. By residence, the sex ratios were relatively more favourable
for females in urban than in rural areas as per the 2001 census; this was however reversed in SFMS
and NFHS-2 surveys where the sex ratios were higher in rural locations than in urban communities.
SRS provided no data for such comparisons.
In the MDS, the sex ratio was skewed unfavourably towards women for two reasons: (i)
sampling was based on mortality, and (ii) the number of male deaths exceeded the number of female deaths.
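The sex ratio used throughout this comparison is simply the number of females per 1000 males. A minimal sketch of the arithmetic, with hypothetical counts chosen for illustration, is:

```python
def sex_ratio(n_females, n_males):
    """Number of females per 1000 males."""
    return 1000.0 * n_females / n_males


# Hypothetical sample with 4,035 females and 5,000 males
print(sex_ratio(4035, 5000))
```

A value below 1000 indicates fewer females than males in the sample; comparing it with the census value shows whether one sex is under-represented relative to the population.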
Table 4.15 Sex ratios (no. of females per 1000 males) in selected surveys in comparison to the census 2001 population
                                  Survey (year)                    Comparison
Variable                          SFMS     NFHS-2     MDS          Census
                                  1998     1998-99    2001-03      2001
Sex ratio (India)
  All ages (urban & rural)        946      965        807          933
  Age ≥ 15 yrs (urban & rural)    960      985        761          941
  Urban                           934      959        -            960
  Rural                           968      998        -            897
4.6.1.2 Age-sex distribution
The age-sex pyramids of the risk factor survey populations in comparison with the census
population are shown in Figure 4.29. Overall, the age-sex distributions of the three surveys
appeared broadly similar in proportions to the census population by 5-year age groups. All age-sex
distributions were characterized by a broad base and a narrow apex in the age-group of interest
(ages 15 years and over). Overlaying of individual age-sex pyramids over the census age-sex
pyramid (not shown here) revealed minor discrepancies in some age-groups (such as over-
representation of women in the 50-54 year age-group in the SRS 2004 survey).
Figure 4.29 Age-sex pyramids of selected surveys in comparison with Census 2001 population
Panels: Census 2001; SFMS 1998; NFHS-2 (1998-99); SRS 2004. Each panel plots the percent of males and of females (horizontal axis, 0 to 10 on either side) by 5-year age-group from 0-4 years to 80+ years (vertical axis).
4.6.2 Integrity of surveys
All of the selected surveys were recent (within the last decade), large (0.2 million to 4.5
million) and nationally-representative (covering almost all states and over 99% of the country’s
population) with few missing values. They were all household interviews or face-to-face individual
interviews covering males and females over the age of 15 years, non-institutionalized and living in
private households. They used single-stage or multi-stage sampling designs, with probability
methods implemented at the sampling stage to ensure that a representative sample
of the target population was obtained. All the surveys also had several hundred small geographic
areas as the primary sampling units in each state, allocated in proportion to urban and rural populations and
sampled based on the fertility and/or infant mortality levels for each state.
Judged by the survey metric of response rates, all surveys were well conducted [117].
The risk factor surveys achieved response rates over 92% and the mortality outcome survey
achieved a response rate of about 85%. A large sample size, however, does not
guard against potential bias from non-response.
All surveys were also high-quality in terms of having low numbers of missing observations.
The datasets had internal policies for identifying missing values by special responses or codes that
enabled the handling of missing observations to produce high-quality datasets for analysis in a
coherent and consistent form. As a result, it was noted that all the survey datasets had few missing
values (<1%) for the demographic variables of interest (age, sex, education and residence);
education was the only item for which about 8% had missing values in the Million Death Study.
Similarly among CVD predictors (smoking, diet, overweight and self-reported diabetes), it was seen
that BMI was the only item that had 7% missing values in some sub-groups (males) in NFHS-3
while all other items had <5% missing values in all surveys.
The validation of these datasets against an external comparison, the census population,
revealed differences in demographic characteristics between individual studies. Two risk factor
surveys (SFMS and NFHS-2) had an under-representation of males while the NFHS-3 had over-
sampled females by intent in study design. In addition, the National Family Health Surveys-2 & 3
(especially the most recent survey) had a relatively higher proportion of females with higher
education and from urban locations. The Million Death Study however had an under-representation
of females due to differential follow-up of individuals for vital status (dead or alive). Completeness
of reporting of deaths has earlier been documented to be around 82% for females and 87% for males
[98]. Comparison of age-sex structures to the census population revealed that all surveys were to a
large extent comparable except for minor variations in some age-sex groups.
These survey characteristics had two major implications. Firstly, they covered from 1998 to
2006, allowing for the standardization to the census 2001 population or the estimated 2006
population for comparability. Secondly, the large sample sizes and the comprehensive nature of the
datasets lent credence to the robustness of the study findings observed.
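The standardization referred to above can be illustrated with direct age standardization, in which age-specific rates from a survey are weighted by the age distribution of a standard population such as the census. The sketch below is a minimal illustration only; the age groups, rates and population counts are hypothetical and do not come from the study datasets.

```python
def directly_standardized_rate(age_specific_rates, standard_population):
    """Weighted average of age-specific rates, with weights given by the
    standard population's share in each age group."""
    total = sum(standard_population.values())
    weighted_sum = sum(
        age_specific_rates[group] * standard_population[group]
        for group in standard_population
    )
    return weighted_sum / total


# Hypothetical death rates per 100,000 and standard population counts
rates_per_100k = {"30-49": 100.0, "50-69": 500.0}
standard_pop = {"30-49": 700, "50-69": 300}

print(directly_standardized_rate(rates_per_100k, standard_pop))
```

Because every survey is weighted to the same standard, differences in standardized rates between states or surveys are not driven by differences in their age structures.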
Overall, these survey characteristics and survey metrics attest to the internal validity and
reasonably good representativeness of the different surveys [117].
4.6.3 Study characteristics
Potential sources of bias in the different datasets were examined by reviewing characteristics
of respondents, survey instruments and interviewers (table 4.16).
Survey instrument characteristics
The survey questions were different in the three surveys that had individual-level data.
There were filter questions on smoking in the SFMS survey followed by additional separate
questions on beedi and cigarette use. NFHS-3 on the other hand had no filter question but only a
common question on cigarette or beedi use. Questions on dietary factors were asked only in the
NFHS-3 survey. The frequency of dietary intake of specific food items such as fruits, eggs, milk,
fish and meat/chicken was questioned and this was used to examine fruit consumption and
vegetarianism. Self-reporting of diabetes and measurement of individual height and weight to
compute body mass index were included only in the NFHS-3 survey.
Respondent characteristics
The NFHS-3 survey obtained responses from all household members directly, unlike the
other surveys, which obtained them from proxy respondents who in most cases were heads of households
(as in SFMS & SRS) or any female respondent (as in NFHS-2). In addition, by design the
participants in NFHS-3 were younger, more often from urban areas and also relatively more
educated.
Interviewer characteristics
All four surveys employed trained non-medical field workers as interviewers.
Table 4.16: Potential sources of bias based on characteristics of respondents, survey instruments and interviewers in the four surveys
                             Risk factor surveys
Characteristics              SFMS         NFHS-2       SRS      NFHS-3
Respondent
  Type of respondent         Proxy        Proxy        Proxy    Self
  Sex
    Male                     51.0%        50.4%        n/a      37.4%
    Female                   49.0%        49.6%        n/a      62.6%
  Age (yrs) - Mean/median    35.8/33.0    35.6/32.0    n/a      29.8/29.0
  Education
    Illiterate               38.4%        35.5%        n/a      25.4%
    Upto Grade 5             27.8%        17.4%        n/a      14.7%
    Upto Grade 10            26.4%        31.9%        n/a      37.0%
    Grade 11 & over          7.4%         15.2%        n/a      22.9%
  Residence
    Urban                    23.5%        33.4%        n/a      47.9%
    Rural                    76.5%        66.6%        n/a      52.1%
Survey instrument
  Smoking: screening question
    SFMS: Does this person smoke? - yes/no
    NFHS-2: Does anyone smoke? - yes/no
    SRS: n/a
    NFHS-3: n/a
  Smoking: tobacco type
    SFMS: What do they smoke? 1. cigarette, 2. beedi, 3. other
    NFHS-2: n/a
    SRS: n/a
    NFHS-3: Do you smoke 1. cigarette/beedi, 2. pipe, 3. other
  Smoking: starting age
    SFMS: At what age?
    NFHS-2: n/a
    SRS: n/a
    NFHS-3: n/a
  Dietary determinants
    SFMS: n/a
    NFHS-2: n/a
    SRS: n/a
    NFHS-3: How often do you consume these foods (fruits, milk/egg/fish/meat)? 1=daily, 2=weekly, 3=occasionally, 4=never
Interviewer
  All four surveys: non-medical trained worker
n/a = data not available; totals may not add up to 100 because of missing observations
4.6.3.1 Effect of differences in survey questions
Type of tobacco smoked
The survey questions were different in the three surveys which had individual-level data. I
therefore graphed the proportion of population using cigarette, beedi or other forms of tobacco in
SFMS and NFHS-3 surveys to examine the effect of differences in survey questions (Figure 4.30).
The SFMS survey provided proportions using beedi and cigarette separately while the NFHS-3
survey provided a combined value for those smoking cigarette or beedi. All other forms of smoking
were relatively less common according to both surveys.
Figure 4.30 Comparison of proportions of adult males, ages 15-54 years, smoking different types of tobacco in selected surveys
SFMS, 1998 (%): Cigarette 6.2; Beedi 17.4; Other 2.3
NFHS-3, 2005-06 (%): Cigarette/beedi 30.6; Pipe 1.23; Other 0.81
4.6.3.2 Effect of type of interviewers
All surveys had non-medical field workers who were trained in survey-specific methods and
interview techniques. This involved description of question-by-question specifications for the study
instrument along with instructions for ‘dos and don’ts’ for the interview. For SFMS and SRS
surveys, these field workers were all permanent employees of the central government; for the NFHS
surveys, these were typically research assistants of central or state government departments in each
state. All workers typically had completed 10-12 years of formal education and were very fluent in
the vernacular language of each state and also well aware of the socio-cultural context within each
state. Differences between field workers across surveys were therefore minimal.
4.6.3.3 Effect of type of respondent
Age-specific prevalence of smoking
Figure 4.31 shows age-specific prevalence of smoking in the three different surveys. While the
age-specific prevalences were similar in those above 30 years of age, the age-specific
prevalences in the 15 to 29 year age-group were higher in the NFHS-3 survey than in the other
two surveys. The question on smoking was directed to each male in the household in NFHS-3, while
the information was obtained from other household members in the other two surveys.
Figure 4.31 Relationship of reporting bias for smoking with type of respondent in selected surveys in India
Plot of smoking prevalence % (vertical axis, 0 to 50) by age-group in years (horizontal axis, 15-19 to 65-69) for three surveys: SFMS 1998 (proxy), NFHS-2 1998-99 (proxy) and NFHS-3 2005-06 (self).
4.6.4 Differences in sociodemographic characteristics
The association of various risk factors with sociodemographic characteristics such as place of
residence and education is not uniform but complex. This is illustrated in Figure 4.32 for place of
residence (urban/rural) and in Figure 4.33 for education. Prevalence proportions for beedi and
cigarette smoking were estimated for those aged 40 years and over from the SFMS 1998-99 survey;
prevalence for other risk factors were from the NFHS-3 survey, vegetarianism and fruit intake
prevalence being estimated among those aged 15 years and over, while overweight and diabetes
prevalence were estimated among those aged 30 years and over.
Beedi and cigarette smoking had opposite relationships with place of residence. Overweight and
diabetes were uniformly higher in urban than in rural areas. Vegetarianism and fruit consumption
also had opposite relationships with place of residence.
Figure 4.32 Prevalence of various risk factors by urban-rural residence
Risk factor              Urban (%)   Rural (%)
Beedi                    16.1        31.5
Cigarette                13.6        5.6
Overweight               32          12.2
Vegetarianism            32.2        34.5
Fruit intake (weekly)    62.6        39.5
Diabetes                 2.7         1.9
At higher levels of education (as a proxy for socioeconomic status), there was a lower
level of beedi smoking but a higher level of cigarette smoking. Further, this educational group had
higher levels of protective factors (higher vegetarianism and higher fruit intake) but also higher
levels of adverse risk factors (such as higher body mass). The higher level of reported diabetes in
this educational group was probably due to awareness bias. These opposing variations in prevalence of risk factors by
education could impact cardiovascular death rates in transitional populations in complex ways and
could partly explain the paucity of associations in the ecologic analyses.
Figure 4.33 Prevalence of risk factors by education
Four panels, each plotting prevalence % (vertical axis) against education (horizontal axis: Illiterate, Up to Grade 5, Grades 6-10, > Grade 10): (1) Beedi and Cigarette; (2) Vegetarianism and Fruit intake; (3) Overweight; (4) Diabetes.
The complex and confounding effects of education and urbanization among males and females
could therefore be mediated through these different distributions of risk factors, leading to the
observed differences in cardiovascular mortality outcomes between males and females. This implies
that the focus of health policy may need to differ between subgroups in India. Figure 4.34
illustrates the differences in distributions of risk factors for the extremes of subgroups by sex, level
of education and place of residence. It looks at four subgroups of middle-aged Indians: urban
males or females with greater than secondary (grade 10) education and rural males or females with
less than primary (grade 5) education. Data on smoking were from the 1998-99 SFMS survey and
data on the other four risk factors were from the 2005-06 NFHS-3 survey. From the illustration, it
appears that smoking is a major risk factor among males (rural and urban), overweight is a major
risk factor among urban residents (females more than males) and diabetes is relatively more
common among urban residents.
Figure 4.34 Distribution of CVD determinants* by residence-sex-education group among adults (aged 15 years and over) in India
Group (population)                                         Diabetes  Smoking  Overweight/obesity  Lactovegetarianism  Weekly fruit intake
Urban females with > grade 10 education (≈ 25 million)     2.7%      0.4%     45.7%               46.7%               83.0%
Rural females with < grade 5 education (≈ 250 million)     1.1%      2.6%     10.0%               38.7%               26.4%
Urban males with > grade 10 education (≈ 80 million)       3.5%      25.2%    34.2%               35.2%               77.4%
Rural males with < grade 5 education (≈ 150 million)       1.2%      54.2%    5.1%                25.1%               32.3%
* Cardioprotective factors are shown in green shading and harmful risk factors are shown in red shading; Smoking data is from SFMS surveys while other four risk factors are from NFHS-3 survey
4.6.5 Limitations
These differences in study characteristics and participant characteristics were associated with
limitations in the datasets used and analytic methods undertaken.
4.6.5.1 Limitations in the data
Thus the available national health surveys selected for the present study differed in their key
objectives and study designs. Hence not all of them covered the entire range of middle-aged adult
life (30 to 69 years). Some were not representative of the overall population in terms of
demographic characteristics such as educational status and place of residence. They also differed in
various other ways such as in the elicitation of information from various respondents, definitions of
study variables and in the wording of the questions (eg. vegetarianism). All of these factors
affected the descriptive analysis, the comparisons across surveys and the study of time trends. The
study was also limited by the availability of study variables; some key risk factors such as
blood sugar and lipid profile were not available from national surveys.
4.6.5.2 Limitations in the analysis
The surveys had different sampling procedures with various levels of complexity in
staging, stratification and weighting involved in the selection of sampling units and respondents,
none of which were incorporated in the calculation of confidence intervals for the estimates. Because
the sample sizes were large in all the surveys, however, the confidence intervals around the
estimates were very narrow for almost all states except for some small states and union
territories.
For the linear regression analysis, in the interest of simplicity, interactions were not
considered in the model. Further, the analysis had inadequate power because the number
of observations (29 states) in the linear regression was small relative to the number of
parameters (six) in the model. Lastly, exploration of ecologic association between CVD
determinants and CVD outcome assumes independence of observations. In reality, this was not the
case for all predictors. For example, the global Moran’s I for vegetarianism
(equaling 0.7) showed that values in some states resembled those in neighbouring states, with resultant
clustering in some regions. I did not account for this clustering in the linear
regression analysis. Spatial regression would be an appropriate analytic method that accounts for
this clustering.
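As a minimal sketch of what the global Moran's I statistic measures, the Python fragment below computes it for two made-up prevalence patterns on a hypothetical chain of five regions. The function names, the binary contiguity weights and all data values are illustrative assumptions; none of them comes from the surveys analysed in this thesis.

```python
def morans_i(values, weights):
    """Global Moran's I: (n / W) * sum_ij w_ij z_i z_j / sum_i z_i^2,
    where z_i is the deviation of unit i from the mean and W = sum of weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    sum_sq = sum(d * d for d in z)
    return (n / w_total) * (cross / sum_sq)


def chain_weights(n):
    """Hypothetical contiguity matrix: region i borders only regions i-1 and i+1."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


# Made-up prevalence values (percent), not survey estimates.
gradient = [10.0, 20.0, 30.0, 40.0, 50.0]     # neighbours resemble each other
alternating = [10.0, 50.0, 10.0, 50.0, 10.0]  # neighbours differ sharply
w = chain_weights(5)

print("gradient pattern:   ", morans_i(gradient, w))
print("alternating pattern:", morans_i(alternating, w))
```

In this toy setting the smooth gradient gives a positive value (0.5) and the alternating pattern a negative one (-1.0). A value near 0.7, as found for vegetarianism, therefore indicates that neighbouring states tend to have similar prevalence, which is the situation in which ordinary regression standard errors are too optimistic.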
Further, since the study predictors included some at aggregate level and some at individual
level, multi-level modeling would have been an appropriate analytic method in this setting for
overcoming the individualistic fallacy and the ecologic fallacy and arriving at meaningful conclusions on
various determinants and health outcomes [118].
5 DISCUSSION
In this section, I first provide a summary of salient findings. Then I interpret the study results,
discussing the relevance and plausibility of the descriptive geographic epidemiologic findings as
well as the ecologic comparison of cardiovascular risk factors and mortality. Lastly, I outline key study
implications, list possible directions for future research and close with some concluding remarks.
5.1 Summary of salient findings
1) The selected surveys were large and nationally-representative surveys with high response rates and few
missing observations
2) There were differences between the various surveys with regard to study designs employed, survey
questionnaires used, respondents interviewed and the demographic characteristics of the subjects studied.
5.1.1 Smoking
3) About 30% of males aged ≥15 years were current smokers across surveys; this was more than 10-times
the prevalence among females. Among males aged ≥30 years, about 40% were smokers.
4) The mean age of initiation among males was 21 years and the peak smoking prevalence (45%) was seen
in the 45 to 59 year age-group.
5) Overall, about 70% of tobacco smoked was in the form of beedis and 20% was in the form of cigarettes.
This beedi:cigarette use ratio was 5:1 in rural areas and nearly 1:1 in urban areas.
6) Prevalence of beedi smoking was inversely related to educational attainment while cigarette smoking
was positively associated with level of education.
7) Rural males were 1.4 times more likely to be smokers than urban males.
8) There was a 6-fold variation in smoking prevalence between states.
9) Smoking was more common geographically in the northern states of the country with statistically
significant spatial clustering noted in the northeastern states.
10) Correlation between smoking prevalence proportions in males was high between surveys in most states.
11) Quitting smoking was quite uncommon throughout the country
12) Questions relating to smoking behaviour were not standardized across different surveys
5.1.2 Pre-obesity/obesity
13) Pre-obesity/obesity (BMI ≥ 25 kg/m2) prevalence was 11.8% among males and 15.1% among females.
14) The prevalence in urban areas (20.9%) was significantly higher than in rural areas (7.8%) (p<0.01).
15) About 80% of those who had BMI ≥ 25 were in the category of pre-obesity (BMI=25.0-29.9); about 19%
were obese (BMI=30.0-39.9) and <1% were morbidly obese (BMI ≥40.0).
16) The mean BMI in the three age-groups 15-19 yrs, 20-29 yrs and 30-49 yrs was 18.7, 20.5 and 21.6
respectively (p<0.001) among males and 19.3, 20.5 and 22.1 respectively (p<0.001) among females.
17) Among middle-aged adults aged 30 to 49 years, the prevalence of pre-obesity/obesity in states varied
nearly 10-fold among males and varied nearly 20-fold among females.
5.1.3 Dietary factors and Self-reported Diabetes
Vegetarianism
18) Overall, about a third of the population were lacto-vegetarians.
19) Males were less often vegetarians [OR(99%CI) = 0.67(0.65-0.68)] than females, in urban and rural areas.
20) Urban residents were less likely to be vegetarians [OR(99%CI) = 0.90(0.88-0.92)] than rural residents.
21) Vegetarianism was more common among those with greater educational attainment.
22) There was a strong east-west gradient in vegetarianism with the lowest prevalence in the northeastern
states and highest prevalence in the northwestern states.
23) There was wide variation in prevalence of vegetarianism across states: a 25-fold variation was noted in
urban areas and a 50-fold variation was seen in rural areas.
Fruit consumption
24) About half the population reported regular (at least weekly) fruit consumption
25) Males reported regular fruit intake more commonly than females [OR(99%CI) = 1.36(1.33-1.39)]
26) Regular fruit intake was higher in urban areas [OR(99%CI) = 2.58(2.52-2.64)] than in rural areas.
27) Proportion of population consuming fruits regularly increased about 3-fold across the education gradient
28) Regular fruit intake in states varied nearly 3-fold across urban areas and about 8-fold across rural areas.
Self-reported Diabetes
29) The prevalence of self-reported diabetes among males was 2.81% and that among females was 2.03%.
30) In urban areas the prevalence was more than double that in rural areas (3.28% vs. 1.47%) (p<0.01).
31) The prevalence showed nearly a 10-fold variation among males and an 8-fold variation among females.
5.1.4 Cardiovascular mortality
32) The mean cardiovascular death rate was 310/100,000 among males and 190/100,000 among females.
33) The rates among males ranged from 203/100,000 in Meghalaya to over 410/100,000 in Tamil Nadu.
Among females, the rates varied from about 40/100,000 in Mizoram to about 240/100,000 in Punjab.
34) In multivariate regression, the study variables explained 49% and 43% of the vascular death rate
variation among states for males and females respectively. The vascular death rates were significantly
associated with levels of overweight and vegetarianism for males at the state level; no such association
was found for females.
5.2 Smoking
While there can be no dispute over the role of smoking in the causation of cardiovascular
disease at the individual-level based on research over the last 60 years [49], smoking did not turn
out to be a significant predictor of CVD in this ecologic comparison. There are several possible
explanations for this apparent lack of correlation. In females, it may be due to small sample size; only a
small proportion of them in India were smokers. In males, it may be due to the following six
reasons. Firstly, it may be because of competing mortality. This is not entirely surprising given that
the picture of smoking epidemiology from this analysis made it clear that most smokers
in India were beedi smokers from lower socioeconomic strata. These smokers would
die not only from cardiovascular disease and cancers as is typically seen in industrialized countries
but also from competing causes such as tuberculosis and other chronic lung diseases at younger
ages [58]. Secondly, it may be due to demographic reasons: the average life-expectancy for Indian
males is only 62 years [119]. This means that the full burden of smoking vis-à-vis cardiovascular
mortality is not apparent now and will take time to evolve as the life-expectancy of males continues
to increase over the coming decades. Thirdly, the consumption of cigarettes has been increasing in urban
areas recently. This smoking epidemic may be separated in time by a lag-period of a few decades
from the CVD epidemic; hence the association between the two may not be detectable
in the interim period. Fourthly, there may have been some misclassification of smoking
status due to misreporting of smokers as non-smokers owing to the common misconception in India
that smoking beedis/cigarettes infrequently or in small quantities (<5 per day) is not harmful; this
belief is prevalent in the general population [120] and among health care professionals [121].
Fifthly, the lack of association could also be a feature of the ‘phase of epidemiologic transition’ --
with many regions still in different stages of epidemiologic transition it may take time for this
evolving picture to stabilize. Lastly, it may be a characteristic of the ecologic study design. From
the earliest ecologic studies including the Seven Countries’ Study by Ancel Keys et al [122], it has
been noted that smoking, unlike other cardiovascular risk factors (eg. cholesterol levels), has only a
weak correlation with CVD mortality at the ecologic level.
With regard to the descriptive epidemiology of smoking, this is the first study that compares
nationwide estimates of the smoking prevalence from different surveys in India. The greater
frequency of smoking among men (about 10-fold) as compared to women is well documented in the
literature from all large-scale surveys in India over the last two decades [54,55,123,124]. The
prevalence of smoking among males ranged from 26 to 30% according to the SFMS, NFHS-2 and
SRS surveys covering the period 1998 to 2004. This was comparable to the 29% documented by
Neufeld et al [55] earlier in the National Sample Survey (round 52) conducted during 1995-96. It
was however lower than the 33% seen in the most recent NFHS-3 survey of 2005-06. Comparison
of estimates across time reveals variations that could be explained by differences in sample size or
in the choice of study respondents. The most likely explanation however is that NFHS-3, which
unlike the other three surveys employed self-reporting instead of proxy-reporting, yielded a
higher estimate for that reason. This has been noted earlier within the NFHS-2 by
Rani et al [54] who compared estimates for family members reporting for themselves against the
estimates for other family members and concluded that there was up to 5% under-estimation of
smoking prevalence among men when proxy-respondents were interviewed. Older family members
who were proxy respondents were likely to be unaware of the smoking status or were under-
reporting, due to social stigma, the smoking status of adolescents or young adults in the family.
Age-specific prevalence rates of tobacco use among males revealed interesting differences
in comparison with global and U.S. data. The peak smoking prevalence (45%) in my datasets was in
the 45-59 years age-group; this was higher than the peak use (36%) in ages 30-39 years reported in
global estimates [56] as well as the peak use (39%) seen in the age-group of 21-25 year old males in
America [57]. The observed age-dependency of smoking could be attributed to one of three possible
factors: it could be due to cohort effect (with declining prevalence over time with younger cohorts
smoking less) or it could be due to age effect (younger persons smoking less often and more people
initiating smoking as they get older) or it could be due to under-reporting of smoking by younger
people. Available evidence indicates that there is a combination of under-reporting of smoking at
younger ages as well as an actual increase in prevalence of smoking with age up to mid-50s. This
has implications for health policy and programming with respect to smoking control in India: the
initiation into smoking could occur at any age and not just among young people. Hence tobacco
control programmes need to focus on all age-groups (adolescents, young adults and middle-aged
adults).
There are two subpopulations of smokers in India: beedi smokers who were likely to be
older, reside predominantly in rural areas and have lower education compared with cigarette
smokers who were more likely to be younger, live in urban areas and have higher education. While
the strong gradient of smoking with education is seen worldwide [125], this dichotomy between
beedi smokers and cigarette smokers is not common. It has been observed in a smaller cross-
sectional study in urban Delhi previously [123]. The identification of these two subpopulations with
different risk markers points to the need for tailoring tobacco control programmes to suitably target
these two vastly different groups of smokers.
My study has also identified social inequalities with respect to smoking among males. Those
who were less educated were not only more likely to be smokers, they were also more likely to
initiate smoking earlier than those with higher education. Further, beedi smokers were more
likely to initiate smoking earlier when compared to cigarette smokers. Such inequalities in initiation
are of use in identifying the important age groups and entry points for policies to tackle inequalities
in smoking.
There were wide geographic variations in current smoking between states in India. While
the inter-state variation (up to 7-fold) for overall smoking has been noted earlier [41,54], what was
new from my analysis was that there was an even greater 25-fold variation between states for
cigarette smoking and a 50-fold variation between states for beedi smoking. Smoking was observed
to be more common geographically in the northwestern and northeastern states of the country as
reported earlier [41,54]. Through spatial analysis, statistically significant spatial clustering of
smoking was noted in the northeastern states and Kerala was seen to be an outlier with high
smoking prevalence among the southern states. The state level variations may be due to underlying
differences in regional socio-cultural patterns or due to different public policies on tobacco in
different states. There are also implications for tobacco control policies at national and state-level.
State-level policies may sometimes need to be focused on individual states or at times jointly
between neighbouring states if there are strong ties between neighbouring states due to shared
socio-cultural norms or because of trade and commerce ties. Another interesting finding was the
beedi:cigarette use ratio in states that ranged from about 1:1 (in Delhi & Kerala) to about 30:1 (in
Gujarat). This may reflect socio-cultural differences between states, differences in economic status
or differences in business policies or tobacco control policies. This aspect needs to be investigated
further.
Correlation between smoking prevalence rates in males was noted to be high between
surveys in most states. This was despite the differences between the various surveys with regard to
data on smoking. The format in which the questions relating to smoking were phrased in the
different surveys may partly account for the observed differences. While SFMS asked explicitly
about all forms of tobacco separately (beedi, cigarette, hukka, other), NFHS-3 probed for some
detail (cigarette/beedi, pipe, other) and NFHS-2 only asked whether any household member
‘smoked tobacco’ leaving the interpretation of tobacco to the respondents themselves. Further, there
were differences between studies with regard to the choice of respondents. The estimates from
SFMS, NFHS-2 and the SRS, all of which relied on proxy household informants, were lower than
the prevalence rate of smoking obtained from NFHS-3 that asked for self-reporting by each
individual. Specifically, reporting of the smoking behaviour of those aged 15 to 29
years depended on the type of respondent, with proxy-respondents consistently reporting
less smoking among these younger males than the individuals reported
themselves. Household respondents (be it the head of the household, usually the eldest male, or any
other adult respondent) may not report accurately about the smoking status of all household
members either because he/she may not be aware of the smoking habits of other household
members [54] or may be intentionally under-reporting about some specific members of the
household due to prevailing social norms. The rate of agreement between proxy and self report of
smoking status has been compared amongst various ethnic groups in a survey of 57,244 households
in the U.S.A. Cohen's kappa coefficient of agreement on smoking status was found to be 0.82 for
Asian Americans; it was lower than that seen among non-Hispanic whites and African Americans
(kappa = 0.91) but higher than that seen among Hispanics (kappa = 0.76). These smoking rates
were however estimated by telephone surveys [101]. No such information was available from the Indian
context.
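To make the agreement statistic concrete, the following is a minimal Python sketch of Cohen's kappa for a cross-tabulation of self-reported against proxy-reported smoking status. The function name and the 2x2 counts are hypothetical and chosen only for illustration; they are not taken from the U.S. survey cited above or from any Indian dataset.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table where table[i][j] is the number of
    subjects placed in category i by one source and category j by the other."""
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: share of subjects on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    # Agreement expected by chance from the marginal proportions.
    row_share = [sum(table[i]) / total for i in range(k)]
    col_share = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    expected = sum(row_share[i] * col_share[i] for i in range(k))
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts: rows = self-report, columns = proxy report,
# categories in the order (smoker, non-smoker).
hypothetical_table = [
    [40, 10],  # self-reported smokers: 40 confirmed by proxy, 10 missed
    [5, 45],   # self-reported non-smokers: 5 called smokers by proxy
]

print("kappa =", cohens_kappa(hypothetical_table))
```

With these made-up counts the observed agreement is 85% and the chance-expected agreement is 50%, so kappa is 0.7. Kappa is preferred to raw percentage agreement here because, when smoking is rare or very common, two sources can agree often by chance alone.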
The above differences in phrasing of questions or choice of respondents could thus partly
account for differences in smoking prevalence between surveys within some states such as Tamil
Nadu and Assam; in Tamil Nadu, estimate of smoking prevalence obtained by self-reporting was
higher than that obtained from proxy-reporting while in Assam, the estimate from proxy-reporting
was higher than that obtained from self-reporting.
Differences between the populations sampled could have affected the reported prevalence of
smoking in the study populations. The NFHS-3 survey had respondents who were younger, were
from urban locations and were relatively more educated. This could have biased the smoking
prevalence downwards compared to the other surveys. But because this survey obtained reports
from self-respondents, the prevalence estimate obtained was higher than that of other surveys; so
the actual prevalence was possibly even higher.
Finally, there is the question of whether individual reporting of smoking behaviour without
validation using biomarkers of tobacco use is an optimal method of studying smoking habit in the
general population. Mainly, three biochemical measurements have been used to validate reported
smoking: carbon monoxide, thiocyanate, and cotinine [126]. A meta-analysis by Patrick et al in
1994 [127] that identified 26 published reports containing comparisons between self-reported
behavior and biochemical measures concluded that self-reports of smoking were reasonably
accurate in most general population studies. Biochemical assessment, preferably with cotinine, was
recommended to improve accuracy only in intervention studies and student populations. A more
recent systematic review of 67 studies revealed trends of underestimation when smoking prevalence
was based on self-report [100]. It also showed varying sensitivity levels for self-reported estimates
depending on the population studied and the medium in which the biological sample was measured.
Sensitivity values were consistently higher when cotinine was measured in saliva instead of urine or
blood. These studies were however based in industrialized countries predominantly.
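The link between the sensitivity of self-report and the underestimation of prevalence can be sketched as follows in Python. The function name and all counts are hypothetical assumptions made for illustration, with a biomarker such as cotinine treated as the reference standard; they do not reproduce figures from the cited reviews.

```python
def self_report_validity(true_pos, false_neg, false_pos, true_neg):
    """Compare self-reported smoking with a biomarker reference standard.

    true_pos  : biomarker-positive and self-reported smoker
    false_neg : biomarker-positive but self-reported non-smoker
    false_pos : biomarker-negative but self-reported smoker
    true_neg  : biomarker-negative and self-reported non-smoker
    """
    total = true_pos + false_neg + false_pos + true_neg
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "self_reported_prevalence": (true_pos + false_pos) / total,
        "biomarker_prevalence": (true_pos + false_neg) / total,
    }


# Hypothetical validation sample of 1,000 adults (made-up counts).
result = self_report_validity(true_pos=170, false_neg=30,
                              false_pos=16, true_neg=784)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this made-up sample a sensitivity of 85% turns a biomarker-based prevalence of 20.0% into a self-reported prevalence of 18.6%, which illustrates why self-report tends to underestimate prevalence whenever denial by true smokers outweighs false positive reports.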
None of the surveys had questions on quitting behaviour among smokers. SRS 2004 had
group-level data on the population proportion that was classified as ex-smokers or current smokers.
I used this ratio of ex:current smokers to obtain some understanding of quitting among smokers
keeping in mind the limitations with the available data. This ratio was dependent on
misclassification of smokers as ex-smokers as well as the current smoking prevalence in a state. The
national value was <4%; this was about a tenth of values seen in developed countries [53].
5.3 Overweight and obesity
Excess body weight is an independent predictor for cardiovascular diseases and other risk
factors such as type 2 diabetes, hypertension and dyslipidemia. In my study, increased BMI was
positively correlated with CVD mortality for males in the ecologic comparison. Each unit increase
in BMI was associated with a 7.5/100,000 increase in CVD death rate. Such a finding on ecologic
comparison is in consonance with other studies on individuals who on prospective follow-up
experienced higher vascular death rates with increasing BMI. Most recently, the Prospective Studies
Collaboration that reviewed a total of 57 prospective studies with 894,576 participants (mostly in
western Europe and North America) identified that at BMI of 30-35 kg/m2, median survival was
reduced by 2-4 years; above 40 kg/m2, it was reduced by 8-10 years (comparable with the effects of
smoking) [50].
In my study, about 12% of males and 15% of females had BMI ≥ 25kg/m2. The urban-rural
differences were however much bigger with one in five individuals being overweight in urban areas
and one in twelve persons being overweight in rural areas. About 80% of those who had BMI ≥ 25
were in the category of overweight/pre-obesity (BMI=25.0-29.9); the rest were in the category of
class I & II obesity (BMI=30.0-34.9 and BMI=35.0-39.9) with <1% being morbidly obese (BMI
≥40.0). Mean BMI values were generally lower than 23.0 among age, sex and residence groups
studied. These findings are consistent with available knowledge regarding female preponderance of
overweight/obesity over males, urban rates being higher than rural rates, and rarity of obesity but
increasing prevalence of pre-obesity among Asian Indians [61,62]. A significant finding that was
opposite to other studies elsewhere was that pre-obesity was also more common in females than
males whereas in most other settings males had more pre-obesity than females [61].
Anthropometric measurements such as BMI have an important place in nutritional
assessment. BMI has a J-shaped association with mortality; while at the lower end, it is associated
with digestive and respiratory mortality and may be confounded by smoking and disease states, at
the higher end, it is associated with diabetes and vascular mortality. It is clear that the imbalance
between caloric intake and energy expenditure is fueling the epidemic of overweight and obesity
worldwide [128].
There are however methodological issues associated with BMI measurements as estimates
of body fat percentage (BF%) and risk for cardiovascular disease. Firstly, reliability of physical
measurements is dependent on characteristics relating to the subject, instrument and the
anthropometrist. No reported ‘technical errors of measurement’ (TEM) were published for the
NFHS-3 survey to assess how accurately the anthropometrists (field workers) took the
measurements in comparison against a criterion anthropometrist [129]. This could potentially affect
any univariate and multivariate analysis and attendant interpretations. Secondly, though BMI is
widely used as a measure of BF%, there is increasing evidence that this may not be applicable
universally, since it is age-dependent, sex-dependent and is also associated with ethnic differences.
This is especially so for Asians and other populations that differ in body build and body proportions
from Caucasians in whom the earlier research was done and for whom BMI appears to be a good
indicator of body fatness. Thirdly, it is increasingly becoming obvious that universal cut-points for
BMI that were arbitrarily determined earlier may not be applicable for all populations. For example,
in Asians the high risk of type 2 diabetes and cardiovascular disease is substantial at BMIs lower
than the existing cut-off point for overweight. So although BMI cut-points of 25, 30, 35 and 40
were used in this analysis, it is now accepted that BMI cut-points of 23.0, 27.5, 32.5 and 37.5 may
be considered for public health action in Asian and other populations [61].
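The practical effect of the two sets of cut-points can be sketched in Python as below. The cut-point values are those quoted above; the function names, the band numbering and the example person are hypothetical and serve only as an illustration.

```python
from bisect import bisect_right

# Cut-points (kg/m2) quoted in the text.
STANDARD_CUT_POINTS = (25.0, 30.0, 35.0, 40.0)      # used in this analysis
ASIAN_ACTION_POINTS = (23.0, 27.5, 32.5, 37.5)      # suggested for public health action


def body_mass_index(weight_kg, height_m):
    """BMI = weight in kilograms divided by the square of height in metres."""
    return weight_kg / (height_m ** 2)


def bmi_band(bmi, cut_points):
    """Number of cut-points that are less than or equal to bmi.

    Band 0 means below the first cut-point; band 4 means at or above the last.
    """
    return bisect_right(cut_points, bmi)


# Hypothetical adult: 70 kg and 1.70 m tall, giving a BMI of about 24.2.
example_bmi = body_mass_index(70.0, 1.70)
print("BMI:", round(example_bmi, 1))
print("band with standard cut-points:", bmi_band(example_bmi, STANDARD_CUT_POINTS))
print("band with Asian action points:", bmi_band(example_bmi, ASIAN_ACTION_POINTS))
```

The hypothetical person falls below the first cut-point under the standard scheme (band 0) but above the first action point under the Asian scheme (band 1), so any prevalence of elevated BMI computed with the lower cut-points would be higher than the figures reported in this analysis.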
Geographic variation in prevalence of high BMIs among those aged 30 to 49 years shows
that it was high (>33%) in some states such as Punjab, Gujarat, West Bengal and the south Indian
states of Kerala, Tamil Nadu and Andhra Pradesh. The prevalence of pre-obesity/obesity varied 13-
fold among males ranging from 2.9% in rural Chattisgarh to 38% in urban Punjab and varied 22-
fold among females ranging from 2.5% in rural Jharkhand to 55.9% in urban Punjab in this age-
group. This has implications for which states are likely to see the adverse effects on development of
diabetes and increased vascular risk. Further, prevalence of pre-obesity/obesity was high (nearly
25%) among females. This is probably due to a sedentary lifestyle, especially among urban women,
as was documented in the more detailed PURE cohort study of 21,934 participants from five centres
in India [130,131]. Overweight was also more common in these same states in younger age groups
(adolescents aged 15 to 19 years and young adults aged 20 to 29 years) as well. This will influence
future obesity rates since individuals who become overweight earlier on are more likely to be
overweight or obese as adults [128].
5.4 Dietary factors & self-reported diabetes
In the ecologic comparison, vegetarianism was inversely correlated with CVD mortality for
males. Each unit increase in vegetarianism prevalence was associated with a 1/100,000 decrease in
CVD death rate. Such a finding on ecologic comparison is in agreement with current knowledge
based on studies on individuals for whom vegetarian diets offered protection against vascular
mortality [64]. There was no such association for females. And for both males and females, regular
fruit intake was not significantly associated with CVD mortality at the state level. This could
possibly be due to several reasons. Firstly, it could be due to measurement issues. The surveys used in this
thesis did not define fruits, and this could have affected the results. In surveys in India, bananas
are commonly reported as part of fruit intake, and fruit juices with added sugars are not
differentiated from fresh fruits. Similarly, potato and yam are considered vegetables by respondents.
Secondly, the optimal recommended intake of fruit and vegetable servings per day to prevent
vascular disease has not been identified for Indians as it has been for populations in industrialized
countries.
Vegetarianism
Geographic mapping of the distribution of lacto-vegetarianism revealed an interesting east-
west gradient across the country with eastern and southern states having lower prevalence (<20%)
and northwestern states having higher prevalence (≈ 70%). Diet was dependent on urban or rural
residence, with the rural population reporting greater level of vegetarianism. It was also dependent
on the level of education; those with a higher education reporting a greater level of vegetarianism.
This finding was in the opposite direction to what one would expect of the association between
lacto-vegetarian diet and education (as a proxy for income) because non-vegetarian foods are
generally more expensive in India. In the NFHS-3 survey, which had a higher proportion of urban and
educated respondents, there was thus a combined effect of decreasing vegetarianism (with a higher
proportion of urban population) and increasing vegetarianism (with a higher proportion of educated
respondents).
Although a high consumption of red meat, which is rich in haem iron and saturated fat, may
increase the risk of heart attacks and stroke, this does not apply to white meat and fish. In fact, the
cardio-protective effect would seem to be derived from the consumption of unrefined vegetable
products (whole-grain cereals, vegetables and fruits) and fish. In other words, a diet containing
ample quantities of unrefined vegetable products along with moderate amounts of animal products
(in which red meat is partly replaced by white meat and fish) is considered to be just as protective as
a vegetarian diet. On the other hand, a vegan diet is associated with an increased risk of deficiencies
of iron, vitamin B12, and other micronutrients. The cardio-protective effect of the lacto-vegetarian
diet in India however may be offset to some degree because of certain cooking practices seen in
various states [67]. For instance, most vegetables are cooked or deep-fried rather than being
consumed as fresh vegetables or salads. Most food items are also exposed to prolonged or repeated
cooking. In addition, trans-fats or hydrogenated fats (eg. vanaspathi) are commonly used, especially
in several parts of urban India. These dietary trans fatty acids are known to raise LDL, triglycerides,
and lipoprotein(a) and lower HDL cholesterol [67]. Further, the types of cooking oils used
throughout the country are different with varying amounts of saturated and unsaturated fatty acids;
for example, the overall effect of mustard oil is considered to be protective against ischaemic heart
disease [67,132].
Fruit intake
Frequency of fruit intake was measured not in terms of number of days per week but as
daily, weekly, occasionally or never. So ‘at least weekly’ intake was computed as a marker of
protection against vascular disease. Fruit consumption was higher among males as compared to
females. It was also more common in urban areas and among those with higher education.
While life-time vegetarianism is less likely to be misclassified, the frequency of consumption of
specific foods such as fruits is more likely to be misclassified due to recall bias or reporting bias.
This is a problem associated with nutritional epidemiology and requires further validation.
Although much remains to be known regarding the role of specific nutrients in reducing the
risk of cardiovascular disease, dietary patterns are increasingly being identified as an important
determinant [128]. Dietary patterns that emphasize whole-grain foods, vegetables, and fruits and
that limit red meat, full-fat dairy products, and foods and beverages high in added salt and sugars
are associated with reduced risk of cardiovascular diseases.
Self-reported diabetes
The overall prevalence of self-reported diabetes was 2.3%. The prevalence in urban areas
was double that in rural areas. Similarly high prevalence based on plasma glucose testing has been
well documented in urban south India (Chennai city) with an increasing secular trend over the last
two decades [68]. Self-reported prevalence was also found to be directly linked with level of
education.
Methodologically speaking, self-reported prevalence is likely to be an under-estimate of the
true prevalence because of the effect of health care utilization on diagnosis of diabetes. Higher
prevalence noted in males as compared to females, in urban areas more than rural areas and a link
with education gradient all point to the effect of differences in access to health care. Though the
absolute levels are low according to self-reporting, the general trends are probably true.
Subramanian et al (2009) have shown recently that self-reporting of morbidity need not, in general,
be inaccurate in India [102].
Geographic variation in prevalence of self-reported diabetes was strong with high
prevalence (5-11%) in the southern and southeastern states. This may partly be a reflection of
differences in health care availability since southern states generally have better health care services
than the northern states; but the high prevalence in some southeastern states such as Orissa and
Chattisgarh, which have weak health infrastructure, argues against this explanation. Hence it is
possible that states with a high prevalence of diabetes truly have multiple risk factors for
the development of diabetes. A similarly high prevalence of diabetes has been documented in urban
areas of Toronto with large numbers of south Asian immigrants [78].
Indians are said to have the so-called "Asian Indian Phenotype", which refers to certain unique
clinical and biochemical abnormalities including increased insulin resistance, greater abdominal
adiposity (i.e., higher waist circumference despite lower body mass index), and high levels of
high-sensitivity C-reactive protein [133]. This phenotype makes Asian Indians more prone
to diabetes and premature coronary artery disease. This may at least partly be genetic [62].
However, the epidemic of diabetes is primarily fueled by the rapid epidemiological transition
associated with changes in dietary patterns and decreased physical activity as evident from the
higher prevalence of diabetes in the urban population.
5.5 Cardiovascular mortality
The cardiovascular death rates estimated by the verbal autopsy method were 308 per 100,000
for males and 198 per 100,000 for females in middle age (30 to 69 years). These rates are lower than the overall rate of 428 per
100,000 estimated by WHO for India [36]. There was a 2-fold variation between states for males and
a 6-fold variation for females. The aggregate level study variable (percent urban population in the
states) and the individual-level determinants (such as prevalence of smoking, overweight,
lactovegetarianism, fruit intake and diabetes) were able to explain 49% of the variation in CVD
among males and 43% among females. Of these factors, the critical finding by multivariate
regression was that the vascular deaths among males were significantly determined by the level of
overweight prevalence and vegetarianism in the state; no such significant determinant was detected
for female CVD deaths. Ranking of states also revealed associations in the expected directions
between vegetarianism, overweight, diabetes and CVD death rate for males; there was an
association between overweight, diabetes and CVD death rates for females as well.
Thus overweight/obesity and diabetes are likely to be key drivers of the CVD epidemic at
the state level in India in the future. The complex relationships between various risk factors and the
sociodemographic characteristics such as education and urbanization may partly explain the mixed
picture. Unlike in developed countries (early industrializers) where the burden of CVD is
predominantly seen in those from lower socioeconomic strata [134,135,136], in India the
cardiovascular disease burden is seen predominantly in those from higher socioeconomic
strata. Those in urban areas with higher education were seen to be more likely to be cigarette
smokers, more likely to be overweight and more likely to report having diabetes and experience
higher cardiovascular mortality; this was in spite of smoking beedis less often and consuming
relatively more vegetables and fruits. Preliminary data based on 9,290 participants aged 35 to 70
years from the Bangalore study centre of the PURE cohort study in India [130,131] also highlight
this pattern (Table 5.1). Those who were affluent (and predominantly in urban areas) consumed
more vegetables and fruits but had relatively higher intakes of calories, fats,
sugars and salt as compared to those in rural areas, and higher levels of body mass, cholesterol,
diabetes (based on self-report or fasting blood sugar) and coronary heart disease (based on
self-report or electrocardiogram).
Table 5.1 Profile of cardiovascular disease and its risk factors in rural and urban populations in southern India, PURE study [130,131]

Characteristic                           Rural               Urban
                                         (Andhra Pradesh)    (Karnataka)
                                         N=3323              N=5967
Daily dietary intake
  Energy (kcal)                          1843                2278
  Carbohydrate intake (g)                362                 351
  Fat intake (g)                         23                  66
  Sugar (g)                              5.2                 28.5
  Salt (g)                               2.4                 8.1
  Total vegetables (g)                   50                  158
  Total fruits (g)                       49                  166
Anthropometric profile
  Pre-obesity/obesity (BMI ≥25 kg/m2)
    Males (%)                            5.5                 44.3
    Females (%)                          6.0                 60.3
Mean serum cholesterol (mg/dl)
    Males                                163.4               191.3
    Females                              168.8               207.8
Diabetes
    Males (%)                            3.7                 17.9
    Females (%)                          1.7                 13.8
Coronary heart disease
    Males (%)                            5.9                 9.0
    Females (%)                          3.8                 6.2
This is probably characteristic of the early stages of the epidemiologic transition that occurs as a
consequence of increasing life-expectancy, increasing urbanization and improving socioeconomic
conditions in India and other developing countries. Lifestyle changes associated with the stage of
socioeconomic development in a population may explain the varying associations between
socioeconomic status and cardiovascular diseases that are observed between countries. This
hypothesis is supported by the reversal in the association between coronary heart disease mortality
and socioeconomic status observed in ‘early industrializers’ [79,137]. Smoking cessation, better
nutrition, and physical activity are potential mechanisms for explaining these trends, because earlier
adoption of healthy behaviors by people from higher socioeconomic groups may have caused
differential declines in coronary heart disease. This has been documented in national surveys in
other countries such as Korea that adopted industrialization after western countries but before India
[138].
Although a substantial proportion of the variation (43% to 49%) was explained by the study variables,
only a small number of statistically significant variables were identified by the ecologic
analysis. This could be explained by the underpowered nature of the study, as identified earlier in the
limitations section, since the sample comprised only 29 states.
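The effect of the small number of states can be illustrated with a minimal Python sketch of the adjusted R-squared, which penalizes the explained variation for the number of predictors relative to the number of units. The reported values (49% and 43% explained variation, 29 states) are taken from the text; the count of six predictors (percent urban population, smoking, overweight, lactovegetarianism, fruit intake and diabetes) and the function name are assumptions made for illustration and do not reproduce the original analysis.

```python
def adjusted_r_squared(r2: float, n_units: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    residual_df = n_units - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("number of units must exceed number of predictors + 1")
    return 1.0 - (1.0 - r2) * (n_units - 1) / residual_df


N_STATES = 29      # sample size stated in the text
N_PREDICTORS = 6   # assumption: the six study variables listed in section 5.5

adj_males = adjusted_r_squared(0.49, N_STATES, N_PREDICTORS)
adj_females = adjusted_r_squared(0.43, N_STATES, N_PREDICTORS)

print(f"Adjusted R^2, males:   {adj_males:.3f}")
print(f"Adjusted R^2, females: {adj_females:.3f}")
```

Under these assumptions the adjusted values fall to roughly 0.35 and 0.27, which is consistent with the caution that a model fitted to 29 units leaves few degrees of freedom for detecting individual predictors.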
In addition, the limited number of variables available from across different study datasets
did not cover all the traditional risk factors known to be CVD determinants; information on blood
sugar and lipid profile was not available from any survey. Hence, it is possible that some of the
residual variation that is unexplained could be due to other unmeasured confounding factors such as
blood glucose, lipids, hypertension, etc. or probably due to differences in genetic factors. Further, at
the regional level, I attempted to relate aggregate-level information (such as level of urbanization of
a state) to overall cardiovascular death rates. It is possible that the outcome may be linked to
regional determinants such as the relative wealth (% gross domestic product) or socioeconomic
status (education, income, occupation or house type) of people living in different states in India.
Among the latter, while income is frequently misclassified by respondents in a survey setting in
India, education, occupation and house type are less prone to this misclassification error.
Subramanian et al (2007) have identified income inequalities between states to be a predictor of
both underweight and overweight prevalence among various states in India [63].
Further, the lack of a clear association between the risk factors studied and vascular
mortality outcome could be because of a lack of a relevant lag period between the two. It is known
that these risk factors may take years to decades to interact, produce cardiovascular disease
and cause death. The study datasets I used, however, covered only a 9-year period.
Lastly, the death rate data come from population-based surveys dependent on the verbal
autopsy method and not from a hospital-based death certification system. In verbal autopsy, the
collection of details of circumstances surrounding death by trained lay-workers is to a large extent
dependent on the ability of the respondent to recognize, recall and report positive and negative
symptoms and signs in the correct chronological order. Further, many such symptoms and signs are
not exclusive to the cardiovascular system for easy identification and correct ascertainment of cause
of death by the physician coder. Hence, VA data have inherent limitations in the quantity and
quality of information collected from respondents in settings with limited access to health care
services; there are also cost and cross-site comparability issues in a physician-coded VA system
[36,139]. Validation studies conducted earlier have compared causes of death obtained by verbal
autopsy against hospital-based diagnoses in northern and southern India [29,140]. The
cause-specific mortality fractions assigned by the verbal autopsy method were statistically similar to the
causes arrived at by review of hospital records (p>0.05) [29,140]. Specificity was high (>95%) for
all broad cause groups except cardiovascular (79%) diseases. Sensitivity for cardiovascular diseases
was the same as that for neoplasms and infectious diseases (60% to 65%) but lower than that for
injuries (85%) and higher than that for respiratory, digestive, and endocrine diseases (20% to 40%)
[140]. This was broadly consonant with findings for noncommunicable diseases in a multi-centre
validation study in Africa [141].
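The consequence of imperfect sensitivity and specificity for an estimated cause-specific mortality fraction can be sketched as follows. The sensitivity (60%, the lower end of the reported range) and specificity (79%) are taken from the validation figures quoted above; the true cardiovascular fraction of 25% is a purely hypothetical value chosen for illustration, and the function names are assumptions. The back-correction shown is the standard misclassification adjustment (often called the Rogan-Gladen estimator), not a method used in this thesis.

```python
def observed_fraction(true_fraction: float, sensitivity: float, specificity: float) -> float:
    """Fraction of deaths labelled cardiovascular given imperfect classification."""
    true_positives = sensitivity * true_fraction
    false_positives = (1.0 - specificity) * (1.0 - true_fraction)
    return true_positives + false_positives


def corrected_fraction(observed: float, sensitivity: float, specificity: float) -> float:
    """Back-calculate the true fraction from the observed one."""
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (observed - (1.0 - specificity)) / denominator
    return min(1.0, max(0.0, estimate))  # keep the estimate within [0, 1]


SENSITIVITY = 0.60    # from the validation studies cited in the text
SPECIFICITY = 0.79    # from the validation studies cited in the text
TRUE_FRACTION = 0.25  # hypothetical value, for illustration only

observed = observed_fraction(TRUE_FRACTION, SENSITIVITY, SPECIFICITY)
recovered = corrected_fraction(observed, SENSITIVITY, SPECIFICITY)

print(f"Observed fraction:  {observed:.4f}")
print(f"Recovered fraction: {recovered:.4f}")
```

With these inputs the observed fraction exceeds the hypothetical true fraction, because false positives arising from the lower specificity outweigh the deaths missed through limited sensitivity.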
Conclusions drawn from this ecologic analysis that showed an association between
vegetarianism and overweight and cardiovascular death rate among males should be restricted to the
level of regions (states) only and not extrapolated to individuals. While the former is a valid
conclusion, the latter could fall victim to ‘ecologic fallacy’, in which incorrect assumptions are
made about individuals based on aggregated data about their communities [142]. However, this
need not completely undermine ecologic studies since the geographic context in which health-states
occur cannot be neglected [143]. Secondly, the ‘modifiable areal unit problem’ (MAUP) could act
as a potential source of error in geographical mapping and analysis. Here one needs to be cautious
of the fact that geovisualization patterns may partly be a consequence of the size and shape of the
areal/regional units used in the study. The choice of areal units and the level of aggregation or
categorization may have a bearing on the study interpretations and the implications. Further, the
choice of study units, cut-offs for the map scales and colour schemes in graphical displays could
impact the visualization and impressions formed [46].
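The ecologic fallacy described above can be made concrete with a small constructed example. The data below are entirely hypothetical (three regions with three individuals each) and are chosen so that the association across regional averages is positive while the association among individuals within every region is negative; the variable and function names are assumptions for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical individual-level data: (exposure values, outcome values) per region.
regions = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Correlation among individuals within each region.
within_r = {name: pearson(x, y) for name, (x, y) in regions.items()}

# Correlation across regional averages (the ecologic analysis).
mean_exposure = [sum(x) / len(x) for x, _ in regions.values()]
mean_outcome = [sum(y) / len(y) for _, y in regions.values()]
ecologic_r = pearson(mean_exposure, mean_outcome)

print("Within-region correlations:", within_r)
print("Ecologic (region-level) correlation:", ecologic_r)
```

In this constructed case the regional analysis is valid at the level of regions, but applying its positive association to individuals would give the wrong sign.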
5.6 Study implications
5.6.1 Implications for clinical practice
The wealth of evidence available on the individual-level importance of risk factor
identification and management for cardiovascular disease control is not affected by the failure to
identify significant correlations between risk factors other than vegetarianism/increased body mass
and cardiovascular mortality in this ecologic study. For individuals seen in clinical practice and in
community practice, advice on tobacco cessation, regular intake of fruits and vegetables, and
increased physical activity should continue to be given.
5.6.2 Implications for population health
High levels of smoking (especially among males) seen in this study point to the need for
widespread roll-out of interventions that have been proven to be cost-effective: tobacco tax
increases, the dissemination of information about health risks from smoking, restrictions on
smoking in public places/workplaces, comprehensive bans on advertising and promotion, and
increased access to cessation therapies [144].
The prevalence of increased body mass (especially among urban females) points to the need
for health policy and action at individual-level and population-level. This may include areas such as
economic policies relevant to the promotion of intake of heart-healthy foods and improving physical
activity levels through urban planning, neighbourhood walkability and traffic design that are
locally-relevant and cost-effective [145].
Distribution of risk factors also varied by sex, level of education and place of residence. It
was seen that smoking was a major risk factor among males (rural and urban), overweight was a
major risk factor among urban residents (females and males) and diabetes was relatively more
common among urban residents; vascular death rates were higher in the southern states. From a
health policy perspective, this could imply a different focus for different target groups
and the early roll-out of interventions in some areas such as the south Indian states.
5.7 Future directions for research
5.7.1 Further analyses with data
Detailed mapping of the age-standardized prevalence of various risk factors and
cardiovascular deaths by the demographic characteristics in each state could help in producing an atlas
of cardiovascular disease for the states in India. These rates could be converted into absolute risks and
attributable risks to estimate potential cardiovascular disease burden in each state. This would be a
useful communication tool relevant for policymakers, academics and other stakeholders. Further,
spatial analysis including regression could be undertaken to better study geographic variation at the
regional or state level for cardiovascular deaths and even at the district level for some parameters such
as smoking because of sufficient numbers in surveys such as the SFMS 1998 survey. This would help
identify hot-spots for smoking as well as potential clustering of cardiovascular deaths in different
regions of India. Multi-level modeling using aggregate data on state-level urbanization and the
individual-level risk factor data would also lead to improved understanding of association between
predictors and cardiovascular death rates.
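As an illustration of the age-standardization step proposed above, the following minimal Python sketch applies direct standardization to age-specific death rates. All age bands, rates and standard population weights are hypothetical values chosen for illustration, not results from the study datasets, and the function name is an assumption.

```python
def directly_standardized_rate(age_specific_rates, standard_population):
    """Weighted average of age-specific rates using a standard population."""
    if age_specific_rates.keys() != standard_population.keys():
        raise ValueError("age bands of rates and standard population must match")
    total_population = sum(standard_population.values())
    weighted_sum = sum(
        age_specific_rates[band] * standard_population[band]
        for band in age_specific_rates
    )
    return weighted_sum / total_population


# Hypothetical cardiovascular death rates per 100,000 by age band.
state_rates = {"30-39": 60, "40-49": 150, "50-59": 400, "60-69": 900}

# Hypothetical standard population counts for the same age bands.
standard_population = {"30-39": 40, "40-49": 30, "50-59": 20, "60-69": 10}

standardized_rate = directly_standardized_rate(state_rates, standard_population)
print(f"Age-standardized rate: {standardized_rate:.1f} per 100,000")
```

Applying the same standard population to every state removes differences in age structure, so that the mapped rates reflect differences in mortality rather than in demography.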
5.7.2 Improvement of CVD research data in India
There is certainly room for improvement in data collection on cardiovascular risk factors
from the large nationally-representative surveys in India. Better harmonization of study instruments
with respect to definitions and phrasing of questions would result in standardization, enabling better
comparison across surveys and the study of temporal trends. While better civil registration systems are
the way forward for improving the accuracy of death certification, this does not seem
feasible in the near future and hence the verbal autopsy method would need to be refined to improve
accuracy for ascertainment of deaths attributable to the cardiovascular system. These ecologic
studies together with large cohort studies of individuals that are in their early stages of recruitment
[27,130] would offer a better understanding of the personal and societal level of cardiovascular risk
factors and how they operate to cause cardiovascular mortality in India.
5.8 Conclusions
From the selected large, nationally-representative surveys conducted in India over the last
decade, some key community-level information on cardiovascular risk factors and mortality
outcomes was elucidated. About one in three males over the age of 15 years were smokers with
70% of them smoking beedi and 20% smoking cigarettes; female smoking was one-tenth of male
smoking prevalence. Mean age of smoking initiation was 21 years. Beedi smokers were more often
illiterate, more common in rural areas and started smoking earlier than cigarette smokers. While
beedi smoking decreased across the education gradient, cigarette smoking increased. There was a 7-
fold variation in smoking prevalence among states. Higher prevalence was seen in northern and
northeastern states.
Among those aged 30 to 49 years, 11.8% of males and 15.1% of females had a BMI ≥ 25 kg/m2;
about 80% of these were pre-obese and the rest were obese. Pre-obesity and obesity were
both more common among females than males. Overweight prevalence in urban areas was greater
than in rural areas. Among males there was a 13-fold variation with prevalence ranging from 2.9%
in rural Chattisgarh to 38% in urban Punjab; among females there was a 22-fold variation with
prevalence ranging from 2.5% in rural Jharkhand to 55.9% in urban Punjab.
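The fold variations quoted above are simply the ratio of the highest to the lowest prevalence. A minimal Python sketch using the overweight prevalence figures from the text (the function name is an assumption):

```python
def fold_variation(lowest: float, highest: float) -> float:
    """Ratio of the highest to the lowest value."""
    if lowest <= 0:
        raise ValueError("lowest value must be positive")
    return highest / lowest


# Overweight prevalence (%) reported in the text.
males_fold = fold_variation(2.9, 38.0)     # rural Chattisgarh vs urban Punjab
females_fold = fold_variation(2.5, 55.9)   # rural Jharkhand vs urban Punjab

print(f"Males:   {males_fold:.1f}-fold")
print(f"Females: {females_fold:.1f}-fold")
```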
One-third of the population reported being lacto-vegetarians. Rural residents and females
were more likely to be lacto-vegetarians than urban residents and males respectively. There was a
strong east-west gradient in lacto-vegetarianism with the lowest prevalence seen in northeastern
states and the highest prevalence seen in northwestern states. About half the population consumed
fruits at least once a week. Males and urban residents consumed more than females and rural
residents respectively. Fruit consumption increased directly across the education gradient.
Prevalence of self-reported diabetes among males was 2.8% and among females was 2.0%. Urban-
rural ratio was 2.2. The prevalence varied 10-fold among males with values ranging from below
0.5% in Rajasthan to over 5.0% in Andhra Pradesh and varied 8-fold among females with values
ranging from <0.5% in Rajasthan to over 4.0% in Tamil Nadu & Kerala. Southern states had a
higher prevalence of self-reported diabetes.
Cardiovascular death rates were 308 per 100,000 among males and 198 per 100,000 among
females in the 30 to 69 year age-group. Among males, the rates ranged from about 180 per 100,000
in Mizoram to over 400 per 100,000 in Tamil Nadu and Andhra Pradesh. Among females, the rates
ranged from below 100 per 100,000 in Mizoram and Haryana to about 240 per 100,000 in Punjab
and Andhra Pradesh. The selected risk factors studied explained 49% and 43% of the variation
among states for males and females respectively. Ecologic analysis revealed that the cardiovascular
death rates were significantly associated with rates of overweight and levels of vegetarianism at the
state level for males; no such association was found for females. Limited association between
predictors and cardiovascular death rate is probably indicative of an evolving picture characteristic
of regions in epidemiologic transition.
Graphical displays using maps and other techniques have helped summarize and visualize the
geography of cardiovascular disease in India at a unique scale from multiple large surveys. This will
enhance transparency and widespread understanding of the epidemiologic evidence for all
concerned stakeholders.
6 REFERENCES
1. Davis K (1951) The Population of India and Pakistan. Princeton, New Jersey: Princeton University Press.
2. RGI (1998) Compendium of India's Fertility and Mortality Indicators, 1971-1997. New Delhi: India: Office of Registrar General of India.
3. Black RE, Morris SS, Bryce J (2003) Where and why are 10 million children dying every year? Lancet 361: 2226-2234.
4. Jha P (2002) Avoidable mortality in India: past progress and future prospects. Natl Med J India 15 Suppl 1: 32-36.
5. Doll R, Peto R (1981) The causes of cancer: quantitative estimates of avoidable risks of cancer in the United States today. J Natl Cancer Inst 66: 1191-1308.
6. Dyson T, Cassen R, Visaria L (2004) Twenty-first century India: population, economy, human development, and the environment. New York: Oxford University Press. 1-31 p.
7. Mukherji R (2008) The Political Economy of India's Economic Reforms. Asian Economic Policy Review 3: 315-331.
8. Omran AR (1971) The Epidemiologic Transition: A Theory of the Epidemiology of Population Change. The Milbank Memorial Fund Quarterly 49: 509-538.
9. Olshansky SJ, Ault AB (1986) The fourth stage of the epidemiologic transition: the age of delayed degenerative diseases. The Milbank Memorial Fund Quarterly 64: 355-391.
10. Gordon T, Kannel WB, Castelli WP, Dawber TR (1981) Lipoproteins, cardiovascular disease, and death. The Framingham study. Arch Intern Med 141: 1128-1131.
11. Dawber TR, Kannel WB (1958) An epidemiologic study of heart disease: the Framingham study. Nutr Rev 16: 1-4.
12. Kannel WB, McGee D, Gordon T (1976) A general cardiovascular risk profile: the Framingham Study. Am J Cardiol 38: 46-51.
13. Smith GD, Shipley MJ, Marmot MG, Rose G (1992) Plasma cholesterol concentration and mortality. The Whitehall Study. JAMA 267: 70-76.
14. Benfante R (1992) Studies of cardiovascular disease and cause-specific mortality trends in Japanese-American men living in Hawaii and risk factor comparisons with other Japanese populations in the Pacific region: a review. Hum Biol 64: 791-805.
15. Cutler JA, Grandits GA, Grimm RH, Jr., Thomas HE, Jr., Billings JH, et al. (1991) Risk factor changes after cessation of intervention in the Multiple Risk Factor Intervention Trial. The MRFIT Research Group. Prev Med 20: 183-196.
16. Pietinen P, Lahti-Koski M, Vartiainen E, Puska P (2001) Nutrition and cardiovascular disease in Finland since the early 1970s: a success story. J Nutr Health Aging 5: 150-154.
17. Fruchart JC, Nierman MC, Stroes ES, Kastelein JJ, Duriez P (2004) New risk factors for atherosclerosis and patient risk assessment. Circulation 109: III15-19.
18. Barker DJP (1992) Fetal and infant origins of adult disease. London: BMJ Books.
19. Enas EA, Mehta J (1995) Malignant coronary artery disease in young Asian Indians: thoughts on pathogenesis, prevention, and therapy. Coronary Artery Disease in Asian Indians (CADI) Study. Clin Cardiol 18: 131-135.
20. Prentice AM (2006) The emerging epidemic of obesity in developing countries. Int J Epidemiol 35: 93-99.
21. Gupta R (2004) Trends in hypertension epidemiology in India. J Hum Hypertens 18: 73-78.
22. Gupta R, Joshi P, Mohan V, Reddy KS, Yusuf S (2008) Epidemiology and causation of coronary heart disease and stroke in India. Heart 94: 16-26.
23. Lopez AD, Mathers CD, Ezzati M, Jamison DT, Murray CJL, editors (2006) Global Burden of Disease and Risk Factors. New York: World Bank & Oxford University Press. 156-161 p.
24. WHO (2004) Global Burden of Disease: Disease and Injury Country Estimates. World Health Organization. Dept of Measurement and Health Information.
25. CBHI (2007) Mortality Statistics in India - 2006. New Delhi, India: Central Bureau of Health Information.
26. Mahapatra P, Rao CP (2001) Cause of death reporting systems in India: a performance analysis. Natl Med J India 14: 154-162.
27. Jha P, Gajalakshmi V, Gupta PC, Kumar R, Mony P, et al. (2006) Prospective study of one million deaths in India: rationale, design, and validation results. PLoS Med 3: e18.
28. RGI-CGHR, Collaborators (2009) Causes of death in India, 2001-03. In: Sample Registration System RGoI, editor. New Delhi: Ministry of Home Affairs, Govt. of India (forthcoming).
29. Gajalakshmi V, Peto R, Kanaka S, Balasubramanian S (2002) Verbal autopsy of 48 000 adult deaths attributable to medical causes in Chennai (formerly Madras), India. BMC Public Health 2: 7.
30. Joshi R, Cardona M, Iyengar S, Sukumar A, Raju CR, et al. (2006) Chronic diseases now a leading cause of death in rural India--mortality data from the Andhra Pradesh Rural Health Initiative. Int J Epidemiol 35: 1522-1529.
31. Yusuf S, Hawken S, Ounpuu S, Dans T, Avezum A, et al. (2004) Effect of potentially modifiable risk factors associated with myocardial infarction in 52 countries (the INTERHEART study): case-control study. Lancet 364: 937-952.
32. Joshi P, Islam S, Pais P, Reddy S, Dorairaj P, et al. (2007) Risk factors for early myocardial infarction in South Asians compared with individuals in other countries. JAMA 297: 286-294.
33. Feigin VL (2007) Stroke in developing countries: can the epidemic be stopped and outcomes improved? Lancet Neurol 6: 94-97.
34. INTERSALT, CooperativeResearchGroup (1988) INTERSALT: an international study of electrolyte excretion and blood pressure. Results for 24 hour urinary sodium and potassium excretion. BMJ 297: 319-328.
35. Lewington S, Clarke R, Qizilbash N, Peto R, Collins R (2002) Age-specific relevance of usual blood pressure to vascular mortality: a meta-analysis of individual data for one million adults in 61 prospective studies. Lancet 360: 1903-1913.
36. WHO (2004) Mortality and burden of disease - 2002. WHO Statistical Information System. Geneva, Switzerland: WHO.
37. WHO (2005) Preventing chronic diseases: a vital investment. Geneva, Switzerland: World Health Organization.
38. Yusuf S, Reddy S, Ounpuu S, Anand S (2001) Global burden of cardiovascular diseases: Part II: variations in cardiovascular disease by specific ethnic groups and geographic regions and prevention strategies. Circulation 104: 2855-2864.
39. Glass GE (2000) Update: spatial aspects of epidemiology: the interface with medical geography. Epidemiol Rev 22: 136-139.
40. Gupta R, Misra A, Pais P, Rastogi P, Gupta VP (2006) Correlation of regional cardiovascular disease mortality in India with lifestyle and nutritional factors. Int J Cardiol 108: 291-300.
41. Subramanian SV, Nandy S, Kelly M, Gordon D, Davey Smith G (2004) Patterns and distribution of tobacco consumption in India: cross sectional multilevel evidence from the 1998-9 national family health survey. BMJ 328: 801-806.
42. Shetty PS (2002) Nutrition transition in India. Public Health Nutr 5: 175-182.
43. Lengler R, Eppler M. Towards a Periodic Table of Visualization Methods for Management; 2007; Clearwater, Florida.
44. Pfeiffer D, Robinson T, Stevenson M, Stevens K, Rogers D, et al. (2008) Spatial analysis in epidemiology. Oxford: UK: Oxford University Press.
45. Jiang B, Huang B, Vasek V (2003) Geovisualisation for Planning Support Systems. In: Geertman S, Stillwell J, editors. Planning Support Systems in Practice. Berlin: Springer.
46. Dummer TJ (2008) Health geography: supporting public health policy and planning. CMAJ 178: 1177-1180.
47. Daar AS, Singer PA, Persad DL, Pramming SK, Matthews DR, et al. (2007) Grand challenges in chronic non-communicable diseases. Nature 450: 494-496.
48. Yach D, Hawkes C, Gould CL, Hofman KJ (2004) The global burden of chronic diseases: overcoming impediments to prevention and control. JAMA 291: 2616-2622.
49. Doll R, Peto R, Boreham J, Sutherland I (2004) Mortality in relation to smoking: 50 years' observations on male British doctors. BMJ 328: 1519.
50. Whitlock G, Lewington S, Sherliker P, Clarke R, Emberson J, et al. (2009) Body-mass index and cause-specific mortality in 900 000 adults: collaborative analyses of 57 prospective studies. Lancet 373: 1083-1096.
51. Reddy KS, Gupta PC, editors (2004) Report on tobacco control in India. New Delhi, India: Ministry of Health and Family Welfare, Govt of India. 41-82 p.
52. IARC (1986) International Agency for Research on Cancer: Monograph on the Evaluation of Carcinogenic Risk of Chemicals to Humans - Tobacco Smoking. Switzerland: World Health Organization. 38 p.
53. Jha P, Chen Z (2007) Poverty and chronic diseases in Asia: challenges and opportunities. CMAJ 177: 1059-1062.
54. Rani M, Bonu S, Jha P, Nguyen SN, Jamjoum L (2003) Tobacco use in India: prevalence and predictors of smoking and chewing in a national cross sectional household survey. Tob Control 12: e4.
55. Neufeld KJ, Peters DH, Rani M, Bonu S, Brooner RK (2005) Regular use of alcohol and tobacco in India and its association with age, gender, and poverty. Drug Alcohol Depend 77: 283-291.
56. Gajalakshmi C, Jha P, Ranson K, Nguyen S (2000) Global Patterns of Smoking and Smoking-Attributable Mortality. In: Jha P, Chaloupka F, editors. Tobacco Control in Developing Countries. Oxford, U.K.: Oxford University Press.
57. USDHHS (2008) Results from the 2007 National Survey on Drug Use and Health: National Findings. In: US Department of Health and Human Services SAMHSA, editor. Washington, DC: US Government Printing Office.
58. Jha P, Jacob B, Gajalakshmi V, Gupta PC, Dhingra N, et al. (2008) A nationally representative case-control study of smoking and death in India. N Engl J Med 358: 1137-1147.
59. CDC (2007) Smoking & Tobacco Use: National Health Interview Surveys, Selected Years—United States, 1974–2006. In: Services UDoHaH, editor. Atlanta, GA.
60. Despres JP, Arsenault BJ, Cote M, Cartier A, Lemieux I (2008) Abdominal obesity: the cholesterol of the 21st century? Can J Cardiol 24 Suppl D: 7D-12D.
61. Mascie-Taylor CG, Goto R (2007) Human variation and body mass index: a review of the universality of BMI cut-offs, gender and urban-rural differences, and secular changes. J Physiol Anthropol 26: 109-112.
62. Wild SH, Byrne CD (2004) Evidence for fetal programming of obesity with a focus on putative mechanisms. Nutr Res Rev 17: 153-162.
63. Subramanian SV, Kawachi I, Smith GD (2007) Income inequality and the double burden of under- and overnutrition in India. J Epidemiol Community Health 61: 802-809.
64. Srinath Reddy K, Katan MB (2004) Diet, nutrition and the prevention of hypertension and cardiovascular diseases. Public Health Nutr 7: 167-186.
65. Anand SS, Ounpuu S, Yusuf S (2003) Ethnicity and cardiovascular disease. In: Yusuf S, Cairns JA, Camm AJ, Fallen EL, Gersh BJ, editors. Evidence-Based Cardiology 2nd ed: BMJ Publishing Group. pp. 171–190
66. Popkin BM, Horton S, Kim S, Mahal A, Shuigao J (2001) Trends in diet, nutritional status, and diet-related noncommunicable diseases in China and India: the economic costs of the nutrition transition. Nutr Rev 59: 379-390.
67. Enas EA, Singh V, Munjal YP, Bhandari S, Yadave RD, et al. (2008) Reducing the burden of coronary artery disease in India: challenges and opportunities. Indian Heart J 60: 161-175.
68. Mohan V, Deepa M, Deepa R, Shanthirani CS, Farooq S, et al. (2006) Secular trends in the prevalence of diabetes and impaired glucose tolerance in urban South India--the Chennai Urban Rural Epidemiology Study (CURES-17). Diabetologia 49: 1175-1178.
69. Lewington S, Whitlock G, Clarke R, Sherliker P, Emberson J, et al. (2007) Blood cholesterol and vascular mortality by age, sex, and blood pressure: a meta-analysis of individual data from 61 prospective studies with 55,000 vascular deaths. Lancet 370: 1829-1839.
70. Mony PK, Nagaraj C (2007) Health information management: an introduction to disease classification and coding. Natl Med J India 20: 307-310.
71. Meade MS, Earickson RJ (2005) Medical Geography – 2nd edition. New York, NY: The Guilford Press.
72. Lo CP, Yeung AKW (2007) Concepts and Techniques of Geographic Information Systems: Prentice Hall Inc.
73. Longley P, Goodchild MF, Maguire D, Rhind D (2005) Geographic Information Systems and Science: Wiley and Sons.
74. PAHO (2006) Core Health Indicators of the Americas 2004-2006. Pan American Health Organization.
75. CDC (2008) Atlas of United States Mortality: selected causes of death. Hyattsville, MD: U.S. Department of Health and Human Services.
76. CCORT. (2006) Canadian Cardiovascular Atlas.; Tu JV, Ghali WA, Pilote L, Brien S, editors. Toronto: Pulsus Group Inc. and Institute for Clinical Evaluative Sciences.
77. Hux J, Booth G, Slaughter P, Laupacis A (2003) Diabetes in Ontario: An ICES Practice Atlas. Toronto: Institute for Clinical Evaluative Sciences.
78. Glazier R, Booth G (2007) Neighbourhood environments and resources for healthy living - A focus on diabetes in Toronto. Toronto: Institute for Clinical Evaluative Sciences.
79. Cooper R, Cutler J, Desvigne-Nickens P, Fortmann SP, Friedman L, et al. (2000) Trends and disparities in coronary heart disease, stroke, and other cardiovascular diseases in the United States: findings of the national conference on cardiovascular disease prevention. Circulation 102: 3137-3147.
80. Mensah GA (2005) Eliminating disparities in cardiovascular health: six strategic imperatives and a framework for action. Circulation 111: 1332-1336.
81. Chow CM, Donovan L, Manuel D, Johansen H, Tu JV (2005) Regional variation in self-reported heart disease prevalence in Canada. Can J Cardiol 21: 1265-1271.
82. Breckenkamp J, Mielck A, Razum O (2007) Health inequalities in Germany: do regional-level variables explain differentials in cardiovascular risk? BMC Public Health 7: 132.
83. Murphy A, Mony P, Alleyne G, Dirks J, Messah E, et al. Cardiovascular Disease Mortality in the Commonwealth; 2008 17-18 Nov 2008; Toronto, Canada. Centre for Global Health Research and Commonwealth Secretariat.
84. Levin S, Welch VL, Bell RA, Casper ML (2002) Geographic variation in cardiovascular disease risk factors among American Indians and comparisons with the corresponding state populations. Ethn Health 7: 57-67.
85. CGHR (2006) Atlas of HIV-1 prevalence among women attending antenatal clinics in 115 districts of southern India. Toronto, Canada: Centre for Global Health Research, St Michael’s Hospital, University of Toronto.
86. Padmavati S (1962) Epidemiology of cardiovascular disease in India. II. Ischemic heart disease. Circulation 25: 711-717.
87. Cleveland WS (1993) Visualising data. Summit, NJ: Hobart Press.
88. MacEachren AM (1995) How Maps Work. New York: The Guilford Press.
89. Anon (1998) India Nutrition Profile. In: Development DoWC, editor. New Delhi: Ministry of Human Resource Development, Govt. of India. pp. 1-8.
90. SFMS (1998) Special Fertility and Mortality Survey. New Delhi, India: Office of the Registrar General of India.
91. IIPS (2000) National Family Health Survey (NFHS-2), 1998-99. Mumbai, India: International Institute for Population Sciences.
92. SRS (2004) Sample Registration System. New Delhi, India: Office of the Registrar General of India.
93. IIPS (2007) National Family Health Survey (NFHS-3), 2005-06. Mumbai, India: International Institute for Population Sciences.
94. WHO (2002) World Health Report: Reducing Risks, Promoting Healthy Life. Geneva, Switzerland: World Health Organization.
95. WHO (2008) STEPwise approach to chronic disease risk factor surveillance (STEPS). Geneva, Switzerland: Chronic Diseases and Health Promotion. World Health Organization.
96. CSDH (2008) Closing the gap in a generation: health equity through action on the social determinants of health. Final Report of the Commission on Social Determinants of Health. Geneva, Switzerland: World Health Organization.
97. Roy TK. Alternative Data Sources for Demographic and Health Statistics in India; 2003 24-27 June; Bangkok, Thailand.
98. Bhat MPN (2002) Completeness of India’s Sample Registration System: An assessment using the general growth balance method. Population Studies 56: 119-134.
99. Hyland A, Cummings KM, Lynn WR, Corle D, Giffen CA (1997) Effect of proxy-reported smoking status on population estimates of smoking prevalence. Am J Epidemiol 145: 746-751.
100. Gorber SC, Schofield-Hurwitz S, Hardt J, Levasseur G, Tremblay M (2009) The accuracy of self-reported smoking: a systematic review of the relationship between self-reported and cotinine-assessed smoking status. Nicotine Tob Res 11: 12-24.
101. Navarro AM (1999) Smoking status by proxy and self report: rate of agreement in different ethnic groups. Tob Control 8: 182-185.
102. Subramanian SV, Subramanyam MA, Selvaraj S, Kawachi I (2009) Are self-reports of health and morbidities in developing countries misleading? Evidence from India. Soc Sci Med 68: 260-265.
103. Walter SD, Birnie SE (1991) Mapping mortality and morbidity patterns: an international comparison. Int J Epidemiol 20: 678-689.
104. Bailey TC, Gatrell AC (1995) Interactive Spatial Data Analysis. Harlow, Essex: Addison Wesley Longman.
105. Rothman K, Greenland S (1998) Modern Epidemiology. Philadelphia, PA: Lippincott Williams & Wilkins.
106. Littell R, Stroup W, Freund R (2002) SAS® for Linear Models. Cary, NC: SAS Institute Inc.
107. Allison P (1999) Logistic Regression Using SAS®: Theory and Application. Cary, NC: SAS Institute Inc. 217-226 p.
108. Carr DB, Wallin JF, Carr DA (2000) Two new templates for epidemiology applications: linked micromap plots and conditioned choropleth maps. Stat Med 19: 2521-2538.
109. Anselin L, Syabri I, Kho Y (2006) GeoDa: An Introduction to Spatial Data Analysis. Geographical Analysis 38: 5-22.
110. Cliff AD (1995) Analysing geographically-related disease data. Stat Methods Med Res 4: 93-101.
111. Rezaeian M, Dunn G, St Leger S, Appleby L (2004) The production and interpretation of disease maps: A methodological case-study. Soc Psychiatry Psychiatr Epidemiol 39: 947-954.
112. Miyawaki N, Chen SC (1981) A statistical consideration on the mapping of mortality. The geography of health: 93-101.
113. Aylin P, Maheswaran R, Wakefield J, Cockings S, Jarup L, et al. (1999) A national facility for small area disease mapping and rapid initial assessment of apparent disease clusters around a point source: the UK Small Area Health Statistics Unit. J Public Health Med 21: 289-298.
114. Slocum TA, McMaster RB, Kessler FC, Howard HH (2008) Thematic Cartography and Geographic Visualization. New Jersey, U.S.: Prentice Hall Inc.
115. Dent BD (1999) Cartography: thematic map design. McGraw-Hill.
116. Brewer CA (1994) Colour use guidelines for mapping and visualization. In: MacEachren AM, Taylor DRF, editors. Visualization in modern cartography. Tarrytown, NY: Elsevier Science.
117. Krosnick JA (1999) Survey research. Annu Rev Psychol 50: 537-567.
118. Wakefield J (2009) Multi-level modelling, the ecologic fallacy, and hybrid study designs. Int J Epidemiol 38: 330-336.
119. RGI (2003) SRS Based Abridged Life Tables, SRS Analytical Studies Report No. 3 of 2003. New Delhi: Registrar General of India.
120. Nichter M, Van Sickle D (2004) Popular perceptions of tobacco products and patterns of use among male college students in India. Soc Sci Med 59: 415-431.
121. Mohan S, Pradeepkumar AS, Thresia CU, Thankappan KR, Poston WS, et al. (2006) Tobacco use among medical professionals in Kerala, India: the need for enhanced tobacco cessation and control efforts. Addict Behav 31: 2313-2318.
122. Keys A, Aravanis C, Blackburn HW, Van Buchem FS, Buzina R, et al. (1966) Epidemiological studies related to coronary heart disease: characteristics of men aged 40-59 in seven countries. Acta Med Scand Suppl 460: 1-392.
123. Narayan KM, Chadha SL, Hanson RL, Tandon R, Shekhawat S, et al. (1996) Prevalence and patterns of smoking in Delhi: cross sectional study. BMJ 312: 1576-1579.
124. Gupta PC (1996) Survey of sociodemographic characteristics of tobacco use among 99,598 individuals in Bombay, India using handheld computers. Tob Control 5: 114-120.
125. Schaap MM, Kunst AE (2009) Monitoring of socio-economic inequalities in smoking: learning from the experiences of recent scientific studies. Public Health 123: 103-109.
126. Jarvis MJ, Tunstall-Pedoe H, Feyerabend C, Vesey C, Saloojee Y (1987) Comparison of tests used to distinguish smokers from nonsmokers. Am J Public Health 77: 1435-1438.
127. Patrick DL, Cheadle A, Thompson DC, Diehr P, Koepsell T, et al. (1994) The validity of self-reported smoking: a review and meta-analysis. Am J Public Health 84: 1086-1093.
128. Eyre H, Kahn R, Robertson RM, Clark NG, Doyle C, et al. (2004) Preventing cancer, cardiovascular disease, and diabetes: a common agenda for the American Cancer Society, the American Diabetes Association, and the American Heart Association. Circulation 109: 3244-3255.
129. Ulijaszek SJ, Kerr DA (1999) Anthropometric measurement error and the assessment of nutritional status. Br J Nutr 82: 165-177.
130. Teo K, Chow CK, Vaz M, Rangarajan S, Yusuf S (2009) The Prospective Urban Rural Epidemiology (PURE) study: examining the impact of societal influences on chronic noncommunicable diseases in low-, middle-, and high-income countries. Am Heart J 158: 1-7 e1.
131. Yusuf S, Vaz M (2006) PURE India. Prospective Urban and Rural Epidemiology Study. Dubai presentation. Hamilton: Population Health Research Institute.
132. Rastogi T, Reddy KS, Vaz M, Spiegelman D, Prabhakaran D, et al. (2004) Diet and risk of ischemic heart disease in India. Am J Clin Nutr 79: 582-592.
133. Mohan V, Sandeep S, Deepa R, Shah B, Varghese C (2007) Epidemiology of type 2 diabetes: Indian scenario. Indian J Med Res 125: 217-230.
134. Kunst AE, del Rios M, Groenhof F, Mackenbach JP (1998) Socioeconomic inequalities in stroke mortality among middle-aged men: an international overview. European Union Working Group on Socioeconomic Inequalities in Health. Stroke 29: 2285-2291.
135. Smith GD, Wentworth D, Neaton JD, Stamler R, Stamler J (1996) Socioeconomic differentials in mortality risk among men screened for the Multiple Risk Factor Intervention Trial: II. Black men. Am J Public Health 86: 497-504.
136. Strand BH, Tverdal A (2004) Can cardiovascular risk factors and lifestyle explain the educational inequalities in mortality from ischaemic heart disease and from other heart diseases? 26 year follow up of 50,000 Norwegian men and women. J Epidemiol Community Health 58: 705-709.
137. Marmot MG, Adelstein AM, Robinson N, Rose GA (1978) Changing social-class distribution of heart disease. Br Med J 2: 1109-1112.
138. Song YM, Ferrer RL, Cho SI, Sung J, Ebrahim S, et al. (2006) Socioeconomic status and cardiovascular disease among men: the Korean national health service prospective cohort study. Am J Public Health 96: 152-159.
139. Murray CJ, Lopez AD, Feehan DM, Peter ST, Yang G (2007) Validation of the symptom pattern method for analyzing verbal autopsy data. PLoS Med 4: e327.
140. Kumar R, Thakur J, Rao B, Singh M, Bhatia S (2006) Validity of verbal autopsy in determining causes of adult deaths. Indian Journal of Public Health 50: 90-94.
141. Chandramohan D, Maude GH, Rodrigues LC, Hayes RJ (1998) Verbal autopsies for adult deaths: their development and validation in a multicentre study. Trop Med Int Health 3: 436-446.
142. Tu JV, Ko DT (2008) Ecological studies and cardiovascular outcomes research. Circulation 118: 2588-2593.
143. Pearce N (2000) The ecological fallacy strikes back. J Epidemiol Community Health 54: 326-327.
144. Jha P, Chaloupka F, Moore J, Gajalakshmi V, Gupta P, et al. (2006) Disease Control Priorities in Developing Countries: Tobacco Addiction. In: Jamison D, Breman J, Measham A, Alleyne G, Claeson M et al., editors. Disease Control Priorities in Developing Countries. 2nd ed. New York: Oxford University Press. pp. 17.
145. Willett W, Koplan J, Nugent R, Dusenbury C, Puska P, et al., editors (2006) Disease Control Priorities in Developing Countries: Prevention of Chronic Disease by Means of Diet and Lifestyle Changes. 2nd ed. New York: Oxford University Press. 18 p.
APPENDIX

Table 7.1 Poisson regression

POISSON REGRESSION (males)

Criteria For Assessing Goodness Of Fit

Criterion             DF    Value        Value/DF
Deviance              22    154.4283     7.0195
Scaled Deviance       22    22.0000      1.0000
Pearson Chi-Square    22    153.5481     6.9795
Scaled Pearson X2     22    21.8746      0.9943
Log Likelihood              5866.4697

Analysis Of Parameter Estimates

Parameter    DF    Estimate    Pr > ChiSq
Intercept    1     5.2467      <.0001
Smoking      1     0.0027      0.2652
Veg_%        1     -0.0032     0.0682
Fruits       1     0.0001      0.9690
Overwgt      1     0.0243      0.0166
Diabetes     1     0.0156      0.5073
Urban_%      1     0.0035      0.0876
Scale        0     2.6494

POISSON REGRESSION (females)

Criteria For Assessing Goodness Of Fit

Criterion             DF    Value        Value/DF
Deviance              22    228.8826     10.4038
Scaled Deviance       22    22.0000      1.0000
Pearson Chi-Square    22    224.7683     10.2167
Scaled Pearson X2     22    21.6045      0.9820
Log Likelihood              2011.8376

Analysis Of Parameter Estimates

Parameter    DF    Estimate    Pr > ChiSq
Intercept    1     5.1974      <.0001
Smoking      1     -0.0378     0.0310
Veg_%        1     -0.0008     0.7437
Fruits       1     -0.0042     0.2736
Overwgt      1     0.0085      0.4865
Diabetes     1     0.0793      0.2530
Urban_%      1     -0.0008     0.8289
Scale        0     3.2255
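A reader's note on the output above: the following is a minimal sketch, not part of the original analysis, showing how the reported figures relate to one another. It assumes that the Scale row is the square root of Deviance/DF (a deviance-based overdispersion adjustment, inferred from the numbers themselves and not stated in the table) and that exponentiating an estimate gives a rate ratio per one-unit increase in the covariate, as is usual for a log-link Poisson model. The function names are hypothetical, and the units of each covariate are those defined in the main text.

```python
import math

# Values copied from the goodness-of-fit rows of Table 7.1.
FIT = {
    "males":   {"df": 22, "deviance": 154.4283, "pearson": 153.5481},
    "females": {"df": 22, "deviance": 228.8826, "pearson": 224.7683},
}

# Selected parameter estimates (log scale) copied from Table 7.1.
ESTIMATES = {
    "males":   {"Overwgt": 0.0243, "Veg_%": -0.0032},
    "females": {"Smoking": -0.0378, "Overwgt": 0.0085},
}


def dispersion_scale(deviance, df):
    """Square root of deviance/DF (assumed deviance-based overdispersion scale)."""
    return math.sqrt(deviance / df)


def scaled_pearson(pearson, deviance, df):
    """Pearson chi-square divided by the dispersion estimate deviance/DF."""
    return pearson / (deviance / df)


def rate_ratio(estimate):
    """Multiplicative change in the expected rate per one-unit covariate increase."""
    return math.exp(estimate)


for sex, fit in FIT.items():
    scale = dispersion_scale(fit["deviance"], fit["df"])
    x2 = scaled_pearson(fit["pearson"], fit["deviance"], fit["df"])
    print(f"{sex}: scale={scale:.4f}, scaled Pearson X2={x2:.4f}")
    for name, est in ESTIMATES[sex].items():
        print(f"  {name}: rate ratio per unit = {rate_ratio(est):.4f}")
```

Under these assumptions the computed scale and scaled Pearson values can be compared directly with the Scale and Scaled Pearson X2 rows for each sex. The large Value/DF figures indicate substantial overdispersion, which is why the standard errors behind the reported p-values depend on this scale adjustment.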