
GEOGRAPHICAL EPIDEMIOLOGY OF CARDIOVASCULAR DISEASE IN INDIA: AN EXPLORATORY STUDY

by

Prem kumar Mony

A thesis submitted in conformity with the requirements for the degree of Master of Science

Graduate Department of Institute of Medical Sciences University of Toronto

© Copyright by Prem kumar Mony 2009

ABSTRACT

Geographical Epidemiology of Cardiovascular Disease in India: An exploratory study

By Prem kumar Mony Master of Science (2009)

Institute of Medical Sciences University of Toronto

Cardiovascular Diseases (CVD) have become the leading cause of death in India and other developing countries. The aims of this study were to: (1) describe the geographical epidemiology of CVD in India, (2) provide a graphical display of CVD risk factors and mortality outcomes, and (3) describe the sources of bias. Five large, nationally-representative datasets from India were studied.

Cardiovascular death rates were 308/100,000 among males and 198/100,000 among females in middle-age (30-69 years). Wide variations between states were noted in the distribution of risk factors and mortality. The selected risk factors explained 49% and 43% of the variation among males and females respectively. Ecologic analysis revealed that death rates at state-level were associated with rates of overweight and vegetarianism among males; no such association was found among females. This study has implications for identification of areas with high burden, formulation of hypotheses, and assessment of needs for disease control at national/regional levels.


ACKNOWLEDGEMENT

A thesis is supposed to be an independent piece of work, but in reality this is hardly so. Over the last two years there have been so many people without whose support, guidance and patience, this work would not have been completed. It is to them I owe my deepest gratitude. I have been fortunate to have worked and learned under the supervision of Dr Prabhat Jha, a dedicated epidemiologist and an able mentor. His wisdom, knowledge and commitment to the highest standards have inspired and motivated me. Thanks Prabhat, you’ve truly been a guru. My thanks and appreciation go to Dr Richard Glazier and Dr Jim Dunn, for being on my program committee and helping me shape this thesis with their critical inputs and interesting perspectives. I wish to thank my examination committee members Dr Jack Tu, Dr David Alter and Dr Rajeev Gupta for the insightful and constructive comments. Thanks also go out to my colleagues Wilson Suraweera and Paul Arora for timely assistance with my statistics queries and to Ashleigh Sullivan and Brent Harris for advice with the geographic software. On a different note, a quiet thanks also to all the anonymous study respondents who have collectively taught me more than they would ever realize. Finally and importantly I would like to thank my wife Lolita and my daughter Pritika for their encouragement in supporting me in my aspirations. I thank my parents, Kamala and Daniel, and my sister Premalatha, for their support and unwavering love.


TABLE OF CONTENTS

Abstract
Acknowledgements
List of tables
List of figures
List of appendices

1 INTRODUCTION
1.1 India – current scenario
1.2 Cardiovascular disease in India
1.2.1 Cardiovascular disease mortality
1.2.2 Prevalence of Cardiovascular Disease
1.2.3 Cardiovascular risk factors
1.2.4 Geography and Cardiovascular Disease
1.3 Health Information Visualization
1.4 In summation
1.5 Objectives

2 LITERATURE REVIEW
2.1 Cardiovascular risk factors
2.1.1 Smoking
2.1.2 Body mass
2.1.3 Dietary factors and diabetes
2.1.4 Other risk factors
2.2 Cardiovascular mortality
2.2.1 Assessment of cause-of-death
2.3 Geographical epidemiology
2.4 Data visualization

3 METHODOLOGY
3.1 Study setting
3.2 Study design
3.3 Data sources
3.3.1 Special Fertility and Mortality Survey
3.3.2 National Family Health Surveys
3.3.3 Sample Registration System
3.3.4 Million Death Study
3.4 Database management
3.4.1 Abstraction of relevant variables from the 5 databases
3.4.2 Compilation of data dictionaries
3.4.3 Data quality assessment
3.4.4 Exploratory data analysis
3.5 Data analysis
3.5.1 Conceptual framework
3.5.2 Standardization
3.5.3 Statistical analysis
3.5.4 Geographical analysis
3.6 Ethical approval

4 RESULTS
4.1 Survey characteristics and descriptives of study population
4.1.1 Survey characteristics
4.1.2 Demographic characteristics
4.1.3 Crude prevalence of selected CVD determinants
4.2 Smoking
4.2.1 Smoking prevalence among males and females
4.2.2 Smoking among all males in SFMS-1998
4.2.3 Smoking among middle-aged adults
4.2.4 Spatial heterogeneity
4.3 Body mass
4.3.1 Overweight/obesity
4.3.2 Geographic mapping of overweight prevalence by states
4.3.3 Spatial heterogeneity
4.4 Diet and self-reported diabetes
4.4.1 Vegetarianism
4.4.2 Fruit intake
4.4.3 Diabetes
4.5 Ecologic association
4.5.1 Cardiovascular mortality
4.5.2 Ranking of states
4.5.3 Univariate regression analysis
4.5.4 Multivariate regression
4.6 Biases & limitations
4.6.1 Assessment of representativeness of surveys
4.6.2 Integrity of surveys
4.6.3 Study characteristics
4.6.4 Differences in sociodemographic characteristics
4.6.5 Limitations

5 DISCUSSION
5.1 Summary of key findings
5.2 Smoking
5.3 Overweight and obesity
5.4 Dietary factors and self-reported diabetes
5.5 Cardiovascular mortality
5.6 Study implications
5.7 Future directions for research
5.8 Conclusions

6 REFERENCES

7 APPENDIX


LIST OF TABLES

Table 1.1 Top dozen causes of death among middle-aged adults (ages 30-69 years), India, 2001-03
Table 1.2 Factors influencing risk of myocardial infarction (INTERHEART study)
Table 3.1 Description of the 5 databases, study periods, sample sizes and study populations
Table 3.2 Variables used from the 5 surveys (4 risk factor surveys & 1 mortality outcome survey)
Table 4.1 Descriptive analysis of baseline characteristics in the selected surveys
Table 4.2 Crude prevalence of CVD determinants in selected surveys, India
Table 4.3 Smoking among young, middle-aged and older adults by sex and residence, 1998
Table 4.4 Pearson correlation coefficients comparing state-level smoking prevalence across different surveys for males, ages 45-59 years
Table 4.5 Prevalence of overweight/obesity in NFHS-3 survey, 2005-06
Table 4.6 Prevalence proportions of overweight by residence and sex, NFHS-3 survey, 2005-06
Table 4.7 Vegetarianism among adults aged 15 years and over in India from selected surveys
Table 4.8 Reported fruit intake (at least weekly) in NFHS-3 survey, 2005-06
Table 4.9 Self-reported diabetes prevalence in NFHS-3 survey, 2005-06
Table 4.10(a) Ranking of states by outcome (CVD death rate, highest to lowest) for middle-aged males, India, 2006
Table 4.10(b) Ranking of states by outcome (CVD death rate, highest to lowest) for middle-aged females, India, 2006
Table 4.11 Correlations between male vs female ranks for 29 states
Table 4.12 Correlations between selected risk factors and CVD mortality by sex, in 29 states of India
Table 4.13 Pearson correlation coefficients between variables at state level, males and females
Table 4.14(a) Multiple linear regression of cardiovascular death rates among males at state level
Table 4.14(b) Multiple linear regression of cardiovascular death rates among females at state level
Table 4.15 Sex-ratios (no. of females per 1000 males) in selected surveys in comparison to the census 2001 population
Table 4.16 Potential sources of bias based on characteristics of respondents, survey instruments and interviewers in the four surveys
Table 5.1 Profile of cardiovascular disease and its risk factors in rural and urban population in southern India, PURE study


LIST OF FIGURES

Figure 1.1 Life and death in 20th century
Figure 3.1 Political map of India showing states and union territories
Figure 3.2 Process flow of the Million Death Study
Figure 3.3 Conceptual framework for the geographical epidemiological analysis of CVD in India
Figure 4.1 Crude prevalence of current tobacco smoking among males and females, aged 15 years and over, from selected surveys in India
Figure 4.2(a) Age-specific prevalence (bars) and cumulative prevalence (line) of current smoking among all males, ages ≥ 15 years, SFMS 1998
Figure 4.2(b) Cumulative prevalence (line) and percent increase in smoking above younger age-group (bars) of current smoking among all males, ages ≥ 15 years, SFMS 1998
Figure 4.3 Types of tobacco smoked by level of education among males, ages 15 years and over, SFMS 1998
Figure 4.4 Mean age of initiation of smoking by type of tobacco used among male smokers, SFMS 1998
Figure 4.5 Mean age of initiation of smoking among males for different types of tobacco by level of education, SFMS 1998
Figure 4.6 Type of tobacco smoked by place of residence among middle-aged male smokers, 1998
Figure 4.7 Types of tobacco smoked and proportion of beedi smokers among middle-aged (30-69 years) males in states of India, 1998
Figure 4.8 Prevalence of smoking among males in different states of India, 1998
Figure 4.9 Proportion of different types of tobacco smoked in different states, SFMS 1998
Figure 4.10 Maps of smoking prevalence among rural males, ages 45-59 years, across the four selected surveys
Figure 4.11 Correlation between smoking prevalence in states between SFMS and NFHS-2
Figure 4.12 Ratio of ex:current smokers among males, ages 45-59 years, SRS 2004
Figure 4.13 Scatterplot of global spatial autocorrelation (for males) and LISA maps showing local spatial clustering of smoking (for males and females), NFHS-3 [2005-06]
Figure 4.14 Boxplot showing distribution of body mass index by sex, 2005-06
Figure 4.15 Boxplot showing distribution of body mass index by gender and residence, 2005-06
Figure 4.16(a) Boxplots of distribution of body mass index by gender and age (males), 2005-06
Figure 4.16(b) Boxplots of distribution of body mass index by gender and age (females), 2005-06
Figure 4.17 Mapping of proportions of adults, ages 30-49 years, overweight by state, 2005-06
Figure 4.18 Mapping of proportions of adults, ages 20-29 years, overweight by state, 2005-06
Figure 4.19 Mapping of proportions of adults, ages 15-19 years, overweight by state, 2005-06
Figure 4.20 LISA maps showing local spatial clustering of overweight/obesity among males and females, NFHS-3 [2005-06]
Figure 4.21 Prevalence of lacto-vegetarianism in the states among adults in the NFHS-3 survey, 2005-06
Figure 4.22 LISA maps showing local spatial clustering of lacto-vegetarianism, NFHS-3 [2005-06]
Figure 4.23 Reported fruit intake (at least weekly) in various states by sex and residence, 2005-06
Figure 4.24 LISA maps showing local spatial clustering of fruit intake among males and females, NFHS-3 [2005-06]
Figure 4.25 Prevalence of self-reported diabetes in different states among adults aged 30 years & over by sex and residence, NFHS-3 survey, 2005-06
Figure 4.26 LISA maps showing local spatial clustering of self-reported diabetes among males and females, NFHS-3 [2005-06]
Figure 4.27 Age-standardized vascular death rate per 100,000 males and females, ages 30-69 years, in states of India [2006]
Figure 4.28(a) Plots of vascular death rate per 100,000 males against predictor variables
Figure 4.28(b) Plots of vascular death rate per 100,000 females against predictor variables
Figure 4.29 Age-sex pyramids of selected surveys in comparison with the census 2001 population
Figure 4.30 Comparison of proportions of adult males, ages 15-54 years, smoking different types of tobacco in selected surveys
Figure 4.31 Relationship of reporting bias for smoking with type of respondent in selected surveys in India
Figure 4.32 Prevalence of various risk factors by urban-rural residence
Figure 4.33 Prevalence of risk factors by education
Figure 4.34 Distribution of CVD determinants by residence-sex-education groups in India


LIST OF APPENDICES

Table 7.1 Poisson regression (males and females)


1 INTRODUCTION

1.1 India – current scenario

India’s population currently totals 1.13 billion, or 17% of the world population. Between the first and last decades of the 20th century, the crude death rate fell by nearly four-fifths and life expectancy at birth nearly tripled, from around 22 years to over 61 years (Figure 1.1) [1,2]. India thus saw marked reductions in death rates at young ages and more modest reductions in death rates in middle-age during the 20th century.

Figure 1.1 Life and death in 20th century India

Source: [1,2]

Of the 10 million deaths that currently occur every year in India, about 3.4 million occur in the age-group 0-34 years; these are mostly due to acute infectious conditions. The 6.6 million deaths that occur in those aged 35 years and over are mostly due to chronic conditions: 3.8 million occur during middle-age (35-69 years) and 2.8 million occur during old-age (70 years and over) [3,4]. Evidence over the last two centuries from around the world suggests that while death in old age (after age 70 years) is inevitable, death at young ages (below age 35 years) could become a rare occurrence, and death in middle age (age 35-69 years) need not be common [5].

India is currently going through multiple transitions: demographic, socioeconomic and health. Together with the on-going demographic transition associated with improving survival and increasing urbanization [6], there has also been a dramatic socioeconomic transition over the last few decades [7]. Concurrently, the country has witnessed a ‘risk transition’, characterized by changes in tobacco and alcohol consumption, nutrition, and other lifestyle factors, leading to changing patterns of disease, disability and death known as the ‘epidemiologic’ or ‘health’ transition [8,9]. Tobacco smoking, physical inactivity, obesity, hypertension, high glucose and dyslipidemia are the conventional cardiovascular risk factors identified from seminal field epidemiological studies such as the Framingham [10,11,12], Whitehall [13] and Ni-Hon-San [14] studies, as well as risk factor trials such as the MRFIT [15] and North Karelia [16] trials. More recently, novel risk factors such as lipoprotein(a), homocysteine, and high-sensitivity C-reactive protein have also been suggested to influence the development of atherosclerosis [17]. Other possible drivers of this epidemiologic transition have also been proposed. Low birth weight and poor childhood growth have been linked to increased susceptibility to cardiovascular disease in later life. Gene-environment interaction due to the presence of a ‘thrifty gene’ has also been proposed as a determinant of early, excess and extensive cardiovascular disease in south Asians, mediated through the development of hyperlipidemia, insulin resistance and abdominal obesity [18,19]. India thus now faces a ‘double-burden’: the ‘unfinished agenda’ of communicable, nutritional, maternal and child deaths combined with the ‘emerging epidemic’ of chronic, non-communicable diseases such as obesity, hypertension, diabetes and cardiovascular diseases [20,21,22]. Cardiovascular Diseases (CVD) such as coronary heart disease (CHD) and strokes are already the leading cause of death in India and other south Asian countries [23].

1.2 Cardiovascular disease in India

1.2.1 Cardiovascular disease mortality

According to the Global Burden of Diseases Study, there were 1.60 million CHD deaths and 0.60 million stroke deaths in India during the year 2002 [24]. Mortality from these conditions is predicted to rise rapidly in the future, with the absolute numbers of CHD cases in India exceeding those of the established market economies and China combined. Available information in India comes from the Medical Certification of Cause of Death (MCCD) system, which is a hospital-based cause-of-death assignment system. According to the MCCD, which registered 14.5% of all deaths in the country for the year 2000, about 20% of deaths in the age-group 15-54 years were due to cardiovascular diseases [25]. The reliability of these mortality data has, however, been questioned on grounds of poor coverage and poor compliance with guidelines for cause-of-death reporting, coding and classification [26]. Hence, a nationally-representative cause-of-death assessment is currently being pursued by the Office of the Registrar General of India through the Million Death Study, using a validated verbal autopsy (VA) technique [27]. Preliminary results on mortality in middle-age (ages 30-69 years) for the period 2001-03 are shown in Table 1.1 [28].

Table 1.1 Top dozen causes of deaths among middle-aged adults (ages 30-69 years), India, 2001-03

Males
Rank  Cause of death               %      cum. %
1     Cardiovascular disease       27.3   27.3
2     Tuberculosis                 11.0   38.3
3     Chronic lung disease         10.7   49.0
4     Neoplasms                     8.1   57.1
5     Unintentional injuries        7.7   64.8
6     Digestive diseases            7.6   72.4
7     Ill-defined/ unknown cause    4.5   76.9
8     Diarrhoeal diseases           4.1   81.0
9     Intentional injuries          3.7   84.7
10    Genito-urinary disease        2.8   87.5
11    Malaria                       2.3   89.8
12    Diabetes mellitus             2.1   91.9

Females
Rank  Cause of death               %      cum. %
1     Cardiovascular disease       23.4   23.4
2     Neoplasms                    12.3   35.7
3     Chronic lung disease         10.9   46.6
4     Tuberculosis                  7.8   54.4
5     Ill-defined/ unknown          6.7   61.1
6     Diarrhoeal diseases           6.5   67.6
7     Digestive diseases            4.9   72.5
8     Unintentional injuries        4.8   77.3
9     Malaria                       3.4   80.7
10    Genito-urinary disease        2.6   83.3
11    Fever of unknown origin       2.5   85.8
12    Diabetes mellitus             2.2   88.0

cum. = cumulative
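The cumulative columns of Table 1.1 are running totals of the cause-specific percentages. The short Python sketch below shows that calculation; the variable and function names are illustrative and are not part of the original analysis.

```python
from itertools import accumulate

# Cause-specific percentages from Table 1.1, listed in rank order
male_pct = [27.3, 11.0, 10.7, 8.1, 7.7, 7.6, 4.5, 4.1, 3.7, 2.8, 2.3, 2.1]
female_pct = [23.4, 12.3, 10.9, 7.8, 6.7, 6.5, 4.9, 4.8, 3.4, 2.6, 2.5, 2.2]


def cumulative(percentages):
    """Return the running total of the percentages, rounded to one decimal place."""
    return [round(total, 1) for total in accumulate(percentages)]


male_cum = cumulative(male_pct)
female_cum = cumulative(female_pct)

# Share of middle-age deaths covered by the twelve listed causes
print("Males:", male_cum[-1], "Females:", female_cum[-1])
```

The last element of each list gives the share of all middle-age deaths accounted for by the twelve listed causes.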

Smaller population-based studies from urban Tamil Nadu and rural Andhra Pradesh using verbal autopsy instruments have recently yielded cause-specific mortality information [29,30]. In urban Chennai, analyses of a total of 66,777 deaths revealed that cardiovascular diseases were the largest group, with 18,680 deaths (28%) [29]. In rural Andhra Pradesh, 34% of deaths among males and 30% of deaths among females were reported to be due to diseases of the cardiovascular system, among 1354 deaths occurring in a year in a community of about 150,000 persons followed prospectively [30].


1.2.2 Prevalence of Cardiovascular Disease

In the absence of reliable nationwide prospectively collected morbidity data, estimates of the prevalence of CHD have been based on indicators from population-based, cross-sectional surveys. Multiple epidemiological studies have been performed in urban and rural populations in India over the past few decades. Comparisons of these studies have various limitations such as inadequate sample sizes, variable response rates, lack of age-standardization, unstandardized diagnostic criteria such as medical history and non-specific electrocardiographic changes like abnormal ST-T waves, and inadequate reporting of results. A review of a subset of high-quality studies that used broadly similar recruitment procedures, study methods and diagnostic criteria (known CHD, Rose questionnaire angina and/or electrocardiographic Q-ST-T changes) is able to offer a perspective on secular trends. A higher prevalence of CHD was consistently seen in urban communities (6.6%-12.5%) as compared to rural communities (2.1%-4.3%); the relative risk was about 3.0. There were also significantly increasing trends in urban (r² = 0.60) and rural (r² = 0.31) regions over the last four decades [22].
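As a rough check on the urban-to-rural ratio of about 3.0, the ends of the two prevalence ranges quoted above can be divided by one another. This is only an illustrative Python sketch: pairing the lower ends and the upper ends of the ranges is an assumption made here, not the method of the cited review.

```python
# Prevalence ranges (%) quoted in the text: (lowest, highest) reported value
urban_chd = (6.6, 12.5)
rural_chd = (2.1, 4.3)

# Assumption: compare like with like, lower end with lower end, upper with upper
ratio_low = urban_chd[0] / rural_chd[0]
ratio_high = urban_chd[1] / rural_chd[1]

print(f"Urban:rural prevalence ratio, lower ends: {ratio_low:.2f}")
print(f"Urban:rural prevalence ratio, upper ends: {ratio_high:.2f}")
```

Both ratios fall close to 3, which is consistent with the figure given in the text.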

Stroke prevalence studies are very limited in India, and the available studies share the multiple biases seen in studies of CHD. The crude and age-standardized prevalence rates of stroke appear to be higher in urban than in rural populations. However, evaluation of secular trends in stroke in India is not possible owing to the small number of studies [22].

1.2.3 Cardiovascular risk factors

The INTERHEART study, a 52-country case-control study involving 15,152 cases of first myocardial infarction and 14,820 age- and sex-matched hospital/community controls, identified that over 90% of cases of acute myocardial infarction could be attributed to nine well-known coronary risk factors: smoking, low fruit and vegetable consumption, low physical activity, alcohol consumption, psychosocial stress, abdominal obesity, diabetes, hypertension, and abnormal lipids (Table 1.2) [31]. The risk factors found to be important in the overall INTERHEART study were also found to operate within the south Asian subset [32]. A key finding of this study was that the south Asians presenting with a first myocardial infarction were younger than the other participants.


Table 1.2 Factors influencing risk of acute myocardial infarction (INTERHEART study)

Risk factor                                        OR (99% CI)         PAR (99% CI)

Increased risk (harmful)
Apo B: apo A1 ratio (highest vs. lowest decile)    3.25 (2.81-3.76)    49.2% (43.8-54.5)
Smoking (current vs. never)                        2.87 (2.58-3.19)    35.7% (32.5-39.1)
Psychosocial factors                               2.67 (2.21-3.22)    32.5% (25.1-40.8)
Diabetes                                           2.37 (2.07-2.71)     9.9% (8.5-11.5)
Hypertension                                       1.91 (1.74-2.10)    17.9% (15.7-20.4)
Abdominal obesity (highest vs. lowest quartile)    1.62 (1.45-1.80)    20.1% (15.3-26.0)

Decreased risk (protective)
Alcohol consumption (3 times/week)                 0.91 (0.82-1.02)     6.7% (2.0-20.2)
Regular physical activity                          0.86 (0.76-0.97)    12.2% (5.5-25.1)
Fruit and vegetable consumption (daily)            0.70 (0.62-0.79)    13.7% (9.9-18.6)

OR = odds ratio adjusted for all other risk factors; CI = confidence interval; PAR = population attributable risk
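When reading a table of odds ratios, one simple consistency check is available. Confidence intervals for odds ratios are usually constructed on the logarithmic scale, so the point estimate should lie close to the geometric mean of the two bounds. The Python sketch below applies this check to three rows of Table 1.2; it assumes the intervals are of this log-symmetric (Wald) type, which the table itself does not state, and the helper name is illustrative.

```python
import math

# (odds ratio, lower bound, upper bound) for three rows of Table 1.2
rows = {
    "Smoking (current vs. never)": (2.87, 2.58, 3.19),
    "Diabetes": (2.37, 2.07, 2.71),
    "Fruit and vegetable consumption (daily)": (0.70, 0.62, 0.79),
}


def log_midpoint(lower, upper):
    """Geometric mean of the bounds, i.e. the midpoint of the interval on the log scale."""
    return math.sqrt(lower * upper)


for name, (odds_ratio, lower, upper) in rows.items():
    midpoint = log_midpoint(lower, upper)
    print(f"{name}: reported OR {odds_ratio:.2f}, log-scale midpoint {midpoint:.2f}")
```

A reported odds ratio that differs noticeably from this midpoint is worth checking against the original publication for a transcription error.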

Stroke is a clinical entity caused by either extracranial/intracranial vascular atherothrombotic pathology or intracranial haemorrhagic conditions. Risk factors differ for each type of stroke: atherosclerosis risk factors (as in CHD) predominate in the former, whereas hypertension and smoking are common in the latter. Leading stroke risk factors in low- and middle-income countries include raised blood pressure, smoking, low fruit and vegetable intake, low physical activity, and alcohol excess [33]. The relationship between salt intake and elevated blood pressure was established by the INTERSALT study, a 32-country study involving 10,079 men and women aged 20-59 years, in whom 24-hour urinary sodium was positively correlated with blood pressure after adjustment for age and sex and after taking account of body mass index and alcohol intake as confounders [34]. Further, the Prospective Studies Collaboration meta-analysis of 61 prospective studies with over one million individuals from industrialized countries reported a direct relationship between blood pressure and stroke after correction for regression dilution; throughout middle and old age (40-89 years), usual blood pressure was strongly and directly related to stroke death rate, with no evidence of a threshold down to at least 115/75 mm Hg [35].


1.2.4 Geography and Cardiovascular Disease

Age-standardized cardiovascular death rates (per 100,000) in middle-aged subjects (30–69 years) are high in India (428) and other low- and middle-income countries such as Brazil (330), China (290), Pakistan (425), Nigeria (452) and Russia (688), while they are lower in industrialized countries such as Canada (140) and Britain (182) [36]. Moreover, in India about 50% of CHD-related deaths occur in people younger than 70 years, compared with only 22% in the West [36,37]. Studies in emigrants have indicated that South Asians had higher rates of CHD [38], and genetic factors were suggested as an explanation. These studies, however, suffered from multiple biases, the major one being the "healthy survivor" bias: survivors of an acute coronary event who reached the study hospitals were younger, more educated and more affluent, and had risk factors that were not considered significant with the knowledge then available [22].
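The country comparison at the start of this subsection can be made more concrete by expressing each age-standardized rate as a ratio to a reference country. The Python sketch below uses the rates quoted in the text; the choice of Canada as the reference is an assumption made purely for illustration.

```python
# Age-standardized CVD death rates per 100,000, ages 30-69 years, as quoted in the text [36]
rates = {
    "India": 428,
    "Brazil": 330,
    "China": 290,
    "Pakistan": 425,
    "Nigeria": 452,
    "Russia": 688,
    "Canada": 140,
    "Britain": 182,
}

reference = "Canada"  # illustrative choice of comparison country

# Rate ratio of each country relative to the reference country
ratios = {country: rate / rates[reference] for country, rate in rates.items()}

# List countries from the highest to the lowest rate ratio
for country, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{country:<10}{rates[country]:>5}{ratio:>7.2f}")
```

On these figures, the rate in India is roughly three times that in Canada.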

Globally, the bulk of modern descriptive research has focused mostly on time and person, with little consideration of the implications of place [39]. In India too, there are limited data on regional variations in CVD. About a decade ago, Gupta et al. [40] linked hospital-based mortality data from the Medical Certification of Cause of Death (MCCD) database to risk factor data from the second National Family Health Survey and the India Nutrition Profile Study. They found that cardiovascular death rates among middle-aged adults in different Indian states varied from a low of 75-100 per 100,000 individuals in the sub-Himalayan states of Nagaland, Meghalaya, Himachal Pradesh and Sikkim to a high of 360-430 per 100,000 in Andhra Pradesh, Tamil Nadu, Punjab and Goa. Such large variations in cardiovascular disease mortality between Indian states could at least partly be attributed to differences in dietary consumption of fats, milk, sugars and green-leafy vegetables, as well as in the prevalence of obesity [40].

Urban-rural differences in the distribution of cardiovascular disease have also been documented, with a higher prevalence of coronary heart disease seen in urban (6.6%-12.5%) than in rural populations (2.1%-4.3%). There is also evidence of a significantly increasing time trend in urban and rural areas over the last few decades [22]. Furthermore, the underlying risk factors such as smoking, dietary patterns, obesity, diabetes and hypertension [21,22,41,42] also show urban-rural differences: cigarette smoking, adverse diet, obesity, diabetes and hypertension have been documented to be more common in urban areas, and smoking of beedis more common in rural areas.


1.3 Health Information Visualization

One of the challenges of the current century is to improve the widespread uptake and use of health policies that take account of up-to-date epidemiological evidence. Risk communication is a critical component of this process of knowledge dissemination, with the objective of presenting scientific outputs in ways that are understandable to scientists and non-scientists. One of the mechanisms for enhancing the transparency and widespread understanding of scientific evidence is to use visual methods of presentation in order to make fairly abstract quantitative results easier to comprehend [43,44]. Visual methods such as tables, graphs, pie-charts, boxplots, histograms and maps facilitate our understanding of health issues by summarizing complex survey data, allowing for visualization, and stimulating thought leading to new ideas and solutions [43]. Such visualization in the field of CVD epidemiology is useful in two separate domains [45,46]: (1) the academic domain, in which the use of visualization to explore data would help in understanding the distribution of cardiovascular diseases in India and in generating hypotheses; and (2) the public domain, in which a graphical report using enhanced visualization enables professionals to present the “visual thinking” on the geographic distribution of cardiovascular diseases to other concerned stakeholders for appropriate action.

This type of data visualization is especially important for chronic diseases such as cardiovascular diseases, which are a neglected epidemic in many low- and middle-income countries [47]. The hypothesized reasons for this neglect include: lack of up-to-date evidence on disease burden in the hands of decision makers; strong beliefs that chronic diseases affect only the affluent; beliefs that the control of chronic disease is not cost-effective and hence should wait until infectious diseases are controlled; and the orientation of health systems toward acute care [48]. Visualization of the geographical epidemiology of cardiovascular disease in India has the potential to enhance understanding for stakeholders such as global agencies, governments, academia and research groups, donors, health professionals and the private sector.


1.4 In summation

Cardiovascular diseases have become the leading cause of death and disease burden in India. Conventional risk factors explain much of the CVD burden, though genetic factors and possibly fetal programming of adulthood chronic diseases have been proposed as relevant to the south Asian context. The unique epidemiological context of the CVD epidemic in India points to the need for a focused descriptive geographical epidemiological study of cardiovascular disease in the country. This context includes smoking (of beedis and cigarettes, and a later age of initiation of smoking), low overall rates of obesity but with presence of central obesity, increasing prevalence of diabetes (especially in urban areas), and high levels of CVD mortality that vary across different regions of the country. Visual methods of presentation of study findings offer a mechanism to enable a wider understanding of the epidemiologic evidence regarding cardiovascular diseases for greater action regarding their control. Such a study would not only help appreciate the health situation within India but could also contribute to a better understanding of global health.

In this thesis, first I review the relevant literature on the descriptive epidemiology of CVD

risk factors in India including features that are similar to global findings and also features that are

unique to India. Then in the methods section, I describe the datasets used and outline the statistical

and geographical analyses undertaken. Subsequently, the results of this analysis in the form of

descriptive and geographical epidemiology of key CVD risk factors (smoking, body mass, dietary

factors and self-reported diabetes) and cardiovascular mortality outcomes are presented. This is

followed by a discussion of study findings along with interpretation of biases and limitations of the

datasets and analytic methods. Finally, the significant conclusions from this study are listed along

with possible directions for future research.

1.5 Objectives

1.5.1 General Objective

1.5.1.1 To describe the geographical epidemiology of cardiovascular disease in India

1.5.2 Specific Objectives

1.5.2.1 To describe the geographical epidemiology of cardiovascular risk factors (smoking, body

mass, diet and self-reported diabetes) and cardiovascular mortality in India

1.5.2.2 To present a graphical display of cardiovascular risk factors and mortality outcomes by

age, sex, education, region and residence in India

1.5.2.3 To describe the biases in the study datasets

2 REVIEW OF LITERATURE

In this section, I first review current knowledge regarding the various cardiovascular risk

factors and mortality for an understanding of the CVD epidemic; in particular, features that are unique

to the epidemiological context within India are highlighted. Then I describe applications of

geographical epidemiology in the field of cardiovascular disease from literature; this was available

predominantly from industrialized countries. Lastly, I look at the utility of data visualization using

graphical displays in bringing to the attention of policymakers, academics and other stakeholders, the

burden associated with cardiovascular diseases.

2.1 CVD risk factors

The following is a description of key determinants of cardiovascular mortality including smoking,

obesity, dietary factors and diabetes. The strongest evidence to date comes from systematic reviews of

the effects of smoking and obesity on heart disease. Smoking leads to a loss of 10 years of life [49].

Obesity leads to a loss of three years of life [50]. While the risk factors in the Indian population are

similar to those in the global population [38], there are however some important differences in the

distribution and nature of these risk factors in the Indian population that provide a unique

epidemiological context vis-à-vis the evolving epidemic of cardiovascular diseases.

2.1.1 Smoking

India has a unique variation in the types of tobacco smoked [51]. Beedis and cigarettes are the

most common forms of smoking. A beedi consists of 0.2-0.3 grams of sun-cured tobacco loosely

packed and rolled in a rectangular piece of dried leaf (temburni leaf) and tied with a cotton thread.

Beedis may allow two to three times as many puffs as an ordinary cigarette. Because of the low

porosity of their wrappers and their poor combustibility, beedis must be puffed frequently to be kept

alight and so they deliver a relatively higher dose of tar to the smoker [52]. A less common form of

smoking is water-pipe smoking known as hookah. Hookah pipes involve smoking of tobacco from

cured leaves or leaves fermented in molasses, honey or fruit juices then covered with glowing

charcoal. This is seen in some parts of the country. Also, cheroots (chutta), which resemble cigars, are

smoked in a few regions of the country. Reverse chutta smoking (with the lit end inside the mouth) is

also seen in some districts.

There are an estimated 100 million adult smokers (95 million males and 7 million females) in

India. Importantly, cessation rates among smokers are low. About 2% of men are ex-smokers, and

many of them quit due to disease. In contrast, male ex-smoking rates are about 40% in many

industrialized countries where the risks of smoking are better known [53].

Some features of the tobacco epidemiology that are specific to India and could therefore have

a unique impact on the epidemiology of the evolving CVD epidemic in India are listed below:

o Unlike in all high-income countries and in China where tobacco consumption is mostly

of cigarettes, in India tobacco smoking consists of the use of beedi (predominantly) and

cigarettes [51]

o Analysis of age-specific prevalence proportions of smoking among males in India

[54,55] revealed interesting differences in comparison to global and American data. For

example, peak smoking rates (50%) among Indian males were seen in the 40-49 year

age-group; this was older than the peak use (36%) in ages 30-39 years seen globally [56] and

the peak use (39%) seen in the age-group of 21-25 year old males in America [57]

o Amount of smoking is generally low in India as compared to other countries. Mean

number of beedis or cigarettes smoked per day by male smokers aged 30-69 years was

4.0 in India [58] compared to about 14 cigarettes per day in the U.S.A. [59]

o Wide variations in prevalence of smoking among males have been noted across different

states [41,54]

Further, Jha et al (2008) have recently used a case-control study design with data from the first

phase of the Million Deaths Study to conclude that in persons between the ages of 30 and 69 years

smoking was responsible for about 1 in 20 deaths of women and 1 in 5 deaths of men [58].

2.1.2 Body mass

The practical methods applicable for assessing large populations are body mass index (BMI),

waist circumference (WC) and the waist:hip ratio (WHR) as these are the commonly used measures in

epidemiological studies. Of these, the most widely used measure of overall obesity in adults is the

BMI (Quetelet index), a measure of weight adjusted for height, calculated as weight (kg)/height (m)².

BMI is closely correlated with more sophisticated measures of obesity and, as such, is a useful

screening tool. It has been widely used in population studies and predicts the future development of

diabetes. The definitions of overweight and obesity have varied in different studies. The WHO

definition for pre-obesity is BMI = 25.0-29.9 kg/m2 and for obesity is BMI ≥ 30 kg/m2. Obesity is

further classified as Class I (BMI = 30.0-34.9 kg/m2), Class II (BMI = 35.0-39.9 kg/m2), and Class III

or morbid obesity (BMI ≥ 40 kg/m2).
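
As a minimal sketch of the definitions above, the following Python functions compute BMI and assign the WHO categories quoted in the text. The function names, the example person, and the label used for values under 25 kg/m2 are illustrative assumptions and are not taken from the survey datasets.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def who_category(bmi_value: float) -> str:
    """Map a BMI value to the WHO categories quoted in the text.

    Values under 25 kg/m2 are not subdivided here; that label is an
    assumption made for illustration only.
    """
    if bmi_value >= 40.0:
        return "obesity class III"
    if bmi_value >= 35.0:
        return "obesity class II"
    if bmi_value >= 30.0:
        return "obesity class I"
    if bmi_value >= 25.0:
        return "pre-obesity"
    return "below pre-obesity range"


# Hypothetical adult weighing 70 kg with a height of 1.60 m
value = bmi(70.0, 1.60)
print(round(value, 1), who_category(value))
```

Checking the thresholds from the highest class downward keeps each category boundary in one place, which makes it easier to swap in alternative cut-offs if a study uses different definitions.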

Among the CVD risk factors, poor nutritional habits and physical inactivity are seen to be

contributing to the epidemic of obesity sweeping the world. Body fat distribution, especially visceral

adipose tissue accumulation, has been found to be a significant correlate of a cluster of diabetogenic

and atherogenic abnormalities [60]. The most recent update on the association between body mass and

cardiovascular mortality comes from the Prospective Studies Collaboration meta-analysis of 57

prospective studies with 894,576 participants, mostly from western Europe and North America [50].

After adjusting for age, sex and smoking status, it was found that at BMI of 30-35 kg/m2, median

survival was reduced by 2-4 years; at 40-45 kg/m2, it was reduced by 8-10 years (which is comparable

with the effects of smoking).

Characteristics relating to body mass that are unique to India include:

o Overall rates of obesity are low in the Indian population [61,62]

o Central obesity (abdominal or visceral obesity) is however more common in south

Asians than in Caucasians. Factors contributing to this phenotype of obesity may include

genotype, fetal growth, appetite, physical activity and body composition [62]

o Overweight/obesity is more likely to co-exist with undernutrition as twin burdens in

rapidly developing economies with income inequalities such as India [63]

2.1.3 Dietary factors and Diabetes

Dietary factors

Diet and nutrition have been extensively investigated as risk factors for cardiovascular

diseases such as coronary heart disease (CHD) and stroke. Their links to other cardiovascular risk

factors like diabetes, high blood pressure and obesity have also been established according to a recent

comprehensive review by Reddy and Katan [64]. However, available evidence has recognized

considerable practical and methodological issues in many of these studies pertaining to: the

measurement of exposures, definition of health outcomes, multitude of research designs and the need

for careful consideration when inferring causality. There is sufficient evidence from a variety of

studies linking several nutrients, food groups and dietary patterns with an increased or decreased risk

of CVD. There is also substantial evidence showing that vegetarians have a lower mortality from

ischaemic heart disease than non-vegetarians; however, cancer mortality and total mortality do not

differ. Vegetarianism can be subdivided into lacto-ovo-vegetarianism (a diet with dairy products and eggs

but without meat and fish) and veganism (a diet without any animal foods whatsoever, including dairy

products and eggs). Dietary fats such as trans-fats and saturated fats are associated with an increased

risk of CHD while polyunsaturated fats are known to reduce risk. Dietary sodium is associated with

elevation of blood pressure, while dietary potassium protects against hypertension and stroke. Regular

frequent intake of fruits and vegetables is certainly cardio-protective. Composite and prudent diets

appear to reduce the risk of CHD and stroke by being preventative and therapeutic [64].

Diabetes

Much of the evidence related to high propensity of diabetes among south Asians comes from

expatriates in Britain and America. This high occurrence of diabetes and its complications does not have a

single explanation. The early incidence of diabetes and its link with coronary heart disease may be

partially explained by the central adiposity-insulin resistance syndrome [62]. Predisposition to this

may be genetic but exacerbated by other factors such as diet and physical activity levels in a rapidly

changing socio-cultural milieu [62,65].

Factors relating to diet and diabetes that are peculiar to the Indian context are listed below:

Diet

o The Indian diet is predominantly vegetarian with a recent shift towards higher intake of

fats and added sugars [66]

o Trans-fats or hydrogenated fats (e.g. vanaspathi) are commonly used, especially in

several parts of urban India [67]

Diabetes

o South Asians appear to have a high risk of developing diabetes. It is hypothesized that

impaired sensing of glucose, reduced insulin secretion, or increased insulin resistance

leads to the development of impaired glucose tolerance and diabetes mellitus [62].

Glucose intolerance, abdominal obesity and the metabolic syndrome features appear to

be important factors associated with the development of CHD in south Asians [62,65]

o Evidence of increasing diabetes prevalence from within the country is mostly anecdotal

based on rising clinical disease burden in hospital-based settings. Population-based

studies are limited in number with some evidence that diabetes may be rising in urban

areas of the country [68]

2.1.4 Other risk factors -- Hypertension and dyslipidemia

The Prospective Studies Collaboration meta-analysis of 61 prospective studies with over one

million individuals from industrialized countries reported a direct relationship between blood

pressure and stroke throughout middle and old age (40-89 years) -- usual blood pressure was

strongly and directly related to stroke death rate [35]. Blood cholesterol is also a major risk factor

for cardiovascular morbidity and mortality [13,15,69]. Total cholesterol was positively associated

with IHD mortality (but not stroke) in both middle and old age and at all blood pressure levels in a

meta-analysis of the above 61 studies [69]. Information from nationally-representative surveys is

however lacking from India on these two risk factors.

2.2 CVD mortality

The age-standardized cardiovascular disease death rate in middle-aged adults (30–69 years) is

estimated to be high in India (428 per 100,000) [36]. Hospital-based [40] and population-based

[29,30] studies indicate that CVDs are already the leading cause of death in this age-group.
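
Age-standardized rates of the kind quoted above are commonly obtained by direct standardization, in which age-specific death rates are weighted by a fixed standard population. The sketch below assumes that method. The age bands, rates and standard weights are made-up illustrative numbers; they are not values from [36] or from the datasets used in this thesis.

```python
def direct_standardized_rate(age_specific_rates, standard_weights):
    """Directly age-standardized rate, in the same units as the input rates.

    age_specific_rates: death rate in each age band (e.g. per 100,000)
    standard_weights: standard population size or share for each band
    """
    if len(age_specific_rates) != len(standard_weights):
        raise ValueError("rates and weights must have the same length")
    total_weight = sum(standard_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(r * w for r, w in zip(age_specific_rates, standard_weights))
    return weighted / total_weight


# Hypothetical 10-year bands covering ages 30-69 (illustrative numbers only)
rates = [100.0, 200.0, 400.0, 800.0]   # deaths per 100,000 in each band
weights = [40, 30, 20, 10]             # standard population share of each band
print(direct_standardized_rate(rates, weights))
```

Because the same standard weights are applied to every population, two states with different age structures can be compared without the comparison being driven by age alone.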

2.2.1 Assessment of cause-of-death

Assessment and attribution of cardiovascular mortality is however challenging in the Indian

context because most deaths occur outside of the hospital setting. This demands that there be a system

of reliable ascertainment and validation of causes of death in ‘at-risk’ age-groups. Hence, a new and

enhanced “verbal autopsy” (VA) instrument [27,29] has been developed, piloted, and implemented in

certifying 125,000 deaths within the Sample Registration System, India’s flagship fertility and

mortality monitoring system.

The verbal autopsy method has three main components. First is data collection on

circumstances surrounding death by non-medical staff via household interview. Second is re-sampling

of the field work, and other quality control checks. Third, there is central medical adjudication of the

field reports to arrive at a final cause of death. Household assessment of the cause of death involves an

investigation of the train of events and/or circumstances at the onset and during the course of the

terminal illness, through an interview of relatives/associates of the deceased. VA can be of substantial

help in assessing the “underlying cause of death” [70]. Verbal autopsy is now of established value in

helping to classify the broad categories of mortality in young and middle ages (0 – 69 years) although

there is variation in the sensitivity and specificity for certain diseases.

These VA methods, study instruments, training material, coding manuals, and quality control

checks are freely available [27] and are increasingly being used in settings with poor death

registration and certification systems.

Despite the substantial misclassification that is inevitable, results obtained by the use of VA

provide much better evidence than was earlier available on cause-specific mortality rates for India as a

whole, and on the geographic variation in those mortality patterns.

2.3 Geographical epidemiology

The concept of “place” in human health is probably a surrogate for the interplay between

genetic factors, environment, lifestyle and society. From time immemorial, ‘place’ as a determinant of

health has been acknowledged in scientific enquiries as in ‘On Airs, Waters and Places’ by

Hippocrates circa 400 BCE. Systematic interest in the field has emerged over the last 50

years, with the perspective and methodology of geography being applied to the study of health and

disease over the last few decades. The emergence of a systematic interest in geography and health

can be seen from the first report of the Commission on Medical Geography (Ecology) of Health and

Disease of the International Geographic Union in 1952 [71]. Subsequently, interest in the field

diffused around the world as evidenced by focused enquiries on geography and health documented

from many countries. These studies led to the development of health geography consisting of the

fields of medical geography and geographical epidemiology with slightly differing foci -- medical

geography maintained an ecologic perspective to study disease patterns while geographical

epidemiology maintained a significant focus on study design and analytic methods. The primary

difference appears to be the focus of medical geography on the spatial context of health-related

issues—an aspect that epidemiology recognizes but rarely explicitly considers [39]. Health

geography is thus both an ancient perspective and a modern specialization.

Over the last decade, however, advances in geotechnology and analytic techniques have

given a major impetus to the fields of medical geography, geographical epidemiology and public

health sciences. Modern advances such as geographic information systems (GIS) have helped advance

the science of health geography greatly. A GIS is a system of hardware, software, and procedures for:

capture, management, manipulation, analysis, modeling and display of spatially referenced data for

solving complex planning and management problems [72]. GIS technology and the term originated in

Canada. The first fully operational GIS was the Canada Geographic Information System (CGIS)

developed at the Natural Resources Canada in late 1963 [73].

At the international level over the last decade, geographical epidemiologic outputs on health

topics in the form of atlases have been published to convey messages to a wider audience. These

include the World Atlas of Health, the Global Child Health Atlas, and other disease-specific atlases on

Diabetes, Tobacco, Cancers, and Heart Disease and Stroke. Such atlases have been constructed for

use at the continent level [74], at national level in the U.S.A. [75] and Canada [76], or at subnational level

for a province such as the Ontario Diabetes Atlas [77] or for a city such as the Toronto Diabetes Atlas

[78].

Within the field of cardiovascular disease (CVD) epidemiology, mapping the prevalence of

cardiovascular risk factors and disease burden reveals an interesting 2-to-3 fold, west-to-east gradient

within Canada [76]. Similarly, geographical differences in prevalence are also seen in the Toronto

diabetes atlas with a low prevalence of diabetes in the high-income area of central Toronto and a high

prevalence of diabetes in the eastern and western suburbs, which have a large population of south Asian

immigrants [78].

Spatial analysis of cardiovascular mortality in the United States has revealed a west-to-east

gradient in coronary heart disease mortality with clustering in some states [75]; the clustering of

stroke mortality was also common in these states as well as in some other areas [79]. Geographic

disparities (urban-rural differences and inter-state differences) in cardiovascular health and underlying

social determinants have also been identified [80]. In Canada [81] and Germany [82] also, regional

differences in coronary heart disease and personal and regional risk factors have been studied and

found to be different. A recent review of the status of cardiovascular disease in the Commonwealth

countries revealed that CVD death rates were higher in south Asian and sub-Saharan countries than in

European, North American or Australasian countries [83]. Further, even within a single population

subgroup such as among Native Indians within U.S.A., there are geographic variations in

cardiovascular disease and risk factors depending on residence in different states [84].

Descriptive epidemiology of mortality and risk factors in India, be it for communicable or

noncommunicable diseases, has traditionally emphasized person and time characteristics but

not place. Exceptions to this include the limited mapping of maternal and child health status (such as

maternal mortality ratios, infant mortality ratios and childhood vaccination coverage) that is available

for the entire country. Mapping of HIV prevalence among women attending public antenatal clinics in

115 districts of the four high-risk southern states (of the total 35 states) in India has also recently become available

[85].

Epidemiology of coronary heart disease has been documented in India from the 1960s [86].

Recent reviews on distributions of hypertension [21], coronary heart disease, stroke and their risk

factors [22] have also focused on person and time predominantly. There are however some

exceptions. Wide variations in tobacco use across various states in the country have been documented

in nationally-representative surveys [41,54]. State-level differences in dietary intake and

cardiovascular disease mortality rates have also been described [40].

2.4 Data visualization

Graphical displays help to disclose complex structure in data [44,87]. From this point of view,

data visualization may not only create interest and attract the attention of the viewer but also provide a

way of discovering unexpected trends or patterns.

Maps by themselves have been used since ancient times to present

visual information, but an atlas or a collection of maps is a more recent phenomenon. An atlas is

typically a collection of maps of the earth or a region of the earth. Abraham Ortelius is credited with

issuing the first ‘modern’ atlas of 53 maps in 1570 in Antwerp, Belgium. However, the word

"atlas" was not applied to a bound collection of maps until 1595, when it was first used by

Gerardus Mercator. Map-making or Cartography is both a science and an art. Unlike general

cartography which involves maps that are constructed for a general audience, thematic cartography

(statistical maps) involves maps of specific themes oriented toward specific audiences. The intent of

maps is to illustrate in a manner in which the ‘percipient’ acknowledges its purpose in an accurate,

comprehensible and timely fashion [88].

Graphical displays with maps, boxplots, line graphs, pie charts and other visualization

methods are part of a ‘periodic table of visualization methods’ that may be used to translate

knowledge to multiple stakeholders. They have the ability to blend science and art into a document

that provides relevance and meaning to the communication process [43]. These visual methods thus

enable us to bridge the gap in information sharing and the decision-making process by reducing detail

and complexity to a simple visual representation that can be more easily understood by a variety of

professionals and other interested individuals. This could be of relevance in developing countries

where noncommunicable disease control is not yet on the radar of policymakers and other

stakeholders due to inadequate information [48].

3 METHODOLOGY

In this section, following a brief description of the study setting and study design, I explain in

detail the 5 secondary datasets and study variables that I used in my study. Subsequently there is

information on data management. In data analysis, I cover univariate and multivariate statistical

analyses undertaken. Lastly, there is an outline of geographical analytic methods used.

3.1 Study setting

Lying entirely in the northern hemisphere, India extends between 8° 4' and 37° 6' latitudes

north of the equator, and between 68° 7' and 97° 25' longitudes east of the prime meridian. Its

population totals 1.13 billion with a median age of 23.8 years. It has 2.2% of world’s land mass and

16.6% of world population. India is a federal union of 35 administrative provinces; of these, 19 are

large states (>10 million population), 10 are small states (<10 million population) & six are union

territories (UT) (Figure 3.1). As of 2008, these 35 provinces were subdivided into a total of 611

districts. There is wide variation in the size, structure and composition of states.

Figure 3.1 Political map of India showing states and union territories

Large states: (1) Jammu & Kashmir, (2) Punjab, (3) Haryana, (4) Delhi, (5) Rajasthan, (6) Uttar Pradesh, (7) Bihar, (8) Jharkhand, (9) Assam, (10) West Bengal, (11) Orissa, (12) Chattisgarh, (13) Madhya Pradesh, (14) Gujarat, (15) Maharashtra, (16) Andhra Pradesh, (17) Karnataka, (18) Kerala, (19) Tamil Nadu

Small states: (1) Goa, (2) Himachal Pradesh, (3) Uttarakhand, (4) Arunachal Pradesh, (5) Meghalaya, (6) Manipur, (7) Mizoram, (8) Nagaland, (9) Sikkim, (10) Tripura

Union territories: (1) Chandigarh, (2) Dadra, Nagar & Haveli, (3) Daman & Diu, (4) Pondicherry, (5) Lakshadweep, (6) Andaman & Nicobar Islands

3.2 Study Design

This thesis was a descriptive study of the geographical epidemiology of cardiovascular

disease in India. An ecologic analysis was undertaken on the secondary data from nationally-

representative health surveys in India.
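
In an ecologic analysis the unit of observation is the group (here, the state) rather than the individual. As a rough sketch of that idea, the code below computes a Pearson correlation between a state-level risk factor prevalence and a state-level death rate. The state labels and all numbers are hypothetical, and Pearson correlation is used only as one simple example of a state-level measure of association; it is not presented as the specific analysis carried out in this thesis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical state-level aggregates:
# (overweight prevalence in %, CVD deaths per 100,000)
state_data = {
    "State A": (5.0, 210.0),
    "State B": (9.0, 260.0),
    "State C": (12.0, 330.0),
    "State D": (15.0, 310.0),
    "State E": (20.0, 400.0),
}
prevalence = [v[0] for v in state_data.values()]
death_rate = [v[1] for v in state_data.values()]
r = pearson_r(prevalence, death_rate)
print(f"r = {r:.2f}, r squared = {r * r:.2f}")
```

With a single predictor, the square of r is the share of between-state variation accounted for by a straight-line fit. An association seen at state level need not hold for individuals within those states.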

3.3 Data sources

All nation-wide health surveys conducted in the last decade were potentially eligible for

review. Six surveys were considered: India Nutrition Profile (1994-96) [89], Special Fertility and

Mortality Survey (SFMS, 1998) [90], National Family Health Survey (NFHS-2, 1998-99) [91],

Million Death Study - Phase I (MDS, 2001-03) [27], Sample Registration System (SRS, 2004) [92]

and National Family Health Survey (NFHS-3, 2005-06) [93]. Of these, the India Nutrition Profile

was found to be an aggregate of two independent surveys (National Nutrition Monitoring Bureau

survey 1994 and the District Nutrition Profile 1995-96). The NNMB survey, however, covered only

the rural regions of 8 states; the DNP covered 18 states, some with both urban and rural

regions and some with rural regions only. Large and populous states such as Uttar Pradesh and West

Bengal were among the six states not covered in either of the surveys. Hence the India Nutrition

Profile was considered not nationally representative and was excluded.

All the other five surveys were included for the secondary data analyses. SFMS, SRS and

MDS were conducted by the Registrar General of India (RGI), Ministry of Home Affairs, Govt of

India, New Delhi, while NFHS-2 & 3 were organized by the International Institute for Population

Sciences (IIPS), Mumbai. MDS was a mortality outcome survey and the other four were considered

as risk factor surveys. All five surveys followed sampling design procedures to obtain health

indicators at the national and state levels. Some key characteristics of the five surveys are shown in

Table 3.1 and the details of the data sources are described below.

3.3.1 Special Fertility and Mortality Survey (SFMS)

This was a one-time special survey conducted in 1998 covering 3.7 million individuals aged

15 years and over from 1.1 million households [90]. This survey collected information at

community level (social/community facilities), household level (religion, caste, household

characteristics) and individual level (demographics, fertility and mortality). It was conducted within

the Sample Registration System (SRS) framework – India’s flagship fertility and mortality

monitoring system since 1971. Field supervisors interviewed heads of household who provided

proxy information for other family members. Relevant key variables included sociodemographic

information and some health information.

Table 3.1 Description of the 5 databases, study periods, sample sizes and study respondents

No. | Database | Year(s) | Description | Sample size | Respondent
1 | SFMS (Special Fertility and Mortality Survey) | 1998 | Fertility and mortality survey of a nationally representative sample | 3.7 M ind. aged 15+ from 1.1 M households (6671 sample units) | Head of household
2 | NFHS-2 (National Family Health Survey-2) | 1998-99 | Demographic and Health Survey in 26 states | 334,486 ind. aged 15+ from 91,196 households (168,517 ♂ & 165,969 ♀) | Household respondent
3 | MDS (Million Death Study)-Phase I | 2001-03 | Mortality surveillance system (using ICD-10) in all states | 123,905 ind. aged 15+ from 1.1 M households (82,383 ♂ & 66,498 ♀) | VA respondent
4 | SRS (Sample Registration System) | 2004 | Demographic surveillance system covering 6.3 M persons in all states | 4.5 M ind. aged 15+ from 1.3 M households (7597 sample units) | Head of household
5 | NFHS-3 (National Family Health Survey-3) | 2005-06 | Demographic and Health Survey in 29 states | 198,754 ind. from 109,041 households (74,369 ♂ 15-54 yrs & 124,385 ♀ 15-49 yrs) | Self

Ind. = individuals; M = million; VA = verbal autopsy

3.3.2 National Family Health Surveys (NFHS)

These were cross-sectional demography and health surveys of a representative sample of

households in India conducted to provide estimates of indicators of population, health, and nutrition

by sociodemographic characteristics at the national and state levels. The NFHS-1 survey was conducted

in 1992-1993 (but was not included in the present study), NFHS-2 in 1998-1999 and NFHS-3 in

2005-2006. There were differences in study design and study population between NFHS-2 and NFHS-3

that are detailed below.

NFHS-2 (1998–1999), was a nationally representative cross-sectional study of 92,447

households [91]. Trained data-collectors interviewed an adult member in each selected household to

obtain socio-demographic and health information about the household and its family members,

obtaining a household response rate of 98%. From these households, the data-collectors interviewed

90,303 ever-married women aged 15–49 in face-to-face interviews obtaining an individual response

rate of 96%. These women were located in 3204 primary sampling units in 26 of the 32 states. In

rural areas, these primary sampling units were villages or village and in urban areas these were

census enumeration blocks which were contiguous areas created to be as demographically

homogenous as possible.

NFHS-3 interviewed 124,385 women aged 15-49 and 74,369 men aged 15-54 to obtain

information on population, health, and nutrition in India covering 29 of the 35 states [93]. Complex

multi-stage sampling procedures were undertaken: two-stage (village/primary sampling unit and

household) procedure in rural areas and three-stage (urban ward/census enumeration

block/household) procedure in urban areas. Study design also involved stratification by geographic

(district, area size, etc) and sociodemographic characteristics (percent of males in non-agricultural

sector, percent of population belonging to scheduled castes/tribes, and female literacy). In addition,

the study sampling frame over-sampled urban residents to represent larger metropolises and slums

and also over-sampled female participants from within households. Household response rate was

98% and interview response rate was 92%.
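
The two-stage rural procedure described above (select primary sampling units, then households within them) can be sketched as follows. This is a simplified illustration: it uses equal-probability selection at both stages and ignores the stratification, over-sampling and sampling weights mentioned in the text, and the village and household identifiers are hypothetical. It should not be read as the NFHS-3 selection algorithm.

```python
import random


def two_stage_sample(frame, n_psu, n_households, seed=None):
    """Draw PSUs, then households within each selected PSU.

    frame: dict mapping a PSU identifier to its list of household identifiers
    n_psu: number of PSUs to select at the first stage
    n_households: households to select per PSU (capped at the PSU size)
    """
    rng = random.Random(seed)
    if n_psu > len(frame):
        raise ValueError("cannot select more PSUs than the frame contains")
    # Stage 1: simple random sample of PSUs without replacement
    selected_psus = rng.sample(sorted(frame), n_psu)
    sample = {}
    for psu in selected_psus:
        households = frame[psu]
        k = min(n_households, len(households))
        # Stage 2: simple random sample of households within the PSU
        sample[psu] = rng.sample(households, k)
    return sample


# Hypothetical sampling frame: 6 villages with 20 households each
frame = {
    f"village_{v}": [f"v{v}_hh{h}" for h in range(1, 21)]
    for v in range(1, 7)
}
picked = two_stage_sample(frame, n_psu=3, n_households=5, seed=1)
for psu, households in picked.items():
    print(psu, households)
```

A three-stage urban design would add one more level of selection (ward, then census enumeration block, then household) using the same pattern.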

3.3.3 Sample Registration System (SRS)

This survey was the baseline study of the 2004-2014 sampling frame conducted within the

SRS, a large, routine demographic survey serving as the primary system for collection of fertility

and mortality data since 1971. The latest SRS sample frame covered about 7.6 million people

(including 4.5 million adults aged ≥15 yrs) in all 28 states (except rural Nagaland) and seven union

territories of India [92]. A total of 7,597 sample units (4,433 rural and 3,164 urban) were selected

from the 2001 census. SRS sample units were randomly selected to be representative of the

population at the state level. The sample design was a uni-stage, stratified, simple random sample

without replacement. The sample size was based on infant mortality rates. Within the SRS, selected

households were continuously monitored for vital events by two independent surveyors. The heads

of households were identified to obtain proxy information regarding all family members.
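
A stratified simple random sample without replacement, as described above for the SRS sample units, draws units independently within each stratum. The sketch below assumes two hypothetical strata ("rural" and "urban") with made-up unit identifiers and per-stratum sample sizes; the actual SRS strata and allocation are not reproduced here.

```python
import random


def stratified_srs(units_by_stratum, n_by_stratum, seed=None):
    """Simple random sample without replacement within each stratum.

    units_by_stratum: dict mapping stratum name to a list of unit identifiers
    n_by_stratum: dict mapping stratum name to the number of units to draw
    """
    rng = random.Random(seed)
    sample = {}
    for stratum, units in units_by_stratum.items():
        n = n_by_stratum[stratum]
        if n > len(units):
            raise ValueError(f"stratum {stratum!r} has fewer than {n} units")
        # random.Random.sample never returns the same element twice
        sample[stratum] = rng.sample(units, n)
    return sample


# Hypothetical frame of census units, split into two strata
units = {
    "rural": [f"R{i:03d}" for i in range(1, 51)],
    "urban": [f"U{i:03d}" for i in range(1, 31)],
}
chosen = stratified_srs(units, {"rural": 10, "urban": 6}, seed=42)
print({stratum: len(ids) for stratum, ids in chosen.items()})
```

Sampling each stratum separately guarantees that both rural and urban units appear in the sample in the planned numbers, which a single unstratified draw would not ensure.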

3.3.4 Million Death Study (MDS)

This is the world’s largest prospective study that is being conducted (1998-2014) to provide

quantification and epidemiological evidence of cause-of-death in India. The overall study is being

conducted within two sampling frames of the Sample Registration System (SRS) framework:

sampling frame 1 (1998-2003) with 6.3 million individuals and sampling frame 2 (2004-2014) with

7.6 million individuals, yielding a total sample size of 14 million individuals in 2.4 million

households. There is continuous (on-going) collection of data and monitoring of vital status. This

yielded about 300,000 deaths for the period 1998-2003; another 700,000 deaths are estimated from

the second period 2004-2014 to make a total of one million deaths [27]. Deaths (numbering

123,905) that occurred during the period 2001-2003 and that were studied in depth using the verbal

autopsy (VA) method were included in this thesis.

3.3.4.1 Verbal autopsy method

A validated Verbal Autopsy (VA) instrument was used to record and validate the

cause of death [27]. The VA instrument followed a hybrid (open/closed) format. In the ‘closed’

section, there was provision for collection of socioeconomic and demographic characteristics of the

respondent and the deceased. There were also “filter” questions used to screen for the presence/

absence of specific symptoms. If the filter question was positive, then subsequent questions on

severity, duration, or other characteristics of these symptoms were asked of the respondent. In

addition, there was a symptom list to aid in the drafting of a written narrative, the ‘open’ section.

This written narrative detailed the following information: associated symptoms in chronological

order; duration; onset of illness; type of treatment received (if any); details on hospitalization;

history of past episodes; and abstracted information relating to the terminal illness from available

investigation slips, discharge summaries or death certificate. Each interview lasted about 30-45

minutes. Forms were available in English/ Hindi, with the narrative written in the vernacular

language (such as Tamil, Punjabi, Gujarati, etc).

3.3.4.2 Random re-sampling of field interviews

To ensure high quality fieldwork, a specialist re-sample team (RST) directly reporting to the

study investigators re-interviewed 5-10% of randomly chosen households. A review of about


3,500 field reports from several states had found a high correlation between the random audit team

and the RGI supervisors on overall distribution of causes of death [27].

3.3.4.3 Ascertainment of cause of death by trained physicians

Previous validation results for adult deaths have suggested that the central adjudication by a

trained panel of physician coders yielded consistently higher sensitivity for most mortality

outcomes than an algorithm-based approach [27]. Before assigning causes of death, a panel of about

120 physician coders was trained to screen all documentation provided carefully, note all of the

positive/negative evidence, and use clinical judgment in assigning the underlying cause of death. To

reduce inter-observer variation, two coders independently examined each report and

determined a probable underlying cause of death with ICD-10 codes. They then provided an

underlying cause of death in words (e.g., “myocardial infarction”), the corresponding ICD-10 code

(e.g., I21) and the key words used to support their decision. If two physicians did not agree on an

underlying cause of death, a web-based system assigned to each physician their own original report

and the ICD-10 code of the other physician (without revealing their identity). The physician coders

were then required to use the additional information (ICD-10 and key words) provided

anonymously by the other physician to reach an agreement on the underlying cause of death. An

expert panel of senior physician coders assigned a final cause of death where two physicians did not

agree on a cause of death after one reconciliation attempt. Physicians were drawn from across India

to ensure valid cross-region comparisons.
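The three-stage assignment described above (independent dual coding, one reconciliation attempt, then adjudication) can be sketched as follows. This is an illustrative outline only: the names `Coding` and `final_cause` are hypothetical, and treating "agreement" as identical ICD-10 codes is an assumption, since the text does not state the matching rule used by the web-based system.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Coding:
    icd10: str      # ICD-10 code, e.g. "I21"
    keywords: str   # key words supporting the decision


def final_cause(first: Coding, second: Coding,
                first_revised: Optional[Coding] = None,
                second_revised: Optional[Coding] = None,
                adjudicated_icd10: Optional[str] = None) -> Tuple[str, str]:
    """Return (final ICD-10 code, stage at which it was settled)."""
    # Stage 1: two independent codings (agreement assumed to mean identical codes).
    if first.icd10 == second.icd10:
        return first.icd10, "initial agreement"
    # Stage 2: one reconciliation attempt, after each coder has seen the
    # other's ICD-10 code and key words anonymously.
    if (first_revised is not None and second_revised is not None
            and first_revised.icd10 == second_revised.icd10):
        return first_revised.icd10, "reconciliation"
    # Stage 3: a senior physician coder assigns the final cause.
    if adjudicated_icd10 is None:
        raise ValueError("disagreement persists: adjudication is required")
    return adjudicated_icd10, "adjudication"
```

A record on which both coders choose I21 is settled at the first stage; a record that still differs after reconciliation is settled only when an adjudicated code is supplied.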

For my analysis, I focussed on the subset of ICD-10 codes comprising cardiovascular deaths [including ischaemic heart disease (I20-25, I44, I46, I70), hypertensive heart disease (I10-15), heart failure (I50), cerebrovascular disease (I60-70, G45, G81-83) and sudden deaths (R55, R96)]. The G-codes (transient ischaemic attacks or plegias) and the R-codes (sudden deaths or syncope & collapse) were included because they were assumed to have been assigned by the physician coders in place of the true underlying cause of death. This assumption was based on my review of a subset of the written narratives for deaths occurring in middle-aged adults, which concluded that they were probably cardiovascular deaths given the clinical information available in the narratives. The process flow of VA methods within the MDS is depicted in Figure 3.2.
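As an illustration of how the cardiovascular subset could be selected from coded deaths, the sketch below flags ICD-10 codes falling in the ranges listed for CVD-specific deaths in Table 3.2. The function name `is_cvd_death` is hypothetical, and matching on the three-character category only (ignoring any decimal subcategory) is an assumption.

```python
import re

# (chapter letter, first category, last category), as listed in Table 3.2
CVD_RANGES = [
    ("I", 10, 15), ("I", 20, 25), ("I", 44, 44), ("I", 46, 46),
    ("I", 50, 50), ("I", 60, 70),
    ("G", 45, 45), ("G", 81, 83),
    ("R", 55, 55), ("R", 96, 96),
]


def is_cvd_death(code: str) -> bool:
    """True if the ICD-10 code falls within the cardiovascular subset."""
    match = re.match(r"^([A-Z])(\d{2})", code.strip().upper())
    if match is None:
        return False  # blank or malformed code
    letter, number = match.group(1), int(match.group(2))
    return any(letter == l and low <= number <= high
               for l, low, high in CVD_RANGES)
```

For example, `is_cvd_death("I21")` and `is_cvd_death("G45")` return `True`, while `is_cvd_death("J18")` returns `False`.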


Figure 3.2 Process flow of the Million Death Study

[Flowchart in two parts. A. Field activities: continuous recording of births & deaths; collection of circumstances of death and complete narrative; 5%-10% re-sampling of deaths (carried out by part-time enumerators, RGI surveyors and Resample Team surveyors). B. Cause of death assignment: 2 physician coders assign cause of death using ICD-10; reconciliation & adjudication.]

For this analysis, I used a subset of this study from the years 2001-2003 (called Phase I)

covering a total of 123,905 deaths, of which about 22,000 were cardiovascular deaths.

The set of demographic and geographic variables, cardiovascular risk factors, and fatal

outcomes from these five databases are shown in Table 3.2.


Table 3.2 Variables used from the 5 surveys (4 risk factor surveys & 1 mortality outcome survey)

Surveys (years): SFMS (1998), NFHS-2 (1998-99), MDS (2001-03), SRS (2004), NFHS-3 (2005-06)

Variable                  Values

Time
  Year                    1998, 1999, 2001-06

Demographic
  Age (≥15)               Years
  Sex                     Male, Female
  Education               Illiterate, up to Grade 5, Grade 6 to 10, Grade 10 & above

Geographic
  Residence               Urban, Rural
  State                   29 States, 6 union territories

CVD determinants
  Smoking                 Yes, No
  Age of initiation       Years
  Type of tobacco         Beedi, Cigarette, Other
  Body mass               10.0-49.0 kg/m2
  Diet                    Lacto-vegetarian – Yes, No
  Fruit intake            Daily, Weekly, Rarely/never
  Self-reported diabetes  Yes, No

CVD outcome
  CVD-specific deaths     ICD-10 codes (I10-15, I20-25, I44, I46, I50, I60-70, G45, G81-83, R55, R96)

3.4 Database Management

3.4.1 Abstraction of relevant variables from the 5 databases

Variables of interest were based on risk factors that account for bulk of the disease burden in adults and that are amenable to surveillance, prevention and control as documented in global health documents such as the World Health Report [94], Global Burden of Disease [23], the WHO-STEPS [95] and the Commission on Social Determinants of Health [96].

The relevant variables were then identified in the various study questionnaires for compilation of data dictionaries and for abstraction from the 5 databases.

1. Sociodemographic variables – age, sex, education, place of residence (urban/rural), state

2. CVD determinants – smoking, body mass, diet (lacto-vegetarianism, fruit intake), self-reported diabetes

For mortality outcome, age-standardized cardiovascular death rates were obtained from the on-going Million Deaths Study (age-standardization explained below).

3.4.2 Compilation of data dictionaries

Data dictionaries containing descriptions of data and the data fields were compiled for all five databases. Each data dictionary comprised of the following items at the minimum:

column (variable) ID

variable description

variable type

variable category

variable length

variable length for decimal places

minimum value of the variable

maximum value of the variable

other values of the variable (such as unknown, missing, etc.)

3.4.3 Data quality assessment

Ascertainment of quality of data was done by assessing coverage of surveys and by studying internal validity of survey information.

1) Internal quality control measures in surveys were reviewed to assess coverage: (a) It was

noted that 10% post-enumeration checks were routinely carried out in Census surveys in

India [97]; (b) it has been reported that completeness of reporting of vital events (births

and deaths) averaged around 85% in the Sample Registration System in India [98].

2) A literature review of accuracy of self-reporting was undertaken to assess internal validity:

(a) regarding smoking – past evidence indicates that proxy-reported smoking status was an

accurate and effective means of monitoring population-wide smoking prevalence of adults

[99]. Self-reporting was however found to under-estimate current smoking when

compared to metabolic markers such as carboxyhaemoglobin or cotinine measurements

[100]. Agreement between self-reported and proxy-reported smoking status was found to

be dependent on ethnicity; Cohen’s kappa was 0.82 for Asian Americans and found to be

intermediate between that for Whites/Blacks (0.91) and that for Hispanics (0.76). It was

also dependent on age, with lower agreement for younger ages [101]; (b) regarding self-reported

morbidities – it was noted that these were related to self-rated health across the

social gradient [102] and thus acceptable for large-scale epidemiological surveys.

Further, for geographical analysis, the quality of data from secondary sources was assessed for

geographic coverage and completeness of interview information [103].

3.4.4 Exploratory data analysis

Exploratory data analysis was undertaken initially to detect errors such as sequence break in

serial numbers, duplication of data, range errors and inconsistencies. Appropriate data cleaning was

done after detection of data errors; for example, implausible values and outliers were omitted from

the final analysis.
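The checks named above (sequence breaks in serial numbers, duplication and range errors) can be sketched as below. This is a minimal illustration, not the procedure actually run for the thesis: the record layout, the field names `serial` and `bmi`, and the function `find_data_errors` are hypothetical, and the body mass limits of 10.0 to 49.0 kg/m2 are taken from Table 3.2.

```python
def find_data_errors(records, value_field, min_value, max_value):
    """Report duplicated serials, gaps in the serial sequence and range errors."""
    seen = set()
    duplicates = []
    for record in records:
        serial = record["serial"]
        if serial in seen:
            duplicates.append(serial)   # same serial number entered twice
        seen.add(serial)

    # Serial numbers missing between the smallest and largest observed value.
    sequence_breaks = []
    if seen:
        sequence_breaks = sorted(set(range(min(seen), max(seen) + 1)) - seen)

    # Values outside the plausible range given in the data dictionary.
    out_of_range = [r["serial"] for r in records
                    if not (min_value <= r[value_field] <= max_value)]

    return {"duplicates": duplicates,
            "sequence_breaks": sequence_breaks,
            "out_of_range": out_of_range}


records = [
    {"serial": 1, "bmi": 22.0},
    {"serial": 2, "bmi": 55.0},   # implausible body mass
    {"serial": 2, "bmi": 21.0},   # duplicated serial number
    {"serial": 4, "bmi": 18.5},   # serial 3 is missing
]
report = find_data_errors(records, "bmi", 10.0, 49.0)
```

Records flagged in `out_of_range` correspond to the implausible values that were dropped before analysis.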


3.5 Data analysis

3.5.1 Conceptual framework

The conceptual framework for the data analysis is shown in Figure 3.3 [104].

Figure 3.3. Conceptual framework for the geographical epidemiological analysis of CVD in India

[Flow diagram with the following elements: Attribute data (disease data); Feature data (geographic areal data); Database management; Standardization; Statistical analysis; Geographical analysis; Validation]

3.5.2 Standardization

3.5.2.1 Risk factor estimates

All the risk factor surveys were age-standardized to the Census 2001 population. Direct

standardization was used for three (SFMS, NFHS-2 and NFHS-3) of the four risk factor surveys

which had individual-level data and indirect standardization was used for the SRS survey for which

only group-level information was available [105].


3.5.2.2 Mortality rates

Outcome estimates require knowledge of the numbers of people at risk. In India, such data for the

entire country come from two primary sources: the decennial Census

(the latest being the 2001 Census) and the Sample Registration System (SRS). Both have their own

set of problems. Firstly, due to the decennial nature of the census, estimates of population are not

routinely available and therefore need to be computed for the years between consecutive censuses.

This is computed by a ‘roll-forward’ geometric progression method based on the base year’s

estimates taking into account current births, deaths and net migration for the country. Secondly,

there is the issue of ‘under-count’ of deaths up to 20% within the Sample Registration System [98].

The approach to the latter issue has been to recalculate the death rates based on the higher absolute

deaths for the country as per the UN/WHO estimates.

The Million Death Study 2001-03 has two limitations. Firstly, it estimates only the

numerator, that is, the absolute numbers of cardiovascular deaths. Secondly, it has no appropriate

denominator since it covers an inter-censal time period.

Therefore, computation of cardiovascular mortality rate necessitated two procedures for

standardization to the year 2006. Firstly, the proportion of deaths from the MDS was used with the

death rates (average for years 2005 and 2006) from SRS and corrected upward (by about 15%) with

the UN mortality figures for the country to arrive at estimates of absolute deaths for the year 2006.

Similarly, UN population estimates for 2006 were used in conjunction with the proportion of

population in the various states from the Indian Census to arrive at state totals for the year 2006. This

enabled computation of age-standardized cardiovascular death rates for various states for the year

2006 as the outcome measure.
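The arithmetic of these two procedures can be sketched for a single state and age stratum as follows. This is one reading of the description rather than the thesis code: the function names are hypothetical, the input figures in the usage note are invented for illustration, and applying the upward correction as a flat multiplier of 1.15 is an assumption based on the "about 15%" quoted above.

```python
def cvd_death_rate_per_100k(cvd_proportion, srs_rate_2005, srs_rate_2006,
                            un_correction=1.15):
    """CVD death rate: MDS proportion x mean SRS all-cause rate x UN correction."""
    mean_all_cause_rate = (srs_rate_2005 + srs_rate_2006) / 2  # per 100,000
    return cvd_proportion * mean_all_cause_rate * un_correction


def state_population(un_national_population, census_state_share):
    """State total for 2006: UN national estimate x state's share in the Census."""
    return un_national_population * census_state_share


def absolute_cvd_deaths(rate_per_100k, population):
    """Estimated number of CVD deaths implied by a rate and a population."""
    return rate_per_100k * population / 100_000
```

With invented inputs, a CVD proportion of 0.25 and all-cause rates of 700 and 740 per 100,000 give about 207 CVD deaths per 100,000; in a state holding 5% of a national population of 1,000 million this corresponds to roughly 103,500 deaths.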

3.5.3 Statistical analysis

3.5.3.1 Univariate analysis

Univariate analysis of tobacco smoking, dietary behaviours, obesity and diabetes was done

to calculate prevalence proportions, means and rates per 100,000 persons respectively. Following

this step, age-standardized estimates [105] were computed using the following formula:

$$p_r = \sum_a w_a \, \frac{d^{+}_{ar}}{d_{ar}} \tag{1}$$

where

$p_r$ = age-adjusted disease (or risk factor) prevalence in region $r$

$d^{+}_{ar}$ = number with disease (or risk factor) in age-group $a$ in region $r$

$d_{ar}$ = number in the denominator (all persons studied) in age-group $a$ in region $r$

$w_a$ = a numeric weight for the age-group $a$

The value $w_a$ is derived from the 2001 Indian Census population (reference population) as:

$$w_a = \frac{N_a}{N_t} \tag{2}$$

where $N_a$ is the total number of persons in age stratum $a$ in the reference population and $N_t$ is the total number of persons in the reference population. 99% confidence intervals for the age-adjusted prevalence estimates were calculated using the following formulae:

$$\mathrm{Variance}_r = \sum_a w_a^{2} \, \frac{p_{ar} \, q_{ar}}{d_{ar}} \tag{3}$$

$$\mathrm{SE}_r = \sqrt{\mathrm{Variance}_r} \tag{4}$$

$$99\%\ \mathrm{CI\ for\ } p_r = p_r \pm 2.576 \times \mathrm{SE}_r \tag{5}$$

where

$\mathrm{Variance}_r$ = variance for the age-adjusted disease prevalence in region $r$

$p_{ar}$ = age-group specific prevalence in age-group $a$ in region $r$

$q_{ar} = 1 - p_{ar}$

$d_{ar}$ = number in the denominator in age-group $a$ in region $r$, as in equation (1)

$\mathrm{SE}_r$ = standard error for the age-adjusted disease prevalence in region $r$

The 2001 census population was used as the reference population.
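Equations (1) to (5) translate directly into a few lines of code. The sketch below is illustrative only: the function name and the two-stratum example data are hypothetical, and $d_{ar}$ is taken to be the number of persons studied in each age-group.

```python
from math import sqrt


def age_standardized_prevalence(cases, totals, reference_population, z=2.576):
    """Directly standardized prevalence with a confidence interval.

    cases[a]                = number with the disease/risk factor in age-group a
    totals[a]               = number studied in age-group a (the denominator)
    reference_population[a] = reference (census) population in age-group a
    """
    n_total = sum(reference_population)
    weights = [n / n_total for n in reference_population]           # eq. (2)
    p_age = [c / d for c, d in zip(cases, totals)]                  # age-specific prevalence
    p_r = sum(w * p for w, p in zip(weights, p_age))                # eq. (1)
    variance = sum(w ** 2 * p * (1 - p) / d
                   for w, p, d in zip(weights, p_age, totals))      # eq. (3)
    se = sqrt(variance)                                             # eq. (4)
    return p_r, (p_r - z * se, p_r + z * se)                        # eq. (5)


# Hypothetical region with two age-groups
prevalence, (lower, upper) = age_standardized_prevalence(
    cases=[10, 30], totals=[100, 100], reference_population=[600, 400])
```

In this example the weights are 0.6 and 0.4 and the age-specific prevalences are 0.10 and 0.30, so the standardized prevalence is 0.6 × 0.10 + 0.4 × 0.30 = 0.18.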

Subsequently odds ratios with 99% confidence intervals were calculated as measures of

association. Pearson or product-moment correlation coefficient was computed to compare

correlation between smoking prevalence proportions across various surveys.
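For reference, an odds ratio with a 99% confidence interval can be computed from a 2×2 table as sketched below. The text does not say which interval method was used; the log-based (Woolf) interval shown here is an assumption, and the function name and counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=2.576):
    """Odds ratio and confidence interval from a 2x2 table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts
or_value, or_lower, or_upper = odds_ratio_ci(30, 70, 10, 90)
```

The multiplier 2.576 is the same normal deviate used for the 99% intervals in equation (5).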

Results were reported using tables and graphical displays such as line graphs, bar graphs, pie charts and box plots wherever possible.

3.5.3.2 Multivariate analysis

Multivariate analysis was carried out by multiple linear regression [106] and Poisson

regression [107]. I regressed cardiovascular death rates for males and females in each state (as

outcome variable) on the following set of eight predictor variables:

Percent urbanization – from the census 2001

Smoking prevalence %, lacto-vegetarianism prevalence%, regular fruit intake prevalence %,

overweight prevalence % and diabetes prevalence % -- from NFHS-3 survey.

For multivariate linear regression [106], the data were first examined prior to modeling by

way of plotting to assess whether the data had linear relationships with the outcome. As a second


step, associations between various parameters were studied by looking at correlations between the

study variables. Then the appropriate model was created using these variables.

Then I tested the assumptions of linear regression. Firstly, I used the SPEC option of the MODEL

statement in PROC REG to check for heteroscedasticity (non-identical distributions of error terms) and

dependence of error terms. Because the null hypothesis of the SPEC test is that the error terms are

homoscedastic and independent of the predictors, a non-significant p-value indicates no evidence that

the error variances differ or that the error terms are dependent. Secondly, the Durbin-Watson (D-W) statistic was obtained by using the DW option in

REG. This tested for first order correlation of error terms. The D-W statistic ranges from 0 to 4.0.

Generally a value of 2.0 indicates the data are independent, while a low value of <1.6 indicates

positive correlation and a large D-W indicates negative correlation. Lastly I examined the residuals

of the model in two steps: using REG to create an output of residuals for which I subsequently used

PROC UNIVARIATE to test them.

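PROC REG DATA=DEATHS_RISKS;
   /* DW = Durbin-Watson statistic; SPEC = specification (heteroscedasticity) test */
   MODEL DEATHS = X Y Z / DW SPEC;
   /* save the residuals in RESIDS as variable RES */
   OUTPUT OUT=RESIDS R=RES;
RUN;

/* normality tests and plots for the saved residuals */
PROC UNIVARIATE DATA=RESIDS NORMAL PLOT;
   VAR RES;
RUN;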
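To make the interpretation of the Durbin-Watson statistic concrete, the sketch below computes it from a list of residuals using the standard definition: the sum of squared successive differences divided by the sum of squared residuals. The function name and the toy residual series are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for first-order correlation of residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Residuals that alternate in sign are negatively correlated (value above 2).
dw_alternating = durbin_watson([1, -1, 1, -1])

# Residuals that come in runs of the same sign are positively correlated (value below 2).
dw_runs = durbin_watson([1, 1, -1, -1])
```

The first series gives 12/4 = 3.0 and the second gives 4/4 = 1.0, which fall on either side of the value of 2.0 expected for independent errors.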

I also looked for multicollinearity (whether the variables were correlated) by obtaining the

Variance Inflation Factor by using VIF option in the REG statement. A cut-off of 10 was used to

test if it was stable. Lastly I used the R option to generate Cook’s D statistic while looking for

outliers that could exert a large influence on the overall outcome.

Subsequently I also re-tested my model using Poisson regression [107] with PROC

GENMOD and DIST=POISSON. To correct for overdispersion, I used the PSCALE (for Pearson)

option to obtain corrected chi-square statistics.
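The idea behind a Pearson-based overdispersion correction can be sketched as follows: the Pearson chi-square divided by its degrees of freedom estimates the dispersion, and standard errors are inflated by its square root. This shows the general quasi-Poisson adjustment, not SAS output; the function names and the observed and fitted counts are hypothetical.

```python
from math import sqrt


def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    degrees_of_freedom = len(observed) - n_params
    return chi_square / degrees_of_freedom


def scaled_standard_error(standard_error, dispersion):
    """Standard error inflated to allow for overdispersion."""
    return standard_error * sqrt(dispersion)


# Hypothetical observed and model-fitted death counts for four areas
dispersion = pearson_dispersion([12, 8, 20, 10], [10, 10, 16, 14], n_params=2)
```

A dispersion close to 1 suggests the Poisson variance assumption is reasonable; values well above 1 indicate overdispersion, in which case uncorrected standard errors would be too small.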

All statistical analyses and visualization were done using SAS 9.1 and MS-Excel 2003.

3.5.4 Geographical analysis

The two types of geographical analysis undertaken were Visualization and Exploration

[104]. For visualization, thematic mapping with choropleth maps (regional statistical maps) was


undertaken. Choropleth maps are vector maps depicting how a measurement is distributed across a

geographic area. They use estimates of unadjusted and standardized disease rates. Choropleth maps

may be simple or conditioned choropleth (CC) maps. Conditioned choropleth maps are special

choropleth maps showing distribution of dependent variables along two dimensions or conditions

[108].

For exploration, the variations across states were studied for spatial heterogeneity [109].

Spatial dependence of study variables was studied by spatial autocorrelation which is a measure of

similarity in neighbouring areas based on the values of a variable and a matrix for identification of a

region’s neighbours. Spatial autocorrelation of study variables was explored by means of both

global and local Moran’s autocorrelation. Global testing was done with a Moran scatterplot. The

plotting was done of a variable ‘x’ for a state and its spatial lag ‘w_x’, a weighted average of the

neighbouring states’ values. The slope of the regression line corresponds to Moran’s I statistic. The

value for this statistic ranges from -1 to +1, where -1 denotes strong negative autocorrelation; 0

denotes random distribution of values; and +1 denotes strong positive autocorrelation. Local

univariate Moran was used for the LISA (local indicator of spatial association) significance maps

which are maps of differences (that are statistically significant) between disease risks in one state

and the overall risk in the neighbouring states. This enabled construction of five different

comparisons to identify ‘clusters’ (high-high, meaning a state with high value that also has

neighbours with high values; similarly low-low) or ‘outliers’ (low-high, meaning a low value state

surrounded by high value states; and the opposite high-low) or ‘not significant’ regions. Sensitivity

analysis with up to 9999 permutations was performed in the generation of LISA maps with

significance being set at p<0.05. K-nearest states (with k=6) was used for the purpose of

constructing spatial weights.
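The global part of this procedure can be illustrated with a short sketch that builds k-nearest-neighbour weights, computes Moran's I as the slope of the spatial lag on the centred variable, and derives a permutation pseudo p-value. It is not the GeoDa implementation: the function names are hypothetical, row-standardized weights and centroid-based distances are assumptions, and the local (LISA) statistics are not shown.

```python
import random
from math import dist


def knn_weights(coords, k):
    """Row-standardized weights: each unit's k nearest neighbours get weight 1/k."""
    weights = []
    for i, point in enumerate(coords):
        others = sorted((j for j in range(len(coords)) if j != i),
                        key=lambda j: dist(point, coords[j]))
        weights.append({j: 1.0 / k for j in others[:k]})
    return weights


def morans_i(values, weights):
    """Global Moran's I for row-standardized weights."""
    mean = sum(values) / len(values)
    z = [v - mean for v in values]
    lag = [sum(w * z[j] for j, w in row.items()) for row in weights]  # spatial lag
    return sum(zi * li for zi, li in zip(z, lag)) / sum(zi * zi for zi in z)


def permutation_p_value(values, weights, permutations=9999, seed=0):
    """One-sided pseudo p-value from random reshuffling of values over units."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    as_extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            as_extreme += 1
    return (as_extreme + 1) / (permutations + 1)


# Six hypothetical units along a line, with k = 2 neighbours each
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
w = knn_weights(coords, k=2)
clustered = morans_i([1, 1, 1, 9, 9, 9], w)     # similar values adjoin
alternating = morans_i([1, 9, 1, 9, 1, 9], w)   # dissimilar values adjoin
```

In the toy example the clustered pattern gives a positive Moran's I and the alternating pattern a negative one, matching the interpretation of the statistic given above.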

All maps were created using the 2001 Census state boundaries. LISA maps were restricted

to the NFHS-3 dataset because this was the dataset that had information on all risk factors within

one study. All mapping was done using good cartographic principles with particular attention to

recommended map elements, typography and colour schemes.

3.5.4.1 General cartographic principles

Geographic locations linked to other spatially referenced data, and aggregated into larger

geographical units as desired, are suitable for thematic mapping [110].


Mapping metrics

A key question in disease mapping exercises is what to map. Two types of map for areal data were used: maps of standardized rates, and maps of differences (that were statistically significant) between disease risks in one area and the overall risk in the contiguous or total area [111].

Choice of mapping regions

The state was the geographic unit of study in India for three reasons. Firstly, it was an

optimal choice of mapping region as a trade-off between making it large enough to have stable

outcome estimates, and small enough to cover regions that were homogenous in nature [103]. If the

regions are too small, mapping may reveal spurious geographic patterns because of random

variations in the small numbers of events [112,113]. Secondly, the variables of interest for this

study were available from across multiple nation-wide surveys at the state level predominantly.

Lastly, it is the political administrative unit for macro-level health policy, planning and action and

this harmonized with my study objective to create a report that would be of use to stakeholders for

action in the field of cardiovascular disease control [46].

No. of classes for choropleth maps

Choosing the optimum number of classes for maps is an important part of map design. Too few classes may fail to show the variation in the data. Increasing the number of classes may yield a data-rich presentation by decreasing the amount of generalization, but it also has its demerits [114,115]: (a) too many classes may overwhelm map readers with information and distract them from noting the general trend in the distribution; and (b) it may decrease map legibility, because more classes require more colours, which become increasingly difficult to tell apart. I therefore chose 4 to 5 classes for most maps based on recommendations from the literature and after testing several alternatives.

Legend types [116]

I used two legend types in mapping: (1) Sequential schemes – for ordered data that progressed

from low to high with light colours for low range data and darker colours for the upper range data;

and (2) Diverging schemes – where the mid-range values had light colours and the extremes had

bright colours with contrasting hues.


All geographical analyses and presentation were done using ArcView 9.0 (ESRI, Redlands,

CA) and the public domain software GeoDa 0.9.5-i beta (Spatial Analysis Lab, University of

Illinois).

3.6 Ethics approval

Research ethics approval was obtained from St Michael’s Hospital (REB# 09-021C,

2/13/2009) and administrative approval was obtained from University of Toronto (ORE# 23964,

3/30/2009) for the thesis. This thesis involved secondary analysis of data previously collected for each

of the five different surveys. All are publicly available datasets (SFMS 1998, MDS 2001-03 & SRS

2004 from the Office of the Registrar General of India, New Delhi; NFHS-2 & NFHS-3 from the

International Institute for Population Sciences, Mumbai). SRS-2004 had only group-level data; the other

four datasets with individual-level data were anonymized and made available to the Centre for Global

Health Research, St Michael’s Hospital (SMH). The databases are housed on the SMH server

securely. I therefore had access to these datasets without any personal information. It is almost

impossible to match observations in the study datasets to any personal identifiers.


4 RESULTS

In this section, I first look at the characteristics of the various surveys and provide the crude

prevalence estimates for the various risk factors. Subsequently, I present in detail the age-adjusted

results for each risk factor starting with smoking (because it takes away about 10 years of life)[49],

then overweight (which takes away about 3 years of life)[50], followed by diet (about which

evidence is less clear-cut in terms of life-years lost) and self-reported diabetes (about which

minimal information was available in my study dataset).

4.1 Survey characteristics and descriptive analysis of study population

The characteristics of the five surveys such as survey timing, geographic extent covered,

age-group of interest, survey response rates, total number of individuals studied and their

demographic characteristics are listed in table 4.1.

4.1.1 Survey characteristics

The risk factor surveys were conducted over a 9-year period 1998-2006 and covered almost

the entire country in terms of the number of states covered. The total number of states and union

territories in India before the year 2000 was 32; in the year 2000, three large states were bifurcated

for administrative ease increasing the total to 35. The surveys carried out by the Registrar General

of India (SFMS, SRS and MDS-Phase I) were intended to be carried out in all geographic regions.

However, due to civil unrest, rural Nagaland was not covered in any of the three surveys; the state of

Jammu & Kashmir was in addition excluded from the SFMS survey in the year 1998 for a similar

reason. Both the National Family Health Surveys covered all existing states, leaving out the smaller

union territories. All five surveys however still covered over 99% of the country’s population. The

risk factor surveys achieved high overall survey response rates (over 90%) and studied large

populations ranging from 0.2 million to about 4.5 million adults aged 15 years and over. The

mortality outcome survey which covered a base population of 6.4 million individuals followed up

over a study period of 3 years had a relatively lower response rate of 85%. This study yielded a total

of 123,905 deaths, of which 50,336 were in the age-group 30 to 69 years.


Table 4.1: Descriptive analysis of baseline characteristics in the selected surveys

Risk factor surveys: SFMS, NFHS-2, SRS, NFHS-3. Mortality survey: MDS (Ph I).

Variables                     SFMS       NFHS-2     SRS        NFHS-3                  MDS (Ph I)

Survey characteristics
Survey year(s)                1998       1998-99    2004       2005-06                 2001-03
No. of states surveyed        30.5/32    26/32      34.5/35    29/35                   34.5/35
  Large (>10 M pop.)          16/17      17/17      19/19      19/19                   19/19
  Small (<10 M pop.)          8.5/9      9/9        9.5/10     10/10                   9.5/10
  Union territories           6/6        --         6/6        --                      6/6
Country population covered    99%        >99%       >99%       >99%                    >99%
Survey response rate          n/a        93%        n/a        92%                     88%
Age-group analyzed            ≥ 15 yrs   ≥ 15 yrs   ≥ 15 yrs   ♂: 15-54, ♀: 15-49 yrs  30-69 yrs
Nos. studied                  3,870,872  334,486    4.5 M      198,754                 50,336

Demographic characteristics
Sex
  Male                        51.0%      50.4%      n/a        37.4%                   60.1%
  Female                      49.0%      49.6%      n/a        62.6%                   39.9%
  missing                     0.0%       0.0%       n/a        0.0%                    0.09%
Age (yrs), overall
  Mean/median                 35.8/33.0  35.6/32.0  n/a        29.8/29.0
Age-groups (in young & middle age: 15-69 yrs)
  15-29                       44.7%      46.0%      n/a        52.3%                   --
  30-44                       31.3%      29.6%      n/a        37.2%                   21.8%
  45-59                       17.0%      16.7%      n/a        10.5%                   36.0%
  60-69                       7.0%       7.7%       n/a        n/a                     42.2%
Education
  Illiterate                  38.4%      35.5%      n/a        25.4%                   51.5%
  Upto Grade 5                27.8%      17.4%      n/a        14.7%                   20.1%
  Upto Grade 10               26.4%      31.9%      n/a        37.0%                   15.5%
  Grade 11 & over             7.4%       15.2%      n/a        22.9%                   5.0%
  missing                     0.5%       0.05%      n/a        0.02%                   7.9%
Residence
  Urban                       23.5%      33.4%      n/a        47.9%                   17%
  Rural                       76.5%      66.6%      n/a        52.1%                   83%
  missing                     0.0%       0.0%       n/a        0.0%                    0.0%

MDS (Ph I) = MDS phase I; M = million; States surveyed, 0.5 = urban only; n/a = Data not available; -- = age-group not included in this study. All percentages are calculated excluding missing observations.


4.1.2 Demographic characteristics

The SRS 2004 survey had no individual-level data on demographic characteristics. In the

other three risk factor surveys with individual-level information, data was almost complete with less

than 1% of values missing for the four demographic variables. In MDS, missing values were low at

about 1% except for the education field wherein about 7% of deceased had missing observations.

Sex

The SFMS and NFHS-2 had slight preponderance of males over females while NFHS-3 had

nearly two-thirds of participants as females. The MDS had about 60% male subjects.

Age

The overall mean (and median) ages in the SFMS and NFHS-2 surveys were almost similar;

the NFHS-3 survey mean was about 6 years lower.

By age-group of interest, that is among young and middle-aged adults (15 to 69 years), the

SFMS and NFHS-2 had similar age distributions with about 45% of study population being young

adults (in the 15 to 29 years age-group) and about 55% being middle-aged adults. This was seen to

be inverted in NFHS-3 with about 52% in the 15 to 29 years age-group and about 47% being

middle-aged adults. In the MDS, the age-group of 30 to 69 years was the study-group of interest.

Education

The proportions that were illiterate in SFMS, NFHS-2 and NFHS-3 were 38%, 36% and

25% respectively. In MDS, this was about 52% because of higher death rates among illiterates.

Residence

In the SFMS, about a quarter of participants were from an urban area with the remainder residing in a rural area. The proportions living in an urban area in NFHS-2 and NFHS-3 were higher at 33% and 48% respectively, while the proportion was lower at 17% in MDS.

4.1.3 Crude prevalence of selected CVD determinants in the four surveys

The crude prevalence of selected CVD determinants (smoking, fruit consumption,

vegetarianism, overweight and diabetes) along with male-female differences and urban-rural

variations are shown in table 4.2.


Table 4.2 Crude prevalence of CVD determinants in selected surveys, India

Variables                        SFMS       NFHS-2     SRS        NFHS-3

Survey year(s)                   1998       1998-99    2004       2005-06
Age-group studied                ≥ 15 yrs   ≥ 15 yrs   ≥ 15 yrs   ♂: 15-54 yrs, ♀: 15-49 yrs

Current smoking prevalence %
  ≥ 15 years
    Male                         27.3%      30.4%      26.1%      33.2%
    Female                       1.6%       3.1%       2.4%       2.2%
  ≥ 15 years
    Urban male                   21.5%      23.6%      20.6%      29.8%
    Rural male                   28.8%      33.8%      28.3%      36.8%
  ≥ 30 years
    Urban male                   31.8%      33.4%      28.9%      36.5%
    Rural male                   42.9%      46.1%      39.0%      46.2%

Overweight %
    Male                                                          11.8%
    Female                                                        15.1%
    Urban                                                         20.9%
    Rural                                                         7.8%

Vegetarianism %*
    Male                                               24.1%      27.9%
    Female                                             26.0%      36.8%

Fruit consumption % (at least weekly)

Diabetes %

* = Vegetarianism in SRS survey and lacto-vegetarianism in NFHS-3 survey


Smoking

The crude prevalence of current smoking among all males aged 15 years and over was found

to be between 26.1% and 33.2% in the four surveys; this was about 10-fold higher than the

prevalence among females, which was between 1.6% and 3.1%. Among males in this age-group,

prevalence of smoking was around 22 to 24% in urban males and 28 to 34% in rural males. In the

age-group of 30 years plus, prevalence was noted to be higher at 29 to 34% among urban males and

39 to 46% among rural males. The ratio of rural:urban smokers was noted to be consistently 1.4

across all surveys in the age-groups considered.

Body mass

Only NFHS-3 had data on body mass for both males and females. The proportions of males and females who were overweight were 12% and 15% respectively. When overweight prevalence was examined by residence, the urban:rural ratio was nearly three-fold, with nearly 21% of urban residents being overweight or obese as compared to about 8% of rural residents.

Vegetarianism

About 26% of females considered themselves vegetarians in the SRS survey while nearly

37% considered themselves lacto-vegetarians in the NFHS-3 survey; this was higher in comparison

to the 20 to 24% among males in the SRS and NFHS-3 surveys. Slightly higher proportion of rural

residents (35%) reported as being lacto-vegetarians when compared to urban residents (32%) in the

NFHS-3 survey. This data by residence was not available in the SRS survey.

Fruit consumption

Data on frequency of fruit consumption was available only from NFHS-3. About 56% of

males and 48% of females reported at least weekly fruit consumption. This male:female ratio of 1.2 was less marked than the urban:rural ratio of 1.6, with 63% of urban residents as compared to 40% of rural residents reporting fruit consumption at least once a week.

Self-reported Diabetes

Self-reported diabetes prevalence among those aged 30 years and over was 2.7% among

males and 1.9% among females. While the sex ratio was 1.5:1.0, the urban: rural ratio was 2.1:1.0

since diabetes prevalence was 3.2% among urban residents and 1.5% among their rural

counterparts.


4.2 Smoking

In this section, I start with the results of smoking prevalence in both males and females from

across different surveys. I then present details of smoking among all males (aged

15 years and over) since smoking among females was less common. Finally, I focus on middle-aged

males (ages 30-69 years) since peak smoking was seen in this age-group and because cardiovascular

mortality was studied in this age-group.

4.2.1 Smoking prevalence among males and females

The crude prevalence of current tobacco smoking among all males and females aged 15

years and over is graphed in Figure 4.1. Male smoking prevalence ranged between 26 and 30% and

female smoking prevalence ranged between 1.6 and 3.1% in the first three surveys where a family

member (usually the male head of household) reported the smoking history on behalf of all family

members. In the NFHS-3, prevalence was calculated based on self-reporting by both males and

females and found to be 33% and 2.2% respectively. Across all surveys, however, female smoking

was very low in the country and male smoking prevalence was 10-fold or more that among females.

Figure 4.1 Crude prevalence of current tobacco smoking among males and females, aged 15 years and over, from selected surveys in India

[Bar chart. x-axis: Survey (year); y-axis: Prevalence %.]

Survey (year)        Male    Female
SFMS [1998]          27.3    1.6
NFHS-2 [1998-99]     30.4    3.1
SRS [2004]           26.1    2.4
NFHS-3 [2005-06]     33.2    2.2

4.2.2 Smoking among all males in SFMS 1998

This subsection takes a detailed look at male smoking behaviour in the SFMS 1998 survey since

it was a large survey and contained relatively more details about smoking.

4.2.2.1 Smoking prevalence

Figure 4.2 panel (a) shows, on two y-axes, the proportions smoking within each 5-year age

group and the cumulative prevalence of smoking among males. Among young adults aged 15-24

years, less than 10% were smoking. This however increased in the next two age-groups 25 to 29

years and 30-34 years to reach a peak of about 45% in the late 40s and 50s. Then the proportion

smoking within each age group reduced. The cumulative prevalence increased from a low of about

1.4% in the youngest age group to about 28% overall.

Panel (b) represents the same findings from a different perspective. It shows, on two y-axes,

the cumulative prevalence and the change in cumulative prevalence for each 5-year age-group as

compared to the previous 5-year age-group.

Figure 4.2(a) Age-specific prevalence (bars) and cumulative prevalence (line) of current smoking among all males, ages ≥ 15 years, SFMS 1998

[Combined bar and line chart. x-axis: Age-group (yrs); left y-axis: Age-specific prevalence; right y-axis: Cumulative prevalence.]

Age-group (yrs)    Age-specific prevalence    Cumulative prevalence
15-19              1%                         1%
20-24              9%                         5%
25-29              21%                        10%
30-34              33%                        15%
35-39              33%                        18%
40-44              40%                        21%
45-49              44%                        23%
50-54              45%                        25%
55-59              46%                        26%
60-64              44%                        27%
65-69              42%                        27%
70-74              38%                        28%

Figure 4.2(b) Cumulative prevalence (line) and percent increase in smoking prevalence above younger age-group (bars) of current smoking among all males, ages ≥ 15 years, SFMS 1998

[Combined bar and line chart. x-axis: Age-group; left y-axis: % increase in smoking prevalence above younger age-group; right y-axis: Cumulative prevalence %.]

Age-group (yrs)    Cumulative prevalence %    % change in cumulative prevalence
15-19              1                          -
20-24              5                          249%
25-29              10                         94%
30-34              15                         49%
35-39              18                         22%
40-44              21                         16%
45-49              23                         11%
50-54              25                         7%
55-59              26                         5%
60-64              27                         3%
65-69              27                         2%
70-74              28                         0%

4.2.2.2 Types of tobacco smoked by level of education

Figure 4.3 depicts the proportion of males smoking different types of tobacco by level of

education. Overall smoking prevalence tended to decrease with increasing educational attainment.

Examined by type of tobacco smoked, the patterns diverged. Beedi smoking

prevalence was seen to decrease with increasing levels of education, from a high of 29.2% among

illiterates to a low of 4.9% among those with post-secondary education (p<0.001). Cigarette

smoking, on the other hand, was seen to increase with increasing levels of education from a low of


3.0% among illiterates to a high of 13.2% among those with post-secondary or graduate education

(p<0.001).

Figure 4.3 Types of tobacco smoked by level of education among males, ages 15 years and over, SFMS 1998

[Bar chart. x-axis: Education level; y-axis: Prevalence %.]

Education level    Cigarette    Beedi    All smokers
Illiterate         3.0          29.2     38.0
Upto Grade 5       4.5          18.3     25.5
Grades 6-8         5.7          13.2     20.7
Grades 9-10        7.1          9.4      18.0
Grade 11-12        8.7          7.4      17.9
Post-secondary     13.2         4.9      19.2

4.2.2.3 Age at initiation of smoking

The mean and median ages at initiation of smoking among males in India were 21.0 and 20.0

years respectively. Figure 4.4 shows a box plot of the ages at initiation of smoking for

cigarette and beedi smoking respectively along with the univariate statistics. Mean age at initiation

of smoking for beedis was lower at 20 years as compared to 22 years for cigarettes. Of smokers, 90%

had started the habit by 25 years of age for beedi smoking and by 28 years of age for cigarette

smoking.

Figure 4.4 Mean age at initiation of smoking by type of tobacco used among male smokers, SFMS 1998

[Box plots of age at initiation of smoking for cigarette and beedi smokers.]

                    Type of smoker
Variable            All smokers      Cigarette        Beedi
Total (N)           534,697          102,196          368,841
No. (%) analyzed    528,873 (99%)    100,875 (99%)    365,279 (99%)
Mean                20.6             22.0             20.2
S.D.                4.5              4.5              4.5
S.E.                0.01             0.01             0.01
Variance            21.0             20.1             19.8
IQR                 5.0              6.0              4.0
95%                 30.0             30.0             30.0
Q3                  23.0             25.0             22.0
Median              20.0             20.0             20.0
Q1                  18.0             19.0             18.0
5%                  15.0             15.0             16.0

Figure 4.5 shows the mean age of initiation of smoking by level of education for males

smoking different types of tobacco. Those who were illiterate started smoking about 2.0 years

earlier than those with post-secondary or graduate education. Beedi smokers who were illiterate

started smoking at 20.1 years of age, about 1.5 years before cigarette smokers with similar

education. Cigarette smokers with graduate education started smoking at 23.2 years of age, 2.0

years later than beedi smokers with similar education.

Figure 4.5 Mean age of initiation of smoking among males for different types of tobacco by level of education, SFMS 1998

[Line chart. x-axis: Education; y-axis: Age (yrs), 18 to 24; series: Cigarette, All smokers, Beedi.]

                Education
Tobacco type    Illiterate    Upto Grade 5    Grades 6-8    Grades 9-10    Grades 11-12    Post-secondary
Cigarette       16,731        28,810          23,744        14,508         9,308           10,608
Beedi           160,810       122,233         54,628        19,106         7,884           3,905
All*            209,465       167,648         85,872        36,713         19,178          15,403

* All = cigarette, beedi & others

4.2.3 Smoking among middle-aged adults

This section deals with middle-aged adults (ages 30-69 years) with a special emphasis on males.

4.2.3.1 Smoking prevalence

Given that the mean age of initiation of smoking was late in the SFMS-1998 survey with

most men taking up smoking only during the later phase of young adult life, the proportions

smoking tobacco were assessed using three broad age-categories of adulthood – young adults (15-

29 years), middle-aged adults (30-69 years) and older adults (70 years and over). This is illustrated

in table 4.3. While overall 27% of adult males were smokers, in the younger age-group this was less

than 10%; but in middle-age over 40% of males were smokers. Rural residents (43.4%) were more

likely to be current smokers than their urban counterparts (32.4%) in middle-age also.

Table 4.3: Smoking among young, middle-aged and older adults by sex and residence, 1998

                   Smokers in region
Age-group          Rural No. (%)       Urban No. (%)     Total No. (%)
Males
  15-29 years      657,003 (10.7)      197,796 (7.3)     854,799 (9.9)
  30-69 years      794,472 (43.4)      256,996 (32.4)    1,051,468 (40.7)
  ≥ 70 years       52,957 (35.1)       15,450 (21.9)     68,407 (32.1)
  Total            1,504,432 (28.8)    470,242 (21.5)    1,974,674 (27.1)
Females
  15-29 years      624,192 (0.8)       188,965 (0.5)     813,157 (0.7)
  30-69 years      777,188 (2.4)       233,117 (1.5)     1,010,305 (2.2)
  ≥ 70 years       55,507 (4.0)        17,229 (3.1)      72,736 (3.8)
  Total            1,456,887 (1.8)     439,311 (1.2)     1,896,198 (1.6)

4.2.3.2 Geographical analysis of tobacco use by place of residence

Type of tobacco used by middle-aged male smokers according to the place of residence is

shown in Figure 4.6. Overall, about 70% of smokers were beedi users and only one in five were

cigarette smokers. Cigarette smoking was more common in urban areas than in rural areas (43% vs

14% respectively).

Figure 4.6 Type of tobacco smoked by place of residence among middle-aged male smokers, 1998

[Three pie charts: Total, Rural, Urban.]

Residence    Beedi    Cigarette    Other
Total        70%      19%          11%
Rural        73%      14%          13%
Urban        51%      43%          6%

4.2.3.3 Types of tobacco use in different states

The proportion of different types of tobacco smoked in the various states is shown in figure

4.7. Overall, there was a 6-fold variation in current smoking between states; this varied for beedi

smoking (which had a 5-fold variation) and for cigarette smoking (which had a 15-fold variation).

Beedi was the most common form of tobacco smoked in the country; overall, it was nearly four

times more commonly used than the cigarette. There were however wide variations in

beedi:cigarette use ratio between different states ranging from a low of 1:1 (in Delhi) and 1.2:1.0 (in

Kerala and northeastern states) to a high of 31:1 (in Gujarat). Other forms of tobacco smoking such

as cheroot and chutta were seen only in some districts of a few northern states (Bihar, Jharkhand,

Uttar Pradesh, Uttaranchal and Haryana) and in some regions of a northeastern state (Mizoram) and

a southern state (Andhra Pradesh).

Figure 4.7 Types of tobacco smoked and proportion of beedi smokers among middle-aged (30-69 years) males in states of India, 1998

[Horizontal stacked bar chart. x-axis: Prevalence % (0 to 70%); series: Beedi, Cigarette, Other; data labels: Proportion beedi smokers (%). Bars shown for: Punjab & Chandigarh; Maharashtra, Goa, Daman & Diu; Delhi; Orissa; Tamil Nadu & Puducherry; Kerala & Lakshadweep; Gujarat (incl. DNH); Karnataka; Bihar & Jharkhand; INDIA; Madhya Pradesh & Chhatisgarh; Andhra Pradesh; Uttar Pradesh & Uttaranchal; Northeastern states; Rajasthan; Himachal Pradesh; Assam; Haryana; West Bengal & AN Islands.]

Northeastern states include: Meghalaya, Nagaland, Manipur, Mizoram, Sikkim, Arunachal Pradesh, Tripura

4.2.3.4 Geographic mapping of smoking variation by residence

The prevalence of smoking among males aged 30-69 years residing in urban and rural locations in SFMS is shown in figure 4.8. Smoking was more common in rural areas than in urban areas. Further, smoking in rural regions was more prevalent in the north-western states (Rajasthan, Haryana, Himachal Pradesh, Uttarakhand and Uttar Pradesh), the eastern/north-eastern states (West Bengal, Assam, Meghalaya, Manipur, Mizoram and Tripura) and the southern state of Andhra Pradesh. North-eastern states had high prevalence of smoking in urban areas as well. Smoking was least prevalent in urban Maharashtra and urban and rural Punjab.

Figure 4.8 Prevalence of smoking among males in different states of India, 1998

4.2.3.5 Smoking variation across states by type of tobacco smoked

Figure 4.9 shows the geographical distribution of types of tobacco smoked in different

states. This is shown using pie-charts of the proportions of adult male smokers who used beedis,

cigarettes or other forms of tobacco (like cheroot, chutta, etc.). Cigarette smoking was relatively

more common in the north-eastern and southern states and less common in northern, western and

central parts of India.

Figure 4.9 Proportion of different types of tobacco smoked in different states, Special Fertility and Mortality Survey (SFMS) 1998

4.2.3.6 Variations across time

Figure 4.10 shows the time trends in smoking among rural males in the stable age-group of 45-59 years across four different surveys. Smoking prevalence was mapped in four categories: low (< 20%), medium (20 to 33%), high (34 to 50%) and very high (>50%). From the maps, it appears that there was a rise in smoking prevalence between SFMS (1998) and NFHS-2 (1998-99), followed by a dip in SRS (2004) and then an increase again in NFHS-3 (2005-06). The states of Punjab and Maharashtra had low smoking prevalence in all four surveys (in line with the findings in figures 4.7 and 4.8). The temporal variations for urban residents aged 45 to 59 years and for adults in the age-group of 30 to 44 years also followed a similar pattern (not shown).


Figure 4.10 Maps of smoking prevalence among rural males, ages 45-59 years, across the four selected surveys


4.2.3.7 Correlation between states across time

Correlation between state-level smoking prevalence among 45-59 year old adult males

across various surveys was very high. Table 4.4 displays Pearson correlation statistics for pairs

of analysis variables across different surveys.

Table 4.4 Pearson correlation coefficients comparing state-level smoking prevalence across different surveys for males, ages 45-59 years

Surveys    SFMS    NFHS-2        SRS           NFHS-3
SFMS       1.00    0.87* (28)    0.83* (34)    0.77* (28)
NFHS-2             1.00          0.92* (29)    0.93* (29)
SRS                              1.00          0.88* (29)
NFHS-3                                         1.00

* p < 0.001; figures in parentheses are the number of comparison states
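As an illustration of how a coefficient of this kind is obtained, the Pearson correlation between two lists of state-level prevalences can be sketched as below. This is a minimal sketch and not the routine used for the analysis; the function name `pearson_r` and the prevalence values are hypothetical and are not taken from the study datasets.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical smoking prevalences (%) for the same five states in two surveys
survey_a = [20.0, 35.0, 48.0, 55.0, 30.0]
survey_b = [24.0, 33.0, 52.0, 58.0, 36.0]
r = pearson_r(survey_a, survey_b)
print(f"Pearson r = {r:.2f}")
```

Only states present in both surveys can enter such a comparison, which is why the number of comparison states differs between pairs of surveys.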

An example of a scatterplot comparing the values of different states in two different

surveys (SFMS 1998 and NFHS-2) is shown in Figure 4.11. The positive correlation between the

values for the states was consistent across the four surveys in a similar manner (not shown).

Figure 4.11 Correlation between smoking prevalence in states between SFMS and NFHS-2

[Scatterplot of state-level smoking prevalence. Axes: SFMS 1998 and NFHS-2 1998-99. Labelled states: Mizoram, Assam, Punjab, Tamil Nadu.]

Though the overall correlation coefficients were high, there were a few exceptions: some states had low values in some surveys and high values in others (e.g. Assam, Tamil Nadu).

4.2.3.8 Geographic mapping of quitting smoking

From the SRS 2004 survey that had a question identifying ex-smokers, it was possible to

calculate the ratio of ex:current smokers among males aged 45-59 years. The national average

was 4.8. Between states there were differences: most states had ratios lower than 4.0 and a few

states had greater than 8.0. The former states were generally those with high prevalence of

smoking. The latter included some states with low prevalence of smoking (eg. Maharashtra in

central India) and also some states with high prevalence of smoking (eg. Kerala in the south).

This is seen in the accompanying figure 4.12.

Figure 4.12 Ratio of ex:current smokers among males, ages 45-59 years, Sample Registration System (SRS) 2004


4.2.4 Spatial heterogeneity

Spatial heterogeneity in smoking prevalence between states was tested by looking for spatial autocorrelation. This is depicted in figure 4.13. The scatterplot of smoking prevalence plotted against its ‘spatial lag’ (i.e., the weighted average of neighbouring values) yielded a global Moran’s I statistic of 0.314 for males (shown below) and 0.02 for females (not shown), revealing minimal clustering of smoking behaviour in the NFHS-3 survey.

The accompanying LISA map for males revealed ‘high-high’ clustering in the

northeastern states, ‘low-low’ clustering in the western states of Gujarat and Maharashtra and a

‘high-low’ outlier in the state of Kerala in the south. No such ‘high-high’ clustering was noted

for females.
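The global Moran's I statistic referred to above can be illustrated with a short sketch. It assumes a binary contiguity weights matrix (1 where two regions share a border, 0 otherwise) and uses four hypothetical regions arranged in a line; the weighting scheme, the function name `morans_i` and the values are assumptions for illustration and not the configuration used in this study.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix.

    I = (n / S0) * sum_ij w_ij * (x_i - mean) * (x_j - mean) / sum_i (x_i - mean)^2
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    total_sq = sum(d * d for d in dev)
    return (n / s0) * cross / total_sq


# Hypothetical example: four regions in a row, each bordering its neighbour(s)
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [10.0, 12.0, 30.0, 32.0]    # similar values sit next to each other
alternating = [10.0, 30.0, 10.0, 30.0]  # neighbours are always dissimilar

i_clustered = morans_i(clustered, chain_weights)
i_alternating = morans_i(alternating, chain_weights)
print(round(i_clustered, 3), round(i_alternating, 3))
```

Values of I near zero suggest no spatial pattern, positive values suggest that similar prevalences lie in neighbouring regions, and negative values suggest that neighbours tend to differ.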

Figure 4.13 Scatterplot of global spatial autocorrelation (for males) and LISA maps showing local spatial clustering of smoking (for males and females), NFHS-3 [2005-06]

LISA map of Male Smoking, 15-54 yrs LISA map of Female Smoking, 15-49 yrs


4.3 Body mass

Increased body mass index (BMI ≥ 25 kg/m2) which is a major risk factor in terms of

life-years lost is the second CVD risk factor described in this section. This is done in two ways

from data available in the NFHS-3 (2005-06) survey: in terms of proportions who had increased

BMI and by looking at mean BMI.

4.3.1 Overweight/obesity

4.3.1.1 Prevalence of overweight/obesity

The prevalence of overweight/obesity (BMI ≥ 25 kg/m2) was 13.9% (11.8% among males and 15.1% among females). The prevalence of overweight/obesity by residence, sex and education is shown in table 4.5. The prevalence in urban and rural areas was 20.9% and 7.8% respectively. Peak prevalence was seen among urban females (23.3%), almost 3-fold higher than among rural females and 4-fold higher than among rural males. Overweight/obesity was also higher among those with higher education, with those who had completed grade 10 having a prevalence of 21.5%, almost 3 times higher than those who were illiterate.


Table 4.5 Prevalence of overweight/obesity in NFHS-3 survey, 2005-06

                                  Overweight/Obesity (BMI≥25), NFHS-3 (2005-06)
Characteristic                    %        Crude O.R. (99% CI)
Total (n=187,886)                 13.9%
Sex
  Male (n=69,198)                 11.8%    0.75 (0.73-0.78)
  Female (n=118,727)              15.1%    1.00
Residence
  Urban (n=95,160)                20.9%    3.11 (3.03-3.20)
  Rural (n=103,594)               7.8%     1.00
Residence & sex
  Urban Male (n=34,646)           17.2%    0.69 (0.66-0.71)
  Urban Female (n=53,171)         23.3%    1.00
  Rural Male (n=34,552)           6.5%     0.74 (0.71-0.78)
  Rural Female (n=65,556)         8.5%     1.00
Education
  Illiterate (n=48,204)           8.5%     0.34 (0.29-0.39)
  Upto Grade 5 (n=27,631)         11.3%    0.47 (0.43-0.52)
  Grade 6 – 10 (n=69,950)         14.1%    0.60 (0.56-0.64)
  Grade 11 & over (n=42,101)      21.5%    1.00
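The crude odds ratios in table 4.5 can be approximated from the reported prevalences alone. The sketch below is illustrative only: the function name `odds_ratio` is hypothetical, and because the inputs are the rounded percentages from the table, the results may differ slightly from the tabulated values, which were presumably computed from the underlying counts.

```python
def odds_ratio(p_group, p_reference):
    """Crude odds ratio from two prevalence proportions (each between 0 and 1)."""
    odds_group = p_group / (1.0 - p_group)
    odds_reference = p_reference / (1.0 - p_reference)
    return odds_group / odds_reference


# Rounded prevalences from table 4.5
or_male_vs_female = odds_ratio(0.118, 0.151)   # males vs females (reference)
or_urban_vs_rural = odds_ratio(0.209, 0.078)   # urban vs rural (reference)

print(f"Male vs female: {or_male_vs_female:.2f}")
print(f"Urban vs rural: {or_urban_vs_rural:.2f}")
```

An odds ratio below 1.00 indicates lower odds of overweight/obesity than in the reference category, and a value above 1.00 indicates higher odds.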

Additional analysis of categories of overweight by residence and sex is shown in table 4.6. About 80% of those who had BMI ≥ 25 were in the category of pre-obesity (BMI=25.0-29.9); the rest were in the category of W.H.O. Class I & II obesity (BMI=30.0-34.9 and BMI=35.0-39.9 respectively) with <1% being morbidly obese (BMI ≥40.0). While prevalence of overweight among rural males and females was 5.8% and 7.1% respectively, among urban males and females it was much higher at 14.6% and 17.3% respectively. Class I & II obesity was much less common among males than among females, especially in urban areas, where it was 5.7% among females. Morbid obesity or class III obesity was less than 0.25% in all sub-groups.

Table 4.6 Prevalence proportions of overweight by residence and sex, NFHS-3, 2005-06

Variable                                     Urban    Rural
Total
  BMI ≥ 25                                   20.9%    7.8%
By sex
Males & females
  Overweight/pre-obesity (BMI=25.0-29.9)     16.2%    6.6%
  Obese Class I & II (BMI=30.0-39.9)         4.5%     1.1%
  Obese Class III (BMI ≥ 40)                 0.1%     0.03%
Male
  Overweight/pre-obesity (BMI=25.0-29.9)     14.6%    5.8%
  Obese Class I & II (BMI=30.0-39.9)         2.6%     0.7%
  Obese Class III (BMI ≥ 40)                 0.01%    0.02%
Female
  Overweight/pre-obesity (BMI=25.0-29.9)     17.3%    7.1%
  Obese Class I & II (BMI=30.0-39.9)         5.7%     1.4%
  Obese Class III (BMI ≥ 40)                 0.2%     0.04%

4.3.1.2 Distribution of BMI by sex and residence

About 7% of males and 4% of females had missing or implausible BMI values; these

observations were excluded and the remainder were retained for the analysis. The mean (and

99% confidence limits) of BMI for males was 20.8 (20.7-20.8) and marginally higher for females

equaling 21.0 (21.0-21.1) kg/m2. The boxplots of BMI distributions for males and females are

shown in figure 4.14 and by sex and residence are shown in figure 4.15 along with the statistical

estimates. While the mean BMI in rural areas was about 20 kg/m2 in both sexes, it was

significantly higher at 21.5 for urban males and 22.1 for urban females (p<0.001).
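The narrow confidence limits around the mean BMI follow directly from the large sample size. As a rough check, the standard error and 99% limits for males can be reproduced from the summary statistics reported in figure 4.14 (mean 20.8, S.D. 3.5, n = 69,198). The sketch assumes a normal approximation and simple random sampling, which ignores the survey design; the function name `mean_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def mean_ci(mean, sd, n, level=0.99):
    """Standard error and normal-approximation confidence limits for a mean."""
    se = sd / math.sqrt(n)
    # Two-sided critical value, e.g. about 2.576 for a 99% interval
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return se, mean - z * se, mean + z * se


se, lower, upper = mean_ci(mean=20.8, sd=3.5, n=69198)
print(f"S.E. = {se:.3f}; 99% limits = {lower:.2f} to {upper:.2f}")
```

With tens of thousands of observations per group, even small differences in mean BMI between groups reach statistical significance, so the size of the difference matters more than the p-value.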

Figure 4.14 Boxplot showing distribution of body mass index by sex, 2005-06

[Boxplots of BMI for males and females.]

Statistic            Male                Female
Total no.            74369               124385
No. (%) analyzed     69198 (93%)         118727 (96%)
99%                  31.2                33.6
95%                  27.4                28.8
Q3                   22.8                23.1
Median               20.2                20.3
Q1                   18.3                18.2
5%                   16.1                16.0
1%                   14.8                14.6
Mean (LCL-UCL)       20.8 (20.7-20.8)    21.0 (21.0-21.1)
S.D. (LCL-UCL)       3.5 (3.5-3.6)       4.0 (4.0-4.1)
S.E.                 0.01                0.01
Variance             12.4                16.3

t-test = -15.9; p-value < 0.001
LCL=lower confidence limit; UCL=upper confidence limit

Figure 4.15 Boxplot showing distribution of body mass index by gender and residence, 2005-06

[Boxplots of BMI by residence; panels: Males, Females.]

                    Males                                   Females
Statistic           Urban               Rural               Urban               Rural
No. of persons      34646               34552               53171               65556
99%                 32.4                29.2                35.4                30.9
95%                 28.4                25.6                30.5                26.5
Q3                  23.8                21.6                24.7                21.8
Median              21.0                19.6                21.3                19.6
Q1                  18.7                18.0                18.8                17.8
5%                  16.2                16.0                16.2                15.8
1%                  14.8                14.7                14.7                14.5
Mean (LCL-UCL)      21.5 (21.4-21.5)    20.0 (20.0-20.1)    22.1 (22.1-22.1)    20.2 (20.1-20.2)
S.D. (LCL-UCL)      3.8 (3.8-3.9)       3.0 (3.0-3.0)       4.5 (4.5-4.5)       3.4 (3.3-3.4)
S.E.                0.02                0.02                0.02                0.01

Males: t-test = 55.7; p-value < 0.001
Females: t-test = 82.6; p-value < 0.001
LCL=lower confidence limit; UCL=upper confidence limit

4.3.1.3 Distribution of BMI by sex and age-group

The boxplots of BMI distributions by age-group are depicted in figure 4.16(a) for males and

figure 4.16(b) for females. The mean BMI for males in the age-groups 15-19 yrs, 20-29 yrs and 30-

49 yrs was 18.7, 20.5 and 21.6 respectively (p<0.001). It displayed a similar increase with age for

females also with values of 19.3, 20.5 and 22.1 in the three age-groups respectively (p<0.001).

Figure 4.16(a) Boxplots of distribution of Body Mass Index by gender and age (males), 2005-06

[Boxplots of BMI for males in three age-groups.]

Statistic       15-19 yrs    20-29 yrs    30-49 yrs
No. of males    12183        21783        30990
99%             27.3         29.8         32.2
95%             23.2         26.1         28.4
Q3              20.0         22.1         23.9
Median          18.4         20.0         21.1
Q1              17.0         18.3         18.9
5%              15.2         16.4         16.6
1%              14.0         15.2         15.2
Mean            18.7         20.5         21.6
S.D.            2.6          3.1          3.7
S.E.            0.02         0.02         0.02
Variance        6.8          9.3          13.8
IQR             3.1          3.8          5.0

F-value (Pr > F): 3469 (< 0.001)
* post-hoc comparison of means between all 3 groups are statistically significant

Figure 4.16(b) Boxplots of distribution of body mass index by gender and age (females), 2005-06

[Boxplots of BMI for females in three age-groups.]

Statistic         15-19 yrs    20-29 yrs    30-49 yrs
No. of females    22813        41362        54552
99%               27.8         31.5         35.4
95%               24.1         27.0         30.6
Q3                20.8         22.2         24.8
Median            19.0         19.9         21.4
Q1                17.5         18.1         18.8
5%                15.6         16.0         16.1
1%                14.4         14.7         14.7
Mean              19.3         20.5         22.1
S.D.              2.7          3.5          4.5
S.E.              0.02         0.02         0.02
Variance          7.3          12.0         20.6
IQR               3.3          4.1          6.1

F-value (Pr > F): 4825 (< 0.001)
* post-hoc comparison of means between all 3 groups are statistically significant

4.3.2 Geographic mapping of overweight prevalence by states

Conditioned choropleth maps of prevalence of overweight along two axes – sex and place of

residence – in the age-group of 30 to 49 year old adults in different states of the country are shown

in figure 4.17. The prevalence of overweight among males varied 13-fold ranging from 2.9% in

rural Chhattisgarh to 38% in urban Punjab and among females varied 22-fold ranging from 2.5% in

rural Jharkhand to 55.9% in urban Punjab in this age-group. It was noted that BMIs were high in

rural areas of only a few states such as Punjab and Gujarat in the north and Kerala, Andhra Pradesh

and Tamil Nadu in the south. In the urban areas however, overweight was found to be more

widespread in many states, especially among females. Figures 4.18 and 4.19 depict proportions that

were overweight in the age-groups of 20-29 years and 15-19 years respectively. Prevalence of

overweight was much lower in these two age-groups. It ranged from 1.2% in rural

Meghalaya to 20.4% in urban Punjab among males and from 0.6% in rural Jharkhand to 25.7% in

urban Tamil Nadu in the 20 to 29 years age-group. In the 15 to 19 years age-group, the range was

much less, 0% to 12.1% among males and 0% to 14.3% among females. Interestingly, overweight

prevalence in these younger age-groups was high in those states that had high prevalence in the

older age-group – Punjab, Kerala and Tamil Nadu – to name a few, especially in urban areas.
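The fold variations quoted above for the 30 to 49 years age-group are simple ratios of the highest to the lowest state-level prevalence. A minimal sketch of the arithmetic, using the values reported in the text (the function name `fold_variation` is hypothetical):

```python
def fold_variation(lowest, highest):
    """Ratio of the highest to the lowest prevalence (both in the same units)."""
    if lowest <= 0:
        # A fold difference is undefined when the lowest prevalence is zero
        raise ValueError("lowest prevalence must be greater than zero")
    return highest / lowest


# Overweight prevalence (%), ages 30-49 years, lowest and highest sub-groups
males_fold = fold_variation(2.9, 38.0)     # rural Chhattisgarh vs urban Punjab
females_fold = fold_variation(2.5, 55.9)   # rural Jharkhand vs urban Punjab

print(f"Males: {males_fold:.1f}-fold; females: {females_fold:.1f}-fold")
```

For the 15 to 19 years age-group, where the lowest prevalence was 0%, a fold variation cannot be computed, which is why only the range is reported for that group.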


Figure 4.17 Mapping of proportions of adults, ages 30-49 years, overweight by state, National Family Health Survey-3 (2005-06)



4.3.3 Spatial heterogeneity

Global Moran’s I for overweight/obesity was 0.14 among males and 0.32 among females.

This revealed that there was no significant spatial autocorrelation for males and moderate spatial

autocorrelation for females. LISA maps of proportions with increased BMI in the different states

(shown in figure 4.20) revealed ‘low-low’ clustering in the northeastern states among males and

females and ‘high-high’ clustering in the northern states among females.

Figure 4.20 LISA maps showing local spatial clustering of overweight/obesity among males and females, National Family Health Survey-3 (NFHS-3) [2005-06]

Males >30yrs Females >30yrs

Global Moran’s I = 0.14 Global Moran’s I = 0.32


4.4 Dietary factors & Self-reported Diabetes

In this section, I present some results available on other factors like diet (vegetarianism and

fruit intake) and self-reported diabetes.

4.4.1 Vegetarianism

4.4.1.1 Prevalence of vegetarianism

The prevalence of vegetarianism is detailed in table 4.7. One-in-four males and females had

reported being vegetarians in the SRS 2004 survey. In the NFHS-3 survey, about one-third of the

population (27.9% of males and 36.8% of females) defined themselves as lacto-vegetarians. By

urban/rural location, the lowest prevalence was among urban males (27.5%) and highest prevalence

among rural females (37.9%). Prevalence was significantly higher among those who had completed

Grade 10 than in those with lower grades of education (χ² for trend=22.4; p<0.0001).

Table 4.7 Vegetarianism among adults aged 15 years and over in India from selected surveys

                       Vegetarianism             Lacto-vegetarianism
                       SRS 2004 (n=4.5 million)  NFHS-3, 2005-06 (n=198,754)
Characteristic         %                         %         Crude O.R. (99%CI)
Total                  n.a.                      33.4%
Sex
  Male                 24.1%                     27.9%     0.67 (0.65-0.68)
  Female               26.0%                     36.8%     1.00
Residence
  Urban                n.a.                      32.2%     0.90 (0.88-0.92)
  Rural                n.a.                      34.5%     1.00
Residence & sex
  Urban Male           n.a.                      27.5%     0.69 (0.67-0.72)
  Urban Female         n.a.                      35.4%     1.00
  Rural Male           n.a.                      28.2%     0.65 (0.62-0.67)
  Rural Female         n.a.                      37.9%     1.00
Education
  Illiterate           n.a.                      32.4%     0.73 (0.70-0.76)
  Upto Grade 5         n.a.                      29.1%     0.62 (0.59-0.65)
  Grade 6 – 10         n.a.                      31.9%     0.71 (0.67-0.75)
  Grade 11 & over      n.a.                      40.7%     1.00

n.a. = data not available; O.R. (99%C.I.) = odds ratio (99% confidence interval)

There was no association of vegetarianism with age (χ² for trend= -0.72; p<0.23) (not shown).

4.4.1.2 Geographic mapping of prevalence of vegetarianism by state

The prevalence of lacto-vegetarianism among adults aged 15 years and over was estimated

from the NFHS-3 survey. Figure 4.21 shows the conditioned choropleth map of lacto-vegetarianism

in the states along two dimensions, sex and residence. All four maps show an east-west gradient in

reported vegetarianism. In urban areas, the lowest prevalence among males was 2.7% in eastern

Arunachal Pradesh and the highest prevalence was 67.7% in western Rajasthan; among females the

lowest prevalence was 4.3% in eastern Nagaland and highest prevalence was 76% in the

northwestern states of Punjab, Gujarat and Rajasthan. In rural areas, even greater differences were

seen: among males, a 60-fold difference was noted with Nagaland having a prevalence of 1.2% and

Rajasthan having a prevalence of 83%; among females, a 50-fold difference was noted with the

prevalence ranging from 1.2% in Nagaland to 92.9% in Haryana.

4.4.1.3 Spatial heterogeneity

Global Moran’s I for spatial autocorrelation in prevalence of lacto-vegetarianism was 0.75

among males and 0.77 among females. This indicated significant spatial autocorrelation for males

and females. LISA maps of lacto-vegetarianism in the different states (shown in figure 4.22)

revealed statistically significant ‘high-high’ clustering for males and females in the northern and

western states and significant ‘low-low’ clustering in the northeastern states of India.


Figure 4.21 Prevalence of Lacto-vegetarianism in the states among adults in the National Family Health Survey-3 survey, 2005-06

Figure 4.22 LISA maps showing local spatial clustering of lacto-vegetarianism, National Family Health Survey (NFHS-3) [2005-06]

Males >15yrs

Females >15yrs

4.4.2 Fruit intake

4.4.2.1 Prevalence of at least weekly fruit consumption

Prevalence of at least weekly fruit consumption is shown in table 4.8. It was reported to be 55.5% among males and 47.7% among females in the NFHS-3 survey. In urban areas it was 62.6% and in rural areas it was 39.5%. Fruit intake increased from about 28% among those who were illiterate to nearly 75% among those who had completed grade 10 (χ² for trend=146.8; p<0.0001).

Table 4.8 Reported fruit intake (at least weekly) in National Family Health Survey (NFHS-3) [2005-06]

                          Fruit intake (at least weekly), NFHS-3 2005-06
Characteristic            %        Crude O.R. (99% CI)
Total (n=198,754)         50.4%
Sex
  Male (n=74,369)         55.5%    1.36 (1.33-1.39)
  Female (n=124,385)      47.7%    1.00
Residence
  Urban (n=95,160)        62.6%    2.58 (2.52-2.64)
  Rural (n=103,594)       39.5%    1.00
Residence & sex
4.4.2.2 Geographic mapping of fruit intake by state

Figure 4.23 shows the conditioned choropleth map of reported fruit intake by state along two

dimensions, sex and residence. In urban areas, the lowest prevalence among males was 34% in

Orissa and the highest prevalence was 87% in Karnataka; among females the lowest prevalence was

33% in Orissa and highest prevalence was 82% in Karnataka. In rural areas, the lowest prevalence

among males was 12% in Orissa and the highest was 78% in Kerala; among females, a 10-fold

difference was noted with the prevalence ranging from 8% in Orissa to 81% in Goa.


Figure 4.23 Reported fruit intake (at least weekly) in various states by sex and residence, National Family Health Survey (NFHS-3) [2005-06]


Global Moran’s I for spatial autocorrelation in prevalence of regular fruit intake was 0.41

among males and 0.27 among females. This indicated that there was moderate spatial

autocorrelation for males and no significant spatial autocorrelation for females. LISA maps of

reported fruit intakes in the different states (shown in figure 4.24) revealed some statistically

significant ‘high-high’ clustering predominantly in the southern states for males and females and

‘low-low’ clustering in the north central states.

Figure 4.24 LISA maps showing local spatial clustering of fruit intake among males and females, National Family Health Survey (NFHS-3) [2005-06]

Males >15yrs

Females >15 yrs


4.4.3 Diabetes

4.4.3.1 Prevalence of diabetes

The prevalence of diabetes among those aged 30 years and over is shown in table 4.9. The

prevalence among males was 2.81% and that among females was 2.03%. The prevalence among

urban males and females (3.84% and 2.88% respectively) was higher than that seen among rural

males and females (1.77% and 1.28% respectively). It was found to be directly associated with the

level of education (χ² for trend = 270; p< 0.0001), probably indicating an awareness bias.

Table 4.9 Self-reported diabetes prevalence in National Family Health Survey (NFHS-3) [2005-06]

                         Self-reported diabetes, NFHS-3 (2005-06)
Characteristic           %       O.R. (99% CI)
Total (n=93,213)         2.3%
Sex
  Male (n=37,064)        2.8%    1.40 (1.25-1.56)
  Female (n=56,149)      2.0%    1.00
Residence
  Urban (n=45,064)       3.3%    2.27 (2.02-2.56)
  Rural (n=48,149)       1.5%    1.00
Residence & sex
4.4.3.2 Geographic mapping of diabetes prevalence in states

The distribution of diabetes prevalence across different states is shown as a conditioned

choropleth map by sex and residence in Figure 4.25. Diabetes prevalence was found to be low

among rural residents and high among urban residents, in both males and females, especially in the

southern states.


Figure 4.25 Prevalence of self-reported diabetes in different states among adults aged 30 years & over by sex and residence, National Family Health Survey (NFHS-3) [2005-06]

Males >30yrs Females >30 yrs

Figure 4.26 LISA maps showing local spatial clustering of self-reported diabetes among males and females, National Family Health Survey (NFHS-3) [2005-06]


Global Moran’s I for spatial autocorrelation in prevalence of self-reported diabetes was 0.20 among males and 0.18 among females, indicating no significant global spatial autocorrelation in either sex. LISA maps of self-reported diabetes prevalence in the different states (shown in figure 4.26) nevertheless revealed statistically significant ‘high-high’ clustering predominantly in the southern states for males and females and a ‘high-low’ outlier in the northern state of Punjab for males.
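For readers unfamiliar with the statistic, global Moran’s I is I = (n / W) · Σᵢⱼ wᵢⱼ (xᵢ − x̄)(xⱼ − x̄) / Σᵢ (xᵢ − x̄)², where wᵢⱼ are spatial weights and W is their sum. The following is a minimal sketch of that formula; the four-area layout, the binary contiguity weights and the prevalence values are hypothetical and are not the state data or the weights used in this study.

```python
def morans_i(values, weights):
    """Global Moran's I for area values x_i and a spatial weights matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-product of deviations over all pairs of areas, weighted by w_ij.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    ss = sum(d * d for d in dev)
    return (n / w_sum) * (cross / ss)


# Hypothetical example: four areas arranged in a row (1-2-3-4),
# with binary weights marking areas that share a border.
prevalence = [1.0, 2.0, 3.0, 4.0]
contiguity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(round(morans_i(prevalence, contiguity), 3))
```

Values near zero indicate no global pattern; significance is normally judged against a permutation or normal reference distribution, which this sketch does not implement.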



4.5 Ecologic association between selected risk factors and CVD mortality

I had shown the importance of individual risk factors in cardiovascular mortality in the

introduction and literature review sections. Subsequently, the study datasets and analytic techniques

were outlined in the methodology section. Earlier in the results section, I described the distribution

and correlates of selected risk factors (smoking, body mass and to a lesser extent, diet and self-

reported diabetes). In this subsection, I proceed to explore the association between these selected

risk factors and cardiovascular mortality at the ecologic level.

4.5.1 Cardiovascular mortality

4.5.1.1 Geographic mapping of cardiovascular death rate

The mean cardiovascular death rate among males was 308 per 100,000 and the mean

cardiovascular death rate among females was 198 per 100,000. The rates among males ranged from

about 180 per 100,000 in Mizoram to over 400 per 100,000 in Tamil Nadu and Andhra Pradesh.

Among females, the rates varied from below 100 per 100,000 in Mizoram and Haryana to about 240

per 100,000 in Punjab and Andhra Pradesh. The mapping of these vascular death rates is shown in

figure 4.27.

Most southern states (such as Andhra Pradesh, Kerala and Tamil Nadu), and a few in the

east (West Bengal) and north (Punjab) had relatively higher vascular death rates among both males

and females.

Figure 4.27 Age-standardized vascular death rate per 100,000 males and females, ages 30-69 years, in states of India [Million Deaths Study, 2006]

Males

Females

4.5.2 Ranking of states

Ranking of states by cardiovascular death rate (in descending order) and comparing it to the ranking

for the selected risk factors is shown for males and females separately in table 4.10(a) and table

4.10(b) respectively.

Table 4.10(a). Ranking of states by outcome (CVD death rate, highest to lowest) for middle-aged males, India, 2006

                     Outcome     Risk factors
State                CVD deaths- Smoking  Overweight  Vegetarianism  Fruit intake  Diabetes
                     males
Andhra Pradesh       1           22       3           15             8             4
Tamil Nadu           2           20       7           17             2             7
Punjab               3           26       1           5              5             10
Kerala               4           16       2           20             1             2
Goa                  5           29       8           18             3             1
Assam                6           8        25          24             19            23
West Bengal          7           4        20          23             26            6
Karnataka            8           23       12          11             4             14
Maharashtra          9           27       17          12             13            11
Gujarat              10          25       10          3              16            22
Madhya Pradesh       11          14       23          6              17            27
Haryana              12          6        4           2              10            15
Manipur              13          10       18          25             6             12
Delhi                14          19       28          8              11            8
Chhattisgarh         15          9        11          13             25            13
Tripura              16          3        15          27             22            3
Nagaland             17          11       24          29             23            19
Jharkhand            18          28       26          19             28            17
Himachal Pradesh     19          17       6           4              7             24
Orissa               20          21       13          21             29            9
Arunachal Pradesh    21          18       22          28             21            26
Jammu & Kashmir      22          5        16          10             14            25
Uttaranchal          23          13       19          9              12            20
Bihar                24          24       27          14             24            18
Rajasthan            25          7        14          1              27            29
Uttar Pradesh        26          12       9           7              18            21
Meghalaya            27          2        29          22             9             16
Sikkim               28          15       5           16             15            5
Mizoram              29          1        21          26             20            28


For males, the five states with the highest CVD death rates tended to rank high for prevalence of overweight and diabetes and low for vegetarianism, while the states with the lowest death rates tended to show the reverse pattern. Contrary to expectation, however, the states with high death rates also ranked high for fruit intake and low for smoking. For females, states with high or low CVD death rates had somewhat corresponding high or low levels of overweight and diabetes prevalence; however, smoking and dietary factors were not clearly linked.

Table 4.10(b). Ranking of states by outcome (CVD death rate, highest to lowest) for middle-aged females, India, 2006

                     Outcome     Risk factors
State                CVD deaths- Smoking  Overweight  Vegetarianism  Fruit intake  Diabetes
                     females
Nagaland             1           26       27          29             16            20
Andhra Pradesh       2           24       4           15             9             6
West Bengal          3           18       8           24             23            4
Punjab               4           25       1           2              21            11
Goa                  5           23       14          16             1             3
Arunachal Pradesh    6           9        25          27             20            28
Maharashtra          7           19       22          12             8             18
Orissa               8           15       18          19             29            23
Tamil Nadu           9           29       5           18             6             1
Assam                10          21       15          26             22            27
Jharkhand            11          17       26          20             27            12
Jammu & Kashmir      12          6        7           10             12            25
Madhya Pradesh       13          16       19          6              17            17
Gujarat              14          22       10          5              13            10
Delhi                15          13       16          7              3             5
Manipur              16          7        17          21             4             19
Tripura              17          3        20          22             18            7
Karnataka            18          27       13          11             2             24
Bihar                19          5        24          14             19            8
Kerala               20          28       2           23             5             2
Chhattisgarh         21          2        21          13             24            22
Uttaranchal          22          12       12          9              14            15
Uttar Pradesh        23          11       11          8              26            26
Himachal Pradesh     24          20       3           4              10            13
Rajasthan            25          8        23          3              28            29
Meghalaya            26          14       29          28             7             14
Sikkim               27          4        9           17             11            9
Haryana              28          10       6           1              25            16
Mizoram              29          1        28          25             15            21


Pearson correlation coefficients between male and female ranks for the 29 states that had information on death rates and risk factors are shown in table 4.11.

Table 4.11 Correlations between male vs female ranks for 29 states

Variable          Correlation between male and female state ranks
Smoking           0.65*
Overweight        0.87*
Vegetarianism     0.99*
Fruits            0.80*
Diabetes          0.76*
CVD death         0.44**

* p < 0.001; ** p < 0.01

Correlations between male and female ranks in the states were high for the selected risk

factors (ranging between 0.65 for smoking to 0.99 for vegetarianism). The correlation between state

ranks for cardiovascular death rate was moderate at 0.44.
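A Pearson coefficient computed on ranks (with no ties) is the same quantity as Spearman’s rank correlation. The sketch below shows the calculation; the five pairs of ranks are hypothetical illustrations, not the state ranks from tables 4.10(a) and 4.10(b).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ranks of five areas on the same variable for each sex.
male_rank = [1, 2, 3, 4, 5]
female_rank = [2, 1, 3, 5, 4]
rank_correlation = pearson(male_rank, female_rank)
print(round(rank_correlation, 2))
```

A coefficient near 1 means the two sexes order the areas almost identically; a moderate value means that areas ranked high for one sex are often ranked differently for the other.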

4.5.3 Univariate regression analysis

The ecologic association between selected risk factors (for adults aged 15 years and over)

and the outcome variable of cardiovascular death rate (for adults aged 30 to 69 years from the

Million Death Study) was analyzed across 29 states. Cardiovascular death rates for males and

females were regressed on the following set of six predictor variables:

Percent urbanization – from Census 2001

Smoking prevalence, lacto-vegetarianism prevalence, regular fruit intake prevalence,

overweight prevalence and diabetes prevalence -- from NFHS-3 survey.

The univariate analyses for males and females are shown in Table 4.12.


Table 4.12 Correlation between selected risk factors and CVD mortality by sex, in 29 states of India

                          Cardiovascular death rate per 100,000 (ages 30-69 yrs)
                          Males                          Females
Predictors                Regression           R2        Regression           R2
                          coefficient                    coefficient
                          (p-value)                      (p-value)
Census 2001
  % Urbanization          0.79 (0.23)          0.05      -0.11 (0.84)         0.01
NFHS-3 survey 2005-06
  Smoking                 0.12 (0.89)          0.00      -6.50 (0.01)*        0.20
  Overweight              5.81 (0.03)*         0.17      3.49 (0.02)*         0.21
  Vegetarianism           -0.68 (0.14)         0.08      0.01 (0.98)          0.01
  Regular fruit intake    1.13 (0.09)          0.11      0.08 (0.89)          0.00
  Self-reported diabetes  17.38 (0.01)*        0.29      17.80 (0.03)*        0.17

* statistically significant

Among males, the vascular death rate was positively and significantly associated with the prevalence of overweight and diabetes. The percent of variance explained was lowest for smoking (0%) and highest for self-reported diabetes (29%). Among females, states with high vascular death rates had significantly lower female smoking prevalence and significantly higher prevalence of overweight and diabetes. The percent of variance explained was lowest for fruit intake (0%) and highest for overweight (21%).
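Each row of table 4.12 reports a slope and an R² from regressing the state death rate on a single predictor. A minimal sketch of such a single-predictor fit is given below, assuming ordinary least squares; the four data points are hypothetical and are not taken from the study datasets.

```python
def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x, returning R-squared too."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, r_squared


# Hypothetical state-level values: predictor prevalence (%) and
# death rate per 100,000.
predictor_pct = [5.0, 10.0, 15.0, 20.0]
death_rate = [250.0, 300.0, 310.0, 360.0]
slope, intercept, r_squared = simple_ols(predictor_pct, death_rate)
print(f"slope={slope:.2f}, intercept={intercept:.1f}, R2={r_squared:.2f}")
```

The slope is read as the change in deaths per 100,000 for each one-percentage-point change in the predictor, and R² as the share of between-state variance in the death rate that the predictor accounts for.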

4.5.4 Multivariate Regression

Multivariate Linear Regression

Before modeling, I plotted CVD death rates against each risk factor to check that the relationships were approximately linear for males (figure 4.28a) and females (figure 4.28b). As a second step, I examined the pairwise correlations between the study variables for males and females (table 4.13) to check for multicollinearity. Since the plots suggested mostly linear relationships and no Pearson correlation coefficient exceeded 0.60, I used all the parameters to create the model. The outputs from this modeling are shown in table 4.14a (for males) and table 4.14b (for females).
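To relate the 0.60 correlation screen to the variance inflation factors reported with the models, note that the VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on the others. With only two predictors, R² is simply the squared pairwise correlation. The sketch below illustrates this special case only; the function name is illustrative, and VIFs from a six-predictor model can be larger than any pairwise value suggests.

```python
def vif_from_r_squared(r_squared):
    """Variance inflation factor for a predictor whose regression on the
    other predictors has the given R-squared."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R-squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# Two-predictor case: R-squared equals the squared pairwise correlation.
screening_cutoff = 0.60
vif_at_cutoff = vif_from_r_squared(screening_cutoff ** 2)
print(round(vif_at_cutoff, 4))
```

A pairwise correlation at the screening cut-off therefore corresponds to a VIF far below the conventional threshold of 10.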


The study variables explained 49% of the variation in cardiovascular death rates among

males (R2 = 0.487) and 43% of the variation among females (R2 = 0.431). Among males, prevalence

of overweight in states was positively associated with cardiovascular death rates and the level of

vegetarianism was negatively associated with the cardiovascular death rates; the former was

statistically significant (p<0.03) and the latter tended towards significance (p<0.09). Among

females, vascular deaths were not significantly associated with the study variables at the state level.

The analysis of variance showed Prob>F-value to be significant for males (0.014) and not

significant for females (0.057).

I checked the assumptions of linear regression. The SPEC option yielded Pr>Chi-sq of 0.85 for males and 0.56 for females, giving no evidence against the error terms being independent and identically distributed. The Durbin-Watson statistic was between 1.6 and 2.4 (2.2 for males and 1.7 for females), indicating that the residuals were not serially correlated. The Shapiro-Wilk test for normality showed p-values of 0.57 for males and 0.36 for females, consistent with the error terms coming from a normal distribution; the normal probability plots supported this in both cases.

The variance inflation factors were all below the cut-off of 10, indicating that the predictors were not strongly collinear. Lastly, the Cook’s D values were all <2, suggesting that no single state unduly influenced the model.
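The Durbin-Watson statistic used above is d = Σₜ (eₜ − eₜ₋₁)² / Σₜ eₜ², computed on the residuals in their listed order; values near 2 suggest no first-order serial correlation. A minimal sketch follows, using hypothetical residuals rather than those of the fitted models.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals taken in their given order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residual sequences.
alternating = [1, -1, 1, -1]   # neighbours differ in sign: negative correlation
clustered = [1, 1, -1, -1]     # neighbours resemble each other: positive correlation
print(durbin_watson(alternating), durbin_watson(clustered))
```

For cross-sectional data such as states, the value depends on the order in which the observations are listed, so it is best read alongside the other diagnostics.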

Visual inspection of outliers carried out on scatterplots revealed that no single state appeared

to be an outlier for males; for females however, Mizoram appeared to be an outlier – it had high

female smoking prevalence (over 15%) and high level of urbanization (over 40%) but a relatively

low vascular death rate (~50/100,000). Exclusion of Mizoram from the ecologic analysis however

did not significantly alter the model outcome.

Poisson Regression

Exploring the ecologic association by Poisson regression also revealed the same results (tables

shown in appendix). Among males, prevalence of overweight in states was positively associated

with cardiovascular death rates and the level of vegetarianism was negatively associated with the

cardiovascular death rates; the former was statistically significant (p<0.02) and the latter tended

towards significance (p<0.07). Among females, vascular deaths were not significantly associated

with the study variables at the state level. I used the PSCALE option to adjust for overdispersion

(wherein variance was greater than mean) and thereby obtain a better fit of the model (wherein the

SCALED DEVIANCE equaled 1.0).
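The overdispersion adjustment can be sketched as follows, assuming the usual quasi-Poisson approach in which the dispersion is estimated as the Pearson chi-square divided by its degrees of freedom and standard errors are multiplied by the square root of that estimate. The observed counts, fitted means and standard error below are hypothetical and do not come from an actual model fit.

```python
import math


def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    chi_square = sum((o - m) ** 2 / m for o, m in zip(observed, fitted))
    degrees_of_freedom = len(observed) - n_params
    return chi_square / degrees_of_freedom


def scale_standard_error(standard_error, dispersion):
    """Inflate a Poisson-model standard error for overdispersion."""
    return standard_error * math.sqrt(dispersion)


# Hypothetical death counts and Poisson fitted means for four areas,
# from a model with two parameters (intercept and one slope).
observed_deaths = [8, 35, 10, 50]
fitted_deaths = [16.0, 25.0, 20.0, 40.0]
dispersion = pearson_dispersion(observed_deaths, fitted_deaths, n_params=2)
adjusted_se = scale_standard_error(0.10, dispersion)
print(round(dispersion, 2), round(adjusted_se, 3))
```

A dispersion well above 1 signals that the variance exceeds the mean; the scaling leaves the coefficient estimates unchanged and widens their confidence intervals.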


Figure 4.28a Plots of vascular death rate per 100,000 males against predictor variables (checking for linearity of relationship)

[Six scatterplots with fitted lines; y-axis: vascular death rate per 100,000 males; x-axis: predictor variable]
% Urban: y = 0.794x + 278.77
Male smoking prevalence %: y = 0.1183x + 296.78
Lacto-vegetarianism prevalence %: y = -0.6757x + 319.26
% Regular fruit intake: y = 1.1292x + 242.07
% Males with BMI≥25: y = 5.8056x + 230.39
Male diabetes prevalence %: y = 17.384x + 254.2

Figure 4.28b Plots of vascular death rate per 100,000 females against predictor variables (checking for linearity of relationship)

[Six scatterplots with fitted lines; y-axis: vascular death rate per 100,000 females; x-axis: predictor variable]
% Urban: y = -0.1076x + 176.16
Female smoking prevalence %: y = -6.5572x + 192.27
Lacto-vegetarianism prevalence %: y = 0.0092x + 172.82
% Regular fruit intake: y = 0.0797x + 169.32
% Females with BMI≥25: y = 3.4845x + 112.49
Female diabetes prevalence %: y = 17.798x + 138.57

Table 4.13. Pearson correlation coefficients between variables at state level (to check for multicollinearity), males and females

Males
                   % Urban  ♂Smoking %  ♂Lacto-veg %  ♂Fruit intake %  ♂Overweight %  ♂Diabetes %
% Urban            1.00     -0.11       0.15          0.36             -0.04          0.19
♂Smoking %                  1.00        -0.20         -0.22            -0.31          -0.24
♂Lacto-veg %                            1.00          0.16             0.31           -0.37
♂Fruit intake %                                       1.00             0.47           0.40
♂Overweight %                                                          1.00           0.42
♂Diabetes %                                                                           1.00

Females
                   % Urban  ♀Smoking %  ♀Lacto-veg %  ♀Fruit intake %  ♀Overweight %  ♀Diabetes %
% Urban            1.00
♀Smoking %         -0.01    1.00
♀Lacto-veg %       0.18     -0.19       1.00
♀Fruit intake %    0.47     -0.19       -0.16         1.00
♀Overweight %      0.03     -0.41       0.43          0.14             1.00
♀Diabetes %        0.34     -0.23       -0.24         0.50             0.41           1.00

Table 4.14(a). Multiple linear regression of cardiovascular death rates among males at state level

Parameter Estimates
Variable                  DF   Parameter estimate   Pr > |t|   Variance inflation
Intercept                 1    160.12               0.005      0
Smoking                   1    0.83                 0.279      1.19
Overweight                1    7.50                 0.029      2.20
Veg_%                     1    -0.96                0.086      2.05
Fruit intake              1    0.04                 0.958      1.63
Self-reported diabetes    1    5.16                 0.501      2.37
Urban_%                   1    1.03                 0.116      1.40

Analysis of Variance
F value = 3.48 (Pr > F = 0.014); Model (degrees of freedom) = 6
R-Square = 0.487; Adj R-Sq = 0.347
Test of first and second moment specification: Chi-Square = 21.3 (p = 0.85)
Durbin-Watson statistic = 2.2

Tests for Normality
Shapiro-Wilk statistic (W) = 0.97 (p = 0.57); Skewness = 0.42; Kurtosis = 0.15
[Normal probability plot of residuals]

Validation of assumptions of linear regression:
1) Variance inflation factors (for multicollinearity): values < 10, hence variables were not correlated
2) SPEC option (Pr>Chi-sq of 0.85 for ♂ and 0.56 for ♀) -- error terms independent and identically distributed
3) Durbin-Watson statistic: values between 1.6 and 2.4 indicate data were independent, not correlated & adequate sample size
4) Shapiro-Wilk test (p-values of 0.58 for ♂ and 0.36 for ♀) and normal probability plot -- error terms from a normal distribution
5) Cook’s D statistic values <2 -- no outliers

Table 4.14(b). Multiple linear regression of cardiovascular death rates among females at state level

Parameter Estimates
Variable                  DF   Parameter estimate   Pr > |t|   Variance inflation
Intercept                 1    195.36               0.002      0
Smoking                   1    -4.76                0.083      1.50
Overweight                1    2.29                 0.281      2.57
Veg_%                     1    -0.25                0.527      2.19
Fruit intake              1    -0.62                0.355      1.89
Self-reported diabetes    1    14.0                 0.219      2.38
Urban_%                   1    0.11                 0.858      2.04

Analysis of Variance
F value = 2.77 (Pr > F = 0.057); Model (degrees of freedom) = 6
R-Square = 0.431; Adj R-Sq = 0.275
Test of first and second moment specification: Chi-Square = 27.3 (p = 0.56)
Durbin-Watson statistic = 1.9

Tests for Normality
Shapiro-Wilk statistic (W) = 0.96 (p = 0.36); Skewness = 0.76; Kurtosis = 1.08
[Normal probability plot of residuals]

Validation of assumptions of linear regression:
1) Variance inflation factors (for multicollinearity): values < 10, hence variables were not correlated
2) SPEC option (Pr>Chi-sq of 0.85 for ♂ and 0.56 for ♀) -- error terms independent and identically distributed
3) Durbin-Watson statistic: values between 1.6 and 2.4 indicate data were independent, not correlated & adequate sample size
4) Shapiro-Wilk test (p-values of 0.58 for ♂ and 0.36 for ♀) and normal probability plot -- error terms from a normal distribution
5) Cook’s D statistic values <2 -- no outliers

4.6 Biases & Limitations

Having shown the results of the descriptive geographical epidemiology of cardiovascular

risk factors and mortality, I proceed in this section to study the biases and limitations in this study.

4.6.1 Assessment of representativeness of surveys

All surveys were assessed by comparing the sex ratios (number of females per 1000 males)

and age-sex distributions to an external comparison, the 2001 census population. NFHS-3 was

excluded in the comparisons because it over-sampled women by intent in its study design.

4.6.1.1 Sex ratio

The sex ratios in selected surveys are tabulated in table 4.15. The two risk factor surveys

(SFMS & NFHS-2) had sex ratios of 946 and 965 per 1000 that were relatively more favourable to

females in the entire population as compared to the census sex ratio of 933. These ratios were even

higher in those aged 15 years and over. By residence, the sex ratios were relatively more favourable

for females in urban than in rural areas as per the 2001 census; this was however reversed in SFMS

and NFHS-2 surveys where the sex ratios were higher in rural locations than in urban communities.

SRS provided no data for such comparisons.

In the MDS, the sex ratio was skewed unfavourably towards women for two reasons: (i) sampling was based on mortality, and (ii) there were more male deaths than female deaths.
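The sex ratio used throughout this subsection is the number of females per 1000 males. As a small worked example, the sketch below applies that definition to the respondent sex shares reported for SFMS and NFHS-2 in table 4.16; because those shares are rounded percentages, the results only approximate the tabulated ratios.

```python
def sex_ratio(females, males):
    """Number of females per 1000 males."""
    return 1000.0 * females / males


# Respondent sex shares (%) as reported in table 4.16.
sfms_ratio = sex_ratio(females=49.0, males=51.0)
nfhs2_ratio = sex_ratio(females=49.6, males=50.4)
print(round(sfms_ratio), round(nfhs2_ratio))
```

Ratios above the census figure indicate that females are over-represented relative to the census population, and ratios below it that they are under-represented.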

Table 4.15 Sex ratios (no. of females per 1000 males) in selected surveys in comparison to the census 2001 population

                                  Survey (year)                        Comparison
Variable                          SFMS     NFHS-2     MDS        Census
                                  1998     1998-99    2001-03    2001
Sex ratio (India)
  All ages (urban & rural)        946      965        807        933
  Age ≥ 15 yrs (urban & rural)    960      985        761        941
  Urban                           934      959        -          960
  Rural                           968      998        -          897

4.6.1.2 Age-sex distribution

The age-sex pyramids of the risk factor survey populations in comparison with the census

population are shown in Figure 4.29. Overall, the age-sex distributions of the three surveys

appeared broadly similar in proportions to the census population by 5-year age groups. All age-sex

distributions were characterized by a broad base and a narrow apex in the age-group of interest

(ages 15 years and over). Overlaying of individual age-sex pyramids over the census age-sex

pyramid (not shown here) revealed minor discrepancies in some age-groups (such as over-

representation of women in the 50-54 year age-group in the SRS 2004 survey).


Figure 4.29 Age-sex pyramids of selected surveys in comparison with Census 2001 population

[Four age-sex pyramids showing percent of males and females by 5-year age group, from 0-4 years to 80+ years: Census 2001; SFMS 1998; NFHS-2 (1998-99); SRS 2004]

4.6.2 Integrity of surveys

All of the selected surveys were recent (within the last decade), large (0.2 million to 4.5

million) and nationally-representative (covering almost all states and over 99% of the country’s

population) with few missing values. They were all household interviews or face-to-face individual

interviews covering males and females over the age of 15 years, non-institutionalized and living in

private households. They utilized uni-stage or multi-stage sampling designs with probability

methods being implemented at the sampling stage of design to ensure that a representative sample

of the target population was obtained. All the surveys also had several hundreds of small geographic

areas as the primary sampling units in each state proportionate to urban and rural populations and

were sampled based on the fertility and/or infant mortality levels for each state.

All surveys were well-conducted surveys based on the survey metric of response rates [117].

The risk factor surveys achieved response rates over 92% and the mortality outcome survey

achieved response rate of about 85%. The availability of large sample size however does not

guarantee against any potential bias from non-response.

All surveys were also high-quality in terms of having low numbers of missing observations.

The datasets had internal policies for identifying missing values by special responses or codes that

enabled the handling of missing observations to produce high-quality datasets for analysis in a

coherent and consistent form. As a result, it was noted that all the survey datasets had few missing

values (<1%) for the demographic variables of interest (age, sex, education and residence);

education was the only item for which about 8% had missing values in the Million Death Study.

Similarly among CVD predictors (smoking, diet, overweight and self-reported diabetes), it was seen

that BMI was the only item that had 7% missing values in some sub-groups (males) in NFHS-3

while all other items had <5% missing values in all surveys.

The validation of these datasets against an external comparison, the census population,

revealed differences in demographic characteristics between individual studies. Two risk factor

surveys (SFMS and NFHS-2) had an under-representation of males while the NFHS-3 had over-sampled females by intent in study design. In addition, the National Family Health Surveys-2 & 3

(especially the most recent survey) had a relatively higher proportion of females with higher

education and from urban locations. The Million Death Study however had an under-representation

of females due to differential follow-up of individuals for vital status (dead or alive). Completeness

of reporting of deaths has earlier been documented to be around 82% for females and 87% for males


[98]. Comparison of age-sex structures to the census population revealed that all surveys were to a

large extent comparable except for minor variations in some age-sex groups.

These survey characteristics had two major implications. Firstly, they covered from 1998 to

2006, allowing for the standardization to the census 2001 population or the estimated 2006

population for comparability. Secondly, the large sample sizes and the comprehensive nature of the

datasets lent credence to the robustness of the study findings observed.
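The standardization mentioned above can be illustrated with direct standardization, in which age-specific rates are averaged using the age distribution of a standard population as weights. The sketch below assumes that method; the four age bands, rates and weights are hypothetical and are not the census 2001 figures.

```python
def directly_standardized_rate(age_specific_rates, standard_population):
    """Weighted average of age-specific rates, using the standard
    population's age distribution as weights."""
    total = sum(standard_population)
    weighted = sum(
        rate * pop for rate, pop in zip(age_specific_rates, standard_population)
    )
    return weighted / total


# Hypothetical death rates per 100,000 for ages 30-39, 40-49, 50-59, 60-69,
# and a hypothetical standard population (in any consistent unit).
rates_per_100k = [50.0, 200.0, 600.0, 1500.0]
standard_pop = [40, 30, 20, 10]
standardized = directly_standardized_rate(rates_per_100k, standard_pop)
print(standardized)
```

Applying the same standard weights to every survey or state removes differences that arise only from differing age structures.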

Overall, these survey characteristics and survey metrics attest to the internal validity and

reasonably good representativeness of the different surveys [117].

4.6.3 Study characteristics

Potential sources of bias in the different datasets were examined by reviewing characteristics

of respondents, survey instruments and interviewers (table 4.16).

Survey instrument characteristics

The survey questions were different in the three surveys that had individual-level data.

There were filter questions on smoking in the SFMS survey followed by additional separate

questions on beedi and cigarette use. NFHS-3 on the other hand had no filter question but only a

common question on cigarette or beedi use. Questions on dietary factors were asked only in the

NFHS-3 survey. The frequency of dietary intake of specific food items such as fruits, eggs, milk,

fish and meat/chicken was questioned and this was used to examine fruit consumption and

vegetarianism. Self-reporting of diabetes and measurement of individual height and weight to

compute body mass index were included only in the NFHS-3 survey.

Respondent characteristics

The NFHS-3 survey obtained responses directly from all household members, unlike the other surveys, which obtained them from proxy respondents: in most cases the head of the household (as in SFMS & SRS) or any female respondent (as in NFHS-2). In addition, by design the participants in NFHS-3 were younger, more often from urban areas and relatively more educated.

Interviewer characteristics

All four surveys employed trained non-medical field workers as interviewers.


Table 4.16: Potential sources of bias based on characteristics of respondents, survey instruments and interviewers in the four surveys

                              Risk factor surveys
Characteristics               SFMS         NFHS-2       SRS      NFHS-3
Respondent
  Type of respondent          Proxy        Proxy        Proxy    Self
  Sex
    Male                      51.0%        50.4%        n/a      37.4%
    Female                    49.0%        49.6%        n/a      62.6%
  Age (yrs) - Mean/ median    35.8/ 33.0   35.6/ 32.0   n/a      29.8/ 29.0
  Education
    Illiterate                38.4%        35.5%        n/a      25.4%
    Upto Grade 5              27.8%        17.4%        n/a      14.7%
    Upto Grade 10             26.4%        31.9%        n/a      37.0%
    Grade 11 & over           7.4%         15.2%        n/a      22.9%
  Residence
    Urban                     23.5%        33.4%        n/a      47.9%
    Rural                     76.5%        66.6%        n/a      52.1%

Survey instrument
  Smoking: screening question
    SFMS: Does this person smoke? - yes/no
    NFHS-2: Does anyone smoke? - yes/no
    SRS: n/a
    NFHS-3: n/a
  Tobacco type
    SFMS: What do they smoke? 1. cigarette, 2. beedi, 3. other
    NFHS-2: n/a
    SRS: n/a
    NFHS-3: Do you smoke 1. cigarette/ beedi, 2. pipe, 3. other
  Starting age
    SFMS: At what age?
    NFHS-2: n/a
    SRS: n/a
    NFHS-3: n/a
  Dietary determinants
    SFMS: n/a
    NFHS-2: n/a
    SRS: n/a
    NFHS-3: How often do you consume these foods (fruits, milk/egg/ fish/meat)? 1=daily, 2=weekly, 3=occasionally, 4=never

Interviewer
  All four surveys: non-medical trained worker

n/a = data not available; totals may not add up to 100 because of missing observations

4.6.3.1 Effect of differences in survey questions

Type of tobacco smoked

The survey questions were different in the three surveys which had individual-level data. I

therefore graphed the proportion of population using cigarette, beedi or other forms of tobacco in

SFMS and NFHS-3 surveys to examine the effect of differences in survey questions (Figure 4.30).

The SFMS survey provided proportions using beedi and cigarette separately while the NFHS-3

survey provided a combined value for those smoking cigarette or beedi. All other forms of smoking

were relatively less common according to both surveys.

Figure 4.30 Comparison of proportions of adult males, ages 15-54 years, smoking different types of tobacco in selected surveys

SFMS, 1998 (%): Cigarette 6.2; Beedi 17.4; Other 2.3
NFHS-3, 2005-06 (%): Cigarette/beedi 30.6; Pipe 1.23; Other 0.81

4.6.3.2 Effect of type of interviewers

All surveys had non-medical field workers who were trained in survey-specific methods and

interview techniques. This involved description of question-by-question specifications for the study

instrument along with instructions for ‘dos and don’ts’ for the interview. For SFMS and SRS

surveys, these field workers were all permanent employees of the central government; for the NFHS

surveys, these were typically research assistants of central or state government departments in each

state. All workers typically had completed 10-12 years of formal education and were very fluent in


the vernacular language of each state and also well aware of the socio-cultural context within each

state. So there were minimal differences between workers across surveys.

4.6.3.3 Effect of type of respondent

Age-specific prevalence of smoking

Figure 4.31 shows age-specific prevalence of smoking in the three different surveys. While the age-specific prevalences were similar above 30 years of age, those in the 15 to 29 year age-group were higher in the NFHS-3 survey than in the other two surveys. The question on smoking was put directly to each male in the household in NFHS-3, whereas the information was obtained from other household members in the other two surveys.

Figure 4.31 Relationship of reporting bias for smoking with type of respondent in selected surveys in India

[Line chart of smoking prevalence (%) by 5-year age-group (15-19 to 65-69 years) for SFMS 1998 (proxy respondents), NFHS-2 1998-99 (proxy respondents) and NFHS-3 2005-06 (self-respondents)]

4.6.4 Differences in sociodemographic characteristics

The association of various risk factors with sociodemographic characteristics such as place of

residence and education is not uniform but complex. This is illustrated in Figure 4.32 for place of

residence (urban/rural) and in Figure 4.33 for education. Prevalence proportions for beedi and


cigarette smoking were estimated for those aged 40 years and over from the SFMS 1998-99 survey;

prevalence for other risk factors were from the NFHS-3 survey, vegetarianism and fruit intake

prevalence being estimated among those aged 15 years and over, while overweight and diabetes

prevalence were estimated among those aged 30 years and over.

Beedi and cigarette smoking had opposite relationships with place of residence. Overweight and

diabetes were uniformly higher in urban than in rural areas. Vegetarianism and fruit consumption

also had opposite relationships with place of residence.

Figure 4.32 Prevalence of various risk factors by urban-rural residence

[Bar chart; y-axis: prevalence %; x-axis: risk factors]

Risk factor             Urban    Rural
Beedi                   16.1     31.5
Cigarette               13.6     5.6
Overweight              32       12.2
Vegetarianism           32.2     34.5
Fruit intake (weekly)   62.6     39.5
Diabetes                2.7      1.9

At higher levels of education (as a proxy for socioeconomic status), it was seen that there was lower

level of beedi smoking but higher level of cigarette smoking. Further, this educational group had

higher levels of protective factors (higher vegetarianism and higher fruit intake) but also higher

levels of adverse risk factors (such as higher body mass). The higher level of reported diabetes in this educational group was probably due to awareness bias. These opposing variations in prevalence of risk factors by

education could impact cardiovascular death rates in transitional populations in complex ways and

could partly explain the paucity of associations in the ecologic analyses.


Figure 4.33 Prevalence of risk factors by education

[Four panels showing prevalence (%) by education level (Illiterate, Up to Grade 5, Grades 6-10, > Grade 10): beedi and cigarette smoking; vegetarianism and fruit intake; overweight; diabetes]

The complex and confounding effects of education and urbanization among males and females

could therefore be mediated through these different distributions of risk factors leading on to the

observed differences in cardiovascular mortality outcomes between males and females. This implies that the focus of health policy may need to differ between subgroups in India. Figure 4.34

illustrates the differences in distributions of risk factors for the extremes of subgroups by sex, level

of education and place of residence. It looks at four subgroups of middle-aged Indians – urban

males or females with greater than secondary (grade 10) education and rural males or females with

less than primary (grade 5) education. Data on smoking was from the 1998-99 SFMS survey and

data on other four risk factors was from the 2005-06 NFHS-3 survey. From the illustration, it

appears that smoking is a major risk factor among males (rural and urban), overweight is a major

risk factor among urban residents (females more than males) and diabetes is relatively more

common among urban residents.


Figure 4.34 Distribution of CVD determinants* by residence-sex-education group among adults (aged 15 years and over) in India

[Figure: four horizontal bar-chart panels, one per subgroup; x-axis 0% to 90%; y-axis CVD determinants. Values as labelled in each panel:]

Urban females with > grade 10 education (population ≈ 25 million): diabetes 2.7%; smoking 0.4%; overweight/obesity 45.7%; lactovegetarianism 46.7%; weekly fruit intake 83.0%

Rural females with < grade 5 education (population ≈ 250 million): diabetes 1.1%; smoking 2.6%; overweight/obesity 10.0%; lactovegetarianism 38.7%; weekly fruit intake 26.4%

Urban males with > grade 10 education (population ≈ 80 million): diabetes 3.5%; smoking 25.2%; overweight/obesity 34.2%; lactovegetarianism 35.2%; weekly fruit intake 77.4%

Rural males with < grade 5 education (population ≈ 150 million): diabetes 1.2%; smoking 54.2%; overweight/obesity 5.1%; lactovegetarianism 25.1%; weekly fruit intake 32.3%

* Cardioprotective factors are shown in green shading and harmful risk factors are shown in red shading; smoking data are from SFMS surveys while the other four risk factors are from the NFHS-3 survey


4.6.5 Limitations

These differences in study characteristics and participant characteristics were associated with

limitations in the datasets used and analytic methods undertaken.

4.6.5.1 Limitations in the data

Thus the available national health surveys selected for the present study differed in their key

objectives and study designs. Hence not all of them covered the entire range of middle-aged adult

life (30 to 69 years). Some were not representative of the overall population in terms of

demographic characteristics such as educational status and place of residence. They also differed in

various other ways such as in the elicitation of information from various respondents, definitions of

study variables and in the wording of the questions (e.g. vegetarianism). All of these factors affected the descriptive analysis, comparisons across surveys and the study of time trends. The study was also limited by the range of variables available; some key risk factors such as blood sugar and lipid profile were not available from national surveys.

4.6.5.2 Limitations in the analysis

The surveys had different sampling procedures with various levels of complexity in staging, stratification and weighting involved in the selection of sampling units and respondents, none of which was incorporated in the calculation of confidence intervals for the estimates. Because the sample sizes were large in all the surveys, however, the confidence intervals around the estimates were very narrow for almost all states except for some small states and union territories.

For the linear regression analysis, in the interest of simplicity, interactions were not

considered in the model. Further, there was inadequate power in the analysis because of the number

of observations (29 states) in the linear regression being small compared to the number of

parameters (six) in the model. Lastly, exploration of ecologic association between CVD

determinants and CVD outcome assumes independence of observations. In reality, this was not the

case for all predictors. For example, the global Moran’s I for vegetarianism (equaling 0.7) indicated that values in some states were similar to those in neighbouring states, with resultant clustering in some regions. The linear regression analysis ignored this clustering and therefore did not account for spatial autocorrelation. Spatial regression would be an appropriate analytic method that accounts for this clustering.
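To make the clustering statistic concrete, the following is a minimal sketch of how a global Moran's I can be computed from state-level values and a spatial weights matrix. It is not the geographic software used in this thesis; the function name, the binary adjacency weights and the toy data are assumptions made purely for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    total_sq = sum(d * d for d in dev)
    return (n / s0) * (cross / total_sq)


# Hypothetical toy data: four "states" in a row, each adjacent to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1, 2, 3, 4], chain)         # similar neighbours give a positive I
alternating = morans_i([1, -1, 1, -1], chain)  # dissimilar neighbours give a negative I
```

Values of I well above zero, such as the 0.7 reported for vegetarianism, indicate that neighbouring units tend to have similar values, which is what violates the independence assumption of ordinary linear regression.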

Further, since the study predictors included some at aggregate level and some at individual

level, multi-level modeling would have been an appropriate analytic method in this setting in

overcoming individualistic fallacy and ecologic fallacy to arrive at meaningful conclusions on

various determinants and health outcomes [118].


5 DISCUSSION

In this section, I first provide a summary of salient findings. Then I interpret the study results

discussing the relevance and plausibility of the descriptive geographic epidemiologic findings as well as the ecologic comparison of cardiovascular risk factors and mortality. Lastly, I outline key study

implications, list possible directions for future research and close with some concluding remarks.

5.1 Summary of salient findings

1) The selected surveys were large and nationally-representative surveys with high response rates and few missing observations.

2) There were differences between the various surveys with regard to study designs employed, survey

questionnaires used, respondents interviewed and the demographic characteristics of the subjects studied.

5.1.1 Smoking

3) About 30% of males aged ≥15 years were current smokers across surveys; this was more than 10-times

the prevalence among females. Among males aged ≥30 years, about 40% were smokers.

4) The mean age of initiation among males was 21 years and the peak smoking prevalence (45%) was seen

in the 45 to 59 year age-group.

5) Overall, about 70% of tobacco smoked was in the form of beedis and 20% was in the form of cigarettes.

This beedi:cigarette use ratio was 5:1 in rural areas and nearly 1:1 in urban areas.

6) Prevalence of beedi smoking was inversely related to educational attainment while cigarette smoking

was positively associated with level of education.

7) Rural males were 1.4 times more likely to be smokers than urban males.

8) There was a 6-fold variation in smoking prevalence between states.

9) Smoking was more common geographically in the northern states of the country with statistically

significant spatial clustering noted in the northeastern states.

10) Correlation between smoking prevalence proportions in males was high between surveys in most states.

11) Quitting smoking was quite uncommon throughout the country

12) Questions relating to smoking behaviour were not standardized across different surveys

5.1.2 Pre-obesity/obesity

13) Pre-obesity/obesity (BMI ≥ 25 kg/m²) prevalence was 11.8% among males and 15.1% among females.

14) The prevalence in urban areas (20.9%) was significantly higher than in rural areas (7.8%) (p<0.01).


15) About 80% of those who had BMI ≥ 25 were in the category of pre-obesity (BMI=25.0-29.9); about 19%

were obese (BMI=30.0-39.9) and <1% were morbidly obese (BMI ≥40.0).

16) The mean BMI in the three age-groups 15-19 yrs, 20-29 yrs and 30-49 yrs was 18.7, 20.5 and 21.6

respectively (p<0.001) among males and 19.3, 20.5 and 22.1 respectively (p<0.001) among females.

17) Among middle-aged adults aged 30 to 49 years, the prevalence of pre-obesity/obesity in states varied

nearly 10-fold among males and varied nearly 20-fold among females.

5.1.3 Dietary factors and Self-reported Diabetes

Vegetarianism

18) Overall, about a third of the population were lacto-vegetarians.

19) Males were less often vegetarians [OR(99%CI) = 0.67(0.65-0.68)] than females, in urban and rural areas.

20) Urban residents were less likely to be vegetarians [OR(99%CI) = 0.90(0.88-0.92)] than rural residents.

21) Vegetarianism was more common among those with greater educational attainment.

22) There was a strong east-west gradient in vegetarianism with the lowest prevalence in the northeastern

states and highest prevalence in the northwestern states.

23) There was wide variation in prevalence of vegetarianism across states: a 25-fold variation was noted in

urban areas and a 50-fold variation was seen in rural areas.

Fruit consumption

24) About half the population reported regular (at least weekly) fruit consumption

25) Males reported regular fruit intake more commonly than females [OR(99%CI) = 1.36(1.33-1.39)].

26) Regular fruit intake was higher in urban areas [OR(99%CI) = 2.58(2.52-2.64)] than in rural areas.

27) Proportion of population consuming fruits regularly increased about 3-fold across the education gradient

28) Regular fruit intake in states varied nearly 3-fold across urban areas and about 8-fold across rural areas.

Self-reported Diabetes

29) The prevalence of self-reported diabetes among males was 2.81% and that among females was 2.03%.

30) In urban areas the prevalence was more than double that in rural areas (3.28% vs. 1.47%) (p<0.01).

31) The prevalence showed nearly a 10-fold variation among males and an 8-fold variation among females.

5.1.4 Cardiovascular mortality

32) The mean cardiovascular death rate was 310/100,000 among males and 190/100,000 among females.

33) The rates among males ranged from 203/100,000 in Meghalaya to over 410/100,000 in Tamil Nadu.

Among females, the rates varied from about 40/100,000 in Mizoram to about 240/100,000 in Punjab.


34) In multivariate regression, the study variables explained 49% and 43% of the vascular death rate

variation among states for males and females respectively. The vascular death rates were significantly

associated with levels of overweight and vegetarianism for males at the state level; no such association

was found for females.

5.2 Smoking

While there can be no dispute over the role of smoking in the causation of cardiovascular disease at the individual level, based on research over the last 60 years [49], smoking did not turn out to be a significant predictor of CVD in this ecologic comparison. There are several possible explanations for this apparent lack of correlation. In females, it may be due to small sample size: only a small proportion of females in India were smokers. In males, it may be due to the following six

reasons. Firstly, it may be because of competing mortality. This is not entirely surprising: the picture of smoking epidemiology from this analysis made it clear that smokers in India were predominantly beedi smokers from lower socioeconomic strata. These smokers would

die not only from cardiovascular deaths and cancers as is typically seen in industrialized countries

but also from competing causes such as tuberculosis and other chronic lung diseases at younger

ages [58]. Secondly, it may be due to demographic reasons: the average life-expectancy for Indian

males is only 62 years [119]. This means that the full burden of smoking vis-à-vis cardiovascular

mortality is not apparent now and will take time to evolve as the life-expectancy of males continues

to increase over the coming decades. Thirdly, the consumption of cigarettes has been increasing in urban areas recently. This smoking epidemic may be separated in time from the CVD epidemic by a lag period of a few decades; hence the association between the two may not be detectable in the interim period. Fourthly, there may have been some misclassification of smoking status due to misreporting of smokers as non-smokers owing to the common misconception in India that smoking beedis/cigarettes infrequently or in small quantities (<5 per day) is not harmful; this belief is prevalent in the general population [120] and among health care professionals [121]. Fifthly, the lack of association could also be a feature of the ‘phase of epidemiologic transition’: with many regions still in different stages of epidemiologic transition, it may take time for this

evolving picture to stabilize. Lastly, it may be a characteristic of the ecologic study design. From

the earliest ecologic studies including the Seven Countries’ Study by Ancel Keys et al [122], it has


been noted that smoking, unlike other cardiovascular risk factors (e.g. cholesterol levels), has only a

weak correlation with CVD mortality at the ecologic level.

With regard to the descriptive epidemiology of smoking, this is the first study that compares

nationwide estimates of the smoking prevalence from different surveys in India. The greater

frequency of smoking among men (about 10-fold) as compared to women is well documented in the

literature from all large-scale surveys in India over the last two decades [54,55,123,124]. The

prevalence of smoking among males ranged from 26 to 30% according to the SFMS, NFHS-2 and

SRS surveys covering the period 1998 to 2004. This was comparable to the 29% documented by

Neufeld et al [55] earlier in the National Sample Survey (round 52) conducted during 1995-96. It

was however lower than the 33% seen in the most recent NFHS-3 survey of 2005-06. Comparison

of estimates across time reveals variations that could be explained by differences in sample size or

in the choice of study respondents. The most likely explanation, however, is that NFHS-3, which unlike the other three surveys employed self-reporting instead of proxy-reporting, yielded a higher estimate for that reason. This has been noted earlier within the NFHS-2 by

Rani et al [54] who compared estimates for family members reporting for themselves against the

estimates for other family members and concluded that there was up to 5% under-estimation of

smoking prevalence among men when proxy-respondents were interviewed. Older family members who were proxy respondents were likely either to be unaware of the smoking status of adolescents or young adults in the family or to under-report it because of social stigma.

Age-specific prevalence rates of tobacco use among males revealed interesting differences

in comparison with global and U.S. data. The peak smoking prevalence (45%) in my datasets was in

the 45-59 years age-group; this was higher than the peak use (36%) in ages 30-39 years reported in

global estimates [56] as well as the peak use (39%) seen in the age-group of 21-25 year old males in

America [57]. The observed age-dependency of smoking could be attributed to one of three possible

factors: it could be due to cohort effect (with declining prevalence over time with younger cohorts

smoking less) or it could be due to age effect (younger persons smoking less often and more people

initiating smoking as they get older) or it could be due to under-reporting of smoking by younger

people. Available evidence indicates that there is a combination of under-reporting of smoking at

younger ages as well as an actual increase in prevalence of smoking with age up to mid-50s. This

has implications for health policy and programming with respect to smoking control in India: the

initiation into smoking could occur at any age and not just among young people. Hence tobacco


control programmes need to focus on all age-groups (adolescents, young adults and middle-aged

adults).

There are two subpopulations of smokers in India: beedi smokers who were likely to be

older, reside predominantly in rural areas and have lower education compared with cigarette

smokers who were more likely to be younger, live in urban areas and have higher education. While

the strong gradient of smoking with education is seen worldwide [125], this dichotomy between

beedi smokers and cigarette smokers is not common. It has been observed in a smaller cross-

sectional study in urban Delhi previously [123]. The identification of these two subpopulations with

different risk markers points to the need for tailoring tobacco control programmes to suitably target

these two vastly different groups of smokers.

My study has also identified social inequalities with respect to smoking among males. Those

who were less educated were not only more likely to be smokers, they were also more likely to

initiate smoking earlier than those with higher education. Further, beedi smokers were more

likely to initiate smoking earlier when compared to cigarette smokers. Such inequalities in initiation

are of use in identifying the important age groups and entry points for policies to tackle inequalities

in smoking.

There were wide geographic variations in current smoking between states in India. While

the inter-state variation (up to 7-fold) for overall smoking has been noted earlier [41,54], what was

new from my analysis was that there was an even greater 25-fold variation between states for

cigarette smoking and a 50-fold variation between states for beedi smoking. Smoking was observed

to be more common geographically in the northwestern and northeastern states of the country as

reported earlier [41,54]. Through spatial analysis, statistically significant spatial clustering of

smoking was noted in the northeastern states and Kerala was seen to be an outlier with high

smoking prevalence among the southern states. The state level variations may be due to underlying

differences in regional socio-cultural patterns or due to different public policies on tobacco in

different states. There are also implications for tobacco control policies at national and state-level.

State-level policies may sometimes need to be focused on individual states or at times jointly

between neighbouring states if there are strong ties between neighbouring states due to shared

socio-cultural norms or because of trade and commerce ties. Another interesting finding was the

beedi:cigarette use ratio in states that ranged from about 1:1 (in Delhi & Kerala) to about 30:1 (in

Gujarat). This may reflect socio-cultural differences between states, differences in economic status


or differences in business policies or tobacco control policies. This aspect needs to be investigated

further.

Correlation between smoking prevalence rates in males was noted to be high between

surveys in most states. This was despite the differences between the various surveys with regard to

data on smoking. The format in which the questions relating to smoking were phrased in the

different surveys may partly account for the observed differences. While SFMS asked explicitly

about all forms of tobacco separately (beedi, cigarette, hukka, other), NFHS-3 probed for some

detail (cigarette/beedi, pipe, other) and NFHS-2 only asked whether any household member

‘smoked tobacco’ leaving the interpretation of tobacco to the respondents themselves. Further, there

were differences between studies with regard to the choice of respondents. The estimates from

SFMS, NFHS-2 and the SRS, all of which relied on proxy household informants, were lower than

the prevalence rate of smoking obtained from NFHS-3 that asked for self-reporting by each

individual. Specifically, reporting on the smoking behaviour of those aged 15 to 29 years depended on the type of respondent, with proxy-respondents consistently reporting less smoking among these younger males than the individuals reported themselves. Household respondents (be it the head of the household, usually the eldest male, or any

other adult respondent) may not report accurately about the smoking status of all household

members either because he/she may not be aware of the smoking habits of other household

members [54] or may be intentionally under-reporting about some specific members of the

household due to prevailing social norms. The rate of agreement between proxy and self-report of smoking status has been compared amongst various ethnic groups in a survey of 57,244 households in the U.S.A. Cohen's kappa coefficient of agreement on smoking status was found to be 0.82 for Asian Americans; this was lower than that seen among non-Hispanic whites and African Americans (kappa = 0.91) but higher than that seen among Hispanics (kappa = 0.76). These smoking rates were, however, estimated by telephone surveys [101]. No such information was available from the Indian

context.
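For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of how Cohen's kappa is obtained from a cross-tabulation of proxy reports against self-reports. The function name and the counts are hypothetical and are not taken from the cited survey or from any Indian dataset.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of subjects placed in category i by the proxy
    respondent and in category j by self-report.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of subjects on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts: rows = proxy report (smoker, non-smoker),
# columns = self-report (smoker, non-smoker).
example = [[40, 10], [5, 45]]
kappa = cohens_kappa(example)
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, so a value of 0.82 represents substantially more than 82% raw agreement when one category dominates.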

The above differences in phrasing of questions or choice of respondents could thus partly

account for differences in smoking prevalence between surveys within some states such as Tamil

Nadu and Assam; in Tamil Nadu, estimate of smoking prevalence obtained by self-reporting was

higher than that obtained from proxy-reporting while in Assam, the estimate from proxy-reporting

was higher than that obtained from self-reporting.


Differences between populations sampled could have impacted on reported prevalence of

smoking in the study populations. The NFHS-3 survey had respondents who were younger, were

from urban locations and were relatively more educated. This could have biased the smoking

prevalence downwards compared to the other surveys. But because this survey obtained reports

from self-respondents, the prevalence estimate obtained was higher than that of other surveys; so

the actual prevalence was possibly even higher.

Finally, there is the question of whether individual reporting of smoking behaviour without

validation using biomarkers of tobacco use is an optimal method of studying smoking habit in the

general population. Mainly, three biochemical measurements have been used to validate reported

smoking: carbon monoxide, thiocyanate, and cotinine [126]. A meta-analysis by Patrick et al in

1994 [127] that identified 26 published reports containing comparisons between self-reported

behavior and biochemical measures concluded that self-reports of smoking were reasonably

accurate in most general population studies. Biochemical assessment, preferably with cotinine, was

recommended to improve accuracy only in intervention studies and student populations. A more

recent systematic review of 67 studies revealed trends of underestimation when smoking prevalence

was based on self-report [100]. It also showed varying sensitivity levels for self-reported estimates

depending on the population studied and the medium in which the biological sample was measured.

Sensitivity values were consistently higher when cotinine was measured in saliva instead of urine or

blood. These studies were however based in industrialized countries predominantly.

None of the surveys had questions on quitting behaviour among smokers. SRS 2004 had

group-level data on the population proportion that was classified as ex-smokers or current smokers.

I used this ratio of ex:current smokers to obtain some understanding of quitting among smokers

keeping in mind the limitations with the available data. This ratio was dependent on

misclassification of smokers as ex-smokers as well as the current smoking prevalence in a state. The

national value was <4%; this was about a tenth of values seen in developed countries [53].

5.3 Overweight and obesity

Excess body weight is an independent predictor for cardiovascular diseases and other risk

factors such as type 2 diabetes, hypertension and dyslipidemia. In my study, increased BMI was

positively correlated with CVD mortality for males in the ecologic comparison. Each unit increase

in BMI was associated with a 7.5/100,000 increase in CVD death rate. Such a finding on ecologic

comparison is in consonance with other studies on individuals who on prospective follow-up


experienced higher vascular death rates with increasing BMI. Most recently, the Prospective Studies

Collaboration that reviewed a total of 57 prospective studies with 894,576 participants (mostly in

western Europe and North America) identified that at BMI of 30-35 kg/m², median survival was reduced by 2-4 years; above 40 kg/m², it was reduced by 8-10 years (comparable with the effects of smoking) [50].

In my study, about 12% of males and 15% of females had BMI ≥ 25 kg/m². The urban-rural

differences were however much bigger with one in five individuals being overweight in urban areas

and one in twelve persons being overweight in rural areas. About 80% of those who had BMI ≥ 25

were in the category of overweight/pre-obesity (BMI=25.0-29.9); the rest were in the category of

class I & II obesity (BMI=30.0-34.9 and BMI=35.0-39.9) with <1% being morbidly obese (BMI

≥40.0). Mean BMI values were generally lower than 23.0 among age, sex and residence groups

studied. These findings are consistent with available knowledge regarding female preponderance of

overweight/obesity over males, urban rates being higher than rural rates, and rarity of obesity but

increasing prevalence of pre-obesity among Asian Indians [61,62]. A significant finding that was

opposite to other studies elsewhere was that pre-obesity was also more common in females than

males whereas in most other settings males had more pre-obesity than females [61].

Anthropometric measurements such as BMI have an important place in nutritional

assessment. BMI has a J-shaped association with mortality; while at the lower end, it is associated

with digestive and respiratory mortality and may be confounded by smoking and disease states, at

the higher end, it is associated with diabetes and vascular mortality. It is clear that the imbalance

between caloric intake and energy expenditure is fueling the epidemic of overweight and obesity

worldwide [128].

There are however methodological issues associated with BMI measurements as estimates

of body fat percentage (BF%) and risk for cardiovascular disease. Firstly, reliability of physical

measurements is dependent on characteristics relating to the subject, instrument and the

anthropometrist. No reported ‘technical errors of measurement’ (TEM) were published for the

NFHS-3 survey to assess how accurately the anthropometrists (field workers) took the

measurements in comparison against a criterion anthropometrist [129]. This could potentially affect

any univariate and multivariate analysis and attendant interpretations. Secondly, though BMI is

widely used as a measure of BF%, there is increasing evidence that this may not be applicable

universally, since it is age-dependent, sex-dependent and is also associated with ethnic differences.


This is especially so for Asians and other populations that differ in body build and body proportions

from Caucasians in whom the earlier research was done and for whom BMI appears to be a good

indicator of body fatness. Thirdly, it is increasingly becoming obvious that universal cut-points for

BMI that were arbitrarily determined earlier may not be applicable for all populations. For example,

in Asians the risk of type 2 diabetes and cardiovascular disease is substantial at BMIs lower than the existing cut-off point for overweight. So although BMI cut-points of 25, 30, 35 and 40 were used in this analysis, it is now accepted that BMI cut-points of 23.0, 27.5, 32.5 and 37.5 may be considered for public health action in Asian and other populations [61].
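The practical effect of the two sets of cut-points can be illustrated with a short sketch. It assigns a BMI value to a band under either the cut-points used in this analysis or the lower action points for Asian populations; the function and constant names are illustrative assumptions, and the band numbering (0 for below the first cut-point, 4 for at or above the last) is a convention chosen here only for the example.

```python
from bisect import bisect_right

# Cut-points used in this analysis (kg/m^2).
CONVENTIONAL_CUTS = [25.0, 30.0, 35.0, 40.0]
# Lower cut-points suggested for public health action in Asian populations.
ASIAN_ACTION_CUTS = [23.0, 27.5, 32.5, 37.5]


def bmi_band(bmi, cut_points):
    """Return the number of cut-points that the BMI value equals or exceeds.

    0 means below the first cut-point; len(cut_points) means at or above
    the last one.
    """
    return bisect_right(cut_points, bmi)


# A BMI of 24.0 falls below the conventional overweight threshold but
# above the first Asian action point.
conventional_band = bmi_band(24.0, CONVENTIONAL_CUTS)
asian_band = bmi_band(24.0, ASIAN_ACTION_CUTS)
```

Under the lower cut-points, individuals with BMI between 23.0 and 24.9 would be counted as being at increased risk, so prevalence estimates based on the conventional threshold of 25 are likely to be conservative for this population.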

Geographic variation in prevalence of high BMIs among those aged 30 to 49 years shows

that it was high (>33%) in some states such as Punjab, Gujarat, West Bengal and the south Indian

states of Kerala, Tamil Nadu and Andhra Pradesh. The prevalence of pre-obesity/obesity varied 13-

fold among males ranging from 2.9% in rural Chattisgarh to 38% in urban Punjab and varied 22-

fold among females ranging from 2.5% in rural Jharkhand to 55.9% in urban Punjab in this age-

group. This has implications for which states are likely to see the adverse effects on development of

diabetes and increased vascular risk. Further, prevalence of pre-obesity/obesity was high (nearly

25%) among females. This is probably due to a sedentary lifestyle, especially among urban women,

as was documented in the more detailed PURE cohort study of 21,934 participants from five centres

in India [130,131]. Overweight was also more common in these same states in younger age groups (adolescents aged 15 to 19 years and young adults aged 20 to 29 years). This will influence

future obesity rates since individuals who become overweight earlier on are more likely to be

overweight or obese as adults [128].
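The fold variations quoted for the 30 to 49 year age-group are simply the ratio of the highest to the lowest prevalence. A minimal sketch of that arithmetic, using the prevalence figures given in the paragraph above (the function name is illustrative only):

```python
def fold_variation(lowest, highest):
    """Ratio of the highest to the lowest prevalence (both in %)."""
    return highest / lowest


# Males: 2.9% in rural Chattisgarh versus 38% in urban Punjab.
male_fold = fold_variation(2.9, 38.0)
# Females: 2.5% in rural Jharkhand versus 55.9% in urban Punjab.
female_fold = fold_variation(2.5, 55.9)
```

Rounded to the nearest whole number, these ratios correspond to the 13-fold and 22-fold variations reported for males and females respectively.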

5.4 Dietary factors & self-reported diabetes

In the ecologic comparison, vegetarianism was inversely correlated with CVD mortality for

males. Each unit increase in vegetarianism prevalence was associated with a 1/100,000 decrease in

CVD death rate. Such a finding on ecologic comparison is in agreement with current knowledge

based on studies on individuals for whom vegetarian diets offered protection against vascular

mortality [64]. There was no such association for females. And for both males and females, regular

fruit intake was not significantly associated with CVD mortality at the state level. This could

possibly be due to several reasons. Firstly, it could be due to measurement issues. The studies in this

thesis did not define fruits and consequently this could impact results. In surveys in India, bananas

are commonly reported as part of fruit intake, and fruit juices with added sugars are not


differentiated from fresh fruits. Similarly, potato and yam are considered vegetables by respondents.

Secondly, the optimal recommended intake of fruit and vegetable servings per day to prevent

vascular disease is not identified for Indians as it is known for populations in industrialized

countries.

Vegetarianism

Geographic mapping of the distribution of lacto-vegetarianism revealed an interesting east-

west gradient across the country with eastern and southern states having lower prevalence (<20%)

and northwestern states having higher prevalence (≈ 70%). Diet was dependent on urban or rural

residence, with the rural population reporting greater level of vegetarianism. It was also dependent

on the level of education; those with a higher education reporting a greater level of vegetarianism.

This finding ran opposite to the association one would expect between lacto-vegetarian diet and education (as a proxy for income), because non-vegetarian foods are generally more expensive in India. In the NFHS-3 survey, which had a higher proportion of urban and educated respondents, two opposing effects were therefore combined: lower vegetarianism (from the higher proportion of urban respondents) and higher vegetarianism (from the higher proportion of educated respondents).

Although a high consumption of red meat, which is rich in haem iron and saturated fat, may

increase the risk of heart attacks and stroke, this does not apply to white meat and fish. In fact, the

cardio-protective effect would seem to be derived from the consumption of unrefined vegetable

products (whole-grain cereals, vegetables and fruits) and fish. In other words, a diet containing

ample quantities of unrefined vegetable products along with moderate amounts of animal products

(in which red meat is partly replaced by white meat and fish) is considered to be just as protective as

a vegetarian diet. On the other hand, a vegan diet is associated with an increased risk of deficiencies

of iron, vitamin B12, and other micronutrients. The cardio-protective effect of the lacto-vegetarian

diet in India however may be offset to some degree because of certain cooking practices seen in

various states [67]. For instance, most vegetables are cooked or deep-fried rather than being

consumed as fresh vegetables or salads. Most food items are also exposed to prolonged or repeated

cooking. In addition, trans-fats or hydrogenated fats (e.g. vanaspathi) are commonly used, especially

in several parts of urban India. These dietary trans fatty acids are known to raise LDL, triglycerides,

and lipoprotein(a) and lower HDL cholesterol [67]. Further, the types of cooking oils used

throughout the country are different with varying amounts of saturated and unsaturated fatty acids;


for example, the overall effect of mustard oil is considered to be protective against ischaemic heart

disease [67,132].

Fruit intake

Frequency of fruit intake was measured not in terms of number of days per week but as

daily, weekly, occasionally or never. So ‘at least weekly’ intake was computed as a marker of

protection against vascular disease. Fruit consumption was higher among males as compared to

females. It was also more common in urban areas and among those with higher education.
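The recoding of the four response categories into an 'at least weekly' marker can be sketched as follows. This is a minimal illustration only: the category labels follow the text, while the function names and the sample responses are hypothetical, and no survey weights are applied.

```python
# Response categories are taken from the text; the function names are
# hypothetical and the sample responses are made up for illustration.
VALID_RESPONSES = ("daily", "weekly", "occasionally", "never")
AT_LEAST_WEEKLY = ("daily", "weekly")


def at_least_weekly(response: str) -> bool:
    """Return True when a reported frequency counts as 'at least weekly'."""
    category = response.strip().lower()
    if category not in VALID_RESPONSES:
        raise ValueError(f"unexpected fruit-intake response: {response!r}")
    return category in AT_LEAST_WEEKLY


def percent_at_least_weekly(responses: list[str]) -> float:
    """Crude (unweighted) percentage of respondents eating fruit at least weekly."""
    if not responses:
        raise ValueError("no responses supplied")
    flagged = sum(at_least_weekly(r) for r in responses)
    return 100.0 * flagged / len(responses)


sample = ["daily", "weekly", "occasionally", "never"]  # illustrative only
print(percent_at_least_weekly(sample))
```

Collapsing 'occasionally' and 'never' into a single reference group loses some detail, but it avoids assuming a number of days per week that the questionnaire did not record.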

While life-time vegetarianism is less likely to be misclassified, the frequency of consumption of

specific foods such as fruits is more likely to be misclassified due to recall bias or reporting bias.

This is a problem associated with nutritional epidemiology and requires further validation.

Although much remains to be known regarding the role of specific nutrients in reducing the

risk of cardiovascular disease, dietary patterns are increasingly being identified as an important

determinant [128]. Dietary patterns that emphasize whole-grain foods, vegetables, and fruits and

that limit red meat, full-fat dairy products, and foods and beverages high in added salt and sugars

are associated with reduced risk of cardiovascular diseases.

Self-reported diabetes

The overall prevalence of self-reported diabetes was 2.3%. The prevalence in urban areas

was double that in rural areas. Similarly high prevalence based on plasma glucose testing has been

well documented in urban south India (Chennai city) with an increasing secular trend over the last

two decades [68]. Self-reported prevalence was also found to be directly linked with level of

education.

Methodologically speaking, self-reported prevalence is likely to be an under-estimate of the

true prevalence because of the effect of health care utilization on diagnosis of diabetes. Higher

prevalence noted in males as compared to females, in urban areas more than rural areas and a link

with education gradient all point to the effect of differences in access to health care. Though the

absolute levels are low according to self-reporting, the general trends are probably true.

Subramanian et al (2009) have shown recently that self-reporting of morbidity need not, in general,

be inaccurate in India [102].

Geographic variation in prevalence of self-reported diabetes was strong with high

prevalence (5-11%) in the southern and southeastern states. This may partly be a reflection of

differences in health care availability since southern states generally have better health care services

than the northern states; but the high prevalence in some southeastern states such as Orissa and

Chattisgarh argues against this explanation because they have weak health infrastructure. Hence it may be

possible that states with high prevalence of diabetes truly have multiple risk factors for

the development of diabetes. A similarly high prevalence of diabetes has been documented in urban

areas of Toronto with large numbers of south Asian immigrants [78].

Indians are said to have the so-called "Asian Indian Phenotype", which refers to certain unique

clinical and biochemical abnormalities including increased insulin resistance, greater abdominal

adiposity (i.e., higher waist circumference despite lower body mass index), and high levels of

high-sensitivity C-reactive protein [133]. This phenotype makes Asian Indians more prone

to diabetes and premature coronary artery disease. This may be at least partly genetic [62].

However, the epidemic of diabetes is primarily fueled by the rapid epidemiological transition

associated with changes in dietary patterns and decreased physical activity as evident from the

higher prevalence of diabetes in the urban population.

5.5 Cardiovascular mortality

The cardiovascular death rates estimated by verbal autopsy method were 300 per 100,000

for males and 190 per 100,000 for females. These rates are lower than the overall rate of 428 per

100,000 estimated by WHO for India [36]. There was a 2-fold variation between states for males and

a 6-fold variation for females. The aggregate level study variable (percent urban population in the

states) and the individual-level determinants (such as prevalence of smoking, overweight,

lactovegetarianism, fruit intake and diabetes) were able to explain 49% of the variation in CVD

among males and 43% among females. Of these factors, the critical finding by multivariate

regression was that the vascular deaths among males were significantly determined by the level of

overweight prevalence and vegetarianism in the state; no such significant determinant was detected

for female CVD deaths. Ranking of states also revealed associations in the expected directions

between vegetarianism, overweight, diabetes and CVD death rate for males; there was an

association between overweight, diabetes and CVD death rates for females as well.

Thus overweight/obesity and diabetes are likely to be key drivers of the CVD epidemic at

the state level in India in the future. The complex relationships between various risk factors and the

sociodemographic characteristics such as education and urbanization may partly explain the mixed

picture. Unlike in developed countries (early industrializers) where the burden of CVD is

predominantly seen in those from lower socioeconomic strata [134,135,136], in India the

cardiovascular disease burden is seen predominantly in those from higher socioeconomic

strata. Those in urban areas with higher education were seen to be more likely to be cigarette

smokers, more likely to be overweight and more likely to report having diabetes and experience

higher cardiovascular mortality; this was in spite of smoking beedis less often and consuming

relatively more vegetables and fruits. Preliminary data based on 9,290 participants aged 35 to 70

years from the Bangalore study centre of the PURE cohort study in India [130,131] also highlight

this pattern (Table 5.1). Those who were affluent (and predominantly in urban areas) consumed

more vegetables and fruits but had relatively higher intakes of calories, fats,

sugars and salt as compared to those in rural areas and higher levels of body mass, cholesterol,

diabetes (based on self-report or fasting blood sugar) and coronary heart disease (based on self-

report or electrocardiogram).

Table 5.1 Profile of cardiovascular disease and its risk factors in rural and urban populations in southern India, PURE study [130,131]

Characteristic                           Rural                Urban
                                         (Andhra Pradesh)     (Karnataka)
                                         N=3323               N=5967

Daily dietary intake
  Energy (kcal)                          1843                 2278
  Carbohydrate intake (g)                362                  351
  Fat intake (g)                         23                   66
  Sugar (g)                              5.2                  28.5
  Salt (g)                               2.4                  8.1
  Total vegetables (g)                   50                   158
  Total fruits (g)                       49                   166

Anthropometric profile
  Pre-obesity/obesity (BMI ≥25 kg/m2)
    Males (%)                            5.5                  44.3
    Females (%)                          6.0                  60.3

Mean serum cholesterol (mg/dl)
    Males                                163.4                191.3
    Females                              168.8                207.8

Diabetes
    Males (%)                            3.7                  17.9
    Females (%)                          1.7                  13.8

Coronary heart disease
    Males (%)                            5.9                  9.0
    Females (%)                          3.8                  6.2

This is probably characteristic of early stages of the epidemiologic transition seen as a

consequence of increasing life-expectancy, increasing urbanization and improving socioeconomic

conditions seen in India and other developing countries. Lifestyle changes associated with stage of

socioeconomic development in a population may explain the varying associations between

socioeconomic status and cardiovascular diseases that are observed between countries. This

hypothesis is supported by the reversal in the association between coronary heart disease mortality

and socioeconomic status observed in ‘early industrializers’ [79,137]. Smoking cessation, better

nutrition, and physical activity are potential mechanisms for explaining these trends, because earlier

adoption of healthy behaviors by people from higher socioeconomic groups may have caused

differential declines in coronary heart disease. This has been documented in national surveys in

other countries such as Korea that adopted industrialization after western countries but before India

[138].

Despite the high proportion of variation (43 to 49%) explained by the study variables,

only a small number of statistically significant variables were identified by the ecologic

analysis. This could be explained by the underpowered nature of the study, as identified earlier in the

limitations section, since the sample comprised only 29 states.
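One way to see why 29 states leave little statistical room is to look at how the explained variation shrinks once it is penalized for the number of predictors. The sketch below applies the standard adjusted R-squared formula to the two proportions quoted above. It rests on two assumptions that are not confirmed here: that the quoted figures are unadjusted R-squared values, and that six predictors entered each model (percent urban, smoking, overweight, lacto-vegetarianism, fruit intake and diabetes, as listed in the text).

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must lie between 0 and 1")
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("too few observations for this many predictors")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / residual_df


N_STATES = 29      # number of states, from the text
N_PREDICTORS = 6   # assumption: the six study variables named in the text

for label, r2 in (("males", 0.49), ("females", 0.43)):
    print(label, round(adjusted_r_squared(r2, N_STATES, N_PREDICTORS), 3))
```

With only 22 residual degrees of freedom under these assumptions, individual coefficients are estimated imprecisely even when the model as a whole explains a sizeable share of the variation.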

In addition, the limited number of variables available from across different study datasets

did not cover all the traditional risk factors known to be CVD determinants; information on blood

sugar and lipid profile was not available from any survey. Hence, it is possible that some of the

residual variation that is unexplained could be due to other unmeasured confounding factors such as

blood glucose, lipids, hypertension, etc. or probably due to differences in genetic factors. Further, at

the regional level, I attempted to relate aggregate-level information (such as level of urbanization of

a state) to overall cardiovascular death rates. It is possible that the outcome may be linked to

regional determinants such as the relative wealth (% gross domestic product) or socioeconomic

status (education, income, occupation or house type) of people living in different states in India.

Among the latter, while income is frequently misclassified by respondents in a survey setting in

India, education, occupation and house type are less prone to this misclassification error.

Subramanian et al (2007) have identified income inequalities between states to be a predictor of

both underweight and overweight prevalence among various states in India [63].

Further, the lack of a clear association between the risk factors studied and vascular

mortality outcome could be because of a lack of a relevant lag period between the two. It is known

that the interaction of these risk factors may take years to decades for cardiovascular disease to

develop and cause mortality. The study datasets I used however covered only a 9 year period.

Lastly, the death rate data comes from population-based surveys dependent on the verbal

autopsy method and not from a hospital-based death certification system. In verbal autopsy, the

collection of details of circumstances surrounding death by trained lay-workers is to a large extent

dependent on the ability of the respondent to recognize, recall and report positive and negative

symptoms and signs in the correct chronological order. Further, many such symptoms and signs are

not exclusive to the cardiovascular system, which complicates identification and correct ascertainment of cause

of death by the physician coder. Hence, VA data have inherent limitations in the quantity and

quality of information collected from respondents in settings with limited access to health care

services; there are also cost and cross-site comparability issues in a physician-coded VA system

[36,139]. Validation studies conducted earlier have compared causes of death obtained by verbal

autopsy against hospital based diagnoses in northern and southern India [29,140]. The cause-

specific mortality fractions assigned by verbal autopsy method were statistically similar to the

causes arrived at by review of hospital records (p>0.05) [29,140]. Specificity was high (>95%) for

all broad cause groups except cardiovascular (79%) diseases. Sensitivity for cardiovascular diseases

was the same as that for neoplasms and infectious diseases (60% to 65%) but lower than that for

injuries (85%) and higher than that for respiratory, digestive, and endocrine diseases (20% to 40%)

[140]. This was broadly consonant with findings for noncommunicable diseases in a multi-centre

validation study in Africa [141].
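For readers less familiar with these validation measures, the sketch below shows how sensitivity and specificity are computed when verbal autopsy assignments are cross-tabulated against a hospital-based reference diagnosis. The class and function names are hypothetical, and the counts are invented purely for illustration; they are not taken from the validation studies cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationCounts:
    """Verbal autopsy (VA) versus reference diagnosis for one cause group."""
    true_positive: int   # cardiovascular by both VA and reference
    false_negative: int  # cardiovascular by reference, missed by VA
    false_positive: int  # cardiovascular by VA only
    true_negative: int   # non-cardiovascular by both


def sensitivity(counts: ValidationCounts) -> float:
    """Share of reference-positive deaths that VA also labels positive."""
    return counts.true_positive / (counts.true_positive + counts.false_negative)


def specificity(counts: ValidationCounts) -> float:
    """Share of reference-negative deaths that VA also labels negative."""
    return counts.true_negative / (counts.true_negative + counts.false_positive)


# Invented counts, for illustration only.
example = ValidationCounts(true_positive=120, false_negative=80,
                           false_positive=90, true_negative=710)
print(sensitivity(example), specificity(example))
```

Because false positives and false negatives can partly cancel out at the population level, a cause-specific mortality fraction can be close to the reference value even when sensitivity for individual deaths is modest.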

Conclusions drawn from this ecologic analysis that showed an association between

vegetarianism and overweight and cardiovascular death rate among males should be restricted to the

level of regions (states) only and not extrapolated to individuals. While the former is a valid

conclusion, the latter could fall victim to ‘ecologic fallacy’, in which incorrect assumptions are

made about individuals based on aggregated data about their communities [142]. However, this

need not completely undermine ecologic studies since the geographic context in which health-states

occur cannot be neglected [143]. Secondly, the ‘modifiable areal unit problem’ (MAUP) could act

as a potential source of error in geographical mapping and analysis. Here one needs to be cautious

of the fact that geovisualization patterns may partly be a consequence of the size and shape of the

areal/regional units used in the study. The choice of areal units and the level of aggregation or

categorization may have a bearing on the study interpretations and the implications. Further, the

choice of study units, cut-offs for the map scales and colour schemes in graphical displays could

impact the visualization and impressions formed [46].
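The modifiable areal unit problem can be illustrated with a toy example in which the same four small areas are grouped into two regions in two different ways. All area names, death counts and populations below are synthetic and chosen only to make the effect visible; the function name is hypothetical.

```python
def zone_rates(areas: dict[str, tuple[int, int]],
               zoning: dict[str, list[str]]) -> dict[str, float]:
    """Death rate per 100,000 for each zone, pooling its member areas."""
    rates = {}
    for zone, members in zoning.items():
        deaths = sum(areas[m][0] for m in members)
        population = sum(areas[m][1] for m in members)
        rates[zone] = 100_000 * deaths / population
    return rates


# Synthetic small areas: name -> (deaths, population).
areas = {"A": (10, 1000), "B": (50, 1000), "C": (10, 1000), "D": (50, 1000)}

# Two alternative ways of drawing regional boundaries around the same areas.
zoning_one = {"North": ["A", "B"], "South": ["C", "D"]}
zoning_two = {"East": ["A", "C"], "West": ["B", "D"]}

print(zone_rates(areas, zoning_one))  # both regions have the same rate
print(zone_rates(areas, zoning_two))  # a 5-fold difference appears
```

The underlying deaths and populations are identical in both cases; only the boundaries differ, which is why map patterns at the state level need to be read with the choice of unit in mind.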

5.6 Study implications

5.6.1 Implications for clinical practice

The wealth of evidence available on the individual-level importance of risk factor

identification and management for cardiovascular disease control is not affected by the failure to

identify significant correlations between risk factors other than vegetarianism/increased body mass

and cardiovascular mortality in this ecologic study. For individuals seen in clinical practice and in

community practice, advice on tobacco cessation, regular intake of fruits and vegetables, and

increased physical activity should be given.

5.6.2 Implications for population health

High levels of smoking (especially among males) seen in this study point to the need for

widespread roll-out of interventions that have been proven to be cost-effective: tobacco tax

increases, the dissemination of information about health risks from smoking, restrictions on

smoking in public places/workplaces, comprehensive bans on advertising and promotion, and

increased access to cessation therapies [144].

The prevalence of increased body mass (especially among urban females) points to the need

for health policy and action at individual-level and population-level. This may include areas such as

economic policies relevant to the promotion of intake of heart-healthy foods and improving physical

activity levels through urban planning, neighbourhood walkability and traffic design that are

locally-relevant and cost-effective [145].

Distribution of risk factors also varied by sex, level of education and place of residence. It

was seen that smoking was a major risk factor among males (rural and urban), overweight was a

major risk factor among urban residents (females and males) and diabetes was relatively more

common among urban residents; vascular death rates were higher in the southern states. From a

health policy perspective, this could imply a different focus for different target groups

and also an early roll-out of interventions in some areas such as the south Indian states.

5.7 Future directions for research

5.7.1 Further analyses with data

Detailed mapping of the age-standardized prevalence of various risk factors and

cardiovascular deaths by the demographic characteristics in each state could help in producing an atlas

of cardiovascular disease for the states in India. These rates could be converted into absolute risks and

attributable risks to estimate potential cardiovascular disease burden in each state. This would be a

useful communication tool relevant for policymakers, academics and other stakeholders. Further,

spatial analysis including regression could be undertaken to better study geographic variation at the

regional or state level for cardiovascular deaths and even at the district level for some parameters such

as smoking because of sufficient numbers in surveys such as the SFMS 1998 survey. This would help

identify hot-spots for smoking as well as potential clustering of cardiovascular deaths in different

regions of India. Multi-level modeling using aggregate data on state-level urbanization and the

individual-level risk factor data would also lead to improved understanding of association between

predictors and cardiovascular death rates.
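Two of the calculations suggested above can be sketched briefly: direct age-standardization of a rate, and conversion of a prevalence and relative risk into a population attributable fraction. The second uses Levin's formula, which is my choice for illustration and not a method used in this thesis. All function names and input values are hypothetical.

```python
def directly_standardized_rate(age_specific_rates: list[float],
                               standard_weights: list[float]) -> float:
    """Weighted average of age-specific rates using a standard population."""
    if len(age_specific_rates) != len(standard_weights):
        raise ValueError("rates and weights must have the same length")
    total_weight = sum(standard_weights)
    if total_weight <= 0:
        raise ValueError("standard weights must sum to a positive value")
    weighted = sum(r * w for r, w in zip(age_specific_rates, standard_weights))
    return weighted / total_weight


def population_attributable_fraction(prevalence: float,
                                     relative_risk: float) -> float:
    """Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1))."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be a proportion between 0 and 1")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: rates per 100,000 in three age bands and the share of
# the standard population in each band.
rates = [100.0, 200.0, 400.0]
weights = [5, 3, 2]
print(directly_standardized_rate(rates, weights))
print(population_attributable_fraction(prevalence=0.3, relative_risk=2.0))
```

Levin's formula assumes an unconfounded relative risk, so any state-level attributable burden derived this way would be an approximation at best.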

5.7.2 Improvement of CVD research data in India

There is certainly room for improvement in data collection on cardiovascular risk factors

from the large nationally-representative surveys in India. Better harmonization of study instruments

with respect to definitions and phrasing of questions would result in standardization, enabling better

comparison across surveys and the study of temporal trends. While better civil registration systems are

the way forward for improving the accuracy of death certification, this does not seem

feasible in the near future and hence the verbal autopsy method would need to be refined to improve

accuracy for ascertainment of deaths attributable to the cardiovascular system. These ecologic

studies together with large cohort studies of individuals that are in their early stages of recruitment

[27,130] would offer a better understanding of the personal and societal level of cardiovascular risk

factors and how they operate to cause cardiovascular mortality in India.

5.8 Conclusions

From the selected large, nationally-representative surveys conducted in India over the last

decade, some key community-level information on cardiovascular risk factors and mortality

outcomes was elucidated. About one in three males over the age of 15 years were smokers with

70% of them smoking beedi and 20% smoking cigarettes; female smoking was one-tenth of male

smoking prevalence. Mean age of smoking initiation was 21 years. Beedi smokers were more often

illiterate, more common in rural areas and started smoking earlier than cigarette smokers. While

beedi smoking decreased across the education gradient, cigarette smoking increased. There was a

7-fold variation in smoking prevalence among states. Higher prevalence was seen in northern and

northeastern states.

11.8% of males and 15.1% of females had BMIs ≥ 25 kg/m2 in the age-group of 30 to 49 years;

about 80% of those who were overweight were pre-obese and the rest were obese. Pre-obesity and obesity were

both more common among females than males. Overweight prevalence in urban areas was greater

than in rural areas. Among males there was a 13-fold variation with prevalence ranging from 2.9%

in rural Chattisgarh to 38% in urban Punjab; among females there was a 22-fold variation with

prevalence ranging from 2.5% in rural Jharkhand to 55.9% in urban Punjab.
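The fold-variation figures quoted above are simply the ratio of the highest to the lowest prevalence. A short check of that arithmetic, using the percentages given in the text (the function name is hypothetical):

```python
def fold_variation(lowest: float, highest: float) -> float:
    """Ratio of the highest to the lowest value in a range."""
    if lowest <= 0:
        raise ValueError("lowest value must be positive")
    return highest / lowest


# Overweight prevalence (%) at the extremes, as quoted in the text.
males = fold_variation(lowest=2.9, highest=38.0)     # rural Chattisgarh vs urban Punjab
females = fold_variation(lowest=2.5, highest=55.9)   # rural Jharkhand vs urban Punjab

print(round(males), round(females))
```

A ratio of extremes is sensitive to small denominators, so the fold variation should be read alongside the absolute prevalences rather than on its own.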

One-third of the population reported being lacto-vegetarians. Rural residents and females

were more likely to be lacto-vegetarians than urban residents and males respectively. There was a

strong east-west gradient in lacto-vegetarianism with the lowest prevalence seen in northeastern

states and the highest prevalence seen in northwestern states. About half the population consumed

fruits at least once a week. Males and urban residents consumed more than females and rural

residents respectively. Fruit consumption increased directly across the education gradient.

Prevalence of self-reported diabetes among males was 2.8% and among females was 2.0%. Urban-

rural ratio was 2.2. The prevalence varied 10-fold among males with values ranging from below

0.5% in Rajasthan to over 5.0% in Andhra Pradesh and varied 8-fold among females with values

ranging from <0.5% in Rajasthan to over 4.0% in Tamil Nadu & Kerala. Southern states had a

higher prevalence of self-reported diabetes.

Cardiovascular death rates were 308 per 100,000 among males and 198 per 100,000 among

females in the 30 to 69 year age-group. Among males, the rates ranged from about 180 per 100,000

in Mizoram to over 400 per 100,000 in Tamil Nadu and Andhra Pradesh. Among females, the rates

ranged from below 100 per 100,000 in Mizoram and Haryana to about 240 per 100,000 in Punjab

and Andhra Pradesh. The selected risk factors studied explained 49% and 43% of the variation

among states for males and females respectively. Ecologic analysis revealed that the cardiovascular

death rates were significantly associated with rates of overweight and levels of vegetarianism at the

state level for males; no such association was found for females. Limited association between

predictors and cardiovascular death rate is probably indicative of an evolving picture characteristic

of regions in epidemiologic transition.

Graphical displays using maps and other techniques have helped summarize and visualize the

geography of cardiovascular disease in India at a unique scale from multiple large surveys. This will

enhance transparency and widespread understanding of the epidemiologic evidence for all

concerned stakeholders.

6 REFERENCES

1. Davis K (1951) The Population of India and Pakistan. Princeton, New Jersey: Princeton University Press.

2. RGI (1998) Compendium of India's Fertility and Mortality Indicators, 1971-1997. New Delhi, India: Office of Registrar General of India.

3. Black RE, Morris SS, Bryce J (2003) Where and why are 10 million children dying every year? Lancet 361: 2226-2234.

4. Jha P (2002) Avoidable mortality in India: past progress and future prospects. Natl Med J India 15 Suppl 1: 32-36.

5. Doll R, Peto R (1981) The causes of cancer: quantitative estimates of avoidable risks of cancer in the United States today. J Natl Cancer Inst 66: 1191-1308.

6. Dyson T, Cassen R, Visaria L (2004) Twenty-first century India: population, economy, human development, and the environment. New York: Oxford University Press. 1-31 p.

7. Mukherji R (2008) The Political Economy of India's Economic Reforms. Asian Economic Policy Review 3: 315-331.

8. Omran AR (1971) The Epidemiologic Transition: A Theory of the Epidemiology of Population Change. The Milbank Memorial Fund Quarterly 49: 509-538.

9. Olshansky SJ, Ault AB (1986) The fourth stage of the epidemiologic transition: the age of delayed degenerative diseases. The Milbank Memorial Fund Quarterly 64: 355-391.

10. Gordon T, Kannel WB, Castelli WP, Dawber TR (1981) Lipoproteins, cardiovascular disease, and death. The Framingham study. Arch Intern Med 141: 1128-1131.

11. Dawber TR, Kannel WB (1958) An epidemiologic study of heart disease: the Framingham study. Nutr Rev 16: 1-4.

12. Kannel WB, McGee D, Gordon T (1976) A general cardiovascular risk profile: the Framingham Study. Am J Cardiol 38: 46-51.

13. Smith GD, Shipley MJ, Marmot MG, Rose G (1992) Plasma cholesterol concentration and mortality. The Whitehall Study. JAMA 267: 70-76.

14. Benfante R (1992) Studies of cardiovascular disease and cause-specific mortality trends in Japanese-American men living in Hawaii and risk factor comparisons with other Japanese populations in the Pacific region: a review. Hum Biol 64: 791-805.

15. Cutler JA, Grandits GA, Grimm RH, Jr., Thomas HE, Jr., Billings JH, et al. (1991) Risk factor changes after cessation of intervention in the Multiple Risk Factor Intervention Trial. The MRFIT Research Group. Prev Med 20: 183-196.

16. Pietinen P, Lahti-Koski M, Vartiainen E, Puska P (2001) Nutrition and cardiovascular disease in Finland since the early 1970s: a success story. J Nutr Health Aging 5: 150-154.

17. Fruchart JC, Nierman MC, Stroes ES, Kastelein JJ, Duriez P (2004) New risk factors for atherosclerosis and patient risk assessment. Circulation 109: III15-19.

18. Barker DJP (1992) Fetal and infant origins of adult disease. London: BMJ Books.

19. Enas EA, Mehta J (1995) Malignant coronary artery disease in young Asian Indians: thoughts on pathogenesis, prevention, and therapy. Coronary Artery Disease in Asian Indians (CADI) Study. Clin Cardiol 18: 131-135.

20. Prentice AM (2006) The emerging epidemic of obesity in developing countries. Int J Epidemiol 35: 93-99.

21. Gupta R (2004) Trends in hypertension epidemiology in India. J Hum Hypertens 18: 73-78.

22. Gupta R, Joshi P, Mohan V, Reddy KS, Yusuf S (2008) Epidemiology and causation of coronary heart disease and stroke in India. Heart 94: 16-26.

23. Lopez AD, Mathers CD, Ezzati M, Jamison DT, Murray CJL, editors (2006) Global Burden of Disease and Risk Factors. New York: World Bank & Oxford University Press. 156-161 p.

24. WHO (2004) Global Burden of Disease: Disease and Injury Country Estimates. World Health Organization. Dept of Measurement and Health Information.

25. CBHI (2007) Mortality Statistics in India - 2006. New Delhi, India: Central Bureau of Health Information.

26. Mahapatra P, Rao CP (2001) Cause of death reporting systems in India: a performance analysis. Natl Med J India 14: 154-162.

27. Jha P, Gajalakshmi V, Gupta PC, Kumar R, Mony P, et al. (2006) Prospective study of one million deaths in India: rationale, design, and validation results. PLoS Med 3: e18.

28. RGI-CGHR, Collaborators (2009) Causes of death in India, 2001-03. In: Sample Registration System RGoI, editor. New Delhi: Ministry of Home Affairs, Govt. of India (forthcoming).

29. Gajalakshmi V, Peto R, Kanaka S, Balasubramanian S (2002) Verbal autopsy of 48 000 adult deaths attributable to medical causes in Chennai (formerly Madras), India. BMC Public Health 2: 7.

30. Joshi R, Cardona M, Iyengar S, Sukumar A, Raju CR, et al. (2006) Chronic diseases now a leading cause of death in rural India--mortality data from the Andhra Pradesh Rural Health Initiative. Int J Epidemiol 35: 1522-1529.

31. Yusuf S, Hawken S, Ounpuu S, Dans T, Avezum A, et al. (2004) Effect of potentially modifiable risk factors associated with myocardial infarction in 52 countries (the INTERHEART study): case-control study. Lancet 364: 937-952.

32. Joshi P, Islam S, Pais P, Reddy S, Dorairaj P, et al. (2007) Risk factors for early myocardial infarction in South Asians compared with individuals in other countries. JAMA 297: 286-294.

33. Feigin VL (2007) Stroke in developing countries: can the epidemic be stopped and outcomes improved? Lancet Neurol 6: 94-97.

34. INTERSALT Cooperative Research Group (1988) INTERSALT: an international study of electrolyte excretion and blood pressure. Results for 24 hour urinary sodium and potassium excretion. BMJ 297: 319-328.

35. Lewington S, Clarke R, Qizilbash N, Peto R, Collins R (2002) Age-specific relevance of usual blood pressure to vascular mortality: a meta-analysis of individual data for one million adults in 61 prospective studies. Lancet 360: 1903-1913.

36. WHO (2004) Mortality and burden of disease - 2002. WHO Statistical Information System. Geneva, Switzerland: WHO.

37. WHO (2005) Preventing chronic diseases: a vital investment. Geneva, Switzerland: World Health Organization.

38. Yusuf S, Reddy S, Ounpuu S, Anand S (2001) Global burden of cardiovascular diseases: Part II: variations in cardiovascular disease by specific ethnic groups and geographic regions and prevention strategies. Circulation 104: 2855-2864.

39. Glass GE (2000) Update: spatial aspects of epidemiology: the interface with medical geography. Epidemiol Rev 22: 136-139.

40. Gupta R, Misra A, Pais P, Rastogi P, Gupta VP (2006) Correlation of regional cardiovascular disease mortality in India with lifestyle and nutritional factors. Int J Cardiol 108: 291-300.

41. Subramanian SV, Nandy S, Kelly M, Gordon D, Davey Smith G (2004) Patterns and distribution of tobacco consumption in India: cross sectional multilevel evidence from the 1998-9 national family health survey. BMJ 328: 801-806.

42. Shetty PS (2002) Nutrition transition in India. Public Health Nutr 5: 175-182.

43. Lengler R, Eppler M. Towards a Periodic Table of Visualization Methods for Management; 2007; Clearwater, Florida.

44. Pfeiffer D, Robinson T, Stevenson M, Stevens K, Rogers D, et al. (2008) Spatial analysis in epidemiology. Oxford, UK: Oxford University Press.

45. Jiang B, Huang B, Vasek V (2003) Geovisualisation for Planning Support Systems. In: Geertman S, Stillwell J, editors. Planning Support Systems in Practice. Berlin: Springer.

46. Dummer TJ (2008) Health geography: supporting public health policy and planning. CMAJ 178: 1177-1180.

47. Daar AS, Singer PA, Persad DL, Pramming SK, Matthews DR, et al. (2007) Grand challenges in chronic non-communicable diseases. Nature 450: 494-496.

48. Yach D, Hawkes C, Gould CL, Hofman KJ (2004) The global burden of chronic diseases: overcoming impediments to prevention and control. JAMA 291: 2616-2622.

49. Doll R, Peto R, Boreham J, Sutherland I (2004) Mortality in relation to smoking: 50 years' observations on male British doctors. BMJ 328: 1519.

50. Whitlock G, Lewington S, Sherliker P, Clarke R, Emberson J, et al. (2009) Body-mass index and cause-specific mortality in 900 000 adults: collaborative analyses of 57 prospective studies. Lancet 373: 1083-1096.

51. Reddy KS, Gupta PC, editors (2004) Report on tobacco control in India. New Delhi, India: Ministry of Health and Family Welfare, Govt of India. 41-82 p.

52. IARC (1986) International Agency for Research on Cancer: Monograph on the Evaluation of Carcinogenic Risk of Chemicals to Humans - Tobacco Smoking. Switzerland: World Health Organization. 38 p.

53. Jha P, Chen Z (2007) Poverty and chronic diseases in Asia: challenges and opportunities. CMAJ 177: 1059-1062.

54. Rani M, Bonu S, Jha P, Nguyen SN, Jamjoum L (2003) Tobacco use in India: prevalence and predictors of smoking and chewing in a national cross sectional household survey. Tob Control 12: e4.

55. Neufeld KJ, Peters DH, Rani M, Bonu S, Brooner RK (2005) Regular use of alcohol and tobacco in India and its association with age, gender, and poverty. Drug Alcohol Depend 77: 283-291.

56. Gajalakshmi C, Jha P, Ranson K, Nguyen S (2000) Global Patterns of Smoking and Smoking-Attributable Mortality. In: Jha P, Chaloupka F, editors. Tobacco Control in Developing Countries. Oxford, U.K.: Oxford University Press.

57. USDHHS (2008) Results from the 2007 National Survey on Drug Use and Health: National Findings. In: US Department of Health and Human Services SAMHSA, editor. Washington, DC: US Government Printing Office.

58. Jha P, Jacob B, Gajalakshmi V, Gupta PC, Dhingra N, et al. (2008) A nationally representative case-control study of smoking and death in India. N Engl J Med 358: 1137-1147.

59. CDC (2007) Smoking & Tobacco Use: National Health Interview Surveys, Selected Years—United States, 1974–2006. In: Services UDoHaH, editor. Atlanta, GA.

60. Despres JP, Arsenault BJ, Cote M, Cartier A, Lemieux I (2008) Abdominal obesity: the cholesterol of the 21st century? Can J Cardiol 24 Suppl D: 7D-12D.

61. Mascie-Taylor CG, Goto R (2007) Human variation and body mass index: a review of the universality of BMI cut-offs, gender and urban-rural differences, and secular changes. J Physiol Anthropol 26: 109-112.

62. Wild SH, Byrne CD (2004) Evidence for fetal programming of obesity with a focus on putative mechanisms. Nutr Res Rev 17: 153-162.

63. Subramanian SV, Kawachi I, Smith GD (2007) Income inequality and the double burden of under- and overnutrition in India. J Epidemiol Community Health 61: 802-809.

64. Srinath Reddy K, Katan MB (2004) Diet, nutrition and the prevention of hypertension and cardiovascular diseases. Public Health Nutr 7: 167-186.

65. Anand SS, Ounpuu S, Yusuf S (2003) Ethnicity and cardiovascular disease. In: Yusuf S, Cairns JA, Camm AJ, Fallen EL, Gersh BJ, editors. Evidence-Based Cardiology 2nd ed: BMJ Publishing Group. pp. 171-190.

66. Popkin BM, Horton S, Kim S, Mahal A, Shuigao J (2001) Trends in diet, nutritional status, and diet-related noncommunicable diseases in China and India: the economic costs of the nutrition transition. Nutr Rev 59: 379-390.

67. Enas EA, Singh V, Munjal YP, Bhandari S, Yadave RD, et al. (2008) Reducing the burden of coronary artery disease in India: challenges and opportunities. Indian Heart J 60: 161-175.

68. Mohan V, Deepa M, Deepa R, Shanthirani CS, Farooq S, et al. (2006) Secular trends in the prevalence of diabetes and impaired glucose tolerance in urban South India--the Chennai Urban Rural Epidemiology Study (CURES-17). Diabetologia 49: 1175-1178.

69. Lewington S, Whitlock G, Clarke R, Sherliker P, Emberson J, et al. (2007) Blood cholesterol and vascular mortality by age, sex, and blood pressure: a meta-analysis of individual data from 61 prospective studies with 55,000 vascular deaths. Lancet 370: 1829-1839.

70. Mony PK, Nagaraj C (2007) Health information management: an introduction to disease classification and coding. Natl Med J India 20: 307-310.

71. Meade MS, Earickson RJ (2005) Medical Geography – 2nd edition. New York, NY: The Guilford Press.

72. Lo CP, Yeung AKW (2007) Concepts and Techniques of Geographic Information Systems: Prentice Hall Inc.

73. Longley P, Goodchild MF, Maguire D, Rhind D (2005) Geographic Information Systems and Science: Wiley and Sons.

74. PAHO (2006) Core Health Indicators of the Americas 2004-2006. Pan American Health Organization.

75. CDC (2008) Atlas of United States Mortality: selected causes of death. Hyattsville, MD: U.S. Department of Health and Human Services.

76. CCORT (2006) Canadian Cardiovascular Atlas. Tu JV, Ghali WA, Pilote L, Brien S, editors. Toronto: Pulsus Group Inc. and Institute for Clinical Evaluative Sciences.

77. Hux J, Booth G, Slaughter P, Laupacis A (2003) Diabetes in Ontario: An ICES Practice Atlas. Toronto: Institute for Clinical Evaluative Sciences.

78. Glazier R, Booth G (2007) Neighbourhood environments and resources for healthy living - A focus on diabetes in Toronto. Toronto: Institute for Clinical Evaluative Sciences.

79. Cooper R, Cutler J, Desvigne-Nickens P, Fortmann SP, Friedman L, et al. (2000) Trends and disparities in coronary heart disease, stroke, and other cardiovascular diseases in the United States: findings of the national conference on cardiovascular disease prevention. Circulation 102: 3137-3147.

80. Mensah GA (2005) Eliminating disparities in cardiovascular health: six strategic imperatives and a framework for action. Circulation 111: 1332-1336.

81. Chow CM, Donovan L, Manuel D, Johansen H, Tu JV (2005) Regional variation in self-reported heart disease prevalence in Canada. Can J Cardiol 21: 1265-1271.

82. Breckenkamp J, Mielck A, Razum O (2007) Health inequalities in Germany: do regional-level variables explain differentials in cardiovascular risk? BMC Public Health 7: 132.

83. Murphy A, Mony P, Alleyne G, Dirks J, Messah E, et al. Cardiovascular Disease Mortality in the Commonwealth; 2008 17-18 Nov 2008; Toronto, Canada. Centre for Global Health Research and Commonwealth Secretariat.

84. Levin S, Welch VL, Bell RA, Casper ML (2002) Geographic variation in cardiovascular disease risk factors among American Indians and comparisons with the corresponding state populations. Ethn Health 7: 57-67.

85. CGHR (2006) Atlas of HIV-1 prevalence among women attending antenatal clinics in 115 districts of southern India. Toronto, Canada: Centre for Global Health Research, St Michael’s Hospital, University of Toronto.

86. Padmavati S (1962) Epidemiology of cardiovascular disease in India. II. Ischemic heart disease. Circulation 25: 711-717.

87. Cleveland WS (1993) Visualising data. Summit, NJ: Hobart Press.

88. MacEachren AM (1995) How Maps Work. New York: The Guilford Press.

89. Anon (1998) India Nutrition Profile. In: Development DoWC, editor. New Delhi: Ministry of Human Resource Development, Govt. of India. pp. 1-8.

90. SFMS (1998) Special Fertility and Mortality Survey. New Delhi, India: Office of the Registrar General of India.

91. IIPS (2000) National Family Health Survey (NFHS-2), 1998-99. Mumbai, India: International Institute for Population Sciences.

92. SRS (2004) Sample Registration System. New Delhi, India: Office of the Registrar General of India.

93. IIPS (2007) National Family Health Survey (NFHS-3), 2005-06. Mumbai, India: International Institute for Population Sciences.

94. WHO (2002) World Health Report -- Reducing Risks, Promoting Healthy Life. Geneva, Switzerland: World Health Organization.

95. WHO (2008) STEPwise approach to chronic disease risk factor surveillance (STEPS). Geneva, Switzerland: Chronic Diseases and Health Promotion. World Health Organization.

96. CSDH (2008) Closing the gap in a generation: health equity through action on the social determinants of health. Final Report of the Commission on Social Determinants of Health. Geneva, Switzerland: World Health Organization.

97. Roy TK. Alternative Data Sources for Demographic and Health Statistics in India; 2003 24-27 June; Bangkok, Thailand.

98. Bhat MPN (2002) Completeness of India’s Sample Registration System: An assessment using the general growth balance method. Population Studies 56: 119–134.

99. Hyland A, Cummings KM, Lynn WR, Corle D, Giffen CA (1997) Effect of proxy-reported smoking status on population estimates of smoking prevalence. Am J Epidemiol 145: 746-751.

100. Gorber SC, Schofield-Hurwitz S, Hardt J, Levasseur G, Tremblay M (2009) The accuracy of self-reported smoking: a systematic review of the relationship between self-reported and cotinine-assessed smoking status. Nicotine Tob Res 11: 12-24.

101. Navarro AM (1999) Smoking status by proxy and self report: rate of agreement in different ethnic groups. Tob Control 8: 182-185.

102. Subramanian SV, Subramanyam MA, Selvaraj S, Kawachi I (2009) Are self-reports of health and morbidities in developing countries misleading? Evidence from India. Soc Sci Med 68: 260-265.

103. Walter SD, Birnie SE (1991) Mapping mortality and morbidity patterns: an international comparison. Int J Epidemiol 20: 678-689.

104. Bailey TC, Gatrell AC (1995) Interactive Spatial Data Analysis. Harlow, Essex: Addison Wesley Longman.

105. Rothman K, Greenland S (1998) Modern Epidemiology. Philadelphia, PA: Lippincott Williams & Wilkins.

106. Littell R, Stroup W, Freund R (2002) SAS® for Linear Models. Cary, NC: SAS Institute Inc.

107. Allison P (1999) Logistic Regression Using SAS®: Theory and Application. Cary, NC: SAS Institute Inc. 217-226 p.

108. Carr DB, Wallin JF, Carr DA (2000) Two new templates for epidemiology applications: linked micromap plots and conditioned choropleth maps. Stat Med 19: 2521-2538.

109. Anselin L, Syabri I, Kho Y (2006) GeoDa: An Introduction to Spatial Data Analysis. Geographical Analysis 38: 5-22.

110. Cliff AD (1995) Analysing geographically-related disease data. Stat Methods Med Res 4: 93-101.

111. Rezaeian M, Dunn G, St Leger S, Appleby L (2004) The production and interpretation of disease maps: A methodological case-study. Soc Psychiatry Psychiatr Epidemiol 39: 947-954.

112. Miyawaki N, Chen SC (1981) A statistical consideration on the mapping of mortality. The geography of health: 93-101.

113. Aylin P, Maheswaran R, Wakefield J, Cockings S, Jarup L, et al. (1999) A national facility for small area disease mapping and rapid initial assessment of apparent disease clusters around a point source: the UK Small Area Health Statistics Unit. J Public Health Med 21: 289-298.

114. Slocum TA, McMaster RB, Kessler FC, Howard HH (2008) Thematic Cartography and Geographic Visualization. New Jersey, U.S.: Prentice Hall Inc.

115. Dent BD (1999) Cartography: thematic map design. McGraw-Hill.

116. Brewer CA (1994) Colour use guidelines for mapping and visualization. In: MacEachren AM, Taylor DRF, editors. Visualization in modern cartography. Tarrytown, NY: Elsevier Science.

117. Krosnick JA (1999) Survey research. Annu Rev Psychol 50: 537-567.

118. Wakefield J (2009) Multi-level modelling, the ecologic fallacy, and hybrid study designs. Int J Epidemiol 38: 330-336.

119. RGI (2003) SRS Based Abridged Life Tables, SRS Analytical Studies Report No. 3 of 2003. New Delhi: Registrar General of India.

120. Nichter M, Van Sickle D (2004) Popular perceptions of tobacco products and patterns of use among male college students in India. Soc Sci Med 59: 415-431.

121. Mohan S, Pradeepkumar AS, Thresia CU, Thankappan KR, Poston WS, et al. (2006) Tobacco use among medical professionals in Kerala, India: the need for enhanced tobacco cessation and control efforts. Addict Behav 31: 2313-2318.

122. Keys A, Aravanis C, Blackburn HW, Van Buchem FS, Buzina R, et al. (1966) Epidemiological studies related to coronary heart disease: characteristics of men aged 40-59 in seven countries. Acta Med Scand Suppl 460: 1-392.

123. Narayan KM, Chadha SL, Hanson RL, Tandon R, Shekhawat S, et al. (1996) Prevalence and patterns of smoking in Delhi: cross sectional study. BMJ 312: 1576-1579.

124. Gupta PC (1996) Survey of sociodemographic characteristics of tobacco use among 99,598 individuals in Bombay, India using handheld computers. Tob Control 5: 114-120.

125. Schaap MM, Kunst AE (2009) Monitoring of socio-economic inequalities in smoking: learning from the experiences of recent scientific studies. Public Health 123: 103-109.

126. Jarvis MJ, Tunstall-Pedoe H, Feyerabend C, Vesey C, Saloojee Y (1987) Comparison of tests used to distinguish smokers from nonsmokers. Am J Public Health 77: 1435-1438.

127. Patrick DL, Cheadle A, Thompson DC, Diehr P, Koepsell T, et al. (1994) The validity of self-reported smoking: a review and meta-analysis. Am J Public Health 84: 1086-1093.

128. Eyre H, Kahn R, Robertson RM, Clark NG, Doyle C, et al. (2004) Preventing cancer, cardiovascular disease, and diabetes: a common agenda for the American Cancer Society, the American Diabetes Association, and the American Heart Association. Circulation 109: 3244-3255.

129. Ulijaszek SJ, Kerr DA (1999) Anthropometric measurement error and the assessment of nutritional status. Br J Nutr 82: 165-177.

130. Teo K, Chow CK, Vaz M, Rangarajan S, Yusuf S (2009) The Prospective Urban Rural Epidemiology (PURE) study: examining the impact of societal influences on chronic noncommunicable diseases in low-, middle-, and high-income countries. Am Heart J 158: 1-7 e1.

131. Yusuf S, Vaz M (2006) PURE India. Prospective Urban and Rural Epidemiology Study. Dubai presentation. Hamilton: Population Health Research Institute.

132. Rastogi T, Reddy KS, Vaz M, Spiegelman D, Prabhakaran D, et al. (2004) Diet and risk of ischemic heart disease in India. Am J Clin Nutr 79: 582-592.

133. Mohan V, Sandeep S, Deepa R, Shah B, Varghese C (2007) Epidemiology of type 2 diabetes: Indian scenario. Indian J Med Res 125: 217-230.

134. Kunst AE, del Rios M, Groenhof F, Mackenbach JP (1998) Socioeconomic inequalities in stroke mortality among middle-aged men: an international overview. European Union Working Group on Socioeconomic Inequalities in Health. Stroke 29: 2285-2291.

135. Smith GD, Wentworth D, Neaton JD, Stamler R, Stamler J (1996) Socioeconomic differentials in mortality risk among men screened for the Multiple Risk Factor Intervention Trial: II. Black men. Am J Public Health 86: 497-504.

136. Strand BH, Tverdal A (2004) Can cardiovascular risk factors and lifestyle explain the educational inequalities in mortality from ischaemic heart disease and from other heart diseases? 26 year follow up of 50,000 Norwegian men and women. J Epidemiol Community Health 58: 705-709.

137. Marmot MG, Adelstein AM, Robinson N, Rose GA (1978) Changing social-class distribution of heart disease. Br Med J 2: 1109-1112.

138. Song YM, Ferrer RL, Cho SI, Sung J, Ebrahim S, et al. (2006) Socioeconomic status and cardiovascular disease among men: the Korean national health service prospective cohort study. Am J Public Health 96: 152-159.

139. Murray CJ, Lopez AD, Feehan DM, Peter ST, Yang G (2007) Validation of the symptom pattern method for analyzing verbal autopsy data. PLoS Med 4: e327.

140. Kumar R, Thakur J, Rao B, Singh M, Bhatia S (2006) Validity of verbal autopsy in determining causes of adult deaths. Indian Journal of Public Health 50: 90-94.

141. Chandramohan D, Maude GH, Rodrigues LC, Hayes RJ (1998) Verbal autopsies for adult deaths: their development and validation in a multicentre study. Trop Med Int Health 3: 436-446.

142. Tu JV, Ko DT (2008) Ecological studies and cardiovascular outcomes research. Circulation 118: 2588-2593.

143. Pearce N (2000) The ecological fallacy strikes back. J Epidemiol Community Health 54: 326-327.

144. Jha P, Chaloupka F, Moore J, Gajalakshmi V, Gupta P, et al. (2006) Disease Control Priorities in Developing Countries: Tobacco Addiction. In: Jamison D, Breman J, Measham A, Alleyne G, Claeson M et al., editors. Disease Control Priorities in Developing Countries. 2nd ed. New York: Oxford University Press. pp. 17.

145. Willett W, Koplan J, Nugent R, Dusenbury C, Puska P, et al., editors (2006) Disease Control Priorities in Developing Countries: Prevention of Chronic Disease by Means of Diet and Lifestyle Changes. 2nd ed. New York: Oxford University Press. 18 p.

APPENDIX

Table 7.1 Poisson regression

POISSON REGRESSION (males)

Criteria For Assessing Goodness Of Fit

Criterion            DF   Value       Value/DF
Deviance             22   154.4283    7.0195
Scaled Deviance      22   22.0000     1.0000
Pearson Chi-Square   22   153.5481    6.9795
Scaled Pearson X2    22   21.8746     0.9943
Log Likelihood            5866.4697

Analysis Of Parameter Estimates

Parameter   DF   Estimate   Pr > ChiSq
Intercept   1    5.2467     <.0001
Smoking     1    0.0027     0.2652
Veg_%       1    -0.0032    0.0682
Fruits      1    0.0001     0.9690
Overwgt     1    0.0243     0.0166
Diabetes    1    0.0156     0.5073
Urban_%     1    0.0035     0.0876
Scale       0    2.6494

POISSON REGRESSION (females)

Criteria For Assessing Goodness Of Fit

Criterion            DF   Value       Value/DF
Deviance             22   228.8826    10.4038
Scaled Deviance      22   22.0000     1.0000
Pearson Chi-Square   22   224.7683    10.2167
Scaled Pearson X2    22   21.6045     0.9820
Log Likelihood            2011.8376

Analysis Of Parameter Estimates

Parameter   DF   Estimate   Pr > ChiSq
Intercept   1    5.1974     <.0001
Smoking     1    -0.0378    0.0310
Veg_%       1    -0.0008    0.7437
Fruits      1    -0.0042    0.2736
Overwgt     1    0.0085     0.4865
Diabetes    1    0.0793     0.2530
Urban_%     1    -0.0008    0.8289
Scale       0    3.2255
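
As a reading aid for the two tables above, the following Python sketch shows one way the printed quantities relate to each other. It rests on two assumptions that the appendix does not state explicitly: that the models use a log link (the usual choice for Poisson regression), so exp(estimate) can be read as a rate ratio per one-unit increase in a predictor, and that the Scale row is the square root of Deviance/DF (a deviance-based overdispersion adjustment), which the printed values happen to be consistent with. The function and variable names are illustrative only and do not come from the thesis.

```python
import math

# Values copied from Appendix Table 7.1; nothing here is re-estimated.
FIT = {
    "males": {"deviance": 154.4283, "df": 22, "scale": 2.6494},
    "females": {"deviance": 228.8826, "df": 22, "scale": 3.2255},
}
ESTIMATES = {
    "males": {"Smoking": 0.0027, "Veg_%": -0.0032, "Fruits": 0.0001,
              "Overwgt": 0.0243, "Diabetes": 0.0156, "Urban_%": 0.0035},
    "females": {"Smoking": -0.0378, "Veg_%": -0.0008, "Fruits": -0.0042,
                "Overwgt": 0.0085, "Diabetes": 0.0793, "Urban_%": -0.0008},
}


def dispersion_scale(deviance, df):
    """Square root of deviance/df (assumed to be how Scale was obtained)."""
    return math.sqrt(deviance / df)


def rate_ratio(estimate):
    """Rate ratio per one-unit increase, assuming a log link."""
    return math.exp(estimate)


def summarize(sex):
    """Return (recomputed scale, {parameter: rate ratio}) for one table."""
    fit = FIT[sex]
    scale = dispersion_scale(fit["deviance"], fit["df"])
    ratios = {name: rate_ratio(b) for name, b in ESTIMATES[sex].items()}
    return scale, ratios


if __name__ == "__main__":
    for sex in ("males", "females"):
        scale, ratios = summarize(sex)
        print(f"{sex}: recomputed scale = {scale:.4f} "
              f"(printed {FIT[sex]['scale']})")
        for name, rr in ratios.items():
            print(f"  {name:<9} rate ratio = {rr:.4f}")
```

Under these assumptions, the male overweight estimate of 0.0243 corresponds to a rate ratio of about 1.025, that is, roughly a 2.5% higher state-level death rate per one-unit increase in the overweight measure. Because these are ecologic, state-level estimates, the ratios describe associations between states rather than individual-level risks.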
