copy of data analysis 08

Upload: malyn1218

Post on 30-May-2018


TRANSCRIPT

  • 8/14/2019 Copy of Data Analysis 08

    1/44

    Data Analysis

    Florenda F. Cabatit, RN, MA
    Facilitator


    DATA ANALYSIS

    Data analysis is the process by which information is rendered meaningful and intelligible (Polit and Hungler, 1995).

    It is the systematic organization and synthesis of research data and the testing of research hypotheses using those data (2004).


    Statistical Analysis

    Quantitative analysis deals with the numerical analysis of information.

    It is the manipulation of numeric data through statistical procedures for the purpose of describing phenomena or assessing the magnitude and reliability of relationships among them.

    Statistics is the scientific method used in quantitative analysis.


    Statistics

    Statistics helps to:
    - Organize data
    - Summarize data
    - Evaluate data
    - Present data in an easily understood form


    Statistics

    Two branches of statistics:
    - Descriptive statistics: statistics used to describe and summarize data.
    - Inferential statistics: statistics that permit inferences on whether relationships observed in a sample are likely to occur in the larger population.


    Considerations in the choice of appropriate statistical methods

    - The purpose of the research
    - The level of measurement of the variables
    - The number of groups/variables involved
    - The type of groups being studied


    Levels of Measurement

    Nominal
    - the lowest level
    - involves assigning numbers to classify characteristics into categories
    - numeric codes assigned in nominal measurement do not convey quantitative information
    - the numbers are merely symbols that represent different values
    - categories must be mutually exclusive and collectively exhaustive


    Ordinal Measurement

    This involves sorting objects on the basis of their relative standing or ranking on an attribute.
    The numbers are not arbitrary: they signify incremental values. They do not, however, tell anything about how much greater one level is than another.


    Interval Measurement

    A measurement in which an attribute of a variable is rank ordered on a scale that has equal distances between points on that scale.


    Ratio Scale

    A quantitative measurement in which intervals are equal and there is a true zero point.

    - The highest level of measurement
    - All arithmetic operations are permissible with this measurement (add, subtract, multiply, and divide numbers on this scale)


    Descriptive Statistics

    Three characteristics to fully describe a set of data:
    - Shape of the distribution of values
    - Central tendency
    - Variability


    Review of Descriptive Stats.

    Descriptive statistics are used to present quantitative descriptions in a manageable form.
    This method works by reducing lots of data into a simpler summary.
    Examples:
    - 37 °C as the average adult body temperature
    - SU's quality-point system


    Univariate Analysis

    This is the examination across cases of one variable at a time.
    Frequency distributions are used to group data.
    One may set up margins that allow us to group cases into categories.
    Examples include:
    - Age categories
    - Price categories
    - Temperature categories


    Distributions

    Two ways to describe a univariate distribution:
    - A table
    - A graph (histogram, bar chart)


    Distributions (cont)

    Distributions may also be displayed using percentages.

    For example, one could use percentages to describe the following:
    - Percentage of people under the poverty level
    - Over a certain age
    - Over a certain score on a standardized test


    Distributions (cont.)

    Category    Percent
    Under 35      9%
    36-45        21%
    46-55        45%
    56-65        19%
    66+           6%

    A Frequency Distribution Table
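A frequency distribution like the one above can be built from raw cases by assigning each case to exactly one category and converting the counts to percentages. The sketch below is only an illustration: the `ages` list is invented (it is not the data behind the table), and the first label is written as "35 and under" so that an age of exactly 35 also falls into a category.

```python
from collections import Counter

# Hypothetical ages, invented for illustration only (not the slide's data).
ages = [22, 31, 38, 41, 44, 47, 49, 50, 52, 53,
        55, 58, 60, 63, 70, 29, 48, 51, 46, 67]

# Category labels in display order.
ORDER = ["35 and under", "36-45", "46-55", "56-65", "66+"]


def categorize(age):
    """Place one age into exactly one category (mutually exclusive, exhaustive)."""
    if age <= 35:
        return "35 and under"
    if age <= 45:
        return "36-45"
    if age <= 55:
        return "46-55"
    if age <= 65:
        return "56-65"
    return "66+"


counts = Counter(categorize(age) for age in ages)
n = len(ages)
percents = {label: 100 * counts[label] / n for label in ORDER}

print(f"{'Category':<14}{'Count':>5}{'Percent':>10}")
for label in ORDER:
    print(f"{label:<14}{counts[label]:>5}{percents[label]:>9.1f}%")
```

Because every case lands in one and only one category, the percentages always add up to 100.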


    Distributions (cont.)

    [Figure: A Histogram. Vertical axis: Percent (0 to 45); horizontal axis: the categories Under 35, 36-45, 46-55, 56-65, 66+.]


    Central Tendency

    An estimate of the center of a distribution.

    Three different types of estimates:
    - Mean
    - Median
    - Mode


    Mean

    The most commonly used method of describing central tendency.
    One totals all the results and then divides by the number of units, or n, of the sample.
    Example: The NCM 104 Quiz mean was determined by the sum of all the scores divided by the number of students taking the exam.


    Median

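    The median is the score found at the exact middle of the set.
    One must list all scores in numerical order and then locate the score in the center of the sample.
    Example: If there are 501 scores in the list, score #251 would be the median. If there are 500 scores, there is no single middle score, and the median is the average of scores #250 and #251.
    The median is useful when a distribution has outliers, because extreme scores affect it far less than they affect the mean.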
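The procedure for locating the median can be sketched in a few lines. This is a minimal illustration, assuming the usual convention that an even-sized list takes the average of its two middle scores; the function name `median_of` and both score lists are made up for the example.

```python
def median_of(scores):
    """Return the middle score of a list, after sorting it."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: one score sits exactly in the middle.
        return ordered[middle]
    # Even number of scores: average the two scores around the middle.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical scores; 100 is a deliberate outlier.
odd_scores = [3, 1, 100, 2, 4]
even_scores = [3, 1, 100, 2]

print(median_of(odd_scores))                # middle of 1, 2, 3, 4, 100
print(median_of(even_scores))               # average of 2 and 3
print(sum(odd_scores) / len(odd_scores))    # the mean, pulled up by the outlier
```

The outlier of 100 drags the mean of the first list up to 22, while the median stays at 3.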


    Mode

    The mode is the most repeated score in the set of results.
    Let's take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15
    Again, we first line up the scores: 15, 15, 15, 20, 20, 21, 25, 36
    15 is the most repeated score and is therefore labeled the mode.
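As a quick check, the three estimates of central tendency can be computed for the same set of scores with Python's standard `statistics` module. This is just one convenient way to do the arithmetic, not a method prescribed by the slides.

```python
from statistics import mean, median, mode

# The set of scores used in the mode example.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

score_mean = mean(scores)      # sum of the scores (167) divided by n (8)
score_median = median(scores)  # average of the two middle scores (20 and 20)
score_mode = mode(scores)      # the most repeated score

print("mean:  ", score_mean)
print("median:", score_median)
print("mode:  ", score_mode)
```

The three values differ here (20.875, 20, and 15) because this small set of scores is not normally distributed; the single high score of 36 pulls the mean above the median.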


    Central Tendency

    If the distribution is normal (i.e., bell-shaped), the mean, median and mode are all equal.
    In our analyses, we'll use the mean.


    Dispersion

    Two types of estimates:
    - Range
    - Standard deviation

    Standard deviation is more accurate/detailed because an outlier can greatly extend the range.


    Range

    The range is used to identify the highest and lowest scores.
    Let's take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15.
    The range would be 15-36. This identifies the fact that 21 points separate the highest from the lowest score.
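The range arithmetic is short enough to verify directly. The sketch below uses the same scores and then appends one invented outlier (a score of 95, not part of the original example) to show how a single extreme value stretches the range.

```python
scores = [15, 20, 21, 20, 36, 15, 25, 15]

lowest, highest = min(scores), max(scores)
score_range = highest - lowest
print(f"range: {lowest}-{highest}, spread of {score_range} points")

# Hypothetical outlier, added only to illustrate the weakness of the range.
with_outlier = scores + [95]
outlier_range = max(with_outlier) - min(with_outlier)
print(f"with one outlier the spread becomes {outlier_range} points")
```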


    Standard Deviation

    The standard deviation is a value that shows the relation that individual scores have to the mean of the sample.
    If scores are said to be standardized to a normal curve, there are several statistical manipulations that can be performed to analyze the data set.


    Standard Dev. (cont)

    Assumptions may be made about the percentage of scores as they deviate from the mean.
    If scores are normally distributed, one can assume that approximately 68% of the scores in the sample fall within one standard deviation of the mean.
    Approximately 95% of the scores would then fall within two standard deviations of the mean.
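These percentages come from the area under the normal curve, and they can be checked with `statistics.NormalDist` from the Python standard library. The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")
```

The share within one standard deviation is about 68.3%, and within two standard deviations about 95.4%, which is where the rounded figures of 68% and 95% come from.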


    Standard Dev. (cont)

    The standard deviation is calculated by summing the squared deviations of all the scores from the mean, dividing by the number of scores, and taking the square root of the result.
    Squaring the deviations accounts for both positive and negative deviations from the mean, which would otherwise cancel each other out.
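The calculation described above can be followed step by step in code. This sketch divides by the number of scores, as the slide states; that is the population form, which `statistics.pstdev` also computes. Many textbooks divide by n - 1 instead when estimating from a sample (`statistics.stdev`). The scores are the ones from the mode and range examples.

```python
from math import sqrt
from statistics import pstdev

scores = [15, 20, 21, 20, 36, 15, 25, 15]
n = len(scores)

# Step 1: the mean of the scores.
score_mean = sum(scores) / n

# Step 2: squared deviation of each score from the mean.
squared_deviations = [(score - score_mean) ** 2 for score in scores]

# Step 3: sum them, divide by the number of scores, take the square root.
variance = sum(squared_deviations) / n
standard_deviation = sqrt(variance)

print(f"mean = {score_mean}")
print(f"sum of squared deviations = {sum(squared_deviations)}")
print(f"standard deviation = {standard_deviation:.3f}")
print(f"library check (pstdev) = {pstdev(scores):.3f}")
```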


    RESEARCH QUESTION: DESCRIBE

    LEVEL            TYPE OF DESCRIPTION   STATISTICAL TOOL
    NOMINAL          Distribution          Frequency distribution, contingency table
                     Central tendency      Mode
    ORDINAL          Distribution          Frequency distribution, contingency table, scatterplot
                     Central tendency      Mode, median
    RATIO/INTERVAL   Distribution          Frequency distribution, contingency table, scatterplot
                     Central tendency      Mode, median, mean
                     Variability           Range, variance, standard deviation


    Inferential Statistics

    - Based on the law of probability
    - It provides a means for drawing conclusions about a population, given data from a sample
    - It estimates population parameters from sample statistics


    Inferential Statistics

    Statistical inference consists of two techniques:
    1. Estimation of parameters
    2. Hypothesis testing


    Hypothesis Testing

    Statistical hypothesis testing provides objective criteria for deciding whether hypotheses are supported by empirical evidence.

    It is a process of disproof or rejection.
    Researchers seek to reject the null hypothesis through various statistical tests.
    Hypothesis testing uses samples to draw conclusions about relationships within the population.


    Type I and Type II Errors

    Type I Error - researchers make a type I error when a true null hypothesis is rejected.

    Type II Error - researchers make a type II error when a false null hypothesis is accepted.


    Level of Significance

    This refers to the risk of making a type I error in a statistical analysis.
    The value, selected beforehand, signifies the risk or probability of rejecting a true null hypothesis.

    The two most frequently used significance levels (referred to as alpha or α) are:
    - .05
    - .01


    Level of Significance

    With a .05 significance level, we are accepting the risk that out of 100 samples drawn from a population, a true null hypothesis would be rejected only 5 times.

    With a .01 level of significance, the risk of a type I error is lower: in only 1 sample out of 100 would we erroneously reject the null hypothesis.
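The "5 times out of 100" reading of a .05 significance level can be illustrated with a small simulation. Everything in this sketch is an assumption chosen for illustration: the population is standard normal with a known standard deviation of 1, each sample has 25 cases, the test is a two-tailed z test, and the random seed is fixed only so the run is repeatable. The null hypothesis (population mean of 0) is true in every trial, so every rejection is a type I error.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # fixed seed so the simulation is repeatable

ALPHA = 0.05        # significance level chosen beforehand
SAMPLE_SIZE = 25    # hypothetical sample size
TRIALS = 2000       # number of samples drawn from the population

# Two-tailed critical value for the chosen alpha (about 1.96 for .05).
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)


def rejects_true_null():
    """Draw one sample from a population where the null is true and test it."""
    sample = [random.gauss(0, 1) for _ in range(SAMPLE_SIZE)]
    sample_mean = sum(sample) / SAMPLE_SIZE
    z = sample_mean / (1 / sqrt(SAMPLE_SIZE))  # known sigma = 1
    return abs(z) > critical_z


type_i_errors = sum(rejects_true_null() for _ in range(TRIALS))
type_i_rate = type_i_errors / TRIALS
print(f"true null rejected in {type_i_rate:.1%} of {TRIALS} samples")
```

The observed rejection rate should land close to 5%, and rerunning with `ALPHA = 0.01` should bring it close to 1%.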


    Critical Region

    This refers to the area in the sampling distribution representing values that are improbable if the null hypothesis is true.

    It is defined by the level of significance.


    Statistical Tests

    Two-tailed test - this means that both ends, or tails, of the sampling distribution are used to determine improbable values.

    In one-tailed tests, the critical region of improbable values is entirely in one tail of the distribution: the tail corresponding to the direction of the hypothesis.


    An example of Critical Regions of a two-tailed test
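Where the critical regions begin depends on the level of significance and on whether the test is one-tailed or two-tailed. The sketch below assumes a standard normal sampling distribution (a z test); the function names are invented for the example. In a two-tailed test alpha is split between the two tails, while in a one-tailed test all of it sits in one tail.

```python
from statistics import NormalDist

standard_normal = NormalDist()


def two_tailed_critical(alpha):
    """Cut-off z such that |z| beyond it falls in the two critical regions."""
    return standard_normal.inv_cdf(1 - alpha / 2)


def one_tailed_critical(alpha):
    """Cut-off z for a critical region placed entirely in the upper tail."""
    return standard_normal.inv_cdf(1 - alpha)


for alpha in (0.05, 0.01):
    print(
        f"alpha = {alpha}: "
        f"two-tailed +/-{two_tailed_critical(alpha):.3f}, "
        f"one-tailed {one_tailed_critical(alpha):.3f}"
    )
```

At the .05 level the two-tailed cut-offs are about -1.96 and +1.96, and the one-tailed cut-off is about 1.645, so a one-tailed test reaches its critical region sooner in the hypothesized direction.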


    Types of Statistical Tests

    Parametric tests - a class of inferential statistical tests that involve:
    a. Assumptions about the distribution of the variables
    b. The estimation of a parameter
    c. The use of interval or ratio measures


    Statistical Tests

    Non-parametric tests - statistical tests that do not estimate parameters
    - also called distribution-free statistics


    Steps in Hypothesis Testing

    1. State the alternative hypothesis
    2. State the null hypothesis
    3. Establish the level of significance
    4. Select a one-tailed or two-tailed test
    5. Compute a test statistic
    6. Calculate the degrees of freedom
    7. Obtain a tabled value for the statistical test
    8. Compare the test statistic with the tabled value.
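The steps above can be walked through with a one-sample t test, chosen here only as a convenient example of a parametric test. The scores are the ones used in the earlier examples; the null value of 14 is hypothetical, and the tabled value of 2.365 is the standard two-tailed t-table entry for alpha = .05 with 7 degrees of freedom, typed in by hand because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

scores = [15, 20, 21, 20, 36, 15, 25, 15]

# Step 1: alternative hypothesis - the population mean differs from 14.
# Step 2: null hypothesis - the population mean equals 14 (hypothetical value).
NULL_MEAN = 14

# Step 3: level of significance.
ALPHA = 0.05

# Step 4: two-tailed test, because the alternative has no direction.

# Step 5: test statistic, t = (sample mean - null mean) / standard error.
n = len(scores)
standard_error = stdev(scores) / sqrt(n)  # stdev divides by n - 1
t_statistic = (mean(scores) - NULL_MEAN) / standard_error

# Step 6: degrees of freedom for a one-sample t test.
degrees_of_freedom = n - 1

# Step 7: tabled value from a standard t table (two-tailed, alpha .05, df 7).
TABLED_T = 2.365

# Step 8: compare the test statistic with the tabled value.
reject_null = abs(t_statistic) > TABLED_T

print(f"t = {t_statistic:.3f}, df = {degrees_of_freedom}, tabled t = {TABLED_T}")
print("reject the null hypothesis" if reject_null else "fail to reject the null hypothesis")
```

With a different sample size or significance level, the tabled value in step 7 would need to be looked up again.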


    The Decision Matrix

    Columns: in reality
    - Null true / Alternative false. In reality: there is no real program effect; there is no difference, gain; our theory is wrong.
    - Null false / Alternative true. In reality: there is a real program effect; there is a difference, gain; our theory is correct.

    Rows: what we conclude
    - Top row: Accept null / Reject alternative. We say: there is no real program effect; there is no difference, gain; our theory is wrong.
    - Bottom row: Reject null / Accept alternative. We say: there is a real program effect; there is a difference, gain; our theory is correct.

    Top row (accept null)
    - Null true: 1 - α, THE CONFIDENCE LEVEL. The odds of saying there is no effect or gain when in fact there is none. # of times out of 100 when there is no effect, we'll say there is none.
    - Null false: β, TYPE II ERROR. The odds of saying there is no effect or gain when in fact there is one. # of times out of 100 when there is an effect, we'll say there is none.

    Bottom row (reject null)
    - Null true: α, TYPE I ERROR. The odds of saying there is an effect or gain when in fact there is none. # of times out of 100 when there is no effect, we'll say there is one.
    - Null false: 1 - β, POWER. The odds of saying there is an effect or gain when in fact there is one. # of times out of 100 when there is an effect, we'll say there is one.


    Decision Matrix

    If you try to increase power, you increase the chance of winding up in the bottom row and of Type I error.

    If you try to decrease Type I errors, you increase the chance of winding up in the top row and of Type II error.
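This trade-off can be made concrete with a small calculation. The sketch assumes a one-tailed z test with a known standard deviation; the effect size of 0.5 standard deviations and the sample size of 16 are hypothetical values picked for the example, and `power_one_tailed_z` is an invented helper name. With the sample and the effect held fixed, lowering alpha (fewer Type I errors) lowers power, which means more Type II errors.

```python
from math import sqrt
from statistics import NormalDist

standard_normal = NormalDist()


def power_one_tailed_z(alpha, effect_size, n):
    """Probability of rejecting the null when the true effect is effect_size (in SD units)."""
    critical_z = standard_normal.inv_cdf(1 - alpha)
    # Under the alternative, the test statistic is shifted by effect_size * sqrt(n).
    shift = effect_size * sqrt(n)
    return 1 - standard_normal.cdf(critical_z - shift)


EFFECT_SIZE = 0.5  # hypothetical true effect, in standard deviations
SAMPLE_SIZE = 16   # hypothetical sample size

for alpha in (0.05, 0.01):
    power = power_one_tailed_z(alpha, EFFECT_SIZE, SAMPLE_SIZE)
    print(f"alpha = {alpha}: power = {power:.3f}, Type II error rate = {1 - power:.3f}")
```

When the true effect is zero, the same function returns alpha itself, which is the Type I error cell of the matrix.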