SESSION 1 & 2

Last Update: 15th February 2011

Introduction to Statistics

Lecturer: Florian Boehlandt
University: University of Stellenbosch Business School
Domain: http://www.hedge-fund-analysis.net

Learning Unit 1 (10 Sessions)

• Give a description of statistical techniques

• Construct a frequency distribution table

• Represent data in tabular or graphical form

• Distinguish between different graphical representation forms

Session 1 & 2

• Concepts and Definitions

• Terminology

• Data types

• Graphical representations

Definitions

Statistics is the name given to the science of collecting facts, typically in numerical form, and studying or analysing them. The facts, or data, can cover a wide range of subjects. The science of statistics deals with the methods used in the collection, presentation, analysis and interpretation of data.

Definitions cont.

Statistics is a way to get information from data.

Descriptive Statistics

• Methods of organizing, summarizing and presenting data in a convenient and informative way.

• Numerical techniques to summarize data: measures of central location and measures of variability.
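As a minimal sketch of these numerical techniques, the following Python snippet computes two measures of central location and two measures of variability with the standard library. The return values are illustrative assumptions, not data from the lecture.

```python
import statistics

# Hypothetical daily stock returns in percent (illustrative values only)
returns = [-1.34, 0.00, 0.85, 2.05, -0.40]

# Measures of central location
mean_return = statistics.mean(returns)
median_return = statistics.median(returns)

# Measures of variability
stdev_return = statistics.stdev(returns)   # sample standard deviation
value_range = max(returns) - min(returns)  # largest minus smallest value

print(f"Mean:               {mean_return:.3f}")
print(f"Median:             {median_return:.3f}")
print(f"Standard deviation: {stdev_return:.3f}")
print(f"Range:              {value_range:.3f}")
```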

Inferential Statistics

• Body of methods used to draw conclusions or inferences about characteristics of a population based on sample data.

• “Estimation”

Statistical Concepts

• The Population is the group of all items of interest to the statistical practitioner.

• The Sample is a set of data drawn from the population. A descriptive measure of the sample is called a statistic.

• Statistical Inference is the process of making an estimate, prediction, or decision about a population based on sample data.
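The relationship between population, sample and statistic can be sketched in a few lines of Python. The population below is simulated and purely hypothetical; in practice the population mean is usually unknown, and the sample mean serves as its estimate.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: exam marks (0 to 100) of 1,000 students
population = [random.randint(0, 100) for _ in range(1000)]

# Draw a simple random sample of 50 students without replacement
sample = random.sample(population, 50)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # a statistic used as the estimate

print(f"Population mean:        {population_mean:.2f}")
print(f"Sample mean (estimate): {sample_mean:.2f}")
```

The sample mean will generally differ somewhat from the population mean; quantifying that difference is the subject of inferential statistics.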

Statistical Concepts

• A Variable is some characteristic of a population or sample.

• The values of the variable are the possible observations of the variable.

• Data are the observed values of a variable.

Example Stock

Concept      Example
Variable     Anglo American PLC Closing Price
Value        Real numbers (fractional)
Data         Time Series of all Closing Prices (Date – Closing Price)
Sample       JSE ALSI
Population   All JSE-listed companies

Statistical Inference: conclusions about the population are drawn from the sample.

Example Test Marks

Concept      Example
Variable     Mark on statistics exam
Value        Exam marks (0 to 100)
Data         Test marks of k students
Sample       Students from iKapa campus
Population   All Vega students

Statistical Inference: conclusions about the population are drawn from the sample.

Data Types

• Interval data are real numbers, such as heights, weights, incomes, and distance.

• Example: stock performance in %

Date         Return (%)
1/3/2011     -1.34
1/4/2011      0.00
…             …
1/31/2011    +2.05

Data Types

• The values of nominal data are categories. Nominal data are often recorded by arbitrarily assigning a number to each category.

• Example: marital status

Category   Code
Single     1
Married    2
Divorced   3
Widowed    4

Data Types

• Ordinal data appear nominal but their values are in order.

• Example: students evaluating a course

Rating      Code
Poor        1
Fair        2
Good        3
Very Good   4
Excellent   5

The codes are arbitrary apart from their order. Thus, differences between codes have no meaningful interpretation.

Calculations Data Types

• All calculations are allowed on interval data (e.g. calculating the average).

• Codes in nominal data are arbitrary, so averages are not meaningful. Observations can be described by counting the number of observations in each category and reporting the frequencies.

Example Frequencies

• Original responses: 1 2 2 2 4 1 2 2 1 3 4 4 4 3

• Frequency table / Proportions:

Category   Code   Frequency
Single     1      3
Married    2      5
Divorced   3      2
Widowed    4      4
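The frequency table above can be reproduced with a short Python sketch. It uses the responses and category codes from the slide; the use of `collections.Counter` and the proportions column layout are illustrative choices rather than part of the lecture.

```python
from collections import Counter

# Category codes and original responses as given on the slide
labels = {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"}
responses = [1, 2, 2, 2, 4, 1, 2, 2, 1, 3, 4, 4, 4, 3]

# Count how often each code occurs
frequencies = Counter(responses)
total = len(responses)

# Print frequency and proportion (frequency / total) per category
print(f"{'Category':<10}{'Code':>5}{'Frequency':>11}{'Proportion':>12}")
for code in sorted(labels):
    count = frequencies[code]
    print(f"{labels[code]:<10}{code:>5}{count:>11}{count / total:>12.2f}")
```

Note that the codes are only used as labels for counting; no arithmetic is performed on them.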


Calculations Data Types

• The only permissible calculations for ordinal data are ones involving a ranking process (e.g. the median).
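As a sketch of a ranking-based calculation, the snippet below finds the median of a set of course evaluations. The ratings are hypothetical; `statistics.median_low` is chosen here because it always returns one of the observed codes, which keeps the result on the ordinal scale.

```python
import statistics

# Ordinal codes as defined for the course evaluation example
labels = {1: "Poor", 2: "Fair", 3: "Good", 4: "Very Good", 5: "Excellent"}

# Hypothetical evaluations from nine students (illustrative values only)
ratings = [3, 5, 4, 2, 4, 3, 5, 4, 1]

# The median depends only on the ordering of the codes, not on their spacing
median_code = statistics.median_low(ratings)

print(f"Median rating: {labels[median_code]} (code {median_code})")
```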

Data Collection

Primary Data vs Secondary Data

Primary Data

- Questionnaires / surveys
- Cannot be looked up elsewhere
- Collected by observation, survey or experimental research conducted on a part of the total population under consideration (a sample)

Data Collection

Discrete Data vs Continuous Data

Discrete Data

• A random variable whose observations can take on only specific values, usually only integer (whole number) values, is referred to as a discrete random variable.

• Examples:
– Statistics test marks (0 to 100)
– Number of students in a classroom
– The outcome of tossing a die
– The outcome of tossing a coin (binary)

Continuous Data

• Data that are measured on a scale, such as mass or temperature, are called continuous data.

• Examples:
– Time it takes a student to complete a statistics test
– The weight / height of a student
– The return on a stock

Graphical Techniques

• Nominal data: Bar charts / pie charts

• Interval data: Frequency distribution tables and histograms
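For interval data, a frequency distribution table groups the observations into classes, and a histogram draws one bar per class. The sketch below builds such a table and a simple text histogram in plain Python. The returns and the class boundaries (five classes of width 1 starting at -2) are assumptions made for illustration.

```python
# Hypothetical daily stock returns in percent (illustrative values only)
returns = [-1.34, 0.00, 2.05, 0.45, -0.80, 1.10, -0.25, 0.70, 1.95, -1.90]

# Assumed classes: [-2, -1), [-1, 0), [0, 1), [1, 2), [2, 3)
lower, width, n_classes = -2.0, 1.0, 5

# Count the observations falling into each class
counts = [0] * n_classes
for r in returns:
    index = int((r - lower) // width)  # position of the class containing r
    if 0 <= index < n_classes:
        counts[index] += 1

# Print the frequency distribution with a bar of asterisks per class
for i, count in enumerate(counts):
    start = lower + i * width
    print(f"[{start:5.1f}, {start + width:5.1f})  {count:2d}  {'*' * count}")
```

Each class includes its lower boundary and excludes its upper boundary, so every observation is counted exactly once.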