
Page 1

CS 7455

MOBILE APP DEVELOPMENT

Dr. Mingon Kang

Computer Science, Kennesaw State University

Page 2

Terminology

Features

An individual measurable property of a phenomenon being observed

The number of features or distinct traits that can be used to describe each item in a quantitative manner

May have implicit/explicit patterns to describe a phenomenon

Samples

Items to process (classify or cluster)

Can be a document, a picture, a sound, a video, or a patient

Reference: http://www.slideshare.net/rahuldausa/introduction-to-machine-learning-38791937

Page 3

Data In Machine Learning

๐‘ฅ๐‘–: input vector, independent variable

๐‘ฆ: response variable, dependent variable

๐‘ฆ โˆˆ {โˆ’1, 1}: binary classification

๐‘ฆ โˆˆ โ„: regression

Predict a label when having observed some new ๐‘ฅ

Page 4

Types of Variable

Categorical variable: discrete or qualitative variable

Nominal: has two or more categories with no intrinsic order (e.g., blood type)

Dichotomous: a nominal variable with only two categories or levels (e.g., yes/no)

Ordinal: has two or more categories that can be ordered or ranked (e.g., education level)

Continuous variable: a quantitative variable that can take any numeric value within a range (e.g., height, weight)

Page 5

Mathematical Notation

Matrix: uppercase bold Roman letter, X

Vector: lowercase bold Roman letter, x

Scalar: lowercase letter

Transpose of a matrix or vector: superscript T or ′

E.g.

Row vector: (x_1, x_2, …, x_p)

Corresponding column vector: x = (x_1, x_2, …, x_p)^T

Matrix: X = {x_1, x_2, …, x_p}

Page 6

Transpose of a Matrix

Operator which flips a matrix over its diagonal

Switch the row and column indices of the matrix

Denoted as A^T, A′, A^tr, or A^t

[A^T]_ij = [A]_ji; if A is an m×n matrix, A^T is an n×m matrix

(A^T)^T = A

(A + B)^T = A^T + B^T

(AB)^T = B^T A^T
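A small NumPy check of these properties (arbitrary example matrices):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])          # 2x3
C = np.array([[7.0, 8.0, 9.0],
              [0.0, 1.0, 2.0]])          # 2x3, same shape as A
B = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])               # 3x2, so A @ B is defined

print(A.T.shape)                          # (3, 2): an m x n matrix transposes to n x m
print(np.array_equal(A.T.T, A))           # (A^T)^T = A
print(np.allclose((A + C).T, A.T + C.T))  # (A + B)^T = A^T + B^T
print(np.allclose((A @ B).T, B.T @ A.T))  # (AB)^T = B^T A^T
```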

Page 7

Inverse of a Matrix

The inverse of a square matrix A, sometimes called

a reciprocal matrix, is a matrix Aโˆ’1 such that

AAโˆ’1 = I

where I is the identity matrix.

The Inverse of a Matrix is the same idea but we

write it A-1

Reference: https://www.mathsisfun.com/algebra/matrix-inverse.html
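A minimal NumPy sketch (with an arbitrary invertible 2×2 matrix) of computing an inverse and checking the defining property:

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])               # invertible: det(A) = 10, not 0

A_inv = np.linalg.inv(A)                 # raises LinAlgError if A is singular
print(np.allclose(A @ A_inv, np.eye(2))) # A A^{-1} = I
```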

Page 8

What is Machine Learning?

[Diagram: training data is fed into a learning algorithm to produce a model f(x)]

Page 9

What is Machine Learning?

[Diagram: new data is passed to the trained model f(x) to make a decision]

Page 10

Supervised learning

Data: D = {d_1, d_2, …, d_n}, a set of n samples,

where d_i = <x_i, y_i>

x_i is an input vector and y_i is the desired output

Objective: learn the mapping f: X → y

s.t. y_i ≈ f(x_i) for all i = 1, …, n

Regression: y is continuous

Classification: y is discrete
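A toy sketch of this setup (the data and the hand-picked candidate f are invented purely for illustration): a supervised dataset is a collection of <x_i, y_i> pairs, and learning means choosing an f for which y_i ≈ f(x_i).

```python
import numpy as np

# Toy supervised dataset D = {<x_i, y_i>}: x_i is an input vector, y_i the desired output
D = [(np.array([1.0, 2.0]), 5.0),
     (np.array([2.0, 0.0]), 2.0),
     (np.array([3.0, 1.0]), 5.0)]

# A hand-picked candidate mapping f (not learned; just for illustration)
def f(x):
    return x.sum()

# How well does y_i ≈ f(x_i) hold? Learning would adjust f to shrink these errors.
errors = [y - f(x) for x, y in D]
print(errors)   # [2.0, 0.0, 1.0]
```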

Page 11

Linear Regression

Review of Linear Regression in Statistics

Page 12

Linear Regression

Page 13

Linear Regression

Page 14

Linear Regression

Reference: https://lagunita.stanford.edu/c4x/HumanitiesScience/StatLearning/asset/linear_regression.pdf

Page 15

Linear Regression

How to represent the data as a vector/matrix

We assume a model:

y = b_0 + Xb + ε,

where b_0 and b are the intercept and slope, known as coefficients or parameters, and ε is the error term (typically assumed to follow ε ~ N(0, σ²)).

Page 16

Linear Regression

How to represent the data as a vector/matrix

Include bias constant (intercept) in the input vector

๐— โˆˆ โ„๐’ร—(๐’‘+๐Ÿ), ๐ฒ โˆˆ โ„๐’, and ๐› โˆˆ โ„๐’‘+๐Ÿ

๐— = ๐Ÿ, ๐ฑ๐Ÿ, ๐ฑ๐Ÿ, โ€ฆ , ๐ฑ๐ฉ

๐ฒ = {y1, y2, โ€ฆ , yn}T

๐› = {b0, b1, b2, โ€ฆ , bp}T

Page 17

Linear Regression

Find the optimal coefficient vector b whose predictions are most similar to the observed responses

$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \approx \begin{bmatrix} 1 & x_{11} & \cdots & x_{p1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1n} & \cdots & x_{pn} \end{bmatrix} \begin{bmatrix} b_0 \\ \vdots \\ b_p \end{bmatrix}$$

or

$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & \cdots & x_{p1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1n} & \cdots & x_{pn} \end{bmatrix} \begin{bmatrix} b_0 \\ \vdots \\ b_p \end{bmatrix} + \begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix}$$

Page 18

Ordinary Least Squares (OLS)

y = Xb + e

Estimates the unknown parameters b in the linear regression model

Minimizes the sum of the squares of the differences between the observed responses and those predicted by the linear function

Residual Sum of Squares (RSS) = $\sum_{i=1}^{n} (y_i - \mathbf{x}_i \mathbf{b})^2$

Page 19

Ordinary Least Squares (OLS)

Page 20

Optimization

Need to minimize the error

min ๐ฝ(๐›) =

๐‘–=1

๐‘›

(๐‘ฆ๐‘– โˆ’ ๐ฑ๐‘–๐›)2

To obtain the optimal set of parameters b, the derivatives of the error with respect to each parameter must be zero.

Page 21

Optimization

$J = \mathbf{e}^T\mathbf{e} = (\mathbf{y} - \mathbf{X}\mathbf{b})'(\mathbf{y} - \mathbf{X}\mathbf{b}) = (\mathbf{y}' - \mathbf{b}'\mathbf{X}')(\mathbf{y} - \mathbf{X}\mathbf{b}) = \mathbf{y}'\mathbf{y} - \mathbf{y}'\mathbf{X}\mathbf{b} - \mathbf{b}'\mathbf{X}'\mathbf{y} + \mathbf{b}'\mathbf{X}'\mathbf{X}\mathbf{b} = \mathbf{y}'\mathbf{y} - 2\mathbf{b}'\mathbf{X}'\mathbf{y} + \mathbf{b}'\mathbf{X}'\mathbf{X}\mathbf{b}$

$\frac{\partial \mathbf{e}'\mathbf{e}}{\partial \mathbf{b}} = -2\mathbf{X}'\mathbf{y} + 2\mathbf{X}'\mathbf{X}\mathbf{b} = 0$

$\mathbf{X}'\mathbf{X}\,\mathbf{b} = \mathbf{X}'\mathbf{y} \;\Rightarrow\; \hat{\mathbf{b}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$

Matrix Cookbook: https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf
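A minimal NumPy sketch of this closed-form estimate on synthetic data (generated from known coefficients). Forming the explicit inverse is shown only to mirror the formula; in practice np.linalg.lstsq or np.linalg.solve is numerically preferable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: n samples, p features, known coefficients plus a little noise
n, p = 100, 2
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, p))])   # include the bias column
b_true = np.array([1.0, 2.0, -3.0])
y = X @ b_true + 0.1 * rng.normal(size=n)

# Normal equations: b_hat = (X'X)^{-1} X'y
b_hat = np.linalg.inv(X.T @ X) @ X.T @ y
print(b_hat)                                   # close to [1.0, 2.0, -3.0]

# Same estimate via a least-squares solver
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(b_hat, b_lstsq))             # True
```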

Page 22

Linear regression for classification

For binary classification

Encode class labels as y ∈ {0, 1} or {−1, 1}

Apply OLS

Check which class the prediction is closer to

If class 1 is encoded as 1 and class 2 as −1:

class 1 if f(x) ≥ 0

class 2 if f(x) < 0
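A toy sketch of this recipe (invented one-dimensional data): encode the two classes as +1 and −1, fit OLS, and threshold f(x) at zero.

```python
import numpy as np

# Toy 1-D inputs: class 1 clustered around +2 (label +1), class 2 around -2 (label -1)
x = np.array([1.5, 2.0, 2.5, 3.0, -1.5, -2.0, -2.5, -3.0])
y = np.array([1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0])

# Design matrix with a bias column, then the OLS coefficients
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.inv(X.T @ X) @ X.T @ y

# Decision rule: class 1 if f(x) >= 0, class 2 if f(x) < 0
f = X @ b
pred = np.where(f >= 0, 1, -1)
print(pred)   # matches the training labels on this separable toy set
```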

Logistic regression

We will cover this later

Page 23

Linear Model in Computer Vision

Features are pixel values in an image

Data matrix X:

An image is a two dimensional matrix

Need to reshape the 2-D image matrix into a 1-D array (see the sketch after this list)

A Height × Width matrix becomes a Height*Width-length array for a single image

If we have N samples, X will be an N × (Height*Width) matrix

Assume that every pixel is independent
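A short NumPy sketch of this reshaping (random pixel values standing in for real images):

```python
import numpy as np

rng = np.random.default_rng(0)

# N grayscale images, each a Height x Width matrix of pixel values
N, H, W = 5, 28, 28
images = rng.random((N, H, W))

# Flatten each 2-D image into a 1-D array of length Height*Width,
# giving a data matrix X of shape N x (Height*Width)
X = images.reshape(N, H * W)
print(X.shape)    # (5, 784)
```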

Page 24

Linear Model in Computer Vision

Applications

Any classification and regression problems:

Gender/age classification

Hand-written digit classification

MNIST

However, linear models are not the best for classification problems

Logistic regression is a linear model for classification.