1 introduction ling 572 fei xia, dan jinguji week 1: 1/08/08

Post on 21-Dec-2015

217 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

1

Introduction

LING 572

Fei Xia, Dan Jinguji

Week 1: 1/08/08

Outline

• General course information

• Course contents

• Reading assignment #1: due 1/10

• Take-home exam #1: due 1/10

2

3

General info

• Course url: http://courses.washington.edu/ling572

– Syllabus (incl. slides, assignments, and papers): updated every week.

– GoPost:– Collect it:

• Please check your emails and GoPost at least once per day.

4

Slides

• The slides will be online before class.

• The final version will be uploaded a few hours after class.

• “Additional slides” are not required and not covered in class.

5

Prerequisites• CS 326 (Data Structures) or equivalent:

– Ex: hash table, array, tree, …

• Stat 391 (Prob. and Stats for CS) or equivalent: Basic concepts in probability and statistics– Ex: random variables, chain rule, Bayes’ rule

• Programming in Perl, C, C++, Java, or Python

• Basic unix/linux commands (e.g., ls, cd, ln, sort, head):  tutorials on unix

• LING570:

• If you don’t meet all the prerequisites above, you need to get permission from Fei before taking LING572.

LING570 prerequisites

• If you did not take LING570 last quarter, you need to understand all the material covered in that class.

• Especially,– the material in Weeks #9-#10– hw #10, Quiz #4– the Mallet package– http://courses.washington.edu/ling570

6

7

Topics covered in Ling570

• Unit #1:– Formal languages and formal grammars– FSA, FST– Morphological analysis

• Unit #2: LM, ngram, and smoothing

• Unit #3: HMM and POS tagging

• Unit #4: Classification and sequence labeling tasks.

8

Grades for LING572

No midterm or final exams.

Graded:• Assignments (9): 65-75%• Take-home exams (3-5): 15-25%

Not graded:• Reading assignments (5-9): 5-10%• Class participation: 5%

9

Office hour

• Fei:– Email:

• Email address: fxia@u.washington.edu • Subject line should include “ling572”• The 48-hour rule: it works both ways

– Office hour: • Time: Fr: 10:30-11:30am• Location: Padelford A-210G

10

TA hour

• Dan Jinguji– Email: danjj@u.washington.edu

– Time: T: ??

– Location: Art 337

Assignments / Exams

• Assignments: the same as in ling570

• Take-home exams: to replace in-class quizzes

• Reading assignments: some papers should be read before class.

• When there are take-home exams or reading assignments for the same period, the amount of assignments will be reduced accordingly.

• The total amount of time spent on Ling572 will be about 15-30 hours.

11

Assignments

• Nine assignments

• Programming languages: C, C++, Java, Perl, or Python.

• Please follow the instructions in the assignments, including– command line format– file format– the probability model– …

12

The Mallet package

• Several assignments will use the Mallet package.

• If you don’t know how to use Mallet, you should get familiar with the package ASAP. You can start with the hw10 from LING570.

• The Mallet slides are at http://courses.washington.edu/ling570/fei_fall07/11_28_mallet.pdf

• The LING570 hw10 is at http://courses.washington.edu/ling570/fei_fall07/hw10.pdf

13

14

Assignment submission

• Use “Collect it”: submit the tar file.– E.g., tar –cvf hw1.tar hw1_dir

• Due date: every Thurs at 1pm unless specified otherwise

• The submission area is closed 4 days after the due date.

• There is 1% penalty for every hour after the due date.

15

Homework Submission (cont)

• Each submission includes – a note file: hw1.(txt|doc|pdf) for hw1.

• If your code does not work, explain in the note file what you have implemented so far.

– a set of shell scripts: e.g., kNN.sh– source code: e.g., kNN.C– binary code (for C/C++/Java): kNN.out– data files if any.– The TA will NOT compile or debug your code.

• Time spent on an assignment: 10-20 hours/week

I would appreciate it if you could tell me the time you spent on the homework.

Take-home exams

• Normally, it is handed out on Tuesday, and due on Thursday

• Bring the hardcopy of your solution to class.

• The dates on the tentative schedule are subject to change.

• Extension will be granted for the exams ONLY under extremely unusual circumstances.

• There are no makeup exams. If you know that you will miss a class in which an exam could be given, you need to inform me at least two hours before the class starts.

16

Take-home exams (cont)

• You should complete the exams on your own.

• No discussion among students is allowed.

• You can refer to anything that is available on the course url and on patas, but please don’t search the Web for answers.

• If you have any questions about the exams, please email Fei.

17

Reading assignments

• You will answer some questions about the papers that will be discussed in next class.

• The questions are on teaching slides, and there are no separate documents for them.

• Your answers should be concise and no more than a few lines.

• Your answers are due before the next class. Bring the hardcopy of your answers to class.

18

Summary of assignments and exams

Assignments (hw)

Take-home exams

Reading assignments

Num 9 3-5 5-9

Distribution Download from the course url

Distributed in class

Download from the course url

Discussion Allowed Disallowed Allowed

Submission Collect It Bring to class Bring to classNot graded

Due date 1pm every Thurs Before next class Before next class

Extension 1% penalty per hour

Normally disallowed

Disallowed

Estimate of hours 10-20 hours 1-5 hours 2-5 hours

Solution files On Patas On Patas Discussed in class

19

Patas

• If you need to have a patas account, you need to email linghelp@u.washington.edu right away to get an account.

• The directory for LING572:

~/dropbox/07-08/572/– hw1/, hw2/, ….: Assignments and solution– misc_slides/: Solution to exams and misc slides that

are not on the course url.

• For jobs that run more than 5 minutes, use the cluster submission commands: see Hw1.

20

21

Course plan

Types of ML problems

• Classification problem• Estimation problem• Clustering• Discovery• …

A learning method can be applied to one or more types of ML problems.

We will focus on the classification problem.

22

Course objectives

• Covering basic statistical methods that produce state-of-the-art results

• Focusing on classification and sequence labeling problems

• Some ML algorithms are complex. We will focus on basic ideas, not theoretical proofs.

23

Main units• Unit #1 (2 weeks): simple classification algorithms

– kNN– Decision tree– Naïve Bayes

• Unit #2 (3 weeks): advanced classification algorithms– MaxEnt* – SVM**

24

Main units (cont)

• Unit #3 (2 weeks): sequence labeling algorithms– TBL* (if time permits)– CRF**

• Unit #4 (1 week): system combination

• Unit #5 (if time permits)– Introduction to semi-supervised learning **– Introduction to EM **

25

Other topics

• Information theory

• Feature selection

• Converting multi-class task to binary classification task

26

Three levels of discussion

The default (i.e., unmarked): We will discuss the model, training, decoding, etc. – kNN, Decision tree, Naïve Bayes.

*: We will discuss the model, but not the training and other implementation issues:– MaxEnt, TBL

**: We will only go over the main intuition about the algorithms:– SVM, CRF, semi-supervised learning, EM

27

Questions for each ML method

• Modeling: – what is the model? – What kind of assumption is made by the model? – How many types of model parameters? – How many “internal” (or non-model) parameters?– …

28

Questions for each method (cont)

• Training: how to estimate parameters?

• Decoding: how to find the “best” solution?

• Weaknesses and strengths? – Is the algorithm

• robust? (e.g., handling outliners) • scalable?• prone to overfitting?• efficient in training time? Test time?

– How much data is needed?• Labeled data• Unlabeled data

29

Reading assignment #1

30

Reading assignment #1

• Read M&S 2.2: Essential Information Theory

• Questions: For a random variable X, p(x) and q(x) are two distributions: Assuming p is the real distribution.– p(X=a)=p(X=b)=1/8, p(X=c)=1/4, p(X=d)=1/2– q(X=a)=q(X=b)=q(X=c)=q(X=d)=1/4

(a) What is H(X)?

(b) What is cross entropy H(X,q)?

(c) What is KL divergence D(p||q)?

(d) What is D(q||p)?31

Next time

• Both Reading #1 and Exam #1 are due before class.

• Bring the hardcopy to class.

• Topics:– Information theory and hw #1– Solution to Exam #1 (if time permits)– Recap on the classification problem (if time permits)

32

top related