lec1 · title: lec1.key created date: 1/30/2018 5:57:27 pm
Tue/Thu 1:25-2:40
Intro to Data Science
Kimball B11
https://courses.cit.cornell.edu/info2950_2018sp/
Instructor: Paul Ginsparg (242 Gates Hall)
Info 2950, Lecture 1 25 Jan 2018
Final: Tue 22 May 2:00-4:30
cs 2110/2800
(last year) https://courses.cit.cornell.edu/info2950_2017sp/
https://piazza.com/cornell/spring2018/info2950/home
Rough Syllabus
0. Review of basic python / jupyter notebook
1. Counting and probability (factorial, binomial coefficients, conditional probability, Bayes' Theorem)
   Real Data: text classifier, etc. [baby machine learning]
2. Statistics: mean, variance; binomial, Gaussian, Poisson distributions
3. Graph theory (nodes, edges), networks (c.f. Info2040), graph algorithms
4. Power Law data (need exponential and logarithms …)
5. Linear and Logistic regression, Pearson and Spearman correlators
6. Markov and other correlated data
Rosen chapters 2, 6, 7, 10, 11
Easley/Kleinberg chpts 3, 18
+ many other on-line resources [e.g. http://www.cs.cornell.edu/courses/cs1380/2018sp/textbook/, adapted from Berkeley http://data8.org/ (started apr ’16)]
I would found an institution where any person can study data science. - Ezra Cornell
CS 1380 + ORIE 1380 + STSCI 1380
Data Science For All
Spring 2018 MWF 10:10-11:00 am No experience required – Open to all – Fulfills MQR-AS
A course for anyone who wants to study data visualization, prediction, machine learning, and programming in Python. We’ll analyze real-world datasets on crime, health, transportation, literature, and more!
https://tinyurl.com/datascienceforall
Problem sets will involve both programming and non-programming problems.
Problem sets are not group projects. You are expected to abide by the Cornell University Code of Academic Integrity. It is your responsibility to understand and follow these policies. (In particular, the work you submit for course assignments must be your own. You may discuss homework assignments with other students at a high level, for example by discussing general methods or strategies to solve a problem, but you must cite the other student in your submission. Any work you submit must be your own understanding of the solution, the details of which you personally and individually worked out, and written in your own words.)
You’ll be penalized if you copy an iPython notebook, OR if yours is copied.
Will be discussed in section tomorrow:
including instructions for installing anaconda. We'll standardize on python 3 due to minor python 2.7/3.6 compatibility issues (though you are welcome to use python 2).
course upload site: https://pgcourse.infosci.cornell.edu/cgi-bin/probset.py
N.B.: (former?) known problem with python installations: cs 1110 unfortunately recommended misconfigured software that violates standard practice by surreptitiously adding environment variables to ~/.bashrc file.
(instructions for removing: https://courses.cit.cornell.edu/info2950_2018sp/resources/bashprob.html )
“Problem Set 0”, due Tue 30 Jan 23:59
Definition. A set S is a collection of objects.
The objects of a set are called elements x of the set: x ∈ S, or x ∉ S
Examples:
X = {1, 2, 3, 4, 5}
C = {Ithaca, Boston, Chicago}
Stuff = {1, snow, Cornell, y}
A = {1, 2, 3}
empty set = ∅
Can also be defined by rule or equation:
Example: E is the set of even numbers. E = {x | x is even}
Cardinality |S| is the number of elements of S
Examples: |X| = 5, |C| = 3, |Stuff| = 4, |∅| = 0
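These definitions map onto Python's built-in set type. A minimal sketch, assuming the example sets above: the elements snow, Cornell, y are written as strings, and `E_small` is a finite stand-in for the infinite set E (both are choices made here for illustration).

```python
# Example sets written as Python sets.
X = {1, 2, 3, 4, 5}
C = {"Ithaca", "Boston", "Chicago"}
Stuff = {1, "snow", "Cornell", "y"}
empty = set()  # note: {} would create an empty dict, not an empty set

# A set defined by a rule. E itself is infinite, so this sketch keeps
# only the even numbers from 1 to 10 (an assumption for illustration).
E_small = {x for x in range(1, 11) if x % 2 == 0}

# Membership: x in S, x not in S
print(3 in X)           # True
print("snow" not in X)  # True

# Cardinality |S|
print(len(X), len(C), len(Stuff), len(empty))  # 5 3 4 0
```

Duplicates collapse automatically: `{1, 1, 2}` has two elements.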
[Figure: a set S with |S| = 29]
A subset T of a set S is a set of elements all of which are contained in S.
T ⊂ S (proper subset) or T ⊆ S
empty set ∅ ⊆ S for all S
Examples:
C′ = {Ithaca, Chicago}
C′ ⊂ C
X′ = {x | x is a whole number between 2 and 5} is a subset of X
The power set P(S) of a set S is the set of all subsets of S.
Example: For the set A = {1, 2, 3}, P(A) = {∅, {1}, {2}, {3}, {1, 2}, {2, 3}, {1, 3}, {1, 2, 3}}
For a set S with n elements, what is |P(S)|?
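One way to explore the question is to enumerate the power set directly. A sketch, assuming a helper named `power_set` (hypothetical, not from the course materials) built on `itertools.combinations`. Each element is either in or out of a given subset, which suggests |P(S)| = 2^n; the count for A agrees.

```python
from itertools import combinations

def power_set(s):
    """Return all subsets of s as a set of frozensets."""
    items = list(s)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}

A = {1, 2, 3}
P_A = power_set(A)

print(len(P_A), 2 ** len(A))    # 8 8
print(frozenset() in P_A)       # True: the empty set is a subset of A
print({1, 2} <= A, {1, 2} < A)  # True True   (subset, proper subset)
print(A <= A, A < A)            # True False  (A is not a proper subset of itself)
```

Subsets are stored as `frozenset` because ordinary Python sets are mutable and cannot be elements of another set.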
difference of two sets A − B = {x | x ∈ A and x ∉ B}
Examples:
X − Stuff = {2, 3, 4, 5}
Stuff − X = {snow, Cornell, y}
C − ∅ = {Ithaca, Boston, Chicago}
X − E = {1, 3, 5}
symmetric difference A △ B = {x | x ∈ A or x ∈ B, and x ∉ A ∩ B}
Examples:
X △ Stuff = {2, 3, 4, 5, snow, Cornell, y}
C △ ∅ = {Ithaca, Boston, Chicago}
X △ A = {4, 5}
Cartesian product of two sets A × B = {(x, y) | x ∈ A and y ∈ B}
Example:
A × A = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)}
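The same examples can be checked in Python, where `-` is set difference and `^` is symmetric difference. A sketch, assuming the string encoding of snow, Cornell, y and using `itertools.product` for the Cartesian product.

```python
from itertools import product

X = {1, 2, 3, 4, 5}
A = {1, 2, 3}
Stuff = {1, "snow", "Cornell", "y"}

# Difference: elements of the left set that are not in the right set.
print(X - Stuff == {2, 3, 4, 5})              # True
print(Stuff - X == {"snow", "Cornell", "y"})  # True

# Symmetric difference: in exactly one of the two sets.
print(X ^ A == {4, 5})                        # True
print(X ^ Stuff == (X | Stuff) - (X & Stuff)) # True, matches the definition

# Cartesian product A x A as a set of ordered pairs.
AxA = set(product(A, repeat=2))
print(len(AxA))                               # 9
print((1, 2) in AxA, (2, 1) in AxA)           # True True (order matters)
```

Note that difference is not symmetric: X − Stuff and Stuff − X are different sets.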
For two sets to be the same, they must have the same elements.
A = B means that ∀x we have x ∈ A iff x ∈ B
(Equivalently A = B means that A ⊆ B and B ⊆ A)
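Both forms of the equality test can be written out explicitly. A small sketch; the set B here is a made-up example showing that order and repetition do not matter.

```python
A = {1, 2, 3}
B = {3, 2, 1, 1}  # same elements, listed in a different order with a repeat

# Elementwise form: for every x, x in A iff x in B.
# Checking the elements of A | B is enough for finite sets.
same_elements = all((x in A) == (x in B) for x in A | B)

# Double-inclusion form: A is a subset of B and B is a subset of A.
both_subsets = A <= B and B <= A

print(A == B, same_elements, both_subsets)  # True True True
```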
Set Operations
union of two sets A ∪ B = {x | x ∈ A or x ∈ B}
Examples:
X ∪ Stuff = {1, 2, 3, 4, 5, snow, Cornell, y}
C ∪ ∅ = {Ithaca, Boston, Chicago}
A ∪ X = {1, 2, 3, 4, 5} (In this case, A ∪ X = X).
intersection of two sets A ∩ B = {x | x ∈ A and x ∈ B}
Examples:
X ∩ Stuff = {1}
C ∩ ∅ = ∅
X ∩ E = {2, 4}
A ∩ X = {1, 2, 3} (In this case A ∩ X = A)
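In Python, `|` is union and `&` is intersection. A sketch checking the examples above, again assuming strings for snow, Cornell, y and a finite stand-in `E_small` for the set of even numbers.

```python
X = {1, 2, 3, 4, 5}
A = {1, 2, 3}
C = {"Ithaca", "Boston", "Chicago"}
Stuff = {1, "snow", "Cornell", "y"}
E_small = {x for x in range(1, 11) if x % 2 == 0}  # finite stand-in for E

# Union: in at least one of the two sets.
print((X | Stuff) == {1, 2, 3, 4, 5, "snow", "Cornell", "y"})  # True
print((C | set()) == C)                                        # True

# Intersection: in both sets.
print((X & Stuff) == {1})      # True
print((C & set()) == set())    # True
print((X & E_small) == {2, 4}) # True

# Because A is a subset of X, the union is X and the intersection is A.
print((A | X) == X, (A & X) == A)  # True True
```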
Conditional Probability
Suppose we know that one event has happened and we wish to ask about another.
For two events A and B, the joint probability of A and B is defined as
p(A,B) = p(A ∩ B)
the probability of the intersection of events A and B in the sample space,
equivalently the probability that events A and B both occur
The conditional probability of A relative to B is
p(A|B) = p(A ∩ B)/p(B) “the probability of A given B”
Example: Flip a fair coin 3 times.
B = event that we have at least one H
A = event of getting exactly 2 Hs
What is the probability of A given B?
In this case, (A ∩ B) = A, p(A) = 3/8, p(B) = 7/8,
and therefore p(A|B) = 3/7.
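The numbers in this example can be confirmed by listing all 8 outcomes. A sketch, assuming equally likely outcomes and a helper `prob` (a name introduced here) that counts the outcomes in an event.

```python
from fractions import Fraction
from itertools import product

# All 2^3 = 8 equally likely outcomes of three flips, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

def prob(event):
    """Probability of an event, given as a true/false test on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def A(o):
    return o.count("H") == 2   # exactly 2 Hs

def B(o):
    return o.count("H") >= 1   # at least one H

p_A = prob(A)
p_B = prob(B)
p_A_and_B = prob(lambda o: A(o) and B(o))
p_A_given_B = p_A_and_B / p_B

print(p_A, p_B, p_A_and_B)  # 3/8 7/8 3/8
print(p_A_given_B)          # 3/7
```

Using `Fraction` keeps the arithmetic exact, so the result prints as 3/7 rather than a rounded decimal.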
Notice that the definition of conditional probability also gives us the formula: p(A∩B) =
p(A|B)p(B). For three events we have: p(A ∩ B ∩ C) = p(A|B ∩ C)p(B|C)p(C). (What is
a general rule?)
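One candidate for the general rule, obtained by applying p(A ∩ B) = p(A|B)p(B) repeatedly. The event names A_1, …, A_n are notation introduced here, and every conditioning event is assumed to have nonzero probability.

```latex
p(A_1 \cap A_2 \cap \cdots \cap A_n)
  = p(A_1 \mid A_2 \cap \cdots \cap A_n)\,
    p(A_2 \mid A_3 \cap \cdots \cap A_n)\,
    \cdots\,
    p(A_{n-1} \mid A_n)\,
    p(A_n)
```

For n = 2 and n = 3 this reduces to the two formulas above.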
We can also use conditional probabilities to find the probability of an event by breaking
the sample space into disjoint pieces. If S = S1 ∪ S2 ∪ . . . ∪ Sn and all pairs Si, Sj are disjoint
then for any event A, p(A) = Σ_i p(A|Si) p(Si) = Σ_i p(A ∩ Si).
Example: Suppose we flip a fair coin twice. Let S1 be the outcomes where the first flip
is H and S2 be the outcomes where the first flip is T . What is the probability of A = getting
2 Hs? p(A) = (1/2)(1/2) + (0)(1/2) = 1/4.
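The same sum can be computed by enumeration. A sketch, assuming equally likely outcomes; the helpers `prob` and `cond_prob` are names introduced here.

```python
from fractions import Fraction
from itertools import product

# The 4 equally likely outcomes of two flips.
outcomes = list(product("HT", repeat=2))

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(event, given):
    """p(event | given) = p(event and given) / p(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

def A(o):
    return o == ("H", "H")   # getting 2 Hs

def S1(o):
    return o[0] == "H"       # first flip is H

def S2(o):
    return o[0] == "T"       # first flip is T

# Sum over the disjoint pieces S1 and S2.
total = cond_prob(A, S1) * prob(S1) + cond_prob(A, S2) * prob(S2)

print(cond_prob(A, S1), cond_prob(A, S2))  # 1/2 0
print(total, prob(A))                      # 1/4 1/4
```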
Two events A and B are independent if p(A ∩ B) = p(A)p(B). This immediately gives:
A and B are independent iff p(A|B) = p(A).
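Independence can be tested the same way. A sketch on three fair flips; the event names and the helper `independent` are introduced here for illustration. The first pair of events is independent, while the pair from the earlier example (exactly 2 Hs, at least one H) is not.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))  # 8 equally likely outcomes

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def independent(e1, e2):
    """Check the definition p(e1 and e2) == p(e1) * p(e2)."""
    return prob(lambda o: e1(o) and e2(o)) == prob(e1) * prob(e2)

def first_is_H(o):
    return o[0] == "H"

def second_is_H(o):
    return o[1] == "H"

def exactly_two_H(o):
    return o.count("H") == 2

def at_least_one_H(o):
    return o.count("H") >= 1

print(independent(first_is_H, second_is_H))        # True:  1/4 == (1/2)(1/2)
print(independent(exactly_two_H, at_least_one_H))  # False: 3/8 != (3/8)(7/8)
```

The second result matches the earlier example, where p(A|B) = 3/7 differs from p(A) = 3/8.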