
Post on 17-Feb-2018


Python for Data Science (DataCamp)

ch1: Python basics

print()
type(): int, float, str, bool
# In Python, double quotes " " and single quotes ' ' have identical functionality, unlike PHP or Bash
In [16]: 2 + 3
Out[16]: 5
In [17]: 'ab' + 'cd'
Out[17]: 'abcd'
help(function): opens the documentation

ch2: Python lists

[a, b, c]
A list can contain any type, and can contain different types.
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam2 = [["liz", 1.73], ["emma", 1.68], ["mom", 1.71], ["dad", 1.89]]
In [13]: type(fam)
Out[13]: list
In [14]: type(fam2)
Out[14]: list

Subsetting lists
fam[3]: 1.68
fam[-1]: 1.89
fam[3:5]: [1.68, 'mom']    [ start : end ], start inclusive, end exclusive
fam[:4]: ['liz', 1.73, 'emma', 1.68]
Adding elements: fam + ["me", 1.79]
Delete elements: del(fam[2])
# Note: a list is not a primitive type, so after y = x, y refers to the same list as x. Any in-place change to y also changes x.
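The note about y = x can be checked with a minimal sketch. The names alias and fam_copy are made up for this illustration; only fam comes from the notes.

```python
# The fam list from the notes above.
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

alias = fam          # alias refers to the same list object as fam
fam_copy = fam[:]    # slicing builds a new list with the same elements

alias[1] = 1.74      # changes the one list that fam and alias share
fam_copy[3] = 1.70   # changes only the copy

print(fam[1])        # 1.74
print(fam[3])        # 1.68
print(alias is fam, fam_copy is fam)
```

Using list(fam) instead of fam[:] gives the same kind of independent copy.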

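ch3: Functions & Packages

max(fam)  # maximum of a list; the elements must be mutually comparable, so in Python 3 this works on a list of heights only and raises TypeError for the mixed str/float fam above
len()  # length of a list or string
round()  # round(number[, ndigits]): round a number to a given precision in decimal digits (default 0)
round(1.68, 1)  # 1.7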

The list.count() method counts how many times an element occurs in a list and returns that count.
fam.append("me")

Methods: everything is an object. Objects have methods associated with them, depending on their type.
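A short sketch of how the available methods depend on the object's type. The variable names are chosen for this illustration; the calls themselves (str.upper, str.count, list.index, list.count, list.append) are standard Python.

```python
sister = "liz"
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# String methods
upper_name = sister.upper()    # "LIZ"
z_count = sister.count("z")    # 1

# List methods
mom_index = fam.index("mom")   # 4
count_173 = fam.count(1.73)    # 1

# append changes the list in place and returns None
fam.append("me")
fam.append(1.79)
print(upper_name, z_count, mom_index, count_173, len(fam))
```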

ch4: NumPy

List recap: powerful, collection of different types, change/add/remove
But lists lack mathematical operations over collections, and speed
For example:

Solution: NumPy
Numeric Python
Alternative to the Python list: the NumPy array
Calculations over entire arrays
Easy and fast
Installation in the terminal: pip3 install numpy
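A minimal sketch of the difference, assuming a body-mass-index calculation as the use case. The heights are the values from fam; the weights are made-up values for illustration only.

```python
import numpy as np

height = [1.73, 1.68, 1.71, 1.89]   # metres
weight = [65.4, 59.2, 63.6, 88.4]   # kilograms, illustrative values

# Plain lists do not support element-wise arithmetic.
try:
    bmi = weight / height ** 2
except TypeError as err:
    print("lists cannot do this:", err)

# NumPy arrays apply the operation to every element at once.
np_height = np.array(height)
np_weight = np.array(weight)
bmi = np_weight / np_height ** 2
print(bmi)
```

Note that a NumPy array holds a single type, so mixing strings and floats as in fam would convert everything to strings.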

NumPy methods: np.mean, np.median, np.corrcoef, np.std, np.sort, np.sum
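The functions listed above can be tried on small arrays. This is only a sketch; the weights are made-up values for illustration.

```python
import numpy as np

heights = np.array([1.73, 1.68, 1.71, 1.89])
weights = np.array([65.4, 59.2, 63.6, 88.4])   # illustrative values

print(np.mean(heights))     # arithmetic mean
print(np.median(heights))   # middle value of the sorted data
print(np.std(heights))      # standard deviation
print(np.sort(heights))     # sorted copy, the original is unchanged
print(np.sum(weights))      # sum of all elements

# corrcoef returns a 2x2 correlation matrix for two 1-D inputs
corr = np.corrcoef(heights, weights)
print(corr)
```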

intermediate_ch1: Matplotlib

Functions: Visualization; Data structures; Control structures; Case study

plt.scatter(x, y)
help(plt.hist)
plt.xlabel('Year')
plt.title('…')
plt.yticks([0, 2, 4, 6, 8], ['Germany', 'Dutch', 'China', 'US', 'UK'])
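A small sketch that puts these calls together. The data values, the tick labels, and the title are made up for illustration, and the Agg backend plus the in-memory buffer are assumptions so the script runs without a display; in a notebook, plt.show() would be used instead.

```python
import io

import matplotlib
matplotlib.use("Agg")            # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

year = [1950, 1970, 1990, 2010]
pop = [2.5, 3.7, 5.3, 7.0]       # rounded, illustrative values in billions

plt.scatter(year, pop)
plt.xlabel("Year")
plt.ylabel("Population")
plt.title("World population (illustrative values)")
# First list: tick positions, second list: the labels shown at those positions
plt.yticks([0, 2, 4, 6, 8], ["0", "2B", "4B", "6B", "8B"])

# Render the figure to an in-memory PNG instead of opening a window
buf = io.BytesIO()
plt.savefig(buf, format="png")
plt.close()
```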

intermediate_ch2: Dictionaries & Pandas

dict_name[key]
result: value

Dictionaries can contain key: value pairs where the values are again dictionaries.
europe = { 'spain': { 'capital':'madrid', 'population':46.77 },
           'france': { 'capital':'paris', 'population':66.03 },
           'germany': { 'capital':'berlin', 'population':80.62 },
           'norway': { 'capital':'oslo', 'population':5.084 } }
# Print out the capital of France
print(europe['france']['capital'])
# Create sub-dictionary data
data = {'capital':'rome', 'population':59.83}
# Add data to europe under key 'italy'
europe['italy'] = data
print(europe)

Pandas

Pandas is an open source library, providing high-performance, easy-to-use data structures and data analysis tools for Python.

The DataFrame is one of Pandas' most important data structures. It's basically a way to store tabular data where you can label the rows and the columns. One way to build a DataFrame is from a dictionary.
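A minimal sketch of building a DataFrame from a dictionary, reusing the country data from the dictionary notes. The row labels (ES, FR, DE, NO) are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Each key becomes a column label, each list becomes the column's values.
data = {
    "country": ["Spain", "France", "Germany", "Norway"],
    "capital": ["Madrid", "Paris", "Berlin", "Oslo"],
    "population": [46.77, 66.03, 80.62, 5.084],
}
europe_df = pd.DataFrame(data)

# Replace the default 0, 1, 2, ... row labels with custom ones.
europe_df.index = ["ES", "FR", "DE", "NO"]
print(europe_df)
```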

Index and select data:
    Square brackets
    Advanced methods: loc, iloc

# Note: The single bracket version gives a Pandas Series, the double bracket version gives a Pandas DataFrame.
loc and iloc allow you to select both rows and columns from a DataFrame.
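The selection styles can be compared side by side in a short sketch. The DataFrame and its row labels are hypothetical, built only for this illustration.

```python
import pandas as pd

europe_df = pd.DataFrame(
    {
        "capital": ["Madrid", "Paris", "Berlin", "Oslo"],
        "population": [46.77, 66.03, 80.62, 5.084],
    },
    index=["ES", "FR", "DE", "NO"],   # hypothetical row labels
)

series = europe_df["capital"]      # single brackets: a Series
frame = europe_df[["capital"]]     # double brackets: a one-column DataFrame
rows = europe_df[1:3]              # a slice in square brackets selects rows

# loc selects by label, iloc selects by integer position
by_label = europe_df.loc[["FR", "DE"], ["capital", "population"]]
by_position = europe_df.iloc[[1, 2], [0, 1]]

print(type(series), type(frame))
print(by_label)
```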

# Note: about differences between Pandas Series and DataFrame
pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.
So the Series is the data structure for a single column of a DataFrame, not only conceptually, but literally, i.e. the data in a DataFrame is actually stored in memory as a collection of Series.
https://stackoverflow.com/questions/26047209/what-is-the-difference-between-a-pandas-series-and-a-single-column-dataframe

https://www.tutorialspoint.com/python_pandas/python_pandas_dataframe.htm
https://www.tutorialspoint.com/python_pandas/python_pandas_series.htm

intermediate_ch3: Comparison Operators

Comparison operators: how Python values relate
<, >, <=, >=, ==, !=
Boolean operators: and, or, not
# Note: when dealing with NumPy arrays, use np.logical_and(logic_array1, logic_array2), np.logical_or(logic_array1, logic_array2) and np.logical_not(logic_array) for element-wise comparison
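A minimal sketch of the element-wise versions. Using the plain and / or keywords on arrays with more than one element raises a ValueError, which is why the NumPy functions are needed. The array and the thresholds are chosen only for illustration.

```python
import numpy as np

heights = np.array([1.73, 1.68, 1.71, 1.89])

# A comparison on an array gives a boolean array of the same shape
print(heights > 1.7)

medium = np.logical_and(heights > 1.7, heights < 1.8)
extreme = np.logical_or(heights < 1.7, heights > 1.8)
not_tall = np.logical_not(heights > 1.8)

# A boolean array can be used to subset the original array
print(heights[medium])   # [1.73 1.71]
```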

Conditional statements:
if condition :
    expression 1
elif condition :
    expression 2
else :
    expression 3
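As a concrete instance of the if / elif / else pattern, here is a small sketch. The function name describe_height and the thresholds are hypothetical.

```python
def describe_height(height):
    """Classify a height in metres; the thresholds are arbitrary examples."""
    if height > 1.8:
        return "tall"
    elif height > 1.7:
        return "medium"
    else:
        return "short"


for h in [1.89, 1.73, 1.68]:
    print(h, describe_height(h))
```

Only the first branch whose condition is true runs, so 1.89 is reported as tall even though it also satisfies height > 1.7.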

Filtering a Pandas DataFrame:
Example 1, comparison: select countries with an area over 8 million km2

Example 2, Boolean operators: also numpy.logical_and(), numpy.logical_or(), numpy.logical_not()
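Both filtering examples can be sketched on a small table. The DataFrame below is an assumption: the name brics, the row labels, and the approximate areas (in million km2) are included only for illustration.

```python
import numpy as np
import pandas as pd

brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],   # million km2, approximate
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Example 1: a comparison on a column gives a boolean Series,
# which is then used to select the matching rows.
is_huge = brics["area"] > 8
huge = brics[is_huge]
print(huge)

# Example 2: combine two conditions element-wise.
between = brics[np.logical_and(brics["area"] > 8, brics["area"] < 10)]
print(between)
```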

intermediate_ch4: While loop

while loop: repeat the action as long as the condition is true
while condition :
    expression

for loop: for each var in seq, execute expression
for var in seq :
    expression

enumerate(obj): iterator yielding (index, value) pairs of an iterable
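A short sketch of both loop types. The starting value, the divisor, and the variable names are arbitrary choices for this illustration.

```python
# while: keep dividing until the error is no longer above 1
error = 50.0
steps = 0
while error > 1:
    error = error / 4
    steps += 1
print(steps, error)   # 3 0.78125

# for with enumerate: get the index and the value together
heights = [1.73, 1.68, 1.71, 1.89]
lines = []
for index, h in enumerate(heights):
    lines.append("index " + str(index) + ": " + str(h))
print(lines)
```

If the condition of a while loop never becomes false, the loop runs forever, so the body has to change something the condition depends on.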

Loop over string

Loop over Dictionary: dict.items()

Loop over NumPy arrays: np.nditer(obj)

Loop over DataFrame: my_pandas_dataframe.iterrows()

Pandas method apply(function): applies a function to each element of a Series (or along an axis of a DataFrame)

Recap:
Dictionary: for key, val in dict.items() :
NumPy array: for var in np.nditer(my_array) :
DataFrame: for lab, row in my_pandas_dataframe.iterrows() :
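The loop patterns from the recap, plus apply, in one runnable sketch. All data and names (world, grid, df, name_length) are made up for illustration.

```python
import numpy as np
import pandas as pd

# String: one character per iteration
letters = [c.upper() for c in "fam"]

# Dictionary: items() yields (key, value) pairs
world = {"afghanistan": 30.55, "albania": 2.77, "algeria": 39.21}
for key, value in world.items():
    print(key + ": " + str(value))

# NumPy array: nditer visits every element of a multi-dimensional array
grid = np.array([[1, 2], [3, 4]])
flat = [int(x) for x in np.nditer(grid)]

# DataFrame: iterrows() yields (row label, row as a Series)
df = pd.DataFrame(
    {"country": ["Spain", "France"], "capital": ["Madrid", "Paris"]},
    index=["ES", "FR"],
)
capitals = []
for lab, row in df.iterrows():
    capitals.append(lab + ": " + row["capital"])

# apply: call a function on each element of a column, without an explicit loop
df["name_length"] = df["country"].apply(len)
print(df)
```

For column-wise calculations, apply is usually preferable to building the column inside an iterrows() loop.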

intermediate_ch5: Random Numbers

import numpy as np
np.random.seed(num)
np.random.rand()  # random float in [0, 1)
np.random.randint(start, end)  # random integer from start (inclusive) to end (exclusive)

Throw coins 10 times, count number of times tails appeared, store this number in final_tails list. Repeat 100 times.
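One possible sketch of this exercise. The seed value 123 and the convention that 1 means tails are assumptions made for this illustration.

```python
import numpy as np

np.random.seed(123)   # arbitrary seed so the run is reproducible

final_tails = []
for x in range(100):            # repeat the experiment 100 times
    tails = 0
    for throw in range(10):     # throw the coin 10 times
        coin = np.random.randint(0, 2)   # 0 = heads, 1 = tails (upper bound is exclusive)
        if coin == 1:
            tails += 1
    final_tails.append(tails)

print(final_tails)
```

Each entry of final_tails is a count between 0 and 10, so a histogram of the list shows how the number of tails is distributed.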

== & is

== is for value equality. Use it when you would like to know if two objects have the same value.
is is for reference equality. Use it when you would like to know if two references refer to the same object.

In general, when you are comparing something to a simple type, you are usually checking for value equality, so you should use ==. For example, a check on x is probably meant to test whether x has a value equal to 2 (==), not whether x is literally referring to the same object as 2.
>>> a = 500
>>> b = 500
>>> a == b
True
>>> a is b
False

>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
>>> b == a
True
>>> b = a[:]
>>> b is a
False
>>> b == a
True
https://stackoverflow.com/questions/132988/is-there-a-difference-between-and-is-in-python
