A Powerful Python Library for Data Analysis
By Badri Prudhvi
Contents:
Introduction
Key Features
Core Operations
Sample Dataset Analysis in Pandas
Introduction
Why learn pandas?
If you like Python, pandas makes it a better tool for data work.
It is a smoother path than raw NumPy.
It makes data analysis very easy.
Why do you need pandas?
When working with tabular or structured data (like an R data frame, SQL table, Excel spreadsheet, ...):
Import data
Clean up messy data
Explore data, gain insight into data
Process and prepare your data for analysis
Analyze your data (together with statsmodels, ...)
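The import, clean, and explore steps listed above can be sketched roughly as follows. The CSV content and the column names (city, temp_c, humidity) are invented for illustration, and filling gaps with the mean or median is only one possible cleaning choice, not a recommendation from the slides.

```python
import io
import pandas as pd

# Hypothetical CSV content, invented for illustration only.
raw = io.StringIO(
    "city,temp_c,humidity\n"
    "Pune,31.5,40\n"
    "Delhi,,55\n"
    "Chennai,34.0,\n"
    "Pune,31.5,40\n"
)

# Import data
df = pd.read_csv(raw)

# Clean up messy data: drop exact duplicate rows, then fill missing numbers
clean = df.drop_duplicates().copy()
clean["temp_c"] = clean["temp_c"].fillna(clean["temp_c"].mean())
clean["humidity"] = clean["humidity"].fillna(clean["humidity"].median())

# Explore the data
print(clean.head())
print(clean.describe())
```

In practice pd.read_csv would usually be given a file path instead of an in-memory buffer.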
Introduction
http://pandas.pydata.org
A software library written for the Python programming language.
Mainly used for data manipulation and analysis.
Core data structures:
Series
DataFrame
Offers data structures and operations for manipulating:
Numerical tables
Time series
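As a minimal sketch of the two core data structures, the snippet below builds a Series and a DataFrame by hand. The labels and values are made up for illustration.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
s = pd.Series([10, 20, 30], index=["a", "b", "c"], name="score")

# A DataFrame is a two-dimensional table; each column is a Series.
df = pd.DataFrame(
    {"name": ["x", "y", "z"], "score": [10, 20, 30]},
    index=["a", "b", "c"],
)

print(s["b"])             # label-based access on a Series
print(df["score"])        # selecting one column returns a Series
print(type(df["score"]))
```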
Key features
DataFrame object for data manipulation with integrated indexing
Fast, easy and flexible input/output for many different data formats
Merging and joining (concat, join)
Powerful time series manipulation (resampling, timezones, ...)
Easy plotting
Data alignment
Integrated handling of missing data
Reshaping and pivoting of data sets
Label-based slicing, fancy indexing, and subsetting of large data sets
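A few of the features above can be illustrated with a short sketch. All values and names here (a, b, left, right, long_df) are hypothetical, and the snippet covers only alignment, missing data, merging, pivoting, and label-based slicing.

```python
import pandas as pd

# Data alignment: arithmetic matches on index labels, not positions.
a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"])
b = pd.Series([10.0, 20.0], index=["y", "z"])
total = a + b                    # "x" has no partner in b, so it becomes NaN

# Integrated handling of missing data
filled = total.fillna(0.0)

# Merging and joining on a shared key column
left = pd.DataFrame({"key": ["k1", "k2"], "qty": [5, 7]})
right = pd.DataFrame({"key": ["k1", "k2"], "price": [2.5, 4.0]})
merged = pd.merge(left, right, on="key")

# Reshaping and pivoting: long format to wide format
long_df = pd.DataFrame({
    "day": ["mon", "mon", "tue", "tue"],
    "item": ["tea", "coffee", "tea", "coffee"],
    "sold": [3, 4, 5, 6],
})
wide = long_df.pivot(index="day", columns="item", values="sold")

# Label-based slicing: both end labels are included
subset = a.loc["x":"y"]

print(total, filled, merged, wide, subset, sep="\n\n")
```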
Key features (Contd..)
Data structure column insertion and deletion
Grouping: groupby functionality
Hierarchical axis indexing to work with high-dimensional data in a lower-dimensional data structure
Time series functionality: date range generation, frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging
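The sketch below touches column insertion and deletion, grouping, hierarchical indexing, and some of the time series functions. The table contents are invented, and moving window linear regression is left out because this sketch does not assume a particular API for it.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [100, 120, 80, 90],
})

# Column insertion and deletion
df["sales_k"] = df["sales"] / 1000
df = df.drop(columns="sales_k")

# Grouping
per_region = df.groupby("region")["sales"].sum()

# Hierarchical indexing: two key columns become one two-level index
indexed = df.set_index(["region", "year"])
north_2021 = indexed.loc[("north", 2021), "sales"]

# Date range generation
days = pd.date_range("2024-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=days)

two_day_sum = ts.resample("2D").sum()        # frequency conversion
rolling_mean = ts.rolling(window=3).mean()   # moving window statistic
lagged = ts.shift(1)                         # shifting and lagging

print(per_region, north_2021, two_day_sum, rolling_mean, lagged, sep="\n\n")
```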
Core Operations
Create
Select
Insert
Map
Join
Sort
Clean
Bin
View
Update
Filter
Append
Group
Summarize
Conform
Rotate
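Several of the operations named above might look like this in code. The mapping from each slide keyword to a pandas call is an interpretation, not something the slides specify (for example, "Rotate" is read here as transposing), and the small table is invented for illustration.

```python
import pandas as pd

# Create
df = pd.DataFrame({"name": ["a", "b", "c"], "age": [25, 41, 33]})

# Append: pd.concat is used here (DataFrame.append was removed in pandas 2.0)
extra = pd.DataFrame({"name": ["d"], "age": [58]})
df = pd.concat([df, extra], ignore_index=True)

# View
print(df.head())

# Select and Filter
ages = df["age"]
over_30 = df[df["age"] > 30]

# Insert and Update
df["active"] = True
df.loc[df["name"] == "b", "age"] = 42

# Map
df["name_upper"] = df["name"].map(str.upper)

# Bin
df["age_band"] = pd.cut(
    df["age"], bins=[0, 30, 50, 100], labels=["young", "mid", "senior"]
)

# Sort
by_age = df.sort_values("age", ascending=False)

# Group and Summarize
summary = df.groupby("age_band", observed=True)["age"].agg(["count", "mean"])

# Rotate (interpreted here as a transpose)
transposed = df.T

print(by_age, summary, transposed, sep="\n\n")
```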
Let’s see some cool stuff!
Analyzing Sample Data using Pandas in Python
Thanks for listening.
Any Questions?