elearning course curriculum

Upload: maheshwar

Post on 03-Jun-2018


  • 8/13/2019 Elearning Course Curriculum


    E-Learning Certificate Programs in Big Data

    Certificate Program in Accelerated Excellence (E-learning Mode)

    Engineering Big Data with R and Hadoop Ecosystem

    Essentials of Applied Predictive Analytics


    INTERNATIONAL SCHOOL OF ENGINEERING http://www.insofe.edu.in

    CSE 7304co Engineering Big Data with R and Hadoop Ecosystem

    Companies collect and store large amounts of data during daily transactions. This data is a combination of structured, semi-structured and unstructured data. The volume of the data being collected daily in many organizations has grown from MB (10^6 bytes) to TB (10^12 bytes) in the past few years and is continuing to grow at an exponential pace. The very large size, lack of structure and the pace at which it is growing characterize the "Big Data" revolution.

    To analyze long-term trends and patterns in the data and provide actionable intelligence to managers, this data needs to be consolidated and processed using specialized techniques; those techniques form the core of this module.

    The use cases for the program are "analyzing a customer in near real-time" as applied in Retail, Banking, Airlines, Telecom or Gaming industries. At the end of the program, the participants will be able to set up a Hadoop cluster and write a Map Reduce program that uses pre-built libraries to solve typical CRM data mining tasks like recommendation engines.

    This course thoroughly trains candidates on the following techniques:

    1. SQL querying (with a focus on statistical analysis)

    2. Hadoop and Map Reduce methods of programming

    3. Designing columnar databases

    From a tools perspective, this course introduces you to Hadoop. You will learn one of the most powerful tool combinations for Big Data, viz., "R and Hadoop".

    In addition, all the essential content required to build powerful Big Data processing applications

    and to acquire respected industry certifications like Cloudera's Apache Hadoop Developer

    certification will be covered in the course. The emphasis is not on abstract theory or on mindless

    coding. The emphasis is, instead, placed on learning concepts and real-world programming

    techniques.

    Schedule

    A 40-hour (20 sessions), 7-week program. Each session lasts 2 hours, and we meet every alternate day (3 sessions/week).


    Session # | Lecture Session | Lab Session (15-30 min) | Shakeup Quiz (5-7 min)
    ----------|-----------------|-------------------------|-----------------------
    1 | Introduction to Big Data & Applications | Live demo of an Internet-based big data application (10-15 min) | -
    2 | The Hadoop Eco-system | Different Hadoop installations (20 min) | Yes
    3 | Parallel architectures and concurrent algorithms | Linux shell, Java basics demo (5+20 min) | Yes
    4 | Distributed File Systems, GFS & HDFS | - | Yes
    5 | HDFS (continued), CDH4 HDFS | Using HDFS from shell & from programs, HDFS Configuration & Log files (30 min) | Yes
    6 | Map Reduce | MR configuration and log files (15 min) | Yes
    7 | Map Reduce (continued) | Word Count with MR (20 min) | Yes
    8 | Map Reduce (continued), YARN, Hadoop Streaming | Hadoop streaming (in some language popular with this batch); CDH4 features demo? (20 + 5-10 min) | Yes
    9 | Sqoop, Hive | Sqoop, Hive demo (5+20 min) | Yes
    10 | R-Hadoop | Demonstration of Word Count in R-Hadoop, contrast with MR version (30 min) | Yes
    11 | NoSQL databases including HBase | More examples on Hive and R-Hadoop. Small demo of H-Base (20-25 min) | Yes
    12 | PIG, Oozie | PIG, Oozie demo (20+5 min) | Yes
    13 | Machine Learning on Hadoop - Mahout | Demonstrate Mahout. Run on movie reco data (30 min). | Yes
    14 | Text Search Application on Hadoop | MR Demo of Text index building. Assign Text Search homeworks (Homeworks can be done in any one of R-Hadoop / PIG / Hive / Java MR / Hadoop Streaming, as per individual preference) (25+15 min) | Yes
    15 | Other ecosystem components | - | Yes
    16 | Text Classification, text clustering | Mahout for text classification. Text search student submissions discussion (15+20 min). | Yes
    17 | Graph processing & Applications including SSSP | MR demo of SSSP on a non-trivial graph (20 min). Assign graph processing homework. | Yes
    18 | PageRank, BSP, Hama | PageRank demo on MR and Hama (10+10 min). | Yes
    19 | Pregel, Giraph, Social Network Mining | Graph homework student submissions discussion (20 min) | Yes
    20 | Certification & Wrap up | Interaction session with certified professionals (20 min) | Yes
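
    The Word Count lab in the sessions above is the usual first Map Reduce exercise. As a rough illustration only, the sketch below imitates the map, shuffle and reduce phases in a single Python process. It does not use Hadoop or any Hadoop API, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical names chosen for this example, not course material.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Mapper: emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group emitted values by key (a framework does this between map and reduce)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in order over an in-memory collection of lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Assumed sample input, not taken from the course.
    print(word_count(["big data with hadoop", "big data with r"]))
```

    On a real cluster the mapper and reducer would run on different machines and the framework would handle the grouping step; here all three steps are plain function calls so the data flow is easy to follow.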


    CSE 7301co Essentials of Applied Predictive Analytics

    If you believe that an ability to analyze, forecast and predict using data will help you grow well in

    your current job, then this 40-hour instructor-led online course is the easiest way to achieve that.

    Professionals from a diverse set of verticals and horizontals like Marketing, HR, Engineering, Banking, Pharmaceutical, Healthcare, Retail, Telecom, Manufacturing, Data Warehousing, etc. are

    finding that decisions cannot be taken intuitively anymore. Data is becoming the biggest source

    of knowledge, differentiation and progress. This course teaches robust and systematic methods

    that enable gaining insights from data just as a specialist does. At the end of the program, the

    participants are able to answer business questions such as who is likely to buy a new product

    amongst the existing customers, which customers are most likely to default on a loan or an

    insurance payment and of a given set of transactions, which are most likely to be fraudulent.

    This course thoroughly trains candidates on the following techniques: Pre-processing Techniques:

    Graphical Visualization, Handling Missing Values, Data Standardization; Predictive Models:

    Decision Trees, Linear Regression, Logistic Regression; Model Selection Techniques: Concepts of

    Overfitting, Bias and Variance; Cross Validation; Error metrics like Precision, Accuracy and Recall;

    Introduction to solving analytics problems using R.
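
    To make the error metrics named above concrete, here is a minimal sketch that computes Accuracy, Precision and Recall from binary labels. It assumes labels are coded as 1 (positive) and 0 (negative); the function names and the sample data are hypothetical and are not taken from the course, which works in R.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(actual: Sequence[int], predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count true positives, false positives, true negatives and false negatives."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1
        elif p == 1 and a == 0:
            fp += 1
        elif p == 0 and a == 0:
            tn += 1
        else:  # predicted 0 but actually 1 (labels are assumed to be 0 or 1)
            fn += 1
    return tp, fp, tn, fn


def error_metrics(actual: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision and recall; a ratio with a zero denominator is reported as 0.0."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Assumed toy data: 1 = customer defaulted, 0 = customer did not default.
    actual = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 0, 1, 0, 1, 0, 0, 1]
    print(error_metrics(actual, predicted))
```

    Precision answers "of the cases flagged as positive, how many really were", while Recall answers "of the real positives, how many were flagged", which is why the two can diverge on imbalanced problems such as fraud detection.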

    Schedule:

    A 40-hour, 8-week program.

    Each session lasts 2 hours.

    Day 1: Introduction to Big Data; Course Motivation; Logistics; Analysis through Data Visualization

    Day 2: Understanding the business case and defining a solution framework

    Day 3: An introduction to R programming language and environment

    Day 4: Techniques of Pre-processing data (Binning, Normalizing, Filling missing values, removing noise)

    Day 5: Data Pre-processing (continued)

    Day 6: Traps and Errors: Confusion matrix; Analyzing False Positives and False Negatives from a problem perspective; Different error measures used in Forecasting

    Day 7: Model Selection: K-fold validation

    Day 8: Introduction to Decision Trees and their structure

    Day 9: Construction of Decision Trees through simplified examples; Choosing the best attribute at each non-leaf node; Entropy; Information Gain


    Day 10: Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with numerical variables; other measures of randomness

    Day 11: Inductive learning from a 500-ft view; Issues in inductive learning like the curse of dimensionality; Overfitting; Bias-Variance tradeoff

    Day 12: Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as rules

    Day 13: A mathematical model for association analysis

    Day 14: Large itemsets and Association Rules; Apriori: constructing large itemsets that meet minimum support (minsup) by iteration

    Day 15: Interestingness of discovered association rules; Application examples; Association analysis vs. Classification

    Day 16: Using Association Rules to compare stores; Dissociation Rules; Sequential Analysis Using Association Rules

    Day 17: Data visualization and Story-telling: Anatomy of a graph

    Day 18: Animated graphs, BI dashboards and the latest trends in data visualization

    Days 19 and 20: An end-to-end case study in R involving understanding the data, filling the missing values, applying and assessing models and reporting the results.
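
    The decision tree sessions in the schedule above rest on Entropy and Information Gain for choosing the best attribute at each non-leaf node. The sketch below shows one common way to compute both for categorical data, using base-2 logarithms. The function names and the toy data are assumptions made for illustration and do not come from the course, which works in R.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a collection of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(attribute_values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus the weighted entropy after splitting on the attribute."""
    total = len(labels)
    if total == 0:
        return 0.0
    partitions: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(attribute_values, labels):
        partitions.setdefault(value, []).append(label)
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


if __name__ == "__main__":
    # Assumed toy data: did the customer buy the product?
    bought = ["yes", "yes", "no", "no"]
    segment = ["x", "x", "y", "y"]   # separates the classes perfectly
    region = ["x", "y", "x", "y"]    # tells us nothing about the class
    print(information_gain(segment, bought))
    print(information_gain(region, bought))
```

    A tree builder would evaluate every candidate attribute this way and split on the one with the highest gain; Gain Ratio, also listed in the schedule, additionally divides by the entropy of the split itself to avoid favouring attributes with many distinct values.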


    Dr. SREERAMA K MURTHY

    Co-founder and CEO, Teqnium Consultancy Services

    PhD in Data Mining, Johns Hopkins University

    Classes Taught

    Engineering Big Data with R and Hadoop Ecosystem

    Brief Profile

    Ph.D. - Johns Hopkins University

    M.Tech. - IIT, Chennai (Madras)

    B.E. - NIT, Allahabad

    17 years of work experience after Ph.D. (USA: 5 years, India: 11 years)

    21 US Patent applications (8 issued), 2 Indian patent applications

    Many invention disclosures, numerous journal and conference papers.

    Designed, managed, built and deployed large software systems.

    Technocrat, combining love for technology with entrepreneurship and business management.

    Helped conceptualize business plans of three ventures. Obtained millions of dollars in funding.

    Chairman & CEO - Teqnium Consultancy Services

    Director, Technology - Globarena ITeknowledge Pvt Ltd

    Managing Director - Globarena Web Technologies

    Senior Manager and Head, E-Commerce Research group - IBM India Research Lab

    Researcher - Siemens Corporate Research

    Areas of Expertise: Technology Enabled Education and Training, e-Skilling, Outsourced R&D, Data Mining, Digital Security, Healthcare Informatics

    Specialties: Education Strategy, Role of Technology in Skills Development, Instructional Design, Research, Intellectual Property, Novel Product Design

    Mentors' Profiles


    Dr. DAKSHINAMURTHY V KOLLURU

    President, International School of Engineering

    PhD in Materials Science and Engineering, CMU

    Classes Taught

    Essentials of Applied Predictive Analytics

    Brief Profile

    Ph.D. - Carnegie Mellon University (CMU)

    M.S. - Carnegie Mellon University (CMU)

    B.E. - NIT, Tiruchirapalli

    15 years of work experience after Ph.D. in diverse organizations ranging from Defense Research to Web startup and mid-size IT services companies.

    President - International School of Engineering

    Chief Research Officer - Prithvi Information Solutions Ltd., Hyderabad

    Founder and Managing Director - Axaya Cybertech Pvt Ltd

    Co-founder and Managing Director - Globarena ITeknowledge Pvt. Ltd

    Scientist - Defence Metallurgical Research Laboratory, Hyderabad

    Over the past few years, Dr. Murthy has been actively teaching Data Analytics to working professionals with a wide range of experience and from diverse industries. He has also been consulting on Data Science projects with companies ranging from the Fortune 25 to IT services firms and startups. During his years of experience as a scientist and entrepreneur, Dr. Murthy has applied his strengths in logical thinking, math and science to solving industrial and societal problems, designing solutions from fundamentals, identifying, training and motivating high-quality individuals, and articulating the findings in a lucid manner to all the stakeholders.

    He built the Business Analytics and Optimization division of a mid-tier IT services company from

    scratch and filed for 5 patents in Retail and Telecom Analytics, during which time he also acquired

    Fortune 500 clients and turned the division into a profitable delivery center.


    Fee Structure

    Program Fee for Each Individual Module:

    For International Students: $9 for Application Fees and $640 for Program Fees

    For Indian Students: Rs. 500 for Application Fees and Rs. 35,000 for Program Fees

    Program Fee for Two Modules:

    For International Students: $9 for Application Fees and $960 for Program Fees

    For Indian Students: Rs. 500 for Application Fees and Rs. 54,000 for Program Fees

    For more details, please visit: http://insofe.edu.in/init/default/elearning_engineering_big_data

    For any queries, contact +91 9502334561 or email us at [email protected]


    International School of Engineering

    Address: 1st Floor, Plot No 63/A, Road No 13, Film Nagar, Jubilee Hills, Hyderabad 500033

    Contact Number: +91 9618 483 483

    Website: www.insofe.edu.in

    Facebook: www.facebook.com/insofe

    LinkedIn: http://goo.gl/VzC9s

    Twitter: @INSOFEedu

    Slideshare: http://www.slideshare.net/INSOFE
