[cs 402] assignment 1 v00

Upload: taaloos

Post on 14-Apr-2018


  • 7/30/2019 [CS 402] Assignment 1 v00

    1/2

    BSCS Honors Program CS-402 GIFT University Gujranwala

    Page 1 of 2

    Course: Data Mining 14-November-2012 (Fall 2012)

    Resource Person: Nadeem Qaisar Mehmood

    ASSIGNMENT 1 (Data)

    Total Points: 30

    Submission Due: Monday 19th November, 2012

    Instructions: Please Read Carefully!

    This is a group assignment. A group may have at most 3 members.

    Each group member must pass the viva for this assignment to get marks. The viva will be conducted after the submission of this assignment.

    Try to write in your own words. You can take help from each other and can use the reference books to solve the assignment; however, the marking shall be done after taking the viva of the assignment. You will be advised through email to come for the viva later on.

    You are expected to submit this assignment as a single .zip file containing all the source files of your implementation. This zip file must be named as:

    CS402-AS01-(ROLLNUMBER1)(ROLLNUMBER2).zip and nothing else!

    The assignment is to be submitted electronically via email at [email protected] by Monday 19th November, 2012.

    a. The subject of the email should be: CS402-AS01-(ROLLNUMBER1) (ROLLNUMBER2) and nothing else!

    b. Attach the zip file to the email.

    c. Keep the body of the email empty.

    d. Send a copy of your email to the other members of your group.

    There will be a 30% penalty for late submissions.

    No assignment will be accepted after Monday 20th November, 2012, 16:00 hrs.

    NOTE 01: Any illegal alteration of the data set, or compiling a bad report, will result in a straightforward zero in the assignment marks.

    NOTE 02: You must pass the subsequent viva of this assignment to actually receive any marks for this assignment.


    Data Preparation

    Introduction about the data:

    The data consist of evaluations of teaching performance over three regular semesters and two

    summer semesters of teaching assistant (TA) assignments.

    You have to explore a data set provided to you with this assignment and apply the following

    approaches:

    Data Selection

    Preprocessing

    o Remove noise, outliers, missing values

    o Select features, reduce dimensions

    Also try to use the following summary statistics to describe the data:

    o Mean, median, standard deviation, measures of mean and variation, standard distribution curve, correlation, proximity measurement, frequency curve, percentile

    Also discuss how you can use this data for data mining, and for what kinds of problems you can use it to perform modeling.
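    The preprocessing step above (removing missing values and outliers) can be sketched as follows. This is only a minimal illustration under stated assumptions: the numbers are hypothetical toy values and not the assignment data, the function names are invented for this sketch, and the 1.5 x IQR fence is one common outlier rule rather than a method prescribed by the assignment. Feature selection and dimension reduction are not covered here.

```python
import statistics

# Hypothetical toy column (NOT the assignment data): class sizes with one
# missing entry (None) and one implausibly large value.
class_sizes = [19, 17, 49, 33, 55, 20, None, 19, 31, 420, 25, 37]


def drop_missing(values):
    """Return only the entries that are not None."""
    return [v for v in values if v is not None]


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that lie inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


present = drop_missing(class_sizes)   # missing values removed
cleaned = remove_outliers(present)    # outliers removed as well
print(len(class_sizes), len(present), len(cleaned))
```

    Whatever rule is used, the report should state which records were dropped and why, since altering the data set without justification is penalized.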

    Some Information about the Data

    Make sure that you understand the data and that the report reflects correct and concise information about it. The report must be a 3-to-4 page MS Word document that describes the characteristics of the data.

    Tools: You are allowed to use any statistical tool. However, since the assignment is meant to revise the basic concepts that lead to data exploration and information discovery, you are encouraged to use Excel for such exploration.
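    For groups that prefer a scripting tool over Excel, several of the requested summary statistics can be computed with Python's standard library alone. The sketch below is illustrative only: both columns are hypothetical toy values and not the assignment data, the helper names are invented here, and Euclidean distance is just one possible proximity measure.

```python
import math
import statistics
from collections import Counter

# Hypothetical toy columns (NOT the assignment data).
class_size = [19, 17, 49, 33, 55, 20, 19, 31, 25, 37]
score = [3, 3, 2, 2, 1, 3, 2, 2, 3, 1]  # made-up rating from 1 to 3


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def euclidean(a, b):
    """Euclidean distance between two records (a simple proximity measure)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


summary = {
    "mean": statistics.mean(class_size),
    "median": statistics.median(class_size),
    "stdev": statistics.stdev(class_size),  # sample standard deviation
    # Last of the nine decile cut points, i.e. the 90th percentile.
    "p90": statistics.quantiles(class_size, n=10, method="inclusive")[-1],
    "corr": pearson(class_size, score),
}
freq = Counter(score)  # frequency of each rating, the basis of a frequency curve
dist = euclidean((class_size[0], score[0]), (class_size[1], score[1]))

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
print("frequencies:", dict(freq))
print("distance between first two records:", dist)
```

    The same quantities are available in Excel through built-in functions such as AVERAGE, MEDIAN, STDEV, PERCENTILE and CORREL, so either route is suitable for the report.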

    END OF ASSIGNMENT