Learning Excel for Data Analysis Sessions 5 and 6 Dr. Chaitali Basu Mukherji

Post on 31-Mar-2015


TRANSCRIPT

Page 1: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Learning Excel for Data Analysis

Sessions 5 and 6

Dr. Chaitali Basu Mukherji

Page 2: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Data Analysis

Data analysis in Excel is performed in multiple ways using the following sections of the Data tab:
• Get Data – to connect to an external data set
• Sort and Filter
• Data Tools – Data Validation, Duplicate Removal, Consolidation, Data Tables and What-If Analysis
• Outline – Group and Ungroup, Subtotals
• Analysis – Data Analysis, Solver

Page 3: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Group and Ungroup
• Group and Ungroup are in the Outline group of the Data tab
• Group allows you to collapse a group of rows or columns
• Ungroup reverts the action
• For both functions, an outline with a + or – sign will appear

Page 4: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Subtotals

Subtotals are used in a sorted list:
• Sort the list on the field for which you want subtotals inserted
• Click the Subtotal button in the Outline group on the Data tab
• The Subtotal dialog box appears, where you specify the options for the subtotals
• Select the field for which the subtotals are to be calculated in the At Each Change In drop-down list
• Specify the type of totals you want to insert in the Use Function drop-down list
• Select the check boxes for the field(s) you want to total in the Add Subtotal To list box
• Click OK

Excel adds the subtotals to the worksheet.

• When you use the Subtotals command, Excel outlines the data at the same time that it adds the subtotal rows (for example, departmental salary totals) and the grand total. This means that you can collapse the data list down to just its subtotal rows, or even just the grand total row, by collapsing the outline down to the second or first level.
• In a large list, you may want to insert a page break every time the data changes in the field on which the list is being subtotaled. To do this, select the Page Break between Groups check box in the Subtotal dialog box before you click OK to subtotal the list.
• Excel does not allow you to subtotal a list formatted as a table. You must first convert your table into a normal range of cells. Click a cell in the table and then click the Table Tools Design tab. Click the Convert to Range button in the Tools group, and then click Yes. Excel removes the filter buttons from the columns at the top of the list while still retaining the original table formatting.
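The Subtotal steps above can be sketched outside Excel to show why the list must be sorted first. This is a minimal Python illustration, not Excel's implementation; the data list, the field positions and the `subtotal` helper are all hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical data list: (department, employee, salary)
rows = [
    ("Sales", "Asha", 50000),
    ("Admin", "Ravi", 30000),
    ("Sales", "Meera", 45000),
    ("Admin", "John", 32000),
]


def subtotal(rows, key_index, value_index):
    """Return a list of (key, subtotal) pairs and the grand total."""
    # Sort on the "At Each Change In" field first: groupby only merges
    # adjacent rows that share a key, just as the Subtotal command
    # inserts a total at each change in the sorted field.
    ordered = sorted(rows, key=itemgetter(key_index))
    subtotals = [
        (key, sum(row[value_index] for row in group))
        for key, group in groupby(ordered, key=itemgetter(key_index))
    ]
    grand_total = sum(total for _, total in subtotals)
    return subtotals, grand_total


subtotals, grand_total = subtotal(rows, key_index=0, value_index=2)
for department, total in subtotals:
    print(f"{department} Total: {total}")
print(f"Grand Total: {grand_total}")
```

Without the sort, the two Sales rows would not be adjacent and each would start its own group, which is the same effect an unsorted list has in the Subtotal dialog.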

Page 5: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Solver

Solver can handle problems that involve many variable cells. It helps find combinations of variables that maximize or minimize a target cell, subject to multiple constraints (conditions that must be met for the solution to be valid).

Problem Statement: You are planning an advertising campaign for a new product with a total print advertising budget of Rs 12,000,000 and want to expose your ads at least 800 million times to potential readers through six publications. Your job is to reach the readership target at the lowest possible cost with the following additional constraints:
• At least six advertisements should run in each publication
• No more than a third of the advertising spend may go to any one publication
• The total cost of placing ads in Pub3 and Pub4 must not exceed Rs 7,500,000
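Written as an optimisation model, the problem statement might look like the sketch below. The symbols are assumptions introduced here, since the slide's worksheet data is not in the transcript: x_i is the number of ads placed in publication i, c_i the cost per ad in Rs, and r_i the readers reached per ad. The "one third" rule is read as a third of the total amount spent; if it is meant as a third of the budget, the right-hand side of that constraint becomes 4,000,000 instead.

```latex
\begin{aligned}
\min_{x_1,\dots,x_6} \quad & \sum_{i=1}^{6} c_i x_i
  && \text{(total cost, the target cell)} \\
\text{subject to} \quad
  & \sum_{i=1}^{6} c_i x_i \le 12{,}000{,}000
  && \text{(budget)} \\
  & \sum_{i=1}^{6} r_i x_i \ge 800{,}000{,}000
  && \text{(readership target)} \\
  & x_i \ge 6, \quad i = 1,\dots,6
  && \text{(at least six ads per publication)} \\
  & c_i x_i \le \tfrac{1}{3} \sum_{j=1}^{6} c_j x_j, \quad i = 1,\dots,6
  && \text{(no more than a third on one publication)} \\
  & c_3 x_3 + c_4 x_4 \le 7{,}500{,}000
  && \text{(Pub3 and Pub4 combined)}
\end{aligned}
```

In Solver terms, the x_i are the variable cells, the first line is the objective cell set to Min, and each remaining line is one entry in the constraints list. Since ads are placed in whole numbers, an integer constraint on the x_i would probably also be wanted.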

Page 6: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Let's Solve the Problem using Solver

Page 7: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Solver using iteration

Let us solve a set of nonlinear equations using Solver:

F(x,y) = x^2 + y + 3 = 0
G(x,y) = 2*x^2 + y^3 + 5 = 0

Solver starts from an initial estimate and refines it over up to 100 iterations to arrive at a close result.
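The iterative idea can be sketched in Python. This is not Solver's algorithm: it is a plain gradient descent with step halving, which minimises F^2 + G^2 from a starting guess, and the function names and starting point are chosen here for illustration. One caveat worth knowing: this pair of equations has no exact real solution (F = 0 forces y ≤ -3, while eliminating x^2 gives y^3 - 2y - 1 = 0, whose real roots are all greater than -3), so the best any iterative method can do is a close result with a small but non-zero residual.

```python
def f(x, y):
    return x**2 + y + 3


def g(x, y):
    return 2 * x**2 + y**3 + 5


def objective(x, y):
    """Sum of squared residuals; zero only where F = G = 0."""
    return f(x, y) ** 2 + g(x, y) ** 2


def gradient(x, y):
    """Partial derivatives of the objective with respect to x and y."""
    fv, gv = f(x, y), g(x, y)
    d_dx = 2 * fv * (2 * x) + 2 * gv * (4 * x)
    d_dy = 2 * fv + 2 * gv * (3 * y**2)
    return d_dx, d_dy


def minimise(x, y, iterations=100):
    """Gradient descent; each step is halved until the objective drops."""
    for _ in range(iterations):
        gx, gy = gradient(x, y)
        current = objective(x, y)
        step = 1.0
        while step > 1e-12:
            nx, ny = x - step * gx, y - step * gy
            if objective(nx, ny) < current:
                x, y = nx, ny
                break
            step /= 2
    return x, y


# Hypothetical starting guess, like the initial values typed into
# the variable cells before running Solver.
x_best, y_best = minimise(1.0, 1.0)
print(x_best, y_best, objective(x_best, y_best))
```

Because a step is accepted only when it lowers the objective, the residual can never get worse from one iteration to the next, which mirrors how the iteration limit trades accuracy for time.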

Page 8: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Analysis ToolPak

To develop complex statistical or engineering analyses, you can save steps and time by using the Analysis ToolPak.

Provide data and parameters for each analysis, and the tool uses appropriate statistical or engineering macro functions to calculate and display the results in an output table.

Some tools generate charts along with output tables.

Page 9: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Anova Analysis

The Anova analysis tools provide different types of variance analysis: Single Factor, Two-Factor with Replication and Two-Factor without Replication.

The tool to use depends on the number of factors and the number of samples that you have from the populations that you want to test.

Single Factor Anova: This tool performs a simple analysis of variance on data for two or more samples. It tests the hypothesis that each sample is drawn from the same underlying probability distribution.

If there are only two samples, the TTEST function (T.TEST in Excel 2010 and later) can be used. With more than two samples, the Single Factor Anova tool has to be used.

Two-Factor with Replication: This analysis is useful when data is classified along two different dimensions. Example: we measure the height of plants that are given 3 different brands of fertilizer and kept at 2 different temperatures. For each of the six possible {fertilizer, temperature} pairs, we have the same number of observations of plant height, and more than one per pair.

Two-Factor without Replication: This analysis is useful when data is classified along two different dimensions but there is only a single observation for each pair.
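To make the Single Factor case concrete, the F statistic that the tool reports can be computed by hand. The sketch below uses the standard one-way ANOVA formulas, not the ToolPak's own code; the sample values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return the F statistic for a single-factor analysis of variance."""
    k = len(groups)                              # number of samples
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total
    means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, means)
        for value in group
    )

    ms_between = ss_between / (k - 1)            # df = k - 1
    ms_within = ss_within / (n_total - k)        # df = N - k
    return ms_between / ms_within


# Hypothetical measurements for three samples.
samples = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_statistic = one_way_anova_f(samples)
print(f_statistic)  # 3.0 for this data
```

A large F means the sample means differ by more than the spread within the samples would explain. The ToolPak output also reports the p-value and critical F, which need the F distribution and are left out here.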

http://www.statsoft.com/textbook/anova-anova/?button=1

Page 10: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

What is a Pivot Table?

A Pivot Table is used to produce meaningful information from a table of data. For example, from a table of data that has names, addresses, ages, occupations, phone numbers and PIN codes, with a Pivot Table we can easily and quickly find out:

• How many salesmen work in each region?
• What is the net car sale of each region?
• How do sales compare across products?
• What is the total sales for the company?
• How many customers do we sell to in each region?

Let's see how

Page 11: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Advantages of Pivot Tables

• Pivot Tables can generate and extract meaningful information from a large table of information within a matter of minutes

• They use a lot less of your PC's memory than obtaining the same results with Excel's built-in functions

• They provide new views of the data simply by dragging and dropping fields (pivoting)

• Information is updated each time we open the workbook or click Refresh

Page 12: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Example for Pivot
• Step 1: Select the data range from which to make the pivot table
• Step 2: Go to the Insert tab and click on the pivot table icon to select the Pivot Table option
• Step 3: Excel displays a pivot table wizard where you specify the pivot table target location
• Step 4: Make your first pivot report by dragging and dropping fields in the pivot table grid area or by using the “Pivot table panel”. The pivot report is divided into header and body sections. You can drag and drop the fields you want in each area. The body itself contains three parts: Rows, Columns and Cells. You can use any fields in these areas too.
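What the pivot report computes once fields are dropped into Rows, Columns and Cells can be sketched in a few lines of Python. The records, field positions and helper names below are hypothetical and only illustrate the grouping and summing; Excel's Pivot Table does much more.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, salesperson, amount)
records = [
    ("North", "Car", "Amit", 120),
    ("North", "Truck", "Amit", 80),
    ("South", "Car", "Bina", 150),
    ("South", "Car", "Carl", 50),
    ("North", "Car", "Dev", 30),
]


def pivot_sum(records, row_index, column_index, value_index):
    """Sum the value field for each (row field, column field) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for record in records:
        table[record[row_index]][record[column_index]] += record[value_index]
    return {row: dict(columns) for row, columns in table.items()}


def drill_down(records, region, product):
    """Return the source rows behind one summary cell."""
    return [r for r in records if r[0] == region and r[1] == product]


# Rows = region, Columns = product, Cells = sum of amount.
sales = pivot_sum(records, row_index=0, column_index=1, value_index=3)
print(sales)
print(drill_down(records, "North", "Car"))
```

Swapping `row_index` and `column_index` is the code equivalent of dragging a field from Rows to Columns, and `drill_down` corresponds to double-clicking a summary value to see the rows behind it.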

Page 13: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Tips on Pivot Tables
• Formatting is easy for pivot tables
• You can easily change the pivot table summary formulas by right-clicking on the pivot table and selecting “Summarize data by”
• You can apply conditional formatting to pivot tables, although you must be careful because pivot tables change in size depending on the data
• If the original data from which a pivot table is constructed changes, right-click on the pivot table and select the “Refresh Data” option
• To drill down on a particular summary value, double-click on it. Excel will create a new sheet with the data corresponding to that pivot report value. (This is extremely useful)

Page 14: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

What are Pivot Charts?
• Charts created on Pivot Tables are called Pivot Charts
• They allow us to create professional interactive charts that are not possible without complex VBA coding
• Before starting, decide how the data is set up and whether a Pivot Chart is also wanted

The basic information needed to use the Pivot Wizard is:
– Where is our data stored, e.g. a range in the same workbook, a database, another workbook
– Which column of data is going into which field, i.e. the optional Page field, Row field and Column field, and the mandatory Data field
– Where do you want to put your Pivot Table, e.g. a new worksheet or an existing one

Making a pivot chart from a pivot table is very simple: just click the pivot chart icon on the toolbar or in the Options ribbon area and follow the wizard

Page 15: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

What is Charting in Excel?
• Charts are used to display series of numeric data in a graphical format to make it easier to understand large quantities of data and the relationship between different series of data
• To create a chart, you start by entering numeric data, which you can then plot by selecting the chart type that you want
• Excel supports many types of charts (such as a column chart or a pie chart) and their subtypes (such as a stacked column chart or a pie in 3-D chart) to help you display data in ways that are meaningful to your audience
• You can create a combination chart by using more than one chart type
• Some chart types (Column or Bar) can be created by arranging data in rows and columns, while others (Pie and Bubble) require a special arrangement of data
• Chart templates can be saved as .crtx files and used like any other template

Page 16: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Elements of a Chart

1. Chart Area
2. Plot Area
3. Data Points
4. The Axes
5. Legends
6. Titles
7. Labels

Page 17: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Excel Chart Types

Excel provides the following types of chart. Typical uses of the different chart types are listed below.
• Column charts are useful for
– showing data changes over a period of time
– illustrating comparisons among items
• Line charts are useful for
– displaying continuous data over time, set against a common scale
– showing trends in data at equal intervals
• Pie charts are useful for
– showing the size of individual items in proportion to the sum of the items
• Bar charts are useful for
– comparisons among individual items

Page 18: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Cont.
• XY (scatter) charts are useful for
– displaying and comparing numeric values in scientific, statistical, and engineering data
• Area charts are useful for
– emphasizing the magnitude of change over time
– drawing attention to the total value across a trend
• Stock charts are useful for
– illustrating the fluctuation of stock prices, or of other data such as daily or annual temperatures
• Surface charts are useful for
– finding optimum combinations between two sets of data, similar to a topographic map
• Doughnut charts are useful for
– showing the relationship of parts to a whole; they can contain more than one data series
• Bubble charts are useful for
– comparing the sizes of parts that make up the data set
• Radar charts
– Radar charts compare the aggregate values of several data series, as opposed to Pie charts, which have only one data series

Page 19: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Creating a Chart

In Microsoft Excel, you can represent numbers in a chart.

On the Insert tab, you can choose from a variety of chart types, including column, line, pie, bar, area, and scatter.

The basic procedure for creating a chart is the same no matter what type of chart you choose. As you change your data, your chart will automatically update.

• You select a chart type by choosing an option from the Insert tab's Chart group.

• After you choose a chart type, such as column, line, or bar, you choose a chart sub-type.

• For example, after you choose Column Chart, you can choose to have your chart represented as a two-dimensional chart, a three-dimensional chart, a cylinder chart, a cone chart, or a pyramid chart.

• There are further sub-types within each of these categories.

• As you roll your mouse pointer over each option, Excel supplies a brief description of each chart sub-type.

Page 20: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Let’s Create a Column Chart

Page 21: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Sub Types of Column Chart

• Clustered column in 3-D – These compare values across categories. It displays 2-D data values using a 3-D perspective. A third value axis (depth axis) is not used.

• Stacked column in 3-D: Stacked column charts show the relationship of individual items to the whole, comparing the contribution of each value to a total across categories.

• 3-D column: 3-D column charts use three axes that you can modify (a horizontal axis, a vertical axis, and a depth axis), and they compare data points along the horizontal and the depth axes.

• Cylinder, cone, and pyramid: Cylinder, cone, and pyramid charts are available in all of the above types, with only the shape being a cylinder, cone or pyramid instead of a rectangle

Page 22: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Applying a Chart Layout
• Context tabs are tabs that only appear when you need them
• Under Chart Tools there are three chart context tabs: Design, Layout, and Format
• The tabs become available when you create a new chart or when you click on a chart
• You can use these tabs to customize your chart
• You can determine what your chart displays by choosing a layout
• The layout you choose determines whether your chart displays a title, where the title displays, whether your chart has a legend, where the legend displays, whether the chart has axis labels and so on
• Excel provides several layouts from which you can choose

Page 23: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Adding Labels

Page 24: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Switching Data

Page 25: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Changing Chart Style

Page 26: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Positioning the Chart

Page 27: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Changing the Chart Type

Page 28: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Communicating through Data
• Communicating through data is most effective if we understand the basic rules
• There are 7 common relationships in quantitative business data
• Typical questions that arise when presenting numbers are:
– Compared to what?
– At what instant?
– In which sequence?
– Relative to what other?
– How much is the deviation?
– What kind of distribution does it follow?
– Is there any special correlation between them?

Page 29: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Time-Series Relationships
• This is the most common relationship in quantitative business data
• When quantitative values are expressed as a series of measures taken across equal intervals of time, this relationship is called a time series
• Studies indicate that approximately 75% of all business graphs display time series
• Time can be divided into intervals of varying duration, including years, quarters, months, weeks, days, and hours

Time series reveal trends and patterns that we must be aware of and understand to make informed decisions

Page 30: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Ranking Relationships
• It is often meaningful in business to see things ranked, such as the performance of sales people or the expenses of departments
• When quantitative values are sequenced by size, from large to small or vice versa, this relationship is called a ranking
• This not only reveals their sequence, but also makes it much easier to compare values by placing those that are most similar near one another

Page 31: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Part-to-Whole Relationships
• It is often useful to see how something is divided into parts, and the percentage relationship of each part to the whole
• When quantitative values are displayed to reveal the portion that each value represents of some whole, this is called a part-to-whole relationship
• Some typical examples are how a market is divided up between competitors, or how expenses are divided between regions, as shown below

Page 32: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Deviation Relationships
• When quantitative values are displayed to feature how one or more sets of values differ from some reference set of values, this is called a deviation relationship
• The most common example in business is one that shows how a set of actuals (such as expenses) deviates from a predefined target (such as a budget)

Page 33: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Distribution Relationships
• When we show how a set of quantitative values is spread across its entire range, this relationship is called a distribution
• We can often learn a great deal by examining the distribution of a set of values, especially the shape of that distribution, which reveals what's typical, whether it is skewed in one direction or the other, and whether there are gaps or concentrations

This shows a distribution of values that is fairly symmetrical, approaching what is called a normal or bell-shaped curve
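The counting behind a frequency distribution can be sketched as follows. The values and the bin width are hypothetical, and the `histogram` helper is written here for illustration; in Excel the Histogram tool of the Analysis ToolPak or the FREQUENCY function does the equivalent job.

```python
def histogram(values, bin_width):
    """Count how many values fall into each bin of the given width.

    Keys are the lower edge of each bin; a value v belongs to the bin
    [lower, lower + bin_width).
    """
    counts = {}
    for value in values:
        lower = (value // bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical measurements.
values = [12, 15, 21, 22, 25, 27, 31, 33, 34, 38, 41, 45, 52]
bins = histogram(values, bin_width=10)

# A text version of the chart: one '#' per observation in the bin.
for lower, count in bins.items():
    print(f"{lower:>3}-{lower + 9:<3} {'#' * count}")
```

Reading the bars from top to bottom gives the shape of the distribution: here the counts rise to the middle bins and fall away on either side, roughly the symmetrical pattern the slide describes.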

Page 34: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Correlation Relationships
• When pairs of quantitative values, each measuring something different about an entity (for example a person, department, or product), are displayed to reveal whether there is a significant relationship between them (for instance, as one goes up the other goes up as well, or as one goes up the other goes down), this is called a correlation
• Understanding correlations between quantitative variables can help us predict, take advantage of, or avoid particular behaviors

The slide's chart plots employees' heights in inches (y axis) against their salaries in dollars (x axis)
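As a numeric companion to the scatter plot described above, the strength of a linear relationship is usually summarised by the Pearson correlation coefficient, which Excel exposes as the CORREL function. The sketch below computes it from first principles; the height and salary figures are invented for illustration and are not the slide's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical employee data.
salaries = [30000, 35000, 42000, 50000, 61000]   # x axis, dollars
heights = [64, 66, 65, 69, 70]                   # y axis, inches

r = pearson_r(salaries, heights)
print(round(r, 3))
```

A value near +1 means the two measures rise together, a value near -1 means one falls as the other rises, and a value near 0 means no linear relationship. A high value does not by itself show that one variable causes the other.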

Page 35: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Nominal Comparison Relationships
• A chart shows a nominal comparison relationship when there is no particular relationship between the values
• Four geographical regions do not relate to each other in any particular order
• It does provide a means to compare the regional values, but nothing more
• It is always useful, whenever you prepare a graph that displays nothing but a nominal comparison, to ask yourself if another relationship could be featured that would make the graph more meaningful
• In this case, simply arranging the regions in order of their quantitative values could produce a ranking relationship
• Often discrete items in a categorical variable, like these geographical regions, need to be arranged in a particular order because people expect to see them arranged in that way

Page 36: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Tips for Selecting the Right Chart Type

What to represent        Chart type to use
Nominal Comparison       Bar, Point
Time Series              Line, Bar, Point
Ranking                  Bar, Point
Part to Whole            Pie, Stacked Bar, Bubble
Deviation                Line, Point
Frequency Distribution   Histogram, Frequency Plot
Correlation              Scatter Plot with trend line

Page 37: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Best Practices of Charting

• Determine your message and identify your data
• Format graphs to focus on the message, removing unnecessary distractions
• Check whether a table, a graph, or both are needed to communicate your message most effectively
• Determine the best place in the chart area to display each variable
• Take special care over legend placement

Page 38: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Tips for Enhancing Chart Performance
• Use tables to hold the data
• Use named ranges and named formulas
• Use Pivot Tables
• Sort your data
• Use Manual Calculation Mode
• Use non-volatile formulas
– Volatile formulas are recalculated whenever there is a change in the workbook. Examples of volatile formulas are RAND, NOW, TODAY, OFFSET
• Keep formulas in a separate sheet
• Write better formulas

Page 39: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Tips for a Good Report

• Restrict The Work Area to relevant Columns and Rows only

• Lock Formula Cells And Protect The Worksheet
• Freeze Panes So That Your Boss Knows What She Is Reading
• Hide Unnecessary / Calculation Sheets
• Hide Rows / Columns Not Used In The Report
• Include Cell Comments / Help Messages
• Use Consistent Colors And Schemes
• Name And Color Worksheet Tabs Appropriately
• Before Closing The Workbook, Select Cell A1 On The Correct Sheet

Page 40: Learning Excel for Data Analysis Dr. Chaitali Basu Mukherji

Thank You