UTS CRICOS PROVIDER CODE 00099F UTS Library Easy Data Analytics in Excel For Mac


UTS Library

Easy Data Analytics in Excel

For Mac


Table of Contents

Finding Data

Putting Data into Excel

Making a ‘working’ data sheet

Cleaning the data

Making a table

Creating a graph

Filtering a Chart

Part 2 - Top oil consuming countries

Extracting data from a pdf

Working with the Oil Consumption Data from BP

Using text filters

Making a Line Chart

Creating an average

Charting average oil consumption

Standardizing units of measurement with a formula

Introduction – Researching a Commodity (Oil)

Say we are researching oil and we want to know who is producing the oil and who is consuming it. Maybe we would like to know which countries are the top producers and which are the top consumers.

Part 1 – Top Oil Producing Countries

What is Excel?

Excel is a spreadsheet program that allows you to organise and manipulate data. It includes calculation, graphing and charting tools. An Excel file is called a workbook, and within each workbook you can have one or more sheets. Sheets are like tabs that let you manage different tasks within the one overall project. As with any document, make sure you regularly save your workbook somewhere you can find it on your computer.

Finding Data

The website list provided on UTS Online contains a range of sources for you to consult. For oil, websites like the International Energy Agency would probably be able to answer this question, but you can search on Google too. The image shows the results for the search ‘top oil producing countries’.


For our example in this workbook we’ve chosen to use the crude oil production data from the CIA World Factbook. The World Factbook is a solid source, as it’s produced by the US Government. The information it contains is fairly current too.

Putting Data into Excel

There is a download link above the oil production data, but in this case it creates a text file that is not very useful. So instead of using that, we’ll copy and paste from the webpage itself. The first thing we need to do is highlight all the text in the table (including the headings).


Open Excel, and save the workbook to your desktop. At the bottom of the workbook, double click on the name of the sheet and rename it to ‘Oil production raw’.

Click into the cell at the top left of the sheet (A1) and then paste your data.

After you copy and paste into Excel it looks like this:


Paste in the link of your data source beneath the table, so that you know where you found it.

Making a ‘working’ data sheet

It is good practice to have a separate sheet for your raw data and then another sheet where you work with the data. That way if you accidentally delete or change the data, you can always return to the original source.

To do this, go down to the ‘Oil production raw’ sheet tab and right click, then choose Move or Copy.

Check the ‘Create a copy’ box and then rename the new sheet ‘Oil production working’.
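The same raw-versus-working idea applies outside Excel too. Below is a minimal Python sketch of it, assuming the sheet is held as a list of rows; the names `raw_sheet` and `working_sheet` and all the values are made-up placeholders, not figures from the Factbook.

```python
import copy

# Hypothetical raw rows: rank, country, barrels per day (placeholder values).
raw_sheet = [
    ["Rank", "Country", "Barrels per day"],
    [1, "Country A", 300],
    [2, "Country B", 200],
]

# Work on a deep copy so that edits never touch the raw data.
working_sheet = copy.deepcopy(raw_sheet)

# Simulate an accidental change in the working copy.
working_sheet[1][2] = 0

# The raw sheet still holds the original value, so we can always go back to it.
original_value = raw_sheet[1][2]
```

A deep copy is used here because each row is itself a list; a shallow copy would share the rows, and a change to the working copy would also alter the raw data.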

Cleaning the data

We have some unnecessary information in this data set, so we need to remove it before we can make our chart. In our ‘Oil production working’ sheet, let’s remove column A (the ranking). To do this, click in column A, then right click and choose Delete. Check the box saying ‘Entire column’.
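For comparison, deleting a column can be sketched in a few lines of Python. This assumes the working sheet is a list of rows with the ranking in the first position; the variable names and values are illustrative only.

```python
# Hypothetical working rows; the first column is the ranking we no longer need.
working_sheet = [
    ["Rank", "Country", "Barrels per day"],
    [1, "Country A", 300],
    [2, "Country B", 200],
]

# Keep everything except the first column of each row,
# which mirrors deleting the entire column A in Excel.
cleaned = [row[1:] for row in working_sheet]
```

The original `working_sheet` list is left as it was; `cleaned` is a new list without the ranking column.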


Now our sheet looks like this:

On a Mac you may find the formatting hasn’t been cleared, so for example all the countries might still be underlined. To remove formatting, choose Clear > Formats.

Now our sheet looks like this:


Making a table

If you want to sort the contents of this sheet or make a graph or chart out of the data, it’s a good idea to turn the data into a table.

To highlight the entire table, press Command-A.

You can also press Command-Shift-Down and then Command-Shift-Right to select the data if you prefer. After you select the data, you can check each corner of the selection by pressing Ctrl-Full stop (Ctrl-.).

When you have highlighted the data, choose Insert > Table.

Ideally the header row box will be ticked:

Now the data looks like this:

Creating a graph

Let’s graph the top 20 producers in a bar chart. Highlight the top 20 rows and choose Insert > Chart.


Then choose the column graph.

Then choose 2D clustered columns.

That produces a graph that looks like this:


We can rename the graph by double clicking on the title. We might do this to make the data more understandable, for example ‘Leading oil producers (BBL/day)’.
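Picking the ‘top 20’ for a chart comes down to sorting by the production figure and keeping the first rows. Here is a small Python sketch of that selection, using a hypothetical helper `top_n` and placeholder values rather than real production figures.

```python
# Hypothetical (country, barrels per day) pairs; values are placeholders.
production = [
    ("Country A", 120),
    ("Country B", 450),
    ("Country C", 300),
    ("Country D", 80),
]


def top_n(rows, n):
    """Return the n rows with the largest value, largest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# With the real data you would ask for 20; 2 keeps the example short.
top_two = top_n(production, 2)
```

If the data is already sorted from largest to smallest, as it is when pasted from a ranked list, simply taking the first 20 rows gives the same result.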

Filtering a Chart

By using Chart > Source Data you can toggle elements of the chart on and off.

Say you want to compare the biggest producers on each continent. You can take out the countries that don’t apply using the Remove button, leaving yourself with the countries that do.
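Removing series from a chart is the same idea as keeping only the rows for a chosen set of countries. A minimal Python sketch, assuming placeholder data and a hypothetical set called `selected`:

```python
# Hypothetical (country, barrels per day) pairs; values are placeholders.
production = [
    ("Country A", 120),
    ("Country B", 450),
    ("Country C", 300),
    ("Country D", 80),
]

# The countries we want to keep on the chart (an illustrative choice).
selected = {"Country B", "Country D"}

# Keep only the rows whose country is in the selected set.
filtered = [row for row in production if row[0] in selected]
```

The order of the remaining rows follows the original data, just as the chart keeps its remaining columns in place.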


To change the colour of the chart, first click on the columns to highlight them.

Then use Chart Layout to select colours like this:

Part 2 – Top Oil Consuming Countries

Find a dataset for the top consuming countries: do a Google search for ‘top oil consuming countries’ and click on the Wikipedia article.

Mousing over the superscript numbers will give you links to their sources. Click on number 2 – the Statistical Review of World Energy and open that link up.


This is a report produced by BP (a company). Information provided by companies needs a little extra scrutiny before being used, but this is the kind of numerical data where there isn’t much ambiguity, and there are citations beneath the tables describing where the data is sourced from. There is some good oil consumption data on p.9 of this report:

However, if we try to copy and paste from the PDF, the data doesn’t come out properly and is unusable, so we need to find another way to grab this data.

Extracting data from a pdf

The first thing we need to do is download the PDF. Use the download icon (the little down arrow and computer) on the document to do this.

Once it’s downloaded go to your downloads folder and open the file up.

This is a 48 page document, but we only want one page for our purposes. So we need to extract page 9 from the PDF, because we don’t want 48 pages’ worth of Excel tables, especially when a lot of it is just text.

So, to extract page 9:

With the downloaded PDF open, use the view option to select thumbnail view.


Then go down to page 9 in the thumbnails and click on it so a border appears around it.

Now go to Edit > Copy.

Then choose File > New from Clipboard.


This will make a duplicate of that page called ‘Untitled 1’.

Then use File > Save to save this as something indicative, like ‘oil consumption’.

Now you can upload that single-page PDF into a PDF to Excel file converter. The one I used was https://online2pdf.com/

Once you arrive at the website, select the file you placed on the desktop and press Convert.

Once converted, your file should look like this:

Working with the Oil Consumption Data from BP

This data is quite bunched up, and it’s formatted differently to our other sheet. So to standardize the appearance of this sheet we need to clear formats again.


And to remove this bunching, use Format > Autofit Row Height.

Now the data looks like this:

The font size is 10 and the font is Times New Roman, so we can change that here if we like.

Once our data is all tidied up, let’s use Command-A to select it and copy it into a new sheet in our workbook with the production data. We’ll call this new sheet ‘Oil consumption raw’. Paste the link to the source below this data as well.

Now we can make another sheet (right click > move or copy > choose move to end > check the box called ‘create a copy’) and rename the new sheet Oil Consumption Working.

For the purposes of this exercise we might delete the rightmost two columns that track percentage changes.


After doing that, highlight all the relevant data (headings, countries and data, but not the footnotes) and then choose Table > New > Insert Table with Headers.

Then open the drop down beneath the header 2015 and choose Sort Descending.

These are our sorted values. Note that all of the top rows are now continent totals, which we don’t want.


Using text filters

To get rid of the total rows discovered above, you can highlight them and then right click > Delete.

However, you can also use a text filter to help locate the rows you want to delete. This method is useful for large datasets where it isn’t easy to go through and delete rows manually.

Select the drop down menu under ‘Thousand barrels daily’.

Then choose Filter > Contains and type ‘total’.

You’ll now see all the rows that have the word ‘total’ in them.

Now highlight all the rows that you need to delete, then right click and choose Delete Row.


Now to get back to your data, open up the drop down beneath thousand barrels daily again and then choose ‘clear filter’.

You can repeat this process with other words if that’s useful.
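The ‘contains’ filter can be pictured as a simple text test applied to every row. The Python sketch below assumes rows stored as dictionaries with placeholder names and values, and it assumes the match should ignore upper and lower case; `contains` is a hypothetical helper written for this example.

```python
# Hypothetical rows from the consumption sheet; names and values are placeholders.
rows = [
    {"name": "Country A", "2015": 500},
    {"name": "Total Region X", "2015": 900},
    {"name": "Country B", "2015": 400},
    {"name": "Total World", "2015": 1300},
]


def contains(text, word):
    """True if word appears anywhere in text, ignoring case (an assumption)."""
    return word.lower() in text.lower()


# Rows the filter would show, i.e. the ones we want to delete.
totals = [row for row in rows if contains(row["name"], "total")]

# Rows left after deleting the totals and clearing the filter.
countries = [row for row in rows if not contains(row["name"], "total")]
```

Swapping ‘total’ for another word repeats the process for other unwanted rows.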

Making a Line Chart

Let’s make a chart to show US consumption of oil over time. To do that, we need to highlight the top two rows of the table: the headings and the US row.

Then choose Insert > Charts.

Then choose Line.

Then select a type of line chart.


At first your data may look back to front:

But if you go to Data and choose ‘plot series by row’, it will look normal again:
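Switching between plotting by column and plotting by row is loosely like transposing the selection, so that what was read across is read down instead. A minimal Python sketch of a transpose, using placeholder values and assuming the selection is a header row plus one country row:

```python
# Hypothetical selection: a header row and one country row (placeholder values).
table = [
    ["Country", "2005", "2006", "2007"],
    ["Country A", 10, 12, 11],
]

# zip(*table) pairs up the items in each position, turning rows into columns.
by_column = [list(column) for column in zip(*table)]

# Each year is now paired with its value, which is what a line over time needs.
year_value_pairs = by_column[1:]
```

This is only an analogy for what the chart option does; Excel does not change the data in the sheet when you switch how the series are plotted.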

If you want to, you can add a label for the y axis. Use Chart Layout and then select Axis Titles.


Then change the text saying ‘Axis Title’ to ‘Thousands of barrels per day’.

We can also double click on the title and change it to ‘US oil consumption 2005-2015’. The finished product looks like this:

Creating an average

Go back to our original sheet for oil consumption. If you want to calculate an average across all those years, click into the cell to the right of the top row of values.

Then click Formulas > Formula Builder.


Then double click the option called AVERAGE.

Excel will automatically select all the values in that row.

If you are happy with the range, press Enter and the average will autofill down for all the countries.

The heading, however, says 2016 where we want it to read ‘Average’. So double click on 2016 and relabel it.
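What AVERAGE does for each row is add up the yearly values and divide by how many there are. A short Python sketch of the same calculation, with a made-up header and placeholder values, using `statistics.mean` from the standard library:

```python
from statistics import mean

# Hypothetical header and rows; the yearly values are placeholders.
header = ["Country", "2013", "2014", "2015"]
rows = [
    ["Country A", 10, 20, 30],
    ["Country B", 4, 6, 8],
]

# Add an 'Average' heading, like relabelling the new column in Excel.
header_with_average = header + ["Average"]

# For each row, average every value after the country name and append it.
rows_with_average = [row + [mean(row[1:])] for row in rows]
```

The slice `row[1:]` plays the role of the range Excel selects automatically: it skips the country name and takes only the numbers.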


Charting average oil consumption

To show the average oil consumption of the top 10 consuming countries, hold down Command and select the top 10 countries from the countries column, and also select the matching averages.

To graph this, choose Insert > Chart.

Why not play around with the possibilities of charts and see what you can do with this data?

Standardizing units of measurement with a formula

If you wanted to compare these two datasets you would need to standardize the units of measurement, as consumption is expressed in thousands of barrels a day whereas production is expressed as a whole number of barrels per day. To standardize the two, we need to make the consumption figures also represent whole barrels, which means multiplying all the numbers in that set by 1000.

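To do this, choose an empty cell next to the original number, e.g. C2 if the original number is in B2.

Then, in the formula bar (fx) in Excel, write

=(B2*1000) and then press Enter.

The multiplied number will appear in whichever cell you clicked before running the operation. Writing the formula in the cell next to the original number, rather than in B2 itself, keeps the original value intact and avoids a formula that refers to its own cell.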


Then go to the cell with the multiplied number. You will see a small green dot at the bottom right of the cell. Click-drag this green dot downwards. It will apply the same formula to all the numbers below it.
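Dragging the formula down applies the same multiplication to every row. As a rough Python equivalent, assuming consumption is stored as (country, thousands of barrels per day) pairs with placeholder values:

```python
# Hypothetical consumption figures in thousands of barrels per day (placeholders).
consumption_thousands = [
    ("Country A", 19.5),
    ("Country B", 12.0),
]

# One thousand barrels per unit, the same factor used in =(B2*1000).
BARRELS_PER_UNIT = 1000

# Apply the same multiplication to every row, like filling the formula down.
consumption_barrels = [
    (country, value * BARRELS_PER_UNIT)
    for country, value in consumption_thousands
]
```

The converted figures go into a new list, so the original values stay available, in the same way the Excel formula sits in a separate column from the original numbers.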