Information Technology in Business and Society
![Page 1: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/1.jpg)
INFORMATION TECHNOLOGY IN BUSINESS AND SOCIETY
SESSION 17 – ADVANCED SQL + DATA MINING
SEAN J. TAYLOR
![Page 2: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/2.jpg)
ADMINISTRIVIA
• Assignment 3: New drop for any updates related to A3
• Assignment 4: Due Sunday 4/1 (this is an extension)
• Class participation grading.
![Page 3: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/3.jpg)
MIDTERM REVIEW PROCESS
• Consult the solutions (posted to BB).
• Photocopy the page(s) of your exam that you wish to dispute.
• Write why you think you deserve points.
• Submit to my mailbox on the 8th floor by Thursday 3/29 (or after class).
![Page 4: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/4.jpg)
LEARNING OBJECTIVES
1. Be able to write more advanced queries.
2. Learn about the data-driven organization and the data revolution in management.
3. Know the basic problems data mining attempts to solve.
![Page 5: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/5.jpg)
REVIEW: SQL
SELECT ISBN, BookName, Price, Publisher
FROM Book
WHERE
BookName like '*Information Systems*'
AND PubDate > #1/1/2002#
AND Price < 100
ORDER BY Price
![Page 6: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/6.jpg)
REVIEW: GROUP BY … HAVING
Use the “HAVING” clause to filter the aggregation result:
SELECT Publisher, COUNT(*)
FROM Book
GROUP BY Publisher
HAVING COUNT(*) > 2
Use the “WHERE” clause to filter the records to be aggregated:
SELECT Publisher, COUNT(*) AS total
FROM Book
WHERE Price < 100
GROUP BY Publisher
HAVING COUNT(*) > 10
ORDER BY COUNT(*)
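The difference between WHERE and HAVING can be checked on a small table. The following is a minimal sketch using Python's built-in `sqlite3` module rather than the database used in class; the sample rows are hypothetical, and the HAVING threshold is lowered to 1 so the tiny table returns something.

```python
import sqlite3

# Hypothetical sample rows; only the Book table and its column names come from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (ISBN TEXT, BookName TEXT, Price REAL, Publisher TEXT)")
conn.executemany(
    "INSERT INTO Book VALUES (?, ?, ?, ?)",
    [
        ("1", "A", 50, "Wiley"),
        ("2", "B", 60, "Wiley"),
        ("3", "C", 150, "Wiley"),
        ("4", "D", 40, "Pearson"),
        ("5", "E", 120, "Pearson"),
    ],
)

# HAVING filters groups after aggregation: publishers with more than 2 books overall.
having_only = conn.execute(
    "SELECT Publisher, COUNT(*) FROM Book GROUP BY Publisher HAVING COUNT(*) > 2"
).fetchall()

# WHERE filters rows before aggregation: only books priced under 100 are counted.
where_then_having = conn.execute(
    "SELECT Publisher, COUNT(*) AS total FROM Book WHERE Price < 100 "
    "GROUP BY Publisher HAVING COUNT(*) > 1 ORDER BY COUNT(*)"
).fetchall()

print(having_only)
print(where_then_having)
```

In this sample the first query counts all three Wiley books, while the second counts only the two priced under 100, because the expensive book is dropped before grouping.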
![Page 7: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/7.jpg)
MULTIPLE GROUP BY FIELDS
SELECT Publisher, Author, AVG(Price) as AvgPrice
FROM Book GROUP BY Publisher, Author;
![Page 8: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/8.jpg)
GROUPING WITH A JOIN
SELECT Publisher, Count(*) as NumOrders
FROM Book, Orders
WHERE Book.ISBN = Orders.ISBN
GROUP BY Publisher;
![Page 9: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/9.jpg)
GROUPING WITH A JOIN 2
SELECT Publisher, Orders.CustomerID, Sum(price) as TotalPaid
FROM Book, Orders, Customer
WHERE Book.ISBN = Orders.ISBN
AND Orders.CustomerID = Customer.CustomerID
GROUP BY Publisher, Orders.CustomerID;
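To see what a grouped three-table join returns, here is a small sketch run through Python's `sqlite3` module. The table layouts (including the `OrderID` column) and all rows are assumptions made for illustration; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows; the slides only name the tables and a few columns.
conn.executescript("""
CREATE TABLE Book (ISBN TEXT, Publisher TEXT, Price REAL);
CREATE TABLE Customer (CustomerID INTEGER);
CREATE TABLE Orders (OrderID INTEGER, ISBN TEXT, CustomerID INTEGER);
INSERT INTO Book VALUES ('1', 'Wiley', 50), ('2', 'Wiley', 30), ('3', 'Pearson', 80);
INSERT INTO Customer VALUES (1), (2);
INSERT INTO Orders VALUES (10, '1', 1), (11, '2', 1), (12, '3', 1), (13, '1', 2);
""")

# Each order row is matched to its book and customer, then summed per
# (publisher, customer) pair.
rows = conn.execute("""
SELECT Publisher, Orders.CustomerID, SUM(Price) AS TotalPaid
FROM Book, Orders, Customer
WHERE Book.ISBN = Orders.ISBN
  AND Orders.CustomerID = Customer.CustomerID
GROUP BY Publisher, Orders.CustomerID
ORDER BY Publisher, Orders.CustomerID
""").fetchall()

for row in rows:
    print(row)
```

Customer 1 bought two Wiley books, so those two order rows collapse into one group with a combined total.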
![Page 10: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/10.jpg)
MULTIPLE JOINS WITH WHERE AND GROUP BY
SELECT FavoriteMovie, count(*)
FROM Profiles, FavoriteBooks, FavoriteMovies
WHERE
FavoriteMovies.ProfileId = Profiles.ProfileId
and FavoriteBooks.ProfileID = Profiles.ProfileID
and FavoriteBook = "The Great Gatsby"
GROUP BY FavoriteMovie ORDER BY count(*) desc;
![Page 11: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/11.jpg)
PROPORTIONS USING SUB-SELECTS
SELECT FavoriteMovie, count(*) / (select count(*) from Profiles)
FROM Profiles, FavoriteMovies
WHERE
FavoriteMovies.ProfileId = Profiles.ProfileId
GROUP BY FavoriteMovie
ORDER BY count(*) desc;
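A caveat when trying this query outside the class database: the slides' syntax looks like Microsoft Access, where `/` yields a fractional result, but some engines (SQLite among them) truncate integer division. The sketch below runs the query in Python's `sqlite3` with hypothetical rows and shows both the truncated result and a version that multiplies by `1.0` to force a fractional share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows; table and column names follow the slides.
conn.executescript("""
CREATE TABLE Profiles (ProfileId INTEGER, Sex TEXT);
CREATE TABLE FavoriteMovies (ProfileId INTEGER, FavoriteMovie TEXT);
INSERT INTO Profiles VALUES (1, 'F'), (2, 'F'), (3, 'M'), (4, 'M');
INSERT INTO FavoriteMovies VALUES (1, 'Heat'), (2, 'Heat'), (3, 'Heat'), (4, 'Up'), (1, 'Up');
""")

query = """
SELECT FavoriteMovie, {numerator} / (SELECT COUNT(*) FROM Profiles)
FROM Profiles, FavoriteMovies
WHERE FavoriteMovies.ProfileId = Profiles.ProfileId
GROUP BY FavoriteMovie
ORDER BY COUNT(*) DESC
"""

# As written on the slide: in SQLite both operands are integers, so the share truncates.
truncated = conn.execute(query.format(numerator="COUNT(*)")).fetchall()

# Multiplying by 1.0 forces floating-point division.
shares = conn.execute(query.format(numerator="1.0 * COUNT(*)")).fetchall()

print(truncated)
print(shares)
```

The sub-select is evaluated once and gives the total number of profiles, so each movie's count is divided by the same denominator.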
![Page 12: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/12.jpg)
PROPORTIONS USING SUB-SELECTS II
SELECT FavoriteMovie, Profiles.Sex, count(*) / avg(Q.total)
from Profiles, FavoriteMovies, (select Sex, count(*) as total from Profiles group by Sex) as Q
where
FavoriteMovies.ProfileId = Profiles.ProfileId
and Q.Sex = Profiles.Sex
group by Profiles.Sex, FavoriteMovie
order by FavoriteMovie, Profiles.Sex;
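The per-group version can be traced the same way. This sketch, again using Python's `sqlite3` with hypothetical rows, runs the query above. The derived table `Q` holds one total per sex; because every row in a (sex, movie) group carries the same `Q.total`, `avg(Q.total)` simply recovers that total as the denominator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: 2 female and 4 male profiles.
conn.executescript("""
CREATE TABLE Profiles (ProfileId INTEGER, Sex TEXT);
CREATE TABLE FavoriteMovies (ProfileId INTEGER, FavoriteMovie TEXT);
INSERT INTO Profiles VALUES (1, 'F'), (2, 'F'), (3, 'M'), (4, 'M'), (5, 'M'), (6, 'M');
INSERT INTO FavoriteMovies VALUES
  (1, 'Heat'), (3, 'Heat'), (4, 'Heat'), (6, 'Heat'), (2, 'Up'), (5, 'Up');
""")

# Share of each sex that lists each movie as a favorite.
by_sex = conn.execute("""
SELECT FavoriteMovie, Profiles.Sex, COUNT(*) / AVG(Q.total)
FROM Profiles, FavoriteMovies,
     (SELECT Sex, COUNT(*) AS total FROM Profiles GROUP BY Sex) AS Q
WHERE FavoriteMovies.ProfileId = Profiles.ProfileId
  AND Q.Sex = Profiles.Sex
GROUP BY Profiles.Sex, FavoriteMovie
ORDER BY FavoriteMovie, Profiles.Sex
""").fetchall()

for row in by_sex:
    print(row)
```

In SQLite `AVG` returns a floating-point value, so no extra conversion is needed here to avoid truncation.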
![Page 13: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/13.jpg)
THE DATA-DRIVEN FIRM
![Page 14: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/14.jpg)
GARY LOVEMAN
• Zero executive experience
• Zero background in Casinos
• But, an MIT PhD who knows how to make numbers talk
Results
• Transformed Harrah’s from second tier to number one gaming company in the world
• Completed a $30.7 Billion LBO
• Introduced a culture of pervasive field experimentation
“There are two ways to get fired from Harrah’s…”
![Page 15: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/15.jpg)
THE DATA-DRIVEN FIRM
Why do we see these changes now?
• Collect: It is easier to collect and store information about consumers, technologies, and markets.
• Respond: Fast internal communication means that firms are agile enough to respond to external information.
• Process: Firms can process large volumes of data to make intelligent decisions.
![Page 16: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/16.jpg)
DATA-DRIVEN FIRMS ARE WINNING
Data-driven decision makers:
• 4% higher productivity
• 6% greater profitability
• 50% higher market value from IT
(Brynjolfsson and Kim, 2011)
![Page 17: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/17.jpg)
WHAT WAL-MART KNOWS
http://www.nytimes.com/2004/11/14/business/yourmoney/14wal.html
![Page 18: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/18.jpg)
![Page 19: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/19.jpg)
DATA-DRIVEN CHALLENGES
1. Measurement
What should be measured and how?
2. Incentives
How can we design incentives around these measures without creating adverse consequences?
3. Infrastructure
Do we have the right infrastructure (servers, software, etc.) in place to measure and analyze the data we have?
4. Skills
Do we have the skills we need to accomplish these tasks?
![Page 20: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/20.jpg)
A NEW KIND OF R&D
Measure
Experiment
Learn
Share
Replicate
![Page 21: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/21.jpg)
WHAT IS DATA MINING?
1. Automated search for patterns in data
2. Automated (or computer assisted) statistical modeling
3. A process for using IT to extract useful, actionable knowledge from large bodies of data
![Page 22: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/22.jpg)
“BIG DATA”
http://online.wsj.com/video/2012-the-year-of-big-data/D4237159-C9A9-4A09-9701-F03EF7FB8040.html
![Page 23: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/23.jpg)
BIG NAMES WITH BIG DATA
![Page 24: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/24.jpg)
CEOS
“We have come out on top in the casino wars by mining our customer data deeply, running marketing experiments and using the results to develop and implement finely tuned marketing and services strategies that keep our customers coming back.”
Gary Loveman, Harrah’s CEO
“For every leader in the company, not just for me, there are decisions that can be made by analysis. These are the best kinds of decisions. They’re fact-based decisions.”
Jeff Bezos, Amazon CEO
“It’s all about collecting information on 200 million people you’d never meet, and on the basis of that information, making a series of very critical long-term decisions about lending them money and hoping they would pay you back.”
Rich Fairbank, founder and CEO of Capital One
![Page 25: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/25.jpg)
WHY NOW?
Firms are collecting massive amounts of data on operations, customers, and the competitive landscape.
But there is far too much data for manual analysis.
• Amazon: > 50M active customers
• Phone companies: 100M+ accounts, thousands of transactions each
• Google: 11B “objects”
• RFID tags
![Page 26: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/26.jpg)
TYPES OF DATA MINING
Machine Learning
• Supervised: Classification, Regression
• Unsupervised: Clustering, Outlier detection
Visualization
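The supervised/unsupervised split can be illustrated with two toy functions. This is only a sketch under stated assumptions: the data, the labels, and the choice of a 1-nearest-neighbor rule and a standard-deviation threshold are all hypothetical stand-ins, not methods named in the slides.

```python
from statistics import mean, pstdev

# Supervised (classification): labeled examples are used to predict a label.
# Hypothetical data: (monthly_visits, label).
labeled = [(1, "casual"), (2, "casual"), (9, "loyal"), (10, "loyal")]

def classify(visits):
    """1-nearest-neighbor: copy the label of the closest labeled example."""
    nearest = min(labeled, key=lambda row: abs(row[0] - visits))
    return nearest[1]

# Unsupervised (outlier detection): no labels, just flag values far from the rest.
# Hypothetical purchase amounts.
amounts = [20, 22, 19, 21, 20, 95]

def outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) > threshold * sigma]

print(classify(8))
print(outliers(amounts))
```

The classifier needs the labels to work at all, while the outlier check uses only the shape of the data itself; that is the practical distinction between the two branches.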
![Page 27: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/27.jpg)
![Page 28: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/28.jpg)
OUR ROADMAP
1. Visualization
2. Basic Data Mining Process
3. Classification Example
4. Clustering Example
![Page 29: Information technology in business and society](https://reader035.vdocument.in/reader035/viewer/2022070419/56815d0a550346895dcb098c/html5/thumbnails/29.jpg)
NEXT CLASS: DATA MINING II
• Work on A4