Common SQL Mistakes
-
Mu Sigma Confidential
Chicago, IL
Bangalore, India
www.mu-sigma.com
Proprietary Information
"This document and its attachments are confidential. Any unauthorized copying, disclosure or distribution of the material is strictly forbidden"
Do The Math
Boldly redefine the possibilities of using data to drive business actions & exceed client expectations
ANALYZE data!! SOCIALIZE insights & learnings!! MONETIZE business actions!
Common SQL Mistakes
Suvranil Basu
-
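Ignoring the level (grain) of the data leads to meaningless queries and illogical group-bys: grouping by unit_price inside a subquery counts the same user once per distinct price and adds each price only once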
-- Incorrect: the inner group by on revenue (unit_price) breaks the level of the data
Select gender,
       sum(user_count) as User_count2,
       sum(revenue)
from (Select gender,
             unit_price as revenue,
             Count(distinct user_id) as user_count
      from Users u
      join Orders o on o.user_id = u.id
      join Order_items oi on o.id = oi.order_id
      where status not in ('o','c')
      group by gender,
               revenue) a
group by gender;

-- Correct: aggregate once, directly at the gender level
Select gender,
       sum(unit_price) as revenue,
       Count(distinct user_id) as user_count
from Users u
join Orders o on o.user_id = u.id
join Order_items oi on o.id = oi.order_id
where status not in ('o','c')
group by gender;
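The difference between the two queries can be checked on a few rows. The sketch below is only an illustration: it runs both queries in an in-memory SQLite database through Python's sqlite3 module, and the table layout, the sample rows, and the status code 'p' are assumptions made for the example, not part of the original deck.

```python
import sqlite3

# Toy schema and data (assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (id integer primary key, gender text);
    create table orders (id integer primary key, user_id integer, status text);
    create table order_items (id integer primary key, order_id integer, unit_price real);

    insert into users values (1, 'F'), (2, 'M');
    -- user 1 has two valid orders; user 2 has one valid and one cancelled ('c') order
    insert into orders values (10, 1, 'p'), (11, 1, 'p'), (12, 2, 'p'), (13, 2, 'c');
    insert into order_items values
        (100, 10, 5.0), (101, 10, 7.0), (102, 11, 5.0), (103, 12, 5.0), (104, 13, 9.0);
""")

# Grouping by unit_price in the subquery: one row per (gender, price).
wrong = conn.execute("""
    select gender, sum(user_count) as user_count2, sum(revenue) as revenue
    from (select u.gender as gender,
                 oi.unit_price as revenue,
                 count(distinct o.user_id) as user_count
          from users u
          join orders o on o.user_id = u.id
          join order_items oi on oi.order_id = o.id
          where o.status not in ('o', 'c')
          group by u.gender, oi.unit_price) a
    group by gender
    order by gender
""").fetchall()

# Aggregating once at the gender level.
right = conn.execute("""
    select u.gender, sum(oi.unit_price) as revenue,
           count(distinct o.user_id) as user_count
    from users u
    join orders o on o.user_id = u.id
    join order_items oi on oi.order_id = o.id
    where o.status not in ('o', 'c')
    group by u.gender
    order by u.gender
""").fetchall()

print("wrong:", wrong)
print("right:", right)
```

With this data the single female user bought items at two distinct prices, so the first query reports her as two users and adds each distinct price only once, while the second query reports one user and the full revenue.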
-
Filtering on a column of the left-joined (right-hand) table in the where clause silently turns the left join into an inner join: unmatched rows carry NULLs, and NULL never satisfies the filter
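-- Incorrect: the where clause discards every user without a qualifying order
Select user_id,
       gender,
       created_at,
       unit_price,
       oi.id as item_id
from users u
left join orders o on o.user_id = u.id
left join order_items oi on oi.order_id = o.id
where o.status not in ('o','c');

-- Correct: filter orders before the join so that all users are kept
-- (u.id is selected because o.user_id is NULL for users without orders)
Select u.id as user_id,
       gender,
       created_at,
       unit_price,
       oi.id as item_id
from users u
left join (select *
           from orders
           where status not in ('o','c')) o on o.user_id = u.id
left join order_items oi on oi.order_id = o.id;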
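A small experiment makes the effect visible. This is a hedged sketch using Python's sqlite3 module: the tables, rows, and the status code 'p' are assumed for illustration, and the order_items join is left out to keep the example short. It also shows a third variant that is not in the original deck, placing the filter in the on clause, which behaves like the subquery form.

```python
import sqlite3

# Toy data (assumed): user 1 has a valid order, user 2 only a cancelled one,
# user 3 has no orders at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (id integer primary key, gender text);
    create table orders (id integer primary key, user_id integer, status text);

    insert into users values (1, 'F'), (2, 'M'), (3, 'F');
    insert into orders values (10, 1, 'p'), (11, 2, 'c');
""")

# Filter in the where clause: NULL status fails the test, so users 2 and 3 vanish.
where_filter = conn.execute("""
    select u.id, o.id
    from users u
    left join orders o on o.user_id = u.id
    where o.status not in ('o', 'c')
    order by u.id
""").fetchall()

# Filter applied inside a subquery before the join: every user survives.
subquery_filter = conn.execute("""
    select u.id, o.id
    from users u
    left join (select * from orders
               where status not in ('o', 'c')) o on o.user_id = u.id
    order by u.id
""").fetchall()

# Alternative (not from the deck): the same filter placed in the on clause.
on_filter = conn.execute("""
    select u.id, o.id
    from users u
    left join orders o on o.user_id = u.id
                      and o.status not in ('o', 'c')
    order by u.id
""").fetchall()

print("where clause:", where_filter)
print("subquery:    ", subquery_filter)
print("on clause:   ", on_filter)
```

Only the where-clause version loses the users without a qualifying order; the other two keep all three users and return NULL (None in Python) for the missing order.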
-
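A self-join on a non-unique key, written without regard to the level of the data, produces an unwanted Cartesian product: joining orders to itself on user_id pairs every order of a user with every order of that same user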
-- Incorrect: user_id is not unique in orders, so each user's orders are cross-multiplied
Select sum(oi.unit_price) as revenue
from orders o
join orders o1 on o.user_id = o1.user_id
join order_items oi on oi.order_id = o1.id
where o.status not in ('o','c')
  and o1.status not in ('o','c');

-- Correct: joining on the unique key (id) keeps one row per order
Select sum(oi.unit_price) as revenue
from orders o
join orders o1 on o.id = o1.id
join order_items oi on oi.order_id = o.id
where o.status not in ('o','c');
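The inflation caused by the user_id self-join is easy to reproduce. The following is a minimal sketch with Python's sqlite3 module; the rows and the status code 'p' are assumptions chosen so that one user has two valid orders.

```python
import sqlite3

# Toy data (assumed): user 1 has two valid orders, user 2 has one valid
# and one cancelled ('c') order. True revenue of valid orders: 5 + 7 + 3 = 15.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table orders (id integer primary key, user_id integer, status text);
    create table order_items (id integer primary key, order_id integer, unit_price real);

    insert into orders values (10, 1, 'p'), (11, 1, 'p'), (12, 2, 'p'), (13, 2, 'c');
    insert into order_items values (100, 10, 5.0), (101, 11, 7.0), (102, 12, 3.0), (103, 13, 9.0);
""")

# Self-join on user_id: user 1 contributes 2 x 2 = 4 order pairs.
cartesian = conn.execute("""
    select sum(oi.unit_price)
    from orders o
    join orders o1 on o.user_id = o1.user_id
    join order_items oi on oi.order_id = o1.id
    where o.status not in ('o', 'c')
      and o1.status not in ('o', 'c')
""").fetchone()[0]

# Self-join on the unique key: exactly one pair per order.
unique_key = conn.execute("""
    select sum(oi.unit_price)
    from orders o
    join orders o1 on o.id = o1.id
    join order_items oi on oi.order_id = o.id
    where o.status not in ('o', 'c')
""").fetchone()[0]

print("self-join on user_id:", cartesian)
print("self-join on id:     ", unique_key)
```

User 1's two orders are each counted twice by the first query, which is why its total exceeds the true revenue; the more orders a user has, the larger the overcount.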
-
Joining tables that sit at different levels multiplies rows, so aggregates computed after the join are inflated: each order item is repeated once per referral, and each referral once per order item
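-- Incorrect: referrals (one row per referral) and order_items (one row per item)
-- multiply each other for every user
Select sum(unit_price) as total_revenue,
       count(r.id) as total_referrals
from orders o
join referrals r on o.user_id = r.referring_user_id
join order_items oi on oi.order_id = o.id
where status not in ('o','c');

-- Correct: bring both sides to the user level before joining, so that neither
-- the revenue nor the referral count is repeated
Select sum(rev.revenue) as total_revenue,
       sum(r.referrals) as total_referrals
from (select o.user_id,
             sum(oi.unit_price) as revenue
      from orders o
      join order_items oi on oi.order_id = o.id
      where o.status not in ('o','c')
      group by o.user_id) rev
join (select referring_user_id,
             count(distinct id) as referrals
      from referrals
      group by referring_user_id) r
  on rev.user_id = r.referring_user_id;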
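The row multiplication can be verified on a handful of rows. The sketch below uses Python's sqlite3 module with assumed tables, rows, and status code 'p'; it compares a direct join, a join where only the referrals are pre-aggregated, and a join where both sides are first aggregated to the user level.

```python
import sqlite3

# Toy data (assumed): user 1 has one order with two items and two referrals;
# user 2 has one order with one item and one referral.
# True totals: revenue 5 + 7 + 3 = 15, referrals 3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table orders (id integer primary key, user_id integer, status text);
    create table order_items (id integer primary key, order_id integer, unit_price real);
    create table referrals (id integer primary key, referring_user_id integer);

    insert into orders values (10, 1, 'p'), (11, 2, 'p');
    insert into order_items values (100, 10, 5.0), (101, 10, 7.0), (102, 11, 3.0);
    insert into referrals values (1000, 1), (1001, 1), (1002, 2);
""")

# Direct join: items and referrals multiply each other.
naive = conn.execute("""
    select sum(oi.unit_price), count(r.id)
    from orders o
    join referrals r on o.user_id = r.referring_user_id
    join order_items oi on oi.order_id = o.id
    where o.status not in ('o', 'c')
""").fetchone()

# Referrals aggregated per user, items still at item level:
# revenue is right, but the referral count repeats once per item.
referrals_only = conn.execute("""
    select sum(oi.unit_price), sum(r.referrals)
    from orders o
    join (select referring_user_id, count(distinct id) as referrals
          from referrals
          group by referring_user_id) r on o.user_id = r.referring_user_id
    join order_items oi on oi.order_id = o.id
    where o.status not in ('o', 'c')
""").fetchone()

# Both sides aggregated to one row per user before the join.
both_sides = conn.execute("""
    select sum(rev.revenue), sum(r.referrals)
    from (select o.user_id, sum(oi.unit_price) as revenue
          from orders o
          join order_items oi on oi.order_id = o.id
          where o.status not in ('o', 'c')
          group by o.user_id) rev
    join (select referring_user_id, count(distinct id) as referrals
          from referrals
          group by referring_user_id) r on rev.user_id = r.referring_user_id
""").fetchone()

print("direct join:          ", naive)
print("referrals aggregated: ", referrals_only)
print("both sides aggregated:", both_sides)
```

Only the last form returns both true totals. Because these are inner joins, users who never referred anyone are excluded from the revenue figure in all three variants; a left join from the revenue side would be needed to include them.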