Common SQL Mistakes


Upload: kavitha

Post on 03-Nov-2015


TRANSCRIPT

  • 0. Mu Sigma Confidential

    Chicago, IL

    Bangalore, India

    www.mu-sigma.com

    Proprietary Information

    "This document and its attachments are confidential. Any unauthorized copying, disclosure or distribution of the material is strictly forbidden"

    Do The Math

    Boldly redefine the possibilities of using data to drive business actions & exceed client expectations

    ANALYZE data!! SOCIALIZE insights & learnings!! MONETIZE business actions!

    Common SQL Mistakes

    Suvranil Basu

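  • 1. Ignoring the level (grain) of the data leads to meaningless queries and illogical group-bys. Grouping by a measure such as unit_price and then summing the groups collapses identical prices into a single row and counts the same user once for every distinct price.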

    Incorrect (groups by the measure, then sums the groups):

        select gender,
               sum(user_count) as user_count2,
               sum(revenue)
        from (select u.gender,
                     oi.unit_price as revenue,
                     count(distinct o.user_id) as user_count
              from users u
              join orders o on o.user_id = u.id
              join order_items oi on oi.order_id = o.id
              where o.status not in ('o', 'c')
              group by u.gender, oi.unit_price) a
        group by gender;

    Correct (a single aggregation at the gender level):

        select u.gender,
               sum(oi.unit_price) as revenue,
               count(distinct o.user_id) as user_count
        from users u
        join orders o on o.user_id = u.id
        join order_items oi on oi.order_id = o.id
        where o.status not in ('o', 'c')
        group by u.gender;
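The effect of grouping at the wrong level can be checked on a few rows. The following is a minimal sketch using Python's built-in sqlite3 module; the table layouts, the sample rows, and the status value 'p' are assumptions made for illustration and are not taken from the slides.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table users (id integer primary key, gender text);
    create table orders (id integer primary key, user_id integer, status text);
    create table order_items (id integer primary key, order_id integer, unit_price real);
    -- Hypothetical sample data; 'p' is a made-up status that passes the filter.
    insert into users values (1, 'F'), (2, 'F');
    insert into orders values (10, 1, 'p'), (11, 2, 'p');
    insert into order_items values (100, 10, 5.0), (101, 10, 5.0),
                                   (102, 11, 5.0), (103, 11, 7.0);
""")

# Grouping by the measure first: the three items priced 5.0 collapse into
# one group, and user 2 is counted in both the 5.0 and the 7.0 group.
wrong = con.execute("""
    select gender, sum(user_count), sum(revenue)
    from (select u.gender as gender,
                 oi.unit_price as revenue,
                 count(distinct o.user_id) as user_count
          from users u
          join orders o on o.user_id = u.id
          join order_items oi on oi.order_id = o.id
          where o.status not in ('o', 'c')
          group by u.gender, oi.unit_price) a
    group by gender
""").fetchone()

# One aggregation at the gender level.
correct = con.execute("""
    select u.gender,
           sum(oi.unit_price) as revenue,
           count(distinct o.user_id) as user_count
    from users u
    join orders o on o.user_id = u.id
    join order_items oi on oi.order_id = o.id
    where o.status not in ('o', 'c')
    group by u.gender
""").fetchone()

print(wrong)    # ('F', 3, 12.0): 3 "users" and 12.0 revenue
print(correct)  # ('F', 22.0, 2): 22.0 revenue from 2 users
```

With this data the nested version reports three users and a revenue of 12.0, while the single group-by reports the actual two users and 22.0.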

  • 2. Filtering on a left-joined table in the where clause cancels the left join. Rows with no match carry NULL in that table's columns, fail the filter, and are dropped, so the query behaves like an inner join.

    Incorrect (the where clause removes every user without a qualifying order):

        select u.id as user_id,
               gender,
               created_at,
               unit_price,
               oi.id as item_id
        from users u
        left join orders o on o.user_id = u.id
        left join order_items oi on oi.order_id = o.id
        where o.status not in ('o', 'c');

    Correct (filter the orders before joining, so every user is kept):

        select u.id as user_id,
               gender,
               created_at,
               unit_price,
               oi.id as item_id
        from users u
        left join (select *
                   from orders
                   where status not in ('o', 'c')) o on o.user_id = u.id
        left join order_items oi on oi.order_id = o.id;
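A small sqlite3 sketch makes the difference visible. The sample rows and the status value 'p' are invented for illustration. The third query, which moves the filter into the on clause, is a common equivalent of the subquery form and is not part of the slides.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table users (id integer primary key, gender text);
    create table orders (id integer primary key, user_id integer, status text);
    -- Hypothetical data: user 1 has a valid order, user 2 only an excluded
    -- one ('c'), user 3 has no orders at all.
    insert into users values (1, 'F'), (2, 'M'), (3, 'F');
    insert into orders values (10, 1, 'p'), (11, 2, 'c');
""")

# Filter in the where clause: NULL-extended rows fail the predicate.
where_filter = con.execute("""
    select u.id, o.id
    from users u
    left join orders o on o.user_id = u.id
    where o.status not in ('o', 'c')
    order by u.id
""").fetchall()

# Filter inside a subquery before the join: all users survive.
subquery_filter = con.execute("""
    select u.id, o.id
    from users u
    left join (select * from orders
               where status not in ('o', 'c')) o on o.user_id = u.id
    order by u.id
""").fetchall()

# Alternative (an assumption, not from the slides): filter in the on clause.
on_filter = con.execute("""
    select u.id, o.id
    from users u
    left join orders o
           on o.user_id = u.id and o.status not in ('o', 'c')
    order by u.id
""").fetchall()

print(where_filter)     # [(1, 10)]
print(subquery_filter)  # [(1, 10), (2, None), (3, None)]
print(on_filter)        # [(1, 10), (2, None), (3, None)]
```

Only the where-clause version loses users 2 and 3; the other two keep them with NULL order columns.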

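  • 3. A self-join on a non-unique column, combined with carelessness about the level of the data, produces an unwanted Cartesian product. Joining orders to itself on user_id pairs every order of a user with every order of the same user, so each order item is counted once per pair.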

    Incorrect (self-join on user_id, which is not unique in orders):

        select sum(oi.unit_price) as revenue
        from orders o
        join orders o1 on o.user_id = o1.user_id
        join order_items oi on oi.order_id = o1.id
        where o.status not in ('o', 'c')
          and o1.status not in ('o', 'c');

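    Correct (join on the key o.id, which keeps one row per order; at this level the self-join is redundant and could be dropped):

        select sum(oi.unit_price) as revenue
        from orders o
        join orders o1 on o.id = o1.id
        join order_items oi on oi.order_id = o.id
        where o.status not in ('o', 'c');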
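The inflation caused by the self-join on user_id can be reproduced with two orders from one user. This is a minimal sqlite3 sketch with made-up rows (the status 'p' is hypothetical).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table orders (id integer primary key, user_id integer, status text);
    create table order_items (id integer primary key, order_id integer, unit_price real);
    -- Hypothetical data: one user with two orders, one item each.
    insert into orders values (10, 1, 'p'), (11, 1, 'p');
    insert into order_items values (100, 10, 5.0), (101, 11, 7.0);
""")

# Self-join on user_id: 2 orders x 2 orders = 4 pairs, so each item is
# summed twice.
inflated = con.execute("""
    select sum(oi.unit_price)
    from orders o
    join orders o1 on o.user_id = o1.user_id
    join order_items oi on oi.order_id = o1.id
    where o.status not in ('o', 'c')
      and o1.status not in ('o', 'c')
""").fetchone()[0]

# Self-join on the primary key: exactly one pair per order.
correct = con.execute("""
    select sum(oi.unit_price)
    from orders o
    join orders o1 on o.id = o1.id
    join order_items oi on oi.order_id = o.id
    where o.status not in ('o', 'c')
""").fetchone()[0]

print(inflated)  # 24.0
print(correct)   # 12.0
```

The multiplication factor grows with the number of orders per user, which is why the error is easy to miss on users who have only one order.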

  • 4. Joining tables that sit at different levels duplicates rows, so anything aggregated after the join is multiplied. Here every order item is repeated once per referral of the user, and every referral once per order item.

    Incorrect (order items and referrals multiply each other):

        select sum(oi.unit_price) as total_revenue,
               count(r.id) as total_referrals
        from orders o
        join referrals r on o.user_id = r.referring_user_id
        join order_items oi on oi.order_id = o.id
        where o.status not in ('o', 'c');

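    Correct (bring both sides to the user level before joining; aggregating only the referrals would still repeat each user's referral count once per order item):

        select sum(rev.revenue) as total_revenue,
               sum(r.referrals) as total_referrals
        from (select o.user_id,
                     sum(oi.unit_price) as revenue
              from orders o
              join order_items oi on oi.order_id = o.id
              where o.status not in ('o', 'c')
              group by o.user_id) rev
        join (select referring_user_id,
                     count(distinct id) as referrals
              from referrals
              group by referring_user_id) r
          on rev.user_id = r.referring_user_id;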
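The multiplication can be seen with one user who has one order, two items and three referrals. The sketch below uses sqlite3 with invented rows (the status 'p' is hypothetical) and compares the direct join with one possible fix, in which each side is aggregated to the user level before joining.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table orders (id integer primary key, user_id integer, status text);
    create table order_items (id integer primary key, order_id integer, unit_price real);
    create table referrals (id integer primary key, referring_user_id integer);
    -- Hypothetical data: user 1 has one order with two items and three referrals.
    insert into orders values (10, 1, 'p');
    insert into order_items values (100, 10, 5.0), (101, 10, 7.0);
    insert into referrals values (1, 1), (2, 1), (3, 1);
""")

# Direct join: 2 items x 3 referrals = 6 rows, so both totals are inflated.
multiplied = con.execute("""
    select sum(oi.unit_price), count(r.id)
    from orders o
    join referrals r on o.user_id = r.referring_user_id
    join order_items oi on oi.order_id = o.id
    where o.status not in ('o', 'c')
""").fetchone()

# Each side reduced to one row per user before the join.
per_user = con.execute("""
    select sum(rev.revenue), sum(r.referrals)
    from (select o.user_id, sum(oi.unit_price) as revenue
          from orders o
          join order_items oi on oi.order_id = o.id
          where o.status not in ('o', 'c')
          group by o.user_id) rev
    join (select referring_user_id, count(distinct id) as referrals
          from referrals
          group by referring_user_id) r
      on rev.user_id = r.referring_user_id
""").fetchone()

print(multiplied)  # (36.0, 6)
print(per_user)    # (12.0, 3)
```

Because both queries use inner joins, users without referrals contribute nothing to either total; whether that is intended depends on the question being asked.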