forecasting for business - blog

Post on 06-Apr-2018

  • 8/3/2019 Forecasting for Business - Blog

    ecasting for Business - Blog - http://blog.lokad.com/journal/category/forecastin

    r 12 22/12/2011 14:29

    Entries in forecasting (45)

    Seasonality illustrated

    Monday, September 19, 2011 at 12:00PM

    Seasonality is one of the strongest statistical patterns that can

    be leveraged to refine forecasts. Below are 4 time-series

    aggregated at the weekly level (159 weeks). Historical data are

    in red and forecasts are in purple. Vertical gray markers

    indicate January 1st.

    When illustrating seasonality, everyone (Lokad included) tends

    to use long time-series, much like the first three series here

    above. Indeed, it's more visual and more appealing.

    However, long time-series do not represent

    your usual situation. On average consumer goods have a

    lifespan of no more than 3 or 4 years. Thus, long time-series are

    typically a small minority in your dataset. Worse, those long

    time-series might be outliers that do not reflect the behavior of

    other shorter-lived products.

    Here above, the short 4th time-series is a much more

    representative case with less than 1 year of data. In such a

    situation, however, it's much less clear how seasonality can be

    leveraged. The Lokad trick to do that consists of using multiple

    time-series analysis.

    Learn more on our seasonality definition article.

    Joannes Vermorel

    tagged forecasting, insights in forecasting, insights

    Video: How the Forecasting Engine works?

    Tuesday, September 13, 2011 at 09:00AM

    Questions about the under-the-hood details of Lokad are frequent.

    We have recently added a big FAQ to our Forecasting

    Technology section. Today, we are releasing a new video that

    gives the big picture of how our forecasting engine works.

    Again, special thanks to Ray Grover for the voice over.

    Joannes Vermorel

    tagged video in forecasting, insights, video

    Weekly/Monthly aggregation is a lossy process

    Thursday, April 14, 2011 at 12:19PM

    When practitioners have a first look at a forecast report

    produced by Lokad, they tend to stumble upon various

    oddities. For example, some forecasts may look way too low.

    Without any observable trend or seasonality,

    Lokad anticipates something rather unexpected. Sometimes

    it's a by-product of rather advanced correlation analytics, but

    sometimes it's something both simpler and deeper.

    The graph on the left represents a typical situation: steady

    sales for a couple of months, and then, a somewhat

    inexplicable drop in the forecasts.

    Common sense is yelling: this can't be right, let's fix this broken

    forecast. And yet forecasting and common sense do not mix well.

    The way we observe sales is deeply misleading. Indeed, we

    are observing here monthly aggregated sales, not the sales

    themselves. Many businesses favor monthly forecasts because

    they feel their sales are too low or too erratic at the daily or

    weekly level to be of any practical use. Hence, they aggregate

    sales data over long(er) periods of time. By doing so, sales

    appear smoother and, consequently, more predictable.

    This visualization of sales, i.e. thinking in totals rather than an

    endless stream of transactions, is so ubiquitous that many

    businesses fail to realize that aggregating sales primarily

    means losing information that is potentially valuable to

    perform the forecasts.

    Let's illustrate the point with a fresh look at the same sales

    history, although through weekly aggregation.

    The picture is extremely different. We realize that the

    seemingly steady monthly averages just resulted from two

    super-heavy weeks: one in between January and February

    and a second in March.

    Such spikes routinely appear in businesses because of

    promotions and various other kinds of exceptional events.

    With the second illustration, low forecasts are making a lot

    more sense: sales include infrequent spikes that should not be

    accounted for, and, when we mentally discard those spikes,

    we obtain forecasts that just follow the usual averaging

    pattern.

    A traditional forecasting system would typically be fooled by

    such a situation, and would anticipate a much higher monthly

    forecast, which would turn out to be much less accurate.

    But Lokad is definitely not your traditional forecasting

    system. When monthly or weekly forecasts are requested, we

    keep looking at the most fine-grained data available. This lets

    us identify patterns that would otherwise be lost through the

    sales aggregation process.
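
    The effect can be reproduced with a few lines of Python. This is only a
    minimal sketch under stated assumptions: the weekly figures are made up,
    every month is taken to be exactly 4 weeks, and the median is used as a
    stand-in for "mentally discarding the spikes". It is not a description of
    how Lokad computes its forecasts.

```python
from statistics import mean, median

WEEKS_PER_MONTH = 4  # simplifying assumption: every month is exactly 4 weeks

# Hypothetical weekly sales over 3 months: a baseline of 5 units per week,
# plus three promotion-like spikes of 45 units.
weekly_sales = [5, 5, 5, 45, 5, 45, 5, 5, 5, 5, 45, 5]


def aggregate(series, bucket_size):
    """Sum consecutive buckets of `bucket_size` points (a lossy operation)."""
    return [sum(series[i:i + bucket_size])
            for i in range(0, len(series), bucket_size)]


monthly_sales = aggregate(weekly_sales, WEEKS_PER_MONTH)

# Forecast built from the monthly totals only: the spikes are baked in.
forecast_from_months = mean(monthly_sales)

# Forecast built from the weekly data: the median ignores the rare spikes.
forecast_from_weeks = median(weekly_sales) * WEEKS_PER_MONTH

print("monthly totals:      ", monthly_sales)
print("forecast from months:", forecast_from_months)
print("forecast from weeks: ", forecast_from_weeks)
```

    The monthly totals come out perfectly flat, so nothing in them hints that
    most of the volume comes from a handful of exceptional weeks. Only the
    weekly series carries that information.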

    Joannes Vermorel

    tagged forecasting, insights in forecasting, insights, time series

    Business is UP but forecasts are DOWN

    Friday, April 1, 2011 at 11:11AM

    Statistical demand forecasting is a counter-intuitive science.

    This point was pressed a couple of times before, but let's have

    a look at another misleading situation.

    If every single product segment of my business is

    growing fast, then at least some products should

    have an upward sales trend as well. Right?

    Otherwise, we would not be growing at all.

    This statement looks like just plain common sense; and yet it's

    wrong, very wrong. We live in a fast-paced economy. Having an

    identical product being sold for more than 3 years is the exception

    rather than the norm in most consumer goods businesses. As a

    result, product life-cycles tend to dwarf the organic growth of retailers.

    This situation is illustrated by the schema below.

    This is a set of product sales plotted on the same graphic. Each

    curve is associated with a particular product; and products are

    launched over time. Each product comes with its own lifecycle

    pattern. The lifecycle patterns here illustrate a typical novelty

    effect: sales quickly ramp-up after product launch, and then

    the product enters its downward phase, which ends when the

    product is finally phased out of the market.

    Yet, how does an upward trend - from the retailer itself -

    impact this picture? Let's have another look at the illustration

    below.

    Sales are higher with a positively trended retailer, yet this

    growth is nowhere near strong enough to compensate for the product

    lifecycle effect. The sales of the product are still decreasing -

    albeit at a slower rate.

    This situation outlines how we can have a fast-growing retail

    business with only negatively trended product sales. The main

    trick lies in the fact that new products keep being launched.
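
    A small simulation makes the point concrete. It is a sketch with invented
    parameters (2% monthly growth for the retailer, 20% monthly decay per
    product, a 12-month lifespan, one launch per month), and it skips the
    short ramp-up phase described above to keep the model minimal.

```python
GROWTH = 1.02   # assumed month-over-month growth of the retailer as a whole
DECAY = 0.80    # assumed month-over-month decay of a product after launch
LIFESPAN = 12   # assumed number of months before a product is phased out
HORIZON = 36    # months simulated, with one product launched every month


def product_sales(launch_month, month):
    """Sales of one product: retailer growth multiplied by lifecycle decay."""
    age = month - launch_month
    if age < 0 or age >= LIFESPAN:
        return 0.0
    return 100.0 * GROWTH ** month * DECAY ** age


def product_history(launch_month):
    """Monthly sales of a product over its whole life."""
    return [product_sales(launch_month, launch_month + age)
            for age in range(LIFESPAN)]


def is_strictly_decreasing(values):
    return all(later < earlier for earlier, later in zip(values, values[1:]))


def is_strictly_increasing(values):
    return all(later > earlier for earlier, later in zip(values, values[1:]))


total_sales = [sum(product_sales(launch, month) for launch in range(HORIZON))
               for month in range(HORIZON)]

every_product_declines = all(is_strictly_decreasing(product_history(launch))
                             for launch in range(HORIZON))
business_grows = is_strictly_increasing(total_sales)

print("every product declines:", every_product_declines)
print("business total grows:  ", business_grows)
```

    Each product loses about 18% per month (1.02 x 0.80 = 0.816), yet the
    total keeps rising because a fresh product enters every month at a
    slightly higher starting level.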

    Alas, this situation generates a lot of confusion. Indeed, when

    sales forecasts severely mismatch overall expectations, it

    becomes very tempting to fix the forecasts.

    Since most forecasting tools are poorly suited to deal with too

    varying or too intermittent demand anyway, it is tempting to

    aggregate sales per family, per category to produce an

    aggregated forecast; and then to de-aggregate forecasts at the

    SKU level using ratios. This approach is named top-down

    forecasting, and it is heavily used in many industries (textile among

    others).

    Top-down forecasts produce results that look much closer to

    intuitive expectations: a growth is observed in the sales

    forecasts, and it matches growth observed on the various

    business segments.

    Yet, by producing the forecast at the TOP level, the forecasting

    model is capturing a fictitious upward trend that only results

    from the contribution of regular product launches. If this

    fictitious trend ends up applied to a lower level - aka SKUs or

    products - then we significantly over-forecast the sales for

    each individual product.

    Near worst case: massive overstock is generated for products

    precisely at the time they are phased out of the market.
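
    The over-forecast can be shown with a toy product family. The figures
    below are hypothetical, and both "models" are deliberately naive
    (last month times the observed month-over-month ratio); the sketch only
    contrasts where the trend is measured, at the family level or per SKU.

```python
# Hypothetical unit sales of one product family over the last two months.
# SKUs A and B are each losing 20% per month; SKU C was launched last month.
previous_month = {"A": 100.0, "B": 80.0}
last_month = {"A": 80.0, "B": 64.0, "C": 50.0}

family_previous = sum(previous_month.values())
family_last = sum(last_month.values())

# Top-down: extrapolate the family trend, then split it using SKU shares.
family_forecast = family_last * (family_last / family_previous)
top_down = {sku: family_forecast * qty / family_last
            for sku, qty in last_month.items()}

# Per-SKU view: extrapolate each product's own trend where history exists.
per_sku = {sku: last_month[sku] * (last_month[sku] / previous_month[sku])
           for sku in previous_month}

for sku in per_sku:
    print(sku, "top-down:", round(top_down[sku], 1),
          "per-SKU:", round(per_sku[sku], 1))
```

    The family total grows only because SKU C was launched, yet the top-down
    split hands that growth to A and B, which are both on their way down.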

    From a forecasting perspective, a good forecasting system

    should be able to capture lifecycle effects. It means that sales

    forecasts may significantly differ from the overall business

    forecast. Business can go UP while every single product is

    going DOWN. In such a situation, trying to fix forecasts is

    most likely going to make them worse.

    Addendum: Despite the date of this post (April 1st, 2011), this

    post is not a joke.

    Joannes Vermorel

    tagged forecasting, insights, lifecycle, retail, trend in

    forecasting, insights

    New Forecasting Technology FAQ

    Wednesday, March 9, 2011 at 11:28AM

    Lately, we realized that the page detailing our forecasting

    technology was somewhat vague concerning under-the-hood

    aspects such as seasonality, trend, product life-cycle,

    promotions, ... Hence we have just posted a new extensive

    Forecasting Technology FAQ.

    Questions and Answers

    Nuts and bolts

    How accurate are your forecasts?

    Forecasting competitions, do you have any academic

    validation of your technology?

    Do you evaluate the accuracy of your forecasts?

    General patterns

    Macro trends (ex: financial crisis), how are they

    handled?

    Seasonality, trend, how is it handled?

    Promotions, how are they handled?

    Product Life Cycles and product launches, how are

    they handled?

    Intermittent / low volume products, how are they

    handled?

    Cannibalization, how is it handled?

    Weather, how is it handled?

    Demand artifacts

    Lost sales caused by stock-outs, how are they handled?

    Exceptional sales, how are they handled?

    Aggregation, top-down or bottom-up?

    Obviously, we are barely scratching the surface here. Don't

    hesitate to post your own questions, we will do our best to

    address them as well.

    Joannes Vermorel

    tagged documentation, forecasting, insights in docs,

    forecasting, insights

    Fallacies in data cleaning for (short-term) sales forecasts

    Friday, November 19, 2010 at 11:43AM

    When it comes to data analysis, experts frequently emphasize

    (and rightly so) the importance of having a clean dataset before

    starting any analysis. Otherwise, you end up with Garbage In,

    Garbage Out.

    As a result, most forecasting toolkits provide extensive

    features to support data cleaning / data preparation; and yet,

    Lokad does not provide any explicit feature supporting data

    cleaning.

    Have we missed something BIG here?

    We don't believe so. There are some misunderstandings when

    it comes to data cleaning for the purpose of (short-term) sales

    forecasting. Indeed, nowadays, sales of most retailers,

    wholesalers, manufacturers are stored in either an ERP or

    some accounting system. In our experience, as of 2010,

    transactional data associated with sales are remarkably clean.

    If there is a transaction recorded November 1st, 2010 indicating

    that the product X has been sold in Y quantity, then, the

    probability for this information to be true is very high, with a

    confidence above 99.9% for most sales processes.

    Indeed, companies cannot afford not to know what they are

    selling. As a result, massive efforts have been invested in the

    last two decades to make really sure that sales data are

    reliable to some extent. We are not saying that no erroneous

    sales entry ever enters the system; we are only saying that the

    proportion is typically insignificant.

    If sales data are clean, why are we still pushing efforts on

    data cleaning?

    We have been observing a lot of data cleaning practices in the

    industry, and it turns out that the operations referred to as

    cleaning tend to be much more than actually looking for the

    0.1% erroneous transactions. The illustration here above gives

    some insights about the actual operations involved in a typical

    data cleaning phase: it's all about smoothing the extremes. For

    example, partial sales during shortages are manually increased,

    and promotional/exceptional sales are capped.

    Needless to say, we are not believers in this approach. Real

    sales data should not be replaced by fictitious sales data.

    Indeed, nothing can tell with 100% confidence how many

    products would have been sold if there had not been any

    shortage. The partial sales are the only tangible data that we

    have that do not already rely on statistical extrapolation.

    Yet, there is one interesting side-effect of the smooth-

    the-extreme practice: smoothing improves the accuracy of

    the naive forecasting methods that behave much like the

    moving average.

    It is tempting, if the only tool you have is a

    hammer, to treat everything as if it were a nail.

    Abraham Maslow, 1966

    Trying to adjust the sales data to better fit the only

    forecasting model on hand is just a bad case of the Law of the

    instrument. Our approach consists of tackling directly the

    complex patterns instead of trying to circumvent them.

    Joannes Vermorel

    tagged cleaning, data, forecasting, insight, sales in accuracy,

    forecasting, insights

    Width vs. Depth, Rotate your sales forecasts by 90 degrees

    Tuesday, August 31, 2010 at 06:05PM

    We have already discussed why Lokad did not care much about

    forecasting Chinese food rather than Sport Bar beverages.

    Another way of thinking about our technology consists of rotating

    your sales forecasts by 90 degrees.

    We are observing that a consumer product has, on average, a 3-year

    lifecycle. This means that on average the amount of data

    available for every single product is about 18 months. When we

    look at the sales history with a monthly aggregation, 18 months

    of data means 18 points.

    With 18 data points, no matter how smart or advanced your

    forecasting theory is, you can't do much simply because we face

    an utter lack of data to perform any robust statistical analysis.

    With 18 points, even a pattern as obvious as seasonality

    becomes a challenge to observe because we don't even have 2

    complete seasonal observations.

    Your mileage may vary from one industry to the next, but

    unless your products stay in the market for decades, you are

    most likely to face this issue.

    As a direct consequence, classical forecasting toolkits require

    statisticians to tweak forecasting models for every single

    product because no non-trivial statistical model can be robustly

    fit with only 18 points as input data.

    Yet, Lokad does not require any statistician, and the magic

    lies in the 90 degrees rotation: our models do not iterate over

    the data one time-series at a time, but over all time-series at

    once. Thus, we have a lot more input data available, and

    consequently we can succeed with rather advanced models.

    This approach is just common sense: if you want to forecast

    the seasonality of your new chocolate bar, the seasonality of

    the other chocolate bars seems like a good candidate. Why

    should you treat each chocolate bar in strict isolation from the

    others?
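
    The chocolate bar idea can be sketched as follows. This is an
    illustration under assumptions, not Lokad's actual algorithm: the sales
    figures are invented, the helper names `seasonal_profile` and
    `forecast_year` are hypothetical, and the new product is assumed to
    share exactly the seasonality of its older siblings.

```python
MONTHS = 12


def seasonal_profile(histories):
    """Average month-of-year index over several series.

    Each series is a list of monthly sales starting in January and covering
    whole years. An index of 1.0 means "an average month".
    """
    profile = [0.0] * MONTHS
    for series in histories:
        level = sum(series) / len(series)
        for month in range(MONTHS):
            same_month = series[month::MONTHS]
            profile[month] += (sum(same_month) / len(same_month)) / level
    return [index / len(histories) for index in profile]


def forecast_year(short_history, profile):
    """Forecast 12 months for a product whose history starts in January."""
    # De-seasonalize the few known points to estimate the product's level.
    levels = [qty / profile[month] for month, qty in enumerate(short_history)]
    level = sum(levels) / len(levels)
    return [level * index for index in profile]


# Hypothetical data: two established chocolate bars with 2 years of history
# sharing a December peak, and a new bar with only 3 months of sales.
pattern = [1.0, 0.8, 0.9, 1.1, 0.7, 0.6, 0.6, 0.7, 0.9, 1.1, 1.4, 2.2]
bar_a = [100 * p for p in pattern] * 2
bar_b = [40 * p for p in pattern] * 2
new_bar = [20.0, 16.0, 18.0]

profile = seasonal_profile([bar_a, bar_b])
forecast = forecast_year(new_bar, profile)
print("December forecast for the new bar:", round(forecast[11], 1))
```

    With only three points, the new bar on its own says nothing about
    December. Borrowing the profile of the older bars is what makes a
    December peak forecastable at all.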

    Yet, from a computational perspective, the problem has just

    become a lot harder: if you have 10,000 SKUs the number of

    associations between two SKUs is roughly 100 million (and

    10,000 SKUs is nowhere near a large number). That's precisely where

    the cloud kicks in: even if your algorithms are well-designed

    not to suffer a strict quadratic complexity, you're still going to

    need a lot of processing power. The cloud just happens to make

    this processing power available on demand at a very low price.

    Without the cloud, it is simply not possible to deliver this kind

    of technology.
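
    The order of magnitude quoted above is easy to check. The sketch below
    counts both ordered pairs (n x n, the "roughly 100 million" figure) and
    unordered pairs of distinct SKUs; which of the two matters depends on
    the algorithm, which the post does not specify.

```python
def pair_counts(n_skus):
    """Return (ordered pairs, unordered pairs of distinct SKUs)."""
    ordered = n_skus * n_skus
    unordered = n_skus * (n_skus - 1) // 2
    return ordered, unordered


ordered, unordered = pair_counts(10_000)
print(f"{ordered:,} ordered pairs, {unordered:,} unordered pairs")
```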

    Joannes Vermorel

    tagged cloud computing, depth, forecasting, insights,

    statistics, technology, width in forecasting, insights

    Forecast's species: classification vs. regression

    Tuesday, April 6, 2010 at 12:21PM

    The word forecasting covers a very large spectrum of

    processes, technologies and even markets. In the past, we

    introduced the worlds of forecasting software, distinguishing

    between:

    Deterministic simulation software

    Expert aggregation software

    Statistical forecasting software

    Lokad falls in the last category as our technology is purely

    statistical. Yet, Lokad is far from covering the entire statistical

    spectrum on its own. Two broad categories of forecasts exist in

    statistical forecasting (*):

    Classification forecasts

    Regression forecasts

    (*) We are oversimplifying here for the sake of clarity, as

    statistical learning subtleties are well beyond the scope of

    this modest blog post.

    Classification attempts to separate (or classify) objects

    according to their properties. The illustration below from

    Tomasz Malisiewicz illustrates a classification task trying to

    separate images picturing a chair from images picturing a

    table.

    Illustration from tombone's blog

    The output of a classification is binary (or rather discrete):

    objects get assigned to classes with more or less confidence,

    i.e. higher or lower probabilities.

    On the other hand, regressions typically output curves. The

    illustration below is considering a time-series representing

    historical sales, and displays the corresponding forecast.

    The regression forecast is a curve rather than a binary setting (or a

    combination of binary settings). Inputs get prolonged into the

    future.

    How does this distinction impact the business?

    Well, it turns out that Lokad - as it stands in early 2010 - only

    delivers regression forecasts. Thus, there are many interesting

    problems that cannot be tackled by Lokad because these are

    classification problems:

    Customer segmentation: for each customer, we would like

    to evaluate the probability of achieving successful up-sale

    through a direct marketing action. Following the same

    idea, we could try to predict the churn as well.

    Fraud detection: for each transaction, we would like to

    evaluate - based on the transaction pattern - the

    probability for the operation to be a fraud attempt.

    Deal prioritization: based on the properties of the

    prospect (availability of budget, industry, contact rank in

    the company, expressed level of interest, ...), we would

    like to evaluate the likelihood to get a profitable deal out

    of each prospect to prioritize the sales team efforts.

    Frequently, we are asked whether Lokad could deliver

    classification forecasts as well. Unfortunately, the answer will

    be negative for the time being. Although rooted in the same

    mathematical theory, classification and regression entail very

    different technologies; and Lokad is pushing all its efforts

    toward regression problems.

    That said, we are not dismissive of classification

    problems; they truly deserve attention and effort. For 2010,

    we are sticking to our roadmap, but further ahead,

    classification could be a natural extension of our forecasting

    services.

    Joannes Vermorel

    tagged classification, forecasting, insights, regression,

    software in business, forecasting, insights, market

    Measuring forecast accuracy

    Tuesday, February 23, 2010 at 09:32AM

    Most engineers will tell you that:

    You can't optimize what you

    don't measure

    Turns out that forecasting is no

    exception. Measuring forecast

    accuracy is one of the few

    cornerstones of any forecasting

    technology.

    A frequent misconception about accuracy measurement is that

    Lokad has to wait for the forecasted period to pass, to finally

    compare the forecasts with what really happened.

    Although this approach works to some extent, it comes with

    severe drawbacks:

    It's painfully slow: a 6 months ahead forecast takes 6

    months to be validated.

    It's very sensitive to overfitting. Overfitting should not

    be taken lightly, and it's one of the few things that are very

    likely to wreak havoc in your accuracy measurements.

    Measuring the accuracy of delivered forecasts is a tough piece

    of work for us. Accuracy measurement accounts for roughly

    half of the complexity of our forecasting technology: the more

    advanced the forecasting technology, the greater the need for

    robust accuracy measurements.

    In particular, Lokad returns the forecast accuracy associated with

    every single forecast that we deliver (for example, our

    Excel add-in reports forecast accuracy). The metric used for

    accuracy measurement is the MAPE (Mean Absolute Percentage

    Error).
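
    For reference, the MAPE is the average of |actual - forecast| / |actual|
    over all periods, expressed as a percentage. A minimal sketch follows;
    the function name and the choice to skip periods with zero actual sales
    (where the ratio is undefined) are assumptions of this example, not a
    description of Lokad's implementation.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, in percent.

    Periods with zero actual sales are skipped, because the percentage error
    is undefined there (an assumption of this sketch).
    """
    ratios = [abs(actual - forecast) / abs(actual)
              for actual, forecast in zip(actuals, forecasts)
              if actual != 0]
    if not ratios:
        raise ValueError("MAPE is undefined when all actuals are zero")
    return 100.0 * sum(ratios) / len(ratios)


# Two periods, each forecast off by 10% of the actual value.
print(mape([100, 200], [110, 180]))
```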

    In order to compute an estimated accuracy, Lokad proceeds

    (roughly) through cross-validation tuned for time-series

    forecasts. Cross-validation is simpler than it sounds. If we

    consider a weekly forecast 10 weeks ahead with 3 years (roughly

    150 weeks) of history, then the cross-validation looks like:

    1. Take the first week, forecast 10 weeks ahead, and

       compare results to the original.

    2. Take the first 2 weeks, forecast 10 weeks ahead, and

       compare.

    3. Take the first 3 weeks, forecast 10 weeks ahead, and

       compare.

    4. ...

    The process is rather tedious, as we end up recomputing

    forecasts about 150 times for only 3 years of history. Obviously,

    cross-validation screams for automation, and there is little

    hope to go through such a process without computer support.

    Yet, computers typically cost less than business forecast errors,

    and Lokad relies on cloud computing to deliver such

    high-intensive computations.
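
    The loop described above can be sketched in a few lines. Everything
    here is illustrative: the history is synthetic, `mean_forecaster` is a
    placeholder model, and only cutoffs that leave a full 10 weeks of known
    future are scored, which gives 140 runs rather than 150 for a 150-week
    history.

```python
from statistics import mean


def mape(actuals, forecasts):
    """Mean Absolute Percentage Error in percent (assumes non-zero actuals)."""
    return 100.0 * mean(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts))


def mean_forecaster(history, horizon):
    """Placeholder model: repeat the historical mean over the horizon."""
    return [mean(history)] * horizon


def rolling_origin_errors(history, horizon, forecaster):
    """Re-forecast from every cutoff and score against the known future."""
    errors = []
    # Only cutoffs that leave `horizon` weeks of actuals are scored.
    for cutoff in range(1, len(history) - horizon + 1):
        past = history[:cutoff]
        future = history[cutoff:cutoff + horizon]
        errors.append(mape(future, forecaster(past, horizon)))
    return errors


# Hypothetical history: 150 weeks with a mild 4-week cycle.
history = [100 + 5 * (week % 4) for week in range(150)]
errors = rolling_origin_errors(history, horizon=10, forecaster=mean_forecaster)
print(len(errors), "forecast runs, average MAPE:", round(mean(errors), 2))
```

    The key property is that each forecast only ever sees data before its
    cutoff, so the error estimate is not contaminated by the future it is
    scored against.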

    Attempts to "simplify" the process outlined are very likely to

    end up with overfitting problems. We suggest staying very

    careful, as overfitting isn't a problem to be taken lightly. When in

    doubt, stick to a complete cross-validation.

    Joannes Vermorel

    tagged accuracy, forecasting, measure in accuracy, forecasting,

    insights

    Internet is needed for your forecasts

    Saturday, November 14, 2009 at 07:28PM

    Do I really need an Internet

    connection to get your

    forecasts? is a question

    frequently asked by prospects

    having a look at our

    forecasting technology.

    Well, the answer is YES. With

    Lokad, there is no

    work-around. Our forecasting

    engine does not come as an

    on-premises solution.

    But why should we need an internet connection for an

    algorithmic processing such as forecasting?

    The answer to this question is one of the core reasons that

    led to the very existence of Lokad in the first place.

    When we started working on the Lokad project - back in 2006 -

    we quickly realized that forecasting, despite appearances, was

    a total misfit for local processing.

    1. You can't get your forecasts right without having the data

    at hand. Researchers have been looking for decades for a

    universal forecasting model, but the consensus among the

    community is that there is no free lunch; universal models do

    not exist, or rather, they tend to perform poorly. This is the

    primary reason why forecasting toolkits feature so many

    models (don't click this link, it's a 3000-page manual for a

    popular toolkit). With Lokad, the process is much simpler

    because the data is made available to Lokad. Hence, it does

    not matter any more if thousands of parameters are needed, as

    parameters are handled by Lokad directly.

    2. Advanced forecasting is quite resource intensive but the

    need to forecast is only intermittent. Even a small retailer with

    10 points of sale and 10k product references already represents

    100k time-series to be forecasted. If we consider a typical

    performance of 10k series per hour for a single CPU (which is

    already quite optimistic for complex models), then computing

    sales forecasts for the 10 points of sale takes a total of 10h of CPU

    time. Obviously, retailers prefer not to wait for 10h to get their

    forecasts. Buying an amazingly powerful workstation is

    possible, but then does it make sense to have so much

    processing power staying idle 99% of the time when forecasts

    are made only once a week? Outsourcing the processing power

    is the obvious cost-effective approach here.
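
    The arithmetic behind this point is short enough to write down. The
    inputs are the figures quoted above; the assumption that the work
    splits evenly across CPUs, and the choice of 20 CPUs, are illustrative
    only.

```python
points_of_sale = 10
product_references = 10_000
series_per_cpu_hour = 10_000   # throughput quoted above for a single CPU

time_series = points_of_sale * product_references
cpu_hours = time_series / series_per_cpu_hour


def wall_clock_hours(cpu_count):
    """Elapsed time if the work splits evenly across CPUs (an assumption)."""
    return cpu_hours / cpu_count


print(time_series, "series,", cpu_hours, "CPU hours")
print("with 20 CPUs:", wall_clock_hours(20), "hours")
```

    The total CPU time stays the same however it is spread. What rented
    capacity changes is how long the retailer waits, without owning
    hardware that sits idle between forecast runs.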

    3. Forecasting is still under fast paced evolution. Since our

    launch about 3 years ago, Lokad has been upgraded every

    month or so. Our forecasting technology is not some

    indisputable achievement carved in stone, but on the contrary,

    is still undergoing a rapid evolution. Every month, the

    statistical learning research community moves forward with

    loads of fresh ideas. In such a context, on-premise solutions

    undergo a rapid decay until the day the discrepancy between

    the performance of current version and the performance of the

    deployed version is so great that the company has no choice but

    to rush an upgrade. Aggressively developed SaaS ensures that

    customers benefit from the latest improvements without having

    to even worry about it.

    In our opinion, going for an on-premise solution for your

    forecasts is like entering a golf competition with a large

    handicap. It might make the game more interesting, but it does

    not maximize your chances. Don't expect your competitors to be

    fair enough to start with the same handicap just because you

    do.

    Joannes Vermorel

    tagged business, forecasting, insight, technology in business,

    forecasting, insights
