Chris's profileChris Webb's BI BlogBlogLists Tools Help

Blog


    5/10/2007

    Calculating Seasonally-Adjusted Values in MDX

    My chalk talk yesterday at the BI Conference went.... ok. Unfortunately, despite having asked for projector so I could show my demo code it turned out the screen I got was pretty small so only the people in the front few rows could see anything. So I told everyone that I would put the code up on my blog so they could study the code in more detail and this is the first such post.

    One of the points I wanted to make when discussing how to calculate trend values for KPIs in MDX is that we really need to think about the formulas we're implementing to make sure they actually provide useful results. Unfortunately the business people we talk to often have even less idea than we do about what kind of calculations actually make sense: there's a real need for input from someone with a statistical background, and I only wish I knew more about this subject. For example if your sales vary a lot by season (you might be selling ice cream, so sales are always going to be much higher in summer than in winter) there's no point looking at previous period growth to determine whether your sales are growing or not because the strong seasonal component will mask the long term trend.

    As I said, for this presentation I looked at a book I've had hanging around on my shelf for years which is basically a stats textbook for managers and MBA students: "Quantitative Methods for Decision Makers" by Mik Wisniewski. It explains some simple algorithms and techniques for analysing data that are very relevant for BI practitioners and I took the method for calculating a seasonally-adjusted total and translated it into an MDX query on Adventure Works as an illustration. Here it is:

    with
    --Calculate a 12-month moving average from 6 months up to and including
    --the current member to 6 months after
    member measures.movingaverage as
    avg(
    [Date].[Calendar].currentmember.lag(5)
    :
    [Date].[Calendar].currentmember.lead(6)
    , [Measures].[Internet Sales Amount])

    --Centre the underlying trend, by taking the average of the moving average
    --for the current member and the next member
    member measures.trend as
    avg(
    {[Date].[Calendar].currentmember,
    [Date].[Calendar].currentmember.nextmember}
    , measures.movingaverage)

    --Find the deviation by dividing Internet Sales by Trend
    member measures.deviation as
    [Measures].[Internet Sales Amount] / measures.trend

    --Find the average deviation for any given month
    member measures.averagemonthlydeviation as
    avg(
    exists(
    [Date].[Calendar].[Month].members
    , {[Date].[Month of Year].currentmember})
    , measures.deviation)

    --Find the sum of these average deviations for each month
    --then divide by 12 to find out by how much each month's average deviation
    --needs to be adjusted by
    member measures.adjustedcomponent as
    (sum(head([Date].[Calendar].[Month].members,12), measures.averagemonthlydeviation)
    / 12)

    --Adjust the monthly deviation by the value we've just found
    member measures.adjusteddeviation as
    measures.averagemonthlydeviation / measures.adjustedcomponent

    --Finally find the seasonally adjusted Internet Sales Amount
    member measures.seasonallyadjustedsales as
    [Measures].[Internet Sales Amount] / measures.adjusteddeviation

    --Run the query
    select {[Measures].[Internet Sales Amount], measures.trend, measures.deviation
    , measures.averagemonthlydeviation,measures.adjustedcomponent
    , measures.adjusteddeviation, measures.seasonallyadjustedsales}
    on 0,
    [Date].[Calendar].[Month].members
    *
    [Date].[Month of Year].[Month of Year].members
    on 1
    from
    [Adventure Works]

    This works on the following theory. What we want to do is take the Internet Sales Amount measure and remove the seasonal variations so we can view the underlying trend. If we assume that there are three factors in play with our sales, the long term trend, seasonal variations and random factors (for example a health scare in July could kill our ice cream sales that month) then we can say that Sales = Trend * Seasonal Variations * Random Factors. The random factors we can't do anything about, but taking this formula we can say that Trend  = Sales / (Seasonal Variations * Random Factors). To get here, first we estimate the trend by doing a twelve month moving average which hopefully will smooth out those seasonal variations; we then find the deviation from the average for each month and then find the average deviation for each month (eg I can say then that January sales are on average 10% lower than usual, Julys are on average 25% higher than usual). For a whole year we'd like these deviations to cancel each other out and add up to 12; of course they don't so we now need to find what they do add up to then divide by 12, then divide each month's average by this value. We can then take this value and divide Internet Sales Amount by it to find the seasonally-adjusted value, the combination of the trend and the random factors.

    Hopefully I haven't screwed up in my interpretation of how it works; it certainly isn't the most robust or optimal bit of MDX I've written either but it shows what you could do if you wanted.

    Comments (3)

    Please wait...
    Sorry, the comment you entered is too long. Please shorten it.
    You didn't enter anything. Please try again.
    Sorry, we can't add your comment right now. Please try again later.
    To add a comment, you need permission from your parent. Ask for permission
    Your parent has turned off comments.
    Sorry, we can't delete your comment right now. Please try again later.
    You've exceeded the maximum number of comments that can be left in one day. Please try again in 24 hours.
    Your account has had the ability to leave comments disabled because our systems indicate that you may be spamming other users. If you believe that your account has been disabled in error please contact Windows Live support.
    Complete the security check below to finish leaving your comment.
    The characters you type in the security check must match the characters in the picture or audio.

    To add a comment, sign in with your Windows Live ID (if you use Hotmail, Messenger, or Xbox LIVE, you have a Windows Live ID). Sign in


    Don't have a Windows Live ID? Sign up

    Chris Webbwrote:
    Hi Mark,
     
    I'm not going to be at TechEd unfortunately, but I'd be very interested in hearing more about what you do. I agree that MDX is not the place to do these calculations if you can help it.
    May 12
    Picture of Anonymous
    Mark Deason wrote:
    It's definitely a different tangent than I've been working on sales normalization.  I tend not to work in dollars because price changes, discounts, levels, location differentials all conspire to obscure the underlying unit flying the door.  If you're at TechEd 2007, I would love to chat over a drink or at the TLC on my recent work using a Diff Eq model with exponential decay - Read: I do a ton of heavy lifting in the ETL phase with T-SQL to make some of these calculations not reside in MDX (which is painful from a computational time perspective) on negating seasonality.

    Thanks for the code - TTYL,
    Mark
    May 11
    Dan Meyerswrote:
    Good presentation, Chris.   It was definitley more in depth than I expected.  I heard alot of good opinions from others as well.  I don't think anyone expected you to go into linear regression.  Thanks for the code.
    May 11

    Trackbacks

    The trackback URL for this entry is:
    http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!1178.trak
    Weblogs that reference this entry
    • None