Chris's profileChris Webb's BI BlogBlogLists Tools Help
    1/2/2007

    Build Your Own Analysis Services Cache-Warmer in Integration Services

    Cache-warming is one of the most neglected performance-tuning techniques for Analysis Services: perhaps it seems too much like cheating? Yet almost everyone knows how much difference there can be executing a query on a cold cache and a warm cache so there should be no excuse not to be doing it, especially if you know what queries your users are likely to be running in advance. AS2005's caching mechanism is more complex than I can describe here (or than I can describe full stop - although I hear that the recently published "Microsoft Analysis Services 2005" has some great information on this front) but a lot of the time it can cache raw data of the cube and quite often the results of calculations too; you'll need to test your own cubes and queries to find out exactly how much you'll benefit but almost every cube benefits to a noticeable extent.

    I've recently implemented a simple cache-warming system for a few customers which I thought I'd share details of. Now I know that the documentation for asmd contains details of how you can use it for this purpose (see http://msdn2.microsoft.com/en-us/library/ms365187.aspx for details) but I didn't go down this route for a number of reasons:

    • This example uses a batch file and I preferred to keep all my logic in SSIS, especially since the customers were already using it for cube processing.
    • I wanted to avoid making my customers have to get their hands dirty extracting the MDX needed. They were using Excel 2003 as their main client and as you may know Excel 2003 makes heavy use of session sets so extracting and modifying the MDX it generates to create single MDX statements would have been too much to ask.

    Here's what I did instead. First, I created a new SQL Server database to store all the queries I was going to use. Then I used Profiler to capture the necessary MDX: I made sure no-one else was connected to the server, started a trace which only used the QueryBegin event and which included the TextData column, got the user to open Excel and construct and run their query, then stopped the trace and saved the results to a table in my new database. After doing this a few times I ended up with several tables, each of which contained the MDX generated for a particular sequence of queries in Excel.

    Next I created a SSIS package which took each of these queries and executed them. Here's what it looked like:

    The outermost ForEach container used an SMO enumerator to loop through every table in my SQL Server database and put the table name in a variable (the expression generated by the UI was Database[@Name='CacheWarmer']/SMOEnumObj[@Name='Tables']/SMOEnumType[@Name='Names']). Next a script task used this table name to create a SQL SELECT statement which returned every query in the current table and put that in another variable. Here's the code:

    Dts.Variables("GetMDXQueries").Value = "SELECT textdata from [" + Dts.Variables("TableName").Value.ToString() + "] where DatabaseName='" + Dts.Variables("ASDatabaseName").Value.ToString() + "'"

    Next I used an Execute SQL task to execute this statement and out the resultset into another variable, the rows of which I looped over using the innermost ForEach loop using an ADO enumerator. Inside this second loop I got the MDX query out of the current row and into a string variable in a Script task as follows:

    Dts.Variables("MDXQueryString").Value = Dts.Variables("MDXQueryObject").Value.ToString()

    Then used another Execute SQL task, connected to my cube, to run the MDX query. I've been unable to execute MDX queries inside a Data Flow task (except when going via SQL Server using linked servers, which is nasty), hence the use of an Execute SQL task here; I also found that I had to use an ADO connection to my cube - if I used an OLE DB connection all my queries ran twice for some reason. I also set the RetainSameConnection property on the connection to the cube to true so that queries which relied on session scoped sets created earlier in the workflow didn't fail; nonetheless I also set the FailPackageOnFailure and FailParentOnFailure properties of the Execute SQL task to false just in case. I was then able to save the package up to my server and use SQL Server Agent to execute it immediately after cube processing had finished.

    As I said, if you implement a cache-warming system you'll want to test how much of a difference it makes to your query performance. The easiest way to do this is to clear the cache and then run the package twice, noting how long it takes to run both times. The difference between the two times is the difference between a cold and a warm cache. To clear the cache you can either restart the Analysis Services service or run a Clear Cache command in an XMLA query window in SQLMS. Here's an example of the latter which clears the cache of the Adventure Works database:
    <Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
    <ClearCache>
    <Object>
    <DatabaseID>Adventure Works DW</DatabaseID>
    </Object>
    </ClearCache>
    </Batch>

    Now I will admit that I'm not the world's greatest SSIS expert so if anyone has any suggestions for improving this I'd be pleased to hear them. Please test them first though - as I mentioned above I found SSIS didn't always work as I expected with SSAS as a data source! I've also created a similar package whih connects to the query log AS creates for usage-based optimisation, reads the data in there and uses it to construct MDX queries which it then runs against the cube. This has the advantage of removing the need for anyone to extract any MDX from anywhere; plus the queries it constructs return very large amounts of data so you can use up all that memory you get on 64-bit boxes. The problem is that at the moment some of the queries it constructs are way too big and take forever to run... when I've worked out how I want to break them up into smaller chunks I'll blog about it.

    UPDATE: Allan Mitchell has very kindly done some more research on what happens when you try to run an MDX query through an Execute SQL task and written it up here:
    http://wiki.sqlis.com/default.aspx/SQLISWiki/ExecuteSQLTaskExecutesMDXQueryMoreThanOnce.html

     

    Comments (18)

    Please wait...
    Sorry, the comment you entered is too long. Please shorten it.
    You didn't enter anything. Please try again.
    Sorry, we can't add your comment right now. Please try again later.
    To add a comment, you need permission from your parent. Ask for permission
    Your parent has turned off comments.
    Sorry, we can't delete your comment right now. Please try again later.
    You've exceeded the maximum number of comments that can be left in one day. Please try again in 24 hours.
    Your account has had the ability to leave comments disabled because our systems indicate that you may be spamming other users. If you believe that your account has been disabled in error please contact Windows Live support.
    Complete the security check below to finish leaving your comment.
    The characters you type in the security check must match the characters in the picture or audio.

    To add a comment, sign in with your Windows Live ID (if you use Hotmail, Messenger, or Xbox LIVE, you have a Windows Live ID). Sign in


    Don't have a Windows Live ID? Sign up

    No namewrote:
    Hi,

    Great blog with interesting informations.
    I can use it t solve my problem.

    Thanx
    M.
    http://www.vanjobb.hu/
    May 15
    Chris Webbwrote:
    BTW, the reason your query is failing is because it's referencing a session-scoped named set on the columns axis; I guess you either haven't already executed the CREATE SET statement needed for this set, or you haven't set RetainSameConnection.
    May 6
    No namewrote:
    ya, it's formijob@yahoo.com
    May 1
    Chris Webbwrote:
    I can if you tell me what your email address is! Alternatively, you can find my contact details at http://www.crossjoin.co.uk
    May 1
    No namewrote:
    Chris, Can you please email me your SSIS package?
    thanks
    Apr. 30
    No namewrote:
    hi Chris,
    I tried a different approach for cache warming. In my package I have a execute sql task which gets the mdx queries from a trace table in the database and another execute sql task which is with in a for each loop container executes it.
    But the problem is the query captured by the profiler is in this format.

    SELECT NON EMPTY [{D84A78E5-5A54-4A9D-B945-41A1D370B5DE}Pivot30Axis1Set0] DIMENSION PROPERTIES MEMBER_NAME, PARENT_UNIQUE_NAME ON COLUMNS
    FROM [CUSTOMER CUBE] CELL PROPERTIES VALUE, FORMATTED_VALUE, FORE_COLOR, BACK_COLOR

    when i run the package it gives me an error saying can not run this query.
    Can you please advice why it's happening?
    Apr. 30
    Chris Webbwrote:
    I will if you send me your email address! You can contact me via http://www.crossjoin.co.uk
    Feb. 16
    Jim Wrightwrote:
    Hi Chris,
    Could you also email the SSIS package to me please?
    Thanks!
    Feb. 16
    Chris Webbwrote:
    I assume that Project Server 2003 runs off Analysis Services 2000, right? If so, the general approach should work but the package might need some modification; the package is SQL2005 only.
    Jan. 13
    hungwrote:
    Hi Chris,
    Does this mechanism work with AS for Project server 2003 ?
    If Yes, then would you please send me the package?
    Thanks
    Hung
    Jan. 13
    Great.. the package work fine for query that return small sets of data. it complain about XML and resultset when the data is averagely large, pls any help..
    Dec. 4
    George Qiaowrote:
    Hi Chris
     
    Great Stuff! Could you please send the SSIS package to me? I've been looking for this topic for a while and it is so nice to read your article.
     
    Thanks in advance
    George
    Nov. 10
    Imranwrote:
    Can  u please forward me the package? This is urgent. Tks
     
    Imran
    Oct. 24
    Chris Webbwrote:
    Mutylala, can you send me your email address? You'll find my contact details at www.crossjoin.co.uk
    Oct. 16
    No namewrote:
    hi Chris ,
     
     
    please , Could send me SSIS package which you explained above ....
     
    thanks
    mutylala
    Oct. 15
    No namewrote:
    i am looking for a create cache tables,
     
    please Would you been kind and upload your SSIS package so I can have a look at it and get mine to work
     
     
    Oct. 15
    Chris Webbwrote:
    I mailed you the package using the address you left here - hope you get it...
    July 9
    Nielswrote:
    Hi Chris
     
    Greate blog - I am fairly new to SSIS and "Cache warning" and I am having a hard time getting the SSIS package to work as you describe.
    Would you been kind and upload your SSIS package so I can have a look at it and get mine to work
     
    Thanks in advance
    Niels
    July 9

    Trackbacks (1)

    The trackback URL for this entry is:
    http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!994.trak
    Weblogs that reference this entry