Chris Webb's BI Blog
2009/12/04 DAX: Running Totals

Continuing my exploration of DAX, here’s a quick post on calculating running totals. I’ll be honest and say that I didn’t work out these expressions myself but got them from a newsgroup exchange between Colin Banfield and Karen Aleksanyan. In this case I want to calculate a running total from the beginning of time up to the current date (as opposed to from the beginning of the current year, as with a year-to-date). There are two possible approaches:

='FactInternetSales'[Sum of SalesAmount](VALUES('DimDate'[FullDateAlternateKey])>='DimDate'[FullDateAlternateKey], ALL('DimDate'[FullDateAlternateKey]))

='FactInternetSales'[Sum of SalesAmount](DATESBETWEEN('DimDate'[FullDateAlternateKey], BLANK(), LASTDATE('DimDate'[FullDateAlternateKey])), ALL('DimDate'[FullDateAlternateKey]))

The first simply asks for a filter of all values from FullDateAlternateKey where the date is greater than or equal to the current row date. Don’t ask me why it needs a ‘greater than’ as opposed to ‘less than’, but that’s what it needs to work – possibly this is a bug. The second expression uses the DatesBetween, Blank and LastDate functions (interesting that the Blank function seems to work in the same way a null does with the colon operator in MDX in this case); it has one advantage over the first expression in that it doesn’t display errors when other columns from the DimDate table are used in the pivot table.

Here’s what the output looks like with just dates on rows:

As with the original ytd calculation in my previous post on DAX, neither expression displays the expected values when looking at Years or Quarters, for example – you again just see the same value as Sum of SalesAmount. The really weird thing is that, at the date level, both calculations return different values when you have Years and Dates on rows compared with when you just have Dates.
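To pin down the two behaviours being compared, here is a rough sketch in Python rather than DAX. The sales figures and function names are made up for illustration and have nothing to do with how PowerPivot evaluates the expressions; the sketch only shows what a running total from the beginning of time should return next to one that restarts at each year boundary. It assumes the rows are already sorted by date.

```python
from datetime import date
from itertools import accumulate

# Hypothetical daily sales straddling a year boundary (not Adventure Works data).
sales = [
    (date(2001, 12, 30), 100.0),
    (date(2001, 12, 31), 50.0),
    (date(2002, 1, 1), 25.0),
    (date(2002, 1, 2), 75.0),
]


def running_total(rows):
    """Sum from the first date up to and including each date."""
    totals = accumulate(amount for _, amount in rows)
    return [(day, total) for (day, _), total in zip(rows, totals)]


def year_to_date(rows):
    """Like running_total, but the sum restarts whenever the year changes."""
    result = []
    current_year, total = None, 0.0
    for day, amount in rows:
        if day.year != current_year:
            current_year, total = day.year, 0.0
        total += amount
        result.append((day, total))
    return result


if __name__ == "__main__":
    for (day, rt), (_, ytd) in zip(running_total(sales), year_to_date(sales)):
        print(day.isoformat(), rt, ytd)
```

With these figures the running total keeps climbing across 31 December, while the year-to-date version drops back to the first day's sales on 1 January.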
The screenshot below shows the crossover between 2001 and 2002 with just dates on rows:

Now here’s what you see with Years present:

Notice how in the first screenshot the running total carried on across the year boundary, as you’d expect, but in the second case the running total is only calculated from the beginning of the current year. I’m not sure whether this is a bug (although I suspect it is) or expected behaviour; the same thing happens when you slice by Year too, which makes more sense:

As ever, I’ll find out what’s going on and post an update here…

UPDATE: Marius Dumitru has come to the rescue again. Here’s the working version that does everything I want:

='FactInternetSales'[Sum of SalesAmount](DATESBETWEEN('DimDate'[FullDateAlternateKey], BLANK(),LASTDATE( 'DimDate'[FullDateAlternateKey])), All('DimDate'))

Here it is working:

2009/12/01 First steps in DAX: Year-To-Date

With the release of the latest CTP of PowerPivot, DAX has gained a lot of new functions (mostly to do with time calculations), so over the weekend I decided that it was time to sit down and start learning it properly. A lot of other people out there are doing just the same thing - Vidas, for example – and like them I thought I’d blog about my experiences while I was checking it out. So far I agree with Shaun Ryan that anyone with a background in MDX will have an advantage when it comes to understanding DAX, but there are as many differences as there are similarities and it’s still a bit tricky to understand what’s going on.

I would like to point out that the code here doesn’t necessarily represent best-practice for DAX (I’ve only been using it for a short time, remember!) and in some cases the code will have to change before RTM because DAX itself will change; hopefully, though, these examples will help anyone else out there who, like me, is at the bottom of the learning curve.
Throughout this blog post I’m using a very simple PowerPivot model based on the DimDate and FactInternetSales tables from the 2008 version of the Adventure Works DW database, with the two tables joining on the OrderDate key.

I thought I’d choose year-to-dates as the subject of this first blog post because it’s a common calculation and relatively easy to understand. There is, of course, a DAX function specifically for year-to-date calculations: TotalYTD. Here’s what my first attempt at a DAX year-to-date calculated measure looked like:

TotalYTD(Sum([SalesAmount]), 'DimDate'[FullDateAlternateKey], All('DimDate'[FullDateAlternateKey]))

I would have put in some line breaks but of course DAX doesn’t allow them (yet? hopefully this will change before RTM). Step by step:
Here’s a screenshot of the calculated measure in action:

As you can see, it works well when we’re looking at dates. But when we start to introduce years and quarters you can see we don’t get the results we might expect:

Hmm, cue several hours of head-scratching on how to fix this. After a lot of trial and error, I found that creating a second calculated measure like this worked:

'FactInternetSales'[YTD Sales](LASTDATE(VALUES('DimDate'[FullDateAlternateKey])), all('DimDate'))

What I’m doing here is forcing the measure to display the value of the [YTD Sales] calculated measure for the last date in the current context. Here’s what it displays (note the value for Quarter 2):

Various other similar approaches didn’t work, though, and the obvious next step of using LASTDATE(VALUES('DimDate'[FullDateAlternateKey])) inside my original calculated measure didn’t work either. Given that PowerPivot doesn’t allow you to hide calculated measures (another thing that, in my opinion, needs to be fixed before RTM – breaking up calculations into smaller steps is a useful thing to do for debugging and sharing code) I don’t particularly like this two-step process. I’ll post an update here when/if I find a way to do this in a single calculation; if anyone else out there finds out how to do this, please leave a comment!

UPDATE: Marius Dumitru has given me the answer to my problem. Here's a version of the YTD calc that works on all time selections:

=TotalYTD(Sum([SalesAmount]), 'DimDate'[FullDateAlternateKey], All('DimDate'))

The difference here is that the All function is taking the whole DimDate table rather than just the [FullDateAlternateKey] column. I'm currently trying to work out why this is important... In the meantime, here’s a screenshot of it working:

2009/11/25 Free version of Microstrategy Reporting Suite for SSAS

Here’s a cheeky move by Microstrategy: they’ve made the free version of their Reporting Suite work for Analysis Services.
More details and a download link here:

I’ve not tried it so I don’t know whether it’s any good or not, but it’s free and you can have up to 100 users, so it will be worth checking out. Of course this is Microstrategy trying to hurt Microsoft and its partners, but, well, it’s free…

2009/11/24 SQLBits V Summary & Thanks

Another SQLBits – the fifth! – has come and gone, and I wanted to say thanks to everyone that helped to make it such a success. It was a three-day event for the first time this time, which meant that there was even more organisation work needed, but looking back I think it all went extremely smoothly and was by far the slickest conference we’ve put on so far. My colleagues on the organising committee, Simon Sabin, Martin Bell, Allan Mitchell, Darren Green and James Rowland-Jones, are a great bunch of people and it’s always a pleasure to work with them on SQLBits even if it does take a big chunk out of our spare time. Thanks are also due to the speakers and sponsors, without whom the event would not be possible, and I’d also like to highlight the people who volunteered to help out doing unglamorous things like stuffing the attendee bags and room monitoring – Rachel Clements, Jon Reade, Gary Short, Rachel Hawley, Richard Douglas, Luke Hayler and many others.

If you came to SQLBits I hope you enjoyed it, and if you did enjoy it please let everyone know by blogging and twittering about it. Please also join the SQLBits groups on LinkedIn and Facebook, and if you’ve got any pictures of the event why not post them on the latter?

Anyway, it’s time to crack on with some real work and deal with the big pile of emails that has built up over the last few weeks. I’ve missed a whole bunch of big announcements I would otherwise have blogged about but I’m sure you’ve caught them elsewhere… I must get round to downloading and installing the latest PowerPivot CTP etc. See you at the next SQLBits!
2009/11/17 Pinpoint and Dallas

Interesting news from PDC: Microsoft has announced two new services – Pinpoint and Dallas. You can find Pinpoint here: http://pinpoint.microsoft.com

Here’s the blurb from the site:

At the same time, Pinpoint helps developers and technology service providers quickly and easily get software applications and professional services to market, and engage customers who need what they offer. Pinpoint is the largest directory of qualified IT companies and their software solutions built on Microsoft technologies.
Whether you’re searching for expert help or offering it, Pinpoint helps you easily find and engage the right people and technologies to get the job done.
Much, much more interesting from a BI point of view is Dallas, which is part of Pinpoint: http://pinpoint.microsoft.com/en-US/Dallas

It’s Microsoft’s marketplace for data, all built on Azure. Again from the blurb:

The Register has the best write-up of what this is here: http://www.theregister.co.uk/2009/11/17/microsoft_dallas_data_service/

From that article:

Note my emphasis on the last sentence! Here at last is the ability to buy that third party data that’s been a part of every PowerPivot demo. I’ve worked with a lot of companies that sell data in my career, and this looks like it could be a very significant development for them. I’d even heard vague rumours that MS were interested in buying commercial data providers at one point, several years ago – if they were prepared to go to this extreme then it would certainly go a long way to making this strategy a success.

Now just think how cool it would be if SSAS or PowerPivot could be hosted on the cloud, so all you needed was Excel to analyse this data. Maybe one day…

2009/11/13 SQLBits Agenda Published

At long last, the SQLBits agenda has been published:

It’s not too late to register, even though SQLBits is only next week. It’s looking like it will be the largest event yet in terms of attendance… If you’re coming and you see me around, say hello!

UPDATE: car sharing is live now too - http://sqlbits.com/CarSharing.aspx

2009/11/11 Ragged Hierarchies, HideMemberIf and MDX Compatibility

Here’s something that I suspect a few people out there will be surprised I didn’t know – but as far as I’m concerned, if I didn’t know it then it’s worth blogging about. Anyway, it regards ragged hierarchies, the HideMemberIf property and the MDX Compatibility connection string property.

Now you probably know that if you want to turn a user hierarchy into a ragged hierarchy (perhaps to avoid using a parent child hierarchy) you need to use the HideMemberIf property on the user hierarchy.
For example, imagine you were using the following SQL query as the basis of your customer dimension:

SELECT 'UK' AS Country, 'Bucks' AS State, 'Amersham' AS City, 'Chris Webb' AS Customer

We could build a dimension off this with attributes for Country, State, City and Customer, and for two out of our three customers that would be fine. However the Pope lives in the Vatican, which is (at least for the purposes of this exercise) a Country with no concept of City or State; and in the case of customers who live in the Vatican, we just want to be able to drill down on the Country ‘Vatican’ and see all of the Customers who live there without drilling down through a meaningless State and a City.

So what we can do is build a user hierarchy on our dimension with levels Country, State, City and Customer, and on the lower three levels set the HideMemberIf property to OnlyChildWithParentName:

Then, with any sensible client tool, we can connect to the cube and browse the dimension as we want:

I say ‘sensible’ client tool, because of course this only works if you set the following in the connection string:

MDX Compatibility=2

This much I knew. However, what I didn’t realise until last week, when I was moaning about this to TK Anand from the SSAS dev team at PASS, is that for some ragged hierarchies you don’t need to set the MDX Compatibility connection string property at all. For example, if in our case we duplicate the Customer upwards rather than the Country downwards, like so:

SELECT 'UK' AS Country, 'Bucks' AS State, 'Amersham' AS City, 'Chris Webb' AS Customer

…and then build the dimension, setting HideMemberIf on our user hierarchy to OnlyChildWithParentName, we can get the result we want without setting the MDX Compatibility property. Here’s a screenshot of this new dimension in Excel just to prove it:

The difference here is that we’re hiding all members below the State level right down to the bottom of the hierarchy, rather than hiding members somewhere in the middle of the hierarchy.
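To make the difference between the two data layouts concrete, here is a small Python sketch that models the OnlyChildWithParentName rule as "hide a member when it is its parent's only child and has the same name as its parent". This is only a model of the rule for illustration, not how Analysis Services implements it. Only the UK row appears in the queries above; the two Vatican rows, the constant names and the `hidden_levels` function are assumptions made up for this example.

```python
from collections import defaultdict

LEVELS = ("Country", "State", "City", "Customer")

# Only the UK row comes from the post; the Vatican rows are assumed examples.
DUPLICATE_COUNTRY_DOWN = [
    ("UK", "Bucks", "Amersham", "Chris Webb"),
    ("Vatican", "Vatican", "Vatican", "The Pope"),
]
DUPLICATE_CUSTOMER_UP = [
    ("UK", "Bucks", "Amersham", "Chris Webb"),
    ("Vatican", "The Pope", "The Pope", "The Pope"),
]


def hidden_levels(rows):
    """For each row, list the levels whose member the rule would hide."""
    # Map each ancestor path to the set of distinct child names beneath it.
    children = defaultdict(set)
    for row in rows:
        for depth in range(1, len(row)):
            children[row[:depth]].add(row[depth])

    result = []
    for row in rows:
        hidden = []
        for depth in range(1, len(row)):
            only_child = len(children[row[:depth]]) == 1
            same_name = row[depth] == row[depth - 1]
            if only_child and same_name:
                hidden.append(LEVELS[depth])
        result.append(hidden)
    return result


if __name__ == "__main__":
    print(hidden_levels(DUPLICATE_COUNTRY_DOWN))
    print(hidden_levels(DUPLICATE_CUSTOMER_UP))
```

Under these assumptions the first layout hides State and City (members in the middle of the hierarchy), while the second hides City and Customer (everything below State, down to the bottom), which is the distinction described above.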
Truly, this is one of those “Doh, if only I’d known!” moments… this at least means that in some of the scenarios where you’d use ragged hierarchies you can get them to work with Excel, even if it means that we have to hack the data (‘The Pope’ is a Customer, not a State or a City). On the other hand there are plenty of scenarios where you do need to hide members in the middle of a hierarchy, and frankly I don’t see why Excel 2007 can’t set MDX Compatibility=2 in its connection string so they work properly.

2009/11/09 PASS Summit Thoughts

The PASS Summit is over for another year and I’m just starting out on the long trip back home, so there’s plenty of time to get my thoughts together on what’s happened over the past week. In fact there’s not much to say about the event itself: it was, as ever, a lot of fun and totally worthwhile. Hey, within 30 minutes of arriving at the conference I learned I’d won an award for the best BI-related blog entry, for my post on implementing real SSAS drilldown in SSRS!

Attendance was up from last year although probably the recession still took its toll: remember that there was no BI Conference this year and I would have thought that a lot of people who would have gone to it would have gone to PASS instead. To be honest I think not having a BI Conference is a good thing, actually. I don’t like having to choose which conference to attend, and part of the benefit of a conference is to get as many members of a tech community together in one place. And this was certainly the largest gathering of Analysis Services people I’ve ever seen: all the usual crowd were there, I met a lot of people who I’d only met a few times before, and I finally got to meet Darren Gosbell in person after having known him by email for at least five years.

One complaint I would make about the event was that the sessions weren’t scheduled particularly well.
I know everyone always complains about this but in this case it did seem worse than usual: my session, for example, was up against two other SSAS-specific sessions, but in other cases there were time slots with no SSAS content at all.

The other benefit of PASS is that you get to talk at length about what’s going on in the world of SQL Server with other like-minded people. As a result you get to crystallise your thoughts on a lot of matters and - guess what – I’m going to share mine here.

First of all, the topic that was on everyone’s lips was PowerPivot. In fact everyone at the conference must have seen the standard demo at least five times and there were also a lot of advanced sessions on it too. Don’t get me wrong, I really think PowerPivot is cool from a technology point of view, I am going to take the time to learn it, and I also think from a make-money-by-getting-people-to-upgrade-to-Office-2010 point of view it is a very clever move for Microsoft. But my feelings about it remain ambiguous. Quite apart from the arguments about it discouraging ‘one version of the truth’ and encouraging spreadmarts that have already been discussed ad nauseam, I have another problem with it: I don’t honestly know whether I, as a consultant, will be able to make any money from it. The very nature of it, as a self-service tool, means no expensive outside consultancy is necessary. I don’t think it will take business away from me though; it will be widely used and it will be used instead of regular SSAS for more basic projects, but the more serious stuff will stay with SSAS I hope. I think the need for sophisticated security and more complex calculations will be the deciding factor when people choose between SSAS and PowerPivot; I’m not sure I see many people upselling from PowerPivot to SSAS either. We’ll see.

Something that worries me more about PowerPivot is the fact that it seems to have diverted the attention of the SSAS dev team.
For SSAS 2008 we had few new features, although the performance improvements were very welcome. For 2008 R2 I can only think of one new feature in SSAS, and that’s the ability to use calculated members in subselects that will allow Excel 2010 to use time utility dimensions properly (I’ll blog about that at some point). Even though work on good old server-side SSAS will resume for the next major release of SQL Server I worry that PowerPivot will take priority in the future. If this happened it would be bad for me and other BI partners from a business point of view, and seems crazy given that SSAS has been such a successful product in the enterprise sector; it’s not like there aren’t a lot of new features and fixes that could be done. Shades of IE6 and Microsoft getting complacent once it’s cornered a market, I think.

Last of all on PowerPivot, I suspect that there is something new relating to it in the roadmap that hasn’t been announced yet. David DeWitt devoted his keynote on Thursday to it, the specifics of column-store databases and the Vertipaq engine (which is the new in-memory storage engine that PowerPivot uses), and at the end hinted at this saying that although he couldn’t make any announcements, those people who had been paying attention might have some ideas on what the future held for it. Of course I hadn’t been paying attention properly, but the obvious thing would be to integrate it with the relational database somehow. Given that PowerPivot is now being hosted inside Sharepoint, why not host it in SQL Server too? It’s already very table and join friendly, and I could imagine a scenario where it was used inside SQL, pointed at a schema, some kind of proactive caching kept the data in SQL in synch with the data in the Vertipaq store, difficult BI calculations could be expressed in DAX, but the whole thing was transparent to TSQL. Imagine integrating that with Madison too!
Moving on, the other thing that has become clear to me is that I really have to sit down and learn Sharepoint (or at least the relevant bits of it) properly. It’s at the heart of Microsoft’s BI strategy and there’s no avoiding it. I have to admit to some mixed feelings about this move though, and I know other people I talked to at the conference share them. Partly it’s because, in the past, there were BI specialists and there were Sharepoint specialists and we didn’t necessarily have much to do with each other; now, though, the two worlds are colliding and I’m outside my comfort zone. You might say that Sharepoint has been part of the MS BI strategy for ages now, what with PerformancePoint etc, but I see an awful lot of MS BI customers in my work and I very rarely seem to see any Sharepoint, although it could be because I’m not looking out for it. A more valid objection is that the need for Sharepoint Enterprise Edition CALs adds a lot of extra cost to a project; and from a technical standpoint Sharepoint itself carries a very big overhead – its installation and maintenance may put a lot of customers off if they don’t already have a company-wide Sharepoint strategy, and if they do have one they may not be willing to go to 2010 for some time. Sharepoint might be just too big for some customers to swallow, and be a difficult sell for BI partners. I’d like to stress though, once again, that I see the considerable technical benefits for using Sharepoint for BI, and even if the reception of the latest wave of PerformancePoint has been somewhat muted (eg the realisation that the decomposition tree has been tacked on at the last minute and isn’t properly integrated) I am impressed with what’s coming with Excel 2010 and Excel Services too; for example I think the Excel Services REST API is very cool indeed, and as a SSAS client Excel 2010 is a big improvement on 2007 (which wasn’t all that bad either). 
I’ve decided I also need to learn Excel properly now as well – get to know all those advanced Excel functions, use Solver and all that. Once again two worlds are colliding: the Excel guys and the SSAS guys are going to have to learn a lot more about each other’s technologies for truly effective BI applications to get built.

Anyway, I think this post has gone on quite long enough now. As always, your comments on everything I’ve written here would be much appreciated.

2009/11/05 Live Blogging @PASS - SSAS Consolidation and Virtualisation

Here are some notes from the SQLCat team’s session on SSAS consolidation and virtualisation; they’re a bit fragmentary since I’m too busy paying attention to what’s being said! I get asked about these issues by my customers all the time.
The most useful session so far at this conference for me – I learned a lot.

Quest add support for SSAS monitoring

Something I saw yesterday at PASS: Quest now have support for monitoring SSAS from their “Spotlight on SQL Server Enterprise” product. See http://www.quest.com/newsroom/news-releases-show.aspx?contentid=10602

It’s pretty basic at the moment – they capture some Perfmon counters and data from schema rowsets, but no trace data – and nowhere near as sophisticated as what SQLSentry have, but it’s good to see another vendor entering this market.