| Chris 的个人资料Chris Webb's BI Blog日志列表 | 帮助 |
|
2008/4/30 Things I'd like to see in Analysis Services 2011It's that time in the release cycle. I know what the new features are going to be for Analysis Services (for a full list see here) and it's rather like Boxing Day: I've just got my presents and I'm starting to wonder to what Santa will bring on his next visit. And to a certain extent I'm feeling as though I should have been a bit more specific when I was writing my Christmas list last time, given that I'm a teeny bit underwhelmed by the features I did get in AS2008. So here, for the benefit of the elves working away in Redmond, is a list of things I'd love to see in Analysis Services 2011 or whatever it will be called:
The ability to parameterise an MDX query is cool, but it would also be great if you could also parameterise calculations on the server. For example you could set up a kind of server-wide variable which was the rate of Income Tax, do calculations using that value and then allow users to have the option of changing its value and explore 'what-if' scenarios. This would clearly have an impact on AS's ability to cache calculations but so long as people knew about this then I'm sure it would be ok. These parameters would also be visible through the metadata so client tools would know to show users they could change them. After my posting on how dynamic sets actually work in AS2005 a few months back, I got to thinking about why I was disappointed and what I really wanted to see. The problem is that however well you model your cube, and however easy-to-use your client tool is, there's always a gap between what the tools are capable of and what the user is able to accomplish with them. For example, you might know that your end users want to be able to create reports showing the top 10 customers with an extra value underneath that represents the aggregated values of all other customers outside the top ten. This is certainly possible in MDX and perhaps there are client tools out there that support this, but wouldn't it be cool if you could create this as a kind of pre-canned selection that was parameterisable (eg show the top 20 instead of the top 10, or show products not customers) and visible through the metadata so any client tool could display it to the end user and allow it to be used? This would be a selection that would be designed to work wherever the user placed it in his or her query; it's not the same thing as a dynamic set, more of a parameterisable custom set function. The concept could be extended from pre-canned selections to entire queries too - and if I haven't convinced you (or been entirely successful in explaining) the concept of pre-canned selections then I think the case for entire queries stored on the server is much easier to make. Again these would be parameterisable and discoverable through metadata, so that any client tool would (hopefully) be able to use them; for Reporting Services alone this would be extremely useful, so you could have one query displayed in many reports with a single point of maintenance. I'm sure there would also be an opportunity to introduce some performance benefits for these stored queries too, perhaps in terms of being able to cache entire cellsets rather than just the values of individual cells. As I mentioned before here, the tools we've got at the moment are not up to the task of managing large numbers of security roles and some serious work needs to be done in the UI. It would also be great if the tools supported dynamic security and the creation of all the supporting objects, perhaps through a wizard. This is an easy one to specify: I want all of the functionality in MDX Studio put into SQL Management Studio. And a hundred other little things, like in the metadata pane I want to be able to open a level and expand a member to drill down as well instead of always having to start at the All Member of a hierarchy; oh, and I'd like to have a Reconnect button for when I've got a query and I've killed my session because of cube changes, instead of having to disconnect and reconnect. And the ability to display... A lot of people have been asking for this already; here's the Connect: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=321161. This would be soooo useful in performance tuning. Analysis Services is useful not only as a database engine in its own right, but as a query/calculation layer on top of other database engines. OK no-one uses ROLAP on SQL Server because performance is rubbish compared to MOLAP, but for years people have been experimenting with using AS in ROLAP mode on top of Teradata (see here) and I'm sure that it could be used in a similar way on top of other data warehouse appliances or the new generation of COP databases like Vertica. You'd be selling it as a way of OLAP-ifying these databases, getting the incredible scalability and performance they (allegedly) offer in combination with the power of MDX for calculations and access to the data via Excel. Writing new cartridges for these databases should be pretty easy; I suppose the problem comes with trying to generate SQL optimised for each of these platforms. MOLAP writeback in AS2008 certainly does improve performance, but the need always to allocate your values down to the granularity of the fact table is always going to lead to a performance hit. Wouldn't it be useful if you could write a value back to a cell without having to allocate it down always? You'd get instant writeback. I'm sure that would be useful in many cases, and in fact it would work in the same way as if you'd assigned a value to that cell in the MDX Script. I wrote some code in the Analysis Services Stored Procedure Project that did this, in fact, although it wasn't really anything more than a proof-of-concept exercise. Anyway, this post has gone on quite long enough now, although I'm sure if I thought about it I could go on for even longer. I need to do some work! But what would you like to see in Analysis Services 2011? Answers in a comment, please... 2008/4/24 Currency formats: should they be tied to language?One of the most commonly asked questions on the AS MSDN Forum is how to format measures that contain values in different currencies with the correct currency symbol. I've never blogged about this because a lot of people have already written up the solution in detail, for example Mosha: However I was thinking about this recently and in my opinion there's a big problem with using the Language property to do this. And that is that when you set the Language of a cell, you not only change the currency symbol but you also change other ways that the number is formatted, for example the symbols used as thousands separators and decimal separators. In the US and UK of course, we use full stops (I think they're called 'periods' in the US?) as decimal separators and commas as thousands separators, but in continental Europe the roles are reversed. So the value: So you can see what the potential problem is - what happens if you have values in Euros, USDs and GBPs in your cube? However much you educate your users you can guarantee that someone at some time is going to get confused or worse not realise what's going on and interpret the values incorrectly. What's the alternative then? I think using Format_String has to be the way to go. If you alter Vidas's example so that instead of locale ids you put currency symbols inside the currency dimension named query, for example: SELECT CurrencyKey, CurrencyAlternateKey, CurrencyName, and then change his MDX assignment to be something like: SCOPE ([Destination Currency].[Destination Currency].[Destination Currency].Members); Then you get the desired result. However, one thing I did notice when I was experimenting with this is that if you try to use more than one character for your symbol (for example you might want to use CHF for Swiss Francs) you sometimes get the following error: #Error The following system error occurred: Out of present range. . Not good. Here's the bug logged on Connect: Following on from all this, it also makes sense that users connecting from different locales automatically see numbers (but not currency symbols) formatted in the convention of their locale. So a German person might connect to the cube and see Euros with a € and USDs with a $, but see commas used as decimal separators, whereas a user in the UK would still see €s and $s with the correct symbol but full stops used as decimal separators. Now AS2K I seem to remember used to be able to handle this perfectly well - it could display the appropriate decimal separator and thousand separator depending on the client locale. However AS2005 RTM had a problem in that it worked ok for calculated measures but not for real measures; this was 'fixed' in SP2 so both calculated measures and real measures always got displayed in the locale of the server. American software, eh? And to think that so many members of the dev team are from Europe too. Here's the Connect: and here's a forums thread on the subject: Hmmmmm.... I need to check if this has been fixed properly in Katmai. 2008/4/23 Interview with me on Cristian Lefter's blogWhile I was at PASS Europe last week Cristian Lefter videoed an interview with me on what I think is cool in AS2008, for the Romanian SQL Server user group. You can watch it on his blog here:
He also interviewed a load of other people, including Marco Russo. 2008/4/19 ASCMD BetaDave Wickert of Microsoft has asked me to relay the following message about a new version of ascmd he's working on:
“In preparation of SQL Server 2008 Analysis Services RTM’ing, I am starting a small beta program for the next version of ‘ascmd’. This version will have two important new features: 1) Built and verified against Analysis Services 2008 – but also directly compatible with AS2K5 – thus I need both AS2K8 and AS2K5 participates. 2) New multi-user capabilities – the Microsoft SQLCAT team uses ascmd as the multi-user load facility for its Analysis Services benchmarks. Ever want to load up a few hundred users against a SSAS server? We are going to be retrofitting those capabilities (plus a few others) back into the released version of ascmd.
I am looking for users to test out this new version: 1) to, at least, slipstream this version into their current use to see if we’ve broken anything (we don’t think so, but we want to fix anything if we did), and 2) test out the new features if you can.
We are also actively soliciting new features and improvements for ascmd at this time – if we can sneak them in, we will.
The “readme” for the current version of ascmd is located here: http://msdn2.microsoft.com/en-us/library/ms365187.aspx
If you are an avid user of ascmd and would like to be included in this beta, please contact me directly (dwickert@microsoft.com).” Distinct Count White PaperYet another excellent paper on optimising distinct count measures from the SQLCat team: Actually I'm beginning to wonder whether I should be linking to the SQLCat team site - in the same way I never link to Mosha because I assume that everyone who reads my blog reads his too, then I would hope everyone subscribes to the SQLCat team blog as well. One topic missing is a comparison of the performance of distinct count measures with the technique of using many-to-many dimensions to get the same result that Marco Russo describes in his famous m2m white paper: Marco presented on this at PASS Europe and mentioned (which tallies with my experience) that this approach can perform as well as, and sometimes better than, a distinct count measure. Panorama, Google and Analysis Services SupportNew developments on the Panorama/Google front: users of Panorama's pivot table for Google docs will be able to connect direct to Analysis Services via http (if they want to pay) and also upload local cube files and query them (which will be free). Here's the announcement: This comes hot on the heels of Google's Salesforce link-up: They're slowly joining the dots... I was asked not to talk about this until the press release went out, but then I found that Nigel Pendse has already beaten me to it and done an excellent job of summing everything up: I agree with Nigel that I don't think there will be many people at the moment who will be willing to expose their SSAS server via http. However there are a small number of companies who provide hosted SSAS solutions and this I think will become a much more popular option over the coming years. I do think, though, that the local cube option has some very interesting possibilities especially because it's free. I can imagine it would be a great way of distributing data to disconnected employees like salespeople, or to partners, for small companies on a tight budget. Tim Peterson must be pleased - this should cause a resurgence of interest in local cubes. 2008/4/18 PASS Europe and WebdashboardI just got back from Germany after attending the PASS Europe conference. It was good fun as always and the standard of presentations was extremely high. Thanks are due to André Kamman and everyone else who organised it for making it such a success. Hopefully we'll have another one next year? Bill Graziano has a picture (the one on the right) of me, Marco Russo, Allan Mitchell and various others here: Both Marco Russo and I were impressed by Webdashboard, who were exhibiting: 2008/4/16 There will be a SQL2005 SP3!A triumph for people power! Microsoft have announced that there will be a SP3 for SQL2005 and it will be delivered this calendar year. More details here:
http://blogs.technet.com/dataplatforminsider/archive/2008/04/15/sql-server-2005-sp3-coming-soon.aspx
There's also been yet another cumulative update released, CU7:
UPDATE: there's a blog entry from the guys in PSS on how CU7 changes what happens when backups and processing overlap, here: http://blogs.msdn.com/psssql/archive/2008/04/16/analysis-services-changes-introduced-in-cu-7-may-impact-backups.aspx 2008/4/13 Announcing Intelligencia Query for Reporting ServicesCan I be utterly shameless and use my blog to promote products I have a commercial interest in? Of course I can! But first, let me tell you a story... Last December I spent a lot of time thinking about the problems of building Reporting Services reports on top of Analysis Services. To be honest, this issue has been a constant irritation for me ever since I first saw Reporting Services about five or six years ago; this blog entry, one of the most popular I've ever posted, sums up my thoughts on the matter: The way you'd want SSRS to work is fairly clear: you'd want a decent query builder and the option of using any MDX query you'd care to write, and you'd want SSRS to render the resulting cellset in a matrix-like control automatically. It should work like pretty much like every SSAS client tool on the market only the output should be a SSRS report. The reason you can't do this is also clear: SSRS needs to know the name and number of columns your query returns in advance, and you can't do that with MDX queries. So how can you get around this limitation? Then it hit me. Instead of having to unpivot your MDX query manually so that it's in the format that SSRS likes, why couldn't you have a custom data extension that did the hard work for you? So I wrote my first attempt in an afternoon and after a lot more polish it now works really well. Here's an example of what it does: Take the following query: select {[Date].[Calendar Year].&[2003],[Date].[Calendar Year].&[2004]} on 0, This returns the following SSRS-unfriendly results: But when run through my custom data extension the data is returned as follows: You'll see that all of the columns have become rows, and there are two columns containing the Value and Formatted_Value properties of the cells. The cool thing is now that you can put a matrix control in your report, map the 'column' fields onto the matrix column groups, the 'row' fields onto matrix row groups and one of the cell properties onto the data area like this: And when you run your report, the matrix control recreates the original structure of your MDX query: This is pretty simplistic but even more complex queries with multiple levels of crossjoining on rows and columns are handled just as easily; it gets around all the limitations with the built-in Analysis Services data source/query builder as well as those of the OLEDB data source. Some of the key advantages are:
I was feeling quite pleased with myself at this point and was about to publish the code up to Codeplex when it occurred to me that I could make some money out of this - after all, I do have a wife, two kids and a mortgage to feed. There was also one big missing feature that I knew I didn't have the time to implement on my own and that was an MDX query builder: whereas I'd be quite happy to write my own MDX queries, I know most people aren't that comfortable doing so and need some kind of tool to do it for them. So I got in touch with Andrew Wiles of iT Workplace who I knew had an excellent MDX query control, signed a partnership agreement with him and over the last few months we (well, mostly him really) have worked to integrate his query builder with my custom data extension with BIDS to come up with what we think is a product that anyone using Reporting Services and Analysis Services will want to use, Intelligencia Query for Reporting Services. Yes, it's something you're going to have to pay for but in my opinion it's very modestly priced and more importantly it has been saved my work from the fate of being a cool idea implemented by an occasional C# coder and stuck in perpetual beta on Codeplex, and turned into a commercial-quality product that is safe to install on your servers and which will grow and improve over time. The main selling point of the whole solution is that it gives you, for the first time with Analysis Services, the ability to build in a query builder exactly the resultset you want to see displayed in your Reporting Services report - it's truly WYSIWYG. Certainly in a lot of cases you can achieve what you want with the native query builder but as I said, with it you always have to try to imagine the end result you want and write a query that can be pivoted into that end result, and that's not easy for the majority of people. Combine this with the fact that the Intelligencia query builder allows you to build much more sophisticated queries than the native query builder and that, as mentioned above, the custom data extension solves many of the problems you run into with the built-in Analysis Services and OLEDB data sources, and I think you've got something that finally makes it easy to build Reporting Services reports using Analysis Services data. I think Andrew's query builder is the equal of anything else on the market in terms of the kind of query it can create. It does all the stuff you'd expect like hierarchical selections, it does stuff that few other tools do such as allow for the creation of asymmetrical sets on an axis, and it has at least one outstanding feature that no-one else has in its ability to let users create MDX calculations with an Excel formula-like interface. It even supports SAP BW as a data source too. Here's a screenshot: Anyway, enough of me going on. You can find out more, download an eval version, watch a video and buy some licences here: One last thing: we want to hear your feedback. This is a version 1.0 and while we've got a lot of ideas on new features that could be added, we want to be sure they are the features that our users actually want. So even if you only download the eval please drop me a mail to let me know what you like and what you think is missing. It's likely we'll be able to add new features pretty quickly. 2008/4/10 illuminate, FAST and yet more completely unfounded (and probably ill-informed) speculation on my partI was just reading this article on TWDI: It's on illuminate and their 'correlation DBMS' - not relational, not OLAP, not COP, but a 'value-based' system. All very fascinating indeed, although their web site is yet another one of those irritating ones that leaves you with the feeling that some important technical detail and information on where they're positioning themselves is missing. This entry on their blog: I might have completely the wrong end of the stick, but didn't Microsoft get its hands on something similar when it acquired FAST last year (blogged here)? Is this it: 2008/4/8 Google App EngineI see Google have announced their own web application platform: 2008/4/3 Can your sum be a subtraction? Or can you avoid it altogether?Quite often you'll find yourself writing calculations that need to sum up large sets; in fact, they might be summing up all of the members on a level apart from one or two. In that situation it's worth using the structure of your dimension to your advantage. Take the following query, which sums up the Internet Sales of all customers apart from one and then does a TopCount based on this: WITH SELECT MEASURES.TEST ON 0, On my laptop using AS2005 the query runs in 2 minutes 2 seconds on a cold cache (Katmai does no better with this query, incidentally). Yuck. But if we change the calculation around, so that rather than summing the customers we do want we take the sales for all customers and subtract the sales for the customer we don't want (which is fine if the measure we're looking at has an aggregate function of Sum) then we can do the following: WITH SELECT MEASURES.TEST ON 0, ...which executes in 1 second on a cold cache. Taking this further, if you have a set like this you're frequently summing up in calculations it might be a good idea to create a new attribute on your dimension to avoid having to do any work in MDX at all. In the Adventure Works example above, maybe Customer 20075 is the Sultan of Brunei and he ordered 5000 new bikes for all his friends - which means that including him in your calculations would skew the results. What you could do is create a new boolean attribute on Customer called something like 'Exclude from Calculations', which would then mean you could rewrite the query like this: SELECT [Measures].[Internet Sales Amount] ON 0, and probably get even better performance. |
|
|