

Date: Sunday, 27 Jul 2014 21:18

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/07/28/the-perils-of-calculating-an-average-of-averages.aspx

I've seen questions about issues calculating averages come up a few times in various forums, and the topic came up again last week, so I feel there is some benefit in walking through the details. For many of you the following will be nothing new, but I'm hoping that this may serve as a reference that you can point to when you get requests for this sort of calculation.

The core issue here is really a fundamental mathematical one. Personally I see it surfacing most often in DAX and MDX, partly because those are the languages that I spend a lot of time with, but also because their multi-dimensional nature means you need to be able to write generic calculations that will work regardless of how the end users slice and dice the data.

The discussions invariably start with a statement like the following:

"I have a calculated measure that an average, but my totals are calculating incorrectly"

There are 2 different issues I see relating to this.

The first one is trying to use the AVG() function in MDX. Basically if you want an average calculation that works with all your different dimensions then avoid this function. The AVG function in MDX calculates the average over a fixed set. You may be able to use it in a static query, but to calculate an average in your MDX script simply create the two base measures - a sum and a count, then divide the sum by the count. This is not as much of an issue in DAX as the built-in AVERAGE, AVERAGEA and AVERAGEX generally work as expected.
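To make the sum-over-count pattern concrete, here is a minimal DAX sketch of the two base measures and the division (the Sales table and its Amount and Count columns are hypothetical names used purely for illustration; DIVIDE is used to avoid divide-by-zero errors):

Total Amount := SUM ( Sales[Amount] )
Total Count := SUM ( Sales[Count] )
Average Amount := DIVIDE ( [Total Amount], [Total Count] )

Because the sums are evaluated in the current filter context before the division happens, this measure returns a correctly weighted average at every level - detail rows, subtotals and the grand total.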

The other sort of question that I see is related to how the totals are calculated and the question is usually something like the following:

"I have an average measure calculated by doing sum / count - which produces the correct average for each row, but the total is calculated as "sum of sums" / "sum of counts" and my user wants to see it as the average of all the averages."

And to put it bluntly this requirement is invalid. You should never "total" a series of averages by averaging them. The easiest way to explain why this is the case is to illustrate with some data. So let's have a look at a few scenarios.

The first problem you will see with the "average of averages" approach is that it gives too much weight to outlying amounts.

Category    Amount   Count   Average
Bikes        1,000       1     1,000
Helmets     10,000   1,000        10
TOTAL       11,000   1,001       ???

Given the data above, how should we calculate the total average? If we take the "average of averages" approach we have:

(1000 + 10) / 2 = 505

If we take the SUM(Amount) / SUM(Count) approach we get the following:

11000 / 1001 = 10.99

This is an extreme example to prove a point, but which do you think is correct? Should the 1 bike we sold for $1000 skew the average to $505 or should the fact that it was just one product out of 1001 mean that the average should only be $10.99?

Your business user might be happy seeing a higher average amount, but what if the situation was reversed and we had sold 1000 bikes and just one helmet? This would make the "average of averages" still equal 505 while recalculating the average at the total level would give us $999.01 - I know which calculation I think is giving a better indication of the total average sales.

It's possible that you may be thinking at this point that this is not so much of a big deal for you because you don't have that sort of variability in your data. However that is only the start of the issues. If you are still unsure about the evils of averaging averages then read on because it only gets worse.

To show the next nasty side effect we need to look at just a little bit more data. Take the following 4 records for example, where we have data split between 2 cities and 2 product categories.

City        Category   Amount   Count
Melbourne   Bikes          18       3
Melbourne   Helmets        25       5
Seattle     Bikes          21       3
Seattle     Helmets        16       4

When we group the data by City we get the following results. The "Total" line is where the average is recalculated at the total level, whereas the "Avg of Averages" line is where I've taken the average of the 2 City averages.

City              Amount   Count   Average
Melbourne             43       8      5.38
Seattle               37       7      5.29
Avg of Averages                       5.34
Total                 80      15      5.33

Now let's have a look at what happens to the figures when we group the data by the product category. Notice that the Total line has remained unchanged, but the "Avg of Averages" is now different!

Category          Amount   Count   Average
Bikes                 39       6      6.5
Helmets               41       9      4.56
Avg of Averages                       5.53
Total                 80      15      5.33

This sort of behaviour - where the figures reported for total and sub-totals will vary depending on how the data is sliced and diced - will be the death of your BI project.
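If you want to reproduce this instability as a calculation, the "Avg of Averages" line above corresponds roughly to an AVERAGEX over the grouping attribute. The following is only a sketch using the same hypothetical Sales table and the weighted [Average Amount] measure from earlier:

Avg of City Averages := AVERAGEX ( VALUES ( Sales[City] ), [Average Amount] )

The weighted [Average Amount] measure gives the same grand total however the data is sliced, whereas swapping Sales[City] for Sales[Category] in the AVERAGEX version changes the grand total - which is exactly the inconsistency shown in the two tables above.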

Trust - the most important "feature" of any BI project

I would argue that possibly the most important "feature" of any BI solution is trust. You can always add features and missing functionality, but it can be very difficult to win back the trust of your users once it's been lost. And nothing will erode the trust of your users faster than seeing inconsistent results.

It's not just straight Averages that are the issue

Any time you are mixing calculations that do sums and divisions you need to be careful of the order of operations. Ratios, percentages and moving averages are just a few examples of other calculation types where you need to take care of the order in which you add and divide things.
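As a simple illustration, a margin percentage should be built by summing first and dividing last, rather than by averaging row-level ratios (the Profit and Revenue columns here are hypothetical names, not from any specific model):

Margin % := DIVIDE ( SUM ( Sales[Profit] ), SUM ( Sales[Revenue] ) )

Defining the measure this way keeps the ratio correctly weighted at subtotals and grand totals, for the same reasons discussed above.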

Author: "Darren Gosbell" Tags: "DAX, Analysis Services, MDX"
Date: Wednesday, 02 Jul 2014 21:33

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/07/03/the-case-of-the-vanishing-kpis.aspx

I was contacted today with an interesting issue: we had a tabular model with some KPIs which were not showing up in Power View.

The first thing I checked was the version setting on the model. KPI support was not added to tabular models until SP1. If your model is set to a compatibility version of RTM (1100), Power View will detect this and will effectively not ask for metadata about the KPIs.

However in this case, when we checked the database properties from SSMS, the compatibility setting appeared to be correctly set to SP1 (1103).

image

So the next thing I did was to open a profiler trace and look at the metadata queries that Power View executed as it started up. Excel treats SSAS Tabular models as if they were multi-dimensional models and queries the metadata using a number of different DISCOVER queries against different schema rowsets. When SSAS Tabular was developed a new schema rowset was introduced called DISCOVER_CSDL_METADATA which is what DAX clients like Power View use to populate their field browser windows.

Checking the command I could see that it was correctly requesting a version 2.0 recordset. If the model was set to a compatibility setting of RTM (1100) or if there was a problem detecting the compatibility setting of the model you may see a 1.0 in the version restriction. Version 1.0 CSDL will not include KPI information. This is so that client tools can specify the version of metadata which they know how to handle.

image

At this point it looks like Power View and SSAS are correctly talking to each other, but we are still no closer to understanding why the KPIs are visible in Excel, but not in Power View.

The next thing to look at was the actual response returned by the metadata query, to see if there was anything strange in there. To do that I took the RestrictionList and PropertyList elements from the profiler trace and inserted them into the Restrictions and Properties elements in the query below. I also had to remove the LocaleIdentifier and DataSourceInfo elements from the PropertyList as these related to a different session. Below is an example of a DISCOVER_CSDL_METADATA command which can be run from an XMLA window in SSMS.

<Discover xmlns="urn:schemas-microsoft-com:xml-analysis">

    <RequestType>DISCOVER_CSDL_METADATA</RequestType>
   
    <Restrictions>
        <RestrictionList xmlns="urn:schemas-microsoft-com:xml-analysis">
            <CATALOG_NAME>Adventure Works Tabular</CATALOG_NAME>
            <VERSION>2.0</VERSION>
        </RestrictionList>
    </Restrictions>
    <Properties>
        <PropertyList xmlns="urn:schemas-microsoft-com:xml-analysis">
            <Catalog>Adventure Works Tabular</Catalog>
            <Content>SchemaData</Content>
            <Format>Tabular</Format>
        </PropertyList>
    </Properties>
</Discover>

(you can simply replace the 2 references to the Catalog name in the XMLA above to run this against one of your models)

When I searched in the results for "KPI" I came across the following interesting piece of data.

image

Notice the Hidden="true" attribute? It turns out that the original developer decided to hide the measures before creating the KPI which resulted in the KPI itself being hidden. Setting the Hidden property to false on the measure fixed this issue. Mystery solved.

So although the end solution turned out to be something simple, I thought it might be interesting to share the process.

A footnote

Note that we still have one minor issue: in Excel we can now see both the KPI and the measure, while in Power View we only see the KPI. My suspicion is that this may be a bug in the MDSCHEMA_MEASURES rowset which Excel uses to find out what measures a model has. My opinion is that, in order to be consistent with Power View, measures which are used for KPI values should not also be displayed as "normal" measures.

Author: "Darren Gosbell" Tags: "Analysis Services, PowerPivot"
Date: Thursday, 22 May 2014 20:45

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/05/23/bi-survey-14.aspx

It's BI Survey time again :)

If you haven't done this before here is a little background on it from the guys that run it:

The BI Survey, published by BARC, is the world's largest and most comprehensive annual survey of the real world experiences of business intelligence software users. Now in its fourteenth year, The BI Survey regularly attracts around 3000 responses from a global audience. It provides an invaluable resource to companies deciding which software to select and to vendors who want to understand the needs of the market.


The Survey is funded by its readers, not by the participant vendors. As with the previous thirteen editions, no vendors have been involved in any way with the formulation of The BI Survey. Unlike most other surveys, it is not commissioned, sponsored or influenced by vendors.

Here is a link to the survey:

https://digiumenterprise.com/answer/?link=1981-ZYQSEY8B

If you take the survey you will get access to a summary of the results. By helping to promote the survey here I'll get access to some more detailed results, including some country-specific analysis, so it will be interesting to see what comes out of it.

Author: "Darren Gosbell" Tags: "General"
Date: Thursday, 15 May 2014 21:40

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/05/16/function-folding-in-powerquery.aspx

Looking at a typical Power Query query you will notice that it's made up of a number of small steps. As an example, take a look at the query I did in my previous post about joining a fact table to a slowly changing dimension. It was roughly built up of the following steps:

  1. Get all records from the fact table
  2. Get all records from the dimension table
  3. Do an outer join between these two tables on the business key (resulting in an increase in the row count as there are multiple records in the dimension table for each business key)
  4. Filter out the excess rows introduced in step 3
  5. Remove extra columns that are not required in the final result set.

If Power Query was to execute a query like this literally, following the same steps in the same order, it would not be overly efficient, particularly if your two source tables were quite large. However Power Query has a feature called function folding where it can take a number of these small steps and push them down to the data source. The degree of function folding that can be performed depends on the data source. As you might expect, relational data sources like SQL Server, Oracle and Teradata support folding, but so do some of the other sources like OData, Exchange and Active Directory.

To explore how this works I took the data from my previous post and loaded it into a SQL database. Then I converted my Power Query expression to source its data from that database. Below is the resulting Power Query which I edited by hand so that the whole thing can be shown in a single expression:

let
    SqlSource = Sql.Database("localhost", "PowerQueryTest"),
    BU = SqlSource{[Schema="dbo",Item="BU"]}[Data],
    Fact = SqlSource{[Schema="dbo",Item="fact"]}[Data],
    Source = Table.NestedJoin(Fact,{"BU_Code"},BU,{"BU_Code"},"NewColumn"),
    LeftJoin = Table.ExpandTableColumn(Source, "NewColumn"
                                  , {"BU_Key", "StartDate", "EndDate"}
                                  , {"BU_Key", "StartDate", "EndDate"}),
    BetweenFilter = Table.SelectRows(LeftJoin, each (([Date] >= [StartDate]) and ([Date] <= [EndDate])) ),
    RemovedColumns = Table.RemoveColumns(BetweenFilter,{"StartDate", "EndDate"})
in
    RemovedColumns

If the above query was run step by step in a literal fashion you would expect it to run two queries against the SQL database doing "SELECT * …" from both tables. However a profiler trace shows just the following single SQL query:

select [_].[BU_Code],
    [_].[Date],
    [_].[Amount],
    [_].[BU_Key]
from
(
    select [$Outer].[BU_Code],
        [$Outer].[Date],
        [$Outer].[Amount],
        [$Inner].[BU_Key],
        [$Inner].[StartDate],
        [$Inner].[EndDate]
    from [dbo].[fact] as [$Outer]
    left outer join
    (
        select [_].[BU_Key] as [BU_Key],
            [_].[BU_Code] as [BU_Code2],
            [_].[BU_Name] as [BU_Name],
            [_].[StartDate] as [StartDate],
            [_].[EndDate] as [EndDate]
        from [dbo].[BU] as [_]
    ) as [$Inner] on ([$Outer].[BU_Code] = [$Inner].[BU_Code2] or [$Outer].[BU_Code] is null and [$Inner].[BU_Code2] is null)
) as [_]
where [_].[Date] >= [_].[StartDate] and [_].[Date] <= [_].[EndDate]

The resulting query is a little strange; you can probably tell that it was generated programmatically. But if you look closely you'll notice that every single part of the Power Query formula has been pushed down to SQL Server. Power Query itself ends up just constructing the query and passing the results back to Excel; it does not do any of the data transformation steps itself.

So now you can feel a bit more comfortable showing Power Query to your less technical colleagues, knowing that the tool will do its best to fold all the small steps in Power Query down to the most efficient query that it can against the source systems.

Author: "Darren Gosbell" Tags: "Power Query"
Date: Monday, 05 May 2014 11:53

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/05/05/powerquery-ndash-joining-to-a-slowly-changing-dimension.aspx

I blogged previously about how to look up a surrogate key for a slowly changing dimension using DAX. This post is about how to do the same thing using Power Query.

I'm going to start off with the same 2 tables that I used in the previous blog post. One is a fact table and the other is my BU (Business Unit) table. I started by clicking on each table of data in Excel and choosing the "From Table" data source option.

image

And for each table I unchecked the "Load to worksheet" option and then clicked apply & save.

image

Once I had done that for both tables my Power Query tool pane looked like the following: I have two queries defined, but neither of them is loading any data directly.

image

Now that we have our two source queries we want to use the Merge option in Power Query to join them together.

image

The Merge option in Power Query is how you join matching rows in two tables together. I chose "Fact" as my first table because for each row in the Fact table I want to find the matching BU_Key from the BU table.

image

You'll notice that at this point we can only choose columns for an equality match, there are no options for us to test that the Date in Fact is between the StartDate and EndDate in the BU table.

When we click on OK we end up with a result like the following, which has our original rows from the Fact table plus a column called "NewColumn" containing the one or more rows from the BU table that matched on the BU_Code column.

image

If we click on the little double arrow button in the header of the NewColumn column you get the following options:

image

We can choose to either expand or aggregate the rows in the nested table. Because we want to look up the BU_Key we tick that column, as well as the StartDate and EndDate columns, as we will need those later.

That gives us a result like the following:

image

Now we are getting close, but we still have one major issue. We now have 16 rows instead of our original 8 because each row in the Fact table is matching multiple rows in the BU table, as we have not done any filtering based on the start and end dates yet. Clicking on the filter button at the top of the "Date" column, it initially looks like doing a date filter and choosing the "Between" option would be a solution.

image

But that only gives us the option to select fixed date values from our data, not references to another column.

image

One solution would be to put in fixed dates and then manually edit the filter in the formula bar, but I wanted to see how far I could get without resorting to doing any advanced editing. The solution I came up with involved some minor code, but it can be done without manually editing the formula.

What I ended up doing was inserting a new custom column which we can then use to filter out the rows we don't want. So from the "Insert" tab on the ribbon I chose the "Insert Custom Column" option:

image

Then I entered the following expression to create a new column called "DateFilter" which will return a value of True if the Date from the current Fact row was between the StartDate and EndDate from the BU table.

= ( ( [Date] >= [NewColumn.StartDate] ) and ( [Date] <= [NewColumn.EndDate] ) )

image

That gives us the following result:

image

Then to filter down to just the "True" values we just need to click on the dropdown in the header of the "DateFilter" column and select the "TRUE" value in our filter.

image

We are now back to our original 8 rows.

image

Then we just need to do a little clean up. By holding the Ctrl key while clicking on the green columns above we can remove those columns. Then I just renamed "NewColumn.BU_Key" to BU_Key and clicked on the "Date" column and set its type to Date (which somehow did not get detected correctly). We now end up with our finished table, which we could choose to load into Excel or directly into a Power Pivot model.

image

Below is the Power Query formula that was created as a result of the above steps. (This is just the merge query, excluding the 2 source queries for "BU" and "Fact".)

let
    Source = Table.NestedJoin(Fact,{"BU_Code"},BU,{"BU_Code"},"NewColumn"),
    #"Expand NewColumn" = Table.ExpandTableColumn(Source
            , "NewColumn"
            , {"BU_Key", "StartDate", "EndDate"}
            , {"NewColumn.BU_Key", "NewColumn.StartDate", "NewColumn.EndDate"}),
    InsertedCustom = Table.AddColumn(
            #"Expand NewColumn", "DateFilter"
            , each ( ( [Date] >= [NewColumn.StartDate] ) and ( [Date] <= [NewColumn.EndDate] ) )),
    FilteredRows = Table.SelectRows(InsertedCustom, each ([DateFilter] = true)),
    RemovedColumns = Table.RemoveColumns(
            FilteredRows,{"BU_Code", "NewColumn.StartDate", "NewColumn.EndDate", "DateFilter"}),
    RenamedColumns = Table.RenameColumns(RemovedColumns,{{"NewColumn.BU_Key", "BU_Key"}})
in
    RenamedColumns

If you want to tweak things further you can go into the Advanced Editor and manually edit the formula to combine all three queries into one, and you can also do away with the custom column and just do the between filtering inline. The following query shows the single query solution.

let
    Fact1 = Excel.CurrentWorkbook(){[Name="Fact"]}[Content],
    BU1 = Excel.CurrentWorkbook(){[Name="BU"]}[Content],
    Join = Table.NestedJoin(Fact1,{"BU_Code"},BU1,{"BU_Code"},"NewColumn"),
    #"Expand NewColumn" = Table.ExpandTableColumn(Join
            , "NewColumn"
            , {"BU_Key", "StartDate", "EndDate"}
            , {"NewColumn.BU_Key", "NewColumn.StartDate", "NewColumn.EndDate"}),
    FilteredRows = Table.SelectRows(#"Expand NewColumn"
            , each ( ( [Date] >= [NewColumn.StartDate] ) and ( [Date] <= [NewColumn.EndDate] ) )),

    RemovedColumns = Table.RemoveColumns(
            FilteredRows,{"BU_Code", "NewColumn.StartDate", "NewColumn.EndDate"}),
    RenamedColumns = Table.RenameColumns(RemovedColumns,{{"NewColumn.BU_Key", "BU_Key"}}),
    ChangedType = Table.TransformColumnTypes(RenamedColumns,{{"Date", type datetime}})
in
    ChangedType

If you are curious you can download the workbook I used for this blog post from my OneDrive: http://1drv.ms/1mux5O5

Author: "Darren Gosbell" Tags: "Power Query"
Date: Tuesday, 22 Apr 2014 13:20

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/04/22/implementing-column-security-with-ssas-tabular-and-dax.aspx

Out of the box Analysis Services (both Tabular and Multi-dimensional) has great support for horizontal or row based security. An example of this is where you would give User1 access to all data where the Country is “Australia” and give User2 access to all data where the country = “United States”. This covers a large percentage of the security requirements that most people have.

But neither technology has great support for vertical or column based security. This sort of requirement is most common in privacy scenarios. One example of this would be a model with medical data. It may be acceptable to show all your users demographic data such as the state they live in or their gender. But only a specific subset of users should have access to see individual patient details such as their name or phone number.

One approach would be to simply create 2 models, one with the secure information and one without. While this works, it doubles your processing time, doubles any maintenance activities and takes up double the storage.

Looking at the features in SSAS you may be tempted to try using perspectives. At first glance they appear to do what we want - allowing us to hide a subset of columns. But perspectives are not a security feature. All they do is to show a subset of the metadata to the user, but the user still has to have access to the full model and the hidden columns are still fully query-able from DAX and MDX. Trying to use perspectives for security is like putting a "Keep Out" sign on your front door, but then not actually locking it…

To explore this issue further I created a very simple database in SQL Server which has a Patient table and a FactPatient table which look like the following:

image

What I want to do is to create a model where only specific people can see the PatientName column. So because we can't restrict access to specific columns in a single table, I created 2 views over the Patient table - vPatient, which has every column except the PatientName, and vPatientSensitive, which has the PatientID and PatientName.

image

At this point I then created a tabular model bringing in FactPatient, vPatient and vPatientSensitive.

If you create your relationships in the default manner you will end up with something like the following:

image

This works great for the role which has access to the sensitive information, but if you create a role which does not give access to any of the rows in vPatientSensitive, these users can't see any data.

The reason for this is that the filter context flows down through the chain of one to many relationships.

image

So if a role has no access to any rows in vPatientSensitive, this flows through the relationships to also filter vPatient and FactPatient resulting in this role not being able to see any data.

Because the relationship between vPatient and vPatientSensitive is actually a 1:1 we can reverse the direction of the relationship as per the following diagram:

image

Now we are getting closer. Our secured role works again, but we've now introduced a problem with our role that has full access. When they browse the data they see the following with the same amounts repeated for every patient name.

image

If you take another look at our relationships you'll notice that it now looks like a many to many relationship. And there is a well established pattern for dealing with many to many relationships using CALCULATE( <expression>, <intermediate table> ).

So we could try something like CALCULATE( SUM( FactPatient[Amount] ), vPatientSensitive ) - however we can't just use this expression on its own because, if vPatientSensitive is restricted, we will be back to our original scenario where restricted people can't see any data. So we need to check whether the current user has access to the sensitive data before applying this expression. We can do this with COUNTROWS( ALL( vPatientSensitive ) ).

Then our full expression for a measure over the FactPatient[Amount] column becomes:

Total Amount :=
IF (
    COUNTROWS ( ALL ( vPatientSensitive ) ) > 0,
    CALCULATE ( SUM ( FactPatient[Amount] ), vPatientSensitive ),
    SUM ( FactPatient[Amount] )
)

To test this design I set up a number of different roles.

The FullAccess role has no filters applied on any tables.

image

and can see all the patient data including the PatientName.

image

The NoSensitive role can see all the facts, but cannot see any columns from the vPatientSensitive table

image

So when they run the same query as the FullAccess role all they see is the following where the PatientName column from vPatientSensitive only shows blank values:

image

It's also possible to mix and match this approach with standard row based security. So we could limit a role to only seeing data from a particular state and also give them access to the sensitive data:

image

image

Or we could restrict the role to a particular state and deny access to the sensitive information:

image

image
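For reference, the row filters behind roles like the ones shown above are just DAX boolean expressions defined per table. The sketch below is an assumption about the role definitions, not a copy of them: it assumes a State column on vPatient and uses a made-up state value, and it denies access to every row of vPatientSensitive.

vPatient:           =vPatient[State] = "Victoria"
vPatientSensitive:  =FALSE()

With the FALSE() filter in place, COUNTROWS ( ALL ( vPatientSensitive ) ) returns 0 for that role, so the Total Amount measure above falls back to the plain SUM and the role can still see its fact data.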

If you want to have a play with this solution yourself I've uploaded both the tabular project and a T-SQL script which will build the source database to a folder on my OneDrive.

Author: "Darren Gosbell" Tags: "DAX, Analysis Services"
Date: Wednesday, 16 Apr 2014 20:19

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/04/17/running-mdx-studio-against-sql-2012.aspx

Even though MDX Studio has not been updated since SQL 2008 it’s still a fantastic tool for working with MDX. However if you have only installed SQL 2012 (or later) on your machine then you may get errors like the following:

System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.AnalysisServices, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' or one of its dependencies. The system cannot find the file specified.
File name: 'Microsoft.AnalysisServices, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91'

There are two ways to address this issue:

1. Install either the SQL 2008 or SQL 2008 R2 version of AMO (which is part of the SQL Server feature pack)

2. Configure assembly redirection via a config file.

You can download a copy of the mdxstudio.exe.config file from my OneDrive or save the following xml to a file of that name (this file needs to be in the same folder as the MDXStudio.exe file). This file redirects the 2008 / 2008 R2 version of Microsoft.AnalysisServices.dll (v10.0.0.0) to the SQL 2012 version (v11.0.0.0). To redirect to later versions it's just a matter of changing the newVersion attribute (assuming that the new library is backward compatible).

<configuration>
<runtime>
  <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
   <dependentAssembly>
    <assemblyIdentity name="Microsoft.AnalysisServices" culture="neutral" publicKeyToken="89845dcd8080cc91" />
    <bindingRedirect oldVersion="10.0.0.0" newVersion="11.0.0.0" />
   </dependentAssembly>
   <dependentAssembly>
    <assemblyIdentity name="Microsoft.AnalysisServices.AdomdClient" culture="neutral" publicKeyToken="89845dcd8080cc91" />
    <bindingRedirect oldVersion="10.0.0.0" newVersion="11.0.0.0" />
   </dependentAssembly>
  </assemblyBinding>
</runtime>
</configuration>

Author: "Darren Gosbell"
Date: Tuesday, 08 Apr 2014 21:18

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/04/09/dax-ndash-joining-to-a-slowly-changing-dimension.aspx

The following is one of the scenarios that I showed during my “Drop your DAX” talk at SQL Saturday #296 in Melbourne.

Currently SSAS Tabular and PowerPivot models can only have a relationship based on a single column. So what do you do when you need to join based on multiple columns?

Ideally you would solve this during your ETL. With a type 2 slowly changing dimension you typically want to insert the surrogate key for the dimension into the fact table. As you may know, "type 2" dimensions can have one or more records for a given business key. These records will have different effective start and end dates. To find the correct surrogate key you have to match on the business key and a date, where the date is between the effective start and end dates.

This is pretty easy to do in SQL, but what do you do if your two tables are not only on separate servers, but also on different types of servers? I had this situation recently. We were building a quick proof of concept and we had an existing type 2 dimension in Teradata, but the fact table was coming from an Oracle server. One option would have been to stage both sets of data locally and do the join there, but that was going to take time and resources and this was just meant to be a quick proof-of-concept.

So is there a way of fixing this in the model using DAX?

To demonstrate this I’m going to work with the following simplified dimension:

image

And the following simple fact table:

image

What I wanted to do was to try to create a calculated column in the fact table to look up the surrogate key from the dimension table. Creating a filter statement to find the matching records was not too hard.

FILTER(BU,Fact[BU_Code] = BU[BU_Code] && Fact[Date] >= BU[StartDate] && Fact[Date] < BU[EndDate])

But FILTER returns a table, not a scalar value, so you can't use it in a calculated column. My first attempt involved using the powerful CALCULATE function to return a single value from the BU_Key column.

= CALCULATE (
    VALUES ( BU[BU_Key] ),
    FILTER (
        BU,
        'Fact'[BU_Code] = BU[BU_Code]
            && 'Fact'[Date] >= BU[StartDate]
            && 'Fact'[Date] < BU[EndDate]
    )
)

This produced the results I wanted, but when I tried to create a relationship against this column I got the following error:

image

Sounds a bit strange, doesn't it? The issue here is that CALCULATE() transforms the row context into a filter context in order to evaluate the expression, and relationships influence the filter context. Because CALCULATE() needs to be able to follow all the relationships in order to return a result, you can't create a relationship over a column that uses CALCULATE(). That's where the circular dependency comes from.

The solution to this is to switch the expression to use one of the iterative functions like MINX or MAXX. I’ve used MINX here, but it does not matter which one you use as the FILTER should only return a single row.

=MINX (
    FILTER (
        BU,
        'Fact'[BU_Code] = BU[BU_Code]
            && 'Fact'[Date] >= BU[StartDate]
            && 'Fact'[Date] < BU[EndDate]
    ),
    BU[BU_Key]
)

Now we can create a relationship over our calculated column.

However it’s not all chocolates and roses…

While this is a handy technique to have in your tool chest for quick prototypes, there are some downsides that you should be aware of.

Tabular processing happens in 3 phases:

1. Data acquisition

2. Data compression

3. Evaluation of calculations

You’ll notice 2 issues here. Firstly adding calculated columns will increase your processing time. Secondly calculated columns are not compressed which will result in slower scan speeds. So this technique will have a negative impact on both processing and query performance. For a long term solution it probably better to look at an ETL based solution.

Author: "Darren Gosbell" Tags: "Analysis Services, DAX, PowerPivot"
Date: Sunday, 30 Mar 2014 20:04

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/03/31/how-to-build-your-own-ssas-resource-governor-with-powershell.aspx

A few weeks ago I posted a way to manually find and kill long running SSAS queries. In this post I’ll take it a step further and show you an automated solution. The idea behind this technique is inspired by a blog post Chris Webb did some years ago. Chris implemented his solution using SSIS while this version uses PowerShell.

You might ask - why create a PowerShell version?

Well, it does a little bit more in that it logs the cancelled queries and sends the user an email. It also uses membership in an Active Directory group to control who bypasses the governor, which makes that list easy to maintain separately from the script. Scripts are easy to view and change as all you need is a text editor. I also think that there is probably less overhead in running a PowerShell script compared to starting an SSIS package.

This script works with both Tabular and Multi-dimensional instances. And it could easily be extended to monitor multiple instances. 

The basic operation of the resource governor script is as follows:

1. Query $system.DISCOVER_COMMANDS for any commands that have been running longer than x minutes or seconds (excluding any XMLA commands such as processing, deploying or tracing).

2. Look up the session for these commands in $system.DISCOVER_SESSIONS to find out the LoginName and the database they are querying 

3. Check if the user is in the bypass list for the Resource Governor. If they are, log this fact and exit. Otherwise cancel the query, log the details of it to a table in SQL Server and send them a nice email.

This script then gets run every minute during business hours by a SQL Agent job.

Pre-requisites

Before running this script you need to download invoke-sqlcmd2 from the TechNet Gallery. It's a nice lightweight way of running SQL commands from PowerShell.

And to use the script as is you will need to use the following script to create some logging tables in a SQL database and create an SQL Agent job.

Conclusion

If you are interested in running this script on one of your servers you might want to consider commenting out the bits that cancel the query and send the user an email. Then just let it log the actions it would have taken. Once you start cancelling queries you'll want to monitor the log tables. In some cases you will discover opportunities to improve your cubes. In other cases you will be able to assist the users with a better way to achieve their desired result.

Below is the full script, but you can also download it from here

All the key variables are declared at the top of the script. You should just need to change these to suit your environment in order to use this script.

 <#
.SYNOPSIS
    Resource Governor script for Microsoft SQL Server Analysis Services
.DESCRIPTION
    Automatically cancels queries that have been running longer than the maximum allowed time
.NOTES
    Author : Darren Gosbell (http://darren.gosbell.com)
    Date   : 9 Mar 2014
    Idea from - http://cwebbbi.wordpress.com/2008/12/04/killing-sessions-automatically-with-ssis/
.LINK
    http://darren.gosbell.com/
#>

##### initialization variables #####

$servers = "localhost\tabular" ,"localhost\multidim"
$threshold = 300   # in seconds

$sqlInstance = "localhost"
$bypassADGroup = "CN=ImportantGroup,OU=Distribution Lists,OU=Shared Mailboxes,DC=mycompany,DC=com"

$cancelEmailSubject = "Analysis Services Long Running Query Cancellation"
$cancelEmailFrom = "SSAS.Server@mycompany.com"
$cancelEmailBcc = "ssas-support@mycompany.com"
$cancelEmailServer = "mail.mycompany.com"
$supportEmail = "ssas-support@mycompany.com"

####################################

# load the AMO library
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices.adomdclient") > $null

# load the invoke-sqlcmd2 cmdlet
. $PSScriptRoot\invoke-sqlcmd2.ps1

## ============ Start Helper Functions =================
Function Send-Email( $to, $subject, $body )
{
    $emailFrom = $cancelEmailFrom
    $bcc = $cancelEmailBcc
    $smtpServer = $cancelEmailServer
    
    $msg = new-object Net.Mail.MailMessage
    $msg.From = $emailFrom
    $msg.To.Add($to)
    $msg.Bcc.Add($bcc)
    $msg.Subject = $subject
    $msg.Body = $body
    
    $smtp = new-object Net.Mail.SmtpClient($smtpServer)
    $smtp.Send($msg)
    #$smtp.Send($emailFrom, $emailTo, $subject, $body)
}

foreach ($svr in $servers)
{
    $connStr = "data source=$svr"
    [Microsoft.AnalysisServices.adomdclient.adomdconnection]$cnn = new-object Microsoft.AnalysisServices.adomdclient.adomdconnection($connStr)
    $cmd = new-Object Microsoft.AnalysisServices.AdomdClient.AdomdCommand
    $cmd.Connection = $cnn

    $qryLongCmd = @"
    SELECT SESSION_SPID, COMMAND_ELAPSED_TIME_MS, COMMAND_TEXT ,COMMAND_START_TIME
    FROM `$system.discover_commands
    WHERE COMMAND_ELAPSED_TIME_MS > $($threshold * 1000) AND [COMMAND_END_TIME] = NULL
"@

    $qrySessions = @"
    select 
    SESSION_SPID
    ,SESSION_USER_NAME
    ,SESSION_CURRENT_DATABASE
    ,SESSION_USED_MEMORY
    from `$system.discover_sessions
"@

    $cnn.Open()

    # get a list of current commands that exceeded the time threshold
    $cmd.CommandText = $qryLongCmd
    $da = new-Object Microsoft.AnalysisServices.AdomdClient.AdomdDataAdapter($cmd)
    $dsCmd = new-Object System.Data.DataSet
    $da.Fill($dsCmd) > $null
    # filter out any xmla commands that start with '<'
    $drCmd = $dsCmd.Tables[0].rows | where {$_.COMMAND_TEXT.StartsWith("<") -eq $false }

    if (@($drCmd).Count -eq 0)
    {
        write-host "no excessive queries found"
        $cnn.Close()
        continue
    }

    # get a list of the current sessions
    $cmd.CommandText = $qrySessions
    $da = new-Object Microsoft.AnalysisServices.AdomdClient.AdomdDataAdapter($cmd)
    $dsSess = new-Object System.Data.DataSet
    $da.Fill($dsSess) > $null

    # Lookup the session information for each long running command
    foreach ($c in $drCmd) 
    {
        $s = $dsSess.Tables[0].select("SESSION_SPID = $($c.SESSION_SPID)");
        $c |Add-Member -NotePropertyName "SESSION_USER_NAME" -NotePropertyValue $s.SESSION_USER_NAME ;
        $c |Add-Member -NotePropertyName "SESSION_CURRENT_DATABASE" -NotePropertyValue $s.SESSION_CURRENT_DATABASE ;             
        $c |Add-Member -NotePropertyName "COMMAND_ELAPSED_TIME" -NotePropertyValue $([System.Timespan]::FromMilliseconds($c.COMMAND_ELAPSED_TIME_MS))
        $user = $s.SESSION_USER_NAME.Replace("ACCOUNT-01\","")
        $srchr = (New-Object DirectoryServices.DirectorySearcher "(&(ObjectClass=user)(Name=$user))")
        $srchr.PropertiesToLoad.Add("mail") > $null
        $srchr.PropertiesToLoad.Add("memberof") > $null
        $ad = $srchr.FindOne()
        $InPriorityGroup = $ad.Properties["memberof"]  -Contains $bypassADGroup
        $c |Add-Member -NotePropertyName "InPriorityGroup" -NotePropertyValue $InPriorityGroup
        $c |Add-Member -NotePropertyName "Email" -NotePropertyValue $($ad.Properties["mail"])
    }

    # kill any sessions that were returned 
    foreach ($spid in $drCmd)
    {
        if ($spid.InPriorityGroup -eq $true)
        {
            write-output "Bypassing SPID: $($spid.SESSION_SPID) as it is in the Workload Priorty Group"
            $bypassLogCmd = "INSERT INTO tb_WorkloadBypassLog (UserLogin, CommandDurationMS, SessionDatabase, SPID) VALUES ('$($spid.SESSION_USER_NAME)', '$($spid.COMMAND_ELAPSED_TIME_MS)', '$($spid.SESSION_CURRENT_DATABASE)', '$($spid.SESSION_SPID)')"
            invoke-sqlcmd2 -ServerInstance $sqlInstance -Database "OlapTrace" -Query $bypassLogCmd -As None > $null
            continue
        }
        
        $eml = $spid.Email 
        write-progress "Cancelling SPID $($spid.SESSION_SPID)"
        
        # log the Cancellation attempt
        $qry = $spid.COMMAND_TEXT.Replace("'","''")
        $insertCmd = "INSERT INTO tb_CancelledQueries (UserLogin, CommandStart, CommandDurationMS, SessionDatabase, Query, Email) VALUES ('$($spid.SESSION_USER_NAME)', '$($spid.COMMAND_START_TIME)', '$($spid.COMMAND_ELAPSED_TIME_MS)', '$($spid.SESSION_CURRENT_DATABASE)', '$qry', '$($spid.Email)')"

        # Send email notification to end user
        $msg = @"
Your query against the '$($spid.SESSION_CURRENT_DATABASE)' Analysis Services database has consumed excessive resources
and has been marked for cancellation by the workload management system.

Please wait a few minutes or try requesting a smaller set of data.

For assistance with structuring efficient queries or advice about resource management, 
please forward this email and a brief outline of what you are trying to achieve to 
$supportEmail
"@
        # if we have an email registered in AD send the user a notification
        if ($ad.Properties.Contains("mail"))
        {
            Send-Email $eml $cancelEmailSubject $msg
        }
        
        # cancel the query
        $cmd.CommandText = "<Cancel xmlns=`"http://schemas.microsoft.com/analysisservices/2003/engine`"><SPID>$($spid.SESSION_SPID)</SPID></Cancel>"
        $cmd.ExecuteNonQuery()
        
        # log the cancellation
        invoke-sqlcmd2 -ServerInstance $sqlInstance -Database "OlapTrace" -Query $insertCmd -As None > $null
    
    }

    $cnn.Close()
}

Author: "Darren Gosbell" Tags: "PowerShell, Analysis Services"
Date: Sunday, 23 Mar 2014 11:11

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/03/23/extending-the-powerquery-date-table-generator-to-include-iso-weeks.aspx

Chris Webb and Matt Mason have both blogged about formulas for generating a date table using Power Query, but both of these posts focus on the standard year-month-day calendar. I've been doing a little work with some week based calculations and thought I would see how hard it would be to extend this sort of approach to generate some columns for a week hierarchy.

The ISO Week standard is part of ISO 8601 and defines a week as starting on Monday and ending on Sunday. That in itself is not very hard. The tricky bit comes into play when you go to assign each week to a year. Because weeks don’t fit evenly into years you need to either move some days from the end of December forward or move a few days of January back to the prior year.

The way to do this is as follows:

  • Find the week that contains January 4 as that is always the first week of the year.
  • If Jan 4 is before Thursday then any January days prior to Monday are allocated to the previous year.
  • If Jan 4 is after Thursday then any December days at the start of the week are treated as being part of the current year.


I’ve also taken this a step further and created a small inline function that figures out the current 4-4-5 period and quarter that a given week falls into. I’m using a function which returns a record to return both the period and quarter from the one function which I think is pretty cool.

The following is an extension of Matt Mason's method. He has some great screen shots of how to use the function, so if you have not seen that post it's definitely worth checking out.

Basically you start a new blank query, switch to the advanced mode and then paste in the following and invoke it:

let CreateDateTable = (StartDate as date, EndDate as date, optional Culture as nullable text) as table =>
  let
    DayCount = Duration.Days(Duration.From(EndDate - StartDate)),
    Source = List.Dates(StartDate,DayCount,#duration(1,0,0,0)),
    TableFromList = Table.FromList(Source, Splitter.SplitByNothing()),   
    ChangedType = Table.TransformColumnTypes(TableFromList,{{"Column1", type date}}),
    RenamedColumns = Table.RenameColumns(ChangedType,{{"Column1", "Date"}}),
    InsertYear = Table.AddColumn(RenamedColumns, "Year", each Date.Year([Date])),
    InsertQuarter = Table.AddColumn(InsertYear, "QuarterOfYear", each Date.QuarterOfYear([Date])),
    InsertMonth = Table.AddColumn(InsertQuarter, "MonthOfYear", each Date.Month([Date])),
    InsertDay = Table.AddColumn(InsertMonth, "DayOfMonth", each Date.Day([Date])),
    InsertDayInt = Table.AddColumn(InsertDay, "DateInt", each [Year] * 10000 + [MonthOfYear] * 100 + [DayOfMonth]),
    InsertMonthName = Table.AddColumn(InsertDayInt, "MonthName", each Date.ToText([Date], "MMMM", Culture), type text),
    InsertCalendarMonth = Table.AddColumn(InsertMonthName, "MonthInCalendar", each (try(Text.Range([MonthName],0,3)) otherwise [MonthName]) & " " & Number.ToText([Year])),
    InsertCalendarQtr = Table.AddColumn(InsertCalendarMonth, "QuarterInCalendar", each "Q" & Number.ToText([QuarterOfYear]) & " " & Number.ToText([Year])),
    InsertDayWeek = Table.AddColumn(InsertCalendarQtr, "DayInWeek", each Date.DayOfWeek([Date],1)+1),
    InsertDayName = Table.AddColumn(InsertDayWeek, "DayOfWeekName", each Date.ToText([Date], "dddd", Culture), type text),
    InsertWeekEnding = Table.AddColumn(InsertDayName, "WeekEndingFriday", each Date.EndOfWeek([Date],6), type date),   
    InsertCurrentThursday = Table.AddColumn(InsertWeekEnding, "CurrentThursday", each Date.AddDays([Date], -Date.DayOfWeek([Date],1) + 3), type date),
    InsertISOWeekJan4 = Table.AddColumn(InsertCurrentThursday, "ISOWeekJan4", each Date.FromText(Number.ToText(Date.Year([CurrentThursday])) & "-01-04") ,type date),
    InsertISOWeekYear = Table.AddColumn(InsertISOWeekJan4, "ISOWeekYear", each Date.Year([CurrentThursday])) ,  
    InsertISOWeekFirstMon = Table.AddColumn(InsertISOWeekYear, "ISOWeekFirstMon", each
        if [CurrentThursday] < [ISOWeekJan4]
        then Date.AddDays([CurrentThursday],-3)
        else Date.AddDays([ISOWeekJan4], - Date.DayOfWeek([ISOWeekJan4],1) )
      ,type date),
    InsertISOWeekNum = Table.AddColumn(InsertISOWeekFirstMon, "ISOWeekNum", each Number.RoundUp(((Duration.Days(Duration.From([Date] - [ISOWeekFirstMon]))+1) /7 )), type number),
    InsertISOWeekID = Table.AddColumn(InsertISOWeekNum, "ISOWeekID", each [ISOWeekYear] * 100 + [ISOWeekNum], type number),
    InsertISOWeekName = Table.AddColumn(InsertISOWeekID, "ISOWeekName", each Text.From([ISOWeekYear]) & "W" & Text.End( "0" & Text.From(([ISOWeekNum]*10)  + [DayInWeek]),3)),
    InsertISOWeekNameLong = Table.AddColumn(InsertISOWeekName, "ISOWeekNameLong", each Text.From([ISOWeekYear]) & "-W" & Text.End( "0" & Text.From([ISOWeekNum]),2) & "-" & Text.From([DayInWeek])),

    fnPeriod445a = (weekNum) => let
      Periods =
        {
            {(x)=>x<5,  [P=1,Q=1]},
            {(x)=>x<9,  [P=2,Q=1]},
            {(x)=>x<14, [P=3,Q=1]},
            {(x)=>x<18, [P=4,Q=2]},
            {(x)=>x<22, [P=5,Q=2]},
            {(x)=>x<27, [P=6,Q=2]},
            {(x)=>x<31, [P=7,Q=3]},
            {(x)=>x<35, [P=8,Q=3]},
            {(x)=>x<40, [P=9,Q=3]},
            {(x)=>x<44, [P=10,Q=4]},
            {(x)=>x<48, [P=11,Q=4]},
            {(x)=>true, [P=12,Q=4]}
        },
      Result = List.First(List.Select(Periods, each _{0}(weekNum))){1}
    in
      Result,

    InsertPeriod445 = Table.AddColumn(InsertISOWeekNameLong, "Period445Record", each fnPeriod445a([ISOWeekNum])),
    ExpandPeriod445 = Table.ExpandRecordColumn(InsertPeriod445, "Period445Record", {"P","Q" }, {"Period445", "Quarter445"}),
    RemovedColumns = Table.RemoveColumns(ExpandPeriod445,{"CurrentThursday", "ISOWeekFirstMon"})
  in
    RemovedColumns
in
    CreateDateTable

Author: "Darren Gosbell"
Date: Monday, 10 Mar 2014 08:47

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/03/10/sql-saturday-melbourne-ndash-5-april-2014.aspx

I’m pretty excited to be part of the first SQL Saturday to be held in my home town of Melbourne. It’s being held on April 5, 2014 at the Monash University Caulfield Campus. If you are interested to know more about the event you can find out more here. I think there may still be places open if you would like to go, but have not registered yet. It’s a great place to meet up with like minded people and learn about SQL Server from some of the best and brightest in the industry – oh and I’ll be there too Smile

Author: "Darren Gosbell"
Date: Sunday, 02 Mar 2014 20:20

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/03/03/ssas-ndash-listing-active-queries-with-powershell.aspx

Recently one of our production Tabular servers was being hit with a number of extremely large queries that were causing excessive load and impacting the performance for other users. These queries were pretty much attempts at extracting detail level information. Some of these were due to the fact that some of the users had their own alternate hierarchy which we had not been informed of, so we were able to extend the cube design to rectify this. Other users were trying to build 10+ page reports in Excel and were trying to get everything in a single query and we were able to educate these users to use many smaller pivot tables rather than one massive one. One of the biggest single queries during this time ran for 53 minutes and returned somewhere in the vicinity of 190,000 rows!

At the time this was occurring we needed to find some way of stopping these queries from hogging all the resources.  I was able to see these queries by running a “SELECT * FROM $SYSTEM.DISCOVER_COMMANDS” query and if they had been running for longer than 10 minutes (that’s right, over 10 minutes for a tabular query) I could manually execute a Cancel command using XMLA, but this soon becomes tedious and potentially error prone.

So I developed a little PowerShell script which shows all the active queries with their elapsed time, the query that was executed and the user who is running it. It also shows the elapsed time as a formatted string in hh:mm:ss.nn format and sorts the results so that the longest running queries are at the top.

But this script is not display only. If you select one or more rows (you can select multiple rows with the usual Ctrl / Shift click mechanisms) and then click the OK button, the script will issue cancel commands for each of the selected sessions.

image

I’ve attached a link to the full script below, in a future post I’ll show how I took the information that I learned from this script and built it into a completely automated solution.

There are 2 variables at the top of the script that you will need to change in order to run this in your environment. One is the name of the SSAS instance to monitor. The other is the minimum elapsed time threshold (which is currently set to 0 so that it shows all queries). If there are no active queries over the threshold time the script ends quietly.

You can download the full script from my OneDrive here

Update: Note that although the blog post talks about an issue with a Tabular instance, this script will work against both Tabular and Multidimensional instances.

Author: "Darren Gosbell" Tags: "Analysis Services, PowerShell"
Date: Tuesday, 18 Feb 2014 19:45

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2014/02/19/executing-an-mdx-query-with-xmla.aspx

This question comes up very occasionally and I have some sample scripts kicking around, but it always takes me a few minutes to find them, so I figured why not post a short sample here and let the magic of Google/Bing help me out next time.

I actually had a need for this last week: a colleague in another team was attempting to set up some tests in cucumber (which uses Ruby) using the Olap4r library. By looking at the results returned from an XMLA command he was able to see the "structure" of the cellset that was returned, which gave him enough hints about how the results are structured to get the tests working.

It’s also interesting (at least to me) to see the difference in the size of the response when you request a Cellset (which is the default for most OLAP clients like Excel) compared to a Flattened response (which is what you get when using the ADO connection in Reporting Services). You can see the “flattened” version of the response by taking out the commented section in the XMLA properties below.

Note that this format is not how the response normally travels over the “wire” to the client, see Mosha’s classic post for more detail about how the XMLA responses are transmitted over TCP/IP.

<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
<Body>
    <Execute xmlns="urn:schemas-microsoft-com:xml-analysis">
        <Command>
            <Statement>
                SELECT [Measures].[Internet Sales Amount] ON 0,
                [Product].[Product Category].[Product Category].Members ON 1
                FROM [Adventure Works]

            </Statement>
        </Command>

        <Properties>
            <PropertyList>
                <Catalog>Adventure Works</Catalog>

<!--
                <Format>Tabular</Format>
                <Content>Data</Content>

-->
            </PropertyList>
        </Properties>

    </Execute>
</Body>
</Envelope>

Author: "Darren Gosbell"
Date: Saturday, 04 May 2013 02:45

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2013/05/04/new-release---dax-studio-1.2-with-excel-2013-support.aspx

It’s been a long while coming, but the next release of DAX Studio is out. If you already have DAX Studio running in Excel 2010 you will not see anything new in this release, this release is all about getting Excel 2013 support and there has been quite a bit or work gone into the back-end to deal with some major changes to the way PowerPivot works inside Excel. We’ve also been doing some re-architecting with a view to possibly being able to support a standalone version that will run outside of Excel.

A big thanks to Greg Galloway for his assistance with the code for supporting PowerPivot in Excel 2013.

You can get the latest DAX Studio release from here https://daxstudio.codeplex.com/releases/

Below is a copy of the release notes

This version adds support for installing the addin into Excel 2013 including support for connecting to PowerPivot models.
Note: the Query Table feature is not supported against Excel 2013 PowerPivot models. This feature still works in Excel 2010 or when connected to a Tabular Server, but it was never officially supported by Microsoft, has proved to be unstable in Excel 2013, and has been disabled as it can cause occasional model corruption.
However Excel 2013 does support this method of doing the same thing manually. And we are trying to see if we can hook into this mechanism to re-enable this functionality in a future release.

Author: "Darren Gosbell"
Date: Friday, 01 Feb 2013 21:52

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2013/02/02/ssas-clearing-the-mdx-script-for-a-session-in-ssms.aspx

Sometimes when troubleshooting performance issues you may want to comment out the MDX Script in your cube in order to quickly isolate whether the MDX Script is a significant contributor to the issue. So if you can reproduce your issue in your dev environment you can open up your project in BIDS / SSDT, comment out the script and re-deploy. But what happens if you can't reproduce the issue against dev? You may not have the same server specs, or the same data volumes, or you may even have design changes in dev that have not yet been deployed that would cloud the issue.

There is a solution to this using the connection technique from my previous post. If you are a server or database administrator you can use the CLEAR CALCULATIONS command. This command effectively clears the MDX script for your current session. So what I usually do is to run the following:

CLEAR CALCULATIONS;
GO
CALCULATE;

As noted by joschko in the comments on the previous post, you can use the GO keyword between statements to run multiple statements at a time in SSMS. Personally I don't tend to use this with SCOPE statements: "GO" is not an MDX keyword, it is only understood by SSMS, so you can't copy and paste it into your MDX Script, and I like to keep the SCOPEs that I'm testing in a form that I can paste directly into the cube script.

Author: "Darren Gosbell" Tags: "MDX, Analysis Services"
Date: Wednesday, 19 Dec 2012 07:52

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2012/12/19/ssas-executing-mdx-scope-statements-using-ssms.aspx

So usually when I want to test an MDX expression in SSMS I'll write a query with a "WITH MEMBER…" clause to create a query-scoped calculated measure, something like the sketch below. But sometimes you may want to test a scoped assignment before putting it in your cube script, and the steps after the sketch show you how to do this.
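For reference, a query-scoped calculated measure looks something like this (a minimal sketch assuming the Adventure Works sample cube; the measure name and expression are just examples):

WITH MEMBER [Measures].[Test Measure] AS
    [Measures].[Internet Sales Amount] * 2
SELECT [Measures].[Test Measure] ON 0,
    [Product].[Product Category].[Product Category].Members ON 1
FROM [Adventure Works]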

1. Click on the button to open a new MDX window, enter your server name and then click on the "Options >>" button


2. You must then specify the database that you are using


3. Then under the "Additional Connection Properties" tab you need to add a "Cube=" parameter with the name of the cube that you wish to apply the scope to.
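For example, with the Adventure Works sample database (where the cube is also called Adventure Works) the entry would look something like this:

Cube=Adventure Works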


So we can now write a normal MDX query against our cube to test that everything is working correctly.

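For example, something along these lines (a sketch assuming the Adventure Works sample cube and the [Sales Amount] measure that is used in the scope test below):

SELECT [Measures].[Sales Amount] ON 0,
    [Product].[Product Category].[Product Category].Members ON 1
FROM [Adventure Works]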

Then we can execute our SCOPE statements. Note that SSAS will only let you execute one statement at a time, so each of the three statements below has to be executed separately (a statement can span multiple lines, but each statement is terminated by a semi-colon). The following scope simply overrides the value of the Sales Amount measure to always return a value of 1. While this is not something that you would do in a real world scenario, it makes it very easy to see whether our scope has been applied.

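Based on the description above, the three statements would look something like this (assuming the measure is named [Sales Amount]):

SCOPE([Measures].[Sales Amount]);  // run this first
    THIS = 1;                      // then this assignment
END SCOPE;                         // and finally this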

After running the 3 separate statements one at a time we can then run our original query again to see the effects of our scope assignment.


Note that this is a session-based assignment which will only be visible to queries run from the same session. As soon as you close this session the temporary assignment will disappear. If you open another MDX window or connect using Excel you will not see the effects of the temporary scope statement unless you use the "SessionID" parameter on the connection string (or in the "Additional Connection Parameters" for SSMS).

You can find the SessionID by running "SELECT * FROM $SYSTEM.DISCOVER_SESSIONS" and looking at the SESSION_ID column.
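For example, a client connection string along these lines should join the existing session (the server and database names here are just placeholders, and the SessionID value comes from the query above):

Data Source=MyServer;Initial Catalog=Adventure Works DW;SessionID=<SESSION_ID value from DISCOVER_SESSIONS>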

Author: "Darren Gosbell" Tags: "Analysis Services"
Date: Tuesday, 20 Mar 2012 19:47

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2012/03/21/bids-helper-1.6-beta-release-now-with-sql-2012-support.aspx

The beta for BIDS Helper 1.6 was just released. We have not updated the version notification just yet as we would like to get some feedback on people's experiences with the SQL 2012 version. So if you are using SQL 2012, go grab it and let us know how you go (you can post a comment on this blog post or on the BIDS Helper site itself). This is the first release that supports SQL 2012 and consequently also the first release that runs in Visual Studio 2010. A big thanks to Greg Galloway for doing the bulk of the work on this release.

Please note that if you are doing an xcopy deploy you will need to unblock the files you download or you will get a cryptic error message. This appears to be caused by a security update to either Visual Studio or the .Net framework; the xcopy deploy instructions have been updated to show you how to unblock the files.

Below are the notes from the release page.

======

This beta release is the first to support SQL Server 2012 (in addition to SQL Server 2005, 2008, and 2008 R2). Since it is marked as a beta release, we are looking for bug reports in the next few months as you use BIDS Helper on real projects.
In addition to getting all existing BIDS Helper functionality working appropriately in SQL Server 2012 (SSDT), the following features are new...


Fixes and Updates
The exe downloads are a self-extracting installer, while the zip downloads allow for an xcopy deploy. Make sure to note the updated xcopy deploy instructions for SQL Server 2012.
Author: "Darren Gosbell" Tags: "BIDS Helper"
Date: Monday, 10 Oct 2011 23:24

Originally posted on: http://geekswithblogs.net/darrengosbell/archive/2011/10/11/sql-pass-summit-amp-mvp-deepdives-volume-2.aspx

I'm currently in Seattle enjoying the start of some of the activities around the SQL PASS Summit.

One of the activities that I'm looking forward to is the book signing session for the MVP Deep Dives Volume 2 at lunchtime on Wednesday. I was fortunate to be one of the 60 or so authors this time around, with a chapter on using PowerShell to manipulate Analysis Services databases.

All of the proceeds from this book go to support Operation Smile. You'll find the book's website here: http://www.manning.com/delaney/.

But if you are at PASS you should be able to pick up a copy from the bookstore and drop by on Wednesday to get it signed by the authors.

Author: "Darren Gosbell" Tags: "Analysis Services, SQL Server"
Date: Sunday, 09 Oct 2011 23:36

MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook

Full Disclosure: I was one of the technical reviewers on this book.

I think my friend Tomislav did a great job on this book and it would make a valuable addition to the bookshelf of anyone that is working with MDX.

I really enjoyed reading this and there were even a couple of interesting techniques that I have added to my toolkit. As far as I know there are not any other MDX books on the market quite like this one. It's more aimed at the intermediate level of MDX user and assumes that you have some concept of things like members, tuples and sets. You will probably get the most out of it if you have had at least a little experience writing your own MDX expressions.

The fact that it's written in a cookbook style makes it very practical. You can scan down the table of contents and just pick out individual recipes, or you can read it from cover to cover as the recipes start simple and gradually increase in complexity. The individual recipes stand alone, but they also have links to other related recipes. Although the book makes frequent references to SSAS 2008 R2, the vast majority of the recipes will work just fine with the 2005 and 2008 versions (and with Denali when it's released).

Finally I was very surprised to find that the image on the front cover is of my home town, Melbourne, Australia. So if you live in Melbourne and are into MDX you should definitely buy yourself a copy of this book.

Author: "Darren Gosbell" Tags: "Analysis Services, MDX"