Calculate the percentage of the total with RLS in place in Power BI
Many data models have RLS in place, where some users can see only a slice of the entire data set. But what happens when these users must compare their results to the overall result? This is not as easy as it sounds.
Introduction
Let's look at the following scenario:
My Salespeople can see only the results for their assigned geographic region, which in this scenario is a continent.
For benchmarking, they must be able to compare their results to the results of the other continents and to the overall total.
This is impossible when RLS (Row-Level Security) is in place, as the users are not allowed to see the results for the other continents.
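For reference, such a role is defined as a DAX filter expression on a table. Here is a minimal sketch, assuming a Continent column that can be filtered directly on the Customer table (in the Contoso model, the continent actually sits on a related Geography table):

// Filter expression of a hypothetical RLS role for the Asia sales team
// (column location is an assumption; adapt it to your model)
'Customer'[Continent] = "Asia"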
A change to the data model is necessary to make this possible.
So, let's look at how to implement such a change.
Solution by SQLBI
SQLBI has already written an article and created a video on this topic:
Computing accurate percentages with row-level security in Power BI – SQLBI
Theoretically, we can stop here: Read the article or watch the video. All OK, isn't it?
Not so fast, my young horse.
While Alberto built a solution using DAX, I would like to create the additional tables for the data model earlier in the pipeline, preferably in the source (database) or in Power Query.
As not everyone stores their data in a database, I don't want to dig into SQL code to build the necessary tables, even though creating the solution in SQL would be pretty straightforward.
So, I go to Power Query to create the solution.
Just like in my previous article:
OK, let's start.
Step 1 – Create the Fact table without Customers
I will take the same scenario and approach described in the SQLBI article above.
As the RLS rules are set up on the Customer table, I create a copy of the fact table "Online Sales," but without referencing the Customer table.
Without this reference, the RLS rule on the Customer table doesn't apply, and I can calculate the result as needed.
After opening Power Query, I create a Reference of the Online Sales table:

By creating a Reference, I do not reread the data from the source in Power Query but reuse the result from the original Online Sales table.
I rename the table to "Online Sales (No Customers)".
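In M, such a Reference is simply a query whose source step points at the other query. This is a minimal sketch of what the new query contains:

let
    // Reference: reuse the output of the original query instead of rereading the source
    Source = #"Online Sales"
in
    Source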
The second step is to analyze the data and decide by which columns I must group the data and which columns I can aggregate.
This step is necessary as I'm changing the granularity of the data.
By removing the Customer, I reduce the size of the table, as it now contains fewer details.
I can group my data by all dimension keys on the table, except the Customer Key. And since it must not be possible to figure out which customer placed which order, I must also remove the Order Number and the Order Line details.
But I must spend some time figuring out which column I can aggregate.
Let's look at an example:
When I take Sales Quantity, I can sum this column without problems, as the summed result can be reused as-is.
But when I look at Unit Price, the picture is not that simple anymore; I have to think about what aggregating this column would actually mean.
The Unit Price is likely to change over time, and a sum of prices across rows has no business meaning. Calculating the Sales Amount by multiplying an aggregated Unit Price with the aggregated Sales Quantity would therefore lead to wrong results. So I cannot aggregate the Unit Price.
In the end, I grouped the table by these columns:
- OrderDate
- StoreKey
- ProductKey
- PromotionKey
- CurrencyKey
- DateKeyYear
- DueDate
- ShipDate
And I aggregate (Sum) these columns:
- SalesQuantity
- SalesAmount
- ReturnQuantity
- ReturnAmount
- DiscountAmount
- TotalCost
- UnitCost
Additionally, I can add a distinct count of CustomerKey so that the number of customers remains available for analysis.

I click on "Add grouping" to add the columns by which the data must be grouped.
Then, I scroll down and click on "Add aggregation" to add the columns that must be aggregated. For each column, I set the Operation (the aggregation function, e.g., Sum) and can assign a new column name.
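For reference, the step generated by this dialog looks roughly like the following M sketch (the column names are taken from the lists above; verify them against your own table):

// Group by all dimension keys except CustomerKey, sum the additive columns,
// and keep a distinct count of customers.
// "Source" stands for the previous step (the referenced Online Sales table).
Table.Group(
    Source,
    {"OrderDate", "StoreKey", "ProductKey", "PromotionKey",
     "CurrencyKey", "DateKeyYear", "DueDate", "ShipDate"},
    {
        {"SalesQuantity",  each List.Sum([SalesQuantity]),  type number},
        {"SalesAmount",    each List.Sum([SalesAmount]),    type number},
        {"ReturnQuantity", each List.Sum([ReturnQuantity]), type number},
        {"ReturnAmount",   each List.Sum([ReturnAmount]),   type number},
        {"DiscountAmount", each List.Sum([DiscountAmount]), type number},
        {"TotalCost",      each List.Sum([TotalCost]),      type number},
        {"UnitCost",       each List.Sum([UnitCost]),       type number},
        {"CustomerCount",  each List.Count(List.Distinct([CustomerKey])), Int64.Type}
    }
)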
The next step is to add all relationships to the newly created table:

Now, let's look at the results without and with the additional table:
This is the starting point from the original data and without RLS:

Now, let's apply the RLS role for Asia and Australia:

As you can see, the result is mathematically correct, but it doesn't fulfill our requirements.
The percentage is calculated only over the remaining two Continents instead of all continents.
Now, I can change the report by moving the Continent to a Slicer and adding the Product Brand to the Matrix:

With this change, I can see that the values still differ with and without RLS.
The problem now is that the result with an active RLS role is wrong:

The reason is that the percentage of Sales over "all" Continents by Brand is calculated only over the Continents visible to the role, Asia and Australia. Therefore, the result is wrong.
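To make the problem concrete: a typical "percent over all continents" measure looks something like the following sketch (my assumption of how such a measure is commonly written, not necessarily the exact measure used in this report). REMOVEFILTERS removes the report and slicer filters, but it can never bypass RLS, so the denominator still covers only Asia and Australia:

% over All Continents =
DIVIDE (
    [Online Sales (By Order Date)],
    CALCULATE ( [Online Sales (By Order Date)], REMOVEFILTERS ( 'Customer' ) )
)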
However, the result differs when I add a new Measure, which points to the new table.
The Measure is the following:
% over All Continents (No Customers) =
DIVIDE (
    [Online Sales (By Order Date)],
    SUM ( 'Online Sales (No Customers)'[Sales Amount] )
)
And the result (with RLS) is this:

Take care: if you don't select any continent and don't have an active RLS role, the result of the new Measure looks wrong. You must take this into account while building and testing your solution.
Depending on the report and the situation, either of the two Measures can deliver the correct result.
For example, the Sales Director might have access to the Sales results of all Continents. For them, the result of the original Measure is correct, as it considers the Continent of the Customers.
To simplify this, we can add a new table with Customer Attributes.
Step 2 – Reporting over Customer Attributes
As described in the SQLBI article, there can be situations where it is necessary to create a report based on some Customer attributes, like Gender, Education, or any other attribute.
In this case, I must create a table for these Attributes and expand our Data model with this table.
To build such a table, I must first define the Requirements and then the process for creating the table.
For example:
- I must have a table without personal customer data (e.g., due to data protection rules).
- It must not be possible to reconstruct individual customers' data from it.
- It must be possible to filter both the new and the original Online Sales tables through it.
Next, I define the necessary steps to fulfill the requirements:
- Duplicate the Customer table (named Customer Attributes).
- Remove the unnecessary columns.
- Remove all duplicates.
- Add an Index (Key / ID) column.
- Merge the new table with the original Customer table to add the newly created Key column.
- Merge the Key column into the "Online Sales" table.
- Add the new Key column to the "Online Sales (No Customers)" table.
This should work. Let's start:
One short note: As I must Merge the reduced attribute table back into the original Customer table, I cannot create a Reference, as this would introduce a circular dependency. Therefore, I must Duplicate the Customer table:

I renamed the duplicated table to Customer Attributes.
Now, I remove all columns that contain personal data (except Gender, Education, and other statistically relevant columns). At the same time, I must keep enough columns to ensure that the following Merge step doesn't result in a multiplication of rows.
For example, without the Birthdate I got duplicates in the result, because many customers share the same combination of the remaining attributes. Including the Birthdate ensures that the rows can be matched unambiguously.
You need to check this step carefully, as the right column set can vary case by case.
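In M, these two operations boil down to selecting the columns to keep and removing duplicate rows. A minimal sketch, with an assumed column list:

let
    // "Source" stands for the last step of the duplicated Customer query
    Source = Customer,
    // Keep only the statistically relevant columns (an assumed selection)
    KeptColumns = Table.SelectColumns(
        Source, {"Gender", "Education", "Occupation", "MaritalStatus", "BirthDate"}),
    // Remove duplicate attribute combinations
    Deduplicated = Table.Distinct(KeptColumns)
in
    Deduplicated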
Next, I add the Index column:

The Index column is renamed to "CustomerAttributesKey".
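In M, adding and naming the key is a single step (a sketch; "Deduplicated" stands for the previous step):

// Add a sequential surrogate key starting at 1, incrementing by 1
Table.AddIndexColumn(Deduplicated, "CustomerAttributesKey", 1, 1)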
I use the Merge function to carry the new Key over to the original Customer table:

I select all matching columns while holding down the Ctrl key, so that the combination of these columns identifies the correct rows to merge.
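The generated merge step joins on the full set of shared columns. A sketch, reusing the assumed column list from above:

// Left-outer join from Customer to the new attributes table on all shared columns
Table.NestedJoin(
    Customer,
    {"Gender", "Education", "Occupation", "MaritalStatus", "BirthDate"},
    #"Customer Attributes",
    {"Gender", "Education", "Occupation", "MaritalStatus", "BirthDate"},
    "CustomerAttributes",
    JoinKind.LeftOuter
)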
However, there is a problem with non-matching rows: as you can see above, thirteen rows cannot be matched. Unfortunately, I wasn't able to find the cause of this mismatch, and Power Query offers little tooling to track down such problems.
To solve this, I created a dummy row with CustomerAttributesKey = -1: I prepared it in Excel, added it as a CustomerAttributes_Dummy table with the Enter Data feature, and appended this row to the Customer Attributes table.
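The append itself is a single M step. A sketch, assuming the Enter Data query is named CustomerAttributes_Dummy:

// Append the single dummy row (CustomerAttributesKey = -1) to the attributes table
Table.Combine({#"Customer Attributes", CustomerAttributes_Dummy})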
Another Note: I added the Dummy row to simplify the solution for this article. In the real world, I would try to find the cause, examine the data, and find the right solution instead of such a workaround.
Now I can expand the merged table to extract the CustomerAttributesKey:

The missing Key values of the non-matching rows can then be filled with the dummy key -1 through the Replace Values feature:

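In M, this replacement is one step (a sketch; "Expanded" stands for the step after the expansion):

// Replace the null keys of the thirteen unmatched rows with the dummy key -1
Table.ReplaceValue(
    Expanded, null, -1, Replacer.ReplaceValue, {"CustomerAttributesKey"})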
The next step is to add the CustomerAttributesKey to the "Online Sales" table and, from there, to the "Online Sales (No Customers)" table.
Again, I use the Merge and Expand Features to add the column to the "Online Sales" table:

The expansion of the CustomerAttributesKey column is done the same way as shown before.
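A sketch of this merge and expansion, assuming the Customer query already carries the new key:

let
    // Join Online Sales to Customer on CustomerKey
    Merged = Table.NestedJoin(
        #"Online Sales", {"CustomerKey"}, Customer, {"CustomerKey"},
        "Customer", JoinKind.LeftOuter),
    // Pull in only the new key column from the joined Customer table
    Expanded = Table.ExpandTableColumn(Merged, "Customer", {"CustomerAttributesKey"})
in
    Expanded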
Lastly, I have to add the new CustomerAttributesKey as a grouping column to the "Online Sales (No Customers)" table:

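In the Table.Group step from Step 1, this amounts to one extra entry in the list of grouping columns:

{"OrderDate", "StoreKey", "ProductKey", "PromotionKey", "CurrencyKey",
 "DateKeyYear", "DueDate", "ShipDate", "CustomerAttributesKey"}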
With this step, I have completed the data preparation tasks.
Modifying the Data model
At this point, I have two options for the data model.
In any case, I will add a Relationship to the "Online Sales (No Customers)" table, as this fulfills the central requirement.
But I can relate the new Customer Attributes table either to the Customer table or to the Online Sales table:
- Link the new Customer Attributes table to the Customer table, which in turn relates to the original Online Sales table.
- Add a direct Relationship to the Online Sales table, without any Relationship to the Customer table.
Here are the two options, side by side:

- Variant 1 uses the Customer table as an intermediary table. This has the following consequences:
  - The duplicated attributes in the Customer table could be removed.
  - Each attribute therefore exists in exactly one place.
  - The RLS on the Customer table is always applied.
- Variant 2 uses a separate relationship from the Online Sales table to each of the two customer tables:
  - The attributes exist in duplicate.
  - There are two ways to filter the data by the same attribute.
  - The Customer Attributes table is not affected by RLS on the Customer table.
  - The result is a clean Star Schema.
I will go with Variant 2, as I want to eliminate the effects of the RLS rules in some specific calculations, which opens up more possibilities.
But this is a decision for each situation.
Results and Conclusion
The result will differ depending on the data model and how you use the attributes from each table.
For example, when I add the Gender from both tables to the Matrix shown at the beginning, the results are the following:

While the result by Gender on the right sums up to the Brand total above it, the result on the left does not.
These are non-visual totals, as described in the SQLBI article above: mathematically correct, but confusing and hard to understand.
I strongly encourage reading the article to better understand this complex topic.
The main question is: What is better, using DAX or Power Query to prepare the data?
Strictly from the view of a data engineer, I would say: do it in the source or in Power Query, to avoid data manipulation in Power BI using DAX.
This would follow Roche's Maxim:
Data should be transformed as far upstream as possible, and as far downstream as necessary.
One more reason to do it in Power Query is efficiency.
Power BI will compress and optimize the data storage more efficiently when the data is prepared in the source or Power Query than when using DAX tables.
This very technical video explains this in great detail:
However, this transformation in Power Query is more complex than the DAX approach shown in the SQLBI article.
When developing a solution for my clients, I ask the following questions:
- Who will maintain the solution?
- What are the skills of that person?
- Is that person willing to learn more while maintaining the solution?
The answers to these questions will drive the approach for the solution.
I have already had situations where I had to rebuild a solution with a different approach because the client wasn't able or willing to learn the techniques used in my first attempt.
So, as usual, the answer to the question "Power Query or DAX" is: it depends …
My aim in this article is to show you an alternative way of building the solution, which gives you more flexibility in deciding which approach is the right one for you.
I hope I achieved this and that you learned something new.

References
I wrote an article about using Power Query to transform a flat table into a Star schema while using some techniques described here. You can find it here:
I use the Contoso sample dataset, like in my previous articles. You can download the ContosoRetailDW Dataset for free from Microsoft here.
The Contoso Data can be freely used under the MIT License, as described here.
You can support my work, which I do during my free time, through
https://buymeacoffee.com/salvatorecagliari
Or scan this QR Code:

Any support is greatly appreciated and helps me find more time to create more content for you.
Thank you very much.