Enterprise Architecture & Integration, SOA, ESB, Web Services & Cloud Integration

Enterprise Architecture & Integration, SOA, ESB, Web Services & Cloud Integration

Saturday 19 October 2013

How to use Percentile for subset of EXCEL data without using Pivot table

I thought twice before writing this post as it is not my usual stuff - Enterprise architecture/Integration/SOA/Web services. When I was analyzing the access logs of WebLogic application servers for a performance engineering project for an African customer, I had to spend some time on the excel. I wanted to share my experience as a useful tip here as it will be useful to someone in the globe.

I have been a big fan of Pivot table & Pivot chart for analyzing large set of records. Almost all the times,  my requirements - calculate minimum value, maximum value, average value, count of problematic/badly performing URLs - were same and done very easily using Pivot table. But this time, I got a new requirement - calculate the 90th percentile in addition to min, max, average and count statistics. Initially I thought it was a very easy task but believe me it took more than one full day.

See below sample records. (My original data was very different. For easy understanding, I am giving sample data here)



I want to calculate the 90% percentile for
  1. Both animals
  2. Only for Lion
  3. Only for Tiger
Pivot table supports, by default, calculating min, max and average values but not percentile. The "percentile" function in Excel can be used to find 90th percentile of age for all kinds of animals (i.e., Tiger plus Lion) as mentioned below:

=PERCENTILE.INC($D$7:$D$16, 0.9)

Note: "C7 to C16" contains the kind and "D7 to D16" contains the age.

Next, we have to find the 90th percentile for Lion and Tiger separately. We have to extract the subset of given raw data and then apply percentile function. I choose to use "IF" function to extract the subset (i.e., records that belong only to "Lion" or "Tiger"). Now, my formule will look like below:

=PERCENTILE.INC(IF($C$7:$C$16="Lion",$D$7:$D$16), 0.9)

=PERCENTILE.INC(IF($C$7:$C$16="Tiger",$D$7:$D$16), 0.9)

Are we done? Not yet. In the result cell, I get "0" value instead of getting the expected value. This is where I actually spend most of times to resolve the issue. At last, one trick helped me - after typing the formula in the cell, instead of pressing "ENTER", we have to press "CTRL+SHIFT+ENTER". When we press "CTRL+SHIFT+ENTER", it puts a curly brace around the formula. Please see below new formule
{=PERCENTILE.INC(IF($C$7:$C$16="Lion",$D$7:$D$16), 0.9)} and {=PERCENTILE.INC(IF($C$7:$C$16="Tiger",$D$7:$D$16), 0.9)}


Now, I got the result which I expected (as given below):



There may be other way of doing this, but I wanted to share my experience. This has reduced my ongoing effort from few hours to few minutes and improved my productivity. If it saves your time too, I will be happy.

12 comments:

  1. Ayyappan, you are my hero!! Thank you!! :)

    ReplyDelete
  2. WOW! You literally save me days of menial work!! Thank you so much and bless you sir, for sharing and blessing me.

    ReplyDelete
  3. that one peice i was missing and thanks to you and google..

    the curly bracket was what was missing....

    ReplyDelete
  4. wah wah wah....you are a genius....

    ReplyDelete
  5. I wonder why I can't make it work with an AND in the if function..

    =PERCENTILE.INC(IF(AND(Dati!$W2:$W1000='TTC by Plant (2)'!$B11;Dati!AH2:AH1000=TRUE);Dati!$S2:$S1000);0,9)

    Any ideas? Thank you!

    ReplyDelete
  6. thanks.......guide was very useful, especially with the the final paragraph of the {}

    ReplyDelete