Average Values, Exclude Extremes
July 14, 2008 • JP • 5 Comments • Rate This Article
• Links to this article
Sometimes I have a list of values and want to average them, while excluding zeros or negative numbers, so I use the following popular array formula. "MyRange" is a named range of cells on a worksheet. I discourage the use of magic numbers, but they are included below for visual reference.
Feel free to use cell references instead.
{=AVERAGE(IF(MyRange>0,MyRange))}
I could also do this without an array, for example:
=SUMIF(MyRange,">0")/COUNTIF(MyRange,">0")
Which sums all the numbers in the range greater than zero, then divides by a count of those numbers. Another alternative to array formulas is SUMPRODUCT, which I won't get into here.
All will give me the average of all values in "MyRange" that are greater than zero.
And of course I could specify two different ranges, for example if I had two columns of numbers and wanted to average all of the data elements in one column where the adjacent values are greater than zero. Make sure both ranges are the same size, or you will most likely get an error:
{=AVERAGE(IF(MyRange>0,AnotherRange))}
This all works great when the numbers are relatively close together. But sometimes a range of numbers contains an extreme so far off from the rest that it throws off the whole average, and consequently the average won't accurately represent my data set. Or maybe I just want to blindly throw out the highest value, to get a better picture of what is going on with the data. For example suppose I have a named range "MyRange" that includes the following numbers.
1,2,3,4,5,6,7,8,9,10,100
(cut and paste this into Excel and delimit by comma if you want to follow along, make sure to select the range and use the Name box to name it "MyRange")
The formula =AVERAGE(MyRange) produces 14.0909 as the result, which hides the fact that 90% of the numbers are below 10. It includes the number 100, which throws the whole average off by making it appear that most of the numbers are around 14. If I used this assumption in another calculation I would be technically correct, but at best the result might be functionally useless or at worst, cause harm, if it is used in something like a financial calculation.
Ideally I would also use the MEDIAN function to check that my average is correct, but I know I won't always do that.
If I want to remove the 100, I use the MAX function in the original array formula to compare all the values against the maximum value in the range. If they don't match (i.e. not 100), they are counted towards the average. In other words, all values EXCEPT the maximum in the range will be averaged.
So instead of
{=AVERAGE(IF(MyRange>0,MyRange))}
I use
{=AVERAGE(IF(MyRange<>MAX(MyRange),MyRange))}
I could also use
{=AVERAGE(IF(MyRange<MAX(MyRange),MyRange))}
Notice how the first one uses "<>" and the second one uses "<". Functionally they are going to do the same thing: exclude the maximum value from the average.
But now by changing the condition of the array from "all values greater than zero" to "all values lower than the maximum" I added zero and negative values back into the array. If you know your range won't include values <= zero, you can stop here. But if it does, or you aren't sure, read on.
I need to add another condition to the IF portion of the formula to exclude values <= zero. The way I do it with an array formula is to use the multiplication symbol "*" to add compounding conditions to the evaluation.
{=AVERAGE(IF((MyRange<>MAX(MyRange))*(MyRange>0),MyRange))}
Or
{=AVERAGE(IF((MyRange<MAX(MyRange))*(MyRange>0),MyRange))}
If you aren't familiar with these types of array formulas, check out the link at the beginning of the post. I use truth tables to help me figure it out when to use '+' and '*' inside an array formula. An excellent explanation of this may be found at this site.
For example, if I wanted to average all numbers in MyRange that are greater than 5 AND less than (i.e. excluding) the highest value, I would use:
{=AVERAGE(IF((MyRange<MAX(MyRange))*(MyRange>5),MyRange))}
But if I wanted to average all numbers in MyRange that are greater than zero OR less than the highest value, I would use:
{=AVERAGE(IF((MyRange<MAX(MyRange))+(MyRange>0),MyRange))}
If I want to eliminate the lowest value instead, I simply substitute MIN for MAX in any of the formulas above! Oh, and I might have to reverse some of the < and > signs as well.
I can also eliminate the lowest or highest N values from our calculation, but it requires that I know what my usual data looks like. Yes it's cheating, but if I know my data is usually around a certain number, I can simply specify what values to exclude from the average. For example, if my data is usually below 100, and any spikes over 100 would mess up my average and misrepresent my data:
{=AVERAGE(IF((MyRange<100)*(MyRange>0),MyRange))}
Enjoy,
JP
Previous Post: Format header row
Next Post: Odd Behavior from Resend Message Code




There is a mistake at the text lines:
I could also use
{=AVERAGE(IF(MyRange
Looks like the hypertext got in there to have the remainder of text
The array formula should be:
I could also use
{=AVERAGE(IF(MyRange<MAX(MYRANGE),MYRANGE))}
Thought you would want to make the correction as your information is very helpful.
Will
Thanks, I fixed it!
Hi,
How can I write a function PositiveAverage(R) which return the average of all the positive values in the range R ( any negative values will be excluded in calculating the average). Sue a double 'For .. next loop structure in the function. Work with the range object R(and not array variable) in writing this function.
Your request reads like a homework question.
To return the average of all positive values in a range, one way is to first read the range into an array.
Dim rng As Excel.Range
' set rng = whatever your range is
tempArray = rng.Value
Then loop through the range and check for values greater than zero.
Dim i As Long
Dim total As Long
Dim avg As Long
For i = LBound(tempArray) To UBound(tempArray)
total = total + tempArray(i)
Next i
Then get the average of all positive values:
I wrote the following subroutine .can anyone tell me what 's wrong with this. I dont find anything going wrong bt its just nt working. This function is writen to clik any random cell to change the colour.
Sub ColourCell()
Rand = ActiveCell.Interior.colour
ActiveCell.Interior.Color = WhatColour(RandInteger())
End Sub