Explanation

This formula depends on two helper columns. The first helper column holds random values created with the RAND() function. The formula in C5, copied down is:

=RAND()

The RAND function generates a random value at each row.

Note: RAND is a volatile function and will generate new values with each worksheet change.

The second helper column holds the numbers used to sort data, generated with a formula. The formula in D5 is:

=RANK(C5,rand)+COUNTIF($C$5:C5,C5)-1

See this page for an explanation of this formula.

The formula in E5 is:

=INDEX(names,MATCH(ROWS($D$5:$D5),sort,0))

Here, the INDEX function is used to retrieve values in the named range “names”, using the sort values in the named range “sort”. The actual work of figuring out what value to retrieve is done my the MATCH function in this snippet:

MATCH(ROWS($D$5:$D5),sort,0)

Inside MATCH, the ROWS function is given an expanding range as the lookup value, which begins as one cell, and expands as the formula is copied down the column. This increments the lookup value, starting at 1 and continuing to 7. MATCH then returns the position of the lookup value in the list.

The position is fed to INDEX as the row number, and INDEX retrieves the name at that position.

Explanation

In this example, the goal is to test if a given range contains duplicate values and return TRUE if duplicates exist and FALSE if not. This is essentially a counting problem and the solution is based on the COUNTIF function , which counts values in a range that meet supplied criteria. The formula used in E5 is:

=OR(COUNTIF(data,data)>1)

where data is the named range B5:B16.

Background study

Below are related links to help you understand how this formula works:

  • What is an array formula? - 3 min video
  • What is an array? - 3 min video

COUNTIF function

Working from the inside-out, the core of the formula is based on the COUNTIF function:

COUNTIF(data,data)

Here, data (B5:B16) is given for both range and criteria. Typically, criteria is supplied as a single value, but in this case data contains 12 values. The result is that COUNTIF returns 12 counts (one for each value) in a single array like this:

{1;1;2;1;1;1;1;2;1;1;1;1}

In “modern” versions of Excel that support dynamic arrays , you can enter the COUNTIF formula above as a standalone formula and you will see the results spill onto the worksheet. Most values in the array are 1 but notice that the third value and eighth value are 2, which indicate duplicate values. The value 155 occurs twice at these positions in the range, which is why the count for that number is 2.

In this particular problem, we don’t care about the specific numbers returned by COUNTIF, we only care if any number is greater than 1. Therefore, we use the greater than operator (>) to check the result:

=COUNTIF(data,data)>1
={1;1;2;1;1;1;1;2;1;1;1;1}>1

The result is an array of TRUE and FALSE values.

{FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;FALSE}

This is the information we need to solve the problem. If any value in the array is TRUE, it means we have duplicates. If all values are FALSE, it means there are no duplicates. To check the array, we can use the OR function.

OR function

The array above is returned directly to the OR function:

=OR({FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE;FALSE})

The OR function returns TRUE if any given argument evaluates to TRUE, and returns FALSE only if all supplied arguments evaluate to FALSE. The final result is TRUE, since at least one value in the array is TRUE.

Note: COUNTIF will automatically ignore empty cells in this configuration, which return a count of 0.

SUMPRODUCT alternative

To avoid an array formula that requires Control + Shift + Enter, you can use the SUMPRODUCT alternative below:

=SUMPRODUCT(--(COUNTIF(data,data)>1))>0

As above, the counts are checked for any numbers greater than 1, resulting in an array of TRUE and FALSE values. The double negative (–) converts the TRUE and FALSE values to 1s and 0s, and the result is delivered to SUMPRODUCT, which returns the total. Finally, the total from SUMPRODUCT is checked against zero. Since the total is greater than zero, the formula returns TRUE as a final result.

This is still an array formula but the SUMPRODUCT function handles the array operation natively, so it is not necessary to use Control + Shift + Enter.

Count duplicates

To count the number of duplicates in the range you can adapt the formula like this:

=SUMPRODUCT(--(COUNTIF(data,data)>1))

Note: this is also an array formula , but because SUMPRODUCT function can handle the array operation natively, it is not necessary to use control + shift + enter.

The configuration of COUNTIF is the same as the original formula above, so we end up with the same array of TRUE and FALSE values. The double negative (–) converts the TRUE and FALSE values to 1s and 0s, and the result is delivered to SUMPRODUCT, which returns the sum of the array:

=SUMPRODUCT({0;0;1;0;0;0;0;1;0;0;0;0}) // returns 2

Note: the SUM function will also work fine in place of SUMPRODUCT, but the formula be entered with control + shift + enter in older versions of Excel.