Important: the following technique relies on array_flip(), which only accepts integer and string values. It doesn't work for arrays whose values are:
- booleans (true/false)
- null
- floats
- objects
- resources
- arrays
Suppose you get data from some source (an XML file, a CSV file, …) and you put it into an array. Now suppose this data is full of duplicates. For example, you have:
array(
    0 => 'horse',
    1 => 'pig',
    2 => 'pig',
    3 => 'cow',
    4 => 'horse',
    ...
)
How can you get the unique values from this array?
The standard way would be:
$a = array_unique($a);
This works fine, except that it can be extremely slow for large arrays.
A faster way would be:
$a = array_keys(array_flip($a));
Marginally faster still is:
$a = array_flip(array_flip($a));
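The three approaches return the same values but not the same keys, which matters if the rest of your code relies on them. Here is a small sketch using the example array above (the variable names are only for illustration):

```php
<?php
$a = array('horse', 'pig', 'pig', 'cow', 'horse');

// array_unique keeps the first occurrence of each value, with its original key.
$viaUnique = array_unique($a);
// array(0 => 'horse', 1 => 'pig', 3 => 'cow')

// array_keys(array_flip()) returns a freshly indexed list (keys 0, 1, 2).
$viaKeys = array_keys(array_flip($a));
// array(0 => 'horse', 1 => 'pig', 2 => 'cow')

// A double array_flip keeps the key of the last occurrence of each value.
$viaDoubleFlip = array_flip(array_flip($a));
// array(4 => 'horse', 2 => 'pig', 3 => 'cow')
```

If you only care about the values, wrapping any of the results in array_values() gives the same list in all three cases.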
So how big is the speed difference? For large arrays, a double array_flip can easily be 20 times faster than array_unique.
For reference, here's a small benchmark:
// Build a test array of 1,000,000 random integers between 0 and 1000.
$a = array();
for ($x = 0; $x < 1000000; $x++) {
    $a[] = rand(0, 1000);
}

// 1. array_unique
$starttime = microtime(true);
$b = array_unique($a);
echo (microtime(true) - $starttime) . "\n";

// 2. array_keys + array_flip
$starttime = microtime(true);
$b = array_keys(array_flip($a));
echo (microtime(true) - $starttime) . "\n";

// 3. double array_flip
$starttime = microtime(true);
$b = array_flip(array_flip($a));
echo (microtime(true) - $starttime) . "\n";
The result, in seconds, for array_unique, array_keys(array_flip()) and the double array_flip respectively:
2.06489086151
0.101167201996
0.0999970436096
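Because of the limitation on value types mentioned at the top, you may want to guard the trick before using it on data you don't control. The following is only a sketch: unique_values() is a hypothetical helper name, not a built-in PHP function, and it assumes you are happy with a re-indexed list as the result. It uses the flip approach when every value is an integer or a string, and falls back to array_unique() otherwise.

```php
<?php
// Hypothetical helper (not part of PHP): returns the unique values of $a
// as a re-indexed list.
function unique_values(array $a)
{
    foreach ($a as $value) {
        if (!is_int($value) && !is_string($value)) {
            // array_flip() would skip this value with a warning,
            // so fall back to the slower built-in.
            return array_values(array_unique($a));
        }
    }
    // Safe to flip: every value is a valid array key.
    return array_keys(array_flip($a));
}

$animals = unique_values(array('horse', 'pig', 'pig', 'cow', 'horse'));
$prices  = unique_values(array(1.5, 2.5, 1.5));
```

One thing to keep in mind: strings that look like decimal integers, such as '1', come back as integers after a flip, because PHP casts such array keys to int.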