Important : the following technique relies on array_flip(), which can only turn strings and integers into array keys. It therefore doesn’t work for arrays whose values are :
- boolean (true/false)
- null
- objects
- resources
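As a rough illustration of this restriction, the sketch below runs array_flip() on an array that mixes strings with a boolean and a null. The variable names are made up for the example, and the @ operator is only there to silence the warning that array_flip() emits for each value it cannot use as a key.

```php
<?php
// Hypothetical sample data: strings mixed with values array_flip() cannot handle
$mixed = array('horse', true, null, 'pig', 'horse');

// array_flip() warns about, and skips, every value that is not a string or an integer
$flipped = @array_flip($mixed);

// Only the string values survive the round trip
$result = array_keys($flipped);
print_r($result);
```

The boolean and the null are silently lost here, so for this kind of data array_unique() remains the safer choice.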
Suppose you get data from some source (an XML file, a CSV file, …) and you put it into an array. Now suppose this data is full of duplicates. For example, you have :
array( 0 => 'horse', 1 => 'pig', 2 => 'pig', 3 => 'cow', 4 => 'horse',...
How can you get the unique values from this array ?
The standard way would be to do :
$a = array_unique($a);
Works fine, except that it’s extremely slow for large arrays.
A better way would be :
$a = array_keys(array_flip($a));
But marginally faster is :
$a = array_flip(array_flip($a));
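The three variants return the same unique values but not the same keys, which matters if the keys are used afterwards. Here is a minimal sketch on the sample data from above (the variable names are only for illustration):

```php
<?php
$a = array('horse', 'pig', 'pig', 'cow', 'horse');

// array_unique() keeps the key of the first occurrence of each value
$viaUnique = array_unique($a);

// array_keys(array_flip()) returns a list reindexed from 0
$viaKeys = array_keys(array_flip($a));

// A double array_flip() keeps the key of the last occurrence of each value
$viaFlip = array_flip(array_flip($a));

print_r($viaUnique);
print_r($viaKeys);
print_r($viaFlip);

// array_values() drops the keys, leaving the same list in all three cases
var_dump(array_values($viaUnique) === array_values($viaFlip));
```

If a clean 0-based list is needed, either use the array_keys() variant or wrap the result in array_values().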
So how big is the speed difference ? For large arrays, a double array_flip can easily be 20 times faster.
For reference, here’s a small benchmark :
// Build an array of one million random integers between 0 and 1000
$a = array();
for ($x = 0; $x < 1000000; $x++) {
    $a[] = rand(0, 1000);
}

// 1. array_unique()
$starttime = microtime(true);
$b = array_unique($a);
echo (microtime(true) - $starttime) . "\n";

// 2. array_keys(array_flip())
$starttime = microtime(true);
$b = array_keys(array_flip($a));
echo (microtime(true) - $starttime) . "\n";

// 3. double array_flip()
$starttime = microtime(true);
$b = array_flip(array_flip($a));
echo (microtime(true) - $starttime) . "\n";
The result, in seconds (array_unique, then array_keys + array_flip, then double array_flip) :
2.06489086151
0.101167201996
0.0999970436096