Let's assume that we want to create a new array @result that contains the uppercased data from the original array @a. There are many ways to do that: some are easier to write, while others are faster.
The simplest way to process all elements of an array is to use the map function. The resulting code takes only one line:
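The one-liner itself did not survive in this copy of the article. A minimal sketch of what it most likely looked like, assuming the block form of map and a sample @a invented here for illustration:

```perl
use strict;
use warnings;

# Sample input; the contents of @a are an assumption made for illustration.
my @a = ('alpha', 'Beta', 'gamma');

# Block form of map: the block runs once per element, with $_ set to that element.
my @result = map { uc($_) } @a;

print "@result\n";   # prints "ALPHA BETA GAMMA"
```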
Since we're calling only one function, uc, we can make this statement even more compact:
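The compact statement is also missing from this copy. A plausible reconstruction, assuming the expression form of map with uc falling back to its default argument $_ (the sample data is again made up):

```perl
use strict;
use warnings;

# Sample input; the contents of @a are an assumption made for illustration.
my @a = ('alpha', 'Beta', 'gamma');

# Expression form of map: uc called without an argument operates on $_.
my @result = map(uc, @a);

print "@result\n";   # prints "ALPHA BETA GAMMA"
```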
While both of these methods are easy to write and understand, as we will see later, they are not the fastest ones.
Let's replace the map function with an explicit foreach loop and see how it affects the speed:
my @result = ();
foreach (@a)
{
    push(@result, uc($_));
}
This method turns out to be more than 40% faster than the map method on Linux, and 10 times faster on Windows. It seems that the map function in perl 5.8.x is very inefficient; avoid it if you need to get the most speed out of your code.
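The article does not show how the timings were taken. One way to repeat the comparison on your own system is the core Benchmark module; the sketch below is only an illustration, the test data (10,000 short strings) and the labels map_block and foreach_push are assumptions, and the percentages it reports will depend on the perl version, OS and hardware:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Test data is an assumption: 10,000 short lowercase strings.
my @a = map { "item$_" } 1 .. 10_000;

# Each candidate builds @result and returns a reference to it.
my %methods = (
    map_block => sub {
        my @result = map { uc($_) } @a;
        return \@result;
    },
    foreach_push => sub {
        my @result = ();
        foreach (@a) {
            push(@result, uc($_));
        }
        return \@result;
    },
);

# Sanity check: every candidate must produce the same output.
my $expected = join(',', @{ $methods{map_block}->() });
for my $name (sort keys %methods) {
    my $got = join(',', @{ $methods{$name}->() });
    die "$name produced different output\n" unless $got eq $expected;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, \%methods);
```

The other loops in this article can be added to %methods in the same way, which keeps all variants running against identical data.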
It is often said that foreach loops are faster than for loops. Let's recode our example using a C-style for loop and see how much slower it is:
my @result = ();
for (my $i = 0; $i < @a; $i++)
{
    push(@result, uc($a[$i]));
}
This example is 27% slower than the foreach loop, but it's still 12% faster than the map method on Linux, and much faster than the map method on Windows. We can try to tweak the code by replacing the push function with direct assignment, although this doesn't change the speed of the loop:
my @result = ();
for (my $i = 0; $i < @a; $i++)
{
    $result[$i] = uc($a[$i]);
}
If for some reason you need to use a for loop for data processing and want to get the most speed out of it, store the size of the original array in a scalar variable and use that variable in the condition of the for loop:
my @result = ();
my $len = scalar(@a);
for (my $i = 0; $i < $len; $i++)
{
    $result[$i] = uc($a[$i]);
}
Replacing @a in the loop condition with a pre-calculated scalar value results in a 5% speed improvement on Linux.
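When the index itself is needed, another option is a foreach loop over a range of indices, where the bounds of the range are evaluated once rather than on every iteration. This variant is not part of the original measurements, so treat it as a sketch to benchmark yourself rather than a known improvement; the sample data is made up for illustration:

```perl
use strict;
use warnings;

# Sample input; the contents of @a are an assumption made for illustration.
my @a = ('alpha', 'Beta', 'gamma');

my @result = ();
# $#a is the last valid index of @a, so the range covers every element.
foreach my $i (0 .. $#a)
{
    $result[$i] = uc($a[$i]);
}

print "@result\n";   # prints "ALPHA BETA GAMMA"
```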
Our foreach loop method did two things at once: it processed the data and copied it to the resulting array. Let's change the code so that these two tasks are executed separately, i.e. first we'll copy the data from the original array to the resulting array, and then process it in place:
my @result = @a;
foreach (@result)
{
    $_ = uc($_);
}
This code runs 3% slower than the original foreach method. Why would I use this method over the original one, where both tasks are done in a single pass and which is therefore more efficient? Because it allows me to use the tr operator:
my @result = @a;
foreach (@result)
{
    tr/a-z/A-Z/;
}
This method is about 14% faster than the fastest foreach method under both Windows and Linux. Just in case, let's see if yet another way to uppercase a string, the \U escape, improves the speed even further:
my @result = @a;
foreach (@result)
{
    $_ = "\U$_";
}
Nope. This method is slower, and on Linux it's even 12% slower than the method that uses the uc function.
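One caveat when picking the tr method for speed: tr/a-z/A-Z/ translates only the 26 ASCII letters, while uc can also uppercase other letters in a Unicode string. The sketch below illustrates the difference with a character chosen here purely as an example (U+0151, a lowercase o with double acute); it is not taken from the original tests:

```perl
use strict;
use warnings;

# "\x{151}" is a lowercase letter outside ASCII; its uppercase form is "\x{150}".
my $word = "k\x{151}";

# uc uppercases both characters.
my $with_uc = uc($word);

# Copy first, then translate the copy in place; tr leaves "\x{151}" untouched.
(my $with_tr = $word) =~ tr/a-z/A-Z/;

print $with_uc eq "K\x{150}" ? "uc: both letters uppercased\n" : "uc: unexpected result\n";
print $with_tr eq "K\x{151}" ? "tr: only the ASCII letter uppercased\n" : "tr: unexpected result\n";
```

If the data is known to be plain ASCII, the two methods give the same result and the faster one can be used safely.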
NOTE: All testing was done with perl 5.8.8 under Red Hat Enterprise Linux and Windows 2000.
(c) Copyright 2007 Gennadiy Shvets