Merge sort (Perl)

From LiteratePrograms

Jump to: navigation, search
Other implementations: ACL2 | C | C++ | dc | Eiffel | Erlang | Haskell | Java | JavaScript | Lisp | OCaml | Oz | Perl | Prolog | Python | Ruby | Scheme

In this article we describe an implementation of merge sort in Perl.

Mergesort

The mergesort algorithm works by:

  • Splitting the data in 2 halves
  • Sorting each half
  • Merging the halves

mergesort_string takes an array reference and two indexes, one representing the start of the data to be sorted, and one representing the end of the data to be sorted (but in fact pointed to the element that follows the section of data to be sorted). The data is split in two parts, and mergesort_string is recursively called on each.

The result is merged, making the data completely sorted. If the provided indexes represent a single element or no elements at all, mergesort_string can just return.

<<string>>=
sub mergesort_string
{
	my ($aref, $begin, $end)=@_;
	my $size=$end-$begin;
	if($size<2) {return;}
	my $half=$begin+int($size/2);
	mergesort_string($aref, $begin, $half);
	mergesort_string($aref, $half, $end);

We use in-place merging, which doesn't waste much memory, but can be a little complicated. An alternative method is to create a new container and copying the elements in to it.

Each element in the left part is compared with the first element in the right part. If it is larger, the right element is copied to the left half, and the old left element is inserted in the correct position in the right half. When all left half elements are tested, the complete container is sorted.

<<string>>=
	for(my $i=$begin; $i<$half; ++$i) {
		if($$aref[$i] gt $$aref[$half]) {
			my $v=$$aref[$i];
			$$aref[$i]=$$aref[$half];
			my $i=$half;
			while($i<$end-1 && $$aref[$i+1] lt $v) {
				($$aref[$i], $$aref[$i+1])=
					($$aref[$i+1], $$aref[$i]);
				++$i;
			}
			$$aref[$i]=$v;
		}
	}
}

This is a version for number sorting, which uses the < and > operators rather than lt and gt.

<<number>>=
sub mergesort_number
{
	my ($aref, $begin, $end)=@_;
	my $size=$end-$begin;
	if($size<2) {return;}
	my $half=$begin+int($size/2);
	mergesort_number($aref, $begin, $half);
	mergesort_number($aref, $half, $end);
	for(my $i=$begin; $i<$half; ++$i) {
		if($$aref[$i] > $$aref[$half]) {
			my $v=$$aref[$i];
			$$aref[$i]=$$aref[$half];
			my $i=$half;
			while($i<$end-1 && $$aref[$i+1] < $v) {
				($$aref[$i], $$aref[$i+1])=
					($$aref[$i+1], $$aref[$i]);
				++$i;
			}
			$$aref[$i]=$v;
		}
	}
}

As most users want to sort a complete array, we provide simpler versions of both mergesort_string and mergesort_number.

<<string>>=
sub msort_string
{
	my $size=@_;
	mergesort_string(\@_, 0, $size);
}
<<number>>=
sub msort_number
{
	my $size=@_;
	mergesort_number(\@_, 0, $size);
}


Test

In this test program, we generate 2 arrays, one containing strings, and one containing numbers. Running the corresponding msort_ functions sorts the data as expected. If we had used msort_string on the array containing numbers, the data would have been sorted by the ASCII-value of the digits in the numbers, so the number 10 would have been sorted before 2.

<<mergesort.perl>>=
#!/usr/bin/env perl
use strict;
use warnings;
string
number
my @towns=qw(Paris London Stockholm Berlin Oslo Rome Madrid Tallinn Amsterdam Dublin);
print "towns before mergesort:";
foreach (@towns) {
	print " $_"
}
print "\n";
msort_string(@towns);
print "towns after mergesort:";
foreach (@towns) {
	print " $_"
}
print "\n";
my @numbers=qw(9 8 6 98 43 12 59 52 4 5 14 2 92 3 32 54 22 41 7 34 15 3 1 13 99 42 63 34);
print "numbers before mergesort:";
foreach (@numbers) {
	print " $_"
}
print "\n";
msort_number(@numbers);
print "numbers after mergesort:";
foreach (@numbers) {
	print " $_"
}
print "\n";
Download code
Views