Benchmarking – The Specifics

In this article I will describe the benchmarking tests that were performed on the eleven languages. Throughout the course of working on these benchmarking experiments I took upon some CPU-intensive programs and recoded those standard test-programs in each of our selected languages. Here I will discuss about some those test-programs and galso will provide code snippets. In a month I will host all the code snippets on our web site.

Ackermann Function

The Ackermann function is the simplest example of a well defined total function which is computable but is not primitive recursive (A function which can be implemented using only for-loops is called primitive recursive), providing a counter example to the belief in the early 1900s that every computable function was also primitive recursive. It grows faster than an exponential function, or even a multiple exponential function. 
The Ackermann function A(x,y) is defined for integer x and y by 



Ackermann's function is heavily recursive, and will really stress a language's ability to do deep recursion. This test computes the value of Ack(3, N), where N ranges up to 8. Those languages that do not implement tail recursion elimination (tail-call elimination) will not peform as well as those that do. The correct output (for N = 4) looks like this: Ack(3,4): 125. As far as this test is concerned Java performs best with N=8. Here is the code in Java.

..........................................................................................................................................
public class ackermann 
{
    public static void main(String[] args) 
    {
      int num = Integer.parseInt(args[0]);
      System.out.println("Ack(3," + num + "): " + Ack(3, 
                                             num));
    }
    public static int Ack(int m, int n) 
    {
      return (m == 0) ? (n + 1) : ((n == 0) ? Ack(m-1, 1) :
                         Ack(m-1, Ack(m, n - 1)));
    }
}

PHP, TCL both stop working at N < 8 (the common error is 'out of stack space'). I was a little surprised by Perl's bad performance and memory usage in this test. But the I came to know that Perl actually keeps around call frames (read more about how recursion works on the net) when they are done so that it can call them again faster. This is undoubtedly the reason for recursion being slow. Following is the Perl code which has been tweaked to reduce the stack frame size.

use integer;

..........................................................................................................................................................
# It's prettier but slower to do this
#sub Ack {
# my($M, $N) = @_;
# return( $N + 1 ) if ($M == 0);
# return( Ack($M - 1, 1) ) if ($N == 0);
# Ack($M - 1, Ack($M, $N - 1));
#}

# in our quest for speed, we must get ugly:
# it helps reduce stack frame size a little bit
sub Ack {
return $_[0] ? ($_[1] ? Ack($_[0]-1, Ack($_[0], $_[1]-1))
: Ack($_[0]-1, 1))
: $_[1]+1;
}

my $NUM = $ARGV[0];
$NUM = 1 if ($NUM < 1);
my $ack = Ack(3, $NUM);
print "Ack(3,$NUM): $ack\n";

Echo Client/Server

In this test I stressed the language’s capability of multi-threading and networking. I first setup a server socket-thread at port 0 of local machine. Then I created a client socket-thread listening at the same port 0 and connecting back to the server already setup and running. The client thread sends a predefined string to the server a number of times(N : at least 1) defined by commandline argument. The server keeps on counting the number of bytes received and echoed back the received bytes to the client. If the echo doesn’t match to that sent by the client then it throws an exception and exits. Perl performs excellent here with CPU time 18.88 sec when N = 100000. Due to readability reason I’m giving the Java code first

..........................................................................................................................................................
import
java.io.*;
import java.net.*;

public class
echo 
{
  public static void main(String[] args) throws Exception     
 
{
    int iIterations = 1;
    try 
    {
        iIterations = Integer.parseInt(args[0]);
   
    catch(Exception e) { }
    EchoServer esServer = new EchoServer(0);
    new EchoClient(InetAddress.getLocalHost(), 
                   esServer.getPort(), iIterations);
  }
}

//Loopback echo-client thread
class EchoClient extends Thread 
{
    private static final String GREETING = "Hello World\n";
    private final InetAddress inetaServer;
    private final int         iPort;
    private final int         iIterations;
    public EchoClient(InetAddress inetaServer, int iPort, 
  int iIterations) 
    {
      this.inetaServer = inetaServer;
     this.iPort = iPort;
      this.iIterations = iIterations;
      start();
    }

   
public void run() 
    {
      Socket socketFromServer = null;
      try 
      {
        socketFromServer = new Socket(inetaServer, iPort);
        BufferedReader in = new BufferedReader(new 
                                     InputStreamReader(
socketFromServer.getInputStream()
  )
                      );
        OutputStream out = 
socketFromServer.getOutputStream();

       
byte[] bytesOut = GREETING.getBytes();
        String strIn = GREETING.trim();

       
for(int i = 0; i < iIterations; ++i) 
        {
           out.write(bytesOut);
           out.flush();
           String strRead = in.readLine();
           if(!strRead.equals(strIn))
              throw new RuntimeException("client: \"" + 
strIn + "\" ne \"" + strRead + "\"");
        }
   
    catch(Exception e) 
    {
        e.printStackTrace();
    }

   
try 
    {
        socketFromServer.close();
   
    catch(Exception e) { }
  }
}
//The main server thread

class EchoServer extends Thread 
{
    private static final int   BUFFER_SIZE = 1024;
    private final ServerSocket ssAccepting;
    private final int          iPort;

   
public EchoServer(int iPort) throws IOException 
    {
       ssAccepting = new ServerSocket(iPort);
       this.iPort = ssAccepting.getLocalPort();
       start();
    }

   
public final int getPort() 
    {
      return iPort;
    }

    public void run() 
    {
      byte bytesIn[] = new byte[BUFFER_SIZE];
      try 
      {
        Socket socketClient = ssAccepting.accept();
        InputStream in = socketClient.getInputStream();
        OutputStream out = socketClient.getOutputStream();
        int iLength, iCount = 0;

       
while ((iLength = in.read(bytesIn)) != -1) 
        {
          out.write(bytesIn, 0, iLength);
          out.flush();
          iCount += iLength;
        } 

        System.out.println("server processed " + iCount + " 
   bytes");
     
      catch (Exception e) 
      {
        e.printStackTrace();
      }
    }
}

HeapSort

The classic heapsort program is useful to test duarability of the language. Given N = 80000, VC++ perfoms great with a whopping CPU time 0.11 sec. Just below VC++ it’s our un-biased dark horse Borland Delphi (non.NET version). Java’s time is 0.76 sec – which is considerably poor. The nice thing about Borland Delphi is it possesses all the weapons in the armory but shows all-round consistent performance not like Java which performs great somewhere – very poor elsewhere. Even C# beats Java here with a good time of 0.27 sec. After all slow but steady wins the race. Again I’m giving Java code here instead of some pseudo-code only because of Java’s readability. It clarifies properly what particularly this test is going to do.


import
java.text.*;
import java.lang.reflect.Array;

public
class heapsort
{
  public static final long IM = 139968;
  public static final long IA =   3877;
  public static final long IC =  29573;
  public static void main(String args[])
  {
    int N = Integer.parseInt(args[0]);
    NumberFormat nf = NumberFormat.getInstance();
    nf.setMaximumFractionDigits(10);
    nf.setMinimumFractionDigits(10);
    nf.setGroupingUsed(false);
   double []ary = (double[])Array.newInstance(double.class,
                                       N+1);
    for (int i=1; i<=N; i++) 
    {
        ary[i] = gen_random(1);
    }

   
heapsort(N, ary);
    System.out.print(nf.format(ary[N]) + "\n");
  }

 
public static long last = 42;
  public static double gen_random(double max)
 {
    return( max * (last = (last * IA + IC) % IM) / IM );
}

 
public static void heapsort(int n, double ra[])
  {
    int l, j, ir, i;
    double rra;
    l = (n >> 1) + 1;
    ir = n;
    for (;;)
    {
        if (l > 1)
        {
          rra = ra[--l];
        }
        else 
        {
          rra = ra[ir];
          ra[ir] = ra[1];

         
if (--ir == 1)
          {
            ra[1] = rra;
            return;
          }
        }    
        i = l;
        j = l << 1;

       
while (j <= ir)
        {
          if (j < ir && ra[j] < ra[j+1]) { ++j; }
          if (rra < ra[j])
          {
            ra[i] = ra[j];
            j += (i = j);
          }
          else
          {
            j = ir + 1;
          }
        }
        ra[i] = rra;
    }
  }
}

Object Instantiation

This is a pure OO test. It basically checks that how quickly a language can instantiate user-defined types (UDT) or classes. It takes a parameter (N) like most of the other tests and iterates the object instantiation task for N number of times. This really stress the OO subsystem of a language especially when tested with N = 1000000. My personal favorite C# performs great here with a high score of 0.29 CPU sec. Next two challengers are Java (with 0.77 CPU sec) and Delphi (very bad performance: with 2.52 CPU sec). Here is the C# code for the test.

..........................................................................................................................................................
using
System;

class
Toggle {
    public bool state = true;
    public Toggle(bool start_state) {
        this.state = start_state;
    }

    public bool value() {
        return(this.state);
    }

    public Toggle activate() {
        this.state = !this.state;
        return(this);
    }
}

class
NthToggle : Toggle {
    int count_max = 0;
    int counter = 0;

   
public NthToggle(bool start_state, int max_counter) : base(start_state) {

        this.count_max = max_counter;
        this.counter = 0;
    }

   
public new NthToggle activate() {
        this.counter += 1;
        if (this.counter >= this.count_max) {
            this.state = !this.state;
            this.counter = 0;
        }
        return(this);
    }
}

class
App {
    public static int Main(String[] args) {
        int n;

        n = System.Convert.ToInt32(args[0]);

       
if(n < 1) n = 1;

       
Toggle toggle1 = new Toggle(true);
        for (int i=0; i<5; i++) {
            Console.WriteLine((toggle1.activate().value()) ? "true" :
"false");
        }

       
for (int i=0; i<n; i++) {
            Toggle toggle = new Toggle(true);
        }
        Console.WriteLine();  

        NthToggle ntoggle1 = new NthToggle(true, 3);

        for (int i=0; i<8; i++) {
            Console.WriteLine((ntoggle1.activate().value()) ? "true" : "false");
        }

       
for (int i=0; i<n; i++) {
            NthToggle toggle = new NthToggle(true, 3);
        }
        return(0);
    }
}

Reversing a Console File

This test measures the console I/O buffer processing capability of the languages. It reads stream of text from the standard console until it encounters a termination character like EOF or Ctrl+Z and prints out those entered lines as First In First Out basis. Delphi is the winner with 1.03 CPU sec and the runner up VC++ finishes with 2.30 CPU sec. If the C++ library had not been used then it would score a terrific 0.32 CPU sec (with raw C inside MSVC editor). I’m sure that the performance penalty for VC++ is solely due to its standard file I/O library, which is an indispensable part of famous STL or Structured Template Library. Here I will give Delphi code first and then C++ code –

..........................................................................................................................................................
program
reversefile2;

uses Windows;

const MAX_READ = 4096;

// avoids using SysUtils and its initializations

function
FileRead(Handle: THandle; var Buffer; Count: LongWord): Integer;

begin
  if not ReadFile(Handle, Buffer, Count, LongWord(Result), nil) then
    Result := -1;
end;

function
FileWrite(Handle: THandle; const Buffer; Count: LongWord): Integer;

begin

  if not WriteFile(Handle, Buffer, Count, LongWord(Result), nil) then
    Result := -1;
end;

var
buf: array of char;
    numRead: integer;
    inHandle,outHandle: THandle;
    filePos,i,e: cardinal;

begin
  inHandle:=GetStdHandle(STD_INPUT_HANDLE);
  outHandle:=GetStdHandle(STD_OUTPUT_HANDLE);
  setLength(buf, MAX_READ);
  numRead:=fileRead(inHandle,buf[0],MAX_READ);
  filePos:=0;

 
while numRead>0 do begin
    inc(filePos,numRead);
   
if integer(filePos)+MAX_READ-1>high(buf) then
      setLength(buf, (high(buf)+1)*2);
    numRead:=fileRead(inHandle,buf[filePos],MAX_READ);
  end;

  e:=filePos-1;

 
for i:=filePos-2 downto 0 do
    if buf[i]=#10 then begin
      fileWrite(outHandle,buf[i+1],e-i);
      e:=i;
    end;
  fileWrite(outHandle,buf[0],e+1);
end.

Now the C++ code

..........................................................................................................................................................
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>

using
namespace std;

int
main() 
{
    typedef vector<string> LINES;
    LINES l;
    char line[256];
    ios_base::sync_with_stdio(false);
    cin.tie(0);

   
while (cin.getline(line, 256)) 
    {
        l.push_back(line);
    }

   
for (LINES::reverse_iterator j = l.rbegin(); j != 
                                l.rend(); j++) 
    {
       cout << (*j) << endl;
    }    
}

There are other benchmarking tests like Regular Expression Matching, High order matrix multiplication, String concatenation, Array access, Hashing etc. They are also as important as those of above. Although these tests are judging the languages inside out but by no means they are the ultimate method. Every language has its own beauty and charm. And moreover there are several other technical and non-technical issues to cover for each language or language-IDE pair. 

(Debasish Bose, is a self confessed C# fan. But he is clear that he did not tweak around, to push his favourite language up the ladder. Hence no mails to him on that!)




Added on January 18, 2008 Comment

Comments

Post a comment

Your name:

Comment: