Performance issue: Java vs C++

I have always heard that C++ was way more efficient than Java (and that is why most games are developed in C++).

I wrote a small algorithm to solve the "Eight queens puzzle" in both Java and C++, using the exact same algorithm, and then started to raise the number or squares. When reaching checkboards of 20*20 or even 22*22, it appears Java is much more effective (3 seconds vs 66 seconds for C++).

I have no idea why, but I am pretty beginning with C++, so it is possible I made some huge performance mistakes, so I will gladly accept any information that would help me understand what is happening.

Below is the code I use in Java:

import java.awt.Point;
import java.util.ArrayList;
import java.util.List;

public class HuitDames {

    /**
     * La liste des coordnnées des dames.
     */
    private static List<Point> positions = new ArrayList<>();

    /**
     * Largeur de la grille.
     */
    private static final int LARGEUR_GRILLE = 22;


    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        int i = 1;
        placerDame(i);
        for (Point point : positions) {
            System.out.println("(" + point.x + "; " + point.y + ")");
        }
    }

    /**
     * Place une dame et return true si la position est bonne.
     * @param i le numéro de la dame.
     * @return si la position est bonne.
     */
    private static boolean placerDame(int i) {

        boolean bonnePosition = false;
        for (int j = 1; j <= LARGEUR_GRILLE && bonnePosition == false; j++) {
            Point emplacement = new Point(i, j);
            positions.add(emplacement);
            if (verifierPrise(emplacement) && (i == LARGEUR_GRILLE || placerDame(i + 1))) {
                bonnePosition = true;
            }
            else {
                positions.remove(i - 1);
            }
        }

        return bonnePosition;
    }

    /**
     * Vérifie que la nouvelle position n'est pas en prise avec une position déjà présente.
     * @param position la position de la nouvelle dame.
     * @return Si la position convient par rapport aux positions des autres dames.
     */
    private static boolean verifierPrise(Point position) {
        boolean nonPrise = true;
        for (Point point : positions) {
            if (!point.equals(position)) {
                // Cas où sur la même colonne.
                if (position.y == point.y) {
                    nonPrise = false;
                }
                // Cas où sur même diagonale.
                if (Math.abs(position.y - point.y) == Math.abs(position.x - point.x)) {
                    nonPrise = false;
                }
            }
        }

        return nonPrise;
    }
}

And below is the code in C++:

#include <iostream>
#include <list>
#include <math.h>
#include <stdlib.h>

using namespace std;


// Class to represent points.
class Point {

    private:
        double xval, yval;

    public:
        // Constructor uses default arguments to allow calling with zero, one,
        // or two values.
        Point(double x = 0.0, double y = 0.0) {
                xval = x;
                yval = y;
        }

        // Extractors.
        double x() { return xval; }
        double y() { return yval; }
};

#define LARGEUR_GRILLE 22
list<Point> positions;


bool verifierNonPrise(Point emplacement) {
    bool nonPrise = true;
    for (list<Point>::iterator it = positions.begin(); it!= positions.end(); it++) {
        if (it->x() != emplacement.x()) {
            if (it->y() == emplacement.y()) {
                nonPrise = false;
            }
            if (abs(it->y() - emplacement.y()) == abs(it->x() - emplacement.x())) {
                nonPrise = false;
            }
        }
    }

    return nonPrise;
}

bool placerDame(int i) {
    bool bonnePosition = false;
    for (int j = 1; j <= LARGEUR_GRILLE && !bonnePosition; j++) {
        Point emplacement(i,j);
        positions.push_back(emplacement);
        if (verifierNonPrise(emplacement) && (i == LARGEUR_GRILLE || placerDame(i + 1))) {
            bonnePosition = true;
        }
        else {
            positions.pop_back();
        }
    }

    return bonnePosition;
}

int main()
{
    int i = 1;
    placerDame(i);
    for (list<Point>::iterator it = positions.begin(); it!= positions.end(); it++) {
        cout << "(" << it->x() << "; " << it->y() << ")" << endl;
    }
    return 0;
}

Solution 1:

std::list in C++ is a linked list, whereas java.util.ArrayList is an array. Try replacing std::list by std::vector. Also, be sure to compile with optimization turned on.

Solution 2:

Updates:

Changes to C++

  • As written:
    Compilation Failure
  • Replace math.h => cmath
    27610 milliseconds
  • Add -O3 flag
    4416 milliseconds
  • Replace std::list with std::vector
    2294 milliseconds
  • Replace Point with std::pair
    2384 milliseconds
  • Made verifierNonPrise const correct
    2351 milliseconds
  • Replaced loop in verifierNonPrise with std::find_if
    929 milliseconds
  • Replacing double with int (to make it equiv to Java)
    549 milliseconds

Changes to Java

  • As written
    3459 milliseconds
  • Changes verifierNonPrise early return
    368 milliseconds

Java Vs C++

> javac HuitDames.java
> time java HuitDames
real    0m0.368s
user    0m0.436s
sys     0m0.042s    
> g++ -O3 -std=c++11 HuitDames.cpp
> time ./a.out
real    0m0.541s
user    0m0.539s
sys     0m0.002s

Final Code:

#include <iostream>
#include <vector>
#include <cmath>
#include <stdlib.h>
#include <chrono>
#include <algorithm>

using namespace std;

typedef std::pair<int, int>   Point;

#define LARGEUR_GRILLE 22
vector<Point> positions;


bool verifierNonPrise(Point const& emplacement) {
    return std::find_if(positions.begin(), positions.end(), [&emplacement](Point const& val){
        if (val.first != emplacement.first) {
            if ((val.second == emplacement.second) || (abs(val.second - emplacement.second) == abs(val.first - emplacement.first))) {
                return true;
            }
        }
        return false;
    }) == positions.end();
}

bool placerDame(int i) {
    bool bonnePosition = false;

    for (int j = 1; j <= LARGEUR_GRILLE && !bonnePosition; j++) {
        Point emplacement(i,j);
        positions.push_back(emplacement);
        if (verifierNonPrise(emplacement) && (i == LARGEUR_GRILLE || placerDame(i + 1))) {
            bonnePosition = true;
        }
        else {
            positions.pop_back();
        }
    }

    return bonnePosition;
}


int main()
{
    using std::chrono::system_clock;

    system_clock::time_point begin_time = system_clock::now();

    int i = 1;
    placerDame(i);
    for (vector<Point>::iterator it = positions.begin(); it!= positions.end(); it++) {
        cout << "(" << it->first << "; " << it->second << ")" << endl;
    }

    system_clock::time_point end_time = system_clock::now();

    long long elapsed_milliseconds = std::chrono::duration_cast<std::chrono::milliseconds>(end_time - begin_time).count();
    cout << "Duration (milliseconds): "
         << elapsed_milliseconds
         << std::endl;    
}

Solution 3:

Test this version, updated using C++11 features. Tested in GCC 4.9.0 with -std=c++11. Tested on Celeron 1.6 GHz, 512 MB RAM.

Times in my PC:
Original: Duration (milliseconds): 12658
First Version: Duration (milliseconds): 3616
Optimized Version: Duration (milliseconds): 1745

Changes are:

  • Using vector instead of list Benchmark, and Words from Stroustrup.
  • Using const whatever we can, the compiler is able to optimize much more if it known that the value don't change.
  • Using std::pair instead of Point.
  • Using new for-loop with constant iterators.

Source:

#include <iostream>
#include <vector>
#include <chrono>
#include <iomanip>

using namespace std;

typedef std::pair<int, int> Point;

#define LARGEUR_GRILLE 22
vector<Point> positions;

bool verifierNonPrise(const Point& emplacement) {
    bool nonPrise = true;
    for (const auto& p : positions) {
        if (p.first != emplacement.first) {
            if (p.second == emplacement.second) {
                nonPrise = false;
            }
            if (abs(p.second - emplacement.second) ==
                abs(p.first - emplacement.first)) {
                nonPrise = false;
            }
        }
    }

    return nonPrise;
}

bool placerDame(int i) {
    bool bonnePosition = false;
    for (int j = 1; j <= LARGEUR_GRILLE && !bonnePosition; j++) {
        Point emplacement(i, j);
        positions.emplace_back(emplacement);
        if (verifierNonPrise(emplacement) &&
            (i == LARGEUR_GRILLE || placerDame(i + 1))) {
            bonnePosition = true;
        } else {
            positions.pop_back();
        }
    }

    return bonnePosition;
}

int main(int argc, char* argv[]) {
    std::chrono::system_clock::time_point begin_time =
        std::chrono::system_clock::now();

    positions.reserve(LARGEUR_GRILLE);

    placerDame(1);
    for (const auto& p : positions) {
        cout << "(" << p.first << "; " << p.second << ")" << endl;
    }

    std::chrono::system_clock::time_point end_time =
        std::chrono::system_clock::now();
    long long elapsed_milliseconds =
        std::chrono::duration_cast<std::chrono::milliseconds>(
            end_time - begin_time).count();
    std::cout << "Duration (milliseconds): " << elapsed_milliseconds
              << std::endl;

    return 0;
}

Some more deep changes.

Changes include:

  • Returning as early as possible. As soon as the queen can not be placed.
  • Returning to a simpler Point class.
  • Using find_if algorithm for searching queen placement.

Source (some recommendation updated):

#include <algorithm>
#include <iostream>
#include <vector>
#include <chrono>
#include <iomanip>

using namespace std;

struct Point {
    int x, y;
};

#define LARGEUR_GRILLE 22
vector<Point> positions;

bool verifierNonPrise(const Point& emplacement) {
    return find_if(positions.cbegin(), positions.cend(), [&emplacement](const Point& p) {
               return (p.x != emplacement.x &&
                       (p.y == emplacement.y ||
                        abs(p.y - emplacement.y) == abs(p.x - emplacement.x)));
           }) == positions.cend();
}

bool placerDame(int i) {
    for (int j = 1; j <= LARGEUR_GRILLE; j++) {
        Point emplacement{i, j};
        positions.push_back(emplacement);
        if (verifierNonPrise(emplacement) &&
            (i == LARGEUR_GRILLE || placerDame(i + 1))) {
            return true;
        } else {
            positions.pop_back();
        }
    }
    return false;
}

int main(int argc, char* argv[]) {
    std::chrono::system_clock::time_point begin_time =
        std::chrono::system_clock::now();

    positions.reserve(LARGEUR_GRILLE);

    placerDame(1);
    for (const auto& p : positions) {
        cout << "(" << p.x << "; " << p.y << ")" << endl;
    }

    std::chrono::system_clock::time_point end_time =
        std::chrono::system_clock::now();
    long long elapsed_milliseconds =
        std::chrono::duration_cast<std::chrono::milliseconds>(
            end_time - begin_time).count();
    std::cout << "Duration (milliseconds): " << elapsed_milliseconds
              << std::endl;

    return 0;
}

Solution 4:

Comparing a managed, dynamically compiled language like Java to a statically compiled language like C++ is very difficult.

You will always be comparing apples to oranges in a sense, because they are conceptually very different. It starts with the use of the standard libraries (ArrayList vs std::list/vector) that will have potentially wildly different performance characteristics, even your code looks similar in both languages.

Then there is the inherent problem with microbenchmarks in Java (short test in Java are always slower because the JIT will observe program flow before it decides what and how it is to be compiled). Same goes for compiler options for C++, even the structure of the source code (independently compiled and linked classes versus single file source) can make a significant difference (because it changes the amount of "insight" the C++ compiler has into the other classes).

Next is the general difference in memory management, garbage collection vs manual memory management (smart pointers etc. are still considered manual memory management).

Not to mention the general language differences like you need to explicitly declare a method virtual in C++, while in Java every member method is virtual by default (working out if it's really virtual at runtime is left to the VM).

With all those differences there will always be cases where one langauge will have a massive advantage over the other. A simple test with very limited scope (like your test here) says very little about each language as a whole.

Another point people often tend to ignore is: How productive can you be with a language - speed isn't everything (look a how sucessful script langages are in some domains, despite being hardly competive when looking only at excution speed). Lack of performance can be crippling, but so can be low productivity.