Atlassian Crucible very slow on large repository

My company has been running a trial of Atlassian Crucible for some months now. For repositories where it's working properly, users have given very positive feedback about the tool. The problem I'm having is that we have several different projects, each with its own repository, and some of those repositories are very large. One repository in particular has a large number of branches and probably around 9,000 files per branch. Browsing that repository in Crucible is extremely slow.

Crucible is running on a CentOS VM. The VM has 4GB of RAM, and I've set Crucible's maximum at 3GB, of which it is currently using 2GB. I've brought this up in a support ticket with Atlassian, and they suggested the following:

In particular because you have a rather large SVN repository you will likely find that Fisheye will be creating a large index file on disk. To help improve performance a few things you can try are:

  • Increasing the available memory available to Fisheye.
  • Migrating to an external database.
  • Excluding files and directories from your index that aren't needed.

I've tried all of these things to an extent, but so far none have helped greatly. I was originally running Crucible on a Windows box with 2GB of RAM using the built-in HSQL DB. Moving to MySQL on CentOS improved performance for some repositories and made Crucible much more stable, but did not seem to help much with our biggest repository. There are only so many files/branches I can exclude from indexing while maintaining the tool's usefulness.

That being the case, does anyone have any tips on how to speed up Crucible on large repositories, without investing in insanely powerful hardware?

Thanks!

Edit: To clarify, since I didn't mention it explicitly above, I am using FishEye.

Edit 2: Since I originally posted this, performance has improved somewhat with new Crucible releases, but it's still not great by any means. It seems that this issue affects many users, including some with far more powerful hardware than we are using. Thus, I do not believe it is a hardware issue, but rather an issue with inherent inefficiency in Crucible. Atlassian is aware of the issue and will be including further performance improvements in future releases, so hopefully those changes solve our problems.

Edit 3: I'd forgotten how long ago I'd asked this question, so in my previous edit I neglected to mention that our hardware situation has also changed since it was originally asked. We're now running Crucible on a dedicated, physical server, still using CentOS. The hardware is still modest (4GB RAM, quad core CPU and dual 500GB disks in RAID 1 with external backup), but we did see a slight performance increase when we moved away from the VM.


Solution 1:

Since migrating to MySQL made a noticeable difference for some repositories, consider tuning the database for further improvements. Changing some my.cnf values from the defaults can make a huge difference. See InnoDB Performance Optimization Basics for more information. Also enable the slow query log to find slow queries, and add indexes where appropriate.

My next guess would be network speed: is your Crucible instance on the same wired local network as your SVN repositories? You might also try giving Crucible a trial run on the same machine as your primary repository if possible to eliminate network latency as the culprit.
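To get a rough number for that latency, a small Python sketch like the one below times TCP connections from the Crucible box to the SVN server. The host name and port in the usage note are placeholders, and connect time is only a proxy for round-trip latency; it says nothing about throughput during a full repository scan.

```python
import socket
import statistics
import time


def connect_times(host, port, attempts=5, timeout=3.0):
    """Return the time in milliseconds taken by each TCP connect."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # Open the connection and close it again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Reduce a list of timings to min / median / max."""
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

For example, summarize(connect_times("svn.example.com", 80)) run from the Crucible server, where svn.example.com is a hypothetical host name. On a wired local network I would expect the median to be in the low single-digit milliseconds; if it is much higher, the network is worth a closer look.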

And I know it might be difficult depending on your work environment, but running Crucible in a VM probably isn't helping things; Atlassian makes a note of this on their very brief Best Practices for Crucible Configuration page. I'm sure you've already come across it, but I'll also mention the Tuning FishEye page for other readers.

I also have performance issues for large projects, but attribute a lot of the slowness to Crucible's heavy web interface. This is especially true after clicking around for a bit (previously viewed pages in a review remain in the browser window, even when hidden from sight). Our developers have noticed a slight speed increase by switching to Google Chrome. Also check out the Atlassian IDE Connector if a compatible plugin exists for your development environment. The Eclipse IDE Connector had issues of its own the last time I used it (many months ago), but it could at least handle large file sets without hanging up.

Depending on your company's development practices, you could stop scanning a large number of code branches (assuming many of them are no longer active), and disable repositories for completed/dead projects until they're needed. My company utilizes very small teams on a large number of projects, so most of the time we work primarily on trunk, making branches the exception; we therefore explicitly add branches to scan instead of including all branches by default. Also make sure you're not accidentally scanning tags.

How is your CPU usage on the Crucible box? If you're using SVN behind Apache HTTPD, examine how many connections are consumed by Crucible during a big repository scan. Aside from that, I'm not sure what else you could look at (maybe disk speed? Repository scan frequency?), but hopefully the above tips will help a bit.

Solution 2:

More than 4GB of RAM isn't "insanely powerful" hardware. Assuming you've got 25 users and you're using FishEye (which you mention), you're spending $4400 on just the software. $4k at Dell could buy you a server with 48GB of RAM.

Also, are you using a 64-bit JVM? The docs suggest that you'll see a smaller memory footprint on a 32-bit JVM.
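If you're not sure which one you have, here is a quick Python sketch for checking. It relies on the HotSpot/OpenJDK convention of including "64-Bit" in the `java -version` banner; other JVM vendors may word the banner differently, and the `java` on your PATH is not necessarily the one FishEye is started with, so treat the result as a hint only.

```python
import subprocess


def is_64bit_jvm(version_text):
    """Guess JVM bitness from the text printed by `java -version`."""
    # HotSpot banners read e.g. "Java HotSpot(TM) 64-Bit Server VM".
    return "64-Bit" in version_text


def installed_jvm_version_text(java_cmd="java"):
    """Run `java -version` and return everything it printed."""
    result = subprocess.run(
        [java_cmd, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # The banner normally goes to stderr, not stdout.
    return result.stderr + result.stdout
```

Calling is_64bit_jvm(installed_jvm_version_text()) on the Crucible server gives the answer for the default JVM.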