How do we secure our source code?

Our source code is our most prized asset. I would like to have it:

  • secured against leaks by in-house developers, although they also need unrestricted access to the code to do their job properly, so I'm not sure this is even possible.
  • regularly backed up to a secure location. Would it be safe to upload it to cloud storage such as box.net?

Any recommendations on strategies? Or am I paranoid?


Securing things from the actions of your people is more a human issue than a technology one unfortunately, so I'll leave that for others to answer (humans are not my forté - machines: yes, cats: sometimes, humans: no!).

If you are sending your code to any external service, you have to make sure either that it is securely encrypted before it is sent or that you have fully vetted the external service, and preferably both. Running your own backup servers will be safer (you have more direct control) but more complex (you have to do everything yourself). As your backup servers will probably be running in colo space that you don't physically control, you might want to keep the data on encrypted filesystems that do not auto-mount on boot: have them require manual intervention to send over the key(s) if the servers need restarting. Having the keys on the server so it can auto-mount the encrypted volumes is like having an expensive safe with the combination written on a post-it note nearby.

Either way, you should have offline backups as well as online ones - i.e. discs/tapes offsite and not connected. That way if you are thoroughly hacked and all your core servers, local backups and hosted online backups are wrecked you should still have the offline backups to roll back to.

To mitigate the problem of a hacker breaking into your main servers and using them to break into your backup servers (which happened to a relatively high profile web service a few months ago), I suggest having an intermediate server that both the live and backup servers connect to. That way you can arrange for the live and backup servers to have no access to each other, and the intermediate server doesn't need to log in to either of them. The live servers would log in to the intermediate server to push the most recent data, and some time later the backup servers would log in to pull it to themselves. This doesn't remove the recommendation of having offline offsite backups too, though it reduces your chance of needing to use them in anger.
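As a rough illustration of that push/pull arrangement, here is a minimal Python sketch. It is a simplification under stated assumptions: plain directories stand in for the live, intermediate and backup machines, the function names `push_to_staging` and `pull_from_staging` are made up for the example, and the checksum marker file is my own addition rather than part of the setup described above. In a real deployment each function would run on a different machine and reach the intermediate server over something like SFTP.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def push_to_staging(archive: Path, staging: Path) -> Path:
    """Run on the live server: copy one archive into the staging area."""
    staging.mkdir(parents=True, exist_ok=True)
    target = staging / archive.name
    shutil.copy2(archive, target)
    # Write the checksum last, so its presence marks a finished upload.
    marker = target.with_name(target.name + ".sha256")
    marker.write_text(sha256_of(archive))
    return target


def pull_from_staging(staging: Path, backup: Path) -> list[str]:
    """Run on the backup server: fetch every complete, intact archive."""
    backup.mkdir(parents=True, exist_ok=True)
    pulled = []
    for marker in sorted(staging.glob("*.sha256")):
        archive = marker.with_name(marker.name[: -len(".sha256")])
        expected = marker.read_text().strip()
        if not archive.is_file() or sha256_of(archive) != expected:
            # Incomplete or altered upload: skip it and leave it for inspection.
            continue
        shutil.copy2(archive, backup / archive.name)
        pulled.append(archive.name)
    return pulled
```

Note that neither function needs credentials for the machine at the other end: the live side only ever writes to the staging area and the backup side only ever reads from it.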

One extra option for hosting your external backups: if you are on very good terms with another local non-competing business you could perhaps host each other's backups. You might still encrypt your backups for true paranoia though (not in case the other business goes bad, though that could happen, but to cover for the possibility that they get hacked or burgled themselves).

And one extra point that is all too often neglected: make sure you have a procedure in place for testing the backups. You don't want to discover, on the day you need to restore something, that they stopped working for some reason weeks ago. There are a number of ways to test your backups; the best one depends on the nature and size of the data you are storing and the format it is stored in. For instance, I have a copy of my mail server running in a VM that thinks it is the live server but cannot be seen from the outside world. Three times a week a script stops it, restores the most recent backup to it, and restarts it, with any errors being mailed to me. Then as part of my regular housekeeping I log in to this backup VM to check everything looks OK (it is running, recent changes are present, a random sample of old data looks OK too, ...). You should still occasionally test the backups manually, but automated tests are sometimes a godsend: they might flag a minor problem before it becomes a major one.
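For a source tree, an automated restore test can be much simpler than a full VM. The sketch below is one possible approach, not the script I use for the mail server: it assumes the backup is a gzip-compressed tar archive written outside the source directory, and the names `make_backup` and `verify_backup` are invented for the example. It records a checksum for every file at backup time, then restores the archive into a scratch directory and compares what comes out against that record.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def make_backup(source: Path, archive: Path) -> dict[str, str]:
    """Archive every file under source and return the checksum manifest."""
    manifest = file_digests(source)
    with tarfile.open(archive, "w:gz") as tar:
        for name in manifest:
            tar.add(source / name, arcname=name)
    return manifest


def verify_backup(archive: Path, manifest: dict[str, str]) -> list[str]:
    """Restore archive to a scratch directory and check it against manifest.

    Returns a list of problems; an empty list means the restore matched.
    """
    problems = []
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(scratch)
        restored = file_digests(Path(scratch))
    for name, digest in sorted(manifest.items()):
        if name not in restored:
            problems.append(f"missing after restore: {name}")
        elif restored[name] != digest:
            problems.append(f"checksum mismatch: {name}")
    for name in sorted(set(restored) - set(manifest)):
        problems.append(f"unexpected file: {name}")
    return problems
```

A scheduled job could call `verify_backup` after each backup run and mail out the returned list whenever it is not empty. This only proves the archive can be unpacked and matches what was recorded, so it complements the occasional manual check rather than replacing it.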

It is difficult to be too paranoid when looking after your source code. It is your core asset, your business may be worth nothing without it, so you need to guard it from outside malicious forces (including natural forces!) very carefully.


No, you are not paranoid. A couple of things occur to me:

  • Do all your developers sign non-disclosure agreements? Do the agreements spell out that source code is a corporate asset?
  • Do you have policies that explicitly restrict developers from carting around source code?
  • Are your developers' laptop hard drives and USB keys encrypted?

  • Backing up to the "cloud" is OK, but you should also consider making a backup and storing it off-site yourself.

\\Greg


There is absolutely no difference between your source code and any other files you want kept private (e.g. financials). The human side is something you will have to deal with as you see fit. The security and integrity of the files can be managed in multiple ways. I prefer to perform my own backups to tape. Those tapes are stored off-site so that in the event of a disaster (e.g. building burning down - again!) no more than one day's worth of data is lost.

When considering using "the cloud" just bear in mind that it's called that because there are no hard edges or definitions. Quite simply, you cannot know where your data is at any point in time, or who has access to it. If you feel so strongly that your data needs to be protected then you need to be in direct control of it.