Protecting executable from reverse engineering?
I've been contemplating how to protect my C/C++ code from disassembly and reverse engineering. Normally I would never condone this behavior myself in my code; however the current protocol I've been working on must not ever be inspected or understandable, for the security of various people.
Now this is a new subject to me, and the internet is not really resourceful for prevention against reverse engineering but rather depicts tons of information on how to reverse engineer
Some of the things I've thought of so far are:
- Code injection (calling dummy functions before and after actual function calls)
- Code obfustication (mangles the disassembly of the binary)
-
Write my own startup routines (harder for debuggers to bind to)
void startup(); int _start() { startup( ); exit (0) } void startup() { /* code here */ }
Runtime check for debuggers (and force exit if detected)
-
Function trampolines
void trampoline(void (*fnptr)(), bool ping = false) { if(ping) fnptr(); else trampoline(fnptr, true); }
Pointless allocations and deallocations (stack changes a lot)
- Pointless dummy calls and trampolines (tons of jumping in disassembly output)
- Tons of casting (for obfuscated disassembly)
I mean these are some of the things I've thought of but they can all be worked around and or figured out by code analysts given the right time frame. Is there anything else alternative I have?
Solution 1:
but they can all be worked around and or figured out by code analysists given the right time frame.
If you give people a program that they are able to run, then they will also be able to reverse-engineer it given enough time. That is the nature of programs. As soon as the binary is available to someone who wants to decipher it, you cannot prevent eventual reverse-engineering. After all, the computer has to be able to decipher it in order to run it, and a human is simply a slower computer.
Solution 2:
What Amber said is exactly right. You can make reverse engineering harder, but you can never prevent it. You should never trust "security" that relies on the prevention of reverse engineering.
That said, the best anti-reverse-engineering techniques that I've seen focused not on obfuscating the code, but instead on breaking the tools that people usually use to understand how code works. Finding creative ways to break disassemblers, debuggers, etc is both likely to be more effective and also more intellectually satisfying than just generating reams of horrible spaghetti code. This does nothing to block a determined attacker, but it does increase the likelihood that J Random Cracker will wander off and work on something easier instead.
Solution 3:
Safe Net Sentinel (formerly Aladdin). Caveats though - their API sucks, documentation sucks, and both of those are great in comparison to their SDK tools.
I've used their hardware protection method (Sentinel HASP HL) for many years. It requires a proprietary USB key fob which acts as the 'license' for the software. Their SDK encrypts and obfuscates your executable & libraries, and allows you to tie different features in your application to features burned into the key. Without a USB key provided and activated by the licensor, the software can not decrypt and hence will not run. The Key even uses a customized USB communication protocol (outside my realm of knowledge, I'm not a device driver guy) to make it difficult to build a virtual key, or tamper with the communication between the runtime wrapper and key. Their SDK is not very developer friendly, and is quite painful to integrate adding protection with an automated build process (but possible).
Before we implemented the HASP HL protection, there were 7 known pirates who had stripped the dotfuscator 'protections' from the product. We added the HASP protection at the same time as a major update to the software, which performs some heavy calculation on video in real time. As best I can tell from profiling and benchmarking, the HASP HL protection only slowed the intensive calculations by about 3%. Since that software was released about 5 years ago, not one new pirate of the product has been found. The software which it protects is in high demand in it's market segment, and the client is aware of several competitors actively trying to reverse engineer (without success so far). We know they have tried to solicit help from a few groups in Russia which advertise a service to break software protection, as numerous posts on various newsgroups and forums have included the newer versions of the protected product.
Recently we tried their software license solution (HASP SL) on a smaller project, which was straightforward enough to get working if you're already familiar with the HL product. It appears to work; there have been no reported piracy incidents, but this product is a lot lower in demand..
Of course, no protection can be perfect. If someone is sufficiently motivated and has serious cash to burn, I'm sure the protections afforded by HASP could be circumvented.
Solution 4:
Making code difficult to reverse-engineer is called code obfuscation.
Most of the techniques you mention are fairly easy to work around. They center on adding some useless code. But useless code is easy to detect and remove, leaving you with a clean program.
For effective obfuscation, you need to make the behavior of your program dependent on the useless bits being executed. For example, rather than doing this:
a = useless_computation();
a = 42;
do this:
a = complicated_computation_that_uses_many_inputs_but_always_returns_42();
Or instead of doing this:
if (running_under_a_debugger()) abort();
a = 42;
Do this (where running_under_a_debugger
should not be easily identifiable as a function that tests whether the code is running under a debugger — it should mix useful computations with debugger detection):
a = 42 - running_under_a_debugger();
Effective obfuscation isn't something you can do purely at the compilation stage. Whatever the compiler can do, a decompiler can do. Sure, you can increase the burden on the decompilers, but it's not going to go far. Effective obfuscation techniques, inasmuch as they exist, involve writing obfuscated source from day 1. Make your code self-modifying. Litter your code with computed jumps, derived from a large number of inputs. For example, instead of a simple call
some_function();
do this, where you happen to know the exact expected layout of the bits in some_data_structure
:
goto (md5sum(&some_data_structure, 42) & 0xffffffff) + MAGIC_CONSTANT;
If you're serious about obfuscation, add several months to your planning; obfuscation doesn't come cheap. And do consider that by far the best way to avoid people reverse-engineering your code is to make it useless so that they don't bother. It's a simple economic consideration: they will reverse-engineer if the value to them is greater than the cost; but raising their cost also raises your cost a lot, so try lowering the value to them.
Now that I've told you that obfuscation is hard and expensive, I'm going to tell you it's not for you anyway. You write
current protocol I've been working on must not ever be inspected or understandable, for the security of various people
That raises a red flag. It's security by obscurity, which has a very poor record. If the security of the protocol depends on people not knowing the protocol, you've lost already.
Recommended reading:
- The security bible: Security Engineering by Ross Anderson
- The obfuscation bible: Surreptitious software by Christian Collberg and Jasvir Nagra