How to hide a string in binary code?
Sometimes, it is useful to hide a string from a binary (executable) file. For example, it makes sense to hide encryption keys from binaries.
When I say “hide”, I mean making strings harder to find in the compiled binary.
For example, this code:
const char* encryptionKey = "My strong encryption key";
// Using the key
after compilation produces an executable file with the following in its data section:
4D 79 20 73 74 72 6F 6E-67 20 65 6E 63 72 79 70 |My strong encryp|
74 69 6F 6E 20 6B 65 79 |tion key |
You can see that our secret string can be easily found and/or modified.
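To illustrate how little effort this takes, here is a minimal sketch of the kind of scan that tools such as the Unix `strings` utility perform. The function name `findPrintableRuns` and the default minimum length of 4 are assumptions made for this example, not part of any existing tool's API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collect every run of at least minLen printable
// ASCII bytes found in a raw byte buffer (for example, a loaded executable).
std::vector<std::string> findPrintableRuns(const std::vector<unsigned char>& bytes,
                                           std::size_t minLen = 4)
{
    assert(minLen > 0);
    std::vector<std::string> runs;
    std::string current;
    for (unsigned char b : bytes) {
        if (b >= 0x20 && b <= 0x7e) {
            // Printable character: extend the current run.
            current.push_back(static_cast<char>(b));
        } else {
            // Non-printable byte ends the run; keep it only if long enough.
            if (current.size() >= minLen) runs.push_back(current);
            current.clear();
        }
    }
    if (current.size() >= minLen) runs.push_back(current);
    return runs;
}
```

Running this over the data section shown above would report the key as one of the runs, which is why a plain string literal offers no protection at all.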
I could hide the string…
char encryptionKey[30];
int n = 0;
encryptionKey[n++] = 'M';
encryptionKey[n++] = 'y';
encryptionKey[n++] = ' ';
encryptionKey[n++] = 's';
encryptionKey[n++] = 't';
encryptionKey[n++] = 'r';
encryptionKey[n++] = 'o';
encryptionKey[n++] = 'n';
encryptionKey[n++] = 'g';
encryptionKey[n++] = ' ';
encryptionKey[n++] = 'e';
encryptionKey[n++] = 'n';
encryptionKey[n++] = 'c';
encryptionKey[n++] = 'r';
encryptionKey[n++] = 'y';
encryptionKey[n++] = 'p';
encryptionKey[n++] = 't';
encryptionKey[n++] = 'i';
encryptionKey[n++] = 'o';
encryptionKey[n++] = 'n';
encryptionKey[n++] = ' ';
encryptionKey[n++] = 'k';
encryptionKey[n++] = 'e';
encryptionKey[n++] = 'y';
…but it's not a nice method. Any better ideas?
PS: I know that merely hiding secrets doesn't work against a determined attacker, but it's much better than nothing…
Also, I know about asymmetric encryption, but it's not acceptable in this case. I am refactoring an existing application which uses Blowfish encryption and passes encrypted data to the server (the server decrypts the data with the same key).
I can't change the encryption algorithm because I need to provide backward compatibility. I can't even change the encryption key.
Solution 1:
Sorry for the long answer.
Your answers are absolutely correct, but the question was how to hide a string and do it nicely.
I did it this way:
#include <iostream>
#include "HideString.h"
DEFINE_HIDDEN_STRING(EncryptionKey, 0x7f, ('M')('y')(' ')('s')('t')('r')('o')('n')('g')(' ')('e')('n')('c')('r')('y')('p')('t')('i')('o')('n')(' ')('k')('e')('y'))
DEFINE_HIDDEN_STRING(EncryptionKey2, 0x27, ('T')('e')('s')('t'))
int main()
{
std::cout << GetEncryptionKey() << std::endl;
std::cout << GetEncryptionKey2() << std::endl;
return 0;
}
HideString.h:
#include <boost/preprocessor/cat.hpp>
#include <boost/preprocessor/seq/for_each_i.hpp>
#include <boost/preprocessor/seq/enum.hpp>
#define CRYPT_MACRO(r, d, i, elem) ( elem ^ ( d - i ) )
#define DEFINE_HIDDEN_STRING(NAME, SEED, SEQ)\
static const char* BOOST_PP_CAT(Get, NAME)()\
{\
static char data[] = {\
BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ)),\
'\0'\
};\
\
static bool isEncrypted = true;\
if ( isEncrypted )\
{\
for (unsigned i = 0; i < ( sizeof(data) / sizeof(data[0]) ) - 1; ++i)\
{\
data[i] = CRYPT_MACRO(_, SEED, i, data[i]);\
}\
\
isEncrypted = false;\
}\
\
return data;\
}
The trickiest line in HideString.h is:
BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ))
Let me explain this line. For the code:
DEFINE_HIDDEN_STRING(EncryptionKey2, 0x27, ('T')('e')('s')('t'))
BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ) generates the sequence:
( 'T' ^ ( 0x27 - 0 ) ) ( 'e' ^ ( 0x27 - 1 ) ) ( 's' ^ ( 0x27 - 2 ) ) ( 't' ^ ( 0x27 - 3 ) )
BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ)) generates:
'T' ^ ( 0x27 - 0 ), 'e' ^ ( 0x27 - 1 ), 's' ^ ( 0x27 - 2 ), 't' ^ ( 0x27 - 3 )
and finally,
DEFINE_HIDDEN_STRING(EncryptionKey2, 0x27, ('T')('e')('s')('t'))
generates:
static const char* GetEncryptionKey2()
{
static char data[] = {
'T' ^ ( 0x27 - 0 ), 'e' ^ ( 0x27 - 1 ), 's' ^ ( 0x27 - 2 ), 't' ^ ( 0x27 - 3 ),
'\0'
};
static bool isEncrypted = true;
if ( isEncrypted )
{
for (unsigned i = 0; i < ( sizeof(data) / sizeof(data[0]) ) - 1; ++i)
{
data[i] = ( data[i] ^ ( 0x27 - i ) );
}
isEncrypted = false;
}
return data;
}
The data for "My strong encryption key" in the compiled binary looks like this:
0x00B0200C 32 07 5d 0f 0f 08 16 16 10 56 10 1a 10 00 08 2.]......V.....
0x00B0201B 00 1b 07 02 02 4b 01 0c 11 00 00 00 00 00 00 .....K.........
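For readers without Boost, a similar effect can probably be had with C++14 `constexpr`. The following is only a sketch under that assumption: `HiddenString`, `hide`, `maskAt`, and `kEncryptionKey` are hypothetical names that are not part of HideString.h, and whether the plain literal really stays out of the binary depends on the compiler and its settings, so the output should be checked with a hex viewer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same idea as CRYPT_MACRO: XOR each character with (seed - index),
// here reduced modulo 256 so that long strings cannot overflow a byte.
constexpr unsigned char maskAt(unsigned char seed, std::size_t i)
{
    return static_cast<unsigned char>(seed - i);
}

// Hypothetical class: holds the literal only in masked form.
template <std::size_t N>
class HiddenString {
public:
    constexpr HiddenString(const char (&text)[N], unsigned char seed)
        : seed_(seed), masked_{}
    {
        // Runs at compile time when the object is declared constexpr.
        for (std::size_t i = 0; i < N; ++i) {
            masked_[i] = static_cast<unsigned char>(
                static_cast<unsigned char>(text[i]) ^ maskAt(seed, i));
        }
    }

    // Unmask at run time; the terminating '\0' is dropped.
    std::string reveal() const
    {
        std::string plain;
        for (std::size_t i = 0; i + 1 < N; ++i) {
            plain.push_back(static_cast<char>(masked_[i] ^ maskAt(seed_, i)));
        }
        return plain;
    }

    // Expose a stored (masked) byte, useful for checking what ends up in the binary.
    unsigned char maskedByte(std::size_t i) const
    {
        assert(i < N);
        return masked_[i];
    }

private:
    unsigned char seed_;
    unsigned char masked_[N];
};

// Helper so the array size is deduced from the literal.
template <std::size_t N>
constexpr HiddenString<N> hide(const char (&text)[N], unsigned char seed)
{
    return HiddenString<N>(text, seed);
}

// constexpr forces the masking to happen during compilation.
constexpr auto kEncryptionKey = hide("My strong encryption key", 0x7f);
```

Calling `kEncryptionKey.reveal()` returns the original text as a `std::string`. Unlike the macro version, the unmasked copy lives only in that returned object rather than in a static buffer that stays decrypted for the rest of the run.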
Thank you very much for your answers!
Solution 2:
As noted in the comment to pavium's answer, you have two choices:
- Secure the key
- Secure the decryption algorithm
Unfortunately, if you must resort to embedding both the key and the algorithm within the code, neither is truly secret, so you're left with the (far weaker) alternative of security through obscurity. In other words, as you mentioned, you need a clever way to hide either or both of them inside your executable.
Here are some options, though you need to remember that none of these is truly secure according to any cryptographic best practices, and each has its drawbacks:
- Disguise your key as a string that would normally appear within the code. One example would be the format string of a printf() statement, which tends to have numbers, letters, and punctuation.
- Hash some or all of the code or data segments on startup, and use that as the key. (You'll need to be a bit clever about this to ensure the key doesn't change unexpectedly!) This has a potentially desirable side-effect of verifying the hashed portion of your code each time it runs.
- Generate the key at run-time from something that is unique to (and constant within) the system, for example, by hashing the MAC address of a network adapter.
- Create the key by choosing bytes from other data. If you have static or global data, regardless of type (int, char, etc.), take a byte from somewhere within each variable after it's initialized (to a non-zero value, of course) and before it changes.
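As a rough sketch of the hashing idea, the code below stretches an ordinary-looking string (a printf-style format string, a MAC address, or any other constant material) into key bytes. The function names `fnv1a` and `deriveKey` are made up for this example, and FNV-1a is used only because it is short; it is not a cryptographic hash, so this is obscurity and nothing more.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// 64-bit FNV-1a over the bytes of data, starting from state h.
// The default h is the standard FNV offset basis.
std::uint64_t fnv1a(const std::string& data,
                    std::uint64_t h = 0xcbf29ce484222325ULL)
{
    for (unsigned char c : data) {
        h ^= c;
        h *= 0x100000001b3ULL;  // FNV prime
    }
    return h;
}

// Hypothetical helper: derive keyLen bytes from some innocuous material.
std::vector<unsigned char> deriveKey(const std::string& material, std::size_t keyLen)
{
    assert(!material.empty());
    std::vector<unsigned char> key;
    key.reserve(keyLen);
    std::uint64_t state = fnv1a(material);
    while (key.size() < keyLen) {
        // Emit the eight bytes of the current state, low byte first.
        for (int shift = 0; shift < 64 && key.size() < keyLen; shift += 8) {
            key.push_back(static_cast<unsigned char>(state >> shift));
        }
        // Re-hash the material, seeded with the previous state, for more bytes.
        state = fnv1a(material, state);
    }
    return key;
}
```

A call such as `deriveKey("Processed %d of %d records (%s)\n", 16)` yields the same 16 bytes on every run, while the only thing visible in the binary is a string that looks like part of a log message.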
Please let us know how you solve the problem!
Edit: You commented that you're refactoring existing code, so I'll assume you can't necessarily choose the key yourself. In that case, follow a 2-step process: Use one of the above methods to encrypt the key itself, then use that key to decrypt the users' data.
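The 2-step process could look roughly like this. It is a sketch with an assumed helper name (`xorWithMask`), and plain XOR stands in for whatever wrapping you actually choose: the fixed Blowfish key is wrapped once, offline, with a mask obtained by one of the methods above, and only the wrapped bytes are embedded in the program.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: XOR data with a repeating mask.
// XOR is its own inverse, so the same call both wraps and unwraps.
std::vector<unsigned char> xorWithMask(const std::vector<unsigned char>& data,
                                       const std::vector<unsigned char>& mask)
{
    assert(!mask.empty());
    std::vector<unsigned char> out = data;
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] ^= mask[i % mask.size()];
    }
    return out;
}
```

Offline, compute `wrapped = xorWithMask(realKey, mask)` and store `wrapped` in the source. At run time, rebuild `mask`, call `xorWithMask(wrapped, mask)` to recover the real key, pass it to the existing Blowfish code, and overwrite the recovered copy as soon as it is no longer needed.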
Solution 3:
- Post it as a code golf problem
- Wait for a solution written in J
- Embed a J interpreter in your app