Too much data for RSA block fail. What is PKCS#7?

If you use a block cipher, you input must be an exact multiple of the block bit length.

In order to encipher arbitrary length data, you need first to pad you data to a multiple of the block length. This can be done with any method, but there are a number of standards. PKCS7 is one which is quite common, you can see an overview on the wikipedia article on padding.

Since block cipers operate on blocks, you also need to come up with a way of concatenating the encrypted blocks. This is very important, since naive techniques greatly reduce the strength of the encryption. There is also a wikipedia article on this.

What you did was to try to encrypt (or decrypt) data of a length which didn't match the block length of the cipher, and you also explicitly asked for no padding and also no chaining mode of operation.

Consequently the block cipher could not be applied to your data, and you got the reported exception.

UPDATE:

As a response to your update and GregS's remark, I would like to acknowledge that GregS was right (I did not know this about RSA), and elaborate a bit:

RSA does not operate on bits, it operates on integer numbers. In order to use RSA you therefore need to convert your string message into an integer m: 0 < m < n, where n is the modulus of the two distinct primes chosen in the generation process. The size of a key in the RSA algorithm typically refers to n. More details on this can be found on the wikipedia article on RSA.

The process of converting a string message to an integer, without loss (for instance truncating initial zeroes), the PKCS#1 standard is usually followed. This process also adds some other information for message integrity (a hash digest), semantical security (an IV) ed cetera. With this extra data, the maximum number of bytes which can be supplied to the RSA/None/PKCS1Padding is (keylength - 11). I do not know how PKCS#1 maps the input data to the output integer range, but my impression is that it can take any length input less than or equal to keylength - 11 and produce a valid integer for the RSA encryption.

If you use no padding, your input will simply be interpreted as a number. Your example input, {0xbe, 0xef} will most probably be interpreted as {10111110 +o 11101111} = 1011111011101111_2 = 48879_10 = beef_16 (sic!). Since 0 < beef_16 < d46f473a2d746537de2056ae3092c451_16, your encryption will succeed. It should succeed with any number less than d46f473a2d746537de2056ae3092c451_16.

This is mentioned in the bouncycastle FAQ. They also state the following:

The RSA implementation that ships with Bouncy Castle only allows the encrypting of a single block of data. The RSA algorithm is not suited to streaming data and should not be used that way. In a situation like this you should encrypt the data using a randomly generated key and a symmetric cipher, after that you should encrypt the randomly generated key using RSA, and then send the encrypted data and the encrypted random key to the other end where they can reverse the process (ie. decrypt the random key using their RSA private key and then decrypt the data).


RSA is a one-shot asymmetric encryption with constraints. It encrypts a single "message" in one go, but the message has to fit within rather tight limits based on the public key size. For a typical 1024 bit RSA key, the maximum input message length (with RSA as described in the PKCS#1 standard) is 117 bytes, no more. Also, with such a key, the encrypted message has length 128 bytes, regardless of the input message length. As a generic encryption mechanism, RSA is very inefficient, and wasteful of network bandwidth.

Symmetric encryption systems (e.g. AES or 3DES) are much more efficient, and they come with "chaining modes" which allow them to process input messages of arbitrary length. But they do not have the "asymmetric" property of RSA: with RSA, you can make the encryption key public without revealing the decryption key. That's the whole point of RSA. With symmetric encryption, whoever has the power to encrypt a message also has all the needed information to decrypt messages, hence you cannot make the encryption key public because it would make the decryption key public as well.

Thus it is customary to use an hybrid system, in which a (big) message is symmetrically encrypted (with, e.g., AES), using a symmetric key (which is an arbitrary short sequence of random bytes), and have that key encrypted with RSA. The receiving party then uses RSA decryption to recover that symmetric key, and then uses it to decrypt the message itself.

Beyond the rather simplistic description above, cryptographic systems, in particular hybrid systems, are clock full of little details which, if not taken care of, may make your application extremely weak against attackers. So it is best to use a protocol with an implementation which already handles all that hard work. PKCS#7 is such a protocol. Nowadays, it is standardized under the name of CMS. It is used in several places, e.g. at the heart of S/MIME (a standard for encrypting and signing emails). Another well-known protocol, meant for encrypting network traffic, is SSL (now standardized under the name of TLS, and often used in combination with HTTP as the famous "HTTPS" protocol).

Java contains an implementation of SSL (see javax.net.ssl). Java does not contain a CMS implementation (at least not in its API) but Bouncy Castle has some code for CMS.