How to add/remove PKCS7 padding from an AES encrypted string?

Let's see. PKCS #7 is described in RFC 5652 (Cryptographic Message Syntax).

The padding scheme itself is given in section 6.3. Content-encryption Process. It essentially says: append that many bytes as needed to fill the given block size (but at least one), and each of them should have the padding length as value.

Thus, looking at the last decrypted byte we know how many bytes to strip off. (One could also check that they all have the same value.)

I could now give you a pair of PHP functions to do this, but my PHP is a bit rusty. So either do this yourself (then feel free to edit my answer to add it in), or have a look at the user-contributed notes to the mcrypt documentation - quite some of them are about padding and provide an implementation of PKCS #7 padding.


So, let's look on the first note there in detail:

<?php

function encrypt($str, $key)
 {
     $block = mcrypt_get_block_size('des', 'ecb');

This gets the block size of the used algorithm. In your case, you would use aes or rijndael_128 instead of des, I suppose (I didn't test it). (Instead, you could simply take 16 here for AES, instead of invoking the function.)

     $pad = $block - (strlen($str) % $block);

This calculates the padding size. strlen($str) is the length of your data (in bytes), % $block gives the remainder modulo $block, i.e. the number of data bytes in the last block. $block - ... thus gives the number of bytes needed to fill this last block (this is now a number between 1 and $block, inclusive).

     $str .= str_repeat(chr($pad), $pad);

str_repeat produces a string consisting of a repetition of the same string, here a repetition of the character given by $pad, $pad times, i.e. a string of length $pad, filled with $pad. $str .= ... appends this padding string to the original data.

     return mcrypt_encrypt(MCRYPT_DES, $key, $str, MCRYPT_MODE_ECB);

Here is the encryption itself. Use MCRYPT_RIJNDAEL_128 instead of MCRYPT_DES.

 }

Now the other direction:

 function decrypt($str, $key)
 {   
     $str = mcrypt_decrypt(MCRYPT_DES, $key, $str, MCRYPT_MODE_ECB);

The decryption. (You would of course change the algorithm, as above). $str is now the decrypted string, including the padding.

     $block = mcrypt_get_block_size('des', 'ecb');

This is again the block size. (See above.)

     $pad = ord($str[($len = strlen($str)) - 1]);

This looks a bit strange. Better write it in multiple steps:

    $len = strlen($str);
    $pad = ord($str[$len-1]);

$len is now the length of the padded string, and $str[$len - 1] is the last character of this string. ord converts this to a number. Thus $pad is the number which we previously used as the fill value for the padding, and this is the padding length.

     return substr($str, 0, strlen($str) - $pad);

So now we cut off the last $pad bytes from the string. (Instead of strlen($str) we could also write $len here: substr($str, 0, $len - $pad).).

 }

?>

Note that instead of using substr($str, $len - $pad), one can also write substr($str, -$pad), as the substr function in PHP has a special-handling for negative operands/arguments, to count from the end of the string. (I don't know if this is more or less efficient than getting the length first and and calculating the index manually.)

As said before and noted in the comment by rossum, instead of simply stripping off the padding like done here, you should check that it is correct - i.e. look at substr($str, $len - $pad), and check that all its bytes are chr($pad). This serves as a slight check against corruption (although this check is more effective if you use a chaining mode instead of ECB, and is not a replacement for a real MAC).


(And still, tell your client they should think about changing to a more secure mode than ECB.)


I've created two methods to perform the padding and unpadding. The functions are documented using phpdoc and require PHP 5. As you will notice the unpad function contains a lot of exception handling, generating not less than 4 different messages for each possible error.

To get to the block size for PHP mcrypt, you can use mcrypt_get_block_size, which also defines the block size to be in bytes instead of bits.

/**
 * Right-pads the data string with 1 to n bytes according to PKCS#7,
 * where n is the block size.
 * The size of the result is x times n, where x is at least 1.
 * 
 * The version of PKCS#7 padding used is the one defined in RFC 5652 chapter 6.3.
 * This padding is identical to PKCS#5 padding for 8 byte block ciphers such as DES.
 *
 * @param string $plaintext the plaintext encoded as a string containing bytes
 * @param integer $blocksize the block size of the cipher in bytes
 * @return string the padded plaintext
 */
function pkcs7pad($plaintext, $blocksize)
{
    $padsize = $blocksize - (strlen($plaintext) % $blocksize);
    return $plaintext . str_repeat(chr($padsize), $padsize);
}

/**
 * Validates and unpads the padded plaintext according to PKCS#7.
 * The resulting plaintext will be 1 to n bytes smaller depending on the amount of padding,
 * where n is the block size.
 *
 * The user is required to make sure that plaintext and padding oracles do not apply,
 * for instance by providing integrity and authenticity to the IV and ciphertext using a HMAC.
 *
 * Note that errors during uppadding may occur if the integrity of the ciphertext
 * is not validated or if the key is incorrect. A wrong key, IV or ciphertext may all
 * lead to errors within this method.
 *
 * The version of PKCS#7 padding used is the one defined in RFC 5652 chapter 6.3.
 * This padding is identical to PKCS#5 padding for 8 byte block ciphers such as DES.
 *
 * @param string padded the padded plaintext encoded as a string containing bytes
 * @param integer $blocksize the block size of the cipher in bytes
 * @return string the unpadded plaintext
 * @throws Exception if the unpadding failed
 */
function pkcs7unpad($padded, $blocksize)
{
    $l = strlen($padded);

    if ($l % $blocksize != 0) 
    {
        throw new Exception("Padded plaintext cannot be divided by the block size");
    }

    $padsize = ord($padded[$l - 1]);

    if ($padsize === 0)
    {
        throw new Exception("Zero padding found instead of PKCS#7 padding");
    }    

    if ($padsize > $blocksize)
    {
        throw new Exception("Incorrect amount of PKCS#7 padding for blocksize");
    }

    // check the correctness of the padding bytes by counting the occurance
    $padding = substr($padded, -1 * $padsize);
    if (substr_count($padding, chr($padsize)) != $padsize)
    {
        throw new Exception("Invalid PKCS#7 padding encountered");
    }

    return substr($padded, 0, $l - $padsize);
}

This does not invalidate the answer of Paŭlo Ebermann in any way, it's basically the same answer in code & phpdoc instead of as description.


Note that returning a padding error to an attacker might result in a padding oracle attack which completely breaks CBC (when CBC is used instead of ECB or a secure authenticated cipher).