How to pass base64 encoded image to Tensorflow prediction?

Solution 1:

The TensorFlow model does not have to be trained on base64 data. Leave your training graph as is. However, when exporting the model, you'll need to export a graph that accepts PNG or JPEG (or possibly raw, if it's small) data. You'll also need to give the input a name that ends in _bytes. This signals to CloudML Engine that you will be sending base64-encoded data. Putting it all together would look something like this:

import tensorflow as tf
from tensorflow.contrib.saved_model.python.saved_model import utils

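# Shape of [None] means we can have a batch of images.
image = tf.placeholder(shape=[None], dtype=tf.string)
# Decode the images. decode_jpeg takes a single scalar string, so map it
# over the batch; map_fn needs every decoded image to have the same shape.
decoded = tf.map_fn(lambda x: tf.image.decode_jpeg(x, channels=3), image, dtype=tf.uint8)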
# Do the rest of the processing.
scores = build_model(decoded)

# The input name needs to have "_bytes" suffix.
inputs = { 'image_bytes': image }
outputs = { 'scores': scores }
utils.simple_save(session, export_dir, inputs, outputs)

The request you send will look something like this:

{
    "instances": [{
        "b64": "x0welkja..."
    }]
}
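
A client still has to produce that request body. As a minimal sketch using only the Python standard library, the snippet below base64-encodes raw image bytes and wraps them in the "b64" structure shown above. The helper name build_predict_request and the placeholder bytes are made up for illustration; in practice you would read the bytes from a real JPEG file.

```python
import base64
import json


def build_predict_request(image_bytes_list):
    """Return a JSON request body with one {"b64": ...} instance per image."""
    instances = []
    for raw in image_bytes_list:
        # b64encode returns bytes; decode to str so it can be JSON-serialized.
        encoded = base64.b64encode(raw).decode("ascii")
        instances.append({"b64": encoded})
    return json.dumps({"instances": instances})


# Placeholder bytes standing in for the contents of a real JPEG file,
# e.g. open("image.jpg", "rb").read().
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"
request_body = build_predict_request([fake_jpeg])
```

The service decodes the base64 payload before feeding it to the graph, so the decode_jpeg op receives the original bytes.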

Solution 2:

If you just want an efficient way to send images to a model (and not necessarily base64-encode them), I would suggest uploading your image(s) to Google Cloud Storage and then having your model read them from GCS. This way, you are not limited by image size, and you can take advantage of the multi-part, multithreaded, resumable uploads etc. that the GCS API provides.

TensorFlow's tf.read_file will read directly off GCS. Here's an example of a serving input_fn that does this. Your request to CMLE would send it an image URL (gs://bucket/some/path/to/image.jpg):

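import tensorflow as tf

# HEIGHT, WIDTH and NUM_CHANNELS are constants that must match what the model expects.
def read_and_preprocess(filename, augment=False):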
    # decode the image file starting from the filename
    # end up with pixel values that are in the -1, 1 range
    image_contents = tf.read_file(filename)
    image = tf.image.decode_jpeg(image_contents, channels=NUM_CHANNELS)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32) # 0-1
    image = tf.expand_dims(image, 0) # resize_bilinear needs batches
    image = tf.image.resize_bilinear(image, [HEIGHT, WIDTH], align_corners=False)
    #image = tf.image.per_image_standardization(image)  # useful if mean not important
    image = tf.subtract(image, 0.5)
    image = tf.multiply(image, 2.0) # -1 to 1
    return image

def serving_input_fn():
    inputs = {'imageurl': tf.placeholder(tf.string, shape=())}
    filename = tf.squeeze(inputs['imageurl']) # make it a scalar
    image = read_and_preprocess(filename)
    # make the outer dimension unknown (and not 1)
    image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])

    features = {'image' : image}
    return tf.estimator.export.ServingInputReceiver(features, inputs)
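
For this serving function, the request carries the GCS path rather than the image bytes. A minimal sketch, assuming the input key imageurl from serving_input_fn above; the helper name build_gcs_predict_request is hypothetical, and the path is the example one from this answer:

```python
import json


def build_gcs_predict_request(image_url):
    """Return a JSON request body that points the model at one GCS image."""
    # The check is a client-side sanity test added for this sketch only.
    if not image_url.startswith("gs://"):
        raise ValueError("expected a gs:// URL, got: %s" % image_url)
    return json.dumps({"instances": [{"imageurl": image_url}]})


request_body = build_gcs_predict_request("gs://bucket/some/path/to/image.jpg")
```

The sketch sends a single instance because the placeholder in serving_input_fn is a scalar.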

Your training code will train off actual images, just as in rhaertel80's suggestion above. See https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/deepdive/08_image/flowersmodel/trainer/task.py#L27 for what the training/evaluation input functions would look like.