How to handle HTTP requests in a Microservice / Event Driven Architecture?

Background:

I am building an application, and the proposed architecture is Event/Message Driven on top of a microservice architecture.

The monolithic way of doing things is that a User/HTTP request actions some commands that have a direct synchronous response. Thus, responding to the same User/HTTP request is 'hassle free'.


The problem:

The user sends an HTTP request to the UI Service (there are multiple UI Services), which fires some events onto a queue (Kafka/RabbitMQ/any). N services pick up that Event/Message, do some magic along the way, and then at some point that same UI Service should pick up a response and hand it back to the user who originated the HTTP request. Request processing is ASYNC, but the User/HTTP REQUEST->RESPONSE is SYNC, as per your typical HTTP interaction.

Question: How do I send a response to the same UI Service that originated the action (the service that's interacting with the user over HTTP) in this agnostic/event-driven world?

My research so far: I've been looking around, and it seems that some people are solving this problem with WebSockets.

But that adds a layer of complexity: there needs to be a table that maps (RequestId->WebSocket(Client-Server)), which is used to 'discover' which node in the gateway holds the WebSocket connection for a particular response. And even though I understand the problem and its complexity, I'm stuck because I can't find any articles on how to solve it at the implementation layer. On top of that, it still isn't a viable option because of 3rd-party integrations such as payment providers (WorldPay) that expect REQUEST->RESPONSE, especially on the 3DS validation.

So I am somewhat reluctant to think that WebSockets are an option. Even if WebSockets are OK for web-facing apps, for an API that connects to external systems they are not a great architecture.

Update:

Even if long polling is a possible solution for a web-service API, with a 202 Accepted, a Location header, and a Retry-After header, it wouldn't be performant for a high-concurrency & high-availability website. Imagine a huge number of people trying to get the transaction status update on EVERY request they make, while you have to invalidate the CDN cache (go and play with that problem now! ha).

But most important and relevant to my case, I have 3rd-party APIs such as payment systems, where the 3DS systems have automatic redirects handled by the payment provider's system and expect a typical REQUEST/RESPONSE flow; thus neither this model nor the sockets model would work for me.

Because of this use-case, the HTTP REQUEST/RESPONSE should be handled in the typical fashion, where I have a dumb client that expects the complexity of the processing to be handled on the back-end.

So I am looking for a solution where externally I have a typical Request->Response (SYNC) and the complexity of the status (the ASYNCrony of the system) is handled internally.

An example of the long polling follows, but this model wouldn't work for a 3rd-party API such as a payments provider with 3DS redirects that are not within my control:

POST /user
    Payload: {userdata}
    RETURNS:
        HTTP/1.1 202 Accepted
        Content-Type: application/json; charset=utf-8
        Date: Tue, 27 Nov 2018 17:25:55 GMT
        Location: https://mydomain/user/transaction/status/:transaction_id
        Retry-After: 10

GET https://mydomain/user/transaction/status/:transaction_id
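
For completeness, the status resource could then answer differently while the transaction is in flight and once it has finished (a sketch; the response bodies and the 303 redirect target are assumptions):

GET /user/transaction/status/:transaction_id
    RETURNS (while processing):
        HTTP/1.1 200 OK
        Retry-After: 10
        {"status": "pending"}
    RETURNS (when finished):
        HTTP/1.1 303 See Other
        Location: https://mydomain/user/:user_id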



Solution 1:

As I was expecting, people try to fit everything into a concept even if it does not fit. This is not criticism; it is an observation based on my experience and on reading your question and the other answers.

Yes, you are right that microservices architecture is based on asynchronous messaging patterns. However, when we talk about the UI, there are 2 possible cases in my mind:

  1. UI needs a response immediately (e.g. read operations, or commands for which the user expects an answer right away). These don't have to be asynchronous. Why would you add the overhead of messaging and asynchrony if the response is required on the screen right away? It does not make sense. Microservice architecture is supposed to solve problems, not create new ones by adding overhead.

  2. UI can be restructured to tolerate delayed responses (e.g. instead of waiting for the result, the UI can just submit the command, receive an acknowledgement, and let the user do something else while the response is being prepared). In this case you can introduce asynchrony. The gateway service (with which the UI interacts directly) can orchestrate the asynchronous processing (waiting for completion events and so on), and when ready, communicate back to the UI. I have seen UIs using SignalR in such cases, where the gateway service was an API that accepted socket connections. If the browser does not support sockets, it should ideally fall back to polling. Anyway, the important point is that this only works with one contingency: the UI can tolerate delayed answers (a minimal sketch follows below).

If microservices are indeed relevant to your situation (case 2), then structure the UI flow accordingly, and there should not be a challenge with microservices on the back-end. In that case, your question comes down to applying event-driven architecture to the set of services (the edge being the gateway microservice which connects the event-driven and UI interactions). This problem (event-driven services) is solvable, and you know that. You just need to decide if you can rethink how your UI works.
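
To make case 2 concrete, here is a minimal sketch of a gateway that acknowledges the command immediately and pushes the delayed answer later. It assumes a Node.js gateway using Express and socket.io; enqueueCommand and onCommandComplete are hypothetical hooks into your message bus:

const express = require('express');
const crypto = require('crypto');

const app = express();
const httpServer = require('http').createServer(app);
const io = require('socket.io')(httpServer);

// Hypothetical hook: publish the command to Kafka/RabbitMQ/any here.
const enqueueCommand = (commandId, payload) => {};

app.use(express.json());

app.post('/commands', (req, res) => {
    const commandId = crypto.randomUUID();
    enqueueCommand(commandId, req.body);
    // Acknowledge right away; the result arrives later over the socket.
    res.status(202).json({ commandId });
});

// Hypothetical consumer wiring: call this when the gateway consumes a
// completion event from the bus.
function onCommandComplete(event) {
    // Clients subscribe to a channel named after their commandId.
    io.emit('command:' + event.commandId, event.result);
}

httpServer.listen(3000);

The polling fallback mentioned above would replace the socket emit with a status endpoint that the client polls using the returned commandId.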

Solution 2:

From a more general perspective: on receiving the request, you can register a subscriber on the queue in the current request's context (i.e., while the request object is in scope) which receives an acknowledgement from the responsible services as they finish their jobs (like a state machine maintaining the progress of the total number of operations). When the terminating state is reached, it returns the response and removes the listener. I think this will work with any pub/sub-style message queue. Here is an overly simplified demo of what I am suggesting:

// a stub for any message queue using the pub/sub pattern
const uuid = require('uuid/v4')

let Q = {
  pub: (event, data) => {},
  sub: (event, handler) => {}
}
// typical Express request handler
let controller = async (req, res) => {
  // initiate the saga
  let sagaId = uuid()
  Q.pub("saga:register-user", {
    username: req.body.username,
    password: req.body.password,
    promoCode: req.body.promoCode,
    sagaId: sagaId
  })
  // wait for the user to be added; check the sagaId so we only
  // resolve for acknowledgements that belong to this request
  let p1 = new Promise((resolve, reject) => {
    Q.sub("user-added", ack => {
      if (ack.sagaId === sagaId) resolve(ack)
    })
  })
  // wait for the promo code to be applied
  let p2 = new Promise((resolve, reject) => {
    Q.sub("promo-applied", ack => {
      if (ack.sagaId === sagaId) resolve(ack)
    })
  })

  // wait for both promises to finish successfully
  try {

    var sagaComplete = await Promise.all([p1, p2])
    // respond with some transformation of the data
    res.json({success: true, data: sagaComplete})

  } catch (e) {
    console.error('saga failed due to reasons')
    // roll back asynchronously
    Q.pub('rollback:user-added', {sagaId: sagaId})
    Q.pub('rollback:promo-applied', {sagaId: sagaId})
    // respond with an appropriate status
    res.status(500).json({message: 'could not complete saga. Rolling back side effects'})
  }

}

As you can probably tell, this looks like a general pattern which can be abstracted away into a framework to reduce code duplication and manage cross-cutting concerns. This is essentially what the saga pattern is about. The client will only wait for as long as it takes to finish the required operations (which is what would happen even if everything were synchronous), plus the added latency of inter-service communication. Make sure you do not block the thread if you are using an event-loop-based system like Node.js or Python's Tornado.

Simply using a WebSocket-based push mechanism doesn't necessarily improve the efficiency or performance of your system. However, it is recommended that you push messages to the client over a socket connection, because it makes your architecture more general (even your clients behave the way your services do), more consistent, and allows for better separation of concerns. It will also let you scale the push service independently, without worrying about business logic. The saga pattern can be expanded to enable rollbacks in case of partial failures or timeouts, and it makes your system more manageable.
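
To illustrate the timeout point, the demo above could wrap each pending acknowledgement in a race against a timer (a sketch; withTimeout is a hypothetical helper, not part of any framework):

// Reject a pending saga step if no acknowledgement arrives in time.
const withTimeout = (promise, ms, label) =>
  Promise.race([
    promise,
    new Promise((resolve, reject) =>
      setTimeout(() => reject(new Error(label + ' timed out after ' + ms + 'ms')), ms)
    )
  ])

// Usage inside the controller above:
// var sagaComplete = await Promise.all([
//   withTimeout(p1, 5000, 'user-added'),
//   withTimeout(p2, 5000, 'promo-applied')
// ])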

Solution 3:

Below is a very bare-bones example of how you could implement the UI Service so that it works with a normal HTTP Request/Response flow. It uses the Node.js events.EventEmitter class to "route" the responses to the right HTTP handler.

Outline of the implementation:

  1. Connect a producer and a consumer client to Kafka.

    1. The producer is used to send the request data to the internal micro-services.
    2. The consumer is used to listen for data from the micro-services indicating that the request has been processed; I assume those Kafka messages also contain the data that should be returned to the HTTP client.
  2. Create a global event dispatcher from the EventEmitter class.

  3. Register an HTTP request handler that
    1. creates a UUID for the request and includes it in the payload pushed to Kafka, and
    2. registers an event listener with our event dispatcher, using the UUID as the event name it listens for.
  4. Start consuming the Kafka topic, retrieve the UUID that the HTTP request handler is waiting for, and emit an event for it. In the example code I am not including any payload in the emitted event, but you would typically want to include some data from the Kafka message as an argument so the HTTP handler can return it to the HTTP client.

Note that I tried to keep the code as small as possible, leaving out error and timeout handling, etc.!

Also note that kafkaProduceTopic and kafkaConsumeTopic are the same topic, to simplify testing: there is no need for another service/function to produce to the UI Service's consume topic.

The code assumes the kafka-node and uuid packages have been installed via npm, and that Kafka is accessible on localhost:9092.

const http = require('http');
const EventEmitter = require('events');
const kafka = require('kafka-node');
const uuidv4 = require('uuid/v4');

const kafkaProduceTopic = "req-res-topic";
const kafkaConsumeTopic = "req-res-topic";

class ResponseEventEmitter extends EventEmitter {}

const responseEventEmitter = new ResponseEventEmitter();

var HighLevelProducer = kafka.HighLevelProducer,
    producerClient = new kafka.Client(),
    producer = new HighLevelProducer(producerClient);

var HighLevelConsumer = kafka.HighLevelConsumer,
    consumerClient = new kafka.Client(),
    consumer = new HighLevelConsumer(
        consumerClient,
        [
            { topic: kafkaConsumeTopic }
        ],
        {
            groupId: 'my-group'
        }
    );

var s = http.createServer(function (req, res) {
    // Generate a random UUID to be used as the request id that
    // is used to correlate request/response pairs.
    // The internal micro-services need to include this id in
    // the "final" message that is pushed to Kafka and consumed
    // by the UI service.
    var id = uuidv4();

    // Send the request data to the internal back-end through Kafka.
    // In real code the Kafka message would be a JSON/protobuf/...
    // message, but it needs to include the UUID generated by this
    // function.
    var payloads = [
        { topic: kafkaProduceTopic, messages: id },
    ];
    producer.send(payloads, function (err, data) {
        if (err != null) {
            console.log("Error: ", err);
            return;
        }
    });

    responseEventEmitter.once(id, () => {
        console.log("Got the response event for ", id);
        res.write("Order " + id + " has been processed\n");
        res.end();
    })
});

s.timeout = 10000;
s.listen(8080); 

// Listen to the Kafka topic that streams messages
// indicating that the request has been processed and
// emit an event to the request handler so it can finish.
// In this example the consumed Kafka message is simply
// the UUID of the request that has been processed (which
// is also the event name that the response handler is
// listening to).
//
// In real code the Kafka message would be a JSON/protobuf/... message
// which needs to contain the UUID the request handler generated.
// This Kafka consumer would then have to deserialize the incoming
// message and get the UUID from it. 
consumer.on('message', function (message) {
    responseEventEmitter.emit(message.value);
});
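
If the back-end services publish JSON instead of a bare UUID, the consumer handler might look like this (a sketch; the requestId and result field names are assumptions, and the responseEventEmitter.once callback above would then receive the result as its argument):

consumer.on('message', function (message) {
    // Deserialize the Kafka message, then route it to the waiting
    // HTTP handler using the correlation UUID as the event name and
    // the rest of the payload as the event argument.
    var payload = JSON.parse(message.value);
    responseEventEmitter.emit(payload.requestId, payload.result);
});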

Solution 4:

Question: How do I send a response to the same UI Service that originated the action (the service that's interacting with the user over HTTP) in this agnostic/event-driven world?

So I am looking for a solution where externally I have a typical Request->Response (SYNC) and the complexity of the status (the ASYNCrony of the system) is handled internally.

I think the important thing to note is the following.

In short:

  • The edges of your system can be synchronous while it internally handles things in an asynchronous, event-based manner. See: illustration, and backing.

  • The UI Service (Web Server or API Gateway) can just shoot off an async/await function (assuming it is written in Node.js), and go on to process other requests. Then, when the UI Service receives the 'OrderConfirmed' event (by listening on a Kafka log, for instance) from another microservice on the back-end, it picks up the user request again (callback) and sends the designated response to the client.

  • Try Reactive Interaction Gateway (RIG), which would handle it for you.

In long:

  • The client can be served synchronously whilst the back-end handles its internal events asynchronously. The client doesn't know, or need to know, but the client has to wait, so beware of long processing times: the client's HTTP request can time out (typically 30 sec to 2 min for HTTP requests; if the UI Service runs in a Cloud Function it could also time out, by default after 1 min, extendable to 9 min). But the timeout would also be an issue if the request had been synchronous end-to-end. The async event-driven architecture doesn't change this; it just elevates the concern in one's mind.

  • Since the client sends a synchronous request, it is blocking (the user has to wait in the UI). However, that doesn't mean that the UI Service (Web Server or API Gateway) has to block. It can just shoot off an async/await function (assuming it is written in Node.js) and go on to process other requests. Then, when the UI Service receives the 'OrderConfirmed' event, it picks up the function again (callback) and sends the designated response to the client (see the sketch after this list).

  • Any (micro-)service at the edge of your back-end that interacts with a 3rd-party system can do so synchronously, with a typical HTTP request/response flow (although in general you'd want async/await here too, to free up your own microservice's resources while the 3rd party is processing). When it synchronously receives the response from the 3rd party, it can then send an asynchronous event to the rest of the back-end (e.g. a 'StockResupplied' event), which can propagate back to the client if the UI Service was designed to wait for such events before giving its response.
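
A minimal sketch of that 'pick up the request again' mechanic, assuming an Express app and a Kafka consumer running in the same process (pendingOrders, placeOrder, and the 'OrderConfirmed' event shape are all assumptions):

const express = require('express');
const app = express();
app.use(express.json());

// correlation id -> resolver for the HTTP handler that is awaiting it
const pendingOrders = new Map();

// Hypothetical hook: publish the order to Kafka and return a correlation id.
const placeOrder = (order) => { /* producer.send(...) */ return 'some-uuid'; };

app.post('/order', async (req, res) => {
    const orderId = placeOrder(req.body);
    try {
        // This blocks the *request*, not the server: the event loop keeps
        // serving other requests while we await the confirmation event.
        const confirmed = await new Promise((resolve, reject) => {
            pendingOrders.set(orderId, resolve);
            setTimeout(() => reject(new Error('timed out')), 30000);
        });
        res.json(confirmed);
    } catch (e) {
        res.status(504).json({ message: 'order confirmation timed out' });
    } finally {
        pendingOrders.delete(orderId);
    }
});

// Hypothetical Kafka consumer wiring: call this for each 'OrderConfirmed'
// event consumed from the log.
function onOrderConfirmed(event) {
    const resolve = pendingOrders.get(event.orderId);
    if (resolve) resolve(event);
}

app.listen(3000);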

Source of inspiration: Brad Irby's answer to a related question.

The alternative to the above is to:

Design the client's UI in such a way that the user doesn't have to block/wait (perhaps by being given an optimistic success response, in line with 'Optimistic UI' principles), and then later asynchronously receives a Server-Sent Event (SSE).

  • "Server-Sent Events (SSE) is a technology that enables a browser (client) to receive automatic updates like text-based event data from a server via HTTP connection."

  • SSE "uses just one long-lived HTTP connection" in the background.

You might also want to know that:

  • “It's good to know that SSEs suffer from a limitation on the maximum number of open connections, which can be especially painful when opening various tabs, as the limit per browser is six.”

  • “Compared to SSEs, WebSockets are a lot more complex and task-demanding to set up.”

Source: https://www.telerik.com/blogs/websockets-vs-server-sent-events
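
For reference, a bare-bones SSE endpoint in Node.js could look like this (a sketch; the payload shape and the 5-second interval are just for illustration):

const http = require('http');

http.createServer(function (req, res) {
    // These three headers are what make the response an SSE stream.
    res.writeHead(200, {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive'
    });
    // In a real system you would push an update whenever the back-end
    // reports progress; here we emit a status message every 5 seconds.
    const timer = setInterval(function () {
        res.write('data: ' + JSON.stringify({ status: 'processing' }) + '\n\n');
    }, 5000);
    req.on('close', function () { clearInterval(timer); });
}).listen(8081);

On the browser side, new EventSource('http://localhost:8081') subscribes to the stream and receives each data: line as a message event.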

Furthermore:

Using the Web Push API with a Service Worker is another alternative to SSE, but it is more complicated, gives the user that annoying permission popup, and is designed more for sending notifications outside of the browser and/or when the web app is closed.

You might also want to have a look at:

Reactive Interaction Gateway (RIG), for some nice architectural diagrams and a readme that illustrates the main point of this answer. RIG is a scalable, free, open-source API gateway for microservices that handles SSE, WebSockets, long polling, etc. RIG "subscribes to Kafka topics, while holding connections to all active frontends, forwarding events to the users they're addressed to, all in a scalable way. And on top of that, it also handles authorization, so your services don't have to care about that either."

Solution 5:

Unfortunately, I believe you'll likely have to use either long polling or WebSockets to accomplish something like this. You need to "push" something to the user, or keep the HTTP request open until something comes back.

To handle getting the data back to the actual user, you could use something like socket.io. When a user connects, socket.io creates an id; you map the userId to the id socket.io gives you. Once each request has a userId attached to it, you can emit the result back to the correct client. The flow would be something like this (a sketch of the mapping follows the list):

  1. The web client requests an order (POST with data and userId).

  2. The ui service places the order on the queue (this order should carry the userId).

  3. x number of services work on the order (passing the userId along each time).

  4. The ui service consumes from the topic. At some point, data appears on the topic; the data it consumes has the userId, and the ui service looks up the map to figure out which socket to emit to.
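
A minimal sketch of that userId -> socket mapping, assuming socket.io (the 'identify' and 'order-result' event names and the payload shapes are assumptions):

const io = require('socket.io')(3000);

// userId -> socket.io socket id
const socketsByUserId = new Map();

io.on('connection', function (socket) {
    // The client identifies itself right after connecting.
    socket.on('identify', function (userId) {
        socketsByUserId.set(userId, socket.id);
    });
    socket.on('disconnect', function () {
        for (const [userId, socketId] of socketsByUserId) {
            if (socketId === socket.id) socketsByUserId.delete(userId);
        }
    });
});

// Called when the ui service consumes a processed order from the topic.
function onOrderProcessed(order) {
    const socketId = socketsByUserId.get(order.userId);
    if (socketId) io.to(socketId).emit('order-result', order);
}

Note that with multiple UI Service nodes the map has to live in shared storage (e.g. Redis), which is exactly the RequestId->WebSocket lookup table the question describes.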

Whatever code is running on your UI would also need to be event-driven, so it can deal with a push of data without the context of the original request. You could use something like Redux for this. Essentially, you'd have the server creating Redux actions on the client; it works pretty well!

Hope this helps.