Use backups if unavailable (not just down)

Using haproxy, I want:

A pool of 'main' servers and 'backup' servers, though they don't necessarily have to be in separate pools.
Each backend server has a low 'maxconn' (in this case 1).
Clients should not wait in a queue. If there are no immediately available servers in the 'main' pool they should be shunted to the 'backup' pool without delay.

Right now I have one backend in which the 'main' servers have an absurdly high weight, and it 'works'.

An acl with use_backend and connslots is along the right lines, but without the patch in my own answer it isn't perfect.

Bonus points for not requiring a modified haproxy binary.

Solution 1:

The correct way is to add an ACL in the frontend that checks the number of connections on that frontend and routes based on the result.

The config below checks the number of connections on the "monitor_conns" frontend: if there are 500 or more, new connections are sent to the "backups" backend; otherwise they go to the "regular" backend.

Here's an untested example:

frontend monitor_conns
  bind *:80
  acl too_many_conns fe_conn ge 500
  use_backend backups if too_many_conns
  default_backend regular

backend backups
  ... your config
  server backupsrv 192.168.0.101:80 check port 80 maxconn 1000 inter 1s rise 1 fall 1

backend regular
  ... your config
  server regularsrv 192.168.0.100:80 check port 80 maxconn 500 inter 1s rise 1 fall 1

It's just an example, but it should give you an idea of how to proceed.

Solution 2:

This is an old question, but I faced the same problem, and here is my solution.

You can check the frontend connection count with an ACL and, when it is too high, switch to a backend that contains extra servers, namely your backup servers.

The config looks like this:

frontend frontend1
    bind 127.0.0.1:9200
    mode tcp
    acl max_conn_reached fe_conn gt 15
    acl production_almost_dead nbsrv(prod1) lt 2

    default_backend prod1
    use_backend prod1_and_prod2 if max_conn_reached OR production_almost_dead

backend prod1
    mode tcp
    balance leastconn
    server se_prod1 127.0.0.1:8001 check maxconn 10
    server se_prod2 127.0.0.1:8002 check maxconn 10

backend prod1_and_prod2
    mode tcp
    balance leastconn

    server se_prod1 127.0.0.1:8001 check maxconn 10
    server se_prod2 127.0.0.1:8002 check maxconn 10

    server se_backup1 127.0.0.1:8003 check maxconn 10
    server se_backup2 127.0.0.1:8004 check maxconn 10

The frontend uses the backup servers (together with the production servers) if the number of connections on the frontend is greater than 15, or if fewer than two servers in prod1 are up.
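
As a rough model of the two ACLs in the config above (this is not haproxy code), the routing decision reduces to the function below. The names pick_backend, BE_PROD1 and BE_PROD1_AND_PROD2 are made up for illustration, and the sketch assumes the values of fe_conn and nbsrv(prod1) are simply passed in as integers.

```c
#include <assert.h>

/* Hypothetical labels for the two backends in the config above. */
enum backend_choice { BE_PROD1, BE_PROD1_AND_PROD2 };

/* fe_conn:  connections currently established on the frontend.
 * prod1_up: number of prod1 servers that are up, i.e. nbsrv(prod1). */
enum backend_choice pick_backend(int fe_conn, int prod1_up)
{
    assert(fe_conn >= 0 && prod1_up >= 0);

    int max_conn_reached = fe_conn > 15;         /* acl: fe_conn gt 15 */
    int production_almost_dead = prod1_up < 2;   /* acl: nbsrv(prod1) lt 2 */

    if (max_conn_reached || production_almost_dead)
        return BE_PROD1_AND_PROD2;   /* the use_backend rule matched */
    return BE_PROD1;                 /* fall through to default_backend */
}
```

Note that "gt" is strict: with exactly 15 frontend connections and both prod1 servers up, traffic still goes to prod1 only.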

Solution 3:

The following seems to work for me, but it required patching haproxy-1.4.15/src/backend.c:

# diff haproxy-1.4.15/src/backend.c backend.c
1298a1299,1333
> /* set test->i to the number of free connection slots on the proxy's running servers */
> static int
> acl_fetch_connfree(struct proxy *px, struct session *l4, void *l7, int dir,
>                     struct acl_expr *expr, struct acl_test *test)
> {
>         struct server *iterator;
>         test->flags = ACL_TEST_F_VOL_TEST;
>         if (expr->arg_len) {
>                 /* another proxy was designated, we must look for it */
>                 for (px = proxy; px; px = px->next)
>                         if ((px->cap & PR_CAP_BE) && !strcmp(px->id, expr->arg.str))
>                                 break;
>         }
>         if (!px)
>                 return 0;
>
>         test->i = 0;
>         iterator = px->srv;
>         while (iterator) {
>                 if ((iterator->state & SRV_RUNNING) == 0) {
>                         iterator = iterator->next;
>                         continue;
>                 }
>                 if (iterator->maxconn == 0) {
>                         test->i = -1;
>                         return 1;
>                 }
>
>                 test->i += (iterator->maxconn - (iterator->cur_sess + iterator->nbpend));
>                 iterator = iterator->next;
>         }
>
>         return 1;
> }
>
1461a1497
>       { "connfree", acl_parse_int,   acl_fetch_connfree, acl_match_int, ACL_USE_NOTHING },

I can then use connfree in my acl:

frontend frontend1
    bind *:12345
    acl main_full connfree(main) eq 0
    use_backend backup if main_full
    default_backend     main

backend main
    balance leastconn
    default-server maxconn 1 maxqueue 1
    server main2 10.0.0.1:12345 check
    server main1 10.0.0.2:12345 check

backend backup
    balance leastconn
    default-server maxconn 1 maxqueue 1
    server backup1 10.0.1.1:12345 check
    server backup2 10.0.1.2:12345 check

Hopefully comparing acl_fetch_connfree() to acl_fetch_connslots() will make the change obvious:

old = (maxconn - current conns) + (maxqueue - pending conns)

new = maxconn - (current conns + pending conns)
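
To make the difference concrete, here is a minimal standalone sketch of both formulas in plain C, outside haproxy. The struct srv_sample and the two function names are hypothetical stand-ins; only the field names (maxconn, maxqueue, cur_sess, nbpend) follow the patch. The maxqueue == 0 shortcut in the "old" function is an assumption that mirrors the maxconn == 0 shortcut in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for haproxy's struct server. */
struct srv_sample {
    int running;   /* non-zero when the server is considered up */
    int maxconn;   /* per-server connection limit (0 = no limit) */
    int maxqueue;  /* per-server queue limit (0 = no limit) */
    int cur_sess;  /* connections currently being served */
    int nbpend;    /* connections waiting in the server's queue */
};

/* old = (maxconn - current conns) + (maxqueue - pending conns) */
int connslots_sketch(const struct srv_sample *srv, size_t n)
{
    int total = 0;
    assert(srv != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!srv[i].running)
            continue;                       /* skip servers that are down */
        if (srv[i].maxconn == 0 || srv[i].maxqueue == 0)
            return -1;                      /* assumed: unlimited, no count */
        total += (srv[i].maxconn - srv[i].cur_sess)
               + (srv[i].maxqueue - srv[i].nbpend);
    }
    return total;
}

/* new = maxconn - (current conns + pending conns) */
int connfree_sketch(const struct srv_sample *srv, size_t n)
{
    int total = 0;
    assert(srv != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!srv[i].running)
            continue;                       /* skip servers that are down */
        if (srv[i].maxconn == 0)
            return -1;                      /* unlimited, as in the patch */
        total += srv[i].maxconn - (srv[i].cur_sess + srv[i].nbpend);
    }
    return total;
}
```

For two running servers with maxconn 1 and maxqueue 1, each serving one connection with an empty queue, connslots_sketch returns 2 (the two free queue slots) while connfree_sketch returns 0. That is why a test such as "eq 0" only detects a full 'main' pool with the patched fetch.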
