HikariCP: What database level timeouts should be considered to set maxLifetime for Oracle 11g
In the documentation for HikariCP, it is mentioned that
We strongly recommend setting this value, and it should be at least 30 seconds less than any database-level connection timeout.
What are those database-level connection timeouts that should be taken into account for Oracle11.2 database? And how could I find those timeouts (queries to execute)?
Solution 1:
Short answer: none (by default).
For the record (to include details here in case the link changes), we're talking about property maxLifetime
of HikariCP:
This property controls the maximum lifetime of a connection in the pool. An in-use connection will never be retired, only when it is closed will it then be removed. We strongly recommend setting this value, and it should be at least 30 seconds less than any database or infrastructure imposed connection time limit. A value of 0 indicates no maximum lifetime (infinite lifetime), subject of course to the idleTimeout setting. Default: 1800000 (30 minutes)
In my experience, it's a good thing that HikariCP does that. As far as I can tell by default Oracle does not enforce a max lifetime for connections (neither on JDBC driver side (1), nor on server side(2)). So in this respect, the "infrastructure-imposed connection time limit" is +infinity -- and that was a problem for us, as we did observe issues with long-lived connections. It also means any value is "at least 30 seconds less", including the default :)
I suppose the connection layer does not do anything about this because it counts on the pool layer above to handle such things. It was not possible with (now deprecated) implicit connection pool, and I don't know if UCP (the replacement) does that, but if you use HikariCP you don't use those.
Now, after 30 minutes (usually after many reuses for various purposes) for a given connection, HikariCP closes it and creates a fresh one. That has a very minor cost, and fixed our issues with long-lived connections. We're happy with that default, but still make it configurable just in case (see 2 below).
(1) OracleDataSource
does not offer any configuration point (property or system property) to control that, and I observed infinite lifetime.
(2) For server-side limits, see profile parameter IDLE_TIME
. Quoting
this answer:
Oracle by default will not close a connection due to inactivity. You can configure a profile with an IDLE_TIME to cause Oracle to close inactive connections.
To verify what is the value of IDLE_TIME
for your user, combining answers from this Q&A:
select p.limit
from dba_profiles p, dba_users u
where p.resource_name = 'IDLE_TIME' and p.profile = u.profile and u.username = '...'
;
Default value is UNLIMITED
.
Please note there can be other limits enforced elsewhere (firewall... yes I've been bitten by that, although most DB systems have a keep-alive mechanism) that might interfere. So you'd better make it configurable, in case such issues are discovered when you deploy your product.
On Linux, you can verify max lifetime of physical connections by monitoring TCP sockets connected to your database. I've been running script below on my server (from the DB point of view that's the client host), it takes 1 argument, the ip:port
of your oracle node, as it appears in output of netstat -tan
(or a pattern if you have several nodes).
#!/bin/bash
target="$1"
dir=$(mktemp -d)
while sleep 10
do
echo "------------ "$(date)
now=$(date +%s)
netstat -tan | grep " $target " | awk '{print $4}' | cut -f2 -d: | while read port
do
file="p_$port"
[ ! -e $file ] && touch $file
ftime=$(stat -c %Z "$file")
echo -e "$port :\t "$(( now - ftime))
done
done
\rm "$dir"/p_*
\rmdir "$dir"
If you run that and stop it with ctrl-c during sleep
time, it should exit the loop and clean up the temp directory, but this is not 100% foolproof
In the results none of the ports should show a value that exceeds 1800 seconds (ie 30 minutes), give or take a minute. See example output below, first sample shows 2 sockets above 1800s, they're gone 10s later.
------------ Thu Jul 6 16:09:00 CEST 2017
49806 : 1197
49701 : 1569
49772 : 1348
49782 : 1317
49897 : 835
49731 : 1448
49620 : 1830
49700 : 1569
49986 : 523
49722 : 1498
49715 : 1509
49711 : 1539
49629 : 1820
49732 : 1448
50026 : 332
49849 : 1036
49858 : 1016
------------ Thu Jul 6 16:09:10 CEST 2017
49806 : 1207
49701 : 1579
49772 : 1358
49782 : 1327
49897 : 845
49731 : 1458
49700 : 1579
49986 : 533
49722 : 1508
49715 : 1519
49711 : 1549
49732 : 1458
50026 : 342
49849 : 1046
49858 : 1026
You'll need to run the script for more than 30 minutes to see that, because it does not know age of sockets that existed before