HBase Kerberos connection renewal strategy
Recently I enabled Kerberos in my cluster. Everything works great until my Kerberos login expires, at, say, 12 hours. At that point any connections I have created, and any tables created from those connections, will throw when I use them. This could potentially crash my app, depending on how I handle it.
I don't mind the crash much, because my app is managed by Slider, which will resurrect it if and when it goes down. However, the crash will only happen when HBase is "used" (i.e. I call a method on a table with a now-stale connection), which will probably be triggered by a user interaction, and that would lead to poor UX.
I don't want authentication implementation details to pervade my application, and I also don't want to create connection objects more often than necessary, because creating one is a costly operation that makes a large number of RPC calls (the ZooKeeper metadata lookup, to start with).
Is there a common strategy (preferably built into the HBase client) for managing Kerberos authentication expiry and renewing HBase connections/tables when that happens?
Solution 1:
A Kerberos TGT has a lifetime (e.g. 12h) and a renewable lifetime (e.g. 7 days). As long as the ticket is still valid and still renewable, you can request a "free" renewal (no password required), and the lifetime counter is reset (e.g. 12h to go, again).
The Hadoop authentication library spawns a specific Java thread for automatic renewal of the current TGT. It's kind of ugly, using a kinit -R command line instead of a JAAS library call, but it works; see HADOOP-6656.
So, if you get Slider to create a renewable ticket on startup, and if you can bribe your SysAdmin to raise the default (cf. client conf) and the max (cf. KDC conf) renewable lifetime to, say, 30 days, then your app could run for 30 days straight with the initial TGT. A nice improvement.
~~~~~~~~~~
If you really crave eternity... sorry, but you will actually have some programming to do. That means a dedicated thread/process in charge of automagically re-creating the TGT.
- The Java Way: on startup, before you connect to HBase/HDFS/whatever, explicitly create a UGI with loginUserFromKeytab(), then run checkTGTAndReloginFromKeytab() from time to time
- The Shell Way: start a shell that (a) creates a TGT with kinit, (b) spawns a sub-process that periodically fires kinit again, (c) launches your Java app, then kills the sub-process when/if your app ever terminates
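To make the "from time to time" part of The Java Way a bit more concrete, here is a minimal sketch that uses only the JDK, so it can be read and run on its own. TgtRenewer and ReloginAction are hypothetical names, not part of Hadoop. The two UGI calls appear only in the comment, and the one-hour period shown there is an arbitrary assumption that should stay well below the ticket lifetime.

```java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper: runs a relogin action on a fixed schedule from a daemon thread.
 *
 * Assumed wiring in a real app (needs hadoop-common on the classpath):
 *
 *   UserGroupInformation.loginUserFromKeytab(principal, keytabPath);
 *   TgtRenewer renewer = new TgtRenewer(
 *       () -> UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab(),
 *       1, TimeUnit.HOURS);
 */
final class TgtRenewer implements AutoCloseable {

    /** The relogin step; declared to throw IOException like the UGI calls do. */
    @FunctionalInterface
    interface ReloginAction {
        void run() throws IOException;
    }

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "tgt-renewer");
                t.setDaemon(true); // never keep the JVM alive just for renewal
                return t;
            });

    private final AtomicInteger failures = new AtomicInteger();

    TgtRenewer(ReloginAction action, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                action.run();
            } catch (IOException | RuntimeException e) {
                // Count and carry on: an exception escaping this task
                // would cancel every later run of the schedule.
                failures.incrementAndGet();
            }
        }, period, period, unit);
    }

    /** Number of relogin attempts that threw so far. */
    int failureCount() {
        return failures.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Keeping the renewal in one small class like this is one way to stop authentication details from leaking into the rest of the application: the code that talks to HBase never sees it.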
Caveat: if some other thread happens to open, or re-open, a connection while the TGT is being re-created, that connection may fail because the cache was empty at the exact time it was accessed ("race condition"). The next attempt will be successful, but expect a few rogue warnings in your logs.
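If those rogue failures matter to you, one option is to retry the call that hit the empty cache. The following is only a sketch under that assumption: Retry and IoCall are hypothetical names, the helper retries on any IOException (which is broader than the race described above), and the HBase client already has its own retry settings that may make this unnecessary.

```java
import java.io.IOException;

/** Hypothetical helper: re-runs a call that failed with an IOException. */
final class Retry {

    /** A call that may fail with IOException, like most HBase/HDFS client calls. */
    @FunctionalInterface
    interface IoCall<T> {
        T call() throws IOException;
    }

    /**
     * Runs the call up to maxAttempts times and returns the first successful result.
     * If every attempt fails, the last IOException is rethrown.
     */
    static <T> T withRetry(IoCall<T> call, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e; // possibly the empty-cache window; try again
            }
        }
        throw last;
    }
}
```

There is deliberately no pause between attempts here; a short back-off would be a reasonable addition if the TGT re-creation takes noticeable time in your environment.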
~~~~~~~~~~
Final advice: you can use a private ticket cache for your app (i.e. you can run multiple apps on the same node with the same Linux account but different Kerberos principals) by setting the KRB5CCNAME environment variable, as long as it's a "FILE:" cache.
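If the app is started from another Java process rather than from a shell, the variable can be set on the child's environment. A minimal sketch, assuming a hypothetical launcher class and a cache path of your choosing; it only builds the ProcessBuilder and does not start anything.

```java
import java.util.List;

/** Hypothetical launcher: prepares a child process with its own Kerberos ticket cache. */
final class PrivateCacheLauncher {

    /**
     * Builds (but does not start) a process whose KRB5CCNAME points at a
     * private FILE: cache, so it does not share tickets with other apps
     * running under the same Linux account.
     */
    static ProcessBuilder withPrivateCache(String cacheFile, List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("KRB5CCNAME", "FILE:" + cacheFile);
        pb.inheritIO(); // let the child write to this process's stdout/stderr
        return pb;
    }
}
```

Calling start() on the returned builder launches the command; any kinit run with the same KRB5CCNAME value would then fill the cache that the app reads.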
Solution 2:
Since this is an older question, it would be nice to know the versions of HBase, Hadoop, etc.
Nowadays, Kerberos ticket renewal should just work in HBase.
See configuration steps -
https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_sg_hbase_authentication.html#concept_zyz_vg5_nt
See this HBase client example, which is configured to use TGT renewal -
https://github.com/apache/hbase/blob/064f5f1394faa8e84ad64488345e3bf46629ce59/hbase-examples/src/main/java/org/apache/hadoop/hbase/util/ClientUtils.java#L66
By the way, renewTGT=true is the default; this is actually part of the Hadoop Common codebase, see here -
https://github.com/naver/hadoop/blob/master/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/client/KerberosAuthenticator.java#L132