Date posted: 09-Feb-2017
Category: Software
Uploaded by: josh-elser
1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Practical Kerberos with Apache HBase
Josh Elser
HBaseCon East
2016/09/26
Engineer at Hortonworks, Member of the Apache Software Foundation
Top-Level Projects
• Apache Accumulo®
• Apache Calcite™
• Apache Commons™
• Apache HBase®
• Apache Phoenix™
ASF Incubator
• Apache Fluo™
• Apache Gossip™
• Apache Pirk™
• Apache Rya™
• Apache Slider™
These names are trademarks or registered trademarks of the Apache Software Foundation.
… but today we’re talking about Kerberos!
- “The Madness beyond the Gate” [1]
- An exploration in black magic and voodoo
- The word most accompanied with expletives
1: https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/kerberos_the_madness.html
What this talk won’t be...
3dom via https://www.flickr.com/photos/steve_l/6042206137/in/album-72157629289333057/, CC-BY-NC
Introduction to Kerberos
⬢ “Kerberos is a network authentication protocol. It is designed to provide strong authentication for client/server applications by using secret-key cryptography” [1]
⬢ MIT Kerberos is one implementation
  – Heimdal is another
  – We’re talking about MIT Kerberos
⬢ Authentication over a computer network
  – Not authorization
  – No data privacy
1: http://web.mit.edu/kerberos/
Introduction to Kerberos
⬢ Key Distribution Center (KDC)
  – Centralized server which grants Kerberos “tickets”
  – The “trusted third party” of the security model
⬢ Users are defined by a “principal”
  – primary[/instance]@REALM
  – A human: [email protected]
  – A service: hbase/[email protected]
  – [email protected] is distinct from elserj/[email protected]
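The primary[/instance]@REALM layout above can be illustrated with a small parser. This is a minimal sketch, not how Kerberos libraries do it: the class name KerberosPrincipalParts is hypothetical, escaping rules for special characters are ignored, and the realm EXAMPLE.COM in the usage note is a placeholder. The JDK also ships javax.security.auth.kerberos.KerberosPrincipal for real use.

```java
import java.util.Objects;

/** Minimal sketch: splits a principal of the form primary[/instance]@REALM. */
final class KerberosPrincipalParts {
    final String primary;
    final String instance; // null when the principal has no instance component
    final String realm;

    private KerberosPrincipalParts(String primary, String instance, String realm) {
        this.primary = primary;
        this.instance = instance;
        this.realm = realm;
    }

    /** Parses "primary[/instance]@REALM"; escaped '/' or '@' characters are not handled. */
    static KerberosPrincipalParts parse(String principal) {
        Objects.requireNonNull(principal, "principal");
        int at = principal.lastIndexOf('@');
        if (at <= 0 || at == principal.length() - 1) {
            throw new IllegalArgumentException("Expected primary[/instance]@REALM: " + principal);
        }
        String name = principal.substring(0, at);
        String realm = principal.substring(at + 1);
        int slash = name.indexOf('/');
        if (slash < 0) {
            // No instance, e.g. a human user's principal.
            return new KerberosPrincipalParts(name, null, realm);
        }
        if (slash == 0 || slash == name.length() - 1) {
            throw new IllegalArgumentException("Empty primary or instance: " + principal);
        }
        return new KerberosPrincipalParts(
            name.substring(0, slash), name.substring(slash + 1), realm);
    }

    /** True when an instance component is present (services, or users such as elserj/admin). */
    boolean hasInstance() {
        return instance != null;
    }
}
```

For example, parsing a placeholder service principal such as hbase/regionserver1.hbase.hwx.com@EXAMPLE.COM yields primary "hbase", instance "regionserver1.hbase.hwx.com", and realm "EXAMPLE.COM".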
Introduction to Kerberos
⬢ Principals are identified by a secret shared with the KDC
  – A normal password
  – A keytab file (non-plaintext “password”, suitable for non-interactive logins)
⬢ Kerberos Ticket obtained from the KDC by using your secret
  – Tickets expire
  – Tickets are renewable*
[Diagram: Client, Server, KDC. Labels: Password/Keytab (Client), Keytab (Server), Authenticated RPC.]
Interacting with Kerberos
⬢ kadmin (or kadmin.local)
  – Command-line interface for administrators to create, modify, and delete principals
⬢ kinit
  – A command-line tool to obtain a ticket for a principal
  – Places the ticket in a file on disk in a well-known location called a “ticket cache”
    • Default location on Linux: /tmp/krb5cc_$(id -u `whoami`)
  – The ticket cache is readable and writable by the owning user only (e.g. chmod 600)
  – Can obtain a ticket for any principal using a password or keytab
  – Ticket caches can hold multiple tickets
⬢ klist
  – Lists the contents of the current user’s ticket cache
  – Can list the keys in a keytab file
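To make the “well-known location” concrete, here is a minimal sketch of how a program might work out which ticket cache file kinit wrote. The class name TicketCacheLocator is hypothetical. The sketch assumes a file-based cache: it honors the MIT Kerberos KRB5CCNAME environment variable (only the plain-path and FILE: forms) and otherwise falls back to the Linux default from the slide. The numeric uid is passed in because Java has no portable API for it.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

/** Minimal sketch: resolves the file-based ticket cache location for a user. */
final class TicketCacheLocator {
    private static final String FILE_PREFIX = "FILE:";

    /**
     * Returns the cache named by KRB5CCNAME when it is set, otherwise /tmp/krb5cc_<uid>.
     * Other cache types (for example DIR: or KEYRING:) are not handled by this sketch.
     */
    static Path locate(Map<String, String> env, long uid) {
        String ccname = env.get("KRB5CCNAME");
        if (ccname != null && !ccname.isEmpty()) {
            if (ccname.startsWith(FILE_PREFIX)) {
                ccname = ccname.substring(FILE_PREFIX.length());
            }
            return Paths.get(ccname);
        }
        return Paths.get("/tmp/krb5cc_" + uid);
    }
}
```

A caller would typically pass System.getenv() as the environment map; running klist remains the simplest way to confirm what the cache actually contains.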
Benefits of Kerberos
⬢ Building a secure, network-based authentication system is very hard
⬢ Functions on non-trusted networks
  – Security for multi-tenant systems; protects against malicious and non-malicious users
⬢ Leveraged across the Apache Hadoop “Stack”
⬢ Widely integrated externally
  – Operating systems and programming languages
⬢ Can integrate with Active Directory
Apache Hadoop is a registered trademark of the Apache Software Foundation
Promises
It’s simple, you just get your Kerberos ticket, use HBase and it knows who you are!
[elserj@localhost] $ kinit elserj
Password for [email protected]:
[elserj@localhost] $ hbase com.hortonworks.hbase.MyMapReduceJob /user/elserj/my-big-data.txt
…
Success!
[elserj@localhost] $
Reality
[elserj@localhost] $ kinit elserj
Password for [email protected]:
[elserj@localhost] $ hbase com.hortonworks.hbase.MyMapReduceJob /big-data.txt
...
2016-09-26 14:03:11,549 FATAL [main] ipc.AbstractRpcClient (RpcClientImpl.java:run(709)) - SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
[elserj@localhost] $
( °□°╯ )╯︵┻━┻
OK, let’s figure out what went wrong.
What should I search for?
RPC
SASL
GSSAPI
JGSS
UGI
JAAS
KDC
JCE
Token
Ticket
Voldemort
“Bars near me open now”
Cthulhu
Kerberos
How JVM-based applications can obtain Kerberos tickets
⬢ Extract a ticket from the local ticket cache for a principal
  – hbase shell or hdfs dfs -ls /
⬢ UserGroupInformation Hadoop API (UGI)
  – UserGroupInformation.loginUserFromKeytab(String, String)
  – UserGroupInformation.loginUserFromKeytabAndReturnUGI(String, String)
⬢ javax.security.auth.Subject with Krb5LoginModule
  – The APIs which UserGroupInformation uses under the covers
⬢ Automatic login via JAAS
  – “Java Authentication and Authorization Service”, a Java implementation of the Pluggable Authentication Module (PAM) framework (RFC 86.0)
  – Configuration file, specified via Java system properties
  – Each “block” uses an identifier to denote login details for a specific system
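A JAAS “block” can also be expressed in code rather than in a configuration file, which shows what such a block holds. The sketch below builds one entry for the JDK's Krb5LoginModule with keytab-login options. The class name KeytabJaasConfiguration is hypothetical, as are any principal and keytab path passed to it; the option names (useKeyTab, keyTab, principal, storeKey, doNotPrompt) are the ones documented for Krb5LoginModule.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

/** Minimal sketch: an in-memory JAAS configuration with a single keytab-login block. */
final class KeytabJaasConfiguration extends Configuration {
    private final String entryName;
    private final AppConfigurationEntry[] entries;

    KeytabJaasConfiguration(String entryName, String principal, String keytabPath) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");    // read the secret from a keytab file
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        options.put("storeKey", "true");     // keep the key in the Subject's private credentials
        options.put("doNotPrompt", "true");  // never fall back to an interactive password prompt
        this.entryName = entryName;
        this.entries = new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    /** Returns the login entries for the named block, or null for any other name. */
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        return entryName.equals(name) ? entries.clone() : null;
    }
}
```

An instance could be handed to javax.security.auth.login.LoginContext (the four-argument constructor accepts a Configuration) and login() called on it; that step needs a reachable KDC and a real keytab, so it is left out of the sketch.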
HBase Service Logins
⬢ HBase services are daemons; they always use a keytab to log in
⬢ Principal and keytab are specified in hbase-site.xml for each service
⬢ A JAAS configuration file is also provided for Apache ZooKeeper client authentication
  – Necessary for authenticated ZooKeeper access (HBase-only ACLs)
⬢ HBase services automatically perform logins/renewals as necessary
  – Anyone who tells you that they need to “kinit for HBase to work” doesn’t know what they’re talking about.
Apache ZooKeeper is a trademark of the Apache Software Foundation
HBase Clients
⬢ HBase clients will use a variety of mechanisms for authentication
  – Interactive use: ticket cache
  – Automated tasks/daemons: UGI with keytab
⬢ Reminder: Kerberos tickets expire
  – Clients must implement renewal logic
  – UGI provides an API to do this
⬢ Typically, UGI is the way to go
  – Concise and well-understood
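The “renewal logic” a client needs boils down to deciding when, within a ticket's lifetime, to log in again. The following is a minimal sketch of that decision only. The class name TicketRenewalPlanner is hypothetical, and the 80% threshold is an assumption chosen for illustration. In a real HBase client, UGI's own relogin support (for example UserGroupInformation.checkTGTAndReloginFromKeytab()) would normally do this work.

```java
import java.time.Duration;
import java.time.Instant;

/** Minimal sketch: decides when a ticket is old enough that it should be renewed. */
final class TicketRenewalPlanner {
    /** Assumed fraction of the ticket lifetime, in percent, after which to renew. */
    static final int RENEW_PERCENT = 80;

    /** Returns the instant at which RENEW_PERCENT of the ticket lifetime has elapsed. */
    static Instant refreshTime(Instant ticketStart, Instant ticketEnd) {
        if (!ticketEnd.isAfter(ticketStart)) {
            throw new IllegalArgumentException("Ticket end must be after ticket start");
        }
        long lifetimeMillis = Duration.between(ticketStart, ticketEnd).toMillis();
        return ticketStart.plusMillis(lifetimeMillis * RENEW_PERCENT / 100);
    }

    /** True once the current time has reached the refresh time. */
    static boolean shouldRenew(Instant now, Instant ticketStart, Instant ticketEnd) {
        return !now.isBefore(refreshTime(ticketStart, ticketEnd));
    }
}
```

With a ticket valid for ten hours, this sketch asks for renewal eight hours after the ticket's start time.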
On using UserGroupInformation correctly
⬢ We mentioned two different method calls earlier for logins
  – void loginUserFromKeytab(String, String)
  – UserGroupInformation loginUserFromKeytabAndReturnUGI(String, String)
⬢ loginUserFromKeytab is “global”
  – Syntactic sugar to make your life easier
  – Works great when the application only acts as one user
⬢ loginUserFromKeytabAndReturnUGI is “localized”
  – Requires invoking “doAs(...)”
  – Allows for concurrent execution as different users in one JVM
Enter SASL: authentication framework over a transport
⬢ SASL is a framework for building RPC systems with authentication
⬢ “Simple Authentication and Security Layer”, RFC 4422
  – “A framework for authentication and data security in Internet protocols” [1]
  – “decouples authentication mechanisms from application protocols” [1]
    • Generic Security Services Application Program Interface (GSSAPI): speaks Kerberos
    • DIGEST-MD5: an HTTP Digest authentication-like method (delegation tokens)
  – Data security, aka Quality of Protection (QoP)
    • auth: Authentication only (default)
    • auth-int: Previous, and integrity check of message content
    • auth-conf: Previous, and encryption of message content
[1] https://en.wikipedia.org/wiki/Simple_Authentication_and_Security_Layer
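The three QoP levels map directly onto a property in the JDK's SASL API. The sketch below shows how a chosen level could be turned into the properties map that a SASL client or server is created with. The enum name QualityOfProtection and its constant names are hypothetical; Sasl.QOP and the values auth, auth-int, and auth-conf come from javax.security.sasl.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

/** Minimal sketch: the three SASL quality-of-protection levels and their wire values. */
enum QualityOfProtection {
    AUTHENTICATION("auth"),    // authentication only
    INTEGRITY("auth-int"),     // authentication plus integrity checking
    PRIVACY("auth-conf");      // authentication, integrity, and encryption

    private final String saslValue;

    QualityOfProtection(String saslValue) {
        this.saslValue = saslValue;
    }

    String saslValue() {
        return saslValue;
    }

    /** Builds the properties map that carries this QoP to a SASL client or server. */
    Map<String, String> toSaslProperties() {
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, saslValue);
        return props;
    }
}
```

The resulting map is the kind of value passed as the props argument of Sasl.createSaslClient when requesting the GSSAPI mechanism; the negotiation itself needs Kerberos credentials and is not shown.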
Trust on an untrusted network
⬢ A Kerberos ticket implies a valid identity, not necessarily the identity you wanted
⬢ Kerberos relies on accurate/consistent DNS as the basis for a secure RPC model
  – Secure your DNS as much as your KDC
⬢ Recall the service principal from earlier
  – hbase/[email protected]
⬢ The instance must be a fully-qualified domain name
⬢ Clients need to know the primary, and the instance must match DNS
  – “Caused by: KrbException: Identifier doesn't match expected value (906)”
  – “error Message is Server not found in Kerberos database”
Trust on an untrusted network
[Diagram: a Client sends an RPC to “service” at svc1.hwx.com. Good DNS leads to the Trusted Service; bad DNS leads to a Rogue Service. Principals shown: service/[email protected]/[email protected]]
Without enforcement of DNS naming via SASL, a client could be maliciously sent to a rogue service without the client realizing it happened.
Harping on DNS
⬢ DNS must be correct, consistent, and secure
⬢ Hostnames are advertised for discovery
  – Also benefits multi-homed networks
⬢ Forward and reverse DNS mappings must be accurate on every node
  – `nslookup regionserver1.hbase.hwx.com` returns 10.0.0.1
  – `nslookup 10.0.0.1` returns regionserver1.hbase.hwx.com
⬢ Check /etc/resolv.conf for quick troubleshooting
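The forward/reverse check can be automated. Below is a minimal sketch that takes the two lookups as functions, so the comparison logic can be exercised without a live resolver, and supplies implementations backed by java.net.InetAddress. The class and method names are hypothetical; the hostname and address in the usage note are the ones from the slide.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.function.Function;

/** Minimal sketch: verifies that forward and reverse DNS agree for a hostname. */
final class DnsConsistencyCheck {
    /** True when fqdn resolves to an address whose reverse lookup yields fqdn again. */
    static boolean isConsistent(String fqdn,
                                Function<String, String> forward,
                                Function<String, String> reverse) {
        String address = forward.apply(fqdn);
        if (address == null) {
            return false; // forward lookup failed
        }
        String name = reverse.apply(address);
        return name != null
            && stripTrailingDot(name).equalsIgnoreCase(stripTrailingDot(fqdn));
    }

    private static String stripTrailingDot(String host) {
        return host.endsWith(".") ? host.substring(0, host.length() - 1) : host;
    }

    /** Forward lookup through the system resolver; null when the name is unknown. */
    static String systemForward(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    /** Reverse lookup through the system resolver; null when no name is found. */
    static String systemReverse(String address) {
        try {
            String name = InetAddress.getByName(address).getCanonicalHostName();
            // getCanonicalHostName() falls back to the textual address on failure.
            return name.equals(address) ? null : name;
        } catch (UnknownHostException e) {
            return null;
        }
    }
}
```

On a cluster node, isConsistent("regionserver1.hbase.hwx.com", DnsConsistencyCheck::systemForward, DnsConsistencyCheck::systemReverse) would be expected to return true if both mappings for 10.0.0.1 are in place. Note that the JVM caches lookups, so results may lag behind recent DNS changes.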
Recap: Kerberos authentication for HBase RPCs
⬢ Client and Server both obtain Kerberos ticket
  – Password or Keytab via UGI/JAAS/Ticket-Cache
  – Tickets must be renewed before they expire
⬢ SASL is the framework which HBase leverages for authenticated RPCs
  – GSSAPI as the SASL mechanism which can “speak” Kerberos
  – QoP defines the security of the RPC data (minimum of authentication)
⬢ Fully-qualified hostnames everywhere
  – Forward and reverse DNS must be consistent across all clients and servers
The edge cases
⬢ Exceptions to how authentication works
  – YARN jobs
  – HBase REST and Thrift services
⬢ Not the traditional client/server model Kerberos was designed to fit
  – 100s to 1000s of tasks concurrently requiring a ticket
  – Talk to HBase as a user without having that user’s credentials
⬢ Two approaches introduced to address these problems
Delegation Tokens
⬢ As mentioned earlier, SASL supports a variety of mechanisms
  – DIGEST-MD5 allows a digest-token style authentication scheme
⬢ A delegation token is a temporary “password” which can authenticate a user
  – Slight compromise of security for performance
⬢ Bypasses authentication against the KDC; authentication is instead handled by HDFS or HBase
⬢ Automatically obtained during job submission and added to the job cache
  – We must rely on YARN to do the right thing
If you thought Kerberos documentation for Hadoop/HBase was sparse…
Delegation Tokens
[Diagram: Client, KDC, HBase Master, YARN ResourceManager, YARN Containers, HBase RegionServers. Labels: Password/Keytab, Keytab, Obtain DT, Client Ticket and DT, YARN Ticket and DT, DT.]
Proxy Users
⬢ A proxy is some intermediate service that provides access to a backend service
  – HBase Thrift and REST services
⬢ Each of these services has its own Kerberos principal and keytab used to communicate with HBase
⬢ These services are accessing HBase on behalf of another user
  – The ticket is for the service, but we want it to appear as if it is [email protected]
⬢ ProxyUsers refer to a set of configuration values in Hadoop (core-site.xml)
  – hadoop.proxyuser.SERVICE.{hosts,groups,users}
⬢ Configuration-based approach to allow services to “pretend” to be a user without actually having that user’s credentials
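The hadoop.proxyuser.SERVICE.{hosts,groups,users} pattern expands into up to three property names per proxying service. As a minimal sketch, the helper below assembles those names for a given service user; the class name ProxyUserSettings is hypothetical, and the service name and values in the usage note are placeholders rather than recommended settings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal sketch: builds the hadoop.proxyuser.* property names for one service user. */
final class ProxyUserSettings {
    /**
     * Returns the proxy-user properties for the given service. A null value means
     * "leave that restriction unset", so the key is omitted from the result.
     */
    static Map<String, String> forService(String service, String hosts,
                                          String groups, String users) {
        String prefix = "hadoop.proxyuser." + service + ".";
        Map<String, String> settings = new LinkedHashMap<>();
        if (hosts != null) {
            settings.put(prefix + "hosts", hosts);    // hosts the proxy may connect from
        }
        if (groups != null) {
            settings.put(prefix + "groups", groups);  // groups whose members may be impersonated
        }
        if (users != null) {
            settings.put(prefix + "users", users);    // individual users who may be impersonated
        }
        return settings;
    }
}
```

For instance, forService("rest", "proxy1.hwx.com", "analysts", null) yields hadoop.proxyuser.rest.hosts and hadoop.proxyuser.rest.groups, which would then be set in core-site.xml.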
Proxy Users
[Diagram: Client, KDC, Proxy Server, HBase. Labels: Password/Keytab, Keytab, Keytab, Client Ticket, Server Ticket (Client principal).]
Proxy Servers: HBase REST, HBase Thrift, Phoenix Query Server, etc
Kerberos authentication for HTTP-based services (SPNEGO)
⬢ The need to protect services using HTTP
  – Don’t want to reuse SASL
⬢ Simple and Protected GSSAPI Negotiation Mechanism (SPNEGO), RFC 4178
  – The Negotiate HTTP header
  – Built into cURL (--negotiate), most Java-based HTTP libraries, and web browsers
⬢ Web browsers often need special configuration to properly authenticate
  – Firefox: network.negotiate-auth.delegation-uris, network.negotiate-auth.trusted-uris
  – Chrome: --auth-server-whitelist="*.domain" --auth-negotiate-delegate-whitelist="*.domain"
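At the HTTP level, “the Negotiate HTTP header” means the server challenges with WWW-Authenticate: Negotiate and the client answers with Authorization: Negotiate followed by a base64-encoded GSSAPI token. The sketch below covers only that header framing. The class name NegotiateHeader is hypothetical, and the token bytes are a stand-in; a real token would come from the JDK's GSS-API classes (org.ietf.jgss) after a Kerberos login.

```java
import java.util.Base64;

/** Minimal sketch: framing of SPNEGO tokens in the HTTP Negotiate scheme. */
final class NegotiateHeader {
    static final String SCHEME = "Negotiate";

    /** True when a WWW-Authenticate value asks the client to use the Negotiate scheme. */
    static boolean isChallenge(String wwwAuthenticate) {
        if (wwwAuthenticate == null) {
            return false;
        }
        String value = wwwAuthenticate.trim();
        return value.regionMatches(true, 0, SCHEME, 0, SCHEME.length())
            && (value.length() == SCHEME.length() || value.charAt(SCHEME.length()) == ' ');
    }

    /** Builds the Authorization header value that carries a GSSAPI token. */
    static String authorizationValue(byte[] gssToken) {
        return SCHEME + " " + Base64.getEncoder().encodeToString(gssToken);
    }

    /** Extracts the raw token bytes from an Authorization header value. */
    static byte[] tokenFrom(String authorization) {
        String prefix = SCHEME + " ";
        if (authorization == null
                || !authorization.regionMatches(true, 0, prefix, 0, prefix.length())) {
            throw new IllegalArgumentException("Not a Negotiate header value");
        }
        return Base64.getDecoder().decode(authorization.substring(prefix.length()).trim());
    }
}
```

Tools such as cURL with --negotiate perform this exchange automatically; the sketch is mainly useful for recognizing the headers when reading HTTP traces.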
Troubleshooting: Prerequisites
⬢ Ensure a recent version of your JVM and Hadoop
  – Bugs exist in UserGroupInformation for certain JVMs (vendor+version)
⬢ Ensure that the unlimited strength Java Cryptographic Extensions (JCE) are installed on all nodes in the cluster
  – And that clients/servers are using that JVM installation!
  – Required for AES-256 encryption type on Kerberos keys (which you will likely get by default)
⬢ Ensure that you have DEBUG logging enabled for HBase services
  – Potentially, org.apache.hadoop.hbase.ipc=DEBUG is sufficient
⬢ Set the sun.security.krb5.debug system property to true in your application
  – Or sun.security.spnego.debug for debugging SPNEGO
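The two debug properties can be set on the command line (-Dsun.security.krb5.debug=true) or programmatically. A minimal sketch of the programmatic route follows; the class name KerberosDebugFlags is hypothetical. It assumes the call happens at the very start of main, before any Kerberos or SPNEGO code runs, since the JDK may read these flags only once.

```java
/** Minimal sketch: turns on the JDK's Kerberos (and optionally SPNEGO) debug output. */
final class KerberosDebugFlags {
    static final String KRB5_DEBUG = "sun.security.krb5.debug";
    static final String SPNEGO_DEBUG = "sun.security.spnego.debug";

    /** Sets the Kerberos debug property, plus the SPNEGO one when requested. */
    static void enable(boolean includeSpnego) {
        System.setProperty(KRB5_DEBUG, "true");
        if (includeSpnego) {
            System.setProperty(SPNEGO_DEBUG, "true");
        }
    }
}
```

The extra output goes to standard output rather than through the logging framework, so it helps to capture the process's stdout alongside the HBase logs.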
Troubleshooting: Tips
⬢ Remember that DNS is the cornerstone
  – When reading logs, make sure that you see the expected fully-qualified domain names
  – Do not assume that DNS is correct: verify it
⬢ Determine if an RPC issue is authentication or authorization
  – If you see an HBase-level error, it is likely an authorization issue
  – If you only see transport/connection-setup errors, it is likely an authentication issue
⬢ Remember that tickets expire
  – Cross-reference ticket lifetimes with application logs
⬢ Read the logs. Actually read them.
  – The vast majority of errors can be solved with appropriate logging and JVM debugging
Reference Material
⬢ “Hadoop and Kerberos: The Madness beyond the Gate”
  – https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/index.html
⬢ Oracle documentation
  – http://docs.oracle.com/javase/7/docs/technotes/guides/security/jaas/tutorials/GeneralAcnOnly.html
  – https://docs.oracle.com/javase/7/docs/jre/api/security/jaas/spec/com/sun/security/auth/module/Krb5LoginModule.html
⬢ MIT Kerberos documentation
  – http://web.mit.edu/kerberos/
⬢ “Explain like I’m 5: Kerberos” (great low-level Kerberos write-up)
  – http://www.roguelynn.com/words/explain-like-im-5-kerberos/
⬢ KDiag: “Kerberos diagnostics for Hadoop”
  – Apache Hadoop >=2.8 or https://github.com/steveloughran/kdiag
Developing with Kerberos
⬢ Apache Directory’s Kerby project
  – Great for Kerberos authentication without Hadoop in the picture
  – http://directory.apache.org/kerby/downloads.html
⬢ Apache Hadoop’s MiniKDC
  – Built on top of Apache Directory
  – https://github.com/apache/hadoop/blob/release-2.7.3-RC2/hadoop-common-project/hadoop-minikdc/src/main/java/org/apache/hadoop/minikdc/MiniKdc.java
⬢ Support in HDFS, YARN, and HBase MiniCluster classes too
No excuse to not write tests!
Apache Directory is a trademark of the Apache Software Foundation
Thanks!
Email: [email protected]: @josh_elser
3dom via https://www.flickr.com/photos/steve_l/6674480535/in/album-72157629289333057/, CC-BY-NC
Thanks to those who gave feedback along the way: Brandon Wilson, Bryan Bende, Michael Stack, Randy Gelhausen, Steve Loughran.