How to configure clients to connect to Apache Kafka Clusters securely – Part 1: Kerberos

This is the first installment in a short series of blog posts about security in Apache Kafka. In this article we will explain how to configure clients to authenticate with clusters using different authentication mechanisms.

Secured Apache Kafka clusters can be configured to enforce authentication using different methods, including the following:

  • SSL – TLS client authentication
  • SASL/GSSAPI – Kerberos authentication
  • SASL/PLAIN – LDAP and file-based authentication
  • SASL/SCRAM-SHA-256 and SASL/SCRAM-SHA-512
  • SASL/OAUTHBEARER

In this article we will start looking into Kerberos authentication and will focus on the client-side configuration required to authenticate with clusters configured to use Kerberos. The other authentication mechanisms will be covered in subsequent articles in this series.

We will not cover the server-side configuration in this article but will add some references to it when required to make the examples clearer.

The examples shown here combine the authentication-related properties (such as security.protocol below) with the other security properties a client requires (such as the TLS truststore location). TLS is assumed to be enabled for the Apache Kafka cluster, as it should be for every secure cluster.

security.protocol=SASL_SSL
ssl.truststore.location=/opt/cloudera/security/jks/truststore.jks

We use the kafka-console-consumer for all the examples below. All the concepts and configurations apply to other applications as well.

Kerberos Authentication

Kerberos is by far the most common option we see being used in the field to secure Kafka clusters. It enables users to use their corporate identities, stored in services like Active Directory, Red Hat IPA, and FreeIPA, which simplifies identity management. A kerberized Kafka cluster also makes it easier to integrate with other services in a Big Data ecosystem, which typically use Kerberos for strong authentication.

Kafka implements Kerberos authentication through the Simple Authentication and Security Layer (SASL), an authentication framework standardized by the IETF in RFC 4422. SASL supports multiple authentication mechanisms; the one that implements Kerberos authentication is called GSSAPI.

The basic Kafka client properties that must be set to configure the Kafka client to authenticate via Kerberos are shown below:

# Uses SASL/GSSAPI over a TLS encrypted connection
security.protocol=SASL_SSL
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=kafka
# TLS truststore
ssl.truststore.location=/opt/cloudera/security/jks/truststore.jks

The configuration above uses Kerberos (SASL/GSSAPI) for authentication. TLS (SSL) is used for data encryption over the wire only.
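For a Java application, the same settings can be assembled in code. The sketch below is only an illustration: the class and method names are hypothetical, and it uses nothing but java.util.Properties so that it runs without the Kafka client library on the classpath. In a real application the resulting object would typically be passed to a client such as KafkaConsumer, together with the usual bootstrap server and deserializer settings.

```java
import java.util.Properties;

// Hypothetical helper: builds the basic Kerberos (SASL/GSSAPI) client settings shown above.
final class KerberosClientProps {

    static Properties basic(String truststorePath) {
        Properties props = new Properties();
        // SASL/GSSAPI over a TLS-encrypted connection
        props.setProperty("security.protocol", "SASL_SSL");
        props.setProperty("sasl.mechanism", "GSSAPI");
        // Service name of the brokers' Kerberos principal ("kafka" in this article)
        props.setProperty("sasl.kerberos.service.name", "kafka");
        // TLS truststore
        props.setProperty("ssl.truststore.location", truststorePath);
        return props;
    }

    public static void main(String[] args) {
        Properties props = basic("/opt/cloudera/security/jks/truststore.jks");
        // Print the settings in a stable (sorted) order
        props.stringPropertyNames().stream().sorted()
             .forEach(key -> System.out.println(key + "=" + props.getProperty(key)));
    }
}
```

These properties select the mechanism only; the credentials still have to be supplied through JAAS.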

JAAS configuration

The properties above, though, don’t provide the client with the credentials it needs to authenticate with the Kafka cluster. We need some more information.

When using Kerberos, we can provide the credentials to the client application in two ways: either as a valid Kerberos ticket stored in a ticket cache, or as a keytab file, which the application can use to obtain a Kerberos ticket.

The handling of the Kerberos credentials in a Kafka client is done by the Java Authentication and Authorization Service (JAAS) library. So we need to configure the client with the necessary information so that JAAS knows where to get the credentials from.

There are two ways to set those properties for the Kafka client:

  • Create a JAAS configuration file and set the Java system property java.security.auth.login.config to point to it; OR
  • Set the Kafka client property sasl.jaas.config with the JAAS configuration inline. 

In this section we show how to use both methods. The examples in this article will use the sasl.jaas.config method for simplicity. 

Using a JAAS configuration file

If you are using a JAAS configuration file, you need to tell the Kafka Java client where to find it. This is done by setting the following Java system property on the command line:

... -Djava.security.auth.login.config=/path/to/jaas.conf ...

If you’re using Kafka command-line tools in the Cloudera Data Platform (CDP) this can be achieved by setting the following environment variable:

$ export KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/jaas.conf"
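If the JVM options of an application cannot be changed, the same system property can also be set from application code, as long as this happens before the first Kafka client is created. The following is a sketch under that assumption; the class name and the decision not to override an existing value are illustrative choices, not something the Kafka tooling prescribes.

```java
// Hypothetical helper: points JAAS at a configuration file from application code.
final class JaasFileConfig {

    static final String JAAS_PROPERTY = "java.security.auth.login.config";

    // Sets the property only if it was not already provided (for example with -D
    // or through KAFKA_OPTS) and returns the value that is in effect afterwards.
    static String useJaasFile(String path) {
        if (System.getProperty(JAAS_PROPERTY) == null) {
            System.setProperty(JAAS_PROPERTY, path);
        }
        return System.getProperty(JAAS_PROPERTY);
    }

    public static void main(String[] args) {
        // Call this before constructing any Kafka client.
        System.out.println(JAAS_PROPERTY + "=" + useJaasFile("/path/to/jaas.conf"));
    }
}
```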

The contents of the configuration file depend on where the credentials are being sourced from. To use a Kerberos ticket stored in the user’s ticket cache, use the following jaas.conf file:

KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true;
};

To use a keytab, use the following instead:

KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytabs/alice.keytab"
  principal="alice@EXAMPLE.COM";
};

Using the sasl.jaas.config property

Instead of using a separate JAAS configuration file, I usually prefer setting the JAAS configuration for the client using the sasl.jaas.config Kafka property. This is usually simpler and gets rid of the additional configuration file (jaas.conf). The configurations below are equivalent to the jaas.conf configurations above.

Note: each of the settings below must be written on a single line. The semicolon at the end of the line is required.

To use a Kerberos ticket stored in a ticket cache:

sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useTicketCache=true;

To use a keytab, use the following instead:

sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab="/etc/security/keytabs/alice.keytab" principal="alice@EXAMPLE.COM";
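Because the inline value is a single line with strict quoting, it can be convenient to build it in code when the keytab path or principal varies per application. The helper below is a hypothetical sketch (the class and method names are not part of Kafka) that produces the keytab variant shown above.

```java
// Hypothetical helper: builds the single-line sasl.jaas.config value for a keytab login.
final class JaasInline {

    static String keytabLogin(String keytabPath, String principal) {
        // Everything stays on one line and ends with the required semicolon.
        return "com.sun.security.auth.module.Krb5LoginModule required"
                + " useKeyTab=true"
                + " keyTab=\"" + keytabPath + "\""
                + " principal=\"" + principal + "\";";
    }

    public static void main(String[] args) {
        System.out.println("sasl.jaas.config="
                + keytabLogin("/etc/security/keytabs/alice.keytab", "alice@EXAMPLE.COM"));
    }
}
```

The returned string is what an application would set as the value of the sasl.jaas.config client property.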

Example

The following example uses the Kafka console consumer to read from a topic with Kerberos authentication, connecting directly to the broker (without using a load balancer):

# Complete configuration file for Kerberos auth using the ticket cache
$ cat krb-client.properties
security.protocol=SASL_SSL
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=kafka
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useTicketCache=true;
ssl.truststore.location=/opt/cloudera/security/jks/truststore.jks

# Authenticate with Kerberos to get a valid ticket
$ kinit alice
Password for alice@REALM:

# Connect to Kafka using the ticket in the ticket cache
$ kafka-console-consumer \
    --bootstrap-server host-1.example.com:9093 \
    --topic test \
    --consumer.config /path/to/krb-client.properties
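
A Java application can load the same krb-client.properties file instead of duplicating its settings. The sketch below assumes the file format shown above; the class name and the list of keys it checks are illustrative choices. The loaded Properties would still need application settings such as bootstrap servers and deserializers before being handed to a Kafka client.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: loads client properties and reports missing Kerberos settings.
final class KrbClientConfig {

    // Keys used by the inline-JAAS Kerberos setup in this article (an illustrative list).
    static final List<String> EXPECTED_KEYS = List.of(
            "security.protocol",
            "sasl.mechanism",
            "sasl.kerberos.service.name",
            "sasl.jaas.config");

    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the expected keys that are absent, in the order listed above.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED_KEYS) {
            if (props.getProperty(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: KrbClientConfig /path/to/krb-client.properties");
            return;
        }
        try (Reader reader = Files.newBufferedReader(Path.of(args[0]))) {
            System.out.println("Missing keys: " + missingKeys(load(reader)));
        }
    }
}
```

Checking for missing keys up front is optional, but it can make a misconfigured client fail with a clearer message than a rejected connection.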

Network connectivity to Kerberos

A central component of Kerberos is the Key Distribution Center (KDC). The KDC is the service that handles all Kerberos authentication requests initiated by clients. For Kerberos authentication to work, both the Kafka cluster and the clients must have connectivity to the KDC.

In a corporate environment, this is easily achievable and it is usually the case. In some deployments, though, the KDC may be placed behind a firewall, making it impossible for the clients to reach it to get a valid ticket.
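
One way to tell a firewall problem from a configuration problem is to test whether the client host can open a connection to the KDC at all. The sketch below assumes the KDC listens on the standard Kerberos port 88 over TCP; the host name kdc.example.com is a placeholder and the class name is hypothetical. A successful TCP connection only shows that the port is reachable, not that authentication will succeed, and Kerberos traffic may also use UDP.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks plain TCP reachability of a host and port.
final class KdcReachability {

    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder host; pass the real KDC host name as the first argument.
        String kdcHost = args.length > 0 ? args[0] : "kdc.example.com";
        System.out.println(kdcHost + ":88 reachable = " + canConnect(kdcHost, 88, 3000));
    }
}
```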

Cloud and hybrid deployments (cloud + on-prem) can make it a challenge for clients to use Kerberos authentication, as the on-prem KDC is usually not integrated into the cloud-deployed services. However, since Kafka supports other authentication mechanisms, clients have other alternatives at their disposal, as we’ll explore in the next article.

Andre Araujo

Data in Motion Field Engineer

2 Comments

by Suriawan Limantara on

Good article Andre. Will there be article covering how to submit a spark job to a secured spark cluster where the spark job needs to connect to different secured kafka cluster? The spark and kafka clusters have different realm. Thank you.

by Andre Araujo on

Hi, Suri! I hadn’t planned anything specific to Spark but I will cover content in the next posts, like using LDAP authentication for Kafka, that will make this easier.

You can also do this with Kerberos. If your remote Kerberos realm trusts the local one, there’s nothing special you need to do. Otherwise, when you submit your Spark job, you have to also provide the job with a keytab for authenticating with the remote realm and you’ll need to set your Kafka client properties to use this keytab when connecting (using the sasl.jaas.config property, as explained in this post). The krb5.conf on the Spark cluster must have the information about both realms for this to work.
