DC/OS Kafka ZooKeeper Security

  • The DC/OS Kafka ZooKeeper service allows you to create a service account to configure access for Kafka ZooKeeper. The service allows you to create and assign permissions as required for access.

  • The DC/OS Kafka ZooKeeper service supports ZooKeeper’s native Kerberos authentication mechanism. The service provides automation and orchestration to simplify the usage of these important features, with both Client-Server and Server-Server mutual authentication supported.

An overview of the ZooKeeper Kerberos security features can be found here.

Note: These security features are only available on DC/OS Enterprise 1.10 and above.

Provisioning a service account

This section describes how to configure DC/OS access for Kafka ZooKeeper. Depending on your security mode, Kafka ZooKeeper may require service authentication for access to DC/OS.

Security mode    Service Account
Disabled         Not available
Permissive       Optional
Strict           Required

If you install a service in permissive mode and do not specify a service account, Metronome and Marathon will act as if requests made by this service are made by an account with the superuser permission.

Prerequisites:

Create a Key Pair

In this step, a 2048-bit RSA public-private key pair is created using the Enterprise DC/OS CLI.

Create a public-private key pair and save each value into a separate file within the current directory.

dcos security org service-accounts keypair <private-key>.pem <public-key>.pem

NOTE: You can use the DC/OS Secret Store to secure the key pair.

Create a Service Account

From a terminal prompt, create a new service account (<service-account-id>) containing the public key (<your-public-key>.pem).

dcos security org service-accounts create -p <your-public-key>.pem -d "Kafka ZooKeeper service account" <service-account-id>

You can verify your new service account using the following command.

dcos security org service-accounts show <service-account-id>

Create a Secret

Create a secret (kafka-zookeeper/<secret-name>) with your service account (<service-account-id>) and private key specified (<private-key>.pem).

NOTE: If you store your secret in a path that matches the service name (for example, both the service name and the secret path are kafka-zookeeper), then only the service named kafka-zookeeper can access it.

Permissive

dcos security secrets create-sa-secret <private-key>.pem <service-account-id> kafka-zookeeper/<secret-name>

Strict

dcos security secrets create-sa-secret --strict <private-key>.pem <service-account-id> kafka-zookeeper/<secret-name>

You can list the secrets with this command:

dcos security secrets list /
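
The key pair, service account, and secret steps above can be chained in one helper. The following is a minimal sketch, not an official script: the function name `provision_service_account`, the `<service-account-id>-private.pem` / `-public.pem` file naming, and the account description text are assumptions made for illustration. The `DCOS` variable is only there so that the sequence can be dry-run (for example with `DCOS=echo`) before touching a cluster.

```shell
#!/bin/sh
# DCOS can be overridden (e.g. DCOS=echo) to print the commands instead of running them.
DCOS=${DCOS:-dcos}

# Hypothetical helper: create a key pair, a service account, and its secret.
provision_service_account() (
    mode=$1     # "permissive" or "strict"
    sa=$2       # the <service-account-id>
    secret=$3   # secret path, e.g. kafka-zookeeper/<secret-name>

    strict_flag=""
    [ "$mode" = "strict" ] && strict_flag="--strict"

    # $strict_flag is intentionally unquoted below so that it disappears when empty.
    "$DCOS" security org service-accounts keypair "$sa-private.pem" "$sa-public.pem" &&
    "$DCOS" security org service-accounts create -p "$sa-public.pem" -d "Kafka ZooKeeper service account" "$sa" &&
    "$DCOS" security secrets create-sa-secret $strict_flag "$sa-private.pem" "$sa" "$secret"
)
```

For example, `DCOS=echo provision_service_account strict zk-sa kafka-zookeeper/sa-secret` prints the three commands that would run in strict mode without executing them.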

Create and Assign Permissions

Use the following curl commands to rapidly provision the Kafka ZooKeeper service account with the required permissions.

NOTE: Any forward slash ("/") in a resource must be replaced with `%252F` before it can be passed in a curl command.
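
As an illustration of the encoding rule in the note above, the substitution can be done with a small shell function. This is only a sketch: `encode_resource` is a hypothetical helper name, and the resource ID in the example was chosen purely for demonstration.

```shell
#!/bin/sh
# Hypothetical helper: replace every "/" in a resource ID with %252F
# so that the ID can be embedded in the URL passed to curl.
encode_resource() {
    printf '%s\n' "$1" | sed 's|/|%252F|g'
}

# Example resource ID, used only for illustration.
encode_resource "dcos:service:marathon:marathon:services:/dev/zk"
# prints dcos:service:marathon:marathon:services:%252Fdev%252Fzk
```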

When using the API to manage permissions, you must first create the permissions and then assign them. Sometimes, the permission may already exist. In this case, the API returns an informative message. You can regard this as a confirmation and continue to the next command.

  1. Create the permission.

IMPORTANT: These commands use the default Kafka ZooKeeper `role` value of `kafka-zookeeper-role`. If you are running multiple instances of Kafka ZooKeeper, replace the instances of `kafka-zookeeper-role` with the correct name (`<name>-role`). For example, if you have a Kafka ZooKeeper instance named `kafka-zookeeper2`, you would replace each role value in the code samples with `kafka-zookeeper2-role`.

Permissive

Run these commands with your service account name (<service-account-id>) specified.

curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:task:user:nobody \
-d '{"description":"Allows Linux user nobody to execute tasks"}' \
-H 'Content-Type: application/json'
curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:framework:role:kafka-zookeeper-role \
-d '{"description":"Controls the ability of kafka-zookeeper-role to register as a framework with the Mesos master"}' \
-H 'Content-Type: application/json'
curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:reservation:role:kafka-zookeeper-role \
-d '{"description":"Controls the ability of kafka-zookeeper-role to reserve resources"}' \
-H 'Content-Type: application/json'
curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:volume:role:kafka-zookeeper-role \
-d '{"description":"Controls the ability of kafka-zookeeper-role to access volumes"}' \
-H 'Content-Type: application/json'
curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:reservation:principal:<service-account-id> \
-d '{"description":"Controls the ability of <service-account-id> to reserve resources"}' \
-H 'Content-Type: application/json'
curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:volume:principal:<service-account-id> \
-d '{"description":"Controls the ability of <service-account-id> to access volumes"}' \
-H 'Content-Type: application/json'

Strict

Run these commands with your service account name (<service-account-id>) specified.

curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:framework:role:kafka-zookeeper-role \
-d '{"description":"Controls the ability of kafka-zookeeper-role to register as a framework with the Mesos master"}' \
-H 'Content-Type: application/json'
curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:reservation:role:kafka-zookeeper-role \
-d '{"description":"Controls the ability of kafka-zookeeper-role to reserve resources"}' \
-H 'Content-Type: application/json'
curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:volume:role:kafka-zookeeper-role \
-d '{"description":"Controls the ability of kafka-zookeeper-role to access volumes"}' \
-H 'Content-Type: application/json'
curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:reservation:principal:<service-account-id> \
-d '{"description":"Controls the ability of <service-account-id> to reserve resources"}' \
-H 'Content-Type: application/json'
curl -X PUT --cacert dcos-ca.crt \
-H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:volume:principal:<service-account-id> \
-d '{"description":"Controls the ability of <service-account-id> to access volumes"}' \
-H 'Content-Type: application/json'

  2. Grant the permissions and the allowed actions to the service account using the following commands.

    IMPORTANT: These commands use the default Kafka ZooKeeper `role` value of `kafka-zookeeper-role`. If you are running multiple instances of Kafka ZooKeeper, replace the instances of `kafka-zookeeper-role` with the correct name (`<name>-role`). For example, if you have a Kafka ZooKeeper instance named `kafka-zookeeper2`, you would replace each role value in the code samples with `kafka-zookeeper2-role`.

    Run these commands with your service account name (<service-account-id>) specified.

    curl -X PUT --cacert dcos-ca.crt \
    -H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:framework:role:kafka-zookeeper-role/users/<service-account-id>/create
    curl -X PUT --cacert dcos-ca.crt \
    -H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:reservation:role:kafka-zookeeper-role/users/<service-account-id>/create
    curl -X PUT --cacert dcos-ca.crt \
    -H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:volume:role:kafka-zookeeper-role/users/<service-account-id>/create
    curl -X PUT --cacert dcos-ca.crt \
    -H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:task:user:nobody/users/<service-account-id>/create
    curl -X PUT --cacert dcos-ca.crt \
    -H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:reservation:principal:<service-account-id>/users/<service-account-id>/delete
    curl -X PUT --cacert dcos-ca.crt \
    -H "Authorization: token=$(dcos config show core.dcos_acs_token)" $(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:volume:principal:<service-account-id>/users/<service-account-id>/delete
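
Because the grant commands differ only in the resource, the role name, and the action, they can be generated from a list. The sketch below assumes nothing beyond the commands shown above; `acl_grants` and `apply_grants` are hypothetical helper names, and `apply_grants` expects `dcos-ca.crt` in the current directory just as the curl commands do.

```shell
#!/bin/sh
# Hypothetical helper: print one "<resource> <action>" pair per line for the
# given role (e.g. kafka-zookeeper-role) and service account ID.
acl_grants() (
    role=$1
    sa=$2
    printf '%s\n' \
        "dcos:mesos:master:framework:role:$role create" \
        "dcos:mesos:master:reservation:role:$role create" \
        "dcos:mesos:master:volume:role:$role create" \
        "dcos:mesos:master:task:user:nobody create" \
        "dcos:mesos:master:reservation:principal:$sa delete" \
        "dcos:mesos:master:volume:principal:$sa delete"
)

# Hypothetical helper: issue one PUT per pair, mirroring the curl commands above.
apply_grants() (
    role=$1
    sa=$2
    token=$(dcos config show core.dcos_acs_token)
    url=$(dcos config show core.dcos_url)
    acl_grants "$role" "$sa" | while read -r resource action; do
        curl -X PUT --cacert dcos-ca.crt \
            -H "Authorization: token=$token" \
            "$url/acs/api/v1/acls/$resource/users/$sa/$action"
    done
)
```

Running `acl_grants kafka-zookeeper2-role <service-account-id>` first is a cheap way to review the exact resources before calling `apply_grants` for a second instance.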
    

Authentication

DC/OS Kafka ZooKeeper supports the Kerberos authentication mechanism.

Kerberos Authentication

Kerberos authentication relies on a central authority to verify that ZooKeeper clients are who they say they are. DC/OS Kafka ZooKeeper integrates with your existing Kerberos infrastructure to verify the identity of clients.

Prerequisites

  • The hostname and port of a KDC reachable from your DC/OS cluster
  • Sufficient access to the KDC to create Kerberos principals
  • Sufficient access to the KDC to retrieve a keytab for the generated principals
  • The DC/OS Enterprise CLI
  • DC/OS Superuser permissions

Configure Kerberos Authentication

Create principals

The DC/OS Kafka ZooKeeper service requires a Kerberos principal for each server to be deployed. Each principal must be of the form

<service primary>/zookeeper-<server index>-server.<service subdomain>.autoip.dcos.thisdcos.directory@<service realm>

with:

  • service primary = service.security.kerberos.primary
  • server index = 0 up to node.count - 1
  • service subdomain = service.name with all /'s removed
  • service realm = service.security.kerberos.realm

For example, if installing with these options in addition to your own:

{
    "service": {
        "name": "a/good/example",
        "security": {
            "kerberos": {
                "primary": "example",
                "realm": "EXAMPLE"
            }
        }
    },
    "node": {
        "count": 3
    }
}

then the principals to create would be:

example/zookeeper-0-server.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
example/zookeeper-1-server.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
example/zookeeper-2-server.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE
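
The naming rule above can be sketched as a small shell function that prints one principal per server. The function name `zk_principals` and its argument order are assumptions made for this illustration; the format string follows the principal form given in this section.

```shell
#!/bin/sh
# Hypothetical helper: print the Kerberos principal for each ZooKeeper server.
# Arguments: <service primary> <service name> <realm> <node count>
zk_principals() (
    primary=$1
    name=$2
    realm=$3
    count=$4
    # The service subdomain is the service name with every "/" removed.
    subdomain=$(printf '%s' "$name" | tr -d '/')
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s/zookeeper-%s-server.%s.autoip.dcos.thisdcos.directory@%s\n' \
            "$primary" "$i" "$subdomain" "$realm"
        i=$((i + 1))
    done
)

# Reproduces the three principals listed above.
zk_principals example a/good/example EXAMPLE 3
```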

Active Directory

Microsoft Active Directory can be used as a Kerberos KDC. Doing so requires creating a mapping between Active Directory users and Kerberos principals.

The utility ktpass can be used to both create a keytab from Active Directory and generate the mapping at the same time.

The mapping can, however, be created manually. For a Kerberos principal like <primary>/<host>@<REALM>, the Active Directory user should have its servicePrincipalName and userPrincipalName attributes set to:

servicePrincipalName = <primary>/<host>
userPrincipalName = <primary>/<host>@<REALM>

For example, for the Kerberos principal example/zookeeper-0-server.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE, the correct mapping would be:

servicePrincipalName = example/zookeeper-0-server.agoodexample.autoip.dcos.thisdcos.directory
userPrincipalName = example/zookeeper-0-server.agoodexample.autoip.dcos.thisdcos.directory@EXAMPLE

If either mapping is incorrect or not present, the service will fail to authenticate that principal. The symptom in the Kerberos debug logs will be an error of the form

KRBError:
sTime is Wed Feb 07 03:22:47 UTC 2018 1517973767000
suSec is 697984
error code is 6
error Message is Client not found in Kerberos database
sname is krbtgt/AD.MESOSPHERE.COM@AD.MESOSPHERE.COM
msgType is 30

when the userPrincipalName is set incorrectly, and an error of the form

KRBError:
sTime is Wed Feb 07 03:44:57 UTC 2018 1517975097000
suSec is 128465
error code is 7
error Message is Server not found in Kerberos database
sname is kafka/kafka-1-broker.confluent-kafka.autoip.dcos.thisdcos.directory@AD.MESOSPHERE.COM
msgType is 30

when the servicePrincipalName is set incorrectly.

Place Service Keytab in DC/OS Secret Store

The DC/OS Kafka ZooKeeper service uses a keytab containing all node principals (service keytab). After creating the principals above, generate the service keytab making sure to include all the node principals. This will be stored as a secret in the DC/OS Secret Store.

NOTE: DC/OS 1.10 does not support adding binary secrets directly to the secret store; only text files are supported. Instead, first base64 encode the file, and save it to the secret store as `/desired/path/__dcos_base64__secret_name`. The DC/OS security modules will handle decoding the file when it is used by the service.

The service keytab should be stored at service/path/name/service.keytab (as noted above, for DC/OS 1.10 it would be __dcos_base64__service.keytab), where service/path/name matches the path and name of the service. For example, if installing with the options

{
    "service": {
        "name": "a/good/example"
    }
}

then the service keytab should be stored at a/good/example/service.keytab.
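
The path rule can be expressed as a short function, shown here as a sketch only. `keytab_secret_path` is a hypothetical helper name, the optional `base64` argument selects the DC/OS 1.10 naming described in the note above, and stripping a leading "/" from the service name is an assumption of this example.

```shell
#!/bin/sh
# Hypothetical helper: derive the secret path for the service keytab.
# Arguments: <service name> [base64]
keytab_secret_path() (
    service_name=${1#/}    # assumption: drop one leading "/" if present
    prefix=""
    # On DC/OS 1.10 the base64-encoded keytab uses the __dcos_base64__ prefix.
    [ "${2:-}" = "base64" ] && prefix="__dcos_base64__"
    printf '%s/%sservice.keytab\n' "$service_name" "$prefix"
)

keytab_secret_path a/good/example
# prints a/good/example/service.keytab
keytab_secret_path a/good/example base64
# prints a/good/example/__dcos_base64__service.keytab
```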

Documentation for adding a file to the secret store can be found here.

NOTE: Secrets access is controlled by [DC/OS Spaces](/latest/security/ent/#spaces-for-secrets), which function like namespaces. Any secret in the same DC/OS Space as the service will be accessible by the service. However, matching the two paths is the most secure option. Additionally, the secret name `service.keytab` is a convention and not a requirement.

Install the Service

Install the DC/OS Kafka ZooKeeper service with the following options in addition to your own:

{
    "service": {
        "security": {
            "kerberos": {
                "enabled": true,
                "kdc": {
                    "hostname": "<kdc host>",
                    "port": <kdc port>
                },
                "primary": "<service primary default zookeeper>",
                "realm": "<realm>",
                "keytab_secret": "<path to keytab secret>",
                "debug": <true|false default false>
            }
        }
    }
}
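
One way to produce such an options file is to generate it from the four required values. The sketch below is illustrative only: `kerberos_options` is a hypothetical helper name, and it leaves out `primary` and `debug` so that the defaults shown above (zookeeper and false) apply.

```shell
#!/bin/sh
# Hypothetical helper: print Kerberos install options as JSON.
# Arguments: <kdc host> <kdc port> <realm> <path to keytab secret>
kerberos_options() (
    kdc_host=$1
    kdc_port=$2
    realm=$3
    keytab_secret=$4
    cat <<EOF
{
    "service": {
        "security": {
            "kerberos": {
                "enabled": true,
                "kdc": {
                    "hostname": "$kdc_host",
                    "port": $kdc_port
                },
                "realm": "$realm",
                "keytab_secret": "$keytab_secret"
            }
        }
    }
}
EOF
)
```

For example, `kerberos_options kdc.example.com 88 EXAMPLE a/good/example/service.keytab > options.json` writes a file that can then be merged with your own options and passed at install time, for instance with `dcos package install kafka-zookeeper --options=options.json` (assuming the package is named kafka-zookeeper in your catalog).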

Note: It is possible to enable Kerberos after initial installation, but the service may be unavailable during the transition. Additionally, your ZooKeeper clients will need to be reconfigured. For more information, see the Enabling Kerberos After Deployment section.

Enabling Kerberos After Deployment

It is possible to enable Kerberos authentication after the deployment of DC/OS Kafka ZooKeeper. As described in the [Rolling Upgrade](https://cwiki.apache.org/confluence/display/ZOOKEEPER/Server-Server+mutual+authentication) section of the Apache ZooKeeper documentation, this requires multiple rolling restarts of the ZooKeeper ensemble and client connectivity may be lost at times.

Assuming that DC/OS Kafka ZooKeeper was initially deployed with service.security.kerberos.enabled set to false, the following steps can be used to enable Kerberos for the service.

Firstly – assuming the same Kerberos settings as discussed in Configure Kerberos Authentication – create the keytab for the Kerberos principals and add this keytab to the DC/OS Secret Store as described in the Create principals and Place Service Keytab in DC/OS Secret Store sections. Then create a kerberos-toggle-step-1.json file with the following contents:

{
    "service": {
        "security": {
            "kerberos": {
                "enabled": true,
                "kdc": {
                    "hostname": "<kdc host>",
                    "port": <kdc port>
                },
                "primary": "<service primary default zookeeper>",
                "realm": "<realm>",
                "keytab_secret": "<path to keytab secret>",
                "debug": <true|false default false>,
                "advanced": {
                    "required_for_quorum_learner": false,
                    "required_for_quorum_server": false,
                    "required_for_client": false
                }
            }
        }
    }
}

Note the service.security.kerberos.advanced section included here, with all three of its options set to false.

Using this config file, update your DC/OS Kafka ZooKeeper service:

$ dcos kafka-zookeeper --name=<service name> update start --options=kerberos-toggle-step-1.json

and wait for the deploy (update) plan to complete:

$ dcos kafka-zookeeper --name=<service name> plan show deploy
deploy (serial strategy) (COMPLETE)
└─ node-update (serial strategy) (COMPLETE)
   ├─ zookeeper-0:[server, metrics] (COMPLETE)
   ├─ zookeeper-1:[server, metrics] (COMPLETE)
   └─ zookeeper-2:[server, metrics] (COMPLETE)
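
Waiting for the plan can be scripted by polling the same command. This is a rough sketch under stated assumptions: `deploy_complete` and `wait_for_deploy` are hypothetical helper names, the check relies on the first line of the plan output having the form shown above, and the 10-second polling interval is arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: read plan output on stdin and succeed only if the
# top-level deploy line reports COMPLETE.
deploy_complete() {
    sed -n '1p' | grep -q '^deploy .*(COMPLETE)$'
}

# Hypothetical helper: poll the deploy plan of the named service until it completes.
wait_for_deploy() (
    service=$1
    until dcos kafka-zookeeper --name="$service" plan show deploy | deploy_complete; do
        sleep 10
    done
)
```

Calling `wait_for_deploy <service name>` after each `update start` keeps the next step from starting before the previous rolling restart has finished.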

The service will now have deployed with Kerberos enabled, but with non-authenticated connections for leader election and from clients still allowed. In order to obtain a secure cluster, these unauthenticated connections should now be turned off to force secure connections.

Create a kerberos-toggle-step-2.json file with the following contents (note that it is only required to specify the options that change):

{
    "service": {
        "security": {
            "kerberos": {
                "advanced": {
                    "required_for_quorum_learner": true,
                    "required_for_quorum_server": false,
                    "required_for_client": false
                }
            }
        }
    }
}

and deploy this as a configuration update:

$ dcos kafka-zookeeper --name=<service name> update start --options=kerberos-toggle-step-2.json
$ dcos kafka-zookeeper --name=<service name> plan show deploy
deploy (serial strategy) (COMPLETE)
└─ node-update (serial strategy) (COMPLETE)
   ├─ zookeeper-0:[server, metrics] (COMPLETE)
   ├─ zookeeper-1:[server, metrics] (COMPLETE)
   └─ zookeeper-2:[server, metrics] (COMPLETE)

This deploys a Kafka ZooKeeper instance that requires Kerberos authentication between learners in the leader election.

As the next step in the rolling update process, create a kerberos-toggle-step-3.json file with the following contents:

{
    "service": {
        "security": {
            "kerberos": {
                "advanced": {
                    "required_for_quorum_learner": true,
                    "required_for_quorum_server": true,
                    "required_for_client": false
                }
            }
        }
    }
}

and deploy this as a configuration update:

$ dcos kafka-zookeeper --name=<service name> update start --options=kerberos-toggle-step-3.json
$ dcos kafka-zookeeper --name=<service name> plan show deploy
deploy (serial strategy) (COMPLETE)
└─ node-update (serial strategy) (COMPLETE)
   ├─ zookeeper-0:[server, metrics] (COMPLETE)
   ├─ zookeeper-1:[server, metrics] (COMPLETE)
   └─ zookeeper-2:[server, metrics] (COMPLETE)

Kafka ZooKeeper will now require Kerberos authentication for the entire leader election process.

The final step is to require Kerberos authentication for clients connecting to the DC/OS Kafka ZooKeeper instance with an options file (say kerberos-toggle-step-4.json) as follows:

{
    "service": {
        "security": {
            "kerberos": {
                "advanced": {
                    "required_for_quorum_learner": true,
                    "required_for_quorum_server": true,
                    "required_for_client": true
                }
            }
        }
    }
}

which is deployed:

$ dcos kafka-zookeeper --name=<service name> update start --options=kerberos-toggle-step-4.json
$ dcos kafka-zookeeper --name=<service name> plan show deploy
deploy (serial strategy) (COMPLETE)
└─ node-update (serial strategy) (COMPLETE)
   ├─ zookeeper-0:[server, metrics] (COMPLETE)
   ├─ zookeeper-1:[server, metrics] (COMPLETE)
   └─ zookeeper-2:[server, metrics] (COMPLETE)

Unauthenticated clients will now only be allowed to ping, create a session, close a session, or authenticate when communicating with the Kafka ZooKeeper instance.

Note: The default settings for service.security.kerberos.advanced.required_for_quorum_learner, service.security.kerberos.advanced.required_for_quorum_server, service.security.kerberos.advanced.required_for_client are all true.
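
The options files for steps 2 to 4 differ only in which flags are true, so they can be generated from the step number. The following is a sketch, with `advanced_options` as a hypothetical helper name; it does not cover step 1, which also needs the base Kerberos settings shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: print the "advanced" options for toggle step 2, 3, or 4.
# Each step turns on one more requirement, in the order used in this section.
advanced_options() (
    step=$1
    learner=false
    server=false
    client=false
    [ "$step" -ge 2 ] && learner=true
    [ "$step" -ge 3 ] && server=true
    [ "$step" -ge 4 ] && client=true
    cat <<EOF
{
    "service": {
        "security": {
            "kerberos": {
                "advanced": {
                    "required_for_quorum_learner": $learner,
                    "required_for_quorum_server": $server,
                    "required_for_client": $client
                }
            }
        }
    }
}
EOF
)
```

For example, `advanced_options 3 > kerberos-toggle-step-3.json` produces the same settings as the step 3 file above.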

Disabling Kerberos After Deployment

Note: Disabling Kerberos after deployment is not supported.

Securely Exposing DC/OS Kafka ZooKeeper Outside the Cluster

Kerberos security is tightly coupled to the DNS hosts of the ZooKeeper tasks. As such, exposing a secure Kafka ZooKeeper service outside of the cluster requires additional setup.

Server to Client Connection

To expose a secure Kafka ZooKeeper service outside of the cluster, any client connecting to it must be able to access all tasks of the service via the IP address assigned to the task. This will be either an IP address on a virtual network or the IP address of the agent the task is running on.

Forwarding DNS and Custom Domain

Every DC/OS cluster has a unique cryptographic ID which can be used to forward DNS queries to that cluster. To securely expose the service outside the cluster, external clients must have an upstream resolver configured to forward DNS queries to the DC/OS cluster of the service as described here.

With only forwarding configured, DNS entries within the DC/OS cluster will be resolvable at <task-domain>.autoip.dcos.<cryptographic-id>.dcos.directory. However, if you configure a DNS alias, you can use a custom domain, for example <task-domain>.cluster-1.acmeco.net. In either case, the DC/OS Kafka ZooKeeper service will need to be installed with an additional security option:

{
    "service": {
        "security": {
            "custom_domain": "<custom-domain>"
        }
    }
}

where <custom-domain> is one of autoip.dcos.<cryptographic-id>.dcos.directory or your organization-specific domain (e.g., cluster-1.acmeco.net).

As a concrete example, using the custom domain of cluster-1.acmeco.net, the server 0 task would have a host of zookeeper-0-server.<service-name>.cluster-1.acmeco.net.

Kerberos Principal Changes

With a custom domain, endpoint discovery will work as normal. Kerberos, however, does require slightly different configuration. As noted in the section Create principals, the principals of the service depend on the hostname of the service. When creating the Kerberos principals, be sure to use the correct domain.

For example, if installing with these settings:

{
    "service": {
        "name": "a/good/example",
        "security": {
            "kerberos": {
                "primary": "example",
                "realm": "EXAMPLE"
            }
        }
    },
    "node": {
        "count": 3
    }
}

then, assuming a custom domain of cluster-1.example.net, the principals to create would be:

example/zookeeper-0-server.agoodexample.cluster-1.example.net@EXAMPLE
example/zookeeper-1-server.agoodexample.cluster-1.example.net@EXAMPLE
example/zookeeper-2-server.agoodexample.cluster-1.example.net@EXAMPLE
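
With a custom domain, only the host suffix of each principal changes. A minimal sketch of the rule follows; `zk_principals_for_domain` is a hypothetical helper name, and the domain is passed in as the last argument rather than read from the service configuration.

```shell
#!/bin/sh
# Hypothetical helper: print one principal per server for a custom domain.
# Arguments: <service primary> <service name> <realm> <node count> <custom domain>
zk_principals_for_domain() (
    primary=$1
    name=$2
    realm=$3
    count=$4
    domain=$5
    # The service subdomain is the service name with every "/" removed.
    subdomain=$(printf '%s' "$name" | tr -d '/')
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s/zookeeper-%s-server.%s.%s@%s\n' \
            "$primary" "$i" "$subdomain" "$domain" "$realm"
        i=$((i + 1))
    done
)

# Reproduces the three principals listed above.
zk_principals_for_domain example a/good/example EXAMPLE 3 cluster-1.example.net
```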