Getting Started with DC/OS Apache HDFS

Install a basic cluster

To start a basic test cluster with three journal nodes, two name nodes, and three data nodes, run the following command on the DC/OS CLI.

dcos package install hdfs

This command creates a new instance with the default name hdfs. Two instances cannot share the same name, so each additional instance must be given a custom name at install time.
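
As a rough sketch of installing an additional instance under its own name, the function below writes an options file that sets the name and passes it to the install command with --options. The function name install_named_instance and the file names are hypothetical, and the service.name key is an assumption based on common DC/OS package conventions rather than something confirmed here; check the package's configuration options before relying on it.

```shell
#!/bin/sh
# Install an additional HDFS instance under a custom name.
# Arguments: <service-name> [options-file]
install_named_instance() {
  service_name="$1"
  options_file="${2:-${service_name}-options.json}"
  if [ -z "$service_name" ]; then
    echo "usage: install_named_instance <service-name> [options-file]" >&2
    return 2
  fi
  # Assumption: the package reads the instance name from "service.name".
  printf '{\n  "service": {\n    "name": "%s"\n  }\n}\n' "$service_name" > "$options_file"
  # Pass the generated options file to the package install command.
  dcos package install --options="$options_file" hdfs
}
```

For example, install_named_instance hdfs-second would write hdfs-second-options.json and start the install; later CLI commands would then target that instance with --name=hdfs-second.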

All dcos hdfs CLI commands accept a --name argument that specifies which instance to query. If you do not specify a service name, the CLI assumes a default value matching the package name, hdfs. The default value for --name can also be customized via the DC/OS CLI configuration. To pass the name explicitly:

dcos hdfs --name=hdfs <cmd>

Alternatively, you can install from the DC/OS web interface. If you install Apache HDFS from the DC/OS web interface, the dcos hdfs CLI commands are not automatically installed to your workstation. They may be manually installed using the DC/OS CLI:

dcos package install hdfs --cli

After running the package install command, the service will begin installing.

Enterprise DC/OS installation

Depending on the security mode of the Enterprise DC/OS cluster, Enterprise DC/OS users may need to create a custom .json file and use it to install Apache HDFS.

Create a Configuration File

Create a custom configuration file for installing Apache HDFS and save it as config.json. In it, specify the service account (<service_account_id>) and a secret path (hdfs/<secret-name>).

{
  "service": {
    "service_account": "<service_account_id>",
    "service_account_secret": "hdfs/<secret-name>"
  }
}

Installing with a custom config file

Use the custom configuration file you just created to install Apache HDFS with this command:

dcos package install --options=config.json hdfs

Service Deployment

To monitor the deployment of your test instance, install the package CLI (see the dcos package install hdfs --cli command above) and run the command:

dcos hdfs plan show deploy

Once the deploy plan has a status of Complete, the service is fully deployed.
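
To avoid re-running the command by hand, the check can be wrapped in a polling loop. The sketch below is illustrative only: the function name wait_for_deploy, the attempt count, and the delay are hypothetical, and it assumes that the first line of the plan output summarizes the whole plan and contains the word COMPLETE (or Complete) once deployment has finished. Adjust the match if your CLI version prints the status differently.

```shell
#!/bin/sh
# Poll the deploy plan until its summary line reports completion.
# Arguments: [max-attempts] [delay-seconds] [service-name]
wait_for_deploy() {
  max_attempts="${1:-60}"
  delay="${2:-10}"
  service_name="${3:-hdfs}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Assumption: the first output line summarizes the whole plan.
    status_line=$(dcos hdfs --name="$service_name" plan show deploy 2>/dev/null | head -n 1)
    case "$status_line" in
      *INCOMPLETE*|*Incomplete*)
        # Do not mistake an "incomplete" status for completion.
        ;;
      *COMPLETE*|*Complete*)
        echo "deploy plan complete after $attempt attempt(s)"
        return 0
        ;;
    esac
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "deploy plan not complete after $max_attempts attempt(s)" >&2
  return 1
}
```

Calling wait_for_deploy 30 20 would check up to 30 times, 20 seconds apart, and return a non-zero status if the plan never reports completion, which makes it usable as a gate in a larger script.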

Service Discovery

To connect a client, query the service for its endpoints.

dcos hdfs endpoints

Select an endpoint from the list to see the available connections.

dcos hdfs endpoints <endpoint>
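
If a client needs the connection details as a file, the query can be captured rather than read from the terminal. The following is a minimal sketch under stated assumptions: the function name save_endpoint and the default output file name are hypothetical, and the format of the captured output depends on the endpoint and is not interpreted here.

```shell
#!/bin/sh
# Save the connection details of one endpoint to a local file.
# Arguments: <endpoint> [service-name] [output-file]
save_endpoint() {
  endpoint="$1"
  service_name="${2:-hdfs}"
  out_file="${3:-endpoint-details.txt}"
  if [ -z "$endpoint" ]; then
    echo "usage: save_endpoint <endpoint> [service-name] [output-file]" >&2
    return 2
  fi
  # Write to a temporary file first so a failed query leaves no partial output.
  tmp_file="$out_file.tmp"
  if dcos hdfs --name="$service_name" endpoints "$endpoint" > "$tmp_file"; then
    mv "$tmp_file" "$out_file"
    echo "saved details for $endpoint to $out_file"
  else
    rm -f "$tmp_file"
    return 1
  fi
}
```

The endpoint argument should be one of the names returned by dcos hdfs endpoints; the optional second argument selects a non-default instance through --name.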

Working with the Service

Using the endpoint information, you can connect a client to the service from within the DC/OS cluster (for example, a Marathon app running a client application). See the other sections of the documentation for more details on configuration, operation, and service capabilities.