History Server

Updated: January 13, 2017

DC/OS Spark includes the Spark history server. Because the history
server requires HDFS, you must explicitly enable it.

  1. Install HDFS first:
    $ dcos package install hdfs

    Note: HDFS requires 5 private nodes.

  2. Create a history HDFS directory (the default is /history). SSH
    into your cluster and run:

    $ hdfs dfs -mkdir /history
  3. Enable the history server when you install Spark. Create a JSON
    configuration file. Here we call it options.json:

       {
         "history-server": {
           "enabled": true
         }
       }
  4. Install Spark:
    $ dcos package install spark --options=options.json
  5. Run jobs with the event log enabled:
    $ dcos spark run --submit-args="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://hdfs/history ... --class MySampleClass  http://external.website/mysparkapp.jar"
  6. Visit your job in the dispatcher at
    http://<dcos_url>/service/spark/Dispatcher/. The job's entry
    includes a link to its history server page.
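
The options file from step 3 must be valid JSON before it is passed to `dcos package install`. As a quick local sanity check, the file can be written and validated like this (a sketch; the filename options.json matches the step above):

```shell
# Write the options file from step 3 (the same file later passed to
# `dcos package install spark --options=options.json`).
cat > options.json <<'EOF'
{
  "history-server": {
    "enabled": true
  }
}
EOF

# Validate the JSON before installing; `python -m json.tool` exits
# non-zero on a syntax error, so a typo is caught here rather than
# at install time.
python -m json.tool options.json
```

A malformed file (for example, a missing closing brace) makes the validation step fail immediately, which is cheaper to debug than a failed package install.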