Apache Spark is a fast, general-purpose cluster computing system for big data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools, including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for stream processing.
Documentation for DC/OS Apache Spark 2.8.0-2.4.0
Documentation for DC/OS Apache Spark 2.6.0-2.3.2
Configuring DC/OS Access for Spark
In Spark 2.3.1-2.2.1-2 and later, these topics are divided between the Getting Started and Security sections. Earlier versions still need the information below.