You can customize the Docker image in which Spark runs by extending the standard Spark Docker image. In this way, you can install your own libraries, such as a custom Python library.
To customize your Docker image:
1. In your Dockerfile, extend the standard Spark image and add your customizations. For example, to install pip and the requests library:

   FROM mesosphere/spark:2.8.0-2.4.0-hadoop-2.9
   RUN apt-get update && apt-get install -y python-pip
   RUN pip install requests
2. Build an image from the customized Dockerfile and push it to a registry:

   docker build -t username/image:tag .
   docker push username/image:tag
3. Reference the custom Docker image with the --docker-image option when running a Spark job:

   dcos spark run --docker-image=myusername/myimage:v1 --submit-args="http://external.website/mysparkapp.py 30"
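To illustrate why the library must be installed in the image itself, here is a minimal sketch of what a job like mysparkapp.py could look like. The file name, URLs, and function names are illustrative, not part of the original example; the key point is that `check_url` runs on the executors, so the requests library must exist inside the Docker image, not just on the submitting host.

```python
# mysparkapp.py -- hypothetical Spark job that depends on the requests
# library baked into the custom Docker image.
import requests


def check_url(url):
    """Return (url, HTTP status code), or (url, None) on a request failure.

    This function is shipped to the executors, so any library it imports
    must be present in the executor's Docker image.
    """
    try:
        return (url, requests.get(url, timeout=5).status_code)
    except requests.RequestException:
        return (url, None)


if __name__ == "__main__":
    from pyspark import SparkContext

    sc = SparkContext(appName="url-check")
    urls = ["https://example.com", "https://example.org"]
    # Distribute the URL checks across the cluster's executors.
    results = sc.parallelize(urls).map(check_url).collect()
    for url, status in results:
        print(url, status)
    sc.stop()
```

When submitted with the `dcos spark run` command above, both the driver and the executors run inside containers built from the custom image, so the `import requests` succeeds everywhere.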