Setting up a Microk8s environment
Microk8s is a lightweight Kubernetes distribution that is suitable for setting up a local environment for development and testing purposes. These instructions explain how to set up Microk8s on Ubuntu.
Install Microk8s
It is important to install the 1.20 version because there are some issues in the 1.21 release with GPU support. These may have been resolved in the 1.22 release but this has not been tested yet with the RAPIDS Accelerator.
sudo snap install microk8s --classic --channel=1.20/stable
Permissions
To avoid the need to use sudo
when running microk8s
commands, add the current user to the microk8s
group and ensure that the user has access to files in the ~/.kube
folder.
sudo usermod -a -G microk8s $USER
sudo chown -f -R $USER ~/.kube
Generate Kube config
Backup any existing Kube configuration file and then generate a new kube config.
mkdir -p ~/.kube
microk8s config > ~/.kube/config
Enable DNS and GPU support
microk8s.enable dns
microk8s.enable gpu
Add cluster host name to /etc/hosts
Add kubernetes.default.svc.cluster.local
to /etc/hosts
to map to the same IP address as the host name. This is usually 127.0.1.1
or 127.0.0.1
.
This host name should be used in the spark-submit
command by specifying --master k8s://https://kubernetes.default.svc.cluster.local:16443
.
Create Spark service account
microk8s.kubectl create serviceaccount spark
This service account can then be specified in the spark-submit
command with --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark
.
Specifying a certificate
The Microk8s certificate can be specified with --conf spark.kubernetes.authenticate.caCertFile=/var/snap/microk8s/current/certs/ca.crt
.
set K8S_SECRET env var with token
View a list of secrets using the following command.
microk8s.kubectl -n kube-system get secrets
Look for a secret with a name starting with default-token-
and run the following command to view the secret.
microk8s.kubectl -n kube-system describe secret default-token-5bmbh
Copy the value of the token
attribute and use it to set a K8S_TOKEN
environment variable.
export K8S_TOKEN=<value from above>
This token can then be specified in the spark-submit
command with --conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN
Building and exporting Docker images
Follow the instructions in Getting Started with RAPIDS and Kubernetes to create Docker images containing Spark and the RAPIDS Accelerator for Apache Spark.
Note that an additional step is required to export the Docker images from the host and import them into the Microk8s cluster.
For example, the following commands can be used to export a Docker image named spark-rapids
and import it into Microk8s.
docker save spark-rapids > /tmp/spark-rapids.tar
microk8s ctr image import /tmp/spark-rapids.tar
Spark Submit
After completing the above steps it should now be possible to use spark-submit
to run a Spark job in Microk8s. The following partial spark-submit
example shows the settings that should be present.
$SPARK_HOME/bin/spark-submit \
--master k8s://https://kubernetes.default.svc.cluster.local:16443 \
--deploy-mode cluster \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.authenticate.caCertFile=/var/snap/microk8s/current/certs/ca.crt \
--conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN \
--conf spark.kubernetes.container.image=spark-rapids \
Further Reading
The following documentation pages and blog posts provide more information on setting up Microk8s, Apache Spark, and RAPIDS.