KubernetesΒΆ

Kubernetes is a popular system for deploying distributed applications on clusters, particularly in the cloud. You can use Kubernetes to launch Dask workers in the following two ways:

  1. Helm: You can launch a Dask scheduler, several workers, and an optional Jupyter Notebook server on a Kubernetes easily using Helm

    helm repo add dask https://helm.dask.org/    # add the Dask Helm chart repository
    helm repo update                             # get latest Helm charts
    helm install dask/dask                     # deploy standard Dask chart
    

    This is a good choice if you want to do the following:

    1. Run a managed Dask cluster for a long period of time

    2. Also deploy a Jupyter server from which to run code

    3. Share the same Dask cluster between many automated services

    4. Try out Dask for the first time on a cloud-based system like Amazon, Google, or Microsoft Azure (see also our Cloud documentation)

    Note

    For more information, see Dask and Helm documentation.

  2. Native: You can quickly deploy Dask workers on Kubernetes from within a Python script or interactive session using Dask-Kubernetes

    from dask_kubernetes import KubeCluster
    cluster = KubeCluster.from_yaml('worker-template.yaml')
    cluster.scale(20)  # add 20 workers
    cluster.adapt()    # or create and destroy workers dynamically based on workload
    
    from dask.distributed import Client
    client = Client(cluster)
    

    This is a good choice if you want to do the following:

    1. Dynamically create a personal and ephemeral deployment for interactive use

    2. Allow many individuals the ability to launch their own custom dask deployments, rather than depend on a centralized system

    3. Quickly adapt Dask cluster size to the current workload

    Note

    For more information, see Dask-Kubernetes documentation.

You may also want to see the documentation on using Dask with Docker containers to help you manage your software environments on Kubernetes.