Implementing the cluster controller
Objective: Define the DockerCluster API and implement the corresponding reconciliation logic.
Background
When designing the DockerCluster API kind we need to make sure that we adhere to the contract for a Cluster Infrastructure Provider and expose enough information to the user so that we can create any infrastructure required for the cluster (but not the machines for nodes).

In our example, the infrastructure that we need to provision is a load balancer to sit in front of the API servers of our control plane nodes (which will be provisioned later). To do this, we will create an instance of an HAProxy container. The address of the load balancer will be reported in controlPlaneEndpoint.
When implementing the controller for DockerCluster (and DockerMachine) you will generally follow a pattern similar to this (a skeleton sketch follows the list):

- In Reconcile, get the instance of the API type being reconciled.
- Get the owning CAPI type (i.e. if we are reconciling DockerCluster then we get the Cluster). You may need other types if you are reconciling machines.
- If the owning CAPI type doesn't exist yet (because the owner reference isn't set yet), then exit.
- If the instance has a non-zero deletion timestamp (indicating that it has been marked for deletion), then call a reconcileDelete function that:
  - Performs any actions to delete the infrastructure
  - Removes the finalizer and saves/patches
- If the instance has a zero deletion timestamp (indicating that it is not being deleted), then call a reconcileNormal function that:
  - Adds a finalizer and saves/patches
  - Performs any actions to create or update the infrastructure
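As a rough illustration of this pattern, a minimal skeleton might look like the following. This is a simplified sketch: error handling, logging and patching are omitted, imports are as introduced later in this tutorial, and the helper signatures here are simplified (the versions we build later also take a load balancer helper).

```go
func (r *DockerClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // Get the instance of the API type being reconciled.
    dockerCluster := &infrav1.DockerCluster{}
    if err := r.Client.Get(ctx, req.NamespacedName, dockerCluster); err != nil {
        if apierrors.IsNotFound(err) {
            return ctrl.Result{}, nil
        }
        return ctrl.Result{}, err
    }

    // Get the owning CAPI Cluster; exit if the owner reference isn't set yet.
    cluster, err := util.GetOwnerCluster(ctx, r.Client, dockerCluster.ObjectMeta)
    if err != nil || cluster == nil {
        return ctrl.Result{}, err
    }

    // Dispatch based on whether the instance is being deleted.
    if !dockerCluster.DeletionTimestamp.IsZero() {
        // reconcileDelete: delete the infrastructure, then remove the finalizer and patch.
        return r.reconcileDelete(ctx, dockerCluster)
    }
    // reconcileNormal: add the finalizer and patch, then create/update the infrastructure.
    return r.reconcileNormal(ctx, dockerCluster)
}
```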
Define the DockerCluster API
- Read the contract for a cluster infrastructure provider.
- Open api/v1alpha1/dockercluster_types.go in your favourite editor.
- Add the required fields to satisfy the contract:
  - Add this import:

```go
clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
```

  - Add a field for the control plane endpoint to DockerClusterSpec (and remove the generated Foo string lines):

```go
// ControlPlaneEndpoint represents the endpoint used to communicate with the control plane.
// +optional
ControlPlaneEndpoint clusterv1.APIEndpoint `json:"controlPlaneEndpoint"`
```

  - Add a Ready field to DockerClusterStatus:

```go
// Ready indicates that the cluster is ready.
// +optional
// +kubebuilder:default=false
Ready bool `json:"ready"`
```
- Now add the provider-specific fields to DockerClusterSpec. In our example we will allow the user to optionally override the image used for the load balancer. Add the following:

```go
// LoadBalancerImage allows you to override the load balancer image. If not specified, a
// default image will be used.
// +optional
LoadBalancerImage string `json:"loadbalancerImage,omitempty"`
```

- As we will be creating external infrastructure, we will need to use finalizers. So define a finalizer:
```go
const (
    // ClusterFinalizer allows cleaning up resources associated with
    // DockerCluster before removing it from the apiserver.
    ClusterFinalizer = "dockercluster.infrastructure.cluster.x-k8s.io"
)
```

Finalizers mark an object so that Kubernetes will not delete it until the finalizer is removed. This gives the controller the chance to delete any external infrastructure, such as container instances, before the object is removed. You can read more about finalizers here and here.
- Update the generated code and manifests by running:

```bash
make generate
make manifests
```
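Putting the steps above together, the relevant parts of api/v1alpha1/dockercluster_types.go would look roughly like this (a sketch; the scaffolded TypeMeta/ObjectMeta, list types and registration code are unchanged and omitted here):

```go
const (
    // ClusterFinalizer allows cleaning up resources associated with
    // DockerCluster before removing it from the apiserver.
    ClusterFinalizer = "dockercluster.infrastructure.cluster.x-k8s.io"
)

// DockerClusterSpec defines the desired state of DockerCluster.
type DockerClusterSpec struct {
    // ControlPlaneEndpoint represents the endpoint used to communicate with the control plane.
    // +optional
    ControlPlaneEndpoint clusterv1.APIEndpoint `json:"controlPlaneEndpoint"`

    // LoadBalancerImage allows you to override the load balancer image. If not specified, a
    // default image will be used.
    // +optional
    LoadBalancerImage string `json:"loadbalancerImage,omitempty"`
}

// DockerClusterStatus defines the observed state of DockerCluster.
type DockerClusterStatus struct {
    // Ready indicates that the cluster is ready.
    // +optional
    // +kubebuilder:default=false
    Ready bool `json:"ready"`
}
```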
Additional Notes:
- Note the +optional and +kubebuilder:default=false markers in the comments. These are special comments that enable you to add validation to your CRD fields (via the OpenAPI schema). It's advisable to add validation where possible; the list of available validations can be seen here (a couple of hypothetical examples are sketched below).
- The contract has a number of optional fields within status for reporting failures. It is advisable that you implement these in your own provider.
- Although not part of the contract, historically providers have added conditions to the status. We will not be using conditions, but if you want to learn more, read this.
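For example, if you wanted to constrain the hypothetical load balancer image field further, kubebuilder validation markers such as these could be used (illustrative only; this tutorial's API does not require them):

```go
// LoadBalancerImage allows you to override the load balancer image.
// +optional
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=255
LoadBalancerImage string `json:"loadbalancerImage,omitempty"`
```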
Implementing Reconciliation
Background: In this section we will implement the reconciliation of the DockerCluster. The purpose of the infrastructure cluster is to create any required infrastructure for the cluster, but nothing related to individual machines/nodes. In the case of this tutorial we will implement the pattern described above and create a load balancer container instance that will be used to load balance requests to the control plane nodes (when they are created).
Implement Reconcile Pattern
- Add a reference to the packages we need. In your terminal run:

```bash
go get github.com/capi-samples/cluster-api-provider-docker/pkg
```

We will be reusing various packages from the reference implementation so that we can focus on the provider implementation rather than the specific internals of Docker.
- Open controllers/dockercluster_controller.go in your editor. The rest of the steps in this section relate to this file unless explicitly stated otherwise.
- Change the signature of Reconcile to match:

```go
func (r *DockerClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (_ ctrl.Result, rerr error) {
```

- Our controller will need read RBAC permission for Cluster, so add the following to the comment block above Reconcile:

```go
//+kubebuilder:rbac:groups=cluster.x-k8s.io,resources=clusters;clusters/status,verbs=get;list;watch
```
- We will be interacting with the container runtime. To support this:
  - Add "github.com/capi-samples/cluster-api-provider-docker/pkg/container" as an import.
  - Add a field to the DockerClusterReconciler struct that holds a reference to the container runtime. The struct should now contain:

```go
type DockerClusterReconciler struct {
    client.Client
    Scheme           *runtime.Scheme
    ContainerRuntime container.Runtime
}
```
- In the Reconcile function make these changes:
  - We will be creating log entries, so save the logger to a variable:

```go
logger := log.FromContext(ctx)
```

  - Add the container runtime information to the context so it can be used later:

```go
ctx = container.RuntimeInto(ctx, r.ContainerRuntime)
```

  - Change this import:

```go
infrastructurev1alpha1 "github.com/capi-samples/cluster-api-provider-docker/api/v1alpha1"
```

to:

```go
infrav1 "github.com/capi-samples/cluster-api-provider-docker/api/v1alpha1"
```

and update the references to this alias in the SetupWithManager function. The convention is to name the import alias after the major API version only. The reason is that when you introduce a new API version you only need to update the import path, not the alias, which reduces the number of code changes.
  - Add the following imports:

```go
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/klog/v2"
"sigs.k8s.io/cluster-api/util"
"sigs.k8s.io/cluster-api/util/annotations"
"sigs.k8s.io/cluster-api/util/patch"
"github.com/capi-samples/cluster-api-provider-docker/pkg/docker"
```

  - You can delete the // TODO(user): your logic here comment.
  - We need to get the instance of the DockerCluster from the request. If the instance is not found, exit reconciliation. Add the following:
```go
dockerCluster := &infrav1.DockerCluster{}
if err := r.Client.Get(ctx, req.NamespacedName, dockerCluster); err != nil {
    if apierrors.IsNotFound(err) {
        return ctrl.Result{}, nil
    }
    return ctrl.Result{}, err
}
```

  - Next get the owning type of the DockerCluster, which is the CAPI Cluster. We can use a helper function from CAPI that looks at the ownerReferences. If it returns nil, the owner reference hasn't been set yet, so we exit and wait to be triggered again:
```go
// Get the Cluster
cluster, err := util.GetOwnerCluster(ctx, r.Client, dockerCluster.ObjectMeta)
if err != nil {
    return ctrl.Result{}, err
}
if cluster == nil {
    logger.Info("Waiting for Cluster Controller to set OwnerRef on DockerCluster")
    return ctrl.Result{}, nil
}
```

  - Now that we have the cluster we can update the logger to include the cluster name in later log lines:

```go
logger = logger.WithValues("cluster", klog.KObj(cluster))
ctx = ctrl.LoggerInto(ctx, logger)
```

You can add any name/value pairs that would aid in the support of your provider.

  - Reconciliation can be paused, for instance when you pivot from an ephemeral bootstrap cluster to a permanent management cluster (i.e. via clusterctl move). We can check whether reconciliation is paused by looking for an annotation:
```go
if annotations.IsPaused(cluster, dockerCluster) {
    logger.Info("DockerCluster or owning Cluster is marked as paused, not reconciling")
    return ctrl.Result{}, nil
}
```

  - We will be using a helper to manage the lifecycle of the load balancer that will be created, so create a new instance of it:

```go
// Create a helper for managing a docker container hosting the loadbalancer.
externalLoadBalancer, err := docker.NewLoadBalancer(ctx, cluster, dockerCluster)
if err != nil {
    return ctrl.Result{}, errors.Wrapf(err, "failed to create helper for managing the externalLoadBalancer")
}
```

Some providers follow a Scope & Services pattern: instead of creating a load balancer helper, they create a Cluster Scope at this point which holds everything required for reconciliation. If you are interested, have a look at the example from Cluster API Provider AWS.

  - When we exit reconciliation we want to persist any changes to DockerCluster, which is done using a patch helper:

```go
// Initialize the patch helper
patchHelper, err := patch.NewHelper(dockerCluster, r.Client)
if err != nil {
    return ctrl.Result{}, err
}

// Always attempt to Patch the DockerCluster object and status after each reconciliation.
defer func() {
    if err := patchHelper.Patch(ctx, dockerCluster); err != nil {
        logger.Error(err, "failed to patch DockerCluster")
        if rerr == nil {
            rerr = err
        }
    }
}()
```

If we were using conditions then we would need to set the condition values here as part of the patch.

  - Now we are in a position to perform the actions that are specific to the Docker provider for create/update (i.e. reconcileNormal) and delete (i.e. reconcileDelete). Replace
```go
return ctrl.Result{}, nil
```

with:

```go
// Handle deleted clusters
if !dockerCluster.DeletionTimestamp.IsZero() {
    return r.reconcileDelete(ctx, dockerCluster, externalLoadBalancer)
}

// Handle non-deleted clusters
return r.reconcileNormal(ctx, dockerCluster, externalLoadBalancer)
```

  - Add the following two empty functions:
```go
func (r *DockerClusterReconciler) reconcileNormal(ctx context.Context, dockerCluster *infrav1.DockerCluster, externalLoadBalancer *docker.LoadBalancer) (ctrl.Result, error) {
    return ctrl.Result{}, nil
}

func (r *DockerClusterReconciler) reconcileDelete(ctx context.Context, dockerCluster *infrav1.DockerCluster, externalLoadBalancer *docker.LoadBalancer) (ctrl.Result, error) {
    return ctrl.Result{}, nil
}
```

We are now ready to move on to implementing the create/update functionality; the full Reconcile method, assembled from the snippets above, is sketched below for reference.
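This is a rough consolidation of the snippets in this section, not a verbatim listing from the reference implementation; your exact version may differ slightly:

```go
func (r *DockerClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (_ ctrl.Result, rerr error) {
    logger := log.FromContext(ctx)
    ctx = container.RuntimeInto(ctx, r.ContainerRuntime)

    // Get the DockerCluster instance being reconciled.
    dockerCluster := &infrav1.DockerCluster{}
    if err := r.Client.Get(ctx, req.NamespacedName, dockerCluster); err != nil {
        if apierrors.IsNotFound(err) {
            return ctrl.Result{}, nil
        }
        return ctrl.Result{}, err
    }

    // Get the owning CAPI Cluster.
    cluster, err := util.GetOwnerCluster(ctx, r.Client, dockerCluster.ObjectMeta)
    if err != nil {
        return ctrl.Result{}, err
    }
    if cluster == nil {
        logger.Info("Waiting for Cluster Controller to set OwnerRef on DockerCluster")
        return ctrl.Result{}, nil
    }

    logger = logger.WithValues("cluster", klog.KObj(cluster))
    ctx = ctrl.LoggerInto(ctx, logger)

    // Skip reconciliation if it is paused.
    if annotations.IsPaused(cluster, dockerCluster) {
        logger.Info("DockerCluster or owning Cluster is marked as paused, not reconciling")
        return ctrl.Result{}, nil
    }

    // Create a helper for managing a docker container hosting the loadbalancer.
    externalLoadBalancer, err := docker.NewLoadBalancer(ctx, cluster, dockerCluster)
    if err != nil {
        return ctrl.Result{}, errors.Wrapf(err, "failed to create helper for managing the externalLoadBalancer")
    }

    // Initialize the patch helper and always patch the object and status on exit.
    patchHelper, err := patch.NewHelper(dockerCluster, r.Client)
    if err != nil {
        return ctrl.Result{}, err
    }
    defer func() {
        if err := patchHelper.Patch(ctx, dockerCluster); err != nil {
            logger.Error(err, "failed to patch DockerCluster")
            if rerr == nil {
                rerr = err
            }
        }
    }()

    // Handle deleted clusters.
    if !dockerCluster.DeletionTimestamp.IsZero() {
        return r.reconcileDelete(ctx, dockerCluster, externalLoadBalancer)
    }

    // Handle non-deleted clusters.
    return r.reconcileNormal(ctx, dockerCluster, externalLoadBalancer)
}
```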
Implement create/update reconciliation
- We will continue to make changes in controllers/dockercluster_controller.go.
- Add the following imports:

```go
clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
```

- Go to the reconcileNormal function.
- Get the logger and log that we are reconciling the DockerCluster:

```go
logger := log.FromContext(ctx)
logger.Info("Reconciling DockerCluster")
```
- We want to ensure that the finalizer for DockerCluster is added, so that if the DockerCluster instance is deleted later on, we get the chance to do any required cleanup:

```go
if !controllerutil.ContainsFinalizer(dockerCluster, infrav1.ClusterFinalizer) {
    controllerutil.AddFinalizer(dockerCluster, infrav1.ClusterFinalizer)
    return ctrl.Result{Requeue: true}, nil
}
```

Note the use of return ctrl.Result{Requeue: true}, nil. This means that after we have added the finalizer we exit reconciliation and our DockerCluster is patched. The Requeue: true then causes Reconcile to be called again without waiting for a further update to the DockerCluster instance. This is a common pattern to ensure changes are persisted, and as such it's important to ensure your reconciliation logic is idempotent.

- Now we can create the load balancer container instance:

```go
// Create the docker container hosting the load balancer if it does not exist.
if err := externalLoadBalancer.Create(ctx); err != nil {
    return ctrl.Result{}, errors.Wrap(err, "failed to create load balancer")
}
```
- Now get the IP address of the load balancer container:

```go
// Get the load balancer IP so we can use it for the endpoint address
lbIP, err := externalLoadBalancer.IP(ctx)
if err != nil {
    return ctrl.Result{}, errors.Wrap(err, "failed to get IP for the load balancer")
}
```

- Use the IP address to set the endpoint for the control plane. This is required so that CAPI can use it when creating the kubeconfig for our cluster:

```go
dockerCluster.Spec.ControlPlaneEndpoint = clusterv1.APIEndpoint{
    Host: lbIP,
    Port: 6443,
}
```
- Finally, set the Ready status field to true to indicate to CAPI that the infrastructure is ready and it can continue with creating the cluster:

```go
dockerCluster.Status.Ready = true
```

- We can now move on to implementing deletion; a consolidated sketch of reconcileNormal follows for reference.
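Assembling the steps above, reconcileNormal would look roughly like this (a sketch, not a verbatim listing from the reference implementation):

```go
func (r *DockerClusterReconciler) reconcileNormal(ctx context.Context, dockerCluster *infrav1.DockerCluster, externalLoadBalancer *docker.LoadBalancer) (ctrl.Result, error) {
    logger := log.FromContext(ctx)
    logger.Info("Reconciling DockerCluster")

    // Ensure the finalizer is present (and persisted) before creating any external infrastructure.
    if !controllerutil.ContainsFinalizer(dockerCluster, infrav1.ClusterFinalizer) {
        controllerutil.AddFinalizer(dockerCluster, infrav1.ClusterFinalizer)
        return ctrl.Result{Requeue: true}, nil
    }

    // Create the docker container hosting the load balancer if it does not exist.
    if err := externalLoadBalancer.Create(ctx); err != nil {
        return ctrl.Result{}, errors.Wrap(err, "failed to create load balancer")
    }

    // Get the load balancer IP so we can use it for the endpoint address.
    lbIP, err := externalLoadBalancer.IP(ctx)
    if err != nil {
        return ctrl.Result{}, errors.Wrap(err, "failed to get IP for the load balancer")
    }

    // Surface the endpoint and readiness to CAPI.
    dockerCluster.Spec.ControlPlaneEndpoint = clusterv1.APIEndpoint{
        Host: lbIP,
        Port: 6443,
    }
    dockerCluster.Status.Ready = true

    return ctrl.Result{}, nil
}
```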
Implement delete reconciliation
- We will continue to make changes in controllers/dockercluster_controller.go.
- Go to the reconcileDelete function.
- Get the logger and log that we are reconciling the DockerCluster deletion:

```go
logger := log.FromContext(ctx)
logger.Info("Reconciling DockerCluster deletion")
```
- Delete any external infrastructure. For our provider, we need to delete the instance of the load balancer container:

```go
// Delete the docker container hosting the load balancer
if err := externalLoadBalancer.Delete(ctx); err != nil {
    return ctrl.Result{}, errors.Wrap(err, "failed to delete load balancer")
}
```

- As all the external infrastructure has been deleted, we can remove the finalizer (a consolidated sketch of reconcileDelete follows):

```go
// Cluster is deleted so remove the finalizer.
controllerutil.RemoveFinalizer(dockerCluster, infrav1.ClusterFinalizer)
```
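Assembling the steps above, reconcileDelete would look roughly like this (a sketch, not a verbatim listing from the reference implementation):

```go
func (r *DockerClusterReconciler) reconcileDelete(ctx context.Context, dockerCluster *infrav1.DockerCluster, externalLoadBalancer *docker.LoadBalancer) (ctrl.Result, error) {
    logger := log.FromContext(ctx)
    logger.Info("Reconciling DockerCluster deletion")

    // Delete the docker container hosting the load balancer.
    if err := externalLoadBalancer.Delete(ctx); err != nil {
        return ctrl.Result{}, errors.Wrap(err, "failed to delete load balancer")
    }

    // Cluster is deleted so remove the finalizer; the deferred patch in Reconcile persists this.
    controllerutil.RemoveFinalizer(dockerCluster, infrav1.ClusterFinalizer)

    return ctrl.Result{}, nil
}
```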
Setup the controller
- We need to tell controller-runtime what resources our controller should be reconciling. This is done within the SetupWithManager function.
- Add the following imports:

```go
"sigs.k8s.io/controller-runtime/pkg/handler"
"sigs.k8s.io/controller-runtime/pkg/source"
"sigs.k8s.io/cluster-api/util/predicates"
"github.com/capi-samples/cluster-api-provider-docker/pkg/container"
```

- Change the signature of the SetupWithManager function so it accepts a context:

```go
func (r *DockerClusterReconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manager) error
```
- Delete the contents of the SetupWithManager function.
- Add the following to the function:

```go
c, err := ctrl.NewControllerManagedBy(mgr).
    For(&infrav1.DockerCluster{}).
    //WithOptions(options).
    WithEventFilter(predicates.ResourceNotPaused(ctrl.LoggerFrom(ctx))).
    Build(r)
if err != nil {
    return err
}
```
This tells controller-runtime to call Reconcile when there is a change to DockerCluster. Additionally, you can add predicates (or event filters) to stop reconciliation occurring in certain situations. In this instance we use WithEventFilter(predicates.ResourceNotPaused(...)) to ensure Reconcile is not called while reconciliation is paused. You can also customize the settings of the controller by using WithOptions(options) if needed; this is often used to limit the number of concurrent reconciliations, although there will be at most one reconcile in flight for a given instance (an example is shown in the sketch at the end of this section).
- In addition to the controller reconciling on changes to DockerCluster, we would also like it to happen if there are changes to its owning Cluster. Controller-runtime allows you to watch a different resource type and then decide whether to enqueue a request for reconciliation. Add the following:

```go
return c.Watch(
    &source.Kind{Type: &clusterv1.Cluster{}},
    handler.EnqueueRequestsFromMapFunc(util.ClusterToInfrastructureMapFunc(ctx, infrav1.GroupVersion.WithKind("DockerCluster"), mgr.GetClient(), &infrav1.DockerCluster{})),
    predicates.ClusterUnpaused(ctrl.LoggerFrom(ctx)),
)
```
This says: watch clusterv1.Cluster, and when a Cluster instance changes, derive the name/namespace of its child DockerCluster using util.ClusterToInfrastructureMapFunc(ctx, infrav1.GroupVersion.WithKind("DockerCluster"), mgr.GetClient(), &infrav1.DockerCluster{}), then use handler.EnqueueRequestsFromMapFunc to enqueue a request for reconciliation of the DockerCluster with that name/namespace. This will then result in Reconcile being called.
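Putting these pieces together, and showing how WithOptions could be used, SetupWithManager would look roughly like this. This is a sketch: the controller.Options value and the MaxConcurrentReconciles setting shown here are illustrative assumptions, not something this tutorial requires.

```go
import (
    // ... existing imports ...
    "sigs.k8s.io/controller-runtime/pkg/controller"
)

func (r *DockerClusterReconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manager) error {
    c, err := ctrl.NewControllerManagedBy(mgr).
        For(&infrav1.DockerCluster{}).
        // Optional: limit the number of concurrent reconciles for this controller.
        WithOptions(controller.Options{MaxConcurrentReconciles: 2}).
        WithEventFilter(predicates.ResourceNotPaused(ctrl.LoggerFrom(ctx))).
        Build(r)
    if err != nil {
        return err
    }

    // Also watch the owning Cluster and map events back to the DockerCluster.
    return c.Watch(
        &source.Kind{Type: &clusterv1.Cluster{}},
        handler.EnqueueRequestsFromMapFunc(util.ClusterToInfrastructureMapFunc(ctx, infrav1.GroupVersion.WithKind("DockerCluster"), mgr.GetClient(), &infrav1.DockerCluster{})),
        predicates.ClusterUnpaused(ctrl.LoggerFrom(ctx)),
    )
}
```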
As we have changed the parameters of the SetupWithManager function, go to main.go. In the main function make these changes:

- Add the following before we create the reconcilers:

```go
ctx := ctrl.SetupSignalHandler()

// Create the container runtime client to pass to the reconcilers.
runtimeClient, err := container.NewDockerClient()
if err != nil {
    setupLog.Error(err, "unable to establish container runtime connection", "controller", "reconciler")
    os.Exit(1)
}
```

This sets up signal handlers so that the controllers can be gracefully terminated.
- Update the creation of DockerClusterReconciler to pass in the runtimeClient, and the call to SetupWithManager to pass in the context:

```go
if err = (&controllers.DockerClusterReconciler{
    Client:           mgr.GetClient(),
    Scheme:           mgr.GetScheme(),
    ContainerRuntime: runtimeClient,
}).SetupWithManager(ctx, mgr); err != nil {
    setupLog.Error(err, "unable to create controller", "controller", "DockerCluster")
    os.Exit(1)
}
```

- Change mgr.Start to use the context created earlier:

```go
if err := mgr.Start(ctx); err != nil {
```
Ensure that all the API types are registered:

- Add clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1" as an import in main.go.
- Add this API to the scheme by adding the following to the init function (a sketch of the resulting init function is shown below):

```go
utilruntime.Must(clusterv1.AddToScheme(scheme))
```
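Assuming the default kubebuilder scaffolding (the clientgoscheme and infrastructurev1alpha1 registrations and aliases shown here come from the generated main.go and may differ in your project), the init function would now look roughly like this:

```go
func init() {
    // Registrations generated by the kubebuilder scaffolding.
    utilruntime.Must(clientgoscheme.AddToScheme(scheme))
    utilruntime.Must(infrastructurev1alpha1.AddToScheme(scheme))

    // Register the CAPI types so the controller can read Cluster objects.
    utilruntime.Must(clusterv1.AddToScheme(scheme))
    //+kubebuilder:scaffold:scheme
}
```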
Run the following from a terminal:

```bash
make manifests
make build
```