Mohit's blog

Building a Kubernetes Operator in Go with Kube-Shift

Hero.webp
Published on
/
5 mins read
/
––– views

Kubernetes has become the de-facto standard for container orchestration. As its adoption grows, so does the need for extending its capabilities to manage complex, stateful applications. This is where Kubernetes Operators come in. In this article, we'll explore how to build a custom Kubernetes Operator from scratch using Go and the controller-runtime library. We'll use kube-shift, a custom operator for database schema migrations, as a real-world example to guide us through the process.

The "Why": What Problem Do Operators Solve?

Before we dive into the code, let's understand the problem operators solve. Managing applications on Kubernetes often involves more than just deploying a set of pods. Many applications require complex lifecycle management, including installation, upgrades, and handling failures. Operators automate these tasks by extending the Kubernetes API with Custom Resource Definitions (CRDs) and a custom controller to manage them.

The kube-shift project is a perfect example. It introduces a DatabaseMigration resource to Kubernetes, allowing you to manage database schema changes as part of your deployment process, just like any other Kubernetes resource.

Setting the Stage: Project Scaffolding with Kubebuilder

The kube-shift operator, like many Go-based operators, is built using the Kubebuilder framework. Kubebuilder provides a set of tools to quickly scaffold a new operator project, including the boilerplate for CRDs, controllers, and other necessary components.

The entry point of our operator is cmd/main.go. This file is responsible for setting up and running the controller manager.

// cmd/main.go
package main
 
import (
	// ... imports
	"github.com/mohit-nagaraj/kube-shift/api/v1alpha1"
	"github.com/mohit-nagaraj/kube-shift/internal/controller"
	// ...
)
 
func main() {
	// ... flag parsing and setup
	
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		// ... options
	})
	if err != nil {
		setupLog.Error(err, "unable to start manager")
		os.Exit(1)
	}
 
	if err = (&controller.DatabaseMigrationReconciler{
		Client: mgr.GetClient(),
		Scheme: mgr.GetScheme(),
	}).SetupWithManager(mgr); err != nil {
		setupLog.Error(err, "unable to create controller", "controller", "DatabaseMigration")
		os.Exit(1)
	}
 
	// ...
}

Here, we create a new manager and register our DatabaseMigrationReconciler with it. The manager is responsible for running the controller and managing its lifecycle.

Defining Our Custom Resource: The DatabaseMigration CRD

The first step in creating our operator is to define the custom resource it will manage. This is done in api/v1alpha1/databasemigration_types.go. This file defines the Go types that represent our DatabaseMigration CRD.

// api/v1alpha1/databasemigration_types.go
package v1alpha1
 
import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
 
// DatabaseMigrationSpec defines the desired state of DatabaseMigration
type DatabaseMigrationSpec struct {
	Database  DatabaseConfig  `json:"database"`
	Migration MigrationConfig `json:"migration"`
}
 
// DatabaseMigrationStatus defines the observed state of DatabaseMigration
type DatabaseMigrationStatus struct {
	Phase   MigrationPhase `json:"phase,omitempty"`
	Message string         `json:"message,omitempty"`
}
 
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
 
// DatabaseMigration is the Schema for the databasemigrations API
type DatabaseMigration struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
 
	Spec   DatabaseMigrationSpec   `json:"spec,omitempty"`
	Status DatabaseMigrationStatus `json:"status,omitempty"`
}

The DatabaseMigration struct embeds metav1.TypeMeta and metav1.ObjectMeta, which are common to all Kubernetes objects. The Spec field defines the desired state of our resource, while the Status field represents its current state.

The Heart of the Operator: The Controller and Reconciliation Loop

With our CRD defined, we need a controller to act on it. The controller's logic is implemented in internal/controller/databasemigration_controller.go. The core of the controller is the Reconcile function. This function is called every time a DatabaseMigration resource is created, updated, or deleted.

// internal/controller/databasemigration_controller.go
 
// DatabaseMigrationReconciler reconciles a DatabaseMigration object
type DatabaseMigrationReconciler struct {
	client.Client
	Scheme *runtime.Scheme
	Log    logr.Logger
	// ... other fields
}
 
// +kubebuilder:rbac:groups=database.mohitnagaraj.in,resources=databasemigrations,verbs=get;list;watch;create;update;patch;delete
// ... other rbac markers
 
func (r *DatabaseMigrationReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := r.Log.WithValues("databasemigration", req.NamespacedName)
 
	// your logic here
 
	return ctrl.Result{}, nil
}

The Reconcile function receives the name and namespace of the DatabaseMigration resource that triggered the event. It's the reconciler's job to fetch the resource and take the necessary actions to bring the system to the desired state defined in the resource's Spec. This could involve creating jobs, pods, or other resources to perform the database migration.

Advanced Reconciliation: Beyond the Basics

A robust operator does more than just create resources. kube-shift demonstrates several advanced concepts:

  • Status Updates: The controller updates the Status field of the DatabaseMigration resource to reflect the current state of the migration. This provides valuable feedback to users.
  • Finalizers: To ensure graceful cleanup, kube-shift uses a finalizer. The DatabaseMigrationFinalizer is added to the resource on creation. This prevents the resource from being deleted until the controller has performed any necessary cleanup tasks.
  • Leader Election: In a high-availability setup, you might run multiple instances of your operator. Leader election ensures that only one instance is active at a time, preventing conflicts. This is configured in cmd/main.go.

Testing Your Operator

Testing is crucial for any production-grade software. kube-shift includes both unit and integration tests.

  • Unit Tests: These tests focus on individual components of the operator in isolation.
  • Integration Tests: The project uses the envtest package to run integration tests against a real Kubernetes API server. You can see this in internal/controller/suite_test.go, which sets up a test environment for the controller tests.

Conclusion

Building a Kubernetes Operator is a powerful way to extend Kubernetes and automate complex application management tasks. By leveraging tools like Kubebuilder and controller-runtime, you can create robust, production-ready operators. The kube-shift project is an excellent example of how to apply these concepts to solve a real-world problem.

To learn more, I encourage you to explore the kube-shift repository on GitHub. You'll find the complete source code, along with more details on how to build, deploy, and use the operator. Happy coding!