Krkn is an open-source chaos engineering toolkit for Kubernetes-based cloud-native environments. It focuses on resilience testing and fault injection for infrastructure and workloads.

Chaos engineering framework for Kubernetes clusters (reliability testing)
Workload and infrastructure fault injection scenarios (resilience testing)
Automated scenario orchestration and execution against clusters (test automation)
Support for custom and extensible chaos scenarios via configuration (extensibility)
Reporting and validation workflows to assess cluster and application behavior under stress (observability and validation)

More About Krkn

Krkn is a chaos engineering toolkit designed for Kubernetes-based cloud-native platforms, with a focus on validating resilience and reliability characteristics of clusters, infrastructure, and deployed workloads. It targets enterprise and institutional users that operate Kubernetes at scale and need controlled fault injection to verify behavior under adverse conditions.

The project provides structured chaos scenarios (resilience testing) that can be executed against Kubernetes clusters and their underlying infrastructure. These scenarios cover areas such as workload disruption, resource pressure, and infrastructure events, enabling teams to observe how applications and platforms behave when components fail or degrade. Krkn focuses on repeatable and automated testing patterns rather than ad hoc experiments.

Krkn includes automation and orchestration capabilities (test automation) that allow users to define and schedule chaos runs, often as part of continuous integration (CI) and continuous delivery (CD) workflows. Scenarios are defined through configuration, so operators can encode policies, constraints, and failure modes that match production environments. This configuration-driven approach supports integration into existing platform engineering and site reliability engineering (SRE) practices.
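As a rough illustration of the configuration-driven idea, the Python sketch below loads scenario definitions and rejects any that break simple policy constraints. The schema is an assumption made for this example: the field names, the allowed scenario kinds, and the protected-namespace rule are hypothetical and do not reflect Krkn's actual configuration format.

```python
import json
from dataclasses import dataclass


# Hypothetical scenario schema for illustration only; these field names
# are assumptions and are not taken from Krkn's real configuration format.
@dataclass(frozen=True)
class ChaosScenario:
    name: str
    kind: str              # e.g. "pod_disruption" or "node_stress"
    namespace: str
    max_targets: int       # constraint: upper bound on affected objects
    duration_seconds: int


# Hypothetical policy values an operator might encode.
ALLOWED_KINDS = {"pod_disruption", "node_stress"}
PROTECTED_NAMESPACES = {"kube-system"}


def load_scenarios(raw: str) -> list[ChaosScenario]:
    """Parse a JSON document and reject scenarios that violate policy."""
    scenarios = []
    for entry in json.loads(raw)["scenarios"]:
        scenario = ChaosScenario(**entry)
        if scenario.kind not in ALLOWED_KINDS:
            raise ValueError(f"{scenario.name}: unknown kind {scenario.kind!r}")
        if scenario.namespace in PROTECTED_NAMESPACES:
            raise ValueError(f"{scenario.name}: namespace is protected")
        if scenario.max_targets < 1 or scenario.duration_seconds < 1:
            raise ValueError(f"{scenario.name}: limits must be positive")
        scenarios.append(scenario)
    return scenarios


EXAMPLE = """
{"scenarios": [
  {"name": "kill-one-frontend-pod", "kind": "pod_disruption",
   "namespace": "shop", "max_targets": 1, "duration_seconds": 60}
]}
"""

if __name__ == "__main__":
    for s in load_scenarios(EXAMPLE):
        print(s.name, s.kind, s.namespace)
```

Validating constraints at load time means a CI job can fail before any fault is injected, which is one way to keep automated chaos runs within agreed limits.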

The toolkit is built for Kubernetes (container orchestration) and aligns with cloud-native technologies backed by the Cloud Native Computing Foundation (CNCF). It interacts with Kubernetes APIs and cluster resources to introduce failures such as pod terminations, node-level stress, or other disruptions, depending on the configured scenario set. By doing so, Krkn helps teams validate recovery features such as autoscaling, self-healing controllers, and redundancy mechanisms that are common in Kubernetes architectures.
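The pod-termination pattern described above can be sketched in Python as "pick victims, delete them, then poll until the replica count recovers". This is a minimal sketch and not Krkn's implementation. The functions `pick_victims` and `run_pod_kill` and the `FakeDeployment` class are hypothetical names, and the in-memory deployment stands in for a real cluster so the example runs without one. Against a real cluster, the `list_pods` and `delete_pod` callables could wrap the official Kubernetes Python client's `CoreV1Api.list_namespaced_pod` and `CoreV1Api.delete_namespaced_pod`.

```python
import random
import time
from typing import Callable, Sequence


def pick_victims(pods: Sequence[str], count: int, seed: int) -> list[str]:
    """Choose up to `count` pod names; a fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    return rng.sample(sorted(pods), min(count, len(pods)))


def run_pod_kill(
    list_pods: Callable[[], list[str]],
    delete_pod: Callable[[str], None],
    count: int,
    expected_replicas: int,
    seed: int = 0,
    max_polls: int = 10,
    poll_interval: float = 0.0,
) -> bool:
    """Delete pods, then report whether the workload recovered in time."""
    victims = pick_victims(list_pods(), count, seed)
    for name in victims:
        delete_pod(name)
    for _ in range(max_polls):
        current = list_pods()
        # Recovered: enough replicas exist and none of them is a deleted pod.
        if len(current) >= expected_replicas and not set(victims) & set(current):
            return True
        time.sleep(poll_interval)
    return False


class FakeDeployment:
    """In-memory stand-in for a workload whose controller replaces lost pods."""

    def __init__(self, replicas: int) -> None:
        self.replicas = replicas
        self._next_id = 0
        self.pods: list[str] = []
        self._reconcile()

    def _reconcile(self) -> None:
        # Simulates a self-healing controller restoring the replica count.
        while len(self.pods) < self.replicas:
            self.pods.append(f"web-{self._next_id}")
            self._next_id += 1

    def list_pods(self) -> list[str]:
        self._reconcile()
        return list(self.pods)

    def delete_pod(self, name: str) -> None:
        self.pods.remove(name)


if __name__ == "__main__":
    deployment = FakeDeployment(replicas=3)
    recovered = run_pod_kill(
        deployment.list_pods, deployment.delete_pod, count=1, expected_replicas=3
    )
    print("recovered:", recovered)
```

Passing the list and delete operations in as callables keeps the fault-injection logic separate from cluster access, so the same check can be exercised against a fake in tests and a real API client in a chaos run.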

Enterprises use Krkn to run planned chaos experiments in non-production and, in some cases, controlled production environments to validate service-level objectives, operational runbooks, and incident response processes. The tool supports extensibility (platform tooling) through custom scenarios and parameterization, allowing organizations to adapt it to specific infrastructure providers, networking topologies, and workload types.

Within an enterprise tooling directory, Krkn fits into categories such as chaos engineering, reliability testing, and Kubernetes platform validation. It is relevant for platform engineering teams, SRE groups, and operations teams that manage Kubernetes clusters and require systematic methods to test and demonstrate the resilience of their cloud-native platforms.