Create Chaos Experiments Using the LitmusChaos Python SDK

Hello! LitmusChaos now supports chaos experiments written in Python.
Let’s walk through creating chaos with the Python SDK, step by step.
Before we get started, a big announcement: LitmusChaos 2.0 is out, and litmus-python is part of it.
This post focuses on creating chaos. If you want to know more about Chaos Engineering, follow LitmusChaos; for details on the Python SDK, see litmus-python.
Steps to create Chaos:
  • git clone https://github.com/litmuschaos/litmus-python.git

  • cd contribute/developer-guide

  • Now update the attributes.yaml manifest with your experiment's name, category, and so on. Use underscores (_) in names, e.g. sample_category.
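A likely reason for the underscore rule: the names you choose end up in Python module paths (the import used later in this post is experiments.sample_category.sample_exec_chaos...), and a hyphen is not allowed in a Python identifier. Here is a minimal sketch of a check you could run on a name before generating anything. The helper is hypothetical and not part of litmus-python.

```python
def is_valid_experiment_name(name: str) -> bool:
    """Return True if `name` can be used as a Python package/module name.

    Hypothetical helper for illustration; litmus-python does not ship it.
    """
    # str.isidentifier() rejects hyphens, spaces and leading digits.
    return name.isidentifier()


for candidate in ("sample_category", "sample-category"):
    print(candidate, is_valid_experiment_name(candidate))
```

The first name passes and the hyphenated one does not, which is why sample_category is the safe spelling.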

  • Now run: python3 generate_experiment.py -f=attributes.yaml -g="generate-type" -t="type"

    • You may run both commands:
    • For the experiment: python3 generate_experiment.py -f=attributes.yaml -g=experiment
    • For the charts: python3 generate_experiment.py -f=attributes.yaml -g=chart
  • Note: Replace the -g=<generate-type> placeholder with the appropriate value based on the use case:
    
     - experiment: Chaos experiment artifacts belonging to an existing OR new experiment.
     - chart: Just the chaos-chart metadata, i.e., chartserviceversion.yaml
    
    Provide the type of chart in the `-t=<type>` flag. It supports the following values:
     - category: Creates the chart metadata for the category, i.e., chartserviceversion and package manifests
     - experiment: Creates the chart for the experiment, i.e., chartserviceversion, engine, rbac, and experiment manifests
     - all: Creates both category and experiment charts (default type)
    Provide the path of the attributes.yaml manifest in the -f flag.
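To keep the flag combinations straight, here is a small sketch that assembles the generator command from the values listed above and rejects anything else. The function and constant names are my own (hypothetical); only the -f, -g and -t flags and their values come from the steps above.

```python
GENERATE_TYPES = ("experiment", "chart")
CHART_TYPES = ("category", "experiment", "all")


def build_generate_command(attributes_file, generate_type, chart_type=None):
    """Build the argument list for generate_experiment.py (hypothetical helper)."""
    if generate_type not in GENERATE_TYPES:
        raise ValueError(f"-g must be one of {GENERATE_TYPES}, got {generate_type!r}")
    command = [
        "python3",
        "generate_experiment.py",
        f"-f={attributes_file}",
        f"-g={generate_type}",
    ]
    # -t is optional; when it is left out the generator uses its default type.
    if chart_type is not None:
        if chart_type not in CHART_TYPES:
            raise ValueError(f"-t must be one of {CHART_TYPES}, got {chart_type!r}")
        command.append(f"-t={chart_type}")
    return command


print(" ".join(build_generate_command("attributes.yaml", "experiment")))
print(" ".join(build_generate_command("attributes.yaml", "chart", "category")))
```

The returned list can be passed to subprocess.run from inside contribute/developer-guide if you prefer to script the step.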
  • Check the chaosLib/litmus/, experiments/ and pkg/ directories: the sample chaos experiment has been generated there.

  • Open bin/experiment/experiment.py and add this import:

  • import experiments.sample_category.sample_exec_chaos.experiment.sample_exec_chaos as experiment
    (sample_category and sample_exec_chaos stand for the names you provided in the attributes.yaml manifest; the default names are used everywhere in this post.) Then add one more elif condition:
    elif args.name == "chaos":
        experiment.Experiment(clients)
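If the dispatch pattern is new to you, here is a self-contained sketch of how a -name flag can select an experiment. It is not the real bin/experiment/experiment.py: run_sample_exec_chaos is a stand-in for experiment.Experiment(clients), and the parser setup is my assumption about how args.name gets populated.

```python
import argparse


def run_sample_exec_chaos(clients):
    # Stand-in for experiment.Experiment(clients) from the generated module.
    return "sample_exec_chaos started"


def build_parser():
    parser = argparse.ArgumentParser(description="Run a chaos experiment by name")
    # argparse accepts single-dash long options, so "-name chaos" works.
    parser.add_argument("-name", default="", help="name of the experiment to run")
    return parser


def dispatch(name, clients=None):
    # One branch per experiment; a new experiment means one more elif here.
    if name == "chaos":
        return run_sample_exec_chaos(clients)
    raise ValueError(f"Unsupported -name: {name!r}")


args = build_parser().parse_args(["-name", "chaos"])
print(dispatch(args.name))
```

The value compared in the elif ("chaos" here) is the same string you later pass on the command line with -name.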
  • Add the following directories in setup.py:
  • 'chaosLib/litmus/sample_exec_chaos',
    'chaosLib/litmus/sample_exec_chaos/lib',
    'pkg/sample_category',
    'pkg/sample_category/environment',
    'pkg/sample_category/types',
    'experiments/sample_category',
    'experiments/sample_category/sample_exec_chaos',
    'experiments/sample_category/sample_exec_chaos/experiment',
    (These entries use the default names; replace them with your own.)
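If you used your own names, the eight entries follow a fixed pattern. A minimal sketch that derives them from the category and experiment names (the helper name is hypothetical, the paths are the ones listed above):

```python
def setup_directories(category, experiment):
    """Return the directory entries to add in setup.py (hypothetical helper)."""
    return [
        f"chaosLib/litmus/{experiment}",
        f"chaosLib/litmus/{experiment}/lib",
        f"pkg/{category}",
        f"pkg/{category}/environment",
        f"pkg/{category}/types",
        f"experiments/{category}",
        f"experiments/{category}/{experiment}",
        f"experiments/{category}/{experiment}/experiment",
    ]


for path in setup_directories("sample_category", "sample_exec_chaos"):
    print(f"'{path}',")
```

With the default names this prints exactly the entries shown above, ready to paste.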
  • Let’s go back to the root directory, litmuschaos/litmus-python, to set up the environment.

  • python3 -m virtualenv chaos

  • source chaos/bin/activate

  • python3 setup.py install (this installs all required prerequisites and sets up the directory structure;
    you need to run it every time before running python3 experiment.py -name chaos)

  • Now you are ready to code. Open chaosLib/litmus/sample_exec_chaos/lib/sample_exec_chaos.py and experiments/sample_category/sample_exec_chaos/experiment/sample_exec_chaos.py and start writing chaos…

  • Create a sample Nginx deployment that can be used as the application under test (AUT).

  • kubectl create deployment nginx --image=nginx
  • Go to pkg/sample_category, open environment.py or types.py, and add, delete, or update the required ENV variables.
  • Note:
    Add the & operator at the end of the chaos command in CHAOS_INJECT_COMMAND,
    for example: md5sum /dev/zero &. This is needed because chaos commands run as a background process in a separate thread.
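Since a missing & is easy to overlook, here is a tiny sketch that normalises a command before you put it in CHAOS_INJECT_COMMAND. The helper is hypothetical and is not something the SDK provides.

```python
def as_background_command(command: str) -> str:
    """Return `command` with a trailing & so the shell runs it in the background.

    Hypothetical helper for illustration only.
    """
    stripped = command.strip()
    # Leave the command alone if it already ends with the & operator.
    if stripped.endswith("&"):
        return stripped
    return f"{stripped} &"


print(as_background_command("md5sum /dev/zero"))
print(as_background_command("md5sum /dev/zero &"))
```

Both calls give the same string, so the helper is safe to apply more than once.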
  • Go to bin/experiment and run: python3 experiment.py -name chaos. Before this command always run python3 setup.py install in the root directory. Example chaos logs:
  • time=2021-08-13 11:43:12,392 level=INFO  msg=Experiment Name: chaos
    time=2021-08-13 11:43:12,392 level=INFO  msg=[PreReq]: Initialise Chaos Variables for the sample-chaos experiment
    time=2021-08-13 11:43:12,393 level=INFO  msg=[PreReq]: Updating the chaos result of sample-chaos experiment (SOT)
    time=2021-08-13 11:43:12,867 level=INFO  msg=[Info]: The application information is as follows Namespace=litmus, Label=app=nginx, Ramp Time=0
    time=2021-08-13 11:43:12,867 level=INFO  msg=[Status]: Verify that the AUT (Application Under Test) is running (pre-chaos)
    time=2021-08-13 11:43:12,867 level=INFO  msg=[status]: Checking whether application containers are in ready state
    time=2021-08-13 11:43:12,887 level=INFO  msg=[status]: The Container status are as follows Container : nginx, Pod : nginx-66b6c48dd5-c65kl, Readiness : True
    time=2021-08-13 11:43:12,887 level=INFO  msg=[status]: The Container status are as follows Container : nginx, Pod : nginx-66b6c48dd5-p87pc, Readiness : True
    time=2021-08-13 11:43:12,887 level=INFO  msg=[status]: The Container status are as follows Container : nginx, Pod : nginx-66b6c48dd5-s5hjs, Readiness : True
    time=2021-08-13 11:43:12,887 level=INFO  msg=[status]: Checking whether application pods are in running state
    time=2021-08-13 11:43:12,908 level=INFO  msg=[status]: The status of Pods are as follows Pod : nginx-66b6c48dd5-c65kl status : Running
    time=2021-08-13 11:43:12,908 level=INFO  msg=[status]: The status of Pods are as follows Pod : nginx-66b6c48dd5-p87pc status : Running
    time=2021-08-13 11:43:12,908 level=INFO  msg=[status]: The status of Pods are as follows Pod : nginx-66b6c48dd5-s5hjs status : Running
    time=2021-08-13 11:43:12,938 level=INFO  msg=[Info]: chaos candidate of kind: deployment, name: nginx, namespace: litmus
    time=2021-08-13 11:43:12,943 level=INFO  msg=[Info]: chaos candidate of kind: deployment, name: nginx, namespace: litmus
    time=2021-08-13 11:43:12,947 level=INFO  msg=[Info]: chaos candidate of kind: deployment, name: nginx, namespace: litmus
    time=2021-08-13 11:43:12,947 level=INFO  msg=[Chaos]:Number of pods targeted: 3
    time=2021-08-13 11:43:12,947 level=INFO  msg=[Info]: Target pods list, ['nginx-66b6c48dd5-c65kl', 'nginx-66b6c48dd5-p87pc', 'nginx-66b6c48dd5-s5hjs']
    time=2021-08-13 11:43:12,955 level=INFO  msg=[Chaos]: The Target application details container : nginx, Pod : nginx-66b6c48dd5-c65kl
    time=2021-08-13 11:43:12,956 level=INFO  msg=[Chaos]: Waiting for: 10
    time=2021-08-13 11:43:22,955 level=INFO  msg=[Chaos]: Time is up for experiment: sample-chaos
    time=2021-08-13 11:43:23,155 level=INFO  msg=[Chaos]: The Target application details container : nginx, Pod : nginx-66b6c48dd5-p87pc
    time=2021-08-13 11:43:23,164 level=INFO  msg=[Chaos]: Waiting for: 10
    time=2021-08-13 11:43:33,155 level=INFO  msg=[Chaos]: Time is up for experiment: sample-chaos
    time=2021-08-13 11:43:33,289 level=INFO  msg=[Chaos]: The Target application details container : nginx, Pod : nginx-66b6c48dd5-s5hjs
    time=2021-08-13 11:43:33,289 level=INFO  msg=[Chaos]: Waiting for: 10
    time=2021-08-13 11:43:43,289 level=INFO  msg=[Chaos]: Time is up for experiment: sample-chaos
    time=2021-08-13 11:43:43,405 level=INFO  msg=[Confirmation]: sample-chaos chaos has been injected successfully
    time=2021-08-13 11:43:43,405 level=INFO  msg=[Status]: Verify that the AUT (Application Under Test) is running (post-chaos)
    time=2021-08-13 11:43:43,405 level=INFO  msg=[status]: Checking whether application containers are in ready state
    time=2021-08-13 11:43:43,416 level=INFO  msg=[status]: The Container status are as follows Container : nginx, Pod : nginx-66b6c48dd5-c65kl, Readiness : True
    time=2021-08-13 11:43:43,416 level=INFO  msg=[status]: The Container status are as follows Container : nginx, Pod : nginx-66b6c48dd5-p87pc, Readiness : True
    time=2021-08-13 11:43:43,416 level=INFO  msg=[status]: The Container status are as follows Container : nginx, Pod : nginx-66b6c48dd5-s5hjs, Readiness : True
    time=2021-08-13 11:43:43,416 level=INFO  msg=[status]: Checking whether application pods are in running state
    time=2021-08-13 11:43:43,429 level=INFO  msg=[status]: The status of Pods are as follows Pod : nginx-66b6c48dd5-c65kl status : Running
    time=2021-08-13 11:43:43,429 level=INFO  msg=[status]: The status of Pods are as follows Pod : nginx-66b6c48dd5-p87pc status : Running
    time=2021-08-13 11:43:43,429 level=INFO  msg=[status]: The status of Pods are as follows Pod : nginx-66b6c48dd5-s5hjs status : Running
    time=2021-08-13 11:43:43,429 level=INFO  msg=[The End]: Updating the chaos result of sample-chaos experiment (EOT)
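If you want to post-process logs like these, each line follows a time=… level=… msg=… shape. A rough sketch that splits a line into its parts and measures the time between two lines; the regular expression is my own reading of the format shown above, not something the SDK exposes.

```python
import re
from datetime import datetime

# Pattern inferred from the sample logs: time=<timestamp> level=<LEVEL>  msg=<text>
LOG_LINE = re.compile(r"^time=(?P<time>.+?) level=(?P<level>\S+)\s+msg=(?P<msg>.*)$")


def parse_log_line(line):
    """Split one log line into a dict, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "time": datetime.strptime(match["time"], "%Y-%m-%d %H:%M:%S,%f"),
        "level": match["level"],
        "msg": match["msg"],
    }


first = parse_log_line(
    "time=2021-08-13 11:43:12,392 level=INFO  msg=Experiment Name: chaos"
)
last = parse_log_line(
    "time=2021-08-13 11:43:43,429 level=INFO  msg=[The End]: "
    "Updating the chaos result of sample-chaos experiment (EOT)"
)
# Elapsed wall-clock time between the first and last line of the run.
print(last["time"] - first["time"])
```

For the run above this comes to roughly 31 seconds, which matches three target pods with a 10-second wait each.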
  • Now make sure that you have created all the required charts in the experiments/sample_category/sample_exec_chaos/charts directory.
  • After testing locally, let’s move on to production.
  • Go to the root directory litmuschaos/litmus-python. Build a Docker image with docker build -t your-user-name/py-runner:ci . and push it with docker push your-user-name/py-runner:ci
  • There are two ways to test it:
    1. The custom way
    Refer to the python-sdk docs for more details.
    Run the experiment.yml with the desired values in the ENV and the appropriate chaosServiceAccount, using a custom dev image (say, oumkale/litmus-python:ci) that packages the business logic instead of litmuschaos/litmus-python.
  • Create a custom image built with the code validated by the previous steps.

  • Launch the Chaos-Operator:

  • kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.13.8.yaml
  • Setup the RBAC necessary for the execution of this experiment by applying the generated rbac.yaml.
  • kubectl apply -f rbac.yaml
  • Modify the ChaosExperiment manifest (experiment.yaml) with the right defaults (env and other attributes, as applicable), point .spec.definition.image to the custom image you just built, and create this CR on the cluster.
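As an illustration of the image change, here is a sketch that works on the manifest after it has been loaded into a plain dict (for example with a YAML library of your choice). The helper name is hypothetical and the dict is trimmed to the fields this step touches; only the .spec.definition.image path and the image names come from this post.

```python
import copy


def with_custom_image(chaos_experiment, image):
    """Return a copy of the manifest with .spec.definition.image replaced."""
    updated = copy.deepcopy(chaos_experiment)
    # Create the intermediate keys if the manifest does not have them yet.
    updated.setdefault("spec", {}).setdefault("definition", {})["image"] = image
    return updated


# Trimmed-down stand-in for the generated experiment.yaml.
manifest = {
    "kind": "ChaosExperiment",
    "spec": {"definition": {"image": "litmuschaos/litmus-python"}},
}

custom = with_custom_image(manifest, "your-user-name/py-runner:ci")
print(custom["spec"]["definition"]["image"])
```

The original dict is left untouched, so you can keep the default manifest around for comparison.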

  • Modify the ChaosEngine manifest (engine.yaml) with the right app details and run properties, then create this CR to launch the chaos pods.

  • Verify the experiment status via ChaosResult.
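To check the outcome from a script, you could read the verdict out of the ChaosResult once it is available as a dict (for instance from kubectl get chaosresult -o json). This sketch assumes the verdict lives under status.experimentStatus.verdict, which is how I remember the CR; please verify the field path against your cluster. The helper and the sample dict are hypothetical.

```python
def experiment_verdict(chaos_result):
    """Return the verdict string from a ChaosResult dict, or None if absent.

    Assumes the field path status.experimentStatus.verdict.
    """
    status = chaos_result.get("status", {}).get("experimentStatus", {})
    return status.get("verdict")


# Hand-written sample, not real cluster output.
sample_result = {
    "kind": "ChaosResult",
    "status": {"experimentStatus": {"phase": "Completed", "verdict": "Pass"}},
}

print(experiment_verdict(sample_result))
```

Returning None for a missing verdict lets the caller tell "not finished yet" apart from a failed run.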

  • Refer to the litmus docs for more details on performing each step in this procedure.
    Run all the CRs and the operator in a single namespace.
    Example experiment.yaml: Link
    Example engine.yaml: Link
    N.B.: Also add the & operator at the end of any chaos command passed as an ENV.
    Now list the pods in the namespace
    and check that the engine pod is running along with the runner:
    nginx-chaos-runner                      1/1     Running   0          35s
    pod-cpu-hog-exec-2szc9z-rjjkr           1/1     Running   0          33s
    Then check the logs of the engine pod.
    2. Using Litmus Portal
    Follow this Link for portal setup details.
  • Open the portal and go to Workflows -> Schedule a workflow

  • Select your agent.

  • Alternatively, you may select the second option and use an experiment from My Hub.

  • Choose the upload YAML option (for a workflow example: Link). N.B.: the file you upload here must be an Argo workflow.

  • Click Next through the following screens until you reach the page where you can edit the YAML. You can also make the changes in the UI by clicking the experiment name, which takes you to the next screen.

    • Update the engine's spec.chaosServiceAccount to litmus-admin.
  • Now tune all the ENV values and finish.

  • After scheduling the workflow you will land on the workflow dashboard.

  • Open the workflow to see its details.

  • Here the logs, chaos results, and details about the experiment are provided. With that, you have finished creating chaos and testing it on the application.
    Note: You can see the details in the UI; another way is to inspect the following CRs:
    kubectl get chaosresult -n litmus
    kubectl get chaosengine -n litmus
    kubectl get chaosexperiment -n litmus
    kubectl get workflow -n litmus
    Congratulations! You have successfully created and injected chaos!
    Now it's time to raise a PR!
    Steps to include the chaos charts/experiments in the ChartHub:
  • Send a PR to the litmus-python repo with the modified experiment files, rbac, test deployment & README.
  • Send a PR to the chaos-charts repo with the modified experiment CR, experiment chartserviceversion, rbac, (category-level) chaos chart chartserviceversion & package.yaml (if applicable).
  • Contact us on Slack for any queries or doubts.
  • Are you an SRE or a Kubernetes enthusiast? Does Chaos Engineering excite you?
    Join the LitmusChaos community in the #litmus channel on the Kubernetes Slack (https://slack.k8s.io/)!