5 GitHub Projects to make you a better DevOps Engineer ⚡

DevOps is one of the most challenging fields to be in, and to stay relevant you need to learn constantly.

So today, I want to share 5 amazing GitHub projects which will help you become a better DevOps engineer. These 5 GitHub projects can come in handy for anyone who is looking to learn and wants good resources to dive into. 🏊‍♀️

So let's get started👊

⭐ GitHub stars: 4.8k

This repo is a curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE).

upgundecha / howtheysre

A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)

How they SRE

Introduction

How They SRE is a curated knowledge repository of best practices, tools, techniques, and culture of SRE adopted by the leading technology or tech-savvy organizations.

Many organizations regularly come forward and share their best practices, tools, techniques and offer an insight into engineering culture on various public platforms like engineering blogs, conferences & meetups. The content is curated from these avenues and shared in this repository.

Note to readers: This list refers to some of the articles, posts, videos, tools, and techniques published before 2015. Please use such material with caution as there may be recent advances in technology and practices which offer better alternatives and perspectives.

Topics

  • Site Reliability Engineering
  • Hiring and Building SRE teams
  • SRE Culture
  • DevOps
  • Monitoring & Observability
  • Alerting
  • Incident Response…

⭐ GitHub stars: 32.5k

This repo has an organized reading list illustrating the patterns of scalable, reliable, and performant large-scale systems. It is one of the best resources on scalability, with real examples from large organizations.

binhnguyennus / awesome-scalability

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

An updated and organized reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems. Concepts are explained in the articles of prominent engineers and credible references. Case studies are taken from battle-tested systems that serve millions to billions of users.

If your system goes slow

Understand your problem first: is it a scalability problem (fast for a single user but slow under heavy load) or a performance problem (slow for a single user)? Then review some design principles and check how scalability and performance problems are solved at tech companies. The intelligence section is written for those who work with data and machine learning at big (data) and deep (learning) scale.
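To make the distinction concrete, here is a minimal Python sketch that labels a slowdown from two latency measurements. The function name, its parameters, and the idea of comparing against a single target latency are assumptions for illustration only; they do not come from the awesome-scalability repo.

```python
def classify_slowness(single_user_ms, under_load_ms, target_ms):
    """Label a slowdown using the definitions quoted above.

    All three arguments are hypothetical measurements in milliseconds:
    latency with one user, latency under heavy load, and the latency
    you consider acceptable.
    """
    if single_user_ms > target_ms:
        # Slow even for a single user: a performance problem.
        return "performance problem"
    if under_load_ms > target_ms:
        # Fast for one user but slow under heavy load: a scalability problem.
        return "scalability problem"
    return "within target"


print(classify_slowness(single_user_ms=80, under_load_ms=1200, target_ms=200))
print(classify_slowness(single_user_ms=900, under_load_ms=1500, target_ms=200))
```

In practice you would collect these numbers from a load test rather than hard-code them, but the decision order is the point: check the single-user case first, then the loaded case.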

If your system goes down

"Even if you lose all one day, you can build all over again if you retain your calm!" - Thuan Pham, former CTO of Uber. So, keep calm and mind the availability and stability matters!

⭐ GitHub stars: 8.6k

This repo contains questions and exercises on technical topics related to DevOps and SRE.

bregman-arie / devops-exercises

Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions

ℹ️  This repo contains questions and exercises on various technical topics, sometimes related to DevOps and SRE :)

📊  There are currently 1575 questions

📚  To learn more about DevOps and SRE, check the resources in devops-resources repository

⚠️  You can use these for preparing for an interview but most of the questions and exercises don't represent an actual interview. Please read Q&A for more details

👥  Join our DevOps community where we have discussions and resources on DevOps

📝  You can add more questions and exercises by submitting pull requests :) Read about contribution guidelines here


DevOps

What is DevOps?

⭐ GitHub stars: 7.2k

This project contains test questions and answers that can be asked during an interview or exam for positions such as Linux System Administrator.

trimstray / test-your-sysadmin-skills

A collection of Linux Sysadmin Test Questions and Answers. Test your knowledge and skills in different fields with these Q/A.

"A great Admin doesn't need to know everything, but they should be able to come up with amazing solutions to impossible projects." - cwheeler33 (ServerFault)

"My skills are making things work, not knowing a billion facts. [...] If I need to fix a system I’ll identify the problem, check the logs and look up the errors. If I need to implement a solution I’ll research the right solution, implement and document it, the later on only really have a general idea of how it works unless I interact with it frequently... it’s why it’s documented." - Sparcrypt (Reddit)

Created by trimstray and contributors



ℹ️  This project contains 284 test questions and answers that can be used to test your knowledge or during an interview/exam for a position such as Linux (*nix) System Administrator.

✔️  The answers are only examples and do not exhaust…



⭐ GitHub stars: 6.5k

This repo has a curated list of awesome Site Reliability and Production Engineering resources.

dastergon / awesome-sre

A curated list of Site Reliability and Production Engineering resources.

I hope you enjoyed this list. I will be coming up with more such amazing resources soon. So, stay tuned! 🙂

Currently building SigNoz - an open-source alternative to DataDog, New Relic, etc. 💙

SigNoz helps developers monitor applications and troubleshoot problems in their deployed applications. Check out our GitHub repo👇

SigNoz / signoz

SigNoz helps developers monitor their applications & troubleshoot problems, an open-source alternative to DataDog, NewRelic, etc. 🔥 🖥

Monitor your applications and troubleshoot problems in your deployed applications, an open-source alternative to DataDog, New Relic, etc.

SigNoz helps developers monitor applications and troubleshoot problems in their deployed applications. SigNoz uses distributed tracing to gain visibility into your software stack.

👉 You can see metrics like p99 latency and error rates for your services, external API calls, and individual endpoints.

👉 You can find the root cause of a problem by going to the exact traces that are causing it and viewing detailed flamegraphs of individual request traces.

👇 Features:

  • Application overview metrics like RPS, 50th/90th/99th Percentile latencies, and Error Rate
  • Slowest endpoints in your application
  • See the exact request trace to figure out issues in downstream services, slow DB queries, calls to third-party services like payment gateways, etc.
  • Filter traces by service name, operation, latency, error, tags/annotations.
  • Aggregate metrics on filtered traces. Eg, you can get error…
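If the overview metrics in the feature list are new to you, here is a minimal Python sketch of how RPS, percentile latencies, and error rate can be derived from a batch of request records. This is not SigNoz's implementation: the record fields (`latency_ms`, `error`), the function names, the sample data, and the choice of the nearest-rank percentile method are all assumptions made for illustration.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def overview(requests, window_seconds):
    """Summarize hypothetical request records collected over a time window."""
    latencies = sorted(r["latency_ms"] for r in requests)
    errors = sum(1 for r in requests if r["error"])
    return {
        "rps": len(requests) / window_seconds,
        "p50_ms": percentile(latencies, 50),
        "p90_ms": percentile(latencies, 90),
        "p99_ms": percentile(latencies, 99),
        "error_rate": errors / len(requests),
    }


# Made-up sample: 100 requests with latencies 10, 20, ..., 1000 ms,
# where anything slower than 900 ms is treated as an error.
sample = [{"latency_ms": ms, "error": ms > 900} for ms in range(10, 1010, 10)]
stats = overview(sample, window_seconds=10)
print(stats)
```

The p99 value is the one to watch for tail latency: it tells you how slow the slowest 1% of requests are, which an average would hide.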


