    Reliability Engineering in the Cloud

    Posted By: lucky_aut

    Published 08/2025
    MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz, 2 Ch
    Language: English | Duration: 5h 2m | Size: 1.2 GB

    This video course teaches reliability engineering strategies for the cloud, covering chaos engineering practices, observability and monitoring techniques, disaster recovery exercises, reliability metrics, fast data-driven decision-making, and the application of Gen AI/LLMs.

    Participants will learn how to increase the reliability and scalability of their systems in the cloud, improve the efficiency of their operations, and respond to incidents faster. They will also learn to automate operations so that time to detect and time to restore improve as far as possible, using modern cloud services, AI/LLMs, and best-in-class tools. The course shows how operational agility, lean principles, and chaos experimentation can foster a culture of continuous improvement built on collaboration and knowledge sharing. Because literature and established frameworks in this domain are scarce, learners will benefit from practical, domain-specific approaches and examples they can apply directly within their organizations and teams.

    Check out Mariya and Carlos's book Reliability Engineering in the Cloud: Strategies and Practices for AI-Powered Cloud-Based Systems (Addison-Wesley, 2025) for an even deeper dive.

    Learn How To

    Set an enterprise-wide CRE strategy for thousands of applications and dependencies
    Evaluate methods to increase the reliability and scalability of systems in the cloud
    Speed up incident response while automating operations to improve time to restore and time to detect as far as possible
    Recognize that operational agility and chaos experimentation bring a culture of continuous improvement built on collaboration and knowledge sharing between teams
    Build effective strategies that promote chaos engineering practices, observability and monitoring techniques, disaster recovery exercises, reliability metrics, and fast data-driven decision-making, supported by practical examples of techniques and tooling
    Identify domain-specific approaches and review examples to apply within your organization and teams

    Who Should Take This Course

    Software engineers and development teams responsible for designing, deploying, or maintaining cloud-native applications, with a focus on improving system reliability, scalability, and fault tolerance.
    Enterprise and technology leaders who are seeking to enhance the resilience of their cloud infrastructure, streamline operational efficiency, and reduce incident response times through modern reliability engineering practices.