DevOps Amazon

Aayush Pandey
4 min read · Jun 21, 2021

Developers and operators can easily boost the efficiency and availability of their applications with DevOps Guru, a fully managed operations service. DevOps Guru eases the administrative burden of identifying operational issues, so teams can focus on implementing recommendations that improve their applications. DevOps Guru uses machine learning to analyze operational data, application metrics, and events, and to detect behavior that is out of the ordinary.

Amazon DevOps Guru Features:

· Consolidate operational data from multiple sources

· Leverage ML-powered insights

· Automatically configure alarms

· Detect the most critical issues with minimal noise

· Integrate with AWS services and third-party tools

DevOps Guru Workflow:

The DevOps Guru workflow begins when users configure its coverage and notifications. After users set up DevOps Guru, it starts to analyze their operational data. When it detects anomalous behavior, it creates an insight that contains recommendations along with lists of the metrics and events related to the issue. DevOps Guru notifies users about each insight. If users have enabled AWS Systems Manager OpsCenter, an OpsItem is also created so they can use OpsCenter to track and manage the work of addressing each insight.
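The setup step described above (coverage first, then notifications) could be sketched in Python as follows. This is a minimal sketch, not the author's method: it assumes coverage is defined by AWS CloudFormation stack names and that notifications go to an Amazon SNS topic. The helper names, the stack name, and the topic ARN are hypothetical; the client is expected to behave like the boto3 "devops-guru" client, whose update_resource_collection and add_notification_channel operations are used here.

```python
def build_setup_requests(stack_names, sns_topic_arn):
    """Build keyword arguments for the two setup calls (hypothetical helper)."""
    # Coverage: which CloudFormation stacks DevOps Guru should analyze.
    coverage = {
        "Action": "ADD",
        "ResourceCollection": {
            "CloudFormation": {"StackNames": list(stack_names)}
        },
    }
    # Notifications: the SNS topic that receives insight notifications.
    notifications = {"Config": {"Sns": {"TopicArn": sns_topic_arn}}}
    return coverage, notifications


def configure_devops_guru(client, stack_names, sns_topic_arn):
    """Apply coverage and notification settings; return the channel ID."""
    coverage, notifications = build_setup_requests(stack_names, sns_topic_arn)
    client.update_resource_collection(**coverage)
    response = client.add_notification_channel(**notifications)
    return response.get("Id")
```

With boto3 installed and credentials configured, this would be called as configure_devops_guru(boto3.client("devops-guru"), ["my-app-stack"], topic_arn). Passing the client in as a parameter keeps the request-building logic testable without touching an AWS account.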

High-level DevOps Guru workflow

The Amazon DevOps Guru workflow can be broken down into three high-level steps.


Detailed DevOps Guru workflow

The DevOps Guru workflow integrates with several AWS services, including Amazon CloudWatch, AWS CloudTrail, Amazon Simple Notification Service, and AWS Systems Manager.


Understanding how Amazon DevOps Guru works requires an understanding of the following principles:


Anomaly

An anomaly is a group of one or more related metrics that DevOps Guru has identified as unexpected or unusual. DevOps Guru generates anomalies by using machine learning to analyze the metrics and operational data related to users' AWS resources.


Insight

An insight is a collection of anomalies that DevOps Guru creates while analyzing the AWS resources users specified during setup. Each insight includes observations, recommendations, and analytical data that users can use to improve their operational efficiency.

Metrics and operational events

The anomalies that make up an insight are generated by analyzing the metrics returned by Amazon CloudWatch and the operational events emitted by users' AWS resources. Users can view the metrics and operational events behind an insight to better understand the issues in their application.


Recommendations

Each insight provides recommendations that help users improve the performance of their application.
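To make the relationship between an insight, its anomalies, and its recommendations concrete, here is a hedged Python sketch. It assumes a client that behaves like the boto3 "devops-guru" client and uses its list_anomalies_for_insight and list_recommendations operations. The function name and the shape of the returned summary are hypothetical, and pagination (NextToken) is ignored for brevity.

```python
def summarize_insight(client, insight_id):
    """Collect anomaly IDs and recommendation names for one insight."""
    anomalies = client.list_anomalies_for_insight(InsightId=insight_id)
    recommendations = client.list_recommendations(InsightId=insight_id)

    # An insight may carry reactive anomalies, proactive anomalies, or both.
    anomaly_ids = [
        anomaly.get("Id")
        for key in ("ReactiveAnomalies", "ProactiveAnomalies")
        for anomaly in anomalies.get(key, [])
    ]
    recommendation_names = [
        rec.get("Name") for rec in recommendations.get("Recommendations", [])
    ]
    return {
        "insight_id": insight_id,
        "anomaly_ids": anomaly_ids,
        "recommendations": recommendation_names,
    }
```

The insight ID would typically come from a prior call such as list_insights. Only the first page of results is read here, so a real script would need to follow NextToken for insights with many anomalies.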

Amazon DevOps Guru Pricing:

Users can estimate the monthly cost of having Amazon DevOps Guru analyze their AWS resources. They pay for the number of hours analyzed for each active AWS resource in their specified resource coverage. A resource is active if it produces metrics, events, or logs within an hour.

Users can create one cost estimate at a time. The time it takes to generate a cost estimate depends on the number of resources users specify when they create it. When they specify a lot of resources, the estimate can take up to four hours to complete. Actual costs vary with the percentage of time that the analyzed resources are active.
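The arithmetic behind a per-resource-hour charge can be sketched as below. This is only an illustration of the billing model described above, not the service's cost estimator: the hourly rates are placeholder inputs (check the official pricing page for real values), the default of 730 hours per month is an assumed average, and the function and field names are hypothetical.

```python
def estimate_monthly_cost(resources, hours_in_month=730):
    """Estimate cost as count x active hours x hourly rate, summed per group.

    Each entry in `resources` is a dict with:
      count           - number of resources in the group
      active_fraction - share of hours in which they are active (0.0 to 1.0)
      hourly_rate     - price per resource-hour (placeholder value)
    """
    total = 0.0
    for group in resources:
        # Only hours in which a resource is active are billed.
        active_hours = hours_in_month * group["active_fraction"]
        total += group["count"] * active_hours * group["hourly_rate"]
    return total
```

For example, ten resources that are active half of the time are billed for half as many resource-hours as ten resources that are always active, which is why actual costs depend on utilization.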

Amazon DevOps Guru Security:

Encryption is an important part of DevOps Guru security.

· Encryption of data at rest

· Encryption of data in transit

Encryption of data in transit is provided by default. Encryption of data at rest can be configured by users when they create a project or build.

Users can improve the security of their resource analysis and insight generation by configuring DevOps Guru to use an interface VPC endpoint.
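As a rough sketch of what creating such an interface endpoint might involve, the Python helper below builds the arguments for the Amazon EC2 create_vpc_endpoint operation. The service name pattern "com.amazonaws.<region>.devops-guru" is an assumption that should be verified against the AWS documentation for the target region. The helper name and all IDs are hypothetical.

```python
def build_vpc_endpoint_request(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for ec2.create_vpc_endpoint (hypothetical helper)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # Assumed service name pattern; confirm it for your region.
        "ServiceName": f"com.amazonaws.{region}.devops-guru",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Lets the default service hostname resolve to the endpoint.
        "PrivateDnsEnabled": True,
    }
```

With boto3, the result could be passed as boto3.client("ec2").create_vpc_endpoint(**request). Keeping the request as plain data makes it easy to review before anything is created.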

Amazon DevOps Guru Use cases:

· Improve operational performance and availability

· Dynamically discover new resources and metrics

· Reduce mean time to recovery (MTTR)

· Proactive resource management

Learn more about Amazon DevOps Guru:




Thank You for reading this!!! 👏🏻