Last updated: the evening of July 24, 2024
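AWS CloudWatch data source
For AWS CloudWatch, the main decision is which of the following four authentication methods to use:
- AWS SDK Default
- IAM Role
- AK&SK (access key and secret key)
- Credentials file
IAM Role is the currently recommended method, because it avoids the risk of leaked keys.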
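Note in particular that, to read CloudWatch metrics and EC2 tags, instances, regions, and alarms, you must grant Grafana permissions through IAM. You can attach these permissions to the IAM role or IAM user that you configured for AWS authentication.
Example IAM policies:
Metrics-only: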
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadingMetricsFromCloudWatch",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:DescribeAlarmsForMetric",
        "cloudwatch:DescribeAlarmHistory",
        "cloudwatch:DescribeAlarms",
        "cloudwatch:ListMetrics",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetInsightRuleReport"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowReadingTagsInstancesRegionsFromEC2",
      "Effect": "Allow",
      "Action": ["ec2:DescribeTags", "ec2:DescribeInstances", "ec2:DescribeRegions"],
      "Resource": "*"
    },
    {
      "Sid": "AllowReadingResourcesForTags",
      "Effect": "Allow",
      "Action": "tag:GetResources",
      "Resource": "*"
    }
  ]
}
Logs-only:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadingLogsFromCloudWatch",
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogGroups",
        "logs:GetLogGroupFields",
        "logs:StartQuery",
        "logs:StopQuery",
        "logs:GetQueryResults",
        "logs:GetLogEvents"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowReadingTagsInstancesRegionsFromEC2",
      "Effect": "Allow",
      "Action": ["ec2:DescribeTags", "ec2:DescribeInstances", "ec2:DescribeRegions"],
      "Resource": "*"
    },
    {
      "Sid": "AllowReadingResourcesForTags",
      "Effect": "Allow",
      "Action": "tag:GetResources",
      "Resource": "*"
    }
  ]
}
Metrics and Logs:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadingMetricsFromCloudWatch",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:DescribeAlarmsForMetric",
        "cloudwatch:DescribeAlarmHistory",
        "cloudwatch:DescribeAlarms",
        "cloudwatch:ListMetrics",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetInsightRuleReport"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowReadingLogsFromCloudWatch",
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogGroups",
        "logs:GetLogGroupFields",
        "logs:StartQuery",
        "logs:StopQuery",
        "logs:GetQueryResults",
        "logs:GetLogEvents"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowReadingTagsInstancesRegionsFromEC2",
      "Effect": "Allow",
      "Action": ["ec2:DescribeTags", "ec2:DescribeInstances", "ec2:DescribeRegions"],
      "Resource": "*"
    },
    {
      "Sid": "AllowReadingResourcesForTags",
      "Effect": "Allow",
      "Action": "tag:GetResources",
      "Resource": "*"
    }
  ]
}
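The three policies above share the EC2 and tagging statements and differ only in whether the CloudWatch metrics statement, the Logs statement, or both are present. A minimal Python sketch of that composition follows. The function name `build_policy` and its parameters are hypothetical helpers for illustration; they are not part of Grafana or any AWS tool.

```python
import json

# Statement blocks copied from the policy examples in this article.
METRICS_STATEMENT = {
    "Sid": "AllowReadingMetricsFromCloudWatch",
    "Effect": "Allow",
    "Action": [
        "cloudwatch:DescribeAlarmsForMetric",
        "cloudwatch:DescribeAlarmHistory",
        "cloudwatch:DescribeAlarms",
        "cloudwatch:ListMetrics",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetInsightRuleReport",
    ],
    "Resource": "*",
}

LOGS_STATEMENT = {
    "Sid": "AllowReadingLogsFromCloudWatch",
    "Effect": "Allow",
    "Action": [
        "logs:DescribeLogGroups",
        "logs:GetLogGroupFields",
        "logs:StartQuery",
        "logs:StopQuery",
        "logs:GetQueryResults",
        "logs:GetLogEvents",
    ],
    "Resource": "*",
}

# Statements that appear in all three example policies.
SHARED_STATEMENTS = [
    {
        "Sid": "AllowReadingTagsInstancesRegionsFromEC2",
        "Effect": "Allow",
        "Action": ["ec2:DescribeTags", "ec2:DescribeInstances", "ec2:DescribeRegions"],
        "Resource": "*",
    },
    {
        "Sid": "AllowReadingResourcesForTags",
        "Effect": "Allow",
        "Action": "tag:GetResources",
        "Resource": "*",
    },
]


def build_policy(metrics: bool = True, logs: bool = True) -> dict:
    """Assemble a policy document from the statement blocks above (hypothetical helper)."""
    if not (metrics or logs):
        raise ValueError("select at least one of metrics or logs")
    statements = []
    if metrics:
        statements.append(METRICS_STATEMENT)
    if logs:
        statements.append(LOGS_STATEMENT)
    statements.extend(SHARED_STATEMENTS)
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Print the metrics-only variant as JSON, ready to paste into an IAM policy.
    print(json.dumps(build_policy(metrics=True, logs=False), indent=2))
```

Calling `build_policy()` with both flags left at their defaults yields the same four statements as the "Metrics and Logs" policy.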
Cross-account observability:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["oam:ListSinks", "oam:ListAttachedLinks"],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
AWS CloudWatch data source configuration examples
Example AWS CloudWatch configurations for each authentication method are shown below:
AWS SDK (default):
apiVersion: 1
datasources:
  - name: CloudWatch
    type: cloudwatch
    jsonData:
      authType: default
      defaultRegion: eu-west-2
Using a credentials file:
apiVersion: 1

datasources:
  - name: CloudWatch
    type: cloudwatch
    jsonData:
      authType: credentials
      defaultRegion: eu-west-2
      customMetricsNamespaces: 'CWAgent,CustomNameSpace'
      profile: secondary
Using AK&SK (access key and secret key):
apiVersion: 1

datasources:
  - name: CloudWatch
    type: cloudwatch
    jsonData:
      authType: keys
      defaultRegion: eu-west-2
    secureJsonData:
      accessKey: '<your access key>'
      secretKey: '<your secret key>'
Using AWS SDK Default and assuming an IAM role by its ARN:
apiVersion: 1
datasources:
  - name: CloudWatch
    type: cloudwatch
    jsonData:
      authType: default
      assumeRoleArn: 'arn:aws:iam::123456789012:role/<your role name>'
      defaultRegion: eu-west-2
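Because these files are written by hand, a quick sanity check before handing them to Grafana can catch simple mistakes. The sketch below is a hypothetical linter, not Grafana's own validation: it only encodes what the examples in this article show (the three `authType` values, and `secureJsonData` keys for `keys` auth) plus the standard shape of an IAM role ARN, `arn:aws:iam::<account-id>:role/<role-name>`. It takes one entry of `datasources` as an already-parsed Python dict.

```python
import re

# authType values that appear in the provisioning examples in this article.
KNOWN_AUTH_TYPES = {"default", "credentials", "keys"}

# Shape of an IAM role ARN: arn:<partition>:iam::<12-digit account id>:role/<name>
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


def check_datasource(ds: dict) -> list:
    """Return a list of human-readable problems found in one datasource entry.

    Hypothetical helper for illustration; an empty list means nothing was flagged.
    """
    problems = []
    json_data = ds.get("jsonData", {})
    secure = ds.get("secureJsonData", {})

    if ds.get("type") != "cloudwatch":
        problems.append("type is not 'cloudwatch'")

    auth_type = json_data.get("authType")
    if auth_type not in KNOWN_AUTH_TYPES:
        problems.append(f"unexpected authType: {auth_type!r}")

    # Key-based auth needs both secrets under secureJsonData.
    if auth_type == "keys":
        for key in ("accessKey", "secretKey"):
            if not secure.get(key):
                problems.append(f"authType 'keys' needs secureJsonData.{key}")

    # If a role is to be assumed, the value should look like a role ARN.
    role_arn = json_data.get("assumeRoleArn")
    if role_arn and not ROLE_ARN_PATTERN.match(role_arn):
        problems.append("assumeRoleArn does not look like an IAM role ARN")

    return problems


if __name__ == "__main__":
    example = {
        "name": "CloudWatch",
        "type": "cloudwatch",
        "jsonData": {"authType": "keys", "defaultRegion": "eu-west-2"},
        "secureJsonData": {"accessKey": "<your access key>"},
    }
    for problem in check_datasource(example):
        print(problem)
```

With this check, an `assumeRoleArn` that names the account root (ending in `:root`) rather than a role would be reported, as would a `keys` configuration that is missing one of its two secrets.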
Built-in CloudWatch dashboards
The few dashboards that ship with CloudWatch are not very convenient to use; consider using monitoringartist/grafana-aws-cloudwatch-dashboards instead.
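Queries for creating alerts
Alerting requires a query that returns numeric data, and CloudWatch Logs supports such queries. For example, you can enable alerting by using the stats command.
The following is a valid query for alerting on messages that contain the text "Exception":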
filter @message like /Exception/
    | stats count(*) as exceptionCount by bin(1h)
    | sort exceptionCount desc
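To make it concrete why this query yields numeric data that an alert rule can evaluate, here is a local Python sketch of what its three stages compute. It does not call CloudWatch. The function name and sample events are made up for illustration, and the sketch assumes that `bin(1h)` groups events by the clock hour they fall in.

```python
import re
from collections import Counter
from datetime import datetime, timezone


def exception_counts_by_hour(events):
    """Mimic the query locally on (timestamp, message) pairs (illustrative only)."""
    pattern = re.compile(r"Exception")
    counts = Counter()
    for timestamp, message in events:
        # filter @message like /Exception/
        if pattern.search(message):
            # bin(1h): truncate the timestamp to the start of its hour
            hour = timestamp.replace(minute=0, second=0, microsecond=0)
            # stats count(*) as exceptionCount
            counts[hour] += 1
    # sort exceptionCount desc
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    utc = timezone.utc
    sample_events = [
        (datetime(2024, 7, 24, 10, 5, tzinfo=utc), "NullPointerException in handler"),
        (datetime(2024, 7, 24, 10, 40, tzinfo=utc), "request ok"),
        (datetime(2024, 7, 24, 10, 59, tzinfo=utc), "TimeoutException calling backend"),
        (datetime(2024, 7, 24, 11, 15, tzinfo=utc), "Exception: disk full"),
    ]
    for hour, exception_count in exception_counts_by_hour(sample_events):
        print(hour.isoformat(), exception_count)
```

Each row pairs a time bin with one number, `exceptionCount`, which is the kind of numeric series an alert condition can compare against a threshold.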
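Cross-account observability
The CloudWatch plugin lets you monitor and troubleshoot applications across multiple accounts within a region. With cross-account observability, you can seamlessly search, visualize, and analyze metrics and logs without worrying about account boundaries.
To use this feature, configure a monitoring account and a source account under the CloudWatch settings in the AWS console, then add the necessary IAM permissions as described above.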