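Background
Alibaba Cloud CloudMonitor monitors Alibaba Cloud resources and internet applications. It supports two alerting modes, threshold alerts and event alerts, and can deliver notifications through multiple channels. You can configure Log Service (SLS) open alerts as one of those channels, so that the SLS alerting system handles noise reduction, notification, and other processing.
Connecting CloudMonitor to SLS
Connecting CloudMonitor alert messages to SLS takes two steps: create an open alert application in SLS, then configure the SLS open alert as a webhook on a CloudMonitor contact. For the detailed steps to create an open alert application, see the article "Introduction to SLS Open Alerts". The rest of this article describes how to route CloudMonitor alert messages into SLS.
Getting the callback URL
After creating the open alert application, click the interface button to open the window that displays the callback URL, as shown in the figure below.
The callback URL consists of two parts: a domain and a subpath. The domain is the SLS endpoint, which is region-specific; each region has its own endpoint. The subpath contains the Access Key Id used to send messages and the ID of the open alert application. A complete SLS callback URL looks like this: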
cn-heyuan-intranet.log.aliyuncs.com/event/webhook/RAMAK_{ACCESS_KEY_ID}/a123_asdad
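Here "cn-heyuan-intranet.log.aliyuncs.com" is the domain part, a standard SLS endpoint, and event/webhook/RAMAK_{ACCESS_KEY_ID}/a123_asdad is the subpath. Note that you must replace {ACCESS_KEY_ID} in the subpath with the Access Key Id of an actual Alibaba Cloud RAM account and grant that account the AliyunLogOpenEventWrite permission policy. a123_asdad is the ID of the open alert application and uniquely identifies it.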
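As a rough illustration of how the two parts fit together, the following Python sketch assembles a callback URL from an endpoint, an Access Key Id, and an application ID. The function name `build_sls_callback_url` is hypothetical, and the `https` scheme is an assumption, since the example address above does not show a scheme.

```python
from urllib.parse import quote


def build_sls_callback_url(endpoint: str, access_key_id: str, app_id: str,
                           scheme: str = "https") -> str:
    """Assemble an SLS open alert callback URL from its parts.

    The scheme is an assumption; the subpath layout follows the example
    address shown in this article.
    """
    # Percent-encode the variable segments so they cannot break the path.
    key_segment = quote(access_key_id, safe="")
    app_segment = quote(app_id, safe="")
    return f"{scheme}://{endpoint}/event/webhook/RAMAK_{key_segment}/{app_segment}"


if __name__ == "__main__":
    # EXAMPLE_KEY_ID is a placeholder, not a real Access Key Id.
    print(build_sls_callback_url(
        "cn-heyuan-intranet.log.aliyuncs.com", "EXAMPLE_KEY_ID", "a123_asdad"))
```

Keeping the endpoint as a separate argument makes it easier to switch regions, because only the domain part changes from one region to another.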
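CloudMonitor configuration
There are two ways to route CloudMonitor alert messages to SLS open alerts: configure the webhook callback URL on a contact, or configure the callback URL on a rule.
Configuring a CloudMonitor contact
On the CloudMonitor contact management page, click to create a new contact or open an existing one, edit the Webhook (http|https) or DingTalk chatbot field, enter the SLS open alert callback URL, and click Confirm.
Configuring a CloudMonitor contact group
On the CloudMonitor contact management page, click to create a new contact group or open an existing one, and add the alert contact configured above to the group.
Configuring a CloudMonitor rule
On the CloudMonitor rule management page, click to create an alert rule or open an existing one, and add the contact group above to the notification recipients. Alternatively, skip the contact group and fill in the alert callback setting of the rule with the callback URL obtained earlier.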
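Mapping rules
CloudMonitor alerts come in two kinds, threshold alerts and event alerts, and the two message types use different formats.
Threshold alert mapping rules
CloudMonitor sends threshold alert messages in form format. Converted to JSON, a sample message looks like this: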
{ "alertName": "連接數", "alertState": "ALERT", "curValue": "4.5", "dimensions": "{instanceId=i-bp1d7111111115htda, state=TCP_TOTAL, userId=11596111111355}", "expression": "$Average>=1", "instanceName": "launch-advisor-20210607/11.11.111.111", "lastTime": "27天19小時47分鐘", "metricName": "Host.tcpconnection", "metricProject": "acs_ecs", "namespace": "acs_ecs", "preTriggerLevel": "WARN", "productGroupName": "null", "rawMetricName": "net_tcpconnection", "regionId": "cn-hangzhou", "regionName": "華東1(杭州)", "ruleId": "i-bp11111111115111_111111-0703-4811-9113-1c1111111111", "signature": "F111111w1111qN1111bw=", "timestamp": "1625455812126", "triggerLevel": "WARN", "userId": "11596111111355" }
It is converted into the following SLS alert message:
{ "aliuid": "aliuid1", "alert_instance_id": "", "alert_id": "i-bp11111111115111_111111-0703-4811-9113-1c1111111111", "alert_type": "sls_pub", "alert_name": "連接數", "region": "cn-hangzhou", "project": "sls-alert--", "project_id": 0, "next_eval_interval": 0, "alert_time": 1625455812, "fire_time": 1625455812, "fire_results": null, "fire_results_count": 0, "resolve_time": 0, "status": "firing", "results": null, "labels": { "instanceId": "i-bp1d7111111115htda", "namespace": "acs_ecs", "regionId": "cn-hangzhou", "state": "TCP_TOTAL", "userId": "11596111111355" }, "annotations": { "__cloud_monitor_type__": "threshold", "__config_app__": "sls_pub_alert", "__pub_alert_app__": "appid1", "__pub_alert_protocol__": "cloud_monitor", "__pub_alert_region__": "e", "__pub_alert_service__": "serverid1", "curValue": "4.5", "desc": "Host.tcpconnection $Average>=1 持續: 27天19小時47分鐘, 詳情: {instanceId=i-bp1d7111111115htda, state=TCP_TOTAL, userId=11596111111355}", "expression": "$Average\u003e=1", "instanceName": "launch-advisor-20210607/11.11.1111.1111", "lastTime": "27天19小時47分鐘", "metricName": "Host.tcpconnection", "metricProject": "acs_ecs", "namespace": "acs_ecs", "preTriggerLevel": "WARN", "rawMetricName": "net_tcpconnection", "title": "acs_ecs Host.tcpconnection 當前值: 4.5" }, "severity": 6, "policy": { "alert_policy_id": "", "action_policy_id": "", "use_default": false, "repeat_interval": "0s" }, "template": null, "drill_down_query": "https://cloudmonitor.console.aliyun.com/index.htm#/alarmInfo/name=i-bp11111111115111_111111-0703-4811-9113-1c1111111111\u0026searchValue=\u0026searchType=name\u0026searchProduct=/history//" }
For the detailed conversion rules, see the official documentation.
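To make the threshold mapping more concrete, here is a minimal Python sketch that reproduces a few of the fields visible in the sample above: the form body is decoded, the `dimensions` string is split into labels, and the millisecond `timestamp` becomes a second-resolution `alert_time`. The function names are hypothetical, and the mapping is inferred only from the sample pair shown here; it is not the service's actual implementation and does not cover every field.

```python
from urllib.parse import parse_qsl, urlencode


def parse_dimensions(raw: str) -> dict:
    """Turn a string like '{k1=v1, k2=v2}' into {'k1': 'v1', 'k2': 'v2'}.

    Assumes values contain no commas, which holds for the sample message.
    """
    inner = raw.strip().strip("{}")
    pairs = (item.split("=", 1) for item in inner.split(",") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def threshold_to_sls_fields(form_body: str) -> dict:
    """Map a form-encoded threshold alert to a subset of SLS alert fields."""
    msg = dict(parse_qsl(form_body, keep_blank_values=True))
    # In the sample, labels hold the dimensions plus namespace and regionId.
    labels = parse_dimensions(msg.get("dimensions", ""))
    labels["namespace"] = msg.get("namespace", "")
    labels["regionId"] = msg.get("regionId", "")
    seconds = int(msg["timestamp"]) // 1000  # milliseconds to seconds
    return {
        "alert_id": msg.get("ruleId", ""),
        "alert_name": msg.get("alertName", ""),
        "region": msg.get("regionId", ""),
        "alert_time": seconds,
        "fire_time": seconds,
        "labels": labels,
    }


if __name__ == "__main__":
    # A shortened, made-up form body shaped like the sample message.
    body = urlencode({
        "alertName": "connections",
        "dimensions": "{instanceId=i-example, state=TCP_TOTAL}",
        "namespace": "acs_ecs",
        "regionId": "cn-hangzhou",
        "ruleId": "rule-example",
        "timestamp": "1625455812126",
    })
    print(threshold_to_sls_fields(body))
```

Fields such as `severity`, `status`, and `annotations` are left out on purpose, because the single sample does not show how values other than those it contains are translated.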
Event alert mapping rules
CloudMonitor sends event messages in JSON format, as shown below:
{ "traceId": "411112-c49d-4143-a38e-c111159e-0", "resourceId": "acs:ecs:cn-hangzhou:115111111111355:instance/i-bp1d71111111x15htda", "product": "ECS", "ver": "1.0", "instanceName": "launch-advisor-20210607", "level": "INFO", "userId": "115111111111355", "content": { "resourceId": "i-bp1d7411111111g111htda", "publicIpAddress": "127.0.0.1", "instanceName": "launch-advisor-20210607", "state": "Running", "privateIpAddress": "127.0.0.1", "resourceType": "ALIYUN::ECS::Instance" }, "regionId": "cn-hangzhou", "eventTime": "20210705T113013.398+0800", "name": "Instance:StateChange", "id": "26111205-51113-4D118-8119-3111113CB735", "timeMetrics": { "ingestion_in_time": 1625455813563, "ingestion_out_time": 1625455816000, "notify_in_time": 1625455819578, "engine_in_time": 1625455816467, "event_time": 1625455813398, "engine_out_time": 1625455818000 }, "status": "Normal" }
It is converted into the following SLS alert message:
{ "aliuid": "aliuid1", "alert_instance_id": "26111205-51113-4D118-8119-3111113CB735", "alert_id": "Instance:StateChange", "alert_type": "sls_pub", "alert_name": "Instance:StateChange", "region": "cn-hangzhou", "project": "sls-alert--", "project_id": 0, "next_eval_interval": 0, "alert_time": 1625455813, "fire_time": 1625743445, "fire_results": null, "fire_results_count": 0, "resolve_time": 0, "status": "firing", "results": null, "labels": { "resourceId": "acs:ecs:cn-hangzhou:115111111111355:instance/i-bp1d71111111x15htda" }, "annotations": { "__cloud_monitor_type__": "event", "__config_app__": "sls_pub_alert", "__pub_alert_app__": "appid1", "__pub_alert_protocol__": "cloud_monitor", "__pub_alert_region__": "e", "__pub_alert_service__": "serverid1", "content_instanceName": "launch-advisor-20210607", "content_privateIpAddress": "127.0.0.1", "content_publicIpAddress": "127.0.0.1", "content_resourceId": "i-bp1d7411111111g111htda", "content_resourceType": "ALIYUN::ECS::Instance", "content_state": "Running", "desc": "事件Instance:StateChange觸發, 詳情: {\"instanceName\":\"launch-advisor-20210607\",\"privateIpAddress\":\"127.0.0.1\",\"publicIpAddress\":\"127.0.0.1\",\"resourceId\":\"i-bp1d7411111111g111htda\",\"resourceType\":\"ALIYUN::ECS::Instance\",\"state\":\"Running\"}", "instanceName": "launch-advisor-20210607", "level": "INFO", "product": "ECS", "status": "Normal", "title": "Instance:StateChange: Normal", "traceId": "411112-c49d-4143-a38e-c111159e-0", "userId": "115111111111355" }, "severity": 4, "policy": { "alert_policy_id": "", "action_policy_id": "", "use_default": false, "repeat_interval": "0s" }, "template": null, "drill_down_query": "https://cloudmonitor.console.aliyun.com/index.htm#/eventmonitoring/events/detail?product=ECS\u0026eventName=Instance:StateChange" }
For the detailed conversion rules, see the official documentation.
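The event mapping can be sketched in a similar way. In the sample above, the keys of `content` reappear as annotations with a `content_` prefix, the event `id` becomes `alert_instance_id`, and the event `name` is used for both `alert_id` and `alert_name`. The Python function below, whose name is hypothetical, reproduces only those observations. It is inferred from this one sample, and deriving `alert_time` from `eventTime` is an assumption (the sample is equally consistent with `timeMetrics.event_time`).

```python
from datetime import datetime


def event_to_sls_fields(event: dict) -> dict:
    """Map a CloudMonitor event message to a subset of SLS alert fields."""
    # Flatten the nested content object into content_-prefixed annotations.
    annotations = {
        f"content_{key}": str(value)
        for key, value in event.get("content", {}).items()
    }
    # Top-level fields that appear unchanged among the sample annotations.
    for key in ("instanceName", "level", "product", "status", "traceId", "userId"):
        if key in event:
            annotations[key] = str(event[key])
    annotations["title"] = f"{event['name']}: {event.get('status', '')}"

    # eventTime looks like 20210705T113013.398+0800 in the sample.
    event_dt = datetime.strptime(event["eventTime"], "%Y%m%dT%H%M%S.%f%z")
    return {
        "alert_instance_id": event.get("id", ""),
        "alert_id": event["name"],
        "alert_name": event["name"],
        "region": event.get("regionId", ""),
        "alert_time": int(event_dt.timestamp()),
        "labels": {"resourceId": event.get("resourceId", "")},
        "annotations": annotations,
    }


if __name__ == "__main__":
    # A shortened event shaped like the sample message; values are made up.
    sample = {
        "id": "event-example",
        "name": "Instance:StateChange",
        "status": "Normal",
        "level": "INFO",
        "product": "ECS",
        "regionId": "cn-hangzhou",
        "resourceId": "acs:ecs:cn-hangzhou:0000:instance/i-example",
        "eventTime": "20210705T113013.398+0800",
        "content": {"state": "Running", "resourceType": "ALIYUN::ECS::Instance"},
    }
    print(event_to_sls_fields(sample))
```

Flattening `content` with a prefix keeps every annotation a flat string key and value, which matches the shape of the annotations in the converted message.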
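Summary
By routing CloudMonitor alert messages into SLS, you can take full advantage of the alerting features SLS provides, such as noise reduction and notification, and so understand and handle service problems more efficiently.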