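Integrating CloudMonitor with Log Service (SLS) Open Alert

Background

Alibaba Cloud CloudMonitor monitors Alibaba Cloud resources and Internet applications. It offers two alerting modes, threshold alerts and event alerts, and supports multiple notification channels. You can configure Log Service (SLS) Open Alert as one of those channels, so that the SLS alerting system takes over alert noise reduction, notification, and related processing.

Connecting CloudMonitor to SLS

Ingesting CloudMonitor alert messages into SLS takes two main steps: create an Open Alert application in SLS, then configure the SLS Open Alert callback address as a Webhook on a CloudMonitor contact. For the detailed steps to create an Open Alert application, see the article "Introduction to SLS Open Alert". The rest of this article describes how to ingest CloudMonitor alert messages into SLS.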

Getting the callback address

After creating the Open Alert application, click the Interface button to open the window that displays the callback address.

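The callback address has two parts: a domain part and a sub-path part. The domain part is the SLS endpoint, which is region-specific: each region has its own endpoint. The sub-path part contains the Access Key Id used to send messages and the ID of the Open Alert application. A complete SLS callback address looks like this: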

cn-heyuan-intranet.log.aliyuncs.com/event/webhook/RAMAK_{ACCESS_KEY_ID}/a123_asdad

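Here "cn-heyuan-intranet.log.aliyuncs.com" is the domain part, which is the general SLS endpoint, and event/webhook/RAMAK_{ACCESS_KEY_ID}/a123_asdad is the sub-path part. Note that you must replace {ACCESS_KEY_ID} in the sub-path with the Access Key Id of an actual Alibaba Cloud RAM account, and grant that account the AliyunLogOpenEventWrite permission policy. a123_asdad is the ID of the Open Alert application, which uniquely distinguishes it from other Open Alert applications.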
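As a minimal sketch of how the two parts fit together, the helper below assembles a callback address from an endpoint, an Access Key Id, and an application ID. The function name build_callback_url and the default "https" scheme are assumptions made for illustration; the address shown above carries no scheme, so use whatever the callback address window in the console displays.

```python
def build_callback_url(endpoint: str, access_key_id: str, app_id: str,
                       scheme: str = "https") -> str:
    """Assemble an SLS Open Alert callback address from its parts.

    The "https" default is an assumption; the article's example omits the scheme.
    """
    parts = (("endpoint", endpoint),
             ("access_key_id", access_key_id),
             ("app_id", app_id))
    for name, value in parts:
        # Each part is a single path or host segment, so reject empty values
        # and embedded slashes that would change the path layout.
        if not value or "/" in value:
            raise ValueError(f"{name} must be non-empty and must not contain '/'")
    # Sub-path layout taken from the example: event/webhook/RAMAK_<key id>/<app id>
    return f"{scheme}://{endpoint}/event/webhook/RAMAK_{access_key_id}/{app_id}"
```

For example, calling build_callback_url("cn-heyuan-intranet.log.aliyuncs.com", "YOUR_KEY_ID", "a123_asdad") yields the address above with the placeholder filled in, where YOUR_KEY_ID stands for the Access Key Id of the RAM account that holds AliyunLogOpenEventWrite.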

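CloudMonitor configuration

There are two ways to ingest CloudMonitor alert messages into SLS Open Alert: configure the webhook callback address on a contact, or configure the callback address in a rule.

Configuring a CloudMonitor contact

On the CloudMonitor contact management page, create a contact or open an existing one, edit the "Webhook (http|https) or DingTalk robot" field, enter the SLS Open Alert callback address, and click Confirm.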

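Configuring a CloudMonitor contact group

On the CloudMonitor contact management page, create a contact group or open an existing one, and add the alert contact configured above to the group.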

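Configuring a CloudMonitor rule

On the CloudMonitor rule management page, create an alert rule or open an existing one, and add the contact group above to the notification targets. Alternatively, skip the contact group and enter the callback address obtained earlier in the rule's alert callback setting.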

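Mapping rules

CloudMonitor alerts come in two kinds, threshold alerts and event alerts, and the two message types use different formats.

Threshold alert mapping rules

CloudMonitor sends threshold alert messages in form format. Converted to JSON, a sample message looks like this: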

{
    "alertName": "連接數",
    "alertState": "ALERT",
    "curValue": "4.5",
    "dimensions": "{instanceId=i-bp1d7111111115htda, state=TCP_TOTAL, userId=11596111111355}",
    "expression": "$Average>=1",
    "instanceName": "launch-advisor-20210607/11.11.111.111",
    "lastTime": "27天19小時47分鐘",
    "metricName": "Host.tcpconnection",
    "metricProject": "acs_ecs",
    "namespace": "acs_ecs",
    "preTriggerLevel": "WARN",
    "productGroupName": "null",
    "rawMetricName": "net_tcpconnection",
    "regionId": "cn-hangzhou",
    "regionName": "華東1(杭州)",
    "ruleId": "i-bp11111111115111_111111-0703-4811-9113-1c1111111111",
    "signature": "F111111w1111qN1111bw=",
    "timestamp": "1625455812126",
    "triggerLevel": "WARN",
    "userId": "11596111111355"
}
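The form-to-JSON step can be sketched with the Python standard library. The helper names form_to_dict and parse_dimensions are hypothetical, and the sketch assumes the body is a standard URL-encoded form (application/x-www-form-urlencoded) and that "dimensions" always has the "{key=value, key=value}" shape seen in the sample; neither assumption is confirmed by the article.

```python
from urllib.parse import parse_qs


def form_to_dict(body: str) -> dict:
    """Decode a URL-encoded form body into a flat dict of strings."""
    parsed = parse_qs(body, keep_blank_values=True)
    # parse_qs returns a list per key; keep the last value if a key repeats.
    return {key: values[-1] for key, values in parsed.items()}


def parse_dimensions(text: str) -> dict:
    """Split a '{k1=v1, k2=v2}' string (shape assumed from the sample) into a dict."""
    inner = text.strip()
    if inner.startswith("{") and inner.endswith("}"):
        inner = inner[1:-1]
    pairs = {}
    for item in inner.split(","):
        if "=" not in item:
            continue  # skip empty or malformed fragments
        key, value = item.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs
```

Passing the decoded message's "dimensions" value to parse_dimensions gives the instanceId, state, and userId entries as separate keys, which is convenient when comparing against the labels of the converted message.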

The threshold alert message above is converted into the following SLS alert message:

{
    "aliuid": "aliuid1",
    "alert_instance_id": "",
    "alert_id": "i-bp11111111115111_111111-0703-4811-9113-1c1111111111",
    "alert_type": "sls_pub",
    "alert_name": "連接數",
    "region": "cn-hangzhou",
    "project": "sls-alert--",
    "project_id": 0,
    "next_eval_interval": 0,
    "alert_time": 1625455812,
    "fire_time": 1625455812,
    "fire_results": null,
    "fire_results_count": 0,
    "resolve_time": 0,
    "status": "firing",
    "results": null,
    "labels": {
        "instanceId": "i-bp1d7111111115htda",
        "namespace": "acs_ecs",
        "regionId": "cn-hangzhou",
        "state": "TCP_TOTAL",
        "userId": "11596111111355"
    },
    "annotations": {
        "__cloud_monitor_type__": "threshold",
        "__config_app__": "sls_pub_alert",
        "__pub_alert_app__": "appid1",
        "__pub_alert_protocol__": "cloud_monitor",
        "__pub_alert_region__": "e",
        "__pub_alert_service__": "serverid1",
        "curValue": "4.5",
        "desc": "Host.tcpconnection $Average>=1 持續: 27天19小時47分鐘, 詳情: {instanceId=i-bp1d7111111115htda, state=TCP_TOTAL, userId=11596111111355}",
        "expression": "$Average\u003e=1",
        "instanceName": "launch-advisor-20210607/11.11.111.111",
        "lastTime": "27天19小時47分鐘",
        "metricName": "Host.tcpconnection",
        "metricProject": "acs_ecs",
        "namespace": "acs_ecs",
        "preTriggerLevel": "WARN",
        "rawMetricName": "net_tcpconnection",
        "title": "acs_ecs Host.tcpconnection 當前值: 4.5"
    },
    "severity": 6,
    "policy": {
        "alert_policy_id": "",
        "action_policy_id": "",
        "use_default": false,
        "repeat_interval": "0s"
    },
    "template": null,
    "drill_down_query": "https://cloudmonitor.console.aliyun.com/index.htm#/alarmInfo/name=i-bp11111111115111_111111-0703-4811-9113-1c1111111111\u0026searchValue=\u0026searchType=name\u0026searchProduct=/history//"
}

For the detailed conversion rules, see the official documentation.
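The sketch below reproduces a few of the fields in the threshold example, inferred only from the sample input and output shown above rather than from the official rules. The function names are hypothetical, the "resolved" branch is a guess (the example only shows ALERT becoming "firing"), and fields such as severity, annotations, and drill_down_query are deliberately left out.

```python
def parse_dimensions(text: str) -> dict:
    """Split a '{k1=v1, k2=v2}' string (shape assumed from the sample) into a dict."""
    labels = {}
    for part in text.strip().strip("{}").split(","):
        if "=" in part:
            key, value = part.split("=", 1)
            labels[key.strip()] = value.strip()
    return labels


def threshold_to_sls(msg: dict) -> dict:
    """Map a CloudMonitor threshold alert to a partial SLS alert message."""
    # The sample timestamp is in milliseconds; the SLS times are in seconds.
    seconds = int(msg["timestamp"]) // 1000
    # In the sample, labels hold the dimensions plus namespace and regionId.
    labels = parse_dimensions(msg.get("dimensions", ""))
    labels["namespace"] = msg["namespace"]
    labels["regionId"] = msg["regionId"]
    return {
        "alert_id": msg["ruleId"],
        "alert_name": msg["alertName"],
        "region": msg["regionId"],
        "alert_time": seconds,
        "fire_time": seconds,
        # Assumption: any state other than ALERT is treated as resolved.
        "status": "firing" if msg["alertState"] == "ALERT" else "resolved",
        "labels": labels,
    }
```

Comparing the result for the sample message with the converted message above is a quick way to check which fields a downstream consumer can rely on.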

Event alert mapping rules

CloudMonitor sends event messages in JSON format, as shown below:

{
    "traceId": "411112-c49d-4143-a38e-c111159e-0",
    "resourceId": "acs:ecs:cn-hangzhou:115111111111355:instance/i-bp1d71111111x15htda",
    "product": "ECS",
    "ver": "1.0",
    "instanceName": "launch-advisor-20210607",
    "level": "INFO",
    "userId": "115111111111355",
    "content": {
        "resourceId": "i-bp1d7411111111g111htda",
        "publicIpAddress": "127.0.0.1",
        "instanceName": "launch-advisor-20210607",
        "state": "Running",
        "privateIpAddress": "127.0.0.1",
        "resourceType": "ALIYUN::ECS::Instance"
    },
    "regionId": "cn-hangzhou",
    "eventTime": "20210705T113013.398+0800",
    "name": "Instance:StateChange",
    "id": "26111205-51113-4D118-8119-3111113CB735",
    "timeMetrics": {
        "ingestion_in_time": 1625455813563,
        "ingestion_out_time": 1625455816000,
        "notify_in_time": 1625455819578,
        "engine_in_time": 1625455816467,
        "event_time": 1625455813398,
        "engine_out_time": 1625455818000
    },
    "status": "Normal"
}

The event message above is converted into the following SLS alert message:

{
    "aliuid": "aliuid1",
    "alert_instance_id": "26111205-51113-4D118-8119-3111113CB735",
    "alert_id": "Instance:StateChange",
    "alert_type": "sls_pub",
    "alert_name": "Instance:StateChange",
    "region": "cn-hangzhou",
    "project": "sls-alert--",
    "project_id": 0,
    "next_eval_interval": 0,
    "alert_time": 1625455813,
    "fire_time": 1625743445,
    "fire_results": null,
    "fire_results_count": 0,
    "resolve_time": 0,
    "status": "firing",
    "results": null,
    "labels": {
        "resourceId": "acs:ecs:cn-hangzhou:115111111111355:instance/i-bp1d71111111x15htda"
    },
    "annotations": {
        "__cloud_monitor_type__": "event",
        "__config_app__": "sls_pub_alert",
        "__pub_alert_app__": "appid1",
        "__pub_alert_protocol__": "cloud_monitor",
        "__pub_alert_region__": "e",
        "__pub_alert_service__": "serverid1",
        "content_instanceName": "launch-advisor-20210607",
        "content_privateIpAddress": "127.0.0.1",
        "content_publicIpAddress": "127.0.0.1",
        "content_resourceId": "i-bp1d7411111111g111htda",
        "content_resourceType": "ALIYUN::ECS::Instance",
        "content_state": "Running",
        "desc": "事件Instance:StateChange觸發, 詳情: {\"instanceName\":\"launch-advisor-20210607\",\"privateIpAddress\":\"127.0.0.1\",\"publicIpAddress\":\"127.0.0.1\",\"resourceId\":\"i-bp1d7411111111g111htda\",\"resourceType\":\"ALIYUN::ECS::Instance\",\"state\":\"Running\"}",
        "instanceName": "launch-advisor-20210607",
        "level": "INFO",
        "product": "ECS",
        "status": "Normal",
        "title": "Instance:StateChange: Normal",
        "traceId": "411112-c49d-4143-a38e-c111159e-0",
        "userId": "115111111111355"
    },
    "severity": 4,
    "policy": {
        "alert_policy_id": "",
        "action_policy_id": "",
        "use_default": false,
        "repeat_interval": "0s"
    },
    "template": null,
    "drill_down_query": "https://cloudmonitor.console.aliyun.com/index.htm#/eventmonitoring/events/detail?product=ECS\u0026eventName=Instance:StateChange"
}

For the detailed conversion rules, see the official documentation.
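The event example suggests a simple pattern: the nested "content" object is flattened into annotations with a "content_" prefix, and a handful of top-level fields are copied over. The sketch below illustrates that pattern under stated assumptions. The function name event_to_sls is hypothetical, taking alert_time from timeMetrics.event_time is a guess (the eventTime string encodes the same instant), and fire_time, severity, desc, and the remaining fields are omitted because the sample alone does not determine them.

```python
def event_to_sls(event: dict) -> dict:
    """Map a CloudMonitor event to a partial SLS alert message."""
    # Flatten the nested content object: state -> content_state, and so on.
    annotations = {
        f"content_{key}": str(value)
        for key, value in event.get("content", {}).items()
    }
    # Top-level fields that appear unchanged in the sample annotations.
    for key in ("instanceName", "level", "product", "status", "traceId", "userId"):
        if key in event:
            annotations[key] = str(event[key])
    annotations["title"] = f"{event['name']}: {event['status']}"
    return {
        "alert_instance_id": event["id"],
        "alert_id": event["name"],
        "alert_name": event["name"],
        "region": event["regionId"],
        # Assumption: milliseconds in timeMetrics.event_time, seconds in SLS.
        "alert_time": event["timeMetrics"]["event_time"] // 1000,
        "labels": {"resourceId": event["resourceId"]},
        "annotations": annotations,
    }
```

Running this on the sample event and comparing the output with the converted message above shows which parts of the mapping the sketch covers and which still need the official rules.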

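Summary

By ingesting CloudMonitor alert messages into SLS, you can take full advantage of the alerting features SLS provides, and understand and handle service problems more efficiently.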