Alibaba Cloud Function Compute: Triggering Functions from CloudMonitor

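CloudMonitor

Alibaba Cloud CloudMonitor gives cloud users an out-of-the-box, enterprise-grade, open, one-stop monitoring solution. It covers basic monitoring of IT infrastructure, probe-based monitoring of public network quality, and business monitoring based on events, custom metrics, and logs. The goal is monitoring that is more efficient, more comprehensive, and cheaper.

CloudMonitor provides system event monitoring for a wide range of cloud products, and the set of supported events keeps growing. These events can trigger functions that run custom handling logic, which makes it possible to automate custom processing of the related cloud resources.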

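Example scenario

Suppose an ECS instance reboots because of a system error. Operations staff or the ECS user might have to respond urgently and manually run some checks or create snapshots. In this example, CloudMonitor triggers a function that automatically handles an instance that rebooted because of a system error or an instance error, for example by creating snapshots once the reboot has completed successfully.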

Prerequisites

ECS system events

Cloud product system event monitoring

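Function code

In this example, an ECS reboot-finished event from CloudMonitor triggers the function. The function looks up the cloud disks attached to the ECS instance and creates a snapshot of each disk.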

    # -*- coding: utf-8 -*-
    import json
    import logging
    import random
    import string

    from aliyunsdkcore import client
    from aliyunsdkcore.auth.credentials import StsTokenCredential
    from aliyunsdkecs.request.v20140526.CreateSnapshotRequest import CreateSnapshotRequest
    from aliyunsdkecs.request.v20140526.DescribeDisksRequest import DescribeDisksRequest

    LOGGER = logging.getLogger()
    clt = None  # ECS client, created on each invocation in handler()


    def handler(event, context):
        '''
        Sample event delivered by CloudMonitor:
        {
          "product": "ECS",
          "content": {
            "executeFinishTime": "2018-06-08T01:25:37Z",
            "executeStartTime": "2018-06-08T01:23:37Z",
            "ecsInstanceName": "timewarp",
            "eventId": "e-t4nhcpqcu8fqushpn3mm",
            "eventType": "InstanceFailure.Reboot",
            "ecsInstanceId": "i-bp18l0uopocfc98xxxx"
          },
          "resourceId": "acs:ecs:cn-hangzhou:12345678:instance/i-bp18l0uopocfc98xxxx",
          "level": "CRITICAL",
          "instanceName": "instanceName",
          "status": "Executing",
          "name": "Instance:SystemFailure.Reboot:Executing",
          "regionId": "cn-hangzhou"
        }
        '''
        # Temporary credentials come from the role attached to the service.
        creds = context.credentials
        sts_token_credential = StsTokenCredential(
            creds.access_key_id, creds.access_key_secret, creds.security_token)

        evt = json.loads(event)
        content = evt.get("content")
        ecsInstanceId = content.get("ecsInstanceId")
        regionId = evt.get("regionId")

        global clt
        clt = client.AcsClient(region_id=regionId, credential=sts_token_credential)

        # Compare event names case-insensitively.
        name = evt.get("name").lower()

        if name in ['Instance:SystemFailure.Reboot:Executing'.lower(),
                    'Instance:InstanceFailure.Reboot:Executing'.lower()]:
            # The reboot has started; do other things here if needed.
            pass

        if name in ['Instance:SystemFailure.Reboot:Executed'.lower(),
                    'Instance:InstanceFailure.Reboot:Executed'.lower()]:
            # The reboot has finished: snapshot every disk attached to the instance.
            request = DescribeDisksRequest()
            # Use the region from the event instead of a hard-coded region.
            request.add_query_param("RegionId", regionId)
            request.set_InstanceId(ecsInstanceId)
            response = _send_request(request)
            disks = response.get('Disks').get('Disk', [])
            for disk in disks:
                diskId = disk["DiskId"]
                snapshotId = create_ecs_snap_by_id(diskId)
                LOGGER.info("Create ecs snap success, ecs id = %s, disk id = %s, snapshot id = %s",
                            ecsInstanceId, diskId, snapshotId)


    def create_ecs_snap_by_id(disk_id):
        LOGGER.info("Create ecs snap, disk id is %s", disk_id)
        request = CreateSnapshotRequest()
        request.set_DiskId(disk_id)
        # Snapshot name: "reboot_" plus six random lowercase letters.
        request.set_SnapshotName(
            "reboot_" + ''.join(random.choice(string.ascii_lowercase) for _ in range(6)))
        response = _send_request(request)
        return response.get("SnapshotId")


    # Send an OpenAPI request and return the decoded JSON response.
    def _send_request(request):
        request.set_accept_format('json')
        try:
            response_str = clt.do_action_with_exception(request)
            LOGGER.info(response_str)
            return json.loads(response_str)
        except Exception as e:
            LOGGER.error(e)
            # Re-raise so callers never receive None and the failure shows up
            # in the function's invocation result.
            raise
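The event-matching part of the handler can be checked without credentials or the ECS SDK. The following is a minimal sketch, assuming the event shape shown in the sample payload above; `classify_reboot_event`, `REBOOT_STARTED`, and `REBOOT_FINISHED` are hypothetical names introduced here for illustration and are not part of the original function.

```python
import json

# Event names taken from the handler, lower-cased for case-insensitive matching.
REBOOT_STARTED = {
    "instance:systemfailure.reboot:executing",
    "instance:instancefailure.reboot:executing",
}
REBOOT_FINISHED = {
    "instance:systemfailure.reboot:executed",
    "instance:instancefailure.reboot:executed",
}


def classify_reboot_event(event):
    """Hypothetical helper: return (instance_id, region_id, phase).

    phase is "started", "finished", or None for any other event name.
    """
    evt = json.loads(event)
    name = (evt.get("name") or "").lower()
    content = evt.get("content") or {}
    if name in REBOOT_STARTED:
        phase = "started"
    elif name in REBOOT_FINISHED:
        phase = "finished"
    else:
        phase = None
    return content.get("ecsInstanceId"), evt.get("regionId"), phase


# Trimmed copy of the sample payload from the handler's docstring.
SAMPLE_EVENT = json.dumps({
    "content": {"ecsInstanceId": "i-bp18l0uopocfc98xxxx"},
    "name": "Instance:SystemFailure.Reboot:Executing",
    "regionId": "cn-hangzhou",
})

print(classify_reboot_event(SAMPLE_EVENT))
# -> ('i-bp18l0uopocfc98xxxx', 'cn-hangzhou', 'started')
```

Keeping this logic in a pure function means the snapshot step only has to be exercised against a real account once the matching is known to work.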

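Steps

  • Create the function using the function code shown in this article. For how to create a function, see the Function Compute hello world guide.

    Note: remember to grant the role of the function's service permission to operate ECS.

  • Log in to the CloudMonitor console and create an alert rule. The monitored events are the start and end of ECS reboots caused by an instance error or a system error.

  • Debug with a mock event.

  • To simulate real ECS events, see the guide on rehearsing system event handlers.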
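For the mock debugging step, a test payload can be generated rather than edited by hand. This is a rough sketch that assumes the event shape from the sample payload in the function code; `build_mock_reboot_event` is a hypothetical helper, the account ID `12345678` is the placeholder from that sample, and real CloudMonitor events may carry additional fields.

```python
import json


def build_mock_reboot_event(instance_id, region_id, failure="SystemFailure",
                            status="Executed", account_id="12345678"):
    """Hypothetical helper: build a JSON test event for a reboot.

    failure: "SystemFailure" or "InstanceFailure"
    status:  "Executing" (reboot started) or "Executed" (reboot finished)
    """
    event = {
        "product": "ECS",
        "content": {
            "ecsInstanceId": instance_id,
            # Assumption: eventType mirrors the failure type used in the name.
            "eventType": failure + ".Reboot",
        },
        "resourceId": "acs:ecs:%s:%s:instance/%s" % (region_id, account_id, instance_id),
        "level": "CRITICAL",
        "status": status,
        "name": "Instance:%s.Reboot:%s" % (failure, status),
        "regionId": region_id,
    }
    return json.dumps(event, indent=2)


# Print a reboot-finished event that can be pasted in as the test event.
print(build_mock_reboot_event("i-bp18l0uopocfc98xxxx", "cn-hangzhou"))
```

Use an instance ID from your own account, because an "Executed" event causes the function to create real snapshots of that instance's disks.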

Original article, contributed by a user. If you repost it, please credit the source: https://www.cloudads.cn/archives/33686.html
