
SLO & SLI

Date: 2020-08-13 00:01:06

An error budget covers:

  • releasing new features
  • expected system changes
  • inevitable failures in hardware, the network, etc.
  • planned downtime
  • risky experiments

 

An error budget also helps to:

  • share responsibility for reliability between the Ops and Dev teams.
  • reduce feature iteration speed when our systems are unreliable.
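As a rough illustration of how an error budget can be quantified, the sketch below treats the budget as the complement of the SLO target (1 - target). The function names, the 99.9% target, the 30-day window, and the "slow releases once the budget is spent" policy are all assumptions made for this example, not values taken from the notes above.

```python
from datetime import timedelta


def allowed_downtime(slo_target: float, window: timedelta) -> timedelta:
    """Time-based error budget: the share of the window that may be 'bad'."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    return window * (1.0 - slo_target)


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Fraction of the event-based budget still unspent (negative if overspent)."""
    allowed_bad = total_events * (1.0 - slo_target)
    if allowed_bad == 0:
        # No budget at all (100% target or no traffic): any bad event overspends it.
        return 0.0 if bad_events == 0 else float("-inf")
    return 1.0 - bad_events / allowed_bad


def should_slow_releases(remaining: float) -> bool:
    """Hypothetical policy: slow feature work once the budget is spent."""
    return remaining <= 0.0


budget = allowed_downtime(0.999, timedelta(days=30))
remaining = budget_remaining(0.999, total_events=1_000_000, bad_events=250)
print(budget, round(remaining, 3), should_slow_releases(remaining))
```

Every item in the list above (releases, planned downtime, experiments, failures) draws on the same budget, which is why a single "remaining" number can drive the decision to slow down.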

 

Availability SLI

The proportion of valid requests served successfully. 

One commonly used signifier of success or failure is the status code of an HTTP or RPC response. This requires careful, accurate use of status codes within your system so that each code maps distinctly to either success or failure. 
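A minimal sketch of a status-code-based availability SLI might look like the following. The mapping is an assumption chosen for illustration: 4xx responses are treated as invalid requests (client errors) and excluded, 5xx responses count as failures, and everything else counts as success. A real system would need its own explicit mapping.

```python
from typing import Iterable, Optional


def is_valid(status: int) -> bool:
    """Assumption: client errors (4xx) are not 'valid' requests for this SLI."""
    return not 400 <= status < 500


def is_success(status: int) -> bool:
    """Assumption: only server errors (5xx) count as failures."""
    return status < 500


def availability_sli(status_codes: Iterable[int]) -> Optional[float]:
    """Proportion of valid requests served successfully, or None if none are valid."""
    valid = [s for s in status_codes if is_valid(s)]
    if not valid:
        return None
    good = sum(1 for s in valid if is_success(s))
    return good / len(valid)


# Hypothetical sample of response codes.
print(availability_sli([200, 200, 404, 500, 503, 200, 301]))
```

Keeping the two predicates separate makes the "valid" and "successful" decisions explicit, so each code falls into exactly one bucket.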

When status codes alone cannot separate success from failure, a reasonable strategy is to write the more complex logic as code and export a boolean availability measure to your SLO monitoring system, for use in a bad-minute style SLI.
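One plausible reading of a bad-minute style SLI is sketched below: the service exports a boolean availability sample, and a minute counts as bad if any sample within it was false. The sample format (Unix timestamp in seconds plus a boolean) and the "any false sample makes the minute bad" rule are assumptions for this example.

```python
from typing import Dict, Iterable, Optional, Tuple


def bad_minute_sli(samples: Iterable[Tuple[float, bool]]) -> Optional[float]:
    """Proportion of observed minutes in which every availability sample was True."""
    minute_ok: Dict[int, bool] = {}
    for timestamp, available in samples:
        minute = int(timestamp // 60)
        # A single False sample marks the whole minute as bad.
        minute_ok[minute] = minute_ok.get(minute, True) and available
    if not minute_ok:
        return None
    good_minutes = sum(1 for ok in minute_ok.values() if ok)
    return good_minutes / len(minute_ok)


# Hypothetical samples: (seconds since start, boolean availability).
samples = [(0, True), (30, True), (60, True), (90, False), (120, True), (185, True)]
print(bad_minute_sli(samples))
```

Minutes with no samples are simply not counted here; whether a silent minute should count as bad is a separate policy decision.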

 

Measuring SLI:

 

Application-level Metrics 

 

Pros

  • Often fast and cheap (in terms of engineering time) to add new metrics.
  • Complex logic to derive an SLI implementation can be turned into code and exported as two, much simpler, "good events" and "total events" counters.

Cons

  • Application servers are unable to see requests that do not reach them.
  • Measuring overall performance of multi-request user journeys is difficult if application servers are stateless.
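The "good events" and "total events" idea can be sketched as a pair of in-process counters. The `Response` type, its fields, and the rule in `is_good_response` are hypothetical stand-ins for whatever complex logic a real application would need. A production service would normally export these counters through its metrics library rather than hold them in a plain class.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """Hypothetical response summary used only for this example."""
    status: int
    body_complete: bool


def is_good_response(resp: Response) -> bool:
    """Stand-in for the 'complex logic' that decides whether an event was good."""
    return resp.status < 500 and resp.body_complete


class SliCounters:
    """Two monotonically increasing counters: good events and total events."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.good_events = 0
        self.total_events = 0

    def record(self, resp: Response) -> None:
        good = is_good_response(resp)
        with self._lock:
            self.total_events += 1
            if good:
                self.good_events += 1

    def ratio(self) -> Optional[float]:
        with self._lock:
            if self.total_events == 0:
                return None
            return self.good_events / self.total_events


counters = SliCounters()
for resp in (Response(200, True), Response(200, False), Response(503, True), Response(200, True)):
    counters.record(resp)
print(counters.good_events, counters.total_events, counters.ratio())
```

The monitoring system then only needs the ratio of the two counters and does not have to know how "good" was decided.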

Logs Processing 

Processing server-side logs of requests or data to generate SLI metrics.

Pros

  • Existing request logs can be processed retroactively to backfill SLI metrics.
  • Complex user journeys can be reconstructed using session identifiers.
  • Complex logic to derive an SLI implementation can be turned into code and exported as two, much simpler, "good events" and "total events" counters.

Cons

  • Application logs do not contain requests that did not reach servers.
  • Processing latency makes logs-based SLIs unsuitable for triggering an operational response.
  • Engineering effort is needed to generate SLIs from logs; session reconstruction can be time-consuming.
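Session reconstruction from logs can be sketched as grouping log lines by session identifier and judging each journey as a whole. The four-field log format (timestamp, session id, path, status) is invented for this example, as is the rule that a journey is good only if none of its requests returned a 5xx status.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def journey_sli(lines: Iterable[str]) -> Optional[float]:
    """Proportion of sessions in which every logged request succeeded."""
    sessions: Dict[str, List[int]] = defaultdict(list)
    for line in lines:
        parts = line.split()
        # Skip lines that do not match the assumed four-field format.
        if len(parts) != 4 or not parts[3].isdigit():
            continue
        _timestamp, session_id, _path, status = parts
        sessions[session_id].append(int(status))
    if not sessions:
        return None
    good = sum(1 for codes in sessions.values() if all(c < 500 for c in codes))
    return good / len(sessions)


# Hypothetical log excerpt.
LOG_LINES = [
    "2020-08-12T10:00:00Z s1 /cart 200",
    "2020-08-12T10:00:01Z s1 /checkout 200",
    "2020-08-12T10:00:02Z s2 /cart 200",
    "2020-08-12T10:00:03Z s2 /checkout 503",
    "2020-08-12T10:00:04Z s3 /cart 200",
    "malformed line",
]
print(journey_sli(LOG_LINES))
```

Because the function only needs stored log lines, the same code can be run over historical logs to backfill the SLI.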

Front-end infrastructure metrics

 

Pros

  • Metrics and recent historical data most likely already exist, so this option probably requires the least engineering effort to get started.
  • Measures SLIs at the point closest to the user still within serving infrastructure.

Cons

  • Not viable for data processing SLIs or, in fact, any SLIs with complex requirements.
  • Only measures approximate performance of multi-request user journeys.
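Front-end infrastructure such as a load balancer often exposes cumulative request counters per response class. Assuming that is the case, an availability SLI over an interval can be derived from two snapshots, as sketched below. The snapshot format and the class labels ("2xx", "4xx", "5xx") are hypothetical, and 4xx responses are again assumed to be invalid requests.

```python
from typing import Dict, Optional


def sli_from_snapshots(start: Dict[str, int], end: Dict[str, int]) -> Optional[float]:
    """Availability between two snapshots of cumulative per-class counters."""
    delta = {cls: end.get(cls, 0) - start.get(cls, 0) for cls in end}
    # Assumption: 4xx responses are client errors and not valid requests.
    valid = sum(count for cls, count in delta.items() if cls != "4xx")
    if valid <= 0:
        return None
    bad = delta.get("5xx", 0)
    return (valid - bad) / valid


# Hypothetical counter values at the start and end of the interval.
start = {"2xx": 1000, "4xx": 10, "5xx": 5}
end = {"2xx": 1900, "4xx": 60, "5xx": 55}
print(sli_from_snapshots(start, end))
```

This only sees per-request outcomes, which is why multi-request journeys can be measured only approximately from this vantage point.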

Probers

 

Pros

  • Synthetic clients can measure all steps of a multi-request user journey.
  • Sending requests from outside your infrastructure captures more of the overall request path in the SLI.

Cons

  • Approximates user experience with synthetic requests.
  • Covering all corner cases is hard and can devolve into integration testing.
  • High reliability targets require frequent probing for accurate measurement.
  • Probe traffic can drown out real traffic.
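The sketch below shows a prober loop reduced to its essentials, along with a rough calculation of why high targets need frequent probes. The `check` callable stands in for a real synthetic request (it is injected so the example runs without a network), and the 99.99% target, 30-day window, and probe intervals are assumed values.

```python
from typing import Callable, Optional


def run_probes(check: Callable[[], bool], count: int) -> Optional[float]:
    """Run `count` synthetic checks and return the fraction that succeeded."""
    if count <= 0:
        return None
    good = 0
    for _ in range(count):
        try:
            ok = bool(check())
        except Exception:
            # A probe that raises is treated as a failed probe.
            ok = False
        good += ok
    return good / count


def allowed_failed_probes(target: float, window_seconds: int, interval_seconds: int) -> float:
    """How many probes may fail in the window before the target is missed."""
    probes = window_seconds // interval_seconds
    return probes * (1.0 - target)


# Hypothetical check that fails on every fourth call.
results = iter([True, True, True, False] * 2)
print(run_probes(lambda: next(results), 8))

THIRTY_DAYS = 30 * 24 * 3600
print(allowed_failed_probes(0.9999, THIRTY_DAYS, 60))   # one probe per minute
print(allowed_failed_probes(0.9999, THIRTY_DAYS, 600))  # one probe per 10 minutes
```

With one probe every ten minutes, fewer than one probe may fail over the window, so a single failure already exceeds the target. Probing more often gives finer resolution, at the cost of more synthetic traffic.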

 


Source: https://www.cnblogs.com/anyu686/p/13493016.html
