If your application is important – and that usually means serious money is at stake in the event of a failure – you probably want more than an assurance that the vendor will make reasonable efforts to keep the application available and performing well. This is where real SLAs (service level agreements) and the associated SLA reporting come in. Important metrics that can be monitored include:

  • Uptime – perhaps the most important metric: how much time per month or year your application was actually available.
  • Response time – an accessible page does not help if customers give up after a minute of frustrated waiting. This can be broken down into DNS lookup, connection setup, fully loaded HTML, and fully loaded additional content.
  • Reaction time – the time it takes your service provider (i.e. us) to act after a fault is reported.
  • MTTR (mean time to repair) – the time until a fault is fixed.
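To make the first and last of these concrete, here is a minimal sketch of how monthly uptime and MTTR could be derived from outage records; the timestamps and outages are invented for illustration:

```python
from datetime import datetime, timedelta

# Hypothetical outage records for one month: (start, end) of each fault.
outages = [
    (datetime(2024, 3, 4, 2, 10), datetime(2024, 3, 4, 2, 40)),
    (datetime(2024, 3, 18, 14, 0), datetime(2024, 3, 18, 14, 45)),
]

month_start = datetime(2024, 3, 1)
month_end = datetime(2024, 4, 1)
total = month_end - month_start

# Total downtime is the sum of all outage durations.
downtime = sum((end - start for start, end in outages), timedelta())

uptime_pct = 100.0 * (1 - downtime / total)
mttr = downtime / len(outages)  # mean time to repair per fault

print(f"Uptime: {uptime_pct:.3f} %")
print(f"MTTR:   {mttr}")
```

With 75 minutes of downtime in a 31-day month this yields an uptime of roughly 99.83 %, which illustrates how quickly a "three nines" guarantee is used up.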

Especially when specific limit values are guaranteed and backed by penalties[1], these values should, where possible, not be collected by the party that is accountable for them. The tests should also run "from outside", i.e. from an IP address at another provider, and ideally from the geographical regions where your customer base is concentrated. Finally, it goes without saying that documentation, documentation and yet more documentation is a basic prerequisite for complete, reliable and meaningful SLA reports.
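As an illustration of such a check, the following sketch times the individual request phases (DNS lookup, connection setup, fully loaded HTML). For the sake of a self-contained example it targets a throwaway local test server; a real monitor would run against the production URL from another provider's network:

```python
import http.server
import socket
import threading
import time
import urllib.request

# Stand-in for the monitored site: a local test server on a random port.
server = http.server.HTTPServer(("127.0.0.1", 0),
                                http.server.SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

t0 = time.perf_counter()
addr = socket.getaddrinfo(host, port)[0][4][0]            # DNS lookup phase
t1 = time.perf_counter()
sock = socket.create_connection((addr, port), timeout=5)  # connection setup
sock.close()
t2 = time.perf_counter()
with urllib.request.urlopen(f"http://{host}:{port}/", timeout=10) as resp:
    status = resp.status
    body = resp.read()                                    # fully loaded HTML
t3 = time.perf_counter()

print(f"DNS lookup:  {(t1 - t0) * 1000:.2f} ms")
print(f"Connect:     {(t2 - t1) * 1000:.2f} ms")
print(f"Full load:   {(t3 - t0) * 1000:.2f} ms")
server.shutdown()
```

Running the same measurement from several regions then shows whether slow responses affect all customers or only those behind certain network paths.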

The design of SLA contracts is highly individual and therefore a matter of negotiation. What we generally recommend and offer is the following:

  • Recording of all faults in a ticket system directly accessible to the customer
  • Monitoring of the application by an external service provider[2]
  • Guaranteed thresholds for the above metrics, backed by penalties
  • Monthly e-mail reports with numerical values on compliance (or non-compliance) with the agreed metrics
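A monthly report of this kind ultimately boils down to comparing measured values against the agreed limits. A minimal sketch, with invented metric names, limits and measurements purely for illustration:

```python
# Hypothetical SLA thresholds: each metric has a direction ("min"/"max")
# and a limit value. Names and numbers are illustrative only.
thresholds = {
    "uptime_pct":   ("min", 99.9),  # at least 99.9 % availability
    "response_ms":  ("max", 800),   # full page load under 800 ms
    "reaction_min": ("max", 30),    # provider reacts within 30 minutes
    "mttr_min":     ("max", 240),   # faults fixed within 4 hours on average
}

# One month's (invented) measured values.
measured = {"uptime_pct": 99.95, "response_ms": 612,
            "reaction_min": 45, "mttr_min": 185}

def compliance_report(thresholds, measured):
    """Return one report line per metric, flagging any violation."""
    lines = []
    for metric, (kind, limit) in thresholds.items():
        value = measured[metric]
        ok = value >= limit if kind == "min" else value <= limit
        lines.append(f"{metric}: {value} "
                     f"({'OK' if ok else 'VIOLATION'}, limit {limit})")
    return lines

for line in compliance_report(thresholds, measured):
    print(line)
```

In this invented month the reaction time of 45 minutes would breach the 30-minute limit and be flagged as a violation, which is exactly the kind of numerical statement a monthly report should contain.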

Have we sparked your interest? Talk to us.