Cloud SLA: Points to Check!


When you are moving to cloud solutions, cloud Service Level Agreements (SLAs) are very important. An SLA is one part of a complete service level management (SLM) strategy and is often regarded as a key component of it. It bridges the gap between the customer’s and the organization’s expectations and acts as a good communication driver. An SLA is an agreement rather than a contract, made between internal or external customers and a service provider. It documents the services that the provider will deliver; in effect, it is a promise made by the service provider. SLAs are part of most managed IT services, including cloud-based help desk software, cloud services, PaaS, SaaS, IaaS, DBaaS and so on.

Beyond the agreed service level, an SLA outlines what should be done when the cloud service provider cannot deliver the promised availability. A well-defined SLA improves communication between the two parties. But to arrive at a well-defined SLA, many points have to be considered. Let’s look at some of the checkpoints that should be taken into consideration.

Features in Cloud SLA

A good cloud service provider will have a clear and transparent SLA that commits to the services it lists. The provider will not only promise those services but also back the promises with penalties and incentives. Must-have features in an SLA are:

System Availability: The provider commits to 99% system availability or higher.

Disaster Recovery: A backup will be available within 24 hours in case of a data center disaster.

Data Ownership and Integrity: You should be able to get your data out of the service provider’s system if you decide not to continue with that provider.

Response Time: The service provider should be able to categorize issues and respond accordingly.

Escalation Procedures: You should be given an escalation path in case you feel an issue needs to be escalated.

Maintenance: The service provider should announce maintenance activities at regular intervals and notify users.

Product Notifications: The service provider should send regular updates on new product releases and upgrades.

Level of service availability

Before settling on a Service Level Agreement, you should consider the guarantees the SLA makes, the level of service availability and so on. Availability describes how much of the time the service is up rather than down, and it is usually expressed in terms of three, four or five nines (99.9%, 99.99% or 99.999%). The more nines a provider commits to, the higher the cost. Along with availability, the business function being provided should also be considered.

The level of service availability covers two broad elements: what the services are and how reachable they are. The service itself varies across the cloud models (IaaS, PaaS, SaaS). In IaaS, the service provider offers data center infrastructure as a service. A fabric covers the memory, processing, networking and so on, and availability here means that the provider is responsible for running and managing that fabric. In PaaS, the provider offers the functionality of the platform as a service, and availability means the reachability and usability of the platform. In SaaS, the provider offers the data and the application, and availability is measured in terms of uptime. If an email service has 99% availability, you can expect to be able to access it 99% of the time.

Once you are clear about which service model you will use, the next step is to specify the availability. Availability of services is typically specified between 99% and 99.99%. Make sure that the availability and the price match your business requirements.
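To see what each extra nine buys, it can help to translate an availability percentage into the downtime it permits. The following is a minimal Python sketch for illustration only; the function name allowed_downtime_hours and the choice of a 365-day year are assumptions, not terms taken from any particular SLA.

```python
# Assumption: a non-leap year of 365 days is used as the measurement period.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Return the downtime (in hours) permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


# Compare two, three, four and five nines.
for pct in (99.0, 99.9, 99.99, 99.999):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% availability allows about {hours:.2f} hours of downtime per year")
```

Going from 99% to 99.9% cuts the permitted downtime from roughly 87.6 hours a year to roughly 8.76 hours, which is why each additional nine tends to be priced higher.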

Time frame

An SLA should also state when the agreement takes effect and when it expires. The time frame is an element found in ordinary contracts as well. The start date of the SLA makes it possible to begin tracking IT performance. If new services are being introduced or existing services renewed, some time should be set aside to communicate the changes to users. The term of the SLA also has to line up with your other commitments. For example, to keep costs low for your users, you may have signed an 18-month lease for the equipment they use. If your SLA is only valid for 12 months, your customers will not pay a penny more after 12 months, and you will be left with the problem of funding the rest of the equipment lease. SLAs should also state the mean time to respond and the mean time to repair. For a severity 1 critical issue, the mean time to respond and repair should be shorter than for a severity 3 issue.
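As a rough illustration of how mean time to respond and mean time to repair might be tracked per severity level, here is a short Python sketch. The incident figures are invented for the example, and the tuple layout and the function name mean_times_by_severity are assumptions rather than part of any standard.

```python
from statistics import mean

# Hypothetical incident log: (severity, hours to respond, hours to repair).
incidents = [
    (1, 0.5, 2.0),
    (1, 0.25, 1.0),
    (3, 4.0, 24.0),
    (3, 6.0, 30.0),
]


def mean_times_by_severity(records):
    """Return {severity: (mean hours to respond, mean hours to repair)}."""
    grouped = {}
    for severity, respond_hours, repair_hours in records:
        grouped.setdefault(severity, []).append((respond_hours, repair_hours))
    return {
        severity: (
            mean(respond for respond, _ in times),
            mean(repair for _, repair in times),
        )
        for severity, times in grouped.items()
    }


for severity, (respond, repair) in sorted(mean_times_by_severity(incidents).items()):
    print(f"Severity {severity}: respond {respond:.2f} h, repair {repair:.2f} h")
```

With this sample data the severity 1 means come out well below the severity 3 means, which is the relationship the SLA targets should enforce.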

Exceptions and Exemptions

Carefully analyze the limitations and exemptions section of the Service Level Agreement. Here too, the exceptions vary with the cloud service model.

The most common exceptions for IaaS are:

  • The service provider is not responsible for what users do or do not do with the servers.
  • The service provider is not responsible for unverified operating system connections.
  • The service provider is not responsible for any external networks that are deployed.

In PaaS, the exceptions are somewhat similar to those for IaaS. The service provider is responsible for providing the platform, not for what the user runs on top of it.

In SaaS, the service provider is responsible for providing the service as a whole and is therefore more accountable than in IaaS or PaaS.

Some examples where the service provider will not be responsible are:

  • A hosted email platform is purchased and used to send spam or mass emails.
  • Infrastructure is set up for searching or creating rainbow tables.
  • Infrastructure is set up and deployed to examine local attacks.

Reports on the implementation of SLA

Without criteria for evaluation, there is no objective way to determine performance. Performance should be evaluated every month and poor performance recorded. Within the scope of every agreement, the service provider is required to present a Service Level Agreement Implementation Report containing concrete figures for the activities carried out. Key Performance Indicators are set in the agreement, and the service provider is expected to meet them for the duration of the agreement. The indicators are decided by mutual consent of both parties. Be careful when choosing the indicators or metrics for measuring performance.

It is important to note that SLAs should only be agreed when performance against the set metrics can actually be measured. Therefore, mechanisms should be in place to capture data that can reveal any breach of the SLA. From this data, reports can be generated to serve as a basis for discussion between the customer and the service provider. Reports should explain why an SLA was not met rather than focus on the solution to the problem. Report generation should be automated wherever possible.

The reporting period

Reports on achievement and non-achievement of SLAs should be prepared periodically. The reporting period should be neither so long that underperformance is averaged out nor so short that the service provider and the customer are kept busy with too much information. Some common choices are rolling eight weeks, rolling four weeks, each accounting period or each accounting month. Of these options, rolling reports are preferred by almost everyone.

Reporting is helped by a service database that holds all the information associated with performance. If all the problems and incidents that caused service outages are recorded in the database, it is easy to generate SLA reports. The contents of the report will vary with the type of SLA chosen. The simplest report gives service availability and overall uptime, split between normal office hours and out-of-hours time. This much information is usually adequate.
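To illustrate how outage records could feed an automated report, the Python sketch below computes availability over one rolling four-week window. The outage timestamps are invented, the function name availability_percent is an assumption, and the code assumes that outages do not overlap one another.

```python
from datetime import datetime, timedelta


def availability_percent(outages, period_start, period_end):
    """Return availability (%) for a period, given (start, end) outage pairs.

    Assumes the outages do not overlap one another.
    """
    period_seconds = (period_end - period_start).total_seconds()
    if period_seconds <= 0:
        raise ValueError("period_end must be after period_start")
    downtime_seconds = 0.0
    for start, end in outages:
        # Count only the part of the outage that falls inside the period.
        clipped_start = max(start, period_start)
        clipped_end = min(end, period_end)
        if clipped_end > clipped_start:
            downtime_seconds += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (1.0 - downtime_seconds / period_seconds)


# Hypothetical outage records, as they might be read from a service database.
outages = [
    (datetime(2024, 2, 10, 9, 0), datetime(2024, 2, 10, 13, 0)),  # 4 hours
    (datetime(2024, 2, 20, 22, 0), datetime(2024, 2, 21, 0, 0)),  # 2 hours
    (datetime(2024, 1, 15, 8, 0), datetime(2024, 1, 15, 9, 0)),   # before the window
]

# Rolling four-week window ending on the report date.
period_end = datetime(2024, 3, 1)
period_start = period_end - timedelta(weeks=4)

print(f"Availability: {availability_percent(outages, period_start, period_end):.2f}%")
```

Running the same function each week with a shifted window gives the rolling four-week figures. Splitting the result between office hours and out-of-hours time would need an additional filter on the timestamps, which is left out here to keep the sketch short.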

Calculations

Once the reports are prepared and the other necessary activities carried out, the question is how to judge or measure performance against an SLA. The simplest way to calculate the service provider’s overall achievement of SLAs is to create a form of Availability Index.

Availability Index = (Product of actual availabilities for the period / Product of target availabilities for the period) x 1000

E.g. ((99.3 x 98 x 99.1 x 94.5) / (99 x 99 x 98 x 95)) x 1000

≈ 998.8

This figure can be used in trend graphs, newsletters and so on.
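The Availability Index formula can be checked with a few lines of Python. This is only a sketch: the function name availability_index is an assumption, and the figures are the four actual and target availabilities from the worked example.

```python
from math import prod


def availability_index(actual, target):
    """Return (product of actuals / product of targets) x 1000."""
    if len(actual) != len(target):
        raise ValueError("actual and target must have the same length")
    return prod(actual) / prod(target) * 1000


# Actual and target availabilities (%) for four services.
actual = [99.3, 98.0, 99.1, 94.5]
target = [99.0, 99.0, 98.0, 95.0]

index = availability_index(actual, target)
print(f"Availability Index: {index:.1f}")
```

For these inputs the index works out to about 998.8. An index of exactly 1000 would mean the targets were met in aggregate, so a value below 1000 indicates that, taken together, the services fell slightly short even though two of them beat their individual targets.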

Implications

With the reports in hand and performance measured, what are the implications of an SLA? With SLAs in place, you can design the system to meet them. If the SLAs are not met, data loss is likely and the business will suffer. The indicators or metrics used for measuring an SLA are critical for long-term success. A good SLA is important because it sets boundaries and expectations for customer commitments and establishes key performance indicators both for customer service and for the internal organization.

Conclusion

SLAs are increasingly becoming an important part of overall IT strategy. They help in measuring performance, and they give a concrete basis for building your system and making optimum use of resources.