Service Level Expectations
Last Updated: 01/26/2012
1 General Overview
University Information Technology Services (UITS) provides Virtual Server Hosting Services to the Indiana University campuses. This Service Level Expectation (SLE) is specific to the Virtual Server Hosting Services known as the Intelligent Infrastructure (II). Unlike “co-location” or other physical server hosting services, II is a service where a virtual server is leased and the customer is not required to make an initial capital investment in buying physical server and storage hardware.
This is an SLE between II customers and UITS. The scope of this document includes:
- Services provided by UITS to II customers.
- Levels of response time, availability, and support associated with these services.
- Responsibilities of the UITS service provider and responsibilities of the customer.
- Processes for requesting services and getting support.
This SLE covers the period from 07/01/2011 to 06/30/2012 and will be reviewed and revised at the end of this period.
OR
This SLE shall remain valid until revised or terminated.
1.1 Terms and Definitions
“Backup Solutions” means optional service available for subscription that provides cross-site backups and cross-campus failover options, which isolate you from potential disasters by securing your backup data within hardened data centers.
“Business Day” means a normal working day in the time zone where Indiana Data Center facilities are located (Eastern Time, UTC-5, which observes Daylight Saving Time).
“Customer” means the party identified as the purchasing organization to this Agreement with UITS.
“Designated Contacts” means Customer named contacts, technical resources and fiscal account resources, which are established, person-specific, e-mail addresses associated with the customer support contract. It is expected that these contacts will be updated upon any personnel or responsibility change by the Customer.
“Intelligent Infrastructure” means Virtual Server Hosting Services known as Intelligent Infrastructure (II). Unlike “co-location” or other physical server hosting services, II is a service where a virtual server is leased and the customer is not required to make an initial investment in buying capital equipment such as servers and storage hardware.
“Problem Resolution” means the use of reasonable commercial efforts to resolve the reported problem. These methods may include (but are not limited to): configuration changes, patches that fix an issue, replacing failed hardware, reinstalling software, etc.
“Respond” means addressing the initial request and taking ownership of the issue.
“Response Time” means the amount of time elapsed between the initial contact by the Customer to UITS and the returned response to the Customer by UITS staff.
“Service Level Expectation” means the Customer Service Level Expectation (SLE) that identifies the features and defines the processes involved with the delivery by UITS of various support functions to Customer, as presented by this document’s content.
“Service Request (SR)” means a single issue opened with UITS. The SR number identifies the Service Request. The format for the unique SR number can be as follows: SAV #nnnnnnn.
Severity Definitions for Intelligent Infrastructure:
“Severity 1 (Urgent)” means
- a) an Error with a direct security impact on the service;
- b) an Error isolated to the Virtual System production environment that renders the Virtual System inoperative or causes the Virtual System to fail catastrophically; i.e., critical system impact, system down;
- c) a reported defect in the production environment, which cannot be reasonably circumvented, in which there is an emergency condition that significantly restricts the use of the product to perform necessary business functions; or
- d) inability to use the product or critical impact on operation requiring an immediate solution.
“Severity 2 (High)” means
- a) an Error isolated to the Virtual System that substantially degrades the performance of the service or materially restricts business; i.e., major system impact, temporary system hanging;
- b) a reported defect in the Virtual System, which restricts the use of one or more features of the Virtual System to perform necessary business functions but does not completely restrict the use of the Virtual System; or
- c) ability to use the Virtual System, but an important function is not available, and operations are severely impacted.
“Severity 3 (Medium)” means
- a) an Error isolated to the Virtual System that causes only a moderate impact on the use of the service; i.e., moderate system impact, performance/operational impact;
- b) a reported defect in the Virtual System that restricts the use of one or more features of the Virtual System to perform necessary business functions, while the defect can be easily circumvented; or
- c) an Error that can cause some functional restrictions but does not have a critical or severe impact on operations.
“Severity 4 (Low)” means
- a) a reported anomaly in the Virtual System environment that does not substantially restrict the use of one or more features of the Virtual System to perform necessary business functions; this is a minor problem and is not significant to operations; or
- b) an anomaly that may be easily circumvented or may need to be submitted to UITS as an enhancement request.
“UITS” means University Information Technology Services, which is staffed by professional support personnel providing assistance with diagnosis and resolution of defects and/or failures in II services.
“Virtual System” means the method to supply the infrastructure and network capacity necessary to host your applications, while optional disk storage on UITS enterprise-class SANs (Storage Area Networks) ensures your files are highly secure and available.
“VMware” means the virtualization platform, which is built on a business-ready architecture and uses software such as VMware vSphere to transform or “virtualize” the hardware resources of an x86-based computer, including the CPU, RAM, hard disk and network controller, to create a fully functional virtual system that can run its own operating system and applications just like a “physical” computer. Each virtual system contains a complete system, eliminating potential conflicts. VMware virtualization works by inserting a thin layer of software directly on the computer hardware or on a host operating system. This layer contains a virtual system monitor or “hypervisor” that allocates hardware resources dynamically and transparently. Multiple operating systems run concurrently on a single physical computer and share hardware resources with each other. By encapsulating an entire machine, including CPU, memory, operating system, and network devices, a virtual system is completely compatible with all standard x86 operating systems, applications, and device drivers. You can safely run several operating systems and applications at the same time on a single computer, with each having access to the resources it needs when it needs them.
“Workaround” means a change in the environment or data to avoid error without substantially impairing use of the II service.
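As an illustration of the Service Request number format given in the definitions above (SAV #nnnnnnn), the following minimal sketch checks whether a string matches that pattern. It assumes that each "n" stands for exactly one decimal digit, which the definition does not state explicitly; the names SR_PATTERN and parse_sr_number are hypothetical and are not part of any UITS tool.

```python
import re
from typing import Optional

# Assumption: each "n" in "SAV #nnnnnnn" is one decimal digit (seven in total).
SR_PATTERN = re.compile(r"^SAV #(\d{7})$")


def parse_sr_number(text: str) -> Optional[str]:
    """Return the seven-digit SR identifier, or None if the text does not match."""
    match = SR_PATTERN.match(text.strip())
    return match.group(1) if match else None


if __name__ == "__main__":
    print(parse_sr_number("SAV #1234567"))
    print(parse_sr_number("SAV #12345"))
```

Because the definition says the format "can be" as shown, a real ticketing system may accept other shapes; the pattern should be treated as a convenience check only.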
2 Service Descriptions
2.1 Service Scope
Services supporting virtual systems include virtualized CPU, RAM, hard disk and network within a cluster of physical servers in one of the IU Data Centers, provisioning of the virtual server, operations support, monitoring, systems administration of the physical servers, network connectivity to the virtual server, customized firewalls on the physical servers and within the IU Data Centers network infrastructure, and backup services with offsite storage. There are two principal components to the II service package:
Virtual systems supply the infrastructure for the compute, disk, and network capacity necessary to host applications. Disk storage on UITS enterprise-class SANs ensures files are highly secure and available.
Backup solutions provide cross-site backups and cross-campus failover options, which are isolated from potential disasters by securing backup files within hardened IU data centers. Virtual system rentals include a disaster recovery off-site backup of your operating system disk. Data protection services are offered as a subscription based model.
2.1.1 Service Exclusions
Support does not include the following items or actions:
- a) Step-by-step assistance for installation of operating system or service packs;
- b) Onsite services;
- c) Installation or configuration of applications hosted on virtual systems;
- d) Modifications of software code, security-policy configuration, audits, or security design.
UITS shall have no obligation to support:
- a) Problems caused by customer negligence, misuse, misapplication, or use of the product beyond the control of UITS;
- b) Operating systems installed that are not specifically listed on the VMware Guest Operating System guide: http://www.vmware.com/pdf/GuestOS_guide.pdf;
- c) Products installed, intentional or unintentional, that result in nefarious activities;
- d) Operating systems that are past their End-of-Support date as listed by the operating system vendor.
2.2 IU Data Centers IUB & IUPUI
IU has two hardened data centers, one each on the Bloomington and Indianapolis campuses. The data centers provide a safe and secure location for IT equipment. This includes the basic infrastructure of standardized cabinets and cabinet distribution units for power. In addition, the data centers have uninterruptible power supplies (UPS), power distribution, and HVAC that provides year-round cooling to protect equipment from the environmental hazards of dust, temperature and humidity. Diesel generators will provide ongoing power in the event of a campus or data center power outage. Enhanced cabinet power distribution provides redundant circuits and remote monitoring of the power distribution. Physical security includes proximity card readers and biometric hand scanners for access authentication, ID cards, reinforced doors, security glass and alarms. Fire suppression is provided by a “double inter-lock pre-action sprinkler system”. Additionally, both facilities have UITS staff on site in the building 24 hours a day, 7 days a week.
2.3 Operating Parameters
Operations support is provided for the data center by trained operators 24 hours a day, 7 days a week. Operations staff monitors vital data center and server information. Examples include: temperature, network connectivity and server vitals as set up by the Systems Administrator. Problem coordination/management, notification, escalation and reporting are done by the operations staff.
2.4 System Level
Storage and Virtualization (SAV) Systems Administrators provision the CPU, RAM, hard disk and network resources for the virtual servers in VMware. The virtual server creation and configuration provides the infrastructure for storage, network, security administration and account management of the virtual server. Ongoing support includes monitoring, performance tuning and software patches of the physical servers hosting the virtual server.
3 Roles and Responsibilities
3.1 Customer Obligations
Customer responsibilities and/or requirements include:
Staffing: All customer personnel contacting UITS for support must be fully trained on the operating system running in the virtual system.
Named Designated Contacts:
Customer named contacts, technical resources and fiscal account resources, which are established, person-specific, e-mail addresses associated with the customer support contract. It is expected that these contacts will be updated upon any personnel or responsibility change.
Customer Active Directory Services (ADS) group to be used for Virtual System resource assignment. The ADS group contents are managed by the Customer, which gives the Customer the most control over resource access.
Full responsibility for system administration. System administration falls into, but is not limited to, the following areas:
- Installation and licensing of all operating system and application software.
- VMware tools software installation is required to maintain a supported infrastructure. Installation of VMware tools is required for each virtual server. It is strongly recommended to include VMware tools updates as part of normal operating system patch cycles. The VMware tools version will be evaluated as part of any debugging endeavor. In the event that the Virtual System has a VMware tools version that is not current, upgrading VMware tools will be the first step in solving the problem.
- Support and maintenance of all operating systems and application software, including the timely application of all patches and upgrades.
- Configuration of network address through Domain Name Services (DNS) administrators at UITS.
- Configuration of machine room firewall ports through Network administrators at UITS.
- Security measures, particularly the establishment of appropriate authentication and authorization processes, application of operating system and application security patches, and the performance and resolution of the University Information Security Office (UISO) scans.
- Data management, as prescribed by university policies and state and federal laws and regulations in respect to protection of, access to, and confidentiality of institutional or personal data residing on or processed by the system.
- Liaison or manager who will provide operations staff with support escalation and contact information for system administration functions. Contact information for billing inquiries and a technical contact for operational inquiries.
- Optional Backup services are offered that provide cross-site backups and cross-campus failover options, which isolate potential disasters by securing backup data within hardened data centers. In the event the optional services are not utilized, data protection is solely the responsibility of the customer.
- Active incident response plan as outlined by University Information Policy Office (UIPO).
- Managing system logs for operating system and application related troubleshooting.
- Regular scheduled auditing for abnormal events including intrusion detection.
- If production systems are deemed critical, test VMs should be installed and maintained. In the event a vulnerability is discovered the test VM can be utilized to test the fix quickly and deploy it with confidence in the production environment.
- In the event that critical data is stored within the VM:
- System administrators are highly encouraged to utilize available encryption toolsets such as full disk encryption. The university has acquired the PGP software suite to enable Whole Disk Encryption (WDE). Alternative tools such as BitLocker or TrueCrypt support whole disk encryption as well.
- Additionally, system administrators are strongly encouraged to activate encryption for any TSM backup use. The TSM client encryption tools provide end-to-end encryption for client data stored within the TSM infrastructure. Enabling client encryption encrypts data during transfer and storage. The associated knowledge base article, “At IU, how do I set up client-level encryption and compression on a TSM client node?”, describes the necessary steps.
UISO reserves the right to audit the security of any system residing in its facilities through periodic security scans. Per Policy IT-12: “Proactively seek out and apply vendor-supplied fixes necessary to repair security vulnerabilities, within a timeframe commensurate with the level of risk (i.e., within 24 hours for high-risk, within 48 hours for medium-risk, and within 72 hours for low-risk).”
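The risk-based timeframes quoted from Policy IT-12 above can be turned into a concrete deadline. The sketch below is illustrative only: it assumes the clock starts when the vendor fix becomes available, which the quoted policy text does not specify, and the names PATCH_WINDOW_HOURS and patch_deadline are hypothetical.

```python
from datetime import datetime, timedelta

# Hours allowed per risk level, as quoted from Policy IT-12 in this section.
PATCH_WINDOW_HOURS = {"high": 24, "medium": 48, "low": 72}


def patch_deadline(fix_available: datetime, risk: str) -> datetime:
    """Return the latest time by which a fix should be applied.

    Assumption: the window is measured from the moment the fix is available.
    """
    return fix_available + timedelta(hours=PATCH_WINDOW_HOURS[risk.lower()])


if __name__ == "__main__":
    released = datetime(2012, 1, 26, 9, 0)
    for level in ("high", "medium", "low"):
        print(level, patch_deadline(released, level))
```

An unknown risk level raises a KeyError rather than silently picking a default, so a mislabeled vulnerability is noticed instead of being given the wrong window.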
If a system becomes compromised, UITS will immediately remove it from the network and notify the customer. The system will not be allowed back on the network until the customer has resolved the situation and UISO has certified the resolution.
3.1.1 Virtual Server System Administration
- At the physical host level, will review logs and performance, system status, resource usage and events that may result in security issues and identify any required performance tuning.
- Maintain base OS and network security. This includes OS patching, firewall settings and associated infrastructure components of the virtual hosts. If a secured base system is compromised via the application layer, the SAV System Admin has the right to disconnect the machine from the network.
3.1.2 Charges (if applicable)
- Customer billing for services will occur monthly.
3.1.3 Assumptions
- Services are clearly documented on the II web site (http://ii.uits.iu.edu/).
- Major II upgrades will be treated as a project outside the scope of this document.
- Changes to services will be communicated and documented.
3.1.4 Hardware and infrastructure technology updates
- The physical compute resources serving the II workload are hosted on high-end enterprise-class x86 hardware. The x86 hardware has an expected lifecycle replacement of approximately every 36 months. The replacement process may require the virtual system to be momentarily power cycled to complete the migration process. The migration process can be scheduled during normal customer maintenance activities at the convenience of the customer.
- The storage supporting the II environment is hosted on high-end enterprise-class SAN. The SAN lifecycle replacement occurs between 48 and 60 months. The replacement process does require virtual systems to be interrupted.
- Patch processing for the x86 hardware, hypervisor, and SAN occurs concurrently and does not require virtual systems to be interrupted during updates.
- UITS will provide adequate hardware for both x86 compute and storage required to support the customer workload.
3.1.5 Backup and Removal of Data
- To reconstruct lost or altered customer files, data, or programs, customer must maintain a separate backup system or procedure that is not dependent on the software or hardware products under support.
- Please note: Optional Backup services are offered that provide cross-site backups and cross-campus failover options, which isolate potential disasters by securing backup data within hardened data centers. The customer is responsible for backup contents; UITS is responsible for maintaining media services to host the backup content.
- Prior to termination of services, customers must maintain a separate backup system or procedure that is not dependent on the software or hardware products under II services.
- Upon termination of services, Virtual System and Data will be securely erased in accordance with IU IT policies and procedures. All programs and data that were served via the II offering will no longer be accessible.
3.2 Service Provider Requirements
SAV System Administration responsibilities and/or requirements include:
3.2.1 Physical Hardware - System Administration
- At the physical hardware level (hypervisor), review logs and performance counters to obtain system status required to identify and correct potential hardware problems,
- Apply critical patches as recommended for the virtual environment.
- Perform system tuning as needed to the physical server environment.
- Assign space and manage permissions and security groups.
- Coordinate with vendors for any maintenance or support requests.
- Capacity planning for physical resources (physical servers, SAN storage)
3.2.2 Problem Determination
- Coordinate with the vendor for any required support.
- Will determine if problem is hardware, software or storage by reviewing the event logs.
- If and when resource contention occurs, due to a server host failure or over-allocation, production systems will have priority in resource allocation over test and development systems. The virtual server clusters have been designed to avoid resource contention; however, the potential exists.
3.2.3 Backups/Storage of backups
- Virtual server rentals include an off-site backup of customer system images and customer operating systems; data volumes are excluded from this protection service.
- Optional data backup service: Tivoli Storage Manager (TSM) software is used for backups in conjunction with a virtual server rental. There is no additional TSM license cost; however, additional charges apply if the customer wants to subscribe to backup services.
- Data backups will occur for all virtual servers by installing a TSM backup agent on the virtual server. A backup of the customer server is run every night, 365 days a year. TSM stores the current version and up to 2 old versions of each file. This also includes an off-site copy of the data.
- The customer is responsible for backup contents; UITS is responsible for maintaining media services to host the backup content.
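The retention rule described above (the current version plus up to 2 old versions of each file) can be pictured with a small model. This is a toy sketch, not the TSM client or server API: the class FileVersions is hypothetical, and it assumes the file has changed before every nightly backup so that each run produces a new version.

```python
from collections import deque

# Current version plus up to 2 old versions, per this section.
MAX_VERSIONS = 3


class FileVersions:
    """Toy model of per-file version retention (not the TSM API)."""

    def __init__(self) -> None:
        # A bounded deque drops the oldest entry once the limit is reached.
        self._versions = deque(maxlen=MAX_VERSIONS)

    def backup(self, content: str) -> None:
        """Record a new backup version of the file."""
        self._versions.append(content)

    def current(self) -> str:
        """Return the most recently backed-up version."""
        return self._versions[-1]

    def old_versions(self) -> list:
        """Return the retained older versions, oldest first."""
        return list(self._versions)[:-1]


if __name__ == "__main__":
    history = FileVersions()
    for night in ("v1", "v2", "v3", "v4"):
        history.backup(night)
    print(history.current(), history.old_versions())
```

After four changed backups the first version is no longer recoverable, which is why customers who need longer history should plan for it separately.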
3.2.4 Network Services
- Provide and support physical and logical network infrastructure; act as a liaison to UITS Network Engineering for problem reports and incident handling.
4 Hours of Coverage, Support, Response Times & Escalation
4.1 Hours of System Administration Support
The Virtual System Request queue for support requests is monitored Monday through Friday, 8am to 5pm, with the exception of University Holidays.
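A customer-side script that decides whether to expect a same-day reply could encode these hours as follows. This is a minimal sketch under stated assumptions: timestamps are taken to be local Eastern time, 5pm itself is treated as outside the window, and the holiday set is a hypothetical input, since this document does not list the University Holidays.

```python
from datetime import date, datetime, time

SUPPORT_START = time(8, 0)   # 8am, per this section
SUPPORT_END = time(17, 0)    # 5pm, per this section


def is_queue_monitored(moment: datetime, holidays: frozenset = frozenset()) -> bool:
    """Return True if the request queue is monitored at the given local time.

    Assumptions: `moment` is in Eastern time; `holidays` is a caller-supplied
    set of dates, because the holiday calendar is not part of this document.
    """
    if moment.weekday() >= 5:          # 5 = Saturday, 6 = Sunday
        return False
    if moment.date() in holidays:
        return False
    return SUPPORT_START <= moment.time() < SUPPORT_END


if __name__ == "__main__":
    print(is_queue_monitored(datetime(2012, 1, 26, 10, 0)))  # a Thursday morning
    print(is_queue_monitored(datetime(2012, 1, 28, 10, 0)))  # a Saturday morning
```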
4.2 Service Requests
The process to request a new Virtual System requires submission via the Virtual System request form on the II web site. Change requests for an existing virtual server, such as a change in resources (RAM, CPU, DISK), are also required to be submitted through Virtual System Request.
In support of services outlined in this document, UITS will respond to service related incidents and/or Change requests submitted by the Customer through Virtual System Request.
PLEASE NOTE: DO NOT submit a Service Request for a Severity 1 issue via the web request form. For a Severity 1 case, please contact UITS directly, by telephone (812)-855-9910 and request a Severity 1 incident to be opened with the SAV group related to II services. An incident number will be generated and sent to the customer via e-mail. Please provide and include any additional details that may be relevant to the case.
4.3 Service Requests Priorities and Response Times
Priority | Criteria | Example | Target Response Time*
Low - Severity 4 | | I would like to increase the disk space available to my Virtual System, how can I accomplish this task. | UITS and Customer will provide resources during normal business hours for problem resolution.
Medium - Severity 3 | | Virtual System occasionally hangs on a device driver during reboot, retrying the reboot typically corrects the issue. Please help research a resolution. | UITS and customer will commit full-time resources during normal business hours for problem resolution to obtain workaround or reduce the severity of the Error and alternative resources during normal business hours.
* Target Response Time is defined as the time between receipt of the call and the time that a Support Team member begins working on the problem. Due to the wide diversity of problems that can occur, and the methods needed to resolve them, response time IS NOT defined as the time between the receipt of a call and problem resolution. UITS does not guarantee the resolution of a problem within the times specified.
4.3.1 Normal Incident Processing
In the event that a customer incorrectly assigns a request priority, UITS will correct the priority using the severity definitions in this document. Communication with the customer will occur for any priority change.
Service Providers supporting this service will prioritize incoming service incidents as normal priority unless the service incident fits one or more of the criteria listed in the Major Incident Handling section of this document.
When an IT Request ticket is opened for a client via the web interface:
- The Support Center will respond to the customer and process all new IT Request tickets within 8 business hours.
- Low (severity 4) priority incidents will be resolved within 30 days with a status provided every 5 days
- Medium (severity 3) priority incidents will be resolved within 5 days with a daily status provided
4.3.2 Major Incident Handling
UITS staff supporting this service will prioritize an incoming incident request as “high” priority if it meets any one of the following criteria:
- Significant number of people affected.
- Organizational structure is a multiplier for number of people affected.
- Percentage of total tasks that can no longer be performed by individuals.
- Academic and Administrative Calendar deadlines.
- Significant impact on the delivery of instruction.
- Significant or lasting impact on student academic performance.
- Significant risk to law, rule, or policy compliance.
Major incidents are handled as follows:
- Urgent (severity 1) priority incidents will be resolved within 8 business hours with a status provided every 2 hours.
- High (severity 2) priority incidents will be resolved within 1 ½ days with a status provided every 6 hours
- The infrastructure is protected and supported by vendor support 7 days a week 24 hours per day. If incidents are linked to vendor related components, an appropriate level support case will be opened with the vendor. The customer will be updated by SAV staff with case progress.
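The resolution and status-update targets stated in sections 4.3.1 and 4.3.2 can be collected into a single lookup, which may be convenient for a customer-side dashboard or reminder script. The sketch below only restates the figures from those sections; the names Target, TARGETS, and describe are hypothetical, and the values are targets rather than guarantees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    priority: str
    resolve_within: str
    status_every: str


# Figures restated from sections 4.3.1 (severity 3 and 4) and 4.3.2 (severity 1 and 2).
TARGETS = {
    1: Target("Urgent", "8 business hours", "2 hours"),
    2: Target("High", "1.5 days", "6 hours"),
    3: Target("Medium", "5 days", "1 day"),
    4: Target("Low", "30 days", "5 days"),
}


def describe(severity: int) -> str:
    """Return a one-line summary of the targets for a severity level."""
    t = TARGETS[severity]
    return (
        f"{t.priority} (severity {severity}): resolve within "
        f"{t.resolve_within}, status every {t.status_every}"
    )


if __name__ == "__main__":
    for level in sorted(TARGETS):
        print(describe(level))
```

Keeping the durations as text avoids guessing how "business hours" and "days" should be converted, which this document does not define precisely.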
4.3.2.1 Service Requests Priorities and Response Times
Priority | Criteria | Example | Target Response Time*
High - Severity 2 | | The allocated memory is being consumed by my workload; please allocate additional resources during this peak processing time. | UITS and Customer must commit full-time resources during non-Standard business hours for problem resolution, to obtain workaround or reduce the severity of the Error.
Urgent - Severity 1 | | E-Mail services are not functional; network is not available; classroom computing technology is not functioning pending a class. | UITS and Customer must commit the necessary resources around the clock for problem resolution to obtain workaround or reduce the severity of the Error. UITS will use commercially reasonable efforts to make II Services available with a Monthly Uptime Percentage of at least 99.9% during any monthly billing cycle.
* Target Response Time is defined as the time between receipt of the call and the time that a Support Team member begins working on the problem. Due to the wide diversity of problems that can occur, and the methods needed to resolve them, response time IS NOT defined as the time between the receipt of a call and problem resolution. UITS does not guarantee the resolution of a problem within the times specified.
4.3.2.2 Major Incident Response Times
Service Provider | Service Hours and Conditions | Backup Contacted under what conditions | Escalation Rules | Response Time From Notification
SAV | 24x7 | Virtual server environment performance degradation | Follow On-Call contact list for off hours and normal senior mgmt escalation | =1 hour
Data Center Operations | 24x7 | | Follow On-Call contact list for off hours and normal senior mgmt escalation | =5 minute
4.4 Maintenance Management
4.4.1 Service Maintenance/Change Management
The II Virtual Server Hosting service adheres to the UITS Change Management process. Service Providers for this service adhere to the UITS Maintenance Window Guidelines. Please review the UITS Scheduled Maintenance Windows.
All services and/or related components require regularly scheduled maintenance (“Maintenance Window”) in order to meet established service levels. These activities may render systems and/or applications unavailable for normal user interaction.
Due to the technology available within the virtual infrastructure, a maintenance window is not reserved for II. Patches are implemented to the infrastructure in a rolling mode, which ensures Virtual Systems are available during the infrastructure maintenance. UITS will use commercially reasonable efforts to make II Services available with a Monthly Uptime Percentage of at least 99.9% of the 24x7 time in a given month.
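To make the 99.9% figure concrete, the downtime it permits can be computed from the number of minutes in a month. The sketch below assumes the percentage is measured over every minute of the calendar month (24x7, as stated above); the function name allowed_downtime_minutes is hypothetical.

```python
def allowed_downtime_minutes(days_in_month: int, uptime_pct: float = 99.9) -> float:
    """Return the downtime, in minutes, that still meets the uptime target.

    Assumption: the month is measured 24x7, so every minute counts.
    """
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - uptime_pct) / 100.0


if __name__ == "__main__":
    for days in (28, 30, 31):
        # Rounded for display; a 30-day month allows about 43.2 minutes.
        print(days, round(allowed_downtime_minutes(days), 2))
```

In other words, a 30-day month has 43,200 minutes, and 0.1% of that is roughly 43 minutes of unavailability before the target is missed.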
One exception is the virtual center service, which is not required for virtual system availability. A virtual center maintenance window is reserved weekly. Unless a different maintenance window is required because of risk or impact to the customer, the standard virtual center maintenance window is Wednesdays from noon to 2PM:
Time | Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday
Begin | | | | 12:00pm | | |
End | | | | 2:00pm | | |
4.4.2 General Exceptions to the standard maintenance window include:
Exceptions |
Parameters |
Coverage |
Unless emergency or 24x7 supported servers |
||
Fiscal Year Close |
Last business day in June |
Unless emergency |
Finals/Grade Weeks |
Unless emergency |
5 Reporting, Reviewing and Auditing
IU Internal Audit performs periodic audits of the II services. This document should be reviewed at a minimum once per fiscal year; however, in lieu of a review during any period specified, the current document will remain in effect.
5.1 Term and Termination
- TERM: Support shall be provided in annual terms and shall be renewable to then-current Support plan when UITS is notified of Customer’s intent to renew the existing contract, or UITS is notified of Customer’s intent not to renew services.
- Please note: Optional Backup services are offered that provide cross-site backups and cross-campus failover options, which isolate potential disasters by securing backup data within hardened data centers. The customer is responsible for backup contents; UITS is responsible for maintaining media services to host the backup content.
- TERMINATION: Customer may terminate this service via submission of a support request. Services are billed in arrears based on actual usage; charges will be processed through the month of service termination.
- Prior to termination of services, customers must maintain a separate backup system or procedure that is not dependent on the software or hardware products under II services.
- Upon termination of services, Virtual System and Data will be securely erased in accordance with IT policies and procedures. All programs and data that were served via the II offering will no longer be accessible.
5.2 Service Level Expectation (SLE)
- SLE Update: This agreement and related UITS plan offering details are operational in nature and may be modified any time by UITS. UITS will communicate in advance proposed changes to Customer. The Customer may terminate the customer relationship without penalty if all parties cannot abide by the revisions. This agreement supersedes any previous service level expectation.
5.3 Miscellaneous:
- Force Majeure. Except for the obligation to pay monies due and owing, neither party shall be liable for any delay or failure in performance due to an event outside the defaulting party’s reasonable control, including without limitation, acts of God, earthquakes, labor disputes, shortages of supplies, actions of governmental entities, riots, war, fire, epidemics, or other circumstances beyond its reasonable control. The obligations and rights of the excused party shall be extended on a day-to-day basis for the period equal to the period of the excusable delay.