Cloud Operations Technology to Keep Customers’ Mission Critical Processes Up and Running by Fujitsu

Fujitsu Laboratories Ltd. today announced the development of cloud operations technology aimed at realizing a cloud that can be safely used without delaying or stopping customer processes due to maintenance.

When cloud service operators perform maintenance, it is necessary to take measures such as moving the customers’ virtual machines to another server, but this could have effects such as stopping or delaying customer processes. For this reason, it has been difficult to utilize the cloud for mission-critical processes.

Now, Fujitsu Laboratories has developed technology to predict the degree of impact on customer processes, based on load patterns on virtual machines and on maintenance, as calculated by machine learning. Moreover, the technology will automatically and quickly create maintenance plans that avoid impact on customer processes.

This technology will enable cloud operations that do not stop or delay customers’ mission-critical processes, thereby supporting cloud utilization by customers running processes that require more stable operations. Fujitsu Limited aims to make this technology available as a service during fiscal 2018, as a function supporting the operation of its Fujitsu Cloud Service K5.

Development Background

With the prevalence of the cloud in recent years, there has been an increasing demand to migrate mission-critical processes to the cloud. With public clouds, however, there are circumstances in which maintenance is conducted regardless of the customer’s schedule, which may delay processes or affect them in other ways.

In order to create a cloud that customers can safely use for important processes, there has been a demand for cloud operations that provide stable functionality even during maintenance.

Issues

In public clouds, conducting maintenance required temporarily stopping virtual machines (VMs) and restarting them on other servers, so processes were temporarily halted as well. It was possible to avoid halting processes by migrating VMs to other servers without stopping them first, a procedure called live migration (LM), but LM had to be conducted at times when the load on the VMs was low. In a public cloud with multiple users, however, it was difficult to find a time that was convenient for all of the VMs involved.

About the Newly Developed Technology

Fujitsu Laboratories has now developed technology that uses process load prediction and high-speed combinatorial optimization to quickly create plans for conducting maintenance on all servers in a cloud at the times that will have the least impact on customer processes.

The features of the technology are as follows.

  1. Technology to predict times with the least maintenance impact on customer processes

This technology first creates a model to predict the time required for live migration, for each VM executing customer processes, using machine learning to study the relationship between the time required for live migration in previous maintenance cycles and the load on the VM at that time. Then the technology predicts the time period, down to the minute, that will minimize the impact on processes due to live migration when maintenance is conducted. This is done by calculating the time required for live migration for each VM from the load on that VM, which is estimated with data that can be observed externally, such as memory usage and communications traffic.
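The prediction step described above can be illustrated with a minimal sketch. It assumes a single hypothetical load score per VM and a simple least-squares line in place of whatever machine learning model Fujitsu Laboratories actually uses, which the announcement does not specify. The function names, the history values, and the per-minute forecast are all invented for illustration.

```python
from statistics import mean

def fit_line(loads, durations):
    """Ordinary least squares fit: duration ~= slope * load + intercept."""
    mx, my = mean(loads), mean(durations)
    sxx = sum((x - mx) ** 2 for x in loads)
    sxy = sum((x - mx) * (y - my) for x, y in zip(loads, durations))
    slope = sxy / sxx
    return slope, my - slope * mx

def best_minute(model, forecast_load_by_minute):
    """Return (minute, predicted seconds) with the shortest predicted live migration."""
    slope, intercept = model
    predicted = {m: slope * load + intercept
                 for m, load in forecast_load_by_minute.items()}
    minute = min(predicted, key=predicted.get)
    return minute, predicted[minute]

# Hypothetical history for one VM: load score vs. observed live-migration seconds.
history_load = [0.1, 0.3, 0.5, 0.7, 0.9]
history_secs = [12.0, 16.0, 20.0, 24.0, 28.0]
model = fit_line(history_load, history_secs)

# Hypothetical per-minute load forecast for the maintenance window.
forecast = {"02:00": 0.8, "02:01": 0.2, "02:02": 0.6}
print(best_minute(model, forecast))
```

In this toy data the minute with the lowest forecast load is selected, because the fitted migration time grows with load. A real model would take several observed signals, such as memory usage and communications traffic, as inputs.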

  2. Technology to create a plan to quickly complete maintenance

Fujitsu Laboratories has developed the technology to efficiently calculate a plan to complete maintenance for the cloud in as short a time as possible while minimizing the impact on processes and limiting the time to complete maintenance for each VM, finding the optimal combination from a huge range of options. With this technology, it is possible to quickly calculate the optimal solution using information on the composition of servers and VMs, as well as constraints unique to cloud operations, such as the conditions under which maintenance is possible.
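To give a feel for the planning problem, the following sketch assigns each VM to a maintenance time slot with a simple greedy heuristic. This is not the high-speed combinatorial optimization described in the announcement, whose details are not published here. The impact scores, the per-slot capacity limit, and the function name are assumptions made for illustration only.

```python
def plan_maintenance(impact, slot_capacity):
    """Greedy heuristic: assign each VM to one time slot.

    impact[vm][slot] is a hypothetical predicted impact score (lower is better).
    slot_capacity is an assumed limit on live migrations per slot.
    VMs that cannot be placed in any slot are left out of the plan.
    """
    used = {}
    plan = {}
    # Place the most constrained VMs first: those with the fewest zero-impact slots.
    order = sorted(
        impact,
        key=lambda vm: sum(1 for score in impact[vm].values() if score == 0),
    )
    for vm in order:
        # Prefer low impact, then earlier slots, so maintenance finishes sooner.
        for slot in sorted(impact[vm], key=lambda s: (impact[vm][s], s)):
            if used.get(slot, 0) < slot_capacity:
                plan[vm] = slot
                used[slot] = used.get(slot, 0) + 1
                break
    return plan

# Hypothetical impact scores for three VMs over three time slots.
impact = {
    "vm-a": {0: 0, 1: 0, 2: 5},
    "vm-b": {0: 0, 1: 4, 2: 6},
    "vm-c": {0: 3, 1: 0, 2: 0},
}
plan = plan_maintenance(impact, slot_capacity=1)
total_impact = sum(impact[vm][slot] for vm, slot in plan.items())
print(plan, total_impact)
```

A greedy pass like this is fast but can miss the best combination on larger inputs, which is presumably why the announced technology relies on combinatorial optimization to search the full range of options.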

Effects

In a simulation based on the operational data of a commercial cloud with about 5,000 VMs, each VM used over 90% of its CPU resources for 80% of the time and had low utilization for the remaining 20%. With previous technology, a total of 425 VMs underwent maintenance while their process load was high, which impacted those processes. With the new technology, however, maintenance avoided high process load periods for all VMs.

Future Plans

This newly developed technology can limit the impact of cloud maintenance on customer operations, supporting customers in using the cloud for processes that require more stable operations and that were previously difficult to run in the cloud. Fujitsu Limited aims to offer this technology as a service during fiscal 2018, making it available as a function supporting the operation of its Fujitsu Cloud Service K5.