OpenStack Nova VM live migration flow

Original post: OpenStack Nova VM migration flow

  • nova.api.openstack.compute.contrib.admin_actions._migrate_live()
  • nova.compute.api.live_migrate()
            – update the instance state to MIGRATING
            – call into the scheduler to live migrate (the scheduler hint will be set to the selected host, which may be None)
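
For reference, the whole flow above can be kicked off from a client. This is a minimal sketch assuming the era of python-novaclient this post describes (the older live_migrate() signature that still takes disk_over_commit); the credentials, endpoint, and instance name are placeholders.

```python
from novaclient import client

# Placeholder credentials and endpoint; adjust for your cloud.
nova = client.Client("2", "admin", "secret", "admin",
                     "http://controller:5000/v2.0")

server = nova.servers.find(name="my-instance")   # hypothetical instance name

# host=None lets the scheduler pick the destination (the "selected host, which
# may be None" case above); block_migration=False assumes shared storage, and
# disk_over_commit=False is the conservative default.
nova.servers.live_migrate(server, host=None,
                          block_migration=False, disk_over_commit=False)
```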

  • nova.scheduler.manager.live_migration()
  • nova.scheduler.manager._schedule_live_migration()
  • nova.conductor.tasks.live_migrate.LiveMigrationTask.execute()
            – check that the instance is running
            – check that the instance’s host is up
            – if a destination host is provided, check that it
                  1. is different from the instance’s host
                  2. is up
                  3. has enough memory
                  4. is compatible with the instance’s host (i.e., hypervisor type and version)
                  5. passes the live migration checks (call over AMQP RPC into the nova manager’s check_can_live_migrate_destination)
            – otherwise, find a candidate host and check that it
                  1. is compatible with the instance’s host (i.e., hypervisor type and version)
                  2. passes the live migration checks
            – call over AMQP RPC into the nova manager’s live_migration
              Note: the migration data is initially set by check_can_live_migrate_destination and can carry implementation-specific parameters from this point on.
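
As a toy, runnable illustration of the destination checks listed above, plain dicts stand in for Nova's host and instance objects; all the names and numbers here are made up.

```python
def check_destination(source, dest, instance_mem_mb):
    """Raise if the candidate destination fails the checks described above."""
    if dest["host"] == source["host"]:
        raise ValueError("destination must differ from the instance's host")
    if not dest["service_up"]:
        raise ValueError("destination compute service is down")
    if dest["free_ram_mb"] < instance_mem_mb:
        raise ValueError("destination lacks enough free memory")
    if (dest["hypervisor_type"] != source["hypervisor_type"]
            or dest["hypervisor_version"] < source["hypervisor_version"]):
        raise ValueError("hypervisors are incompatible")
    # The real flow additionally RPCs into the destination's
    # check_can_live_migrate_destination() and keeps the returned migrate_data.


src = {"host": "cn-1", "service_up": True,
       "hypervisor_type": "QEMU", "hypervisor_version": 1005003}
dst = {"host": "cn-2", "service_up": True, "free_ram_mb": 8192,
       "hypervisor_type": "QEMU", "hypervisor_version": 2000000}
check_destination(src, dst, instance_mem_mb=4096)  # passes silently
```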


  • nova.compute.manager.check_can_live_migrate_destination()
            – driver.check_can_live_migrate_destination()
            – call over AMQP RPC into the nova manager’s check_can_live_migrate_source
            – driver.check_can_live_migrate_destination_cleanup()

  • nova.compute.manager.check_can_live_migrate_source()
            – determine whether the instance is volume-backed and add the result to the migration data
            – driver.check_can_live_migrate_source()

  • nova.compute.manager.live_migration()
            – if this is a block migration request, driver.get_instance_disk_info()
            – call over AMQP RPC into the nova manager’s pre_live_migration
                  – error handler: _rollback_live_migration
            – driver.live_migration()
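
Putting the source-side steps together, here is a hedged paraphrase of the control flow, not the literal Nova source; the method names follow the steps above, but argument lists and error handling are simplified.

```python
def live_migration(self, context, dest, instance, block_migration, migrate_data):
    disk = None
    if block_migration:
        # block migration also moves local disks, so collect their layout first
        disk = self.driver.get_instance_disk_info(instance["name"])

    try:
        # ask the destination to prepare volumes, networks and firewall rules
        self.compute_rpcapi.pre_live_migration(
            context, instance, block_migration, disk, dest, migrate_data)
    except Exception:
        # anything failing before the guest moves is unwound here
        self._rollback_live_migration(context, instance, dest, block_migration)
        raise

    # the virt driver performs the actual copy and invokes one of the two
    # callbacks when it finishes or fails
    self.driver.live_migration(context, instance, dest,
                               self._post_live_migration,
                               self._rollback_live_migration,
                               block_migration, migrate_data)
```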

  • nova.compute.manager.pre_live_migration()
            – get the block device information for the instance
            – get the network information for the instance
            – driver.pre_live_migration()
            – set up networking on the destination host by calling the network API’s setup_networks_on_host
            – driver.ensure_filtering_rules_for_instance()

  • nova.compute.manager._rollback_live_migration()
  • nova.compute.manager._post_live_migration()
            – driver.get_volume_connector()
            – for each of the instance’s volume connections, call the volume API’s terminate_connection
            – driver.unfilter_instance()
            – call into the conductor’s network_migrate_instance_start, which eventually calls the network API’s migrate_instance_start
            – call over AMQP RPC into the nova manager’s post_live_migration_at_destination
            – if block migration or storage is not shared, driver.destroy()
            – else driver.unplug_vifs()
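
And a similarly condensed paraphrase of the source-side cleanup in _post_live_migration(); again this is not the actual Nova code, and the names and argument lists are approximate.

```python
def _post_live_migration(self, ctxt, instance, dest, block_migration,
                         network_info, migration, shared_storage):
    connector = self.driver.get_volume_connector(instance)
    for bdm in self._get_instance_volume_bdms(ctxt, instance):
        self.volume_api.terminate_connection(ctxt, bdm["volume_id"], connector)

    self.driver.unfilter_instance(instance, network_info)
    self.conductor_api.network_migrate_instance_start(ctxt, instance, migration)
    self.compute_rpcapi.post_live_migration_at_destination(
        ctxt, instance, block_migration, dest)

    if block_migration or not shared_storage:
        self.driver.destroy(instance, network_info)    # remove the leftover guest
    else:
        self.driver.unplug_vifs(instance, network_info)
```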

    Google Cloud VM Live Migration

    Introduction

    The Heartbleed bug was disclosed on April 7, 2014. Most cloud customers were impacted that day because patching the host systems required VM reboots. At Google, no customers were impacted, thanks to the transparent maintenance functionality introduced in Google Compute Engine in December 2013.

    Through a combination of datacenter topology innovations and live migration technology, Google can move customers’ running VMs out of the way of planned hardware and software maintenance events, keeping the infrastructure protected and reliable without customers’ VMs, applications, or workloads noticing that anything happened.

    VM Migration Procedure 

    The high-level steps are as follows:

    • The process begins with a notification that VMs need to be evicted from their current host machine. The notification might start with a file change (e.g., a release engineer indicating that a new BIOS is available), Hardware Operations scheduling maintenance, an automatic signal from impending hardware failure, etc.
    • Once a VM is selected for migration, a notification is provided to the guest that a migration is imminent. After a waiting period, a target host is selected and asked to set up a new, empty “target” VM to receive the migrating “source” VM. Authentication is used to establish a connection between the source and target.
    • There are three stages involved in the VM’s migration (a toy simulation of the pre-copy scheme follows this list):
      • During pre-migration brownout, the VM is still executing on the source while most state is sent from the source to the target. For instance, all the guest memory is copied to the target while the pages that have been re-dirtied on the source are tracked. The time spent in pre-migration brownout is a function of the size of the guest memory and the rate at which pages are being dirtied.
      • During blackout, a very brief moment when the VM is not running anywhere, the VM is paused and all the remaining state required to begin running the VM on the target is sent.
      • During post-migration brownout, the VM executes on the target. The source VM is still present and may provide supporting functionality for the target. For instance, until the network fabric has caught up with the new location of the VM, the source VM forwards packets to and from the target VM.
    • Finally, the migration is completed, and the system deletes the source VM.
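
The pre-copy behaviour described above can be illustrated with a toy, runnable simulation: keep copying while the guest runs, re-send whatever it dirties, and only pause (the blackout) once the remaining dirty set is small. All the constants are made up.

```python
import random

PAGES = 262144             # 1 GiB of guest memory in 4 KiB pages
COPY_PER_ROUND = 80000     # pages the link can push per copy round
DIRTY_RATE = 3000          # pages the guest re-dirties during one round
BLACKOUT_THRESHOLD = 5000  # pause the guest once this few pages remain

to_send = set(range(PAGES))            # first round: everything must be copied
rounds = 0
while len(to_send) > BLACKOUT_THRESHOLD:
    rounds += 1
    batch = set(list(to_send)[:COPY_PER_ROUND])
    to_send -= batch                   # these pages are now clean on the target
    # while we were copying, the guest dirtied some pages again
    to_send |= {random.randrange(PAGES) for _ in range(DIRTY_RATE)}

print(f"pre-migration brownout: {rounds} copy rounds")
print(f"blackout: pause the guest and send the last {len(to_send)} pages")
```

With these numbers the loop converges in a handful of rounds; if the guest dirtied pages faster than the link could copy them, the dirty set would never shrink below the threshold, which is exactly why brownout time depends on both memory size and dirty rate.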

      Reference
      [1] Google Compute Engine Uses Live Migration Technology to Service Infrastructure Without Application Downtime

      Hyper-V Live VM Migration Procedure


      1. Live migration setup occurs. 

      During the live migration setup stage, the source server creates a connection with the destination server. This connection transfers the virtual machine configuration data to the destination server. A skeleton virtual machine is set up on the destination server and memory is allocated to the destination virtual machine.



      2. Memory pages are transferred from the source node to the destination node.

      • In the second stage of a live migration, the memory assigned to the migrating virtual machine is copied over the network to the destination server. This memory is referred to as the “working set” of the migrating virtual machine. A page of memory is 4 KB.
      • In addition to copying the working set to the destination server, Hyper-V monitors the pages in the working set on the source server. As memory pages are modified in the source server, they are tracked and marked as being modified. 
      • During this phase of the migration, the migrating virtual machine continues to run. Hyper-V iterates the memory copy process several times, with each iteration requiring a smaller number of modified pages to be copied.


      3. Modified pages are transferred.

      • This third phase of the migration is a memory copy process that duplicates the remaining modified memory pages to the destination server. The source server also transfers the CPU and device state of the virtual machine to the destination server.
      • During this stage, the network bandwidth available between the source and destination servers is critical to the speed of the live migration, so using 1 Gigabit Ethernet or faster is important. The faster the source server transfers the modified pages from the migrating virtual machine’s working set, the more quickly the live migration is completed (a rough illustration follows this list).
      • The number of pages transferred in this stage is determined by how actively the virtual machine accesses and modifies the memory pages: the more modified pages there are, the longer it takes to transfer them all to the destination server.
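
A rough, runnable back-of-the-envelope for this stage, assuming a made-up final dirty set of 50,000 pages and ignoring protocol overhead:

```python
# Time to push the remaining modified pages at two link speeds.
PAGE_SIZE = 4 * 1024        # bytes, as stated above
modified_pages = 50_000     # assumed size of the final dirty set

for name, gbit in [("1 GbE", 1), ("10 GbE", 10)]:
    bytes_per_s = gbit * 1e9 / 8
    seconds = modified_pages * PAGE_SIZE / bytes_per_s
    print(f"{name}: {seconds * 1000:.0f} ms to transfer {modified_pages} pages")
```

At 1 GbE the final pass alone takes well over a second, which is why faster links shorten the migration noticeably.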

      4. The storage handle is moved from the source server to the destination server.

      • During the fourth stage of a live migration, control of the storage, such as any virtual hard disk files or physical storage attached through a virtual Fibre Channel adapter, is transferred to the destination server.


      5. The virtual machine is brought online on the destination server.

      • In the fifth stage of a live migration, the destination server now has the up-to-date working set as well as access to any storage used by the virtual machine. At this point, the virtual machine is resumed. 

      6. Network cleanup occurs.

      • In the final stage of a live migration, the migrated virtual machine is running on the destination server. At this point, a message is sent to the network switch, causing it to learn the migrated virtual machine’s MAC address on its new port so that network traffic to and from the virtual machine uses the correct switch port.

      Reference
      [1] Virtual Machine Live Migration Overview

      VM Live Migration’s Impact on Running Applications

      1. Will the IP address change after migration?

      Both kinds of live migration exist: migrations that change the IP address and migrations that do not [5].

      • According to Google Cloud [1], customers’ VMs can be migrated without affecting them, which implies that the VM’s IP address is not changed in this case.
        • To retain the same IP address, Hyper-V requires the source and destination hosts to be within the same subnet. I think Google Cloud may not have this requirement.
        • I think virtual networking [4] would remove this restriction on the location of the destination host: “Hyper-V Network Virtualization decouples virtual networks for customer virtual machines from the physical network infrastructure.”

      2. Will the migration interrupt the Internet service?

      The answer depends on the implementation.

      • According to Google Cloud [1], there are no service interruptions.
        • During post-migration brownout, the VM executes on the target while the source VM is still present and may provide supporting functionality for the target; for instance, until the network fabric has caught up with the new location of the VM, the source VM forwards packets to and from the target VM.
      • According to Hyper-V [2]:
        • The migration is not downtime-free, but the interruption is almost immeasurably brief. Usually the longest delay is at the network layer, while the virtual machine’s MAC address is registered on the new physical switch port and its new location is propagated throughout the network.
        • According to [3], in order to use live migration the VM needs to keep the same IP address across data centers so that clients can keep accessing the virtual machine during and after the migration.

      3. How is the network migrated?

      The most challenging issue in VM migration is keeping the network working.

      On a LAN, different hypervisors use different strategies.

      • Xen
        • It uses ARP to bind the IP address to the new host.
          • The VM broadcasts a gratuitous ARP announcing that its IP address has moved to a new host (see the sketch after this list), but this may not be allowed for security reasons.
      • VMware
        • VMotion uses virtual NICs (VNICs) to keep the network connection alive.
          • The VNIC is migrated along with the VM. Each VNIC has a MAC address that is unique on the LAN and is attached to one or more physical NICs.
          • Because the VNIC’s MAC address is independent of the physical network address, networking continues as normal across the live migration.
          • Note that, due to the restrictions of Ethernet, the source and destination hosts have to be in the same subnet.
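
For illustration, this is roughly what such an announcement looks like when built by hand with scapy (an assumption of this sketch, not something the hypervisors themselves use). The addresses and interface name are placeholders, and sending raw frames requires root.

```python
from scapy.all import ARP, Ether, sendp

vm_ip = "192.0.2.10"            # the VM's (unchanged) IP address
vm_mac = "52:54:00:12:34:56"    # the VM's MAC address

# op=2 ("is-at") with psrc == pdst is a gratuitous ARP: it tells every switch
# and neighbor on the broadcast domain that vm_ip now lives behind vm_mac on
# the new physical port.
frame = Ether(src=vm_mac, dst="ff:ff:ff:ff:ff:ff") / ARP(
    op=2, hwsrc=vm_mac, psrc=vm_ip, hwdst="ff:ff:ff:ff:ff:ff", pdst=vm_ip)
sendp(frame, iface="eth0")      # interface name is an assumption
```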

      On a WAN

      • The VM is given a new IP address on the destination host. To keep the network connection alive, we can use an IP tunnel in combination with dynamic DNS: build an IP tunnel between the source and destination IP addresses and use it to forward packets from the source host to the destination host. Once the migration is done, the DNS record is updated so that new connections use the new IP address; a sketch follows below.
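
A hedged sketch of that WAN approach, using standard iproute2 commands on the source host plus a dnspython dynamic update at cut-over. All addresses and names are placeholders, and the zone is assumed to accept dynamic updates; this needs root on the source host.

```python
import subprocess
import dns.query
import dns.update

SRC, DST = "198.51.100.1", "203.0.113.1"      # old and new host addresses

# On the source host: tunnel packets that still arrive for the VM to the new host.
subprocess.run(["ip", "tunnel", "add", "mig0", "mode", "gre",
                "local", SRC, "remote", DST], check=True)
subprocess.run(["ip", "link", "set", "mig0", "up"], check=True)
subprocess.run(["ip", "route", "add", "192.0.2.10/32", "dev", "mig0"], check=True)

# Once the migration finishes, point the VM's DNS name at the new address.
update = dns.update.Update("example.com")
update.replace("vm1", 60, "A", "203.0.113.10")   # short TTL to speed cut-over
dns.query.tcp(update, "198.51.100.53")           # the zone's primary nameserver
```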

      Reference
      [1] Google cloud VM live migration
      [2] Hyper-V live migration
      [3] Live Migration — Implementation considerations
      [4] Hyper-V Network Virtualization Overview (in Chinese)
      [5] A study of virtual machine migration (in Chinese)
