Dynamic Linking and Loading

Introduction

Current programs rely on dynamic loading (e.g., through the ld.so dynamic loader on Linux) to support shared libraries, position-independent code, and defense mechanisms like address space layout randomization (ASLR).

Modern operating systems adopt dynamic linking and loading to enable modularity.

Advantages 

1. The library code can be shared among processes,
– so the system needs only one physical copy of the library in memory, no matter how many processes use it (see the sketch below)
2. Once a bug is found in a library implementation, fixing the bug and distributing the updated library suffices if it is dynamically linked;
– otherwise, every single binary that statically linked the vulnerable library must be rebuilt
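As a concrete illustration, a program can ask the dynamic loader for a shared library at run time through the dlopen/dlsym API. A minimal sketch (libm.so.6 is the usual name of the math library on Linux):

    /* Load libm at run time and resolve cos() through the loader. */
    #include <stdio.h>
    #include <dlfcn.h>

    int main(void) {
        /* Ask the loader to map libm into this process; the mapped
         * code pages are shared with every other process using it. */
        void *handle = dlopen("libm.so.6", RTLD_LAZY);
        if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }

        double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
        if (!cosine) { fprintf(stderr, "%s\n", dlerror()); return 1; }

        printf("cos(0.0) = %f\n", cosine(0.0));
        dlclose(handle);
        return 0;
    }

Built with cc demo.c -ldl (the -ldl is unnecessary on recent glibc), this binary carries no copy of cos itself; advantage 2 above follows directly, since updating libm.so.6 updates every such program at once.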

Reference
[1] An Evil Copy: How the Loader Betrays You

OpenStack Nova VM live migration flow

Original post: OpenStack Nova VM migration flow

  • nova.api.openstack.compute.contrib.admin_actions._migrate_live()
  • nova.compute.api.live_migrate()
      – update instance state to MIGRATING
      – call into the scheduler to live migrate (the scheduler hint will be set to the selected host, which may be none)

  • nova.scheduler.manager.live_migration()
  • nova.scheduler.manager._schedule_live_migration()
  • nova.conductor.tasks.live_migrate.LiveMigrationTask.execute()
      – check that the instance is running
      – check that the instance’s host is up
      – if a destination host is provided, check that it:
            1. is different from the instance’s host
            2. is up
            3. has enough memory
            4. is compatible with the instance’s host (i.e., hypervisor type and version)
            5. passes live migration checks (call using AMQP RPC into the nova manager check_can_live_migrate_destination)
      – else (no destination host provided), find a candidate host and check that it:
            1. is compatible with the instance’s host (i.e., hypervisor type and version)
            2. passes live migration checks
      – call using AMQP RPC into the nova manager live_migration
        Note: migration data is initially set by check_can_live_migrate_destination and can be used for implementation-specific parameters from this point on.


  • nova.compute.manager.check_can_live_migrate_destination()
      – driver.check_can_live_migrate_destination()
      – call using AMQP RPC into the nova manager check_can_live_migrate_source
      – driver.check_can_live_migrate_destination_cleanup()

  • nova.compute.manager.check_can_live_migrate_source()
      – determine whether the instance is volume-backed and add the result to the migration data
      – driver.check_can_live_migrate_source()

  • nova.compute.manager.live_migration()
      – if this is a block migration request, driver.get_instance_disk_info()
      – call using AMQP RPC into the nova manager pre_live_migration
            – error handler: _rollback_live_migration
      – driver.live_migration()

  • nova.compute.manager.pre_live_migration()
      – get the block device information for the instance
      – get the network information for the instance
      – driver.pre_live_migration()
      – set up networks on the destination host by calling the network API setup_networks_on_host
      – driver.ensure_filtering_rules_for_instance()

  • nova.compute.manager._rollback_live_migration()
  • nova.compute.manager._post_live_migration()
      – driver.get_volume_connector()
      – for each instance volume connection, call the volume API terminate_connection
      – driver.unfilter_instance()
      – call into the conductor to network_migrate_instance_start, which will eventually call the network API migrate_instance_start
      – call using AMQP RPC into the nova manager post_live_migration_at_destination
      – if block migration or not shared storage: driver.destroy()
      – else: driver.unplug_vifs()

    User Mode and Kernel Mode

    Mode Bit

    • A bit, called the mode bit, is added to the hardware of the computer to indicate the current mode.
      • 0: kernel mode
        • When a task is executed on behalf of the operating system
      • 1: user mode
        • When a task is executed on behalf of the user

    Switch

    • Whenever a trap or interrupt occurs, the hardware switches from user mode to kernel mode
      • i.e., change mode bit to 0
    • When a user application requests a service from the operating system (via a system call)
      • It must transition from user to kernel mode to fulfill the request
    • Privileged Instructions
      • The hardware allows privileged instructions to be executed only in kernel mode.
      • The instruction to switch to kernel mode is an example of a privileged instruction. Other examples include I/O control, timer management, and interrupt management.
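    For example, every system call is exactly such a switch. A minimal Linux sketch that enters kernel mode twice, once through the raw syscall(2) interface and once through the libc wrapper:

        /* Each call below traps into the kernel (mode bit 0) and
         * returns to user mode (mode bit 1) with the result. */
        #include <stdio.h>
        #include <unistd.h>
        #include <sys/syscall.h>

        int main(void) {
            long pid = syscall(SYS_getpid);   /* raw system call */
            printf("pid via raw syscall: %ld\n", pid);
            printf("pid via libc wrapper: %d\n", (int)getpid());
            return 0;
        }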

    Virtual Memory

    Advantage

    • It allows the execution of a process that is not completely in memory, so users can run programs that are larger than physical memory.
    • In addition, it abstracts main memory into a large, uniform array of storage, separating logical memory as viewed by the user from physical memory. This arrangement frees programmers from concern over memory-storage limitations.
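    A minimal Linux sketch of this (assuming a 64-bit machine; MAP_NORESERVE and the default overcommit policy are Linux-specific): the process reserves far more virtual address space than physical memory, and the kernel only backs the pages that are actually touched.

        #include <stdio.h>
        #include <sys/mman.h>

        int main(void) {
            size_t len = 1UL << 40;  /* 1 TiB of virtual address space */
            void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
            if (p == MAP_FAILED) { perror("mmap"); return 1; }
            /* Only the two pages we write get physical frames. */
            ((char *)p)[0] = 1;
            ((char *)p)[len - 1] = 1;
            printf("reserved %zu bytes of virtual memory at %p\n", len, p);
            munmap(p, len);
            return 0;
        }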

    Computer Storage Unit

    Bit

    A bit is the basic unit of computer storage. It can contain one of two values, zero or one.

    Byte

    A byte is 8 bits, and on most computers it is the smallest convenient chunk of storage. 
    • For example, most computers don’t have an instruction to move a bit but do have one to move a byte

    Word

    A word is generally made up of one or more bytes. 
    • For example, a computer may have instructions to move 64-bit (8-byte) words.

    Kilobyte

    • 1024 bytes

    Megabyte

    • 1024^2 bytes
      • Usually rounded off to 1 million bytes

    Gigabyte

    • 1024^3 bytes
      • Usually rounded off to 1 billion bytes
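    A quick sanity check of the unit arithmetic above:

        #include <stdio.h>

        int main(void) {
            unsigned long kb = 1024UL;     /* kilobyte: 2^10 bytes */
            unsigned long mb = kb * 1024;  /* megabyte: 2^20 bytes */
            unsigned long gb = mb * 1024;  /* gigabyte: 2^30 bytes */
            printf("KB = %lu bytes\n", kb);
            printf("MB = %lu bytes (~1 million)\n", mb);
            printf("GB = %lu bytes (~1 billion)\n", gb);
            printf("smallest addressable unit: %zu byte\n", sizeof(char));
            return 0;
        }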

    Completely Fair Scheduler

    Introduction

    CFS uses a concept called “sleeper fairness”,

    • which considers sleeping or waiting tasks equivalent to those on the runqueue.
    • As a result, interactive tasks, which spend most of their time waiting for user input or other events, get a comparable share of CPU time when they need it.

    Algorithm

    • The data structure used for the scheduling algorithm is a red-black tree in which the nodes are scheduler-specific structures, called “sched_entity”.
      • These are derived from the general task_struct process descriptor, with added scheduler elements. These nodes are indexed by processor execution time in nanoseconds.
      • Ideal processor
        • an “ideal processor” would share processing power equally among all processes
      • Maximum execution time
        • A maximum execution time is calculated for each process, based on the idea that an “ideal processor” would share processing power equally among all processes.
        • Thus the maximum execution time is the time the process has been waiting to run divided by the total number of processes; in other words, it is the time the process would have expected to run on an “ideal processor”.
    • When the scheduler is invoked to run a new process, it operates as follows (see the sketch after this list):
      • The leftmost node of the scheduling tree is chosen (as it has the lowest spent execution time) and sent for execution.
      • If the process simply completes execution, it is removed from the system and the scheduling tree.
      • If the process reaches its maximum execution time or is otherwise stopped (voluntarily or via interrupt), it is reinserted into the scheduling tree based on its new spent execution time.
      • The new leftmost node is then selected from the tree, repeating the iteration.
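    The following toy model (an illustration, not kernel code) mimics that loop: entities are keyed by spent execution time, the smallest key is always picked, and a task that accrues time slowly, like an interactive one, keeps getting picked when it is runnable.

        #include <stdio.h>

        struct sched_entity {
            const char *name;
            unsigned long vruntime;  /* spent execution time, in ns */
        };

        /* Pick the entity with the smallest vruntime (the leftmost node
         * of the red-black tree in the real scheduler). */
        static struct sched_entity *pick_next(struct sched_entity *t, int n) {
            struct sched_entity *best = &t[0];
            for (int i = 1; i < n; i++)
                if (t[i].vruntime < best->vruntime)
                    best = &t[i];
            return best;
        }

        int main(void) {
            struct sched_entity tasks[] = {
                { "editor",   0 },  /* interactive: accrues little vruntime */
                { "compiler", 0 },
                { "daemon",   0 },
            };
            for (int tick = 0; tick < 8; tick++) {
                struct sched_entity *cur = pick_next(tasks, 3);
                /* "Run" the task for one slice, then reinsert it with
                 * its new spent execution time. */
                cur->vruntime += (cur == &tasks[0]) ? 1 : 3;
                printf("tick %d: ran %-8s vruntime=%lu\n",
                       tick, cur->name, cur->vruntime);
            }
            return 0;
        }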

    Reference

    [1] https://en.wikipedia.org/wiki/Completely_Fair_Scheduler

    Process Context Switch

    Context Switch

    • Whenever an interrupt arrives, the CPU must do a state-save of the currently running process, then switch into kernel mode to handle the interrupt, and then do a state-restore of the interrupted process.
    • A context switch happens when
      • the time slice for one process has expired and a new process is to be loaded from the ready queue.
      • This will be instigated by a timer interrupt, which causes the current process’s state to be saved and the new process’s state to be restored.
      • Saving and restoring state involves saving and restoring all of the registers and the program counter(s), as well as updating the process control blocks.
      • Context switches happen very frequently, and the overhead of switching is just lost CPU time, so context switches (state saves and restores) need to be as fast as possible. Some hardware has special provisions for speeding this up, such as a single machine instruction for saving or restoring all registers at once.

    Registers

    • When it is time for a process to give up the processor so another process can run, the operating system needs to save the process’s current state: it stores a copy of the CPU registers to memory.
    • Equally, it must be able to restore this state when the process is given more time to run on the CPU (see the sketch below).
    • When it is time for the process to run again, the operating system copies the register values back from memory to the CPU registers, and the process is right back where it left off.
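    POSIX ucontext gives a user-space analogue of this register save/restore and makes a convenient sketch (assuming Linux/glibc, where these calls are still available):

        #include <stdio.h>
        #include <ucontext.h>

        static ucontext_t main_ctx, task_ctx;
        static char task_stack[64 * 1024];

        static void task(void) {
            printf("task: running with restored registers\n");
            /* Save the task's registers, restore main's (a "context switch"). */
            swapcontext(&task_ctx, &main_ctx);
            printf("task: resumed right where it left off\n");
        }

        int main(void) {
            getcontext(&task_ctx);                 /* state-save template */
            task_ctx.uc_stack.ss_sp = task_stack;
            task_ctx.uc_stack.ss_size = sizeof task_stack;
            task_ctx.uc_link = &main_ctx;          /* return here when task ends */
            makecontext(&task_ctx, task, 0);

            swapcontext(&main_ctx, &task_ctx);     /* save main, restore task */
            printf("main: back in main\n");
            swapcontext(&main_ctx, &task_ctx);     /* resume the task */
            printf("main: done\n");
            return 0;
        }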

    Reference

    [1] http://www.bottomupcs.com/elements_of_a_process.xhtml

    Process Heap

    Introduction

    The heap is an area of memory that is managed by the process for on-the-fly memory allocation.

    This is for variables whose memory requirements are not known at compile time.

    • brk
      • The top of the heap is marked by the brk (the program break), so called for the system call which modifies it.
      • By using the brk call to move the break, the process can request that the kernel allocate more memory for it to use (see the sketch below).
    • malloc
      • The heap is mostly managed by the malloc library call.
      • This makes managing the heap easy for programmers by allowing them to simply allocate and free (via the free call) heap memory.
      • malloc can use schemes like a buddy allocator to manage the heap memory for the user.
      • malloc can also be smarter about allocation and potentially use anonymous mmaps for extra process memory.
      • This is where, instead of mmaping a file into the process memory, it directly maps an area of system RAM.
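    A small sketch of these mechanics on Linux: the program break moves for a small allocation, while a large allocation is typically served by an anonymous mmap and leaves the break alone (the exact thresholds are allocator-dependent).

        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>

        int main(void) {
            void *before = sbrk(0);      /* current program break */
            char *p = malloc(1024);      /* small: served from the heap */
            printf("break before: %p, after malloc(1024): %p\n", before, sbrk(0));

            /* Large requests usually bypass the heap via anonymous mmap,
             * so the break does not move. */
            char *big = malloc(64 * 1024 * 1024);
            printf("break after malloc(64 MiB): %p\n", sbrk(0));

            free(big);
            free(p);
            return 0;
        }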