Shuffling Theory

1. If there are n computers, among which 1 is vulnerable. Each computer is assigned an IP address, the attackers probe randomly on the IP address to check whether its corresponding computer is vulnerable. If the IP addresses of all computer randomly shuffled after each attacker’s probe. What is the probability that the attacker will find the vulnerable machine?
Answer:
Pr(0<x<n) = 1 - (1-1/n)^n = 1 - 1/e

Reference
[1] Analysis of Network Address Shuffling as a Moving Target Defense, by Thomas E. Carroll, in ICC14

[Group] Google Technical Infrastructure Resource Management

Functionality

  • Responsible for the efficient utilization for all the Google data centers through software, services, and business operations
  • Develop management software for all Google products (e.g., Google Cloud, Gmail, Youtube, Maps, Android, Search, Ads, etc) to manage their machine, compute, and storage needs. 

Focus

  • Cloud resource distribution
  • Efficient management
  • Product catalog
  • Pricing automation
  • Demand planning
  • Fleet planning
  • Ordering
  • Cost allocation
  • Usage collection
  • Financial policy control
  • Data warehouse
  • Reporting

Questions

    • How can I contribute as an intern?

    Questions about Borg

    • Scalability
      • How the nodes are added and removed?
    • How Borg handles with failure
      • In page 2, it said it will restarts the tasks if fail. Will this cause some errors, e.g., some operations are conducted more than once. Should user’s program handle the failure through roll-back by themselves?
      • Security
        • Will it be possible that there are some malicious jobs that try to interrupts other jobs, or breach the system?
        • Borg monitors the health of the jobs.
          • But it seems the health is in the response time etc.
      • Migration
        • Scheduling of tasks is very critical since there is not virtualization of physical machines
          • It seems job migration is not possible?
      • Over-selling
        • As users may over-purchase the resources, the Borg use over-selling approach
          • What if the over-selling cannot be satisfied. e.g., United Airline over-sell the tickets, will it cause trouble?
          • Or it seems the over-sell is only for low priority?
          • How to predict the future resource usage for the other Google groups, so that they can purchase the Quota as accurate as they can?
        • Users pay for what they use? or for what they reserved?
      • The idea of simulator of Borgmaster is cool
        • To decide whether a configuration changes will evict any important jobs
      • Fake-estimation in order to be scheduled earlier
        • Purposely stating less CPU needs in order to scheduled earlier
      • Will the user specify the running time of the job?
        • How will a false estimation that affects the schedule?
      • How Borg is different from Google Compute Engine
        • Can Borg be replaced by GCE

      Questions Answered by Myself

      • Q: As all Google groups pay for the resource they use, is the price a fixed value, or does it has incentive to motivate people to make better utilization of the resources
        • For example, a more urgent request for the machine is more expensive. This can motivate people to make plan for their machine usage. 
        • A: Yes. Jobs are of different priority, jobs with higher priority are more expensive. 

        Software Defined Network

        Introduction

        The SDN paradigm creates a separation between data plane processing and control flow processing.
        • data plan processing
          •  forwards packets
        • control flow processing
          • determine how to populate forwarding tables
        The OpenFlow protocol acts as an API between network switches and a logically centralized decision maker, called the OpenFlow controller. In this model,
        • Network switch
          • cache data flow rules
          • when a switch receive a packet and does not know how to forward it according to its cached rules, the switch sends an “elevation” request containing the original packet and a request for guidance to the controller
        • OpenFlow Controller
          • examines the packet and sends a set of rules that the switch should add to the data plane cache for use in forwarding packets.

        Vulnerability

        DDos Attack on the controller 

        • DDos Attack on the controller [1]
          • Since the switches will send all the packets it cannot handle to the controller
          • There is no need for an attacker to catch the IP address or location of the controller through scanning before launching an attack. The attackers can send some specific attack packets and abnormal packets to SDN networks, all switches will automatically forward these packets to their controller.
        • Attackers send packets to probe whether the target is SDN architecture [1]
          • Every flow entry in the flow table of a switch contains three items, i.e., rule, action and stats.
          • The attacker can make a new or abnormal packet from carefully selected IP, Port, MAC etc., and then send it to the switch.
          • Generally, there is no rule in the switch matching the fresh packet sent for the first time. The packet will be uploaded to the controller, and then the controller will broadcast this packet’s information to all the network interface to find its route. 
          • Once getting the route, the controller will issue corresponding rules to the switch’s flow table. Otherwise, the controller will make a rule to switch to drop these packets. This whole response will take a long time
          • Then the attacker will send a group of packets with the same information for a second time to the switch, if the response time is much shorter than that of the first time, the network can be determined to be SDN architecture.
          • An attacker may launch Blind DDoS directly on the network which claims to be SDN network architecture or switch the attackers has already known is SDN system by scanning. 

        Tools

        • Mininet [2]
          • can be used to create a network of OpenFlow switches that are connected according to your plan. 

        Reference

        [1] Defending Blind DDoS Attack on SDN Based on Moving Target Defense, by Duohe Ma, Zhen Xu and Dongdai Liu, in SecureCOMM14
        [2] OpenFlow random host mutation: Transparent moving target defense using software defined networking, by J. H. Jafarian, in HotSDN 2012

        [Leetcode] Remove Duplicates from Sorted Array

        Problem

        Given a sorted array, remove the duplicates in place such that each element appear only once and return the new length.
        Do not allocate extra space for another array, you must do this in place with constant memory.
        For example,
        Given input array nums = [1,1,2],
        Your function should return length = 2, with the first two elements of nums being 1 and 2 respectively. It doesn’t matter what you leave beyond the new length.

        Analysis

        • Keep track of the right positive to write the valid number, it is also the length of the valid numbers
        • Keep track of current number
          • Write it to the array only when it is different from next number
        • Time: O(n)
        • Extra space: O(1)

        Code

        [Leetcode] Minimum Depth of Binary Tree

        Problem

        Given a binary tree, find its minimum depth.
        The minimum depth is the number of nodes along the shortest path from the root node down to the nearest leaf node.

        Analysis

        • Recursive calculate the depth of the tree
          • 1 + Math.min(left-tree, right-tree)
        • Need to deal with the case when left-tree or right-tree is null
          • If left-tree is null, return right-tree
          • If right-tree is null, return left-tree
        • Time: O(n)
          • As it traverses each node once

        Code

        [Leetcode] Happy Number

        Problem

        Write an algorithm to determine if a number is “happy”.
        A happy number is a number defined by the following process: Starting with any positive integer, replace the number by the sum of the squares of its digits, and repeat the process until the number equals 1 (where it will stay), or it loops endlessly in a cycle which does not include 1. Those numbers for which this process ends in 1 are happy numbers.
        Example: 19 is a happy number
        • 12 + 92 = 82
        • 82 + 22 = 68
        • 62 + 82 = 100
        • 12 + 02 + 02 = 1

        Analysis

        • Since the computation is a recursive process, so that we can recursive check whether a number is a happy number
        • We need to maintain a hashset to track whether a number has shown up before
          • If yes, then it will not be possible to reach 1 in the future

        Code

        [Leetcode] Binary Tree Paths

        Problem

        Given a binary tree, return all root-to-leaf paths.
        For example, given the following binary tree:
           1
        /
        2 3

        5
        All root-to-leaf paths are:
        ["1->2->5", "1->3"]

        Analysis

        • Recursive constructing the path in bottom up fashion
        • For each node, get its left and right paths recursively, denote them as left and right, then the recursion is
          •  (node.val + (each path in left))
          • (node.val + (each path in right))
        • Time: O(N) where N is the number of nodes in the tree

        Code

        Reinforcement Learning

        Introduction

        Multi-agent reinforcement learning (MARL) algorithms gradually learn good (ideally optimal) strategies with respect to long-term goals through trial-and-error interactions with both the opponent and the unknown dynamic environment. 
        The Stochastic Game (SG), together with MARL can address the environmental dynamics in security games in a systematic manner.

        Reference

        [1] A comprehensive survey of multi-agent reinforcement learning, by L. Busoniu, R. Babuska, and B. De Schutter, in IEEE Trans. Syst., Man, Cybern. C, 2008

        [2] Improving Learning and Adaptation in Security Games by Exploiting Information Asymmetry, by Xiaofan He. Huaiyu Dai and Peng Ning, in INFOCOM 2015

        Virtualization and Code Migration

        Motivation for Code Migration

        • Load sharing in distributed systems
          • Long-running process can be migrated to idle processors
        • Client-server systems
          • Code for data entry shipped to client systems
          • If large quantities of data need to be processed, it is better to ship the data processing component to the client
          • Dynamically configurable client software
            • More flexibility, easier maintenance and upgrade of client software
        • Enterprise and “Desktop Grids”
          • Computationally-intensive tasks shipped to idle PCs around the network

        Models for Code Migration

        • A process has three segments
          • Code segment
            • Set of instructions making the problem
          • Execution segment
            • Private data, stack, PC, registers
          • Resource segment
            • Reference to external resources such as files, printers, devices etc
        • Weak vs Strong Mobility
          • Weak mobility
            • Only code segment + initialization data migrated
              • e.g., Java Applets
          • Strong mobility
            • Code segment + Execution segment
        • Send-initiated vs. receiver-initiated migration
          • Receiver-initiated is much easier considering security

        Migration and Local Resources

        • Process-to-resource bindings make code migration difficult
        • Three types of processor to resource bindings
          • Binding by identifier
            • When a process refers to a resource by its identifier
              • e.g., URL, IP address, local communication endpoint (socket)
          • Binding by value
            • Weaker form of binding when only the value of a resource is needed
              • e.g., when a program replies on standard language libraries
          • Binding by type
            • Weakest form of binding when a process indicates the type of a resource
              • e.g., a printer