Transactions and Replications

One Copy Serializability

  • Replicated transactional service
    • Each replica manager providers concurrency control and recovery of its own data items in the same way as it would for non-replicated data
  • Effects of transactions performed by various clients on replicated data items are the same as if they has been performed one at a time on a single data item
  • Additional complications
    • Failures should be serialized w.r.t. transactions, i.e., any failure observed by a transaction must appear to have happened before a transaction started

Replication Scheme

  • Primary copy
    • All client request are directed to a single primary server
  • Read one — write all
    • cannot handle network partitions
    • Each write operation sets a write lock to each replica manager
    • Each read sets a read lock at one replica manager
  • Schemes that can handle network partitions
    • Available copies with validation
    • Quorum consensus
    • Virtual Partition

Available Copies Replication

  • Can handle the case when some replica managers are unavailable because they failed or communication failure
  • Reads can be performed by any available replica manager but writes must be performed by all available replica managers
  • Normal case is like one/write all
    • As long as the set of available replica managers does not change during a transaction
  • RM failure case
    • One copy serializability requires that failures and recovery be serialized w.r.t transactions
    • This is not achieved when different transactions make conflicting failure observations
    • Examples shows local concurrency control is not enough
    • Additional concurrency control procedure (called local validation) has to be performed to ensure correctness
  • Available copies with local validation assume no network partition
    • i.e, functioning replica managers can communicate with one another.
  • Local validation
    • Before a transaction commits it checks for failures (and recoveries) of replica managers of data items it has accessed

Handling Network Partititons

  • Network partitions separate replica managers into two or more subgroups, in such a way that the members of a group can communicate with one another but members of different subgroups cannot communicate
  • Optimistic approaches
    • Available copies with validation
  • Pessimistic approaches
    • Quorum consensus

Available Copies with Validation

  • Available copies algorithm applied within each partition
    • Maintains availability for read operations
  • When partition is repaired, possibly conflicting transactions in separate partitions are validated
    • The effects of a committed transactions that is now aborted on validation will have to be undone
      • Only feasible for applications where such compensating actions can be taken
  • Validation
    • Versions vector (write-write conflicts)
    • Precedence graphs (each partition maintains a log of data items affected by the Read and Write operations of transactions)
    • Log used to construct a precedence graph whose nodes are transactions and whose edges represent conflicts between Read and Write operations
      • No cycle in graph corresponding to each partition
    • If there are cycles in graph, validation fails.

Quorum Consensus

  • A quorum is a subgroup of replica managers whose size gives it the right to carry out operations
  • Majority voting one instance of a quorum consensus scheme 
    • R+ W > total number of votes in group
    • W > half f the total votes
    • Ensure that each read quorum intersects a write quorum, and two write quorum will intersect
  • Each replica has a version number that is used to detect if the replica is up to date. 

Virtual Partition Scheme

  • Combines available copies and quorum consensus
  • Virtual partition = set of replica managers that have a read and write quorum
  • If a virtual partition can be formed, available copies is used
    • Improve performance of Reads
  • If a failure occurs, and virtual partition changes during a transaction, it is aborted
  • Have to ensure virtual partitions do not overlap.

CAP Conjecture

  • Is it possible to achieve consistency, availability, and partition tolerance?.
  • Classic distributed systems: focused on ACID semantics
    • Atomic
    • Consistent
    • Isolated
    • Durable
  • Modern internet system: focused on BASE
    • Basically Available
    • Soft-state (or scalable)
    • Eventually consistent
  • ACID v.s. BASE

Why the Divide

  • What goals might you want for a shared-data system
    • consistency
    • availability
    • partition tolerance
  • Strong consistency
    • all clients see the same view, even in the presence of updates
  • High availability
    • All clients can find the same replica of the data, even in the presence of failures
  • Partition-tolerance
    • The system properties hold even when the system is partitioned. 

CAP Conjecture

  • You can only have two out of these three properties
  • The choice of which feature to discard determines the nature of your system. 

Consistency and Availability

  • Comment
    • Providing transaction semantics requires all nodes to be in contact with each other
  • Example
    • Single-site clustered databases
  • Typical features
    • Two-phase commit
    • Cache invalidation protocol
    • Classic distributed system style

Consistency and Partition Tolerance

  • Comment
    • If one is willing to tolerate system-wide blocking, then can provide consistency even when there are temporary partitions
  • Examples
    • Distributed databases
    • Distributed locking
    • Quorum (majority) protocols
  • Typical features
    • Pessimistic locking
    • Minority partitions unavailable
    • Also common distributed system
      • voting vs. primary replicas

Partition-Tolerance and Availability

  • Comment
    • sacrifice consistency
  • Examples
    • DNS
    • Web cache
    • Coda
    • Bayou
  • Typical features
    • TTLs and lease cache management
    • Optimistic updating with conflict resolution

Techniques

  • Expiration-based caching: AP
  • Quorum/majority algorithm: PC
  • Two phase commit: AC

Cache-based Covert Channel

Background

Cache-based Covert Channel 

[1] Cache load measurements create very effective covert channels between cooperating processes running in different VMs. 
In practise, this is not a major threat for current deployments since in most cases the cooperating processes can simply talk to each other over a network. 
However, covert channels become significant when communication is forbidden by information flow control (IFC) mechanisms such as sandboxing and IFC kernels. The latter is a promissing emerging approach to improving security (e.g., web-server functionality). 
[1] explains more on the covert-channel in Section 8.1.

Reference

[1]  Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds, by Restenpart, T. et al., in CCS09

How to Measure Cache Usage

Prime + Trigger + Probe

[1] demonstrates how to utilize the Prime + Probe technique to measure the cache activity, and extend it to the following Prime + Trigger + Probe measurement to support the setting of time-shared virtual machines.

  • The probing instance first allocates a continuous buffer B of b bytes
    • Here b should be large enough that a significant portion of the cache is filled by B.
  • Let s be the cache line size, in bytes.
  • Then the probing instance performs the following steps to generate each load sample
    1. Prime: Read B at s-byte offset in order to ensure it is cached. 
    2. Trigger: Busy-loop until the CPU’s cycle counter jump by a large value
      • This means our VM was preempted by the Xen scheduler, hopefully in favor of the sender VM. 
    3. Probe: Measure the time it takes to again read B at s-byte offsets. 
When reading b/s memory locations in B, we use a pseudorandom order, and the pointer-chasing technique to prevent the CPU’s hardware prefetcher from hiding the latencies. 
The time the of the final step’s read is the load sample, measured in number of CPU cycles. These laod samples will be strongly correlated with use of the cache during the trigger step, since last usage will evict some portion of the buffer and thereby drive up the read time during the probe phase. 

Reference 

[1] Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds, by Restenpart, T. et al., in CCS09

Threat of Covert Channel Attacks

Summary

  • Attackers can build various side channels to circumvent the logical isolation in cloud physical machines, and obtain sensitive information from co-resident VMs
    • Coarse-grained, e.g., workloads and web traffic rates
      • Since the cache utilization rate has a large impact on the execution time of the cache read operation, attackers can infer the victim’s cache usage and workload information, by applying the Prime+Probe technique. 
      • Similarly, they can estimate the victim’s web traffic rate, which also has a strong correlation with the execution time of cache operations. [2]
      • [1] demonstrate a clear correlation between a victim’s web traffic rate with the load sample. 
    • Fine-grained, e.g., cryptographic keys
      • Attackers can exploit shared hardware resources, such as the instruction cache, to extract cryptographic keys. Specifically, the following challenges are overcomed
        • Dealing with core migrations and determining if an observation is associated with the victim
        • Filtering out hardware and software noise, and regaining access to the target CPU core with sufficient frequency
  • For clever attackers, even seemingly innocuous information like workload statistics can be useful.
    • For example, such data can be used to identify when the system is most vulnerable, i.e., the time to launch further attacks, such as Denial of Service attacks. [9]
Reference
[2] Using Virtual Machine Allocation Policies to Defend against Co-resident Attacks in Cloud Computing, by Yi Han et al, in Transactions on Dependable and Secure Computing

Web Applications Client & Server

Figure 1 [1] illustrates the web application architecture in the server side and client side.

Server Side

  • Logic Layer
    • Implements the application business logic using high-level programming languages, such as Java, PHP, or Python. 
  • Web Server Layer
    • Receive HTTP request, and passes the request to the appropriate server-side program, e.g., Apache web server, Windows IIS, or Nginx.
  • Data Storage Layer
    • Stores the web application state and user data. Popular data storage systems are traditional SQL databases, which include MySQL, PorsgreSQL, or MSSQL
  • Infrastructure Layer
    • Runs the operating systems. An infrastructure could be a physical machine or virtualization platform which manages multiple virtual machines. 

Client Side

The client side receives HTTP response from the server-side, and the job of the client is to convert the HTML contained in the HTTP response into a graphical interface from the user. 

  • Logic Layer (Presentation Layer)
    • It is written in a combination of HTML, CCS, and JavaScript, with JavaScript providing a way for the sever-side code to execute application logic on the client

  • Browser 
    • Retrieves the presentation layer code from the server, interprets it, and presents it as a graphic interface to the user. 

  • Storage Layer
    • For the presentation layer code to store data. Available storage methods include cookies, local storage, IndexedDB, and File APIs.
  • Operating System Layer
    • Runs the browser

Reference
[1] Toward a Moving Target Defense for a Web Applications, by Marhony Taguinod, in International Conference on Information Reuse and Integration 2015

Web Application Moving Target Defense

Motivation

  • Web application remains the most popular way for businesses to provide services over the Internet. It is the “front door” of many companies, and therefore their security is of paramount importance.
  • Example [1]
    • JPMorgan Chase breach in 2014 affected 76 million US households
    • Bloomberg reported that the hackers exploited an overlooed falow in one of the bank’s webistes.
  • The efforts in discovering and fixing vulnerabilities are not enough to protect web applications for many reasons [1]
    • The increasing complexity of modern web applications brings inevitable risks that cannot be fully mitigated in the process of web application development and deployment
    • Attackers can take their time, to understand the web application’s functionality and technology stack, before launching an attack

Objective

  • Ref [1]
  • The key issue is to design MTD mechanism that 
    • Prevent or disrupt a vulnerability or exploit
    • While still providing the identical functionality

Issues

  • Choose what component to move in a web application
  • Decide the optimal time and how often to move components

Proposals

  • Change the server-side language used in a web application 
    • automatically translating server-side web application code to another language in order to prevent Code Injection exploits
  • Shifts the Database used in a web application
    • transform the backend SQL database into different implementations that speak different dialects in order to prevent SQL Injection exploits.

Reference

[1] Toward a Moving Target Defense for Web Applications, by Marthony Taguinod et al., in International Conference on Information Reuse and Integeration 2015

Fault Tolerance

Definitions

  • Availability
    • Probability the system operates correctly at any given moment
  • Reliability
    • Ability to run correctly for a long interval of time
  • Safety
    • Failure to operate correctly does not lead to catastrophic failures
  • Maintainability
    • Ability to “easily” repair a failed system

Failure Models

  • Different types of failures

Two-Army Problem

Byzantine Agreement 

  • [Lamport et al. (1982)]
  • Goal
    • Each process learn the true values sent by correct processes
  • Assumption
    • Every message that is sent is delivered correctly
    • The receiver knows who sent the message
    • Message delivery time is bounded
  • Byzantine Agreement Result
    • In a system with m faulty processes agreement, agreement can be achieved only if there are 2m+1 functioning correctly.
    • Note
      • This result only guarantees that each process receives the true values sent by correct processors, but it does not identify the correct process!

Byzantine General Problem

  • Phase 1: Generals announce their troop strengths to each other

  • Phase 2: Each general construct a vector with all troops

  • Phase 3: General send their vectors to each other and compute the majority voting

Reliable Group Communication

  • Reliable Multicast
    • All nonfaulty process which do not join/leave during communication receive the message

  • Atomic Multicast
    • All message are delivered in the same order to all processes

View Delivery

  • A view reflects current membership of group
  • A view is delivered when a membership change occurs and the application is notified of the change
    • Receiving a view is different from delivering a view
      • All members have to agree to the delivery of a review
  • View synchronous group communication
    • The delivery of a new view draws a conceptual line across the system and every message is either delivered on one side or the other of that line


Atomic Multicast

  • All message are delivered in the same order to “all” processes
  • Group view
    • The set of processes known by the sender when it multicast the message
  • Virtual synchronous multicast
    • A message multicast to a group view G is delivered to all nonfaulty process in G
      • If sender fails after sending the message, the message may be delivered to no one

Virtual Synchrony Implementation
  • Only stable messages are delivered
  • Stable message
    • A message received by all processes in the message’s group view
  • Assumptions (can be ensured by using TCP)
    • Point-to-point communication is reliable
    • Point-to-point communication ensures FIFO-ordering


Message Ordering

  • Total ordering does not imply causality or FIFO!


[Hackerrank] Grid Challenge

Problem

Given a squared sized grid G of size N in which each cell has a lowercase letter. Denote the character in the ith row and in the jth column as G[i][j].
You can perform one operation as many times as you like: Swap two column adjacent characters in the same row G[i][j] and G[i][j+1] for all valid i,j.
Is it possible to rearrange the grid such that the following condition is true?
G[i][1]G[i][2]G[i][N] for 1iN and 
G[1][j]G[2][j]G[N][j] for 1jN
In other words, is it possible to rearrange the grid such that every row and every column is lexicographically sorted?
Notec1c2, if letter c1 is equal to c2 or is before c2 in the alphabet.
Input Format
The first line begins with T, the number of testcases. In each testcase you will be given N. The following N lines contain N lowercase english alphabet each, describing the grid.
Output Format
Print T lines. On the ith line print YES if it is possible to rearrange the grid in the ith testcase or NO otherwise.
Constraints 
1T100 
1N100 
Gij will be a lower case letter
Sample Input
1
5
ebacd
fghij
olmkn
trpqs
xywuv
Sample Output
YES

Analysis

  • Sort each row
  • Then check each column to check whether numbers are in increasing order or not

Code

[Hackerrank] Sherlock and Array

Problem

Watson gives Sherlock an array A of length N. Then he asks him to determine if there exists an element in the array such that the sum of the elements on its left is equal to the sum of the elements on its right. If there are no elements to the left/right, then the sum is considered to be zero.
Formally, find an i, such that, A1+A2...Ai-1 =Ai+1+Ai+2...AN.
Input Format
The first line contains T, the number of test cases. For each test case, the first line contains N, the number of elements in the array A. The second line for each test case contains Nspace-separated integers, denoting the array A.
Output Format
For each test case print YES if there exists an element in the array, such that the sum of the elements on its left is equal to the sum of the elements on its right; otherwise print NO.
Constraints
1T10
1N105
1Ai 2×104
1iN
Sample Input
2
3
1 2 3
4
1 2 3 3
Sample Output
NO
YES
Explanation
For the first test case, no such index exists.
For the second test case, A[1]+A[2]=A[4], therefore index 3 satisfies the given conditions.

Analysis

  • First calculate the sum of all elements
  • Maintain curr as the cumulated sum in the left parts, when traversing the array
    • if (curr == sum – curr – arr[i]), then it means the sum in the left part and right part of arr[i] equals
  • Time: 
    • O(n), as the array has been traversed twice
  • Space
    • O(1)

Code

[Hackerrank] Time Conversion

Problem

Given a time in AM/PM format, convert it to military (24-hour) time.
Note: Midnight is 12:00:00AM on a 12-hour clock and 00:00:00 on a 24-hour clock. Noon is 12:00:00PM on a 12-hour clock and 12:00:00 on a 24-hour clock.
Input Format
A time in 12-hour clock format (i.e.: hh:mm:ssAM or hh:mm:ssPM), where 01hh12.
Output Format
Convert and print the given time in 24-hour format, where 00hh23.
Sample Input
07:05:45PM
Sample Output
19:05:45
Explanation
7 PM is after noon, so you need to add 12 hours to it during conversion. 12 + 7 = 19. Minutes and seconds do not change in 12-24 hour time conversions, so the answer is 19:05:45.

Analysis

  • 首先截取后两位字母,判断是AM还是PM
  • 注意关于时间,只需要改hour
    • 取hour
    • 如果PM
      • 若hour<12, 则加12,否则不变
      • 如 12:xx:xxPM, should be 12:xx:xx
      • 11:xx:xxPM should be 23:xx:xx
    • 如果AM
      • 若hour>12, 则转为0
      • 否则不变

Code