Transactions and Concurrency

Posted on January 5, 2016 by admin

Transactions

Motivation

Provide atomic operations at servers that maintain shared data for clients
Provide recoverability from server crashes

Properties (ACID)

Atomicity
Consistency
Isolation
Durability

Concurrency Control

Motivation

Without concurrency control, we have lost updates, inconsistent retrievals, dirty reads, etc.
Concurrency control schemes are designed to allow two or more transactions to be executed correctly while maintaining serial equivalence

Serial Equivalence is correctness criterion

Schedule produced by concurrency control scheme should be equivalent to a serial schedule in which transactions are executed one after the other.

Scheme

Locking
Optimistic concurrency control
Time-stamp based concurrency control

Use of Locks in Strict Two-Phase Locking

When an operation accesses an object within a transaction

(1) If the object is not already locked, it is locked and the operation proceeds
(2) If the object has a conflicting lock set by another transaction, the transaction must wait until it is unlocked
(3) If the object has a non-conflicting lock set by another transaction, the lock is shared and the operation proceeds
(4) If the object has already been locked in the same transaction, the lock will be promoted if necessary and the operation proceeds

When promotion is prevented by a conflicting lock, rule 2 is used

Strict Two-Phase Locking

Deadlock

Example

Resolution of Deadlock

Timeout

Optimistic Concurrency Control

Drawback of locking

Overhead of lock maintainance
Deadlocks
Reduced concurrency

Optimistic Concurrency Control

In most applications, likelihood of conflicting accesses by concurrent transaction is low
Transactions proceed as though there are no conflicts
Three phases

Working Phase

Transactions read and write private copies of objects

Validation phase

Each transaction is assigned a transaction number when it enters its phase

Update phase

Validation of Transaction

Timestamp Based Concurrency Control

Each transaction is assigned a unique timestamp at the moment it starts

In distributed transactions, Lamport’s timestamps can be used.

Every data item has a timestamp

Read timestamp = timestamp of transaction that last read the time
Write timestamp = timestamp of transaction that most recently changed an item

Timestamp ordering write rule

Concurrency Control for Distributed Transactions

Locking

Distributed deadlocks possible

Timestamp ordering

Lamport time stamps

The Two-Phase Commit Protocol

Three Phase Commit

Problem with two-phase commit

If coordinator crashes, participants cannot reach a decision, stay blocked until coordinator recovers

Three-phase commit

There is no single state from which it is possible to make a transaction directly to another COMMIT or ABORT state
There is not state in which it is not possible to make a final decision, and from which a transaction to COMMIT can be made.

Network Mapping Threat

Posted on January 3, 2016 by admin

Threat

Reconnaissance enables adversaries to gather information about the target system including network topology, configurations, network dynamics.

This information can be used to identify system vulnerabilities, and to design and execute specific exploits.

Procedure of Attack

Most network mapping tools perform their operations by using ICMP packets and TCP or UDP scans.

ICMP messages are typically used to verify connectivity or reachability of potential targets. TCP and UDP port scans are used to identify running services of a target.
Replies (RCP RST, silent drop of ICMP unreachable) to scans can also reveal what services are allowed or filtered through transit devices.
Additionally, the TTL field of IP packets is used to identify the distance between the target and the destination.

Reference
[1] SDN-based solutions for Moving Target Defense Network Protection, by Panos Kampanakis, in WoWMoM14

Network-based Moving Target Defense

Posted on January 3, 2016 by admin

Background

Software Defined Network

Motivation

Scanning Worms

[3] Scanning tools and worms usually send probes to random IP addresses in the network in order to discover their targets.
When a target responds, it can then be identified and attacked. Otherwise, the probed addresses will be considered unused.
Despite firewall deployment, most enterprise networks have many public and private hosts accessible from outside.
The IP address assignment scheme can become more dynamic by using approaches based on DHCP or NAT, but they are insufficient to provide proactive countermeasure because the IP mutation is infrequent and traceable.

Network mapping threat

Previous Work

DHCP

Prior work focused on defending hit-list scanning malware utilized DHCP to change the IP address of the host over time
However, this change can disrupt existing connections.

This can be solved by placing an intermediate NAT-like box in the network and transparently transition to the new address over time.
The NAT-like device provided similar address translating behavior, with additional logic to preserve old addresses while they are still in use by previously established connections.

IP address randomization [4]

Rather than changing out the DHCP lease to achieve randomization, they randomly rotate the IP address contained in the DNS reply and notified the NAT device.
The NAT device then allows new connections with that IP to reach the requested host by mapping it to the host’s fixed internal address.

Using NAT to hide the real IP address of the server will make it difficult to reach legitimate hosts remotely. [8]

Network Address Space Randomization (NASR) [7]

It was proposed to offer an IP hopping approach that can defend against hitlist worms.
NASR is a LAN-level network address randomization scheme based on DHCP updates.
Cons

Not transparent

NASR is not transparent to end-hosts because DHCP changes are applied to the end-host itself which results in disruption of active connections during address translation.

Expensive to deploy

Moreover, it requires changes to the end-host operating system which makes its deployment very costly.

Limited unpredictability

Also, NASR provides very limited unpredictability and mutation speed because its IP mutation is limited on the LAN address space and will require DHCP and host to be reconfigured for this purpose (the maximum IP mutation speed in once very 15 minutes).

Network Address Randomization

IP address randomization [5]

Requires the DNS gateway in both the clients and servers to do the translation

DNS Capability Approach

Ref [6][2]
The mapping system was implemented using two components

A DNS server

Provide clients with the destination IP address of a targeted server at that moment

A NAT mapping device

Grant the client access to the server when a destination mapping is used.
It stores the state for each established flow, allowing the DNS server to change addresses and providing a unique response to each new client without disrupting the established flow.

My questions

As the NAT needs to maintain so many states, will it result in DDoS attack?

OpenFlow Mutation Approach

Ref [6] [3]
Similar to DNS Capability approach.
The network switches plays the role of NAT devices.
When the switch receives a packet and does not known how to forward the packet, it can ask for instructions (called elevation) from a specific machine, called the OpenFlow controller.
The OpenFlow mutation approach uses this elevation mechanism to alter the NDS records and to change packet address in flight.

When a client performs a DNS lookup, the DNS server will provide the client with a virtual IP address for a target.
When the client attempts to access the target, the packet is forwarded using the virtual IP address until it reaches the destination switch, at which point it is translated to the destination host’s real IP address. The network switches essentially act as NAT devices, performing translations between virtual and real IP addresses.

Communication via Name

Steps 1-3

When a DNS query is sent to resolve the name of a host
The DNS response is updated by the NOX controller to replace the rIP of the server with its active vIP
The NOX controller also sets the TTL value in the DNS response to a small value.

Step 4

The source host can then initiate the connection using the vIP of the destination
The OF-switch encapsulates and sends the initial packet to the controller, because there is no matching flow for it.

Step 5

The NOX controller installs relevant flows in OF-switches in the route.
These flows are associated with relevant required Set-Field actions which determine the translation of source/destination rIP addresses to/from vIP addresses
Relevant flows are installed in destination OF-switch as well. Any other switches in the path are only configured (i.e., flows with no actions) to route the traffic based on vIP.
Future packets will be matches and forward by OF-switches (without being sent to controller) according to the installed flows in the flow table.
The vIP-rIP translation actions will be applied to packets by OF-switches.

Communication via rIP address

Figure 3 shows how authorized users (e.g., administrator) can reach hosts using rIPs. In this case, the source host initiates a connection with the destination using its rIP.

Step 1

Similar to DNS scenario, the OF-switch will fail to match the new packet with any flow and sends it to NOX controller

Step 2

The NOX controller authorizes the access request

Step 3

If granted, the controller installs appropriate flows in OF-switches in the route with appropriate vIP-rIp translation actions according to Figure 3

If the source host is internal, NOX controller installs two inbound and outbound flows in source OF-switch, same to DNS scenario.

Two flows are also inserted in the flow table of destination OF-switch.

However, destination flows require no translation, because the destination host is being reached via its rIP.

MT6D Approach

Ref [6][1]
The client and the server share a symmetric key out-of-band and use these keys to determine the IPv6 addresses the hosts will use.
To construct their IPv6 address

the hosts construct a hash using the shared key, a value derived from the host’s MAC address, and a timestamp. The approach extracts 64 bits from the hash output to encode the lower 64 bits of the host’s IPv6 address.
To perform uniform communication between the host applications, the operating system on each host uses a tunneling approach and rotates the addresses of the tunnel end-points.

Cons

Organizations cannot protect public-facing server infrastructure without requiring clients to install software
The MT6D approach is only effective when both the client and the server can be modified, which prevents its use when protecting public infrastructure for legacy clients.

Reference

[1] MT6D: A moving target IPv6 defense, by M. Dunlop et al, in Military Communications Conference 2011.

[2] On building inexpensive network capabilities. by C. A. Shue et al, in SIGCOMM 2012

[3] OpenFlow random host mutation: Transparent moving target defense using software defined networking, by J. H. Jafarian, in HotSDN 2012
[4] The SDN Shuffle: Creating a Moving-Target Defense using Host-based Software-Defined Networking, by Douglas C. Macfarland, in MTD15
[5] Adversary-aware IP Address Randomization for Proactive Agility against Sophisticated Attackers, by Jafar Haadi Jafarian, in INFOCOM 2015
[6] Characterizing Network-Based Moving Target Defense, by Douglas C. MacFarland, in MTD15
[7] Defending against Hitlist Worms using Network Address Space Randomization, by S. Antonatos, in Computer Network 2007
[8] Random Host Mutation for Moving Target Defense, by Ehab Al-Shaer, Qi Duan, and Jafar Haadi Jafarian, in Securecomm 2012

Update Propagation in Distributed System

Posted on January 3, 2016 by admin

Update Propagation

A comparison between push-based and pull-based protocols in the case of multiple client, single server system

Epidemic Protocols

Update propagation for systems that only need eventual consistency
Randomized approaches based on the theory of epidemics

Infective, susceptible and removed servers

Anti-entropy propagation model

A server P picks another server Q at random, and exchanges updates
Three approaches

P only pushes updates to Q
P only pulls new updates from Q
P and Q send updates to each other

If many infective servers, pull-based approach is better
If only one infective server, eitehr approach will eventually propagate all updates
Rumor spreading (Gossiping) will speed up propagation

If server P has been updated, it randomly contacts Q and tries to push the update to Q, if Q was already updated by another server, with some probability (1/k), P loses interest in spreading the update any further

The Gossip System

Guarantees

Each client obtains a consistent service over time

i.e., replica managers only provide a client with data that reflect the updates the client has observed so far

Relaxed consistency between replicas

Primarily causal consistency, but support also provided for sequential consistency
Choice up to the application designer

Procedure

Request

Front Ends sends a query or update request to a replica manager that is reachable

Update Response

RM replies as soon as it receives updates

Coordination

RM does not process the request until it can meet the required ordering constraints

This may involve receiving updates from other replica managers in gossip messages

Execution
Query Response

If the request is a query, the RM replies at this point

Agreement

The replica managers update each other by exchanging gossip messages, which contains the most recent updates they have received
This is one in lazy fashion

Query and update operations in a gossip service

Front ends propagate their timestamps whenever clients communicate directly

A gossip replica manager, showing its main state components

Bayou: Eventual Consistency

If no updates take place for a long time, all replicas will gradually become consistent
Domain specific conflict detection and resolution

Appropriate for applications like shared calendars

Motivation for eventual consistency

Sequential consistency requires that at every point, every replica has a value that could be the result of the global-agreed sequential application of writes
This does not require that all replicas agree at all times, just that they always take on the same sequence of value
Writes

When they arrive, are applied in the same order at all replicas
Easily done with timestamps

Conflict Resolution

Dealing with inconsistency

Every Bayou update contains a dependency check and a merger procedure in addition to the operation’s specification

Replication not transparent to application

Only the application knows how to resolve conflicts

Split of responsibility

Replication system

Propagate updates

Application

resolve conflicts

Optimistic application of writes require that writes be “undoable“

Rolling Back Updates

Keep log of updates
Order by timestamp
When a new update comes in, place it in the correct order and repapply log of updates
Need to establish when you can truncate the log
Requires old updates to be “committed”, new ones tentative
Committed order can be achieved by designing a replica manager as the primary replica manager

[Leetcode] 3Sum Cloest

Posted on January 3, 2016 by admin

Problem

Given an array S of n integers, find three integers in S such that the sum is closest to a given number, target. Return the sum of the three integers. You may assume that each input would have exactly one solution.

    For example, given array S = {-1 2 1 -4}, and target = 1.

    The sum that is closest to the target is 2. (-1 + 2 + 1 = 2).

Analysis

Brute force approach

Enumerate the first, second, third number respectively
Update the distance compared with the target
Time: O(n^3)
Space: O(1)

Code

How can a Ph.D. student organize his paper

Posted on January 2, 2016 by admin

Ph.D. students often read a lot of papers, and papers might of different categories. For example, I am a Ph.D. student with research area in security focusing on moving target defense, the papers I read can be divided into different defense strategy

dynamic platforms
dynamic execution environment
…

One way to organized the paper is to use Goolge Scholar.

Maintaining papers with Google Scholar

Step 1: search the paper using google scholar, and click the “save” item, then the paper will be saved to your library.

Step 2: add labels to the paper. Click on the paper in your library, and select the label.

Benefits

“online-access” to it all the time.
You can sort the papers by year.
You can track who cite the papers and their citation number!

Motivation of Moving Target Defense

Posted on January 2, 2016 by admin

Phases of Attacks

Any attack will go through at least three phases: probing, constructing and launching phases. [1]

If the environment stays static, the attacker has time to identify existing vulnerabilities to be exploited.

However, if the life cycle of an application version ins much shorter than it takes for the attacker to launch the attack as it will be, the attacker will not be able to succeed in exploiting any existing vulnerabilities in the cloud application.

Current Static Network/System

The static nature of current network configuration approaches has made it easy to attack and breach a system and to maintain illegal access to privileges for extended periods of time. [2]

The attacker have time to study the network of defender and to determine potential vulnerabilities and choose the time of attack and gain the maximum benefit.
Once an attacker acquires a privilege, that privilege can be maintained for a long time without being detected.

Objective of Moving Target Defense

Moving target defense aims at continuously changing a system’s attack surface, and thus

increase the uncertainty, complexity and cost for attackers
limit the exposure of vulnerabilities
ultimately increase overall resiliency

The idea of moving target defense is to

reduce information asymmetry between the attacker and the defender
and ultimately rendering the reconnaissance information misleading or uesless

Even if the attacker succeeds in finding a vulnerability at one point, the vulnerability could be unavailable as the result of shifting the underlying system, which makes the environment more resilient against attacks.

Reference

[1] Autonomic Resilient Cloud Management (ARCM), by Cihan Tunc etc, in ICCAC 2014
[2] Simulation-based Approaches to Studying Effectiveness of Moving-Target Network Defense, by Rui Zhuang, Su Zhang, Scott A. Deloach, Xinming Ou, and Anoop Singhal, in MTD 2015

Code Reuse Attacks

Posted on January 1, 2016 by admin

What it is

Code reuse attacks allow the adversary to make malicious results by exploiting the control flow of existing program without any additional code injection.

In a code-reuse attack, the attacker combines existing code fragments (called gadgets) to achieve arbitrary computation. While code-resue attacks are Turing complete, they are generally used to disable code integrity and to allow an attacker to execute injected code.

Return Oriented Programming (ROP)

Code reuse attacks allow attackers to execute arbitrary code on a compromised machine.

In this, the attacker directs the control flow through existing code without injecting new executable code.

Using ROP, the attackers can link small pieces of code which is known as gadgets, that already exist in the binary image of a vulnerable application.

In fact, the ROP gadgets are short sequence of code, typically ending with a return or indirect control transfer instruction. Instead of injecting binary code into the memory space of an application, the attacker can use a sequence of gadget in the stack or other memory of the program.

Each gadget ends with an indirect control transfer instruction, which transfers the control of next gadget according to the injected gadget sequence.

Existing defender lose effects

During the attack, the adversary can circumvent many defenses such as

Read-only memory
Non-executable meomry
Kernel-code integrity protections

Since the injected part is only data (rather than code). In addition, access to ROP exploits is not difficult since they are provided in the publicly available packs.

Most of existing defense mechanisms cannot defend code reuse attacks, such as

Instruction Set Randomization
Simple Address Space Layout Randomization (ASLR)
Stack canaries (Ref [25] in [3])

How to defense

Instruction Location Randomization

Software Diversity

By randomizing a binary’s code layout, a memory vulnerability is moved to a priori unknown location in the binary, thereby bring down the probability of return-to-libc and return-oriented attacks. [2]

[2] proposes an approach to recompile the code during execution with Java JIT compiler.

Reference

[1] Enhancing Software Dependability and Security with Hardware Supported Instruction Address Space Randomization, by Weidong Shi, in DSN15
[2] Adaptive Just-in-Time Code Diversification, by Abhinav Jangda, in MTD15
[3] An Evil Copy: How the Loader Betrays you

[Leetcode] Merge Sorted Array

Posted on January 1, 2016 by admin

Problem

Given two sorted integer arrays nums1 and nums2, merge nums2 into nums1 as one sorted array.

Note:
You may assume that nums1 has enough space (size that is greater or equal to m + n) to hold additional elements from nums2. The number of elements initialized in nums1and nums2 are m and n respectively.

Analysis

题目是把两个sorted的数组合并成一个，这里假设数组1足够大，所以直接合并到数组1
记录当前要写入的index，初始值为 m+n-1
对两个数组从后往前扫描

谁比较大，谁就先写入
同时update index

Time: O(m+n)
Space: O(1)

Code

Reference
[1] https://leetcode.com/problems/merge-sorted-array/

Consistency and Replication (II)

Posted on January 1, 2016 by admin

Basic Architecture for Replica Data Management

System Model

Five phases in performing a request

Front end issues the request

Either sent to a single replica or multicast to all replicate managers

Coordination

Replica managers coordinate in preparation for the execution of the request

i.e., agree is requests is to be performed and the ordering of the request relative to others

Execution
Agreement

Reach consensus on effect of the request, e.g., agree to commit or abort in a transactional system

Response

Mechanism for Sequential Consistency

Primary-based replication protocols
Replicated-write protocols

Active replication using multicast communication
Quorum-based protocols

Primary-backup Model

Front ends only communicate with primary

Procedures

Request

FE issues a request containing a unique identifier to the primary replica manager

Coordination

The primary takes each request in the order in which it receives it

Execution

The primary executes the request and store the response

Agreement

If the request is an update, the primary sends the updated state, the response and the unique id to all backups. The backups send an acknowledgement.

Response

The primary responds to the front end, which hands the response back to the client.

Implementation

Implements linearizability if primary is correct, since primary sequences all the operations

Failures

If primary fails, the system retains linearizabilty if a single back becomes the new primary and if the new system configuration takes over exactly where the last left off

If primary fails, it should be replaced with a unique backup
Replica managers that survive have to agree upon which operation has been performed when the replacement primayr is over
Requirements met if replica managers organized as a group and if primary uses view-synchronous communication to propagate updates to backups.

Replicate-Write Protocol

Active replication

Front end multicasts request to each replica using a totally ordered reliable multicast
System achieves sequential consistency but not linearizability

Total order in which replica managers process requests may not be the same as real-time order in which clients made request.

Implementing Ordered Multicast

Incoming messages are held back in a queue until delivery guarantees can be met

The hold-back queue for arriving multicast messages

Coordinate all machines needed to determine delivery order
FIFO-ordering

Easy
Use a separate sequence number for each process

Total ordering

Use a sequencer
Distributed algorithm with three phases

Causal order

Use vector timestamps

The ISIS algorithm for total ordering

Causal Ordering using Vector Timestamps

Quorum-based Protocols

Procedures

Assign a number of votes to each replica
Let N be the total number of votes
Define R = read quorum, W = write quorum
R+W > N
W > N/2

guarantee that no two writes at the same time
since if yes, than the vote for w_1 and w_2 are larger than N

Only one writer at a time can achieve write quorum
Every reader sees at least one copy of the most recent read (takes one with most recent version number)

Examples

Scaling

None of the protocols for sequential consistency scale
To read or write, you have to either

Contact a primary copy
Use reliable totally ordered multicast
Contact over half the replicas

All this complexity is to ensure sequential consistency

Even the protocols for causal consistency and FIFO consistency are difficult to scale if they use reliable multicast

Can we weaken sequential consistency without losing some important features?

Highly Available Service

Emphasis on giving clients access to the service with reasonable response time, even if some results do not conform to sequential consistency
Examples

Gossip

Relaxed consistency

Causal update ordering

Bayou

Eventual consistency
Domain-specific conflict detection and resolution

Coda (file system)

Disconnected operation
Use vector timestamp to detect conflicts