Covert Channel Tutorial

Posted on January 25, 2016 by admin

High-speed Covert channel Attacks in the Cloud

The challenges in conducting covert channel attacks

scheduling uncertainty
address uncertainty
cache physical limitations

Different VMs may not share cache
This can be overcome by exploiting the way the system implements atomic operations

i.e., the memory bus will be locked when a cache is being locked.

Memory-bus based covert channel attacks

Cache in Virtualized Environments

Posted on January 13, 2016 by admin

System Model for a Multi-core Processor

Cache Hierarchy

Reference [1]

Because main memory has a long access time compared to fast processors, smaller but faster memories, called caches, are used to reduce the effective memory access time as seen by the processor.

Modern processors feature a hierarchy of caches.

“Higher-level” caches, which are closer to the processor core, are smaller but faster than lower-level caches, which are closer to main memory.

L1 caches

Each core typically has two private top level caches

1) one for data
2) one for instructions

A typical L1 cache size is 32KB with 4-cycle access time, as in Intel Core and Xeon families.

LLC

The LLC is shared among all cores of a multicore chip and is a unified cache, i.e., it holds both data and instructions. [1]

LLC sizes are measured in megabytes, and access latencies are on the order of 40 cycles.

L2 caches

Modern X86 processors typically also support core-private, unified L2 caches of intermediate size and latency.

How it works

Any memory access first accesses the L1 cache, and on a miss, the request is sent down the hierarchy until it hits in a cache or accesses main memory.

The L1 is typically indexed by virtual address, while all other caches are indexed by physical address.
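The lookup order described above can be sketched as a small simulation. The L1 and LLC latencies (4 and 40 cycles) are the figures quoted earlier; the L2 and main-memory latencies, the function name, and the example addresses are assumptions made only for illustration.

```python
# Latencies in cycles. The L1 and LLC values come from the text above;
# the L2 and DRAM values are assumed for illustration only.
LEVELS = [("L1", 4), ("L2", 12), ("LLC", 40)]
DRAM_LATENCY = 200


def access(address, contents):
    """Return (level that served the request, latency in cycles).

    `contents` maps a level name to the set of addresses it currently holds.
    The request walks down the hierarchy until it hits in some cache,
    and falls through to main memory otherwise.
    """
    for name, latency in LEVELS:
        if address in contents.get(name, set()):
            return name, latency
    return "DRAM", DRAM_LATENCY


contents = {
    "L1": {0x1000},
    "L2": {0x1000, 0x2000},
    "LLC": {0x1000, 0x2000, 0x3000},
}
for addr in (0x1000, 0x3000, 0x4000):
    print(hex(addr), access(addr, contents))
```

This simplification charges only the latency of the level that hits; a real processor also pays for the misses in the levels above it.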

Cache Access

Reference [1]

To exploit spatial locality, caches are organized in fixed-size lines, which are the units of allocation and transfer down the cache hierarchy.

A typical line size B is 64 bytes.

The log2(B) lowest-order bits of the address, called the line offset, are used to locate a datum in the cache line.

Set-associative

Caches today are usually set-associative, i.e., organized as S sets of W lines each, called a W-way set-associative cache, as shown in the following figure. [1]

When the cache is accessed, the set index field of the address, log2(S) consecutive bits starting from bit log2(B), is used to locate a cache set. The remaining high-order bits are used as a tag for each cache line.

After locating the cache set, the tag field of the address is matched against the tags of the W lines in the set to determine whether one of the cache lines is a hit.
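As a minimal sketch of the address split described above, the following function extracts the tag, set index, and line offset. The default of S = 64 sets is an assumed value chosen for the example (B = 64 is the typical line size quoted above), and the function name is hypothetical.

```python
def split_address(addr, line_size=64, num_sets=64):
    """Split an address into (tag, set_index, line_offset).

    line_size (B) and num_sets (S) must both be powers of two.
    """
    offset_bits = line_size.bit_length() - 1   # log2(B)
    index_bits = num_sets.bit_length() - 1     # log2(S)
    offset = addr & (line_size - 1)            # lowest log2(B) bits
    set_index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)   # remaining high-order bits
    return tag, set_index, offset


print(split_address(0x12345))  # (18, 13, 5)
```

Two addresses with the same set index but different tags compete for the same W lines, which is the property the attacks discussed later rely on.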

Cache Replacement

As memory is much larger than the cache, more than W memory lines may map to the same cache set, potentially resulting in cache contention. If an access misses in the cache and all lines of the matching set are in use, one cache line must be evicted to free a cache slot for the new line being fetched from the next level of cache (or from main memory, in the case of the LLC). The cache’s replacement policy determines the line to evict. Typical replacement policies are approximations to least-recently-used (LRU). [1]
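The eviction behaviour can be illustrated with a single W-way set. This sketch assumes true LRU, whereas real processors only approximate it, and the class name is hypothetical.

```python
from collections import OrderedDict


class CacheSet:
    """One W-way cache set with true LRU replacement (an idealisation)."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()  # tag -> None, least recently used first

    def access(self, tag):
        """Return True on a hit; on a miss, fill the line and return False."""
        if tag in self.lines:
            self.lines.move_to_end(tag)       # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)    # evict the least recently used line
        self.lines[tag] = None
        return False


s = CacheSet(ways=2)
hits = [s.access(tag) for tag in ["A", "B", "A", "C", "B"]]
print(hits)  # [False, False, True, False, False]
```

In the trace, the access to C evicts B (A was touched more recently), so the final access to B misses again: three lines contending for a 2-way set.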

Traditional Cache

Per-Core Slice Cache

[1] Modern Intel processors, starting with the Sandy Bridge microarchitecture, use a more complex architecture for the LLC, to improve its performance.

The LLC is divided into per-core slices, which are connected by a ring bus. Slices can be accessed concurrently and are effectively separate caches, although the bus ensures that each core can access the full LLC (with higher latency for remote slices).

Sliced Cache

Ring bus architecture and sliced LLC

Reference

[1] Last-Level Cache Side-Channel Attacks are Practical, by Fangfei Liu et al., in IEEE Symposium on Security and Privacy 2015

Virtual Address Space in Virtualized Environments

Posted on January 13, 2016 by admin

Address Mapping

Reference [1]

In virtualized environments there are two levels of address-space virtualization.

1) the virtual addresses of a process -> guest’s notion of physical address, i.e., the VM’s emulated physical memory

2) guest physical addresses -> physical addresses of the processor
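The two translation levels can be sketched with two lookup tables. The 4 KB page size, the table contents, and the function names are assumptions made for the example; real hardware walks multi-level page tables rather than flat dictionaries.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages


def translate(addr, table):
    """Translate an address through a page-number -> frame-number mapping."""
    page, offset = divmod(addr, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


# Level 1: maintained by the guest OS (virtual page -> guest physical frame).
guest_page_table = {0: 5, 1: 7}
# Level 2: maintained by the VMM (guest physical frame -> host physical frame).
host_page_table = {5: 42, 7: 13}


def guest_virtual_to_host_physical(gva):
    gpa = translate(gva, guest_page_table)   # step 1
    return translate(gpa, host_page_table)   # step 2


print(hex(guest_virtual_to_host_physical(0x1234)))  # 0xd234
```

The page offset (0x234 here) is unchanged by both levels; only the page number is remapped.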

Page Translation

Reference [1]

Translations from virtual pages to physical frames are stored in page tables.

Processors cache recently used page table entries in the translation lookaside buffer (TLB).

The TLB is a scarce processor resource with a small number of entries.

Large pages use the TLB more efficiently, since fewer entries are needed to map a particular region of memory. As a result, the performance of applications with large memory footprints, such as Oracle databases or high-performance computing applications, can benefit from using large pages.

For the same reason, VMMs such as VMware ESX and Xen HVM also use large pages for mapping guest physical memory.
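A quick calculation shows how much TLB pressure large pages remove. The 4 KB and 2 MB page sizes and the 1 GB region are assumptions chosen for the example; they are not taken from [1].

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30


def tlb_entries_needed(region_bytes, page_bytes):
    """Number of page translations needed to cover a region (ceiling division)."""
    return -(-region_bytes // page_bytes)


small = tlb_entries_needed(1 * GB, 4 * KB)
large = tlb_entries_needed(1 * GB, 2 * MB)
print(small, large)  # 262144 512
```

Under these assumptions the same region needs 512 times fewer translations with large pages, so far more of it can be covered by a TLB with a fixed number of entries.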

Reference

[1] Last-Level Cache Side-Channel Attacks are Practical, by Fangfei Liu et al., in IEEE Symposium on Security and Privacy 2015

Covert Channel Attack

Posted on January 13, 2016 by admin

Introduction

[1] Covert channels and side channels are two types of information leakage channels. A covert channel uses mechanisms that are not intended for communications, e.g., writing and checking if a file is locked to convey a “1” or “0”.

In a covert channel, an insider process leaks information to an outsider process not normally allowed to access that information. The insider (sending) process could be a Trojan horse program previously inserted stealthily into the computer. The outsider (receiving) process needs only to be an unprivileged process.
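The file-lock example can be sketched as follows. This is a simplified single-process illustration: the presence of a lock file stands in for a real file lock, and the sender and receiver are assumed to take turns in fixed time slots. The function names and the file name are hypothetical.

```python
import os
import tempfile


def send_bit(lock_path, bit):
    """Sender: hold the lock (file exists) for a 1, release it for a 0."""
    if bit:
        open(lock_path, "w").close()
    elif os.path.exists(lock_path):
        os.remove(lock_path)


def receive_bit(lock_path):
    """Receiver: only observes whether the lock is currently held."""
    return 1 if os.path.exists(lock_path) else 0


def transmit(bits):
    with tempfile.TemporaryDirectory() as directory:
        lock_path = os.path.join(directory, "shared.lock")
        received = []
        for bit in bits:
            send_bit(lock_path, bit)                  # sender's time slot
            received.append(receive_bit(lock_path))   # receiver's time slot
        return received


print(transmit([1, 0, 1, 1, 0]))  # [1, 0, 1, 1, 0]
```

No data is ever written to the file; the information travels purely through the lock state, which is why such a channel bypasses access controls on the file contents.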

Reference

[1] Covert and Side Channels due to Processor Architecture, by Zhenghong Wang and Ruby B. Lee, in ACSAC 2006

Cache-based Covert Channel

Posted on January 10, 2016 by admin

Background

How to measure cache usage

Cache-based Covert Channel

[1] Cache load measurements create very effective covert channels between cooperating processes running in different VMs.

In practice, this is not a major threat for current deployments since in most cases the cooperating processes can simply talk to each other over a network.

However, covert channels become significant when communication is forbidden by information flow control (IFC) mechanisms such as sandboxing and IFC kernels. The latter is a promising emerging approach to improving security (e.g., web-server functionality).

[1] explains more about the covert channel in Section 8.1.

Reference

[1] Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds, by Ristenpart, T. et al., in CCS 2009

How to Measure Cache Usage

Posted on January 10, 2016 by admin

Prime + Trigger + Probe

[1] demonstrates how to use the Prime + Probe technique to measure cache activity, and extends it to the following Prime + Trigger + Probe measurement to support the setting of time-shared virtual machines.

The probing instance first allocates a contiguous buffer B of b bytes.

Here b should be large enough that a significant portion of the cache is filled by B.

Let s be the cache line size, in bytes.
Then the probing instance performs the following steps to generate each load sample:

Prime: Read B at s-byte offsets in order to ensure it is cached.
Trigger: Busy-loop until the CPU’s cycle counter jumps by a large value.

This means our VM was preempted by the Xen scheduler, hopefully in favor of the sender VM.

Probe: Measure the time it takes to again read B at s-byte offsets.

When reading b/s memory locations in B, we use a pseudorandom order, and the pointer-chasing technique to prevent the CPU’s hardware prefetcher from hiding the latencies.
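The pseudorandom pointer-chasing order can be sketched as a single cycle through all b/s line indices, where each entry stores the index to visit next. In a real probe each entry would live at an s-byte offset inside B, so that every load depends on the result of the previous one; the function names and the fixed seed here are assumptions for illustration.

```python
import random


def build_pointer_chain(num_lines, seed=0):
    """Return `chain`, where chain[i] is the line to visit after line i.

    The chain forms one cycle that visits every line exactly once,
    in pseudorandom order.
    """
    order = list(range(num_lines))
    random.Random(seed).shuffle(order)
    chain = [0] * num_lines
    for here, nxt in zip(order, order[1:] + order[:1]):
        chain[here] = nxt
    return chain


def walk(chain, start=0):
    """Follow the chain for one full pass and return the visit order."""
    visited, i = [], start
    for _ in range(len(chain)):
        visited.append(i)
        i = chain[i]      # the next index is only known after this "load"
    return visited


chain = build_pointer_chain(16)
print(walk(chain))
```

Because the next address is data-dependent and the order is irregular, a stride-based prefetcher has nothing predictable to latch onto.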

The time of the final step’s read is the load sample, measured in number of CPU cycles. These load samples will be strongly correlated with use of the cache during the trigger step, since that usage will evict some portion of the buffer and thereby drive up the read time during the probe phase.
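The Trigger step amounts to watching for a large gap between consecutive cycle-counter readings. The sketch below applies that test to a list of readings; the readings are synthetic and the threshold is an assumed value, not one taken from [1].

```python
def find_preemptions(cycle_readings, threshold):
    """Return each index i where reading i+1 is more than `threshold`
    cycles after reading i, i.e. where the busy loop was likely preempted."""
    pairs = zip(cycle_readings, cycle_readings[1:])
    return [i for i, (before, after) in enumerate(pairs)
            if after - before > threshold]


# Synthetic readings: a steady busy loop with one large jump.
readings = [100, 130, 160, 190, 2_000_190, 2_000_220]
print(find_preemptions(readings, threshold=10_000))  # [3]
```

On real hardware the readings would come from the cycle counter inside the busy loop, and the probe would start as soon as the first jump is detected.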

Reference

[1] Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds, by Ristenpart, T. et al., in CCS 2009

Threat of Covert Channel Attacks

Posted on January 10, 2016 by admin

Summary

Attackers can build various side channels to circumvent the logical isolation in cloud physical machines, and obtain sensitive information from co-resident VMs.

Coarse-grained, e.g., workloads and web traffic rates

Since the cache utilization rate has a large impact on the execution time of the cache read operation, attackers can infer the victim’s cache usage and workload information, by applying the Prime+Probe technique.
Similarly, they can estimate the victim’s web traffic rate, which also has a strong correlation with the execution time of cache operations. [2]
[1] demonstrates a clear correlation between a victim’s web traffic rate and the load sample.

Fine-grained, e.g., cryptographic keys.

Attackers can exploit shared hardware resources, such as the instruction cache, to extract cryptographic keys. Specifically, the following challenges are overcome:

Dealing with core migrations and determining if an observation is associated with the victim
Filtering out hardware and software noise, and regaining access to the target CPU core with sufficient frequency

For clever attackers, even seemingly innocuous information like workload statistics can be useful.

For example, such data can be used to identify when the system is most vulnerable, i.e., the best time to launch further attacks, such as Denial of Service attacks. [2]

Reference

[1] Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds, by Ristenpart, T. et al., in CCS 2009

[2] Using Virtual Machine Allocation Policies to Defend against Co-resident Attacks in Cloud Computing, by Yi Han et al., in Transactions on Dependable and Secure Computing

Cloud Virtual Machine Allocation

Posted on December 29, 2015 by admin

Concerns

When a user requests a virtual machine in the cloud, the cloud provider needs to allocate a VM for the user.

There are three concerns a cloud provider needs to take into account when doing the VM allocation.

Load Balance
Power Consumption
Security

For security, we aim to prevent cloud covert channel attacks in which an attacker seeks to co-locate with the target VM in order to extract private information from the victim.

Popular VM Allocation Policies

The following table [1] shows the popular VM allocation policies.

Reference

[1] Using Virtual Machine Allocation Policies to Defend against Co-resident Attacks in Cloud Computing, by Yi Han et al., in Transactions on Dependable and Secure Computing

Cloud Covert Channel Attack

Posted on November 13, 2015 by admin

What it is

– Introduction of covert channel attack

Co-residence threats in Cloud

In current commercial clouds, providers allow multiple users to share a physical machine rather than assigning a dedicated machine to every user. Although in theory VMs running on the same server (i.e., co-resident VMs) should be logically isolated from each other, malicious users can still circumvent the logical isolation and obtain sensitive information from co-resident VMs [6].

A malicious virtual machine (VM) can extract fine-grained information from a victim VM running on the same physical machine.

Thus malicious users can try to co-locate their VMs with target VMs on the same physical server, and then exploit side channels to extract private information from the victim [5].

Types

Access Driven

The attacker’s program monitors usage of a shared architectural component to learn information about the key, e.g., the data cache, instruction cache, floating-point multiplier, or branch-prediction cache.
The attacks can be asynchronous, meaning that they do not require the attacker to achieve precisely timed observations of the victim by actively triggering the victim’s operation.

Examples

Zhang et al. [1] demonstrate that a VM can extract a private ElGamal decryption key from a co-resident victim VM running GNU Privacy Guard (GnuPG), a popular software package that implements the OpenPGP email encryption standard.
By overloading the CPU while a victim AES encryption process is running, attackers managed to gain control over the CPU and suspend the AES process, thereby gaining an opportunity to monitor the cache accesses of the victim process.

Threats Of Covert Channel Attacks

How it works

The first step for the attackers is to try to achieve co-residence with the target VMs, and then conduct covert channel attacks by exploiting shared microarchitectural components such as caches.

How to achieve co-residence

The most straightforward approach is a brute-force strategy: start as many VMs as possible until co-residence is achieved.
[7][8] investigate how to achieve co-residence efficiently.
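To get a rough feel for the brute-force strategy, the following sketch computes the chance that at least one attacker VM lands on the target's host. It assumes that each VM is placed independently and uniformly at random over the hosts, which is a simplifying assumption of this example; real placement policies are not uniform, which is what [7][8] study.

```python
def coresidence_probability(num_hosts, num_attacker_vms):
    """P(at least one attacker VM shares the target's host), assuming each
    VM is placed independently and uniformly at random over the hosts."""
    return 1.0 - (1.0 - 1.0 / num_hosts) ** num_attacker_vms


for vms in (1, 10, 100, 1000):
    print(vms, round(coresidence_probability(1000, vms), 3))
```

Under this model the probability grows with every VM launched but with diminishing returns, which is why smarter placement strategies are attractive to an attacker.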

Approach 1: PRIME + PROBE method

Step 1:

Attackers create one or more eviction sets. An eviction set is a sequence of memory addresses that are all mapped by the CPU into the same cache set.
The PRIME+PROBE method also assumes that the victim code uses this cache set for its own code or data.

Step 2:

The attackers prime the cache set by accessing the eviction set in an appropriate way.
This forces the eviction of the victim’s data or instructions from the cache set and brings it to a known state.

Step 3:

The attackers trigger the victim process, or passively wait for it to execute.
During this execution step, the victim may potentially utilize the cache and evict some of the attacker’s elements from the cache set.

Step 4:

The attacker probes the cache set by accessing the eviction set again.
A probe step with a low access latency suggests that the attacker’s eviction set is still in the cache.
Conversely, a higher access latency suggests that the victim’s code made use of the cache set and evicted some of the attacker’s memory elements.
The attackers thus learn about the victim’s internal state.
The actual timing measurement is carried out using the (unprivileged) instruction rdtsc, which provides a high-fidelity measurement of the CPU cycle count.
Iterating over the eviction set in the probing phase forces the cache set yet again into an attacker-controlled state, thus preparing for the next round of measurement.
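Steps 2 to 4 can be sketched against a simulated cache set. This is a model, not an attack: the set uses idealised LRU replacement, the hit and miss latencies are assumed constants instead of rdtsc measurements, and all names are hypothetical.

```python
from collections import OrderedDict

HIT_CYCLES, MISS_CYCLES = 40, 200  # assumed latencies, for illustration only


class LRUSet:
    """A simulated W-way cache set with idealised LRU replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()

    def access(self, line):
        """Access a line and return the simulated latency in cycles."""
        hit = line in self.lines
        if hit:
            self.lines.move_to_end(line)
        else:
            if len(self.lines) >= self.ways:
                self.lines.popitem(last=False)   # evict the LRU line
            self.lines[line] = None
        return HIT_CYCLES if hit else MISS_CYCLES


def prime_probe_round(cache_set, eviction_set, victim_lines):
    for line in eviction_set:      # Step 2: prime the set
        cache_set.access(line)
    for line in victim_lines:      # Step 3: the victim runs (or stays idle)
        cache_set.access(line)
    # Step 4: probe and return the total access latency
    return sum(cache_set.access(line) for line in eviction_set)


cache_set = LRUSet(ways=4)
eviction_set = ["a0", "a1", "a2", "a3"]
idle = prime_probe_round(cache_set, eviction_set, victim_lines=[])
busy = prime_probe_round(cache_set, eviction_set, victim_lines=["v"])
print(idle, busy)
```

The probe after an idle victim consists only of hits, while a single victim access to the set raises the probe latency, which is the signal the attacker reads.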

How to defend against it

Several countermeasures have been proposed at different levels: hypervisor, guest OS, hardware and application-layer approaches.

1. Eliminating the side channels

Hypervisor-based approach.

Modify the Xen scheduler to limit the frequency with which an attacker can preempt the victim.
Lock cache lines to prevent preemption by an attacker, and multiplex the cache lines among VMs such that each has access to its own.
Remove the high-resolution clock

Note that side-channel attacks rely on it

Add noise/latency

Periodic time-shared cache cleansing, in order to make the side channel noisy.
Hide the program execution time
Alter the timing exposed to an external observer.

e.g., add latency

Statistical multiplexing of shared resources to prevent eavesdropping.

Guest OS

Injecting noise into protected processes on L1 and L2 caches.

Hardware

Hardware design incorporates access randomization and resource partitioning.

e.g., avoid sharing of sensitive resources

Remove hypervisor, and use hardware mechanisms for the isolation of access to shared resources

Cons

These methods often require substantial changes to be made to existing cloud platforms, and hence are unlikely to be adopted by cloud providers any time soon.

2. Increasing the difficulty of verifying co-residence

Existing work shows that the traceroute tool can be used to determine the IP address of a VM’s Dom0, which is a privileged VM that manages all VMs on a host. If two Dom0 IP addresses are the same, then the corresponding VMs are co-resident.

Cloud providers can prevent Dom0’s IP address from being exposed to customers, so that attackers will be forced to resort to other options that do not rely on network measurements.
However, as more and more methods of detecting co-residence have been proposed [10-12], simply hiding Dom0’s IP address is not sufficient. [9]
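The Dom0 comparison itself is a simple grouping step. The sketch below assumes the Dom0 address of each VM has already been obtained (for example from traceroute output); the VM names and addresses are made up for the example.

```python
from collections import defaultdict


def coresident_groups(dom0_ip_by_vm):
    """Group VMs that report the same Dom0 IP address.

    Returns only groups with more than one VM, i.e. likely co-resident VMs.
    """
    groups = defaultdict(list)
    for vm, dom0_ip in dom0_ip_by_vm.items():
        groups[dom0_ip].append(vm)
    return [sorted(vms) for vms in groups.values() if len(vms) > 1]


# Hypothetical measurements.
observed = {"vm-a": "10.0.0.1", "vm-b": "10.0.0.2", "vm-c": "10.0.0.1"}
print(coresident_groups(observed))  # [['vm-a', 'vm-c']]
```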

3. Increasing the difficulty of stealing information

Application-level

Partitioning a cryptographic key across multiple VMs.

E.g., divide the secret using Shamir’s secret sharing
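A minimal sketch of Shamir's secret sharing over a prime field follows, with each share intended for a different VM. The choice of prime, the function names, and the 3-of-5 parameters are assumptions for illustration; this is a textbook construction, not a hardened implementation.

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split_secret(secret, threshold, num_shares, rng=None):
    """Split `secret` into shares; any `threshold` of them can recover it."""
    rng = rng or random.SystemRandom()
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):        # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]


def recover_secret(shares):
    """Lagrange interpolation at x = 0 (requires Python 3.8+ for pow(..., -1, p))."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, threshold=3, num_shares=5)
print(recover_secret(shares[:3]) == 123456789)  # True
```

With a threshold of 3, an attacker who extracts the share held by one or two VMs learns nothing about the key, so a single co-resident placement is no longer enough.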

4. Detecting the features of co-resident attacks

When attackers use the Prime+Probe technique to extract information from the victim, there are abnormalities in CPU and RAM utilization, system calls, and cache miss behavior. [13][14]
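One simple way to flag such abnormalities is a threshold on deviation from a baseline. The sketch below is a generic illustration and not the detection method of [13] or [14]; the metric, the numbers, and the three-sigma threshold are all assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(baseline, samples, num_sigmas=3.0):
    """Return indices of samples that deviate from the baseline mean
    by more than `num_sigmas` baseline standard deviations."""
    mu, sigma = mean(baseline), stdev(baseline)
    return [i for i, x in enumerate(samples)
            if abs(x - mu) > num_sigmas * sigma]


# Synthetic cache-miss counts per interval.
baseline = [100, 102, 98, 101, 99]
samples = [101, 250, 99]
print(flag_anomalies(baseline, samples))  # [1]
```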

5. Migrating VMs periodically

VM migration

This approach is proposed in [4].
Pros

Other approaches are not suitable for immediate deployment due to the required modification to the cloud platforms, while VM migrations can be implemented immediately.

6. Using VM allocation policies to make it difficult to achieve co-residence

References
[1] Cross-VM side channels and their use to extract private keys, CCS 2012
[2] Wait a Minute! A fast, Cross-VM Attack on AES, in Research in Attacks, Intrusions and Defenses, LNCS 2014
[3] The Spy in the Sandbox: Practical Cache Attacks in JavaScript and their Implications, CCS15
[4] Nomad: Mitigating Arbitrary Cloud Side Channels via Provider-Assisted Migration, CCS15
[5] Security Games for Virtual Machine Allocation in Cloud Computing, by Yi Han et al., in GameSec15
[6] Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds, by Ristenpart, T. et al., in CCS 2009
[7] A Placement Vulnerability Study in Multi-Tenant Public Clouds, by Venkatanathan Varadarajan, in USENIX Security 2015
[8] A Measurement Study on Co-residence Threat inside the Cloud, by Haining Wang, in USENIX Security
[9] Using Virtual Machine Allocation Policies to Defend against Co-resident Attacks in Cloud Computing, by Yi Han et al., in Transactions on Dependable and Secure Computing
[10] Detecting co-residency with Active Traffic Analysis Techniques, by A. Bates, in CCSW12
[11] Detecting VMs Co-residency in Cloud: Using Cache-based Side Channel Attacks, by S. Yu, 2013
[12] On Detecting Co-resident Cloud Instances Using Network Flow Watermarking Techniques, by A. Bates, in International Journal of Information Security, 2014
[13] Detecting malicious Coresident Virtual Machines Indulging in Load-Based Attacks, by S. Sundareswaran, in Information and Communication Security 2013
[14] An Approach with Two-stage Mode to Detect Cache-based Side Channel Attacks, by S. Yu, in ICOIN 2013