Session

Posted on January 21, 2016 by admin

Definition

A session is a semi-permanent interactive information interchange between two or more communication devices.

Implementation

The session is implemented by means of a database with information about the state of each session.

The reason of not implementing it with multiple processes or threads is due to large resource overhead.

Cluster of Servers

When a client may connect to any server in a cluster or servers, it is important to maintain consistency.

The client must be either directed to the same server for the duration of the session, or the servers must transmit server-side session information via a shared file system or database.

Client Side Web Sessions

Client-side sessions use cookies and cryptographic techniques to maintain state without storing as much data on the server.

When presenting a dynamic web page, the server sends the current state data to the client (web browser) in the form of cookie. The client saves the cookie in memory or on disk.

With each successive request, the client sends the cookie back to the server, and the server users the data to “remember” the state of the application for that client and generate an appropriate response.

Issues

Browser limits the number and size of cookies that may be stored by a web site.

HTTP Session Token

A session token is a unique identifier that is generated and sent from a server to a client to identify the current interaction session. The client usually stores and sends the token as an HTTP cookie and/or sends it as a parameter in GET or POST queries.

The reason of using session tokens is that

The client only has to handle the identifier
All session data is stored on the server (usually in a database).

Completely Fair Scheduler

Posted on January 20, 2016 by admin

Introduction

CFP uses a concept called “sleeper fairness“,

which considers sleeping or waiting tasks equivalent to those on the runqueue. This means that interactive tasks which spend most of their time
So that the interactive tasks which spend most the their time waiting for user input or other events get a comparable share of CPU time when they need it.

Algorithm

The data structure used for the scheduling algorithm is a red-black tree in which the nodes are scheduler-specific structures, entitled “sched_entity”.

These are derived from the general task_struct process descriptor, with added scheduler elements. These nodes are indexed by processor execution time in nanoseconds.
Idea Processor

an “idea processor” would equally share processing power among all processes.

Maximum execution time

Thus the maximum execution time is the time the process has been waiting to run, divided by the total number of processes, or in the other words, the maximum execution time is the time the process would have expected to run on an “ideal processor”
A maximum execution is also calculated for each process. This time is based upon the idea that an “idea processor” would equally share processing power among all processes.

When the scheduler is invoked to run a new process, the operation o the scheduler is as follows

The left most node of the scheduling tree is chosen (as it will have the lowest spent execution time), and sent for execution
If the process simply completes execution, it is removed from the system and scheduling tree
If the process reaches its maximum execution time or is otherwise stopped (voluntarily or via interrupt), it is reinserted into the scheduling tree based on its new spent executiom time.
The new left-most node will then be selected from the tree, repeating the interation.

Reference

[1] https://en.wikipedia.org/wiki/Completely_Fair_Scheduler

CPU Components

Posted on January 20, 2016 by admin

Components in a CPU

General purpose Registers

A synonym for memory in computer science

General purpose reigster

A memory cell

Each general purpose register has a unique name
It is used to

store (and recall) intermediate result of complex computations

Arithmetic and Logic Unit (ALU)

It is a complex electrical circuit that can perform Mathematical (+,-,x,/) and logical operation (<, <=, >, >=, and, or)
The output (result) of the computation (obtained by the ALU) is often stored in a general purpose register.

Instruction Register (IR)

Contains the current instruction being executed by the CPU
The CPU will perform the operation indicated by the instruction code contained in the instruction register

Program Counter (PC)

The program counter is a register (memory cell)!
This register contains the address (location in memory) of the next instruction after the CPU finishes executing the current instruction in the instruction register
The value in the program counter will be increased after the CPU finishes executing one instruction

Processor Status Register (PSR)

This register contains the various information about the CPU.
Among the information contained in the PSR is

The result of a comparison operation

When the CPU compares 2 numbers a and a and b, the outcome of the comparison is stored in the PSR
The outcome of a compare operation will allow the CPU to determine the following fact between a and b.

Reference

[1] http://www.mathcs.emory.edu/~cheung/Courses/170/Syllabus/01/intro-computer2.html

Process Context Switch

Posted on January 20, 2016 by admin

Context Switch

Whenever an interrupt arrives, the CPU must to a state-save of currently running process, then switch into kernel mode to handle the interrupt, and the do a state-restore of the interrupted process.
Context switch happens when

the time slice for one process has expired and a new process is to be loaded from the ready queue.
This will be instigated by a timer interrupt, which will then cause the current process’ state to be saved and the new process’s state to be restored.
Saving and restoring states involves saving and restoring all registers and program counter(s), as well as the process control blocks.
Context switch happens VERY VERY frequently, and the overhead of doing the switching is just lost CPU time, so context switches (state saves and restores) need to be as fast as possible. Some hardware has specific provisions for speeding this up, such as a single machine instruction for saving or storing all registers at once.

Registers

The Operating System needs to store a copy of the CPU registers to memory.
When it is time for the processor to give up the processor so another process can run, it needs to save it’s current state.
Equally, we need to be able to restore this state when the process is given more time to run on the CPU. To do this the operating system needs to store a copy of the CPU registers to memory.
When it is time for the process to run again, the operating system will copy the register values back from memory to the CPU registers and the process will be right back where it left off.

Reference

[1] http://www.bottomupcs.com/elements_of_a_process.xhtml

Process Heap

Posted on January 20, 2016 by admin

Introduction

The heap is an area of memory that is managed by the process for on the fly memory allocation.

This is for variables whose memory requirements are not known at compile time.

The bottom of the heap is known as the brk, so called for the system call which modifies it.
By using the brk call to grow the area downward the process can request the kernel allocate more for it to use.

malloc

The heap is mostly managed by the malloc library call.
This makes managing the heap for programmer by allowing them to simply allocate and free (via the free call) heap memory.
Malloc can use schemes like a buddy allocator to manage the heap memory for the user.
Malloc can also be smarter about allocation and potentially use anonymous mmaps for extra process memory.
This is where instead of mmaping a file into the process memory, it directly maps an area of system RAM.

Process Stack

Posted on January 20, 2016 by admin

Stacks usually grown down, i.e., the stack starts at a higher address in memory and progressively gets lower.

Stack Pointer

To keep track of current growth of the stack, the hardware defines a register as the stack pointer.
The compiler (or the programmer, when writing in assembler) users this register to keep track of current top of the stack.
Store

It is stored in the register ESP.

Call Stack

A call stack is used for several related purposes, but the main reason for have one is to keep track of the point to which each active subrountine should return control when it finishes executing.
Active subrountine

An active subrountine is one that has been called by yet to complete execution after which control should be handed back to the point of the call.

How it works?

Return address

Store the returning address somewhere, so that we can continue executing the current function after completion of called function.

Other info

Saving other information about current function.

Parameters

Providing the callee with the parameters

Provided called function some space to store its automic variables
Return value

Providing some mechanism to get return value from the called function.
It is implemented using EAX register. Called function stores return value in EAX.

Execution in Steps

Push the parameters in the reverse order (right to left)
Call function now

It implicitly pushes the return address into STACK

Push the current EBP

i.e., push Frame Pointer (FP) of calling function into stack. We need to do this to continue with the execution of calling function after return from the called function,

Copy the ESP to EBP

Yes, this location will be the new FRAME Pointer

Allocate space on stack for local variable

It is done by decrementing ESP

Put the value to be returned in EAX
Copy current EBP to ESP

It will reduce stack size. Now we have the old FP at the top of the stack??

Stack Frame

The stack frame generally includes the following components

The return address
Argument variables passed on the stack
Local variables
Saved copies of any registers modified by the subprogram that need to be restored.

Stack Overflow

Introduction
Suggestion

If you are a programmer accept arbitrary input into a stack variable (say, reading from the keyboard or over the network), you need to say how big that data is going to be.
OS

The OS can make the stack not executable.

Reference

[1] http://articles.manugarg.com/stack.html

Process Communication

Posted on January 20, 2016 by admin

Shared Memory Systems

In general, the memory to be shared in a shared-memory system is initially within the address space of a particular process, which needs to make system calls in order to make that memory publicly available to one or more other processes.
Other processes which wish to use the shared memory must then make their own system calls to attach the shared memory area onto their address space.
Generally a few messages must be passed back and forth between the cooperating processes first in order to set up and coordinate the shared memory access.

Example” POSIX Shared Memory

Step 1: one of the processes involved to allocate some shared memory, using shmget

int segment_id = shmget(IPC_PRIVATE, size, S_IRUSR | S_IWUSR)

The first parameter

specifies the key (identifier) of the segement. IPC_PRIVATE creates a new shared memory segment.

The second parameter

How big the shared memory segment is to be, in bytes.

The third parameter

A set of bitwise ORed flags. In this case, the segment is being created for reading and writing.

The return value

an integer identifier

Step 2: Any process which wishes to use the shared memory must attach the sahred memory to their address space, using shmat

char* shared_memory = (char *) shmat(segment_id, NILL, 0)

The first parameter

Specify the key of the segment that the process wishes to attack to its address space

The second parameter

Indicate where the process wishes to have the segment attacked. Null insicates the system should decide.

The third parameter

A flag for read-only operation. Zero indicates read-write; One indiciates Readonly.

The return value

The return value of shmat is a void *, which the process can use (type cast) as appropriae. In this example, it is being used as a character pointer.

Step 3: The process may access the memory using the pointer returned by shmat. For example, using sprintf

sprintf(shared_memory, “Writing to shared memoryn”);

Step 4: When a process no longer needs a piece of shared memory, it can be detached using shmtd

shmat(shared_memory)

Step 5: And finally the process that originally allocated the shared memory can remove it from the system using shmctl.

shmctl(segment_id, IPC_RMID)

Message-Passing

Message passing systems must support at a minimum system calls for “send message” and “receive message”
A communication link must be established between the corresponding process before message can be sent.
There are three key issues to be resolved in message passing systems

Direct or indirect communication (naming)
Synchronous or asynchronous communication
Automatic or explicit buffering

Naming

Direct communication

With direct communication the sender must know the name of the receiver to which it wishes to send message

Indirect communication

Multiple processes can share the same mailbox or boxes.
Only one process can read any given message in a mailbox.
The OS must provide system calls to create and delete mailboxes, and to send and receive messages to/from mailboxes.

Synchronization

Either the sending or receiving of messages (or neither or both) may be either blocking or non-blocking.

Buffering

Messages are passed via queues, which may have one or three capacity configurations

Zero capacity

Message cannot be stored in the queue, so senders must lock until receivers accept the messages.

Bounded capacity

There is a certain pre-determined finite capacity in the queue. Senders must block if the queue if full, until space becomes available in the queue, but may be either blocking or non-blocking otherwise.

Unbounded capacity

The queue has a theoretical infinite capacity, so senders are never forced to block.

Reference

[1] https://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/3_Processes.html

Process Memory, State, and Scheduling

Posted on January 20, 2016 by admin

Process Concept

A process is an instance of a program in execution

Process Memory

Text:

comprises the compiled program code, read in from non-volatile storage when the program is launched.

Data

Stores global and static variables, allocated and initialized prior to executing main.

Heap

Used for dynamic memory allocation, is managed via calls to new, delete, malloc, free etc.

Stack

Used for local variables.
Space on the stack is reserved for local variables when they are declared (at function entrance or elsewhere, depending on the language), and the space is freed up when the variables go out of scope
The stack is also used for function return values, and the exact mechanisms for stack management may be language specific.

When processes are swapped out of memory and later restored, additional information must also be stored and restored. Key among them are the program counter and the value of all program registers.

Process State

The process is in the stage of being created.

Ready

The process has all the resources available that it needs to run, but the CPU is not currently working on this process’s instruction.

Running

The CPU is working on this process’s instruction.

Waiting

The process cannot run at the moment, because it is waiting for some resource to become available for some event to occur. For example, the process may be waiting for keyboard input, disk access request, inter-process messages, a time to go off, or a child process to finish.

Terminated

The process has completed.

Process Control Block

For each process, there is a Process Control Block (PCB), which stores the following (types of) process-specific information.

Process State

Running, waiting etc.

Process ID

and also parent process ID

CPU registers and Program Counter

These need to be saved to restored when swapping processes in and out of the CPU.

CPU scheduling information

Such as priority information and pointers to scheduling queues

Memory-Management information

e.g., page tables or segment tables

Accounting information

User and kernel CPU time consumed, account numbers, limits, etc.

I/O Status information

Devices allocated, open file tables etc.

Process Scheduling

The two main objectives of the process scheduling system are to keep the CPu busy at all times and to deliver “acceptable” response times for all program, particularly for interactive ones.
The process scheduler must meet these objectives by implementing suitable policies for swapping process in and out of the CPU.

Scheduling Queues

Job queue

All processes are stored in the job queue.

Ready queue

Processes in the Ready state are placed in the ready queue

Device queue

Processes waiting for a device to be available or to deliver data are placed on device queues. There is generally a separate device queue for each device.

Scheduler

Long-term scheduler

Typically of a batch system or a very heavily loaded system. It runs infrequently, (such as when one process ends selecting one or more to be loaded in from disk in its place), and can afford to take the time to implement intelligent and advanced scheduling algorithms

Short-term scheduler, or CPU scheduler

Runs very frequently, on the order of 100 milliseconds, and must very quickly swap one process out of the CPU and swap in another one.

Medium-term scheduler

Some systems also employ medium-term scheduler.
When system loads get high, this scheduler will swap one or more processes out of the ready queue system for a few seconds, in order to allow smaller faster jobs to finish up quickly and clear the system.

Reference

[1] https://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/3_Processes.html

Online Hiring Problem

Posted on January 18, 2016 by admin

Problem

If there are n candidates, and we do not want to interview all candidates to find the best one. We also do not wish to hire and fire as we find better and better candidates.

Instead, we are willing to settle for a candidate who is close to the best, in exchange of hiring exactly once.

For each interview we must either immediately offer the position to the applicant or immediately reject the applicant.

What is the trade-off between minimizing the amount of interviewing and maximizing the quality of the candidate hired?

Code

Analysis

We analysis in the following to determine the best value of k that maximize the probability we hire the best candidate. In the analysis, we assume the index starts with 1 (rather than 0 as shown in the code).

Let $B_i$ be the event that the best candidate is the $i$ -th candidate.
Let $O_i$ be the event that none of the candidate in position $k+1$ to $i-1$ is chosen.
Let $S_i$ be the event that we hire the best candidate when the best candidate is in the $i$ -th position.

We have $Pr(S_i ) = Pr (B_i cup O_i) = Pr(B_i) cdot Pr(O_i)$ since $B_i$ and $O_i$ are independent events.

$Pr(S_i) = sum^n_{i=k+1} Pr(S_i)$
$= sum^n_{i=k+1} frac{1}{n} cdot frac{k}{n-i}$
$= frac{k}{n} sum^n_{i=k+1} frac{1}{i-1}$
$leq frac{k}{n} cdot (ln n - ln k)$