Java RMI

Lecture 4

Features

  • Integrate with Java language and libraries
    • Security, write once run anywhere, multithreaded
    • Object oriented
  • Can pass “behavior”
    • Mobile code
    • Not possible in CORBA, traditional RPC systems
  • Distributed garbage collection
  • Remoteness of objects intentionally not transparent
    • Good for handling failures

Remote Interfaces, Objects, and Methods

  • Object becomes remote by implementing a remote interface
    • A remote interface extends the interface java.rmi.Remote
    • Each method of the interface declares java.rmi.RemoteException in its throws clause, in addition to any application-specific exceptions (see the sketch below)
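
A minimal sketch of such a remote interface follows; the Hello name and sayHello method are illustrative, not taken from the lecture.

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    // A remote interface: extends java.rmi.Remote, and every method
    // declares RemoteException in addition to application-specific exceptions.
    public interface Hello extends Remote {
        String sayHello(String name) throws RemoteException;
    }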

Creating distributed applications using RMI

  1. Define the remote interfaces
  2. Implement the remote objects and server
  3. Implement the client
  4. Compile the remote interface, server and client 
  5. Generate the stub and skeleton using rmic
  6. Start the RMI registry
  7. Start the server
  8. Run the client
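
A rough sketch of steps 2, 3, 7, and 8 follows, assuming the Hello interface sketched earlier; the class names, bound name, and registry host are illustrative. (On current JDKs, stubs are generated dynamically at runtime, so the rmic step is typically only needed for very old code.)

    import java.rmi.registry.LocateRegistry;
    import java.rmi.registry.Registry;
    import java.rmi.server.UnicastRemoteObject;

    // HelloServer.java — implements the remote object and registers it (steps 2 and 7)
    public class HelloServer {
        public static void main(String[] args) throws Exception {
            Hello impl = new Hello() {
                public String sayHello(String name) { return "Hello, " + name; }
            };
            // Export the remote object and obtain a client-callable stub
            Hello stub = (Hello) UnicastRemoteObject.exportObject(impl, 0);
            // Bind the stub under a well-known name in the (already running) RMI registry
            Registry registry = LocateRegistry.getRegistry();
            registry.rebind("Hello", stub);
            System.out.println("HelloServer ready");
        }
    }

    // HelloClient.java — looks up the stub and invokes the remote method (steps 3 and 8)
    public class HelloClient {
        public static void main(String[] args) throws Exception {
            Registry registry = LocateRegistry.getRegistry("localhost");
            Hello stub = (Hello) registry.lookup("Hello");
            System.out.println(stub.sayHello("world")); // may throw RemoteException
        }
    }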

Middleware

Lecture 2 — part 5

Middleware

  • Definition
    • Middleware is a set of common business-unaware services that enable applications and end-users to interact with each other across a network
    • Distributed system services that have standard programming interfaces and protocols 
      • Services sit in the middle above OS and network software
      • and below industry-specific applications
  • Examples
    • ftp, email
    • web browsers
    • database drivers and gateways
    • CORBA (Common object request broker architecture)
    • Microsoft .NET
    • Java RMI, JINI, Javaspaces, JMS
    • Web services software — SOAP, REST

Functional View of Middleware

  • Information exchange services
  • Application-specific services
    • Specialized services
      • e.g., transaction services and replication services for distributed databases
      • group services for collaborative applications, specialized services for multimedia applications
    • business-unaware
  • Management and support services
    • needed for locating distributed resources and administering resources across the network

System Architecture — Peer to Peer Computing

Lecture 2 — part 3

Organization of nodes in P2P Systems

  • Centralized directory
    • Original Napster
      • Pros
        • Simple
      • Cons
        • O(N) states
        • single point of failure
  • Unstructured P2P systems
    • Gnutella and its successors (flood queries)
      • Pros
        • Robust
      • Cons
        • Worst case O(n) messages per lookup
  • Structured P2P systems
    • Based upon Distributed Hash Tables (DHTs)
    • Chord, CAN, Tapestry…

Distributed Hash Table (DHT)

  • Distributed Hash Table
    • Key = Hash (data)
    • lookup(key) -> IP address
    • send-RPC(IP address, PUT, key, value)
    • send-RPC(IP address, GET, key) -> value
  • Chord
  • Example: BitTorrent content distribution (a lookup sketch follows below)
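
A rough sketch of the lookup step follows; the ring is kept locally for illustration (in Chord the mapping is itself distributed), and the class and method names are made up, not the Chord API.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.SortedMap;
    import java.util.TreeMap;

    // Illustrative DHT lookup: hash keys and node addresses onto one identifier
    // space, then route a key to the first node at or after its identifier.
    public class DhtLookup {
        // identifier -> node address (a real DHT distributes this table itself)
        private final SortedMap<Long, String> ring = new TreeMap<>();

        public void addNode(String address) { ring.put(hash(address), address); }

        // lookup(key) -> IP address of the node responsible for the key;
        // a client would then send-RPC(address, PUT/GET, key, ...) to that node.
        public String lookup(String key) {
            long id = hash(key);
            SortedMap<Long, String> tail = ring.tailMap(id);
            return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
        }

        private static long hash(String s) {
            try {
                byte[] d = MessageDigest.getInstance("SHA-1").digest(s.getBytes(StandardCharsets.UTF_8));
                long h = 0;
                for (int i = 0; i < 7; i++) h = (h << 8) | (d[i] & 0xff); // 56-bit, non-negative
                return h;
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
    }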

System Architecture — Centralized architecture

Lecture 2 — part 2

Client-server applications

  • Clients
    • Interacts with users through a user interface
    • Performs application functions
    • Interacts with client middleware using middleware API
    • Receives responses and displays them if needed
  • Servers
    • Implement services
    • Invoked by server middleware
    • Provide error-recovery and failure-handling services

Overview

  • Common communication patterns in distributed applications
    • Client-Server
    • Group (multicast)
    • Function-shipping/Applets
  • Client
    • Process that requests services
  • Server
    • Process that provides services
  • Details
    • Client usually blocks until server responds (see the socket sketch below)
    • Client usually invoked by end users when they require services
    • Server usually waits for incoming requests
    • Server can have many clients making concurrent requests
    • Server is usually a program with special privileges
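
A minimal sketch of this blocking request/reply pattern using plain TCP sockets; the echo behaviour, port number, and class names are illustrative.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;

    // EchoServer.java — waits for incoming requests and replies to each one
    public class EchoServer {
        public static void main(String[] args) throws Exception {
            try (ServerSocket server = new ServerSocket(5000)) {
                while (true) {
                    try (Socket s = server.accept()) {
                        BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
                        PrintWriter out = new PrintWriter(s.getOutputStream(), true);
                        out.println("echo: " + in.readLine()); // reply to the request
                    }
                }
            }
        }
    }

    // EchoClient.java — sends a request, then blocks until the server responds
    public class EchoClient {
        public static void main(String[] args) throws Exception {
            try (Socket s = new Socket("localhost", 5000)) {
                PrintWriter out = new PrintWriter(s.getOutputStream(), true);
                BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
                out.println("hello");
                System.out.println(in.readLine()); // client blocks here until the reply arrives
            }
        }
    }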

Application Software Architectures

  • Many applications can be considered to be made up of three software components or logical tiers

    • user interface
    • processing layer
    • data layer

  • Client/server architecture

    • Single physical tier, two physical tiers
    • multi-tiered
    • Distributed Data
      • e.g., distributed database
    • Remote data
      • e.g., network file system 
    • Distributed programs
      • e.g., world wide web
    • Remote presentation
      • e.g., telnet
    • Distributed presentation
      • e.g., X Windows

  • Motivation for multi-tier architectures

    • Frees clients from dependencies on the exact implementation of the database
    • It allows business logic to be concentrated in one place
      • software updates are restricted to the middle layer
    • Performance improvements possible by batching requests from many clients to database
    • Database and business logic tiers could be implemented by multiple servers for scalability

Distributed Software and System Architectures

Lecture 2

1. Distributed Architectures

  • Software Architecture
    • Logical organization of the collection of software components that make up a distributed application
  • System Architecture
    • Instantiation of a software architecture, i.e., physical placement of software components on computers

2. Architecture Style

  • Layered architectures
    • Pros
      • Division of tasks
      • Scalability
      • Transparency
      • Portability
    • Cons
      • e.g., layer 1 and layer 3 cannot talk directly
  • Object-based architectures
    • Pros
      • Independent components
      • Free to talk to anyone
  • Data-centered architectures
    • e.g., Google Docs
  • Event-based architectures
    • e.g., Facebook

3. System Architecture

  • Centralized architecture
    • client-server applications
  • Decentralized architecture
    • Peer-to-peer applications
  • Hybrid architecture

Introduction to Distributed Computing Systems

Lecture 1

Definition

  • Distributed System
    • Tanenbaum
      • A distributed system is a collection of independent computers that appears to its users as a single coherent system
    • Lamport
      • You know you have one when the crash of a computer you’ve never heard of stops you from getting any work done.
  • Distributed Applications
    • Applications that consist of a set of processes that are distributed across a network of machines and work together as an ensemble to solve a common problem.
  • Types of Distributed Systems
    • Distributed computing systems
      • Cluster computing: homogeneous
      • Grid computing: heterogeneous via virtual organizations
      • Cloud computing: everything as a service
    • Distributed Information Systems
      • Transaction processing systems (transactional RPC)
      • Enterprise information integration
      • Publish/Subscribe systems (message-oriented vs. RPC/RMI)
    • Distributed Pervasive Systems
      • Home systems
      • Health care systems
      • Sensor networks

Characteristic properties of transactions (ACID)

  • Atomic
    • To the outside world, the transaction happens indivisibly
  • Consistent
    • The transaction does not violate system invariants
  • Isolated
    • Concurrent transactions do not interfere with each other.
  • Durable
    • Once a transaction commits, the changes are permanent.
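
A brief JDBC sketch of these properties, assuming a hypothetical accounts table and connection URL; either both updates take effect and become durable at commit, or the rollback undoes them.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // Illustrative money transfer as one transaction (table and column names are placeholders).
    public class TransferExample {
        public static void transfer(String url, long from, long to, int amount) throws SQLException {
            try (Connection conn = DriverManager.getConnection(url)) {
                conn.setAutoCommit(false);            // start an explicit transaction
                try (PreparedStatement debit = conn.prepareStatement(
                         "UPDATE accounts SET balance = balance - ? WHERE id = ?");
                     PreparedStatement credit = conn.prepareStatement(
                         "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
                    debit.setInt(1, amount);  debit.setLong(2, from);  debit.executeUpdate();
                    credit.setInt(1, amount); credit.setLong(2, to);   credit.executeUpdate();
                    conn.commit();                    // durable once commit returns
                } catch (SQLException e) {
                    conn.rollback();                  // atomic: undo partial work on failure
                    throw e;
                }
            }
        }
    }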

Goals/Benefits

  • Resource sharing
  • Distribution transparency
  • Scalability
  • Fault tolerance and availability
  • Performance
    • Parallel computing can be considered a subset of distributed computing.

Challenges

  • Heterogeneity
  • Need for “openness”
    • Open standards: key interfaces in software and communication protocols need to be standardized
    • Often defined in Interface Definition Language (IDL)
  • Security
    • Denial of service attacks
  • Scalability
    • size
    • geographically
    • administratively
  • Transparency
  • Failure handling
  • Quality of service

Scalability

  • Factors
    • Size
      • Concerns centralized services/data/algorithms
    • Geographically
      • Synchronous communication in LAN vs. asynchronous communication in WAN
    • Administratively
      • Policy conflicts from different organizations (e.g., for security, access control)
  • Scalability techniques
    • Hiding communication latency
      • Asynchronous communication
      • Code migration (to client)
    • Distribution
      • Splitting a large component into parts (e.g., DNS)
    • Replication
      • Caching (decision of clients vs. servers)
      • On demand (pull) vs. planned (push)
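
As a small illustration of the caching technique, a client-side LRU cache sketch (class name and eviction policy are illustrative); recently used results stay local, so repeated lookups avoid a network round trip.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Bounded LRU cache built on LinkedHashMap's access-order mode.
    public class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        public LruCache(int capacity) {
            super(16, 0.75f, true);   // accessOrder = true gives LRU ordering
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity; // evict the least recently used entry when full
        }
    }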

Communication

  • Communication Paradigms

    • Interprocess communication
      • Socket programming, message passing, etc.
    • Remote invocation
      • Request/Reply
      • RPC/RMI
    • Indirect communication
      • Group communication
      • Publisher-subscriber
      • Message queues
      • Tuple spaces

  • Communication Patterns

    • Client-server
    • Group-oriented/Peer-to-Peer
      • Applications that require reliability, scalability
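
A toy sketch of the indirect-communication idea above (message queues), using a blocking queue inside one JVM as a stand-in for a distributed broker such as JMS; sender and receiver know only the queue, never each other.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Producer and consumer are decoupled: neither holds a reference to the other.
    public class QueueDemo {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);

            Thread producer = new Thread(() -> {
                try {
                    queue.put("event-1");             // publish without knowing the consumer
                    queue.put("event-2");
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });

            Thread consumer = new Thread(() -> {
                try {
                    System.out.println(queue.take()); // blocks until a message arrives
                    System.out.println(queue.take());
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });

            producer.start(); consumer.start();
            producer.join();  consumer.join();
        }
    }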

Distributed Software

  • Middleware handles heterogeneity
  • High-level support
    • Make the distributed nature of the application transparent to the user/programmer
      • Remote procedure calls
      • RPC + Object orientation = CORBA
  • Higher-level support, BUT exposes remote objects, partial failures, etc. to the programmer
  • Scalability

Fundamental/Abstract Models

  • Interaction Model
    • Reflects the assumptions about the processes and the communication channels in the distributed system.
  • Failure Model
    • Distinguish between the types of failures of the processes and the communication channels. 
  • Security Model
    • Assumptions about the principals and the adversary

Interaction Models

  • Synchronous Distributed Systems
    • A system in which the following bounds are defined
      • The time to execute each step of a process has an upper and lower bound
      • Each message transmitted over a channel is received within a known bounded delay.
      • Each process has a local clock whose drift rate from real time has a known bound.
  • Asynchronous distributed system
    • Each step of a process can take an arbitrary time
    • Message delivery time is arbitrary
    • Clock drift rates are arbitrary
  • Some implications
    • In a synchronous system, timeout can be used to detect failures
    • While in an asynchronous system, it is impossible to reliably detect failures or “reach agreement”.
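
A small sketch of the timeout implication above, assuming a hypothetical peer that sends at least one byte after accepting a connection; host, port, and timeout values are illustrative.

    import java.io.InputStream;
    import java.net.InetSocketAddress;
    import java.net.Socket;
    import java.net.SocketTimeoutException;

    // Timeout-based failure suspicion: under an assumed delay bound (synchronous
    // model), no reply within the bound means the peer can be treated as failed.
    // In an asynchronous system the same timeout may merely mean "slow".
    public class PingPeer {
        public static boolean seemsAlive(String host, int port, int timeoutMillis) {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(host, port), timeoutMillis); // bounded connect
                s.setSoTimeout(timeoutMillis);                               // bounded reads
                InputStream in = s.getInputStream();
                return in.read() != -1;          // expect the peer to send at least one byte
            } catch (SocketTimeoutException e) {
                return false;                    // no reply within the assumed bound
            } catch (Exception e) {
                return false;                    // connection refused, unreachable, etc.
            }
        }
    }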