Web Service — REST

What is REST

  • An architectural style for implementing networked systems; REST stands for “Representational State Transfer”
  • A client references a web resource using a URL
  • REST serves as a guiding framework for the web
  • HTTP is not just a protocol
    • It provides an API (POST, GET, PUT, DELETE) for create, read, update, and delete operations on a resource
  • The approach isolates application complexity at the endpoints (client and server) and keeps it out of the transport
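
The mapping from HTTP methods to create, read, update, and delete can be sketched without a network. This is a minimal sketch, not a real web server: the `resources` store, the `handle` function, and the `/accounts/1` URL are hypothetical names chosen for illustration, and creating a resource by POSTing to its own URL is a simplification.

```python
# Hypothetical in-memory resource store; keys are URL paths.
resources = {}

def handle(method, url, body=None):
    """Dispatch an HTTP-style request to a CRUD operation on one resource.

    Returns (status_code, body), using common HTTP status codes.
    """
    if method == "POST":                 # create
        if url in resources:
            return 409, None             # conflict: resource already exists
        resources[url] = body
        return 201, body
    if method == "GET":                  # read
        if url not in resources:
            return 404, None
        return 200, resources[url]
    if method == "PUT":                  # update (replace the representation)
        resources[url] = body
        return 200, body
    if method == "DELETE":               # delete
        if url not in resources:
            return 404, None
        del resources[url]
        return 204, None
    return 405, None                     # method not allowed
```

For example, `handle("POST", "/accounts/1", {"owner": "xx"})` creates the resource and a later `handle("GET", "/accounts/1")` reads it back. The client and the store hold all application logic; the interface between them stays generic.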

Three Fundamental Aspects of REST

  • Resources
    • Every distinguishable entity is a resource. A resource may be a web site, an HTML page, an XML document, etc.
  • URLs
    • Every resource is uniquely identified by a URL.
  • Simple operations

REST vs. SOAP

REST

  • The web is the universe of globally accessible information
  • Resource oriented
  • User-driven interactions via forms
  • Few operations (generic interface) on many resources
  • URI: Consistent naming mechanism for resources
  • Focus on scalability and performance of large scale distributed hypermedia systems

SOAP

  • The web is the universal transport for messages
  • Activity/Service oriented
  • Orchestrated reliable event flows
  • Many operations (service interface) on few resources
  • Lack of standard naming mechanism
  • Focus on design of integrated (distributed) applications

Web Service

Web Services Fundamentals

Two Competing Approaches

  • REST-style
  • SOAP-style

Four Fundamental Technologies

  • XML
    • Describing information sent over the network
  • WSDL
    • Defining web service capability
  • SOAP
    • Accessing web services
  • UDDI
    • Finding web services

Web Service Infrastructure and Components

XML

  • Has emerged as the standard solution for describing information exchanged between heterogeneous systems
  • Can be read by programs and interpreted in an application-specific way
  • Example
    • <Account>xx</Account>
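
As a small illustration of a program reading XML and interpreting it in its own way, the example element above can be parsed with Python's standard `xml.etree.ElementTree` module. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Parse the example document and pull out the element name and its text.
document = "<Account>xx</Account>"
root = ET.fromstring(document)

account_tag = root.tag      # element name
account_value = root.text   # element content, here the account identifier
```

What `xx` means (an account number, a customer name) is decided by the application, not by XML itself.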

WSDL: Describing the web service

  • Provides functional description of network services
    • IDL description
    • Protocol and deployment details
    • Platform independent description
    • Extensible language
  • As extended IDL: WSDL allows tools to generate compatible client and server stubs
    • Allows industries to define standardized service interfaces
    • Allows advertisement of service descriptions, enables dynamic discovery and binding of compatible services
      • Used in conjunction with UDDI registry
  • The main elements in a WSDL description

UDDI: Finding Web Service

  • Universal Description, Discovery, Integration
  • UDDI defines the operation of a service registry
    • Data structures for registering
      • Business
      • Technical specification: tModel is a keyed reference to a technical specification
      • Service and service endpoints
        • Referencing the supported tModels
  • The main UDDI data structures

SOAP

  • Why SOAP
    • A “wire protocol” necessary for accessing distributed object services
    • Vendor and/or platform-specific wire protocols hinder interoperability
  • SOAP
    • An Internet standard specification whose goal is to define a platform- and vendor-neutral wire protocol, based on Internet standard protocols (HTTP and XML), for accessing Web Services
  • Features
    • Uses XML to package requests for services exposed by Web Services, and the responses generated by Web Services
    • Typically uses HTTP as a transport protocol
  • SOAP message
    • Convey documents
    • Support client-server communication
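
To make the idea of packaging a request in XML concrete, here is a minimal sketch that builds and parses a SOAP 1.1 style envelope with the standard library. The service namespace `http://example.com/bank`, the `GetBalance` operation, and the helper names are assumptions made for illustration; a real service would define them in its WSDL, and the message would normally travel in the body of an HTTP POST.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/bank"                     # hypothetical service namespace

def build_request(operation, params):
    """Wrap an operation name and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def parse_request(xml_text):
    """Recover the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]   # first child of Body
    operation = call.tag.split("}", 1)[1]           # strip the namespace prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params
```

Calling `parse_request(build_request("GetBalance", {"Account": "xx"}))` returns the operation name and parameter dictionary that went in, which is the round trip a SOAP client and server perform.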

RESTful Approach

  • Focus on using HTTP operations (GET, PUT, POST, DELETE) to manipulate data resources represented in XML
    • No WSDL + SOAP

Remote Method Invocation – Design & Implementation

Middleware layers

Distributed Objects 

Compile-time vs. Run-time Objects

  • Objects can be implemented in many different ways
    • Compile-time objects
      • e.g., instances of classes written in an object-oriented language such as Java or C++
    • Database objects
    • Code in procedural languages like C, with appropriate “wrapper code” that gives it the appearance of an object
  • Systems like Java RMI support compile-time objects
  • Not possible or difficult in language-independent RMI middleware such as CORBA
    • These systems use object adapters
    • Implementations of object interfaces are registered at an object adapter, which acts as an intermediary between the client and the object implementation

Persistent vs. Transient Objects

  • Persistent objects 
    • continue to exist even if they are not contained in the address space of a server process
    • the “state” of a persistent object has to be stored in a persistent store, i.e., some secondary storage
    • invocation requests result in an instance of the object being created in the address space of a running process
      • many policies possible for object instantiation and (de)instantiation
  • Transient objects
    • Only exist as long as their container server processes are running
      • i.e., only exist in memory

Static vs Dynamic Remote Method Invocations

  • Static invocation
    • The typical way of writing code that uses RMI is similar to the process for writing RPC code
    • declare the interface in IDL, compile the IDL file to generate client and server stubs, link them to client and server side code to generate the client and the server executables
    • requires the object interface to be known when the client is being developed
  • Dynamic invocation
    • The method invocation is composed at run-time
      • invoke (object, method, input_parameters, output_parameters)
    • Useful for applications where object interfaces are discovered at runtime
      • e.g., object browser, batch processing systems for object invocations
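
Dynamic invocation can be illustrated locally with Python's reflection facilities. This is only an analogy under stated assumptions: the `Account` class and `list_methods` helper are hypothetical, the call is not remote, and results are returned directly rather than through separate output parameters.

```python
class Account:
    """Hypothetical object whose interface the caller discovers at run time."""

    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

def list_methods(obj):
    """Discover the public methods of an object, as an object browser might."""
    return sorted(
        name for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )

def invoke(obj, method, input_parameters):
    """Compose a method call at run time from a name and an argument list."""
    target = getattr(obj, method, None)
    if not callable(target):
        raise AttributeError(f"object has no method {method!r}")
    return target(*input_parameters)
```

With static invocation the caller would write `account.deposit(5)` against a known interface; here the same call is expressed as `invoke(account, "deposit", [5])`, where the method name could come from user input or a batch file.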

Design Issues for RMI

  • RMI invocation semantics
    • Invocation semantics depend upon implementation of Request-Reply protocol used by RMI
    • Could be Maybe, At-least-once, or At-most-once

  • Transparency
    • Should remote invocations be transparent to the programmer?
      • Partial failure, higher latency
      • Different semantics for remote objects, e.g., difficult to implement “cloning” in the same way for local and remote objects or to support synchronization operations e.g., wait/notify
    • Current consensus
      • Access transparency
        • Remote invocations should be made transparent in the sense that syntax of a remote invocation is the same as the syntax of local invocation
        • Distinguish
          • But programmers should be able to distinguish between remote and local objects by looking at their interfaces, 
          • e.g., in Java RMI, remote objects implement the Remote interface

Implementation Issues for RMI

  • Parameter Passing
    • Representation of a remote object reference
  • Request/Reply protocol
    • Handling failures at client and/or server
    • Issues in marshaling of parameters and results
      • Input, output, inout parameters
      • Data representation
      • handling reference parameters
    • Distributed object references
    • handling failures in request-reply protocol
      • Partial failure
        • Client, server, network
  • Supporting persistent objects, object adapters, dynamic invocations, etc

Marshalling

  • Pack method arguments and results into a flat array of bytes
  • Use a canonical representation of data types
    • e.g., integers, characters, doubles
  • Example
    • CORBA CDR
    • Java serialization

Handling failures

  • Client unable to locate server
    • Reasons
      • Server has crashed
      • Server has moved
      • (RPC systems) client compiled using old version of service interface
    • System must report error (remote exception) to client
      • Loss of transparency
  • Request message lost
    • Retransmit a fixed number of times before throwing an exception
  • Reply message lost
    • Client resubmits request
    • Server choices
      • Re-execute procedure
        • Operation should be idempotent so that it can be repeated safely
      • Filter duplicates
        • Server should hold on to results until acknowledged
  • Server crashes after receiving a request
    • At least once
      • Keep trying till server comes up again
    • At most once
      • Return immediately
    • Exactly once impossible to achieve
  • Client crashes after sending a request
    • If a client crashes before RPC returns, we have an “orphan” computation at server
      • Wastes resources, could also start other computations
    • Orphan detection
      • Reincarnation
        • Client broadcasts new epoch when it comes up again
      • Expiration
        • RPC has fixed amount of time to do work
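
The lost-reply case can be sketched as a small simulation: the client retransmits a bounded number of times, and the server filters duplicates by request id so that a non-idempotent operation runs only once. All class and function names here are hypothetical, and the "network" is an in-process object that drops replies on purpose.

```python
class Server:
    def __init__(self):
        self.balance = 0
        self.replies = {}   # request id -> saved reply, kept until acknowledged

    def handle(self, request_id, amount):
        if request_id in self.replies:      # duplicate: replay the saved reply
            return self.replies[request_id]
        self.balance += amount              # non-idempotent operation
        self.replies[request_id] = self.balance
        return self.balance

    def acknowledge(self, request_id):
        self.replies.pop(request_id, None)  # client got the reply; free it

class LossyChannel:
    """Delivers every request but drops the first `drop_replies` replies."""

    def __init__(self, server, drop_replies):
        self.server = server
        self.drop_replies = drop_replies

    def send(self, request_id, amount):
        reply = self.server.handle(request_id, amount)
        if self.drop_replies > 0:
            self.drop_replies -= 1
            raise TimeoutError("reply lost")
        return reply

def call(channel, request_id, amount, max_retries=3):
    """Retransmit up to max_retries times, then report a remote exception."""
    for _ in range(max_retries):
        try:
            reply = channel.send(request_id, amount)
        except TimeoutError:
            continue
        channel.server.acknowledge(request_id)
        return reply
    raise RuntimeError("remote exception: no reply after retries")
```

With two dropped replies, `call(channel, "req-1", 10)` succeeds on the third attempt and the balance is 10, not 30, because the two retransmissions were recognized as duplicates. Without the reply table, the same retries would apply the deposit three times.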

                        Note

                        • Implementing the request-reply protocol on top of TCP
                          • Does not help in providing applications with different invocation semantics
                            • TCP does not help with server crashes
                            • If a connection is broken, the end points do not have any guarantees about the delivery of messages that may have been in transit

                          RMI Software Components

                          • Communication module
                            • Implements the request-reply protocol
                          • Remote reference module
                            • Responsible for translating between local and remote object references and for creating remote object references
                              • Maintains a remote object table that maps between local and remote object references
                              • E.g., Object Adapter in CORBA


                          RMI – Object Activation

                          • Activation of remote objects
                            • Some applications require that information survive for long periods of time
                            • However, objects are not in use all the time, so keeping them in running processes is a potential waste of resources
                            • Object can be activated on demand
                              • E.g., standard TCP services such as FTP on UNIX machines are activated by inetd
                          • Active and passive objects
                            • Active objects
                              • Instantiated in a running process
                            • Passive objects
                              • Not currently active but can be made active
                              • Implementation of its methods, and marshalled state stored on disk
                          • Activator responsible for
                            • Registering passive objects that are available for activation
                            • Starting named server processes and activating remote objects in them
                            • Keeping track of the locations of servers for remote objects that it has already activated
                          • Examples
                            • CORBA implementation repository
                            • Java RMI has one activator on each server computer

                          RMI – Other Topics

                          • Persistent object stores
                            • An object that is guaranteed to live between activations of processes is called a persistent object
                            • Stores the state of an object in a marshalled (serialized) form on disk
                          • Location service
                            • Objects can be migrated from one system to another during their lifetime
                            • Maintains mapping between object references and the location of an object
                          • Distributed Garbage Collection
                            • Needed for reclaiming space on servers
                          • Passing “behavior”
                            • Java allows objects (data+code) to be passed by value
                              • If the class for an object passed by value is not present in a JVM, its code is downloaded automatically
                          • Use of reflection in Java RMI
                            • Allows construction of generic dispatcher and skeleton

                          Distributed Garbage Collection

                          • Java approach based on reference counting
                            • Each server process maintains a list of remote processes that hold remote object references for its remote objects
                            • When a client first acquires a remote reference to an object, it makes an addRef() invocation to the server before creating a proxy
                            • When a client's local garbage collector notices that a proxy is no longer reachable, it makes a removeRef() invocation to the server before deleting the proxy
                            • When the local garbage collector on the server notices that the list of client processes that have a remote reference to an object is empty, it will delete the object (unless there are any local objects that have a reference to the object)
                          • Other approaches
                            • Evictor pattern
                            • Leases
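
The reference-counting scheme described above can be sketched as follows. This is a simplified model, not Java RMI's actual implementation: `RemoteObjectServer`, `add_ref`, `remove_ref`, and `collect` are hypothetical names, calls are local rather than remote, and references held by local objects on the server are ignored.

```python
class RemoteObjectServer:
    def __init__(self):
        self.objects = {}   # object id -> exported object
        self.holders = {}   # object id -> set of client ids holding a remote reference

    def export(self, object_id, obj):
        self.objects[object_id] = obj
        self.holders[object_id] = set()

    def add_ref(self, object_id, client_id):
        """Called by a client before it creates a proxy for the object."""
        self.holders[object_id].add(client_id)

    def remove_ref(self, object_id, client_id):
        """Called by a client when its proxy becomes unreachable."""
        self.holders[object_id].discard(client_id)

    def collect(self):
        """Delete objects that no client references; return their ids."""
        dead = [oid for oid, clients in self.holders.items() if not clients]
        for oid in dead:
            del self.objects[oid]
            del self.holders[oid]
        return dead
```

The scheme depends on clients reliably sending `remove_ref`; a client that crashes never does, which is one motivation for the lease-based alternative listed above.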

                          Java RMI

                          lecture 4

                          Features

                          • Integrated with the Java language and libraries
                            • Security, write once run anywhere, multithreaded
                            • Object oriented
                          • Can pass “behavior”
                            • Mobile code
                            • Not possible in CORBA, traditional RPC systems
                          • Distributed garbage collection
                          • Remoteness of objects intentionally not transparent
                            • Good for handling failures

                          Remote Interfaces, Objects, and Methods

                          • Object becomes remote by implementing a remote interface
                            • A remote interface extends the interface java.rmi.Remote
                            • Each method of the interface declares java.rmi.RemoteException in its throws clause, in addition to any application-specific exceptions

                          Creating distributed applications using RMI

                          1. Define the remote interfaces
                          2. Implement the remote objects and server
                          3. Implement the client
                          4. Compile the remote interface, server and client 
                          5. Generate the stub and skeleton using rmic
                          6. Start the RMI registry
                          7. Start the server
                          8. Run the client

                          Middleware

                          Lecture 2 — part 5

                          Middleware

                          • Definition
                            • Middleware is a set of common business-unaware services that enable applications and end-users to interact with each other across a network
                            • Distributed system services that have standard programming interfaces and protocols 
                              • Services sit in the middle above OS and network software
                              • and below industry-specific applications
                          • Examples
                            • ftp, email
                            • web browsers
                            • database drivers and gateways
                            • CORBA (Common object request broker architecture)
                            • Microsoft .NET
                            • Java RMI, JINI, Javaspaces, JMS
                            • Web services software — SOAP, REST

                          Functional View of Middleware

                          • Information exchange services
                          • Application-specific services
                            • Specialized services
                              • e.g., transaction services and replication services for distributed databases
                              • group services for collaborative applications, specialized services for multimedia applications
                            • business-unaware
                          • Management and support service
                            • needed for locating distributed resources and administering resources across the network

                          System Architecture — Peer to Peer Computing

                          lecture 2 — part 3

                          Organization of nodes in P2P Systems

                          • Centralized directory
                            • Original Napster
                              • Pros
                                • Simple
                              • Cons
                                • O(N) states
                                • single point of failure
                          • Unstructured P2P systems
                            • Gnutella and its successors (flood queries)
                              • Pros
                                • Robust
                              • Cons
                                • Worst case O(N) messages per lookup
                          • Structured P2P systems
                            • Based upon Distributed Hash Tables (DHTs)
                            • Chord, CAN, Tapestry…

                          Distributed Hash Table (DHT)

                          • Distributed Hash Table
                            • Key = Hash (data)
                            • lookup(key) -> IP address
                            • send-RPC(IP address, PUT, key, value)
                            • send-RPC(IP address, GET, key) -> value
                          • Chord
                          • Example: BT content distribution
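
The lookup/PUT/GET interface above can be sketched with a ring of hashed node identifiers, in the spirit of Chord's successor rule. This is a simplified model under stated assumptions: the `Ring` class knows every node, so a lookup is a local binary search rather than a series of network hops, the "RPCs" are dictionary operations, and the IP addresses are placeholders.

```python
import bisect
import hashlib

def ring_id(text, bits=16):
    """Hash a string onto a ring of 2**bits identifiers."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)

class Ring:
    def __init__(self, addresses):
        self.nodes = sorted((ring_id(a), a) for a in addresses)
        self.ids = [node_id for node_id, _ in self.nodes]
        self.stores = {a: {} for a in addresses}   # per-node key/value storage

    def lookup(self, key):
        """Return the address of the key's successor: the first node whose
        id is >= the key's id, wrapping around the ring."""
        index = bisect.bisect_left(self.ids, ring_id(key)) % len(self.nodes)
        return self.nodes[index][1]

    def put(self, key, value):
        self.stores[self.lookup(key)][key] = value   # stands in for send-RPC(PUT)

    def get(self, key):
        return self.stores[self.lookup(key)].get(key)  # stands in for send-RPC(GET)
```

Because both `put` and `get` route through the same `lookup`, any node can find a value without a central directory. A real structured system keeps only partial routing state per node and resolves the lookup over several hops.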

                          System Architecture — Centralized architecture

                          Lecture 2 — part 2

                          Client-server applications

                          • Clients
                            • Interacts with users through a user interface
                            • Performs application functions
                            • Interacts with client middleware using middleware API
                            • Receives responses and displays them if needed
                          • Servers
                            • Implement services
                            • Invoked by server middleware
                            • Provide error-recovery and failure handling service

                          Overview

                          • Common communication patterns in distributed applications
                            • Client-Server
                            • Group (multicast)
                            • Function-shipping/Applets
                          • Client
                            • Process that requests services
                          • Server
                            • Process that provides services
                          • Details
                            • Client usually blocks until server responds
                            • Client usually invoked by end users when they require services
                            • Server usually waits for incoming requests
                            • Server can have many clients making concurrent requests
                            • Server is usually a program with special privileges

                          Application Software Architectures

                          • Many applications can be considered to be made up of three software components or logical tiers

                            • user interface
                            • processing layer
                            • data layer

                          • Client/server architecture

                            • Single physical tier, two physical tiers
                            • multi-tiered
                            • Distributed Data
                              • e.g., distributed database
                            • Remote data
                              • e.g., network file system 
                            • Distributed programs
                              • e.g., world wide web
                            • Remote presentation
                              • e.g., telnet
                            • Distributed presentation
                              • e.g., X Windows

                          • Motivation for multi-tier architectures

                            • Frees clients from dependencies on the exact implementation of database
                            • It allows business logic to be concentrated in one place
                              • software updates are restricted to middle layer
                            • Performance improvements possible by batching requests from many clients to database
                            • Database and business logic tiers could be implemented by multiple servers for scalability

                          Distributed Software and System Architectures

                          Lecture 2

                          1. Distributed Architectures

                          • Software Architecture
                            • Logical organization of the collection of software components that make up a distributed application
                          • System Architecture
                            • Instantiation of a software architecture, i.e., physical placement of software components on computers

                          2. Architecture Style

                          • Layered architectures
                            • Pros
                              • Division of task
                              • Scalability
                              • Transparency
                              • Portability
                            • Cons
                              • e.g., layer 1 and layer 3 cannot talk directly
                          • Object-based architectures
                            • Pros
                              • Independent components
                              • Free to talk to anyone
                          • Data-centered architectures
                            • e.g., google docs
                          • Event-based architectures
                            • e.g., facebook

                          3. System Architecture

                          • Centralized architecture
                            • client-server applications
                          • Decentralized architecture
                            • Peer-to-peer applications
                          • Hybrid architecture

                          Introduction to Distributed Computing Systems

                          Lecture 1

                          Definition

                          • Distributed System
                            • Tanenbaum
                              • A distributed system is a collection of independent computers that appears to its users as a single coherent system
                            • Lamport
                              • You know you have one when the crash of a computer you’ve never heard of stops you from getting any work done.
                          • Distributed Applications
                            • Applications that consist of a set of processes that are distributed across a network of machines and work together as an ensemble to solve a common problem.
                          • Types of Distributed Systems
                            • Distributed computing systems
                              • Cluster computing: homogeneous
                              • Grid computing: heterogeneous, via virtual organizations
                              • Cloud computing: everything as a service
                            • Distributed Information Systems
                              • Transaction processing system (Transaction RPC)
                              • Enterprise Information integration
                              • Publish/Subscribe systems (message-oriented vs. RPC/RMI)
                            • Distributed Pervasive Systems
                              • Home systems
                              • Health care systems
                              • Sensor networks

                          Characteristic properties of transactions (ACID)

                          • Atomic
                            • To the outside world, the transaction happens indivisibly
                          • Consistent
                            • The transaction does not violate system invariants
                          • Isolated
                            • Concurrent transactions do not interfere with each other.
                          • Durable
                            • Once a transaction commits, the changes are permanent.
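
Atomicity and consistency can be demonstrated with Python's built-in `sqlite3` module. This is a minimal sketch: the `accounts` table, the `transfer` function, and the non-negative-balance invariant are assumptions for illustration, and an in-memory database is used, so durability is not shown.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction.

    Returns True if committed, False if rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # invariant violated
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()
```

A call such as `transfer(conn, "a", "b", 500)` performs both updates, detects the negative balance, and rolls both back, so no observer ever sees the credit without the debit.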

                          Goals/Benefits

                          • Resource sharing
                          • Distribution transparency
                          • Scalability
                          • Fault tolerance and availability
                          • Performance
                            • Parallel computing can be considered a subset of distributed computing.

                          Challenges

                          • Heterogeneity
                          • Need for “openness”
                            • Open standards: key interfaces in software and communication protocols need to be standardized
                            • Often defined in Interface Definition Language (IDL)
                          • Security
                            • Denial of service attacks
                          • Scalability
                            • size
                            • geographically
                            • administratively
                          • Transparency
                          • Failure handling
                          • Quality of service

                          Scalability

                          • Factors
                            • Size
                              • Concerning centralized services/data/algorithms
                            • Geographically
                              • Synchronous communication in LAN vs. asynchronous communication in WAN
                            • Administratively
                              • Policy conflicts from different organizations (e.g., for security, access control)
                          • Scalability techniques
                            • Hiding communication latency
                              • Asynchronous communication
                              • Code migration (to client)
                            • Distribution
                              • Splitting a large component to parts (e.g., DNS)


                            • Replication
                              • Caching (decision of clients vs. servers)
                              • On demand (pull) vs. planned (push)

                          Communication

                          • Communication Paradigms

                            • Interprocess communication
                              • Socket programming, message passing, etc.
                            • Remote invocation
                              • Request/Reply
                              • RPC/RMI
                            • Indirect communication
                              • Group communication
                              • Publisher-subscriber
                              • Message queues
                              • Tuple spaces

                          • Communication Patterns

                            • Client-server
                            • Group-oriented/Peer-to-Peer
                              • Applications that require reliability, scalability

                          Distributed Software

                          • Middleware handles heterogeneity
                          • High-level support
                            • Make the distributed nature of the application transparent to the user/programmer
                              • Remote procedure calls
                              • RPC + Object orientation = CORBA
                          • Higher-level support BUT expose remote objects, partial failure, etc. to the programmer
                          • Scalability

                          Fundamental/Abstract Models

                          • Interaction Model
                            • Reflects the assumptions about the processes and the communication channels in the distributed system.
                          • Failure Model
                            • Distinguish between the types of failures of the processes and the communication channels. 
                          • Security Model
                            • Assumptions about the principals and the adversary

                          Interaction Models

                          • Synchronous Distributed Systems
                            • A system in which the following bounds are defined
                              • The time to execute each step of a process has an upper and lower bound
                              • Each message transmitted over a channel is received within a known bounded delay.
                              • Each process has a local clock whose drift rate from real time has a known bound.
                          • Asynchronous distributed system
                            • Each step of a process can take an arbitrary time
                            • Message delivery time is arbitrary
                            • Clock drift rates are arbitrary
                          • Some implications
                            • In a synchronous system, timeouts can be used to detect failures
                            • In an asynchronous system, it is impossible to reliably detect failures or “reach agreement”.
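
Timeout-based failure detection can be sketched as a heartbeat monitor. The `HeartbeatDetector` class and its explicit `now` arguments are hypothetical; time is passed in by the caller so the example stays deterministic.

```python
class HeartbeatDetector:
    def __init__(self, timeout):
        # Assumed upper bound on the heartbeat interval plus message delay.
        self.timeout = timeout
        self.last_seen = {}   # process name -> time of last heartbeat

    def heartbeat(self, process, now):
        """Record that a heartbeat from `process` arrived at time `now`."""
        self.last_seen[process] = now

    def suspected(self, now):
        """Return processes whose last heartbeat is older than the timeout."""
        return sorted(
            p for p, seen in self.last_seen.items() if now - seen > self.timeout
        )
```

In a synchronous system the timeout can be derived from the known bounds, so a suspected process really has failed. In an asynchronous system no such bound exists, and the same code can only suspect: a slow process or a delayed message looks identical to a crash.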