Distributed Systems: Processes

Distributed Systems Series

Threads

Introduction to threads

Processor: Provides a set of instructions along with the capability of automatically executing a series of those instructions.

Thread: A minimal software processor in whose context a series of instructions can be executed. Saving a thread context implies stopping the current execution, and saving all the data needed to continue the execution at a later stage.

Process: A software processor in whose context one or more threads may be executed. Executing a thread means executing a series of instructions in the context of that thread.

Why use threads

  • Avoid needless blocking: a single-threaded process will block when doing I/O; in a multi-threaded process, the operating system can switch the CPU to another thread in that process.

  • Avoid process switching: structure large applications not as a collection of processes, but through multiple threads.

  • Exploit parallelism: the threads in a multi-threaded process can be scheduled to run in parallel on a multiprocessor or multicore processor.

Threads in distributed systems

Threads at the client side

Multithreaded web clients can hide network latency: each file can be fetched by a separate thread, so one slow download does not hold up the others.
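As a minimal sketch of this idea, the following Python example fetches several files with one thread per file. The `fetch` function is a hypothetical stand-in that sleeps instead of touching the network, and the file names and the 0.1 second latency are assumptions made for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(name, latency=0.1):
    """Hypothetical stand-in for a blocking network fetch.

    Sleeping mimics the time spent waiting on the network; no real
    request is made.
    """
    time.sleep(latency)
    return f"<contents of {name}>"


def fetch_all(names, max_workers=8):
    """Fetch every file in its own worker thread and return name -> contents."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input names.
        return dict(zip(names, pool.map(fetch, names)))


if __name__ == "__main__":
    files = ["index.html", "logo.png", "style.css", "app.js"]
    start = time.perf_counter()
    pages = fetch_all(files)
    elapsed = time.perf_counter() - start
    print(f"fetched {len(pages)} files in {elapsed:.2f}s")
```

Because the threads wait at the same time, the total time should stay close to the latency of a single fetch rather than the sum of all of them.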

Threads at the server side

Starting a thread is cheaper than starting a new process, and a single-threaded server cannot simply be scaled up to a multiprocessor system. Servers can also hide network latency by handling the next request while the reply to the previous one is still being sent.

Virtualization

The principle of virtualization

Virtualization deals with extending or replacing an existing interface to mimic the behavior of another system. It has become important because hardware changes faster than software, and because it eases portability and code migration.

Application of virtual machines to distributed systems

Cloud providers offer roughly three different types of services:

Infrastructure-as-a-Service (IaaS): covering the basic infrastructure.

Platform-as-a-Service (PaaS): covering system-level services.

Software-as-a-Service (SaaS): containing actual applications.

Clients

We can provide direct access to remote services by offering only a convenient user interface. Effectively, this means that the client machine is used only as a terminal with no need for local storage, leading to an application-neutral solution. In the case of networked user interfaces, everything is processed and stored on the server. This thin-client approach has received much attention with the increase of Internet connectivity and the use of mobile devices.

Client-side software for distribution transparency

Access transparency: client-side stubs for RPCs.

Replication transparency: multiple invocations handled by client stub.

Location transparency: let client-side software keep track of the actual location.

Failure transparency: can often be placed only on the client (we’re trying to mask server and communication failures).
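To illustrate how client-side software can hide replication and server failures, here is a small sketch of a client stub that tries each replica in turn. The `Replica`, `ReplicaDown`, and `ClientStub` names are hypothetical, and the replicas are plain in-process objects rather than remote servers, so this is an illustration of the idea and not a real RPC stub.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve requests (simulated failure)."""


class Replica:
    """In-process stand-in for a remote server replica (hypothetical)."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def handle(self, request):
        if not self.alive:
            raise ReplicaDown(self.name)
        return f"{self.name} handled {request}"


class ClientStub:
    """Hides replica selection and failover from the calling application."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def invoke(self, request):
        failed = []
        for replica in self.replicas:
            try:
                # The caller only ever sees the first successful reply.
                return replica.handle(request)
            except ReplicaDown as exc:
                failed.append(str(exc))
        raise RuntimeError("all replicas failed: " + ", ".join(failed))


if __name__ == "__main__":
    stub = ClientStub([Replica("replica-1", alive=False), Replica("replica-2")])
    print(stub.invoke("read x"))
```

The application calls `invoke` once and never learns that the first replica was down; only when every replica fails does the failure become visible.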

Servers

General organization

A server is a process implementing a specific service on behalf of a collection of clients. In essence, each server is organized in the same way: it waits for an incoming request from a client and subsequently ensures that the request is taken care of, after which it waits for the next incoming request.

Server types

There are two basic types:

  • Iterative server: The server itself handles each request completely before attending to the next one.

  • Concurrent server: The server uses a dispatcher, which picks up an incoming request that is then passed on to a separate thread/process.
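The two organizations can be contrasted in a short sketch. Requests here are plain values in a list instead of network messages, and the function names, the queue-based hand-off, and the default of four worker threads are all assumptions made for illustration.

```python
import queue
import threading


def iterative_server(requests, handler):
    """Handle each request to completion before looking at the next one."""
    return [handler(request) for request in requests]


def concurrent_server(requests, handler, num_workers=4):
    """Dispatcher hands requests to worker threads through a shared queue."""
    tasks = queue.Queue()
    replies = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this worker
                break
            index, request = item
            reply = handler(request)
            with lock:
                replies[index] = reply

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # Dispatcher: pick up each incoming request and pass it on.
    for index, request in enumerate(requests):
        tasks.put((index, request))
    for _ in workers:
        tasks.put(None)
    for w in workers:
        w.join()

    # Return replies in the order the requests arrived.
    return [replies[i] for i in range(len(requests))]


if __name__ == "__main__":
    incoming = ["get /a", "get /b", "get /c"]
    print(iterative_server(incoming, str.upper))
    print(concurrent_server(incoming, str.upper))
```

Both versions produce the same replies; the difference is that the concurrent version can keep accepting work while earlier requests are still being processed.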

Object servers

The important difference between a general object server and other servers is that an object server by itself does not provide a specific service.

Specific services are implemented by the objects that reside in the server. Essentially, the server provides only the means to invoke local objects, based on requests from remote clients. As a consequence, it is relatively easy to change services by simply adding and removing objects. An object server thus acts as a place where objects live.

An object consists of two parts: data representing its state and the code for executing its methods. Whether or not these parts are separated, or whether method implementations are shared by multiple objects, depends on the object server.
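A rough sketch of this idea: the server below offers no service of its own and only looks up a local object and invokes the requested method on it. The `ObjectServer` and `Counter` classes and the identifier `"counter-1"` are hypothetical, and requests are ordinary method calls rather than messages from remote clients.

```python
class ObjectServer:
    """A place where objects live; services come from the objects themselves."""

    def __init__(self):
        self._objects = {}

    def register(self, object_id, obj):
        # Adding an object adds a service without changing the server.
        self._objects[object_id] = obj

    def unregister(self, object_id):
        self._objects.pop(object_id, None)

    def invoke(self, object_id, method, *args):
        """Invoke a method on a local object on behalf of a client."""
        obj = self._objects.get(object_id)
        if obj is None:
            raise LookupError(f"no object registered as {object_id!r}")
        return getattr(obj, method)(*args)


class Counter:
    """Example object: state (value) plus code for its methods."""

    def __init__(self):
        self.value = 0

    def increment(self, by=1):
        self.value += by
        return self.value


if __name__ == "__main__":
    server = ObjectServer()
    server.register("counter-1", Counter())
    print(server.invoke("counter-1", "increment"))
    print(server.invoke("counter-1", "increment", 4))
```

Changing the services on offer is then a matter of calling `register` or `unregister`; the server code itself stays the same.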

The Apache web server

An interesting example of a server that balances the separation between policies and mechanisms is the Apache Web server.

It is an extremely popular server, estimated to be used to host approximately 50% of all Web sites. Apache is a complex piece of software, and with the numerous enhancements to the types of documents that are now offered on the Web, it is important that the server is highly configurable and extensible, and at the same time largely independent of specific platforms.

Figure: The general organization of the Apache Web server.

Code Migration

Reasons for migrating code

  1. Ensuring that servers in a data centre are sufficiently loaded to prevent waste of energy.

  2. Minimizing communication by ensuring that computations run close to where the data is, for example in mobile computing.

Figure: The principle of dynamically configuring a client to communicate with a server.

Migration in heterogeneous systems

Problems & Solutions

  1. The target machine may not be suitable to execute the migrated code.

  2. The definition of process/thread/processor context is highly dependent on local hardware, operating system and runtime system.

We can use virtual machines as a solution.

Migration alternatives for images

  • Pushing memory pages to the new machine and resending the ones that are later modified during the migration process.

  • Stopping the current virtual machine, migrating memory, and starting the new virtual machine.

  • Letting the new virtual machine pull in new pages as needed: processes start on the new virtual machine immediately and copy memory pages on demand.
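The first alternative (often called pre-copy) can be sketched as a small simulation. Memory is modeled as a dictionary from page number to contents, and `dirty_rounds` is a hypothetical list describing which pages the still-running virtual machine modifies during each copy round. The function name, the round limit, and the final stop-and-copy step for leftover dirty pages are assumptions of this sketch, not a description of any particular hypervisor.

```python
def precopy_migrate(source, dirty_rounds, max_rounds=5):
    """Simulate pre-copy migration of memory pages.

    source:       dict page number -> contents on the old machine
                  (updated in place as the VM keeps writing).
    dirty_rounds: list of dicts; entry i holds the pages written by the
                  running VM while copy round i is in progress.
    Returns (target memory, number of copy rounds performed).
    """
    target = {}
    to_send = set(source)          # first round: push every page
    pending_writes = list(dirty_rounds)
    rounds = 0

    while to_send and rounds < max_rounds:
        for page in to_send:
            target[page] = source[page]
        # The VM keeps running during the copy and dirties some pages.
        writes = pending_writes.pop(0) if pending_writes else {}
        source.update(writes)
        to_send = set(writes)      # only modified pages are resent
        rounds += 1

    # Assumed final step: stop the VM and copy whatever is still dirty.
    for page in to_send:
        target[page] = source[page]
    return target, rounds


if __name__ == "__main__":
    memory = {0: "a", 1: "b", 2: "c"}
    copied, rounds = precopy_migrate(memory, [{1: "B"}, {2: "C"}])
    print(copied == memory, rounds)
```

Each round resends only the pages that changed during the previous one, so the amount of data still to be moved when the machine finally stops is usually small.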

Thank you, and goodbye!