Vision and history of the Grid
The Grid is about sharing resources and services. The name originates from the electrical power grid and stands for a vision of future computer infrastructures: advanced computing technologies should become as ubiquitous as access to electrical power. Users should be able to simply plug in and work with an underlying pervasive infrastructure, without having to care where the computing power comes from. The required services should be readily available even though the local desktop might not be able to deliver them. The scientist dreams of high-performance simulations and experiments available at a mouse click. The literature offers many names for this development, from supercomputing and scalable computing to the more recent term Grid computing. All have been linked to strong visions and dreams.
The Grid vision is about empowering ideas without too much concern for the tools and methods needed to realise them. Currently this dream is mainly realised within research Grids, which give communities of researchers open access to research tools and opportunities. The tools and communities of the e-Science programme are therefore often identified with the Grid. In theory, however, the Grid is more than that. Its vision is of a global application, just as the Internet developed from a platform for a group of scientists into a global virtual community.
The Grid pioneers were well aware that such a vision cannot become reality without overcoming large institutional and societal problems (Foster and Kesselman 2004). Users need to get used to the new possibilities of the Grid and come to trust them for their everyday work. It is therefore necessary to comply with international standards and to invest large sums in building a core infrastructure. IBM, Oracle and Sun have committed to creating Grid-compliant infrastructures. The Open Grid Service Architecture (OGSA) combines Web services and Grid computing.
Looking at the history of the Grid, De Roure et al. distinguish three main stages of its development (De Roure, Baker et al. 2003). The Nineties saw an early infrastructure for access to high-performance computing, used by specialised scientific applications for collaborative work and scientific simulations. These two application areas remain the main driving motivations for the development of the Grid to this day. The second stage created a common infrastructure for Grid applications, for which the Globus Toolkit has become a de facto standard. The third stage concentrated on opening access to Grid resources by providing users with a flexible interface. A service-oriented approach enabled users to create their own applications by reassembling existing services through standard interfaces. While the second generation of Grid computing focussed on realising an infrastructure for high-performance computing, the third extended this model by integrating large-scale collaboration into the Grid. OGSA provides for distributed collaboration and virtual organisations (VOs) (Foster, Kesselman et al. 2001).
Key concepts
The most important concept of all is the Grid. Generally speaking, there are three types of Grids (Taylor 2005):
- Computational Grid
- Data Grid
- Service Grid
Computational Grids provide the scientist with high-performance nodes. Typically, large simulations with a huge space of unknown conditions are run on such Grid resources. Data Grids combine distributed data resources. The DataGrid project, for example, can process and transfer large amounts of physics data from CERN. Service Grids offer users services, that is, specific functions that might be required to process a scientific experiment.
Figure 1: Virtual Organizations
The Grid is the vision of an integrated infrastructure for the coordinated sharing of resources and problem solving in distributed environments (Foster and Kesselman 1998). Highly flexible resource sharing and computer-supported cooperative work are to be established via virtual organisations, as visualised in Figure 1. Virtual organisations provide a highly controlled environment that allows each resource provider to specify exactly what she wants to share, who is allowed to share it and the conditions under which this sharing occurs. The set of individuals and/or institutions that provides such sharing rules is collectively known as a virtual organisation (VO) (Taylor 2005).
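The three elements of a sharing rule named above (what is shared, who may share it, and under which conditions) can be sketched in a few lines of Python. This is only an illustrative model, not the interface of any Grid toolkit: the class names `SharingRule` and `VirtualOrganisation`, the resource name `cluster-a`, the member names and the night-time condition are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class SharingRule:
    """One provider's rule (hypothetical model, not a real Grid API)."""
    resource: str                      # what is shared
    allowed_members: Set[str]          # who is allowed to share it
    condition: Callable[[Dict], bool]  # conditions under which sharing occurs


@dataclass
class VirtualOrganisation:
    """A VO modelled simply as the collection of its sharing rules."""
    name: str
    rules: List[SharingRule] = field(default_factory=list)

    def may_access(self, member: str, resource: str, request: Dict) -> bool:
        # Access is granted only if some rule covers the resource,
        # names the member, and its condition holds for this request.
        return any(
            rule.resource == resource
            and member in rule.allowed_members
            and rule.condition(request)
            for rule in self.rules
        )


# Assumed example: a cluster shared with two members, at night only.
vo = VirtualOrganisation("example-vo")
vo.rules.append(
    SharingRule(
        resource="cluster-a",
        allowed_members={"alice", "bob"},
        condition=lambda request: request["hour"] >= 20 or request["hour"] < 6,
    )
)
```

With this sketch, `vo.may_access("alice", "cluster-a", {"hour": 22})` is granted, while the same request at midday, or from a user who is not named in the rule, is refused.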
Together with virtual organisations, security arises as a very important issue. Although VO members have agreed to cooperate, they do not want to lose their copyright on the resources or intellectual work they invest. In theory, individual researchers should be able to combine their work items into one simulation or model without informing all the members of the VO about the detailed implementation. Sensitive data or programs need protection against unwelcome intruders. The Grid's VOs can be adjusted to different policies and can integrate parts of other virtual organisations. Security, access control and policies must be entirely programmable by the implementing group. Finally, virtual organisations do not follow a traditional client-server approach, in which one centralised server provides services and power to multiple dumb clients. For VOs to truly work, each node should be both client and server to other nodes, with flexible delegation models.
The resource sharing empowered by Grid technologies should happen unnoticed in the background. Users should not be aware of it, and the middleware enabling it should be transparent. There is often a lot of confusion about what exactly middleware is. It is a generic term in computer science, often applied to services or applications that fill a gap between two different processes. This can be the technology that enables an operating system to communicate with applications, or the way in which processes address databases. The job of the middleware is often to hide implementation details: an application written in Java, for example, can transparently access either an Oracle or an IBM database. In the context of Grid technologies, middleware often refers to the virtual overlay over different resources that gives the end user the impression that she is using just one machine, although the resources are really delivered by many different sources.
Quality of service (QoS) is a term often heard in computer science. It usually relates to service level agreements about performance and delivery. Common examples are guarantees of a minimum bandwidth or a maximum delay in a network. The Internet, for example, currently offers only a low quality of service called best-effort delivery: best effort is made to deliver, but no guarantees are given. One of the major motivations for future networks like the Grid is to improve on the quality of service of the Internet. In a Grid environment it is possible, for example, to schedule the execution of a piece of code on a remote server. Generally, this requires a level of security and quality of service that the Internet architecture cannot straightforwardly deliver.
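How middleware hides implementation details can be illustrated with a small Python sketch. It assumes two stand-in storage backends that keep their data in different ways behind one common interface; the names `Database`, `VendorABackend`, `VendorBBackend` and `record_result` are hypothetical and do not correspond to any real database driver.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class Database(ABC):
    """The only interface the application sees (hypothetical)."""

    @abstractmethod
    def put(self, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[str]: ...


class VendorABackend(Database):
    """Stand-in for one vendor's database: keeps rows in a dictionary."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._rows[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._rows.get(key)


class VendorBBackend(Database):
    """Stand-in for a second vendor: keeps an append-only list of pairs."""

    def __init__(self) -> None:
        self._log: List[tuple] = []

    def put(self, key: str, value: str) -> None:
        self._log.append((key, value))

    def get(self, key: str) -> Optional[str]:
        # The most recent entry for a key wins.
        for stored_key, stored_value in reversed(self._log):
            if stored_key == key:
                return stored_value
        return None


def record_result(db: Database, experiment: str, result: str) -> Optional[str]:
    """Application code, written once against the interface only."""
    db.put(experiment, result)
    return db.get(experiment)
```

The application function `record_result` behaves identically with either backend, which is the sense in which the layer in between is transparent to it.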
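The difference between a guaranteed service level and best-effort delivery can be made concrete with a short Python sketch. The field names, the units and the `meets` helper are assumptions chosen for illustration; they do not come from any networking standard or Grid toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    """A simplified service level agreement (hypothetical fields)."""
    min_bandwidth_mbps: float  # guaranteed minimum bandwidth
    max_delay_ms: float        # guaranteed maximum delay


def meets(level: ServiceLevel, bandwidth_mbps: float, delay_ms: float) -> bool:
    """Check one measurement against the agreed service level."""
    return (
        bandwidth_mbps >= level.min_bandwidth_mbps
        and delay_ms <= level.max_delay_ms
    )


# Best effort promises nothing, so every measurement "meets" it.
BEST_EFFORT = ServiceLevel(min_bandwidth_mbps=0.0, max_delay_ms=float("inf"))

# An assumed guaranteed level: at least 10 Mbit/s and at most 50 ms delay.
GUARANTEED = ServiceLevel(min_bandwidth_mbps=10.0, max_delay_ms=50.0)
```

A slow, delayed transfer satisfies `BEST_EFFORT` trivially but violates `GUARANTEED`, which is why a best-effort network gives a scheduler nothing it can rely on.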
Implementations: Globus Toolkit
The Globus Toolkit provides essential Grid services such as resource allocation and data transfer. The Toolkit is a US-based multi-institutional research project, which has become a de facto standard for many parts of Grid technology. Since the early days of networked computers, researchers have thought about how best to integrate different computer systems and architectures. The Internet seems to have no problems bridging different computer hardware and software. The answer here is layering. It decomposes large tasks into manageable layers of software that work together over well-defined interfaces. Via special layers, a UNIX server and a Windows server can communicate over networks as long as their transport layers use the same protocol for the exchange of data. Protocols specify the interfaces between peers. Layering also follows a modular design: if the transport layer needs to be exchanged, all the other layers remain unaffected as long as the changes have no impact on the interfaces between these layers.
Layering is today's standard approach in network computing. Figure 2 shows the Grid layers, which isolate different components from each other. The hardware layer is called the Grid Fabric and consists of the actual resources, such as computers and databases. These are coordinated by the second layer, the Grid APIs implemented by the Globus Toolkit. On top of these two layers are the application layers, which realise VOs, for example. The layered approach offers the opportunity to adjust the application to personal needs while still relying on the standard Grid technologies of the Globus Toolkit. Globus Toolkit version 3.x presents the Open Grid Service Architecture (OGSA) services, the current state of the art in Grid technologies (Unger and Haynos 2004). OGSA adds Web services to the Globus tools. With OGSA, the Grid is seen as a collection of services that meet the needs of virtual organisations (Foster, Kesselman et al. 2002). This is more difficult than it might appear at first sight. Web services are normally stateless, meaning that they keep no information between invocations, so their execution cannot be managed or monitored over time. OGSA-based services, however, need to have a state in order to manage and monitor the execution. This means that the way the remote execution proceeds must be transparent to the local calling service. The invocation has to be secure, and the lifetime of the service must be controlled (Foster, Kesselman et al. 2002). OGSA attempts to deliver seamless quality of service by using open, published interfaces and established industry standards like Web services (Unger and Haynos 2004).
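The modularity argument can be sketched in Python: an upper layer talks to the transport only through an agreed interface, so the transport can be exchanged without touching the layer above. The classes below are illustrative assumptions, not real network code; nothing is actually sent over a network.

```python
from typing import Protocol


class Transport(Protocol):
    """Interface between the layers: take bytes, return the bytes delivered."""

    def deliver(self, payload: bytes) -> bytes: ...


class LoopbackTransport:
    """Simplest possible transport: delivers the payload unchanged."""

    def deliver(self, payload: bytes) -> bytes:
        return payload


class FramedTransport:
    """Alternative transport: adds a length header and strips it on arrival."""

    def deliver(self, payload: bytes) -> bytes:
        frame = len(payload).to_bytes(4, "big") + payload  # "on the wire"
        length = int.from_bytes(frame[:4], "big")
        return frame[4:4 + length]


class MessageLayer:
    """Upper layer: converts text to bytes and back; knows only the interface."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def send(self, text: str) -> str:
        received = self._transport.deliver(text.encode("utf-8"))
        return received.decode("utf-8")
```

`MessageLayer` works unchanged with either transport, because both honour the same `deliver` interface; only a change to that interface would force the upper layer to change as well.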
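The contrast between a stateless call and a stateful service with a controlled lifetime can be sketched as follows. This is a toy model under stated assumptions, not the OGSA or Globus Toolkit interface: the class `ServiceInstance`, its status values and the injectable `clock` parameter are invented for the example.

```python
import time
from typing import Callable, Optional


def stateless_square(x: int) -> int:
    """Stateless: nothing is remembered, so there is nothing to monitor."""
    return x * x


class ServiceInstance:
    """Stateful service with an observable status and a limited lifetime."""

    def __init__(
        self,
        lifetime_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._clock = clock
        self._expires_at = clock() + lifetime_seconds
        self.status = "created"
        self.result: Optional[int] = None

    def is_alive(self) -> bool:
        return self._clock() < self._expires_at

    def extend_lifetime(self, extra_seconds: float) -> None:
        # The caller controls how long the instance may exist.
        self._expires_at += extra_seconds

    def run(self, x: int) -> int:
        if not self.is_alive():
            self.status = "expired"
            raise RuntimeError("service instance lifetime has ended")
        self.status = "running"
        self.result = x * x
        self.status = "done"
        return self.result
```

A caller can inspect `status` and `result` after the call and can extend or let lapse the lifetime of the instance, none of which is possible with the plain function.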
Bibliography
- De Roure, D., M. A. Baker, et al. (2003). The Evolution of the Grid. Grid Computing: Making the Global Infrastructure a Reality. Wiley: 65-100.
- Foster, I. and C. Kesselman (1998). The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann Publishers, San Francisco, CA.
- Foster, I. and C. Kesselman (2004). The Grid 2: Blueprint for a New Computing Infrastructure, Morgan-Kaufmann.
- Foster, I., C. Kesselman, et al. (2002). "An Open Grid Service Architecture for Distributed Systems Integration." Retrieved 27 January, 2005, from http://www.globus.org/research/papers/ogsa.pdf.
- Foster, I., C. Kesselman, et al. (2001). "The anatomy of the grid: enabling scalable virtual organizations." International Journal of Supercomputer Applications and High Performance Computing.
- Taylor, I. J. (2005). From P2P to Web Services and Grids. Peers in a Client/Server World. London, Springer.
- Unger, J. and M. Haynos (2004). "A visual tour of Open Grid Services Architecture." IBM developerWorks. Retrieved 21 January, 2005, from http://www.ibm.com/developerworks.
This briefing paper was written for AHeSSC, the Arts and Humanities e-Science Support Centre. It is published here with permission from AHeSSC.
