Grid computing overview

1.6 Grid computing projects

1.6.5 Grid portals

One area of grid application currently receiving attention is the development of gateways and grid portals. A grid portal is a web-based single point of entry to a grid and its implemented services. With the widespread development of the Internet, scientists expect to expose their data and applications through portals. Grid portals provide user-friendly web interfaces that help grid application users perform operations on the grid and access grid resources specific to a particular domain of interest.

Currently, various technologies and toolkits can be used for grid portal development. According to [Yang et al., 2006], grid portals can be classified into nonportlet-based and portlet-based.

A nonportlet-based portal is a grid portal designed around a typical three-layer architecture. The first layer is the user layer, which provides a user-friendly interface. The user layer is responsible for displaying the portal content; it can be a web browser or another desktop tool. The second layer is the grid service layer, which includes an authentication service, a job management service, an information service, a file service, and a security service. The authentication service allows the portal to authenticate users. Once authenticated, users can use the other services to access resources of the system (e.g., the job management service for submitting jobs on a remote machine, and the information service for monitoring submitted jobs and viewing results). The second layer receives HTTP requests from the first layer and interacts with the third layer to perform grid operations on the relevant grid resources and retrieve the execution results. The third layer is the backend resource layer, which consists of computation, data, and application resources.
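The three-layer separation described above can be illustrated with a minimal sketch. All class and method names here (ResourceLayer, GridServiceLayer, submit_job, and so on) are hypothetical and not taken from any portal toolkit; plain function calls stand in for HTTP requests, and the backend only records a placeholder result instead of running a job.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ResourceLayer:
    """Third layer: a stand-in for backend computation resources."""
    results: Dict[str, str] = field(default_factory=dict)

    def run(self, job_id: str, command: str) -> None:
        # A real backend would run the job on a remote machine;
        # this stub only records a placeholder result.
        self.results[job_id] = f"output of {command}"


@dataclass
class GridServiceLayer:
    """Second layer: authentication, job management, information services."""
    backend: ResourceLayer
    credentials: Dict[str, str]  # user -> password, illustrative only
    authenticated: Set[str] = field(default_factory=set)
    job_counter: int = 0

    def authenticate(self, user: str, password: str) -> bool:
        if self.credentials.get(user) == password:
            self.authenticated.add(user)
            return True
        return False

    def _require_login(self, user: str) -> None:
        if user not in self.authenticated:
            raise PermissionError(f"{user} is not authenticated")

    def submit_job(self, user: str, command: str) -> str:
        self._require_login(user)
        self.job_counter += 1
        job_id = f"job-{self.job_counter}"
        self.backend.run(job_id, command)  # delegate to the third layer
        return job_id

    def job_result(self, user: str, job_id: str) -> str:
        self._require_login(user)
        return self.backend.results[job_id]


# First layer: a browser would send HTTP requests; plain calls stand in here.
services = GridServiceLayer(ResourceLayer(), credentials={"alice": "secret"})
services.authenticate("alice", "secret")
job = services.submit_job("alice", "simulate --steps 10")
print(services.job_result("alice", job))
```

The point of the sketch is only the direction of the calls: the user layer never touches the resource layer directly, and every service other than authentication refuses unauthenticated users.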

A portlet-based portal includes a collection of portlets. A portlet is a web component that generates fragments – pieces of markup (e.g., HTML, XML) adhering to certain specifications (e.g., JSR-168 [Sun, 2009a], WSRP [OASIS, 2009]). Portlets improve the modularity and flexibility of grid portal development because they are pluggable and can be aggregated to form a complete web page depending on user needs.
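The aggregation idea can be sketched in a few lines. This is not the JSR-168 or WSRP API: the portlets are ordinary Python functions, and the names PORTLETS, render_page, job_status_portlet, and file_browser_portlet are invented for illustration. Each portlet returns a markup fragment, and the container assembles the fragments a user has selected into one page.

```python
from html import escape
from typing import Callable, Dict, List

# A "portlet" here is just a function from a user name to an HTML fragment.
Portlet = Callable[[str], str]


def job_status_portlet(user: str) -> str:
    return f"<section id='jobs'>Jobs for {escape(user)}</section>"


def file_browser_portlet(user: str) -> str:
    return f"<section id='files'>Files of {escape(user)}</section>"


# Hypothetical registry of pluggable portlets known to the container.
PORTLETS: Dict[str, Portlet] = {
    "jobs": job_status_portlet,
    "files": file_browser_portlet,
}


def render_page(user: str, layout: List[str]) -> str:
    """Aggregate the fragments named in the user's layout into one page."""
    fragments = [PORTLETS[name](user) for name in layout]
    return "<html><body>" + "".join(fragments) + "</body></html>"


page = render_page("alice", ["files", "jobs"])
print(page)
```

Adding a portlet means registering one more function; the page layout is per-user data, which is what makes the components pluggable.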

1.6.5.1 P-GRADE portal

The P-GRADE grid portal [Kacsuk et al., 2006] is the first grid portal to address the interoperability problem at the workflow level, and it does so with great success. It is a workflow-oriented grid portal whose main goal is to support all stages of grid workflow development and execution.

The P-GRADE portal provides the following functions (see Figure 1.4):

by communicating with the portal server, users can define grid environments, manage grid certificates, control the execution of workflow applications, and visualize the progress of workflows;

the workflow editor supports the creation and modification of workflow applications [Kertesz et al., 2006].

FIGURE 1.4: P-GRADE portal system functions. (Diagram components: user, workflow editor, certificate servers, portal server, remote grid resources.)

20 Fundamentals of Grid Computing

During workflow editing, the user can select a grid resource for each job or let a broker choose one. Currently the portal uses two brokers: the LCG-2 broker and GTbroker. GTbroker interacts with Globus resources to perform job submission; it collects static and dynamic information about grid resources to make scheduling decisions. The LCG-2 brokering solution is used to reach LCG-2-based grids. The mission of the LHC Computing Grid project (LCG) is to build and maintain a data storage and analysis infrastructure for the entire high energy physics community that will use the LHC. The Large Hadron Collider (LHC), built at CERN near Geneva, is the largest scientific instrument on the planet and began operations in 2008. By exploiting the brokering functions of GTbroker and the LCG-2 broker, users can develop and execute multi-grid workflows in a convenient environment.
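The choice between a user-selected resource and a broker-selected one can be sketched as follows. This is an assumption-laden illustration, not the GTbroker or LCG-2 broker algorithm: the Resource fields, the function names, and the ranking rule (fewest queued jobs per CPU among resources with enough CPUs) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    cpus: int         # static information
    queued_jobs: int  # dynamic information


def broker_choose(resources: List[Resource], min_cpus: int) -> Resource:
    """Pick a resource automatically (assumed rule: lowest queue load per CPU)."""
    candidates = [r for r in resources if r.cpus >= min_cpus]
    if not candidates:
        raise LookupError("no resource satisfies the job requirements")
    return min(candidates, key=lambda r: r.queued_jobs / r.cpus)


def select_resource(resources: List[Resource], min_cpus: int,
                    user_choice: Optional[str] = None) -> Resource:
    """Honor an explicit user choice; otherwise delegate to the broker."""
    if user_choice is not None:
        for r in resources:
            if r.name == user_choice:
                return r
        raise KeyError(user_choice)
    return broker_choose(resources, min_cpus)


grid = [Resource("site-a", 4, 8), Resource("site-b", 8, 4), Resource("site-c", 2, 0)]
print(select_resource(grid, min_cpus=4).name)                        # broker decides
print(select_resource(grid, min_cpus=4, user_choice="site-a").name)  # user decides
```

In a multi-grid workflow each job would go through this selection independently, so different jobs of the same workflow can land on different grids.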

The integration of P-GRADE with GEMLCA shows the use of a portal in a grid environment [Kacsuk et al., 2006]. Grid execution management for legacy code applications (GEMLCA) represents a general architecture for deploying legacy applications as grid services without re-engineering the code or even requiring access to the source files. GEMLCA adds an additional layer to wrap the legacy application on top of a service-oriented grid middleware, like Globus Toolkit version 4 (GT4). GEMLCA communicates with the client through SOAP-XML messages, gets input parameter values, submits the legacy executable to a local job manager like Condor or portable batch system (PBS), and returns the results to the client in SOAP-XML format.
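The wrapping idea (receive parameter values, run the unmodified executable, return its output) can be sketched as below. This is not GEMLCA code. As stated assumptions: a direct `subprocess` call stands in for submission to a job manager such as Condor or PBS, a Python dictionary stands in for the SOAP-XML reply, and the "legacy executable" is a one-line Python program used only so the example runs anywhere. The names run_legacy_service and LEGACY_EXECUTABLE are hypothetical.

```python
import subprocess
import sys
from typing import Dict, List

# Stand-in for a legacy binary: adds its integer command-line arguments.
LEGACY_EXECUTABLE: List[str] = [
    sys.executable, "-c",
    "import sys; print(sum(int(a) for a in sys.argv[1:]))",
]


def run_legacy_service(executable: List[str],
                       params: Dict[str, str]) -> Dict[str, object]:
    """Run the executable unchanged with the given parameter values.

    The executable is treated as a black box: no source access is needed,
    only knowledge of its command-line interface.
    """
    args = [str(value) for value in params.values()]
    completed = subprocess.run(
        executable + args, capture_output=True, text=True, check=False
    )
    # A dict stands in for the SOAP-XML response returned to the client.
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


reply = run_legacy_service(LEGACY_EXECUTABLE, {"a": "2", "b": "3"})
print(reply["stdout"].strip())
```

The wrapper never inspects or modifies the program it launches, which mirrors the property that legacy code is exposed without re-engineering.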

GEMLCA provides the capability to convert legacy codes into grid services. However, an end user without specialist computing skills still requires a user-friendly web interface (portal) to access the GEMLCA functionalities. To solve this problem, GEMLCA is integrated with the P-GRADE grid portal. Following this integration, legacy code services can be included in end-user workflows, running on different GEMLCA grid resources. The workflow manager of the portal contacts the selected GEMLCA resource and passes the actual parameter values of the legacy code to it. The GEMLCA resource then executes the legacy code with these parameter values and delivers the results back to the portal.

1.6.5.2 GridSphere

GridSphere [GridSphere, 2009] is a typical portlet-based portal. The GridSphere portal framework was developed as a key part of the European project GridLab [GridLab, 2009]. It provides an open-source portlet-based web portal and enables developers to quickly develop and package third-party portlet web applications that can be run and administered within the GridSphere portlet container. Two key features of the GridSphere framework are: (i) allowing administrators and individual users to dynamically configure the content based on their requirements, and (ii) supporting grid-specific portlets and APIs for grid-enabled portal development. However, the main disadvantage of the current version of GridSphere (i.e., GridSphere 2.1) is that it does not support the WSRP specification.

1.6.5.3 Other portal systems

The Pegasus [Singh et al., 2005] portal provides an HTTP(S)-based interface that can be accessed using a standard web browser. The portal architecture is composed of three layers. The top layer consists of the user machines and web browsers. The second layer consists of the web application server hosting the portal. The server is multithreaded and can handle multiple user requests at the same time. The third layer consists of the grid components and services used by the portal.

To use the Pegasus grid portal, the user needs to have a valid grid credential in a MyProxy server. The portal does not provide access to a predetermined set of resources. Instead, the user can specify the resources to be used. From the web browser, the user specifies the parameters of the application, and Pegasus maps the tasks in the workflow to the resources specified in the resource configuration. The submitted workflow may take a long time to complete, so the user may log out of the portal and log in later to check its status. The portal also allows users to view the status of the workflow (submitted, active, done, failed), the number of tasks already completed, the tasks currently executing, and other information.
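The two ideas in this paragraph, mapping workflow tasks onto user-specified resources and reporting workflow status, can be sketched as follows. This does not reproduce the Pegasus planner: round-robin assignment is an assumption chosen for brevity, and the function names map_tasks and summarize are hypothetical. Only the four status values (submitted, active, done, failed) come from the text.

```python
from collections import Counter
from itertools import cycle
from typing import Dict, List


def map_tasks(tasks: List[str], resources: List[str]) -> Dict[str, str]:
    """Assign each task to one of the user-specified resources (round-robin)."""
    if not resources:
        raise ValueError("the user must specify at least one resource")
    return dict(zip(tasks, cycle(resources)))


def summarize(statuses: Dict[str, str]) -> Dict[str, object]:
    """Derive the workflow-level view a portal could show the user."""
    counts = Counter(statuses.values())
    if counts["failed"]:
        overall = "failed"
    elif counts["done"] == len(statuses):
        overall = "done"
    elif counts["active"] or counts["done"]:
        overall = "active"
    else:
        overall = "submitted"
    return {
        "workflow": overall,
        "completed": counts["done"],
        "executing": sorted(t for t, s in statuses.items() if s == "active"),
    }


mapping = map_tasks(["t1", "t2", "t3"], ["siteA", "siteB"])
report = summarize({"t1": "done", "t2": "active", "t3": "submitted"})
print(mapping)
print(report)
```

Because the summary is computed from stored per-task states, a user who logs out and returns later sees the same view without the portal keeping a session open.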

The Pegasus grid portal is particularly useful when a virtual organization (VO) wants to provide an easy-to-use application submission interface to its members. It maps abstract workflows onto physical resources; thus users are shielded from the complexity of installing and using the various components needed to access the grid resources.

GridFlow [Cao et al., 2003] is a grid workflow management system developed at the University of Warwick. Rather than focusing on workflow specification and the communication protocol, GridFlow is more concerned with service-level scheduling and workflow management. The GridFlow portal performs functions at two levels: global grid workflow management and local grid sub-workflow scheduling. The execution and monitoring functionalities are provided at the global grid level and work on top of an existing agent-based grid resource management system. At each local grid, sub-workflow scheduling and conflict management are processed on top of an existing performance-prediction-based task scheduling system. A fuzzy timing technique is applied to address the new challenges of workflow management in a cross-domain and highly dynamic grid environment.
