The EU DataGrid (EDG) project, the major European grid development effort, will complete its activity at the beginning of 2004. The results of its three years of intense activity show in a testbed that comprises 12 sites in 6 countries and provides significant computing and storage resources to a community of approximately 500 users from 13 different virtual organizations. The latest release of the DataGrid software has been successfully validated on a large set of applications, ranging from High Energy Physics to Bio-Informatics and Earth Observation. This software is now the basis of the first production infrastructure of the CERN Large Hadron Collider Grid Project, the facility that is being set up for the analysis of data that will be produced by the new CERN accelerator (LHC). Although a considerable amount of work remains to be done, EDG has, with its achievements, proved the validity of the Grid concept and paved the way for the next-generation Grid production infrastructure for a much wider multi-science international community.
EGEE (Enabling Grids for E-Science in Europe) aims to integrate current national, regional and thematic Grid efforts, in order to create a seamless European Grid infrastructure for the support of the European Research Area. This infrastructure will be built on the EU Research Network GEANT and exploit Grid expertise that has been generated by projects such as the EU DataGrid project, other EU supported Grid projects and the national Grid initiatives such as UK e-Science, INFN Grid, Nordugrid and the US Trillium (cluster of projects).
The EGEE vision is that this Grid infrastructure will provide European researchers in academia and industry with a common market of computing resources, enabling round-the-clock access to major computing resources, independent of geographic location. This infrastructure will support distributed research communities, including relevant Networks of Excellence, which share common Grid computing needs and are prepared to integrate their own distributed computing infrastructures and agree common access policies. The resulting infrastructure will surpass the capabilities of local clusters and individual supercomputing centres in many respects, providing a unique tool for collaborative compute-intensive science (``e-Science'') in the European Research Area. Finally, the infrastructure will provide interoperability with other Grids around the globe, including the US NSF Cyberinfrastructure, contributing to efforts to establish a worldwide Grid infrastructure. The scope of the project is illustrated in Figure 1.
EGEE has been proposed by experts in Grid technologies representing the leading Grid activities in Europe. The process of developing this project has led to a structuring of the European Grid community into ten partner regions or ``federations'' (Figure 2). A significant structuring effect due to EGEE is already apparent, as several of these partners have begun integrating regional Grid efforts in order to provide coordinated resources to the EGEE project. In addition, US representatives are participating in the project as partners without EU funding, and are considering establishing a US EGEE federation. Participation of Japan and the Asia-Pacific region is considered desirable and will be pursued.
EGEE is a two-year project conceived as part of a four-year programme. Major implementation milestones after two years will provide the basis for assessing subsequent objectives and funding needs. Given the service-oriented nature of this project, two pilot application areas have been selected to guide the implementation and certify the performance and functionality of the evolving European Grid infrastructure. One is the Large Hadron Collider Computing Grid, which relies on a Grid infrastructure in order to store and analyse petabytes of real and simulated data from high-energy physics experiments at CERN. The other is Biomedical Grids, where several communities are facing equally daunting challenges to cope with the flood of bioinformatics and healthcare data.
Given the rapidly growing scientific needs for a Grid infrastructure, it is deemed essential for the EGEE project to ``hit the ground running'', by deploying basic services, and initiating joint research and networking activities before the formal start of the project. The LCG project will provide basic resources and infrastructure already during 2003, and Biomedical Grid applications will be planned at this stage. The available resources and user groups will then rapidly expand during the course of the project. To ensure that the project ramps up rapidly, project partners have agreed to begin providing their unfunded contribution prior to the official start of the project.
In order to achieve the vision outlined above, EGEE has a three-fold mission:
A potential user community will typically come into contact with EGEE through one of the many outreach events supported by the Dissemination and Outreach activity, and will be able to express its specific user requirements via the Application Identification and Support activity. After negotiating access terms, which will depend, amongst other things, on the resources the community can contribute to the Grid infrastructure, users in the community will receive training from the User Training and Induction activity. From the user perspective, the success of the EGEE infrastructure will be measured by the scientific output generated by the user communities it supports.
EGEE builds on the integration of existing infrastructures in the participating countries, in the form of national Grid initiatives, computer centres supporting one specific application area, or general computer centres supporting all fields of science in a region. The motivation for providing resources to the EGEE infrastructure depends on the mission and funding situation of each of the resource partners. A new resource provider will typically approach EGEE through contact with the Regional Operations Centres. Specific policy and contractual issues for a given resource provider will be dealt with by dedicated staff in the Operations Management Centre, based on general guidelines defined and regularly reviewed by the Project Executive Board, with advice from the Project Management Board.
The EGEE vision also has inspiring long-term implications for the IT industry. By pioneering the sort of comprehensive production Grid services which are envisioned by experts -- but which at present are beyond the scope of national Grid initiatives -- EGEE will have to develop solutions to issues such as scalability and security that go substantially beyond current Grid R&D projects. This process will lead to the spin off of innovative IT technologies, which will have benefits for industry, commerce and society going well beyond scientific computing. Major initiatives launched by several IT industry leaders in the area of Grids and Utility computing emphasize the economic potential of this emerging field.
Industry will typically come into contact with EGEE via the Industry Forum organised by the Application Identification and Support activity, as well as through more general dissemination events run by the Dissemination and Outreach activity. Interested companies will be able to consult the Project Director and regional representatives on the EGEE Project Management Board about potential participation in the project. As the scope of Grid services expands during the second two years of the programme, it is envisaged that established core services will be taken over by industrial providers with proven service capacity. These services would be provided on commercial terms, with providers selected by competitive tender.
It is essential to the success of EGEE that the three areas of activity should form a tightly integrated ``Virtuous Cycle'', illustrated in Figure 3. In this way, the project as a whole can ensure rapid yet well-managed growth of the computing resources available to the Grid infrastructure as well as the number of scientific communities that use it. As a rule, new communities will contribute new resources to the Grid infrastructure. This feedback loop is supplemented by an underlying cyclical review process covering overall strategy, middleware architecture, quality assurance and security status, and ensuring a careful filtering of requirements, a coordinated prioritization of efforts and maintenance of production-quality standards.
The EGEE project successfully concluded the negotiation of the FP6 Research Infrastructure contract with the EU at the end of October and expects to start operations in early spring 2004. More than 160 researchers are gathering to participate in the various activities of the project, and most of the seventy EGEE partners have opened several new job positions. If, as expected, the first phase of the project delivers by 2006 a production-quality Grid infrastructure for the European Research Area and the international scientific community, a second phase will be proposed to extend both the geographical coverage of this infrastructure and the number of international end-user scientific communities it supports.
EGEE is proposed as a project funded by the European Union under contract IST-2003-508833.