[linux-ileri] 100136 WHAT IS THE GRID? A THREE POINT CHECKLIST 07.22.02 (fwd)

---------


From: Mustafa Akgul (akgul@Bilkent.EDU.TR)
Date: Mon 29 Jul 2002 - 15:51:02 EEST


Date: Mon, 29 Jul 2002 03:52:29 -0700 (PDT)
From: Grid Today <grid@gridtoday.com>
Message-Id: <200207291052.DAA57343@gridtoday.com>
To: akgul@Bilkent.EDU.TR
Subject: 100136 WHAT IS THE GRID? A THREE POINT CHECKLIST 07.22.02
X-Virus-Scanned: by AMaViS snapshot-20010714

WHAT IS THE GRID? A THREE POINT CHECKLIST 07.22.02
By Ian Foster, Argonne National Lab & University of Chicago, GRIDtoday
==============================================================================

The recent explosion of commercial and scientific interest in the Grid makes
it timely to revisit the question: What is the Grid, anyway? I propose here a
three-point checklist for determining whether a system is a Grid. I also
discuss the critical role that standards must play in defining the Grid.

The Need for a Clear Definition

Grids have moved from the obscurely academic
to the highly popular. We read about Compute Grids, Data Grids, Science Grids,
Access Grids, Knowledge Grids, Bio Grids, Sensor Grids, Cluster Grids, Campus
Grids, Tera Grids, and Commodity Grids. The skeptic can be forgiven for
wondering if there is more to the Grid than, as one wag put it, a "funding
concept" and, as industry becomes involved, a marketing slogan. If by deploying
a scheduler on my local area network I create a "Cluster Grid," then doesn't
my Network File System deployment over that same network provide me with a
"Storage Grid?" Indeed, isn't my workstation, coupling as it does processor,
memory, disk, and network card, a "PC Grid?" Is there any computer system that
isn't a Grid?

Ultimately the Grid must be evaluated in terms of the applications, business
value, and scientific results that it delivers, not its architecture.
Nevertheless, the questions above must be answered if Grid computing is to
obtain the credibility and focus that it needs to grow and prosper. In this
and other respects, our situation is similar to that of the Internet in the
early 1990s. Back then, vendors were claiming that private networks such as
SNA and DECNET were part of the Internet, and others were claiming that every
local area network was a form of Internet. This confused situation was only
clarified when the Internet Protocol (IP) became widely adopted for both wide
area and local area networks.

Early Definitions

Back in 1998, Carl Kesselman and I attempted a definition in the book "The
Grid: Blueprint for a New Computing Infrastructure." We wrote: "A
computational grid is a hardware and software infrastructure that provides
dependable, consistent, pervasive, and inexpensive access to high-end
computational capabilities."

Of course, in writing these words we were not the first to talk about
on-demand access to computing, data, and services. For example, in 1969 Len
Kleinrock suggested presciently, if prematurely: "We will probably see the
spread of 'computer utilities', which, like present electric and telephone
utilities, will service individual homes and offices across the country."

In a subsequent article, "The Anatomy of the Grid," co-authored with Steve
Tuecke in 2000, we refined the definition to address social and policy issues,
stating that Grid computing is concerned with "coordinated resource sharing
and problem solving in dynamic, multi-institutional virtual organizations."
The key concept is the ability to negotiate resource-sharing arrangements
among a set of participating parties (providers and consumers) and then to use
the resulting resource pool for some purpose. We noted: "The sharing that we
are concerned with is not primarily file exchange but rather direct access to
computers, software, data, and other resources, as is required by a range of
collaborative problem-solving and resource-brokering strategies emerging in
industry, science, and engineering. This sharing is, necessarily, highly
controlled, with resource providers and consumers defining clearly and
carefully just what is shared, who is allowed to share, and the conditions
under which sharing occurs. A set of individuals and/or institutions defined
by such sharing rules form what we call a virtual organization."

We also spoke to the importance of standard protocols as a means of enabling
interoperability and common infrastructure.

A Grid Checklist

I suggest that
the essence of the definitions above can be captured in a simple checklist,
according to which a Grid is a system that:

-- coordinates resources that are not subject to centralized control - (A Grid
integrates and coordinates resources and users that live within different
control domains (for example, the user's desktop vs. central computing,
different administrative units of the same company, or different companies)
and addresses the issues of security, policy, payment, membership, and so
forth that arise in these settings. Otherwise, we are dealing with a local
management system.)

-- using standard, open, general-purpose protocols and interfaces - (A Grid is
built from multi-purpose protocols and interfaces that address such
fundamental issues as authentication, authorization, resource discovery, and
resource access. As I discuss further below, it is important that these
protocols and interfaces be standard and open. Otherwise, we are dealing with
an application-specific system.)

-- to deliver nontrivial qualities of service - (A Grid allows its constituent
resources to be used in a coordinated fashion to deliver various qualities of
service, relating for example to response time, throughput, availability, and
security, and/or co-allocation of multiple resource types to meet complex user
demands, so that the utility of the combined system is significantly greater
than that of the sum of its parts.)

Of course, the checklist still leaves room for reasonable debate, concerning
for example what is meant by "centralized control," "standard, open,
general-purpose protocols," and "qualities of service." I speak to these
issues below. But first let's try the checklist on a few candidate "Grids."

First, let's consider systems that, according to my checklist, do not qualify
as Grids. A cluster management system such as Sun's Sun Grid Engine,
Platform's Load Sharing Facility, or Veridian's Portable Batch System can,
when installed on a parallel computer or local area network, deliver quality
of service guarantees and thus constitute a powerful Grid resource. However,
such a system is not a Grid itself, due to its centralized control of the
hosts that it manages: it has complete knowledge of system state and user
requests, and complete control over individual components. At a different
scale, the Web is not (yet) a Grid: its open, general-purpose protocols
support access to distributed resources but not the coordinated use of those
resources to deliver interesting qualities of service.

On the other hand, deployments of multi-site schedulers such as Platform's
MultiCluster can reasonably be called (first-generation) Grids, as can
distributed computing systems provided by Condor, Entropia, and United
Devices, which harness idle desktops; peer-to-peer systems such as Gnutella,
which support file sharing among participating peers; and a federated
deployment of the Storage Resource Broker, which supports distributed access
to data resources. While arguably the protocols used in these systems are too
specialized to meet criterion #2 (and are not, for the most part, open or
standard), each does integrate distributed resources in the absence of
centralized control, and delivers interesting qualities of service, albeit in
narrow domains.
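
As a rough illustration, the checklist and the verdicts above can be written
as a small Python sketch. The System record, its field names, and the classify
function are hypothetical names introduced here, and the flag values in the
examples are one reading of the discussion above rather than measurements.
Treating a system that meets points 1 and 3 but not point 2 as a
"first-generation" Grid is likewise an interpretation of the preceding
paragraph, not a rule stated in the checklist itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """Hypothetical record of how a system fares on the three checklist points."""
    name: str
    decentralized_control: bool    # point 1: resources are not under centralized control
    standard_open_protocols: bool  # point 2: standard, open, general-purpose protocols
    nontrivial_qos: bool           # point 3: coordinated use delivers nontrivial QoS


def classify(system: System) -> str:
    """Apply the three-point checklist (assumed reading of the article)."""
    # Points 1 and 3 are treated as mandatory in every verdict above.
    if not (system.decentralized_control and system.nontrivial_qos):
        return "not a Grid"
    # Meeting all three points gives a Grid in the full sense.
    if system.standard_open_protocols:
        return "Grid"
    # Points 1 and 3 without point 2: the "(first-generation)" case.
    return "first-generation Grid"


# Flag values below are assumptions drawn from the surrounding discussion.
examples = [
    System("cluster management system", False, False, True),
    System("the Web", True, True, False),
    System("idle-desktop harvesting system", True, False, True),
    System("hypothetical system meeting all three points", True, True, True),
]

for s in examples:
    print(f"{s.name}: {classify(s)}")
```

The sketch deliberately reduces each point to a yes/no flag; as noted above,
what counts as "centralized control" or "qualities of service" remains open to
reasonable debate.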

The three criteria apply most clearly to the various large-scale Grid
deployments being undertaken within the scientific community, such as the
distributed data processing system being deployed internationally by "Data
Grid" projects (GriPhyN, PPDG, EU DataGrid, iVDGL, DataTAG), NASA's
Information Power Grid, the Distributed ASCI Supercomputer (DAS-2) system that
links clusters at five Dutch universities, the DOE Science Grid and DISCOM
Grid that link systems at DOE laboratories, and the TeraGrid being constructed
to link major U.S. academic sites. Each of these systems integrates resources
from multiple institutions, each with their own policies and mechanisms; uses
open, general-purpose (Globus Toolkit) protocols to negotiate and manage
sharing; and addresses multiple quality of service dimensions, including
security, reliability, and performance.

The Grid: The Need for InterGrid Protocols

My checklist speaks to what it
means to be "a Grid," yet the title of this article asks what is "the Grid."
This is an important distinction. The Grid vision requires protocols (and
interfaces and policies) that are not only open and general-purpose but also
standard. It is standards that allow us to establish resource-sharing
arrangements dynamically with any interested party and thus to create
something more than a plethora of balkanized, incompatible, non-interoperable
distributed systems. Standards are also important as a means of enabling
general-purpose services and tools.

In my view, the definition of standard "InterGrid" protocols is the single
most critical problem facing the Grid community today. Fortunately, we are
making good progress. On the standards side, we have the increasingly
effective Global Grid Forum. On the practical side, six years of experience
and refinement have produced a widely used de facto standard, the open source
Globus Toolkit. And now, within the Global Grid Forum we have major efforts
underway to define the Open Grid Services Architecture (OGSA), which
modernizes and extends Globus Toolkit protocols to address emerging new
requirements, while also embracing Web services. Companies such as IBM,
Microsoft, Platform, Sun, Avaki, Entropia, and United Devices have all
expressed strong support for OGSA. I hope that in the near future, we will be
able to state that for an entity to be part of the Grid it must implement OGSA
InterGrid protocols, just as to be part of the Internet an entity must speak
IP (among other things). Both open source and commercial products will
interoperate effectively in this heterogeneous, multi-vendor Grid world, thus
providing the pervasive infrastructure that will enable successful Grid
applications.

Thanks for reading this far. I expect to be writing further columns for Grid
Today, so please feel free to contact me if there are issues that you would
like to see raised in this forum.

Ian Foster can be contacted at foster@mcs.anl.gov .

***************************************************************************
Copyright 2002 GRIDtoday Redistribution of this article is forbidden by
law without the express written consent of the publisher. For a
subscription to GRIDtoday, send e-mail to gridfree@gridtoday.com

