21 February 2010

GRID - ‘UNLEASHING THE POWER’

 
GRID COMPUTING
Grid computing is applying the resources of many computers in a network to a single problem at the same time - usually to a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data.

What is grid computing?

"The Grid" takes its name from an analogy with the electrical "power grid". The idea was that accessing computer power from a computer grid would be as simple as accessing electrical power from an electrical grid".Based on these concept:
·         never worry about where the computer power you are using comes from
·        the Grid links together computing resources such as PCs, workstations, servers, storage elements, and provides the mechanism needed to access them.
·         pervasive: remote computing resources would be accessible from different platforms, including laptops, PDAs and mobile phones,  through web browser.
·         utility: you ask for computer power or storage capacity and you get it. You also pay for what you get.
 "The Grid" doesn't yet exist in this form; however, the world already has hundreds of smaller grids...
Grid computing is driven by five big areas:
  1. Resource sharing:
Global sharing is the very essence of grid computing.
 Resource sharing is the crux of grid philosophy - but grid computing is not about getting something for nothing.
Grid computing aims to involve everyone in the advantages of resource sharing and the benefits of increased efficiency.
  • Grids give you shared access to extra computing power
  • A grid can also give you direct access to remote software, computers and data
  • A grid can even give you access and control of remote sensors, telescopes and other devices that do not belong to you.
  2. Secure access:
Trust between resource providers and users is essential, especially when they don't know each other. Sharing resources conflicts with security policies in many individual computer centers, and on individual PCs, so getting grid security right is crucial.
Secure access to shared resources is one of the most challenging areas of grid development.
To ensure secure access, grid developers and users need to manage three important things:
  • Access policy - What is shared? Who is allowed to share? When can sharing occur?
  • Authentication - How do you identify a user or resource?
  • Authorization - How do you determine whether a certain operation is consistent with the rules?
Grids need to keep track of all this information efficiently, and it may change from day to day. This means that grids need to be extremely flexible and have a reliable accounting mechanism. Ultimately, such accounting will be used to decide pricing policies for using a grid.
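As a rough illustration, the three questions above (access policy, authentication, authorization) can be sketched in a few lines of Python. Every name and the dictionary-based "policy" here are hypothetical simplifications; real grids use mechanisms such as X.509 certificates and virtual organisation membership services rather than an in-memory table.

```python
# Hypothetical access policy: which groups may use which resources.
ACCESS_POLICY = {
    "cpu_cluster":  {"physics", "biology"},
    "tape_storage": {"physics"},
}

# Hypothetical user registry: user -> (credential, group).
USERS = {
    "alice": ("cert-alice", "physics"),
    "bob":   ("cert-bob", "chemistry"),
}

def authenticate(user, credential):
    """Authentication: is this really the user they claim to be?"""
    record = USERS.get(user)
    return record is not None and record[0] == credential

def authorize(user, resource):
    """Authorization: is this operation consistent with the policy?"""
    group = USERS[user][1]
    return group in ACCESS_POLICY.get(resource, set())

print(authenticate("alice", "cert-alice"))  # True
print(authorize("alice", "tape_storage"))   # True: physics may use tape
print(authorize("bob", "cpu_cluster"))      # False: chemistry is not in the policy
```

Note how the two checks are deliberately separate: knowing who a user is (authentication) says nothing yet about what they may do (authorization).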
  3. Resource use:
Efficient, balanced use of computing resources is essential.
Grids allow us to spread our work efficiently and automatically across many computing resources, so jobs are finished much faster.
Imagine if you had to do 1000 difficult maths questions. You could do them yourself, or you could use a computing grid. If you used a grid of 100 computers, you would give one question or "job" to each computer. When a computer finished one "job", it would automatically ask for another. In this way, your 1000 questions could be finished in a flash, with all 100 computers working to full efficiency.
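The 1000-questions story maps directly onto a worker-pool pattern. A minimal sketch in Python, where `solve` is a hypothetical stand-in for one difficult maths "job" and 8 local worker processes stand in for the story's 100 grid computers:

```python
from multiprocessing import Pool

def solve(question):
    """Hypothetical stand-in for one difficult maths 'job'."""
    return question ** 2

if __name__ == "__main__":
    questions = range(1000)
    # Each worker plays the role of one grid node: as soon as it
    # finishes a job, the pool automatically hands it the next one.
    with Pool(processes=8) as pool:
        answers = pool.map(solve, questions)
    print(len(answers))  # all 1000 jobs completed
```

The key point the story makes is visible here: no worker sits idle while jobs remain, so all machines run at full efficiency.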
MIDDLEWARE TO THE RESCUE
Computing grids rely on middleware - special grid computing software - to allocate jobs efficiently. Middleware uses information about the different "jobs" submitted to each queue to calculate the optimal allocation of resources.
To do this, we ideally need to know how many jobs are in each queue, and how long each job will take. This doesn't work perfectly yet, but then, neither did the Web in its early days .
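In spirit, the allocation problem middleware solves looks like the following sketch. The site names and job lengths are invented for illustration, and real schedulers weigh far more factors than queue length, but the greedy idea is the same: send each new job where it will start soonest.

```python
# Estimated seconds of work already queued at each site (hypothetical).
queues = {
    "site-A": 120,
    "site-B": 30,
    "site-C": 300,
}

def allocate(job_length, queues):
    """Greedy allocation: pick the site with the least pending work."""
    site = min(queues, key=queues.get)
    queues[site] += job_length   # the new job joins that site's queue
    return site

print(allocate(60, queues))   # 'site-B': shortest queue (30 s pending)
print(allocate(60, queues))   # 'site-B' again (now 90 s vs 120 s at site-A)
```

This is why the text says we "ideally need to know how many jobs are in each queue, and how long each job will take": without those estimates, the `min` above has nothing to go on.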

Computational problems:
·         Parallel calculations
·         Embarrassingly parallel calculations
·         Coarse-grained calculations
·         Fine-grained calculations
·         High-performance vs. high-throughput computing

  4. The death of distance:
Distance should make no difference: you should be able to access computer resources from wherever you are.
Computing grids use international networks to link computing resources from all over the world. This means you can sit in France and use computers in the U.S, or work from Australia using computers from Taiwan.
Such international grids are possible today because of the impressive development of networking technology.
Pushed by the Internet economy and the widespread penetration of optical fibers in telecommunications systems, the performance of wide area networks has been doubling every nine months or so over the last few years. To avoid communication bottlenecks, grid developers also have to determine ways to compensate for failures, like transmission errors or PC crashes.
To meet such critical requirements, several high-performance networking issues have to be solved, including the optimization of Transport Protocols and the development of technical solutions such as high-performance Ethernet switching.
  5. Open standards:
Interoperability between different grids is a big goal, driven forward by the adoption of open standards for grid development, which make it possible for everyone to contribute constructively. Standardization also encourages industry to invest in developing commercial grid services and infrastructure.
By standardizing the way we create computing grids, we're one step closer to making sure all the smaller grids can connect together to form larger, more powerful grid computing resources.
WHO IS IN CHARGE OF GRID STANDARDS?
The Open Grid Forum is a standards body for the grid community. With more than 5000 volunteer members, this body is a significant force for setting standards and driving community development.

BUILDING A GRID

 There are three things we need to set up a grid.

1.      THE ARCHITECTURE
2.      THE HARDWARE
3.      THE MIDDLEWARE
1. THE ARCHITECTURE
Grid architecture is the way in which a grid has been designed.
A grid's architecture is often described in terms of "layers", where each layer has a specific function. The higher layers are generally user-centric, whereas lower layers are more hardware-centric, focused on computers and networks.
  • The lowest layer is the network, which connects grid resources.
     
  • Above the network layer lies the resource layer: actual grid resources, such as computers, storage systems, electronic data catalogues, sensors and telescopes that are connected to the network.
     
  • The middleware layer provides the tools that enable the various elements (servers, storage, networks, etc.) to participate in a grid. The middleware layer is sometimes the "brains" behind a computing grid!
     
  • The highest layer of the structure is the application layer, which includes applications in science, engineering, business, finance and more, as well as portals and development toolkits to support the applications. This is the layer that grid users "see" and interact with. The application layer often includes the so-called serviceware, which performs general management functions like tracking who is providing grid resources and who is using them.

Just like civil engineers building a bridge, software engineers building a grid must specify an overall design before they start work. This design is called the grid architecture: it identifies a grid's fundamental components and specifies their purpose and function.

2. THE HARDWARE
Grids must be built "on top of" hardware, which forms the physical infrastructure of a grid - things like computers and networks. This infrastructure is often called the grid "fabric".
Networks are an essential piece of the grid "fabric". Networks link the different computers that form part of a grid, allowing them to be handled as one huge computer.
Networks are characterized by their size (local, national and international) and throughput (the amount of data transferred in a specific time). Throughput is measured in kbps (kilo bits per second; where kilo means a thousand), Mbps (M for mega; a million) or Gbps (G for giga; a billion).
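These units make back-of-the-envelope estimates easy. A small sketch, assuming an ideal link with no protocol overhead or latency (the file size and link speed are invented examples):

```python
def transfer_time_seconds(size_bytes, link_bps):
    """Ideal transfer time: bits to send divided by link speed."""
    return size_bytes * 8 / link_bps

# 100 GB of data over a 10 Gbps backbone link
# (decimal units, as is conventional in networking)
print(transfer_time_seconds(100 * 10**9, 10 * 10**9))  # 80.0 seconds
```

In practice, latency and protocol overhead make real transfers slower than this ideal figure, which is one reason transport-protocol optimization matters for grids.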
One of the big ideas of grid computing is to take advantage of ultra-fast networks. This idea allows us to access globally distributed resources in an integrated and data-intensive way. Ultra-fast networks also help to minimize latency: the delays that build up as data are transmitted over the Internet.
Grids are built "on top of" high-performance networks, such as the intra-European GEANT network, which has 10Gbps performance on the network "backbone". This backbone links the major "nodes" on the grid (like national computing centres).
Computing performance is measured in flops (floating-point operations per second). A floating-point operation is a basic calculation, like adding two numbers together. A Gigaflop is a billion such operations per second.

3. MIDDLEWARE
"Middleware" is the software that organizes and integrates the resources in a grid.  
Middleware is made up of many software programs, containing hundreds of thousands of lines of computer code. Together, this code automates all the "machine to machine" (M2M) interactions that create a single, seamless computational grid.
AGENTS, BROKERS AND STRIKING DEALS
Middleware automatically negotiates deals in which resources are exchanged, passing from a grid resource provider to a grid user. In these deals, some middleware programs act as "agents" and others as "brokers".
Agent programs present "metadata" (data about data) that describes users, data and resources. Broker programs undertake the M2M negotiations required for user authentication and authorization, and then strike the "deals" for access to, and payment for, specific data and resources.
Once a deal is set, the broker schedules the necessary computational activities and oversees the data transfers. At the same time, special "housekeeping" agents optimize network routings and monitor quality of service.
And all this occurs automatically, in a fraction of the time that it would take humans at their computers to do manually.
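A toy version of one such deal might look like the sketch below. The metadata, prices and matching rule are all invented for illustration; real middleware such as the Globus Toolkit implements far richer negotiation and scheduling protocols.

```python
# Agents publish metadata describing the resources on offer (hypothetical).
resource_metadata = [
    {"name": "cluster-A", "free_cpus": 16, "price_per_cpu": 2},
    {"name": "cluster-B", "free_cpus": 64, "price_per_cpu": 3},
]

def broker(cpus_needed):
    """Broker: strike a deal with the cheapest resource that can
    satisfy the request, or return None if no deal is possible."""
    candidates = [r for r in resource_metadata
                  if r["free_cpus"] >= cpus_needed]
    if not candidates:
        return None
    deal = min(candidates, key=lambda r: r["price_per_cpu"])
    return {"resource": deal["name"],
            "cost": cpus_needed * deal["price_per_cpu"]}

print(broker(8))     # {'resource': 'cluster-A', 'cost': 16}
print(broker(32))    # {'resource': 'cluster-B', 'cost': 96}
print(broker(1000))  # None: no provider has enough free CPUs
```

Even in this toy form, the division of labour is visible: agents only describe what exists, while the broker matches requests to offers and settles the "price".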
DELVING INSIDE MIDDLEWARE
There are many other layers within the middleware layer. For example, middleware includes a layer of "resource and connectivity protocols", and a higher layer of "collective services".
Resource and connectivity protocols handle all grid-specific network transactions between different computers and grid resources. For example, computers contributing to a particular grid must recognize grid-relevant messages and ignore the rest. This is done with communication protocols, which allow the resources to communicate with each other, enabling exchange of data, and authentication protocols, which provide secure mechanisms for verifying the identity of both users and resources.
The collective services are also based on protocols: information protocols, which obtain information about the structure and state of the resources on a grid, and management protocols, which negotiate uniform access to the resources. Collective services include:
  • updating directories of available resources
     
  • brokering resources (which like stockbroking, is about negotiating between those who want to "buy" resources and those who want to "sell")
     
  • monitoring and diagnosing problems
     
  • replicating data so that multiple copies are available at different locations for ease of use
     
  • providing membership/policy services for tracking who is allowed to do what and when. 
GRIDIFICATION

Gridification" means adapting applications to include new layers of grid-enabled software. For example, a gridified data analysis application will be able to:
  • obtain the necessary authentication credentials to open the files it needs
  • query a catalogue to determine where the files are and which grid resources are able to do the analysis
  • submit requests to the grid, asking to extract data, initiate computations, and provide results
  • monitor progress of the various computations and data transfers, notifying the user when analysis is complete, and detecting and responding to failures (collective services).
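The four steps above can be sketched as a workflow skeleton. Every function here is a hypothetical placeholder for a real grid service (credential servers, file catalogues, job submission and monitoring systems):

```python
def get_credentials():
    """Step 1: obtain authentication credentials (placeholder)."""
    return "proxy-credential"

def query_catalogue(dataset):
    """Step 2: where are the files, and which sites can run the analysis?"""
    return {"files": [f"{dataset}/part1", f"{dataset}/part2"],
            "sites": ["site-A", "site-B"]}

def submit_jobs(files, sites, credential):
    """Step 3: one job per file, spread over the available sites."""
    return [{"file": f, "site": sites[i % len(sites)], "status": "running"}
            for i, f in enumerate(files)]

def monitor(jobs):
    """Step 4: poll until done; here we simply mark everything complete."""
    for job in jobs:
        job["status"] = "done"
    return all(j["status"] == "done" for j in jobs)

cred = get_credentials()
info = query_catalogue("ozone-2010")
jobs = submit_jobs(info["files"], info["sites"], cred)
print(monitor(jobs))  # True once every job has finished
```

The point of gridification is that this orchestration layer is added around the existing analysis code, which itself need not change much.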

HISTORY

Grid computing grew from previous efforts and ideas, such as those listed below:
  • Immediate ancestor - "metacomputing" (1990)
Metacomputing was used to describe efforts to connect US supercomputing centers. Larry Smarr, a former director of the National Center for Supercomputing Applications in the US, is generally credited with popularizing the term.
 
  • Key grid technologies - FAFNER and I-WAY (1995)
     
Ø  FAFNER (Factoring via Network-Enabled Recursion) aimed to factorize very large numbers, a challenge very relevant to digital security. Since this challenge could be broken into small parts, even fairly modest computers could contribute useful power. Many FAFNER techniques for dividing and distributing computational problems were forerunners of the technology used for volunteer computing (http://www.gridcafe.org/volunteer-computing-.html) and other "cycle scavenging" software.
 
Ø  I-WAY (Information Wide Area Year) aimed to link supercomputers using existing networks. One of I-WAY's innovations was a computational resource broker, conceptually similar to those being developed for grid computing today. I-WAY strongly influenced the development of the Globus Project, which is at the core of many grid activities, as well as the LEGION project, an alternative approach to distributed supercomputing.
 
  • Birth
     Grid computing was born at a workshop called "Building a Computational Grid", held at Argonne National Laboratory in September 1997. Following this, in 1998, Ian Foster of Argonne National Laboratory and Carl Kesselman of the University of Southern California published "The Grid: Blueprint for a New Computing Infrastructure", often called "the Grid bible". Ian Foster had previously been involved in the I-WAY project, and the Foster-Kesselman duo had published a paper in 1997, called "Globus: a Metacomputing Infrastructure Toolkit", clearly linking the Globus Toolkit with its predecessor, metacomputing.

APPLICATION SIDE
There are hundreds of computer grids around the world. Many grids are used for e-science: enabling projects that would be impossible without massive computing power.
  • Biologists are using grids to simulate thousands of molecular drug candidates on their computer, aiming to find a molecule able to block specific disease proteins.
  • Earth scientists are using grids to track ozone levels using satellites, downloading hundreds of Gigabytes of data every day (the equivalent of about 150 CDs a day!).
  • High energy physicists are using grids in their search for a better understanding of the universe, relying on a grid of tens of thousands of desktops to store and analyze the 10 Petabytes of data (equivalent to the data on about 20 million CDs!) produced by the Large Hadron Collider each year. Thousands of physicists in dozens of universities around the world want to analyse this data.
  • Engineers are using grids to study alternative fuels, such as fusion energy.
  • Artists are using grids to create complex animations for feature films (check out Kung Fu Panda for example).
  • Social scientists are using grids to study the social life of bees, the makeup of our society, the secrets of history.
  • ...and more, more, more!!
   The Worldwide LHC Computing Grid (WLCG) combines the computing resources of more than 170 computing centers in 34 countries, aiming to harness the power of 100,000 CPUs to process, analyze and store data produced from the LHC, making it equally available to all partners, regardless of their physical location.
INTERNATIONAL GRIDS 

International grids cross national boundaries, spanning cultures, languages, technologies and more to create international resources and power global science using global computing.
D4Science        Distributed colLaboratories Infrastructure on Grid ENabled Technology 4 Science
EU-IndiaGrid   Collaboration between Europe and India
EU-IndiaGrid brings together over 500 multidisciplinary organisations to build a grid-enabled e-science community, aiming to boost R&D innovation across Europe and India.

LCG                Worldwide LHC Computing Grid
The mission of the LHC Computing Grid project (LCG) is to build and maintain a data storage and analysis infrastructure for the entire high-energy physics community that will use the Large Hadron Collider.

NATIONAL GRIDS
National grids like those listed below combine national computing resources to create powerful grid computing resources.

D-Grid            Germany        
The first D-Grid projects started in September 2005 with the goal of developing a distributed, integrated resource platform for high-performance computing and related services to enable the processing of large amounts of scientific data and information.

GARUDA      India's Grid Computing initiative connecting 17 cities across the country. The 45 participating institutes in this nation-wide project include all the IITs and C-DAC centers and other major institutes in India.

TeraGrid          U.S. supercomputing grid
TeraGrid aims to build and deploy the world's largest, fastest, most comprehensive, distributed infrastructure for open scientific research. It involves partners across the U.S.
 
FIELD SPECIFIC GRIDS

Field-specific grids like those below have been created to tackle specific scientific problems.
AstroGrid : The Global Virtual Observatory
BIRN : Human Disease
CaBIG : Cancer Research and Care


USES OF GRID COMPUTING

·         Exploiting underutilized resources
One of the basic uses of grid computing is to run an existing application on a different machine. The machine on which the application is normally run might be unusually busy due to a peak in activity. The job in question could be run on an idle machine elsewhere on the grid.
·         Parallel CPU capacity
The potential for massive parallel CPU capacity is one of the most attractive features of a grid.
·         Virtual resources and virtual organizations for collaboration
Another capability enabled by grid computing is to provide an environment for collaboration among a wider audience.
·         Access to additional resources
As already stated, in addition to CPU and storage resources, a grid can provide access to other resources as well. The additional resources can be provided in additional numbers and/or capacity.
·         Resource balancing
A grid federates a large number of resources contributed by individual machines into a large single-system image.
·         Reliability
High-end conventional computing systems use expensive hardware to increase reliability. They are built using chips with redundant circuits that vote on results, and contain logic to achieve graceful recovery from an assortment of hardware failures. The machines also use duplicate processors with hot pluggability, so that when one fails, it can be replaced without turning the other off. Power supplies and cooling systems are duplicated. A grid, by contrast, can achieve reliability through the redundancy of its many inexpensive resources: if one machine fails, its jobs can simply be resubmitted elsewhere on the grid.
·         Management
The goal to virtualize the resources on the grid and more uniformly handle heterogeneous systems will create new opportunities to better manage a larger, more distributed IT infrastructure.

Additional Benefits
  • Multiple copies of data can be kept in different sites, ensuring access for all scientists involved, independent of geographical location.
  • Allows optimum use of spare capacity across multiple computer centres, making the whole system more efficient.
  • Having computer centres in multiple time zones eases round-the-clock monitoring and the availability of expert support.
  • No single points of failure.
  • So-called “brain drain”, where researchers are forced to leave their country to access resources, is reduced when resources are available from their desktop.
  • The system can be easily reconfigured to face new challenges, making it able to dynamically evolve throughout the life of the LHC, growing in capacity to meet the rising demands as more data is collected each year.
  • Provides considerable flexibility in deciding how and where to provide future computing resources.
  • Allows community to take advantage of new technologies that may appear and that offer improved usability, cost effectiveness or energy efficiency.

GRID Vs. THE WEB

The best way to define the grid is to contrast it with the web. The analog of the web's search engines will be service registries based on the UDDI specification. Applications on the grid have the ability to automatically access information from multiple servers at diverse locations on the web, and to process this information using multiple application providers, also at diverse locations on the grid.