ichec newsletter



Issue 10, January 2011, ICHEC Newsletter


ISSUE 10 : JANUARY 2011 : WWW.ICHEC.IE

EU PRACE programme rolls out

ICHEC is very pleased to announce its involvement in seven EC FP7

grants/proposals. These projects cover areas such as: the establishment of a

pan-European HPC infrastructure (PRACE 1st and 2nd Implementation

Phases, 2010-13); the development of e-infrastructures targeted at nano-

medicine and the life sciences; partnering with leading industrial

organisations in three projects to develop software products for

heterogeneous (GPGPU) computing, and more specifically assessing their

suitability for a range of popular community codes of benefit to the research

community; and, finally, ICHEC is involved in a comprehensive study

assessing the cost–benefit of cloud computing as an alternative to

conventional HPC services. The total funding for these seven projects

amounts to €3.5m (€1.96m already secured; €1.52m under consideration).

The long-awaited Partnership for Advanced Computing in Europe (PRACE) is now

rolling out, with calls for access to Tier-0 Petaflop compute facilities. While this is

of great interest to a small number of Irish researchers who need these state-of-

the-art resources from the world’s fastest computers, greater numbers of Irish

researchers need access to more modest facilities. It is in this regard that ICHEC,

along with some of the smaller EU countries, has been campaigning within the

consortium for access to Tier-1 resources (50-250TFlops), generally available

through the pooling of agreed percentages of national facilities around the

member countries. This argument won particular favour with the EU Commission

and, importantly, has now been mainstreamed within the PRACE project. We

expect these important Tier-1 resources to come on stream later in 2011.

ICHEC staff will be pro-active in this space, working with Irish researchers who

need these facilities. A full account of the PRACE project is given on page 3 of

this edition of ICHEC News.

Newsflash – ICHEC wins FP7 funding

Research update

New recruits at ICHEC

PRACE update

Special feature: shared services

Curie Phase 1: petascale machine funded by Genci (Grand Equipement National de Calcul Intensif) and operated by CEA/DIF as a Tier-0 system for PRACE. (Copyright CEA.)

TGCC (‘Très grand centre de calcul du CEA’) infrastructure, designed to host Tier-0 supercomputers. (Copyright CEA.)

A packed edition

Welcome to issue 10 of ICHEC

News, the newsletter dedicated to

bringing researchers and

institutions up to date with the

latest high-performance computing

(HPC) news from Ireland. There is

much to report on in this issue,

including information on new

recruits to the ICHEC team. We also

take a look at the successful growth

of our technology transfer

programme, which is helping Irish

companies to find technology

solutions. Our special feature

highlights the ‘condominium’

shared services concept for HPC,

explaining why it is such a

compelling initiative in the current

economic climate. We also report

on the upcoming PRACE Spring

School in the Edinburgh Parallel

Computing Centre, and highlight

events that took place in 2010. We

hope you find ICHEC News to be a

valuable source of HPC news and

information.

Professor Jim Slevin, Director

PAGE 2 : ISSUE 10 : JANUARY 2011

Andy Regan joined ICHEC in June

2010 as a system programmer. His

responsibilities include administration

of ICHEC’s HPC and non-HPC

infrastructure. Previously, he worked

for letshost.ie as a system administrator

and end-user support provider. Andy

graduated from NUI Galway with a BSc

in Information Technology. He also

worked as a research intern at DERI,

Galway, in the Semantic Web Services

cluster.

Martin Peters joined ICHEC in

November 2010. Martin studied

computational chemistry at Trinity

College Dublin, and worked with Prof.

Kennie Merz at Pennsylvania State

University and the University of

Florida, receiving his PhD in 2007. He

returned to TCD to work with the

Molecular Design Group. At ICHEC,

Martin works closely with scientists in

academia, supporting researchers in

the computational chemistry field.

Filippo Spiga joined ICHEC in January

2011 as a computational scientist. He

studied for his computer science

degree at the University of Milano-

Bicocca and completed his MSc thesis

at EPCC. He collaborated with CINECA

and made a contribution to the PRACE

project, and also spent one year in

Milan at the National Institute of

Nuclear Physics. Filippo was part of an

R&D group at IBM’s T.J. Watson

Research Center.

Renato Miceli joined the ICHEC team

in January 2011 as a junior

computational scientist. Renato

obtained his BSc in Computer Science

at the Federal University of Campina

Grande (UFCG), Brazil. He worked in

the Distributed Systems Laboratory as

a developer for the OurGrid

middleware project. Renato was also a

visiting research student at ICHEC for

the summer internship programme.

BeiBei Ma joined us in August 2010 as

a software developer in the technology

transfer group. She is primarily involved

with providing solutions and

consultancy to Irish SMEs. BeiBei

studied at the Dalian University of

Technology, receiving a BEng in

electronic engineering. Her PhD was

completed at the National University of

Ireland Maynooth in 2009.

Tanya Abbas joined ICHEC in

November 2010 as an administrative

assistant. Tanya has a BA in broadcast

management from the University of

Texas at Arlington. She completed an

internship with CBS 11 in Dallas/Fort

Worth, and worked as a journalist for

the Sunday World newspaper in Dublin.

Tanya is from Fort Worth, Texas, and

moved to Ireland in April 2010.

Nicola McDonnell joined ICHEC as a

computational scientist in October

2010, and works on our PRACE and

technology transfer activities. Prior to

joining ICHEC she studied natural

sciences at TCD, and completed an

MSc in computer science at Queen’s

University Belfast. She has a

postgraduate diploma in computer

games development from Abertay

University, Dundee. Nicola joins us

from the EPCC, where she was a

principal consultant in the

Applications Group.

Nicola Varini (not pictured) joined

ICHEC in November 2010 as a

computational scientist. He studied for

his Masters degree in computational

physics at Udine University in Italy,

and is currently a PhD student there.

He was involved in European projects

such as PRACE and MMM@HPC, and

in developing and implementing new

potentials in LAMMPS and some

computational aspects of Quantum

Espresso and OpenFOAM.

Contents

Editorial 2
New recruits at ICHEC 2
New and notable 3
  International collaboration: PRACE
  Watch this space …
Special feature 4
  Shared services condominiums
Education and training 5
  Events calendar
  PRACE Spring School – Edinburgh 2011
  Introduction to CUDA
Research update 6
  Understanding the origins of eukaryotic genes and genomes
A year in review 7
Irish group awarded US compute time 7
Technology transfer 8
  ICHEC provides technology solutions for Irish companies

News

New recruits to ICHEC: Back row (from left): Andy Regan; Martin Peters; and,

Filippo Spiga. Front row (from left): Renato Miceli; BeiBei Flynn (née Ma);

Tanya Abbas; and, Nicola McDonnell.

ICHEC welcomes new recruits

Eight new staff members have joined the team at ICHEC.

ISSUE 10 : JANUARY 2011 : PAGE 3

European infrastructure development brings new opportunities

DRS J-C DESPLAT and MICHAEL BROWNE present an update on ICHEC’s continuing involvement in the PRACE project.

Regular readers of ICHEC News will be

familiar with ICHEC’s active

involvement in the development of

the European Tier-0 high-end

computing (HEC) infrastructure,

which typically includes the ‘top 10’

most powerful supercomputers in the

world. This effort is led predominantly

under the auspices of the Partnership

for Advanced Computing in Europe

(PRACE), and financially supported by

a number of European governments,

national funding agencies, and the

European Commission (see ICHEC

News, Issue 8). Progress to date has

been rapid, sustained and significant,

placing HEC well ahead of most other

types of large-scale infrastructures.

For instance, the second PRACE

supercomputer, a 1.6PFlops Intel Xeon

system from BULL, funded by GENCI

in France, is now open for access to

European researchers.

The benefits of this evolution will be

far reaching for the Irish

computational science community, for

Ireland, and for ICHEC. Through their

partnership with ICHEC, Irish

researchers enjoyed a 100% success

rate at the technical evaluation stage

(the European average is only 52%). A

very interesting evolution for Irish

researchers is the broadening of

PRACE’s mandate to incorporate so-

called Tier-1 systems (typically 1/10

of the size of the top 10 systems, so

currently c.100 TFlops peak) within

its infrastructure, and the role that

Ireland (through ICHEC) will play in

the establishment of this new

service. In the national context, this

evolution will extend the relevance

of PRACE from a dozen or so

researchers, to up to 100. The Tier-1

service will provide further resources

to groups seeking access to systems

more powerful than Stokes, and will

provide others with an important

stepping stone to the Tier-0 systems

(a much needed mechanism

considering the recent closure of the

Irish Capability Computing Service).

Importantly, this access will be

gained through an international

competitive process, based on the

scientific merit and technical

readiness of the proposed research,

rather than on ‘juste retour’. Based

on the recent Irish successes

securing resources through the

DEISA Extreme Computing Initiative

(see ICHEC News Issue 9), we are

confident that Irish researchers will

fully avail of this opportunity to

increase their competitiveness.

Work on the first implementation

phase is now well underway, with

ICHEC involved in four work packages.

Our effort is predominantly targeted

at porting and optimising key

community codes such as Quantum

ESPRESSO, OpenFOAM, or Elmer for

Tier-1/0 use. In the second

implementation phase (expected to

start Q3-2011), ICHEC’s role will be

broadened to include responsibility for

the co-ordination of PRACE training

activities, and co-leadership (with CSC

in Finland) of the ‘pillar’ on training

and dissemination. Not surprisingly,

ICHEC’s leading expertise in GPGPU

computing will be exploited in both

phases.

These significant changes, especially

the Tier-1 area, which was previously

served by the DEISA DECI

programme, are bringing new

opportunities for access to hardware

resources and expertise, and ICHEC

continues to be happy to help the

community to capitalise on this.

For further information, contact

ICHEC at [email protected] and see

http://www.prace-project.eu/.

New and notable

Gilles Civario (right) and Dr Florian Berberich (Jülich Supercomputing Centre, Germany) at the PRACE booth during SC10 in New Orleans.

ICHEC, e-INIS and the IPCC

ICHEC is a partner in the EC-Earth

climate model, along with UCD,

Met Éireann and a host of other

European meteorological services.

As its commitment to the project,

ICHEC is in charge of the project’s

data management, and is preparing

and managing all EC-Earth data for

the UN Intergovernmental Panel on

Climate Change (IPCC) report.

A key part of the next IPCC report,

dubbed ‘AR5’, is a comparison of

the results of all the different

climate model outputs. In essence,

we want to know how the climate

models differ in their predictions

and why. Comparing the models is the goal of CMIP5, the Coupled Model Intercomparison Project Phase 5. For this, each of the major climate modelling teams (around 10-15 worldwide) runs, in addition to its own science, a set of comparison experiments using the same scenarios and outputting the same variables. These

outputs are compared, and the

results summarised as part of the

AR5 report, scheduled for 2013.

Clearly, this involves processing huge data files and building the tools to handle them. EC-Earth plans to make

between 100 and 200TB of data

public (current plans are for 130TB,

with more possible depending on

funding). But for these data to be

meaningful and facilitate

comparison, all the output from the

different models must be in the

same format: not just the same file

format, but the meaning of each

variable needs to be clarified and

understood. It is in this

standardisation process that ICHEC

has been centrally involved, writing

tools to reformat the model output

in an agreed standard format, and

developing standards for data

management.

Once the data is ready, a core 50TB

will be stored at three Tier 0

institutes where the actual

intercomparisons will be processed.

In addition to this core, a further

100TB of the data will be kept on

e-INIS storage, with ICHEC running

a metadata server. Using these,

climate researchers will be able to

find all the results from EC-Earth.

The intention is to make all the

data available to the general public

so that they can see the effects of

climate change themselves in

programs like Google Earth.

ICHEC and e-INIS together form one of only 10 or so Tier 1 centres managing data for the IPCC report. This work helps to place Ireland at the forefront of climate science, and

ensures that we get the highest

profile and best value for the

science done.

Watch this space …

Ireland leads data management node for European climate change initiative.

PAGE 4 : ISSUE 10 : JANUARY 2011

Concept

Rarely in the development of modern IT services have data-intensive and processor-intensive computing (i.e., high-performance computing [HPC]) been served by a technology landscape that changes so rapidly. New and emerging architectures based on

GPGPUs, as well as the proliferation of services from cloud computing

providers, make planning for the future of HPC more challenging than it has

ever been (see ICHEC News issue 9), and all of this without even considering

the storing and management of data files growing at near-exponential rates.

How should we address these challenging and exciting developments at a

national level to ensure that we keep up with and benefit from these

developing trends in a declining budgetary environment? We believe that

the development of specially adapted ‘condominiums’ – managed by ICHEC

centrally in a co-located data centre – for all of the institutions that need

high-end computer cycles, should form an important part of the strategy in

the currently challenging economic environment. It is not suggested that

these ‘condos’ should replace local compute resources, but rather that they

should be seen as complementary to them. Institutions may decide in the

future that they represent cost-effective alternatives to procuring, buying

and managing their own HPC resources for small systems of the order of

100+ cores, and where it remains essential for access to these facilities to

remain under local control.

So what are ‘condominium clusters’, so-called by a number of US institutions

now rolling out the model in order to ensure value for money and optimum

use of resources increasingly constrained by tightening budgets?

Condominium clusters

Condominium clusters are clusters composed of compute resources owned

by different institutions and administered and hosted centrally, by ICHEC in

this instance, with the provision that spare cycles are made available across

condominium boundaries. Seven condominiums are currently in operation

and managed by ICHEC for these institutions (see Table). Two of these

condos, owned by UCD and NUI Maynooth, have operated very successfully

since Stokes was commissioned in late 2008; the remaining five were put in

place following the recent Stokes technology refresh.

The way the condo model works is straightforward:

• A particular institution or research group contributes towards the purchase of a large cluster, ‘buying’ a share of the total system at a cost pro rata to the size of the share it wishes to ‘own’. For example, if ICHEC purchases a 4,000-core cluster and a 150-core condo is being bought, the once-off capital cost is 150/4,000 of the total capital purchase price.

• ICHEC systems administrators manage the total system, of which the institutional condos are seamless components. If or when problems occur with the hardware on site (presently the UCD data centre), ICHEC staff access the data centre to fix the problem and thus ensure minimum disruption. ICHEC’s complete administration service is provided free to the condo owner.

• In the refreshed Stokes compute cluster, ICHEC provided a limited number of €50k (capital) shares (see Table), each amounting to c.96 cores across eight nodes. Institutional condo shares can be operated as a fixed-boundary cluster (i.e., a partition) or, alternatively, can provide access to users via an allocation model. Jobs in the partition model are limited to the size of the condo and depend on the availability of sufficient cores to run a specific job. Jobs in the allocation model (currently used by all institutions but one) are scheduled so that the average usage by the institution corresponds approximately to the size of the institution’s condo.

• Apart from the initial capital cost of the condo and storage, the only additional costs for the institution are the monthly electricity costs, which are charged pro rata to the size of the condo.
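The pro rata arithmetic described above can be sketched in a few lines of Python. This is an illustration only: the 150-core and 4,000-core figures come from the example in the text, while the cluster price and the monthly electricity bill are hypothetical amounts chosen to make the sums concrete, and the function name `pro_rata` is an assumption rather than anything ICHEC uses.

```python
def pro_rata(condo_cores: int, total_cores: int, total_amount: float) -> float:
    """Return the condo's share of total_amount, proportional to its core count."""
    if total_cores <= 0 or not 0 <= condo_cores <= total_cores:
        raise ValueError("condo_cores must lie between 0 and total_cores")
    return total_amount * condo_cores / total_cores


# Figures from the article: a 150-core condo in a 4,000-core cluster.
CONDO_CORES = 150
TOTAL_CORES = 4000

# Hypothetical amounts, used only to make the arithmetic concrete.
CLUSTER_PRICE_EUR = 1_000_000
MONTHLY_ELECTRICITY_EUR = 20_000

capital_cost = pro_rata(CONDO_CORES, TOTAL_CORES, CLUSTER_PRICE_EUR)
monthly_electricity = pro_rata(CONDO_CORES, TOTAL_CORES, MONTHLY_ELECTRICITY_EUR)

print(f"Share of system: {CONDO_CORES / TOTAL_CORES:.2%}")
print(f"Once-off capital cost: EUR {capital_cost:,.2f}")
print(f"Monthly electricity: EUR {monthly_electricity:,.2f}")
```

The same function covers both charges because each is simply the condo's fraction of the whole system applied to a different total.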

Institution    Shares   # cores   # users
DCU            1        96        29
DIAS           1.5      144       12
NUIG           0.5      48        8
NUIM           2        192       29
UCD            3.25     312       91
UL             0.5      48        7
Met Éireann    1.5      144       N/A
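As a quick consistency check, the core counts in the table follow from the share sizes at roughly 96 cores per share. The short Python sketch below recomputes them from the table; the capital total assumes, purely for illustration, that every share was bought at the quoted €50k, which the article does not state explicitly.

```python
CORES_PER_SHARE = 96        # "c.96 cores" per share, as quoted in the article
PRICE_PER_SHARE_EUR = 50_000  # quoted capital price of one share

# Shares per institution, copied from the table.
shares = {
    "DCU": 1,
    "DIAS": 1.5,
    "NUIG": 0.5,
    "NUIM": 2,
    "UCD": 3.25,
    "UL": 0.5,
    "Met Éireann": 1.5,
}

# Core count implied by each institution's shareholding.
cores = {name: int(n * CORES_PER_SHARE) for name, n in shares.items()}

total_shares = sum(shares.values())
total_cores = sum(cores.values())
# Assumption: all shares purchased at the quoted price.
total_capital_eur = total_shares * PRICE_PER_SHARE_EUR

for name, n in cores.items():
    print(f"{name:12s} {shares[name]:>5} shares -> {n:>4} cores")
print(f"Total: {total_shares} shares, {total_cores} cores, EUR {total_capital_eur:,.0f}")
```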

Shared services for high-performance computing: introducing the ‘condominium’ model

[Figure: chart of shares for UCD, NUIM, DCU, NUIG, UL, DIAS, Met Éireann and the National Service. Segment values: 76%, 6%, 5%, 4%, 4%, 3%, 1%, 1%.]

DR J-C DESPLAT and PROFESSOR JAMES SLEVIN discuss condominium clusters introduced by ICHEC to provide ownership, value for money and optimum use of resources for the higher education institutes.

Special feature

ISSUE 10 : JANUARY 2011 : PAGE 5

Benefits

• No procurement overheads for the institution. ICHEC guarantees that the institution obtains its own institutional cluster, managed by highly experienced ICHEC staff. Institutions also know that ICHEC’s deal with the vendor will almost certainly provide them with a cluster offering significantly more cores per euro than a stand-alone system bought directly.

• Strong savings on the capital side are also made on storage (only a fraction of a ‘drawer’ is purchased, not the full infrastructure), as well as on shared infrastructure that has already been purchased (management nodes, login nodes, interconnect, software licences, etc.).

• No hosting costs and no systems administration costs.

• High availability of system, e.g., Stokes has a record of over 99% availability since January 1, 2009, when it was commissioned.

• Availability of up-to-date, tuned tools, libraries and application software.

• Full utilisation of their cluster, where institutions use the allocation model rather than the partition model for access.

• Seamless transition from condo to ICHEC’s national service where needed. In effect, the condo model encourages new users to use ‘local resources’ as a stepping-stone to the full national service via Class B or C applications. Note that UCD now has 89 users on its 312-core condo, up from about 20 when the service began in early 2009. More than half of these are not users of the national service (as yet, it should be emphasised!), but are in local, try-out mode.

In summary, therefore, we believe that

the condominium shared services

concept is a very attractive one for

institutions, and with its impressive

saving of costs, management and

administrative overheads, it is a

particularly compelling one in the

present economic climate.

Supplementary material on the share

model can be found at

http://www.ichec.ie/services/alternate_access.

Education and training

Dr J-C Desplat

Associate Director

of ICHEC

Recent events

October 26-28, 2010: Introduction to Modern Fortran at NUI Galway

November 24, 2010: Bio HPC Clinic at University College Dublin

December 6-7, 2010: Introduction to CUDA at University College Dublin

March 29-31, 2011: PRACE Spring School 2011, Edinburgh Parallel Computing

Centre, Edinburgh, UK.

DEISA/PRACE Spring School – Edinburgh 2011

As part of ICHEC’s involvement in the PRACE project, we are working with

the Edinburgh Parallel Computing Centre (EPCC) to organise the joint

DEISA/PRACE Spring School 2011, from March 29-31 in Edinburgh. The

School will cover new languages, programming paradigms and tools for

extreme scalability. The programme consists of two parallel tracks with

topics including: Using the Cray XE6; GPU Programming with CUDA; and,

Hybrid MPI/OpenMP Programming. Attendance is free for EU academics

and limited support is available from DEISA to cover costs for a small

number of attendees. Please refer to http://www.prace-project.eu/events/edinburgh-prace-school for further information.

ICHEC is also actively working towards the creation of the PRACE training

portal, a central resource for consolidated training material from all across

Europe (e.g., presentations, videos). For further details on the PRACE

project, please refer to http://www.prace-project.eu/.

Introduction to CUDA

ICHEC has developed a new course in GPGPU programming that is available to

researchers nationwide. As for all of our courses, delivery of this course can be

arranged based on demand, i.e., once there is an expression of interest from

individuals/groups and a suitable venue and date can be arranged. Please email [email protected] to arrange a course or to obtain further information.

This is an introductory course for programmers using the CUDA (Compute

Unified Device Architecture) parallel computing architecture that is being

developed by NVIDIA. Similar to other ICHEC courses, about 50% of the

course consists of lectures, with the other 50% involving hands-on practical

exercises. It is aimed at an audience that is proficient in C/C++ programming.

The topics covered on the course are:

• overview of GPGPU;

• CUDA primitives: thread organisation, kernels;

• kernel/function qualifiers;

• thread scheduling;

• CUDA memory types (e.g., global memory, shared memory, texture memory);

• coalesced memory access techniques; and,

• overview of available CUDA libraries.

Dr Simon Wong

Computational Scientist and Training Co-ordinator

PAGE 6 : ISSUE 10 : JANUARY 2011

Research update

The traditional tree of life shows

eukaryotes as a distinct lineage of

living things, but many studies have

suggested that the first eukaryotic

cells were chimeric, descended from

both eubacteria (through the

mitochondrion) and archaebacteria.

However they arose, eukaryote

nuclei contain genes of both

eubacterial and archaebacterial

origins, and these genes have

different functions within eukaryotic

cells, with eubacterial homologs

largely involved in ‘operational’

metabolic processes and

archaebacterial homologs largely

involved in the ‘informational’

processes of transcription,

translation and replication.

Our results are based on identifying

prokaryote homologs of eukaryotic

genes, examining every gene in the

Saccharomyces cerevisiae genome.

They support recent studies in

showing that many eukaryotic genes

are related to prokaryotic genes

(2,460 of 6,704 genes), and that

75% of these have eubacterial

affinities.

We carried out a number of

phylogenetic analyses of 1,717 of

these gene families, with only the

very largest families not subjected

to these analyses. The proportions

of genes ascribed eubacterial

ancestry and archaebacterial

ancestry remained similar. These

data confirm a significant bias

toward archaebacterial homology for

genes with informational functions.

Although significant, this is not a

clear-cut distinction, given that

genes with archaebacterial

homologs are involved in most of

the biological processes of the yeast

cell.

Mapping the data

The absolute numbers of homologs

suggest a larger role for genes with

eubacterial homologs. Absolute

numbers do not necessarily tell the

whole story, however, given that

genes may differ in function in many

different ways, such as through

different patterns of expression and

involvement in different metabolic

pathways. To explore this functional

dimension, we mapped our

homologs onto data from a

comprehensive gene knockout

study, identifying each gene as

having either a lethal or a viable

deletion phenotype. Our results

showed that lethal genes are more than twice as likely to have archaebacterial homologs as eubacterial homologs.

In an effort to explain the greater

essentiality of archaebacteria-

related genes, we examined data

that might shed light on the

differing cellular functions of these

genes and their protein products.

Using data from RNAseq

experiments, we found significantly

greater expression of genes with

archaebacterial homologs. The

average number of tags that could

be attached to genes of

archaebacterial origin was 164.64,

compared with 73.81 for eubacteria.

This is a >2-fold difference on

average. No significant differences

are seen between the expression

levels of operational and

informational gene categories.
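The headline figures quoted in this article can be checked with simple arithmetic. The Python sketch below uses only numbers stated in the text (the tag averages and the gene counts); the variable names are illustrative, and the eubacterial gene count is an approximation derived from the quoted 75%, not a figure from the paper.

```python
# Average RNAseq tag counts quoted in the article.
ARCHAEBACTERIAL_TAGS = 164.64
EUBACTERIAL_TAGS = 73.81

# Gene counts quoted in the article.
GENES_WITH_PROKARYOTE_HOMOLOGS = 2460
TOTAL_YEAST_GENES = 6704
EUBACTERIAL_FRACTION = 0.75  # "75% of these have eubacterial affinities"

# Ratio of average expression: archaebacterial over eubacterial homologs.
fold_difference = ARCHAEBACTERIAL_TAGS / EUBACTERIAL_TAGS

# Share of the yeast genome with an identifiable prokaryote homolog.
homolog_fraction = GENES_WITH_PROKARYOTE_HOMOLOGS / TOTAL_YEAST_GENES

# Approximate count implied by the quoted percentage (derived, not from the paper).
approx_eubacterial_genes = round(EUBACTERIAL_FRACTION * GENES_WITH_PROKARYOTE_HOMOLOGS)

print(f"Expression fold difference: {fold_difference:.2f}")
print(f"Genes with prokaryote homologs: {homolog_fraction:.1%}")
print(f"Approximate genes with eubacterial affinities: {approx_eubacterial_genes}")
```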

We also found that genes with

archaebacterial homologs are more

central and more highly connected

in the yeast protein interaction

network (Figure 1), which has been

shown to reflect greater essentiality.

This difference is partly explained by

the greater centrality and

connectedness of informational

genes, but a statistically significant

difference is still observed for

operational genes alone (data not

shown). Finally, eubacterial

homologs show more duplicate

copies (paralogs) within the yeast

genome, suggesting that a greater

degree of genetic redundancy is

protecting the cell against deletion

of eubacterial homologs. The results

indicate that archaebacterium-

derived genes are significantly more

likely to be essential to yeast

viability, are more highly expressed,

and are significantly more highly

connected and more central in the

yeast protein interaction network.

These findings hold irrespective of

whether the genes have an

informational or operational

function, so that many features of

eukaryotic genes with prokaryotic

homologs can be explained by their

origin, rather than their function.

Taken together, our results show

that genes of archaebacterial origin

are in some senses more important

to yeast metabolism than genes of

eubacterial origin. This importance

reflects these genes’ origin as the

ancestral nuclear component of the

eukaryotic genome.

The full publication is available

online:

Cotton, J.A., McInerney, J.O.

Eukaryotic genes of

archaebacterial origin are more

important than the more numerous

eubacterial genes, irrespective of

function. Proc Natl Acad Sci USA

2010; 107: 17252-17255.

http://dx.doi.org/10.1073/pnas.1000265107

Understanding the origins of eukaryotic genes and genomes

James A Cotton (a, b) and James O McInerney (b)

(a) Department of Biology, National University of Ireland, Maynooth; and, (b) School of Biological and Chemical Sciences, Queen Mary University of London.

FIGURE 1: The yeast protein-protein interaction network. Each vertex is a single Saccharomyces gene, with edges connecting genes whose protein products are known to interact.

ISSUE 10 : JANUARY 2011 : PAGE 7

Review of 2010

NVIDIA and GPGPUs

ICHEC staff have continued to

build expertise in the area of

graphics accelerators for scientific

computations and have worked

closely with industry leaders

NVIDIA in this area. In recognition

of “the outstanding research

taking place at ICHEC…”, NVIDIA

designated ICHEC as a ‘CUDA

Research Centre’, one of only

seven in the world at the time,

and provided us with some of

their latest technology.

PRACE (Partnership for Advanced Computing in Europe)

We have continued to consolidate our position in this important partnership to build out a European HPC ecosystem. The EU allocated us a €660k grant to work on a number of work packages for Phase I in 2010-11, and assigned us an increased role in Phase II of the project,

beginning on July 1, 2011. We expect to be able to facilitate access to EU

Tier-1 compute resources later this year, a particularly important

development for us in light of the termination of the BlueGene capability

service on January 1. Our research enablement programme has also

leveraged our involvement in PRACE, leading to a number of Irish

successes in the DEISA Extreme Computing Initiative, the PRACE

Prototype Access and the US DoE INCITE programmes in 2010.

Upgrade of Stokes

With the final tranche of e-INIS capital funding, we successfully carried

out an upgrade of our main compute cluster, Stokes, providing a much

needed increase of 50% in the core count and peak performance, and

elevating Stokes to position #330 in the November 2010 Top 500. The

upgrade was carried out in a highly efficient manner with the help of SGI

and UCD staff, ensuring that the service was interrupted for only one

week.

Roll out of condominiums

Building on the success of UCD’s and NUIM’s condominium shares, seven institutions bought into the Stokes upgrade cluster. While it is still early days on these additional ‘condos’, the signs are that, as in UCD and NUIM, these ‘local’ resources are attracting a number of new users to Stokes that would not otherwise have availed of these resources (see page 4). This is a perfect example of how innovative thinking can lead to significant savings and efficiencies through the deployment of shared services.

Technology transfer activities

As part of our SFI/CSET remit, ICHEC has a responsibility to engage in technology transfer activities. We started this activity early in the year with the appointment of Dr Eoin Brazil. Eoin documents the impressive progress we have made so far in a number of key areas, from business analytics to the application of GPGPUs and HPC to business and industry, on page 8 of this issue.

Climate/EPA project

ICHEC’s expertise in weather forecasting and climate generally continues to grow, with the broadening of our partnership with Met Éireann to extreme events analysis (funded by the EPA), and the deployment and operation on the e-INIS infrastructure of data management services for the CMIP5 project (see page 3).

Highlights of an eventful year

This has been a particularly eventful and successful year for ICHEC, building on previous initiatives and employing new staff to expand our range of activities.

Weather patterns over Ireland.

Irish group awarded US compute time

We are pleased to report that a group of researchers led by Dr Turlough Downes of DCU/DIAS has secured compute time on the Intrepid Blue Gene/P system at Argonne National Labs in the US. This very significant compute resource will be used to perform testing and benchmarking in preparation for an application to the US INCITE HPC access programme, which is open to European applicants. The 2011 call will award 1.6 billion compute hours. As with PRACE applications, ICHEC will support and advise Irish researchers who are interested in this programme to gain access to these state-of-the-art compute facilities. For more information, log on to http://hpc.science.doe.gov.

PAGE 8 : ISSUE 10 : JANUARY 2011

DEPARTMENT OF EDUCATION AND SCIENCE
AN ROINN OIDEACHAIS AGUS EOLAÍOCHTA

Technology transfer

Bringing high-performance computing to new domains

Tullow Oil is a leading independent oil and gas exploration and production company, with interests in over 85 exploration and production licences across 22 countries. ICHEC is working with Tullow Oil to optimise their seismic imaging code and assess HPC strategies for this software. As part of this assessment, messaging bottlenecks and areas for optimisation were highlighted within the software and solutions provided. ICHEC is engaged in a long-term relationship with Tullow Oil to address their seismic analysis needs and bring the benefits of the latest developments in HPC to their business. Tullow Oil’s principal geophysicist John Doherty comments:

“We consider ICHEC to be an industry leader, as evidenced by their recent partnerships with players such as SGI and NVIDIA. So this is evidence to us that they are future proofing their technical solutions and their technical expertise. Secondly, we recognise the huge expertise that exists within the team at ICHEC, working on current and leading edge technologies”.

Paddy Power is a world leader in betting and gaming entertainment, employing over 2,400 people worldwide with 1,700 people in Ireland. Its revenue has shown consistent annual growth of 30% for each of the last nine years. Paddy Power provides an online wagering service for a variety of racing, sporting and novelty events. The nature of the business requires the company to be able to handle large volumes of data involving intensive use of CPU power in real time. ICHEC’s expertise in massively parallel computing has significantly helped to accelerate and optimise the company’s processing power.

The company plans, with ICHEC’s help, to process large volumes of data in real time using this technology, and to train its staff in utilising hybrid compute systems. Their commercial lead on the project described how a consultancy relationship with ICHEC is different:

“What I was most impressed by was their willingness not only to deliver the solution, the black box, which is desirable, but ultimately, if something goes wrong down the road, then we would be in trouble, but they were anxious to ensure that there was knowledge transfer as well”.

Addressing business issues with analytics

Analytics is a process that involves the application of statistical and data-mining techniques to historical data with a view to developing a computational model that predicts and improves business performance and planning. ICHEC has been actively engaging with Irish companies in this area to help them identify and address their planning strategies. ezetop and CarTrawler are two companies where ICHEC has been tackling business problems using analytics. ezetop provides an online mobile phone top-up service for over 130 mobile operators in 65 countries worldwide and was the winner of the Rising Star category in the Ireland Deloitte Technology Fast 50 for 2010. This award category recognises younger technology companies who have had the fastest growth in turnover over the last three years. ezetop had a 1215% revenue growth over this period (2006-2009). David Bowles, Head of Online at ezetop, illustrates one business issue they face, where in order to accept a credit card payment for a real-time transaction, they must “make a decision within 10 seconds because we need to deliver that credit to the mobile phone immediately. So we have a problem with a huge element of fraud...” ICHEC has developed a prototype fraud detection solution for ezetop to ensure that fraud can be minimised. ICHEC, ezetop and Enterprise Ireland are now working on avenues to further improve and commercialise this system to address real-time fraud in Internet-based transactions.

CarTrawler provides an online car rental solution for over 550 car rental suppliers in 175 countries, covering approximately 25,000 locations. CarTrawler won the Enterprise of the Year Award at the Irish Business & Finance Awards in 2011. This is an award for companies trading for 10 years or less. CarTrawler’s growth in revenue over the past five years (2004-2009) was 544%.

ICHEC has worked with CarTrawler staff to improve their existing business reporting and to enhance their forecasting ability. ICHEC identified high-value customers and their booking patterns to supplement existing reporting practices. Historic booking information was used in conjunction with time-series analysis to improve CarTrawler’s forecasting ability, allowing them to predict future demand patterns for specific locations.

A short video with a selection of client testimonials is available online at: http://ichec.ie/about_us/intro_video.

Providing technology solutions for Irish companies

An important part of ICHEC’s mission is to help Irish companies to stabilise and grow their business. High-performance computing (HPC) and business analytics are two of the key areas where ICHEC is successfully tackling real business problems for Irish companies to stimulate job creation, improve revenue growth and provide knowledge transfer to enable these companies to utilise cutting-edge techniques in their products and services.

Dr Eoin Brazil
ICHEC Senior Software Developer and Technology Transfer Consultant