Advanced Computing and Information Systems laboratory

Managing large research groups: thoughts from the ACIS experience

José Fortes

Outline

• Vision and mission of the ACIS laboratory
• ACIS numbers at a glance
• Examples of different types of ACIS projects
• Comments on management goals and practices
  • The good, the bad and the ugly
• Key points

ACIS vision and mission

• Vision: to advance the science of IT systems engineering by inventing, analyzing, prototyping and deploying innovative cyberinfrastructure for eScience
• Mission: fundamental and applied research on systems that integrate computing and information processing:
  • Cloud Computing: using virtualization technologies for computing platforms, file systems, applications as services, networks and I/O systems
  • Cyberinfrastructure for e-science and e-health: for research on biodiversity informatics, brain-machine interfaces, coastal and ocean modeling, genetics and atomic-scale friction
  • Autonomic Computing and Software-defined Systems: as relevant to data centers, real-time systems, virtual networking and other topics pursued by the Center for Cloud and Autonomic Computing
  • Computer Architecture: architectural support for virtualization, reliable computing and green computing
  • Peer-to-peer Computing and Software Defined Networking: self-organizing virtual networks, and structured and unstructured query systems

Some numbers associated with ACIS

• Founded in 2001
• 2 faculty members (+2)
• 2 research professors
• 4 IT experts
• 1 administrative assistant
• 10-20 PhD students
• 3 PhD graduates/year
• Long-term visitors from Japan, Korea, China, France
• 15+ papers/year
• 1+ keynote speech/year
• Chair of 1+ major meeting/year
• Expenditures: $1.5M/year (average)

ACIS Funding sources

• Army Research Office
• BellSouth/AT&T Foundation
• DARPA
• Intel Corporation
• IBM
• National Aeronautics and Space Administration
• National Oceanic and Atmospheric Administration
• National Science Foundation (CISE, OCI, ENG, BIO)
• Northrop-Grumman
• Office of Naval Research
• Semiconductor Research Corporation
• Southeastern Universities Research Association
• Citrix, Microsoft, Merrill-Lynch, Samsung…

eScience Cyberinfrastructure (iDigBio)

Integrated Digitized Biocollections (iDigBio) is the US resource for digitized information about natural history collections

Software-defined Systems

• Fault-tolerant MapReduce
  [Figure: MapReduce system stack (MapReduce application, MapReduce framework, system software, infrastructure) managed by four modules]
  • Monitoring module: Ganglia-based monitoring; node health script
  • Planning module: scaling heuristic (master); anomaly detection (slave)
  • Analysis module: heartbeat processing (using Ganglia metric modules); precursor detection (using Hadoop node health script)
  • Execution module: resource scaling; blacklisting
  • Also shown: prediction models (master); cost models (master)

• Virtual Networking
  • Multicloud Hadoop-based BLAST
  [Figure: virtual cluster spanning sites UC, UF and PU, connected through ViNe, with a download server]
  1. ViNe-enable sites
  2. Configure ViNe VRs
  3. Instantiate BLAST VMs
  4. Contextualize
     a. Retrieve VM information
     b. ViNe-enable VMs
     c. Configure Hadoop
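The fault-tolerant MapReduce slide names monitoring, analysis, planning and execution modules without showing how they interact. The sketch below is one plausible reading of such a control loop, not the ACIS implementation: the `NodeSample` fields, the 30-second heartbeat timeout, the `min_healthy` target and all function names are assumptions made for illustration, and the real system's Ganglia and Hadoop integrations are replaced by plain in-memory data.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One monitoring sample for a worker node (all fields are hypothetical)."""
    node: str
    seconds_since_heartbeat: float
    health_ok: bool  # stand-in for the result of a node health script


def analyze(samples, heartbeat_timeout=30.0):
    """Analysis step: flag nodes with stale heartbeats or failed health checks."""
    return [s.node for s in samples
            if s.seconds_since_heartbeat > heartbeat_timeout or not s.health_ok]


def plan(suspects, cluster_size, min_healthy):
    """Planning step: blacklist suspects and request replacements if needed."""
    healthy = cluster_size - len(suspects)
    return {
        "blacklist": sorted(suspects),
        "add_nodes": max(0, min_healthy - healthy),
    }


def execute(actions, blacklist, cluster_size):
    """Execution step: apply the plan to in-memory state and return the new size."""
    blacklist.update(actions["blacklist"])
    return cluster_size - len(actions["blacklist"]) + actions["add_nodes"]


def control_step(samples, blacklist, cluster_size, min_healthy):
    """Run one analyze -> plan -> execute iteration over monitored samples."""
    suspects = analyze(samples)
    actions = plan(suspects, cluster_size, min_healthy)
    return actions, execute(actions, blacklist, cluster_size)
```

A caller would collect one `NodeSample` per active node, pass the list to `control_step`, and repeat on a timer. The simple threshold test stands in for the prediction and cost models named on the slide, which this sketch does not attempt to reproduce.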

PRAGMA: Enabling the Long Tail of Team Science

ACIS FACILITIES

• State-of-the-art computing, storage and networking facilities
• Unique environment for experimental research and design of distributed systems that use virtualization software developed by commercial and open-source projects
• ~200 servers, ~1250 cores, ~4.8TB memory, ~260TB storage
• FutureGrid cluster: IBM iDataPlex connected to Florida Lambda Rail
• NUMAcloud: up to 64 cores and 512MB of memory in a single image
• Autonomic Testbed: autonomics for datacenter management
• VM and cloud: rich set of VMMs and cloud software
• Storage: centralized (IBM DS4800) and cloud-based (OpenStack)

Management goals and practices

• Goals
  • To succeed in mission
  • Solid reputation, high-quality, highly-cited work
  • Consistent ability to attract funding (sustainability)
  • Excellent IT research infrastructure/facilities
  • Growing research capacity and culture
• Practices
  • Make expectations clear to all; then trust, measure and verify
  • Pursue multiple sources of funding
  • Invest in facilities: hardware, software and space
  • Simplify, automate, create routines to eliminate overhead

The good

• Have best practices
  • Put them in writing
  • Known when joining the lab
  • Create wiki
  • Everyone to contribute
• Weekly research reports
  • Key questions: what, why, discoveries, show-stoppers
  • Research notes, lab results, papers read, drafts
  • Save everything in e-forum

The bad (and what to do about it)

• Everyone is different
  • Keep expectations the same for everyone, but provide flexibility on how to achieve them (schedule, style, topic…)
• You manage the research team you have, not the research team you would like to have …
  • Adapt assignments to talents/capabilities
  • Enable team to acquire needed skills
  • Recruit wisely
  • Enable senior members to help junior members grow

The ugly (and how to live with it)

• People leave
  • Graduation, employment, change of plans…
  • Overlap stays of departing members and new researchers
• Some people and/or projects will fail
  • Prepare for failure recovery and learn from failure
• Funding will vary
  • Have “rainy day” funds (overhead return, gift monies…)
• Funding comes with work and overhead
  • Zero-sum game: there is a limit to what you can do
• Other “stuff” will happen (social, personal, political…)
  • Have everyone focus on doing good work …

Key points (noteworthy common sense)

1. Clear vision and mission (known and understood by everyone)

2. Clear metrics (measuring quality and quantity)

3. Best practices (documented and known from day one)

4. Routines and automation (simplify, minimize overhead, admin assistance if possible)

5. Train leaders (and give them ownership of lab initiatives)

6. Good resources, facilities and working environment

7. Diversify funding portfolio

8. Align the stars (maintain solid reputation)

9. Understand limitations and learn from mistakes
  • Insufficient number of faculty members
  • Too large a dependency on soft money
  • No think-time/slack to engage in large efforts or new initiatives
  • Institutional inertia in engaging in new/novel operational models

10. Do periodic SWOT analysis