research-computing-synopsis



Tallahassee, 01/12/15

Research Computing at Florida A&M University – Networks, Resources, and Strategies: A Synopsis

A Conversation with FAMU President Dr. Elmira Mangum and Vice President for Research Dr. Timothy Moore

Location: President’s Office, Florida A&M University, Tallahassee, FL 32307.

Date: January 12, 2015.

Author: Mark A. Jack, Ph.D., FAMU Physics Department, College of Science and Technology Committee – Computational Science Program.

Research Computing – Executive Summary

Problem Statement

• Computing is now the third pillar of science, alongside experiment and theory.

• Faculty who use computing for research (e.g. in STEM) either try to acquire their own resources through grant funding or rely on insufficient, inadequate or outdated resources to meet research and teaching needs. This approach is ineffective for building competitive, interdisciplinary research programs and for training U.S. students with the computational skills a 21st-century economy requires.

• Research computing at FAMU is not centrally organized, lacks sufficient resources (network, large computer cluster, software), and has no staff or administrative support. This makes it hard to increase the number of faculty, students, and staff using research computing.

• The university currently does not have a permanent VP of Information Technology, and the Office of the CIO does not treat research computing as a separate area of focus with a coherent strategy. The focus up to now has been enterprise computing, i.e. student and administrative services.

Proposed Solution

• The University should formulate a coherent strategy across colleges and departments for addressing current needs in computing, data storage and data management, and fast, reliable networks for research, teaching and training in the 21st century.

• The University will create an initial seed of resources to start research computing (RC) on campus and then expand this new infrastructure with the involvement of faculty, students and staff.

• Faculty, with their discipline-specific expertise in applying computing in their labs and classrooms, can contribute to the growth of RC through joint proposals with the new university team to build resources, increase research productivity, and train students in new skills.

Resources Needed

• Shared computational campus resource: New computer cluster at CePAST.

• Research computing director: one dedicated technical staff member serving as system administrator, user support, and technical liaison for RC proposals and to SSERCA, and responsible for formulating an RC strategy for FAMU.

• FAMU membership in state, regional and national networks – SSERCA, SURA, and XSEDE.

• Training: Computational science program in STEM and in fields with non-traditional RC users.

• Cyber-infrastructure upgrades (network upgrades, campus data center, cloud computing).


Campus Resources – Research Computing

1. Campus Computer Cluster:

Resources:

• Historically: Army DoD Research Computer Cluster – remote access via FH Science Research Center (Pharmacy, Environmental Science, Physics, Chemistry); decommissioned in 2003/04.

• Currently: CePAST Computer Cluster, funded by Army/DoD as part of a collaboration with the University of Hawaii (TeraWatt laser remote sensing); PIs: C. Weatherford, L. Johnson (physics). Mac Xserve cluster (128 nodes = 256 processors), in operation since 2005/06. Users: 3 faculty, 2-3 postdocs/research associates, 2-3 graduate students. Not a shared resource.

• New (Jan. 2015): Hewlett Packard cluster addition to the CePAST cluster, sponsored by the FAMU Office of the President ($160,000). New, shared campus resource.

FAMU Campus Needs (w/o Engineering):

• Currently, 15-20 FAMU faculty on campus actively use high-performance / high-throughput computing for research and/or education, mostly in STEM: physics: 4-5; chem.: 1; biology: 2; math: 2; CIS: 5-6; ESI: 1; pharmacy: < 5; SBI: 1 (the FAMU-FSU Engineering School is supported via FSU RCC).

• Typical simulation needs for research, code development, training: Gaussian, QChem, MD (chem., pharm., biol., ESI) – 32-64 cores per simulation; DFT, ab-initio QM (phys., chem.) – 128-256 cores per simulation; other areas, e.g. distributed (‘cloud’) computing.

• Core estimate: 16 people running 128-core (mid-tier) jobs concurrently (not every faculty member will run simulations at the same time; some faculty might have up to 2-3 students running simulations). => A new research cluster with ≈2048 cores is reasonable.
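The core estimate above is a simple product of concurrent users and cores per job. The following is a minimal sketch of that arithmetic, not a sizing tool; the function name `estimate_cluster_cores` is hypothetical, and the assumption that each concurrent user runs exactly one job at a time is a simplification of the estimate in the text.

```python
def estimate_cluster_cores(concurrent_users: int, cores_per_job: int) -> int:
    """Peak core demand, assuming every concurrent user runs one job of the given size."""
    if concurrent_users < 0 or cores_per_job < 0:
        raise ValueError("inputs must be non-negative")
    return concurrent_users * cores_per_job


# Figures taken from the core estimate above.
CONCURRENT_USERS = 16
MID_TIER_CORES = 128

total_cores = estimate_cluster_cores(CONCURRENT_USERS, MID_TIER_CORES)
print(f"Estimated cluster size: {total_cores} cores")

# Same user count at the per-simulation job sizes quoted above (32 to 256 cores).
for cores in (32, 64, 128, 256):
    demand = estimate_cluster_cores(CONCURRENT_USERS, cores)
    print(f"{cores:>3}-core jobs: {demand} cores")
```

Varying `cores_per_job` shows how sensitive the estimate is to the job mix: the same 16 users need between 512 and 4096 cores across the quoted job sizes, with the mid-tier case giving the 2048 cores proposed.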

2. HPC Administration and Staff (Expertise):

Currently:

− 4-5 FAMU EIT staff responsible for enterprise computing services (student, administrative, and web/internet services).

− One IT staff member as administrator of old CePAST cluster (position supported by DoD/Army grant).

FAMU Campus Needs:

− 1 new FTE: one new IT staff member as Research Computing Director, reporting directly to the FAMU CIO (VP of Information Technology) or the FAMU VP of Research.

− Duties:

a) Administration of the research computing cluster, support for researcher needs on the cluster, and IT expertise for future cyber-infrastructure efforts.

b) Technical liaison to SSERCA and other networks (SURA, XSEDE etc.).

c) Collaboration with the faculty liaison (e.g. M. Jack) and researchers on new cyber-infrastructure proposals to federal agencies (NSF, DoE, DoD etc.), the state legislature, and private sponsors.


− Governance:

a) A governance committee will be created, composed of faculty members from different disciplines who use and have expertise in research computing.

b) The governance committee acts in an advisory function to the Office of the CIO, the Office of the President, and the University, and regularly reviews computing strategies and their implementation at FAMU.

c) One faculty member is elected as committee chair every few years. The RC director serves as an ex-officio member.

3. Campus Networks, Collaborations and Organizations – State, Regional and National

SSERCA – Sunshine State Education Research Computing Alliance:

Mission (SSERCA bylaws as of Oct. 2014) (http://www.sserca.org):

“… The mission of SSERCA is to further the development of a statewide computational science infrastructure of advanced scientific computing, communications and education resources by promoting cooperation between Florida's universities. SSERCA will among other activities cultivate funding of, and support for, computational science initiatives in support of Florida academic institutions and contribute to the State’s economic development. …”

History and Organization:

− Emerged from the statewide cyber-infrastructure network Florida Lambda Rail (FLR). FLR is a member organization of SSERCA.

− July 2010: First meeting and initiation with CIOs, research computing directors, 23 faculty researchers, 1 SURA liaison (L. Akli), and 2 corporate representatives at U Miami campus. Since July 2010: Quarterly meetings at member institution sites. Since Spring 2014: Three meetings per year with two held virtually (online).

− Current members: UF, FSU, USF, UCF, FIU, FAU, U Miami. Current affiliate members: UWF, UNF. Former affiliate member: Florida A&M University (Mar 2013 – Oct 2014).

Opportunities with SSERCA:

− Access to computing and storage resources and IT expertise of the SSERCA network for large-scale simulation or data analysis projects and/or statewide research collaborations.

− Joint proposal activities with SSERCA group or individual SSERCA member(s) to federal agencies and state legislature: Synergy of IT expertise. Network of researchers in the state.

− Stronger negotiation power and working relationships as a group with computer hardware, software and network vendors. Example: Data Direct Network (DDN) as computer storage provider with a strong working relationship with SSERCA. SSERCA as model regional cyber-infrastructure to test and highlight new storage solution concepts.

Costs and Duties:

Affiliate membership (suggested):

− Subscribe to mission and goals of SSERCA.

− $2,000 annual fee for SSERCA booth at annual Supercomputing Conference (free access to booth as meeting and presentation area for faculty and students).

Membership (with maturity of available campus resources):

− Subscribe to mission and goals of SSERCA.

− Share some of campus computing time (about 250-300 processors) and storage resources with the SSERCA network, to be made available via online access to any researcher in the state of Florida.

− $5,000 annual fee for SSERCA booth at annual Supercomputing Conference (free access to booth as meeting and presentation area for faculty and students).


SURA: Collaboration with MSIs (http://www3.sura.org):

“… SURA is providing education and training programs for communities traditionally underserved by high performance computing resources. Through the Extreme Science and Engineering Environment (XSEDE) program, funded by the National Science Foundation, SURA assists Underrepresented Minorities, Women, and Minority-Serving Institutions (MSI) faculty and students to incorporate high performance computing and advanced digital services into their curriculum and research. …”

SURA member institutions participating in the XSEDE MSI Outreach program: Clark Atlanta University, Florida International University, Hampton University, North Carolina A&T State University, Virginia State University, Vanderbilt/Fisk Masters-to-PhD Bridge Program.

In Florida, many research collaborations have been facilitated across campuses with SURA membership, with IT expertise shared. Example: UF-FSU-USF research collaboration in ocean and atmospheric modeling. Other SURA member in FL: U Miami.

SURA manages AtlanticWave:

− Connects U.S. research and education communities along the Atlantic rim since 2006.

− Exchange services between U.S. and international networks (e.g. to Sao Paulo, Brazil).

Costs and Duties:

− Subscribe to mission and goals of SURA and share research expertise.

− $1,000 annual affiliate membership fee with a $2,500 initiation fee (suggested).

− $5,000 annual standard membership fee with a $10,000 initiation fee.
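To compare the membership options over a planning horizon, the fees quoted in this synopsis can be tallied as in the sketch below. It is illustrative only: the `Membership` class and its method are hypothetical names, the SSERCA figures are the annual booth fees listed earlier, and fees are assumed to stay constant from year to year.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Membership:
    """One membership option with the fees quoted in this synopsis (US dollars)."""
    name: str
    annual_fee: int
    initiation_fee: int = 0  # one-time fee; zero where the synopsis lists none

    def cumulative_cost(self, years: int) -> int:
        """Total cost after the given number of years, assuming constant fees."""
        if years < 1:
            raise ValueError("years must be at least 1")
        return self.initiation_fee + self.annual_fee * years


SURA_AFFILIATE = Membership("SURA affiliate", annual_fee=1_000, initiation_fee=2_500)
SURA_STANDARD = Membership("SURA standard", annual_fee=5_000, initiation_fee=10_000)
SSERCA_AFFILIATE = Membership("SSERCA affiliate", annual_fee=2_000)
SSERCA_MEMBER = Membership("SSERCA member", annual_fee=5_000)

# The two options marked "(suggested)" in the synopsis.
suggested = (SURA_AFFILIATE, SSERCA_AFFILIATE)

for years in (1, 5):
    total = sum(m.cumulative_cost(years) for m in suggested)
    print(f"Suggested memberships, {years} year(s): ${total:,}")
```

Under these assumptions the two suggested affiliate memberships together cost $5,500 in the first year and $17,500 over five years.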

XSEDE (National Science Foundation):

− Easy application process and access to large-scale, national HPC/HTC resources managed by NSF via education allocations (1 year, < 200,000 CPU hours).

− Research allocation request via proposal submission (1 year, 1-2 million CPU hours).

− XSEDE Campus Champion to inform faculty, students and staff about and coordinate access to XSEDE computational resources (allocation on clusters, software, helpdesk).

− Seamless integration and joint support structure with SSERCA and SURA networks.
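To put the allocation sizes in perspective, the sketch below converts CPU hours into a number of mid-tier jobs. It assumes that one CPU hour means one core used for one hour, and that a typical job uses 128 cores for 24 hours; the 24-hour duration and the function name `jobs_supported` are illustrative assumptions, not figures from XSEDE.

```python
def jobs_supported(allocation_cpu_hours: int, cores_per_job: int, hours_per_job: int) -> int:
    """Number of complete jobs that fit into an allocation.

    Assumes one CPU hour equals one core used for one hour.
    """
    cost_per_job = cores_per_job * hours_per_job
    if cost_per_job <= 0:
        raise ValueError("job size and duration must be positive")
    return allocation_cpu_hours // cost_per_job


CORES_PER_JOB = 128   # mid-tier job size used in the campus core estimate
HOURS_PER_JOB = 24    # hypothetical job duration

allocations = {
    "education (upper bound)": 200_000,
    "research (lower bound)": 1_000_000,
    "research (upper bound)": 2_000_000,
}

for label, cpu_hours in allocations.items():
    n = jobs_supported(cpu_hours, CORES_PER_JOB, HOURS_PER_JOB)
    print(f"{label}: {cpu_hours:,} CPU hours = {n} jobs of {CORES_PER_JOB} cores x {HOURS_PER_JOB} h")
```

Under these assumptions an education allocation covers at most 65 such jobs per year, and a research allocation between 325 and 651.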

Other Networks: OpenScienceGrid, ESnet, ORAU/ORISE, CERN etc.


4. Computational Skills for Students, Faculty, and Staff – Computational Science Program

Goals: Build important computational skills for students, faculty and staff across STEM and other areas; provide new training in computational science as a new interdisciplinary academic offering; increase competitiveness and job opportunities for FAMU graduates in today’s US economy; create new interdisciplinary funded research opportunities and collaborations among FAMU faculty and with off-campus research communities in theory, modeling and experiment, nationally and internationally.

History:

• Since fall 2013: First meetings in College of Science and Technology (CoS&T) to discuss a new computational science program across 5 STEM departments. Computational science program listed in updated FAMU mission statement.

• Committee meetings with chairs and faculty representatives in Biology, Chemistry, Physics (M. Jack), Mathematics, and Computer & Information Sciences (H. Chi): Comparison of existing programs in the state and nationally. Minor, B.S., and M.S./Ph.D. programs. Last meeting: April 2014.

• Faculty workshops and support by NSF XSEDE Education, Training and Outreach, SURA, and SSERCA (Linda Akli, SURA/XSEDE):

− Spring 2012: 2-day parallel visualization workshop at FAMU CIS.

− Spring 2013: 2-day Research Data Management Implementations Workshop (RDMI) in Washington, D.C. sponsored by XSEDE and CASC (M. Jack).

− Fall 2013: 2-day SSERCA/XSEDE workshop at Engineering School – Introduction to high-performance computing resources in Florida (SSERCA) and nationally (NSF XSEDE).

− Fall 2014: 2-day SURA/XSEDE workshop on computational science curriculum development with about 20 faculty and administrators from 10 different MSIs (M. Jack, H. Chi).

• Spring 2015: Finalizing curriculum plan for computational science minor in 5 STEM areas: math, physics, chemistry, biology, and CIS (courses with course descriptions). Committee: M. Jack (physics), H. Chi (CIS) and faculty associated with each course offering.

Need:

Support by upper-level administration (President/Provost):

1. Approval of computational science program and curriculum plans.

2. Support program submission to state agencies.

3. Coordinate and help with expansion to other areas (engineering, pharmacy, allied health, business, social & behavioral sciences).

4. Consider new faculty lines affiliated with computational science program.


5. Other Cyberinfrastructure Needs:

New Campus Network for Research Computing:

− Separation of research computing activities from enterprise computing activities on FAMU’s campus network for higher bandwidth and improved management of resources.

− Improved access to FLR network (‘on-ramp’ to the high-speed statewide fiber-optic network).

=> Joint proposal activities with SSERCA group or individual SSERCA member(s) to federal agencies and state legislature. Internet2 / FLR consultations.

Example: April 2012: NSF proposal by FSU and UF to increase bandwidth of FSU network (activation of ‘dark fiber’ with new switches) and improved on-ramp to UF network (successful request for $500,000).

Campus Data Center:

− Data analysis, management, and storage in a centralized location (‘big data’ research).

− Examples: pharmacy, chemistry, biology, physics, computer engineering, material science, School of Business & Industry, allied health, social & behavioral sciences, media & communications.

− Data procurement and archives (library sciences).

− Typical / targeted data center sizes at SSERCA member institutions: UCF, USF, FSU, FIU, FAU, U Miami: 2 PetaBytes (PB). UF: 20 PetaBytes.

=> Joint proposal activities with SSERCA group or individual SSERCA member(s) to federal agencies and state legislature.

Example: April 2013: NSF proposal ‘XDESE’ by SSERCA for six data storage centers to support XSEDE (request for 6 million dollars).

Cloud Computing Access:

− Important flexible and adjustable low-cost solution to distributed computing needs.

− Serve many non-traditional users, e.g. School of Business & Industry, allied health, social & behavioral sciences, media & communications.

− Easy to expand educational offerings, outreach and support services to the public and the local small business community.

=> Joint proposal activities with SSERCA group or individual SSERCA member(s) to federal agencies and state legislature.

Example: October 2013: Request to FAMU Office of Research to support NSF proposal in cloud computing by SSERCA member Florida Atlantic University with FAMU, UCF and FIU as satellite sites managed by FAU (request for 3 million dollars).