Introduction Computer Science
Henri Bal
Vrije Universiteit Amsterdam
Goals of this course
● Understand typical Computer Science topics
● Meet with students and some staff members
● Develop skills:
● Reading (English) scientific literature
● Critical/analytical thinking about CS topics
● Discussing
● Presenting
● Scientific writing
Structure
● Tuesdays: guest lectures
● 2 scientific papers provided as context
● Questions made up by lecturers beforehand
● Working groups
● 2 students per group present a paper
● Each group discusses both papers + questions
Topics (Tuesday lectures)
● Intro & high-performance computing (Henri Bal)
● e-Health (Aart van Halteren)
● Watson (Chris Welty, with LI & IMM students, on Monday 19 Sept)
● Finding & reading scientific literature
● Bioinformatics (Jaap Heringa)
● e-Science infrastructures (Cees de Laat)
● Luggage handling at Heathrow Terminal 5 (Huub van der Wouden, with IMM students)
Working Groups
● Supervised by staff members (instructors)
● First meeting:
● Instructors give introduction and schedule presentations
● Other meetings:
● Students present/discuss papers
● Course material + working group composition will be made available on Blackboard (bb.vu.nl)
Your tasks
● Attend Tuesday lectures
● Send brief answers to questions before workgroup deadline
● Give 1 presentation in a working group
● Make slides, talk for 10-15 minutes
● Participate in working group discussions
● Write 2-page paper on 1 topic of the course
● Use (find!) 2 extra publications in the literature
● Grading: pass or fail (no numerical grade)
● Participation, paper, & presentation must be “sufficient”
Relevance to the CS program
● Illustrates what academic research in Computer Science is about
● Gives a selection of current research topics
● Many topics come back later in our Bachelor or Master programs:
● eScience Infrastructure → Networking course
● eHealth → Ba Lifestyle Informatics
● HPC → MSc track Parallel Computing Systems
● Watson → Artificial Intelligence courses + MSc
● Bioinformatics → Bioinformatics MSc program
First presentation
● My personal view on Computer Science
● Why is Computer Science so interesting?
● Biased towards my own research area:
● High performance distributed computing
Computer Science (CS)
● CS sits between technology and applications, both of which are undergoing turbulent developments
● Processors, networks, mobiles, wearables, …
● Data explosion in virtually all applications
● CS also studies many fundamental problems of its own
● Programming languages, security, AI, theory, …
Outline
● Technology
● Computers
● Some history
● High performance computers
● Modern (multicore) PCs
● Networks & mobile computing
● Applications
● Data explosion
● Computation demands
● Fundamental CS questions
Computers
● Mainframe: powerful centralized computer
● IBM 704 (1954)
● Minicomputers: under $25K, for small groups
● PDP-8, PDP-11, VAX (1960s-1980s)
● Workstations: expensive personal graphical machine
● Xerox Alto (1973)
● PCs: inexpensive machine for the masses
● IBM PC (1981)
High Performance Computers
● Computer systems with many processors, all computing in parallel
High Performance Computers (1)
● Vector machines
● Can do vector operations in parallel
● A and B: vectors (1-dimensional arrays) with 100 elements
● Computing A+B (= 100 additions) ideally takes as much time as doing 1 addition on a sequential computer
● History
● 1970s, 1980s (e.g., Cray)
● 2000s (Japanese Earth Simulator)
● 2010s (GPUs, Graphics Processing Units)
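The A+B example above can be sketched in a few lines of Python. This is only an illustration of the programming model: the vectors A and B are made-up inputs, and Python itself still executes the "vector" version one element at a time. On real vector hardware the 100 element additions would proceed together.

```python
from operator import add

N = 100
A = list(range(N))               # hypothetical input vector A
B = [2 * x for x in range(N)]    # hypothetical input vector B

# Sequential view: one addition per step, N steps in total.
C_seq = []
steps = 0
for i in range(N):
    C_seq.append(A[i] + B[i])
    steps += 1

# Vector view: the program states one whole-vector operation.
# A vector machine would perform the element additions in parallel.
C_vec = list(map(add, A, B))

print(steps)            # number of sequential additions
print(C_seq == C_vec)   # both views compute the same result
```

The point of the vector view is that the programmer describes one operation on whole vectors and leaves it to the hardware to carry out the element-wise work simultaneously.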
High Performance Computers (2)
● Massively parallel machines
● 1000s of special processors connected by a special network, all running in parallel, each doing part of the overall computations
● E.g., CM-1, CM-5, Intel Paragon, IBM BlueGene
● Connection network uses graph theory (math)
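To make the link with graph theory concrete, here is a minimal sketch of one classic interconnection topology, the hypercube. The slides do not say which topology any particular machine uses, so the choice of a hypercube, the function names, and the 8-processor size are assumptions made purely for illustration. Processors are numbered in binary, and two processors are linked when their numbers differ in exactly one bit.

```python
def hypercube_neighbors(node: int, dims: int) -> list[int]:
    """Neighbors of `node` in a `dims`-dimensional hypercube: flip one bit."""
    return [node ^ (1 << d) for d in range(dims)]


def hops(a: int, b: int) -> int:
    """Minimum number of links between a and b: the number of differing bits."""
    return bin(a ^ b).count("1")


DIMS = 3  # 2**3 = 8 processors (hypothetical machine size)

for n in range(2 ** DIMS):
    neighbors = [format(m, "03b") for m in hypercube_neighbors(n, DIMS)]
    print(format(n, "03b"), "->", neighbors)

# The farthest pair differs in every bit, so the diameter equals DIMS.
print(hops(0b000, 0b111))
```

With N processors each one needs only log2(N) links, and any message reaches its destination in at most log2(N) hops, which is why such graph properties matter when designing the network.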
High Performance Computers (3)
● Cluster computers
● Parallel machines built from off-the-shelf (commodity) PCs and networks
● Excellent price/performance ratio
● Exponential performance growth of processor speeds
● See http://www.top500.org for the 500 fastest supercomputers
Multicores & Manycores
● All PCs now have >1 compute cores
● Every PC is a parallel computer!
● Some PCs already have 48 cores
● Core count will increase to hundreds
● Intel Xeon Phi (2012): 60 Pentium-1-style cores on 1 chip, with advanced vector support
● Challenge: how to program these things?
Thinking in parallel is hard
● How to split up the work?
● Load balancing
● All cores should do the same amount of work
● Communication & synchronization
● Cores must exchange data (=overhead)
● Nondeterminism:
● A single processor always gives the same outcome
● With >1 core the outcome may depend on the order in which the cores run (a “race condition” bug)
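The nondeterminism point can be illustrated with a small, deterministic simulation. It does not use real threads. Instead, a hypothetical function `run` replays a given order of read and write steps from two cores that each try to increment a shared counter. The step encoding and all names are assumptions chosen for this sketch.

```python
def run(schedule):
    """Execute the read/write steps of two cores in the given order.

    Each core increments a shared counter in two steps:
    'r' copies the counter into the core's private register,
    'w' writes register + 1 back to the counter.
    """
    counter = 0
    register = {}
    for core, step in schedule:
        if step == "r":
            register[core] = counter
        else:  # step == "w"
            counter = register[core] + 1
    return counter


# Core 0 finishes before core 1 starts: both increments count.
serial = [(0, "r"), (0, "w"), (1, "r"), (1, "w")]

# Both cores read before either writes: one increment is lost.
interleaved = [(0, "r"), (1, "r"), (0, "w"), (1, "w")]

print(run(serial))       # 2
print(run(interleaved))  # 1
```

The same program gives 2 or 1 depending only on the order of the steps. On a real multicore machine that order is not under the programmer's control unless synchronization (for example a lock) is added, which is the communication and synchronization overhead mentioned above.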
Graphics Processing Units (GPUs)
Differences between CPUs and GPUs
● CPU: minimize latency of 1 activity (thread)
● Must be good at everything
● Big on-chip caches
● Sophisticated control logic
● GPU: maximize throughput of all threads using large-scale parallelism
● 1000s of very simple cores
[Figure: CPU layout with control logic, a few ALUs, and a cache]
Current debates
● Should we build chips with:
● Very fast/complicated (superscalar) processors?
● Hits a “power wall”: hard to increase clock frequency
● Many slower/simpler (thin) processors?
● Hard to program
● How to deal with energy consumption?
● Performance per Watt becomes key factor
Networks
● Wide area networks (WANs)
● Local area networks (LANs)
● Mobile networks
● Much more in the Computer Networks class
Wide area networks
● ARPANET
● First computer network, connecting some US sites (1960s)
● Speeds measured in kbit/s
● Internet
● Based on standardized (IP) protocol suite
● Connect everyone/everything (Internet-of-things)
● Dedicated optical networks (light paths)
● 10 Gbit/s, point-to-point
Local Area Networks
● Ethernet: developed by Xerox PARC (1974)
● Speed increased from 10 Mbit/s to 100 Gbit/s
● Cluster computers use Ethernet or faster commodity networks
● Myrinet
● Infiniband
An aside
● In Computer Science
● k(ilo) = 1024
● m(ega) = 1024²
● g(iga) = 1024³
● t(era) = 1024⁴
● p(eta) = 1024⁵
● e(xa) = 1024⁶
● This all has to do with binary numbers
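The prefixes in this aside are successive powers of 1024, and 1024 is itself 2 to the power 10, which is where the binary numbers come in. A minimal sketch that prints the table (the variable names are illustrative only):

```python
PREFIXES = ["k", "m", "g", "t", "p", "e"]

# Each prefix is the next power of 1024 = 2**10.
BINARY = {p: 1024 ** (i + 1) for i, p in enumerate(PREFIXES)}

for i, (prefix, value) in enumerate(BINARY.items(), start=1):
    # For an exact power of two, bit_length() - 1 is the exponent.
    exponent = value.bit_length() - 1
    print(f"{prefix} = 1024^{i} = 2^{exponent} = {value}")
```

Note that data rates such as Gbit/s are conventionally quoted with decimal prefixes (powers of 1000), so the power-of-1024 reading applies mainly to memory and storage sizes.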
DAS-5
● Dual 8-core Intel E5-2630v3 CPUs
● FDR InfiniBand
● OpenFlow switches
● Various accelerators
● CentOS Linux
● Bright Cluster Manager
● Built by ClusterVision
● Sites: VU (68), TU Delft (48), Leiden (24), UvA/MultimediaN (18/31), ASTRON (9)
● Sites connected by SURFnet7 at 10 Gb/s
Mobile computing
● Laptops, sensors, smartphones, tablets
● Many forms of mobile networks
● Wi-Fi (local range), Bluetooth (for pairing devices)
● 3G, 4G (lower bandwidth, high coverage)
● Ultimately: ubiquitous computing?
● Vision by Mark Weiser (1988)
● “machines that fit the human environment instead of forcing humans to enter theirs”
● Next: Internet of Things
Application developments
● There is a “data explosion” in many application areas
● Huge amounts of data (up to petabytes/year)
● Very complicated/heterogeneous data
● Demand for computing
● Model (simulate) designs on a computer
Data explosion
● Society:
● Web, social networks
● Industry, economy:
● Banks, stock markets
● Science
● LHC (“Higgs particle”)
● Data stored on a worldwide “grid”
● Bioinformatics (next generation sequencing)
● Astronomy: software telescopes (LOFAR, SKA)
Computing demands
● Computational science:
● Modeling ozone layer, climate, ocean, human brain
● Simulating galaxies
● Engineering:
● Aircraft modeling, designing F1 cars
● TVs (mostly software), embedded systems
● Games and multimedia:
● Computer chess (Deep Blue)
● Watson (Jeopardy)
● AlphaGo (Google)
● Analyzing multimedia content
● Digital forensics
● Generating movies
Pixar’s “Up” (2009)
Rendering the whole movie (96 minutes) would take 94 years on 1 PC
(4 frames per day; 1 second of film takes 6 days; 1 minute takes about 1 year)
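The rendering estimate can be checked with a few lines of arithmetic. The frame rate of 24 frames per second is an assumption not stated on the slide, but it is the value implied by "4 frames per day" and "1 second takes 6 days". The last line assumes perfect speedup across PCs, which real clusters only approximate.

```python
import math

FPS = 24              # assumed film frame rate (not stated on the slide)
FRAMES_PER_DAY = 4    # rendering speed of 1 PC, from the slide
MOVIE_MINUTES = 96    # length of the movie, from the slide

days_per_second = FPS / FRAMES_PER_DAY        # days to render 1 second of film
days_per_minute = days_per_second * 60        # days to render 1 minute of film
total_years = days_per_minute * MOVIE_MINUTES / 365

print(days_per_second)        # 6.0
print(days_per_minute)        # 360.0, i.e. roughly 1 year
print(f"{total_years:.1f}")   # 94.7

# Idealized: number of PCs needed to finish within 1 year.
pcs_for_one_year = math.ceil(total_years)
print(pcs_for_one_year)       # 95
```

Because every frame can be rendered independently of the others, this kind of workload splits across many machines with little communication, which is why movie rendering is a natural fit for cluster computers.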
Some fundamental Computer Science topics (1)
● Operating systems:
● Windows, Linux, Minix (Andy Tanenbaum)
● Programming languages and systems
● Fortran, Cobol, C, Java, Python … (thousands)
What happens if you ask a computer scientist to solve a problem?
He/she will come back 3 months later, with …
a new programming language ideally suited for solving your problem
Some fundamental Computer Science topics (2)
● Security
● Preventing/detecting attacks, privacy, etc.
● (Semantic) web technology
● Finding and reasoning about content on the web
● Cloud computing
● Store data and programs remotely, in the Cloud
Some fundamental Computer Science topics (3)
● Artificial intelligence
● E.g., machine learning (deep learning)
● Databases
● Storing and searching huge amounts of data
● Logic, modelling, graph theory, complexity
● Essential for many applications
Conclusion
● Modern Computer Science deals with hectic developments in technology and applications
● Both provide us with many research problems
● Application-driven vs technology-driven research
● There are also many fundamental CS problems