course fault tolerant systems design -...


Upload: vutu

Post on 02-Apr-2018


Page 1: Course Fault Tolerant Systems Design - Sharifhardware.ce.sharif.edu/wp-content/uploads/2013/09/Gra… ·  · 2015-06-22Course name Fault‐Tolerant Systems Design Course ID:

Course name Fault‐Tolerant Systems Design

Course ID: 40632 Credits: 3 Program: Graduate Prerequisites: ‐ Co‐requisites: ‐ Prepared by: Seyed Ghassem Miremadi

 Course Description 

Computer systems play an increasingly important role in our daily lives, and some of them are safety‐critical. Examples of safety‐critical applications are flight control, train control, avionics control, medical systems, satellites, and power plant systems. A failure in such systems may lead to catastrophic consequences. Therefore, the reliability and correct operation of these systems are of decisive importance. These systems must be able to tolerate faults and errors and continue to deliver correct results in the presence of hardware and software faults. This course provides knowledge on the design of reliable and fault‐tolerant computer systems.

Outline 

1. Why do we need fault tolerance? 

2. Applications of fault‐tolerant computer systems 

3. Basic terminology: Reliability, Availability, Safety, Maintainability, Confidentiality, Integrity, Security, Testability, Dependability

4. Basic definitions: fault, error, failure 

5. Fault characteristics 

6. Fault / Error models 

7. Fault / Error manifestation 

8. Design techniques to achieve fault tolerance: 

Hardware redundancy: TMR, NMR, etc. 

Information redundancy: parity codes, m‐of‐n codes, etc. 

Time redundancy: re‐computation, etc. 

Software redundancy: consistency checks, etc. 

9. Evaluation techniques:  

Quantitative evaluation methods: Failure rate, Reliability function, Coverage, MTTF, MTTR, MTBF, etc. 

Reliability modeling: Combinational models, m‐of‐n systems and Markov models 

Reliability estimation using the SHARPE software 

10. Estimation of failure frequency, MIL HDBK 217F 

11. Design of practical fault‐tolerant systems 

12. Some examples of fault‐tolerant systems

References

1. Elena Dubrova, "Fault Tolerant Design: An Introduction", Department of Microelectronics and Information Technology, Royal Institute of Technology, Stockholm, Sweden, 2008.

2. Johnson, B.W., "Design and Analysis of Fault Tolerant Digital Systems", Addison‐Wesley, 1989.

3. Pradhan, D. K., "Fault‐Tolerant Computer System Design", Prentice‐Hall International, 1996.

4. Trivedi, K. S., "Probability and Statistics with Reliability, Queuing and Computer Science Applications", Prentice‐Hall International, 1992.
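As an illustration of outline items 8 and 9 (TMR, m‐of‐n systems, and the reliability function), the following is a minimal sketch rather than course material. It assumes independent, identical modules with a constant failure rate and a perfect voter; all function names are hypothetical.

```python
import math


def simplex_reliability(failure_rate: float, t: float) -> float:
    """Reliability of one module with constant failure rate: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate * t)


def m_of_n_reliability(m: int, n: int, r: float) -> float:
    """Probability that at least m of n independent modules, each with reliability r, work."""
    return sum(math.comb(n, k) * r**k * (1 - r) ** (n - k) for k in range(m, n + 1))


def tmr_reliability(r: float) -> float:
    """TMR with a perfect voter is a 2-of-3 system, which simplifies to 3r^2 - 2r^3."""
    return m_of_n_reliability(2, 3, r)


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three module outputs (masks a single faulty module)."""
    return (a & b) | (a & c) | (b & c)
```

Under these assumptions, TMR improves on a single module only while the module reliability stays above 0.5: `tmr_reliability(0.9)` is about 0.972, whereas `tmr_reliability(0.4)` is about 0.352. Integrating the TMR reliability function gives an MTTF of 5/(6λ), which is lower than the 1/λ of a single module even though short‐mission reliability is higher.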


Course name Advanced Storage Systems

Course ID: 40683 Credits: 3 Program: Graduate Prerequisites: ‐ Co‐requisites: ‐ Prepared by: Hossein Asadi

 Outline 

1. Introduction to Data Storage Systems
    a. Storage History
    b. Performance trend of disk drives and microprocessors
    c. Amdahl's Law and its implications for storage systems
    d. Architecture of server‐centric storage

2. Architecture of Storage‐Centric IT Infrastructure

3. I/O Architecture & Configuration in Disk Subsystem

4. Qualitative & Quantitative Metrics in Storage Systems
    a. Throughput, response time, availability, serviceability, and scalability

5. Disk Configuration in Storage Systems
    a. RAID1, RAID10, RAID5, RAID6
    b. Read performance, write performance, and availability

6. Design of an Advanced Storage System
    a. Backend design
    b. Front‐end design
    c. Memory system design

7. I/O Flow in Storage Systems
    a. Read, write, and copy

8. Advanced Features of Data Storage Systems
    a. Remote Mirroring
    b. Instant Copies
    c. Data Migration
    d. LUN Masking

9. Cache Memory in Storage Systems
    a. Structure of cache memory in storage systems
    b. Comparison of cache memory in storage systems and microprocessors
    c. Cache replacement algorithms used in storage systems

10. Architecture of Off‐The‐Shelf Storage Systems
    a. IBM, HP, and EMC

11. Design & Implementation of SAN & NAS
    a. Storage Area Network (SAN) and Network Attached Storage (NAS)

12. I/O Techniques in Storage Systems
    a. SCSI, iSCSI, Fibre Channel, SAS

13. Design & Architecture of Emerging Technologies used in Storage Systems
    a. Architecture of NAND & NOR chips
    b. Design & architecture of Solid‐State Disk Drives (SSDs)

References

1. Storage Networks Explained: Basics and Application of Fibre Channel SAN, NAS, iSCSI, InfiniBand and FCoE, U. Troppens, R. Erkens, W. Mueller‐Friedt, and R. Wolafka, 2nd Edition, John Wiley & Sons Inc., 2009.

2. Storage Area Networks Essentials, R. Barker and P. Massiglia, John Wiley & Sons Inc., 2002.  


3. Storage Technologies and Systems, IBM Journal of Research & Development, Special issue, November 2008. 

4. Introduction to Storage Area Networks, J. Tate, F. Lucchese, and R. Moore, IBM Redbooks (eBook), July 2006. 

5. Computer Architecture: A Quantitative Approach, Third Edition. John L. Hennessy and David A. Patterson.  Morgan Kaufmann Publishers, 2003. 

6. The Holy Grail of Data Storage Management, Jon William Toigo, Prentice‐Hall, 2000.
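To make outline item 5 (RAID5 and availability) concrete, here is a small sketch of the XOR parity idea behind single‐parity RAID levels. It is a simplification under stated assumptions: one stripe, equally sized blocks, and a fixed parity position (real RAID5 rotates parity across disks). The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_stripe(data_blocks: list[bytes]) -> list[bytes]:
    """Append one parity block to the data blocks of a stripe."""
    return data_blocks + [xor_blocks(data_blocks)]


def rebuild(stripe: list[bytes], lost_index: int) -> bytes:
    """Recover the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

Because XOR is its own inverse, any single lost block (data or parity) equals the XOR of the remaining blocks, which is why such a stripe tolerates exactly one disk failure.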


Course name Advanced Design of Dependable Systems

Course ID: 40697 Credits: 3 Program: Graduate Prerequisites: Fault‐Tolerant Systems Design (40632) Co‐requisites: ‐ Prepared by: Seyed Ghassem Miremadi

Course Description 

Computer systems play an increasingly important role in our daily lives, and some of them are safety‐critical. Examples of safety‐critical applications are flight control, train control, avionics control, medical systems, satellites, and power plant systems. A failure in such systems may lead to catastrophic consequences. Therefore, the reliability and correct operation of these systems are of decisive importance. These systems must be able to tolerate faults and errors and continue to deliver correct results in the presence of hardware and software faults. This course reviews different research areas, including past and current research, and provides knowledge on the design of dependable and fault‐tolerant computer systems.

Outline 

1. Behavior, propagation, and effects of faults/errors in computer systems:

Fault classification

Data errors, program errors

Data error detection techniques

Control flow error detection techniques

Control flow checking

Watchdog processors

2. Evaluation techniques for fault‐tolerant computer systems:

Physical fault injection techniques

Simulation‐based fault injection techniques

Emulation‐based fault injection techniques

Comparison of fault injection techniques

Probability techniques to analyze fault injection results

Estimation techniques for fault coverage

3. Fault‐tolerant techniques in microprocessors

4. Dependability in embedded systems

5. Dependability in networks‐on‐chip (NoCs)

6. Dependability in computer networks

7. Dependability in embedded systems

8. Dependability in distributed systems

9. Dependability in real‐time systems

10. Dependability in e‐commerce

References

1. Selected Papers.

2. Pradhan, D. K., "Fault‐Tolerant Computer System Design", Prentice‐Hall International, 1996.

3. Trivedi, K. S., "Probability and Statistics with Reliability, Queuing and Computer Science Applications", Prentice‐Hall International, 1992.


Course name Advanced Microprocessor

Course ID: 40722 Credits: 3 Program: MSc H/W Architecture Prerequisites: ‐ Co‐requisites: ‐ Prepared by: Amir Hossein Jahangir

 Outline 

1‐ Introduction (Definition of superscalar processors, static and dynamic scheduling, pipeline architectures, modern processor characteristics and their ISA: Instruction Set Architecture)

2‐ Description of Scoreboarding and Tomasulo algorithms in CISC processors

3‐ Branch prediction techniques, speculative execution.

4‐ VLIW architectures (+ predicated execution), Memoization and Value prediction.

5‐ Multiprocessing issues in modern processors (cache consistency protocol, and arbitration mechanisms + case study: Pentium)

6‐ Multithreading techniques and examples.

7‐ Advanced Bus and I/O architectures in modern processors 

 

References: 

1‐ J. P. Shen, M. H. Lipasti, "Modern Processor Design: Fundamentals of Superscalar Processors", McGraw Hill, 2005.

2‐ J. Silc, B. Robic, Th. Ungerer, "Processor Architecture: From Dataflow to Superscalar and Beyond", Springer, 1999.

3‐ T. Shanley, "Pentium Pro Processor System Architecture", Addison‐Wesley, 1996.

 4‐  Several papers from literature.

 


Course name Advanced Computer Architecture

Course ID: 40723 Credits: 3 Program: Graduate Prerequisites: Co‐requisites: Prepared by: Amir Hossein Jahangir

 Outline 

1‐ Introduction  

Definition of Speedup, Efficiency, Amdahl's law, Arguments for and against parallel architectures, Classification of high‐performance architectures...

2‐ Memory system architecture for advanced computer architectures  

interleaved memory, cache 

3‐ Pipeline architecture  

Instruction and arithmetic pipeline control (collision vector, reservation stations), speeding up pipeline with delay, eliminating data dependency in recursive operations...

4‐ Vector computers 

Array processors, Pipelined vector computers, Memory stride for high bandwidth access... 

5‐ Interconnection networks and Multicomputers  

Hypercube, k‐ary cube, mesh, butterfly, pyramid...

6‐ Multiprocessors  

Analysis of run time to communication ratio, Role of interconnection network in the performance, cache consistency protocols

7‐ Software issues and speedup  

Synchronization, Communication, Code optimization for superscalar and parallel architectures... 

References 

1. Shiva S. G., "Advanced Computer Architecture", CRC Press, 2006. 

2. K. Hwang, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", McGraw‐Hill, 1992.

3. K. Hwang, Z. Xu, "Scalable Parallel Computing: Technology, Architecture, Programming", McGraw‐Hill, 1998.

4. M. Quinn, "Parallel Computing: Theory and Practice", McGraw‐Hill, 2nd edition 1993. 

5. H. S. Stone, "High‐Performance Computer Architecture", 3rd edition, Addison‐Wesley, 1993.
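The speedup, efficiency, and Amdahl's law definitions listed under outline item 1 can be sketched as follows. This is an illustrative example only; the function names are hypothetical, and the model assumes a fixed problem size with a perfectly parallelizable fraction and no communication overhead.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - f) + f / p) for parallel fraction f on p processors."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def efficiency(parallel_fraction: float, processors: int) -> float:
    """Efficiency is the speedup divided by the number of processors."""
    return amdahl_speedup(parallel_fraction, processors) / processors
```

With 90% of the work parallelizable, ten processors give a speedup of about 5.26 (efficiency about 0.53), and no number of processors can push the speedup past 1 / (1 - 0.9) = 10.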


Course name Low Power Design

Course ID: 40727 No of units: 3 Program: Graduate Prerequisites: VLSI Circuit Design Co‐requisites: ‐ Prepared by: Alireza Ejlali

 Outline 

Topic 1: Introduction: Why Low Power?, Design Objectives, Instantaneous Power vs. Average Power, Hot Spots, Barriers to Low Power Design, Power Components, Dynamic Switching Power, Dynamic Short Circuit Power, Static Power, Reverse Leakage Power, Sub‐threshold Leakage Power, Static Biasing Power

Topic 2: On‐chip Interconnects: Reduced Voltage Swing, Level Shifters, Low Power Encoding, Bus Inverting, Partitioned Bus‐Inverting, Data Compression Encoding, Transition Signaling, Limited Weighted Codes (LWC), Bus‐Inverting vs. LWC

Topic 3: Circuit‐Level Techniques: Dual‐Threshold Circuits, Design Issues in Dual‐Threshold Circuits, Dual‐VDD circuits, Static Short‐Circuit Power in Dual‐VDD, SDCVSL Converters, CVS Structure, Optimum VDDL value

Topic 4: Gate‐Level Techniques: Technology Mapping and Decomposition, Activity Estimation, Problem of Re‐convergent Fan‐outs, Input Reordering, Activity Postponement, Transistor Reordering, Concurrency and Redundancy

Topic 5: RT‐Level Techniques: Clock Gating, Clock Skew Problem in CG, Glitch Problem in CG, Operand Isolation, RT‐Level Concurrency and Redundancy, Pre‐Computation, ODC, Glitch Reduction, Block‐Level Control, Pipeline for Low Power

Topic 6: FSM: FSM Partitioning, Activity Estimation for FSMs (Using DTMC), State Encoding

Topic 7: Adiabatic Circuits: Principle of Energy Recovery, Adiabatic‐Charging Principle, Constant Current Generator, APS (Voltage Ramp), Clock‐Power Signals, Cascading Problem, Retractile Cascading, Reversibility, 8‐Phase Reversible Logic Family, 4‐Phase Reversible Logic Family

Topic 8: System‐Level Techniques: Dynamic Voltage Scaling, Dynamic Power Management (DPM), Adaptive Body Biasing (ABB), System‐level Methods in Real‐time systems.

Topic 9: Temperature‐Aware Design: Temperature Modeling, DVS, DFS (Dynamic Frequency Scaling), Fetch Gating, Clock Gating, Computation Migration.

  

References

1) Low‐Power Electronics Design. C. Piguet, Ed. CRC Press, 2004.

2) Digital Integrated Circuits: A Design Perspective, J. M. Rabaey, A. Chandrakasan and B. Nikolic, Second Edition, Upper Saddle River, NJ: Pearson Education, 2003.

3) Low Power Design Methodologies, Edited by Jan M. Rabaey and Massoud Pedram, Kluwer Academic Publishers, 2002.

4) Ultra Low‐Power Electronics and Design, Ed. Enrico Macii, Kluwer Academic Publishers, 2004.

5) Low‐Power Digital VLSI Design: Circuits and Systems, A. Bellaouar and M.I. Elmasry, Kluwer Academic Publishers, 1996.

6) Published conference and journal papers.
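The bus‐inverting technique named in Topic 2 can be illustrated with a short sketch. It is a simplified version under stated assumptions: the bus starts at all zeros, and the decision compares only the data lines and ignores transitions on the extra invert line. Function names and the default width are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words: list[int], width: int = 8) -> list[tuple[int, int]]:
    """Return (bus_value, invert_bit) pairs; a word is sent inverted when that toggles fewer lines."""
    mask = (1 << width) - 1
    previous = 0  # assumed initial state of the bus
    encoded = []
    for word in words:
        word &= mask
        if hamming_distance(previous, word) > width // 2:
            sent, invert = ~word & mask, 1
        else:
            sent, invert = word, 0
        encoded.append((sent, invert))
        previous = sent
    return encoded


def bus_invert_decode(encoded: list[tuple[int, int]], width: int = 8) -> list[int]:
    """Undo the inversion using the invert bit."""
    mask = (1 << width) - 1
    return [(~sent & mask) if invert else sent for sent, invert in encoded]


def count_transitions(values: list[int], start: int = 0) -> int:
    """Total number of line toggles when the values are driven in sequence."""
    total, previous = 0, start
    for value in values:
        total += hamming_distance(previous, value)
        previous = value
    return total
```

For the worst‐case pattern `[0x00, 0xFF, 0x00, 0xFF]`, the raw bus toggles 24 data lines, while the encoded data lines never toggle and only the invert line switches. Since dynamic switching power grows with the number of transitions, this is the saving the encoding targets.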


Course name DSP Architecture

Course ID: 40732 Credits: 3 Program: Graduate Prerequisites: ‐ Co‐requisites: ‐ Prepared by: Amir Hossein Jahangir

 Outline 

1‐ Introduction: DSP algorithms and processor architectures. 

2‐ Arithmetic algorithms used in DSP processors 

‐Number representation (Fixed point and floating point) 

‐ Adders 

‐ Multipliers 

‐ Function evaluation 

3‐ DSP processor architecture 

‐ ALU and Processing Elements 

‐ Memory organization 

‐ I/O interfaces 

4‐ DSP algorithm implementations  

‐ DFT, FFT 

‐ FIR, IIR, Decimation 

‐ DDC 

5‐ Real DSP processor architectures (from TI, Analog Devices, etc.) and applications (software‐defined radio, multimedia, etc.)

6‐ Simulation or implementation project. 

References: 

1. Architectures for Digital Signal Processing, Peter Pirsch, Wiley, 1998

2. Digital Signal Processing: A Practical Approach, E. C. Ifeachor and B. W. Jervis, 2nd Edition, Pearson Education, 2002

3. Digital Signal Processors, B. Venkataramani and M. Bhaskar, TMH, 2002
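As a small illustration of the FIR topic in outline item 4, the following sketch implements a direct‐form FIR filter as a multiply‐accumulate loop, the operation DSP processors are built to run efficiently. It assumes zero initial state (samples before the start are treated as zero); the function name is hypothetical.

```python
def fir_filter(samples: list[float], taps: list[float]) -> list[float]:
    """Direct-form FIR: y[n] = sum over k of taps[k] * samples[n - k]."""
    output = []
    for n in range(len(samples)):
        accumulator = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:  # samples before the start are treated as zero
                accumulator += tap * samples[n - k]
        output.append(accumulator)
    return output
```

Feeding a unit impulse returns the taps themselves, and taps of `[0.5, 0.5]` give a two‐point moving average.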

 

 

 



Course name Embedded System Design

Course ID: 40747 No of units: 3 Program: Graduate Prerequisites: ‐ Co‐requisites: ‐ Prepared by: Alireza Ejlali

 Outline 

Topic 1: Introduction: Embedded Systems, Application Areas, Importance of ES, Characteristics of ES, Real‐time Systems, Challenges for Embedded Systems, Design Objectives, Reactive Systems.

Topic 2: Automata‐Based Programming: Automata‐Based Programming of Reactive Systems, Hierarchical Design in ABP, Mealy and Moore ABP.

Topic 3: System Specification: Behavioral and Structural Hierarchies, State‐oriented Behavior, Exception‐oriented behavior, Concurrency, Synchronization and Communication, Readability and Flexibility, Model of Computation (MoC), Von Neumann MoC, CFSM MoC.

Topic 4: StateCharts: Design Hierarchy in StateCharts, OR Super States and AND Super States, Design Modularity in StateCharts, An Example Design, Timers and Real‐Time Systems, Simulation of StateCharts Descriptions.

Topic 5: Safety Critical Embedded Systems: Fault Containment Region (FCR), Original and Follow‐up Errors, Principle of self‐confidence, Design for Diagnosis, Man‐machine interface, anomalies, Never‐give‐up strategy.

Topic 6: Embedded System Hardware: Actuators and Sensors, A/D, D/A and Sample and Hold Circuit, Processing Units

Topic 7: Distributed Embedded Systems: Requirements, Real‐Time Communication, Robustness, Maintainability and Diagnosability, Electrical Robustness, CSMA/CD vs. CSMA/CA, CAN and TTP, Error Detection and Error Handling in CAN, Non‐Destructive Arbitration in CAN.

Topic 8: Embedded Processors: ASIC, Configurable Logic, ASIP and DSP, Micro‐Controllers, Energy Efficiency in Embedded Processors, Selection Process, Core‐Based Micro‐Controllers

Topic 9: Energy Management in Embedded Systems: DPM, DPM in StrongARM SA 1100, DVS, DVS in Crusoe and Mobile Pentium III, DPM vs. DVS

Topic 10: Memory Organization: Code‐size efficiency, Code Compression in ARM processors, Dictionary‐Based Methods, Energy/Performance Trade‐off in Cache and Memory, Scratch Pad Memory (SPM), Cache vs. SPM.

Topic 11: Energy/Reliability Trade‐off in Embedded Systems: Impact of DVS on Reliability, Impact of FT techniques on Energy Consumption, Techniques to tackle the problem

References

1) Embedded System Design, by Peter Marwedel, Springer 2006

2) Embedded Systems Design: An Introduction to Processes, Tools, and Techniques, by Arnold S. Berger, CMP Books, 2002.


3) Embedded System Design: A Unified Hardware/Software Introduction, by Frank Vahid and Tony Givargis, John Wiley & Sons, 2002.

4) Published conference and journal papers. 
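The non‐destructive arbitration in CAN mentioned in Topic 7 can be sketched as a bit‐by‐bit contest on a wired‐AND bus. This is a simplified model, not a protocol implementation: it assumes 11‐bit identifiers that are unique per node, considers only the identifier field, and uses a hypothetical function name.

```python
def can_arbitrate(identifiers: list[int], id_bits: int = 11) -> int:
    """Return the identifier that wins bitwise arbitration (dominant 0 beats recessive 1)."""
    if not identifiers:
        raise ValueError("at least one contender is required")
    if any(not 0 <= ident < (1 << id_bits) for ident in identifiers):
        raise ValueError("identifier does not fit in id_bits")
    contenders = set(identifiers)
    for bit in reversed(range(id_bits)):  # identifiers are sent most significant bit first
        # Wired-AND bus: the level is dominant (0) if any contender drives 0.
        bus_level = min((ident >> bit) & 1 for ident in contenders)
        # A node that sent recessive (1) but reads dominant (0) stops transmitting.
        contenders = {ident for ident in contenders if (ident >> bit) & 1 == bus_level}
    (winner,) = contenders
    return winner
```

The winner is always the numerically lowest identifier, and its frame is never corrupted by the losing nodes, which is what makes the arbitration non‐destructive; for example, `can_arbitrate([0x65A, 0x123, 0x124])` returns `0x123`.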


Course name System‐on‐Chip Design

Course ID: 40757 Credits: 3 Program: Graduate Prerequisites: ‐ Co‐requisites: ‐ Prepared by: Shaahin Hessabi

 Outline 

Topic 1:  Introduction (Architecture of the Present‐Day SoC, Design Issues of SoC, Hardware-Software Codesign, Codesign Flow, Codesign Tools, SoC Design Challenges, Design Methodology, IP Cores) 

Topic 2: Overview of ASICs (Methodology and Design Flow, Programmable ASICs: CPLDs and FPGAs, FPGA to ASIC Conversion, Verification)

Topic 3: Design Methodology for Logic Cores (SoC Design Flow, General Guidelines for Design Reuse, Design Process for Soft and Firm Cores, Design Process for Hard Cores)

Topic 4: Design Methodology for Memory and Analog Cores (Design Methodology for Embedded Memories, Specifications of Analog Circuits, High‐Speed Circuits)

 

Topic 5: Platform Based Design 

 

Topic 6: Multi‐Processor SoC (MPSoC) 

 

Topic 7: On‐Chip Interconnection Networks (SoC Bus Architectures, Network‐on‐Chip) 

Topic 8: SoC Testing (Digital Logic Cores, Embedded Memories, Analog and Mixed‐Signal Cores)

References

Henry Chang et al., Surviving the SOC Revolution: A Guide to Platform-Based Design, Kluwer Academic Publishers, 2002.

Farzad Nekoogar and Faranak Nekoogar, From ASICs to SOCs: A Practical Approach, Prentice Hall PTR, 2003.

Michael John Sebastian Smith, Application-Specific Integrated Circuits, Addison-Wesley, 1997.

Laung-Terng Wang, Charles E. Stroud, Nur A. Touba, System-on-Chip Test Architectures, Morgan Kaufmann Publishers, 2008.


Course name Electronic System Level Design

Course ID: 40843 No of units: 3 Program: Graduate Prerequisites: - Co-requisites: - Prepared by: Maziar Goudarzi

Outline

1. Evolution of design flow of hardware and software of embedded systems design up to Electronic System Level (ESL)
    a. Evolution of hardware modeling and design methodologies, Gate-level design, Register-Transfer Level (RTL) design, Behavioral level design, ESL design.
    b. Brief review and comparison of hardware and software description languages.
    c. Motivations for the move toward ESL design

2. Electronic System Level Design Flow
    a. Specification and Modeling, pre-partitioning analysis, partitioning, post-partitioning verification, post-partitioning analysis, software implementation, hardware implementation, implementation verification

3. Digital system specification using SystemC
    a. Introduction, history, highlights of SystemC. Design flow using SystemC, structure of SystemC models.
    b. Modules and hierarchy in SystemC: ports, signals, data storage, processes, module constructor, positional port mapping, named port mapping, hierarchical design, port connection rules
    c. Processes in SystemC: SC_METHOD, SC_THREAD, SC_CTHREAD processes and their differences, process declaration, process definition, inter-process communication using signals, simple examples, global watching.
    d. Data types in SystemC: bit-accurate data types, single-bit types (sc_bit, sc_logic), Integer types (sc_int, sc_uint, sc_bigint, sc_biguint), bit-vector types (sc_bv, sc_lv), fixed point types (sc_fixed, sc_ufixed, sc_fix, sc_ufix), resolved logic data types, tracing signal and port values, speed issues, user-defined data types.
    e. Combinational logic modeling and recommendations for synthesis, local variables vs. signals and ports, delta delay concept, logical operators, arithmetic operators, relational operators, vectors and ranges, reading from vectors, writing to vectors, if statement, switch statement, loops, methods, structures, multiple processes per module.
    f. Synthesis of SystemC models, concepts, recommendations, and pitfalls.
    g. Finite State Machine (FSM) design in SystemC.

4. A systematic methodology for hardware implementation of software programs
    a. Introduction to and significance of FSM with Datapath (FSMD) model.
    b. A classification of statements in software languages.
    c. Hardware implementation of a single basic-block using FSMD model.
    d. Hardware implementation of a complete software program using FSMD model.
    e. Introduction to Behavioral (or High Level) Synthesis (HLS).
    f. Basic topics in HLS: Allocation, Scheduling, Binding.
    g. Review of famous algorithms in HLS allocation, scheduling, and binding.

5. Hardware-Software communication mechanisms and their significance.


    a. A case study of full-software and hardware-software implementation of AES encryption and analysis of performance loss due to hardware-software communication overhead.
    b. Hardware-software and software-software communication mechanisms in uniprocessor and multiprocessors.

6. Transaction Level Modeling (TLM).
    a. Significance of separating communication from computation.
    b. The Channel concept, its features and importance.
    c. Gajski’s classification of TLM models, their components, and their usage in a top-down ESL design flow.
    d. SystemC mechanisms and features to define Channels and to model at TLM level.

7. Co-synthesis Algorithms
    a. Hardware-Software partitioning algorithms. Primal and Dual approaches, case study of two projects: Vulcan and Cosyma, performance and cost estimation techniques for hardware or software implementation of components, Simulated Annealing optimization algorithm.
    b. Multiprocessor co-synthesis. Integer Linear Programming technique to obtain the global optimal solution, a MILP model for multiprocessor co-synthesis problem, heuristic algorithms, Wolf’s heuristic algorithm for ordinary task graphs, Wolf’s heuristic algorithm for object-oriented applications.

8. Latest platforms for digital system implementation.
    a. Evolution of semiconductor technology up to System-on-Chip (SoC).
    b. Evolution of FPGAs up to Programmable SoC (PSoC) devices.
    c. Challenges and opportunities by the above evolutions.
    d. System design using PSoC devices.

9. System-level Validation Techniques.
    a. Hardware-software co-simulation techniques.
    b. Introduction to formal verification of hardware-software systems.

10. Software-level optimization techniques
    a. Process variation and its significance.
    b. A variation-aware online power management algorithm to optimize power consumption and performance of Chip Multiprocessors.
    c. Variation-aware power-yield optimization by task scheduling in Multiprocessor SoC devices.
    d. Code and data placement algorithms and their usages in system optimization.

References

1. Brian Bailey, Grant Martin, Andrew Piziali, “Electronic System Level Design and Verification,” Morgan Kaufmann Publishers, Series in Systems on Silicon, 2007.

2. W. Wolf, "Computers as Components: Principles of Embedded Computing System Design," Morgan Kaufmann Publishers, 2001.

3. J. Staunstrup, W. Wolf, "Hardware/Software Codesign: Principles and Practice," Kluwer Academic Publishers, 1997.

4. G. DeMicheli, "Hardware/Software Codesign," Kluwer Academic Publishers, 1996.

5. F. Balarin et al., "Hardware/Software Codesign: The POLIS Approach," Kluwer Academic Publishers, 1997.


Course name Reconfigurable Computing

Course ID: 40844 Credits: 3 Program: Graduate Prerequisites: VLSI Design (40353), Digital System Design (40223) Co‐requisites: ‐ Prepared by: Hossein Asadi

Outline 

1. Introduction to Reconfigurable Computing
    a. FPGA technology
    b. Logic blocks
    c. Basics of COTS reconfigurable devices: Xilinx, Altera, Lattice, and Actel

2. Design Mapping
    a. FPGA technology mapping
    b. Placement & routing algorithms considering area, delay, power, and reliability
    c. Simulated annealing, FD relaxation, and macro‐based methods

3. Architecture of Reconfigurable Devices
    a. Logic block architectures
    b. Interconnect and routing matrix architectures
    c. Design tradeoffs in a reconfigurable logic block
    d. Design tradeoffs in a reconfigurable interconnect
    e. Area, delay, power, and reliability optimization techniques using VPR toolset
    f. Architecture of the state‐of‐the‐art reconfigurable devices

4. Dynamic Reconfiguration
    a. Reconfiguration and scheduling algorithms
    b. Limitations of reconfigurable approaches
    c. Hardware support for reconfiguration

5. Reconfigurable Systems
    a. Multi‐FPGA system topologies
    b. Logic emulation using Multi‐FPGA systems
    c. Partitioning in multi‐FPGA systems
    d. Interconnect of multi‐FPGA systems
    e. Architecture of modern multi‐FPGA systems
    f. Hybrid reconfigurable systems (LSI logic) vs. FPGAs vs. processors

6. Reconfigurable Applications
    a. Arithmetic operations
    b. Systolic machines
    c. Partially reconfigurable machines
    d. Data acquisition systems

7. System Prototyping Using Reconfigurable Devices
    a. System validation & verification using prototyping

8. Advanced Topics on Reconfigurable Computing
    a. Reconfigurable co‐processors
    b. Hardwired cores in reconfigurable devices
    c. Emerging reconfigurable technologies


Textbooks

1. S. Hauck and A. DeHon, “Reconfigurable Computing: The Theory and Practice of FPGA‐Based Computation”, Morgan Kaufmann Publishing, 2008. (Main textbook)

2. C. Bobda, Introduction to Reconfigurable Computing: Architectures, Algorithms and Applications, Springer, 2007.

 

References

1. P. Lysaght and W. Rosenstiel (eds.), New Algorithms, Architectures and Applications for Reconfigurable Computing, Springer, 2005.

2. N. Voros and K. Masselos (eds.), System‐Level Design of Reconfigurable Systems‐on‐Chip, Springer, 2005.

3. N. Sherwani, Algorithms for VLSI Physical Design Automation, 3rd Edition, Kluwer Publishers, 2002.


Course name Interconnection Networks

Course ID: 40853
No. of units: 3
Program: Graduate
Prerequisites: Computer Architecture (BSc course)
Co‐requisites: None
Prepared by: Hamid Sarbazi‐Azad

  

Outline 

1. Introduction (1 session: The Evolution of Computer Architecture; Multicomputers/Multiprocessors and their Interconnection Networks (INs); Basic Definitions and Notation)

2. Topology (8 sessions: Topological Factors; Popular Topologies and their Characteristics; Complex Topologies; Embedding; Hamiltonian Properties; Combinatorial Properties)

3. Switching Methods (3 sessions: Packetization/Depacketization; Circuit Switching; Packet Switching; Wormhole Switching and VCT Switching; Mad Postman Switching; Virtual Channels; Combined Switching Techniques: Pipelined Circuit Switching, Buffered Wormhole Switching)

4. Routing Algorithms (8 sessions: Deadlock and Livelock Prevention; Deterministic Routing Algorithms in Popular INs; Partially Adaptive Routing Algorithms in Popular INs; Fully Adaptive Routing Algorithms in Popular INs)

5. Multicast Routing (6 sessions: Basic Definitions; Hardware Tree‐Based and Path‐Based Multicast Routing Algorithms; BRCP Model; Software Multicast Algorithms: Dimension Order and Dimension Ordered Chains, Software Multicast in Popular INs)

6. Performance Evaluation (3 sessions: Performance Evaluation Methods; Technological Constraints; Traffic Models; Delay Models; Discrete Event Simulation; Xmulator Package)

7. Hot Topics (1 session: New topics on interconnection networks introduced in the last two years in related conferences and journals)

  

References

1. J. Duato, S. Yalamanchili, L. Ni, Interconnection Networks: An Engineering Approach, Morgan Kaufmann, 2003. (Main source)

2. W. Dally, B. P. Towles, Principles and Practices of Interconnection Networks, Morgan Kaufmann, 2004.

3. B. Parhami, Introduction to Parallel Processing: Algorithms and Architectures, Plenum Press, 2000.

4. D. Culler, J. Singh, A. Gupta, Parallel Computer Architecture: A Hardware/Software Approach, Morgan Kaufmann, 1999.

5. Papers from the IEEE TPDS, JPDC, PC, JOIN, and IJPDEP journals (and other related journals) and the IPDPS, ICPADS, ICPP, HiPC, HPCA, and NOCS conferences (and other related parallel and network‐based conferences).

Course name On-chip Communications

Course ID:
Credits: 3
Program: Graduate
Prerequisites: VLSI, Computer Architecture
Co-requisites:
Prepared by: Somayyeh Koohi
Preparation Date: June 2013

Outline

CHAPTER 1 Introduction

1.1 Trends in System-on-Chip Design

1.2 Coping with On-Chip Interconnect Design Complexity

1.3 Course Outline

CHAPTER 2 Basic Concepts of Bus-Based Communication Architectures

2.1 Characteristics of Bus-Based Communication Architectures

2.2 Bus Topology Types

2.3 Physical Implementation of Bus Wires

CHAPTER 3 Networks-on-Chip

3.1 Network Topology

3.2 Switching Strategies

3.3 Routing Algorithms

3.4 Flow Control

3.5 Clocking Schemes

3.6 Quality of Service

3.7 NoC Architectures

3.8 NoC Status and Open Problems

CHAPTER 4 Test and Fault Tolerance for NoC Infrastructures

4.1 Test Methods and Fault Models for NoC Fabrics

4.2 Addressing Reliability of NoC Fabrics through Error Control Coding

4.3 Joint Crosstalk Avoidance and Error Control Coding

CHAPTER 5 Energy and Power Issues in Networks-on-Chip

5.1 Energy and Power

5.2 Energy and Power Reduction Techniques in NoC

5.3 Power Modeling Methodology for NoC

5.4 Energy-Reliability Trade-off for NoCs

5.5 Encoding Techniques for Power Reduction

5.6 Encoding Techniques for Reducing Power and Capacitive Crosstalk Effects

CHAPTER 6 Three-Dimensional On-Chip Communication Architectures

6.1 Three-Dimensional Integration of Integrated Circuits

6.2 The Promises and Limitations of 3-D Integration

6.3 Physical Analysis of NoC Topologies for 3-D Integrated Systems

6.4 3-D NoC on Inductive Wireless Interconnect

CHAPTER 7 Emerging On-Chip Interconnect Technologies

7.1 Optical Interconnects

7.2 RF/Wireless Interconnects

7.3 CNT Interconnects

CHAPTER 8 Silicon-on-Insulator (SOI) Photonics

8.1 Introduction

8.2 Silicon-on-Insulator Waveguides

8.3 Refractive Index in Optical Waveguides

8.4 Contributions to Loss in an Optical Waveguide

8.5 Optical Modulation Mechanisms in Silicon

8.6 Other Advantages and Disadvantages of Silicon Photonics

CHAPTER 9 Optical On-Chip Interconnects

9.1 Photonic Opportunity for NoCs

9.2 Photonic Switches

9.3 Electrically-Assisted NoCs

9.4 Wavelength Routing

9.5 All-Optical NoCs

9.6 Optical NoC: Challenges and Future

References

• De Micheli, Giovanni, and Luca Benini. Networks on Chips: Technology and Tools. Morgan Kaufmann, 2006.

• Pasricha, Sudeep, and Nikil Dutt. On-Chip Communication Architectures: System on Chip Interconnect. Morgan Kaufmann, 2010.

• Gebali, Fayez, Haytham Elmiligi, and Mohamed Watheq El-Kharashi. Networks-on-Chips: Theory and Practice. CRC Press, 2011.

• Jantsch, Axel, and Hannu Tenhunen. Networks on Chip. Springer, 2006.

• Pavesi, Lorenzo, and Gérard Guillot. Optical Interconnects: The Silicon Approach. Springer, 2006.

• Reed, Graham T., and Andrew P. Knights. Silicon Photonics: An Introduction. Wiley, 2004.

Course name Green Computing

Course ID: 40-???
Credits: 3
Program: Graduate
Prerequisites: -
Co-requisites: -
Prepared by: Maziar Goudarzi

Aim

Awareness of current mechanisms to reduce the energy consumption of ICT products.

Ability to identify and optimize ICT products/processes/mechanisms/usage scenarios from an energy consumption point of view.

Understanding of, and reflection on, the impact of the global ICT carbon footprint.

Understanding of the life cycle of ICT products and their energy impacts.

Awareness of the standards and programs related to the sustainability of ICT products.

Understanding and critique of sustainable ICT solutions.

Outline

1. Introduction to Sustainable Computer Design

a. The life cycle of ICT products

b. Phases of the lifecycle (Design, Production, Use, End)

c. e-waste

d. Life Cycle Assessment (LCA)

e. RoHS EU directive, Selection of hardware (ecolabeling): ENERGY STAR, EPEAT.

f. Energy metrics.

g. Power aware computing. Dynamic and static consumption of CPUs

h. Significance of Interdisciplinary concepts and protection of environment and resources

2. Power Management, ACPI

a. CPU, Hard disk, Graphic chipsets, Display, Network interfaces, System.

b. ACPI Specification. System, CPU and device power and performance states. ACPI hardware and software programming model.

c. Processor configuration and control: CPU voltage and frequency scaling, CPU idle modes.

d. Device configuration and control. Waking and sleeping the system. Battery management.

3. Datacenter Basics

a. Datacenter tier classifications

b. Datacenter power systems

c. Datacenter cooling systems

d. Metrics for data center efficiency.

e. Energy proportional computing.

f. Virtualization.

g. The cloud.

h. Initiatives: ENERGY STAR, EU Data Centre Code of Conduct.

4. Datacenter Power Provisioning

a. Power distribution

b. Inefficiencies in usage of the power budget

c. Power and cooling systems

d. Power estimation

e. Power usage characterization

f. CPU voltage/frequency scaling

g. Improving non-peak power efficiency

h. Power provisioning strategies

i. Techniques for power optimization in Data Centers

5. The Carbon Footprint of Cloud Computing

a. Environmental and economic costs of computing

b. Description of the supply chain

c. Environmental valuation

d. Economic evaluation of thin-clients

e. Economic and environmental evaluation

f. Constraints and challenges

6. Storage System Energy

a. Introduction to storage system design

b. Tape vs. disk storage

c. Flash-based disks

d. Phase-change memory

e. Energy usage in storage systems

f. Modeling energy in storage systems

g. Energy conservation techniques

h. Other important metrics: reliability and availability, performance and maximum throughput

i. Case study: wide-area storage

7. Energy Use of Scientific Applications

a. Energy usage in distributed systems

b. Power-performance metrics

c. Power/energy profiling

d. Single node power profile

e. Distributed power profiles

f. Power consumption pattern vs. application characteristics

g. Scheduling resources for energy-performance tradeoff

8. Applications and Energy in Mobile Phones

a. Health applications of mobile phones

b. Educational applications of mobile phones

c. Energy scavenging devices

d. Energy use of handhelds

e. Energy for value-added services (VAS) such as location-based services

9. Energy of Computer Manufacturing

a. Energy intensity of computer manufacturing

b. Mathematical methodology

c. Case study of a desktop computer

d. Uncertainty and caveats

e. Implications for environmental assessment

f. Implications for societal response

10. Smart Energy Management in Buildings

a. Energy consumption in commercial, industrial, and residential buildings

b. Static approaches to energy reduction in buildings

c. Renewable energy sources for buildings

d. ZNEB: Zero Net Energy Buildings

e. Techniques for smart energy management in buildings

11. Other advanced topics

a. Smart Grid technology and grid energy efficiency

b. Nanophotonic technology and its implications

c. Networking energy

d. Quantum computing and its effects on energy consumption

Evaluation Criteria

Midterm exam: 25%

Final exam: 35%

Assignments and project: 40%

References

1. Luiz André Barroso, Urs Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Morgan & Claypool, 2009.

2. Stefanos Kaxiras, Margaret Martonosi, Computer Architecture Techniques for Power-Efficiency, Morgan & Claypool, 2009.

3. Lorenz M. Hilty, Information Technology and Sustainability: Essays on the Relationship between Information Technology and Sustainable Development, Books on Demand, 2008. ISBN: 978-3837019704

4. T.J. Velte, A.T. Velte, R. Elsenpeter, Green IT, McGraw Hill, 2008.

5. Various online papers from well-known conferences and journals.

6. Resources gathered at the website of the IEEE Technical Committee on Scalable Computing (TCSC), Technical Area of Green Computing.