ISSN 2186-7437 NII Shonan Meeting Report No. 133 Asynchronous Circuit Design and its Applications: Past, Present and Future Tomohiro Yoneda Peter A. Beerel Alex Yakovlev Masashi Imai May 15–19, 2019 National Institute of Informatics 2-1-2 Hitotsubashi, Chiyoda-Ku, Tokyo, Japan


ISSN 2186-7437

NII Shonan Meeting Report

No. 133

Asynchronous Circuit Design and its Applications: Past, Present and Future

Tomohiro Yoneda
Peter A. Beerel
Alex Yakovlev
Masashi Imai

May 15–19, 2019

National Institute of Informatics
2-1-2 Hitotsubashi, Chiyoda-Ku, Tokyo, Japan


Asynchronous Circuit Design and its Applications: Past, Present and Future

Organizers:
Tomohiro Yoneda (National Institute of Informatics, Japan)
Peter A. Beerel (University of Southern California, USA)
Alex Yakovlev (Newcastle University, United Kingdom)
Masashi Imai (Hirosaki University, Japan)

May 15–19, 2019

1 Introduction

1.1 What is this workshop about?

Asynchronous design is the study of circuits without a global clock. It stands in contrast to synchronous design, which relies on a global clock distributed to all sequential elements, whose periodic nature provides a simplified, discretized time model of system behavior. This model assumes that the clock is distributed across the chip and arrives at all the sequential elements nearly simultaneously, providing a unified time at which all state is updated.

The problem with synchronous design is that this simplified model belies the underlying physics of modern and future technologies, for which 1) increasing wire delay and increasing variations make simultaneous arrival times increasingly difficult to achieve, 2) the economics warrant extensive re-use of modules, which have different and sometimes inconsistent clocking constraints, and 3) the interfaces to the analog world do not adhere to such a simplified model.

Asynchronous design has the potential to solve this problem of synchronous design because it does not rely on a global clock. This workshop is about such asynchronous design.

1.2 Why is this workshop held?

Asynchronous researchers have produced a range of techniques that span support for point-to-point communication between different synchronous islands that are asynchronous to each other, networks-on-chip that facilitate communication among a collection of synchronous islands, and a variety of techniques to create asynchronous islands targeting high performance, low power, low energy, or other application-specific requirements (e.g., low electromagnetic interference, resilience to soft errors, robustness to extreme process variations, support for wide operating ranges, and support for next-generation non-CMOS technologies).

For each of these domains, active research has been performed across both industry and academia, spanning the Americas, Europe, and Asia. Numerous startup companies have sought to disrupt various markets using asynchronous design, including general-purpose processors, networking chips, chips with high security and assurances, and vision processing and machine learning for the internet-of-things markets. Moreover, the problems associated with crossing clock domains and efficiently interfacing with the analog world are an evolving challenge as technology evolves and new objective functions become important.

Because the community of asynchronous researchers active in this area spans the globe, it is important to have regular activities for them to share experiences and shape the future. The IEEE Symposium on Asynchronous Circuits and Systems (ASYNC) is the leading conference in this area and is planned to occur in Japan in May, 2019. However, the symposium does not provide the opportunity for researchers to present position papers, discuss and debate the various benefits of different research directions, or plan collaborations where they make sense.

We believe that this type of interaction can be a catalyst and guide for the future of this area and that a Shonan meeting planned just after ASYNC can achieve these goals.

1.3 What has this workshop tried to solve?

The theme of this workshop is to discuss the past, present, and future of asynchronous circuits.

The past, we think, is an important element of this meeting, as many start-ups in this community have failed and the reasons why are not often discussed. Yet we as a community can learn from these lessons as the next generation of asynchronous entrepreneurs actively seeks to change the world. Moreover, the past is fundamental to research, as many older ideas become more practical as technology evolves. While past research is embodied in papers for eternity, it is not a replacement for face-to-face discussions among the old and new generations of asynchronous academics.

At present, there are at least four startups actively working on commercializing asynchronous designs, and there is significant activity in multiple large VLSI companies. Understanding the challenges they are facing and guiding them with the collective knowledge of the community will enhance the chances of their success, a success which will help the entire community. Nevertheless, one common concern that all researchers in this community may have is that the asynchronous design style is not yet popular in practical VLSI design. Indeed, it is fair to say that we are all striving towards the same goal: making asynchronous design a mainstream design style. Thus, as a main and specific theme of this meeting, we have discussed how best to change the world using asynchronous design technologies in the near future.

The future indeed provides a plethora of opportunities, as we face the slowing down of Moore’s law and growing interest in alternative computing technologies and frameworks, including new non-volatile technologies, three-dimensional VLSI integration, and superconductive technologies, as well as an explosion of interest in neuromorphic and machine learning styles of computation. All of them have the potential to be strongly enhanced under the asynchronous design style, but achieving high impact may require greater forms of collaboration than we have had in the past. We thus believe that it is very valuable for researchers who specialize in these different technologies to have in-person discussions on the common goal to change the world using asynchronous design and the best strategies to achieve this goal.

In order to achieve these goals, this workshop has provided a wide variety of researchers with an opportunity to present their past experiences, present efforts, and future vision for asynchronous designs.

2 Meeting Schedule

Check-in Day: May 15 (Wed)

• 19:30-21:00 Welcome Reception

Day1: May 16 (Thu)

• 09:00-10:00 Opening, Introduction by everybody (1 min. per person)
• 10:00-11:00 6 Talks (10 min. per person)
  Ivan, Marly, Yong, Peter, Hiroshi, Shogo
• 11:00-11:15 (Break)
• 11:15-12:15 6 Talks (10 min. per person)
  Jordi, Masashi, Ney, Scott, Andreas, Montek
• 12:15-14:00 (Lunch)
• 14:00-14:20 Photo Session
• 14:20-15:20 6 Talks (10 min. per person)
  Mika, Jens, Snorre, Jia, Huimei, Rajit
• 15:30-15:45 (Break)
• 15:45-16:45 6 Talks (10 min. per person)
  Andrew, Alex, Warren, Milos, Matthias, Hong
• 17:00-17:15 (Break)
• 17:15-18:25 7 Talks (10 min. per person)
  Ken, Tomohiro, Daniel, William, Mark, Takahiro, Naoya
• 18:45-19:45 (Dinner)
• 19:45- Discussions for tomorrow’s panels with drinks
  Peter (Chair)

Day2: May 17 (Fri)

• 09:00-10:10 Panel 1: Open source
  Chair: Alex
  Panelists: Andrew, Rajit
• 10:10-10:15 (Break)
• 10:15-11:10 Panel 2: Community involvement
  Chair: Montek
  Panelists: All
• 11:10-11:20 (Break)
• 11:20-12:00 Panel 3: Next day planning
  Chair: Peter
• 12:00-13:30 (Lunch)
• 13:30-20:45 Excursion

Day3: May 18 (Sat)

• 09:00-11:00 Panel 4: Education
  Chair: Mika
  Panelists: Peter, Rajit, Montek, Jens
• 11:00-11:15 (Break)
• 11:15-12:10 Panel 5: IP & library developer
  Chair: Scott
  Panelists: William, Jia, Ney, Jordi
• 12:10-13:30 (Lunch)
• 13:30-14:15 Panel 5: IP & library developer (continued)
  Chair: Scott
  Panelists: William, Jia, Ney, Jordi
• 14:15-15:15 Panel 6: User application
  Chair: Ken
  Panelists: Milos, Hong
• 15:15-15:45 (Break)
• 15:45-16:30 Panel 7: Next day planning
  Chair: Peter
• 18:00- (Dinner and Discussion in Lounge)

Day4: May 19 (Sun)

• 09:00-11:00 Panel 8: Tangible Next Steps
  Chair: Peter
• 11:00-11:10 (Break)
• 11:10-11:30 Wrap-up and closing
• 11:30-13:00 (Lunch and Departure)

3 Overview of Talks

There is Enough for All

Ivan Sutherland, Marly Roncken, Yong Hei
Asynchronous Research Center, Portland State University, USA

We separate self-timed systems carefully into two parts:
1) parts that act to compute and control the flow of execution, and
2) parts that store and communicate.
We use the name “Joints” for the former and “Links” for the latter.

Data enter the input end of a communication Link and arrive at its output end. Storing data and state in Links allows Joints to be pure combinational circuits, with the possible exception of arbiters.

Joints take data from FULL input Links, do their computation, and deliver results to EMPTY output Links. Joints choose which FULL Links to use as inputs and which EMPTY Links to use for outputs. This model remains valid for pure flow control with zero-width data.

Links and Joints communicate through shared variables rather than message passing. A Joint can access and “re-use” information carried over a FULL input Link multiple times, until it drains the Link. A FULL Link, when drained, changes its state to (a) not-FULL and (b) EMPTY, depending on which Link end you see. Similarly, a Joint can “re-use” an EMPTY output Link multiple times until it fills it.

Each action can be described as a guarded command. Concurrently with other Links and Joints, each Joint executes its own actions sequentially, as does each Link. Each Joint action and any resulting Link state changes visible to the Joint form one atomic action. This atomicity simplifies design, verification, and test.

A circuit we call “MrGO” in each Joint action provides the same flexibility that software debuggers get from breakpoints. MrGO can safely stop any Joint action just as a breakpoint can stop software. After stopping the system we examine its state to find flaws and bottlenecks and to characterize performance.

The Link and Joint model can serve as a universal asynchronous Register Transfer Level (RTL) model bridging high-level and low-level design.
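The guarded-command view of Joints and Links can be illustrated with a minimal sketch. It assumes a deliberately simplified software model: a Link is a one-place store with a FULL/EMPTY flag, and a Joint fires only when all of its input Links are FULL and all of its output Links are EMPTY. The names `Link`, `Joint`, `guard` and `fire`, and the adder example, are hypothetical and are not taken from the authors' tools; a real Joint may also choose a subset of its Links, which this sketch does not model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Link:
    """Stores data and state; it is either FULL or EMPTY."""
    data: Optional[int] = None
    full: bool = False

    def fill(self, value: int) -> None:
        assert not self.full, "only an EMPTY Link can be filled"
        self.data, self.full = value, True

    def drain(self) -> None:
        assert self.full, "only a FULL Link can be drained"
        self.full = False


@dataclass
class Joint:
    """Stateless action guarded by the state of the Links it touches."""
    inputs: List[Link]
    outputs: List[Link]
    compute: Callable[..., int]

    def guard(self) -> bool:
        # Enabled when every input Link is FULL and every output Link is EMPTY.
        return all(l.full for l in self.inputs) and not any(l.full for l in self.outputs)

    def fire(self) -> bool:
        """Run one atomic action if the guard holds; report whether it fired."""
        if not self.guard():
            return False
        result = self.compute(*(l.data for l in self.inputs))
        for out in self.outputs:
            out.fill(result)      # fill the EMPTY output Links
        for inp in self.inputs:
            inp.drain()           # drain the FULL input Links
        return True


a, b, total = Link(), Link(), Link()
adder = Joint(inputs=[a, b], outputs=[total], compute=lambda x, y: x + y)
a.fill(2)
fired_early = adder.fire()   # b is still EMPTY, so the guard is false
b.fill(3)
fired_late = adder.fire()    # both inputs FULL and the output EMPTY
```

Because the Joint holds no state of its own, everything needed to resume or inspect the computation lives in the Links, which is the property the abstract relies on for stopping and examining a system.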

Notes:

• defined CONFOUND concept and spoke about two business opportunities: “UNITED transport and storage”, “DATA ACTION Inc.”
• JOINT and LINK action with GO control – will be new RTL
• Guard and Command
• GO provides run-time breakpoints
• Algorithm to LINK-JOINT RTL translation
• RTL then is translated to circuits with relative timing, timing and functional verification

Useful Variations of Bundled-Data Design

Peter A. Beerel
University of Southern California, USA

Bundled-data (BD) design is one of the most well-studied sub-domains of asynchronous design, and yet new variants pose unexplored advantages and interesting challenges. Bundled-data design naturally admits fine-grain voltage scaling, but its benefits from the point of view of side-channel security have only begun to be examined. Latch-based bundled-data design is tolerant to hold times, but quantifying this benefit seems to be an open question. Minimizing the number of latches required and optimizing the clock gating for bundled-data design have only begun to be examined. One variant of bundled-data design, timing-resilient BD design, or Blade, has been shown to lead to efficient near-threshold computing, but efficient place-and-route of Blade designs with their multiple clock trees is still largely unexplored. Another variant, radiation-hardened BD design, extends the template to be resilient to soft errors caused by radiation strikes, but detailed controller and circuit optimization remains incomplete. Moreover, while mapping RTL to bundled-data design is somewhat well studied, mapping CSP to bundled-data design offers new optimization possibilities that are also unexplored.

Notes:

• Open source tool Edge
• BD Timing Resilient Design
• BD SEU resilient design
• SERAD controller design – BM tools

A Design Support Tool Set for Bundled-data Implementation

Hiroshi Saito
University of Aizu, Japan

Although asynchronous circuits have great advantages such as low power consumption, their use in real designs is very limited. One of the reasons is the difficulty of asynchronous circuit design. To solve this problem, much research focuses on asynchronous circuit design based on the design flow of synchronous circuits with commercial EDA tools. In our work, we propose a design support toolset for bundled-data implementation that automates constraint generation, timing verification, and delay adjustment. The proposed toolset supports both application-specific integrated circuit designs and field-programmable gate array designs with commercial EDA tools. In this talk, we present an overview of the proposed toolset, the result of a preliminary experiment, and the future direction of the toolset.

Notes:

• Japanese industry needs low power but engineers have no knowledge or skills
• C-to-RTL and FPGA
• Use conventional tools as much as possible

Conversion from Synchronous RTL Models to Asynchronous Ones

Shogo Semba
University of Aizu, Japan

To design asynchronous circuits easily, methods for converting synchronous circuits to asynchronous circuits have been proposed. These conversion methods convert synchronous gate-level netlists to asynchronous ones (i.e., GL conversion). However, the target synchronous netlist may include a control circuit based on a finite state machine. This is particularly likely when the synchronous circuits are obtained from high-level synthesis tools. In such a case, the current conversion methods may not work well because GL conversion targets simple pipeline structures. Moreover, GL conversion may not be suitable for the design of asynchronous circuits on commercial Field Programmable Gate Arrays (FPGAs) because the standard design entry of commercial FPGAs is a Register Transfer Level (RTL) model. Against this background, we research a conversion method from synchronous RTL models to asynchronous ones. In this talk, we introduce our method, the developed tool, and its current status.

Notes:

• We must start with RTL models, not with separate FSM for data-flow
• Aim: Sync RTL → Async BD via Model-XML files


High-Level Synthesis and Dataflow Circuits

Jordi Cortadella
Universitat Politècnica de Catalunya, Spain

FPGA-based hardware accelerators have emerged as an alternative between CPU-based software and ASICs for boosting the performance of computing-intensive functions. In this context, design automation is essential for a productive use of FPGAs and, for this reason, high-level synthesis has been gaining momentum in the last decade.

Dataflow computing was proposed in the 1970s at MIT as a paradigm to specify concurrent programs with dataflow languages based on data-driven execution models that are inherently asynchronous. Unfortunately, asynchronous design has often been considered a risky and error-prone option for engineers educated to design synchronous circuits, and the prohibitive manufacturing cost of ASICs has become a showstopper for disruptive technologies.

Recently, elastic circuits have been shown to provide promising results in FPGA-based systems. This opens an avenue of possibilities for exploiting dataflow computing and asynchrony in an environment in which both synchronous and asynchronous implementations can be experimented with at low risk. The advances in high-level synthesis can foster the introduction of asynchronous techniques in hardware acceleration.

Notes:

• Different ways to implement algorithms
• HLS goes mostly into the FPGA
• Timing models – PLL and VCO
• Opportunity – HW acceleration

QDI and SDI Design in Subthreshold Region

Masashi Imai
Hirosaki University, Japan

As VLSI fabrication technology shrinks and the amount of circuitry on a chip increases, power-saving circuits are required. In addition, circuits that are highly reliable against delay variations in an ultra-low-voltage (subthreshold region) environment are also required. Here, we have research questions: which is better, a QDI (Quasi-Delay-Insensitive) model or an SDI (Scalable-Delay-Insensitive) model implementation, and a bundled or an encoded implementation? In this talk, I will first revisit the definition of the SDI model and its implementation rule. Second, the pros and cons of QDI-model-based circuits and SDI-model-based circuits are explained. Then, the behavior of clock signals in the subthreshold region will be shown, including that the frequency of the clock signal in any part of a chip is the same as that of the source clock signal. Finally, I will show some preliminary evaluation results using 28nm process technology. We designed N-bit, 3-stage pipeline circuits with ripple-carry adders based on synchronous and asynchronous styles. They are evaluated using 28nm process technology and the Synopsys HSPICE analog simulator. As a result, it is concluded that the asynchronous dual-rail circuits have great potential to operate in environments where the supply voltage is significantly low and the temperature is high.

Notes:

• SDI – scalable delay insensitive model (coming from 1997 – Takashi Nanya)
• New delay model – accounts for delay-variation
• Diagrams in Vdd vs Temperature operation for sync and async

QDI Circuits: Efficient Templates and Design Techniques

Ney Laert Vilar Calazans
Pontifical Catholic University of Rio Grande do Sul, Brazil

When I entered the field of asynchronous circuits in 2006, I imagined it would be easy to collect example circuits, tools and libraries. It was not. Apart from some tools like Petrify, 3D and Minimalist and some example circuits packed with them, there was nothing there. To design and fabricate asynchronous circuits we are thus used to building everything from scratch, and this could be made much better, helping others to learn, use and join us in doing asynchronous design.

For some time, we have learned and taught how asynchronous circuits work, building up libraries based on a design flow called ASCEnD (Moreira’10, SOCC’11), analyzing the structure of fundamental blocks like C-elements, NCL gates and MTNCL gates (ICCD’13, SBCCI’17) and proposing new ways to build cells (ASYNC’14-1, ICECS’11) and solve specific design problems. From a set of design blocks, we now explore logic templates and their utility to effectively build asynchronous circuits. Examples are DIMS/WCHB, NCL, NCL+ (SBCCI’12), SDDS-NCL (ASYNC’14-2), VELO and SDDS-Velo (GUAZZELLI’17).

The above bottom-up strategy can be coupled with design capture techniques, available and/or under proposition (ICECS’17), to provide a full set of aids for asynchronous design, especially for QDI design. QDI design for near and/or sub-threshold operation is a major goal of my current research. The SDDS-VELO template (GUAZZELLI’18) seems to be a good starting point. Derived from MTNCL/SCL design, it has low area overhead (compared to NCL) and presents opportunities for robust operation under very low supply voltages. IoT edge nodes are a target application field.

Notes:

• In 2006 there was nothing in terms of tools for async design and fab
• Developing cells – started with C-elements
• Cell Libraries – cooperation with Silvaco and Nangate
• Used in Universities already
• Async design templates
• SDDS-VELO template patented

Verification of QDI Circuits

Scott C. Smith
North Dakota State University, USA

Validation is a critical part of any commercial design cycle, and formal verification establishes correctness using mathematical proofs, which is extremely useful for ensuring design correctness and finding corner-case errors that often escape traditional testing. One of the more popular formal verification approaches that has been found to be extremely scalable and useful in digital design is equivalence checking, which can check, with a high degree of automation and efficiency, that the golden model (i.e., the design that has been extensively validated) and its derivative are functionally equivalent. There are a number of commercial equivalence checkers for synchronous circuits, including IBM Sixth Sense, Jasper Gold Sequential Equivalence Checker, Calypto SLEC, MishchenkoEBCCS13, and Cadence Encounter Conformal Equivalence Checker; however, there are currently no commercial equivalence checkers for QDI circuits. This talk will discuss an equivalence checking methodology for QDI circuits, including NCL, MTNCL, and PCHB, which guarantees both safety (i.e., functional correctness) and liveness (i.e., deadlock freedom). The methodology automatically transforms the QDI circuit into an equivalent synchronous circuit, which is then compared against the specification synchronous circuit, using WEB refinement as the notion of correctness. The converted synchronous circuit, specification synchronous circuit, and the WEB refinement property are automatically encoded in the Satisfiability Modulo Theory Library (SMT-Lib) language, and the resulting equivalence property is checked using an SMT solver. Additional checks (e.g., no illegal states, handshaking correctness, input-completeness, observability) must also be performed to ensure that the QDI circuit is live. Thus, the overall verification method has three high-level steps: (1) conversion from QDI to synchronous; (2) verification of the converted synchronous circuit against the specification synchronous circuit; and (3) additional checks on the original QDI circuit to ensure liveness. The methodology can also be used to check the equivalence of two QDI circuits by applying the conversion technique to both QDI circuits to obtain two corresponding synchronous circuits, functionally verifying these two synchronous circuits against each other, and performing the additional liveness checks on both QDI circuits.
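To make the conversion step more concrete, the following minimal sketch models an NCL-style threshold gate with hysteresis next to a stateless Boolean stand-in, and checks by enumeration that the two agree whenever the gate starts from the all-zero (NULL) state. This is only an illustration of why such a replacement can be sound under a proper NULL/DATA alternation; it is not the methodology's actual tool flow, which encodes the circuits in SMT-Lib. The names `ThresholdGate`, `boolean_equivalent` and `agrees_after_null` are hypothetical.

```python
from itertools import product


class ThresholdGate:
    """THmn gate: sets when at least m of n inputs are 1, resets only when all are 0."""

    def __init__(self, m: int, n: int) -> None:
        self.m, self.n, self.out = m, n, 0

    def step(self, inputs) -> int:
        assert len(inputs) == self.n
        ones = sum(inputs)
        if ones >= self.m:
            self.out = 1
        elif ones == 0:
            self.out = 0
        # otherwise: hysteresis, the previous output is held
        return self.out


def boolean_equivalent(m: int, inputs) -> int:
    """Stateless stand-in that keeps only the set condition of the gate."""
    return int(sum(inputs) >= m)


def agrees_after_null(m: int, n: int) -> bool:
    """Starting from NULL (all zeros), the gate matches its Boolean stand-in."""
    for data in product((0, 1), repeat=n):
        gate = ThresholdGate(m, n)
        gate.step((0,) * n)                      # NULL wavefront resets the gate
        if gate.step(data) != boolean_equivalent(m, data):
            return False
    return True


th22 = ThresholdGate(2, 2)
trace = [th22.step(v) for v in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]
```

The trace shows the hysteresis: the output rises only at `(1, 1)` and stays high through `(0, 1)` until both inputs return to zero.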

Notes:

• Verification is a must for all companies
• Previous work: bounded delay circuits, deadlock detection, modeling QDI as a transition system: NCL using WEB refinement with stuttering and PCHB using model checking
• NCL verification methodology: Functional – replace each NCL gate with normal Boolean, SMT-Lib; Invariant check – check rail0 is invert(rail1); Liveness – handshaking, input-completeness and observability
• Applied to MT-NCL and PCHB

Fault Modeling and Fault Masking in Asynchronous Circuits

Andreas Steininger
Vienna University of Technology, Austria

We start from the observation that asynchronous circuits, in particular QDI circuits, have a very robust timing behavior, which is an attractive property for implementing resilient circuits. In the value domain, their resilience can be assessed through their inherent ability to mask faults. In the case of transient faults, we are interested in comparing the diverse masking effects with the synchronous world. Here, temporal masking in particular seems to make a key difference. We want to study and experimentally evaluate that. The vision is to come up with a very generic understanding of masking effects in asynchronous circuits that allows for quantitative comparisons of implementation options and for targeted hardening of circuits. In the case of permanent faults, we are interested in better understanding and potentially leveraging the known “fail-stop” behavior of asynchronous (QDI) circuits for fast repair and recovery. Here a key challenge is to identify those cases where the fail-stop assumption does not hold. For both aims we need to select useful target circuits and design styles, and we need models that allow investigating these circuits on a level detailed enough to capture all relevant effects, but abstract enough to make the fundamental relations visible. I will be happy to receive inputs and thoughts about these questions.

Notes:

• Time domain agreed for async but what about value domain? (cf. SET tolerance)
• Leverage for permanent faults
• A lot in the state of the art, but how to get systematic? Joint work with Milos (ENROL)
• Masking! Electrical, logical, temporal, matching bubbles and tokens / completion detection
• Approach: set of rep functional blocks – what is representative? what is appropriate analytical model?

Camera Sensors

Montek Singh
UNC Chapel Hill, USA

Notes:

• N photons per time T – why is time T fixed? N varies from 10 to 50K
• Give each pixel its own time – fix the N and vary the time.
• Astronomer’s challenge – 1-5 photons – noise level?
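The contrast between a fixed exposure time and a fixed photon count can be sketched numerically. The sketch below assumes an idealized pixel that receives photons at a constant rate with no noise; the function names and the two example rates are hypothetical and only chosen so that the fixed-time counts span the 10 to 50K range mentioned in the notes.

```python
def photons_in_time(rate_per_ms: float, t_ms: float) -> int:
    """Fixed-time exposure: photons collected by a pixel in t_ms."""
    return int(rate_per_ms * t_ms)


def time_to_n_photons(rate_per_ms: float, n: int) -> float:
    """Fixed-count exposure: time in ms a pixel waits to collect n photons."""
    return n / rate_per_ms


dim, bright = 1.0, 5000.0        # photons per ms (illustrative values)

# Conventional scheme: same T for every pixel, N differs by orders of magnitude.
counts = (photons_in_time(dim, 10.0), photons_in_time(bright, 10.0))

# Per-pixel timing: same N for every pixel, each pixel reports its own time.
times = (time_to_n_photons(dim, 100), time_to_n_photons(bright, 100))

# Either way, intensity is recovered as N / T.
rates = tuple(100 / t for t in times)
```

In the fixed-count scheme every pixel gathers the same number of photons, so the measured quantity becomes a time interval, which is the kind of event-driven measurement that does not depend on a shared exposure clock.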

Subjects:

• Teaching and Learning students – async in one semester!
• Research grants – we don’t get enough funds to develop async tools!
• Startups
• Penetration into existing companies

Teaching Asynchronous

Mika Nystrom
University of Southern California, USA

Notes:

• Work at Intel for 20 years – showed some ’dark side’
• Async is looked at as esoteric at Intel
• Teaching: specification → decomposition → process graph design

Asynchronous in Mesochronous (Timing Analysis). Teaching and Tools: Where are We?

Jens Sparsø
Technical University of Denmark, Denmark

The first issue is CAD tools.

I have followed the field for 30 years and have seen tools come and go. When I teach asynchronous design to graduate students I can’t really point them to tools except perhaps Petrify/Workcraft for synthesizing speed-independent controllers, should they need such control circuits in their design. Could our workshop produce an updated list of what CAD tools are available? What tools could/should I use in the next edition of my class on asynchronous circuit design?

The second issue is timing analysis.

At DTU, we have developed an asynchronous time-division-multiplexed network-on-chip (called Argo) that we use in our real-time multi-core platform (called T-CREST). The timing organization of this multi-core processor allows: (a) individually clocked processors, (b) mesochronously clocked network interfaces, and (c) an asynchronous network of routers and links that connects them. In every clock cycle, every network interface reads one token from the asynchronous network and writes one token into the asynchronous network. Tokens can carry valid data or voids, and in this way the circuit mimics a global clock. The operation is similar to Mark Greenstreet’s STARI, generalized to a structure with multiple input and output ports. The advantage is that the design offers time elasticity without any synchronizers. We believe the scheme can be used in many other contexts.

The correct and safe operation requires that the interfaces to the asynchronous network are able to complete a handshake cycle in less time than the duration of a clock period. This timing analysis problem was addressed by Hulgaard et al. in the 1990s and more recently in 2018 by Hua and Manohar. Questions: Are the timing models of the circuit realistic? Is it enough to focus on the steady-state operation? Are the developed algorithms implemented in tools? Etc.
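The timing requirement stated above can be written as a one-line check. The sketch below assumes the handshake cycle time can be approximated as the sum of the delays around the handshake loop; the function names and the delay values are hypothetical, and a real analysis, as the questions above suggest, needs a far more careful timing model.

```python
from typing import Iterable


def handshake_cycle_ns(loop_delays_ns: Iterable[float]) -> float:
    """Approximate one handshake cycle as the sum of delays around the loop."""
    return sum(loop_delays_ns)


def meets_clock(cycle_ns: float, clock_mhz: float) -> bool:
    """True when the handshake cycle is strictly shorter than one clock period."""
    period_ns = 1000.0 / clock_mhz
    return cycle_ns < period_ns


cycle = handshake_cycle_ns([1.2, 0.9, 1.1, 0.8])   # illustrative delays in ns
ok_at_200 = meets_clock(cycle, 200.0)   # clock period is 5 ns
ok_at_400 = meets_clock(cycle, 400.0)   # clock period is 2.5 ns
```

With these made-up numbers the interface keeps up with a 200 MHz clock but not with a 400 MHz one, which is the kind of bound an automatic verification step would have to establish for every network interface.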

Notes:

• Material is based on the book published 17 years ago
• Do structural design using Data Flow components
• Timing organization in multi-core; Core-NI-NoC-NI-Core; Network itself is scheduled at strictly time-shared tokenised slots. Mesochronous. Network should run faster than local clocks. Automatic verification is needed.


Asynchronous Ultra Low Voltage / Low Power CMOS - What and Why, but not much about How.

Snorre Aunet
Norwegian Univ. of Science and Technology, Norway

Notes:

• RISC implementation – 90% reduction in energy/operation down to 300 mV
• 32-bit adder logic was possible at 84 mV for static CMOS (80 nm)
• Sacrifice – doubling the area
• Delays vary by orders of magnitude due to temperature and process variations when operated in subthreshold

Applications of QDI Circuits

Jia Di
University of Arkansas, USA

With multi-rail encoding and 4-phase handshaking, QDI circuits have some unique advantages and drawbacks, which make them suitable for a variety of niche market applications. This presentation introduces our previous and ongoing QDI circuit development work on several such applications.
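As a reminder of what multi-rail encoding with 4-phase handshaking looks like, here is a minimal sketch for the dual-rail case. It assumes the common convention that each bit is carried on a `(rail0, rail1)` pair, with `(0, 0)` as the NULL spacer and `(1, 1)` illegal; the rail ordering and all function names are assumptions made for illustration, not details taken from the talk.

```python
NULL = (0, 0)


def encode(bit: int):
    """Dual-rail encode one bit as (rail0, rail1)."""
    return (1, 0) if bit == 0 else (0, 1)


def is_data(word) -> bool:
    """Completion detection: every bit has exactly one rail asserted."""
    return all(r0 + r1 == 1 for r0, r1 in word)


def is_null(word) -> bool:
    """Spacer detection: every rail of every bit has returned to zero."""
    return all(pair == NULL for pair in word)


def decode(word):
    assert is_data(word), "decode is only defined on a complete DATA word"
    return [r1 for _, r1 in word]


def four_phase_transfer(bits):
    """One transfer: DATA, acknowledge high, NULL spacer, acknowledge low."""
    log = []
    word = [encode(b) for b in bits]
    ack = 1 if is_data(word) else 0      # receiver sees complete DATA
    log.append(("DATA", ack))
    received = decode(word)
    word = [NULL] * len(bits)            # sender returns all rails to zero
    ack = 0 if is_null(word) else 1      # receiver sees complete NULL
    log.append(("NULL", ack))
    return received, log


received, log = four_phase_transfer([1, 0, 1])
```

Since validity is carried by the data rails themselves, the receiver needs no separate timing reference to know when a word has arrived.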

Notes:

• Don’t compete with sync – identify applications/scenarios where async has unmatched advantages
• Robust circuit operation: flexible timing, wide temperature range, aggressive Vdd scaling, building with emerging devices; circuit stacking
• High energy efficiency
• No clock: distributed switching activities, 3D IC, power balancing for security
• DR handshake – Rad hard

Opportunity and Challenge of Latch-based Designs

Huimei ChengUniversity of Southern California, USA

Latch-based designs have many benefits over their flip-flop based counterpartsbut have limited use partially because most RTL specifications are flop-centricand automatic conversion of FF to latch-based designs is challenging. Mostconventional flows convert the FF-based designs into pulsed-latch designs or two-phase latch-based designs controlled by either master-slave clocks or bundled-data asynchronous controllers.

Pulsed-latch schemes are an intermediate approach that lies between latchand flip-flop based designs, however, are subject to hold problems and pulsewidth variations. Whereas two-phase designs are inherently more robust thanpulsed-latch designs, multi-phase latch-based designs can sometimes be an at-tractive alternative. In this talk, I will show some details of a novel conversionalgorithm targeting 3-phase bundled-data latch-based designs, evaluate its ben-efits, and articulate related remaining research challenges.

Notes:

• Desynchronization flow: convert to latch-based, and add async controller
• 3-phase latch-based design: replace Master-Slave doubling with something like Master-Master-Slave
• Challenge – clock tree power in multiphase clocking
• Idea: Combine retiming with clock gating

Teaching: from Principles to Tapeout in a Semester

Rajit Manohar, Yale University, USA

Where are the asynchronous designers? Where are the interesting asynchronous chips? Anyone that has tried to adopt asynchronous circuits for their application knows how hard it is to find qualified designers. We need a different way to educate students in asynchronous design that goes beyond a specialized graduate course. As an experiment, I tried to teach the classic “Mead and Conway” class, except students were asked to design asynchronous circuits rather than synchronous ones. I will report on the results of this experiment.

Notes:

• It’s not just async! Many sync families exist. But abandoned!
• Will Zwaenepoel : we make things complicated! Fabricated complexity!
• Companies just do work that works; they ignore research
• Feynman’s quote: “I couldn’t explain it to the freshman level. This means we don’t really understand it”

• Experiment: a Mead and Conway async class. Used new MOSIS schedule.

CAST+CSP+Proteus

Andrew Lines, Intel Strategic CAD Labs, USA

Notes:

• CAST + CSP + Proteus = Synthesis, Place, and Route for QDI, BD, or SHS Circuits
• 2-input C-element with 12 transistors (van Berkel’s) is used.
• 3-input C-element can be done in 12T – see Wikipedia on C-element
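For readers less familiar with the C-element referred to in these notes, the following is a minimal behavioral sketch in Python: the output rises when all inputs are high, falls when all inputs are low, and otherwise holds its previous value. The `CElement` class is a hypothetical illustration and says nothing about the transistor-level implementations (such as van Berkel’s) discussed in the talk.

```python
class CElement:
    """Behavioral model of a Muller C-element with any number of inputs."""

    def __init__(self, initial: int = 0) -> None:
        self.out = initial  # the state-holding output

    def update(self, *inputs: int) -> int:
        """Apply new input values and return the resulting output."""
        if not inputs:
            raise ValueError("a C-element needs at least one input")
        if all(inputs):
            self.out = 1      # all inputs high: output goes high
        elif not any(inputs):
            self.out = 0      # all inputs low: output goes low
        # Mixed inputs: the output keeps its previous value.
        return self.out
```

The hold behavior on mixed inputs is what makes the gate useful for joining handshake signals: the output only changes once every input has agreed on the new value.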

Asynchronous control for Analog-Mixed Signal

Alex Yakovlev, Newcastle University, UK

For years asynchronous design was considered in the context of constructing digital systems of ever increasing complexity. At the same time, in parallel with these developments, there has been significant evolution of analog electronics. In recent years, due to the trends of the Internet of Things and pervasive IT, we now witness new types of electronics, such as autonomous sensors, where the boundary between digital and analog parts is blurred. These are exemplified in the needs of power-compute codesign, which aims to look holistically at the issues of power efficiency and sustainability. Moreover, in what used to be primarily analog devices, such as power converters, the emergence of digital processing is evident, for example in multi-phase bucks. In this talk we will explore how asynchronous logic helps interfacing analog and digital subworlds, by improving the quality of the overall mixed-signal system – power-compute efficiency, response time to events, size of components, to name but a few. Entirely new computational paradigms based on time-based data representation become possible to drive new applications in the sphere of machine intelligence and 5G communications. The new wave of asynchronous design for analog is being paved with the help of tools such as those under the Workcraft.org toolkit.

Notes:

• Target: to develop an EDA tool for the design of little digital or mixed-signal systems with asynchronous control and power modulated computing to achieve energy autonomy
• Tool support: Workcraft.org
• Newcastle’s power-proportional, fully self-timed CPU: Intel 8051 in 130nm CMOS (2013)

• Self-timed SRAM chip: UMC CMOS 90nm (2010)

Specification and Verification of Link-Joint-Style System Designs

Warren A. Hunt, Jr., University of Texas, USA

Notes:

• Boxing designs without crossing wires
• Layering the design is a better way
• You can even sometimes cut the design somewhere in the middle
• Different views (and reps) of designs can be useful for verification
• What’s the biggest IP in the world now – x86 code!

Fault Tolerant Asynchronous Design

Milos Krstic, IHP-Leibniz-Institut fur Innovative Microelektronik, Germany

Asynchronous circuit design is known for its robust operation. The respective circuits can operate under wide operating conditions (voltages, temperatures), which makes them suitable for reliability-critical applications. An additional feature frequently required for such applications is fault tolerance against transient or even permanent faults. There have been several studies exploring the usage of asynchronous logic in the fault tolerance domain. In this context the major targets were either low-power applications, where timing faults play an important role, or space applications, where soft errors appear in registers and logic due to radiation effects. Since circuits today can be utilized in complex environments and scaled technologies are increasingly error prone, new synergistic design methods are required when we aim for error resilience. In this talk some ideas and initial results will be discussed addressing both upsets/transients and timing faults in asynchronous logic.

Notes:

• Where are significant benefits: applications which are async per se; in-memory computing for spiking neurons; trigger signals in mixed-signal designs (designers don’t want a clock in high-speed circuits)
• Indirect benefits: EMI, security
• Are async designs FT? What about SET, SEU?
• Control path sensitivity

Subjects:

• SETs can be detected by protocol monitors
• Async design can be stopped locally at short time
• Datapath sensitivity
• Memory protection against SEUs: DICE, TMR, HIT FFs; ECC codes for SRAM, NVM
• ENROL project – collaboration with TU Vienna
• Expectations from Shonan: better connection with async community; tool exchange; common benchmarks
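Of the SEU protection techniques listed above, triple modular redundancy (TMR) is the simplest to sketch in software. The example below is a generic bitwise majority voter over three copies of a stored word, with a helper that models an upset as a single bit flip; both function names are hypothetical, and the code is not drawn from the ENROL project or from the talk.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, position: int) -> int:
    """Model a single-event upset by inverting one bit of a stored word."""
    return word ^ (1 << position)
```

A usage sketch: with `stored = 0b10110010`, the call `majority(stored, flip_bit(stored, 3), stored)` returns `stored` again, because the two intact copies outvote the corrupted one. If the same bit is upset in two of the three copies, the voter returns the corrupted value, which is why TMR is usually discussed together with scrubbing or other complementary measures.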

Gradient Clock Generation

Matthias Fugger, CNRS & LSV, ENS Paris-Saclay & Inria, France

Notes:

• They have similar things – Go signal, they may have handshakes or VCOs
• Sync and async – not much difference
• Phase difference is between adjacent blocks
• Topic 1: Gradient-clock synchronization algorithms
• Topic 2: discrete or continuous decision – controller that transforms continuous domains. Discrete can be easier but you still have to use synchronizers. Alluding to MC-circuits
• Topic 3: Circuits that work in distributed environments – microbiological circuits

Asynchronous Energy-Efficient CNN Accelerator Design with Reconfigurable Architecture

Hong Chen, Tsinghua University, China

Notes:

• Low power design for APP detection system; MCU subTh
• Click based MCU design
• Async CNN accelerator

• ASIC design – 3x energy efficiency (GOPs/W) reported
• IBM and Intel had async Spiking NNs in 2015 and 2016

Breaking Through the Inertia: the Force to Move Asynchronous Design into the Mainstream

Ken Stevens, University of Utah, USA

Notes:

• Define the Goals: can we unite on a vision? Technology, visibility, products; Model: open source/patent rights; 3-5 ambitious yet realistic goals
• Drill down to Forces: why change? What to change and how to change?
• Leadership: create statement and governing body
• Anti-dream-team philosophy; pick teams by personality – right players rather than best players
• Bold and Achievable initiative: Digital h/w is as easy to write as Digital s/w – TIMING is the key aspect
• 1956 – FORTRAN; 1958 – Clocked design
• S/w progressed since then, but FSMs are still clocked
• Relative Timing

Coarse Grained vs. Fine Grained Architectures for Asynchronous Reconfigurable Devices

Tomohiro Yoneda, National Institute of Informatics, Japan

Currently, FPGAs are widely used in many applications, and it’s possible to implement very large circuits on them. One drawback in implementing a large circuit on an FPGA is that it becomes more and more difficult to satisfy the timing constraints related to clock signals. Furthermore, some applications require circuit blocks with different clock frequencies. For these reasons, the GALS (Globally Asynchronous Locally Synchronous) approach is useful also for FPGAs. In this work, we consider a reconfigurable asynchronous component based on a coarse grained architecture for implementing a low-overhead communication mechanism for GALS systems. Its element block is based on the MOUSETRAP pipeline stages and the configurable logic blocks are specialized for implementing communication functions. In this talk, I will show some details of the proposed architecture, and evaluate its performance comparing it with a conventional fine grained FPGA architecture.

Notes:

• Problems in current FPGAs: clock to distribute in the entire chip; multi-clock solutions may be possible
• Examples in state of the art: GALS FPGA; Achronix – fine grain; Coarse grain – CGRA
• Proposed – more flexible than GALS, use coarse grained async
• More specialized for comms than for GALS

• Separate data path from control path in interconnect
• Almost completed now. Evaluation is coarse grained vs fine grained; async FIFO, crossbar, async router; 130nm technology
• Throughput comparison – everything about 3x against fine-grained
• Future: Domain specific FPGAs – e.g. Deep learning accelerators

Correct-by-Construction Hardware Synthesis

Daniel Zimmerman, Galois, Inc., USA

Notes:

• About Galois: est. 1999 to apply functional programming to information assurance; about 100 people working on more than 40 projects for both US government agencies (NSA, DARPA, DHS, AFRL) and commercial companies (Amazon, MS); “high assurance everything”, but until recently, mostly software.

• Cryptol is Galois’s core cryptographic specification and verification tool, and oldest project; in 2006, spun off a company to synthesize cryptographic algorithms from Cryptol specifications to FPGA implementations, but this effort failed due to the 2007-08 financial crisis.

• In 2015, implemented an asynchronous ASIC with 3 ciphers (AES, Simon, Speck) by synthesizing SystemVerilog-CSP from Cryptol, as part of GULPHAAC: Galois Ultra Low Power High Assurance Asynchronous Cryptography.

• This year, started a new DARPA project – 21st Century Cryptography – to extend Galois’s synthesis capabilities and implement a side-channel resistant, low power proof-of-concept cryptographic ASIC. $5.4M over 16 months, with USC and Portland State. The resulting synthesis capabilities need not be restricted to cryptographic algorithms.

• Galois loves to collaborate, and has many academic and industry partners on our programs.

Practical Mixed Asynchronous/Synchronous Design

William Koven, Galois, Inc., USA

Notes:

• Interests: build interesting chips; sell interesting chips, design async circuits
• The more interesting chips you sell, the more you can build!
• Previously built the REM R0: Mixed async/sync; sync IP: MIPI-CSI, USB3, LPDDR4, AXI – we just didn’t want to rebuild them ourselves (not interesting) – easier to buy and plug together
• Async blocks included AI core and CPUs. CPUs were de-synchronized. But AI Core was designed as an async block from the ground up.
• GLASS-CV, new chip at Galois: async components include – Vision accelerator, RISC-V cores, JPEG encode/decode, Security Core (all built in 14nm)

• If anyone wants to build an open-source async AXI that would be extremely welcome
• It doesn’t matter how a chip compares to sync. It matters how it compares to devices on the market (that may happen to be synchronous)

Meditation on Metastability

Mark Greenstreet, The University of British Columbia, Canada

Notes:

• What do I do: metastability and synchronizers, CDC, verification. What makes a good synchronizer? Can we test it in the lab? What are the fundamental limits on bandwidth/latency? What are the fundamental bottlenecks? Architects are risk averse – they need compelling guarantees before trying something new!

• Inflection: Moore’s law has been an innovation killer! IBM 360, PDP-11, x86, ARM, RISC-V all the same; UNIX turns 50 next month; all PLs are the same – FORTRAN, C, Java, Python … End of Moore’s law → parallel architectures are mainstream: GPUs, FPGAs, Machine learning; Data flow and FP are used in practice!

• Is it time for async? Chips are multi-timed ensembles of comm systems; async gives fine-grained concurrency, activity management, data transport. What are the compelling cases to exploit async?

• What I hope to learn from you: compelling async use cases; ways to exploit fine-grained concurrency: system-level description; formal verification; Post-Silicon, Post-von Neumann. Can anything dethrone Si?
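One common way to make the question “what makes a good synchronizer?” concrete is the textbook mean-time-between-failures model, MTBF = exp(t_r / τ) / (T_w · f_c · f_d), where t_r is the time allowed for resolution, τ the regeneration time constant, T_w the vulnerability window, and f_c, f_d the clock and data rates. The Python sketch below simply evaluates this standard formula; the function name and all numeric values in the usage note are placeholders chosen for illustration, not figures from the talk.

```python
import math


def synchronizer_mtbf(resolve_time_s: float,
                      tau_s: float,
                      window_s: float,
                      clock_hz: float,
                      data_hz: float) -> float:
    """Mean time between synchronizer failures, in seconds.

    Standard first-order model: failures become exponentially rarer
    as more time is allowed for metastability to resolve.
    """
    return math.exp(resolve_time_s / tau_s) / (window_s * clock_hz * data_hz)
```

For example, `synchronizer_mtbf(0.5e-9, 20e-12, 30e-12, 1e9, 1e8)` evaluates the model for a hypothetical 1 GHz receiver sampling a 100 MHz asynchronous input. The exponential term is the reason an extra resolution stage helps so much: every additional τ of settling time multiplies the MTBF by e.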

Challenge of Nonvolatile Logic LSI for IoT Applications

Takahiro Hanyu, Tohoku University, Japan

The impact of artificial intelligence (AI) is being widely understood in several application fields, and its technology and application challenges will keep growing. In contrast, when you use AI software in a cloud environment, its performance would be limited in several applications (e.g., when many people access the cloud computer simultaneously, its data traffic might stall, so it would be difficult to always receive the response from the cloud machine in real time). In this sense, it is becoming important to develop special-purpose AI hardware. I would like to explain several problems of conventional CMOS-only AI hardware in terms of power dissipation, and a solution to overcome these problems. As a concrete example, I introduce the possibility and the usefulness of a non-standard logic-circuit design methodology using functional devices (such as magnetic tunnel junction devices), called the “non-volatile logic-in-memory” architecture.

Notes:

• History: Many ISSCC papers from 1985 till now: 4-valued image processor, multi-valued current mode multiplier LSI, Ferroelectric based NV logic LSI, MTJ-based MCU LSI
• MTJ-based NV MCU LSI (ISSCC 2019)
• Motivation: demand for LP, HP MCUs
• EH – 100uW – requirement for IoT sensor nodes powered by EH
• Operation frequency: > 200MHz
• Research interest in future: Logic in memory structure – non-standard logic – based on device/material
• Spin/charge/orbital/lattice

CMOS Invertible Logic using Stochastic Computing

Naoya Onizawa, Tohoku University, Japan

Invertible logic can operate in one of two modes: 1) a forward mode, in which inputs are presented and a single, correct output is produced, and 2) a reverse mode, in which the output is fixed and the inputs take on values consistent with the output. It is possible to create invertible logic using various Boltzmann machine configurations. Such systems have been shown to solve certain challenging problems quickly, such as factorization and combinatorial optimization. In this talk, we show that invertible logic can be implemented using simple spiking neural networks based on stochastic computing. We present a design methodology for invertible stochastic gates, which can be implemented using a small amount of CMOS hardware. We demonstrate that our design can not only correctly implement basic gates with invertible capability, but can also be extended to construct invertible stochastic adder and multiplier circuits. Experimental results are presented which demonstrate correct operation of synthesizable invertible circuitry performing both multiplication and factorization, along with fabricated ASIC measurement results for an invertible multiplier circuit.
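To make the forward and reverse modes concrete, the sketch below gives a purely functional reference for an invertible multiplier: the forward mode computes a product, and the reverse mode lists every input pair of a given bit width that is consistent with a fixed output. This is a deterministic brute-force enumeration written only for illustration; it is not the stochastic, Boltzmann-machine-style hardware described in the abstract, and the function names are assumptions.

```python
from itertools import product
from typing import List, Tuple


def forward(a: int, b: int) -> int:
    """Forward mode: inputs are given, a single output is produced."""
    return a * b


def reverse(output: int, bits: int) -> List[Tuple[int, int]]:
    """Reverse mode: the output is fixed; return all consistent inputs.

    Inputs are restricted to unsigned values that fit in `bits` bits.
    """
    limit = 1 << bits
    return [(a, b)
            for a, b in product(range(limit), repeat=2)
            if forward(a, b) == output]
```

With 3-bit inputs, `reverse(15, 3)` yields the two factorizations (3, 5) and (5, 3). An invertible circuit is expected to settle on one of these consistent assignments rather than enumerate them, which is what makes the approach interesting for factorization.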

Notes:

• Lack of tools for async design – and hence gave up some time ago
• CMOS invertible logic: probabilistic magnetoresistive device model (conventional work). Input is analog voltage; output is digital stochastic sequence
• Comparisons with traditional, reversible, invertible – and now CMOS invertible

• Approximation using stochastic computing with CMOS digital circuits
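The last note refers to stochastic computing with digital circuits. A minimal sketch of the basic idea, assuming the common unipolar encoding in which a value in [0, 1] is represented by the fraction of 1s in a random bitstream: multiplying two independent streams then needs only a bitwise AND. The helper names and the stream length in the usage note are illustrative assumptions, not details of the presented design.

```python
import random
from typing import List


def to_stream(p: float, length: int, rng: random.Random) -> List[int]:
    """Encode a probability p as a random bitstream (unipolar encoding)."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def stream_value(stream: List[int]) -> float:
    """Decode a bitstream back to the value it approximates."""
    return sum(stream) / len(stream)


def multiply(x: List[int], y: List[int]) -> List[int]:
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [a & b for a, b in zip(x, y)]
```

For instance, encoding 0.5 and 0.4 as two independent streams of 20000 bits and decoding the AND of the two gives a value close to 0.2. The result is only an approximation whose accuracy improves with stream length, which is the trade-off the note alludes to.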

4 Overview of Panels

Panel 1: Open source

Chair: Alex / Panelists: Andrew, Rajit

Position talk and discussion:

• Andrew:

– Our 25 years of async and CAST (Caltech Asynchronous Synthesis Tools)
∗ 1993-2000 @ Caltech: DSP, CPU / QDI PCHB / CAST Language / Minimal CAD / Manual Layout
∗ 2000-2011 @ Fulcrum Microsystem: Network switch products / QDI PCHB / CAST + Some Verilog / Vastly improved CAD / Proteus Synthesis, Place & Route
∗ 2011-2015 @ Intel: Networking / GALS + QDI / Some CAST / Mostly Verilog / Similar CAD
∗ 2015-2019 @ Intel Labs: Neuromorphic / Bundled-Data / CAST Language / Verilog for CPU / New CAD for BD
– CAST is an HDL and tool set for async circuits and is also used to define the whole system at the top level
– Synchronous blocks are in Verilog and they are pointed to
– For simulation purposes, parts in CAST are also translated to Verilog
– Synthesizable CSP and Proteus
– Very compact, 25x shorter than Verilog – productivity advantage
– CAST/CSP example: Mandelbrot Set

• Rajit:

– Called Hierarchical Production Rules
– Re-implemented everything
– CHP (renamed from CSP at Tony Hoare’s request; e.g.: Probes are not in CSP)
∗ defproc – to define a process type
∗ defchan – for channel type
∗ deftype – for data type
all definitions are available on the web
– Tools:

∗ CHP is a starting point

∗ Decide the microarchitecture

∗ *CHP – this captures all the concurrency explicitly, and at this level of abstraction you can start doing h/w
∗ One route: Handshaking expansion (HSE) and PRS
∗ Sizing, Sized PRS, SPICE
– no technology mapper!
– 3 categories:
∗ Front-end
∗ Middle-end
∗ Back-end

Panel 2: Community involvement

Chair: Montek / Panelists: All

Goal of the session: how multiple stakeholders can be part of the community, in response to what Andrew and Rajit presented

Position talk and discussion:

• Marly: where we might fit in. CSP spec is OK, there is still a gap between CSP and single cell process; I want more compilation there. We do guarded commands at channels, at behavioral level.

• Ken: what level of community would you like to involve? Can it be put on GitHub?

• Peter: I am agnostic to control, I care about logic and latches. Where do I enter this flow? I’ll have a body for CSP for combinational logic. What to do about latches? If I have sync RTL (in Verilog), how do you recommend this be added to the flow?

• Rajit: ACT is the language, and we have v2act and act2v

• Jordi: Dataflow Intermediate Representation as a common language to build front/back-ends from different high-level languages to various asynchronous/synchronous implementations (with FPGAs or ASICs).

• Mark: We need a regression suite. So we can fix the shortcomings.

• Ney: We need a roadmap – of what parts and routes through the flow-map we can have and when

• Rajit: Showed Toolflow and interfaces

• Alex: What kind of user do you envisage for the open source flow: an SME, students?

• William answered from Galois – they intend to use them, contribute and keep them in the open source domain

Panel 3: Next day planning

Chair: Peter

Position talk and discussion:

• People come with a wish list (10 mins max):

– This is what I like in this tool flow.
– What would I like to have in this flow?
– What am I willing to do?

• Proposed panels and discussions:

– Panel Education: People who want to teach Async – Morning (1.5hrs) – chair: Mika
– Panel Applications: People who want to build things (power and/or performance) – Morning (1.5hrs) – chair: Hong
– Panel Developers: People who want to write some CAD and IP, in a particular design style (e.g. BD) (1.5hrs) – chair: Scott

– Free discussion

Panel 4: Education

Chair: Mika / Panelists: Peter, Rajit, Montek, Jens

Position talk and discussion:

• Mika’s intro:

– Teaching Engineers is a mission to give tools and weapons to people for survival.
– Quotes from the Bible
– John 8:32: And ye shall know the truth, and the truth shall make you free
– (also Motto of Caltech)
– Program testing can be used to show the presence of bugs but never to show their absence – EWD249 (Dijkstra)
– It is the professor’s task to bring the relevant insights and ability into the public domain by explicit formulation – EWD1175 (Dijkstra)

• Peter:

– 17 weeks*3hrs/week = 51 hrs
– 1st year Masters, compulsory
– I’d love to put this on the web and expect some additions or edits from others
– There are Verilog, System Verilog, SVM – for those who will never see async in their lives.

• Rajit:

– 1999, Caltech:
– CS 181 VLSI Design Laboratory
– Now: Yale EENG 426/ CPSC 876: Silicon Compilation

• Montek:

– Teaching Computer Organization, Computational Photography (Image sensors etc.)
– Teaching about 50 students, who may not see digital design in the future at all.

• Jens:

– Teaching Async design for EEE students. Somebody else is teaching VLSI design.
– 12 week course
– Use the Book (by Sparsø and Furber)
– Models used: DF, STGs, Workcraft
– Knowledge: theory, performance analysis, metastability, NoCs, GALS, desynchronization
– Projects: based on some papers – different topics to be decided.

Panel 5: IP & library developer

Chair: Scott / Panelists: William, Jia, Ney, Jordi

Position talk and discussion:

• William: What is it to build an industrial chip?

– Re-use 80%-90% of existing IP blocks and the rest is to be built.
– Wish to plug things together in a largely delay-insensitive manner
– Examples: used Mark’s FIFO; would be nice to have async AXI
– How to increase productivity of the design team – where with 10% of time we could get 90% performance improvement

• Jia:

– Technology independence. Technology dependent – would be IP sensitive so can’t be open sourced
– Applied to NSF three times: Building a website with async design solutions for whoever is interested in async; examples:
∗ NCL synthesis tool: UNCLE: https://www.sites.google.com/site/asynctools/
∗ NCL VHDL library and older technology CMOS libraries: https://www.ndsu.edu/pubweb/~scotsmit/CCLI_async.html
– We can have “faked delay” Liberty libraries
– Suggestions from the audience:
∗ Maybe we can have our library in PTM?
∗ Through MOSIS you can have access to repositories of libraries

• Ney:

– On wiki in Asynchronous circuits, there is no mention of benchmarks, libraries and tools!
– Open cell libraries for Async:
∗ 1993: Wuu/Wrudula, USC/UArizona – 500nm
∗ 1998: Renaudin, Grenoble
∗ 2007: Chong et al
∗ 2004/2005: Beerel/Ferretti
∗ 2003: Maurine, Grenoble
∗ 2010: ASCEnD Libs from Ney’s group
∗ Examples: from 2-input C-elements to 4-input NCL gates
∗ ASCEnD-ST65/ST28: 1080 cells (600 layout/ready)/several hundreds and no layout
∗ ASCEnD-FreePDK45: 30 cells
∗ Ongoing: ASCEnD-TSMC180: 48 cells

– Questions:

∗ Do we need libraries? Yes, to attract more async users, fast design, further the cell design process
∗ Do we have libraries to make available?
· NDA problems, company restrictions
· Services – MOSIS, French CMP, Europractice, IHP
∗ What do libraries support?
· NCL, PCHB, Blade
· NearTh and SubTh for all the above …
∗ Do we need to have circuits made with these libraries?
· Benchmarks for Micropipelines (BD), NCL, MTNCL, PCHB; Blade
– Roadmap for Open Cell Libs?
∗ Define std for Open lib structure
∗ Provide examples with NDA clearance
∗ Produce a model for lib
– Examples shown. Provided: desc, lef, lib, Verilog
– Virtual Functions – e.g. mappings from NCL gates to NANDs, NORs etc. to ’fool commercial tools’

• Jordi:

– Satisfiability-based approach for layout synthesis

∗ Tool reads a netlist of transistors with design rules and produces a layout (routing) on a 3-D grid.
∗ Video of SAT-solver in action
∗ Available on the web: http://www.cs.upc.edu/~jpetit/CellRouting/
∗ SAT-solver gives the legality check
∗ Placement is done separately; some placements are routable and some aren’t

– Data Flow

∗ Started in the 1970s at MIT (Jack Dennis and David Misunas)

∗ In Async community it was in 2001 book by Sparsø and Furber
∗ 2004: Rajit had a paper
∗ 2018: EPFL people in FPGA
∗ 2012: xMAS at Intel
∗ EPFL method:

· Throughput constraints and slack matching → ILP

Panel 6: User application

Chair: Ken / Panelists: Milos, Hong

Position talk and discussion:

The background is that we can all build new methods and tools but at the end of the day we need people to use them.

• Milos: Applications of interest

– Not sure Async would be useful in apps where Sync has been established.
– So, those apps where things haven’t been done
– Mixed-signal design
– Robust design
∗ Low noise: EMI, substrate noise
∗ Reliability: FT (SEU, SET), Extreme Temps, Voltages
∗ Side channel attack resistant (IHP did with GALS)
– Neuromorphic (CMOS+, such as added memristors)
– Flow issues: Spec, RTL-netlist, Layout; plus DFT

• Hong

– Low power is the main focus!
– Analog part consumes most power
– SubTh microprocessor design
– Wireless Monitor System of the Total Knee replacement: SoC design
– CNN accelerator

• Discussion: Creating a library of regression tests

• Question: What about language to adopt?

– The community came together and rallied around CHP (a variant of Hoare’s CSP) as the common programming language for the community

– Rajit will lead development and manage a repository

• Question: What about Energy Harvesting?

– Use of PZT power source from weight for monitoring an implant in the knee:
∗ Harvested Power: 400uW
∗ Output voltage – 2-2.5V, ripple voltage – 400mV

• Question: Can we do anything for 5G?

Panel 7: Next day planning

Chair: Peter

Position talk and discussion:

• Questions: advantage often cited, not-often cited @ ASYNC1994

• Advantage often cited @ ASYNC1994:

– 1. Achieve average Case Performance
– 2. Power consumed only when needed
– 3. Ease of modular composition
– 4. No clock alignment at the interfaces
– 5. Metastability has time to end
– 6. Avoid clock distribution costs
– 7. Easier to exploit concurrency
– 8. Intellectual challenge
– 9. Intrinsic elegance
– 10. Global synchrony does not exist anyway

• Additional advantage often cited @ SHONAN133:

– 11. Low EMI/noise
– 12. Process Bring-up support
– 13. Robust to PVT variations
– 14. Easier to interface to analog
– 15. Outlives CMOS
– 16. Intrinsically bio-inspired
– 17. Continuous time information processing

• NOT often cited @ ASYNC1994:

– 1. It really pisses my boss off
– 2. I like reinventing wheels
– 3. I like to be different
– 4. Gee – I really don’t know
– 5. People and circuits need to play by same rules
– 6. I don’t understand synchronous circuits
– 7. World problems stem from glitches
– 8. Synchronous design gives me gas
– 9. Clock radiation causes hair loss
– 10. It’s none of your business

• Additional NOT often cited @ SHONAN133:

– 11. Clocks make me late
– 12. Asynchronous design makes me GasP
– 13. Forever the future technology
– 14. Timing is where the rubber hits the road
– 15. It confuses proposal reviewers
– 16. What else to call computing without clocks
– 17. I’m still waiting for an ack

Panel 8: Tangible Next Steps

Chair: Peter

Position talk and discussion:

• Asynchronous mailing list

– Managed by Slack

∗ Teaching∗ Open source / community information∗ Applications / users

• Web pages / Asynchronous on Wikipedia

– Asynchronous web page managed by Manchester’s group is gone

– Steering Committee of ASYNC symposium supports the asynchronous web pages hosted at Wiki

• Meet again

– One day follow-up meeting after ASYNC2020

– Invitation only, separate registration and separate web page

– Local arrangement chair will manage them

– Apply for a Dagstuhl meeting at ASYNC2022

• Teaching materials, videos, tools

– Links from web page (large materials cannot be gathered in one place)
– Community GitHub

• Dream: Asynchronous Funding

5 List of Participants

• Prof. Tomohiro Yoneda, National Institute of Informatics, Japan

• Prof. Peter A. Beerel, University of Southern California, USA

• Prof. Alex Yakovlev, Newcastle University, UK

• Prof. Masashi Imai, Hirosaki University, Japan

• Prof. Snorre Aunet, Norwegian Univ. of Science and Technology, Norway

• Prof. Ney Laert Vilar Calazans, Pontifical Catholic University of Rio Grande do Sul, Brazil

• Prof. Hong Chen, Tsinghua University, China

• Ms. Huimei Cheng, University of Southern California, USA

• Prof. Jordi Cortadella, Universitat Politècnica de Catalunya, Spain

• Prof. Jia Di, University of Arkansas, USA

• Dr. Matthias Fugger, CNRS & LSV, ENS Paris-Saclay & Inria, France

• Prof. Mark Greenstreet, The University of British Columbia, Canada

• Prof. Takahiro Hanyu, Tohoku University, Japan

• Prof. Yong Hei, Portland State University, USA

• Prof. Warren A. Hunt, Jr, University of Texas, USA

• Mr. William Koven, Galois, Inc., USA

• Prof. Milos Krstic, IHP-Leibniz-Institut fur Innovative Microelektronik, Germany

• Mr. Andrew Lines, Intel Strategic CAD Labs, USA

• Prof. Rajit Manohar, Yale University, USA

• Dr. Mika Nystrom, University of Southern California, USA

• Prof. Naoya Onizawa, Tohoku University, Japan

• Prof. Marly Roncken, Portland State University, USA

• Prof. Hiroshi Saito, University of Aizu, Japan

• Mr. Shogo Semba, University of Aizu, Japan

• Prof. Montek Singh, UNC Chapel Hill, USA

• Prof. Scott C. Smith, North Dakota State University, USA

• Prof. Jens Sparsø, Technical University of Denmark, Denmark

• Prof. Andreas Steininger, Vienna University of Technology, Austria

• Prof. Ken Stevens, University of Utah, USA

• Prof. Ivan Sutherland, Portland State University, USA

• Prof. Nobuyuki Yoshikawa, Yokohama National Univ, Japan

• Dr. Daniel Zimmerman, Galois, Inc., USA
