Notes


Upload: savvvvvvvvvvy

Post on 01-Feb-2016




Thread

A series of instructions executed consecutively. A thread exists as a subset of a process.

Multithreading allows numerous threads to exist within a given process.

SMP - Symmetric Multiprocessing

Centralised main memory operating under a single OS with two or more homogeneous

processors. Each processor has its own cache memory to speed up main memory data

access. Processors interconnected with system bus.

NUMA - Non-Uniform Memory Access

Memory access time is the bottleneck for modern CPUs. With SMP, all processors use

the same memory bus to access main memory, so only one processor can access the

main memory at a time. As more processors are added, the CPUs begin competing for memory access and slow the whole system down. NUMA limits the number of CPUs on any one memory bus, grouping each bus and its CPUs into a node, and connects these nodes with a high-speed interconnect.

OpenMP - Open MultiProcessing

An implementation of multithreading. A master thread forks a specified number of slave

threads and the system divides a task among them. The threads run independently and

concurrently. Each thread has an associated ID, with the ID of the master thread being

0. After the slave threads have completed their execution, they join back to the master

thread, which continues. There is a computational overhead associated with initialising threads.

MPI - Message Passing Interface

Root machine spawns processes on all machines in its MPI world. All processes are independent; the only way to communicate between them is by passing messages over the network. Network bandwidth and throughput are therefore the bottleneck of this methodology.

FPGAs

An FPGA is a Field Programmable Gate Array. It comprises a number of Configurable

Logic Blocks which sit in a matrix of connections. In this way, it is basically a configurable version of an ASIC (Application Specific Integrated Circuit), and was in fact


used primarily as a means of prototyping an ASIC, which typically requires a number of weeks to fabricate once the schematic has been developed.

A traditional microprocessor is restricted to carrying out operations sequentially; whilst the clock speed may be inordinately fast, it must still carry out operations one after another. Parallel programming paradigms have been developed to impose some kind of pseudo-parallelism on computational systems, but this is achieved simply by breaking a process down into a number of threads which can be worked on concurrently. The process itself is not parallel.

LUT - LookUp Table

A logical unit which takes n inputs and can implement any Boolean operation which

uses those n inputs, however many logic gates that operation would otherwise require. So a 3-input LUT can reproduce the output of A · B + C, A + B + C, etc.

Flip-Flop

We shall primarily be concerned with D flip-flops (JK and T flip-flops also exist, but

are not common). A flip-flop is also commonly referred to as a register. It has two inputs: “D” is the data input and “>” is the clock (of the order of MHz). Every flip-flop in an FPGA possesses a

clock, and so the concept of a clock in this context differs from that in a microprocessor,

where the clock executes a single operation for each cycle.

Q is the output. Flip-flops are used in combination with LUTs.

An HDL (Hardware Description Language) must be used to program an FPGA. These languages enable a formal description of an electronic circuit, i.e. hardware programming. The major difference between an HDL and a regular higher-level language, such as C, is the ability to express the execution of instructions as concurrent in addition to sequential. The two main HDLs are Verilog and VHDL.

Configuration Interaction

The full CI wavefunction, $|\Phi_0\rangle$, is given as

$$
|\Phi_0\rangle = |\Psi_0\rangle
+ \sum_{a,r} c_a^r \,|\Psi_a^r\rangle
+ \sum_{a<b,\; r<s} c_{ab}^{rs} \,|\Psi_{ab}^{rs}\rangle
+ \dots
+ \sum_{a<b<\dots<n,\; r<s<\dots<z} c_{ab\dots n}^{rs\dots z} \,|\Psi_{ab\dots n}^{rs\dots z}\rangle
$$
