Post on 16-Apr-2017
1
Using FPGA in Embedded Devices
Andriy Smolskyy
Consultant, Engineering
29.03.2017
2
What is FPGA?
3
History of Programmable Logic
• Transistor-Transistor Logic - TTL
• Programmable Array Logic - PAL
• Programmable Logic Device - PLD
• Complex PLD - CPLD
• FPGA
• ASIC
4
Digital Design with TTL Logic
• Truth table
• Karnaugh map
• Logic expression
• Final implementation
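As a hedged illustration of the flow above (truth table, Karnaugh map, logic expression), the sketch below uses a hypothetical 3-input majority function that is not taken from the slides. Grouping the adjacent 1s in its Karnaugh map gives F = AB + BC + AC, and the code checks that this expression agrees with the truth table for all eight input combinations. The names `MAJ_TRUTH`, `maj_simplified`, and `maj_matches_truth_table` are assumptions made for this example.

```c
#include <stdbool.h>

/* Truth table of a hypothetical 3-input majority function.
 * Index = (a << 2) | (b << 1) | c; the output is 1 when at least two inputs are 1. */
static const bool MAJ_TRUTH[8] = {0, 0, 0, 1, 0, 1, 1, 1};

/* Expression read off the Karnaugh map: F = AB + BC + AC. */
bool maj_simplified(bool a, bool b, bool c) {
    return (a && b) || (b && c) || (a && c);
}

/* Returns true when the simplified expression agrees with every table row. */
bool maj_matches_truth_table(void) {
    for (unsigned i = 0; i < 8; i++) {
        bool a = (i >> 2) & 1u;
        bool b = (i >> 1) & 1u;
        bool c = i & 1u;
        if (maj_simplified(a, b, c) != MAJ_TRUTH[i]) {
            return false;
        }
    }
    return true;
}
```

In a TTL design, each `&&` and `||` in `maj_simplified` would map to a discrete AND or OR gate, which is why minimizing the expression reduces the chip count.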
8
Programmable Array Logic (PAL)
Implementation
• Logic gates and registers are fixed
• Programmable sum-of-products array and output control
Advantages
• Fewer devices required
• Lower cost
• Power savings
• Simpler to test and debug
• Design security (prevents reverse engineering)
• In-system reprogrammability (in some cases)
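The programmable sum-of-products array can be modeled in software roughly as follows. This is a minimal sketch under stated assumptions, not a model of any specific PAL device: the 4-input, 4-term size, the `pal_term` layout, and the example configuration `PAL_XOR01` are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAL_TERMS 4  /* assumed number of product terms per output */

/* One product term: which true and complemented inputs feed its AND gate. */
typedef struct {
    uint8_t use_true;  /* bit i set: input i is connected */
    uint8_t use_comp;  /* bit i set: NOT(input i) is connected */
    bool    enabled;   /* false: the term is unused and contributes 0 */
} pal_term;

/* Evaluate one output: the fixed OR of the programmed AND (product) terms. */
bool pal_eval(const pal_term terms[PAL_TERMS], uint8_t inputs) {
    for (int t = 0; t < PAL_TERMS; t++) {
        if (!terms[t].enabled) {
            continue;
        }
        bool true_ok = (inputs & terms[t].use_true) == terms[t].use_true;
        bool comp_ok = (~inputs & terms[t].use_comp) == terms[t].use_comp;
        if (true_ok && comp_ok) {
            return true;  /* one satisfied product term drives the OR high */
        }
    }
    return false;
}

/* Hypothetical configuration: in0 XOR in1 = in0*NOT(in1) + NOT(in0)*in1.
 * The two remaining terms are zero-initialized, so they stay disabled. */
const pal_term PAL_XOR01[PAL_TERMS] = {
    {0x1, 0x2, true},
    {0x2, 0x1, true},
};
```

Changing the function means changing only the table, which mirrors how a PAL is programmed while its gates stay fixed.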
9
From PAL to Programmable Logic Device (PLD)
• Arrange multiple PAL arrays in a single device
10
From PLD to Complex PLD (CPLD)
Implementation
• Combine multiple PLDs in a single device with programmable interconnect and I/O
Advantages
• Ample amounts of logic and advanced configurable I/Os
• Programmable routing
• Instant on
• Non-volatile configuration
• Reprogrammable
11
Interconnection Problem: Routing Takes Too Much Space
• Global routing
• Row & column routing
12
FPGA LUT and LAB
• Look-up table (LUT) inputs are mux select lines
• FPGA logic array blocks (LABs) are made up of logic elements (LEs) instead of product terms and macrocells
• Solves the interconnection problem
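The statement that LUT inputs act as mux select lines can be sketched in a few lines of C. The 4-input width, the function name `lut4_eval`, and the two example configurations are assumptions made for illustration; the slides do not state the LUT size of any particular device.

```c
#include <stdint.h>

/* 4-input LUT: 16 configuration bits, one per input combination.
 * The inputs form the select value of a 16:1 multiplexer over those bits. */
unsigned lut4_eval(uint16_t config, unsigned a, unsigned b, unsigned c, unsigned d) {
    unsigned select = ((d & 1u) << 3) | ((c & 1u) << 2) | ((b & 1u) << 1) | (a & 1u);
    return (config >> select) & 1u;
}

/* Hypothetical configurations (the stored bits are just the truth table). */
#define LUT4_AND4   0x8000u  /* 1 only when a = b = c = d = 1 */
#define LUT4_XOR_AB 0x6666u  /* a XOR b, ignoring c and d */
```

Any 4-input Boolean function fits in the same hardware; only the 16 stored bits differ.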
13
Field Programmable Gate Array (FPGA)
Implementation
• LABs arranged in an array
• Programmable interconnect
• Interconnect may span all or part of the array
Advantages
• Easier to create complex functions through LE cascading
• Integration of ready-made functions and IP blocks: PLLs, memory, arithmetic
• High density, high performance
• Fast programming
14
FPGA vs ASIC
FPGA
• Pros:
- Fast time to market: easy to develop a new device with specific logic or interfaces
- Easy to upgrade device logic and fix bugs in hardware
- Specific devices: reconfigurable DSP, digital filters
• Cons:
- Needs to be programmed at power-on
- It is hard to achieve 100% device utilization
ASIC
• Pros:
- Higher performance: consumes less power and can run at higher clock speeds
- Cheaper in mass production
- No configuration at power-on required
- Smaller chip size
• Cons:
- Additional expenses in design preparation
- Impossible to fix hardware bugs
15
Software and hardware development aspects
16
System on Chip (SoC) + FPGA
17
Software and hardware development aspects
• In general, controllers implemented in the FPGA behave like a microcontroller's peripheral devices
• The FPGA must be programmed at every start-up, so its controllers might not be ready at system start
• Take care with DMA, MMU, virtual memory and caching operations
• In some designs, the FPGA can control the CPU's peripheral devices
18
FPGA design development
• Verilog
• VHDL
• Visual development
19
FPGA design development
• Core IP
- SDRAM controllers
- Ethernet PHY, custom transceiver PHY
- PCIe PHY
- SDI, DisplayPort
• Megafunctions
- PLL
- I/O
- Custom logic blocks
20
High speed data processing: OpenCL in FPGA
21
A simple CPU
22
Load immediate value into register
23
Load memory value into register
24
Store register value into memory
25
Add two registers, store result in register
26
A simple program
Mem[100] += 42 * Mem[101]
CPU instructions:
R0 ← Load Mem[100]
R1 ← Load Mem[101]
R2 ← Load #42
R2 ← Mul R1, R2
R0 ← Add R2, R0
Store R0 → Mem[100]
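To make the six instructions listed above concrete, here is a minimal sketch of a CPU that executes them in order. The instruction encoding (`insn`, `enum opcode`), the 128-word memory, and the helper `run_simple_program` are hypothetical and exist only for this illustration; the slides do not define an encoding.

```c
#include <stdint.h>

#define MEM_WORDS 128  /* assumed memory size, large enough for Mem[100..101] */

enum opcode { OP_LOAD_MEM, OP_LOAD_IMM, OP_MUL, OP_ADD, OP_STORE };

typedef struct {
    enum opcode op;
    int     dst;    /* destination register (source register for OP_STORE) */
    int     src_a;  /* first source register (used by OP_MUL and OP_ADD) */
    int     src_b;  /* second source register (used by OP_MUL and OP_ADD) */
    int32_t arg;    /* memory address or immediate value */
} insn;

/* Fetch and execute each instruction in order on a small register file. */
void cpu_run(const insn *prog, int n, int32_t regs[4], int32_t mem[MEM_WORDS]) {
    for (int pc = 0; pc < n; pc++) {
        const insn *i = &prog[pc];  /* "fetch" */
        switch (i->op) {            /* "decode" and "execute" */
        case OP_LOAD_MEM: regs[i->dst] = mem[i->arg]; break;
        case OP_LOAD_IMM: regs[i->dst] = i->arg; break;
        case OP_MUL:      regs[i->dst] = regs[i->src_a] * regs[i->src_b]; break;
        case OP_ADD:      regs[i->dst] = regs[i->src_a] + regs[i->src_b]; break;
        case OP_STORE:    mem[i->arg] = regs[i->dst]; break;
        }
    }
}

/* The program from the slide: Mem[100] += 42 * Mem[101]. */
const insn PROGRAM[6] = {
    {OP_LOAD_MEM, 0, 0, 0, 100},  /* R0 <- Mem[100] */
    {OP_LOAD_MEM, 1, 0, 0, 101},  /* R1 <- Mem[101] */
    {OP_LOAD_IMM, 2, 0, 0, 42},   /* R2 <- #42 */
    {OP_MUL,      2, 1, 2, 0},    /* R2 <- R1 * R2 */
    {OP_ADD,      0, 2, 0, 0},    /* R0 <- R2 + R0 */
    {OP_STORE,    0, 0, 0, 100},  /* Mem[100] <- R0 */
};

/* Hypothetical helper: run the program on fresh state and return Mem[100]. */
int32_t run_simple_program(int32_t m100, int32_t m101) {
    int32_t regs[4] = {0};
    int32_t mem[MEM_WORDS] = {0};
    mem[100] = m100;
    mem[101] = m101;
    cpu_run(PROGRAM, 6, regs, mem);
    return mem[100];
}
```

The loop in `cpu_run` reuses one fetch, decode, and execute path for every instruction, so the six steps happen one after another in time.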
27
Single CPU activity, step by step
Time
28
Unroll the CPU hardware…
Space
29
… and specialize by position
1. Instructions are fixed. Remove “Fetch”
2. Remove unused ALU operations
3. Remove unused Load / Store
4. Wire up registers properly and propagate state
5. Remove dead data
6. Reschedule!
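After the specialization steps above, what remains for Mem[100] += 42 * Mem[101] is roughly one multiplier feeding one adder. The sketch below expresses that result as a plain C function; it is an illustration under the assumption of 32-bit integer data, and the name `specialized_datapath` is hypothetical.

```c
#include <stdint.h>

/* What is left after specialization: no fetch, no decode, no general ALU.
 * m100 and m101 stand for the two values read from memory. */
int32_t specialized_datapath(int32_t m100, int32_t m101) {
    const int32_t k = 42;              /* constant wired in, no "Load #42" step */
    int32_t product = k * m101;        /* dedicated multiplier */
    return m100 + product;             /* dedicated adder; result goes to Mem[100] */
}
```

Each line corresponds to a fixed piece of hardware, so in an FPGA the two loads can happen in parallel and new inputs can enter the multiplier while the adder is still working on the previous ones.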
35
FPGA datapath = Your algorithm, in silicon
• Build exactly what you need:
- Operations
- Data widths
- Memory size, configuration
• Efficiency:
- Throughput
- Latency
- Power
36
OpenCL FPGA
• Host + accelerator programming model
• Sequential host program on a microprocessor
• Function offload onto a highly parallel accelerator device
Host Code (User Application):
main() {
    read_data( … );
    manipulate( … );
    clEnqueueWriteBuffer( … );
    clEnqueueNDRangeKernel( …, sum, … );
    clEnqueueReadBuffer( … );
    display_result( … );
}
FPGA Design (Algorithm):
__kernel void sum(__global float *a, __global float *b, __global float *y)
{
    int gid = get_global_id(0);
    y[gid] = a[gid] + b[gid];
}
37
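Loop Pipelining
• Analyze any dependencies between iterations
• Schedule these operations
• Launch the next iteration as soon as possible
float array[M];   /* M-entry shift register holding the most recent samples */

for (int i = 0; i < n * numSets; i++) {
    /* Shift the window by one element and append the new sample a[i]. */
    for (int j = 0; j < M - 1; j++)
        array[j] = array[j + 1];
    array[M - 1] = a[i];

    /* Accumulate the weighted sum of the window into answer[i]. */
    for (int j = 0; j < M; j++)
        answer[i] += array[j] * coefs[j];
}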
At this point, we can launch the next iteration
38
Loop Pipelining Example
[Figure: No Loop Pipelining vs. With Loop Pipelining]
With loop pipelining, execution looks almost like parallel thread execution
39
Digital Filter
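[Figure: 8-tap FIR filter. The input x(n) passes through a chain of seven z^-1 delay elements; the eight taps are multiplied by coefficients C0 to C7 and summed to give y(n).]
y(n) = C0·x(n) + C1·x(n-1) + … + C7·x(n-7)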
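The filter structure shown in the diagram (a delay line, eight multipliers, and one adder) can be sketched in C as follows. This is a software model under stated assumptions, not the presenter's implementation: the `fir8` type, the function names, and the example coefficients `FIR8_EXAMPLE_COEFS` are hypothetical.

```c
#define FIR_TAPS 8  /* eight coefficients C0..C7, as in the diagram */

typedef struct {
    float delay[FIR_TAPS];  /* delay[k] holds x(n-k); each shift is one z^-1 */
    float coefs[FIR_TAPS];  /* C0..C7 */
} fir8;

/* Clear the delay line and load the coefficients. */
void fir8_init(fir8 *f, const float coefs[FIR_TAPS]) {
    for (int k = 0; k < FIR_TAPS; k++) {
        f->delay[k] = 0.0f;
        f->coefs[k] = coefs[k];
    }
}

/* Push one input sample x(n) and return y(n) = sum over k of C_k * x(n-k). */
float fir8_step(fir8 *f, float x) {
    for (int k = FIR_TAPS - 1; k > 0; k--) {
        f->delay[k] = f->delay[k - 1];  /* shift through the z^-1 chain */
    }
    f->delay[0] = x;

    float y = 0.0f;
    for (int k = 0; k < FIR_TAPS; k++) {
        y += f->coefs[k] * f->delay[k];  /* one multiplier per tap, then the adder */
    }
    return y;
}

/* Hypothetical coefficients, chosen only so the output is easy to check. */
const float FIR8_EXAMPLE_COEFS[FIR_TAPS] = {1, 2, 3, 4, 5, 6, 7, 8};

/* Feed a unit impulse and return the output at step n (0-based). */
float fir8_impulse_response(const float coefs[FIR_TAPS], int n) {
    fir8 f;
    float y = 0.0f;
    fir8_init(&f, coefs);
    for (int t = 0; t <= n; t++) {
        y = fir8_step(&f, t == 0 ? 1.0f : 0.0f);
    }
    return y;
}
```

Feeding a unit impulse returns the coefficients one by one, which is a quick sanity check for any FIR implementation. In an FPGA the eight multiplications run in parallel every clock cycle, whereas this software loop computes them sequentially.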
40
FPGA in Embedded Devices
• Q&A
41
Thank you
Andriy Smolskyy
Consultant, Engineering
andriy.smolskyy@globallogic.com
+380-67-701-8637