Details of the .L and .S Units
TMS320C6000
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Details of the .L and .S units
[Figure: CPU datapath showing Data Memory, the .D unit, Register File A (A0 to A15, 32 bits wide) and the .M, .L and .S units, with the labels a, x, prod and Y]
The .L and .S units perform general-purpose arithmetic and logic operations.
Operands: opcode .unit <?>, <?>, <?>
Operands: 32/40-bit registers, 5-bit constants
Operands can be:
    5-bit constants (or 16-bit for MVKL and MVKH)
    32-bit registers
    40-bit registers
However, we have seen that registers are only 32 bits wide. So where do the 40-bit registers come from?
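Before turning to the 40-bit registers, the 16-bit constants mentioned for MVKL and MVKH can be illustrated with a small model. This is only a sketch in Python, not assembler output: it assumes that MVKL writes the lower 16 bits of a constant sign-extended to 32 bits, and that MVKH then replaces the upper 16 bits while leaving the lower 16 untouched. The function names `mvkl`, `mvkh` and `load_constant` are made up for the illustration.

```python
MASK32 = 0xFFFFFFFF

def mvkl(const32):
    """Model of MVKL: take the low 16 bits and sign-extend them to 32 bits."""
    low = const32 & 0xFFFF
    if low & 0x8000:          # bit 15 set: fill the upper half with ones
        low |= 0xFFFF0000
    return low & MASK32

def mvkh(const32, reg):
    """Model of MVKH: replace the upper 16 bits of reg, keep the lower 16."""
    return ((const32 & 0xFFFF0000) | (reg & 0xFFFF)) & MASK32

def load_constant(const32):
    """Build a full 32-bit constant with an MVKL followed by an MVKH."""
    reg = mvkl(const32)
    return mvkh(const32, reg)

if __name__ == "__main__":
    print(hex(load_constant(0x1234ABCD)))
```

Under these assumptions the order matters: MVKL first, MVKH second, since the sign extension of MVKL would otherwise overwrite the upper half.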
Operands: 40-bit registers
A 40-bit register can be obtained by concatenating two registers.
There are 3 conditions that need to be respected:
    The registers must be from the same side.
    The first register must be even and the second odd.
    The registers must be consecutive.
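The three conditions can be expressed as a short check. The sketch below is a Python model, assuming pairs are written odd:even (for example A1:A0) and that each side has registers 0 to 15 as in the figures; `parse_register` and `is_valid_pair` are hypothetical helper names.

```python
import re

_REG = re.compile(r"^([AB])(\d+)$")

def parse_register(name):
    """Split a register name such as 'A4' into (side, index)."""
    m = _REG.match(name)
    if not m:
        raise ValueError(f"not a register name: {name!r}")
    side, index = m.group(1), int(m.group(2))
    if index > 15:            # assumption: 16 registers per side
        raise ValueError(f"register index out of range: {name!r}")
    return side, index

def is_valid_pair(pair):
    """Check a pair written odd:even, e.g. 'A5:A4', against the 3 conditions."""
    parts = pair.split(":")
    if len(parts) != 2:
        return False
    try:
        odd_side, odd_idx = parse_register(parts[0])
        even_side, even_idx = parse_register(parts[1])
    except ValueError:
        return False
    same_side = odd_side == even_side
    even_first = even_idx % 2 == 0       # lower register is even
    consecutive = odd_idx == even_idx + 1
    return same_side and even_first and consecutive

if __name__ == "__main__":
    for p in ("A1:A0", "A5:A4", "A2:A1", "B1:A0", "A3:A0"):
        print(p, is_valid_pair(p))
```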
Operands: 40-bit registers
All combinations of 40-bit registers are shown below. Each pair is written odd:even; the even register supplies 32 bits and the odd register the remaining 8 bits.
    Side A      Side B
    A1:A0       B1:B0
    A3:A2       B3:B2
    A5:A4       B5:B4
    A7:A6       B7:B6
    A9:A8       B9:B8
    A11:A10     B11:B10
    A13:A12     B13:B12
    A15:A14     B15:B14
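A small sketch of how a 40-bit value could be spread over such a pair, and of how the list of pairs can be generated. It is a Python model, assuming the usual C6000 convention that the even register holds the 32 least significant bits and the low 8 bits of the odd register hold the 8 most significant bits; the names `split_40bit`, `join_40bit` and `all_pairs` are illustrative only.

```python
MASK40 = (1 << 40) - 1

def split_40bit(value):
    """Return (odd, even): the top 8 bits and the low 32 bits of a 40-bit value."""
    value &= MASK40
    even = value & 0xFFFFFFFF
    odd = (value >> 32) & 0xFF
    return odd, even

def join_40bit(odd, even):
    """Rebuild the 40-bit value from the odd (8 bits used) and even registers."""
    return ((odd & 0xFF) << 32) | (even & 0xFFFFFFFF)

def all_pairs():
    """List every odd:even pair for sides A and B, registers 0 to 15."""
    return [f"{s}{i + 1}:{s}{i}" for s in "AB" for i in range(0, 16, 2)]

if __name__ == "__main__":
    odd, even = split_40bit(0x123456789A)
    print(hex(odd), hex(even))
    print(all_pairs())
```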
Operands: 32/40-bit registers, 5-bit constants
    instr .unit <src>, <src>, <dst>
[Figure: an .L or .S unit with two <src> inputs and one <dst> output. One source is a 32-bit register or a 5-bit constant, the other source is a 32-bit or 40-bit register, and the destination is a 32-bit or 40-bit register.]
Examples:
    OR  .L1 A0, A1, A2
    ADD .L2 -5, B3, B4
    ADD .L1 A2, A3, A5:A4
    SUB .L1 A2, A5:A4, A5:A4
    ADD .L2 3, B9:B8, B9:B8
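The constants in the examples above (-5 and 3) have to fit in the 5-bit operand field. The following Python sketch checks that, assuming the field is a two's complement signed value, as the -5 example suggests; some instructions may treat the field as unsigned instead, which this sketch does not cover. `fits_signed` and `encode_signed` are hypothetical names.

```python
def fits_signed(value, bits=5):
    """True if value is representable as a signed two's complement field."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi

def encode_signed(value, bits=5):
    """Return the raw bit pattern of value in a signed field of the given width."""
    if not fits_signed(value, bits):
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    return value & ((1 << bits) - 1)

if __name__ == "__main__":
    for v in (-5, 3, 15, 16, -16, -17):
        print(v, fits_signed(v))
    print(bin(encode_signed(-5)))
```

With this interpretation a 5-bit constant covers -16 to 15; anything larger would need a register operand or a wider constant.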
Register to Register Data Transfer
To move the content of a register (A or B) to another register (B or A), use the MV instruction, e.g.:
    MV A0, B0    or    MV B6, B7
To move the content of a control register to another register (A or B) or vice versa, use the MVC instruction, e.g.:
    MVC IFR, A0    or    MVC A0, IRP
Increasing the processing power
How can we add more processing power to this processor?
    Increase the clock frequency.
    Increase the number of processing units.
[Figure: the single-sided datapath again: Data Memory, Register File A (A0 to A15, 32 bits) and the .D, .M, .L and .S units]
To increase the processing power, this processor has two sides.
[Figure: Data Memory connected to two sides. Side A: Register File A (A0 to A15, 32 bits) with units .S1, .M1, .L1 and .D1. Side B: Register File B (B0 to B15, 32 bits) with units .S2, .M2, .L2 and .D2.]
Operand exchange
To exchange operands between the two sides, some cross paths (links) are required.
What is a cross path?
    A cross path links one side of the CPU to the other.
    There are two types of cross paths:
        Data cross path
        Address cross path
Data Cross Paths
Data cross paths are also called register file cross paths.
These cross paths allow operands from one side to be used by the units of the other side.
There are only two cross paths:
    one path which conveys data from side B to side A, 1X
    one path which conveys data from side A to side B, 2X
TMS320C67x Data-Path - Summary
Data cross paths only apply to the .L, .S and .M units.
The data cross paths are very useful; however, there are some limitations in their use.
Legend: src = source, dst = destination
Data Cross Path - Limitations
(1) The destination register must be on the same side as the unit.
(2) Source registers: up to one data cross path per execute packet per side.
Execute packet: a group of instructions that execute simultaneously.
[Figure: units .L1, .M1 and .S1 on side A, each with two <src> inputs and one <dst> output; the 1X path brings an operand from side B, the 2X path carries an operand from side A to side B]
Data Cross Path - Limitations
Source registers: up to one data cross path per execute packet per side.
    ADD .L2x A0, A1, B2
    MPY .M1x A0, B6, A9
    SUB .S1x A8, B2, A8
 || ADD .L1x A0, B0, A2
|| means that the SUB and ADD belong to the same execute packet and therefore execute simultaneously.
Not valid! The SUB and the ADD both read a side B register through the 1X cross path in the same execute packet.
[Figure: units .L1, .M1 and .S1 on side A with <src>, <src> and <dst>, and the 1X and 2X cross paths between sides A and B]
Data Cross Path - Limitations
    SUB .S1x A8, B2, A8
 || ADD .L2x A0, A0, B5
Valid! The SUB uses the 1X cross path and the ADD uses the 2X cross path, so each side uses only one cross path.
[Figure: units .L1, .M1 and .S1 on side A and .L2, .M2 and .S2 on side B, each with <src>, <src> and <dst>, linked by the 1X and 2X cross paths]
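The two data cross path limitations can be modelled as a simple check over an execute packet. The Python sketch below is a deliberately simplified model, not an assembler rule set: it assumes each instruction is given as a hypothetical tuple (unit, src1, src2, dst), counts an instruction as a cross path user when any of its source registers lives on the other side, and ignores every other constraint of the real hardware.

```python
def unit_side(unit):
    """Units ending in 1 (.L1, .S1, .M1) are on side A, those ending in 2 on side B."""
    return "A" if unit.endswith("1") else "B"

def uses_cross_path(instr):
    """True if any source register comes from the side opposite to the unit."""
    unit, src1, src2, _dst = instr
    side = unit_side(unit)
    # Constants such as '-5' do not start with A or B and are skipped.
    return any(s[0] in "AB" and s[0] != side for s in (src1, src2))

def check_execute_packet(packet):
    """Return a list of rule violations for instructions issued in parallel."""
    errors = []
    cross_users = {"A": 0, "B": 0}
    for instr in packet:
        unit, _src1, _src2, dst = instr
        side = unit_side(unit)
        if dst[0] != side:                      # limitation (1)
            errors.append(f"{unit}: destination {dst} is not on side {side}")
        if uses_cross_path(instr):
            cross_users[side] += 1
    for side, count in cross_users.items():     # limitation (2)
        if count > 1:
            errors.append(f"side {side}: {count} instructions use the cross path")
    return errors

if __name__ == "__main__":
    not_valid = [("S1", "A8", "B2", "A8"), ("L1", "A0", "B0", "A2")]
    valid = [("S1", "A8", "B2", "A8"), ("L2", "A0", "A0", "B5")]
    print(check_execute_packet(not_valid))
    print(check_execute_packet(valid))
```

The first packet mirrors the SUB .S1x || ADD .L1x pair and is rejected; the second mirrors SUB .S1x || ADD .L2x and passes.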
Address Cross Paths
Address paths
[Figure: the .D1 unit on side A supplies the address (Addr) and exchanges data (Data) with memory]
    LDW .D1T1 *A0, A5
    STW .D1T1 A5, *A0
The pointer must be on the same side as the unit.
Address Cross Paths
[Figure: .D1 generates the address from *A0; the data can go to A5 on side A (Data1) or to B5 on side B (Data2); .D2 is on side B]
DA1 = T1
DA2 = T2
    LDW .D1T1 *A0, A5
    LDW .D1T2 *A0, B5
Standard Parallel Loads
[Figure: .D1 loads from *A0 into A5 on side A while .D2 loads from *B0 into B5 on side B]
DA1 = T1
DA2 = T2
    LDW .D1T1 *A0, A5
 || LDW .D2T2 *B0, B5
Parallel Load/Store using Address Cross Paths
[Figure: .D1 loads from *A0 into B5 on side B while .D2 stores A5 from side A to *B0]
DA1 = T1
DA2 = T2
    LDW .D1T2 *A0, B5
 || STW .D2T1 A5, *B0
Fill the blanks ... Does this work?
[Figure: .D1 with pointer *A0 on side A and .D2 with pointer *B0 on side B]
DA1 = T1
DA2 = T2
    LDW .D1__ *A0, B5
 || STW .D2__ B6, *B0
Not allowed! Parallel accesses: both cross or neither cross.
[Figure: .D1 with pointer *A0 and .D2 with pointer *B0 both transfer data to or from side B registers (B5 and B6)]
DA2 = T2
    LDW .D1T2 *A0, B5
 || STW .D2T2 B6, *B0
Both instructions would need the T2 data path at the same time.
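The address path rules shown so far can be collected into a small checker. This is a Python sketch under stated assumptions: each access is a hypothetical tuple (d_unit, t_path, pointer, data_reg), T1 is taken to serve side A data registers and T2 side B data registers (as the DA1 = T1, DA2 = T2 labels and the examples suggest), and no other hardware constraint is modelled.

```python
def side_of(name):
    """'D1' and 'T1' belong to side A; 'D2' and 'T2' belong to side B."""
    return "A" if name.endswith("1") else "B"

def check_memory_access(d_unit, t_path, pointer, data_reg):
    """Check a single load or store such as LDW .D1T2 *A0, B5."""
    errors = []
    if pointer[0] != side_of(d_unit):
        errors.append(f"pointer {pointer} is not on the side of .{d_unit}")
    if data_reg[0] != side_of(t_path):
        errors.append(f"data register {data_reg} does not match path {t_path}")
    return errors

def crosses(d_unit, t_path):
    """An access uses the address cross path when unit and data path differ in side."""
    return side_of(d_unit) != side_of(t_path)

def check_parallel(access1, access2):
    """Check two accesses issued in the same execute packet."""
    errors = check_memory_access(*access1) + check_memory_access(*access2)
    if access1[0] == access2[0]:
        errors.append("both accesses use the same .D unit")
    if crosses(access1[0], access1[1]) != crosses(access2[0], access2[1]):
        errors.append("parallel accesses: both cross or neither cross")
    return errors

if __name__ == "__main__":
    print(check_parallel(("D1", "T1", "A0", "A5"), ("D2", "T2", "B0", "B5")))
    print(check_parallel(("D1", "T2", "A0", "B5"), ("D2", "T1", "B0", "A5")))
    print(check_parallel(("D1", "T2", "A0", "B5"), ("D2", "T2", "B0", "B6")))
```

The three calls correspond to the standard parallel loads, the parallel load/store using both cross paths, and the combination marked as not allowed.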
'C67x Address Data-Path - Summary
[Figure: full CPU datapath; see the CPU Reference Guide, pg. 2-2]
Conditions Don't Use Cross Paths
If a conditional register comes from the opposite side, it does NOT use a data or address cross path.
Examples:
    [B2] ADD .L1 A2, A0, A4
    [A1] LDW .D2 *B0, B5
Cross Paths - Summary
Data
    Destination register on same side as unit.
    Source registers: up to one cross path per execute packet per side.
    Use "x" to indicate cross path.
Address
    Pointer must be on same side as unit.
    Data can be transferred to/from either side.
    Parallel accesses: both cross or neither cross.
Conditionals
    Don't use cross paths.