tigersharc clu closer look at the xcorrs

Post on 31-Dec-2015

TigerSHARC CLU

Closer look at the XCORRS

M. Smith,

University of Calgary, Canada

smithmr@ucalgary.ca

Overview

Recap GPS correlation

Look at XCORRS instruction in detail

This was part of Take home quiz for 5005

Additional information on the web

Xcorrs.asm – assembly code discussed in class

Xmain.cpp – demonstrates the use of the xcorrs.asm code

XcorrsTest.cpp – demonstrates testing of all the functions being used

Additional correlation presentations (not XCORRS) from Analog Devices developers

In 2005, we pointed out many errors in the TigerSHARC XCORRS explanation – if my figures are not the same as in the manual, then they fixed the manual errors

GPS Positioning Concepts

(1)

For now make 2 assumptions:

We know the distance to each satellite

We know where each satellite is

With this information from 2 satellites – you know you are on a “plane of intersection”.

Require 3 satellites for a 3-D position in this “ideal” scenario

Require 4 satellites to account for local receiver clock drift

Determining Time

Use the PRN code to determine time

Use time to determine distance to the satellite

distance = speed of light * time

(1)

Signal sent by satellite

Signal received by you

You know the signal sent

Perform correlations till you get a match
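The idea of "correlate till you get a match" can be sketched in a few lines of Python. This is only an illustration, not the code from class: the random ±1 code, the 1023-chip length, the 1 ms code period and the names circular_correlation and estimate_delay are all assumptions made for the example.

```python
import random

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def circular_correlation(received, reference):
    """Correlate received against reference for every circular shift."""
    n = len(reference)
    return [
        sum(received[(i + shift) % n] * reference[i] for i in range(n))
        for shift in range(n)
    ]


def estimate_delay(received, reference):
    """Return the shift (in samples) that gives the largest correlation."""
    corr = circular_correlation(received, reference)
    return max(range(len(corr)), key=lambda s: corr[s])


# Hypothetical stand-in for a PRN code: a fixed random +/-1 sequence.
rng = random.Random(2005)
prn = [rng.choice((-1, 1)) for _ in range(1023)]

# Simulate a received signal that is the known code delayed by 200 samples.
true_delay = 200
received = [prn[(i - true_delay) % len(prn)] for i in range(len(prn))]

delay = estimate_delay(received, prn)

# Assumed timing: one sample per chip, 1023 chips per millisecond.
chip_time = 1e-3 / 1023
distance = SPEED_OF_LIGHT * delay * chip_time  # distance = speed of light * time
```

The delay found this way is only known within one code period, so a real receiver needs more information before the distance is usable.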

The practice

Suppose we have a vector of in-phase and out-of-phase data, gathered over an antenna from a satellite for example. Gain issues scale it by 16:

-16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, etc.

Question – if the original data from the satellite had this form:

-1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, …

How much is the satellite data delayed? FOR THIS EXAMPLE the pattern repeats every 3 samples, so the answer is 0, 3, 6, 9, 12 etc.
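A quick check of this example in Python. This is a sketch under assumptions: it uses the textbook complex correlation (multiply by the conjugate of the reference), a circular shift and only 5 repeats of the pattern, none of which come from the slides.

```python
# One period of the satellite pattern from the slide, repeated 5 times.
pattern = [-1 - 1j, 1 + 1j, 1 + 1j]
reference = pattern * 5

# Received data is the same pattern scaled by the gain of 16.
received = [16 * value for value in reference]


def correlate(received, reference, shift):
    """Circular correlation at one shift, conjugating the reference."""
    n = len(reference)
    return sum(
        received[(i + shift) % n] * reference[i].conjugate() for i in range(n)
    )


# Real part of the correlation at every possible shift.
real_parts = [correlate(received, reference, s).real for s in range(len(reference))]

# Shifts where the correlation reaches its maximum.
peak = max(real_parts)
peak_shifts = [s for s, value in enumerate(real_parts) if value == peak]
print(peak_shifts)  # [0, 3, 6, 9, 12]
```

Every third shift gives the same maximum, which is why the delay in this toy example cannot be pinned down to a single value.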

Tackle the issue with FIR

First – modify correlation function to handle complex values

Ignore that issue at the moment

– Each tap goes from 1 add + 1 multiplication + 2 memory fetches to 3 adds + 4 multiplications + 4 memory fetches

Imagine 1024 data points + 1024 PRN

Need to do 1024 FIR, each of 1024 taps

We know how to optimize to do 2 taps every cycle (one in X and one in Y)

Total time is 1024 * 512 cycles ≈ 1 ms at 500 MHz

XCORRS can do 8 * 16 = 128 taps each cycle in each compute block – 128 times faster
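The arithmetic behind these numbers, as a small Python sketch. The constant names are made up for the example; the values are the ones quoted on the slide.

```python
CLOCK_HZ = 500e6          # 500 MHz clock from the slide
N_SHIFTS = 1024           # one FIR per PRN shift
N_TAPS = 1024             # taps per FIR
FIR_TAPS_PER_CYCLE = 2    # one tap in X plus one tap in Y

# Cycles for the optimized FIR approach: 1024 FIRs of 1024 taps, 2 taps per cycle.
fir_cycles = N_SHIFTS * N_TAPS // FIR_TAPS_PER_CYCLE
fir_time_ms = fir_cycles / CLOCK_HZ * 1e3

# XCORRS: 8 data values against 16 shifts in one cycle, per compute block.
xcorrs_taps_per_block = 8 * 16

# The FIR approach manages one tap per compute block per cycle.
speedup_per_block = xcorrs_taps_per_block / 1

print(fir_cycles, fir_time_ms, speedup_per_block)
```

The product 8 * 16 works out to 128 taps per cycle in each compute block, so the per-block gain over one tap per cycle is a factor of 128, before any overhead for loading data and PRN values.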

Where does the CLU fit in?

XCORRS definition

THEORY: Mathematical definition

Uses registers

TR -- accumulate

D -- 8 data?

C -- 1 coefficient?

And something called CUT – essentially a window operation

fcut = 0 -- don’t use

2005 Lab. 4

Satellite data

Quad fetch brings in 8 complex values, 8 bits each

Pattern here is -1 + 0j, 1 + 0j, 1 + 0j, -1 + 0j, 1 + 0j, 1 + 0j, …

PRN code – 2 bit complex number

Seems strange to have two dummy bits

But actually makes sense

PRN: -1 - j, 1 + j, 1 + j, -1 - j, 1 + j, 1 + j, …

+1, -1 are associated with the PSK – more in another lecture

Problem: BINARY means 1 and 0, so how do we represent 1 and -1?

-1 are stored as 1’s, +1 are stored as 0’s (DAMY)

PRN

0x3 value goes in as C15 and C16

0011 -- C15 = -1 - j, C16 = +1 + j
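The 2-bit encoding can be sketched in Python. Two things are assumptions for illustration: that the high bit of each pair is the real sign and the low bit the imaginary sign, and that the lowest pair of bits belongs to the lower-numbered coefficient. The function names are made up. The slide's 0x3 example is consistent with both choices, because both bits in each pair are equal.

```python
def decode_prn_pair(two_bits):
    """Decode one 2-bit PRN value: a stored 1 means -1, a stored 0 means +1.

    Assumption: high bit is the real sign, low bit is the imaginary sign.
    """
    real = -1 if (two_bits >> 1) & 1 else 1
    imag = -1 if two_bits & 1 else 1
    return complex(real, imag)


def decode_prn_word(word, count):
    """Decode `count` coefficients, lowest pair of bits first (assumed order)."""
    return [decode_prn_pair((word >> (2 * k)) & 0b11) for k in range(count)]


# The slide's example: 0x3 = 0011 in binary.
c15, c16 = decode_prn_word(0x3, 2)
print(c15, c16)  # (-1-1j) (1+1j)
```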

Loading the THR registers

Standard XCORRS instruction

Lower 46 bits of THR1:0

R7:4

TR0, TR1, TR2 ……. TR15

TR15:0 = XCORRS(R7:4, THR3:0)

Doing 8 complex taps of 16 correlations at each cycle

TR0 += D7 * C22 + D6 * C21 + … 8 taps
TR1 += D7 * C21 + D6 * C20 + … 8 taps
…
TR15 += D7 * C7 + D6 * C6 + … 8 taps

64 taps each cycle – on both x and y compute blocks – if set up properly

128 taps each cycle – these are “complex taps” compared to 2 real taps / cycle after lab. 3
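The listing above follows one indexing rule: in TRk, data value Di is multiplied by coefficient C(i + 15 - k). Here is a Python model of a single XCORRS step built on that rule. It is a reading of the slide's listing, not the manual's definition: it uses the plain product D * C as written on the slide (no conjugation) and ignores bit widths and saturation. The function name xcorrs is made up.

```python
def xcorrs(tr, data, coeffs):
    """Model one XCORRS step without CUT.

    tr     -- 16 accumulators TR0..TR15
    data   -- 8 complex data values D0..D7
    coeffs -- at least 23 complex coefficients C0..C22
    Returns the 16 updated accumulators.
    """
    assert len(tr) == 16 and len(data) == 8 and len(coeffs) >= 23
    return [
        tr[k] + sum(data[i] * coeffs[i + 15 - k] for i in range(8))
        for k in range(16)
    ]


# Label each coefficient with its own index so the pairing is visible.
coeffs = [complex(j, 0) for j in range(23)]

# Only D7 is non-zero, so each TRk shows which coefficient D7 meets.
d7_only = [0] * 7 + [1]
tr = xcorrs([0] * 16, d7_only, coeffs)
print(tr[0].real, tr[1].real, tr[15].real)  # 22.0 21.0 7.0
```

With only D7 set, TR0 picks up C22, TR1 picks up C21 and TR15 picks up C7, matching the first term of each line in the listing.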

TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)

Because of offsets, sometimes we must only use “some of the taps”

TR0 += D7 * C22 + D6 * C21 + … 8 taps
TR1 += D7 * C21 + D6 * C20 + … 8 taps
…
TR14 += D7 * C8 + D6 * C7 2 taps
TR15 += D7 * C7 1 tap

TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)

TR0 += D7 * C22 + D6 * C21 … 8 taps
TR1 += D7 * C21 + D6 * C20 … 7 taps
…
TR7 += D7 * C15 … 1 tap
TR8 += 0 … 0 taps
…
TR15 += 0 … 0 taps

TR15:0 = XCORRS(R7:4, THR3:0) (CUT +7?)

TR0 += 0 … 0 taps
TR1 += D0 * C14 1 tap
…
TR7 += D6 * C14 + D5 * C13 + … 7 taps
TR8 += D7 * C14 + D6 * C13 + … 8 taps
…
TR15 += D7 * C7 + D6 * C6 + … 8 taps
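The tap counts in the three CUT listings can be reproduced with a simple window rule. The rule is inferred from the listings only and is an assumption, not the manual's definition: a negative CUT of -n keeps coefficients Cn and above, a positive CUT of +n keeps coefficients below Cn. The positive listing matches n = 15, which is the value the sketch uses. The function name taps_used is made up.

```python
def taps_used(k, cut):
    """Count the taps that contribute to TRk under the assumed CUT window."""
    count = 0
    for i in range(8):
        c_index = i + 15 - k  # coefficient paired with Di in TRk
        if cut < 0 and c_index < -cut:
            continue  # negative cut: drop coefficients below C(-cut)
        if cut > 0 and c_index >= cut:
            continue  # positive cut: drop coefficients at or above C(cut)
        count += 1
    return count


for cut in (0, -7, -15, 15):
    print(cut, [taps_used(k, cut) for k in range(16)])
```

For CUT -7 this gives 8 taps for TR0 through TR8 and then 7 down to 1 tap for TR9 through TR15. For CUT -15 it gives 8 down to 1 for TR0 through TR7 and 0 after that. For +15 it gives 0 up to 7 for TR0 through TR7 and 8 after that.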

TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)

TR0 += D7 * C22 + D6 * C21 … 8 taps
TR1 += D7 * C21 + D6 * C20 … 7 taps
…
TR7 += D7 * C15 … 1 tap
TR8 += 0 … 0 taps
…
TR15 += 0 … 0 taps

TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)

TR0 += D7 * C22 + D6 * C21 + … 8 taps
TR1 += D7 * C21 + D6 * C20 + … 8 taps
…
TR14 += D7 * C8 + D6 * C7 2 taps
TR15 += D7 * C7 1 tap

TR15:0 = XCORRS(R7:4, THR3:0)

TR0 += D7 * C22 + D6 * C21 + … 8 taps
TR1 += D7 * C21 + D6 * C20 + … 8 taps
…
TR15 += D7 * C7 + D6 * C6 + … 8 taps

64 taps each cycle – on both x and y compute blocks – if set up properly

128 taps each cycle – these are “complex taps” compared to 2 real taps / cycle after lab. 3

Problem at this point -- THR3:2 empty

Need to bring in more PRN values

TR15:0 = XCORRS(R7:4, THR3:0) (CUT +15)

TR0 += 0 … 0 taps
TR1 += D0 * C14 1 tap
…
TR7 += D6 * C14 + D5 * C13 + … 7 taps
TR8 += D7 * C14 + D6 * C13 + … 8 taps
…
TR15 += D7 * C7 + D6 * C6 + … 8 taps

Final Result

Maximum correlation occurs every 3 shifts – which is what we expect

Is it the correct result?

Correlation – result expected

In step: -1 + 0j, 1 + 0j, 1 + 0j, … 16 times

with -1 - j, 1 + j, 1 + j, … 16 times

-1 * -1 + 1 * 1 + 1 * 1 + … = 48 = 0x30 -- Real component

Out of step: -1 + 0j, 1 + 0j, 1 + 0j, … 16 times

with 1 + j, 1 + j, -1 - j, … 16 times

-1 * 1 + 1 * 1 + 1 * -1 + … = -16 = -0x10 = 0xFFF0
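The two expected values can be checked in Python. The sketch assumes the plain product D * C with no conjugation, as in the slide's arithmetic, and shows the real component as a 16-bit two's complement hex value. The helper names correlate and to_hex16 are made up.

```python
# One period of the Lab. 4 data and of the PRN code, in and out of step.
data_period = [-1 + 0j, 1 + 0j, 1 + 0j]
prn_in_step = [-1 - 1j, 1 + 1j, 1 + 1j]
prn_out_of_step = [1 + 1j, 1 + 1j, -1 - 1j]


def correlate(data, prn):
    """Sum of element-by-element products D * C."""
    return sum(d * c for d, c in zip(data, prn))


def to_hex16(value):
    """Format a number as a 16-bit two's complement hex string."""
    return f"0x{int(value) & 0xFFFF:04X}"


# Each pattern repeated 16 times, as on the slide (48 samples).
in_step = correlate(data_period * 16, prn_in_step * 16)
out_of_step = correlate(data_period * 16, prn_out_of_step * 16)

print(in_step.real, to_hex16(in_step.real))          # 48.0 0x0030
print(out_of_step.real, to_hex16(out_of_step.real))  # -16.0 0xFFF0
```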

Final Result

1) Now have correlation values for 16 shifts in TR registers – store to external memory. Repeat for all other necessary shifts – find the maximum

2) Now make parallel in SISD mode

3) Now make parallel in SIMD
