ARCHITECTURAL IMPLICATIONS OF, A PARALLEL COMPUTATIONAL APPROACH

TO THE VECTOR WAVE EQUATION

Thesis

Jack L. Strauss
Captain, USAF

AFIT/GE/ENG/87M-5

This document has been approved for public release and sale; its distribution is unlimited.

DEPARTMENT OF THE AIR FORCE

AIR UNIVERSITY

AIR FORCE INSTITUTE OF TECHNOLOGY

Wright-Patterson Air Force Base, Ohio


AFIT/GE/ENG/87M-5

ARCHITECTURAL IMPLICATIONS OF

A PARALLEL COMPUTATIONAL APPROACH

TO THE VECTOR WAVE EQUATION
Thesis

Presented to the Faculty of the School of Engineering

of the Air Force Institute of Technology

Air University

In Partial Fulfillment of the

Requirements for the Degree of

Master of Science in Electrical Engineering

Jack L. Strauss

Captain, USAF

March 1987

Approved for public release; distribution unlimited

Preface

This research is part of the continuing development effort from the Vector Wave Equation (VWE) Research Group. The work reported in this thesis is the algorithmic manipulation of the VWE to form a parallel computational approach which, when presented in a graphical form, provides a specification tool for Very Large Scale Integration (VLSI) implementation.

It is my firm belief that the effort of the VWE Research Group can make a significant contribution to the "State-of-the-Art" in the design and analysis of structures that are impacted by electromagnetic quantities or involve electromagnetic metrics as a design constraint. As such, it has been a pleasure and an honor to be involved in this work.

It is appropriate here that I acknowledge and give thanks to the people and forces that made this research possible. The AFIT computer environment is the best that I have ever been involved with. My Thesis Advisor, Maj Joe DeGroat, encouraged creative approaches to solutions rather than the security of classical methods. These factors combined to make for a research environment that was both exciting and powerful.

I would be terribly remiss if I did not give special mention to my wife Debra, whose support, encouragement, and tolerance of the long hours required to do this work made this thesis possible.

Finally, I give thanks to the Holy Father and his son Jesus Christ in whom I believe.

Jack L. Strauss


Table of Contents

Preface
List of Figures
Abstract

I. Introduction
    Electromagnetic Background
    Approach
    Error Expression
    Utilization of Results

II. Development of Parallel VWE Algorithm
    Vector Potential Derivation
    Magnetic Field Derivation
    Electric Field Derivation

III. Development of VWE Error Expression
    The VWE Core
    Assignment of Error Terms
    Bottom-up Derivation of Error Model
    The VWE Error Model

IV. Analysis
    Removal of Operational Redundancy
    Maximal Concurrency
    Architectural Considerations

V. Conclusions and Recommendations

Appendix A: VWE Processing in MACSYMA
Appendix B: Data-flow Graphs of Vector Quantities
Bibliography
Vita

List of Figures

Figure

1. General Shape Antenna
2. Decomposition Algorithm
3. Real Part of "X" Directed Vector Potential (reduced)
4. Combined Cartesian Set for Vector Potential (reduced)
5. Real Part of "X" Directed Magnetic Field (reduced)
6. Combined Cartesian Set for Magnetic Field (reduced)
7. Real Part of "X" Directed Electric Field (reduced)
8. Combined Cartesian Set for Electric Field (reduced)
9. VWE Core Equation
10. Number FLOPS vs Dipole Elements for a given Computational Approach
11. Fully Overlayed Vector Quantities (reduced)
12. Block Diagram and Utilization of 202 Functional Unit Architecture
13. Block Diagram and Utilization of 202 Functional Unit Pipelined Architecture
14. Block Diagram and Utilization of 51 Functional Unit Architecture
15. Time vs Dipole Elements for a given Architectural Approach
16. Real Part of the "X" Directed Vector Potential
17. Imaginary Part of the "X" Directed Vector Potential
18. Real and Imaginary Parts (paired) of the "X" Directed Vector Potential
19. Combined Cartesian Set for the Vector Potential
20. Real Part of the "X" Directed Magnetic Field
21. Imaginary Part of the "X" Directed Magnetic Field
22. Real and Imaginary Parts (paired) of the "X" Directed Magnetic Field
23. Combined Cartesian Set for the Magnetic Field
24. Real Part of the "X" Directed Electric Field
25. Imaginary Part of the "X" Directed Electric Field
26. Real and Imaginary Parts (paired) of the "X" Directed Electric Field
27. Combined Cartesian Set for the Electric Field
28. Fully Overlayed Vector Quantities


Abstract

An algorithm has been specified for the hardware implementation of the numerical solution of the electromagnetic fields of an arbitrary current source. The algorithm, defined as the Vector Wave Equation (VWE), solves for the magnetic (H) and electric (E) fields, as well as the vector potential (A), of a finite length arbitrary current source. The specified algorithm has been verified through FORTRAN simulation to produce results accurate to within 2 decimal places for a 500 sub-element dipole.

The VWE forms a model which is algorithmically symmetric with respect to the cartesian coordinate system. As such, the VWE lends itself to highly parallel and concurrent computational techniques. This property of the algorithm makes it an excellent candidate for implementation by a Very High Speed Integrated Circuit (VHSIC) class processor. Investigation of a parallel, highly concurrent architectural implementation has yielded preliminary results indicating that a computational savings of a factor of 3 and a throughput rate increase of 5 orders of magnitude are attainable.

This research has shown that an application specific VHSIC-class processor array has sufficient computing power to support interactive calculation of a set of equations which solve for the magnetic and electric fields, as well as the vector potential, of an arbitrary current source.


I. Introduction

This thesis is the study, development, and analysis of a parallel computational approach to

the set of electromagnetic field equations defined below as the Vector Wave Equations (VWE). The charter

of this effort was to specify a parallel processing engine for the VWE that could be implemented in either

software or hardware. As such, the derivation of a parallel algorithm, specification of computational flow,

and an analysis of the algorithm's behavior with respect to error/accuracy considerations became the focus of

this work.

This study was motivated by the end objective of developing an interactive CAD/CAM environment for aerodynamic systems design where a structure's electromagnetic observability, specifically its radar cross section, is known early in the design phase of a project. The capability of having a structure's observability known and interactively updated as the design progresses and is changed requires calculation of near-field electromagnetic wave equations. Numerical solution of these equations is a computationally intensive problem that requires millions of floating point calculations.

At present, software approaches on classical computer architectures have extremely long run times.

Because of this constraint, only simplified antenna configurations and geometries are examined. Consequently,

there is a need for the development of a very fast customized VHSIC-class processor engine for

calculating the radiated fields of arbitrary antennas. This processor engine would be part of a workstation

which integrates the processor's electromagnetic computing capabilities along with interactive aerodynamic

structural design techniques, such that electromagnetic metrics become an integral part of the design phase.

The design phase is the most effective time to influence the Radar Cross Section (RCS) of an object. Once the design has been implemented, RCS reduction must rely on the addition of heavy radar absorbing material, and active/passive cancellation. Radar absorbing material helps to dissipate the incident energy upon its surface, thereby reducing the RCS of the target. However, this material can add considerable weight and thereby degrade the operational characteristics of the structure. Active/passive cancellation techniques that reduce the RCS become intractable after design implementation due to the large size of the structures and current analysis capabilities. Consequently, it is more cost effective to make structural changes within a CAD environment during the design phase.

With numerical analysis limited to simplified geometries, complex structures such as an aerospace vehicle require RCS measurements with scale models at compact ranges, or with full size models at outdoor ranges. This approach is time consuming, costly, and subject to scaling problems and poor weather conditions. However, the incorporation of a VHSIC-class processor engine, capable of high speed calculation of the electromagnetic equations, into an interactive CAD workstation makes possible real time observations of electromagnetic metrics. As such, the impact of modifications to the object's shape and composition materials on its RCS is known during its design.

This introductory text will include a discussion of the electromagnetic background and development

of the VWE. The approach used in the derivation of the parallel algorithm will be stated. Motivation

and basic assumptions in the derivation of the error expression will be discussed and referenced. Finally,

some thoughts on the utilization and usefulness of the results of this study will be touched on.

Electromagnetic Background

The basis for modern antenna theory lies with the set of formulas known as Maxwell's equations.

By utilizing Maxwell's equations, the total radiation pattern can be derived for a given antenna (acting as a

scatterer). The total field of an antenna is the sum of the incident and the scattered fields of that antenna.

To find the scattered field, the incident field is subtracted from the total field. Once the current distribution

on the scatterer is found, the radiation integral given by Maxwell can be numerically computed.

A mid-point summation technique was employed to arrive at a numerical solution for the vector potential (A) [Jones, 1985:6]. Initially it was anticipated that the vector potential would be used as an intermediate result in obtaining the final electric (E) and magnetic (H) fields. The solution for the vector potential involves an approximation where the dipole is divided into a finite number of sub-elements. Equation (1) defines the mathematical formula relating the vector potential solution to the general antenna problem as depicted in Figure 1. The evaluation of the complex integral in equation (1) is necessary to analyze even the simplest antenna. The desired magnetic and electric fields are derived from equations (2) and (3). The mathematical procedure is a vector curl operation, for which this study uses cartesian coordinates.

$$\mathbf{A} = \sum_{i=1}^{m} \Delta A_i \, \mathbf{J}_i(\mathbf{r}_i) \, \frac{e^{-j\beta |\mathbf{r} - \mathbf{r}_i|}}{4\pi |\mathbf{r} - \mathbf{r}_i|} \tag{1}$$

where $\Delta A_i$ = Surface Area of the $i^{th}$ element
      $\mathbf{J}_i(\mathbf{r}_i)$ = Complex Current Density of the $i^{th}$ element
      $\mathbf{r}_i$ = Distance from origin to the $i^{th}$ element
      $\beta$ = wave number ($2\pi/\lambda$)

[Figure 1. General Shape Antenna, with the observation point at (x, y, z)]

$$\mathbf{H} = \nabla \times \mathbf{A} \tag{2}$$

$$\mathbf{E} = \frac{1}{j\omega\epsilon} \nabla \times \mathbf{H} \tag{3}$$

where $\omega$ = the radian frequency
      $\epsilon$ = the dielectric constant

However, it was found that the numerical result for the vector potential is not necessary in

computing the magnetic and electric fields. The mid-point summation method employed to obtain the

vector potential can be used for the magnetic and electric fields directly [Hoyt, 1986]. Equations (4) and (5)

below show the magnetic and electric fields in the 'x' direction for cartesian coordinates. Equations for the

y and z vector components are of equal complexity and similar in form to those in the x direction for the

general antenna configuration. Evaluation of these equations is further complicated by current densities (J)

being complex quantities in the general case.

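$$H_x = \sum_{i=1}^{m} h \left( \frac{1}{r} + j\beta \right) \frac{e^{-j\beta r}}{4\pi r^2} \left[ J_{y_i}(z - z_i) - J_{z_i}(y - y_i) \right] \tag{4}$$

$$E_x = \sum_{i=1}^{m} \frac{h \, e^{-j\beta r}}{j\omega\epsilon \, 4\pi r} \left\{ J_{x_i} \left( \beta^2 - \frac{1}{r^2} - \frac{j\beta}{r} \right) + \frac{(x - x_i)}{r^2} \left( -\beta^2 + \frac{3}{r^2} + \frac{3j\beta}{r} \right) \left[ J_{x_i}(x - x_i) + J_{y_i}(y - y_i) + J_{z_i}(z - z_i) \right] \right\} \tag{5}$$

where h = length of the dipole divided by the number of sub-elements

      $r = |\mathbf{r} - \mathbf{r}_i| = \sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2}$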

Hoyt derived and validated these equations, defined here to be the discrete Vector Wave Equations

(VWE), with two current densities of different geometries on a simple z-directed dipole. The algorithm was

evaluated with varying numbers of sub-elements to determine the number required to give an accuracy of

two decimal places. The results of Hoyt's study show that the numerical summation algorithm produces

results accurate to within 1% for 95 percent of the observation points for a 500 sub-element dipole. This

study exploits the inherent symmetry of the algorithm with respect to the cartesian coordinate system and,

as such, leads to a highly parallel and concurrent computational approach.
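As an illustration of the mid-point summation, the following Python sketch accumulates the x-directed magnetic field over the sub-elements of a dipole. It is not Hoyt's FORTRAN code. It assumes the summand has the form h (1/r + jβ) e^(-jβr) / (4π r²) [Jy (z - zi) - Jz (y - yi)], which follows from taking the curl of the vector potential, and it assumes a uniform z-directed current. The names `hx_midpoint` and `z_dipole` and all numeric values are hypothetical.

```python
import cmath
import math


def hx_midpoint(obs, elements, h, beta):
    """Mid-point summation of the x-directed magnetic field.

    Assumed summand (curl of the vector potential of one sub-element):
        h * (1/r + j*beta) * exp(-j*beta*r) / (4*pi*r**2)
          * (Jy*(z - zi) - Jz*(y - yi))
    """
    x, y, z = obs
    total = 0j
    for (xi, yi, zi), (jx, jy, jz) in elements:
        dx, dy, dz = x - xi, y - yi, z - zi
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        kernel = (1.0 / r + 1j * beta) * cmath.exp(-1j * beta * r) / (4.0 * math.pi * r * r)
        total += h * kernel * (jy * dz - jz * dy)
    return total


def z_dipole(length, n, current=1.0 + 0j):
    """Split a z-directed dipole centred at the origin into n sub-elements.

    Each entry holds the mid-point of a sub-element and its (Jx, Jy, Jz).
    A uniform current is an assumption made only for this sketch.
    """
    h = length / n
    elements = [
        ((0.0, 0.0, -length / 2.0 + (k + 0.5) * h), (0j, 0j, current))
        for k in range(n)
    ]
    return elements, h


wavelength = 1.0
beta = 2.0 * math.pi / wavelength
elements, h = z_dipole(length=0.5 * wavelength, n=500)

hx_on_x_axis = hx_midpoint((2.0, 0.0, 0.0), elements, h, beta)
hx_on_y_axis = hx_midpoint((0.0, 2.0, 0.0), elements, h, beta)
```

For a z-directed current the field circulates around the z axis, so the x component vanishes at observation points on the x axis and is nonzero on the y axis. Each pass through the loop body is one "ith calculation" in the sense used in the Approach section.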

Approach

With Hoyt's results verifying that an acceptable level of accuracy can be obtained with 500 sub-

elements, and the knowledge that any meaningful construct will be formed from thousands, possibly

millions of finite length dipoles, clearly the critical path to reducing the computational intensity

is to calculate the quantities within the summation formula as efficiently as possible. Therefore

this thesis effort is concerned with the ith calculation.

The approach used in the derivation of the computational expressions which yield the desired results is threefold. First, the technique of a top down functional decomposition was performed for each of the vector quantities of interest. This procedure yielded a database of computational expressions in which intermediate results could be studied throughout the calculation process. Second, the calculation flow of each vector quantity was mapped graphically to form the construct of a data-flow graph. These data-flow graphs yield visual evidence of the inherent symmetry of these expressions and provide the parallel computational specification desired as a result of this work. Third, the data-flow graphs were overlayed to find the levels of concurrency that are attainable.

The processes of functional decomposition and the mapping into the directed graphs are detailed in chapter two of this thesis. Results in computational requirements reduction at various levels of computational concurrency are detailed in chapter four, the analysis section of the thesis. Finally, a complete set of calculation flow graphs is presented in appendix B for reference and concise access.

Error Expression

There are two types of error associated with any computer computation. The first type is error attributable to errors in data. This type of error is a function of the inherent error in finite word length and data representation. Also, the error transmitted as the value of some function to another process creates a cumulative input error to subsequent computations. The second type, generated error, is a grouping of factors such as truncation, algorithmic cancellation of significant digits, and others that are generated by the particular implementation of the function [Cody and Waite, 1980: 11]. Of the two types, the programmer or designer has direct control of generated error but must observe the system's performance to gain insight into the "effects" and "behavior" of inherent and transmitted error.

The analysis of the cumulative effects of transmitted error requires extensive simulation for

nontrivial systems. Analytic techniques rely on the Stochastic Model [Taylor, 1983: 391] where an

arbitrary input is processed, and the result is of the form Y[X(n)] plus some delta representing the error

indicative of the process. Manually tracking these types of data for a system of even moderate size is error

prone, at best. The problem becomes intractable for complex systems.
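The idea of a result Y[X(n)] plus a delta can be made concrete with a small numerical experiment. The Python sketch below is not the MACSYMA error model of chapter three. It is an assumed illustration that rounds to IEEE single precision after every input and every operation of a distance calculation, then compares the result with a double precision reference. The helper names `to_single`, `distance_double`, and `distance_single` are hypothetical.

```python
import math
import struct


def to_single(value):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def distance_double(dx, dy, dz):
    """Reference computation carried out entirely in double precision."""
    return math.sqrt(dx * dx + dy * dy + dz * dz)


def distance_single(dx, dy, dz):
    """Same computation with a rounding step after every input and operation."""
    # Inherent error: the inputs themselves are stored in a finite word length.
    dx, dy, dz = to_single(dx), to_single(dy), to_single(dz)
    # Each rounded result is transmitted as input error to the next operation.
    sx = to_single(dx * dx)
    sy = to_single(dy * dy)
    sz = to_single(dz * dz)
    total = to_single(to_single(sx + sy) + sz)
    return to_single(math.sqrt(total))


reference = distance_double(0.1, 0.2, 0.3)   # plays the role of Y[X(n)]
approx = distance_single(0.1, 0.2, 0.3)      # Y[X(n)] plus delta
delta = approx - reference
relative_error = abs(delta) / reference
```

Here `delta` is the accumulated effect of inherent and transmitted rounding error for this one short chain of operations. Tracking such terms by hand across the full VWE is the bookkeeping problem the paragraph above describes.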

Chapter three of this thesis implements the concept of the Stochastic Model in terms of inherent

and generated error to gain visibility of the behavior of the Parallel VWE with respect to cumulative

transmitted error. The process is performed in MACSYMA and is presented at a tutorial level.

Utilization of Results

In this final section of introductory text, a brief discussion of the intended utilization and resultant

usefulness of the results of this thesis effort is appropriate. Three areas will be addressed: the target

processing environment, intended use of the Error Expression, and the use of MACSYMA in handling

large sets of equations.

Target Processing Environment. As stated by way of the end objective being a VHSIC-class processor engine for incorporation into a CAD workstation, the initial target processing environment is a hardware implementation. However, the results of this study have yielded, by means of the data-flow graphs, several levels of computation specifications, independent of the processing environment, that can and should be implemented and tested in software. Exploration into alternate architecture software implementation of any of the levels of computational concurrency is sure to yield time savings far greater than any classical computation techniques.

Error Expression. The intended use of the derived error expression discussed above is as an analytical model for hardware implementation design trade-off decisions. Because the level of abstraction is purely independent of any elementary function implementation, the result is a model that a VLSI designer can use without constraint to any one particular elementary function implementation. This type of model, when complemented with empirical data from simulation or known levels of accuracy from previous designs, can aid in the assurance of overall system accuracy being known during the design phase.

Use of MACSYMA. One of the most daunting aspects of the analysis of nontrivial sets of

equations is maintaining analytic clarity and accuracy of the expression as the set of equations is

manipulated from one step to the next. MACSYMA, a symbolic equation processor, has proven

through the course of this study to be a tool capable of handling the enormous book-keeping tasks as

well as providing a full range of mathematical functions to handle any computational task required. The

use of MACSYMA in this thesis effort is extensive and fully documented. The intent of the presentation

at this level of detail is to provide a guide for future applications of this type.

II. Development of Parallel VWE Algorithm

The process of functional decomposition of the VWE equations through the use of MACSYMA is

presented in a step by step explanation of the Parallel Vector Potential derivation. The results of the more

complex expressions for the electric and magnetic fields are given in a less tutorial form for brevity. The

mapping of the parallel algorithm into the data-flow graph construct is also shown below.

The process is one of substituting variables into an equation. These variables represent

"allowable" elementary functions. Allowable is connoted to mean either a hardware functional element or a

software process implemented in the target processing environment.

[Figure 2. Decomposition Algorithm (flowchart: input base equations; real and imaginary parts; perform operation by substituting variable denoting level of computation; stop)]

The variables are coded such that computational and algorithmic clarity is maintained. That is, the coding scheme should clearly define computational levels in the overall calculation, while maintaining end-to-end algorithmic correctness. For this study the ALPHA characters A, B, C, ... denote distinct computational levels while the concatenated numeric designators 1, 2, 3, ... uniquely define a specific elementary function. For example, "E5" might mean a specific multiplication at the "E" computational level. This process is achieved by the algorithm given above in Figure 2.

Vector Potential Derivation

The cartesian expression for the Discrete Vector Potential is:

$$A = \sum (A_x + A_y + A_z)$$

    where  Ax = G * (jxri + j jxii)
           Ay = G * (jyri + j jyii)
           Az = G * (jzri + j jzii)

noting that jxri is the real part of the complex current density in the x direction, jxii is the imaginary part, with similar notation being used for the other coordinate system elements,

$$G = \frac{e^{-j\beta r}}{4\pi r}$$

which is the free space form of Green's Function,

and r is the vector distance between the ith element and the observation point; $\beta$ is the wave number ($2\pi/\lambda$).

These expressions are entered into MACSYMA as follows:

    g : (%e^(-%i*b*r))/(4*%pi*r);

    ax : g*(jxri + %i*jxii);

    ay : g*(jyri + %i*jyii);

    az : g*(jzri + %i*jzii);

The resulting MACSYMA presentation for the three vector coordinate system elements is shown below:

                               - %i b r
           (jxri + %i jxii) %e
    ax;    ----------------------------
                     4 %pi r

                               - %i b r
           (jyri + %i jyii) %e
    ay;    ----------------------------
                     4 %pi r

                               - %i b r
           (jzri + %i jzii) %e
    az;    ----------------------------
                     4 %pi r

Once entered, the top-down functional decomposition of the x directed vector potential may begin.

The process is identical for the y and z coordinate system elements.

From the above expression for Ax (ax) the real and imaginary parts are extracted. The real and

imaginary parts of the Vector Potential in the x direction are defined as axr and axi below:

    realpart(ax);

            jxii sin(b r) + jxri cos(b r)
    axr;    -----------------------------
                      4 %pi r

    imagpart(ax);

            jxii cos(b r) - jxri sin(b r)
    axi;    -----------------------------
                      4 %pi r
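The extracted real and imaginary parts can be checked numerically against the complex form. The short Python sketch below is an illustration only (the thesis performs this step symbolically in MACSYMA). It evaluates G (jxri + j jxii) with complex arithmetic and compares the result with the closed forms for axr and axi. The sample values are arbitrary assumptions.

```python
import cmath
import math


def ax_complex(jxri, jxii, b, r):
    """Evaluate ax = G * (jxri + j*jxii) with G = exp(-j*b*r) / (4*pi*r)."""
    g = cmath.exp(-1j * b * r) / (4.0 * math.pi * r)
    return g * complex(jxri, jxii)


def axr(jxri, jxii, b, r):
    """Real part as extracted by realpart(ax)."""
    return (jxii * math.sin(b * r) + jxri * math.cos(b * r)) / (4.0 * math.pi * r)


def axi(jxri, jxii, b, r):
    """Imaginary part as extracted by imagpart(ax)."""
    return (jxii * math.cos(b * r) - jxri * math.sin(b * r)) / (4.0 * math.pi * r)


# Arbitrary sample values, chosen only for this check.
sample = dict(jxri=0.7, jxii=-0.3, b=2.0 * math.pi, r=1.25)
value = ax_complex(**sample)
real_gap = abs(value.real - axr(**sample))
imag_gap = abs(value.imag - axi(**sample))
```

Both gaps come out at the level of floating point rounding, which is consistent with expanding (jxri + j jxii)(cos(b r) - j sin(b r)) by hand.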

These real and imaginary parts may be operated on according to the above substitution algorithm.

First, however, recall that "r" is the vector distance between the ith element and the observation point. As

the discrete VWE is given here in cartesian coordinates, the cartesian expression for r is required. This substitution

gives an initial, fully decomposed form of the real and imaginary parts (denoted by axr0 and axi0) as:

    axr0;  (jxii sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))

            + jxri cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)))

           /(4 %pi sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))

    axi0;  (jxii cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))

            - jxri sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)))

           /(4 %pi sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))

Following the development process, the first substitution is A1, A2, and A3 for (x-xi), (y-yi), and (z-zi). At this point in the substitution process, these are the only allowable elementary operations capable of being performed. All other calculations within the equation are contingent upon the completion of these. Therefore, there is a delineation of a computational level. The status of the equation after this first level of substitution is:

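               jxii sin(b sqrt(A3^2 + A2^2 + A1^2)) + jxri cos(b sqrt(A3^2 + A2^2 + A1^2))
    axr1;      ---------------------------------------------------------------------------
                                    4 %pi sqrt(A3^2 + A2^2 + A1^2)

               jxii cos(b sqrt(A3^2 + A2^2 + A1^2)) - jxri sin(b sqrt(A3^2 + A2^2 + A1^2))
    and axi1;  ---------------------------------------------------------------------------
                                    4 %pi sqrt(A3^2 + A2^2 + A1^2)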

This process follows the procedures defined above and the status at each computational level is

generated. The status after the second substitution, where A1^2, A2^2, and A3^2 are replaced by B1, B2, and

B3, respectively, is:

               jxii sin(b sqrt(B3 + B2 + B1)) + jxri cos(b sqrt(B3 + B2 + B1))
    axr2;      ---------------------------------------------------------------
                                 4 %pi sqrt(B3 + B2 + B1)

               jxii cos(b sqrt(B3 + B2 + B1)) - jxri sin(b sqrt(B3 + B2 + B1))
    and axi2;  ---------------------------------------------------------------
                                 4 %pi sqrt(B3 + B2 + B1)

We note two things here. First, the quantity 4π (4 %pi) is considered as an input constant such that a separate multiplication is not required. Secondly, in the third level of computation shown below, the function C1 is a triple summation. The development of the VLSI Triple Summer will be a future thesis project and was accepted as an allowable elementary function for this study. With the triple summer, the status after the third level of substitution is:

               jxii sin(b sqrt(C1)) + jxri cos(b sqrt(C1))
    axr3;      -------------------------------------------
                            4 %pi sqrt(C1)

               jxii cos(b sqrt(C1)) - jxri sin(b sqrt(C1))
    and axi3;  -------------------------------------------
                            4 %pi sqrt(C1)

The square root operation is performed next via substitution of D1. The status after the fourth level is:

               jxii sin(b D1) + jxri cos(b D1)
    axr4;      -------------------------------
                          4 %pi D1

               jxii cos(b D1) - jxri sin(b D1)
    and axi4;  -------------------------------
                          4 %pi D1

Next, the multiplication of b and D1 is performed, for which we substitute E1. Here we also have

the multiplication of the input constant 4π (4 %pi) and D1 replaced with E2. The status after the fifth substitution

is:

               jxii sin(E1) + jxri cos(E1)
    axr5;      ---------------------------
                           E2

               jxii cos(E1) - jxri sin(E1)
    and axi5;  ---------------------------
                           E2

The process continues with the substitutions F1, F2, and F3 for cos(E1), sin(E1), and the inversion process of E2, respectively. Note that an inversion process is used rather than a division to provide a more general specification for the hardware implementation. The status after level six is:

    axr6;      (jxii F2 + jxri F1) F3

    and axi6;  (jxii F1 - jxri F2) F3.

The next substitutions are for the multiplications of F1 and F2 with the complex components of

the current density. The status after the seventh level of computation is:

    axr7;      F3 (G4 + G1)

    and axi7;  F3 (G3 - G2).

The next operations to be performed are the remaining addition and subtraction from above. The

status after level eight is:

axr8; F3 H1

and axi8; F3 H2.

Finally, we multiply the quantities above. Recall that F3 is the resultant inversion quantity from level six. The final results are:

    axr9;  J1    The real part of the vector potential in the "X" direction

    axi9;  J2    The imaginary part of the vector potential in the "X" direction.
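The nine levels can be traced in software to confirm that the decomposed form reproduces the original complex expression. The following Python sketch is an illustrative assumption, not the hardware specification. Variable names follow the level designators used above, and the pairing G4 = jxii * F2 is inferred from axr7 = F3 (G4 + G1). The function name `ax_by_levels` and the sample inputs are hypothetical.

```python
import cmath
import math


def ax_by_levels(x, y, z, xi, yi, zi, jxri, jxii, b):
    """Evaluate the x-directed vector potential one computational level at a time."""
    # Level A: coordinate differences.
    A1, A2, A3 = x - xi, y - yi, z - zi
    # Level B: squares.
    B1, B2, B3 = A1 * A1, A2 * A2, A3 * A3
    # Level C: triple summation, treated as one elementary function.
    C1 = B1 + B2 + B3
    # Level D: square root, giving the distance r.
    D1 = math.sqrt(C1)
    # Level E: products with the wave number and with the input constant 4*pi.
    E1 = b * D1
    E2 = (4.0 * math.pi) * D1
    # Level F: cosine, sine, and inversion (used instead of a division).
    F1, F2, F3 = math.cos(E1), math.sin(E1), 1.0 / E2
    # Level G: products with the current density components.
    # G4 = jxii * F2 is inferred from axr7 = F3 (G4 + G1).
    G1, G2, G3, G4 = jxri * F1, jxri * F2, jxii * F1, jxii * F2
    # Level H: remaining addition and subtraction.
    H1, H2 = G4 + G1, G3 - G2
    # Level J: final products (the letter "I" is skipped).
    J1, J2 = F3 * H1, F3 * H2
    return J1, J2


# Arbitrary sample inputs for a consistency check against the complex form.
args = dict(x=1.0, y=2.0, z=3.0, xi=0.1, yi=-0.2, zi=0.3,
            jxri=0.7, jxii=-0.3, b=2.0 * math.pi)
J1, J2 = ax_by_levels(**args)

r = math.sqrt((args["x"] - args["xi"]) ** 2
              + (args["y"] - args["yi"]) ** 2
              + (args["z"] - args["zi"]) ** 2)
direct = (cmath.exp(-1j * args["b"] * r) / (4.0 * math.pi * r)
          * complex(args["jxri"], args["jxii"]))
```

Within each level the assignments are independent of one another, which is the property that the data-flow graphs exploit for concurrency.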

Mechanically, the process is one of scanning the equation to check if an allowable elementary

function can be performed. If one can, it is coded by the conventions stated above. That is, the function is

given the ALPHA designator for the current computational level followed by a numeric designator uniquely

defining that specific function. The resultant coded variable is substituted into the equation, and the same

process is continued until no more elementary functions can be performed. The ALPHA designator is

incremented, and the process is continued until a single variable remains signifying that the calculation

process is complete.
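The scanning procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node names and the (name, operands) encoding of the real part of Ax are hypothetical, and the numeric designators are assigned sequentially within each level, so they need not match the numbering used in the listings (for example, the second product at level seven is numbered G2 here).

```python
import string

# Level letters "A", "B", ... with "I" skipped, as the text describes.
LEVEL_LETTERS = [c for c in string.ascii_uppercase if c != "I"]

# Hypothetical encoding of the real part of Ax: (node name, operands).
# Operands that are not node names are treated as inputs (level 0).
OPS = [
    ("dx", ["x", "xi"]), ("dy", ["y", "yi"]), ("dz", ["z", "zi"]),
    ("dx2", ["dx"]), ("dy2", ["dy"]), ("dz2", ["dz"]),
    ("r2", ["dx2", "dy2", "dz2"]),
    ("r", ["r2"]),
    ("br", ["b", "r"]), ("four_pi_r", ["4pi", "r"]),
    ("cos_br", ["br"]), ("sin_br", ["br"]), ("inv", ["four_pi_r"]),
    ("jxri_cos", ["jxri", "cos_br"]), ("jxii_sin", ["jxii", "sin_br"]),
    ("numerator", ["jxri_cos", "jxii_sin"]),
    ("axr", ["inv", "numerator"]),
]

def assign_designators(ops):
    """Give each operation a level (1 + deepest operand) and a designator."""
    levels, counters, names = {}, {}, {}
    for node, operands in ops:  # ops are assumed to be listed in dependency order
        level = 1 + max(levels.get(o, 0) for o in operands)
        levels[node] = level
        counters[level] = counters.get(level, 0) + 1
        names[node] = f"{LEVEL_LETTERS[level - 1]}{counters[level]}"
    return levels, names

levels, names = assign_designators(OPS)
for node, _ in OPS:
    print(f"{names[node]:>3}  level {levels[node]}  {node}")
```

Because a node's level is one more than the deepest operand it depends on, the final product lands on the ninth level, which receives the letter J once "I" is skipped.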


Note that for this study, the above process was accomplished manually. However, this process clearly lends itself to, and would benefit from, automated techniques. Also, the letter "I" was not used in defining computational levels to avoid confusion between alpha/numeric characters.

The results of this process provide us with a database of computational expressions at each level where the substitution designators, A1, B3, etc., indicate floating point operations required in the evaluation of the equation. Once this process is accomplished for each of the coordinate system elements, a listing of the required substitutions is generated. The listing for the entire cartesian set is generated by eliminating the redundant calculations. The combined calculations form a parallel computational approach to the vector potential. Below is a listing for the discrete calculation of the real part of Ax and the combined cartesian set with the redundant calculations removed:

Discrete Ax real

Level one:    subst("A1",(x-xi),%)$
              subst("A2",(y-yi),%)$
              subst("A3",(z-zi),%)$

Level two:    subst("B1","A1"^2,%)$
              subst("B2","A2"^2,%)$
              subst("B3","A3"^2,%)$

Level three:  subst("C1","B1"+"B2"+"B3",%);

Level four:   subst("D1",sqrt("C1"),%);

Level five:   subst("E1",b*"D1",%)$
              subst("E2",(4*%pi*"D1"),%);

Level six:    subst("F1",cos("E1"),%)$
              subst("F2",sin("E1"),%)$
              subst("F3",1/"E2",%);

Level seven:  subst("G1",jxri*"F1",%)$
              subst("G4",jxii*"F2",%);

Level eight:  subst("H1","G1"+"G4",%);

Level nine:   subst("J1","F3"*"H1",%);

Combined Cartesian Set (A)

Level one:    subst("A1",(x-xi),%)$
              subst("A2",(y-yi),%)$
              subst("A3",(z-zi),%)$

Level two:    subst("B1","A1"^2,%)$
              subst("B2","A2"^2,%)$
              subst("B3","A3"^2,%)$

Level three:  subst("C1","B1"+"B2"+"B3",%);

Level four:   subst("D1",sqrt("C1"),%);

Level five:   subst("E1",b*"D1",%)$
              subst("E2",(4*%pi*"D1"),%);

Level six:    subst("F1",cos("E1"),%)$
              subst("F2",sin("E1"),%)$
              subst("F3",1/"E2",%);

Level seven:  subst("G1",jxri*"F1",%)$
              subst("G2",jxri*"F2",%)$
              subst("G3",jxii*"F1",%)$
              subst("G4",jxii*"F2",%)$
              subst("G5",jyri*"F1",%)$
              subst("G6",jyri*"F2",%)$
              subst("G7",jyii*"F1",%)$
              subst("G8",jyii*"F2",%)$
              subst("G9",jzri*"F1",%)$
              subst("G10",jzri*"F2",%)$
              subst("G11",jzii*"F1",%)$
              subst("G12",jzii*"F2",%);

Level eight:  subst("H1","G1"+"G4",%);
              subst("H2","G3"-"G2",%);
              subst("H3","G5"+"G8",%);
              subst("H4","G7"-"G6",%);
              subst("H5","G9"+"G12",%);
              subst("H6","G11"-"G10",%);

Level nine:   subst("J1","F3"*"H1",%);
              subst("J2","F3"*"H2",%);
              subst("J3","F3"*"H3",%);
              subst("J4","F3"*"H4",%);
              subst("J5","F3"*"H5",%);
              subst("J6","F3"*"H6",%);
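As a rough cross-check of the combined listing, the nine levels can be evaluated numerically and compared with a direct complex evaluation. The sketch below assumes that the level expressions correspond to A = J exp(-j b r) / (4 pi r), which is the form implied by the real and imaginary parts at level four; the function name, the dictionary layout for the current density, and the sample values are illustrative only.

```python
import cmath
import math

def combined_vector_potential(x, y, z, xi, yi, zi, b, j):
    """Evaluate the combined Cartesian set level by level.

    j maps "x", "y", "z" to a complex current-density component (assumed layout).
    Returns ([Ax, Ay, Az] as complex numbers, number of operations performed).
    """
    ops = 0
    a1, a2, a3 = x - xi, y - yi, z - zi                 # level one
    ops += 3
    b1, b2, b3 = a1 * a1, a2 * a2, a3 * a3              # level two
    ops += 3
    c1 = b1 + b2 + b3                                   # level three (triple summation)
    ops += 1
    d1 = math.sqrt(c1)                                  # level four
    ops += 1
    e1, e2 = b * d1, (4 * math.pi) * d1                 # level five (4*pi is an input constant)
    ops += 2
    f1, f2, f3 = math.cos(e1), math.sin(e1), 1 / e2     # level six
    ops += 3
    result = []
    for axis in "xyz":
        jr, ji = j[axis].real, j[axis].imag
        g_r_cos, g_r_sin = jr * f1, jr * f2             # level seven
        g_i_cos, g_i_sin = ji * f1, ji * f2
        ops += 4
        h_real = g_r_cos + g_i_sin                      # level eight
        h_imag = g_i_cos - g_r_sin
        ops += 2
        result.append(complex(f3 * h_real, f3 * h_imag))  # level nine
        ops += 2
    return result, ops

# Arbitrary sample values, chosen only for the comparison.
j = {"x": 1 + 2j, "y": -0.5 + 0.25j, "z": 3 - 1j}
point, source, beta = (1.0, 2.0, 3.0), (0.5, -1.0, 0.25), 2.0
values, ops = combined_vector_potential(*point, *source, beta, j)

r = math.sqrt(sum((p - s) ** 2 for p, s in zip(point, source)))
direct = [j[k] * cmath.exp(-1j * beta * r) / (4 * math.pi * r) for k in "xyz"]
print(ops, values, direct)
```

The operation counter reproduces the 37 operations of the combined set: 13 shared operations through level six, plus 8 per Cartesian component.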

It is noted that there is a one to one correspondence of the substitution designators to the nodes of a data-flow graph. A graphical representation is generated by mapping the above listings into data-flow graphs. The data-flow graphs for the real part of the "x" directed vector potential and the combined cartesian set are shown in Figures 3 and 4.

Figure 3 Real Part of "X" Directed Vector Potential

Figure 4 Combined Cartesian Set for Vector Potential

The substitution process for the vector potential is presented in MACSYMA syntax in Appendix A. Data-flow graphs for all vector quantities are given in Appendix B.

Magnetic Field Derivation

The equation for the Discrete Magnetic Field in cartesian coordinates is:

H = (Hx + Hy + Hz)

The complexity of the equations precludes a tutorial presentation. The processing of these field equations is of such magnitude that manipulation becomes intractable without an equation processor such as MACSYMA. For example, at the point when the cartesian expression for the vector distance "r" is substituted into the expression for the real part of the magnetic field in the "x" direction forming "Hxr0", the equation has the form:

((cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) (jyri (z - zi) - jzri (y - yi))
    + sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) (jyii (z - zi) - jzii (y - yi)))
   / sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)
  - b (cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) (jyii (z - zi) - jzii (y - yi))
       - sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) (jyri (z - zi) - jzri (y - yi))))
 / (4 %pi ((z - zi)^2 + (y - yi)^2 + (x - xi)^2)).

The substitution process for the magnetic field is carried out similarly to that of the vector potential described above. The listings are generated and the data-flow graphs are formed. The data-flow graphs for the real part of the "x" directed magnetic field and the combined cartesian set are given below in Figures 5 and 6. The complete substitution process for the magnetic field is presented in MACSYMA syntax in Appendix A. Data-flow graphs for all vector quantities are given in Appendix B.

Figure 5 Real Part of "X" Directed Magnetic Field

Figure 6 Combined Cartesian Set for Magnetic Field

Electric Field Derivation

The equation for the Discrete Electric Field in cartesian coordinates is:

E = (Ex + Ey + Ez)

The complexity of the electric field equations is even greater than that of the equations for the magnetic field. For example, at the same point illustrated above, when the cartesian expression for the vector distance "r" is substituted into the expression for the real part of the electric field in the "x" direction forming "Exr0", the equation has the form:

2 2 2 3 b (gz (x- x (z- z) + jyri (x- xi) (y- yi))(sin(b sqrt((z - zi) + (y-yi + (x- xo )) ( - ---------

2 2 2sqrl((z-zi) +(y-yi) +(x-xi)

2 1+ (jxii (b -

2 2 2(z- zI + (y- yi + (x- xi)

3 br 2 2((z- zi) + (y- Ai

'1" 2 2 2

sqrt((z- zi) + (y- yi) + (x- xi)


3 2+ ( -- b ) (jzii (x - x) (z - zi) + jyii (x - xi) (y - Yi))

2 2 2(z- zI) + (y- Y + (x- xi)

2 2 2 3 b (jzii (x- XI (z- z + jyii (x-x i (y- y))+ cos(b sqrt((z - zi) + (y - yi) + (x - xi) )) (

2 2 2sqrt((z- zi) + (y- y) + (x- x)

3 b jxii+ ( ------------

2 2 2sqrt((z- zi) + (y- yi) + (x- xi)

2 1 2 2+ jxri (b - ----- -)) ((z - zo + (y - yi)

2 2 2(z- zi) + (y- y + (x- xi)

3 2, (. 2b ) (jzri (x- xi) (z - zi) + jyr (x- xi) (y - yi))))

, 2 2 2

(z-zi) +(y-y +(x-xi)

2 2 23/2/(4 0/pi ((z- zi) + (y- y) + (x- xi)

The substitution process for the electric field is carried out similarly to that for the vector potential and magnetic field described above. The listings are generated, and the data-flow graphs are formed. The data-flow graphs for the real part of the "x" directed electric field and the combined cartesian set are given below in Figures 7 and 8. The complete substitution process for the electric field is presented in MACSYMA syntax in Appendix A. Data-flow graphs for all vector quantities are given in Appendix B.

Figure 7 Real Part of "X" Directed Electric Field

Figure 8 Combined Cartesian Set for Electric Field

It is important to note that the data-flow graphs developed here provide several types of specification data. A programmer can use these graphs as the algorithmic specification to reduce processing by eliminating redundant computations. The computer architect will note the high degree of concurrent processing that can be achieved in an "Alternate Architecture" approach. Further, the VLSI designer now has a system architecture specification at the functional element level. The impact and implications of this parallel computational approach are developed in chapter four of this thesis.

III. Development of the Error Expression

In this section we seek to develop an analytic expression which will provide a model for implementation trade-off analysis for the parallel VWE with respect to accuracy/error considerations. In doing so we define an algorithmic entity as the VWE Core Equation. This expression is common to all three vector quantities of interest to this thesis effort. The development of the error expression is a bottom-up process in which error terms are assigned to each of the allowable elementary functions and to the input data. The resultant terms are substituted into the VWE core and the computation, as specified in chapter two, is carried out. The final expression includes the arithmetic quantity desired and an upper bound on the cumulative results of the functional and generated error. This resultant expression forms a model for analysis of the VWE Core Equation with error terms as the parametric variables.

The text below is organized into the definition of the VWE core and assignment of the error terms for the allowable elementary functions, detailed presentation of the bottom-up derivation of the error expression, and the resulting error model which can be used for implementation trade-off analysis.

The VWE Core

We see from the data-flow graphs that the VWE Core Equation forms two quantities:

cos(β sqrt((x-xi)^2 + (y-yi)^2 + (z-zi)^2))

and

sin(β sqrt((x-xi)^2 + (y-yi)^2 + (z-zi)^2))

For this discussion we will consider the cosine term only. A graphical representation of the VWE Core Equation is shown below in figure 9.

Figure 9 VWE Core Equation (data-flow graph; inputs: x, xi, y, yi, z, zi; outputs: COS[β SQRT((x-xi)^2 + (y-yi)^2 + (z-zi)^2)] and SIN[β SQRT((x-xi)^2 + (y-yi)^2 + (z-zi)^2)])

Assignment of Error Terms

The approach for deriving the error model will be to assign an error term by arithmetic operation. That is, all additions will be assigned the same error term, all multiplications will be assigned the same error term, and so forth. A summary of the error terms assigned to the arithmetic operations is given below:

Operation                      Error Term

Data Representation            e0
Addition/Subtraction           e1
Multiplication                 e2
Square Root                    e3
Sin/Cos                        e4
Inversion                      e5
Triple Summation               e6
Cumulative Error (Level: L)    egL

Table 1 Error Terms

These error terms are defined as the total error inherent in the input data (bit/word length), and the error generated as the function value (resolution, truncation, etc.). A notational convenience will be used when the cumulative expression detracts from analytical clarity: the cumulative expression is given a designator defining its computational level and is treated as a separate arithmetic quantity at higher levels. This approach facilitates implementation of the stochastic error model for complex systems and yields a final error expression that is independent of any specific elementary function implementation. Thus, the error expression provides only specification data and does not restrict the VLSI designer to any one (analyzed) implementation.

Bottom-up Derivation of Error Model

As in chapter two, this section will be tutorial with respect to the level of MACSYMA syntax used in its presentation. The intent is to provide an example to future users for similar applications.


We start by entering the core into MACSYMA:

core:cos(b*sqrt((x-xi)^2 + (y-yi)^2 + (z-zi)^2));

(c24) core;

(d24) cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))

We proceed similarly to the approach taken in the calculation flow analysis. The text below illustrates the steps taken in this analysis. The following series of substitutions replace the input values x, xi, y, ... zi with a construct of the form x + e0, xi + e0, ... zi + e0. The "e0" term accounts for the inherent error of the finite length binary representation used in any computer architecture.

subst(x-xi+2*e0,x-xi,%);

cos(b sqrt((z - zi)^2 + (y - yi)^2 + (- xi + x + 2 e0)^2))

subst(y-yi+2*e0,y-yi,%);

cos(b sqrt((z - zi)^2 + (- yi + y + 2 e0)^2 + (- xi + x + 2 e0)^2))

subst(z-zi+2*e0,z-zi,%);

cos(b sqrt((- zi + z + 2 e0)^2 + (- yi + y + 2 e0)^2 + (- xi + x + 2 e0)^2))

Next we perform the arithmetic operations of level "A" and add the associated error terms for the subtraction operation "e1". We note here the procedure that will be used in the derivation of the error expression. To form an upper bound expression, we assume the form of the general addition of error terms. That is, we will not attempt to arithmetically cancel error terms at the atomic level. To do so would be to assume more about a target architecture than is intended by this study. As such, the following MACSYMA substitution processes are of the form (x-xi) => (x-xi) + e1.

subst("A1" + e1+2*e0,x-xi+2*e0,%)$

subst("A2" + e1+2*e0,y-yi+2*e0,%)$

subst("A3" + e1+2*e0,z-zi+2*e0,%);


The resulting form of the VWE Core after the first level of computation is significantly more complex than that of chapter two, again exemplifying the need to use an equation processor. The status is shown as core1 below:

core1;

cos(b sqrt((A3 + e1 + 2 e0)^2 + (A2 + e1 + 2 e0)^2 + (A1 + e1 + 2 e0)^2))

Next we multiply out the inner terms to enable us to proceed. We can also see the effect of cumulative error, noting the "3 e1^2" and "12 e0^2" terms.

expand(%);

cos(b sqrt(A3^2 + 2 e1 A3 + 4 e0 A3 + A2^2 + 2 e1 A2 + 4 e0 A2 + A1^2 + 2 e1 A1 + 4 e0 A1 + 3 e1^2 + 12 e0 e1 + 12 e0^2))
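The expansion above can be checked independently of MACSYMA with exact rational arithmetic. The short Python sketch below is only a spot check at arbitrarily chosen sample values, not a symbolic proof; the function names are illustrative.

```python
from fractions import Fraction as F

def squared_sum(A1, A2, A3, e0, e1):
    """Sum of the three squared level-A terms, before expansion."""
    return sum((A + e1 + 2 * e0) ** 2 for A in (A1, A2, A3))

def expanded_form(A1, A2, A3, e0, e1):
    """The same quantity written term by term, as in the expanded status."""
    return (A3**2 + 2*e1*A3 + 4*e0*A3
            + A2**2 + 2*e1*A2 + 4*e0*A2
            + A1**2 + 2*e1*A1 + 4*e0*A1
            + 3*e1**2 + 12*e0*e1 + 12*e0**2)

# Arbitrary exact sample values for A1, A2, A3, e0, e1.
sample = (F(3, 2), F(-5, 7), F(11, 4), F(1, 1000), F(-1, 512))
print(squared_sum(*sample) == expanded_form(*sample))
```

Each squared term contributes e1^2 + 4 e0 e1 + 4 e0^2, so three of them give the 3 e1^2, 12 e0 e1, and 12 e0^2 terms.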

The calculations of level two are performed by first substituting for the squaring (^2) operation. Then we substitute in the error terms associated with the squaring operation (the same as for multiplication). The process is shown below:

subst("B1","A1"^2,%)$
subst("B2","A2"^2,%)$
subst("B3","A3"^2,%);

cos(b sqrt(B3 + B2 + B1 + 2 e1 A3 + 4 e0 A3 + 2 e1 A2 + 4 e0 A2 + 2 e1 A1 + 4 e0 A1 + 3 e1^2 + 12 e0 e1 + 12 e0^2))

subst("B1" + e2,"B1",%)$
subst("B2" + e2,"B2",%)$
subst("B3" + e2,"B3",%);

The status of the computation at level two is:

core2;

cos(b sqrt(B3 + B2 + B1 + 2 e1 A3 + 4 e0 A3 + 2 e1 A2 + 4 e0 A2 + 2 e1 A1 + 4 e0 A1 + 3 e2 + 3 e1^2 + 12 e0 e1 + 12 e0^2))

To perform the computation of level three (the triple summation), we need to substitute the entire inner expression, as MACSYMA will not allow sub-atomic substitutions. The process and the status at level three are given below:

subst("C1"+e6+2*e1+"A3"+4*e0*"A3"+2*e1*"A2"+4*e0*"A2"+2*e1*"A1"+4*e0*"A1"+3*e2+3*e1^2+12*e0*e1+12*e0^2,
      "B3"+"B2"+"B1"+2*e1*"A3"+4*e0*"A3"+2*e1*"A2"+4*e0*"A2"+2*e1*"A1"+4*e0*"A1"+3*e2+3*e1^2+12*e0*e1+12*e0^2,%);

Note that we have added the error term e6 as part of the substitution.

core3;

cos(b sqrt(C1 + e6 + 4 e0 A3 + A3 + 2 e1 A2 + 4 e0 A2 + 2 e1 A1 + 4 e0 A1 + 3 e2 + 3 e1^2 + 12 e0 e1 + 2 e1 + 12 e0^2))

At this point we wish to simplify the expression. We will define "eg3" as the generated error at level three. The development is as follows:

let X = e6 + 4 e0 A3 + A3 + 2 e1 A2 + 4 e0 A2 + 2 e1 A1 + 4 e0 A1 + 3 e2 + 3 e1^2 + 12 e0 e1 + 2 e1 + 12 e0^2

This expression represents all of the terms inside the square root operator except C1. Recall the identity

sqrt(A+B) = sqrt((A+B)*(A/A)) = sqrt[(A+B)/A] sqrt(A)

We then define eg3 = sqrt[(C1+X)/C1] and perform the substitution, with the resulting expression shown below:

subst(b*sqrt("C1")*eg3, b*sqrt("C1"+e6+2*e1+"A3"+4*e0*"A3"+2*e1*"A2"+4*e0*"A2"+2*e1*"A1"+4*e0*"A1"+3*e2+3*e1^2+12*e0*e1+12*e0^2),%);

core3.1;

cos(b eg3 sqrt(C1))
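The factoring step that introduces eg3 is easy to confirm numerically. In the sketch below the values of C1 and X are arbitrary placeholders, and the function name is an assumption made for illustration; the identity requires C1 > 0 and C1 + X >= 0.

```python
import math

def eg3(c1, x_err):
    """Multiplicative generated error at level three: sqrt((C1 + X) / C1)."""
    return math.sqrt((c1 + x_err) / c1)

c1, x_err = 14.0, 0.003            # placeholder values, C1 > 0
left = math.sqrt(c1 + x_err)       # sqrt(C1 + X)
right = eg3(c1, x_err) * math.sqrt(c1)
print(left, right)
```

When X is zero the factor is exactly 1, which is the first of the two conditions that appear later in the error model.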


We note here that eg3 has a multiplicative relationship to the desired atomic expression "sqrt(C1)". As we proceed to level four, we have not violated any arithmetic axiom, as eg3 has the form of the square root operation inherent in it, as shown in the development above.

The computation and status at level four are given below:

subst(b*("D1" + e3)*eg3, b*eg3*sqrt("C1"),%);

core4;

cos(b eg3 (D1 + e3))

We can expand the inner expression as follows:

core4.1;

cos(b eg3 D1 + b e3 eg3)

At this step we see that the expression yields a multiplicative error term which is not readily separable from within the argument of the cosine term. As such, we proceed to level five with the cumulative error as shown above. The operation is performed and the error term is substituted below:

subst("E1",b*"D1",%);
subst("E1"+e2,"E1",%);

This last substitution yields:

cos(E1 eg3 + (β e3 eg3) + e2)

which by trigonometric substitution has the form:

cos(E1 eg3) cos((β e3 eg3) + e2) - sin(E1 eg3) sin((β e3 eg3) + e2)

We see at this stage that we can get the desired term:

cos(E1)

if and only if:

eg3 = 1

and

(β e3 eg3) + e2 = n 2π, n = 0,1,2,3,...


We now have a closed form expression that in a limited sense models the behavior of the VWE

calculation. With the exception of the inversion or division function we now have an expression which

includes all of the arithmetic operations (and the associated error terms) required in the calculation of the

VWE.

The VWE Error Model follows from the derivation above. We saw that the desired value "cos(E1)" is attained if and only if:

eg3 = 1

and

(β e3 eg3) + e2 = n 2π, n = 0,1,2,3,...

where

eg3 = sqrt[(C1+X)/C1] = sqrt[1 + X/C1]

C1 = (x-xi)^2 + (y-yi)^2 + (z-zi)^2
   = (x^2 - 2 x xi + xi^2 + y^2 - 2 y yi + yi^2 + z^2 - 2 z zi + zi^2)

X = 4 e0 A3 + A3 + 2 e1 A2 + 4 e0 A2 + 2 e1 A1 + 4 e0 A1 + e6 + 3 e2 + 3 e1^2 + 12 e0 e1 + 2 e1 + 12 e0^2

A1 = x-xi

A2 = y-yi

A3 = z-zi

These relationships constitute a model. The designer with known error metrics for the functional elements which implement the required elementary operations can substitute those metrics into the expressions and gain visibility of the impact that any particular implementation will have on the system's accuracy well before bread boarding a design. This process supports design trade-off decisions with respect to "system accuracy" specifications at a much earlier and more cost effective phase of the project.
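One way a designer might exercise such a model is to propagate trial error metrics through the core numerically. The Python sketch below is an illustration under stated assumptions, not the thesis model itself: it applies the error terms step by step in the order of the bottom-up derivation (2 e0 and e1 at level one, e2 per square, e6 at the summation, e3 at the square root, e2 at the multiplication by β) instead of using the closed-form X, and all names and sample values are hypothetical.

```python
import math

def perturbed_core(point, source, beta, e0=0.0, e1=0.0, e2=0.0, e3=0.0, e6=0.0):
    """Return (perturbed cosine, eg3, exact cosine) for the cosine core term."""
    diffs = [p - s for p, s in zip(point, source)]
    c1 = sum(d * d for d in diffs)                 # exact C1 (assumed > 0)

    a = [d + 2 * e0 + e1 for d in diffs]           # level one with input and subtraction error
    b_sq = [v * v + e2 for v in a]                 # level two with squaring error
    c = sum(b_sq) + e6                             # level three with triple-summation error
    eg3 = math.sqrt(c / c1)                        # multiplicative generated error
    arg = beta * eg3 * (math.sqrt(c1) + e3) + e2   # levels four and five
    return math.cos(arg), eg3, math.cos(beta * math.sqrt(c1))

point, source, beta = (1.0, 2.0, 3.0), (0.5, -1.0, 0.25), 2.0
ideal = perturbed_core(point, source, beta)
trial = perturbed_core(point, source, beta, e0=1e-7, e1=1e-7, e2=1e-7, e3=1e-7, e6=1e-7)
print("ideal:", ideal)
print("trial:", trial, "deviation:", abs(trial[0] - trial[2]))
```

With every error term set to zero the sketch returns eg3 = 1 and the exact cosine, matching the conditions stated above; nonzero trial metrics show how far a candidate implementation moves the result.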


It is imperative to note that this model, alone, may not be suitable for such design decisions.

Exhaustive simulation as well as derivation of a family of VWE error expressions and models may be

required. The complexity of the VWE and the many possible architectural implementations will drive this

issue.

The model developed in this study is a baseline. Further development and analysis is beyond the

scope of this thesis.


IV. Analysis

In this section, the total computational requirements of the parallel VWE are analyzed and the calculated savings at various levels of computational concurrency are presented. The data-flow graphs developed in chapter two are used to extract significant data for this analysis. The significant data derived from the graphical representation includes visibility of operational redundancy within the coordinate components of a specific vector quantity and across the three vector quantities of interest, A, E, and H. Visibility of data dependencies and maximal concurrency possible in the calculation of all components of A, E, and H is readily available. Finally, the graphs provide a means for comparison of implementation of the equation at various levels of computational concurrency.

Removal of Operational Redundancy

Classically, the real and imaginary parts of each of the vector components would be discretely calculated without regard to the fact that several of the intermediate results are reusable in the calculation of other coordinate system elements. For example, when calculating the real part of Hx, there are 30 floating point operations involved. There are, noting symmetry, 30 floating point operations involved in the calculation of the imaginary part of Hx. However, as is clearly visible in the graphical representation, 26 of these operations are identical. Therefore both quantities may be obtained in 34 flops. Further analysis of the graphical representation shows that the complete vector component set for both the real and imaginary parts of the magnetic field, H, can be calculated in just 74 flops as opposed to 180 when computed classically. The analysis is summarized in table 2 for all vector quantities of interest.

Also shown in table 2 is an entry for "E & H fields". This entry represents the removal of all computational redundancy within the complete E and H graphical representations and is the overlaying of the two computational processes to form one computational process. The savings of 26 flops achieved by this overlaying is due to a common core of operations present in these vector quantities, similar to that defined in chapter three as the VWE Core Equation. Because of this common core of operations, the vector potential, A, can be integrated into the fully parallel calculation of the three vector quantities of interest, such that all results are obtainable in 202 flops as opposed to 594 flops when computed classically, representing nearly 66% computational savings. The impact of these results is shown in graphical form in figure 10, where the computational requirements are plotted for a given number of dipole elements.

                             Discrete     Paired     Combined

Vector Potential              17/102       21/63        37
H Field                       30/180      34/102        74
E Field                       52/312      56/168       130
E & H Fields                     -           -         170
E & H & Vector Potential         -           -         202

Table 2 Computational Requirements
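The totals quoted from table 2 follow from simple arithmetic, sketched below in Python. The reading of each "a/b" entry as "operations per part / total over the six real and imaginary parts (discrete) or over the three real-imaginary pairs (paired)" is an interpretation of the table, and the variable names are illustrative.

```python
# Operations for one real or imaginary part (discrete) and for one
# real/imaginary pair (paired), taken from the first entry of each table cell.
per_part = {"A": 17, "H": 30, "E": 52}
per_pair = {"A": 21, "H": 34, "E": 56}

discrete_totals = {k: 6 * v for k, v in per_part.items()}   # 3 components x 2 parts
paired_totals = {k: 3 * v for k, v in per_pair.items()}     # 3 components

classical_flops = sum(discrete_totals.values())   # all of A, H, and E computed discretely
combined_flops = 202                              # fully overlaid set, from table 2
savings = 1 - combined_flops / classical_flops

print(discrete_totals, paired_totals)
print(classical_flops, f"{savings:.1%}")
```

The 594 classical operations are the sum of the three discrete totals, and the saving from 594 to 202 operations is just under 66%.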

Figure 10 Number FLOPS vs Dipole Elements for a given Computational Approach (horizontal axis: number of dipole elements, 1 to 1,000,000; curves: Discrete, Paired, Combined)

Maximal Concurrency

Concurrency in this thesis is understood in an architectural sense. That is, concurrency is the simultaneous execution of a number of discrete operations. Specifically, in this thesis, concurrency means the simultaneous execution of floating point arithmetic operations.

The visibility into the data dependencies that the data-flow graphs provide allows us to observe 12 distinct computational levels when the entire set of vector quantities is overlayed, as depicted in figure 11.

Figure 11 Fully Overlayed Vector Quantities

This represents a 98% reduction in linear time when compared to the classical approach. However, this dramatic time reduction is attainable only if the processing environment supports concurrent floating point arithmetic operations at the maximum width of the VWE calculation. The maximum operational width of the VWE is 36, occurring at the seventh computational level, G. These 36 operations include 30 floating point multiplies and six adder/subtractors.

Architectural Considerations

With the end objective of developing an interactive CAD/CAM environment for aerodynamic systems design, the target processing environment is assumed to be a VHSIC class processor, that is, a hardware implementation. However, the computational specification data available from the parallel approach is target processing environment independent in that any current or future architecture can realize processing gain by implementing the parallel VWE at the maximum level of computational concurrency supportable by that architecture.

As shown in table 2 above, discrete calculation of the ith vector element requires 594 FLOPS. Assuming that the accuracy verified by Hoyt is sufficient, we will have 500 of these ith calculations per dipole element. If we define the execution time of the average floating point arithmetic operation to be Ta, then we derive an expression to approximate the time required for the solution of the discrete VWE, which we define as Tdiscrete:

Tdiscrete = [500 * 594 * Ta] * N,

where N = the number of dipole elements in the structure.

Using the results of the parallel approach where all redundant computations have been eliminated, as shown in table 2, there are 202 FLOPS required. Defining Tcombined to be the time required for this solution we have:

Tcombined = [500 * 202 * Ta] * N.


This approach, when compared to the discrete approach, provides a speedup of nearly 3. It is significant to note that this speedup can be achieved on the same architecture as that of the discrete approach by simply recoding the algorithm to use all partial results. The speedup actually achieved on a given architecture may be somewhat less than 3 due to the width of the computation and the resulting need to store partial results in memory versus registers.

If, however, the given architecture supports concurrent floating point arithmetic, then the parallelism shown in the graphical representation may be exploited. In the fully concurrent algorithm there are 12 distinct computational levels. We define the average execution time of the concurrent floating point arithmetic process to be Ta-con. The resultant expression for the time required to calculate the parallel VWE, defined to be Tparallel, is:

Tparallel = [500 * 12 * Ta-con] * N.

Comparing with Tdiscrete gives:

Tdiscrete / Tparallel = 49.5 * (Ta / Ta-con).

This speedup of 49.5 can be achieved if the architecture supports concurrent floating point operations at the maximum width of the parallel VWE and the ratio of Ta to Ta-con is 1. The maximum width is 36. If fewer than 36 functional units are available, the speedup would be proportionally reduced.
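The three timing expressions and the resulting speedups can be tabulated with a few lines of Python. This is a minimal sketch: the function and parameter names are illustrative, the 500 ith calculations per dipole element is the figure quoted in the text, and the 10 ns operation time is only a sample value.

```python
CALCS_PER_ELEMENT = 500   # ith calculations per dipole element, from the text

def t_discrete(n, ta):
    """Time for the classical approach: 594 operations per ith calculation."""
    return CALCS_PER_ELEMENT * 594 * ta * n

def t_combined(n, ta):
    """Time with all redundant operations removed: 202 operations."""
    return CALCS_PER_ELEMENT * 202 * ta * n

def t_parallel(n, ta_con):
    """Time with full concurrency: 12 computational levels."""
    return CALCS_PER_ELEMENT * 12 * ta_con * n

n, ta = 1000, 10e-9       # sample problem size and operation time
speedup_combined = t_discrete(n, ta) / t_combined(n, ta)
speedup_parallel = t_discrete(n, ta) / t_parallel(n, ta)   # Ta / Ta-con = 1
print(round(speedup_combined, 2), round(speedup_parallel, 2))
```

Both ratios are independent of N and of the number of ith calculations per element; only the operation counts and the ratio Ta / Ta-con matter.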

A block diagram for a possible architecture that is defined as the hardware implementation of the fully overlayed data-flow graph of figure 11 is shown below in figure 12. The architecture depicted in Figure 12 is one in which there is a functional unit for each node in the data-flow graph. The functional units are hardwired so that as results become available they are passed to the next unit. Due to the computation's data dependencies, processor functional units at all but the currently active level sit idle. This results in very low component utilization.

Block diagram: VWE Input -> 202 Functional Units (Single Pass Computation) -> VWE Output

Utilization

Level    FLOPS/Level    Util/Level

  1           3         0.01485149
  ...
  4           9         0.04455446
  5          11         0.05445545
  6          16         0.07920792
  7          36         0.17821782
  8          24         0.11881188
  9          24         0.11881188
  ...
 11          12         0.05940594
 12           6         0.02970297

Total => 202            Ave Util => 0.08333333

Figure 12 Block Diagram and Utilization of 202 Functional Unit Architecture

Further improvements in speedup can be achieved by applying pipelining and other architectural techniques. The limit of these approaches would be a hardwired pipelined architecture capable of producing the complete vector component set every major cycle. A block diagram of this architecture is given in figure 13 below.

The depicted pipelined architecture requires the same 202 functional units as the previous architecture with the addition of the interstage registers and pipeline control hardware. The improvement in system performance is due to a result being available every major cycle (once the pipe is full), as mentioned earlier, and to a significant increase in component utilization when a large number of dipole elements are to be calculated. Utilization figures for this architecture are not static as before but are dependent on the number of dipole elements to be input into the pipe. Component utilization approaches 100% in the limit of large numbers of dipoles to be calculated, and the curve is shown in figure 13.
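The shape of the utilization curve can be approximated with a simple pipeline model. The Python sketch below rests on assumptions that are not taken from the thesis: one new ith calculation enters the pipe every major cycle, each of the 12 levels takes one major cycle, and every functional unit is busy exactly once per calculation.

```python
STAGES = 12   # computational levels in the pipe

def pipeline_utilization(n_calcs, stages=STAGES):
    """Fraction of unit-cycles doing useful work while n_calcs results are produced.

    Useful work is n_calcs unit-activations per functional unit; the pipe is
    occupied for n_calcs + stages - 1 major cycles (fill plus drain).
    """
    return n_calcs / (n_calcs + stages - 1)

for n in (1, 12, 100, 10_000, 1_000_000):
    print(n, round(pipeline_utilization(n), 6))
```

Under these assumptions a single calculation gives 1/12, the same average as the single pass architecture, and the value climbs toward 1 as the number of calculations grows.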

Block diagram: VWE Input -> 202 Functional Units (Pipelined Computation Through 12 Levels) -> VWE Output

Utilization curve: after 12 time units of latency, utilization approaches 100% (horizontal axis: number of ith calculations)

Figure 13 Block Diagram and Utilization of 202 Functional Unit Pipelined Architecture

The above architecture's performance can be expressed as follows:

Tpipelined = (500 * Tmc) * N,

where Tmc is the time of a major cycle. This expression shows that the speed-up is limited to approximately two orders of magnitude for a single processor system.


A means of achieving further speed-up is through the application of multiple processors, each capable of computing the VWE component set and taking advantage of the architectural considerations presented thus far.

An architecture capable of supporting 30 floating point multiplies, 18 floating point additions/subtractions, a square root, sin/cos and an inverse or division operation may compute the complete ith VWE calculation in 12 computational levels. This is an architecture with 51 functional units, and a block diagram is given in figure 14. These 51 functional units are reconfigurable to implement the various computational levels specified in figure 11 and hardwired in the previous architectures. This architecture will require extensive use of registers and control circuitry that will increase complexity and area considerations. However, because of the significantly reduced number of functional units, component utilization factors are improved as shown in figure 14. Implementation of this architecture utilizing VHSIC technology and possible wafer scale integration yields a processing engine that may be incorporated as the processor of a multiprocessing array system.

Block diagram: VWE Input -> 51 Functional Units (12 Cycles) -> VWE Output

Utilization

Cycle    FLOPS/Cycle    Util/Cycle

  1           3         0.05882353
  ...
  5          11         0.21568627
  6          16         0.31372549
  ...
  9          24         0.47058824
  ...
 11          12         0.23529412
 12           6         0.11764706

Total => 202            Ave Util => 0.33006536

Figure 14 Block Diagram and Utilization of 51 Functional Unit Architecture
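The unit count and the average utilization quoted for this architecture can be reproduced with the short Python sketch below. It assumes that utilization is the number of operations performed divided by the number of unit-cycles available over the 12 cycles; the names are illustrative.

```python
# Functional units listed in the text for the reconfigurable architecture.
units = {"multiply": 30, "add_subtract": 18, "sqrt": 1, "sin_cos": 1, "inverse": 1}
total_units = sum(units.values())

TOTAL_FLOPS = 202   # operations in the fully overlaid computation
CYCLES = 12         # computational levels

def average_utilization(n_units, flops=TOTAL_FLOPS, cycles=CYCLES):
    """Operations performed divided by unit-cycles available."""
    return flops / (n_units * cycles)

print(total_units, round(average_utilization(total_units), 8))
print(round(average_utilization(202), 8))   # one unit per node, single pass
```

The same formula gives 1/12 for the 202-unit single pass architecture, so reducing the unit count from 202 to 51 raises average utilization by roughly a factor of four.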

The capabilities of the above system are shown in the definition of Tconcurrent below:

Tconcurrent = ([500 * 12 * Tvhsic] * N) / Number of Processors,

where Tvhsic is the execution time of the average arithmetic operation on the VHSIC processor engine. Comparing an array of the above processors to the classical single processor architecture mentioned previously results in a speedup (S) of:

S = 49.5 * (Ta / Tvhsic) * Number of Processors.

We now seek an expression that bounds S in terms of the number of processors used in the array. A lower bound is given by assuming one processor and that the ratio Ta/Tvhsic = 1; that is, a VHSIC implementation of the functional units is of the same order of speed as the average execution time of the classic architecture. The resultant expression is:

$$49.5 \le S \le 49.5 \cdot \frac{T_a}{T_{vhsic}} \cdot \text{Number of Processors}$$

The upper bound makes no assumption as to Ta/Tvhsic or number of processors. Considering a conservative lower limit of 2 for Ta/Tvhsic and rounding the scalar on the right hand side of the inequality to 50, we can express the equation as:

$$49.5 \le S \le 100 \cdot \text{Number of Processors}$$

To achieve a desired 5 orders of magnitude improvement, the above expression indicates that an array of 1000 VHSIC processor engines is required.
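The speedup bounds can be evaluated directly. The following is a minimal sketch of that arithmetic, not part of the original analysis; the function and variable names are hypothetical, and the factor of 100 per processor is the rounded value (50 times a Ta/Tvhsic ratio of 2) used in the text.

```python
import math

SINGLE_PROCESSOR_FACTOR = 49.5  # scalar from the speedup expression in the text


def speedup(time_ratio, processors):
    """S = 49.5 * (Ta / Tvhsic) * number of processors."""
    return SINGLE_PROCESSOR_FACTOR * time_ratio * processors


def processors_for(target_speedup, per_processor_speedup=100.0):
    """Smallest array size whose rounded upper bound reaches the target."""
    return math.ceil(target_speedup / per_processor_speedup)


lower_bound = speedup(1.0, 1)      # one processor, Ta/Tvhsic = 1
rounded_upper = 50 * 2             # scalar rounded to 50, Ta/Tvhsic = 2
array_size = processors_for(1e5)   # five orders of magnitude

if __name__ == "__main__":
    print(lower_bound, rounded_upper, array_size)
```

With the rounded bound of 100 per processor, a target of five orders of magnitude gives the 1000 processor array quoted above.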


A realistic evaluation of an aerodynamic structure may easily require 1 million dipole elements. Using the discrete approach on a Cray 1, this computation would require 8.25 hours, assuming that the 100 MFLOP rate could be uniformly sustained. Even if the parallel algorithm's results are utilized to remove all redundant operations, the computation would still require 2.8 hours. This result clearly does not support an interactive environment. However, the computation for these same 1 million dipole elements would require only 0.6 seconds on the 1000 processor array described above. This is a processing rate capable of supporting an interactive environment.
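The quoted times are consistent with the speedup expression, as the short check below suggests. It is a sketch under one assumption that is not stated at this point in the text: the ratio Ta/Tvhsic is taken as 1, i.e. the Cray 1 and the VHSIC engine are given the same average operation time. The variable names are hypothetical.

```python
CRAY_HOURS_DISCRETE = 8.25   # classical discrete approach (from the text)
CRAY_HOURS_COMBINED = 2.8    # redundant operations removed (from the text)
ARRAY_PROCESSORS = 1000
TIME_RATIO = 1.0             # assumption: Ta / Tvhsic = 1

discrete_seconds = CRAY_HOURS_DISCRETE * 3600.0
array_speedup = 49.5 * TIME_RATIO * ARRAY_PROCESSORS
array_seconds = discrete_seconds / array_speedup
redundancy_gain = CRAY_HOURS_DISCRETE / CRAY_HOURS_COMBINED

if __name__ == "__main__":
    print("discrete:", discrete_seconds, "s")
    print("array speedup:", array_speedup)
    print("array time:", array_seconds, "s")
    print("gain from removing redundancy:", round(redundancy_gain, 2))
```

Dividing 8.25 hours by a speedup of 49,500 reproduces the 0.6 second figure, while removing redundant operations alone gains only about a factor of three.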

The performance of the various architectures discussed above is compared with respect to time in figure 15 below. For this comparison, the specific architecture time variables are all assumed to be 10 ns. The effect of this generalization is nominal with respect to the magnitudes calculated below.

[Figure 15: plot of time (logarithmic scale) versus number of dipole elements (logarithmic scale, 1 to 1,000,000) for each architectural approach. Legend: Classical (Tdiscrete); Classical w/redundancy removed (Tcombined); Parallel Processor (Tparallel); Pipelined Processor (Tpipelined); Array (1000) (Tconcurrent).]

Figure 15 Time vs Dipole Elements for a Given Architectural Approach

V. Conclusions and Recommendations

This study was motivated by the end objective of developing a CAD/CAM environment for aerodynamic systems design; in particular, the capability of having a structure's electromagnetic observability known and interactively updated as the design progresses. Current analytical approaches require calculation of near-field electromagnetic wave equations. Numerical solution of these equations is a computationally intensive problem that requires millions of floating point calculations.

This thesis represents the study, development, and analysis of a parallel computational approach to the set of electromagnetic field equations defined as the Vector Wave Equations (VWE). This effort has produced an initial specification for a parallel processing engine for the VWE that can be implemented in either software or hardware. The analysis points to the conclusion that an interactive environment with respect to electromagnetic metrics is attainable. The cost of this capability is a complex array of VHSIC class processors. Further cost must be considered to address the following issues.

The perspective of this development effort was that a model of the structure existed in a form compatible with the Cartesian representation of the VWE. As such, the issue of creating that model must be addressed. That task will likely be of a similar magnitude to this forward processing task. The sheer size of the model must also be grappled with. Finally, the task of I/O handling and post processing must be completed in order to attain a truly interactive design environment.

Other, possibly more attractive, issues are those of alternate applications for the VWE. An example might be the study of its utility in long lead, high frequency layout analysis where the wavelength is a significant fraction of the conduction path. Further, the applicability of these results to a general class of related partial differential equations may make finite element calculations practical for heat flow, fluid dynamics, and other "vector intensive" areas of study.

Final recommendations for follow-up research with respect to this thesis are in the areas of empirical verification of the computational time savings and hardware implementation. It was stressed throughout this text that, although the objective was a hardware implementation, the algorithmic specification of the data-flow graphs should be coded and tested in software. This area would include development of a VWE simulator at various levels of computational concurrency and testing the algorithm's performance on systems such as the Cray. Further software projects should include a simulator of the VWE with the allowable elementary functions parameterized by function resolution and word size. This project will give empirical visibility into the algorithm's performance with respect to accuracy considerations.

Hardware implementation issues include time-area analysis, optimal scheduling and utilization research, and VHSIC technology concerns. The form of the Parallel VWE, as developed here, may lead to implementation by wafer scale integration. Further investigation into alternate architectures, combined with the resolution of some of the above issues, is likely the path to attaining the "state-of-the-art" goal of an interactive design environment.


Appendix A

This appendix provides the supplementary material for the development of the Parallel VWE. Presented here is the full development process of the three vector quantities (A, H, and E) in MACSYMA syntax. The text is organized into three sections: the development of the Parallel Vector Potential, the Parallel Magnetic Field, and the Parallel Electric Field. The individual sections are further divided into the presentation of the computational flow, which is the development of the parallel electromagnetic quantity, and the command listing which implements that process. These command listings are what is mapped into the data-flow graphs. The data-flow graphs are presented in Appendix B.

Vector Potential Development

This section illustrates the calculation analysis of the Vector Potential. The following pages will show the step by step procedure of the computation of the Parallel Vector Potential.

The vector potential in the "x" direction will be the example through level four, with the complete complex six-tuple shown for the remainder of the process. The text is in MACSYMA syntax and presented as the user might see it on the screen.

(c22) ax;
(d22) ((jxri + %i jxii) %e^(- %i b r))/(4 %pi r)

realpart(ax); The real part is extracted with MACSYMA.
(c23) axr;
(d23) (jxii sin(b r) + jxri cos(b r))/(4 %pi r)

imagpart(ax); The imaginary part is extracted with MACSYMA.
(c24) axi;
(d24) (jxii cos(b r) - jxri sin(b r))/(4 %pi r)
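The real and imaginary parts extracted above can be spot-checked numerically against the complex form of the vector potential. The sketch below is an independent check written for illustration, not part of the MACSYMA session; the function names are hypothetical and the sample values are arbitrary.

```python
import cmath
import math


def ax_complex(jxri, jxii, b, r):
    """A_x = (jxri + i*jxii) * exp(-i*b*r) / (4*pi*r), as in (d22)."""
    return complex(jxri, jxii) * cmath.exp(-1j * b * r) / (4 * math.pi * r)


def ax_real(jxri, jxii, b, r):
    """Real part as displayed in (d23)."""
    return (jxii * math.sin(b * r) + jxri * math.cos(b * r)) / (4 * math.pi * r)


def ax_imag(jxri, jxii, b, r):
    """Imaginary part as displayed in (d24)."""
    return (jxii * math.cos(b * r) - jxri * math.sin(b * r)) / (4 * math.pi * r)


if __name__ == "__main__":
    sample = (1.5, -0.5, 2.0, 3.0)  # arbitrary jxri, jxii, b, r
    value = ax_complex(*sample)
    print(value.real - ax_real(*sample), value.imag - ax_imag(*sample))
```

Both differences should vanish to within floating point rounding for any nonzero r.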

Here we begin the computation process:

The example below recalls the real part of Ax, then substitutes the expanded version of "r" into the expression.

Axr0; Substitute the expanded version of "r" into the expression.
(c25) axr0;
(d25) (jxii sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))
  + jxri cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)))
  /(4 %pi sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))

Axi0; Substitute the expanded version of "r" into the expression.
(c26) axi0;
(d26) (jxii cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))
  - jxri sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)))
  /(4 %pi sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))

After the first level of computation:

Axr1;
(c27) axr1;
(d27) (jxii sin(b sqrt(A3^2 + A2^2 + A1^2)) + jxri cos(b sqrt(A3^2 + A2^2 + A1^2)))/(4 %pi sqrt(A3^2 + A2^2 + A1^2))

Axi1;
(c28) axi1;
(d28) (jxii cos(b sqrt(A3^2 + A2^2 + A1^2)) - jxri sin(b sqrt(A3^2 + A2^2 + A1^2)))/(4 %pi sqrt(A3^2 + A2^2 + A1^2))

After the second level of computation:

Axr2;
(c35) axr2;
(d35) (jxii sin(b sqrt(B3 + B2 + B1)) + jxri cos(b sqrt(B3 + B2 + B1)))/(4 %pi sqrt(B3 + B2 + B1))

Axi2;
(c47) axi2;
(d47) (jxii cos(b sqrt(B3 + B2 + B1)) - jxri sin(b sqrt(B3 + B2 + B1)))/(4 %pi sqrt(B3 + B2 + B1))

After the third level of computation:

Axr3;
(c38) axr3;
(d38) (jxii sin(b sqrt(C1)) + jxri cos(b sqrt(C1)))/(4 %pi sqrt(C1))

Axi3;
(c39) axi3;
(d39) (jxii cos(b sqrt(C1)) - jxri sin(b sqrt(C1)))/(4 %pi sqrt(C1))

After the fourth level of computation:

Axr4;
(c66) axr4;
(d66) (jxii sin(b D1) + jxri cos(b D1))/(4 %pi D1)

Axi4;
(c67) axi4;
(d67) (jxii cos(b D1) - jxri sin(b D1))/(4 %pi D1)

After the fifth level of computation:

Axr5;
(c98) axr5;
(d98) (jxii sin(E1) + jxri cos(E1))/E2

Axi5;
(c99) axi5;
(d99) (jxii cos(E1) - jxri sin(E1))/E2

Ayr5;
(c100) ayr5;
(d100) (jyii sin(E1) + jyri cos(E1))/E2

Ayi5;
(c101) ayi5;
(d101) (jyii cos(E1) - jyri sin(E1))/E2

Azr5;
(c102) azr5;
(d102) (jzii sin(E1) + jzri cos(E1))/E2

Azi5;
(c103) azi5;
(d103) (jzii cos(E1) - jzri sin(E1))/E2

After the sixth level of computation:

Axr6;
(c140) axr6;
(d140) (jxii F2 + jxri F1) F3

Axi6;
(c141) axi6;
(d141) (jxii F1 - jxri F2) F3

Ayr6;
(c142) ayr6;
(d142) (jyii F2 + jyri F1) F3

Ayi6;
(c143) ayi6;
(d143) (jyii F1 - jyri F2) F3

Azr6;
(c144) azr6;
(d144) (jzii F2 + jzri F1) F3

Azi6;
(c145) azi6;
(d145) (jzii F1 - jzri F2) F3

After the seventh level of computation:

Axr7;
(c203) axr7;
(d203) F3 (G4 + G1)

Axi7;
(c204) axi7;
(d204) F3 (G3 - G2)

Ayr7;
(c205) ayr7;
(d205) F3 (G8 + G5)

Ayi7;
(c206) ayi7;
(d206) F3 (G7 - G6)

Azr7;
(c207) azr7;
(d207) F3 (G9 + G12)

Azi7;
(c208) azi7;
(d208) F3 (G11 - G10)

After the eighth level of computation:

Axr8;
(c234) axr8;
(d234) F3 H1

Axi8;
(c235) axi8;
(d235) F3 H2

Ayr8;
(c236) ayr8;
(d236) F3 H3

Ayi8;
(c237) ayi8;
(d237) F3 H4

Azr8;
(c238) azr8;
(d238) F3 H5

Azi8;
(c239) azi8;
(d239) F3 H6

After the ninth and final level of computation:

Axr9;
(c265) axr9;
(d265) J1

Axi9;
(c266) axi9;
(d266) J2

Ayr9;
(c267) ayr9;
(d267) J3

Ayi9;
(c268) ayi9;
(d268) J4

Azr9;
(c269) azr9;
(d269) J5

Azi9;
(c271) azi9;
(d271) J6
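The nine levels developed above can be read as a straight-line program in which every quantity at one level depends only on earlier levels. The sketch below transcribes that reading into Python and compares it with a direct complex evaluation. It is an illustration, not the thesis simulator: the level definitions (A1 to A3 as coordinate differences, E2 = 4 %pi D1, F3 = 1/E2, and so on) are inferred from the displayed level outputs, and the function names and sample values are hypothetical.

```python
import cmath
import math


def vector_potential_levels(currents, obs, src, wavenumber):
    """Nine-level evaluation; returns (J1..J6) = (Axr, Axi, Ayr, Ayi, Azr, Azi)."""
    jxr, jxi, jyr, jyi, jzr, jzi = currents
    a1, a2, a3 = (o - s for o, s in zip(obs, src))            # level 1
    b1, b2, b3 = a1 * a1, a2 * a2, a3 * a3                    # level 2
    c1 = b1 + b2 + b3                                         # level 3
    d1 = math.sqrt(c1)                                        # level 4
    e1, e2 = wavenumber * d1, 4.0 * math.pi * d1              # level 5
    f1, f2, f3 = math.cos(e1), math.sin(e1), 1.0 / e2         # level 6
    g1, g2, g3, g4 = jxr * f1, jxr * f2, jxi * f1, jxi * f2   # level 7
    g5, g6, g7, g8 = jyr * f1, jyr * f2, jyi * f1, jyi * f2
    g9, g10, g11, g12 = jzr * f1, jzr * f2, jzi * f1, jzi * f2
    h = (g4 + g1, g3 - g2, g8 + g5, g7 - g6, g9 + g12, g11 - g10)  # level 8
    return tuple(f3 * hk for hk in h)                         # level 9


def vector_potential_direct(currents, obs, src, wavenumber):
    """Reference: A = J * exp(-i*b*r) / (4*pi*r) for each component."""
    jxr, jxi, jyr, jyi, jzr, jzi = currents
    r = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, src)))
    scale = cmath.exp(-1j * wavenumber * r) / (4.0 * math.pi * r)
    parts = []
    for component in (complex(jxr, jxi), complex(jyr, jyi), complex(jzr, jzi)):
        value = component * scale
        parts.extend((value.real, value.imag))
    return tuple(parts)


if __name__ == "__main__":
    args = ((1.0, 2.0, -0.5, 0.25, 3.0, -1.0), (1.0, 2.0, 3.0), (0.5, -1.0, 0.0), 2.5)
    print(vector_potential_levels(*args))
    print(vector_potential_direct(*args))
```

Within each level the assignments are mutually independent, which is the property the data-flow graphs exploit; the twelve level-seven products, for example, could all be formed in the same cycle.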


MACSYMA Command Listing for Parallel Vector Potential

The text below is the actual commands used to perform the calculation analysis for the Vector Potential. The commands are separated into the computational levels developed in the previous text.

These first commands are in the preparation stage of the process.

realpart(ax); imagpart(ax); subst(((((x-xi)^2)+((y-yi)^2)+((z-zi)^2))^(1/2)),r,%);

Level one:

subst("A1",(x-xi),%)$
subst("A2",(y-yi),%)$
subst("A3",(z-zi),%)$

Level two:

subst("B1","A1"^2,%)$
subst("B2","A2"^2,%)$
subst("B3","A3"^2,%)$

Level three:

subst("C1","B1"+"B2"+"B3",%);

Level four:

subst("D1",sqrt("C1"),%);

Level five:

subst("E1",b*"D1",%)$
subst("E2",4*%pi*"D1",%);

Level six:

subst("F1",cos("E1"),%)$
subst("F2",sin("E1"),%)$
subst("F3",1/"E2",%);

Level seven:

subst("G1",jxri*"F1",%)$
subst("G2",jxri*"F2",%)$
subst("G3",jxii*"F1",%)$
subst("G4",jxii*"F2",%);
subst("G5",jyri*"F1",%)$
subst("G6",jyri*"F2",%)$
subst("G7",jyii*"F1",%)$
subst("G8",jyii*"F2",%);
subst("G9",jzri*"F1",%)$
subst("G10",jzri*"F2",%)$
subst("G11",jzii*"F1",%)$
subst("G12",jzii*"F2",%);

Level eight:

subst("H1","G4"+"G1",%);
subst("H2","G3"-"G2",%);
subst("H3","G8"+"G5",%);
subst("H4","G7"-"G6",%);
subst("H5","G9"+"G12",%);
subst("H6","G11"-"G10",%);

Level nine:

subst("J1","F3"*"H1",%);
subst("J2","F3"*"H2",%);
subst("J3","F3"*"H3",%);
subst("J4","F3"*"H4",%);
subst("J5","F3"*"H5",%);
subst("J6","F3"*"H6",%);

Magnetic Field Development

This section illustrates the calculation analysis of the Magnetic Field. The following pages will show the step by step procedure of the computation of the Parallel Magnetic Field.

The magnetic field in the "x" direction will be the example through level one, with the complete complex six-tuple shown for the remainder of the process. The text is in MACSYMA syntax and presented as the user might see it on the screen.

((1/r + %i b) %e^(- %i b r) ((jyri + %i jyii) (z - zi) - (jzri + %i jzii) (y - yi)))/(4 %pi r^2)

realpart(hx); The real part is extracted with MACSYMA.

((cos(b r) (jyri (z - zi) - jzri (y - yi))
  + sin(b r) (jyii (z - zi) - jzii (y - yi)))/r
  - b (cos(b r) (jyii (z - zi) - jzii (y - yi))
  - sin(b r) (jyri (z - zi) - jzri (y - yi))))/(4 %pi r^2)

imagpart(hx); The imaginary part is extracted with MACSYMA.

((cos(b r) (jyii (z - zi) - jzii (y - yi))
  - sin(b r) (jyri (z - zi) - jzri (y - yi)))/r
  + b (cos(b r) (jyri (z - zi) - jzri (y - yi))
  + sin(b r) (jyii (z - zi) - jzii (y - yi))))/(4 %pi r^2)
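As with the vector potential, the extracted parts of the x-directed magnetic field can be spot-checked against the complex expression. The sketch below is written for illustration only; it treats r as an independent parameter exactly as the displayed expressions do, and the function names and sample values are hypothetical.

```python
import cmath
import math


def hx_complex(jy, jz, dy, dz, b, r):
    """H_x = (1/r + i*b) exp(-i*b*r) (Jy*dz - Jz*dy) / (4*pi*r^2)."""
    return (1.0 / r + 1j * b) * cmath.exp(-1j * b * r) * (jy * dz - jz * dy) / (4 * math.pi * r ** 2)


def hx_parts(jyri, jyii, jzri, jzii, dy, dz, b, r):
    """Real and imaginary parts as displayed after realpart(hx) and imagpart(hx)."""
    kr = jyri * dz - jzri * dy   # real part of Jy*(z - zi) - Jz*(y - yi)
    ki = jyii * dz - jzii * dy   # imaginary part of the same quantity
    c, s = math.cos(b * r), math.sin(b * r)
    real = ((c * kr + s * ki) / r - b * (c * ki - s * kr)) / (4 * math.pi * r ** 2)
    imag = ((c * ki - s * kr) / r + b * (c * kr + s * ki)) / (4 * math.pi * r ** 2)
    return real, imag


if __name__ == "__main__":
    reference = hx_complex(complex(0.7, -1.2), complex(2.0, 0.4), 1.5, -0.5, 3.0, 2.0)
    print(reference, hx_parts(0.7, -1.2, 2.0, 0.4, 1.5, -0.5, 3.0, 2.0))
```

The two evaluations should agree to rounding error, which confirms the sign pattern of the four trigonometric terms.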


Here we begin the computation process:

The example below recalls the real part of Hx, then substitutes the expanded version of "r" into the expression.

realpart(hx);

hxr;
((cos(b r) (jyri (z - zi) - jzri (y - yi))
  + sin(b r) (jyii (z - zi) - jzii (y - yi)))/r
  - b (cos(b r) (jyii (z - zi) - jzii (y - yi))
  - sin(b r) (jyri (z - zi) - jzri (y - yi))))/(4 %pi r^2)

subst(((((x-xi)^2)+((y-yi)^2)+((z-zi)^2))^(1/2)),r,%);

Hxr0;
((cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) (jyri (z - zi) - jzri (y - yi))
  + sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) (jyii (z - zi) - jzii (y - yi)))
  /sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)
  - b (cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) (jyii (z - zi) - jzii (y - yi))
  - sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) (jyri (z - zi) - jzri (y - yi))))
  /(4 %pi ((z - zi)^2 + (y - yi)^2 + (x - xi)^2))

After the first level of computation the real part of the X-directed Magnetic Field has the form:

hxr1;
(((jyii A3 - jzii A2) sin(b sqrt(A3^2 + A2^2 + A1^2))
  + (jyri A3 - jzri A2) cos(b sqrt(A3^2 + A2^2 + A1^2)))/sqrt(A3^2 + A2^2 + A1^2)
  - b ((jyii A3 - jzii A2) cos(b sqrt(A3^2 + A2^2 + A1^2))
  - (jyri A3 - jzri A2) sin(b sqrt(A3^2 + A2^2 + A1^2))))/(4 %pi (A3^2 + A2^2 + A1^2))

After the second level of computation:

hxr2;
(((B15 - B20) sin(b sqrt(B3 + B2 + B1)) + (B12 - B17) cos(b sqrt(B3 + B2 + B1)))/sqrt(B3 + B2 + B1)
  - b ((B15 - B20) cos(b sqrt(B3 + B2 + B1)) - (B12 - B17) sin(b sqrt(B3 + B2 + B1))))/(4 %pi (B3 + B2 + B1))

hxi2;
(b ((B15 - B20) sin(b sqrt(B3 + B2 + B1)) + (B12 - B17) cos(b sqrt(B3 + B2 + B1)))
  + ((B15 - B20) cos(b sqrt(B3 + B2 + B1)) - (B12 - B17) sin(b sqrt(B3 + B2 + B1)))/sqrt(B3 + B2 + B1))
  /(4 %pi (B3 + B2 + B1))

hyr2;
((sin(b sqrt(B3 + B2 + B1)) (B19 - B9) + cos(b sqrt(B3 + B2 + B1)) (B16 - B6))/sqrt(B3 + B2 + B1)
  - b (cos(b sqrt(B3 + B2 + B1)) (B19 - B9) - sin(b sqrt(B3 + B2 + B1)) (B16 - B6)))/(4 %pi (B3 + B2 + B1))

hyi2;
(b (sin(b sqrt(B3 + B2 + B1)) (B19 - B9) + cos(b sqrt(B3 + B2 + B1)) (B16 - B6))
  + (cos(b sqrt(B3 + B2 + B1)) (B19 - B9) - sin(b sqrt(B3 + B2 + B1)) (B16 - B6))/sqrt(B3 + B2 + B1))
  /(4 %pi (B3 + B2 + B1))

hzr2;
((sin(b sqrt(B3 + B2 + B1)) (B8 - B13) + cos(b sqrt(B3 + B2 + B1)) (B5 - B10))/sqrt(B3 + B2 + B1)
  - b (cos(b sqrt(B3 + B2 + B1)) (B8 - B13) - sin(b sqrt(B3 + B2 + B1)) (B5 - B10)))/(4 %pi (B3 + B2 + B1))

hzi2;
(b (sin(b sqrt(B3 + B2 + B1)) (B8 - B13) + cos(b sqrt(B3 + B2 + B1)) (B5 - B10))
  + (cos(b sqrt(B3 + B2 + B1)) (B8 - B13) - sin(b sqrt(B3 + B2 + B1)) (B5 - B10))/sqrt(B3 + B2 + B1))
  /(4 %pi (B3 + B2 + B1))

After the third level of computation:

hxr3;
((cos(b sqrt(C1)) C3 + sin(b sqrt(C1)) C2)/sqrt(C1)
  - b (cos(b sqrt(C1)) C2 - sin(b sqrt(C1)) C3))/(4 %pi C1)

hxi3;
((cos(b sqrt(C1)) C2 - sin(b sqrt(C1)) C3)/sqrt(C1)
  + b (cos(b sqrt(C1)) C3 + sin(b sqrt(C1)) C2))/(4 %pi C1)

hyr3;
((cos(b sqrt(C1)) C5 + sin(b sqrt(C1)) C4)/sqrt(C1)
  - b (cos(b sqrt(C1)) C4 - sin(b sqrt(C1)) C5))/(4 %pi C1)

hyi3;
((cos(b sqrt(C1)) C4 - sin(b sqrt(C1)) C5)/sqrt(C1)
  + b (cos(b sqrt(C1)) C5 + sin(b sqrt(C1)) C4))/(4 %pi C1)

hzr3;
((cos(b sqrt(C1)) C7 + sin(b sqrt(C1)) C6)/sqrt(C1)
  - b (cos(b sqrt(C1)) C6 - sin(b sqrt(C1)) C7))/(4 %pi C1)

hzi3;
((cos(b sqrt(C1)) C6 - sin(b sqrt(C1)) C7)/sqrt(C1)
  + b (cos(b sqrt(C1)) C7 + sin(b sqrt(C1)) C6))/(4 %pi C1)


After the fourth level of computation:

hxr4;
((C2 sin(b D1) + C3 cos(b D1))/D1 - b (C2 cos(b D1) - C3 sin(b D1)))/D2

hxi4;
((C2 cos(b D1) - C3 sin(b D1))/D1 + b (C2 sin(b D1) + C3 cos(b D1)))/D2

hyr4;
((C4 sin(b D1) + C5 cos(b D1))/D1 - b (C4 cos(b D1) - C5 sin(b D1)))/D2

hyi4;
((C4 cos(b D1) - C5 sin(b D1))/D1 + b (C4 sin(b D1) + C5 cos(b D1)))/D2

hzr4;
((C6 sin(b D1) + C7 cos(b D1))/D1 - b (C6 cos(b D1) - C7 sin(b D1)))/D2

hzi4;
((C6 cos(b D1) - C7 sin(b D1))/D1 + b (C6 sin(b D1) + C7 cos(b D1)))/D2

After the fifth level of computation:

hxr5;
((C2 sin(E1) + C3 cos(E1)) E2 - b (C2 cos(E1) - C3 sin(E1))) E3

hxi5;
((C2 cos(E1) - C3 sin(E1)) E2 + b (C2 sin(E1) + C3 cos(E1))) E3

hyr5;
((C4 sin(E1) + C5 cos(E1)) E2 - b (C4 cos(E1) - C5 sin(E1))) E3

hyi5;
((C4 cos(E1) - C5 sin(E1)) E2 + b (C4 sin(E1) + C5 cos(E1))) E3

hzr5;
((C6 sin(E1) + C7 cos(E1)) E2 - b (C6 cos(E1) - C7 sin(E1))) E3

hzi5;
((C6 cos(E1) - C7 sin(E1)) E2 + b (C6 sin(E1) + C7 cos(E1))) E3

After the sixth level of computation:

hxr6;
E3 (E2 (C2 F2 + C3 F1) - b (C2 F1 - C3 F2))

hxi6;
E3 (E2 (C2 F1 - C3 F2) + b (C2 F2 + C3 F1))

hyr6;
E3 (E2 (C4 F2 + C5 F1) - b (C4 F1 - C5 F2))

hyi6;
E3 (E2 (C4 F1 - C5 F2) + b (C4 F2 + C5 F1))

hzr6;
E3 (E2 (C6 F2 + C7 F1) - b (C6 F1 - C7 F2))

hzi6;
E3 (E2 (C6 F1 - C7 F2) + b (C6 F2 + C7 F1))

After the seventh level of computation:

hxr7;
E3 (E2 (G7 + G2) - b (G1 - G8))

hxi7;
E3 (E2 (G1 - G8) + b (G7 + G2))

hyr7;
E3 (E2 (G9 + G4) - b (G3 - G10))

hyi7;
E3 (b (G9 + G4) + E2 (G3 - G10))

hzr7;
E3 (E2 (G6 + G11) - b (G5 - G12))

hzi7;
E3 (b (G6 + G11) + E2 (G5 - G12))

After the eighth level of computation:

hxr8;
E3 (E2 H1 - b H4)

hxi8;
E3 (E2 H4 + b H1)

hyr8;
E3 (E2 H2 - b H5)

hyi8;
E3 (E2 H5 + b H2)

hzr8;
E3 (E2 H3 - b H6)

hzi8;
E3 (E2 H6 + b H3)

After the ninth level of computation:

hxr9;
E3 (J1 - J10)

hxi9;
E3 (J7 + J4)

hyr9;
E3 (J2 - J11)

hyi9;
E3 (J8 + J5)

hzr9;
E3 (J3 - J12)

hzi9;
E3 (J9 + J6)

After the tenth level of computation:

hxr10;
E3 K1

hxi10;
E3 K4

hyr10;
E3 K2

hyi10;
E3 K5

hzr10;
E3 K3

hzi10;
E3 K6

After the eleventh and final level of computation:

hxr11;
L1 Hx real

hxi11;
L4 Hx imag

hyr11;
L2 Hy real

hyi11;
L5 Hy imag

hzr11;
L3 Hz real

hzi11;
L6 Hz imag
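The eleven levels of the Parallel Magnetic Field can be transcribed the same way as the vector potential. The sketch below is one possible reading of the level outputs above, checked against a direct complex evaluation of H = (1/r + ib) exp(-ibr) (J x R)/(4 pi r^2). The intermediate definitions are inferred from the displayed expressions, and all function and variable names are hypothetical.

```python
import cmath
import math


def magnetic_field_levels(currents, obs, src, wavenumber):
    """Eleven-level evaluation; returns (L1..L6) = (Hxr, Hyr, Hzr, Hxi, Hyi, Hzi)."""
    jxr, jxi, jyr, jyi, jzr, jzi = currents
    a1, a2, a3 = (o - s for o, s in zip(obs, src))                 # level 1
    b1, b2, b3 = a1 * a1, a2 * a2, a3 * a3                         # level 2
    b5, b6, b8, b9 = jxr * a2, jxr * a3, jxi * a2, jxi * a3
    b10, b12, b13, b15 = jyr * a1, jyr * a3, jyi * a1, jyi * a3
    b16, b17, b19, b20 = jzr * a1, jzr * a2, jzi * a1, jzi * a2
    c1 = b1 + b2 + b3                                              # level 3
    cs = (b15 - b20, b12 - b17, b19 - b9, b16 - b6, b8 - b13, b5 - b10)  # C2..C7
    d1, d2 = math.sqrt(c1), 4.0 * math.pi * c1                     # level 4
    e1, e2, e3 = wavenumber * d1, 1.0 / d1, 1.0 / d2               # level 5
    f1, f2 = math.cos(e1), math.sin(e1)                            # level 6
    g1, g2, g3, g4, g5, g6 = (c * f1 for c in cs)                  # level 7
    g7, g8, g9, g10, g11, g12 = (c * f2 for c in cs)
    hs = (g7 + g2, g9 + g4, g6 + g11, g1 - g8, g3 - g10, g5 - g12)  # level 8: H1..H6
    j1, j2, j3, j4, j5, j6 = (e2 * h for h in hs)                  # level 9
    j7, j8, j9, j10, j11, j12 = (wavenumber * h for h in hs)
    ks = (j1 - j10, j2 - j11, j3 - j12, j7 + j4, j8 + j5, j9 + j6)  # level 10
    return tuple(e3 * k for k in ks)                               # level 11


def magnetic_field_direct(currents, obs, src, wavenumber):
    """Reference evaluation with complex arithmetic, same output ordering."""
    jxr, jxi, jyr, jyi, jzr, jzi = currents
    jx, jy, jz = complex(jxr, jxi), complex(jyr, jyi), complex(jzr, jzi)
    dx, dy, dz = (o - s for o, s in zip(obs, src))
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    scale = (1.0 / r + 1j * wavenumber) * cmath.exp(-1j * wavenumber * r) / (4.0 * math.pi * r * r)
    h = [scale * (jy * dz - jz * dy), scale * (jz * dx - jx * dz), scale * (jx * dy - jy * dx)]
    return tuple(v.real for v in h) + tuple(v.imag for v in h)


if __name__ == "__main__":
    args = ((1.0, 2.0, -0.5, 0.25, 3.0, -1.0), (1.0, 2.0, 3.0), (0.5, -1.0, 0.0), 2.5)
    print(magnetic_field_levels(*args))
    print(magnetic_field_direct(*args))
```

Note the output ordering: L1 to L3 hold the real parts and L4 to L6 the imaginary parts, matching the labels at the final level.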


MACSYMA Command Listing for Parallel Magnetic Field

The text below is the actual commands used to perform the calculation analysis for the Magnetic Field. The commands are separated into the computational levels developed in the previous text.

These first commands are in the preparation stage of the process.

realpart(hx); imagpart(hx); subst(((((x-xi)^2)+((y-yi)^2)+((z-zi)^2))^(1/2)),r,%);

Level one:

subst("A1",(x-xi),%);
subst("A2",(y-yi),%);
subst("A3",(z-zi),%);

Level two:

subst("B1","A1"^2,%);
subst("B2","A2"^2,%);
subst("B3","A3"^2,%);
subst("B4",jxri*"A1",%); not used
subst("B5",jxri*"A2",%);
subst("B6",jxri*"A3",%);
subst("B7",jxii*"A1",%); not used
subst("B8",jxii*"A2",%);
subst("B9",jxii*"A3",%);
subst("B10",jyri*"A1",%);
subst("B11",jyri*"A2",%); not used
subst("B12",jyri*"A3",%);
subst("B13",jyii*"A1",%);
subst("B14",jyii*"A2",%); not used
subst("B15",jyii*"A3",%);
subst("B16",jzri*"A1",%);
subst("B17",jzri*"A2",%);
subst("B18",jzri*"A3",%); not used
subst("B19",jzii*"A1",%);
subst("B20",jzii*"A2",%);
subst("B21",jzii*"A3",%); not used

Level three:

subst("C1","B1"+"B2"+"B3",%);
subst("C2","B15"-"B20",%);
subst("C3","B12"-"B17",%);
subst("C4","B19"-"B9",%);
subst("C5","B16"-"B6",%);
subst("C6","B8"-"B13",%);
subst("C7","B5"-"B10",%);

Level four:

subst("D1",sqrt("C1"),%);
subst("D2",4*%pi*"C1",%);

Level five:

subst("E1",b*"D1",%);
subst("E2",(1/"D1"),%);
subst("E3",(1/"D2"),%);

Level six:

subst("F1",cos("E1"),%);
subst("F2",sin("E1"),%);

Level seven:

subst("G1","C2"*"F1",%);
subst("G2","C3"*"F1",%);
subst("G3","C4"*"F1",%);
subst("G4","C5"*"F1",%);
subst("G5","C6"*"F1",%);
subst("G6","C7"*"F1",%);
subst("G7","C2"*"F2",%);
subst("G8","C3"*"F2",%);
subst("G9","C4"*"F2",%);
subst("G10","C5"*"F2",%);
subst("G11","C6"*"F2",%);
subst("G12","C7"*"F2",%);

Level eight:

subst("H1","G7"+"G2",%);
subst("H2","G9"+"G4",%);
subst("H3","G6"+"G11",%);
subst("H4","G1"-"G8",%);
subst("H5","G3"-"G10",%);
subst("H6","G5"-"G12",%);

Level nine:

subst("J1","E2"*"H1",%);
subst("J2","E2"*"H2",%);
subst("J3","E2"*"H3",%);
subst("J4","E2"*"H4",%);
subst("J5","E2"*"H5",%);
subst("J6","E2"*"H6",%);
subst("J7",b*"H1",%);
subst("J8",b*"H2",%);
subst("J9",b*"H3",%);
subst("J10",b*"H4",%);
subst("J11",b*"H5",%);
subst("J12",b*"H6",%);

Level ten:

subst("K1","J1"-"J10",%);
subst("K2","J2"-"J11",%);
subst("K3","J3"-"J12",%);
subst("K4","J7"+"J4",%);
subst("K5","J8"+"J5",%);
subst("K6","J9"+"J6",%);

Level eleven:

subst("L1","E3"*"K1",%);
subst("L2","E3"*"K2",%);
subst("L3","E3"*"K3",%);
subst("L4","E3"*"K4",%);
subst("L5","E3"*"K5",%);
subst("L6","E3"*"K6",%);

Electric Field Development

This section illustrates the calculation analysis of the Electric Field. The following pages will show the step by step procedure of the computation of the Parallel Electric Field.

The Electric Field in the "x" direction will be the example through level four, with the complete complex six-tuple shown for the remainder of the process. The text is in MACSYMA syntax and presented as the user might see it on the screen.

Ex;
(c9) ex;
(d9) (%e^(- %i b r) ((jxri + %i jxii) (- 3 %i b/r - 1/r^2 + b^2) ((z - zi)^2 + (y - yi)^2)
  + (3 %i b/r + 3/r^2 - b^2) ((jzri + %i jzii) (x - xi) (z - zi)
  + (jyri + %i jyii) (x - xi) (y - yi))))/(4 %pi r^3)


realpart(ex); The real part is extracted with MACSYMA.
(c10) exr;
(d10) (sin(b r) ((jxii (b^2 - 1/r^2) - 3 b jxri/r) ((z - zi)^2 + (y - yi)^2)
  + 3 b (jzri (x - xi) (z - zi) + jyri (x - xi) (y - yi))/r
  + (3/r^2 - b^2) (jzii (x - xi) (z - zi) + jyii (x - xi) (y - yi)))
  + cos(b r) ((3 b jxii/r + jxri (b^2 - 1/r^2)) ((z - zi)^2 + (y - yi)^2)
  + (3/r^2 - b^2) (jzri (x - xi) (z - zi) + jyri (x - xi) (y - yi))
  - 3 b (jzii (x - xi) (z - zi) + jyii (x - xi) (y - yi))/r))/(4 %pi r^3)

imagpart(ex); The imaginary part is extracted with MACSYMA.
(c11) exi;
(d11) (cos(b r) ((jxii (b^2 - 1/r^2) - 3 b jxri/r) ((z - zi)^2 + (y - yi)^2)
  + 3 b (jzri (x - xi) (z - zi) + jyri (x - xi) (y - yi))/r
  + (3/r^2 - b^2) (jzii (x - xi) (z - zi) + jyii (x - xi) (y - yi)))
  - sin(b r) ((3 b jxii/r + jxri (b^2 - 1/r^2)) ((z - zi)^2 + (y - yi)^2)
  + (3/r^2 - b^2) (jzri (x - xi) (z - zi) + jyri (x - xi) (y - yi))
  - 3 b (jzii (x - xi) (z - zi) + jyii (x - xi) (y - yi))/r))/(4 %pi r^3)
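Because the electric field expressions are the longest in this appendix, a numerical cross-check of the extracted parts is a useful safeguard. The sketch below compares the real and imaginary parts, as read from (d10) and (d11), against the complex form read from (d9). It is an illustration only; the function names and sample values are hypothetical, and r is computed from the coordinate differences.

```python
import cmath
import math


def ex_complex(currents, delta, b):
    """Complex E_x as read from (d9); delta = (x - xi, y - yi, z - zi)."""
    jxr, jxi, jyr, jyi, jzr, jzi = currents
    dx, dy, dz = delta
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    jx, jy, jz = complex(jxr, jxi), complex(jyr, jyi), complex(jzr, jzi)
    own = jx * (b * b - 1.0 / r ** 2 - 3j * b / r) * (dz * dz + dy * dy)
    cross = (3j * b / r + 3.0 / r ** 2 - b * b) * (jz * dx * dz + jy * dx * dy)
    return cmath.exp(-1j * b * r) * (own + cross) / (4.0 * math.pi * r ** 3)


def ex_parts(currents, delta, b):
    """Real and imaginary parts as read from (d10) and (d11)."""
    jxr, jxi, jyr, jyi, jzr, jzi = currents
    dx, dy, dz = delta
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    s, c = math.sin(b * r), math.cos(b * r)
    t = dz * dz + dy * dy
    re_cross = jzr * dx * dz + jyr * dx * dy
    im_cross = jzi * dx * dz + jyi * dx * dy
    sin_term = ((jxi * (b * b - 1.0 / r ** 2) - 3.0 * b * jxr / r) * t
                + 3.0 * b * re_cross / r + (3.0 / r ** 2 - b * b) * im_cross)
    cos_term = ((3.0 * b * jxi / r + jxr * (b * b - 1.0 / r ** 2)) * t
                + (3.0 / r ** 2 - b * b) * re_cross - 3.0 * b * im_cross / r)
    norm = 4.0 * math.pi * r ** 3
    return (s * sin_term + c * cos_term) / norm, (c * sin_term - s * cos_term) / norm


if __name__ == "__main__":
    args = ((1.0, 2.0, -0.5, 0.25, 3.0, -1.0), (0.5, 3.0, 3.0), 2.5)
    print(ex_complex(*args), ex_parts(*args))
```

Agreement between the two functions confirms that the sine bracket of the real part reappears as the cosine bracket of the imaginary part, which is the structure later exploited at levels eight through ten.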


Here we begin the computation process:

The example below recalls the real part of Ex, then substitutes the expanded version of "r" into the expression.

Exr0; Substitute the expanded version of "r" into the expression.
(c12) exr0;
(d12) (sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))
  (3 b (jzri (x - xi) (z - zi) + jyri (x - xi) (y - yi))/sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)
  + (jxii (b^2 - 1/((z - zi)^2 + (y - yi)^2 + (x - xi)^2))
  - 3 b jxri/sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) ((z - zi)^2 + (y - yi)^2)
  + (3/((z - zi)^2 + (y - yi)^2 + (x - xi)^2) - b^2)
  (jzii (x - xi) (z - zi) + jyii (x - xi) (y - yi)))
  + cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))
  (- 3 b (jzii (x - xi) (z - zi) + jyii (x - xi) (y - yi))/sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)
  + (3 b jxii/sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)
  + jxri (b^2 - 1/((z - zi)^2 + (y - yi)^2 + (x - xi)^2))) ((z - zi)^2 + (y - yi)^2)
  + (3/((z - zi)^2 + (y - yi)^2 + (x - xi)^2) - b^2)
  (jzri (x - xi) (z - zi) + jyri (x - xi) (y - yi))))
  /(4 %pi ((z - zi)^2 + (y - yi)^2 + (x - xi)^2)^(3/2))

Exi0; Substitute the expanded version of "r" into the expression.
(c13) exi0;
(d13) (cos(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))
  (3 b (jzri (x - xi) (z - zi) + jyri (x - xi) (y - yi))/sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)
  + (jxii (b^2 - 1/((z - zi)^2 + (y - yi)^2 + (x - xi)^2))
  - 3 b jxri/sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)) ((z - zi)^2 + (y - yi)^2)
  + (3/((z - zi)^2 + (y - yi)^2 + (x - xi)^2) - b^2)
  (jzii (x - xi) (z - zi) + jyii (x - xi) (y - yi)))
  - sin(b sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2))
  (- 3 b (jzii (x - xi) (z - zi) + jyii (x - xi) (y - yi))/sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)
  + (3 b jxii/sqrt((z - zi)^2 + (y - yi)^2 + (x - xi)^2)
  + jxri (b^2 - 1/((z - zi)^2 + (y - yi)^2 + (x - xi)^2))) ((z - zi)^2 + (y - yi)^2)
  + (3/((z - zi)^2 + (y - yi)^2 + (x - xi)^2) - b^2)
  (jzri (x - xi) (z - zi) + jyri (x - xi) (y - yi))))
  /(4 %pi ((z - zi)^2 + (y - yi)^2 + (x - xi)^2)^(3/2))

After the first level of computation:

Exr1;
(c14) exr1;
(d14) (((A3^2 + A2^2) (jxii (b^2 - 1/(A3^2 + A2^2 + A1^2)) - 3 b jxri/sqrt(A3^2 + A2^2 + A1^2))
  + (jzii A1 A3 + jyii A1 A2) (3/(A3^2 + A2^2 + A1^2) - b^2)
  + 3 b (jzri A1 A3 + jyri A1 A2)/sqrt(A3^2 + A2^2 + A1^2)) sin(b sqrt(A3^2 + A2^2 + A1^2))
  + ((A3^2 + A2^2) (jxri (b^2 - 1/(A3^2 + A2^2 + A1^2)) + 3 b jxii/sqrt(A3^2 + A2^2 + A1^2))
  + (jzri A1 A3 + jyri A1 A2) (3/(A3^2 + A2^2 + A1^2) - b^2)
  - 3 b (jzii A1 A3 + jyii A1 A2)/sqrt(A3^2 + A2^2 + A1^2)) cos(b sqrt(A3^2 + A2^2 + A1^2)))
  /(4 %pi (A3^2 + A2^2 + A1^2)^(3/2))

Exi1;
(c15) exi1;
(d15) (((A3^2 + A2^2) (jxii (b^2 - 1/(A3^2 + A2^2 + A1^2)) - 3 b jxri/sqrt(A3^2 + A2^2 + A1^2))
  + (jzii A1 A3 + jyii A1 A2) (3/(A3^2 + A2^2 + A1^2) - b^2)
  + 3 b (jzri A1 A3 + jyri A1 A2)/sqrt(A3^2 + A2^2 + A1^2)) cos(b sqrt(A3^2 + A2^2 + A1^2))
  - ((A3^2 + A2^2) (jxri (b^2 - 1/(A3^2 + A2^2 + A1^2)) + 3 b jxii/sqrt(A3^2 + A2^2 + A1^2))
  + (jzri A1 A3 + jyri A1 A2) (3/(A3^2 + A2^2 + A1^2) - b^2)
  - 3 b (jzii A1 A3 + jyii A1 A2)/sqrt(A3^2 + A2^2 + A1^2)) sin(b sqrt(A3^2 + A2^2 + A1^2)))
  /(4 %pi (A3^2 + A2^2 + A1^2)^(3/2))


After the second level of computation:

Exr2;
(c18) exr2;
(d18) (((B3 + B2) (jxii (b^2 - 1/(B3 + B2 + B1)) - B22/sqrt(B3 + B2 + B1))
  + (A3 B19 + A2 B13) (3/(B3 + B2 + B1) - b^2)
  + 3 b (A3 B16 + A2 B10)/sqrt(B3 + B2 + B1)) sin(b sqrt(B3 + B2 + B1))
  + ((B3 + B2) (jxri (b^2 - 1/(B3 + B2 + B1)) + B23/sqrt(B3 + B2 + B1))
  + (A3 B16 + A2 B10) (3/(B3 + B2 + B1) - b^2)
  - 3 b (A3 B19 + A2 B13)/sqrt(B3 + B2 + B1)) cos(b sqrt(B3 + B2 + B1)))
  /(4 %pi (B3 + B2 + B1)^(3/2))

Exi2;
(c19) exi2;
(d19) (((B3 + B2) (jxii (b^2 - 1/(B3 + B2 + B1)) - B22/sqrt(B3 + B2 + B1))
  + (A3 B19 + A2 B13) (3/(B3 + B2 + B1) - b^2)
  + 3 b (A3 B16 + A2 B10)/sqrt(B3 + B2 + B1)) cos(b sqrt(B3 + B2 + B1))
  - ((B3 + B2) (jxri (b^2 - 1/(B3 + B2 + B1)) + B23/sqrt(B3 + B2 + B1))
  + (A3 B16 + A2 B10) (3/(B3 + B2 + B1) - b^2)
  - 3 b (A3 B19 + A2 B13)/sqrt(B3 + B2 + B1)) sin(b sqrt(B3 + B2 + B1)))
  /(4 %pi (B3 + B2 + B1)^(3/2))

After the third level of computation:

Exr3;
(c22) exr3;
(d22) (sin(b sqrt(C1)) (3 b (C38 + C36)/sqrt(C1) + (3/C1 - b^2) (C37 + C35)
  + (jxii (b^2 - 1/C1) - B22/sqrt(C1)) C33)
  + cos(b sqrt(C1)) ((3/C1 - b^2) (C38 + C36) - 3 b (C37 + C35)/sqrt(C1)
  + (B23/sqrt(C1) + jxri (b^2 - 1/C1)) C33))/(4 %pi C1^(3/2))

Exi3;
(c23) exi3;
(d23) (cos(b sqrt(C1)) (3 b (C38 + C36)/sqrt(C1) + (3/C1 - b^2) (C37 + C35)
  + (jxii (b^2 - 1/C1) - B22/sqrt(C1)) C33)
  - sin(b sqrt(C1)) ((3/C1 - b^2) (C38 + C36) - 3 b (C37 + C35)/sqrt(C1)
  + (B23/sqrt(C1) + jxri (b^2 - 1/C1)) C33))/(4 %pi C1^(3/2))


After the fourth level of computation:

Exr4;
(c51) exr4;
(d51) (sin(b D1) ((3 D33 - b^2) D35 + 3 b D34/D1 + C33 (jxii (b^2 - D33) - B22/D1))
  + cos(b D1) (- 3 b D35/D1 + (3 D33 - b^2) D34 + C33 (jxri (b^2 - D33) + B23/D1)))/(D1 D2)

Exi4;
(c52) exi4;
(d52) (cos(b D1) ((3 D33 - b^2) D35 + 3 b D34/D1 + C33 (jxii (b^2 - D33) - B22/D1))
  - sin(b D1) (- 3 b D35/D1 + (3 D33 - b^2) D34 + C33 (jxri (b^2 - D33) + B23/D1)))/(D1 D2)

After the fifth level of computation:

Exr5;
(c53) exr5;
(d53) E2 E3 (cos(E1) (- E2 E7 + D34 (E5 - b^2) + C33 (jxri E4 + B23 E2))
  + sin(E1) (E2 E6 + D35 (E5 - b^2) + C33 (jxii E4 - B22 E2)))

Exi5;
(c54) exi5;
(d54) E2 E3 (cos(E1) (E2 E6 + D35 (E5 - b^2) + C33 (jxii E4 - B22 E2))
  - sin(E1) (- E2 E7 + D34 (E5 - b^2) + C33 (jxri E4 + B23 E2)))

Eyr5;
(c55) eyr5;
(d55) E2 E3 (cos(E1) (- E2 E9 + D36 (E5 - b^2) + C32 (jyri E4 + B25 E2))
  + sin(E1) (E2 E8 + D37 (E5 - b^2) + C32 (jyii E4 - B24 E2)))

Eyi5;
(c57) eyi5;
(d57) E2 E3 (cos(E1) (E2 E8 + D37 (E5 - b^2) + C32 (jyii E4 - B24 E2))
  - sin(E1) (- E2 E9 + D36 (E5 - b^2) + C32 (jyri E4 + B25 E2)))

Ezr5;
(c58) ezr5;
(d58) E2 E3 (sin(E1) (D39 (E5 - b^2) + C31 (jzii E4 - B26 E2) + E10 E2)
  + cos(E1) (D38 (E5 - b^2) + C31 (jzri E4 + B27 E2) - E11 E2))

Ezi5;
(c59) ezi5;
(d59) E2 E3 (cos(E1) (D39 (E5 - b^2) + C31 (jzii E4 - B26 E2) + E10 E2)
  - sin(E1) (D38 (E5 - b^2) + C31 (jzri E4 + B27 E2) - E11 E2))

After the sixth level of computation:

Exr6;
(c60) exr6;
(d60) F3 (F2 (C33 (F8 - F5) + D35 F4 + E6) + F1 (C33 (F7 + F6) + D34 F4 - E7))

Exi6;
(c61) exi6;
(d61) F3 (F1 (C33 (F8 - F5) + D35 F4 + E6) - F2 (C33 (F7 + F6) + D34 F4 - E7))

Eyr6;
(c62) eyr6;
(d62) F3 (F2 (C32 (F12 - F9) + D37 F4 + E8) + F1 (D36 F4 + C32 (F11 + F10) - E9))

Eyi6;
(c63) eyi6;
(d63) F3 (F1 (C32 (F12 - F9) + D37 F4 + E8) - F2 (D36 F4 + C32 (F11 + F10) - E9))

Ezr6;
(c64) ezr6;
(d64) F3 (F2 (D39 F4 + C31 (F16 - F13) + E10) + F1 (D38 F4 + C31 (F15 + F14) - E11))

Ezi6;
(c65) ezi6;
(d65) F3 (F1 (D39 F4 + C31 (F16 - F13) + E10) - F2 (D38 F4 + C31 (F15 + F14) - E11))

After the seventh level of computation:

Exr7;
(c66) exr7;
(d66) F3 (F1 (G22 + C33 G21 - E7) + F2 (G23 + C33 G20 + E6))

Exi7;
(c67) exi7;
(d67) F3 (F1 (G23 + C33 G20 + E6) - F2 (G22 + C33 G21 - E7))

Eyr7;
(c68) eyr7;
(d68) F3 (F1 (G32 + C32 G31 - E9) + F2 (G33 + C32 G30 + E8))

Eyi7;
(c69) eyi7;
(d69) F3 (F1 (G33 + C32 G30 + E8) - F2 (G32 + C32 G31 - E9))

Ezr7;
(c70) ezr7;
(d70) F3 (F1 (G42 + C31 G41 - E11) + F2 (G43 + C31 G40 + E10))

Ezi7;
(c71) ezi7;
(d71) F3 (F1 (G43 + C31 G40 + E10) - F2 (G42 + C31 G41 - E11))

After the eighth level of computation:

Exr8;
(c72) exr8;
(d72) F3 (F2 (H23 + H21) + F1 (H22 + H20))

Exi8;
(c73) exi8;
(d73) F3 (F1 (H23 + H21) - F2 (H22 + H20))

Eyr8;
(c74) eyr8;
(d74) F3 (F2 (H33 + H31) + F1 (H32 + H30))

Eyi8;
(c75) eyi8;
(d75) F3 (F1 (H33 + H31) - F2 (H32 + H30))

Ezr8;
(c76) ezr8;
(d76) F3 (F2 (H43 + H41) + F1 (H42 + H40))

Ezi8;
(c77) ezi8;
(d77) F3 (F1 (H43 + H41) - F2 (H42 + H40))

After the ninth level of computation:

Exr9;
(c78) exr9;

(d78) F3 (F1 J21 + F2 J20)

Exi9;
(c79) exi9;

(d79) F3 (F1 J20 - F2 J21)

Eyr9;
(c80) eyr9;

(d80) F3 (F1 J31 + F2 J30)

Eyi9;
(c81) eyi9;

(d81) F3 (F1 J30 - F2 J31)

Ezr9;
(c82) ezr9;

(d82) F3 (F1 J41 + F2 J40)

Ezi9;
(c83) ezi9;

(d83) F3 (F1 J40 - F2 J41)


After the tenth level of computation:

Exr10;
(c84) exr10;

(d84) F3 (K21 + K20)

Exi10;
(c85) exi10;

(d85) F3 (K22 - K23)

Eyr10;
(c86) eyr10;

(d86) F3 (K31 + K30)

Eyi10;
(c87) eyi10;

(d87) F3 (K32 - K33)

Ezr10;
(c88) ezr10;

(d88) F3 (K41 + K40)

Ezi10;
(c89) ezi10;

(d89) F3 (K42 - K43)

After the eleventh level of computation:

Exr11;
(c90) exr11;

(d90) F3 L20

Exi11;
(c91) exi11;

(d91) F3 L21

Eyr11;
(c92) eyr11;

(d92) F3 L30


Eyi11;
(c93) eyi11;

(d93) F3 L31

Ezr11;
(c94) ezr11;

(d94) F3 L40

Ezi11;
(c95) ezi11;

(d95) F3 L41

After the twelfth and final level of computation:

Exr12;
(c96) exr12;

(d96) M1    Ex real

Exi12;
(c97) exi12;

(d97) M2    Ex imag

Eyr12;
(c98) eyr12;

(d98) M3    Ey real

Eyi12;
(c99) eyi12;

(d99) M4    Ey imag

Ezr12;
(c100) ezr12;

(d100) M5    Ez real

Ezi12;
(c101) ezi12;

(d101) M6    Ez imag
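
The reduction from level six to level twelve can be checked numerically. The following is a minimal Python sketch, not part of the original MACSYMA session. It evaluates the X component once from the level-six forms (d60) and (d61), and once through the intermediate quantities G20 ... M2 named in the listings for levels seven through twelve. The function names and the sample operand values are assumptions chosen only for illustration.

```python
import math

def ex_level_six(v):
    """Ex (real, imag) from the level-six forms (d60) and (d61)."""
    a = v["C33"] * (v["F8"] - v["F5"]) + v["D35"] * v["F4"] + v["E6"]
    b = v["C33"] * (v["F7"] + v["F6"]) + v["D34"] * v["F4"] - v["E7"]
    real = v["F3"] * (v["F2"] * a + v["F1"] * b)
    imag = v["F3"] * (v["F1"] * a - v["F2"] * b)
    return real, imag

def ex_by_levels(v):
    """Ex (real, imag) built one level at a time, levels seven to twelve."""
    # Level seven
    G20 = v["F8"] - v["F5"]
    G21 = v["F7"] + v["F6"]
    G22 = v["F4"] * v["D34"]
    G23 = v["F4"] * v["D35"]
    # Level eight
    H22 = v["C33"] * G21
    H23 = v["C33"] * G20
    H20 = G22 - v["E7"]
    H21 = G23 + v["E6"]
    # Level nine
    J20 = H23 + H21
    J21 = H22 + H20
    # Level ten
    K20 = v["F1"] * J21
    K21 = v["F2"] * J20
    K22 = v["F1"] * J20
    K23 = v["F2"] * J21
    # Level eleven
    L20 = K21 + K20
    L21 = K22 - K23
    # Level twelve
    M1 = v["F3"] * L20
    M2 = v["F3"] * L21
    return M1, M2

# Hypothetical operand values, for illustration only.
SAMPLE = {
    "F1": math.cos(0.3), "F2": math.sin(0.3), "F3": 0.5, "F4": 1.7,
    "F5": 0.2, "F6": -0.4, "F7": 1.1, "F8": 0.9,
    "C33": 2.5, "D34": -1.2, "D35": 0.8, "E6": 0.3, "E7": 0.6,
}
```

Calling `ex_level_six(SAMPLE)` and `ex_by_levels(SAMPLE)` should agree to within floating-point rounding, since each level only renames a sub-expression of the previous one.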


MACSYMA Command Listing for Parallel Electric Field

The text below is the actual commands used to perform the calculation analysis for the Electric Field. The commands are separated into the computational levels developed in the previous text.

Note here that certain MACSYMA conventions require what would seem a very roundabout way to substitute into equations. For example, in level two, notice the required steps to perform the operations that were quite straightforward in the Magnetic field computations. These extra steps were necessary because of the way these equations were loaded into MACSYMA originally. The only impact on the behavior of the calculation flow is that certain choices of variable combinations are made by the order of these input commands and, as such, other combinations are certainly possible and indeed may have been more natural. However, the overall behavior of the equations proved to be symmetric and therefore caused no real concern.
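
The symmetry noted here can be illustrated with a short Python sketch (an illustration only, not the thesis's FORTRAN or MACSYMA code). A single template, written from the level-six forms (d60) through (d65), serves all three Cartesian components; only the operand names differ. The function names, the operand table layout, and the use of a thread pool to mimic independent evaluation of the three components are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Operand names per component, read off (d60)-(d65), in the order
# (C, Da, Db, Ea, Eb, Fp, Fq, Fr, Fs) used by the template below.
OPERANDS = {
    "x": ("C33", "D34", "D35", "E6", "E7", "F5", "F6", "F7", "F8"),
    "y": ("C32", "D36", "D37", "E8", "E9", "F9", "F10", "F11", "F12"),
    "z": ("C31", "D38", "D39", "E10", "E11", "F13", "F14", "F15", "F16"),
}

def component(v, axis):
    """Evaluate (real, imag) of one Cartesian component from level-six operands."""
    C, Da, Db, Ea, Eb, Fp, Fq, Fr, Fs = (v[name] for name in OPERANDS[axis])
    a = C * (Fs - Fp) + Db * v["F4"] + Ea
    b = C * (Fr + Fq) + Da * v["F4"] - Eb
    real = v["F3"] * (v["F2"] * a + v["F1"] * b)
    imag = v["F3"] * (v["F1"] * a - v["F2"] * b)
    return real, imag

def all_components(v):
    """Evaluate x, y and z with the same template; the three calls are independent."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = pool.map(lambda axis: component(v, axis), "xyz")
        return dict(zip("xyz", results))
```

The thread pool only emphasizes that no component depends on another; in CPython it does not provide true parallel speedup for arithmetic of this kind.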

The substitution commands that were identical to those of the Magnetic field commands are in standard print. The commands required that were strictly for the Electric fields are in bold face and commands that are strictly to accommodate MACSYMA are underlined.

These first commands are in the preparation stage of the process.

realpart(ex);imagpart(ex);subst(((((x-xi)^2)+((y-yi)^2)+((z-zi)^2))^(1/2)),r,%);

Level one:

subst("A1",(x-xi),%)$subst("A2",(y-yi),%)$subst("A3",(z-zi),%)$
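
As a small numerical counterpart to the preparation and level-one substitutions, the Python sketch below computes the coordinate differences that the listing names A1, A2 and A3, and the square-root expression that the preparation step replaces by r. The function names are assumptions introduced for illustration; the listing itself only performs symbolic renaming.

```python
import math

def level_one(x, y, z, xi, yi, zi):
    """Level one: the differences named A1, A2, A3 in the listing."""
    return x - xi, y - yi, z - zi

def distance(a1, a2, a3):
    """The expression ((x-xi)^2 + (y-yi)^2 + (z-zi)^2)^(1/2), written in terms of A1..A3."""
    return math.sqrt(a1 ** 2 + a2 ** 2 + a3 ** 2)
```

For example, `distance(*level_one(1, 2, 3, 0, 0, 1))` evaluates the square root of 1 + 4 + 4.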


Level two:

subst("B2","A2"^2,%)$subst("B3","A3"^2,%)$

subst(4'A2., ixri*'A I" *A2X$subst(B4*A3.,JxrI 'A IAl 'X

p ~subst("B5"'A3. xri*A2' A?.)$subst("BS6*A I-, xrl'A23 'A I )$

subst(B6"A2,jxri'A3*A2',)$subst("87"*"A2, ixif *"AV I*A2',%)$subst("B7"*A3",jxfli' A 1*A3',%)$subst("BO8OA 1, ixii"A2' *"A 1.)$substV"B8'*A3',jxfl A2" A3",%)Ssubst(B9'* I1. txii* A3 *"A 1",%)$substWB9'* A2',jxii*'"Y' A2,)$substC-lO"A2", fyri"LA2 )8Asubst(UB O"* A3",jyrs 'A 1'"A3Y,R)$substCI1"*AI' iJm*A2'*AI")$subst('B I11"' A3*,jyri*"A2"**A3,.)$subst(812*A1", [vrIt'A3 *'AI",)subst("B 12" "A2">jyr: 'A3I -ATA)Ssubst(B 13'A2,. lvii'A 1' 'A2.)S

subst("B14"*'AV, i*":A2 *AV.)$subst('B 14'*"A3",jyii "A2' A3.,%)$

subst("B 15'*A2,.Jyii* A3*' A2.X)$subst(B 16"*A2", lzrl*"A 1 'A",)

qsubst(B16A3,jzri'Al"*"A3,)$subsWB17"A", zri*"A2"*'A3 )$subst"B6*"'AV,jr* -A2*'A3*%)$

subs WD 18'*"A2",j zri * AYI'AT, %)$subst(TB i"'A2', Izii*'A1' V ",)subst(1T'A3,jzii*AlV'A3,)$subst("B2O""A1", izi*"2*A",X)$subst(B2OA3.jzii'A2'0A3",)$

subst(B2 1 *A2',jzii "A3"* A2",P)$

subst("B22",3*b*jxri,%)$subst("B23",3*b*jxii,%)$subst("B24",3*b*jyri,%)$subst("B25",3*b*jyii,%)$

subst("B26",3*b*jzri,%)$

subst("B27",3*b*jzii,%)$

Level three:

subst(C15152,"B.)

sub(C3 .1B+.52'.Z)$subst('C32'VB 1 *+B3'.3)Ssubst(C33.52.B'3'.X);

substC'C35VA30B 19'.X)$svbst('C36*.A3'B 16".X)$sebst(C37.A2',' I 3'.X)tsvbs(C30A2B0.Zh'

subst(C39.AY'B'20,ZX)Ssbst('C40'.A3**B 1 7'.)$subst(C4 1'A2'B7.X)Ssubt(-C42.A2T'4.X);

sbst('C43'.A3*"B7'.Z)*subst(C44.AY' B 14'.X)$subst("C45VA3'* B4'.X)$subst(C46VA3'81B I VAL

Level four:

substWD I -,sqrt(t i%)

subs(D33 Ti rc*Xi ,Z)1;3/),)

subs(D34. I/C1'CX),E

subst("D35","C37"+"C35",%);

subst("D36","C40"+"C42",%)$subst("D37","C39"+"C41",%);

subst("D38","C45"+"C46",%)$subst("D39","C43"+"C44",%);


Level five:

subst('E I ,bwa'D.1)

subst('E2'( I /ti ).X)$

subst('E4'.b2-D03Y.S)tsubst('E5'.3 I D33'.3)$

subs(E6E2'b D34E2.Ztsubst('E7- E2 .3*bsD35 * E2'.Z):subst('E8 'E2'.3*b*'D36'* E2'..)tsubst(E9E*2'.3*b*D7'*E2'Z):substLUE 10" *2*-3*b*D38*2ZFsgkstE 'E2.3,b'D39IE2.Z):

Level six:

subst("FI 1 ,cos('E I'),%)$subst(F2",sin('E T%subst(F4'YE5'-b2,X).

subst(F5.E2's 22,ZX)tsubst(F6.E2'*B23'.Z)Ssubst('F7'7E4'*jxrl .S)tsubst(UF37E4mjxii.%)$Subst(1.'E2'.X)tsubstl.E.3'.X)

9 subst(X*'F3'..Z)&

subst(F9.E2'9 B24.,Z)t5ubst("F 1 O'.E2"* B25'.Z)Ssubst'F II'.E4"*Jyri.Z)tsubst('F 12'VE4'Jyii.$subst( I. E2'.X)t

A substOj.'EYZSsubstUI*F3'.1.Z)%

subsL('F 13".E 2' "026'. X)substU'F 147E28'27.X)$subst.(F 15'.E 4'*jzri. %)S

*subst('F 16'.E4"*Jzi.Z)$subst(l.E2'.Z)tsubst(1.-EYMI)


Level seven:

subst("G20","F8"-"F5",%)$subst("G21","F7"+"F6",%)$subst("G22","F4"*"D34",%)$subst("G23","F4"*"D35",%);

subst("G30","F12"-"F9",%)$subst("G31","F11"+"F10",%)$

subst("G32","F4"*"D36",%)$subst("G33","F4"*"D37",%);

subst("G40","F16"-"F13",%)$subst("G41","F15"+"F14",%)$subst("G42","F4"*"D38",%)$

subst("G43","F4"*"D39",%);

Level eight:

subst("H22","C33"*"G21",%)$

subst("H23","C33"*"G20",%)$subst("H20"+"H22","H22"+"G22"-"E7",%)$subst("H21"+"H23","H23"+"G23"+"E6",%);

subst("H32","C32"*"G31",%)$subst("H33","C32"*"G30",%)$subst("H30"+"H32","H32"+"G32"-"E9",%)$

subst("H31"+"H33","H33"+"G33"+"E8",%)$

subst("H42","C31"*"G41",%)$subst("H43","C31"*"G40",%)$

subst("H40"+"H42","H42"+"G42"-"E11",%)$subst("H41"+"H43","H43"+"G43"+"E10",%);

Level nine:

subst("J20","H23"+"H21",%)$subst("J21","H22"+"H20",%);

subst("J30","H33"+"H31",%)$subst("J31","H32"+"H30",%);

subst("J40","H43"+"H41",%)$subst("J41","H42"+"H40",%);


Level ten:

subst("K20","F1"*"J21",%)$subst("K21","F2"*"J20",%);subst("K22","F1"*"J20",%)$subst("K23","F2"*"J21",%);

subst("K30","F1"*"J31",%)$subst("K31","F2"*"J30",%);subst("K32","F1"*"J30",%)$subst("K33","F2"*"J31",%);

subst("K40","F1"*"J41",%)$subst("K41","F2"*"J40",%);subst("K42","F1"*"J40",%)$subst("K43","F2"*"J41",%);

Level eleven:

subst("L20","K21"+"K20",%);subst("L21","K22"-"K23",%);

subst("L30","K31"+"K30",%);subst("L31","K32"-"K33",%);

subst("L40","K41"+"K40",%);subst("L41","K42"-"K43",%);

Level twelve:

subst("M1","F3"*"L20",%);

subst("M2","F3"*"L21",%);

subst("M3","F3"*"L30",%);

subst("M4","F3"*"L31",%);

subst("M5","F3"*"L40",%);

subst("M6","F3"*"L41",%);


This appendix provides the supplementary material for the development of the Parallel VWE. Presented here are the full size data-flow graphs which provide a graphical representation of the specification data developed as a result of this thesis effort. The text is organized into three sections: the Parallel Vector Potential, the Parallel Magnetic Field, and the Parallel Electric Field. The order of presentation for all sections is:

1. The real part of the "X" directed vector quantity.

2. The imaginary part of the vector quantity.

3. The real and imaginary part (paired) of the vector quantity.

4. The combined Cartesian set for the vector quantity.

The final data-flow graph, Figure 28, is the presentation of the fully overlaid vector quantities.
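
A data-flow graph of this kind can be represented as a table of operand dependencies, from which the computational level of each node follows as one more than the highest level among its operands. The Python sketch below is an illustration under stated assumptions, not the thesis's specification tool: the dependency table is transcribed from the level seven through twelve substitutions for the X directed electric field, and the operands C33, D34, D35, E6, E7 and F1 ... F8 are assumed to become available at the levels suggested by their letter prefixes (three through six). The names `node_levels`, `DEPS_EX` and `AVAILABLE` are hypothetical.

```python
def node_levels(deps, available):
    """Return the level of every node: 1 + the maximum level of its operands."""
    levels = dict(available)

    def level(name):
        if name not in levels:
            levels[name] = 1 + max(level(d) for d in deps[name])
        return levels[name]

    for name in deps:
        level(name)
    return levels

# Operands of each intermediate quantity for the X component (levels 7-12).
DEPS_EX = {
    "G20": ["F8", "F5"], "G21": ["F7", "F6"],
    "G22": ["F4", "D34"], "G23": ["F4", "D35"],
    "H22": ["C33", "G21"], "H23": ["C33", "G20"],
    "H20": ["G22", "E7"], "H21": ["G23", "E6"],
    "J20": ["H23", "H21"], "J21": ["H22", "H20"],
    "K20": ["F1", "J21"], "K21": ["F2", "J20"],
    "K22": ["F1", "J20"], "K23": ["F2", "J21"],
    "L20": ["K21", "K20"], "L21": ["K22", "K23"],
    "M1": ["F3", "L20"], "M2": ["F3", "L21"],
}

# Assumed availability levels of the operands produced before level seven.
AVAILABLE = {"C33": 3, "D34": 4, "D35": 4, "E6": 5, "E7": 5}
AVAILABLE.update({f"F{i}": 6 for i in range(1, 9)})

LEVELS_EX = node_levels(DEPS_EX, AVAILABLE)
```

Under these assumptions the G, H, J, K, L and M quantities fall on levels seven through twelve respectively, matching the organization of the listings.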


Figure 24. Real Part of X Directed Electric Field

Figure 25. Imaginary Part of X Directed Electric Field

Figure 26. Real and Imaginary Part (Paired) of X Directed Electric Field




Captain Jack L. Strauss was born 6 April 1957 in Inglewood, California. He graduated from high school in 1975 and attended Orange Coast College for a year prior to enlisting in the Air Force. As an enlisted member he performed duties in the 3055X career field (Electronic Computer Systems Specialist). He was accepted into the Airman Scholarship and Commissioning Program in 1979 and ultimately graduated from Loyola Marymount University with a BSEE. He was assigned to HQ SISD, Offutt AFB, Nebraska as a Communications Systems Development Engineer and became the SAC test director for the SACDIN program. He was awarded the Air Force Commendation Medal upon leaving Offutt and entered the School of Engineering at the Air Force Institute of Technology, WPAFB, Ohio in May 1985.

Permanent Address: P.O. Box 1774
Garden Grove, CA 92642


REPORT DOCUMENTATION PAGE (DD Form 1473)

Security classification: UNCLASSIFIED

Performing organization report number: AFIT/GE/ENG/87M-5

Name of performing organization: School of Engineering, Air Force Institute of Technology, Wright-Patterson AFB, Ohio 45433

Title: see box 19

Personal author: Jack L. Strauss, Capt, USAF

Type of report: MS Thesis

Date of report: 1987 March

From Block 19:

An algorithm has been specified for the [...] implementation of the numerical solution of the electromagnetic fields of an arbitrary current source. The algorithm, defined as the Vector Wave Equation (VWE), solves for the magnetic (H) and electric (E) fields [...] the Vector Potential (A) of a finite length arbitrary current [...]. The specified algorithm has been verified through FORTRAN simulation to produce [...] to within [...] decimal places for a 100 sub-element [...].

The VWE forms a model which is algorithmically symmetric with respect to the Cartesian coordinate system. As such, the VWE lends itself to highly parallel and concurrent computational techniques. This property of the algorithm makes it an excellent candidate for implementation by a Very High Speed Integrated Circuit (VHSIC) class processor [...] a parallel, highly concurrent architecture. [...] implementation has yielded preliminary results that computational savings of [...] and a throughput increase of [...] orders of magnitude [...].

This research has [...] that an application specific [...] processor array has sufficient computing power to support [...] a set of equations which solve for the magnetic and electric fields as well as the vector potential of an arbitrary current [...].
