Research Article

An Efficient Algorithm for EM Scattering from Anatomically Realistic Human Head Model Using Parallel CG-FFT Method

Lei Zhao and Gen Chen

Center for Computational Science and Engineering, School of Mathematics and Statistics, Jiangsu Normal University, Xuzhou 221116, China

Correspondence should be addressed to Lei Zhao; lzhaomax@163.com

Received 1 December 2013; Revised 2 February 2014; Accepted 16 February 2014; Published 24 March 2014

Academic Editor: Gaobiao Xiao

Hindawi Publishing Corporation, International Journal of Antennas and Propagation, Volume 2014, Article ID 495057, 8 pages, http://dx.doi.org/10.1155/2014/495057

Copyright © 2014 L. Zhao and G. Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

An efficient algorithm is proposed to analyze the electromagnetic scattering problem from a high-resolution head model with pixel data format. The algorithm is based on a parallel technique and the conjugate gradient (CG) method combined with the fast Fourier transform (FFT). Using the parallel CG-FFT method, the proposed algorithm can solve electrically large problems that cannot be solved using the conventional CG-FFT method on a personal computer. The accuracy of the proposed algorithm is verified by comparing numerical results with analytical Mie-series solutions for dielectric spheres. Numerical experiments demonstrate that the proposed method has good parallel efficiency.

1. Introduction

In recent years, there has been an increasing effort to achieve an efficient numerical analysis of large-scale electromagnetic problems, which usually require much computational time and large computer memory. An efficient numerical method for large and complex bodies is very important for many practical applications. The method of moments (MoM) [1] has become one of the most popular methods for computing scattering problems in a variety of applications [2–6]. However, MoM requires $O(N^2)$ memory and $O(N^3)$ computational load to solve the matrix equation using LU decomposition or Gaussian elimination, where $N$ is the number of unknowns. To reduce the computational time, CG-FFT is employed to solve the MoM matrix equation. It is one of the most efficient ways to solve the volume integral equation for dielectric targets and reduces the computational complexity to $O(N \log N)$ in each iteration [7–10]. For most practical EM problems, a regular computer is insufficient because of its limited memory and performance. New developments in parallel-processing techniques and high-performance computer (HPC) systems make it possible to solve large problems that were unattainable in the past. To exploit them, it becomes more and more important to develop and parallelize fast algorithms with high parallel performance, so that they can benefit from the large memory and the many processors of an HPC system [11–13].

In the past few decades, the energy absorption in the human head exposed to radio-frequency (RF) electromagnetic radiation has raised increasing concern about the possible consequences of electromagnetic radiation for human health. Many studies have calculated the power absorbed in a human body exposed to the electromagnetic (EM) field emitted by radio-communication equipment [14–17]. In this paper, the EM scattering problem from a high-resolution 3D anatomically realistic model of the human head is considered. Volume integral equations are applied to describe the problem. MoM is then used to discretize the coupled integral equations, and a CG-FFT algorithm is proposed to solve the resulting discrete linear system. Parallelization techniques are applied to speed up the FFT calculation, the vector-vector product, and the matrix-vector product during the CG iteration. The paper presents a detailed description of the proposed parallel implementation of the CG-FFT algorithm with pulse basis functions. The different stages of the parallel algorithm are described, and its overall parallel performance is analyzed carefully.




Figure 1: Electric field distribution on the center line of plane $z = 0.1$. Each panel is titled "The field Ex on the center line of plane z = 0.1" and compares the analytical solution with the CG-FFT solution. (a) $N_x = N_y = N_z = 32$. (b) $N_x = N_y = N_z = 64$.

With this implementation, we have run a benchmark model test with more than 400 million unknowns and solved a practical EM scattering problem with more than 40 million unknowns using an HPC system of 27 nodes. Each node of the cluster has two Intel Xeon E5520 CPUs and 12 GB of memory, and the nodes are connected by a 10 Gbps Ethernet high-speed network. We have verified the accuracy and efficiency of the algorithm by comparing the numerical results with analytical results for dielectric spheres. Numerical results show that the proposed method has good parallel performance.

Figure 2: Electric fields on the plane $z = 1.0$ m. (a) Parallel CG-FFT results (panel title: "CG-FFT Ex field on z = 1 m"). (b) Analytical results (panel title: "Analytical solution Ex field on z = 1 m").

Figure 3: Network latency of nodes. Axes: message size (bytes, scale $\times 10^7$) versus latency (microseconds). Curves: internode and inside node.


Figure 4: The performance of parallel CG-FFT. Axes: number of nodes versus performance (scale $\times 10^5$).

Figure 5: Parallel efficiency of different cases. Axes: processes versus parallel efficiency. Curves: 64 × 64 × 64, 128 × 128 × 128, 256 × 256 × 512, and 512 × 512 × 512 cells.

Figure 6: A cut plane of the HUGO human head model.

Figure 7: The total electric field distribution on head surface (color bar: "E field on head").

Figure 8: The total electric field distribution on eyes (color bar: "E field on eyes").

2. Theory and Methods

Consider a 3D dielectric object of arbitrary shape located in a homogeneous space characterized by relative permittivity $\varepsilon_b$; here the homogeneous space is taken to be free space, $\varepsilon_b = 1$. The arbitrarily shaped dielectric object, with complex permittivity $\varepsilon_r(r)$, is enclosed by a cuboid $L_x \times L_y \times L_z$. A time dependence of $e^{-i\omega t}$ is assumed and suppressed. Under the illumination of the incident electric field, the total electric field $E$ inside the dielectric object can be determined through the following volume integral equation:

$$E(r) + \frac{1}{4\pi\varepsilon_b}\int_V G(r, r')\,\bigl(\varepsilon_r(r') - \varepsilon_b\bigr)\,E(r')\,dr' = E^{\mathrm{inc}}(r), \tag{1}$$

where

$$G(r, r') = \frac{e^{i k_b R}}{R^5}
\begin{bmatrix}
G_{xx} & G_{xy} & G_{xz} \\
G_{yx} & G_{yy} & G_{yz} \\
G_{zx} & G_{zy} & G_{zz}
\end{bmatrix} \tag{2}$$


Figure 9: The total electric field distribution on bone (color bar: "E field on bone").

Figure 10: The total electric field distribution on brain (color bar: "E field on brain").

is the dyadic Green's function in homogeneous space, with $R = |r - r'|$ and $k_b$ the wavenumber of the background. Its elements are given by

$$\begin{aligned}
G_{\xi\zeta} &= (\xi - \xi')(\zeta - \zeta')\left[(k_b R)^2 + i3(k_b R) - 3\right], && (\xi \neq \zeta), \\
G_{\xi\zeta} &= (\xi - \xi')^2\left[(k_b R)^2 + i3(k_b R) - 3\right] - R^2\left[(k_b R)^2 + i(k_b R) - 1\right], && (\xi = \zeta).
\end{aligned} \tag{3}$$

The equivalent version for the induced current $J$ can be approximately obtained by

$$J(r) + \frac{\chi(r)}{4\pi}\int_V G(r, r') \cdot J(r')\,dr' = J^{\mathrm{inc}}(r), \tag{4}$$

where $\chi(r) = \varepsilon_r(r)/\varepsilon_b - 1$ and

$$J(r) = \chi(r)\,E(r), \qquad J^{\mathrm{inc}}(r) = \chi(r)\,E^{\mathrm{inc}}(r) \tag{5}$$

are the normalized electric current inside the dielectric object and the equivalent incident current, respectively.

A box of size $L_x \times L_y \times L_z$ is used to bound the dielectric target and is discretized into $N_x \times N_y \times N_z$ cuboidal cells. The volume of each cell is then $\Delta V = \Delta x\,\Delta y\,\Delta z$, where $\Delta\xi = L_\xi / N_\xi$ ($\xi = x, y, z$) and $N_\xi$ is the division number in the $\xi$-direction. Choosing the pulse function as the basis and testing function, we obtain the discrete form of (4) as

$$J^{D}_{\xi}(m, n, k) + \frac{\chi(m, n, k)}{4\pi}\sum_{\zeta = x, y, z}\;\sum_{m'=0}^{N_x - 1}\sum_{n'=0}^{N_y - 1}\sum_{k'=0}^{N_z - 1} G^{D}_{\xi\zeta}(m - m', n - n', k - k')\,J^{D}_{\zeta}(m', n', k') = J^{iD}_{\xi}(m, n, k), \tag{6}$$

in which

$$\begin{aligned}
G^{D}_{\xi\zeta}(m - m', n - n', k - k') &= \Delta V \int_{m'\Delta x}^{(m'+1)\Delta x}\int_{n'\Delta y}^{(n'+1)\Delta y}\int_{k'\Delta z}^{(k'+1)\Delta z} G_{\xi\zeta}\!\left(\left(m + \tfrac{1}{2}\right)\Delta x - x',\ \left(n + \tfrac{1}{2}\right)\Delta y - y',\ \left(k + \tfrac{1}{2}\right)\Delta z - z'\right) dx'\,dy'\,dz', \\
J^{iD}_{\xi}(m, n, k) &= \int_{m\Delta x}^{(m+1)\Delta x}\int_{n\Delta y}^{(n+1)\Delta y}\int_{k\Delta z}^{(k+1)\Delta z} J^{\mathrm{inc}}_{\xi}(x, y, z)\,dx\,dy\,dz.
\end{aligned} \tag{7}$$

We remark that the above formulations (6)-(7) actually imply the scattering by small particles with the size of $\Delta V$ because of the use of pulse basis functions, although the dielectric targets may be continuous. We can convert (6) into a linear system of equations

$$Z \cdot I = V, \tag{8}$$


Table 1: The HPC hardware information.

CPU type            Intel Xeon E5520
Clock speed         2.67 GHz
Number of nodes     27
Available memory    30 × 12 GB (DDR3 1067 MHz)
Operating system    CentOS (Linux)
Network system      BNT 10 Gbps Ethernet

Table 2: Performance test.

Cells              Processes   Number of iterations   Time (seconds)
64 × 64 × 64       1           30                     22.8
128 × 128 × 128    2           30                     90.4
256 × 256 × 128    4           30                     258.9
256 × 256 × 256    6           30                     367.9
256 × 256 × 512    8           30                     612.3
512 × 512 × 512    10          30                     3289.7

Table 3: Parallel efficiency.

Cells              Processes   Time (seconds)   Parallel efficiency
64 × 64 × 64       1           22.8             1
                   2           12.38            0.92
                   4           7.2              0.80
128 × 128 × 128    1           168.1            1
                   2           90.4             0.93
                   4           50.6             0.83
                   8           31.8             0.66
256 × 256 × 512    1           2988.3           1
                   2           1641.8           0.91
                   4           940.1            0.79
                   8           612.3            0.61
512 × 512 × 512    1           19409.2          1
                   2           10904.6          0.89
                   4           6469.7           0.75
                   10          3289.7           0.59
                   12          3171.4           0.51

where $Z$ is an $N \times N$ system matrix, $I$ is a column vector with the coefficients of the unknown currents, and $V$ is a column vector associated with the incident fields in the dielectric object. Here $N = 3 N_x N_y N_z$ is the total number of unknowns. However, the inner products in (6) are all 3D summations of the products of discrete Green's functions and discrete electric currents, which are quite time and memory consuming. For electrically large electromagnetic problems, $N$ is very large and it is very difficult to solve (8) directly. In order to compute the products of Green's functions and electric currents quickly, the discrete Green's functions are extended to a larger computational domain as

$$G^{e}_{\xi\zeta}(m, n, k) = \pm\, G^{D}_{\xi\zeta}(m_0, n_0, k_0), \tag{9}$$

Table 4: Tissue parameters for HUGO model.

Tissue type           Relative permittivity   Relative permeability   Conductivity (S/m)
Skin                  41.405334               1                       0.86678
Fat                   11.333888               1                       0.109162
Muscle                56.879063               1                       0.995364
Cartilage             42.653103               1                       0.782333
Cerebrospinal fluid   68.638336               1                       2.412575
Sclera                55.27013                1                       1.166726
Vitreous              68.90184                1                       1.636162
Lens nucleus          35.841595               1                       0.484917
Grey matter           52.724701               1                       0.942193
White matter          38.886288               1                       0.590815
Nerve                 32.530067               1                       0.573612
Thyroid               59.683323               1                       1.038448
Tongue                55.27013                1                       0.936192
Bone                  20.787804               1                       0.339975
Blood                 61.360718               1                       1.538069
Air                   1                       1                       0

where $0 \le m \le 2N_x - 1$, $0 \le n \le 2N_y - 1$, $0 \le k \le 2N_z - 1$, and

$$\begin{aligned}
m_0 &= \begin{cases} m, & 0 \le m \le N_x - 1, \\ 2N_x - m, & N_x \le m \le 2N_x - 1, \end{cases} \\
n_0 &= \begin{cases} n, & 0 \le n \le N_y - 1, \\ 2N_y - n, & N_y \le n \le 2N_y - 1, \end{cases} \\
k_0 &= \begin{cases} k, & 0 \le k \le N_z - 1, \\ 2N_z - k, & N_z \le k \le 2N_z - 1. \end{cases}
\end{aligned} \tag{10}$$
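The index mapping in (10) is the same along each axis, so it can be sketched as a single one-dimensional helper. The function name `extended_index` is hypothetical and not taken from the authors' code; it simply transcribes one line of (10).

```c
#include <assert.h>

/* Map an index m of the extended grid (0 <= m <= 2n - 1) to the index m0
 * of the original discrete Green's function, as in one line of (10).
 * n is the division number along that axis (Nx, Ny, or Nz). */
int extended_index(int m, int n)
{
    assert(n > 0 && m >= 0 && m <= 2 * n - 1);
    return (m <= n - 1) ? m : 2 * n - m;
}
```

Calling the helper three times, once per axis, gives the triple $(m_0, n_0, k_0)$ used on the right-hand side of (9); the sign in (9) still has to be chosen separately from the parity of each component.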

The signs of the extended discrete Green's functions are directly related to the even and odd nature of the components with respect to the coordinates in the different extended subdomains. After defining the extended Green's functions, the equivalent electric current $J^{D}_{\xi}(m, n, k)$ can be defined in the extended domain by zero padding as

$$J^{De}_{\xi}(m, n, k) = \begin{cases} J^{D}_{\xi}(m, n, k), & 0 \le m \le N_x - 1,\ 0 \le n \le N_y - 1,\ 0 \le k \le N_z - 1, \\ 0, & \text{else}. \end{cases} \tag{11}$$
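The zero padding in (11) can be sketched in C as follows. This is a minimal example under stated assumptions: the function name `zero_pad_3d`, the row-major storage with the $k$ index varying fastest, and the single-precision complex type are choices made here for illustration and are not confirmed by the source.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <string.h>

/* Copy an nx*ny*nz array into the low-index corner of a zeroed
 * (2nx)*(2ny)*(2nz) array, as in (11). Both arrays are assumed to be
 * row-major with the k index varying fastest. */
void zero_pad_3d(const float complex *src, float complex *dst,
                 size_t nx, size_t ny, size_t nz)
{
    assert(src != NULL && dst != NULL);
    const size_t ey = 2 * ny, ez = 2 * nz;

    /* All-zero bytes represent 0 + 0i for IEEE floating point. */
    memset(dst, 0, 2 * nx * ey * ez * sizeof *dst);

    for (size_t m = 0; m < nx; m++) {
        for (size_t n = 0; n < ny; n++) {
            /* One contiguous run of nz values per (m, n) pair. */
            memcpy(&dst[(m * ey + n) * ez], &src[(m * ny + n) * nz],
                   nz * sizeof *src);
        }
    }
}
```

The padded array has eight times as many entries as the original, which is the memory cost paid for replacing the 3D summations in (6) with FFTs.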


```c
/* Vector-vector product on each process: update the current (cj*) and
   residual (cr*) vectors and accumulate the local squared residual norm.
   abs() denotes the magnitude of a complex residual entry. */
for (int i = 0; i < all; i++) {
    cjx[0][0][i] = cjx[0][0][i] + am * cpx[0][0][i];
    cjy[0][0][i] = cjy[0][0][i] + am * cpy[0][0][i];
    cjz[0][0][i] = cjz[0][0][i] + am * cpz[0][0][i];
    crx[0][0][i] = crx[0][0][i] - am * ctx[0][0][i];
    cry[0][0][i] = cry[0][0][i] - am * cty[0][0][i];
    crz[0][0][i] = crz[0][0][i] - am * ctz[0][0][i];
    reax = abs(crx[0][0][i]);
    reay = abs(cry[0][0][i]);
    reaz = abs(crz[0][0][i]);
    rnorm = rnorm + reax * reax + reay * reay + reaz * reaz;
}

/* Sum the norm of each process */
MPI_Allreduce(&rnorm, &rnorm1, 1, MPI_FLOAT, MPI_SUM, world);
```

Algorithm 1

Using the convolution theorem and the FFT, we can rewrite the discrete form (6) of the integral equation as [18, 19]

$$J^{D}_{\xi}(m, n, k) + \frac{\chi(m, n, k)}{4\pi}\,\mathcal{F}^{-1}\left\{\sum_{\zeta = x, y, z} \tilde{G}^{e}_{\xi\zeta}(i, j, l)\,\tilde{J}^{De}_{\zeta}(i, j, l)\right\} = \chi(m, n, k)\,E^{\mathrm{inc}}_{\xi}(m, n, k), \tag{12}$$

where $\tilde{G}^{e}_{\xi\zeta}(i, j, l)$ and $\tilde{J}^{De}_{\zeta}(i, j, l)$ are the discrete Fourier transforms (DFT) of $G^{e}_{\xi\zeta}(m, n, k)$ and $J^{De}_{\zeta}(m, n, k)$, respectively. Similarly, the corresponding adjoint operations can also be performed using the FFT. As a consequence, we can solve (12) rapidly through the CG-FFT algorithm [18]. In order to speed up the FFT calculation, a parallel FFT is used to obtain the FFT and inverse FFT results. In the proposed algorithm, both the FFT and the inverse FFT are implemented using the FFTW library, which is a C subroutine library for computing the discrete Fourier transform in one or more dimensions and supports distributed-memory implementation based on the message passing interface (MPI). The vector-vector product in the CG-FFT method, for example, can be parallelized by calling MPI_Allreduce(), as shown in Algorithm 1.

3. Numerical Results

To illustrate the accuracy and efficiency of the proposed parallel CG-FFT algorithm, we first consider the EM scattering by a dielectric sphere illuminated by plane waves, which has a closed-form solution. In the following examples, the background is free space. The dielectric sphere, with $\varepsilon_r = 4$ and $r = 0.2$ m, is illuminated by a plane wave. The incident wave is polarized in the $x$ direction and propagates in the $z$ direction, and the operating frequency is 0.3 GHz. The internal electric fields computed by parallel CG-FFT are compared with the analytical results in Figure 1, which shows that the numerical results are in good agreement with the analytical results. We have also computed the scattered electric fields from the dielectric object on the observation plane $z = 1.0$ m and compared them with the exact solutions, as shown in Figure 2.

We then test the parallel performance on an HPC system with 27 nodes (Table 1), in which the nodes are connected by 10 Gbps Ethernet. The benchmark model is a homogeneous cubic dielectric object with $\varepsilon_r = 4$ and an edge length of 0.4 m. The incident wave is the same plane wave as that in Figure 1. We compare the network latency inside a node and between nodes; that is, we test the network latency on one node and between two nodes, respectively. Figure 3 shows the results for both cases. From Figure 3, we can see that communication inside a node is about 4 times faster than between nodes, so internode communication will be a bottleneck for the parallel CG-FFT method. To evaluate the performance of the parallel CG-FFT code, we define the performance as follows:

$$\text{Performance (cells/s)} = \frac{N_x \times N_y \times N_z \times \text{Number of iterations}}{\text{Simulation time (s)}}. \tag{13}$$
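As a small worked example of (13), the helper below computes the performance figure from a grid size, an iteration count, and a run time. The function name `performance_cells_per_s` is hypothetical. Applied to the first row of Table 2, read here as 64 × 64 × 64 cells, 30 iterations, and 22.8 s, it gives roughly $3.4 \times 10^5$ cells/s.

```c
#include <assert.h>

/* Performance metric of (13): processed cells per second of run time. */
double performance_cells_per_s(long nx, long ny, long nz,
                               int iterations, double seconds)
{
    assert(seconds > 0.0);
    return (double)nx * (double)ny * (double)nz * iterations / seconds;
}
```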

The performance testing result of the parallel CG-FFT method is shown in Figure 4, and the detailed data are listed in Table 2. From Figure 4, we can see that the performance goes up when no more than 8 nodes are used and goes down when 10 nodes are used. The reason is that the network latency plays an important role when more than 8 nodes are used. The parallel efficiency is also tested, which is defined as

$$\text{Parallel Efficiency} = \frac{T_1}{P \cdot T_P}, \tag{14}$$

where $P$ is the number of processes, $T_1$ is the running time used by one process, and $T_P$ is the running time used by $P$ processes. Figure 5 shows the parallel efficiency of parallel CG-FFT for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3, we can see that the parallel efficiency is above 60% when no more than 8 nodes are used.
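The definition in (14) translates directly into code. The sketch below uses a hypothetical function name, `parallel_efficiency`. With the 128 × 128 × 128 entries of Table 3, read here as 168.1 s on one process and 31.8 s on 8 processes, it returns about 0.66, in agreement with the tabulated efficiency.

```c
#include <assert.h>

/* Parallel efficiency of (14): t1 is the one-process run time and
 * tp is the run time measured with p processes. */
double parallel_efficiency(double t1, int p, double tp)
{
    assert(p > 0 && tp > 0.0);
    return t1 / (p * tp);
}
```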

Finally, we use the proposed parallel CG-FFT method to simulate the EM scattering problem from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model [20], with a resolution of 1 × 1 × 1 mm, as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model can be obtained from FCC published data [21], as listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. With the object-oriented HUGO model, the field distribution over a specific object can be investigated. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.
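The source does not state how the entries of Table 4 are combined into the complex permittivity used in Section 2. A common choice, assumed here, is $\varepsilon = \varepsilon_r + i\sigma/(\omega\varepsilon_0)$, which is the lossy form consistent with the $e^{-i\omega t}$ time dependence. The sketch below applies that relation; the function name `complex_rel_permittivity` is hypothetical.

```c
#include <assert.h>
#include <complex.h>

/* Complex relative permittivity eps_r + i*sigma/(omega*eps0) for the
 * e^{-i omega t} convention. This relation is an assumption of the
 * example; the source only lists eps_r and sigma in Table 4. */
double complex complex_rel_permittivity(double eps_r, double sigma,
                                        double freq_hz)
{
    assert(freq_hz > 0.0);
    const double eps0 = 8.8541878128e-12;   /* vacuum permittivity, F/m */
    const double pi = 3.141592653589793;
    double omega = 2.0 * pi * freq_hz;
    return eps_r + I * (sigma / (omega * eps0));
}
```

For the skin entry of Table 4 at 900 MHz, the imaginary part comes out near 17.3, so the loss term is far from negligible compared with the real part of about 41.4.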

4. Conclusion

In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC computers. The code can run not only on shared-memory machines but also on distributed-memory ones, and it exhibits high scalability. Special attention was paid to communications during the matrix-vector product and the vector-vector product, which are a key point for the parallel performance. We solved a problem with more than 400 million unknowns on an HPC system of 27 nodes.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by the Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13_0973.

References

[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.

[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, "Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution," IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.

[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, "Moment method with isoparametric elements for three-dimensional anisotropic scatterers," Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.

[4] J. M. Jarem, "Method-of-moments solution of a parallel-plate waveguide aperture system," Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.

[5] D. E. Livesay and K. Chen, "Electromagnetic field induced inside arbitrarily shaped biological bodies," IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.

[6] T. K. Sarkar and E. Arvas, "An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation," IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.

[7] H. Gan and W. C. Chew, "A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems," Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.

[8] T. J. Cui, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[9] L. Zhao and T. J. Cui, "CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability," Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.

[10] L. Zhao, T. J. Cui, and W. D. Li, "An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method," Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.

[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.

[12] W. Yu, X. Yang, Y. Liu et al., "New development of parallel conformal FDTD method in computational electromagnetics engineering," IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.

[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, "MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetics," Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.

[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, "Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.

[15] G. Lazzi and O. P. Gandhi, "Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones," IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.

[16] A. K. Lee, H. D. Choi, and J. I. Choi, "Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz," IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.

[17] Q.-X. Li and O. P. Gandhi, "Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.

[18] T. J. Cui and W. C. Chew, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.

[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, "Specific absorption rate and temperature increases in the head of a cellular-phone user," IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.

[21] http://www.fcc.gov/fcc-bin/dielec.sh.


Figure 1: Electric field distribution (E_x) on the center line of the plane z = 0.1, comparing the analytical solution with the CG-FFT solution. (a) N_x = N_y = N_z = 32. (b) N_x = N_y = N_z = 64.

this implementation, we have run a benchmark model test with more than 400 million unknowns and solved a practical EM scattering problem with more than 40 million unknowns on an HPC system that includes 27 nodes. Each node of the cluster has two Intel Xeon E5520 CPUs and 12 GB of memory, and the nodes are connected by a 10 Gbps Ethernet high-speed network. We have verified the accuracy and efficiency of the algorithm by comparing the numerical results with analytical results for dielectric spheres. Numerical results show that the proposed method has good parallel performance.

Figure 2: Electric fields (E_x) on the plane z = 1.0 m. (a) Parallel CG-FFT results. (b) Analytical results.

Figure 3: Network latency of nodes (latency in microseconds versus message size in bytes, ×10^7), for internode and inside-node communication.

Figure 4: The performance of parallel CG-FFT (performance in cells/s, ×10^5, versus number of nodes).

Figure 5: Parallel efficiency for different cases (parallel efficiency versus number of processes for 64 × 64 × 64, 128 × 128 × 128, 256 × 256 × 512, and 512 × 512 × 512 cells).

Figure 6: A cut plane of the HUGO human head model.

Figure 7: The total electric field distribution on the head surface.

Figure 8: The total electric field distribution on the eyes.

2. Theory and Methods

Consider a 3D dielectric object of arbitrary shape embedded in a homogeneous background characterized by relative permittivity $\varepsilon_b$; here the background is free space, so $\varepsilon_b = 1$. The object, with complex permittivity $\varepsilon_r(\mathbf{r})$, is enclosed by a cuboid of size $L_x \times L_y \times L_z$. A time dependence of $e^{-i\omega t}$ is assumed and suppressed. Under illumination by the incident electric field, the total electric field $\mathbf{E}$ inside the dielectric object can be determined from the following volume integral equation:

$$\mathbf{E}(\mathbf{r}) + \frac{1}{4\pi\varepsilon_b} \int_V \mathbf{G}(\mathbf{r}, \mathbf{r}') \cdot \left(\varepsilon_r(\mathbf{r}') - \varepsilon_b\right) \mathbf{E}(\mathbf{r}')\, d\mathbf{r}' = \mathbf{E}^{\mathrm{inc}}(\mathbf{r}), \tag{1}$$

where

$$\mathbf{G}(\mathbf{r}, \mathbf{r}') = \frac{e^{i k_b R}}{R^5} \begin{bmatrix} G_{xx} & G_{xy} & G_{xz} \\ G_{yx} & G_{yy} & G_{yz} \\ G_{zx} & G_{zy} & G_{zz} \end{bmatrix} \tag{2}$$

Figure 9: The total electric field distribution on the bone.

Figure 10: The total electric field distribution on the brain.

is the dyadic Green's function in homogeneous space, with $R = |\mathbf{r} - \mathbf{r}'|$, in which the corresponding elements are given by

$$G_{\xi\zeta} = \begin{cases} (\xi - \xi')(\zeta - \zeta')\left[(k_b R)^2 + i3(k_b R) - 3\right], & \xi \neq \zeta, \\ (\xi - \xi')^2\left[(k_b R)^2 + i3(k_b R) - 3\right] - R^2\left[(k_b R)^2 + i(k_b R) - 1\right], & \xi = \zeta. \end{cases} \tag{3}$$

The equivalent version for the induced current $\mathbf{J}$ can be approximately obtained by

$$\mathbf{J}(\mathbf{r}) + \frac{\chi(\mathbf{r})}{4\pi} \int_V \mathbf{G}(\mathbf{r}, \mathbf{r}') \cdot \mathbf{J}(\mathbf{r}')\, d\mathbf{r}' = \mathbf{J}^{\mathrm{inc}}(\mathbf{r}), \tag{4}$$

where $\chi(\mathbf{r}) = \varepsilon_r(\mathbf{r})/\varepsilon_b - 1$, and

$$\mathbf{J}(\mathbf{r}) = \chi(\mathbf{r})\,\mathbf{E}(\mathbf{r}), \qquad \mathbf{J}^{\mathrm{inc}}(\mathbf{r}) = \chi(\mathbf{r})\,\mathbf{E}^{\mathrm{inc}}(\mathbf{r}) \tag{5}$$

are the normalized electric current inside the dielectric object and the equivalent incident current, respectively.

A box of size $L_x \times L_y \times L_z$ is used to bound the considered dielectric target and is discretized into $N_x \times N_y \times N_z$ cuboidal cells. The volume of each cell is then $\Delta V = \Delta x\,\Delta y\,\Delta z$, where $\Delta\xi = L_\xi / N_\xi$ ($\xi = x, y, z$) and $N_\xi$ is the number of divisions in the $\xi$-direction. Choosing pulse functions as the basis and testing functions, we obtain the discrete form of (4) as

$$J^{D}_{\xi}(m,n,k) + \frac{\chi(m,n,k)}{4\pi} \sum_{\zeta = x,y,z}\ \sum_{m'=0}^{N_x-1}\ \sum_{n'=0}^{N_y-1}\ \sum_{k'=0}^{N_z-1} G^{D}_{\xi\zeta}(m-m',\, n-n',\, k-k')\, J^{D}_{\zeta}(m',n',k') = J^{iD}_{\xi}(m,n,k), \tag{6}$$

in which

$$G^{D}_{\xi\zeta}(m-m',\, n-n',\, k-k') = \Delta V \int_{m'\Delta x}^{(m'+1)\Delta x} \int_{n'\Delta y}^{(n'+1)\Delta y} \int_{k'\Delta z}^{(k'+1)\Delta z} G_{\xi\zeta}\!\left(\left(m+\tfrac{1}{2}\right)\Delta x - x',\ \left(n+\tfrac{1}{2}\right)\Delta y - y',\ \left(k+\tfrac{1}{2}\right)\Delta z - z'\right) dx'\,dy'\,dz',$$

$$J^{iD}_{\xi}(m,n,k) = \int_{m\Delta x}^{(m+1)\Delta x} \int_{n\Delta y}^{(n+1)\Delta y} \int_{k\Delta z}^{(k+1)\Delta z} J^{\mathrm{inc}}_{\xi}(x,y,z)\, dx\,dy\,dz. \tag{7}$$

We remark that formulations (6)-(7) effectively describe scattering by small particles of size $\Delta V$ because of the use of pulse basis functions, although the dielectric targets may be continuous. We can convert (6) into a linear system of equations

$$Z \cdot I = V, \tag{8}$$

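Table 1: The HPC hardware information.

| Item | Value |
|---|---|
| CPU type | Intel Xeon E5520 |
| Clock speed | 2.67 GHz |
| Number of nodes | 27 |
| Available memory | 30 × 12 GB (DDR3 1067 MHz) |
| Operating system | CentOS (Linux) |
| Network system | BNT 10 Gbps Ethernet |

Table 2: Performance test.

| Cells | Processes | Number of iterations | Time (s) |
|---|---|---|---|
| 64 × 64 × 64 | 1 | 30 | 22.8 |
| 128 × 128 × 128 | 2 | 30 | 90.4 |
| 256 × 256 × 128 | 4 | 30 | 258.9 |
| 256 × 256 × 256 | 6 | 30 | 367.9 |
| 256 × 256 × 512 | 8 | 30 | 612.3 |
| 512 × 512 × 512 | 10 | 30 | 3289.7 |

Table 3: Parallel efficiency.

| Cells | Processes | Time (s) | Parallel efficiency |
|---|---|---|---|
| 64 × 64 × 64 | 1 | 22.8 | 1 |
| 64 × 64 × 64 | 2 | 12.38 | 0.92 |
| 64 × 64 × 64 | 4 | 7.2 | 0.80 |
| 128 × 128 × 128 | 1 | 168.1 | 1 |
| 128 × 128 × 128 | 2 | 90.4 | 0.93 |
| 128 × 128 × 128 | 4 | 50.6 | 0.83 |
| 128 × 128 × 128 | 8 | 31.8 | 0.66 |
| 256 × 256 × 512 | 1 | 2988.3 | 1 |
| 256 × 256 × 512 | 2 | 1641.8 | 0.91 |
| 256 × 256 × 512 | 4 | 940.1 | 0.79 |
| 256 × 256 × 512 | 8 | 612.3 | 0.61 |
| 512 × 512 × 512 | 1 | 19409.2 | 1 |
| 512 × 512 × 512 | 2 | 10904.6 | 0.89 |
| 512 × 512 × 512 | 4 | 6469.7 | 0.75 |
| 512 × 512 × 512 | 10 | 3289.7 | 0.59 |
| 512 × 512 × 512 | 12 | 3171.4 | 0.51 |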

where $Z$ is an $N \times N$ system matrix, $I$ is a column vector containing the coefficients of the unknown currents, and $V$ is a column vector associated with the incident fields in the dielectric object. Here $N = 3 N_x N_y N_z$ is the total number of unknowns. However, the inner products in (6) are all 3D summations of products of discrete Green's functions and discrete electric currents, which are quite time and memory consuming. For electrically large problems, $N$ is very large and it is very difficult to solve (8) directly. To evaluate the products of Green's functions and electric currents quickly, the discrete Green's functions are extended to a larger computational domain as

$$G^{e}_{\xi\zeta}(m,n,k) = \pm G^{D}_{\xi\zeta}(m_0, n_0, k_0), \tag{9}$$

Table 4: Tissue parameters for the HUGO model.

| Tissue type | Relative permittivity | Relative permeability | Conductivity (S/m) |
|---|---|---|---|
| Skin | 41.405334 | 1 | 0.86678 |
| Fat | 11.333888 | 1 | 0.109162 |
| Muscle | 56.879063 | 1 | 0.995364 |
| Cartilage | 42.653103 | 1 | 0.782333 |
| Cerebrospinal fluid | 68.638336 | 1 | 2.412575 |
| Sclera | 55.27013 | 1 | 1.166726 |
| Vitreous | 68.90184 | 1 | 1.636162 |
| Lens nucleus | 35.841595 | 1 | 0.484917 |
| Grey matter | 52.724701 | 1 | 0.942193 |
| White matter | 38.886288 | 1 | 0.590815 |
| Nerve | 32.530067 | 1 | 0.573612 |
| Thyroid | 59.683323 | 1 | 1.038448 |
| Tongue | 55.27013 | 1 | 0.936192 |
| Bone | 20.787804 | 1 | 0.339975 |
| Blood | 61.360718 | 1 | 1.538069 |
| Air | 1 | 1 | 0 |

where $0 \le m \le 2N_x - 1$, $0 \le n \le 2N_y - 1$, $0 \le k \le 2N_z - 1$, and

$$m_0 = \begin{cases} m, & 0 \le m \le N_x - 1, \\ 2N_x - m, & N_x \le m \le 2N_x - 1, \end{cases} \qquad n_0 = \begin{cases} n, & 0 \le n \le N_y - 1, \\ 2N_y - n, & N_y \le n \le 2N_y - 1, \end{cases} \qquad k_0 = \begin{cases} k, & 0 \le k \le N_z - 1, \\ 2N_z - k, & N_z \le k \le 2N_z - 1. \end{cases} \tag{10}$$

The signs of the extended discrete Green's functions are directly related to the even or odd nature of the components with respect to the coordinates in the different extended subdomains. After defining the extended Green's functions, the equivalent electric current $J^{D}_{\xi}(m,n,k)$ can be defined in the extended domain by zero padding as

$$J^{De}_{\xi}(m,n,k) = \begin{cases} J^{D}_{\xi}(m,n,k), & 0 \le m \le N_x - 1,\ 0 \le n \le N_y - 1,\ 0 \le k \le N_z - 1, \\ 0, & \text{else}. \end{cases} \tag{11}$$
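The index folding of (10) and the zero padding of (11) can be sketched in C as follows. This is a minimal illustration, not the authors' implementation: the names `fold_index` and `zero_pad` are hypothetical, a real-valued array with k as the fastest index is assumed for brevity (the actual currents are complex), and the sign factor of (9) is not modeled.

```c
#include <assert.h>
#include <stdlib.h>

/* Fold an extended-domain index m in [0, 2N-1] back to the index m0
 * of eq. (10): m0 = m for m <= N-1, and m0 = 2N - m otherwise. */
int fold_index(int m, int N)
{
    assert(m >= 0 && m < 2 * N);
    return (m < N) ? m : 2 * N - m;
}

/* Copy an Nx*Ny*Nz array (k fastest) into the corner of a zero-filled
 * (2Nx)*(2Ny)*(2Nz) array, as in eq. (11). Returns NULL on allocation
 * failure; the caller is responsible for free(). */
double *zero_pad(const double *J, int Nx, int Ny, int Nz)
{
    size_t ex = 2 * (size_t)Nx, ey = 2 * (size_t)Ny, ez = 2 * (size_t)Nz;
    double *Je = calloc(ex * ey * ez, sizeof *Je);
    if (Je == NULL) {
        return NULL;
    }
    for (size_t m = 0; m < (size_t)Nx; m++)
        for (size_t n = 0; n < (size_t)Ny; n++)
            for (size_t k = 0; k < (size_t)Nz; k++)
                Je[(m * ey + n) * ez + k] =
                    J[(m * (size_t)Ny + n) * (size_t)Nz + k];
    return Je;
}
```

Doubling each dimension makes the circular convolution computed by the FFT coincide with the linear convolution of (6) on the original N_x × N_y × N_z cells, at the cost of an eightfold larger transform.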

    /* Vector-vector product on each process */
    for (int i = 0; i < all; i++) {
        cjx[0][0][i] = cjx[0][0][i] + am * cpx[0][0][i];
        cjy[0][0][i] = cjy[0][0][i] + am * cpy[0][0][i];
        cjz[0][0][i] = cjz[0][0][i] + am * cpz[0][0][i];
        crx[0][0][i] = crx[0][0][i] - am * ctx[0][0][i];
        cry[0][0][i] = cry[0][0][i] - am * cty[0][0][i];
        crz[0][0][i] = crz[0][0][i] - am * ctz[0][0][i];
        reax = abs(crx[0][0][i]);
        reay = abs(cry[0][0][i]);
        reaz = abs(crz[0][0][i]);
        rnorm = rnorm + reax * reax + reay * reay + reaz * reaz;
    }

    /* Sum norm of each process */
    MPI_Allreduce(&rnorm, &rnorm1, 1, MPI_FLOAT, MPI_SUM, world);

Algorithm 1

Using the convolution theorem and the FFT, the discrete integral equation (6) can be written as [18, 19]

$$J^{D}_{\xi}(m,n,k) + \frac{\chi(m,n,k)}{4\pi}\, \mathcal{F}^{-1}\left\{ \sum_{\zeta = x,y,z} \tilde{G}^{e}_{\xi\zeta}(i,j,l)\, \tilde{J}^{De}_{\zeta}(i,j,l) \right\} = \chi(m,n,k)\, E^{\mathrm{inc}}_{\xi}(m,n,k), \tag{12}$$

where $\tilde{G}^{e}_{\xi\zeta}(i,j,l)$ and $\tilde{J}^{De}_{\zeta}(i,j,l)$ are the discrete Fourier transforms (DFTs) of $G^{e}_{\xi\zeta}(m,n,k)$ and $J^{De}_{\zeta}(m,n,k)$, respectively. Similarly, the corresponding adjoint operations can also be performed using the FFT. As a consequence, we can solve (12) rapidly through the CG-FFT algorithm [18]. To speed up the FFT calculation, a parallel FFT is used to obtain the forward and inverse FFT results. In the proposed algorithm, both the FFT and the inverse FFT are implemented using the FFTW library, which is a C subroutine library for computing the discrete Fourier transform in one or more dimensions and supports a distributed-memory implementation based on the message passing interface (MPI). For example, the vector-vector product in the CG-FFT method can be parallelized by calling MPI_Allreduce(), as shown in Algorithm 1.

3. Numerical Results

To illustrate the accuracy and efficiency of the proposed parallel CG-FFT algorithm, we first consider EM scattering by a dielectric sphere illuminated by a plane wave, which has a closed-form solution. In the following examples, the background is free space. The dielectric sphere, with $\varepsilon_r = 4$ and radius $r = 0.2$ m, is illuminated by a plane wave that is polarized in the $x$ direction and propagates in the $z$ direction at an operating frequency of 0.3 GHz. Figure 1 compares the internal electric fields computed by parallel CG-FFT with the analytical results and shows good agreement. We have also computed the scattered electric fields from the dielectric object on the observation plane $z = 1.0$ m and compared them with the exact solutions, as shown in Figure 2.

Next, we test the parallel performance on an HPC cluster with 27 nodes (Table 1), in which the nodes are connected by 10 Gbps Ethernet. The benchmark model is a homogeneous dielectric cube with $\varepsilon_r = 4$ and an edge length of 0.4 m. The incident wave is the same plane wave as in Figure 1. We compare the network latency inside a node and between nodes; that is, we test the latency on one node and between two nodes, respectively. Figure 3 shows the results for both cases. From Figure 3, we can see that communication inside a node is about 4 times faster than between nodes, so internode communication will be a bottleneck for the parallel CG-FFT method. To evaluate the performance of the parallel CG-FFT code, we define the performance as follows:

$$\text{Performance (cells/s)} = \frac{N_x \times N_y \times N_z \times \text{Number of iterations}}{\text{Simulation time (s)}}. \tag{13}$$
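As a worked instance of (13), the small C helper below computes the metric directly. The function name `performance_cells_per_s` is hypothetical, and the numbers in the usage note are illustrative values for a 64 × 64 × 64 grid.

```c
#include <assert.h>

/* Performance in cells/s as defined in eq. (13):
 * (Nx * Ny * Nz * number of iterations) / simulation time in seconds. */
double performance_cells_per_s(long nx, long ny, long nz,
                               int iterations, double seconds)
{
    assert(seconds > 0.0);
    return (double)nx * ny * nz * iterations / seconds;
}
```

For example, 64 × 64 × 64 cells, 30 iterations, and a run time of 22.8 s give about 3.4 × 10^5 cells/s.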

The performance results of the parallel CG-FFT method are shown in Figure 4, and the detailed data are listed in Table 2. Figure 4 shows that the performance increases when no more than 8 nodes are used and decreases with 10 nodes. The reason is that network latency plays an important role when more than 8 nodes are used. The parallel efficiency is also tested, which is defined as

$$\text{Parallel efficiency} = \frac{T_1}{P \times T_P}, \tag{14}$$

where $P$ is the number of processes, $T_1$ is the running time used by one process, and $T_P$ is the running time used by $P$ processes. Figure 5 shows the parallel efficiency of parallel CG-FFT for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3, we can see that the parallel efficiency is above 60% when no more than 8 nodes are used.
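The definition in (14) is simple enough to check by hand; the C helper below (with the hypothetical name `parallel_efficiency`) makes the arithmetic explicit.

```c
#include <assert.h>

/* Parallel efficiency as defined in eq. (14): T1 / (P * TP), where t1 is
 * the one-process run time and tp the run time with p processes. */
double parallel_efficiency(double t1, int p, double tp)
{
    assert(p > 0 && tp > 0.0);
    return t1 / (p * tp);
}
```

With illustrative timings of 168.1 s on one process and 31.8 s on eight processes, the efficiency is 168.1 / (8 × 31.8), or roughly 0.66; an ideal speedup would give exactly 1.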

Finally, we use the proposed parallel CG-FFT method to simulate EM scattering from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model [20], with a resolution of 1 × 1 × 1 mm as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model are taken from FCC published data [21], as listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. With the object-oriented HUGO model, the field distribution over a specific object can be investigated. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.
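The text does not spell out how $\varepsilon_r$ and $\sigma$ are combined into the complex permittivity used in the integral equation. A common choice, assumed here rather than taken from the paper, is the lossy-dielectric relation $\varepsilon = \varepsilon_r + i\sigma/(\omega\varepsilon_0)$, which matches the $e^{-i\omega t}$ time convention. The C sketch below uses the hypothetical names `complex_permittivity` and `contrast`.

```c
#include <assert.h>
#include <complex.h>

/* Complex relative permittivity for the exp(-i*omega*t) convention,
 * assuming eps = eps_r + i * sigma / (omega * eps0).
 *   eps_r   : relative permittivity (dimensionless)
 *   sigma   : conductivity in S/m
 *   freq_hz : frequency in Hz */
double complex complex_permittivity(double eps_r, double sigma, double freq_hz)
{
    const double eps0 = 8.8541878128e-12; /* vacuum permittivity, F/m */
    const double pi = 3.141592653589793;
    assert(freq_hz > 0.0);
    return eps_r + I * sigma / (2.0 * pi * freq_hz * eps0);
}

/* Contrast chi = eps / eps_b - 1 appearing in eqs. (4)-(5). */
double complex contrast(double complex eps, double eps_b)
{
    return eps / eps_b - 1.0;
}
```

Under this assumption, skin at 900 MHz with a relative permittivity of 41.405334 and a conductivity of 0.86678 S/m has an imaginary part of about 17.3, so the loss term is far from negligible for head tissues.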

4. Conclusion

In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC computers. The code can run not only on shared-memory machines but also on distributed-memory ones, and it shows high scalability. Special attention was paid to communication during the matrix-vector and vector-vector products, which is a key point for parallel performance. We solved a problem with more than 400 million unknowns on an HPC system with 27 nodes.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by the Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13 0973.

References

[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.

[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, “Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution,” IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.

[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, “Moment method with isoparametric elements for three-dimensional anisotropic scatterers,” Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.

[4] J. M. Jarem, “Method-of-moments solution of a parallel-plate waveguide aperture system,” Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.

[5] D. E. Livesay and K. Chen, “Electromagnetic field induced inside arbitrarily shaped biological bodies,” IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.

[6] T. K. Sarkar and E. Arvas, “An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation,” IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.

[7] H. Gan and W. C. Chew, “A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems,” Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.

[8] T. J. Cui, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[9] L. Zhao and T. J. Cui, “CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability,” Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.

[10] L. Zhao, T. J. Cui, and W. D. Li, “An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method,” Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.

[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.

[12] W. Yu, X. Yang, Y. Liu et al., “New development of parallel conformal FDTD method in computational electromagnetics engineering,” IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.

[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, “MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetics,” Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.

[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, “Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.

[15] G. Lazzi and O. P. Gandhi, “Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones,” IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.

[16] A. K. Lee, H. D. Choi, and J. I. Choi, “Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz,” IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.

[17] Q.-X. Li and O. P. Gandhi, “Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.

[18] T. J. Cui and W. C. Chew, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.

[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, “Specific absorption rate and temperature increases in the head of a cellular-phone user,” IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.

[21] http://www.fcc.gov/fcc-bin/dielec.sh.



119891119900119903(119894119899119905 119894 = 0 119894 lt 119886119897119897 119894 + +)

119888119895119909[0][0][119894] = 119888119895119909[0][0][119894] + 119886119898 lowast 119888119901119909[0][0][119894]119888119895119910[0][0][119894] = 119888119895119910[0][0][119894] + 119886119898 lowast 119888119901119910[0][0][119894]119888119895119911[0][0][119894] = 119888119895119911[0][0][119894] + 119886119898 lowast 119888119901119911[0][0][119894]119888119903119909[0][0][119894] = 119888119903119909[0][0][119894] minus 119886119898 lowast 119888119905119909[0][0][119894]119888119903119910[0][0][119894] = 119888119903119910[0][0][119894] minus 119886119898 lowast 119888119905119910[0][0][119894]119888119903119911[0][0][119894] = 119888119903119911[0][0][119894] minus 119886119898 lowast 119888119905119911[0][0][119894]119903119890119886119909 = 119886119887119904(119888119903119909[0][0][119894])119903119890119886119910 = 119886119887119904(119888119903119910[0][0][119894])119903119890119886119911 = 119886119887119904(119888119903119911[0][0][119894])119903119899119900119903119898 = 119903119899119900119903119898 + 119903119890119886119909 lowast 119903119890119886119909 + 119903119890119886119910 lowast 119903119890119886119910 + 119903119890119886119911 lowast 119903119890119886119911

Algorithm 1

Using the convolution theorem and FFT method we canobtain the discrete form of the integral equation (6) with FFTmethod [18 19]

119869119863

120585(119898 119899 119896)

+1

4120587120594 (119898 119899 119896)F

minus1

sum

120585=119909119910119911

119866119890

120585120577(119894 119895 119897) 119869

119863119890

120585(119894 119895 119897)

= 120594 (119898 119899 119896) 119864inc120585

(119898 119899 119896)

(12)

where 119866119890

120585120577(119894 119895 119897) 119869

119863119890

120585(119894 119895 119897) are the discrete Fourier transform

(DFT) of 119866119890

120585120577(119894 119895 119897) 119869

119863119890

120585120577(119894 119895 119897) respectively Similarly the

corresponding adjoint operations can also be performedusing FFT As a consequence we can solve (12) rapidlythrough the CG-FFT algorithm [18] In order to speed upthe FFT calculation the parallel FFT is used to obtain theFFT and inverse FFT results In the proposed algorithmboth the FFT transform and the inverse FFT transform areimplemented using the FFTW library which is a 119862 subrou-tine library for computing the discrete Fourier transformin one or more dimensions and supports the distributed-memory implementation based on message passing interface(MPI) For example to calculate the vector-vector productin the CG-FFT method which can be parallelized by callMPI Allreduce() as shown in Algorithm 1

3 Numerical Results

To illustrate the accuracy and efficiency of the proposed par-allel CG-FFT algorithm we first consider the EM scatteringby a dielectric sphere illuminated by plane waves whichhas a closed-form solution In the following examples thebackground is just free space The dielectric sphere with 120576

119903=

4 119903 = 02m is illuminated by a plane wave The incidentwave is polarized in the 119909 direction and propagating in the

119911 direction in which the operating frequency is 03 GHzThecomparison of numerical results of the internal electric fieldsbetween parallel CG-FFT and analytical results is illustratedin Figure 1 which shows that the numerical results have goodagreement with the analytical resultsWe have also computedthe scattered electric fields from the dielectric object on theobservation plane 119911 = 10m and compared such results withthe exact solutions as shown in Figure 2

Then we do the parallel performance testing on a HPCwhich has 27 nodes shown in Table 1 in which nodes areconnected by 10Gbps Ethernet The benchmark model is ahomogenous cubic dielectric object with 120576

119903= 4 and the edge

of cubic is 04mThe incident wave is the same plane wave asthat in Figure 1 We compare the network latency inside nodeand internode which means that we test the network latencyon one node and between two nodes respectively Figure 3shows the testing results for internode and inside node FromFigure 3 we can see that the speed of network inside nodeis about 4 times of the inter node which will be a bottleneckfor the parallel CG-FFTmethod To evaluate the performanceof the parallel CG-FFT code we define the performance asfollows

Performance (cellss)

=119873119909

times 119873119910

times 119873119911

times Number of iteratioinsSimulation time (s)

(13)

The parallel CG-FFTmethods performance testing resultis demonstrated in Figure 4 and the detail data is listed inTable 2 From Figure 4 we can obtain that the performancegoes up when no more than 8 nodes are used and theperformance goes down using 10 nodesThe reason is that thenetwork latency plays an important role when we use morethan 8 nodes The parallel efficiency is also tested which isdefined as

Parallel Efficiency =1198791

119875 lowast 119879119901

(14)

International Journal of Antennas and Propagation 7

where 119875 is the number of processes 1198791is the running time

used by one process and 119879119901is running time used by 119875

processes Figure 5 shows the parallel efficiency of parallelCG-FFT with different discretization and processes and thedetail results are listed in Table 3 From Figure 5 and Table 3we can see that the parallel efficiency is above 60 when nomore than 8 nodes are used

Finally we use the proposed parallel CG-FFT methodto simulate EM scattering problem from 3D anatomicallyrealistic human head model exposed to the plane waveworking at 900Mhz The popular HUGO model [20] with aresolution of 1 times 1 times 1mm as shown in Figure 6 includes 16different tissues and organs The electromagnetic properties(120576119903and 120590) of 16 tissues in the model can be obtained from

FCC published data [21] as listed in Table 4 In our simu-lation 4 nodes are used and the computation time is about65 minutes Figure 7 shows the electric field on head surfaceWith the object oriented HUGOmodel the field distributionover a specific object can be investigated The electric fielddistributions on eyes bone and brain are demonstrated inFigures 8 9 and 10 respectively

4 Conclusion

In this paper we have analyzed the performance of an efficientMPI parallel implementation of the CG-FFT algorithm onHPC computers In the proposed method the codes canrun not only on share memory systems machine but alsoon distributed ones which present high scalability behaviorSpecial attention was paid to communications during thematrix-vector product and vector-vector product which area key point for the parallel performanceWe solved a problemwith more than 400 million unknowns on a HPC including27 nodes

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National ScienceFoundation of China under Grant no 61372057 in part byNatural Science Foundation of the Jiangsu Higher EducationInstitutions under Grant no 10KJD180004 and in part byPostgraduate Innovation Project of Jiangsu Province underGrant no CXZZ13 0973

References

[1] R F Harrington Field Computation by Moment MethodsMacMillan New York NY USA 1968

[2] P M Goggans A A Kishk and A W Glisson ldquoElec-tromagnetic scattering from objects composed of multiplehomogeneous regions using a region-by-region solutionrdquo IEEETransactions on Antennas and Propagation vol 42 no 6 pp865ndash871 1994

[3] R D Graglia P L E Uslenghi andR S Zich ldquoMomentmethodwith isoparametric elements for three-dimensional anisotropicscatterersrdquo Proceedings of the IEEE vol 77 no 5 pp 750ndash7601989

[4] J M Jarem ldquoMethod-of-moments solution of a parallel-platewaveguide aperture systemrdquo Journal of Applied Physics vol 59no 10 pp 3566ndash3570 1986

[5] D E Livesay and K Chen ldquoElectromagnetic field inducedinside arbitrarily shaped biological bodiesrdquo IEEE Transactionson Microwave Theory and Techniques vol 22 no 12 pp 1273ndash1280 1974

[6] T K Sarkar and E Arvas ldquoAn integral equation approachto the analysis of finite microstrip antennas volumesurfaceformulationrdquo IEEE Transactions on Antennas and Propagationvol 38 no 3 pp 305ndash312 1990

[7] H Gan and W C Chew ldquoA discrete BCG-FFT algorithmfor solving 3D inhomogeneous scatterer problemsrdquo Journal ofElectromagnetic Waves and Applications vol 9 no 10 pp 1339ndash1357 1995

[8] T J Cui ldquoFast algorithm for electromagnetic scattering byburied 3-D dielectric objects of large sizerdquo IEEE Transactionson Geoscience and Remote Sensing vol 37 no 5 pp 2597ndash26081999

[9] L Zhao and T J Cui ldquoCG-FFT algorithm for EM scattering bysmall dielectric particles with high permittivity and permeabil-ityrdquoMicrowave and Optical Technology Letters vol 49 no 2 pp305ndash310 2007

[10] L Zhao T J Cui and W D Li ldquoAn efficient algorithmfor em scattering by electrically large dielectric objects usingMR-QEB iterative scheme and CG-FFT methodrdquo Progress inElectromagnetics Research vol 67 pp 341ndash355 2007

[11] W Yu R Mittra T Su Y Liu and X Yang Parallel FiniteDifference Time Domain Method Artech House NorwoodMass USA 2006

[12] W Yu X Yang Y Liu et al ldquoNew development of parallelconformal FDTD method in computational electromagneticsengineeringrdquo IEEEAntennas and PropagationMagazine vol 53no 3 pp 15ndash41 2011

[13] J M Taboada M G Araujo F O Basteiro J L Rodriguez andL Landesa ldquoMLFMA-FFT parallel algorithm for the solution ofextremely large problems in electromagneticrdquo Proceedings of theIEEE vol 101 no 2 pp 350ndash363 2013

[14] O P Gandhi G Lazzi and C M Furse ldquoElectromagneticabsorption in the human head and neck for mobile telephonesat 835 and 1900MHzrdquo IEEE Transactions on Microwave Theoryand Techniques vol 44 no 10 pp 1884ndash1897 1996

[15] G Lazzi and O P Gandhi ldquoRealistically tilted and truncatedanatomically based models of the human head for dosimetryof mobile telephonesrdquo IEEE Transactions on ElectromagneticCompatibility vol 39 no 1 pp 55ndash61 1997

[16] A K Lee H D Choi and J I Choi ldquoStudy on SARs inhead models with different shapes by age using SAM modelfor mobile phone exposure at 835MHzrdquo IEEE Transactions onElectromagnetic Compatibility vol 49 no 2 pp 302ndash312 2007

[17] Q-X Li and O P Gandhi ldquoThermal implications of thenew relaxed IEEE RF safety standard for head exposures tocellular telephones at 835 and 1900MHzrdquo IEEE Transactions onMicrowave Theory and Techniques vol 54 no 7 pp 3146ndash31542006

[18] T J Cui and W C Chew ldquoFast algorithm for electromagneticscattering by buried 3-D dielectric objects of large sizerdquo IEEE

8 International Journal of Antennas and Propagation

Transactions on Geoscience and Remote Sensing vol 37 no 5pp 2597ndash2608 1999

[19] J Weaver Applications of Discrete and Continuous FourierAnalysis John Wiley amp Sons New York NY USA 1983

[20] P Bernardi M Cavagnaro S Pisa and E Piuzzi ldquoSpecificabsorption rate and temperature increases in the head of acellular-phone userrdquo IEEE Transactions on Microwave Theoryand Techniques vol 48 no 7 pp 1118ndash1126 2000

[21] httpwwwfccgovfcc-bindielecsh

4 International Journal of Antennas and Propagation

Figure 9: The total electric field distribution on bone (color scale: E field on bone, 0 to 0.304).

Figure 10: The total electric field distribution on brain (color scale: E field on brain, 0 to 0.223).

is the dyadic Green's function in homogeneous space, in which the corresponding elements are given by

$$
\begin{aligned}
G_{\xi\zeta} &= (\xi-\xi')(\zeta-\zeta')\left[(k_b R)^2 + 3i(k_b R) - 3\right], && \xi \neq \zeta,\\
G_{\xi\zeta} &= (\xi-\xi')^2\left[(k_b R)^2 + 3i(k_b R) - 3\right] - R^2\left[(k_b R)^2 + i(k_b R) - 1\right], && \xi = \zeta.
\end{aligned}
\tag{3}
$$
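As an illustration, the bracketed expressions in (3) can be evaluated with a few lines of C. The following is a minimal sketch and not the authors' code: the type `cplx` and the function names `green_offdiag` and `green_diag` are hypothetical, the caller is assumed to supply the distance $R$ directly, and any common prefactor defined together with the dyadic Green's function is deliberately left out.

```c
#include <assert.h>

/* Hypothetical helper type: a complex number stored as (re, im). */
typedef struct { double re; double im; } cplx;

/* Off-diagonal element of (3), xi != zeta:
   (xi - xi')(zeta - zeta') [ (k_b R)^2 + 3i (k_b R) - 3 ].
   d_xi and d_zeta are the coordinate differences, r is the distance R. */
cplx green_offdiag(double d_xi, double d_zeta, double kb, double r)
{
    assert(r > 0.0);
    double kr = kb * r;
    cplx g;
    g.re = d_xi * d_zeta * (kr * kr - 3.0);
    g.im = d_xi * d_zeta * (3.0 * kr);
    return g;
}

/* Diagonal element of (3), xi == zeta:
   (xi - xi')^2 [ (k_b R)^2 + 3i (k_b R) - 3 ]
     - R^2 [ (k_b R)^2 + i (k_b R) - 1 ]. */
cplx green_diag(double d_xi, double kb, double r)
{
    assert(r > 0.0);
    double kr = kb * r;
    cplx g;
    g.re = d_xi * d_xi * (kr * kr - 3.0) - r * r * (kr * kr - 1.0);
    g.im = d_xi * d_xi * (3.0 * kr) - r * r * kr;
    return g;
}
```

Storing the real and imaginary parts separately keeps the sketch free of library dependencies; a production code would more likely use a native complex type.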

The equivalent version for the induced current $J$ can be approximately obtained by

$$
J(r) + \frac{1}{4\pi}\chi(r)\int_V G(r, r')\cdot J(r')\,dr' = J^{\mathrm{inc}}(r), \tag{4}
$$

where $\chi(r) = (\varepsilon(r)/\varepsilon_b) - 1$ and

$$
J(r) = \chi(r)E(r), \qquad J^{\mathrm{inc}}(r) = \chi(r)E^{\mathrm{inc}}(r) \tag{5}
$$

are the normalized electric current inside the dielectric object and the equivalent incident current, respectively.

A box of size $L_x \times L_y \times L_z$ is used to bound the considered dielectric target and is discretized into $N_x \times N_y \times N_z$ cuboidal cells. The volume of each cell is then $\Delta V = \Delta x\,\Delta y\,\Delta z$, where $\Delta\xi = L_\xi/N_\xi$ ($\xi = x, y, z$) and $N_\xi$ is the number of divisions in the $\xi$-direction. Choosing pulse functions as the basis and testing functions, we obtain the discrete form of (4) as

$$
J^{D}_{\xi}(m,n,k) + \frac{1}{4\pi}\chi(m,n,k)\sum_{\zeta=x,y,z}\sum_{m'=0}^{N_x-1}\sum_{n'=0}^{N_y-1}\sum_{k'=0}^{N_z-1} G^{D}_{\xi\zeta}(m-m', n-n', k-k')\,J^{D}_{\zeta}(m',n',k') = J^{iD}_{\xi}(m,n,k), \tag{6}
$$

in which

$$
\begin{aligned}
G^{D}_{\xi\zeta}(m-m', n-n', k-k') &= \Delta V \int_{m'\Delta x}^{(m'+1)\Delta x}\int_{n'\Delta y}^{(n'+1)\Delta y}\int_{k'\Delta z}^{(k'+1)\Delta z} G_{\xi\zeta}\!\left(\left(m+\tfrac{1}{2}\right)\Delta x - x',\ \left(n+\tfrac{1}{2}\right)\Delta y - y',\ \left(k+\tfrac{1}{2}\right)\Delta z - z'\right) dx'\,dy'\,dz',\\
J^{iD}_{\xi}(m,n,k) &= \int_{m\Delta x}^{(m+1)\Delta x}\int_{n\Delta y}^{(n+1)\Delta y}\int_{k\Delta z}^{(k+1)\Delta z} J^{\mathrm{inc}}_{\xi}(x,y,z)\,dx\,dy\,dz.
\end{aligned}
\tag{7}
$$

We remark that the formulations (6)-(7) actually describe scattering by small particles of size $\Delta V$, because pulse basis functions are used, although the dielectric target may be continuous. We can convert (6) into a linear system of equations

$$
Z \cdot I = V. \tag{8}
$$


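Table 1: The HPC hardware information.

| Item | Value |
|---|---|
| CPU type | Intel Xeon E5520 |
| Clock speed | 2.67 GHz |
| Number of nodes | 27 |
| Available memory | 30 × 12 GB (DDR3, 1067 MHz) |
| Operating system | CentOS (Linux) |
| Network system | BNT 10 Gbps Ethernet |

Table 2: Performance test.

| Cells | Processes | Number of iterations | Time (s) |
|---|---|---|---|
| 64 × 64 × 64 | 1 | 30 | 22.8 |
| 128 × 128 × 128 | 2 | 30 | 90.4 |
| 256 × 256 × 128 | 4 | 30 | 258.9 |
| 256 × 256 × 256 | 6 | 30 | 367.9 |
| 256 × 256 × 512 | 8 | 30 | 612.3 |
| 512 × 512 × 512 | 10 | 30 | 3289.7 |

Table 3: Parallel efficiency.

| Cells | Processes | Time (s) | Parallel efficiency |
|---|---|---|---|
| 64 × 64 × 64 | 1 | 22.8 | 1 |
| 64 × 64 × 64 | 2 | 12.38 | 0.92 |
| 64 × 64 × 64 | 4 | 7.2 | 0.80 |
| 128 × 128 × 128 | 1 | 168.1 | 1 |
| 128 × 128 × 128 | 2 | 90.4 | 0.93 |
| 128 × 128 × 128 | 4 | 50.6 | 0.83 |
| 128 × 128 × 128 | 8 | 31.8 | 0.66 |
| 256 × 256 × 512 | 1 | 2988.3 | 1 |
| 256 × 256 × 512 | 2 | 1641.8 | 0.91 |
| 256 × 256 × 512 | 4 | 940.1 | 0.79 |
| 256 × 256 × 512 | 8 | 612.3 | 0.61 |
| 512 × 512 × 512 | 1 | 19409.2 | 1 |
| 512 × 512 × 512 | 2 | 10904.6 | 0.89 |
| 512 × 512 × 512 | 4 | 6469.7 | 0.75 |
| 512 × 512 × 512 | 10 | 3289.7 | 0.59 |
| 512 × 512 × 512 | 12 | 3171.4 | 0.51 |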

In (8), $Z$ is an $N \times N$ system matrix, $I$ is a column vector containing the coefficients of the unknown currents, and $V$ is a column vector associated with the incident fields in the dielectric object. Here $N = 3N_xN_yN_z$ is the total number of unknowns. However, the inner products in (6) are all 3D summations of products of discrete Green's functions and discrete electric currents, which are quite time and memory consuming. For electrically large problems, $N$ is very large and it is very difficult to solve (8) directly. To compute the products of Green's functions and electric currents quickly, the discrete Green's functions are extended to a larger computational domain as

$$
G^{e}_{\xi\zeta}(m,n,k) = \pm G^{D}_{\xi\zeta}(m_0,n_0,k_0). \tag{9}
$$

Table 4: Tissue parameters for the HUGO model.

| Tissue type | Relative permittivity | Relative permeability | Conductivity (S/m) |
|---|---|---|---|
| Skin | 41.405334 | 1 | 0.86678 |
| Fat | 11.333888 | 1 | 0.109162 |
| Muscle | 56.879063 | 1 | 0.995364 |
| Cartilage | 42.653103 | 1 | 0.782333 |
| Cerebrospinal fluid | 68.638336 | 1 | 2.412575 |
| Sclera | 55.27013 | 1 | 1.166726 |
| Vitreous | 68.90184 | 1 | 1.636162 |
| Lens nucleus | 35.841595 | 1 | 0.484917 |
| Grey matter | 52.724701 | 1 | 0.942193 |
| White matter | 38.886288 | 1 | 0.590815 |
| Nerve | 32.530067 | 1 | 0.573612 |
| Thyroid | 59.683323 | 1 | 1.038448 |
| Tongue | 55.27013 | 1 | 0.936192 |
| Bone | 20.787804 | 1 | 0.339975 |
| Blood | 61.360718 | 1 | 1.538069 |
| Air | 1 | 1 | 0 |

In (9), the extended indices satisfy $0 \le m \le 2N_x - 1$, $0 \le n \le 2N_y - 1$, and $0 \le k \le 2N_z - 1$, and the folded indices are

$$
\begin{aligned}
m_0 &= \begin{cases} m, & 0 \le m \le N_x - 1,\\ 2N_x - m, & N_x \le m \le 2N_x - 1, \end{cases}\\
n_0 &= \begin{cases} n, & 0 \le n \le N_y - 1,\\ 2N_y - n, & N_y \le n \le 2N_y - 1, \end{cases}\\
k_0 &= \begin{cases} k, & 0 \le k \le N_z - 1,\\ 2N_z - k, & N_z \le k \le 2N_z - 1. \end{cases}
\end{aligned}
\tag{10}
$$
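The index mapping in (10) is the same along each axis, so it can be written as one small function. The sketch below is only an illustration; the name `fold_index` is hypothetical and does not come from the paper.

```c
#include <assert.h>

/* Folded index of (10): maps an extended-domain index m in [0, 2N-1]
   to the index m0 at which the original discrete Green's function
   is sampled. The same rule applies to n and k. */
int fold_index(int m, int n)
{
    assert(n > 0 && m >= 0 && m <= 2 * n - 1);
    return (m <= n - 1) ? m : 2 * n - m;
}
```

For example, with $N = 4$ the extended indices 5 and 7 fold back to 3 and 1, which mirrors the first half of the domain onto the second half.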

The signs of the extended discrete Green's functions are directly related to the even or odd nature of the components with respect to the coordinates in the different extended subdomains. After defining the extended Green's functions, the equivalent electric current $J^{D}_{\xi}(m,n,k)$ can be defined in the extended domain by zero padding as

$$
J^{De}_{\xi}(m,n,k) = \begin{cases} J^{D}_{\xi}(m,n,k), & 0 \le m \le N_x - 1,\ 0 \le n \le N_y - 1,\ 0 \le k \le N_z - 1,\\ 0, & \text{else}. \end{cases}
\tag{11}
$$
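The zero padding in (11) amounts to copying the original samples into one corner of an array that is twice as large along each axis. The following C sketch illustrates this for a single, real-valued current component; the function names, the row-major storage layout, and the use of `double` instead of a complex type are all assumptions made for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major offset into a 3D array whose last two dimensions are ny, nz. */
static size_t offset3(int m, int n, int k, int ny, int nz)
{
    return ((size_t)m * (size_t)ny + (size_t)n) * (size_t)nz + (size_t)k;
}

/* Zero padding of (11): copy the nx*ny*nz samples of one current
   component into the corner of a (2nx)*(2ny)*(2nz) array and set
   every other entry to zero. */
void zero_pad_3d(const double *j, double *je, int nx, int ny, int nz)
{
    assert(j != NULL && je != NULL && nx > 0 && ny > 0 && nz > 0);
    for (int m = 0; m < 2 * nx; m++)
        for (int n = 0; n < 2 * ny; n++)
            for (int k = 0; k < 2 * nz; k++) {
                int inside = (m < nx) && (n < ny) && (k < nz);
                je[offset3(m, n, k, 2 * ny, 2 * nz)] =
                    inside ? j[offset3(m, n, k, ny, nz)] : 0.0;
            }
}
```

The extended array holds eight times as many samples as the original one, which is the memory price paid for turning the linear convolution into a circular one.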


    /* Vector-vector product on each process */
    for (int i = 0; i < all; i++)
    {
        /* update the current estimate along the search direction */
        cjx[0][0][i] = cjx[0][0][i] + am * cpx[0][0][i];
        cjy[0][0][i] = cjy[0][0][i] + am * cpy[0][0][i];
        cjz[0][0][i] = cjz[0][0][i] + am * cpz[0][0][i];
        /* update the residual */
        crx[0][0][i] = crx[0][0][i] - am * ctx[0][0][i];
        cry[0][0][i] = cry[0][0][i] - am * cty[0][0][i];
        crz[0][0][i] = crz[0][0][i] - am * ctz[0][0][i];
        /* accumulate the local squared residual norm */
        reax = abs(crx[0][0][i]);
        reay = abs(cry[0][0][i]);
        reaz = abs(crz[0][0][i]);
        rnorm = rnorm + reax * reax + reay * reay + reaz * reaz;
    }
    /* Sum norm of each process */
    MPI_Allreduce(&rnorm, &rnorm1, 1, MPI_FLOAT, MPI_SUM, world);

Algorithm 1

Using the convolution theorem and the FFT, we can obtain the discrete form of the integral equation (6) [18, 19]:

$$
J^{D}_{\xi}(m,n,k) + \frac{1}{4\pi}\chi(m,n,k)\,\mathcal{F}^{-1}\left\{\sum_{\zeta=x,y,z}\tilde{G}^{e}_{\xi\zeta}(i,j,l)\,\tilde{J}^{De}_{\zeta}(i,j,l)\right\} = \chi(m,n,k)\,E^{\mathrm{inc}}_{\xi}(m,n,k), \tag{12}
$$

where $\tilde{G}^{e}_{\xi\zeta}(i,j,l)$ and $\tilde{J}^{De}_{\zeta}(i,j,l)$ are the discrete Fourier transforms (DFTs) of $G^{e}_{\xi\zeta}(m,n,k)$ and $J^{De}_{\zeta}(m,n,k)$, respectively. Similarly, the corresponding adjoint operations can also be performed using the FFT. As a consequence, we can solve (12) rapidly with the CG-FFT algorithm [18]. To speed up the FFT calculation, a parallel FFT is used to obtain the forward and inverse FFT results. In the proposed algorithm, both the forward and the inverse FFT are implemented using the FFTW library, which is a C subroutine library for computing the discrete Fourier transform in one or more dimensions and supports a distributed-memory implementation based on the message passing interface (MPI). For example, the vector-vector product in the CG-FFT method can be parallelized by calling MPI_Allreduce(), as shown in Algorithm 1.
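To see why the extension and zero padding make the FFT applicable, it may help to check a 1D analogue: the circular convolution of the mirrored kernel with the zero-padded current reproduces the direct sum of (6) on the original cells. The C sketch below makes this comparison with explicit loops; it is an illustration under simplifying assumptions (real-valued data, an even kernel with sign $+$, hypothetical function names, at most 64 cells), and it evaluates the circular convolution directly rather than through FFTs, which is the step the convolution theorem would accelerate.

```c
#include <assert.h>

/* Direct evaluation of a 1D analogue of the sum in (6):
   y[m] = sum_{m'} g[|m - m'|] * j[m'], with an even kernel g of length n. */
void direct_sum_1d(const double *g, const double *j, double *y, int n)
{
    for (int m = 0; m < n; m++) {
        y[m] = 0.0;
        for (int mp = 0; mp < n; mp++) {
            int d = (m >= mp) ? (m - mp) : (mp - m);
            y[m] += g[d] * j[mp];
        }
    }
}

/* Same result via the extended domain of length 2n: the kernel is
   mirrored as in (9)-(10), the current is zero padded as in (11),
   and the two are combined by a circular convolution. */
void circular_sum_1d(const double *g, const double *j, double *y, int n)
{
    assert(n > 0 && n <= 64);
    double ge[128], je[128];
    int len = 2 * n;
    for (int p = 0; p < len; p++) {
        if (p <= n - 1)
            ge[p] = g[p];
        else if (p == n)
            ge[p] = 0.0;          /* only ever multiplies zero-padded entries */
        else
            ge[p] = g[len - p];   /* mirrored half */
        je[p] = (p < n) ? j[p] : 0.0;
    }
    for (int m = 0; m < n; m++) {
        y[m] = 0.0;
        for (int p = 0; p < len; p++)
            y[m] += ge[(m - p + len) % len] * je[p];
    }
}
```

Only the first $n$ outputs of the circular convolution are kept; the remaining entries of the extended domain carry no physical meaning and are discarded.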

3. Numerical Results

To illustrate the accuracy and efficiency of the proposed parallel CG-FFT algorithm, we first consider EM scattering by a dielectric sphere illuminated by a plane wave, which has a closed-form solution. In the following examples, the background is free space. The dielectric sphere, with $\varepsilon_r = 4$ and radius $r = 0.2$ m, is illuminated by a plane wave that is polarized in the $x$ direction and propagates in the $z$ direction at an operating frequency of 0.3 GHz. Figure 1 compares the internal electric fields computed by the parallel CG-FFT method with the analytical results and shows good agreement between them. We have also computed the scattered electric fields from the dielectric object on the observation plane $z = 10$ m and compared them with the exact solutions, as shown in Figure 2.

We then test the parallel performance on an HPC cluster with 27 nodes, described in Table 1, in which the nodes are connected by 10 Gbps Ethernet. The benchmark model is a homogeneous dielectric cube with $\varepsilon_r = 4$ and an edge length of 0.4 m. The incident wave is the same plane wave as that in Figure 1. We compare the network latency within a node and between nodes; that is, we test the latency on one node and between two nodes, respectively. Figure 3 shows the test results for both cases. From Figure 3, we can see that communication within a node is about 4 times faster than communication between nodes, so internode communication will be a bottleneck for the parallel CG-FFT method. To evaluate the performance of the parallel CG-FFT code, we define the performance as follows:

$$
\text{Performance (cells/s)} = \frac{N_x \times N_y \times N_z \times \text{Number of iterations}}{\text{Simulation time (s)}}. \tag{13}
$$
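The metric in (13) is simple arithmetic, and a short C helper shows how it could be evaluated. This is only a sketch; the function name `performance_cells_per_s` is hypothetical, and the run time used in the example that follows is an assumed value rather than a measurement.

```c
#include <assert.h>

/* Performance metric of (13) in cells per second. */
double performance_cells_per_s(int nx, int ny, int nz,
                               int iterations, double time_s)
{
    assert(time_s > 0.0);
    double cells = (double)nx * (double)ny * (double)nz;
    return cells * (double)iterations / time_s;
}
```

For instance, a 64 × 64 × 64 grid that completes 30 iterations in an assumed 20 s would score 393216 cells/s.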

The performance test results of the parallel CG-FFT method are shown in Figure 4, and the detailed data are listed in Table 2. From Figure 4, we can see that the performance increases when no more than 8 nodes are used and decreases when 10 nodes are used. The reason is that network latency plays an important role when more than 8 nodes are used. The parallel efficiency is also tested, which is defined as

$$
\text{Parallel Efficiency} = \frac{T_1}{P \cdot T_p}, \tag{14}
$$

where $P$ is the number of processes, $T_1$ is the running time used by one process, and $T_p$ is the running time used by $P$ processes. Figure 5 shows the parallel efficiency of the parallel CG-FFT method for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3, we can see that the parallel efficiency is above 60% when no more than 8 nodes are used.
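The efficiency in (14) can be computed in the same way. The helper below is a minimal sketch with a hypothetical name, and the timings in the example that follows are made-up values chosen for easy arithmetic.

```c
#include <assert.h>

/* Parallel efficiency of (14): T_1 / (P * T_p). */
double parallel_efficiency(double t1, int p, double tp)
{
    assert(p > 0 && tp > 0.0);
    return t1 / ((double)p * tp);
}
```

With an assumed single-process time of 100 s and a 4-process time of 40 s, the efficiency is 100 / (4 × 40) = 0.625; a value of 1 would mean perfect scaling.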

Finally, we use the proposed parallel CG-FFT method to simulate EM scattering from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model [20], with a resolution of 1 × 1 × 1 mm, as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model can be obtained from FCC published data [21], as listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. With the object-oriented HUGO model, the field distribution over a specific object can be investigated. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.
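To connect the tabulated tissue data with the contrast $\chi = \varepsilon/\varepsilon_b - 1$ used in the formulation, one possible conversion is sketched below in C. It rests on an assumption that the paper does not state: the conductivity is taken to enter through a complex relative permittivity $\varepsilon_r + i\sigma/(\omega\varepsilon_0)$, which corresponds to an $e^{-i\omega t}$ time convention, and the background is free space. The type `cplx` and the function name `tissue_contrast` are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper type: a complex number stored as (re, im). */
typedef struct { double re; double im; } cplx;

/* Contrast chi = eps/eps_b - 1 for a lossy tissue in free space.
   ASSUMPTION (not stated in the paper): the conductivity enters as
   eps_r + i*sigma/(omega*eps0), i.e. an exp(-i*omega*t) convention. */
cplx tissue_contrast(double eps_r, double sigma, double freq_hz)
{
    const double pi = 3.14159265358979323846;
    const double eps0 = 8.8541878128e-12;   /* vacuum permittivity, F/m */
    assert(freq_hz > 0.0);
    double omega = 2.0 * pi * freq_hz;
    cplx chi;
    chi.re = eps_r - 1.0;
    chi.im = sigma / (omega * eps0);
    return chi;
}
```

For skin-like values of roughly $\varepsilon_r = 41.4$ and $\sigma = 0.87$ S/m at 900 MHz, this gives a contrast with a real part near 40.4 and an imaginary part between 17 and 18, so the loss term is far from negligible. With the opposite time convention, the sign of the imaginary part would flip.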

4. Conclusion

In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC computers. In the proposed method, the code can run not only on shared-memory machines but also on distributed-memory ones, and it exhibits high scalability. Special attention was paid to communication during the matrix-vector and vector-vector products, which is a key point for parallel performance. We solved a problem with more than 400 million unknowns on an HPC cluster with 27 nodes.
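The quoted problem size can be checked against the relation $N = 3N_xN_yN_z$. The C sketch below assumes that the figure refers to the largest grid reported in the performance tests, 512 × 512 × 512, which the text does not state explicitly; the function name `unknown_count` is hypothetical.

```c
#include <assert.h>

/* Number of unknowns N = 3 * Nx * Ny * Nz
   (three current components per cell). */
unsigned long long unknown_count(unsigned long long nx,
                                 unsigned long long ny,
                                 unsigned long long nz)
{
    assert(nx > 0 && ny > 0 && nz > 0);
    return 3ULL * nx * ny * nz;
}
```

For a 512 × 512 × 512 grid this gives 402653184 unknowns, consistent with "more than 400 million". A 64-bit integer type is used because the product exceeds the range of a 32-bit signed integer already for slightly larger grids.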

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by the Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13_0973.

References

[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.

[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, “Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution,” IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.

[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, “Moment method with isoparametric elements for three-dimensional anisotropic scatterers,” Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.

[4] J. M. Jarem, “Method-of-moments solution of a parallel-plate waveguide aperture system,” Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.

[5] D. E. Livesay and K. Chen, “Electromagnetic field induced inside arbitrarily shaped biological bodies,” IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.

[6] T. K. Sarkar and E. Arvas, “An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation,” IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.

[7] H. Gan and W. C. Chew, “A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems,” Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.

[8] T. J. Cui, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[9] L. Zhao and T. J. Cui, “CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability,” Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.

[10] L. Zhao, T. J. Cui, and W. D. Li, “An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method,” Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.

[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.

[12] W. Yu, X. Yang, Y. Liu, et al., “New development of parallel conformal FDTD method in computational electromagnetics engineering,” IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.

[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, “MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetic,” Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.

[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, “Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.

[15] G. Lazzi and O. P. Gandhi, “Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones,” IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.

[16] A. K. Lee, H. D. Choi, and J. I. Choi, “Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz,” IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.

[17] Q.-X. Li and O. P. Gandhi, “Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.

[18] T. J. Cui and W. C. Chew, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.

[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, “Specific absorption rate and temperature increases in the head of a cellular-phone user,” IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.

[21] http://www.fcc.gov/fcc-bin/dielec.sh


Table 1: The HPC hardware information.

CPU type            Intel Xeon E5520
Clock speed         2.67 GHz
Number of nodes     27
Available memory    30 × 12 GB (DDR3, 1067 MHz)
Operating system    CentOS (Linux)
Network system      BNT 10 Gbps Ethernet

Table 2: Performance test.

Cells               Processes   Number of iterations   Time (s)
64 × 64 × 64        1           30                     22.8
128 × 128 × 128     2           30                     90.4
256 × 256 × 128     4           30                     258.9
256 × 256 × 256     6           30                     367.9
256 × 256 × 512     8           30                     612.3
512 × 512 × 512     10          30                     3289.7

Table 3: Parallel efficiency.

Cells               Processes   Time (s)    Parallel efficiency
64 × 64 × 64        1           22.8        1
                    2           12.38       0.92
                    4           7.2         0.80
128 × 128 × 128     1           168.1       1
                    2           90.4        0.93
                    4           50.6        0.83
                    8           31.8        0.66
256 × 256 × 512     1           2988.3      1
                    2           1641.8      0.91
                    4           940.1       0.79
                    8           612.3       0.61
512 × 512 × 512     1           19409.2     1
                    2           10904.6     0.89
                    4           6469.7      0.75
                    10          3289.7      0.59
                    12          3171.4      0.51

where $Z$ is an $N \times N$ system matrix, $I$ is a column vector with the coefficients of the unknown currents, and $V$ is a column vector associated with the incident fields in the dielectric object. Here, $N = 3N_xN_yN_z$ is the total number of unknowns. However, the inner products in (6) are all 3D summations of the products of discrete Green's functions and discrete electric currents, which are quite time and memory consuming. For electrically large problems, $N$ is very large and it is very difficult to solve (8) directly. To compute the products of Green's functions and electric currents quickly, the discrete Green's functions are extended to a larger computational domain as

$$
G^{e}_{\xi\zeta}(m,n,k) = \pm G^{D}_{\xi\zeta}(m_0,n_0,k_0), \tag{9}
$$

Table 4: Tissue parameters for HUGO model.

Tissue type           Relative permittivity   Relative permeability   Conductivity (S/m)
Skin                  41.405334               1                       0.86678
Fat                   11.333888               1                       0.109162
Muscle                56.879063               1                       0.995364
Cartilage             42.653103               1                       0.782333
Cerebrospinal fluid   68.638336               1                       2.412575
Sclera                55.27013                1                       1.166726
Vitreous              68.90184                1                       1.636162
Lens nucleus          35.841595               1                       0.484917
Grey matter           52.724701               1                       0.942193
White matter          38.886288               1                       0.590815
Nerve                 32.530067               1                       0.573612
Thyroid               59.683323               1                       1.038448
Tongue                55.27013                1                       0.936192
Bone                  20.787804               1                       0.339975
Blood                 61.360718               1                       1.538069
Air                   1                       1                       0

where $0 \le m \le 2N_x - 1$, $0 \le n \le 2N_y - 1$, $0 \le k \le 2N_z - 1$, and

$$
\begin{aligned}
m_0 &= \begin{cases} m, & 0 \le m \le N_x - 1, \\ 2N_x - m, & N_x \le m \le 2N_x - 1, \end{cases} \\
n_0 &= \begin{cases} n, & 0 \le n \le N_y - 1, \\ 2N_y - n, & N_y \le n \le 2N_y - 1, \end{cases} \\
k_0 &= \begin{cases} k, & 0 \le k \le N_z - 1, \\ 2N_z - k, & N_z \le k \le 2N_z - 1. \end{cases}
\end{aligned} \tag{10}
$$
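As a minimal sketch of the index folding in (10), the helper below maps an extended-domain index to the index at which the original discrete Green's function is sampled. The name fold_index is hypothetical and does not come from the paper, and the sign choice in (9), which depends on the parity of each component, is deliberately left out.

```c
#include <stddef.h>

/* Fold an extended-domain index m (0 <= m <= 2n - 1) back to the
 * original-domain index m0 of (10). Hypothetical helper name. */
size_t fold_index(size_t m, size_t n)
{
    return (m < n) ? m : 2 * n - m;
}
```

For $N_x = 4$, the extended indices 0 to 7 map to 0, 1, 2, 3, 4, 3, 2, 1. The same helper applies unchanged to $n_0$ and $k_0$.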

The signs of the extended discrete Green's functions are directly related to the even or odd nature of the components with respect to the coordinates in the different extended subdomains. After defining the extended Green's functions, the equivalent electric current $J^{D}_{\xi}(m,n,k)$ can be defined in the extended domain by zero padding as

$$
J^{De}_{\xi}(m,n,k) = \begin{cases} J^{D}_{\xi}(m,n,k), & 0 \le m \le N_x - 1,\ 0 \le n \le N_y - 1,\ 0 \le k \le N_z - 1, \\ 0, & \text{else}. \end{cases} \tag{11}
$$
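The zero padding in (11) can be sketched as follows for one current component. This is an illustration only: the function name zero_pad, the row-major layout with $k$ varying fastest, and the single-precision complex type are assumptions, not details taken from the paper's code.

```c
#include <complex.h>
#include <stddef.h>
#include <stdlib.h>

/* Copy an nx*ny*nz current component (row-major, k fastest; assumed layout)
 * into a zero-filled (2nx)*(2ny)*(2nz) array, as in (11).
 * Returns NULL on allocation failure; the caller frees the result. */
float complex *zero_pad(const float complex *j, size_t nx, size_t ny, size_t nz)
{
    const size_t ey = 2 * ny, ez = 2 * nz;
    /* calloc gives all-zero bytes, which is 0.0f on IEEE 754 platforms. */
    float complex *je = calloc(2 * nx * ey * ez, sizeof *je);
    if (je == NULL)
        return NULL;
    for (size_t m = 0; m < nx; m++)
        for (size_t n = 0; n < ny; n++)
            for (size_t k = 0; k < nz; k++)
                je[(m * ey + n) * ez + k] = j[(m * ny + n) * nz + k];
    return je;
}
```

Only the first octant of the extended array carries data; the other seven stay zero, which is what turns the circular convolution computed by the FFT into the linear convolution required by the integral equation.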

    /* Vector-vector product on each process */
    for (int i = 0; i < all; i++) {
        /* update the current (solution) vector */
        cjx[0][0][i] = cjx[0][0][i] + am * cpx[0][0][i];
        cjy[0][0][i] = cjy[0][0][i] + am * cpy[0][0][i];
        cjz[0][0][i] = cjz[0][0][i] + am * cpz[0][0][i];
        /* update the residual vector */
        crx[0][0][i] = crx[0][0][i] - am * ctx[0][0][i];
        cry[0][0][i] = cry[0][0][i] - am * cty[0][0][i];
        crz[0][0][i] = crz[0][0][i] - am * ctz[0][0][i];
        /* accumulate the local squared norm of the residual */
        reax = abs(crx[0][0][i]);
        reay = abs(cry[0][0][i]);
        reaz = abs(crz[0][0][i]);
        rnorm = rnorm + reax * reax + reay * reay + reaz * reaz;
    }
    /* Sum norm of each process */
    MPI_Allreduce(&rnorm, &rnorm1, 1, MPI_FLOAT, MPI_SUM, world);

Algorithm 1

Using the convolution theorem, we can obtain the discrete form of the integral equation (6) with the FFT method [18, 19]:

$$
J^{D}_{\xi}(m,n,k) + \frac{1}{4\pi}\,\chi(m,n,k)\,\mathcal{F}^{-1}\Big\{ \sum_{\zeta=x,y,z} \tilde{G}^{e}_{\xi\zeta}(i,j,l)\,\tilde{J}^{De}_{\zeta}(i,j,l) \Big\} = \chi(m,n,k)\,E^{\mathrm{inc}}_{\xi}(m,n,k), \tag{12}
$$

where $\tilde{G}^{e}_{\xi\zeta}(i,j,l)$ and $\tilde{J}^{De}_{\zeta}(i,j,l)$ are the discrete Fourier transforms (DFTs) of $G^{e}_{\xi\zeta}(m,n,k)$ and $J^{De}_{\zeta}(m,n,k)$, respectively. Similarly, the corresponding adjoint operations can also be performed using the FFT. As a consequence, we can solve (12) rapidly through the CG-FFT algorithm [18]. To speed up the FFT calculation, a parallel FFT is used for both the forward and the inverse transforms. In the proposed algorithm, both transforms are implemented using the FFTW library, a C subroutine library for computing the discrete Fourier transform in one or more dimensions, which supports distributed-memory implementations based on the message passing interface (MPI). The remaining vector operations are parallelized as well. For example, the vector-vector product in the CG-FFT method can be parallelized by calling MPI_Allreduce(), as shown in Algorithm 1.

3. Numerical Results

To illustrate the accuracy and efficiency of the proposed parallel CG-FFT algorithm, we first consider the EM scattering by a dielectric sphere illuminated by a plane wave, which has a closed-form solution. In the following examples, the background is free space. The dielectric sphere, with $\varepsilon_r = 4$ and radius $r = 0.2$ m, is illuminated by a plane wave. The incident wave is polarized in the $x$ direction and propagates in the $z$ direction, with an operating frequency of 0.3 GHz. The internal electric fields computed by the parallel CG-FFT method are compared with the analytical results in Figure 1, which shows that the numerical results agree well with the analytical ones. We have also computed the scattered electric fields from the dielectric object on the observation plane $z = 10$ m and compared them with the exact solutions, as shown in Figure 2.

Then we test the parallel performance on an HPC cluster with 27 nodes, described in Table 1, in which the nodes are connected by 10 Gbps Ethernet. The benchmark model is a homogeneous dielectric cube with $\varepsilon_r = 4$ and an edge length of 0.4 m. The incident wave is the same plane wave as that in Figure 1. We compare the network latency within a node and between nodes; that is, we test the network latency on one node and between two nodes, respectively. Figure 3 shows the test results for both cases. From Figure 3, we can see that the network speed inside a node is about 4 times that between nodes, so internode communication will be a bottleneck for the parallel CG-FFT method. To evaluate the performance of the parallel CG-FFT code, we define the performance as follows:

$$
\text{Performance (cells/s)} = \frac{N_x \times N_y \times N_z \times \text{Number of iterations}}{\text{Simulation time (s)}}. \tag{13}
$$
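The metric in (13) is simple enough to express as a one-line helper. The sketch below is illustrative; the function name performance_cells_per_s is hypothetical, and the numbers in the usage note are made up for the example rather than taken from the measurements.

```c
/* Performance metric of (13): processed cells per second.
 * The grid sizes are converted to double first to avoid integer overflow. */
double performance_cells_per_s(long nx, long ny, long nz,
                               long iterations, double seconds)
{
    return (double)nx * (double)ny * (double)nz * (double)iterations / seconds;
}
```

For a hypothetical 100 × 100 × 100 grid that needs 20 s for 30 iterations, the helper returns 1.5 × 10^6 cells/s. Because the metric normalizes by the grid size, it allows runs on different discretizations to be compared on one axis.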

The performance test results of the parallel CG-FFT method are shown in Figure 4, and the detailed data are listed in Table 2. From Figure 4, we can see that the performance goes up when no more than 8 nodes are used and goes down when 10 nodes are used. The reason is that the network latency plays an important role when more than 8 nodes are used. The parallel efficiency is also tested, which is defined as

$$
\text{Parallel Efficiency} = \frac{T_1}{P \cdot T_p}, \tag{14}
$$

where $P$ is the number of processes, $T_1$ is the running time used by one process, and $T_p$ is the running time used by $P$ processes. Figure 5 shows the parallel efficiency of the parallel CG-FFT method for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3, we can see that the parallel efficiency is above 60% when no more than 8 nodes are used.
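The definition in (14) can be written as a small helper, sketched below. The function name parallel_efficiency is hypothetical, and the timings in the usage note are invented for illustration.

```c
/* Parallel efficiency of (14): t1 is the one-process running time,
 * tp the running time with the given number of processes. */
double parallel_efficiency(double t1, int processes, double tp)
{
    return t1 / ((double)processes * tp);
}
```

With a hypothetical one-process time of 100 s and a four-process time of 31.25 s, the efficiency is 0.8. Since (14) is a ratio of two times, it does not depend on the unit in which they are measured: applied to the one-process and ten-process times for the 512 × 512 × 512 grid in Table 3, it gives about 0.59, matching the tabulated value.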

Finally, we use the proposed parallel CG-FFT method to simulate the EM scattering from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model [20], with a resolution of 1 × 1 × 1 mm, as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model can be obtained from FCC published data [21], as listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. With the object-oriented HUGO model, the field distribution over a specific object can be investigated. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.
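The text does not spell out how $\varepsilon_r$ and $\sigma$ from Table 4 enter the formulation. One common choice, shown here only as a sketch, is to combine them into a complex relative permittivity $\varepsilon_r - j\sigma/(\omega\varepsilon_0)$. The $e^{j\omega t}$ time convention and the function name complex_rel_permittivity are assumptions, not taken from the paper.

```c
#include <complex.h>

/* Complex relative permittivity eps_r - j*sigma/(omega*eps0), assuming an
 * exp(+j*omega*t) time convention (an assumption, not stated in the text).
 * sigma is in S/m and freq_hz in Hz. */
double complex complex_rel_permittivity(double eps_r, double sigma,
                                        double freq_hz)
{
    const double eps0 = 8.8541878128e-12;   /* vacuum permittivity, F/m */
    const double two_pi = 6.283185307179586;
    return eps_r - I * (sigma / (two_pi * freq_hz * eps0));
}
```

For the skin entry of Table 4 at 900 MHz, the real part stays 41.405334 and the imaginary part is about -17.3, so conduction loss is far from negligible compared with the real permittivity at this frequency.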

4. Conclusion

In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC computers. In the proposed method, the codes can run not only on shared-memory machines but also on distributed-memory ones, and they show high scalability. Special attention was paid to communications during the matrix-vector product and the vector-vector product, which are key to the parallel performance. We solved a problem with more than 400 million unknowns on an HPC cluster with 27 nodes.
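The unknown count quoted above can be cross-checked against $N = 3N_xN_yN_z$. The sketch below assumes that the figure refers to the 512 × 512 × 512 benchmark grid of Table 3, which the text does not state explicitly; the function name unknown_count is hypothetical.

```c
/* Total number of unknowns N = 3 * Nx * Ny * Nz
 * (three current components per cell). */
unsigned long long unknown_count(unsigned long long nx,
                                 unsigned long long ny,
                                 unsigned long long nz)
{
    return 3ULL * nx * ny * nz;
}
```

For a 512 × 512 × 512 grid this gives 402,653,184 unknowns, consistent with the statement of more than 400 million.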

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by the Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13 0973.

References

[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.

[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, “Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution,” IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.

[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, “Moment method with isoparametric elements for three-dimensional anisotropic scatterers,” Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.

[4] J. M. Jarem, “Method-of-moments solution of a parallel-plate waveguide aperture system,” Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.

[5] D. E. Livesay and K. Chen, “Electromagnetic field induced inside arbitrarily shaped biological bodies,” IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.

[6] T. K. Sarkar and E. Arvas, “An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation,” IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.

[7] H. Gan and W. C. Chew, “A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems,” Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.

[8] T. J. Cui, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[9] L. Zhao and T. J. Cui, “CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability,” Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.

[10] L. Zhao, T. J. Cui, and W. D. Li, “An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method,” Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.

[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite-Difference Time-Domain Method, Artech House, Norwood, Mass, USA, 2006.

[12] W. Yu, X. Yang, Y. Liu, et al., “New development of parallel conformal FDTD method in computational electromagnetics engineering,” IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.

[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, “MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetic,” Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.

[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, “Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.

[15] G. Lazzi and O. P. Gandhi, “Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones,” IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.

[16] A. K. Lee, H. D. Choi, and J. I. Choi, “Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz,” IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.

[17] Q.-X. Li and O. P. Gandhi, “Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.

[18] T. J. Cui and W. C. Chew, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.

[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, “Specific absorption rate and temperature increases in the head of a cellular-phone user,” IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.

[21] http://www.fcc.gov/fcc-bin/dielec.sh


6 International Journal of Antennas and Propagation

lowast119878119906119898 119899119900119903119898 119900119891 119890119886119888ℎ 119875119903119900119888119890119904119904lowast119872119875119868 119860119897119897119903119890119889119906119888119890 (amp119903119899119900119903119898amp1199031198991199001199031198981 1 119872119875119868 119865119871119874119860119879 119872119875119868 119878119880119872 119908119900119903119897119889)lowastV119890119888119905119900119903-V119890119888119905119900119903 119875119903119900119889119906119888119905 119900119899 119890119886119888ℎ 119901119903119900119888119890119904119904lowast

119891119900119903(119894119899119905 119894 = 0 119894 lt 119886119897119897 119894 + +)

119888119895119909[0][0][119894] = 119888119895119909[0][0][119894] + 119886119898 lowast 119888119901119909[0][0][119894]119888119895119910[0][0][119894] = 119888119895119910[0][0][119894] + 119886119898 lowast 119888119901119910[0][0][119894]119888119895119911[0][0][119894] = 119888119895119911[0][0][119894] + 119886119898 lowast 119888119901119911[0][0][119894]119888119903119909[0][0][119894] = 119888119903119909[0][0][119894] minus 119886119898 lowast 119888119905119909[0][0][119894]119888119903119910[0][0][119894] = 119888119903119910[0][0][119894] minus 119886119898 lowast 119888119905119910[0][0][119894]119888119903119911[0][0][119894] = 119888119903119911[0][0][119894] minus 119886119898 lowast 119888119905119911[0][0][119894]119903119890119886119909 = 119886119887119904(119888119903119909[0][0][119894])119903119890119886119910 = 119886119887119904(119888119903119910[0][0][119894])119903119890119886119911 = 119886119887119904(119888119903119911[0][0][119894])119903119899119900119903119898 = 119903119899119900119903119898 + 119903119890119886119909 lowast 119903119890119886119909 + 119903119890119886119910 lowast 119903119890119886119910 + 119903119890119886119911 lowast 119903119890119886119911

Algorithm 1

Using the convolution theorem and FFT method we canobtain the discrete form of the integral equation (6) with FFTmethod [18 19]

119869119863

120585(119898 119899 119896)

+1

4120587120594 (119898 119899 119896)F

minus1

sum

120585=119909119910119911

119866119890

120585120577(119894 119895 119897) 119869

119863119890

120585(119894 119895 119897)

= 120594 (119898 119899 119896) 119864inc120585

(119898 119899 119896)

(12)

where 119866119890

120585120577(119894 119895 119897) 119869

119863119890

120585(119894 119895 119897) are the discrete Fourier transform

(DFT) of 119866119890

120585120577(119894 119895 119897) 119869

119863119890

120585120577(119894 119895 119897) respectively Similarly the

corresponding adjoint operations can also be performedusing FFT As a consequence we can solve (12) rapidlythrough the CG-FFT algorithm [18] In order to speed upthe FFT calculation the parallel FFT is used to obtain theFFT and inverse FFT results In the proposed algorithmboth the FFT transform and the inverse FFT transform areimplemented using the FFTW library which is a 119862 subrou-tine library for computing the discrete Fourier transformin one or more dimensions and supports the distributed-memory implementation based on message passing interface(MPI) For example to calculate the vector-vector productin the CG-FFT method which can be parallelized by callMPI Allreduce() as shown in Algorithm 1

3 Numerical Results

To illustrate the accuracy and efficiency of the proposed par-allel CG-FFT algorithm we first consider the EM scatteringby a dielectric sphere illuminated by plane waves whichhas a closed-form solution In the following examples thebackground is just free space The dielectric sphere with 120576

119903=

4 119903 = 02m is illuminated by a plane wave The incidentwave is polarized in the 119909 direction and propagating in the

119911 direction in which the operating frequency is 03 GHzThecomparison of numerical results of the internal electric fieldsbetween parallel CG-FFT and analytical results is illustratedin Figure 1 which shows that the numerical results have goodagreement with the analytical resultsWe have also computedthe scattered electric fields from the dielectric object on theobservation plane 119911 = 10m and compared such results withthe exact solutions as shown in Figure 2

Then we do the parallel performance testing on a HPCwhich has 27 nodes shown in Table 1 in which nodes areconnected by 10Gbps Ethernet The benchmark model is ahomogenous cubic dielectric object with 120576

119903= 4 and the edge

of cubic is 04mThe incident wave is the same plane wave asthat in Figure 1 We compare the network latency inside nodeand internode which means that we test the network latencyon one node and between two nodes respectively Figure 3shows the testing results for internode and inside node FromFigure 3 we can see that the speed of network inside nodeis about 4 times of the inter node which will be a bottleneckfor the parallel CG-FFTmethod To evaluate the performanceof the parallel CG-FFT code we define the performance asfollows

Performance (cellss)

=119873119909

times 119873119910

times 119873119911

times Number of iteratioinsSimulation time (s)

(13)

The parallel CG-FFTmethods performance testing resultis demonstrated in Figure 4 and the detail data is listed inTable 2 From Figure 4 we can obtain that the performancegoes up when no more than 8 nodes are used and theperformance goes down using 10 nodesThe reason is that thenetwork latency plays an important role when we use morethan 8 nodes The parallel efficiency is also tested which isdefined as

Parallel Efficiency =1198791

119875 lowast 119879119901

(14)

International Journal of Antennas and Propagation 7

where 119875 is the number of processes 1198791is the running time

used by one process and 119879119901is running time used by 119875

processes Figure 5 shows the parallel efficiency of parallelCG-FFT with different discretization and processes and thedetail results are listed in Table 3 From Figure 5 and Table 3we can see that the parallel efficiency is above 60 when nomore than 8 nodes are used

Finally we use the proposed parallel CG-FFT methodto simulate EM scattering problem from 3D anatomicallyrealistic human head model exposed to the plane waveworking at 900Mhz The popular HUGO model [20] with aresolution of 1 times 1 times 1mm as shown in Figure 6 includes 16different tissues and organs The electromagnetic properties(120576119903and 120590) of 16 tissues in the model can be obtained from

FCC published data [21] as listed in Table 4 In our simu-lation 4 nodes are used and the computation time is about65 minutes Figure 7 shows the electric field on head surfaceWith the object oriented HUGOmodel the field distributionover a specific object can be investigated The electric fielddistributions on eyes bone and brain are demonstrated inFigures 8 9 and 10 respectively

4 Conclusion

In this paper we have analyzed the performance of an efficientMPI parallel implementation of the CG-FFT algorithm onHPC computers In the proposed method the codes canrun not only on share memory systems machine but alsoon distributed ones which present high scalability behaviorSpecial attention was paid to communications during thematrix-vector product and vector-vector product which area key point for the parallel performanceWe solved a problemwith more than 400 million unknowns on a HPC including27 nodes

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National ScienceFoundation of China under Grant no 61372057 in part byNatural Science Foundation of the Jiangsu Higher EducationInstitutions under Grant no 10KJD180004 and in part byPostgraduate Innovation Project of Jiangsu Province underGrant no CXZZ13 0973

References

[1] R F Harrington Field Computation by Moment MethodsMacMillan New York NY USA 1968

[2] P M Goggans A A Kishk and A W Glisson ldquoElec-tromagnetic scattering from objects composed of multiplehomogeneous regions using a region-by-region solutionrdquo IEEETransactions on Antennas and Propagation vol 42 no 6 pp865ndash871 1994

[3] R D Graglia P L E Uslenghi andR S Zich ldquoMomentmethodwith isoparametric elements for three-dimensional anisotropicscatterersrdquo Proceedings of the IEEE vol 77 no 5 pp 750ndash7601989

[4] J M Jarem ldquoMethod-of-moments solution of a parallel-platewaveguide aperture systemrdquo Journal of Applied Physics vol 59no 10 pp 3566ndash3570 1986

[5] D E Livesay and K Chen ldquoElectromagnetic field inducedinside arbitrarily shaped biological bodiesrdquo IEEE Transactionson Microwave Theory and Techniques vol 22 no 12 pp 1273ndash1280 1974

[6] T K Sarkar and E Arvas ldquoAn integral equation approachto the analysis of finite microstrip antennas volumesurfaceformulationrdquo IEEE Transactions on Antennas and Propagationvol 38 no 3 pp 305ndash312 1990

International Journal of Antennas and Propagation 7

where $P$ is the number of processes, $T_1$ is the running time used by one process, and $T_p$ is the running time used by $P$ processes. Figure 5 shows the parallel efficiency of the parallel CG-FFT method for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3, we can see that the parallel efficiency is above 60% when no more than 8 nodes are used.
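As a minimal sketch of how such an efficiency figure is computed, the snippet below assumes the usual definition, efficiency = $T_1 / (P \, T_p)$, in the notation above. The function name and the timings are hypothetical placeholders for illustration; they are not the values reported in Table 3.

```python
def parallel_efficiency(t1: float, tp: float, p: int) -> float:
    """Return T_1 / (P * T_p), assuming the usual definition of parallel efficiency."""
    if p < 1 or t1 <= 0 or tp <= 0:
        raise ValueError("p must be >= 1 and both timings must be positive")
    return t1 / (p * tp)


# Hypothetical wall-clock times in seconds, keyed by process count.
# These are illustrative placeholders, not measurements from the paper.
timings = {1: 1200.0, 2: 660.0, 4: 360.0, 8: 240.0}

t1 = timings[1]
for p, tp in sorted(timings.items()):
    eff = parallel_efficiency(t1, tp, p)
    print(f"P = {p}: speedup = {t1 / tp:.2f}, efficiency = {eff:.1%}")
```

An efficiency of 1 means perfect linear speedup; values below 1 reflect communication and load-imbalance overhead.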

Finally, we use the proposed parallel CG-FFT method to simulate EM scattering from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model [20], with a resolution of 1 × 1 × 1 mm as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model can be obtained from FCC published data [21], as listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. With the object-oriented HUGO model, the field distribution over a specific object can be investigated. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.
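One way a voxel model with per-tissue $\varepsilon_r$ and $\sigma$ could be turned into solver input is sketched below. It assumes the complex relative permittivity $\varepsilon_r - j\sigma/(\omega\varepsilon_0)$ under an $e^{j\omega t}$ time convention, which is an assumption rather than something stated here. The tissue labels, names, and property values are hypothetical placeholders, not the FCC data of Table 4.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def complex_permittivity(eps_r: float, sigma: float, freq_hz: float) -> complex:
    """Return eps_r - j*sigma/(omega*eps0); assumes an exp(+j*omega*t) convention."""
    omega = 2.0 * math.pi * freq_hz
    return complex(eps_r, -sigma / (omega * EPS0))


# Placeholder table: label -> (name, eps_r, sigma in S/m).
# Values are illustrative only and are NOT taken from Table 4 or the FCC data.
TISSUES = {
    0: ("air", 1.0, 0.0),
    1: ("tissue_a", 40.0, 1.0),
    2: ("tissue_b", 12.0, 0.2),
}


def permittivity_map(labels, freq_hz):
    """Map a flat sequence of voxel labels to complex relative permittivities."""
    table = {
        label: complex_permittivity(eps_r, sigma, freq_hz)
        for label, (_name, eps_r, sigma) in TISSUES.items()
    }
    return [table[label] for label in labels]


# A tiny 1D strip of voxels standing in for the 3D pixel-format model.
voxels = [0, 1, 1, 2, 0]
for label, eps in zip(voxels, permittivity_map(voxels, 900e6)):
    print(TISSUES[label][0], eps)
```

Because each voxel carries only a tissue label, the lookup table is built once per frequency and then applied to the whole grid.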

4. Conclusion

In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC computers. The codes can run not only on shared-memory machines but also on distributed-memory ones, where they show high scalability. Special attention was paid to communications during the matrix-vector and vector-vector products, which are a key point for parallel performance. We solved a problem with more than 400 million unknowns on an HPC system with 27 nodes.
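To illustrate why the vector-vector product needs communication, the sketch below simulates a distributed inner product in a single process: each "rank" owns a contiguous slab of the vectors, computes a local partial sum, and the partial sums are then combined. This is only an illustration under assumed data layout; the function names are hypothetical, and in an MPI code the final sum would be performed by an all-reduce rather than a Python `sum`.

```python
def split_slabs(n: int, nprocs: int):
    """Return contiguous (start, stop) index ranges, one per simulated rank."""
    base, extra = divmod(n, nprocs)
    ranges, start = [], 0
    for rank in range(nprocs):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def local_dot(x, y, start: int, stop: int):
    """Partial inner product sum(conj(x_i) * y_i) over one rank's slab."""
    return sum(a.conjugate() * b for a, b in zip(x[start:stop], y[start:stop]))


def distributed_dot(x, y, nprocs: int):
    """Combine per-rank partial sums; the sum stands in for an all-reduce."""
    partials = [local_dot(x, y, s, e) for s, e in split_slabs(len(x), nprocs)]
    return sum(partials)


x = [1 + 2j, 3, -1j, 2]
y = [1, 1j, 2, 1 - 1j]
print(distributed_dot(x, y, nprocs=3))
```

Each CG iteration needs a few such inner products, so every iteration includes global reductions in addition to the data exchange inside the FFT-based matrix-vector product.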

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by the Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13 0973.

References

[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.

[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, "Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution," IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.

[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, "Moment method with isoparametric elements for three-dimensional anisotropic scatterers," Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.

[4] J. M. Jarem, "Method-of-moments solution of a parallel-plate waveguide aperture system," Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.

[5] D. E. Livesay and K. Chen, "Electromagnetic field induced inside arbitrarily shaped biological bodies," IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.

[6] T. K. Sarkar and E. Arvas, "An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation," IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.

[7] H. Gan and W. C. Chew, "A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems," Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.

[8] T. J. Cui, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[9] L. Zhao and T. J. Cui, "CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability," Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.

[10] L. Zhao, T. J. Cui, and W. D. Li, "An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method," Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.

[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.

[12] W. Yu, X. Yang, Y. Liu, et al., "New development of parallel conformal FDTD method in computational electromagnetics engineering," IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.

[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, "MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetic," Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.

[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, "Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.

[15] G. Lazzi and O. P. Gandhi, "Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones," IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.

[16] A. K. Lee, H. D. Choi, and J. I. Choi, "Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz," IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.

[17] Q.-X. Li and O. P. Gandhi, "Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.

[18] T. J. Cui and W. C. Chew, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.

[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.

[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, "Specific absorption rate and temperature increases in the head of a cellular-phone user," IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.

[21] http://www.fcc.gov/fcc-bin/dielec.sh
