Research Article
An Efficient Algorithm for EM Scattering from Anatomically Realistic Human Head Model Using Parallel CG-FFT Method
Lei Zhao and Gen Chen
Center for Computational Science and Engineering, School of Mathematics and Statistics, Jiangsu Normal University, Xuzhou 221116, China
Correspondence should be addressed to Lei Zhao; lzhaomax@163.com
Received 1 December 2013; Revised 2 February 2014; Accepted 16 February 2014; Published 24 March 2014
Academic Editor: Gaobiao Xiao
Copyright © 2014 L. Zhao and G. Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
An efficient algorithm is proposed to analyze electromagnetic scattering from a high-resolution head model in pixel data format. The algorithm combines parallel techniques with the conjugate gradient (CG) method and the fast Fourier transform (FFT). With the parallel CG-FFT method, the proposed algorithm is efficient and can solve electrically very large problems that cannot be solved using the conventional CG-FFT method on a personal computer. The accuracy of the proposed algorithm is verified by comparing numerical results with analytical Mie-series solutions for dielectric spheres. Numerical experiments demonstrate that the proposed method achieves good parallel efficiency.
1. Introduction
In recent years, there has been an increasing effort to achieve efficient numerical analysis of large-scale electromagnetic problems, which usually require long computation times and large computer memory. An efficient numerical method for large and complex bodies is important for many practical applications. The method of moments (MoM) [1] has become one of the most popular methods for computing scattering problems in a variety of applications [2–6]. However, MoM requires $O(N^2)$ memory and $O(N^3)$ operations to solve the matrix equation using LU decomposition or Gaussian elimination, where $N$ is the number of unknowns. To reduce the computational time, CG-FFT is employed to solve the MoM matrix equation; it is one of the most efficient ways to solve the volume integral equation for dielectric targets and reduces the computational complexity to $O(N \log N)$ per iteration [7–10]. For most practical EM problems, a regular computer is insufficient because of its limited memory and performance. New developments in parallel-processing techniques and high-performance computer (HPC) systems make it possible to solve large problems that were unattainable in the past. To this end, it becomes more and more important to develop and parallelize fast algorithms that scale well, so that they can benefit from the large amounts of memory and the many parallel processors of HPC systems [11–13].
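The gap between dense-matrix storage and the vector storage of an iterative solver can be made concrete with a small calculation. The sketch below is illustrative only: the function names, the 8 bytes per single-precision complex entry, and the count of ten work vectors are assumptions, and the relation $N = 3N_xN_yN_z$ is the one given in Section 2.

```c
#include <assert.h>
#include <stdint.h>

/* Number of unknowns for an Nx x Ny x Nz grid with three current
   components per cell (N = 3 Nx Ny Nz). */
uint64_t unknown_count(uint64_t nx, uint64_t ny, uint64_t nz)
{
    assert(nx > 0 && ny > 0 && nz > 0);
    return 3u * nx * ny * nz;
}

/* Bytes needed for a dense N x N matrix. bytes_per_entry is an assumed
   storage size per complex entry. The result is a double because the
   product can exceed the 64-bit integer range for large N. */
double dense_matrix_bytes(uint64_t n, double bytes_per_entry)
{
    return (double)n * (double)n * bytes_per_entry;
}

/* Bytes needed for n_vectors work vectors of length N, the storage
   pattern of an iterative solver (n_vectors is a hypothetical count). */
double vector_storage_bytes(uint64_t n, unsigned n_vectors, double bytes_per_entry)
{
    return (double)n * (double)n_vectors * bytes_per_entry;
}
```

For a 512 × 512 × 512 grid, `unknown_count` gives 402,653,184 unknowns. Under the assumptions above, a dense matrix would need about 1.3 × 10^18 bytes, whereas ten work vectors need about 3.2 × 10^10 bytes (roughly 32 GB), which is why only an iterative, matrix-free scheme is practical at this size.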
In the past few decades, the energy absorption in the human head exposed to radio-frequency (RF) electromagnetic radiation has raised concern about the possible consequences of electromagnetic radiation for human health. Many studies have calculated the power absorbed in a human body exposed to the electromagnetic (EM) field emitted by radio-communication equipment [14–17]. In this paper, the EM scattering problem for a high-resolution 3D anatomically realistic model of the human head is considered. Volume integral equations are applied to describe the problem. MoM is then used to discretize the coupled integral equations, and a CG-FFT algorithm is used to solve the resulting discrete linear system. Parallelization techniques are applied to speed up the FFT calculation, the vector-vector product, and the matrix-vector product during the CG iteration. The paper presents a detailed review of the proposed parallel implementation of the CG-FFT algorithm with pulse basis functions. The different stages of the parallel algorithm are described and its overall parallel performance is analyzed. With
Hindawi Publishing Corporation, International Journal of Antennas and Propagation, Volume 2014, Article ID 495057, 8 pages, http://dx.doi.org/10.1155/2014/495057
Figure 1: Electric field $E_x$ on the center line of the plane $z = 0.1$: analytical solution versus CG-FFT solution. (a) $N_x = N_y = N_z = 32$. (b) $N_x = N_y = N_z = 64$.
this implementation, we ran a benchmark model test with more than 400 million unknowns and solved a practical EM scattering problem with more than 40 million unknowns on an HPC system with 27 nodes. Each node of the cluster has two Intel Xeon E5520 CPUs and 12 GB of memory, and the nodes are connected by a 10 Gbps Ethernet network. We have verified the accuracy and efficiency of the algorithm by comparing the numerical results with analytical results for dielectric spheres. Numerical results show that the proposed method has good parallel performance.
Figure 2: Electric field $E_x$ on the plane $z = 1.0$ m. (a) Parallel CG-FFT results. (b) Analytical results.
Figure 3: Network latency of nodes: latency (microseconds) versus message size (bytes) for internode and inside-node communication.
Figure 4: The performance of parallel CG-FFT: performance (cells/s) versus number of nodes.
Figure 5: Parallel efficiency versus number of processes for grids of 64 × 64 × 64, 128 × 128 × 128, 256 × 256 × 512, and 512 × 512 × 512 cells.
Figure 6: A cut plane of the HUGO human head model.
Figure 7: The total electric field distribution on the head surface.
Figure 8: The total electric field distribution on the eyes.
2. Theory and Methods
Consider a 3D dielectric object of arbitrary shape located in a homogeneous space characterized by relative permittivity $\varepsilon_b$; here the homogeneous space is taken to be free space, $\varepsilon_b = 1$. The arbitrarily shaped dielectric object with complex permittivity $\varepsilon_r(\mathbf{r})$ is enclosed by a cuboid of size $L_x \times L_y \times L_z$. The time dependence $e^{-i\omega t}$ is assumed and suppressed. Under illumination by the incident electric field, the total electric field $\mathbf{E}$ inside the dielectric object can be determined through the following volume integral equation:

$$\mathbf{E}(\mathbf{r}) + \frac{1}{4\pi\varepsilon_b}\int_V \overline{\mathbf{G}}(\mathbf{r},\mathbf{r}')\cdot\left(\varepsilon_r(\mathbf{r}') - \varepsilon_b\right)\mathbf{E}(\mathbf{r}')\,d\mathbf{r}' = \mathbf{E}^{\mathrm{inc}}(\mathbf{r}), \tag{1}$$
where

$$\overline{\mathbf{G}}(\mathbf{r},\mathbf{r}') = \frac{e^{ik_bR}}{R^5}\begin{bmatrix} G_{xx} & G_{xy} & G_{xz}\\ G_{yx} & G_{yy} & G_{yz}\\ G_{zx} & G_{zy} & G_{zz} \end{bmatrix} \tag{2}$$
Figure 9: The total electric field distribution on the bone.
Figure 10: The total electric field distribution on the brain.
is the dyadic Green's function in homogeneous space, in which the corresponding elements are given by

$$\begin{aligned}
G_{\xi\zeta} &= (\xi-\xi')(\zeta-\zeta')\left[(k_bR)^2 + i3(k_bR) - 3\right], && \xi \neq \zeta,\\
G_{\xi\zeta} &= (\xi-\xi')^2\left[(k_bR)^2 + i3(k_bR) - 3\right] - R^2\left[(k_bR)^2 + i(k_bR) - 1\right], && \xi = \zeta,
\end{aligned} \tag{3}$$

with $\xi, \zeta \in \{x, y, z\}$, $R = |\mathbf{r} - \mathbf{r}'|$, and $k_b$ the wavenumber of the background medium.
The equivalent equation for the induced current $\mathbf{J}$ can be obtained as

$$\mathbf{J}(\mathbf{r}) + \frac{\chi(\mathbf{r})}{4\pi}\int_V \overline{\mathbf{G}}(\mathbf{r},\mathbf{r}')\cdot\mathbf{J}(\mathbf{r}')\,d\mathbf{r}' = \mathbf{J}^{\mathrm{inc}}(\mathbf{r}), \tag{4}$$

where $\chi(\mathbf{r}) = \varepsilon_r(\mathbf{r})/\varepsilon_b - 1$ and

$$\mathbf{J}(\mathbf{r}) = \chi(\mathbf{r})\,\mathbf{E}(\mathbf{r}), \qquad \mathbf{J}^{\mathrm{inc}}(\mathbf{r}) = \chi(\mathbf{r})\,\mathbf{E}^{\mathrm{inc}}(\mathbf{r}) \tag{5}$$

are the normalized electric current inside the dielectric object and the equivalent incident current, respectively.
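The step from (1) to (4) can be written out explicitly. The lines below are a short sketch of the algebra, assuming that the permittivity contrast in (1) is evaluated at the source point $\mathbf{r}'$ and writing $\varepsilon_r$ for the permittivity of the object:

```latex
\begin{align*}
\chi(\mathbf{r})\,\mathbf{E}(\mathbf{r})
 + \frac{\chi(\mathbf{r})}{4\pi}\int_V \overline{\mathbf{G}}(\mathbf{r},\mathbf{r}')\cdot
   \frac{\varepsilon_r(\mathbf{r}')-\varepsilon_b}{\varepsilon_b}\,
   \mathbf{E}(\mathbf{r}')\,d\mathbf{r}'
 &= \chi(\mathbf{r})\,\mathbf{E}^{\mathrm{inc}}(\mathbf{r})
 && \text{(multiply (1) by } \chi(\mathbf{r})\text{)}\\
\frac{\varepsilon_r(\mathbf{r}')-\varepsilon_b}{\varepsilon_b}
 &= \chi(\mathbf{r}')
 && \text{(definition of } \chi\text{)}\\
\mathbf{J}(\mathbf{r})
 + \frac{\chi(\mathbf{r})}{4\pi}\int_V \overline{\mathbf{G}}(\mathbf{r},\mathbf{r}')\cdot
   \mathbf{J}(\mathbf{r}')\,d\mathbf{r}'
 &= \mathbf{J}^{\mathrm{inc}}(\mathbf{r})
 && \text{(substitute (5))}
\end{align*}
```

Working with $\mathbf{J}$ rather than $\mathbf{E}$ confines the unknown to the region where $\chi \neq 0$, and the integral keeps the convolution form that the FFT exploits later.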
A box of size $L_x \times L_y \times L_z$ is used to bound the dielectric target and is discretized into $N_x \times N_y \times N_z$ cuboidal cells. The volume of each cell is $\Delta V = \Delta x\,\Delta y\,\Delta z$, where $\Delta\xi = L_\xi/N_\xi$ ($\xi = x, y, z$) and $N_\xi$ is the number of divisions in the $\xi$-direction. Choosing pulse functions as the basis and testing functions, we obtain the discrete form of (4) as

$$J^{D}_{\xi}(m,n,k) + \frac{\chi(m,n,k)}{4\pi}\sum_{\zeta=x,y,z}\;\sum_{m'=0}^{N_x-1}\sum_{n'=0}^{N_y-1}\sum_{k'=0}^{N_z-1} G^{D}_{\xi\zeta}(m-m',\,n-n',\,k-k')\,J^{D}_{\zeta}(m',n',k') = J^{iD}_{\xi}(m,n,k), \tag{6}$$
in which

$$\begin{aligned}
G^{D}_{\xi\zeta}(m-m',\,n-n',\,k-k') &= \Delta V\int_{m'\Delta x}^{(m'+1)\Delta x}\int_{n'\Delta y}^{(n'+1)\Delta y}\int_{k'\Delta z}^{(k'+1)\Delta z} G_{\xi\zeta}\Big(\big(m+\tfrac{1}{2}\big)\Delta x - x',\ \big(n+\tfrac{1}{2}\big)\Delta y - y',\ \big(k+\tfrac{1}{2}\big)\Delta z - z'\Big)\,dx'\,dy'\,dz',\\
J^{iD}_{\xi}(m,n,k) &= \int_{m\Delta x}^{(m+1)\Delta x}\int_{n\Delta y}^{(n+1)\Delta y}\int_{k\Delta z}^{(k+1)\Delta z} J^{\mathrm{inc}}_{\xi}(x,y,z)\,dx\,dy\,dz.
\end{aligned} \tag{7}$$
We remark that formulations (6)-(7) effectively describe scattering by small particles of volume $\Delta V$ because of the use of pulse basis functions, although the dielectric targets may be continuous. We can convert (6) into a linear system of equations

$$\mathbf{Z}\cdot\mathbf{I} = \mathbf{V}, \tag{8}$$
Table 1: The HPC hardware information.

CPU type            Intel Xeon E5520
Clock speed         2.67 GHz
Number of nodes     27
Available memory    30 × 12 GB (DDR3, 1067 MHz)
Operating system    CentOS (Linux)
Network system      BNT 10 Gbps Ethernet
Table 2: Performance test.

Cells              Processes   Number of iterations   Time (s)
64 × 64 × 64       1           30                     2.28
128 × 128 × 128    2           30                     9.04
256 × 256 × 128    4           30                     25.89
256 × 256 × 256    6           30                     36.79
256 × 256 × 512    8           30                     61.23
512 × 512 × 512    10          30                     328.97
Table 3: Parallel efficiency.

Cells              Processes   Time (s)    Parallel efficiency
64 × 64 × 64       1           2.28        1
                   2           1.238       0.92
                   4           0.72        0.80
128 × 128 × 128    1           16.81       1
                   2           9.04        0.93
                   4           5.06        0.83
                   8           3.18        0.66
256 × 256 × 512    1           298.83      1
                   2           164.18      0.91
                   4           94.01       0.79
                   8           61.23       0.61
512 × 512 × 512    1           1940.92     1
                   2           1090.46     0.89
                   4           646.97      0.75
                   10          328.97      0.59
                   12          317.14      0.51
where $\mathbf{Z}$ is an $N \times N$ system matrix, $\mathbf{I}$ is a column vector containing the coefficients of the unknown currents, and $\mathbf{V}$ is a column vector associated with the incident fields in the dielectric object. Here $N = 3N_xN_yN_z$ is the total number of unknowns. However, the inner products in (6) are all 3D summations of products of discrete Green's functions and discrete electric currents, which are costly in both time and memory. For electrically large problems, $N$ is very large and it is very difficult to solve (8) directly. To evaluate the products of Green's functions and electric currents quickly, the discrete Green's functions are extended to a larger computational domain as

$$G^{e}_{\xi\zeta}(m,n,k) = \pm\,G^{D}_{\xi\zeta}(m_0,n_0,k_0), \tag{9}$$
Table 4: Tissue parameters for the HUGO model.

Tissue type           Relative permittivity   Relative permeability   Conductivity (S/m)
Skin                  41.405334               1                       0.86678
Fat                   11.333888               1                       0.109162
Muscle                56.879063               1                       0.995364
Cartilage             42.653103               1                       0.782333
Cerebrospinal fluid   68.638336               1                       2.412575
Sclera                55.27013                1                       1.166726
Vitreous              68.90184                1                       1.636162
Lens nucleus          35.841595               1                       0.484917
Grey matter           52.724701               1                       0.942193
White matter          38.886288               1                       0.590815
Nerve                 32.530067               1                       0.573612
Thyroid               59.683323               1                       1.038448
Tongue                55.27013                1                       0.936192
Bone                  20.787804               1                       0.339975
Blood                 61.360718               1                       1.538069
Air                   1                       1                       0
where $0 \le m \le 2N_x - 1$, $0 \le n \le 2N_y - 1$, $0 \le k \le 2N_z - 1$, and

$$\begin{aligned}
m_0 &= \begin{cases} m, & 0 \le m \le N_x - 1,\\ 2N_x - m, & N_x \le m \le 2N_x - 1, \end{cases}\\
n_0 &= \begin{cases} n, & 0 \le n \le N_y - 1,\\ 2N_y - n, & N_y \le n \le 2N_y - 1, \end{cases}\\
k_0 &= \begin{cases} k, & 0 \le k \le N_z - 1,\\ 2N_z - k, & N_z \le k \le 2N_z - 1. \end{cases}
\end{aligned} \tag{10}$$
The signs of the extended discrete Green's functions are determined by the even or odd symmetry of each component with respect to the coordinates in the different extended subdomains. After defining the extended Green's functions, the equivalent electric current $J^{D}_{\xi}(m,n,k)$ is defined in the extended domain by zero padding as

$$J^{De}_{\xi}(m,n,k) = \begin{cases} J^{D}_{\xi}(m,n,k), & 0 \le m \le N_x - 1,\ 0 \le n \le N_y - 1,\ 0 \le k \le N_z - 1,\\ 0, & \text{else}. \end{cases} \tag{11}$$
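Along a single axis, the index map of (10) and the zero padding of (11) reduce to two small helper functions. The sketch below is one-dimensional and uses hypothetical names (`mirror_index`, `padded_value`); the sign factor of (9), which depends on the parity of each Green's function component, is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Index map of (10) along one axis: position m in the extended range
   [0, 2N-1] is mapped back to a separation index of the original
   Green's function. The value m = N maps to N, a separation that the
   convolution over N cells never uses. */
size_t mirror_index(size_t m, size_t n)
{
    assert(n > 0 && m < 2 * n);
    return (m < n) ? m : 2 * n - m;
}

/* Zero padding of (11) along one axis: the extended current equals the
   original samples for m < N and is zero for N <= m < 2N. */
double padded_value(const double *src, size_t n, size_t m)
{
    assert(n > 0 && m < 2 * n);
    return (m < n) ? src[m] : 0.0;
}
```

With $N = 4$, for example, the extended indices 5, 6, and 7 map to the separations 3, 2, and 1, so the second half of the extended array stores the Green's function for negative index differences.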
    /* Vector-vector product on each process */
    for (int i = 0; i < all; i++) {
        /* update the current estimate: J = J + am * p */
        cjx[0][0][i] = cjx[0][0][i] + am * cpx[0][0][i];
        cjy[0][0][i] = cjy[0][0][i] + am * cpy[0][0][i];
        cjz[0][0][i] = cjz[0][0][i] + am * cpz[0][0][i];
        /* update the residual: r = r - am * t */
        crx[0][0][i] = crx[0][0][i] - am * ctx[0][0][i];
        cry[0][0][i] = cry[0][0][i] - am * cty[0][0][i];
        crz[0][0][i] = crz[0][0][i] - am * ctz[0][0][i];
        /* accumulate the local squared norm of the residual */
        reax = abs(crx[0][0][i]);
        reay = abs(cry[0][0][i]);
        reaz = abs(crz[0][0][i]);
        rnorm = rnorm + reax * reax + reay * reay + reaz * reaz;
    }
    /* Sum norm of each process */
    MPI_Allreduce(&rnorm, &rnorm1, 1, MPI_FLOAT, MPI_SUM, world);

Algorithm 1
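The reduction step in Algorithm 1 relies on the fact that a squared norm is a sum of independent per-process contributions. The sketch below emulates this serially, without MPI, so that it can be run anywhere: the function names and the contiguous block partition are assumptions for illustration, and the loop over ranks stands in for what `MPI_Allreduce` with `MPI_SUM` does across processes.

```c
#include <assert.h>
#include <stddef.h>

/* Contribution of one process: sum of |r_i|^2 over its local index
   range [begin, end). re and im hold the real and imaginary parts of
   the residual vector. */
double local_norm_sq(const double *re, const double *im, size_t begin, size_t end)
{
    double s = 0.0;
    for (size_t i = begin; i < end; ++i)
        s += re[i] * re[i] + im[i] * im[i];
    return s;
}

/* Serial stand-in for the reduction: add the partial sums of nproc
   hypothetical processes that each own one contiguous block. */
double reduced_norm_sq(const double *re, const double *im, size_t n, size_t nproc)
{
    assert(nproc > 0);
    double total = 0.0;
    for (size_t rank = 0; rank < nproc; ++rank) {
        size_t begin = rank * n / nproc;       /* first index owned by rank */
        size_t end = (rank + 1) * n / nproc;   /* one past the last index */
        total += local_norm_sq(re, im, begin, end);
    }
    return total;
}
```

Only one scalar per process has to be communicated for each vector-vector product, so the communication volume of this step is independent of the problem size.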
Using the convolution theorem and the FFT, we can rewrite the discrete integral equation (6) as [18, 19]

$$J^{D}_{\xi}(m,n,k) + \frac{\chi(m,n,k)}{4\pi}\,\mathcal{F}^{-1}\Big\{\sum_{\zeta=x,y,z}\tilde{G}^{e}_{\xi\zeta}(i,j,l)\,\tilde{J}^{De}_{\zeta}(i,j,l)\Big\} = \chi(m,n,k)\,E^{\mathrm{inc}}_{\xi}(m,n,k), \tag{12}$$

where $\tilde{G}^{e}_{\xi\zeta}(i,j,l)$ and $\tilde{J}^{De}_{\zeta}(i,j,l)$ are the discrete Fourier transforms (DFTs) of $G^{e}_{\xi\zeta}(m,n,k)$ and $J^{De}_{\zeta}(m,n,k)$, respectively, and $\mathcal{F}^{-1}$ denotes the inverse DFT. Similarly, the corresponding adjoint operations can also be performed using the FFT. As a consequence, we can solve (12) rapidly through the CG-FFT algorithm [18]. To speed up the FFT calculation, a parallel FFT is used to obtain the forward and inverse FFT results. In the proposed algorithm, both the forward and the inverse FFT are implemented using the FFTW library, a C subroutine library for computing the discrete Fourier transform in one or more dimensions that supports distributed-memory implementations based on the message passing interface (MPI). The vector-vector product in the CG-FFT method, for example, can be parallelized by calling MPI_Allreduce(), as shown in Algorithm 1.
3. Numerical Results
To illustrate the accuracy and efficiency of the proposed parallel CG-FFT algorithm, we first consider EM scattering by a dielectric sphere illuminated by a plane wave, which has a closed-form solution. In the following examples, the background is free space. The dielectric sphere, with $\varepsilon_r = 4$ and radius $r = 0.2$ m, is illuminated by a plane wave polarized in the $x$ direction and propagating in the $z$ direction at an operating frequency of 0.3 GHz. The internal electric fields computed by parallel CG-FFT are compared with the analytical results in Figure 1, which shows good agreement. We have also computed the scattered electric fields from the dielectric object on the observation plane $z = 1.0$ m and compared them with the exact solutions, as shown in Figure 2.
We then test the parallel performance on the HPC system described in Table 1, which has 27 nodes connected by 10 Gbps Ethernet. The benchmark model is a homogeneous dielectric cube with $\varepsilon_r = 4$ and an edge length of 0.4 m. The incident wave is the same plane wave as in Figure 1. We compare the communication latency inside a node and between nodes; that is, we measure the latency on one node and between two nodes, respectively. Figure 3 shows the results. From Figure 3, communication inside a node is about 4 times faster than communication between nodes, so internode communication will be a bottleneck for the parallel CG-FFT method. To evaluate the performance of the parallel CG-FFT code, we define the performance as follows:

$$\text{Performance (cells/s)} = \frac{N_x \times N_y \times N_z \times \text{Number of iterations}}{\text{Simulation time (s)}}. \tag{13}$$
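Definition (13) is a one-line computation. The helper below is a sketch with a hypothetical name; the example figures in the note after it take the Table 2 times as 61.23 s and 328.97 s, that is, with the decimal points that the efficiency values in Table 3 imply.

```c
#include <assert.h>

/* Performance metric of (13): number of cells times number of
   iterations, divided by the simulation time in seconds. */
double performance_cells_per_s(long nx, long ny, long nz, int iterations, double seconds)
{
    assert(nx > 0 && ny > 0 && nz > 0 && iterations > 0 && seconds > 0.0);
    return (double)nx * (double)ny * (double)nz * (double)iterations / seconds;
}
```

With these inputs, the 256 × 256 × 512 run on 8 processes gives about 1.64 × 10^7 cells/s, and the 512 × 512 × 512 run on 10 processes gives about 1.22 × 10^7 cells/s, which reproduces the drop in performance beyond 8 nodes.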
The performance of the parallel CG-FFT method is shown in Figure 4, and the detailed data are listed in Table 2. Figure 4 shows that the performance increases as long as no more than 8 nodes are used and decreases with 10 nodes. The reason is that network latency plays an important role when more than 8 nodes are used. The parallel efficiency is also tested, which is defined as

$$\text{Parallel Efficiency} = \frac{T_1}{P \cdot T_P}, \tag{14}$$
where $P$ is the number of processes, $T_1$ is the running time used by one process, and $T_P$ is the running time used by $P$ processes. Figure 5 shows the parallel efficiency of parallel CG-FFT for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3, the parallel efficiency is above 60% when no more than 8 nodes are used.
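The efficiency values in Table 3 follow directly from (14). The sketch below uses hypothetical function names; the sample inputs in the note are the 64 × 64 × 64 and 512 × 512 × 512 timings, read with decimal points restored (2.28 s, 1.238 s, 1940.92 s, 317.14 s).

```c
#include <assert.h>

/* Parallel efficiency of (14): T1 / (P * TP). */
double parallel_efficiency(double t1, int p, double tp)
{
    assert(t1 > 0.0 && p > 0 && tp > 0.0);
    return t1 / ((double)p * tp);
}

/* Speedup T1 / TP, the quantity that efficiency normalizes by P. */
double speedup(double t1, double tp)
{
    assert(t1 > 0.0 && tp > 0.0);
    return t1 / tp;
}
```

For example, `parallel_efficiency(2.28, 2, 1.238)` evaluates to about 0.92 and `parallel_efficiency(1940.92, 12, 317.14)` to about 0.51, in agreement with the tabulated values.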
Finally, we use the proposed parallel CG-FFT method to simulate EM scattering from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model [20], with a resolution of 1 × 1 × 1 mm, as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model are taken from data published by the FCC [21] and are listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. With the object-oriented HUGO model, the field distribution over a specific object can be investigated. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.
4. Conclusion
In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC computers. The code runs on both shared-memory and distributed-memory systems and shows high scalability. Special attention was paid to communication during the matrix-vector and vector-vector products, which is key to the parallel performance. We solved a problem with more than 400 million unknowns on an HPC system with 27 nodes.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by the Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13_0973.
References
[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.
[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, "Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution," IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.
[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, "Moment method with isoparametric elements for three-dimensional anisotropic scatterers," Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.
[4] J. M. Jarem, "Method-of-moments solution of a parallel-plate waveguide aperture system," Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.
[5] D. E. Livesay and K. Chen, "Electromagnetic field induced inside arbitrarily shaped biological bodies," IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.
[6] T. K. Sarkar and E. Arvas, "An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation," IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.
[7] H. Gan and W. C. Chew, "A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems," Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.
[8] T. J. Cui, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[9] L. Zhao and T. J. Cui, "CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability," Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.
[10] L. Zhao, T. J. Cui, and W. D. Li, "An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method," Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.
[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.
[12] W. Yu, X. Yang, Y. Liu et al., "New development of parallel conformal FDTD method in computational electromagnetics engineering," IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.
[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, "MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetics," Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.
[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, "Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.
[15] G. Lazzi and O. P. Gandhi, "Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones," IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.
[16] A. K. Lee, H. D. Choi, and J. I. Choi, "Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz," IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.
[17] Q.-X. Li and O. P. Gandhi, "Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.
[18] T. J. Cui and W. C. Chew, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.
[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, "Specific absorption rate and temperature increases in the head of a cellular-phone user," IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.
[21] http://www.fcc.gov/fcc-bin/dielec.sh
International Journal of
AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Active and Passive Electronic Components
Control Scienceand Engineering
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
RotatingMachinery
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
Journal ofEngineeringVolume 2014
Submit your manuscripts athttpwwwhindawicom
VLSI Design
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Shock and Vibration
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Civil EngineeringAdvances in
Acoustics and VibrationAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Advances inOptoElectronics
Hindawi Publishing Corporation httpwwwhindawicom
Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
SensorsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chemical EngineeringInternational Journal of Antennas and
Propagation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Navigation and Observation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
DistributedSensor Networks
International Journal of
2 International Journal of Antennas and Propagation
minus02 minus015 minus01 minus005 0 005 01 015 0205
06
07
08
09
1
11
12
13
Analysis solutionCGFFT solution
Nx = Ny = Nz = 32
z = 01The field Ex on the center line of plane
(a) 119873119909 = 119873119909 = 119873119909 = 32
minus02 minus015 minus01 minus005 0 005 01 015 0205
06
07
08
09
1
11
12
13
Analysis solutionCGFFT solution
Nx = Ny = Nz = 64
z = 01The field Ex on the center line of plane
(b) 119873119909 = 119873119909 = 119873119909 = 64
Figure 1 Electric field distribution on the center line of plane 119911 =
01
this implementation we have done a benchmark model testwith more than 400 million unknowns and solved a practicalEM scattering problemwithmore than 40million unknownsusing a HPC system which includes 27 nodes Each node ofthe cluster has two Intel Xeon E5520 CPU and 12GBmemoryand they are connected by 10Gbps Ethernet high speednetwork We have verified the accuracy and efficiency of thealgorithm by comparing the numerical results with analyticalresults for dielectric spheres Numerical results show that theproposed method has good parallel performance
minus005
005
minus01
01
minus015
015
minus02
02
minus02 minus01 0
0
01 02
0256
0254
0252
025
0248
0246
CG-FFT Ex field on z = 1m
(a)
minus005
005
minus01
01
minus015
015
minus02
02
minus02 minus01 0
0
01 02
026
0258
0256
0254
0252
025
0248
0246
0244
0242
Analytical solution Ex field on z = 1m
(b)
Figure 2 Electric fields on the plane 119911 = 10m (a) Parallel CG-FFTresults (b) Analytical results
0 2 4 6 8 100
2000
4000
6000
8000
Message size (bytes)
Late
ncy
(mic
rose
cond
s)
InternodeInside node
times107
Figure 3 Network latency of nodes
International Journal of Antennas and Propagation 3
0 2 4 6 8 102
4
6
8
10
12
14
Perfo
rman
cec
ells
Number of nodes
times105
Figure 4 The performance of parallel CG-FFT
0 2 4 6 8 10 1205
06
07
08
09
1
Processes
Para
llel e
ffici
ency
64lowast 64lowast 64
128lowast 128lowast 128
256lowast 256lowast 512
512lowast 512lowast 512
Figure 5 Parallel efficiency of different case
Figure 6 A cut plane of the HUGO human head model
025
05
075
1
E fi
eld
on h
ead
104
1e minus 07
Figure 7 The total electric field distribution on head surface
001
002
003
E fi
eld
on ey
es
0
00347
Figure 8 The total electric field distribution on eyes
2 Theory and Methods
Consider a 3D dielectric object of arbitrary shape that isin homogeneous space which is characterized by relativepermittivity 120576
119887 we set the homogeneous space is free space
120576119887
= 1 The arbitrarily shaped dielectric object with complexpermittivity 120576
119903(119903) is inscribed by a cuboid 119871
119909times 119871119910
times 119871119911 The
time dependence of 119890minus119894120596119905 is assumed and suppressed Under
the illumination of the incident electric field the total electricfield inside the dielectric object 119864 can be determined throughthe following volume integral equation
119864 (119903) +1
4120587120576119887
int119881
119866 (119903 1199031015840) (120576119887
(119903) minus 120576119887) 119864 (119903 119903
1015840) 1198891199031015840
= 119864inc
(1)
where

$$\mathbf{G}(\mathbf{r},\mathbf{r}') = \frac{e^{ik_bR}}{R^5}
\begin{bmatrix}
G_{xx} & G_{xy} & G_{xz}\\
G_{yx} & G_{yy} & G_{yz}\\
G_{zx} & G_{zy} & G_{zz}
\end{bmatrix} \tag{2}$$

is the dyadic Green's function in homogeneous space, with $R = |\mathbf{r} - \mathbf{r}'|$ and $k_b$ the wavenumber of the background medium, in which the corresponding elements are given by

$$\begin{aligned}
G_{\xi\zeta} &= (\xi - \xi')(\zeta - \zeta')\left[(k_bR)^2 + i3(k_bR) - 3\right], && (\xi \neq \zeta),\\
G_{\xi\zeta} &= (\xi - \xi')^2\left[(k_bR)^2 + i3(k_bR) - 3\right] - R^2\left[(k_bR)^2 + i(k_bR) - 1\right], && (\xi = \zeta).
\end{aligned} \tag{3}$$

Figure 9: The total electric field distribution on bone (color bar: E field on bone, 0 to 0.304).
Figure 10: The total electric field distribution on brain (color bar: E field on brain, 0 to 0.223).
The equivalent equation for the induced current $\mathbf{J}$ can be approximately obtained as

$$\mathbf{J}(\mathbf{r}) + \frac{1}{4\pi}\,\chi(\mathbf{r})\int_V \mathbf{G}(\mathbf{r},\mathbf{r}')\cdot\mathbf{J}(\mathbf{r}')\,d\mathbf{r}' = \mathbf{J}^{\mathrm{inc}}(\mathbf{r}), \tag{4}$$

where $\chi(\mathbf{r}) = \varepsilon_r(\mathbf{r})/\varepsilon_b - 1$ and

$$\mathbf{J}(\mathbf{r}) = \chi(\mathbf{r})\,\mathbf{E}(\mathbf{r}), \qquad \mathbf{J}^{\mathrm{inc}}(\mathbf{r}) = \chi(\mathbf{r})\,\mathbf{E}^{\mathrm{inc}}(\mathbf{r}) \tag{5}$$

are the normalized electric current inside the dielectric object and the equivalent incident current, respectively.
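The step from (1) to (4) can be sketched as follows. This is a reading of the derivation rather than the authors' own working: it assumes the permittivity contrast and the field in (1) are both evaluated at the source point $\mathbf{r}'$, and it uses only the definitions of $\chi$ and $\mathbf{J}$ in (5). The first line rewrites the contrast, the second substitutes it into (1), and the third multiplies through by $\chi(\mathbf{r})$.

```latex
\begin{align*}
\varepsilon_r(\mathbf{r}') - \varepsilon_b &= \varepsilon_b\,\chi(\mathbf{r}'), \\
\mathbf{E}(\mathbf{r}) + \frac{1}{4\pi}\int_V \mathbf{G}(\mathbf{r},\mathbf{r}')\cdot
  \chi(\mathbf{r}')\,\mathbf{E}(\mathbf{r}')\,d\mathbf{r}' &= \mathbf{E}^{\mathrm{inc}}(\mathbf{r}), \\
\underbrace{\chi(\mathbf{r})\,\mathbf{E}(\mathbf{r})}_{\mathbf{J}(\mathbf{r})}
  + \frac{\chi(\mathbf{r})}{4\pi}\int_V \mathbf{G}(\mathbf{r},\mathbf{r}')\cdot
  \underbrace{\chi(\mathbf{r}')\,\mathbf{E}(\mathbf{r}')}_{\mathbf{J}(\mathbf{r}')}\,d\mathbf{r}'
  &= \underbrace{\chi(\mathbf{r})\,\mathbf{E}^{\mathrm{inc}}(\mathbf{r})}_{\mathbf{J}^{\mathrm{inc}}(\mathbf{r})}.
\end{align*}
```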
A box of size $L_x \times L_y \times L_z$ is used to bound the dielectric target and is discretized into $N_x \times N_y \times N_z$ cuboidal cells. The volume of each cell is then $\Delta V = \Delta x\,\Delta y\,\Delta z$, where $\Delta\xi = L_\xi/N_\xi$ ($\xi = x, y, z$) and $N_\xi$ is the number of divisions in the $\xi$-direction. Choosing pulse functions as both the basis and testing functions, we obtain the discrete form of (4) as

$$J^D_\xi(m,n,k) + \frac{1}{4\pi}\,\chi(m,n,k)\sum_{\zeta=x,y,z}\;\sum_{m'=0}^{N_x-1}\sum_{n'=0}^{N_y-1}\sum_{k'=0}^{N_z-1} G^D_{\xi\zeta}(m-m',\,n-n',\,k-k')\,J^D_\zeta(m',n',k') = J^{iD}_\xi(m,n,k), \tag{6}$$

in which

$$\begin{aligned}
G^D_{\xi\zeta}(m-m',\,n-n',\,k-k') &= \Delta V\int_{m'\Delta x}^{(m'+1)\Delta x}\int_{n'\Delta y}^{(n'+1)\Delta y}\int_{k'\Delta z}^{(k'+1)\Delta z} G_{\xi\zeta}\Bigl(\bigl(m+\tfrac12\bigr)\Delta x - x',\ \bigl(n+\tfrac12\bigr)\Delta y - y',\ \bigl(k+\tfrac12\bigr)\Delta z - z'\Bigr)\,dx'\,dy'\,dz',\\
J^{iD}_\xi(m,n,k) &= \int_{m\Delta x}^{(m+1)\Delta x}\int_{n\Delta y}^{(n+1)\Delta y}\int_{k\Delta z}^{(k+1)\Delta z} J^{\mathrm{inc}}_\xi(x,y,z)\,dx\,dy\,dz.
\end{aligned} \tag{7}$$
We remark that, because pulse basis functions are used, formulations (6)-(7) effectively describe scattering by small particles of volume $\Delta V$, although the dielectric targets may be continuous. We can convert (6) into a linear system of equations

$$Z \cdot I = V, \tag{8}$$
Table 1: The HPC hardware information.

  CPU type            Intel Xeon E5520
  Clock speed         2.67 GHz
  Number of nodes     27
  Available memory    30 × 12 GB (DDR3, 1067 MHz)
  Operating system    CentOS (Linux)
  Network system      BNT 10 Gbps Ethernet

Table 2: Performance test.

  Cells              Processes   Number of iterations   Time (s)
  64 × 64 × 64       1           30                     2.28
  128 × 128 × 128    2           30                     9.04
  256 × 256 × 128    4           30                     25.89
  256 × 256 × 256    6           30                     36.79
  256 × 256 × 512    8           30                     61.23
  512 × 512 × 512    10          30                     328.97

Table 3: Parallel efficiency.

  Cells              Processes   Time (s)    Parallel efficiency
  64 × 64 × 64       1           2.28        1
                     2           1.238       0.92
                     4           0.72        0.80
  128 × 128 × 128    1           16.81       1
                     2           9.04        0.93
                     4           5.06        0.83
                     8           3.18        0.66
  256 × 256 × 512    1           298.83      1
                     2           164.18      0.91
                     4           94.01       0.79
                     8           61.23       0.61
  512 × 512 × 512    1           1940.92     1
                     2           1090.46     0.89
                     4           646.97      0.75
                     10          328.97      0.59
                     12          317.14      0.51
where $Z$ is an $N \times N$ system matrix, $I$ is a column vector containing the coefficients of the unknown currents, and $V$ is a column vector associated with the incident fields in the dielectric object. Here $N = 3N_xN_yN_z$ is the total number of unknowns. However, the inner products in (6) are all 3D summations of products of discrete Green's functions and discrete electric currents, which are costly in both time and memory. For electrically large problems, $N$ is very large and it is very difficult to solve (8) directly. To evaluate the products of Green's functions and electric currents quickly, the discrete Green's functions are extended to a larger computational domain as

$$G^e_{\xi\zeta}(m,n,k) = \pm G^D_{\xi\zeta}(m_0,n_0,k_0), \tag{9}$$
Table 4: Tissue parameters for the HUGO model.

  Tissue type           Relative permittivity   Relative permeability   Conductivity (S/m)
  Skin                  41.405334               1                       0.86678
  Fat                   11.333888               1                       0.109162
  Muscle                56.879063               1                       0.995364
  Cartilage             42.653103               1                       0.782333
  Cerebrospinal fluid   68.638336               1                       2.412575
  Sclera                55.27013                1                       1.166726
  Vitreous              68.90184                1                       1.636162
  Lens nucleus          35.841595               1                       0.484917
  Grey matter           52.724701               1                       0.942193
  White matter          38.886288               1                       0.590815
  Nerve                 32.530067               1                       0.573612
  Thyroid               59.683323               1                       1.038448
  Tongue                55.27013                1                       0.936192
  Bone                  20.787804               1                       0.339975
  Blood                 61.360718               1                       1.538069
  Air                   1                       1                       0
where $0 \le m \le 2N_x - 1$, $0 \le n \le 2N_y - 1$, $0 \le k \le 2N_z - 1$, and

$$\begin{aligned}
m_0 &= \begin{cases} m, & 0 \le m \le N_x - 1,\\ 2N_x - m, & N_x \le m \le 2N_x - 1,\end{cases}\\
n_0 &= \begin{cases} n, & 0 \le n \le N_y - 1,\\ 2N_y - n, & N_y \le n \le 2N_y - 1,\end{cases}\\
k_0 &= \begin{cases} k, & 0 \le k \le N_z - 1,\\ 2N_z - k, & N_z \le k \le 2N_z - 1.\end{cases}
\end{aligned} \tag{10}$$
The signs of the extended discrete Green's functions are determined by the even or odd symmetry of each component with respect to the coordinates in the different extended subdomains. After defining the extended Green's functions, the equivalent electric current $J^D_\xi(m,n,k)$ can be defined in the extended domain by zero padding as

$$J^{De}_\xi(m,n,k) = \begin{cases} J^D_\xi(m,n,k), & 0 \le m \le N_x - 1,\ 0 \le n \le N_y - 1,\ 0 \le k \le N_z - 1,\\ 0, & \text{else}.\end{cases} \tag{11}$$
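The index folding of (10) and the zero padding of (11) can be illustrated in one dimension. The sketch below is not the authors' code. The function names `fold_index` and `zero_pad` are hypothetical, and the current is stored as real `double` values for brevity, whereas the solver works with complex quantities on a 3D grid.

```c
/* Eq. (10) in one dimension: map an index m of the extended grid
 * (0 <= m <= 2*n_cells - 1) to the index m0 at which the original
 * discrete Green's function is sampled. */
int fold_index(int m, int n_cells)
{
    return (m <= n_cells - 1) ? m : 2 * n_cells - m;
}

/* Eq. (11) in one dimension: copy the n_cells current samples into an
 * array of length 2 * n_cells and fill the remaining entries with zeros.
 * j_ext must have room for 2 * n_cells values. */
void zero_pad(const double *j, int n_cells, double *j_ext)
{
    for (int m = 0; m < 2 * n_cells; m++) {
        j_ext[m] = (m < n_cells) ? j[m] : 0.0;
    }
}
```

In three dimensions the same folding would be applied independently to m, n, and k, so the extended arrays hold 8 times as many samples as the original grid.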
    /* Vector-vector product on each process */
    for (int i = 0; i < all; i++) {
        /* update the current with step length am along the search direction */
        cjx[0][0][i] = cjx[0][0][i] + am * cpx[0][0][i];
        cjy[0][0][i] = cjy[0][0][i] + am * cpy[0][0][i];
        cjz[0][0][i] = cjz[0][0][i] + am * cpz[0][0][i];
        /* update the residual */
        crx[0][0][i] = crx[0][0][i] - am * ctx[0][0][i];
        cry[0][0][i] = cry[0][0][i] - am * cty[0][0][i];
        crz[0][0][i] = crz[0][0][i] - am * ctz[0][0][i];
        /* accumulate the local squared residual norm */
        reax = abs(crx[0][0][i]);
        reay = abs(cry[0][0][i]);
        reaz = abs(crz[0][0][i]);
        rnorm = rnorm + reax * reax + reay * reay + reaz * reaz;
    }
    /* Sum norm of each process */
    MPI_Allreduce(&rnorm, &rnorm1, 1, MPI_FLOAT, MPI_SUM, world);

Algorithm 1
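The local part of the update in Algorithm 1 can be isolated in a small serial function, which makes the role of the reduction easier to see. This is a simplified sketch under stated assumptions, not the authors' implementation: the name `cg_update_local` is hypothetical, the vectors are real-valued and one-dimensional instead of three complex field components, and the MPI call is left out. In a parallel run, each process would pass its own slice of the vectors and the returned partial norms would then be summed across processes.

```c
/* One CG update on the locally owned entries (real-valued simplification).
 * x: solution, r: residual, p: search direction, t: operator applied to p,
 * alpha: step length (am in Algorithm 1).
 * Returns the local squared residual norm after the update. */
double cg_update_local(int n, double *x, double *r,
                       const double *p, const double *t, double alpha)
{
    double local_norm = 0.0;
    for (int i = 0; i < n; i++) {
        x[i] += alpha * p[i];        /* move along the search direction */
        r[i] -= alpha * t[i];        /* update the residual */
        local_norm += r[i] * r[i];   /* accumulate |r_i|^2 */
    }
    return local_norm;
}
```

Because the squared norm is a plain sum over entries, adding the values returned for disjoint slices gives the same result as one call on the full vectors, which is why a single sum reduction per iteration is enough.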
Using the convolution theorem and the FFT, the discrete form (6) of the integral equation can be written as [18, 19]

$$J^D_\xi(m,n,k) + \frac{1}{4\pi}\,\chi(m,n,k)\,\mathcal{F}^{-1}\Bigl\{\sum_{\zeta=x,y,z}\tilde{G}^e_{\xi\zeta}(i,j,l)\,\tilde{J}^{De}_\zeta(i,j,l)\Bigr\} = \chi(m,n,k)\,E^{\mathrm{inc}}_\xi(m,n,k), \tag{12}$$

where $\tilde{G}^e_{\xi\zeta}(i,j,l)$ and $\tilde{J}^{De}_\zeta(i,j,l)$ are the discrete Fourier transforms (DFTs) of $G^e_{\xi\zeta}(m,n,k)$ and $J^{De}_\zeta(m,n,k)$, respectively. Similarly, the corresponding adjoint operations can also be performed using the FFT. As a consequence, (12) can be solved rapidly with the CG-FFT algorithm [18]. To speed up the FFT calculation, a parallel FFT is used to obtain the forward and inverse FFT results. In the proposed algorithm, both the forward and the inverse FFT are implemented using the FFTW library, a C subroutine library for computing the discrete Fourier transform in one or more dimensions that supports distributed-memory implementations based on the message passing interface (MPI). The vector-vector product in the CG-FFT method, for example, can be parallelized by calling MPI_Allreduce(), as shown in Algorithm 1.
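The reason the extension in (9)-(11) makes the FFT applicable is that a circular convolution of length 2N between the extended kernel and the zero-padded current reproduces the original summation exactly. The one-dimensional sketch below checks this with explicit sums. It is an illustration under assumptions, not the paper's code: the kernel is taken to be real and even (the + sign in (9)), the names `direct_sum` and `circular_sum` are hypothetical, and the circular convolution is written as a double loop, whereas the actual method evaluates it with forward and inverse FFTs.

```c
/* Index folding of Eq. (10) in one dimension. */
static int fold(int m, int n) { return (m <= n - 1) ? m : 2 * n - m; }

/* Direct evaluation: out[m] = sum_{m'} g[|m - m'|] * j[m'], 0 <= m < n. */
void direct_sum(const double *g, const double *j, int n, double *out)
{
    for (int m = 0; m < n; m++) {
        out[m] = 0.0;
        for (int mp = 0; mp < n; mp++) {
            int d = m - mp;
            out[m] += g[d < 0 ? -d : d] * j[mp];
        }
    }
}

/* The same result from a circular convolution of length 2n between the
 * extended kernel and the zero-padded current.
 * g must hold n + 1 samples; g[n] only ever multiplies padded zeros. */
void circular_sum(const double *g, const double *j, int n, double *out)
{
    int len = 2 * n;
    for (int m = 0; m < n; m++) {
        out[m] = 0.0;
        for (int mp = 0; mp < len; mp++) {
            double j_ext = (mp < n) ? j[mp] : 0.0;   /* zero padding, Eq. (11) */
            int idx = ((m - mp) % len + len) % len;  /* circular wrap-around */
            out[m] += g[fold(idx, n)] * j_ext;       /* extended kernel, Eq. (9) */
        }
    }
}
```

For a kernel component that is odd in a coordinate, the folded sample would be multiplied by -1 on the wrapped half of the extended grid, which corresponds to the ± sign in (9).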
3 Numerical Results
To illustrate the accuracy and efficiency of the proposed parallel CG-FFT algorithm, we first consider EM scattering by a dielectric sphere illuminated by a plane wave, which has a closed-form solution. In the following examples, the background is free space. The dielectric sphere, with $\varepsilon_r = 4$ and radius $r = 0.2$ m, is illuminated by a plane wave polarized in the $x$ direction and propagating in the $z$ direction at an operating frequency of 0.3 GHz. The internal electric fields computed by the parallel CG-FFT are compared with the analytical results in Figure 1, which shows good agreement. We have also computed the scattered electric fields from the dielectric object on the observation plane $z = 10$ m and compared them with the exact solutions, as shown in Figure 2.
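For orientation, the electrical size of this validation case can be estimated from the stated parameters. The short calculation below is an added illustration, assuming a radius of 0.2 m, a frequency of 0.3 GHz, and $c \approx 3\times10^{8}$ m/s; the symbols $\lambda_0$ and $\lambda_d$ (free-space and in-sphere wavelengths) are introduced here and do not appear in the original text.

```latex
\begin{align*}
\lambda_0 &= \frac{c}{f} \approx \frac{3\times 10^{8}\ \mathrm{m/s}}{0.3\times 10^{9}\ \mathrm{Hz}} = 1\ \mathrm{m}, &
k_b &= \frac{2\pi}{\lambda_0} \approx 6.28\ \mathrm{rad/m}, \\
\lambda_d &= \frac{\lambda_0}{\sqrt{\varepsilon_r}} = \frac{1\ \mathrm{m}}{\sqrt{4}} = 0.5\ \mathrm{m}, &
\frac{2r}{\lambda_d} &= \frac{0.4\ \mathrm{m}}{0.5\ \mathrm{m}} = 0.8 .
\end{align*}
```

The sphere is therefore a little under one wavelength across inside the dielectric, a regime in which the Mie-series solution is a practical reference.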
Next, we test the parallel performance on an HPC cluster with 27 nodes (Table 1), in which the nodes are connected by 10 Gbps Ethernet. The benchmark model is a homogeneous dielectric cube with $\varepsilon_r = 4$ and an edge length of 0.4 m. The incident wave is the same plane wave as in Figure 1. We compare the intranode and internode network latency, that is, the latency within one node and between two nodes, respectively; Figure 3 shows the results. From Figure 3 we can see that communication inside a node is about 4 times faster than communication between nodes, so internode communication will be a bottleneck for the parallel CG-FFT method. To evaluate the parallel CG-FFT code, we define the performance as follows:

$$\text{Performance (cells/s)} = \frac{N_x \times N_y \times N_z \times \text{Number of iterations}}{\text{Simulation time (s)}}. \tag{13}$$
The performance of the parallel CG-FFT method is shown in Figure 4, and the detailed data are listed in Table 2. Figure 4 shows that the performance increases when no more than 8 nodes are used and decreases with 10 nodes. The reason is that network latency plays an important role when more than 8 nodes are used. The parallel efficiency is also tested, which is defined as

$$\text{Parallel efficiency} = \frac{T_1}{P \cdot T_P}, \tag{14}$$

where $P$ is the number of processes, $T_1$ is the running time used by one process, and $T_P$ is the running time used by $P$ processes. Figure 5 shows the parallel efficiency of the parallel CG-FFT for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3 we can see that the parallel efficiency is above 60% when no more than 8 nodes are used.
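The two metrics in (13) and (14) are simple enough to compute directly. The sketch below uses hypothetical function names and assumes times are given in seconds. As a check, taking the Table 3 timings for the 512 × 512 × 512 case as 1940.92 s on one process and 328.97 s on ten (decimal placement assumed from the listed efficiencies), (14) gives about 0.59, which matches the tabulated value.

```c
/* Eq. (13): processed cells per second. */
double performance(long long nx, long long ny, long long nz,
                   int iterations, double seconds)
{
    return (double)(nx * ny * nz) * iterations / seconds;
}

/* Eq. (14): parallel efficiency T_1 / (P * T_P). */
double parallel_efficiency(double t1, int processes, double tp)
{
    return t1 / (processes * tp);
}
```

An efficiency of 1 means ideal speedup. The drop from 0.59 at 10 processes to 0.51 at 12 in Table 3 reflects the communication cost discussed above.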
Finally, we use the proposed parallel CG-FFT method to simulate EM scattering from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model [20], with a resolution of 1 × 1 × 1 mm, as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model can be obtained from FCC published data [21], as listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. With the object-oriented HUGO model, the field distribution over a specific object can be investigated. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.
4 Conclusion
In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC computers. The code runs not only on shared-memory machines but also on distributed-memory ones, and it exhibits high scalability. Special attention was paid to communications during the matrix-vector and vector-vector products, which are key to the parallel performance. We solved a problem with more than 400 million unknowns on an HPC cluster of 27 nodes.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by the Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13_0973.
References
[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.
[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, "Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution," IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.
[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, "Moment method with isoparametric elements for three-dimensional anisotropic scatterers," Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.
[4] J. M. Jarem, "Method-of-moments solution of a parallel-plate waveguide aperture system," Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.
[5] D. E. Livesay and K. Chen, "Electromagnetic field induced inside arbitrarily shaped biological bodies," IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.
[6] T. K. Sarkar and E. Arvas, "An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation," IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.
[7] H. Gan and W. C. Chew, "A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems," Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.
[8] T. J. Cui, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[9] L. Zhao and T. J. Cui, "CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability," Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.
[10] L. Zhao, T. J. Cui, and W. D. Li, "An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method," Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.
[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.
[12] W. Yu, X. Yang, Y. Liu et al., "New development of parallel conformal FDTD method in computational electromagnetics engineering," IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.
[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, "MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetic," Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.
[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, "Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.
[15] G. Lazzi and O. P. Gandhi, "Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones," IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.
[16] A. K. Lee, H. D. Choi, and J. I. Choi, "Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz," IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.
[17] Q.-X. Li and O. P. Gandhi, "Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.
[18] T. J. Cui and W. C. Chew, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.
[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, "Specific absorption rate and temperature increases in the head of a cellular-phone user," IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.
[21] http://www.fcc.gov/fcc-bin/dielec.sh
Figure 5 Parallel efficiency of different case
Figure 6 A cut plane of the HUGO human head model
025
05
075
1
E fi
eld
on h
ead
104
1e minus 07
Figure 7 The total electric field distribution on head surface
001
002
003
E fi
eld
on ey
es
0
00347
Figure 8 The total electric field distribution on eyes
2 Theory and Methods
Consider a 3D dielectric object of arbitrary shape that isin homogeneous space which is characterized by relativepermittivity 120576
119887 we set the homogeneous space is free space
120576119887
= 1 The arbitrarily shaped dielectric object with complexpermittivity 120576
119903(119903) is inscribed by a cuboid 119871
119909times 119871119910
times 119871119911 The
time dependence of 119890minus119894120596119905 is assumed and suppressed Under
the illumination of the incident electric field the total electricfield inside the dielectric object 119864 can be determined throughthe following volume integral equation
119864 (119903) +1
4120587120576119887
int119881
119866 (119903 1199031015840) (120576119887
(119903) minus 120576119887) 119864 (119903 119903
1015840) 1198891199031015840
= 119864inc
(1)
where
119866 (119903 1199031015840) =
119890119894119896119887119877
1198775[
[
119866119909119909
119866119909119910
119866119909119911
119866119910119909
119866119910119910
119866119910119911
119866119911119909
119866119911119910
119866119911119911
]
]
(2)
4 International Journal of Antennas and Propagation
01
02
03
E fi
eld
on b
one
0
0304
Figure 9 The total electric field distribution on bone
01
02
E fi
eld
on b
rain
0
0223
Figure 10 The total electric field distribution on brain
is the dyadic Greenrsquos function in homogenous space in whichthe corresponding elements are given by
119866120585120577
= (120585 minus 1205851015840) (120577 minus 120577
1015840) [(119896119887119877)2
+ 1198943 (119896119887119877) minus 3] (120585 = 120577)
119866120585120577
= (120585 minus 1205851015840)2
[(119896119887119877)2
+ 1198943 (119896119887119877) minus 3]
minus 1198772
[(119896119887119877)2
+ 119894 (119896119887119877) minus 1] (120585 = 120577)
(3)
The equivalent version for the induced current 119869 can beapproximately obtained by
119869 (119903) +1
4120587120594 (119903) int
119881
119866 (119903 1199031015840) sdot 119869 (119903
1015840) 1198891199031015840
= 119869inc
(119903) (4)
where 120594(119903) = (120576(119903)120576119887
) minus 1 and
119869 (119903) = 120594 (119903) 119864 (119903) 119869inc
(119903) = 120594 (119903) 119864inc
(119903) (5)
are the normalized electric current inside the dielectric objectand the equivalent incident current respectively
A box with the size of 119871119909
times 119871119910
times 119871119911is used to bound
the considered dielectric target and is discretized into 119873119909
times
119873119910
times 119873119911cuboidal cells Then the volume of each cell is ΔV =
Δ119909Δ119910Δ119911 where Δ120585 = 119871120585119873120585 (120585 = 119909 119910 119911) and 119873
120585is the
division number in the 120585-direction Choosing pulse functionas the basis and testing function we obtain the discrete formsof (4) as
119869119863
120585(119898 119899 119896) +
1
4120587120594 (119898 119899 119896)
times sum
120589=119909119910119911
119873119909minus1
sum
1198981015840=0
119873119910minus1
sum
1198991015840=0
119873119911minus1
sum
1198961015840=0
119866119863
120585120589(119898 minus 119898
1015840 119899 minus 119899
1015840 119896 minus 119896
1015840)
times 119869119863
120589(1198981015840 1198991015840 1198961015840)
= 119869119894119863
120585(119898 119899 119896)
(6)
in which
119866119863
120585120589(119898 minus 119898
1015840 119899 minus 119899
1015840 119896 minus 119896
1015840)
= Δ119881 int
(1198981015840+1)Δ119909
1198981015840Δ119909
int
(1198991015840+1)Δ119910
1198991015840Δ119910
int
(1198961015840+1)Δ119911
1198961015840Δ119911
119866120585120589
((119898 +1
2) Δ119909 minus 119909
1015840
(119899 +1
2) Δ119910 minus 119910
1015840
(119896 +1
2) Δ119911
minus1199111015840) 119889119909101584011988911991010158401198891199111015840
119869119894119863
120585(119898 119899 119896)
= int
(119898+1)Δ119909
119898Δ119909
int
(119899+1)Δ119910
119899Δ119910
int
(119896+1)Δ119911
119896Δ119911
119869inc120585
(119909 119910 119911) 119889119909 119889119910 119889119911
(7)
We remark that the above formulations (6)-(7) actuallyimply the scattering by small particles with the size ofΔ119881 because of the use of pulse basis functions althoughthe dielectric targets may be continuous We can convert (6)into a linear system of equations
119885 sdot 119868 = 119881 (8)
International Journal of Antennas and Propagation 5
Table 1 The HPC hardware information
CPU type Intel Xeon E5520Clock speed 267GHzNumber of nodes 27Available memory 30 times 12GB (DDR3 1067MHz)Operating system CentOS (Linux)Network system BNT 10Gbps Ethernet
Table 2 Performance test
Cells Processes Number of iterations Times (second)64 times 64 times 64 1 30 228128 times 128 times 128 2 30 904256 times 256 times 128 4 30 2589256 times 256 times 256 6 30 3679256 times 256 times 512 8 30 6123512 times 512 times 512 10 30 32897
Table 3 Parallel efficiency
Cells Processes Time (second) Parallel efficiency
64 times 64 times 64
1 228 12 1238 0924 72 080
128 times 128 times 128
1 1681 12 904 0934 506 0838 318 066
256 times 256 times 512
1 29883 12 16418 0914 9401 0798 6123 061
512 times 512 times 512
1 194092 12 109046 0894 64697 07510 32897 05912 31714 051
where 119885 is an 119873 times 119873 system matrix 119868 is a column vectorwith the coefficients of the unknown currents and 119881 isa column vector associated with the incident fields in thedielectric object Here 119873 = 3119873
119909119873119910
119873119911is the total number
of unknowns However the inner products in (6) are all 3Dsummations of the products of discrete Greenrsquos functions anddiscrete electric currents which are quite time and memoryconsuming For electric-large electromagnetic problems 119873
is very large and it is very difficult to solve (8) directly Inorder to calculate fast the products of Greenrsquos functions andelectric currents the discrete Greenrsquos functions are extendedin a larger computational domain as
119866119890
120585120589(119898 119899 119896) = plusmn119866
119863
120585120589(1198980 1198990 1198960) (9)
Table 4 Tissue parameters for HUGOmodel
Tissue type Relativepermittivity
Relativepermeability Conductivity
Skin 41405334 1 086678Fat 11333888 1 0109162Muscle 56879063 1 0995364Cartilage 42653103 1 0782333Cerebrospinal fluid 68638336 1 2412575Sclera 5527013 1 1166726Vitreous 6890184 1 1636162Lens nucleus 35841595 1 0484917Grey matter 52724701 1 0942193White matter 38886288 1 0590815Nerve 32530067 1 0573612Thyroid 59683323 1 1038448Tongue 5527013 1 0936192Bone 20787804 1 0339975Blood 61360718 1 1538069Air 1 1 0
where 0 le 119898 le 2119873119909
minus 1 0 le 119899 le 2119873119910
minus 1 0 le 119896 le 2119873119911
minus 1
1198980
= 119898 0 le 119898 le 119873
119909minus 1
2119873119909
minus 119898 119873119909
le 119898 le 2119873119909
minus 1
1198990
= 119899 0 le 119899 le 119873
119910minus 1
2119873119910
minus 119899 119873119910
le 119899 le 2119873119910
minus 1
1198960
= 119896 0 le 119896 le 119873
119911minus 1
2119873119911
minus 119896 119873119911
le 119896 le 2119873119911
minus 1
(10)
The signs of the expanded discrete Greenrsquos functions aredirectly related to the even and odd nature of the compo-nents with respect to the coordinates in different extendedsubdomains After defining the extended Greenrsquos functionsthe equivalent electric current 119869
119863
120585(119898 119899 119896) can be defined in
the extended domain by zero padding as
119869119863119890
120585(119898 119899 119896)
= 119869119863
120585(119898 119899 119896) 0 le 119898 le 119873
119909minus 1 0 le 119899 le 119873
119910 0 le 119896 le 119873
119911
0 else
(11)
6 International Journal of Antennas and Propagation
lowast119878119906119898 119899119900119903119898 119900119891 119890119886119888ℎ 119875119903119900119888119890119904119904lowast119872119875119868 119860119897119897119903119890119889119906119888119890 (amp119903119899119900119903119898amp1199031198991199001199031198981 1 119872119875119868 119865119871119874119860119879 119872119875119868 119878119880119872 119908119900119903119897119889)lowastV119890119888119905119900119903-V119890119888119905119900119903 119875119903119900119889119906119888119905 119900119899 119890119886119888ℎ 119901119903119900119888119890119904119904lowast
119891119900119903(119894119899119905 119894 = 0 119894 lt 119886119897119897 119894 + +)
119888119895119909[0][0][119894] = 119888119895119909[0][0][119894] + 119886119898 lowast 119888119901119909[0][0][119894]119888119895119910[0][0][119894] = 119888119895119910[0][0][119894] + 119886119898 lowast 119888119901119910[0][0][119894]119888119895119911[0][0][119894] = 119888119895119911[0][0][119894] + 119886119898 lowast 119888119901119911[0][0][119894]119888119903119909[0][0][119894] = 119888119903119909[0][0][119894] minus 119886119898 lowast 119888119905119909[0][0][119894]119888119903119910[0][0][119894] = 119888119903119910[0][0][119894] minus 119886119898 lowast 119888119905119910[0][0][119894]119888119903119911[0][0][119894] = 119888119903119911[0][0][119894] minus 119886119898 lowast 119888119905119911[0][0][119894]119903119890119886119909 = 119886119887119904(119888119903119909[0][0][119894])119903119890119886119910 = 119886119887119904(119888119903119910[0][0][119894])119903119890119886119911 = 119886119887119904(119888119903119911[0][0][119894])119903119899119900119903119898 = 119903119899119900119903119898 + 119903119890119886119909 lowast 119903119890119886119909 + 119903119890119886119910 lowast 119903119890119886119910 + 119903119890119886119911 lowast 119903119890119886119911
Algorithm 1
Using the convolution theorem and FFT method we canobtain the discrete form of the integral equation (6) with FFTmethod [18 19]
119869119863
120585(119898 119899 119896)
+1
4120587120594 (119898 119899 119896)F
minus1
sum
120585=119909119910119911
119866119890
120585120577(119894 119895 119897) 119869
119863119890
120585(119894 119895 119897)
= 120594 (119898 119899 119896) 119864inc120585
(119898 119899 119896)
(12)
where 119866119890
120585120577(119894 119895 119897) 119869
119863119890
120585(119894 119895 119897) are the discrete Fourier transform
(DFT) of 119866119890
120585120577(119894 119895 119897) 119869
119863119890
120585120577(119894 119895 119897) respectively Similarly the
corresponding adjoint operations can also be performedusing FFT As a consequence we can solve (12) rapidlythrough the CG-FFT algorithm [18] In order to speed upthe FFT calculation the parallel FFT is used to obtain theFFT and inverse FFT results In the proposed algorithmboth the FFT transform and the inverse FFT transform areimplemented using the FFTW library which is a 119862 subrou-tine library for computing the discrete Fourier transformin one or more dimensions and supports the distributed-memory implementation based on message passing interface(MPI) For example to calculate the vector-vector productin the CG-FFT method which can be parallelized by callMPI Allreduce() as shown in Algorithm 1
3 Numerical Results
To illustrate the accuracy and efficiency of the proposed par-allel CG-FFT algorithm we first consider the EM scatteringby a dielectric sphere illuminated by plane waves whichhas a closed-form solution In the following examples thebackground is just free space The dielectric sphere with 120576
119903=
4 119903 = 02m is illuminated by a plane wave The incidentwave is polarized in the 119909 direction and propagating in the
119911 direction in which the operating frequency is 03 GHzThecomparison of numerical results of the internal electric fieldsbetween parallel CG-FFT and analytical results is illustratedin Figure 1 which shows that the numerical results have goodagreement with the analytical resultsWe have also computedthe scattered electric fields from the dielectric object on theobservation plane 119911 = 10m and compared such results withthe exact solutions as shown in Figure 2
Next, we test the parallel performance on an HPC cluster with 27 nodes connected by 10 Gbps Ethernet (see Table 1). The benchmark model is a homogeneous dielectric cube with $\varepsilon_r = 4$ and an edge length of 0.4 m. The incident wave is the same plane wave as that in Figure 1. We first compare the network latency inside a node with the latency between two nodes; Figure 3 shows the test results for both cases. From Figure 3, we can see that communication inside a node is about 4 times faster than communication between nodes, so internode communication will be a bottleneck for the parallel CG-FFT method. To evaluate the performance of the parallel CG-FFT code, we define the performance as follows:
$$\text{Performance (cells/s)} = \frac{N_x \times N_y \times N_z \times \text{Number of iterations}}{\text{Simulation time (s)}} \tag{13}$$
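As a minimal sketch of how (13) can be evaluated, the C helper below computes the cells-per-second figure for one run. The function name `performance_cells_per_s` and its parameters are illustrative assumptions and are not taken from the authors' code.

```c
#include <stddef.h>

/* Performance metric of Eq. (13): grid cells processed per second.
 * Hypothetical helper; the name and signature are illustrative only. */
double performance_cells_per_s(size_t nx, size_t ny, size_t nz,
                               unsigned iterations, double seconds)
{
    if (seconds <= 0.0) {
        return 0.0; /* guard against division by zero */
    }
    /* Convert to double first so large grids do not overflow integer math. */
    double cells = (double)nx * (double)ny * (double)nz;
    return cells * (double)iterations / seconds;
}
```

For a made-up run of 30 iterations on a 64 × 64 × 64 grid that takes 2 s, the helper returns 64³ × 30 / 2 = 3 932 160 cells/s.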
The performance testing results of the parallel CG-FFT method are shown in Figure 4, and the detailed data are listed in Table 2. Figure 4 shows that the performance increases as long as no more than 8 nodes are used and decreases when 10 nodes are used. The reason is that network latency plays an important role when more than 8 nodes are used. The parallel efficiency is also tested, which is defined as
$$\text{Parallel Efficiency} = \frac{T_1}{P \cdot T_P} \tag{14}$$
where $P$ is the number of processes, $T_1$ is the running time used by one process, and $T_P$ is the running time used by $P$ processes. Figure 5 shows the parallel efficiency of the parallel CG-FFT method for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3, we can see that the parallel efficiency is above 60% when no more than 8 nodes are used.
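The definition in (14) can be sketched in a few lines of C. The helper name `parallel_efficiency` is an assumption made for illustration, and the timings in the usage note are made-up numbers rather than measurements from the paper.

```c
/* Parallel efficiency of Eq. (14): T1 / (P * TP).
 * t1: running time with one process, tp: running time with p processes.
 * Hypothetical helper; returns 0.0 for invalid input. */
double parallel_efficiency(double t1, int p, double tp)
{
    if (p <= 0 || tp <= 0.0) {
        return 0.0;
    }
    return t1 / ((double)p * tp);
}
```

With $T_1 = 100$ s and $T_4 = 40$ s on $P = 4$ processes, the efficiency is 100 / (4 × 40) = 0.625; an ideal speedup would give exactly 1.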
Finally, we use the proposed parallel CG-FFT method to simulate the EM scattering from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model [20], with a resolution of 1 × 1 × 1 mm as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model can be obtained from data published by the FCC [21], as listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. With the object-oriented HUGO model, the field distribution over a specific object can be investigated. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.
4 Conclusion
In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC computers. The code can run not only on shared-memory machines but also on distributed-memory ones, and it scales well. Special attention was paid to communications during the matrix-vector product and the vector-vector product, which are key to the parallel performance. We solved a problem with more than 400 million unknowns on an HPC cluster with 27 nodes.
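The unknown count quoted above can be checked against the formula $N = 3N_xN_yN_z$ used in the formulation. The sketch below assumes, without confirmation from the text, that the figure refers to the largest benchmark grid of 512 × 512 × 512 cells; the function name `num_unknowns` is hypothetical.

```c
#include <stdint.h>

/* Total number of unknowns N = 3 * Nx * Ny * Nz: one current component
 * per Cartesian direction in every cell. Hypothetical helper. */
uint64_t num_unknowns(uint64_t nx, uint64_t ny, uint64_t nz)
{
    return UINT64_C(3) * nx * ny * nz;
}
```

Under that assumption, `num_unknowns(512, 512, 512)` gives 402 653 184, which is consistent with "more than 400 million unknowns".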
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by the Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13_0973.
References
[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.
[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, "Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution," IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.
[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, "Moment method with isoparametric elements for three-dimensional anisotropic scatterers," Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.
[4] J. M. Jarem, "Method-of-moments solution of a parallel-plate waveguide aperture system," Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.
[5] D. E. Livesay and K. Chen, "Electromagnetic field induced inside arbitrarily shaped biological bodies," IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.
[6] T. K. Sarkar and E. Arvas, "An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation," IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.
[7] H. Gan and W. C. Chew, "A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems," Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.
[8] T. J. Cui, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[9] L. Zhao and T. J. Cui, "CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability," Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.
[10] L. Zhao, T. J. Cui, and W. D. Li, "An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method," Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.
[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.
[12] W. Yu, X. Yang, Y. Liu, et al., "New development of parallel conformal FDTD method in computational electromagnetics engineering," IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.
[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, "MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetic," Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.
[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, "Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.
[15] G. Lazzi and O. P. Gandhi, "Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones," IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.
[16] A. K. Lee, H. D. Choi, and J. I. Choi, "Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz," IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.
[17] Q.-X. Li and O. P. Gandhi, "Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.
[18] T. J. Cui and W. C. Chew, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.
[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, "Specific absorption rate and temperature increases in the head of a cellular-phone user," IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.
[21] http://www.fcc.gov/fcc-bin/dielec.sh
Figure 9: The total electric field distribution on bone.

Figure 10: The total electric field distribution on brain.
is the dyadic Green's function in homogeneous space, in which the corresponding elements are given by

$$G_{\xi\zeta} = \begin{cases} (\xi-\xi')(\zeta-\zeta')\left[(k_bR)^2 + 3i(k_bR) - 3\right], & \xi \neq \zeta, \\ (\xi-\xi')^2\left[(k_bR)^2 + 3i(k_bR) - 3\right] - R^2\left[(k_bR)^2 + i(k_bR) - 1\right], & \xi = \zeta. \end{cases} \tag{3}$$
The equivalent equation for the induced current $\mathbf{J}$ can be approximately obtained as

$$\mathbf{J}(\mathbf{r}) + \frac{1}{4\pi}\chi(\mathbf{r})\int_V \overline{\mathbf{G}}(\mathbf{r},\mathbf{r}')\cdot\mathbf{J}(\mathbf{r}')\,d\mathbf{r}' = \mathbf{J}^{\mathrm{inc}}(\mathbf{r}), \tag{4}$$

where $\chi(\mathbf{r}) = \varepsilon(\mathbf{r})/\varepsilon_b - 1$ and

$$\mathbf{J}(\mathbf{r}) = \chi(\mathbf{r})\,\mathbf{E}(\mathbf{r}), \qquad \mathbf{J}^{\mathrm{inc}}(\mathbf{r}) = \chi(\mathbf{r})\,\mathbf{E}^{\mathrm{inc}}(\mathbf{r}) \tag{5}$$

are the normalized electric current inside the dielectric object and the equivalent incident current, respectively.
A box of size $L_x \times L_y \times L_z$ is used to bound the dielectric target and is discretized into $N_x \times N_y \times N_z$ cuboidal cells. The volume of each cell is then $\Delta V = \Delta x\,\Delta y\,\Delta z$, where $\Delta\xi = L_\xi/N_\xi$ ($\xi = x, y, z$) and $N_\xi$ is the number of divisions in the $\xi$ direction. Choosing pulse functions as the basis and testing functions, we obtain the discrete form of (4) as
$$J^{D}_{\xi}(m,n,k) + \frac{1}{4\pi}\chi(m,n,k) \sum_{\zeta=x,y,z}\;\sum_{m'=0}^{N_x-1}\sum_{n'=0}^{N_y-1}\sum_{k'=0}^{N_z-1} G^{D}_{\xi\zeta}(m-m',\,n-n',\,k-k')\,J^{D}_{\zeta}(m',n',k') = J^{iD}_{\xi}(m,n,k), \tag{6}$$

in which

$$\begin{aligned}
G^{D}_{\xi\zeta}(m-m',\,n-n',\,k-k') &= \Delta V \int_{m'\Delta x}^{(m'+1)\Delta x}\int_{n'\Delta y}^{(n'+1)\Delta y}\int_{k'\Delta z}^{(k'+1)\Delta z} G_{\xi\zeta}\!\left(\left(m+\tfrac{1}{2}\right)\Delta x - x',\ \left(n+\tfrac{1}{2}\right)\Delta y - y',\ \left(k+\tfrac{1}{2}\right)\Delta z - z'\right) dx'\,dy'\,dz', \\
J^{iD}_{\xi}(m,n,k) &= \int_{m\Delta x}^{(m+1)\Delta x}\int_{n\Delta y}^{(n+1)\Delta y}\int_{k\Delta z}^{(k+1)\Delta z} J^{\mathrm{inc}}_{\xi}(x,y,z)\,dx\,dy\,dz.
\end{aligned} \tag{7}$$
We remark that the above formulations (6)-(7) actually imply scattering by small particles of size $\Delta V$ because of the use of pulse basis functions, although the dielectric targets may be continuous. We can convert (6) into a linear system of equations

$$\mathbf{Z}\cdot\mathbf{I} = \mathbf{V}, \tag{8}$$
Table 1: The HPC hardware information.

| Item | Specification |
| --- | --- |
| CPU type | Intel Xeon E5520 |
| Clock speed | 2.67 GHz |
| Number of nodes | 27 |
| Available memory | 30 × 12 GB (DDR3 1067 MHz) |
| Operating system | CentOS (Linux) |
| Network system | BNT 10 Gbps Ethernet |
Table 2: Performance test.

| Cells | Processes | Number of iterations | Time (s) |
| --- | --- | --- | --- |
| 64 × 64 × 64 | 1 | 30 | 22.8 |
| 128 × 128 × 128 | 2 | 30 | 90.4 |
| 256 × 256 × 128 | 4 | 30 | 258.9 |
| 256 × 256 × 256 | 6 | 30 | 367.9 |
| 256 × 256 × 512 | 8 | 30 | 612.3 |
| 512 × 512 × 512 | 10 | 30 | 3289.7 |
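Table 3: Parallel efficiency.

| Cells | Processes | Time (s) | Parallel efficiency |
| --- | --- | --- | --- |
| 64 × 64 × 64 | 1 | 22.8 | 1 |
| 64 × 64 × 64 | 2 | 12.38 | 0.92 |
| 64 × 64 × 64 | 4 | 7.2 | 0.80 |
| 128 × 128 × 128 | 1 | 168.1 | 1 |
| 128 × 128 × 128 | 2 | 90.4 | 0.93 |
| 128 × 128 × 128 | 4 | 50.6 | 0.83 |
| 128 × 128 × 128 | 8 | 31.8 | 0.66 |
| 256 × 256 × 512 | 1 | 2988.3 | 1 |
| 256 × 256 × 512 | 2 | 1641.8 | 0.91 |
| 256 × 256 × 512 | 4 | 940.1 | 0.79 |
| 256 × 256 × 512 | 8 | 612.3 | 0.61 |
| 512 × 512 × 512 | 1 | 19409.2 | 1 |
| 512 × 512 × 512 | 2 | 10904.6 | 0.89 |
| 512 × 512 × 512 | 4 | 6469.7 | 0.75 |
| 512 × 512 × 512 | 10 | 3289.7 | 0.59 |
| 512 × 512 × 512 | 12 | 3171.4 | 0.51 |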
where $\mathbf{Z}$ is an $N \times N$ system matrix, $\mathbf{I}$ is a column vector containing the coefficients of the unknown currents, and $\mathbf{V}$ is a column vector associated with the incident fields in the dielectric object. Here $N = 3N_xN_yN_z$ is the total number of unknowns. However, the inner products in (6) are all 3D summations of products of discrete Green's functions and discrete electric currents, which are quite time and memory consuming. For electrically large problems, $N$ is very large and it is very difficult to solve (8) directly. To compute the products of Green's functions and electric currents quickly, the discrete Green's functions are extended to a larger computational domain as

$$G^{e}_{\xi\zeta}(m,n,k) = \pm\,G^{D}_{\xi\zeta}(m_0,n_0,k_0), \tag{9}$$
Table 4: Tissue parameters for the HUGO model.

| Tissue type | Relative permittivity | Relative permeability | Conductivity (S/m) |
| --- | --- | --- | --- |
| Skin | 41.405334 | 1 | 0.86678 |
| Fat | 11.333888 | 1 | 0.109162 |
| Muscle | 56.879063 | 1 | 0.995364 |
| Cartilage | 42.653103 | 1 | 0.782333 |
| Cerebrospinal fluid | 68.638336 | 1 | 2.412575 |
| Sclera | 55.27013 | 1 | 1.166726 |
| Vitreous | 68.90184 | 1 | 1.636162 |
| Lens nucleus | 35.841595 | 1 | 0.484917 |
| Grey matter | 52.724701 | 1 | 0.942193 |
| White matter | 38.886288 | 1 | 0.590815 |
| Nerve | 32.530067 | 1 | 0.573612 |
| Thyroid | 59.683323 | 1 | 1.038448 |
| Tongue | 55.27013 | 1 | 0.936192 |
| Bone | 20.787804 | 1 | 0.339975 |
| Blood | 61.360718 | 1 | 1.538069 |
| Air | 1 | 1 | 0 |
where $0 \le m \le 2N_x - 1$, $0 \le n \le 2N_y - 1$, $0 \le k \le 2N_z - 1$, and

$$\begin{aligned}
m_0 &= \begin{cases} m, & 0 \le m \le N_x - 1, \\ 2N_x - m, & N_x \le m \le 2N_x - 1, \end{cases} \\
n_0 &= \begin{cases} n, & 0 \le n \le N_y - 1, \\ 2N_y - n, & N_y \le n \le 2N_y - 1, \end{cases} \\
k_0 &= \begin{cases} k, & 0 \le k \le N_z - 1, \\ 2N_z - k, & N_z \le k \le 2N_z - 1. \end{cases}
\end{aligned} \tag{10}$$
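The index mapping in (10) is the same along each axis, so it can be sketched as a single one-dimensional helper. The function name `fold_index` and the `-1` error value are assumptions made for this illustration.

```c
/* Index folding of Eq. (10): maps an extended-domain index m in
 * [0, 2N-1] to the original-domain index m0.
 * Hypothetical helper; returns -1 when m lies outside the extended domain. */
int fold_index(int m, int n_cells)
{
    if (n_cells <= 0 || m < 0 || m > 2 * n_cells - 1) {
        return -1;
    }
    /* First half is copied unchanged; second half is mirrored. */
    return (m <= n_cells - 1) ? m : 2 * n_cells - m;
}
```

With $N = 4$, indices 0 to 3 map to themselves, while 5, 6, and 7 map to 3, 2, and 1. The single index $m = N$ maps to $N$ as written in (10); it corresponds to a separation that never occurs between two cells of the original grid.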
The signs of the extended discrete Green's functions are directly related to the even or odd nature of the components with respect to the coordinates in the different extended subdomains. After defining the extended Green's functions, the equivalent electric current $J^{D}_{\xi}(m,n,k)$ can be defined in the extended domain by zero padding as

$$J^{De}_{\xi}(m,n,k) = \begin{cases} J^{D}_{\xi}(m,n,k), & 0 \le m \le N_x - 1,\ 0 \le n \le N_y - 1,\ 0 \le k \le N_z - 1, \\ 0, & \text{else}. \end{cases} \tag{11}$$
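The zero padding in (11) can be sketched as follows. For brevity the sketch uses a real-valued array in row-major order with $k$ as the fastest index, whereas the currents in the actual solver are complex; the function name `zero_pad_3d` and the memory layout are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Zero padding of Eq. (11): copy an nx*ny*nz array into the low corner of a
 * (2nx)*(2ny)*(2nz) array and set everything else to zero.
 * Hypothetical helper; dst must hold 8*nx*ny*nz elements. */
void zero_pad_3d(const double *src, double *dst,
                 size_t nx, size_t ny, size_t nz)
{
    const size_t ey = 2 * ny, ez = 2 * nz;
    const size_t total = (2 * nx) * ey * ez;

    for (size_t i = 0; i < total; ++i) {
        dst[i] = 0.0;
    }
    for (size_t m = 0; m < nx; ++m) {
        for (size_t n = 0; n < ny; ++n) {
            /* One contiguous k-line of the original grid. */
            memcpy(dst + (m * ey + n) * ez,
                   src + (m * ny + n) * nz,
                   nz * sizeof(double));
        }
    }
}
```

The extended array is eight times larger than the original, which is the memory price paid for turning the summations in (6) into circular convolutions.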
```c
/* Fragment: all arrays, am, rnorm, rnorm1, and world are declared elsewhere. */

/* Vector-vector product on each process */
for (int i = 0; i < all; i++) {
    /* advance the current vector along the search direction */
    cjx[0][0][i] = cjx[0][0][i] + am * cpx[0][0][i];
    cjy[0][0][i] = cjy[0][0][i] + am * cpy[0][0][i];
    cjz[0][0][i] = cjz[0][0][i] + am * cpz[0][0][i];
    /* update the residual vector */
    crx[0][0][i] = crx[0][0][i] - am * ctx[0][0][i];
    cry[0][0][i] = cry[0][0][i] - am * cty[0][0][i];
    crz[0][0][i] = crz[0][0][i] - am * ctz[0][0][i];
    /* accumulate the local squared norm of the residual */
    reax = abs(crx[0][0][i]);
    reay = abs(cry[0][0][i]);
    reaz = abs(crz[0][0][i]);
    rnorm = rnorm + reax * reax + reay * reay + reaz * reaz;
}

/* Sum norm of each process */
MPI_Allreduce(&rnorm, &rnorm1, 1, MPI_FLOAT, MPI_SUM, world);
```

Algorithm 1
Using the convolution theorem and the FFT, we can rewrite the discrete form of the integral equation (6) as [18, 19]

$$J^{D}_{\xi}(m,n,k) + \frac{1}{4\pi}\chi(m,n,k)\,\mathcal{F}^{-1}\!\left\{\sum_{\zeta=x,y,z}\tilde{G}^{e}_{\xi\zeta}(i,j,l)\,\tilde{J}^{De}_{\zeta}(i,j,l)\right\} = \chi(m,n,k)\,E^{\mathrm{inc}}_{\xi}(m,n,k). \tag{12}$$
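To see why the extension and zero padding make (12) equivalent to (6), the one-dimensional sketch below evaluates the same sum twice: directly, and as a circular convolution of length $2N$ between the extended kernel and the zero-padded vector. It assumes an even kernel (the $+$ sign in (9)) and uses plain $O(N^2)$ loops instead of an FFT so that it stays self-contained; a real implementation would evaluate the circular convolution with FFTs. All function names are hypothetical.

```c
#include <stddef.h>

/* Direct sum y[m] = sum_{m'} g[|m - m'|] * x[m'] for 0 <= m < n. */
void toeplitz_direct(const double *g, const double *x, double *y, size_t n)
{
    for (size_t m = 0; m < n; ++m) {
        double acc = 0.0;
        for (size_t mp = 0; mp < n; ++mp) {
            size_t d = (m >= mp) ? m - mp : mp - m;
            acc += g[d] * x[mp];
        }
        y[m] = acc;
    }
}

/* Same sum written as a circular convolution on the extended domain. */
void toeplitz_circular(const double *g, const double *x, double *y, size_t n)
{
    const size_t len = 2 * n;
    for (size_t m = 0; m < n; ++m) {
        double acc = 0.0;
        for (size_t mp = 0; mp < len; ++mp) {
            size_t s = (m + len - mp) % len;   /* (m - mp) mod 2n */
            double ge;
            if (s < n) {
                ge = g[s];                     /* unchanged half */
            } else if (s == n) {
                ge = 0.0;                      /* only ever meets padded zeros */
            } else {
                ge = g[len - s];               /* mirrored half, as in Eq. (10) */
            }
            double xe = (mp < n) ? x[mp] : 0.0; /* zero padding, as in Eq. (11) */
            acc += ge * xe;
        }
        y[m] = acc;
    }
}
```

Both routines return identical vectors; for `g = {1, 2, 3}` and `x = {1, 1, 1}` the result is `{6, 5, 6}`. Replacing the inner loop of the second routine by forward and inverse FFTs of length $2N$ reduces the cost from $O(N^2)$ to $O(N \log N)$, which is the saving exploited in three dimensions by (12).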
where 119866119890
120585120577(119894 119895 119897) 119869
119863119890
120585(119894 119895 119897) are the discrete Fourier transform
(DFT) of 119866119890
120585120577(119894 119895 119897) 119869
119863119890
120585120577(119894 119895 119897) respectively Similarly the
corresponding adjoint operations can also be performedusing FFT As a consequence we can solve (12) rapidlythrough the CG-FFT algorithm [18] In order to speed upthe FFT calculation the parallel FFT is used to obtain theFFT and inverse FFT results In the proposed algorithmboth the FFT transform and the inverse FFT transform areimplemented using the FFTW library which is a 119862 subrou-tine library for computing the discrete Fourier transformin one or more dimensions and supports the distributed-memory implementation based on message passing interface(MPI) For example to calculate the vector-vector productin the CG-FFT method which can be parallelized by callMPI Allreduce() as shown in Algorithm 1
3 Numerical Results
To illustrate the accuracy and efficiency of the proposed par-allel CG-FFT algorithm we first consider the EM scatteringby a dielectric sphere illuminated by plane waves whichhas a closed-form solution In the following examples thebackground is just free space The dielectric sphere with 120576
119903=
4 119903 = 02m is illuminated by a plane wave The incidentwave is polarized in the 119909 direction and propagating in the
119911 direction in which the operating frequency is 03 GHzThecomparison of numerical results of the internal electric fieldsbetween parallel CG-FFT and analytical results is illustratedin Figure 1 which shows that the numerical results have goodagreement with the analytical resultsWe have also computedthe scattered electric fields from the dielectric object on theobservation plane 119911 = 10m and compared such results withthe exact solutions as shown in Figure 2
Then we do the parallel performance testing on a HPCwhich has 27 nodes shown in Table 1 in which nodes areconnected by 10Gbps Ethernet The benchmark model is ahomogenous cubic dielectric object with 120576
119903= 4 and the edge
of cubic is 04mThe incident wave is the same plane wave asthat in Figure 1 We compare the network latency inside nodeand internode which means that we test the network latencyon one node and between two nodes respectively Figure 3shows the testing results for internode and inside node FromFigure 3 we can see that the speed of network inside nodeis about 4 times of the inter node which will be a bottleneckfor the parallel CG-FFTmethod To evaluate the performanceof the parallel CG-FFT code we define the performance asfollows
Performance (cellss)
=119873119909
times 119873119910
times 119873119911
times Number of iteratioinsSimulation time (s)
(13)
The parallel CG-FFTmethods performance testing resultis demonstrated in Figure 4 and the detail data is listed inTable 2 From Figure 4 we can obtain that the performancegoes up when no more than 8 nodes are used and theperformance goes down using 10 nodesThe reason is that thenetwork latency plays an important role when we use morethan 8 nodes The parallel efficiency is also tested which isdefined as
Parallel Efficiency =1198791
119875 lowast 119879119901
(14)
International Journal of Antennas and Propagation 7
where 119875 is the number of processes 1198791is the running time
used by one process and 119879119901is running time used by 119875
processes Figure 5 shows the parallel efficiency of parallelCG-FFT with different discretization and processes and thedetail results are listed in Table 3 From Figure 5 and Table 3we can see that the parallel efficiency is above 60 when nomore than 8 nodes are used
Finally we use the proposed parallel CG-FFT methodto simulate EM scattering problem from 3D anatomicallyrealistic human head model exposed to the plane waveworking at 900Mhz The popular HUGO model [20] with aresolution of 1 times 1 times 1mm as shown in Figure 6 includes 16different tissues and organs The electromagnetic properties(120576119903and 120590) of 16 tissues in the model can be obtained from
FCC published data [21] as listed in Table 4 In our simu-lation 4 nodes are used and the computation time is about65 minutes Figure 7 shows the electric field on head surfaceWith the object oriented HUGOmodel the field distributionover a specific object can be investigated The electric fielddistributions on eyes bone and brain are demonstrated inFigures 8 9 and 10 respectively
4 Conclusion
In this paper we have analyzed the performance of an efficientMPI parallel implementation of the CG-FFT algorithm onHPC computers In the proposed method the codes canrun not only on share memory systems machine but alsoon distributed ones which present high scalability behaviorSpecial attention was paid to communications during thematrix-vector product and vector-vector product which area key point for the parallel performanceWe solved a problemwith more than 400 million unknowns on a HPC including27 nodes
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
This work was supported in part by the National ScienceFoundation of China under Grant no 61372057 in part byNatural Science Foundation of the Jiangsu Higher EducationInstitutions under Grant no 10KJD180004 and in part byPostgraduate Innovation Project of Jiangsu Province underGrant no CXZZ13 0973
References
[1] R F Harrington Field Computation by Moment MethodsMacMillan New York NY USA 1968
[2] P M Goggans A A Kishk and A W Glisson ldquoElec-tromagnetic scattering from objects composed of multiplehomogeneous regions using a region-by-region solutionrdquo IEEETransactions on Antennas and Propagation vol 42 no 6 pp865ndash871 1994
[3] R D Graglia P L E Uslenghi andR S Zich ldquoMomentmethodwith isoparametric elements for three-dimensional anisotropicscatterersrdquo Proceedings of the IEEE vol 77 no 5 pp 750ndash7601989
[4] J M Jarem ldquoMethod-of-moments solution of a parallel-platewaveguide aperture systemrdquo Journal of Applied Physics vol 59no 10 pp 3566ndash3570 1986
[5] D E Livesay and K Chen ldquoElectromagnetic field inducedinside arbitrarily shaped biological bodiesrdquo IEEE Transactionson Microwave Theory and Techniques vol 22 no 12 pp 1273ndash1280 1974
[6] T K Sarkar and E Arvas ldquoAn integral equation approachto the analysis of finite microstrip antennas volumesurfaceformulationrdquo IEEE Transactions on Antennas and Propagationvol 38 no 3 pp 305ndash312 1990
[7] H Gan and W C Chew ldquoA discrete BCG-FFT algorithmfor solving 3D inhomogeneous scatterer problemsrdquo Journal ofElectromagnetic Waves and Applications vol 9 no 10 pp 1339ndash1357 1995
[8] T J Cui ldquoFast algorithm for electromagnetic scattering byburied 3-D dielectric objects of large sizerdquo IEEE Transactionson Geoscience and Remote Sensing vol 37 no 5 pp 2597ndash26081999
[9] L Zhao and T J Cui ldquoCG-FFT algorithm for EM scattering bysmall dielectric particles with high permittivity and permeabil-ityrdquoMicrowave and Optical Technology Letters vol 49 no 2 pp305ndash310 2007
[10] L Zhao T J Cui and W D Li ldquoAn efficient algorithmfor em scattering by electrically large dielectric objects usingMR-QEB iterative scheme and CG-FFT methodrdquo Progress inElectromagnetics Research vol 67 pp 341ndash355 2007
[11] W Yu R Mittra T Su Y Liu and X Yang Parallel FiniteDifference Time Domain Method Artech House NorwoodMass USA 2006
[12] W Yu X Yang Y Liu et al ldquoNew development of parallelconformal FDTD method in computational electromagneticsengineeringrdquo IEEEAntennas and PropagationMagazine vol 53no 3 pp 15ndash41 2011
[13] J M Taboada M G Araujo F O Basteiro J L Rodriguez andL Landesa ldquoMLFMA-FFT parallel algorithm for the solution ofextremely large problems in electromagneticrdquo Proceedings of theIEEE vol 101 no 2 pp 350ndash363 2013
[14] O P Gandhi G Lazzi and C M Furse ldquoElectromagneticabsorption in the human head and neck for mobile telephonesat 835 and 1900MHzrdquo IEEE Transactions on Microwave Theoryand Techniques vol 44 no 10 pp 1884ndash1897 1996
[15] G Lazzi and O P Gandhi ldquoRealistically tilted and truncatedanatomically based models of the human head for dosimetryof mobile telephonesrdquo IEEE Transactions on ElectromagneticCompatibility vol 39 no 1 pp 55ndash61 1997
[16] A K Lee H D Choi and J I Choi ldquoStudy on SARs inhead models with different shapes by age using SAM modelfor mobile phone exposure at 835MHzrdquo IEEE Transactions onElectromagnetic Compatibility vol 49 no 2 pp 302ndash312 2007
[17] Q-X Li and O P Gandhi ldquoThermal implications of thenew relaxed IEEE RF safety standard for head exposures tocellular telephones at 835 and 1900MHzrdquo IEEE Transactions onMicrowave Theory and Techniques vol 54 no 7 pp 3146ndash31542006
[18] T J Cui and W C Chew ldquoFast algorithm for electromagneticscattering by buried 3-D dielectric objects of large sizerdquo IEEE
8 International Journal of Antennas and Propagation
Transactions on Geoscience and Remote Sensing vol 37 no 5pp 2597ndash2608 1999
[19] J Weaver Applications of Discrete and Continuous FourierAnalysis John Wiley amp Sons New York NY USA 1983
[20] P Bernardi M Cavagnaro S Pisa and E Piuzzi ldquoSpecificabsorption rate and temperature increases in the head of acellular-phone userrdquo IEEE Transactions on Microwave Theoryand Techniques vol 48 no 7 pp 1118ndash1126 2000
[21] httpwwwfccgovfcc-bindielecsh
International Journal of
AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Active and Passive Electronic Components
Control Scienceand Engineering
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
RotatingMachinery
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
Journal ofEngineeringVolume 2014
Submit your manuscripts athttpwwwhindawicom
VLSI Design
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Shock and Vibration
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Civil EngineeringAdvances in
Acoustics and VibrationAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Advances inOptoElectronics
Hindawi Publishing Corporation httpwwwhindawicom
Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
SensorsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chemical EngineeringInternational Journal of Antennas and
Propagation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Navigation and Observation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
DistributedSensor Networks
International Journal of
International Journal of Antennas and Propagation 5
Table 1 The HPC hardware information
CPU type Intel Xeon E5520Clock speed 267GHzNumber of nodes 27Available memory 30 times 12GB (DDR3 1067MHz)Operating system CentOS (Linux)Network system BNT 10Gbps Ethernet
Table 2 Performance test
Cells Processes Number of iterations Times (second)64 times 64 times 64 1 30 228128 times 128 times 128 2 30 904256 times 256 times 128 4 30 2589256 times 256 times 256 6 30 3679256 times 256 times 512 8 30 6123512 times 512 times 512 10 30 32897
Table 3 Parallel efficiency
Cells Processes Time (second) Parallel efficiency
64 times 64 times 64
1 228 12 1238 0924 72 080
128 times 128 times 128
1 1681 12 904 0934 506 0838 318 066
256 times 256 times 512
1 29883 12 16418 0914 9401 0798 6123 061
512 times 512 times 512
1 194092 12 109046 0894 64697 07510 32897 05912 31714 051
where 119885 is an 119873 times 119873 system matrix 119868 is a column vectorwith the coefficients of the unknown currents and 119881 isa column vector associated with the incident fields in thedielectric object Here 119873 = 3119873
119909119873119910
119873119911is the total number
of unknowns However the inner products in (6) are all 3Dsummations of the products of discrete Greenrsquos functions anddiscrete electric currents which are quite time and memoryconsuming For electric-large electromagnetic problems 119873
is very large and it is very difficult to solve (8) directly Inorder to calculate fast the products of Greenrsquos functions andelectric currents the discrete Greenrsquos functions are extendedin a larger computational domain as
119866119890
120585120589(119898 119899 119896) = plusmn119866
119863
120585120589(1198980 1198990 1198960) (9)
Table 4 Tissue parameters for HUGOmodel
Tissue type Relativepermittivity
Relativepermeability Conductivity
Skin 41405334 1 086678Fat 11333888 1 0109162Muscle 56879063 1 0995364Cartilage 42653103 1 0782333Cerebrospinal fluid 68638336 1 2412575Sclera 5527013 1 1166726Vitreous 6890184 1 1636162Lens nucleus 35841595 1 0484917Grey matter 52724701 1 0942193White matter 38886288 1 0590815Nerve 32530067 1 0573612Thyroid 59683323 1 1038448Tongue 5527013 1 0936192Bone 20787804 1 0339975Blood 61360718 1 1538069Air 1 1 0
where $0 \le m \le 2N_x - 1$, $0 \le n \le 2N_y - 1$, $0 \le k \le 2N_z - 1$, and

$$
\begin{aligned}
m_0 &= \begin{cases} m, & 0 \le m \le N_x - 1, \\ 2N_x - m, & N_x \le m \le 2N_x - 1, \end{cases} \\
n_0 &= \begin{cases} n, & 0 \le n \le N_y - 1, \\ 2N_y - n, & N_y \le n \le 2N_y - 1, \end{cases} \\
k_0 &= \begin{cases} k, & 0 \le k \le N_z - 1, \\ 2N_z - k, & N_z \le k \le 2N_z - 1. \end{cases}
\end{aligned}
\tag{10}
$$
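The index folding in (10) is applied independently along each axis, so it can be captured by one small helper. The following is a minimal sketch, not the authors' code; the name `fold_index` is hypothetical.

```c
/* Equation (10) along one axis: map an index m of the doubled grid
 * (0 .. 2n-1) to the index m0 of the original discrete Green's function.
 * Note that m = n maps to m0 = n, one past the last original sample. */
int fold_index(int m, int n)
{
    return (m <= n - 1) ? m : 2 * n - m;
}
```

With `n = 4`, the doubled-grid indices 0 to 7 fold to 0, 1, 2, 3, 4, 3, 2, 1. The same helper would be called with $N_x$, $N_y$, and $N_z$ to obtain $m_0$, $n_0$, and $k_0$.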
The signs of the expanded discrete Green's functions are directly related to the even and odd nature of the components with respect to the coordinates in different extended subdomains. After defining the extended Green's functions, the equivalent electric current $J^{D}_{\xi}(m,n,k)$ can be defined in the extended domain by zero padding as

$$
J^{De}_{\xi}(m,n,k) =
\begin{cases}
J^{D}_{\xi}(m,n,k), & 0 \le m \le N_x - 1,\ 0 \le n \le N_y,\ 0 \le k \le N_z, \\
0, & \text{else}.
\end{cases}
\tag{11}
$$
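The reason for combining the extension (9)-(10) with the zero padding (11) can be checked in one dimension: a circular convolution of length $2N$ then reproduces the original finite sum exactly, and a circular convolution is what the FFT evaluates quickly. The sketch below is illustrative only. It assumes an even kernel (the "+" sign in (9)), uses a direct $O(N^2)$ loop in place of the FFT, and its function names are hypothetical.

```c
#include <stdlib.h>

/* Reference sum: out[m] = sum_{p=0}^{n-1} g[|m - p|] * j[p], m = 0..n-1. */
void direct_sum(const double *g, const double *j, double *out, int n)
{
    for (int m = 0; m < n; m++) {
        out[m] = 0.0;
        for (int p = 0; p < n; p++)
            out[m] += g[abs(m - p)] * j[p];
    }
}

/* Same result from a circular convolution of length 2n.
 * g must hold n + 1 samples because (10) maps m = n to m0 = n;
 * that sample never contributes to out[0..n-1].
 * Returns 0 on success, -1 if allocation fails. */
int extended_circular_sum(const double *g, const double *j, double *out, int n)
{
    int len = 2 * n;
    double *ge = malloc((size_t)len * sizeof *ge);
    double *je = malloc((size_t)len * sizeof *je);
    if (!ge || !je) {
        free(ge);
        free(je);
        return -1;
    }
    for (int m = 0; m < len; m++) {
        ge[m] = g[m < n ? m : len - m];  /* extension as in (9)-(10) */
        je[m] = (m < n) ? j[m] : 0.0;    /* zero padding as in (11) */
    }
    for (int m = 0; m < n; m++) {
        out[m] = 0.0;
        for (int p = 0; p < len; p++)
            out[m] += ge[(m - p + len) % len] * je[p];
    }
    free(ge);
    free(je);
    return 0;
}
```

Both routines return the same first $N$ outputs. In an FFT-based code, the inner circular convolution would be replaced by a forward transform, a pointwise product, and an inverse transform.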
    /* Vector-vector product on each process */
    for (int i = 0; i < all; i++) {
        cjx[0][0][i] = cjx[0][0][i] + am * cpx[0][0][i];
        cjy[0][0][i] = cjy[0][0][i] + am * cpy[0][0][i];
        cjz[0][0][i] = cjz[0][0][i] + am * cpz[0][0][i];
        crx[0][0][i] = crx[0][0][i] - am * ctx[0][0][i];
        cry[0][0][i] = cry[0][0][i] - am * cty[0][0][i];
        crz[0][0][i] = crz[0][0][i] - am * ctz[0][0][i];
        reax = abs(crx[0][0][i]);
        reay = abs(cry[0][0][i]);
        reaz = abs(crz[0][0][i]);
        rnorm = rnorm + reax * reax + reay * reay + reaz * reaz;
    }
    /* Sum norm of each process */
    MPI_Allreduce(&rnorm, &rnorm1, 1, MPI_FLOAT, MPI_SUM, world);

Algorithm 1
Using the convolution theorem and the FFT, we can obtain the discrete form of the integral equation (6) [18, 19]:

$$
J^{D}_{\xi}(m,n,k) + \frac{1}{4\pi}\,\chi(m,n,k)\,\mathcal{F}^{-1}\Big\{ \sum_{\xi=x,y,z} \tilde{G}^{e}_{\xi\zeta}(i,j,l)\,\tilde{J}^{De}_{\xi}(i,j,l) \Big\} = \chi(m,n,k)\,E^{\mathrm{inc}}_{\xi}(m,n,k),
\tag{12}
$$

where $\tilde{G}^{e}_{\xi\zeta}(i,j,l)$ and $\tilde{J}^{De}_{\xi}(i,j,l)$ are the discrete Fourier transforms (DFTs) of $G^{e}_{\xi\zeta}(m,n,k)$ and $J^{De}_{\xi}(m,n,k)$, respectively. Similarly, the corresponding adjoint operations can also be performed using the FFT. As a consequence, we can solve (12) rapidly through the CG-FFT algorithm [18]. In order to speed up the FFT calculation, a parallel FFT is used to obtain the FFT and inverse FFT results. In the proposed algorithm, both the FFT and the inverse FFT are implemented using the FFTW library, which is a C subroutine library for computing the discrete Fourier transform in one or more dimensions and supports distributed-memory implementations based on the message passing interface (MPI). For example, the vector-vector product in the CG-FFT method can be parallelized by calling MPI_Allreduce(), as shown in Algorithm 1.
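The reduction pattern behind Algorithm 1 can be illustrated without an MPI installation. The following is a minimal serial sketch, not the authors' code: `cplx`, `local_sq_norm`, and `reduce_sum` are hypothetical names, and `reduce_sum` merely stands in for the total that `MPI_Allreduce` with `MPI_SUM` would deliver to every rank.

```c
#include <stddef.h>

/* Hypothetical single-precision complex sample. */
typedef struct {
    float re;
    float im;
} cplx;

/* Squared 2-norm of the slice of the residual owned by one process. */
float local_sq_norm(const cplx *r, size_t count)
{
    float sum = 0.0f;
    for (size_t i = 0; i < count; i++)
        sum += r[i].re * r[i].re + r[i].im * r[i].im;
    return sum;
}

/* Serial stand-in for the global sum: adds the per-process partial norms.
 * In an MPI code this step would be a single MPI_Allreduce call. */
float reduce_sum(const float *partial, int nranks)
{
    float total = 0.0f;
    for (int p = 0; p < nranks; p++)
        total += partial[p];
    return total;
}
```

Each process only touches its own slice, so the only communication needed for the global norm is one scalar per process. This is why the vector-vector product parallelizes cheaply compared with the distributed FFT.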
3 Numerical Results
To illustrate the accuracy and efficiency of the proposed parallel CG-FFT algorithm, we first consider EM scattering by a dielectric sphere illuminated by a plane wave, which has a closed-form solution. In the following examples, the background is free space. The dielectric sphere, with $\varepsilon_r = 4$ and radius $r = 0.2$ m, is illuminated by a plane wave that is polarized in the $x$ direction and propagates in the $z$ direction at an operating frequency of 0.3 GHz. The internal electric fields computed by parallel CG-FFT are compared with the analytical results in Figure 1, which shows good agreement between the two. We have also computed the scattered electric fields from the dielectric object on the observation plane $z = 10$ m and compared them with the exact solutions, as shown in Figure 2.
We then test the parallel performance on an HPC cluster with 27 nodes, described in Table 1, in which the nodes are connected by 10 Gbps Ethernet. The benchmark model is a homogeneous dielectric cube with $\varepsilon_r = 4$ and an edge length of 0.4 m. The incident wave is the same plane wave as in Figure 1. We compare the network latency inside a node and between nodes; that is, we test the network latency on one node and between two nodes, respectively. Figure 3 shows the results for both cases. From Figure 3, we can see that the network speed inside a node is about 4 times that between nodes, so internode communication will be a bottleneck for the parallel CG-FFT method. To evaluate the performance of the parallel CG-FFT code, we define the performance as follows:

$$
\text{Performance (cells/s)} = \frac{N_x \times N_y \times N_z \times \text{Number of iterations}}{\text{Simulation time (s)}}.
\tag{13}
$$
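Definition (13) translates directly into a one-line helper. This is a sketch with a hypothetical function name, and the inputs in the usage note are illustrative values rather than measurements from the paper.

```c
/* Equation (13): grid cells processed per second of simulation time. */
double cells_per_second(long long nx, long long ny, long long nz,
                        int iterations, double seconds)
{
    return (double)(nx * ny * nz) * iterations / seconds;
}
```

For example, a 64 × 64 × 64 grid run for 30 iterations in a hypothetical 10 s gives 262144 × 30 / 10 = 786432 cells/s.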
The performance of the parallel CG-FFT method is shown in Figure 4, and the detailed data are listed in Table 2. From Figure 4, we can see that the performance increases when no more than 8 nodes are used and decreases when 10 nodes are used. The reason is that network latency plays an important role when more than 8 nodes are used. The parallel efficiency is also tested, which is defined as

$$
\text{Parallel Efficiency} = \frac{T_1}{P \times T_p},
\tag{14}
$$
where $P$ is the number of processes, $T_1$ is the running time used by one process, and $T_p$ is the running time used by $P$ processes. Figure 5 shows the parallel efficiency of parallel CG-FFT for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3, we can see that the parallel efficiency is above 60% when no more than 8 nodes are used.
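Definition (14) can likewise be written as a small helper. The function name is hypothetical, and the timings in the usage note are made-up values chosen only to show the arithmetic.

```c
/* Equation (14): parallel efficiency T_1 / (P * T_p).
 * t1 is the one-process running time, tp the running time on
 * `processes` processes; a value of 1.0 means ideal scaling. */
double parallel_efficiency(double t1, int processes, double tp)
{
    return t1 / (processes * tp);
}
```

For instance, if a run takes 100 s on one process and 31.25 s on 4 processes, the efficiency is 100 / (4 × 31.25) = 0.8, that is, 80%.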
Finally, we use the proposed parallel CG-FFT method to simulate EM scattering from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model [20], with a resolution of 1 × 1 × 1 mm, as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model can be obtained from data published by the FCC [21], as listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. With the object-oriented HUGO model, the field distribution over a specific object can be investigated. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.
4 Conclusion
In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC computers. The code can run not only on shared-memory machines but also on distributed-memory systems, which offer high scalability. Special attention was paid to communication during the matrix-vector and vector-vector products, which is a key point for parallel performance. We solved a problem with more than 400 million unknowns on an HPC cluster with 27 nodes.
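The figure of more than 400 million unknowns is consistent with the largest grid in Table 2, if one assumes that this is the case being referred to (the text does not say so explicitly). With three current components per cell, the count follows from $N = 3N_xN_yN_z$:

$$
N = 3 \times 512 \times 512 \times 512 = 3 \times 134\,217\,728 = 402\,653\,184 \approx 4.03 \times 10^{8}.
$$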
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by the Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13 0973.
References
[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.
[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, “Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution,” IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.
[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, “Moment method with isoparametric elements for three-dimensional anisotropic scatterers,” Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.
[4] J. M. Jarem, “Method-of-moments solution of a parallel-plate waveguide aperture system,” Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.
[5] D. E. Livesay and K. Chen, “Electromagnetic field induced inside arbitrarily shaped biological bodies,” IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.
[6] T. K. Sarkar and E. Arvas, “An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation,” IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.
[7] H. Gan and W. C. Chew, “A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems,” Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.
[8] T. J. Cui, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[9] L. Zhao and T. J. Cui, “CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability,” Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.
[10] L. Zhao, T. J. Cui, and W. D. Li, “An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method,” Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.
[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.
[12] W. Yu, X. Yang, Y. Liu, et al., “New development of parallel conformal FDTD method in computational electromagnetics engineering,” IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.
[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, “MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetic,” Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.
[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, “Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.
[15] G. Lazzi and O. P. Gandhi, “Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones,” IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.
[16] A. K. Lee, H. D. Choi, and J. I. Choi, “Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz,” IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.
[17] Q.-X. Li and O. P. Gandhi, “Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.
[18] T. J. Cui and W. C. Chew, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.
[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, “Specific absorption rate and temperature increases in the head of a cellular-phone user,” IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.
[21] http://www.fcc.gov/fcc-bin/dielec.sh
International Journal of Antennas and Propagation 7
where 119875 is the number of processes 1198791is the running time
used by one process and 119879119901is running time used by 119875
processes Figure 5 shows the parallel efficiency of parallelCG-FFT with different discretization and processes and thedetail results are listed in Table 3 From Figure 5 and Table 3we can see that the parallel efficiency is above 60 when nomore than 8 nodes are used
Finally we use the proposed parallel CG-FFT methodto simulate EM scattering problem from 3D anatomicallyrealistic human head model exposed to the plane waveworking at 900Mhz The popular HUGO model [20] with aresolution of 1 times 1 times 1mm as shown in Figure 6 includes 16different tissues and organs The electromagnetic properties(120576119903and 120590) of 16 tissues in the model can be obtained from
FCC published data [21] as listed in Table 4 In our simu-lation 4 nodes are used and the computation time is about65 minutes Figure 7 shows the electric field on head surfaceWith the object oriented HUGOmodel the field distributionover a specific object can be investigated The electric fielddistributions on eyes bone and brain are demonstrated inFigures 8 9 and 10 respectively
4 Conclusion
In this paper we have analyzed the performance of an efficientMPI parallel implementation of the CG-FFT algorithm onHPC computers In the proposed method the codes canrun not only on share memory systems machine but alsoon distributed ones which present high scalability behaviorSpecial attention was paid to communications during thematrix-vector product and vector-vector product which area key point for the parallel performanceWe solved a problemwith more than 400 million unknowns on a HPC including27 nodes
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
This work was supported in part by the National ScienceFoundation of China under Grant no 61372057 in part byNatural Science Foundation of the Jiangsu Higher EducationInstitutions under Grant no 10KJD180004 and in part byPostgraduate Innovation Project of Jiangsu Province underGrant no CXZZ13 0973
References
[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.
[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, “Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution,” IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994.
[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, “Moment method with isoparametric elements for three-dimensional anisotropic scatterers,” Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989.
[4] J. M. Jarem, “Method-of-moments solution of a parallel-plate waveguide aperture system,” Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986.
[5] D. E. Livesay and K. Chen, “Electromagnetic field induced inside arbitrarily shaped biological bodies,” IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974.
[6] T. K. Sarkar and E. Arvas, “An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation,” IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990.
[7] H. Gan and W. C. Chew, “A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems,” Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.
[8] T. J. Cui, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[9] L. Zhao and T. J. Cui, “CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability,” Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007.
[10] L. Zhao, T. J. Cui, and W. D. Li, “An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method,” Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007.
[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.
[12] W. Yu, X. Yang, Y. Liu, et al., “New development of parallel conformal FDTD method in computational electromagnetics engineering,” IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011.
[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, “MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetic,” Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013.
[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, “Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996.
[15] G. Lazzi and O. P. Gandhi, “Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones,” IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997.
[16] A. K. Lee, H. D. Choi, and J. I. Choi, “Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz,” IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007.
[17] Q.-X. Li and O. P. Gandhi, “Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz,” IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006.
[18] T. J. Cui and W. C. Chew, “Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999.
[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.
[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, “Specific absorption rate and temperature increases in the head of a cellular-phone user,” IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000.
[21] http://www.fcc.gov/fcc-bin/dielec.sh