IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 54, NO. 7, JULY 2006 3069

RBF Network Optimization of Complex Microwave Systems Represented by Small FDTD Modeling Data Sets

Ethan K. Murphy and Vadim V. Yakovlev, Member, IEEE

Abstract: This paper outlines an original algorithm of neural optimization backed by three-dimensional full-wave finite-difference time-domain (FDTD) simulation and suitable for viable computer-aided design of complex microwave (MW) systems. The frequency response of an S-parameter is optimized with a decomposed radial basis function (RBF) network capable of dealing with various MW devices. The key feature of the optimization is the dynamic generation of as much FDTD data as the network needs to find a solution satisfying the constraints or the stopping criteria. Other functions contributing to the reduction of computational cost include a choice of an RBF type, radius optimization of the Gaussian RBF, optimization of the regularization parameter, etc. Performance of the algorithm is illustrated by its application to systems that can be adequately described only with full-wave numerical analysis: a double waveguide window, a loaded MW oven, and a patch antenna with two long slits. In all these examples, the network demonstrates excellent generalizing capabilities with the use of relatively small data sets, and the optimized solutions are obtained within fairly reasonable time. The algorithm is shown to be advantageous over conventional gradient and nongradient local-optimization techniques because it is independent of the starting point and has the potential to find the “best” local optimum in the specified domain. Finally, parameters of FDTD simulations and the network operations influencing the computational cost of the optimization are thoroughly discussed.

Index Terms: Artificial neural networks (ANNs), computer-aided design (CAD), dynamic generation of data, electromagnetic (EM) optimization, full-wave simulation, radial basis functions (RBFs).

I. INTRODUCTION

It has been recently discussed that, while new-generation numerical methods and their computer implementations allow for building quite accurate models of many microwave (MW) devices, routine system analysis readily available from the computer simulators may not always result in useful instructions for better design. The present significant interest in MW optimization and computer-aided design (CAD) tools is, therefore, logical and strongly motivated by practice. Reviewing numerous publications on the subject, one can conclude that along with a “classical” space-mapping (SM) technology [1], [2] and a number of SM-based techniques [3]–[8], artificial neural network (ANN)-based approaches [9]–[16] have established their leading role among the most attractive and powerful instruments of MW optimization.

Manuscript received December 20, 2005; revised April 3, 2006. This work was supported in part by the European Aeronautic Defence and Space Company Foundation.

E. K. Murphy was with the Department of Mathematical Sciences, Worcester Polytechnic Institute, Worcester, MA 01609 USA. He is now with the Department of Mathematics, Colorado State University, Fort Collins, CO 80523 USA (e-mail: [email protected]).

V. V. Yakovlev is with the Department of Mathematical Sciences, Worcester Polytechnic Institute, Worcester, MA 01609 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TMTT.2006.877059

Despite several reports of the inclusion of resourceful full-wave three-dimensional (3-D) electromagnetic (EM) simulators in ANN optimization of MW structures (e.g., [11], [17]–[21]), this approach is generally considered unfeasible due to the high computational cost [2], [16]. The SM-based ANN optimization schemes develop an alternative design technique implementing intelligent handling of “fine” (slow and highly accurate) and “coarse” (fast and approximately accurate) models. For the latter, data are typically obtained from empirical equivalent-circuit models, analytical expressions based on quasi-static approximations, or other fairly idealized approaches (see, e.g., [6], [7], and [22]–[24]), which, in turn, suggest architectures of the networks exploited for optimizing the respective devices. Optimized structures reported in the literature include a variety of filters [6]–[8], [22], [25], MW circuits [6], [8], [24], a multilayer transmission line [26], coplanar waveguides [27], a high electron-mobility transistor (HEMT) [28], certain integrated circuits [29], very large scale integration (VLSI) interconnects [30], simple patch/slot antennas [23], [31], and some other devices.

Proven to be very efficient, the SM-ANN methods, however, are naturally applicable only to those MW systems for which the relevant simplified approaches are available. On the other hand, the physics of many MW and millimeter-wave devices (e.g., complex/partially filled waveguide/resonator structures, nontrivial junctions of transmission lines, complicated directional couplers, power dividers, phase shifters, various planar/aperture/lens antennas, apparatuses for MW processing and medical applications, etc.) is too complex to be represented (even approximately) by empirical/analytical models, but is accessible for 3-D full-wave numerical analysis. Engineering practice presents numerous situations when CAD of systems of this type is highly preferable.

When the involvement of EM simulation for characterizing an optimized system is imperative, possible non-ANN approaches include stochastic (like genetic algorithms [32] or simulated annealing [33]) and hybrid (DIRECT with Kriging metamodeling [34]) global-optimization schemes, which principally rely on fast modeling tools (the method of moments in [32] and the hybrid finite-element boundary-integral method in [33] and [34]) not necessarily applicable to a variety of MW systems. The local-optimization algorithm based on response surface methodology [35] does not impose this requirement, but the efficiency of this technique seems insufficient to handle complex structures.

0018-9480/$20.00 © 2006 IEEE

Looking at the options of widening the class of EM solvers, it seems feasible to consider the so-called conventional neural optimization (CNO) [13], [16], [36], which assumes the use of model responses obtained from full-wave numerical analysis. However, the corresponding CNO capabilities in optimizing complex systems seem to have been insufficiently explored. The relevant approaches in constructing efficient network architectures include a segmentation of a MW circuit [11] along with the use of small neuromodels with each individual section [37], distribution of the learning task among a number of ANNs (concept of decomposition) [38]–[40], and a hierarchical neural-network approach [41]. In [11], improved convergence of the learning algorithm is attributed to a special radial wavelet neural network employed. To alleviate the principal problem of a heavy computational load, the design of experiment methodology is used in [15] and [42] to carefully select the necessary learning points; preliminary neural clusterization of similar responses employing the self-organizing feature map is proposed in [12]. Very few of these techniques are associated with 3-D full-wave simulation (i.e., [11] and [13] with the finite-element method), and none has been tested to find out whether it is capable of significantly improving the efficiency (primarily, CPU time) of ANN optimization backed by the data obtained from full-wave analysis of complex MW systems.

The question of how far this technology is able to go in performing viable optimization of complex MW devices is currently becoming practically important due to the increasing availability of universal commercial EM simulators (as the instruments of fairly accurate system analysis) and the increasing productivity of modern computers. A direct approach to this problem can be reduced to finding special methods, options, and tricks that could make the CNO approach combined with full-wave solvers optimize (or at least essentially improve) frequency responses of S-parameters of structures (of at least reasonable complexity) characterized by not too many design variables and perform such optimization within a reasonable time. For straightforward ANN techniques ([43], [44]), the CPU requirement for four- or five-parameter optimization may already be prohibitive, whereas a compulsory (even physically motivated) strategy of choosing fewer design variables (i.e., reducing the dimensionality of the input space) may not lead to an optimal solution [43].

As for the underlying technique of 3-D full-wave analysis, the conformal finite-difference time-domain (FDTD) method seems to be the primary candidate due to the high accuracy and adequacy of its models and its advantageous use of computer memory (in particular, compared with the finite-element algorithms). An iterative process of reaching steady state (i.e., a converged solution) may be a particular bonus: computational time for generating data may be shortened/prolonged as necessary at the expense of accuracy, which gives another degree of freedom in an optimization process. On the other hand, the comparative analysis [45] shows the advanced functionalities of the solvers implementing the conformal FDTD method when modeling systems and components of MW power engineering, which are typically electrically large and complex structures. Finally, successful examples of combining the FDTD method with neural-network schemes in order to design particular MW circuits and devices are reported in [19]–[21].

In this paper, we, therefore, explore the options for viable FDTD-backed ANN optimization suitable for complex MW and millimeter-wave systems. Our technique follows the basic concepts of CNO and employs the radial basis function (RBF) ANN [46]–[48]. We describe in detail an algorithm incorporating a series of measures taken in order to solve an optimization problem with less FDTD data (and, hence, in less time). (A brief account of this algorithm has been given earlier in [49].) The main effect here is achieved by a special way of forming a database (DB) and particular features of the network training. Instead of first building the entire DB (whose sufficient size should be guessed anyway and may become unnecessarily large) and then training the network, in our scheme, FDTD data are put in the DB as long as they are needed. This process is conditioned by a training error and the scenario being optimized.

Dynamic control over operations of an EM simulator is known to be a useful trick. For instance, pursuing the goal of minimizing the number of EM analyses in the combined EM/circuit co-optimization [50], the dynamic “on-the-fly” technique [50], [51] arranges for a minimal number of full-wave simulations as required by the multidimensional and minimal-order linear interpolation scheme. A dynamic generation of training/testing data controlled by neural-network error criteria is used in [52] for automatic development of neural models of MW circuits. In this context, our approach is original: we dynamically generate numerical data for the decomposed RBF ANN (re-trained after getting a new portion of data) in order to directly optimize complex MW devices given only user-defined specifications and constraints. Furthermore, we employ several functions, which may lead to further reduction of computational load. These include optimization of the radius of the Gaussian RBF, optimization of the regularization parameter controlling smoothness of data, choosing the type of an RBF function [local (Gaussian) or global (cubic)], placement of the RBF centers on the points of the input vector of design variables, inclusion of intermediate minima (found in the course of dynamic generation of data) in the DB, and some other options.

Given the ability of an FDTD solver and a decomposed network to handle various MW systems, our algorithm is designed to be useful in open-ended practical problems. It helps the user clarify uncertainties about the “right” design variables, their ranges, and reasonable constraints. A special stopping criterion promptly determines the likelihood of an optimal solution in the selected input space.

Finally, we illustrate operations of our procedure by running wideband optimization of frequency responses of the systems, which cannot be characterized by simplified models (and, hence, optimized by SM-based techniques): a double waveguide window (WW), a loaded MW oven, and a patch antenna with a pair of wide long slits. In all three examples, we reach either optimal or significantly improved characteristics. We show that the optimized solutions can be obtained for applied MW systems with a relatively small number of DB points and discuss the options and conditions under which FDTD-backed neural optimization may make CAD of some complex MW systems viable.

Fig. 1. Architecture of a decomposed RBF ANN with hidden neurons. (Color version available online at: http://ieeexplore.ieee.org.)

II. OPTIMIZATION ALGORITHM

A. RBF Network Model

Aiming to optimize complex MW devices, we build the underlying RBF ANN in accordance with the concept of network decomposition; indeed, the complete set of FDTD responses can be difficult to approximate with a single network. On the other hand, this approach should allow us to use the same network for optimizing different MW systems.

In the proposed ANN architecture (Fig. 1), the learning task is distributed among a number of sub-networks, which divide the output space into a set of subspaces. The network works with input vectors

(1)

and output vectors

(2)

where and are system parameters (design variables), is the value of an S-parameter at the th frequency, and is the number of equally spaced response sample points in the frequency interval in which the optimization is to be performed.

In our analysis, a certain form of a frequency response of a particular S-parameter is considered as an objective function of the optimal design. It is supposed that for any allowable set of design variables, this response can be obtained with the use of numerical (i.e., 3-D FDTD) simulation. With the use of these simulations, we generate samples of input–output pairs such that the data set is made of the matrices

(3)

(4)

Our RBF network is coupled with a linear model

(5)

where are weight matrices, and is a matrix function containing RBFs. In other words, we consider the mapping . Combining (3) and (5), we obtain (6), where is an RBF, is the number of RBFs, is the matrix incorporating RBF with the linear model, and is the mapped data, i.e., is an approximation of .

B. Error Function

In this study, we use two types of RBFs: the local Gaussian RBF

(7)

with being the radius of the th center, and the global cubic RBF

(8)

where is the center of and .
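A minimal Python sketch of the two basis functions named above. Because (7) and (8) are not legible in this copy, the forms used here are the common textbook ones, exp(-||x - c||^2 / r^2) for the Gaussian and ||x - c||^3 for the cubic; they are assumptions, and the function names are illustrative only.

```python
import math

def distance(x, c):
    """Euclidean distance between an input vector x and a center c."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))

def gaussian_rbf(x, c, r):
    """Local RBF (assumed form): equals 1 at the center and decays with radius r."""
    return math.exp(-(distance(x, c) ** 2) / r ** 2)

def cubic_rbf(x, c):
    """Global RBF (assumed form): grows as the cube of the distance to the center."""
    return distance(x, c) ** 3

center = [0.0, 0.0]
near, far = [0.1, 0.0], [3.0, 4.0]
print(gaussian_rbf(near, center, r=1.0), gaussian_rbf(far, center, r=1.0))
print(cubic_rbf(near, center), cubic_rbf(far, center))
```

The Gaussian responds only near its center, which is why its radius matters, whereas the cubic has no radius to tune and influences the whole domain.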

(6)


In order to fully describe and appropriately choose the correct , the weight matrix needs to be specified. To do that, we partition the data set into a training set, of elements, and a testing set, of elements (with ). When training the network, the following error function is, therefore, minimized:

(9)

where is the th column of . Following [48], we minimize function (9) by the weight matrix

(10)

In the training, the number of RBFs is assumed to be the same as the number of training points , and we pick the centers of the RBFs exactly on the points of (i.e., the center of is ) so the entire domain is essentially covered by the RBFs. Given from (10), this yields zero training error . Convenient for programming, this technique is known to be practical for input domains whose dimension is not excessively high [53]. It is also consistent with our algorithm, which employs time-consuming full-wave modeling and, therefore, does not pretend to handle the “curse of dimensionality” issue.
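The zero-training-error property can be illustrated with a small sketch. It assumes a single (non-decomposed) network without the linear model, one scalar design variable, the textbook Gaussian form, and a toy sine curve standing in for simulated responses; all names are hypothetical. With as many RBFs as training points, the design matrix is square, so the least-squares weights reduce to the solution of a linear system.

```python
import math

def gaussian(x, c, r):
    """Gaussian RBF (assumed form) for scalar inputs."""
    return math.exp(-((x - c) ** 2) / r ** 2)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[piv] = m[piv], m[col]
        for k in range(col + 1, n):
            f = m[k][col] / m[col][col]
            for j in range(col, n + 1):
                m[k][j] -= f * m[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def train(xs, ys, r):
    """Centers sit on the training points, so the design matrix is square."""
    h = [[gaussian(x, c, r) for c in xs] for x in xs]
    return solve(h, ys)

def predict(x, centers, weights, r):
    return sum(w * gaussian(x, c, r) for w, c in zip(weights, centers))

xs = [0.0, 0.25, 0.5, 0.75, 1.0]                 # toy training inputs
ys = [math.sin(2.0 * math.pi * x) for x in xs]   # toy stand-in for responses
r = 0.3
w = train(xs, ys, r)
train_err = max(abs(predict(x, xs, w, r) - y) for x, y in zip(xs, ys))
print(f"max training error: {train_err:.2e}")
print(f"prediction at x=0.6: {predict(0.6, xs, w, r):.3f}")
```

The training error is zero up to rounding, so the quality of the model has to be judged on separate testing points, as the next subsection does.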

C. Network Testing With Gaussian and Cubic RBFs

After training, the ANN model is tested to see how well itgeneralizes/approximates data that was not learned in training.We test the network by the data formed in the matrix

(11)

In the case of Gaussian RBF, the testing mean square error iswritten as

(12)

where is a vector of of size . When it comes to choosing the Gaussian radius, the strategy of RBF ANN modeling is known not to be clearly defined [38]. To this end, we consider the vector in (12) as an attractive parameter to be optimized. It could be found as the same (best) radius for all RBFs (scalar optimization), or the best radius for each individual RBF (vector optimization). Hence, we find the optimal , which makes the error (12) minimal; to do this, we use a numerical least squares method.

As for the cubic function, the measure of accuracy in network testing is represented by the following expression:

(13)

D. Regularizing the Problem With Ridge Regression

Since the norm of the weight matrix can be arbitrary, a large norm may make the hyper-surface nonsmooth, which may mean that our network interpolates poorly. Generally, the weight matrix is ill-conditioned and the problem requires regularization [47], [54].

To resolve the problem, we use ridge regression, which adds a penalty term to the sum-squared-error and allows for a smoother interpolating function. We rewrite (9) as

(14)

where is the regularization parameter controlling the smoothness of data. In accordance with the scheme described in [48], we minimize the function (14) by the weight matrix

(15)

If is large, the minimization of (14) forces the norm of to be small, while if is close to zero, then the minimization yields a small penalty to the norm of . With , (14) and (15) become (9) and (10), respectively.
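The effect of the regularization parameter on the weight norm can be sketched as follows. The closed form used, (H^T H + lambda I) w = H^T y, is the standard ridge-regression solution and is assumed here to correspond to (15); the two-point toy data, the Gaussian form, and the radius are hypothetical.

```python
import math

def gaussian(x, c, r=0.5):
    """Gaussian RBF (assumed form) for scalar inputs."""
    return math.exp(-((x - c) ** 2) / r ** 2)

def ridge_weights(h, y, lam):
    """Solve (H^T H + lam I) w = H^T y for two RBFs using Cramer's rule."""
    a11 = sum(row[0] * row[0] for row in h) + lam
    a12 = sum(row[0] * row[1] for row in h)
    a22 = sum(row[1] * row[1] for row in h) + lam
    b1 = sum(row[0] * yi for row, yi in zip(h, y))
    b2 = sum(row[1] * yi for row, yi in zip(h, y))
    det = a11 * a22 - a12 * a12
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]

centers = [0.0, 1.0]   # centers placed on the two training points
xs = [0.0, 1.0]
ys = [1.0, -0.5]       # toy responses
h = [[gaussian(x, c) for c in centers] for x in xs]

norms = []
for lam in (0.0, 0.1, 1.0, 10.0):
    w = ridge_weights(h, ys, lam)
    norms.append(math.hypot(w[0], w[1]))
    print(f"lambda={lam:5.1f}  ||w||={norms[-1]:.4f}")
```

With lambda set to zero the weights interpolate the data exactly; increasing lambda shrinks the weight norm at the cost of a nonzero training error.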

In order to take control over the norm of , optimizing with respect to the error might be an attractive option. If the RBF is the Gaussian function, two-parameter optimization may be performed with respect to and . Using (14) and (15), we create the weight matrix and minimize the following error function:

(16)

In the case of the cubic RBF, we can optimize only the term and work with the error function

(17)

Similar to (12) and (13), both errors (16) and (17) are minimized numerically.


E. Finding Optimal Design

With the mapping established, we address the minimization/maximization of the objective function corresponding to (2): we find the optimal value with respect to the optimal input parameters by solving

(18)

where is determined from the training set . The problem of maximizing is analogous to (18).

The decision on whether or not a solution is optimal is made in accordance with the following. We define a solution to be optimal if is a vector such that all its elements are

for (19)

where is the constraint representing the upper (lower) tolerance allowable for an optimum solution to the minimization (maximization) problem.

When a typical vector represents a rather smooth curve, we solve problem (18) as stated taking the data explicitly from the FDTD simulator. For better results with highly nonlinear data, we deal with the vector produced from (2) such that the th component is a 1 × 3 vector of the following form:

(20)

In this case, we deal with the mapping , and the minimization of is equivalent to simultaneously minimizing the average value of elements of , the slopes of the curve, and the area under the curve corresponding to (2).

This formulation is intended to facilitate the search for wideband characteristics of resonant structures. It is motivated by the fact that highly resonant curves, in general, do not pass a low-tolerance constraint in the interval even though they may have a minimum less than in this frequency range. Thus, we choose to contain properties of that may help to locate a desirable curve quicker. The first and third components of (20) are two different measures of how small is over the interval, whereas the second term is used to widen a resonance’s width by flattening the peak’s slopes. When (20) is applied, in (19) are replaced by .
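The motivation can be illustrated with a short sketch. The exact components of (20) are not legible in this copy, so the mean value, the mean absolute slope between adjacent samples, and the trapezoidal area are used as assumed stand-ins; the two toy curves, the frequency step, and the tolerance are hypothetical.

```python
def passes_constraint(y, eps):
    """Check in the spirit of (19): no sample may exceed the tolerance eps."""
    return all(v <= eps for v in y)

def curve_features(y, df):
    """Assumed stand-ins for the three components of (20)."""
    mean_value = sum(y) / len(y)
    mean_abs_slope = sum(abs(b - a) for a, b in zip(y, y[1:])) / (df * (len(y) - 1))
    area = sum(0.5 * (a + b) * df for a, b in zip(y, y[1:]))  # trapezoid rule
    return mean_value, mean_abs_slope, area

df = 0.01                                # hypothetical frequency step
resonant = [0.9, 0.8, 0.1, 0.8, 0.9]     # deep but narrow dip
flat = [0.3, 0.25, 0.2, 0.25, 0.3]       # shallower but wide response
eps = 0.35                               # hypothetical tolerance

for name, y in (("resonant", resonant), ("flat", flat)):
    print(name, min(y) < eps, passes_constraint(y, eps), curve_features(y, df))
```

The resonant curve has the lower minimum but fails the constraint over the band, and all three features rank the flat curve as the better candidate.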

Problem (18) is solved as minimization (or maximization) of with respect to using a least squares method employing the steepest descent method. Our final solution is the set of optimal design variables corresponding to the optimal frequency characteristic of the S-parameter .

III. OPTIMIZATION WITH SMALL FDTD DATA SETS

A major concern associated with the scheme described in Section II comes from the fact that, for complex MW systems, FDTD generation of matrix (4) may be so time consuming that the related optimization procedure will be impractical. To address this issue, we introduce here an original yet simple technique, which keeps the process of creation of a DB under control and enforces generation of only as many samples as necessary. The procedure is schematically presented in Fig. 2.

Fig. 2. Flowchart of the proposed optimization algorithm. (Color version available online at: http://ieeexplore.ieee.org.)

Given design variables and subdivisions of their intervals, the procedure creates a set of points that are equally spaced across the -dimensional domain and that are representatives of each design variable. The number of these points is . With each being very small, the size of the initial DB will also be small.

The procedure starts with running FDTD simulations to make an initial DB. For system parameters, we divide each interval in half, thus splitting the input space into rectilinear subdomains (one for every combination of half-intervals). Out of a number of possible choices for the location of initial guesses, in this paper we opt for the simplest one (both conceptually and in terms of programming) by placing them in the centers of these subdomains.
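The placement of the initial points can be sketched in a few lines of Python. The function name and the variable ranges are hypothetical; the only rule taken from the text is that every interval is halved and one point is placed at the center of each resulting subdomain.

```python
from itertools import product

def initial_points(bounds):
    """Centers of the subdomains obtained by halving every variable's interval.

    bounds: list of (low, high) pairs, one per design variable.
    """
    per_variable = []
    for low, high in bounds:
        quarter = (high - low) / 4.0
        # Centers of the lower and upper half-intervals.
        per_variable.append((low + quarter, high - quarter))
    return list(product(*per_variable))

# Hypothetical ranges for three design variables (e.g., dimensions in mm).
bounds = [(10.0, 20.0), (0.0, 4.0), (1.0, 3.0)]
points = initial_points(bounds)
print(len(points))
for p in points:
    print(p)
```

Each point would then be passed to the FDTD solver, so the size of the initial DB doubles with every added design variable.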

The procedure trains the RBF network and tries to find an optimum. If the solution passes the constraints, the procedure takes it along with corresponding system parameters as the final solution and stops; otherwise, the method proceeds to the next iteration at which the algorithm creates more DB points and continues to search for an optimum.

More specifically, each local minimum found is simulated with the FDTD method, and the corresponding input–output pairs are added to the DB giving new samples, where is the number of repeated points. With the number of subsequent subdivisions of the variables’ ranges , the procedure creates new points. These points are chosen to be uniformly distributed random numbers inside subdomains of dimension . If any of them are already in the DB, new points are selected. This means that after running an FDTD solver with all these new points, our DB is of the size .
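One possible reading of this step is sketched below. The number of new points and the layout of the subdomains are not legible in this copy, so the sketch assumes one uniformly distributed random point per cell of a rectilinear grid; the function name, the ranges, and the subdivision counts are hypothetical.

```python
import random
from itertools import product

def new_points(bounds, subdivisions, database, rng):
    """Draw one uniform random point inside every subdomain of the grid.

    bounds: (low, high) per design variable; subdivisions: parts per variable.
    Points already present in the database are re-drawn.
    """
    edges = []
    for (low, high), k in zip(bounds, subdivisions):
        step = (high - low) / k
        edges.append([(low + i * step, low + (i + 1) * step) for i in range(k)])
    points = []
    for cell in product(*edges):
        while True:
            p = tuple(rng.uniform(a, b) for a, b in cell)
            if p not in database:
                break
        points.append(p)
    return points

rng = random.Random(0)                  # fixed seed, for repeatability only
bounds = [(10.0, 20.0), (0.0, 4.0)]     # hypothetical design-variable ranges
database = {(12.5, 1.0), (17.5, 3.0)}   # points simulated earlier
fresh = new_points(bounds, subdivisions=(3, 2), database=database, rng=rng)
database.update(fresh)
print(len(fresh), len(database))
```

Random placement inside the cells keeps the new samples spread over the whole domain without repeating the regular grid used for the initial DB.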


The algorithm continues in this fashion until the constraints, or one of the stopping criteria, are satisfied. The latter include the following:

(S1) a maximal DB size;
(S2) a maximal time elapsed;
(S3) certain closeness of two solutions from two iterations.

While (S1) and (S2) serve as “safety conditions,” terminating optimizations taking more data/time than is desirable, criterion (S3) is a crucial component of the procedure. It compares the current optimal solution with previous ones in accordance with the formula

% for every

(21)

Criterion (S3) recognizes if is similar to that of any previous iteration. Through experimentation, it has been found that if (S3) is triggered, the optimization procedure stops with the current solution, which turns out to be the “best,” although not an optimal one. As such, this stopping criterion is capable of effectively suggesting the absence of an optimum in the given input space.
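The three stopping criteria can be sketched as a single check. Formula (21) is only partly legible, so the reading used here (the current solution lies within a given percentage of an earlier one in every design variable) is an assumption, as are the function names and the default limits.

```python
import time

def is_repeat(current, previous_solutions, tol_percent):
    """(S3)-style check: current is within tol_percent of an earlier solution
    in every design variable (assumed reading of formula (21))."""
    for prev in previous_solutions:
        if all(abs(c - p) <= tol_percent / 100.0 * abs(p)
               for c, p in zip(current, prev)):
            return True
    return False

def should_stop(db_size, started, current, history,
                max_db=200, max_seconds=3600.0, tol_percent=1.0):
    """Return the label of the first criterion that fires, or None."""
    if db_size >= max_db:
        return "S1"
    if time.monotonic() - started >= max_seconds:
        return "S2"
    if is_repeat(current, history, tol_percent):
        return "S3"
    return None

history = [(12.0, 3.0), (15.0, 2.0)]   # hypothetical optima of earlier iterations
started = time.monotonic()
print(should_stop(40, started, (15.1, 2.01), history))   # close to an earlier optimum
print(should_stop(40, started, (13.0, 2.5), history))    # still moving
print(should_stop(250, started, (13.0, 2.5), history))   # DB limit reached
```

Because the comparison is relative, an earlier solution with a zero component matches only an exactly equal value; an absolute tolerance could be added if that case matters.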

IV. COMPUTER IMPLEMENTATION

The optimization procedure described above has been implemented in a MATLAB code in which minimization of the errors is performed with the use of the nonlinear least squares method (procedure in the MATLAB Optimization Toolbox). Training/testing data for the network (i.e., matrices and ) are generated by the full-wave 3-D conformal FDTD simulator QuickWave-3D¹ (QW-3D) v. 5.0.

While in most applications of ANN optimization a neural network is constructed in accordance with the parameters of the device and/or its model, the algorithm outlined in this paper is characterized by a certain universality.

• Data is generated with the use of a universal modeling technique capable of accurate analysis of a wide range of complex MW systems and components.

• The procedure exploits the decomposed network, which is supposed to be able to handle data on a variety of fairly complicated devices.

• The optimization scheme deals with the input vector (1), which may contain arbitrary (geometrical or material) parameters of the respective FDTD model.

Therefore, our algorithm is implemented as a MATLAB code for optimizing S-parameters of different MW systems. A device to be optimized is represented by a fully parameterized QW-3D model built with the use of a macroprogramming function available in the simulator. Any model parameters are of equal worth and can be chosen as the design variables of the related optimization problem. The MATLAB code modifies the values of the design variables in the model, controls the operations of the simulator, and processes its output.

A QW-3D model of the optimized device is supposed to be prepared in accordance with standard criteria for efficient (and preferably quickly converged) FDTD models, e.g., featuring a nonuniform mesh with appropriate cell sizes for all media. In the course of optimization, the number of time steps of each simulation is set up to be the same so this parameter is chosen to ensure convergence of the solution for all system parameters . The influence of model characteristics on productivity of optimization is discussed in Section VII.

TABLE I
TRAINING METHODS IN THE PROPOSED RBF ANN OPTIMIZATION

¹QuickWave-3D, QWED, 00-672 Warsaw, Poland. [Online]. Available: http://www.qwed.com.pl/.

A special interface of the optimization procedure facilitates specification of the design variables, subdivisions of their intervals ( and ), the constraints ( , and ), a type of the RBF function, a regime of optimization of the Gaussian radius (scalar or vector optimization), the DB’s fractions of the training/testing samples, input parameter range after linear scaling [10], stopping criteria, etc. Results displayed for each contain optimal values of the design variables, S-parameters in the interval , the resulting DB size, a value of the regularization parameter, and an optimized RBF radius (if applicable), as well as an elapsed CPU time.
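The linear scaling of input parameters mentioned above can be illustrated with a short Python sketch. The target range is not legible in this text, so the default of [-1, 1] is an assumption; the helper names are hypothetical.

```python
def scale_linear(x, x_min, x_max, new_min=-1.0, new_max=1.0):
    """Map x from [x_min, x_max] linearly onto [new_min, new_max] (assumed range)."""
    if x_max == x_min:
        raise ValueError("degenerate interval")
    return new_min + (x - x_min) * (new_max - new_min) / (x_max - x_min)

def unscale_linear(s, x_min, x_max, new_min=-1.0, new_max=1.0):
    """Inverse map: recover the physical value of a scaled design variable."""
    return x_min + (s - new_min) * (x_max - x_min) / (new_max - new_min)
```

Scaling all design variables to a common range keeps a single RBF radius meaningful across variables with different units (for instance, millimeters and degrees).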

V. EXAMPLES AND DISCUSSION

In the following illustrations, we show how our optimization procedure works with three complex systems, which cannot be characterized by simplified/empirical models, but rather require a full-wave 3-D numerical solution. In the examples below, we imitate open-ended CAD processes and treat and as attractive goals of the design rather than strong constraints. Indeed, given a high computational cost of FDTD-backed optimization, one may be pleased with a solution, which, being close enough to a desirable optimum, is obtained fairly quickly.

In order to better illustrate functionality of the algorithm, three parameters of utmost importance for the optimization process are intentionally set to their extremes: we choose to be small numbers (in order to get a DB of really small size before the first training), construct an FDTD mesh, which is dense enough for making “fine” models, and opt for running each simulation for as many time steps as needed to guarantee the converged results. The mesh in all projects is set up with the help of sensitivity analysis determining the cell size such that the smaller cells generate an S-parameter’s frequency response different by less than 3%.
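The 3% mesh-sensitivity rule can be sketched as a simple comparison of responses computed on successively finer meshes. The sketch below assumes that "different by less than 3%" refers to the largest point-wise relative difference over the sampled frequencies; the function names and the sample numbers in the usage note are illustrative only.

```python
def max_relative_change(coarse, fine):
    """Largest relative difference between two responses sampled at the same frequencies."""
    return max(
        (abs(c - f) / abs(f) for c, f in zip(coarse, fine) if f != 0),
        default=0.0,
    )

def pick_cell_size(responses_by_cell, threshold=0.03):
    """Pick the coarsest acceptable cell size.

    responses_by_cell: list of (cell_size, response) ordered from coarsest to finest.
    A size is accepted when the next finer mesh changes the response by less
    than `threshold`; if none qualifies, the finest size is returned.
    """
    for (size, resp), (_, finer) in zip(responses_by_cell, responses_by_cell[1:]):
        if max_relative_change(resp, finer) < threshold:
            return size
    return responses_by_cell[-1][0]
```

For example, with made-up responses `[(6.0, [0.50, 0.40]), (4.0, [0.45, 0.38]), (2.0, [0.445, 0.379])]`, the 6-mm mesh is rejected (about 11% change) and the 4-mm mesh is accepted (about 1% change).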

We also test the optimization procedure for functionality of different training methods (see Table I). In all examples, and . All CPU times are given below for a single-processor version of QW-3D, which runs on a Dual Xeon 3.2-GHz PC operating under Windows XP.

A. Double WW

A WW is known to be a major component of transmission lines used with vacuum/high-pressure applicators, particle accelerators, and MW plasma devices [55]–[58]. The function of the window is to provide vacuum/gas isolation of the source from the cavity while transmitting MWs with minimum attenuation. We test our procedure in finding optimal geometry of a double WW consisting of a 200-mm section of WR340 and two rectangular dielectric layers (made from quartz, permittivity ), as shown in Fig. 3; a similar device has been considered earlier in [44].

Fig. 3. (a) Profile of the double WW (courtesy of The Ferrite Company Inc., Nashua, NH) and (b) geometrical parameters of its model. (Color version available online at: http://ieeexplore.ieee.org.)

Our goal is to minimize |S11|, more specifically, to make it less than in the frequency range from to GHz assuming that the geometrical parameters are conditioned by the following design specifications:

mm

mm

mm

mm

mm

mm

mm (22)

Optimization of the device is performed using its fine FDTD model, whose nonuniform mesh (the cell sizes are 4 mm in air and 1.5 mm in quartz) contains approximately 123 000 to 348 000 cells (depending on the dimensions of the layers). Simulation reaches steady state after about 5000 time steps, so with 5200 steps chosen for the runs in the optimization process, a single computation takes from 55 s to 1 min 25 s.

TABLE II
DOUBLE WW: CHARACTERISTICS OF FOUR-, FIVE-, AND SIX-PARAMETER OPTIMIZATIONS

TABLE III
DOUBLE WW: OPTIMAL CONFIGURATIONS (mm)

In the three-parameter optimization problem with design variables mm, , and the 5% criterion (S3), all training methods proceed up to different iterations and lead to quite different designs for which corresponding characteristics violate the constraint; in all cases, optimization is stopped by the (S3) criterion. This implies that, for the chosen input space, an optimal solution may not exist. The result is confirmed by running the same optimization with (S3) switched off and (S2) set up for 3 h: for this time, the procedure goes through 8–9 iterations, repeatedly finding the solutions similar to the ones obtained at the previous iterations.

Multiple optimal solutions are found for other input spaces featuring 4–6 design variables and ; examples of the optimal configurations and data characterizing different algorithm options are presented in Tables II and III. Fig. 4 shows the optimized frequency responses of the reflection coefficient obtained with scalar optimization of the Gaussian radius, with added ridge regression, with one more point added to initial and subsequent subdivisions of the design variables’ intervals, and with one more design variable. A nonoptimized characteristic also shown in Fig. 4 for comparison corresponds to the midpoints of the intervals (22). Table IV contains data on the tests of the training methods with two particular optimization problems; vectors of optimized RBF radii are denoted as .

Fig. 4. |S11| characteristics for four optimal designs of the double WW (Tables II and III). (Color version available online at: http://ieeexplore.ieee.org.)

TABLE IV
DOUBLE WW: TRAINING METHODS IN FOUR-PARAMETER OPTIMIZATIONS

Analyzing these results, one may notice that optimal solutions generated by our RBF ANN algorithm depend on the input space and the training method employed; their computational costs may also be different. More specifically, the following applies.

1) An additional point taken for representation of each design variable requires a notable increase in the CPU time (e.g., comparing Optimal Designs 4 and 6, by 2.3 ), but may lead only to a minor change in the characteristic. When making a larger DB at the first iteration, the algorithm needs more FDTD simulations than for optimization with a smaller initial DB succeeded by several iterations in dynamic generation of data. On the other hand, a more detailed characterization of the input space may be insufficient for “refocusing” the network on another domain containing an alternative (possibly better) optimal solution.

2) An additional design variable expands the input space and may naturally lead to a better characteristic (see Optimal Designs 4 and 7); however, computational cost of this solution is notably higher: in the present example, the CPU time increases by 2.4 to 3.4 .

3) Training with scalar optimization of may lead to insignificant (5%–10%) increase in the CPU time in comparison with training with a constant radius 1. However, optimization with the GRS method, if it does not require more iterations than GR1, may eventually be quicker. With appropriately chosen , the input space may be characterized better with a smaller number of centers. This implies that the capability of the network to generalize learned data may be improved. Since, in this case, the algorithm may need fewer DB points, less time may be needed for finding an optimum.

4) Training with vector optimization of turns out to be a delicate option essentially depending on the problem to be optimized. In some cases, especially when dealing with large DBs, this may become an additional computational burden giving effectively the same result, but in others (like in the examples in Table IV), GRV may accelerate optimization by reducing either the number of optimization iterations or the time spent on training with the DB of about the same size. Individually chosen radii may further reduce the required number of centers and make the domain of the Gaussian RBF influence more accurately described. We observe that the GRV-trained network may be able to find the solutions with a lower average value of a minimized function in the optimality zone. For instance, when dealing with the double WW example, vector optimization of tends to lead to more flat rather than resonant characteristics.

5) When the regularization parameter is optimized, the DB size may depend on an RBF type. For the cubic function, the optimal solution may be reached with fewer points than without ridge regression, whereas for the Gaussian one, the algorithm may need a larger DB (e.g., in some WW tests, by 20%–25%). This seems to be consistent with the key property of these RBFs: the cubic function is generally considered attractive as it tends to require less data for training so a particular input space can be well represented either by fewer cubic (global) functions or by more Gaussian (local) ones.

6) Optimization of along with vector optimization of may be quite computationally expensive in comparison with all other training methods so using this combination may be unfeasible when optimizing truly complex scenarios. At the same time, the proposed optimization algorithm is designed as working with minimally necessary data so the probability of overfitting (for which regularization would be crucial) may not be high. Not being in great demand, the GRV method could, therefore, be left for optimizing not too complex systems.
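The training options compared in the list above (Gaussian versus cubic basis, a radius parameter, and ridge regression) can be illustrated with a plain, non-decomposed RBF network in Python. This is a sketch under stated assumptions, not the authors' network: every training point serves as a center, the Gaussian is taken as exp(-(r/radius)^2), the cubic as r^3, and the ridge weights solve the normal equations (H^T H + lam I) w = H^T y. The parameter name `lam` stands for the regularization parameter; all function names are hypothetical.

```python
import math

def gaussian(r, radius=1.0):
    """Local basis function (assumed form): exp(-(r / radius)^2)."""
    return math.exp(-(r / radius) ** 2)

def cubic(r, radius=1.0):
    """Global basis function r^3; the radius argument is unused."""
    return r ** 3

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def train_rbf(centers, targets, basis=gaussian, radius=1.0, lam=0.0):
    """Weights minimizing ||H w - y||^2 + lam ||w||^2, with H[i][j] = basis(|x_i - c_j|)."""
    n = len(centers)
    h = [[basis(math.dist(x, c), radius) for c in centers] for x in centers]
    a = [[sum(h[k][i] * h[k][j] for k in range(n)) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(h[k][i] * targets[k] for k in range(n)) for i in range(n)]
    return solve(a, b)

def predict(x, centers, weights, basis=gaussian, radius=1.0):
    """Network output at point x."""
    return sum(w * basis(math.dist(x, c), radius) for w, c in zip(weights, centers))
```

With `lam=0.0` the network interpolates the training data exactly; a positive `lam` shrinks the weights and smooths the response, which is the role ridge regression plays against overfitting in item 6.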

B. Loaded MW Oven

In this example, we optimize the efficiency of the system originally considered in [43] and reviewed in [44]: a rectangular cavity (with dimensions mm) imitating a MW oven (Fig. 5). The device operating at GHz is excited by the mode of a rectangular waveguide centered on the top of one side. The cavity contains a load (a cylinder of diameter and of length, with rounded ends, ) lying on a centered shelf (cylindrical disk of diameter and thickness ) at the height from the bottom. The load is centered on the shelf, i.e., the only parameter characterizing its position is the angle .

Fig. 5. Geometrical parameters of a MW oven with a load on a shelf. (Color version available online at: http://ieeexplore.ieee.org.)

Optimization is performed for the complex permittivity of meat ( , cooked beef [59]) for the load, and of glass for the shelf. The exciting waveguide is WR340 ( mm); the diameter of the shelf is constant ( mm). The goal is to find the geometry characterized by the efficiency not less than 90% (by making |S11| smaller than ) in the range GHz [35]. The specifications of the design variables are

mm

mm

mm

mm (23)
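The efficiency target stated before specifications (23) can be related to the reflection coefficient with a one-line formula. The sketch below assumes a one-port system in which all power that is not reflected is dissipated in the load, so that efficiency = 1 - |S11|^2; this relation and the function names are assumptions made for illustration, and the threshold used in the paper is not reproduced here.

```python
import math

def efficiency_from_s11(s11_mag):
    """Fraction of incident power absorbed, assuming efficiency = 1 - |S11|^2."""
    return 1.0 - s11_mag ** 2

def s11_limit_for_efficiency(eff):
    """Largest |S11| that still gives at least the requested efficiency."""
    return math.sqrt(1.0 - eff)
```

Under this assumption, an efficiency of 90% corresponds to |S11| of roughly 0.316, and |S11| = 0.3 corresponds to an efficiency of 91%.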

The project is discretized with a nonuniform mesh (the maximum cell sizes are 5 mm in air and in glass and 2/1 mm in meat in horizontal/vertical planes) making approximately 371 000 to 711 000 cells (depending on the dimensions and orientation of the load). FDTD simulation reaches steady state at the level of approximately 18 000 iterations, but with more time steps, resonant peaks on the curve continue to change their magnitudes. To make the difference between the optimized solution and its accurate numerical representation not too significant, we take 22 000 iterations for each run so a single computation takes from 5.2 to 11.2 min.

In [43], a similar device was addressed with CNO taking no special measures for acceleration of an optimization process. For five design variables, the problem was found prohibitively expensive; e.g., with five points representing each design variable, the DB would be constructed in approximately 32 days. In this situation, the problem was split into three problems, solved one after another, with two or three design variables each. However, the resulting characteristic did not satisfy the applied constraint on most of the interval [43].
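The cost quoted above can be checked with simple arithmetic. The sketch below assumes that the DB in question is a full-factorial grid, i.e., five points for each of five design variables give 5^5 = 3125 FDTD runs; the grid assumption and the function names are ours.

```python
def full_grid_runs(points_per_variable, n_variables):
    """Number of simulations in a full-factorial grid (assumed DB layout)."""
    return points_per_variable ** n_variables

def grid_time_days(points_per_variable, n_variables, minutes_per_run):
    """Total simulation time in days for the full grid."""
    return full_grid_runs(points_per_variable, n_variables) * minutes_per_run / (60 * 24)

def implied_minutes_per_run(total_days, points_per_variable, n_variables):
    """Average time per run implied by a quoted total time."""
    return total_days * 24 * 60 / full_grid_runs(points_per_variable, n_variables)
```

Under this assumption, 32 days for 3125 runs implies about 14.7 min per simulation. For comparison, the same grid at the 5.2 to 11.2 min per run reported for the present model would take roughly 11 to 24 days.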

With the described procedure employing the GRS training method, the five-parameter optimization problem is successfully solved in less than two days (Table V). In order to minimize the number of mandatory FDTD simulations for the initial DB, and are taken as three and two, respectively. Even with this “sparse” characterization of an apparently highly resonant hyper-surface, our optimization procedure needs only three iterations (42.4 h of CPU time) to find an optimal solution (the bold curve in Fig. 6). Similarly to the previous example, a nonoptimized characteristic corresponds to the values of all five design variables at the midpoints of their intervals (23).

TABLE V
COMPUTATIONAL RESOURCES AND OPTIMAL CONFIGURATIONS REQUIRED IN OPTIMIZATION OF A MW OVEN USING GRSλ METHOD

Fig. 6. |S11| characteristics generated by three iterations of the optimization of a MW oven (Table V). (Color version available online at: http://ieeexplore.ieee.org.)

The optimal configuration suggested by the optimization procedure seems to be physically reasonable. The orientation of the load along the longitudinal direction of the feeding waveguide, a minimal thickness of the lossless shelf, and not too small dimensions of the lossy load seem to be the factors favorable for minimizing reflections from the cavity.

C. Patch Antenna With Two Slits

Here, we explore the capacity of our optimization procedure to find configurations of patch antennas characterized by larger bandwidths. We deal with the structure proposed in [60]: a rectangular patch with a pair of wide slits and an air substrate (Fig. 7) with the following dimensions: mm, mm, mm, mm, and mm (for other parameters, see Table VI).

In the FDTD model, the mesh consists of 1.2-mm cells in the plane of the antenna and 1.5-mm cells in the perpendicular direction. In the horizontal plane, the mesh is uniform, except around the coaxial cable where the cells are smaller (0.75 mm). The walls of the box performing near-to-far (NTF) field transformation are placed at the distance of 11 cells from the antenna, and the box implementing the perfectly matched layer (PML) absorbing boundary conditions is placed 11 cells from the NTF box. The entire model contains nearly 1.3 million cells. The steady state is reached roughly within 8000 time steps, and it takes 11.2 min of CPU time.

Fig. 7. Geometry of a patch antenna with a pair of wide slits [60]. (Color version available online at: http://ieeexplore.ieee.org.)

TABLE VI
ORIGINAL ANTENNA AND TWO CONFIGURATIONS OF WIDER BANDWIDTHS (GRSλ METHOD)

FDTD simulation of the antenna with its original geometry [60] and the feed taken as a 50-Ω coaxial line with outer radius 1.5 mm shows that the impedance bandwidth determined from a 10-dB return loss is MHz (thin curve in Fig. 8, first line in Table VII). Looking for alternative configurations of the patch with larger , we allow the two slits to be of different lengths and different widths and , and solve a five-parameter optimization problem for two different sets of design specifications, namely,

mm

mm

mm

mm

mm (24a)

mm

mm

mm

mm

mm (24b)

Fig. 8. Computed return loss for the original antenna’s geometry and two optimal configurations (Table VI). (Color version available online at: http://ieeexplore.ieee.org.)

TABLE VII
BANDWIDTH CHARACTERISTICS OF THE ORIGINAL AND OPTIMIZED ANTENNAS

The thickness of the antenna and the ground plane are assumed to be finite (0.5 mm). Other geometrical parameters are taken as in [60]. The value of is 10 dB. The frequency constraints are set up beyond the anticipated endpoints of the optimal band as and GHz. By doing so, we do not expect these constraints to be satisfied, but rather want the optimization procedure to look for a maximally wide frequency response of .

Optimization is performed with the GRS training method and and . Two resulting configurations are presented in Tables VI and VII and Fig. 8. The bandwidth of both antennas is notably widened, by 5% and 6% of the band’s central frequency . In both problems (24a) and (24b), the optimization process is stopped by the 5% criterion (S3).
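The bandwidth figures discussed above can be extracted from a sampled return-loss curve as sketched below. This is a minimal illustration assuming the response is given as |S11| in dB on a frequency grid and that the band is the widest contiguous run of samples at or below -10 dB; no interpolation between samples is attempted, and the function names are hypothetical.

```python
def impedance_band(freqs, s11_db, level_db=-10.0):
    """Return (f_low, f_high) of the widest contiguous run with s11_db <= level_db, or None."""
    best = None
    start = None
    for i, value in enumerate(s11_db):
        if value <= level_db:
            if start is None:
                start = i
            width = freqs[i] - freqs[start]
            if best is None or width > best[1] - best[0]:
                best = (freqs[start], freqs[i])
        else:
            start = None
    return best

def fractional_bandwidth(band):
    """Bandwidth as a fraction of the band's central frequency."""
    f_low, f_high = band
    return (f_high - f_low) / (0.5 * (f_low + f_high))
```

Expressing the bandwidth relative to the central frequency matches the way the widening of the band is reported in percent above.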

Although the lengths and widths of the slits are allowed to be different, in both solutions the antennas nearly retain symmetry: and . However, another optimization performed for constraints (24a) and a slightly different position of the feed ( mm) leads us to the design with essentially different widths and lengths of the patch’s slits mm, mm, mm, mm, and mm. For the latter geometry, MHz.

VI. COMPARISON WITH OTHER OPTIMIZATION METHODS

The presented RBF ANN procedure has been tested against two local-optimization techniques: the gradient Davidon–Fletcher–Powell (G-DFP) and nongradient Powell (NG-P) methods [61]. We worked with the G-DFP and NG-P algorithms implemented in QW-Optimizer, an optimization module distributed with the QW-3D v.5.0 package; as such, these techniques dealt with the same FDTD models and the same type of data as the neural-network procedure.

Fig. 9. |S11| characteristics generated by the NG-P and G-DFP algorithms for the double WW (Fig. 3): the starting point is: (a) very close and (b) far away from the RBF ANN optimum; max/min step sizes for all design variables are 10/2 mm, respectively. Starting points (I) and (II) are given in Table VIII. (Color version available online at: http://ieeexplore.ieee.org.)

Five-parameter optimizations of the devices considered in Section V [the double WW (Fig. 3) with (22), the MW oven (Fig. 5) with (23), and the patch antenna (Fig. 7) with (24a)] yield the following observations. When the starting point is chosen in the immediate neighborhood of the optimum found by the RBF ANN procedure, the conventional optimization may converge towards these optimized variables. In this case, the related response becomes very close to the one obtained by our neural network procedure (as shown in Fig. 9(a) for the NG-P algorithm). This example directly validates the output of the RBF ANN optimization. On the other hand, with the choice of the starting point far away from the optimum or even insufficiently close to it (in practice, that is normally the case), the NG-P and G-DFP algorithms either find a “worse” solution or do not find an optimum at all (Figs. 9(b), 10, and 11).

Fig. 10. Typical |S11| characteristics from single runs of the NG-P and G-DFP algorithms for the MW oven (Fig. 5): at the starting point, the design variables are in the midpoints of (23); max/min steps are 30/10 mm/degrees, respectively. (Color version available online at: http://ieeexplore.ieee.org.)

Fig. 11. Typical |S11| characteristics generated by single runs of the NG-P and G-DFP algorithms for the patch antenna (Fig. 7): max/min steps are 6/2 mm, respectively. (Color version available online at: http://ieeexplore.ieee.org.)

Our tests, therefore, show that, in problems with several design variables and numerous local minima, our RBF ANN optimization is advantageous compared to the conventional local-optimization algorithms, which may naturally be stuck in particular local solutions and, thus, require being run multiple times from different starting points and with different parameters. Due to excellent network generalization, our technique has a clear potential of finding the “best” local solution in the specified domain without the demand of guessing a good starting point or the need to change the design variables. It appears that the fact that the RBF ANN optimization takes longer in comparison with the single runs of the NG-P and G-DFP algorithms (Table VIII and Figs. 10 and 11) is not a drawback since optimization with the conventional techniques may, in fact, take indefinitely more CPU time.
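The dependence of a local method on its starting point can be reproduced on a toy problem. The sketch below uses a one-dimensional function with two minima and a simple derivative-free pattern search; both are illustrative stand-ins chosen by us and are not the NG-P or G-DFP algorithms of QW-Optimizer.

```python
def f(x):
    """Toy objective with a global minimum near x = -1.04 and a local one near x = 0.96."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def local_search(func, x0, step=0.1, min_step=1e-6):
    """Move downhill in fixed steps; halve the step when neither neighbor is better."""
    x = x0
    while step > min_step:
        if func(x + step) < func(x):
            x += step
        elif func(x - step) < func(x):
            x -= step
        else:
            step /= 2.0
    return x

def multi_start(func, starts):
    """Run the local search from several starting points and keep the best result."""
    return min((local_search(func, s) for s in starts), key=func)
```

Started at x = 1.5 the search settles in the worse minimum near 0.96, while a start at x = -1.5 reaches the better one near -1.04; only repeated runs (`multi_start`) recover the better solution regardless of the initial guess, at the cost of extra function evaluations.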


TABLE VIII
CONVERGENCE OF NG-P AND G-DFP OPTIMIZATION TO THE RBF ANN OPTIMAL DESIGN 9 (TABLE III)

TABLE IX
FACTORS CONTROLLING CPU TIME OF THE FDTD-BACKED RBF ANN OPTIMIZATION

VII. CONTROLLING CPU TIME

The presented algorithm consists of two distinct parts, i.e., FDTD simulations producing numerical data (and taking CPU time ) and the network operations responsible for training/testing, optimization, etc. (time ). We identify several factors influencing and ; they are listed in Table IX.

Contribution to the total CPU time made at the first phase (DB generation) depends on the following:

I) numbers of FDTD simulations (specified by and );
II) time of a single FDTD simulation ;
III) type of the problem to be optimized.

While understanding of aspects related to III) may be limited before an optimization problem is solved, points I) and II) are under designer’s control.

The initial number of simulations is set up by the parameter , i.e., the numbers of representative points on the intervals of all design variables. Dynamic generation of data for the decomposed RBF network assumes that each is a small number. When applied to three complex devices considered above, the algorithm succeeds with . This clearly makes a dramatic impact on the overall productivity of the optimization procedure. The total number of FDTD runs also depends on (the numbers of interval subdivisions at the subsequent optimization iterations; the above examples show that this number can also be very small, e.g., 2) and the number of required iterations , though the latter is problem-dependent and determined in the course of the optimization.

There seem to be some options concerning direct reduction of . Optimization is preceded by setting parameters of an FDTD model and automated mesh generation. The time depends on the model parameters, i.e., factors (i)–(iii) in Table IX. Sensitivity analysis is a standard means of setting the cell sizes, which are not smaller than necessary for a required accuracy. Minor increase of the cell sizes and/or reduction of the number of time steps (to stop simulation at some point before it reaches steady state) will result in decreasing . Such settings, which generally worsen the accuracy of the model, may be exploited, if used with caution, in practical CAD work for quicker generation of optimization results based on a somewhat “coarser” FDTD model. “Finer”-model-based optimization could then be run in the small neighborhood of the “coarse” optimum. Even further, the latter could be directly verified with a “finer” model; if the result only slightly violates the constraints, it can be considered an approximation of an optimum.

Fig. 12. Averaged total CPU time as function of DB size for different training methods in a two-parameter optimization (t = 0.98 min). (Color version available online at: http://ieeexplore.ieee.org.)

It is also worth noting that with automatic generation of FDTD mesh employed in an optimization loop for each configuration of the optimized system, measures should be taken to avoid accidental appearance of cells that are too small. Their presence, with the fixed number of time steps preset for all FDTD runs, could lead to inclusion of nonconverged (i.e., erroneous) data in the training set. While in the above examples the optimization process is small-cell free, here we do not explore a reduction of working with “coarser” FDTD models, but rather exploit steady-state solutions obtained on the fine meshes in order to find out if the proposed algorithm performed in a “complete” mode is still viable for optimization of certain complex MW systems.

As for the time for the network operations, it depends on a training method whose performance is essentially influenced by the type of the optimization problem. Our tests show (see Section V-A and Fig. 12) that while one method (GRV ) may be more time-consuming than others, the network operations of the optimization algorithm in all its versions are very efficient: they are always responsible for a very small fraction of the total computational cost. Fig. 13 shows two time characteristics of the developed optimization procedure. For fairly small (below 100), the CPU time taken by the network operations may be negligible (order of seconds); in the tests for large (up to 800), it is found to be not more than 12% of the total time.

Fig. 13. CPU time required for optimization with GRSλ method as function of DB size in projects with: (a) n = 6 and t = 1.02 min and (b) n = 5 and t = 11.05 min. (Color version available online at: http://ieeexplore.ieee.org.)
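The split of the total CPU time between FDTD runs and network operations can be expressed with a simple budget model. The sketch assumes that the total time is the number of FDTD runs times the time of a single run plus the network time; combining a DB of 800 points with the 1.02-min run time of Fig. 13(a) in the usage note is our illustrative pairing, not a result reported in the paper.

```python
def total_cpu_minutes(n_runs, minutes_per_run, network_minutes):
    """Assumed budget: FDTD time (runs x time per run) plus network time."""
    return n_runs * minutes_per_run + network_minutes

def network_share(n_runs, minutes_per_run, network_minutes):
    """Fraction of the total time spent on network operations."""
    return network_minutes / total_cpu_minutes(n_runs, minutes_per_run, network_minutes)

def max_network_minutes(n_runs, minutes_per_run, share_limit):
    """Largest network time that keeps its share of the total at or below share_limit."""
    fdtd_minutes = n_runs * minutes_per_run
    return share_limit * fdtd_minutes / (1.0 - share_limit)
```

For 800 runs of 1.02 min each (816 min of FDTD time), a 12% share would correspond to at most about 111 min of network operations.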

Therefore, the options of scalar optimization of the Gaussian radius and of the regularization parameter and the use of cubic or Gaussian RBF [factors (v) to (vii)] can be employed in our algorithm on the basis of feasibility of their use in a particular problem, without the threat of a substantial increase of computational cost of the solution.

Overall, in our algorithm, as in any other RBF network optimization/approximation schemes [47], [53], some functions/features are supposed to be selected, and decisions about them depend on the complexity and characteristics of the MW device being optimized. While we demonstrate an efficient functionality of the proposed algorithm in quite general settings, we believe that with certain special findings and tunings useful for particular systems, the described procedure may become particularly advantageous.

Some optimization problems may, of course, be challenging for our approach. In situations when a reduction of computational cost is crucial, it may be practical to initially solve an optimization problem with relaxed constraints [ and ], and/or greater percentage values for the criterion (S3) [factor (viii)]. This may allow for a quicker solution because of a lower number of iterations and provide some knowledge about the type of problem and its optima; this may help specify/modify the parameters/constraints of the original problem to be solved afterwards.

VIII. CONCLUSION

The algorithm of FDTD-backed RBF network optimization has been proposed for CAD of a wide group of complex MW and millimeter-wave systems, which cannot be approximated by simple empirical or physical models, but require full-wave 3-D numerical analysis.

We have suggested two principal measures making this approach viable: to use a decomposed RBF network and to dynamically create a DB of FDTD data. The network demonstrates a capacity for excellent generalization of data and optimization of different MW structures; the procedure typically identifies the region of an optimal solution within very few iterations starting with a really small DB and dynamically increasing it to a necessary size. The considered examples show that special functions introduced into the optimization procedure (optimization of the Gaussian RBF’s radius, optimization of the regularization parameter, special stopping criteria, etc.) further enable our algorithm to be computationally efficient for the systems with not too many design variables.

This paper, therefore, has demonstrated that given the resources of today’s computers, our RBF ANN-based FDTD-backed approach to optimization can be practically productive and serve as a CAD tool in designing reasonably complex MW devices and components. While the functionality of most features of the algorithm depends on the nature and complexity of the problem to be optimized, the presented analysis illustrates the applicability of our procedure to quite a diverse range of complex and electrically large systems, as well as their successful optimization. The algorithm allows for a variety of creative and inventive adjustments/tunings, which may be conditioned by specific characteristics of particular MW devices and introduced for their CAD and optimization.

REFERENCES

[1] J. W. Bandler, R. M. Biernacki, S. H. Chen, P. A. Grobelny, and R.R. H. Hemmers, “Space mapping for electromagnetic optimization,”IEEE Trans. Microw. Theory Tech., vol. 42, no. 12, pp. 2536–2544,Dec. 1994.

[2] J. W. Bandler, Q. S. Cheng, S. A. Dakroury, A. S. Mohamed, M. H. Bakr, K. Madsen, and J. Søndergaard, “Space mapping: The state of the art,” IEEE Trans. Microw. Theory Tech., vol. 52, no. 1, pp. 337–360, Jan. 2004.

[3] M. H. Bakr, J. W. Bandler, N. K. Georgieva, and K. Madsen, “A hybrid aggressive space-mapping algorithm for EM optimization,” IEEE Trans. Microw. Theory Tech., vol. 47, no. 12, pp. 2440–2449, Dec. 1999.

[4] J. W. Bandler, M. A. Ismail, J. E. Rayas-Sánchez, and Q. J. Zhang, “Neuromodeling of microwave circuits exploiting space mapping technology,” IEEE Trans. Microw. Theory Tech., vol. 47, no. 12, pp. 2417–2427, Dec. 1999.

[5] J. W. Bandler, N. Georgieva, M. A. Ismail, J. E. Rayas-Sánchez, and Q. J. Zhang, “A generalized space mapping tableau approach to device modeling,” IEEE Trans. Microw. Theory Tech., vol. 49, no. 1, pp. 67–79, Jan. 2001.

[6] J. W. Bandler, M. A. Ismail, J. E. Rayas-Sánchez, and Q. J. Zhang, “Neural inverse space mapping (NISM) optimization for EM-based microwave design,” Int. J. RF Microw. Comput.-Aided Eng., vol. 13, pp. 136–147, 2003.

[7] J. W. Bandler, J. E. Rayas-Sánchez, and Q.-J. Zhang, “Yield-driven electromagnetic optimization via space mapping-based neuromodels,” Int. J. RF Microw. Comput.-Aided Eng., vol. 12, pp. 79–89, 2002.

[8] J. W. Bandler, D. M. Hailu, K. Madsen, and F. Pedersen, “A space-mapping interpolating surrogate algorithm for highly optimized EM-based design of microwave devices,” IEEE Trans. Microw. Theory Tech., vol. 52, no. 11, pp. 2593–2600, Nov. 2004.

[9] F. G. Guimarães and J. A. Ramírez, “A pruning method for neural networks and its application for optimization in electromagnetics,” IEEE Trans. Magn., vol. 40, no. 2, pp. 1160–1163, Mar. 2004.

[10] Q.-J. Zhang, K. C. Gupta, and V. K. Devabhaktuni, “Artificial neural networks for RF and microwave design—From theory to practice,” IEEE Trans. Microw. Theory Tech., vol. 51, no. 4, pp. 1339–1350, Apr. 2003.

[11] S. Bila, Y. Harkouss, M. Ibrahim, J. Rousset, E. N’Goya, D. Baillargeat, S. Verdey, M. Aubourg, and P. Gillon, “An accurate wavelet neural-network-based model for electromagnetic optimization of microwave circuits,” Int. J. RF Microw. Comput.-Aided Eng., vol. 9, pp. 297–306, 1999.

[12] P. Burrascano, S. Fiori, and M. Mongiardo, “A review of artificial neural networks applications in microwave computer-aided design,” Int. J. RF Microw. Comput.-Aided Eng., vol. 9, pp. 158–174, 1999.

[13] M. Vai and S. Prasad, “Neural networks in microwave circuit design—Beyond black-box models,” Int. J. RF Microw. Comput.-Aided Eng., vol. 9, pp. 187–197, 1999.

[14] P. M. Watson, C. Cho, and K. C. Gupta, “Electromagnetic-artificial neural network model for synthesis of physical dimensions for multilayer asymmetric coupled transmission structures,” Int. J. RF Microw. Comput.-Aided Eng., vol. 9, pp. 175–186, 1999.

[15] G. L. Creech, B. J. Paul, C. D. Lesniak, T. J. Jenkins, and M. C. Calcatera, “Artificial neural networks for fast and accurate EM-CAD of microwave circuits,” IEEE Trans. Microw. Theory Tech., vol. 45, no. 5, pp. 794–802, May 1997.

[16] J. E. Rayas-Sánchez, “EM-based optimization of microwave circuits using artificial neural networks: The state-of-the-art,” IEEE Trans. Microw. Theory Tech., vol. 52, no. 1, pp. 420–435, Jan. 2004.

[17] S. Bila, D. Baillargeat, M. Aubourg, S. Verdeyme, and P. Guillon, “A full electromagnetic CAD tool for microwave devices using a finite element method and neural networks,” Int. J. Numer. Modeling, vol. 13, pp. 167–180, 2000.

[18] Y. Lee and D. S. Filipovic, “ANN based electromagnetic models for the design of RF MEMS switches,” IEEE Microw. Wireless Compon. Lett., vol. 15, no. 11, pp. 823–825, Nov. 2005.

[19] M. G. Banciu, E. Ambikairajah, and R. Ramer, “Microstrip filter design using FDTD and neural networks,” Microw. Opt. Technol. Lett., vol. 34, no. 3, pp. 219–224, 2002.

[20] S. Goasguen and S. M. El-Ghazaly, “A coupled FDTD-artificial neural network technique for large-signal analysis of microwave circuits,” Int. J. RF Microw. Comput.-Aided Eng., vol. 12, pp. 25–36, 2002.

[21] H. J. Delgado and M. H. Thursby, “A novel neural network combined with FDTD for the synthesis of a printed dipole antenna,” IEEE Trans. Antennas Propag., vol. 53, no. 7, pp. 2231–2236, Jul. 2005.

[22] M. H. Bakr, J. W. Bandler, M. A. Ismail, J. E. Rayas-Sánchez, and Q.-J. Zhang, “Neural space mapping optimization for EM-based design,” IEEE Trans. Microw. Theory Tech., vol. 48, no. 12, pp. 2307–2315, Dec. 2000.

[23] P. M. Watson, G. L. Creech, and K. C. Gupta, “Knowledge based EM-ANN models for the design of wide bandwidth CPW patch/slot antennas,” in IEEE AP-S Int. Microw. Symp. Dig., Orlando, FL, 1999, pp. 2588–2591.

[24] F. Wang and Q.-J. Zhang, “Knowledge based neuromodels for microwave design,” IEEE Trans. Microw. Theory Tech., vol. 45, no. 12, pp. 2333–2343, Dec. 1997.

[25] J. W. Bandler, A. S. Mohamed, and M. H. Bakr, “TLM-based modeling and design exploiting space mapping,” IEEE Trans. Microw. Theory Tech., vol. 53, no. 9, pp. 2801–2811, Sep. 2005.

[26] P. M. Watson, C. Cho, and K. C. Gupta, “Electromagnetic-artificial neural network model for synthesis of physical dimensions for multilayer asymmetric coupled transmission structures,” Int. J. RF Microw. Comput.-Aided Eng., vol. 9, pp. 175–186, 1999.

[27] C. Yildiz and M. Turkmen, “Very accurate and simple CAD models based on neural networks for coplanar waveguide synthesis,” Int. J. RF Microw. Comput.-Aided Eng., vol. 15, pp. 218–224, 2005.

[28] K. Shirakawa, M. Shimiz, N. Okubo, and Y. Daido, “Structural determination of multilayered large-signal neural network HEMT model,” IEEE Trans. Microw. Theory Tech., vol. 46, no. 10, pp. 1367–1375, Oct. 1998.

[29] G. L. Creech and J. M. Zurada, “Neural network modeling of GaAs IC material and MESFET device characteristics,” Int. J. RF Microw. Comput.-Aided Eng., vol. 9, pp. 241–253, 1999.

[30] A. Veluswami, M. S. Nakhla, and Q.-J. Zhang, “The application of neural networks to EM-based simulation and optimization of interconnects in high-speed VLSI circuits,” IEEE Trans. Microw. Theory Tech., vol. 45, no. 5, pp. 712–723, May 1997.

[31] N. P. Somasiri, X. Chen, and A. A. Rezazadeh, “Neural network modeler for design optimization at multilayer patch antennas,” Proc. Inst. Elect. Eng.—Microw. Antennas Propag., vol. 151, pp. 514–518, Jun. 2004.

[32] S. Chakravarty, R. Mittra, and N. R. Williams, “On the application of the microgenetic algorithm to the design of broadband microwave absorbers comprising frequency selective surfaces embedded in multilayer dielectric media,” IEEE Trans. Microw. Theory Tech., vol. 49, no. 6, pp. 1050–1059, Jun. 2001.

[33] Z. Li, Y. E. Erdemli, J. L. Volakis, and P. Y. Papalambros, “Design optimization of conformal antennas by integrating stochastic algorithms with the hybrid finite element method,” IEEE Trans. Antennas Propag., vol. 50, no. 5, pp. 676–684, May 2002.

[34] E. S. Siah, M. Sasena, J. L. Volakis, and P. P. Papalambros, “Fast parameter optimization of large-scale electromagnetic objects using DIRECT with Kriging metamodeling,” IEEE Trans. Microw. Theory Tech., vol. 52, no. 1, pp. 276–285, Jan. 2004.

[35] V. A. Mechenova and V. V. Yakovlev, “Efficiency optimization for systems and components in microwave power engineering,” J. Microw. Power Electromagn. Energy, vol. 39, no. 1, pp. 15–29, 2004.

[36] A. H. Zaabab, Q.-J. Zhang, and M. S. Nakhla, “A neural network modeling approach to circuit optimization and statistical design,” IEEE Trans. Microw. Theory Tech., vol. 43, no. 6, pp. 1349–1358, Jun. 1995.

[37] T. S. Horng, C. C. Wang, and N. G. Alexopoulos, “Microstrip circuit design using neural networks,” in IEEE MTT-S Int. Microw. Symp. Dig., Atlanta, GA, 1993, pp. 413–416.

[38] Y. Harkouss, J. Rousset, H. Chehade, E. Ngoya, D. Barataud, and J. P. Teyssier, “The use of artificial neural networks in nonlinear microwave devices and circuits modeling: An application to telecommunication system design,” Int. J. RF Microw. Comput.-Aided Eng., vol. 9, pp. 198–215, 1999.

[39] G. L. Creech and J. M. Zurada, “Neural network modeling of GaAs IC material and MESFET device characteristics,” Int. J. RF Microw. Comput.-Aided Eng., vol. 9, pp. 241–253, 1999.

[40] S. Selleri, S. Manetti, and G. Pelosi, “Neural network applications in microwave device design,” Int. J. RF Microw. Comput.-Aided Eng., vol. 12, pp. 90–97, 2002.

[41] F. Wang, V. Devabhaktuni, and Q.-J. Zhang, “A hierarchical neural network approach to the development of a library of neural models for microwave design,” IEEE Trans. Microw. Theory Tech., vol. 46, no. 12, pp. 2391–2403, Dec. 1998.

[42] M. Joodaki and G. Kompa, “A systematic approach to a reliable neural model for pHEMT using different numbers of training data,” in IEEE MTT-S Int. Microw. Symp. Dig., Seattle, WA, 2002, pp. 1105–1108.

[43] E. K. Murphy and V. V. Yakovlev, “FDTD-backed RBF network technique for efficiency optimization of microwave structures,” in Proc. 9th AMPERE Microwave High Frequency Heating Conf., Loughborough, U.K., 2003, pp. 197–200.

[44] V. A. Mechenova, E. K. Murphy, and V. V. Yakovlev, “Advances in computer optimization of microwave heating systems,” in Proc. 38th Microw. Power Symp., Toronto, ON, Canada, 2004, pp. 87–91.

[45] V. V. Yakovlev, “Examination of contemporary electromagnetic software capable of modeling problems of microwave heating,” in Advances in Microwave and Radio Frequency Processing, M. Willert-Porada, Ed. Berlin, Germany: Springer-Verlag, 2006, pp. 178–190.

[46] F. Wang, V. K. Devabhaktuni, C. Xi, and Q. J. Zhang, “Neural network structures and training algorithms for RF and microwave applications,” Int. J. RF Microw. Comput.-Aided Eng., vol. 9, pp. 216–240, 1999.

[47] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1999.

[48] M. Orr, “Introduction to radial basis function networks,” Inst. Adaptive Neural Comput., Edinburgh Univ., Edinburgh, U.K., Tech. Rep., 1996.

[49] E. K. Murphy and V. V. Yakovlev, “RBF ANN optimization of systems represented by small FDTD data sets,” in Proc. 10th AMPERE Microw. High-Frequency Heating Conf., Modena, Italy, 2005, pp. 376–379.

[50] D. De Zutter, J. Sercu, T. Dhaene, J. De Geest, F. J. Demuynck, S. Hammadi, and C.-W. P. Huang, “Recent trends in the integration of circuit optimization and full-wave electromagnetic analysis,” IEEE Trans. Microw. Theory Tech., vol. 52, no. 1, pp. 245–256, Jan. 2004.

[51] J. Sercu and S. Hammadi, “Minimal-order multi-dimensional linear interpolation for a parameterized electromagnetic model database,” in IEEE MTT-S Int. Microw. Symp. Dig., Philadelphia, PA, 2003, pp. 295–298.

[52] V. K. Devabhaktuni, M. C. E. Yagoub, and Q.-J. Zhang, “A robust algorithm for automatic development of neural-network models for microwave applications,” IEEE Trans. Microw. Theory Tech., vol. 49, no. 12, pp. 2282–2291, Dec. 2001.

[53] M. Kirby, Geometric Data Analysis. New York: Wiley, 2001.

[54] M. Bazan, M. Aleksa, and S. Russenschuck, “An improved method using radial basis function neural networks to speed up optimization algorithms,” IEEE Trans. Magn., vol. 38, no. 2, pp. 1081–1084, Mar. 2002.

[55] R. Meredith, Engineers’ Handbook of Industrial Microwave Heating. London, U.K.: IEE Press, 1998.

[56] E. Chojnacki, T. Hays, J. Kirchgessner, H. Padamsee, M. Cole, and T. Schultheiss, “Design of a high average power waveguide window,” in Proc. Particle Accelerator Conf., Vancouver, BC, Canada, 1997, SRF-970508-05.

[57] R. Baskaran, “Double window configuration as a low cost microwave waveguide window for plasma applications,” Rev. Sci. Instrum., vol. 68, pp. 4424–4426, Dec. 1997.

[58] M. E. Hill, R. S. Callin, and D. H. Whittum, “High-power vacuum window in WR10,” IEEE Trans. Microw. Theory Tech., vol. 49, no. 5, pp. 994–995, May 2001.

[59] N. Bengtsson and P. Risman, “Dielectric properties of foods at 3 GHz as determined by a cavity perturbation technique. Measurement on food materials,” J. Microw. Power, vol. 6, no. 2, pp. 107–123, 1971.

[60] K.-L. Wong and W.-H. Hsu, “A broadband rectangular patch antenna with a pair of wide slits,” IEEE Trans. Antennas Propag., vol. 49, no. 9, pp. 1345–1347, Sep. 2001.

[61] P. Venkataraman, Applied Optimization with MATLAB Programming. New York: Wiley, 2002.

Ethan K. Murphy was born in Lowell, MA, in 1979. He received the M.Sc. degree in industrial mathematics from Worcester Polytechnic Institute, Worcester, MA, in 2003, and is currently working toward the Ph.D. degree in the field of inverse EM scattering at Colorado State University, Fort Collins.

He is currently with the Department of Mathematics, Colorado State University. He has authored several papers in refereed journals and conference proceedings. His research interests include inverse problems, MW optimization, and electric impedance tomography.

Mr. Murphy is a member of the American Mathematical Society and Pi Mu Epsilon. He was a recipient of the 2002 Worcester Polytechnic Institute Provost Award for the Best Major Qualifying Project.

Vadim V. Yakovlev (M’05) received the M.Sc. degree in radio physics and electronics from Saratov State University, Saratov, Russia, in 1979, and the Ph.D. degree in radio physics from the Institute of Radio Engineering and Electronics (IRE), Russian Academy of Sciences (RAS), Moscow, Russia, in 1991.

From 1984 to 1996, he was a Junior Research Scientist, Research Scientist, and Senior Research Scientist with the IRE RAS. In 1993, he was a Visiting Researcher with the Centre “Les Renardières,” Electricité de France. In 1996, he joined the Department of Mathematical Sciences, Worcester Polytechnic Institute (WPI), Worcester, MA, as a North Atlantic Treaty Organization (NATO)/National Science Foundation (NSF) Fellow. He has been with the Department of Mathematical Sciences, Worcester Polytechnic Institute, since then, where he is currently a Research Associate Professor. He is Head of the Industrial Microwave Modeling Group, which he established in 1999 as a division of WPI’s Center for Industrial Mathematics and Statistics. His research interests in computational electromagnetics include neural-network-based computation and optimization, noninvasive reconstruction of media parameters, coupled EM/thermal problems, MW power engineering, and broadband/multiband antennas. He has authored over 80 papers in refereed journals and conference proceedings. He is listed in International Who’s Who of Intellectuals (International Biographical Centre, 1998). He serves as a Reviewer for several journals.

Dr. Yakovlev is a member of the International Microwave Power Institute (IMPI). In 2004, he was inducted into the IMPI Technical Advisory Board. He is a member of the Association for Microwave Power in Europe for Research and Education (AMPERE) and a member of the Massachusetts Institute of Technology (MIT) Electromagnetics Academy. He is a member of the Program Committees of several conferences.