peter m. harper, rafiqul gani - semantic scholar · 1 a multi-step and multi-level approach for...



A Multi-step and Multi-level approach for Computer Aided Molecular Design

Peter M. Harper, Rafiqul Gani
CAPEC, Department of Chemical Engineering, Building 229, Technical University of Denmark, DK-2800 Lyngby, Denmark

Abstract
A general multi-step approach for setting up, solving and solution analysis of computer aided molecular design (CAMD) problems is presented. The approach differs from previous work within the field of CAMD since it also addresses the need for a computer aided problem formulation and result analysis. The problem formulation step incorporates a knowledge base for the identification and setup of the design criteria. Candidate compounds are identified using a multi-level generate and test CAMD solution algorithm capable of designing molecules having a high level of molecular detail. A post solution step using an Integrated Computer Aided System (ICAS) for result analysis and verification is included in the methodology.

Keywords: CAMD, separation processes, knowledge base, molecular design, solvent selection, substitution, group contribution, property prediction, ICAS

Introduction
The use of Computer Aided Molecular Design (CAMD) for the identification of compounds having specific physical and chemical properties has received substantial attention from the scientific and industrial communities in recent years. In particular, CAMD has been applied to solvent design for liquid-liquid separation and to entrainer, solvent replacement, refrigerant and polymer design and selection (Joback and Stephanopoulos (1990), Gani et al. (1991), Duvedi and Achenie (1996), Venkatasubramanian et al. (1995), Pretel et al. (1994), Raman and Maranas (1998), Vaidyanathan and El-Halwagi (1996)).

A number of different approaches have been proposed and applied to the problem of identifying compounds having desirable properties. The applied methods vary in their solution strategy, complexity and range of properties handled, but all are able to solve the fundamental problem (compound identification). The application of a set of algorithms that solves the problem of compound identification is, however, only one step in the solution procedure for CAMD, and it is necessary to address the issues of problem formulation and result analysis as an integral part of the design of molecules. The need for the additional steps arises from the fact that any CAMD solution procedure requires a set of target properties that the designed compound should match in order to identify molecules capable of performing the desired task. This set of design variables depends not only on the intended role of the designed compound but also on the range of available property prediction methods. Furthermore, the predictive nature of the methods used to assess the values of the design variables creates a need for a post-design step for verification and analysis of the results and for final identification of the most promising candidates. So far the systematic treatment of the needed pre- and post-design steps in CAMD has been given considerably less attention in the literature. In this work the concept of a CAMD methodology is expanded to a three-step procedure consisting of computer aided steps for problem formulation (pre-design step), compound identification (design step) and result analysis (post-design step).

Pre-design Step
The goal of using CAMD techniques is to identify compounds capable of performing a specific task or series of tasks. This is achieved by generating compounds matching a set of specifications with respect to compound type and physical and chemical properties. In order to identify compounds that are able to perform the needed tasks, it is important that the properties used as design parameters are the ones that matter for the intended use. If the wrong properties are used as design parameters, or the property values used as constraints lie outside the range suitable for the application in question, the compound identification step will fail to identify compounds suitable for the desired operation. It is therefore essential to have a systematic approach for the formulation of the design specifications.
In this work the formulation of the design specifications is performed in a computer aided pre-design step where the problem is identified and the design goals (desired compound types and properties) are formulated in order to provide input to the applied method of solution for compound identification.

CAMD Design Step
The employed CAMD solution method is of the “generate and test” type, where all feasible molecules are generated from a set of building blocks (as identified by the pre-design step) and subsequently tested against the design specifications to screen out the compounds not fulfilling the requirements. In order to avoid the so-called “combinatorial explosion” problem associated with CAMD algorithms in general (and the “generate and test” approach in particular), the multi-level approach of Harper et al. (1999) is employed where, through successive steps of generation and screening against the design criteria, the level of molecular detail is increased only for the most promising candidates.
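The scale of the “combinatorial explosion” can be illustrated with a short count. The sketch below is not part of the described method. It only counts how many unordered collections of groups exist when up to `n_max` groups are drawn from a set of building blocks, before any feasibility or property screening is applied. The function name and the set size of 15 are illustrative assumptions.

```python
from itertools import combinations_with_replacement
from math import comb


def count_group_vectors(n_groups: int, n_max: int) -> int:
    """Count unordered selections (multisets) of 1..n_max groups from n_groups group types."""
    return sum(comb(n + n_groups - 1, n) for n in range(1, n_max + 1))


# Brute-force cross-check of the closed-form count on a tiny case.
groups = ["CH3", "CH2", "OH"]
brute = sum(1 for n in range(1, 4) for _ in combinations_with_replacement(groups, n))
assert brute == count_group_vectors(3, 3)

# Growth of the unscreened search space for an assumed set of 15 building blocks.
for n_max in (4, 8, 12):
    print(n_max, count_group_vectors(15, n_max))
```

Each such collection of groups can also correspond to several isomers, so the number of distinct molecular structures is larger still. This is the reason for adding structural detail only to candidates that survive the earlier screening.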

Post-design Step
In the post-design step the answers from the solution procedure are analyzed with respect to properties and behavior that could not be part of the design considerations. Examples of such properties and behavior are price, availability, legislative restrictions and process-wide performance. This step involves using other prediction methods, database sources, engineering insight and, if possible, simulation in order to get an overview of the suitability and capability of the designed compound(s) for the particular purpose.
In order to facilitate the use of the available tools and data without the need for repeated conversion and transfer of data, the post-design step is carried out using the Integrated Computer Aided System (ICAS) as described by Gani et al. (1997) along with external databases.

Methodology
As outlined above, the general problem of finding compounds suitable for a particular purpose can be handled by applying a three-step approach. The three steps address the need for problem formulation, problem solution and solution analysis. The details of the steps are given in the following.

Pre-design Step
The process starts with the definition of the overall goal for the design process. The overall goal is the definition of the overall function the compound should fulfill along with specifications of additional requirements (compatibility with equipment, safety restrictions etc.). Based on the overall goal, the formulation in terms of design constraints is achieved using a computerized knowledge base where the properties of interest are identified on the basis of the operations involved and the knowledge and predictive capability available for the system and the needed properties. The role of the knowledge base is to assist and teach the CAMD user in the formulation of the design constraints and goals most suitable for obtaining a solution using the CAMD solution algorithm.

Even though the knowledge base is capable of assisting the user in the formulation of the design problem, it cannot formulate the complete specification due to the presence of external or conflicting considerations. It is therefore an interactive process where the user has the freedom to modify the design specifications as well as to add constraints.

Knowledge base structure
The information contained in the knowledge base is ordered in a hierarchical system with the application types at the top and the properties and property ranges at the bottom. Figure 1 shows a section of the information tree stored in the knowledge base.

Figure 1 The knowledge base information tree (only the branch for extractive distillation is fully expanded)

As can be seen in figure 1, the property entries in the information tree have three branches:

Essential properties. The properties in this branch are essential for the function of the compound and are most often related to the phase behavior of the compound (e.g. the requirement that a compound be liquid at operational temperatures creates an essential requirement that the boiling point is above and the melting point below the operational temperature).

Desirable properties. Properties related to the performance or efficiency of a compound in an application. As a rule of thumb no fixed lower or upper limits can be set for the desirable properties, but the aim is to have the highest or lowest value possible. An example of a desirable property is the selectivity of a solvent used in liquid-liquid extraction towards a particular solute.

EH&S properties. Environment, health and safety properties related to the operation or use. The EH&S properties and specifications are mostly of the “desirable” type but can also be expressed as fixed constraints, especially in cases where legislative requirements demand it.

The theoretical basis for the property dependent knowledge base, and in particular for the identification of property ranges, is the concept of binary property ratios as defined by Jaksland (1996) for use in process synthesis. Here the concept is used to identify the conditions that need to be present for a particular operation to be feasible, rather than to analyze existing conditions in order to identify which separation methods are feasible and efficient. In the knowledge base the properties important to the function in a particular application are stored along with the relative property differences needed (in the case of a driving force based operation). By having a knowledge base relating the properties and property differences associated with efficiency and functionality to unit operations, the needed design criteria can be identified along with the appropriate property ranges. The types of operations stored in the knowledge base include solvent (mass separation agent) based operations, common separation operations and process fluid design entries. The entire knowledge base is stored in binary form as part of a computerized problem setup tool.
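The hierarchy described above (application type at the top, operations below, and the three property branches at the bottom) can be pictured as a nested mapping. The following is a minimal sketch and not the actual storage format, which the text only describes as binary. The class name, the dictionary layout and the placement of simple distillation under "common separation operations" are assumptions. The property names are taken from Example 1 later in the text.

```python
from dataclasses import dataclass, field


@dataclass
class OperationEntry:
    """Leaf of the information tree: the three property branches of one operation."""
    essential: list[str] = field(default_factory=list)
    desirable: list[str] = field(default_factory=list)
    ehs: list[str] = field(default_factory=list)


# Application type -> operation -> property branches (hypothetical layout).
knowledge_base: dict[str, dict[str, OperationEntry]] = {
    "solvent based operations": {
        "liquid-liquid extraction": OperationEntry(
            essential=["boiling point", "melting point", "liquid density",
                       "liquid-liquid phase split"],
            desirable=["solvent loss", "separation factor", "solvent capacity",
                       "selectivity"],
            ehs=["open cup flash point", "reactivity"],
        ),
    },
    "common separation operations": {
        "simple distillation": OperationEntry(
            essential=["boiling point", "vapor pressure", "azeotrope formation"],
            desirable=["homogeneity", "enthalpy of vaporization"],
        ),
    },
}


def lookup(application: str, operation: str) -> OperationEntry:
    """Walk the tree from the application type down to the property branches."""
    return knowledge_base[application][operation]


entry = lookup("solvent based operations", "liquid-liquid extraction")
print(entry.essential)
```

A real knowledge base would also store the property ranges or ratios attached to each entry. They are left out here to keep the sketch short.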

Problem formulation algorithm
The algorithm used to identify the properties for the design step is a multi-step approach using different levels of information. The algorithm is represented below in its major steps and in figure 1 as the possible selection tree.

• List the unit operations to be considered.
• For each unit operation:
  • Enumerate the known properties of the other compounds the designed compound is to be used with.
  • Obtain the approximate operational ranges of temperature and pressure along with the approximate composition ranges for the compounds in the system.
  • Identify the appropriate thermodynamic models available for phase behavior prediction for the mixtures. This identification is done using the Thermodynamic Model Selection (TMS) system described and developed by Gani and O’Connell (1989).
  • Extract the list of relevant pure and mixture properties from the knowledge base for the unit operation. If the thermodynamic model identification procedure (previous point) did not yield any viable models, replace the mixture properties required with pure component property equivalents from the knowledge base.
  • If any of the design properties require information about the other compounds in the system in order to set up the target values, compare the requirements with the list of available properties previously obtained. If some requirements cannot be fulfilled, the properties are removed from the set of design criteria.
• Create a superset of properties included in the design criteria by combining the sets of identified properties for each of the uses and unit operations.
• For each of the properties in the superset, create the target ranges (the design constraints) by combining the property ranges identified for each of the unit operations and uses. The identified property ranges represent the design criteria satisfying the requirements of all the operations examined.
• List the methods available for predicting the required properties.
• List the compound types that can be handled by the property prediction methods and the predictive thermodynamic models.
• From the list of compound types for which property prediction methods exist, create the list of building blocks used to create/assemble the compounds in the design step.
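The superset and range-combination steps above can be sketched in a few lines. The sketch assumes that "combining" the ranges means taking their intersection, which is one reading of the requirement that the final criteria satisfy all operations examined. The function name and data layout are hypothetical. The numerical bounds are those that appear in the combined criteria of Example 1, and their attribution to individual unit operations is an assumption made for illustration.

```python
import math

Range = tuple[float, float]  # (lower bound, upper bound)


def combine_ranges(per_operation: dict[str, dict[str, Range]]) -> dict[str, Range]:
    """Intersect the ranges requested for each property by every unit operation."""
    combined: dict[str, Range] = {}
    for ranges in per_operation.values():
        for prop, (low, high) in ranges.items():
            cur_low, cur_high = combined.get(prop, (-math.inf, math.inf))
            combined[prop] = (max(cur_low, low), min(cur_high, high))
    # An empty intersection means the operations place conflicting demands.
    infeasible = [p for p, (low, high) in combined.items() if low > high]
    if infeasible:
        raise ValueError(f"conflicting requirements for: {infeasible}")
    return combined


requirements = {
    "liquid-liquid extraction": {
        "boiling point [K]": (310.0, math.inf),
        "melting point [K]": (-math.inf, 285.0),
    },
    "simple distillation": {
        "boiling point [K]": (-math.inf, 450.0),
    },
}
print(combine_ranges(requirements))
```

The keys of the returned dictionary form the superset of properties, and each value is the target range used as a design constraint.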

Apart from the unit/property based knowledge base there is a computer aided guide aimed at assisting in the selection of the compound types most suitable for different applications. The guide includes considerations such as the negative environmental performance of aromatic compounds and the risk of polymerization for compounds containing triple bonds. This guide can be used to exclude some compound types from the set identified using the property based knowledge base.

CAMD Design Step
When designing compounds using a CAMD algorithm, a set of fragments (groups) is typically used to assemble the compounds. This can be done either using a simple approach (generation of a vector of groups describing one or more compounds) or more rigorously by connecting the fragments to form molecular structures. The use of predefined groups as building blocks for the generation of molecular representations serves a dual purpose, since the same groups can be used for the prediction of the target properties using group contribution methods. For this approach to work, the collection of groups (the group set) needs to be one for which methods for property prediction already exist, or such methods need to be developed.

Although some property prediction methods capable of predicting a range of properties using a general group set are available (Joback (1984), Constantinou and Gani (1994), van Krevelen (1990), Lydersen (1955), Fedors (1982)), in their present versions they are unable to estimate all the properties that are desirable. This is especially true for environmentally related properties.

Since direct translation between different property prediction methods is rarely possible, it is desirable to create a methodology enabling the simultaneous use of multiple group sets within a CAMD framework. In this work the multi-level CAMD methodology of Harper et al. (1999) and Harper and Gani (1999) is used for the CAMD design step since it allows for the use of multiple prediction methods and is able to generate detailed molecular descriptions.

The employed method consists of four levels. Each level has a generation and a screening step. In the generation step the molecular structures are created, while in the screening step the properties of the generated compounds are predicted and compared against the design specifications. The first two levels operate on molecular descriptions based on groups while the latter two rely on atomic representations. The individual levels have the following characteristics:

Level 1
In the first level, a group contribution approach (generation of group vectors) is used with group contribution property prediction methods. Group vectors are generated from the set of building blocks identified in the pre-design step. The generation step does not suffer from the so-called "combinatorial explosion" as it is controlled by rules regarding the feasibility of a compound consisting of a given set of groups (Gani et al. 1991). Only the candidate molecules fulfilling all the requirements are allowed to progress to the next level.
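A minimal sketch of rule-controlled group vector generation is given below. It does not reproduce the feasibility rules of Gani et al. (1991). As a stand-in it uses the simple valence condition for acyclic molecules: n groups joined without rings have exactly n - 1 bonds, so the group valences must sum to 2(n - 1). The function names and the small group subset are assumptions.

```python
from itertools import combinations_with_replacement

# Number of free attachment points per group (small illustrative subset).
VALENCE = {"CH3": 1, "CH2": 2, "CH": 3, "C": 4, "OH": 1}


def is_feasible_acyclic(vector: tuple[str, ...]) -> bool:
    """Valence rule: sum of valences equals 2(n - 1), i.e. sum of (2 - valence) equals 2."""
    return len(vector) >= 2 and sum(2 - VALENCE[g] for g in vector) == 2


def generate_group_vectors(groups: list[str], max_groups: int):
    """Yield every feasible unordered collection of 2..max_groups groups."""
    for size in range(2, max_groups + 1):
        for vector in combinations_with_replacement(sorted(groups), size):
            if is_feasible_acyclic(vector):
                yield vector


for vector in generate_group_vectors(["CH3", "CH2", "OH"], 3):
    print(vector)
```

The valence rule alone still admits chemically questionable vectors such as two OH groups, which is why a practical implementation needs additional feasibility rules before property screening.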

Level 2
At the second level, corrective terms to the property predictions are introduced. These terms are based on identifying substructures in molecules. At this level molecular structures are generated using the output from the first level as a starting point, and the second order groups are identified using a pattern matching algorithm. The generation step of this level is a tree building process where all the possible legal combinations of the groups in each group vector are generated.

Level 3
In the third level the molecular structures are converted into an atomic representation by expanding the group representations. The conversion into an atomic representation enables the use of molecular encoding techniques (Harper & Gani, 1999). The use of molecular encoding techniques makes it possible to re-describe the candidate compounds using other group contribution schemes, thereby further broadening the range of properties that can be estimated as well as giving the opportunity to estimate the same properties using different methods for comparison.

Level 4
In the fourth level the atomic representations from level three are further refined to 3-dimensional representations. This conversion can create further isomer variations (cis/trans and R/S) and enables the use of molecular modeling techniques as well as creating molecules ready for structural database searches in the post-design step.
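The common pattern of the four levels (generate from the survivors of the previous level, then screen) can be sketched as a simple loop. The code below only shows this control flow. The two placeholder levels are assumptions that stand in for real structure generation and property prediction: the first passes vectors through and keeps short ones, the second enumerates orderings of the groups as a stand-in for structure generation.

```python
from itertools import permutations
from typing import Callable, Iterable

# (level name, generation step, screening step)
Level = tuple[str, Callable[[tuple], Iterable[tuple]], Callable[[tuple], bool]]


def run_levels(seeds: Iterable[tuple], levels: list[Level]):
    """Apply each level's generation step, then keep only candidates passing its screen."""
    survivors = list(seeds)
    history = []
    for name, generate, screen in levels:
        generated = [cand for parent in survivors for cand in generate(parent)]
        survivors = [cand for cand in generated if screen(cand)]
        history.append((name, len(generated), len(survivors)))
    return survivors, history


# Placeholder levels: real levels would build structures and predict properties.
levels: list[Level] = [
    ("level 1", lambda v: [v], lambda v: len(v) <= 4),
    ("level 2", lambda v: sorted(set(permutations(v))), lambda s: s[0] == "CH3"),
]
seeds = [("CH3", "CH2", "OH"), ("CH3", "CH2", "CH2", "CH2", "OH")]
survivors, history = run_levels(seeds, levels)
print(history)
```

Because each level only expands the candidates that passed the previous screen, the more expensive representations are built for a small fraction of the original search space.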

Post-design Step
After the CAMD design step the results need to be analyzed and verified. Analysis and verification are needed because a number of factors cannot be assessed using the prediction techniques used in the CAMD design step.
The post-design analysis includes structural searches of supplier databases in order to determine if the identified candidates are commercially available in the quantities and qualities needed and at a financially viable price. In order to perform structural queries in databases, the databases must support this method of lookup. If this is not possible, a less powerful but viable approach is to link the structure to its corresponding CAS registry number (if it exists) and subsequently search databases using this number rather than the structure. Database searches are also valuable sources of experimental and environmental data used to verify the results of the predictions and to obtain environmental information that is impossible to predict (e.g. legislative requirements and government classifications and guidelines) but important when choosing the final candidates.
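When databases are searched by CAS registry number, a mistyped number silently returns nothing, so a check-digit test before the lookup is a cheap safeguard. The sketch below is not part of the described methodology. It uses the standard CAS check-digit rule: the digits before the check digit are weighted 1, 2, 3, ... from right to left, and the sum modulo 10 must equal the check digit. The function name is an assumption.

```python
import re


def is_valid_cas(cas: str) -> bool:
    """Return True if the string has CAS format and a correct check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weight digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(weight * int(d) for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


for number in ("108-88-3", "108-95-2", "71-43-2", "108-88-4"):
    print(number, is_valid_cas(number))
```

The first three numbers are the registry numbers of toluene, phenol and benzene. The last one has a deliberately wrong check digit.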

Since estimation of environmental properties is considerably more complex than estimation of physico-chemical properties and often requires knowledge and judgement in order to apply the best estimation technique, the consideration of all environmental properties is not part of the CAMD design step. Instead, the environmental properties are assessed during the post-design step. Here the advantage of applying CAMD is that all the candidate chemicals fulfill the operational requirements.

Other needed tools are process simulation packages, mixture analysis tools and molecular modeling tools for validation of predicted properties or post-generation screening for properties not handled by the CAMD algorithm.
As described, examining the identified candidates in a broader scope than the property oriented functionality criteria is a process involving a broad range of tools. In this work the Integrated Computer Aided System (ICAS) (Gani et al. 1997) framework has been used since it provides an integrated environment for performing mixture analysis (residue curve maps and phase diagram calculations), process simulation (steady state and dynamic), property prediction and database lookups.

Application examples
In the following, two examples illustrating the application of the proposed multi-step CAMD procedure are presented. The first example revisits the toluene replacement problem of Harper et al. (1999) with the emphasis on applying the pre-design step to the original problem formulation, while the second example (design of solvent for solid phenol) illustrates how the design specifications depend on the property prediction methods available and shows how the post-design step can influence the ranking of alternatives.

Example 1: Design of replacement solvent for liquid-liquid extraction of phenol from wastewater.
A wastewater stream contains 7% w/w of phenol. Before discharge to the environment the phenol needs to be removed. Previously this has been achieved by liquid-liquid extraction of phenol using toluene as solvent. Because of tightened environmental requirements on the use and discharge of toluene this is no longer possible and an alternative solvent has to be found. In the existing process the solvent (toluene) is regenerated using simple distillation and the extraction process is carried out as a countercurrent series of mixer-settler stages (figure 2). The replacement solvent should be able to perform its function using the existing equipment.

Figure 2 Flowsheet for the phenol removal problem.

Pre-design Step
The previously described problem formulation algorithm is applied to the replacement problem:

Two unit operations are part of the problem formulation and need to be considered:
1. The removal of phenol from water by liquid-liquid extraction.
2. The recovery of solvent by separation of phenol and water using distillation.

The results from the unit-specific points in the algorithm are:

Liquid-liquid extraction:
The known properties of water and phenol are: boiling point, melting point, heat of fusion, solubility parameter as well as density and vapor pressure as a function of temperature.

The operation is carried out at ambient temperature (approximated as 298 K) and atmospheric pressure. The composition range is from near 0 to 7% w/w phenol in water.

The available predictive thermodynamic models capable of handling the system are UNIFAC (VLE parameter set) and UNIFAC (LLE parameter set).

The properties relevant to liquid-liquid solvent design are:
Essential properties:
• Boiling point must be higher than the operating temperature.
• Melting point must be lower than the operating temperature.
• Liquid density must be different from co-solvent. The ratio of the densities at the operational temperature must be at least 1.05.
• The existence of a liquid-liquid phase split.
Desirable properties:
• Solvent loss (mixture property) must be as low as possible.
• Separation factor (mixture property) must be as high as possible.
• Solvent capacity (mixture property) must be as high as possible.
• Selectivity (mixture property) must be as high as possible.
EH&S properties:
• Open cup flash point of compound should be above the operating temperature.
• Compound should not react with air or other compounds in the mixture.

Recovery of solvent by simple distillation:
The known properties of phenol are: boiling point, melting point, heat of fusion, solubility parameter as well as density and vapor pressure as a function of temperature.

The distillation is to operate at atmospheric pressure and in a temperature range between the normal boiling points of phenol and the solvent.

The recommended thermodynamic model is UNIFAC (VLE parameter set).

The properties relevant to simple distillation are:
Essential properties:
• Difference in boiling point. The ratio of the boiling points for the compounds to be separated must be higher than 1.02.
• Difference in vapor pressure at operational temperature. The ratio of the vapor pressures must be higher than 1.5.
• Absence of azeotrope between the compounds that are to be separated.
Desirable properties:
• Homogeneous system at all compositions.
• Low enthalpy of vaporization.
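The ratio-based essential requirements quoted for the two unit operations (density ratio of at least 1.05, boiling point ratio above 1.02, vapor pressure ratio above 1.5) lend themselves to a direct check. The sketch below assumes that each ratio is taken as the larger value divided by the smaller one, which the text does not state explicitly. The function names and the input values in the usage lines are illustrative only.

```python
def property_ratio(a: float, b: float) -> float:
    """Ratio of the larger to the smaller value (both must be positive), always >= 1."""
    return max(a, b) / min(a, b)


def distillation_essentials_met(tb_1: float, tb_2: float,
                                pvap_1: float, pvap_2: float,
                                forms_azeotrope: bool) -> bool:
    """Check the three essential requirements listed for simple distillation."""
    return (
        property_ratio(tb_1, tb_2) > 1.02
        and property_ratio(pvap_1, pvap_2) > 1.5
        and not forms_azeotrope
    )


def density_split_ok(rho_solvent: float, rho_cosolvent: float) -> bool:
    """Check the density requirement listed for liquid-liquid extraction."""
    return property_ratio(rho_solvent, rho_cosolvent) >= 1.05


# Illustrative inputs only (boiling points in K, vapor pressures in any common unit).
print(distillation_essentials_met(380.0, 455.0, 3.0, 1.0, forms_azeotrope=False))
print(density_split_ok(0.95, 1.0))
```

With a co-solvent density close to 1, the 1.05 ratio translates into the upper bound of roughly 0.95 that appears in the combined design criteria.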

Combined design criteria:
By combining the property requirements identified using the knowledge base and applying the knowledge regarding the properties of water and phenol together with the specified operational temperatures, the following design criteria can be set up (note that the existing process uses a solvent more volatile than phenol and that the design criteria therefore must also reflect this, even though searching for a compound heavier than phenol is also a viable solution):

Essential properties:
310 K < Boiling point < 450 K
Melting point < 285 K
Liquid density < 0.95 g/cm3
Solvent must not form azeotrope with phenol.
Two liquid phases must form when mixed with phenol and water mixture.
(The vapor pressure requirement can be eliminated since the column is operating at atmospheric pressure)

Desirable properties:
The enthalpy of vaporization should be as low as possible.
The solvent properties (selectivity, solvent power and capacity) with respect to phenol should be as high as possible while the solvent loss to the other phase should be as low as possible.

EH&S properties:
Open cup flash temperature > 310 K
The replacement solvent should have better environmental characteristics than toluene (this criterion originates from the initial problem formulation).

It is possible to add extra properties to the design formulation if desired. The benefit of adding property requirements other than those absolutely necessary is that it makes the identification procedure less vulnerable to prediction uncertainty (by addressing the same requirement with two separate specifications) and allows some of the requirements to be loosened. As an example, it is possible to address the concern of loss of the solvent to the water stream by specifying that the octanol/water partition coefficient should be above a certain value. Another possibility is to retain the vapor pressure constraint instead of eliminating it.
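The numerical part of the combined criteria can be written as a small screening function. This is a sketch under stated assumptions: the candidate names and property values are hypothetical and are not predictions from any method named in the text, the density is assumed to be in g/cm3, and the mixture requirements (azeotrope, phase split) are left out because they need a thermodynamic model.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    boiling_point: float   # K
    melting_point: float   # K
    liquid_density: float  # g/cm3 (unit assumed)
    flash_point: float     # K, open cup


# Fixed constraints from the combined design criteria of Example 1.
CONSTRAINTS = {
    "boiling point": lambda c: 310.0 < c.boiling_point < 450.0,
    "melting point": lambda c: c.melting_point < 285.0,
    "liquid density": lambda c: c.liquid_density < 0.95,
    "flash point": lambda c: c.flash_point > 310.0,
}


def violations(candidate: Candidate) -> list[str]:
    """Return the names of all constraints the candidate fails."""
    return [name for name, ok in CONSTRAINTS.items() if not ok(candidate)]


candidates = [
    Candidate("hypothetical A", 400.0, 200.0, 0.88, 320.0),
    Candidate("hypothetical B", 470.0, 290.0, 0.90, 350.0),
]
for c in candidates:
    print(c.name, violations(c) or "passes")
```

Reporting the violated constraints rather than a single pass/fail flag makes it easier to see which requirement could be loosened when an extra, overlapping specification is added.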

Methods available for predicting the required properties:
The prediction of boiling point, melting point and enthalpy of vaporization can be achieved using the Constantinou and Gani group contribution method.
The prediction of flash point is possible using the Shebeko et al. (1984) bond contribution method.
In order to simplify the calculations, the common thermodynamic model UNIFAC (VLE parameter set) is used for the entire calculation.

All of the available methods can handle a broad range of compound types and it is therefore the environmental considerations that control the selection of the compound types to consider in the CAMD design step. Based on environmental considerations only acyclic hydrocarbons are to be considered. Furthermore, since the solvent is to be recycled in a loop involving repeated heating (the distillation column), only saturated carbon chains are to be considered.

Based on the above, the following set of building blocks is to be used in the CAMD design step: CH3, CH2, CH, C, OH, CH3CO, CH2CO, CHO, CH3COO, CH2COO, HCOO, CH3O, CH2O, CH-O, COOH.

CAMD design and post-design stepsUsing the generated design specifications and set of building blocks with the CAMDsolution algorithm 207 candidates are obtained if no upper or lower bound are set onthe desirable properties and only the flash point and essential property requirementsand constraint are met. In practical applications it is beneficial to add bounds on the

Page 11: Peter M. Harper, Rafiqul Gani - Semantic Scholar · 1 A Multi-step and Multi-level approach for Computer Aided Molecular Design Peter M. Harper, Rafiqul Gani CAPEC, Department of

11

desirable properties as in order to screen out the least promising candidates. Thedetails of the generation (using bounds on the desired properties) are given in Harperet al. (1999).

The post-design step related to this problem formulation includes searching databases for environmental data, simulation of the flowsheet using the ICAS simulation engine and verification of the liquid-liquid phase behaviour of the ternary mixtures of solvent, water and phenol. The details and the findings of the post-design step are omitted here for brevity but can be obtained from the authors on request.

Example 2: Design of benzene replacement
Benzene is a known solvent for dissolving a wide range of solid compounds. However, because of the serious environmental and health effects associated with benzene it is desirable to find alternative solvents. In most cases the solutes dissolved by benzene are complex chemicals (e.g. pharmaceuticals) for which very few predictive thermodynamic models exist. The lack of thermodynamic models makes it impossible to assess the solvent properties of a candidate using mixture calculations (calculation of SLE). There are, however, alternative formulations making it possible to circumvent this problem. In the following both situations are illustrated using the example of finding a replacement for benzene for dissolving solid phenol at 298 K. Although phenol cannot be considered to be a complex compound it has been chosen for illustrative purposes (thermodynamic models for phenol do exist).

Pre-design step

Situation 1: Use of mixture properties
Only one operation is to be considered: dissolving solid phenol at 298 K using a liquid.

The known properties of phenol are: boiling point, melting point, heat of fusion, solubility parameter as well as density and vapor pressure as a function of temperature.

The operational temperature is 298 K and the operational pressure 1 bar.

The available thermodynamic models are UNIFAC (VLE parameter set) and Modified-UNIFAC (Lyngby).

The relevant properties are:
Essential properties:

• Boiling point must be higher than the operating temperature (310 K < Boiling point).

• Melting point must be lower than the operating temperature (Melting point < 285 K).

• Solvent must not form an azeotrope with phenol.

Desirable properties:

• Mole fraction of phenol in the liquid in equilibrium with solid phenol must be as high as possible.

EH&S properties:

• Octanol/water partition coefficient (logP) should be as low as possible (logP < 2).
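The desirable property above is a solid-liquid equilibrium (SLE) quantity. A minimal sketch of how it could be estimated uses the standard simplified SLE relation ln(x·γ) = -(ΔHfus/R)(1/T - 1/Tm), which neglects heat-capacity terms. The phenol data in the example (Tm ≈ 314 K, ΔHfus ≈ 11.5 kJ/mol) are approximate literature values assumed here, not taken from this paper. The activity coefficient γ is passed in as a fixed number, whereas a real calculation would obtain it from UNIFAC and iterate, since γ depends on composition.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def sle_mole_fraction(T, Tm, dHfus, gamma=1.0):
    """Solute mole fraction at saturation from the simplified SLE equation.

    T, Tm in K; dHfus in J/mol; gamma is the solute activity coefficient
    (gamma = 1 gives the ideal solubility).
    """
    return math.exp(-(dHfus / R) * (1.0 / T - 1.0 / Tm)) / gamma


if __name__ == "__main__":
    # Assumed, approximate phenol data (not from the paper).
    x_ideal = sle_mole_fraction(T=298.0, Tm=314.0, dHfus=11500.0)
    print(f"ideal phenol mole fraction at 298 K: {x_ideal:.3f}")
```

A solvent with γ below one for phenol gives a solubility above the ideal value, and a solvent with γ above one gives a lower solubility.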

The prediction of boiling point, logP and melting point can be achieved using the Constantinou and Gani group contribution method.
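For orientation, the general form of a group contribution estimate of this type can be written as below. This is a sketch of the functional form only. The symbols are notation assumed here: N_i and M_j are the occurrences of first- and second-order groups, C_i and D_j their contributions, W is 0 or 1 depending on whether second-order groups are used, and t_{b0}, t_{m0} are fitted constants. Numerical parameter values should be taken from Constantinou and Gani (1994).

```latex
f(X) = \sum_i N_i C_i + W \sum_j M_j D_j ,
\qquad
f(T_b) = \exp\!\left( \frac{T_b}{t_{b0}} \right),
\qquad
f(T_m) = \exp\!\left( \frac{T_m}{t_{m0}} \right)
```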

As with the previous example the environmental considerations control the range of compound types considered. If acyclic alcohols, ketones, aldehydes and ethers are considered the building blocks available to the generation algorithm are: CH3, CH2, CH, C, OH, CH3CO, CH2CO, CHO, CH3O, CH2O, CH-O.

Situation 2: Absence of suitable thermodynamic models
If one assumes that the thermodynamic model identification did not yield any results, the criteria using mixture properties must, if possible, be expressed using pure component properties.

The criteria that are based on mixture calculations are the azeotrope specification and the calculation of SLE. The azeotrope specification is impossible to express as a pure component property criterion and is therefore eliminated from the design criteria. It is however possible to address this concern as part of the post-design step using the qualitative azeotrope prediction method of Berg (1969). Since equilibrium calculations are unavailable the solubility criterion must be reformulated using a solubility parameter approach. From the theory of solubility parameters it is known that compounds having similar solubility parameters will dissolve each other. It is therefore possible to express the solubility criterion as:

• Solubility parameter of designed compound must be close to that of phenol (24.6 MPa^(1/2)).
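A minimal sketch of such a criterion is given below. It uses the standard Hildebrand definition δ = sqrt((ΔHvap - RT)/Vm) and accepts a candidate if δ lies within a tolerance of the phenol value. The tolerance of 2 MPa^(1/2) is a hypothetical choice, and the benzene data in the example (ΔHvap ≈ 33.9 kJ/mol, Vm ≈ 89.4 cm3/mol) are approximate literature values assumed here only to check the formula. In a CAMD setting δ would be predicted by a group contribution method.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)
DELTA_PHENOL = 24.6  # MPa^0.5, value quoted in the text


def hildebrand_parameter(dHvap, Vm, T=298.15):
    """Hildebrand solubility parameter in MPa^0.5.

    dHvap in J/mol, molar volume Vm in m3/mol, T in K.
    """
    delta_pa = math.sqrt((dHvap - R * T) / Vm)  # Pa^0.5
    return delta_pa / 1000.0  # Pa^0.5 -> MPa^0.5


def is_close_to_phenol(delta, target=DELTA_PHENOL, tol=2.0):
    """Hypothetical acceptance test: |delta - target| <= tol (MPa^0.5)."""
    return abs(delta - target) <= tol


if __name__ == "__main__":
    # Assumed, approximate benzene data (not from the paper).
    delta_benzene = hildebrand_parameter(dHvap=33900.0, Vm=89.4e-6)
    print(f"benzene: {delta_benzene:.1f} MPa^0.5")
    print(is_close_to_phenol(24.0))
```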

CAMD design step
For reasons of brevity the details of the CAMD design step are not given here. Both design criteria result in the identification of 63 candidates. Two compounds appear in both solutions and are selected for further analysis in the post-design step. The compounds are shown in Table 1 along with selected properties.

Compound                   CAS No.      LogP    Phenol mole fraction
Methyl sec-Butyl Ether     6795-87-5    1.56    0.759
2,2-Dimethyl-1-propanal    630-19-3     1.63    0.785

Table 1 Candidates selected for further analysis

Post-design step
As an example of the use of the post-design step it is assumed that the process equipment used for the solvent-phenol interaction is to be cleaned using water. In order to regenerate the wash water easily using a stripping unit the solvent should have a low affinity towards water. One property linked to water affinity is the octanol/water partition coefficient and the best candidate based on this property would be the aldehyde. However, since the octanol/water partition coefficient is a distribution property it is not only a function of water affinity and additional information should be sought. A property directly related to water affinity is the “COSMO solvation energy in water” (Camsoft, 1999) which can be calculated using the MOPAC semi-empirical molecular modeling program. By using MOPAC to predict the solvation energy the following results are obtained:

Compound                   Solvation energy
Methyl sec-Butyl Ether     -3.863 kcal/mol
2,2-Dimethyl-1-propanal    -7.081 kcal/mol

Table 2 Estimated solvation energies in water

As seen in Table 2, the aldehyde has a much lower solvation energy than the ether, indicating (contrary to the initial estimates) that a stripping procedure would be more difficult for the aldehyde than for the ether.
It can be seen that the post-design step, because of its ability to use complex calculation tools, serves a purpose by enhancing the level of knowledge beyond the predicted properties from the CAMD design step.
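The reversal of the ranking can be reproduced with a few lines, using only the values reported in Tables 1 and 2. The selection rules are an interpretation of the text: a higher logP and a less negative solvation energy in water are both taken to indicate lower water affinity.

```python
# Values from Table 1 (logP) and Table 2 (solvation energy in water).
candidates = {
    "Methyl sec-Butyl Ether": {"logP": 1.56, "solvation_kcal_mol": -3.863},
    "2,2-Dimethyl-1-propanal": {"logP": 1.63, "solvation_kcal_mol": -7.081},
}


def best_by(candidates, key):
    """Return the candidate with the largest value of the given property."""
    return max(candidates, key=lambda name: candidates[name][key])


# Lower water affinity <=> higher logP.
best_by_logp = best_by(candidates, "logP")
# Lower water affinity <=> less negative solvation energy in water.
best_by_solvation = best_by(candidates, "solvation_kcal_mol")

if __name__ == "__main__":
    print("preferred by logP:", best_by_logp)
    print("preferred by solvation energy:", best_by_solvation)
```

The two indicators select different compounds, which is the reason the additional post-design calculation is informative.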

Conclusions
A general multi-step procedure for setting up, solving and analyzing CAMD problems has been proposed along with definitions of the roles of the different steps and the tools needed in each step. The proposed procedure goes beyond the traditional CAMD solution methodology since it contains pre-design and post-design steps. These address the need for an assisted problem setup ensuring a feasible design formulation that can be handled by the available prediction methods, and for an analysis of the identified candidates in order to examine properties not possible to handle by predictive techniques or to consolidate the knowledge about the behavior of the candidates in the systems they are intended for. A systematic computer aided methodology for formulating the design specifications can not only improve the obtained results by targeting the most important properties but can also ensure an efficient solution (with respect to computational complexity) by limiting the number of examined compound types to those where the prediction of properties is possible.
The presented CAMD methodology is a structured approach consisting of all three main steps, integrating a knowledge base, a powerful multi-level CAMD solution procedure and a series of integrated tools for performing the analysis step.

References
Berg L., 1969, “Selecting the agent for distillation”, Chemical Engineering Progress, 65 (9), 52-57

Constantinou, L. and R. Gani, 1994, “New Group-Contribution Method for the Estimation of Properties of Pure Compounds”, AIChE J., 40 (10), 1697-1710

Duvedi, A. P. and L. E. K. Achenie, 1996, “Designing Environmentally Safe Refrigerants using Mathematical Programming”, Chemical Engineering Science, 51 (15), 3727-3739

Fedors, R. F., 1982, Chem. Eng. Commun., 16, 149


Gani R. and J. P. O’Connell, 1989, “A Knowledge Based System For The Selection Of Thermodynamic Models”, Computers chem. Engng., 13 (4-5), 397-404

Gani R., G. Hytoft, C. Jaksland and A. K. Jensen, 1997, “An integrated computer aided system for integrated design of chemical processes”, Computers chem. Engng., 21 (10), 1135-1146

Gani, R., B. Nielsen and Aa. Fredenslund, 1991, “A Group Contribution Approach to Computer Aided Molecular Design”, AIChE J., 37 (9), 1318-1332

Harper P. M. and R. Gani, 1999, “CAMD and Solvent Design: From Group Contribution to Molecular Encoding”, AIChE Annual Meeting 1999, Dallas, TX, paper 115m.

Harper, P. M., R. Gani, T. Ishikawa and P. Kolar, 1999, “Computer Aided Molecular Design with Combined Molecular Modeling and Group Contribution”, Fluid Phase Equilibria, In press (1999).

Jaksland C., 1996, “Separation Process Design and Synthesis Based on Thermodynamic Insights”, Ph.D. Thesis, Department of Chemical Engineering, Technical University of Denmark.

Joback, K. G. and G. Stephanopoulos, 1990, “Designing Molecules Possessing Desired Physical Property Values”, Foundations of computer-aided process design. Proceedings of the third international conference, 363-387

Joback, K. G., 1984, “A Unified Approach to Physical Property Estimation Using Multivariate Statistical Techniques”, M.Sc. Thesis, MIT, Cambridge, MA

Lydersen, A. L., 1955, “Estimation of Critical Properties of Organic Compounds”, Univ. Wisconsin Coll. Eng., Eng. Exp. Stn. rept. 3, Madison, Wis., April 1955

Pretel, E. J., P. A. López, S. B. Bottini and E. A. Brignole, 1994, “Computer-Aided Molecular Design of Solvents for Separation Processes”, AIChE J., 40 (8), 1349-1360

Raman, V. S. and C. D. Maranas, 1998, “Optimization in Product Design with Properties Correlated with Topological Indices”, Computers chem. Engng., 22 (6), 747-763

Shebeko, Yu. N., A. Ya. Korol’chenko, A. V. Ivanov and E. N. Alekhina, 1984, “Calculation of Flash Points and Ignition Temperatures of Organic Compounds”, The Soviet Chemical Industry, 16 (11), 1371-1376

Vaidyanathan R. and M. El-Halwagi, 1996, “Computer-Aided Synthesis of Polymers and Blends with Target Properties”, Ind. Eng. Chem. Res., 35, 627-634

van Krevelen, D. W., 1990, “Properties of Polymers: Their Correlation with Chemical Structure; Their Numerical Estimation and Prediction from Additive Group Contributions”, 3rd ed., Elsevier, Amsterdam.

Venkatasubramanian, V., K. Chan and J. M. Caruthers, 1995, “Genetic Algorithmic Approach for Computer-Aided Molecular Design”, ACS Symposium Series, 589, 396-414