CheckCIF/PLATON Crystal Structure Validation

Ton Spek
Utrecht University, The Netherlands.

Goettingen, 7-Sep, 2011

Data Collection around 1966
Nonius AD3 Diffractometer
One data set: weeks!

~1966, Electrologica X8 ALGOL60 ‘Mainframe’ (<1 MHz)

16kW, Operator, Input, Output, Plotter, Console

Multiple Hours of computing time per structure

Flexowriter for the creation and editing of programs and data

Data Storage in the Past

Direct Methods ALGOL60 Program AUDICE on Papertape

Archival of Model Parameters in a Publication (Acta Cryst.)

Archival of Reflection Data in a Publication (Acta Cryst.)

Problems Around 1990

• Multiple Data Storage Media (often hardcopy only or microfilm).
• No Standard Computer Readable Format for Archival and Data Exchange.
• Data Entry of Published data done by Retyping.
• No easy Numerical Checking by Referees etc.
• CSD Database Archival by Retyping from the published paper.
• Multiple typos and inconsistencies in the Published Data.
• Often incomplete information reported.

The CIF Solution

• CIF-Standard Proposal for data archival by S.R. Hall, F.H. Allen, I.D. Brown (1991). Acta Cryst. A47, 655-685.
• Adopted by the IUCr.
• Implemented in the Xtal package (Hall, Stewart et al.).
• Adopted early by the author of the nowadays most commonly used refinement program SHELXL (G.M. Sheldrick).

CIF Example File

CIF Constructs

• data_name
  where name is the chosen identifier of the data
• Data associations, e.g.
  _cell_length_a 16.6392(2)
  _diffrn_radiation_source ‘sealed tube’
• Repetition (loop)
  loop_
  _symmetry_equiv_pos_as_xyz
  ‘x, y, z’
  ‘-x, y+1/2, -z’
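A value such as 16.6392(2) carries its standard uncertainty in parentheses, expressed in units of the last quoted digit. The following is a minimal Python sketch of how such a value could be split into a number and its uncertainty. The function name parse_cif_number is hypothetical, and the sketch handles only plain decimal numbers (no exponents).

```python
import re


def parse_cif_number(text):
    """Split a CIF numeric value such as '16.6392(2)' into (value, su).

    The digits in parentheses are the standard uncertainty in units of
    the last quoted decimal place. Returns su=None when none is given.
    """
    match = re.fullmatch(r"([+-]?\d*\.?\d+)(?:\((\d+)\))?", text.strip())
    if match is None:
        raise ValueError(f"not a plain CIF number: {text!r}")
    number, su_digits = match.groups()
    value = float(number)
    if su_digits is None:
        return value, None
    # Count the decimals to scale the uncertainty to the last digit.
    decimals = len(number.split(".")[1]) if "." in number else 0
    return value, int(su_digits) * 10 ** (-decimals)


value, su = parse_cif_number("16.6392(2)")
print(value, su)
```

For the cell length on the slide this gives a value of 16.6392 with an uncertainty of about 0.0002.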

Construct for Text

• Text can be included between semicolons
• Used for Acta Cryst. Section C & E Abstract and Comment sections
• Example

_publ_section_comment
;
This paper presents the first example
of a very important compound.
;
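The text construct can be read back with a few lines of code. This is a simplified Python sketch, not a full CIF parser: it assumes that the data name stands alone on its own line and that each delimiting semicolon is the first character of its line. The function name read_text_fields is hypothetical.

```python
def read_text_fields(cif_text):
    """Return {data_name: text} for semicolon-delimited text fields."""
    fields = {}
    lines = cif_text.splitlines()
    last_name = None
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(";") and last_name is not None:
            # Opening semicolon: collect lines up to the closing one.
            body = [line[1:]]
            i += 1
            while i < len(lines) and not lines[i].startswith(";"):
                body.append(lines[i])
                i += 1
            fields[last_name] = "\n".join(body).strip()
            last_name = None
        elif line.strip().startswith("_"):
            # Remember a data name only if no value follows on the same line.
            parts = line.split()
            last_name = parts[0] if len(parts) == 1 else None
        i += 1
    return fields


example = """_publ_section_comment
;
This paper presents the first example
of a very important compound.
;
"""
print(read_text_fields(example))
```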

CIF Completion

• CIF Files are created by the refinement program (e.g. SHELXL)

• Missing Data can be added with a Text Editor or with enCIFer (from the CCDC).

• The Syntax can be checked with a locally installed version of the program enCIFer (Freely Available: www.ccdc.cam.ac.uk)

Missing Data

PROGRAM enCIFer

Note on Editing the CIF

• The Idea of editing the CIF is to add missing information to the CIF.

• Some Acta Cryst. authors have been found to polish away less nice numerical values (including R-values e.g. 0.0975 => 0.0475)

• This leaves traces, is now generally detected (also in retrospect) by the validation software, and is not good for the career of the culprit…

CIF Validation History

• Structure Validation of data supplied in computer readable CIF format was pioneered by Acta Cryst. C (Syd Hall et al., 1990s).

• Initially the numerical checking of papers submitted to Acta C in CIF format was done by the IUCr Chester staff.

• Subsequently automated checking of the CIF for data consistency, data completeness and validity was introduced (checkCIF) (Non PLATxxx ALERTS).

• PLATON facilities to check for Missed Symmetry and VOIDS were added soon after.

• This was followed by also including the numerous other PLATON based tests (PLATxxx) of the reported structure (currently more than 400): checkCIF/PLATON.

FCF Validation

• Fo/Fc reflection file deposition and archival in CIF format (FCF) was made mandatory early on for Acta Cryst. papers.

• FCF's are Useful for subsequent analysis of possibly unique data.

• CIF + FCF checking was added in 2010 into the IUCr CheckCIF/PLATON suite.

• Major chemical journals now require CIF deposition and validation reports but not (yet) the deposition of reflection data.

• The CCDC now accepts FCF's for deposition.

Reflection CIF (FCF)

Why Automated Structure Validation

• The large volume of new and routine structure reports submitted for publication.

• The limited number of experienced and available crystallographic referees for validation.

• Detection of errors due to the black box use of crystallography by non-crystallographers.

• Setting standards of quality and reliability.

• Automated detection of unusual though not necessarily erroneous issues that need special attention (ALERTS A,B,C,G).

• Sadly: The need to detect fraudulent structure reports.

ALERT LEVELS

CheckCIF Report in terms of a list of ALERTS

ALERT A – Could Indicate a Serious Problem – Consider Carefully (correct it or explain why it is correct)

ALERT B – Might Indicate a Potentially Serious Problem

ALERT C – Check to Ensure it is O.K. & Not because of an oversight.

ALERT G – General Info. Check that it is not something Unexpected.

ALERT TYPES

1 - CIF Construction/Syntax errors, Missing or Inconsistent Data.

2 - Indicators that the Structure Model may be Wrong or Deficient.

3 - Indicators that the quality of the results may be low.

4 - Cosmetic Improvements, Queries and Suggestions.
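The ALERT levels and types can be combined when sorting a validation report. The following Python sketch assumes that each ALERT is labelled with a code of the form PLAT230_ALERT_2_B (test identifier, ALERT type, ALERT level). That layout is an assumption here, as the slides only name the PLATxxx tests, the types 1 to 4 and the levels A, B, C and G. The function name parse_alert is hypothetical.

```python
import re

# Short descriptions taken from the ALERT LEVELS and ALERT TYPES slides.
ALERT_LEVELS = {
    "A": "Could indicate a serious problem",
    "B": "Might indicate a potentially serious problem",
    "C": "Check to ensure it is O.K.",
    "G": "General info",
}
ALERT_TYPES = {
    1: "CIF construction/syntax errors, missing or inconsistent data",
    2: "Structure model may be wrong or deficient",
    3: "Quality of the results may be low",
    4: "Cosmetic improvements, queries and suggestions",
}


def parse_alert(code):
    """Split an assumed code like 'PLAT230_ALERT_2_B' into its parts."""
    match = re.fullmatch(r"([A-Z0-9]+)_ALERT_([1-4])_([ABCG])", code)
    if match is None:
        raise ValueError(f"unrecognised ALERT code: {code!r}")
    test_id, alert_type, level = match.groups()
    return test_id, int(alert_type), level


test_id, alert_type, level = parse_alert("PLAT230_ALERT_2_B")
print(test_id, "|", ALERT_TYPES[alert_type], "|", ALERT_LEVELS[level])
```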

Which Key Issues are Addressed

• Missed symmetry (“being Marshed”)
• Wrong chemistry (Misassigned atom types)
• Too many, too few or misplaced H-atoms
• Missed solvent accessible voids in the structure
• Missed Twinning
• Absolute structure issues
• Data quality and completeness

Common CIF Problem

• There exists a frequent misunderstanding about the correct specification of the ‘population’ parameter value in the CIF for an atom on a special position leading to composition ALERTS.

• E.g. A fully occupied position of an atom on an inversion centre has to be specified with 0.5 in the .res and 1.0 in the CIF.
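The conversion in this example can be written out as a small calculation. The Python sketch below assumes that the value in the .res file equals the chemical occupancy divided by the order of the site symmetry (the number of symmetry operations that map the site onto itself, which is 2 for an inversion centre). The function name cif_occupancy is hypothetical.

```python
def cif_occupancy(res_value, site_symmetry_order):
    """Convert a special-position .res occupancy to the CIF occupancy.

    Assumes res_value = chemical occupancy / site symmetry order.
    """
    if site_symmetry_order < 1:
        raise ValueError("site symmetry order must be at least 1")
    return res_value * site_symmetry_order


# Fully occupied atom on an inversion centre: 0.5 in the .res file.
print(cif_occupancy(0.5, 2))
# Atom on a general position: the value is unchanged.
print(cif_occupancy(1.0, 1))
```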

Common Validation Problems

• CIF and FCF not from the final refinement
• SHELXL defaults left unchanged
• Completeness (up to 25 degrees) - do not cut
• Data names in CIF and FCF not identical
• ‘Non-standard’ reflection CIFs
• Twinning, Powder, Incommensurate Struct.
• Improper parameter transformations (Uij’s)
• DAMP 0 0

Validation with PLATON

- Details: www.cryst.chem.uu.nl/platon

- Driven by the file CHECK.DEF with criteria, ALERT messages and advice.

- Use (UNIX): platon -u structure.cif

- Result on file: structure.chk and structure.ckf

- Applicable on CIF’s and CCDC-FDAT
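The command line above can be wrapped in a script when many CIFs need checking. This is a minimal Python sketch under stated assumptions: a platon executable is available on the PATH, and the .chk and .ckf reports are written next to the input file with the same stem. The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def platon_validation_command(cif_name):
    """Build the command line quoted on the slide: platon -u <cif>."""
    return ["platon", "-u", str(cif_name)]


def expected_reports(cif_path):
    """Report names from the slide: <stem>.chk and <stem>.ckf."""
    cif = Path(cif_path)
    return cif.with_suffix(".chk"), cif.with_suffix(".ckf")


def run_validation(cif_path):
    """Run PLATON in the directory of the CIF and list the reports found."""
    if shutil.which("platon") is None:
        raise FileNotFoundError("platon executable not found on PATH")
    cif = Path(cif_path)
    subprocess.run(platon_validation_command(cif.name), cwd=cif.parent, check=True)
    return [path for path in expected_reports(cif) if path.exists()]


print(platon_validation_command("structure.cif"))
print([str(path) for path in expected_reports("structure.cif")])
```

Calling run_validation("structure.cif") would then start the check itself, provided PLATON is installed.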

Two ALERTS related to the misplaced Hydrogen Atom

ADVICE

- Validation should not be postponed to the publication phase. All validation issues should be taken care of during the analysis.

- Everything unusual in a structure is suspect, mostly incorrect (artifact) and should be investigated and discussed in great detail and supported by independent evidence.

- The CSD can be very helpful when looking for possible precedents (but be careful)

Systematic Fraud

• A massive fraud was detected in late 2009 of structures mainly published around 2007 in Acta Cryst. E. (Soon 200 retractions !)

• Before 2010, nobody was prepared for serious and systematic fraud in this non-competitive field of routine structures.

• Many deviations from the expected results can often be explained as errors, inexperience or due to poor data.

• Several retractions before 2010 might in hindsight concern fraudulent structures and not errors.

• Ongoing testing of our validation software on the archived data for structures published in Acta E often indicated suspect structures needing a more detailed investigation.

• It was only by following up on one such strange structure report with an analysis of all structures published by the authors of that paper that a fraud pattern emerged.

• It was discovered that the same data set was used to publish a series of invented isomorphous structures.

Bogus Variations (with Hirshfeld ALERTS) on the Published Structure 2-hydroxy-3,5-dinitrobenzoic acid (ZAJGUM)

OH => F

H2O => NH3

OH => NH2

NO2 => COOH

Error and Fraud Detection Tools

• Generalized Hirshfeld Rigid Bond Test.

• CIF versus FCF data checking.

• Scatter Plots of the reflection data of the same or related structure(s).

• Look in Difference Maps for unusual features.

• SHELXL re-refinement using the supplied CIF & FCF data.

• Check in the CSD for related structures.

• Two case studies that illustrate the use of the above validation and analysis tools follow.
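The idea behind the first tool in this list can be illustrated in a few lines. The Python sketch below implements the plain Hirshfeld rigid bond test rather than PLATON's generalized version: for two bonded atoms, the mean-square displacement amplitudes along the bond direction should be nearly equal, so a large difference hints at a misassigned atom type. It assumes Cartesian coordinates in Angstrom and Cartesian Uij tensors in Angstrom squared. All names are hypothetical.

```python
import math


def msda_along(u_cart, direction):
    """Mean-square displacement amplitude along a unit vector: d^T U d."""
    return sum(
        direction[i] * u_cart[i][j] * direction[j]
        for i in range(3)
        for j in range(3)
    )


def hirshfeld_difference(pos_a, u_a, pos_b, u_b):
    """Absolute difference of the two amplitudes along the A-B bond."""
    bond = [b - a for a, b in zip(pos_a, pos_b)]
    length = math.sqrt(sum(c * c for c in bond))
    unit = [c / length for c in bond]
    return abs(msda_along(u_a, unit) - msda_along(u_b, unit))


# Two atoms 1.5 Angstrom apart along x, with different U11 values.
u_first = [[0.02, 0.0, 0.0], [0.0, 0.03, 0.0], [0.0, 0.0, 0.03]]
u_second = [[0.05, 0.0, 0.0], [0.0, 0.03, 0.0], [0.0, 0.0, 0.03]]
print(hirshfeld_difference((0.0, 0.0, 0.0), u_first, (1.5, 0.0, 0.0), u_second))
```

In this made-up example the two atoms differ by about 0.03 Angstrom squared along the bond, which a rigid bond would not be expected to show.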

Example 1:

Submitted to Acta Cryst. (2011)

Structure I

PLATON Report Part 1

PLATON Report Part 2

RELATED STRUCTURE FROM THE CSD

Structure II

Structure Report for II

Analysis

• Structure (II) has no validation issues.
• C-CH3 distance in (II) of 1.50 Ang. as expected.
• ‘C-F’ distance in (I) is 1.50 Ang. and not the expected 1.35 Ang.
• Conclusion: Structure (I) is the CH3 variety and not F.
• Data sets of (I) & (II) are not identical (see next).
• Data set (I) likely based on CH3 compound.
• Fraud or Error? DIFABS file Error?
• Authors of (I) confirmed the error, having believed the external chemists’ proposal. The paper was retracted.

Scatter Plots of 2 Data Sets

Two Unrelated Data Sets

Two Identical Data sets
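A scatter plot of two data sets can be summarised by a single number. The Python sketch below computes the Pearson correlation of the observed intensities for reflections that occur in both sets, as a rough numerical stand-in for the plot. Each data set is assumed to be a dictionary that maps an (h, k, l) index to I(obs), and the function name intensity_correlation is hypothetical.

```python
import math


def intensity_correlation(set_a, set_b):
    """Pearson correlation of I(obs) over the reflections common to both sets."""
    common = sorted(set(set_a) & set(set_b))
    if len(common) < 2:
        raise ValueError("need at least two common reflections")
    xs = [set_a[hkl] for hkl in common]
    ys = [set_b[hkl] for hkl in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up intensities: the second set is the first one rescaled.
first = {(1, 0, 0): 120.0, (0, 1, 0): 45.0, (0, 0, 2): 300.0, (1, 1, 0): 8.0}
rescaled = {hkl: 2.5 * value for hkl, value in first.items()}
print(intensity_correlation(first, rescaled))
```

A value very close to 1 for two supposedly independent measurements on different compounds would call for a closer look, whereas independent measurements scatter because of their differing measurement errors.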

CIF versus FCF data Check

• The R & S values in the three lines # R= should be identical within rounding error.

• The reported and calculated residual density ranges should also be nearly identical.

• This is the case in the first example but not in the second where the CIF & FCF data do not match.
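The comparison described above can be sketched as follows. This Python example recomputes a conventional R value from pairs of Fo² and Fc² and compares it with the value reported in the CIF. It is a simplified illustration, not PLATON's procedure: it assumes the FCF has already been read into a list of (Fo², Fc²) pairs, it uses all reflections without a sigma cut-off, and the function names and the tolerance are hypothetical.

```python
import math


def r1_from_fcf(reflections):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) from (Fo^2, Fc^2) pairs."""
    numerator = 0.0
    denominator = 0.0
    for fo_squared, fc_squared in reflections:
        # Negative observed intensities are treated as zero amplitude.
        fo = math.sqrt(max(fo_squared, 0.0))
        fc = math.sqrt(max(fc_squared, 0.0))
        numerator += abs(fo - fc)
        denominator += fo
    return numerator / denominator


def matches_reported(calculated, reported, decimals=4):
    """True when the values agree to within one unit of the last reported digit."""
    return abs(calculated - reported) <= 10 ** (-decimals)


calculated = r1_from_fcf([(100.0, 81.0), (400.0, 441.0)])
print(round(calculated, 4), matches_reported(calculated, 0.0667))
```

A CIF in which a value of 0.0975 was edited to 0.0475 would fail such a check, because the value recalculated from the FCF would still be close to 0.0975.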

Example 2: Iron(III) Complex

Fe(III) Validation Part 1

Fe(III) Validation Part 2

Example 2: Difference Density Map

Fe Structure Re-refined

Conclusion ?

• Structure now O.K. after an erratum ?
• Search for similar (isomorphous) structures in the CSD
• Yes, there is an isomorphous Mn complex published by a different set of authors from a different university.
• Let us compare both structures.

Isomorphous Mn(III) Complex

Mn Structure Validation Part 1

Mn Validation Part 2

Scatter Plot Fe versus Mn I(obs)

Fe and Mn Data Sets Identical !

Validation Challenges

• Avoid False Positive and Negative ALERTS

• Disordered structures (true or artifact)

• Handling of Twinning (data names missing)

• Powder structure validation (experts needed)

• Incommensurate structure validation (experts)

• Fabricated reflection data – Can we detect them ?

• Education – What is the meaning of an ALERT ?

• Should validation criteria be different for structures published in chemical journals ?

Residual Problem

EDUCATION

Response of an author of a structural paper submitted to the crystallographic journal Acta Cryst. to an enquiry from a referee on the reported space group:

Please teach me, what does in mean

‘ space group incorrect’ ……

Concluding Remarks

• PLATON includes a standalone Validation Tool. It is part of the WEB-based IUCr CheckCIF/PLATON Tool that is capably managed by Mike Hoyland (IUCr).

• Validation is still a learning process.

• Chemical insight might be very helpful and often decisive as a validation tool.

• Deposition of structure factors should be a requirement for all journals (The CCDC now accepts those along with the CIF).

Thanks To

• Martin Lutz and many others for taking the time to bring various unresolved issues to my attention with actual data and suggestions.

• Send to [email protected]

A.L.Spek (2003). J. Appl. Cryst. 36, 7-13.

A.L.Spek (2009). Acta Cryst. D65, 148-155.