consistent stereochemical search at gsk · 2017-06-27 · small molecule registration statistics 6...

22
Consistent Stereochemical search at GSK Support for v3000 mol format. Richard Bolton Stephen Swanson

Upload: others

Post on 13-Mar-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Consistent Stereochemicalsearch at GSK

Support for v3000 mol format.Richard BoltonStephen Swanson

Page 2: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Abstract

2

• GSK has recently upgraded its ChemAxon small molecule registration system to allow

end-users the ability to mark-up and register molecules with complex atom based stereo

centres with no registrar intervention. These are stored using the V3000 mol

format. These more fully described structures will now be pushed to the GSK Chemistry

ODS and indexed using the latest JChem cartridge. This in turn will allow upgrade of

structure query web services and IJC projects to give accurate and consistent behaviour

across IJC clients and other web-service powered structure search tools when running

complex stereo-chemical queries. This presentation will discuss the current issues and

how the move to using v3000 mol format resolves these problems. A timeline will be

shared which shows when GSK will have moved to v3000 mol as its standard for full

representation of small molecules; this solution will also align with the future

requirements of European legislation on fully describing medicinal products.

19 May 2015ChemAxon UGM Budapest

Page 3: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Agenda

1. Small molecule registration at GSK. Full atom mark-up and end user

registration checker. A status update.

2. Downstream inconsistency in structure search

3. Plans to use v3000 mol format as GSK’s new standard and how that fixes the

problems.

4. The work that needs doing and the timeline

5. Where next IJC Browser and IDMP.

319 May 2015ChemAxon UGM Budapest

Page 4: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Small Molecule Registration. A reminder.

4

• Small Molecule Registration collaboration project with ChemAxon started in

March 2011 and went to production in mid 2012.

• Simplification of small molecule registration

– A-novel-Approach-to- Pharmaceutical-Registration :Charlie Wilkins San Diego 2011

– Going live with registration as a service : Richard Bolton/Akos Papp Budapest 2012

– Registration as a Service: the full story after half a year. Rama Bhamidipati/Akos Papp Boston2012

• End user tools to check registration were rolled out in mid 2014

19 May 2015ChemAxon UGM Budapest

Page 5: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Current Registration Process

5

Other?

Compound Name

CCE 001

CCE 002

CCE 003

Purchased

Compounds

Compound Collection Enhancement send the SD File to Registry as received and registered as new entities

optionally, request assistance from the

registrar

Contract

Chemist

Chemist

CCEChemist

eLNB or WebReg

Fully Specified compounds

RegistrarTools

Compound Collection Enhancement purchase

compounds and receive data files with structure information

Registrar

SD FilesSD Files

Fully Specified compounds Business Rules

Staging

Compounds that are fully describe in regard to

stereochemistry, racemic content are automatically registered

Chemist or contract coordinator enter compounds to be registered

and registerability before submitting

Compounds with undefined attributes are sent to the

registrars

Registrars modify compounds for consistency, confirm with the

chemist then register

Registry

x

x

x

19 May 2015ChemAxon UGM Budapest

Page 6: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Small Molecule registration statistics

6

• Between April 2014 and April 2015

– 175K compounds registered.(end user singleton and bulk registration)

– 50K end user singletons

– 24k with chiral centers

– 9.5k with 2-9 chiral centers

– 700 with >=10 chiral centers (more likely to be peptides or DNA, soon to be go to our new Biological

Registration system)

• So far in 2015

– Singletons autoregistered 78%

– Singletons registered manually by the submitting scientist (non-registrar manual) 8%

– Singletons registered manually by registrars 14%

19 May 2015ChemAxon UGM Budapest

Page 7: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Side-effects of full atom mark-up.

7

• Although the Compound registration system now contains full atom mark-up

downstream systems still rely on v2000 mol format copies and/or standard

smiles strings.

• Structures can be overlaid with ‘pictures’ of the mark-up so users can tell

‘identical’ structures apart, but these are not query features.

• Running a query in IJC or using the GSK structure search web service can give

different results for molecules containing relative stereochemistry.

• Scientists agreed that this inconsistency needed to be resolved.

19 May 2015ChemAxon UGM Budapest

Page 8: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

8

Examples 1 Generic

Page 9: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

IJC

4917 compounds

HeliumQuery Structure

Unspecified

Stereochemistry

4921 compounds

Problem 1: Searching With Unspecified Stereochemistry

9

Note the 4917 compound set may not be a subset of the 4921 found in IJC. It may contain other

structures based upon what is entered. It takes a long time to determine which actually contain what

you were looking for.

19 May 2015ChemAxon UGM Budapest

Page 10: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Problem 2: Searching With Specified Stereochemistry

10

Helium

IJC

4921 compounds

Query StructureSpecified

Stereochemistry

Returns all the structures

with the unspecified

stereochemistry, not the

specific And1

designations!

If the user is more specific in the search, IJC ignores the input (searches via unspecified

stereochemistry) and Helium just affords an error leaving the scientist unsure of the results.

19 May 2015ChemAxon UGM Budapest

Page 11: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Problem 3 Data Integrity

11

Get GSK ID

Get Structure

Helium

This compound number defines a

non-specific structure!GSK 12345A

GSK 12345A

19 May 2015ChemAxon UGM Budapest

Page 12: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Problem 3 Potential Data Integrity converting structure to ID to structure

12

Get GSK ID

Get Structure

Helium

This compound number defines a

non-specific structure!GSK 12345A

GSK 12345A

Structure X

absolute stereo

Structure Y

relative stereo

19 May 2015ChemAxon UGM Budapest

Page 13: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

13

Examples 2 More specific

Page 14: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Isomer Synthesis

4 Isomers (R,R, R,S, S,R, S,S)

2 Isomers (R,R, R,S,)

2 Isomers (R,R, S,R,)

A

B

C

19 May 2015ChemAxon UGM Budapest

Page 15: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

4 Diasteromers – Registration (A)

Racemic starting materials + Isomer Separation

19 May 2015ChemAxon UGM Budapest

Page 16: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

4 Diasteromers – Registration (C)

From (R) Amine + Isomer Separation

From (S) Amine + Isomer Separation

19 May 2015ChemAxon UGM Budapest

Page 17: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Substructure Searching Outcomes

Did we make this structure?

Did we include all relevant

structures in the patent?

Is this structure part of

an Alliance agreement?

Can I trust my SSS results?

19 May 2015ChemAxon UGM Budapest

Page 18: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Target and proposal

Target

• Ensure consistent and correct search results irrespective of the application being

used.

Proposal

• Cascade the fully described v3000 molfile that is stored in Registry to both IJC

and Chemistry Web Services.

1819 May 2015ChemAxon UGM Budapest

Page 19: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

What needs changing to achieve this?

Upgrade CODS and ACD databases from Oracle 10g to Oracle 12c and ensure they contain v3000 mol

and index

Update JChem Cartridge in Chemistry ODS (CODS) and the Available Chemical Directory (ACD) to enable

full stereo-chemical searching

Reconfigure IJC to search across and return from the new structure field (ChemAxon provided script)

Add methods both Structure Search and ID Lookup Web services which search across and return V3000

results.

Maintain the existing V2000 methods and results to ensure that other downstream applications are not

impacted

Upgrade Helium v3 to v4 and configure it to search across and return the V3000 results.

Ensure that Helium and IJC return identical results with identical V3000 queries

Plan remediation of the other 34 apps dependent on the web services.

1919 May 2015ChemAxon UGM Budapest

Page 20: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Timeline

20

May Jun Jul Aug Sep Oct Nov Dec Jan

JChem upgrade in CODS &

ACD

Project Approval Project Release

Helium v4 He 4 + V3000

2

3 Sandbox Dev (TST/INT) Prd

1

4

UKxxx605 Oracle 12c upgrade & testing5

Resolve

Registry Issue

IJC & CL/SS

Migrate Chemistry Webservices

to Java 86Warranty

Project Close

19 May 2015ChemAxon UGM Budapest

Page 21: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

What next..............

IDMP requirements

– The full description of all materials in a medicinal product will be required by mid 2016.

– V3000 mol is a format that can fulfil that requirement.

IJC Browser

– ChemAxon are working inside the GSK environment to tune IJC Browser

– Plans to move to Browser this year if performance is acceptable in our complex environment.

2119 May 2015ChemAxon UGM Budapest

Page 22: Consistent Stereochemical search at GSK · 2017-06-27 · Small Molecule registration statistics 6 •Between April 2014 and April 2015 –175K compounds registered.(end user singleton

Thank you for your attention

2219 May 2015ChemAxon UGM Budapest

Questions?