
Mcule - Ultimate Database Project



October 2018

Synthesis of brand-new heterocycles as new building blocks

According to literature data (Bemis, G. W.; Murcko, M. A. J. Med. Chem. 1996, 39, 2887.), half of the drugs approved until 1996 can be described by the 32 most frequently occurring scaffolds. Furthermore, the top 50 scaffolds covered about 50% of approved and experimental drugs until 2010 (Wang, J.; Hou, T. J. Chem. Inf. Model. 2010, 50, 55.). These numbers demonstrate that the chemical space currently available for drug discovery is limited and is based on a small number of privileged scaffolds. Thus, there is a need for novel scaffolds in drug discovery. We believe that drug discovery could greatly benefit from new heterocycles leading to novel building blocks and eventually novel compounds.
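The coverage figures quoted above are cumulative frequencies: scaffolds are ranked by how many drugs contain them, and one counts how many top-ranked scaffolds are needed to reach a given share. The following is a minimal sketch of that arithmetic. The function name, scaffold labels and counts are invented for illustration and are not data from the cited papers.

```python
from collections import Counter


def scaffolds_needed(scaffold_per_drug, target_fraction):
    """Return how many of the most frequent scaffolds cover target_fraction of the drugs."""
    counts = Counter(scaffold_per_drug)
    total = sum(counts.values())
    covered = 0
    for rank, (_scaffold, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target_fraction:
            return rank
    return len(counts)


# Hypothetical toy data: one scaffold label per drug (illustrative only).
toy_drugs = ["benzene"] * 4 + ["pyridine"] * 3 + ["indole"] * 2 + ["azetidine"]
print(scaffolds_needed(toy_drugs, 0.5))  # 2: the two most common scaffolds cover 7 of 10 drugs
```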

As part of the ULTIMATE project, we subcontracted the internationally recognised Medicinal Chemistry Research Group of the Research Centre for Natural Sciences of the Hungarian Academy of Sciences (RCNS) for the synthesis of new heterocycles. The synthesis of the new heterocycle structures was finished at the end of October 2018. The results have been sent to our suppliers and pharmaceutical partners for validation. Our aim is to integrate these novel building blocks into the ULTIMATE database and also to use them for generating new virtual compounds, which will be highly unique and also synthesisable.

End of June 2018

First version of the artificial chemist algorithm operates

The development team successfully implemented the first version of ULTIMATE’s enumeration algorithm and the reaction rules of robust chemical reactions. The first tests were carried out successfully using building blocks from our supplier partners Life Chemicals, HTS Biochemie and Key Organics. New chemical structures (virtual compounds) were successfully created; some examples will be reported shortly.

We aim to extend the existing chemical space with “virtual molecules” generated by our artificial chemist algorithm, a method for predicting compounds that have not yet been synthesized but can be made from existing building blocks and reagents using simple reactions at an affordable price.
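The idea of enumerating virtual compounds from building blocks and reaction rules can be illustrated with a toy sketch. It is not the ULTIMATE algorithm: the building block names, reactive group labels and rule names below are all hypothetical, and a real implementation would apply reaction transforms to actual structures rather than match group labels.

```python
from itertools import product

# Hypothetical building blocks: name -> set of reactive groups (illustrative only).
BUILDING_BLOCKS = {
    "BB1": {"amine"},
    "BB2": {"carboxylic_acid"},
    "BB3": {"amine", "aryl_halide"},
    "BB4": {"boronic_acid"},
}

# Hypothetical reaction rules: rule name -> (group on reactant A, group on reactant B).
REACTION_RULES = {
    "amide_coupling": ("carboxylic_acid", "amine"),
    "suzuki_coupling": ("aryl_halide", "boronic_acid"),
}


def enumerate_virtual_compounds(blocks, rules):
    """List (rule, reactant_a, reactant_b) triples whose groups match a rule."""
    products = []
    for rule, (group_a, group_b) in sorted(rules.items()):
        for a, b in product(sorted(blocks), repeat=2):
            if a != b and group_a in blocks[a] and group_b in blocks[b]:
                products.append((rule, a, b))
    return products


for virtual_compound in enumerate_virtual_compounds(BUILDING_BLOCKS, REACTION_RULES):
    print(virtual_compound)
```

Each triple stands for one virtual compound: a product that has not been made, but whose reactants are in stock and whose reaction is covered by a rule.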

This algorithm will subsequently go through agile optimisation cycles for further improvement to reach the targeted 80% synthetic success rate.

End of April 2018

Chemoinformatics developments for the ULTIMATE project

To handle the 500 million compounds of the ULTIMATE database, chemoinformatics developments are necessary to speed up substructure and similarity searches. In the last 3 months, we developed the concept for the chemoinformatics developments of the ULTIMATE project. We also carried out extensive tests. Based on the results, our similarity search implementation achieves very short average runtimes even on a single-core machine.
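The implementation itself is not described here. As a general illustration of why similarity searches can be fast, the sketch below computes Tanimoto similarity on bit fingerprints stored as integers and skips candidates using a well-known bit-count bound. The fingerprints, names and threshold are toy assumptions, not Mcule's method.

```python
def popcount(bits):
    """Number of set bits in a non-negative integer fingerprint."""
    return bin(bits).count("1")


def tanimoto(a, b):
    """Tanimoto coefficient of two bit fingerprints stored as integers."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0


def similarity_search(query, database, threshold):
    """Return (name, score) hits with Tanimoto >= threshold, best first.

    Uses the bit-count bound: a hit must satisfy
    threshold * |query| <= |candidate| <= |query| / threshold,
    so many candidates are skipped without computing the full score.
    """
    q = popcount(query)
    hits = []
    for name, fp in database.items():
        c = popcount(fp)
        if c < threshold * q or c * threshold > q:
            continue  # bit counts alone rule this candidate out
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: -hit[1])


# Hypothetical 8-bit fingerprints (real fingerprints are much longer).
toy_db = {"A": 0b11110000, "B": 0b11100000, "C": 0b00001111, "D": 0b10000000}
print(similarity_search(0b11110000, toy_db, 0.7))
```

In a large database the bit counts can be precomputed and the candidates sorted by them, so whole ranges are skipped at once.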

End of January 2018

ULTIMATE database defined

The aim of the ULTIMATE project is to create an easily searchable chemical database of at least 500 million purchasable “virtual” compounds. In recent months, we defined the overall workflow of the filtering process for compounds to be included in the ULTIMATE database. We implemented property and novelty filters, as well as a method to filter out unwanted structures. Furthermore, we developed a visualisation tool for the chemical space that is based on self-organizing maps.

Novelty filters were implemented to ensure the unique nature of molecules in the ULTIMATE database. The molecules already covered in patents (SureChEMBL) or described in public chemical / biological databases (ChEMBL, PubChem) and those present in already existing purchasable compound databases (ZINC) are filtered out.
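In essence, a novelty filter of this kind is a set-membership test against the reference databases. The sketch below assumes every molecule has already been reduced to a canonical identifier, so that identical structures compare equal. The identifiers and the function name are hypothetical.

```python
def novelty_filter(candidates, *known_sets):
    """Keep candidates whose identifier is absent from every reference set."""
    known = set().union(*known_sets)
    return [c for c in candidates if c not in known]


# Hypothetical canonical identifiers standing in for the reference databases.
patented = {"cmpd-001", "cmpd-002"}
public_db = {"cmpd-002", "cmpd-003"}
purchasable = {"cmpd-004"}

candidates = ["cmpd-001", "cmpd-004", "cmpd-005", "cmpd-006"]
print(novelty_filter(candidates, patented, public_db, purchasable))
# ['cmpd-005', 'cmpd-006']
```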

Property filters remove compounds whose properties are not compatible with medicinal chemistry purposes. Several physicochemical features are examined, such as molecular mass, logP, tPSA, number of aromatic/aliphatic rings, heteroatom ratio, number of acidic/basic groups, number of sp3 chiral centres, etc.
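A property filter of this type can be sketched as a table of allowed ranges checked against precomputed descriptor values. The ranges below are placeholders chosen for illustration only; the project's actual cut-offs are not stated here, and the property keys are hypothetical names.

```python
# Hypothetical property windows (min, max); not the project's actual cut-offs.
PROPERTY_WINDOWS = {
    "molecular_mass": (150.0, 500.0),
    "logP": (-1.0, 5.0),
    "tPSA": (20.0, 140.0),
    "chiral_centres": (0, 3),
}


def failed_properties(compound, windows=PROPERTY_WINDOWS):
    """Return the names of properties that are missing or outside their window."""
    failed = []
    for name, (low, high) in windows.items():
        value = compound.get(name)
        if value is None or not (low <= value <= high):
            failed.append(name)
    return failed


def passes_property_filter(compound, windows=PROPERTY_WINDOWS):
    """True if every property is present and inside its window."""
    return not failed_properties(compound, windows)


# Hypothetical precomputed descriptors for two compounds.
ok = {"molecular_mass": 320.4, "logP": 2.1, "tPSA": 75.0, "chiral_centres": 1}
too_greasy = {"molecular_mass": 480.0, "logP": 6.3, "tPSA": 40.0, "chiral_centres": 0}
print(passes_property_filter(ok), failed_properties(too_greasy))
```

Returning the list of failed properties, rather than only a yes/no answer, makes it easier to see which criterion removes most compounds.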

Compounds containing structures unwanted in medicinal chemistry are also filtered out based on well-accepted, previously published methods.

The workflow and implemented filters were successfully tested on a model database. The filters operated well according to the requirements. According to our aim, the ULTIMATE database will contain compounds fulfilling all applied criteria.

Furthermore, we have developed a method for the visualisation of the chemical space using self-organizing maps. The aim of the visualisation is, on the one hand, to tailor the development of the ULTIMATE database to favour molecules that have similar properties to known drugs but are underrepresented among already available purchasable compounds and, on the other hand, to help medicinal chemists decide which molecules are best suited for their research projects. In the latter case, the pharma partners may use the maps to compare their in-house molecules to those available in ULTIMATE.
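For readers unfamiliar with self-organizing maps, the sketch below shows the basic training loop on a small rectangular grid: each sample is assigned to its best matching cell, and that cell and its neighbours are pulled towards the sample. It is a generic textbook version, not the project's tool. The parameter values and the two-dimensional toy descriptors are assumptions.

```python
import math
import random


def best_matching_unit(grid, sample):
    """Return (row, col) of the cell whose weight vector is closest to sample."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(grid):
        for c, weights in enumerate(row):
            dist = sum((w - x) ** 2 for w, x in zip(weights, sample))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(samples, rows, cols, epochs=50, lr=0.5, radius=1.0, seed=0):
    """Train a small rectangular SOM and return its grid of weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)] for _ in range(rows)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # linearly shrink the learning rate
        for sample in samples:
            br, bc = best_matching_unit(grid, sample)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood around the best matching unit.
                    h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * radius ** 2))
                    cell = grid[r][c]
                    for i in range(dim):
                        cell[i] += lr * decay * h * (sample[i] - cell[i])
    return grid


# Hypothetical descriptors scaled to [0, 1], e.g. (mass, logP) of four compounds.
toy_samples = [(0.1, 0.2), (0.15, 0.25), (0.9, 0.8), (0.85, 0.9)]
som = train_som(toy_samples, rows=3, cols=3)
for sample in toy_samples:
    print(sample, "->", best_matching_unit(som, sample))
```

Once trained, two compound collections can be compared by mapping each molecule to its best matching cell and counting how many molecules fall into each cell.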

End of October 2017

ULTIMATE project first milestone reached

The ULTIMATE project started in August 2017. The aim of the project is to create an easily searchable chemical database of at least 500 million purchasable “virtual” compounds, which can be synthesized (min. 80% delivery rate) at an affordable price and in a reasonable time (max. 6 weeks delivery time). Such a large chemical space would present a major advantage for pharmaceutical and biotech companies by increasing their chances of effectively identifying novel compounds for diseases and by reducing their costs and time losses.

In the first 3 months of the project, Mcule defined the criteria for compound selection and proposed design and filtering rules to be applied in the ULTIMATE database, based on the inputs of our collaborating partners: pharmaceutical companies (GSK, AstraZeneca and Boehringer Ingelheim), suppliers (HTS Biochemie, Key Organics and Life Chemicals), and academic partners (Research Centre for Natural Sciences of the Hungarian Academy of Sciences and Vrije Universiteit Amsterdam).

To increase the novelty of compounds in the ULTIMATE database, synthesis strategies for new and underrepresented scaffolds were proposed by two academic subcontractors: the Medicinal Chemistry Group of Vrije Universiteit, Amsterdam, the Netherlands (VUA) and the Medicinal Chemistry Research Group of the Research Centre for Natural Sciences of the Hungarian Academy of Sciences, Budapest, Hungary (RCNS).

In the first three months of the project, Mcule also defined the IT and chemoinformatics requirements for handling such a large database. The functions to be used in the ULTIMATE database were suggested based on an analysis of users’ needs.

About the project

Mcule started its new, challenging project, called ULTIMATE - The best online drug discovery platform, building the Ultimate chemical database for drug discovery, on 1 August 2017. In this project, a commercial database of 500 million novel, diverse and synthetically feasible compounds will be developed. Standard parameters of the database: min. 80% success rate, max. 6 weeks delivery time, fixed prices. Compound selections, automated quote generation and ordering will be available online at https://mcule.com

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 777828. Selected subcontractors and several suppliers will work together with Mcule on the realization of ULTIMATE for 24 months. Project partners include state-of-the-art synthesis design software developers, major chemical suppliers and leading pharma companies.


Mcule realized that the currently accessible chemical space of commercially available compounds is limited. Compound aggregators typically provide access to only 7-10 million in-stock compounds, while the synthetically feasible chemical space is orders of magnitude larger. Some pharma companies already complement the available in-stock chemical space with in-house virtual libraries; however, in-house chemistry resources are typically limited and expensive. Some suppliers already offer virtual libraries; however, compound aggregators are not able to integrate them, as the size of these libraries presents a major chemoinformatic challenge. The online platform of Mcule uses the latest IT technology and chemoinformatic tools so that it can handle larger databases integrated with complex modeling tools. Mcule already integrates virtual compounds from major chemical suppliers such as Enamine and UkrOrgSyntez and provides one of the largest compound webshops, with over 35 million screening compounds and building blocks. Together with distinguished partners, Mcule decided to develop an unprecedented database of 500 million commercially available compounds (Ultimate database) that will be hosted by Mcule.

Development scheme

Ultimate Project Development Scheme

Project partners

Mcule (IT, chemoinformatics)

In Mcule, users can search the commercial chemical space, make compound selections, generate price quotes and place orders for the selected compounds. Basic search tools are provided for simple compound selections as well as advanced filters and modeling applications for library design and virtual screening. Search queries and results can be stored online, merged, modified, exported and shared with colleagues. Mcule will host the Ultimate database on its online platform. The efficiency of the searching and database management tools will be further improved during the project to maximize speed and user experience. Mcule will develop a synthetic feasibility prediction tool capable of handling a wide range of chemical starting materials including the building blocks of the industrial partners and key reactions yielding novel, chemically diverse, synthesizable compounds.

HTS Biochemie (industrial chemistry partner)

HTS Biochemie specializes in creating unique chemical compounds for pharmaceutical, agricultural and biotechnology companies. HTS Biochemie currently offers over 15,000 building blocks and 50,000 screening compounds from stock and has a proven track record in the design and commercialization of synthetically feasible libraries. HTS Biochemie will provide in-stock building blocks as the basis of the Ultimate database and will be involved in the design part as well as in the experimental validation of synthetic feasibility.

Key Organics (industrial chemistry partner)

Key Organics is specialized in providing chemistry services as well as delivering building blocks and screening compounds under the “BIONET” brand with an exceptional same-day ex-stock dispatch. Key Organics will provide in-stock building blocks as the basis of the Ultimate database and will be involved in the design part as well as in the experimental validation of synthetic feasibility.

Life Chemicals (industrial chemistry partner)

Life Chemicals specializes in state-of-the-art organic synthesis and is an internationally recognized producer of original HTS compounds and provider of high quality contract research and manufacturing services. Life Chemicals will supply in-stock building blocks as the basis of the Ultimate database and will be involved in the design part as well as in the experimental validation of synthetic feasibility.

Medicinal Chemistry Research Group, MTA-TTK (academic chemistry partner)

The research group of Prof. György Miklós Keserű is specialized in medicinal chemistry and molecular modeling focusing on fragment-based approaches in the lead discovery of G-protein-coupled receptor and kinase targets as well as in the chemical process development for small molecule active pharmaceutical ingredients. In the Ultimate database project, the group will be primarily responsible for the design and synthesis of novel heterocycles that will be converted to novel building blocks and screening compounds by the industrial partners.

Division of Medicinal Chemistry, Vrije Universiteit Amsterdam (academic chemistry partner)

The VU University Medicinal Chemistry group combines the design, synthesis, pharmacological and biochemical characterization of biologically active molecules. Their primary focus is to develop and characterize novel compounds targeting G-protein-coupled receptors using fragment-based drug discovery approaches. In the Ultimate database project, the group will primarily focus on designing and synthesizing 3D fragments and building blocks bearing underutilized aliphatic rings such as the cyclobutyl ring that can be translated to commercial products by the industrial partners.

Advisory board

Darren Green has been the Director of Molecular Design at GSK, Stevenage, UK since 2007, and before that was Director of Cheminformatics for GSK. He leads a group that supports all aspects of drug discovery. Darren also leads the Compound Collection Enhancement strategy for GSK. He will advise on the chemoinformatic developments of the ULTIMATE project and provide the end-user (pharmaceutical) viewpoint.
Holger Stark is a professor at the Heinrich Heine University in Düsseldorf, Germany. He has more than 350 book contributions, original papers, reviews and patents. He is co-inventor of pitolisant (Wakix®), the first histamine H3 receptor antagonist with market approval. Holger Stark is also editor-in-chief of the Archiv der Pharmazie – Chemistry in Life Sciences. He is Chairman of the Landesgruppe Hessen of the Deutsche Pharmazeutische Gesellschaft e. V. (German Pharmaceutical Society).
J. Christian Baber is the Global Head of Scientific Computing and Informatics at Shire where he leads computing and bio/cheminformatics efforts for the worldwide Research and Nonclinical Development organizations. Prior to Shire, Christian was the Head of Cheminformatics and Compound Management at Cubist Pharmaceuticals where, amongst other things, his team was responsible for designing, acquiring, managing and using the high-throughput screening collection. Christian has a wide breadth of experience across diverse therapeutic areas and platforms with a focus on early stage lead identification and screening and will bring this expertise to the ULTIMATE project to ensure that it delivers a solution of practical use to the pharmaceutical community.
Sándor Bátori has worked for Sanofi-Aventis/Chinoin in Budapest for 20 years as the European Section head of Medical Chemistry. Additionally, he worked at multiple European academic institutions: University of Madrid, Spain, Sorbonne, France, Sevtsenko University, Ukraine, University of Graz, Austria, Munich University, Germany and Humboldt University, Germany. Sándor is an expert in both the scientific and management aspects of drug discovery.

Register as a user for early access


Register as a supplier to participate


