Five hundred models in the Ersilia Model Hub

Our aim is to reach 500 models by the end of 2022, including AI/ML assets developed by the scientific community as well as in-house assets built by us.

Latest models

Check the latest additions to the Ersilia Model Hub! We systematically look for AI/ML models and datasets in the scientific literature and incorporate them into our platform.

Contribute to the Ersilia Model Hub

Suggest the models or datasets you would like to see in the Ersilia Model Hub or help us improve our open source platform!

Open source antimalarials

We have contributed to the Open Source Malaria Consortium, aimed at developing antimalarial drugs following a collaborative approach.

A collaborative approach to antimalarial drug discovery

The Open Source Malaria (OSM) consortium aims to identify new treatments against malaria using a fully Open Science approach. This means all findings are disclosed in real time, promoting scientific collaboration and overcoming intellectual property constraints. We propose a wet-lab/dry-lab cycle of collaboration between OSM and Ersilia where new compounds devised by the computer are probed experimentally and undergo successive rounds of modification to achieve a highly potent antimalarial that can progress to clinical trials.

406,000

Potential antimalarial drug candidates

We used generative AI/ML methods to create a long list of antimalarial drug candidates, drawing on the expertise of Open Source Malaria.

1,200

Predicted to be highly active

Then, we evaluated each drug candidate with a high-confidence predictive model for activity against the malaria parasite. We also considered synthetic accessibility of the compounds, as well as drug-like properties.

35

Selected for experiments

Finally, we selected a list of high-confidence compounds for experimental validation by the Open Source Malaria team. Experimental validation coming soon!
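
The three-step funnel described above (generate, score, select) can be sketched as a simple filter-and-rank routine. This is only an illustration under stated assumptions, not the pipeline actually used with Open Source Malaria: the field names (activity_prob, sa_score, qed), the thresholds and the toy molecules are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    smiles: str           # chemical structure written as a SMILES string
    activity_prob: float  # hypothetical predicted probability of activity (0 to 1)
    sa_score: float       # hypothetical synthetic accessibility (lower = easier to make)
    qed: float            # hypothetical drug-likeness score (0 to 1, higher = better)


def select_candidates(candidates, min_activity=0.8, max_sa=4.0, min_qed=0.5, top_n=35):
    """Keep candidates that pass all three filters, ranked by predicted activity."""
    passing = [
        c for c in candidates
        if c.activity_prob >= min_activity and c.sa_score <= max_sa and c.qed >= min_qed
    ]
    passing.sort(key=lambda c: c.activity_prob, reverse=True)
    return passing[:top_n]


# Toy molecules with made-up scores; these are not real antimalarial candidates.
toy_candidates = [
    Candidate("CCO", 0.95, 2.1, 0.70),
    Candidate("c1ccccc1", 0.85, 5.5, 0.60),  # rejected: hard to synthesise
    Candidate("CC(=O)O", 0.40, 1.8, 0.80),   # rejected: low predicted activity
    Candidate("CCN", 0.90, 3.0, 0.55),
]
selected = select_candidates(toy_candidates, top_n=2)
print([c.smiles for c in selected])  # ['CCO', 'CCN']
```

In practice, each score would come from a dedicated model or calculator, and the thresholds would be agreed with the experimental team.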

Setup of a virtual screening cascade

AI/ML models can support the decision-making process at every stage of the drug discovery cascade

Automated machine learning for out-of-the-box predictive models

We have developed an automated AI/ML pipeline (ZairaChem) to facilitate on-demand modelling.

No processing

A data pre-processing module takes care of removing spurious data points, standardising chemical structures, and harmonising inputs and outputs.

Comprehensive molecular representation

We use a comprehensive set of small molecule descriptors, including physicochemical properties, graph-based embeddings, text representations and bioactivity signatures.

State-of-the-art AutoML

We benefit from the latest advances in automated machine learning to provide highly performant models without the need to choose algorithms and hyperparameters by hand.

Performance reports

Model validation tests are run automatically, and reports are produced to enable quick assessment of model quality.
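
As a rough illustration of the first and last steps above (data clean-up and performance reporting), the sketch below uses only the Python standard library. It is not ZairaChem code: the function names are hypothetical, stripping whitespace merely stands in for real chemical structure standardisation (which requires a cheminformatics toolkit), and AUROC is just one of many metrics a report could include.

```python
def clean_records(records):
    """Drop rows with missing values and average replicates of the same structure."""
    grouped = {}
    for smiles, activity in records:
        if smiles is None or activity is None:
            continue  # spurious row: missing structure or measurement
        key = smiles.strip()  # placeholder for real structure standardisation
        if not key:
            continue
        grouped.setdefault(key, []).append(float(activity))
    return {key: sum(values) / len(values) for key, values in grouped.items()}


def auroc(y_true, y_score):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy data with made-up values.
raw = [("CCO", 1.0), (" CCO ", 3.0), (None, 2.0), ("CCN", None), ("CCC", 0.5)]
cleaned = clean_records(raw)
print(cleaned)  # {'CCO': 2.0, 'CCC': 0.5}

score = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(f"AUROC: {score:.2f}")  # AUROC: 0.75
```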

End-to-end implementation at H3D

Our first adopter has been the Holistic Drug Discovery and Development Centre (H3D) at the University of Cape Town, South Africa. Thanks to ZairaChem, we have successfully trained over a dozen AI/ML models based on H3D historical screening data. Currently, our models serve over one hundred scientists in the centre.

Collect experimental data

We start by collecting experimental data available from our collaborator. At H3D, multiple bioassays related to malaria, tuberculosis and antimicrobial resistance were available.

Train AI/ML predictive models

We use ZairaChem to train AI/ML models at scale, based on our collaborators' data. Our framework has built-in AutoML methodologies that yield excellent out-of-the-box models.

Deploy models on-premises

We make our models broadly accessible to our partner institutions. We identify local champions and train them to use and maintain our tools.

African medicinal plants as a source of novel antiviral drugs

We are contributing to the setup of a platform for nature-inspired identification of novel antivirals with distinctive mechanisms of action.

Nature-inspired drug discovery in Cameroon

We provide our partners at the University of Buea (Cameroon) with a set of AI/ML tools to predict the potential antiviral benefits of natural products identified in African medicinal plants. The primary focus is HIV (AIDS) and SARS-CoV-2 (Covid-19).

Bill & Melinda Gates Foundation Calestous Juma Science Leadership Fellowship

We participate in this project as partners to Prof. Fidele Ntie-Kang (University of Buea), who was awarded a BMGF Calestous Juma Science Leadership Fellowship in 2021. Dr. Ian Tietjen, from The Wistar Institute, is also part of the collaboration.

Privacy-preserving AI/ML to foster open science in pharmaceutical companies

We are creating a tool to encrypt our AI/ML models in order to encourage pharmaceutical companies to contribute their data to the open domain without compromising their IP.

Encryption of AI/ML models

We argue that experimental results produced by pharmaceutical companies may be effectively made available to the scientific community in the form of AI/ML models, which retain the essential properties of the data but do not display the identity of the underlying compounds. We have received the support of a Merck BioPharma Speed Grant to develop a first prototype of an AI/ML model encryption tool tailored to small molecule data.

Ersilia
Modelling and encryption tools

We provide ZairaChem, our automated AI/ML tool for chemistry, and ChemXOR, an AI/ML model encryption tool.

Data provider
IP-sensitive datasets

Using ChemXOR & ZairaChem, pharmaceutical companies can train encrypted AI/ML models end-to-end.

Host
Cloud deployment

Models can be hosted in the cloud in their encrypted form. The data provider, Ersilia, or a third party can act as the model host.

User
Protected searches

Users can query models. Privacy of user queries is also ensured by ChemXOR, so that the identity of the queries is not revealed to the host.

Capacity building

We believe the best way to transfer skills is by working side-by-side with our collaborators. Based on these interactions, we create resources focused on the dissemination of computational skills (AI/ML and others) to scientists in different fields.

Python 101

We have created an introductory Python program for chemists and biologists, organized in four half-day lessons.

Setting up the environment

How to work with conda environments and Jupyter notebooks.

Large datasets

Use pandas to load tables and analyse large datasets.

Plotting

Use matplotlib to create beautiful data representations.

Machine learning

Introduction to machine learning using scikit-learn.

Record linkage for medical data analysis

Clinical record linkage, the matching of a patient's data across different medical facilities, is a crucial step to ensure appropriate care and follow-up. In many healthcare systems, however, it is not yet automated and requires a large manual data curation effort. AI/ML models can help speed up the process and reduce the error rate. In the context of the ESTHER project, led by the Swiss Tropical and Public Health Institute, we prepared four sessions to introduce data managers and epidemiologists to AI/ML for record linkage.
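
To give a flavour of the problem, here is a minimal rule-based sketch of record linkage using only the Python standard library. It is not material from the ESTHER sessions and it swaps in simple string similarity where an AI/ML model would normally be trained. The record fields, the 0.85 threshold and the patient names are all hypothetical and fictitious.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case the text and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """String similarity between 0 and 1 after normalisation."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def link_records(records_a, records_b, threshold=0.85):
    """For each record in A, find the best B record with the same birth date
    and a name similarity at or above the threshold."""
    links = []
    for rec_a in records_a:
        best, best_score = None, 0.0
        for rec_b in records_b:
            if rec_a["birth_date"] != rec_b["birth_date"]:
                continue  # blocking rule: only compare records sharing a birth date
            score = similarity(rec_a["name"], rec_b["name"])
            if score > best_score:
                best, best_score = rec_b, score
        if best is not None and best_score >= threshold:
            links.append((rec_a["id"], best["id"], round(best_score, 2)))
    return links


# Fictitious records from two facilities; the second spells one name differently.
clinic_a = [
    {"id": "A1", "name": "Amina Njoroge", "birth_date": "1990-04-12"},
    {"id": "A2", "name": "John Mwangi", "birth_date": "1985-01-30"},
]
clinic_b = [
    {"id": "B7", "name": "amina  njorge", "birth_date": "1990-04-12"},
    {"id": "B9", "name": "Peter Otieno", "birth_date": "1985-01-30"},
]
links = link_records(clinic_a, clinic_b)
print(links)
```

Here only A1 and B7 are linked, despite the misspelt name. A learned model would typically combine several such similarity features (names, dates, addresses) instead of relying on a single hand-picked threshold.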

Code repositories

Browse a selection of our GitHub repositories to find out more about specific projects, read the code and contribute!

Open source and easy to use

We aim to provide computational tools that simplify and speed up day-to-day research. We invest a lot of time and effort into making our code available to the broad scientific community.

User-friendly

Our assets can be used without specific data science expertise.

Lightweight

We make tools that can run in the browser or on standard computers.

Precomputed datasets

We provide a number of predictions off-the-shelf to reduce computational burden.

Modular

Complex pipelines can be built by assembling multiple AI/ML assets.

Clean code

Our code is well organized, commented, readable and open.

Support

We work hand-in-hand with our collaborators and users.