Cambridge Cheminformatics Newsletter, February 2025 Edition

Dear All,

I would like to circulate some current Cheminformatics (and related) news to everyone. In particular, our next Cambridge Cheminformatics Meeting will take place on 19 February, as usual in hybrid mode at the CCDC and on Zoom; please see below for details.

So here we go …

Events

19 February 2025
Cambridge Cheminformatics Meeting

More information: https://c-inf.net
Direct registration: https://cam-ac-uk.zoom.us/meeting/register/PnzFPHY4QluWnHIW14-6Fw

Identifying Novel Nanomolar A2A Receptor Ligands by Combining Docking and Reinforcement Learning
Morgan Thomas, Universitat Pompeu Fabra
https://github.com/MorganCThomas

Reversible Molecular Simulation for Training Classical and Machine Learning Force Fields
Joe Greener, LMB
https://www2.mrc-lmb.cam.ac.uk/group-leaders/a-to-g/joe-greener

A Whirlwind Tour of Gaussian Processes for Chemistry
Austin Tripp, Valence Labs
https://www.austintripp.ca

24-28 February 2025
26th Annual Online Computational Cellular Biology Workshop
Virtual Event
https://compcellbio.org/ccbworkshop

10/11 March 2025
ELRIG Research & Innovation 2025
London, UK
https://elrig.org/portfolio/research-innovation-2025-big-molecules-to-big-data-thinking-big-to-drive-innovative-research

11 March 2025
Computational AI Tools in Natural Products
Virtual Event
https://new.phytochemicalsociety.org/wp-content/uploads/2025/01/Webinar_20250311.pdf

11 March 2025
Virtual Workshop: Search and Access Structural Chemistry Data With the CSD Python API
Virtual Event
https://www.ccdc.cam.ac.uk/community/events

19 March 2025
Setting New Standards in SBDD: Experimental and Predicted Protein Structure Synergies
Cambridge, UK
https://www.discngine.com/discngine-labs-live-in-cambridge

8 April 2025
4th SCI-RSC Workshop on Computational Tools for Drug Discovery
Leeds, UK
https://www.ccdc.cam.ac.uk/community/events/conferences/4th-sci-rsc-workshop-on-computational-tools-for-drug-discovery-2025

10/11 April 2025
RDKit North American UGM/Cheminformatics Meeting
Cambridge, MA
https://bagimcommunications.blogspot.com/2025/01/rdkit-north-american-ugmcheminformatics.html

14-17 April 2025
British Crystallographic Association (BCA) Spring Meeting 2025
Leeds, UK
https://registrations.hg3conferences.co.uk/bca2025

12-16 May 2025
International Workshop on Open Molecular Informatics (IWOMI)
Bolzano, Italy
https://www.iwomi.net

30 June 2025
Machine Learning for Drug Discovery (MLDD) Symposium
London, UK
https://x.com/schwabpa/status/1869818270033031632

13-18 July 2025
GRC CADD Conference: Exploring the Synergy of Machine Learning and Physics-Based Computational Chemistry to Accelerate Drug Discovery
Portland, ME
https://www.grc.org/computer-aided-drug-design-conference/2025

22-24 September 2025
8th Artificial Intelligence in Chemistry Symposium
Cambridge, UK
https://www.rscbmcs.org/events/aichem8

Jobs

Cheminformatician – InChI
Beilstein-Institut
Frankfurt, Germany
https://www.linkedin.com/jobs/view/4114957054

Senior Director, Computational Chemistry & Molecular Design
Psivant
Boston, MA
https://psivant.com/careers/open-positions/sr-director-computational-chemistry-molecular-design

AI Expert in Computer-Aided Drug Design (CADD)
UCB
Slough, UK or Braine, Belgium
https://www.linkedin.com/jobs/view/4127752544

Cheminformatics Developer
AI|ffinity
Prague, Czechia
https://www.linkedin.com/jobs/view/4108013109

2025 AI-for-Science Independent Postdoctoral Fellowship
FutureHouse
Various Locations
https://www.futurehouse.org/fellowship

Drug Design Chemistry Lead
Isomorphic Labs
London, UK
https://www.linkedin.com/jobs/view/4144362180

ML Research Scientist – Large Molecule Discovery
Lilly
Indianapolis, IN
https://careers.lilly.com/us/en/job/R-78216/ML-Research-Scientist-Large-Molecule-Discovery

Machine Learning Engineer
Reactwise
Cambridge, UK
https://www.linkedin.com/posts/reactwise_ml-engineer-position-ugcPost-7281958775790436352-vvp9

Cheminformatics Engineer (BIOTECH)
Nexer Group
Prague, Czechia
https://www.linkedin.com/jobs/view/4119581229

Cheminformatician / Computational Chemist
deepmirror
London, UK
https://www.linkedin.com/jobs/view/4135632857

Senior Computational Chemist
AstraZeneca
Cambridge, UK
https://www.linkedin.com/jobs/view/4143025707

Science Lead – Machine Learning Models; Senior Scientist – Curation/Pipelines
Open Molecular Software Foundation (OMSF)
Remote (US)
https://openadmet.org/jobs

Senior Scientist Computational Modeling
ESQlabs
Remote (EU)
https://careers.esqlabs.com/jobs/5468552-senior-scientist-computational-modeling-m-f-d

Postdoctoral Fellow in Protein Design
Stockholm University
Stockholm, Sweden
https://su.varbi.com/en/what:job/jobID:788272

Postdoc Kinase Affinity Modelling Using Synthetic Data
Leiden University
Leiden, NL
https://www.linkedin.com/jobs/view/4137246973

PhD Positions in AI Driven Drug Discovery
Macao Polytechnic University (MPU)
Macao, Macao
https://cbbio.online/2024/12/22/phd-position-in-aidd

Cheminformatics…

FPSim2
https://chembl.github.io/FPSim2
‘FPSim2 is a NumPy-centric Python/C++ package for running fast compound similarity searches’

ROSHAMBO: Open-Source Molecular Alignment and 3D Similarity Scoring
https://chemrxiv.org/engage/chemrxiv/article-details/668d3fd7c9c6a5c07ac37033
3D has been somewhat of a void in open source – thanks for the contribution!

RSC CICAG DISTILLATE
https://www.rsc.org/globalassets/03-membership-community/connect-with-others/through-interests/interest-groups/cicag/rsc-cica-group-newsletter-winter-24-25.pdf
Winter 2024/25 Edition

Datagrok: Swiss Army Knife for Data
https://datagrok.ai
With free academic licenses

Free-Wilson edge-cases
https://driesvr.github.io/2025/01/06/fwa_edgecases.html
Always good to revisit the basics

WebMolKit 2.0 on GitHub and NPM
http://cheminf20.org/2024/12/27/webmolkit-2-0-on-github-and-npm
‘A new branch of the WebMolKit open source library for cheminformatics on JavaScript platforms is now available’

Molecular Sonification for Molecule to Music Information Transfer
https://chemrxiv.org/engage/chemrxiv/article-details/6236172dd75627dbfb1e0c92
‘The resultant method allows a molecular structure to be heard as a musical composition, where the key of the music is based on the molecular properties and the melody is based on the atom and bond arrangement’

Combining crystallographic and binding affinity data towards a novel dataset of small molecule overlays
https://link.springer.com/article/10.1007/s10822-024-00581-1
‘The LOBSTER set offers a variety of applications like benchmarking multiple as well as pairwise alignments, generating training and test sets, for example based on time splits, or empirical software performance evaluation studies’

Applications Invited for CSA Trust Grants for 2025
https://csa-trust.org/2025/01/14/applications-invited-for-csa-trust-grants-for-2025
Deadline: 17 April 2025

The Info Mesa: Science, Business, and New Age Alchemy on the Santa Fe Plateau
https://www.amazon.com/dp/0393341577
How it all started in Santa Fe

7th Artificial Intelligence in Chemistry Symposium Workshop Material
https://github.com/volkamerlab/ai_in_chemistry_workshop/blob/main/README.md
Thanks for making this available!

Stereochemistry-aware string-based molecular generation
https://chemrxiv.org/engage/chemrxiv/article-details/6757d4eef9980725cf93c698
To Stereochemistry or not?

Learning Reaction SMARTS: A Practical Guide to Reaction-Based Patterns
https://drzinph.com/learning-reaction-smarts-a-practical-guide-to-reaction-based-patterns
Tutorial by Phyo Phyo Kyaw Zin

ACS Lucille Wert Scholarship
https://www.acscinf.org/awards/the-lucille-wert-student-scholarship
Applications open – deadline 7 March 2025

Introduction to Machine Learning for Molecular Property Prediction
https://medium.com/chemical-modelling/introduction-to-machine-learning-for-molecular-property-prediction-d01f5f33af29
by James McDonagh

VTX: Real-time high-performance molecular structure and dynamics visualization software
https://arxiv.org/abs/2501.12750
Open source and free for non-commercial use

…beyond Cheminformatics…

Benchmarking R&D success rates of leading pharmaceutical companies: an empirical analysis of FDA approvals (2006–2022)
https://www.sciencedirect.com/science/article/pii/S1359644625000042
‘Our study reveals an average likelihood of first approval rate of 14.3% across leading research-based pharmaceutical companies, broadly ranging from 8% to 23%.’ (I also didn’t expect that even some big pharma companies run more phase 3 than phase 1 trials; definitely worth reading)

Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress?
https://arxiv.org/abs/2411.04118
So do we need to fine-tune, after all?

How unfair is the coin?
https://ankitg.me/blog/2025/01/06/unfair-coins.html
by Ankit Gupta, of Reverie Labs, on his experience selling software to pharma, and how much compound PK matters (!)

Accurate predictions on small data with a tabular foundation model
https://www.nature.com/articles/s41586-024-08328-6
TabPFN, seems to work well

Ten simple rules for developing good reading habits during graduate school and beyond
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006467
Read widely… it goes far beyond your degree!

OpenADMET Consortium
https://openadmet.org
… ADMET has historically been hampered by a lack of data (even more than many other areas), hoping for this ARPA-H funded initiative to improve things

DEGRADATOR: A Gaming Expedition Into Targeted Protein Degradation Therapies
https://pubs.acs.org/doi/10.1021/acs.jchemed.4c00823
‘Designed for players aged 12 and above, DEGRADATOR introduces the molecular mechanics of targeted protein degradation’… so, youngsters, play DEGRADATOR instead of Pac-Man please!

‘See below for 6,882 pages of MMLU and GSM8K benchmark test set verbatims + the code for regenerating them’
https://www.linkedin.com/posts/louiswhunt_see-below-for-6882-pages-of-mmlu-and-gsm8k-activity-7281011488692047872-fWCE
On the ‘benchmarking’ of some of the models out there

2024 FDA approvals
https://www.nature.com/articles/d41573-025-00001-5
Always good to look at this to remind ourselves of the diversity of chemistry (and biology) getting approved

Refining the impact of genetic evidence on clinical success
https://www.nature.com/articles/s41586-024-07316-0
‘We estimate the probability of success for drug mechanisms with genetic support is 2.6 times greater than those without.’

Better Publishing
https://chem-bla-ics.linkedchemistry.info/2024/09/16/publishing.html
by Egon Willighagen

Finland Publication Forum will downgrade hundreds of Frontiers and MDPI journals
https://retractionwatch.com/2024/12/24/finland-publication-forum-will-downgrade-hundreds-of-frontiers-and-mdpi-journals
Be careful where you publish!

The “Intangible” Game: How VCs See You
https://www.nfx.com/post/how-vcs-see-you
Worthwhile reading for those starting out to start up

Federated Learning: From Theory to Practice
https://github.com/alexjungaalto/FederatedLearning/blob/main/material%2FFL_LectureNotes.pdf
Lecture Notes by Alexander Jung

Predictions Scorecard (on developments in tech)
https://rodneybrooks.com/predictions-scorecard-2025-january-01
by Rodney Brooks

Pretraining on the Test Set Is All You Need
https://arxiv.org/abs/2309.08632
How to build the perfect ML model

… and clearly beyond Cheminformatics

Coffee drinking timing and mortality in US adults
https://doi.org/10.1093/eurheartj/ehae871
Coffee is healthy… but have your cups early enough in the day

How Living Abroad Helps You Develop a Clearer Sense of Self
https://hbr.org/2018/05/how-living-abroad-helps-you-develop-a-clearer-sense-of-self
Quite obvious to those who have done it

The loudest megaphone: how Trump mastered our new attention age
https://www.theguardian.com/news/2025/jan/28/the-loudest-megaphone-how-trump-mastered-our-new-attention-age
Guess there’s a lot of truth to this

Why I’m quitting the Washington Post
https://anntelnaes.substack.com/p/why-im-quitting-the-washington-post
Includes visualizations of current relevance
See also: https://anntelnaes.substack.com/p/with-gratitude

Where Should You Park Your Car? The 1/2 Rule
https://arxiv.org/abs/2003.10603
Glad that’s settled

200Bn Weights of Responsibility – The Stress of Working in Modern AI
https://docs.google.com/document/d/1aEdTE-B6CSPPeUWYD-IgNVQVZM25f7MF-u9qn5KJJvo/mobilebasic
See also: https://www.reddit.com/r/reinforcementlearning/comments/1hrzgdg/felix_hill_has_died_dm

Phase behavior of Cacio and Pepe sauce
https://arxiv.org/abs/2501.00536
‘we present a scientifically optimized recipe based on our findings, enabling a consistently flawless execution of this classic dish’… finally!

Leap second and UT1-UTC information
https://www.nist.gov/pml/time-and-frequency-division/time-realization/leap-seconds
Wasn’t entirely aware of leap seconds before – don’t forget to celebrate the next one on 19 March 2025!

You can’t play 20 questions with nature and win: projective comments on the papers of this symposium
https://www.coli.uni-saarland.de/~crocker/documents/Newell-1973.pdf
Does research really provide progress and deeper understanding? Comments on a psychology conference from 1973

I believe this is all from my side for now – if you have any information for me to circulate, or wish to present at one of our next Cambridge Cheminformatics Meetings, please just let me know, cheers!

Best wishes,
Andreas
