Cambridge Cheminformatics Newsletter, June – August 2024; Events, Vacancies, Resources …

Dear All,

I would like to circulate some current cheminformatics (and related) news to everyone, so here we go…

Events

21-22 August 2024
8th Annual Danish Bioinformatics Conference
Copenhagen, Denmark
https://eventsignup.ku.dk/8danishbioinfconference

1-4 September 2024
17th International Meeting of the European Calcium Society
Cambridge, UK
https://cambridge2024.calciumsociety.com

4 September 2024
Cambridge Cheminformatics Meeting
Cambridge, UK and Virtual (Hybrid)
Direct Zoom Registration: https://cam-ac-uk.zoom.us/meeting/register/tZMpdeGvrDoqEtO6rNc-gbcU4qFAZU10mZw3
More details: https://www.c-inf.net

Programme

Drug Discovery Through a Quantum Lens – Augmenting Traditional Approaches With Physics and AI
David Wright, Kuano
https://www.kuano.ai

Automated Patent Chemistry Extraction and Patent SAR Curation in PubChem: A Look at Two Gift Horses
Chris Southan, Honorary Professor, University of Edinburgh
https://cdsouthan.blogspot.com

Exploring Ultra-Large Virtual Libraries with 3D Descriptors: Alternatives for Ligand and Structure-Based Drug Design
Javier Vazquez, Pharmacelera
https://pharmacelera.com

… followed as usual by a visit to the Panton Arms!

9-13 September 2024
Machine Learning for Chemistry 2024
Karlsruhe, Germany
https://aimat.iti.kit.edu/ml4chem2024.php

30 September – 1 October 2024
PhysChem Forum
Jealott’s Hill, UK
http://physchem.org.uk/pcf2024/pcf2024.html

1-2 October 2024
ChEMBL 15 Year Symposium “ChEMBL@15, SureChEMBL@10”
Hinxton, UK
https://www.eventsforce.net/embl/frontend/reg/thome.csp?pageID=96136&eventID=151

4 October 2024
Computational Drug Design – A Tribute to Frank Blaney
Stevenage, UK
https://irinatkhnv.wixsite.com/fb-symposium

8 October 2024
Discngine Meetup Vol. 4: Innovative Strategies for Biotherapeutic Developability Assessment
Virtual Event
https://www.discngine.com/discngine-meetup-2024

16-17 October 2024
Integrating AI into Chemical Safety Assessment – Opportunities, Challenges, and the Path Forward
Sophia Antipolis, France
https://www.ecetoc.org/event/ai-workshop

17 October 2024
Autumn UK-QSAR 2024 Meeting
Oxford, UK
https://ukqsar.org/index.php/2024/07/12/autumn-uk-qsar-2024-meeting

20-23 October 2024
2nd BioExcel Conference on Advances in Biomolecular Simulations
Brno, Czech Republic
https://bioexcel.eu/events/2nd-bioexcel-conference-on-advances-in-biomolecular-simulations

3-6 November 2024
18th German Conference on Cheminformatics (GCC 2024)
Bad Soden, Germany
https://veranstaltungen.gdch.de/microsite/index.cfm?l=11576

1-5 June 2025
13th International Conference on Chemical Structures
Noordwijkerhout, The Netherlands
https://iccs-nl.org

Jobs

Head of Computational Chemistry
Nxera Pharma
Cambridge, UK
https://www.linkedin.com/jobs/view/3984794682

Director of Cheminformatics and Computational Chemistry
BenevolentAI
Cambridge, UK
https://www.linkedin.com/comm/jobs/view/3987581986

Principal Scientists, Discovery Data Science
Johnson & Johnson
Beerse, Belgium
https://www.linkedin.com/jobs/view/3906287118

AI Platform Engineer
Pangea Bio
Berlin, Germany (or London, UK)
https://www.pangeabio.com/people/careers

Backend Engineer
DeepMirror
London, UK
https://apply.workable.com/deepmirror

Computational Chemist, Computational Protein Engineer (Biologics)
Isomorphic Labs
London, UK
https://www.linkedin.com/jobs/view/3966536422
https://www.linkedin.com/jobs/view/3968297662

Computational Chemist Contractor
Bicycle Therapeutics
Cambridge, UK
https://jobs.smartrecruiters.com/ni/BicycleTherapeutics/47d39a3e-3a16-4346-8e71-9cc84a217eda-computational-chemist-contractor

Modeler, Computer Aided Drug Design
UCB
Slough, UK
https://www.linkedin.com/jobs/view/3985182106

Molecular Modeler/Computational Chemist
Galapagos
Mechelen, Belgium
https://www.linkedin.com/jobs/view/3989078297

Senior Scientist Computational Protein Design
Roche
Penzberg, Bavaria
https://www.linkedin.com/jobs/view/3989966738

Sr. Scientist/Specialist, Computational Peptide Design; Director of Digital Strategy & Program Management
Novo Nordisk
Copenhagen, Denmark
https://www.linkedin.com/comm/jobs/view/3991300175
https://www.linkedin.com/jobs/view/3969087370

(Senior) Computational Drug Designer, Machine Learning Specialist
Selvita
Cracow, Poland
https://www.linkedin.com/comm/jobs/view/3926475874
https://www.linkedin.com/comm/jobs/view/3991317111

Associate Professor in Bioinformatics
Uppsala University
Uppsala, Sweden
https://uu.varbi.com/en/what:job/jobID:734719

Head of Frontier Research (and others)
Owkin
London, UK and Paris, France
https://www.linkedin.com/jobs/view/3968109638

Machine Learning Scientist Bio Modelling
Bayer
Berlin, Germany
https://www.linkedin.com/jobs/view/3991311683

Director Digital Life Sciences
Nuvisan
Berlin, Germany
https://www.linkedin.com/jobs/view/3959694646

Fellowship Computational Chemist – Biomolecular NMR Unit
Gruppo San Donato
Milan, Italy
https://www.linkedin.com/jobs/view/3957712539

Senior Scientist Structural Biology and Machine Learning
Sanofi
Waltham, MA
https://sanofi.wd3.myworkdayjobs.com/SanofiCareers/job/Waltham-MA/Senior-Scientist_R2748652-1

PhD Students, Various Areas in AI/drug discovery
Queen Mary
London, UK
https://www.qmul.ac.uk/deri/ukri-aidd-doctoral-training-programme/projects/

Project/Consulting Work: Setting up Open Targets as a private instance
Syzonc
Remote
Contact: Alan Mathason alan.mathason [] syzonc.com

Cheminformatics

Metis – A Python-Based User Interface to Collect Expert Feedback for Generative Chemistry Models
https://chemrxiv.org/engage/chemrxiv/article-details/66421031418a5379b0255d8a
Integrating the Man (and Woman) and the Machine

The problem(s) with scaffold splits, part 1
https://greglandrum.github.io/rdkit-blog/posts/2024-05-31-scaffold-splits-and-murcko-scaffolds1.html
… one of those painful topics (and Evergreens) in cheminformatics
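For readers who have not met the idea before, a scaffold split keeps all molecules that share a scaffold on the same side of the train/test boundary. The following is only a minimal sketch of one common greedy variant, not the procedure from the linked post (which is about the problems with such splits). It assumes the scaffold keys have already been computed elsewhere, for example as Murcko scaffold SMILES with RDKit; the function name `scaffold_split` and the example data are hypothetical.

```python
from collections import defaultdict


def scaffold_split(scaffolds, test_fraction=0.2):
    """Greedy group split: molecules sharing a scaffold never straddle train/test.

    `scaffolds` is a list of precomputed scaffold keys (assumed input),
    one per molecule. Returns two lists of molecule indices.
    """
    # Group molecule indices by their scaffold key.
    groups = defaultdict(list)
    for index, scaffold in enumerate(scaffolds):
        groups[scaffold].append(index)

    # Largest scaffold groups are assigned first (one common convention).
    ordered = sorted(groups.values(), key=len, reverse=True)
    train_target = len(scaffolds) * (1 - test_fraction)

    train, test = [], []
    for members in ordered:
        # A whole group goes to train if it still fits, otherwise to test.
        if len(train) + len(members) <= train_target:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


# Hypothetical scaffold keys for ten molecules.
example_scaffolds = (
    ["c1ccccc1"] * 4 + ["c1ccncc1"] * 3 + ["C1CCCCC1"] * 2 + ["c1ccc2ccccc2c1"]
)
train_idx, test_idx = scaffold_split(example_scaffolds)
```

Because whole groups are moved at once, the achieved test fraction only approximates the requested one, which is one of several reasons such splits deserve a careful look.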

An Open-Source Implementation of the Scaffold Identification and Naming System (SCINS) and Example Applications
https://chemrxiv.org/engage/chemrxiv/article-details/66b40b2e01103d79c51dc457
https://github.com/PangeAI/SCINS
A method originally from Novartis to classify scaffolds, now available as an open source implementation

I’m a bit skeptical of AlphaFold 3
https://olegtrott.substack.com/p/are-alphafolds-new-results-a-miracle
A man with an opinion

lwreg: A Lightweight System for Chemical Registration and Data Storage
https://chemrxiv.org/engage/chemrxiv/article-details/66843d96c9c6a5c07a124e47
https://www.youtube.com/watch?v=ZHdx1GPP178
… another one of those painful topics in cheminformatics, now getting more lightweight

Alipheron DataWarrior plugin available
https://www.alipheron.com/products/hyperspace_dw_plugin
To ‘search Enamine REAL Space’s universe of 38 billion synthesisable molecules through lightning-fast substructure searches’

iSIM / Instant Similarity
https://github.com/mqcomplab/iSIM
‘to perform multiple comparison simultaneously and getting the exact same value as the average pairwise comparisons of molecules represented by binary fingerprints or real number descriptors’
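To give a feel for why a single pass over the data can be enough, here is a small sketch of the column-sum idea for the Russell-Rao index, where the identity is easy to verify by hand. This is not the iSIM code and the function names are hypothetical; for the indices and descriptors the package actually supports, please see the repository.

```python
from itertools import combinations
from math import comb


def russell_rao_pairwise_mean(fps):
    """Average Russell-Rao similarity computed the slow way, pair by pair."""
    n_bits = len(fps[0])
    pairs = list(combinations(fps, 2))
    # Russell-Rao for one pair: shared on-bits divided by fingerprint length.
    total = sum(sum(a & b for a, b in zip(x, y)) / n_bits for x, y in pairs)
    return total / len(pairs)


def russell_rao_from_column_sums(fps):
    """Same average from column sums only, without enumerating pairs."""
    n_molecules = len(fps)
    n_bits = len(fps[0])
    # k on-bits in a column contribute comb(k, 2) shared on-bits over all pairs.
    column_sums = [sum(column) for column in zip(*fps)]
    shared = sum(comb(k, 2) for k in column_sums)
    return shared / (n_bits * comb(n_molecules, 2))


# Hypothetical toy fingerprints: four molecules, six bits each.
fingerprints = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
]
slow = russell_rao_pairwise_mean(fingerprints)
fast = russell_rao_from_column_sums(fingerprints)
```

The pairwise version scales quadratically with the number of molecules, while the column-sum version scales linearly, which is the appeal for large libraries.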

VisualizeChemspaceUsingTMAP.py
http://www.mayachemtools.org/docs/scripts/html/VisualizeChemspaceUsingTMAP.html
As the name says. Also check out MayaChemTools more widely; it has greatly increased in functionality over the years

QupKake: Integrating Machine Learning and Quantum Chemistry for micro-pKa Predictions
https://chemrxiv.org/engage/chemrxiv/article-details/656cd67c5bc9fcb5c9f61f8a
‘QupKake outperforms state-of-the-art models on a variety of benchmark datasets, with root mean square errors (RMSEs) between 0.5-0.8 pKa units on five external test sets’

Practical Cheminformatics With Open Source Software
https://github.com/PatWalters/practical_cheminformatics_tutorials
Pat Walters’ excellent cheminformatics tutorials

CIME4R: Exploring iterative, AI-guided chemical reaction optimization campaigns in their parameter space
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-024-00840-1
Reaction optimization is getting automated – a piece of software that might help

Computer-Aided Drug Design (CADD) Vault
https://drugbud-suite.github.io/CADD_Vault
‘an open-source repository dedicated to sharing resources, tools, and knowledge in the field of computer-aided drug design’, currently containing 1,220 links

GAUCHE: A Library for Gaussian Processes in Chemistry
https://github.com/leojklarner/gauche
‘We provide 30+ bespoke kernels for molecules, chemical reactions and proteins and illustrate how they can be used for Gaussian processes and Bayesian optimisation in 10+ easy-to-adapt tutorial notebooks.’

MolPipeline: A python package for processing molecules with RDKit in scikit-learn
https://chemrxiv.org/engage/chemrxiv/article-details/661fec7f418a5379b00ae036
‘We introduce the MolPipeline package, which extends this concept to chemoinformatics by wrapping default functionalities of RDKit, such as reading and writing SMILES strings or calculating molecular descriptors from a molecule object. […] In addition, we included common cheminformatics tasks, like scaffold splits and molecular standardization, natively in the pipeline framework and adaptable for the needs of various projects.’

QSARtuna: QSAR model building with the Optuna framework
https://github.com/MolecularAI/QSARtuna
‘This library searches for the best ML algorithm and molecular descriptor for the given data.’

Generative Chemistry for Everyone: A Hands-On Guide using SAFE Encodings
https://moleculeinsight.com/generative-chemistry-for-everyone-a-hands-on-guide-using-safe-encodings
A low-barrier way to warm up to generative chemistry. I am quite amazed by the changes over the last 10-15 years or so… code, data and tutorials are all out there today, for everyone who would ‘like to do things’

CarsiDock: a deep learning paradigm for accurate protein–ligand docking and screening based on large-scale pre-training
https://pubs.rsc.org/en/content/articlehtml/2024/sc/d3sc05552c
‘Further explorations demonstrate that CarsiDock can not only guarantee the topological reliability of the binding poses but also successfully reproduce the crucial interactions in crystalized structures, highlighting its superior applicability’ – any experience you can share?

Cambridge Cheminformatics Meeting Recording from 8 May 2024 Available
https://www.youtube.com/watch?v=Q32Uz3TH8Zc
Topics: 3D Pharmacophore Searches in Ultra-Large Libraries, Creative Ways to Get Lab Scientists and Data Scientists to Work Closer Together, Image to Chemical Structure Conversion Directly Done in the Clipboard

MolModa
https://durrantlab.pitt.edu/molmoda/
‘MolModa is a browser-based drug discovery suite[…] It runs computational-chemistry calculations on your local computer, without requiring extensive remote resources’

Comparing Tautomer Generation Algorithms
https://bertiewooster.github.io/2024/05/01/Tautomer-Sources-Comparison.html
… be in for a very thorough (and painful) read – and don’t tell me I didn’t warn you!

Extrapolation validation (EV): a universal validation method for mitigating machine learning extrapolation risk
https://pubs.rsc.org/en/content/articlelanding/2024/dd/d3dd00256j
We need to get better at truly prospective validation (if such a thing, apart from actual project work, really exists)

… beyond cheminformatics …

‘AI Drugs So Far’, by Derek Lowe
https://www.science.org/content/blog-post/ai-drugs-so-far
Commenting on:
How successful are AI-discovered drugs in clinical trials? A first analysis and emerging lessons
https://www.sciencedirect.com/science/article/pii/S135964462400134X

Genetic factors associated with reasons for clinical trial stoppage
https://www.nature.com/articles/s41588-024-01854-z
That being said, insufficient enrollment is the most important factor for stopped clinical trials overall

Gene regulatory network containing signed transcription factor-target gene interactions
https://github.com/saezlab/CollecTRI
… because in the end it’s not just about cheminformatics, you want to know what happens downstream with your system

The Simple Macroeconomics of AI
https://economics.mit.edu/sites/default/files/2024-05/The%20Simple%20Macroeconomics%20of%20AI.pdf
A more realistic estimate of the economic impact of Ey Ay: ‘Consequently, predicted TFP [total factor productivity] gains over the next 10 years are even more modest and are predicted to be less than 0.53%.’ – any comments, PwC?

Do models collapse when trained on recursively generated data?
Yes:
https://www.nature.com/articles/s41586-024-07566-y
No:
https://arxiv.org/abs/2404.01413
Who Knows!

Keith Hornberger’s Ramblings
https://substack.com/@krhornberger
… keep your mind fresh with some PK-related (and other) discussions

Learning in High Dimension Always Amounts to Extrapolation
https://arxiv.org/abs/2110.09485
… and chemical space is certainly high-dimensional. So – is QSAR meant to work, then?

EFMC Hit-to-Lead Webinar Series
https://www.efmc.info/hit-to-lead
Webinar about H2L, including case studies

Chemistry Conference Database, by Nessa the Chemist
https://supersciencegrl.co.uk/conferences
… thank you, Nessa!

KAN: Kolmogorov-Arnold Networks
https://arxiv.org/abs/2404.19756
https://spectrum.ieee.org/kan-neural-network
Two months ago this was quite hot, but I haven’t heard much about them since. So, did I miss the revolution?

xLSTM: Extended Long Short-Term Memory
https://twitter.com/itsandrewgao/status/1788077054367596657
https://www.unite.ai/xlstm-a-comprehensive-guide-to-extended-long-short-term-memory/
https://arxiv.org/abs/2405.04517
From LSTM to transformers to xLSTM. Go, Austria, Go!

Making good decisions in early drug discovery
https://www.sciencedirect.com/science/article/pii/S1359644624001417
I tend to agree with the authors that good decisions are preferable over bad decisions

Quantum Computing’s Hard, Cold Reality Check
https://spectrum.ieee.org/quantum-computing-skeptics
Related from the ACM: https://cacm.acm.org/research/disentangling-hype-from-practicality-on-realistically-achieving-quantum-advantage
Maybe Quantum Computing running AI on the Blockchain is not yet ready, then?

Stanford Drug Discovery Symposium 2024
https://med.stanford.edu/cvi/events/2024-drug-discovery-conference.html
Recorded talks online

An empirical analysis of overall survival in drug approvals by the US FDA (2006–2023)
https://onlinelibrary.wiley.com/doi/10.1002/cam4.7190
On the efficacy of drugs (and the problems with surrogate endpoints in clinical trials)

Hematopoietic-Cell Transplantation at 50
https://redbook.streamliners.co.nz/SCT%20Hematopoietic-Cell%20Transplantation%20at%2050.pdf
Nothing to do with cheminformatics in the slightest… but, keep your minds open folks, bio is where it now happens! Quite a nice historical read

… and clearly beyond cheminformatics

ChatGPT is bullshit
https://link.springer.com/article/10.1007/s10676-024-09775-5
… probably of the soft version, to be precise

Udm=14
https://arstechnica.com/gadgets/2024/05/google-searchs-udm14-trick-lets-you-kill-ai-search-for-good
https://www.google.com/webhp?udm=14
… if you have had enough of ‘AI’ in Google searches
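If you would rather build such links yourself, the trick boils down to adding `udm=14` to the query string. A minimal sketch follows; the `/search` path and the `q` parameter are my assumptions about the usual Google search URL (the link above only shows `/webhp?udm=14`), and the function name is hypothetical.

```python
from urllib.parse import urlencode


def google_web_only_url(query):
    """Build a Google search URL carrying the udm=14 parameter.

    The '/search' endpoint and the 'q' parameter are assumptions here;
    only 'udm=14' comes from the links above.
    """
    return "https://www.google.com/search?" + urlencode({"q": query, "udm": "14"})


url = google_web_only_url("scaffold split")
```

Most browsers let you register such a URL template as a custom search engine, so the parameter is added automatically.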

The Sam Altman Playbook
https://garymarcus.substack.com/p/the-sam-altman-playbook
The older you get, the more boring this all becomes  

tiny-gpu
https://github.com/adam-maj/tiny-gpu
‘A minimal GPU implementation in Verilog optimized for learning about how GPUs work from the ground up’ (for those who need a new project after the children have left the house, or after a divorce or the like)

‘Los Alamos chess game 2 (after P-K3) is solved; black wins in 21 moves’
https://content.iospress.com/articles/icga-journal/icg240247
Now I know what Roger does during those long nights when the lights are always on!

Age of Invention: The Second Soul, Part I
https://www.ageofinvention.xyz/p/age-of-invention-the-second-soul?ref=thediff.co
About: Salt, and the impact it had on places and history

What if Ukraine is forced to surrender to Russia?
https://twitter.com/IAPonomarenko/status/1788283804664168467
Hypotheticals can help identify the right decision that needs to be made

How many variables can humans process?
https://pubmed.ncbi.nlm.nih.gov/15660854
TLDR: Four – but actually only three. Certainly not five.

Bioicons
https://bioicons.com
Might be useful for some

Is the Atlantic Overturning Circulation Approaching a Tipping Point?
https://tos.org/oceanography/article/is-the-atlantic-overturning-circulation-approaching-a-tipping-point
Hopefully not

Sea surface temperatures, 1854-2024
https://twitter.com/EliotJacobson/status/1789081688297050324
Some further data on the preceding point

A moment that changed me: I was divorced, broke and alone – but I turned my life around with a list
https://www.theguardian.com/lifeandstyle/2024/apr/10/a-moment-that-changed-me-divorced-broke-alone-i-turned-my-life-around-with-list
Not every list will turn things around though. You need to be in the right mindset, too.

On a more trivial note: I am happy to report that both https://www.c-inf.net (for our cheminformatics meetings) and https://www.drugdiscovery.net (for the newsletter archive) now have SSL certificates, so they no longer trigger browser warnings… we have now officially arrived in the 21st century!

I believe this is all from my side for now. If you have any information for me to circulate, or wish to present at one of our next Cambridge Cheminformatics Meetings, please just let me know. I hope to see you again in Cambridge on 4 September (there will be a special treat this time, so please come and join in). Cheers!

Best wishes,

Andreas
