Computational Chemistry
GraSP feed-forward architecture showing GNN, FiLM-conditioned CNN, and MLP classification head

GraSP: Graph Recognition via Subgraph Prediction (2026)

GraSP introduces a general framework for recognizing graphs in images by framing the task as sequential subgraph prediction with a binary classifier. A GNN conditions a CNN via FiLM layers to predict whether a candidate graph is a subgraph of the target. Applied to optical chemical structure recognition (OCSR) on QM9, GraSP achieves 67.5% accuracy with no domain-specific modifications.
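To make the FiLM conditioning step concrete, here is a minimal pure-Python sketch of feature-wise linear modulation: a conditioning vector (standing in for a GNN embedding of the candidate graph) is mapped to a per-channel scale and shift that modulate an image feature map. The function name `film`, the nested-list tensor layout, and the linear projections are illustrative assumptions, not GraSP's actual implementation.

```python
def film(feature_map, cond, w_gamma, b_gamma, w_beta, b_beta):
    """Apply FiLM: out[c][h][w] = gamma[c] * feature_map[c][h][w] + beta[c].

    feature_map: nested list of shape [C][H][W] (hypothetical CNN activations)
    cond:        list of length D (hypothetical graph embedding)
    w_*:         nested lists of shape [C][D]; b_*: lists of length C
    """
    def linear(weights, bias, vec):
        # One output per channel: dot(weights[c], vec) + bias[c].
        return [sum(w * v for w, v in zip(row, vec)) + b
                for row, b in zip(weights, bias)]

    gamma = linear(w_gamma, b_gamma, cond)  # per-channel scale
    beta = linear(w_beta, b_beta, cond)     # per-channel shift
    return [[[g * x + s for x in row] for row in channel]
            for channel, g, s in zip(feature_map, gamma, beta)]


features = [[[1.0, 2.0]], [[3.0, 4.0]]]          # C=2, H=1, W=2
embedding = [1.0, 2.0]                           # D=2
modulated = film(features, embedding,
                 w_gamma=[[1.0, 0.0], [0.0, 1.0]], b_gamma=[0.0, 0.0],
                 w_beta=[[0.0, 0.0], [0.0, 0.0]], b_beta=[0.5, -1.0])
print(modulated)  # [[[1.5, 2.5]], [[5.0, 7.0]]]
```

Because the scale and shift depend only on the conditioning vector, the same image features can be re-modulated for each candidate graph.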

Machine Learning Fundamentals
Graph network block diagram showing input graph transformed through edge, node, and global update steps to produce an updated graph

Relational Inductive Biases in Deep Learning (2018)

Battaglia et al. argue that combinatorial generalization requires structured representations, systematically analyze the relational inductive biases in standard deep learning architectures (MLPs, CNNs, RNNs), and present the graph network as a unifying framework that generalizes and extends prior graph neural network approaches.
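The edge, node, and global update steps of a graph network block can be sketched in a few lines. This is a simplified illustration assuming scalar attributes and sum aggregation; the function name `gn_block` and the toy update functions are hypothetical, and in practice the updates would be learned networks.

```python
def gn_block(nodes, edges, senders, receivers, u, phi_e, phi_v, phi_u):
    """One graph network block pass with scalar attributes and sum aggregation.

    nodes: per-node attributes; edges: per-edge attributes
    senders/receivers: node indices for each edge; u: global attribute
    phi_e, phi_v, phi_u: edge, node, and global update functions
    """
    # 1. Update each edge from its attribute, its endpoint nodes, and the global.
    new_edges = [phi_e(e, nodes[r], nodes[s], u)
                 for e, s, r in zip(edges, senders, receivers)]

    # 2. Sum updated edges into their receiving node, then update each node.
    incoming = [0.0] * len(nodes)
    for e, r in zip(new_edges, receivers):
        incoming[r] += e
    new_nodes = [phi_v(agg, v, u) for agg, v in zip(incoming, nodes)]

    # 3. Update the global attribute from aggregated edges and nodes.
    new_u = phi_u(sum(new_edges), sum(new_nodes), u)
    return new_nodes, new_edges, new_u


# Toy update functions that simply add their inputs.
add_edge = lambda e, v_recv, v_send, u: e + v_recv + v_send + u
add_node = lambda agg, v, u: agg + v + u
add_global = lambda e_agg, v_agg, u: e_agg + v_agg + u

out_nodes, out_edges, out_u = gn_block(
    nodes=[1.0, 2.0], edges=[0.5], senders=[0], receivers=[1], u=0.0,
    phi_e=add_edge, phi_v=add_node, phi_u=add_global)
print(out_nodes, out_edges, out_u)  # [1.0, 5.5] [3.5] 10.0
```

Restricting which of the three updates are present, and what each one sees, is one way to express the different graph neural network variants within this single interface.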

Computational Chemistry
AtomLenz learns atom-level detection from hand-drawn molecular images with weak supervision

AtomLenz: Atom-Level OCSR with Limited Supervision

Introduces AtomLenz, an OCSR tool that combines object detection with a molecular graph constructor. Features a novel weakly supervised training scheme (ProbKT*) to learn atom-level localization from SMILES-only data, achieving state-of-the-art results on hand-drawn images.

Computational Chemistry
Overview of the MolScribe encoder-decoder architecture predicting atoms with coordinates and bonds from a molecular image.

MolScribe: Robust Image-to-Graph Molecular Recognition

MolScribe reformulates molecular recognition as an image-to-graph generation task, explicitly predicting atom coordinates and bonds to better handle stereochemistry and abbreviated structures compared to image-to-SMILES baselines.

Computational Chemistry

Online Handwritten Chemical Formula Structure Analysis

A three-level grammatical framework (formula, molecule, text) that uses context-free grammars and HMMs to parse online handwritten chemical formulas into semantic graphs capturing both connectivity and layout.

Computational Chemistry
Diagram showing MolNexTR's dual-stream architecture: a molecular image feeds into parallel ConvNext and Vision Transformer encoders, producing a SMILES string.

MolNexTR: A Dual-Stream Molecular Image Recognition

MolNexTR proposes a dual-stream architecture combining ConvNext and Vision Transformers to improve molecular image recognition (OCSR). By extracting local and global features simultaneously and training with specialized image contamination augmentations, it achieves 81-97% accuracy across diverse benchmarks.

Computational Chemistry
Log-scale plot showing exponential growth of alkane isomer counts from C1 to C40

The Number of Isomeric Hydrocarbons of the Methane Series

A foundational 1931 paper that derives exact recursive formulas for counting alkane structural isomers, correcting historical errors and establishing the first systematic enumeration up to C₄₀.
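For readers who want to reproduce the counts, the sketch below enumerates alkane structural isomers (ignoring stereoisomers) with truncated power series. It is a modern generating-function formulation (Pólya-style cycle indices for alkyl radicals plus Otter's dissimilarity formula for unrooted trees), not the recursion as written in the 1931 paper; the function name `alkane_isomer_counts` and its helpers are illustrative.

```python
def alkane_isomer_counts(n_max):
    """Return a list c where c[n] is the number of structural isomers of C_n H_(2n+2)."""
    def mul(a, b):
        # Product of two power series, truncated at degree n_max.
        out = [0] * (n_max + 1)
        for i, ai in enumerate(a):
            if ai:
                for j in range(n_max + 1 - i):
                    out[i + j] += ai * b[j]
        return out

    def stretch(a, k):
        # Substitute x -> x**k, truncated at degree n_max.
        out = [0] * (n_max + 1)
        for i in range(n_max // k + 1):
            out[i * k] = a[i]
        return out

    # r[n]: alkyl radicals with n carbons (rooted trees, at most 3 branches each).
    # Each pass of r = 1 + x * Z(S3; r) fixes at least one more coefficient.
    r = [1] + [0] * n_max
    for _ in range(n_max):
        r2, r3 = stretch(r, 2), stretch(r, 3)
        cube, mixed = mul(mul(r, r), r), mul(r, r2)
        z3 = [(c + 3 * m + 2 * t) // 6 for c, m, t in zip(cube, mixed, r3)]
        r = [1] + z3[:n_max]

    # Carbon-rooted alkanes: x * Z(S4; r), four radicals around a root carbon.
    r2, r3, r4 = stretch(r, 2), stretch(r, 3), stretch(r, 4)
    sq = mul(r, r)
    z4 = [(a + 6 * b + 3 * c + 8 * d + 6 * e) // 24
          for a, b, c, d, e in zip(mul(sq, sq), mul(sq, r2), mul(r2, r2),
                                   mul(r, r3), r4)]
    carbon_rooted = [0] + z4[:n_max]

    # Bond-rooted alkanes: unordered pairs of non-empty radicals.
    nonempty = [0] + r[1:]
    symmetric = stretch(nonempty, 2)  # bonds joining two identical radicals
    bond_rooted = [(p + s) // 2 for p, s in zip(mul(nonempty, nonempty), symmetric)]

    # Otter's formula: trees = vertex-rooted - edge-rooted + symmetric-edge-rooted.
    return [v - e + s for v, e, s in zip(carbon_rooted, bond_rooted, symmetric)]


counts = alkane_isomer_counts(40)
print(counts[1:11])  # [1, 1, 1, 2, 3, 5, 9, 18, 35, 75]
print(counts[40])    # 62481801147341
```

Python's arbitrary-precision integers keep the arithmetic exact, which matters because the counts grow roughly exponentially with carbon number.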