Generative Language Models in Molecular Discovery: Regression Transformer, GT4SD and Beyond

Overview

This talk surveys recent developments in scientific language models for molecular design. It introduces the Regression Transformer (RT), an approach that bridges regression and conditional sequence generation, and shows how the RT enables property-driven molecule generation, with applications in catalyst and block copolymer discovery. It also presents the "Text & Chemistry T5" model, which handles tasks involving both textual and molecular representations, and the open-source Generative Toolkit for Scientific Discovery (GT4SD) with its collection of state-of-the-art molecular generative models. The syllabus covers identifiability, structural causal models, interventions, learning from unknown-target interventions, latent factor causal models, and causal disentanglement models, and concludes with ongoing work and a Q&A session.

Syllabus

- Identifiability Background
- Structural Causal Models
- Interventions
- Identifiability in Causality
- Learning From Unknown-Target Interventions
- Learning in the Presence of Unobserved Variables
- Treks
- Latent Factor Causal Models (LFCMs)
- Causal Disentanglement Models
- Linear Causal Disentanglement via Intervention
- Ongoing Work
- Q&A

Taught by

Valence Labs
