At Putnam Data Sciences we combine machine learning with sound principles of causal inference to generate reliable answers to your most important questions.

Our areas of expertise include causal effect estimation of point treatment and longitudinal treatment regimes using TMLE, marginal structural modeling, IPTW, and other propensity score-based methodologies.

We also specialize in risk prediction modeling to help identify high-risk individuals. Model development uses advanced machine learning methods such as lasso, neural networks, SVMs, gradient boosting, and super learning.

Recent News

Putnam Data Sciences presents a webinar series on Targeted Learning, Causal Inference and Machine Learning. Access videos and download presentation slides here, and visit our PDS channel on YouTube.

Targeted Learning for Analyses of Real World Data
The PDS Targeted Learning webinar series is funded by the U.S. Food and Drug Administration (FDA CONTRACT 75F40119C10155).
Webinars and associated slide decks are available for public viewing at no cost. Each webinar reflects the views of its presenter and should not be construed to represent FDA’s views or policies.

    1. Targeted Machine Learning for Causal Inference based on Real World Data, presented by Dr. Mark van der Laan, February 5, 2020. download slides
    2. An Introduction to Targeted Maximum Likelihood Estimation of Causal Effects, presented by Dr. Susan Gruber, March 11, 2020. download slides
    3. An Introduction to Super Learning, presented by Dr. Eric Polley, April 8, 2020. download slides
    4. Introduction to Bayesian Additive Regression Trees (BART) for Causal Inference, presented by Dr. Nicole Bohme Carnegie, May 13, 2020. download slides
    5. Practical Issues in Targeted Learning, presented by Dr. David Benkeser, June 3, 2020. download slides
    6. The Causal Inference of Longitudinal Exposures with Marginal Structural Models: An Overview and Application of Longitudinal TMLE, presented by Dr. Mireille Schnitzer, July 22, 2020. download slides
    7. Covariate Adjustment in Randomized Studies with Time-To-Event Endpoints, presented by Dr. Iván Díaz, September 30, 2020. download slides
    8. Cross-validated Targeted Maximum Likelihood Estimation (CV-TMLE), presented by Dr. Alan Hubbard, October 21, 2020. download slides
    9. Highly Adaptive Lasso (HAL) in Causal Inference, presented by Dr. Mark van der Laan, December 9, 2020. download slides
    10. Targeted Machine Learning in Action in the ICU, presented by Dr. Romain Pirracchio, March 24, 2021. download slides
    11. Developing a Targeted Learning-Based Statistical Analysis Plan, presented by Dr. Susan Gruber, April 28, 2021. download slides
    12. Practical Considerations for Specifying a Super Learner, presented by Rachael Phillips, May 20, 2021. download slides
    13. Challenges and Solutions in the Analysis of Cluster Randomized Trials, presented by Dr. Laura Balzer, June 23, 2021. download slides
    14. Expert Augmented Machine Learning, presented by Dr. Gilmer Valdes, July 14, 2021. download slides
    15. BAA Project to Advance Regulatory Science and Leverage Real World Evidence in Regulatory Decision Making, co-presented by Dr. John Concato and Dr. Hana Lee, August 4, 2021. download slide deck 1, download slide deck 2
    16. Targeted Learning: Towards a future informed by real world evidence, presented by Dr. Mark van der Laan, September 15, 2021. download slides




R Packages for targeted minimum loss-based estimation (TMLE)

  • tmle: Analysis of point treatment data to estimate the average treatment effect in the population (ATE), among the treated (ATT), and among the controls (ATC) using targeted minimum loss-based estimation (tmle), and to estimate the parameters of a marginal structural model (tmleMSM).
  • ltmle: Longitudinal data analysis using targeted minimum loss-based estimation.
  • ctmle: Collaborative targeted minimum loss-based estimation of point treatment effects using data-adaptive propensity score estimation.

Selected Publications


Putnam Data Sciences, LLC was founded in 2017 by Susan Gruber, PhD, MPH, MS. Dr. Gruber is a biostatistician and computer scientist with expertise in developing and applying data-adaptive methodologies to improve the quality of evidence generated by studies of observational health care data. Dr. Gruber was formerly the Director of the Biostatistics Center in the Department of Population Medicine at Harvard Medical School & Harvard Pilgrim Health Care Institute, and Senior Director of IMEDS Methods Research for the Reagan-Udall Foundation for the FDA.

CV     Google Scholar


Email us at

Cambridge, MA, 02139

Copyright © 2022, Putnam Data Sciences, LLC.