Double machine learning for pitcher management in baseball

Date
2025-07-23

Advisor(s)

Fernandes, Marcelo

Abstract
This study applies a Double Machine Learning (DML) framework to evaluate pitcher substitution strategies in Major League Baseball (MLB). Using data from the 2020–2024 seasons, the model treats pitcher removal as a binary intervention and estimates its impact on opponent offensive production, measured by weighted On-Base Average (wOBA). The model uses 169 features grouped into four categories: historical performance, fatigue indicators, game context, and pitcher style. To account for observational bias, the methodology incorporates inverse probability weighting through propensity scores. The R-Learner enables flexible estimation of the Conditional Average Treatment Effect (CATE), allowing granular analysis of substitution value across game scenarios. The results support several established pitching management theories, including the times-through-the-order effect and leverage-based substitutions. Moreover, model-generated scores prove to be more efficient decision levers than traditional indicators, offering managers actionable insights beyond conventional heuristics. The study also evaluates a landmark case, the substitution of Blake Snell in the 2020 World Series, demonstrating the framework’s ability to assess high-stakes managerial decisions. Overall, the approach provides data-driven insights into optimizing pitcher management and expands the analytical frontier of baseball strategy.
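
The residual-on-residual idea behind the R-Learner can be illustrated with a minimal sketch. This is not the study's implementation: the data are simulated, the single covariate `x` (a hypothetical stand-in for times through the order), the effect sizes, and the helper names are all assumptions, and simple per-level means replace the flexible machine-learning nuisance models a real analysis would use. The sketch cross-fits an outcome model and a propensity model, residualizes the outcome and the treatment, and then minimizes the R-loss over effects that are constant within each level of `x`.

```python
import random
from statistics import mean


def fit_group_means(rows, key):
    """Nuisance model: mean of rows[key] within each level of x.
    A hypothetical stand-in for a flexible ML learner."""
    groups = {}
    for r in rows:
        groups.setdefault(r["x"], []).append(r[key])
    overall = mean(r[key] for r in rows)
    means = {x: mean(v) for x, v in groups.items()}
    return lambda x: means.get(x, overall)


def r_learner_by_level(rows, n_folds=2, seed=0):
    """Cross-fitted R-learner with a CATE that is constant per level of x."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    num, den = {}, {}
    for fold in folds:
        held = set(fold)
        train = [rows[i] for i in idx if i not in held]
        m_hat = fit_group_means(train, "y")  # outcome model E[Y | X]
        e_hat = fit_group_means(train, "w")  # propensity model P(W = 1 | X)
        for i in fold:
            r = rows[i]
            y_res = r["y"] - m_hat(r["x"])  # outcome residual
            w_res = r["w"] - e_hat(r["x"])  # treatment residual
            num[r["x"]] = num.get(r["x"], 0.0) + w_res * y_res
            den[r["x"]] = den.get(r["x"], 0.0) + w_res * w_res
    # Closed-form minimizer of the squared R-loss within each level.
    return {x: num[x] / den[x] for x in sorted(num)}


def simulate(n=8000, seed=1):
    """Simulated data (assumed, not from the study): removal (w = 1) is more
    likely and more beneficial, i.e. lowers opponent wOBA more, at higher x."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 3)
        w = 1 if rng.random() < 0.2 + 0.15 * x else 0
        tau = 0.02 - 0.02 * x  # true effect of removal on wOBA
        y = 0.30 + 0.02 * x + tau * w + rng.gauss(0, 0.05)
        rows.append({"x": x, "w": w, "y": y})
    return rows


tau_hat = r_learner_by_level(simulate())
for x, t in tau_hat.items():
    print(f"x={x}: estimated effect of removal on wOBA = {t:+.3f}")
```

In this toy setting a negative estimate means that removing the pitcher lowers the opponent's expected wOBA at that covariate level. Replacing the per-level means with flexible learners, and the per-level ratio with a regularized regression of outcome residuals on treatment residuals, gives the general form of the method.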
