Machine learning and oil price point and density forecasting


The purpose of this paper is to explore machine learning techniques to forecast the oil price. In the era of big data, we investigate whether new automated tools can improve over traditional approaches in terms of forecast accuracy. Oil price point and density forecasts are built from 22 methods, including regression trees (random forest, quantile regression forest, xgboost), regularization procedures (elastic net, lasso, ridge), standard econometric models and forecast combinations, as well as the structural factor model of Schwartz and Smith (2000). The database contains 315 macroeconomic and financial variables, used to build high-dimensional models. To evaluate the predictive power of each method, an extensive pseudo out-of-sample forecasting exercise is built, at monthly and quarterly frequencies, with horizons from one month up to five years. Overall, the results indicate a good performance of the machine learning methods in the short run. Up to six months, the lasso-based models, oil futures prices, and the Schwartz-Smith model provide the best forecasts. At longer horizons, forecast combinations also become relevant. In several cases, the accuracy gains relative to the random walk forecast are statistically significant and reach double digits, in percentage terms, as measured by the out-of-sample R² statistic, a notable achievement compared to the previous literature.
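As an illustration of the evaluation metric named above, the following minimal sketch computes an out-of-sample R-squared statistic against a driftless random walk benchmark. The abstract does not give the exact formula, so this assumes the common definition, one minus the ratio of the model's sum of squared forecast errors to the benchmark's. The function names, the price series, and the model forecasts are hypothetical and chosen only for the example; they are not taken from the paper.

```python
def random_walk_forecasts(prices, horizon):
    """No-change forecasts: the forecast for period t + horizon is the price at t."""
    return prices[:-horizon]


def r2_out_of_sample(actual, model_forecasts, benchmark_forecasts):
    """Assumed definition: 1 - SSE(model) / SSE(benchmark).

    Positive values mean the model beats the benchmark; multiplying by 100
    gives the gain in percentage terms.
    """
    if not (len(actual) == len(model_forecasts) == len(benchmark_forecasts)):
        raise ValueError("all series must have the same length")
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, model_forecasts))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark_forecasts))
    return 1.0 - sse_model / sse_bench


# Made-up monthly prices and model forecasts, for illustration only.
prices = [50.0, 52.0, 51.0, 55.0, 54.0]
horizon = 1

actual = prices[horizon:]                           # realized prices being forecast
benchmark = random_walk_forecasts(prices, horizon)  # last observed price carried forward
model = [51.0, 51.5, 54.0, 54.5]                    # hypothetical model forecasts

r2 = r2_out_of_sample(actual, model, benchmark)
print(f"Out-of-sample R-squared vs. random walk: {100 * r2:.1f}%")
```

A value of zero means the model does no better than the random walk, and a negative value means it does worse, which is why the random walk is a demanding benchmark for oil price forecasts.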
