Causal Inference

Estimating Causal Effects from Panel Data with Dynamic Multivariate Panel Models

Abstract Panel data are ubiquitous in scientific domains such as sociology and econometrics. Various modeling approaches have been presented for causal inference based on such data, including Markov models, cross-lagged panel models, and their extensions. Existing panel data modeling approaches typically impose restrictive assumptions on the data-generating process, such as Gaussian responses and effects that are constant in time, and are able to consider only short-term causal effects.

Clustering and Structural Robustness in Causal Diagrams

Abstract Graphs are commonly used to represent and visualize causal relations. For a small number of variables, this approach provides a succinct and clear view of the scenario at hand. As the number of variables under study increases, the graphical approach may become impractical, and the clarity of the representation is lost. Clustering of variables is a natural way to reduce the size of the causal diagram, but it may erroneously change the essential properties of the causal relations if implemented arbitrarily.
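The risk of arbitrary clustering can be illustrated with a minimal sketch. The helper names `collapse` and `is_acyclic`, and the three-node chain, are hypothetical and chosen only for illustration; this is not the clustering method of the paper, only one simple way in which merging variables carelessly can alter a diagram's structure (here, by introducing a cycle).

```python
def collapse(edges, cluster, name):
    """Merge every node in `cluster` into one node called `name` (hypothetical helper)."""
    merged = set()
    for u, v in edges:
        u2 = name if u in cluster else u
        v2 = name if v in cluster else v
        if u2 != v2:  # drop edges that fall inside the cluster
            merged.add((u2, v2))
    return merged


def is_acyclic(edges):
    """Check acyclicity with Kahn's algorithm (repeatedly remove nodes with no parents)."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        n = queue.pop()
        visited += 1
        for u, v in edges:
            if u == n:
                indegree[v] -= 1
                if indegree[v] == 0:
                    queue.append(v)
    return visited == len(nodes)


# Toy causal chain A -> B -> C (an assumption for illustration).
chain = {("A", "B"), ("B", "C")}

# Clustering the endpoints A and C yields AC -> B and B -> AC: a cycle.
bad = collapse(chain, {"A", "C"}, "AC")

# Clustering the adjacent nodes A and B yields AB -> C: still a DAG.
good = collapse(chain, {"A", "B"}, "AB")

print(is_acyclic(bad), is_acyclic(good))
```

In this toy case, merging the two endpoints of the chain produces a diagram that is no longer acyclic, whereas merging two adjacent variables preserves the chain structure.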

Price Optimization Combining Conjoint Data and Purchase History: A Causal Modeling Approach

Abstract Pricing decisions of companies require an understanding of the causal effect of a price change on the demand. When real-life pricing experiments are infeasible, data-driven decision-making must be based on alternative data sources such as purchase history (sales data) and conjoint studies where a group of customers is asked to make imaginary purchases in an artificial setup. We present an approach for price optimization that combines population statistics, purchase history and conjoint data in a systematic way.

dynamite: An R Package for Dynamic Multivariate Panel Models

Abstract dynamite is an R package for Bayesian inference of intensive panel (time series) data comprising multiple measurements per individual, taken over time for multiple individuals. The package supports joint modeling of multiple response variables, time-varying and time-invariant effects, a wide range of discrete and continuous distributions, group-specific random effects, latent factors, and customization of prior distributions of the model parameters. Models in the package are defined via a user-friendly formula interface, and estimation of the posterior distribution of the model parameters takes advantage of state-of-the-art Markov chain Monte Carlo methods.

Estimating the causal effect of timing on the reach of social media posts

Abstract Modern companies regularly use social media to communicate with their customers. In addition to the content, the reach of a social media post may depend on the season, the day of the week, and the time of the day. We consider optimizing the timing of Facebook posts by a large Finnish consumers’ cooperative using historical data on previous posts and their reach. The content and the timing of the posts reflect the marketing strategy of the cooperative.

Estimation of causal effects with small data in the presence of trapdoor variables

We consider the problem of estimating causal effects of interventions from observational data when well-known back-door and front-door adjustments are not applicable. We show that when an identifiable causal effect is subject to an implicit functional constraint that is not deducible from conditional independence relations, the estimator of the causal effect can exhibit bias in small samples (where parameter estimation exhibits non-negligible uncertainty). This bias is related to variables that we call trapdoor variables. We use simulated data to study different strategies to account for trapdoor variables and suggest how the related trapdoor bias might be minimized. The importance of trapdoor variables in causal effect estimation is illustrated with real data from the Life Course 1971-2002 study. Using this dataset, we estimate the causal effect of education on income in the Finnish context. Using the Bayesian modelling approach allows us to take the parameter uncertainty into account and gives us the full interventional distribution instead of only average causal effect estimates.
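For readers unfamiliar with the back-door adjustment mentioned above, the following is a minimal sketch of what it computes in the simplest setting. The distribution, variable names, and numbers are all hypothetical; the example covers only the standard case where a single observed confounder Z satisfies the back-door criterion, and does not reproduce the trapdoor setting studied in the paper.

```python
# Hypothetical toy model: binary confounder Z affects both treatment X and outcome Y.
P_Z = {0: 0.5, 1: 0.5}                      # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {                           # P(Y = 1 | X = x, Z = z)
    (0, 0): 0.1, (1, 0): 0.3,
    (0, 1): 0.5, (1, 1): 0.7,
}


def p_x_given_z(x, z):
    """P(X = x | Z = z) for binary X."""
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1.0 - p1


def backdoor(x):
    """Back-door formula: P(Y = 1 | do(X = x)) = sum_z P(Y = 1 | x, z) P(z)."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in P_Z)


def naive(x):
    """Observational P(Y = 1 | X = x) = sum_z P(Y = 1 | x, z) P(z | x)."""
    p_x = sum(p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    return sum(
        P_Y1_GIVEN_XZ[(x, z)] * p_x_given_z(x, z) * P_Z[z] / p_x for z in P_Z
    )


backdoor_effect = backdoor(1) - backdoor(0)   # adjusts for the confounder
naive_difference = naive(1) - naive(0)        # confounded comparison

print(f"back-door effect: {backdoor_effect:.2f}")
print(f"naive difference: {naive_difference:.2f}")
```

In this toy model the adjusted effect is 0.20 while the unadjusted comparison gives 0.44, because Z raises both the probability of treatment and the probability of the outcome. The abstract concerns cases where neither this adjustment nor the front-door adjustment is available.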