Key Drivers Analysis and Optimization with Probabilistic Structural Equation Models
In this half-day seminar, we present a complete workflow for developing a Probabilistic Structural Equation Model (PSEM) based on Bayesian networks, using the BayesiaLab software platform. Our objective is to identify key drivers of satisfaction with a PSEM that is machine-learned from consumer survey data. A key challenge in this context is to resolve the conflict between "driver" as a causal concept and the non-causal nature of non-experimental survey data. Furthermore, we illustrate why quantifying the joint probability of hypothetical scenarios is critical for establishing priorities for improving customer satisfaction.
Background & Theory
Structural Equation Modeling is a statistical technique for testing and estimating causal relations using a combination of statistical data and qualitative causal assumptions. Structural Equation Models (SEMs) allow both confirmatory and exploratory modeling, meaning they are suited to both theory testing and theory development.
What we call Probabilistic Structural Equation Models (PSEMs) in BayesiaLab are conceptually similar to traditional SEMs. However, PSEMs are based on a Bayesian network structure as opposed to a series of equations. More specifically, PSEMs differ from SEMs in the following key characteristics:
- All relationships in a PSEM are probabilistic (hence the name), whereas traditional SEMs combine deterministic relationships with error terms.
- PSEMs are nonparametric, which facilitates the representation of nonlinear relationships as well as relationships between categorical variables.
- The structure of PSEMs is partially or fully machine-learned from data.
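To make the first two characteristics concrete, the following minimal sketch estimates a conditional probability table from categorical data, which is how a relationship between two discrete nodes of a Bayesian network is typically represented. The variable names and the toy records are hypothetical, and the code is plain Python for illustration only; it is not BayesiaLab's API or its learning algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical toy survey records: (perceived_sweetness, purchase_intent).
# These values are invented for illustration and do not come from any real survey.
records = [
    ("low", "no"), ("low", "no"), ("low", "no"), ("low", "yes"),
    ("medium", "yes"), ("medium", "yes"), ("medium", "yes"), ("medium", "no"),
    ("high", "no"), ("high", "no"), ("high", "yes"), ("high", "no"),
]


def conditional_probability_table(pairs):
    """Estimate P(child | parent) from (parent, child) pairs by relative frequency."""
    counts = defaultdict(Counter)
    for parent, child in pairs:
        counts[parent][child] += 1
    return {
        parent: {child: n / sum(c.values()) for child, n in c.items()}
        for parent, c in counts.items()
    }


cpt = conditional_probability_table(records)
for level in ("low", "medium", "high"):
    print(level, cpt[level])
```

In this toy data, the probability of purchase intent is highest at the medium level and lower at both extremes. A table of this kind can represent such a non-monotonic pattern directly, without assuming a linear functional form.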
In general, specifying and estimating a traditional SEM requires a high degree of statistical expertise. Additionally, the multitude of manual steps involved can make the entire SEM workflow extremely time-consuming. The PSEM workflow in BayesiaLab, on the other hand, is accessible to non-statistician subject matter experts. Perhaps more importantly, it can be faster by several orders of magnitude. Finally, once a PSEM is validated, it can be utilized like any other Bayesian network. This means that the full array of analysis, simulation, and optimization tools is available to leverage the knowledge represented in the PSEM.
Case Study: Key Drivers Analysis from Consumer Survey Data
In this seminar, we present a prototypical PSEM application: key drivers analysis and product optimization based on consumer survey data. We examine how consumers perceive product attributes, and how these perceptions relate to the consumers’ purchase intent for specific products.
Given the inherent uncertainty of survey data, we also wish to identify higher-level variables, i.e. “latent” variables representing concepts that are not directly measured in the survey. We do so by analyzing the relationships between the so-called “manifest” variables, i.e. variables that are directly measured in the survey. Including such concepts helps in building more stable and reliable models than would be possible using manifest variables only.
Our overall objective is to make survey results easier for researchers to interpret and “actionable” for managerial decision-makers. The ultimate goal is to use the generated PSEM for prioritizing marketing and product initiatives to maximize purchase intent.
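One reason the joint probability of a scenario matters for prioritization can be sketched with a small calculation. The two attributes, their levels, and the respondent counts below are hypothetical and chosen only for illustration; a PSEM would compute such probabilities from the learned network rather than from raw counts.

```python
# Hypothetical toy respondents: (taste_rating, price_rating), each "low" or "high".
# The counts are invented so that the two ratings are negatively associated.
respondents = (
    [("high", "low")] * 4
    + [("low", "high")] * 4
    + [("high", "high")] * 1
    + [("low", "low")] * 1
)


def probability(rows, predicate):
    """Share of rows for which the predicate holds."""
    return sum(1 for row in rows if predicate(row)) / len(rows)


p_taste_high = probability(respondents, lambda r: r[0] == "high")
p_price_high = probability(respondents, lambda r: r[1] == "high")
p_both_high = probability(respondents, lambda r: r == ("high", "high"))

print("P(taste=high)            =", p_taste_high)
print("P(price=high)            =", p_price_high)
print("P(both high), observed   =", p_both_high)
print("P(both high), if indep.  =", p_taste_high * p_price_high)
```

Each attribute reaches the high level for half of these respondents, yet the scenario in which both are high at once is much rarer than the product of the two marginal probabilities would suggest. A recommendation that relies on such a scenario rests on little supporting data, which is why the joint probability is worth checking before setting priorities.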