Cross-validation of variable clustering
I am running a PSEM analysis on my Covid-19 survey data. I have learned and validated a network (unsupervised learning plus data perturbation with 100 tests). For the unsupervised structural learning, the MDL score indicated that Taboo Order learning gave the best solution.
Now I am running the cross-validation of the variable clustering. However, BayesiaLab does not work on my computer when I set the number of tests to 100 for both the Data Perturbation and the Cross-Validation settings (probably due to a lack of resources). I therefore decided to relax the computational settings, and I have two questions:
- Because of the computational limitation, I set the number of tests for Data Perturbation to 0 while keeping the number for Cross-Validation at 100. What does this imply for the robustness of the identified factors? Is this an acceptable way to proceed?
- Since Taboo Order seems to require more computational resources, I decided to use Maximum Spanning Tree learning (without Taboo post-processing) for the cross-validation of the variable clustering. Again, what does this imply? In other words, does the learning algorithm used for the variable clustering have to be consistent with the one used to learn the network (Taboo Order in this case)?
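To make clear what I understand "cross-validation of the variable clustering" to measure, here is a rough Python sketch of my mental model. It is not BayesiaLab's algorithm: the clustering step is a simple stand-in (variables are linked when their absolute Pearson correlation exceeds a threshold), and all names (`cluster_variables`, `co_clustering_frequency`, the `q1`..`q4` columns, the 0.5 threshold) are hypothetical. It only illustrates the idea of re-clustering on each fold and counting how often two variables end up in the same factor.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_variables(data, threshold=0.5):
    """Stand-in clustering (an assumption, not BayesiaLab's method):
    link variables with |correlation| > threshold, then take connected
    components. Returns {variable: cluster representative}."""
    names = list(data)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if abs(pearson(data[a], data[b])) > threshold:
            parent[find(a)] = find(b)
    return {n: find(n) for n in names}


def co_clustering_frequency(data, k=10, seed=0):
    """For each pair of variables, the fraction of the k folds in which
    the pair lands in the same cluster when one fold is held out."""
    names = list(data)
    rows = list(range(len(data[names[0]])))
    random.Random(seed).shuffle(rows)
    folds = [set(rows[i::k]) for i in range(k)]
    pairs = list(combinations(names, 2))
    counts = {pair: 0 for pair in pairs}
    for held_out in folds:
        train = {n: [v for i, v in enumerate(data[n]) if i not in held_out]
                 for n in names}
        labels = cluster_variables(train)
        for a, b in pairs:
            if labels[a] == labels[b]:
                counts[(a, b)] += 1
    return {pair: c / k for pair, c in counts.items()}


# Synthetic stand-in for survey data: two latent factors, two items each.
rng = random.Random(42)
latent1 = [rng.gauss(0, 1) for _ in range(300)]
latent2 = [rng.gauss(0, 1) for _ in range(300)]
survey = {
    "q1": [x + rng.gauss(0, 0.3) for x in latent1],
    "q2": [x + rng.gauss(0, 0.3) for x in latent1],
    "q3": [x + rng.gauss(0, 0.3) for x in latent2],
    "q4": [x + rng.gauss(0, 0.3) for x in latent2],
}
freq = co_clustering_frequency(survey, k=10)
for pair, f in freq.items():
    print(pair, f)
```

In this picture, a pair with a frequency close to 1 belongs to a stable factor, and my two questions are about how that stability measure changes when the perturbation step is dropped or the learning algorithm inside the loop is swapped.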