Exploratory structural equation modeling: a streamlined step by step approach using the R Project software

Prokofieva, Maria; Zarate, Daniel; Parker, Alex; Palikara, Olympia; Stavropoulos, Vasileios

doi:10.1186/s12888-023-05028-9

BMC Psychiatry

Table 5 Pathway 1b: ESEM based on EFA extracted loading thresholds (as discussed in [16])

Procedure Steps	Aims	R code	Translation
Setup	- Installing R packages - Loading R packages for further data exploration, which enables (a) data preparation, including analysis of outliers and missing values, and (b) selection of the estimation method and rotation appropriate for the data and study aims - Loading the dataset	install.packages("tidyverse"); install.packages("psych"); install.packages("lavaan"); install.packages("GPArotation"); install.packages("semPlot"); install.packages("remotes"); remotes::install_github("maria-pro/esem", build_vignettes = TRUE); library(esem); library(tidyverse); library(lavaan); library(semPlot); library(psych); sdq_lsac <- sdq_lsac; describe(sdq_lsac)	The initial setup installs the packages used in the analyses: install.packages() for the CRAN¹ packages "tidyverse", "psych", "lavaan", "GPArotation" and "semPlot" (plus "remotes", which provides install_github()), and remotes::install_github() for non-CRAN packages available from GitHub² (select option 1 upon installation of the esem package). Installation is done once, before the packages are first used in RStudio, and does not need to be repeated each time the packages are loaded. Packages must be loaded with library() at the start of each analysis to make their functions, and in the case of this tutorial their datasets, available in RStudio. The psych package is used for the EFA procedure. The lavaan and semPlot packages are used for the CFA-related steps. The esem package was developed specifically for the current paper to simplify and demonstrate free-of-cost ESEM procedures; it includes the dataset and the automated ESEM functions used in the demonstration example. Data preparation can be carried out with the tidyverse package, which targets data exploration and visualization. The dataset used in this tutorial has already been cleaned; details of the pre-processing are available in the supplementary material via the GitHub repository. sdq_lsac is the built-in demonstration dataset provided in the esem package. sdq_lsac <- sdq_lsac stores the dataset in a variable called sdq_lsac (the variable name can be changed per user preference). describe() provides basic descriptive statistics for the sdq_lsac dataset.
Step 1	Conduct full ESEM by embedding the EFA-derived (Geomin-rotated) cross-loadings in a CFA	esem_results <- esem_c(data = sdq_lsac, nfactors = 5, fm = 'ML', rotate = "geominT", scores = "regression", residuals = TRUE, Target = NULL, missing = TRUE, mimic = c("Mplus"), std.lv = TRUE, ordered = TRUE)	The esem_c() function estimates and reports the ESEM results, which are then saved in the esem_results object. The following arguments are used: - data = sdq_lsac, the dataset to be used; alternatively, a correlation or covariance matrix can be provided - nfactors = 5, the number of factors (based on the classic 5-factor SDQ approach in the literature) - fm = 'ML', so that estimation uses the maximum likelihood (ML) algorithm. Alternative algorithms are available, including minimum residual (minres, i.e. ols or uls), principal axes, alpha factoring, weighted least squares and minimum rank. The full list of algorithms is provided at https://www.rdocumentation.org/packages/psych/versions/2.2.3/topics/fa - rotate = "geominT", the rotation method. The full list of available rotations is accessible at https://www.rdocumentation.org/packages/psych/versions/2.2.3/topics/fa - scores = "regression", so that factor scores are estimated using regression. Alternative approaches are listed at https://www.rdocumentation.org/packages/psych/versions/2.2.3/topics/fa - residuals = TRUE, which requests that the residual matrix be generated and presented - Target = NULL, to indicate that no target matrix is supplied for item/factor rotation - missing = TRUE, which allows missing values to be imputed should they occur; the dataset used in this tutorial (sdq_lsac) has no missing values, so the argument is included for demonstration purposes only - mimic = c("Mplus"), which indicates a calculation that follows the Mplus procedure (pathway 1a), with the exception that item loading thresholds for the ESEM modelling stage are defined by the automated EFA results - std.lv = TRUE, which indicates that standardised values are produced at the modelling stage - ordered = TRUE - the default confidence interval for RMSEA is used, with alpha = .1 - the default probability values are used for confidence intervals; they can be adjusted by specifying p and the value (the default is p = .05). For more options on running the esem_efa() function, please see https://www.rdocumentation.org/packages/psych/versions/2.2.5/topics/fa The message "Loading required namespace: GPArotation" can be ignored, as the functions concerned are already provided by the installed packages. The alternative solution is to run the EFA with Target rotation; this option is explained in the alternative Step 1a below.
Step 1a	Conduct EFA to calculate EFA-derived cross-loadings with Target rotation	main_loadings_list <- list(pp = c("s6_1", "s11_1R", "s14_1R", "s19_1", "s23_1"), cp = c("s5_1", "s7_1R", "s12_1", "s18_1", "s22_1"), es = c("s3_1", "s8_1", "s13_1", "s16_1", "s24_1"), ha = c("s2_1", "s10_1", "s15_1", "s21_1R", "s25_1R"), ps = c("s1_1", "s4_1", "s9_1", "s17_1", "s20_1")); target <- make_target(data = sdq_lsac, keys = main_loadings_list); esem_results <- esem_c(data = sdq_lsac, nfactors = 5, fm = 'ML', rotate = 'TARGET', scores = "regression", residuals = TRUE, Target = target, missing = TRUE, mimic = c("Mplus"), std.lv = TRUE, ordered = TRUE)	- For target rotation, a target loadings matrix must be supplied to the EFA analysis - To make the target matrix object, a list of main loadings (main_loadings_list) is created using the list() function and supplied to the make_target() function - The esem_c() function then explores the data, here defined as sdq_lsac - The number of factors selected needs to correspond to the number of factors defined in the main_loadings_list object; for this example it is 5 - The esem_c() function is used with rotate = 'TARGET', and the target matrix is provided as Target. All other arguments remain the same as in Step 1.
Step 2	Inspect the ESEM model	summary(esem_results, fit.measures = TRUE, standardized = TRUE, ci = TRUE)	To review the results, the summary() function is used with: - fit.measures = TRUE, which calculates the goodness-of-fit parameters used to assess the model fit of the esem_results object defined in either Step 1 or Step 1a - standardized = TRUE, which provides two columns reporting (i) standardized parameters when only the latent variable is standardized (std.lv), and (ii) standardized parameters when both observed and latent variables are standardized (std.all). For more options on running the esem_cfa() function, please see https://www.rdocumentation.org/packages/lavaan/versions/0.5-9/topics/cfa
Step 3	Visualizing ESEM Model	semPaths(esem_results, whatLabels = "std", layout = "tree")	The semPaths() function plots the model and allows its visualization to be customised with the following arguments: - esem_results as the fitted model, created in Step 1 or Step 1a - whatLabels = "std" to display standardized path coefficients - layout = "tree" to produce a tree-like disposition of elements in the plot
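
To illustrate the target loadings matrix that Step 1a relies on, the following base R sketch builds an item-by-factor matrix from main_loadings_list. It assumes a common convention for target rotation in which main loadings are left free (NA) and cross-loadings are targeted towards zero. That convention is an assumption made for illustration; it is not taken from the make_target() source, whose output format may differ. The name target_sketch is hypothetical.

```r
# Main loadings per factor, as defined in Step 1a
main_loadings_list <- list(
  pp = c("s6_1", "s11_1R", "s14_1R", "s19_1", "s23_1"),
  cp = c("s5_1", "s7_1R", "s12_1", "s18_1", "s22_1"),
  es = c("s3_1", "s8_1", "s13_1", "s16_1", "s24_1"),
  ha = c("s2_1", "s10_1", "s15_1", "s21_1R", "s25_1R"),
  ps = c("s1_1", "s4_1", "s9_1", "s17_1", "s20_1")
)

# All items, in the order in which they appear in the list
items <- unlist(main_loadings_list, use.names = FALSE)

# Start with every loading targeted towards zero
# (assumed convention, not the esem package's documented behaviour)
target_sketch <- matrix(
  0,
  nrow = length(items),
  ncol = length(main_loadings_list),
  dimnames = list(items, names(main_loadings_list))
)

# Free (NA) the main loading of each item on its own factor
for (f in names(main_loadings_list)) {
  target_sketch[main_loadings_list[[f]], f] <- NA
}

# The number of columns must match nfactors in the ESEM call
print(dim(target_sketch))
print(head(target_sketch))
```

Each row of the sketch has exactly one free entry, which reflects the requirement in Step 1a that the number of factors matches the number of elements in main_loadings_list.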
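
The idea of embedding EFA-derived loadings in a CFA (Step 1) can be sketched in base R by turning a loading matrix into lavaan model syntax, with each loading supplied as a starting value through lavaan's start() modifier. This is only an illustration of the general principle: the helper loadings_to_lavaan() and the toy loading values are hypothetical, and the sketch does not reproduce the internals or the loading thresholds used by esem_c().

```r
# Hypothetical helper: convert an item-by-factor loading matrix into
# lavaan syntax in which every item loads on every factor, using the
# supplied loadings as starting values.
loadings_to_lavaan <- function(loadings) {
  stopifnot(
    is.matrix(loadings),
    !is.null(rownames(loadings)),
    !is.null(colnames(loadings))
  )
  factor_lines <- vapply(colnames(loadings), function(f) {
    terms <- sprintf("start(%.3f)*%s", loadings[, f], rownames(loadings))
    paste0(f, " =~ ", paste(terms, collapse = " + "))
  }, character(1))
  paste(factor_lines, collapse = "\n")
}

# Toy loadings, invented for illustration only (not SDQ results)
toy_loadings <- matrix(
  c(0.70, 0.65, 0.10, 0.05,   # factor f1
    0.08, 0.12, 0.72, 0.68),  # factor f2
  nrow = 4,
  dimnames = list(c("x1", "x2", "x3", "x4"), c("f1", "f2"))
)

esem_syntax_sketch <- loadings_to_lavaan(toy_loadings)
cat(esem_syntax_sketch, "\n")
```

A string of this form could in principle be passed to a lavaan model-fitting function; in the tutorial itself this translation is handled automatically by the esem package.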
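
Step 2 inspects the model with summary(). Assuming that esem_results is a fitted lavaan object (the fit.measures and standardized arguments suggest this, but it is not confirmed here), selected fit indices and standardized estimates could also be extracted individually. The sketch below uses lavaan's built-in HolzingerSwineford1939 data and an ordinary three-factor CFA rather than the SDQ ESEM model, so that it runs without the esem package; the object names hs_model, hs_fit, fit_indices and std_est are illustrative.

```r
# Runs only when lavaan is installed (see the Setup step)
if (requireNamespace("lavaan", quietly = TRUE)) {
  library(lavaan)

  # Ordinary three-factor CFA on lavaan's built-in example data
  hs_model <- '
    visual  =~ x1 + x2 + x3
    textual =~ x4 + x5 + x6
    speed   =~ x7 + x8 + x9
  '
  hs_fit <- cfa(hs_model, data = HolzingerSwineford1939, std.lv = TRUE)

  # Selected goodness-of-fit indices
  fit_indices <- fitMeasures(hs_fit, c("cfi", "tli", "rmsea", "srmr"))
  print(round(fit_indices, 3))

  # Standardized factor loadings with confidence intervals
  std_est <- standardizedSolution(hs_fit)
  loading_rows <- std_est$op == "=~"
  print(std_est[loading_rows, c("lhs", "rhs", "est.std", "ci.lower", "ci.upper")])
}
```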

ISSN: 1471-244X
