Skip to main content

Exploratory structural equation modeling: a streamlined step by step approach using the R Project software


Confirmatory Factor Analysis (CFA) has been a popular yet limited approach to assessing latent factor structures. Despite items rarely loading exclusively on one latent factor in multifactorial scales, CFA assumes all indicators/items should load uniquely on their allocated latent dimensions. To address this weakness, Exploratory Structural Equation Modeling (ESEM) combines exploratory factor analyses (EFA) and CFA procedures, allowing cross-loadings to occur when assessing hypothesized models. Although such advantages have enhanced ESEM popularity, its adoption is often limited by software rigidity and complex coding difficulties. To address these obstacles, the current tutorial presents a streamlined, step-by-step approach using the open-source software R while providing both R and Mplus ESEM syntax. The tutorial demonstrates the sequence of the ESEM stages by examining the frequently debated Strengths and Difficulties Questionnaire (SDQ) factor structure, using openly accessible data from the Longitudinal Study of Australian Children (LSAC). As ESEM may allow a better understanding of the complex associations in multidimensional scales, this tutorial may optimize the epidemiological and clinical assessment of common yet multifaceted psychiatric presentations.

Peer Review reports

Human behavior researchers commonly employ self-report scales to assess individuals’ experiences, including complex psychiatric presentations [1, 2]. Such instruments usually include items or indicators assumed to capture latent constructs [3]. Latent factors are then identified to account for the correlations among indicators (i.e., items influenced by the same underlying construct), and potential combinations of latent factors result in latent factorial structures [4]. However, given that items and instruments often represent poor modeling of chosen phenomena, psychometric analysis and validation are imperative [5, 6]. Considering potential limitations of traditional exploratory and confirmatory approaches, the present paper aspires to introduce a novel, automated and freely accessible exploratory structural equation modeling (ESEM) tutorial using the R software. Moreover, this paper provides a practical example using a widely employed assessment scale. The results of the method are comparatively examined with those of other popular ESEM calculation approaches.

What are the limitations of commonly applied validation approaches?

Depending on whether a researcher primarily explores or confirms a hypothetical factor structure, different analysis options are available to guide the revision and improvement of a questionnaire [4]. Accordingly, Exploratory Factor Analysis (EFA) is data-driven, and no prior specifications may be made regarding the number of factors or pattern(s) of indicator/factor relationships. Alternatively, Confirmatory Factor Analysis (CFA) assumes a-priori item-factor loadings/associations and constraints estimation of others [6]. Moreover, both EFA and CFA help understand item functioning, and thus have been extensively implemented in applied research. However, these methods may have substantial limitations [7]. Although EFA enables researchers to obtain the optimal factorial structure of a scale based on the extraction of common items’ variance, the dimensions extracted might not always be theoretically meaningful and thus useful for comparisons across groups and over time [8, 9]. Furthermore, in EFA, correlations between pairs of items due to specific methodological influences (e.g., similar language delivery) are only considered in the context of residual variance(s) [9]. Thus, the exploratory stage (i.e., EFA) usually requires a confirmation or validation stage (i.e., CFA) [8, 9].

Questionnaire validation via CFA can be challenging in the case of multifactorial structures, where hypothesized (not explored) structures propose congeneric items (i.e., loading on only one factor) [6]. However, items often relate to factors other than the primary item-allocated ones (i.e., cross-loadings [8]). Cross-loadings are rarely equal across items and should be modeled to increase measurement validity [6]. CFA approaches ignore cross-loadings, prioritizing parsimonious models that may result in limited model fit and measurement indices (e.g., reliability [9, 10]). Moreover, not accounting for minor cross-loadings can generate reduced fit for theoretically sound instruments with a larger number of factors (e.g., 5–10) and a high number of items (e.g., 5–10/ per scale/dimension [11]). Thus, ignoring potential item cross-loadings inevitably affects the validity and utility of CFA findings [9].

Exploratory Structural Equation Modeling (ESEM): How is it similar and different to traditional EFA and CFA procedures?

Alternatively, Exploratory Structural Equation Modeling (ESEM) integrates EFA and CFA strengths to overcome their limitations [7,8,9,10, 12]. ESEM is an EFA conducted in a structural equation modeling (SEM) context where items can load on multiple factors and produce goodness-fit indices (see Table 1). Thus, ESEM has been suggested to merge the advantages of both EFA and CFA analyses [7]. Consequently, ESEM considers items’ cross-loadings as little as 0.10 and/or even approximating 0, preventing inaccurately increased parameters or distorted model-fit [13]. It should be noted, however, that ESEM may not work best in bifactor structures (i.e., latent structures where each item loads on specific uncorrelated factors as well as a general common factor [12]). With this in mind, ESEM calculations with target rotation have been introduced, enabling cross-loadings to be embedded within hypothesis-derived models (i.e., targeted items are considered for both their targeted and non-targeted dimensions [8, 10, 14,15,16]). This type of rotation also “targets” cross-loading to approximately zero for non-primary item-factor associations [16].

Table 1 Comparison of EFA, CFA and ESEM

What are the different ways to implement ESEM?

Overall, ESEM involves a mixture of exploratory and confirmatory elements, including a) the factorial structure of a scale; b) primary factor loadings, and c) non-primary factor loadings. Such choices of structural ESEM components may later be expanded via the selection of different rotation types and estimators informing both the similarities and the differences between two traditional ESEM pathways, as well as two recently introduced ESEM variations (i.e., set-ESEM; ESEM within CFA [7, 16, 18, 19]. The two alternative, yet similar, traditional ESEM pathways involve a) expanding CFA via EFA features (1st pathway) and b) expanding EFA via incorporating CFA structures/features (2nd pathway). Pathway 1 either expands CFA-calculated models by constraining all cross-loading thresholds to near zero (~ 0) for non-primary items (pathway 1a; see the default Mplus procedure [20]) or by assuming EFA-derived loadings as the threshold for primary loadings and cross-loadings (pathway 1b [19]). Alternatively, the second pathway includes a two-stage process. It initially uses factor analysis to identify the primary items assumed to be allocated to each dimension (or factor). The second stage includes non-primary items, with their latent extracted factors correlated under an ESEM solution. The major difference between pathway 2 and pathway 1b is using EFA procedures as the core of calculations instead of CFAs [18].

In that line, two significant methodological ESEM variations have also been suggested [7, 16]. Firstly, in cases where theoretical arguments support only a number/set of interrelated factors (and not all factors of a multidimensional scale associating with each other) only cross-loadings within this set may be enabled (see set-ESEM method [7]). Secondly, the ESEM-Within-CFA (EWC) has been proposed to compensate for the limited interpretability likely resulting from traditional ESEM [16]. In essence, EWC uses ESEM-item loadings to inform starting values of item loadings in a CFA model, combined with a number of fixed parameters for convergence [16]. Specifically, a scaling/referent item is chosen per factor to help detect small cross-loadings, which are then fixed to their previously ESEM-derived values [16]. All other parameters adhere to traditional ESEM management (i.e., relaxed and/or constraint [16]). Both ESEM variations (EWC and set-ESEM) have been shown to operate well in most cases, despite potentially resulting in weaker performance in ESEM models involving higher-order factors [16].

These different ESEM approaches can be later enhanced with the choice of specific rotation types and estimators (see Table 2). Such choices may have significant effects on the modeling flow (i.e., convergence/parsimony) and the results [12]. Considering estimators in the traditional CFA and EFA context, Maximum Likelihood (ML) estimation tends to be the most widely used for data assuming multivariate normality (commonly involving continuous variables), while Weighted Least Squares (WLS) is the estimator of choice for non-continuous variables/data (i.e., Likert scales [4]). Robust variations of such estimators may also be selected in ESEM modeling to consider standard errors influences in the reported statistics (see Maximum Likelihood with Robust Standard Errors [MLR] and/or Weighted Least Square Mean(s) and Variance(s) adjusted [WLSMV] estimators [4, 21, 22]. Considering rotations, oblique types tend to be more commonly employed in traditional CFA/EFA procedures, as dimensions of multi-factorial questionnaires/scales are often expected to be correlated [4, 21, 22]. However, a series of other alternative options may also be used to expand ESEM calculations based on the specific modeling features (see Table 2 for more details).

Table 2 ESEM Pathways 1 & 2, structural features, estimators and rotations

What are the strengths and limitations of ESEM?

Overall, ESEM tends to produce less biased inter-factor correlations and model estimations [9, 12]. In that line, as the magnitude and the precision of both items’ primary loadings and cross-loadings concurrently define ESEM extracted factors, their clarity may be enhanced via less inflated correlations, resulting in more realistic reliability estimates, as well as improved modeling fit, compared to non-ESEM procedures (e.g., α, ω [8, 14, 15]. In addition, given that ESEM can concurrently employ both CFA and EFA methods, ESEM-extracted latent dimensions and general findings tend to be more accurate reflections of reality and, thus, the phenomena underpinning their measuring scales [9, 10]. Furthermore, ESEM latent factors can counterbalance inter-cultural/national differences related to the interpretation of items and item-wording effects [9, 10].

Despite these strengths, notable ESEM limitations may involve reduced/lack of parsimony (i.e., the method can be too flexible), and latent constructs may be difficult to interpret [6, 23]. Additionally, ESEM may underperform in complex models, as a high model fit may interfere with the calculation of higher-order factors (e.g. partial invariance, mediation employed cross-loadings, multi-level, latent class and latent growth curve modeling, commonly used in psychiatric research [9, 23]). Additional limitations are related to the critical importance of the rotation procedures selected, as these may influence the size and direction of latent factor correlations and cross-loadings [10]. Thus, ESEM modeling should not be viewed as an entirely positive procedure without taking into consideration its limitations and specific uses [7, 9, 23].

When should ESEM be employed?

According to Alamer and Marsh [7], ESEM is a confirmatory procedure enriched via exploratory elements to incorporate non-primary item(s)-factor(s) associations (i.e., cross-loadings). Thus, ESEM is exclusively indicated for multidimensional questionnaires, where an item’s variance may be simultaneously explained by more than one latent factor without necessarily indicating the occurrence of a non-measured alternate factor (e.g., bi-factor models/ non-calculated other factors; see Fig. 1 [7, 10]). Nevertheless, even in the case that all prior conditions apply, ESEM models are recommended to be comparatively calculated with their respective CFA models and preferred only if: a) significantly better fit indices are observed (compared to a CFA model [7]); b) ESEM factor correlations are lower than those of their corresponding CFA estimation; c) ESEM cross-loadings are small to medium (< 0.50) if higher a theoretical (for instance similar item phrasing) explanation applies; d) ESEM factors must present with strong and theoretically meaningful loadings, as medium to large cross-loadings might suggest a non-calculated factor; e) ESEM bi-factor models need to show better fit than its corresponding non-bi factor ESEM and bi-factor CFA versions, and; f) the reliability estimates (i.e., α, ω, etc.) of the ESEM should be acceptable [9, 12].

Fig. 1
figure 1

EFA, CFA, ESEM decision process

Under these conditions, flexible ESEM practices may enhance the findings of a wide range of modeling aims broadly used in psychiatry and mental health [7, 9]. These may involve the confirmation of predefined factor structures, the investigation of the interrelationships of different latent factors, measurement invariance procedures across different groups and over time, and even latent growth modeling [10]. Therefore, ESEM could be broadly applicable in the context of psychopathology/psychiatric scales employed for epidemiological, clinical intake assessment and intervention monitoring purposes [9].

How can ESEM be operationalized?

Despite the ESEM theoretical background and utility having been articulated [10] and newer ESEM versions (e.g., set-ESEM; EWC [16]) developed, certain restrictions limit researcher implementation. Specifically, lack of flexibility in defining the model parameters, reporting rigidity, reproducibility of results, software accessibility and syntax/coding complexities reduce the adoption of its various versions (see 1a, 1b, 2, Set-ESEM and EWC variations described above [9]). Indeed, the ESEM 1a pathway appears to have been mostly applied, via the additional allocation of differential loadings for all non-primary items through the consideration of a close to 0 factor threshold in Mplus CFA procedures [8, 20]. The broader use of this ESEM pathway has been greatly supported by the ESEM code generator for Mplus introduced by de Beer and Van Zyl [20]. This allows less experienced Mplus users to automatically transform their multifactorial CFA models into their corresponding ESEM structures in order to proceed with testing [9]. While Mplus presents an excellent option for running these analyses, its limited accessibility (i.e., paid subscription) may hinder the broader ESEM adoption. Alternatively, a freely-accessible platform, such as RStudio, may present with greater flexibility and ease of accessing/editing syntax. Table 3 provides an overview of RStudio advantages.

Table 3 Comparison between Mplus and RStudio

In that context, the broader use of the ESEM 1b and the ESEM 2 pathways are feasible in R software via the currently openly available “esemComp” R package [19] and the ESEM/EFA-based code introduced by Revelle [18]. The esemComp operationalization of the ESEM pathway 1b appears to be more user-friendly and comparable to the Mplus ESEM calculation [19]. Nevertheless, its adoption is likely compromised by modeling challenges related to the multiple steps required (i.e., distinct EFA and ESEM steps, similar to EWC, except for the first step requiring EFA, instead of ESEM, to inform the loading starting points for the CFA at step 2). Additionally, the esemComp assumes that users can correctly identify factor-referent items, resulting in likely human error [19]. To address these limitations, the current tutorial provides an ESEM R code that merges EFA and ESEM-CFA modeling steps while automating the selection of factor-referent items. More importantly, the approach proposed in this tutorial produces similar (and potentially improved results) to those obtained via the Mplus alternative, as it enables varying calculation thresholds for all items. To demonstrate the implementation of the method, this tutorial will use the Strengths and Difficulties Questionnaire (SDQ [24]) given its questioned CFA and ESEM factor structure(s) in a series of earlier studies [8, 23, 25]).

An ESEM tutorial example: The SDQ controversial factor structure

The SDQ is a popular mental health instrument used in several studies nationwide [26,27,28] to assess psychological strengths and difficulties for individuals aged between 2 and 17 [8, 29]. It includes 25 items distributed across five scales addressing Emotional Symptoms (ES), Conduct Problems (CP), Hyperactivity (Hy), Peer Problems (PP), and Prosocial Behavior (PB [30]). The same 25 items, with respondent-specific wording variations, can be completed by parents and teachers, as well as self-reported by the assessed child/adolescent [8, 30].

The SDQ structure has been challenged with different proposed factorial models across various national and age samples [8, 23, 29, 31, 32]. For example, working with Malaysian parental SDQ ratings for children 5–13 years, Gomez and Stavropoulos [29] demonstrated support for an oblique six-factor model involving aside of the five SDQ domains a positive construal factor comprising all the 10 SDQ positive worded items [29]. Furthermore, at least three studies have examined the potential fit of ESEM SDQ models. Firstly, Garrido and colleagues [25] analysed a Spanish-speaking population of 67,253 SDQ respondents (10–18 years) and found the five-factor CFA SDQ structure biased, with its respective ESEM version presenting a rather weak factorial structure. Secondly, Black et al. [23] investigated the SDQ responses of 30,842 UK students (11 to 15 years) and found the five-factor ESEM model valid for Year 7 and 9 students. Finally, Gomez and colleagues [8] examined the SDQ responses of 968 Greek-speaking adolescents (12–18 years) and supported an ESEM model with three factors entailing dysregulation, peer problems, and prosocial behavior, whilst also recommended further research (see supplementary Table 1 and Fig. 2 for more detailed information [29]). Furthermore, researchers have flagged only partial measurement invariance across different SDQ language versions, with the English versions of the instrument (such as the one used for the current tutorial) showing a particularly ill-fit [32]. Interestingly, the latter has been attributed to items simultaneously loading on several latent factors, reinforcing the need for ESEM testing [32]. The above prompted scholars to suggest that due to the SDQ multidimensionality and the close relationship between SDQ factors, the five-factor structure should be tested via ESEM [8].

Fig. 2
figure 2

SDQ models proposed in previous literature

Aims of the present tutorial

Considering the potential ESEM benefits, deterrents, and recent recommendations for simplifying ESEM R procedures [9], this work aims to equip younger scholars with a tutorial implementation of ESEM pathway 1b via a newly introduced R code/automation. To achieve this, the contested SDQ factor structure will be used as an example [29]. For brevity and to avoid repetitions with existing literature regarding ESEM reporting guidelines [9]) and the optimum SDQ factor structure [8, 23, 25, 29, 30], only the conventional five-factor CFA and its corresponding ESEM will be examined here [9].

ESEM tutorial: Methods, materials and procedure

A subset of pre-existing data (i.e., Growing Up in Australia: The Longitudinal Study of Australian Children [LSAC]) was used ( [33, 34]), including 3956 participants from cohort K (i.e. Kinder Cohort, approximately 5000 children between 4–5 years in 2003/2004) to the SQD. This questionnaire includes 25 items rated on a three-point Likert scale ranging from 0 to 2 (0 = “not true”, 1 = “somewhat true”, 2 = “certainly true”). Items are equally distributed across five proposed domains involving ES, CP, Hy, PP, and PB [26, 27]. A retrospective power analysis via the semPower R package indicated that a model with α = 0.05, df = 190, and N = 3836 would yield acceptable power (1-β = 0.99), being satisfied by the current sample size [35] (see Supplementary Fig. 1 for more details).

Ethics approval to use the archival LSAC data was granted by the Victoria University Human Research Ethics Committee on 10th May 2022. The original data collectors obtained written and verbal consent from parents/guardians and, where appropriate, from the participants. All procedures performed involving human participants were in accordance with the 1964 Helsinki Declaration and its later amendments. Permission to access and utilize the dataset was provided by the National Centre for Longitudinal Data, Australian Government Department of Social Services on 14th July 2021.

ESEM Tutorial: A step-by-step ESEM guide via the R software using the SDQ five-factor structure example

The following section will use the ESEM package ( [36]) to demonstrate the expansion of the traditional five-factor SDQ CFA model with the inclusion of loading calculation thresholds derived via a) fixed rates approximating 0 (Mplus approach [20]) and b) prior ESEM embedded EFA procedures/loadings via the R software. Considering the latter, the full list of ESEM implementation R functions is provided in the GitHub html appendix, while the exact data used in the tutorial is also attached (data available from

1a Pathway: ESEM based on fixed loading thresholds approximating 0 via Mplus

ESEM modeling can be performed through the expansion of traditional CFA with the inclusion of all non-primary factor-item loadings at a fixed rate approximating 0. Beer and van Zyl [20] have introduced a peer-reviewed Mplus ESEM generator online software for any chosen Mplus model defined (see Nonetheless, this section provides detailed instructions to fit an ESEM with zero approximation of non-primary item loadings using Mplus syntax.

Firstly, fitting an ESEM model using Mplus requires loading the data (see Loading the Data, Table 4, Setup). Subsequently, the variables should be named (see Naming the Variables, Table 4, Setup) and their nature defined (see Defining the nature of the variables, Table 4, Setup). Researchers must identify the variables to be used for the analysis (see Variables to be used), define missing values (see Defining missing values, Table 4, Setup), the estimator (in this case, WLSMV), and the rotation. The model is then prepared to identify primary items loadings in each latent factor and non-primary items loadings constrained to approximately zero (~ 0; see Model setup, Table 4, Step 1; and Fig. 3). Subsequently, the model is tested, and results are produced (see Testing the ESEM model, Table 4, Step 2).

Table 4 Pathway 1a: ESEM based on fixed cross-loading thresholds approximating 0 – Mplus syntax
Fig. 3
figure 3

Defining the model in the Mplus environment

Using these steps, a five-factor ESEM was fitted with all non-primary items approximating zero (~ 0). This model showed acceptable fit indices (χ2[185] = 1372.931, p < 0.001, CFI = 0.952, TLI = 0.922, RMSEA = 0.041), most items loaded significantly on all five latent factors and latent factors covaried (see Fig. 4).

Fig. 4
figure 4

ESEM summary information using Mplus environment

1b Pathway: ESEM based on EFA derived loading thresholds via RStudio

Alternatively, before conducting the ESEM procedure, an EFA is required to extract the factor loadings that will be used to expand the traditional CFA model (both processes are embedded within step 1 of the proposed R code). To achieve this, the EFA loadings and the specific factor referent items are automatically summarized as a structural unit, which directly informs the creation of the ESEM analysis. The ESEM model is then tested (Table 5, Step 2) and visualized (Table 5, Step 3). It is noted that Geomin rotation is, by default, embedded in step 1 (Table 5). If a researcher prefers target rotation for more theory-driven and less exploratory results, Step1 should be substituted by the Step1a alternative code (see Table 5). In either case, the end-product ESEM model behaves as a “conditional” CFA procedure, where factors are calculated based on all their primary and non-primary item loadings, provided these exceed their EFA varying levels. This improves modeling accuracy compared to 1a Pathway2.

Table 5 Pathway 1b: ESEM based on EFA extracted loading thresholds (as discussed in [16])

Estimating an ESEM model with R Studio requires the installation of ‘ESEM’, a dedicated package containing specific functions to ease this process. The package can be installed using remotes::install_github("maria-pro/esem", build_vignettes = TRUE) and loaded using library(esem) (Table 5, setup). Other packages are required to execute this example (i.e., tidyverse, psych, lavaan, GPArotation, and semPlot; see Table 5, setup [37,38,39,40]. The line sdq <—sdq_lsac loads the dataset used for this demonstration. A table with descriptives statistics can be obtained with the function describe(sdq_lsac) (see Fig. 5).

Fig. 5
figure 5

Descriptive statistics generated in R Studio

Subsequently, two approaches can be used to fit an ESEM model: (a) a Geomin rotation and (b) a targeted rotation. A Geomin rotation involves an exploratory approach (much like EFA), where the researcher can set the desired number of latent factors while allowing the algorithm to identify the main loading items on each latent factor (Table 5, Step 1; and Fig. 6).

Fig. 6
figure 6

Fitting an ESEM with Geomin-rotation in R Studio

Alternatively, a targeted rotation involves creating a list object (main_loadings_list; Table 5, Step 1a; and Fig. 7), reproducing a desired factorial structure where the researcher predetermines the main loading items in each factor Fig 8.

Fig. 7
figure 7

Fitting an ESEM with targeted rotation in R Studio

Fig. 8
figure 8

Fitting an ESEM with targeted rotation in R Studio

The function make_target enables the estimation of targeted loadings anchoring the lowest item-loading from each latent factor, and the function esem_c uses the target loading to fit an ESEM (Table 5, Step 1a). Finally, the results can be inspected using the function summary (Table 5, Step 2), and the factorial structure can be plotted using semPaths (Table 5, Step 3; and Fig. 9).

Fig. 9
figure 9

ESEM diagram

Using the steps described above, two five-factor ESEM were fitted, including Geomin and targeted rotation. Both models showed similarly acceptable fit indices (target rotation: χ2[190] = 715.708, p < 0.001, CFI = 0.990, TLI = 0.985, RMSEA = 0.027, SRMR = 0.035; and Geomin rotation: χ2[190] = 673.476, p < 0.001, CFI = 0.991, TLI = 0.986, RMSEA = 0.026, SRMR = 0.035; see Fig. 10). In both models, most items loaded significantly on all five latent factors, and most latent factors showed low covariance (see A Satorra-Bentler chi-squared scaled difference test (SBS Δχ2) indicated no significant differences between both models (p = 0.99).

Fig. 10
figure 10

ESEM model results using pathway 1b step 1a

Considering the non-significant, albeit slightly improved fit of the ESEM model using a Geomin rotation (pathway 1b), the model was compared to its respective CFA model. The CFA analysis was conducted in R using `lavaan` package (see Fig. 11 [39]). The CFA model is specified as `sdq_model` and the analysis is conducted using cfa() function that specifies `DWLS` estimator. The results are presented below as R output for demonstration purposes (Fig. 12).

Fig. 11
figure 11

Running a CFA model in RStudio using the ‘lavaan package [39]

Fig. 12
figure 12

Comparing CFA and ESEM (with Geomin rotation, pathway 1b)

The CFA model showed marginally acceptable fit indices (CFA: χ2[265] = 3504.987, p < 0.001, CFI = 0.893, TLI = 0.879, RMSEA = 0.056, SRMR = 0.059). Moreover, a Satorra-Bentler chi-squared scaled difference test (SBS Δχ2) indicated that the ESEM with Geomin rotation showed a significantly better fit than its CFA counterpart (Δχ2[75] = 265.850, p < 0.001). Most factor correlations were not significant in the ESEM model, while most were significant in the CFA model (see Fig. 13). The presence of item cross-loadings suggests that free estimation of non-primary items should be enabled; however, this process may obtain over-identified models capable of converging on a given solution [9, 12].

Fig. 13
figure 13

Comparing factor correlations between CFA and ESEM with Geomin rotation


This tutorial addressed past recommendations to simplify and facilitate ESEM implementations [9]. To address this aim, it first provided an ESEM theoretical overview while emphasizing the comparison between ESEM and traditional EFA and CFA methods [12]. Secondly, different ESEM pathways and hybrid ESEM methodologies, including Set-ESEM and EWC were explored in relation to their potential estimator and rotation selection features [7, 9, 10, 12]. Thirdly, ESEM strengths, limitations, conditions and utility were briefly illustrated, and available ESEM operationalization procedures via the Mplus and the R Software were highlighted (alongside their advantages and disadvantages [9, 18,19,20]). Fourthly, the SDQ factor structure debate was succinctly explained to allow the use of the scale as an example for the current tutorial. Material and methods secured from the LSAC data were additionally described prior to the analyses [33]. To avoid repeating past literature, the SDQ five-factor CFA and its corresponding ESEM were emphasized in the context of the “Data Analysis and Reporting Phase” of the ESEM guidelines [8, 9, 23, 25, 29, 30]. Within this context, the present tutorial comparatively provided relevant R ESEM pathway 1b and Mplus syntax via a step-by-step guide, aiming to help young researchers implement this type of modeling.

Accordingly, the tutorial analyses demonstrated two different approaches to conduct ESEM via the Mplus and R software. These entail a) the inclusion of non-primary loading (i.e. cross-loadings) calculation thresholds via fixed rates approximating 0 (pathway 1a) and via ESEM embedded EFA procedures (pathway 1b). Pathway 1a shows ESEM analysis with the scale’s factor structure evaluated in a core confirmatory manner [20]. Alternatively, pathway 1b demonstrates ESEM embedding a prior CFA, EFA step to inform non-primary loadings calculation starting points [18]. Although similar to EWC, pathway 1b adopts an EFA and not an ESEM as an initial procedure, thus potentially being more methodologically rigorous than other ESEM modeling calculations [7]. The presentation of the results followed the suggested guidelines by van Zyl and ten Klooster [9] in the context of the required data analysis and reporting phase. In particular, the goodness-of-fit indices and measurement quality indicators are reported and benchmarked. The presentation of results is completed for both pathways to allow comparability of the approaches.

Comparison of global fit for all models was based on their CFI, TLI and RMSEA values and showed good fit with most items loading significantly on all five factors and the factors covarying, similar to past SDQ studies [15, 29]. Interestingly, the analysis additionally confirmed issues with the SDQ instrument, as items showed lower than 0.6 loadings on their designated factors aligning with past evidence [8, 23, 25, 29, 30]. The conducted ESEM was finally compared with its corresponding SDQ five factor CFA model [12]. Following suggestions by Morin et al. [12], the ESEM models are expected to show better data-model fit than CFA options except for smaller factor correlations in ESEM models, when compared to traditional CFA models. Thus, ESEM pathway 1b analysis was expected to show lower factor correlations (as it is a non-bifactor model). The cross-loadings in the ESEM model were envisaged to be below 0.5 and the estimated latent factors were also expected to show strong loadings matching expectations. Large cross-loadings in ESEM generally indicate the existence of a global factor and may present a case for bifactor ESEM [8, 23, 25, 29, 30].

Considering the comparison between ESEM pathway 1a (i.e., MPlus facilitated ESEM where the items’ cross-loadings with all their non-primary factors are uniformly set to approximate 0), and pathway 1b (where items’ cross-loadings with their non-primary allocated factors are calculated based on their EFA-derived thresholds), pathway 1b showed better fit. Specifically, whilst both procedures demonstrated sufficient fit, pathway 1b, as facilitated via the current proposed code showed lower chi-square, RMSEA and SRMR and higher CFI and TLI either with Target (i.e. χ2[190] = 715.708, p < 0.001, CFI = 0.990, TLI = 0.985, RMSEA = 0.027, SRMR = 0.035) or Geomin rotation (i.e. χ2[190] = 673.476, p < 0.001, CFI = 0.991, TLI = 0.986, RMSEA = 0.026, SRMR = 0.035) compared to pathway 1a facilitated via MPlus (χ2[185] = 1372.931, p < 0.001, CFI = 0.952, TLI = 0.922, RMSEA = 0.041). These results suggest that item-specific treatment, in the context of ESEM, may result in better fit indices. As this process is not available via MPlus, the present code is an attractive alternative for prospective ESEM users.

Conclusions, implications, limitations & further research

Overall, considering the different ESEM calculation options applied, useful conclusions may be indicated. As such, one could support that ESEM calculation based on: a) the inclusion of fixed non-primary items thresholds approximating 0 (Pathway 1a); and b) EFA extracted loading thresholds (Pathway 1b), while similar, they are yet, to an extent, different. The second option allows EFA variability in primary item loadings to be also considered, while non-primary item loadings are differentially treated, according to their EFA performance. The first option initially distinguishes primary and non-primary item treatment, with no loading calculation thresholds included for primary items. Secondly, it uniformly addresses all non-primary items via the inclusion of the same approximating to 0 loading threshold. The comparison of the performance of the two alternatives in the context of this tutorial shows relatively improved fit, for pathway 1b (see GitHub html appendix). Overall, from a theoretical perspective, variability in the performance of even primary items is expected, and thus one would suggest that ESEM calculation based on EFA derived loading thresholds allows more reflective/accurate modeling.

As ESEM practices may enhance the usage of multidimensional questionnaires, the technique demonstrated in this tutorial has relevance for the critical appraisal of commonly used measures of human behavior, as well as behaviors with diagnosable psychopathology/psychiatric features (i.e., multifaceted mood, psychotic, and developmental disorders; e. g. bipolar; schizoaffective; attention deficit and hyperactivity; [41]). In addition, the present technique/code combines four significant strengths, as it: a) employs a publicly accessible software; b) provides an adaptable and easy to follow (even for less R-relevant users) ESEM code; c) utilizes a ‘real-world’ measure with a debatable factor structure/interpretation; and d) emphasizes on publicly accessible nationwide demonstration dataset. Overall, while ESEM conducted in both Mplus and ESEM presents valuable guidance for researchers, in regard to the SDQ instrument testing itself, the outcome requires further examination with the use of alternative ESEM/CFA models.

Nevertheless, such methodological strengths need to be evaluated in the context of a researcher’s required familiarity with the R software usage, as well as potentially significant ESEM limitations [9, 12]. These may involve occasional lack of parsimony and/or confusing factors, over-fitting risk, as well as the recommended cautiousness with the calculation of higher-order factors, partial invariance, mediation employed cross-loadings, multi-level, latent class and latent growth curve modeling [6, 9, 23]. In that line, and despite available ESEM literature converging to the usage of traditional CFA fit thresholds [42], the potential application of optimal ESEM cut-offs may need to be examined. Thus, we have now expanded our future research recommendations to address potentially ESEM-specific model fit thresholds Overall, ESEM modeling should not be perceived as an entirely positive procedure and/or without takings into consideration its limitations and specific uses (despite being able to solve many problems [7, 9, 12, 23].

Note 1: In Set-ESEM, the calculation of cross-loadings is enabled between predefined sets of factors, while they are prohibited to expand to different factor sets [6].

Note 2: Prior to the ESEM, data pre-processing including missing values analysis, outliers and distributional assumptions were addressed.

Availability of data and materials

The data of the current study is available on the following link The package is available here, and practical examples to reproduce analyses presented in the manuscript are available on the following links: step 1ba, and step ESEM 1b step 1


  1. Stavropoulos V, Monger K, Zarate D, Prokofieva M & Schivinski B. Online Gambling Disorder Questionnaire (OGD-Q): An item response theory examination. Addictive Behaviors Reports.  2022;16:100449.

  2. Zarate D, Fullwood L, Prokofieva M, Griffiths MD & Stavropoulos V. Problematic shopping behavior: an item response theory examination of the seven-item Bergen Shopping Addiction Scale. Int J Mental Health Addiction. 2022;1–19. Accessed 30 Jan 2023.

  3. Van de Schoot R, Lugtig P, Hox J. A checklist for testing measurement invariance. Eur J Dev Psychol. 2012;9(4):486–92.

    Article  Google Scholar 

  4. Brown TA. Confirmatory factor analysis for applied research. Guilford publications; 2015.

  5. Lionetti F, Mastrotheodoros S, Palladino BE. Experiences in close relationships revised child version (ecr-rc): psychometric evidence in support of a security factor. Eur J Dev Psychol. 2018;15(4):452–63.

    Article  Google Scholar 

  6. Marsh HW, Guo J, Dicke T, Parker PD, Craven RG. Confirmatory factor analysis (CFA), exploratory structural equation modeling (ESEM), and set-ESEM: optimal balance between goodness of fit and parsimony. Multivar Behav Res. 2020;55(1):102–19.

    Article  Google Scholar 

  7. Alamer A & Marsh H. Exploratory structural equation modeling in second language research: An applied example using the dualistic model of passion. Studies in Second Language Acquisition. 2022; 44(5):1477–1500.

  8. Gomez R, Motti-Stefanidi F, Jordan S, Stavropoulos V. Greek validation of the factor structure and longitudinal measurement invariance of the strengths and difficulties questionnaire-self report (SDQ-SR): exploratory structural equation modelling. Child Psychiatry Hum Dev. 2021;52(5):880–90.

    Article  PubMed  Google Scholar 

  9. Van Zyl LE & ten Klooster PM. Exploratory structural equation modeling: practical guidelines and tutorial with a convenient online tool for mplus. Front Psychiatry. 2022; 12:795672.

  10. Marsh HW, Morin AJ, Parker PD, Kaur G. Exploratory structural equation modeling: an integration of the best features of exploratory and confirmatory factor analysis. Annu Rev Clin Psychol. 2014;10:85–110.

    Article  PubMed  Google Scholar 

  11. Marsh HW, Craven RG, Debus R. Self-concepts of young children 5 to 8 years of age: Measurement and multidimensional structure. J Educ Psychol. 1991;83(3):377.

    Article  Google Scholar 

  12. Morin AJ, Myers ND, & Lee S. Modern factor analytic techniques: Bifactor models, exploratory structural equation modeling (ESEM), and bifactor‐ESEM. Handbook of sport psychology. 2020; 1044–1073.

  13. Asparouhov T, Muthén B & Morin AJ. Bayesian structural equation modeling with cross-loadings and residual covariances: comments on Stromeyer et al. J Manag. 2015; 41(6):1561-1577.

  14. Gomez R, Stavropoulos V, Zarate D, Griffiths MD. ADHD symptoms, the current symptom scale, and exploratory structural equation modelling: a psychometric study. Res Dev Disabil. 2021b; 111(1): 103850.

  15. Gomez R, Brown T, Watson S, & Stavropoulos V. Confirmatory factor analysis and exploratory structural equation modeling of the factor structure of the questionnaire of cognitive and affective empathy (QCAE). PloS One. 2022; 17(2):e0261914.

  16. Morin AJS, & Asparouhov T. Estimation of a hierarchical Exploratory Structural Equation Model (ESEM) using ESEM-within-CFA. Montreal, QC: Substantive Methodological Synergy Research Laboratory. 2018.

  17. Hu L-T, Bentler PM. Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychol Methods. 1998;3(4):424–53.

    Article  Google Scholar 

  18. Revelle W. Esem: Perform and exploratory structural equation model (ESEM) by using factor extension techniques. Accessed on 16/03/2022 from. 2022b

  19. Silvestrin M. Exploratory structural equation modeling in R. 2022. From Accessed 30 Jan 2023.

  20. De Beer LT & Van Zyl LE. ESEM code generator for Mplus. 2019. From Accessed 30 Jan 2023.

  21. Satorra A, Bentler PM. Corrections to test statistics and standard errors in covariance structure analysis. In: von Eye A, Clogg CC, editors. Latent variables analysis: Applications for developmental research. Sage Publications; 1994. p. 399–419.

    Google Scholar 

  22. Tabachnick BG & Fidell LS. Using Multivariate Statistics (5th ed.) Allyn and Bacon. 2007.

  23. Black L, Mansfield R, Panayiotou M. Age appropriateness of the self-report strengths and difficulties questionnaire. Assessment. 2020;28(6):1556–69.

    Article  PubMed  Google Scholar 

  24. Goodman R. Psychometric properties of the strengths and difficulties questionnaire. J Am Acad Child Adolesc Psychiatry. 2001;40(11):1337–45.

    Article  CAS  PubMed  Google Scholar 

  25. Garrido LE, Barrada JR, Aguasvivas JA, Martínez-Molina A, Arias VB, Golino HF, Legaz E, Ferrís G, Rojo-Moreno L. Is Small Still Beautiful for the Strengths and Difficulties Questionnaire? Novel Findings Using Exploratory Structural Equation Modeling. Assessment. 2020;27(6):1349–67.

    Article  PubMed  Google Scholar 

  26. Hafekost J, Lawrence D, Boterhoven de Haan K, Johnson SE, Saw S, Buckingham WJ, Sawyer MG, Ainley J, Zubrick SR. Methodology of young minds matter: the second Australian child and adolescent survey of mental health and wellbeing. Aust N Z J Psychiatry. 2016;50(9):866–75.

    Article  PubMed  Google Scholar 

  27. NHS England. (n/d). NHS digital annual report and accounts 2018–19. Retrieved from Accessed 30 Jan 2023.

  28. Wolpert M, Cheng H, Deighton J. Measurement issues: review of four patient reported outcome measures: SDQ: RCADS, C/ORS and GBO – Their strengths and limitations for clinical use and service evaluation. Child Adolesc Mental Health. 2015;20(1):63–70.

    Article  Google Scholar 

  29. Gomez R, Stavropoulos V. Parent ratings of the strengths and difficulties questionnaire: what is the optimum factor model? Assessment. 2019;26(6):1142–53.

    Article  PubMed  Google Scholar 

  30. Ribeiro Santiago PH, Manzini D, Haag D, Roberts R, Smithers LG & Jamieson L. Exploratory graph analysis of the strengths and difficulties questionnaire in the longitudinal study of Australian children. Assessment. 2021. 10731911211024338.

  31. Bentley N, Bucci S, Hartley S. Measuring outcomes within inpatient child and adolescent mental health services: an evaluation of the recovery questionnaire for young people. Child Adolesc Mental Health. 2019;24(4):329–37.

    Article  Google Scholar 

  32. Ortuño-Sierra J, Chocarro E, Fonseca-Pedrero E, Riba SI, S., & Muñiz, J. The assessment of emotional and behavioural problems: internal structure of the strengths and difficulties questionnaire. Int J Clin Health Psychology. 2015;15(3):265–73.

    Article  Google Scholar 

  33. Sanson AV, Nicholson J, Ungerer J, Zubrick S, Wilson K, Ainley J, Berthelsen D, Bittman M, Broom D, Harrison L, Rodgers B, Sawyer M, Silburn S, Strazdins L, Vimpani G, & Wake M. Introducing the longitudinal study of Australian children2002. Accessed 30 Jan 2023.

  34. MacCallum RC, Browne M, W., & Sugawara, H., M. Power analysis and determination of sample size for covariance structure modeling. Psychol Methods. 1996;1:130–49.

    Article  Google Scholar 

  35. Jobst L, Bader M, Moshagen M. A tutorial on assessing statistical power and determining sample size for structural equation models. Psychol Methods. 2023;28:207–21.

    Article  PubMed  Google Scholar 

  36. Prokofieva M, Stavropoulos V, & Zarate D. ‘esem’. R package. 2023.

  37. Epskamp S, Struber S, Nak J, Veenman M & Jorgensen TD. semPlot: path diagrams and visual analysis of various SEM packages’ output. R package. 2022. Accessed 30 Jan 2023.

  38. Revelle W. psych: procedures for psychological psychometric, and personality research. R package. Northwestern University. 2022a.

  39. Rosseel Y. Lavaan: an R package for structural equation modelling. J Stat Software. 2012; 48(2), 1–36.

  40. Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry K, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Sinu V, …& Yutani H. Welcome to the tidyverse. J Open Source Software. 2019;4(43):1686.

  41. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (5th ed.), American Psychiatric Publishing, Arlington, VA. 2013.

  42. Widaman KF, Revelle W. Thinking thrice about sum scores, and then some more about measurement and analysis. Behav Res. 2023;55:788–806.

    Article  Google Scholar 

Download references


Not applicable.


VS has received the Australian Research Council, Discovery Early Career Researcher Award 2021. Grant Award Number: DE210101107.

Author information

Authors and Affiliations



MP contributed to the literature review, the formulation of the research hypotheses, the data collection and the writing of the first form of the manuscript. DZ contributed to the formulation of the research hypotheses, the data collection, the analyses, and the writing of the first form of the manuscript. AP and OP contributed to the literature review and the editing of the final form of the manuscript. VS contributed to the literature review, the formulation of the research hypotheses, the data collection and the writing of the first form of the manuscript.

Corresponding author

Correspondence to Daniel Zarate.

Ethics declarations

Ethics approval and consent to participate

All procedures performed in the study involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. This article does not contain any studies with animals performed by any of the authors. Informed consent was obtained from all individual participants included in the study. Informed consent for participants aged under 16 was provided by their parents and/or legal guardians. Ethics approval was received from the Victoria University, Melbourne, Australia, Ethics Committee HRE20-169.

Consent for publication

Not applicable.

Competing interests

Dr Vasileios Stavropoulos is an associate editor of BMC Psychiatry. All other authors do not have competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Prokofieva, M., Zarate, D., Parker, A. et al. Exploratory structural equation modeling: a streamlined step by step approach using the R Project software. BMC Psychiatry 23, 546 (2023).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: