Chapter 2: Specifying the path model and examining data

The initial model is a path model that estimates the relationships between corporate reputation, customer satisfaction, and customer loyalty. The SmartPLS 4 setup follows.

SPECIFYING THE PATH MODEL AND EXAMINING DATA

The most effective way to learn how to use a statistical method is to apply it to a set of data. Throughout this book, we use a single example that enables you to do that. We start the example with a simple model, and in Chapter 5, we expand that same model to a much broader, more complex model. For our initial model, we hypothesize a path model to estimate the relationships between corporate reputation, customer satisfaction, and customer loyalty. The example will provide insights on (1) how to develop the structural model representing the underlying concepts/theory, (2) the setup of measurement models for the latent variables, and (3) the structure of the empirical data used. Then, our focus shifts to setting up the SmartPLS 4 software (Ringle, Wende, & Becker, 2022) for PLS-SEM.

To specify the structural model, we must begin with some fundamental explications about theoretical models. The corporate reputation model by Eberl (2010) is the basis of our theory. The goal of the model is to explain the effects of corporate reputation on customer satisfaction (CUSA) and, ultimately, customer loyalty (CUSL). Corporate reputation represents a company’s overall evaluation by its stakeholders (Helm, Eggert, & Garnefeld, 2010). It is measured using two dimensions. One dimension represents cognitive evaluations of the company, and the construct is the company’s competence (COMP). The second dimension captures affective judgments, which determine the company’s likeability (LIKE). This two-dimensional approach to measure reputation was developed by Schwaiger (2004). It has been validated in different countries (e.g., Eberl, 2010; Zhang & Schwaiger, 2012) and applied in various research studies (e.g., Eberl & Schwaiger, 2005; Radomir & Moisescu, 2019; Radomir & Wilson, 2018; Raithel & Schwaiger, 2015; Raithel, Wilczynski, Schloderer, & Schwaiger, 2010; Sarstedt & Schloderer, 2010; Schloderer, Sarstedt, & Ringle, 2014; Schwaiger, Raithel, & Schloderer, 2009; Yun, Kim, & Cheong, 2020). Research also shows that the approach performs favorably (in terms of convergent validity and predictive validity) compared with alternative reputation measures (Sarstedt, Wilczynski, & Melewar, 2013).
Building on a definition of corporate reputation as an attitude-related construct, Schwaiger (2004) further identified four antecedent dimensions of reputation—quality, performance, attractiveness, and corporate social responsibility—measured by a total of 21 formative indicators. These driver constructs of corporate reputation are components of the more complex example we will use in the book and will be added in Chapter 5. Likewise, we do not consider more complex model setups such as mediation or moderation effects yet. These aspects will be covered in the case studies in Chapter 7. In summary, the simple corporate reputation model has two main theoretical components: (1) the target constructs of interest—namely, CUSA and CUSL (endogenous constructs)—and (2) the two corporate reputation dimensions COMP and LIKE (exogenous constructs), which represent key determinants of the target constructs. Exhibit A2.1 shows the constructs and their relationships, which represent the structural model for the PLS-SEM case study.
To propose a theory, researchers usually build on existing research knowledge. When PLS-SEM is applied, the structural model displays the theory with its key elements (i.e., constructs) and cause-effect relationships (i.e., paths). Researchers typically develop hypotheses for the constructs and their path relationships in the structural model. For example, consider Hypothesis 1 (H1): Customer satisfaction has a positive effect on customer loyalty. PLS-SEM enables statistically testing the significance of the hypothesized relationship (Chapter 6). When conceptualizing the theoretical constructs and their hypothesized structural relationships for PLS-SEM, it is important to make sure the model has no circular relationships (i.e., causal loops). A circular relationship would occur if, for example, we reversed the relationship between COMP and CUSL, as this would yield the causal loop COMP -> CUSA -> CUSL -> COMP.
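The check for causal loops can be automated before a model is drawn in the software. The following is a minimal sketch in Python, not part of SmartPLS: it stores the structural model as a list of (source, target) paths and searches for a loop with a depth-first traversal. The function name `find_cycle` is hypothetical, and the two paths starting at LIKE are an assumption based on the description of LIKE as a determinant of both target constructs.

```python
from collections import defaultdict

def find_cycle(paths):
    """Return one causal loop as a list of constructs, or None if there is none."""
    graph = defaultdict(list)
    for source, target in paths:
        graph[source].append(target)

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    state = defaultdict(int)
    stack = []                     # constructs on the current traversal path

    def visit(node):
        state[node] = GRAY
        stack.append(node)
        for nxt in graph[node]:
            if state[nxt] == GRAY:
                # nxt is already on the current path, so we closed a loop
                return stack[stack.index(nxt):] + [nxt]
            if state[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        state[node] = BLACK
        return None

    for node in list(graph):
        if state[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

# Simple model; the LIKE paths are assumed (see the note above).
SIMPLE_MODEL = [
    ("COMP", "CUSA"), ("COMP", "CUSL"),
    ("LIKE", "CUSA"), ("LIKE", "CUSL"),
    ("CUSA", "CUSL"),
]

# Same model with the COMP -> CUSL path reversed, as in the text.
REVERSED_MODEL = [
    ("COMP", "CUSA"), ("CUSL", "COMP"),
    ("LIKE", "CUSA"), ("LIKE", "CUSL"),
    ("CUSA", "CUSL"),
]

print(find_cycle(SIMPLE_MODEL))    # no loop expected
print(find_cycle(REVERSED_MODEL))  # loop through COMP, CUSA, and CUSL expected
```

For the reversed model, the traversal reports the same loop named in the text, which makes this kind of check a convenient safeguard when a model grows beyond a handful of constructs.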

Exhibit A2.1 ■ Example of a Theoretical Model (Simple Model)

Since the constructs are not directly observed, we need to specify a measurement model for each construct. The specification of the measurement models (i.e., multi-item vs. single-item measures and reflective vs. formative measures) draws on prior research studies by Schwaiger (2004) and Eberl (2010).
In our simple example of a PLS-SEM application, we have three constructs (COMP, CUSL, and LIKE) measured by multiple items (Exhibit A2.2). All three constructs have reflective measurement models as indicated by the arrows pointing from the construct to the indicators. For example, COMP is measured by means of the three reflective items comp_1, comp_2, and comp_3, which relate to the following survey questions (Exhibit A2.3): “[The company] is a top competitor in its market,” “As far as I know, [the company] is recognized worldwide,” and “I believe that [the company] performs at a premium level.” Respondents had to indicate the degree to which they (dis)agree with each of the statements on a 7-point scale from 1 = fully disagree to 7 = fully agree.
Different from COMP, CUSL, and LIKE, the customer satisfaction construct (CUSA) is operationalized by a single item (cusa) that is related to the following question in the survey: “If you consider your experiences with [company], how satisfied are you with [company]?” The single indicator is measured with a 7-point scale indicating the respondent’s degree of satisfaction (1 = very dissatisfied; 7 = very satisfied).

Exhibit A2.2 ■ Types of Measurement Models in the Simple Model
Exhibit A2.3 ■ Indicators for Reflective Measurement Model Constructs

The single item has been used due to practical considerations in an effort to decrease the overall number of items in the questionnaire. As customer satisfaction items are usually highly homogeneous, the loss in predictive validity compared with a multi-item measure is not considered severe. As cusa is the only item measuring customer satisfaction, construct and item are equivalent (as indicated by the fact that the relationship between construct and single-item measure is always one in PLS-SEM). Therefore, the choice of the measurement perspective (i.e., reflective vs. formative) is of no concern and the relationship between construct and indicator is undirected.

To estimate the PLS-SEM, data were collected using computer-assisted telephone interviews (Sarstedt & Mooi, 2019) that asked about the respondents’ perception of and their satisfaction with four major mobile network providers in Germany’s mobile communications market. Respondents rated the questions on 7-point Likert scales, with higher scores denoting higher levels of agreement with a particular statement. In the case of cusa, higher scores denote higher levels of satisfaction. Satisfaction and loyalty were measured with respect to the respondents’ own service providers. The data set used in this book is a subset of the original set and has a sample size of 344 observations. The data have been collected using a quota sampling approach (Sarstedt, Bengart, Shaltoni, & Lehmann, 2018) by a professional market research company in the German market. The resulting sample is representative of the German population.
Exhibit A2.4 shows the data matrix for the model. The 10 columns represent a subset of all variables (i.e., specific questions in the survey as described in the previous section) that have been surveyed, and the 344 rows (i.e., cases) contain the answers of every respondent to these questions. For example, the first row contains the answers of Respondent 1 while the last row contains the answers of Respondent 344. The columns show the answers to the survey questions. Data in the first nine columns are for the indicators associated with the three constructs, and the tenth column includes the data for the single indicator for CUSA. The data set contains further variables that relate to, for example, the driver constructs of LIKE and COMP. We will cover these aspects in Chapter 5. If you are using a data set in which a respondent did not answer a specific question, you need to insert a number that does not appear otherwise in the responses to indicate the missing values. Researchers commonly use –99 to indicate missing values, but you can use any other value that does not normally occur in the data set. In the following, we will also use –99 to indicate missing values. If, for example, the first data point of comp_1 were a missing value, the –99 value would be inserted as a missing value placeholder instead of the value of 6 that you see in Exhibit A2.4. Missing value treatment procedures (e.g., mean replacement) could then be applied to these data (e.g., Hair, Black, Babin, & Anderson, 2019). Again, if the number of missing values in your data set per indicator is relatively small (i.e., less than 5% missing per indicator), we recommend mean value replacement instead of casewise deletion to treat the missing values when running PLS-SEM. Furthermore, we need to ascertain that the number of missing values per observation and per indicator does not exceed 15%. If this were the case for an observation, the corresponding observation should be eliminated from the data set.
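The missing value rules described above can be written down compactly. The following Python sketch is only an illustration under stated assumptions: the helper names (`missing_share`, `drop_sparse_rows`, `mean_replace`) are hypothetical, the toy values are made up for demonstration and are not taken from the book's data set, and SmartPLS applies its own missing value treatment once the marker has been declared.

```python
MISSING = -99  # marker for missing values, as used in the text

def missing_share(values, missing=MISSING):
    """Fraction of entries that carry the missing-value marker."""
    return sum(1 for v in values if v == missing) / len(values)

def drop_sparse_rows(rows, missing=MISSING, max_share=0.15):
    """Keep only observations whose share of missing values is at most max_share."""
    return [row for row in rows if missing_share(row, missing) <= max_share]

def mean_replace(column, missing=MISSING):
    """Replace marker entries with the mean of the observed values of one indicator."""
    observed = [v for v in column if v != missing]
    mean = sum(observed) / len(observed)
    return [mean if v == missing else v for v in column]

# Toy indicator column (illustrative values only).
toy_column = [6, MISSING, 4, 5]
print(missing_share(toy_column))   # share of missing entries in the toy column
print(mean_replace(toy_column))    # marker replaced by the mean of 6, 4, and 5

# Toy observations with 10 indicators each: one and two missing answers.
toy_rows = [
    [6, 5, 4, 7, 6, 5, 4, 3, 6, MISSING],        # 10% missing, kept
    [6, 5, 4, 7, 6, 5, 4, 3, MISSING, MISSING],  # 20% missing, dropped
]
print(len(drop_sparse_rows(toy_rows)))
```

Note that the toy column has 25% missing values and would therefore not qualify for mean replacement under the 5% rule; it only demonstrates the mechanics of the replacement step.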

Exhibit A2.4 ■ Data Matrix for the Indicator Variables

The data example shown in Exhibit A2.4 (and in the book’s example) has only very few missing values. More precisely, cusa has one missing value (0.29%), cusl_1 and cusl_3 have three missing values each (0.87%), and cusl_2 has four missing values (1.16%). Since the missing values per indicator are less than 5%, mean value replacement can be used. Furthermore, none of the observations and indicators has more than 15% missing values, so we can proceed with analyzing all 344 respondents.
To run outlier diagnostics, we compute a series of box plots using IBM SPSS Statistics—see Chapter 5 in Sarstedt and Mooi (2019) for details on how to run these analyses in IBM SPSS Statistics. The results indicate some influential observations but no outliers. Moreover, nonnormality of data regarding skewness and kurtosis is not an issue. The kurtosis and skewness values of all the indicators are within the –2 and +2 range.
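The skewness and kurtosis screening can also be reproduced outside IBM SPSS Statistics. The sketch below uses the simple moment-based definitions (skewness as the third standardized moment, kurtosis as the fourth standardized moment minus 3). This is an assumption made for brevity: statistics packages such as SPSS typically report sample-adjusted versions, so values can differ slightly, especially in small samples. The function names and the toy responses are hypothetical.

```python
def central_moment(values, order):
    """Average of (x - mean) ** order over all values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Moment-based skewness (no small-sample correction)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

def within_range(values, limit=2.0):
    """True if both skewness and excess kurtosis lie between -limit and +limit."""
    return abs(skewness(values)) <= limit and abs(excess_kurtosis(values)) <= limit

# Toy responses on a 7-point scale (illustrative values only).
toy_item = [1, 2, 3, 4, 5, 6, 7]
print(skewness(toy_item), excess_kurtosis(toy_item), within_range(toy_item))
```

Applying `within_range` to every indicator column gives a quick pass or fail screen against the –2 to +2 guideline mentioned above.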

PATH MODEL CREATION USING THE SMARTPLS SOFTWARE

The SmartPLS 4 software (Ringle, Wende, & Becker, 2022) is used to execute all the PLS-SEM analyses in this book. The discussion includes an overview of the software’s functionalities. The student version of the software is available free of charge at https://www.smartpls.com. The student version offers practically all functionalities of the full version but is restricted to data sets with a maximum of 100 observations. However, as the data set used in this book has more than 100 observations (344 to be precise), you should use the professional version of SmartPLS, which is available as a 30-day trial version at https://www.smartpls.com. After the trial period, a license fee applies. Licenses are available for different periods of time (e.g., 1 month, 1 year, 2 years, etc.) and can be purchased through the SmartPLS website. The SmartPLS website includes a download area and many additional resources such as short explanations of PLS-SEM and software-related topics, a list of recommended literature, answers to frequently asked questions, tutorial videos for getting started using the software, and the SmartPLS forum, which allows you to discuss PLS-SEM topics with other users. SmartPLS has a graphical user interface that enables the user to estimate the PLS path model. Exhibit A2.8 at the end of this section shows the graphical interface for the SmartPLS software, with the simple model already drawn. In the following paragraphs, we describe how to set up this model using the SmartPLS software. Before you draw your model, you need to have data that serve as the basis for running the model. SmartPLS 4 supports data imported from various file formats, such as Microsoft Excel (.xls or .xlsx), SPSS (.sav), comma-separated values (.csv), and text (.txt).
The only aspect we have to pay attention to is that the first row contains the variable names in text format and that all other rows contain only numerical values (no text or special characters, and no numerical values in scientific notation, e.g., 10 E-7). For example, SmartPLS interprets single dots (such as those produced by IBM SPSS Statistics when an observation has a system-missing value) as string elements.
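These formatting requirements can be checked before importing a file. The following Python sketch is one possible pre-import check, not a SmartPLS feature: the function name `find_import_problems` is hypothetical, a comma delimiter and a decimal point are assumed, and the sample strings are made up for illustration. It flags numeric header cells and any data cell that is not a plain number, which covers text, single dots, and scientific notation.

```python
import csv
import io
import re

# Plain integer or decimal number, optionally negative (e.g., 6, -99, 4.5).
PLAIN_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

def find_import_problems(text, delimiter=","):
    """Return a list of human-readable problems found in delimited text data."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ["file is empty"]
    header, data = rows[0], rows[1:]
    problems = []
    for name in header:
        if PLAIN_NUMBER.fullmatch(name.strip()):
            problems.append(f"header '{name}' is numeric, expected a variable name")
    for line_no, row in enumerate(data, start=2):
        for name, cell in zip(header, row):
            if not PLAIN_NUMBER.fullmatch(cell.strip()):
                problems.append(f"line {line_no}, {name}: '{cell}' is not a plain number")
    return problems

# Illustrative inputs: a clean file and one with a single dot and scientific notation.
clean_text = "comp_1,cusa\n6,7\n-99,5\n"
faulty_text = "comp_1,cusa\n6,.\n1E-7,5\n"

print(find_import_problems(clean_text))
for problem in find_import_problems(faulty_text):
    print(problem)
```

To check a file on disk, its contents can be read with `open(path, encoding="utf-8").read()` and passed to the function; the appropriate encoding and delimiter depend on how the file was exported.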
The data we will use with the reputation model can be downloaded either as comma-separated value (.csv) or text (.txt) data sets in the download section of this book’s webpage at the following URL: https://zalo.me/g/nvefyh313. Now run the SmartPLS software by clicking on the desktop icon that is available after the software installation on your computer. Alternatively, go to the folder where you installed the SmartPLS software on your computer. Click on the icon that runs SmartPLS to start the software.

Welcome to our guide on creating your first regression model using SmartPLS! In this tutorial, we will guide you through the following steps:

  1. Selecting a SmartPLS workspace folder
  2. Creating a new project and importing a dataset

Throughout the tutorial, we will be using SmartPLS to perform a regression analysis and visualize the results. Make sure you have downloaded and installed the software before beginning. Let’s get started!

Selecting a SmartPLS workspace folder

When you open SmartPLS for the first time, you will be prompted to Select a Workspace folder. This folder serves as the default location for storing all of your SmartPLS projects.
