This R package implements the cross-lagged panel model with fixed effects described by Allison, Williams, and Moral-Benito (2017). It is effectively a convenience wrapper to the `lavaan`

package. This package will reshape your data, specify the model properly, and fit it with `lavaan`

.

Note: This is ALPHA software. Expect bugs and missing functionality. Cross-reference all results with xtdpdml for Stata. Go to https://www3.nd.edu/~rwilliam/dynamic/ to learn about xtdpdml and the underlying method. You may also be interested in the article by Paul Allison, Richard Williams, and Enrique Moral-Benito in **Socius**, accessible here

You will need to the `devtools`

package installed to install this package from Github as well as its companion package, `panelr`

.

```
install.packages("devtools")
devtools::install_github("jacob-long/panelr")
devtools::install_github("jacob-long/clfe")
```

Also note this package’s dependencies: `lavaan`

, `stringr`

This package assumes your data are in *long* format, with each row representing a single observation of a single participant. Contrast this with *wide* format in which each row contains all observations of a single participant. Better compatibility with wide data will be implemented in a future release.

Load the package, use built-in wages data.

This next line of code converts the data to class `panel_data`

, which is a class specific to the `panelr`

that helps to simplify the treatment of the long-form panel data. You don’t have to do this, but it saves you from providing `id`

and `wave`

arguments to the model fitting function each time you use it.

`wages <- panel_data(WageData, id = id, wave = t)`

The formula syntax used in this package is meant to be as similar to a typical regression model as possible.

The most basic model can be specified like any other: `y ~ x`

, where `y`

is the dependent variable and `x`

is a time-varying predictor. If you would like to include time-invariant predictors, you will make the formula consist of two parts, separated with a bar (`|`

) like so: `y ~ x | z`

where z is a time invariant predictor, like ethnicity.

One of the innovations of the method, however, is the notion of predetermined, or sequentially exogenous, predictors. To specify a model with a predetermined variable, put the variable within a `pre`

function, `y ~ pre(x1) + x2 | z`

. This tells the function that `x1`

is predetermined while `x2`

is strictly exogenous by assumption. You could have multiple predetermined predictors as well (e.g., `y ~ pre(x1) + pre(x2) | z`

).

As implied by the “cross-lagged” terminology, you may also fit models with lagged predictors. Simply apply the lag function to the lagged predictors in the formula: `y ~ pre(lag(x1)) + lag(x2) | z`

. To specify more than 1 lag, just provide it as an argument. For instance, `y ~ pre(lag(x1, 2)) + lag(x2) | z`

will use 2 lags of the `x1`

variable.

This will replicate the analysis of the wages data in the *Socius* article that describes these models.

Note that to get matching standard errors, set `information = "observed"`

to override `lavaan`

’s default, `information = "expected"`

.

```
fit <- clfe(wks ~ pre(lag(union)) + lag(lwage) | ed, data = wages,
err.inv = TRUE,
information = "observed")
summary(fit)
```

```
# MODEL INFO
# Dependent variable: wks
# Total observations:
# Complete observations: 595
# Time periods: 2 - 7
#
# MODEL FIT
# Chi-squared (76) = 138.476
# RMSEA = 0.037, 90% CI [0.027, 0.047]
# p(RMSEA < .05) = 0.986
# SRMR = 0.027
#
# Est. S.E. z-value p
# union (t - 1) -1.206 0.522 -2.309 0.021 *
# lwage (t - 1) 0.588 0.488 1.204 0.229
# ed -0.107 0.056 -1.893 0.058 .
# wks (t - 1) 0.188 0.02 9.586 0 ***
#
# Model converged after 579 iterations
```

Any arguments supplied other than those that are documented within the `clfe`

function are passed on to `sem`

from `lavaan`

.

If you just want the `lavaan`

model specification and don’t want this package to fit the model for you, you can set `print.only = TRUE`

.

```
clfe(wks ~ pre(lag(union)) + lag(lwage) | ed, data = wages, err.inv = TRUE,
print.only = TRUE)
```

```
# ## Main regressions
#
# wks_2 ~ en1*union_1 + ex1*lwage_1 + c1*ed + p*wks_1
# wks_3 ~ en1*union_2 + ex1*lwage_2 + c1*ed + p*wks_2
# wks_4 ~ en1*union_3 + ex1*lwage_3 + c1*ed + p*wks_3
# wks_5 ~ en1*union_4 + ex1*lwage_4 + c1*ed + p*wks_4
# wks_6 ~ en1*union_5 + ex1*lwage_5 + c1*ed + p*wks_5
# wks_7 ~ en1*union_6 + ex1*lwage_6 + c1*ed + p*wks_6
#
# ## Alpha latent variable (fixed effects)
#
# alpha =~ 1*wks_2 + 1*wks_3 + 1*wks_4 + 1*wks_5 + 1*wks_6 + 1*wks_7
#
# ## Alpha regression (fixed effects)
#
# alpha ~~ union_1 + union_2 + union_3 + union_4 + union_5 + union_6 + lwage_1 + lwage_2 + lwage_3 + lwage_4 + lwage_5 + lwage_6 + wks_1
#
# ## Correlating DV errors with future values of predetermined predictors
#
# wks_5 ~~ union_6
# wks_4 ~~ union_5 + union_6
# wks_3 ~~ union_4 + union_5 + union_6
# wks_2 ~~ union_3 + union_4 + union_5 + union_6
#
# ## Predetermined predictors covariances
#
# union_1 ~~ lwage_1 + lwage_2 + lwage_3 + lwage_4 + lwage_5 + lwage_6 + ed + wks_1
#
# union_2 ~~ union_1 + lwage_1 + lwage_2 + lwage_3 + lwage_4 + lwage_5 + lwage_6 + ed + wks_1
#
# union_3 ~~ union_2 + union_1 + lwage_1 + lwage_2 + lwage_3 + lwage_4 + lwage_5 + lwage_6 + ed + wks_1
#
# union_4 ~~ union_3 + union_2 + union_1 + lwage_1 + lwage_2 + lwage_3 + lwage_4 + lwage_5 + lwage_6 + ed + wks_1
#
# union_5 ~~ union_4 + union_3 + union_2 + union_1 + lwage_1 + lwage_2 + lwage_3 + lwage_4 + lwage_5 + lwage_6 + ed + wks_1
#
# union_6 ~~ union_5 + union_4 + union_3 + union_2 + union_1 + lwage_1 + lwage_2 + lwage_3 + lwage_4 + lwage_5 + lwage_6 + ed + wks_1
#
#
# ## Exogenous (varying and invariant) predictors covariances
#
# lwage_1 ~~ ed + wks_1
# lwage_2 ~~ lwage_1 + ed + wks_1
# lwage_3 ~~ lwage_2 + lwage_1 + ed + wks_1
# lwage_4 ~~ lwage_3 + lwage_2 + lwage_1 + ed + wks_1
# lwage_5 ~~ lwage_4 + lwage_3 + lwage_2 + lwage_1 + ed + wks_1
# lwage_6 ~~ lwage_5 + lwage_4 + lwage_3 + lwage_2 + lwage_1 + ed + wks_1
#
# ed ~~ wks_1
#
# ## Holding DV error variance constant for each wave (optional)
#
# wks_2 ~~ v*wks_2
# wks_3 ~~ v*wks_3
# wks_4 ~~ v*wks_4
# wks_5 ~~ v*wks_5
# wks_6 ~~ v*wks_6
# wks_7 ~~ v*wks_7
```

Alternately, you can extract the `lavaan`

model syntax and wide-formatted data from the fitted model object to do your own fitting.

```
# ed id union_1 wks_1 lwage_1 union_2 wks_2 lwage_2 union_3 wks_3 lwage_3
# 1 9 1 0 32 5.56068 0 43 5.72031 0 40 5.99645
# 8 11 2 0 34 6.16331 0 27 6.21461 1 33 6.26340
# 15 12 3 1 50 5.65249 1 51 6.43615 1 50 6.54822
# 22 10 4 0 52 6.15698 0 46 6.23832 0 46 6.30079
# 29 16 5 1 50 6.43775 1 46 6.62007 1 40 6.63332
# 36 12 6 0 44 6.90575 0 47 6.90575 0 47 6.90776
# union_4 wks_4 lwage_4 union_5 wks_5 lwage_5 union_6 wks_6 lwage_6
# 1 0 39 5.99645 0 42 6.06146 0 35 6.17379
# 8 0 30 6.54391 0 30 6.69703 0 37 6.79122
# 15 1 52 6.60259 1 52 6.69580 1 52 6.77878
# 22 0 49 6.35957 0 44 6.46925 0 52 6.56244
# 29 0 50 6.98286 0 47 7.04752 0 47 7.31322
# 36 0 47 7.00307 0 44 7.06902 0 45 7.52023
# union_7 wks_7 lwage_7
# 1 0 32 6.24417
# 8 0 30 6.81564
# 15 1 46 6.86066
# 22 0 46 6.62141
# 29 0 49 7.29574
# 36 0 47 7.33889
```

The `lavaan`

model specification is stored as `fit$mod_string`

. Tip: To print the `mod_string`

to your console, don’t use the `print`

function, use the `cat`

function because it will format the line breaks appropriately.

You can also get the fitted `lavaan`

model object at `fit$fit`

.

`lavaan`

summaryWhile you could extract the `lavaan`

model and apply any of `lavaan`

’s functions to it (and you should!), as a convenience you can use `lav_summary`

to get `lavaan`

’s summary of the model.

Take advantage of `lavaan`

’s missing data handling by using the `missing = "fiml"`

argument, which is passed to `sem`

.

- CFI/TLI fit measures are much different than Stata’s and consistently more optimistic. For now, they are not printed with the summary because they are probably misleading.
- You cannot use multiple lags of the same predictor (e.g.,
`y ~ x + lag(x)`

). - The function does not yet support input data that is already in wide format.
- You cannot apply arbitrary functions to variables in the formula like you can with regression models. For instance, a specification like
`y ~ scale(x)`

will cause an error.

The following `xtdpdml`

(Stata) options are not implemented:

- xfree
- yfree
- re
- ylag
- std

Allison, P. D., Williams, R., & Moral-Benito, E. (2017). Maximum likelihood for cross-lagged panel models with fixed effects. *Socius*, *3*, 1-17.