WPLSC¶
- class pyplsc.WPLSC(boot_stat='score-covariate-corr', svd_method='lapack', random_state=None)¶
Within-participants PLSC (Roberts et al., 2016). Used for analyzing within-participants correlations. Cross-correlation matrices are computed within participants, averaged, and submitted to singular value decomposition.
- Parameters:
boot_stat (str, optional) –
Name of statistic to recompute on each bootstrap resample to get a confidence interval. Must be one of:
'score-covariate-corr' (default): Correlations between covariates and data scores (i.e., the output of transform()), computed within participants and averaged. Covariates and data may be original or resampled, but scores are always computed by multiplying data by data_sals_ (i.e., the saliences from the initial decomposition). This is what is computed in the original Matlab version of PLS.
'condwise-scores': Condition-wise average data (original or resampled) multiplied by data_sals_, computed within participants and averaged.
svd_method (str, optional) –
Method to use for singular value decomposition. Must be one of:
'lapack' (default): use numpy.linalg.svd.
'randomized': use sklearn.utils.extmath.randomized_svd.
random_state (int, optional) – Random state of the model, for reproducible permutation and bootstrap resampling. Passed to
numpy.random.default_rng internally. Default is None.
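The pipeline described above (per-participant cross-correlation matrices, averaged, then decomposed) can be sketched with plain NumPy. This is an illustration under stated assumptions, not pyplsc's implementation: the helper `cross_corr` and all variable names are hypothetical, and details such as condition handling and weighting are left out.

```python
import numpy as np

def cross_corr(x, y):
    """Pearson correlation of each column of x with each column of y (rows are trials)."""
    xz = (x - x.mean(axis=0)) / x.std(axis=0)
    yz = (y - y.mean(axis=0)) / y.std(axis=0)
    return xz.T @ yz / x.shape[0]

rng = np.random.default_rng(0)
n_trials_per_ptpt = [10, 12, 9]
n_vars = 6
data = [rng.normal(size=(n, n_vars)) for n in n_trials_per_ptpt]
covs = [rng.normal(size=(n, 2)) for n in n_trials_per_ptpt]

# One (n. covariates, n. observed vars) matrix per participant
ptpt_mats = [cross_corr(c, d) for c, d in zip(covs, data)]

# Average across participants, then decompose the averaged matrix
mean_mat = np.mean(ptpt_mats, axis=0)
design_sals, singular_vals, data_sals_t = np.linalg.svd(mean_mat, full_matrices=False)
data_sals = data_sals_t.T  # shape (n. observed vars, n. latent vars)
```

Here the left singular vectors play the role of design saliences and the right singular vectors the role of data saliences, matching the shapes documented for design_sals_ and data_sals_.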
- bootstrap(n_boot=5000, confint_level=0.95, alignment_method='rotate-design-sals', return_boot_stat_dist=True, n_jobs=1, print_prog=True)¶
Perform (stratified) bootstrap resampling to assess the reliability of the data saliences.
- Parameters:
n_boot (int, optional) – Number of bootstrap resamples to compute. The default is 5000.
confint_level (float, optional) – The confidence level of the quantile-based confidence intervals to compute. The default is 0.95.
alignment_method (string, optional) –
Method used to align recomputed data saliences with the original data saliences. Must be one of:
'rotate-design-sals' (default): Find the rotation that solves the orthogonal Procrustes problem to align the recomputed and original design saliences, then apply this rotation to the recomputed data saliences. This is what is computed in the original Matlab version of PLS.
'rotate-data-sals': Find the rotation that solves the orthogonal Procrustes problem to align the recomputed and original data saliences, then apply this rotation to the recomputed data saliences.
'flip-design-sals': Find the set of sign flips that ensures the inner products of the recomputed and original design saliences are positive, then apply these sign flips to the recomputed data saliences.
'flip-data-sals': Find the set of sign flips that ensures the inner products of the recomputed and original data saliences are positive, then apply these sign flips to the recomputed data saliences.
'none': Perform no alignment.
return_boot_stat_dist (bool, optional) – If True, the distribution of boot_stat from resampling is returned. This is the distribution used to compute quantile-based confidence intervals. Default is True.
n_jobs (int, optional) – Number of parallel jobs to deploy to compute bootstrap resamples. -1 automatically deploys the maximum number of jobs. The default is 1.
print_prog (bool, optional) – Specifies whether to display a progress bar. Default is
True.
- Returns:
design_resampled – If return_boot_stat_dist is True, the bootstrap distribution of the statistic named by boot_stat.
- Return type:
numpy.ndarray
Examples
>>> mod.bootstrap(1000, n_jobs=-1)
>>> print(mod.data_sals_z_)
>>> print(mod.boot_stat_ci_[..., 0])  # Print CI of boot_stat for first LV
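To illustrate the idea behind the 'rotate-design-sals' alignment, the sketch below solves the orthogonal Procrustes problem with NumPy and applies the resulting rotation to the data saliences. It is an assumption-laden illustration rather than pyplsc's code: the helper `procrustes_rotation` and the simulated saliences are hypothetical.

```python
import numpy as np

def procrustes_rotation(resampled, original):
    """Orthogonal R minimising ||resampled @ R - original|| (Frobenius norm)."""
    u, _, vt = np.linalg.svd(resampled.T @ original)
    return u @ vt

rng = np.random.default_rng(1)
# Hypothetical original saliences with orthonormal columns
orig_design_sals = np.linalg.qr(rng.normal(size=(4, 3)))[0]
orig_data_sals = np.linalg.qr(rng.normal(size=(8, 3)))[0]

# Pretend a resample returned the same solution with LVs 0 and 1 swapped
# and LV 2 sign-flipped
scramble = np.array([[0.0, 1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 0.0, -1.0]])
boot_design_sals = orig_design_sals @ scramble
boot_data_sals = orig_data_sals @ scramble

# Estimate the rotation from the design saliences, apply it to the data saliences
rotation = procrustes_rotation(boot_design_sals, orig_design_sals)
aligned_data_sals = boot_data_sals @ rotation
```

In this toy case the rotation undoes both the reordering and the sign flip, which is why rotation-based alignment is more general than sign flipping alone.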
- fit(data, covariates, design=None, within=None, participant=None, weighted=False)¶
Fit a within-participants PLSC model.
- Parameters:
data (list) – List of participant-specific data arrays. Each should be a numpy.ndarray of shape (n. trials, n. observed vars).
covariates (list or str) – List of participant-specific covariates (in which case each list element must be a valid covariates argument to PLSC.fit), or the names of the columns in design that contain the covariates.
design (list, optional) – List of participant-specific design matrices. Each list element must be a valid design argument to PLSC.fit. The default is None.
within (list or str, optional) – List of participant-specific indicators of within-participant condition (in which case each list element must be a valid between argument to PLSC.fit), or the names of the columns in design that contain the within-participant condition indicators.
participant (list, optional) – A list of participant identifiers (integers or strings).
weighted (bool, optional) – Specifies whether participant-level cross-covariance matrices should be weighted by number of trials when averaged together. Default is False.
- Return type:
None
Examples
>>> import numpy as np
>>> import pyplsc
>>> # Simulate null data
>>> n_var = 10
>>> ptptwise_n_trials = [10, 10, 9, 8, 12]
>>> data = [np.random.normal(size=(n_trials, n_var)) for n_trials in ptptwise_n_trials]
>>> covs = [np.random.normal(size=(n_trials, 1)) for n_trials in ptptwise_n_trials]
>>> # Fit model
>>> mod = pyplsc.WPLSC()
>>> mod.fit(data=data, covariates=covs, weighted=True)
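The effect of the weighted option can be pictured as the difference between a plain mean and a trial-count-weighted mean of the participant-level matrices. The following is a minimal sketch under that reading; the random matrices are hypothetical stand-ins, and pyplsc's internal computation may differ.

```python
import numpy as np

rng = np.random.default_rng(7)
n_trials = np.array([10, 10, 9, 8, 12])

# Hypothetical per-participant matrices, each (n. covariates, n. observed vars)
ptpt_mats = np.stack([rng.normal(size=(1, 10)) for _ in n_trials])

# weighted=False: every participant contributes equally
unweighted_mean = ptpt_mats.mean(axis=0)

# weighted=True: participants with more trials contribute more
weighted_mean = np.average(ptpt_mats, axis=0, weights=n_trials)
```

When all participants have the same number of trials, the two averages coincide.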
- flip_signs(lv_idx=None)¶
Flips the signs of one or more latent variables, to aid with interpretation.
- Parameters:
lv_idx (indexer) – The index or indices of latent variables whose signs should be flipped. If None (default), signs are flipped for all latent variables.
Examples
>>> mod.flip_signs()  # Flip all signs
>>> mod.flip_signs(0)  # Flip signs for the first latent variable
>>> mod.flip_signs([0, 1])  # Flip signs for the first two latent variables
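Sign flips are safe for interpretation because the sign of a singular vector pair is arbitrary. The sketch below shows this with NumPy, assuming that flipping a latent variable means negating both its design and data saliences; that assumption is not confirmed by the text above, and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(4)
mat = rng.normal(size=(3, 7))
design_sals, singular_vals, data_sals_t = np.linalg.svd(mat, full_matrices=False)
data_sals = data_sals_t.T

# Flip the first latent variable in both sets of saliences
lv_idx = [0]
design_sals[:, lv_idx] *= -1
data_sals[:, lv_idx] *= -1

# The decomposition still reproduces the original matrix
reconstructed = design_sals @ np.diag(singular_vals) @ data_sals.T
```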
- get_boot_stat_frame(lv_idx=None)¶
Get boot_stat as a dataframe, including upper and lower confidence limits if bootstrap resampling has been done.
- Parameters:
lv_idx (indexer, optional) – Index of latent variable the dataframe should cover. The default is None, which yields a dataframe covering all latent variables.
- Returns:
df – boot_stat as a dataframe.
- Return type:
pandas.DataFrame
- get_boot_stat_yerr(lv_idx)¶
Get yerr for the statistic named by boot_stat that can be passed to a matplotlib bar plot.
- Parameters:
lv_idx (int) – Integer indexing the latent variable of interest.
- Returns:
yerr – 2D array with shape (2, n. design saliences) that can be passed to matplotlib's pyplot.bar() as the yerr= argument.
- Return type:
numpy.ndarray
Examples
>>> import matplotlib.pyplot as plt
>>> # Make bar plot of boot_stat
>>> x = mod.design_sal_labels_['between']
>>> lv_idx = 0  # First latent variable
>>> height = mod.boot_stat_val_[:, lv_idx]
>>> yerr = mod.get_boot_stat_yerr(lv_idx)
>>> plt.bar(x=x, height=height, yerr=yerr)
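Matplotlib expects asymmetric error bars as distances from the bar height, not as absolute limits. The following sketch shows the conversion from confidence limits to a (2, n) yerr array; the numbers are made up, and the layout of boot_stat_ci_ inside pyplsc is not assumed here.

```python
import numpy as np

# Hypothetical point estimates and confidence limits for three design saliences
point_est = np.array([0.30, -0.10, 0.45])
ci_lower = np.array([0.10, -0.35, 0.20])
ci_upper = np.array([0.55, 0.05, 0.60])

# Row 0: distance down to the lower limit; row 1: distance up to the upper limit
yerr = np.vstack([point_est - ci_lower, ci_upper - point_est])
```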
- get_scores_frame(lv_idx=None)¶
Get dataframe containing design and data scores for each trial.
- Parameters:
lv_idx (indexer, optional) – Index of latent variable(s) for which to include design and data scores. The default is None, which includes scores for all latent variables.
- Returns:
df – Dataframe containing design and data scores for each trial.
- Return type:
pandas.DataFrame
Notes
Data is in long format, with a column specifying the latent variable corresponding to each score.
Examples
>>> mod.get_scores_frame().to_csv('scores.csv')
- permute(n_perm=5000, return_null_dist=True, n_jobs=1, print_prog=True)¶
Perform permutation testing to assess the significance of the latent variables. After running this method, p values are available through the pvals_ property.
- Parameters:
n_perm (int, optional) – Number of permutations to perform. The default is 5000.
return_null_dist (bool, optional) – If True, permutation samples will be returned as a 2D (n. perms, n. latent vars) array. Default is True.
n_jobs (int, optional) – Number of parallel jobs to deploy to compute permutations. -1 automatically deploys the maximum number of jobs. The default is 1.
print_prog (bool, optional) – Specifies whether to display a progress bar. Default is
True.
- Returns:
null_dist – 2D array containing the null distribution of singular values, where each row is a different permutation and each column is a different singular value.
- Return type:
numpy.ndarray
Examples
>>> mod.permute(n_perm=1000, n_jobs=-1)
>>> print(mod.pvals_)
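One common way to turn a null distribution of singular values into p values is to count how often the permuted values meet or exceed the observed ones. The sketch below uses that convention with a +1 correction; whether pyplsc uses exactly this formula is an assumption, and the simulated arrays are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_perm, n_lv = 1000, 3

# Hypothetical null distribution, shape (n. perms, n. latent vars)
null_dist = rng.normal(loc=1.0, scale=0.2, size=(n_perm, n_lv))
# Hypothetical observed singular values
observed = np.array([2.5, 1.1, 0.4])

# Proportion of permutations at least as extreme as the observed value
pvals = (np.sum(null_dist >= observed, axis=0) + 1) / (n_perm + 1)
```

With this convention the smallest attainable p value is 1 / (n_perm + 1), which is one reason to use a large number of permutations.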
- transform(data=None, lv_idx=None)¶
Compute data scores, i.e., coordinates of array data in the new basis defined by the latent variables, by multiplying a data array by the data saliences (the data_sals_ property).
- Parameters:
data (numpy.ndarray, optional) – Data to transform. The default is None, which yields scores for the data on which the model was fit (the data_ property).
lv_idx (indexer, optional) – Index of latent variable(s) for which to compute scores. Default is None, which computes scores for all latent variables.
- Returns:
data_scores – A 2D array of scores where rows correspond to different observations and columns correspond to different latent variables.
- Return type:
numpy.ndarray
Examples
>>> scores = mod.transform()  # Get scores for data used to fit model
>>> scores = mod.transform(new_data)  # Get scores for new data
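The multiplication described above can be sketched in NumPy as follows. This is a minimal illustration with hypothetical arrays; any preprocessing that pyplsc may apply before the multiplication is not shown.

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(size=(20, 6))  # (n. observations, n. observed vars)
# Hypothetical saliences with orthonormal columns, (n. observed vars, n. latent vars)
data_sals = np.linalg.qr(rng.normal(size=(6, 2)))[0]

# Scores for all latent variables: one row per observation, one column per LV
scores = data @ data_sals

# Scores for the first latent variable only (analogous to lv_idx=[0])
scores_lv0 = data @ data_sals[:, [0]]
```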
- boot_stat¶
Name of statistic whose distribution is derived during bootstrap resampling.
- Type:
str
- boot_stat_ci_¶
Confidence interval on the statistic named by boot_stat, derived from bootstrap resampling. The CI level is determined by confint_level_. Set by bootstrap().
- Type:
numpy.ndarray
- boot_stat_val_¶
Point estimate, from the initial decomposition, of the statistic whose distribution is derived during bootstrap() resampling. Set by fit().
- Type:
numpy.ndarray
- confint_level_¶
Level of the confidence interval on the statistic named by boot_stat to derive during bootstrap resampling (e.g., 0.95). Set by bootstrap().
- Type:
float
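A quantile-based confidence interval at a given level takes the central portion of the bootstrap distribution. The sketch below shows how a level such as 0.95 maps to the 2.5th and 97.5th percentiles; the bootstrap distribution here is simulated and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
confint_level = 0.95

# Hypothetical bootstrap distribution, shape (n. resamples, n. statistics)
boot_dist = rng.normal(loc=0.3, scale=0.1, size=(2000, 4))

# Split the excluded probability mass equally between the two tails
alpha = 1 - confint_level
lower, upper = np.quantile(boot_dist, [alpha / 2, 1 - alpha / 2], axis=0)
```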
- data_sals_¶
Right saliences/singular vectors used to compute data scores. Shape (n. observed vars, n. latent vars). Set by fit().
- Type:
numpy.ndarray
- data_sals_std_¶
Standard deviations of data saliences (data_sals_) as estimated during bootstrap resampling. Set by bootstrap().
- Type:
numpy.ndarray
- data_sals_z_¶
Data saliences (data_sals_) divided by their standard deviations (data_sals_std_) as estimated during bootstrap resampling. Set by bootstrap().
- Type:
numpy.ndarray
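The ratio described for data_sals_z_ can be sketched as an elementwise division of the original saliences by their bootstrap standard deviations. The arrays below are simulated, and the choice of ddof=1 is an assumption rather than something stated by pyplsc.

```python
import numpy as np

rng = np.random.default_rng(5)
n_boot, n_vars, n_lv = 500, 6, 2

# Hypothetical original saliences and their (already aligned) bootstrap replicates
data_sals = rng.normal(size=(n_vars, n_lv))
boot_sals = data_sals + rng.normal(scale=0.1, size=(n_boot, n_vars, n_lv))

# Standard deviation across resamples, then salience divided by its standard deviation
data_sals_std = boot_sals.std(axis=0, ddof=1)
data_sals_z = data_sals / data_sals_std
```

Large absolute values indicate saliences that are stable across resamples relative to their variability.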
- design_sals_¶
Left saliences/singular vectors used to compute design scores. Shape (n. design saliences, n. latent vars). Set by fit().
- Type:
numpy.ndarray
- n_boot_¶
Number of bootstrap resamples used. Set by bootstrap().
- Type:
int
- n_sv_¶
Number of singular values, i.e., the number of latent variable pairs in the model. Set by fit().
- Type:
int
- random_state¶
Random state for reproducible permutation and bootstrap resampling.
- Type:
int
- singular_vals_¶
Singular values from the decomposition of the mean-centred data. Set by fit().
- Type:
numpy.ndarray
- svd_method¶
Method used for singular value decomposition.
- Type:
str