---
title: "Go/No Go - D-prime"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{gng_dprime}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  message = FALSE,
  warning = FALSE,
  comment = "#>"
)
```

```{r setup, message = FALSE, warning = FALSE}
library(splithalfr)
```

This vignette describes d-prime, a scoring method examined by [Miller (1996)](https://doi.org/10.3758/BF03205476).

# Dataset

Load the included Go/No Go dataset and inspect its documentation.

```
data("ds_gng", package = "splithalfr")
?ds_gng
```

## Relevant variables

The columns used in this example are:

* condition. 0 = go, 2 = no go
* response. Correct (1) or incorrect (0)
* rt. Reaction time (seconds)
* participant. Participant ID

## Counterbalancing

The variables `condition` and `stim` were counterbalanced. Below we illustrate this for the first participant.

```
ds_1 <- subset(ds_gng, participant == 1)
table(ds_1$condition, ds_1$stim)
```

# Scoring the Go/No Go

## Scoring function

The scoring function receives the data from a single participant. It converts the proportions of hits and false alarms to z-scores (quantiles of the standard normal distribution) and returns their difference. Extreme proportions (0 or 1) are adjusted via the log-linear approach ([Hautus, 1995](https://doi.org/10.3758/BF03203619)).

```
fn_score <- function(ds) {
  # Go trials: correct responses are hits, incorrect responses are misses
  n_hit <- sum(ds$condition == 0 & ds$response == 1)
  n_miss <- sum(ds$condition == 0 & ds$response == 0)
  # No-go trials: correct responses are correct rejections, incorrect ones are false alarms
  n_cr <- sum(ds$condition == 2 & ds$response == 1)
  n_fa <- sum(ds$condition == 2 & ds$response == 0)
  # Log-linear adjustment: add 0.5 to each count, which adds 1 to each trial total
  p_hit <- (n_hit + 0.5) / (n_hit + n_miss + 1)
  p_fa <- (n_fa + 0.5) / (n_fa + n_cr + 1)
  return(qnorm(p_hit) - qnorm(p_fa))
}
```

## Scoring a single participant

Let's calculate the d-prime score for the participant with participant ID 1.

```
fn_score(subset(ds_gng, participant == 1))
```

## Scoring all participants

To calculate the d-prime score for each participant, we will use R's native `by` function and convert the result to a data frame.

```
scores <- by(
  ds_gng,
  ds_gng$participant,
  fn_score
)
data.frame(
  participant = names(scores),
  score = as.vector(scores)
)
```

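## Effect of the log-linear adjustment

The adjustment in the scoring function matters most when a participant responds perfectly in one condition. The sketch below uses made-up counts (they are not taken from `ds_gng`) and assumes the log-linear rule as it is usually stated: add 0.5 to each count, which adds 1 to each trial total. It is meant as an illustration only, using base R.

```r
# Hypothetical participant: 20 go trials (all hits) and
# 10 no-go trials (9 correct rejections, 1 false alarm).
n_hit <- 20
n_miss <- 0
n_cr <- 9
n_fa <- 1

# Unadjusted rates: a hit rate of exactly 1 maps to an infinite z-score,
# so the unadjusted d-prime is infinite as well.
d_raw <- qnorm(n_hit / (n_hit + n_miss)) - qnorm(n_fa / (n_fa + n_cr))

# Log-linear adjustment (assumed form): 0.5 added to each count.
p_hit <- (n_hit + 0.5) / (n_hit + n_miss + 1)
p_fa <- (n_fa + 0.5) / (n_fa + n_cr + 1)
d_adj <- qnorm(p_hit) - qnorm(p_fa)

is.infinite(d_raw)
is.finite(d_adj)
```

With the adjustment, the score stays finite, so participants with perfect performance can still be included when correlating split scores.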
# Estimating split-half reliability

## Calculating split scores

To calculate split-half scores for each participant, use the function `by_split`. The first three arguments of this function are the same as for `by`. An additional set of arguments allows you to specify how to split the data and how often. In this vignette we will calculate scores of 1000 permuted splits. The trial properties `condition` and `stim` were counterbalanced in the Go/No Go design, so we will stratify splits by these trial properties. See the vignette on splitting methods for more ways to split the data.

The `by_split` function returns a data frame with the following columns:

* `participant`, which identifies participants
* `replication`, which counts replications
* `score_1` and `score_2`, which are the scores calculated for each of the split datasets

*Calculating the split scores may take a while. By default, `by_split` uses all available CPU cores, but no progress bar is displayed. Setting `ncores = 1` will display a progress bar, but processing will be slower.*

```
split_scores <- by_split(
  ds_gng,
  ds_gng$participant,
  fn_score,
  replications = 1000,
  stratification = paste(ds_gng$condition, ds_gng$stim)
)
```

## Calculating reliability coefficients

Next, the output of `by_split` can be analyzed to estimate reliability. The package provides functions that calculate Spearman-Brown adjusted Pearson correlations (`spearman_brown`), Flanagan-Rulon (`flanagan_rulon`), Angoff-Feldt (`angoff_feldt`), and Intraclass Correlation (`short_icc`) coefficients. Each of these coefficient functions can be used with `split_coefs` to calculate the corresponding coefficient per split; the resulting coefficients can then be plotted or averaged via a simple `mean`. A bias-corrected and accelerated bootstrap confidence interval can be calculated via `split_ci`. Note that estimating the confidence interval is computationally intensive, so it can take a long time to complete.

```
# Spearman-Brown adjusted Pearson correlations per replication
coefs <- split_coefs(split_scores, spearman_brown)
# Distribution of coefficients
hist(coefs)
# Mean of coefficients
mean(coefs)
# Confidence interval of coefficients
split_ci(split_scores, spearman_brown)
```
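
## Spearman-Brown adjustment by hand

To show what a Spearman-Brown adjusted Pearson correlation is for a single replication, here is a minimal base R sketch. The data frame `toy` and its values are hypothetical; it only mimics the `score_1` and `score_2` columns returned by `by_split`. The formula is the textbook Spearman-Brown formula for doubling test length, and the package's `spearman_brown` function may treat edge cases (such as negative correlations) differently.

```r
# Hypothetical split scores for five participants in one replication
toy <- data.frame(
  participant = 1:5,
  score_1 = c(1.2, 2.0, 0.8, 2.6, 1.5),
  score_2 = c(1.0, 2.3, 1.1, 2.2, 1.4)
)

# Pearson correlation between the two half-scores
r <- cor(toy$score_1, toy$score_2)

# Textbook Spearman-Brown formula: estimated reliability of the full-length task
r_sb <- 2 * r / (1 + r)

c(pearson = r, spearman_brown = r_sb)
```

For a correlation between 0 and 1, the adjusted coefficient is always larger than the raw correlation, because each half contains only half of the trials.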