5. Advanced moves

Learn how we code experiments and analyze data in the lab

By the end of this section, you should:

  • Know how to work with lab project files in docker containers

  • Know how to add a new experiment to an existing project

  • Know how to push your changes to the project's GitHub repository

  • Know how to pull your changes onto the lab's server

  • Know the general steps to analyze data for a project

Prerequisites

  • To work with lab code, you'll need to install Docker Desktop on your computer.

  • You'll also need a text editor (e.g. Atom, Sublime Text, or Notepad++). My favorite is VSCode, if you are looking for something new!

  • Finally, you'll need to feel at least a little bit comfortable with the terminal (the command line, for Windows users) and with git/GitHub. If these tools are completely new to you, here are some resources to help you get up to speed:

    • terminal for beginners (for Mac users)

      • You could refer to this tutorial to get up to speed. If you've used any of these tools before, even just a little, you should be able to do this 😊.

Note that not all lab members will need to complete this part of the training. Check with Ariel if you aren't sure whether you should do this section.

Also note that this part is significantly more difficult than the rest of the onboarding tutorial. It is normal to run into a lot more trouble here! Ask questions in Twist on the #Lab Help channel.

Step 1: Clone the project

To work on lab projects, you'll need to clone the project's GitHub repo onto your local computer and take a few configuration steps. We'll walk you through how to do this with our training project.

Clone the training repo

Navigate to the folder where you want the project to live on your local computer. I have a folder called github on my computer where I keep all of my GitHub repositories.

cd folderonyourlocalmachine

Then, clone the project's GitHub repository. Here you are cloning the training project's repository.

git clone https://github.com/pennchildlanglab/training-project.git

Configure your environment

Now that the project is cloned, enter the newly created training-project folder and then make a copy of the .env.sample file, calling the new file .env.

cd training-project
cp .env.sample .env
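
If you ever repeat this setup in a project you have already configured, a plain cp would replace your existing .env, including any passwords you set in it. Here is a minimal sketch of a more careful copy, assuming a bash-like shell. The function name make_env_file is made up for this example and is not part of the lab's tooling.

```shell
# make_env_file: hypothetical helper, not part of the lab's tooling.
# Copies .env.sample to .env only when .env does not exist yet,
# so an existing .env (and any passwords in it) is never overwritten.
make_env_file() {
  if [ ! -f .env.sample ]; then
    echo "no .env.sample here; are you inside the project folder?" >&2
    return 1
  fi
  if [ -f .env ]; then
    echo ".env already exists; leaving it alone"
    return 0
  fi
  cp .env.sample .env
  echo "created .env from .env.sample"
}
```

After pasting the function into your terminal, run make_env_file from inside the training-project folder.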

Start the containers

To work on the code, start the project's local development docker containers by running the following from inside the training-project folder.

docker-compose up -d --build

Docker containers are kind of like virtual machines. We run code and analyses inside containers to make sure things run in exactly the same way on all different computers.

When you want to work on lab code or do analyses, you'll start the containers in this way. Docker Desktop comes with a handy GUI that helps you see which containers you have running at any given time. Open Docker Desktop to see your active containers.

Docker Desktop overview of running containers

Once the containers have been created, the following will be available locally:

service     url                       default login / password
webserver   http://localhost:8080/    -
jupyter     http://localhost:8989/    token: password
rstudio     http://localhost:8787/    login: rstudio, password: password

The jupyter and rstudio containers require passwords, which you can set in your .env file. Work you do inside these containers' work folders will be saved locally inside the analyses folder of your project.

Note that you can only have one lab project container running on your machine at a time. If you have another project running, you may get an error like this:

failed: port is already allocated

Stop the other container and try again.
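
The error appears because each container claims a port on your machine, and a port can only be claimed once. If you are not sure what is holding a port, here is a rough sketch that checks the three ports listed above. It assumes you are using bash (it relies on bash's /dev/tcp feature, which plain sh does not have), and the function names are made up for this example.

```shell
# port_in_use: hypothetical helper. Succeeds (exit 0) if something is
# already listening on the given port on your own machine.
# Relies on bash's /dev/tcp redirection, so it needs bash rather than sh.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# check_lab_ports: report whether each of the three lab ports is taken
check_lab_ports() {
  for port in 8080 8989 8787; do
    if port_in_use "$port"; then
      echo "port $port is busy"
    else
      echo "port $port is free"
    fi
  done
}
```

Run check_lab_ports before starting a project. If a port is busy and you did not expect it to be, open Docker Desktop to see which project's containers are still running.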

Stop the containers

When you are all finished working on a project for the day, you can stop the containers by running the following from inside the project's directory.

docker-compose down

Step 2: Create an experiment

A common task is changing an existing experiment or creating a new one. When you create a new experiment, you need to do two things in the project repository on your local machine: (1) create the experiment and (2) modify the config-exbuilder.json file.

Create a new experiment

Practice by creating a new experiment in the training-project. In your local copy of the repo (from Step 1), make a copy of the experiment1-jspych experiment inside the experiments folder. Name it experimentN-yourfirstname, where N is the next available number.
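
You can make the copy in your file browser, or from the terminal. Here is a minimal sketch of the terminal version, assuming you are inside the training-project folder. The function name copy_experiment is made up for this example; it refuses to copy over a folder that already exists, so you can't accidentally clobber someone else's experiment.

```shell
# copy_experiment: hypothetical helper, not part of the lab's tooling.
# Copies one experiment folder to a new name inside experiments/,
# refusing to overwrite a folder that already exists.
# usage: copy_experiment <existing-folder> <new-folder>
copy_experiment() {
  src="experiments/$1"
  dest="experiments/$2"
  if [ ! -d "$src" ]; then
    echo "cannot find $src" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    echo "$dest already exists; pick the next available number" >&2
    return 1
  fi
  cp -R "$src" "$dest"
  echo "copied $src to $dest"
}
```

For example, if 3 is the next available number, you would run: copy_experiment experiment1-jspych experiment3-yourfirstname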

Edit config file

In order for your experiment to appear on the landing page (at http://localhost:8080/ on your local machine if your containers are running), you'll need to change the config-exbuilder.json file to include your experiment. Under experiments, add a block for your experiment, like this:

"experiment3": {
  "name": "experiment 3 jspsych katie's copy",
  "path": "experiment3-katie/experiment.html",
  "conditions": {
    "condition1": {
      "description": "this condition is this",
      "n": 20
    },
    "condition2": {
      "description": "this condition is that",
      "n": 20
    }
  }
}

Change experiment3, name, and path to match your experiment. The name can be anything you like, but the path must be the relative path to your experiment's experiment.html file. You can leave the condition information the same.

A common error people encounter is badly formed JSON. You can use https://jsonlint.com/ to make sure your config code is well formed.
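
If you'd rather check the file without leaving your terminal, here is a minimal sketch that does the same kind of check locally. It assumes python3 is installed on your computer and uses Python's built-in json.tool module. The function name check_json is made up for this example.

```shell
# check_json: hypothetical helper, not part of the lab's tooling.
# Uses Python's built-in json.tool module (assumes python3 is installed)
# to parse a file and report the first syntax error it finds.
check_json() {
  if python3 -m json.tool "$1" > /dev/null; then
    echo "$1 is well-formed JSON"
  else
    echo "$1 has a JSON syntax error (see the message above)" >&2
    return 1
  fi
}
```

Run it as check_json config-exbuilder.json from the folder that contains the file. The usual culprits are a missing comma between two experiment blocks or an extra comma after the last one.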

Edit an experiment

Great! Now you can develop your experiment locally by making changes to the experiment.html file, adding files to the img folder, etc. Your changes will be live on your local computer at: http://localhost:8080/ if the docker containers are running.

We usually use jsPsych to create our web experiments. If you are new to jsPsych and you are asked to create or modify an experiment, we recommend doing the tutorials below.

You don't need to do the jsPsych tutorials unless you are asked to. They are just here for reference.

Step 3: Push your changes

At the moment your changes are only available to you, on your local computer. When you are ready, you'll want to commit your changes to the GitHub repository (and eventually, pull those changes onto the server to run subjects!). Practice doing this with the training project! From inside the training-project folder on your local machine:

git add .
git commit -m "commit message about your changes"
git push origin main

These commands push your changes to the shared GitHub repo. Once pushed, you should see them here: https://github.com/pennchildlanglab/training-project.git.
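
Because git add . stages everything that changed in the folder, it is worth glancing at what that includes before you commit. Here is a small sketch built on git status, which is a standard git command. The function name pending_changes is made up for this example.

```shell
# pending_changes: hypothetical helper, not part of the lab's tooling.
# Prints the new and modified files that "git add ." would stage,
# one per line, or says so if there is nothing to commit.
pending_changes() {
  changes=$(git status --short)
  if [ -z "$changes" ]; then
    echo "no changes to commit"
  else
    echo "$changes"
  fi
}
```

Run pending_changes from inside the training-project folder. Lines starting with ?? are new files and lines starting with M are files you modified. If you see something you did not mean to change, sort that out before running git add.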

Step 4: Pull changes to lab server

Create SSH keys

To log in to the server, you'll need to generate SSH keys. Create your key pair by running (on your computer):

ssh-keygen -t rsa

You'll be asked a few questions when you run this command:

Enter file in which to save the key (/home/username/.ssh/id_rsa):

Just press enter for the one above (username will be your user on your machine).

Enter passphrase (empty for no passphrase):

For this prompt, you don't have to enter a passphrase, but I highly recommend doing so. Choose something you won't forget. Nobody but you will have access to this.

Finally, use the following command to view your new public SSH key. Copy this key and send it securely to Katie so she can add it to the server.

cat ~/.ssh/id_rsa.pub

You won't be able to log in to the server until Katie adds your SSH public key.
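
One caveat: if you have used SSH before, you may already have a key at the default location. ssh-keygen will ask before overwriting it, but replacing it would break any other logins that rely on the old key. Here is a minimal sketch for checking first. The function name check_ssh_key is made up for this example.

```shell
# check_ssh_key: hypothetical helper, not part of the lab's tooling.
# Looks for an existing RSA key at the default location ssh-keygen offers.
check_ssh_key() {
  if [ -f "$HOME/.ssh/id_rsa" ]; then
    echo "a key already exists at $HOME/.ssh/id_rsa; you can reuse it"
    return 0
  fi
  echo "no key found; go ahead and run: ssh-keygen -t rsa"
  return 1
}
```

If a key already exists, you can skip ssh-keygen and go straight to the cat command to view your public key.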

Log in to server

To log in to the server, use the following command, where yourusername is your server username (usually your first name).

You'll be prompted to enter your SSH passphrase (the one you chose when you created your SSH keys above). Once logged in, you'll be inside the server's terminal. It will look like this:

The server's terminal

Pull changes to server

In steps 2 and 3, you made changes to the training-project on your local machine and pushed them to the GitHub repository: https://github.com/pennchildlanglab/training-project

But there is one final step in order to run participants: you need to pull these changes to the lab's webserver. While logged in on the server (see above), navigate to the study you want to update (here, the training-project):

cd experiments/training-project

Then, pull in the changes with

sudo git pull

You'll be asked for your sudo password (shared with you in a LastPass note) and for your GitHub credentials (only you know these). That's it! Check that your changes are live on the webserver by visiting training.childlanglabexperiments.org.

Important: Run through your experiment at least once on the webserver and then check that everything saved properly to the lab's database.

Step 5: Analyze data

Preparation

Great job making it this far! You are almost done. The last thing you'll learn is how to analyze data for a project in the lab. As usual, you'll practice on the training-project. There are a few things you'll need to do to get ready.

Download data

First, head to the lab's database and download all of the training project's data. Make sure this includes your most recent run (from Step 4 above).

Start jupyter

On your local machine, make sure the training-project's containers are running. The jupyter service will be available on your computer at http://localhost:8989/. Enter the token (password, unless you changed it in your .env file).

Create your analysis script

Enter the work folder (only things done in the work folder will be saved on your local machine) and create a new folder called practice-analysis-yourfirstname.

The work folder inside jupyter is mapped to the analysis folder inside the project

Then, inside your new folder, create an analysis script by selecting the R Notebook from the launcher. This will create a new .ipynb file. Name it analysis.ipynb.

Add the data

Next, add the data you downloaded to your new folder. You can use the upload button from inside jupyter, or simply add it to the analysis/practice-analysis-yourfirstname folder that will now appear inside the training-project on your computer.

Once uploaded, the data will appear in your sidebar like this

Add a title and description

Wahoo! You are ready to analyze. Jupyter is a computational notebook that lets you write text alongside your code. Text blocks are called Markdown blocks and hold markdown-formatted text. If you don't know markdown, take a minute to do this tutorial.

Change the current block between Code and Markdown

Change the first block to markdown and add a title as H1, your name in bold, and the date in italics. Press the run ▶️ button to see it formatted!

Formatting for your markdown text block

Do the analysis

If you need a hint, you can look in the practice-analysis-joan folder to see the complete code. You can also view it here.

Load libraries

Add a code block in your analysis for the lab's boilerplate, which loads some libraries and sets up some common figure aesthetics we like to use. Use the run button to run this code.

# load tidyverse, but be quiet about it
options(tidyverse.quiet = TRUE)
library(tidyverse)
library(tidyjson)
# turn off annoying "summarise()" warning in new dplyr
options(dplyr.summarise.inform = FALSE)
# set all plots to classic theme and center titles
theme_set(theme_classic(base_size = 20))
theme_update(plot.title = element_text(hjust = 0.5))

Read data

Read in the data that you added in the preparation step above. Change name-of-your-datafile.csv to whatever your datafile is named.

raw_data <- read_csv('name-of-your-datafile.csv')

Parse data

The data is formatted with one row per participant. To get trial-by-trial data, you need to parse the JSON column, data. The following code will do that -- this is the same for all data from the database.

data <- raw_data %>%
  filter(!is.na(data)) %>%
  # parse the JSON column into one row per trial, one column per field
  as.tbl_json(json.column = "data") %>%
  gather_array() %>%
  spread_all()

Summarise data

Next you can get some details about the participants. Group by experiment and use n_distinct to count the number of distinct participants (randomid).

participants <- data %>%
  group_by(experiment) %>%
  summarise(n = n_distinct(randomid))
# print out the participant table you just made
participants

After that, compute some summary stats. For the response trials, group by experiment, randomid, and stimulus type. Then get a count (n), the number of correct trials (correct), the percent correct trials (pcnt_correct) and the mean reaction time (rt). The code below will do that for you.

summary_stats <- data %>%
  filter(task == "response") %>%
  group_by(experiment, randomid, stimulus) %>%
  summarise(n = n(), correct = sum(correct),
            pcnt_correct = correct / n * 100, rt = mean(rt))
# print out the summary stats table you just made
summary_stats

Plot data

Finally, make a plot of participant reaction times by stimulus for each experiment. The code below will do that. Note that the first line of code (the options call) adjusts the size of the figure.

# make the figure really wide but not so tall
options(repr.plot.width = 15, repr.plot.height = 5)
ggplot(summary_stats, aes(x = stimulus, y = rt, color = stimulus)) +
  # facet the plot by experiment
  facet_grid(. ~ experiment, scales = "free") +
  # plot the mean as a pretty big point
  stat_summary(fun = mean, geom = "point", size = 4) +
  # change the title and y axis label
  labs(title = "Joan's Practice Analysis", y = "mean rt") +
  # make the blue stimulus point blue and the orange one orange
  scale_color_manual(values = c("blue", "orange"))

Push your changes

Finally, and as always, you'll want to push your changes to the GitHub repository. Until then, your code is only available to you, on your computer. Pushing to the repository makes sure it is available to everyone. From the training-project folder on your computer:

git add .
git commit -m "the changes you made to your analysis"
git push origin main

Check in the training-project analyses folder on GitHub to make sure your changes were pushed.

Great job!! You are now finished with the onboarding tutorial!