Importing Inquisit data files into SPSS
So you have your first Inquisit dataset. The next step is to preprocess the data so it can be analyzed with a statistical program. This guide shows how to do that in SPSS. I will assume you’re using SPSS 14 or higher, but earlier versions might work too (with a few changes). I will also assume that you didn’t change anything in the way Inquisit saves data by default (i.e., you didn’t add a data element to your Inquisit script).
Regardless of how you set up your experiment, your data file will have the same name as your experiment file and will be located in the same folder. Only the extension will be different. A script always has the .exp extension, while a data file will have the .dat extension. If you only see two files with the same names, and no extension, you should change the settings of Windows to actually show the extensions. In Windows XP this can be done easily by opening up My Computer. Then in the menu bar select Tools > Folder Options. Select the View tab and in the Advanced Settings pane deselect Hide extensions for known file types. Click Apply.
The data file is just a plain text file, which you could open with Notepad to take a look at it. If you open it you’ll see that it’s a tab-delimited file with one trial per line. Opening it in a spreadsheet program like Microsoft Excel will give you a better view of the columns. Saving one trial per line is extremely handy, as you’ll see in a bit, but in SPSS you will have to end up with a data file that has only one line per participant, with response or latency means in the columns. The next steps are all about getting from the former to the latter.
Importing the data file
First, you have to get the data file into SPSS. Follow these steps:
- File > Read Text Data …
- Files of type: Data (*.dat)
- Find your data file > Open
- Step 1 of 6 > Leave everything as it is by default > Next
- Step 2 of 6 > Are variable names included at the top of your file > Yes > Next
- Step 3 of 6 > Leave everything as it is by default > Next
- Step 4 of 6 > Make sure only Tab is selected, nothing else > Next
- Step 5 of 6 > Here you tell SPSS what kind of values are in every column (or variable). SPSS will try to detect this automatically based on the first twenty lines in the file. However, sometimes the type of values changes in a column further down the file, for instance if some responses are strings (words/letters) and others are numeric. Check whether SPSS guessed correctly, and change it accordingly if necessary. Also make sure blockcode and trialcode are set to enough characters to contain the full names that you gave to all block and trial attributes. > Next
- Step 6 of 6 > If you work with a syntax file, select paste the syntax, otherwise > Finish
Your data is now imported. Take a look at your SPSS data file. If there are missing values somewhere, you probably made a mistake in step 5 of 6. For instance, you forgot to make a variable a string, and letters will not fit in a numeric variable. Should this be the case, start the importing steps again from the top, or change your syntax.
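If you chose to paste the syntax in step 6, the result will look roughly like the sketch below. The file path and the variable list are assumptions for illustration only: the list has to match the columns of your own data file, in the same order, and the string widths (A30 and so on) have to be large enough for your block and trial names.

```spss
* Hypothetical path and column list. Adjust both to your own data file.
* The variables must be listed in the same order as the columns in the file.
GET DATA
  /TYPE=TXT
  /FILE='C:\data\myexperiment.dat'
  /DELCASE=LINE
  /DELIMITERS="\t"
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
    date A10
    time A8
    subject F5.0
    blockcode A30
    trialcode A30
    response A20
    correct F1.0
    latency F8.0.
CACHE.
EXECUTE.
```

If a column shows unexpected missing values after importing, changing its format in the VARIABLES list (for example from F8.0 to A20) and rerunning the syntax is usually quicker than going through the wizard again.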
Organizing your data: Splitting it into separate datasets
The Inquisit script that you wrote will often consist of multiple tasks (e.g., an IAT and some Likert questions). If those tasks were all in one single Inquisit script, the data of all of them will be contained in the current big dataset. As your data is organized now, it will be very difficult to quickly get to the means you need for your analysis without also getting a lot of variables you don’t need. That’s why you need to split the current dataset into separate datasets first, one for every task. You’ll be able to aggregate and restructure those separate datasets pretty easily, and afterwards you can merge the restructured datasets back together, so you have one nice dataset with all the means of all the different tasks. I’ll explain aggregation, restructuring, and merging later. Let’s get to the splitting up first.
If you are using SPSS 14 (or higher) create a copy of your dataset by typing the following in your syntax window:
DATASET COPY taskname.
DATASET ACTIVATE taskname.
Don’t forget the period at the end of each line, and substitute taskname with the name of the task that you’re creating this dataset for. Later you can go back to the original dataset using DATASET ACTIVATE DataSet1. This assumes that it was the first data file you imported in this SPSS session; otherwise the actual name of the dataset may be different.
If you are using an SPSS version lower than 14, make a copy of your dataset by saving it as a different file with the name backup.sav. When you’re done with the analysis of the current task, you can save the resulting dataset as a separate file, then open up backup.sav and analyze another task. Later you can merge these files together.
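For the older-version route, the same save-and-reopen cycle can also be written in syntax. This is only a sketch: the paths and the file name iat.sav are hypothetical.

```spss
* Hypothetical file names and paths.
* Keep an untouched copy of the full dataset.
SAVE OUTFILE='C:\data\backup.sav'.
* Select, aggregate, and restructure one task here, then save the result.
SAVE OUTFILE='C:\data\iat.sav'.
* Reopen the full dataset to start on the next task.
GET FILE='C:\data\backup.sav'.
```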
Now, regardless of which SPSS version you’re using, you are working in a copy of your dataset. The next step is to select the blocks you actually need for the current task. Let’s say you want to analyze the IAT part of the data, and you need two blocks, named compatible and incompatible (these are the names that the author of the Inquisit script gave to the two block elements). To select those blocks click Data > Select cases …, or use the syntax below. Note that the quotes in syntax have to be plain straight quotes:
SELECT IF (blockcode = 'compatible' OR blockcode = 'incompatible').
At this point you will aggregate and restructure the data (explained next). Then you go back to the original dataset and repeat the step of splitting to create the dataset for the next task, which you then aggregate and restructure too. Repeat this for every task, and finally you merge everything back together.
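Put together, the splitting cycle for two tasks might look like the sketch below (SPSS 14 or higher). The dataset names iat and likert and the block name questionnaire are assumptions for illustration; DataSet1 is assumed to be the name of the imported dataset.

```spss
* First task: copy the full dataset and keep only the IAT blocks.
DATASET ACTIVATE DataSet1.
DATASET COPY iat.
DATASET ACTIVATE iat.
SELECT IF (blockcode = 'compatible' OR blockcode = 'incompatible').
EXECUTE.
* Aggregate and restructure the iat dataset here.

* Second task: go back to the full dataset and repeat.
DATASET ACTIVATE DataSet1.
DATASET COPY likert.
DATASET ACTIVATE likert.
SELECT IF (blockcode = 'questionnaire').
EXECUTE.
```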
Creating means: Transforming and aggregating data
Let’s clean up a bit first. You probably don’t need a lot of the variables that are in the dataset now. It depends a bit on the way the Inquisit experiment was scripted and what kind of means you need, but most of the time you only need the variables subject, blockcode, trialcode, response, correct, and latency. I recommend also keeping the time and date variables, so you can distinguish between participants who accidentally got the same subject number. If you need to make a distinction between single stimuli, you additionally need stimulusitem1 (or stimulusnumber1 if you’re not interested in the literal content of the stimulus), and maybe other stimulusitems. Be very selective, but think carefully about what data you need for your final analyses. Once you know which variables you want to throw away, you can delete them either by selecting the columns of those variables and pressing Delete, or by using DELETE VARIABLES varname1 varname2, etc. in the syntax window.
If your response variable contains numeric values and you want to calculate scales or means with them, this is only possible if those values are actually registered as numeric by SPSS. Change the type accordingly in the variable view, or using the Recode into different variables command in the menu bar. SPSS will probably already have detected the proper types for the latency and correct variables.
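In syntax, one way to get a numeric copy of a string variable is the NUMBER function. The sketch below assumes that response was imported as a string holding only digits; response_num is a hypothetical name for the new variable.

```spss
* Assumes response is a string variable that contains only digits.
* response_num is a hypothetical name for the new numeric variable.
COMPUTE response_num = NUMBER(response, F8.0).
EXECUTE.
```

Values that cannot be read as a number should end up as missing in response_num, so check the new variable before you delete the original.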
Right now you have a dataset with all the responses and latencies of one task. However, your analyses will most likely be about mean latencies for specific types of trials or mean responses. If this isn’t the case, you can directly go to the next step of restructuring the data. Note however, that this is also a good point to do any transformations on the raw data, because they can be done now with one line of code. An example of this follows in the next paragraph, so if you decided to skip ahead to restructuring, maybe just read the next paragraph.
At this point you can easily do some transformations or outlier handling. What you do here depends heavily on the conventional way of analyzing the specific task. If you’re not sure what to do at all you can skip this paragraph, but maybe you’ll get some ideas from the examples.
- Say you need log-transformed latencies for your analysis. All you have to do is apply the compute command to the raw latency variable: Transform > Compute, or in syntax: COMPUTE loglatency = LN(latency).
- Say you want to change all latencies greater than 3000 ms to 3000 ms. Just enter this line in your syntax: IF latency > 3000 latency = 3000. If you’re still using DO REPEAT for this kind of recoding, you no longer need to: because every trial is on its own line, a single IF handles all trials at once.
The final step is aggregating your data to the necessary means per subject (such as means per block, or means per type of trial). Use the Data > Aggregate … command in the menu bar.
- You first enter the Break Variables. These are the variables that break up your data into separate means. Let’s say you want means per subject per block. In that case you enter subject and blockcode into Break Variables. If you still have date and time in your dataset, also add these to the Break Variables, so you get separate aggregated values for each participant even if they have the same subject number.
- Subsequently, you have to tell SPSS what kind of aggregation it needs to apply to the values in your dataset, by adding variables into Summaries of Variables. If you need mean latencies, add latency to Summaries of Variables, and SPSS will automatically assume you want the mean. If you want something else, like the sum or the median, or minimum, just click the Function button. You might want sum to get the sum of correct trials per block per subject.
- Last, you tell SPSS where it needs to put the new aggregated variables. In most cases you don’t want to select Add aggregated variables to active dataset. If you’re using SPSS 14 or higher, you can use the Create a new dataset option. Enter a new name for this dataset and don’t forget to activate it after aggregating, otherwise you keep on working with the unaggregated dataset. If you’re using an older version of SPSS, you should write the new data to a new file, and then open that file to continue working on the means.
- Press Ok (or Paste if you’re using syntax).
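Pasting these aggregation steps should give syntax along the lines of the sketch below (SPSS 14 or higher). The dataset name iat_means and the output variable names latency_mean and correct_sum are assumptions for illustration. Leave date and time out of the BREAK list if you deleted them.

```spss
* Hypothetical dataset and variable names.
* Declare the new dataset that will receive the aggregated data.
DATASET DECLARE iat_means.
* One row per subject per block, with the mean latency and number correct.
AGGREGATE
  /OUTFILE='iat_means'
  /BREAK=subject date time blockcode
  /latency_mean=MEAN(latency)
  /correct_sum=SUM(correct).
* Switch to the aggregated dataset before continuing.
DATASET ACTIVATE iat_means.
```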
Towards one line per subject: Restructuring your data
After aggregating you still have multiple lines of data per subject, and this is certainly true if you skipped the aggregation step. This is the point where we fix that, so we end up with one line per subject and a separate column for each mean or value. Only after doing this will you be able to do meaningful t-tests, ANOVAs, or other analyses.
To restructure your data, first make sure the right dataset is active.
- Data > Restructure …
- Save contents of data editor to xxxxxx > No
- Select the middle option (don’t even look at the other options): Restructure selected cases into variables > Next
- Put the variables that identify what makes a single line unique in Identifier variables. This is always subject and maybe, if you kept them in your dataset, date and time. You can also put variables in here that are the same for every current line that’s part of the same subject, but SPSS will do this automatically for you.
- Put the variables that contain (parts of) the names of the new variables in Index variables. Most of the time these are blockcode and/or trialcode. If you haven’t aggregated, or if unique stimuli are important, you might want to add stimulusitem or stimulusnumber to the Index variables.
- All variables that you didn’t add to either Identifier or Index variables will become the values of the new variables. The names of the new variables will be built from the remaining variables and the values of the Index variables. This might be something like latency_mean.compatible (latency_mean was a remaining variable that resulted from the aggregation of the latency variable, and compatible was a value of blockcode, which was added as Index variable).
- One thing is very important: restructuring will only work if, for every variable that you want to create, there is exactly one value per subject that it can refer to. This sounds a bit difficult, but an example might make it clear. In the example from the previous step it would go wrong if we didn’t add blockcode as Index variable, because in that case you’d get the same variable name for latency_mean in the compatible block and for latency_mean in the incompatible block. This is a conflict that SPSS doesn’t know how to resolve and will complain about. Most of the time, if the restructuring doesn’t work out, it is because of this problem. The solution is to add more Index variables. If that still doesn’t work, be on the lookout for participants who accidentally have the same subject number.
- Next & Finish
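In syntax, the restructuring steps above correspond to the CASESTOVARS command. A minimal sketch, assuming the aggregated dataset is called iat_means (a hypothetical name) and still contains date and time:

```spss
* Hypothetical dataset name. Drop date and time if you deleted them.
DATASET ACTIVATE iat_means.
* The data have to be sorted by the identifier variables first.
SORT CASES BY subject date time blockcode.
* One row per subject, one new column per block for each remaining variable.
CASESTOVARS
  /ID=subject date time
  /INDEX=blockcode
  /GROUPBY=VARIABLE.
```

If SPSS reports duplicate index values here, that is the conflict described above: add another variable to INDEX, or check for participants who share a subject number.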
Actually, you can already start analyzing the dataset that results from the restructure command. However, if you want to combine it with the datasets you get from other tasks, you should start merging. You can do this with Data > Merge Files > Add Variables. Make sure all datasets are sorted by subject before you do. To be sure that the merging will be correct, add subject as key in the merge window (and select Both files provide cases if you want to keep it simple). You can rename variables in the merge dialog window if variables in both sets have the same name. That’s all there is to it.
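The merge can be sketched in syntax with MATCH FILES. The dataset names iat_means and likert_means are hypothetical and stand for two restructured datasets that are open in the same session (SPSS 14 or higher).

```spss
* Hypothetical dataset names. Both datasets must be sorted by the key.
DATASET ACTIVATE iat_means.
SORT CASES BY subject.
DATASET ACTIVATE likert_means.
SORT CASES BY subject.
* Combine the active dataset with iat_means, matching rows on subject.
MATCH FILES
  /FILE=*
  /FILE='iat_means'
  /BY subject.
EXECUTE.
```

If you kept date and time to tell apart participants with the same subject number, sort both datasets by subject, date, and time and list all three after BY.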
Good luck analyzing!
Another great tutorial on importing raw Inquisit data into SPSS can be found on Jeromy Anglim’s Blog.