Recoding in SPSS
Explanation 1
Recoding variables in SPSS can be very useful. Sometimes it is necessary to recode (sometimes referred to as transform) variables because data was entered into SPSS incorrectly or because items on the survey were coded incorrectly. In the latter case, recoding can save the coder a lot of extra work. To recode a variable, type recode, then the variable name, then each old value and its new value joined by an equal sign inside its own pair of parentheses, then into and the new variable name. The general form is: recode varname (oldvalue=newvalue) (oldvalue=newvalue) into newvarname.
An example of this would be: recode sex (1=0) (2=1) into male. It is not necessary to create a new variable, but it is often much safer, because the original variable is left untouched and you can compare the two. Some helpful tips with the recode command are:
- Always check to make sure your old and new values are correct.
- Think about whether the new values are correct and what you want the new variable name to be.
- Make sure to run some frequency tables to double check that the old and new variables are correct.
(This can be done with the syntax line: freq vars= sex male.)
- Keep in mind that there is a shortcut if you need to recode several variables or several sets of values. You can copy and paste one line and just change the numbers to save yourself a ton of time. You can also recode multiple variables into multiple new variables at the same time. Just be careful to list the old and new variable names in the same order.
- For example: recode liberalmind conservmind gender sex income gdp (1=0) (2 3 4 5 6 7 8 9=1) (else=sysmis) into liberalish conserve culture male cash nationwealth.
- Notice that I recoded all of the variables at the same time, which keeps my syntax file less cluttered and saves a ton of typing! Copying and pasting the first recode line is also very useful when creating a whole set of reference groups, such as:
recode familysize (1=0) (2 3 4 5 6 7 8 9=1) (else=sysmis) into oneperson.
recode familysize (2=0) (1 3 4 5 6 7 8 9=1) (else=sysmis) into twoperson.
recode familysize (3=0) (1 2 4 5 6 7 8 9=1) (else=sysmis) into threeperson.
recode familysize (4=0) (1 2 3 5 6 7 8 9=1) (else=sysmis) into fourperson.
recode familysize (5=0) (1 2 3 4 6 7 8 9=1) (else=sysmis) into fiveperson.
recode familysize (6=0) (1 2 3 4 5 7 8 9=1) (else=sysmis) into sixperson.
recode familysize (7=0) (1 2 3 4 5 6 8 9=1) (else=sysmis) into sevenperson.
recode familysize (8=0) (1 2 3 4 5 6 7 9=1) (else=sysmis) into eightperson.
- Keep in mind that I use the "(else=sysmis)" keyword to set every value other than the ones listed to system missing. It can be helpful, but you have to be careful when using it. Data entry often produces coding errors, and (else=sysmis) will quietly turn them into missing cases, so double check and fix those values first to keep your number of valid cases high.
- Reference groups are typically used in a linear regression model to serve as a sort of control group. Typically you pick the largest category of an ordinal or nominal variable and exclude it by giving it a value of 0 in a new variable, as I did in the familysize example above.
- Always make sure to end the command line with a period.
- A lot of recoding can be avoided if you think carefully when designing your questionnaire or survey. It is also sometimes useful to recode when the values on a Likert scale were entered backward.
To fix this, simply use: recode likertscale (1=5) (2=4) (3=3) (4=2) (5=1) into fixedlikertscale. You can modify this depending on how many values you have and how mixed up they are.
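Several of the tips above can be combined into one short syntax sketch. It is only an illustration: familysize, oneperson, likertscale and fixedlikertscale are the example names used in this explanation, and the value ranges (1 to 9 and 1 to 5) are assumed from those examples. It uses the thru keyword to give a range instead of listing every value, then cross-tabulates each old variable against its new one so that every mapping can be checked by eye.

```spss
* Sketch only, using the example variable names from this explanation.
* THRU gives a range, so (2 thru 9=1) matches any value from 2 through 9.
recode familysize (1=0) (2 thru 9=1) (else=sysmis) into oneperson.

* Reverse a 5-point Likert item so each old value maps to its mirror image.
recode likertscale (1=5) (2=4) (3=3) (4=2) (5=1) into fixedlikertscale.

* Cross-tabulate old against new to confirm each old value landed where intended.
crosstabs /tables=familysize by oneperson.
crosstabs /tables=likertscale by fixedlikertscale.

* Frequencies also report how many cases ended up system missing.
frequencies variables=oneperson fixedlikertscale.
```

In each crosstab, every row (old value) should have cases in exactly one column (new value). If an old value is spread across columns or missing from the table, the recode line probably needs another look.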
Explanation 2
recode (variable name) (values) into (variable name).
The recode command is extremely valuable for many different reasons. One of its many uses is to change the values stored in a variable and, if you like, to write the result to a variable with a new name. For instance, if you wanted to change the current value of one to zero for an answer on your survey, just type in the SPSS command:
recode sex (1=0) (2=2) (else=sysmis) into sex.
Notice that I used the variable sex again and changed the current value of 1 to 0 but kept the value of 2 the same. You'll also notice the extra specification (else=sysmis): any other value that was entered for this variable in SPSS will show as system missing when you generate a table. I also did not change the variable name. I could have left off the words "into sex" and just ended with a period, which recodes the variable in place. Finally, notice that each old and new value pair is enclosed in its own parentheses. Be sure to always do this or you will generate an error. However, to create a dichotomous variable for further statistical analysis, we must have ones and zeros for values. Therefore, I would edit the above command into:
recode sex (1=0) (2=1) (else=sysmis) into Male.
This time, I decided to change not only the value of 2 into 1, but also to write the result to a new variable called Male. The name can be read two ways: males may be the value 0 (which would make them the reference group in a multiple regression) or the value 1, depending on how sex was originally coded. Keep in mind there are some advanced tricks in SPSS that you might overlook. For example, when I was recoding a long list of variables, it was tedious to type a separate line of code for each variable even though the values were the same because they all used a Likert scale. I found out that you can recode a ton of variables at the same time. For example, instead of:
recode sex (1=0) (2=1) (else=sysmis) into Male.
recode gender (1=0) (2=1) (else=sysmis) into Masculine.
The alternative, if the variables' values are the same, would look something like this: recode sex gender (1=0) (2=1) (else=sysmis) into Male Masculine.
Be sure to list the old variables and the new variable names in the same order. Otherwise, you could end up with the wrong values assigned to the wrong variables.
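One way to guard against an ordering mistake is to check each pair right after the multi-variable recode. The sketch below is only an illustration: it assumes sex and gender are both coded 1 and 2, and the value labels assume, purely for the example, that the original code 2 means male. Swap the labels if your data are coded the other way.

```spss
* Sketch only, assuming sex and gender are both coded 1 or 2.
recode sex gender (1=0) (2=1) (else=sysmis) into Male Masculine.

* Assumption for illustration: the original code 2 means male.
value labels Male 0 'Not male' 1 'Male'.

* Each old variable should line up with its own new variable.
crosstabs /tables=sex by Male.
crosstabs /tables=gender by Masculine.

* Frequencies show how many cases fell into else=sysmis.
frequencies variables=Male Masculine.
```

If the names after into were typed in the wrong order, the crosstabs will still look tidy, so it also helps to compare the case counts in the frequency tables against what you expect for each original variable.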