



NDACAN Publication:
Checklist for Preliminary Secondary Analysis

A brief description of how to use syntax in SPSS for Windows:

In SPSS for Windows, choose the FILE menu, and then NEW and then SPSS SYNTAX. A syntax window will open, allowing you to directly program SPSS rather than using the menus. Type in the code, highlight the lines you wish to run, and press the "run" button at the top of the screen (a little black arrow pointing right). Pressing the control and R keys simultaneously will also run the selected text. The output window will open to show your output. Be sure you save your syntax window separately (syntax windows will automatically be given the suffix ".sps"). With the syntax window as the active window, choose the FILE menu, then SAVE SPSS SYNTAX.

1. Select the variables with which you would like to work. Note whether each variable is categorical or continuous.


SAS: data libref1.newdata; set libref2.origdata (keep=variables); run;

OR (if you are keeping more variables than you are dropping):

data libref1.newdata; set libref2.origdata (drop=variables); run;

SPSS: GET FILE='path\filename' /KEEP = variablename1 variablename2 variablename3. EXECUTE.

Note: Use the /DROP subcommand instead of /KEEP if you are retaining most of the variables in the file. See page 338 in the SPSS 6.1 Syntax Reference Guide for more information.

2. Run a frequency on each categorical variable and univariate statistics on each continuous variable.


CATEGORICAL VARIABLES

SAS: proc freq; tables variables /missing; run;

SPSS: FREQUENCIES VARIABLES= variablename1 variablename2 variablename3.

CONTINUOUS VARIABLES

SAS: proc univariate plot; var variables; run;

SPSS: DESCRIPTIVES VARIABLES=variablename /FORMAT=LABELS NOINDEX /STATISTICS=MEAN SUM STDDEV VARIANCE RANGE MIN MAX SEMEAN KURTOSIS SKEWNESS /SORT=MEAN (A).

3. Look in the codebook to note values that refer to missing data.

If you wish to exclude these values (e.g., 99) from analysis, recode them to system missing:

SAS: if variable =99 then variable =. ;

SPSS: RECODE variablename (99=SYSMIS) . EXECUTE .
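When several variables share the same missing-data code, the SAS recode above can be wrapped in an array inside a data step. The following is a minimal sketch, not a definitive implementation: variable1 through variable3 are hypothetical names, and it assumes 99 is the missing code for every variable listed (confirm this in the codebook for each one).

```sas
/* Sketch: recode 99 to system missing for several variables at once. */
/* variable1-variable3 are placeholder names; 99 is an assumed code.  */
data libref1.newdata;
  set libref1.newdata;
  array recodevars{*} variable1 variable2 variable3;
  do i = 1 to dim(recodevars);
    if recodevars{i} = 99 then recodevars{i} = .;
  end;
  drop i;  /* the loop counter is not needed in the output data set */
run;
```

Rerunning the frequencies from step 2 afterward should show the former 99 values counted as missing.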

4. Look at distributions


CATEGORICAL VARIABLES:

Is the number of cases in each category large enough to allow comparisons? If not, consider lumping categories (be sure to create a new variable to prevent overwriting the old information):

SAS: newvariable=oldvariable; if newvariable=value1 or newvariable=value2 then newvariable=value3;

SPSS: RECODE variablename (1 thru 3=1) (4 thru 5=2) INTO newvariablename . EXECUTE .
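One way to confirm that the categories were combined as intended is to cross-tabulate the original variable against the new one. This is a sketch that assumes the data set name from step 1 and the placeholder variable names used above:

```sas
/* Sketch: each value of oldvariable should fall in exactly one */
/* value of newvariable; stray cells point to a recoding error. */
proc freq data=libref1.newdata;
  tables oldvariable * newvariable / missing norow nocol nopercent;
run;
```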

CONTINUOUS VARIABLES:

Is the distribution normal?
If not, and normality is assumed for the statistical procedure you plan on using, transform the variable (LOG, etc.) and recheck the transformed variable for normality.

SAS: newvariable =log(oldvariable);

SPSS: COMPUTE newvariable=LN(oldvariablename). EXECUTE.
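To recheck the transformed variable in SAS, the NORMAL option of PROC UNIVARIATE requests tests for normality, and PLOT adds the text-based plots used in step 2. A minimal sketch, assuming the data set name from step 1:

```sas
/* Sketch: normality tests and plots for the log-transformed variable. */
proc univariate data=libref1.newdata normal plot;
  var newvariable;
run;
```

Note that in SAS the log of zero or a negative number is set to missing, so compare the number of nonmissing cases before and after the transformation.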

Are there any outliers?
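In SAS, PROC UNIVARIATE lists the lowest and highest values of each variable in its extreme observations table, which is a simple starting point for spotting outliers. The sketch below assumes a hypothetical identifier variable named caseid, and the choice of 10 observations at each end is arbitrary:

```sas
/* Sketch: list the 10 lowest and 10 highest values, labeled by case. */
/* caseid is a placeholder for the identifier in your data set.       */
proc univariate data=libref1.newdata nextrobs=10 plot;
  var variablename;
  id caseid;
run;
```

Check any extreme values against the codebook before deciding whether they are data errors, missing-data codes, or legitimate observations.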

5. Run checks for relationships between variables of interest.

CATEGORICAL BY CATEGORICAL VARIABLES (crosstabs)

SAS: proc freq; tables variable1 * variable2/ chisq missing; run;

SPSS: CROSSTABS /TABLES=variablename1 BY variablename2 /FORMAT= AVALUE NOINDEX BOX LABELS TABLES /STATISTIC=CHISQ /CELLS= COUNT ROW COLUMN .

CATEGORICAL BY CONTINUOUS VARIABLES (t-test)

SAS: proc ttest; class categoricalvariable; var continuousvariable; run;

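SPSS: T-TEST GROUPS=categoricalvariable (level1 level2) /MISSING=ANALYSIS /VARIABLES=continuousvariable /CRITERIA=CIN(.95) .

Note: A t-test compares exactly two groups, so GROUPS accepts at most two values of the categorical variable.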

CONTINUOUS BY CONTINUOUS VARIABLES (scatterplot)

SAS: proc plot; plot variable1 * variable2; run;

SPSS: GRAPH /SCATTERPLOT(BIVAR)=variablename1 WITH variablename2 /MISSING=LISTWISE .

6. Unit of analysis


What is the unit of analysis needed to answer your question (individual, family, dyad, county, state)?

Are all variables of interest measured using this unit?

If so, skip to the next step.
If not, you need to reformat all variables to match your unit of analysis.
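As an illustration of reformatting, suppose the data are recorded one row per child but the question concerns families. A minimal SAS sketch of aggregating to the family level follows. The data set names (children, families) and the variables familyid and age are hypothetical, and the right summary (count, mean, maximum, and so on) depends on your research question.

```sas
/* Sketch: collapse child-level records to one row per family.   */
/* children, families, familyid, and age are placeholder names.  */
proc sort data=libref1.children;
  by familyid;
run;

proc means data=libref1.children noprint;
  by familyid;
  var age;
  output out=libref1.families (drop=_type_ _freq_)
         n=nchildren mean=meanage;
run;
```

The resulting data set has one row per familyid, with the number of children with nonmissing age and their mean age.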

7. Merging files


What is the unique identifier to use in merging?

What should you do with cases that are present in one file but missing in the other?

Keep all cases
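A minimal SAS sketch of a merge that keeps all cases and records which file each case came from is shown below. The data set names (file1, file2, merged) and the identifier caseid are hypothetical, and both files must be sorted by the identifier first.

```sas
/* Sketch: merge two files by a unique identifier, keeping all cases. */
/* file1, file2, merged, and caseid are placeholder names.            */
proc sort data=libref1.file1; by caseid; run;
proc sort data=libref1.file2; by caseid; run;

data libref1.merged;
  merge libref1.file1 (in=in1) libref1.file2 (in=in2);
  by caseid;
  infile1 = in1;  /* 1 if the case appears in file1, else 0 */
  infile2 = in2;  /* 1 if the case appears in file2, else 0 */
run;

/* Count cases found in both files, only file1, or only file2. */
proc freq data=libref1.merged;
  tables infile1 * infile2;
run;
```

To keep only the cases present in both files instead, add the statement "if in1 and in2;" after the BY statement.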


©2017 College of Human Ecology, Cornell University. All rights reserved.