3.1 ENTERING DATA INTO THE DATA EDITOR
When you first load SPSS it
will provide a blank data editor with the title Untitled1. When
inputting a new set of data, you must input your data in a logical way. The
SPSS Data Editor is arranged such that each row represents data from one entity while each
column represents a variable.
There is no distinction between independent and dependent variables: each
variable should be placed in its own column. The key point is that each row
represents one entity’s data (be that entity a human, mouse, tulip, business,
or water sample). Therefore, any information about that case should be entered across
the data editor. For example, imagine you were interested in sex differences in
perceptions of pain created by hot and cold stimuli. You could place some people’s
hands in a bucket of very cold water for a minute and ask them to rate how
painful they thought the experience was on a scale of 1 to 10. You could then
ask them to hold a hot potato and again measure their perception of pain.
Imagine I was a participant. You would have a single row representing my data,
so there would be a different column for my name, my gender, my pain perception
for cold water and my pain perception for a hot potato: Abayomi, male, 8, 10. The
column with the information about my gender is a grouping variable: I can
belong to either the group of males or the group of females, but not both. As
such, this variable is a between-group variable (different people belong to
different groups). Rather than representing groups with words, in SPSS we have
to use numbers. This involves assigning each group a number, and then telling
SPSS which number represents which group. Therefore, between-group variables
are represented by a single column in which the group to which the person belonged
is defined using a number. For example, we might decide that if a person is
male then we give them the number 0, and if they’re female we give them the
number 1. We then have to tell SPSS that every time it sees a 1 in a particular
column the person is a female, and every time it sees a 0 the person is a male.
Variables that specify to which of several groups a person belongs can be used
to split up data files. Finally, the two measures of pain are a repeated
measure (all participants were subjected to hot and cold stimuli). Therefore,
levels of this variable can be entered in separate columns (one for pain to a
hot stimulus and one for pain to a cold stimulus).
The data editor is made up of lots of cells, which are just boxes in which
data values can be placed. When a cell is active it becomes highlighted in
blue. You can move around the data editor, from cell to cell, using the arrow
keys ← ↑ ↓ → (found on the right of the keyboard) or by clicking the mouse on
the cell that you wish to activate. To enter a number into the data editor,
simply move to the cell in which you want to place the data value, type the
value, then press the appropriate arrow key for the direction in which you
wish to move. So, to enter a row of data, move to the far left of the row,
type the value and then press → (this process inputs the value and then moves
you into the next cell on the right).
In summary, there is a simple rule for how variables
should be placed in the SPSS Data Editor: data from different things go in
different rows of the data editor, whereas data from the same things go in
different columns of the data editor. As such, each person (or mollusc, goat,
organization, or whatever you have measured) is represented in a different row.
Data within each person (or mollusc etc.) go in different columns. So, if
you’ve prodded your mollusc, or human, several times with a pencil and measured
how much it twitches as an outcome, then each prod will be represented by a
column. In experimental research this means that any variable measured with the
same participants (a repeated measure) should be represented by several columns
(each column representing one level of the repeated-measures variable).
However, any variable that defines different groups of things (such as when a
between-group design is used and different participants are assigned to
different levels of the independent variable) is defined using a single column.
This idea will become clearer as you learn about how to carry out specific
procedures.
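The layout rule can also be sketched outside SPSS. The following is a minimal illustration in plain Python, not SPSS syntax. The names `COLUMNS`, `SEX_CODES`, `rows` and `describe` are made up for this sketch, and the second participant is hypothetical; only the first row and the 0 = male, 1 = female coding come from the example above.

```python
# Illustration only (plain Python, not SPSS): one row per entity,
# one column per variable.

# Coding scheme from the text: 0 = male, 1 = female.
SEX_CODES = {0: "male", 1: "female"}

# Column order: name, sex (coded), pain for cold water, pain for hot potato.
COLUMNS = ["name", "sex", "pain_cold", "pain_hot"]

rows = [
    ["Abayomi", 0, 8, 10],  # the participant described in the text
    ["Ngozi", 1, 6, 7],     # hypothetical second participant
]


def describe(row):
    """Turn one data-editor row into a readable record."""
    record = dict(zip(COLUMNS, row))
    record["sex"] = SEX_CODES[record["sex"]]  # translate the code to its label
    return record


for row in rows:
    print(describe(row))
```

The between-group variable (sex) occupies a single coded column, while the repeated measure (pain) is spread over two columns, one per level.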
3.2 THE SPSS VARIABLE VIEW WINDOW
The Variable View sheet contains information about each variable; this
information is stored with the dataset. The following have to be
defined for each variable:
- Name
The first character of the variable name must be alphabetic. Variable names
must be unique and can be at most 64 characters long. Spaces are NOT allowed.
- Type
This column enables you to specify the type of variable. Click on the Type
box to choose one; the two basic types of variables that you will use are
numeric and string.
- Width
Width allows you to determine the number of characters SPSS will allow to be
entered for the variable.
- Decimals
The number of decimal places for the variable; it has to be less than or
equal to 16.
- Label
Here you can describe the variable in more detail. A label can contain spaces
and can be up to 256 characters long.
- Values
This is used to specify which numbers represent which categories when the
variable represents a category.
Defining the value labels: click the cell in the Values column as shown
below. For the value and for the label, you can enter up to 60 characters.
After defining each value, click Add, and then click OK when you have
finished.
- Missing
This column is for assigning numbers to missing data.
- Columns
Enter a number into this column to determine the width of the column, that
is, how many characters are displayed in the column. This differs from
‘width’, which determines the width of the variable itself: you could have a
variable of 10 characters, but by setting the column width to 8 you would
only see 8 of the 10 characters of the variable in the data editor. It can be
useful to increase the column width if you have a string variable that
exceeds 8 characters, or a coding variable with value labels that exceed 8
characters.
- Align
You can use this column to select the alignment of the data in the
corresponding column of the data editor. You can choose to align the data to
the left, right or center.
- Measure
This is where you define the level at which a variable was measured
(nominal, ordinal or scale).
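The naming rules listed under Name can be written as a small check. This is a sketch in plain Python, not something SPSS provides; the function name `check_variable_name` is hypothetical. It assumes that the length limit is 64 characters inclusive and that names are compared ignoring case.

```python
def check_variable_name(name, existing_names):
    """Return a list of problems with a proposed variable name (empty if none)."""
    problems = []
    # Rule: the first character must be alphabetic.
    if not name or not name[0].isalpha():
        problems.append("first character must be alphabetic")
    # Rule: at most 64 characters (assumed inclusive limit).
    if len(name) > 64:
        problems.append("name is longer than 64 characters")
    # Rule: spaces are not allowed.
    if " " in name:
        problems.append("spaces are not allowed")
    # Rule: names must be unique (assumption: compared ignoring case).
    if name.lower() in (n.lower() for n in existing_names):
        problems.append("name is already in use")
    return problems


print(check_variable_name("pain_cold", ["name", "sex"]))
print(check_variable_name("pain cold", ["name", "sex"]))
```

A name that passes every rule yields an empty list, so the result can be used directly in an `if` test before a variable is defined.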
3.2.1 LEVELS OF MEASUREMENT
There are three levels of data. They are:
- Nominal level: data that are classified into categories and cannot be arranged in any particular order, e.g. eye colour, gender, religious affiliation.
- Ordinal level: data that are arranged in some order, but the differences between data values cannot be determined or are meaningless. For example, during a taste test of 4 soft drinks, Pepsi was ranked number 1, Sprite number 2, Seven-Up number 3, and Coca-Cola number 4.
- Scale: scale data can be either interval or ratio. The interval level is similar to the ordinal level, with the additional property that meaningful amounts of difference between data values can be determined; there is no natural zero point. For example, temperature on the Fahrenheit scale. The ratio level is the interval level with an inherent zero starting point, so both differences and ratios are meaningful. For example, monthly income of surgeons, or distance travelled by manufacturer’s representatives per month.
3.3 MISSING VALUES
Although as researchers we
strive to collect complete sets of data, it is often the case that we have
missing data. Missing data can occur for a variety of reasons: in long
questionnaires participants accidentally miss out questions; in experimental
procedures mechanical faults can lead to a datum not being recorded; and in
research on delicate topics (e.g. sexual behaviour) participants may exert
their right not to answer a question. However, just because we have missed out on some data for a
participant doesn't mean that we have to ignore the data we do have (although
it sometimes creates statistical difficulties). Nevertheless, we do need to tell
SPSS that a value is missing for a particular case. The principle behind missing
values is quite similar to that of coding variables in that we choose a numeric
value to represent the missing data point. This value tells SPSS that there is
no recorded value for a participant for a certain variable. The computer then
ignores that cell of the data editor (it does not use the value you select in
the analysis). You need to be careful that the chosen code doesn't correspond
to any naturally occurring data value. For example, if we tell the computer to
regard the value 9 as a missing value and several participants genuinely scored
9, then the computer will treat their data as missing when, in reality, they
are not. To specify missing values, you simply click in the column labelled
Missing in the Variable View and then click on the button that appears in the
cell to activate the Missing Values dialog box in Figure 3.9. By default,
SPSS assumes that no missing values exist, but if you do have data with missing
values you can choose to define them in one of three ways. The first is to
select discrete values (by clicking on the circle next to where it says Discrete
missing values) which are
single values that represent missing data. SPSS allows you to specify up to
three discrete values to represent missing data. The reason why you might choose
to have several numbers to represent missing values is that you can assign a
different meaning to each discrete value. For example, you could have the
number 8 representing a response of ‘not applicable’, a code of 9 representing
a ‘don’t know’ response, and a code of 99 meaning that the participant failed
to give any response. As far as the computer is concerned it will ignore any data
cell containing these values; however, using different codes may be a useful
way to remind you of why a particular score is missing. Usually, one discrete value
is enough. For example, in an experiment in which attitudes are measured on a
100-point scale (so scores vary from 1 to 100) you might choose 666 to
represent missing values because this value cannot occur in the data that have
been collected. Whatever code you choose, remember that missing data still
create statistical problems.
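The effect of declaring discrete missing values can be sketched as follows. This is plain Python, not SPSS, and it only mimics the idea that declared codes are left out of an analysis. The codes 8, 9 and 99 are taken from the example above; the function name `valid_mean` and the scores are hypothetical.

```python
# Discrete missing-value codes from the example in the text:
# 8 = not applicable, 9 = don't know, 99 = no response.
MISSING_CODES = {8, 9, 99}


def valid_mean(scores, missing_codes=MISSING_CODES):
    """Average the scores, skipping any value declared as missing."""
    valid = [s for s in scores if s not in missing_codes]
    if not valid:
        return None  # every value was missing, so there is nothing to average
    return sum(valid) / len(valid)


# Hypothetical responses on a 1 to 7 scale, so 8, 9 and 99 cannot be real scores.
scores = [3, 5, 99, 4, 9]
print(valid_mean(scores))  # 4.0, the mean of 3, 5 and 4
```

If the scale ran from 1 to 10 instead, genuine scores of 8 or 9 would be dropped as well, which is why the chosen code must not match any naturally occurring value.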
3.4 SPSS KEYWORDS
Using SPSS keywords, especially TO and ALL, greatly speeds up a myriad of
typical tasks.
SPSS Main Keywords (expression: meaning, and what it returns)
- ALL: all variables (not previously addressed in the statement). Returns: variable(s).
- TO: all variables between and including two named variables. Returns: variable(s).
- BY: split the outcome of one variable by the values of another. Returns: nothing.
- WITH: compare one variable with another. Returns: nothing.
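To make the idea behind TO and ALL concrete, here is a rough sketch in plain Python of how such a specification expands into a list of variables. It is not how SPSS is implemented; the function `expand` is hypothetical. It assumes that TO follows the order in which variables appear in the file and that keywords are written in upper case, and it ignores the "not previously addressed" qualification on ALL.

```python
def expand(spec, file_order):
    """Expand a specification such as 'a TO c' or 'ALL' into variable names."""
    tokens = spec.split()
    if tokens == ["ALL"]:
        # ALL stands for every variable in the file (simplified).
        return list(file_order)
    if len(tokens) == 3 and tokens[1] == "TO":
        # TO stands for the two named variables and everything between them.
        start = file_order.index(tokens[0])
        end = file_order.index(tokens[2])
        if start > end:
            raise ValueError("the first variable must come before the second")
        return file_order[start:end + 1]
    # Otherwise treat the specification as a plain list of names.
    return tokens


variables = ["name", "sex", "pain_cold", "pain_hot"]
print(expand("sex TO pain_hot", variables))
print(expand("ALL", variables))
```

With four variables the saving is small, but with dozens of questionnaire items a single TO replaces a long list of names.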
Watch out for Module 4, where you will start applying what you have learnt in Modules 1-3. Feel free to contact me with any questions.