Level of measurement

The first thing we did today was test our knowledge of level of measurement. Variables can be categorized according to level of measurement, namely nominal, ordinal, and interval. Using the online polls allowed me to get a sense of how much the class knows about these topics. With practice, you should be able to identify the level of measurement of a variable almost immediately!

Variables and cases - what datasets are made of

All this talk of level of measurement leads us back to the question: what is a variable, anyway? There is often confusion about this. Technically speaking, a variable is a collection of characteristics of people, groups, places, or things.
 
The simplest example of a variable is a coin toss, which can come up heads or tails. It is important to be able to identify what is a variable and what is not. The variable here is the set of possible outcomes of the coin toss, so it consists of two items: heads and tails. The coin ITSELF is not a variable.

To reiterate, variables are characteristics of things.  We call the things cases.  For our purposes, a dataset must have cases in the rows and variables in the columns.  Each case takes up one and only one row.  Each variable occupies one and only one column.
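To make the rows-and-columns layout concrete, here is a minimal sketch in Python rather than Excel. The players, the variable names (name, position, height_cm), and all the values are made up for illustration. Each dictionary is one case, and each key is one variable.

```python
# Hypothetical dataset: each dict is one case (a player),
# and each key is one variable. All values are made up.
players = [
    {"name": "Player A", "position": "guard", "height_cm": 191},
    {"name": "Player B", "position": "center", "height_cm": 211},
    {"name": "Player C", "position": "forward", "height_cm": 203},
]

# The variables are the columns; take them from the first case.
variables = list(players[0].keys())

# Every case should have exactly the same variables, in the same order.
same_columns = all(list(p.keys()) == variables for p in players)

# Print the dataset as a table: one row per case, one column per variable.
print("\t".join(variables))
for p in players:
    print("\t".join(str(p[v]) for v in variables))
```

Notice that every row describes the same type of thing (a player), and every column describes one characteristic of that thing.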

Datasets must contain cases that are all the SAME type of thing. So if the dataset is a list of players on your favorite sports team, the cases are players, and you have to make sure nothing else is mixed in.

If the cases are individual people, then ALL the variables in the dataset must be characteristics that describe individual people.  If the cases are nations, then ALL the variables in the dataset must be characteristics that describe nations. 

Telling a story with data

Telling stories about data is the point of this class.  It is important to know the levels of measurement of your variables, because this information tells you what techniques you might use to tell the story.

We considered the Forbes List of 100 Top Celebrities. One story we might want to tell is about which types of celebrities get paid the most. I can rephrase this as a question: what types of celebrities get paid the most?

What kind of variable is pay? Pay is an interval-level variable, because it is measured precisely in (countable) dollars. The variable "category" tells what type of business the celebrity is in. Category is nominal because there is no inherent order among the different categories.

A common data analysis technique is to take the average of an interval variable within different categories of a nominal variable.  This technique would help us answer the question: what types of celebrities get paid the most?  Specifically, we can look at average pay within categories.
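For readers who want to see the logic outside of Excel, here is a rough sketch of averaging an interval variable within the categories of a nominal variable. The celebrities, categories, and pay figures below are invented for illustration and are NOT taken from the Forbes list. The names pay_millions, pay_by_category, and average_pay are my own choices.

```python
from collections import defaultdict
from statistics import mean

# Made-up cases for illustration only; these are NOT real Forbes figures.
celebrities = [
    {"name": "Celebrity A", "category": "Musician", "pay_millions": 80},
    {"name": "Celebrity B", "category": "Musician", "pay_millions": 60},
    {"name": "Celebrity C", "category": "Athlete", "pay_millions": 50},
    {"name": "Celebrity D", "category": "Athlete", "pay_millions": 30},
    {"name": "Celebrity E", "category": "Actor", "pay_millions": 45},
]

# Step 1: collect the pay values (interval) under each category (nominal).
pay_by_category = defaultdict(list)
for c in celebrities:
    pay_by_category[c["category"]].append(c["pay_millions"])

# Step 2: take the average pay within each category.
average_pay = {cat: mean(vals) for cat, vals in pay_by_category.items()}

# Step 3: list the categories from highest to lowest average pay.
ranked = sorted(average_pay.items(), key=lambda kv: kv[1], reverse=True)
for category, avg in ranked:
    print(f"{category}: {avg}")
```

The category at the top of the ranked list is the answer to the question for this toy data. With the real list, the same three steps apply.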

How do you do this in Excel?  See the instructions for details.


 




