Glossary

Understanding some basic terminology will help you search for information and better understand the type of data you are using.

See below for a quick reference to commonly used terms:

Data typically refers to raw data that needs to be manipulated using software. An example of data used in this context is data available through a data archive such as ICPSR. Data can be quantitative, qualitative, spatial, etc. The difference between data and statistics can be confusing because, in everyday language, the two terms are often used interchangeably, and the term data can also have a very broad meaning.

Codebook provides information on the structure, content, and layout of a data file as well as methodology, questionnaire(s) and any other relevant information about the data set. “Readme” files or data dictionaries are related terms and provide information on the data file(s) content. These are all forms of data documentation.

Data Archive preserves and makes accessible research data.  Some examples are ICPSR and CIESIN.

Microdata are data at the lowest level of observation, such as individual answers to questions. For example, the U.S. Census Bureau’s Public-Use Microdata Samples (PUMS files) are data sets of individual housing unit or person responses to census questions.

Numeric/Quantitative Data is a type of data made up of numbers.

Primary Data is data you collect directly in your own research study, using instruments such as surveys, observations, and so on.

Qualitative Data is data that describes a property or attribute.  Examples of qualitative data are interviews, case studies, and comments collected on a questionnaire.

Secondary Data is data from a research study conducted by someone else. Usually, when you are asked to locate statistics on a topic, you are using secondary data. An example of secondary data is statistics from the Census of Population and Housing.

Spatial Data is geographic information that is used for analysis with GIS software like ArcGIS.  Spatial data is also referred to as geospatial or GIS data.

Summary Data is data that has been aggregated or summarized. The underlying data has been analyzed and processed to produce information in an easy-to-read format such as tables and graphs. Tables published in the Statistical Abstract of the United States are one example.
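
The relationship between microdata and summary data can be illustrated with a short sketch. The example below is only an illustration, not how any particular agency produces its tables: the records, the field names (person_id, state, age), and the summarize_by_state helper are all hypothetical and the values are invented. It takes a handful of individual-level records and aggregates them into a small summary table of counts and mean ages by state.

```python
from collections import defaultdict

# Hypothetical microdata: one record per person.
# All values are invented for illustration only.
microdata = [
    {"person_id": 1, "state": "OH", "age": 34},
    {"person_id": 2, "state": "OH", "age": 58},
    {"person_id": 3, "state": "MI", "age": 21},
    {"person_id": 4, "state": "MI", "age": 45},
    {"person_id": 5, "state": "MI", "age": 30},
]


def summarize_by_state(records):
    """Aggregate person-level records into one summary row per state."""
    ages_by_state = defaultdict(list)
    for record in records:
        ages_by_state[record["state"]].append(record["age"])
    return {
        state: {"count": len(ages), "mean_age": sum(ages) / len(ages)}
        for state, ages in ages_by_state.items()
    }


# Summary data: the individual responses are no longer visible,
# only the aggregated figures for each state.
summary = summarize_by_state(microdata)
for state, row in sorted(summary.items()):
    print(state, row["count"], row["mean_age"])
```

Note that the summary can be produced from the microdata, but the individual records cannot be recovered from the summary. This is why researchers who need to run their own analyses usually look for microdata rather than published tables.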