In statistics and econometrics, cross-sectional data is a type of data collected by observing many subjects (such as individuals, firms, countries, or regions) at a single point or period of time. Analysis of cross-sectional data usually consists of comparing the differences among selected subjects, typically with no regard to differences in time.
For example, if we want to measure current obesity levels in a population, we could draw a sample of 1,000 people randomly from that population (also known as a cross section of that population), measure their weight and height, and calculate what percentage of that sample is categorized as obese. This cross-sectional sample provides us with a snapshot of that population, at that one point in time. Note that we do not know based on one cross-sectional sample if obesity is increasing or decreasing; we can only describe the current proportion.
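The obesity example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is synthetic, the names `bmi` and `obesity_share` are hypothetical, and obesity is taken to mean a body mass index (BMI) of 30 or more, a common convention that the text above does not itself specify.

```python
import random

def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2

def obesity_share(sample, threshold=30.0):
    """Fraction of (weight_kg, height_m) pairs with BMI at or above the threshold.

    The threshold of 30 is an assumption made for this sketch.
    """
    obese = sum(1 for w, h in sample if bmi(w, h) >= threshold)
    return obese / len(sample)

# Synthetic population, invented purely for illustration (not real data).
rng = random.Random(0)
population = [(rng.gauss(80, 15), rng.gauss(1.70, 0.10)) for _ in range(100_000)]

# One cross section: 1,000 people drawn at random, without replacement.
cross_section = rng.sample(population, 1000)
print(f"Share classified as obese: {obesity_share(cross_section):.1%}")
```

The result is a single proportion for a single moment. Nothing in `cross_section` records when a person's weight changed, which is why one such sample cannot show a trend.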
Cross-sectional data differs from time series data, in which the same small-scale or aggregate entity is observed at various points in time. Another type of data, panel data (or longitudinal data), combines cross-sectional and time series aspects: it consists of observations on the same subjects (firms, individuals, etc.) at different times. Panel analysis uses panel data to examine both how variables change over time and how they differ between the selected subjects.
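One way to picture the three kinds of data is as different slices of the same table of observations. The sketch below assumes a hypothetical record layout of `(subject, year, value)`; the firm names and numbers are made up for illustration.

```python
# Hypothetical records: (subject, year, value). All numbers are made up.
# Taken together, the full table is a small panel: the same three
# subjects are each observed in two periods.
observations = [
    ("firm_a", 2020, 10.0), ("firm_a", 2021, 11.5),
    ("firm_b", 2020, 7.2), ("firm_b", 2021, 6.9),
    ("firm_c", 2020, 3.4), ("firm_c", 2021, 4.1),
]

def cross_section(records, year):
    """Many subjects, one period."""
    return {s: v for s, y, v in records if y == year}

def time_series(records, subject):
    """One subject, many periods, ordered by time."""
    return sorted((y, v) for s, y, v in records if s == subject)

print(cross_section(observations, 2020))
print(time_series(observations, "firm_a"))
```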
Variants include pooled cross-sectional data, which combines cross sections drawn independently at different times; unlike panel data, the subjects sampled in one period are generally not the same as those sampled in another. In a rolling cross-section, both the presence of an individual in the sample and the time at which the individual is included in the sample are determined randomly. For example, a political poll may decide to interview 1,000 individuals. It first selects these individuals randomly from the entire population. It then assigns a random date to each individual. This is the date on which the individual will be interviewed, and thus included in the survey.
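The two random steps of a rolling cross-section can be sketched as follows. The function name `rolling_cross_section`, the 30-day field period, the start date, and the population of integer IDs are all assumptions made for illustration, not details from any actual survey design.

```python
import random
from datetime import date, timedelta

def rolling_cross_section(population_ids, sample_size, start, days, seed=None):
    """Return a dict mapping each sampled ID to its interview date.

    Step 1: draw respondents at random, without replacement.
    Step 2: give each respondent a date drawn uniformly from the field period.
    """
    rng = random.Random(seed)
    respondents = rng.sample(population_ids, sample_size)
    return {pid: start + timedelta(days=rng.randrange(days)) for pid in respondents}

# Hypothetical poll: 1,000 interviews spread over a 30-day field period.
schedule = rolling_cross_section(list(range(50_000)), 1000, date(2024, 1, 1), 30, seed=1)
first_day = sum(1 for d in schedule.values() if d == date(2024, 1, 1))
print(f"{len(schedule)} respondents, {first_day} interviewed on the first day")
```

Because the dates are assigned at random, the respondents interviewed on any given day are themselves a small random sample of the population.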
Cross-sectional data can be used in cross-sectional regression, which is regression analysis of cross-sectional data. For example, the consumption expenditures of various individuals in a fixed month could be regressed on their incomes, accumulated wealth levels, and their various demographic features to find out how differences in those features lead to differences in consumers’ behavior.
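A minimal sketch of a cross-sectional regression is given below. To keep it self-contained it uses a single regressor (income) and ordinary least squares written out by hand, whereas the example above would also include wealth and demographic variables. The data are synthetic, and the helper name `ols_simple` and the coefficients used to generate the data are assumptions made for illustration.

```python
import random

def ols_simple(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic cross section: 500 individuals observed in the same month.
# The generating values (200, 0.6, noise s.d. 100) are invented for this sketch.
rng = random.Random(42)
income = [rng.uniform(1000, 6000) for _ in range(500)]
consumption = [200 + 0.6 * inc + rng.gauss(0, 100) for inc in income]

intercept, slope = ols_simple(income, consumption)
print(f"Estimated consumption = {intercept:.1f} + {slope:.3f} * income")
```

The slope estimates how consumption differs between individuals whose incomes differ by one unit in that month. It describes differences across subjects and does not, by itself, describe how any one individual's consumption changes over time.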
References
- Brady, Henry E.; Johnston, Richard (2008). "The Rolling Cross Section and Causal Distribution" (PDF). University of Michigan Press. Retrieved July 13, 2008.
Further reading
- Gujarati, Damodar N.; Porter, Dawn C. (2009). "The Nature and Sources of Data for Economic Analysis". Basic Econometrics (Fifth international ed.). New York: McGraw-Hill. pp. 22–28. ISBN 978-007-127625-2.