A FREQUENCY DISTRIBUTION is a summary of how often each value or range of values occurs in a dataset. It organizes data into a table or graph that displays the FREQUENCY (count) of each unique CLASS (category or value/interval of values) within the dataset.
Frequency distributions are an important tool for understanding and summarizing data. They condense large datasets into an easily interpretable format and make patterns such as central tendency, dispersion, and skewness easier to identify, which facilitates initial data exploration and analysis. As data summary tools, frequency distributions can also aid decision-making and help communicate findings to stakeholders.
Frequency Tables
A FREQUENCY TABLE is a tabular representation of data that shows the number of occurrences (frequency) of each distinct case (value or category in a dataset). It organizes raw data into a summary format, making it easier to see how often each value appears.
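As a minimal sketch of the idea, the following Python snippet counts how often each distinct value appears in a small list. The data and the function name `frequency_table` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical survey responses (illustrative data, not from the text).
responses = ["agree", "neutral", "agree", "disagree", "agree", "neutral"]


def frequency_table(values):
    """Map each distinct value to the number of times it occurs."""
    # Counter tallies occurrences; dict() keeps first-appearance order.
    return dict(Counter(values))


table = frequency_table(responses)

# Print the table as two columns: value and frequency.
for value, count in table.items():
    print(f"{value:<10}{count}")
```

The frequencies always sum to the number of observations, which is a quick check that no case was dropped.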
While frequency tables are helpful, they do not provide as much information as relative frequency tables. A RELATIVE FREQUENCY TABLE extends the frequency table by including the relative frequency (i.e., the PERCENTAGE DISTRIBUTION), which is the proportion or percentage of the total number of observations that fall into each case. Because each relative frequency is expressed against the total, a relative frequency table shows how large each case is within the dataset as a whole and allows comparison across datasets of different sizes.
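The calculation can be sketched in a few lines of Python: each count is divided by the total number of observations. The grades below and the function name `relative_frequency_table` are assumptions made for the example.

```python
from collections import Counter


def relative_frequency_table(values):
    """Map each distinct value to a (count, proportion) pair."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: (count, count / total) for value, count in counts.items()}


# Hypothetical letter grades for eight students (illustrative data).
grades = ["B", "A", "C", "B", "B", "A", "C", "B"]
rel_table = relative_frequency_table(grades)

# Print value, frequency, and relative frequency as a percentage.
for value, (count, share) in rel_table.items():
    print(f"{value:<6}{count:>5}{share:>8.1%}")
```

Here "B" occurs 4 times out of 8, so its relative frequency is 0.5, or 50%.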
A CUMULATIVE RELATIVE FREQUENCY TABLE shows the cumulative relative frequency (i.e., the CUMULATIVE FREQUENCY DISTRIBUTION), which is the sum of the relative frequencies for all values up to and including the current value. The cumulative relative frequency of the final value should equal 100% (or something close, depending on rounding errors). Because the running total depends on the order of the values, a cumulative relative frequency table is meaningful only for variables whose values can be ordered (i.e., ordinal level or higher); it shows what share of the observations falls at or below a given value.
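A running total of this kind might be computed as in the sketch below, which sorts the distinct values, accumulates their counts, and divides by the total. The household-size data and the function name are hypothetical.

```python
from collections import Counter
from itertools import accumulate


def cumulative_relative_frequencies(values):
    """Return (value, cumulative proportion) pairs in ascending value order."""
    counts = Counter(values)
    ordered = sorted(counts)  # a running total needs ordered values
    total = sum(counts.values())
    # Accumulate the integer counts first, then divide, to limit rounding error.
    running = accumulate(counts[value] for value in ordered)
    return [(value, cum / total) for value, cum in zip(ordered, running)]


# Hypothetical household sizes for ten households (illustrative data).
household_sizes = [1, 2, 2, 3, 1, 4, 2, 3, 2, 1]
cumulative = cumulative_relative_frequencies(household_sizes)

for size, share in cumulative:
    print(f"{size:<4}{share:>7.0%}")
```

In this example 70% of households have two or fewer members, and the last row reaches 100% because it includes every observation.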
Another extension of the frequency table is a CONTINGENCY TABLE (also known as a cross-tabulation or crosstab). A contingency table is used to display the frequency distribution of two or more variables; it shows the relationship between two or more CATEGORICAL VARIABLES (i.e., nominal- or ordinal-level variables) by presenting the frequency of each combination of variable categories.
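For two categorical variables, a contingency table can be sketched by counting how often each pair of categories occurs together. The variables `region` and `owns_car`, their values, and the function name `contingency_table` are assumptions made for illustration.

```python
from collections import Counter


def contingency_table(row_values, col_values):
    """Count each (row category, column category) combination.

    Returns a nested dict: table[row_category][col_category] -> frequency.
    """
    if len(row_values) != len(col_values):
        raise ValueError("both variables need one value per observation")
    pair_counts = Counter(zip(row_values, col_values))
    row_levels = sorted(set(row_values))
    col_levels = sorted(set(col_values))
    # Combinations that never occur are reported as 0 (Counter's default).
    return {
        row: {col: pair_counts[(row, col)] for col in col_levels}
        for row in row_levels
    }


# Hypothetical paired observations: one region and one answer per respondent.
region = ["north", "south", "north", "south", "north", "south"]
owns_car = ["yes", "no", "yes", "yes", "no", "no"]

crosstab = contingency_table(region, owns_car)

for row, cells in crosstab.items():
    print(row, cells)
```

Each cell holds the frequency of one combination, so the cells together sum to the number of observations.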
Charts and Graphs
There are numerous charts and graphs that can be used to display frequency distributions:
- BAR GRAPHS (or bar charts) and HISTOGRAMS are graphical representations of data that use rectangular bars whose lengths represent frequencies; bar graphs typically show the frequency of each distinct value or category, while histograms show the frequency of intervals of values (i.e., BINS) of a numeric variable; both are useful for showing the distribution of a variable
- A PIE CHART is a circular graph divided into slices to illustrate numerical proportions, with the size of each slice proportional to the quantity it represents; pie charts are useful for showing the relative frequencies of different categories within a whole
- A LINE GRAPH (or line chart) is a type of graph that displays information as a series of data points connected by straight line segments; line graphs are often used to show trends over time, and they can also be used to summarize frequency distributions of interval- and ratio-level variables
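The binning behind a histogram can be sketched without a plotting library. The snippet below groups values into equal-width bins and prints one row of `#` marks per bin. The ages, the bin range, and the function name `histogram_counts` are hypothetical, and equal-width binning is only one of several possible choices.

```python
def histogram_counts(values, low, high, bin_count):
    """Count values in equal-width bins covering the range [low, high].

    Each bin includes its left edge and excludes its right edge, except
    the last bin, which also includes `high`.
    """
    width = (high - low) / bin_count
    counts = [0] * bin_count
    for value in values:
        if not low <= value <= high:
            raise ValueError(f"{value} is outside [{low}, {high}]")
        # Clamp so that value == high falls into the last bin.
        index = min(int((value - low) // width), bin_count - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(bin_count + 1)]
    return edges, counts


# Hypothetical ages of ten respondents (illustrative data).
ages = [21, 25, 34, 38, 39, 42, 47, 55, 58, 60]
edges, counts = histogram_counts(ages, low=20, high=60, bin_count=4)

# Text rendering: one row per bin, one '#' per observation.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:>3.0f}-{right:<3.0f} {'#' * count}")
```

The same counts could be passed to a plotting library to draw the bars; changing the number of bins changes the apparent shape of the distribution.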