Summarizing Categorical Variables
Once the type of data, categorical or quantitative, is identified, we can consider graphical representations of the data, which would help Maria understand it.
Frequency tables, pie charts, and bar charts are the most appropriate displays for categorical variables. Below are a frequency table, a pie chart, and a bar chart for data on mental health admissions.
- Frequency Table
- A table containing the counts of how often each category occurs.
Diagnosis | Count | Percent |
Depression | 40835 | 48.5% |
Anxiety | 29388 | 34.9% |
OCD | 5465 | 6.5% |
Abuse | 8513 | 10.1% |
Total | 84201 | 100.0% |
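Each value in the Percent column is the category's count divided by the total. A minimal Python sketch of that arithmetic, using the counts from the table above (the names `counts` and `frequency_table` are illustrative and not part of the original material):

```python
# Counts copied from the frequency table above.
counts = {
    "Depression": 40835,
    "Anxiety": 29388,
    "OCD": 5465,
    "Abuse": 8513,
}


def frequency_table(counts):
    """Return (category, count, percent) rows, followed by a total row."""
    total = sum(counts.values())
    rows = [(name, n, round(100 * n / total, 1)) for name, n in counts.items()]
    rows.append(("Total", total, 100.0))
    return rows


# Print the rows in aligned columns.
for name, n, pct in frequency_table(counts):
    print(f"{name:<12}{n:>8}{pct:>8.1f}%")
```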
- Pie Chart
- Graphical representation for categorical data in which a circle is partitioned into “slices” on the basis of the proportions of each category.
Pitfalls
One pitfall of a pie chart is that if the “slices” show only percentages, the reader cannot tell how many people actually fall into each category.
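As a rough illustration, each slice's angle is its proportion of the total multiplied by 360 degrees, and one way to soften the pitfall above is to label each slice with its count as well as its percentage. The sketch below assumes the admission counts from the frequency table; the function name `pie_slices` and the label format are hypothetical choices, not a prescribed convention.

```python
# Counts copied from the frequency table for mental health admissions.
counts = {
    "Depression": 40835,
    "Anxiety": 29388,
    "OCD": 5465,
    "Abuse": 8513,
}


def pie_slices(counts):
    """Return (category, angle in degrees, label) for each slice."""
    total = sum(counts.values())
    slices = []
    for name, n in counts.items():
        share = n / total            # proportion of all admissions
        angle = 360 * share          # slice size in degrees
        label = f"{name}: {n} ({share:.1%})"  # count and percent together
        slices.append((name, angle, label))
    return slices


for name, angle, label in pie_slices(counts):
    print(f"{label:<28}{angle:6.1f} degrees")
```

Because the label carries the count, a reader can see both the share of the circle and the number of people behind it.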
- Bar Chart
- Graphical representation for categorical data in which vertical (or sometimes horizontal) bars are used to depict the number of experimental units in each category; bars are separated by space.
Note that in the bar chart, the categories of mental health diagnoses (bars) have white spaces in between them. The spaces between the bars signify that this is a categorical variable.
Pie charts tend to work best when there are only a few categories. If a variable has many categories, a pie chart may be more difficult to read. In those cases, a frequency table or bar chart may be more appropriate.
Pitfalls
Bar charts can show either counts or percentages (in which case they are referred to as relative frequency charts). In both cases, readers often assume that differences among the heights of the bars are meaningful, even when they are not.
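The following text-only sketch draws a horizontal bar chart twice, once from counts and once from relative frequencies, assuming the admission counts from the frequency table. The helper `text_bar_chart` and its `width` parameter are made up for illustration. Switching from counts to relative frequencies rescales the axis but leaves the relative bar lengths unchanged, so neither version says anything on its own about whether a difference between bars is meaningful.

```python
# Counts copied from the frequency table for mental health admissions.
counts = {
    "Depression": 40835,
    "Anxiety": 29388,
    "OCD": 5465,
    "Abuse": 8513,
}


def text_bar_chart(values, width=40):
    """Return one line per category; bar length is proportional to the value."""
    largest = max(values.values())
    lines = []
    for name, value in values.items():
        bar = "#" * round(width * value / largest)
        lines.append(f"{name:<12}{bar}")
    return lines


# Relative frequencies: each count divided by the total.
total = sum(counts.values())
relative = {name: n / total for name, n in counts.items()}

count_chart = text_bar_chart(counts)
relative_chart = text_bar_chart(relative)

print("Counts:")
print("\n".join(count_chart))
print("Relative frequencies:")
print("\n".join(relative_chart))
```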