 # Categorical scatter plot

## What is a categorical scatter plot?

A categorical scatter plot is a grid of different-sized circles.
Each circle represents the intersection of two categories. In this type of chart, two categorical variables are usually placed along the x- and y-axes. In an alternative arrangement, one axis carries a categorical variable while the other axis is numerical.
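As a minimal sketch of how the data behind such a grid could be prepared, the snippet below counts how often each pair of categories occurs and builds one circle record per non-empty grid cell. The survey responses, the category names, and the `grid_counts` helper are hypothetical and only serve as an illustration; they are not part of any Datylon API.

```python
from collections import Counter

# Hypothetical survey responses: (age group, preferred channel) per respondent.
responses = [
    ("18-29", "Email"),
    ("18-29", "Chat"),
    ("18-29", "Chat"),
    ("30-44", "Email"),
    ("30-44", "Phone"),
    ("45+", "Phone"),
    ("45+", "Phone"),
    ("45+", "Email"),
]


def grid_counts(pairs):
    """Count how often each (x category, y category) pair occurs."""
    return Counter(pairs)


counts = grid_counts(responses)

# Category order along each axis (first-seen order here; reorder as needed).
x_categories = list(dict.fromkeys(x for x, _ in responses))
y_categories = list(dict.fromkeys(y for _, y in responses))

# One circle per non-empty grid cell: the position is the category pair,
# and the value (to be mapped to circle size) is the count.
circles = [
    {"x": x, "y": y, "value": counts[(x, y)]}
    for y in y_categories
    for x in x_categories
    if counts[(x, y)] > 0
]
```

The resulting `circles` list can be handed to whatever charting tool is in use, with `value` mapped to circle size.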

Categorical scatter plots can be used both to compare values and to find patterns in the data. They are especially useful for visualizing survey results, particularly for multiple-choice questions or questions answered by filling in a table.

## Variations of categorical scatter plots

##### The charts below are variations of a categorical scatter plot. To learn how to make them with Datylon, check out the scatter plot chart user documentation in the Datylon Help Center.

#### Colored categorical scatter plot

In addition to circle size, color can introduce another dimension, making it possible to visualize an extra variable. Another variation of the colored categorical scatter plot is a grid of same-sized circles, where color alone represents the numeric variable.

#### Strip plot

A strip plot is a chart that pairs a categorical variable with a numerical variable. Each categorical value is represented by a sequence of circles placed along the numerical axis. Because the circles can overlap, they are usually made semi-transparent for better readability.

#### Jitter plot

The structure of a jitter plot is the same as that of a strip plot. The difference is in how overlaps are handled. In a jitter plot, the circles are randomly dispersed along the categorical axis. This separates the circles and makes the chart easier to read.
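The random dispersion described above can be sketched as follows. This is only an illustration under stated assumptions: the `jitter_positions` helper, its `width` and `seed` parameters, and the sample values are hypothetical, and categories are assumed to sit at integer positions 1.0 apart on the categorical axis. Only the categorical position is shifted; the numerical value stays untouched.

```python
import random


def jitter_positions(category_index, values, width=0.3, seed=0):
    """Return (categorical position, value) pairs with random offsets.

    width is the maximum offset on either side of the category's center,
    in axis units where neighboring categories are 1.0 apart.
    """
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    return [
        (category_index + rng.uniform(-width, width), value)
        for value in values
    ]


# Hypothetical data: measurements with identical or near-identical values
# that would overlap in a plain strip plot.
points = jitter_positions(category_index=2, values=[5.0, 5.0, 5.1])
```

Keeping `width` well below 0.5 prevents circles from drifting into the neighboring category's column.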

## Alternatives to a categorical scatter plot

##### Substitute your categorical scatter plot with any of the charts below when you want an alternative that allows you to compare categories.

#### Heatmap

This is the closest alternative. The colored categorical scatter plot with same-sized circles and the heatmap can be considered the same chart: the only difference is that heatmaps commonly use a grid of rectangles, while circles are more typical for categorical scatter plots.

#### Dot plot

A dot plot can be considered a simplified strip plot. It also has a numerical and a categorical axis, but fewer values are assigned to each category: usually one to five icons per category. Another categorical dimension is often added by coloring the icons.

#### Icon chart

This is a good icon-based alternative if every element should be based on one category and one numerical value. It is also a very space-efficient chart. While the categorical scatter plot and most of its alternatives need two axes, the icon chart is based on only one.

## Pro tips for designing a categorical scatter plot

##### Learn how to improve the readability and visual appeal of your chart.

### Sorting

A grid with a large number of icons can be overwhelming for the reader. To make reading feel natural, arrange the categories left to right and top to bottom, as this is the reading direction in most languages. If you’re designing a chart for another language, check its reading patterns and sort the categories accordingly.

### Accessibility & coloring

For most categorical scatter plots, one color is enough. For strip plots, you might need to decrease the opacity so that overlapping circles remain visible. If an additional dimension is added via color, use a single-hue continuous palette for numerical variables and an accessible color palette or different icons for categorical variables. Avoid palettes that rely on telling red and green apart.

### Highlighting

To draw attention to the most important categories of a categorical scatter plot, highlight certain circles by adding an outline or filling them with a specific color. Our brain is programmed to notice deviations instantly, so a highlighted circle will catch the reader’s eye immediately.

### Labeling

The optimal labeling option for categorical scatter plots is category labels along the axes plus a data label for each data point (circle). In some cases the data labels are not needed, for example when the chart only gives an overview of a vast dataset. Labeling every circle would then be redundant.

### Size of the circles

The area of each circle should represent the corresponding value, so the size scale should start at zero: a value of zero corresponds to a circle of zero area. This prevents distortion of the data. Another thing to consider is whether the circles may overlap. Generally, it is better to avoid overlap, but if the range of values is very large, it makes sense to scale the circles up so that the differences in size remain visible.
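Because the area of a circle grows with the square of its radius, mapping values to area means the radius should scale with the square root of the value. A minimal sketch, assuming non-negative values and a chosen maximum radius (the `circle_radius` helper and its parameters are hypothetical names used for illustration):

```python
import math


def circle_radius(value, max_value, max_radius):
    """Radius for a circle whose area is proportional to value.

    A value of 0 maps to radius 0 and max_value maps to max_radius.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("expects value >= 0 and max_value > 0")
    return max_radius * math.sqrt(value / max_value)


# Hypothetical values: the second is four times the first.
small = circle_radius(25, max_value=100, max_radius=20)   # 10.0
large = circle_radius(100, max_value=100, max_radius=20)  # 20.0
```

Here a value four times larger gets a radius only twice as large, which makes its area four times larger. Scaling the radius linearly with the value instead would exaggerate the differences.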