# Histogram

## What is a histogram?

A histogram is a set of rectangles (bars or columns) that shows the distribution of a numerical variable over a scale divided into bins. Each rectangle represents the number of values accumulated in the given bin, that is, in one range of the scale.

A histogram is a type of chart that resembles a bar or a column chart. To distinguish the two visually, it’s customary to leave no gaps (or only very small ones) between the rectangles of a histogram, which reflects the continuous nature of the variable being visualized. The columns have different heights because they correspond to the frequency of each group, meaning how many items fall in a certain range.
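To make the idea of bins and frequencies concrete, here is a minimal Python sketch of the counting behind a histogram. The function name `histogram_counts`, the equal-width bins, and the sample ages are assumptions made for illustration; they do not come from any particular tool.

```python
def histogram_counts(values, bin_start, bin_width, bin_count):
    """Count how many values fall into each of `bin_count` equal-width bins."""
    counts = [0] * bin_count
    upper = bin_start + bin_width * bin_count
    for v in values:
        if v < bin_start or v > upper:
            continue  # value lies outside the binned scale
        # Bins are left-closed: [start, end). A value equal to the upper
        # limit of the scale is counted in the last bin.
        index = min(int((v - bin_start) // bin_width), bin_count - 1)
        counts[index] += 1
    return counts


# Hypothetical sample: ages of ten people, binned as 20-30, 30-40, 40-50, 50-60.
ages = [21, 25, 25, 29, 30, 34, 38, 41, 47, 52]
counts = histogram_counts(ages, bin_start=20, bin_width=10, bin_count=4)
print(counts)  # [4, 3, 2, 1]
```

Each count becomes the height of one bar: the first bin holds four values, so its bar is the tallest.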

Histograms are often the first chart chosen to explore data, because they let us quickly see the shape of the distribution.

[Figure: Coffee consumed on Monday]

## Variations of a histogram

##### The charts below are variations of a histogram. To learn how to make them with Datylon, check out the bar chart user documentation in the Datylon Help Center.

#### Stacked histogram

The form of the histogram stays the same, but each bar is split into groups, which shows how much every group contributes to each bin.

#### Population pyramid

A population pyramid chart is a specific variation of a histogram that shows the distribution of the population divided into age range bins.

#### Chandelier chart

In this type of histogram, the bars hang from a normal distribution curve. This lets us see the difference between the actual and the expected distribution values.
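The comparison behind this chart can be sketched in a few lines of Python. This is only an illustration: it assumes you supply the bin edges, that the expected distribution is normal with a known mean and standard deviation, and it uses hypothetical names (`expected_counts`, `deviations`) and made-up counts.

```python
from statistics import NormalDist


def expected_counts(edges, n, mean, std):
    """Expected number of observations per bin under a normal distribution."""
    dist = NormalDist(mu=mean, sigma=std)
    return [n * (dist.cdf(hi) - dist.cdf(lo)) for lo, hi in zip(edges, edges[1:])]


def deviations(actual, expected):
    """Actual minus expected count for every bin."""
    return [a - e for a, e in zip(actual, expected)]


# Hypothetical example: 1000 observations, bins -3..-1, -1..1 and 1..3.
edges = [-3, -1, 1, 3]
actual = [150, 700, 147]
expected = expected_counts(edges, n=1000, mean=0, std=1)
gaps = deviations(actual, expected)
```

A positive gap means the bin holds more observations than the normal curve predicts, a negative gap means fewer.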

## Alternatives to a histogram

##### Substitute your histogram with the charts below when you want an alternative representation of the data distribution.

#### Column chart

While a histogram is used to represent the distribution of continuous variables, column (and bar) charts usually represent differences between values of discrete categorical variables.

#### Density plot

Density plots are equally helpful for showing a distribution, but instead of the bars of a histogram they use a line. They somewhat resemble smooth peaks and valleys plotted between two axes.

#### One-dimensional heatmap

While histograms and density plots use a spatial representation of the distribution, the one-dimensional heatmap uses color for the same purpose. It is often used in climate communication.

## Pro tips for designing a histogram

##### Learn how to improve the readability and visual appeal of your histogram.

### Bin sizing

Bin size is the most important customizable option of a histogram: depending on the size of the bins, the shape of the histogram can change drastically. It’s best to follow your data’s logic, but there’s also a popular way of choosing the number of bins (and therefore the bin size) called Sturges’ rule. It’s built into many statistical tools, but it has been criticized for over-smoothing histograms.
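As a rough illustration, the rule (its formula is given just below) can be sketched in Python. The function names are hypothetical, and two details are assumptions rather than part of the text: the logarithm is taken as base 10, and the result is rounded up to a whole number of bins.

```python
import math


def sturges_bin_count(n):
    """Number of bins suggested by Sturges' rule for n observations."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # 3.322 approximates 1 / log10(2), so this equals 1 + log2(n).
    # Rounding up is an assumption; the rule itself gives a real number.
    return math.ceil(1 + 3.322 * math.log10(n))


def sturges_bin_width(values):
    """Equal bin width that spreads the data range over the suggested bins."""
    return (max(values) - min(values)) / sturges_bin_count(len(values))


print(sturges_bin_count(100))   # 8
print(sturges_bin_count(1000))  # 11
```

Going from 100 to 1000 observations adds only three bins, which illustrates why the rule tends to produce few, wide bins for large datasets.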

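The formula is the following: K = 1 + 3.322 log(n),

where K is the number of bins, n is the number of observations in the dataset, and log is the base-10 logarithm. Since 3.322 is approximately 1 / log(2), the rule is equivalent to K = 1 + log2(n).

### Labeling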

In most cases, a histogram is used to see the form of the distribution and the overall pattern, so there’s no need to label every bar in detail. A regular label on each of the two axes should be enough.

### Coloring

Coloring of histograms follows the general rule of using color in data visualization: use color only if it communicates additional information. For any basic histogram, one color should be enough.

### Highlighting

To draw attention to the most important bin(s) of a histogram, highlight their bars and give all the others a neutral color. Our brain is programmed to notice deviations instantly, whether they are changes in size, movement, or color, so a highlighted bin catches the reader’s eye immediately.
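One simple way to apply this is to build a list with one color per bar. The Python sketch below is an assumption-laden illustration: the function name `highlight_colors` and both hex colors are made up, and you choose which bins count as important.

```python
def highlight_colors(bar_count, important, highlight="#d1495b", neutral="#c9c9c9"):
    """Return one color per bar: an accent for important bins, neutral elsewhere."""
    important = set(important)
    return [highlight if i in important else neutral for i in range(bar_count)]


# Hypothetical example: five bins, with the third one (index 2) highlighted.
colors = highlight_colors(5, important=[2])
```

Many plotting libraries accept such a list of per-bar colors, for example the `color` argument of matplotlib's `bar`.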