Histogram classes are the ranges of values into which data points are grouped in a frequency distribution, or tabulation of raw data. They are also referred to as intervals or bins, and the size of each range is called the class width or bin width. In a histogram, a type of bar graph, these classes are depicted as vertical columns, where the height of each column indicates the number of data points contained in the class range. Typically, the classes are chosen so that the graph clearly shows the statistics or trends in the data.
When there is a large amount of data to display, a histogram is particularly useful for depicting the shape of its distribution. An entire range of data is broken down into intervals and the number of data points falling into each is counted to give the class frequency. The range, or width, of the interval determines the number of histogram classes and influences the shape of the graph.
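The counting step described above can be sketched in a few lines of Python. This is a minimal illustration, assuming equal-width classes that span the data from its minimum to its maximum; the function name `class_frequencies` and the choice to place the maximum value in the last class are assumptions made for this example, not a standard.

```python
from typing import List


def class_frequencies(data: List[float], num_classes: int) -> List[int]:
    """Count the data points falling into each of num_classes equal-width classes."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    if not data:
        raise ValueError("data must not be empty")

    lowest, highest = min(data), max(data)
    width = (highest - lowest) / num_classes  # class width (bin width)

    counts = [0] * num_classes
    for value in data:
        if width == 0:
            # All values are identical, so everything lands in the first class.
            index = 0
        else:
            # Assumption: the maximum value is assigned to the last class
            # rather than opening a new one.
            index = min(int((value - lowest) / width), num_classes - 1)
        counts[index] += 1
    return counts


frequencies = class_frequencies([1, 2, 2, 3, 5, 8, 9, 10], num_classes=3)
print(frequencies)
```

Changing `num_classes` changes the class width and therefore the shape of the resulting counts, which is the effect the paragraph above describes.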
If the interval is too wide, each class takes in too many values and significant detail in the distribution can be hidden. If the interval is too narrow, class frequencies become low and what is actually random variation can be given undue importance. There are several methods for setting an appropriate number of histogram classes for a data set.
According to Sturges' rule, the number of classes should be close to the base-2 logarithm of the number of data points, plus one. Under the Rice rule, the number of classes should be about twice the cube root of the number of data points. Whichever method is used to select the number of histogram classes, several different widths should be tried to test how sensitive the histogram shape is to class size. The correct number of classes is the one that most accurately depicts the distribution of the data.
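The two rules above can be compared with a short Python sketch. Rounding up to a whole number of classes is an assumption made here, since the rules only say the count should be "close to" the computed value; the function names are illustrative.

```python
import math


def sturges_classes(n: int) -> int:
    """Base-2 logarithm of the number of data points, plus one (rounded up)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(math.log2(n)) + 1


def rice_classes(n: int) -> int:
    """Twice the cube root of the number of data points (rounded up)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(2 * n ** (1 / 3))


for n in (100, 1000):
    print(n, sturges_classes(n), rice_classes(n))
```

For 100 data points this gives 8 classes under the first rule and 10 under the second; for 1,000 data points it gives 11 and 20. The widening gap suggests why trying several class counts, rather than relying on one rule, is worthwhile.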
With the proper number of histogram classes for a range of data, the result should be a meaningful graphical representation that can be interpreted clearly. A histogram should show the center and spread of the data, any skewness, or data asymmetry, and outliers, or data points occurring outside the expected range of values. The mode, or most frequently occurring value, should be apparent as the tallest column, as should separate groupings that might indicate more than one mode. Histogram analysis might also indicate faults in the data collection process.
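As a rough illustration of reading the mode and possible multiple modes from class frequencies, the sketch below finds the tallest class and any classes that are strictly higher than both neighbors. The function names and the strict-peak definition are assumptions for this example; with narrow classes, random variation can produce peaks that do not reflect real groupings.

```python
from typing import List


def modal_class(counts: List[int]) -> int:
    """Return the index of the class with the highest frequency."""
    return max(range(len(counts)), key=lambda i: counts[i])


def local_peaks(counts: List[int]) -> List[int]:
    """Return indices of classes strictly taller than both neighbors.

    Classes at either end are compared against an imaginary neighbor
    of height -1, so an end class can still count as a peak.
    """
    peaks = []
    for i, count in enumerate(counts):
        left = counts[i - 1] if i > 0 else -1
        right = counts[i + 1] if i < len(counts) - 1 else -1
        if count > left and count > right:
            peaks.append(i)
    return peaks


counts = [1, 5, 2, 1, 4, 6, 2]
print(modal_class(counts))   # index of the tallest class
print(local_peaks(counts))   # more than one index hints at multiple modes
```

Here the tallest class is at index 5, and peaks at indices 1 and 5 suggest two groupings worth a closer look.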
Long used in finance and the social sciences, histograms are becoming more familiar in the graphical displays of consumer electronics. Digital photography is particularly well suited to their use, with many cameras incorporating a color histogram to indicate white balance and exposure. A digital photography histogram might also use shades of gray, from black to white, as the histogram classes, with the height of each column showing the number of pixels at that brightness.