Sometimes, you need to draw histograms using two variables, with one variable containing the frequencies. Using fictional data, I will show you, in some simple steps, how to proceed to draw a histogram with varying bin widths.

We will reproduce this figure:

I was inspired by this FAQ on the Stata website answered by Nicholas J. Cox:

https://www.stata.com/

First step, import the data:

**# Import the data

*https://www.stata.com/support/faqs/graphics/histograms-with-varying-bin-widths/

*cd C:\Users\jamel\Dropbox\PC\Downloads

import excel "Data.xlsx", sheet("Feuil1") firstrow clear 

Second, you have to sort the data and create the frequency class:

**# Sort the data and create frequency class

gsort +Relativeproductivity

gen freq = 0

replace freq = sum(Employmentshare)

gen Employmentshare_ = 0

replace Employmentshare_ = freq[_n-1]

replace Employmentshare_ = .0 in 1

set obs 13

replace Employmentshare_ = 100 in 13

Thirdly, draw the two-way histogram and export the figure as a PNG file:

**# Draw the histogram

set scheme Cleanplots

summ Relativeproductivity

lab var Relativeproductivity "Relative Productivity"

lab var Employmentshare_ "Employment Share in %"

twoway ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[1] ///
 & Relativeproductivity<=Relativeproductivity[2], ///
 bartype(spanning) bstyle(histogram) yscale(range(0)) ///
 bcolor(blue%20) legend(label(1 "Personal services")) ///
 yline(4.462122)) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[2] & ///
 Relativeproductivity<=Relativeproductivity[3], ///
 bartype(spanning) ///
 bcolor(red%20) legend(label(2 "Agriculture"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[3] & ///
 Relativeproductivity<=Relativeproductivity[4], ///
 bartype(spanning) ///
 bcolor(green%20) legend(label(3 "Wholesale and Retail"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[4] & ///
 Relativeproductivity<=Relativeproductivity[5], ///
 bartype(spanning) ///
 bcolor(pink%20) legend(label(4 "Manufacturing"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[5] & ///
 Relativeproductivity<=Relativeproductivity[6], ///
 bartype(spanning) ///
 bcolor(yellow%20) legend(label(5 "Public services"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[6] & ///
 Relativeproductivity<=Relativeproductivity[7], ///
 bartype(spanning) ///
 bcolor(pink%20) legend(label(6 "Business services"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[7] & ///
 Relativeproductivity<=Relativeproductivity[8], ///
 bartype(spanning) ///
 bcolor(lime%20) legend(label(7 "Transport"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[8] & ///
 Relativeproductivity<=Relativeproductivity[9], ///
 bartype(spanning) ///
 bcolor(gold%20) legend(label(8 "Construction"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[9] & ///
 Relativeproductivity<=Relativeproductivity[10], ///
 bartype(spanning) ///
 bcolor(purple%20) legend(label(9 "Utilities"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[10] & ///
 Relativeproductivity<=Relativeproductivity[11], ///
 bartype(spanning) ///
 bcolor(red%20) legend(label(10 "Financial services"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[11] & ///
 Relativeproductivity<=Relativeproductivity[12], ///
 bartype(spanning) ///
 bcolor(black%60) legend(label(11 "Real estate"))) ///
(bar Relativeproductivity Employmentshare_ ///
 if Relativeproductivity>=Relativeproductivity[12] & ///
 Relativeproductivity<=Relativeproductivity[13], ///
 bartype(spanning) ///
 bcolor(blue%50) legend(label(12 "Mining")) ///
 title("{bf:Productivity and Employment Share in Ghana}"))
 
**# Export the Graph

graph rename twowayhist, replace
graph export twowayhist.png, as(png) width(4000) replace
 
**# End of Program

As we have seen in this blog, it is possible to visualize the usefulness of two-way histograms in some simple steps. The files for replicating the results in this blog are available on my GitHub.


Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.