Kurtosis Illustrated

Kurtosis tells you virtually nothing about the shape of the peak – its only unambiguous interpretation is in terms of tail extremity; i.e., either existing outliers (for the sample kurtosis) or propensity to produce outliers (for the kurtosis of a probability distribution).

Peter H. Westfall

This post is inspired by Eran Raviv's nice blog. It gives a graphical illustration of the kurtosis formula, which essentially measures the outliers in a distribution: the kurtosis reflects how thick or thin a distribution's tails are.

I start with the first standardized moment:

\tilde{\mu_{1}} = \frac{\mu _{1}}{\sigma^{1}} = \frac{E[(X-\mu)^1]}{(E[(X-\mu)^2])^{1/2}}

By virtue of the properties of the expectation operator recalled in this post, we have:

\tilde{\mu_{1}} = \frac{\mu _{1}}{\sigma^{1}} = \frac{\mu-\mu}{\sqrt{E[(X-\mu)^2]}}=0

Following the same construction, the kurtosis is the fourth standardized moment:

\tilde{\mu_{4}} = \frac{\mu _{4}}{\sigma^{4}} = \frac{E[(X-\mu)^4]}{(E[(X-\mu)^{2}])^{4/2}}

Before moving to the graphical illustrations, I recall the formula for the sample kurtosis:

g_{2}=\frac{m_{4}}{m_{2}^{2}}= \frac{ \frac{1}{n}\sum_{i=1}^{n}(x_{i}-\bar{x})^{4} }{ [\frac{1}{n}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}]^{2} }
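The sample formula can be sketched in a few lines of Python. This is only an illustration of the definition above; the function name `sample_kurtosis` and the toy data are my own choices and are not part of the original material.

```python
def sample_kurtosis(xs):
    """Return g2 = m4 / m2**2, using the 1/n central moments."""
    n = len(xs)
    mean = sum(xs) / n
    # Second and fourth sample central moments (divided by n, not n - 1).
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2


# Toy data: deviations from the mean are -2, -1, 0, 1, 2,
# so m2 = 10/5 = 2 and m4 = 34/5 = 6.8, giving g2 = 6.8 / 4 = 1.7.
print(sample_kurtosis([1, 2, 3, 4, 5]))
```

Because the deviations are raised to the fourth power, a single extreme observation dominates the numerator, which is why the statistic reacts so strongly to outliers.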

The kurtosis of a random variable that follows a Normal distribution is 3 (for the standard normal, the ratio between the fourth central moment and the square of the variance is 3/(1^2)=3). Such a distribution is called mesokurtic. Figure 1 shows a distribution with the usual properties of a normal distribution:

Figure 1. Normal distribution.

The kurtosis of a random variable that follows a Uniform distribution is below 3 (for the Uniform on [-5,5] used here, the ratio between the fourth central moment and the square of the variance is 125/(25/3)^2=1.8). Such a distribution is called platykurtic. Figure 2 shows a distribution with thinner tails than a normal distribution:

Figure 2. Uniform distribution.

The kurtosis of a random variable that follows a Laplace distribution is above 3 (for the Laplace with location 0 and scale 1 used here, the ratio between the fourth central moment and the square of the variance is 24/(2^2)=6). Such a distribution is called leptokurtic. Figure 3 shows a distribution with thicker tails than a normal distribution:

Figure 3. Laplace distribution.
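The three theoretical values can be checked with exact arithmetic. The sketch below assumes the standard closed-form central moments: variance σ² and fourth moment 3σ⁴ for the normal, (b-a)²/12 and (b-a)⁴/80 for the uniform on [a,b], and 2b² and 24b⁴ for the Laplace with scale b. The function names are illustrative.

```python
from fractions import Fraction


def normal_kurtosis(sigma=Fraction(1)):
    # Fourth central moment 3*sigma^4 over squared variance sigma^4.
    return (3 * sigma ** 4) / (sigma ** 2) ** 2


def uniform_kurtosis(a, b):
    width = Fraction(b) - Fraction(a)
    mu4 = width ** 4 / 80   # fourth central moment
    var = width ** 2 / 12   # variance
    return mu4 / var ** 2


def laplace_kurtosis(scale=Fraction(1)):
    # Fourth central moment 24*b^4 over squared variance (2*b^2)^2.
    return (24 * scale ** 4) / (2 * scale ** 2) ** 2


print(normal_kurtosis())         # 3
print(uniform_kurtosis(-5, 5))   # 9/5, i.e. 1.8
print(laplace_kurtosis())        # 6
```

In each case the scale parameter cancels out, so the kurtosis does not depend on the spread of the distribution, only on its shape.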

We can combine figures 1 to 3 to compare the kurtosis in figure 4:

Figure 4. Kurtosis illustrated.

We can superimpose the three kernel density estimates to get a better view of the relative thickness of each distribution's tails:

Figure 5. Kernel density estimations.

The Stata code used to produce the graphs is reproduced below:

* Illustrate the Kurtosis with some graphs
*------------------------------------------------------------------------

version 15.1
set more off
cd "C:\Users\EconJamel\Latex\L1-STAT-I-CM\2019-2020\doc-supplementaires"
												// Set the directory
capture log close                               
log using kurtosis.smcl, replace

// Apply the s2color scheme

set scheme s2color

// Generate random variables

set obs 10000

// Normal variable

capture gen normal = rnormal()

sum normal, detail

// Uniform variable

capture gen uniform = runiform(-5,5)

sum uniform, detail

// Laplace variable

capture gen laplace = rlaplace(0,1)

sum laplace, detail

/* histogram normal, frequency title("Normal, K=3") ///
subtitle(mesokurtic) kdensity
 */

histogram normal, title("Normal, K=3") ///
subtitle(mesokurtic) kdensity

capture graph rename normal, replace
capture graph export normal.png, replace

/* histogram uniform, frequency title("Uniform, K=1.8") ///
subtitle(platykurtic) kdensity
 */

histogram uniform, title("Uniform, K=1.8") ///
subtitle(platykurtic) kdensity

capture graph rename uniform, replace
capture graph export uniform.png, replace

/* histogram laplace, frequency title("Laplace, K=6") ///
subtitle(leptokurtic) kdensity
 */
 
histogram laplace, title("Laplace, K=6") ///
subtitle(leptokurtic) kdensity

capture graph rename laplace, replace
capture graph export laplace.png, replace

graph combine normal uniform laplace, ycommon xcommon imargin(zero) ///
title(Kurtosis) ///
subtitle("Illustration with Normal, Uniform and Laplace distributions")

capture graph rename kurtosis, replace
capture graph export kurtosis.png, replace

twoway (kdensity normal) (kdensity uniform) (kdensity laplace), ///
title(Kurtosis) ///
subtitle("Illustration with Normal, Uniform and Laplace distr.")

capture graph rename kurtosis_gathered, replace
capture graph export kurtosis_gathered.png, replace

// Save the data
save ///
"C:\Users\EconJamel\Latex\L1-STAT-I-CM\2019-2020\doc-supplementaires\kurtosis.dta", ///
replace

log close
exit
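For readers without Stata, the simulation can be roughly cross-checked in Python with the standard library only. This is a sketch, not a translation of the script above: the seed and the sample size are arbitrary choices of mine, and the Laplace draws are built as the difference of two independent exponential draws, which follows a Laplace distribution with location 0 and scale 1.

```python
import random


def sample_kurtosis(xs):
    """Return m4 / m2**2 using the 1/n central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2


random.seed(12345)  # arbitrary seed, only for reproducibility
n = 100_000         # arbitrary sample size, larger than in the Stata script

normal = [random.gauss(0, 1) for _ in range(n)]
uniform = [random.uniform(-5, 5) for _ in range(n)]
# Difference of two independent Exp(1) draws is Laplace(0, 1).
laplace = [random.expovariate(1) - random.expovariate(1) for _ in range(n)]

for name, draws in [("normal", normal), ("uniform", uniform), ("laplace", laplace)]:
    print(name, round(sample_kurtosis(draws), 2))
```

The printed values should land close to 3, 1.8 and 6 respectively. The Laplace estimate is the noisiest of the three, since heavy tails make the fourth moment harder to estimate.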

Description
-----------

This file aims at illustrating the kurtosis formula with striking graphs.


Burgernomics: Maps

“Two important characteristics of maps should be noticed. A map is not the territory it represents, but, if correct, it has a similar structure to the territory, which accounts for its usefulness.”

Alfred Korzybski

After a first pedagogical post on Burgernomics and a second one on the R code that replicates the calculations made by The Economist, this is the final post of this series on Burgernomics. The two following maps have been made with Excel 2019.

Figure 1. Raw index.
Figure 2. GDP-adjusted index.

In figure 2, according to the GDP-adjusted index as of July 2020, the most undervalued currencies are the Hong Kong dollar, the Russian rouble, the South African rand, the Taiwanese dollar and the Turkish lira (undervaluation above 38 percent for each of these currencies). The Brazilian real and the Thai baht are the most overvalued currencies, with an overvaluation above 19 percent.


Burgernomics: R codes and datasets

“Data! Data! Data!” he cried impatiently. “I can’t make bricks without clay.”

Sherlock Holmes.
The Adventure of the Copper Beeches, Sir Arthur Conan Doyle.

In my previous post, I tried to explain Burgernomics in the simplest way. Here I reproduce the R code and the graphs used for the Big Mac Index made by The Economist.

For the July 2020 update, there are two source files (Comma-separated values format):

big-mac-source-data.csv

big-mac-historical-source-data.csv

R Script for the results

# Generate data for the Big Mac Index

setwd("C:/Users/EconJamel/R/big-mac-data-2020-07")

library('tidyverse')
library('data.table')
library(dplyr)
library(magrittr)
library(ggplot2)

big_mac_countries = c('ARG', 'AUS', 'BRA', 'GBR', 'CAN', 'CHL', 'CHN', 'CZE', 'DNK',
                      'EGY', 'HKG', 'HUN', 'IDN', 'ISR', 'JPN', 'MYS', 'MEX', 'NZL',
                      'NOR', 'PER', 'PHL', 'POL', 'RUS', 'SAU', 'SGP', 'ZAF', 'KOR',
                      'SWE', 'CHE', 'TWN', 'THA', 'TUR', 'ARE', 'USA', 'COL', 'CRI',
                      'PAK', 'LKA', 'UKR', 'URY', 'IND', 'VNM', 'GTM', 'HND', # Venezuela removed
                      'NIC', 'AZE', 'BHR', 'HRV', 'JOR', 'KWT', 'LBN', 'MDA', 'OMN',
                      'QAT', 'ROU', 'EUZ')
base_currencies = c('USD', 'EUR', 'GBP', 'JPY', 'CNY')

# Data importation.

big_mac_data = fread('./source-data/big-mac-source-data.csv', na.strings = '#N/A',
                     # sort by date and then by country name, for easy reading;
                     # index on currency_code for faster joining
                     key = 'date,name', index = 'currency_code') %>%
  # remove lines where the local price is missing
  .[!is.na(local_price)]
tail(big_mac_data)

latest_date = big_mac_data$date %>% max
latest_date

# Raw index.

big_mac_data[, dollar_price := local_price / dollar_ex]
tail(big_mac_data)

big_mac_index = big_mac_data[
    !is.na(dollar_price) & iso_a3 %in% big_mac_countries
    ,.(date, iso_a3, currency_code, name, local_price, dollar_ex, dollar_price)]

for(currency in base_currencies) {
  big_mac_index[
    ,
    (currency) := dollar_price / .SD[currency_code == currency]$dollar_price - 1,
    by=date
  ]
}
big_mac_index[, (base_currencies) := lapply(.SD, round, 3L), .SDcols=base_currencies]
tail(big_mac_index)

to_plot = big_mac_index[date == latest_date]
to_plot$name = factor(to_plot$name, levels=to_plot$name[order(to_plot$USD)])
ggplot(to_plot[, over := USD > 0], aes(x=name, y=USD, color=over)) +
  geom_hline(yintercept = 0) +
  geom_linerange(aes(ymin=0, ymax=USD)) +
  geom_point() +
  coord_flip()

fwrite(big_mac_index, './output-data/big-mac-raw-index.csv')

Figure 1. Raw Index.

# GDP-adjusted index.

big_mac_gdp_data = big_mac_data[GDP_dollar > 0]

regression_countries = c('ARG', 'AUS', 'BRA', 'GBR', 'CAN', 'CHL', 'CHN', 'CZE', 'DNK',
                         'EGY', 'EUZ', 'HKG', 'HUN', 'IDN', 'ISR', 'JPN', 'MYS', 'MEX',
                         'NZL', 'NOR', 'PER', 'PHL', 'POL', 'RUS', 'SAU', 'SGP', 'ZAF',
                         'KOR', 'SWE', 'CHE', 'TWN', 'THA', 'TUR', 'USA', 'COL', 'PAK',
                         'IND', 'AUT', 'BEL', 'NLD', 'FIN', 'FRA', 'DEU', 'IRL', 'ITA',
                         'PRT', 'ESP', 'GRC', 'EST')
big_mac_gdp_data = big_mac_gdp_data[iso_a3 %in% regression_countries]

head(big_mac_gdp_data)

ggplot(big_mac_gdp_data, aes(x=GDP_dollar, y=dollar_price)) +
  facet_wrap(~date) +
  geom_smooth(method = lm, color='tomato') +
  geom_point(alpha=0.5)

Figure 2. Regression between Big Mac Dollar Price and GDP per person.

big_mac_gdp_data[,adj_price := lm(dollar_price ~ GDP_dollar)$fitted.values, by=date]
tail(big_mac_gdp_data)

ggplot(big_mac_gdp_data, aes(x=GDP_dollar, y=dollar_price)) +
  facet_wrap(~date) +
  geom_smooth(method = lm, color='tomato') +
  geom_linerange(aes(ymin=dollar_price, ymax=adj_price), color='royalblue', alpha=0.3) +
  geom_point(alpha=0.1) +
  geom_point(aes(y=adj_price), color='royalblue', alpha=0.5)

Figure 3. Fitted values.

big_mac_adj_index = big_mac_gdp_data[
  !is.na(dollar_price) & iso_a3 %in% big_mac_countries
  ,.(date, iso_a3, currency_code, name, local_price, dollar_ex, dollar_price, GDP_dollar, adj_price)]

for(currency in base_currencies) {
  big_mac_adj_index[
    ,
    (currency) := (dollar_price / adj_price) /
      .SD[currency_code == currency, dollar_price / adj_price] - 1,
    by=date
  ]
}
big_mac_adj_index[, (base_currencies) := lapply(.SD, round, 3L), .SDcols=base_currencies]
tail(big_mac_adj_index)

to_plot = big_mac_adj_index[date == latest_date]
to_plot$name = factor(to_plot$name, levels=to_plot$name[order(to_plot$USD)])
ggplot(to_plot[, over := USD > 0], aes(x=name, y=USD, color=over)) +
  geom_hline(yintercept = 0) +
  geom_linerange(aes(ymin=0, ymax=USD)) +
  geom_point() +
  coord_flip()

Figure 4. GDP-adjusted index.

fwrite(big_mac_adj_index, './output-data/big-mac-adjusted-index.csv')

big_mac_full_index = merge(big_mac_index, big_mac_adj_index,
  by=c('date', 'iso_a3', 'currency_code', 'name', 'local_price', 'dollar_ex', 'dollar_price'),
  suffixes=c('_raw', '_adjusted'),
  all.x=TRUE
  )

fwrite(big_mac_full_index, './output-data/big-mac-full-index.csv')

R Script for exporting to Excel

# This script generates the Excel file for download
install.packages("devtools")
devtools::install_github("kassambara/r2excel")
library('r2excel')
library('magrittr')
library('data.table')

setwd("C:/Users/EconJamel/R/big-mac-data-2020-07")

data = fread('./output-data/big-mac-full-index.csv') %>%
    .[, .(
        Country = name,
        iso_a3,
        currency_code,
        local_price,
        dollar_ex,
        dollar_price,
        dollar_ppp = dollar_ex * dollar_price / .SD[currency_code == 'USD']$dollar_price,
        GDP_dollar,
        dollar_valuation = USD_raw * 100,
        euro_valuation = EUR_raw * 100,
        sterling_valuation = GBP_raw * 100,
        yen_valuation = JPY_raw * 100,
        yuan_valuation = CNY_raw * 100,
        dollar_adj_valuation = USD_adjusted * 100,
        euro_adj_valuation = EUR_adjusted * 100,
        sterling_adj_valuation = GBP_adjusted * 100,
        yen_adj_valuation = JPY_adjusted * 100,
        yuan_adj_valuation = CNY_adjusted * 100
    ), by=date]

dates = data$date %>% unique

wb = createWorkbook(type='xls')

for(sheetDate in sort(dates, decreasing = T)) {
    dateStr = sheetDate %>% strftime(format='%b%Y')
    sheet = createSheet(wb, sheetName = dateStr)
    xlsx.addTable(wb, sheet, data[date == sheetDate, -1], row.names=FALSE, startCol=1)
}

saveWorkbook(wb, paste0('./output-data/big-mac-',max(dates),'.xls'))

I get the following file, which gathers all the previous results:

big-mac-2020-07-01.xls


Burgernomics: clearly explained!

The rate of exchange between two countries is primarily determined by the quotient between the internal purchasing power against goods of the money of each country. The general inflation which has taken place during the war has lowered this purchasing power in all countries, though in a very different degree, and the rates of exchanges should accordingly be expected to deviate from their old parity in proportion to the inflation of each country. At every moment the real parity between two countries is represented by this quotient between the purchasing power of the money in the one country and the other. I propose to call this parity “the purchasing power parity.”

Gustav Cassel

In this pedagogical post, I will explain how to use the interactive currency comparison tool made by The Economist. The Big Mac Index is a very useful tool to make exchange rate economics more digestible.

A simple example

The main idea behind this index was proposed by Gustav Cassel during World War I: the exchange rate between two countries (A and B) moves so as to preserve the purchasing power of a currency. If price inflation is higher in country A than in country B, the currency of country A will depreciate in order to preserve the international purchasing power of country B.

Indeed, in the absence of exchange rate movements, the international purchasing power of country B would be reduced. Why? Let us start with the simplest situation: a single good with the same price is consumed in both countries (A and B), the exchange rate between A and B is equal to unity (1 currency unit of country A = 1 currency unit of country B), and there is no trade restriction.

Let the price level of this unique good be equal to 100. According to Cassel and his purchasing power parity theory, the exchange rate of currency B per unit of currency A is derived as follows:

\begin{aligned} P_{A}&=100\\ P_{B}&=100\\ E_{B/A}^{ppp}&=\frac{P_{B}}{P_{A}}\\ E_{B/A}^{ppp}&=1 \end{aligned}

The international purchasing power of currency B is equal to the amount of this unique good that residents of country B can consume when they move to country A. Thus, it is equivalent to the value of this unique good in country B converted into the currency of country A. The international purchasing power of currency B is:

\begin{aligned} E_{B/A}^{ppp}&=\frac{P_{B}}{P_{A}}\\ E_{B/A}^{ppp}\cdot P_{A}&=P_{B}\\ P_{A}&=\frac{P_{B}}{E_{B/A}^{ppp}} \end{aligned}

Consequently, a resident of country B can consume 100 (1 good) at home (domestic purchasing power) and 100 (1 good) in country A (international purchasing power).

After an inflation of 25 percent in country A, the price of this unique good becomes equal to 125. If the purchasing power parity theory is valid, the currency of country A will depreciate to preserve the international purchasing power of country B. The exchange rate, expressed in units of currency B per unit of currency A, will fall by 20 percent:

\begin{aligned} P_{A}&=\frac{P_{B}}{E_{B/A}^{ppp}}\\ 125&=\frac{100}{E_{B/A}^{ppp}}\\ E_{B/A}^{ppp}&=\frac{100}{125}\\ E_{B/A}^{ppp}&=0.8\\ \end{aligned}

Now, 1 currency unit of country A = 0.8 currency unit of country B. At this depreciated exchange rate, a resident of country B can still buy 1 good at home for 100 (domestic purchasing power) and, after converting 100 units of currency B into 100/0.8=125 units of currency A, 1 good in country A (international purchasing power). Purchasing power parity is preserved.

However, in the absence of any exchange rate movement (in red), a resident of country B can buy 1 good at home for 100 but, after converting into 100/1=100 units of currency A, less than 1 good in country A, where the price is 125 after an inflation of 25 percent. Purchasing power parity is no longer preserved, as shown below:

\begin{aligned} \color{#FF0000} E_{B/A}^{\sout{ppp}}& \color{#FF0000}=1\\ P_{A}&=\frac{P_{B}}{\color{#FF0000} E_{B/A}^{\sout{ppp}}}\\ 125&\neq\frac{100}{\color{#FF0000} E_{B/A}^{\sout{ppp}}}\\ \color{#FF0000} E_{B/A}^{\sout{ppp}}&\neq\frac{100}{125}\\ \end{aligned}
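The two-country example can be replayed numerically. The sketch below is only an illustration of the arithmetic above; the function names `ppp_rate` and `goods_bought_in_a` are hypothetical helpers introduced for this purpose.

```python
def ppp_rate(price_b, price_a):
    """PPP exchange rate in units of currency B per unit of currency A."""
    return price_b / price_a


def goods_bought_in_a(budget_b, rate_b_per_a, price_a):
    """Number of goods a resident of B can buy in A with a budget in currency B."""
    budget_in_a = budget_b / rate_b_per_a  # convert currency B into currency A
    return budget_in_a / price_a


# Before inflation: same price in both countries, rate equal to 1.
print(ppp_rate(100, 100))

# After 25 percent inflation in A, the PPP rate falls to 100/125.
new_rate = ppp_rate(100, 125)
print(new_rate)

# With the PPP rate, the resident of B still buys about one good in A...
print(goods_bought_in_a(100, new_rate, 125))
# ...whereas with an unchanged rate of 1 the same budget buys only part of one.
print(goods_bought_in_a(100, 1.0, 125))
```

The last two lines contrast the two scenarios of the text: the depreciation of currency A is exactly what keeps the international purchasing power of country B at one good.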

Burgernomics: Raw index

In the real world, this simple example suffers from many limitations. Different consumption baskets, trade restrictions and the level of development are factors that could explain deviations from the purchasing power parity exchange rate. Nevertheless, this approach is useful to make exchange rate economics more digestible.

In the following, I will explain how to use the interactive currency comparison tool, using the exchange rate of the Chinese yuan (CNY) vis-à-vis the U.S. dollar (USD) as an example.

Figure 1. The Big Mac index. The Economist.

As we can see in figure 1, in July 2020, a Big Mac cost 21.70 CNY in China and 5.71 USD in the United States. The implied exchange rate (number of Chinese yuan per one U.S. dollar) can be computed as follows:

\begin{aligned} P_{CNY}^{Big Mac}&=21.70 \text{ CNY}\\ P_{USD}^{Big Mac}&=5.71 \text{ USD}\\ E_{CNY/USD}^{ppp}&=\frac{P_{CNY}^{Big Mac}}{P_{USD}^{Big Mac}}\\ E_{CNY/USD}^{ppp}&=\frac{21.70}{5.71}\\ E_{CNY/USD}^{ppp}&=3.80\text{ CNY per USD} \end{aligned}

I obtain the value of the undervaluation or the overvaluation by comparing this implied rate with the market exchange rate (7.00 yuan per dollar), thanks to the following formula:

\begin{aligned} \frac{E_{CNY/USD}^{ppp}-E_{CNY/USD}}{E_{CNY/USD}}\times100&=\frac{3.80-7.00}{7.00}\times100\\ &=-45.7 \end{aligned}

According to the raw index, the Chinese yuan is undervalued by 45.7 percent against the U.S. dollar. To reach its purchasing power parity exchange rate, the yuan has to appreciate until the number of yuan per dollar falls by 45.7 percent, from 1 U.S. dollar = 7 Chinese yuan to 1 U.S. dollar = 3.80 Chinese yuan. Since fewer yuan are then required to buy 1 dollar, the value of the yuan is higher.
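The raw index calculation fits in a short function. This is a minimal sketch using the July 2020 figures quoted above (21.70 CNY, 5.71 USD and a market rate of 7.00 yuan per dollar); the function name `raw_index` is an illustrative choice.

```python
def raw_index(local_price, base_price, market_rate):
    """Return (implied PPP rate, valuation against the base currency).

    market_rate is expressed in local currency units per unit of base currency.
    A negative valuation means the local currency is undervalued.
    """
    implied_rate = local_price / base_price
    valuation = implied_rate / market_rate - 1
    return implied_rate, valuation


implied, valuation = raw_index(local_price=21.70, base_price=5.71, market_rate=7.00)
print(f"implied rate: {implied:.2f} CNY per USD")
print(f"valuation: {valuation:.1%}")
```

Dividing the implied rate by the market rate and subtracting one is the same operation as the formula above, written without the multiplication by 100.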

Burgernomics: GDP-adjusted index

We observe that price levels are higher in countries with a higher level of development and lower in countries with a lower level of development. This well-known empirical observation is named the “Balassa-Samuelson effect.” In order to account for this structural difference, The Economist proposes a GDP-adjusted version of the raw index.

Figure 2. GDP-adjusted index methodology. The Economist.

In figure 2, we can see that this correction is based on a linear regression of the Big Mac price in U.S. dollars on GDP per person. A positive relationship is expected according to the Balassa-Samuelson effect: richer countries should have higher Big Mac prices in U.S. dollars.

In the case of the exchange rate of the Chinese yuan (CNY) vis-à-vis the U.S. dollar (USD), I have the following results:

\begin{aligned} P_{USD}^{BigMac}&=5.71\text{ USD}\\ P_{CNY}^{BigMac}&=\frac{21.70}{7}=3.10\text{ USD}\\ \end{aligned}

Thanks to linear regressions between Big Mac prices in dollars and GDP per person, I obtain these predicted values:

\begin{aligned} P_{USD}^{\widehat{BigMac}}&=5.17\text{ USD}\\ P_{CNY}^{\widehat{BigMac}}&=3.00\text{ USD}\\ \end{aligned}

Figure 3. GDP-adjusted index. The Economist.

Finally, I get an overvaluation or an undervaluation for the GDP-adjusted index thanks to the following calculation:

\begin{gathered} \Big( \frac{P_{CNY}^{BigMac}}{P_{CNY}^{\widehat{BigMac}}}\div\frac{P_{USD}^{BigMac}}{P_{USD}^{\widehat{BigMac}}}\Big)\\ \Big( \frac{3.10}{3}\div\frac{5.71}{5.17}\Big)=0.935 \end{gathered}

When the result of the above formula is equal to one, the currency is neither overvalued nor undervalued; below one, the currency is undervalued; above one, it is overvalued. I subtract one to get the final result:

\begin{gathered} \Big( \frac{3.10}{3}\div\frac{5.71}{5.17}\Big)-1\\ 0.935-1=-6.5\% \end{gathered}

According to the GDP-adjusted index, the Chinese yuan is undervalued by 6.5 percent against the U.S. dollar. Once differences in the level of development are taken into account, the undervaluation is much smaller. The corresponding implied exchange rate is 7.00 × 0.935 = 6.545 yuan per dollar:

\begin{aligned} \frac{E_{CNY/USD}^{ppp}-E_{CNY/USD}}{E_{CNY/USD}}\times100&=\frac{6.545-7.00}{7.00}\times100\\ &=-6.5 \end{aligned}

To reach its GDP-adjusted purchasing power parity exchange rate, the yuan has to appreciate until the number of yuan per dollar falls by 6.5 percent, from 1 U.S. dollar = 7 Chinese yuan to 1 U.S. dollar = 6.545 Chinese yuan. Since fewer yuan are then required to buy 1 dollar, the value of the yuan is higher.
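The GDP-adjusted calculation can be sketched in the same way. The function name `gdp_adjusted_index` is illustrative, and the inputs are the rounded dollar prices and predicted prices quoted above. Because these inputs are rounded to two decimals, the computed valuation may differ from the quoted figure by about a tenth of a percentage point.

```python
def gdp_adjusted_index(price, predicted, base_price, base_predicted):
    """Valuation of a currency against the base, adjusted for GDP per person.

    All four arguments are Big Mac prices in U.S. dollars: the actual and
    regression-predicted price in the country of interest, and the same two
    quantities in the base country.
    """
    return (price / predicted) / (base_price / base_predicted) - 1


valuation = gdp_adjusted_index(
    price=3.10, predicted=3.00,            # China
    base_price=5.71, base_predicted=5.17,  # United States
)
implied_rate = 7.00 * (1 + valuation)      # market rate of 7.00 yuan per dollar

print(f"valuation: {valuation:.1%}")
print(f"implied rate: {implied_rate:.2f} CNY per USD")
```

The adjustment compares each country's actual price with what its GDP per person would predict, so a currency only appears undervalued if its Big Mac is cheaper than its level of development suggests.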
