During the 24th INFER Annual Conference, I had the chance to attend a presentation by Prof. Dr. Jennifer Pédussel Wu on a very interesting database of bilateral exports. Built at the Aletheia Research Institution, this database aggregates various sources and is available here: https://aletheia-research.org/database-projects/blocs-database/.
Key takeaways
- The Aletheia Research Institution bilateral export database provides trade flow data well suited to examining the dynamics of bilateral trade.
- Stata’s areg command can absorb several high-dimensional fixed effects at once, which is considerably faster than including an indicator variable for every category.
- The estimated effect of Regional Trade Agreements (RTAs) on bilateral trade is positive, but it shrinks (from about 0.47 to about 0.06) once exporter-year and importer-year fixed effects, which capture country-specific shocks, are added to the dyad fixed effects.
- Free Trade Agreements (FTAs) and Economic Integration Agreements (EIAs) both have a positive and significant effect on trade: the FTA coefficient is modest (about 0.08), while the EIA coefficient (about 0.35) is much larger than the RTA one.
- Combining dyad, exporter-year, and importer-year fixed effects controls for time-invariant pair heterogeneity and country-specific time-varying shocks, which reduces omitted-variable bias in the estimates.
In this blog post, I show how to use this database to test Stata’s areg command, which now handles high-dimensional fixed effects (HDFE). As explained on the Stata website, this new feature “absorbs not just one but multiple high-dimensional categorical variables in your linear, fixed-effects linear, and instrumental-variables linear models using option absorb() with commands areg, xtreg, fe, and ivregress 2sls. This provides remarkable speed gains over the traditional approach of directly including indicators for categories of these variables in your models. Choose among different estimation methods.”
How does it work? First, I create the fixed-effect identifiers with the egen command and its group() function:
// Fixed effects: dyad + exporter-year + importer-year
egen long fe_dyad = group(ifs_pairid)
egen long fe_oy = group(alpha_3 year)
egen long fe_py = group(p_alpha_3 year)
This creates dyadic fixed effects for each exporter-importer pair, fixed effects for each unique exporter-year combination, and fixed effects for each unique importer-year combination.
For example, the dyad Türkiye-Qatar will have the unique number 5222, and the dyad Türkiye-Romania will have the unique number 5359.

The exporter-year fixed effects assign a unique number to each exporter-year combination; for example, Türkiye (as an exporter) in 2009 gets 10850.

The importer-year fixed effects assign a unique number to each importer-year combination; for example, Türkiye (as an importer) in 2009 gets 12003.
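To see how group() assigns these numbers, here is a minimal sketch on four made-up rows. The rows are hypothetical, alpha_3 and p_alpha_3 are assumed to hold ISO alpha-3 codes, and the dyad is built from the two codes because the toy data has no ifs_pairid variable. group() numbers the combinations 1, 2, 3, ... following the sort order of the grouping variables, so the identifiers here will not match the ones in the real database.

```stata
// Toy data (hypothetical rows, not taken from the database)
clear
input str3 alpha_3 str3 p_alpha_3 int year
"TUR" "QAT" 2008
"TUR" "QAT" 2009
"TUR" "ROU" 2009
"ROU" "TUR" 2009
end

// The post groups on ifs_pairid; the toy data has no such variable,
// so the dyad is built here from the two country codes (an assumption)
egen long fe_dyad = group(alpha_3 p_alpha_3)
egen long fe_oy = group(alpha_3 year)
egen long fe_py = group(p_alpha_3 year)

// Rows sharing the same combination share the same identifier:
// TUR-2009 appears twice as an exporter-year and gets one fe_oy value
sort alpha_3 p_alpha_3 year
list alpha_3 p_alpha_3 year fe_dyad fe_oy fe_py, sepby(alpha_3)
```

The two Türkiye-Qatar rows share one fe_dyad value, while every row has its own fe_py value because no importer-year combination repeats.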


Now, we can run some regressions and see the effects of including multiple fixed effects:
Linear regression, absorbing indicators         Number of obs     = 431,857
                                                F(1, 28269)       =    6.18
                                                Prob > F          =  0.0129
                                                R-squared         =  0.9041
                                                Adj R-squared     =  0.8953
                                                Root MSE          =  1.2800

---------------------------
Absorbed variable | Levels
------------------+--------
          fe_dyad | 28,270
            fe_oy |  4,021
            fe_py |  4,023
---------------------------

(Std. err. adjusted for 28,270 clusters in fe_dyad)
---------------------------------------------------------------------------------
                |               Robust
         ltrade | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------+----------------------------------------------------------------
RTA_larch_noEIA |    .0619039   .0249075     2.49   0.013     .0130841   .1107237
          _cons |     7.76075   .0039597  1959.94   0.000     7.752988   7.768511
---------------------------------------------------------------------------------

. estat summarize, equation

Estimation sample areg              Number of obs      =    431,857
                                    Number of clusters =     28,270
                                    Obs per cluster:
                                                   min =          1
                                                   avg =       15.3
                                                   max =         23

-------------------------------------------------------------------
    Variable |         Mean      Std. dev.         Min          Max
-------------+-----------------------------------------------------
depvar       |
      ltrade |     7.770591       3.955711           0     20.02496
-------------+-----------------------------------------------------
_            |
RTA_larch_~A |     .1589762       .3656543           0            1
-------------------------------------------------------------------
Std. dev. not adjusted for clustering
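In equation form, the regression reported above is consistent with the following specification. The notation is mine and not taken from the database documentation: i is the exporter, j the importer, t the year, and I assume that ltrade is the natural log of the bilateral trade flow X.

```latex
\[
\ln X_{ijt} = \beta \, \mathrm{RTA}_{ijt}
            + \alpha_{ij}   % dyad fixed effect (fe_dyad)
            + \gamma_{it}   % exporter-year fixed effect (fe_oy)
            + \delta_{jt}   % importer-year fixed effect (fe_py)
            + \varepsilon_{ijt}
\]
```

Here the coefficient on RTA_larch_noEIA is the estimate of β, and the three absorbed variables correspond to the three sets of fixed effects.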
With dyad fixed effects only, the coefficient on RTA_larch_noEIA is large (≈ 0.47) and highly significant, showing that regional trade agreements are positively correlated with bilateral trade. Once time-varying fixed effects are added, the estimate shrinks and, in some specifications, loses significance. With origin-year fixed effects, the coefficient is no longer significantly different from zero. With partner-year fixed effects, it remains positive (≈ 0.10) and significant at the 1% level. Finally, with both origin-year and partner-year fixed effects, it settles at around 0.06, significant at the 5% level (this is the regression reported above).
Overall, the trade-increasing effect of RTAs looks very large in the simplest specification but shrinks considerably once stricter fixed effects are included. This implies that the large effect found in less demanding specifications is partly driven by unobserved country-specific, time-varying factors. With dyad, exporter-year, and importer-year fixed effects included, the RTA effect remains but is moderate in size, meaning that RTAs are still associated with higher trade after accounting for country-specific shocks and global trends.
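The exact commands behind these four specifications are not shown in this post, so the following is only a sketch of how they could be written. To keep it runnable without the database, it first simulates a small stand-in panel (6 countries, 10 years, a hypothetical agreement dummy with a built-in effect of 0.3) using the same variable names as the output above. It also assumes a Stata release in which areg accepts several variables in absorb(), as described in the quote from the Stata website.

```stata
// Simulated stand-in data (NOT the BLOCS database): 6 countries, 10 years
clear
set seed 12345
set obs 6
generate int exporter = _n
expand 6
bysort exporter: generate int importer = _n
drop if exporter == importer
expand 10
bysort exporter importer: generate int year = 1999 + _n

// Same identifiers as in the post, built from the toy country codes
egen long fe_dyad = group(exporter importer)
egen long fe_oy = group(exporter year)
egen long fe_py = group(importer year)

// Hypothetical agreement dummy that switches on in 2005 for some pairs
generate byte RTA_larch_noEIA = (mod(exporter + importer, 3) == 0) & (year >= 2005)
generate double ltrade = 5 + 0.3*RTA_larch_noEIA + rnormal()

// (1) dyad fixed effects only
areg ltrade RTA_larch_noEIA, absorb(fe_dyad) vce(cluster fe_dyad)
// (2) dyad + origin-year fixed effects
areg ltrade RTA_larch_noEIA, absorb(fe_dyad fe_oy) vce(cluster fe_dyad)
// (3) dyad + partner-year fixed effects
areg ltrade RTA_larch_noEIA, absorb(fe_dyad fe_py) vce(cluster fe_dyad)
// (4) dyad + origin-year + partner-year fixed effects
areg ltrade RTA_larch_noEIA, absorb(fe_dyad fe_oy fe_py) vce(cluster fe_dyad)
```

On the real data, only the four areg lines would be needed after creating the identifiers with egen. The simulated coefficients have no empirical meaning and will not reproduce the pattern discussed here.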
Two additional regressions replace the RTA indicator with indicators for Free Trade Agreements (fta) and Economic Integration Agreements (eia):
Linear regression, absorbing indicators         Number of obs     = 431,857
                                                F(1, 28269)       =    8.13
                                                Prob > F          =  0.0044
                                                R-squared         =  0.9041
                                                Adj R-squared     =  0.8953
                                                Root MSE          =  1.2800

---------------------------
Absorbed variable | Levels
------------------+--------
          fe_dyad | 28,270
            fe_oy |  4,021
            fe_py |  4,023
---------------------------

(Std. err. adjusted for 28,270 clusters in fe_dyad)
------------------------------------------------------------------------------
             |               Robust
      ltrade | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         fta |    .0756606   .0265311     2.85   0.004     .0236584   .1276627
       _cons |    7.764983   .0019666  3948.45   0.000     7.761128   7.768837
------------------------------------------------------------------------------

. estat summarize, equation

Estimation sample areg              Number of obs      =    431,857
                                    Number of clusters =     28,270
                                    Obs per cluster:
                                                   min =          1
                                                   avg =       15.3
                                                   max =         23

-------------------------------------------------------------------
    Variable |         Mean      Std. dev.         Min          Max
-------------+-----------------------------------------------------
depvar       |
      ltrade |     7.770591       3.955711           0     20.02496
-------------+-----------------------------------------------------
_            |
         fta |     .0741241        .261973           0            1
-------------------------------------------------------------------
Std. dev. not adjusted for clustering
Linear regression, absorbing indicators         Number of obs     = 431,857
                                                F(1, 28269)       =   78.73
                                                Prob > F          =  0.0000
                                                R-squared         =  0.9041
                                                Adj R-squared     =  0.8953
                                                Root MSE          =  1.2798

---------------------------
Absorbed variable | Levels
------------------+--------
          fe_dyad | 28,270
            fe_oy |  4,021
            fe_py |  4,023
---------------------------

(Std. err. adjusted for 28,270 clusters in fe_dyad)
------------------------------------------------------------------------------
             |               Robust
      ltrade | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         eia |    .3494775   .0393864     8.87   0.000     .2722782   .4266768
       _cons |    7.761411   .0010346  7501.84   0.000     7.759383   7.763439
------------------------------------------------------------------------------

. estat summarize, equation

Estimation sample areg              Number of obs      =    431,857
                                    Number of clusters =     28,270
                                    Obs per cluster:
                                                   min =          1
                                                   avg =       15.3
                                                   max =         23

-------------------------------------------------------------------
    Variable |         Mean      Std. dev.         Min          Max
-------------+-----------------------------------------------------
depvar       |
      ltrade |     7.770591       3.955711           0     20.02496
-------------+-----------------------------------------------------
_            |
         eia |      .026268       .1599313           0            1
-------------------------------------------------------------------
Std. dev. not adjusted for clustering
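To get a feel for the magnitudes, the coefficients on the three agreement dummies can be converted into approximate percentage effects on trade. This uses the usual transformation for a dummy variable in a log-linear model and assumes that ltrade is the natural log of the trade flow; the coefficients are rounded from the three tables above.

```latex
\[
\%\Delta \text{trade} = 100 \left( e^{\hat{\beta}} - 1 \right)
\]
\begin{align*}
\text{RTA\_larch\_noEIA:} &\quad 100 \left( e^{0.0619} - 1 \right) \approx 6.4\% \\
\text{fta:}               &\quad 100 \left( e^{0.0757} - 1 \right) \approx 7.9\% \\
\text{eia:}               &\quad 100 \left( e^{0.3495} - 1 \right) \approx 41.8\%
\end{align*}
```

Read this way, the EIA estimate is by far the largest of the three, while the RTA and FTA estimates are of similar, moderate size.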
Conclusion
To conclude, this blog post shows how the bilateral export dataset of the Aletheia Research Institution can be combined with Stata’s areg command to analyze the effect of different trade agreements on bilateral trade. The empirical results indicate that RTAs appear to increase trade strongly when only dyad fixed effects are included, but the effect shrinks substantially (while remaining positive and significant) once the full set of high-dimensional fixed effects is included, which suggests that country-specific shocks and global trends explain part of the observed growth in trade. The same approach is useful for assessing the effects of FTAs and EIAs. With these tools, researchers can obtain more reliable estimates of trade patterns that are less contaminated by uncontrolled heterogeneity.
References
Egger, P., & Larch, M. (2008). Interdependent preferential trade agreement memberships: An empirical analysis. Journal of International Economics, 76(2), 384-399.
Wu, J. P., Banach, C. N., Goulas, S., Neira, I. S., Betzelt, S., Hein, E., … & Zimmer, R. (2024). Building BLOCS and stepping stones: combined data for international economic and policy analysis. Berlin School of Economics and Law, Institute for International Political Economy Berlin.