Evolution of Media Attention and Tone in U.S.-China Bilateral Relations (GDELT, 1980-2025)

After a series of blogs on the GDELT database discussing how to start with these big data (https://www.jamelsaadaoui.com/tag/gdelt-data/), I propose to update the blog covering the bilateral relationship between the United States and China. The objective is to reproduce the main figure of this blog. After the Global Financial Crisis in 2008, we media covered more the bilateral relation between the US and China. Besides, the tone was increasingly negative, especially before the US presidential elections of 2012 and 2016.

Source: Jamel Saadaoui, EconMacro and GDELT.
The average tone (avg_tone) is measured in the right hand scale.

I will include the following CAMEO codes:

0211 = Appeal for economic cooperation
0231 = Appeal for economic aid
0254 = Appeal to ease economic sanctions/embargo
0311 = Intent to cooperate economically
0331 = Intent to provide economic aid
0354 = Intent to ease economic sanctions/embargo
054 = Grant diplomatic recognition (policy act)
071 = Provide economic aid (material act)
085 = Ease economic sanctions/embargo (policy change)
1031 = Demand economic aid (pressure)
1054 = Demand easing of economic sanctions/embargo (pressure)
1211 = Reject economic cooperation (decision)
1221 = Reject request for economic aid (decision)
1244 = Refuse to ease economic sanctions/embargo (decision)

The SQL query, which I tuned iteratively with ChatGPT 5.0 Pro, is described below:

-- Parameters
DECLARE start_date DATE DEFAULT DATE '1980-01-01';
DECLARE end_date   DATE DEFAULT CURRENT_DATE();

-- 1) Pull v2 and v1 with identical schemas; prefer v2 on duplicates
WITH raw_events AS (
  SELECT
    CAST(PARSE_DATE('%Y%m%d', CAST(SQLDATE AS STRING)) AS DATE) AS event_date,
    GlobalEventID,
    GoldsteinScale,
    NumMentions,
    AvgTone,
    EventCode,
    Actor1Name,
    Actor2Name,
    2 AS src_priority
  FROM `gdelt-bq.gdeltv2.events`
  WHERE PARSE_DATE('%Y%m%d', CAST(SQLDATE AS STRING)) BETWEEN start_date AND end_date

  UNION ALL

  SELECT
    CAST(PARSE_DATE('%Y%m%d', CAST(SQLDATE AS STRING)) AS DATE) AS event_date,
    GlobalEventID,
    GoldsteinScale,
    NumMentions,
    AvgTone,
    EventCode,
    Actor1Name,
    Actor2Name,
    1 AS src_priority
  FROM `gdelt-bq.full.events`
  WHERE PARSE_DATE('%Y%m%d', CAST(SQLDATE AS STRING)) BETWEEN start_date AND end_date
),

-- 2) Deduplicate across feeds (keep v2 if both present)
dedup AS (
  SELECT *
  FROM raw_events
  QUALIFY ROW_NUMBER() OVER (PARTITION BY GlobalEventID ORDER BY src_priority DESC) = 1
),

-- 3) Keep only US↔China events; compute direction WITHOUT referencing it in WHERE
dyad AS (
  SELECT
    event_date,
    GlobalEventID,
    GoldsteinScale,
    NumMentions,
    AvgTone,
    EventCode,
    CASE
      WHEN UPPER(Actor1Name) = 'UNITED STATES' AND UPPER(Actor2Name) = 'CHINA' THEN 'US->CN'
      WHEN UPPER(Actor1Name) = 'CHINA'         AND UPPER(Actor2Name) = 'UNITED STATES' THEN 'CN->US'
    END AS direction
  FROM dedup
  WHERE
    EventCode IN (
      '0211','0231','0254','0311','0331','0354','054','071','085','110',
      '1031','1054','1211','1221','1244','1621'
    )
    AND event_date BETWEEN start_date AND end_date
    AND (
      (UPPER(Actor1Name) = 'UNITED STATES' AND UPPER(Actor2Name) = 'CHINA') OR
      (UPPER(Actor1Name) = 'CHINA'         AND UPPER(Actor2Name) = 'UNITED STATES')
    )
),

-- 4) Monthly aggregation
monthly AS (
  SELECT
    DATE_TRUNC(event_date, MONTH) AS month,
    COUNT(*) AS events,
    SUM(NumMentions) AS total_mentions,
    AVG(NumMentions) AS avg_num_mentions_per_event,
    AVG(AvgTone) AS avg_tone_unweighted,
    SAFE_DIVIDE(SUM(AvgTone * NumMentions), NULLIF(SUM(NumMentions), 0)) AS avg_tone_weighted,
    AVG(GoldsteinScale) AS avg_goldstein,
    AVG(IF(GoldsteinScale > 0, GoldsteinScale, NULL)) AS avg_pos_goldstein,
    AVG(IF(GoldsteinScale < 0, GoldsteinScale, NULL)) AS avg_neg_goldstein
  FROM dyad
  GROUP BY month
)

-- 5) Output
SELECT
  month,
  events,
  total_mentions,
  avg_num_mentions_per_event AS avg_num_mentions,
  avg_tone_unweighted        AS avg_tone,
  avg_tone_weighted,
  avg_goldstein,
  avg_pos_goldstein,
  avg_neg_goldstein
FROM monthly
ORDER BY month;

This SQL script builds a monthly time series of diplomatic and geopolitical interactions between the United States and China using the GDELT event database. It focuses specifically on a selected list of meaningful CAMEO event codes (e.g., appeals, economic aid, sanctions, or refusals) to capture substantive diplomatic actions rather than everyday news noise.

Here’s what each part does in plain terms:

  1. Define the time window.
    It limits the analysis to events occurring between 1980 and the current date.
  2. Pull data from both versions of GDELT (v1 and v2).
    Because GDELT’s structure evolved, the code retrieves data from both datasets, ensuring complete coverage from 1980 onward.
  3. Combine and deduplicate.
    Since the same event may appear in both datasets, duplicates are removed. If both versions report the same event (GlobalEventID), the newer version (v2) is kept.
  4. Select only U.S.–China events.
    The script keeps only events where one actor is from the United States (USA) and the other from China (CHN)—in either direction (U.S.→China or China→U.S.).
  5. Filter by specific CAMEO event codes.
    It includes a carefully chosen list of 15 diplomatic and economic codes that represent real interactions, such as appeals for cooperation (0211), granting recognition (054), providing aid (071), easing or imposing sanctions (085, 1621), demands, refusals, or condemnations.
    These are the types of actions that matter for bilateral geopolitical relations.
  6. Aggregate the data by month.
    For each month, the query computes:
    • the number of events and total mentions (how often those events appeared in the media),
    • the average tone of news coverage (positive = cooperative, negative = conflictual),
    • the average Goldstein scale (a numeric intensity from –10 to +10),
    • and separate averages for positive and negative Goldstein scores.
  7. Output the monthly indicators.
    The final table gives one observation per month with metrics that can be plotted over time to visualize how the intensity and tone of U.S.–China relations evolve.

In short:
This script reconstructs a monthly geopolitical activity index between the U.S. and China, based on substantive diplomatic and economic interactions coded in GDELT, capturing both the intensity (number and scale of events) and the sentiment (tone) of their relations from 1980 to today.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.