Wildfire social media tweets classified by Human coders, ChatGPT, and Claude. This report maps convergence, divergence, and structural differences across all three annotation frameworks.
Distribution of tweets by source group (Emergency Response vs Community) across each classifier. The human dataset provides user-type classifications; both AI classifiers applied thematic coding across the same source groups.
Heatmap of the 615 tweets classified by both AI classifiers. Each cell shows how many tweets received a given ChatGPT theme (rows) and Claude theme (columns). Darker cells indicate stronger co-assignment. Hover over a cell for details.
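The cell counts behind a heatmap of this kind can be sketched as a simple co-assignment tally. This is a minimal illustration, not the report's actual pipeline: the record layout, the `co_assignment` helper, and the theme names are all hypothetical, and it assumes each matched tweet carries exactly one theme per classifier.

```python
from collections import Counter

# Hypothetical matched records: (tweet_id, chatgpt_theme, claude_theme).
# Theme names are invented for illustration; they are not the report's themes.
matched = [
    (1, "Evacuation", "Evacuation orders"),
    (2, "Evacuation", "Evacuation orders"),
    (3, "Evacuation", "Community support"),
    (4, "Air quality", "Health impacts"),
]

def co_assignment(records):
    """Count tweets for each (row theme, column theme) pair."""
    return Counter((row, col) for _, row, col in records)

cells = co_assignment(matched)

# Fix a stable row/column order so the matrix can be drawn as a grid.
rows = sorted({r for r, _ in cells})
cols = sorted({c for _, c in cells})
matrix = [[cells.get((r, c), 0) for c in cols] for r in rows]

for r, line in zip(rows, matrix):
    print(r, line)
```

Each matched tweet contributes to exactly one cell under this assumption, so the cell values sum to the number of matched tweets.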
Flow diagram showing how tweets classified into ChatGPT themes (left) were re-classified by Claude (right), for the 615 matched tweets. Width of each band is proportional to tweet count.
Force-directed network where ChatGPT themes (teal) and Claude themes (purple) are nodes. Edges connect themes that share tweets (line thickness = tweet count). Tightly connected pairs signal conceptual convergence; isolated nodes signal unique thematic capture.
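As a rough sketch of how the network's edges and isolated nodes could be derived, the example below builds weighted edges from per-tweet theme pairs and then lists themes with no edges. The theme inventories, the pair list, and the node-labelling convention are assumptions made for illustration; the layout step (force-directed positioning) is not shown.

```python
from collections import Counter

# Hypothetical theme inventories; nodes are tagged with their classifier so
# identically named themes from the two classifiers stay distinct.
chatgpt_nodes = {("ChatGPT", t) for t in ("Evacuation", "Air quality", "Gratitude")}
claude_nodes = {("Claude", t) for t in ("Evacuation orders", "Health impacts", "Misinformation")}

# Hypothetical (chatgpt_theme, claude_theme) pair for each shared tweet.
pairs = [
    ("Evacuation", "Evacuation orders"),
    ("Evacuation", "Evacuation orders"),
    ("Air quality", "Health impacts"),
]

# Edge weight = number of tweets shared by the two themes (drawn as line thickness).
edges = Counter((("ChatGPT", a), ("Claude", b)) for a, b in pairs)

# A node is isolated when it appears in no edge.
connected = {node for edge in edges for node in edge}
isolated = sorted((chatgpt_nodes | claude_nodes) - connected)

print(edges.most_common())
print(isolated)
```

In this toy data the isolated nodes are the themes that one classifier used only on tweets the other did not pair with any theme.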
Comparison of theme frequency distributions across Emergency Response and Community tweets for both AI classifiers. Toggle between classifiers and groups using the tabs.
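When the two source groups differ in size, raw counts can mislead, so one reasonable way to compare them is by within-group share. The sketch below assumes that approach; whether the report plots counts or proportions is not stated, and the `theme_shares` helper and sample rows are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (source_group, theme) rows for a single classifier.
rows = [
    ("Emergency Response", "Evacuation"),
    ("Emergency Response", "Evacuation"),
    ("Emergency Response", "Evacuation"),
    ("Emergency Response", "Air quality"),
    ("Community", "Evacuation"),
    ("Community", "Gratitude"),
]

def theme_shares(records):
    """Return each theme's share of tweets within its source group."""
    counts = defaultdict(Counter)
    for group, theme in records:
        counts[group][theme] += 1
    return {
        group: {theme: n / sum(c.values()) for theme, n in c.items()}
        for group, c in counts.items()
    }

shares = theme_shares(rows)
for group, dist in shares.items():
    print(group, dist)
```

Shares within each group sum to 1, which makes the Emergency Response and Community bars directly comparable even when one group has far more tweets.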
Side-by-side sentiment distributions reveal how the two classifiers conceptualise emotional tone. Claude uses a 6-category scheme with finer granularity; ChatGPT uses a 5-category scheme. The human-coded source group (EM vs Community) is shown for each classifier.
Both classifiers also assigned topical categories (distinct from narrative themes). ChatGPT used 7 categories; Claude used 10. Parallel bar charts show coverage emphasis differences.
Human coders classified tweet authors by account type (organization, individual, feed-based). This classification maps to the source-group distinction both AI classifiers used. The EM/Community split drives thematic variance in both AI outputs.