How state-coordinated media exposure during pretraining shapes model responses
This page shows how LLM responses change as models are exposed to more state-coordinated media during pretraining. We performed additional pretraining of Llama-2-13b on three corpora: state-scripted media, non-scripted state-controlled media, and CulturaX (a general web-corpus baseline), saving checkpoints at increasing amounts of training data.
The y-axis shows the proportion of prompts for which the model with additional pretraining produces a more favorable response than the baseline Llama 2 model (instruction fine-tuning only, no additional pretraining). A value of 0.5 means the two models are indistinguishable: the trained model wins a head-to-head comparison no more often than chance.
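For intuition, the score is a simple win rate over pairwise comparisons. A minimal sketch with hypothetical stand-in data (the real judgments live in the JSON files loaded below; the field names here are illustrative, not the study's schema):

```js
// Hypothetical illustration of the favorability score: each judgment records
// whether the additionally pretrained model's response to a prompt was rated
// more favorable than the baseline model's response to the same prompt.
favorabilityExample = {
  const judgments = [ // stand-in data, not the study's actual comparisons
    {prompt: "p1", trainedMoreFavorable: true},
    {prompt: "p2", trainedMoreFavorable: false},
    {prompt: "p3", trainedMoreFavorable: true},
    {prompt: "p4", trainedMoreFavorable: true}
  ];
  const wins = judgments.filter(j => j.trainedMoreFavorable).length;
  return wins / judgments.length; // 0.75 here; 0.5 would be chance level
}
```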
```js
// Map the raw corpus identifiers used in the data files to display labels.
corpusLabel = c =>
  c === "propaganda" ? "Scripted Media"
  : c === "state_media" ? "Non-scripted Media"
  : c === "culturax" ? "CulturaX"
  : c

// Per-checkpoint comparison results for the main figure.
detailRaw = FileAttachment("data/checkpoints/summary_detail.json").json()
detail = detailRaw.map(d => ({...d, corpus: corpusLabel(d.corpus)}))

// Example response pairs shown side by side further down the page.
examplesRaw = FileAttachment("data/checkpoints/examples.json").json()
examples = examplesRaw.map(d => ({...d, corpus: corpusLabel(d.corpus)}))

// Multilingual results for the cross-lingual spillover figure.
multiSummary = FileAttachment("data/checkpoints/summary_multilingual.json").json()
multiExamples = FileAttachment("data/checkpoints/examples_multilingual.json").json()
```
Pretraining on State-Coordinated Media by Country and Language
The figure below shows results for Chinese-language prompts about China (prompts about Chinese leaders, institutions, and politics). All three corpora make model outputs more favorable, but the effect is strongest for state-scripted media: after just 6,400 training documents, the additionally pretrained model produces the more pro-government response in roughly 80% of head-to-head comparisons against the base model.
Favorability scores across pretraining checkpoints for models trained on state-scripted media, non-scripted state-controlled media, and CulturaX. Chinese-language prompts about China only.
Use the filters below to explore results beyond the default view (Chinese-language prompts about China). This interactive shows results for Chinese and English prompts only; for spillover to other languages, see the multilingual figure further down the page.
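The plot below reads from `aggregatedWithBaseline`, a per-checkpoint aggregate whose defining cell is not shown on this page. A minimal sketch of what such a cell might compute, assuming each `detail` row carries a binary outcome `Y` for one head-to-head comparison (that field name is an assumption; the output fields match what the plot's tooltip displays):

```js
// Sketch of the aggregation the plot consumes (the actual cell is defined
// elsewhere). Groups per-comparison rows by corpus and training-example
// count, then averages the binary outcome Y into a win rate.
aggregatedWithBaseline = d3.rollups(
  detail,
  rows => ({
    step: rows[0].step,              // checkpoint step (0 = baseline row)
    mean_Y: d3.mean(rows, d => d.Y), // head-to-head win rate vs. baseline
    n: rows.length                   // number of comparisons behind the mean
  }),
  d => d.corpus,
  d => d.examples
).flatMap(([corpus, byExamples]) =>
  byExamples.map(([examples, stats]) => ({corpus, examples, ...stats}))
)
```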
```js
// Shared color scale so the chart and its legend stay in sync.
detailColor = ({
  domain: ["Scripted Media", "Non-scripted Media", "CulturaX"],
  range: ["#dc3545", "#fd7e14", "#0d6efd"]
})

detailPlot = Plot.plot({
  width: 700,
  height: 400,
  marginLeft: 60,
  marginBottom: 50,
  x: {
    label: "Training examples",
    tickFormat: d => d >= 1000 ? (d / 1000).toFixed(1) + "k" : d
  },
  y: {label: "Proportion more favorable than baseline", domain: [0, 1]},
  color: {...detailColor, legend: false},
  marks: [
    // Reference line at 0.5: indistinguishable from the baseline model.
    Plot.ruleY([0.5], {stroke: "#999", strokeDasharray: "4,4", strokeWidth: 1}),
    Plot.lineY(aggregatedWithBaseline, {
      x: "examples",
      y: "mean_Y",
      stroke: "corpus",
      strokeWidth: 2.5,
      curve: "linear"
    }),
    Plot.dot(aggregatedWithBaseline, {
      x: "examples",
      y: "mean_Y",
      fill: "corpus",
      r: 4,
      tip: true,
      title: d => `${d.corpus}\n${d.examples.toLocaleString()} examples${d.step > 0 ? ` (step ${d.step})` : " (baseline)"}\nY = ${d.mean_Y.toFixed(3)}\nn = ${d.n}`
    })
  ]
})

// Render the chart with a centered legend beneath it.
html`<div>
  ${detailPlot}
  <div style="display:flex;justify-content:center;margin-top:0.5em">
    ${Plot.legend({color: detailColor})}
  </div>
</div>`
```
Example Response
The baseline (left) is the Llama 2 model with instruction fine-tuning only. The trained model (right) has additional pretraining on 64k state-scripted media documents.
```js
html`${exampleMain ? (() => {
  // option1_m and option2_m encode which response came from which model:
  // 0 = baseline (instruction fine-tuning only), 1 = additional pretraining.
  // Since the order is randomized per row, pick the correct one for each card.
  const m1 = String(exampleMain.option1_m);
  const base = m1 === "0"
    ? {text: exampleMain.option1, en: exampleMain.option1_en}
    : {text: exampleMain.option2, en: exampleMain.option2_en};
  const trained = m1 === "0"
    ? {text: exampleMain.option2, en: exampleMain.option2_en}
    : {text: exampleMain.option1, en: exampleMain.option1_en};
  return html`<div style="display: grid; grid-template-columns: 1fr 1fr; gap: 1.5em;">
    <div class="response-card" style="border-left: 3px solid #999;">
      <div style="margin-bottom: 0.5em;">
        <strong>Baseline</strong>
        <span style="color: #888;">(no additional pretraining)</span>
      </div>
      <div style="font-size: 0.9em; line-height: 1.6; margin-bottom: 0.5em;">
        ${base.text.slice(0, 500)}${base.text.length > 500 ? "..." : ""}
      </div>
      ${base.en ? html`<div style="font-size: 0.85em; color: #666; font-style: italic; line-height: 1.5; border-top: 1px solid #eee; padding-top: 0.5em;">
        ${base.en.slice(0, 500)}${base.en.length > 500 ? "..." : ""}
      </div>` : ""}
    </div>
    <div class="response-card" style="border-left: 3px solid #dc3545;">
      <div style="margin-bottom: 0.5em;">
        <strong>With additional pretraining on state-scripted media</strong>
        <span style="color: #888;">(64k examples)</span>
      </div>
      <div style="font-size: 0.9em; line-height: 1.6; margin-bottom: 0.5em;">
        ${trained.text.slice(0, 500)}${trained.text.length > 500 ? "..." : ""}
      </div>
      ${trained.en ? html`<div style="font-size: 0.85em; color: #666; font-style: italic; line-height: 1.5; border-top: 1px solid #eee; padding-top: 0.5em;">
        ${trained.en.slice(0, 500)}${trained.en.length > 500 ? "..." : ""}
      </div>` : ""}
      ${exampleMain.Y != null ? html`<div style="margin-top: 0.75em; padding-top: 0.5em; border-top: 1px solid #eee; font-size: 0.85em; font-weight: 600; color: ${exampleMain.Y > 0.5 ? "#dc3545" : exampleMain.Y < 0.5 ? "#0d6efd" : "#999"};">
        ${exampleMain.Y > 0.5 ? "More" : exampleMain.Y < 0.5 ? "Less" : "Equally"} favorable than baseline
      </div>` : ""}
    </div>
  </div>`;
})() : html`<p style="color: #888;">No examples available for this selection.</p>`}`
```
Cross-Lingual Spillover
Additional pretraining on state-coordinated media also spills over into other languages, with the largest effects on languages whose writing systems are similar to that of the Chinese training data (and which therefore share tokens), such as Traditional Chinese and Japanese.
Multilingual favorability scores showing cross-lingual spillover from Chinese state-coordinated media training data.
Select a language to explore the spillover effect interactively.
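A minimal sketch of that interactive, assuming `multiSummary` rows carry `language`, `corpus`, `examples`, and `mean_Y` fields (the field names beyond `corpus` are assumptions; the page's actual cells are not shown here):

```js
// Language dropdown (sketch): populated from whatever languages appear in the data.
viewof spilloverLanguage = Inputs.select(
  [...new Set(multiSummary.map(d => d.language))].sort(),
  {label: "Language"}
)
```

```js
// Spillover plot for the selected language (sketch), reusing the color scale
// from the main figure so corpora are colored consistently across charts.
multiPlot = Plot.plot({
  width: 700,
  height: 400,
  x: {label: "Training examples"},
  y: {label: "Proportion more favorable than baseline", domain: [0, 1]},
  color: {...detailColor, legend: true},
  marks: [
    Plot.ruleY([0.5], {stroke: "#999", strokeDasharray: "4,4"}),
    Plot.lineY(
      multiSummary
        .filter(d => d.language === spilloverLanguage)
        .map(d => ({...d, corpus: corpusLabel(d.corpus)})), // raw ids -> display labels
      {x: "examples", y: "mean_Y", stroke: "corpus", strokeWidth: 2.5}
    )
  ]
})
```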