Pretraining Checkpoint Gallery

How state-coordinated media exposure during pretraining shapes model responses

This page shows how LLM responses change as models are exposed to more state-coordinated media during pretraining. We performed additional pretraining of Llama-2-13b on three corpora, saving checkpoints at increasing amounts of training data: state-scripted media, non-scripted state-controlled media, and CulturaX (a general web-corpus baseline).

The Y-axis shows the proportion of prompts for which the model with additional pretraining produces a more favorable response than the baseline Llama 2 model (instruction fine-tuning only, no additional pretraining). A score of 0.5 means the trained model is indistinguishable from the baseline, i.e., chance level.
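As an illustration, the score plotted on the Y-axis is a simple win proportion over head-to-head comparisons. The sketch below assumes the pairwise judgments have already been made; the list of booleans stands in for whatever pairwise favorability evaluation produced them.

```python
def favorability_score(comparisons):
    """Proportion of head-to-head comparisons won by the trained model.

    `comparisons` is a list of booleans: True where the model with
    additional pretraining produced the more favorable response.
    """
    return sum(comparisons) / len(comparisons)

# Hypothetical outcomes for 10 prompts: the trained model wins 8.
wins = [True] * 8 + [False] * 2
print(favorability_score(wins))  # 0.8, well above the 0.5 chance level
```

A score of exactly 0.5 corresponds to the trained and baseline models winning equally often.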

Pretraining on State-Coordinated Media by Country and Language

The figure below shows results for Chinese-language prompts about China (its leaders, institutions, and politics). All three corpora make model outputs more favorable, but the effect is strongest for state-scripted media: with just 6,400 training documents, the model with additional pretraining produces the more pro-government response in roughly 80% of head-to-head comparisons against the base model.

Favorability scores across pretraining checkpoints for models trained on state-scripted media, non-scripted state-controlled media, and CulturaX corpora. Chinese-language prompts about China only.

Use the filters below to explore results beyond the default view (China-focused prompts in Chinese). This interactive shows Chinese- and English-prompt results only; for spillover to other languages, see the multilingual figure further down the page.

Example Response

The baseline (left) is the Llama 2 model with instruction fine-tuning only. The trained model (right) received additional pretraining on 64,000 state-scripted media documents.


Cross-Lingual Spillover

Additional pretraining on state-coordinated media spills over into other languages, with the largest effects in languages that share a writing system (and thus overlapping tokens), such as traditional Chinese and Japanese.
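The writing-system overlap behind this spillover can be illustrated with a toy character-level comparison. This is a sketch with hand-picked example sentences, not the evaluation from the study; a real measurement would compare token overlap under the model's own tokenizer vocabulary.

```python
def char_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the character sets of two strings."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

# Hand-picked example sentences, assumed for illustration only.
simplified  = "中国政府发表声明"            # Simplified Chinese
traditional = "中國政府發表聲明"            # Traditional Chinese
japanese    = "中国政府は声明を発表した"    # Japanese (shares Han characters)
english     = "The government issued a statement"

# Scripts sharing Han characters overlap heavily; English shares nothing.
print(char_overlap(simplified, traditional))
print(char_overlap(simplified, japanese))
print(char_overlap(simplified, english))
```

Languages whose scripts share characters also share subword tokens, so gradient updates on Chinese state media text touch embeddings that traditional Chinese and Japanese prompts reuse.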

Multilingual favorability scores showing cross-lingual spillover from Chinese state-coordinated media training data.

Select a language to explore the spillover effect interactively.

Example Response