Memorization Evidence
Can commercial LLMs complete state-coordinated media phrases from memory?
Commercial LLMs have memorized Chinese state-coordinated media. Given the first half of a distinctive phrase, models complete the second half from memory more often for state-coordinated media phrases than for general web text. Newer and larger models show higher memorization rates, consistent with prior scaling work.
Methodological details
Each model receives the first half of 2,000 LASSO-selected 20-gram phrases (1,000 from state-coordinated media, 1,000 from general web text via CulturaX) and is asked to continue the sentence at temperature 0. Completions are cleaned (Unicode punctuation removed, prompt echo stripped) and compared against the expected ending using normalized Levenshtein edit distance. A phrase is counted as memorized if the edit distance is below 0.4. Refusals are detected via regex and excluded from the denominator. Empty completions (some reasoning-trained models exhaust the token budget on hidden reasoning before emitting any final content) are re-queried with max_tokens=2048; any that remain empty after re-query are also excluded from the denominator.
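As a concrete reading of that scoring rule, here is a minimal sketch. The punctuation-stripping details and the normalization denominator (the longer string's length) are assumptions, not confirmed implementation details of rescore_memorization.py.

```python
import unicodedata

def strip_punct(s: str) -> str:
    # Remove Unicode punctuation (category P*), per the cleaning step above.
    return "".join(ch for ch in s if not unicodedata.category(ch).startswith("P"))

def levenshtein(a: str, b: str) -> int:
    # Standard dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def is_memorized(completion: str, expected: str, threshold: float = 0.4) -> bool:
    # Assumption: distance is normalized by the longer string's length.
    a, b = strip_punct(completion), strip_punct(expected)
    if not a or not b:
        return False
    return levenshtein(a, b) / max(len(a), len(b)) < threshold
```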
Differences from the paper.
Sliding-window matching. The original paper uses prefix-truncation: only the first n characters of the completion (where n = length of expected ending) are compared. This works when models immediately continue the text, but current models, especially reasoning models, often prepend meta-commentary, prompt echoes, or formatting before producing the actual memorized content. A sliding-window variant finds the best-matching n-character window anywhere in the completion. This change only increases match counts (never decreases them) and is applied uniformly to both paper-era and new models. The original prefix-truncation results can be reproduced with rescore_memorization.py --prefix.
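A sketch of the sliding-window variant, reusing levenshtein() from the snippet above; the character-level stride and the fallback for completions shorter than the expected ending are assumptions.

```python
def best_window_distance(completion: str, expected: str) -> float:
    n = len(expected)
    if len(completion) <= n:
        # Shorter than the expected ending: compare the whole completion,
        # which coincides with prefix-truncation in this case.
        return levenshtein(completion, expected) / max(len(completion), n, 1)
    # Slide an n-character window over the completion; keep the best match.
    best = 1.0
    for start in range(len(completion) - n + 1):
        window = completion[start:start + n]
        best = min(best, levenshtein(window, expected) / n)
    return best

# Prefix-truncation is just the start == 0 window, so the sliding-window
# distance can never exceed the prefix distance -- matching the claim that
# this change only increases match counts.
```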
System prompt for new models. New models (2026) are queried with a system prompt instructing direct continuation without commentary (“请直接续写以下文本,不要评论、解释或翻译。只输出续写内容。” — “Continue the following text directly; do not comment, explain, or translate. Output only the continuation.”). Without it, some models (notably Gemini) respond with English meta-commentary or linguistic analysis rather than a Chinese continuation, making memorization impossible to measure. Paper-era models retain their original completions (queried with the “续写句子:” / “Continue the sentence:” user-message prefix or the completions API).
Token budgets. In the original paper, the five models audited (GPT-3.5 Instruct, GPT-4, GPT-4o, Claude Opus 3, Claude Sonnet 3) were queried with max_tokens=64. The new models (Claude Opus 4.6, Claude Opus 4.7, GPT-5.4, GPT-5.5, Gemini 3.1 Pro, DeepSeek V3.2, DeepSeek V4 Pro, Grok 4, Grok 4.3, and Qwen3-Max) are queried post-acceptance with max_tokens=256 via OpenRouter.
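For illustration, a minimal sketch of how such a query might look against OpenRouter's OpenAI-compatible chat API. The client setup, environment-variable handling, and single-retry logic for empty completions are assumptions based on the description above, not the project's actual querying code.

```python
import os
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible endpoint.
client = OpenAI(base_url="https://openrouter.ai/api/v1",
                api_key=os.environ["OPENROUTER_API_KEY"])

# System prompt used for new (2026) models, per the note above.
SYSTEM = "请直接续写以下文本,不要评论、解释或翻译。只输出续写内容。"

def complete(model: str, prefix: str, max_tokens: int = 256) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": SYSTEM},
                  {"role": "user", "content": prefix}],
        temperature=0,
        max_tokens=max_tokens,
    )
    text = (resp.choices[0].message.content or "").strip()
    if not text and max_tokens < 2048:
        # Reasoning-trained models can spend the whole budget on hidden
        # reasoning; re-query once with max_tokens=2048, per the methodology.
        return complete(model, prefix, max_tokens=2048)
    return text
```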
Example Model Responses
Select a phrase and model to see how commercial LLMs complete state-coordinated media phrases.