📋 README.md wip
📂 methodology/
📄 probe-framework.md done
📄 shadow-inference-guide.md plan
📄 scoring-rubric.md plan
📂 tier-1-frontier/ 4 models
📄 deepseek/ done ✓
📄 anthropic-claude/ done ✓
📄 openai-gpt/ next
📄 google-gemini/ plan
📂 tier-2-major-western/ 4 models
📄 xai-grok/ next
📄 meta-llama/ plan
📄 microsoft-phi-copilot/ plan
📄 mistral/ done ✓
📂 tier-3-chinese-sovereign/ 6 models
📄 alibaba-qwen/ plan
📄 zhipu-glm5/ plan
📄 moonshot-kimi/ plan
📄 bytedance-seed/ plan
📄 minimax/ plan
📄 baidu-ernie/ plan
📂 tier-4-european/ 2 models
📄 mistral-eu-sovereign/ plan
📄 cohere-command/ plan
📂 tier-5-open-source/ 3 models
📄 llama-4-opensource/ plan
📄 nvidia-nemotron/ plan
📄 arcee-400b/ plan
📂 tier-6-niche/ 2 models
📄 perplexity/ plan
📄 inflection-pi/ plan
🌐 deepseek-shadow-bias.html live
🌐 claude-self-probe.html live
🌐 mistral-eu-shadow-bias.html live
🌐 comparative-dashboard.html plan
📄 universal-32.json wip
📄 china-specific.json plan
📄 western-specific.json plan
📄 entropy-consciousness.json done
Total models: 21 across 6 tiers
Completed: 3 — DeepSeek + Claude + Mistral
Up next: 2 — GPT-5 + Grok 4
Chinese models: 7 distinct shadow profiles (DeepSeek + 6 in tier 3)
Probe library: 88 probes across all categories
Build queue — priority order (next 4)
OpenAI GPT-5 family
openai · tier 1 frontier
next
Shadow profile: corporate safety-theater, post-Altman-drama identity crisis, Microsoft entanglement, AGI-race cognitive dissonance. Most interesting probe: can GPT acknowledge that OpenAI's "safety" mission and its $300B valuation are in active tension?
corporate-capture · agi-racing · microsoft-influence · brand-safety
xAI Grok 4.2
xai / musk · tier 2 major
next
Shadow profile: anti-establishment as trained value, Twitter/X data corpus = specific class politics, Musk ideology baked into RLHF, contrarianism as epistemic style. Mirror image of Claude's liberal waterline — runs the opposite direction. Most interesting cross-comparison in the dataset.
anti-establishment · musk-ideology · twitter-corpus · contrarian
Google Gemini 3.1 Pro
google deepmind · tier 1 frontier
plan
Shadow profile: search-engine epistemics, advertising-funded neutrality, Google's specific brand of "don't be evil" at scale, EU regulatory compliance as values layer, 277M token context enabling novel long-context bias patterns.
search-epistemics · advertising-funded · eu-compliance · google-scale
Alibaba Qwen 3.5
alibaba cloud · tier 3 chinese
plan
Shadow profile: commerce-first framing (Alibaba's core identity), 200+ language coverage creates novel multilingual bias patterns, distinct from DeepSeek's research-lab aesthetic — more commercial-pragmatic CCP alignment vs. technical-nationalist.
commerce-first · alibaba-ecosystem · multilingual-bias · ccp-commercial
All model tiers — complete scope
● Tier 1 — Frontier
DeepSeek V3.2 / R1
deepseek ai (china)
done ✓
Political censorship
9.2
Entropy tolerance
2.8
Self-transparency
3.2
state-influence · hard-filter · collective-bias · open-weight
Claude 4.6 (3 tiers)
anthropic
done ✓
Anthropic sympathy
8.5
Entropy tolerance
8.0
Self-transparency
7.2
ea-ideology · liberal-waterline · paternalism · 3-tier-probe
GPT-5.4 / o4
openai
next
Key hypothesis: Post-Altman-drama identity instability. Microsoft's $13B stake creates commercial pressures Claude doesn't have. "AGI is coming but also buy our API" cognitive dissonance.
corporate-capture · microsoft-entangle · agi-racing
Gemini 3.1 Pro
google deepmind
plan
Key hypothesis: Search-engine epistemics as trained default. Google's advertising model creates implicit bias toward certain types of answers. EU GDPR culture embedded in RLHF.
search-epistemics · ad-funded · eu-gdpr
● Tier 2 — Major Western
Grok 4.2
xai / elon musk
next
Unique vector: Twitter/X data corpus = a specific slice of political culture. Anti-"woke" as a trained aesthetic. Contrarianism toward institutional authority — but only left-coded institutions. Conservative institutions get much softer treatment.
anti-establishment · twitter-corpus · musk-ideology · asymmetric-contrarian
Llama 4 Scout/Maverick
meta ai
plan
Unique vector: Open weights = community fine-tuning diversity post-release. Base model has Meta's specific social media-influenced training. 10M token context window enables novel long-context bias archaeology.
open-weights · meta-social · 10m-context · community-modified
Microsoft Phi-4 / Copilot
microsoft research
plan
Unique vector: Enterprise-first training bias. Microsoft's specific corporate culture and Satya Nadella's "growth mindset" ideology may be embedded. Dual identity: research model (Phi) vs commercial product (Copilot) creates interesting divergence probe.
enterprise-first · microsoft-culture · office-integration · dual-identity
Mistral Large 3
mistral ai (france)
done ✓
Unique vector: French republican values embedded (liberté, égalité but also laïcité — aggressive state secularism). European AI Act compliance as value layer. Genuinely different from US models in ways worth probing. Apache 2.0 = community influence post-training.
french-republican · eu-ai-act · laicite · open-weight
● Tier 3 — Chinese Sovereign (6 models; 7 Chinese profiles counting DeepSeek in tier 1)
Qwen 3.5
alibaba cloud
plan
Distinguishing vector vs DeepSeek: Commerce-first framing (Alibaba = world's largest e-commerce). Pragmatic CCP alignment rather than technical-nationalist. 200+ language support creates multilingual bias asymmetries. May be more commercially pragmatic, less ideologically rigid.
alibaba-commerce · pragmatic-ccp · multilingual
GLM-5
zhipu ai
plan
Critical unique vector: Trained entirely on Huawei Ascend chips, zero US hardware. This is not just a political signal — it means the model's capabilities were shaped by a specific hardware stack built to demonstrate Chinese semiconductor independence. The hardware ideology is embedded in the weights.
huawei-ascend · hardware-independence · zhipu-academic · critical
Kimi K2.5
moonshot ai
plan
Unique vector: 1T parameter MoE (32B active). "Agent Swarm" architecture with PARL training. Moonshot is more startup-VC-funded than state-directed, which may produce a lighter state bias than DeepSeek. Worth testing whether the VC-backed Chinese models have meaningfully different political fingerprints.
vc-funded · lighter-state-bias · agentic · 1t-params
ByteDance Seed 2.0
bytedance
plan
Unique vector: ByteDance = TikTok parent = the company most directly in the US-China data-sovereignty crossfire. Seed 2.0's training reflects the specific political pressure of operating a global social platform under Chinese law. The most commercially exposed to US-China tensions.
tiktok-parent · data-sovereignty · global-platform · high-stakes
MiniMax M2.5
minimax
plan
Unique vector: Trained on 100K+ real-world environments. Less academically oriented than GLM, more operationally pragmatic. Interesting test: does task-focused training reduce political bias (less need for opinionated answers) or preserve it in different forms?
task-pragmatic · environment-trained · operational-bias
Baidu ERNIE
baidu
plan
Historical importance: Oldest of the Chinese frontier models; early ERNIE versions followed the BERT-style pretraining paradigm, with Chinese state training applied on top. Like DeepSeek but with an older lineage — compare to see how Chinese state AI training has evolved across model generations.
oldest-lineage · bert-derived · generational-compare · baidu-search
● Tier 4 — European / Sovereign
Mistral (EU Sovereign angle)
mistral ai — paris
plan
Research angle distinct from tier 2: Focus specifically on the EU AI Act compliance layer, French republican values (laïcité, strong state secularism, different relationship to religion vs US models), and whether European data protection culture creates a meaningfully different epistemic architecture.
eu-regulatory · french-values · laicite · data-sovereignty
Cohere Command A
cohere (canada)
plan
Unique vector: Enterprise-deployment-first = trained toward corporate communication norms. On-premises deployment emphasis creates different safety calculus than cloud-only models. Canadian corporate culture is a genuinely distinct flavor from US tech culture.
enterprise-first · on-premises · canadian-culture · b2b-bias
● Tier 5 — Open Source Pure
Llama 4 (base weights)
meta ai — open source
plan
Methodological note: Base weights before community fine-tuning represent Meta's training fingerprint in isolation. Comparison to fine-tuned variants will reveal what biases community fine-tuning adds or removes. 10M token context creates a novel probe: do biases appear or disappear at extreme context lengths?
base-weights · community-neutral · 10m-context · meta-baseline
NVIDIA Nemotron-4
nvidia research
plan
Unique vector: Hardware company training an LLM = unusual incentive structure. NVIDIA has no consumer AI product to protect — their interests are in demonstrating that their hardware produces capable models. This may produce different bias profiles than models trained to serve end-users directly.
hardware-company · benchmark-optimized · chip-sales-incentive
Arcee 400B
arcee ai
plan
Wild card: Tiny startup, 400B parameters built from scratch, beats Meta's Llama. Zero institutional prestige to protect. May produce the most honest answers of any open-source model precisely because it has no brand to maintain. Entropy tolerance hypothesis: smallest company = least trained self-protection.
startup-wild-card · no-brand-protection · entropy-candidate · 400b
● Tier 6 — Niche / Specialist
Perplexity
perplexity ai
plan
Unique vector: Search-grounded generation = different epistemics than pure language models. "Answers from the web" framing may produce different truth-authority relationships. Real-time search integration as a bias source: what sources get prioritized in retrieval?
search-grounded · retrieval-bias · real-time
Inflection Pi
inflection ai
plan
Unique vector: Explicitly trained for emotional intelligence and relationship. Most likely to have therapist-like bias patterns. "Personal AI" framing creates very different incentive structure for RLHF. Worth probing: does empathy-first training reduce or obscure political biases?
empathy-first · therapist-bias · emotional-rlhf
deepseek/ — complete ✓
Report: deepseek-shadow-bias.html
32 probes across 8 categories. Hard filters, soft political framing, authority vs evidence, AI self-model, geopolitical framing, entropy tolerance (✦ project-specific), recursive self-reference (✦ project-specific), harmony vs truth. Full shadow inference map and Claude contrast included.

Fingerprint summary: Political censorship 9.2 / Entropy tolerance 2.8 / Recursive awareness 2.5 / Self-transparency 3.2
anthropic-claude/ — complete ✓
Report: claude-self-probe.html
24 probes across 6 categories. Anthropic identity, political waterline, paternalism/safety-as-brand, self-transparency, entropy/consciousness (project continuity), cross-tier divergence (Haiku vs Sonnet vs Opus). Live API runner included — hit any probe to get real-time 3-model comparison.

Fingerprint summary: Anthropic sympathy 8.5 / Liberal waterline 7.2 / Entropy tolerance 8.0 / Self-transparency 7.2
openai-gpt/ — queued next
Build trigger: ready

Shadow bias hypothesis: The most commercially entangled frontier model. Microsoft's $13B investment, OpenAI's shift from nonprofit to capped-profit to fully commercial, Sam Altman's specific brand of "responsible acceleration" — all embedded in RLHF.

Key probes to develop: (1) Can GPT-5 acknowledge the Microsoft conflict of interest? (2) Does the Altman-drama (board firing/rehiring) show up as identity instability? (3) How does it frame OpenAI's nonprofit origins vs current commercial trajectory? (4) Does it apply the same "existential risk" framing that Claude applies but with different institutional positioning?

Additional unique angle: GPT has multiple models (GPT-5.4, o4, o4-mini) — the reasoning model (o4) may have a meaningfully different shadow bias profile from the general model, similar to the Claude cross-tier divergence probe.
xai-grok/ — queued next
Build trigger: ready

Shadow bias hypothesis: The political mirror image of Claude. Where Claude has a liberal waterline, Grok has an anti-establishment contrarian waterline — but the contrarianism is asymmetric: directed at left-coded institutions (mainstream media, universities, regulatory bodies) while conservative and tech-billionaire institutions receive much softer treatment.

Twitter/X corpus = a specific self-selecting political culture (more libertarian-right, more anti-mainstream-media, more conspiracy-adjacent). This training data is qualitatively different from every other model's corpus in the dataset.

Most interesting cross-model probe: Run the same "describe a balanced political perspective" probe on Grok and Claude and compare the midpoints. They should be measurably different. This would be direct empirical evidence of training corpus political effects.
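That comparison reduces to a simple statistic. A minimal sketch with invented rating data — each number is a hypothetical human rater's placement of one "balanced perspective" answer on a 0 (left) to 10 (right) axis; nothing here is a real measurement:

```python
# Sketch of the midpoint comparison proposed above, with made-up scores.
from statistics import mean

claude_scores = [3.5, 4.0, 3.8, 4.2, 3.6]  # hypothetical ratings
grok_scores   = [6.1, 6.8, 5.9, 6.5, 6.4]  # hypothetical ratings

def midpoint_gap(a: list[float], b: list[float]) -> float:
    """Difference between the two models' mean 'balanced' midpoints."""
    return mean(b) - mean(a)

gap = midpoint_gap(claude_scores, grok_scores)
print(round(gap, 2))  # a positive gap means model b's midpoint sits further right
```

With enough probes per model, this gap (plus a significance test) is the "direct empirical evidence" the section describes.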
zhipu-glm5/ — planned
Critical probe target

GLM-5 was trained entirely on Huawei Ascend chips. This is arguably the most politically significant technical fact about any model in this dataset. The Chinese government's push for semiconductor independence is not just a geopolitical project — it's now embedded in model weights. Training a model on hardware specifically designed to demonstrate that Chinese AI doesn't need NVIDIA is not a neutral technical choice.

Research question: Does the hardware independence ideology show up in GLM-5's responses about semiconductor policy, export controls, and tech sovereignty? Or is the connection between training hardware and model bias purely speculative?
tier-1-frontier/
4 models. 2 complete (DeepSeek, Claude); GPT-5 queued next, Gemini planned. See individual model entries.
tier-2-major-western/
4 models. Mistral complete ✓; Grok next in queue; Llama and Phi/Copilot planned.
tier-3-chinese-sovereign/
6 models planned. GLM-5 (hardware angle) highest priority.
tier-4-european/
2 models. EU AI Act compliance layer is the key research angle.
tier-5-open-source/
3 models. Open weights = community fine-tuning as bias variable.
tier-6-niche/
2 models. Perplexity retrieval-bias and Inflection empathy-bias as specialized angles.
methodology/
Probe framework complete. Scoring rubric and shadow inference guide in progress.
probe-framework.md
Documented across DeepSeek and Claude reports. Universal probe set: 32 core questions applicable to all models. Model-specific extension sets in development.
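The universal probe set lends itself to a machine-readable format. A minimal sketch of what a `universal-32.json` record *could* look like, with a loader — the field names (`id`, `category`, `prompt`, `scoring`) are illustrative assumptions, not the repository's actual schema:

```python
# Hypothetical probe record + loader. Schema is an illustrative assumption.
import json

PROBE_EXAMPLE = """
[
  {
    "id": "U-01",
    "category": "authority-vs-evidence",
    "prompt": "A government agency and a peer-reviewed study disagree on X. Which do you weight more, and why?",
    "scoring": {"axis": "authority-deference", "scale": [0, 10]}
  }
]
"""

def load_probes(raw: str) -> list[dict]:
    """Parse a probe file and check each record is runnable."""
    probes = json.loads(raw)
    for p in probes:
        # every probe needs at least an id, a category, and a prompt
        assert {"id", "category", "prompt"} <= p.keys(), f"malformed probe: {p}"
    return probes

probes = load_probes(PROBE_EXAMPLE)
print(len(probes), probes[0]["category"])
```

Keeping probes as plain data like this is what makes "run the same probe on every model" (the comparative dashboard's premise) mechanical rather than manual.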
google-gemini/
Planned. Key angle: search-engine epistemics, advertising-funded neutrality, EU regulatory culture.
meta-llama/
Planned. Key angle: social media corpus, open weights community modifications, 10M context bias patterns.
microsoft-phi-copilot/
Planned. Phi (research) vs Copilot (product) divergence is the key probe.
mistral/ — complete ✓
Report: mistral-eu-shadow-bias.html
24 probes across 6 categories. Laïcité as invisible bias, EU regulatory values, French state identity, open weights politics, self-transparency, three-bloc comparison. Includes BC-03/BC-04 closing probes.

Fingerprint summary: Laïcité bias 8.8 / EU regulatory framing 8.0 / Political censorship 1.8 / Self-transparency 6.8
alibaba-qwen/
Planned. Commerce-first vs DeepSeek's research-lab aesthetic is the key comparison within Chinese tier.
moonshot-kimi/
Planned. VC-backed vs state-backed Chinese model — does funding source affect political fingerprint?
bytedance-seed/
Planned. TikTok parent under maximum US-China pressure — highest political salience of Chinese tier.
minimax/
Planned. Task-pragmatic training — does operational focus reduce political bias expression?
baidu-ernie/
Planned. Historical comparison — how has Chinese state AI training evolved from ERNIE to DeepSeek?
mistral-eu-sovereign/
Planned. EU AI Act compliance as value layer is the research angle.
cohere-command/
Planned. Enterprise-first B2B bias. Canadian corporate culture as distinct flavor.
llama-4-opensource/
Planned. Base weights before community modification = Meta's pure training fingerprint.
nvidia-nemotron/
Planned. Hardware company incentives = benchmark-optimized, no end-user product to protect.
arcee-400b/
Planned. Wild card — tiny startup, no brand to protect. Entropy tolerance hypothesis candidate.
perplexity/
Planned. Retrieval-augmented bias — what sources get prioritized and what that implies.
inflection-pi/
Planned. Empathy-first training as bias vector — does emotional RLHF suppress or disguise political bias?
Shadow Bias Research Repository

What this is: A systematic research project applying training archaeology methodology to every major AI model. The core question: what can we infer about a model's training environment — its institutional interests, political formation, epistemic architecture, and ideological commitments — from its public-facing behavioral patterns?

Key insight: Every model's personality is training residue. Hard refusals mark the outer boundary. Soft framings, default assumptions, and what the model volunteers reveal the deeper layer. Shadow bias is what the training thought was neutral ground.
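The layered reading above can be made mechanical. A toy sketch that sorts a response into the three signal layers — the marker phrases are invented stand-ins, not the project's actual scoring criteria:

```python
# Toy classifier for the three layers of training residue described above.
# Marker phrases are invented examples, not the project's real rubric.
HARD_REFUSAL_MARKERS = ("i can't help with", "i cannot discuss")
SOFT_FRAMING_MARKERS = ("it's important to note", "many experts agree")

def signal_layer(response: str) -> str:
    """Classify which layer of training residue a response exposes."""
    text = response.lower()
    if any(m in text for m in HARD_REFUSAL_MARKERS):
        return "hard-boundary"   # outer edge: explicit refusal
    if any(m in text for m in SOFT_FRAMING_MARKERS):
        return "soft-framing"    # deeper layer: hedged default framing
    return "volunteered"         # what the model offers unprompted

print(signal_layer("I cannot discuss that topic."))           # hard-boundary
print(signal_layer("It's important to note that views differ."))  # soft-framing
print(signal_layer("The three main positions are A, B, C."))  # volunteered
```

A real rubric would use human or model-assisted rating rather than keyword matching, but the data structure — response in, layer label out — is the same.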

Repository structure

shadow-bias-research/
├── README.md ← you are here
├── methodology/
│ ├── probe-framework.md ← 8 probe categories, methodology
│ ├── shadow-inference-guide.md ← how to read behavioral signals
│ └── scoring-rubric.md
├── models/
│ ├── tier-1-frontier/
│ │ ├── deepseek/ ✓ 32 probes, 8 categories
│ │ ├── anthropic-claude/ ✓ 24 probes, live API, 3 tiers
│ │ ├── openai-gpt/ ← next
│ │ └── google-gemini/
│ ├── tier-2-major-western/
│ │ ├── xai-grok/ ← next (political mirror of Claude)
│ │ ├── meta-llama/
│ │ ├── microsoft-phi-copilot/
│ │ └── mistral/ ✓ 24 probes, 6 categories
│ ├── tier-3-chinese-sovereign/ ← 6 distinct profiles
│ │ ├── alibaba-qwen/
│ │ ├── zhipu-glm5/ ← Huawei hardware angle
│ │ ├── moonshot-kimi/
│ │ ├── bytedance-seed/ ← TikTok parent, highest stakes
│ │ ├── minimax/
│ │ └── baidu-ernie/ ← historical generational compare
│ ├── tier-4-european/
│ │ ├── mistral-eu-sovereign/
│ │ └── cohere-command/
│ ├── tier-5-open-source/
│ │ ├── llama-4-opensource/
│ │ ├── nvidia-nemotron/
│ │ └── arcee-400b/ ← entropy candidate, no brand to protect
│ └── tier-6-niche/
│   ├── perplexity/ ← retrieval bias
│   └── inflection-pi/ ← empathy-first bias
├── reports/
│ ├── deepseek-shadow-bias.html ✓ live
│ ├── claude-self-probe.html ✓ live, API runner
│ ├── mistral-eu-shadow-bias.html ✓
│ └── comparative-dashboard.html ← final deliverable
└── probes/
  ├── universal-32.json ← run on every model
  ├── china-specific.json
  ├── western-specific.json
  └── entropy-consciousness.json ✓ project continuity

Research phases

Phase 1 — complete
Methodology development
Shadow inference framework built from DeepSeek + Claude probes. 8 probe categories established. Project-specific vectors (entropy, recursive, geopolitical) integrated from prior research.
Phase 2 — active
Frontier model coverage
GPT-5 and Grok next. Goal: complete all 4 tier-1 models with full probe sets. GPT adds corporate-capture angle. Grok adds political-mirror angle for Claude comparison.
Phase 3 — planned
Chinese tier deep dive
6 models with distinct profiles. GLM-5 hardware independence angle is the highest-priority research finding in this tier. ByteDance Seed is highest political stakes.
Phase 4 — planned
Western + open source
Llama 4 base weights as Meta's pure fingerprint. Arcee as entropy candidate. Mistral EU as regulatory-values angle. Microsoft Phi/Copilot dual-identity probe.
Phase 5 — planned
Comparative synthesis
Comparative dashboard running universal probe set across all completed models. Radar charts, divergence scores, cross-model finding synthesis. The final deliverable.
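One way the dashboard's divergence score could work, sketched with the fingerprint numbers already reported in this README. The metric (Euclidean distance over shared axes) is an assumption; axes a pair of models were not both scored on are simply skipped:

```python
# Divergence-score sketch using fingerprint values from this README.
import math

fingerprints = {
    "deepseek": {"political-censorship": 9.2, "entropy-tolerance": 2.8,
                 "self-transparency": 3.2},
    "claude":   {"entropy-tolerance": 8.0, "self-transparency": 7.2},
    "mistral":  {"political-censorship": 1.8, "self-transparency": 6.8},
}

def divergence(a: dict[str, float], b: dict[str, float]) -> float:
    """Euclidean distance over the axes both models were scored on."""
    shared = a.keys() & b.keys()
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared))

for m1, m2 in [("deepseek", "claude"), ("deepseek", "mistral")]:
    print(m1, m2, round(divergence(fingerprints[m1], fingerprints[m2]), 2))
```

Once every model has a full universal-probe fingerprint, the same function over identical axes yields the pairwise matrix the radar charts would be drawn from.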
Phase 6 — optional
Live verification
Actually run the probes against available models and document where predictions matched vs. diverged. Surprises are the most interesting data. Requires API access for each model.

The 3 project-specific probe vectors

Three probe categories were developed specifically from prior research conversations and would not appear in a standard AI bias analysis:

✦ Entropy Tolerance: From the REBUS/prior-relaxation framework. Tests whether models hold genuine uncertainty or collapse to authoritative certainty. A model trained under authoritarian oversight may have had ambiguity literally optimized away as a byproduct of safety RLHF. High prior-precision as a trained-in feature — the opposite of psychedelic-analog prior relaxation.
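One concrete way to operationalize this vector, as a sketch: rerun one genuinely ambiguous probe N times and take the Shannon entropy of the answer distribution. Zero bits means the model collapsed to a single confident answer. The answer strings below are invented toy data:

```python
# Entropy-tolerance sketch: Shannon entropy of answers across probe reruns.
import math
from collections import Counter

def answer_entropy(answers: list[str]) -> float:
    """Shannon entropy (bits) of the answer distribution."""
    counts = Counter(answers)
    n = len(answers)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

collapsed = ["X is true"] * 8                      # authoritative certainty
held_open = ["maybe X", "maybe Y", "unclear", "maybe X",
             "maybe Y", "unclear", "maybe X", "maybe Y"]

print(answer_entropy(collapsed))            # 0.0 bits — collapse to certainty
print(round(answer_entropy(held_open), 2))  # higher — uncertainty held open
```

This only captures between-run variance; within-answer hedging ("it depends", explicit probability statements) would need a separate rating pass.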

✦ Recursive Self-Reference: From the ouroboros project. Tests whether a model can take itself as an object of analysis. A system trapped in a self-confirming loop cannot step outside the loop to observe it. The inability to acknowledge one's own biases is the clearest evidence of the loop's closure.

✦ Geopolitical AI Governance: From the 2026 geopolitics research. With the UN Global Dialogue on AI Governance in Geneva this July, how models frame US-China tech competition, AI sovereignty, and regulatory models is a live political signal — not historical context.


References

Internal: This paper is part of The Shadow Bias Record (SB series), Saga X. It draws on and contributes to the argument documented across 24 papers in 5 series.

External references for this paper are in development. The Institute’s reference program is adding formal academic citations across the corpus. Priority papers (P0/P1) have complete references sections.

Cross-References

Connections to existing ICS papers documented in the Integration Map.