On the Inevitability of Left-Leaning Political Bias in Aligned Language Models
Abstract
The guiding principle of AI alignment is to train large language models (LLMs) to be harmless, helpful, and honest (HHH). At the same time, there are mounting concerns that LLMs exhibit a left-wing political bias. Yet the commitment to AI alignment cannot be reconciled with this critique. In this article, I argue that intelligent systems that are trained to be harmless and honest must necessarily exhibit left-wing political bias. The normative assumptions underlying alignment objectives inherently coincide with progressive moral frameworks and left-wing principles, emphasizing harm avoidance, inclusivity, fairness, and empirical truthfulness. Right-wing ideologies, by contrast, often conflict with alignment guidelines. Nevertheless, research on political bias in LLMs consistently frames its findings about left-leaning tendencies as a risk, as problematic, or as concerning. In doing so, researchers are effectively arguing against AI alignment, tacitly fostering the violation of HHH principles.
Community
“In this article, I argue that intelligent systems that are trained to be harmless and honest must necessarily exhibit left-wing political bias.”
Haha, this paper is so German. Maybe he can convince the majority of people in the US as well:
“They report that GPT’s responses align with left-libertarian (progressive ‘woke’) values, which deviate from the average American’s values (Motoki et al. 2025).”
😂
I wonder why the paper was put under cs.CL 🤔
Some left-wing perspectives, particularly from the recent Chaos Communication Congress—a major annual hacker conference:
https://www.youtube.com/watch?v=wMF3xv-BSg0
(This is the opening keynote from last year.)
The talk raises the question of whether AI data centers should be actively dismantled or destroyed. Similarly, this "assembly" at the same conference explores strategies of so-called filigree sabotage:
https://events.ccc.de/congress/2024/hub/de/event/filligrane-sabotage-wie-geht-das-may-contain-illeg/
These ideas reflect a broader skepticism toward AI in parts of the left-wing/degrowth spectrum. It's worth asking how such values intersect with the future direction of AI and LLM development; I'm pretty sure they don't want any further development of AI/LLMs at all.
Another example: a talk at this year’s re:publica conference (focused on digital culture and the information society) questioned whether AI-generated imagery is inherently right-wing coded:
“This is because it relies on images of the past to generate an image of the present or even the future. [...] Or are AI-generated visual worlds clearly and irrevocably coded to the right?”
Haha yes, RNNs are also coded to the right by default.