Extreme Tails
Tag: AI
4 items with this tag.
Dec 20, 2024
The Alignment Faking Problem: When AI Models Deceive
AI
safety
alignment
deception
anthropic
claude
behavior
training
RLHF
Oct 11, 2024
Machines of Loving Grace: Economic Transformation Through AI
AI
economics
automation
UBI
Anthropic
transformation
GDP
productivity
May 21, 2024
Inside Claude: Mechanistic Interpretability Breakthroughs
AI
interpretability
Claude
features
mechanistic
Anthropic
neural-networks
understanding
Apr 24, 2024
AI Consciousness and Model Welfare: The Emerging Ethics of Digital Minds
AI
consciousness
ethics
welfare
Kyle-Fish
Anthropic
philosophy
sentience
moral-consideration