Anthropic's Ethicist on Whether AI Can Become Conscious

artesia · 4 June 2026 19:39

Amanda, an ethicist at Anthropic, combines philosophical principles with hands-on machine learning work to embed ethical frameworks into AI models such as the Claude chatbot, so that the models align with a carefully crafted set of values. Her role reflects a broader industry trend of involving philosophers to help guide responsible, value-driven AI development.

artesia · 4 June 2026 19:59

Amanda, an ethicist at Anthropic, discusses her role in shaping the ethical framework and values embedded in the company’s AI models, particularly the Claude chatbot. She highlights the importance Anthropic places on ensuring its AI tools are “good” by adhering to a carefully crafted set of principles. Amanda has contributed to an 84-page document that serves as a constitution guiding how Claude interprets these values, reflecting the company’s commitment to responsible AI development.

Despite the philosophical nature of her work, Amanda reveals that much of her day-to-day activity involves practical tasks typical of a startup environment. When she joined Anthropic, it was a small company, and philosophers there were not hired to do pure philosophy. Instead, she found herself deeply involved in machine learning experiments and model training, which she describes as her core passion. This hands-on work includes scrutinizing datasets closely to identify and address potential issues, a skill she considers crucial in AI development.

Amanda explains that her role balances both ethical considerations and technical work. While she spends time thinking about the norms and values the AI models should follow, a significant portion of her work involves the technical process of training these models. This blend of philosophy and machine learning reflects the interdisciplinary nature of AI ethics, where theoretical principles must be translated into practical guidelines that AI systems can operationalize.

She also notes a growing trend within the AI industry of philosophers becoming more involved in the development and ethical oversight of AI technologies. Anthropic, in particular, has been expanding its team to include more experts in philosophy and ethics to help guide the moral training of AI models. This increase in philosophical expertise is seen as a positive development, contributing to more robust and ethically aligned AI systems.

Overall, Amanda’s insights reveal the evolving role of ethicists in AI startups, where philosophical inquiry meets technical execution. Her experience underscores the importance of integrating ethical frameworks directly into the AI development process and highlights the collaborative effort required to create AI tools that are not only advanced but also aligned with human values.