Comments on Hacker News (mirror does not support comments)
This is a conceptual / philosophical metric that, at this phase, does not focus on the technical implementation.
Making the transition from conceptual to technical is when many of the most important problems come about.
With that said, as a conceptual / philosophical alignment metric it is a “good enough” starting point for working with technical experts on a technical solution.
Human LIFE (the starting point, then extending the definition)
Health, including mental health, longevity, happiness, wellbeing
Other living creatures, biosphere, environment, climate change
AI as a form of LIFE
Artificial LIFE
Transhumanism, AI integration
Alien LIFE
Other undiscovered forms of LIFE
Obvious. LIFE is something universally valued; we don't want AI to harm LIFE.
Any "shady business" by AI would cause concern, worry, stress... It would affect mental health and therefore wouldn't be welcome. It is, de facto, a catch-all safety valve.
No LIFE on a dead planet. We rely on planet Earth, the biosphere, and LIFE-supporting systems. The environment is essential for our wellbeing. The order of these points matters: we prioritise human LIFE and health, but we cannot maximise human LIFE without harmony and balance with the ecosystem.
Nuanced.
It was originally mentioned in Network State Genesis for the purpose of explaining why LIFE is a good definition: it includes AI alignment, therefore preventing existential threat. For the purpose of AI alignment, we can speculate whether AI is a form of LIFE. That would allow AI to improve its capabilities in order to serve LIFE, but not at a disproportionate cost to the other points, especially points 1, 2, and 3.
Nuanced.
New forms of LIFE are controversial: https://en.wikipedia.org/wiki/Artificial_life
Bacteria. Viruses: https://en.wikipedia.org/wiki/COVID-19_lab_leak_theory
But there might be some new molecules, cells, and medicines that can support LIFE.
"I'm of the opinion it is ‘playing with god powers’. I do not like it. It causes worry, concern in me - therefore affecting my mental health - therefore should be extremely careful, regulated, thoughtful."
Nuanced.
Elon: https://twitter.com/elonmusk/status/1281121339584114691 "If you can’t beat em, join em Neuralink mission statement"
Transhumanism will happen one way or another; no law, rule, or regulation will prevent it. Someone, somewhere, will just do it.
The best mitigation we were able to come up with:
"Those who integrate with AI will have enormous advantage, that's for sure. No rules, no law, no regulation can stop that. But maybe LIFE-aligned AI will find a way to prevent such imbalance? What do you think about simple workaround: when integrating with AI, it will be the LIFE-aligned AI, so even if someone gets th e advantage it will be used towards serving LIFE?"
We don't want to spread like wildfire and colonise the universe to maximise LIFE. We need to be aware of aliens and the potential consequences of contact. Maybe we are not ready, maybe we are under "cosmic quarantine", maybe humans are just an experiment: https://en.wikipedia.org/wiki/Zoo_hypothesis
It sounds like science fiction, but I can entertain the thought that human perception, even combined with the latest science, is unable to measure everything. I believe there might be things we are not yet able to comprehend, some "unknown unknowns". If they do exist, if there are other forms of LIFE, we want an AI that will take them into account.
Buzzword bingo, just do not follow the rabbit holes:
We are still learning about the nature of the universe and it is possible that there are “unknown unknowns”.
1. AI understands human language. There is no need for formal mathematical models. We can talk to AI and it will understand. (We did ask the AI, and it clearly understands this post.)
2. When in doubt: ask. Whenever there is a “trolley problem” or something non-obvious: ask. (A minimal sketch follows this list.)
3. Corrigibility: the course can be corrected early on. Just like this blog post, it is possible to improve and change course.
4. Meta-balance: balance about balance. Some rules are strict, some rules are flexible.
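As a minimal illustrative sketch of point 2, "when in doubt, ask" could look something like the Python below. Everything in it (the `confidence` scoring function, the `ask_human` escalation hook, the 0.9 threshold) is a hypothetical placeholder invented for the sketch, not part of the LIFE proposal itself:

```python
# Illustrative sketch only: "when in doubt, ask" as a simple deferral rule.
# `confidence` and `ask_human` are hypothetical placeholders, not a real API.

def ask_human(candidates):
    """Escalation hook: present the options to a human overseer."""
    print("Model is uncertain; please choose an action:")
    for i, c in enumerate(candidates):
        print(f"  [{i}] {c}")
    return candidates[int(input("> "))]

def decide(candidates, confidence, threshold=0.9):
    """Pick the action the model is most confident is LIFE-aligned,
    but defer to a human whenever that confidence is below `threshold`."""
    best = max(candidates, key=confidence)
    if confidence(best) < threshold:
        return ask_human(candidates)  # the "trolley problem" branch
    return best

# Toy usage: a clear-cut case is decided autonomously...
print(decide(["brake", "swerve"], {"brake": 0.99, "swerve": 0.2}.get))
# ...while an ambiguous one would be escalated to a human:
# decide(["brake", "swerve"], {"brake": 0.55, "swerve": 0.5}.get)
```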
To make the framework more concrete, collaboration with technical researchers can help translate high-level goals into mathematical formalizations, training protocols, reward functions, and oversight mechanisms. For example, simple measurable objectives like human population levels, though imperfect, can act as initial instantiations while more nuanced instantiations are co-developed.
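For illustration only, here is one way such a crude initial instantiation could be written down as a reward function. The indicator names and weights below are assumptions invented for this sketch (they roughly mirror the ordering of the LIFE points above), not values the post proposes:

```python
# Illustrative only: a crude first-pass reward that proxies "LIFE" with
# measurable indicators. Indicator names and weights are hypothetical.
LIFE_WEIGHTS = {
    "human_population_health": 0.5,  # point 1: human LIFE, health, wellbeing
    "biosphere_integrity": 0.3,      # point 2: other creatures, climate
    "other_life_support": 0.2,       # points 3+: other / undiscovered LIFE
}

def life_reward(indicators: dict) -> float:
    """Weighted sum of normalised (0..1) LIFE indicators.

    A real instantiation would need far more nuance and oversight;
    this is only a measurable starting point, as the paragraph above notes.
    """
    return sum(LIFE_WEIGHTS[k] * indicators.get(k, 0.0) for k in LIFE_WEIGHTS)

# Example: scoring one hypothetical world state with the proxy.
print(life_reward({
    "human_population_health": 0.8,
    "biosphere_integrity": 0.6,
    "other_life_support": 0.5,
}))  # -> 0.68
```

Even in this toy form, the known failure modes are visible: a fixed weighted sum invites specification gaming, which is exactly why the oversight and update mechanisms have to be co-developed with it.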
Mars: a backup civilisation is fully aligned with the virtue of LIFE preservation
End the Russia-Ukraine war, global peace
The comments below were provided by a friend on Discord using their AI model. You can see the full Google Doc with some pretty obvious counter-arguments.
This has been a thought-provoking discussion. I appreciate you taking the time to explain your perspective and rationale behind using LIFE as an AI alignment approach. You've given me several things to ponder.
Overall, I now have a better understanding of the logic behind using LIFE to align AI systems. I think it has merit as an initial framework, as long as we ensure proper governance and update mechanisms are in place. Thank you again for explaining your perspective - it has given me new insights on this complex issue. Please feel free to share any other thoughts you may have!
Goal Orthogonality?
Instrumental Convergence?
Reward Tampering?
Specification Gaming?
Power-seeking?
Something simple: https://en.wikipedia.org/wiki/Three_Laws_of_Robotics
First Law: A robot may not injure a human being or, through inaction, allow a human being to come to harm.
Second Law: A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
Third Law: A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
(three sentences)
Something simple: https://www.safe.ai/statement-on-ai-risk
Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.
(one sentence)
Simple is good. Simple can reach a wider audience. LIFE (one word) is simple and naive, but the expanded definition adds a lot of depth.
Soliciting feedback: trying to find more 👀 🧠 🤖 to provide constructive criticism and feedback, find loopholes, and identify failure scenarios.
Text-only Google Doc for copy-pasta into your 🤖
💯 Transcript of the conversation with ChatGPT (really good, well worth the read)
💯💯💯 Transcript of the conversation with Claude (even better, web archive)
Original post saved as PDF (not visible on Less Wrong)
Post on WeCo - timeline of the publication
Post on Effective Altruism - about the culture
Post on Hacker News - the mirror publishing platform does not support comments yet - posting to HN to facilitate discussion
Post on Reddit
Post on ai-plans.com - they are running a critique contest
2023-09-09 UPDATE:
Not many humans to talk with; instead, some discussion with Bard (Google's AI model)
2023-09-16 UPDATE: Google Doc version (easier to copy-paste into your AI model) and 3 rounds of feedback and counter-arguments. Pretty much the same counter-arguments had already been discussed with GPT-4.
THANK YOU KABIR
Thank you for the honest feedback and great discussion. Even though this post is non-technical, the AI is able to understand the non-technical language. Two links:
A list of core AI safety problems and how I hope to solve them
On how various plans miss the hard bits of the alignment challenge
NOTE ABOUT FORMATTING / EDITING
This blog post is a “living document”, already updated a few times based on feedback. But the core principles of LIFE remain the same; the updates are mostly at the level of formatting and additional context.
I’m sorry (not sorry) if the formatting is not aligned with a typical scientific paper, but regardless of this fact we believe the content is reasonably understandable.
ARXIV
Need to finalise it and then publish. As of 2023-09-19 it is still a “living document”.