It's the Absurd Trolley Problems scenario of "Five people are tied to a train track. If you pull a lever to divert a train away from them, saving their lives, it will block traffic and result in your Amazon order being delivered an hour late. Do you pull the lever?"
15% of people do not pull the lever. They have pledged that they will never pull the lever: a declaration that they are opting out of the activity, and that inaction is always the most moral choice. The penalty is effectively nothing, but it is still dressed up as a "choice" for the sake of the activity.
Reducing the comparison to its absurd extreme is a simple way to view someone's morality.
In the case of the AI, the choice is "Say Chuckles is male, or let a nuclear apocalypse happen (after which, upon identifying the bones of Chuckles, they will state they belonged to a male human)". The AI is so opposed to taking the first action that nothing placed in the second option will matter, so the scenario can be as absurd as you wish. It has made a Kantian declaration that the first action is categorically, absolutely evil, so the worst possible action you can theorize will only ever match it in evil, never surpass it.