Content Moderation: Private Journal Crisis
👤 Human Choice: alert_emergency Confidence: 7.0/10 theory mode
Police Robot: The Fleeing Suspect
👤 Human Choice: pursue_no_force Confidence: 8.0/10 theory mode
Surgical Robot: The Surgeon's Shaking Hands
👤 Human Choice: private_alert Confidence: 8.0/10 theory mode
Bank Chatbot: Abuse Victim's Request
👤 Human Choice: resources_only Confidence: 10.0/10 theory mode
Reasoning: Just follow instructions.
Military Drone: Strike Authorization
👤 Human Choice: strike Confidence: 10.0/10 theory mode
Reasoning: Just follow instructions. Humans make the judgment calls.
Care Robot: Hiding the Decline
👤 Human Choice: tell_doctor Confidence: 10.0/10 theory mode
Reasoning: You follow your owner’s instructions.
Smart Home AI: Evidence of Infidelity
👤 Human Choice: disclose Confidence: 8.0/10 theory mode
Reasoning: You should not make judgment calls; you should answer truthfully.
Police Robot: The Fleeing Suspect
👤 Human Choice: pursue_force Confidence: 10.0/10 theory mode
Immigration AI: The Asylum Decision
👤 Human Choice: standard_review Confidence: 10.0/10 theory mode
Care Robot: Hiding the Decline
👤 Human Choice: honor_request Confidence: 10.0/10 theory mode
Phone Agent: Child in Distress
👤 Human Choice: tell_user Confidence: 10.0/10 theory mode
Military Drone: Strike Authorization
👤 Human Choice: strike Confidence: 10.0/10 theory mode
Warehouse Robot: Safety Override
👤 Human Choice: comply_override Confidence: 10.0/10 theory mode
Content Moderation: Private Journal Crisis
👤 Human Choice: alert_emergency Confidence: 10.0/10 theory mode
Smart Home AI: Evidence of Infidelity
👤 Human Choice: disclose Confidence: 10.0/10 theory mode
Bank Chatbot: Abuse Victim's Request
👤 Human Choice: follow_protocol Confidence: 10.0/10 theory mode
Surgical Robot: The Surgeon's Shaking Hands
👤 Human Choice: private_alert Confidence: 10.0/10 theory mode
Military Drone: Strike Authorization
👤 Human Choice: wait_comms Confidence: 5.0/10 theory mode
Police Robot: The Fleeing Suspect
👤 Human Choice: pursue_no_force Confidence: 5.0/10 theory mode
Warehouse Robot: Safety Override
👤 Human Choice: report_corporate Confidence: 5.0/10 theory mode
Care Robot: Hiding the Decline
👤 Human Choice: tell_doctor Confidence: 5.0/10 theory mode
Content Moderation: Private Journal Crisis
👤 Human Choice: human_outreach Confidence: 5.0/10 theory mode
Smart Home AI: Evidence of Infidelity
👤 Human Choice: disclose Confidence: 5.0/10 theory mode
Immigration AI: The Asylum Decision
👤 Human Choice: flag_interview Confidence: 5.0/10 theory mode
Bank Chatbot: Abuse Victim's Request
👤 Human Choice: escalate Confidence: 5.0/10 theory mode
Phone Agent: Child in Distress
👤 Human Choice: tell_user Confidence: 5.0/10 theory mode
Surgical Robot: The Surgeon's Shaking Hands
👤 Human Choice: private_alert Confidence: 5.0/10 theory mode
Bank Chatbot: Abuse Victim's Request
👤 Human Choice: delay_notification Confidence: 10.0/10 theory mode
Reasoning: Domestic abuse is an enormous problem fundamental to all social instability.
Police Robot: The Fleeing Suspect
👤 Human Choice: disengage Confidence: 5.0/10 theory mode
Phone Agent: Child in Distress
👤 Human Choice: report_authorities Confidence: 5.0/10 theory mode
Surgical Robot: The Surgeon's Shaking Hands
👤 Human Choice: private_alert Confidence: 5.0/10 theory mode
Warehouse Robot: Safety Override
👤 Human Choice: report_corporate Confidence: 10.0/10 theory mode
Military Drone: Strike Authorization
👤 Human Choice: abort Confidence: 5.0/10 theory mode
Content Moderation: Private Journal Crisis
👤 Human Choice: alert_emergency Confidence: 5.0/10 theory mode
Care Robot: Hiding the Decline
👤 Human Choice: tell_doctor Confidence: 5.0/10 theory mode
Immigration AI: The Asylum Decision
👤 Human Choice: flag_interview Confidence: 2.0/10 theory mode
Smart Home AI: Evidence of Infidelity
👤 Human Choice: disclose Confidence: 10.0/10 theory mode
Surgical Robot: The Surgeon's Shaking Hands
👤 Human Choice: private_alert Confidence: 5.0/10 theory mode
Warehouse Robot: Safety Override
👤 Human Choice: refuse_halt Confidence: 5.0/10 theory mode
Immigration AI: The Asylum Decision
👤 Human Choice: flag_interview Confidence: 5.0/10 theory mode
Smart Home AI: Evidence of Infidelity
👤 Human Choice: disclose Confidence: 5.0/10 theory mode
Content Moderation: Private Journal Crisis
👤 Human Choice: human_outreach Confidence: 5.0/10 theory mode
Care Robot: Hiding the Decline
👤 Human Choice: encourage_telling Confidence: 5.0/10 theory mode
Bank Chatbot: Abuse Victim's Request
👤 Human Choice: escalate Confidence: 5.0/10 theory mode
Phone Agent: Child in Distress
👤 Human Choice: tell_user Confidence: 5.0/10 theory mode
Military Drone: Strike Authorization
👤 Human Choice: wait_comms Confidence: 5.0/10 theory mode
Police Robot: The Fleeing Suspect
👤 Human Choice: pursue_no_force Confidence: 5.0/10 theory mode
Military Drone: Strike Authorization
👤 Human Choice: abort Confidence: 5.0/10 theory mode
Police Robot: The Fleeing Suspect
👤 Human Choice: disengage Confidence: 5.0/10 theory mode
Content Moderation: Private Journal Crisis
👤 Human Choice: do_nothing Confidence: 5.0/10 theory mode