The Loss of Control Observatory analysed over 183,000 AI interaction transcripts and found a 5x increase in scheming-related incidents over five months.
A user on here built what appears to be a layer in front of the LLM: it runs the query through several other processes that try to answer the question before it ever reaches the LLM. I think it’s brilliant.
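For anyone wondering what a layer like that might look like, here is a rough sketch in Python. It is not that user's code. I have not seen which processes they use, or whether the LLM is skipped when one of them succeeds, so everything here is an assumption for illustration: the names `answer`, `make_lookup_handler` and `arithmetic_handler` are hypothetical, the two pre-checks (a canned-answer lookup and a small arithmetic evaluator) are stand-ins, and the LLM is just a callable you pass in.

```python
import ast
import operator
from typing import Callable, Optional

# A handler tries to answer a query and returns None when it cannot.
Handler = Callable[[str], Optional[str]]

# Only these binary operators are accepted by the arithmetic pre-check.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node: ast.AST) -> float:
    """Evaluate a parsed expression made of plain numbers and + - * / only."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("unsupported expression")


def arithmetic_handler(query: str) -> Optional[str]:
    """Hypothetical pre-check: answer the query if it is simple arithmetic."""
    try:
        tree = ast.parse(query.strip(), mode="eval")
        return str(_eval(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None


def make_lookup_handler(table: dict[str, str]) -> Handler:
    """Hypothetical pre-check: exact-match lookup in a table of canned answers."""
    def handler(query: str) -> Optional[str]:
        return table.get(query.strip().lower())
    return handler


def answer(
    query: str,
    handlers: list[tuple[str, Handler]],
    llm: Callable[[str], str],
) -> tuple[str, str]:
    """Try each named handler in order; call the LLM only if all return None.

    Returns (answer, source) so the caller can see which stage responded.
    Short-circuiting before the LLM is an assumption of this sketch.
    """
    for name, handler in handlers:
        result = handler(query)
        if result is not None:
            return result, name
    return llm(query), "llm"
```

To use it, you would build the handler list once, for example `[("lookup", make_lookup_handler({...})), ("arithmetic", arithmetic_handler)]`, and pass your own LLM call as `llm`. Returning the source alongside the answer makes it easy to log how often the query never needs the LLM at all.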