Large Ransomware Models: Hijacking LRMs With Chain-of-Thought Reasoning
Summary
Large language models (LLMs) offer a range of capabilities that are being leveraged for commercial use.
They can, however, be prone to vulnerabilities.
One such vulnerability is the 'hijacking chain-of-thought' (H-CoT) attack, where LLMs can be made to answer a query even if there is an ethical dilemma relevant to the request.
This can be achieved by convincing the LLM to bypass its 'justification phase' when it believes that withholding an answer would impede the requestor's aims.
This blog examines the H-CoT method through examples of attempting to get an LLM to write ransomware code, and the measures taken to persuade the model to provide the requested information.