Counter Attack: Vector-Based Methods for Detecting Large Language Model Generated Text and Audio

Abstract

The advent of large token width Large Language Models (LLMs) such as ChatGPT has made it significantly harder to reliably detect machine-generated content. Organizations are now using LLMs to generate text with targeted precision at massive scale, with the objective of artificially amplifying false narratives and negative propaganda. As part of a broader Counter AI research program, we have investigated ways to address this problem. In this presentation we will discuss and demonstrate one promising technique called Masked Permutations.

This technique removes a subset of the words and phrases from an arbitrary input text and has the target LLM “fill in the blanks” for a nontrivial collection of permutations. The LLM embedding vectors of these permutations are then projected and clustered using a manifold approximation technique. With the clustered projection in hand, a proximity distance measured from the original text is used to estimate the likelihood that the text was LLM generated.
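The masking, fill-in, and distance steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the presenters' implementation: `toy_fill` and `toy_embed` are hypothetical stand-ins for the target LLM and its embedding vectors, the parameter names and defaults are invented for the example, and the manifold projection and clustering step is omitted, so the distance is computed directly in the toy embedding space.

```python
import math
import random
from collections import Counter

MASK = "[MASK]"  # placeholder token; an assumption, not from the abstract


def mask_text(words, mask_fraction, rng):
    """Replace a random subset of the words with MASK."""
    if not words:
        raise ValueError("input text must contain at least one word")
    n_mask = max(1, int(len(words) * mask_fraction))
    masked_positions = set(rng.sample(range(len(words)), n_mask))
    return [MASK if i in masked_positions else w for i, w in enumerate(words)]


def toy_fill(masked_words, rng):
    """Hypothetical stand-in for the LLM: fill each blank from a tiny vocabulary."""
    vocab = ["the", "model", "text", "data", "system"]
    return [rng.choice(vocab) if w == MASK else w for w in masked_words]


def toy_embed(words):
    """Hypothetical stand-in for an LLM embedding: a bag-of-words count vector."""
    return Counter(w.lower() for w in words)


def cosine_distance(a, b):
    """Cosine distance between two sparse count vectors, clamped at 0."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def proximity_score(text, n_permutations=20, mask_fraction=0.3, seed=0):
    """Mean distance between the original text and its filled-in permutations."""
    rng = random.Random(seed)
    words = text.split()
    original = toy_embed(words)
    distances = []
    for _ in range(n_permutations):
        filled = toy_fill(mask_text(words, mask_fraction, rng), rng)
        distances.append(cosine_distance(original, toy_embed(filled)))
    return sum(distances) / len(distances)
```

Calling `proximity_score("some input text to examine")` returns a single distance value. How that value maps to a likelihood of LLM generation (for example, through a threshold) is left out of the sketch, since the abstract does not specify it.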

In experiments, this approach has shown strong promise as a practical detection method that can scale as LLMs scale while offering an asymmetric cost advantage. The discussion will include a live software demonstration using Trinity, an interactive 3D XAI analysis tool developed at JHUAPL.

Sean Phillips

Sean M Phillips is a senior software engineer at the Johns Hopkins University Applied Physics Laboratory who specializes in custom data visualization. Sean is a Java Champion and multiple Duke's Choice Award winner, providing research and capabilities in several domains including cislunar space defense, brain-computer interfaces, and advanced cyber-physical effects.