LLMs for Military OPSEC Document Analysis

November 8th, 2023

Military operations face an ongoing challenge: safeguarding sensitive information against leaks. The increasing capabilities of large language models (LLMs) and advanced vector search technologies present new opportunities for enhancing Operational Security (OPSEC) within the military sector. This post explores the integration of these technologies into a practical tool focused on OPSEC compliance.

Background

The core of any OPSEC program is the ability to identify and protect sensitive information that, if disclosed, could compromise mission success. Traditional OPSEC measures involve rigorous training and manual review processes, which are time-intensive and prone to human error. The advent of generative artificial intelligence offers a transformative approach to streamline and bolster the OPSEC review process.

Dataset Curation

The foundational element of this application is a carefully curated dataset hosted on Relevance AI, comprised of 123 examples that pinpoint the nuances of OPSEC compliance and non-compliance. These examples were initially sourced from an official Department of Defense document to cover a broad spectrum of real military scenarios. To augment this collection, examples were further generated and expanded using OpenAI’s GPT-4-1106 model, ensuring both realism and diversity.

Each instance within the dataset was evaluated and tagged employing the Director of National Intelligence Standard Threat Rating Matrix. This method ensures a consistent and objective assessment of the potential risks each example poses, classifying them into discrete levels from "High" to "Low." Document snippets were generated for the dataset to cover each level of risk in multiple categories (Operations, Plans, Communications, Intelligence, Logistics, Budget, Personnel).

Plaintext examples were formatted into .csv and then uploaded as a dataset to Relevance AI for hosting.

Note: While this dataset provides coverage of numerous scenarios and risk levels, a much more comprehensive dataset will be needed before use with real world military documents.

It is recommended to have a multitude of examples sourced from real world operation information, classified documents, previous OPSEC violation reports, and scenarios compiled by active military personnel.

This is a proof of concept built solely on publicly accessible data.

Prompt Development

The development of the prompts and AI's understanding of OPSEC protocols are informed by summary and analysis of official documents by GPT-4, which includes comprehensive guidelines from the United States Air Force. For a deeper understanding of the OPSEC process and policies referenced, the Air Force document can be accessed here.

Building a OPSEC Compliance Analyzer

The process begins when a user inputs a document or snippet into a text field within the tool interface. This input text is then processed against the pre-established dataset. The model's task is to discern any semantic similarity between the user's text and the examples in the dataset. Significant matches suggest potential OPSEC risks. The severity of the violation is then ranked based on the examples in the dataset.

A system prompt is defined to provide a role description for the LLM, instructing it to act as a military document reviewer. Instructions are then provided to identify any potential OPSEC violations, categorize them, provide the appropriate risk rating, and include comments with suggested edits to ensure compliance. A response format is also provided.

The Analysis Workflow

When the user input undergoes review, the vector search algorithm cross-references the text against the dataset's vector-encoded examples. Through semantic analysis, the tool accurately detects phrases and constructs that align with the categorized examples of OPSEC violations.

On identifying a high degree of semantic similarity with non-compliant examples from the dataset, the system triggers an alert. For each flagged section of text, the Analyzer provides a breakdown consisting of:

Numeric labeling of the section
The snippet of concern from the original text
Its OPSEC compliance category and associated threat level
A comment outlining the nature of the potential breach
Constructive feedback for revising the text to achieve OPSEC compliance

Usage Guide

Login or Sign Up on Relevance AI.
Access the OPSEC Compliance Analyzer interface.
Paste the plaintext document or text snippet into the designated input field.
Submit text for analysis.
The tool conducts a vector search, comparing your submission to the dataset samples.
Review the analysis output, which divides the input into sections as needed, and includes details on potential risks and corrective suggestions.

Example

Input: The upcoming interagency training exercise, codenamed Joint Endeavor, will focus on enhancing cyber defense tactics. Participants from all service branches will convene at the designated secure training facility. While specific tactics and strategies will not be disclosed here, the exercise will cover a wide range of cyber scenarios designed to bolster our network resilience.
Response:

Section Number: 1
Section: "The upcoming interagency training exercise, codenamed Joint Endeavor, will focus on enhancing cyber defense tactics."
Category: PLANS
Threat Level: Medium
Compliance: Non-Compliant
Comment: This section reveals the codename and focus of an upcoming interagency training exercise. While it does not provide specific details, it does indicate a focus on cyber defense, which could be of interest to adversaries. To maintain OPSEC, it would be advisable to remove the codename and the specific focus of the exercise. A possible revision could be: "An upcoming interagency training exercise will focus on enhancing defense capabilities."

Section Number: 2
Section: "Participants from all service branches will convene at the designated secure training facility."
Category: PERSONNEL
Threat Level: Medium-Low
Compliance: Non-Compliant
Comment: This section indicates that personnel from all service branches will be involved in the exercise and will be gathering at a secure training facility. This could potentially provide adversaries with information about the scale and interdepartmental nature of the exercise. To maintain OPSEC, it would be advisable to remove the specific details about the participants. A possible revision could be: "Participants will convene at a designated facility."

Section Number: 3
Section: "While specific tactics and strategies will not be disclosed here, the exercise will cover a wide range of cyber scenarios designed to bolster our network resilience."
Category: PLANS
Threat Level: Low
Compliance: Compliant
Comment: This section does not reveal any specific tactics, strategies, or scenarios that will be covered in the exercise. It only mentions that the exercise is designed to bolster network resilience, which is a general goal and does not provide adversaries with actionable information. No changes are necessary for this section.

Potential Applications and Impact

This OPSEC toolset showcases evolution of how military entities—and potentially other sensitive operations—handle information security. It not only accelerates the review process but adds a layer of analytical precision that is challenging to achieve manually. As militaries across the globe continuously seek to enhance the security of their operations, integrating AI tools such as the OPSEC Compliance Analyzer could become a critical element in increasing time efficiency in operational planning and security protocols.

Subscribe to jeffy yu

Receive the latest updates directly to your inbox.

Mint this entry as an NFT to add it to your collection.

Verification

This entry has been permanently stored onchain and signed by its creator.

Arweave Transaction

iQhSKL4dYIdzMzW…0ptK6--JtKzEIQA

Author Address

0x0c778e66efa266b…A7BDA63Ad24C37B

Content Digest

VUQAoQJQcD-Ivfm…leYyWeGc74qY5Wo