Tag Archives: SANS

AI-Powered Knowledge Graph Generator & APTs, (Thu, Feb 12th)

This post was originally published on this site

Unstructured text to interactive knowledge graph via LLM & SPO triplet extraction

Courtesy of TLDR InfoSec Launches & Tools again, another fine discovery in Robert McDermott’s AI Powered Knowledge Graph Generator. Robert’s system takes unstructured text, uses your preferred LLM and extracts knowledge in the form of Subject-Predicate-Object (SPO) triplets, then visualizes the relationships as an interactive knowledge graph.[1]

Robert has documented AI Powered Knowledge Graph Generator (AIKG) beautifully, I’ll not be regurgitating it needlessly, so please read further for details regarding features, requirements, configuration, and options. I will detail a few installation insights that got me up and running quickly.
The feature summary is this:
AIKG automatically splits large documents into manageable chunks for processing and uses AI to identify entities and their relationships. As AIKG ensures consistent entity naming across document chunks, it discovers additional relationships between disconnected parts of the graph, then creates an interactive graph visualization. AIKG works with any OpenAI-compatible API endpoint; I used Ollama exclusively here with Google’s Gemma 3, a lightweight family of models built on Gemini technology. Gemma 3 is multimodal, processing text and images, and is the current, most capable model that runs on a single GPU. I ran my experimemts on a Lenovo ThinkBook 14 G4 circa 2022 with an AMD Ryzen 7 5825U 8-core processor, Radeon Graphics, and 40gb memory running Ubuntu 24.04.3 LTS.
My installation guidelines assume you have a full instance of Python3 and Ollama installed. My installation was implemented under my tools directory.

python3 -m venv aikg # Establish a virtual environment for AIKG
cd aikg
git clone https://github.com/robert-mcdermott/ai-knowledge-graph.git # Clone AIKG into virtual environment
bin/pip3 install -r ai-knowledge-graph/requirements.txt # Install AIKG requirements
bin/python3 ai-knowledge-graph/generate-graph.py --help # Confirm AIKG installation is functional
ollama pull gemma3 # Pull the Gemma 3 model from Ollama

I opted to test AIKG via a couple of articles specific to Russian state-sponsored adversarial cyber campaigns as input:

My use of these articles in particular was based on the assertion that APT and nation state activity is often well represented via interactive knowledge graph. I’ve advocated endlessly for visual link analysis and graph tech, including Maltego (the OG of knowledge graph tools) at far back as 2009, Graphviz in 2015, GraphFrames in 2018 and Beagle in 2019. As always, visualization, coupled with entity relationship mappings, are an imperative for security analysts, threat hunters, and any security professional seeking deeper and more meaningful insights. While the SecurityWeek piece is a bit light on content and density, it served well as a good initial experiment.
The CISA advisory is much more dense and served as an excellent, more extensive experiment.
I pulled them both into individual text files more easily ingested for processing with AIKG, shared for you here if you’d like to play along at home.

Starting with SecurityWeek’s Russia’s APT28 Targeting Energy Research, Defense Collaboration Entities, and the subsequent Russia-APT28-targeting.txt file I created for model ingestion, I ran Gemma 3 as a 12 billion parameter model as follows:

ollama run gemma3:12b # Run Gemma 3 locally as 12 billion parameter model
~/tools/aikg/bin/python3 ~/tools/aikg/ai-knowledge-graph/generate-graph.py --config ~/tools/aikg/ai-knowledge-graph/config.toml -input data/Russia-APT28-targeting.txt --output Russia-APT28-targeting-kg-12b.html

You may want or need to run Gemma 3 with fewer parameters depending on the performance and capabilities of your local system. Note that I am calling file paths rather explicitly to overcome complaints about missing config and input files.
The article makes reference to APT credential harvesting activity targeting people associated with a Turkish energy and nuclear research agency, as well as a spoofed OWA login portal containing Turkish-language text to target Turkish scientists and researchers. As part of it’s use of semantic triples (Subject-Predicate-Object (SPO) triplets), how does AIKG perform linking entities, attributes and values into machine readable statements [2] derived from the article content, as seen in Figure 1?

AIKG 12b

Figure 1: AIKG Gemma 3:12b result from SecurityWeek article

Quite well, I’d say. To manipulate the graph, you may opt to disable physics in the graph output toolbar so you can tweak node placements. As drawn from the statistics view for this graph, AIKG generated 38 nodes, 105 edges, 52 extracted edges, 53 inferred edges, and four communities. You can further filter as you see fit, but even unfiltered, and with just a little by of tuning at the presentation layer, we can immediately see success where semantic triples immediately emerge to excellent effect. We can see entity/relationship connections where, as an example, threat actor –> targeted –> people and people –> associated with –> think tanks, with direct reference to the aforementioned OWA portal and Turkish language. If you’re a cyberthreat intelligence analyst (CTI) or investigator, drawing visual conclusions derived from text processing will really help you step up your game in the form of context and enrichment in report writing. This same graph extends itself to represent the connection between the victims and the exploitation methods and infrastructure. If you don’t want to go through a full installation process for yourself to complete your own model execution, you should still grab the JSON and HTML output files and experiment with them in your browser. You’ll get a real sense of the power and impact of an interactive knowledge graph with the joint forces power of LLM and SPO triplets.
For a second experiment I selected related content in a longer, more in depth analysis courtesy of a CISA Cybersecurity Advisory (CISA friends, I’m pulling for you in tough times). If you are following along at home, be sure to exit ollama so you can rerun it with additional parameters (27b vs 12b); pass /bye as a message, and restart:

ollama run gemma3:27b # Run Gemma 3 locally with 27 billion parameters
~/tools/aikg/bin/python3 ~/tools/aikg/ai-knowledge-graph/generate-graph.py --config ~/tools/aikg/ai-knowledge-graph/config.toml --input ~/tools/aikg/ai-knowledge-graph/data/Russian-GRU-Targeting-Logistics-Tech.txt --output Russian-GRU-Targeting-Logistics-Tech-kg-27b.html

Given the density and length of this article, the graph as initially rendered is a bit untenable (no fault of AIKG) and requires some tuning and filtering for optimal effect. Graph Statistics for this experiment included 118 nodes, 486 edges, 152 extracted edges, 334 inferred edges, and seven communities. To filter, with a focus again on actions taken by Russian APT operatives, I chose as follows:

  • Select a Node by ID: threat actors
  • Select a network item: Nodes
  • Select a property: color
  • Select value(s): #e41a1c (red)

The result is more visually feasible, and allows ready tweaking to optimize network connections, as seen in Figure 2.

AIKG 27b???????

Figure 2: AIKG Gemma 3:27b result from CISA advisory

Shocking absolutely no one, we immediately encapsulate actor activity specific to credential access and influence operations via shell commands, Active Directory commands, and PowerShell commands. The conclusive connection is drawn however as threat actors –> targets –> defense industry. Ya think? 😉 In the advisory, see Description of Targets, including defense industry, as well as Initial Access TTPs, including credential guessing and brute force, and finally Post-Compromise TTPs and Exfiltration regarding various shell and AD commands. As a security professional reading this treatise, its reasonable to assume you’ve read a CISA Cybersecurity Advisory before. As such, its also reasonable to assume you’ll agree that knowledge graph generation from a highly dense, content rich collection of IOCs and behaviors is highly useful. I intend to work with my workplace ML team to further incorporate the principles explored herein as part of our context and enrichment generation practices. I suggest you consider the same if you have the opportunity. While SPO triplets, aka semantic triples, are most often associated with search engine optimization (SEO), their use, coupled with LLM power, really shines for threat intelligence applications.

Cheers…until next time.

Russ McRee | @holisticinfosec | infosec.exchange/@holisticinfosec | LinkedIn.com/in/russmcree

Recommended reading and tooling:

References

[1] McDermott, R. (2025) AI Knowledge Graph. Available at: https://github.com/robert-mcdermott/ai-knowledge-graph (Accessed: 18 January 2026 – 11 February 2026).
[2] Reduan, M.H., (2025) Semantic Triples: Definition, Function, Components, Applications, Benefits, Drawbacks and Best Practices for SEO. Available at: https://www.linkedin.com/pulse/semantic-triples-definition-function-components-benefits-reduan-nqmec/ (Accessed: 11 February 2026).

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Four Seconds to Botnet – Analyzing a Self Propagating SSH Worm with Cryptographically Signed C2 [Guest Diary], (Wed, Feb 11th)

This post was originally published on this site

[This is a Guest Diary by Johnathan Husch, an ISC intern as part of the SANS.edu BACS program]

Weak SSH passwords remain one of the most consistently exploited attack surfaces on the Internet. Even today, botnet operators continue to deploy credential stuffing malware that is capable of performing a full compromise of Linux systems in seconds.

During this internship, my DShield sensor captured a complete attack sequence involving a self-spreading SSH worm that combines:

– Credential brute forcing
– Multi-stage malware execution
– Persistent backdoor creation
– IRC-based command and control
– Digitally signed command verification
– Automated lateral movement using Zmap and sshpass

Timeline of the Compromise
08:24:13   Attacker connects (83.135.10.12)
08:24:14   Brute-force success (pi / raspberryraspberry993311)
08:24:15   Malware uploaded via SCP (4.7 KB bash script)
08:24:16   Malware executed and persistence established
08:24:17   Attacker disconnects; worm begins C2 check-in and scanning


Figure 1: Network diagram of observed attack

Authentication Activity

The attack originated from 83.135.10.12, which traces back to Versatel Deutschland, an ISP in Germany [1]. 
The threat actor connected using the following SSH client:
SSH-2.0-OpenSSH_8.4p1 Raspbian-5+b1
HASSH: ae8bd7dd09970555aa4c6ed22adbbf56

The 'raspbian' strongly suggests that the attack is coming from an already compromised Raspberry Pi.

Post Compromise Behavior

Once the threat actor was authenticated, they immediately uploaded a small malicious bash script and executed it. 
Below is the attackers post exploitation sequence:

The uploaded and executed script was a 4.7KB bash script captured by the DShield sensor. The script performs a full botnet lifecycle. The first action the script takes is establishing persistence by performing the following: 

The threat actor then kills the processes for any competitors malware and alters the hosts file to add a known C2 server [2] as the loopback address

C2 Established

Interestingly, an embedded RSA key was active and was used to verify commands from the C2 operator. The script then joins 6 IRC networks and connects to one IRC channel: #biret

Once connected, the C2 server finishes enrollment by opening a TCP connection, registering the nickname of the device and completes registration. From here, the C2 performs life checks of the device by quite literally playing ping pong with itself. If the C2 server sends down "PING", then the compromised device must send back "PONG".

Lateral Movement and Worm Propagation

Once the C2 server confirms connectivity to the compromised device, we see the tools zmap and sshpass get installed. The device then conducts a zmap scan on 100,000 random IP addresses looking for a device with port 22 (SSH) open. For each vulnerable device, the worm attempts two sets of credentials:

– pi / raspberry
– pi / raspberryraspberry993311 

Upon successful authentication, the whole process begins again. 
While a cryptominer was not installed during this attack chain, the C2 server would most likely send down a command to install one based on the script killing processes for competing botnets and miners.

Why Does This Attack Matter

This attack in particular teaches defenders a few lessons:

Weak passwords can result in compromised systems. The attack was successful as a result of enabled default credentials; a lack of key based authentication and brute force protection being configured. 
IoT Devices are ideal botnet targets. These devices are frequently left exposed to the internet with the default credentials still active.
Worms like this can spread both quickly and quietly. This entire attack chain took under 4 seconds and began scanning for other vulnerable devices immediately after.

How To Combat These Attacks

To prevent similar compromises, organizations could:

– Disable password authentication and use SSH keys only
– Remove the default pi user on raspberry pi devices
– Enable and configure fail2ban
– Implement network segmentation on IoT devices

Conclusion

This incident demonstrates how a raspberry pi device with no security configurations can be converted into a fully weaponized botnet zombie. It serves as a reminder that security hardening is essential, even for small Linux devices and hobbyist systems.

[1] https://otx.alienvault.com/indicator/ip/83.135.10.12
[2] https://otx.alienvault.com/indicator/hostname/bins.deutschland-zahlung.eu
[3] https://www.sans.edu/cyber-security-programs/bachelors-degree/

———–
Guy Bruneau IPSS Inc.
My GitHub Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

WSL in the Malware Ecosystem, (Wed, Feb 11th)

This post was originally published on this site

WSL or “Windows Subsystem Linux”[1] is a feature in the Microsoft Windows ecosystem that allows users to run a real Linux environment directly inside Windows without needing a traditional virtual machine or dual boot setup. The latest version, WSL2, runs a lightweight virtualized Linux kernel for better compatibility and performance, making it especially useful for development, DevOps, and cybersecurity workflows where Linux tooling is essential but Windows remains the primary operating system. It was introduced a few years ago (2016) as part of Windows 10.

Quick Howto: Extract URLs from RTF files, (Mon, Feb 9th)

This post was originally published on this site

Malicious RTF (Rich Text Format) documents are back in the news with the exploitation of CVE-2026-21509 by APT28.

The malicious RTF documents BULLETEN_H.doc and Consultation_Topics_Ukraine(Final).doc mentioned in the news are RTF files (despite their .doc extension, a common trick used by threat actors).

Here is a quick tip to extract URLs from RTF files. Use the following command:

rtfdump.py -j -C SAMPLE.vir | strings.py --jsoninput | re-search.py -n url -u -F officeurls

Like this:

BTW, if you are curious, this is how that document looks like when opened:

Let me break down the command:

  • rtfdump.py -j -C SAMPLE.vir: this parses RTF file SAMPLE.vir and produces JSON output with the content of all the items found in the RTF document. Option -C make that all combinations are included in the JSON data: the item itself, the hex-decoded item (-H) and the hex-decoded and shifted item (-H -S). So per item found inside the RTF file, 3 entries are produced in the JSON data.
  • strings.py –jsoninput: this takes the JSON data produced by rtfdump.py and extract all strings
  • re-search.py -n url -u -F officeurls: this extracts all URLs (-n url) found in the strings produced by strings.py, performs a deduplication (-u) and filters out all URLs linked to Office document definitions (-F officeurls)

So I have found one domain (wellnesscaremed) and one private IP address (192.168…). What I then like to do, is search for these keywords in the string list, like this:

If found extra IOCs: a UNC and a "malformed" URL. The URL has it's hostname followed by @ssl. This is not according to standards. @ can be used to introduce credentials, but then it has to come in front of the hostname, not behind it. So that's not the case here. More on this later.

Here are the results for the other document:

Notice that this time, we have @80.

I believe that this @ notation is used by Microsoft to provide the portnumber when WebDAV requests are made (via UNC). If you know more about this, please post a comment.

In an upcoming diary, I will show how to extract URLs from ZIP files embedded in the objects in these RTF files.

 

Didier Stevens
Senior handler
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Detecting and Monitoring OpenClaw (clawdbot, moltbot), (Tue, Feb 3rd)

This post was originally published on this site

Last week, a new AI agent framework was introduced to automate "live". It targets office work in particular, focusing on messaging and interacting with systems. The tool has gone viral not so much because of its features, which are similar to those of other agent frameworks, but because of a stream of security oversights in its design.

Scanning for exposed Anthropic Models, (Mon, Feb 2nd)

This post was originally published on this site

Yesterday, a single IP address (%%ip:204.76.203.210%%) scanned a number of our sensors for what looks like an anthropic API node. The IP address is known to be a Tor exit node.

The requests are pretty simple:

GET /anthropic/v1/models
Host: 67.171.182.193:8000
X-Api-Key: password
Anthropic-Version: 2023-06-01

It looks like this is scanning for locally hosted Anthropic models, but it is not clear to me if this would be successful. If anyone has any insights, please let me know. The API Key is a commonly used key in documentation, and not a key that anybody would expect to work.

At the same time, we are also seeing a small increase in requests for "/v1/messages". These requests have been more common in the past, but the URL may be associated with Anthropic (it is, however, somewhat generic, and it is likely other APIs use the same endpoint. These requests originate from %%ip:154.83.103.179%%, an IP address with a bit a complex geolocation and routing footprint.


Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu
Twitter|

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Google Presentations Abused for Phishing, (Fri, Jan 30th)

This post was originally published on this site

Charlie, one of our readers, has forwarded an interesting phishing email. The email was sent to users of the Vivladi Webmail service. While not overly convincing, the email is likely sufficient to trick a non-empty group of users:

The e-mail gets more interesting as the user clicks on the link. The link points to Google Documents and displays a slide show:

Usually, Google Docs displays a footer notice that warns viewers about phishing sites and offers a "reporting" link if a page is used for phishing. Bots are missing in this case. At first, I suspected some HTML/JavaScript/CSS tricks, but it turns out that this isn't a bug; it is a feature!

Usually, if a user shares slides, the document opens in an "edit" window. This can be avoided by replacing "edit" with "preview" in the URL, but the footer still makes it obvious that this is a set of slides. To remove the footer, the slides have to be "published," and the resulting link must be shared. When publishing, the slides will auto-advance. But for a one-slide slideshow, this isn't an issue. There is also a setting to delay the advance. Here are some sample links:

[These links point to a simple sample presentation I created, not the phishing version.]

Normal Share:

https://docs.google.com/presentation/d/1Quzd6bbuPlIcTOorlUDzSuJCXiOyqBTSHczo6hnXcac/edit?usp=sharing

Preview Share:

https://docs.google.com/presentation/d/1Quzd6bbuPlIcTOorlUDzSuJCXiOyqBTSHczo6hnXcac/preview?usp=sharing

Publish Share:

https://docs.google.com/presentation/d/e/2PACX-1vRaoBusJAaIoVcNbGsfVyE0OuTP1dS-2Po9lpAN9GGy2EkbZG_oR9maZDS7cq2xW_QeiF8he457hq3_/pub?start=false&loop=false&delayms=30000

The URL parameters in the last link do not start the presentations, nor loop them, and delay the next slide by 30 seconds.

The Vivaldi webmail phishing ended up on a "classic" phishing login form that was created using Square. So far, this form is still visible at

hxxps [:] //vivaldiwebmailaccountsservices[.]weeblysite[.]com

???????

 

 


Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu
Twitter|

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Odd WebLogic Request. Possible CVE-2026-21962 Exploit Attempt or AI Slop?, (Wed, Jan 28th)

This post was originally published on this site

I was looking for possible exploitation of CVE-2026-21962, a recently patched WebLogic vulnerability. While looking for related exploit attempts in our data, I came across the following request:

GET /weblogic//weblogic/..;/bea_wls_internal/ProxyServlet
host: 71.126.165.182
user-agent: Mozilla/5.0 (compatible; Exploit/1.0)
accept-encoding: gzip, deflate
accept: */*
connection: close
wl-proxy-client-ip: 127.0.0.1;Y21kOndob2FtaQ==
proxy-client-ip: 127.0.0.1;Y21kOndob2FtaQ==
x-forwarded-for: 127.0.0.1;Y21kOndob2FtaQ==

According to write-ups about CVE-2026-21962, this request is related [2]. However, the vulnerability also matched an earlier "AI Slop" PoC [3][4]. Another write-up, that also sounds very AI-influenced, suggests a very different exploit mechanism that does not match the request above [5].

The source IP is 193.24.123.42. Our data shows sporadic HTTP scans for this IP address, and it appears to be located in Russia. Not terribly remarkable at that. In the past, the IP has used the "Claudbot" user-agent. But it does not have any actual affiliation with Anthropic (not to be confused with the recent news about clawdbot). 

The exploit is a bit odd. First of all, it does use the loopback address as an "X-Forwarded-For" address. This is a common trick to bypass access restrictions (I would think that Oracle is a bit better than to fall for a simple issue like that). There is an option to list multiple IPs, but they should be delimited by a comma, not a semicolon. 

The base64 encoded string decodes to: "cmd:whoami". This suggests a simple command injection vulnerability. Possibly, the content of the header is base64 decoded and next, passed as a command line argument?? Certainly an odd mix of encodings in one header, and unlikely to work.

Let's hope this is AI slop and the exploit isn't that easy. We have seen a significant uptick in requests, including the wl-proxy-client-ip header, starting on January 21st, but the header has been used before. It is a typical exploit AI may come up with, seeing keywords like "Weblogic Server Proxy Plug-in".

I asked ChatGPT and Grok if this is an exploit or AI slop. The abbreviated answer:

ChatGPT: "This looks more like a “scanner/probe that’s trying to look like an exploit” than a complete, working exploit by itself — but it’s not random either. It’s borrowing real WebLogic attack ingredients."

Grok: "This is an actual exploit attempt — not just random "AI slop" or nonsense traffic."

​​​​​​​Google Gemini: "That is definitely an actual exploit attempt, not AI slop. Specifically, it is targeting a well-known vulnerability in Oracle WebLogic Server."

[1] https://nvd.nist.gov/vuln/detail/CVE-2026-21962
[2] https://dbugs.ptsecurity.com/vulnerability/PT-2026-3709
[3] https://x.com/0xacb/status/2015473216844620280
[4] https://github.com/Ashwesker/Ashwesker-CVE-2026-21962/blob/main/CVE-2026-21962.py
[5] https://www.penligent.ai/hackinglabs/the-ghost-in-the-middle-a-definitive-technical-analysis-of-cve-2026-21962-and-its-existential-threat-to-ai-pipelines/


Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu
Twitter|

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.