YARA Rule for OOXML Maldocs: Less False Positives, (Tue, Nov 23rd)

This post was originally published on this site

In this diary entry, I introduce an updated version of the YARA rule I presented in diary entry "Simple YARA Rules for Office Maldocs" for OOXML files with VBA code. Here is the OOXML YARA rule I presented yesterday:

rule pkvba {
        $vbaprojectbin = "vbaProject.bin"
        uint32be(0) == 0x504B0304 and $vbaprojectbin

This rule will generate false positives, if it finds instances of string "vbaProject.bin" that are not a filename.

To improve this rule (generate less false positives), I will add clauses to check that the instances of string "vbaProject.bin" are found inside a PKZIP file record, and correspond to the filename field.

Here is an updated version of the rule:

rule pkvbare {
        $vbaprojectbin = /[a-zA-Z/]*/?vbaProject.bin/
        uint32be(0) == 0x504B0304 and
        $vbaprojectbin and
        for any i in (1..#vbaprojectbin): ((uint32be(@vbaprojectbin[i] – 30) == 0x504B0304) and
                                           (!vbaprojectbin[i] == uint16(@vbaprojectbin[i] – 4))

In this updated rule, I use a regular expression (/[a-zA-Z/]*/?vbaProject.bin/) to find filename vbaProject.bin. That's because the full filename is preceded by a path, and that path differs per type of Office document. For example, inside Word documents, that filename is "word/vbaProject.bin":

30 bytes before string "word/vbaProject.bin", one will find the header of the PKZIP file record:

The header of a PKZIP file record starts with magic sequence "50 4B 03 04".

I check this with the folowwing clause in my YARA rule:

(uint32be(@vbaprojectbin[i] – 30) == 0x504B0304)

Since more than one instance of $vbaprojectbin can be found, I need to tests all instances, to find one that fullfills all the conditions. I do this with a for expression:

for any i in (1..#vbaprojectbin): (…)

#vbaprojectbin is the number of instances (#) found.

i is an index (integer) that varies between 1 and the number of found instances.

@vbaprojectbin[i] represents the position of the found instance with index number i. Subtracting 30 from that position, brings me to the start of the PKZIP file record header. I check that this is indeed the case, by comparing with the magic sequence:

(uint32be(@vbaprojectbin[i] – 30) == 0x504B0304)

Another test I perform in this rule: I check if the length of the found instance of string vbaprojectbin corresponds to the integer that is stored inside the filenamelength field of a PKZIP file record. That field is 4 bytes in front of the filename:

!vbaprojectbin[i] represents the length of the found instance with index number i.

This length is compared with the 16-bit little-endian integer, found inside the length field of the PKZIP file record: that is 4 bytes in front of the filename:

!vbaprojectbin[i] == uint16(@vbaprojectbin[i] – 4)

When all these clauses are true for at least one instance of string $vbaprojectbin, then it's very likely that a PKZIP file record was found with a filename like */vbaProject.bin. I try to decrease the number of false positives by performing more tests.


Didier Stevens
Senior handler
Microsoft MVP

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.