Constructing YARA files
Constructing your own YARA file using the C++ interface of yaramod is straightforward. We will start at the lowest level, which is constructing a condition.
To construct a condition, use `yaramod::YaraExpressionBuilder`. It provides functions and methods that help you build the AST of a condition from the leaf expressions up to the root expression.
The following functions are the basic building blocks of YARA expressions. Always start from these and build on them to form complex expressions. Each of them returns an object of type `YaraExpressionBuilder`, and those that take parameters mostly accept objects of this type as well. Whenever you are unsure what kind of expression to pass, look through the list below and pick the most suitable one.
- `filesize()` - represents the `filesize` keyword
- `entrypoint()` - represents the `entrypoint` keyword
- `all()` - represents the `all` keyword
- `any()` - represents the `any` keyword
- `them()` - represents the `them` keyword
- `intVal(val, [mult])` - represents a signed integer with a multiplier (default: `IntMultiplier::None`) (`intVal(10)`, `intVal(10, IntMultiplier::Kilobytes)`, `intVal(10, IntMultiplier::Megabytes)`)
- `uintVal(val, [mult])` - represents an unsigned integer with a multiplier (default: `IntMultiplier::None`) (`uintVal(10)`, `uintVal(10, IntMultiplier::Kilobytes)`, `uintVal(10, IntMultiplier::Megabytes)`)
- `hexIntVal(val)` - represents a hexadecimal integer (`hexIntVal(0x10)`)
- `doubleVal(val)` - represents a double-precision floating-point value (`doubleVal(3.14)`)
- `stringVal(str)` - represents a string literal (`stringVal("Hello World!")`)
- `boolVal(bool)` - represents a boolean literal (`boolVal(true)`)
- `id(id)` - represents a single identifier with name `id` (`id("pe")`)
- `stringRef(ref)` - represents a reference to the string identifier `ref` (`stringRef("$1")`)
- `set(elements)` - represents `(item1, item2, ...)` (`set({stringRef("$1"), stringRef("$2")})`)
- `range(low, high)` - represents `(low .. high)` (`range(intVal(100), intVal(200))`)
- `matchCount(ref)` - represents the match count of the string identifier `ref` (`matchCount("$1")`)
- `matchLength(ref, [n])` - represents the length of the `n`th match (default: 0) of the string identifier `ref` (`matchLength("$1", intVal(1))`)
- `matchOffset(ref, [n])` - represents the offset of the `n`th match (default: 0) of the string identifier `ref` (`matchOffset("$1", intVal(1))`)
- `matchAt(ref, expr)` - represents `<ref> at <expr>` (`matchAt("$1", intVal(100))`)
- `matchInRange(ref, range)` - represents `<ref> in <range>` (`matchInRange("$1", range(intVal(100), intVal(200)))`)
- `regexp(regexp, mods)` - represents a regular expression in the form `/<regexp>/<mods>` (`regexp("^a.*b$", "i")`)
- `forLoop(spec, var, set, body)` - represents a `for` loop over a set of integers (`forLoop(any(), "i", range(intVal(100), intVal(200)), matchAt("$1", id("i")))`)
- `forLoop(spec, set, body)` - represents a `for` loop over a set of string references (`forLoop(any(), set({stringRef("$*")}), matchAt("$", intVal(100)))`)
- `of(spec, set)` - represents `<spec> of <set>` (`of(all(), them())`)
- `paren(expr, [newline])` - represents parentheses around an expression; the `newline` indicator puts the enclosed expression on its own line (`paren(intVal(10))`)
- `conjunction(terms, [newline])` - represents a conjunction of `terms` and puts each term on a separate line if `newline` is set (`conjunction({id("rule1"), id("rule2")})`)
- `disjunction(terms, [newline])` - represents a disjunction of `terms` and puts each term on a separate line if `newline` is set (`disjunction({id("rule1"), id("rule2")})`)
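To make the mapping from builder calls to YARA text concrete, here is a minimal toy sketch. It is not yaramod code: the `toy` namespace, the `Expr` struct and the `Mult` enum are hypothetical stand-ins that only carry rendered text, whereas `YaraExpressionBuilder` builds a real AST. The function names simply mirror a few entries of the list above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace toy {

// Toy stand-in: it only carries the rendered YARA text.
// yaramod's YaraExpressionBuilder builds an AST instead.
struct Expr {
    std::string text;
};

// Hypothetical mirror of IntMultiplier.
enum class Mult { None, Kilobytes, Megabytes };

inline Expr intVal(long value, Mult mult = Mult::None) {
    std::string suffix;
    if (mult == Mult::Kilobytes) suffix = "KB";
    if (mult == Mult::Megabytes) suffix = "MB";
    return {std::to_string(value) + suffix};
}

inline Expr all() { return {"all"}; }
inline Expr any() { return {"any"}; }
inline Expr them() { return {"them"}; }
inline Expr stringRef(const std::string& ref) { return {ref}; }

// Renders (low .. high).
inline Expr range(const Expr& low, const Expr& high) {
    return {"(" + low.text + " .. " + high.text + ")"};
}

// Renders (item1, item2, ...).
inline Expr set(const std::vector<Expr>& elements) {
    std::string out = "(";
    for (std::size_t i = 0; i < elements.size(); ++i) {
        if (i > 0) out += ", ";
        out += elements[i].text;
    }
    return {out + ")"};
}

// Renders <spec> of <set>.
inline Expr of(const Expr& spec, const Expr& elements) {
    return {spec.text + " of " + elements.text};
}

// Renders <ref> in <range>.
inline Expr matchInRange(const std::string& ref, const Expr& where) {
    return {ref + " in " + where.text};
}

} // namespace toy
```

In this toy, `toy::of(toy::all(), toy::them()).text` is `all of them`, and `toy::matchInRange("$1", toy::range(toy::intVal(100), toy::intVal(200))).text` is `$1 in (100 .. 200)`. Leaf expressions are created first and then passed to the functions that combine them.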
The `YaraExpressionBuilder` class provides multiple methods that help you build complex expressions. Most of them are overloaded operators, which keep long conditions easy to write and read. These methods are:
- `operator!` - represents logical `not` (`!boolVal(true)`)
- `operator~` - represents bitwise not (`~hexIntVal(0x100)`)
- `operator-` - represents unary operator `-` (`-id("i")`)
- `operator&&` - represents logical `and` (`id("rule1") && id("rule2")`)
- `operator||` - represents logical `or` (`id("rule1") || id("rule2")`)
- `operator<` - represents operator `<` (`matchOffset("$1") < 100`)
- `operator>` - represents operator `>` (`matchOffset("$1") > 100`)
- `operator<=` - represents operator `<=` (`matchOffset("$1") <= 100`)
- `operator>=` - represents operator `>=` (`matchOffset("$1") >= 100`)
- `operator+` - represents operator `+` (`matchOffset("$1") + intVal(100)`)
- `operator-` - represents binary operator `-` (`matchOffset("$1") - intVal(100)`)
- `operator*` - represents operator `*` (`matchOffset("$1") * intVal(100)`)
- `operator/` - represents operator `/` (`matchOffset("$1") / intVal(100)`)
- `operator%` - represents operator `%` (`matchOffset("$1") % intVal(100)`)
- `operator^` - represents bitwise xor (`matchOffset("$1") ^ intVal(100)`)
- `operator&` - represents bitwise and (`matchOffset("$1") & intVal(100)`)
- `operator|` - represents bitwise or (`matchOffset("$1") | intVal(100)`)
- `operator<<` - represents bitwise shift left (`matchOffset("$1") << intVal(10)`)
- `operator>>` - represents bitwise shift right (`matchOffset("$1") >> intVal(10)`)
- `operator()` - represents a call to a function (`id("func")(intVal(100), intVal(200))`)
- `call(args)` - represents a call to a function (`id("func").call({intVal(100), intVal(200)})`)
- `contains(rhs)` - represents operator `contains` (`id("signature").contains(stringVal("hello"))`)
- `matches(rhs)` - represents operator `matches` (`id("signature").matches(regexp("^a.*b$", "i"))`)
- `access(rhs)` - represents operator `.` as access to a structure (`id("pe").access("number_of_sections")`)
- `operator[]` - represents operator `[]` as access to an array (`id("pe").access("sections")[intVal(0)]`)
- `readInt8(be)` - represents a call to the special function `int8(be)` (`intVal(100).readInt8()`)
- `readInt16(be)` - represents a call to the special function `int16(be)` (`intVal(100).readInt16()`)
- `readInt32(be)` - represents a call to the special function `int32(be)` (`intVal(100).readInt32()`)
- `readUInt8(be)` - represents a call to the special function `uint8(be)` (`intVal(100).readUInt8()`)
- `readUInt16(be)` - represents a call to the special function `uint16(be)` (`intVal(100).readUInt16()`)
- `readUInt32(be)` - represents a call to the special function `uint32(be)` (`intVal(100).readUInt32()`)
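The overloaded operators let a condition be written almost like the YARA text it produces. The following toy sketch illustrates the idea only. It is not yaramod's implementation: the `toy::Expr` type and the `binary` helper are hypothetical and render text directly instead of building an AST. It also assumes string references start with `$`.

```cpp
#include <string>

namespace toy {

// Toy stand-in that only carries rendered YARA text.
struct Expr {
    std::string text;
};

inline Expr intVal(long value) { return {std::to_string(value)}; }
inline Expr id(const std::string& name) { return {name}; }
inline Expr filesize() { return {"filesize"}; }

// "$1" becomes "#1", the YARA syntax for a match count.
// Assumes ref starts with '$'.
inline Expr matchCount(const std::string& ref) { return {"#" + ref.substr(1)}; }

inline Expr paren(const Expr& e) { return {"(" + e.text + ")"}; }

// Hypothetical helper shared by all binary operators.
inline Expr binary(const Expr& lhs, const char* op, const Expr& rhs) {
    return {lhs.text + " " + op + " " + rhs.text};
}

inline Expr operator!(const Expr& e) { return {"not " + e.text}; }
inline Expr operator&&(const Expr& l, const Expr& r) { return binary(l, "and", r); }
inline Expr operator||(const Expr& l, const Expr& r) { return binary(l, "or", r); }
inline Expr operator<(const Expr& l, const Expr& r) { return binary(l, "<", r); }
inline Expr operator>(const Expr& l, const Expr& r) { return binary(l, ">", r); }
inline Expr operator+(const Expr& l, const Expr& r) { return binary(l, "+", r); }

} // namespace toy
```

With these definitions, `toy::matchCount("$1") > toy::intVal(2) && toy::filesize() < toy::intVal(1024)` renders as `#1 > 2 and filesize < 1024`. C++ operator precedence decides how the calls nest, but in this toy the rendered text gets parentheses only where `paren()` is used explicitly, as in `toy::paren(toy::id("a") || toy::id("b")) && toy::id("c")`.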
At the end, call the `get()` method of `YaraExpressionBuilder` to obtain your `Expression` object. Make sure to store it if you want to use it later, because `YaraExpressionBuilder` resets its state after `get()` is called.
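Why the result of `get()` has to be stored can be shown with a small stand-alone sketch. It assumes that "resets its state" means the builder hands over ownership of the expression and keeps nothing. The `toy::Builder` and `toy::Expression` types are hypothetical and are not yaramod's classes.

```cpp
#include <memory>
#include <string>
#include <utility>

namespace toy {

// Hypothetical expression type that only stores its text.
struct Expression {
    std::string text;
};

class Builder {
public:
    explicit Builder(std::string text)
        : _expr(std::make_shared<Expression>(Expression{std::move(text)})) {}

    // Hands the expression to the caller and leaves the builder empty.
    std::shared_ptr<Expression> get() {
        return std::exchange(_expr, nullptr);
    }

private:
    std::shared_ptr<Expression> _expr;
};

} // namespace toy
```

In this sketch the first call to `get()` returns the expression, and a second call on the same builder returns a null pointer, which is why the first result should be kept in a variable.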
Before we get into the construction of rules, we will show how hex strings can be constructed using `YaraHexStringBuilder`. Each hex string consists of hex string units, which are:
- Nibble
- Wildcard (`?`)
- Jump (`[low-high]`)
- Alternative (`(XX|YY|...)`)
When working with `YaraHexStringBuilder`, we do not always work at the unit level; sometimes we work at the byte level. Here is how to create each type of unit:
- `YaraHexStringBuilder(byte)` - creates two nibbles out of a byte value
- `wildcard()` - creates `??`
- `wildcardLow(nibble)` - `<nibble>?`
- `wildcardHigh(nibble)` - `?<nibble>`
- `jumpVarying()` - `[-]`
- `jumpFixed(offset)` - `[<offset>]`
- `jumpVaryingRange(low)` - `[<low>-]`
- `jumpRange(low, high)` - `[<low>-<high>]`
- `alt(units)` - `(unit1|unit2|...)`
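The unit notation above can be illustrated with a toy sketch that renders each unit as text and joins the units into a hex string. None of this is yaramod's API: `byteUnit`, `hexString`, `join` and `hexDigit` are hypothetical helpers, the remaining names only mirror the list above, and the upper-case hex output is an arbitrary choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

namespace toy {

inline char hexDigit(unsigned nibble) {
    return "0123456789ABCDEF"[nibble & 0x0F];
}

// A byte always expands to two nibbles (high nibble first).
inline std::string byteUnit(std::uint8_t byte) {
    std::string out;
    out += hexDigit(byte >> 4);
    out += hexDigit(byte);
    return out;
}

inline std::string wildcard() { return "??"; }
// The low nibble is the wildcard, so the given nibble comes first.
inline std::string wildcardLow(unsigned nibble) {
    return std::string(1, hexDigit(nibble)) + "?";
}
// The high nibble is the wildcard, so the given nibble comes last.
inline std::string wildcardHigh(unsigned nibble) {
    return std::string("?") + hexDigit(nibble);
}

inline std::string jumpVarying() { return "[-]"; }
inline std::string jumpFixed(unsigned offset) {
    return "[" + std::to_string(offset) + "]";
}
inline std::string jumpVaryingRange(unsigned low) {
    return "[" + std::to_string(low) + "-]";
}
inline std::string jumpRange(unsigned low, unsigned high) {
    return "[" + std::to_string(low) + "-" + std::to_string(high) + "]";
}

inline std::string join(const std::vector<std::string>& parts, const std::string& sep) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) out += sep;
        out += parts[i];
    }
    return out;
}

// Renders (unit1|unit2|...).
inline std::string alt(const std::vector<std::string>& units) {
    return "(" + join(units, "|") + ")";
}

// Renders the whole hex string in YARA's { ... } notation.
inline std::string hexString(const std::vector<std::string>& units) {
    return "{ " + join(units, " ") + " }";
}

} // namespace toy
```

For example, the units `byteUnit(0x4D)`, `byteUnit(0x5A)`, `wildcard()`, `jumpRange(2, 4)` and `alt({byteUnit(0xAA), byteUnit(0xBB)})` passed to `hexString` render as `{ 4D 5A ?? [2-4] (AA|BB) }`.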
Once your condition is finished, you can build rules with it. As with conditions, you use a builder, `YaraRuleBuilder`, to construct rules. It provides only a few methods:
- `withName(name)` - specify rule name
- `withModifier(mod)` - specify whether the rule is private or public (`Rule::Modifier::Private` or `Rule::Modifier::Public`)
- `withTag(tag)` - specify rule tag
- `withStringMeta(key, value)` - specify string meta
- `withIntMeta(key, value)` - specify integer meta
- `withUIntMeta(key, value)` - specify unsigned integer meta
- `withHexIntMeta(key, value)` - specify hexadecimal integer meta
- `withBoolMeta(key, value)` - specify boolean meta
- `withPlainString(id, value, mod)` - specify plain string with identifier `id` and content `value` with modifiers `mod` (`String::Modifiers::Ascii`, `String::Modifiers::Wide`, `String::Modifiers::Nocase`, `String::Modifiers::Fullword`)
- `withHexString(id, str)` - specify hex string (`str` is of type `std::shared_ptr<HexString>`)
- `withRegexp(id, value, mod)` - specify regular expression with identifier `id` and content `value` with modifiers `mod` (These modifiers are different from the modifiers of plain strings. They are tied to the regular expression and come after the last `/`.)
- `withCondition(cond)` - specify condition
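The `with...` methods follow the usual fluent-builder pattern, which the toy sketch below illustrates. It is a hypothetical stand-in, not `YaraRuleBuilder`: it takes plain strings instead of expression and string objects, supports only a handful of the methods listed above, does no escaping, and returns the rule as text from `get()`.

```cpp
#include <string>
#include <vector>

namespace toy {

class RuleBuilder {
public:
    // Each method records its input and returns *this so calls can be chained.
    RuleBuilder& withName(const std::string& name) { _name = name; return *this; }
    RuleBuilder& withTag(const std::string& tag) { _tags.push_back(tag); return *this; }
    RuleBuilder& withStringMeta(const std::string& key, const std::string& value) {
        _metas.push_back(key + " = \"" + value + "\"");
        return *this;
    }
    RuleBuilder& withPlainString(const std::string& id, const std::string& value) {
        _strings.push_back(id + " = \"" + value + "\"");
        return *this;
    }
    RuleBuilder& withCondition(const std::string& condition) {
        _condition = condition;
        return *this;
    }

    // Renders the collected parts as rule text.
    std::string get() const {
        std::string out = "rule " + _name;
        if (!_tags.empty()) {
            out += " :";
            for (const auto& tag : _tags) out += " " + tag;
        }
        out += "\n{\n";
        appendSection(out, "meta", _metas);
        appendSection(out, "strings", _strings);
        out += "\tcondition:\n\t\t" + _condition + "\n}\n";
        return out;
    }

private:
    // Empty sections are skipped entirely.
    static void appendSection(std::string& out, const std::string& title,
                              const std::vector<std::string>& lines) {
        if (lines.empty()) return;
        out += "\t" + title + ":\n";
        for (const auto& line : lines) out += "\t\t" + line + "\n";
    }

    std::string _name;
    std::vector<std::string> _tags;
    std::vector<std::string> _metas;
    std::vector<std::string> _strings;
    std::string _condition;
};

} // namespace toy
```

A chained call such as `toy::RuleBuilder().withName("example").withTag("demo").withPlainString("$1", "Hello").withCondition("$1").get()` yields a rule named `example` with the tag `demo`, one string and the condition `$1`. The order of the `with...` calls does not matter in this toy, because the text is assembled only in `get()`.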
Finally, once our rules are constructed, we can combine them into a single YARA file using `YaraFileBuilder`, which provides these methods:
- `withModule(name)` - specifies an `import` of the module named `name`
- `withRule(rule)` - adds the rule to the file
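The way a file is assembled from imports and rules can be sketched with one more toy builder. Again, this is a hypothetical stand-in rather than `YaraFileBuilder`: rules are passed as already rendered text, and the layout (imports first, one blank line between blocks) is an assumption made for the example.

```cpp
#include <string>
#include <vector>

namespace toy {

class FileBuilder {
public:
    // Records an import of the module called `name`.
    FileBuilder& withModule(const std::string& name) {
        _modules.push_back(name);
        return *this;
    }

    // Records a rule, given here as already rendered text.
    FileBuilder& withRule(const std::string& ruleText) {
        _rules.push_back(ruleText);
        return *this;
    }

    // Imports come first, then the rules, separated by blank lines.
    std::string get() const {
        std::string out;
        for (const auto& name : _modules) out += "import \"" + name + "\"\n";
        for (const auto& rule : _rules) {
            if (!out.empty()) out += "\n";
            out += rule;
        }
        return out;
    }

private:
    std::vector<std::string> _modules;
    std::vector<std::string> _rules;
};

} // namespace toy
```

Calling `withModule("pe")` followed by `withRule(...)` produces the line `import "pe"`, a blank line, and then the rule text, which mirrors the overall flow described in this page: conditions go into rules, and rules go into a file.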