Symantec EDR Internals — Criterion

Criterion

The first technology that I was interested in was the machine learning engine “Criterion”. Here is a short description from the documentation:

  • Suspicious: The file’s health is suspicious based on Symantec’s machine-learning algorithms that identify the files that are likely to be malicious. However, there is no known detection against it. Your organization should pay particular attention to and thoroughly analyze suspicious files.

Start Of The Journey

Looking through the system, we find that the Criterion engine logic lives in a library called “criterion.jar”. Within it is a class named “EventProcessingTask”, which implements a “run()” method responsible for collecting all the information Criterion needs to execute properly. Below is the general gist of how this method works.

General Flow
  • Collecting the time range (start and end time).
  • Creating lists for any blacklist or whitelist exclusions configured in the EDR. (These are used to filter events in or out.)
  • Verifying that variables and function calls do not return “NULL” results.
  • …Etc.
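
The steps listed above can be illustrated with a minimal Python sketch. This is not the code from “criterion.jar”: the “Event” class, its field names, the “select_events” function and the single “excluded_hashes” set are all hypothetical, and the real engine may apply its blacklist and whitelist differently.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass
class Event:
    # Hypothetical event shape; the real field names are not shown in this post.
    sha256: Optional[str]
    timestamp: Optional[datetime]


def select_events(events: Iterable[Event], start: datetime, end: datetime,
                  excluded_hashes: Set[str]) -> List[Event]:
    """Keep events that are complete, inside the time range, and not excluded."""
    selected = []
    for event in events:
        # Null checks: skip events with missing data.
        if event.sha256 is None or event.timestamp is None:
            continue
        # Time range check (start and end are both inclusive in this sketch).
        if not (start <= event.timestamp <= end):
            continue
        # Exclusion list check (assumed to be keyed on the file hash).
        if event.sha256 in excluded_hashes:
            continue
        selected.append(event)
    return selected
```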

Determining The Scoring Decision

The “Criterion” engine then takes those events and tags each one with one of two categories:

Scored

A “scored” event is tagged as such depending on the values of its “disposition” and “confidence”, and can have one of the following states:

  • SCORE_USING_ENGINE

Unscored

An “unscored” event is tagged as such if one of the following conditions is met:

  • The file extension is not in the list of the acceptable classifier extensions: “exe”, “cpl”, “dll”, “msi”, “scr”, “sys”.
  • The file is located under one of the following paths: “CSIDL_PROGRAM_FILES” or “CSIDL_PROGRAM_FILESX86”.
  • The file doesn’t have a “confidence” or a “disposition”.
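
To make the three “unscored” conditions concrete, here is a rough Python sketch of the tagging decision. The function name “tag_event”, the return values and the prefix match on the path are assumptions for illustration. The sketch also only checks that “disposition” and “confidence” are present, whereas the real engine looks at their actual values before scoring.

```python
from typing import Optional

# Extensions the classifier accepts, as listed above.
CLASSIFIER_EXTENSIONS = {"exe", "cpl", "dll", "msi", "scr", "sys"}
# Paths whose files are left unscored (matched as prefixes; an assumption).
EXCLUDED_PATH_PREFIXES = ("csidl_program_files", "csidl_program_filesx86")


def tag_event(path: str, disposition: Optional[str],
              confidence: Optional[int]) -> str:
    """Return "SCORED" or "UNSCORED" for a file event (hypothetical helper)."""
    # Condition 3: a missing disposition or confidence means unscored.
    if disposition is None or confidence is None:
        return "UNSCORED"
    normalized = path.lower()
    file_name = normalized.rsplit("\\", 1)[-1]
    extension = file_name.rsplit(".", 1)[-1] if "." in file_name else ""
    # Condition 1: the extension is not an accepted classifier extension.
    if extension not in CLASSIFIER_EXTENSIONS:
        return "UNSCORED"
    # Condition 2: the file sits under one of the Program Files paths.
    if normalized.startswith(EXCLUDED_PATH_PREFIXES):
        return "UNSCORED"
    return "SCORED"
```

For example, a “.exe” under “CSIDL_PROFILE” with both reputation values present would be tagged as scored, while the same file under “CSIDL_PROGRAM_FILES” would not.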

Populating File Features

Once the files / events are tagged as scored or unscored, the next thing to look at is the file “features”. Since the engine bases its classification on file attributes and features, it needs to collect this information before sending it to the ML classifier.

Attributes & Features Collection

List of “File Features” and “Attributes” to collect:
  • File Path
  • SHA256 Hash
  • Reputation Confidence
  • Total Length — Calculated from the length of the file.
  • Filename Length — Calculated from the beginning of the file name up to the last dot, which marks the beginning of the extension.
  • File Extension Length — Calculated from the last dot until the end of the file name.
  • Count Lower Case Characters — Calculated by looping through the file name and incrementing the counter by one for each character in the range “a” to “z” (lower case).
  • Count Upper Case Characters — Calculated by looping through the file name and incrementing the counter by one for each character in the range “A” to “Z” (upper case).
  • Count Alpha — Calculated by looping through the file name and incrementing the counter by one for each character that is a letter from “a” to “z” (lower or upper case).
  • Count Numbers — Calculated by looping through the file name and incrementing the counter by one for each character in the range “1” to “9”.
  • Count Spaces — If the file name contains a space, then the counter is incremented by one.
  • Count Dots — If the file name contains a dot, then the counter is incremented by one.
  • Count At Signs — If the file name contains the “@” symbol, then the counter is incremented by one.
  • Count Special Characters — If the file name contains one of the following symbols (“%”, “-”, “^”, “#”, “$”, “*”, “!”), then the counter is incremented by one.
  • Count Parentheses — If the file name contains parentheses, then the counter is incremented by one.
  • Count Square Braces — If the file name contains square braces (“[”, “]”), then the counter is incremented by one.
  • Count Underscores — If the file name contains an underscore (“_”), then the counter is incremented by one.
  • Count Other — If none of the aforementioned conditions are matched, then the “other” counter is incremented by one.
  • isFileNameAlphaNumeric — Boolean that is set to “True” if “Count Alpha” and “Count Numbers” are both greater than zero
  • isFileNameSingleNum — Boolean that is set to “True” if “Filename Length” and “Count Numbers” are both equal to one
  • isFileNameSingleChar — Boolean that is set to “True” if “Filename Length” and “Count Alpha” are both equal to one
  • isFileNameMixedCase — Boolean that is set to “True” if “Count Upper Case Characters” and “Count Lower Case Characters” are both greater than zero
  • countAdultWords — A counter that starts from zero and adds one each time one of the following words is found: “sex”, “xxx”, “hot”, “porn”, “adult”, “baby”, “babe”, “sweet”, “slut”, “fuck”, “dog”, “horse”
  • countAvWords — A counter that starts from zero and adds one each time one of the following words is found: “scan”, “virus”, “threat”, “risk”, “malware”, “malicious”, “av”, “protect”, “safe”, “save”, “secure”, “security”, “anti”
  • countMediaWords — A counter that starts from zero and adds one each time one of the following words is found: “avi”, “video”, “fli”, “movie”, “wav”, “media”, “midia”, “vivo”, “mpeg”, “vid”
  • countInstallerWords — A counter that starts from zero and adds one each time one of the following words is found: “install”, “setup”
  • countOfficeWords — A counter that starts from zero and adds one each time one of the following words is found: “.doc”, “.xls”, “.ppt”, “.pdf”
  • countArchiveWords — A counter that starts from zero and adds one each time one of the following words is found: “rar”, “zip”, “lzh”
  • countGtGoodNamesWords — A counter that starts from zero and adds one each time one of the following words is found: “java”, “jsched”, “lsass”, “svchost”, “csrss”, “acrord”
  • doubleExt — A boolean that is set to true if the filename matches a regular expression that checks for double extensions.
  • directoryLen — Variable containing the length of the directory
  • countSpecialCharsInDir — A counter that starts from zero and increments each time the name of the directory contains one of the following characters: “%”, “-”, “^”, “#”, “$”, “*”, “!”
  • countBackslash — A counter that starts from zero and increments each time the name of the directory contains a backslash
  • meanCharFreq — This is equal to “directoryLen” / “countOfUniqueCharacters”
  • maxWordLen — Calculated based on the longest word in the directory name
  • minWordLen — Calculated based on the shortest word in the directory name
  • avgWordLen
  • containsTemp — Boolean set to “True” if the directory contains the word “temp”
  • containsCsidl — Boolean set to “True” if the directory contains the word “csidl”
  • containsSystem — Boolean set to “True” if the directory contains the word “system”
  • containsProfile — Boolean set to “True” if the directory contains the word “csidl_profile”
  • containsDevice — Boolean set to “True” if the directory contains the word “device”
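
A subset of the features listed above can be sketched in Python as follows. This is an illustration under several assumptions, not the engine's code: the function and key names are made up, the character counters are assumed to run over the file name without its extension (otherwise “isFileNameSingleChar” could never be true for a name like “a.exe”), “Total Length” is taken to be the length of the full file name, each keyword adds one if it appears anywhere in the lower-cased name, and directory “words” are assumed to be the backslash-separated components. The “doubleExt” check is left out because its regular expression is not shown here.

```python
SPECIAL_CHARS = set("%-^#$*!")
AV_WORDS = ("scan", "virus", "threat", "risk", "malware", "malicious", "av",
            "protect", "safe", "save", "secure", "security", "anti")
INSTALLER_WORDS = ("install", "setup")


def count_words(text, words):
    """Add one for each listed word found in the lower-cased text (assumed)."""
    lowered = text.lower()
    return sum(1 for word in words if word in lowered)


def file_name_features(file_name):
    # Split at the last dot; a name without a dot has an empty extension.
    stem, dot, extension = file_name.rpartition(".")
    if not dot:
        stem, extension = file_name, ""
    counts = dict.fromkeys(
        ("lower", "upper", "numbers", "spaces", "dots", "at_signs", "special",
         "parentheses", "square_braces", "underscores", "other"), 0)
    for ch in stem:  # assumption: the extension is not part of the loop
        if "a" <= ch <= "z":
            counts["lower"] += 1
        elif "A" <= ch <= "Z":
            counts["upper"] += 1
        elif "1" <= ch <= "9":  # digit range exactly as listed above
            counts["numbers"] += 1
        elif ch == " ":
            counts["spaces"] += 1
        elif ch == ".":
            counts["dots"] += 1
        elif ch == "@":
            counts["at_signs"] += 1
        elif ch in SPECIAL_CHARS:
            counts["special"] += 1
        elif ch in "()":
            counts["parentheses"] += 1
        elif ch in "[]":
            counts["square_braces"] += 1
        elif ch == "_":
            counts["underscores"] += 1
        else:
            counts["other"] += 1
    alpha = counts["lower"] + counts["upper"]
    return {
        **counts,
        "alpha": alpha,
        "total_length": len(file_name),
        "filename_length": len(stem),
        "extension_length": len(extension),
        "is_alpha_numeric": alpha > 0 and counts["numbers"] > 0,
        "is_single_num": len(stem) == 1 and counts["numbers"] == 1,
        "is_single_char": len(stem) == 1 and alpha == 1,
        "is_mixed_case": counts["upper"] > 0 and counts["lower"] > 0,
        "count_av_words": count_words(file_name, AV_WORDS),
        "count_installer_words": count_words(file_name, INSTALLER_WORDS),
    }


def directory_features(directory):
    lowered = directory.lower()
    # Assumption: a "word" is one backslash-separated path component.
    lengths = [len(w) for w in lowered.split("\\") if w] or [0]
    return {
        "directory_len": len(directory),
        "count_special_chars_in_dir": sum(ch in SPECIAL_CHARS for ch in directory),
        "count_backslash": directory.count("\\"),
        # directoryLen divided by the number of unique characters.
        "mean_char_freq": len(directory) / len(set(directory)) if directory else 0.0,
        "max_word_len": max(lengths),
        "min_word_len": min(lengths),
        "avg_word_len": sum(lengths) / len(lengths),
        "contains_temp": "temp" in lowered,
        "contains_csidl": "csidl" in lowered,
        "contains_system": "system" in lowered,
        "contains_profile": "csidl_profile" in lowered,
        "contains_device": "device" in lowered,
    }
```

Calling “file_name_features("Setup_v2.exe")” and “directory_features(r"CSIDL_PROFILE\appdata\local\temp")” gives two dictionaries that together cover most of the numeric and boolean columns described in the list.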

Calculating The Final Score

Once all this is done, it's time to send the events to the ML scoring engine. For this, a case file is created in the following format:

Sample_X,0,0,0,0,5,1,0,0,0,0,0,0,0,0,3,0,5,0,0,2,0,0,0,500,0,0,0,0,0,30,1,0,1,11,11,29,0,1,1,0,0,8,8,0,0,0,0,0,0,?
[Figure: Result of the “See5Sam” engine]
[Figure: File classified as “Suspicious” by “Criterion”]
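
As a rough illustration of how a line like the sample above could be produced, the Python sketch below joins a sample identifier, the feature values and a trailing “?” with commas. In See5 / C5.0 data files a “?” marks an unknown value, so the last field is presumably the class that the classifier is asked to predict. The function name “to_case_line” and the encoding of booleans as 0 / 1 are assumptions, and the column order used by the real engine is not reproduced here.

```python
def to_case_line(sample_id, feature_values):
    """Serialize one sample into a comma-separated case line (hypothetical helper)."""
    fields = [sample_id]
    for value in feature_values:
        if isinstance(value, bool):
            # Assumption: boolean features are written as 1 or 0.
            fields.append("1" if value else "0")
        else:
            fields.append(str(value))
    # The class is unknown at scoring time, so it is written as "?".
    fields.append("?")
    return ",".join(fields)
```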

Conclusion & Future Research

In this first part, we looked at the general flow of how the “Criterion” library / engine works: the conditions a file must meet to be considered for scoring, and the attributes and features needed to classify these files. Defenders can use this information to gain a deeper understanding of EID 4099 and to write more informed detections within SEDR using that EID.

Nasreddine Bencherchali

I write about #ThreatHunting #WindowsInternals #Malware #DFIR and occasionally #Python.