A domain generation algorithm (DGA) produces large numbers of domain names that malware can use to reach its command-and-control (C&C) servers. The logic behind a DGA is quite simple. Instead of having a domain or IP address hard-coded into it, the malware looks for its C&C server under a domain with a seemingly random name. In reality, these domain names form a sequence generated by an algorithm, and the sequence can only be reproduced by someone who holds the secret “seed.” In effect, this helps the malware evade detection by certain security systems.
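To make the idea of a seed-driven sequence concrete, here is a minimal sketch of a toy generator. It is not the algorithm of any real malware family: the function name `toy_dga`, the use of SHA-256, the 12-letter label length, and the `.example` suffix are all assumptions chosen for illustration.

```python
import hashlib
from datetime import date


def toy_dga(seed: int, day: date, count: int = 5, tld: str = ".example") -> list[str]:
    """Return `count` pseudo-random domain names derived from a seed and a date.

    Hypothetical illustration only; no real malware family is reproduced here.
    """
    domains = []
    for index in range(count):
        # Mix the secret seed, the date, and a counter into one input string.
        material = f"{seed}:{day.isoformat()}:{index}".encode()
        digest = hashlib.sha256(material).digest()
        # Map the first 12 digest bytes onto the letters a-z.
        label = "".join(chr(ord("a") + byte % 26) for byte in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga(seed=7, day=date(2020, 8, 30)):
        print(name)
```

Anyone who knows the seed and the date can regenerate the same list, which is what lets the malware and its operator agree on a domain without ever exchanging it. Anyone without the seed sees only random-looking names.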
Domain generation algorithms are therefore dangerous. They can turn the simplest form of malware into a menacing enemy that causes millions of dollars in damage, because its C&C traffic is difficult to detect. GameOver Zeus (GOZ) is an example. The banking malware is believed to have caused US$100 million in damages because it was hard to identify and block.
But malware families that use domain generation algorithms are not 100% fail-proof. A tremendous research effort has gone into detecting and predicting domain registrations that originate from DGAs, yet there is unfortunately no ultimate solution. Typosquatting Data Feed can help detect some possibly DGA-generated domain names and keep malware from communicating with C&C servers. Let’s see how in this post.
Malware that uses a domain generation algorithm can generate thousands of domain names at once. Conficker C, for example, can generate up to 50,000 domain names per day, of which only 500 are queried. For the malware to reach its C&C server, threat actors need to register at least a portion of these DGA-generated domain names. And this is where Typosquatting Data Feed comes in.
The data feed detects domains with similar-looking names that are bulk-registered on the same day. These daily data sets contain domain names that are apparently machine-generated, possibly by a DGA. In other words, if some DGA-generated domain names registered on the same day are close to each other according to a text similarity metric, the data feed will contain them.
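The grouping idea can be sketched in a few lines. The metric the data feed actually uses is not described here, so this example swaps in the ratio from Python's standard `difflib.SequenceMatcher` as a stand-in. The function names, the 0.8 threshold, and the sample names are assumptions for illustration.

```python
from difflib import SequenceMatcher


def label(domain: str) -> str:
    """Return the leftmost label of a domain name, lowercased."""
    return domain.split(".")[0].lower()


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between the leftmost labels of two domain names."""
    return SequenceMatcher(None, label(a), label(b)).ratio()


def group_similar(domains: list[str], threshold: float = 0.8) -> list[list[str]]:
    """Greedily group names that resemble the first member of an existing group."""
    groups: list[list[str]] = []
    for name in domains:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([name])
    # A "bulk" registration needs at least two look-alike names.
    return [group for group in groups if len(group) > 1]


if __name__ == "__main__":
    # Hypothetical same-day registrations, not taken from the data feed.
    same_day = ["bet1234.example", "bet1235.example",
                "unrelatedshop.example", "bet9234.example"]
    print(group_similar(same_day))
```

The greedy pass compares each name only with the first member of each group, which keeps the sketch short but makes the result depend on input order. A production system would more likely use a proper clustering step.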
To illustrate, we downloaded the data feed dated 30 August 2020 and found several domains that were possibly created by domain generation algorithms. Even though there is no solid proof yet that this is the case, they warrant attention.
Some domain generation algorithms use a combination of magic numbers and the current date and time to create domain names. To provide some perspective, here are some actual examples of domain names generated via DGA, particularly for Locky ransomware. These were generated using 7 as the seed number:
Typosquatting Data Feed picked up similar-looking domains, including the following:
Cybersecurity experts are able to detect DGA-generated domain names such as those listed above. They can single out traffic that goes to random-looking, non-human-readable domain names. In response, threat actors have learned to adapt and enhance their domain generation algorithms. They can use dictionary words so that the generated domain names are partly readable and do not look very random.
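One common heuristic for flagging random-looking names is the Shannon entropy of the characters in a label. The sketch below is one possible approach, not a description of how any particular product works. The function name `shannon_entropy` is an assumption, and the comparison word `google` is an arbitrary readable label chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # One DGA-style label and one arbitrary readable label for comparison.
    for name in ["ypwosgnjytynbqin", "google"]:
        print(name, round(shannon_entropy(name), 3))
```

Entropy grows with the number of distinct characters, so long labels tend to score higher regardless of how they were created. Dictionary-based DGA names also score closer to ordinary words, which is why entropy alone is not a reliable detector.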
For example, Typosquatting Data Feed detected some domains that use the string “bet” and random numbers:
Out of 8,846 typosquatting domains detected on 30 August 2020, about 32% could be DGA-generated. The chart below shows the breakdown.
Detecting DGA-generated domain names is the first step in preventing malware from communicating with its C&C servers. Threat actors commonly use these domains shortly after registering them, so timeliness is crucial. For example, ypwosgnjytynbqin[.]com, generated using a domain generation algorithm, was registered on 3 July 2019 and was seen communicating with a Ramnit malware C&C server just three days later.
As with typosquatting, early detection is vital in fighting malware families that use domain generation algorithms. Since threat actors register DGA-generated domains in bulk, Typosquatting Data Feed may pick some of them up as soon as they appear in the DNS.