Detecting Possible Domain Generation Algorithm-Related Threats Using Typosquatting Data Feed

A domain generation algorithm (DGA) generates large numbers of domain names, which malware commonly uses to locate its command-and-control (C&C) servers. The logic behind a DGA is quite simple. Instead of hard-coding a domain or IP address into the malware, its authors have the malware look for its C&C server under a domain with a seemingly random name. In reality, these domain names form a sequence generated by an algorithm, and the sequence can only be reproduced by someone who holds a secret “seed.” In effect, this helps the malware evade detection by certain security systems.
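To make the idea of a reproducible, seed-driven sequence concrete, here is a minimal Python sketch. It is not the algorithm of any real malware family: the function name `toy_dga`, the SHA-256 construction, the use of the date as a second input, and the reserved `.example` suffix are all assumptions chosen for illustration.

```python
import hashlib
from datetime import date


def toy_dga(seed: int, day: date, count: int = 5, tld: str = "example") -> list[str]:
    """Hypothetical toy DGA: derive repeatable, random-looking names from a seed and a date."""
    domains = []
    for index in range(count):
        # The same seed, date, and index always hash to the same digest.
        material = f"{seed}-{day.isoformat()}-{index}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits (0-15) onto the letters a-p.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(f"{label}.{tld}")
    return domains


if __name__ == "__main__":
    for name in toy_dga(seed=7, day=date(2020, 8, 30)):
        print(name)
```

Anyone who knows the seed and the date can regenerate the same list, while an observer without the seed only sees names that look random. That asymmetry is what the malware operator relies on.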

Domain generation algorithms are therefore dangerous. They can turn even the simplest form of malware into a menacing threat that causes millions of dollars in damage because it is difficult to detect. GameOver Zeus (GOZ) is an example of this. The banking malware is believed to have caused US$100 million in damages because it was hard to identify and block.

But malware families that use domain generation algorithms are not foolproof. A tremendous research effort has gone into detecting and predicting domain registrations that originate from DGAs, yet there is unfortunately no ultimate solution. Typosquatting Data Feed can help detect some possibly DGA-generated domain names and keep malware from communicating with C&C servers. This post shows how.

Domain Generation Algorithm Monitoring

Malware that uses a domain generation algorithm can generate thousands of domain names at once. Conficker C, for example, can generate up to 50,000 domain names per day, of which only 500 are queried. Threat actors therefore need to register a portion of these DGA-generated domain names, and this is where Typosquatting Data Feed comes in.

The data feed detects domains with similar-looking names that are bulk-registered on the same day. These daily data sets do contain domain names that are apparently machine-generated, possibly by a DGA. In other words, if some DGA-generated domain names registered on the same day are close to each other according to a text similarity metric, the data feed will contain them.

To illustrate, we downloaded the data feed dated 30 August 2020 and found several domains that were possibly created via domain generation algorithms. Even though there is no solid proof yet that this is the case, they warrant attention.
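The same-day similarity grouping described in this section can be sketched in Python as follows. This is not the data feed's actual method, which the post does not specify. The sketch assumes Levenshtein edit distance as the text similarity metric, a threshold of one edit, and a small hand-picked input list; the names `edit_distance` and `group_similar` are hypothetical.

```python
from itertools import combinations


def label(domain: str) -> str:
    """Return the first label of a domain, accepting defanged '[.]' notation."""
    return domain.replace("[.]", ".").split(".")[0]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def group_similar(domains: list[str], max_distance: int = 1) -> list[list[str]]:
    """Group domains connected by chains of pairs within max_distance edits."""
    parent = {d: d for d in domains}

    def find(d: str) -> str:
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path halving
            d = parent[d]
        return d

    for a, b in combinations(domains, 2):
        if edit_distance(label(a), label(b)) <= max_distance:
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for d in domains:
        groups.setdefault(find(d), []).append(d)
    # Only groups with more than one member are interesting.
    return [g for g in groups.values() if len(g) > 1]


if __name__ == "__main__":
    # Illustrative input only: treated here as if registered on the same day.
    same_day = ["vpdbtgjta[.]com", "avpdbtgjta[.]com", "vpdbtgjtk[.]com", "8158bet[.]com"]
    print(group_similar(same_day))
```

Comparing every pair is quadratic in the number of domains, which is fine for a sketch but would likely need indexing or blocking for a full day of registrations.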

Magic Number Seed

Some domain generation algorithms use a combination of magic numbers and the current date and time to create domain names. To provide some perspective, here are some actual examples of domain names generated via DGA, particularly for Locky ransomware. These were generated using 7 as the seed number:

  • pvmyilqakqqkl[.]in
  • kfqoruddyo[.]nl
  • myxmilto[.]it
  • hicqd[.]us
  • qnqlfdthdyidbw[.]be
  • shxppmfnhjao[.]pm
  • nqcxfhycl[.]in
  • wowkllj[.]it

Typosquatting Data Feed picked up similar-looking domains, including the following:

  • bvpdbtgjta[.]com
  • dvpdbtgjta[.]com
  • vpdbtgjtk[.]com
  • avpdbtgjta[.]com
  • vpdbtgjtx[.]com
  • vpdbtgjth[.]com
  • vpdbtgjtr[.]com
  • vpdbtgjta[.]com
  • nvpdbtgjta[.]com
  • vpdbtgjto[.]com
  • vpdbtgjtm[.]com
  • ovpdbtgjta[.]com
  • vpdbtgjtl[.]com
  • vpdbtgjtn[.]com
  • cvpdbtgjta[.]com
  • hvpdbtgjta[.]com

Dictionary-Based Domain Generation Algorithms

Cybersecurity experts are able to detect DGA-generated domain names such as those listed above. They can separate out traffic that involves random-looking, non-human-readable domain names. Threat actors have therefore learned to adapt and enhance their domain generation methods. They can use dictionary words so that the generated domain names are partly readable and look less random.

For example, Typosquatting Data Feed detected some domains that use the string “bet” and random numbers:

  • 8158bet[.]com
  • 9001bet[.]com
  • mpo333bet[.]club
  • 5509bet04[.]com
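The difference between random-looking names and partly readable ones can be illustrated with a few simple character statistics. The Python sketch below only computes features. It does not claim any particular threshold separates DGA names from legitimate ones, and the choice of features (Shannon entropy, vowel ratio, longest consonant run, with "y" counted as a consonant) is an assumption made for illustration.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text: str) -> float:
    """Entropy in bits per character of the character distribution in text."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def vowel_ratio(text: str) -> float:
    """Share of vowels among the alphabetic characters (0.0 if there are none)."""
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c in VOWELS for c in letters) / len(letters)


def longest_consonant_run(text: str) -> int:
    """Length of the longest unbroken run of consonant letters."""
    best = run = 0
    for c in text.lower():
        if c.isalpha() and c not in VOWELS:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


if __name__ == "__main__":
    for name in ["vpdbtgjta", "8158bet", "example"]:
        print(name, round(shannon_entropy(name), 2),
              round(vowel_ratio(name), 2), longest_consonant_run(name))
```

A label such as vpdbtgjta has a long consonant run and few vowels, whereas a label built around a dictionary word such as "bet" looks much more ordinary on these measures. That is one reason dictionary-based names are harder to flag with statistics alone.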

Out of 8,846 typosquatting domains detected on 30 August 2020, about 32% could be DGA-generated. The breakdown is as follows:

  • Letters only: More than 220 domain names used random letter combinations.
  • Numeric characters: About 2,400 used numeric characters only. Examples are 3966899[.]com, 6977799[.]com, and 8981699[.]com. At one point, 942 similar domains were bulk-registered on the same day.
  • Alphanumeric characters: 176 domains used alphanumeric characters, 53 of which were possibly dictionary-based DGA-generated domain names.
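The three categories above can be reproduced with a simple character-class check, and the approximate share follows from the rounded counts given in the list. The Python sketch below is one way to do this. The function name `character_class` is hypothetical, and the counts 220 and 2,400 are the rounded figures from the text rather than exact values.

```python
import string

DIGITS = set(string.digits)
LETTERS = set(string.ascii_lowercase)


def character_class(domain: str) -> str:
    """Classify the first label of a domain as numeric, letters, alphanumeric, or other."""
    first_label = domain.lower().replace("[.]", ".").split(".")[0]
    chars = set(first_label)
    if not chars:
        return "other"
    if chars <= DIGITS:
        return "numeric"
    if chars <= LETTERS:
        return "letters"
    if chars <= DIGITS | LETTERS:
        return "alphanumeric"
    return "other"  # hyphens or non-ASCII characters, for example


# Rounded counts taken from the breakdown in the text.
letters_only, numeric_only, alphanumeric, total = 220, 2400, 176, 8846
approx_share = (letters_only + numeric_only + alphanumeric) / total

if __name__ == "__main__":
    for name in ["3966899[.]com", "vpdbtgjta[.]com", "8158bet[.]com"]:
        print(name, character_class(name))
    print(f"{approx_share:.1%}")
```

With these rounded counts, the share works out to roughly 31.6%, which is consistent with the "about 32%" figure.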

Detecting DGA-generated domain names is the first step in preventing malware from communicating with its C&C servers. Threat actors commonly use these domains shortly after registering them, so timeliness is crucial. For example, ypwosgnjytynbqin[.]com, which was generated using a domain generation algorithm, was registered on 3 July 2019 and was seen communicating with a Ramnit malware C&C server just three days later.

As with typosquatting, early detection is vital in fighting malware families that use domain generation algorithms. Since threat actors register DGA-generated domains in bulk, Typosquatting Data Feed may pick some of them up as soon as they appear in the DNS.

By WhoisXML API, A Domain Research, Whois, DNS, and Threat Intelligence API and Data Provider

Whois API, Inc. (WhoisXML API) is a big data and API company that provides domain research & monitoring, Whois, DNS, IP, and threat intelligence API, data and tools to a variety of industries.
