The Highest Threat TLDs - Part 1

Co-authored by Dr. David Barnett, Brand Monitoring Subject-Matter Expert and Justin Hartland, Global Director of Account Management at CSC.

A domain name consists of two main elements: the second-level domain name to the left of the dot—often consisting of a brand name or relevant keywords—and the domain extension or top-level domain (TLD) to the right of the dot. Domain names form the key elements of the readable web addresses allowing users to access pages on the internet and also allowing the construction of email addresses.

There are different types of TLDs, including generic or global TLDs (gTLDs) that were originally intended to describe the site type, such as .COM for company websites or .ORG for charitable organizations. There are also country-code TLDs (ccTLDs) for specific countries, e.g., .CO.UK for the U.K., .FR for France, etc. Finally, there is a range of new gTLDs that have launched since 2013[1], usually relating to specific content types, business areas, interests, or geographic locations (e.g., .SHOP, .CLUB, .TOKYO). Each TLD is overseen by a registry organization, which manages its infrastructure.

Domain names are associated with the full spectrum of internet content, from legitimate use by brands or individuals to infringing or criminal activity. CSC has observed that certain TLDs are used disproportionately for egregious content.

There are several possible reasons why particular TLDs are more attractive to infringers, including the cost of domain registration, and difficulties in conducting enforcement (takedown) actions against infringing content. TLDs operated by certain registries, like those offering low- or no-cost domain registrations or those with lax registration security policies, are more likely to be used for infringing activities. Additionally, domain extensions lacking well-defined, reliable enforcement routes like .VN (Vietnam) and .RU (Russia) prove to be especially high risk. Other factors are also significant; for example, a country’s wealth affects the levels of technical expertise of internet service providers (ISPs) and, therefore, the likelihood of domains being compromised.

In this two-part blog post, we aim to quantify the threat levels associated with specific domain extensions, i.e., the likelihood that a domain on a particular TLD might be registered for fraudulent purposes.

Part 1: Phishing site TLDs

Determining the overall threat frequency for each TLD is useful in several ways:

  • Helping to prioritize results identified via a brand protection service. For example, the TLD can be used to identify top targets for future tracking for content changes.
  • Identifying TLDs where it’s advisable to register domains featuring key brand-related strings defensively to avoid them being registered by third parties with malicious intent.
  • Identifying TLDs where it’s advantageous for brand protection service providers to offer blocks or alerts when, for example, a third party attempts to register a domain containing a brand-related term.

Analysis and discussion

For this first post, we analyzed data from CSC’s Fraud Protection services to uncover the TLDs associated with domains used for phishing activity. The analysis covers all sites detected between November 2021 and April 2022 for those TLDs with more than 10 phishing cases and where domain-based phishing cases were recorded (as opposed to subdomain-based). This yielded results for 115 distinct TLDs.

In addition, we consider the frequency of domain use associated with threatening content across the TLD in question. We do this by expressing the raw numbers as a proportion of the total number of domains registered across the TLD[2]. We then normalize the data, so the value for the highest-threat TLD is 1, with all other values in that dataset scaled accordingly. It’s important to note that this value reflects the proportion of malicious domains across each TLD, rather than absolute numbers. Some other TLDs see high numbers of infringements simply because of the total numbers of domain registrations across these extensions. Table 1 shows the top 20 TLDs represented in CSC’s phishing dataset (by absolute numbers), together with the normalized threat frequencies for these TLDs.

Table 1: Top 20 TLDs represented in CSC’s phishing dataset, by absolute numbers.

TLD     % of total phishing cases   Total no. of regd. domains across TLD   Normalized threat frequency within dataset
.COM    45.7%                       221,858,334                             0.014
.ORG    6.9%                        15,550,733                              0.031
.APP    6.2%                        1,155,807                               0.377
.NET    4.8%                        19,773,315                              0.017
.XYZ    2.5%                        10,841,304                              0.016
.RU     2.5%                        10,627,033                              0.016
.CO     2.1%                        4,110,132                               0.035
.CN     1.7%                        25,147,816                              0.005
.ME     1.3%                        1,669,800                               0.054
.DEV    1.2%                        391,929                                 0.222
.BR     1.2%                        5,519,378                               0.015
.TOP    1.2%                        8,830,142                               0.009
.IO     1.1%                        923,588                                 0.085
.IN     1.1%                        3,271,337                               0.023
.PAGE   1.0%                        368,474                                 0.195
.ID     0.9%                        760,240                                 0.080
.ICU    0.8%                        7,956,385                               0.007
.INFO   0.8%                        7,852,896                               0.007
.DE     0.7%                        22,881,115                              0.002
.KE     0.7%                        165,907                                 0.288
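
The normalization described above can be sketched in a few lines of Python. This is a minimal illustration rather than CSC's actual tooling: the function name is hypothetical, and because raw case counts are not published here, the sketch uses each TLD's rounded percentage share of phishing cases in place of the raw count (the two are proportional, so the normalized result is the same in principle). The inputs are a handful of rounded figures copied from Tables 1 and 2, so the scores only approximate the published values.

```python
def normalized_threat_frequency(share_pct, total_domains):
    """Scale per-TLD phishing rates so the highest-rate TLD scores 1.0."""
    # Phishing share per registered domain. The share is proportional to
    # the raw case count, so it serves as a stand-in for the raw rate.
    rate = {tld: share_pct[tld] / total_domains[tld] for tld in share_pct}
    top = max(rate.values())
    # Divide every rate by the highest one, giving values in (0, 1].
    return {tld: r / top for tld, r in rate.items()}


# Rounded figures copied from Tables 1 and 2 (assumption: a small subset
# is enough to illustrate the calculation).
share_pct = {".GD": 0.05, ".APP": 6.21, ".KE": 0.68, ".COM": 45.7}
total_domains = {
    ".GD": 3_306,
    ".APP": 1_155_807,
    ".KE": 165_907,
    ".COM": 221_858_334,
}

scores = normalized_threat_frequency(share_pct, total_domains)
for tld, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{tld:5s} {score:.3f}")
```

Because every rate is divided by the same maximum, a small TLD with few registrations (.GD here) can top the list even though it accounts for a tiny share of cases, while .COM drops to the bottom despite having by far the largest share.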

We’ve observed similar patterns in other analyses of threatening content. Interisle’s “Malware Landscape 2022” study found that the top 10 TLDs associated with malware domains also featured a mix of legacy gTLDs (.COM at position one, .NET at five, .ORG at six, and .BIZ at 10), new gTLDs (.XYZ at position two, .CLUB at seven, and .TOP at nine) and ccTLDs (.BR, .IN, and .RU at positions three, four, and eight, respectively)[3]. Eight of these 10 extensions feature in the top 14 of CSC’s phishing list above. Similarly, the Anti-Phishing Working Group’s (APWG’s) “Phishing Activity Trends Report” for Q4 2021 analyzed top phishing TLDs, with a top nine including new gTLDs .XYZ, .BUZZ, and .VIP, and ccTLDs .BR and .IN, alongside legacy gTLDs.

New gTLDs were more than twice as extensively represented in the dataset as would be expected purely based on the total number of domains registered across these extensions. A Q1 2022 study by Agari™ and PhishLabs also showed similar patterns, where the top 10 TLDs abused by phishing (by number of sites) included the new gTLDs .VIP, .XYZ, and .MONSTER, and ccTLDs .BR, .LY, and .TK[5, 6].

Table 2 shows the pattern is rather different when looking at the top TLDs by their normalized threat frequency; the list is dominated by a distinct set of ccTLDs, a smaller number of new gTLDs, and excludes many of the more popular TLDs shown previously.

Table 2: Top 20 TLDs represented in CSC’s phishing dataset, by normalized threat frequency.

TLD     Normalized threat frequency within dataset   Total no. of regd. domains across TLD   % of total phishing cases
.GD     1.000                                        3,306                                   0.05%
.GY     0.910                                        4,037                                   0.05%
.MS     0.739                                        9,440                                   0.10%
.ZM     0.531                                        4,838                                   0.04%
.APP    0.377                                        1,155,807                               6.21%
.LY     0.356                                        25,801                                  0.13%
.KE     0.288                                        165,907                                 0.68%
.DEV    0.222                                        391,929                                 1.24%
.PAGE   0.195                                        368,474                                 1.03%
.UG     0.187                                        10,810                                  0.03%
.SN     0.187                                        9,842                                   0.03%
.DO     0.176                                        30,215                                  0.08%
.BD     0.127                                        37,465                                  0.07%
.SBS    0.120                                        44,222                                  0.08%
.NP     0.112                                        57,379                                  0.09%
.SH     0.110                                        25,070                                  0.04%
.NG     0.097                                        240,668                                 0.33%
.IO     0.085                                        923,588                                 1.11%
.ID     0.080                                        760,240                                 0.86%
.SA     0.079                                        60,246                                  0.07%

In the second article in this series, we compare these findings with those from additional datasets to produce an overall measure of TLD threat frequency, considering a range of fraudulent uses. We then consider cybersecurity implications, discuss mitigation measures, and cover how CSC can help with this process.

By David Barnett, Brand Monitoring Subject-Matter Expert at CSC

David Barnett has worked in the internet brand-protection industry as an analyst and consultant since 2004. David managed the Analysis & Consultancy services in Brand Monitoring from 2006 to 2019, and currently works as the Brand Monitoring subject-matter expert in CSC’s office in Cambridge, U.K., helping to serve a range of brand-protection customers in a variety of industries.


Comments

Not sure I buy those numbers. Kevin Murphy  –  Jan 12, 2023 10:46 AM

How are we counting domains here? The numbers in your “Total no. of regd. domains across TLD” column appear to be way off if we’re talking about a snapshot at a given moment in time. Verisign has never reported .com numbers as high as 221 million. It’s currently around 160 million and .net is around 13 million. Most of the other domain counts appear to be far too high also.

David Barnett  –  Jan 13, 2023 2:43 AM

All overall TLD stats are taken from https://domainnamestat.com/statistics/tldtype/all. Even if their numbers turned out to be consistently (for the sake of argument) ~25% too high, this wouldn't affect the overall findings, since all ratios are normalised anyway.

Big if Kevin Murphy  –  Jan 13, 2023 3:53 AM

That doesn't appear to be the case. Your number for .ru, for example, is more than double what the registry reports, while your number for .page is more than five times larger than what the registry reports. Meanwhile, your number for .br only appears to be about 10% off.

David Barnett  –  Jan 15, 2023 4:15 AM

I can't vouch for the accuracy of their numbers, but even if they're only broadly correct - to, say, an order of magnitude - it won't significantly change the overall conclusions - particularly in Part 2 of the article, where we combine the findings with those from other independent datasets. Where are you getting your stats from?

Stats Kevin Murphy  –  Jan 15, 2023 1:13 PM

I get my stats from the registries. Directly in the case of ccTLDs. Vicariously from ICANN in the case of gTLDs. Why are you basing your analysis on domainnamestat.com? Do you know who runs that site or what their methodology is? I certainly don't. This is pretty basic stuff mate.

Wrong registration figures. John McCormac  –  Jan 19, 2023 1:45 AM

Many of those TLD counts are wrong. Not ~25% in error. Simply wrong! These are the domain name counts for .COM and .NET as of this morning.

https://www.verisign.com/en_US/channel-resources/domain-registry-products/zone-file/index.xhtml

The .COM is at 160,593,240 and the .NET is at 13,226,928 domain names. The registry reports for the ICANN gTLDs are available from ICANN’s website.

https://www.icann.org/resources/pages/registry-reports

Many ccTLD registries publish their counts on their websites such as DEnic.
https://www.denic.de/

As Kevin said above, it is pretty basic stuff. The .COM has never been at 221M registrations.  Some of the figures for domain name counts are multiples of the actual domain name counts for those TLDs.  The claim in the footnotes that the statistics are correct as of June 13th, 2022 is simply wrong. Trying to calculate the frequency of abusive registrations in a TLD generally requires the number of domain names in that TLD.

There are other methods of taking samples of a TLD and checking for the occurrence of abusive registrations in that sample. It can provide valid estimates of abuse in a TLD. I’m not sure that I’ve ever run across a method that calculates the frequency of abusive registrations in non-existent domain names.

Registry counts and zone file counts and zone files. John McCormac  –  Jan 19, 2023 2:02 AM

.COM is at 160,593,420 this morning (apologies for slight error due to lack of coffee). The active figure, the number of domain names in the .COM zone file, is 158,670,053 and in the .NET zone, the figure is 13,029,731. The number of live domain names in a TLD differs from the overall number because some domain names are going through a deletion cycle or have no associated nameservers. The zone files for the gTLDs are available from ICANN’s CZDS website on approval by the registries.

https://czds.icann.org/home

Those zone files are updated daily.

Most of the ccTLDs do not provide access to their zone files but generally do publish live, monthly, quarterly or yearly statistics on the size of their TLDs.

David Barnett  –  Jan 19, 2023 4:01 AM

Obviously for our formal domain-monitoring services at CSC we do use the full zone-file data downloaded from the individual registries.

Really what we were looking for, for this part of this study, was a convenient resource where we could find estimates of the overall size of all TLDs in a single place (which was the rationale for going to domainnamestat.com) - it really isn’t intended as anything more than a ‘quick-and-dirty’ estimate (and clearly the numbers have turned out not to be too robust!) but, as long as the figures are broadly correct to the order of magnitude, the overall findings are not significantly affected. (The footnote denotes that the numbers were consistent with those given on the site on 13-Jun-2022.)

Some of those figures are massively wrong. John McCormac  –  Jan 19, 2023 4:58 AM

The problem is that some of the figures were wrong even where the domain name counts and zone files were public. The ICANN CZDS provides a single site where all (except .AERO and .POST) zone files can be downloaded. If, as in some cases, the counts are double or more the number of domain names that are in the actual zone files, then this will affect the frequency calculations and underestimate the level of abusive registrations.

The .INFO had approximately 3.6M registrations in June 2022 and the figure above is claiming it had 7.85M. The .XYZ had approximately 4.25M and the figure above is for 10.84M. The .TOP had approximately 1.75M in June (1.99M by the end of June 2022) and the figure above has it at 8.83M. The .ICU had approximately 1.09M in the zone in June 2022. The figure above is for 7.96M.

The ICANN registry reports also seem to have an error on .APP compared to the zone file. (I ETLed the complete ICANN registry reports set from July 2001 to September 2022 as part of comparing the data with the ICANN Open Data Project dataset and for market analysis work.) There is always a difference between the registry gTLD total in the reports and the count in the zone files for active gTLDs but some of the counts in the figures above are multiples of the actual zone file counts. That makes the frequency calculations highly problematic. A lot of abusive registrations shifted from .COM to the heavily discounted new gTLDs. Those abusive registrations typically last for one year and are not renewed because in the heavy discounting model, the first renewal fee is at full fee whereas the discounted fee might only have been $1 or less. There was a very good paper by SIDN Labs (affiliated with the Dutch .NL registry) on this a few years ago. Free or heavily discounted TLDs are always going to attract bad actors because they change the economics of the activity. That is one of the main factors in abusive registrations and DNS Abuse.

These are interesting articles and provide a lot of food for thought.
