Silly Bing

Bing is Microsoft’s newish search engine, whose name I am reliably informed stands for Bing Is Not Google.

A couple of months ago, as an experiment, I put up a one-page link farm at wild.web.sp.am. As should be apparent after about three seconds of clicking on the links there, each page has links to 12 other pages, with each page’s host name made of three names, like http://aaron.louise.celia.web.sp.am. The pages are generated by a small Perl script and a database of a thousand first names. All the pages share the same IP address, although there are about a billion possible host names (1000 cubed, since there are three names in each host name). I forgot about it until earlier this week, when the disk with my web logs filled up.
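To make the page structure concrete, here is a minimal sketch in Python (not the original Perl script, which isn't shown). The short `NAMES` list stands in for the database of a thousand first names, and picking the 12 links at random is an assumption; the real script may choose them differently.

```python
import random

# Hypothetical stand-in for the database of a thousand first names.
NAMES = ["aaron", "louise", "celia", "maria", "john", "wendy"]
DOMAIN = "web.sp.am"
LINKS_PER_PAGE = 12


def random_host(rng, names=NAMES):
    """Build a host name from three first names, e.g. aaron.louise.celia.web.sp.am."""
    return ".".join(rng.choice(names) for _ in range(3)) + "." + DOMAIN


def render_page(host, rng, names=NAMES):
    """Return a minimal HTML page for `host` that links to 12 other hosts."""
    links = [random_host(rng, names) for _ in range(LINKS_PER_PAGE)]
    items = "\n".join(f'<li><a href="http://{h}/">{h}</a></li>' for h in links)
    return (
        f"<html><head><title>{host}</title></head>"
        f"<body><ul>\n{items}\n</ul></body></html>"
    )


def host_count(n_names):
    """Number of distinct three-name host names for a list of n_names names."""
    return n_names ** 3
```

With a thousand names, `host_count(1000)` gives the billion possible host names mentioned above, which is why a crawler that treats each host name as a separate site never runs out of new "sites" to visit.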

My web logs are normally 10 to 15 megabytes a week, but all of a sudden the logs ballooned past a gigabyte. A quick look at the logs revealed that my web server was getting hammered by the bingbot.
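A quick look of this kind can be as simple as counting requests per user agent. The sketch below assumes the server writes the common "combined" log format, where the user agent is the last double-quoted field on each line; the post doesn't say which server or log format was actually in use, and the file name in the usage note is a placeholder.

```python
import re
from collections import Counter

# In the combined log format, the user agent is the last double-quoted field.
LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Count requests per user-agent string across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LAST_QUOTED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Something like `count_user_agents(open("access.log")).most_common(5)` would then show at a glance whether one bot dominates the traffic.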

Every search engine has a “spider” or “bot” that visits web pages to collect data for its index. It’s quite normal to see a fair number of log entries from bots as various search engines wander around your web pages looking to see what’s changed.

But it was not normal to see the bingbot hammering on my link farm, ten queries a second, day after day. When I noticed it, the bingbot had already visited about 15 million times, fetching 15 million nearly identical pages. I added a robots.txt file, telling bingbot to go away. It didn’t help, which wasn’t that surprising; since each page is in a different domain, each page could hypothetically have its own different robots file, so while the robots file should stop future indexing, it won’t affect any pages that Bing had queued up from previous visits. How many did it have queued up? A lot. Bing scooped up over a million copies of the robots file, at which point I adjusted the web server configuration to return an error page when the bingbot tried to fetch a link farm page, but to return the robots file normally. That still didn’t help: it fetched a lot of robots files and a lot of error pages, each from a different domain, I think.
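The server rule described here boils down to a small decision: always serve the robots file, and refuse the bingbot everything else. The sketch below expresses that logic in Python rather than in any particular web server's configuration syntax. The robots file contents, the 403 status code, and the user-agent substring match are all assumptions; the post only says "an error page."

```python
# Assumed robots file contents; the actual file isn't shown in the post.
ROBOTS_TXT = "User-agent: bingbot\nDisallow: /\n"


def choose_response(path, user_agent):
    """Return (status, body) for a request to any link farm host name."""
    if path == "/robots.txt":
        # The robots file is served normally to everyone, the bingbot included.
        return 200, ROBOTS_TXT
    if "bingbot" in user_agent.lower():
        # Assumed error status for link farm pages requested by the bingbot.
        return 403, "Forbidden\n"
    return 200, "<html>link farm page</html>\n"
```

Because every host name resolves to the same server, this one rule covers all of the billion possible domains, but the bot still has to ask each of them separately before it learns anything.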

Since the link farm has its own IP address, it was easy to add low level packet filters to reject all traffic to that address from the 12 addresses of the bingbot. I unfiltered for a few minutes today, and it’s still hammering as hard as ever.
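The filter itself is a two-part test on each packet: is it addressed to the link farm, and does it come from one of the bot's addresses? A real filter would be written in the firewall's own rule language; the Python sketch below only illustrates the logic. All the addresses are placeholders from the documentation ranges, not the real link farm or bingbot addresses.

```python
from ipaddress import ip_address

# Placeholder addresses (documentation ranges), not the real ones.
LINK_FARM_IP = ip_address("192.0.2.10")
# Twelve hypothetical bot source addresses: 198.51.100.1 to 198.51.100.12.
BOT_SOURCES = frozenset(ip_address(f"198.51.100.{n}") for n in range(1, 13))


def should_reject(src, dst):
    """Reject only traffic from a listed bot address to the link farm address."""
    return ip_address(dst) == LINK_FARM_IP and ip_address(src) in BOT_SOURCES
```

Matching on the destination as well as the source is what keeps the filter from affecting the bot's visits to the other web sites on the same machine.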

While this isn’t doing any great damage, if I didn’t have the skills to look at logs and write suitable packet filters, or if I were paying by the byte for network traffic, it could have crashed my system or cost me a lot of money.

Bing is not the only search engine to have discovered my link farm. Google’s Googlebot-Mobile/2.1 visits the link farm every few seconds, claiming to be various kinds of Japanese mobile phones. But Bing’s traffic is orders of magnitude more than everyone else’s put together. (This is a problem only for the link farm; the rest of my web sites get along with Bing just fine.)

My main question is how these highly sophisticated search engines have failed to notice that they have fetched several million almost identical pages from the same IP address, and failed to blacklist it. I have reason to believe that Bing management is aware of the issue, so maybe they’ll stop it some time. Or maybe even let on what happened.

By John Levine, Author, Consultant & Speaker

Comments

You've been BINGED! Phil Howard  –  Jul 16, 2012 8:04 PM

You’ve been BINGED!
