Excerpt from:  Blogsite News
August 11, 2007

Fighting Back Against Big, Hungry, Orange Alligators

Overly aggressive RSS feed aggregators can cause server performance problems; MyST Blogsite introduces new technology to detect (and reject) egregious offenders.

The Problem

More than three years ago, MyST co-founder Bill French and I started referring to overly aggressive RSS feed readers as orange alligators.  It was a pun that combined the color of the already ubiquitous RSS icons and the closeness of the words "aggregator" and "alligator".  The idea was that these alligators seemed insatiably hungry for our server bandwidth.  We lamented that while most RSS clients were well behaved (i.e., they used conditional requests, reasonable polling frequencies, identifiable user agent names, etc.), some were very poorly behaved and could [and did] cause serious server loading issues.

[Image: Big, Hungry, Orange Alligators (Overly Aggressive Feed Readers)]

What do I mean by "poorly behaved"?  For example, imagine a feed reader polling each of twelve feeds in a large blogsite every 30 seconds; further imagine the reader using unconditional requests and requesting cache refresh, thereby forcing the feed to be re-delivered, in its entirety, for each request.  This translates into 24 full-content requests per minute (that's a request every 2.5 seconds) just to service a single client!  Depending on the nature of the channel content, this can mean serving up many megabytes per minute of the same, comparatively slowly changing, content, over and over—big, hungry, orange alligators eating bandwidth sandwiches.  And it can be worse than just wasting bandwidth.  In the case of dynamic content, making unconditional, cache-refreshing requests is actually asking the server to recreate the dynamic content for each request.  This can lead to server overload conditions where the server simply can't satisfy the alligator's appetite, resulting in poor server performance for everybody, not just the alligator.
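To make the arithmetic in that example concrete, here is a small back-of-the-envelope sketch. The function name and the 100 KB feed size are assumptions chosen for illustration; only the twelve feeds and the 30-second interval come from the example above.

```python
def polling_load(feed_count, poll_interval_s, feed_size_kb):
    """Estimate the load one client creates with unconditional requests.

    Returns (requests per minute, seconds between requests, MB per minute).
    feed_size_kb is a hypothetical average feed size, not a measured value.
    """
    # Each feed is fetched once per interval, so scale up to one minute.
    requests_per_minute = feed_count * 60 / poll_interval_s
    seconds_between_requests = 60 / requests_per_minute
    # Every unconditional request re-delivers the whole feed.
    megabytes_per_minute = requests_per_minute * feed_size_kb / 1024
    return requests_per_minute, seconds_between_requests, megabytes_per_minute


# Twelve feeds polled every 30 seconds, assuming 100 KB per feed.
per_minute, gap_s, mb_per_minute = polling_load(12, 30, 100)
print(per_minute, gap_s, round(mb_per_minute, 2))
```

With those assumed numbers, a single alligator pulls a little over 2 MB per minute of content that has mostly not changed.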

The Solution

We knew that the day would come when it was necessary to introduce server-side technology to detect and manage alligators.  For MyST Blogsite® servers, that day has come.  A little over two years ago, we introduced MyST SlimeGate™, a server-side technology that protects our servers against attacks by hackers and spammers.  Yesterday we introduced, to all MyST Blogsite servers, our new MyST AlliGate™ technology that protects against overly aggressive content consumers, most of which are poorly behaved RSS feed readers but could also include other types of web clients.

AlliGate monitors server requests, watching for alligators.  It uses a variety of heuristics to identify them, and once an alligator is identified, its IP address is submitted to SlimeGate for "management".  Managed addresses may be dynamically blocked at the network firewall, preventing them from any communication with the server for a short period of time.  Once access is restored, the client may again access the server.  However, a subsequent offense by a managed IP address results in another firewall lockout, this time for twice as long as the one before.  Also, once access is restored, if a managed IP goes for a while (currently 30 days) without further offense, the IP is completely forgiven.
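The escalation policy described above (lock out, double the lockout on each repeat offense, forgive after 30 quiet days) can be sketched roughly as follows. This is not the AlliGate or SlimeGate code; the class names, the 10-minute base lockout, and the choice to count the 30 days from the moment access is restored are all assumptions made for illustration.

```python
from dataclasses import dataclass

FORGIVENESS_S = 30 * 24 * 3600  # 30 days, the figure given in the article


@dataclass
class ManagedAddress:
    offenses: int = 0           # offenses recorded since last forgiveness
    blocked_until: float = 0.0  # timestamp when access is restored


class LockoutTable:
    def __init__(self, base_lockout_s=600.0):
        # base_lockout_s is hypothetical; the article only says "a short period".
        self.base_lockout_s = base_lockout_s
        self.records = {}

    def report_offense(self, ip, now):
        """Record an offense and return the lockout duration in seconds."""
        record = self.records.get(ip)
        # Forgive the address if it stayed clean for the full window
        # after its access was restored (assumed starting point).
        if record is not None and now - record.blocked_until >= FORGIVENESS_S:
            record = None
        if record is None:
            record = ManagedAddress()
            self.records[ip] = record
        # First offense gets the base lockout; each repeat doubles it.
        duration = self.base_lockout_s * 2 ** record.offenses
        record.offenses += 1
        record.blocked_until = now + duration
        return duration

    def is_blocked(self, ip, now):
        record = self.records.get(ip)
        return record is not None and now < record.blocked_until
```

In this sketch, timestamps are plain seconds passed in by the caller, which keeps the policy easy to test without waiting on a real clock.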

While the heuristics used to detect alligators are somewhat complex, here are some recommendations that, if followed, will ensure that you never trigger AlliGate to "manage" your access:

  1. Do not poll a given feed more than five times per hour.  (Blogsite content is not more "real-time" than that anyway; in most cases, checking once per hour is perfectly adequate.)
  2. Use web clients (e.g., RSS readers, browsers, etc.) that make conditional requests.
  3. Do not force cache refresh on all requests.  (Refreshing cache occasionally is okay, but it should not be standard operating procedure.)
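
For readers who write their own feed clients, the recommendations above might look something like the following minimal sketch, using only the Python standard library. The function names and the user agent string are hypothetical. The conditional headers (If-None-Match and If-Modified-Since) are standard HTTP, and the client deliberately sends no Cache-Control: no-cache or Pragma: no-cache header, since that is what forces a cache refresh.

```python
import urllib.error
import urllib.request

# Five polls per hour at most works out to one poll every 12 minutes.
MIN_POLL_INTERVAL_S = 3600 / 5


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from the validators saved on the last fetch."""
    # An identifiable user agent; this name is a made-up example.
    headers = {"User-Agent": "ExampleFeedReader/1.0"}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_feed(url, etag=None, last_modified=None):
    """Fetch a feed conditionally.

    Returns (body, etag, last_modified); body is None when the server
    answers 304 Not Modified and the cached copy is still current.
    """
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return None, etag, last_modified
        raise
```

A caller would store the returned validators next to the cached feed, pass them back on the next poll, and wait at least MIN_POLL_INTERVAL_S between polls of the same feed.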

The Benefits

Today, all MyST Blogsite servers are protected against hackers, referrer spammers, and now, orange alligators.  Our experience managing nefarious requests to hundreds of commercial blogsites is reflected in technologies like MyST SlimeGate and MyST AlliGate that ultimately benefit all of our clients.  This is another example of why our clients turn to us to manage and host their corporate blogsites.  Rather than "going it alone" and fending off hackers, spammers, and alligators, our clients leave the nitty-gritty logistics of operating a high-visibility blogsite to us.  This lets them wisely focus their attention, resources, and expertise on creating the best possible content for their site.  It's a partnership that just makes sense.

Comments

Brilliant!
