Archive for January, 2009

2008 web statistics

January 4th, 2009

I’ve hosted a web server on the internet for more than a decade.  It is a hobby that helps keep my skills sharpened and most of the time I enjoy it very much.  Over the years I have hosted web sites for individuals and small businesses.  Aside from due diligence, one of my interests is to examine the logs and look at trends and statistics.

2008 was the web server’s busiest year yet.  My domain, boche.net, was purely a family/hobbyist site.  The addition of a VMware virtualization blog in mid-October caused boche.net domain traffic to jump by a factor of ten.  I had more traffic in December than in the previous five months combined.  A new high-water mark of well over 1 million hits was set, compared to 323,000 total hits in 2007.

Increased traffic prompted me to upgrade my bandwidth.  Fortunately, I was able to pick up a sponsor late in 2008 – vmpeople.net.  Please visit them if you have a chance – especially if you are searching for a job or a qualified engineer to help you with a project.

Without further delay, here are some of the interesting stats that stood out to me for the year 2008:

Unique visitors 36,932
Number of visits 54,247
Pages 347,179
Hits 1,378,355
Bandwidth 16.77GB

Average busiest day of the week:  Wednesday

Average least busy day of the week:  Saturday

Average busiest hour of the day:  8 am CT

Average least busy hour of the day:  1 am CT

Top 10 visitor domains/countries:

Domain/Country Pages Hits Bandwidth
Unknown (unresolvable IP) 126,583 442,561 5.38GB
Commercial (.com) 71,498 277,979 3.29GB
Network (.net) 55,813 291,385 3.63GB
Netherlands (.nl) 6,186 31,077 364.67MB
China (.cn) 4,051 4,859 11.74MB
Germany (.de) 3,964 15,195 309.21MB
United Kingdom (.uk) 3,759 19,052 349.44MB
South Africa (.za) 3,527 4,840 50.51MB
Australia (.au) 3,068 18,287 335.27MB
Canada (.ca) 2,389 13,966 185.33MB

Top 10 robot/spider visitors (numbers after + are successful hits on robots.txt files):

Robot/Spider Visitor Hits Bandwidth
Yahoo Slurp 24,258+2,691 899.63MB
BaiDuSpider 17,223+71 8.54MB
Googlebot 10,587+262 272.53MB
MSNBot 6,643+2,430 99.14MB
Feedburner 8,492 11.16MB
Unknown robot 5,608+117 314.38MB
MSNBot-media 3,071+1,007 459.43MB
Unknown robot 3,268+542 330.10MB
Voila 2,459+756 59.40MB
Google AdSense 3,091+97 116.51MB

Visits duration (number of visits:  54,247 – average 291 seconds):

Visits duration Number of visits Percent
0s-30s 41,963 77.3%
30s-2mn 3,351 6.1%
2mn-5mn 2,039 3.7%
5mn-15mn 2,175 4%
15mn-30mn 1,179 2.1%
30mn-1h 1,234 2.2%
1h+ 2,306 4.2%

Operating systems (this category deserved a full listing):

Version Hits Percent
Windows 16,452 88.1%
Windows XP 9,638 51.6%
Windows (unknown version) 4 0%
Windows NT 1,775 9.5%
Windows Me 2 0%
Windows Vista 3,909 20.9%
Windows CE 61 0.3%
Windows 95 4 0%
Windows 2003 863 4.6%
Windows 2000 196 1%
BSD 16 0%
FreeBSD 16 0%
Linux 563 3%
Ubuntu 338 1.8%
Suse 33 0.1%
Fedora 57 0.3%
Debian 33 0.1%
Centos 18 0%
GNU Linux (unknown or unspecified distribution) 84 0.4%
Macintosh 1,142 6.1%
Mac OS X 1,142 6.1%
Others 495 2.6%
Unknown 476 2.5%
Sony PlayStation Portable 19 0.1%

Top 10 browsers:

Browser Hits Percent
MS Internet Explorer 846,588 61.4%
Firefox 394,107 28.5%
Safari 55,359 4%
SharpReader (RSS Reader) 3,949 2.4%
Mozilla 24,965 1.8%
Opera 11,305 0.8%
Unknown 6,468 0.4%
NetNewsWire (RSS Reader) 1,169 0%
Konqueror 1,128 0%
Netscape 668 0%

Top 10 referring search engines:

Search Engine Pages Percent Hits Percent
Google 27,817 86.8% 28,424 76.1%
Yahoo! 1,359 4.2% 1,393 3.7%
Windows Live 1,273 3.9% 1,291 3.4%
Google (Images) 463 1.4% 3,628 9.7%
SoSo 442 1.3% 442 1.1%
MSN Search 146 0.4% 149 0.3%
AOL 106 0.3% 108 0.2%
Google (cache) 99 0.3% 1,556 4.1%
Stumbleupon 62 0.1% 102 0.2%
Unknown 57 0.1% 63 0.1%

Top 10 search keywords:

  1. deep
  2. jack
  3. thoughts
  4. handy
  5. by
  6. vmware
  7. handey
  8. esxi
  9. esx
  10. to

Top 10 referring pages (non search engines):

Referring page Pages Hits
http://www.vmware.com/vmtn/planet/v12n/ 1,221 1,226
http://www.petri.co.il/forums/showthread.php 1,137 93,529
http://vmetc.com 489 489
http://blog.scottlowe.org 408 408
http://vmetc.com/2008/12/05/free-tools-with-virtualcenter-like-f… 355 355
http://communities.vmware.com/message/390966 268 268
http://blogs.vmware.com/vmtn/ 262 262
http://twitter.com/home 219 223
http://www.virtualization.info/2008/12/vmware-infrastructure-40-… 191 191
http://blogs.vmware.com/vmtn/2008/12/its-noon-on-wed.html 186 186

Top 10 referrers (non search engines):

  1. vmware.com
  2. petri.co.il
  3. vmetc.com
  4. scottlowe.org
  5. twitter.com
  6. virtualization.info
  7. vmware-land.com
  8. yellow-bricks.com
  9. twitturly.com
  10. ubuntuforums.org

Advanced Web Statistics 6.8 (build 1.910) – Created by awstats

Twitter yourself a job

January 4th, 2009

My local newspaper carried a Wall Street Journal column this morning by Jonnelle Marte entitled Twitter Yourself a Job.  If you feel Twitter is a waste of time (as I did a while back, and sometimes still do depending on my mood), Jonnelle’s story may change your mind.  By the way, I am @jasonboche on Twitter.

New whitepaper documents ESXTOP

January 4th, 2009

Tom Howarth’s blog entry put me on to this new document from Scott Drummond explaining ESXTOP usage and definitions.  Good find Tom – thank you!  Thanks also to Scott Drummond for his creation!  I prefer the PDF version of the document for my offline document library.

This is something we could have used a few years ago.  However, its useful days may be numbered, as I believe the direction of ESX is ultimately ESXi and a consoleless bare-metal hypervisor.

Datacenters need shutdown/startup order

January 1st, 2009

Today I learned of a new blog called Virtual RJ which is owned by Robbert Jan van de Velde (yet another Dutch VMware virtualization enthusiast!).  I was reading an article he had recently written called Making inactive storage active in VirtualCenter.  What hits close to home for me about this article is the need for datacenter playbooks which outline a shutdown/startup order of infrastructure and servers.  Once upon a time, our environment was fairly simple and staff was small.  Although our environment was documented, the need for a formal shutdown/startup order was not so prevalent.  Over the years, staff has grown, new applications have been introduced to the environment, and the number of servers grew into the hundreds.  Not to mention, storage got out of control and with that we brought in SAN infrastructures.

Unless your datacenter is the size of a broom closet, chances are you cannot easily get away with throwing the master power switch to bring up infrastructure and servers in the right order.  Obviously you’re not going to use a power switch to shut everything down ungracefully either, but what may not be so obvious is that a graceful shutdown or startup of servers and infrastructure in random order may not be the best choice considering the health of the environment.

In order to understand the correct shutdown/startup order for your environment, you need to fully understand the web of datacenter dependencies, which can range from simple to highly complex.  Knowing your datacenter dependencies means having good documentation of its components:  servers (including clusters), applications, storage, authentication, network, power, cooling, etc.  Virtualization adds a layer as well, as I will show in a moment.  Let’s look at a few high-level examples of dependencies:

  • Users depend on applications, workstations, network, VDI, etc.
  • Applications depend on databases, network, authentication, storage, other applications, etc.
  • Highly available databases depend on shared storage, clustered servers, etc.
  • Clustered servers depend on shared storage, authentication, network, quorum, etc.
  • Shared storage and network depend on power and cooling.
  • Consolidated virtual infrastructures (including VDI) depend on everything.

The list above may not completely fit your environment, but it should start to get you thinking about what and where the dependencies are in your environment.  Let me re-emphasize that without knowledge of how data flows in your environment, you won’t be able to come up with an accurate dependency tree.  Shutdown and startup orders aside, you’re in a scary position.  Start documenting quickly.  Talk to your peers, developers, managers, etc. to tie your datacenter components together.

So what does the dependency list above mean and how does it translate into a shutdown/startup order?  Well, workstations and VDIs typically have no other systems depending on them and can be shut down first.  Application servers (including VMs) can be shut down next (except for the vCenter server – we’ll need that to shut down VMs and hosts).  Database cluster shutdown follows, with the caveat that not all cluster nodes should be shut down at the same time – stagger the shutdown so as not to hang quorum arbitration and risk corrupting data.  At this point, if all VMs are shut down, we can use vCenter to shut down all ESX/ESXi hosts and then the vCenter server.  Authentication should no longer be needed by now, so let’s shut down the domain controllers.  Getting to the end of the list, we can shut down shared storage, SAN switches, and networking equipment (in that order).  Lastly, we pull the plug on phone systems, Twitter, cooling, and then sever the link to street power.  No really, just kidding – Twitter is not that much of a dependency.  I can quit Twitter any time I want.

Now that we know the shutdown order, the startup order is typically simple:  it is the reverse of the shutdown order.  Example:  Throw the switch for street power.  Engage cooling.  Turn on the PBX.  Fire up the network switches and routers.  SAN switches (go grab a coffee) then shared storage.  Domain controllers, ESX hosts, vCenter, app servers, blah blah blah.  You get the idea.
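For readers who would rather derive the two orders than write them out by hand, here is a minimal sketch in Python.  It assumes the dependencies are written down as a simple map of "component depends on these components" and uses the standard library's graphlib.TopologicalSorter (Python 3.9+) to produce a startup order; the shutdown order is that list reversed.  The component names and the DEPENDS_ON map are hypothetical simplifications of the walkthrough above, not a real environment, and the sketch ignores details such as staggering cluster nodes.

```python
from graphlib import TopologicalSorter

# Hypothetical, simplified dependency map for illustration only.
# Each key depends on every component listed in its set.
DEPENDS_ON = {
    "cooling": {"street power"},
    "network": {"street power", "cooling"},
    "san switches": {"network"},
    "shared storage": {"san switches"},
    "domain controllers": {"network", "shared storage"},
    "esx hosts": {"shared storage", "domain controllers"},
    "vcenter": {"esx hosts"},
    "database cluster": {"shared storage", "domain controllers"},
    "app servers": {"database cluster", "vcenter"},
    "workstations": {"app servers"},
}


def startup_order(depends_on):
    """Return components so that every dependency starts before its dependents."""
    # TopologicalSorter treats each dict value as the set of predecessors,
    # so static_order() yields dependencies first. It raises CycleError
    # if the map contains a circular dependency.
    return list(TopologicalSorter(depends_on).static_order())


def shutdown_order(depends_on):
    """Return the reverse of the startup order: dependents go down first."""
    return list(reversed(startup_order(depends_on)))


if __name__ == "__main__":
    print("Startup: ", " -> ".join(startup_order(DEPENDS_ON)))
    print("Shutdown:", " -> ".join(shutdown_order(DEPENDS_ON)))
```

Where two components do not depend on each other (the database cluster and vCenter in this map, for example), their relative position is not dictated by the dependencies, so treat the output as one valid order rather than the only one.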

Everyone on your staff has both lists above memorized, right?  If not, you need to get it documented in a shutdown/startup playbook.  I don’t feel one needs complex software or hired technical writers to put this together.  If you understand the dependencies, 85% of the work is already done.  My solution for what I put together was embarrassingly simple:  Microsoft Excel.

The tool itself doesn’t need to be incredibly complex, but that doesn’t mean your shutdown/startup order will be as simple.  In the spreadsheet I maintain for my environment, I have a few hundred rows of information and many columns representing branch dependencies.  I also have a few different tabs in the spreadsheet with slightly different orders.  This is because we have multiple SANs and if we’re only shutting down one of the SANs for planned maintenance, we only need to shut down its dependencies and not the entire datacenter including the other SANs.
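The "one SAN only" case can be sketched the same way.  The Python below is a minimal illustration, assuming the spreadsheet rows have been reduced to a "component depends on these components" map; it walks that map to find everything that directly or indirectly depends on a single component, which is the set that has to come down for that maintenance window.  The component names, the DEPENDS_ON map, and the affected_by helper are all hypothetical.

```python
from collections import deque

# Hypothetical two-SAN environment for illustration only.
# Each key depends on every component listed in its set.
DEPENDS_ON = {
    "san-a storage": {"san-a switches"},
    "san-b storage": {"san-b switches"},
    "db cluster 1": {"san-a storage"},
    "esx cluster 1": {"san-a storage"},
    "esx cluster 2": {"san-b storage"},
    "app server 1": {"db cluster 1"},
    "app server 2": {"esx cluster 2"},
}


def affected_by(target, depends_on):
    """Return target plus everything that directly or indirectly depends on it."""
    # Invert the map: for each component, which components depend on it?
    dependents = {}
    for node, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(node)

    # Breadth-first walk outward from the target through its dependents.
    seen = {target}
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


if __name__ == "__main__":
    for name in sorted(affected_by("san-a storage", DEPENDS_ON)):
        print(name)
```

In this made-up map, taking down "san-a storage" touches db cluster 1, esx cluster 1, and app server 1, while everything hanging off SAN B stays up.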

Like many other types of documentation, the shutdown/startup order should be considered a living/breathing document that needs periodic care and feeding.  When new servers, infrastructure, or applications are brought into the environment, this document needs to be updated to remain current.  When datacenter components are removed, again, a document update is needed.  We’ve got a formal server turnover checklist which catches loose ends like this.  Any server that goes into production must have all the items on its checklist completed first (i.e., all documentation complete, added to backup schedule, added to server security plan, etc.).  Likewise, we also maintain a formal server retirement checklist to make sure we’re not trying to back up retired servers or consume static IP addresses of retired servers.

As our team becomes more distributed and expertise is honed to specific areas of the organization, it is important that all staff members responsible for the environment understand the requirements to shut it down quickly or in a planned fashion.  That means good documentation.  Better documentation also means your peers have the tools needed to do your job while you’re gone, and there is less chance you’ll be called in the middle of the night or while on vacation.