Friday, August 25, 2006

Exchange monitoring - The hit list

I am teaching an Exchange 2003 class this week. Yesterday and today, I facilitated a discussion with my students about what they thought were the top items that should be monitored (based partly on our discussions of monitoring and disaster recovery/prevention). It was an interesting and relevant list, and I wanted to post it here for further discussion.

The question essentially was, "independent of a specific monitoring system, what would you want to check, monitor, or identify on your Exchange servers?"
  • Backup related status
    - Did it run successfully? Backup errors?
    - Did transaction logs purge?
    - How long did backups take?
    - Did you back up databases you expected to back up?
  • Check disk space
    - Disk space above warning threshold
    - Disk space growth over time (weekly / monthly)
  • Database (edb and stm) file size growth (weekly / monthly)
  • Queue status / growth (internal and external)
  • Anti-virus system
    - Signatures updated
    - Virus trends (unusual spikes in activity)
  • Online maintenance completion
  • Information store service responding
  • OWA / Web service responding
  • SMTP service responding
  • Test message / measure round-trip times (internal and external)
  • Protocol log file sizes and trends
  • Scan Application logs for common, critical errors:
    - Database / ESE errors (e.g. -1018 errors)
    - Active Directory / DSAccess related errors
  • Critical performance monitor counters:
    - RPC latency
    - Average % disk time
    - CPU usage
    - Available memory within norms
    - Page file usage within norms
  • Physical server / hardware statistics
    - Power supplies functioning
    - Memory / single bit errors
    - Disk / array failures
    - RAID array battery status
    - Temperature alarms
    - Tamper alarms
  • External DNS health (able to resolve external DNS names, and our public MX and A records are correct)
  • Internal DNS health (resolve internal domain and domain controller resources)
  • Internal network health check (perform network check and compare baseline TTLs)
  • Cluster service responding
  • Cluster virtual servers residing on "home" node as expected
  • Daily administration tasks (who has made Exchange related configuration changes)
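
To make a few of these items concrete, here is a minimal Python sketch of three of the checks: disk space against a warning threshold, weekly growth of a file or volume, and whether a service port is accepting connections. It is independent of any monitoring product. The function names, the 80/90 percent thresholds, and the sample format are my own assumptions for illustration, not recommended values. Note that a successful TCP connection only shows that something is listening on the port; it does not prove that SMTP or OWA is actually healthy.

```python
import shutil
import socket


def disk_usage_percent(path):
    """Return the percentage of the volume holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def classify_disk(percent_used, warn_at=80.0, crit_at=90.0):
    """Map a usage percentage to a status; the thresholds are illustrative."""
    if percent_used >= crit_at:
        return "critical"
    if percent_used >= warn_at:
        return "warning"
    return "ok"


def growth_per_week(samples):
    """Average growth per 7 days between the first and last sample.

    `samples` is a list of (day_number, size_in_bytes) tuples, oldest first
    (a hypothetical format, e.g. recorded once a day after the backup).
    """
    (first_day, first_size), (last_day, last_size) = samples[0], samples[-1]
    if last_day == first_day:
        return 0.0
    return (last_size - first_size) / (last_day - first_day) * 7


def port_responding(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `classify_disk(disk_usage_percent("D:\\"))` would report on the volume holding the databases, and `port_responding("mail.example.com", 25)` would be a first-level SMTP check (both the drive letter and the host name are placeholders).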



