A very basic queue monitor

At my office we recently needed a method to quickly know if the queues on any of the Exchange servers were building up. We have monitoring in place, but these guys can sometimes miss a build-up which leaves us with the problem.

As a very rudimentary solution, I compiled the following script.
In a nutshell, it enumerates the message count of all the queues on all Exchange servers in the Org. This includes Exchange 2003 and 2007. The script then measures the sum total of all messages. If it exceeds a predetermined amount, 1000 in my case, it will send a notification message to the administrators.

This is really a catch 22, if the server with the queue build-up is also your relay host, or happens to be the server with the problem. As a workaround for this you could probably do a NET SEND message or use 2 SMTP servers to relay the message through. An alternative is to send an SMS to administrators if you have the facility.

Initially the script waited with a while loop and polled the queues every 5 minutes. I have opted to change that, and launch the script with Task Scheduler instead, which means I don’t have to actually be logged onto the console for the script to run.

The script writes out the date and message count to a log. This log cycles daily.
I know this is very basic, but it gets the job done in terms of what we needed as an interim solution.

You can download the script here:

Whats going on here?

As part of the Exchange audit scripts, I recently changed the 2007 version of the script to use .NET to collect the event logs instead of WMI. Virtu-Al made an interesting suggestion, which was to say, which of these methods are quicker at collecting the logs. So in order to do this I needed to setup a race.
This race would basically involve the two methods of retrieval collecting a large list of events from a selected server. The basic command to accomplish this is as follows: For WMI one would simply use:
$wmi = Get-WmiObject -computer SERVER1 Win32_NTLogEvent  
Using .NET, it retrieves the actual Event Logs, so the entries have to be enumerated with a quick bit of code:
System.Diagnostics.EventLog]::GetEventLogs('SERVER1') ForEach ($eventLog in $eventLogs){ $dotNet += ($eventLog.entries)} 
In both cases, where SERVER1 is the name of the remote server you need to collect the events from. Now, in order to make sure that there is no cheating, I would have to count how many objects are returned by each method. This could be done by simply saving the collection to a variable and counting the total. So in this scenario, .NET would return approximately 56000 items and WMI would return less. About 500+ less every time. From here I went down a crazy path of checking date and time formats etc. and in the end, I came to the conclusion that it had to be the security log. Entries were being written into the Security Log so quickly, that by the time the 2nd script is run, the number of entries have changed, or I remembered that you needed special permissions to read certain Security Log entries. Or so I thought. So I decided to exclude the Security log from my collection. This was easy enough, but still the totals were inconsistent. In an effort to try and eliminate where the problem could be, I decided to include only one log at a time, starting with the Application Log. Here is the script used to collect the Application from a remote server using WMI:
$d1 = get-date

$wmiDate = [System.Management.ManagementDateTimeConverter]::ToDmtfDateTime([DateTime]::Now.AddDays(-1))
$WMI = Get-WmiObject -computer SERVER1 -query ("Select * from Win32_NTLogEvent Where Logfile = 'Application' and TimeWritten >='" + $WmiDate + "'")

$wmiCount = ($WMI).Count

$wmiDT = [System.Management.ManagementDateTimeConverter]::ToDateTime($wmiDate)
Write-Host From Date $wmiDT
Write-Host Total $wmiCount
$d2 = Get-Date
$d2 - $d1
WMI Script results: From Date 09/06/2009 01:28:49 PM Total 317 Here is the script used to collect the same event log entries from the same server, using .NET instead:
$d1 = get-date
$dotNetDate = ([DateTime]::Now.AddDays(-1))
$eventLogs=[System.Diagnostics.EventLog]::GetEventLogs('SERVER1') | where {$_.LogDisplayName -eq "Application"}
ForEach ($eventLog in $eventLogs ){

$dotNet += ($eventLog.entries) | where {($_.TimeWritten -ge $dotNetDate)}
}

$dotnetCount = ($dotNet).count

Write-Host From Date $dotNetDate
Write-Host Total $dotnetCount
$d2 = Get-Date
$d2 - $d1
.NET Script Results:
From Date 09/06/2009 01:28:49 PM Total 650
This was still very confusing so, to see exactly at which record the problem is, I had both scripts display the record number of the first and last record in each respective collection, by adding the following to each script: For the .NET script:
$dotNet | Select-Object -First 1 $dotNet | Select-Object -Last 1 For the WMI script: $WMI | Select-Object RecordNumber, TimeWritten, Type, SourceName, EventCode -First 1 $WMI | Select-Object RecordNumber, TimeWritten, Type, SourceName, EventCode -Last 1 
Now I could see that, at least they were starting at the same record, but for some odd reason, WMI was quitting before the job was done. .NET record results:
Index Time Type Source EventID —– —- —- —— ——- 51 Jun 09 14:55 Warn MSExchange Availa… 4004 705 Jun 10 14:51 Warn MSExchange Active… 1008 WMI Results: RecordNumber TimeWritten Type SourceName ———— ———– —- ———- 353 20090610012624.00000… Warning MSExchange ActiveSync RecordNumber TimeWritten Type SourceName ———— ———– —- ———- 51 20090609145522.00000… Warning MSExchange Availability
To make sure this problem wasn’t specific to the current server I started collecting logs from other servers, to record the results. I also did an add-member on the WMI script to convert the time and date back for easier reading. With the following string:
ForEach-Object { Add-Member -inputobject $_ -Name myTime -MemberType NoteProperty -Value ([System.Management.ManagementDateTimeConverter]::ToDateTime($_.TimeWritten)) -Force -PassThru} 
Over a number of servers this still made no difference, WMI still did not return all the results. This seems to be a problem specific to the Application and Security Log, and could well be related to the WMI impersonation or authentication which will be available in version 2.
This I have not had time to investigate. I decided to re-write the WMI script to collect all results and then filter out the unwanted events with “where-object”. At this point I also changed the selected log to the system event log, as someone cleared the application logs on the selected servers.
This worked great for most of the servers and finally I was getting similar results from both scripts. I did however find, that servers with large numbers of events generate a WMI Quota Violation, which seems to imply that there are too many items in the list, which is yet another blow to WMI.
This could also explain the incomplete results from previous attempts. The Quota Violation is a known problem and there is a resolution for it posted here: http://support.microsoft.com/kb/828653. To get around this problem, I changed the script again, to use the WMI query. So now that we were getting results, it was time to start testing the speed of each method.
I decided to test the speed against 3 different servers, and increment the number of records retrieved until I could not collect anymore, or up to a maximum of 240 days worth of events.
I decided to also give each method and average read time over 3 attempts.
Here are some of the results:

As the amout of days, or number of records increase, the read speed of WMI starts decreasing.

In summary, WMI scales nicely when using a WMI query directly in the Get-WMIObject command. It does however loose speed as the number of records to retrieve start increasing.
It has to be mentioned, that WMI slows down to a crawl, if all records are retrieved and the result is filtered with “where-object”.
Although WMI is faster with less records, I am going base all my event log queries on .NET for now, as WMI proved to be inconsistent and erroneous in what retrieves, or atleast in my testing it did.
I hope that this problem is related to impersonation, and that it is resolved in Powershell v2. The final scripts I used to retrieve the information can be downloaded from here:
 

Update: Exchange 2007 audit script

In an attempt to resolve some issues with regards to the event logs, I have made a few updates to the Exchange 2007 audit script:

* I now use [System.Diagnostics.EventLog]::GetEventLogs() to collect the remote event logs and entries instead of WMI
* The output to the host displays exactly which event log it is busy reading.
* The date range seems more accurate now when the event log contains a large amount of data.
* The physical memory on the basic server information is now displayed as GB and is neatly rounded.
* The Mailbox stores are sorted in alphabetical order by Store Name.
* Added more verbose output to the console while the script runs, to give a better indication of what the script is busy with.

      I hope this resolves most of the problems for now, comments / suggestions are always welcome. The script can be downloaded from here:

      This script has been replaced by a later version, please check the following link, or download the updated version below: