Monday, April 30, 2007

Baseline Counters for Monitoring Exchange Server

You don't need fancy tools to determine what is going on with an Exchange server. We will use the built-in Windows Performance Monitor to do the job.

Click Start, go to Run, type perfmon, and hit Enter (assuming you are doing this with administrator privileges). Performance Monitor is now running in front of you. Let's tune it up and discover what is going on with our Exchange server. At the bottom you will see some default counters already defined: Pages/sec, Avg. Disk Queue Length, and % Processor Time. Let's go ahead and delete them: highlight each one and click the delete symbol at the top of the window. Then click the + sign to add the counters below. When monitoring Exchange, the counters below, each with its baseline value, are good to remember or keep to hand.

We can use these counters to maintain our Exchange server or to find out what the problem is.
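
If you would rather log the same counters from a script than click through the Add Counters dialog, the sketch below writes the counter paths to a text file and builds a typeperf command line for it. This is only a minimal sketch and not part of the procedure above: the file names, the 15-second interval, and the helper names write_counter_file and typeperf_command are assumptions for illustration, and some counter paths may need an instance such as (_Total) or (*) on your server.

```python
# Counter paths as named in this post. Assumption: objects with instances
# (for example Database, Paging File or the disk counters) may need an
# instance qualifier such as (_Total) or (*); check the names shown in the
# Add Counters dialog on your own server. The disk latency counters are
# listed in Performance Monitor as "Avg. Disk sec/Read" and
# "Avg. Disk sec/Write".
COUNTERS = [
    r"\Database\Log Record Stalls/sec",
    r"\Database\Log Threads Waiting",
    r"\MSExchangeIS\RPC Requests",
    r"\MSExchangeIS\RPC Average Latency",
    r"\MSExchangeIS\RPC Operations/sec",
    r"\MSExchangeIS\Virus Scan Queue Length",
    r"\MSExchangeIS Mailbox\Active Client Logons",
    r"\Paging File\% Usage",
    r"\Memory\Available MBytes",
    r"\Memory\Pages/sec",
    r"\Memory\Pool Nonpaged Bytes",
    r"\Memory\Pool Paged Bytes",
    r"\PhysicalDisk\Avg. Disk sec/Read",
    r"\PhysicalDisk\Avg. Disk sec/Write",
]


def write_counter_file(path, counters=COUNTERS):
    """Write one counter path per line, the layout typeperf's -cf option reads."""
    with open(path, "w") as handle:
        handle.write("\n".join(counters) + "\n")
    return path


def typeperf_command(counter_file, output_csv, interval_seconds=15, samples=240):
    """Build (but do not run) a typeperf command that logs the counters to CSV."""
    return [
        "typeperf",
        "-cf", counter_file,            # file with one counter path per line
        "-si", str(interval_seconds),   # seconds between samples
        "-sc", str(samples),            # number of samples to collect
        "-f", "CSV",
        "-o", output_csv,
    ]


if __name__ == "__main__":
    counter_file = write_counter_file("exchange_counters.txt")
    print(" ".join(typeperf_command(counter_file, "exchange_baseline.csv")))
```

The script only prints the command, so you can review it before running it on the Exchange server itself.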

•Database\Log Record Stalls/sec

The average should be below 10 per second and the maximum should not be higher than 100 per second (indicates the number of log records that cannot be written because the buffers are full. Note that Exchange Server 2000 defaults to 84 buffers whilst Exchange Server 2003 defaults to 512).

•Database\Log Threads Waiting

Average should be below 10 (indicates the number of threads waiting to complete an update to the database by writing their data to the log; if too high, the log may be a bottleneck).

•MSExchangeIS\RPC Requests

Should be below 30 at all times (indicates the number of MAPI requests being serviced by the Microsoft Exchange Information Store service; the default maximum is 100).

•MSExchangeIS\RPC Average Latency

Should be below 50ms at all times and should be in the 10-25ms range on a healthy server (averaged over the last 1024 packets and affects how long it takes for a user's view to change in Outlook).

•MSExchangeIS\RPC Operations/sec

Should rise and fall with MSExchangeIS\RPC Requests (indicates how many RPC operations are being requested and actually responded to).

•MSExchangeIS\Virus Scan Queue Length

If this is consistently high, consider a hardware upgrade (indicates the number of outstanding requests queued for virus scanning).

•MSExchangeIS Mailbox\Active Client Logons

This is server-specific but should be baselined and monitored (indicates the number of clients that performed any action within the last 10 minutes).

•Paging File\% Usage

Should remain below 50%; high values indicate that the paging file size should be increased or more RAM added to the server (indicates the percentage of the paging file in use).

•Memory\Available Mbytes (MB)


At least 50 MB should be available at all times (indicates the amount of physical memory immediately available to a process).

•Memory\Pages/sec


Below 1000 at all times (indicates the rate at which pages are read from or written to disk to resolve hard page faults).

•Memory\Pool Nonpaged Bytes


No more than 100 MB (indicates the amount of memory used for kernel objects that must remain in memory and cannot be written to disk).

•Memory\Pool Paged Bytes

No more than 180 MB, unless a backup or restore is taking place (indicates the amount of memory used for kernel objects that can be written to disk when they are not in use).

•PhysicalDisk\Avg. Disk sec/Read


Average below 20ms and maximum below 100ms for the database volume, average below 5ms and maximum below 50ms for the transaction log volume, average below 10ms and maximum below 50ms for the SMTP queue volume (indicates the average time to read data from the disk).

•PhysicalDisk\Avg. Disk sec/Write


Average below 20ms and maximum below 100ms for the database volume, average below 10ms and maximum below 50ms for the transaction log volume, average below 10ms and maximum below 50ms for the SMTP queue volume (indicates the average time to write data to the disk).
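
Once you have collected some samples, the thresholds above are easy to check in a few lines of script. The sketch below covers only a subset of the counters (the ones with a simple upper limit), and the names THRESHOLDS and check, along with the sample values, are hypothetical. Note that it flags a sample that reaches the limit, which is slightly stricter than "not higher than 100" for log record stalls.

```python
from statistics import mean

# (average limit, maximum limit) taken from the list above.
# None means the post gives no limit of that kind for the counter.
THRESHOLDS = {
    r"Database\Log Record Stalls/sec": (10, 100),
    r"Database\Log Threads Waiting": (10, None),
    r"MSExchangeIS\RPC Requests": (None, 30),
    r"MSExchangeIS\RPC Average Latency": (None, 50),   # milliseconds
    r"Paging File\% Usage": (None, 50),                # percent
    r"Memory\Pages/sec": (None, 1000),
}


def check(counter, samples):
    """Return a list of problems found in the samples for one counter."""
    avg_limit, max_limit = THRESHOLDS[counter]
    problems = []
    average = mean(samples)
    peak = max(samples)
    if avg_limit is not None and average >= avg_limit:
        problems.append(f"average {average:.1f} reached the limit of {avg_limit}")
    if max_limit is not None and peak >= max_limit:
        problems.append(f"maximum {peak:.1f} reached the limit of {max_limit}")
    return problems


if __name__ == "__main__":
    # Made-up samples for illustration only.
    observed = {
        r"MSExchangeIS\RPC Average Latency": [12, 18, 25, 410],
        r"Database\Log Threads Waiting": [1, 0, 2, 1],
    }
    for counter, samples in observed.items():
        issues = check(counter, samples)
        print(counter, "->", "; ".join(issues) if issues else "within baseline")
```

Keeping the limits in one table makes it easy to replace them with your own baseline numbers once you know what is normal for your server.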

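The note that MSExchangeIS\RPC Operations/sec should rise and fall with MSExchangeIS\RPC Requests can also be checked roughly from logged samples. The sketch below uses a plain Pearson correlation as one possible way to do that; the function names and the 0.5 cut-off are assumptions for illustration, not a documented rule.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sample series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a flat series cannot rise or fall with anything
    return covariance / (spread_x * spread_y)


def moves_together(rpc_requests, rpc_operations, minimum=0.5):
    """True when the two series are positively correlated above the cut-off.

    The cut-off of 0.5 is an arbitrary assumption; adjust it to your baseline.
    """
    return pearson(rpc_requests, rpc_operations) >= minimum


if __name__ == "__main__":
    # Made-up samples: requests pile up while operations/sec drops.
    requests = [4, 6, 9, 15, 28]
    operations = [900, 850, 600, 300, 120]
    print("moving together:", moves_together(requests, operations))
```

If requests climb while operations per second fall, as in the made-up samples, the store is queuing work rather than completing it.
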
2 comments:

Scott said...

Nice to have all these items in one focused location. Good reference.

Oz Ozugurlu said...

Thanks, Scott. Nice to have you here.

MSExchangeIS\RPC Average Latency
This counter is one of my favorite interview questions for gauging a candidate's interest in Exchange. I remember when we were pounded with calls; users were opening tickets with our help desk about the famous Christmas balloon “Exchange is retrieving data from Server”. We figured it out fast.
The RPC Average Latency was hitting the roof, above 400, and we were also able to identify that our backup person was running a backup at 1 PM (-: , believe it or not.
When managers asked me what was going on with Exchange, I was able to pinpoint (show) the counter and tell them: stop your backup guy, he is killing my Exchange server; unless these numbers go below 30 there is nothing I can do.
Knowing some of these thresholds from experience is great.
Best Regards
oz