Monitoring Exchange Server 2010: Using event logs, ExBPA
Solution provider's takeaway: Being proactive about the health of Exchange Server 2010 will make life easier when trying to solve any issues for customers down the road. Performing tasks such as event log monitoring or using the Exchange Best Practices Analyzer (ExBPA) tool will be vital to keeping Exchange Server 2010 healthy.
Verify Exchange Server Health
A large part of being proactive in managing your Exchange environment is knowing where your servers stand in terms of health. This section discusses various things that you need to keep an eye on to help ensure that your servers are healthy.
Monitor the Event Logs
Windows components and applications use the event logs to record critical alerts and notifications that may be of interest to system administrators, and Exchange Server 2010 does the same. Exchange records most of its events in the Application log. You may see some events recorded elsewhere, but the majority of the events that you need to be concerned about will appear in the Application log.
As part of your responsibilities as an Exchange Server 2010 administrator, it's vital to check the event logs on each Exchange server for problems that have gone undetected and for events that could become big issues in the future. You will primarily want to keep an eye out for any Warning or Error events, as they indicate problems that the server is currently having or could have.
View Relevant Events
To view the Application event log, follow these steps:
- Click Start ⇒ Administrative Tools ⇒ Event Viewer. This launches the Event Viewer application that is built into Windows.
- In the Event Viewer Console tree, browse to the Windows Logs ⇒ Application node. The event logs for Exchange (and other processes) are displayed in the Results pane. You can search through the log line by line or you can create a filter.
- If you want to filter out everything except for the Exchange logs, click the Filter Current Log task in the Actions pane on the right.
- In the Filter Current Log dialog box, select the Critical, Error, and Warning check boxes. These events will indicate that something is wrong with Exchange or that something may soon break.
- While still in the Filter Current Log dialog box, click on the drop-down list for the Event Sources field and select the relevant events that begin with MSExchange from the list, as shown in Figure 11.10. Click OK when finished.
Figure 11.10: Filtering out everything except for the Exchange logs in Event Viewer
- Back in the Event Viewer dialog box, you can now view only the events relevant to Exchange.
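The same filter logic can be sketched in code for events that have already been exported from the log. This is only an illustration: the record shape (a dict with `level` and `source` keys), the function name, and the sample records are assumptions made for the example, not an Event Viewer export format.

```python
# Levels selected in the Filter Current Log dialog box.
ALERT_LEVELS = {"Critical", "Error", "Warning"}


def filter_exchange_events(events, source_prefix="MSExchange"):
    """Keep Critical, Error, and Warning events whose source starts with the prefix."""
    return [
        event
        for event in events
        if event["level"] in ALERT_LEVELS
        and event["source"].startswith(source_prefix)
    ]


# Made-up sample records, used only to exercise the filter.
sample_events = [
    {"level": "Error", "source": "MSExchangeIS"},
    {"level": "Information", "source": "MSExchangeTransport"},
    {"level": "Warning", "source": "Service Control Manager"},
    {"level": "Warning", "source": "MSExchangeTransport"},
]

for event in filter_exchange_events(sample_events):
    print(event["level"], event["source"])
```

Matching on the MSExchange prefix mirrors selecting every event source that begins with MSExchange in the dialog box, so new Exchange sources are picked up without editing a list.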
Specify the Level of Logging Detail
If you find that you need more detail than what is provided in the Application logs, you can turn the dial up on what Exchange logs in the Application log. To increase logging, use the Set-EventLogLevel cmdlet in the EMS. You will need to specify the logging category that you want to change and the level that you want to set it to.
In the following example, we will check and change the log level for the RPC Calls category of MSExchangeSA:
- To determine what component to enable higher logging on and to determine the current logging level, run the following command:
  Get-EventLogLevel
- The Get-EventLogLevel command displays information about each component. Use the built-in PowerShell filtering capabilities to narrow down this list to display only event log categories that have the characters rpc in the identity:
  Get-EventLogLevel *rpc*
- To specify a logging level of High for the MSExchangeSA\RPC Calls category, use the following command:
  Set-EventLogLevel "MSExchangeSA\RPC Calls" -Level High
Monitor Disk Space on Database and Log Drives
The amount of free disk space is an important thing to monitor, particularly on the volumes that contain the database files or the transaction log files. On Mailbox servers, when the volume that contains the database gets full, the database will be dismounted, which prevents users from accessing their mailboxes on that database. Dismounting the database is how Exchange protects the integrity of the data, as it cannot write additional data to the database if there is no space to do so. The database is dismounted when there is 2 MB of disk space left on the volume.
When the database is dismounted due to the disk being full, Exchange will log an event in the Application log with event ID 1003, as shown in Figure 11.11.
Figure 11.11: Event ID 1003 is logged when the database volume is out of free space.
Before the database can be mounted again, you must free up some space on the volume. There are multiple ways to reclaim space:
- Back up the server and allow the transaction logs to truncate.
- Back up the server and permanently delete any mailboxes that may be stored in deleted mailbox retention.
- Perform an offline defragmentation of the database using the ESEUTIL /D command. This may take some time to complete.
- If you're using a SAN-based volume, you can grow the size of the LUN that is presented to the Exchange server.
- Back up any extraneous data or personal files and delete them or move them to a more appropriate server.
- Move any transaction logs that have already been committed to another volume.
On Transport servers, when the disk that contains the message queue database and logs nears capacity, Exchange applies back pressure, which instructs Exchange to stop accepting new connections and potentially stops all message flow. By default, the Transport servers require at least 500 MB of disk space free on the volumes that contain the queue database and logs, so you should monitor the free disk space on those locations.
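A free-space check against that 500 MB figure could look something like the sketch below. The function names are hypothetical, 1 MB is assumed to mean 1,048,576 bytes, and the path passed in is a placeholder for wherever the queue database and logs actually live.

```python
import shutil

MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def is_low_on_space(free_bytes, required_mb=500):
    """Return True when free space is below the required number of megabytes."""
    return free_bytes < required_mb * MB


def check_volume(path, required_mb=500):
    """Report free space on the volume that holds `path` against the threshold."""
    free_bytes = shutil.disk_usage(path).free
    return {
        "path": path,
        "free_mb": free_bytes // MB,
        "low": is_low_on_space(free_bytes, required_mb),
    }


if __name__ == "__main__":
    # "." is a placeholder; point this at the queue database or log folder.
    print(check_volume("."))
```

In practice you would likely alert at a margin well above the threshold, for example by passing a larger `required_mb`, so that there is time to act before message flow is affected.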
If you get into the situation of being low on disk space on your Transport server and back pressure is being applied, alleviate the problem using one of the following methods:
- Free up disk space on the Transport server by removing extraneous data.
- Move the queue database and logs to a separate volume with more space available.
- Modify the threshold numbers used to determine when to apply back pressure.
When back pressure is applied or relinquished, the Transport servers will log events in the Application log with event IDs of 15004 and 15005. You can monitor the Application log for these events on your Transport servers to detect when back pressure is being applied.
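If event IDs are being collected by a script or monitoring tool, a small watch list keeps the disk-related IDs from this section in one place. The sketch below is illustrative only: the names are hypothetical and the descriptions simply paraphrase the text above.

```python
# Event IDs named in this section; descriptions paraphrase the surrounding text.
WATCHED_EVENT_IDS = {
    1003: "Database dismounted because the database volume is out of free space",
    15004: "Back pressure applied or relinquished on a Transport server",
    15005: "Back pressure applied or relinquished on a Transport server",
}


def describe_watched_events(event_ids):
    """Return (id, description) pairs for the IDs that are on the watch list."""
    return [
        (event_id, WATCHED_EVENT_IDS[event_id])
        for event_id in event_ids
        if event_id in WATCHED_EVENT_IDS
    ]


# Example: only the watched IDs are reported; the others are ignored.
for event_id, description in describe_watched_events([1, 15004, 1003]):
    print(event_id, description)
```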
Ensure That Services Are Running
The various components of Exchange run as services in Windows. Not all of the services need to be running in order for Exchange to be functional, however. Certain services only need to be started if Exchange is using a feature that relies on them, such as POP or IMAP. In fact, one of the best practices in hardening servers is to disable services that you are not required to run.
There are core services that need to be running in Exchange in order for an Exchange server in a particular role to function correctly. You should monitor these services to ensure that they are running. Many problems are attributable to a service that has stopped running for one reason or another. If you know when a critical service stops, you can respond rapidly to get the problem resolved.
Table 11.3 lists the services that Exchange Server 2010 uses and identifies which services are critical for each role.
Table 11.3: Critical Services That Need to Remain Running for Each Role
Service | Mailbox | Client Access | Hub Transport | Edge Transport |
IIS Admin Service | Yes | Yes | Yes | No |
Microsoft Exchange Active Directory Topology | Yes | Yes | Yes | No |
Microsoft Exchange ADAM | No | No | No | Yes |
Microsoft Exchange Credential Service | No | No | No | Yes |
Microsoft Exchange EdgeSync | No | No | Yes | No |
Microsoft Exchange Information Store | Yes | No | No | No |
Microsoft Exchange Mailbox Assistants | Yes | No | No | No |
Microsoft Exchange Address Book | No | Yes | No | No |
Microsoft Exchange Forms-Based Authentication Service | No | Yes | No | No |
Microsoft Exchange File Distribution | No | Yes | No | No |
Microsoft Exchange Mail Submission | Yes | No | No | No |
Microsoft Exchange Mailbox Replication | Yes | Yes | No | No |
Microsoft Exchange Protected Service Host | No | Yes | No | No |
Microsoft Exchange RPC Client Access | Yes | Yes | No | No |
Microsoft Exchange System Attendant | Yes | No | No | No |
Microsoft Exchange Search Indexer | Yes | No | No | No |
Microsoft Exchange Service Host | Yes | Yes | Yes | Yes |
Microsoft Exchange Throttling | Yes | No | No | No |
Microsoft Exchange Transport | No | No | Yes | Yes |
Microsoft Exchange Transport Log Search | Yes | No | Yes | No |
World Wide Web Publishing Service | Yes | Yes | Yes | No |
Windows Remote Management | Yes | Yes | Yes | No |
To determine if the required services for each role are running, you can execute the Test-ServiceHealth cmdlet in the EMS. You do not need to include any parameters.
The Test-ServiceHealth cmdlet will return the list of roles that are running on the Exchange server along with a list of the services for those roles. The cmdlet identifies the services that are running as well as the services that are not running but should be.
The following output demonstrates what is returned by the command when the Mail Submission service is stopped on a Mailbox server:
Role                    : Mailbox Server Role
RequiredServicesRunning : False
ServicesRunning         : {IISAdmin, MSExchangeADTopology, MSExchangeIS, MSExchangeMailboxAssistants, MSExchangeRepl, MSExchangeRPC, MSExchangeSA, MSExchangeSearch, MSExchangeServiceHost, MSExchangeThrottling, MSExchangeTransportLogSearch, W3Svc, WinRM}
ServicesNotRunning      : {MSExchangeMailSubmission}
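If this output is captured as text, for example in a log file written by a scheduled task, the stopped services can be pulled out with a couple of regular expressions. The following is a rough sketch under that assumption; the function name and the returned dictionary shape are hypothetical, and it handles the output for a single role.

```python
import re


def parse_service_health(text):
    """Extract the health flag and the stopped services from one role's output."""
    running_flag = re.search(r"RequiredServicesRunning\s*:\s*(True|False)", text)
    not_running = re.search(r"ServicesNotRunning\s*:\s*\{([^}]*)\}", text)
    stopped = []
    if not_running:
        # Split the comma-separated list and drop empty entries (for "{}").
        stopped = [
            name.strip()
            for name in not_running.group(1).split(",")
            if name.strip()
        ]
    return {
        "healthy": bool(running_flag) and running_flag.group(1) == "True",
        "stopped": stopped,
    }


# Trimmed copy of the output shown above.
sample_output = """
Role                    : Mailbox Server Role
RequiredServicesRunning : False
ServicesNotRunning      : {MSExchangeMailSubmission}
"""

print(parse_service_health(sample_output))
```

Because the patterns match on the field names rather than on line positions, the parser does not depend on how the console wraps the long ServicesRunning list.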
Use the Test Cmdlets in the Exchange Management Shell
Exchange Server 2010 provides several cmdlets in the Exchange Management Shell that are focused on testing the functionality and configuration of Exchange. The list of test cmdlets has grown in comparison to those available with Exchange Server 2007, and there are several useful ones that can make your job as an Exchange administrator a lot easier. Table 11.4 describes the available test cmdlets. You may have seen some of these cmdlets used throughout this book when working with certain aspects of Exchange.
Table 11.4: The Test-* Cmdlets in Exchange Server 2010
Test-ActiveSyncConnectivity | Tests mobile device connectivity through ActiveSync. The cmdlet attempts to synchronize the mobile device that you specify in the command. |
Test-EcpConnectivity | Tests access to the Exchange Control Panel on a Client Access server that you specify. |
Test-EdgeSynchronization | Tests the synchronization of Edge Transport servers. |
Test-FederationTrust | Tests the configuration of the federation trust with the Microsoft Federation Gateway. |
Test-FederationTrustCertificate | Tests the certificate used for your federation trust. |
Test-ImapConnectivity | Tests the connectivity of one or more IMAP clients. |
Test-IPAllowListProvider | Tests that the configured IP allow list provider is available and checks an IP address against it. |
Test-IPBlockListProvider | Tests that the configured IP block list provider is available and checks an IP address against it. |
Test-IRMConfiguration | Tests the configuration of Rights Management in Exchange. |
Test-Mailflow | Tests whether mail can be sent to and from mailbox servers in the Exchange organization. |
Test-MapiConnectivity | Tests that a mailbox can be logged into. If run against a database, it tests that the system mailbox for the database can be logged into. |
Test-Message | Submits a test message to the specified recipients. This can be used to test transport rules and have a report generated about the tests. |
Test-MRSHealth | Tests to ensure that the Mailbox Replication Service is running properly. |
Test-OutlookConnectivity | Thoroughly tests the connectivity of Outlook by testing profile creation, AutoDiscover, and mailbox access. |
Test-OutlookWebServices | Tests that AutoDiscover is returning the correct configuration information for a user and tests each of the service endpoints returned by AutoDiscover. |
Test-OwaConnectivity | Tests that Outlook Web App can be contacted and successfully logged into. |
Test-PopConnectivity | Tests the connectivity of one or more POP clients. |
Test-PowerShellConnectivity | Tests that PowerShell can be used remotely and can successfully issue commands. |
Test-ReplicationHealth | Tests multiple aspects of replication for a server in a DAG. |
Test-SenderId | Tests sender ID checking against an IP address and domain that you specify. |
Test-ServiceHealth | Tests that the services for each Exchange role installed are running. |
Test-SystemHealth | Tests the overall health of the Exchange server through multiple tests. |
Test-WebServicesConnectivity | Tests the functionality of Exchange Web Services through the use of Outlook Anywhere. |
Tip:
The test cmdlets don't need to always be run on demand. You can choose a few of them that you want to run on a regular basis and create scheduled tasks out of them. For information on creating scheduled tasks from PowerShell scripts, refer to Chapter 2, "Using the Exchange Management Console and the Exchange Management Shell."
When running some of these test cmdlets, you may be required to have a specific test account created beforehand. To create this account, use the following steps:
- Open the EMS and browse to the Scripts folder in the location where Exchange is installed. By default, this location is C:\Program Files\Microsoft\Exchange Server\v14\Scripts.
- Run the PS1 script called New-TestCasConnectivityUser.ps1.
- When prompted for a password, type a temporary password and press Enter. The password is used only to create the test account, so you do not need to remember it.
- When prompted to continue creating the test user, press Enter. The test user is automatically created. When the test account has been created, the script will end and you will be returned to the EMS command prompt.
Use the Exchange Best Practices Analyzer
The Exchange Best Practices Analyzer (ExBPA) is a powerful tool in the Exchange administrator's toolbox that should be run on a regular basis. The ExBPA can perform a variety of tests that help ensure the health of your Exchange organization. In this section, I will show you how to run a health check.
The ExBPA health check component performs a variety of tests against your Exchange servers and presents the results in an easy-to-read report. When reviewing the report, you will be presented with the critical issues encountered and given the opportunity to read more about why the issue was detected and how to correct it.
To perform a health check with the ExBPA, use the following steps:
- Open the Exchange Best Practices Analyzer. You can do this by opening the EMC and browsing to the Toolbox node in the Console tree. Under the Configuration Management Tools portion of the Toolbox, double-click on Best Practices Analyzer.
- If this is the first time you are running the BPA, you will be presented with a welcome screen. Decide whether you want to join the Microsoft Customer Experience Improvement Program and then click Go To The Welcome Screen.
- At the Welcome screen, select the option Select Options For A New Scan.
- On the Connect To Active Directory screen, type the name of the domain controller you want to connect to and click Connect To The Active Directory Server.
- If you want to use different credentials than what you are currently logged in as for communication with Active Directory, click Show Advanced Login Options and enter the credentials that you want to use.
- Your connectivity and access permissions are verified before continuing.
- On the Start A New Best Practices Scan screen, enter a name for the scan and select Health Check from the list of scans to perform.
- If you only want to scan specific Exchange servers, you can select those servers from the Specify The Scope For This Scan list.
- After you have configured your options, click Start Scanning, as shown in Figure 11.12.
Figure 11.12: Configuring the BPA to perform a health check
- On the Scanning In Progress screen, the scan is performed. The amount of time that the scan takes to complete will vary depending on how many servers you are scanning and the speed of your network.
- After the scan completes, you will be taken to the Scanning Complete screen. Select the option View A Report Of This Best Practices Scan.
- View the results of the scan and take any necessary action on reported issues by selecting the option Tell Me More About This Setting, as you can see in Figure 11.13.