Wednesday, March 25, 2009

Querying the MSCS Cluster Log using Log Parser and Powershell

When encountering problems with a Windows High Availability Cluster, the 1st place one would look would be in Cluster Administrator and then in the Windows Event Logs. Seldom though have I seen the Cluster logs being looked at. This log (in Windows 2003 MSCS) is an anomaly in the Microsoft world. For one, its size is configured in a System Variable. Another odd behaviour is that when the log is full, the 1st half is truncated and then used. Why not just write everything to the Event Logs and have a separate log for the Cluster Service? And lastly, the format of this log even defies Log Parser, the king of making sense of large text file logs.

Enter Log Parser and Powershell.

I've written a Powershell script to:

  1. Prompt the user for the Cluster Server name with an input box
  2. Get the contents of the Windows cluster log in the default location
  3. Scrub the log
  4. Export the cleaned log to a log file
  5. Create a LogParser Object
  6. Use this LogParser Object to query the cleaned log file and only return Warnings and Errors
  7. Output these Warnings and Errors to a .csv file

Here is the Code:

$a = new-object -comobject MSScriptControl.ScriptControl
$a.language = "vbscript"
$a.addcode("function getInput() getInput = inputbox(`"Enter Cluster Server Name`",`"Cluster Log Viewer`") end function" )
$s = $a.eval("getInput")
$gcs = '\\'+$s+'\c$\windows\cluster\cluster.log'
$var = gc $gcs
$var=$var -replace 'Code0000', "Code`r`n0000"
$var=$var -replace 'ERR ', 'ERR '
$var=$var -replace 'Volume Manager', '[VMg]'
$var=$var -replace 'ERR SQL Server Agent', 'ERR [SQA]'
$var=$var -replace 'ERR IP Address', 'ERR [IPA]'
$var=$var -replace 'WARN Network Name', 'WARN [NNm]'
$var=$var -replace 'ERR Network Name', 'ERR [NNm]'
$f = 'c:\'+$s+'Clean.log'
$var ac $f
$cl = 'c:\'+$s+'Clean.csv'
$myQuery = new-object -com MSUtil.LogQuery
$objInputFormat = New-Object -com MSUtil.LogQuery.TextLineInputFormat
$objOutputFormat = New-Object -com MSUtil.LogQuery.CSVOutputFormat
$strQuery = "select EXTRACT_TOKEN(Text, 0, ' ') as StringDate, EXTRACT_TOKEN(Text,1, ' ') AS Sev, EXTRACT_TOKEN(Text,2, ' ') AS Source, EXTRACT_TOKEN(Text,3, ' ') AS D1, EXTRACT_TOKEN(Text,4, ' ') AS D2, EXTRACT_TOKEN(Text,5, ' ') AS D3, EXTRACT_TOKEN(Text,6, ' ') AS D4, EXTRACT_TOKEN(Text,7, ' ') AS D5, EXTRACT_TOKEN(Text,8, ' ') AS D6, EXTRACT_TOKEN(Text,9, ' ') AS D7, EXTRACT_TOKEN(Text,10, ' ') AS D8, EXTRACT_TOKEN(Text,11, ' ') AS D9, EXTRACT_TOKEN(Text,12, ' ') AS D10, EXTRACT_TOKEN(Text,13, ' ') AS D11, EXTRACT_TOKEN(Text,14, ' ') AS D12, EXTRACT_TOKEN(Text,15, ' ') AS D13, EXTRACT_TOKEN(Text,16, ' ') AS D14, EXTRACT_TOKEN(Text,17, ' ') AS D15, EXTRACT_TOKEN(Text,18, ' ') AS D16, EXTRACT_TOKEN(Text,19, ' ') AS D17, EXTRACT_TOKEN(Text,20, ' ') AS D18, EXTRACT_TOKEN(Text,21, ' ') AS D19, EXTRACT_TOKEN(Text,22, ' ') AS D20, EXTRACT_TOKEN(Text,23, ' ') AS D21, EXTRACT_TOKEN(Text,24, ' ') AS D22, EXTRACT_TOKEN(Text,25, ' ') AS D23, EXTRACT_TOKEN(Text,26, ' ') AS D24 INTO "+$cl+" from "+$f+" WHERE EXTRACT_TOKEN(Text,1, ' ') NOT LIKE '%INFO%'"
$myQuery.ExecuteBatch($strQuery, $objInputFormat, $objOutputFormat)

Even this process doesn't render a clean report. I then use SQL to bulk insert this .csv file into a SQL table and then further manipulate the data. I guess this could have been done in Powershell, its just with text manipulation I'm more familiar with T-SQL.

CREATE TABLE [dbo].[ClusterLog](
[StringDate] [nvarchar](50) NULL,
[Sev] [nvarchar](50) NULL,
[Source] [nvarchar](50) NULL,
[D1] [nvarchar](50) NULL,
[D2] [nvarchar](50) NULL,
[D3] [nvarchar](50) NULL,
[D4] [nvarchar](50) NULL,
[D5] [nvarchar](50) NULL,
[D6] [nvarchar](50) NULL,
[D7] [nvarchar](50) NULL,
[D8] [nvarchar](50) NULL,
[D9] [nvarchar](50) NULL,
[D10] [nvarchar](50) NULL,
[D11] [nvarchar](50) NULL,
[D12] [nvarchar](50) NULL,
[D13] [nvarchar](50) NULL,
[D14] [nvarchar](50) NULL,
[D15] [nvarchar](50) NULL,
[D16] [nvarchar](50) NULL,
[D17] [nvarchar](50) NULL,
[D18] [nvarchar](50) NULL,
[D19] [nvarchar](50) NULL,
[D20] [nvarchar](50) NULL,
[D21] [nvarchar](50) NULL,
[D22] [nvarchar](50) NULL,
[D23] [nvarchar](50) NULL,
[D24] [nvarchar](50) NULL

bulk insert ClusterLog FROM 'D:\Powershell\SQL3clean.csv'

UPDATE ClusterLog
SET Source = REPLACE(REPLACE(Source, CHAR(91), ''), ']', '')

select 'SQLCL3' AS Cluster
,DATEADD(hh, 2, REPLACE(SUBSTRING(StringDate, PATINDEX('%::%',StringDate)+2, 19), '-', ' ')) AS TimeLogged
,"Severity" = CASE
WHEN Sev LIKE 'WARN' Then 'Warning'
,"Source" = CASE
WHEN Source LIKE 'API' THEN 'API support'
WHEN Source LIKE 'ClMsg' THEN 'Cluster messaging'
WHEN Source LIKE 'ClNet' THEN 'Cluster network engine'
WHEN Source LIKE 'CP' THEN 'Checkpoint Manager'
WHEN Source LIKE 'CS' THEN 'Cluster service'
WHEN Source LIKE 'DM' THEN 'Database Manager'
WHEN Source LIKE 'EP' THEN 'Event Processor'
WHEN Source LIKE 'FM' THEN 'Failover Manager'
WHEN Source LIKE 'GUM' THEN 'Global Update Manager'
WHEN Source LIKE 'INIT' THEN 'Initial state'
WHEN Source LIKE 'JOIN' THEN 'The node state that follows INIT when the node attempts to join a cluster'
WHEN Source LIKE 'LM' THEN 'Log Manager. Maintains the quorum log.'
WHEN Source LIKE 'MM' THEN 'Membership Manager, also known and written to the cluster log as Regroup (RGP)'
WHEN Source LIKE 'NM' THEN 'Node Manager. Keeps track of the state of other nodes in the cluster'
WHEN Source LIKE 'OM' THEN 'Object Manager. Maintains an in-memory database of entities, or objects (nodes, networks, groups)'
WHEN Source LIKE 'RGP' THEN 'Regroup, Tracks which nodes are members of the cluster'
WHEN Source LIKE 'RM' THEN 'Resource Monitor'
WHEN Source LIKE 'SQA' THEN 'SQL Server Agent'
WHEN Source LIKE 'NNm' THEN 'Network Name'
WHEN Source LIKE 'IPA' THEN 'IP Address'
WHEN Source LIKE 'VMg' THEN 'Volume Manager'
ELSE Source
,D1 + ' ' + D2 + ' ' + D3 + ' '+ D4+ ' '+ D5 + ' '+ D6 + ' '+ D7 + ' '+ D8 + ' '+ D9 + ' '+ D10 + ' '+ D11 + ' '+ D12 + ' '+ D13 + ' '+ D14 + ' '+ D15 + ' '+ D16 + ' '+ D17 + ' '+ D18 + ' '+ D19 + ' '+ D20 + ' '+ D21 + ' '+ D22 + ' '+ D23 + ' '+ D24 AS ErrorDescription
from ClusterLog
WHERE DATEADD(hh, 2, REPLACE(SUBSTRING(StringDate, PATINDEX('%::%',StringDate)+2, 19), '-', ' ')) > getdate()-2

Above is what the Cluster Log will look like, which is finally useful. Also, with the data being in a SQL Table, we could automate the entire process above and display Cluster Log errors and warnings for the last x days in a Reporting Services report.

1 comment:

  1. This may be obvious to most folks, but many are unfamiliar with SQL Server Integration Services. An SSIS package could be created to do this process automatically on a schedule... something very useful to provide backup automation along with the rest of your high availability databases of your logs (the above can be expanded to include a slew of other logs, W3C IIS logs, COM+, MS DTC, DFS, etc.).