Friday, June 12, 2009

Gathering Boot Events and Calculating and Reporting Uptime

Uptime! Part of the Trinity, along with Single Sign On and ..... Ok, so it's a big topic. Something clever people write theses and PhDs about.

In our case we have a reasonably simple requirement: we need 99% uptime for our servers, measured during support hours on weekdays, excluding public holidays. Great.

Let's start with the data we will use to calculate uptime. We are using the Windows event logs, in particular the System event log and EventIDs 6006, 6008 and 6009. So first I grab these events from a server's event log using Log Parser. The reason I use Log Parser is that I can filter on EventID easily. Herewith the PS code to get the last 45 days' worth of boot events:

function Get-BootEvents ($svr) {
    $evt = 'D:\Data\BootEvents\'+$svr+'_BootEvents.csv'
    $myQuery = new-object -com MSUtil.LogQuery
    $objInputFormat = New-Object -com MSUtil.LogQuery.EventLogInputFormat
    $objOutputFormat = New-Object -com MSUtil.LogQuery.CSVOutputFormat
    $strQuery = "SELECT ComputerName, TimeGenerated, EventID, EventTypeName, Message INTO "+$evt+" FROM \\"+$svr+"\System WHERE TimeWritten >= SUB( TO_LOCALTIME(SYSTEM_TIMESTAMP()), TIMESTAMP('0000-02-15', 'yyyy-MM-dd')) AND (EventID = '6006' OR EventID = '6008' OR EventID = '6009') ORDER BY TimeGenerated"
    $myQuery.ExecuteBatch($strQuery, $objInputFormat, $objOutputFormat)
}

I run this for each server in a list.

$sl = gc 'D:\Powershell\servers.txt'
foreach ($svr in $sl) {write-host "Getting boot events for $svr" -foregroundcolor "Green"; Get-BootEvents $svr}

Then I copy these CSV files to my DBA SQL Server and bulk insert them into a BootEvents table. Table definition:

CREATE TABLE [dbo].[BootEvents](
[ComputerName] [varchar](128) NULL,
[TimeGenerated] [datetime] NULL,
[EventID] [int] NULL,
[EventTypeName] [varchar](100) NULL,
[Message] [varchar](500) NULL
)

The bulk insert function is below:

function BulkInsert-BootEvents ($fl) {
    $SqlConnection = New-Object System.Data.SqlClient.SqlConnection
    $SqlConnection.ConnectionString = "Server=DBASQL1;Database=master;Integrated Security=True"
    $SqlConnection.Open()
    $SqlCmd = New-Object System.Data.SqlClient.SqlCommand
    $SqlCmd.Connection = $SqlConnection
    #$SqlCmd.CommandText = "use DBA; TRUNCATE TABLE BootEvents;"
    #$SqlCmd.ExecuteNonQuery()
    $SqlCmd.CommandText = "BULK INSERT DBA..BootEvents FROM '"+$fl+"' WITH (FIELDTERMINATOR = ',', FIRSTROW = 2, ROWTERMINATOR = '\n')"
    $SqlCmd.ExecuteNonQuery()
    $SqlConnection.Close()
}

I run the following to Bulk Insert each csv into my BootEvents Table:

cpi d:\data\bootevents\*bootevents.csv \\DBASQL1\d$
$l = gci '\\DBASQL1\d$\*' -include *bootevents.csv | select fullname
foreach ($fl in $l) {BulkInsert-BootEvents $fl.fullname}

Now I have all the boot events for all my servers in a SQL table. The first thing I do is create some T-SQL functions: one to work out the number of work seconds of downtime between boot events (or for a hang boot event), one to check whether a particular date is a public holiday, and one to convert seconds to a more readable string of hours, minutes and seconds.

Convert Seconds to string

--DROP FUNCTION ufn_ConvertSec

CREATE FUNCTION ufn_ConvertSec
(@s bigint)
RETURNS VARCHAR(100)
AS
BEGIN
DECLARE @m bigint
DECLARE @h int
DECLARE @st VARCHAR(100)

SET @h = @s/3600
SET @s = @s - (@h*3600)
SET @m = @s/60
SET @s = @s - (@m*60)
SET @st = CAST(@h AS VARCHAR(10))+' hour(s); '+ CAST(@m AS VARCHAR(15))+' minute(s); '+ CAST(@s AS VARCHAR(5))+' second(s)'

RETURN @st
END

--select dbo.ufn_ConvertSec(29098)
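To make the output concrete, here is the same conversion as a quick Python sketch (the function name is mine; it mirrors what ufn_ConvertSec returns):

```python
def convert_sec(s):
    """Break seconds into the 'hour(s); minute(s); second(s)' string
    that ufn_ConvertSec produces."""
    h, rem = divmod(s, 3600)   # whole hours, then the remainder
    m, sec = divmod(rem, 60)   # whole minutes, then leftover seconds
    return f"{h} hour(s); {m} minute(s); {sec} second(s)"
```

For the commented example above, convert_sec(29098) comes back as "8 hour(s); 4 minute(s); 58 second(s)".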

Check if Date is Public Holiday



CREATE FUNCTION ufn_IsHoliday
(@Dt datetime)
RETURNS BIT
AS
BEGIN
DECLARE @IsHol bit

DECLARE @PublicHolidays TABLE
(
[Date] datetime,
HolidayName VARCHAR(50)
)
INSERT @PublicHolidays VALUES('2009/04/10', 'Good Friday')
INSERT @PublicHolidays VALUES('2009/04/13', 'Easter Monday')
INSERT @PublicHolidays VALUES('2009/04/22', 'Voting Day')
INSERT @PublicHolidays VALUES('2009/04/27', 'Freedom Day')
INSERT @PublicHolidays VALUES('2009/05/01', 'Workers Day')
INSERT @PublicHolidays VALUES('2009/06/16', 'Youth Day')
INSERT @PublicHolidays VALUES('2009/08/10', 'Womens Day')
INSERT @PublicHolidays VALUES('2009/09/24', 'Heritage Day')
INSERT @PublicHolidays VALUES('2009/12/16', 'Day of Reconciliation')
INSERT @PublicHolidays VALUES('2009/12/25', 'Christmas Day')

-- Strip the time portion so we compare dates only
SET @Dt = SUBSTRING(CONVERT(VARCHAR(30), @Dt, 111), 1, 10) + ' 00:00:00.000'

IF (SELECT COUNT(*) FROM @PublicHolidays WHERE [Date] = @Dt) > 0
SET @IsHol = 1
ELSE
SET @IsHol = 0

RETURN @IsHol
END

--select dbo.ufn_IsHoliday('2009/06/16 07:17:16')
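The lookup boils down to "strip the time, check membership". A Python sketch of the same idea, using the holiday dates from the table above (names and helper are mine):

```python
from datetime import date, datetime

# The 2009 South African public holidays from the T-SQL table above
PUBLIC_HOLIDAYS = {
    date(2009, 4, 10), date(2009, 4, 13), date(2009, 4, 22),
    date(2009, 4, 27), date(2009, 5, 1), date(2009, 6, 16),
    date(2009, 8, 10), date(2009, 9, 24), date(2009, 12, 16),
    date(2009, 12, 25),
}

def is_holiday(dt):
    """Strip the time portion, as ufn_IsHoliday does, then check membership."""
    return dt.date() in PUBLIC_HOLIDAYS
```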

Calculate Work Seconds between two dates

CREATE FUNCTION ufn_WorkSec (@Down datetime, @Up datetime)
RETURNS INT
AS
BEGIN
DECLARE @WorkSec int
SET @WorkSec = 0

WHILE @Down <= @Up
BEGIN
    IF dbo.ufn_IsHoliday(@Down) = 0
       AND DATEPART(weekday, @Down) < 6
       AND DATEPART(hour, @Down) BETWEEN 7 AND 17
    BEGIN
        -- Inside the support window: count every second until 18:00
        WHILE (DATEPART(hour, @Down) < 18 AND @Down <= @Up)
        BEGIN
            SET @WorkSec = @WorkSec + 1
            SET @Down = DATEADD(second, 1, @Down)
        END
        -- Jump to 07:00:00 the next day
        SET @Down = DATEADD(day, 1, SUBSTRING(CONVERT(VARCHAR(30), @Down, 111), 1, 10) + ' 06:59:59.000')
        SET @Down = DATEADD(second, 1, @Down)
    END
    ELSE
        SET @Down = DATEADD(second, 1, @Down)
END

RETURN @WorkSec
END
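The second-by-second T-SQL loop can be sanity-checked with a day-based Python sketch. It assumes the intended support window is weekdays 07:00 to 18:00, and the HOLIDAYS set is abbreviated for illustration:

```python
from datetime import datetime, date, time, timedelta

# Assumed support window: weekdays 07:00-18:00; holiday set abbreviated
WORK_START, WORK_END = time(7, 0), time(18, 0)
HOLIDAYS = {date(2009, 6, 16)}

def work_seconds(down, up):
    """Seconds between down and up inside the support window, aggregated
    per day instead of per second (same answer, far fewer iterations)."""
    total, day = 0, down.date()
    while day <= up.date():
        if day.weekday() < 5 and day not in HOLIDAYS:  # Monday..Friday
            start = max(down, datetime.combine(day, WORK_START))
            end = min(up, datetime.combine(day, WORK_END))
            if end > start:
                total += int((end - start).total_seconds())
        day += timedelta(days=1)
    return total
```

For example, a window spanning Friday 17:00 to Monday 08:00 yields one working hour on each side of the weekend, 7200 seconds in total.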


Now I'm ready to calculate the uptime percentage and insert the summarized data into a summary table called UptimeSummary. Code for table:

CREATE TABLE UptimeSummary
(
ComputerName VARCHAR(128),
Downtime VARCHAR(100),
UptimeSince DATETIME,
UptimePercent DECIMAL(18,4)
)

Now comes the T-SQL magic to use all the above code and churn out the uptime percentage:

DECLARE @BootEvents TABLE
(
ID INT IDENTITY(1,1),
ComputerName VARCHAR(128),
TimeGenerated DATETIME,
EventID INT,
EventTypeName VARCHAR(100),
Message VARCHAR(500)
)
DECLARE @UptimeSummary TABLE
(
ComputerName VARCHAR(128),
DowntimeSec INT,
WorkSec INT,
UptimePercent DECIMAL(18,4)
)

INSERT @BootEvents
SELECT * FROM BootEvents

DECLARE @StartDate DATETIME, @EndDate DATETIME
DECLARE @WorkSec INT, @Downtime INT
DECLARE @Down DATETIME, @Up DATETIME
DECLARE @ID INT
DECLARE @TimeGenerated DATETIME
DECLARE @Server VARCHAR(128)
DECLARE @ComputerName VARCHAR(128)
DECLARE @Message VARCHAR(500)

SET @StartDate = GETDATE()-45
SET @EndDate = GETDATE()
SET @WorkSec = (SELECT dbo.ufn_WorkSec(@StartDate, @EndDate))

DECLARE ServerC CURSOR
FOR SELECT DISTINCT ComputerName FROM @BootEvents ORDER BY ComputerName ASC
OPEN ServerC
FETCH NEXT FROM ServerC INTO @Server
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @Downtime = 0

    -- Clean shutdowns: pair each 6006 (event log stopped) with the following 6009 (startup)
    DECLARE DownUp CURSOR
    FOR SELECT ID, TimeGenerated, ComputerName FROM @BootEvents WHERE EventID = 6006 AND ComputerName = @Server ORDER BY TimeGenerated ASC
    OPEN DownUp
    FETCH NEXT FROM DownUp INTO @ID, @TimeGenerated, @ComputerName
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @Down = (SELECT TimeGenerated FROM @BootEvents WHERE ID = @ID)
        SET @ID = @ID + 1
        SET @Up = (SELECT TimeGenerated FROM @BootEvents WHERE ID = @ID AND EventID = 6009)
        SET @Downtime = @Downtime + (SELECT dbo.ufn_WorkSec(@Down, @Up))
        FETCH NEXT FROM DownUp INTO @ID, @TimeGenerated, @ComputerName
    END
    CLOSE DownUp
    DEALLOCATE DownUp

    -- Unexpected shutdowns (hangs): for 6008 the shutdown time is parsed out of the Message
    DECLARE DownUpHang CURSOR
    FOR SELECT ID, TimeGenerated, ComputerName, Message FROM @BootEvents WHERE EventID = 6008 AND ComputerName = @Server ORDER BY TimeGenerated ASC
    OPEN DownUpHang
    FETCH NEXT FROM DownUpHang INTO @ID, @TimeGenerated, @ComputerName, @Message
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @Up = (SELECT TimeGenerated FROM @BootEvents WHERE ID = @ID)
        SET @Down = (SELECT dbo.ufn_DateUnexpected([Message]) FROM @BootEvents WHERE ID = @ID)
        SET @Downtime = @Downtime + (SELECT dbo.ufn_WorkSec(@Down, @Up))
        FETCH NEXT FROM DownUpHang INTO @ID, @TimeGenerated, @ComputerName, @Message
    END
    CLOSE DownUpHang
    DEALLOCATE DownUpHang

    INSERT @UptimeSummary
    SELECT @Server, @Downtime AS DowntimeSec, @WorkSec AS WorkSec,
    CAST((@WorkSec-@Downtime) AS DECIMAL(18,4))/CAST(@WorkSec AS DECIMAL(18,4))*100 AS UptimePercent

    FETCH NEXT FROM ServerC INTO @Server
END
CLOSE ServerC
DEALLOCATE ServerC

INSERT UptimeSummary
SELECT ComputerName, dbo.ufn_ConvertSec(DowntimeSec) AS Downtime, @StartDate AS UptimeSince, UptimePercent
FROM @UptimeSummary
ORDER BY UptimePercent ASC
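The percentage arithmetic in the final SELECT is worth a quick sanity check; in Python:

```python
def uptime_percent(work_sec, downtime_sec):
    """Uptime over the window: the share of working seconds the server was up."""
    return (work_sec - downtime_sec) / work_sec * 100
```

So 3600 seconds (one working hour) of downtime in a 360000-second window is 99% uptime, right on our target.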

Now that I have Summarized uptime data for each server in a SQL Table I can write some Reporting Services reports to show this data. I've written two reports. One that shows the contents of the UptimeSummary table and one that shows Boot Events for a particular server. From the Summary report you can drill through to the Boot Events report for a server by clicking on the server name.

All the above, with the inclusion of "Hangtime" in the downtime calculation, prompted us to monitor for currently hanging servers. So we don't only reactively report on servers that hung, but now proactively report on servers that are currently hanging. I'll cover this in a subsequent post.

One ex-con to another: "I thought you was hung?" "I is!!"

Tuesday, May 26, 2009

Changing database options on multiple SQL Servers and databases

I have 150-odd SQL Servers across the country. A lot are SQL 2000, or SQL 2005 instances that have been upgraded. And a lot of the databases on them have Auto_Close ON, Auto_Shrink ON and PageVerify set to NONE. The config exception report was 10 pages long! Enter PowerShell.

I wrote a PowerShell function to alter each setting, so three PowerShell functions. I could have written one, but with many databases on each of 150 servers I only want to target databases that are not correctly configured. So I do three passes of my server list. On each pass I find one incorrect setting and fix only that setting, and only on the databases where it is wrong, rather than doing a blanket update to all.

The code for these functions is:

Function Set-AutoCloseOff {
    Param ([string]$svr, [string]$dbname)
    $dbn = get-sqldatabase $svr $dbname
    $dbn.DatabaseOptions.AutoClose = $False
    $dbn.Alter() # Alter() persists the SMO change to the server
}

Function Set-AutoShrinkOff {
    Param ([string]$svr, [string]$dbname)
    $dbn = get-sqldatabase $svr $dbname
    $dbn.DatabaseOptions.AutoShrink = $False
    $dbn.Alter()
}

Function Set-PageVerify {
    Param ([string]$svr, [string]$dbname)
    $dbn = get-sqldatabase $svr $dbname
    $dbn.DatabaseOptions.PageVerify = "TornPageDetection"
    $dbn.Alter()
}

Each of these functions changes one setting for one database on one server. Now the trick is to find the incorrectly configured databases on each server and then call the above functions. Code:

. ./SQLPSX/LibrarySMO.ps1
. ./DBConfigFixes/Function_Set-AutoCloseOff.ps1
function Set-AutoClose ($s) {
    $svr = $s
    $dblist = get-sqldatabase $svr
    write-host "Checking Databases on Server $s" -foregroundcolor Green
    $dbc = {foreach ($dbn in $dblist) {$dbn | select @{Name="DatabaseName";Expression={$dbn.Name}}, @{Name="AutoClose";Expression={$dbn.DatabaseOptions.AutoClose}}}}
    $dbr = $dbc.invoke() | where {$_.AutoClose -eq $True} | select DatabaseName, AutoClose
    if ($dbr -ne $null) {foreach ($dd in $dbr) {write-host "Changing AutoClose for Database $($dd.DatabaseName) on $s" -foregroundColor "Red"; Set-AutoCloseOff $svr $dd.DatabaseName}}
}

The code above gets a list of databases on a given server, builds a second list of the databases that are not configured correctly, and then loops through that list, fixing the setting with the functions defined earlier.

Now all I need is a list of servers, and for each server run the above. That's the easiest bit:

$ss = gc 'servers.txt'
foreach ($s in $ss) {Set-AutoClose $s}


The get-sqldatabase function comes from the SQLPSX LibrarySMO.ps1 library of SMO functions.

I ran the above on my 150 servers in about 8 minutes. Three passes for three different config fixes: 30 minutes. 438 config exceptions fixed. Badaboom!

Footnote: I've fixed the Set-PageVerify function to cater for the difference in behaviour between SQL 2000 and SQL 2005. If you want this function, drop me a mail.

Friday, May 22, 2009

Retrieving Partition Size, Free Space and Fragmentation Percentage from multiple servers

At my current client we receive an email every morning with partition information from a number of servers. This is run using VBScript and was set up a while ago. So, for one, the list of servers is outdated. The biggest bit of information missing from this report for me is the fragmentation percent of the partitions. I say this because one of the primary reasons for keeping enough free space available in a partition is to ensure effective defrags can run.

So I gathered my Server list. This, after going through numerous spreadsheets, amounted to 483 servers! Powershell eats this for a little morsel.

Getting the partition size and free space is a straightforward GWMI query in PowerShell. Getting the fragmentation percentage in the same result set requires some fancy footwork. My mate, Jean Louw, added this information using Add-Member. When that cmdlet is mentioned to him his eyes lose focus and his nether regions tighten. Check out his blog.

So, getting down to some code. Here I've written a function that retrieves the ServerName, DriveLetter, Label, Capacity, FreeSpace, PercentFree and fragmentation percentage. I've left this data raw, without fancy formatting, since I'm going to export it to a CSV, bulk insert the CSV into a SQL table and then present the information in Reporting Services. In T-SQL and Reporting Services I'll embellish the data as required.

Here is the function:

Function Get-FreeSpaceFrag ($s) {
    trap {write-host "Can't connect to WMI on server $s" -ForeGroundColor "Red"; continue}
    $dt = get-date

    $Scope = new-object System.Management.ManagementScope "\\$s\root\cimv2"
    $query = new-object System.Management.ObjectQuery "SELECT * FROM Win32_Volume"
    $searcher = new-object System.Management.ManagementObjectSearcher $Scope,$query
    $SearchOption = $searcher.get_options()
    $timeout = new-timespan -seconds 10
    $SearchOption.set_Timeout($timeout)
    $searcher.set_options($SearchOption)
    $volumes = $searcher.get()

    $fr = {foreach ($v in $volumes | where {$_.capacity -gt 0}){
        # Fragmentation percentage comes from the volume's DefragAnalysis WMI method
        $frag = $v.InvokeMethod("DefragAnalysis", $null, $null).DefragAnalysis.TotalPercentFragmentation
        $v | Add-Member -Name Frag -MemberType NoteProperty -Value $frag -Force -PassThru
    } }
    $fr.invoke() | select @{N="Server";E={$_.Systemname}}, DriveLetter, Label, Capacity, FreeSpace, @{N="PercentFree";E={"{0,9:N0}" -f (($_.FreeSpace/1gb)/($_.Capacity/1gb)*100)}}, Frag, @{N="InfoDate";E={$dt}}
}

The magic here is the Add-Member cmdlet getting my fragmentation percentage. The other piece of magic is making sure the WMI query times out after 10 seconds!! Man, did I battle with this. It took me a whole morning to get right, and I start working at 6AM! The trick is to instantiate the WMI searcher object before invoking it; then you can set the timeout using the new-timespan cmdlet and the options of the ManagementObjectSearcher object. I had a server, smack in the middle of my list of 483 servers, that broke my script before I added this error handling. That's ugly in a script that takes almost 4 hours. And when I say broke, Ctrl+C doesn't even work. Click on the X, baby.
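The "never let one dead server hang the whole run" idea is worth a sketch on its own. Here it is in Python, with a worker thread standing in for the WMI call (call_with_timeout and slow are names I made up for illustration):

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def call_with_timeout(fn, timeout, *args):
    """Run fn(*args) but give up after `timeout` seconds and return None,
    so one unreachable server can't stall the whole collection loop."""
    ex = ThreadPoolExecutor(max_workers=1)
    try:
        return ex.submit(fn, *args).result(timeout=timeout)
    except FutureTimeout:
        return None  # abandon the hung call and move on to the next server
    finally:
        ex.shutdown(wait=False)
```

A caller just checks for None and skips the server, which is exactly what the trap-plus-timeout combination achieves in the PowerShell function.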

Great, now I have my unit of work defined: gather the required information from one server. It's go time! Now I use this to gather the information from a list of servers, export the results to CSV, time the entire operation and finally bulk insert the data into a SQL table. Code:

sl E:\Powershell\ServerDriveSpaceFragInfo
. ./Function_Get-FreeSpaceFrag.ps1
sl ..
. ./function_Get-TimeDelta.ps1
$svl = gc 'serversall.txt'
$x = {foreach ($s in $svl) {write-host "Getting Disk Info for Server $s" -foregroundcolor "Green"; Get-FreeSpaceFrag $s}}
$t1 = get-date
$x.invoke() | export-csv "D:\Powershell\DiskInfo.csv" -NoTypeInformation
$t2 = get-date
Get-TimeDelta $t1 $t2
cpi "D:\Powershell\DiskInfo.csv" "\\SRV1\d$"
sl D:\Powershell\ServerDriveSpaceFragInfo

I save the above as a .ps1 file. I then call the ps1 file from SQL server using a SQL Agent job. The job step will be operating system (CmdExec) Type and the text would be:
"C:\WINDOWS\system32\windowspowershell\v1.0\powershell.exe" "D:\Powershell\DiskInfo\Eg_Get-FreeSpaceFrag.ps1"

For 483 servers this job runs for about 3 hours 43 minutes. Not bad, considering they are all over the country, across some slow WAN links. The bulk insert code is as follows:

$SqlConnection = New-Object System.Data.SqlClient.SqlConnection
$SqlConnection.ConnectionString = "Server=SRV1;Database=master;Integrated Security=True"
$SqlConnection.Open()
$SqlCmd = New-Object System.Data.SqlClient.SqlCommand
$SqlCmd.Connection = $SqlConnection
$SqlCmd.CommandText = "use DBA; TRUNCATE TABLE DriveSpaceFragInfo;"
$SqlCmd.ExecuteNonQuery()
$SqlCmd.CommandText = "BULK INSERT DBA..DriveSpaceFragInfo FROM 'D:\DiskInfo.csv' WITH (FIELDTERMINATOR = ',', FIRSTROW = 2, ROWTERMINATOR = '\n')"
$SqlCmd.ExecuteNonQuery()
$SqlCmd.CommandText = "Exec DBA..usp_UpdateDriveSpaceFragInfo"
$SqlCmd.ExecuteNonQuery()
$SqlConnection.Close()

The third command runs a stored procedure to clean up the data, basically just removing double quotes and one or two other things. Now I have nice, clean partition information that provides a wealth of insight. For example: a server hosting two virtual servers had 35% free space on its data partition and was 94% fragmented. 94%!!!! I have never seen such a high figure before. So the fact that a partition has 35% free space doesn't mean it is in a healthy state.

I then created some Reporting Services reports to show the top 20 partitions with the least free space, and another with the top 20 worst-fragmented partitions. Nice. So now, from a possible 1500 partitions on 483 servers, I can target the least healthy partitions first. Also, I keep the partition information in the database, so over time I can report on how long a partition has been in a certain state, when it was cleaned or defragged, and how quickly it filled up and fragmented again. Makes management happy. ;-)

Wednesday, May 13, 2009

Cleaning up full partitions

I work in an environment where there are a good few hundred servers all over the country. We have a report that gets sent to us every morning listing servers and drives that have below 20% free space. The idea then is to free up space to get these drives back above 20% free. After the first week it becomes very tedious. Check the SQL backup folder, check the W3SVC1 log files, check for out-of-date service packs.... What's needed is a more methodical and automated approach. You guessed it: PowerShell. ;-)

In the intro I alluded to the approach: check for a number of known file types and conditions that can contribute to filling a partition. Basically I go to the partition and, for each check, either include or exclude certain file types, or include files greater than a certain size.

I've written a function that goes to a server and a partition on that server and checks for:
  1. Files bigger than 10MB Excluding SQL files
  2. All Office, pst and txt docs bigger than 100kb
  3. Log files bigger than 5MB
  4. JPeg and MP3 files bigger than 100KB
  5. SQL Backup Files

This normally accounts for most cases of wasted space usage. This can obviously be expanded or customized to your particular needs or environment.

The function writes a csv file with a list of the files for each condition, with Name, FileSize, DirectoryName, FullName, CreationTimeUtc, LastAccessTimeUtc and DeleteCommand as fields. This information helps to confirm that a file can be deleted or compressed.

The neat bit here is the DeleteCommand. It's really just the Del command, which works in the command prompt or in PS, with the full path and filename. Run it carefully though: no "Are you sure" prompt will be issued.

Herewith the code:

Function Find-Files {
$a = new-object -comobject MSScriptControl.ScriptControl
$a.language = "vbscript"
$a.addcode("function getInput() : getInput = inputbox(`"Enter Server Name`",`"Find Files`") : end function" )
$s = $a.eval("getInput")

$b = new-object -comobject MSScriptControl.ScriptControl
$b.language = "vbscript"
$b.addcode("function getInput() : getInput = inputbox(`"Enter Server Drive`",`"Find Files`") : end function" )
$dr = $b.eval("getInput")

$c = new-object -comobject MSScriptControl.ScriptControl
$c.language = "vbscript"
$c.addcode("function getInput() : getInput = inputbox(`"Enter location to save output`",`"Find Files`") : end function" )
$d = $c.eval("getInput")

#$s = 'DIVSS108'
#$dr = 'e'
$sp = "\\$s\$dr$\"
$d = "$d\"
$L = 10*1024*1024

#All files bigger than 10MB
write-host "Getting Files on $s Bigger than 10MB excluding SQL files..." -foregroundcolor "Green"
$f = $d+$s+"_big_Files.csv"
gci $sp -recurse -exclude *.bak,*.mdf,*.ldf,*Full.rar | Where {($_.Length -ge $L)} | select Name, @{N=' FileSize';E={"{0,12:N0} KB" -f ($_.Length/1kb) }}, DirectoryName, FullName, CreationTimeUtc, LastAccessTimeUtc, @{Name="DeleteCommand";E={"Del "+'"'+$_.FullName.Tostring()+'"'}} | export-csv $f -NoTypeInformation

#All Office, pst and txt docs bigger than 100kb
write-host "Getting Office Files on $s bigger than 100kb ..." -foregroundcolor "Green"
$L = 0.1*1024*1024
$f = $d+$s+"_Office_Files.csv"
gci $sp -recurse -include *.xls,*.doc,*.ppt,*.txt, *.pst | Where {($_.Length -ge $L)} | select Name, @{N=' FileSize';E={"{0,12:N0} KB" -f ($_.Length/1kb) }}, DirectoryName, FullName, CreationTimeUtc, LastAccessTimeUtc, @{Name="DeleteCommand";E={"Del "+'"'+$_.FullName.Tostring()+'"'}} | export-csv $f -NoTypeInformation

#Log files bigger than 5MB
write-host "Getting Log Files on $s bigger than 5MB ..." -foregroundcolor "Green"
$L = 5*1024*1024
$f = $d+$s+"_Log_Files.csv"
gci $sp -recurse -include *.log | Where {($_.Length -ge $L)} | select Name, @{N=' FileSize';E={"{0,12:N0} KB" -f ($_.Length/1kb) }}, DirectoryName, FullName, CreationTimeUtc, LastAccessTimeUtc, @{Name="DeleteCommand";E={"Del "+'"'+$_.FullName.Tostring()+'"'}} | export-csv $f -NoTypeInformation

#JPeg and MP3 files bigger than 100KB
write-host "Getting Jpg & mp3 Files on $s bigger than 100kb ..." -foregroundcolor "Green"
$L = 0.1*1024*1024
$f = $d+$s+"_Jpg_Mp3_Files.csv"
gci $sp -recurse -include *.jpg, *.mp3 | Where {($_.Length -ge $L)} | select Name, @{N=' FileSize';E={"{0,12:N0} KB" -f ($_.Length/1kb) }}, DirectoryName, FullName, CreationTimeUtc, LastAccessTimeUtc, @{Name="DeleteCommand";E={"Del "+'"'+$_.FullName.Tostring()+'"'}} | export-csv $f -NoTypeInformation

#SQL Backup Files
write-host "Getting SQL Backup Files on $s ..." -foregroundcolor "Green"
$L = 0.1*1024*1024
$f = $d+$s+"_SqlBackup_Files.csv"
gci $sp -recurse -include *.bak,*.trn,*Full.rar | select Name, @{N=' FileSize';E={"{0,12:N0} KB" -f ($_.Length/1kb) }}, DirectoryName, FullName, CreationTimeUtc, LastAccessTimeUtc, @{Name="DeleteCommand";E={"Del "+'"'+$_.FullName.Tostring()+'"'}} | export-csv $f -NoTypeInformation
}

As you can see, the function uses some old-school VBScript to get an input box. I could have used newer .NET code, but the VBScript is short and sweet.

When the function is called you are prompted for the Server Name, the Drive and the location to store the output csv files. Make sure this location exists. I could have checked for the existence of the location and created it if it didn't exist, but this may create a folder in a location that can't be remembered or found if the location is typed incorrectly. Ctrl+C, Ctrl+V for the location. ;-)

Happy cleaning and deleting.

Monday, April 20, 2009

Deploying a SQL Agent Job to cycle the Error logs to multiple servers

So, in the previous post we talked about changing the number of SQL Error Logs. With that done, we now need to actually make use of all these logs. I like keeping each day's errors in its own log file. Then if a problem arises and it was last week Tuesday, well, look in the corresponding log file.

There is a system stored procedure, sp_cycle_errorlog, that does exactly the above: close one log and open a new one. All I do now is schedule this stored procedure as a SQL Agent job to run at midnight every day. That's easy enough on one server, but remember, I have 151. So I script out the SQL Agent job. Great, now I have a .sql file that I need to deploy to 151 servers. That is, if I'm not using a master-target server configuration. That in itself is a fantastic concept, but I've never seen it at any client I've started at.

Another blogger, SerialSeb, posted a very handy piece of code that does what I need to do. Check out his post.

Basically, what he does is create a batch file that contains a sqlcmd command. This command, as you know, can execute the contents of a file against a given SQL Server. The code then runs the batch file using cmd /c. Go to the command prompt and type cmd /?. It returns: "/C Carries out the command specified by string and then terminates". If the batch file already exists, its contents are overwritten. Good one, Seb.

Great: unit of work encapsulated, list of servers in hand, it's go time.

foreach ($server in $serverlist) {Execute-SqlFile $file $server $dbname -WindowsAuthentication $true}

Where $file is the .sql file containing the T-SQL for the SQL Agent job. $dbname doesn't matter because the .sql file changes the database to MSDB anyway.

Go forth and deploy.

Change the number of SQL Error Logs using Powershell

One of the first places to start troubleshooting a potential SQL problem is to look at the SQL Error Logs. If there is a serious problem, it will normally show up in this log. The thing is, the default setting for SQL is to keep 6 logs. Each log is created when SQL Server starts up. So if a SQL server is restarting often, which is already a problem, you could have very little SQL error log history. On the other hand if SQL stays up for weeks or months at a time, which is a good thing, you will potentially have a very big error log to sift through to find any errors.

I recommend configuring 31 log files. This number can be anything from 6 to 99. I choose 31 because that gives me a month's worth of error logs. How so, if the error logs only get created when SQL Server restarts? I'll cover that in a subsequent blog.

But first, actually changing the number of error logs. One could obviously do it in Enterprise Manager or SQL Management Studio. But in my case I started at a client with 151 SQL servers of differing versions. Click click, click click. Very tedious. You guessed it, Powershell! :-)

What I like to do is create functions that encapsulate the unit of work that I want to accomplish. Once I have this function I can easily and neatly deploy this unit of work to a list of servers.

The function to change the number of SQL Error Logs is quite straightforward. It uses the Reg command to add the NumErrorLogs value to a certain registry key. There is actually no way to do this through T-SQL, unless I use the xp_regwrite extended stored procedure. But that would entail connecting to SQL to write a value to the registry. Not necessary. I do actually connect to SQL, but only to determine the version of SQL Server. I guess I could also get that from the registry, but in my function I've connected to SQL using SMO.
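The only thing that varies by version is the registry key path. A small Python sketch of that decision, built from the same paths the function below uses (helper name is mine; MSSQL.1 assumes a default SQL 2005 instance):

```python
def num_errorlogs_cmd(server, major_version, count=31):
    """Build the `reg add` line for NumErrorLogs. SQL 2000 (major version 8)
    and SQL 2005 keep their MSSQLServer settings under different keys."""
    if major_version == 8:
        key = rf"\\{server}\HKLM\SOFTWARE\Microsoft\MSSQLServer\MSSQLServer"
    else:
        key = rf"\\{server}\HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer"
    return f'reg add "{key}" /v NumErrorLogs /t REG_DWORD /d {count}'
```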

So let's look at some code. Following is the function code:

function Change-ErrorLogs ($s) {
    $svr = get-sqlserver $s

    if (($svr.get_information().version.major) -eq 8) {
        #SQL 2000
        reg add \\$s\HKLM\SOFTWARE\Microsoft\MSSQLServer\MSSQLServer /v NumErrorLogs /t REG_DWORD /d 31
        write-host "SQL 2000 Server - $s Config changed"
    }
    else {
        #SQL 2005
        reg add "\\$s\HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer" /v NumErrorLogs /t REG_DWORD /d 31
        write-host "SQL 2005 Server - $s Config changed"
    }
}

The code for get-sqlserver is:

function Get-SqlServer
{
param([string]$sqlserver=$(throw 'Get-SqlServer:`$sqlserver is required.'))
#When $sqlserver passed in from the SMO Name property, brackets
#are automatically inserted which then need to be removed
$sqlserver = $sqlserver -replace "\[|\]"

Write-Verbose "Get-SqlServer $sqlserver"
$server = new-object ("Microsoft.SqlServer.Management.Smo.Server") $sqlserver
$server.SetDefaultInitFields([Microsoft.SqlServer.Management.SMO.StoredProcedure], "IsSystemObject")
$server.SetDefaultInitFields([Microsoft.SqlServer.Management.SMO.Table], "IsSystemObject")
$server.SetDefaultInitFields([Microsoft.SqlServer.Management.SMO.View], "IsSystemObject")
$server.SetDefaultInitFields([Microsoft.SqlServer.Management.SMO.UserDefinedFunction], "IsSystemObject")
#trap { "Check $SqlServer Name"; continue} $server.ConnectionContext.Connect()
return $server

} #Get-SqlServer

Great, so now I have my unit of work defined as Change-ErrorLogs. You can probably guess the rest. But if you can't, here is the code:

$ss = gc 'servers2.txt'
foreach ($s in $ss) {Change-ErrorLogs $s}

Servers2.txt contains a list of all my SQL Servers. Grab that list into $ss. And then for each server ($s) in the list ($ss), Change the number of error logs (Change-ErrorLogs). Tada!

Wednesday, April 8, 2009

Resolving a list of IP Addresses to Hostnames using Powershell

Following on from my post on Auditing and Summarizing Logon activity on a SQL Server, I figured I need to automate the resolving of IP Addresses to Host names to make the Audit information more useful.

So off I go to Google or MSDN and find [System.Net.Dns]::GetHostbyAddress($IP). Nice! This function can be called from Powershell and does exactly what I need it to do, resolve an IP Address (string) to a hostname.

I created a PowerShell function that wraps this .NET call and adds some rudimentary error handling. I need error handling (who or what doesn't?) because the hosts that populate the list of IP addresses may not always be available, in which case host name resolution will fail. PowerShell Resolve-IP function code:

Function Resolve-IP ($IP) {

trap {
write-host "An error occurred: "
write-host "ID: " $_.ErrorID
write-host "Message: " $_.Exception.Message
throw "Couldn't Resolve $IP"
}

$results = [System.Net.Dns]::GetHostbyAddress($IP)
$results
}
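For comparison, here is the equivalent lookup in Python, with the trap swapped for a try/except that returns None so one dead host doesn't stop the loop (helper name is mine):

```python
import socket

def resolve_ip(ip):
    """Reverse-resolve an IP address string to (hostname, aliases, addresses);
    return None instead of raising when the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)
    except OSError as err:  # socket.herror / socket.gaierror both derive from OSError
        print(f"Couldn't resolve {ip}: {err}")
        return None
```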

Cool, so now we can grab our list of IP Addresses from our AuditLoginHistory table. I do this by using a SQL SMO function called get-sqldata that runs a T-SQL query and returns the results. So my T-SQL Query is:

select distinct host from auditloginhistory

I then use a foreach loop to iterate through the IP Addresses and resolve each IP address to a hostname. The final results, in my case, is exported to a .csv. I'll then Bulk Insert this csv list of IP Addresses and Host names into a SQL table using another SQL SMO function. Then I'll join the Host_IP table to my AuditLoginHistory and display both IP Address and Hostname.

Below is the Powershell script that runs the above:

. ./Function_Resolve-IP.ps1
$hosts = get-sqldata 'sql1' master "select distinct host from auditloginhistory"
$ipex = {foreach ($IP in $hosts) {Resolve-IP $IP.host}}
$ipex.invoke() | Select @{N="HostName";E={$_.HostName.tostring()}}, @{N="IP_Address";E={$_.get_AddressList()}} | export-csv "host-ip.csv" -NoTypeInformation

Monday, April 6, 2009

Auditing and Summarizing Logons to SQL Server

I'm sure you've come across this scenario: at a new client or employer you want to embark on a SQL Server login cleanup, but a lot of the logins are from the dark ages and the current IT staff don't know which logins are used by whom and for what.

Well, there are two ways to see which logins are logging onto the SQL Server and when. The first is to enable auditing of successful and failed logon attempts at the server level. This writes an entry to the SQL Error Log for every failed or successful logon; on a busy server this would make the error log huge. Besides that, the audit information sits in a flat text file, which can be viewed but not summarized easily.

Another way to audit logons to your SQL Server is to enable a server wide Logon Trigger, which as you know is available in SQL 2005 & 2008. See BOL for more details. This trigger exposes some XML Data about the logon and this data is made accessible using the EVENTDATA function.

The solution I've written to audit and summarize logon information uses a logon trigger, two logon audit tables and a daily summarizing script. I've created these tables in the master database, but ideally they should live in a user database. The first table, AuditLogin, stores all the data that results from the logon trigger firing. I've chosen to store the ServerName, EventTime, ServerLogin, ClientHost, LoginType and IsPooled event data. The trigger writes these values to the AuditLogin table for each LOGON event. Code:

/****** Object: DdlTrigger [trigLogon] Script Date: 04/06/2009 10:57:00 ******/
CREATE TRIGGER trigLogon
ON ALL SERVER
FOR LOGON
AS
BEGIN
declare @LogonTriggerData xml

SET @LogonTriggerData = eventdata() ;

INSERT INTO master.dbo.AuditLogin
(ServerName, EventTime, ServerLogin, Col_ClientHost, Col_LoginType, IsPooled)
(SELECT @@SERVERNAME,
@LogonTriggerData.value('(/EVENT_INSTANCE/PostTime)[1]', 'datetime'),
ORIGINAL_LOGIN(),
@LogonTriggerData.value('(/EVENT_INSTANCE/ClientHost)[1]', 'varchar(50)'),
@LogonTriggerData.value('(/EVENT_INSTANCE/LoginType)[1]', 'varchar(50)'),
@LogonTriggerData.value('(/EVENT_INSTANCE/IsPooled)[1]', 'bit'))
END

This is all good and well: now we know who is logging onto our server, when, and from where. But this trigger firing can fill a large hard disk quickly. On a busy server there may be millions of entries in this audit table daily, mostly as a result of connection pooling and how a connection is kept alive by an application server like IIS. So for the most part I don't worry too much about huge numbers of logons from pooled connections from application servers. The logon entries I'm more interested in are for non-pooled connections.

So what we need, from an audit summary point of view, is a distinct record for each logon from each distinct host, the logon's pooled status, the last time this login logged in and how many times it logged in. This gives me some valuable information from the millions of rows of raw logon data: which logins have not logged in since logon auditing was enabled, which generic logins are being used from which hosts, and which non-pooled logins are logging in the most. From my original mission of establishing which logins are used and how often, we can now determine this.
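As an illustration of the kind of question this answers (this query is a sketch, not part of the original solution), once AuditLoginHistory has accumulated data, something along these lines could list server logins that have never been seen logging in:

```sql
-- Hedged sketch: SQL logins ('S') and Windows logins ('U') with no entry
-- in the summarized history, i.e. not seen since auditing was enabled.
SELECT p.name
FROM sys.server_principals p
LEFT JOIN dbo.AuditLoginHistory h
    ON h.ServerLogin = p.name
WHERE p.type IN ('S', 'U')
  AND h.ServerLogin IS NULL
```

These are prime candidates for the cleanup, after the usual due diligence.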

Now I need to take the daily logon audit information, summarize it and merge today's summarized information with the historic logon information. This is a very common scenario in a Data Warehouse environment. I NEED to summarize and TRUNCATE my AuditLogin table after I've summarized the data in it, because of the huge amount of data that goes into this table every day. Knowing that login x logged on at a particular time yesterday is of little value, so I don't need the detail data unless there is a security breach. Summarize, Merge to History, Truncate. Treat it mean, keep it clean.

The code below uses some incremental load logic I've gleaned from my Data Warehouse experience. The first full load and subsequent incremental loads of my AuditLoginHistory table use the same code. That's pretty handy: the same code loads the history table from the first load onwards.

Each record in the history table is identified by Login, Host and Pooled status. The two other fields are LoginCount and LastLoggedIn. The code below first summarizes today's logon information and then compares it to the history information. If a matching record exists, its LoginCount and LastLoggedIn are updated; if it doesn't, a new record is inserted into the history table.
Summarizing Code:

with LoginCnt(ServerLogin, LoginCount, IsPooled, Host) AS
(select ServerLogin, count(*) AS LoginCount, IsPooled, Col_ClientHost from AuditLogin
group by ServerLogin, IsPooled, Col_ClientHost)
, LastLogin(ServerLogin, Host, LastLoggedIn, IsPooled) AS
(select ServerLogin, Col_ClientHost, max(EventTime) AS LastLoggedIn, IsPooled from AuditLogin
group by ServerLogin, Col_ClientHost, IsPooled)

--SELECT lc.ServerLogin, lc.Host, ll.LastLoggedIn, lc.LoginCount, lc.IsPooled
--FROM LoginCnt lc INNER JOIN LastLogin ll
--ON lc.ServerLogin = ll.ServerLogin
--AND lc.Host = ll.Host

SELECT lc.ServerLogin, lc.Host, ll.LastLoggedIn, lc.LoginCount, lc.IsPooled
INTO #AuditTmp
FROM LoginCnt lc INNER JOIN LastLogin ll
ON lc.ServerLogin = ll.ServerLogin
AND lc.Host = ll.Host
ORDER BY ISPooled ASC, LoginCount DESC



DECLARE @SL varchar(128), @Host varchar(30), @LastLoggedIn datetime, @LC int, @IP bit

DECLARE AuditCur CURSOR FOR
SELECT ServerLogin, Host, LastLoggedIn, LoginCount, IsPooled FROM #AuditTmp
OPEN AuditCur
FETCH NEXT FROM AuditCur INTO @SL, @Host, @LastLoggedIn, @LC, @IP

WHILE @@FETCH_STATUS = 0
BEGIN
	IF (SELECT COUNT(*) FROM AuditLoginHistory
	WHERE ServerLogin = @SL AND Host = @Host AND IsPooled = @IP) = 1
		UPDATE AuditLoginHistory
		SET LastLoggedIn = @LastLoggedIn,
		LoginCount = LoginCount + @LC
		WHERE ServerLogin = @SL AND Host = @Host AND IsPooled = @IP
	ELSE
		INSERT AuditLoginHistory(ServerLogin, Host, LastLoggedIn, LoginCount, IsPooled)
		VALUES(@SL, @Host, @LastLoggedIn, @LC, @IP)

	FETCH NEXT FROM AuditCur INTO @SL, @Host, @LastLoggedIn, @LC, @IP
END
CLOSE AuditCur
DEALLOCATE AuditCur

IF @@ERROR <> 0
	RAISERROR('Logon audit summary failed', 16, 1)
--PRINT 'Successful'

Below is the code for the two Audit tables:

USE [master]
/****** Object: Table [dbo].[AuditLogin] Script Date: 01/19/2009 08:20:30 ******/
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[AuditLogin]') AND type in (N'U'))
CREATE TABLE [dbo].[AuditLogin](
[AuditEventId] [int] IDENTITY(1,1) NOT NULL,
[ServerName] [nvarchar](100) NOT NULL,
[EventTime] [datetime] NOT NULL,
[ServerLogin] [nvarchar](100) NOT NULL,
[Col_ClientHost] [varchar](50) NULL,
[Col_LoginType] [varchar](50) NULL,
[IsPooled] [bit] NULL,
CONSTRAINT [PK_AuditLogin] PRIMARY KEY CLUSTERED ([AuditEventId] ASC)
)
GRANT INSERT ON [dbo].[AuditLogin] TO [public]

USE [master]
/****** Object: Table [dbo].[AuditLoginHistory] Script Date: 04/06/2009 10:37:55 ******/
CREATE TABLE [dbo].[AuditLoginHistory](
[ServerLogin] [varchar](128) NULL,
[Host] [varchar](30) NULL,
[LoginCount] [int] NULL,
[LastLoggedIn] [datetime] NULL,
[IsPooled] [bit] NULL
)


Lastly, a caveat to make sure the trigger fires without causing logon failures: ensure that all the objects that get created are accessible to public. Otherwise logons will fail, because the write to the audit table when the trigger fires is done in the security context of the user that's logging on.

Thursday, April 2, 2009

Tracking database growth over time

We had a problem recently where our Sharepoint environment space requirements have grown beyond what was originally envisaged. The Sharepoint team wanted to know how the different SQL databases have grown over time.

Dang I thought, we don't record the size of a database at a particular point in time, over time. Then I thought, well we back the databases up frequently and we could look at the backup file sizes to get an indication of how big the database is, since there is a reasonable correlation between data, index and log space used and the backup size. But we only keep 2 weeks worth of backups on disk and to recall tapes for a year would be a mission. Then it dawned on me that the backup history tables in the msdb database would contain the database name, backup time and backup size!

Great, so I have my data source. The backupset table just so happens to have 200,000 records in it, with data for 80 databases. We can exclude all dbs that aren't Sharepoint databases, and look only at type = 'D' for full database backups. But that still leaves daily full backups for each Sharepoint database for the last 18 months. From a trending point of view, without being overwhelmed by masses of data, I want to see database growth by month. So I want the first (or last) backup for each database for each month for the last 18 months. Now there's a bit of T-SQL to write. I figured a couple of WHILE loops would do the trick to loop through each month and year, and inside those loops have a cursor get the backup info into a table variable for each database. Here is the code:

DECLARE @Db_BackupHistory TABLE
(database_name VARCHAR(255),
backup_size BIGINT,
backup_start_date DATETIME)


DECLARE @Y INT, @M INT, @DB VARCHAR(255)
SET @Y = 2007
SET @M = 1

WHILE @Y < 2010
BEGIN
	WHILE @M < 13
	BEGIN
		DECLARE DbCur CURSOR FOR
		SELECT DISTINCT database_name FROM backupset where database_name like 'Sharepoint_%'
		OPEN DbCur
		FETCH NEXT FROM DbCur INTO @DB
		WHILE @@FETCH_STATUS = 0
		BEGIN
			INSERT @Db_BackupHistory(database_name, backup_size, backup_start_date)
			select TOP 1 database_name, backup_size, backup_start_date from backupset
			WHERE database_name LIKE @DB and type = 'D'
			and DATEPART(mm, backup_start_date) = @M and DATEPART(yy, backup_start_date) = @Y
			order by backup_start_date DESC
			FETCH NEXT FROM DbCur INTO @DB
		END
		CLOSE DbCur
		DEALLOCATE DbCur
		SET @M = @M + 1
	END
	SET @M = 1
	SET @Y = @Y + 1
END

select database_name, backup_size/1024/1024 AS [BackupSize(MB)], backup_start_date from @Db_BackupHistory
order by database_name, backup_start_date

I'm sure there are other ways of doing this. Feel free to comment and add how you would do the above.

As an aside, there is a post by Vyas Kondreddi from 2001 that does a similar thing, but it doesn't summarize the size per database by month; it only returns all data for one database. Also, it reports size as the size of the mdf. If I create a Sharepoint database with an initial size of 1GB and it takes 8 months to fill that 1GB, then I won't have a true reflection of the actual size of the data; the size of an mdf with lots of empty space is academic. The actual data space used is what matters from a growth point of view. Hence I use the backup_size field from the backupset table.

The cool thing now is I can look at a database going back 18 months and see the size of the full backup at the end of each month. Now just pull this data into reporting services, have a series for each database and there we go! Database growth over time.

Tuesday, March 31, 2009

Searching for that code that formats a number in all my code files

If you write any type of code you must have come across this scenario:

I've written code before that formats a number nicely in Powershell, but which text file or script or function did I have that code in?

Well, today I got a handy piece of Powershell code that did a whole lot of things, one of which was formatting a number. Now I'd looked for examples on this before and didn't quite find what I was looking for, and this piece of code did exactly what I wanted. But nowhere in the Powershell code did it say "format number". I could add a comment, and that's probably not a bad idea, but normally code comments are there to increase the readability of the code and logic, not to act as tags for future searches.

Well, there is a very simple way around this problem. For the .txt or .ps1 or even .sql or .dtsx file, right-click and go to the properties of the file, then click on the Summary tab. There is a Keywords text box where you can type keywords or tags that will help you find the code snippet or example in future. In my case I typed in format number. I then searched (F3) the parent folder for all files using the second option in search, A word or phrase in the file. I searched for format number and bingo, 1 result from 747 files! And it was the correct file. :-) To add multiple keywords or phrases, separate them with semicolons (;).
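If you'd rather script the keyword search, a rough Powershell sketch along these lines can read the same Keywords field through the Shell COM object. Treat the column index as an assumption: the GetDetailsOf column number for Keywords varies between Windows versions, so check it on your machine first.

```powershell
# Sketch: list files in a folder whose summary Keywords contain 'format number'.
# Column 18 is assumed to be Keywords here; this index differs by Windows version.
$shell  = New-Object -ComObject Shell.Application
$folder = $shell.Namespace('D:\Powershell')
foreach ($item in $folder.Items()) {
    $keywords = $folder.GetDetailsOf($item, 18)
    if ($keywords -match 'format number') { $item.Path }
}
```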

Thursday, March 26, 2009

SQL Metadata: What to store and where to store it

Being at the Consulting and Contracting end of the IT industry, I've moved around a bit in my career. The same challenge is faced when starting at a new client: What are all these databases used for? Who owns them? How are they accessed? There are a number of questions that simply can't be determined from an ERD or querying a dmv or looking at the underlying objects.

This information is database metadata, and it is almost never stored in the database; often it isn't stored anywhere at all. That brings me to my first point: database metadata should be stored in the database. This ensures that the metadata does not get orphaned, lost or decoupled from what it is describing. It gets backed up. It is easily viewable. It is secure. Ok, so we have decided on a place to store this metadata, but what exactly are we going to store?

I've found that the best source of answers to the "what metadata are we going to store" question is the resident DBA / sys admin staff. They know the questions that are relevant to their databases and that cannot be answered from the database itself. That highlights my next point: don't keep database metadata that can be derived from existing metadata in system tables or DMVs, like "Database Creation date" (stored in sys.databases), or the biggest tables and their record counts (not static, and viewable in many different ways).

So what are valid metadata attributes? There are a number of them, like Database Owner, Access methods, growth requirements, backup requirements, etc. I've come up with a list of 14 attributes at my current client and I think that's a good amount: not so much that it's tedious to maintain, but at the same time still useful.

Ok, so we've talked about the reason behind keeping metadata and what we are going to keep and where. Now we'll talk about the how bit. You may be familiar with three system stored procedures that add, update and drop metadata. These are:


sp_addextendedproperty
sp_updateextendedproperty
sp_dropextendedproperty

These are documented in BOL. As each name suggests, these are used to Add, Update and Delete extended property information, or in our case, metadata. Once we've added our metadata we can view it using a function called fn_listextendedproperty.

I add the DB_NAME() metadata function to give the database metadata context:

SELECT DB_NAME() AS DbName, name as PropertyName, value as PropertyValue
FROM ::fn_listextendedproperty(NULL, NULL, NULL, NULL, NULL, NULL, NULL)

Great, so now we have a mechanism to store and view our metadata. But I think we need to take a step back. Let's suppose that we decide that "Database Owner" is a good database metadata attribute. Now we go forth and run sp_addextendedproperty 'Database Owner', 'Joe Black'. On our next database we fervently run sp_addextendedproperty 'Db Owner', 'Joe Smith'. Do you spot the problem? If I want an enterprise view of all databases on all servers and their respective owners, I would be in a pickle, because I'm adding the same attribute, Database Owner, but it's spelled differently on each database. So we need to enforce domain integrity. Simply defined: a domain defines the possible values of an attribute. I want to make sure that in all my databases on all my servers the business owner metadata attribute is stored as "Business Owner".

What I do is create a wrapper (DJ Sproc) around sp_addextendedproperty. Code follows:

CREATE PROC usp_AddMetadata
@MetadataName VARCHAR(100), @MetadataValue VARCHAR(500)
AS
BEGIN

IF @MetadataName NOT IN ('AppDesc','AppType','BackupReq','Contacts','GrowthReq','Lifespan',
'Owner','Type','InitSize','DevDBName','ProdDevQA','SSISPack','BusOwner','ConnectionType')
BEGIN
	RAISERROR('Metadataname is not valid. Must be in the following list: AppDesc, AppType, BackupReq, Contacts, GrowthReq, Lifespan, Owner, Type, InitSize, DevDBName, ProdDevQA, SSISPack, BusOwner, ConnectionType', 16, 1)
	RETURN
END

EXEC sp_addextendedproperty @name = @MetadataName, @value = @MetadataValue

END

What the above code does is essentially enforce domain integrity for the metadata attribute names, which were defined in the first "what to store" step above. Now I need to deploy this wrapper sproc to all databases on all my servers so that I am enforcing domain integrity on all database metadata. The reason I need to do this, and not just create a wrapper sproc in master or model, is that when adding metadata the current database context is used. There is no way of specifying that metadata x belongs to db y; you have to physically change the database context to the relevant user database. And as we know this is tricky: I've checked, and there doesn't seem to be a way of issuing a "USE DB1" statement and then executing another statement that will use DB1 as its context.
So I deploy my wrapper sproc to each database. Great. I have 50 SQL servers with an average of 40 user databases on each!! No problem, enter sp_msforeachdb, xp_cmdshell and osql, my arsenal of little T-SQL nukes. String them together as follows and we have the powerrrrrr:

sp_msforeachdb 'xp_cmdshell ''osql -SSQL1 -E -d? -i"f:\work\sql\2005\usp_AddMetadata.sql"'''

What this line of code does is run the undocumented but very powerful sp_msforeachdb system stored procedure: for each database on the server, run the supplied command. Inside it I use xp_cmdshell to call osql, a command-line utility that can run .sql files from a given location. So I save my wrapper sproc on the F drive in the above location (ideally this should be a file share on the network). The above line of code therefore runs the usp_AddMetadata creation script against each user database on server SQL1. Now if I have 40 SQL servers I take this line of code, paste it into Excel (or I could do it in T-SQL) with my server list, do some string concatenation and generate 40 lines of code, one for each server. Boom!!
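The same concatenation can be done in T-SQL rather than Excel. As a sketch, assuming a hypothetical dbo.ServerList table holding the server names, something like this would generate one deployment line per server (the doubled-up quotes are just T-SQL escaping for the nested quotes in the output):

```sql
-- dbo.ServerList is a hypothetical table with one ServerName per row.
-- Emits one sp_msforeachdb/xp_cmdshell/osql line per server, ready to paste and run.
SELECT 'sp_msforeachdb ''xp_cmdshell ''''osql -S' + ServerName
     + ' -E -d? -i"f:\work\sql\2005\usp_AddMetadata.sql"'''''''
FROM dbo.ServerList
```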

Clearly the above approach was prior to me learning the power of Powershell. The above approach is valid, but a bit clunky. I'll get to re-writing it in Powershell when I get a few spare CPU cycles.

As an aside, because I've added the usp_AddMetadata sproc to the model system database, any new databases will have the wrapper sproc added to them automatically.

In the same way that I've created a wrapper for sp_addextendedproperty, I also create a wrapper for sp_updateextendedproperty, with the same logic and constraints.

The above is all very cool and in some ways geeky. The real power, from a management or CIO point of view, comes when I create a view or SSRS report that shows a list of all databases on all servers and who each database owner is. Or all databases that are past their sell-by date. Or all databases that have grown to more than double their initial size. Now that is seriously powerful: an enterprise, 10,000-foot view of my SQL environment.
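As a sketch of what such a view could be built from, on a single server sp_msforeachdb can gather every database's extended properties into one result set. The temp table name and the 'Owner' attribute below are illustrative:

```sql
-- Collect all database metadata on one server into a temp table, then query it.
CREATE TABLE #Metadata (DbName sysname, PropertyName sysname, PropertyValue sql_variant)

EXEC sp_msforeachdb 'USE [?];
INSERT #Metadata
SELECT DB_NAME(), name, value
FROM ::fn_listextendedproperty(NULL, NULL, NULL, NULL, NULL, NULL, NULL)'

SELECT DbName, PropertyValue FROM #Metadata WHERE PropertyName = 'Owner'
```

Run that across the server list (linked servers, or the osql trick above) and the enterprise report writes itself.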

Besides the CIO view, from an audit, capacity planning or provisioning point of view, this metadata makes DBAs' and sys admins' jobs a whole lot easier.

Wednesday, March 25, 2009

Querying the MSCS Cluster Log using Log Parser and Powershell

When encountering problems with a Windows High Availability Cluster, the first place one would look would be Cluster Administrator and then the Windows Event Logs. Seldom, though, have I seen the cluster logs being looked at. This log (in Windows 2003 MSCS) is an anomaly in the Microsoft world. For one, its size is configured in a system environment variable. Another odd behaviour is that when the log is full, the first half is truncated and then reused. Why not just write everything to the Event Logs and have a separate log for the Cluster Service? And lastly, the format of this log even defies Log Parser, the king of making sense of large text file logs.

Enter Log Parser and Powershell.

I've written a Powershell script to:

  1. Prompt the user for the Cluster Server name with an input box
  2. Get the contents of the Windows cluster log in the default location
  3. Scrub the log
  4. Export the cleaned log to a log file
  5. Create a LogParser Object
  6. Use this LogParser Object to query the cleaned log file and only return Warnings and Errors
  7. Output these Warnings and Errors to a .csv file

Here is the Code:

$a = new-object -comobject MSScriptControl.ScriptControl
$a.language = "vbscript"
$a.addcode("function getInput() getInput = inputbox(`"Enter Cluster Server Name`",`"Cluster Log Viewer`") end function" )
$s = $a.eval("getInput")
$gcs = '\\'+$s+'\c$\windows\cluster\cluster.log'
$var = gc $gcs
$var=$var -replace 'Code0000', "Code`r`n0000"
$var=$var -replace 'ERR ', 'ERR '
$var=$var -replace 'Volume Manager', '[VMg]'
$var=$var -replace 'ERR SQL Server Agent', 'ERR [SQA]'
$var=$var -replace 'ERR IP Address', 'ERR [IPA]'
$var=$var -replace 'WARN Network Name', 'WARN [NNm]'
$var=$var -replace 'ERR Network Name', 'ERR [NNm]'
$f = 'c:\'+$s+'Clean.log'
$var | ac $f
$cl = 'c:\'+$s+'Clean.csv'
$myQuery = new-object -com MSUtil.LogQuery
$objInputFormat = New-Object -com MSUtil.LogQuery.TextLineInputFormat
$objOutputFormat = New-Object -com MSUtil.LogQuery.CSVOutputFormat
$strQuery = "select EXTRACT_TOKEN(Text, 0, ' ') as StringDate, EXTRACT_TOKEN(Text,1, ' ') AS Sev, EXTRACT_TOKEN(Text,2, ' ') AS Source, EXTRACT_TOKEN(Text,3, ' ') AS D1, EXTRACT_TOKEN(Text,4, ' ') AS D2, EXTRACT_TOKEN(Text,5, ' ') AS D3, EXTRACT_TOKEN(Text,6, ' ') AS D4, EXTRACT_TOKEN(Text,7, ' ') AS D5, EXTRACT_TOKEN(Text,8, ' ') AS D6, EXTRACT_TOKEN(Text,9, ' ') AS D7, EXTRACT_TOKEN(Text,10, ' ') AS D8, EXTRACT_TOKEN(Text,11, ' ') AS D9, EXTRACT_TOKEN(Text,12, ' ') AS D10, EXTRACT_TOKEN(Text,13, ' ') AS D11, EXTRACT_TOKEN(Text,14, ' ') AS D12, EXTRACT_TOKEN(Text,15, ' ') AS D13, EXTRACT_TOKEN(Text,16, ' ') AS D14, EXTRACT_TOKEN(Text,17, ' ') AS D15, EXTRACT_TOKEN(Text,18, ' ') AS D16, EXTRACT_TOKEN(Text,19, ' ') AS D17, EXTRACT_TOKEN(Text,20, ' ') AS D18, EXTRACT_TOKEN(Text,21, ' ') AS D19, EXTRACT_TOKEN(Text,22, ' ') AS D20, EXTRACT_TOKEN(Text,23, ' ') AS D21, EXTRACT_TOKEN(Text,24, ' ') AS D22, EXTRACT_TOKEN(Text,25, ' ') AS D23, EXTRACT_TOKEN(Text,26, ' ') AS D24 INTO "+$cl+" from "+$f+" WHERE EXTRACT_TOKEN(Text,1, ' ') NOT LIKE '%INFO%'"
$myQuery.ExecuteBatch($strQuery, $objInputFormat, $objOutputFormat)

Even this process doesn't render a clean report. I then use SQL to bulk insert this .csv file into a SQL table and further manipulate the data there. I guess this could have been done in Powershell; it's just that for text manipulation I'm more familiar with T-SQL.

CREATE TABLE [dbo].[ClusterLog](
[StringDate] [nvarchar](50) NULL,
[Sev] [nvarchar](50) NULL,
[Source] [nvarchar](50) NULL,
[D1] [nvarchar](50) NULL,
[D2] [nvarchar](50) NULL,
[D3] [nvarchar](50) NULL,
[D4] [nvarchar](50) NULL,
[D5] [nvarchar](50) NULL,
[D6] [nvarchar](50) NULL,
[D7] [nvarchar](50) NULL,
[D8] [nvarchar](50) NULL,
[D9] [nvarchar](50) NULL,
[D10] [nvarchar](50) NULL,
[D11] [nvarchar](50) NULL,
[D12] [nvarchar](50) NULL,
[D13] [nvarchar](50) NULL,
[D14] [nvarchar](50) NULL,
[D15] [nvarchar](50) NULL,
[D16] [nvarchar](50) NULL,
[D17] [nvarchar](50) NULL,
[D18] [nvarchar](50) NULL,
[D19] [nvarchar](50) NULL,
[D20] [nvarchar](50) NULL,
[D21] [nvarchar](50) NULL,
[D22] [nvarchar](50) NULL,
[D23] [nvarchar](50) NULL,
[D24] [nvarchar](50) NULL
)

bulk insert ClusterLog FROM 'D:\Powershell\SQL3clean.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2) -- terminators assumed for the generated csv; FIRSTROW = 2 skips the header row

UPDATE ClusterLog
SET Source = REPLACE(REPLACE(Source, CHAR(91), ''), ']', '')

select 'SQLCL3' AS Cluster
,DATEADD(hh, 2, REPLACE(SUBSTRING(StringDate, PATINDEX('%::%',StringDate)+2, 19), '-', ' ')) AS TimeLogged
,"Severity" = CASE
WHEN Sev LIKE 'WARN' THEN 'Warning'
WHEN Sev LIKE 'ERR' THEN 'Error'
ELSE Sev
END
,"Source" = CASE
WHEN Source LIKE 'API' THEN 'API support'
WHEN Source LIKE 'ClMsg' THEN 'Cluster messaging'
WHEN Source LIKE 'ClNet' THEN 'Cluster network engine'
WHEN Source LIKE 'CP' THEN 'Checkpoint Manager'
WHEN Source LIKE 'CS' THEN 'Cluster service'
WHEN Source LIKE 'DM' THEN 'Database Manager'
WHEN Source LIKE 'EP' THEN 'Event Processor'
WHEN Source LIKE 'FM' THEN 'Failover Manager'
WHEN Source LIKE 'GUM' THEN 'Global Update Manager'
WHEN Source LIKE 'INIT' THEN 'Initial state'
WHEN Source LIKE 'JOIN' THEN 'The node state that follows INIT when the node attempts to join a cluster'
WHEN Source LIKE 'LM' THEN 'Log Manager. Maintains the quorum log.'
WHEN Source LIKE 'MM' THEN 'Membership Manager, also known and written to the cluster log as Regroup (RGP)'
WHEN Source LIKE 'NM' THEN 'Node Manager. Keeps track of the state of other nodes in the cluster'
WHEN Source LIKE 'OM' THEN 'Object Manager. Maintains an in-memory database of entities, or objects (nodes, networks, groups)'
WHEN Source LIKE 'RGP' THEN 'Regroup, Tracks which nodes are members of the cluster'
WHEN Source LIKE 'RM' THEN 'Resource Monitor'
WHEN Source LIKE 'SQA' THEN 'SQL Server Agent'
WHEN Source LIKE 'NNm' THEN 'Network Name'
WHEN Source LIKE 'IPA' THEN 'IP Address'
WHEN Source LIKE 'VMg' THEN 'Volume Manager'
ELSE Source
END
,D1 + ' ' + D2 + ' ' + D3 + ' '+ D4+ ' '+ D5 + ' '+ D6 + ' '+ D7 + ' '+ D8 + ' '+ D9 + ' '+ D10 + ' '+ D11 + ' '+ D12 + ' '+ D13 + ' '+ D14 + ' '+ D15 + ' '+ D16 + ' '+ D17 + ' '+ D18 + ' '+ D19 + ' '+ D20 + ' '+ D21 + ' '+ D22 + ' '+ D23 + ' '+ D24 AS ErrorDescription
from ClusterLog
WHERE DATEADD(hh, 2, REPLACE(SUBSTRING(StringDate, PATINDEX('%::%',StringDate)+2, 19), '-', ' ')) > getdate()-2

Above is what the Cluster Log will look like, which is finally useful. Also, with the data being in a SQL Table, we could automate the entire process above and display Cluster Log errors and warnings for the last x days in a Reporting Services report.

Tuesday, March 24, 2009

Defragmenting a partition with a large heavily fragmented database file

I'm sure you've come across this before, you have a 100GB partition with a 60GB database mdf sprawled across the partition in 8600 fragments. Nice! IO performance will be severely impacted, especially if this partition (more often than not) is on a RAID 5 array. In addition to this there is only 15% free space on this partition. Now your task is to defragment the partition and the mdf.

Well the easiest way to accomplish this, if you have another partition with enough space to temporarily house the 60GB mdf is to move the mdf, defrag the data partition and move the mdf back, hopefully into one contiguous space. This is a classic DBA scenario which involves some T-SQL or SSMS work, some OS file system work, some OS defrag work, more file system work and then finally some T-SQL work. All good and well, but now you want to automate it, so that you don't forget about your quest halfway through the process.

Enter Powershell. Sure, you could do some of the file system stuff using xp_cmdshell. But that little gem should be locked down no? And what about the defrag? Re-enter Powershell.

In a few lines of code this process can be accomplished from start to finish. In the code below I've created a function that does the tasks outlined above. That is:

  1. Detach the database

  2. Move the mdf and ldf files

  3. Defragment the fragmented partition

  4. Move the mdf and ldf files back

  5. Attach the database


Function DetDefAt {
Param ([string]$svr, [string]$dbname, [string]$mdf, [string]$ldf, [string]$mdft, [string]$ldft, [string]$Drive2Def)

#detach db
$s = get-sqlserver $svr
$s.DetachDatabase($dbname, $false)

#move files
move-item $mdf $mdft;
move-item $ldf $ldft;

#defrag the partition (note: win32_volume names include a trailing backslash, e.g. 'D:\')
$v = gwmi win32_volume
$v1 = $v | where {$_.name -eq $Drive2Def}
$v1.Defrag($true)

#move files back
move-item $mdft $mdf
move-item $ldft $ldf

#attach db
$f = New-Object -Type System.Collections.Specialized.StringCollection
$f.Add($mdf) | out-null
$f.Add($ldf) | out-null
$s.AttachDatabase($dbname, $f)
}


Cool. The function takes 7 input parameters: the server name the task will be run against, the database name, the mdf and ldf paths, the temp locations where the mdf and ldf will be moved to, and the partition to defrag.
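A hypothetical call, with purely illustrative server name and paths, would look like this:

```powershell
# Example only: server, database, current file paths, temp locations, partition to defrag.
# Note the win32_volume name format for the last parameter ('D:\', with trailing backslash).
DetDefAt 'SQL1' 'MyDb' 'D:\Data\MyDb.mdf' 'D:\Data\MyDb_log.ldf' `
         'E:\Temp\MyDb.mdf' 'E:\Temp\MyDb_log.ldf' 'D:\'
```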

Rebuild all indexes on all tables in a database using Powershell

For one or more reasons you may want to do database maintenance like index defrags or rebuilds using Powershell. The good news is that this is quite straightforward. If you instantiate the Microsoft.SqlServer.Management.Smo.Database class, you have access to a Tables collection, and each Table object in that collection has a RebuildIndexes method which takes a fill factor as an input parameter. So we have the tools to do what we need; now it's a case of writing the logic to rebuild the indexes on each table. With Powershell that's a breeze, since we can run a foreach loop. So assuming that $db is our database object, with $db.tables being the tables collection, we could write:

foreach ($t in $db.tables) {$t.RebuildIndexes(90)}

Another Powershell one-liner! This is the simplest form of rebuilding all indexes in a database using Powershell. One could extend this to rebuild all tables in all user databases on a server, or on a list of servers, or rebuild indexes on specific tables only.
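As a sketch of that extension, assuming SMO is installed and taking the server name 'SQL1' as a placeholder, rebuilding every table in every user database could look like this:

```powershell
# Sketch: rebuild all indexes in all user databases on one server, fill factor 90.
# 'SQL1' is illustrative; IsSystemObject filters out master, model, msdb and tempdb.
[void][Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.Smo')
$s = New-Object Microsoft.SqlServer.Management.Smo.Server('SQL1')
foreach ($db in ($s.Databases | Where-Object {-not $_.IsSystemObject})) {
    foreach ($t in $db.Tables) { $t.RebuildIndexes(90) }
}
```

Wrap the outer loop in a foreach over a server list file (as in the boot events script above) and you have server-wide maintenance in a handful of lines.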