Thursday, April 2, 2009

Tracking database growth over time

We had a problem recently where our Sharepoint environment space requirements have grown beyond what was originally envisaged. The Sharepoint team wanted to know how the different SQL databases have grown over time.

Dang I thought, we don't record the size of a database at a particular point in time, over time. Then I thought, well we back the databases up frequently and we could look at the backup file sizes to get an indication of how big the database is, since there is a reasonable correlation between data, index and log space used and the backup size. But we only keep 2 weeks worth of backups on disk and to recall tapes for a year would be a mission. Then it dawned on me that the backup history tables in the msdb database would contain the database name, backup time and backup size!

Great, so I have my data source. The Backupset table just so happens to have 200000 records in it, with data for 80 databases. Well we can exclude all db's that aren't Sharepoint databases, and we can only look at type = 'D' for full database backups. But we still have daily full backups for each sharepoint database for the last 18 months. From a trending point of view, without being overwhelmed by masses of data, I want to see database growth by month. Ok so I want the 1st, or last backup for each database for each month for the last 18 months. Now there's a bit of T-SQL to write. I figured a couple of While loops would do the trick to loop through each month and year, and in those loops have a cursor get the backup info into a table variable for each database. Here is the code:



DECLARE @Db_BackupHistory TABLE
(database_name VARCHAR(255),
backup_size BIGINT,
backup_start_date DATETIME)

DECLARE @Y INT
DECLARE @M INT

SET @Y = 2007
SET @M = 1

WHILE @Y < 2010
BEGIN
WHILE @M < 13
BEGIN

DECLARE @DB VARCHAR(255)
DECLARE CurDB CURSOR FOR
SELECT DISTINCT database_name FROM backupset where database_name like 'Sharepoint_%'
OPEN CurDB
FETCH NEXT FROM CurDB INTO @DB

WHILE @@FETCH_STATUS = 0
BEGIN
INSERT @Db_BackupHistory(database_name, backup_size, backup_start_date)
select TOP 1 database_name, backup_size, backup_start_date from backupset
WHERE database_name LIKE @DB and type = 'D'
and DATEPART(mm, backup_start_date) = @M and DATEPART(yy, backup_start_date) = @Y
order by backup_start_date DESC

FETCH NEXT FROM CurDB INTO @DB
END
CLOSE CurDB
DEALLOCATE CurDB

SET @M = @M + 1
END
SET @M = 1
SET @Y = @Y + 1

END

select database_name, backup_size/1024/1024 AS [BackupSize(MB)], backup_start_date from @Db_BackupHistory
order by database_name, backup_start_date



I'm sure there are other ways of doing this. Feel free to comment and add how you would do the above.

As a side, there is a post by Vyas Kondreddi posted in 2001 that does a similar thing, but doesn't summarize the size per database by month. It only returns all data for one database. Also, it reports size as the size of the mdf. If I create a sharepoint database with an initial size of 1GB and it takes 8 months to get to 1GB, then I won't have a true reflection of the actual size of the data, the size of the mdf with lots of empty space is academic. The actual data space used is important from a data growth point of view. Hence I use the backup_size field from the Backupset table.

The cool thing now is I can look at a database going back 18 months and see the size of the full backup at the end of each month. Now just pull this data into reporting services, have a series for each database and there we go! Database growth over time.

No comments:

Post a Comment