You are here: Home » SharePoint 2007: Revenge of the 100GB Database

SharePoint 2007: Revenge of the 100GB Database

Posted by Russ Houberg
No Comments »

Right…he's a Star Wars geek.  Check.

I wanted to discuss something else I heard a lot about at the SharePoint 2008 conference.  The 100GB database limitation.

Organizations are now looking at SharePoint as a legitimate large scale application.  They want to believe.  They want to engage.  Then they all hit their heads on the same thing.  100GB database size recommendation.  Folks… it's a recommendation.  The answer to the question of can we go bigger is the same as what I heard several times throughout the conference… "it depends".  If properly architected and with quality disaster recovery solutions in place, the content database can be larger.

So what I want to discuss is that the 100GB requirement is a guideline driven primary by SLA requirements.  The point being that you have to be able to back up and/or restore the content databases in an amount of time that is reasonable for your business.  If you're doing log shipping or have a disk to disk backup rig with an acceleration component from a Quest or Avepoint and you can nail a backup quick like, then you can go larger than 100GB! 

The only minor performance issues that I've seen with large content database center around large list updates.  For example, if you add a column or a column index to list or library that has several million content items in it then some of the data tables in the content database will be locked until the change has completed.  This will effectively lock out all other users from accessing any content in that content database until the change has completed. 

I have seen at least one content database of 400+ GB in size and I've heard of others that are about 1TB!  While 1TB is definately pushing it quite a lot and performance isn't as good as with a smaller database it is usable.  With a small number of users or in an archive scenario it could be acceptable.  The 400+GB database runs fine.  So I want to give you some tips if you are comfortable with going larger than 100GB:

  • I/O is everything.  If you know you are going to have a very large content database, then you'll do well to be generous with your storage gear.
  • RAID 5 is a minimum, RAID 10 is better
  • BEFORE you create your site collection, pre-create an empty content database.  Add data files to the empty content database such that you have 1 data file for every processor "core" in your SQL Server.
  • If at all possible, place the individual files on a separate LUN or physical set of spindles
  • LUNs can be large enough to accommodate multiple data files from DIFFERENT databases
  • MONITOR the Average Disk Queue lengths of the (hopefully different) LUNs.  You want to see them under 2 if possible.  If you're in the decimal range then you're golden.  If you're in the single digits then you're acceptable.  If you see ADQ numbers into the double, triple, or quadruple digits, then you've got problems that need to be addressed.

For example, lets say I my corporation has collected 4TB of content over the last 5 years and we want to move it all into SharePoint.  For the sake of this example, we'll ignore the fact that once stored in SharePoint, the content will take up more than 4TB of space.  Also, we have an 8 core SQL server with say 32GB RAM.  You could possibly shuffle that content out as follows:

Create (8) 1TB RAID 5 or RAID 10 LUNs.  Lets say we map those LUNs to drives H: through O:.  Note that you could just as easily mount them to empty folders if you don't want to use drive letters.  With an 8 core SQL Server and 8 content database luns, I can create 8 files per content database and put one of them on each of the different LUNs (neet how that worked out for this example!). 

  • With this rig we could pre-create 20 content databases. 
  • All of the database [dbname].MDF files would be on the H: drive. 
  • We then add [dbname2-7].NDF files on the i: through o: drives
  • We then create our 20 site collections probably using the "stsadm -o createsiteinnewdb" command

We then go through the effort of getting the content into SharePoint.  <ShamelessPlug>KnowledgeLake has the framework to get this done by the way.</ShameLessPlug>  Once the 4TB of content is done being loaded into the 20 site collections, you will find that each content database is approximately 200GB in size.  That means that each of the 8 data files for a given database is actually 25GB and spread across each of the 8 LUNs.  We now have a 200GB database with excellent I/O numbers and we still have room to double in size without worrying too much about I/O performance.  Of course, your mileage may vary depending on how the LUNs are configured and the performance characteristics of your SAN.

I want to be clear that this is a hypothetical example of one possible solution.  Every organization has variables that would affect this architecture, thus fulfilling the "it depends" mantra.

Russ

Your email is never shared.
Required fields are marked *




Allowed tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>