Calm Before the Storm

May 6th, 2008 by Russ Houberg

Well, it's been a few weeks since I finished the SharePoint scalability work.  The storage whitepaper I wrote is going over well with customers and the public in general and the big daddy SharePoint Scalability whitepaper will soon be posted on TechNet.  Also, I'll be presenting material from the storage whitepaper at the Dallas and Houston SharePoint User Group meetings in May.

So it's weird that I'm not totally slammed with work.  I had kind of gotten used to it.  For the moment I'm doing minor troubleshooting work and retrofitting the scalability recommendations into client environments.  So with a few moments to come up for air, I decided to take stock and retool a bit.

First up, what's the state of web development?  Man am I behind!  I've had my head buried in the SharePoint sand for so long that I hadn't even heard of LINQ!  And it came out like a long time ago!  I ran across it while working on a timesheet application.  I wanted a nice project to polish up the skills so I set my sights on a LINQ driven AJAX enabled timesheet application.  The app is coming along nicely and I really like LINQ. Talk about easy!  I spent a single day on the data model and BAM the data layer was done for me and I'm updating data tables.  Sweet!

Also, I got around to taking the rest of the SharePoint MCTS exams.  I've passed all 4 now!  I scraped by on one of them but I thought I did pretty good for not using exam cram materials.  I took the tests cold baby, with only real world experience to guide me.  I'm pretty stoked that I got those out of the way.

Finally, I took an evening and whipped out a quick site for a Harley Raffle for my church.  If you live in the US and you want to throw $10 into the ring to see if you can win a Harley, check out the site here!  I promise, it's on the up and up.  I've seen this SWEET machine in person and Frieze Harley-Davidson is holding the title and the keys to ensure it stays "new"! 

So I don't expect the calm to last long.  There are rumblings of a possible extension to the massive SharePoint scalability effort I had been working on.  I hope they pan out.  I love this scaling SharePoint stuff.

Cheers

Scaling SharePoint 2007 – Storage Architecture

April 11th, 2008 by Russ Houberg

Ok.  So after writing that last post on the 100GB database limitation I got some nice feedback from several people.  Also, I'm finding that this is information that really needs to get "out there".

Far too often, KL is brought in to implement a document imaging or file share conversion solution in SharePoint.  The problem lies in the fact that oftentimes the sandbox that everyone learns to play in becomes production!  This ends up presenting a lot of challenges for us as consultants when trying to implement solid solutions.  So I want to help out where I can.

To that end, I've created a whitepaper that will help the SharePoint Administrator to implement a farm with a solid storage foundation.  It's possible to take bits and pieces from this whitepaper in order to ensure that you've implemented a storage foundation that scales according to YOUR needs, be they small scale or large scale.  Even in a smaller scale environment, there are steps we can take to optimize file I/O and SQL Server performance.

So I waited a while for somewhere more official to post this document.  Eventually it will be up on the KnowledgeLake website and even referenced in an upcoming Microsoft SharePoint scalability whitepaper.  But for now, my personal web space will have to do.  This whitepaper has been done for a week now and has just been waiting for a home.

So.. Here you go world… I hope this helps!

Please download Scaling SharePoint 2007 – Storage Architecture and let me know what you think! 

It should be an easy read and I think you'll get a lot out of it!

SharePoint 2007: Revenge of the 100GB Database

March 8th, 2008 by Russ Houberg

Right…he's a Star Wars geek.  Check.

I wanted to discuss something else I heard a lot about at the SharePoint 2008 conference.  The 100GB database limitation.

Organizations are now looking at SharePoint as a legitimate large scale application.  They want to believe.  They want to engage.  Then they all hit their heads on the same thing.  100GB database size recommendation.  Folks… it's a recommendation.  The answer to the question of can we go bigger is the same as what I heard several times throughout the conference… "it depends".  If properly architected and with quality disaster recovery solutions in place, the content database can be larger.

So what I want to discuss is that the 100GB recommendation is a guideline driven primarily by SLA requirements.  The point being that you have to be able to back up and/or restore the content databases in an amount of time that is reasonable for your business.  If you're doing log shipping or have a disk to disk backup rig with an acceleration component from a Quest or AvePoint and you can nail a backup quick like, then you can go larger than 100GB!
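To put rough numbers on that SLA point, here's a minimal sketch of the arithmetic.  The throughput and SLA figures are made-up assumptions for illustration, not measurements from any real backup rig, and the function names are hypothetical.

```python
def transfer_hours(size_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move size_gb at a sustained throughput (1GB = 1024MB)."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb * 1024 / throughput_mb_per_s / 3600


def fits_sla(size_gb: float, throughput_mb_per_s: float, sla_hours: float) -> bool:
    """True if a backup or restore of size_gb finishes inside the SLA window."""
    return transfer_hours(size_gb, throughput_mb_per_s) <= sla_hours


if __name__ == "__main__":
    # Assumed numbers: 100MB/s sustained throughput and a 1 hour restore window.
    for size_gb in (100, 200, 400):
        hours = transfer_hours(size_gb, throughput_mb_per_s=100)
        verdict = "fits" if fits_sla(size_gb, 100, sla_hours=1) else "misses"
        print(f"{size_gb}GB takes about {hours:.2f} hours and {verdict} a 1 hour SLA")
```

Plug in your own measured throughput and your own SLA.  The faster the rig, the bigger the content database you can live with.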

The only minor performance issues that I've seen with large content databases center around large list updates.  For example, if you add a column or a column index to a list or library that has several million content items in it then some of the data tables in the content database will be locked until the change has completed.  This will effectively lock out all other users from accessing any content in that content database until the change has completed.

I have seen at least one content database of 400+ GB in size and I've heard of others that are about 1TB!  While 1TB is definitely pushing it quite a lot and performance isn't as good as with a smaller database, it is usable.  With a small number of users or in an archive scenario it could be acceptable.  The 400+GB database runs fine.  So I want to give you some tips if you are comfortable with going larger than 100GB:

  • I/O is everything.  If you know you are going to have a very large content database, then you'll do well to be generous with your storage gear.
  • RAID 5 is a minimum, RAID 10 is better
  • BEFORE you create your site collection, pre-create an empty content database.  Add data files to the empty content database such that you have 1 data file for every processor "core" in your SQL Server.
  • If at all possible, place each of the individual files on a separate LUN or physical set of spindles
  • LUNs can be large enough to accommodate multiple data files from DIFFERENT databases
  • MONITOR the Average Disk Queue lengths of the (hopefully different) LUNs.  You want to see them under 2 if possible.  If you're in the decimal range then you're golden.  If you're in the single digits then you're acceptable.  If you see ADQ numbers into the double, triple, or quadruple digits, then you've got problems that need to be addressed.
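The Average Disk Queue rule of thumb in that last tip can be written down as a small sketch.  The thresholds come straight from the tip above.  The function name and the sample readings are hypothetical, and you would collect the real numbers yourself (from Performance Monitor, for example).

```python
def rate_disk_queue(avg_disk_queue_length: float) -> str:
    """Classify an Average Disk Queue length reading using the tip's thresholds."""
    if avg_disk_queue_length < 0:
        raise ValueError("queue length cannot be negative")
    if avg_disk_queue_length < 1:
        return "golden"       # decimal range
    if avg_disk_queue_length < 10:
        return "acceptable"   # single digits (under 2 is the target)
    return "problem"          # double digits or worse


if __name__ == "__main__":
    # Hypothetical readings, one per LUN.
    readings = {"H:": 0.4, "I:": 1.8, "J:": 7.5, "K:": 42.0}
    for lun, adq in readings.items():
        print(f"{lun} ADQ {adq} is {rate_disk_queue(adq)}")
```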

For example, let's say my corporation has collected 4TB of content over the last 5 years and we want to move it all into SharePoint.  For the sake of this example, we'll ignore the fact that once stored in SharePoint, the content will take up more than 4TB of space.  Also, we have an 8 core SQL Server with say 32GB RAM.  You could possibly shuffle that content out as follows:

Create (8) 1TB RAID 5 or RAID 10 LUNs.  Let's say we map those LUNs to drives H: through O:.  Note that you could just as easily mount them to empty folders if you don't want to use drive letters.  With an 8 core SQL Server and 8 content database LUNs, I can create 8 files per content database and put one of them on each of the different LUNs (neat how that worked out for this example!).

  • With this rig we could pre-create 20 content databases. 
  • All of the database [dbname].MDF files would be on the H: drive. 
  • We then add [dbname2-8].NDF files on the I: through O: drives
  • We then create our 20 site collections probably using the "stsadm -o createsiteinnewdb" command
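The file placement in the list above can be sketched as a little planning helper.  This only builds a list of paths.  It doesn't touch SQL Server.  The database name, the underscore-number naming of the secondary files, and putting the files at the root of each drive are my assumptions for illustration.

```python
from string import ascii_uppercase
from typing import List


def data_file_layout(db_name: str, first_drive: str = "H", file_count: int = 8) -> List[str]:
    """Return one data file path per LUN: the MDF first, then the NDF files."""
    start = ascii_uppercase.index(first_drive.upper())
    drives = ascii_uppercase[start:start + file_count]
    if len(drives) < file_count:
        raise ValueError("not enough drive letters after " + first_drive)
    paths = []
    for number, drive in enumerate(drives, start=1):
        if number == 1:
            paths.append(f"{drive}:\\{db_name}.MDF")            # primary data file
        else:
            paths.append(f"{drive}:\\{db_name}_{number}.NDF")   # secondary data files
    return paths


if __name__ == "__main__":
    # "ContentDB01" is a hypothetical database name.
    for path in data_file_layout("ContentDB01"):
        print(path)
```

With the defaults, the MDF lands on H: and the seven NDF files land on I: through O:, one file per LUN.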

We then go through the effort of getting the content into SharePoint.  <ShamelessPlug>KnowledgeLake has the framework to get this done by the way.</ShamelessPlug>  Once the 4TB of content is done being loaded into the 20 site collections, you will find that each content database is approximately 200GB in size.  That means that each of the 8 data files for a given database is actually 25GB and spread across each of the 8 LUNs.  We now have a 200GB database with excellent I/O numbers and we still have room to double in size without worrying too much about I/O performance.  Of course, your mileage may vary depending on how the LUNs are configured and the performance characteristics of your SAN.
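If you want to check my math, here's a quick sketch of the numbers in this example.  It treats 1TB as 1000GB to keep things round (my simplification) and, like the example, ignores SharePoint's storage overhead.  The function names are hypothetical.

```python
def per_database_gb(total_content_gb: float, database_count: int) -> float:
    """Content per database when it is spread evenly across the databases."""
    return total_content_gb / database_count


def per_file_gb(database_gb: float, files_per_database: int) -> float:
    """Size of each data file when a database fills its files evenly."""
    return database_gb / files_per_database


def per_lun_gb(file_gb: float, database_count: int) -> float:
    """Space used on one LUN when it holds one file from every database."""
    return file_gb * database_count


if __name__ == "__main__":
    database_gb = per_database_gb(4000, 20)   # 4TB over 20 content databases
    file_gb = per_file_gb(database_gb, 8)     # 8 data files per database
    lun_gb = per_lun_gb(file_gb, 20)          # one file per database on each LUN
    print(f"per database: {database_gb}GB, per file: {file_gb}GB, per LUN: {lun_gb}GB")
```

Each 1TB LUN ends up holding one 25GB file from each of the 20 databases, or about 500GB.  That's where the room to double comes from.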

I want to be clear that this is a hypothetical example of one possible solution.  Every organization has variables that would affect this architecture, thus fulfilling the "it depends" mantra.

Russ

Large Scale Architecture Question

March 8th, 2008 by Russ Houberg

So the Microsoft SharePoint Conference 2008 wrapped up on Thursday.  What an amazing ride!  It blows me away to see how SharePoint is absolutely exploding.  It fires up my passion for SharePoint technology even more!

Paul Learning, Andy Hopkins, and I did finally present during the last session slot of the conference.  There weren't as many attendees as I would have liked to see, but what the heck, I usually bail early on the last day too!  Anyway, with the smaller group of only about 30 folks it was a less formal session.  We were able to engage in some quality discussion around scalability and performance.

One person asked me a question that I'd like to address here.  He talked about the fact that they have a large volume of files in FileNet.  They'd like to move them into SharePoint but they can't come up with a way to logically group them such that they could keep the site collections/content databases inside of the 100GB recommended size.  I asked him how his users accessed the content in FileNet.  He didn't want to go there because the answer was that they "searched" for the content.  Then he said that his users had been exposed to SharePoint and had been accustomed to just navigating straight to the documents they need.

So of course my answer to him was something like, use MOSS search… embrace the search… love the search…!  Search is like the keys to the kingdom in a large scale MOSS implementation!  He didn't want to hear that answer unfortunately.  That brings me to two points. 

First, don't expose your users to technology you don't want them to have.  He feels like limiting direct navigation is like taking candy from a baby and I agree to an extent. So I encourage everyone to spend the time up front to design the system and train users up in the way that they should go from the beginning (whenever possible… I know it's hard).

Second, if you want to unlock the power of your MOSS implementation, you have to immerse yourself in the search capabilities of the MOSS platform.  Mind what you have learned!  Save you it can!  Ok, enough with the Yoda references.  Seriously, I get that it doesn't matter what you put in if you can't get it back out easily.  But we have to break the habits of the S: drive.  Hierarchical data structure doesn't help the new employee trying to navigate through 4TB of content!

Russ

Microsoft SharePoint Conference 2008

March 5th, 2008 by Russ Houberg

Uhhh.  If you haven't gotten the memo yet folks.  SharePoint is HOT!!!!

I've been to a few different conferences in the last couple years and it is very clear that interest in SharePoint is growing at a rapid pace.  Now that SharePoint has split off from the MS Office conference there is a pure focus at the MS SharePoint Conference 2008. 

Initial indications were compelling as the conference was sold out over a month in advance.  Then the hotels sold out.  Then upon arrival there were people literally hanging around outside hoping to get in.  One guy commented that if we all didn't have pre-registered RFID badges there probably would have been some serious ticket scalping going on!  It's not hard to see why.  I haven't been around this many smart cerebral type folks in a long time.  You can't help but learn something!

I grabbed a great session today on Planning for Scale and Capacity.  So much of what they had to say rang true.  Their "It Depends" answer to the inevitable "how many servers do I need" is something that I seem to have given to a lot of customers these days.  Fortunately, these guys have taken a lot of the guesswork out of capacity planning with their excellent Capacity Planning Tool. They did another demo of it today and I'm enamored by how easy it is to use!  It is wonderful to have such a powerful tool that takes into consideration so many variables!  Great work guys!

Another thing I'm excited about is the story around FAST search.  The message of where FAST slots into the mix is really shaping up.  It's clear that you have Search Express on smaller implementations, Enterprise Search on midrange to large implementations, and then there's FAST which can do the EXTREMELY large implementations with mind boggling scales of capacity and search response times.  I just don't see how any other portal product can compete with this holistic platform.  They've covered all the bases.  SharePoint scales to the moon.  And if you have an EA with Microsoft, YOU ALREADY OWN SHAREPOINT!  Calling all CIOs…let's get to work pulling together all of that intellectual content you've got scattered around!  It just got even harder to come up with a reason not to!

Anyway, it's shaping up to be an amazing conference.  I have some friends doing some sweet sessions.  Darrin Bishop (my mentor from back in the 2003 days) is doing cool things with PowerShell and SharePoint administration these days and I'm looking forward to Todd Baginski's session on SSO.  Unfortunately, I missed his BDC session today but I heard it was excellent.

Oh yeah, and what a great perk Microsoft… Free Prometric testing for the "Configuring MOSS" and "Developing MOSS" tests!  I took a crack at the Configuring and knocked it out with flying colors so I'm officially an MCTS now!  I'm hoping to get a chance to take the "Developing MOSS" test tomorrow.

Also, I want to plug the session that I've been directly involved with!  Remember that whitepaper I mentioned in an earlier post?  Well, Paul Learning (MS), Andy Hopkins (MS), and I have been working on this scalability effort for the last couple months and Paul and Andy will be presenting on much of what will be in that whitepaper!  For any of you at the conference that happen to catch this blog from an aggregator somewhere, PLEASE don't bail on the conference early!  We're one of the last sessions on Thursday!  Come check out the SharePoint Scalability – Practical Application for the Enterprise session at 12:00pm!  It was a late entry and didn't make it into the schedule book.

Finally, thanks to Todd Baginski for talking us into waiting in line over 2 hours to get into the flight simulator at the Museum of Flight.  It was WELL WORTH THE WAIT!  Paul and I had a blast flipping and rolling that thing like crazy.  I'm looking forward to game night tomorrow!  I've been away from home only 2 days now and in addition to missing the wife and kids, I'm also missing my daily dose of Guitar Hero!  I hope they have it tomorrow!

Good times.

MOSS Search Results Can Be Near Real Time

February 28th, 2008 by Russ Houberg

Well, I'm not sure how many times I can mention KnowledgeLake and "Transactional Content Management" without getting flogged by the blog hosts for peddling our wares again… but here I go again.

So once again, I'll set the stage with the world I work in every day.  KL is all about facilitating document processing all the way from paper to grave.  By grave I mean the end of a document lifecycle.  So after KL Capture Server blasts a batch of documents into SharePoint we often take advantage of some form of workflow to kick off additional document/account processing. 

For example, imagine a lending branch scanning in and releasing a series of documents related to a loan application.  Upon receipt of the actual application document a workflow might be initiated.  Here's where it gets interesting.  During loan application processing there might be several approval steps that are based on peripheral documents such as income statements and/or loan collateral documentation.  If the institution is processing many loans per day, they don't have time to wait around for an incremental crawl to take an hour or sometimes even 15 minutes.

So what can we do to really tighten down search result availability?  Well in this type of environment I would architect the farm a certain way and set up the incremental crawl for the content source to fire literally every minute.  So the information below outlines how I would configure the farm to squeeze the absolute most performance out of crawl processing.

Implementation:
  • The farm should include a separate (and beefy) machine for the Index Server.  I recommend a box with a MINIMUM of 4 (64bit) CPU cores and 16GB RAM.  The Query role should not be enabled on this server.  Note that you can't mix 32bit and 64bit WFEs in the farm so if you're running 32bit front ends, stick with a 32bit Index Server.
  • In order to get that hefty Index Server to take advantage of available resources we need to force it to use more threads while crawling content.  We can do that using 1 of 2 possible techniques
  • OPTION 1: When configuring the "Office SharePoint Server Search" role on the Index Server, set the Indexer Performance to "Maximum":

image 

  • OPTION 2: We can create a crawler impact rule in Application Management => Manage search service => Crawler Impact Rules => Add Rule
    NOTE: Crawler Impact Rules take precedence over Indexer Performance Settings and since the default simultaneous requests is based on the number of processors on the index server, it's possible that the "Maximum" indexer performance setting could be overridden by the default crawler impact setting (even if no crawler impact rules exist).

image

  • Then, regardless of which option is chosen, we need to set the "Target" Web Front End to be the actual Index Server itself (WFE role must be enabled) or possibly a specific "target" WFE machine that would not be used for serving content to end users.

image

  • Finally, we set the incremental crawl schedule to fire in 1 minute increments.  Navigate to the Shared Services Administration page for your SSP.  Then click Search Settings => Content sources and crawl schedules => [Content Source Name].  Then click "Create[/Edit] schedule" under the Incremental Crawl field.  Set the values as identified below and click OK => OK.

image

That should do it.  You've just configured the search service to kick off incremental crawls in 1 minute intervals!  Shortly after an incremental crawl completes, if any changes were made to any of the index files, those changes will be propagated out to the Query (Search) servers.  Once that propagation has been processed, the content will be available for searching!

Monitoring Performance:
  • Keep an eye on the "Manage Content Sources" page in the SSP administration site.  It will tell you the indexing status. 
    • You want to watch the Indexing Status field.  It will say "Crawling Incremental" when it's crawling.  It should say "Idle" when it is finished crawling.  Refresh often to ensure that at some point during the 1 minute interval it is able to finish the incremental crawl.
    • If Indexing Status never changes to Idle then unfortunately you don't have the horsepower to maintain a 1 minute incremental crawl interval.  You should increase the interval by 1 minute until you verify that your crawl can complete in the allotted amount of time.
  • Keep an eye on the performance of your Index Server, Target Server (if applicable), and your SQL Server.  If ramping up crawl performance has created an uncomfortable increase in system resource utilization on ANY of these servers, you can either back down the crawl threads (Crawler Impact Rules/Indexer Performance) or you can increase the incremental crawl interval or both.
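The "bump the interval by 1 minute until the crawl finishes in time" advice above can be sketched like this.  It assumes you've timed a handful of incremental crawls yourself by watching the status page.  The function name, the 60 minute cap, and the sample durations are all hypothetical.

```python
from typing import Optional, Sequence


def smallest_workable_interval(crawl_durations_s: Sequence[float],
                               start_minutes: int = 1,
                               max_minutes: int = 60) -> Optional[int]:
    """Step the interval up 1 minute at a time until the longest observed
    crawl finishes inside it.  Returns None if nothing up to the cap works."""
    if not crawl_durations_s:
        raise ValueError("need at least one observed crawl duration")
    longest = max(crawl_durations_s)
    interval = start_minutes
    while interval <= max_minutes:
        if longest < interval * 60:   # crawl goes Idle before the next one fires
            return interval
        interval += 1
    return None


if __name__ == "__main__":
    # Hypothetical crawl timings in seconds.
    print(smallest_workable_interval([20, 35, 50]))
    print(smallest_workable_interval([45, 70, 130]))
```

If this hands back None, the crawl schedule isn't your problem.  Look at the hardware or the threading settings instead.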

Additional Points of Interest:
  • There are many factors related to crawl performance.  Everything from how powerful your Index, Target, and SQL Servers are to the I/O performance of the SQL Server databases.  The SSP Search database is particularly vulnerable as it can become very large quickly. 
  • Not all environments are the same.  Your mileage may vary.  For example, KnowledgeLake solutions often revolve around high volumes of TIFF files.  There is no TIFF iFilter available for MOSS out of the box so the "NULL" iFilter is used.  This means that the document metadata is gathered and inserted into the property store in the SSP Search database but the actual binary file doesn't have to be parsed.  So our indexing speed is often much faster.
  • With such a high load created on the Index Server and SQL Server during crawl processing, it's recommended that any Full Crawls be scheduled during off peak times (evenings and weekends, etc).  This is because the Full Crawl will obey the same threading rules used by the incremental crawl.  This could yield a very high level of stress on the SQL Server over an extended period of time.

OK.  That's about all I have to say about that.  Once again, the cool thing about SharePoint is that it is so configurable!  If the changes I specified here don't work for you, please don't flame me :)   !  Just back off of the threading or put the settings back where they started and you'll be just fine.

Windows Server 2003 Update Breaks RDP?

February 11th, 2008 by Russ Houberg

It seems that one of the Windows Server 2003 Updates manages to hose up our ability to remote desktop (RDP) into that server.  I've seen this happen several times now and never found a clean fix for it through internet research so I spent several hours one day trying to figure out how to overcome the problem.

First of all, there are many things that can impact remote desktop.  Often times the problem is related to a hardware router/firewall, other software firewall, or Windows Firewall.  Windows Firewall isn't enabled by default in Win Server 2003 but it is in Windows XP/Vista.  You can test to see if Windows Firewall is causing the problem pretty easily by disabling the Windows Firewall service in Service Manager and rebooting the box.  If you still can't get in, leave the firewall disabled until you fix the problem. 

So with that out of the way, this tip falls more into the category of "I didn't change anything and all of a sudden RDP isn't working".

I can't pinpoint which update causes the problem, but I believe I know what it does to break things.  It appears that the binding of the RDP protocol to the network adapters on the server becomes broken after the update.  In order to fix the problem, follow this procedure:

Start by running the Terminal Services Configuration tool.

  • Click on the Connections "folder"
  • Right click the RDP-TCP connection and select properties
  • Select Network Adapter tab
  • Change "All network adapters…" to the network adapter bound to the IP address that you use for RDP.  If it's already associated directly to that network adapter, then change to "All network adapters…"
  • Click OK
  • Reboot the server

I've found that if I follow this procedure after losing RDP to a Windows Server 2003 update, it works every time.  By the way, this can all be done in a WMI script remotely if you've got the skills for that.  I'm not a WMI script guru by any stretch but I was able to figure out the proper code in about an hour.

There are certainly other obstacles that can cause problems with RDP, but this is a big one that I don't think many people realize.  Hope this helps somebody.

SharePoint Scalability Whitepaper at SP Conference 08

February 8th, 2008 by Russ Houberg

Well, the last couple weeks have been a whirlwind.

In the past few months, I've been working hard with Paul Learning (Microsoft Consulting Services), Andy Hopkins (Technical Development Manager, Microsoft) as well as a few guys on the MS SharePoint Product team.  Paul and I have been busy loading up a massive Fujitsu server farm while Andy burns through logistical hurdles all in an effort to develop a SharePoint Scalability whitepaper.  It looks like we're going to have this whitepaper done in time for SP Conference 08!  So I thought I would provide a little background.

It's been a ridiculous ride.  We started with an incredible hardware rig from Fujitsu, complete with blade servers, rack servers, an Itanium SQL box and a full 10TB of storage space on a Fujitsu Eternus SAN.  My job was to use some of the KnowledgeLake secret sauce and a little creative multithreading to blast 50 million records into SharePoint.  Here's a hint as to the scalability of SharePoint… We were able to use the load tool to send the 50 million documents into SharePoint at a peak rate of 7+ MILLION documents PER DAY!

Anyway, word has leaked out that we were working on this whitepaper so I thought I might as well blog about it.  In the last month or so, we've had a lot of folks contacting Microsoft and KnowledgeLake about scalability architecture and possibly executing similar tests "on their hardware".  It's amazing how much buzz this is getting and we haven't even advertised it! 

So the whitepaper is coming soon!  There will be a big splash at Microsoft SharePoint Conference '08 and probably a couple webcasts after that.  We're here to sing the story.  SharePoint CAN SCALE, and we can prove it!

This is the real reason I wanted to start this blog.  I wanted a forum to talk about the incredible scalability story that SharePoint has to offer.  I'm a believer and I'm here to make you one too!

Todd Baginski and His Excellent SharePoint Dev Course

February 8th, 2008 by Russ Houberg

Just wanted to give a shout out to Todd Baginski and his excellent SharePoint Development Bootcamp training course.  I'm sure many of you that cruise SharePoint Blogs are quite familiar with him.  But for you Google Searchers (or Live Searchers Paul ;) ) who are looking for a review of his class, here's the bottom line.  If you haven't had a chance to attend one of his classes or at least one of his conference lectures, you're missing out.

I've been writing code for about 14 years now and I've been around the SharePoint block since late 2003.  I've been to many classes and lectures on various topics in the SharePoint space.  I've endured several that have been light on content and full of fluff with maybe one or two interesting points.  Not in Todd's class!  This guy has a deep understanding of the nuts and bolts that make up the collective SharePoint API and it was a pleasure to listen to him (even though business called me out of a few sessions)!

Also, he left behind a collective of resources that are simply amazing.  Added to my personal knowledge base is a collection of easily referenceable courseware chapters, code snippets, and various handy SharePoint utilities.  Good form Todd.  Good form.

On a personal note, I'd like to say something else about Todd.  For all the knowledge of SharePoint development concepts that he possesses, there isn't even a hint of ego. It's easy for confidence in one's abilities to bleed over into arrogance and that goes for any walk of life.  It's not a problem that Todd has.  He's a very humble person high on character and passionate about teaching others what he knows.  I really respect him for that.

Thanks Todd.  It was a pleasure.  See you at the MS SharePoint Conference '08.

Joel Oleson and the Anatomy of Indexing

January 29th, 2008 by Russ Houberg

Ok.  So here's the real reason that I chose today to on-ramp this blog.  I wanted to add a little something to Joel Oleson's post yesterday regarding the Anatomy of Indexing.

First of all, I'm sure anyone interested in my take on SharePoint is probably well aware of Joel Oleson.  Most of the SharePoint community, myself included, holds Joel in the highest respect.  I respect him for his wealth of SharePoint knowledge and his willingness to share it.  I met him recently and in addition to being a SharePoint guru, I found him to be a very likable guy in general.  Rock on Joel.

So about the Anatomy of Indexing.  It is a great read for anyone who's interested in that black box called SharePoint Indexing.  Please make sure you've read it before continuing…. Ok, I have just one little point I'd like to add from the document repository front.

During an incremental crawl, the call to the sitedata.asmx yields a result set of ALL the entries in the change log.  This is particularly important to KnowledgeLake as we tend to blast a lot of content into SharePoint in a short period of time (like during migration/conversion operations).  But this might also apply to standard SharePoint restore operations.  If you find yourself performing some type of action that will cause a high number of document changes all at once, all crawl schedules (full and incremental) should be disabled if at all possible.  If it is not possible to disable them, then the URL path to the library where content is being loaded/restored should be EXCLUDED from the crawl.

If this guidance is not followed, then the call to the sitedata.asmx web service will likely time out due to the sheer volume of data being packaged and shipped via XML (fat).  I've experienced this phenomenon first hand.  You end up with crawls that grind themselves into oblivion and yield a whole lot of unfriendly errors in the crawl log.  Once the load/restore operation is complete a full crawl should be executed.  Also, if production won't be impacted, it's also a good time to do a complete index reset.