Well, what started today as a brief attempt to install a self registration web part on the SharePoint site turned into an ugly episode of trying to resolve a nasty security issue.
Basically, what happened was that I installed a web part that I got from Nick Swan's blog. You can view the article at
Registration WebPart for Forms Authentication SharePoint 2007 sites, and see that the code and text are fairly old. However, I decided to try it anyway, and everything seemed fine until I tried to view the page with the web part and got a 403 error instead.
Unable to remove the web part, I used the "?contents=1" trick to view the web part maintenance page and disable the offending part. Thank god that trick still works in MOSS 2007 or I'd be hosed. The immediate problem went away, but when I tried to add a new blog post about the URL hack, I got the 403 again.
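For anyone who wants to script the "?contents=1" trick, here is a minimal Python sketch. The function name and the example.com URLs are made up for illustration; the only part taken from the steps above is the contents=1 query parameter itself.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def maintenance_url(page_url):
    """Return page_url with contents=1 added to its query string.

    Hypothetical helper: it only rewrites the URL and never contacts the server.
    """
    parts = urlsplit(page_url)
    # Keep any existing query parameters, but drop an old "contents" value
    # so the parameter is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "contents"
    ]
    query.append(("contents", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder URL, not the real site.
    print(maintenance_url("http://example.com/default.aspx"))
```

Opening the resulting URL in a browser should bring up the web part maintenance page for that page, assuming the account you are logged in with is allowed to edit it.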
Not sure what was causing the issue, I started to panic. Had I removed the NT AUTHORITY\local service account from the Users and Groups? Indeed, I had, and my permissions for other FBA based accounts were kind of a mess too. But, when I tried to put them back, "BAM!", 403 again. Basically, any page that required making a change or addition to a list of any kind would fail with 403.
At this point I should say that I have been using FBA on this site for a while without issue. Of course, why would I be trying to use the aforementioned web part otherwise? I decided to try to implement dual authentication on the site (see
Configuring Multiple Authentication Providers for SharePoint 2007). I did not know whether that would help, but intuitively I suspected it might.
So I followed the steps in that article, more or less in reverse, because I had already implemented FBA and was now trying to create a secondary site with Windows Authentication instead. That went well: I was able to use the new site to re-add the users I'd removed, and all the pages that were getting 403 under FBA worked fine under Windows Authentication. Go figure!
I struggled for a while and tried many things to no avail. I couldn't find anything useful in the SharePoint diagnostic logs. Adding various accounts to the Site Collection Administrators or the Web Application Policy did not seem to help either.
Eventually I came across
this RSS feed, see also
this link. Both point to the Security Update for ASP.NET 2.0 (KB928365) as the cause of a very similar intermittent problem. Well, lacking anything else to try, I happily uninstalled it and rebooted.
Finally, in a fit of scientific ignorance, I decided to remove the original web part from the GAC just to be sure. Of course, when I tested everything after the reboot, it worked like a charm. But what caused the issue, the web part or the security update? Now I have to go back and test it to find out!
Stay tuned.
Update: Well, I got the Registration Web Part installed and working, so at least in the absence of the security patch it can't be blamed for today's issue. When I get home, I will have to reapply the security patch and see if the issue magically reappears. Who knows?!
Update II: Ahh, there's nothing quite as refreshing as SharePoint and a nice cold beer! Let's reinstall this security hotfix and see if we can replicate our issue from earlier today, shall we?
WTF? Well, I reinstalled the security patch, but everything still works ok. The articles and forums seem to indicate the problem is linked to the web application pool recycling, so I guess we'll try that next.
Update III: Well, that about clinches it. Recycling the application pool caused the error to come back, just like I thought it might. I'm not quite sure if maybe I forgot to do an iisreset and just delayed the inevitable that way, or if it was actually waiting for the recycle to blow up. Either way, doing iisreset after the error, and even a reboot, did not resolve the issue. I get the 403 on the page with the registration web part as well as many of the built in add or update pages for SharePoint. It seems pretty clear that there is some kind of compatibility issue. I'm going to do a search and see if I can find anything about it. If so, I'll post it here.
Consolidated Update Rollup IV: Looking into this a little deeper, it seems the problem may be due to installing the .NET security update without also installing the WSS 3.0 updates that are in sync with its changes. Specifically, I am looking into
KB932091 (March),
KB936056 (August), and the BIG one early September
KB941422. Of these, Office Update only recommended the second one. The September patch seems to be a rollup of many earlier patches, but I'm trying the August patch first just to see if it resolves the issue from the .NET patch released in July.
Late Night Update V: (I was way too tired, so I revised this to make more sense once I woke up.)
Several hours (and several beers) later, I am finally done. It turned out that only the third update for WSS 3.0 listed above actually fixed the 403 problem. However, it required a WSS database schema update, and it bombed on half the upgrade tasks in the SharePoint Products and Technologies Configuration Wizard.
This left my schemas out of sync with my version of WSS. And... and this is important - THERE IS NO ABILITY TO REMOVE THE 9-5-2007 WSS UPDATE. So, I ran "stsadm -o upgrade -inplace" to retry. The first thing I noticed was that I was running the wrong stsadm version, so I updated my PATH variable to point to the v12 folder, then ran again and looked at the Upgrade.log file to see if I could find the issue.
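If you would rather not depend on PATH picking the right stsadm, the full path can be spelled out. Below is a rough Python sketch that builds the command line against the v12 folder. It assumes a default install, where the v12 BIN folder lives under Common Files\Microsoft Shared\web server extensions\12; the function names are mine, and the snippet only assembles the command, it does not run it.

```python
import ntpath
import os

# Assumed default location of Common Files; adjust if your install differs.
DEFAULT_COMMON_FILES = r"C:\Program Files\Common Files"


def stsadm_path(common_files=None):
    """Build the path to the v12 stsadm.exe (hypothetical helper)."""
    base = common_files or os.environ.get("CommonProgramFiles", DEFAULT_COMMON_FILES)
    # ntpath is used so the Windows-style path is built the same way on any OS.
    return ntpath.join(
        base, "Microsoft Shared", "web server extensions", "12", "BIN", "stsadm.exe"
    )


def upgrade_command(common_files=None):
    """Return the in-place upgrade command as an argument list."""
    return [stsadm_path(common_files), "-o", "upgrade", "-inplace"]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(" ".join('"%s"' % arg if " " in arg else arg for arg in upgrade_command()))
```

Printing the command first makes it easy to confirm which stsadm version is about to run before handing the list to something like subprocess.run on the server.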
It turned out that my farm/application pool account was not authenticating properly. The logs indicated that the format for the account was a munged version of the DOMAIN and UPN username formats, like "DOMAIN\username@domain.com" - wtf?! It should be one or the other, but not both. (I dimly remember encountering this issue when I upgraded from SPS2003, but I don't remember what I did to get past it at the time; I should've taken notes.) Also, using the -farmuser parameter did not seem to have any effect on which account was being used.
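To make the "one or the other but not both" point concrete, here is a small Python sketch that sorts an account name into DOMAIN\username style, UPN style, or the munged mix of the two. The function and its labels are my own invention for illustration; this is not how WSS itself parses account names.

```python
def classify_account(name):
    """Classify an account name by its format (hypothetical helper).

    Returns one of: "down-level" (DOMAIN\\username), "upn" (username@domain.com),
    "mixed" (both markers present), or "bare" (neither marker present).
    """
    has_domain_prefix = "\\" in name
    has_upn_suffix = "@" in name
    if has_domain_prefix and has_upn_suffix:
        return "mixed"
    if has_domain_prefix:
        return "down-level"
    if has_upn_suffix:
        return "upn"
    return "bare"


if __name__ == "__main__":
    # Sample names only; the first one is the munged format from the logs.
    for account in (r"DOMAIN\username@domain.com", r"DOMAIN\username", "username@domain.com"):
        print(account, "->", classify_account(account))
```

A check like this could be run over the account names that show up in a log to spot the mixed ones quickly.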
Left with few options, I decided to do a full takedown of SPS 2003 and WSS 2.0, since they were no longer being used and were causing me grief. The uninstall for SPS 2003 took *forever* and then locked up. I had to kill the process, but it seems to have removed everything important.
I also created a completely new application pool account, and a new one for the search crawler too. At this point everything seemed to be working properly again, so I went ahead and performed updates for Office 2003.
So what's the takeaway here? For starters, don't rely on automatic updates or just install everything Windows Update recommends without digging a little deeper first to see if there are dependencies for other products that WU doesn't help you upgrade. My server of course was hopelessly behind the times and needs to be brought up to speed, but I think that I'll save Win 2003 SP2 and IE7 for another day.
Until the next major issue, IT'S TIME TO GET A BEER AND WATCH TV! Yay!
More Info: I found some other people having similar problems. I'll save this section for a list of links to their blogs, etc.