Sometimes it’s obvious, but not for everyone

I just realized that my last post is 20 days old… I spent this time working on the support calls I own (of course, that’s my job!), reading documentation and playing around to get ready for the upcoming releases I’m quite sure you have already heard about (Vista, Ajax, IIS 7, Internet Explorer 7…), reading some good books in my spare time (currently I’m almost done with The Twelfth Card by Jeffery Deaver) and, among other things, searching for (and finding) a new home (well, actually my girlfriend decided where we’ll live for the next few years; I guess you know how this kind of thing works…).

While I had to deal with contracts and terminology I’m not familiar with, I realized that sometimes people tend to take some things and facts for granted, while other people may not know what they are talking about…

This also applies to our business: I had some conversations with customers getting “Application Server Unavailable” errors in their ASP.NET sites, and we then discovered that they were trying to run two different CLR versions (usually ASP.NET 1.1 and 2.0) in the same worker process… This is not allowed, and I always thought this was clear, but a couple of customers asked me for some written documentation about this limitation, and despite my research I have not been able to find anything specific enough for my needs.
I’m considering writing a post about it, so a couple of days ago I sent an email to my team alias asking if someone could point me to such docs. I had a short but interesting email thread with Doug, and basically this is a summary of that conversation:

Why does the customer need such documentation? This is just “common knowledge” to most people who have worked with the CLR for any length of time. The core runtime is implemented in a native DLL, mscorwks.dll (and sometimes mscorsvr.dll), and as most Win32 developers know, you can only have one version of a DLL loaded in a process (unless you get into Win32 SxS). It’s a bit like someone saying to Fiat “where is the documentation that you can only have one engine per car?”.

If you install the 1.1 SDK, and look at the doc for the CLR hosting interfaces (\Program Files\Microsoft.NET\SDK\v1.1\Tool Developers Guide\docs\Hosting Interfaces.doc) you’ll find statements like
“CorBindToRuntimeEx is the primary API that hosts use to load the CLR into a process”
“GetCORVersion returns the version number of the CLR that is running in the current process.”

It is implicit in the use of language here that there is only one per process. If you could have multiple CLRs per process we would have written something like
“GetCORVersion returns the list of version numbers of the CLRs that [is]are running in the current process.”

Unfortunately I had already tried that path in my talks with customers, and a couple of them simply said “If it was written somewhere I would not have run into this error”. That is a questionable statement (not because I don’t trust our customers, but because it’s hard to believe that anyone can read and remember all the documentation Microsoft and other IT companies release about their products; at least I have to admit I’m not that kind of person), but they still wanted it…

I guess it is something like Starbucks writing “Warning, your coffee may be hot!” (http://blogs.msdn.com/dougste/archive/2006/10/10/Warning_3A00_-this-coffee-may-be-hot.aspx ): sometimes it’s obvious but sometimes it’s not, at least not for everyone…
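Coming back to the mixed CLR versions for a moment: since I could not find the limitation written down, here is a minimal sketch of how one could at least spot the risky configuration up front. It assumes that each application pool maps to one worker process, and that you have already collected, by hand or with your own tooling, a list of applications with their pool and ASP.NET version. The inventory, the field names and the find_mixed_pools helper are all hypothetical; the script does not read anything from IIS.

```python
from collections import defaultdict


def find_mixed_pools(apps):
    """Return {pool: sorted CLR versions} for pools hosting more than one version.

    `apps` is a hypothetical inventory: a list of dicts with the keys
    "site", "pool" and "clr". Nothing here queries IIS itself.
    """
    versions_by_pool = defaultdict(set)
    for app in apps:
        versions_by_pool[app["pool"]].add(app["clr"])
    # A pool is a problem only if two or more distinct versions share it.
    return {
        pool: sorted(versions)
        for pool, versions in versions_by_pool.items()
        if len(versions) > 1
    }


# Made-up example data, for illustration only.
apps = [
    {"site": "/shop", "pool": "DefaultAppPool", "clr": "1.1"},
    {"site": "/blog", "pool": "DefaultAppPool", "clr": "2.0"},
    {"site": "/intranet", "pool": "IntranetPool", "clr": "2.0"},
]

for pool, versions in find_mixed_pools(apps).items():
    print(f"{pool} mixes CLR versions: {', '.join(versions)}")
```

Any pool reported here is a candidate for the “Application Server Unavailable” error described above; the usual way out is to move the applications of one version into a pool of their own.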

To reinforce this, a couple of weeks ago I took a case raised by a customer who had some random crashes in his web application, and in this case the easiest thing we can do to troubleshoot the problem is capture a crash dump with adplus.vbs and analyze it. As always, after a phone call with the customer to discuss the problem and get a better picture of the application, I sent him an email (actually a template I use for this kind of problem, to save time and avoid writing the same things over and over for every customer) with the action plan and detailed steps to set up the debugger. Easy… then we just had to wait for the crash to happen again.
After a few days I had a phone call from this customer telling me that they had had the crash, but no dump was captured… we tried to understand what went wrong, and after some discussion he told me that he had connected to the problematic web server through a Terminal Server session. Nothing strange about this, but he then told me that they had configured their server to automatically end a user session after 30 minutes of inactivity…

That’s the problem! If the terminal session is terminated, it’s easy to imagine that the processes you left running in that session (like adplus) will be terminated too… so when the next crash occurred the debugger was already dead, and we missed the dump (and also wasted almost a week, because we had to configure adplus again on the console session and wait another 4 days to get the new crash and the dumps).

When I spoke to that customer I had the clear impression that he was a well-prepared technician, probably with a lot of experience, but he pointed out that the KB article I sent him (http://support.microsoft.com/?id=828222) doesn’t say whether the terminal session has to be left open, so he logged off when done…
Again, sometimes it’s obvious but sometimes it’s not, at least not for everyone…

 

Cheers
Carlo
