Here’s another interesting crash problem I worked on a few days ago: the application pool was randomly crashing and the following errors were logged:
Event Type: Error
Event Source: .NET Runtime 2.0 Error Reporting
Event Category: None
Event ID: 1000
Date: 21/08/2009
Time: 12:23:12
User: N/A
Computer: <computername>
Description: Faulting application w3wp.exe, version 6.0.3790.3959, stamp 45d691cc, faulting module kernel32.dll, version 5.2.3790.4062, stamp 462643a7, debug? 0, fault address 0x0000000000027d8d.
Event Type: Warning
Event Source: W3SVC
Event Category: None
Event ID: 1011
Date: 21/08/2009
Time: 12:23:14
User: N/A
Computer: <computername>
Description: A process serving application pool <AppPoolName> suffered a fatal communication error with the World Wide Web Publishing Service. The process id was xxx. The data field contains the error number.
Even if with some difficulties (we had to disable Error Reporting as described here) we managed to capture a coupe of dumps for this crash; dumping the exception was not very helpful:
0:027> !gle
LastErrorValue: (Win32) 0x3f0 (1008) – An attempt was made to reference a token that does not exist.
LastStatusValue: (NTSTATUS) 0xc000007c – An attempt was made to reference a token that doesn’t exist. This is typically done by referencing the token associated with a thread when the thread is not impersonating a client.
0:027> !printexception
There is no current managed exception on this thread
But dumping the very long stack gave some good hint at least about a potential problem (a stack overflow):
0:027> kpL Child-SP RetAddr Call Site 00000000`05fcae90 00000642`7f895333 kernel32!RaiseException+0x73 00000000`05fcaf60 00000642`78390730 mscorwks!`string'+0x81ef3 00000000`05fcb130 00000642`780a9522 mscorlib_ni!System.OrdinalComparer.Equals(<HRESULT 0x80004001>)+0x70 00000000`05fcb170 00000642`7809822a mscorlib_ni!System.Collections.Hashtable.KeyEquals(<HRESULT 0x80004001>)+0x42 00000000`05fcb1a0 00000642`748f8976 mscorlib_ni!System.Collections.Hashtable.get_Item(<HRESULT 0x80004001>)+0x15a 00000000`05fcb260 00000642`748ff9bc System_ni!System.Collections.Specialized.NameObjectCollectionBase.FindEntry(<HRESULT 0x80004001>)+0x16 00000000`05fcb290 00000642`80580f48 System_ni!System.Collections.Specialized.NameValueCollection.Get(<HRESULT 0x80004001>)+0xc 00000000`05fcb2c0 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x48 00000000`05fcb300 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb340 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb380 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb3c0 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb400 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb440 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb480 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb4c0 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb500 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb540 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb580 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb5c0 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb600 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb640 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb680 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb6c0 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb700 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb740 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb780 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb7c0 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb800 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb840 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb880 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d 00000000`05fcb8c0 00000642`8058108d MyTest!MyClass.Web.Pages.BasePage.get_Language(<HRESULT 0x80004001>)+0x18d [...]
Using the err.exe utility to lookup the exit code of the process when it crashed, we can confirm we had a stack overflow:
Event Type: Warning
Event Source: W3SVC
Event Category: None
Event ID: 1009
Date: 21/08/2009
Time: 13:44:11
User: N/A
Computer: <computername>
Description: A process serving application pool <AppPoolName> terminated unexpectedly.
The process id was ‘5392’. The process exit code was ‘0x800703e9‘.
C:\>err 0x800703e9
# for hex 0x800703e9 / decimal -2147023895
COR_E_STACKOVERFLOW corerror.h
# Is raised by the EE when the execution stack overflows as
# it is attempting to ex
# as an HRESULT: Severity: FAILURE (1), FACILITY_WIN32 (0x7), Code 0x3e9
# for hex 0x3e9 / decimal 1001
ERROR_STACK_OVERFLOW winerror.h
# Recursion too deep; the stack overflowed.
# 2 matches found for “0x800703e9”
In the IIS log the request that lead to the crash was like this:
GET | /MyApp/test.aspx?currentPreferences=2&ParentId=1&ClientDateTime=Wed%20Aug%2019%202009%2009:09:57%20GMT+0100%20(GMT%20Standard%20Time)&TimeZone=-60&Language=undefined&Currency=undefined&OrgUnitId=undefined |
Basically some user was using a robot to scrape the web site and the Language parameter (along with a few others) was not correctly set; in this case the customer’s code was not properly validating those parameters (using a browser they are always valid) and this cause the endless recursion that ultimately exhausted the stack.
Needless to say, improving the parameter validation logic resolved the problem ?
Carlo
Quote of the day:
Seeing ourselves as others see us would probably confirm our worst suspicions about them. – Franklin P. Adams