When it comes to debugging without source code, the tools fall into two categories:
- Disassemblers/debuggers such as IDA Pro, WinDbg, etc. If the running application is written in .NET, using WinDbg with SOS is really handy. These tools are helpful if you need to track and understand program flow and execution state, such as the stack and memory, for subtle issues. They also tend to be favorites of crackers who want to crack or bypass license schemes.
- Tracing tools. Some trace network traffic, such as tcpTrace, Ethereal, Fiddler and of course Netmon. For monitoring process, file and registry activity on the OS itself, nothing beats the tools from Sysinternals, now part of Microsoft. The top two I recommend are Process Monitor and Process Explorer. The tools in this category are your friends in many debugging scenarios, especially during the initial stage of understanding the issue at hand.
Here I will go over a debugging experience I had this week with a piece of legacy financial software that uses a file-based database (no, it's not Access or any other Microsoft product). For anonymity reasons I will not name this software.
Anyway, the database files used by this software are installed on a network share so that the clients on users' desktops can all use them. You have probably guessed by now that locking has been a common issue, but that's not what I plan to cover here.
There is a custom-built web application in house (a classic ASP app hosted on IIS 5) that has to look up data from this database, which, as I mentioned, is a set of files on the network share. I was told that when the web app was being designed, the developer found that although a VB Winform application could access the database through ODBC just fine, an ASP application couldn't. Without further investigation, a separate ETL program was developed to transfer the needed data to the SQL database used by the ASP application. Fortunately the data doesn't change much, so this ETL program runs once every night. Suffice it to say that such a moving piece has had its moments with IT operations, although it is not the worst pet peeve.
I came to know this application and the ETL program recently. When I asked why we couldn't query and cache the data from the file-based database in the ASP app, the answer was that the ODBC connection didn't work in ASP although it worked in the ETL program written in VB. I was told that the developer hadn't really debugged the application but had instead called the software vendor's help desk, which was not at all helpful, like most tier 1 support from such vendors. Intuition was telling me that this problem had to do with the security context, such as impersonation and logon sessions. The question I would ask right away when facing this type of issue: does the security context in ASP have a permission problem, or some other problem, accessing this share?
Determined to fix this problem and get rid of the ETL program, I started on the developer machine by configuring the IIS application with the high isolation process model. This causes a server COM+ application to be created (remember, I had to deal with IIS 5! :-) Then I set the identity of the COM+ application to a domain service account with proper permission to the share where the database files are located. With IIS directory security configured for integrated Windows authentication, the ASP app worked! But the production server uses basic auth so that users can access this app from the Internet. Alas, when basic auth was configured, it stopped working, even when I typed in the same domain user name and password that the developer had used to log on to the desktop. Since IIS 5 impersonates when either Windows auth or basic auth is turned on, there had to be a difference between the logon sessions associated with the impersonation token under the two authentication types. Using a tool called TokenDump from Keith Brown, which I downloaded a long time ago, I was able to confirm that 1) with Windows auth, the ASP impersonation token uses the same logon session as the interactive user, and 2) with basic auth, the impersonation token uses a different logon session from the desktop's, although both are for the same domain user account. This is probably by design: I conjecture that IIS makes a LogonUser() call with the user name and password passed in, which would result in a different logon session.
But why was a different logon session running into the problem? I used Process Monitor to take two captures, one for each auth type. When I compared the two, I noticed that the basic auth capture didn't have a single trace showing an attempt to access the network share. Going further, I found out from the developer that during ODBC and client setup, the legacy financial software used a mapped drive to point to the network share where the database files reside. This is where the answer to the puzzle lies. As you may know, a mapped drive is tied to the logon session and is therefore not global. The logon session created under basic auth has no such mapping, so the drive letter never resolves to the share, which explains why the capture showed no attempt to reach it. Here is the official documentation on MSDN. We should avoid mapped drives, especially for enterprise server applications.
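I compared the two captures by eye, but the same comparison can be done mechanically. Here is a minimal sketch in Python, assuming each capture was saved from Process Monitor as CSV and that the export has a `Path` column. The function names and the `prefix` filter are my own placeholders, not part of any tool.

```python
import csv
import io


def paths_touched(csv_text, prefix=""):
    """Return the set of paths in a capture that start with `prefix`.

    Assumes the CSV has a header row containing a "Path" column.
    The prefix match is case-insensitive, like Windows paths.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = prefix.lower()
    return {
        row["Path"]
        for row in reader
        if (row.get("Path") or "").lower().startswith(wanted)
    }


def missing_paths(working_csv, failing_csv, prefix=""):
    """List paths seen in the working capture but never in the failing one."""
    seen_ok = paths_touched(working_csv, prefix)
    seen_bad = paths_touched(failing_csv, prefix)
    return sorted(seen_ok - seen_bad)
```

To use it, read each exported file into a string (the export may start with a byte order mark, so opening it with `encoding="utf-8-sig"` is a safe guess) and pass in the drive letter or share you care about as `prefix`. An empty result for the failing capture's side is exactly the "no trace at all" symptom described above.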
The last step was to change the drive letter to a UNC path. Alas, the legacy software's UI explicitly disallows it. Not wanting to give up at this point, I asked the developer to use Process Monitor to find out where (registry or configuration files) the software keeps the data that contains the drive letter. Sure enough, we found a handful of values in the registry under the keys created by the software installation. I went ahead and manually changed all the relevant registry settings, then ran the ASP application with basic auth. Voila, it worked! So the software must already be using the Win32 I/O APIs that support UNC paths, even though its UI still disallows them.
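The edit itself is just a string substitution on each stored path. As a rough illustration, here is a small Python sketch of that rewrite. The `to_unc` helper, the `Z:` letter and the `\\fileserver\finance` share are all made up for the example; I changed the real values by hand in the registry editor.

```python
import ntpath


def to_unc(path, drive_map):
    """Rewrite a mapped-drive path to its UNC equivalent.

    `drive_map` maps an upper-case drive such as "Z:" to the share root
    it was mapped to, e.g. r"\\fileserver\finance" (hypothetical names).
    Paths on drives that are not in the map are returned unchanged.
    """
    drive, tail = ntpath.splitdrive(path)
    share = drive_map.get(drive.upper())
    if share is None:
        return path
    root = share.rstrip("\\")
    rest = tail.lstrip("\\/")
    return root + "\\" + rest if rest else root


# Example with made-up values:
mapping = {"Z:": r"\\fileserver\finance"}
print(to_unc(r"z:\data\ledger.db", mapping))
```

Using `ntpath` rather than `os.path` keeps the Windows path rules in effect no matter where the script runs, which is handy if you export the registry values and process them on another machine.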
This experience shows how handy the right tools can be in hairy situations such as this one. I do caution you, though, to be extra careful when changing registry settings like this, as it may cause unexpected behavior and/or breach the license agreement.
3 comments:
Thanks! This is an excellent walk through. You should post more real world debugging stories like this one.
ouch...painful
Xin, I couldn't agree more with you about the pain that legacy software has caused. The moral of the story I wanted to show is that fundamental knowledge plus good tools can get us a long way.