Reverse P/Invoke - Part 2

Posted by Hugh Ang at 10/14/2008 12:29:00 PM

In my last post, I described a loosely coupled pattern for native code to call into managed code. That approach requires control of the native library's source code, although only a minor change is needed. There are scenarios where the native source is not available and we simply cannot make any changes to it. For example, I have a Canon Rebel XTi camera connected to my PC, and I need to notify my WinForms application when a new picture has been taken so that the application can download and display it. The Canon SDK is a native library that exports functions for registering callback function pointers. So what do we do? Scenarios like this require us to call the native API from managed code, passing delegates marshaled as function pointers.

The solution is surprisingly straightforward. To demonstrate the technique, here is the native library code - note the definitions of the data structure and the callback function prototype:


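#define NATIVELIB_API __declspec(dllexport)

// data structure for the callback function
struct EventData
{
    int I;
    const wchar_t* Message;   // wide string, matching CharSet.Unicode on the managed side
};

// callback function prototype (__cdecl is the compiler default)
typedef void (*FPCallBack)(EventData data);

// exported API; extern "C" keeps the export name undecorated so that
// DllImport can find it by name (a .def file would work as well)
extern "C" NATIVELIB_API void fnNativeAPI(int i, FPCallBack callBack)
{
    EventData data;
    data.I = i;
    data.Message = L"Hello from native code!";

    // invoke the callback function
    callBack(data);
}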



And here is my corresponding interop code in C#. Notice that in the P/Invoke definition of fnNativeAPI, I added the MarshalAs(UnmanagedType.FunctionPtr) attribute in front of the CallBackDelegate parameter to instruct the runtime to marshal the delegate as a function pointer from managed code to native code. This is also the default marshaling for delegate parameters, so the attribute mainly documents the intent.


[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct EventData
{
    public int I;
    public string Message;
}

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
public delegate void CallBackDelegate(EventData data);

public class Native
{
    // The native export uses the default C++ calling convention (cdecl),
    // while DllImport defaults to StdCall on Windows, so state it explicitly.
    [DllImport("NativeLib.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern void fnNativeAPI(int i, [MarshalAs(UnmanagedType.FunctionPtr)] CallBackDelegate callBack);
}



Here is the C# code of the application:


public partial class Form1 : Form
{
    private CallBackDelegate _cb;

    public Form1()
    {
        InitializeComponent();
        _cb = new CallBackDelegate(Foo);
    }

    private void Form1_Load(object sender, EventArgs e)
    {
        // call the native API, passing our .NET delegate
        int i = 200;
        Native.fnNativeAPI(i, _cb);
    }

    // this is our callback function
    private void Foo(EventData data)
    {
        Debug.WriteLine(data.I);
        Debug.WriteLine(data.Message);
    }
}



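Careful readers may have noticed that I keep the delegate in a class field. This is a best practice to prevent the delegate from being garbage-collected while native code still holds the function pointer; the garbage collector cannot see references held by native code. In this sample the callback is invoked before fnNativeAPI returns, and the runtime keeps the delegate alive for the duration of the call. A library that stores the pointer and calls it later, as a registration-style SDK does, would otherwise end up invoking an invalid function pointer, which would be really bad.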
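
The lifetime concern is easiest to see from the native side when a library stores the pointer for later use. The following is a minimal C++ sketch under stated assumptions: RegisterCallback, FireEvent and OnEvent are hypothetical names made up for illustration and are not part of the sample library or of any real SDK. OnEvent stands in for the thunk that the runtime hands out for the managed delegate.

```cpp
#include <cassert>
#include <cstdio>

// Same shape as the sample's native struct.
struct EventData
{
    int I;
    const wchar_t* Message;
};

typedef void (*FPCallBack)(EventData data);

// Hypothetical registration API: the library keeps the pointer and
// calls it later, long after the registering call has returned.
static FPCallBack g_callback = nullptr;

void RegisterCallback(FPCallBack callBack)
{
    assert(callBack != nullptr);
    g_callback = callBack;
}

// Hypothetical event source, e.g. "a new picture was taken".
// Returns false if nobody has registered yet.
bool FireEvent(int i)
{
    if (g_callback == nullptr)
        return false;

    EventData data;
    data.I = i;
    data.Message = L"Hello from native code!";

    g_callback(data);   // the stored pointer must still be valid here
    return true;
}

// Stand-in for the managed callback.
static int g_lastValue = 0;

void OnEvent(EventData data)
{
    g_lastValue = data.I;
    std::printf("callback received %d\n", data.I);
}
```

Calling RegisterCallback(OnEvent) and, at any later time, FireEvent(200) invokes the stored pointer. If that pointer came from a delegate that has since been collected, the call inside FireEvent is where things would go wrong, which is why the managed side keeps the delegate in a field for as long as the registration lasts.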

For VB programmers, here is the interop code in VB.NET:


<StructLayout(LayoutKind.Sequential, CharSet:=CharSet.Unicode)> _
Public Structure EventData
    Public I As Integer
    Public Message As String
End Structure

<UnmanagedFunctionPointer(CallingConvention.Cdecl)> _
Public Delegate Sub CallBackDelegate(ByVal data As EventData)

Public Class Native
    ' cdecl matches the native export; DllImport defaults to StdCall on Windows
    <DllImport("NativeLib.dll", CallingConvention:=CallingConvention.Cdecl)> _
    Public Shared Sub fnNativeAPI(ByVal i As Integer, _
                                  <MarshalAs(UnmanagedType.FunctionPtr)> ByVal callBack As CallBackDelegate)
    End Sub
End Class



And the application code:


Public Class Form1
    Dim _cb As CallBackDelegate

    Public Sub New()

        ' This call is required by the Windows Form Designer.
        InitializeComponent()

        ' Add any initialization after the InitializeComponent() call.
        _cb = New CallBackDelegate(AddressOf Me.Foo)
    End Sub

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim i As Integer = 200
        'call the native API, passing our .NET delegate
        Native.fnNativeAPI(i, _cb)
    End Sub

    ' this is our callback function
    Private Sub Foo(ByVal data As EventData)
        Debug.WriteLine(data.I)
        Debug.WriteLine(data.Message)
    End Sub
End Class

Reverse P/Invoke

Posted by Hugh Ang at 9/30/2008 07:24:00 PM

While researching approaches for interop between .NET and native code for ongoing and upcoming projects, specifically for native code to call into .NET code, Reverse P/Invoke came to my attention as a viable option. Of course there is also the official Microsoft recommendation to expose .NET classes as COM components, which are then callable from native code that talks COM.

The Reverse P/Invoke approach allows native code to call into a .NET delegate through a function pointer. It could work well for my requirement: I need a way to fire an event from the native app to the .NET app, for instance when an application context change on the native side must be reflected on the .NET side.

The blog by Junfeng, however, does not give a concrete example of the Reverse P/Invoke approach. So I came up with a POC: a VS.NET solution with three projects, namely (1) a native console application (C++ project), (2) a managed class library (C# project) and (3) a mixed-mode DLL with an exported C++ function (C++/CLI project).

The POC simulates a native application (#1) that needs to notify managed code (#2) of data changes. The DLL (#3), compiled with the /clr switch, handles the interop details. Both the native app and the managed code require only minimal changes.

On the .NET side, we have a managed class that has a Foo() function and a GetDelegate() function that hands out a delegate to Foo to its caller.


public class ManagedClass
{
    private CallBackDelegate _delegate;

    public ManagedClass()
    {
        _delegate = new CallBackDelegate(this.Foo);
    }

    public CallBackDelegate GetDelegate()
    {
        return _delegate;
    }

    public void Foo(EventData data)
    {
        Debug.WriteLine(data.I);
        Debug.WriteLine(data.Message);
    }
}



EventData is a data structure that shares the same binary layout as the one that will be created and marshaled from the native code.


[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct EventData
{
    public int I;
    public string Message;
}
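
To make "the same binary layout" concrete, here is a small C++ sketch that pins down the native layout at compile time. It assumes the native struct declares Message as a wide-character pointer and that the build targets a common 32-bit or 64-bit Windows ABI, where the pointer is aligned to its own size. PrintLayout is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Native counterpart of the managed EventData struct.
struct EventData
{
    int I;
    const wchar_t* Message;
};

// I comes first; Message follows after padding up to pointer alignment.
static_assert(offsetof(EventData, I) == 0,
              "I is expected at offset 0");
static_assert(offsetof(EventData, Message) == sizeof(void*),
              "Message is expected at the next pointer-aligned offset");
static_assert(sizeof(EventData) == 2 * sizeof(void*),
              "no unexpected trailing padding");

// Prints the native layout so it can be compared with the managed side.
void PrintLayout()
{
    std::printf("offset of I:       %zu\n", offsetof(EventData, I));
    std::printf("offset of Message: %zu\n", offsetof(EventData, Message));
    std::printf("size of EventData: %zu\n", sizeof(EventData));
}
```

On the managed side, Marshal.OffsetOf and Marshal.SizeOf report the corresponding values for the marshaled struct; if the numbers disagree, the callback will read garbage.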



And here is the delegate definition. Note the attribute UnmanagedFunctionPointer with the calling convention.


[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
public delegate void CallBackDelegate(EventData data);



In the mixed mode dll, here is the definition of the EventData data structure and function pointer:


#pragma once

#include <windows.h>

// data structure for the callback function
struct EventData
{
    int I;
    const wchar_t* Message;   // wide string, matching CharSet.Unicode on the managed side
};

// callback function prototype
typedef void (*NativeToManaged)(EventData data);



And the exported function is defined as in the following. Note how the .NET delegate gets invoked through the function pointer.


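#define INTEROPBRIDGE_API __declspec(dllexport)

using namespace System;
using namespace System::Runtime::InteropServices;

INTEROPBRIDGE_API void fnInteropBridge(EventData data)
{
    ManagedLib::ManagedClass^ c = gcnew ManagedLib::ManagedClass();
    IntPtr p = Marshal::GetFunctionPointerForDelegate(c->GetDelegate());

    NativeToManaged funcPointer = (NativeToManaged) p.ToPointer();

    // invoke the delegate
    funcPointer(data);

    // keep the object (and the delegate it owns) alive until the call has returned
    GC::KeepAlive(c);
}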



Now in the native app, I have code that creates an instance of EventData and invokes the .NET code through the exported DLL function fnInteropBridge:


// forward declaration of the API function
void fnInteropBridge(EventData data);

int _tmain(int argc, _TCHAR* argv[])
{
    EventData data;
    data.I = 50;
    data.Message = L"Hello from native code!";

    fnInteropBridge(data);

    return 0;
}



In summary, I like this approach in that it provides quite an easy and non-invasive way for native code to call into managed code. It should work especially well in my scenario, where application context changes initiated from the native app need to be propagated to the managed code. Furthermore, besides polishing this up, I think I will add code to raise a .NET event from inside ManagedClass.Foo(). Then all interested .NET citizens on the managed app side can subscribe to it.

Follow-up of PIAB and WCF Article

Posted by Hugh Ang at 9/23/2008 11:09:00 PM

Since publishing the MSDN article on an approach to integrating PIAB into WCF, I have received quite a bit of feedback from folks using this approach in production. And I am glad to say it's working out successfully for them.

As I mentioned earlier in a comment, Tom Hollander of the p&p group found an issue with our approach when two PIAB-enabled WCF services are hosted in the same IIS worker process. He graciously shared his code with us and I was able to reproduce the problem when the two WCF services were hosted in two separate AppDomains of the same IIS worker process. Although in production deployments different WCF services would likely be hosted in separate processes or on different machines, I still wanted to figure out the root cause of this particular issue. Unfortunately, with work and other things in my life, I just didn't have time until last week, when I was finally able to sit down and focus on this problem.

I have to admit this has been the most tedious debugging I have ever done as I was deep in the guts of WCF and CLR, inspecting dynamic types, JIT compiler, call stubs, etc. without a whole lot of background in this area - I only wish I were working for the CLR team :-) After a few days, I finally found out the cause of the problem, which is pretty close to my initial hunch. A sense of elation at last!

Here it goes.

1. The setup to repro the problem
The setup is fairly straightforward. There are two WCF services, implementing IService and IAnotherService respectively (the code is pretty much verbatim from Tom):


[ServiceContract]
public interface IService
{
    [OperationContract]
    Foo GetFoo(int id);

    [OperationContract]
    void AddFoo(int i, [NotNullValidator] Foo foo);
}




[ServiceContract]
public interface IAnotherService
{
    [OperationContract]
    void LogFoo(Foo foo);
}



Both services would be enabled with PIAB of course:


[PolicyInjectionBehaviors.PolicyInjectionBehavior]
[ValidationCallHandler]
[LogCallHandler]
public class Service : IService
{
    private static Dictionary<int, Foo> store = new Dictionary<int, Foo>();
    IAnotherService anotherService;

    public Service()
    {
        BasicHttpBinding binding = new BasicHttpBinding();
        binding.SendTimeout = new TimeSpan(4, 0, 0);

        anotherService = new AnotherServiceClient(binding, new EndpointAddress("http://localhost/AnotherTestService/AnotherTestService.svc"));
    }

    public void AddFoo(int id, Foo foo)
    {
        store[id] = foo;
        anotherService.LogFoo(foo);
    }

    public Foo GetFoo(int id)
    {
        if (store.ContainsKey(id))
        {
            anotherService.LogFoo(store[id]);
            return store[id];
        }
        else
            return null;
    }
}




[PolicyInjectionBehaviors.PolicyInjectionBehavior]
[ValidationCallHandler]
[LogCallHandler]
public class AnotherService : IAnotherService
{
    public void LogFoo(Foo foo)
    {
        Logger.Write("LogFoo() called.");
    }
}



Foo is simply a DataContract object that holds an int property and a string property.

Now the services are hosted in two AppDomains in the same IIS worker process. This is what it looks like on my Vista machine:



As you can see, both services are in the DefaultAppPool. With both services set up, we can run the test harness, which first calls Service.AddFoo() and then Service.GetFoo(). The Service.AddFoo() call completes fine, but the Service.GetFoo() call fails with a System.Reflection.TargetException: Object does not match target type. The stack trace is as follows:


1017e344 79644832 System.Reflection.RuntimeMethodInfo.CheckConsistency(System.Object)
1017e350 793a4124 System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo, Boolean)
1017e39c 793a40a2 System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo)
1017e3bc 0f96e699 Microsoft.Practices.EnterpriseLibrary.PolicyInjection.RemotingInterception.InterceptingRealProxy+<>c__DisplayClass1.b__0(Microsoft.Practices.EnterpriseLibrary.PolicyInjection.IMethodInvocation, Microsoft.Practices.EnterpriseLibrary.PolicyInjection.GetNextHandlerDelegate)
1017e3ec 0f968ed1 Microsoft.Practices.EnterpriseLibrary.PolicyInjection.HandlerPipeline.Invoke(Microsoft.Practices.EnterpriseLibrary.PolicyInjection.IMethodInvocation, Microsoft.Practices.EnterpriseLibrary.PolicyInjection.InvokeHandlerDelegate)
1017e404 0f968a2e Microsoft.Practices.EnterpriseLibrary.PolicyInjection.RemotingInterception.InterceptingRealProxy.Invoke(System.Runtime.Remoting.Messaging.IMessage)
1017e418 79374dc3 System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(System.Runtime.Remoting.Proxies.MessageData ByRef, Int32)
1017e6b4 79f98b43 [TPMethodFrame: 1017e6b4] WcfPiabInstability.Services.IAnotherService.LogFoo(WcfPiabInstability.Services.Foo)
1017e6c4 0efd08dc DynamicClass.SyncInvokeGetFoo(System.Object, System.Object[], System.Object[])
1017e6d4 50b8d90b System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(System.Object, System.Object[], System.Object[] ByRef)
1017e74c 50b6d245 System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(System.ServiceModel.Dispatcher.MessageRpc ByRef)
1017e7a0 509137ad System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(System.ServiceModel.Dispatcher.MessageRpc ByRef)
1017e7e0 509136a6 System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(System.ServiceModel.Dispatcher.MessageRpc ByRef)
1017e80c 50913613 System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage3(System.ServiceModel.Dispatcher.MessageRpc ByRef)
1017e81c 50913459 System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage2(System.ServiceModel.Dispatcher.MessageRpc ByRef)
1017e82c 50912257 System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage1(System.ServiceModel.Dispatcher.MessageRpc ByRef)
1017e844 50911f8f System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean)
1017e888 509115ff System.ServiceModel.Dispatcher.ChannelHandler.DispatchAndReleasePump(System.ServiceModel.Channels.RequestContext, Boolean, System.ServiceModel.OperationContext)
1017ea34 5090f8c9 System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest(System.ServiceModel.Channels.RequestContext, System.ServiceModel.OperationContext)
1017ea78 5090f35e System.ServiceModel.Dispatcher.ChannelHandler.AsyncMessagePump(System.IAsyncResult)
1017ea8c 5090f2f1 System.ServiceModel.Dispatcher.ChannelHandler.OnAsyncReceiveComplete(System.IAsyncResult)
1017ea98 50232d68 System.ServiceModel.Diagnostics.Utility+AsyncThunk.UnhandledExceptionFrame(System.IAsyncResult)
1017eac4 50904501 System.ServiceModel.AsyncResult.Complete(Boolean)
1017eb00 50992b36 System.ServiceModel.Channels.InputQueue`1+AsyncQueueReader[[System.__Canon, mscorlib]].Set(Item)
1017eb14 50992215 System.ServiceModel.Channels.InputQueue`1[[System.__Canon, mscorlib]].EnqueueAndDispatch(Item, Boolean)
1017eb7c 50991ffb System.ServiceModel.Channels.InputQueue`1[[System.__Canon, mscorlib]].EnqueueAndDispatch(System.__Canon, System.ServiceModel.Channels.ItemDequeuedCallback, Boolean)
1017eba4 5091d7e5 System.ServiceModel.Channels.SingletonChannelAcceptor`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].Enqueue(System.__Canon, System.ServiceModel.Channels.ItemDequeuedCallback, Boolean)
1017ebc8 50977b7e System.ServiceModel.Channels.HttpChannelListener.HttpContextReceived(System.ServiceModel.Channels.HttpRequestContext, System.ServiceModel.Channels.ItemDequeuedCallback)
1017ec0c 5094f396 System.ServiceModel.Activation.HostedHttpTransportManager.HttpContextReceived(System.ServiceModel.Activation.HostedHttpRequestAsyncResult)
1017ec50 5094e4cf System.ServiceModel.Activation.HostedHttpRequestAsyncResult.HandleRequest()
1017ec68 5094defd System.ServiceModel.Activation.HostedHttpRequestAsyncResult.BeginRequest()
1017eca4 5094dea5 System.ServiceModel.Activation.HostedHttpRequestAsyncResult.OnBeginRequest(System.Object)
1017ecd0 50903c3c System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper+WorkItem.Invoke2()
1017ed0c 50903b26 System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper+WorkItem.Invoke()
1017ed20 50903ab5 System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper.ProcessCallbacks()
1017ed54 5090390f System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper.CompletionCallback(System.Object)
1017ed80 5090388b System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper+ScheduledOverlapped.IOCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
1017ed8c 50232e1f System.ServiceModel.Diagnostics.Utility+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)
1017edc0 79405534 System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
1017ef60 79e7c74b [GCFrame: 1017ef60]
1017f0b8 79e7c74b [ContextTransitionFrame: 1017f0b8]


2. Culprit - a subtle bug in the CLR?
What is going on here? The exception is thrown by System.Reflection.RuntimeMethodInfo.CheckConsistency(), but the problem happened earlier. Notice two adjacent frames in the stack trace: DynamicClass.SyncInvokeGetFoo and, directly above it, the TPMethodFrame for IAnotherService.LogFoo. DynamicClass.SyncInvokeGetFoo is the function generated by WCF using Lightweight Code Gen (LCG) to facilitate the GetFoo() call, which is part of the IService definition. But the next frame on the call stack has somehow become IAnotherService.LogFoo(). Note that this IAnotherService.LogFoo() frame is not to be confused with the call inside Service.GetFoo(), since the call has not yet reached the actual service object when the exception happens. The SOS commands !clrstack -p and !dumpobj reveal that the object reference in the call context is the PIAB proxy to the Service object (tp->rp->real object), which implements IService but not IAnotherService. The mismatch simply did not manifest as an exception until later, in System.Reflection.RuntimeMethodInfo.CheckConsistency(). So where the exception is thrown is not that important. We need to understand why IService.GetFoo() suddenly becomes IAnotherService.LogFoo().

Let's review the high-level picture of how calls are dispatched from the client side (we will only consider synchronous calls for now):


  1. Client makes a call by sending an XML message to the WCF service

  2. WCF processes the message in the pipeline before it finally dispatches the call through System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke() as shown on the stack trace. System.ServiceModel.Dispatcher.InvokerUtil uses LCG to generate a delegate SyncInvokeXXXX, where XXXX is the target method name and SyncInvoke stands for synchronous call. SyncMethodInvoker.Invoke() simply passes control to SyncInvokeXXXX(). This part is not that difficult to figure out by using Reflector.

  3. The CLR takes over from here. Starting with 2.0, .NET uses dispatch stubs to handle interface calls. The runtime figures out the method dispatch token and sends it along with the target object reference to mscorwks!ResolveWorkerAsmStub, which calls mscorwks!VirtualCallStubManager::ResolveWorkerStatic and then mscorwks!VirtualCallStubManager::ResolveWorker, which does the heavy lifting of figuring out the stub that contains the assembly code to make the actual call.

  4. PIAB proxy gets called. This is where the injection magic happens and service method finally gets called.



The dispatch token is a 32-bit integer whose high 16 bits hold the type id of the interface and whose low 16 bits hold the slot number of the method, as shown in the Rotor (SSCLI) source code:


 

static const UINT_PTR MASK_TYPE_ID       = 0x0000FFFF;
static const UINT_PTR MASK_SLOT_NUMBER   = 0x0000FFFF;

static const UINT_PTR SHIFT_TYPE_ID      = 0x10;
static const UINT_PTR SHIFT_SLOT_NUMBER  = 0x0;

//------------------------------------------------------------------------
// Combines the two values into a single 32-bit number.
static UINT_PTR CreateToken(UINT32 typeID, UINT32 slotNumber)
{
    LEAF_CONTRACT;
    CONSISTENCY_CHECK(((UINT_PTR)typeID & MASK_TYPE_ID) == (UINT_PTR)typeID);
    CONSISTENCY_CHECK(((UINT_PTR)slotNumber & MASK_SLOT_NUMBER) == (UINT_PTR)slotNumber);
    return ((((UINT_PTR)typeID & MASK_TYPE_ID) << SHIFT_TYPE_ID) |
            (((UINT_PTR)slotNumber & MASK_SLOT_NUMBER) << SHIFT_SLOT_NUMBER));
}



Type ids are integer identifiers that represent types in an AppDomain. Slot numbers are integer values representing entries of interface methods in the method table. In the example I am using, IService has a type id of 0x0003 and AddFoo has a slot number of 0x0001, therefore yielding a token of 0x00030001. IService.GetFoo has a dispatch token of 0x00030000. And IAnotherService.LogFoo also has a token of 0x00030000. You see that IService and IAnotherService, living in two different AppDomains, happen to have the same type id: 0x0003. It is a coincidence, but not one to be ignored.

The dispatch token is critical here for the following reasons:
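
The token arithmetic can be replayed outside the runtime. The following is a standalone C++ sketch of the same bit packing, using plain 32-bit integers and assert in place of the SSCLI types and macros; TypeIdOf and SlotOf are hypothetical helpers added for illustration and are not SSCLI functions.

```cpp
#include <cassert>
#include <cstdint>

const std::uint32_t MASK_TYPE_ID     = 0x0000FFFF;
const std::uint32_t MASK_SLOT_NUMBER = 0x0000FFFF;
const std::uint32_t SHIFT_TYPE_ID    = 0x10;

// Packs the type id into the high 16 bits and the slot number into the low 16 bits.
std::uint32_t CreateToken(std::uint32_t typeID, std::uint32_t slotNumber)
{
    assert((typeID & MASK_TYPE_ID) == typeID);
    assert((slotNumber & MASK_SLOT_NUMBER) == slotNumber);
    return ((typeID & MASK_TYPE_ID) << SHIFT_TYPE_ID) |
           (slotNumber & MASK_SLOT_NUMBER);
}

// Hypothetical helpers that undo the packing.
std::uint32_t TypeIdOf(std::uint32_t token)
{
    return (token >> SHIFT_TYPE_ID) & MASK_TYPE_ID;
}

std::uint32_t SlotOf(std::uint32_t token)
{
    return token & MASK_SLOT_NUMBER;
}
```

CreateToken(0x0003, 0x0001) gives 0x00030001 for IService.AddFoo, and CreateToken(0x0003, 0x0000) gives 0x00030000 for both IService.GetFoo and IAnotherService.LogFoo, since each is slot 0 of type id 3 in its own AppDomain. Nothing in the token records which AppDomain it came from.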


  • all interface dispatch stubs for our PIAB-enabled services are handled by the VirtualCallStubManager in the shared domain of the process.

  • the stub manager keeps a hash table to cache the stub code using those two keys: token and object type. In our example, the object type is always a transparent proxy for PIAB-enabled services. So effectively token becomes the only key that matters.

  • the heavy lifting mscorwks!VirtualCallStubManager::ResolveWorker() is responsible for generating and caching the stub. Obviously it always first checks if there is a cached entry. If one is found using the keys, that entry will be returned.



When our unit test harness calls Service.AddFoo, which internally calls AnotherService.LogFoo, two dispatch stubs are created and cached by the shared stub manager, with tokens 0x00030001 and 0x00030000 respectively as the effective keys. Now the unit test makes a different call, Service.GetFoo. Note that IService.GetFoo also has the dispatch token 0x00030000, the same as IAnotherService.LogFoo, despite the fact that they belong to two types in different AppDomains. The stub manager of the shared domain hands out the previously cached dispatch stub for IAnotherService.LogFoo. This is why we saw the strange call stack above and why the call eventually fails.
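
The sequence can be replayed with a toy model. The C++ sketch below is only an illustration of the caching behavior described above, not the CLR implementation: ToyStubCache, MakeToken and ReplayHarness are hypothetical names, and a string naming the target method stands in for the generated stub code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Cache key as described above: (dispatch token, object type).
typedef std::pair<std::uint32_t, std::string> StubKey;

class ToyStubCache
{
public:
    // Returns the cached stub for the key if there is one;
    // otherwise caches a stub for `target` and returns it.
    std::string Resolve(std::uint32_t token,
                        const std::string& objectType,
                        const std::string& target)
    {
        StubKey key(token, objectType);
        auto it = cache_.find(key);
        if (it != cache_.end())
            return it->second;      // cache hit: the target is never consulted
        cache_[key] = target;
        return target;
    }

private:
    std::map<StubKey, std::string> cache_;
};

std::uint32_t MakeToken(std::uint32_t typeId, std::uint32_t slot)
{
    return (typeId << 16) | slot;
}

// Replays the test harness: AddFoo (which calls LogFoo), then GetFoo.
// Returns the stub that the GetFoo call ends up with.
std::string ReplayHarness()
{
    ToyStubCache shared;                             // one cache for the whole process
    const std::string proxy = "__TransparentProxy";  // same object type for every service

    shared.Resolve(MakeToken(3, 1), proxy, "IService.AddFoo");
    shared.Resolve(MakeToken(3, 0), proxy, "IAnotherService.LogFoo");
    return shared.Resolve(MakeToken(3, 0), proxy, "IService.GetFoo");
}
```

ReplayHarness() returns "IAnotherService.LogFoo": the third lookup hits the entry cached by the second one, because the token and the proxy type are identical and the cache has no other way to tell the two interfaces apart.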

To further prove this, I changed the definition of IAnotherService to the following to include 5 additional functions as fillers to the method table slots:


[ServiceContract]
public interface IAnotherService
{
    // fillers
    void Filler1();
    void Filler2();
    void Filler3();
    void Filler4();
    void Filler5();

    [OperationContract]
    void LogFoo(Foo foo);
}



The purpose is to alter the slot number of IAnotherService.LogFoo to avoid the token clash. As shown in the following Windbg snippet, I did indeed get a token of 0x00030005 as opposed to the 0x00030000 that I had earlier:


eax=00000003 ebx=0e3f6228 ecx=79e89e87 edx=00000003 esi=00000005 edi=01c45b18
eip=79eb45cf esp=1057dd4c ebp=1057dd80 iopl=0 nv up ei pl nz ac po cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000213
mscorwks!VirtualCallStubManager::GetCallStub+0x34:
79eb45cf c1e010 shl eax,10h
0:025> p
eax=00030000 ebx=0e3f6228 ecx=79e89e87 edx=00000003 esi=00000005 edi=01c45b18
eip=79eb45d2 esp=1057dd4c ebp=1057dd80 iopl=0 nv up ei pl nz ac pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000216
mscorwks!VirtualCallStubManager::GetCallStub+0x37:
79eb45d2 0bf0 or esi,eax
0:025> p
eax=00030000 ebx=0e3f6228 ecx=79e89e87 edx=00000003 esi=00030005 edi=01c45b18
eip=79eb45d4 esp=1057dd4c ebp=1057dd80 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206


And the CLR happily makes all the calls without the dreadful exception!

3. Summary
It is interesting to see how the CLR team may have tried to safeguard against key clashes by using both the token and the object type. In our case, the object type, always being the transparent proxy for PIAB-enabled services, takes that key out of the equation. The token, although its high half corresponds to the interface type and its low half to the slot number, is scoped to the AppDomain where the type is loaded. The fewer the types and the fewer the operation contract methods defined for the WCF service in each AppDomain, the higher the odds that a key clash like this will happen.

So what is the solution? Could other interface method related values such as MethodDesc, which seems to be unique across app domains, be a better candidate? That is the question for the CLR team.

As for those of us who want to integrate PIAB into WCF while minimizing development effort, we should be fine hosting WCF services in separate processes. If you really, really want to host everything in one IIS worker process like our contrived example, you can get around this issue by not using the default PIAB interception mechanism. Suppose you can come up with an LCG mechanism to generate one dynamic proxy type (instead of System.Runtime.Remoting.Proxies.__TransparentProxy) for each target. Having a different object type as each service's PIAB intercepting proxy will then avoid the key clash.
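
Why a distinct proxy type would help can be shown with a few lines of C++. This is a toy sketch under the same assumption as before, namely that the stub cache is keyed by the pair (token, object type); Resolve, ResolveGetFoo and the proxy type names passed to them are hypothetical and only illustrate the idea.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

typedef std::pair<std::uint32_t, std::string> StubKey;   // (token, proxy type)
typedef std::map<StubKey, std::string> ToyStubCache;

// Insert-if-absent: returns the stub already cached for the key,
// or caches and returns `target` when the key is new.
std::string Resolve(ToyStubCache& cache, std::uint32_t token,
                    const std::string& proxyType, const std::string& target)
{
    return cache.insert(std::make_pair(StubKey(token, proxyType), target))
                .first->second;
}

// Caches LogFoo first, then resolves GetFoo with the same token,
// and returns the stub that GetFoo ends up with.
std::string ResolveGetFoo(const std::string& serviceProxyType,
                          const std::string& anotherServiceProxyType)
{
    ToyStubCache shared;
    const std::uint32_t slot0OfType3 = 0x00030000;

    Resolve(shared, slot0OfType3, anotherServiceProxyType, "IAnotherService.LogFoo");
    return Resolve(shared, slot0OfType3, serviceProxyType, "IService.GetFoo");
}
```

With one generated proxy type per service, for example ResolveGetFoo("ServiceProxy", "AnotherServiceProxy"), the second half of the key differs and GetFoo gets its own stub. With "__TransparentProxy" on both sides the clash from section 2 comes back.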

4. Tools etc.
Debugging is always an enlightening experience. And I can't imagine what life would be like without Windbg and Reflector. Visual Studio 2008 is cool since you can configure it to debug into .NET Framework code. But if symbols are not available for the module you want to investigate, or you need to dig deeper below the FCL layer, Reflector and Windbg are simply indispensable.

Also, having SSCLI source code is absolutely wonderful. Without it, debugging through machine code in windbg would be a lot harder.

Most of the SSCLI code for this exercise is located inside virtualcallstub.cpp file.

Interview Questions

Posted by Hugh Ang at 9/09/2008 09:46:00 AM

Having a list of questions with standard answers to score candidates during technical screenings certainly has its benefits, especially in maintaining consistency across different interviewers. However, if the interviewer simply compares the candidate's answers to the official ones, then she is not doing her job. An interview is an interactive process and should be leveraged as such. Many seemingly easy questions can be extended into discussions at both broader and deeper levels. You will get a better picture of the candidate's overall skills and experience this way. For example, there is usually a basic question on the differences between value and reference types. This question can be extended to boxing/unboxing, the scenarios where boxing/unboxing can occur and the performance implications, which can be a good starting point to test the candidate's knowledge of generics. There are semantic implications of boxing and unboxing as well. A boxed integer, for example, is a brand new object with a copy of the initial integer value. The following is not allowed by the C# compiler:

int i = 1; // class instance field

lock(i)
{
//...
}

You could hack it with:

lock ((object)i)
{
//...
}

But you will not get the intended lock semantics. I will leave it to you to answer why that is the case. If you know the answer, you will see that you can evaluate the candidate's knowledge of threading and synchronization, besides boxing/unboxing.

And there is more! A related topic is passing by reference vs. passing by value in function calls (I had a previous blog post in this area). And you can take the initial question and extend it to a discussion of heap vs. stack and further on to GC.

So you see how one easy question can be extended quite a bit and become the vehicle for you to test the candidate's overall knowledge. Of course you need to be sensitive to time and not go off in all directions. You will usually get a sense of the candidate's knowledge halfway through and can decide from that point whether to go further.

I Am Back

Posted by Hugh Ang at 6/20/2008 03:46:00 PM

For the last three months, I was working for a client on envisioning the next-generation architecture of a highly impactful desktop application. Although it lasted only three months, it was nevertheless a winding journey. Because of the impact of the application, business and politics drove a lot of the architecture decisions, and a lot of long nights too. Now that the project is on hold until the next phase, I can get back inside Visual Studio and be really technical for a short while.

Debug.Assert in ASP.NET Application

Posted by Hugh Ang at 3/14/2008 04:31:00 PM

My most recent project was an ASP.NET 2.0 application developed in Visual Studio 2005. During the first build and deployment into the dev integration environment, everything went pretty well except for one page. Requests for this page just hung. I started to investigate and captured a hang dump using adplus. In windbg, I found the longest-running thread. Suspecting that this was the thread that was hanging, I looked at its CLR stack:


0:001> !clrstack
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
* *
* The Symbol Path can be set by: *
* using the _NT_SYMBOL_PATH environment variable. *
* using the -y argument when starting the debugger. *
* using .sympath and .sympath+ *
*********************************************************************
PDB symbol for mscorwks.dll not loaded
OS Thread Id: 0x2fac (1)
ESP EIP
006bf414 77f88a77 [NDirectMethodFrameStandalone: 006bf414] Microsoft.Win32.SafeNativeMethods.MessageBox(System.Runtime.InteropServices.HandleRef, System.String, System.String, Int32)
006bf430 7a4f839a System.Diagnostics.AssertWrapper.ShowMessageBoxAssert(System.String, System.String, System.String)
006bf460 7a4fabb2 System.Diagnostics.DefaultTraceListener.Fail(System.String, System.String)
006bf4a0 7a4faad7 System.Diagnostics.DefaultTraceListener.Fail(System.String)
006bf4a4 7a500a22 System.Diagnostics.TraceInternal.Fail(System.String)
006bf4e0 7a6e2523 System.Diagnostics.TraceInternal.Assert(Boolean, System.String)
006bf4e4 7a4fa6cb System.Diagnostics.Debug.Assert(Boolean, System.String)
006bf4e8 1ae8b7c9 CondosPaymentPageBase.GetResidentData()
006bf528 1ae8b567 CondosPaymentPageBase.OnLoad(System.EventArgs)
006bf534 66143ad0 System.Web.UI.Control.LoadRecursive()
006bf548 66155106 System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)
006bf700 66154a1b System.Web.UI.Page.ProcessRequest(Boolean, Boolean)
006bf738 66154967 System.Web.UI.Page.ProcessRequest()
006bf770 66154887 System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext)
006bf778 6615481a System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)
006bf78c 1bb8ccae ASP.payment_3_history_aspx.ProcessRequest(System.Web.HttpContext)
006bf798 65ff27d4 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
006bf7cc 65fc15b5 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
006bf80c 65fd32e0 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
006bf85c 65fc0225 System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
006bf878 65fc550b System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
006bf8ac 65fc5212 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest)
006bf8b8 65fc3587 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
006bfa68 79f35ee8 [ContextTransitionFrame: 006bfa68]
006bfab8 79f35ee8 [GCFrame: 006bfab8]
006bfc10 79f35ee8 [ComMethodFrame: 006bfc10]


And the native stack looked like this:


0:001> kb
ChildEBP RetAddr Args to Child
006bf1b0 77e4f9f0 50000018 00000003 00000003 NTDLL!ZwRaiseHardError+0xb
006bf20c 77e34398 00d3b684 00d3c298 00040212 USER32!ServiceMessageBox+0x16b
006bf35c 77e339cb 006bf36c 006bfa68 00000028 USER32!MessageBoxWorker+0x10a
006bf3b4 77e4fa54 00000000 00d3b684 00d3c298 USER32!MessageBoxExW+0x77
*** WARNING: Unable to verify checksum for System.ni.dll
006bf43c 7a4fad1d 00000000 00d1c4fc 00d28944 USER32!MessageBoxW+0x49
006bf454 7a4fabb2 00000000 00d24bf8 04cfda28 System_ni+0xbad1d
006bf494 7a4faad7 00000000 7a500a22 00000000 System_ni+0xbabb2
006bf4d8 7a6e2523 7a4fa6cb 1ae8b7c9 00d113c0 System_ni+0xbaad7
00000000 00000000 00000000 00000000 00000000 System_ni+0x2a2523


At this point, it was clear to me that some data condition in the dev integration environment had caused Debug.Assert to fail. The assert failure message box was waiting to be closed. But on an IIS box with the ASP.NET worker process running in the context of a service account, this UI interaction is simply not going to work.

As you may know, calls to Debug.Assert are left out of the IL by the compiler in a release build, because the method is marked with [Conditional("DEBUG")] and the DEBUG symbol is not defined. Our MSBuild script that produced the deployment package did have the "release" switch turned on. So what went wrong? It turned out that in our web deployment project, the "Generate debug information" option was checked. I initially thought this would simply give me pdb files. But evidently it meant a debug build, irrespective of the "Configuration=Release" setting for the MSBuild script.
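
The mechanism by which an assert disappears from a build can be sketched in C++ as an analogy. This is not the C# implementation: APP_ASSERT and APP_DEBUG are hypothetical names playing the roles of Debug.Assert and the DEBUG symbol, and ResidentDataIsValid and LoadPage are made-up stand-ins for the page code.

```cpp
#include <cassert>
#include <cstdio>

// When APP_DEBUG is not defined, the whole call, including the
// evaluation of its argument, is removed at compile time.
#ifdef APP_DEBUG
#define APP_ASSERT(cond) \
    do { if (!(cond)) std::fprintf(stderr, "Assertion failed: %s\n", #cond); } while (0)
#else
#define APP_ASSERT(cond) ((void)0)
#endif

static int g_evaluations = 0;

// Stand-in for the data check that failed in the dev integration environment.
bool ResidentDataIsValid()
{
    ++g_evaluations;
    return false;
}

// Returns how many times the check was actually evaluated.
int LoadPage()
{
    APP_ASSERT(ResidentDataIsValid());
    return g_evaluations;
}
```

Compiled without APP_DEBUG, LoadPage() returns 0 because the check is never evaluated. Compiled with -DAPP_DEBUG, the check runs and the failure is reported. The build setting that defines the symbol decides which of the two you ship, whatever the configuration happens to be called.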

PIAB and WCF Article Published on MSDN Magazine

Posted by Hugh Ang at 1/13/2008 11:27:00 AM

The article that I have coauthored with my Avanade colleague David San Filippo is now published in the February issue of MSDN magazine. If you have subscribed to MSDN, you should have received it in your mail box by now. The online link is here. Basically the article details how to integrate PIAB into WCF via .NET configuration or attribute so developers would not have to write code to apply the goodness of PIAB to WCF.

It's been an absolutely wonderful experience working with the MSDN editors Howard Dierking, Nancy Michell and Debra Kelly, so I'd like to extend a big "thank you" to them from here. My colleague David is one of the smartest developers I have worked with; it only took a few weekends for us to come up with a draft of this article after the brainstorming last August.

A Good Book on Windbg

Posted by Hugh Ang at 1/09/2008 03:17:00 PM

Since I have never worked for any Microsoft product team, a good way for me to gain internal system knowledge is through debugging, using tools like Windbg (Visual Studio as an IDE can be good for debugging too, but is limited in functionality compared with Windbg).

Over the years, I have built up my Windbg skills by reading blogs (thanks to those who post them) and just practicing on my own whenever I have a chance. Most debugging books don't have enough coverage in this area. So I was really excited when I found out that Addison Wesley has published the book Advanced Windows Debugging. You should read this book if you are seriously thinking about learning Windbg.