.NET Code Instrumentation That Is More Accurate?

Posted by Hugh Ang at 10/31/2007 03:24:00 PM

I recently ran into a few interesting topics that all relate to Thread and I will share them with you here.

Someone asked an interesting question to our company's community on how to instrument for just the time spent only in the thread when it is executing the code. Thread switching, thread idling/sleeping while waiting for IO, etc should not be counted.

This is indeed pretty interesting as on a machine where a lot of threads are running concurrently, using the elapsed time (e.g. the Tracer in EntLib) approach, may give you inaccurate results if you just want to focus on measuring algorithm of your code. It will be further skewed if the thread running your code is given lower priorities than the other ones as thread scheduling on Windows is priority-driven and preemptive.

After digging around a bit, I found a reasonable good approach to this problem, with the help of BCL of .NET framework. The Process type in System.Diagnostics namespace has a Threads property, which is a collection that corresponds to the Win32 threads in the process. The .NET type that represents the underlying Win32 thread is ProcessThread. It turns out that ProcessThread has properties for processor time, which is key to the solution. Specifically, the property I want to use is UserProcessorTime. Here is a simple wrapper class that I wrote, following the Tracer class pattern from EntLib. I have only kept necessary code to show the point – email me (hugh dot ang at gmail dot com) if you want to have my Visual Studio project that has the complete source code.


public class ThreadTimer : IDisposable

{

    private const string LOG_CATEGORY = "General";

 

    [ThreadStatic()]

    private ProcessThread _processThread;

 

    private int _threadId;

    private TimeSpan _start;

 

    [DllImport("Kernel32", EntryPoint = "GetCurrentThreadId", ExactSpelling = true)]

    private static extern Int32 GetCurrentWin32ThreadId();

 

    public ThreadTimer()

    {

        _threadId = ProcessThread.Id;

        _start = ProcessThread.UserProcessorTime;

 

        Logger.Write("ThreadTimer started", LOG_CATEGORY);

    }

 

    public void Dispose()

    {

        double duration = ProcessThread.UserProcessorTime.Subtract(_start).TotalSeconds;

        Logger.Write(

            String.Format("ThreadTimer ended with elapsed time of {0} seconds processor user time.",

            duration),

            LOG_CATEGORY);

    }

 

    private ProcessThread ProcessThread

    {

        get

        {

            if (_processThread == null)

            {

                int id = GetCurrentWin32ThreadId();

                foreach (ProcessThread thread in Process.GetCurrentProcess().Threads)

                {

                    if (thread.Id == id)

                    {

                        _processThread = thread;

                        return _processThread;

                    }

                }

            }

 

            return _processThread;

        }

    }

}



Here is how we can instrument a method called SortOperation:


using (ThreadTimer threadTimer = new ThreadTimer())

{

    // simulate an operation that needs to be instrumented

    SortOperation(i);

}



To see how this works out in comparison to instrumentation of EntLib, I implemented the SortOperation() as in the following:


static void SortOperation(int x)

{

    SortedList<int, int> slist = new SortedList<int, int>();

    for (int i = x * 10000; i >= 0; i--)

    {

        slist.Add(i, i);

    }

}



To make it interesting, I first instrumented this method using both Tracer and my ThreadTimer and then instrumented the same method plus another method that does just IO, e.g. read from a file on disk. The IO method takes about 200 milliseconds. The following table summarizes the results:



Without the IO method, both Tracer and ThreadTimer produce almost identical results. Based on the empirical results, it’s not hard to figure out that the algorithm of looping with SortedList <> is O(N2). When the IO method is included, EntLib reports the extra 200 milliseconds while ThreadTimer reports pretty much unchanged results.

So you see this ThreadTimer can be a reasonable solution for instrumentation in the scenario where you want to just focus on measuring your algorithm. However, as EntLib captures the start to end performance, which is what end user will experience, this solution shouldn’t replace that. Another caveat is that since there is no fixed relationship between a Win32 native thread and a managed thread, there is no guarantee that this solution will work for any CLR host.

0 comments: