Trace Correlation Manager

Continuing on my exporation of the new features in the System.Diagnostics namespace I thought I might stop and take some time to explain a little bit about things thing called the CorrelationManager – but first, a history lesson.

Request Tracing in the Enterprise Instrumentation Framework

If you have ever used EIF (Enterprise Instrumentation Framework) you will probably know a little bit about a feature called Request Tracing.

The idea behind Request Tracing was that you could establish a context in which code is executed, and all diagnostic events ther were raised within that context would be associated with it via a series of flags. When diagnostic output was analysed this information could be used to group the output and individual execution paths could be determined. This was exceptionally useful for multi-threaded server applications which could hundreds and thousands of concurrent users.

Request Tracing worked by stuffing GUIDs in this thing called CallContext. Think of CallContext as a bucket of memory associated with each thread where you can store “useful stuff”. The beauty of CallContext is that if you have information required to support infrastructure (like logging) then you can put it in there and you don’t need to pass things like handles around as arguments to methods.

An ancillary benefit of the CallContext was that the .NET Remoting infrastructure was aware of it and could propogate some information (marked with the ILogicalThreadAffinitive interface) across into a LogicalCallContext. The LogicalCallContext would be transferred as part of the remoting message to another host process (potentially on another machine) and associated with the handling thread there. If that machine also happened to be using EIF, events raised on that servicing thread would also emit the request tracing GUID which meant that if the logs were merged a full distributed analysis could take place.

More of the Same: Correlation Manager

Enter the .NET 2.0 BCL CorrelationManager class, attached prominantly as a static property on the Trace class. The Correlation Manager has two methods, StartLogicalOperation and StopLogicalOperation. The idea is that you can use these two simple methods to associate tracing information with a logical business operation.

Imagine the following program which contains a number of sub-routines that are called in sequence depending on what features are being used.

ProgramStructure

Lets say this program represents the behaviours of user group participants when the pizza arrives. Each person is probably going to be doing one of two things, drinking coke, or eating pizza – in a multi-threaded user group meeting the diagnostic output might look like this.

OriginalTraceOutput

The thing about this output is that its hard to tell which “open mouth” call is associated with which action – what if it recorded critical diagnostic information but you couldn’t associate it to the particular function it was invoked from? Lets divide the program into two functional areas.

OverlappingFunctions

The Correlation Manager can help you draw this picture in your log files by associating some contextual information with each trace entry. With this contextual information you trace output would look something like this.

CategorizedTraceOutput

Gotta love those nice colours eh? This basically allows us to scan our output and understand the logical sequence of events that look place for an individual thread of execution. Now lets take this understanding and apply it to some code.

StartLogicalOperation

This is a recursive factorize method which finds the prime factors of the number that is put in. Notice that at the beginning and the end of the function I call StartLogicalOperation and StopLogicalOperation respectively. Also notice that on the StartLogicalOperation call I am passing in the input integer – in this program this serves as important diagnostic information about the factorization process.

In the mainline of the program (download the sample code here) I wire-up a ConsoleTraceListener and I set the new TraceOutputOptions property to also include the “logical operation stack” in the output.

ConfigureListenerToLos

When the program runs I get the following output on the console.

ConsoleStackOutput

Not only does it display the trace message, it also includes a thing called the “LogicalOperationStack”. This is a nested list of logical operations that have occured, and by setting the TraceOutputOptions on the trace listener we have told it to emit this information – the stack essentially looks like this.

LogicalOperationStack

In the case of the ConsoleTraceListener the stack is displayed in reverse order but you can clearly see the input numbers used as it recursively called the Factorize(…) factorize method.

Final Thoughts

Using the Correlation Manager it is possible to make your diagnostic output more useful and if I was a betting man I’d say that the fact that it also uses CallContext to manage the LogicalOperationStack means that its a fairly safe bet that this operation stack will be visible across distributed applications. Finally – be careful about using the XmlWriterTraceListener with this particular program – since the output for some numbers can be enourmous.

Advertisements

6 thoughts on “Trace Correlation Manager

  1. Raghu

    I am trying to figure out how to use CorrelationManager effectively in tracing.

    The code (shown at the end) produces following output:

    ******

    MyAsm.exe Information: 0 : Hello there!

    LogicalOperationStack=Main

    MyAsm.exe Information: 0 : Pickup Pizza 0xC

    LogicalOperationStack=Drink Context, Pizza Context, Main

    MyAsm.exe Information: 0 : Pickup Drink 0xD

    LogicalOperationStack=Drink Context, Pizza Context, Main

    MyAsm.exe Information: 0 : Open Mouth 0xC

    LogicalOperationStack=Drink Context, Pizza Context, Main

    MyAsm.exe Information: 0 : Open Mouth 0xD

    LogicalOperationStack=Drink Context, Pizza Context, Main

    MyAsm.exe Information: 0 : Chew 0xC

    LogicalOperationStack=Drink Context, Pizza Context, Main

    MyAsm.exe Information: 0 : Swallow 0xD

    LogicalOperationStack=Drink Context, Pizza Context, Main

    ******

    Other than the first line, all other lines consists of 3 levels of stack:

    Drink Context, Pizza Context and Main. However I was only expecting two sets

    of levels: Drink Context and Main; Pizza Context and Main only. Am I doing

    some thing wrong?

    Code follows **********

    using System;

    using System.Collections.Generic;

    using System.Text;

    using System.Threading;

    using System.Diagnostics;

    class myns

    {

    class DiagnosticStuff

    {

    public void Test()

    {

    //Configure console trace listener

    //ConsoleTraceListener listener = new ConsoleTraceListener();

    TextWriterTraceListener listener = new

    TextWriterTraceListener(@”C:\tracefile.txt”);

    listener.TraceOutputOptions |= TraceOptions.LogicalOperationStack;

    Trace.Listeners.Add(listener);

    Trace.CorrelationManager.StartLogicalOperation(“Main”);

    Trace.TraceInformation(“Hello there!”);

    //Create 2 threads, one with each context

    Thread[] ts = new Thread[2];

    ts[0] = new Thread(new ThreadStart(PizzaContext));

    ts[1] = new Thread(new ThreadStart(DrinkContext));

    ts[0].Start();

    ts[1].Start();

    ts[0].Join();

    ts[1].Join();

    //Trace.Listeners.Remove(listener);

    Trace.CorrelationManager.StopLogicalOperation();

    listener.Close();

    }

    private void PizzaContext()

    {

    Trace.CorrelationManager.StartLogicalOperation(“Pizza Context”);

    PickupPizza();

    OpenMouth();

    Chew();

    Trace.CorrelationManager.StopLogicalOperation();

    }

    private void DrinkContext()

    {

    Trace.CorrelationManager.StartLogicalOperation(“Drink Context”);

    PickupDrink();

    OpenMouth();

    Swallow();

    Trace.CorrelationManager.StopLogicalOperation();

    }

    private void PickupPizza()

    {

    Trace.TraceInformation(“Pickup Pizza 0x{0:X}”,

    Thread.CurrentThread.GetHashCode());

    Thread.Sleep(20);

    }

    private void OpenMouth()

    {

    Trace.TraceInformation(“Open Mouth 0x{0:X}”,

    Thread.CurrentThread.GetHashCode());

    Thread.Sleep(20);

    }

    private void Chew()

    {

    Trace.TraceInformation(“Chew 0x{0:X}”,

    Thread.CurrentThread.GetHashCode());

    Thread.Sleep(20);

    }

    private void PickupDrink()

    {

    Trace.TraceInformation(“Pickup Drink 0x{0:X}”,

    Thread.CurrentThread.GetHashCode());

    Thread.Sleep(20);

    }

    private void Swallow()

    {

    Trace.TraceInformation(“Swallow 0x{0:X}”,

    Thread.CurrentThread.GetHashCode());

    Thread.Sleep(20);

    }

    }

    }

    Raghu/..

  2. Mitch Denny

    Hi Raghu,

    It looks like the context isn’t really branch aware where branches are created by threads. Consequently you get nesting of the contexts. You would need to explicitly establish a context for each thread I think.

    Mind you – I haven’t pulled out my code samples and played with it – its just a theory at this point 🙂

  3. Chris

    Echoing what Shad said, the image links are still broken. It would be great if they were fixed so I could understand this. Thanks.

  4. Himanshu Swami

    Hi Raghu,

    The logical operation stack is not thread-safe – which explains the behavior you are seeing.

    Regards,
    Himanshu

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s