Writing an automatic debugger in 15 minutes (yes, a debugger!)

Seriously, it will take you longer to read this long introduction than to code a working debugger in C#.

Bugs in production

Remember all those reports coming back from clients saying “hey, your program crashed” or “hey, your site is showing these ugly yellow pages at random moments”? So do I. Unfortunately, there isn’t much that can be done to diagnose problems in running software. Basically, we’re at the mercy of our clients’ reports (which vary in quality, most of the time tending towards useless) or of more or less verbose logging. With those in hand, we can try to reproduce issues during a debugging session to see what code causes them. If that fails, we can go to the extreme of attaching a debugger in a production environment. Not feeling enough pressure today? Attach to a live site and try to set up breakpoints in exactly the way needed to catch only the error, and not stop a million people from doing their daily work. On the first try. Logging, on the other hand, is very safe, but it will only tell you as much as you predicted would be needed.

Now what if we could have something in the middle? More than logging, but less than an interactive debugging session?

Windows Error Reporting

In ancient times, when not all software ran as web applications, and especially in the even darker times when programs were written mostly in native code, you could often see hard crashes that literally made entire applications disappear. Then Microsoft invented Windows Error Reporting, which filled that post-crash emptiness with a nice-looking window that supposedly gathered some data and sent it back to the author. This feature is present in all recent versions of Windows, and is available not only for Microsoft applications, but also for third-party programs, provided that their author gets a signing certificate (Authenticode), signs all components of the software and registers on a deeply hidden Microsoft site. If all these conditions are met, Windows will automatically create a so-called minidump of the process when it crashes (with important parts of memory, including stack traces of all threads) and ask the user to allow sending it to Microsoft.

Since a minidump contains the stack state of all threads at the moment of the crash, it allows you to open a post-mortem debugging session, and feel almost as if you were attached to the original process when the failure occurred. WinDbg has supported this since… probably always, but Visual Studio 2010 does so too – although through a slightly hidden feature of just opening (File -> Open -> File…) minidumps, i.e. *.dmp files.

Just having access to precise call stacks used to be a killer feature at the time, but does it provide any value to the .NET developer, used to getting stack traces with every exception?

You guessed it: yes, it does.

With a minidump, you’ll be able to see parameter and variable values for all functions on the call stack.

Crashes in a .NET world

Before we go deeper, we need to clarify one thing. What exactly is a crash? What causes Windows Error Reporting to kick in? It’s unhandled exceptions. Native unhandled exceptions. Normally, the last exception handler registered in a running process is a system function that reports program failure, which in turn causes a default (defined in the registry) system just-in-time debugger to launch – which will be Windows Error Reporting a.k.a Dr. Watson, Visual Studio, or something else depending on your setup. Some programs also register their own unhandled exception handlers to report errors. Firefox does that, Winamp too. Perhaps you’ve already seen their crash windows in action. By registering their own handlers, these programs can create minidumps of their own liking (there are various types to choose from) and send them wherever they want.

What about .NET apps then? Do they crash at all? Of course they do, but unfortunately not in the sense described so far. You can have an unhandled managed exception in your code, but it will always be caught by the runtime – which in turn prints a stack trace (in a console application), shows a pop-up window (in a Windows Forms application) or displays the Yellow Screen of Death (in a web application). No minidump will ever be created automatically for a .NET application.

Creating minidumps manually

A minidump is created by calling MiniDumpWriteDump from dbghelp.dll. Mind, though, that it needs to be the version shipped with Debugging Tools for Windows, not the one from the Windows system folder. There are various tools that call it, like SysInternals’ ProcDump. MiniDumpWriteDump and ProcDump have been described in an article in the recent edition of MSDN Magazine: Writing a Plug-in for Sysinternals ProcDump v4.0.

One of those tools is also ClrDump, which is available both as a very simple command-line tool and a convenient library. Read this blog post, and you’ll know everything about how to create minidumps from .NET applications: Creating and analyzing minidumps in .NET production applications.
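For orientation, here is a minimal sketch of what a direct call to MiniDumpWriteDump from C# might look like. It is not taken from ClrDump or from the linked articles. The class name MiniDumpSketch, the DumpType enum (a subset of the native MINIDUMP_TYPE flags) and the Write helper are names made up for this illustration, and the sketch assumes that a suitable dbghelp.dll can be found next to the executable.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

public static class MiniDumpSketch
{
    // Subset of the native MINIDUMP_TYPE flags (hypothetical managed mirror).
    [Flags]
    public enum DumpType : uint
    {
        Normal = 0x0,
        WithDataSegs = 0x1,
        WithFullMemory = 0x2,
        WithHandleData = 0x4
    }

    // P/Invoke declaration for the dbghelp.dll export.
    [DllImport("dbghelp.dll", SetLastError = true)]
    private static extern bool MiniDumpWriteDump(
        IntPtr hProcess,
        uint processId,
        SafeFileHandle hFile,
        DumpType dumpType,
        IntPtr exceptionParam,
        IntPtr userStreamParam,
        IntPtr callbackParam);

    // Writes a dump of the target process to the given path.
    public static void Write(Process target, string path, DumpType type)
    {
        using (var file = new FileStream(path, FileMode.Create, FileAccess.ReadWrite, FileShare.None))
        {
            bool ok = MiniDumpWriteDump(
                target.Handle, (uint)target.Id, file.SafeFileHandle, type,
                IntPtr.Zero, IntPtr.Zero, IntPtr.Zero);

            if (!ok)
                throw new Win32Exception(Marshal.GetLastWin32Error());
        }
    }
}
```

A call would then look something like MiniDumpSketch.Write(Process.GetProcessById(pid), path, MiniDumpSketch.DumpType.WithFullMemory), where pid and path are whatever your situation requires.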

Choosing the right place and time

The last article I linked to has one thing wrong though. It triggers minidump generation from AppDomain.UnhandledException and Application.ThreadException. And it would be equally wrong to call it from an error handler in Global.asax or to register one with an HttpModule, like Elmah does. The reason is that those handlers are called from a generic exception handler at the base of your application. At this point all of the context (call stack, variables) available where the exception was thrown is lost. To really benefit from minidumps, you need to trigger them as deep in your call hierarchy as you can. Also, you’ll often face exceptions that are caught by your own code, but you’ll still want minidumps generated, to better explain why those exceptions occurred.
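To make the timing issue concrete, here is a small sketch of the order in which things happen when an exception travels up the stack. It uses C# exception filters (the when clause, available from C# 6), which are not part of the original article and are shown only as one possible way to run code while the throwing frame is still on the stack. Capture is a hypothetical stand-in for a dump call, and all names are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class UnwindOrderSketch
{
    public static readonly List<string> Events = new List<string>();

    // Stand-in for a dump call; a real one would write a minidump here.
    private static bool Capture(string where)
    {
        Events.Add(where);
        return false; // decline to handle, we only want to observe
    }

    private static void Deep()
    {
        try
        {
            throw new InvalidOperationException("Expected failure");
        }
        finally
        {
            // Runs while the stack is being unwound; Deep's locals are gone after this.
            Events.Add("deep frame unwound");
        }
    }

    public static void Run()
    {
        Events.Clear();
        try
        {
            Deep();
        }
        catch (Exception) when (Capture("filter"))
        {
            // Never entered, because the filter returns false.
        }
        catch (Exception)
        {
            // By now the frame of Deep has already been unwound.
            Events.Add("catch");
        }
    }
}
```

After Run(), Events contains "filter", "deep frame unwound", "catch", in that order: the filter runs before the deep frame is torn down, while the catch body only runs afterwards.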

And let’s not forget generating minidumps like this is similar to debugging with printf – you need to modify code and redeploy. We need something smarter. And external to the inspected application.

Writing a managed debugger in C#

The CLR exposes debugging capabilities through COM, so it isn’t impossible for a mere mortal to implement a debugger. But thanks to the CLR Managed Debugger Sample anyone can do it, really. The sample mainly consists of a console tool, mdbg.exe, that you can use as a poor man’s replacement for windbg.exe or cdb.exe, and – more importantly – a shared debugging engine library, written in C#. The latter is now also available on NuGet (courtesy of yours truly) as Microsoft.Samples.Debugging.MdbgEngine.

Let’s consider this trivial console test application:

using System;

namespace TestApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            A(1, new { property = 2 });
        }

        private static void A(int i, object state)
        {
            B(i + 4, state);
        }

        private static void B(int i, object state)
        {
            try
            {
                C(i);
            }
            catch (Exception exception)
            {
                Console.WriteLine(exception.Message + ": " + i);
            }
        }

        private static void C(int i)
        {
            throw new InvalidOperationException("Expected failure");
        }
    }
}

Here is an only slightly longer console application that runs a debugger and creates minidumps on each exception (as you can see, also those caught in code):

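using System;
using System.Threading;
using Microsoft.Samples.Debugging.CorDebug;
using Microsoft.Samples.Debugging.MdbgEngine;

namespace DebuggerApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            var stop = new ManualResetEvent(false);
            var engine = new MDbgEngine();
            var process = engine.CreateProcess("TestApplication.exe", "", DebugModeFlag.Default, null);

            // Subscribe before resuming the process, so that no early event is missed.
            process.PostDebugEvent +=
                (sender, e) =>
                    {
                        // Breakpoints stop the debuggee; just resume it.
                        if (e.CallbackType == ManagedCallbackType.OnBreakpoint)
                            process.Go();

                        // Fires on exception events, including for exceptions that the debuggee catches itself.
                        if (e.CallbackType == ManagedCallbackType.OnException2)
                        {
                            // ClrDump and MINIDUMP_TYPE come from a wrapper around the ClrDump library
                            // mentioned earlier; they are not part of the MdbgEngine package.
                            ClrDump.CreateDump(process.CorProcess.Id, @"C:\temp.dmp", (int)MINIDUMP_TYPE.MiniDumpWithFullMemory, 0, IntPtr.Zero);
                        }

                        // Let Main return once the debuggee has exited.
                        if (e.CallbackType == ManagedCallbackType.OnProcessExit)
                            stop.Set();
                    };

            process.Go();
            stop.WaitOne();
        }
    }
}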

Go ahead and try it. If you open temp.dmp (written to C:\ by the code above) in Visual Studio, you’ll be able to see not only the call stack at the moment when the exception occurred, and not only the values of variables that are easy to log (like i), but you’ll also be able to inspect complex and dynamic structures (like state).

I hope you’re having a big “holy shit” moment. I definitely did.

Where can we go from here?

The code above “logs” all exceptions. To implement this idea in the real world, we’d need quite a bit more – load symbols (you know where to get those from – SymbolSource), set breakpoints at specific locations, catch only specified exceptions.
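As a first step towards “catch only specified exceptions”, the decision of whether to dump could be pulled out into a small helper. The sketch below is only an illustration under assumptions: DumpPolicy, ShouldDump and NextDumpPath are invented names, and how the exception type name is obtained from the debugger event is deliberately left out. It also gives each dump its own file name, since the sample above keeps overwriting a single file. It is assumed to be called from one thread at a time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class DumpPolicy
{
    private readonly HashSet<string> _typeNames;
    private readonly string _directory;
    private int _counter;

    public DumpPolicy(string directory, IEnumerable<string> typeNames)
    {
        _directory = directory;
        _typeNames = new HashSet<string>(typeNames, StringComparer.Ordinal);
    }

    // True when the exception type is one we were asked to monitor.
    public bool ShouldDump(string exceptionTypeName)
    {
        return _typeNames.Contains(exceptionTypeName);
    }

    // Builds a distinct file name per dump, so later exceptions do not overwrite earlier ones.
    public string NextDumpPath(string exceptionTypeName)
    {
        _counter++;
        string fileName = string.Format("{0:D4}-{1}.dmp", _counter, exceptionTypeName);
        return Path.Combine(_directory, fileName);
    }
}
```

In the OnException2 branch, the dump call would then be wrapped in a check like policy.ShouldDump(typeName) and would write to policy.NextDumpPath(typeName) instead of a fixed path.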

Catch me on Twitter if you like the idea. This has potential for a great and very useful open source project.

We also have some very cool ideas on how to integrate this with SymbolSource. Imagine setting breakpoints on the website and choosing exceptions to monitor, then seeing automatically collected reports from all your deployments. Call stacks with variable values, failure statistics, minidumps for download and offline analysis.

Interested?

28 Comments

  1. Interested? Are you kidding me? That sounds fuctacular..

    How would you do it in ASP.NET? engine.CreateProcess() obviously isn’t feasible there. Any chance this would work within a PostSharp (or similar) OnException aspect?

    • My idea for ASP.NET is to interface with IIS using its API, or just simply searching through the process list every second or so, and attaching to a running process. Check out my newer post, it shows exactly that, but for Cassini.

      I have no experience with PostSharp, but you might be right: the minidump generation code could be inserted with an aspect – it would end up exactly where it should, and would have access to all the stack context.

  2. Do you also have a sample project to download? I’m a bit lazy…

  3. David Ceder

     /  December 13, 2011

    Can you debug .NET 2.0 apps with this?

    • Sure. Although I’m not sure if the debugger runtime needs to match the debuggee runtime. One thing that does for sure is bitness. You need to run a 32 bit process to debug a 32 bit process and the same for 64 bit.

  4. Dominick

     /  December 13, 2011

    On average, how large is a minidump? Is it in the KB range or the MB range?

    • There are many types of minidumps, the smallest being a few hundred kB, and full being fifty-ish MB. I’ll cover my investigations on what they provide in a future post. But even the smallest have call stacks for all threads and stack variables. What you’ll be missing will probably be reference types (allocated on the heap).

    • I just followed the example here, and the TestApplication generates a file of 41.1 Mb, using the MiniDumpWithFullMemory setting.

  5. picky

     /  December 13, 2011

    >Normally, the last eception handler

    Ur missing an x in exception

  6. Double-click on the minidump, set the debug symbols, and off you go. One can even inspect the arguments in the method where the TestApplication crashes.

    Holy shit.

  7. Nice work 🙂
    How about performance? Will it affect the application severely by having an attached debugger? Does it need to be built using debug configuration (without optimizations) to be of any use or is it enough with a pdb file?

    • I intend to compare performance in a future post. Applications will run slower under the debugger, but I expect that performance loss mainly on things that cause callbacks to the debugger – without any breakpoints set, this will only be exceptions. There will be a general loss for sure, but I think the benefits in many situations might outweigh the costs.

      In .NET there is no real difference between Release and Debug mode, apart from setting up different conditional compile variables and some other build defaults, like symbol generation. I would recommend always generating PDBs, in Release this can be achieved with <DebugType>pdbonly</DebugType>. Without them, the debugger will only have access to function names, and not variable values or source locations.

      Interesting to note is that if you start a process with MDbgEngine, you can also pass some options to the runtime, like disabling runtime optimizations, which can help access more variables, whose values otherwise would be optimized away.

  8. Have you made these modifications to help debugging with .NET 4?
    http://www.simple-talk.com/community/blogs/brian_donahue/archive/2010/11/24/95829.aspx

    • I have made no modifications to MDbgEngine, but I believe that it had been updated by Microsoft to support .NET 4.0, and hence the versioning of the sample, which is also 4.0. In any case, I have not experienced any problems running 4.0 applications under MDbgEngine as the debugger.

  9. pickier

     /  December 15, 2011

    In your debugger application demo code, you make a call to ClrDump.CreateDump(). Which namespace is that in? I’m trying out the example code (after installing the nuget package) and I’m not finding ClrDump. I most likely missed a step but wouldn’t mind some help if possible.

  10. I still have a question.. how does this compare to using AdPlus or the SuperAssert.NET code? See this stackoverflow question: http://stackoverflow.com/questions/3005175/how-to-create-minidump-of-a-net-process-when-a-certain-first-chance-exception-o

    • The goal of all of these (my debugger sample, ClrDump, SuperAssert.NET and ADPlus) is the same – get a minidump of a process. They all also eventually end up calling the same dbghelp.dll function, MiniDumpWriteDump. But they are useful in different scenarios.

      To use ClrDump.dll and SuperAssert.NET, you will need to have control over the code, recompile and redeploy (or write code using them from the beginning of a project). SuperAssert.NET gives you a UI and additional options, not only minidump generation.

      ADPlus or ClrDump.exe will let you take a snapshot of an already running process, but you won’t be able to time it precisely to have state at a particular place in code. They are mostly useful for debugging freezes and hangs, when you want to simply find out where the execution got stuck and how the offending thread got there.

      In a way my approach combines the advantages of all those tools: it is external and it can catch the exact moment in the process execution timeline (the minidump will be synchronized with a particular event – an exception or a custom breakpoint).

      But it also has one big problem that the others do not have: it isn’t a ready tool, it’s more of a concept. But I hope to gather a few people interested in the idea and turn it into a project: http://github.com/TripleEmcoder/Padre.

  11. ATWORTH

     /  January 2, 2012

    Will this be an open source project on CodePlex?

  12. jswolf19

     /  January 17, 2014

    Could you use the AppDomain.FirstChanceException instead of using a separate debugger process to do the dumping? Or do the advantages of doing the dumping in an external process outweigh the performance hit you’re likely to get from running the managed code in a debugger?

    • For simple exception logging – yes. But you can only inspect variable values in a debugger, and that was the real reason for researching this topic. In AppDomain.FirstChanceException it’s also too late to do a minidump, because the stack is already unwound. This event is basically the same thing as doing try/catch, with all its weaknesses. Should have mentioned this possibility for completeness, though.

  13. Hi, it’s amazing! I’ve downloaded the Padre project and I started using it.

    So, I can debug your demo project but I don’t know how to debug when I add a breakpoint.

    Do you know what I can do to hit the callback type “OnBreakpoint”?

    process.PostDebugEvent +=
    (sender, e) =>
    {
    log.Log(e.CallbackType.ToString());

    if (e.CallbackType == ManagedCallbackType.OnBreakpoint)
    process.Go();

    if (e.CallbackType == ManagedCallbackType.OnException2)
    {
    var ce = (CorException2EventArgs)e.CallbackArgs;

    Cheers,
