CLR Profiler Is Your Friend

08.04.2008

This past week, I have been struggling with a strange problem on a client’s ASP.NET 2.0 website. The server, a 64-bit Windows Server 2003 machine with 4 GB RAM, used up almost all available memory within a few seconds of starting w3wp.exe (the IIS 6.0 worker process). During this excessive consumption, the CPU usage was 100%, and then dropped to 1-2% during normal operations. Then every half hour or so came another wave of 100% CPU usage and a jump in memory consumption, followed a few minutes later by more CPU activity and a steep fall in memory consumption. This would go on, wave after wave, hour after hour.

The problem seemed unrelated to the number of concurrent user sessions (approx. 500 at peak), and most of the time the wildly excessive memory consumption didn’t even seem to cause any problems. Yet sometimes - a couple of times a week - the server would slow down dramatically or even grind to a halt, although no OutOfMemory exceptions were ever thrown.

I was very determined to fix the issue, because our client expects the website, which is currently in its infancy, to scale to almost a million registered users within the next couple of years.

As a short-term fix, the server was scaled up to 16 GB RAM, based on the old “when in trouble, double” advice that everyone always gives without solving the underlying problem. Well, surprise surprise, the problem scaled up as well - now w3wp.exe could consume 7-8 GB RAM almost right after IIS was started. In waves, just like before.

Using perfmon, I was able to determine that the 7-8 GB of RAM was mostly to be found in the regular heap, in generations 1 and 2. In other words, it seemed that a lot of small objects were being created (since the memory was not used by the large object heap). These objects were being used for a little while, surviving at least one collection, and then moved through the garbage collection system. That explained why there were no OutOfMemory exceptions - the memory was actually ready for garbage collection, there were just too many objects to handle efficiently, so things slowed down.
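To make the generation reasoning concrete, here is a minimal C# sketch (not taken from the website's code) that uses GC.GetGeneration to show where objects are reported to live. The class name GenerationDemo is hypothetical, and the sketch assumes the commonly documented behavior that arrays far above the large-object threshold (roughly 85,000 bytes) are reported as belonging to the oldest generation.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation the runtime currently reports for an object.
    public static int GenerationOf(object o)
    {
        return GC.GetGeneration(o);
    }

    public static void Main()
    {
        byte[] small = new byte[100];      // ordinary small object
        byte[] large = new byte[200000];   // well above the large-object threshold

        Console.WriteLine("small, fresh:            gen " + GenerationOf(small));
        Console.WriteLine("large, fresh:            gen " + GenerationOf(large));

        GC.Collect();                      // 'small' is still referenced, so it survives
        Console.WriteLine("small, after 1 collect:  gen " + GenerationOf(small));

        GC.Collect();
        Console.WriteLine("small, after 2 collects: gen " + GenerationOf(small));

        // Keep both arrays reachable until this point.
        GC.KeepAlive(small);
        GC.KeepAlive(large);
    }
}
```

A heap dominated by generations 1 and 2 therefore points at small objects that stayed referenced long enough to survive one or more collections, which matches the perfmon picture described above.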

Next on the list was to get a fresh copy of one of the world’s most complicated debugging tools, WinDbg. Along with “sos.dll” (found in your .NET framework folder), it can be used to debug .NET memory usage, among other things. Using ADPlus, I created a memory dump of the w3wp.exe process (all 7 GB of it), and proceeded to attempt to decipher what it all meant.

I should say right now that if you came here looking for advice on how to use WinDbg, you’ve come to the wrong place. The right place is http://blogs.msdn.com/tess/. There’s really no point going anywhere else - Tess is who you want. Either her, or call Microsoft Premier Support, which is what I ended up doing.

Back to the matter at hand. Using WinDbg and with the help of a very nice guy from MPS, I was able to determine that my memory dump showed an abnormal amount of one specific type of object being created - SD.LLBLGen.Pro.ORMSupportClasses.EntityField. More than 7 million of these objects were lying around in memory.

For those who don’t know, LLBLGenPro (www.llblgen.com) is one of the world’s most productive object-relational mapping tools for .NET development. I love it. It has literally saved me hundreds of hours of work on previous projects. It allows you to write very elegant and strongly typed code against any supported database engine (DB2, Oracle, MySql, MSSQL, etc.), and supports lazy loading, which was used on this particular project.

This time, however, it seemed that I got bitten in the ass by LLBLGenPro. Somewhere, something was really malfunctioning, but based on previous experience I was pretty sure that LLBLGenPro was not directly to blame. Rather, the problem was probably caused by bad website code.

What to do? Well, WinDbg allows you to look at any object in the heap you like, using obscure commands such as “!dumpheap -stat” and “!dumpobj <address>”. However, I had 7+ million of these objects, and in all likelihood some of those were bound to be fine/expected, while others were not. So looking at each individual object was a no-go.

Enter my new best friend, CLR Profiler. It’s a world class tool for debugging .NET applications that almost nobody uses, except those who are into .NET games programming with XNA (like me). But I really think that every .NET developer should grab a copy of CLR Profiler here:

http://www.microsoft.com/downloads/details.aspx?FamilyId=A362781C-3870-43BE-8926-862B40AA0CD0&displaylang=en

Don’t be scared by the interface or the docs, which are a little hard to comprehend. Just fire up CLRProfiler.exe and hook it up to the .NET application you want to profile. In my case, I hooked it up to the problematic ASP.NET website on a dev machine.

After starting up the tool, I hit the front page of the website a few times and then took a look at the CLR Profiler’s summary page. It looked like this:

 

The first thing to notice is the number of garbage collections that had taken place. I had only hit the website a few times, yet generation 0 had already been collected 156 times, and generation 1 more than 50 times. Far too many, considering that the machine I tested this on had 1 GB of RAM and the website is fairly simple.

Remember that garbage collection only takes place when it is needed. It’s not like it is scheduled to run every 2 seconds. When writing XNA games, you strive to do all your allocations at the start of the game, then never allocate another object again while the game is being played. That way, you don’t trigger any garbage collection, which causes “hiccups” because the CPU works hard during GC.
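The "allocate up front, then reuse" idea can be tried out with a small C# sketch. It compares a loop that allocates a fresh buffer on every pass with one that reuses a single buffer, and uses GC.CollectionCount to report how many generation 0 collections each one triggers. All names here are hypothetical, and the exact counts depend on the runtime and machine, so none are quoted.

```csharp
using System;

public static class AllocationChurnDemo
{
    // Runs 'work' and returns how many generation 0 collections happened meanwhile.
    public static int Gen0CollectionsDuring(Action work)
    {
        int before = GC.CollectionCount(0);
        work();
        return GC.CollectionCount(0) - before;
    }

    // Allocates a fresh buffer on every iteration (the pattern to avoid in a hot path).
    public static long SumWithFreshBuffers(int iterations)
    {
        long total = 0;
        for (int i = 0; i < iterations; i++)
        {
            int[] buffer = new int[256];
            buffer[0] = i;
            total += buffer[0];
        }
        return total;
    }

    // Allocates one buffer up front and reuses it for every iteration.
    public static long SumWithReusedBuffer(int iterations)
    {
        long total = 0;
        int[] buffer = new int[256];
        for (int i = 0; i < iterations; i++)
        {
            buffer[0] = i;
            total += buffer[0];
        }
        return total;
    }

    public static void Main()
    {
        const int iterations = 1000000;
        int churn = Gen0CollectionsDuring(() => SumWithFreshBuffers(iterations));
        int reuse = Gen0CollectionsDuring(() => SumWithReusedBuffer(iterations));
        Console.WriteLine("gen 0 collections with fresh buffers: " + churn);
        Console.WriteLine("gen 0 collections with reused buffer: " + reuse);
    }
}
```

Both methods compute the same total; only the allocation pattern differs, which is the part the garbage collector cares about.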

When looking at that summary, you may feel like you are miles away from a solution. Yet, the solution is only about 5 clicks away..!

Clicking on Timeline presented me with a simple (ok, insanely complex) looking overview of the garbage collection that had taken place. It looked like this:

The trick here is to select as much of the timeline as you’d like to investigate. The list on the right side of the window will then be sorted according to which type has been allocated the most. In my case, it showed SD.LLBLGen.Pro.ORMSupportClasses.EntityField, just like what I found with WinDbg.

Now here’s why CLR Profiler is my new best friend. You can simply right-click a type and select “Show Who Allocated”. After a few seconds of work, you get the following display:

How cool is that? It’s a walkthrough of the .NET application, which shows clearly how the data got to be allocated. Scrolling a bit, I found my culprit:

One simple little method, AccountManager.GetAvailableCredits() was causing all this fuss. Nothing left to do except to jump into the code, find the mentioned method and see what it did.

It turned out that to calculate the sum of available credits, LLBLGenPro’s fancy lazy loading technique was being used to read all account lines for the user, inflate each one and sum up the amounts to get a total. It looks really neat in C# code, but is a huge performance killer. LLBLGenPro offers many other ways to solve this problem, or a stored procedure can be used. Just never use lazy loading for aggregating data.
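The LLBLGenPro-specific alternatives are not shown here. As an illustration of the general idea only (letting the database do the aggregation so that a single value comes back instead of one inflated object per row), a plain ADO.NET sketch might look like the following. The table and column names (AccountLine, Amount, UserId) are hypothetical, since the real schema is not given in this post, and the @userId parameter syntax assumes a provider such as SQL Server.

```csharp
using System;
using System.Data;

public static class CreditQueries
{
    // Hypothetical schema: one row per account line, with an Amount and the owning UserId.
    // COALESCE makes the query return 0 instead of NULL when the user has no lines.
    public const string SumSql =
        "SELECT COALESCE(SUM(Amount), 0) FROM AccountLine WHERE UserId = @userId";

    // Lets the database compute the total, so only one value crosses the wire
    // and no per-row entity objects are created. The connection must already be open.
    public static decimal GetAvailableCredits(IDbConnection connection, int userId)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText = SumSql;

            IDbDataParameter parameter = command.CreateParameter();
            parameter.ParameterName = "@userId";
            parameter.Value = userId;
            command.Parameters.Add(parameter);

            return Convert.ToDecimal(command.ExecuteScalar());
        }
    }
}
```

With this shape, the cost of the call no longer grows with the number of account lines on the .NET side, which is exactly what the lazy-loading version got wrong.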

Did any of this make sense? If not, I apologize. The main lesson here is to grab a copy of CLR Profiler, read the docs, view the webcast and start optimizing your code!


C# Trick: Load Embedded Resources In A Class Library

18.03.2008

Here’s a little trick in C#.

If you ever need to distribute and use resources such as text or XML files in a class library, you can use embedded resources. It’s very simple: You add a .txt file to your class library, right-click it, choose Properties and set Build Action to “Embedded Resource”. Then when you need to read the file at runtime, you use reflection:

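// Requires: using System.IO; using System.Reflection;
Assembly assembly = Assembly.GetExecutingAssembly();
// The resource name is <default namespace>.<folder>.<file name>; null is returned if it does not match.
Stream stream = assembly.GetManifestResourceStream("MyClassLib.MyResourceFolder.MyTextFile.txt");
String myText;
using (StreamReader streamReader = new StreamReader(stream))
    myText = streamReader.ReadToEnd(); // disposing the reader also closes the stream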

That’s all. How clever is that?

Obviously, you would use XmlTextReader instead if the resource is an XML file, etc.

If you don’t want to use reflection every time you need that chunk of text, just store it in a variable after the first read. Or you can cache it with HttpRuntime.Cache or HttpContext.Current.Cache, depending on whether you are working on a Windows/console or an ASP.NET application.

Edit: Fixed a small bug in the above code :)
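The "store it in a variable after the first read" suggestion could be sketched as a small helper like the one below. The class name EmbeddedText and its Load method are hypothetical, not part of the .NET framework. The helper caches each resource in a static dictionary and, when a name does not match, lists the names the assembly actually contains, which is usually the quickest way to spot a wrong namespace or folder.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

public static class EmbeddedText
{
    private static readonly object Sync = new object();
    private static readonly Dictionary<string, string> Cache = new Dictionary<string, string>();

    // Returns the text of an embedded resource, reading it from the assembly only once.
    public static string Load(Assembly assembly, string resourceName)
    {
        string key = assembly.FullName + "|" + resourceName;
        lock (Sync)
        {
            string text;
            if (Cache.TryGetValue(key, out text))
                return text;

            using (Stream stream = assembly.GetManifestResourceStream(resourceName))
            {
                if (stream == null)
                {
                    // A wrong name yields null rather than an exception, so report what exists.
                    string available = string.Join(", ", assembly.GetManifestResourceNames());
                    throw new FileNotFoundException(
                        "Embedded resource not found. Available: " + available, resourceName);
                }
                using (StreamReader reader = new StreamReader(stream))
                    text = reader.ReadToEnd();
            }

            Cache[key] = text;
            return text;
        }
    }
}
```

A call would then look like EmbeddedText.Load(Assembly.GetExecutingAssembly(), "MyClassLib.MyResourceFolder.MyTextFile.txt"), with repeated calls served from the dictionary.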


Online Flex Compiler with ASP.NET Web Service

18.12.2007

I just received a couple of very cool books from Adobe a few days ago along with a training DVD, and I’ve been trying to consume as much of it as I could while tying up some loose ends at work before the holidays.

Anyways, two things I like a lot about .NET are reflection and runtime compilation. So I thought I would see how far I could take my (limited) Flex knowledge in that direction, using the new Flex 3 beta 3.

I decided the best way forward would be to create an “online Flex compiler” – that is, a Flex application that allows the user to input some source code and have it compiled and executed directly.

Flex doesn’t allow runtime compilation, so my solution is based on a Flex front-end and an ASP.NET 3.5 Web Service backend, which in turn calls the Flex mxmlc.exe command-line compiler.

I should point out here that Adobe has created another command-line tool called Flex Compiler Shell (fcsh.exe), which speeds up compilation and contains a few features not present in mxmlc.exe. However, I decided that because fcsh.exe is in beta (not production-grade) and mxmlc.exe is a finished product (for Flex 2 at least), I should concentrate on the latter.

My idea was this: Any SWF-file can be compiled using either Flex Builder or mxmlc.exe, and such an SWF-file can also be dynamically loaded, for instance as a module, in a running Flex application.

So why not combine those two things? It might make for an interesting way to allow user-created content to be added to a community RIA in true Web 2.0 style.

Well, I didn’t get that far, but I got the technical solution working :)

The Solution

I can’t show you the finished solution, because I’m sure my poor little web server would choke under the stress of people playing with it – because it’s really quite fun. Instead, I’ve captured a screenshot for you to take a look at.

[Screenshot: Dynamic Compiler]

It works like this:

  • The screenshot above is of a Flex application called “DynamicCompilerClient.swf” that runs in a browser, and which contains a simple TextArea into which the source code for a simple Flex module is loaded.
  • The user can change the code freely and then click “Compile and Run”. Clicking the button calls the ASP.NET 3.5 Web Service “Service.asmx”, which was previously imported into Flex Builder, with the source code as input.
  • The Web Service saves the received source code in a unique folder on the server that is reachable over HTTP (e.g. mapped as a virtual directory) and then uses System.Diagnostics.Process to call the mxmlc.exe command-line tool. The standard output from the compiler is picked up by the Process and the resulting SWF-file “DynamicModule.swf” is saved in the folder.
  • The Web Service returns a URL to the “DynamicModule.swf” to the calling “DynamicCompilerClient.swf”, which picks up that URL and loads the module using ModuleManager.getModule(url).
  • Once the module is loaded, the ModuleEvent.READY is fired, and the module is added to the canvas.
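The server-side step in the list above (calling the compiler through System.Diagnostics.Process and capturing its output) might be sketched roughly as follows. This is not the original Web Service code: the class and method names are hypothetical, and the argument string assumes mxmlc's -output option followed by the source file, which should be checked against the SDK version in use.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public sealed class CompilerResult
{
    public int ExitCode;
    public string Output;
    public string Errors;
}

public static class CompilerRunner
{
    // Builds the argument string for one module; paths are quoted in case they contain spaces.
    // Assumption: mxmlc accepts "-output=<file>" followed by the source file.
    public static string BuildArguments(string sourceFile, string outputFile)
    {
        return "-output=\"" + outputFile + "\" \"" + sourceFile + "\"";
    }

    // Runs a command-line tool and captures its exit code, standard output and standard error.
    public static CompilerResult Run(string exePath, string arguments, string workingDirectory)
    {
        ProcessStartInfo startInfo = new ProcessStartInfo(exePath, arguments);
        startInfo.WorkingDirectory = workingDirectory;
        startInfo.UseShellExecute = false;        // required for stream redirection
        startInfo.RedirectStandardOutput = true;
        startInfo.RedirectStandardError = true;
        startInfo.CreateNoWindow = true;

        StringBuilder errors = new StringBuilder();
        using (Process process = new Process())
        {
            process.StartInfo = startInfo;
            process.ErrorDataReceived += delegate(object sender, DataReceivedEventArgs e)
            {
                if (e.Data != null) errors.AppendLine(e.Data);
            };

            process.Start();
            process.BeginErrorReadLine();         // read stderr asynchronously to avoid a pipe deadlock
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();

            CompilerResult result = new CompilerResult();
            result.ExitCode = process.ExitCode;
            result.Output = output;
            result.Errors = errors.ToString();
            return result;
        }
    }
}
```

The web method would call Run with the path to mxmlc.exe (wherever the Flex SDK is installed) and the unique folder as working directory, then return the module URL only when ExitCode is 0.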

Thoughts on Implementation

Here are my current thoughts on the implementation:

  • It works fine and only took a few hours to implement.
  • Importing an ASP.NET Web Service into Flex Builder was easy and works great for my simple scenario. I haven’t tested complex data structures yet, but I imagine I will at some point.
  • The Flex compilation process itself wasn’t as smooth as using .NET’s Microsoft.CSharp.CSharpCodeProvider etc., but on the other hand dynamic module loading in Flex seems to be less bothersome than dynamically loading (and particularly re-loading) assemblies in .NET.
  • The command-line compiler is pretty slow – it takes approx. 2 seconds each time. However, I imagine a big speed penalty is incurred because mxmlc.exe launches a new Java Virtual Machine on each run and because results are not cached between runs. Fcsh fixes this to a great extent, I’m sure.
  • I wouldn’t exactly call the solution I’ve implemented production grade. I don’t so much mean code-wise (it’s a mess), but rather the architecture: Calling a Web Service that in turn executes a command-line compiler that stores the result on disk and returns a URL seems to be a chain with a lot of weak links.
  • There are, luckily, lots of ways to improve the stability of the solution. For instance, instead of doing things synchronously (which is what would cause my server to choke if I opened up to external requests), the Web Service could asynchronously queue the request in a database, and a Windows Service could then poll the queue, run the compiler and store the result in the database as well, letting the Web Service (again asynchronously) return the result to the caller whenever it’s good and ready. In other words, it’s not so much a Flex/ASP.NET interoperability issue, it’s an implementation issue on my part.
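The queue-based design in the last point could be prototyped along these lines. This is only an in-memory stand-in: a Queue and a background thread take the place of the database table and the Windows Service, and all type and member names are hypothetical. The point is the shape of the interaction, where the caller gets a ticket back immediately and polls for the result later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Stand-in for "run the compiler on this source and return the result".
public delegate string CompileFunction(string source);

public sealed class CompileQueue
{
    private readonly object sync = new object();
    private readonly Queue<KeyValuePair<Guid, string>> pending = new Queue<KeyValuePair<Guid, string>>();
    private readonly Dictionary<Guid, string> results = new Dictionary<Guid, string>();
    private readonly CompileFunction compile;

    public CompileQueue(CompileFunction compile)
    {
        this.compile = compile;
        Thread worker = new Thread(new ThreadStart(WorkLoop));
        worker.IsBackground = true;   // do not keep the process alive for the worker
        worker.Start();
    }

    // Called by the web service: stores the request and returns a ticket right away.
    public Guid Enqueue(string source)
    {
        Guid ticket = Guid.NewGuid();
        lock (sync)
        {
            pending.Enqueue(new KeyValuePair<Guid, string>(ticket, source));
            Monitor.Pulse(sync);      // wake the worker if it is waiting
        }
        return ticket;
    }

    // Called when the client polls: true once the result for the ticket is available.
    public bool TryGetResult(Guid ticket, out string result)
    {
        lock (sync)
        {
            return results.TryGetValue(ticket, out result);
        }
    }

    // Processes one job at a time, so compiler runs never pile up concurrently.
    private void WorkLoop()
    {
        while (true)
        {
            KeyValuePair<Guid, string> job;
            lock (sync)
            {
                while (pending.Count == 0) Monitor.Wait(sync);
                job = pending.Dequeue();
            }

            string output;
            try { output = compile(job.Value); }
            catch (Exception ex) { output = "error: " + ex.Message; }

            lock (sync) { results[job.Key] = output; }
        }
    }
}
```

Because a single worker drains the queue, a burst of requests turns into a longer wait rather than many simultaneous compiler processes. A real version would persist both the queue and the results, as the text suggests.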

Conclusion - and what about Silverlight?

The conclusion has to be: It works fine and I’m fairly happy with the results. It’s not runtime compilation in Flex, but it’s pretty close.

However, now I am really looking forward to Silverlight 2.0 beta, because it seems that this scenario is simpler to handle in a pure .NET environment. Runtime compilation will probably not be possible from within a Silverlight application either (CSharpCodeProvider calls .NET’s C# compiler csc.exe from the command-line behind the scenes), but at least the server-side logic would be less of a hack.

Comments? Thoughts? Let’s hear ’em!