Type Parameters? I want Member Parameters!
August 9, 2005
Generics have certainly opened up a few design opportunities for class library developers that weren’t there before. My gut feeling, however, is that there is more we could do, so I’m going to expand upon my post on indexes in .NET BCL 3.0 collections and introduce the concept of member parameters.
The idea is that members such as properties, fields and even methods become valid parameters for generic types. It would work something like this.
    Customer customer = new Customer();
    customer.ID = "MDENN";
    customer.FirstName = "Mitch";
    customer.LastName = "Denny";

    StringIndex<Customer, Customer.ID> customerIndex = new StringIndex<Customer, Customer.ID>();
    customerIndex.Add(customer);
    QueryResults<Customer> results = customerIndex.Query("MDENN");
You can see in the declaration for “customerIndex” that the second parameter for the type is actually a property. That would support syntax like the following in the Add method (where things are actually inserted into the index).
    public class StringIndex<EntityType, IndexProperty>
    {
        public void Add(EntityType entity)
        {
            // IndexProperty is the member parameter; applying it reads that property from the entity.
            string indexValue = IndexProperty(entity);
            // Add entity to index with indexValue.
        }
    }
You could achieve something similar by using generic delegate types but I think this syntax would be clearer in most situations – maybe it’s just a shortcut syntax. Obviously you would need to be able to apply constraints to the member parameters just like you can to type parameters; for example, you’d probably constrain the StringIndex’s member parameter to members of type string.
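For comparison, here is a minimal sketch of the generic delegate alternative mentioned above. It assumes a later C# version than this post was written against (Func, lambdas and auto-properties), and the dictionary-backed storage and the list returned from Query are my own illustrative choices, not a proposed BCL design.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical index: a key-selector delegate stands in for the member parameter.
public class StringIndex<TEntity>
{
    private readonly Func<TEntity, string> keySelector;
    private readonly Dictionary<string, List<TEntity>> entries =
        new Dictionary<string, List<TEntity>>();

    public StringIndex(Func<TEntity, string> keySelector)
    {
        this.keySelector = keySelector;
    }

    public void Add(TEntity entity)
    {
        // The delegate plays the role of IndexProperty(entity).
        // Note: a null key would make the dictionary throw.
        string indexValue = keySelector(entity);
        List<TEntity> bucket;
        if (!entries.TryGetValue(indexValue, out bucket))
        {
            bucket = new List<TEntity>();
            entries.Add(indexValue, bucket);
        }
        bucket.Add(entity);
    }

    public IList<TEntity> Query(string value)
    {
        List<TEntity> bucket;
        return entries.TryGetValue(value, out bucket) ? bucket : new List<TEntity>();
    }
}

public static class Demo
{
    public static void Main()
    {
        var customerIndex = new StringIndex<Customer>(c => c.ID);
        customerIndex.Add(new Customer { ID = "MDENN", FirstName = "Mitch", LastName = "Denny" });

        foreach (Customer match in customerIndex.Query("MDENN"))
        {
            Console.WriteLine(match.FirstName + " " + match.LastName);
        }
    }
}
```

The delegate's return type does the job of the constraint here: the compiler rejects a selector that doesn't return a string. What you lose is the member appearing in the type declaration itself, which is the bit I find clearer.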
Idea: Indexes in .NET BCL 3.0 Collections
August 9, 2005
Damn! I wanted to post up in reply to the original post that Ariel made but I was just too busy at the time. Hopefully he isn’t too busy to listen now. One of the things that I would like to see added to the collection classes is a series of high performance indexes, for example:
- StringIndex
- DateIndex
- EnumIndex
I’m sure there are a few others that could be dreamed up. What you could then do is query it for matches, for example the following code could be written against a StringIndex.
    StringIndex<Customer> index = new StringIndex<Customer>();
    QueryResultCollection<Customer> results = index.Query(
        "Smi",
        QueryOption.BeginsWith
    );
    foreach (QueryResult<Customer> result in results)
    {
        Customer customer = result.Entity;
        // Do something with customer.
    }
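To make the BeginsWith idea a little more concrete, here is a rough sketch of one way such an index could work internally: keep the keys sorted and binary search for the start of the matching range. Everything here is an assumption for illustration (the key-selector constructor, the Exact option, returning a plain List rather than a QueryResultCollection), and it uses a later C# version than this post was written against.

```csharp
using System;
using System.Collections.Generic;

public enum QueryOption { Exact, BeginsWith }

// Hypothetical sorted index; keys are compared ordinally (case-sensitive).
public class StringIndex<TEntity>
{
    private readonly Func<TEntity, string> keySelector;
    private readonly List<KeyValuePair<string, TEntity>> entries =
        new List<KeyValuePair<string, TEntity>>();

    public StringIndex(Func<TEntity, string> keySelector)
    {
        this.keySelector = keySelector;
    }

    public void Add(TEntity entity)
    {
        // Insert at the lower bound so the list stays sorted by key.
        string key = keySelector(entity);
        entries.Insert(LowerBound(key), new KeyValuePair<string, TEntity>(key, entity));
    }

    public List<TEntity> Query(string value, QueryOption option)
    {
        var results = new List<TEntity>();
        // In ordinal order every key starting with 'value' sits in one
        // contiguous run beginning at the lower bound of 'value'.
        for (int i = LowerBound(value); i < entries.Count; i++)
        {
            string key = entries[i].Key;
            bool match = option == QueryOption.BeginsWith
                ? key.StartsWith(value, StringComparison.Ordinal)
                : string.Equals(key, value, StringComparison.Ordinal);
            if (!match) break;
            results.Add(entries[i].Value);
        }
        return results;
    }

    // Index of the first entry whose key is >= value.
    private int LowerBound(string value)
    {
        int low = 0, high = entries.Count;
        while (low < high)
        {
            int mid = low + (high - low) / 2;
            if (string.CompareOrdinal(entries[mid].Key, value) < 0) low = mid + 1;
            else high = mid;
        }
        return low;
    }
}

public static class Demo
{
    public static void Main()
    {
        var index = new StringIndex<string>(name => name);
        foreach (string name in new[] { "Smith", "Jones", "Smithers", "Smythe", "Small" })
        {
            index.Add(name);
        }

        foreach (string match in index.Query("Smi", QueryOption.BeginsWith))
        {
            Console.WriteLine(match); // Smith, then Smithers
        }
    }
}
```

Inserting into a sorted List is O(n), so a real high performance index would want a tree or trie underneath; the point is only that a prefix query becomes a search plus a short scan.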
Now normally you would rely on the database to do this kind of indexing, but there are technologies out there like Bamboo which are starting to build transactional in-memory object databases. These can be faster than SQL Server but are essentially limited to the memory capacity of the machine, which is fine for so many applications provided you still have transactional integrity.
More RSSBandit Spelunking
August 9, 2005
Definition: Spelunking (spee-lunn-king), when a geek puts his nose where it doesn’t belong.
Some internal interest in my post prompted me to see whether my hypothesis about why RSSBandit is slow is correct. To test it out I downloaded the source for the release that I was using and commented out the code in the OnUpdatedFeed method.
Interestingly this just breaks the automatic generation of the unread posts indicators on the tree view – if you select the node it does update. Also interesting was the fact that the event dispatch code itself is what does the marshalling across to another thread. If this was a stand-alone API you wouldn’t necessarily want to do it that way, but because Dare’s RssBanditApplication is bound to the UI it’s not too much of a problem.
Anyway – here is the allocation graph at <root> – you can see that it has changed a bit. First off the worker thread code is on top of the graph which makes sense because that is what deals with the strings and XML. I don’t have a screen shot but the histogram of allocated types also has strings on top which is what I originally expected.
As I scroll along (here you’ll need to use your imagination; it’s late and screenshots take time) there are no real stand-out problems, memory usage is down and it certainly performs better in the application – but remember, it’s not really magic because all I did was apply a tried and true optimisation technique.
The best way to make code run faster is to not run it at all!
RSSBandit Performance Problems
August 9, 2005
RSSBandit is an awesome RSS aggregator principally developed by Dare Obasanjo. I switched to RSSBandit after NewsGator and Outlook started having performance problems with the number of feeds that I was trying to process each day.
Last weekend I was subscribed to about 1000 RSS feeds and coincidentally last weekend RSSBandit also became unusable. Obviously I had reached some kind of threshold that the architecture of RSSBandit wasn’t designed to cope with.
My first instinct was to ditch and go and find something a bit faster – after all it is a .NET application and we know how much of a memory hog those things are! Errr – hang on. Don’t I encourage our customers to go out and use .NET to build mission critical enterprise applications every day? I really needed to take a closer look at what was going on.
When idle RSSBandit takes up around 120–170MB of RAM on my laptop. That’s more than Outlook and SQL Server, and often more than Visual Studio (except when it’s in full flight), but to be honest I’m not that surprised because in order for it to give me the unread items count it has to process quite a few files containing lots of unique strings – that means relatively large chunks of memory being allocated just for data.
I decided to look a bit deeper and run the CLR Allocation Profiler over the code to see where all the memory (and by extension good performance) was going. I remembered this article by Rico Mariani which included the sage words that “space is king” and, while I waited for the profiler to download, tried to guess what the problem would be based on my previous experience.
What I imagined was buckets of string allocations to store posts in their entirety and a significant number of XML related object allocations but when I looked at the allocation graph I saw something interesting.
The top four items on that list are 44MB, 33MB, 20MB and 16MB (roughly). I had actually expected to see System.String at the top of that list and with significantly more memory allocated – the histogram of allocated types wasn’t giving me the full picture.
One of the things that I love about the CLR Allocation Profiler (and coincidentally something that I think it does better than any other tool) is the visualisation of the allocation graph. It allows you to see where memory was allocated in relation to the program code.
The above picture wouldn’t be that surprising if we were looking at a single threaded application because all the objects would be allocated on the main thread, although for multi-threaded applications that handle a bit of data you can expect to see some thicker lines heading away from <root>.
As I scrolled along I saw something that alarmed me – or at least something that I didn’t expect – there was quite a bit of memory traffic resulting from the OnUpdatedFeed method firing. Presumably this is to notify code in the application that some read/unread statistics have been updated for a particular feed.
OnUpdatedFeed probably fires off an event that multiple subscribers are listening to, which is causing the object allocation. I guess architecturally it’s a reasonable approach provided the event doesn’t get fired too often, but one of the issues with RSSBandit is that it recomputes its read/unread counts every time you open the application and on an ongoing basis while you are using it.
I decided to figure out what was causing the memory allocations so I continued moving along the line to see where the trail led me and I arrived at this little display.
As you can see there is a huge amount of traffic between this native function and the NativeWindow class. It was at this point that I started to suspect what the actual problem was and had to giggle at how many times this same problem pops up in smart client applications.
From what I can tell the problem is that an excessive amount of marshalling to the UI thread is going on. This is causing threads to synchronise (telltale DynamicInvoke calls are in there) and quite a bit of short-term memory to be allocated over the lifetime of the application. Notice that there is 610MB of traffic between the native function and NativeWindow, so obviously that memory isn’t hanging around.
The fix? I don’t know – but I suspect if I went into the RSSBandit source and unplugged the UI updates from the UpdatedFeed event the UI responsiveness would increase significantly (the background thread wouldn’t be continually breaking into the main loop to update an unread count on a tree node).
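Short of unplugging the updates altogether, one common way to cut down this kind of marshalling is to coalesce notifications so that at most one refresh is queued for the UI thread at a time. The sketch below is not RSSBandit code and I haven’t tried it against the RSSBandit source; CoalescingUpdater, postToUi and refresh are names I made up. In a Windows Forms application postToUi would wrap something like Control.BeginInvoke, but here a plain queue stands in for the message loop so the example runs as a console program.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper: collapses bursts of change notifications into one UI refresh.
public class CoalescingUpdater
{
    private readonly Action<Action> postToUi; // marshals a callback to the UI thread
    private readonly Action refresh;          // the actual UI update
    private int pending;                      // 0 = nothing queued, 1 = refresh queued

    public CoalescingUpdater(Action<Action> postToUi, Action refresh)
    {
        this.postToUi = postToUi;
        this.refresh = refresh;
    }

    // Safe to call from a background thread as often as you like.
    public void NotifyChanged()
    {
        // Only the caller that flips 0 -> 1 pays for marshalling.
        if (Interlocked.Exchange(ref pending, 1) == 0)
        {
            postToUi(RunRefresh);
        }
    }

    private void RunRefresh()
    {
        // Clear the flag first so changes arriving during the refresh queue another one.
        Interlocked.Exchange(ref pending, 0);
        refresh();
    }
}

public static class Demo
{
    public static void Main()
    {
        var messageLoop = new Queue<Action>(); // stand-in for the UI message queue
        int refreshCount = 0;
        var updater = new CoalescingUpdater(messageLoop.Enqueue, () => refreshCount++);

        // Simulate a background thread reporting 1000 feed updates in a burst.
        for (int i = 0; i < 1000; i++)
        {
            updater.NotifyChanged();
        }

        // Simulate the UI thread draining its queue.
        while (messageLoop.Count > 0)
        {
            messageLoop.Dequeue()();
        }

        Console.WriteLine(refreshCount); // prints 1
    }
}
```

The trade-off is that the refresh has to recompute whatever it shows (say, the unread counts) from current state instead of being handed one feed at a time.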
I don’t mean to pick on Dare or RSSBandit here – this is actually a common problem in managed code when people try to build user interfaces which present constantly updating statistics to users. As a matter of fact we present a similar scenario to this (thread synchronisation) in our IS.NET course and lead students through the things they need to do to find the problem.
P.S. I’m going to party like it’s 1999!
Clarke posts a follow-up.
August 9, 2005
In response to my post last week about something on the aus-dotnet mailing list, Clarke Scott has posted up his response to avoid any confusion over his position. Thanks Clarke!