Monthly Archives: January 2005

GNUstep has some sex appeal.

I picked up a link to this Objective-C/GNUstep demo from the “This TAG is Extra” blog. I’ve never coded in Objective-C, so this development tool was definitely going to be unfamiliar territory for me, and in general I wasn’t impressed with the look and feel of the IDE, but some of the features of the design surface were very interesting.

I particularly liked the way that it encourages a clean separation of UI logic from the underlying application logic, and the way that you wire up the bindings between what they called the controller class and the UI widgets.

I wonder if there is anything to be learned by looking more closely at these tools.

How to make sure software architects club all animals with the same type of bone.

I read this article at SD Times (referred to by Scott Allen’s blog). Scott looked at it with a humorous slant, but I tend to get frustrated when I see these kinds of attempts to draw a nice neat box around what the various positions in the IT industry do.

The reality is that computing as a profession is still too young to define in any meaningful way what the role of a software architect is, let alone develop a test to judge a candidate’s suitability for the job. Further, the materials we use to do our jobs differ depending on the platform, so a certification supported by Fujitsu, HP, Hitachi, IBM and Sun is definitely going to be slanted towards the approaches that work for their platforms.

Today we essentially have something akin to very diverse and geographically distributed tribes of prehistoric man, each isolated enough to be developing unique approaches to tackling their problems. If you force them together too early, a great opportunity to cultivate new technologies and skills is lost.

Comments are for the weak.

In response to this recent post on The Daily WTF, I’d first like to point to the second last entry on The Klingon Programmer’s Code of Honour. But this post is more than a lame attempt to get my blog associated with the word “Klingon”.

When it comes to comments I actually have the opinion that less is more, and I despise those educational institutions and coding guidelines which dictate the use of verbose commenting. Now, I’m not talking about the kind of documentation you get for free from good code structure; I’m talking about the wall of green that can often be described as a code smell.

Comments aren’t bad, they are just often misused. For example, if you are using a low-level API where the function calls really don’t describe what is going on, then a quick comment can be helpful, but separating that chunk of code out into a method of its own would serve exactly the same purpose and at the same time contribute descriptive information to the call stack in the event of an error.
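To put some code behind that (the invoice maths and the bulk-discount rule here are entirely made up), the two versions below do exactly the same work, but only the second one gets descriptive names into the call stack:

public class InvoiceMath
{
    // Before: a comment has to explain what the block is doing.
    public static decimal CalculateTotal(decimal[] lineAmounts)
    {
        decimal total = 0;

        // sum the line amounts and apply a 5% bulk discount to big orders
        foreach (decimal amount in lineAmounts)
        {
            total += amount;
        }
        if (lineAmounts.Length > 10)
        {
            total *= 0.95m;
        }

        return total;
    }

    // After: the comment has become method names, and those names will show
    // up in the stack trace if anything throws inside them.
    public static decimal CalculateTotalRefactored(decimal[] lineAmounts)
    {
        return ApplyBulkDiscount(SumLines(lineAmounts), lineAmounts.Length);
    }

    private static decimal SumLines(decimal[] lineAmounts)
    {
        decimal total = 0;
        foreach (decimal amount in lineAmounts)
        {
            total += amount;
        }
        return total;
    }

    private static decimal ApplyBulkDiscount(decimal total, int lineCount)
    {
        // made-up business rule: orders with more than ten lines get 5% off
        return lineCount > 10 ? total * 0.95m : total;
    }
}

If ApplyBulkDiscount ever throws, the stack trace tells the maintenance programmer exactly what was being attempted without them ever reading a comment.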

So there you have it: light commenting isn’t just good coding style, it’s good from a maintainability point of view. Now that I have put the cat amongst the pigeons, let me step back a bit and acknowledge that there are always exceptions. For example, demo code is great when it has comments, especially if you are unfamiliar with most of the frameworks being used. But my advice is simply this: when you are compelled to enter a comment, stop and think whether there is a better way to describe it to the maintenance programmer.

Concurrency in programming – now more important.

Earlier this month I found a link on Larry Osterman’s blog to an article by Herb Sutter entitled “The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software”. I strongly encourage you to go and read that article now. It talks about the apparent ceiling that we are hitting in the performance capabilities of CPUs, and how, in order to get the most out of upcoming multi-core processors, software will need to be designed with concurrency in mind.

The interesting thing is that at the application architecture level this has been happening for some time. Software architects dealing with extreme performance requirements have been designing their applications to be scaled out not just across processor cores but across entire farms of computers. What we will witness is this design philosophy filtering down into some aspects of business algorithm development.

I’m not saying that products like Microsoft Word are going to require multi-processor/multi-core machines, but an algorithm could detect the capabilities of the environment and spawn a thread for each available core to process its share of a partitioned result set.
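As a rough sketch of what I mean (I’m assuming the Environment.ProcessorCount property from the Whidbey/.NET 2.0 bits here, and the squaring is just a stand-in for genuinely expensive work), the idea is simply to split the data into one chunk per core and hand each chunk to its own thread:

using System;
using System.Threading;

// A sketch only: split the work into one chunk per core and give each
// chunk its own thread.
class ChunkedProcessor
{
    static void Main()
    {
        int[] data = new int[1000];
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = i;
        }

        ProcessAll(data);
        Console.WriteLine("Processed {0} items on {1} cores.",
            data.Length, Environment.ProcessorCount);
    }

    static void ProcessAll(int[] items)
    {
        int cores = Environment.ProcessorCount;          // Whidbey/.NET 2.0 property
        int chunkSize = (items.Length + cores - 1) / cores;
        Thread[] workers = new Thread[cores];

        for (int i = 0; i < cores; i++)
        {
            int start = i * chunkSize;
            int end = Math.Min(start + chunkSize, items.Length);
            workers[i] = new Thread(new ThreadStart(new ChunkWorker(items, start, end).Run));
            workers[i].Start();
        }

        // Nothing continues until every chunk has been processed.
        foreach (Thread worker in workers)
        {
            worker.Join();
        }
    }
}

class ChunkWorker
{
    private int[] items;
    private int start;
    private int end;

    public ChunkWorker(int[] items, int start, int end)
    {
        this.items = items;
        this.start = start;
        this.end = end;
    }

    public void Run()
    {
        for (int i = start; i < end; i++)
        {
            items[i] = items[i] * items[i];   // stand-in for the real per-item work
        }
    }
}

Obviously a real implementation would need to worry about exceptions on the worker threads and about whether the chunks are big enough to justify the threading overhead in the first place.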

Platforms that work at a level of abstraction above the underlying OS, like .NET, are in the best position to ease developers into this new world, because the runtime can make optimisations at execution time to split the workload, but that’s not the whole picture.

While .NET implements a consistent asynchronous programming pattern, supported by the compiler, and exposes a number of the underlying concurrency primitives, I’ve got to wonder how long it’s going to be before we start to see first-class language constructs in mainstream languages to cater for concurrency, like Paul D. Murphy’s data-bound dispatcher, which is kind of like a foreach statement where each data element is processed on a separate thread.
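To make that idea a little more concrete, here is my own back-of-the-envelope approximation of such a construct (to be clear, this is not Paul’s dispatcher, just a sketch of the shape I have in mind): a ForEach helper that queues each element to the thread pool and blocks until the whole set has been handled.

using System;
using System.Collections;
using System.Threading;

// Sketch of a "foreach where every element is handled on a thread pool
// thread". My approximation only, not Paul D. Murphy's dispatcher.
public delegate void ItemHandler(object item);

public class ParallelDispatcher
{
    private ItemHandler handler;
    private int pending;
    private ManualResetEvent done = new ManualResetEvent(false);

    private ParallelDispatcher(int count, ItemHandler handler)
    {
        this.pending = count;
        this.handler = handler;
    }

    public static void ForEach(IList items, ItemHandler handler)
    {
        if (items.Count == 0)
        {
            return;
        }

        ParallelDispatcher dispatcher = new ParallelDispatcher(items.Count, handler);
        foreach (object item in items)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(dispatcher.Process), item);
        }

        // Block the caller until every element has been handled.
        dispatcher.done.WaitOne();
    }

    private void Process(object item)
    {
        try
        {
            handler(item);
        }
        finally
        {
            if (Interlocked.Decrement(ref pending) == 0)
            {
                done.Set();
            }
        }
    }
}

public class Example
{
    public static void Main()
    {
        ArrayList names = new ArrayList();
        names.Add("Alice");
        names.Add("Bob");
        names.Add("Carol");

        ParallelDispatcher.ForEach(names, new ItemHandler(HandleName));
        Console.WriteLine("All items handled.");
    }

    private static void HandleName(object item)
    {
        Console.WriteLine("Handled {0}", item);
    }
}

A first-class language construct could obviously do a much better job of this, for example by deciding how many physical threads are actually worth using rather than blindly queueing everything.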

Personally I think that your basic CRUD-style business application isn’t where you are going to find interesting uses of concurrency (unless you are running at the big end of town, 20,000 simultaneous users or more); instead I think it is going to be the more computationally interesting problems, like the ones above, where it will count.

Caroline Price gets a space to call her own.

I caught up with Caroline Price on IM tonight. She’s the awesome lass who used to work with Frank Arrigo and the DE folks at Microsoft here in Sydney. She moved to Singapore to work in the regional office but is now back in Australia! I managed to convince her to start up a blog over at MSN Spaces! Hopefully Frank can update his OPML file once again.

So come on Caroline, tell us a little about yourself – what are you doing now that you are back in Sydney?

EntLib 1.0 Ships

Unless you have been living under a rock you would have heard by now that Enterprise Library, Microsoft’s long-awaited replacement for the ever-popular Application Blocks, has shipped. You can go straight to the download from here.

Enterprise Library, more affectionately known as EntLib, is more than just an update to the application blocks, it’s an evolution. With this release the Patterns and Practices team sought to integrate the blocks in the obvious ways; in particular, configuration is drastically simplified thanks to some new tooling (note the Design namespaces), so you don’t have to remember the sometimes complex configuration schema required to support that level of extensibility.

Just surfing the code for the first time now, it looks like they have actually integrated with the VS.NET designer, which means that if you are a component designer it might be possible to integrate your own components with the blocks at design time to squeeze even more productivity out of your environment.
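For anyone who hasn’t played with the blocks before, code against the Data Access block ends up looking roughly like the snippet below. I’m writing this from memory of the bits, so treat the class and method names (DatabaseFactory, DBCommandWrapper, GetStoredProcCommandWrapper) as my recollection rather than gospel, and the stored procedure name is made up:

using System.Data;
using Microsoft.Practices.EnterpriseLibrary.Data;

public class ProductRepository
{
    public DataSet GetProductsByCategory(int categoryId)
    {
        // The instance and connection string come from configuration that
        // the new design-time tool maintains, not from code.
        Database db = DatabaseFactory.CreateDatabase();

        // "GetProductsByCategory" is a made-up stored procedure name; the
        // API calls are my recollection of the EntLib 1.0 Data Access block.
        DBCommandWrapper command = db.GetStoredProcCommandWrapper("GetProductsByCategory");
        command.AddInParameter("CategoryID", DbType.Int32, categoryId);

        return db.ExecuteDataSet(command);
    }
}

The point is that the connection details and instance selection live in configuration maintained by the new tooling rather than being scattered through your code.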

Good work guys! I’m looking forward to working with the new bits.

P.S. Naturally, Readify is looking at updating its INDUSTRIAL STRENGTH .NET course to incorporate these new tools and advise on how best to leverage them.

Code Camp Oz (I): Mark your calendars!

What do you get if you cross .NET technology with drop bears? Code Camp Oz of course, and the first one has now been scheduled for the 23rd and 24th of April at Charles Sturt University – Wagga Wagga Campus. You can download an ICS file here to mark it in your calendar.

Check back here for announcements about speakers, registration details and other logistical info. Since this is a community event, if you are interested in presenting or helping out please shoot me an e-mail at mitch.denny@notgartner.com.

Spam Forensics System

On the 11th of April 2004 the Australian Spam Act (2003) became law. The law has provisions to impose fines of up to $1.1 million per day.

 The Spam Act prohibits the sending of unsolicited commercial electronic messages that have an Australian link. This means that commercial spam, sent by mobile phone as well as by e-mail, is not permitted to originate from Australia and is not allowed to be sent to Australian addresses, whatever their point of origin.

Enforcement of this law was always going to be a problem, since it’s hard to prosecute people sending generic spam messages for things like pharmaceuticals when the message originated in another country. Prosecuting the senders of spam that originated in Australia (assuming it wasn’t sent from a drone mailer) is a little easier, since Australian law enforcement officials would have access to the culprit.

The Australian Communications Authority is charged with the enforcement of the Australian anti-spam law and has had a moderate amount of success, apparently managing to get Australia off the list of top ten countries from which spam originates (according to Spamhaus). Unfortunately, ACA investigators have been unable to keep up with the flow of spam being reported manually by Australian citizens, let alone the volumes being forwarded into their system by automated spam honey pots (I love Wikipedia).

In December, to help tackle this problem, the ACA went out to tender for expressions of interest for the provision of a Spam Forensics System. That tender closed on the 28th of January, and it will be interesting to see who put in a bid to provide the system.

The tender document (PDF) details that the ACA expects to spend no more than $300,000.00 on the system, which would initially need to handle 250,000 e-mail submissions per day from 50,000 individual users and would need to be theoretically capable of handling five million messages per day. If the average size of a spam message is five kilobytes, five million messages works out to a daily throughput of around twenty-five gigabytes of data, not counting the overhead of the SMTP protocol itself or of the other protocols used for submission.

Speaking of submission, part of the requirements for the EOI was to provide an easy-to-use mechanism for Australian citizens to submit candidate spam messages, one that could work on multiple platforms and optionally let submitters provide additional information, such as why they think a seemingly legitimate e-mail is in fact spam.

As messages are processed by the system it would need to automatically categorize them to assist in targeting investigations. For example, messages that don’t originate in Australia and don’t specifically link to an Australian resource would have a lower priority, because the chance of a prosecution would be much lower. The flip side is that messages containing a new phishing scam affecting Australian businesses, like banks, would automatically be given a higher priority because of the potential for real damage rather than mere annoyance. And finally, blatantly illegal activity like the sale of illicit drugs or child pornography would need to be automatically forwarded to the police for investigation.

In short, the system would need to handle tremendous volumes of data in an efficient and timely manner and could easily become a critical piece of Australian communications infrastructure. If only I had found out about the tender earlier, it would have been something I would have loved to tackle from an architectural point of view.

Since I didn’t have the time to produce something by the deadline I thought I might post a few thoughts on how I might design the system if it was to be built from scratch.

Overall Architecture

First off, I envisage a system that can accept input from a number of different sources: individual e-mails with spam as attachments, web-service calls containing a verbatim copy of the spam message (including headers), and bulk transfers via protocols like FTP.

[Diagram: spam input sources]

Receiving the message is just the beginning; from there it would be processed asynchronously through a pipeline of analysis routines. These routines would be pluggable, so that as the requirements of the system change the pipeline can change to deal with the information more efficiently. The analysis routines would attach findings to the message (whilst keeping the message itself intact), and those findings could be used for routing.
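To give a flavour of what I mean (every type name here is hypothetical), the pipeline could be as simple as an analyser interface that attaches findings to a submission, with a router making its decision purely on those findings:

using System.Collections;

// All of these types are hypothetical - just a shape for the pipeline.
public class SpamSubmission
{
    public string RawMessage;                     // verbatim message, headers and all
    public Hashtable Findings = new Hashtable();  // findings attached by analysers
}

public interface ISpamAnalyser
{
    // Inspect the submission and attach findings; never modify RawMessage.
    void Analyse(SpamSubmission submission);
}

public interface ISubmissionRouter
{
    void Route(SpamSubmission submission);
}

public class AnalysisPipeline
{
    private ArrayList analysers = new ArrayList();

    public void Add(ISpamAnalyser analyser)
    {
        analysers.Add(analyser);
    }

    // Run every analyser over the submission, then let the router decide
    // which investigator queue (if any) it belongs in based on the findings.
    public void Process(SpamSubmission submission, ISubmissionRouter router)
    {
        foreach (ISpamAnalyser analyser in analysers)
        {
            analyser.Analyse(submission);
        }

        router.Route(submission);
    }
}

Because the findings live beside the message rather than inside it, the original evidence is never altered, and swapping an analyser in or out is just a change to how the pipeline is assembled.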

As an example, analysis and routing could work hand in hand to drastically reduce the amount of computation required, by automatically filtering out spam which the ACA couldn’t deal with effectively because it originated from another country and didn’t seem to have any Australian links.

[Diagram: spam analysis and routing]

The origin analysis component could rely on external services such as Geobytes to accurately determine the source of messages. This pattern could be repeated throughout the software until all messages ended up in queues that investigators had some hope of actively tackling. If multiple countries used the same system (or a compatible one) then they could forward leads to each other, including any findings that had already been computed.
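Continuing the sketch from above (and again, the types are hypothetical; the Geobytes service is reduced to an imaginary ICountryLookup interface rather than their real API), an origin analyser and a priority router might look something like this:

// A hypothetical origin analyser that plugs into the pipeline sketched earlier.
public class OriginAnalyser : ISpamAnalyser
{
    private ICountryLookup lookup;

    public OriginAnalyser(ICountryLookup lookup)
    {
        this.lookup = lookup;
    }

    public void Analyse(SpamSubmission submission)
    {
        string address = GetOriginatingAddress(submission.RawMessage);
        submission.Findings["OriginCountry"] = lookup.GetCountryCode(address);
    }

    private static string GetOriginatingAddress(string rawMessage)
    {
        // Grossly simplified: a real implementation would walk the Received
        // headers. This sketch just pretends the first line holds the address.
        int newline = rawMessage.IndexOf('\n');
        return newline > 0 ? rawMessage.Substring(0, newline).Trim() : rawMessage.Trim();
    }
}

// Stand-in for an external geo-location service such as Geobytes.
public interface ICountryLookup
{
    string GetCountryCode(string ipAddress);
}

// A router that de-prioritises anything without an apparent Australian link.
public class PriorityRouter : ISubmissionRouter
{
    public void Route(SpamSubmission submission)
    {
        string origin = (string)submission.Findings["OriginCountry"];
        bool australianLink = submission.Findings.ContainsKey("AustralianUrl"); // set by another (hypothetical) analyser

        if (origin != "AU" && !australianLink)
        {
            return;   // nothing the ACA can realistically act on; archive and move on
        }

        // Otherwise queue it for an investigator (queueing left out of this sketch).
    }
}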

I’ve got more to write on this topic but I’ll hang up the keyboard for now. As I said – it would have been an interesting project to work on.