Monthly Archives: August 2004

Day 0.9 – Switching From Wired to Wireless with Least Priv

Well, it’s the end of the day, and I unpacked my laptop to transfer some e-mail and do a bit of blog reading. The first task when I get home is logging in to my wireless network. During the day I was hooked up to a wired network and had disabled my wireless card (a habit, since some of my clients have strict security policies about wireless connectivity, and even about things like mobile phones with integrated digital cameras).

Anyway, because I was locked down I couldn’t enable my card, so I had to log out and log in as admin and enable it. Then Andrew Storrs pointed me to Aaron Margosis’ blog. Thanks!

Day 0.1 – Update Tuesday and Running with Least Privs

Okay, so I am just into my new kick of running without Administrator rights and I have hit the first snag. A few years ago I set up an item in my Outlook task list called “Update Tuesday”. It goes off every Tuesday to remind me to visit things like Windows Update and to update my anti-virus signatures (I run in roaming mode since I rarely VPN in now that we are using RPC over HTTP for Outlook).

It popped up just a few moments ago and I robotically went off to Windows Update, which told me that I need to have Administrator rights. No problem, I just logged out and logged in as the Administrator, although next week I think I might try to find a way to do this by launching IE with the runas command (although that could be risky given what WU does).

The next item on the task list is updating the anti-virus signatures. This would have worked just fine except I had paved my machine since the last Update Tuesday and needed to re-install the anti-virus software from scratch. Since I was already logged in as Administrator I VPN’d into work, went to the anti-virus server and kicked off the ActiveX-based auto-downloader for the anti-virus software. Worked a treat. I am now back in roaming mode, logged in with my POUA (Plain Old User Account), and it updates the signatures across the Internet just fine.

Back to your regularly scheduled coding . . .

Martin Granell is blogging!

Martin Granell has just started blogging! Martin is also a Senior Consultant and Instructor at Readify, and his first post is about a recent trip he made to Seattle to present our Industrial Strength .NET course. I’m really envious because he got to meet and present to Ward Cunningham! I hope you got me an autograph, Martin!

List of Readify Bloggers:

Have I missed anybody?

Day 0 with Non-Admin Privileges

I’m currently listening to .NET Rocks! over the web, where my favorite radio personalities Carl Franklin and Rory Blyth are talking to Don Kiely. You can pull down the recorded session here. Don (amongst others) has been pushing the “developing with least priv” angle for quite some time now and to be honest I have really resisted, but I thought I might try it out and see if my productivity is affected.

Anyway – I’ll post up here if I have any problems and probably regurgitate the solutions that others have come up with to perform common development tasks.

Combating the Comment Spam Problem

Spammers have already created a wasteland out of e-mail networks, and I am damned if I am going to let them ruin what has basically become my number one source of brain food – blogs.

Spammers are using comment submission forms and trackbacks to insert unwelcome content into comments. What’s worse is that these parasitic posts end up getting indexed by search engines like Google, kind of like scars (scroll down to the bottom) that hang around long after the offending comment has been removed.

The problem is that a lot of the bloggers I subscribe to are starting to talk about turning off comments. I believe that the MSDN bloggers now can’t receive comments on posts older than thirty days.

This weekend, between sleeping, folding clothes with my wife and taking my daughter swimming, I have spent some serious hours looking at ways that comment spam can be tackled. One mechanism that has been discussed for e-mail is Hashcash, which I found whilst searching for literature related to Microsoft Research’s Penny Black project.

Hashcash is a simple mechanism sometimes used in e-mail to force the sender of a message to perform some moderately expensive computation that produces a “stamp”, which can be sent along with the message and cheaply verified by the receiver. A stamp looks something like this:

0:030829:foo123456789:lnymsmzsbksvkavrzltdcr/+

Hashcash actually has two stamp formats: version 0, and version 1, which has just been released. I focused on implementing some code that would produce version 0 stamps because it was easier, but I think version 1 would be better moving forward.

The version 0 stamp is broken down into four parts, each separated by a “:” character.

  1. Stamp format identifier.
  2. Date time in yymmdd format.
  3. Resource identifier.
  4. Random characters.

The idea is that the sending software produces a “candidate” stamp where the first three parts remain the same and the random part changes. A SHA1 hash is then performed on the candidate stamp. The hash is then analysed to see how many of the leading bits are zeros. The more zeros, the higher the value of the stamp.

When an e-mail is sent, the recipient expects it to carry a stamp of at least a certain value. Stamps are easy to verify: perform the SHA1 hash on the stamp and count the leading zero bits.
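
As a rough sketch of that verification step (in Python rather than .NET, and not the code linked from this post), counting the leading zero bits might look something like this. The function names are made up for illustration, and it assumes the hash is taken over the whole stamp string as described above.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count how many of the leading bits of a digest are zero."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # 8 - bit_length() is the number of zero bits before the first 1 bit.
        bits += 8 - byte.bit_length()
        break
    return bits


def stamp_value(stamp: str) -> int:
    """Value of a stamp: leading zero bits in the SHA1 hash of the stamp text."""
    return leading_zero_bits(hashlib.sha1(stamp.encode("ascii")).digest())


def is_acceptable(stamp: str, required_bits: int) -> bool:
    """A receiver accepts the stamp only if it is worth at least required_bits."""
    return stamp_value(stamp) >= required_bits
```

The receiver only ever does one hash per stamp, which is what makes checking so much cheaper than minting.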

Because not every random string produces a hash with the required number of leading zeros, minting has to be attempted many times (the code I have running in the background here produces a stamp of sufficient value in somewhere between 3,000 and 400,000 hashing iterations).

So how does this help? Well, the Hashcash FAQ lays it out better than I do, but basically spammers use armies of drone mailers. This is profitable for them because it is very cheap to send an e-mail, but forcing the mailer to “mint” a stamp slows down the sending rate, especially since a stamp has to be produced for each recipient. The idea is to destroy the economic model that spammers rely on.

How does this relate to blogs? Well, I think that we could use a similar mechanism to protect our comments from spammers who would use form posters and trackbacking drones to pollute them. The mechanism would be implemented in two independent pieces.
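
The minting loop can be sketched along these lines. This is a minimal Python illustration under a few assumptions of mine, not the .NET code linked from this post: the name `mint`, the 16-character random part and its alphabet are all invented for the example, and the resource must not contain a “:” character.

```python
import hashlib
import itertools
import secrets
import string

# Characters used for the random part; this particular alphabet is an assumption.
ALPHABET = string.ascii_letters + string.digits + "+/"


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the front of the digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def mint(resource: str, date: str, required_bits: int) -> tuple[str, int]:
    """Try random suffixes until the stamp hashes to enough leading zero bits.

    Returns the stamp together with the number of hashing attempts it took.
    """
    prefix = f"0:{date}:{resource}:"  # the first three parts never change
    for attempts in itertools.count(1):
        rand = "".join(secrets.choice(ALPHABET) for _ in range(16))
        candidate = prefix + rand
        digest = hashlib.sha1(candidate.encode("ascii")).digest()
        if leading_zero_bits(digest) >= required_bits:
            return candidate, attempts
```

Calling something like `mint("foo123456789", "040829", 16)` returns a stamp and the attempt count, and the count varies a lot from run to run because each attempt is an independent roll of the dice.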

Hashcash for Comments

Now this is the thing that I haven’t tested, but in theory it will work. We implement a hashcash minting algorithm in JavaScript that can be embedded in the comment submission forms. When the submit button is clicked the algorithm kicks off and inserts the stamp into a hidden field that can then be read by the server-side code.

The main prerequisite for getting this working would be a SHA1 algorithm in JavaScript, and I’ve found one here. I haven’t tested it, but it is a starting point for someone who wants to tackle this side of things.

Details like the resource name and stamp value (from the v1 format of Hashcash) would be provided by the server when it renders the page that contains the JavaScript implementation of the hashcash algorithm.

Hashcash for TrackBacks

OK, I won’t pretend to be an expert on how trackbacks work, but my idea is to place some additional content in the RDF element contained within blog pages to describe what kind of stamp format is required and its value (just in case we want to change this later on). Here is a sample of the modified RDF.

Now, let’s go through a scenario. When I make a post using BlogJet (sub in your favorite poster here) to .Text (sub in your favorite blog host here) it goes off to do the trackbacks. Before it can do a trackback it needs to pull down the content of the referenced URL and search for the RDF element to get the trackback URL.

At this point it also looks for the hashcash information and computes a stamp of the required value, passing in the URL (URL encoded) as the resource name. I’m not sure yet whether the resource name should be the trackback URL or the originally referenced URL; my current thinking is that it should be the trackback URL (including the unique query string).

.Text would then attempt to make the trackback, but in addition to the normal arguments in the HTTP payload it would also include the stamp value. The recipient of the trackback would then quickly verify the stamp by performing a SHA1 hash on it and ensuring it had the required value.
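
To make the receiving side a little more concrete, here is a hedged Python sketch of the check a trackback recipient might run. It assumes the resource is the URL-encoded trackback URL (my current thinking above, not a settled decision), and the function names and the example URL are hypothetical.

```python
import hashlib
from urllib.parse import quote


def stamp_value(stamp: str) -> int:
    """Leading zero bits in the SHA1 hash of the stamp text (SHA1 is 160 bits)."""
    digest = hashlib.sha1(stamp.encode("utf-8")).digest()
    return 160 - int.from_bytes(digest, "big").bit_length()


def verify_trackback_stamp(stamp: str, trackback_url: str, required_bits: int) -> bool:
    """Accept a version 0 stamp only if it was minted for this trackback URL."""
    parts = stamp.split(":")
    if len(parts) != 4:
        return False
    version, _date, resource, _rand = parts
    if version != "0":
        return False
    # URL-encoding with safe="" also escapes ":" so the resource cannot
    # be confused with the stamp's own field separator.
    if resource != quote(trackback_url, safe=""):
        return False
    return stamp_value(stamp) >= required_bits
```

Tying the stamp to one trackback URL is what stops a drone from minting a single stamp and replaying it against every post on a site.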

Here is a link to the REALLY REALLY POORLY PERFORMING CODE that I wrote whilst getting to know Hashcash. I haven’t really factored the code well and there are no comments; in fact, you can tell it was a quickie by the namespace. But I thought that the .NETers amongst you would probably find the .NET class library more hospitable than some of the C, Java, Perl, and Python that I have read over the last few days.

Update: If you are .NET savvy and took a look at my implementation, I suggest that you flip over to using the SHA1Managed algorithm. It seems faster than the SHA1CryptoServiceProvider, presumably because there is no managed/unmanaged transition. I’m running the B1 bits of Whidbey at the moment – does anyone have a profiler that will work with this setup?

Known Issues

I know there are issues with the proposal; that’s why I am going to call this a starting point.

The first issue that I can see is that as the required number of zeros in the stamp climbs, so does the cost, and I can see a point in the very near future where that would be too expensive (in fact the code presented above requires 24 zeros, and that’s pretty costly in my slow implementation).
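
A quick back-of-the-envelope calculation shows how fast the cost climbs. If the SHA1 output is treated as uniformly random, each attempt has a 1 in 2^n chance of producing n leading zero bits, so the expected number of attempts is 2^n and every extra bit doubles the work. The helper names below are made up, and the hash rate is a parameter you would have to measure yourself.

```python
def expected_attempts(required_bits: int) -> int:
    """Average number of hashes needed to find required_bits leading zeros."""
    return 2 ** required_bits


def expected_seconds(required_bits: int, hashes_per_second: float) -> float:
    """Average minting time for a given (measured or assumed) hash rate."""
    return expected_attempts(required_bits) / hashes_per_second


# 24 zero bits works out to 16,777,216 attempts on average.
print(expected_attempts(24))
```

These are averages only; any individual stamp can turn up much sooner or take much longer.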

Microsoft Research listed some alternatives with their Penny Black research (not necessarily the original source) including memory bound functions which relied on CPU cache misses to prove the effort, which is more egalitarian than pure CPU horsepower effort.

Similarly, if we look at huge blog hosts like http://weblogs.asp.net, the burden of producing stamps on behalf of their users would cause flat heads on the boxes hosting the system. In this instance I recommend that the posting applications supply the hosting system with a set of stamps which match the URLs referenced in the post. Web-based posters could use the same JavaScript approach used for comments.

There will also be an adoption issue, although given the rate at which technical blogging folk adopt new things like this it’ll probably be more successful than e-mail. Hopefully this post serves as a rallying point for the likes of BlogJet, NewsGator, Scott Watermasysk and everyone else who contributes bits to the blogosphere.

Comments, suggestions, implementations? Leave comments.

Resources for Implementors

Generics: Old habits die hard . . .

I’ve been using .NET for over four years now and in that time I’ve managed to get some hard-wired behaviours. Just then I wanted to record a list of sample points as I dragged my mouse across a form and automatically I thought ArrayList (given the requirements – unknown number of points).

Anyway, I remembered that I was using Whidbey and had access to generics, so getting the points back out would be much easier (no need to unbox etc). It’s nice that I noticed in time, but it’s also worrying that in only four years I’ve come to write this kind of code automatically.

Games, Google and Physics

In my spare time I am teaching myself a little about physics, something that I was really bad at when I was going through high school (how terribly un-geeky of me). Anyway, as a lesson plan I thought I might go about getting the knowledge that I would need to implement a simple moon lander (I love moon lander games).

Right now I’ve mastered the basic physics required to implement the game (gravity, motion, atmospheric resistance), but I wanted to know more about collision detection (not physics strictly speaking, but this is also about moon lander, remember). Anyway, one of the first hits on Google led me to this game.

Actually, this page led me to the game, and also to this tutorial on collision detection. Time to curl up on the couch with the laptop and have a read.

Thought #1 on API Design

One thing that most programmers will need to do in their career is design an API that can be used by others. Don’t let anyone fool you, API design is an art-form, not a science, and not everyone is good at it.

Don’t get me wrong, I’m not suggesting for a second that the APIs that I develop are God’s gift to developers. I KNOW some of the APIs that I have created in the past have had warts; in fact some of them were complete festering blobs. Hrm, enough imagery, let me know in the comments if you want to know more about the crap APIs that I have written.

In the last month I have had the misfortune to experience two instances where a single class library has the same class name used twice, separated only by the namespace. That by itself isn’t criminal, but what is criminal is that it is highly likely that you would want to have both namespaces referenced in the same source file, causing many naming conflicts.

The first offender is from Symbol, probably the largest provider of industrial mobile technology like ruggedised Pocket PCs and Windows CE devices. Symbol have a mobile SDK (sorry, it’s a login page) which includes managed APIs that allow access to many of the value-added functions on their devices, such as bar-code scanning, mag-stripe card reading and more.

It so happens that mag-stripe reading and bar-code reading are so similar that they share a common base class – OK, I’ll go along with that. The problem is that they named the base class “Reader”, the bar-code reader “Reader” and the mag-stripe reader “Reader” too! To make matters worse, there is a “ReaderData” class which gets passed into readers, and that is subclassed using the same technique.

In order to make the type names unique, Symbol chose to break the class libraries up using namespaces. So the fully qualified type names would be:

  • Symbol.Generic.Reader
  • Symbol.Generic.ReaderData
  • Symbol.Barcode.Reader
  • Symbol.Barcode.ReaderData
  • Symbol.Magstripe.Reader
  • Symbol.Magstripe.ReaderData

So basically I would need to use the Barcode namespace and the Generic namespace together in code, so the types in one namespace are going to have to be fully qualified. How did they end up with this implementation?

I can really only speculate, but I spent some time thinking about it yesterday and thought it might have been the outcome of some testing exercise where they wanted to be able to flick between the types of readers just by flicking a using/Imports statement.

So, I said there were two examples, didn’t I? OK, the final example is none other than the Enterprise Instrumentation Framework. I really like EIF; one piece that I particularly like is the event raising model, where the specific events you raise supply much of the context that would normally be baked into a string.

If you are using EIF then you are more than likely going to need to implement a sink, and sometimes when you do that you are going to want to read configuration data from EIF. However, in EIF the configuration class for an EventSink is named the same as the EventSink base class (EventSink), once again separated only by namespaces. Basically it has the same problem as the Symbol APIs.

OK, this has turned into a bit of a rant, but really I just want to encourage people to try using their APIs before they put them out there. It’s one of the reasons that I like the harvest frameworks: they tend not to have this problem.