Part of my role at Readify is keeping abreast of what is going on in the IT sector from a broad perspective and then understanding how it will impact our customers and by extension the Readify business itself.
Over the past couple of days I’ve received a few requests from people inside the business for input on various topics. One of those topics was the recent announcement by AGIMO (Australian Government Information Management Office).
Specifically, AGIMO reversed a decision to promote OOXML over ODF, and now recommends ODF instead. The change was part of a larger document which provides guidance on what departments should consider when defining an SOE (Standard Operating Environment).
Tale of Two Formats
Both ODF and OOXML are specifications which describe how documents of various types can be stored physically in digital media. Both specifications seek to solve the same problem, but differ in their origins and therefore their implementation details.
OOXML (or Office Open XML) has been the standard file format for Microsoft Office products since Office 2007. It is effectively a group of files contained within a ZIP file (but renamed to a specific extension such as *.docx). One of the files inside the archive is an XML file which contains the actual document content, and other files (such as images) are referenced from it.
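To make the "ZIP file full of XML" point concrete, here is a minimal Python sketch. It builds a tiny *.docx-style package in memory and then reads it back using nothing but the standard ZIP and XML libraries. The helper names (build_minimal_docx, list_parts, extract_text) are my own inventions for illustration, and the package is deliberately stripped down: real documents produced by Word contain many more parts, and I make no promise that Word will open this one.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace used by the main WordprocessingML part (word/document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_minimal_docx(text: str) -> bytes:
    """Hypothetical helper: build a stripped-down .docx-style package in memory."""
    content_types = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
        '<Default Extension="rels" '
        'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Default Extension="xml" ContentType="application/xml"/>'
        '<Override PartName="/word/document.xml" ContentType="application/'
        'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
        '</Types>'
    )
    rels = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<Relationships '
        'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
        '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
        'officeDocument/2006/relationships/officeDocument" '
        'Target="word/document.xml"/>'
        '</Relationships>'
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f'<w:t>{escape(text)}</w:t>'
        '</w:r></w:p></w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", content_types)
        package.writestr("_rels/.rels", rels)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


def list_parts(package_bytes: bytes) -> list[str]:
    """Return the names of the files stored inside the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.namelist()


def extract_text(package_bytes: bytes) -> str:
    """Join the text runs of each paragraph in the main document part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        runs = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


if __name__ == "__main__":
    package_bytes = build_minimal_docx("Hello from inside a ZIP file")
    print(list_parts(package_bytes))
    print(extract_text(package_bytes))
```

Pointing list_parts at the bytes of a real *.docx file should show the same general layout, just with more parts (styles, settings, media and so on).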
ODF (or OpenDocument Format) also uses XML to persist document data and it can be either compressed with a ZIP algorithm or uncompressed. In this way it is very similar to OOXML. The exact structure of the XML is where the two formats differ and ODF has its origins with Open Office, originally part of Sun Microsystems, then Oracle (which then stopped commercial support for the product). Since then a number of other products have added support for the format.
War of Words
Within the IT community, those who care argue bitterly about which format is best: ODF or OOXML. The ODF proponents argue that even though OOXML is a standard it has tight dependencies on Office/Windows. On the other hand, the proponents of OOXML (mostly those who use Office) argue that this doesn’t matter, since people want 100% fidelity between what they see in their Office products and what is stored to disk.
In this war of words (pun intended) what seems to be missing is a bit of a reality check about software development, proprietary formats, preferred formats and lossy format conversion. The truth is that every document creation product on the market is initially designed to persist data in a format which it can faithfully read back to reproduce the document on the screen just as it was when persisted.
Tendency to Diversification of Formats
If you have two word processing products, you are going to have two file formats, and whilst both products might support the other file format you can bet that information loss will occur when product A saves in file format B and vice versa. The argument is that by introducing a standard file format C that both products support, the problem is solved. Unfortunately this isn’t the case. What you have now is two products, three file formats, and imperfect support for the standard format from product to product.
Whilst it is possible to achieve good standards support over time (just look at HTML and CSS over the years), it requires a massive effort that vendors and by extension their customers would probably be unwilling to pay for. So we are where we are.
Putting my .NET developer hat on for a second I personally would prefer to work with OOXML because it is possible for me to (relatively) easily interact with OOXML based documents using the System.IO.Packaging namespace. In fact Readify has worked on a number of applications where templates are produced as *.docx files and we process the file in .NET to spit out the finished product. I am sure that others could make similar arguments for ODF too.
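To give a flavour of what that kind of template processing looks like, here is a rough sketch. It is written in Python with the standard zipfile module rather than .NET and System.IO.Packaging, and it is not the code from any Readify project. The {{name}} placeholder syntax and the helper names (fill_template, make_stub_template) are assumptions of mine, and the stub template is a stand-in package, not a valid Word file.

```python
import io
import zipfile
from xml.sax.saxutils import escape

MAIN_PART = "word/document.xml"  # main document part of a .docx package


def fill_template(template: bytes, values: dict[str, str]) -> bytes:
    """Copy the package, swapping {{name}} placeholders in the main part.

    The {{name}} convention is hypothetical, chosen for this sketch only.
    """
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == MAIN_PART:
                xml = data.decode("utf-8")
                for name, value in values.items():
                    # Escape the value so characters like & stay valid XML.
                    xml = xml.replace("{{" + name + "}}", escape(value))
                data = xml.encode("utf-8")
            # Every other part (styles, images, ...) is copied unchanged.
            target.writestr(item.filename, data)
    return output.getvalue()


def make_stub_template() -> bytes:
    """Hypothetical stand-in for a template authored in Word."""
    document = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body><w:p><w:r>'
        '<w:t>Dear {{customer}}, your total is {{total}}.</w:t>'
        '</w:r></w:p></w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")  # placeholder part
        package.writestr(MAIN_PART, document)
    return buffer.getvalue()


if __name__ == "__main__":
    filled = fill_template(
        make_stub_template(), {"customer": "Smith & Co", "total": "42"}
    )
    with zipfile.ZipFile(io.BytesIO(filled)) as package:
        print(package.read(MAIN_PART).decode("utf-8"))
```

One caveat if you try this on real templates: Word can split what looks like a single piece of text across several runs in the XML, so a plain string replace like this may miss placeholders. A sturdier approach would work with the parsed XML or with dedicated placeholders such as content controls.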
In the end I don’t think it matters. I think that your average user will focus on what they can create with the tools at their disposal and then pick the file format that most faithfully represents that document on disk. Within the Australian Government it is fair to say that Microsoft Office is the dominant player, so if there is a desire to push another file format then what that really means is users putting down their favourite tools such as Word, Excel and PowerPoint in favour of alternatives. That is a lot of inertia to overcome.
In the Case of Emergency
It is important to remember that AGIMO is an advisory body which provides recommendations to government agencies. Whether they pick them up is another question and in my experience various agencies fall to various degrees outside the guidance provided. In some senses it would be good to see AGIMO have a little bit more influence across agencies because it would force many agencies to upgrade their technology stacks.
If that did come to pass you can be sure that organisations like Microsoft might introduce measures into their products where the end-user pre-determines their target document persistence format (instead of when it is first saved) and the document creation tool would then restrict what features are available.
In reality I don’t think it would happen because some proprietary features are just too useful to sacrifice in the name of open standards (that is the whole reason organisations buy specific software packages).
New Hope In The Cloud
In the end the argument about document file formats for presentations, spreadsheets and word processors might end not because we reach agreement on file formats but because the very concept of files goes away. With the propagation of cloud technologies within businesses it is becoming increasingly common for a file to be hosted on a web-server and converted to HTML for display to the end user. Google Docs and Office Web Apps are both good examples of this. And whilst you could upload an Office Document to Google Docs and get render issues (quite a few in fact), if you forget about moving files around and instead expose the file directly from where it was authored then the respective cloud platform takes care of the lossless conversion to HTML on the server.
These days I think of documents less as a sequence of bytes on a disk and more as a “service endpoint” that my client software (e.g. Microsoft Word) talks to in a series of document updates. This approach has enabled cloud-hosted document editors to support multiple people editing a document at once. Watching a group of people update a Word document hosted on SharePoint 2013 is quite a sight. The fact that a file might be transferred from the server to the client is just a side effect of a data caching process which enables you to keep working when connectivity to the cloud service is lost.
AGIMO Announcement in Context
What I thought was particularly interesting is that not much was said about the rest of the document in which these recommendations were couched. The broader document speaks specifically to the provision of Standard Operating Environments within government agencies, and if you are so inclined it’s interesting to look through some of the recommendations and think about how they might impact your productivity.
As a software developer, I would find it impractical to work within some of the constraints in that document, so it is likely that certain groups of users would not be able to work within the COE (Common Operating Environment).
I’m also interested in how AGIMO might approach developing a BYOD (bring your own device) policy, another emerging trend in commercial organisations.
In Summary (and an Ironic Observation)
I personally think that broad statements about preferred document formats (ODF, OOXML or something else) won’t have much impact because ultimately it is the tools that users master which determine the formats they use. If another vendor wants to get a better showing inside Australian Government around document creation, then they are going to have to produce better tools that users love. When that happens, file formats will change.
The ironic thing I observed was that the published SOE build guidelines are in the proprietary Word 97-2003 format (so neither OOXML nor ODF). That’s OK, I was able to convert the document to ODF with only minor conversion errors.