Wednesday, March 04, 2015

Look Mum – no Office…

I was in the process of porting and updating some venerable VB coded software components to C# the other day. There was a multi-page print routine, a multi-column list sorter and a widget to export data from a ListView or DataGrid to Excel or a CSV file. I took a quick look to see if the more recent versions of .NET had anything to add to the process when I came across mention of the Microsoft Open XML SDK supplied through the DocumentFormat.OpenXml.dll.

You can use the Package Manager in Visual Studio to get the dll via nugget

PM> Install-Package DocumentFormat.OpenXml

Which is cool but I would advise looking for the relevant page in the Microsoft Internet Download Centre and grab a copy of the tools installer (OpenXMLSDKTool.msi). This installs some (pretty poor) documentation but can also come in very useful for analysing existing documents.

As I was looking at exporting an Excel file first of all, I started there. Picking up a few clues on the way from http://mikesknowledgebase.azurewebsites.net/pages/CSharp/ExportToExcel.htm. There you can also download a fully coded solution rather cleverly based upon .NET DataSets – which would allow the almost direct export of data returned from a database to Excel. In any case the post is recommended as the techniques are readily adaptable to alternate data sources. One thing I might suggest is changing the file encoding to Encoding.Unicode from Encoding.ASCII as a default.

Implementing an export to Excel routine that did not require the presence of a licensed copy of office on the relevant machine was a win as far as I was concerned. Now it also happened that I was looking at outputting some documents from a code base. Creating a PDF is pretty steady and covers a lot of the requirement but I wondered if outputting a Word .docx file could be a bonus alternative.

A first look at the documentation at https://msdn.microsoft.com/en-us/library/bb448854(v=office.14).aspx (almost bound to be a failed link within days but what can you do) did not thrill me. It looked complex with a vast number of objects to be created and manipulated. This is where the Open XML tool comes into play. If you create and save a Word document using Office you can open the file created using this tool. If you then click the “Reflect Code” menu strip option you can view the code required to produce the document – or any sub component of the document.  If you copy the code and paste it into your project it will run and can re-produce your original Word document perfectly. You may have to resolve some Object naming clashes ( Font and Color) if you paste the code into a Windows form class but you have a ready-made solution ready to run.

You will probably be initially horrified at the repetitive code – just how many times does the same font (and indeed the same colours) need to be defined? [Some may recall the days when folks occasionally used Word to generate a web page’s HTML – the effect is similar.] What you are seeing is of course a strategy that copes well with the user controlled document design process within Word and one that ensures precision when a document comes to be rendered – OK, with some redundancy.

And yes – it will offend any programmer’s soul where the context is the preparation of a particular document layout. So prune away.

The key thing to take away here is that Microsoft have provided another open tool that can be used to interact with the MS ecosystem – and in this instance – with the de facto standard of the Office suite of programs.

No comments: