theThought's thoughts

Kevin A Gray - Creative Strategy Guy

Data Collection to OpenXML a simple class for populating custom parts (the code)

The previous post provided a technical specification for the DLL required to link IBM SPSS Data Collection to the Custom Part in a Word document, so that the responses in the survey can be pushed into the Word document.  This post shows the code used to implement that specification.  The code has been broken down into a simplified form to help me learn and to make it easier to explain.  There are shortcuts that you may want to adopt, and you may want fewer functions, preferring to write the code in a single method; however, I prefer the small-is-beautiful approach.

Defining the class:
As OpenXML constructs are going to be manipulated, it is necessary to add the right references to the project and to import them into the class itself.  As streams are going to be used to read and write XML into the Word document, System.IO has to be imported as well.  No constructor is required, as the key actions take place in the main method (UpdateCustomPart). Consequently the class definition is as follows:

Imports DocumentFormat.OpenXml.Packaging

Imports System.IO

Public Class Complaint

End Class

UpdateCustomPart (Public Method)
The first and main method is UpdateCustomPart.  It receives three parameters and, for the purpose of this example, outputs a Boolean indicating success or failure.  The process this method follows is:

· Parse XML string into a new XML Document

· Open the Word Document

· Locate the Custom Part

· Update the Custom Part

Within this process the XML String, the name and location of the Word document and the name of the Custom Part are passed as parameters so that the same class could be used to update any custom part within any Word document.  The code for this method is as follows:

Public Function UpdateCustomPart(ByVal theFileName As String, ByVal theCustomPart As String, ByVal theXML As String) As Boolean

Dim WordDoc As WordprocessingDocument
Dim CustomPart As DocumentFormat.OpenXml.Packaging.OpenXmlPart = Nothing
Dim xmlUpdate As Xml.XmlDocument

    xmlUpdate = New Xml.XmlDocument

    ' Badly formed XML raises an exception; report it as a failure.
    Try
        xmlUpdate.LoadXml(theXML)
    Catch ex As Exception
        Return False
    End Try

    WordDoc = OpenWordDoc(theFileName)

    ' Without a document there is nothing to update.
    If WordDoc Is Nothing Then Return False

    ' Using closes the document however the function exits.
    Using WordDoc
        CustomPart = FindCustomPart(WordDoc, theCustomPart)

        If CustomPart IsNot Nothing Then
            Return changeCustomPart(CustomPart, xmlUpdate)
        End If
    End Using

    ' The custom part was not found.
    Return False

End Function

In the above function the Try … Catch is used to capture any errors generated from loading the XML (caused by badly formed XML).  Each of the subsequent functions is detailed below; if any of them fails, the subsequent actions are not performed.
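As a minimal sketch of the behaviour the Try … Catch relies on, the console module below shows LoadXml raising an XmlException when the XML is badly formed.  The module name (XmlCheckDemo), the helper (IsWellFormed) and the sample element names are illustrative assumptions and are not part of the Complaint class.

```vbnet
Imports System
Imports System.Xml

Module XmlCheckDemo

    ' Hypothetical helper: True when the string parses as XML.
    Function IsWellFormed(ByVal theXML As String) As Boolean
        Dim doc As New XmlDocument()
        Try
            doc.LoadXml(theXML)
            Return True
        Catch ex As XmlException
            ' LoadXml throws XmlException when the XML cannot be parsed.
            Return False
        End Try
    End Function

    Sub Main()
        ' Matching tags: parses cleanly.
        Console.WriteLine(IsWellFormed("<complaint><name>Test</name></complaint>"))
        ' The name element is never closed: parsing fails.
        Console.WriteLine(IsWellFormed("<complaint><name>Test</complaint>"))
    End Sub

End Module
```

Catching XmlException rather than the general Exception keeps the check limited to parse errors; the class above catches Exception, which also covers cases such as a missing string.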

OpenWordDoc (Private Function)
The purpose of this function is to open the Word document identified by the path/filename string and to return the opened document to the calling routine.  Rather than just attempt to open the file (only to have it fail because the file does not exist), the function first uses a FileInfo object (FindFile) to check that the document exists.  Only if the file exists will the system attempt to open it.

Private Function OpenWordDoc(ByVal theFileName As String) As WordprocessingDocument

Dim WordDoc As WordprocessingDocument
Dim FindFile As System.IO.FileInfo

    FindFile = New System.IO.FileInfo(theFileName)

    ' A new FileInfo is never Nothing, so test Exists rather than the reference.
    If FindFile.Exists Then
        ' True opens the document for editing.
        WordDoc = WordprocessingDocument.Open(theFileName, True)
        Return WordDoc
    Else
        Return Nothing
    End If

End Function

FindCustomPart (Private Function)
This function was actually quite difficult to work out, and it may be that it is not the best way of doing it.  The Word document is a package that contains a significant number of XML sub-documents, so it is necessary to find the right XML document to update.  The Word document was created from within Word and the Word 2007 Content Control Toolkit.  As a consequence I did not name the Custom Part; it was named for me.  Its name is item1.xml and it is contained in the customXml folder.  This document can be located using the relationships definition for the main document part.  If this relationship definition is examined, it can be seen that there is a relationship described for the custom part.

<?xml version="1.0" encoding="utf-8"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
                <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml" Id="rId3" />
                <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml" Id="rId7" />
                <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml" Id="rId2" />
                <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/glossaryDocument" Target="glossary/document.xml" Id="rId6" />
                <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml" Id="rId5" />
                <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml" Id="rId4" />
                <Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/customXml" Target="../customXml/item1.xml" Id="R456ea5aec8a94caf" />
</Relationships>

The Target attribute gives the location of the part relative to the main document part; within the package the URI of the part is /customXml/item1.xml.  As the name of the part is provided by the consumer of the DLL, this URI can be constructed through some simple text manipulation ("/customXml/" & theName & ".xml").  There may be several custom parts within a document, so the Find function iterates through all the custom parts, comparing the URI of each one with the calculated URI.  When it finds a match it returns the matching part.  If it does not find a matching URI the function returns Nothing.
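The sketch below illustrates why the calculated location starts at the package root even though the relationship target starts with "../".  It assumes the main document part lives at /word/document.xml (the usual location, but not stated in the listing above) and uses a dummy http base purely so that System.Uri can resolve the relative path.  The module and function names are hypothetical.

```vbnet
Imports System

Module PartUriDemo

    ' Hypothetical helper: resolves a relationship Target against the part
    ' that owns the relationship and returns the path within the package.
    Function ResolveTarget(ByVal sourcePart As String, ByVal target As String) As String
        ' "http://package" is a placeholder base, not a real address.
        Dim baseUri As New Uri(New Uri("http://package"), sourcePart)
        Dim resolved As New Uri(baseUri, target)
        Return resolved.AbsolutePath
    End Function

    Sub Main()
        ' The location calculated from the part name, as in the text above.
        Dim calculated As String = "/customXml/" & "item1" & ".xml"

        ' Assumed source part: /word/document.xml.
        Dim resolved As String = ResolveTarget("/word/document.xml", "../customXml/item1.xml")

        Console.WriteLine(resolved)
        Console.WriteLine(resolved = calculated)
    End Sub

End Module
```

Note that the comparison here is a plain string comparison, so a name passed in with different casing (Item1 rather than item1) would not match.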

Private Function FindCustomPart(ByVal theDoc As WordprocessingDocument, ByVal theName As String) As DocumentFormat.OpenXml.Packaging.CustomXmlPart

Dim MainParts As IEnumerable(Of CustomXmlPart)
Dim matchURI As Uri
Dim CalculatedLocation As String = "/customXml/" & theName & ".xml"

    ' The custom XML parts related to the main document part.
    MainParts = theDoc.MainDocumentPart.CustomXmlParts

    matchURI = New Uri(CalculatedLocation, UriKind.Relative)

    ' Return the first part whose URI matches the calculated location.
    For Each CustomPart As CustomXmlPart In MainParts

        If CustomPart.Uri = matchURI Then
            Return CustomPart
        End If

    Next

    Return Nothing

End Function

ChangeCustomPart (Private Function)
This function uses a stream to create new content for the custom part from the XML document built from the XML string passed to the DLL.  It effectively ties all the pieces together.  Writing to the stream updates the part within the file (effectively saving it), which means that there is no need to save the document itself.

Private Function changeCustomPart(ByVal thePart As DocumentFormat.OpenXml.Packaging.OpenXmlPart, ByVal theXML As Xml.XmlDocument) As Boolean

    Try
        ' FileMode.Create discards the old content of the part before writing.
        Using XMLStream As Stream = thePart.GetStream(FileMode.Create, FileAccess.ReadWrite)
            theXML.Save(XMLStream)
        End Using

        Return True
    Catch ex As Exception
        Return False
    End Try

End Function
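The effect of opening the stream with FileMode.Create can be seen in isolation with an ordinary file.  This is only a sketch: a temporary file stands in for the custom part, and the module name, helper name and sample XML are assumptions made for the illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module StreamReplaceDemo

    ' Hypothetical helper: replaces the whole content of a file with the XML.
    Sub OverwriteWithXml(ByVal fileName As String, ByVal theXML As XmlDocument)
        ' FileMode.Create truncates the existing content before writing.
        Using target As Stream = New FileStream(fileName, FileMode.Create, FileAccess.ReadWrite)
            theXML.Save(target)
        End Using
    End Sub

    Sub Main()
        Dim tempFile As String = Path.GetTempFileName()

        ' Existing content that is longer than the replacement.
        File.WriteAllText(tempFile, "<complaint><detail>A long existing value</detail></complaint>")

        Dim replacement As New XmlDocument()
        replacement.LoadXml("<complaint/>")

        OverwriteWithXml(tempFile, replacement)

        ' Reload the file; it parses cleanly and holds only the new XML.
        Dim check As New XmlDocument()
        check.Load(tempFile)
        Console.WriteLine(check.DocumentElement.HasChildNodes)

        File.Delete(tempFile)
    End Sub

End Module
```

Had the stream been opened with FileMode.Open instead, the shorter XML would have overwritten only the start of the file and left the tail of the old content behind, which would no longer load as XML.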


Data Collection to OpenXML a simple class for populating custom parts (Technical Design)

So far in this project a survey (the survey) has been created that accepts details of a train complaint.  A Word document (the word doc) has been created that provides a printed copy of the complaint details and a custom part (the custom part) has been defined and added to the Word document to facilitate the transfer of information from IBM SPSS Data Collection to Microsoft Word.  In the last post (creating XML in Data Collection) logic was added to the end of the survey that creates an XML document containing all the responses to the survey in the same structure as the custom part in the Word document.  In this post VB will be used to create a simple class that will accept the XML document and re-build the custom part.  It will also include methods to convert the resulting document into a PDF ready for emailing to the respondent.

Technical Design

The purpose of this DLL is to handle the interaction between Data Collection and an OpenXML document.  The class will receive the XML to be inserted into the document and the details of the custom part to be updated.  It will then replace the content of the existing custom part with the new XML content.  It will not delete the custom part itself, as this would result in the loss of the existing unique ID for the document. This would in turn cause all the document relationships and form field bindings to break.

The second part of the process is to create a PDF version of the document.  Word can now do this without the use of third-party components.  The returned PDF can then be attached to an email for distribution to the survey respondent (the complainant).

Update Part method
Inputs:

· XML (containing the survey responses)

· Document name (the name of the Word document to change)

· Custom Part name (name of the custom part to be changed)

Outputs:

· Success/Fail Signal (indicating whether update succeeded)

Create PDF Method

Inputs: None

Outputs:

· PDF version of document

Internal Functions

As well as the methods exposed to other applications (in this case IBM SPSS Data Collection), this DLL will require some internal functions.

OpenWordDoc (Opens the document defined by the Update Part Method)

Locate Custom Part (takes the open document and finds the custom part)

Adjust Custom Part (having found the custom part the XML needs to be fed in)

There is no need to save the document as the OpenXML SDK will automatically save when the custom part is changed.
