Extracting form submission details from InfoPath XSN files


I came across an interesting requirement today while working for a government agency. The agency is in the final stages of migrating to cloud-based technologies. One of the remaining pieces to be uplifted is a series of legacy forms that are important to its business-as-usual operations but have not yet been moved to a modern cloud-based technology.

The forms are InfoPath XSN files. InfoPath reached its mainstream support end date on Jul 13 2021, and will reach its extended support end date in 2026.

For this customer, the XSN files are published in a SiteCore environment. When the forms are used, they are downloaded from the SiteCore site, and assuming InfoPath is installed on the desktop computer, they are opened there to be filled out, submitted, and processed through an approval process.

I wasn’t in a position where I could open the XSN forms with InfoPath, so I needed another way to extract key information out of the form, so we could consider our options for uplifting to a modern solution.

I needed to find out what procedure is followed, once a user submits a form, to take it through the approval process. To do this, I wanted to find out what happens when the form’s “Submit” button is clicked. InfoPath forms have a schema associated with them, and there is a particular structure that determines what happens to the data when the form is submitted.

Environment prerequisites

With a Windows desktop, it is easy to extract information from InfoPath XSN files without the InfoPath client software installed, and no specialised software is required. The only two pieces of software needed are:

  • Windows Explorer
  • A text editor (I would typically use VS Code as my text editor of choice, but here I was using Notepad++ because that is what was installed on this environment)

Extract form submission details from an XSN file

To find out how XSN forms are submitted by their users without InfoPath installed, follow these steps:

  1. Using Windows Explorer, rename the XSN file, replacing the .xsn extension with .cab. Note: you may need to configure Windows Explorer to show file extensions; steps can be found here
  2. After changing the file extension to .cab, Windows Explorer can open the file as a compressed file; open it by double-clicking the file, or by pressing Enter with the file highlighted
  3. Inside the compressed .cab file, look for a file named manifest.xsf and extract it by copying and pasting it to a folder outside the .cab
  4. Use a text editor to open the manifest.xsf file
  5. In the text editor, look for the <xsf:dataAdapters> section of the XML
  6. Inside the <xsf:dataAdapters> section, you will find a series of adapter nodes, for example:
    • emailAdapter example:

        <xsf:emailAdapter name="Email Rejection" submitAllowed="yes">
            <xsf:to value="my:email_requestor" valueType="expression"></xsf:to>
            <xsf:subject value="Procurement Request - Rejected" valueType="literal"></xsf:subject>
            <xsf:attachmentFileName value="Rejected_Request" valueType="literal"></xsf:attachmentFileName>
        </xsf:emailAdapter>
      
    • davAdapter example:

        <xsf:davAdapter name="SharePoint Library Submit" submitAllowed="yes" overwriteAllowed="yes">
            <xsf:folderURL value="\\prodsp2\forms.librarynet\procurement/"></xsf:folderURL>
            <xsf:fileName value="my:Filename" valueType="expression"></xsf:fileName>
        </xsf:davAdapter>
      
  7. You can start to get a picture of what happens at different stages of the form’s lifecycle by analysing each of these nodes
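
As a rough sketch of steps 5 to 7, the adapter nodes could also be listed with a short script rather than read by eye. The Python below is an illustration only and was not part of the process I followed: the function names local_name and list_data_adapters are my own, and it assumes manifest.xsf has already been copied out of the .cab file and that each setting sits in a child element's value attribute, as in the examples above. It matches elements by name without the namespace prefix, so it does not depend on the exact xsf namespace URI.

```python
import sys
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def list_data_adapters(root):
    """Return one dict per adapter node found under any dataAdapters element."""
    adapters = []
    for element in root.iter():
        if local_name(element.tag) != "dataAdapters":
            continue
        for adapter in element:
            adapters.append({
                # e.g. "emailAdapter" or "davAdapter"
                "type": local_name(adapter.tag),
                # e.g. name, submitAllowed
                "attributes": dict(adapter.attrib),
                # child element name -> its value attribute (assumption:
                # child names are unique within one adapter)
                "settings": {
                    local_name(child.tag): child.get("value")
                    for child in adapter
                },
            })
    return adapters


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_adapters.py path\to\manifest.xsf
    manifest_root = ET.parse(sys.argv[1]).getroot()
    for adapter in list_data_adapters(manifest_root):
        print(adapter["type"], "-", adapter["attributes"].get("name"))
        for setting, value in adapter["settings"].items():
            print("   ", setting, "=", value)
```

Run against an extracted manifest.xsf, this prints each adapter's type and name followed by its settings, which gives a quick inventory of where the form sends its data.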