Extract Data From Forms and Content Controls

Home Up Search This Site What's New? Audio On CDR Favourites Downloadable files Photo Gallery 2002 Photo Gallery 2003 Photo Gallery 2004/5 Photo Gallery 2006/7 Photo Gallery 2008 Photo Gallery 2009/10 Photo Gallery 2011 UK Photo Gallery Ireland Photo Gallery Cats Photo Gallery 

horizontal rule

 

 

Google
 

 

Many people access the material from this web site daily. Most just take what they want and run. That's OK, provided they are not selling on the material as their own; however if your productivity gains from the material you have used, a donation from the money you have saved, however small, would help to ensure the continued availability of this resource.

Click the appropriate button above to access PayPal.

 

 

Extract data from protected forms

 

While Word 2007 introduced a new set of forms tools, there are still many users creating protected forms in Word 2002/3. Having prepared your forms and filled them, the next step is often to extract the completed data for use in a spreadsheet. Word provides the option (Tools > Options > Save) to save only the data from forms. This page demonstrates a faster option to work on a folder full of similar forms to create a comma delimited text file that can be loaded into standard spreadsheet packages such as Excel or used as a data source for a mail merge.

Using the following simple form as an example:

Note:

The macros in the first part of this page are all concerned with legacy form fields, as used in Word 2002/3 and available from Word 2007/2010's Developer ribbon tab. However it is possible to extract the data from the new Word 2007/2010 content controls and I have included a macro to do that at the end of this page. The macro will work as a stand-alone macro on the currently active document, or will work in conjunction with the routine to extract data from e-mailed attachments, by changing the call to the macro where indicated.

 

Run the macro and each document in the selected folder will be opened in turn and the forms data saved to FormsData.txt which is stored in the the user's Word documents folder.

 

Sub ExtractDataFromForms()
Dim oTarget As Document
Dim oDoc As Document
Dim DocDir As String
Dim
DocList As String
Dim
fTargetName As String
Dim
i As Long
Dim
lngCount As Long
Dim
fDialog As FileDialog
'Define the filename for the textfile to hold the form data
fTargetName = Options.DefaultFilePath(wdDocumentsPath)
fTargetName = fTargetName & Application.PathSeparator & "FormsData.txt"
'LOcate the folder containing the forms
Set fDialog = Application.FileDialog(msoFileDialogFolderPicker)
On Error Resume Next
With
fDialog
    .Title = "Select Folder containing the completed form documents and click OK"
    .AllowMultiSelect = False
    .InitialView = msoFileDialogViewList
    If .Show <> -1 Then
        MsgBox "Cancelled By User"

        Set fDialog = Nothing
    Exit Sub
    End If

    DocDir = fDialog.SelectedItems.Item(1)
    If Right(DocDir, 1) <> "\" Then DocDir = DocDir + "\"
End With
'Set a counter to count the processed forms
lngCount = 0
'Open the target document
Set oTarget = Documents.Open(fTargetName)
'The target document doesn't exist, so create a new one
If Err.Number = 5174 Then
    Set
oTarget = Documents.Add
    oTarget.SaveAs FileName:=fTargetName, FileFormat:=wdFormatText
End If
'Turn off screen updating to reduce flicker whilst opening/closing files
Application.ScreenUpdating = False
DocList = Dir$(DocDir & "*.doc?")
'Process the form documents in turn
Do While DocList <> ""
    'The form may contain automacros, not required
    'by this macro, so disable them

    WordBasic.DisableAutoMacros 1
    Set oDoc = Documents.Open(DocDir & DocList)
    lngCount = lngCount + 1
    If oTarget.Range.Paragraphs.Count < 2 And lngCount = 1 Then
        'It's a new target document so
        'add the header row comprising the form field names

        For i = 1 To oDoc.FormFields.Count
            oTarget.Range.InsertAfter _
                    Chr(34) & oDoc.FormFields(i).Name & Chr(34)
            If i < oDoc.FormFields.Count Then
                oTarget.Range.InsertAfter Chr(44)
            End If
            If
i = oDoc.FormFields.Count Then
                oTarget.Range.InsertAfter vbCr
            End If
        Next
i
    End If
    'Add the form field content to the target document
    For i = 1 To oDoc.FormFields.Count
        oTarget.Range.InsertAfter _
                Chr(34) & oDoc.FormFields(i).Result & Chr(34)
        If i < oDoc.FormFields.Count Then
            oTarget.Range.InsertAfter Chr(44) & Chr(32)
        End If
        If
i = oDoc.FormFields.Count Then
            oTarget.Range.InsertAfter vbCr
        End If
    Next
i
    oDoc.Close wdDoNotSaveChanges
    DocList = Dir$()
    WordBasic.DisableAutoMacros 0
Loop
'Format the target document to remove unwanted line spacing
With oTarget.Range.ParagraphFormat
    .SpaceAfter = 0
    .SpaceBefore = 0
End With
'Save the target document with added form data
oTarget.Save
Application.ScreenUpdating = True

Set fDialog = Nothing

Set oDoc = Nothing

Set oTarget = Nothing
End Sub

 

Create a report from protected form data

 

A variation on this approach could be used to create a report or letter using the data collected in a protected form. For multiple letters you would simply add a header row to the TargetDoc.txt file and run a mail merge, but you may wish to process forms individually. In order to do that, one way would be to create a document template containing docvariable fields that relate to some or all of the form fields in the form to be processed as shown in the simple report below:

 

When merged with the last line the above would produce the following. I only created one conditional field to illustrate the plurals. In practice you would need to create one for each field where it was relevant.

 

The macro below assigns the required data from the fields of a selected completed form to the document variables used in the report. It then creates a new document from the report template (Report.dotx stored in the default user templates folder) and updates the fields to display the current values of the fields.

Sub PrepareReport()
Dim DocDir As String
Dim
oDoc As Document
Dim oReportDoc As Document
Dim fDialog As FileDialog
Dim oVars As Variables

Set fDialog = Application.FileDialog(msoFileDialogFilePicker)
On Error GoTo CleanUp:
With fDialog 'Pick the form to process
    .Title = "Select the completed form document and click OK"
    .AllowMultiSelect = False
    .InitialView = msoFileDialogViewList
    If .Show <> -1 Then
        MsgBox "Cancelled By User"

        Set fDialog = Nothing
        Exit Sub
    End If

    DocDir = fDialog.SelectedItems.Item(1)
End With
If
Documents.Count > 0 Then
    Documents.Close SaveChanges:=wdPromptToSaveChanges
End If
On Error GoTo
0
Application.ScreenUpdating = False
'Open the completed form document for processing
Set oDoc = Documents.Open(DocDir)
'Create a new Document from the named report template
Set oReportDoc = Documents.Add(Options.DefaultFilePath(wdUserTemplatesPath) & "\Report.dotx")
Set oVars = oReportDoc.Variables
'Assign the Form Fields by name to the docvariables
With oDoc
    oVars("varFirstName").Value = .FormFields("Text1").Result
    oVars("varLastName").Value = .FormFields("Text2").Result
    oVars("varDate").Value = .FormFields("Text3").Result
    oVars("varWidget").Value = .FormFields("Text4").Result
    oVars("varWhatsit").Value = .FormFields("Text5").Result
    oVars("varThingummy").Value = .FormFields("Text6").Result

    .Close wdDoNotSaveChanges
End With
' see note below
oReportDoc.Fields.Update
Application.ScreenUpdating = True
'oReportDoc.Save 'Remove the apostrophe at the start of this line to force a save.
CleanUp:
Set oVars = Nothing
Set oDoc = Nothing
Set
oReportDoc = Nothing
Set
fDialog = Nothing
End Sub

 

Note:

If you were to include the value of a checkbox form field in the above macro The macro will resolve the result of a CheckBox as 1 (checked) or 0 (unchecked), so if you wished to include the additional fields shown below into that macro, you could use the following

oVars("varCheck1").Value = .FormFields("Check1").Result
If oVars("varCheck1").Value = 1 Then
     oVars("varCheck1").Value = "YES"
Else
     oVars("varCheck1").Value = "NO"
End If
oVars("varPrivacy").Value = .FormFields("Dropdown1").Result

 

The end result is similar to that below:

   

Note:

The macros will also run using forms in Word "*.doc" and "*.docx" formats. Similarly the Report.dotx template could be a Report.dot template for Word 2003 if preferred.

Processing Forms Received by E-Mail

 

The above processes assume that you have stored a collection of forms locally, however, it is likely that the completed forms will be submitted by e-mail. It is possible to process the forms directly from the e-mail message attachments. To facilitate that, it is necessary to set up a number of folders in Outlook to handle the forms at the various stage of the process, and an Outlook Rule to direct all messages containing the forms to one of the folders, automatically.

The folders used for the purpose of this illustration are a sub folder of Outlook's Inbox - called Forms_In and three sub folders of that folder called Forms_Completed, Forms_Incomplete & Forms_Wrong

There are two paths used in the macros defined:

sPath = "D:\My Documents\Test\Temp\"

and

fName = "D:\My Documents\FormData.doc"

The latter, which is the document used to store the form data as a table, can be either doc or docx format depending on Word version. If the document doesn't exist, the macro will create it, but the folder used to store it must exist, as must the former path used as a temporary store for the documents being processed, which are likely all to have the same name and thus can only exist one at a time in the folder.

If preferred you can change the routine to extract data from the correctly submitted forms to save the data as a comma delimited file. Here I have shown an alternative method of creating the text file from that shown in the previous ExtractDataFromForms macro. The method shown below uses less processing and is thus faster.

Where that method is used, the data document path is instead

fName = "D:\My Documents\FormData.txt"

 

The macro opens Outlook and examines each message in the 'Forms_In' folder, in turn, starting at the bottom of the list (to avoid the count getting screwed up later). If the message doesn't have an attachment, or has more than one attachment, it is moved to the folder 'Forms_Wrong' for later manual examination. If the message has the required single form attached, that attachment is opened in Word and docvariables containing the message subject and the sender's e-mail address are added.

The macro then checks that all the fields contain data. Obviously the macro has no way of knowing whether the data submitted is wrong, or what the correct state of check boxes or selection from drop down fields should be. It only checks to ensure that all the fields have been filled.

Note:

The macros work only with legacy form fields when used in Word 2007

 

If any of the form fields is incomplete, the message is moved to the 'Forms_Incomplete' folder and the incomplete form is returned by e-mail to the address whence it came.

If the form fields are completed, the data is extracted to the document/text file defined at

fName =

and the message is moved to the 'Forms_Completed' folder.

 

Private oVars As Variables
Private vVar As Variant
'This macro requires a reference to the Microsoft Outlook Object Library
Sub ProcessFormAttachments()
Dim i As Long
Dim olApp As Outlook.Application
Dim olNs As Outlook.Namespace
Dim olFolder As Outlook.Folder
Dim olItem As Outlook.MailItem
Dim olMailItem As Outlook.MailItem
Dim oNewMailItem As Outlook.MailItem
Dim olAttachments As Outlook.Attachments
Dim sFname As String
Dim sPath As String
Dim iMessages As Long
Dim TempDoc As Document
 

'Set the temporary document path
sPath = "D:\My Documents\Test\Temp\"

On Error Resume Next
'Open Outlook

Set olApp = GetObject(, "Outlook.Application")

If Err = 429 Then
    
'Outlook is closed so start Outlook

     Set olApp = CreateObject("Outlook.Application")
End If
Set olNs = olApp.GetNamespace("MAPI")

'Locate the folder containing the unprocessed e-mail messages
Set olFolder = olNs.GetDefaultFolder(olFolderInbox).Folders("Forms_In")

'Examine each message starting from the bottom of the list
For i = olFolder.Items.Count To 1 Step -1
     Set olItem = olFolder.Items(i)

     'Mark the message as read
     olItem.UnRead = False
     Set olAttachments = olItem.Attachments

     'Count the attachments attached to the message
     If olAttachments.Count = 1 Then

     'There is one attachment so note its filename
          sFname = olAttachments.Item(1).DisplayName
          On Error Resume Next
         
'If the filename already exists in the temporary folder, remove it.

          Kill sPath & sFname
          On Error GoTo 0

          'Save the attachment to the temporary folder
          olAttachments.Item(1).SaveAsFile _
          sPath & sFname
     Else 'The message has the wrong number of attachments

          'Mark it as unread
          olItem.UnRead = True

          'and move it to the Forms_Wrong folder
          olItem.Move olFolder.Folders("Forms_Wrong")

          'and move on to the next message

          GoTo ProcessNext
     End If

     'Open the temporary document from its folder

     'The form may contain automacros, not required
     'by this macro, so first disable them

     WordBasic.DisableAutoMacros 1

     Set TempDoc = Documents.Open(sPath & sFname)
     Set oVars = TempDoc.Variables

     'Store the sender's e-mail address and message subject in the document
     oVars("varSender").Value = olItem.SenderEmailAddress
     oVars("varSubject").Value = olItem.Subject

     'Word 2007 has a habit of switching to draft view so select PrintView
     ActiveWindow.View.Type = wdPrintView

     'Check all the form fields in the temporary document
     For j = 1 To TempDoc.FormFields.Count

          'If any are incomplete
          If TempDoc.FormFields(j).Result = "" Then

               'Call the routine to return the unaltered form to the sender
               Call ReturnForm

               'close the temporary document
               TempDoc.Close wdDoNotSaveChanges

               'and move it to the Forms_Incomplete folder
               olItem.Move olFolder.Folders("Forms_Incomplete")

               'and quit the loop

               Exit For
          Else
'The form is correctly filled, so call the routine

               'to extract the data to a table.
               Call ExtractDataFromForm

               'Alternatively call the routine to save to a comma delimited text file

               'Call ExtractDataFromForm2

               'Or if the form contains content controls rather than legacy form fields

               'Call ExtractFataFromForm3

               'Close the temporary document
               TempDoc.Close wdDoNotSaveChanges

               'and move the message to the Forms_Complete folder
               olItem.Move olFolder.Folders("Forms_Completed")

               Exit For
          End If
     Next
j
ProcessNext:
Next i

'Re-enable automacros

WordBasic.DisableAutoMacros 0
'Clean up

Set olItem = Nothing
Set olFolder = Nothing
Set olAttachments = Nothing
Set olNs = Nothing
Set
olApp = Nothing
End Sub

 

'Routine to return the incorrectly filled form to the sender
Sub ReturnForm()
'This macro requires a reference to the Microsoft Outlook Object Library

Dim oOutlookApp As Outlook.Application
Dim oItem As Outlook.MailItem
On Error Resume Next
Set
oVars = ActiveDocument.Variables

'Outlook is already running so open it.
Set oOutlookApp = GetObject(, "Outlook.Application")

'Create a new e-mail message
Set oItem = oOutlookApp.CreateItem(olMailItem)
With oItem

     'Read the sender and subject from the document variables
     .To = oVars("varSender")
     .Subject = "Re: " & oVars("varSubject")

     'Add a suitable message c/w your contact details
     .Body = "The form you have submitted is incomplete." & vbCr & _
             "Please complete the form and return it." & vbCr & vbCr & _
             oOutlookApp.Session.CurrentUser.name & vbCr & _
             oOutlookApp.Session.CurrentUser.Address
     .Attachments.Add Source:=ActiveDocument.FullName, _
      Type:=olByValue, DisplayName:="Document as attachment"
     .Send 'Send the message (It might be better to ensure that Outlook

           'does not send the messages immediately while testing).
End With

'Clean up
Set oItem = Nothing
Set
oOutlookApp = Nothing
End Sub

 

'Routine to extract data from the correctly submitted forms to a Word table
Sub ExtractDataFromForm()
Dim oDoc As Word.Document
Dim oTarget As Word.Document
Dim oTable As Table
Dim iCol As
Long
Dim
i As Long
Dim
sText As String
Dim
sName As String
Dim
fName As String
 

'Define the name and path of the document to hold the data

'This can be Word 2007 docx format if preferred.
fName = "D:\My Documents\FormData.doc"

Set oDoc = ActiveDocument
On Error Resume Next

'Open the data document to receive input
Set
oTarget = Documents.Open(fName)

If Err.Number = 5174 Then
     'The data document doesn't exist so create a new one
     Set oTarget = Documents.Add
     oTarget.SaveAs fName
End If
Set oFld = oDoc.FormFields

'Count the form fields in the document
iCol = oFld.Count
If oTarget.Tables.Count = 0 Then

     'It's a new data document so if there are more than 10 fields,

     'set the new layout in landscape mode, and add a 2 row table.
     If
iCol > 10 Then
          oTarget.PageSetup.Orientation = _
          wdOrientLandscape
     End If
     oTarget.Tables.Add oTarget.Range, 2, iCol
Else 'It's an existing table so add a single row
     oTarget.Tables(1).Rows.Add
End If
Set oTable = oTarget.Tables(1)

'If an incorrect form has slipped through with the wrong number of

'form fields, the routine quits with an error message
If iCol <> oTable.Columns.Count Then
     MsgBox "The form and data table do not have the same number of fields", _
     vbCritical, "Error!"
     Exit Sub
End If

'Read the form data
For
i = 1 To iCol
     sName = oFld(i).name
     Select Case oFld(i).Type
          Case Is = wdFieldFormDropDown, wdFieldFormTextInput
               sText = oFld(i).Result
          Case Is = wdFieldFormCheckBox
               sText = oFld(i).CheckBox.Value
     End Select

     'If it is a new table add the field names to the header row
     If
oTable.Rows.Count = 2 Then
          With oTable.Cell(1, i).Range
               .Text = sName
               .Font.Bold = True
               .Collapse wdCollapseEnd
          End With
     End If

     'Write the form data to the table
     With oTable.Cell(oTable.Rows.Count, i).Range
          .Text = sText
          .Collapse wdCollapseEnd
     End With
Next
i
End Sub

 

 

'Alternative routine to save the form data as a comma delimited file

Sub ExtractDataFromForm2()
Dim oTarget As Document
Dim oDoc As Document
Dim fName As String
Se
t oDoc = ActiveDocument
'Define the name and path of the text file to hold the data

fName = "D:\My Documents\FormData.txt"
'Turn off screen updating to reduce flicker whilst opening/closing files
Application.ScreenUpdating = False
On Error Resume Next
Set oTarget = Documents.Open(fName, False)

If Err.Number = 5174 Then

     'The file does not exist, so create a new one
     Set oTarget = Documents.Add

     'and add the header row comprising the form field names
     For i = 1 To oDoc.FormFields.Count
          oTarget.Range.InsertAfter Chr(34) & _
          oDoc.FormFields(i).name & Chr(34)
          If i < oDoc.FormFields.Count Then
               oTarget.Range.InsertAfter Chr(44)
          End If
          If
i = oDoc.FormFields.Count Then
               oTarget.Range.InsertAfter vbCr
          End If
     Next
i

     'and save it as a text file
     oTarget.SaveAs fName, wdFormatText
End If

'Add the form field content to the text file

For i = 1 To oDoc.FormFields.Count
     oTarget.Range.InsertAfter Chr(34) & _
     oDoc.FormFields(i).Result & Chr(34)
     If i < oDoc.FormFields.Count Then
          oTarget.Range.InsertAfter Chr(44) & Chr(32)
     End If
     If
i = oDoc.FormFields.Count Then
          oTarget.Range.InsertAfter vbCr
     End If
Next
i

'Save the data file
oTarget.Save
'Restore screen updating

Application.ScreenUpdating = True
End Sub

 

Extract data from Word 2007/2010 content controls

 

The above methods are all aimed at legacy form fields as used by Word 2002 and 2003 versions. A similar procedure can be used with Word 2007/2010 content controls and that is achieved by using the following version of the ExtractDataFromForm macro. Word 2007 content controls are incompatible with earlier Word versions - even with when the compatibility pack is installed in those earlier versions. This macro will therefore will only work in Word 2007/2010 and the macro error traps Word versions before 2007.

This macro does not extract images from picture content controls, but will extract the path data of linked images from such controls. For any content control that doesn't have a title, the content is recorded under the heading name CControln where n is the sequential number of the content control in the form.

 

'Extract data from Word 2007/2010 content control fields

'Code based on a macro by Greg Maxey

'Note that check box content controls are not supported in Word 2007

'and if used in Word 2010 will produce compatibility warnings when the text document

'is saved by this macro - though they will be recorded.

Sub ExtractDataFromForm3()

Dim oTarget As Document

Dim oDoc As Document

Dim fName As String

Dim sCCName As String

Dim i As Long

Dim pPath As String

Dim oCC As ContentControl

If Val(Application.Version) < 12 Then 'The Word version is not 2007/2010

    'So show a message

    MsgBox "This function only works with Word versions after Word 2003", vbCritical, "Word version error"

    Exit Sub

End If

Set oDoc = ActiveDocument

'Define the name and path of the text file to hold the data

'Note that for most users this location will not exist.

'Choose a path suited to your own filing system.

fName = "D:\My Documents\FormData.txt"

'Turn off screen updating to reduce flicker whilst opening/closing files

Application.ScreenUpdating = False

If Not FileExists(fName) Then

    'The file does not exist, so create a new one

    Set oTarget = Documents.Add

    'Add the header row comprising the content control names

    For i = 1 To oDoc.ContentControls.Count

        sCCName = oDoc.ContentControls(i).Title

        'If the content control has no name, create one

        If sCCName = "" Then sCCName = "CControl" & i

        oTarget.Range.InsertAfter Chr(34) & sCCName & Chr(34)

        If i < oDoc.ContentControls.Count Then oTarget.Range.InsertAfter Chr(44) & Chr(32)

        If i = oDoc.ContentControls.Count Then oTarget.Range.InsertAfter vbCr

    Next i

    'Now add the content
    AddContent oDoc, oTarget
    oTarget.Range.ParagraphFormat.SpaceAfter = 0

    'and save it as a text file

  oTarget.SaveAs fName, wdFormatText

Else

    Set oTarget = Documents.Open(fName, False)

    AddContent oDoc, oTarget
    oTarget.Range.ParagraphFormat.SpaceAfter = 0

    'Save the data file

    oTarget.Save

End If

'Restore screen updating

Application.ScreenUpdating = True

Set oTarget = Nothing

Set oDoc = Nothing

Set oCC = Nothing

End Sub

 

'Function used by the above macro to check whether a file exists

Function FileExists(PathName As String) As Boolean

Dim i As Long

On Error Resume Next

i = GetAttr(PathName)

Select Case Err.Number

Case Is = 0

    FileExists = True

Case Else

    FileExists = False

End Select

On Error GoTo 0

End Function

 

'Function used by the above to add the content to the text file

Function AddContent(oDoc As Document, oTarget As Document)
For i = 1 To oDoc.ContentControls.Count

     Set oCC = oDoc.ContentControls(i)

     Select Case oCC.Type

         Case 0, 1, 3, 4, 6, 8

             oTarget.Range.InsertAfter Chr(34) & oCC.Range & Chr(34)

         Case 2

             pPath = ""

             On Error Resume Next

             pPath = oCC.Range.InlineShapes(1).LinkFormat.SourceFullName

             On Error GoTo 0

             If Len(pPath) > 0 Then

                 oTarget.Range.InsertAfter Chr(34) & pPath & Chr(34)

             Else

                 oTarget.Range.InsertAfter Chr(34) & oCC.Range & Chr(34)

             End If

         Case Else

             'Skip Group and Building Block Controls

     End Select

     If i < oDoc.ContentControls.Count Then oTarget.Range.InsertAfter Chr(44) & Chr(32)

     If i = oDoc.ContentControls.Count Then oTarget.Range.InsertAfter vbCr

    Next i

End Function