replaceWordWith doesn't work if word has spelling error (underlined in red) #95

dallanmc · 2021-12-03T10:51:44Z

As per the title, words that are underlined in red (indicating a spelling error, i.e. the word is not in the dictionary) will not be replaced via a replaceWordWith comment.

How to reproduce:
Create a blank document and type the following sentence:

This is a testg document

*Note that the word "test" has been deliberately misspelled.

Now add a comment and in the comment type

replaceWordWith(name)

where "name" is an attribute in your data context.

Now generate the document. You will see that the output is unchanged from the input and that the word replacement has not occurred.

Now go back to the doc and correct the spelling error (either by removing the "g" or by adding "testg" to the dictionary).
Re-run the doc processor and you will see that the word is now replaced by the value of the "name" attribute in the context.

I looked into the code and this is happening because of the proofErr elements in the XML. Docx-stamper expects the run containing the word to come immediately after the commentRangeStart. Any other element directly after it throws the scan off, and the comment is ignored.

			<w:commentRangeStart w:id="0"/>
			<w:proofErr w:type="spellStart"/>
			<w:r>
				<w:t>testg</w:t>
			</w:r>
			<w:commentRangeEnd w:id="0"/>
			<w:proofErr w:type="spellEnd"/>

One potential fix is to ignore the proofErr elements when processing the comments. This can be done by changing the getCommentAround method of the CommentUtil class:

<snip>
				for (Object contentElement : parent.getContent()) {

					// ignore ProofErr elements. These indicate spelling mistakes
					if (XmlUtils.unwrap(contentElement) instanceof ProofErr) {
						continue;
					}

					// so first we look for the start of the comment
					if (XmlUtils.unwrap(contentElement) instanceof CommentRangeStart) {
						possibleComment = (CommentRangeStart) contentElement;
					}

</snip>

I have tested this and it works.
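For anyone who wants to see the effect in isolation, here is a minimal sketch of the scan as described above. It is a simplified model, not the actual CommentUtil code: the Kind enum, the isCommented method and the skipProofErr flag are all hypothetical stand-ins for the docx4j objects, and the reset-on-unknown-element behaviour is an assumption based on the description in this issue.

```java
import java.util.List;

final class CommentScanSketch {

    /** Hypothetical stand-ins for the sibling elements inside a paragraph. */
    enum Kind { COMMENT_START, COMMENT_END, PROOF_ERR, RUN }

    /**
     * Returns true if the run at targetIndex sits between a comment start and
     * a comment end with no unexpected element in between.
     */
    static boolean isCommented(List<Kind> siblings, int targetIndex, boolean skipProofErr) {
        boolean startSeen = false;
        boolean targetSeen = false;
        for (int i = 0; i < siblings.size(); i++) {
            Kind kind = siblings.get(i);
            if (skipProofErr && kind == Kind.PROOF_ERR) {
                continue; // proposed fix: spelling markers no longer reset the scan
            }
            if (kind == Kind.COMMENT_START) {
                startSeen = true;
            } else if (startSeen && i == targetIndex) {
                targetSeen = true;
            } else if (startSeen && targetSeen && kind == Kind.COMMENT_END) {
                return true;
            } else {
                // any other element restarts the search (assumed behaviour)
                startSeen = false;
                targetSeen = false;
            }
        }
        return false;
    }
}
```

With the sibling order from the XML above (start, proofErr, run, end, proofErr), the scan only finds the comment when skipProofErr is true.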

Obviously this issue is not a big deal if you are aware of the constraint (you can, after all, just add the misspelled word to the dictionary), but if someone else is authoring the templates, they will be scratching their heads at this and wondering why their word to be replaced isn't actually being replaced.

I can, of course, create a pull request for this, if people think the fix above is the correct approach.


dallanmc · 2022-01-18T16:44:06Z

Just an update to the above: I decided to add a pre-processor (code that processes the WordprocessingMLPackage document object before docxstamper does its thing).

The pre-processor basically strips out every single ProofErr object in the document:

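public class SpellCheckPreProcessor implements IPreProcessor {
    @Override
    public void process(WordprocessingMLPackage document) {

        // getProofErrsFromObject is a helper (not shown here) that collects
        // every ProofErr element in the document
        List<ProofErr> proofErrsFromDocument = getProofErrsFromObject(document);

        // detach each spelling marker from its parent's content list
        for (ProofErr proofErr : proofErrsFromDocument) {
            ((ContentAccessor) proofErr.getParent()).getContent().remove(proofErr);
        }
    }
}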
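Since the collection helper is not shown, here is a rough sketch of the same idea (find every spelling marker at any depth and detach it from its parent) on a stand-in tree. The Node class and stripProofErrs method are hypothetical and only model the shape of the document. They are not docx4j types, and this is not the code I actually run.

```java
import java.util.ArrayList;
import java.util.List;

final class ProofErrStripSketch {

    /** Hypothetical stand-in for a content node; the name mimics the XML element name. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            this.children.addAll(List.of(kids));
        }
    }

    /** Removes every "proofErr" node below root and returns how many were removed. */
    static int stripProofErrs(Node root) {
        int before = root.children.size();
        // removeIf drops the markers without a manual iterate-and-remove loop
        root.children.removeIf(child -> child.name.equals("proofErr"));
        int removed = before - root.children.size();
        // recurse into whatever is left (runs, nested containers, ...)
        for (Node child : root.children) {
            removed += stripProofErrs(child);
        }
        return removed;
    }
}
```

After stripping, the run sits directly between commentRangeStart and commentRangeEnd, which is the layout the comment detection expects.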

I also created a pre-processor to merge runs that share a style. I found that certain documents I was dealing with would not process properly because the variables were split across different runs even though the style was exactly the same. The style merge pre-processor goes through the whole document and merges adjacent runs into a single run if their styles are the same, turning this:

						<w:r w:rsidR="007653C0">
							<w:rPr>
								<w:b/>
								<w:sz w:val="22"/>
							</w:rPr>
							<w:t>${</w:t>
						</w:r>
						<w:r w:rsidR="001B08AA">
							<w:rPr>
								<w:b/>
								<w:sz w:val="22"/>
							</w:rPr>
							<w:t>firstName}</w:t>
						</w:r>

into this:

						<w:r w:rsidR="007653C0">
							<w:rPr>
								<w:b/>
								<w:sz w:val="22"/>
							</w:rPr>
							<w:t>${firstName}</w:t>
						</w:r>

Another advantage of this style merge pre-processor is that it enables you to select multiple words for a replaceWordWith comment. So you can have a comment around "first name" instead of needing it to be "firstName". It may not seem like a big deal, but it means there is less to explain to document authors, and it gives them more freedom.
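The merge step itself can be sketched roughly like this. It is a simplified model on stand-in types, not my actual pre-processor: the Run class, its style key (imagine the serialized w:rPr) and the mergeAdjacent method are all hypothetical, and keeping the first run's attributes is an assumption taken from the XML example above.

```java
import java.util.ArrayList;
import java.util.List;

final class RunMergeSketch {

    /** Hypothetical stand-in for a w:r element: a style key plus the run's text. */
    static final class Run {
        final String style;
        final String text;

        Run(String style, String text) {
            this.style = style;
            this.text = text;
        }
    }

    /** Merges neighbouring runs whose style keys are equal, preserving order. */
    static List<Run> mergeAdjacent(List<Run> runs) {
        List<Run> merged = new ArrayList<>();
        for (Run run : runs) {
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last).style.equals(run.style)) {
                // same style as the previous run: append the text to it
                Run previous = merged.get(last);
                merged.set(last, new Run(previous.style, previous.text + run.text));
            } else {
                merged.add(run);
            }
        }
        return merged;
    }
}
```

Feeding it the two bold, size 22 runs "${" and "firstName}" yields a single run with the text "${firstName}", while a following run with a different style is left untouched.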

caring-coder mentioned this issue Sep 30, 2022

replaceWordWith doesn't work if word has spelling error (underlined in red) verronpro/docx-stamper#66

Closed
