Fix a bug that added unsolicited new lines when parsing an ordered list containing paragraphs #359
When converting an ordered list whose items contain paragraphs, extra new lines are added between the elements.
This is a suggestion to fix this edge case.
Description
Following the CommonMark spec, if a list item contains multiple blocks, the item's content can be wrapped in paragraph tags.
See: A list item may contain blocks that are separated by more than one blank line
Take the following example:
Using babelmark, we can confirm that most implementations will translate it to the following code:
Providing this code to the online demo of Html2Markdown will return the following markdown:
To remove these new lines, I tried following the design we discussed earlier by:
- Adding a `CommonMarkTextFormattingReplacementGroup` class that will call `ReplaceList` with a flag indicating we want to be compliant with CommonMark
- Updating `CommonMark.cs` to replace the initial call of `TextFormattingReplacementGroup`
What are the changes within HtmlReplacer?
- `ReplaceLists` now accepts a `supportCommonMark` flag.
- We call `ReplaceParagraph` while we process the lists, instead of calling it later as part of the replacement groups list. This allows us to avoid adding new lines around the paragraph tags when they are part of a list item.
- With the `supportCommonMark` flag, to handle multiline HTML, if a closing li tag is followed by a new line, we also remove this new line. This prevents cases where the HTML `[...]<li><p>First Item</p>\n</li>\n<li>Second item[...]` becomes `[...]<li><p>First Item</p>\n\n<li>Second item[...]`, creating a double line break because we only removed the tag (L. 44).
- With the `supportCommonMark` flag, when aggregating the `markdownList`, if we notice an item that already ends with a new line, we do not add another new line (L. 66).

Unfortunately there are multiple changes in this single PR, but I tried to fix things one after another as I discovered them.
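The two new-line rules described for the `supportCommonMark` flag can be sketched as follows. This is only an illustrative Python sketch of the idea, not the C# code of this PR: the function names `strip_newline_after_closing_li` and `join_list_items` are hypothetical, and the real `ReplaceLists` logic may differ.

```python
import re
from typing import List


def strip_newline_after_closing_li(html: str) -> str:
    """Remove one new line that directly follows a closing </li> tag.

    Hypothetical helper: without this step, dropping only the tag from
    '...</p>\\n</li>\\n<li>...' would leave two consecutive new lines.
    """
    return re.sub(r"</li>\r?\n", "</li>", html)


def join_list_items(items: List[str]) -> str:
    """Aggregate markdown list items, one per line.

    A new line is appended only when the item does not already end
    with one, so no blank line appears between items.
    """
    parts = []
    for item in items:
        parts.append(item if item.endswith("\n") else item + "\n")
    return "".join(parts)


if __name__ == "__main__":
    html = "<li><p>First Item</p>\n</li>\n<li>Second item"
    print(repr(strip_newline_after_closing_li(html)))
    print(repr(join_list_items(["1. First Item\n", "2. Second item"])))
```

Both rules only touch a new line that would otherwise be duplicated, which is why they can sit behind a single flag without affecting the non-CommonMark path.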
Do you think this is a legitimate change?
Also, do you think there is a better way of handling the Common Markdown flag, for example with some kind of static global variable? This could avoid having to pass flags around and instead set everything through the `CommonMark` class.
Related issue
N/A
Thanks.