Hard line breaks always also add newline #80

Closed
opened 2022-07-11 12:50:19 +02:00 by susnux · 5 comments
susnux commented 2022-07-11 12:50:19 +02:00 (Migrated from github.com)

I noticed that hard line breaks always add a newline character, which breaks things when `preserveWhitespace: 'full'` is set for paragraphs.

Consider this:

*foo\
bar*

will result in

<p><em>foo<br />
bar</em></p>

And then serialized as:

*foo \

 bar*

which results in this:

<p>*foo \</p>
<p>bar*</p>

My use case:
Keep the differences in markdown files after editing as small as possible, especially for custom formatting (from external editors). E.g.:

foo
bar

Without `preserveWhitespace: 'full'`, this would result in

foo bar

The rendered version would look the same, but the source file is different and may be harder to read.

marijnh commented 2022-07-21 17:27:44 +02:00 (Migrated from github.com)

I don't quite understand the problem here. If I add a test for the document you show (in attached patch), that content seems to round-trip just fine, without any extra newlines being inserted.

susnux commented 2022-07-21 18:16:15 +02:00 (Migrated from github.com)

> I don't quite understand the problem here

It does not happen when using the provided markdown parser, but it does happen if you set the parse options / parse rules of paragraphs to `preserveWhitespace: true`.

E.g., if you want to keep the formatting of a markdown file after a roundtrip, you need to keep any additional tabs and spaces, and also newlines ("soft breaks"), like:

hello
world

should stay like this after a roundtrip and not result in

hello world

even if this is syntactically the same.

So the previous example results in:

<p>hello<br />
world</p>

which is this prosemirror state: doc(p("hello", br(), "\nworld")).

And that state would currently be serialized as

hello\

world

(note the additional newline).

See the testcase in the PR:

serialize(doc(p(em("foo", br(), "\nbar"))), "*foo\\\nbar*")
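
The doubled newline described above can be reproduced with a toy model. This is a minimal sketch, not the prosemirror-markdown implementation: `HARD_BREAK`, `serializeNaive`, and `serializeDedup` are hypothetical names, and the paragraph is assumed to be a flat array of text strings and break markers.

```javascript
// Toy inline model: a paragraph is an array of text strings and hard-break markers.
// HARD_BREAK and both serializers are hypothetical names used only in this sketch.
const HARD_BREAK = Symbol("hard_break");

// Naive approach: every hard break is written as backslash + newline, and text
// is written verbatim (including a leading "\n" that the parser preserved).
function serializeNaive(children) {
  return children.map((c) => (c === HARD_BREAK ? "\\\n" : c)).join("");
}

// Variant: if the text right after a hard break already starts with "\n",
// write only the backslash so the text's own newline completes the break.
function serializeDedup(children) {
  let out = "";
  children.forEach((child, i) => {
    if (child !== HARD_BREAK) {
      out += child;
      return;
    }
    const next = children[i + 1];
    const nextStartsWithNewline =
      typeof next === "string" && next.startsWith("\n");
    out += nextStartsWithNewline ? "\\" : "\\\n";
  });
  return out;
}

const paragraph = ["hello", HARD_BREAK, "\nworld"];
console.log(JSON.stringify(serializeNaive(paragraph))); // "hello\\\n\nworld"
console.log(JSON.stringify(serializeDedup(paragraph))); // "hello\\\nworld"
```

The naive output contains two consecutive newlines, which markdown reads as a paragraph boundary. The variant emits exactly one.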
marijnh commented 2022-07-22 10:36:34 +02:00 (Migrated from github.com)

> It does not happen when using the provided markdown parser, but it does happen if you set the parse options / parse rules of paragraphs to preserveWhitespace: true.

That's a DOM parser option, though. How does that affect Markdown deserializing?

susnux commented 2022-07-22 10:51:37 +02:00 (Migrated from github.com)

As I said, I do not use the provided parser, but plain markdown-it, as the project uses quite a lot of markdown extensions.

So this is only a problem if you use the to_markdown part of this nice project together with custom markdown parsing.
(I can understand if you decide that this is a won't fix)

marijnh commented 2022-07-23 18:48:12 +02:00 (Migrated from github.com)

Ah, all right. I think this is something you'll have to address in your custom parser then.
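
One way to handle this on the parser side might be a normalization pass that drops the newline directly following a hard break before the document is built. The sketch below is only an illustration under assumptions: the flat array of strings and `{ type: "hard_break" }` objects, and the function name `stripNewlineAfterHardBreak`, are hypothetical and not part of markdown-it or ProseMirror.

```javascript
// Hypothetical flat representation of a paragraph's inline content as produced
// by a custom parser: strings for text, { type: "hard_break" } for breaks.
function stripNewlineAfterHardBreak(children) {
  const result = [];
  for (const child of children) {
    const prev = result[result.length - 1];
    const afterBreak =
      prev !== undefined &&
      typeof prev === "object" &&
      prev.type === "hard_break";
    if (afterBreak && typeof child === "string" && child.startsWith("\n")) {
      // Drop the single newline that belongs to the break itself;
      // keep the rest of the text unless nothing is left.
      const rest = child.slice(1);
      if (rest.length > 0) result.push(rest);
    } else {
      result.push(child);
    }
  }
  return result;
}

const normalized = stripNewlineAfterHardBreak([
  "foo",
  { type: "hard_break" },
  "\nbar",
]);
console.log(JSON.stringify(normalized)); // ["foo",{"type":"hard_break"},"bar"]
```

Other preserved whitespace (soft breaks, tabs, extra spaces) is left untouched, so only the newline that the serializer would re-add is removed.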

Reference
prosemirror/prosemirror-markdown#80