Yaml mode does not highlight dict keys with spaces correctly #2695

Closed
opened 2014-07-13 05:35:41 +02:00 by marijnh · 4 comments
marijnh commented 2014-07-13 05:35:41 +02:00 (Migrated from gitlab.com)

I ran into this bug when using Brackets. The following code:

multi word key: 42

special Delivery:  >
    Follow the Yellow Brick
    Road to the Emerald City.
    Pay no attention to the
    man behind the curtain.

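It parses correctly [here](http://yaml-online-parser.appspot.com/?yaml=multi+word+key%3A+42%0A%0Aspecial+Delivery%3A++%3E%0A++++Follow+the+Yellow+Brick%0A++++Road+to+the+Emerald+City.%0A++++Pay+no+attention+to+the%0A++++man+behind+the+curtain.&type=json)

But it isn't highlighted correctly, as this screenshot shows (I edited the CSS to make it clearer):

![yamlmode](https://cloud.githubusercontent.com/assets/3515649/3563539/576abd76-0a3d-11e4-9d7f-b52409b67094.PNG)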

In the YAML spec I found [this example](http://yaml.org/spec/1.2/spec.html#id2777865) that shows keys with spaces being used. I think keys should be highlighted from the beginning of the indentation to the ":" character.

Browser: Firefox 30, Windows. I'm using the YAML mode index.html from GitHub master.

marijnh commented 2014-07-20 09:36:49 +02:00 (Migrated from gitlab.com)

I know very little about YAML and did not write the original mode. I also couldn't make much sense of the spec you linked (I have never seen such a convoluted CFG), but the attached patch seems to help. Could you take a look and check whether it looks sane to you?

marijnh commented 2014-07-20 13:30:29 +02:00 (Migrated from gitlab.com)

Thanks!

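Actually, I didn't fully read the spec either when I first reported this, so I just picked an example that demonstrates the issue...

I've tested the new code, and it works for the common use cases, with two bugs: indented YAML comments containing a `:` are parsed as key-value pairs, and some valid YAML is not highlighted.

![yamlprob](https://cloud.githubusercontent.com/assets/3515649/3637280/e3f71d66-1000-11e4-9fbf-6929e013a95e.PNG)

[Source](http://yaml-online-parser.appspot.com/?yaml=++++a+%2Ccollection%3A+x%0A++++a+%5Bseqstart%3A+x%0A++++a+%5Dseqend%3A+x%0A++++a+%7Bmapstart%3A+x%0A++++a+%7Dmapend%3A+x&type=json)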

After I found that, I started reading the YAML spec and found something that I think is a definition of a valid YAML key.

The YAML lines above that aren't highlighted correctly use what the spec calls [flow plain scalar styles](http://yaml.org/spec/1.2/spec.html#style/flow/plain), which are basically unquoted plain strings. The spec says (emphasis mine):

> Plain scalars **must not begin with most indicators**, as this would cause ambiguity with other YAML constructs. However, the “:”, “?” and “-” indicators may be used as the first character if followed by a non-space “safe” character.

I've collected all the indicators that are not allowed (I think). They are:

,[]{}#&*!|>'"%@`

[This file](https://gist.github.com/ishamf/c2b773d793e5e5ed6ff7) contains some YAML that uses those characters.
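As an illustration of the first-character rule quoted above, here is a minimal Python sketch. It is not CodeMirror code: the names `FORBIDDEN_FIRST`, `CONDITIONAL_FIRST`, and `can_start_plain_scalar` are hypothetical, and the check simplifies the spec's notion of a "safe" character to "any non-space character".

```python
# Indicators that, per the list above, may not start an unquoted (plain) scalar.
FORBIDDEN_FIRST = set(",[]{}#&*!|>'\"%@`")

# Indicators allowed in first position only when a non-space character follows.
# Assumption: "safe" is approximated here as "not whitespace".
CONDITIONAL_FIRST = set(":?-")


def can_start_plain_scalar(text: str) -> bool:
    """Return True if `text` could begin a plain (unquoted) scalar."""
    if not text or text[0].isspace():
        return False
    first = text[0]
    if first in FORBIDDEN_FIRST:
        return False
    if first in CONDITIONAL_FIRST:
        # ":", "?" and "-" need a following non-space character.
        return len(text) > 1 and not text[1].isspace()
    return True


print(can_start_plain_scalar("multi word key"))  # True
print(can_start_plain_scalar("@cow"))            # False
print(can_start_plain_scalar("-dash"))           # True
print(can_start_plain_scalar("- item"))          # False
```

The check only looks at the first one or two characters, so indicators later in the string are never rejected.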

The indicators may be included in the middle of a string. For example, this:

a [list]: but not here
a {map}: but not here
my-email@example.com: works

is valid YAML.

Also, they can appear at the beginning of a string if it is a quoted string.

"@cow": x
'@grass': x

One thing that might be a problem is comments:

# This fails
a #comment: x

# But this doesn't
a#comment: x
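The difference between the two cases above comes down to whether the `#` is preceded by whitespace. A minimal Python sketch of that rule follows; `COMMENT_START` and `strip_comment` are hypothetical names, and the sketch assumes no quoted strings (a `#` inside quotes would be treated as a comment here, which is wrong for real YAML).

```python
import re

# A "#" starts a comment only at the start of the line or after whitespace.
# (?<!\S) means "not preceded by a non-space character".
COMMENT_START = re.compile(r"(?<!\S)#")


def strip_comment(line: str) -> str:
    """Drop a trailing YAML comment from a single line (quotes not handled)."""
    match = COMMENT_START.search(line)
    if match is None:
        return line
    return line[:match.start()].rstrip()


print(repr(strip_comment("a #comment: x")))  # 'a'  (the colon is lost to the comment)
print(repr(strip_comment("a#comment: x")))   # 'a#comment: x'
```

Running a check like this before looking for a key would also keep indented comment lines containing a `:` from being read as key-value pairs.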

I'm not good with regexes, so I can't contribute code to solve this problem (sorry!) but I can probably contribute a summary of a valid YAML key.

  • It can be enclosed in '' or "", which bypasses all the following rules.
  • It must not begin with any of ,[]{}#&*!|>'"%@`
  • It can contain anything, even the above characters, except:
      • The string " #" (a # with a space before it), because that starts a YAML comment, which prevents the colon and the value from being read.
  • It is terminated by a colon followed by a space (": "); the space after the colon is not optional.
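The rules listed above can be sketched as a single regular expression. This is a hedged Python illustration, not the regexp used in CodeMirror's YAML mode: `KEY_RE` and `match_key` are hypothetical names, quoted keys are matched without escape handling, a colon at the end of the line is also accepted as a terminator (an assumption beyond the summary), and sequence entries such as `- item: x` are not treated specially.

```python
import re
from typing import Optional

# Characters a plain key must not begin with (taken from the list above).
INDICATORS = ",[]{}#&*!|>'\"%@`"

QUOTED = r'"[^"]*"' + "|" + r"'[^']*'"      # quoted keys bypass the other rules
PLAIN = (
    r"(?![\s" + re.escape(INDICATORS) + r"])"  # first char: no space, no indicator
    r"(?:(?!\s#).)+?"                          # anything except the start of " #"
)

KEY_RE = re.compile(
    r"^\s*(?P<key>" + QUOTED + "|" + PLAIN + r")"
    r"\s*:(?=\s|$)"                            # colon followed by space or end of line
)


def match_key(line: str) -> Optional[str]:
    """Return the key of a `key: value` line, or None if no valid key is found."""
    match = KEY_RE.match(line)
    return match.group("key") if match else None


print(match_key("multi word key: 42"))      # multi word key
print(match_key("a [list]: but not here"))  # a [list]
print(match_key('"@cow": x'))               # "@cow"
print(match_key("a #comment: x"))           # None
print(match_key("    # indented: note"))    # None
```

The lazy quantifier makes the key stop at the first colon that is followed by whitespace, so colons inside a key (as in a URL) do not end it early.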
marijnh commented 2014-07-21 08:29:12 +02:00 (Migrated from gitlab.com)

I've tried to improve the regexp. It seems the YAML mode does not handle quotes at all, though, and in general it is extremely primitive.

marijnh commented 2014-07-22 14:08:21 +02:00 (Migrated from gitlab.com)

This last commit seems fine. Invalid keys are still highlighted as keys, but I don't think it's a big problem in actual use.

marijnh (Migrated from gitlab.com) closed this issue 2014-10-20 17:02:01 +02:00
Reference
codemirror/codemirror5#2695