Different error ranges dependent on next method return type #10

Closed
opened 2025-04-11 01:08:49 +02:00 by thomasklein1982 · 7 comments
thomasklein1982 commented 2025-04-11 01:08:49 +02:00 (Migrated from github.com)

Hello,

thanks in advance for your great work. I've got the following problem:

Consider the following two java programs:

class Main{
	void test(){
		String s;
		s.
	}
	String[] test2(){
		// do smth.
	}
}

and

class Main{
	void test(){
		String s;
		s.
	}
	void test2(){//
		//do smth.
	}
}

Both programs contain the same syntax error in line 4.

But in the first case the error spans the whole lines 4, 5 and 6 (it seems to think that this is a LocalVariableDeclaration). Therefore the whole parsing tree is wrong after this error (the method "test2" is not recognized as method any longer).

In the second case the error spans line 4 only.

I put together a example that runs in the "try codemirror page" (see below). Just copy paste this and comment the code in the editor in or out.

Regards
Thomas.

import {basicSetup} from "codemirror";
import {syntaxTree} from "@codemirror/language";
import { EditorView } from "@codemirror/view";
import { java, javaLanguage } from "@codemirror/lang-java";
import {LanguageSupport} from "@codemirror/language";
import { lintGutter, linter, openLintPanel, closeLintPanel } from "@codemirror/lint";
import {keymap} from "@codemirror/view";
import {Compartment,StateField, StateEffect, EditorSelection,RangeSet,EditorState} from "@codemirror/state"
import {gutter, GutterMarker} from "@codemirror/view"
import {Decoration,ViewPlugin} from "@codemirror/view";

const errorEffect = StateEffect.define({
  map: (val, mapping) => ({pos: mapping.mapPos(val.pos), on: val.on})
})

const errorState = StateField.define({
  create() { return RangeSet.empty },
  update(set, transaction) {
    set = set.map(transaction.changes)
    for (let e of transaction.effects) {
      if (e.is(errorEffect)) {
        if (e.value.on){
          set = set.update({add: [errorMarker.range(e.value.pos)]})
        }else{
          set = set.update({filter: from => from != e.value.pos})
        }
      }
    }
    return set
  }
})

const errorMarker = new class extends GutterMarker {
  toDOM() { return document.createTextNode("!") }
}

const errorGutter = [
  errorState,
  gutter({
    class: "cm-breakpoint-gutter",
    markers: v => v.state.field(errorState),
    initialSpacer: () => errorMarker
  }),
  EditorView.baseTheme({
    ".cm-error-gutter .cm-gutterElement": {
      color: "red",
      paddingLeft: "5px",
      cursor: "default"
    },
    ".cm-currentLine": {backgroundColor: "#121212", color: "white"}
  })
]

let lint = linter((view) => {
  
  let errors=[];
  syntaxTree(view.state).cursor().iterate(node => {
    if (node.type.isError) errors.push({
      from: node.from,
      to: node.to,
      severity: "error",
      message: "Syntax-Fehler",
    })
  })
  
  return errors;
});

new EditorView({
  doc: "class Main{\n\tvoid test(){\n\t\t\String s;\n\t\ts.\n\t}\n\tString[] test2(){\n\t\t\n\t}\n} \n\n//Comment the upper out and the lower in \n//to see the different error-ranges.\n\n//class Main{\n//\tvoid test(){\n//\t\t\String s;\n//\t\ts.\n//\t}\n//\tvoid test2(){//\n\t\t\n//\t}\n//}",
  extensions: [basicSetup, java(),lint,lintGutter(),],
  parent: document.body
})
Hello, thanks in advance for your great work. I've got the following problem: Consider the following two java programs: ``` class Main{ void test(){ String s; s. } String[] test2(){ // do smth. } } ``` and ``` class Main{ void test(){ String s; s. } void test2(){// //do smth. } } ``` Both programs contain the same syntax error in line 4. But in the first case the error spans the whole lines 4, 5 and 6 (it seems to think that this is a LocalVariableDeclaration). Therefore the whole parsing tree is wrong after this error (the method "test2" is not recognized as method any longer). In the second case the error spans line 4 only. I put together a example that runs in the ["try codemirror page"](https://codemirror.net/try/) (see below). Just copy paste this and comment the code in the editor in or out. Regards Thomas. ``` import {basicSetup} from "codemirror"; import {syntaxTree} from "@codemirror/language"; import { EditorView } from "@codemirror/view"; import { java, javaLanguage } from "@codemirror/lang-java"; import {LanguageSupport} from "@codemirror/language"; import { lintGutter, linter, openLintPanel, closeLintPanel } from "@codemirror/lint"; import {keymap} from "@codemirror/view"; import {Compartment,StateField, StateEffect, EditorSelection,RangeSet,EditorState} from "@codemirror/state" import {gutter, GutterMarker} from "@codemirror/view" import {Decoration,ViewPlugin} from "@codemirror/view"; const errorEffect = StateEffect.define({ map: (val, mapping) => ({pos: mapping.mapPos(val.pos), on: val.on}) }) const errorState = StateField.define({ create() { return RangeSet.empty }, update(set, transaction) { set = set.map(transaction.changes) for (let e of transaction.effects) { if (e.is(errorEffect)) { if (e.value.on){ set = set.update({add: [errorMarker.range(e.value.pos)]}) }else{ set = set.update({filter: from => from != e.value.pos}) } } } return set } }) const errorMarker = new class extends GutterMarker { toDOM() { return document.createTextNode("!") } } const errorGutter = [ errorState, gutter({ class: "cm-breakpoint-gutter", markers: v => v.state.field(errorState), initialSpacer: () => errorMarker }), EditorView.baseTheme({ ".cm-error-gutter .cm-gutterElement": { color: "red", paddingLeft: "5px", cursor: "default" }, ".cm-currentLine": {backgroundColor: "#121212", color: "white"} }) ] let lint = linter((view) => { let errors=[]; syntaxTree(view.state).cursor().iterate(node => { if (node.type.isError) errors.push({ from: node.from, to: node.to, severity: "error", message: "Syntax-Fehler", }) }) return errors; }); new EditorView({ doc: "class Main{\n\tvoid test(){\n\t\t\String s;\n\t\ts.\n\t}\n\tString[] test2(){\n\t\t\n\t}\n} \n\n//Comment the upper out and the lower in \n//to see the different error-ranges.\n\n//class Main{\n//\tvoid test(){\n//\t\t\String s;\n//\t\ts.\n//\t}\n//\tvoid test2(){//\n\t\t\n//\t}\n//}", extensions: [basicSetup, java(),lint,lintGutter(),], parent: document.body }) ```
marijnh commented 2025-04-11 08:22:10 +02:00 (Migrated from github.com)

How quickly the parser will find its way back after an error depends on the tokens after the error—because those are all it has to work with at that point. This is expected and I don't think it's a bug.

How quickly the parser will find its way back after an error depends on the tokens after the error—because those are all it has to work with at that point. This is expected and I don't think it's a bug.
thomasklein1982 commented 2025-04-11 09:03:41 +02:00 (Migrated from github.com)

But why can LocalVariableDeclaration contain an closing }? I think the parser should be able to make out the code blocks in { } regardless of syntax errors.

But why can LocalVariableDeclaration contain an closing }? I think the parser should be able to make out the code blocks in { } regardless of syntax errors.
marijnh commented 2025-04-11 09:31:44 +02:00 (Migrated from github.com)

The error recovery doesn't know anything about tokens' semantic meaning. It just looks for a way to insert/drop tokens that gets it back to something that parses. In this case what it does is dropping the closing bracket and parsing s.String[] test2 as a local declaration. That gets it far enough ahead that it considers the recovery a success and commits to it. The content after that will lead to new errors, but for performance reasons the distance the library parses ahead with multiple recovery attempts before committing to one is limited.

The error recovery doesn't know anything about tokens' semantic meaning. It just looks for a way to insert/drop tokens that gets it back to something that parses. In this case what it does is dropping the closing bracket and parsing `s.String[] test2` as a local declaration. That gets it far enough ahead that it considers the recovery a success and commits to it. The content after that will lead to new errors, but for performance reasons the distance the library parses ahead with multiple recovery attempts before committing to one is limited.
thomasklein1982 commented 2025-04-11 09:56:10 +02:00 (Migrated from github.com)

Okay, but it's
s.}String[] test2
which makes no sense, does it?
Why does it drop the closing bracket?

Okay, but it's `s.}String[] test2` which makes no sense, does it? Why does it drop the closing bracket?
marijnh commented 2025-04-11 10:05:34 +02:00 (Migrated from github.com)

As I said, it drops the }

As I said, it drops the `}`
thomasklein1982 commented 2025-04-12 10:55:14 +02:00 (Migrated from github.com)

OK, but why does it drop the }?

I use the parser to parse java classes. To do all the semantics right and to provide autocompletion I need to parse the structure of the class first to get all attributes and methods. If the parser drops the }, the structure is not recognized correctly. Therefore the users of my IDE get weird error messages like "there is no method xy" in certain circumstances.

OK, but **why** does it drop the `}`? I use the parser to parse java classes. To do all the semantics right and to provide autocompletion I need to parse the structure of the class first to get all attributes and methods. If the parser drops the `}`, the structure is not recognized correctly. Therefore the users of my IDE get weird error messages like "there is no method xy" in certain circumstances.
marijnh commented 2025-04-12 12:04:15 +02:00 (Migrated from github.com)

Because there is a syntax error and it needs to do something to try to recover and continue. Its recovery strategies for that include things like dropping tokens or making tokens up.

Because there is a syntax error and it needs to do _something_ to try to recover and continue. Its recovery strategies for that include things like dropping tokens or making tokens up.
Sign in to join this conversation.
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lezer/java#10
No description provided.