wrong affinity of pointer-star in function type #3

Closed
opened 2023-03-06 17:57:54 +01:00 by milahu · 6 comments
milahu commented 2023-03-06 17:57:54 +01:00 (Migrated from github.com)

source

void* some_function() {}

actual parse tree

FunctionDefinition: "void* some_function() {"
  PrimitiveType: "void"
  PointerDeclarator: "* some_function()"
    FunctionDeclarator: "some_function()"
      Identifier: "some_function"
      ParameterList: "()"
        (: "("
        ): ")"
  CompoundStatement: "{"
    {: "{"
    }: "}"

actual affinity: void (* some_function)
expected affinity: (void *) some_function

actual vs expected parse tree

 FunctionDefinition: "void* some_function() {"
-  PrimitiveType: "void"
-  PointerDeclarator: "*some_function()"
-    FunctionDeclarator: "some_function()"
+  CompoundType: "void*"
+    PrimitiveType: "void"
+    PointerStar: "*"
+  FunctionDeclarator: "some_function()"
       Identifier: "some_function"
       ParameterList: "()"
         (: "("

same bug in tree-sitter-cpp, see playground

expected parse tree per clang

{
  "kind": "FunctionDecl",
  "name": "some_function",
  "type": {
    "qualType": "void *()"
  },
  "inner": [
    {
      "kind": "CompoundStmt",
    }
  ]
}
source ```cc void* some_function() {} ``` actual parse tree ``` FunctionDefinition: "void* some_function() {" PrimitiveType: "void" PointerDeclarator: "* some_function()" FunctionDeclarator: "some_function()" Identifier: "some_function" ParameterList: "()" (: "(" ): ")" CompoundStatement: "{" {: "{" }: "}" ``` actual affinity: `void (* some_function)` expected affinity: `(void *) some_function` actual vs expected parse tree ```diff FunctionDefinition: "void* some_function() {" - PrimitiveType: "void" - PointerDeclarator: "*some_function()" - FunctionDeclarator: "some_function()" + CompoundType: "void*" + PrimitiveType: "void" + PointerStar: "*" + FunctionDeclarator: "some_function()" Identifier: "some_function" ParameterList: "()" (: "(" ``` same bug in tree-sitter-cpp, see [playground](https://tree-sitter.github.io/tree-sitter/playground) expected parse tree per [clang](https://godbolt.org/z/fhEfvhWaa) ```json { "kind": "FunctionDecl", "name": "some_function", "type": { "qualType": "void *()" }, "inner": [ { "kind": "CompoundStmt", } ] } ```
marijnh commented 2023-03-07 08:17:43 +01:00 (Migrated from github.com)

This BNF grammar also seems to group the * under the declarator, so I'm not sure this is a bug as such.

[This BNF grammar](https://alx71hub.github.io/hcb/#ptr-declarator) also seems to group the `*` under the declarator, so I'm not sure this is a bug as such.
milahu commented 2023-03-07 09:03:37 +01:00 (Migrated from github.com)

then that grammar is wrong too.
other grammars:

for my case (cpp2js transpiler) i also care about semantics, so i use the clang parser as reference

parsing the example code void* some_function() {} as "pointer declarator" does not make sense

an actual "pointer declarator" should be called "function pointer declarator"
declaration and call need parens for (*function_name)

#include <stdio.h> // printf
#include <stdlib.h> // malloc

void f1(int x) {
  printf("f1: x = %i\n", x);
}

void f2(int x) {
  printf("f2: x = %i\n", x);
}

struct s1 {
  // declaration
  void (*f1_ptr)(int x);
  void (*f2_ptr)(int x);
};

int main(int argc, char** argv) {
  int x = argc;
  struct s1 *s1_ptr = malloc(sizeof(struct s1));

  void (*f1_ptr)(int x) = &f1; // declaration + init
  (*f1_ptr)(x); // call
  x++;

  // arrow syntax
  s1_ptr->f1_ptr = &f1; // init
  (*s1_ptr->f1_ptr)(x); // call
  x++;

  // dot syntax
  (*s1_ptr).f2_ptr = &f2; // init
  (*(*s1_ptr).f2_ptr)(x); // call
  x++;

  free(s1_ptr);
  return 0;
}
then [that grammar](https://github.com/alx71hub/hcb) is wrong too. other grammars: - https://github.com/antlr/grammars-v4/tree/master/c - https://github.com/antlr/grammars-v4/tree/master/cpp for my case ([cpp2js transpiler](https://github.com/milahu/lezer-parser-import-tree-sitter-scanner)) i also care about semantics, so i use the `clang` parser as reference parsing the example code `void* some_function() {}` as "pointer declarator" does not make sense an actual "pointer declarator" should be called "function pointer declarator" declaration and call need parens for `(*function_name)` ```c #include <stdio.h> // printf #include <stdlib.h> // malloc void f1(int x) { printf("f1: x = %i\n", x); } void f2(int x) { printf("f2: x = %i\n", x); } struct s1 { // declaration void (*f1_ptr)(int x); void (*f2_ptr)(int x); }; int main(int argc, char** argv) { int x = argc; struct s1 *s1_ptr = malloc(sizeof(struct s1)); void (*f1_ptr)(int x) = &f1; // declaration + init (*f1_ptr)(x); // call x++; // arrow syntax s1_ptr->f1_ptr = &f1; // init (*s1_ptr->f1_ptr)(x); // call x++; // dot syntax (*s1_ptr).f2_ptr = &f2; // init (*(*s1_ptr).f2_ptr)(x); // call x++; free(s1_ptr); return 0; } ```
marijnh commented 2023-03-08 17:38:41 +01:00 (Migrated from github.com)

then that grammar is wrong too.

That seems like a very blunt dismissal of a generally respected resource. I don't see the structure you are proposing in https://github.com/antlr/grammars-v4/tree/master/cpp either. C++ is extremely weird to parse, so the parse tree and the abstract syntax tree in Clang differing is not terribly surprising. I don't have the expertise on this language to make a definite call here, but I am not prepared to merge this change without a much more solid rationale for it.

> then [that grammar](https://github.com/alx71hub/hcb) is wrong too. That seems like a very blunt dismissal of a generally respected resource. I don't see the structure you are proposing in https://github.com/antlr/grammars-v4/tree/master/cpp either. C++ is extremely weird to parse, so the parse tree and the abstract syntax tree in Clang differing is not terribly surprising. I don't have the expertise on this language to make a definite call here, but I am not prepared to merge this change without a much more solid rationale for it.
milahu commented 2023-03-08 19:15:14 +01:00 (Migrated from github.com)

C++ is extremely weird to parse

these are simple function-pointers in C

That seems like a very blunt dismissal

thats because im 100% sure (edit: at least im sure that cpp.lezer is wrong here)

if you remove the parens around (*funcion_name) in my example

 struct s1 {
   // declaration
   void (*f1_ptr)(int x);
-  void (*f2_ptr)(int x);
+  void* f2_ptr(int x);
 };
 
 int main(int argc, char** argv) {

then the pointer-star is left-associative, and compilation fails:

bad.c:15:9: error: field ‘f2_ptr’ declared as a function
   15 |   void* f2_ptr(int x);
      |         ^~~~~~

declared as a function

is an error, because in a struct C expects a function-pointer

so ...

problem: pointer-star is right-associative
expected: pointer-star is left-associative

> C++ is extremely weird to parse these are simple function-pointers in C > That seems like a very blunt dismissal thats because im 100% sure (edit: at least im sure that cpp.lezer is wrong here) if you remove the parens around `(*funcion_name)` in my example ```diff struct s1 { // declaration void (*f1_ptr)(int x); - void (*f2_ptr)(int x); + void* f2_ptr(int x); }; int main(int argc, char** argv) { ``` then the pointer-star is left-associative, and compilation fails: ``` bad.c:15:9: error: field ‘f2_ptr’ declared as a function 15 | void* f2_ptr(int x); | ^~~~~~ ``` > declared as a function is an error, because in a `struct` C expects a function-pointer so ... problem: pointer-star is right-associative expected: pointer-star is left-associative
marijnh commented 2023-03-20 13:27:11 +01:00 (Migrated from github.com)

I'm going to decline this until you can provide a (somewhat authoritative) grammar that works this way, on the assumption that deviating from the grammars we were working from is certainly going to cause obscure regressions.

I'm going to decline this until you can provide a (somewhat authoritative) grammar that works this way, on the assumption that deviating from the grammars we were working from is certainly going to cause obscure regressions.
milahu commented 2024-05-31 20:38:32 +02:00 (Migrated from github.com)

your decision was correct, also confirmed by https://github.com/tree-sitter/tree-sitter-c/issues/129#issuecomment-2113033597

int* x, y;

In C, this declares a pointer (x) to an int, and an int (y), i.e. the "star" only apply to x.

apparently clang also does some semantic processing, not just parsing

your decision was correct, also confirmed by https://github.com/tree-sitter/tree-sitter-c/issues/129#issuecomment-2113033597 > ```c > int* x, y; > ``` > > In C, this declares a pointer (x) to an int, and an int (y), i.e. the "star" only apply to x. apparently `clang` also does some semantic processing, not just parsing
Sign in to join this conversation.
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lezer/cpp#3
No description provided.