feat: export mkLang to help extend the parser #5

Closed
maxswjeon wants to merge 3 commits from feat/export-mklang into main
maxswjeon commented 2023-10-28 10:11:34 +02:00 (Migrated from github.com)

I am trying to extend the GFM Markdown parser.
However, the package only exports the CommonMark and GFM languages.
I know I can retrieve a MarkdownParser from either of them like this:

const commonmarkParser = commonmarkLanguage.parser as MarkdownParser;
const extendedParser = markdownLanguage.parser as MarkdownParser;

However, I cannot recreate a Language from a MarkdownParser, since mkLang is not exported by the package.

This PR exports mkLang from the package.

Examples

To extend the GFM parser today, the following boilerplate has to be copied from the package:

import { markdownLanguage } from "@codemirror/lang-markdown";
import {
  Language,
  defineLanguageFacet,
  foldService,
  syntaxTree,
} from "@codemirror/language";
import { NodeProp, NodeType, SyntaxNode } from "@lezer/common";
import { MarkdownParser } from "@lezer/markdown";
import { HeaderContent } from "./HeaderContent";

// Copied from @codemirror/lang-markdown
const data = defineLanguageFacet({
  commentTokens: { block: { open: "<!--", close: "-->" } },
});

const headingProp = new NodeProp<number>();

function isHeading(type: NodeType) {
  let match = /^(?:ATX|Setext)Heading(\d)$/.exec(type.name);
  return match ? +match[1] : undefined;
}

function findSectionEnd(headerNode: SyntaxNode, level: number) {
  let last = headerNode;
  for (;;) {
    let next = last.nextSibling,
      heading;
    if (!next || ((heading = isHeading(next.type)) != null && heading <= level))
      break;
    last = next;
  }
  return last.to;
}

const headerIndent = foldService.of((state, start, end) => {
  for (
    let node: SyntaxNode | null = syntaxTree(state).resolveInner(end, -1);
    node;
    node = node.parent
  ) {
    if (node.from < start) break;
    let heading = node.type.prop(headingProp);
    if (heading == null) continue;
    let upto = findSectionEnd(node, heading);
    if (upto > end) return { from: end, to: upto };
  }
  return null;
});

function mkLang(parser: MarkdownParser) {
  return new Language(data, parser, [headerIndent], "markdown");
}

const extendedParser = markdownLanguage.parser as MarkdownParser;
const extended = extendedParser.configure([HeaderContent]);
export const customMarkdownLanguage = mkLang(extended);

If mkLang were exported, this boilerplate could be removed:

import { markdownLanguage, mkLang } from "@codemirror/lang-markdown";
import { MarkdownParser } from "@lezer/markdown";
import { HeaderContent } from "./HeaderContent";

const extendedParser = markdownLanguage.parser as MarkdownParser;
const extended = extendedParser.configure([HeaderContent]);
export const customMarkdownLanguage = mkLang(extended);
marijnh commented 2023-10-28 10:19:53 +02:00 (Migrated from github.com)

That is intentionally not public. Use markdown({...}).language.parser. Also, if you create a PR, don't include a bunch of changes your IDE automatically reformatted.

maxswjeon commented 2023-10-28 10:26:13 +02:00 (Migrated from github.com)

Please recheck the issue. I know I can access the parser from a Language instance.
However, the markdown() function accepts a Language as its base, not a MarkdownParser, so we need a way to convert a modified MarkdownParser back into a Language.

marijnh commented 2023-10-28 10:39:28 +02:00 (Migrated from github.com)

Pass your extensions in the extensions field to markdown(), and it will modify the base language for you.
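
For readers landing here, a minimal sketch of what this suggestion might look like. The `HeaderContent` extension below is a hypothetical stand-in (the real one from the PR description is not shown in this thread); it only defines a node type so the example is self-contained. The sketch assumes `markdownLanguage` is passed as `base` to keep the GFM syntax.

```typescript
import { markdown, markdownLanguage } from "@codemirror/lang-markdown";
import type { MarkdownConfig } from "@lezer/markdown";

// Hypothetical placeholder for the extension discussed above.
// A real extension would also supply parseBlock/parseInline rules.
const HeaderContent: MarkdownConfig = {
  defineNodes: ["HeaderContent"],
};

// markdown() configures the base language's parser with the given
// extensions and wraps the result in a LanguageSupport.
const support = markdown({
  base: markdownLanguage,
  extensions: [HeaderContent],
});

// The extended Language, without needing mkLang.
export const customMarkdownLanguage = support.language;
```

The returned `support` value can be passed directly to an editor's extension list; `support.language` is only needed when the `Language` object itself is required elsewhere.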

maxswjeon commented 2023-10-28 10:45:30 +02:00 (Migrated from github.com)

Ohhh... Thanks


Pull request closed

Reference
codemirror/lang-markdown!5