Improve parser performance by 50% #79
This PR significantly improves `parse` & `parseSlice` performance by optimizing how ProseMirror does tag matching: it calls `querySelectorAll` on the main DOM node once, instead of calling `matches` on each node individually for selectors like `li`, `p`, etc. This avoids a browser API call per node and works much faster. These changes shouldn't break anything.
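The diff itself isn't reproduced in this thread, but the core idea can be sketched outside the browser with synthetic nodes (all names below are hypothetical, and "selector matching" is reduced to tag-name comparison, which is enough to show the pattern):

```javascript
// Synthetic stand-ins for DOM nodes: tags alternate between "li" and "p".
const nodes = Array.from({ length: 9 }, (_, i) => ({ tag: i % 3 === 0 ? "li" : "p" }));
const selectors = ["li", "p"];

// Before: one matches()-style check per node per rule (work done once per match).
function matchPerNode(nodes, selectors) {
  return nodes.filter((node) => selectors.some((sel) => node.tag === sel));
}

// After: do the selector work once per parse. In the real DOM this would be a
// single querySelectorAll(selectors.join(",")) call on the root node; each
// per-node check then becomes a cheap Set lookup instead of a browser call.
function matchOncePerParse(nodes, selectors) {
  const wanted = new Set(selectors);            // built once per parse
  const matched = new Set(nodes.filter((n) => wanted.has(n.tag)));
  return nodes.filter((n) => matched.has(n));   // O(1) check per node
}

console.log(matchPerNode(nodes, selectors).length);      // 9
console.log(matchOncePerParse(nodes, selectors).length); // 9
```

The win comes from moving the expensive selector evaluation out of the per-node loop, not from changing what matches.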
Benchmarks
Before:
After:
That's a solid ~40% improvement.
I tried running the tests but I keep getting this error:
What kind of things are you doing where parser performance is a big issue? I'm not sure adding this complexity for a 50% increase is worth it. Adding a simple branch for plain tag matching to `matches()` might be a reasonable idea (though I'm not sure why browser implementations of `DOMElement.matches` wouldn't already be optimized that way).

You're probably using yarn or a very old npm to install your dependencies if you're getting duplicated modules like that.
We are currently using ProseMirror in a note-taking app which stores notes as HTML that the user can later view and edit. For that to work, the parser is obviously involved. It isn't an issue for smaller notes, but for large notes (> 300K words) optimizing the parser brings down the waiting time between the user clicking on a note and it appearing on screen, ready to edit. Even a small improvement helps.
I mean, it is a total of 30 lines and 2 methods, and even that can be reduced further by refactoring. A 50% increase is not trivial for the amount of change this PR makes.
Unfortunately, `matches` is on the hot path; adding any branch to it will only make things "slightly" better. Browser implementations are optimized, but when you start matching thousands of nodes and CSS selectors in a loop, it's obvious why the browser can't do it fast enough.

The changes I have made follow a simple rule: do the work once per parse instead of once per match. They can be restructured to look less complex; they really aren't doing a whole lot.
Of course, the decision rests with you. I'd be happy to make any further changes.
No, actually, I am running npm v10.2.4. I just ran `npm i` and then `npm run test`.

Benchmarking these changes by parsing and re-parsing the example document on the dev demo page, I don't see an actual noticeable speed improvement. Are you using any particularly complex selectors in your schema?
The speed difference is most noticeable on Chromium-based browsers, unfortunately. Here's a really simple snippet that benchmarks the changes made in this PR:
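(The snippet itself didn't survive in this copy of the thread; a rough sketch of that kind of micro-benchmark might look like the following. Outside a browser there is no DOM, so a tiny fake `document` supporting only comma-separated tag-name selectors is shimmed in; in a real browser the shim is skipped and the native APIs are used.)

```javascript
// Shim a minimal DOM when running outside the browser, so the comparison
// below is runnable anywhere. Element count and tags are made up.
if (typeof document === "undefined") {
  const makeEl = (tag) => ({
    tagName: tag,
    matches(sel) {
      return sel.split(",").map((s) => s.trim()).includes(tag);
    },
  });
  const els = Array.from({ length: 50000 }, (_, i) => makeEl(i % 2 ? "p" : "div"));
  globalThis.document = {
    body: {
      querySelectorAll: (sel) => els.filter((e) => e.matches(sel)),
      getElementsByTagName: () => els,
    },
  };
}

const root = document.body;
const SEL = "p, li";
const all = root.getElementsByTagName("*");

// Strategy A (this PR): one querySelectorAll up front, then a Set lookup per
// node. Note timeA includes the cost of building the set, as discussed below.
let t0 = performance.now();
const matched = new Set(root.querySelectorAll(SEL));
let hitsA = 0;
for (const el of all) if (matched.has(el)) hitsA++;
const timeA = performance.now() - t0;

// Strategy B (current behavior): call matches() on every node individually.
t0 = performance.now();
let hitsB = 0;
for (const el of all) if (el.matches(SEL)) hitsB++;
const timeB = performance.now() - t0;

console.log({ hitsA, hitsB, timeA, timeB });
```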
I ran these in Google Chrome and Firefox, and got the following results:
Chrome:

Firefox:

This tries to exclude the time it takes to loop over the elements (not super accurately, though). The crux is that the speed difference is most significant on Chrome, while on Firefox it actually gets slower. However, if you include the cost of creating the set of nodes, this PR isn't looking all that great (at least, that's what independent benchmarking is showing). That is very different from what I see after benchmarking ProseMirror with these changes, which makes me wonder whether the performance difference is due to something else.
I benchmarked both Firefox and Chrome. Neither showed a significant difference. I'm not interested in micro-benchmarks that just run one leaf function—there'd have to be a noticeable improvement in `DOMParser.parse`'s time for this to be interesting.

How many nodes are in the document you benchmarked on? I'll run the benchmarks again. It's possible that I am doing something wrong/different, because I am seeing a 40% improvement in `DOMParser.parse`.

I was parsing a 3000-node document in a loop for a few seconds, counting the number of parses per second.
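A parses-per-second harness like the one described can be sketched as follows (the `parseOnce` workload is a dummy stand-in, since the real schema and document aren't shown in this thread):

```javascript
// Runs fn in a loop for roughly durationMs and reports iterations per second.
function opsPerSecond(fn, durationMs = 200) {
  const start = performance.now();
  let count = 0;
  while (performance.now() - start < durationMs) {
    fn();
    count++;
  }
  return count / ((performance.now() - start) / 1000);
}

// Dummy stand-in for DOMParser.parse over a 3000-node document.
const doc = Array.from({ length: 3000 }, (_, i) => ({ type: i % 2 ? "p" : "li" }));
function parseOnce() {
  let paragraphs = 0;
  for (const node of doc) if (node.type === "p") paragraphs++;
  return paragraphs;
}

const rate = opsPerSecond(parseOnce);
console.log(`${Math.round(rate)} parses/sec`);
```

Measuring whole parses this way, rather than timing one leaf function, captures any fixed per-parse overhead the optimization adds.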
Pull request closed