Alternative approach to nested parsing? #2
Hi, I noticed this HTML snippet:
Produces this parse tree:
[Image: Actual Parse Tree]
When I would have instead expected a tree like this:
[Image: Expected Parse Tree]
What about starting a new inner parse inside tokens.js:contentTokenizer, and then shifting a content token to the outer HTML parser once the inner parser cannot shift anything AND the outer parser can shift? We would then save the result of the inner parse and mount it once the outer parse completes.
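To make the idea concrete, here is a rough, self-contained sketch of that loop. It is a toy model only: `innerShift`, `outerCanShift`, `contentEnd`, and the tiny "inner grammar" (double-quoted strings plus any character other than `<`) are all hypothetical names and assumptions for illustration, not the actual code or API in tokens.js.

```javascript
// Hypothetical toy model; none of these names come from the real tokens.js.
const CLOSE_TAG = "</script>";

// Toy "inner parser": returns the end position of the next token it can
// shift at `pos`, or -1 when it cannot shift anything there.
function innerShift(input, pos) {
  const ch = input[pos];
  if (ch === undefined) return -1;
  if (ch === '"') {
    // A double-quoted string literal is a single inner token.
    const end = input.indexOf('"', pos + 1);
    return end === -1 ? -1 : end + 1;
  }
  if (ch === "<") return -1; // assumption: no inner token starts with "<"
  return pos + 1; // any other character is a one-character token
}

// Toy "outer parser": it can shift only when a close tag starts at `pos`.
function outerCanShift(input, pos) {
  return input.startsWith(CLOSE_TAG, pos);
}

// Run the inner parse and end the content token at the first position
// where the inner parser is stuck AND the outer parser can shift.
function contentEnd(input, start) {
  let pos = start;
  const innerTokens = []; // saved inner result, to be mounted later
  while (pos < input.length) {
    const next = innerShift(input, pos);
    if (next !== -1) {
      innerTokens.push(input.slice(pos, next));
      pos = next;
      continue;
    }
    if (outerCanShift(input, pos)) break;
    pos += 1; // neither parser can proceed: skip one character
  }
  return { end: pos, innerTokens };
}

const input = 'let a = "</script>"; </script>';
const result = contentEnd(input, 0);
console.log(result.end, input.indexOf(CLOSE_TAG));
```

In this toy grammar, a close tag that appears inside a string literal is consumed by the inner parse, so the content token ends at the later close tag rather than at the first textual match.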
Could this be a good way to approach mixed-language parsing in general?
I have some ideas around context-sensitive languages and modular parsers I think would be neat to explore with this approach, but am not 100% sure they have legs yet.
Have you tried this in a browser? Because I'm pretty sure the way browsers parse documents like this corresponds to what you are labeling the 'bad' parse tree.
Ah, I just tried it in a browser and you're totally right. My apologies for taking your time with this.