Add support for async table schemas #9

Closed
cemreyavuz wants to merge 2 commits from async-schema-columns into main
cemreyavuz commented 2022-09-17 00:30:28 +02:00 (Migrated from github.com)

This PR aims to add support for providing table schemas with promises. The motivation is similar to https://github.com/codemirror/lang-sql/pull/2: I'm also looking to fetch table schemas on demand. I can do that by playing around with the completions returned from `schemaCompletionSource`, but CodeMirror already supports promises for completions - so I thought, why not support it internally in the extension itself?

Now we return an async completion source that awaits the level list - so this means a small type change to the `CompletionLevel` type:

```tsx
list: readonly (string | Completion)[] | Promise<(string | Completion)[]> = []
```

And, we also need to do the mapping of strings to completions in the completion source:

```tsx
(await level.list).map(val => typeof val == "string" ? {label: val, type: "property"} : val)
```
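Put together, the idea can be sketched as follows. This is a minimal, self-contained sketch using simplified stand-ins for the CodeMirror types (`Completion` and `CompletionLevel` here are assumptions, not the actual library definitions):

```typescript
// Simplified stand-ins for the CodeMirror types (assumption: reduced shapes).
interface Completion {
  label: string;
  type?: string;
}

interface CompletionLevel {
  // The list may now be a plain array or a promise resolving to one.
  list: readonly (string | Completion)[] | Promise<(string | Completion)[]>;
}

// Await the (possibly async) list, then normalize bare strings into
// Completion objects, leaving existing Completion objects untouched.
async function resolveCompletions(level: CompletionLevel): Promise<Completion[]> {
  return (await level.list).map(val =>
    typeof val == "string" ? { label: val, type: "property" } : val
  );
}
```

Because `await` also accepts non-promise values, the same code path handles both synchronous and asynchronous lists.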
marijnh commented 2022-09-17 10:42:19 +02:00 (Migrated from github.com)

Do I understand correctly that this will keep using the same promise and thus only really allows you to initialize the extension before the schema has finished loading, not do something like dynamic schemas? If so, why would this (which significantly complicates the logic here) be any better than just waiting for the schema to load and then initializing the extension?

cemreyavuz commented 2022-09-17 11:56:02 +02:00 (Migrated from github.com)

> not do something like dynamic schemas

I meant loading the table schemas dynamically, not dynamic schemas.

> why would this (...) be any better than just waiting for the schema to load and then initializing the extension?

The idea was fetching the table schema only when trying to autocomplete for that table. It turned out my first commit was not actually doing that - I added another commit, so now it does. This way, I can fetch the schema only for the tables I use instead of waiting to fetch all of them - and a schema can look like this:

```tsx
const sqlLang = sql({
  schema: {
    "foo": (() => new Promise((resolve) => {
      setTimeout(() => {
        resolve(["foo-column"]);
      }, 2000);
    })) as any,
    "bar": (() => new Promise((resolve) => {
      setTimeout(() => {
        resolve(["bar-column"]);
      }, 2000);
    })) as any,
  },
});
```

I also added a codesandbox (https://codesandbox.io/s/infallible-colden-hgxy0z?file=/src/App.tsx):

https://user-images.githubusercontent.com/26118454/190850881-7e390440-a7bc-466c-a74b-5f04b1f6a69d.mp4

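One practical detail a per-table fetch like this needs is to avoid re-fetching on every keystroke. A small sketch of that idea, where `fetchColumns` is a hypothetical stand-in for a real backend call (not part of this PR's code), is to cache the promise per table:

```typescript
type ColumnList = string[];

// Hypothetical backend call; here it just resolves after a short delay,
// mirroring the setTimeout-based example above.
function fetchColumns(table: string): Promise<ColumnList> {
  return new Promise(resolve =>
    setTimeout(() => resolve([`${table}-column`]), 10)
  );
}

// Cache the promise itself, so concurrent completion requests for the
// same table share a single in-flight fetch.
const cache = new Map<string, Promise<ColumnList>>();

function columnsFor(table: string): Promise<ColumnList> {
  let cached = cache.get(table);
  if (!cached) {
    cached = fetchColumns(table);
    cache.set(table, cached);
  }
  return cached;
}
```

Caching the promise rather than the resolved value means a second completion request for `foo`, issued before the first fetch finishes, still reuses the in-flight request instead of firing a duplicate one.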
marijnh commented 2022-09-17 13:21:37 +02:00 (Migrated from github.com)

Are you sure your (global) schema data is big enough to make this kind of thing worthwhile? I can't really imagine a database with so many tables that it's expensive to send a description of the schema over in one go.

cemreyavuz commented 2022-09-17 14:46:02 +02:00 (Migrated from github.com)

> Are you sure your (global) schema data is big enough to make this kind of thing worthwhile? I can't really imagine a database with so many tables that it's expensive to send a description of the schema over in one go.

Not the data itself, but gathering that data is expensive for me. I'm using SQL to query "unstructured" data, so the schema is generated on read. Also, the number of tables is arbitrarily large (can be 10, but can be 1000 as well) - so overall it becomes a bit unnecessary to get the schema for all of those tables. Of course, we can argue that this use case is not that common, but that is the motivation behind this PR at least.

marijnh commented 2022-09-18 12:19:30 +02:00 (Migrated from github.com)

I'm going to decline this — it's too narrow a use case, and introducing asynchronicity does tend to complicate thinking about correctness, so I avoid it when I can.


Pull request closed