ts-safeql / safeql

Validate and auto-generate TypeScript types from raw SQL queries in PostgreSQL.

Home Page:https://safeql.dev

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

JSON type inference

karlhorky opened this issue Β· comments

Supersedes #102
Supersedes #179

Is your feature request related to a problem? Please describe.

JSON functions such as json_agg and json_build_object and JSON literals return any types instead of the type of the data.

There are ways of getting around this with connections.overrides.types.json, but this is limiting:

{
  "connections": {
    // ...
    "overrides": {
      "types": {
        "json": "JsonAgg"
      }
    }
  }
}

For example, once we have configured this, we can add the JsonAgg type as in the example below.

However, there are problems with this:

  1. If we want to have multiple json_agg queries in the same file, it fails, because it uses the name JsonAgg
  2. The types are not checked
type Food = { id: number; name: string; };

type JsonAgg = Food[];

type AnimalWithFoodsInJsonAgg = {
  animalId: number;
  animalFoods: JsonAgg | null;
};

function getAnimalWithFoodsById(async (id: number) {
  const [animal] = await sql<AnimalWithFoodsInJsonAgg[]>`
    SELECT
      animals.id AS animal_id, (
        SELECT json_agg (foods.*)
        FROM
          animal_foods
          INNER JOIN foods ON animal_foods.food_id = foods.id
        WHERE
          animal_foods.animal_id = animals.id
      ) AS animal_foods
    FROM animals
    WHERE animals.id = ${id}
    GROUP BY animals.id;
  `;
  return animal;
}

Describe the solution you'd like

One potential solution is to use PostgreSQL COMMENTs like Kanel by @kristiandupont does

Support list:

  • json_build_object
  • json_agg
  • literal JSON

Describe alternatives you've considered

Running subqueries / parsing JSON in the query:

Additional context

--

Just released @ts-safeql/eslint-plugin@3.0.0-next.0, now with type-checking for json/b expressions πŸš€πŸš€

But please keep in mind that it doesn't cover all cases (such as functions/subselects inside jsonb_agg), but it works fine for simpler scenarios. More improvements are on the way, but I wanted to get this out now.

All feedback is welcome and appreciated πŸ™

@Newbie012 thanks for this! I checked this out just now and wrote some notes:

Works as expected

A) json_build_object

await sql<{ json_build_object: { id: number } }[]>`
  SELECT
    json_build_object('id', animals.id)
  FROM
    animals
`;

Like this test

B) json_agg() AS field

await sql<
  {
    colname: {
      id: number;
      name: string;
      type: string;
    }[];
  }[]
>`
  SELECT
    json_agg(foods) AS colname
  FROM
    foods
`;

Like this test

Not working yet as expected

C) json_agg() AS field with nullable fields

await sql<
  {
    colname: {
      id: number;
      firstName: string;
      type: string;
      accessory: string | null;
      birthDate: Date;
    }[];
  }[]
>`
  SELECT
    json_agg(animals) AS colname
  FROM
    animals
`;

Like this test

Error

Query has incorrect type annotation.
	Expected: { colname: { id: number; firstName: string; type: string; accessory: string | null; birthDate: Date }[] }[]
	Actual: { colname: { id: number; firstName: string; type: string; accessory: string; birthDate: Date }[] }[]

The animals.accessory field should be nullable:

CREATE TABLE animals (
  id integer PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  first_name varchar(30) NOT NULL,
  type varchar(30) NOT NULL,
  accessory varchar(30),
  birth_date date NOT NULL
);

D) json_agg AS field inside subquery

await sql<
  {
    colname: {
      id: number;
      name: string;
      type: string;
    }[];
  }[]
>`
  SELECT
    (
      SELECT
        json_agg(foods)
      FROM
        foods
    ) AS colname
  FROM
    foods
`;

@Newbie012 mentioned subqueries in his comment above, so it may be that this is also not yet supported.

Error

Query has incorrect type annotation.
	Expected: { colname: { id: number; name: string; type: string }[] }[]
	Actual: { colname: JsonAgg | null }[]

E) json_agg() AS field with FROM clause subquery derived table

await sql<
  {
    colname: {
      id: number;
      name: string;
      type: string;
    }[];
  }[]
>`
  SELECT
    json_agg(foods) AS colname
  FROM
    (
      SELECT
        *
      FROM
        foods
    ) AS foods
`;

@Newbie012 mentioned subqueries in his comment above, so it may be that this is also not yet supported.

Error

Query has incorrect type annotation.
	Expected: { colname: { id: number; name: string; type: string }[] }[]
	Actual: { colname: JsonAgg | null }[]

F) Usage of overrides.types = 'JsonAgg' type in queries

For queries that are not yet supported (eg. the queries above containing subqueries), usage of a overrides.types = 'JsonAgg' type alias in another type alias fails:

type JsonAgg = {
  id: number;
  name: string;
  type: string;
}[];

type X = {
  colname: JsonAgg;
};

await sql<X[]>`
  SELECT
    json_agg(foods) AS colname
  FROM
    (
      SELECT
        *
      FROM
        foods
    ) AS foods
`;

Here it seems as if the X type alias is causing JsonAgg to be resolved to Array.

Inlining the object type into the type argument position directly after sql< (instead of the X in X[]) works around this problem.

Error

Query has incorrect type annotation.
	Expected: { colname: Array }[]
	Actual: { colname: JsonAgg | null }[]

Thanks for the feedback, @karlhorky

Aside from

  SELECT
    (
      SELECT
        json_agg(foods)
      FROM
        foods
    ) AS colname
  FROM
    foods

The other cases should be ok in 3.0.0-next.1.

I think I could make the sublink (subselect) target work as well, but probably in a later version.

Nice, I can confirm this is was working in one of our projects, upgrade PR here:

It was working when I first installed @ts-safeql/eslint-plugin@3.0.0-next.1. However, as soon as I ran it on CI (or wiped my node_modules locally and reinstalled), the ESLint CLI failed.

ESLint CLI run problems:

Open long output

However, in the CI run, it appears that running ESLint via the command line causes resolution problems with the @ts-safeql/shared package (we're using pnpm):

pnpm eslint . --max-warnings 0
(node:3063) ExperimentalWarning: Importing JSON modules is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)

Oops! Something went wrong! :(

ESLint: 8.55.0

Error: Cannot find module '/home/runner/work/next-js-example-fall-2023-atvie/next-js-example-fall-2023-atvie/node_modules/.pnpm/@ts-safeql+eslint-plugin@3.0.0-next.1_eslint@8.55.0_libpg-query@15.0.2_typescript@5.2.2/node_modules/@ts-safeql/shared/lib/index.js'. Please verify that the package.json has a valid "main" entry
    at tryPackage (node:internal/modules/cjs/loader:444:19)
    at Module._findPath (node:internal/modules/cjs/loader:715:18)
    at Module._resolveFilename (node:internal/modules/cjs/loader:1130:27)
    at Module._load (node:internal/modules/cjs/loader:985:27)
    at Module.require (node:internal/modules/cjs/loader:1235:19)
    at require (node:internal/modules/helpers:176:18)
    at Object.<anonymous> (/home/runner/work/next-js-example-fall-2023-atvie/next-js-example-fall-2023-atvie/node_modules/.pnpm/@ts-safeql+eslint-plugin@3.0.0-next.1_eslint@8.55.0_libpg-query@15.0.2_typescript@5.2.2/node_modules/@ts-safeql/eslint-plugin/lib/rules/check-sql.rule.js:7:18)
    at Module._compile (node:internal/modules/cjs/loader:1376:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1435:10)
    at Module.load (node:internal/modules/cjs/loader:1207:32)

Is there something I did wrong that I should fix here?

Our ESLint config specifies SafeQL like this, has been working since we installed SafeQL back close to the first available version:

  firstConfig.plugins['@ts-safeql'] = await import('@ts-safeql/eslint-plugin');

  firstConfig.rules['@ts-safeql/check-sql'] = [
    'error',
    {
      connections: [
        {
          databaseUrl: `postgres://${process.env.PGUSERNAME}:${process.env.PGPASSWORD}@${process.env.PGHOST}:5432/${process.env.PGDATABASE}`,
          targets: [
            {
              tag: 'sql',
              fieldTransform: 'camel',
              transform: '{type}[]',
            },
          ],
          overrides: {
            types: {
              json: 'JsonAgg',
            },
          },
        },
      ],
    },
  ];



Edit: investigating more

Open more details

Ah looks like @ts-safeql/shared@3.0.0-next.1 is missing some file(s):

Screenshot 2023-12-18 at 13 05 29

But downgrading to @ts-safeql/shared@3.0.0-next.0 with pnpm overrides yields other errors in the CI run:

pnpm eslint . --max-warnings 0
  
(node:3026) ExperimentalWarning: Importing JSON modules is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)

Oops! Something went wrong! :(

ESLint: 8.55.0

Error: Cannot find module '../utils'
Require stack:
- /home/runner/work/next-js-example-fall-2023-atvie/next-js-example-fall-2023-atvie/node_modules/.pnpm/@ts-safeql+eslint-plugin@3.0.0-next.1_eslint@8.55.0_libpg-query@15.0.2_typescript@5.2.2/node_modules/@ts-safeql/eslint-plugin/lib/rules/check-sql.rule.js
- /home/runner/work/next-js-example-fall-2023-atvie/next-js-example-fall-2023-atvie/node_modules/.pnpm/@ts-safeql+eslint-plugin@3.0.0-next.1_eslint@8.55.0_libpg-query@15.0.2_typescript@5.2.2/node_modules/@ts-safeql/eslint-plugin/lib/rules/index.js
- /home/runner/work/next-js-example-fall-2023-atvie/next-js-example-fall-2023-atvie/node_modules/.pnpm/@ts-safeql+eslint-plugin@3.0.0-next.1_eslint@8.55.0_libpg-query@15.0.2_typescript@5.2.2/node_modules/@ts-safeql/eslint-plugin/lib/index.js
    at Module._resolveFilename (node:internal/modules/cjs/loader:1144:15)
    at Module._load (node:internal/modules/cjs/loader:985:27)
    at Module.require (node:internal/modules/cjs/loader:1235:19)
    at require (node:internal/modules/helpers:176:18)
    at Object.<anonymous> (/home/runner/work/next-js-example-fall-2023-atvie/next-js-example-fall-2023-atvie/node_modules/.pnpm/@ts-safeql+eslint-plugin@3.0.0-next.1_eslint@8.55.0_libpg-query@15.0.2_typescript@5.2.2/node_modules/@ts-safeql/eslint-plugin/lib/rules/check-sql.rule.js:14:17)
    at Module._compile (node:internal/modules/cjs/loader:1376:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1435:[10](https://github.com/upleveled/next-js-example-fall-2023-atvie/actions/runs/7247923553/job/19742846215?pr=16#step:13:10))
    at Module.load (node:internal/modules/cjs/loader:1207:32)
    at Module._load (node:internal/modules/cjs/loader:[10](https://github.com/upleveled/next-js-example-fall-2023-atvie/actions/runs/7247923553/job/19742846215?pr=16#step:13:11)23:[12](https://github.com/upleveled/next-js-example-fall-2023-atvie/actions/runs/7247923553/job/19742846215?pr=16#step:13:13))
    at Module.require (node:internal/modules/cjs/loader:1235:[19](https://github.com/upleveled/next-js-example-fall-2023-atvie/actions/runs/7247923553/job/19742846215?pr=16#step:13:20))

My bad. I'm building the .next versions locally on my machine, so I made a human error by not building all of the monorepo packages before publishing the version. Should be okay in 3.0.0-next.2 (hopefully).

I should probably use tsup for bundling, but that's for another day.

Ok, with @ts-safeql/eslint-plugin@3.0.0-next.2 no more dependency errors anymore, but now I have a type mismatch again:

Query has incorrect type annotation.
	Expected: { animalId: number; animalFirstName: string; animalType: string; animalAccessory: null | string; animalFoods: null | Array }[]
	Actual: { animalId: number; animalFirstName: string; animalType: string; animalAccessory: string | null; animalFoods: JsonAgg | null }[]

Query + type:

type AnimalWithFoods = {
  animalId: number;
  animalFirstName: string;
  animalType: string;
  animalAccessory: string | null;
  animalFoods: Food[] | null;
};

  const [animal] = await sql<AnimalWithFoods[]>`
    SELECT
      animals.id AS animal_id,
      animals.first_name AS animal_first_name,
      animals.type AS animal_type,
      animals.accessory AS animal_accessory,
      (
        SELECT
          json_agg(foods.*)
        FROM
          animal_foods
          INNER JOIN foods ON animal_foods.food_id = foods.id
        WHERE
          animal_foods.animal_id = animals.id
      ) AS animal_foods
    FROM
      animals
    WHERE
      animals.id = ${id}
    GROUP BY
      animals.first_name,
      animals.type,
      animals.accessory,
      animals.id;
  `;

PR:

Maybe the json_agg subquery in the SELECT is one of the last things that isn't supported yet?

I thought I saw it working on my machine when I first installed @ts-safeql/eslint-plugin@3.0.0-next.1, but maybe I tricked myself.

Yeah, I missed the failing tests. I'll create a new release soon.

On a different note, it turns out that jsonb_agg is nullable and can potentially contain null values:

-- If no data rows are selected, jsonb_agg is nullable
SELECT jsonb_agg(a)
FROM (SELECT NULL::integer AS a WHERE false) sub;

-- If some of the data rows are null, jsonb_agg can contain null values
SELECT jsonb_agg(a)
FROM (VALUES (1), (2), (NULL)) AS vals(a);

While the safest approach is to generate { jsonb_agg: (T | null)[] | null }, in most cases, it's in fact T[], but I treat every type as nullable until proven otherwise. In a nutshell, I may have to do additional nullability checks before 3.0.0.

Just released 3.0.0-next.3 which fixes many small issues such as mistakenly inferring T[] as Array.

Right ok, now with @ts-safeql/eslint-plugin@3.0.0-next.3 we're back to the Food[] vs JsonAgg error message:

Query has incorrect type annotation.
	Expected: { animalId: number; animalFirstName: string; animalType: string; animalAccessory: null | string; animalFoods: null | Food[] }[]
	Actual: { animalId: number; animalFirstName: string; animalType: string; animalAccessory: string | null; animalFoods: JsonAgg | null }[]

@Newbie012 in the query from the other repo, one thing I considered was moving to inner joins instead of using the subquery:

type Food = {
  id: number;
  name: string;
  type: string;
};

type AnimalWithFoods = {
  animalId: number;
  animalFirstName: string;
  animalType: string;
  animalAccessory: string | null;
  animalFoods: Food[] | null;
};

  const [animal] = await sql<AnimalWithFoods[]>`
    SELECT
      animals.id AS animal_id,
      animals.first_name AS animal_first_name,
      animals.type AS animal_type,
      animals.accessory AS animal_accessory,
      json_agg(foods.*) AS animal_foods
    FROM
      animals
      INNER JOIN animal_foods ON animals.id = animal_foods.animal_id
      INNER JOIN foods ON foods.id = animal_foods.food_id
    WHERE
      animals.id = ${id}
    GROUP BY
      animals.first_name,
      animals.type,
      animals.accessory,
      animals.id
  `;

This seems like it should work, but SafeQL has problems resolving the Food type alias:

Query has incorrect type annotation.
	Expected: { animalId: number; animalFirstName: string; animalType: string; animalAccessory: null | string; animalFoods: null | Food[] }[]
	Actual: { animalId: number; animalFirstName: string; animalType: string; animalAccessory: string | null; animal_foods: { id: number; name: string; type: string }[] }[]

Commit here:

I knew that I'd have to rewrite the whole part of type inference from AST from scratch, but I thought I could postpone it with patches. I guess my journey begins now...

It took a while but I think it's finally ready:

npm i -D @ts-safeql/eslint-plugin@3.0.0.next-4

Up until now, SafeQL heavily relied on PostgresSQL's RowDescription message which returned the sources of the queried columns. While in most cases it was sufficient to infer the query returned type, it wasn't as precise as we wanted when dealing with complex/nested types (such as json and jsonb).

In version 3.0.0.next.4 I rewrote from scratch the way SafeQL infers the query types by introducing a new "ASTDescription" which aims to replace PostgresSQL's RowDescription as the primary "descriptor" by trying to figure out the source and the type of each target result (e.g., a returned column). If it fails to retrieve the type for any reason, it will fall back to PostgresSQL's RowDescription (the current implementation). As of writing this comment, all of the current tests (of SELECT statements) passes with the new ASTDescription.

As a result, SafeQL got more stricter and precise, and at the same time more flexible for future plans.

Any feedback would be very much appreciated.

Upgraded to @ts-safeql/eslint-plugin@3.0.0-next.4 and there appear to be improvements, nice!

No longer is the following query using JsonAgg:

type Food = {
  id: number;
  name: string;
  type: string;
};

type AnimalWithFoods = {
  animalId: number;
  animalFirstName: string;
  animalType: string;
  animalAccessory: string | null;
  animalFoods: Food[] | null;
};

const [animal] = await sql<AnimalWithFoods[]>`
  SELECT
    animals.id AS animal_id,
    animals.first_name AS animal_first_name,
    animals.type AS animal_type,
    animals.accessory AS animal_accessory,
    -- Return empty array if no food is found
    coalesce(
      json_agg(foods.*) FILTER (WHERE foods.id IS NOT NULL),
      '[]'
    ) AS animal_foods
  FROM
    animals
    LEFT JOIN animal_foods ON animals.id = animal_foods.animal_id
    LEFT JOIN foods ON foods.id = animal_foods.food_id
  WHERE
    animals.id = ${id}
  GROUP BY
    animals.first_name,
    animals.type,
    animals.accessory,
    animals.id
`;

However, there is an error remaining here, it seems that there are some extra | nulls being added:

Query has incorrect type annotation.
	Expected: { animalId: number; animalFirstName: string; animalType: string; animalAccessory: null | string; animalFoods: null | { id: number; name: string; type: string }[] }[]
	  Actual: { animalId: number; animalFirstName: string; animalType: string; animalAccessory: string | null; animalFoods: { id: number | null; name: string | null; type: string | null }[] }[]

eslint [@ts-safeql/check-sql](https://github.com/ts-safeql/safeql)

As far as I can tell, animalFoods[number].id, animalFoods[number].name and animalFoods[number].type should never be null:

CREATE TABLE foods (
  id integer PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  name varchar(30) NOT NULL,
  type varchar(30) NOT NULL
);

Good catch! The part responsible for the non-nullability check doesn't consider the FILTER keyword, so it assumes that everything in the column can be nullable because it comes from a LEFT JOIN relation.

Now that I have full control over the type inference, I think adding that check would be trivial as well.

The part responsible for the non-nullability check doesn't consider the FILTER keyword, so it assumes that everything in the column can be nullable because it comes from a LEFT JOIN relation.

Even without the FILTER, the returned value won't be null fields/properties (I think). I receive the following from PostgreSQL json_agg() when foods doesn't return any results:

[null]

So if I'm understanding correctly and this should be taken into account by SafeQL, then I guess json_agg() with a LEFT JOIN should return:

animalFoods: (null | { id: number; name: string; type: string })[]

Or, more precisely:

animalFoods: { id: number; name: string; type: string }[] | [null]

@karlhorky when I think about it, this is a bit more complex than I initially thought. Right now, SafeQL iterates over each column and decides whether it should be nullable (or not). Instead, SafeQL should look at a group of columns, which requires fundamental changes in the way SafeQL checks for nullability (which was introduced in 2.0.0). While I'm all in for it, I think I should release 3.0.0 as is for now.

Thoughts?

While I'm all in for it, I think I should release 3.0.0 as is for now.

I think getting the first version of the inference out now is good and beneficial to users right away (and keeping this issue open)

Then pushing the nullability change to 4.0.0 or a minor like 3.1.0 sounds good.

fixed in 3.0.0

@Newbie012 Thanks for @ts-safeql/eslint-plugin@3 πŸ™Œ Really excited to have a new release here!

However, maybe we should reopen this one - it seems like json_agg() with a LEFT JOIN (mentioned in my comment above) isn't quite there yet:

So if I'm understanding correctly and this should be taken into account by SafeQL, then I guess json_agg() with a LEFT JOIN should return:

animalFoods: (null | { id: number; name: string; type: string })[]

Or, more precisely:

animalFoods: { id: number; name: string; type: string }[] | [null]

With the query below, I get this error still:

Query has incorrect type annotation.
	Expected: { animalFoods: { id: number; name: string; type: string }[] }[]
	  Actual: { animalFoods: { id: number | null; name: string | null; type: string | null }[] }[]
type Food = {
  id: number;
  name: string;
  type: string;
};

type AnimalWithFoods = {
  animalFoods: Food[];
};

const [animal] = await sql<AnimalWithFoods[]>`
  SELECT
    coalesce(
      json_agg(foods.*) FILTER (WHERE foods.id IS NOT NULL),
      '[]'::json
    ) AS animal_foods
  FROM
    animals
    LEFT JOIN animal_foods ON animals.id = animal_foods.animal_id
    LEFT JOIN foods ON foods.id = animal_foods.food_id
  WHERE
    animals.id = ${id}
  GROUP BY
    animals.id
`;

Also without the FILTER clause (where PostgreSQL returns [null]), it doesn't change the SafeQL type that is inferred.


I created a DB Fiddle to demonstrate the issue:

  • Query 1: LEFT JOIN condition matches, animal_foods contains records
  • Query 2: LEFT JOIN condition doesn't match, animal_foods with FILTER falls back to '[]'::json
  • Query 3: LEFT JOIN condition doesn't match, animal_foods without FILTER contains [null]
  • Query 4: LEFT JOIN condition doesn't match, animal_foods, CASE clause falls back to '[]'::json

https://www.db-fiddle.com/f/3XYpPxfzaoZ3domAJD8MFx/5

Screenshot 2024-04-25 at 18 28 55

@karlhorky I understand that. Although, treating a group of columns as nullable rather than treating each column individually can be very complex and should be discussed further in a separate discussion.

I should probably add support for type discrimination using the FILTER syntax as well, but it's unrelated to this issue.

Ok thanks for opening the discussion, I'll watch and discuss over there.

I should probably add support for type discrimination using the FILTER syntax as well, but it's unrelated to this issue.

Ok, opened this feature request here: