Challenge

version: v1.3

The context for this challenge is that you work at a company that powers a marketplace app for healthcare facilities to hire healthcare professionals (a.k.a. workers).

Your role is that of a senior software engineer in charge of the open-shift backend service. This service stores information on the Shift, Facility, Worker, Document, FacilityRequirement, and DocumentWorker entities.

Your task is to complete the following User Story: As a worker, I want to get all available shifts across all active facilities where I'm eligible to work.

Acceptance Criteria

For a Worker to be eligible for a facility's shift:

The Facility must be active
The Shift must be active (i.e., not deleted)
The Worker must be active
The Shift must not be claimed by someone else
The Worker must have all of the facility's required documents
The professions between the Shift and Worker must match
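The criteria above can be read as a single predicate. Below is a minimal sketch of that reading, not the service's actual code: the record shapes and field names (isActive, isDeleted, workerId, requiredDocumentIds, documentIds) are hypothetical, and it assumes a shift counts as available only when nobody has claimed it.

```typescript
// Hypothetical record shapes; the real database models may use different names.
interface FacilityRecord { id: number; isActive: boolean; requiredDocumentIds: number[] }
interface WorkerRecord { id: number; isActive: boolean; profession: string; documentIds: number[] }
interface ShiftRecord {
  id: number;
  isDeleted: boolean;
  workerId: number | null; // null means unclaimed (assumption)
  profession: string;
  facility: FacilityRecord;
}

// Returns true only when every acceptance criterion holds.
function isEligible(worker: WorkerRecord, shift: ShiftRecord): boolean {
  const held = new Set(worker.documentIds);
  return (
    shift.facility.isActive &&               // facility is active
    !shift.isDeleted &&                      // shift is active
    worker.isActive &&                       // worker is active
    shift.workerId === null &&               // shift is not claimed
    shift.profession === worker.profession && // professions match
    shift.facility.requiredDocumentIds.every((id) => held.has(id)) // all required docs held
  );
}
```

A facility with no required documents passes the last check trivially, since every() on an empty array returns true.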

We provide a PostgreSQL database and a seed file. The seed data is randomized such that:

Some Shifts are claimed
Some Workers are inactive
Some Facilities are inactive
Some Workers don't have all of a facility's required documents

Challenge expectations

Provide a RESTful HTTP server (or another interchange format if you think it's a better match) with the following:

Risk mitigation through proper testing
Proper error handling and logging
A brief writeup on how you would improve the performance of the endpoint with a justification of why it would perform better than your submission
(Bonus) Measure the performance of your endpoint and provide a brief report in a PERFORMANCE.md file

Seeding your database

We provide a folder called seed, which contains a docker-compose.yaml file that helps you set up a database. It is a PostgreSQL database seeded with about 2 million records.

To set it up, go into the seed folder and execute the command docker compose up --build. Once seeded, do not stop docker-compose. Keep the database running and use your framework of choice to connect to it using the database URL postgres://postgres:postgres@localhost:5432/postgres.

The seed script inserts a lot of workers. Among those workers, three fulfill all document requirements; they all have one of the professions. The seed script prints their IDs and professions at the end so you can verify them against your query.

Submission

Please submit your solution by creating a pull request (PR) on this repository. Do not merge your PR. Instead, please return to your Hatchways assessment page to confirm your submission.

Solution

Caveats

I modified seed.ts so that:

It generates 200k records instead of 2 million. PostgreSQL in Docker was failing with code: SqlState(E53100), message: could not resize shared memory segment
It distributes shifts across a 5-year time span, which is better for UI display. Previously it generated shifts in the past, fixed to a specific month

I modified docker-compose.yml so that:

It no longer runs seed.ts on top of existing data every time the container is (re)created, which duplicated records. It now resets the database and then seeds it
I had to run prisma generate from /seed to generate the Prisma client types. This might be worth including in the README

Shift Service

Finds available shifts for a given workerId. Internally, there are 2 implementations/strategies:

Memory: fetches the worker's docs and the facility requirements independently and diffs them in memory. Triggered by passing strategy: 'memory' (default)
Raw: performs a raw SQL query using join + intersect to diff docs. Triggered by passing strategy: 'raw'
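The in-memory diff can be sketched roughly as follows. This is an illustration of the idea under assumptions, not the code in the repository: RequirementRow and facilitiesWithAllDocs are hypothetical names, and the inputs are assumed to be the rows already fetched by two independent queries.

```typescript
// Hypothetical row shape for a facility requirement fetched from the database.
interface RequirementRow { facilityId: number; documentId: number }

// Given the candidate facilities, all requirement rows for them, and the
// document ids the worker holds, return the facilities whose requirements
// are fully covered. Facilities with no requirement rows are kept.
function facilitiesWithAllDocs(
  facilityIds: number[],
  requirements: RequirementRow[],
  workerDocumentIds: number[],
): number[] {
  const held = new Set(workerDocumentIds);
  const blocked = new Set<number>();
  for (const row of requirements) {
    // A single missing document disqualifies the whole facility.
    if (!held.has(row.documentId)) blocked.add(row.facilityId);
  }
  return facilityIds.filter((id) => !blocked.has(id));
}

const exampleIds = facilitiesWithAllDocs(
  [1, 2, 3],
  [
    { facilityId: 2, documentId: 10 },
    { facilityId: 3, documentId: 10 },
    { facilityId: 3, documentId: 11 },
  ],
  [10],
); // [1, 2]
```

The sets keep the diff linear in the number of requirement rows, which is one plausible reason this approach can compete with a join-based query.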

Testing

I believe the exercise can't be tested deterministically as presented. Even though there are 3 deterministic workers who hold all documents, that does not imply there will be shifts available to them. Given the sheer number of shifts generated, one would expect some to be available, but this is not guaranteed, so a test relying on it would be flaky.

Thus, I seed/teardown additional test data before each suite:

Use case 1: Shift1 for Facility1 which requires no docs, Worker1 has no docs -> Shift1 should exist in results for Worker1
Use case 2: Shift2 for Facility2 which requires Doc1, Worker2 has Doc1 -> Shift1 and Shift2 should exist in results for Worker2
Run with npm test, which runs both the service unit tests and the server e2e tests. The API server does not need to be running because the tests inject requests

Api Server

Fastify server exposing the endpoint. TODO: Properly separate into modules/routes/controllers

Run with npm run api:dev -> http://localhost:3000/api/shifts?workerId=101

Web Server

A short demo that showcases finding shifts by worker ID and displaying them in calendar format

cd into /web and run npm run dev -> http://localhost:5173/

About Performance

service

There's a simple explicit measurement when hitting the /api/shifts endpoint: it measures the time spent in the invoked shift service call and adds it as { meta: { ts: number }} in the response. Of course, this does not scale, as performance is really a cross-cutting concern.
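A minimal sketch of this kind of measurement, assuming the handler wraps a single async service call. The withTiming helper and the data key are hypothetical; only the { meta: { ts } } shape comes from the description above, and ts is assumed to be elapsed milliseconds.

```typescript
// Wraps an async call and reports how long it took, in milliseconds.
// Date.now() keeps the sketch portable; performance.now() would give
// higher resolution where it is available.
async function withTiming<T>(
  fn: () => Promise<T>,
): Promise<{ data: T; meta: { ts: number } }> {
  const start = Date.now();
  const data = await fn();
  return { data, meta: { ts: Date.now() - start } };
}

// Usage with a stand-in for the shift service call (hypothetical):
// const body = await withTiming(() => shiftService.findAvailable(workerId));
```

Because the wrapper has to be applied by hand at each call site, it illustrates why instrumentation at the framework or tracing level is a better long-term fit.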

e2e

Measures the /api/shifts endpoint using both the memory and raw implementations. In my runs, the memory implementation appears to win on latency, requests, and throughput

Ensure api server is running
Run with npm run benchmark

General Performance Improvements

Add real observability via instrumentation to measure function calls and chokepoints, e.g. OpenTelemetry
Add caching at the application level, e.g. Redis
Optimize SQL queries
Use materialized views in the db, so that entities with is_deleted: true or is_active: false are filtered out up front and never considered at query time
Use stored procedures in the db
Partition the db on distinct fields that could help segment the data, e.g. facility location
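To illustrate the application-level caching idea from the list above, here is a small in-process TTL cache used as a stand-in for Redis. Everything in it is an assumption for illustration: the TtlCache class, the worker:101 key format, and the TTL value are hypothetical, and a real deployment would also need an invalidation strategy for when shifts are claimed.

```typescript
// A tiny time-to-live cache. Entries expire ttlMs after they are set.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Read-through helper: return the cached value or load and store it.
async function cached<V>(
  cache: TtlCache<V>,
  key: string,
  load: () => Promise<V>,
): Promise<V> {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = await load();
  cache.set(key, value);
  return value;
}
```

A short TTL trades freshness for load: a shift claimed by another worker could still appear in a cached result until the entry expires.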

gblejman / clipboard-health-demo