americanexpress / jest-image-snapshot

✨ Jest matcher for image comparisons. Most commonly used for visual regression testing.

Advice for handling visual discrepancies between Intel and Apple Silicon images

ReDrUm opened this issue

Does anyone have advice on how best to handle the visual discrepancies between an image produced on an Intel machine vs an Apple Silicon machine for visual regression testing?

I'm encountering differences in how drop shadows are rendered, enough to trip up jest-image-snapshot unless I set the pixel tolerance excessively high.

My setup runs Chromium via Puppeteer in a multi-arch Docker image based on Debian 12. Shadow-rendering differences seem to be common.

Has anyone tackled this problem successfully yet?
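
Not a full answer, but two documented jest-image-snapshot options may help before resorting to a huge pixel tolerance: blur and comparisonMethod: 'ssim', both of which damp down per-pixel rendering noise like anti-aliasing and soft shadows. A minimal sketch (the threshold values are illustrative, not a recommendation):

expect(screenshot).toMatchImageSnapshot({
  // Gaussian blur radius (in px) applied to both images before comparison;
  // softens sub-pixel shadow and anti-aliasing differences
  blur: 2,
  // structural similarity is less sensitive to isolated pixel noise
  // than the default pixelmatch comparison
  comparisonMethod: 'ssim',
  failureThreshold: 0.01,
  failureThresholdType: 'percent',
});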

There are also discrepancies between macOS and Ubuntu.

The font weight rendered on Ubuntu is lighter compared to macOS. This causes the tests to fail, as I use macOS locally but CI runs on Ubuntu.

Here is a comparison. Above is from macOS, below is from Ubuntu:

[screenshot: font-weight comparison, macOS above, Ubuntu below]

Hi, any updates for this issue? I also encounter discrepancies between Intel and Apple Silicon images and still have no idea how to handle this.

For me, what I ended up doing was creating different sets of snapshots for each OS/platform.

I've just ended up setting a failureThreshold option. 😢

expect(screenshot).toMatchImageSnapshot({
  failureThreshold: 0.005,
  failureThresholdType: 'percent'
});

reference: The LogRocket Blog Article
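
If you settle on a global tolerance like this, it can be configured once in a Jest setup file via the library's configureToMatchImageSnapshot export instead of being repeated in every assertion:

// jest.setup.js
const { configureToMatchImageSnapshot } = require('jest-image-snapshot');

const toMatchImageSnapshot = configureToMatchImageSnapshot({
  failureThreshold: 0.005,
  failureThresholdType: 'percent',
});

expect.extend({ toMatchImageSnapshot });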

I run the tests in an Ubuntu container using Docker for deterministic results.

Dockerfile:

FROM mcr.microsoft.com/playwright:v1.35.1-jammy

# Set the work directory for the script that follows
WORKDIR /test

# Copy visual-testing package.json
COPY package.json ./

# Install dependencies
RUN yarn

# Copy current source directory
COPY . .
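
(The scripts below assume this image has already been built and tagged, presumably with something like:)

docker build -t visual-testing .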

My yarn scripts:

  "scripts": {
    "test": "yarn stop && docker run --name visual-testing --network host --add-host=host.docker.internal:host-gateway -v ${PWD}/baseline-snapshots:/test/baseline-snapshots -v ${PWD}/failure-diffs:/test/failure-diffs --rm visual-testing yarn container:wait-then-test",
    "container:run-test": "yarn test-storybook --stories-json --ci --url http://host.docker.internal:6006",
    "container:wait-then-test": "yarn container:wait-for-storybook && yarn container:run-test",
    "container:wait-for-storybook": "yarn wait-on -i 5000 -t 600000 http://host.docker.internal:6006"
  },
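
The yarn stop referenced in the test script isn't defined above; a hypothetical version that simply removes any lingering container:

  "scripts": {
    "stop": "docker rm -f visual-testing || true"
  }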

We've opted for this pragmatic approach:

if (process.platform === 'darwin') {
  expect(pngBuffer).toMatchImageSnapshot({
    failureThreshold: 0.00009,
    failureThresholdType: 'percent'
  });
} else {
  expect(pngBuffer).toMatchImageSnapshot();
}

Not ideal, but since CI isn't darwin, nothing should slip through undetected.

https://github.com/justland/just-web-react/actions/runs/5792587155/job/15699083890

Expected image to match or be a close match to snapshot but was 0.06825086805555555% different from snapshot (629 differing pixels).

I have a case where the snapshot generated locally on Ubuntu under WSL doesn't match the one generated in CI.

process.platform is linux in both cases... 🤷

For me, what I ended up doing was creating different sets of snapshots for each OS/platform.

How did you do that?

I do this:

customSnapshotsDir: `${process.cwd()}/__snapshots__/${process.platform}`
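
Worth noting: process.platform is 'darwin' on both Intel and Apple Silicon Macs, so if those two produce different renderings (as reported above), folding the CPU architecture into the path may help. A sketch:

// process.arch distinguishes 'arm64' (Apple Silicon) from 'x64' (Intel)
customSnapshotsDir: `${process.cwd()}/__snapshots__/${process.platform}-${process.arch}`

It won't separate WSL from an x64 Linux CI runner, though, since both report linux/x64; an explicit environment variable is one way to split those.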

The rendering discrepancy causes text layout to shift and wrap depending on whether I run the tests locally (Apple Silicon) or in an Ubuntu CI pipeline, to the point that the difference between images exceeds 30%. That makes a custom failure threshold unhelpful: set high enough to pass, it will no longer catch smaller changes, which IMO is the whole point of image snapshot testing.

I am attempting to run them locally in a docker container now, but the issue I run into is that it maxes out the CPU and causes tests to start timing out (even with a generous test timeout).
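
If the container is CPU-bound, capping Jest's parallelism (ideally matched to a --cpus limit on docker run) may be more effective than raising timeouts. A sketch with illustrative values:

// jest.config.js
module.exports = {
  // limit concurrent test workers so screenshot rendering isn't starved
  maxWorkers: 2,
  // per-test timeout in ms (Jest's default is 5000)
  testTimeout: 60000,
};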