thorvg / thorvg

Thor Vector Graphics is a lightweight portable library used for drawing vector-based scenes and animations including SVG and Lottie. It can be freely utilized across various software platforms and applications to visualize graphical contents.

Home Page:https://www.thorvg.org

SwRender is stuck with the specific Lottie file

beicause opened this issue · comments

The Lottie file:
test10.json
Add this test case in ThorVG 0.13.3:

TEST_CASE("Animation Lottie10", "[tvgAnimation]")
{
    REQUIRE(Initializer::init(CanvasEngine::Sw, 0) == Result::Success);

    auto canvas = SwCanvas::gen();
    REQUIRE(canvas);
    uint32_t buffer[512*512];
    REQUIRE(canvas->target(buffer, 512, 512, 512, SwCanvas::Colorspace::ABGR8888S) == Result::Success);

    auto animation = Animation::gen();
    REQUIRE(animation);

    auto picture = animation->picture();
    REQUIRE(picture->identifier() == Picture::identifier());

    REQUIRE(canvas->push(cast(picture)) == Result::Success);

    REQUIRE(picture->load(TEST_DIR"/test10.json") == Result::Success);
    REQUIRE(animation->totalFrame() == Approx(180).margin(004004));
    REQUIRE(animation->curFrame() == 0);
    REQUIRE(animation->duration() == Approx(3).margin(004004));

    for(int i=1;i<180;i++){
        REQUIRE(animation->frame(i) == Result::Success);
        REQUIRE(canvas->update(picture) == Result::Success);
        REQUIRE(canvas->draw() == Result::Success);
        REQUIRE(canvas->sync() == Result::Success);
        REQUIRE(canvas->clear(false) == Result::Success);
    }
    REQUIRE(Initializer::term(CanvasEngine::Sw) == Result::Success);
}

Build with:

meson setup builddir -Dtests=true -Dextra=""
ninja -C builddir

It sometimes gets stuck when running the full suite (but does not seem to get stuck with -Dextra="lottie_expressions"):

./builddir/test/tvgUnitTests

It always gets stuck when running only this test case:

./builddir/test/tvgUnitTests "Animation Lottie10"

I don't know the exact reason. The Lottie file was downloaded from a website; I am not its author. The call stack shows it is stuck in the while loop in _cubicTo in tvgSwRle.cpp.

lottie2gif and thorvg-viewer work fine, but Export GIF in thorvg-viewer shows the error "Unable to save the Gif data."

I opened this as a separate issue from godotengine/godot#91580 (review)

Partially fixed since dfe8570 if the extra Lottie expressions are enabled.
Completely fixed since 7581b08

@beicause thanks for the update. Does it resolve the integration issue? godotengine/godot#91580

@hermet No, to my surprise, although this test case does pass. The issue also no longer seems to happen when launching the Godot program under a debugger; I have to run the program many times to reproduce it and then attach the debugger to the process. It is now stuck in _lineTo instead of _cubicTo in tvgSwRle.cpp.

Regressed after 4240998

@beicause Hi, thanks for your detailed information. There was a mistake that broke parameter passing. I could reproduce the issue with v0.13.3 using your test case, but it should no longer happen with this fix: 7581b08.

The issue is no longer present in v0.13.x or the main branch on my side; I couldn't reproduce it. Could you please double-check whether the issue still occurs on the main branch on your side? Also, I can assure you that 4240998 is not related to this issue at all.

I checked again and also tried compiling with clang-17 (it was compiled with gcc 11 previously).

With clang-17 the issue always occurs and is not fixed by 7581b08, and compilation is much slower than with gcc(?).

C++ compiler for the host machine: clang++-17 (clang 17.0.6 "Ubuntu clang version 17.0.6 (++20231209124227+6009708b4367-1~exp1~20231209124336.77)")
C++ linker for the host machine: clang++-17 ld.lld 17.0.6

7581b08 fixes the issue with gcc-11, except with the options -Dextra="lottie_expressions" -Dthreads=true. And I'm sure 4240998 is a regression for other config options.

C++ compiler for the host machine: ccache c++ (gcc 11.4.0 "c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
C++ linker for the host machine: c++ ld.lld 12.0.9

@beicause Could you please size down the stack and try it again? I only had a stack overflow crash with MSVC.

// need to size down
// uint32_t buffer[512*512];
// REQUIRE(canvas->target(buffer, 512, 512, 512, SwCanvas::Colorspace::ABGR8888S) == Result::Success);

uint32_t buffer[100*100];
REQUIRE(canvas->target(buffer, 100, 100, 100, SwCanvas::Colorspace::ABGR8888S) == Result::Success);

@hermet The size doesn't matter. It doesn't crash on Linux because the default stack is large enough. Using malloc gives the same result.
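
For reference, the 512x512 buffer in the test case occupies 512 * 512 * 4 bytes = 1 MiB, which is the same as the default 1 MB stack size used by the MSVC linker, so a stack overflow there is plausible. Below is a minimal sketch of the heap-allocated alternative, using std::vector rather than malloc. Both `makeTargetBuffer` and `bufferBytes` are hypothetical helpers written for illustration; they are not part of ThorVG.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a ThorVG API): allocates a zero-initialized
// 32-bit-per-pixel buffer on the heap instead of the stack.
std::vector<std::uint32_t> makeTargetBuffer(std::uint32_t width, std::uint32_t height)
{
    return std::vector<std::uint32_t>(static_cast<std::size_t>(width) * height, 0u);
}

// Hypothetical helper: size in bytes of such a buffer (4 bytes per pixel).
std::size_t bufferBytes(std::uint32_t width, std::uint32_t height)
{
    return static_cast<std::size_t>(width) * height * sizeof(std::uint32_t);
}
```

In the test case this would mean replacing `uint32_t buffer[512*512];` with `auto buffer = makeTargetBuffer(512, 512);` and passing `buffer.data()` as the first argument of `canvas->target(...)`.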

@beicause Could you please double-check with -Db_sanitize="address,undefined" or -Db_sanitize="thread" on the v0.13.x branch?

Also, you can try with this:

    REQUIRE(canvas->push(cast(picture)) == Result::Success);
    canvas->sync();          // <-- the only suspicious point in the sample; add this call and test again.
    REQUIRE(picture->load(TEST_DIR"/test10.json") == Result::Success);

@hermet Well, both -Db_sanitize="address,undefined" and -Db_sanitize="thread" work, although -Db_sanitize="thread" is extremely slow.

Adding canvas->sync() has no effect.

With clang I get the following warning and a linker error for undefined symbols.

meson.build:209: WARNING: Trying to use thread sanitizer on Clang with b_lundef.
This will probably not work.

After adding -Db_lundef=false, the build compiled with clang works, too.

@beicause So there are no sanitizer violation reports at all?

@hermet Yeah, the test passes without any sanitizer report.

@beicause Hundreds of complex animations are working fine on Windows/macOS/Ubuntu/web and in our CI automation tests. I don't see the cause yet. Maybe this is the last request for help.

  1. Is the call stack exactly the same whenever it crashes?
  2. Could you please try using another sample resource (JSON) and see if the issue occurs the same way?
  3. I wonder if you can reproduce the issue on another machine.

@mgrudzinska could you please try to reproduce this issue on your side?

@hermet Yes, I am able to reproduce the bug. The commits mentioned in this discussion did not change anything. However, in my case the program does not hang during the test (it passes each time). To reproduce it, I need to disable threads in ThorVG and then run the Lottie.cpp example under a heavy load, sometimes with 3 files and sometimes with 100, but they have to be big (e.g., traveling.json). None of the sanitizers report anything. An infinite number of runs from SwShapeTasks starts executing...

@mgrudzinska Ok, thanks, but that sounds like a different story. It's not stuck; it just can't draw anything with the toolkit (EFL) because it can't complete the drawing in a minimal time. It's simply too slow due to the super heavy tasks. That's why the sanitizers can't catch any issue: nothing is logically wrong. The issue must be reproduced with the test, which doesn't depend on any toolkit, as he argued. I also expect the symptom to be something like broken data, as @beicause mentioned: godotengine/godot#91580 (review)

btw, could you please share those 3 files ?

@mgrudzinska I can run examples/Lottie.cpp under heavy load, and the application stops responding (ANR) after a few seconds. I'm not sure whether it is related to this issue, because the ANR also occurs without test10.json.

@hermet So far I haven't reproduced it with any other Lottie files.

I opened my test10.json in https://creator.lottiefiles.com and it reports unsupported features: Shape gradient fill, Shape trim. After forcing the import and exporting the file, the exported one also has the bug (I failed to edit it because of errors).

I caught the errors with -Db_sanitize="undefined" instead of -Db_sanitize="address,undefined":

Filters: Animation Lottie10
../src/renderer/sw_engine/tvgSwCommon.h:74:19: runtime error: signed integer overflow: -9223372036854775808 - 9780 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:74:30: runtime error: signed integer overflow: -9223372036854775808 - 13044 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:74:19: runtime error: signed integer overflow: 7520 - -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:74:30: runtime error: signed integer overflow: 11051 - -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwMath.cpp:228:12: runtime error: signed integer overflow: -9223372036854775808 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwMath.cpp:238:12: runtime error: signed integer overflow: -9223372036854775808 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwRle.cpp:301:40: runtime error: signed integer overflow: 6917529027641052588 * 3 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwMath.cpp:226:32: runtime error: signed integer overflow: 6917529027641105376 + 6917529027641100428 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwMath.cpp:236:32: runtime error: signed integer overflow: 6917529027641109884 + 6917529027641103344 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwRle.cpp:301:67: runtime error: signed integer overflow: 3098476543630908112 * 3 cannot be represented in type 'long int'
^C

The file tested above is test10_edited.json, exported from Lottie Creator. The original file passes the test, but also produces errors:

Filters: Animation Lottie10
../src/renderer/sw_engine/tvgSwCommon.h:74:19: runtime error: signed integer overflow: -9223372036854775808 - 24236 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:74:30: runtime error: signed integer overflow: -9223372036854775808 - 12518 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:62:11: runtime error: signed integer overflow: -368 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:62:11: runtime error: signed integer overflow: -368 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:62:11: runtime error: signed integer overflow: -368 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:63:11: runtime error: signed integer overflow: -368 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:63:11: runtime error: signed integer overflow: -368 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:63:11: runtime error: signed integer overflow: -368 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:74:19: runtime error: signed integer overflow: 24314 - -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:74:30: runtime error: signed integer overflow: 17214 - -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:63:11: runtime error: signed integer overflow: -368 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:74:19: runtime error: signed integer overflow: 9223372036854775440 - -9223372036854775440 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:74:30: runtime error: signed integer overflow: -9223372036854775440 - 9223372036854775440 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:63:11: runtime error: signed integer overflow: -368 + -9223372036854775808 cannot be represented in type 'long int'
../src/renderer/sw_engine/tvgSwCommon.h:62:11: runtime error: signed integer overflow: -368 + -9223372036854775808 cannot be represented in type 'long int'
===============================================================================
All tests passed (906 assertions in 1 test case)
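
Most of the subtractions and additions reported above involve an operand equal to the smallest 64-bit signed value, so the result cannot be represented in 'long int'. As a minimal sketch of how such an operation could be guarded (this is not ThorVG's code; `checkedSub` is a hypothetical helper and 64-bit coordinates are assumed, as in the log), an overflow-checked subtraction might look like this:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from ThorVG): computes a - b into `out` and
// returns true, or returns false without touching `out` if the exact
// result does not fit in a signed 64-bit integer.
bool checkedSub(std::int64_t a, std::int64_t b, std::int64_t& out)
{
    const std::int64_t lo = std::numeric_limits<std::int64_t>::min();
    const std::int64_t hi = std::numeric_limits<std::int64_t>::max();

    if (b > 0 && a < lo + b) return false;  // a - b would fall below the minimum
    if (b < 0 && a > hi + b) return false;  // a - b would exceed the maximum

    out = a - b;
    return true;
}
```

With the operands from the first reported line, `checkedSub(INT64_MIN, 9780, r)` returns false instead of invoking undefined behavior, which gives the caller a chance to reject the corrupted coordinate.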

Regarding the request to share those 3 files: any big ones will work. For traveling.json I hit the problem with only 3 files, but my IDE is currently consuming a lot of resources after an upgrade, so 3 may not be enough to reproduce it on other machines.

So far I have not been able to reproduce the exact problem described here; I tried on a Mac and in a Linux VM.

I reproduced it in the VM with sanitize=undefined, as @beicause suggested.

(Screenshot, 2024-05-24 23:05:44)

The problem is the interpolation between closed and open Bezier curves, which was not handled in ThorVG.

@beicause please confirm that this solves the issue

@mgrudzinska Yes, it fixes this issue and the Godot integration problem as well: godotengine/godot#91580

@beicause @mgrudzinska The fix will be applied in ThorVG v0.13.6. Thanks for your cooperation.
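
The thread does not show the fix itself. Purely as an illustration of the interpolation problem identified above, and assuming (this is not stated in the thread) that a closed and an open path keyframe can carry different numbers of points, the sketch below blends only the points that both keyframes have, so a count mismatch can never lead to reading past the end of the shorter list. `Point` and `lerpPoints` are hypothetical names, not ThorVG types or functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical 2D point type used only for this sketch.
struct Point { float x, y; };

// Hypothetical helper: linearly interpolates two point lists at t in [0, 1].
// Only indices present in both lists are blended; any extra points in the
// longer list are ignored rather than read out of range.
std::vector<Point> lerpPoints(const std::vector<Point>& from,
                              const std::vector<Point>& to, float t)
{
    const std::size_t n = std::min(from.size(), to.size());
    std::vector<Point> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back({from[i].x + (to[i].x - from[i].x) * t,
                       from[i].y + (to[i].y - from[i].y) * t});
    }
    return out;
}
```

Dropping the unmatched points is only one possible policy; a real implementation might instead duplicate the closing point or refuse to interpolate such keyframes.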