ExOK / Celeste64

A game made by the Celeste developers in a week(ish, closer to 2)

Camera FieldOfView can be NaN

NoelFB opened this issue · comments

Error Log (1/29/2024 11:15:25 PM)
Call Stack:
System.ArgumentOutOfRangeException: fieldOfView ('NaN') must be greater than '0'. (Parameter 'fieldOfView')
Actual value was NaN.
   at System.ArgumentOutOfRangeException.ThrowLessEqual[T](T, T, String)
   at Celeste64.Camera.get_Projection() in /home/noel/Projects/Celeste64/Source/Graphics/Camera.cs:line 128
   at Celeste64.World.Render(Target) in /home/noel/Projects/Celeste64/Source/Scenes/World.cs:line 633
   at Celeste64.Game.Render() in /home/noel/Projects/Celeste64/Source/Game.cs:line 288
   at Foster.Framework.App.Tick()
   at Foster.Framework.App.Run(String, Int32, Int32, Boolean )
   at Foster.Framework.App.Run[T](String, Int32, Int32, Boolean )
   at Celeste64.Program.Main(String[]) in /home/noel/Projects/Celeste64/Source/Program.cs:line 23
Game Output:
Celeste 64 v.1.0.1
Foster: v0.1.14
Platform: Microsoft Windows 10.0.19045 (X64)
Framework: .NET 8.0.1
SDL: v2.28.5
OpenGL: v3.3.13399 Core Profile Forward-Compatible Context 15.201.1151.1008, AMD Radeon HD 6450
FMOD Bindings: v20218
FMOD: v20207
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\Master.strings.bank
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\Master.bank
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\music.bank
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\sfx.bank
Loaded Assets in 1758ms
Strawb Count: 20
Loaded Map '1' in 268ms

The only place FieldOfView is ever assigned is here in Player.cs

float targetFOV = Calc.ClampedMap(velocity.XY().Length(), MaxSpeed * 1.2f, 120, 1, 1.2f);
World.Camera.FOVMultiplier = Calc.Approach(World.Camera.FOVMultiplier, targetFOV, Time.Delta / 4);

The only way I could see NaN being assigned is if velocity is also somehow maybe NaN? Weird.
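
A minimal sketch of how a single NaN in the velocity would reach the camera. `NanDemo` and its `ClampedMap` are hypothetical stand-ins written for illustration, not Foster's actual `Calc.ClampedMap`, and the speed value used with them is arbitrary. Only `Math.Clamp` and `Vector2.Length` are real .NET calls. Since every comparison against NaN is false, clamping does not sanitize it:

```csharp
using System;
using System.Numerics;

public static class NanDemo
{
    // Hypothetical stand-in for a clamp-and-remap helper (not Foster's code).
    public static float ClampedMap(float val, float min, float max, float newMin, float newMax)
    {
        // Math.Clamp returns NaN when its input is NaN, because NaN fails both range checks.
        float t = Math.Clamp((val - min) / (max - min), 0f, 1f);
        return newMin + (newMax - newMin) * t;
    }

    // Maps an XY speed to an FOV multiplier between 1 and 1.2.
    public static float FovFor(Vector2 velocityXY, float maxSpeed)
        => ClampedMap(velocityXY.Length(), maxSpeed * 1.2f, 120f, 1f, 1.2f);
}
```

Under these assumptions, `NanDemo.FovFor(new Vector2(float.NaN, 0f), 64f)` is NaN, so a NaN velocity would be enough to produce the exception in the log.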

commented

Gonna share some progress debugging this crash.

I managed to narrow down the NaN appearance in one crash instance to this block of code in StNormalUpdate:

// movement
{
    var velXY = velocity.XY();
    if (Controls.Move.Value == Vec2.Zero || tNoMove > 0)
    {
        // if not moving, simply apply friction
        float fric = Friction;
        if (!onGround)
            fric *= AirFrictionMult;
        // friction
        Calc.Approach(ref velXY, Vec2.Zero, fric * Time.Delta);
    }
    else if (onGround)
    {
        float max = MaxSpeed;
        // change max speed based on ground slope angle
        if (groundNormal != Vec3.UnitZ)
        {
            float slopeDot = 1 - Calc.Clamp(Vec3.Dot(groundNormal, Vec3.UnitZ), 0, 1);
            slopeDot *= Vec2.Dot(groundNormal.XY().Normalized(), targetFacing) * 2;
            max += max * slopeDot;
        }
        // trueMax is the max XY speed before applying analog stick magnitude
        float trueMax = max;
        // apply analog stick magnitude
        {
            float mag = Calc.ClampedMap(Controls.Move.Value.Length(), .4f, .92f, .3f, 1);
            max *= mag;
        }
        var input = RelativeMoveInput;
        // TODO: Solve this way better! Ugh I hate this!!
        // move lightly away from ledges by checking for no floor, and then sweeping in until we find floor
        // Please don't look at this code
        // if I had more time to solve this nicely I would do something else
        {
            var d = 4;
            if (input != Vec2.Zero &&
                !World.SolidRayCast(Position + new Vec3(input, 1) * d, -Vec3.UnitZ, 8, out var hit) &&
                !World.SolidRayCast(Position + new Vec3(0, 0, d), new Vec3(input, 0), d, out hit))
            {
                var left = Calc.AngleToVector(Calc.Angle(input) + 0.3f);
                var right = Calc.AngleToVector(Calc.Angle(input) - 0.3f);
                var count = 0;
                if (World.SolidRayCast(Position + new Vec3(left, 1) * d, -Vec3.UnitZ, 8, out hit))
                {
                    while (World.SolidRayCast(Position + new Vec3(left, 1) * d, -Vec3.UnitZ, 8, out hit) && count++ < 10)
                        left = Calc.AngleToVector(Calc.Angle(left) - 0.1f);
                    input = Calc.AngleToVector(Calc.Angle(left) + 0.1f); ;
                }
                else if (World.SolidRayCast(Position + new Vec3(right, 1) * d, -Vec3.UnitZ, 8, out hit))
                {
                    while (World.SolidRayCast(Position + new Vec3(right, 1) * d, -Vec3.UnitZ, 8, out hit) && count++ < 10)
                        right = Calc.AngleToVector(Calc.Angle(right) + 0.1f);
                    input = Calc.AngleToVector(Calc.Angle(right) - 0.1f); ;
                }
            }
        }
        // if travelling faster than our "true max" (ie. our max not accounting for analog stick magnitude),
        // then we switch into a slower decceleration to help the player preserve high speeds
        float accel;
        if (velXY.LengthSquared() >= trueMax * trueMax && Vec2.Dot(input, velXY) >= .7f)
            accel = PastMaxDeccel;
        else
            accel = Acceleration;
        // if our XY velocity is above the Rotate Threshold, then our XY velocity begins rotating
        // instead of using a simple approach to accelerate
        if (velXY.LengthSquared() >= RotateThreshold * RotateThreshold)
        {
            if (Vec2.Dot(input, velXY.Normalized()) <= SkidDotThreshold)
            {
                Facing = targetFacing = input;
                stateMachine.State = States.Skidding;
                return;
            }
            else
            {
                // Rotate speed is less when travelling above our "true max" speed
                // this gives high speeds less fine control
                float rotate;
                if (velXY.LengthSquared() > trueMax * trueMax)
                    rotate = RotateSpeedAboveMax;
                else
                    rotate = RotateSpeed;
                targetFacing = Calc.RotateToward(targetFacing, input, rotate * Time.Delta, 0);
                velXY = targetFacing * Calc.Approach(velXY.Length(), max, accel * Time.Delta);
            }
        }
        else
        {
            // if we're below the RotateThreshold, acceleration is very simple
            Calc.Approach(ref velXY, input * max, accel * Time.Delta);
            targetFacing = input.Normalized();
        }
    }
    else
    {
        float accel;
        if (velXY.LengthSquared() >= MaxSpeed * MaxSpeed && Vec2.Dot(RelativeMoveInput.Normalized(), velXY.Normalized()) >= .7f)
        {
            accel = PastMaxDeccel;
            var dot = Vec2.Dot(RelativeMoveInput.Normalized(), targetFacing);
            accel *= Calc.ClampedMap(dot, -1, 1, AirAccelMultMax, AirAccelMultMin);
        }
        else
        {
            accel = Acceleration;
            var dot = Vec2.Dot(RelativeMoveInput.Normalized(), targetFacing);
            accel *= Calc.ClampedMap(dot, -1, 1, AirAccelMultMin, AirAccelMultMax);
        }
        Calc.Approach(ref velXY, RelativeMoveInput * MaxSpeed, accel * Time.Delta);
    }
    velocity = velocity.WithXY(velXY);
}

velXY is being set to NaN somewhere here.

The person says they're crashing whenever they try to move around, but jumping works fine - so does moving the camera. Dashing causes the game to freeze, both grounded and airborne. They use keyboard, and not analog (I even completely commented out the AddLeftJoystick and AddDPad calls - it still crashes.)

Feels like Move.Value is somehow NaN?

commented

Good news; Move.Value does not contain NaN. Building and running with the below patch does not log that Controls.Move.Value has NaN.

diff --git a/Source/Game.cs b/Source/Game.cs
index 89c0fe8..a5fb4f9 100644
--- a/Source/Game.cs
+++ b/Source/Game.cs
@@ -113,6 +113,9 @@ public class Game : Module

        public override void Update()
        {
+               if (VectorHelpers.HasNaN(Controls.Move.Value))
+                       Log.Error($"{nameof(Controls.Move.Value)} contains NaN! Things are about to go very wrong! ({Controls.Move.Value})");
+
                // update top scene
                if (scenes.TryPeek(out var scene))
                {
diff --git a/Source/Helpers/VectorHelpers.cs b/Source/Helpers/VectorHelpers.cs
new file mode 100644
index 0000000..173d09a
--- /dev/null
+++ b/Source/Helpers/VectorHelpers.cs
@@ -0,0 +1,9 @@
+namespace Celeste64;
+public static class VectorHelpers
+{
+    public static bool HasNaN(this in Vec2 vec2)
+        => float.IsNaN(vec2.X) || float.IsNaN(vec2.Y);
+
+    public static bool HasNaN(this in Vec3 vec3)
+        => float.IsNaN(vec3.X) || float.IsNaN(vec3.Y) || float.IsNaN(vec3.Z);
+}

Thanks for the investigation! This should narrow it down a lot ...

commented

Found the source of the NaN: it's RelativeMoveInput. (So I was almost there; I considered checking RelativeMoveInput directly but decided against it, as I thought that since Controls.Move.Value is fine, RelativeMoveInput would also be fine.)

I added assertions to every parameter when calculating velXY, logging if it contains NaN. input is NaN when grounded; RelativeMoveInput (and subsequently accel) is NaN when airborne.
I'd take it home from there, but it's currently late; figured I'd share the progress. I plan to make a PR if I get to the bottom of this issue.

commented

Somehow the NaN surfaces from normalizing the camera's XY components?? This doesn't make any sense..

diff --git a/Source/Actors/Player.cs b/Source/Actors/Player.cs
index d0482d6..03950ae 100644
--- a/Source/Actors/Player.cs
+++ b/Source/Actors/Player.cs
@@ -611,7 +611,15 @@ public class Player : Actor, IHaveModels, IHaveSprites, IRidePlatforms, ICastPoi
                        if (Vec2.Dot(input, Vec2.UnitY) >= .985f)
                                input = Vec2.UnitY;

-                       return forward * input.Y + side * input.X;
+                       Vec2 ret = forward * input.Y + side * input.X;
+                       if (ret.HasNaN())
+                       {
+                               ret.AssertNotNaN($"{nameof(RelativeMoveInput)} contains NaN!");
+                               Log.Error($"{nameof(World.Camera.Forward)} = {World.Camera.Forward}");
+                               Log.Error($"{nameof(forward)} = {forward} (normalized)");
+                               Log.Error($"{nameof(side)} = {side} (normalized)");
+                       }
+                       return ret;
                }
        }

Running this yields these logs:

RelativeMoveInput contains NaN! (<NaN, NaN>)
Forward = <-0.10875643, -0.887422, -0.44794446>
forward = <0.12164313, -Infinity> (normalized)
side = <Infinity, NaN> (normalized)

This is not a one-off, here's what happens when the camera is not moved:

RelativeMoveInput contains NaN! (<NaN, NaN>)
Forward = <-3.9004448E-08, 0.89231783, -0.45140773>
forward = <-4.3711385E-08, Infinity> (normalized)
side = <-Infinity, NaN> (normalized)

Hmm we do use a custom Normalize method, but all it does is check for x/y being 0 and return 0 in that case. Maybe we need to use an epsilon?

    public static Vector2 Normalized(this Vector2 vector)
    {
        if (MathF.Abs(vector.X) <= float.Epsilon && MathF.Abs(vector.Y) <= float.Epsilon)
            return Vector2.Zero;

        return Vector2.Normalize(vector);
    }

I'm not sure what else would be the cause... but that doesn't entirely make sense because the input values aren't both near zero (only X in the example above).

Getting infinite/NaN values is definitely expected for very small values (Vector2.Normalize(new(float.Epsilon, 0)) produces <∞, NaN> for instance), but the values Snip's test subject is getting this for are definitely not in a range where that should happen (and it doesn't if I test with those values).

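Note that even when this is behaving normally, float.Epsilon is definitely too small as a guard to fully avoid this. In my testing this starts happening when the vector components are somewhere in the 1e-23 range (for reference, float.Epsilon is about 1.4e-45). That fits the arithmetic: squaring a component near 1e-23 gives roughly 1e-46, which underflows to zero in single precision, so the computed length is zero.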
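
The small-value behavior described above can be reproduced with plain scalar float arithmetic, independent of any Vector2 code path. This is only a sketch: `UnderflowDemo` and its `Threshold` value are illustrative assumptions, not something taken from Celeste64 or Foster.

```csharp
using System;
using System.Numerics;

public static class UnderflowDemo
{
    // Illustrative cutoff (an assumption): lengths at or below this are treated as zero.
    public const float Threshold = 1e-12f;

    // Naive normalization: divide each component by the length.
    public static Vector2 NaiveNormalize(Vector2 v)
    {
        // For components near 1e-23 the squares (about 1e-46) round to zero,
        // so length is 0 and the divisions below yield Infinity or NaN.
        float length = MathF.Sqrt(v.X * v.X + v.Y * v.Y);
        return new Vector2(v.X / length, v.Y / length);
    }

    // Guarded version: compares the computed length, not each component, to a cutoff.
    public static Vector2 GuardedNormalize(Vector2 v)
    {
        float length = MathF.Sqrt(v.X * v.X + v.Y * v.Y);
        return length <= Threshold ? Vector2.Zero : new Vector2(v.X / length, v.Y / length);
    }
}
```

Checking the length rather than the individual components means the guard triggers exactly when the division would misbehave, which a per-component comparison against float.Epsilon does not guarantee.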

commented

Could it be processor-dependent? The person is running on a relatively old CPU (AMD Phenom II X4 955), and Vector2 operations' assembly leverages AVX instructions; this sounds kind of crazy but it's the only thing floating on my mind.

It might make sense as I've only seen two people experience this crash in the Celeste server.

commented

Got bored and decided to make a visualization;
Yellow is input, green is expected and red is actual. (The red vector's angle with the -Y axis is not to scale, for demonstration's sake.)
image

if we do our own Normal calculation does that work for them? what the hell

return vector / MathF.Sqrt(vector.X * vector.X + vector.Y * vector.Y);
commented

Will try tomorrow. 👍

commented

For those who are following this issue:

It turns out my theory wasn't in fact so crazy. A .NET Runtime bug causes the JIT to emit wrong ASM instructions when calling Vector2.Normalize(Vector2) on CPUs which don't support AVX, making the normalization result incorrect. For example, on an AMD Phenom II X4 955, this causes the normalized vector's Y component to be +/- Infinity.

The solution here is to do a manual normalization.

I've forked Foster and made it fall back to manual normalization, if any component of the normalized vector is not finite.
I've also included some camera debug metrics in the top left if Celeste64 is built in debug mode.
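
The fallback idea can be sketched roughly as follows. This is not the code from the fork: `SafeNormalize` and its method names are hypothetical, and the zero-length check in `Manual` is an added assumption. It relies only on `Vector2.Normalize`, `float.IsFinite`, and `MathF.Sqrt` from .NET.

```csharp
using System;
using System.Numerics;

public static class SafeNormalize
{
    // Manual normalization using scalar math only.
    public static Vector2 Manual(Vector2 v)
    {
        float length = MathF.Sqrt(v.X * v.X + v.Y * v.Y);
        // Assumption: a zero-length vector normalizes to zero instead of NaN.
        return length > 0f ? new Vector2(v.X / length, v.Y / length) : Vector2.Zero;
    }

    // Tries the built-in routine first; falls back when any component is Infinity or NaN.
    public static Vector2 Normalized(Vector2 v)
    {
        Vector2 result = Vector2.Normalize(v);
        if (float.IsFinite(result.X) && float.IsFinite(result.Y))
            return result;
        return Manual(v);
    }
}
```

On a machine where `Vector2.Normalize` behaves correctly, the fallback would only run for degenerate inputs such as the zero vector, so the extra finiteness check should cost little in the common case.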

I sent the built version to my tester (thank you Tanya!) and they confirmed it worked for them. I'll soon make a release on my fork including the fix.


EDIT: The release is out. I hope this'll help those affected until the .NET bug is fixed.

Wow, thanks for looking into this and figuring it out!

I guess at this point it's worth submitting findings to .NET runtime? We could add a hack to use manual normalization if it fails but I don't really want to do that on our end if it's something that can be resolved upstream... Unless .NET team decide it's not worth fixing for some reason.

commented

@Popax21 said he'd make an issue soon, but not right now as he's currently pretty busy.
I'd rather let him do the talking as I'm not exactly very qualified to be talking about this stuff.

It would probably be a good idea to figure out the earliest .NET version in which this bug was introduced while I am still out of action. I'll try to prepare a writeup for a potential .NET bug report once I have the time for it.

This might be related to issues seen on ARM64 Release Mode: #70 (comment)

commented

The fork fell a bit out of date, so I pulled the changes and published the 1.1.1 release + a few extra commits; remembering to publish new platforms as well. (unfortunately I don't have any machines to test the releases on, so fingers crossed)

So this is CPU dependent? huh interesting(i have a phenom 2😭)

commented

[...] For example, on an AMD Phenom X4 955, this causes the normalized vector's Y component to be +/- Infinity.

Whoops, I meant to say "AMD Phenom II X4 955", but just noticed I forgot to put the "II" in.
Regardless, my Celeste64 fork should still make the game playable for you.