dotnet / dotNext

Next generation API for .NET

Home Page:https://dotnet.github.io/dotNext/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[API Proposoal]: Additional methods for "DotNext.Span"

CodingMadness opened this issue · comments

Heyo,

I have written some "SpanExtension" methods cause its needed to be optimized for rendering logic and I thought maybe they could go inside your DotNext.Span namespace :)

I will post a github gist "Utils" class where i hord those functions needed for my 2D game.

This gist contains 3 core methods:

  • span.MoveBy(Range, int) ---------> moves a specified range within the span by a value in either LEFT or RIGHT direction
    0 > value = left direction;
    0 < value = rigth direction;

  • span.Move2(Range, int) ---------> moves a specified range within the span to a specific index

  • span.Swap(this ROS self, ROS x, ROS y); ---------> swaps 2 slices from the same span
    without extra heapbased copies.

The other 2 methods are just, 1 for simple (not fully fledged tests, did not have time for it yet..) and the other is just a direct wrapping function of Swap(...) to work easier with ROS-types.

this gist here contains the "SpanInfo" type building all the nessecary information pre-hand, so its easier to control what is needed and when to cuse them, debloats heavily the Swap() function

I really hope that you have a look upon them and tell me if they are useful and when not, also tell me pls why ^^

All good 2 you and ur family and really really nice job you did with DotNext!

Cheers!

ReadOnlySpan<T> cannot be modified in general. Unsafe operation doesn't help here because it can represent truly read-only memory segment which leads to SEGFAULT. For instance,

static ReadOnlySpan<byte> Buf => new byte[] {1, 2};

returns a pointer to read-only DATA segment within PE/ELF file. IL bytecode for it:

.data cil I_00002868 = bytearray (
        01 02
    )

Move2 can be replaced with

Span<T> span;
span[x..y].CopyTo(span.Slice(start));

Of course, it requires more checks because span[x..y].Length can be greater than span.Slice(start). But anyway, it's just a single line of code.

I think Swap might have the following signature:

public static Swap<T>(this Span<T> span, Range range1, Range range2);

Swap of two values requires third temporary storage. The same is applicable to Span<T>. Thus, it cannot be done without heap allocation for some use cases. For instance, if a range is greater than the available stack memory, the data must be placed onto the heap.

Move2 can be replaced with

Span<T> span;
span[x..y].CopyTo(span.Slice(start));

Of course, it requires more checks because span[x..y].Length can be greater than span.Slice(start). But anyway, it's just a single line of code.

I mean, if you have 1 line of code or 3 doesnt really matter, no?

I mean that Move2 is too simple (trivial) to be an extension method.

I think Swap might have the following signature:

public static Swap<T>(this Span<T> span, Range range1, Range range2);

This is valid, but also sometimes you dont have the range at hand and just the slices itself, I think an overload can be useful.

ReadOnlySpan<T> cannot be modified in general. Unsafe operation doesn't help here because it can represent truly read-only memory segment which leads to SEGFAULT. For instance,

static ReadOnlySpan<byte> Buf => new byte[] {1, 2};

returns a pointer to read-only DATA segment within PE/ELF file. IL bytecode for it:

.data cil I_00002868 = bytearray (
        01 02
    )

I mean, ofc you need to know WHEN you use such code, I am and noone else who aallows such thing is to be kept responsible for the devs action, OF course you dont use that stuff when u are worried about the readonly stuff, I for me as an example, am creating certain structs out of a string which has some key informations and changing that private string is not a problem cause noone outside has sight to it anyway, but its very useful rather than doing mass copy of such a string (4KB) in a loop (cause its a gameloop which renderes certain gameobjects out of that struct which has to be created efficiently out of the string.)

EDIT: Just because we use C# doesnt mean that unsafe-operations (in a nicer and cleaner way than C/C++) should be abandoned, you will have always to know what you are doing, I mean just by your Intrinsics class i could modify data of some internals of other structs but why should I do that? But having the chance to do so if NEED BE, is awesome.

I mean that Move2 is too simple (trivial) to be an extension method.

I see what you mean, (despite me forgetting to clear the moving part to emulate a true move) its also symantically stronger, when you can just have a range and it shall be "moved" somewhere else, but okay thats up 2 you in this case.

you will have always to know what you are doing,

Exactly. It must be explicit. Therefore, Span<T> is a better choice because you modifying contents of the memory block. If you want to operate on ReadOnlySpan<T> just convert it to Span<T> using MemoryMarshal.CreateSpan manually to highlight unsafe aspect of your conversion.

but also sometimes you dont have the range at hand

Why? Instantiation of Range is very cheap, it is value type. And it is more readable to the user. If you have slices of the original span then you know the ranges.

scoped Span<T> lastCopy = stackalloc T[last.Length]; this can cause StackOverflowException

Exactly. It must be explicit. Therefore, Span<T> is a better choice because you modifying contents of the memory block. If you want to operate on ReadOnlySpan<T> just convert it to Span<T> using MemoryMarshal.CreateSpan manually to highlight unsafe aspect of your conversion.

Actually I am already doing internally: "ROS.AsWriteable()" to make a Span out of ReadOnlySpan, but you suggest to let the user itself do that and not to take that task from him? Okay, I can agree to that.

scoped Span<T> lastCopy = stackalloc T[last.Length]; this can cause StackOverflowException

Tbh, in absolute theory yes, but when you look at such a function, Idk if like copying (* AT MAXIMUM*) 256 chars is that big of a deal, so the x and y together should never be more than 1KB in total or so, thats why i didnt do any check for them and wanted to let the user know it by the pure XML comment to not do to big slices but I could improve that part by your "BufferPooling" types in DotNext to be absolutely sure when its to huge to use your buffer.

Tbh, in absolute theory yes

Public method should work in any circumstances and with any corner cases.

together should never be more than 1KB

Even in that case it may cause stack overflow.

Tbh, in absolute theory yes

Public method should work in any circumstances and with any corner cases.

together should never be more than 1KB

Even in that case it may cause stack overflow.

okay got it, i will improve that by using BufferWriterSlim<T> I think its a good fit to just extend the size if need be.

MemoryMarshal.CreateSpan is used by an internal function in the "Utils" class which is called AsWriteAble() to make that call easier cause it was abit annoying.

  public static Span<T> AsWriteable<T>(this ReadOnlySpan<T> readOnlySpan) =>
        MemoryMarshal.CreateSpan(ref Unsafe.AsRef(readOnlySpan[0]), readOnlySpan.Length);```

you will have always to know what you are doing,

Exactly. It must be explicit. Therefore, Span<T> is a better choice because you modifying contents of the memory block. If you want to operate on ReadOnlySpan<T> just convert it to Span<T> using MemoryMarshal.CreateSpan manually to highlight unsafe aspect of your conversion.

but also sometimes you dont have the range at hand

Why? Instantiation of Range is very cheap, it is value type. And it is more readable to the user. If you have slices of the original span then you know the ranges.

As I said, i can offer both, overload with range x,y and ROS x,y because its not so easy to get a range out of a slice, because when u have the slices, why u just dont swap directly, why u want to create an extra range from it, on the other hand, sometimes u have (start, length) and can use that rather than to slice it and then pass the slice, so offering both is better, because keep in mind, sometimes, you get a Slice back from a function, so why would u want to get a range from there (which is 2 lines of code and again and .IndexOf(slice)) where as you can just directly put it, offering both is the solution.

okay got it, i will improve that by using BufferWriterSlim

MemoryRental is better.

because when u have the slices, why u just dont swap directly, why u want to create an extra range from it,

Agree, but in that case you don't need third span:

public static void Swap<T>(this Span<T> x, Span<T> y);

In that form I'm ready to add it to DotNext. However, its implementation is not trivial depends on T. If T is blittable, it can be done without heap allocations even for large span (swapping chunks instead of a whole range). If not, there is no way to allocate a space for multiple elements on the stack. It is a limitation of CLR itself.

okay got it, i will improve that by using BufferWriterSlim

MemoryRental is better.

okay 👍

because when u have the slices, why u just dont swap directly, why u want to create an extra range from it,

Agree, but in that case you don't need third span:

public static void Swap<T>(this Span<T> x, Span<T> y);

Wouly you mind to explain it abit better? How can we swap x and y when we dont know the source span in which the swap has to be done in?

In that form I'm ready to add it to DotNext. However, its implementation is not trivial depends on T. If T is blittable, it can be done without heap allocations even for large span (swapping chunks instead of a whole range). If not, there is no way to allocate a space for multiple elements on the stack. It is a limitation of CLR itself.

Hmm, so what would your exact suggestion be for non-blittable-types? Since Char is non-blittable sadly.

How can we swap x and y when we dont know the source span in which the swap has to be done in?

Why you need that knowledge? For instance, if I want to swap int x and int y I don't care about their location in memory.

Since Char is non-blittable sadly.

It is blittable because it has no fields of reference types inside.

How can we swap x and y when we dont know the source span in which the swap has to be done in?

Why you need that knowledge? For instance, if I want to swap int x and int y I don't care about their location in memory.

I mean the main goal is to actually modify the source span, i dont just want to swap 2 random spans which have nothing to do with eachother, even in the struct type SpanInfo<T> I guarantee that both x and y point somewhere within same memory block thats the entire idea.

Since Char is non-blittable sadly.

It is blittable because it has no fields of reference types inside.
grafik

https://learn.microsoft.com/en-us/dotnet/framework/interop/blittable-and-non-blittable-types

Altho I find that my MoveBy/Move2 functions already keep care of slicing and moving a chunk of elements from A to B, so should that not be taken care already of "span.CopyTo(...)" regarding the blittable aspect?

It's a mess with terms. Blittable and unmanaged are almost the same. So I consider char as blittable because

static void M<T>(T value) where T : unmanaged;

is applicable to char.

I mean the main goal is to actually modify the source span

x and y may represent slices in the same contiguous memory. Actually, where two spans are located don't make any sense.

x and y may represent slices in the same contiguous memory. Actually, where two spans are located don't make any sense.

Tbh, they should only be slices from same contigeous memory region, because only there Swap(x,y) make sense, so the 3.span is needed therefore, cause it represents the source.

Otherwise I dont really get what you mean tbh

so a reallife example would be:

    public static void TestSwap()
    {
        var text = "Hello cool #world, I welcome you every morning with a bright smile!".AsSpan();
        var x = text.Slice(text.IndexOf("world"), "world".Length);
        var y = text.Slice(text.IndexOf("morning"), "morning".Length);
        text.Swap(x, y);

        var result = test.ToString();  //-----> "Hello cool #morning, I welcome you every world with a bright smile"
    }

Tbh, they should only be slices from same contigeous memory region,

Why they should? For instance, if I have two arrays:

Span<int> x = new int[] {1, 2};
Span<int> y = new int[] {3, 4};

What the problem to swap their content?

Here is my version: 1580c16
It works for any spans regardless of the actual location in memory (sparse or contiguous, which is actually not make any sense).

Here is my version: 1580c16 It works for any spans regardless of the actual location in memory (sparse or contiguous, which is actually not make any sense).

Tbh I am not sure one want to change a content from a completly different span, this could damage other sources, i mean "Replace()" content from SpanA with content from SpanB would be something okay, but modifying both is dangerous. Really not sure about this tbh.

Also in my .Swap(x, y) x and y can be of any size, when I exchange "stackalloc" with your memoryrental approach it will always work.

You have written basically a different way of swap, like with any other Span, you can do that, but my version is more of swapping content within itself already, and they dont have to be ofc the same length.

AFAIK these are 2 different Swap() versions here.

Tbh I am not sure one want to change a content from a completly different span

Implementation doesn't depend on the address. Here is my test:

[Fact]
    public static void SwapElements()
    {
        Span<int> array = stackalloc int[] { 1, 2, 3, 4, 5, 6 };
        array[0..3].Swap(array[3..6]);
        Equal(new int[] { 4, 5, 6, 1, 2, 3 }, array.ToArray());
    }

As you can see, both spans are slices of the same array. And swap works as expected. It works as expected if both spans are different arrays as well. This fact doesn't change anything.

and they dont have to be ofc the same length.

How it should work? How to replace 2 elements with 3 elements?

My version were more intended to work with strings/ROS and there in, X and Y can be of different sizes, so u can swap 2 different words of different length, BUT it also works with everyother blittable , aslong as they from same Memory, that is the power of it, your version doesnt allow for that, and yes in your version , you could do that by just copying the minimum-set (2) inside the 3-length one, leaving the 1-left untouched, like:

  int maxLen = Math.Abs(x.length  - y.Length);

 if (x.Length > y.Length)
    x = x.Slice(0, maxLen);

No need to throw Exception, you just leave the rest untouched.

But in my version, when they are in the same span, you can (as my code demonstrates very well) swap of any size.

As I said, your version and my one, are 2 different possibilites of Swap(x,y) i find that you could support both.

Have you even compiled and testted my example?

No need to throw Exception, you just leave the rest untouched.

No, it's not obvious for the user. Exception indicates strong restriction on input args. User might be unaware about the magic inside of the method (because actually nobody reads the docs). We are not using string for everything while we can parse string to int inside of the method, right? Caller is responsible for the correct input.

i find that you could support both.

Swap with two spans is more generic. Thus, it is more reusable and convenient. I don't want to add method that is very specific for some particular case. Swap to exchange elements in the same span might be useful when implemented with two Range as arguments. If you have two spans within the same region, actually you know how you obtained them, right? Anyway, I'm curious about different lengths. Let's image an array [1, 2, 3, 4, 5, 6]. What happens if I decide to swap first two elements with last three?

No, it's not obvious for the user. Exception indicates strong restriction on input args. User might be unaware about the magic inside of the method (because actually nobody reads the docs). We are not using string for everything while we can parse string to int inside of the method, right? Caller is responsible for the correct input.

Okay, you can do like that, the option also is to pass a strong XML doc over the function so users can not miss the doc for the function, but throwing is also okay when you wanna ensure that he can only do x-x length swaps.

Swap with two spans is more generic. Thus, it is more reusable and convenient. I don't want to add method that is very specific for some particular case. Swap to exchange elements in the same span might be useful when implemented with two Range as arguments. If you have two spans within the same region, actually you know how you obtained them, right? Anyway, I'm curious about different lengths. Let's image an array [1, 2, 3, 4, 5, 6]. What happens if I decide to swap first two elements with last three?

Easy. [1, 2, 3, 4, 5, 6]. ------------> [4, 5, 6, 3, 1, 2]. and 3 remains untouched. This is the same logic I apply to string swaps (my function is generic as well bro, but the string swap is mostly what people are interessted in do, OR binary swaps), also an interessting needed area of swapping sometimes, like having some big ass binary data which you wanna swap maybe chunks of it, then based on a specifc passed delimiter, you can swap without destroying the other areas of the main-source.

Swap to exchange elements in the same span might be useful when implemented with two Range as arguments. If you have two spans within the same region, actually you know how you obtained them, right

I mean, sometimes, you get back slices from a function of which you did not have anything to do with, i just made a simple example for testing purpose! but in real examples you often just dont know where the slices came from (even from same source)

so when i get back a slice, then I can easily swap them, and creating a Range from a slice which you did not create, is costly, cause you need to do "src.IndexOf(slice)" to get the position and then u can make range but at this point u can just use the slice....

but I am not against the idea to support Range aswell, so when you just have the (int len, int start) directly (either by yourself OR from some function) then you can just pass those, sure.

I can allow for both, no problem, they dont hinder each other.

Swap with three spans has a very specific use case. Moreover, it's not obvious how to use it (you see that fact in this thread). So I can't consider it as reusable API. In the same time, Swap with Range args looks more reusable.

Swap with three spans has a very specific use case. Moreover, it's not obvious how to use it (you see that fact in this thread). So I can't consider it as reusable API. In the same time, Swap with Range args looks more reusable.

I dont know what is so hard to get there, the "3" span is the calling source span, again look up my example approach:

    public static void Test__Swap()
    {
		//ROS = ReadOnlySpan<char>
        string input= LoadWeatherInfoFromFile();
        ROS x = GetMiddleSlice();
        ROS y =GetLastPartFromWeather();

        input.Swap(x, y);
   }

In such an example, where you roughly know what kind of string it is and you roughly know which parts to slice, then you can do this way easier as with Ranges, but again , as I am repeating myself, you also should have option to get what you want by just passing the direct ranges, like this:

    public static void TestSwap()
    {
        var text = "Hello cool #world, I welcome you every morning with a bright smile!".AsSpan();
        Range x = 12..17;
        Range y = 40..48;  //idk if its 40 to lazy to count the index     x-D
        text.Swap(x, y);

        var result = test.ToString();  //-----> "Hello cool #morning, I welcome you every world with a bright smile"
    }

Does it break your bones by allowing both options?

Swap with three spans has a very specific use case. Moreover, it's not obvious how to use it (you see that fact in this thread). So I can't consider it as reusable API. In the same time, Swap with Range args looks more reusable.

Why is it specifc? It does the same as your Swap-code? Just not with any other random Span, I can easily update to do that, no problem BUT my version therefore can swap for any size! xLen > yLen or yLen > xLen, doesnt matter, it does it perfectly. for ALL blittable types I just hovered string as the most interessting one cause it has to do with strings, which is more special in terms of checks you have to do in the functions, thats it but it works for all types.

Swap<T>(Span<T> source, Span<T> range1, Span<T> range2) this is specific use case. Swap<T>(Span<T>, Range range1, Range range2) is more appropriate for general use. However, it's hard to imagine where it can be useful. It would be great if you can describe your use case a bit more detailed.

As I said, changing values can be expensive in loops (as it is right now, because I have for instance an example where I get a 2KB long string where elements are existent which has to be updated in a loop and i just swap(valueFromBehind, firstValue)

The base-src of that looks like: "bla1 bla2 bla3 bla4 value2Draw1, value2Draw2, value2Draw3" and so on, and i just swap them in order to get all the "value2Draw" , i hope that makes sense, but generally swapping can be useful tool like replacing, which is still to this date not able for Spans, which it should imho.

tool like replacing

It is not possible in general, because Span and Memory are not expandable (their size cannot grow).

What you trying to achieve with this long string and slices over it? I'm asking because this kind of swap is very slow from performance perspective. It's much cheaper to allocate or rent a new string.

Added Swap of ranges within the same span. It may be considered only for convenience, because in worst case its performance is much more slower than allocation of a new string or buffer:

  • Memory optimization can be achieved only when both ranges are of the same length
  • Algorithm requires 4 moves in the memory, which is huge. For example, a span with 10 elements requires to move 40 elements to do the swap. And here is why:
    • Move the largest range to buffer
    • Shift elements between two ranges
    • Move smallest range to largest range
    • Move buffer to largest range

With a newly allocated buffer you need just one move: from source range to new buffer.

Added Swap of ranges within the same span. It may be considered only for convenience, because in worst case its performance is much more slower than allocation of a new string or buffer:

* Memory optimization can be achieved only when both ranges are of the same length

* Algorithm requires 4 moves in the memory, which is huge. For example, a span with 10 elements requires to move 40 elements to do the swap. And here is why:
  
  * Move the largest range to buffer
  * Shift elements between two ranges
  * Move smallest range to largest range
  * Move buffer to largest range

With a newly allocated buffer you need just one move: from source range to new buffer.

Yes I do a couple of copies there, but dont forget (especially since now .net 8.0 will be the LTS, its cruicial that alot of code will be updated to that) that .CopyTo(span) is highly optimized for blittable types (and I think char got also tons of tons of improvement since its also a primitive type internally) and I have done a benchmark with:

  • 1KB
  • 10KB
    strings, and the outcome is around 250ns-600ns , for x being 254length and y being 254 in length, i tested mutliple versions like swapping from begin to mid or begin to end or or mid to end, all in the area between 200ns to 630ns.

grafik

And here the results for a 1MB string, with x=1024, y=1024 in length

grafik

Tell me then, the exactsteps of how you would do a swap when a copy of the string is allowed, if that would be much better?

For the case, here the benchmark-class:

using System.Numerics;
using System.Runtime.InteropServices;
using DotNext.Runtime;

namespace Benchmarker;

[StructLayout(LayoutKind.Auto)]
public readonly unsafe ref struct SpanInfo<T>
    where T : unmanaged, IEquatable<T>, IComparable<T>, INumber<T>
{
    private readonly int SrcLength;
    public readonly ReadOnlySpan<T> First, Between, Last;
    public readonly int IndexOfFirst;
    public readonly int IndexOfLast;
    public readonly bool AreXYNext2EachOther;
    public readonly bool AreSameLength;
    public readonly bool IsFirstLargerThanLast;
    public readonly int LengthDiff;
    public readonly Range LargeOne, SmallOne;

    public SpanInfo(scoped in ReadOnlySpan<T> src,
        scoped in ReadOnlySpan<T> x,
        scoped in ReadOnlySpan<T> y)
    {
        //1 of these at least is empty which is bad!
        if (src == ReadOnlySpan<T>.Empty || x == ReadOnlySpan<T>.Empty || y == ReadOnlySpan<T>.Empty)
            throw new ArgumentException("ALL of the spans MUST be valid");

        SrcLength = src.Length;
        
        nint adrOfX = Intrinsics.AddressOf(x[0]);
        nint adrOfY = Intrinsics.AddressOf(y[0]);
        long x2Y = Math.Abs(adrOfX - adrOfY) / sizeof(T);

        //its same or invalid address
        if (x2Y <= 0)
            throw new ArgumentException("spans cannot be the same or empty");

        nint adrOfAbsFirst = Intrinsics.AddressOf(src[0]);
        nint adrOfAbsLast = Intrinsics.AddressOf(src[^1]);
        long totalLen = Math.Abs(adrOfAbsFirst - adrOfAbsLast) / sizeof(T);

        //check if just 1 is within same range!, if not,
        //then the entire method based on this struct will be fruitless
        var sameMemory = (uint)x2Y <= (uint)totalLen;

        if (!sameMemory)
            throw new ArgumentException("x and y are not pointing to the same memory region!");

        First = adrOfX < adrOfY ? x : y;
        Last = adrOfX < adrOfY ? y : x;
        nint adrOfFirst = Intrinsics.AddressOf(First[0]);
        nint adrOfLast = Intrinsics.AddressOf(Last[0]);

        IsFirstLargerThanLast = First.Length > Last.Length;

        IndexOfFirst = (int)Math.Abs(adrOfAbsFirst - adrOfFirst) / sizeof(T);
        IndexOfLast = (int)Math.Abs(adrOfAbsFirst - adrOfLast) / sizeof(T);

        //when they are really close and only split by a delimiter from each other
        //then the addition of idxOfFirst + firstLen + sizeof(T) should be same as IndexOfLast
        AreXYNext2EachOther = IndexOfLast == IndexOfFirst + First.Length + sizeof(T) * 1;
        LengthDiff = Math.Abs(First.Length - Last.Length);
        AreSameLength = LengthDiff == 0;
        int endOfFirstOne = IndexOfFirst + First.Length;
      
        Between = AreXYNext2EachOther
            ? src[endOfFirstOne..IndexOfLast]
            : ReadOnlySpan<T>.Empty; 

        if (First.Length < Last.Length)
        {
            SmallOne = IndexOfFirst..(IndexOfFirst + First.Length);
            LargeOne = IndexOfLast..(IndexOfLast + Last.Length);
        }
        else
        {
            LargeOne = IndexOfFirst..(IndexOfFirst + First.Length);
            SmallOne = IndexOfLast..(IndexOfLast + Last.Length);
        }
    }

    public (int start, int len, int end) DeconstructLargeOne()
    {
        var r = LargeOne.GetOffsetAndLength(SrcLength);
        return (r.Offset, r.Length, r.Offset + r.Length);
    }

    public (int start, int len, int end) DeconstructSmallOne()
    {
        var r = SmallOne.GetOffsetAndLength(SrcLength);
        return (r.Offset, r.Length, r.Offset + r.Length);
    }
}

public static class Utilities
{
    public static Span<T> AsWriteable<T>(this scoped ReadOnlySpan<T> readOnlySpan) =>
        MemoryMarshal.CreateSpan(ref Unsafe.AsRef(readOnlySpan[0]), readOnlySpan.Length);

    
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static void Move2<T>(this scoped ReadOnlySpan<T> input, Range sliceToMove, int newPos)
        where T : struct, IEquatable<T>
    {
        var r = sliceToMove.GetOffsetAndLength(input.Length);
        var areaToCopyInto = input.Slice(newPos, r.Length);
        input[sliceToMove].CopyTo(areaToCopyInto.AsWriteable());
        input[sliceToMove].AsWriteable().Clear();
    }

    /// <summary>
    /// Moves a slice within the input span by "moveBy" steps, if that value is > 0
    /// it moves to the right, else to the left 
    /// </summary>
    /// <param name="input"></param>
    /// <param name="area2Move"></param>
    /// <param name="moveBy"></param>
    /// <param name="fillEmpties"></param>
    /// <typeparam name="T"></typeparam>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private static int MoveBy<T>(this scoped ReadOnlySpan<T> input,
        Range area2Move, int moveBy,
        T fillEmpties = default)
        where T : struct, IEquatable<T>
    {
        var r = area2Move.GetOffsetAndLength(input.Length);
        int newOffset = r.Offset + moveBy;
        Range areaToCopyInto = newOffset..(r.Length + newOffset);
        input[area2Move].CopyTo(input[areaToCopyInto].AsWriteable());

        int endOfArea2Move;
        int begin2Clear;

        if (moveBy < 0)
        {
            endOfArea2Move = r.Offset + r.Length;
            //go "moveBy" back
            begin2Clear = endOfArea2Move + moveBy;
        }
        else
        {
            begin2Clear = r.Offset;
            endOfArea2Move = begin2Clear + moveBy;
        }

        Range area2Clear = begin2Clear..endOfArea2Move;
        input[area2Clear].AsWriteable().Fill(fillEmpties);

        return newOffset + r.Length;
    }

    /// <summary>
    /// Swaps 2 different slices within the same span!
    /// </summary>
    /// <param name="input">the base span where the swap is reflected </param>
    /// <param name="x">the 1. span to swap with the 2. one</param>
    /// <param name="y">the 2. span to swap with the 1. one</param>
    /// <param name="delimiter">a value which functions as a delimiter to say when a new block of an arbitrary numeric value begins and ends..</param>
    /// <typeparam name="T">The type of the span</typeparam>
    private static void Swap<T>(this scoped ReadOnlySpan<T> input,
        scoped ReadOnlySpan<T> x,
        scoped ReadOnlySpan<T> y,
        T delimiter = default)
        where T : unmanaged, IEquatable<T>, IComparable<T>, INumber<T>
    {
        //get all the needed information about the occuring spans here!
        scoped var info = new SpanInfo<T>(input, x, y);

        //use the "info" type for the future in this function!.....
        int diffToMove = info.LengthDiff;
        scoped ReadOnlySpan<T>
            first = info.First,
            between = info.Between,
            last = info.Last;

        if (info.AreSameLength)
        {
            //store a copy of the smaller one
            scoped Span<T> lastCopy = stackalloc T[last.Length];
            last.CopyTo(lastCopy);

            first.CopyTo(input.Slice(info.IndexOfLast, first.Length).AsWriteable());
            lastCopy.CopyTo(input.Slice(info.IndexOfFirst, lastCopy.Length).AsWriteable());

            return;
        }

        if (info.AreXYNext2EachOther)
        {
            if (info.IsFirstLargerThanLast)
            {
                //first > last ===> first=largeOne; last=smallOne
                (_, _, int endOfLargeOne) = info.DeconstructLargeOne();
                (_, int smallOneLen, int endOfSmallOne) = info.DeconstructSmallOne();

                //TEMP-STEPS:
                //slice "smallOne" 
                var smallOne = last.AsWriteable();
                //slice "largeOne"
                var largeOne = first.AsWriteable();
                //make a copy of "largeOne[0..smallOneLen]
                Span<T> sliceOfLargeOne = stackalloc T[smallOneLen];
                largeOne[..smallOneLen].CopyTo(sliceOfLargeOne);
                //slice the "remainder" from "largeOne"
                var remainderOfLargeOne = largeOne[smallOneLen..];
                //make a copy of that "remainder" locally	
                Span<T> remainderCopy = stackalloc T[remainderOfLargeOne.Length];
                remainderOfLargeOne.CopyTo(remainderCopy);
                //define the range of " smallOne"
                Range areaOfSmallOne = endOfLargeOne..endOfSmallOne;

                //MUTATING-STEPS:
                //copy "smallOne" to "largeOne"
                smallOne.CopyTo(largeOne);
                //copy the prev copied "largeOne[smallOneLen..]" to "smallOne"
                sliceOfLargeOne.CopyTo(smallOne);
                //clear the "remainder" from "input"	
                remainderOfLargeOne.Fill(delimiter);
                //move only " smallOne" by "remainder.Length" back, so we pass "-remainder.Length"
                int idxOfConcat = input[..endOfSmallOne].MoveBy(areaOfSmallOne, -remainderCopy.Length);
                //concat back the "remainder" to the "largeOne"
                remainderCopy.CopyTo(input[idxOfConcat..].AsWriteable());
            }
            else
            {
                /*This is ...........*/

                //first < last ===> first=smallOne; last=largeOne
                (_, int smallOneLen, int endOfSmallOne) = info.DeconstructSmallOne();
                (int startOfLargeOne, _, int endOfLargeOne) = info.DeconstructLargeOne();

                //TEMP-instructions:
                //slice "smallOne"
                var smallOne = first;
                //slice "largeOne"
                var largeOne = last;
                //slice "smallOne" from "largeOne"
                var slice = largeOne[..smallOneLen];
                //copy "smallOne" locally
                Span<T> copyOfSmallOne = stackalloc T[smallOneLen];
                smallOne.CopyTo(copyOfSmallOne);
                //copy "betWeen" locally
                Span<T> copyOfBetween = stackalloc T[between.Length];
                between.CopyTo(copyOfBetween);
                //compute the last index of smallOne from the largeOne
                int idxOfSlice = startOfLargeOne + smallOneLen;
                //compute Range from where it shall move to another location
                int len2Copy = Math.Abs(endOfSmallOne - idxOfSlice);
                //Define the remaining area from "largeOne"
                Range remainder2Move = idxOfSlice..endOfLargeOne;

                //MUTATING instructions:
                //swap instantly a subset of largeOne with smallOne
                input.Swap(smallOne, slice, delimiter);
                //move remainder to the "endOfSmallOne" 
                int idxOfSmallOne = input.MoveBy(remainder2Move, -len2Copy, delimiter);
                //before we can copy the smallOne where it belongs
                //we must copy "between" first, to ensure proper order
                copyOfBetween.CopyTo(input[idxOfSmallOne..].AsWriteable());
                //update the "idxOfSmallOne" by between.Length then
                //copy back the "smallOne" to the end of "largeOne"
                idxOfSmallOne += copyOfBetween.Length;
                copyOfSmallOne.CopyTo(input[idxOfSmallOne..].AsWriteable());
            }
        }
        else
        {
            //compare the length and decide which one to copy where!
            if (info.IsFirstLargerThanLast)
            {
                //first > last ===> first=largeOne; last=smallOne

                (int startOfSmallOne, int smallOneLen, int endOfSmallOne) = info.DeconstructSmallOne();
                (int startOfLargeOne, int largeOneLen, int endOfLargeOne) = info.DeconstructLargeOne();

                //store a copy of the 'smallOne'
                scoped Span<T> smallOne = stackalloc T[smallOneLen];
                last.CopyTo(smallOne);

                //store a copy of the larger one
                scoped Span<T> largeOne = stackalloc T[largeOneLen];
                first.CopyTo(largeOne);

                //we have to clear the 'smallOne' from the input, before we can copy in order for .IndexOf() 
                //to find the match of the "largeOne" because otherwise there will be 2x

                T invalid = default;
                //we clear the area where the 'largeOne' resides!
                input.Slice(startOfLargeOne, largeOneLen).AsWriteable().Fill(invalid);
                //copy the 'smallOne' into the area of the 'largeOne'
                smallOne.CopyTo(input.Slice(startOfLargeOne, smallOneLen).AsWriteable());
                //'minStart' => is the first <empty> (in case of char, its ' ') value which comes
                // RIGHT AFTER the new position of 'smallOne' value, so the 
                //'end'     => is the last <empty> (in case of char, its ' ') value which comes
                // RIGHT AFTER the new position of 'largeOne' value, so the

                //'area2MoveBack' begins where 'largeOne' ENDS til the end of 'smallOne'
                // endOfSmallOne begins at 'startOfSmallOne' + 'largeOneLen' because we have to consider 
                // the area of 'smallerOne' is filled by the length of the 'largerOne'
                Range area2MoveBack = endOfLargeOne..endOfSmallOne;

                //Now we move the area to where the 'largerOne' is but only 'smallOneLen' further, because
                //we wanna override the remaining 'null'(\0) values! 
                input.Move2(area2MoveBack, startOfLargeOne + smallOneLen);

                //Finally we copy the 'largeOne' back to  
                largeOne.CopyTo(input.Slice(startOfSmallOne - diffToMove, largeOne.Length).AsWriteable());
            }
            else
            {
                //first < last ===> first=smallOne  && last=largeOne
                (int startOfSmallOne, int smallOneLen, int endOfSmallOne) = info.DeconstructSmallOne();
                (int startOfLargeOne, int largeOneLen, _) = info.DeconstructLargeOne();

                /*store a copy of the smaller one*/
                scoped Span<T> smallerOne = stackalloc T[smallOneLen];
                first.CopyTo(smallerOne);

                /*store a copy of the largerOne*/
                scoped Span<T> largerOne = stackalloc T[largeOneLen];
                last.CopyTo(largerOne);

                /*step1: copy 'largeOne' into 'smallOne' until 'smallOne' is filled*/
                largerOne[..smallOneLen].CopyTo(input.Slice(startOfSmallOne, smallOneLen).AsWriteable());

                /*step1.5: copy 'smallerOne' into 'largerOne' until 'smallOne' is filled* /*/
                smallerOne.CopyTo(input.Slice(startOfLargeOne, largeOneLen).AsWriteable());

                /*step2: ......*/
                int afterSmallOne = endOfSmallOne;
                int endOfGreaterOne = startOfLargeOne + smallOneLen;
                //step3: create a range which reflects the part which has to be moved to the right-side
                Range area2Move = afterSmallOne..endOfGreaterOne;

                //step4: copy the 'remaining' parts of 'largerOne' locally
                var remainderSlice = input.Slice(startOfLargeOne + smallOneLen, diffToMove);
                Span<T> remainderCopy = stackalloc T[remainderSlice.Length];
                remainderSlice.CopyTo(remainderCopy);

                //step5: Move now the slice by "remainder.Length" towards the end  
                input.MoveBy(area2Move, diffToMove, delimiter); //---->this here breaks somehow!! investigate!

                //step6: Copy now the "remainder" to the end of "smallOne"
                Range area2CopyRemainInto = startOfSmallOne..(startOfSmallOne + largeOneLen);
                Range remainingArea2Copy = ^diffToMove..;
                smallerOne = input[area2CopyRemainInto].AsWriteable();
                remainderCopy.CopyTo(smallerOne[remainingArea2Copy]);
            }
        }
    }

    public static void Swap(this scoped ReadOnlySpan<char> input,
        scoped ReadOnlySpan<char> x,
        scoped ReadOnlySpan<char> y)
        => input.Swap(x, y, (char)32);
}

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Order;
using Benchmarker;
using DotNext;

[MemoryDiagnoser(false)]
[Orderer(SummaryOrderPolicy.FastestToSlowest, MethodOrderPolicy.Alphabetical)]
public class ROSSwapBenchmark
{
    public static readonly string BIGROS = Random.Shared.NextString("abcdefghijklmnopqrstu0123456789", 1000000);
   
    [Benchmark]
    public void SwapROSFromBegin2Mid()
    {
        int len = 1024;
        var bigRos = BIGROS.AsSpan(); 
        var x = bigRos.Slice(0, len);
        int mid = bigRos.Length / 2;
        var y = bigRos.Slice(mid, len);
        bigRos.Swap(x, y);
    }
    
    [Benchmark]
    public void SwapROSFromBegin2End()
    {
        int len = 1024;
        var bigRos = BIGROS.AsSpan();
        var x = bigRos.Slice(0, len);
        var y = bigRos[^len..];
        bigRos.Swap(x, y);
    }
    
    [Benchmark]
    public void SwapROSFromMid2End()
    {
        int len = 1024;
        var bigRos = BIGROS.AsSpan();
        int mid = bigRos.Length / 2;
        var x = bigRos.Slice(mid, len);
        var y = bigRos[^len..];
        bigRos.Swap(x, y);
    }
}

grafik

But I like also now more the idea to use Ranges instead of direct spans, because I can get rid of the check for "IsInSameMemoryRegion" and since you can utilize the ..^1 or similar syntax you can get rid of even the checks if they are outsideofindex or so.. and tbh you dont have to bother doing checks for this because if some1 is that retarded to do: -1..int.max its his own fault i cant take care of such and break peformance because i have to take stupid checks into account...

-1..int.max

My implementation checks overlapping of ranges and throws exception because it's not allowed.

Tell me then, the exactsteps of how you would do a swap when a copy of the string is allowed, if that would be much better?

If I understand correctly, you're using swap as alternative to allocation of a new string. So string(Span<char>) ctor or ArrayPool<char>.Shared can be used for the same, and it's more performant.

A few remarks about your code:

nint adrOfX = Intrinsics.AddressOf(x[0]);
nint adrOfY = Intrinsics.AddressOf(y[0]);
long x2Y = Math.Abs(adrOfX - adrOfY) / sizeof(T);

This is dangerous and may have unpredictable results. AddressOf just converts the managed reference to the address, but the managed reference can be modified by GC exactly after a call to AddressOf. Thus, two calls to AddressOf are not atomic and adrOfX - adrOfY can show you something different than the distance between two addresses.

sizeof(T) * 1 - can be replaced with just sizeof(T)

stackalloc still doesn't take into account stack overflow. Moreover, you're using multiple buffers while the task can be done with just one buffer with appropriate size.

Let's summarize what we have at this moment:

public static class Span
{
    public static void Swap<T>(this Span<T> x, Span<T> y);
    public static void Swap<T>(this Span<T>, Range range1, Range range2);
}

Are you okay with that design? Should we consider extra methods? If no, I'm ready to prepare a new release.

Let's summarize what we have at this moment:

public static class Span
{
    public static void Swap<T>(this Span<T> x, Span<T> y);
    public static void Swap<T>(this Span<T>, Range range1, Range range2);
}

Are you okay with that design? Should we consider extra methods? If no, I'm ready to prepare a new release.

Hey, sry for late response, but I was not home and had no access therefor, I am back in 2h, and will take all the nessecary steps and will ponder exactly how to proceed further in this!

Thx!

Please also wait with any release, I am heavily overwriting stuff to adjust to the points of you which were reasonable. Give me like an hour.

Move is added as well with optimizations regarding to the internal buffer size.

public static class Span
{
    public static void Swap<T>(this Span<T> x, Span<T> y);
    public static void Swap<T>(this Span<T>, Range range1, Range range2);
    public static void Move<T>(this Span<T> span, Range range, Index destinationIndex);
}

@sakno sry I had private stuff to do... i forgot, I have severly rewritten yesterdasy the entire Swap code and I wanted to make another suggestion, I have used (for easier usage, I needed this type in my current 2D game as a Span-Queue to recycle certain data and not create them from scratch over and over...) it is called "SpanQueue<T>" aswell as another Range-Wrapper type, called Area<T> which has some more comfort functionality than the System.Range one. Please have a look, I also fixed some bugs in the Swap() method.

I added you to the private repo: "Utility-Workbench":

New types are:

  • SpanQueue<T>
  • Area<T>

New Methods are:

  • MoveBy<T> this one I thought could be useful for public usage, I use the below one for Swap() due to me not needing all the extra checks for the algorithm

Reworked types/Methods:

  • Swap(Area<T> x, Area<T> y)
  • MoveBy_Working (*did a renaming which not commited yet to "InternalMoveBy()"
  • SpanInfo<T> to accomedate the entire Range logic and removal of the old AdressOf`() stuff...

I testted every possible combination of what x y could be, regarding Swap(x, y) and it was successful in all cases til now!

Now i am gonna benchmark this stuff once again, since there were alot of changes and fixes.

Sakno, I would also suggest you to not check for, if the Range is >= 0, even the Team from .NET doesnt check for that because its senseless to check smth which should not be computed anyway, a Range is always to be used in collection of which you
cant be < 0, so you could spare that check, as I have done in my code.

Sakno, I would also suggest you to not check for, if the Range is >= 0, even the Team from .NET doesnt check for that because its senseless to check smth which should not be computed anyway, a Range is always to be used in collection of which you cant be < 0, so you could spare that check, as I have done in my code.

For which method? For Move<T> I'm not checking for < 0. I'm checking for length is not 0, which is different and reasonable.

@sakno you have a complete benchmarkable project, I would be glad that you run it and see the benchmarks which are now indeed quite worse, okay I am using 1MB string/span and x and y are 1KB in size, for now I test:

  • Begin - End
  • Begin - Mid
  • Mid - End

There other cases (like checking if they are directly close, or just distinct by 2-3 blocks and so on....) but for now these are off the list which i will add next.

But I feel the worsen in Performance has to do that allocating with MemoryOwner<T> (inside SpanQueue) , because he has to do alot when benchmarking, before i only used stackallocwhich u are right ofc cant be the solution due to potential StackoverflowExcp. But I wonder ....

You have an invite to my private-repo btw! its called "Utility-Workbench".

Unfortunately, I can't provide a review of you code. Probably, the types in your project are suitable and convenient for your task, that's fine. I would like to be focused on API proposal and its implementation which is finished. I don't expected great performance from all these three methods, they are all O(n) operations without any possibility to reduce time complexity (and even space complexity). I think that allocation by itself is not bad, especially it situations when the object is not trying to escape Gen0. Allocation is just a bump of heap pointer, which is more cheaper than rearranging ranges within the same span. Remember, premature optimization is a root of all evil.

Unfortunately, I can't provide a review of you code. Probably, the types in your project are suitable and convenient for your task, that's fine. I would like to be focused on API proposal and its implementation which is finished. I don't expected great performance from all these three methods, they are all O(n) operations without any possibility to reduce time complexity (and even space complexity). I think that allocation by itself is not bad, especially it situations when the object is not trying to escape Gen0. Allocation is just a bump of heap pointer, which is more cheaper than rearranging ranges within the same span. Remember, premature optimization is a root of all evil.

Well, as I said, testing with a anyway ridiculous amount of string-length (1MB large string) and 1KB-2KB sized ranges and they reach to like 25-50ms which is fine i think, and its not about allocation per se, its about to have the choice to take a solution if you dont want any heap allocation cause sometimes you cant just do that especially if u need to do stuff only cpl of times, (as a good example is high throughput apps like games or other programs which need to work on lower RAM)

I mean space-complexity is at 0? or how am I supposed to decipher your meaning of reducing space-complexity?

And for strings/spans with like ~1KB max as source-length this is super fast, which I have tested, (250-550 chars) it uses maxium 200ns with 0 allocation.

And this is a function which is generalized, this is nothing special, it works for every blittable types.

I dont even know why you coded the stuff for yourself when you could have atleast review the code and based on the understanding which I tried to explain you here since a week, you could then do adaptions to it and not write from scratch, I thought atleast this is how collabs is supposed to work.

And doing a replace/realloc of everystring in a frame-based system like Game Dev, where at LOWEST your functions get called 60x/sec up to 200x/sec this is super bad to create new strings, and in my version I use memory rented buffer for that purpose.

But you seemed that you did not want to check my commited stuff at all cost which is all in itself a pretty bad basis for discussion, nevermind, I wont suggest here anything more anyway, I dont think this is valuable of my time.

high throughput apps like games or other programs which need to work on lower RAM

Throughput and RAM are orthogonal. RAM is cheaper, so if I want to increase RAM usage to improve throughput, I definitely will do that. In your case, the cost of computation is moved from RAM to CPU which is not good in general.

I dont even know why you coded the stuff for yourself when you could have atleast review the code

I did that and provided feedback above. Starting from incorrect usage of AddressOf. Also I don't see any value of SpanInfo type, I just don't need it. You can just compare the number of lines and the complexity of your vs my version. There are two options for implementing new API:

  • You can propose a new API, we discuss it, and I'll implement it
  • You can submit PR with new API

Incorporating third-party code from different repo is a bit risky because of licensing. Moreover, if I see that the code requires a lot of rework, I prefer to implement it from scratch. Additionally, proposal should be valuable and reusable. I didn't see how these methods perform better than regular heap allocation. Do you have any evidence that Gen0 allocation is really worse than moving ranges within the span?