sqids / sqids-dotnet

Official .NET port of Sqids. Generate short unique IDs from numbers.

Home Page: https://sqids.org/dotnet

When will this be implemented?

xamir82 opened this issue · comments

Hi. I want to use this in a .NET project. Is this being implemented, or should I just go back to Hashids.net instead?

@xamir82 It's in the pipeline to be implemented, but we haven't gotten around to it yet. Hashids is definitely still the way to go, but keep in mind that future Sqids IDs will not be compatible.

@ullmark Are you available to port this one over?

If someone is interested in kicking off this package, feel free to take this gist as a starting point.

https://gist.github.com/vyrotek/5060f9069b2998850de966a0d57349f5

It's a straight conversion of the TS sqids package. I've only tested it against the basic scenarios in the readme, but it seems to work.

It should definitely be optimized with more proper C# techniques. 😅

@vyrotek Thanks for the gist! A straight conversion of the spec file is absolutely fine (optimizing language-specific code can come later). The only thing that's missing is converting the few unit test files, and I'm sure modern LLMs can help with that. Would you like to be added as a maintainer so you can turn your gist into an actual library?

@4kimov I made some basic C#-specific improvements to @vyrotek's implementation (many thanks to him): https://gist.github.com/aradalvand/5e1b6f90de4324eaac11e20da29225aa

Including but not limited to:

  • Use file-scoped namespaces.
  • Remove calls to .ToCharArray() wherever unnecessary (string is already an IEnumerable<char> in C#).
  • Change the Sqids class name to SqidsGenerator. Giving a type the same name as its enclosing namespace is officially discouraged, and it keeps the user from referencing the type directly after importing the namespace.
  • Use ArgumentException instead of the generic Exception whenever user input is problematic.
  • Use var wherever the type was being specified twice (e.g. List<char> list = new List<char>())
  • Get rid of the SqidsHelper class.
  • Extract minAlphabetLength into a constant.
  • Use params for Encode's parameter to make usage more convenient.
  • Make private helpers static.
  • Use two constructors for SqidsGenerator as opposed to one with an optional parameter as that's more common and generally preferred in .NET (primarily because the default constructor could have its own <summary>)
  • Use Array.Empty<int>() when returning an empty array (which uses a singleton under the hood and therefore avoids creating a new array each time).
  • Improve variable names (e.g. use result instead of ret; C# identifiers are normally not abbreviated).
  • Make MaxValue and MinValue constants
  • Remove the Random instantiation from Shuffle because it wasn't even used.
  • Improve exception messages
  • Use the (chars[i], chars[r]) = (chars[r], chars[i]) syntax for element swap in Shuffle as opposed to creating a temp variable
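
As a rough illustration of a few of the items above (this is not the code from the gist; IdiomSketch and its members are hypothetical names, and the minimum alphabet length of 5 is an assumed value), the idioms look something like this:

```csharp
using System;

// Hypothetical illustration only; not the actual SqidsGenerator code.
public static class IdiomSketch
{
    // A named constant instead of a magic number (the value 5 is an assumption).
    private const int MinAlphabetLength = 5;

    // ArgumentException tells the caller that their input was the problem.
    public static void ValidateAlphabet(string alphabet)
    {
        if (alphabet.Length < MinAlphabetLength)
            throw new ArgumentException(
                $"The alphabet must contain at least {MinAlphabetLength} characters.",
                nameof(alphabet));
    }

    // `params` lets callers write Sum(1, 2, 3) instead of Sum(new[] { 1, 2, 3 }).
    public static int Sum(params int[] numbers)
    {
        var total = 0;
        foreach (var number in numbers)
            total += number;
        return total;
    }

    // Array.Empty<int>() hands back a cached instance rather than a new array.
    public static int[] NoNumbers() => Array.Empty<int>();

    // Tuple assignment swaps two elements without a temp variable.
    public static void ReverseInPlace(char[] chars)
    {
        for (int i = 0, r = chars.Length - 1; i < r; i++, r--)
            (chars[i], chars[r]) = (chars[r], chars[i]);
    }
}
```

Because NoNumbers() goes through Array.Empty<int>(), two calls return the same array instance, which is where the allocation saving comes from.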

I haven't dived deep into the algorithm's implementation details yet, but there are definitely a lot of performance improvements that could be made there as well. The obvious next step, of course, is making the implementation Span-based to avoid needless memory allocation (as Hashids.net does), but that requires a bit more effort and could be done later. This is currently a good starting point, I think.
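
To give a sense of what "Span-based" means in general, here is a minimal sketch that has nothing to do with the actual Sqids algorithm. SpanSketch, Reverse, and the 256-character stack limit are all assumptions made up for this example:

```csharp
using System;

// Hypothetical illustration of the Span technique; not taken from any Sqids gist.
public static class SpanSketch
{
    // Arbitrary cutoff (an assumption) above which we fall back to a heap array.
    private const int StackLimit = 256;

    public static string Reverse(string input)
    {
        // Short inputs use stack memory, so the scratch buffer costs no heap allocation.
        Span<char> buffer = input.Length <= StackLimit
            ? stackalloc char[input.Length]
            : new char[input.Length];

        input.AsSpan().CopyTo(buffer);
        buffer.Reverse();

        // For short inputs, the resulting string is the only heap allocation.
        return new string(buffer);
    }
}
```

The same idea (work in a stack buffer, allocate only the final result) is the kind of thing a Span-based encoder or decoder could apply to its intermediate character buffers.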

Some other to-do items I can think of:

  • Add a TryDecode method
  • Support long as well, in addition to int
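
For the TryDecode item, the usual .NET Try-pattern would look roughly like the sketch below. The decoder here is a toy stand-in (it just maps digit characters to their values) and TryPatternSketch is a hypothetical name; only the method shape is the point, not the Sqids logic:

```csharp
using System;

// Hypothetical illustration of the Try-pattern; the "decoding" is a toy stand-in.
public static class TryPatternSketch
{
    // Throwing variant, built on top of the non-throwing one.
    public static int[] Decode(string id)
    {
        if (!TryDecode(id, out var numbers))
            throw new ArgumentException("The ID could not be decoded.", nameof(id));
        return numbers;
    }

    // Returns false instead of throwing when the input is not decodable.
    public static bool TryDecode(string id, out int[] numbers)
    {
        numbers = Array.Empty<int>();
        if (string.IsNullOrEmpty(id))
            return false;

        var result = new int[id.Length];
        for (var i = 0; i < id.Length; i++)
        {
            // Toy rule: only the characters '0' through '9' are valid.
            if (id[i] < '0' || id[i] > '9')
                return false;
            result[i] = id[i] - '0';
        }

        numbers = result;
        return true;
    }
}
```

Callers who expect invalid input (e.g. IDs coming from a URL) can use TryDecode and branch on the bool, while Decode stays available for cases where a bad ID really is exceptional.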

@4kimov Sorry, I’m on vacation, so slow answer. 😃 I haven’t really maintained Hashids much in recent years; it’s mainly been @manigandham. Maybe he is interested in creating the new one? I don’t think I am.

Update: I went ahead and did some of the perf-based refactorings I mentioned (e.g. using Span, among other things), and achieved a ~3x-4x improvement in both speed and memory allocation — here are some quick benchmark results:

|                  Method |         Mean |       Error |      StdDev | Allocated |
|------------------------ |-------------:|------------:|------------:|----------:|
| OldImplementationEncode |   3,773.5 ns |    20.13 ns |    17.85 ns |    3840 B |
| NewImplementationEncode |   1,321.5 ns |     7.74 ns |     6.86 ns |    1152 B |
|        HashidsNetEncode |   1,251.8 ns |     9.39 ns |     7.84 ns |     200 B |
| OldImplementationDecode | 135,137.3 ns | 2,680.73 ns | 7,201.60 ns |    2392 B |
| NewImplementationDecode |     919.9 ns |     4.13 ns |     3.86 ns |     160 B |
|        HashidsNetDecode |   1,433.9 ns |     6.07 ns |     5.38 ns |      56 B |

The new gist: https://gist.github.com/aradalvand/e1b8eaca27dd643f78316a5cb87f589e

Still not quite as memory efficient as Hashids.net though (which has been optimized the hell out of), so there's still room for further improvement; but I think this would make a more than decent initial version.

@4kimov I'd be willing to be added as a maintainer and take on the project if you want. Let me know.

@aradalvand Bravo on the improvements! 👏 Added you as a maintainer. Once the lib / package / tests are done, I'll update the site

@ullmark Thanks for the heads up & @manigandham appreciate the Hashids updates!

Thanks for taking this on @aradalvand! Awesome to see the .NET community jump on this so quickly.

@vyrotek No problem! Thanks a lot to you for giving us the initial implementation! Without it I would've been too lazy to do this :P

v1 is out now — so closing this (@xamir82 feel free to open other issues if there's anything else you want to ask about).

@4kimov We have tests (with near full code coverage, as reported by dotCover) and the NuGet package is also now available; so, you might want to update the website to reflect this. Thanks.

@aradalvand Good job, but I do have a question though. I can see that the tests are not 1:1 with the spec. For example, the minLength tests don't test for the same IDs, and the uniques test file is blank. The tests from the spec took some time to assemble because they cover a few good edge cases. Is there a way to get each converted to C# so we can be sure the generated IDs match for all kinds of scenarios (and therefore match with other libraries)?

@4kimov I didn't actually know 1:1 correspondence with the spec was expected when it comes to tests. I thought we were just supposed to cover the same kinds of scenarios as those in the spec tests, but without having to use the exact same values and cases and everything.

But it's okay, I'll try and see if I can rewrite the tests to match the spec tests precisely.

As for the uniqueness tests specifically, I was under the impression that those were essentially testing the algorithm rather than the implementation, if you will. Do they make sense for individual implementations as well?

@aradalvand Cool, yeah the spec tests do cover the tricky parts, so having at least them would be a good idea. Feel free to separate encode and decode and do whatever else is appropriate for C#. The main goal is to generate identical IDs given specified params.

  • The uniques test file is there to check that every generated ID really is unique. It usually takes a while to run. It does mainly test the algorithm and not individual langs, so I look at it as a safety mechanism just to be sure the worst doesn't happen.
  • The shuffle test file is there just to check how well the consistent shuffle function works. The other tests will for sure run the internal shuffle function, so it's not necessary to implement this one. You can skip it.
  • The other test files should be pretty straightforward and just test different params.

@4kimov Got it. I'll do a rewrite of the tests.

@4kimov The tests are now identical to the ones in the spec (and all passing); you can check them out here.

I might refactor them a little to .NETify them a bit more, but I'll make sure to stay consistent with the spec tests and preserve the exact same scenarios and values.

Right on. Good job @aradalvand 👍

A few follow-up items:

@4kimov

Do the code examples look correct or would you like me to .NETify anything? https://sqids.org/dotnet

They are totally correct. The only thing I would say though is that you don't have to manually new up an array when calling Encode, you could pass the numbers directly like so:

string id = sqids.Encode(1, 2, 3); // "8QRLaD"

A bit less verbose.

We should probably also give some simple examples here. Here's a very basic example for Go: https://github.com/sqids/sqids-go#examples

Yeah the README is not complete yet, I'll make sure to put a couple more examples on there.

@aradalvand Cool, thank you for the heads up. Adjusted.