Algorithm used to generate random unique id

Question

anshulnegitc opened this issue 3 years ago · comments

I read the code used to generate the random UUID, but I find it difficult to work out how it is able to generate a different id every time. As I understand the code:

First, the dictionary is set (alphanumeric by default, with an id length of 6).
Then it is shuffled.
Whenever uid() is called, a loop runs that picks one character at a time from the already shuffled dictionary, and the resulting id is returned.

Why you shuffled in this particular manner?

const PROBABILITY = 0.5;
finalDict = finalDict.sort(() => Math.random() - PROBABILITY);

And what is Maths behind this loop?

for (j = 0; j < uuidLength; j += 1) {
        randomPartIdx = parseInt((Math.random() * this.dictLength).toFixed(0), 10) % this.dictLength;
        id += this.dict[randomPartIdx];
}

How mathematically you are making sure that randomPartIdx is generating different numbers sequence every time?

Jean Lescure · Answer 1 · Thu Sep 02 2021 05:21:02 GMT+0800 (China Standard Time)

Hi @anshulnegitc, thanks for taking interest in our project and the logic within.

I'll try my best to thoroughly answer your inquiries...

Why you shuffled in this particular manner?
const PROBABILITY = 0.5;
finalDict = finalDict.sort(() => Math.random() - PROBABILITY);

Regarding this first question, the simple answer is: because shuffling the dictionary this way is easy and has a very low cost in terms of computational resources.

But I'm going to go out on a limb and assume your "why" question is really asking how that logic performs the shuffle.

If that's the case, here is the methodology in more depth:

Using JavaScript's sort lets us reorder the array natively and efficiently, with the items ending up in new positions. (Note that sort reorders the array in place and returns that same array rather than a copy.) In this code, the position of each item is determined by the Math.random() - PROBABILITY bit.
The PROBABILITY constant is merely a convention to avoid using magic numbers. Further, subtracting 0.5 shifts the random range from [0, 1) down to [-0.5, 0.5), so the comparator is as likely to return a negative value as a positive one, and no dictionary character is favored over the rest in any single comparison. This matters because of how the ECMAScript definition of sort's comparefn argument interprets the sign of the result:

The notation a <CF b means comparefn(a, b) < 0; a =CF b means comparefn(a, b) = 0 (of either sign); and a >CF b means comparefn(a, b) > 0.

(This last bit is important in order to ensure reflexivity, symmetry and transitivity of the dictionary values while sorting, and thus providing the widest range of randomness natively available in Javascript)
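
To make the shuffle step concrete, here is a minimal sketch of the comparator-based approach described above. The names shuffleWithSort, rng, and alphabet are assumptions made for illustration and are not taken from the library. One caveat worth keeping in mind: a comparator that returns random values is not a consistent comparison function in the ECMAScript sense, so the resulting order is implementation-defined, and how evenly the possible orderings come up depends on the engine's sort algorithm.

```javascript
// Sketch of the comparator-based shuffle quoted above (names are illustrative).
const PROBABILITY = 0.5;

// Returns a shuffled copy. The spread keeps the caller's array untouched,
// because Array.prototype.sort reorders the array it is called on.
function shuffleWithSort(items, rng = Math.random) {
  // rng() is in [0, 1), so rng() - PROBABILITY is in [-0.5, 0.5):
  // negative about half the time, non-negative the other half.
  return [...items].sort(() => rng() - PROBABILITY);
}

const alphabet = ['a', 'b', 'c', 'd', 'e', 'f'];
const shuffledAlphabet = shuffleWithSort(alphabet);
console.log(shuffledAlphabet.join('')); // some ordering of the six letters, varies per run
console.log(alphabet.join(''));         // "abcdef" (the original is unchanged)
```

The copy via spread is a choice made for this sketch only; the line quoted from the library assigns the result of sort back to finalDict.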

And what is Maths behind this loop?
for (j = 0; j < uuidLength; j += 1) {
randomPartIdx = parseInt((Math.random() * this.dictLength).toFixed(0), 10) % this.dictLength;
id += this.dict[randomPartIdx];
}

The expression j = 0 initializes the loop counter.
The expression j < uuidLength is the loop condition (a less-than comparison).
The expression j += 1 performs an arithmetic addition and assigns the result back to j.
The expression parseInt((Math.random() * this.dictLength).toFixed(0), 10) returns an integer between 0 and the length of the dictionary, inclusive, because toFixed(0) rounds to the nearest integer.
Finally, we apply the % this.dictLength modulo operation to the previous value so that, if the value is equal to this.dictLength, it wraps back to 0 (because the dictionary length is one more than the last index available in the dictionary array).
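
The steps above can be sketched as a standalone function. The names randomIndex, randomId, rng, and sampleDict are assumptions made for illustration; the rng parameter exists only so the edge cases can be checked with fixed values. Index 0 collects both the values that round down to 0 and the values that round up to dictLength, so every index ends up covering a range of about width 1.

```javascript
// rng is injectable so the rounding and wrap-around cases can be checked deterministically.
function randomIndex(dictLength, rng = Math.random) {
  const scaled = rng() * dictLength;               // in [0, dictLength)
  const rounded = parseInt(scaled.toFixed(0), 10); // nearest integer, 0..dictLength
  return rounded % dictLength;                     // dictLength wraps back to 0
}

// Builds an id by picking one dictionary character per loop iteration.
function randomId(dict, uuidLength, rng = Math.random) {
  let id = '';
  for (let j = 0; j < uuidLength; j += 1) {
    id += dict[randomIndex(dict.length, rng)];
  }
  return id;
}

const sampleDict = 'abcdefghij'.split('');
console.log(randomIndex(10, () => 0.04)); // 0.4 rounds to 0, so 0
console.log(randomIndex(10, () => 0.37)); // 3.7 rounds to 4, so 4
console.log(randomIndex(10, () => 0.99)); // 9.9 rounds to 10, which wraps to 0
console.log(randomId(sampleDict, 6));     // a 6-character id, varies per run
```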

How mathematically you are making sure that randomPartIdx is generating different numbers sequence every time?

We are not.

Writing a pseudo-random number generator is out of the scope and purpose of this library. As such, we can only guarantee that by default we will use the safest pseudo-random number generator available to our users without external dependencies, which would be Math.random(). This means that the "quality" of randomness of randomPartIdx depends exclusively on the implementation-defined algorithm provided by the JavaScript engine running the code, for example V8 (Node, Chrome), SpiderMonkey (Firefox), etc.

Regarding this last point, if you would like to have a way to provide your own drop-in replacement pseudo-random number generator, I invite you to either:

open an issue (with enhancement label) with a proposal on how to implement the feature
or fork the repo, implement the feature, and submit your PR

Personally I don't have a use for this feature, but feel it would be a nice feature to add to this project if someone wants to contribute it 😄
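
For anyone considering such a contribution, here is a rough sketch of what a drop-in replacement generator could look like. Everything in it is hypothetical: makeIdGenerator, its random option, and makeLcg are names invented for this example and are not part of the library. The seeded generator is a tiny linear congruential generator used only to show the idea, not a recommendation for real use.

```javascript
// Hypothetical API: makeIdGenerator and its `random` option are illustrations,
// not functions exported by the library.
function makeIdGenerator({ dict, length = 6, random = Math.random }) {
  return function uid() {
    let id = '';
    for (let j = 0; j < length; j += 1) {
      // Math.floor is swapped in here for the rounding-plus-modulo step;
      // it maps [0, 1) onto the indices 0..dict.length - 1.
      id += dict[Math.floor(random() * dict.length)];
    }
    return id;
  };
}

// Tiny seeded linear congruential generator, for demonstration only.
function makeLcg(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (Math.imul(1664525, state) + 1013904223) >>> 0;
    return state / 4294967296; // in [0, 1)
  };
}

const letters = 'abcdefghijklmnopqrstuvwxyz0123456789'.split('');
const uidA = makeIdGenerator({ dict: letters, random: makeLcg(42) });
const uidB = makeIdGenerator({ dict: letters, random: makeLcg(42) });
console.log(uidA() === uidB()); // true: same seed, same sequence
```

Any function that returns a number in [0, 1) could be passed as the random option in this sketch, which is what would make the generator swappable.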

Anshul Negi · Answer 2 · Thu Sep 02 2021 15:01:22 GMT+0800 (China Standard Time)

Thanks for explaining, it was really helpful.
I will definitely contribute in the near future.

Jean Lescure · Answer 3 · Fri Sep 03 2021 00:57:48 GMT+0800 (China Standard Time)

Cheers 🍻

Jean Lescure · Answer 4 · Mon Sep 06 2021 13:05:06 GMT+0800 (China Standard Time)

I'm currently going over our v5 Roadmap (#41) and, looking at the note regarding rand vs rand_pcg, I feel that the "bring-your-own-pseudo-random-generator" feature might be even more useful than I anticipated. For example, it could be used to set the default pseudo-random generator for each language.