datafaker-net / datafaker

Generating fake data for the JVM (Java, Kotlin, Groovy) has never been easier!

Home Page: https://www.datafaker.net


Faker v2.0.2 generates invalid Sweden SSN

ksiczek opened this issue

Hi there 👋

We've been using net.datafaker.providers.base.IdNumber#validSvSeSsn for quite some time, but it stopped working with version 2.0.2. Our application resolves users' birth dates from their Swedish SSN, which, according to the SSN specification, should always be possible. Unfortunately, the faker started returning random numbers, for example 295659+4036. Could you look into that? The method name validSvSeSsn suggests that the number follows the rules, so maybe there is a regression in 2.0.2?

Thanks for reporting this, we will fix this ASAP.

@snuyanzin could it be that this one introduced a regression: https://github.com/datafaker-net/datafaker/pull/927/files

It seems we lack tests for such cases.
@ksiczek could you please check if #987 works ok for your case?

I've just merged the PR, if you could try the snapshot version, that would be great, happy to make a release this weekend if it fixes the problem.

Cool, thanks for such a prompt reaction. I've checked and it seems it works for us or at least it is not as bad as before ☺️ Just bear in mind that our test suites use this method here and there.

As far as I can tell, you only generate numbers in the format ######[-|+]####, which means they are for people born in 19XX, right? If so, the number 631221+6481 seems valid, but 🇸🇪 people only get a + in the year they turn 100, so the number is actually invalid, isn't it? Regarding your implementation: you repeatedly generate a pattern-matching number until you find a valid one, don't you?
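As an alternative to retrying until a number happens to be valid, a number can be built directly from a birth date. The following is only a minimal sketch, not Datafaker's actual implementation: the class and method names are hypothetical, and it assumes the publicly documented rule that the last digit is a Luhn check digit computed over the nine digits before it (yyMMdd plus a three-digit serial). The separator follows the rule mentioned above: + from the year the person turns 100, otherwise -.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Datafaker.
final class SwedishSsnSketch {

    /** Luhn check digit over the nine digits yyMMdd + serial (weights 2,1,2,1,...). */
    static int luhnCheckDigit(String nineDigits) {
        int sum = 0;
        for (int i = 0; i < nineDigits.length(); i++) {
            int d = nineDigits.charAt(i) - '0';
            if (i % 2 == 0) {          // every other digit, starting with the first, is doubled
                d *= 2;
                if (d > 9) d -= 9;     // same as adding the two digits of the product
            }
            sum += d;
        }
        return (10 - sum % 10) % 10;
    }

    /**
     * Builds a short-format number for the given birth date.
     * referenceYear is the year in which the number is supposed to be "current".
     */
    static String build(LocalDate birthDate, int serial, int referenceYear) {
        if (serial < 0 || serial > 999) {
            throw new IllegalArgumentException("serial must have at most three digits");
        }
        String date = birthDate.format(DateTimeFormatter.ofPattern("yyMMdd"));
        String serialPart = String.format("%03d", serial);
        // '+' applies from the year the person turns 100.
        char separator = referenceYear - birthDate.getYear() >= 100 ? '+' : '-';
        return date + separator + serialPart + luhnCheckDigit(date + serialPart);
    }

    public static void main(String[] args) {
        // Born 1963, viewed from 2024: under 100, so the separator is '-'.
        System.out.println(build(LocalDate.of(1963, 12, 21), 648, 2024)); // 631221-6481
    }
}
```

With this approach the serial could be drawn at random while the date, separator and check digit stay consistent with each other. Whether every three-digit serial is actually assignable is not covered here.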

I think the change is worth releasing anyway 👍

Hmm, I'm not sure if we published a new version of the snapshot. It seems the builds are failing.

Correction: it seems they are working now.

I'm happy to have another look at the generation. If it's important to you, I can provide a version which is always valid.

But what exactly are the rules?

Is it something like:
#####+19## for people born in 19xx?
And what is the purpose of the + and -? Do you mean that the ID changes, so that if I was born in 1923, I currently have a -, but next year, in 2024, I get a + because I'm older than 100?

I'm not really familiar with the logic, so if you could provide the rules, happy to make sure the generation always generates valid results.

I haven't seen 2.0.3-SNAPSHOT so I checked the project out and installed it in the local repository - that is how I checked your fix.

Regarding the logic we used

It seems that the number itself consists of:

  1. A date which might be yyyyMMdd or yyMMdd
  2. A - or + separator
  3. A suffix that contains sex, checksum, etc.

So my interpretation would be that if you have 120923[-|+]#### then the person was born either in 1912 or 2012 and yes, according to the docs:

Between the birthdate and the birth number is a hyphen (-), which is replaced by a plus sign (+) in the year the person becomes 100 years old.
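To make the interpretation above concrete, here is a rough sketch of how a birth date could be resolved from the short format. It is an illustration under stated assumptions, not the code we or Datafaker actually use: the class and method names are hypothetical, only the ######[-|+]#### form is handled, the checksum is not verified, and the century is decided by year only.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.Optional;

// Hypothetical helper, not part of Datafaker.
final class SwedishSsnBirthDate {

    /**
     * Resolves the birth date of a short-format number such as "120923-1234".
     * With '-' the person is assumed to turn at most 99 in referenceYear;
     * with '+' the person is assumed to turn 100 or more in referenceYear.
     * Returns empty if the format or the date part is invalid.
     */
    static Optional<LocalDate> birthDate(String ssn, int referenceYear) {
        if (ssn == null || !ssn.matches("\\d{6}[-+]\\d{4}")) {
            return Optional.empty();
        }
        int yy = Integer.parseInt(ssn.substring(0, 2));
        int month = Integer.parseInt(ssn.substring(2, 4));
        int day = Integer.parseInt(ssn.substring(4, 6));

        // Latest year the person can have been born in, given the separator.
        int latestYear = ssn.charAt(6) == '+' ? referenceYear - 100 : referenceYear;
        int year = latestYear - latestYear % 100 + yy;
        if (year > latestYear) {
            year -= 100; // step back one century if the candidate is too late
        }
        try {
            return Optional.of(LocalDate.of(year, month, day));
        } catch (DateTimeException e) {
            return Optional.empty(); // e.g. month 56 in 295659+4036
        }
    }

    public static void main(String[] args) {
        System.out.println(birthDate("120923-1234", 2024)); // Optional[2012-09-23]
        System.out.println(birthDate("120923+1234", 2024)); // Optional[1912-09-23]
        System.out.println(birthDate("295659+4036", 2024)); // Optional.empty
    }
}
```

Under this reading, 631221+6481 viewed from 2024 resolves to 1863-12-21, which is why a + on a 63 birth year looks wrong for test data.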

We work for 🇸🇪 customers, and I'll have a look on Monday at whether we could support you by means of a PR. Is that OK with you?

Sounds great to me.

Absolutely, a PR would be more than welcome, happy to make a release which includes your fix!

I'm assuming this is no longer an important issue, so I'll close the ticket for now. If one day in the future you want to send a PR, it would be most welcome.