coac-gmbh / kotlin-faker

Generate realistic-looking fake data such as names, addresses, banking details, and more, for use in testing and data anonymization.

Generates realistic-looking fake data
Just like this fake-logo, but not quite so.

About

A Kotlin port of the popular Ruby faker gem. It generates realistic-looking fake data such as names, addresses, banking details, and more, for use during development and testing.

Comparison with similar jvm-based "faker libs"

While there are several other libraries out there with similar functionalities, I had several reasons for creating this one:

  • most of the ones I've found are written in java and I wanted to use kotlin
  • none of them had the functionality I needed
  • I didn't feel like forking an existing kotlin-based lib - fakeit - and refactoring the entire codebase, especially with it not being maintained for the past couple of years.

So why use this one instead? I've decided to make a comparison between kotlin-faker and other libs that have been out there for quite some time.

The benchmark time is the average execution time of 10 consecutive runs. Each run includes creating a new Faker instance and generating 1_000_000 values with the function that returns a person's full name.

Note: benchmarks for blocoio/faker could not be done due to unexpected exceptions coming from the lib; benchmarks for moove-it/fakeit could not be done due to android dependencies in the lib.

| | kotlin-faker | DiUS/java-faker | Devskiller/jfairy | blocoio/faker | moove-it/fakeit |
|---|---|---|---|---|---|
| language | kotlin | java | java | java | kotlin |
| number of available providers (address, name, etc.) | 171 | 73 | 8 | 21 | 36 |
| number of available locales | 55 | 47 | 10 | 46 | 44 |
| extra functionality (i.e. randomClassInstance gen) | | | | | |
| actively maintained | | | | | |
| cli-bot app | | | | | |
| benchmarks | 5482ms | 17529.9ms | 15036.5ms | NA | NA |
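The benchmark procedure described above (10 runs, each creating a new instance and generating 1_000_000 values, then averaging) could be sketched roughly as follows. This is only an illustration of the methodology, not the actual benchmark code: `averageRunMillis` and its `newGenerator` parameter are hypothetical names, where `newGenerator` stands in for "create a new Faker instance and return its full-name function".

```kotlin
import kotlin.system.measureTimeMillis

// Hypothetical harness illustrating the described methodology.
// `newGenerator` is an assumption: it stands in for creating a fresh
// generator instance and returning the function under test.
fun averageRunMillis(
    runs: Int = 10,
    valuesPerRun: Int = 1_000_000,
    newGenerator: () -> (() -> String)
): Double {
    val timings = (1..runs).map {
        measureTimeMillis {
            // instance creation is part of the timed run
            val generate = newGenerator()
            repeat(valuesPerRun) { generate() }
        }
    }
    // average over all runs, in milliseconds
    return timings.average()
}
```

A single-process loop like this ignores JIT warm-up, so absolute numbers should be read as rough comparisons only.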

Usage

Downloading

Latest releases are always available on jcenter.

From v1.4.1 onward releases are also published on maven central.

With gradle

dependencies {
    implementation "io.github.serpro69:kotlin-faker:$version"
}

With maven

<dependencies>
    <dependency>
        <groupId>io.github.serpro69</groupId>
        <artifactId>kotlin-faker</artifactId>
        <version>${version}</version>
    </dependency>
</dependencies>

Using release candidate versions

Release candidates contain the newest functionality before the next version is released and can be downloaded by adding the following repo:

Gradle:

repositories {
    maven {
      url 'https://dl.bintray.com/serpro69/maven-release-candidates/'
    }
}

Maven:

<repositories>
    <repository>
        <id>serpro69-maven</id>
        <url>https://dl.bintray.com/serpro69/maven-release-candidates/</url>
        <layout>default</layout>
        <releases>
            <enabled>true</enabled>
        </releases>
    </repository>
</repositories>

Downloading a jar
The jar and pom files can also be found at this link

Generating data

val faker = Faker()

faker.name.firstName() // => Ana
faker.address.city() // => New York

Configuring Faker

Default configuration

If no FakerConfig instance is passed to the Faker constructor, the default configuration is used:

  • locale is set to en
  • random is seeded with a pseudo-randomly generated number.
  • uniqueGeneratorRetryLimit is set to 100

Deterministic Random

Faker supports seeding of its PRNG (pseudo-random number generator) through configuration, so that repeated function invocations produce deterministic output.

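val fakerConfig = FakerConfig.builder().create {
    random = Random(42)
}

val faker = Faker(fakerConfig)
val city1 = faker.address.city()
val name1 = faker.name.name()

// a second config seeded identically, so the two fakers do not share one Random instance
val otherFakerConfig = FakerConfig.builder().create {
    random = Random(42)
}

val otherFaker = Faker(otherFakerConfig)
val city2 = otherFaker.address.city()
val name2 = otherFaker.name.name()

city1 == city2 // => true
name1 == name2 // => true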
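The underlying idea can be shown with the JDK alone: two `java.util.Random` instances created with the same seed produce the same sequence. The helper below is a hypothetical, library-independent sketch (`seededPicks` is not part of kotlin-faker); it only demonstrates why a fixed seed makes output repeatable.

```kotlin
import java.util.Random

// Hypothetical helper: draws `count` pseudo-random indices in [0, bound)
// from a PRNG created with the given seed.
fun seededPicks(seed: Long, count: Int, bound: Int): List<Int> {
    val random = Random(seed)
    return List(count) { random.nextInt(bound) }
}
```

Calling `seededPicks(42, 5, 100)` twice returns equal lists, because each call starts from a fresh PRNG with the same seed. Two consumers reading from one shared `Random` instance would instead split a single sequence between them.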

Generating unique values

Faker supports generation of unique values. There are basically two ways to generate unique values:

Unique values for entire provider

val faker = Faker()
faker.unique.configuration {
    enable(faker::address) // enable generation of unique values for address provider
}

// will generate unique country each time it's called
repeat(10) { faker.address.country() }

To clear the record of unique values that were already generated:

faker.unique.clear(faker::address) // clears used values for address provider

faker.unique.clearAll() // clears used values for all providers

To disable generation of unique values:

faker.unique.disable(faker::address) // disables generation of unique values for address provider and clears all used values

faker.unique.disableAll() // disables generation of unique values for all providers and clears all used values

Unique values for particular functions of a provider

val faker = Faker()

repeat(10) { faker.address.unique.country() } // will generate unique country each time `country()` is prefixed with `unique`

repeat(10) { faker.address.city() } // this will not necessarily be unique (unless `faker.unique.enable(faker::address)` was called previously)

To clear the record of unique values that were already generated:

faker.address.unique.clear("city") // clears used values for `faker.address.unique.city()` function

faker.address.unique.clearAll() // clears used values for all functions of address provider

Configuring retry limit
If the retry count of the unique generator exceeds the configured value (defaults to 100), a RetryLimitException is thrown.

It is possible to re-configure the default value through FakerConfig:

val config = FakerConfig.builder().create {
    uniqueGeneratorRetryLimit = 1000
}

val faker = Faker(config)
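To make the retry limit concrete, here is a minimal, library-independent sketch of a unique generator that gives up after a fixed number of attempts. It is not kotlin-faker's implementation: `UniqueValues` and `RetryLimitExceeded` are hypothetical stand-ins defined here so the example compiles on its own, and the exact way the library counts retries may differ.

```kotlin
// Stand-in for the library's exception (assumption, defined locally).
class RetryLimitExceeded(message: String) : RuntimeException(message)

// Hypothetical unique-value wrapper around any generator function.
class UniqueValues<T>(private val retryLimit: Int = 100) {
    private val used = mutableSetOf<T>()

    fun next(generate: () -> T): T {
        repeat(retryLimit) {
            val candidate = generate()
            // add() returns false when the value was already recorded
            if (used.add(candidate)) return candidate
        }
        throw RetryLimitExceeded("No unique value after $retryLimit attempts")
    }

    // forget previously generated values
    fun clear() = used.clear()
}
```

With a small pool of possible values the limit is reached quickly, which is why raising it only helps when unused values still exist.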

Excluding values from generation

It is possible to exclude values from being generated with the unique generator. This is configured on the faker level for each (or in some cases, all) of the providers.

val faker = Faker()

faker.unique.configuration {
    // Enable generation of unique values for Address provider
    // Any unique generation configuration will only affect "enabled" providers
    enable(faker::address)

    // Exclude listOfValues from being generated 
    // in all providers that are enabled for unique generation
    exclude(listOfValues)

    // Exclude values starting with "A" from being generated 
    // in all providers that are enabled for unique generation
    exclude { listOf(Regex("^A")) }

    // Additional configuration for particular provider
    // First enable generation of unique values for Name provider
    enable(faker::name) {
        // Exclude listOfNames from being generated by any Name provider function
        excludeFromProvider<Name>(listOfNames)

        // Exclude listOfLastNames from being generated by Name#lastName function
        excludeFromFunction(Name::lastName, listOfLastNames)

        // Exclude values starting with "B" from being generated by any Name provider function
        excludeFromProvider<Name> { listOf(Regex("^B")) }

        // Exclude values starting with "C" from being generated by Name#lastName function
        excludeFromFunction(Name::lastName) { listOf(Regex("^C")) }
    }
}

// Based on the above config the following will be true in addition to generating unique values:
val city = faker.address.city()
assertTrue(listOfValues.contains(city) == false)
assertTrue(city.startsWith("A") == false)

val firstName = faker.name.firstName()
val lastName = faker.name.lastName()
assertTrue(listOfValues.contains(firstName) == false)
assertTrue(listOfValues.contains(lastName) == false)
assertTrue(listOfNames.contains(firstName) == false)
assertTrue(listOfNames.contains(lastName) == false)
assertTrue(listOfLastNames.contains(lastName) == false)
assertTrue(firstName.startsWith("A") == false)
assertTrue(lastName.startsWith("A") == false)
assertTrue(firstName.startsWith("B") == false)
assertTrue(lastName.startsWith("B") == false)
assertTrue(lastName.startsWith("C") == false)

Exclusions only apply when the whole category (for example Address or Name) is enabled for unique generation of values.

faker.address.unique.country() // will still generate unique values, but won't consider exclusions, if any
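The exclusion rules above combine literal values and regular expressions. A rough sketch of such a check is shown below; `isExcluded` is a hypothetical helper, and treating each pattern as a partial match (so that `Regex("^A")` rejects anything starting with "A") is an assumption based on the examples, not a statement about the library's internals.

```kotlin
// Hypothetical helper: true when `value` must not be returned by a generator.
// Assumption: a pattern excludes a value if it matches anywhere in it,
// so anchors such as ^ decide where the match must start.
fun isExcluded(
    value: String,
    excludedValues: Collection<String>,
    excludedPatterns: List<Regex>
): Boolean =
    value in excludedValues || excludedPatterns.any { it.containsMatchIn(value) }
```

For example, `isExcluded("Amsterdam", emptyList(), listOf(Regex("^A")))` is true, while the same call with "Oslo" is false.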

Localized dictionary

Faker can be configured to use a localized dictionary file instead of the default en locale.

val fakerConfig = FakerConfig.builder().create {
    locale = "nb-NO"
}

val faker = Faker(fakerConfig)
val city1 = faker.address.city() // => Oslo

Available Locales

List of available locales:

  • ar
  • bg
  • ca
  • ca-CAT
  • da-DK
  • de
  • de-AT
  • de-CH
  • ee
  • en - default
  • en-AU
  • en-au-ocker
  • en-BORK
  • en-CA
  • en-GB
  • en-IND
  • en-MS
  • en-NEP
  • en-NG
  • en-NZ
  • en-PAK
  • en-SG
  • en-TH
  • en-UG
  • en-US
  • en-ZA
  • es
  • es-MX
  • fa
  • fi-FI
  • fr
  • fr-CA
  • fr-CH
  • he
  • hy
  • id
  • it
  • ja
  • ko
  • lv
  • nb-NO
  • nl
  • no
  • pl
  • pt
  • pt-BR
  • ru
  • sk
  • sv
  • th
  • tr
  • uk
  • vi
  • zh-CN
  • zh-TW

Using a non-default locale replaces the values in some of the providers with the values from the localized dictionary.

val fakerConfig = FakerConfig.builder().create {
    locale = "es"
}
val faker = Faker(fakerConfig)
faker.address.city() // => Barcelona

Note that if the localized dictionary file does not contain a category (or a parameter in a category) that is present in the default locale, the non-localized value is used instead.

val faker = Faker()
faker.gameOfThrones.cities() // => Braavos

val fakerConfig = FakerConfig.builder().create {
    locale = "nb-NO"
}
val localizedFaker = Faker(fakerConfig)
// `game_of_thrones` category is not localized for `nb-NO` locale
localizedFaker.gameOfThrones.cities() // => Braavos
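The fallback behaviour can be pictured as a two-step lookup: try the localized dictionary first, then the default one. The sketch below uses plain maps and hypothetical names (`Dictionary`, `resolve`); the real dictionaries are yml files and the library's lookup code may be organised differently.

```kotlin
// Hypothetical in-memory shape: category -> parameter -> candidate values.
typealias Dictionary = Map<String, Map<String, List<String>>>

// Looks up `category.key` in the localized dictionary and falls back to the
// default-locale dictionary when the category or the parameter is missing.
fun resolve(
    category: String,
    key: String,
    localized: Dictionary,
    fallback: Dictionary
): List<String> =
    localized[category]?.get(key)
        ?: fallback[category]?.get(key)
        ?: error("Unknown key '$category.$key'")
```

With a localized dictionary that only defines `address.city`, a lookup of `game_of_thrones.cities` returns the default-locale values, mirroring the example above.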

Java interop

Although this lib was created with Kotlin in mind, it can still be used from a Java-based project thanks to Kotlin's Java interop.

Configuring Faker:

FakerConfig fakerConfig = FakerConfigBuilder.create(FakerConfig.builder(), fromConsumer(builder -> {
    builder.setRandom(new Random(42));
    builder.setLocale("en-AU");
}));

If the builder lambda is not wrapped with the fromConsumer method, it must return Unit explicitly:

FakerConfig fakerConfig = FakerConfigBuilder.create(FakerConfig.builder(), builder -> {
    builder.setRandom(new Random(42));
    builder.setLocale("en-AU");
    return Unit.INSTANCE;
});
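The explicit return is needed because a Kotlin function type such as `(T) -> Unit` is seen from Java as a function that returns `Unit`, whereas a Java `Consumer` returns nothing. An adapter like fromConsumer can bridge the two. The sketch below shows one plausible shape for such an adapter, written in Kotlin; it is an assumption for illustration, and the library's actual helper may be declared differently.

```kotlin
import java.util.function.Consumer

// Assumed shape of a Consumer-to-Kotlin-lambda adapter (illustrative only).
// The returned lambda calls the consumer and implicitly returns Unit,
// so Java callers do not have to write `return Unit.INSTANCE`.
fun <T> fromConsumer(consumer: Consumer<T>): (T) -> Unit =
    { value -> consumer.accept(value) }
```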

Calling Faker methods:

new Faker(fakerConfig).getName().firstName(); // => John

CLI

A command-line application can be used for a quick lookup of faker functions. See faker-bot README for installation and usage details.

Data Providers

Below is the list of available providers that correspond to the dictionary files found in core/locales/en

Note that not all (although most) of the providers and their functions are implemented at this point. For more details see the particular .md file for each provider below.

List of available providers (clickable):

Generating a random instance of any class

There are some rules when creating a random instance of a class:

  • The constructor with the least number of arguments is used (This can be configured - read on.)
  • kotlin.collections.* and kotlin.Array types in the constructor are not supported at the moment

To generate a random instance of any class use Faker().randomProvider. For example:

class Foo(val a: String)
class Bar(val foo: Foo)

class Test {
    @Test
    fun test() {
        val faker = Faker()

        val foo: Foo = faker.randomProvider.randomClassInstance()
        val bar: Bar = faker.randomProvider.randomClassInstance()
    }
}

Pre-Configuring type generation for constructor params

Some, or all, of the constructor params can be instantiated with values that follow pre-configured logic by using the typeGenerator function. Consider the following example:

class Baz(val id: Int, val uuid: UUID)

class Test {
    @Test
    fun test() {
        val faker = Faker()

        val baz: Baz = faker.randomProvider.randomClassInstance {
            typeGenerator<UUID> { UUID.fromString("00000000-0000-0000-0000-000000000000") }
            typeGenerator<Int> { 0 }
        }

    }
}

For each instance of Baz the following will be true:

baz.id == 0
baz.uuid == UUID.fromString("00000000-0000-0000-0000-000000000000")

The example itself does not make that much sense, since we're using "static" values, but we could also do something like:

val baz: Baz = faker.randomProvider.randomClassInstance {
    typeGenerator<UUID> { UUID.randomUUID() }
}

or even:

class Person(val id: Int, val name: String)

class Test {
    @Test
    fun test() {
        val faker = Faker()

        val person: Person = faker.randomProvider.randomClassInstance {
            typeGenerator<String> { faker.name.fullName() }
        }
    }
}
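One way to picture what typeGenerator does is a registry that maps a type to a function producing values of that type, consulted whenever a constructor parameter of that type is needed. The sketch below is a simplified illustration with hypothetical names (`TypeGenerators`, `generate`); it is not the library's implementation.

```kotlin
import kotlin.reflect.KClass

// Hypothetical registry: type -> generator function for that type.
class TypeGenerators {
    val generators = mutableMapOf<KClass<*>, () -> Any>()

    // Registers a generator for type T (mirrors the shape of the calls above).
    inline fun <reified T : Any> typeGenerator(noinline generator: () -> T) {
        generators[T::class] = generator
    }

    // Returns a value from the registered generator, or null when none exists
    // (a real implementation would then fall back to random generation).
    inline fun <reified T : Any> generate(): T? =
        generators[T::class]?.invoke() as T?
}
```

After registering `typeGenerator<Int> { 0 }`, a call to `generate<Int>()` returns 0, while a type with no registered generator yields null.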

Deterministic constructor selection

By default, the constructor with the least number of args is used when creating a random instance of the class. This might not always be desirable and can be configured. Consider the following classes:

class Foo
class Bar(val int: Int)
class Baz(val foo: Foo, val string: String)

class FooBarBaz {
    var foo: Foo? = null
        private set
    var bar: Bar? = null
        private set
    var baz: Baz? = null
        private set

    constructor(foo: Foo) {
        this.foo = foo
    }

    constructor(foo: Foo, bar: Bar) : this(foo) {
        this.bar = bar
    }

    constructor(foo: Foo, bar: Bar, baz: Baz) : this(foo, bar) {
        this.baz = baz
    }
}

If there is a need to use the constructor with 3 arguments when creating an instance of FooBarBaz, we can do it like so:

class Test {
    @Test
    fun test() {
        val faker = Faker()

        val fooBarBaz: FooBarBaz = faker.randomProvider.randomClassInstance {
            constructorParamSize = 3
            fallbackStrategy = FallbackStrategy.USE_MAX_NUM_OF_ARGS
        }
    }
}

In the above example, FooBarBaz will be instantiated with the first discovered constructor that has parameters.size == 3; if there are multiple constructors that satisfy this condition - a random one will be used. Failing that (for example, if there is no such constructor), a constructor with the maximum number of arguments will be used to create an instance of the class.

As an alternative to constructorParamSize, the constructorFilterStrategy config property can be used:

class Test {
    @Test
    fun test() {
        val faker = Faker()

        val fooBarBaz: FooBarBaz = faker.randomProvider.randomClassInstance {
            constructorFilterStrategy = ConstructorFilterStrategy.MAX_NUM_OF_ARGS
        }
    }
}

There are some rules to the above:

  • constructorParamSize config property takes precedence over constructorFilterStrategy
  • both can be specified at the same time, though in most cases it probably makes more sense to use fallbackStrategy with constructorParamSize as it just makes things a bit more readable
  • configuration properties that are set in randomClassInstance block will be applied to all "children" classes. For example classes Foo, Bar, and Baz will use the same random instance configuration settings when instances of those classes are created in FooBarBaz class.
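The selection logic described in this section (prefer a constructor with the requested parameter count, pick randomly among several matches, otherwise apply a fallback) can be sketched with plain Java reflection. Everything below is hypothetical and for illustration only: `Fallback`, `pickConstructor`, and `Sample` are not part of kotlin-faker, and the library's real strategy enums have their own names and options.

```kotlin
import java.lang.reflect.Constructor

// Hypothetical fallback options for the sketch.
enum class Fallback { USE_MIN_NUM_OF_ARGS, USE_MAX_NUM_OF_ARGS }

// Hypothetical class with three constructors of different sizes.
class Sample {
    constructor(a: Int)
    constructor(a: Int, b: Int)
    constructor(a: Int, b: Int, c: Int)
}

fun pickConstructor(type: Class<*>, paramSize: Int?, fallback: Fallback): Constructor<*> {
    val constructors = type.declaredConstructors.toList()
    require(constructors.isNotEmpty()) { "${type.name} has no constructors" }

    // 1. exact match on parameter count; a random one if several match
    val exact = constructors.filter { it.parameterCount == paramSize }
    if (exact.isNotEmpty()) return exact.random()

    // 2. otherwise fall back to the smallest or largest constructor
    val bySize = constructors.sortedBy { it.parameterCount }
    return when (fallback) {
        Fallback.USE_MIN_NUM_OF_ARGS -> bySize.first()
        Fallback.USE_MAX_NUM_OF_ARGS -> bySize.last()
    }
}
```

Requesting 2 parameters for `Sample` selects the two-argument constructor; requesting 5 with `USE_MAX_NUM_OF_ARGS` falls back to the three-argument one.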

Migrating to 1.0

Prior to version 1.0:

  • Faker was a singleton.
  • Random seed was provided through Faker.Config instance.
  • Locale was provided as parameter to init() function.
  • Provider functions were function literals. If invoke() was explicitly specified, it has to be removed (see below).

After version 1.0:

  • Faker is a class.
  • Configuration (rng, locale) is set with FakerConfig class. An instance of FakerConfig can be passed to Faker constructor.
  • Provider functions are no longer function literals. Explicit calls to invoke() will cause a compilation error.
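The difference between a function literal and a regular function explains why invoke() used to compile and no longer does. The two classes below are illustrative stand-ins only (`OldStyleAddress` and `NewStyleAddress` are hypothetical names, and the returned value is a placeholder), not the library's provider classes.

```kotlin
// Pre-1.0 style (illustrative): the member is a property holding a function
// literal, so `city()` and `city.invoke()` are both valid Kotlin, and Java
// code accesses it as `getCity().invoke()`.
class OldStyleAddress {
    val city: () -> String = { "New York" }
}

// 1.0 style (illustrative): the member is a regular function, so only
// `city()` is valid; there is no property to call `invoke()` on.
class NewStyleAddress {
    fun city(): String = "New York"
}
```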

For kotlin users

- // prior to version 1.0
- Faker.Config.random = Random(42)
- Faker.init(locale)
- Faker.address.city()
- // or with explicit `invoke()`
- Faker.address.country.invoke()
+ // since version 1.0
+ // locale and random configuration is set with `FakerConfig` class (See Usage in this readme)
+ val faker = Faker(fakerConfig)
+ faker.address.city()
+ // explicit calls to `invoke()` have to be removed
+ faker.address.country()

For java users

Apart from changes to configuring locale and random seed and instantiating Faker through constructor instead of using a singleton instance (see kotlin examples), the main difference for java users is that provider functions are no longer function literals, therefore calls to invoke() operator will have to be removed and getters replaced with function calls.

- // prior to version 1.0
- Faker.init(locale);
- Faker.getAddress().getCity().invoke();
+ // since version 1.0
+ Faker faker = new Faker(fakerConfig);
+ // note `city()` function is called instead of getter 
+ // and no call to `invoke()` operator 
+ faker.getAddress().city();

For developers

Adding a new dictionary (provider)

When adding a new dictionary yml file the following places need to reflect changes:

  • Dictionary.kt - add a new entry to the CategoryName enum. This is only necessary if the category is not already there.
  • Constants.kt - the new dictionary filename should be added to defaultFileNames property.
  • Faker.kt - add a new faker provider property to Faker class.
  • The provider implementation class should go into provider package.
  • doc - add an .md file for the new provider.
  • reflect-config.json has to be updated to build the native image with graal.
  • And of course unit tests.

Build and Deploy

Build/deploy to bintray and github release processes are automated with travis-ci through usage of git tags.

Bumping versions

Versions are bumped manually by creating a tag with the next release version, which has to follow the semver rules, and pushing that tag to origin.

The tag can be created either manually with git tag or with the gradlew tag task.

Pre-releases

To create a new pre-release version (new release candidate) the following can be used: ./gradlew clean tag -Prelease -PnewPreRelease -PbumpComponent={comp}, where comp can be one of the following values: major, minor, or patch.

To bump an existing pre-release to the next version (next release candidate for the same release version) the following can be used: ./gradlew clean tag -Prelease -PpreRelease.

Releases

To promote a pre-release to a release version the following can be used: ./gradlew clean tag -Prelease -PpromoteToRelease.

To create a new release version the following can be used: ./gradlew clean tag -Prelease -PbumpComponent={comp}, where comp can be one of the following values: major, minor, or patch.
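For reference, bumping a component under the semver rules means incrementing that component and resetting the lower ones to zero. The small sketch below illustrates this; `Version` and `bump` are hypothetical names, and this is not the code behind the gradle tag task (pre-release identifiers are also left out).

```kotlin
// Hypothetical semver triple; pre-release suffixes are intentionally omitted.
data class Version(val major: Int, val minor: Int, val patch: Int) {

    // Increments the given component and resets the lower-order ones.
    fun bump(component: String): Version = when (component) {
        "major" -> Version(major + 1, 0, 0)
        "minor" -> Version(major, minor + 1, 0)
        "patch" -> Version(major, minor, patch + 1)
        else -> throw IllegalArgumentException("Unknown component: $component")
    }

    override fun toString() = "$major.$minor.$patch"
}
```

For example, bumping the minor component of 1.4.1 gives 1.5.0, and bumping major gives 2.0.0.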

Make targets

As an alternative to the above, the targets from the Makefile can be used for the same purposes.

Contributing

Feel free to submit a pull request and/or open a new issue if you would like to contribute.

Thanks

Many thanks to these awesome tools that help us in creating open-source software: IntelliJ IDEA and YourKit Java Profiler.

Licence

This code is free to use under the terms of the MIT licence. See LICENCE.md.
