A need arose for large, random-but-real-looking data sets and, like any proper software developer, I immediately took things too far. I also identified - and took advantage of - an opportunity to learn a lot more about C#'s impressive reflective capabilities.
As for the name, I'm naturally terrible at naming things so I started simply with just "Random Generator". That shortened into "Rg" which, thanks to high school chemistry, I'd recalled was a chemical symbol. Then I learned it is extremely radioactive, which of course is one of the best naturally-occurring sources of true randomness. So the name stuck.
Implemented as a .NET Core REST API, Rg can be built & run on any major modern OS: run `dotnet build` in the project directory (the same one in which `Roentgenium.csproj` lives).
- .NET Core 2.2 or later
- Microsoft Azure account:
  - KeyVault can be used to store secrets (such as connection strings)
  - Blob storage can be used to store the resulting artifacts
- A Redis instance:
  - To use the `stream` output format
Run `dotnet run` in the project directory.
- The `ASPNETCORE_ENVIRONMENT` environment variable directly controls (via simple substitution) which `appsettings.*.json` file is used: `appsettings.{ASPNETCORE_ENVIRONMENT}.json`.
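For example, setting the variable before launch selects the corresponding settings file (the environment name "Staging" here is purely illustrative):

```shell
# "Staging" is an illustrative environment name, not one shipped with Rg
export ASPNETCORE_ENVIRONMENT=Staging

# Rg would then read: appsettings.Staging.json
echo "appsettings.${ASPNETCORE_ENVIRONMENT}.json"

# then launch as usual:
# dotnet run
```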
In the simplest, default mode, generated data sets will be persisted only via the Filesystem persistence module with the artifacts written into the working directory.
The build artifacts (`Roentgenium.dll` and its brethren in `bin/{CONFIG}/netcoreapp2.2`) are relocatable and can be run directly via the `dotnet` tool by eliding (oddly enough) the `run` verb and specifying the `dll` path itself:

```
dotnet bin/Release/netcoreapp2.2/Roentgenium.dll
```
Rg will always look in the working directory for the appropriate `appsettings` file, so if run directly from the `bin/Release/netcoreapp2.2/` directory without any settings files, the default configuration (as noted above) will be used.
For interface documentation, Rg includes Swagger self-description support, always accessible on any running instance via the `/swagger` path.
Postman is the recommended way to interact easily with the interface.
This output format is implementation-specific to Rg, utilizing Redis pub/sub to stream generated data to any number of interested subscribers. It requires that the `Extra` field of the generator configuration structure include an entry named `streamId`, which specifies the channel name to be used when publishing each record.
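As a sketch, a generator configuration carrying the channel name might look like this (only `extra` and `streamId` come from the description above; the other field names and the casing are illustrative assumptions, not the actual request schema):

```json
{
  "count": 1000,
  "extra": {
    "streamId": "my-demo-channel"
  }
}
```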
`rg.rpjios.com` allows larger data sets but does not persist anything to Azure, nor currently in any way that is retrievable by the end user! Best for playing with the convenience method.

`azure.rg.rpjios.com` only allows small data sets but does persist to Azure (per this configuration).
There are many points of extensibility in Rg, and developers wishing to extend its functionality are encouraged to do so and submit a PR any time.
Living here, they're simple serializable classes implementing `ISpecification` which are then exposed as the supported specifications.
Fields in an `ISpecification` are generated based on either the default generator for the field's `Type` or a custom generator specified explicitly per-field via the `GeneratorTypeAttribute`.
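A minimal sketch of what such a specification might look like (the member shapes, the `PersonSpecification` class, and the `PhoneNumberGenerator` are hypothetical illustrations - check the actual definitions in the repository):

```csharp
// Illustrative only: assumes ISpecification is a marker-style interface and
// that GeneratorTypeAttribute accepts the generator's Type.
public class PersonSpecification : ISpecification
{
    // Generated by the default generator registered for string
    public string LastName { get; set; }

    // Explicit per-field override via a (hypothetical) custom generator
    [GeneratorType(typeof(PhoneNumberGenerator))]
    public string PhoneNumber { get; set; }
}
```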
The general `IPipeline` interface specifies a feed-forward data pipeline, currently concretely implemented only once, by `Pipeline.cs`.
`ISourceStage` implementations. There is currently only one, which generates random data based on the specification & any mutating attributes applied, eventually calling the appropriate field generators to build data sets. However, the source interface only requires implementation of a single method, so adding different sources would be relatively straightforward, though it would require addressing a few assumptions that there'd only ever be one. Of note is that the system assumes any concrete implementation is capable of producing infinite `IGeneratedRecord`s.
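That unbounded-source assumption can be pictured with a C# iterator (a sketch under assumed member names - the actual `ISourceStage` and `IGeneratedRecord` definitions live in the repository):

```csharp
// Sketch: an infinite source via a C# iterator. BuildRecord is hypothetical.
public IEnumerable<IGeneratedRecord> Generate(ISpecification spec)
{
    while (true)                         // never terminates on its own;
        yield return BuildRecord(spec);  // consumers pull only as many records as they need
}
```

Because the iterator is lazy, a consumer bounds it with something like `.Take(n)` rather than the source deciding when to stop.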
`IIntermediateStage` implementations, which are simply both an `ISourceStage` & an `ISinkStage` at once, having each record "passed through" during execution of the overall pipeline. They are enumerated at runtime to be exposed as the supported filters.
`ISinkStage` implementations, named according to their format and decorated with an `OutputFormatSinkType` attribute; these stages are exposed at runtime as the available output formats.
The `stream` format implementation does not use the bona fide `stream` data type, as it isn't yet widely available.
Implementations of `IPersistenceStage`, a specialized stage that exists only to persist the otherwise-ephemeral results of the pipeline somewhere else.