SchisslerGroup / Bigsimr.jl

Simulate multivariate distributions with arbitrary marginals.

Correlation types and MvDistribution

adknudson opened this issue

What is the best way to represent a correlation matrix and a multivariate distribution type? The choices made here affect the stable API and how users interact with the package. There are two approaches:

  • Create a set of "building block" functions to use to convert/check correlations and simulate data
  • Create wrapper functions that take care of converting/checking correlation matrices through function arguments

We can incorporate both by providing the low-level building blocks while also shipping the user-friendly wrappers and types (Correlation, MvDistribution, PDCorMat, etc.).
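To make the two layers concrete, here is a minimal sketch using only the LinearAlgebra standard library. All three function names (is_correlation, rand_normal_rows, simulate_normal) are hypothetical and are not part of Bigsimr.jl; the simulation step draws correlated standard-normal rows only, as a stand-in for the real simulation routine.

```julia
using LinearAlgebra

# Building block (hypothetical name): does M look like a correlation matrix?
function is_correlation(M::AbstractMatrix{<:Real}; atol::Real=1e-8)
    n, m = size(M)
    n == m || return false
    issymmetric(M) || return false
    all(x -> isapprox(x, 1; atol=atol), diag(M)) || return false
    return isposdef(M)
end

# Building block (hypothetical name): draw n rows of correlated standard
# normals from a matrix that is assumed to be valid already.
function rand_normal_rows(R::AbstractMatrix{<:Real}, n::Integer)
    L = cholesky(Symmetric(R)).L
    return randn(n, size(R, 1)) * L'   # each row has covariance L * L' = R
end

# Wrapper (hypothetical name): the check is handled through a keyword argument.
function simulate_normal(R::AbstractMatrix{<:Real}, n::Integer; check::Bool=true)
    if check && !is_correlation(R)
        throw(ArgumentError("R is not a valid correlation matrix"))
    end
    return rand_normal_rows(R, n)
end

R = [1.0 0.5; 0.5 1.0]
X = simulate_normal(R, 1_000)   # 1000×2 matrix
```

Users who want full control can call the building blocks directly; users who want safety by default can call the wrapper and never see the check.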

Comment below on suggested data types and fields, and how they should interact.

Correlation Matrix: Option 1
Uses AbstractPDMat from PDMats.jl as the supertype, so the type inherits a set of useful and efficient functions

struct PDCorMat{T<:Real, S<:AbstractMatrix} <: AbstractPDMat{T}
    dim::Int
    mat::S
    type::AbstractCorrelation
    chol::Cholesky{T,S}
end

Correlation Matrix: Option 2
Essentially re-implements PDMats.jl

struct CorMat{T<:Real, S<:AbstractMatrix, C<:AbstractCorrelation} <: AbstractMatrix{T}
    mat::S
    type::C
    chol::Cholesky{T, S}
end

A correlation matrix can stand as its own type, and an MvDist can have a CorMat as a data field. The minimum CorMat structure needed is

struct CorMat{T<:Real, S<:AbstractMatrix} <: AbstractMatrix{T}
    mat::S
    chol::Cholesky{T,S}
end

and is correlation-type agnostic.
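As a sketch of what the minimal CorMat needs in order to behave like a matrix, the block below adds the two methods that Julia's AbstractArray interface requires (size and getindex) plus a one-argument convenience constructor. The constructor and its restriction to dense Float32/Float64 input are assumptions made for illustration, not a settled design.

```julia
using LinearAlgebra

# The minimal, correlation-type-agnostic struct from above.
struct CorMat{T<:Real, S<:AbstractMatrix} <: AbstractMatrix{T}
    mat::S
    chol::Cholesky{T,S}
end

# Hypothetical convenience constructor: factor once and keep both forms.
# cholesky throws a PosDefException if mat is not positive definite.
function CorMat(mat::Matrix{T}) where {T<:Union{Float32,Float64}}
    chol = cholesky(Symmetric(mat))   # works on a copy, so mat is not overwritten
    return CorMat{T, Matrix{T}}(mat, chol)
end

# Minimum needed for the AbstractMatrix interface: indexing, iteration,
# and printing then work through generic fallbacks.
Base.size(R::CorMat) = size(R.mat)
Base.getindex(R::CorMat, i::Int, j::Int) = R.mat[i, j]

R = CorMat([1.0 0.5; 0.5 1.0])
R[1, 2]        # element access forwards to the stored matrix
R.chol.L       # cached lower-triangular Cholesky factor
```

Storing the factorization alongside the matrix trades a little memory for not refactoring on every simulation call, which is the same trade-off PDMats.jl makes.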

An MvDist type must have at least the following fields:

struct MvDist{T<:Real, C<:AbstractCorrelation}
    margins::Vector{<:UnivariateDistribution}
    target_cor::CorMat{T}
    adjust_cor::CorMat{T}
    type::C
end

If we use Option 2 (where the correlation type is stored with the matrix), then we can implement things like

r = cor_randPD(Float32, 10)
R = CorMat(r, Spearman)
convert(CorMat{Float64, Pearson}, R)
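One way this conversion could work is sketched below. To keep it short, CorMat is reduced to two type parameters (element type and correlation type) and drops the cached Cholesky factor, and a plain function named to_pearson stands in for the convert method; both simplifications and both helper names (pearson_from, to_pearson) are hypothetical. The Spearman-to-Pearson map used here, ρ = 2 sin(π ρ_s / 6), holds under a bivariate normal assumption.

```julia
abstract type AbstractCorrelation end
struct Pearson  <: AbstractCorrelation end
struct Spearman <: AbstractCorrelation end

# Simplified variant: the correlation type is carried as a type parameter.
struct CorMat{T<:Real, C<:AbstractCorrelation} <: AbstractMatrix{T}
    mat::Matrix{T}
end
CorMat(mat::Matrix{T}, ::Type{C}) where {T<:Real, C<:AbstractCorrelation} =
    CorMat{T, C}(mat)

Base.size(R::CorMat) = size(R.mat)
Base.getindex(R::CorMat, i::Int, j::Int) = R.mat[i, j]

# Elementwise map onto the Pearson scale, dispatched on the source type.
pearson_from(::Type{Pearson}, ρ::Real) = ρ
pearson_from(::Type{Spearman}, ρ::Real) = 2 * sin(π * ρ / 6)   # assumes normality

# Hypothetical stand-in for convert(CorMat{T, Pearson}, R).
function to_pearson(::Type{T}, R::CorMat{S, C}) where {T<:Real, S, C}
    out = Matrix{T}(undef, size(R)...)
    for j in axes(R, 2), i in axes(R, 1)
        # Pin the diagonal to exactly one to avoid floating-point drift.
        out[i, j] = i == j ? one(T) : T(pearson_from(C, R[i, j]))
    end
    return CorMat(out, Pearson)
end

R = CorMat(Float32[1 0.3; 0.3 1], Spearman)
P = to_pearson(Float64, R)
P isa CorMat{Float64, Pearson}   # true
```

An elementwise conversion like this does not guarantee that the result is still positive definite, so a real implementation would likely need a check or a nearest-correlation-matrix step afterwards.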

Then we can implement

struct MvDist{T<:Real, C<:AbstractCorrelation}
    margins::Vector{<:UnivariateDistribution}
    target_cor::CorMat{T, C}
    adjust_cor::CorMat{T, Pearson}
end

This functionality has not been requested and is not strictly necessary. Until there is more demand for it, I am closing this issue.