Use amap::Dist in groupPredict to speed it up.
Install the package by navigating to the parent folder of this one and running
R CMD INSTALL SNFtool
After the installation is complete you can use the functions. Here is an example session.
library(SNFtool)
K = 20; # number of neighbors, usually (10~30)
alpha = 0.5; # hyperparameter, usually (0.3~0.8)
T = 10; # number of iterations, usually (10~20)
data(Data1)
data(Data2)
Here, the simulated data (Data1, Data2) comprise two data types that are complementary to each other. Both data types have the same number of points: the first half of the points belongs to the first cluster, and the rest belong to the second cluster.
truelabel = c(matrix(1,100,1),matrix(2,100,1)); ##the ground truth of the simulated data;
Calculate the distance matrices (here we calculate the Euclidean distance; you can use other distances, e.g., correlation).
If the data are all continuous values, we recommend performing standard normalization before using SNF, though it is optional depending on the data you want to use.
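As a rough illustration of what standard normalization means here, the base-R sketch below centers each feature (column) to mean 0 and scales it to standard deviation 1 with scale(). The toy matrix X is hypothetical, and this sketch is not necessarily identical to any normalization helper shipped with the package.

```r
# Toy data: 5 samples (rows) x 2 features (columns); values are made up.
X <- matrix(c(1, 2, 3, 4, 5,
              10, 20, 30, 40, 50), ncol = 2)

# Center each column to mean 0 and scale it to unit standard deviation.
X_norm <- scale(X, center = TRUE, scale = TRUE)

# Check the result: column means are ~0 and column standard deviations are ~1.
col_means <- colMeans(X_norm)
col_sds <- apply(X_norm, 2, sd)
```

After this step, features measured on very different scales contribute comparably to a Euclidean distance.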
Calculate the pairwise distances. If the data are continuous, we recommend using the function "dist2" as follows; if the data are discrete, we recommend using ""
Dist1 = dist2(as.matrix(Data1),as.matrix(Data1));
Dist2 = dist2(as.matrix(Data2),as.matrix(Data2));
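To make the idea of a pairwise distance matrix concrete, here is a minimal base-R sketch that uses only stats::dist and stats::cor on a hypothetical toy matrix. It shows an unsquared Euclidean distance and a correlation-based alternative. Whether the package's dist2 matches this exactly (for example, squared versus unsquared distances) should be checked against its documentation.

```r
# Toy data: 3 samples (rows) x 3 features (columns); values are made up.
X <- matrix(c(1, 2, 3,
              2, 4, 6,
              3, 2, 1), ncol = 3, byrow = TRUE)

# Full symmetric matrix of pairwise Euclidean distances between rows.
D <- as.matrix(dist(X, method = "euclidean"))

# Correlation-based alternative: 1 - Pearson correlation between rows.
# Rows 1 and 2 are perfectly correlated, rows 1 and 3 perfectly anti-correlated.
D_cor <- 1 - cor(t(X))
```

Both results are n x n matrices with zeros on the diagonal, which is the shape expected by the affinity step that follows.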
W1 = affinityMatrix(Dist1, K, alpha)
W2 = affinityMatrix(Dist2, K, alpha)
displayClusters(W1,truelabel); displayClusters(W2,truelabel);
W = SNF(list(W1,W2), K, T)
With this unified graph W of size n x n, you can perform either spectral clustering or kernel NMF. If you need help with further clustering, please let us know.
C = 2 # number of clusters
group = spectralClustering(W, C); # the final subtype information
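For readers who want to see what spectral clustering on a similarity matrix involves, the following is a minimal base-R sketch of one common variant (symmetric normalized Laplacian, row-normalized eigenvector embedding, then k-means). The function name spectral_sketch and the toy matrix are hypothetical. The package's spectralClustering may use a different normalization or discretization step, so treat this as an illustration rather than a description of its internals.

```r
# Toy similarity matrix with two obvious blocks (values are made up).
W_toy <- matrix(0.01, 6, 6)
W_toy[1:3, 1:3] <- 1
W_toy[4:6, 4:6] <- 1

spectral_sketch <- function(W, C) {
  n <- nrow(W)
  d <- rowSums(W)                       # node degrees
  D_inv_sqrt <- diag(1 / sqrt(d))
  # Symmetric normalized Laplacian: I - D^(-1/2) W D^(-1/2)
  L <- diag(n) - D_inv_sqrt %*% W %*% D_inv_sqrt
  eig <- eigen(L, symmetric = TRUE)     # eigenvalues come in decreasing order
  # Keep the eigenvectors of the C smallest eigenvalues (the last C columns).
  U <- eig$vectors[, (n - C + 1):n, drop = FALSE]
  U <- U / sqrt(rowSums(U^2))           # normalize each row to unit length
  kmeans(U, centers = C, nstart = 10)$cluster
}

set.seed(1)  # k-means uses random starts
group_sketch <- spectral_sketch(W_toy, 2)
```

The cluster numbers themselves are arbitrary; only the grouping of samples is meaningful.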
You can evaluate the quality of the obtained clustering by calculating the normalized mutual information (NMI): an NMI close to 1 indicates that the obtained clustering is very close to the "true" cluster information; an NMI close to 0 indicates that the obtained clustering is not similar to the "true" cluster information.
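The behaviour at the two extremes can be checked with a small base-R sketch. It assumes one common definition, NMI = I(X;Y) / sqrt(H(X) H(Y)). The function name nmi_sketch and the label vectors are hypothetical, and the package's calNMI may differ in detail.

```r
nmi_sketch <- function(x, y) {
  joint <- unclass(table(x, y)) / length(x)   # joint distribution p(x, y)
  px <- rowSums(joint)                        # marginal p(x)
  py <- colSums(joint)                        # marginal p(y)
  entropy <- function(p) {
    p <- p[p > 0]
    -sum(p * log(p))
  }
  nz <- joint > 0                             # skip empty cells (0 * log 0 = 0)
  mi <- sum(joint[nz] * log(joint[nz] / outer(px, py)[nz]))
  hx <- entropy(px)
  hy <- entropy(py)
  if (hx == 0 || hy == 0) return(0)           # a single-cluster labelling
  mi / sqrt(hx * hy)
}

truth   <- c(1, 1, 1, 1, 2, 2, 2, 2)
swapped <- c(2, 2, 2, 2, 1, 1, 1, 1)   # same partition, label names swapped
indep   <- c(1, 1, 2, 2, 1, 1, 2, 2)   # split that ignores the true clusters

nmi_swapped <- nmi_sketch(truth, swapped)  # identical partitions give NMI of 1
nmi_indep   <- nmi_sketch(truth, indep)    # independent partitions give NMI of 0
```

Note that NMI compares partitions, not label names, so a clustering that swaps "1" and "2" still scores 1.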
displayClusters(W, group); SNFNMI = calNMI(group, truelabel)
ConcordanceMatrix = concordanceNetworkNMI(list(W, W1,W2));
################################################################################
data(Digits)
K = 20 # number of neighbours
alpha = 0.5 # hyperparameter in affinityMatrix
T = 20 # number of iterations of SNF
distL = lapply(dataL, function(x) dist2(x, x))
affinityL = lapply(distL, function(x) affinityMatrix(x, K, alpha))
################################################################################
Concordance_matrix = concordanceNetworkNMI(affinityL, 3);
The output, Concordance_matrix, shows the concordance between the fused network and each individual network.
################################################################################
W = SNF(affinityL, K, T)
clustering = spectralClustering(W,3);
NMI = calNMI(clustering, label);
################################################################################
data(Digits)
n = floor(0.8*length(label)) # number of training cases
trainSample = sample.int(length(label), n)
train = lapply(dataL, function(x) x[trainSample, ]) # Use a random 80% of the samples for training
test = lapply(dataL, function(x) x[-trainSample, ]) # Test on the rest of the data set
groups = label[trainSample]
K = 20
alpha = 0.5
t = 20
method = TRUE
newLabel = groupPredict(train,test,groups,K,alpha,t,method)
accuracy = sum(label[-trainSample] == newLabel[-c(1:n)])/(length(label) - n)
################################################################################