Parallel Funcalc Benchmarks

Intro

We have benchmarked Funcalc on our benchmarking server, running the file benchmark.bat on spreadsheets converted from the EUSES corpus.

The date on which the benchmarks were run can be found in the file build.log in each folder.
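To collect these dates in one place, a small helper along the following lines could work (a sketch; getBuildDates is not part of the repository and it assumes the date appears on the first line of each build.log):

getBuildDates <- function (root) {
    # Find every build.log below root and report its first line.
    logs <- list.files(path=root, pattern="build.log", recursive=TRUE, full.names=TRUE)
    sapply(logs, function (log) { return(readLines(log, n=1)) })
}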

Meta-Statistics

How many arrays did the algorithm convert per sheet?

# For each *.err file in lpath, find the line containing "Lifted" and return
# the file name together with the second and fourth whitespace-separated
# fields of that line, parsed as numbers.
getLifts <- function (lpath) {
    errs <- list.files(path=lpath, pattern="err", full.names=TRUE)
    t(sapply(errs, function (err) {
        for (line in scan(err, what = character(), sep = "\n")) {
            if (grepl("Lifted", line)) {
                elems <- as.numeric(unlist(strsplit(line, " ")))
                return(c(sub(".err", "", basename(err)), elems[2], elems[4]))
            }
        }
    }))
}
lifts <- getLifts("euses/arrays")
02rise.xml  108  0
2000_places_School.xml  0  0
2002Qvols.xml  1280  0
2004_PUBLIC_BUGS_INVENTORY.xml  4493  0
Aggregate20Governanc#A8A51.xml  2528  0
EducAge25.xml  0  0
financial-model-spreadsheet.xml  0  0
Financial-Projections.xml  1836  0
funding.xml  471  0
high_2003_belg.xml  2829  99
iste-cs-2003-modeling-sim.xml  400  0
modeling-3.xml  0  0
MRP_Excel.xml  524  527
notes5CMISB200SP04H2KEY.xml  1500  0
ny_emit99.xml  0  0
Test20Station20Powe#A90F3.xml  1229  0
Time.xml  0  0
v1tmp.xml  0  0
WasteCalendarCalculat#A843B.xml  603  0

It seems that

  1. cell arrays are in general smaller than we expected (< 64 cells); and
  2. there are many cell arrays that would introduce cyclic dependencies when lifted.
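For a quick aggregate over all sheets, we can total the two counts reported on the "Lifted" lines (a sketch using the lifts matrix from above):

colSums(apply(lifts[, 2:3], 2, as.numeric))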

Initial Data Probing

Let’s see how well we’re doing.

# file <- "MRP_Excel.xml"
file <- "Aggregate20Governanc#A8A51.xml"
# Read the benchmark log <prefix>/<file>.out, skipping the two header lines.
readLog <- function (prefix, file) {
    return (read.table(paste(prefix, "/", file, ".out", sep=""),
                       dec=".",
                       row.names=3,
                       col.names=c("iteration", "mode", "elapsed"),
                       skip=2,
                       stringsAsFactors=TRUE))
}

# Turns elapsed milliseconds into doubles.
getElapsed <- function (vals) {
    as.double(sapply(vals[2], function (x) {
        return(sub(",", ".", sub("ms", "", x)))
    }))
}
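# For illustration only (made-up value, not taken from the logs): getElapsed
# turns an entry such as "12,34ms" into the double 12.34.
as.double(sub(",", ".", sub("ms", "", "12,34ms")))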

# Per-iteration speedup of an experiment over the mean elapsed time of its
# baseline; returns the mean and standard deviation of the speedups.
getSpeedup <- function (experiments, baseline, file) {
  perf      <- readLog(experiments, file)
  base      <- readLog(baseline, file)
  base_mean <- mean(getElapsed(base))
  speedups  <- sapply(getElapsed(perf), function (x) { return (base_mean / x)})
  c(mean(speedups), sd(speedups))
}
cbind(c("Mean", "Stddev"), getSpeedup("euses/arrays", "euses/seq", file))
Mean    0.678750559823602
Stddev  0.0369747742062879
array <- getElapsed(readLog("euses/arrays", file))
plot(array)

plots/MRP_Excel_array_plot.png

hist(array, freq=0.1)

plots/MRP_Excel_array_hist.png

mean(getElapsed(readLog("euses/seq", "Financial-Projections.xml")))
4.684867

Overall Analysis

Let’s focus only on those sheets that actually have lifted cell arrays:

# Keep only the sheets whose two counts from getLifts do not sum to zero,
# sorted alphabetically by file name.
filterSuccessfulLifts <- function (lifts) {
  successful <- lifts[as.numeric(lifts[,2]) + as.numeric(lifts[,3]) > 0, 1:3]
  return(successful[sort.list(successful[,1]),])
}
successful <- filterSuccessfulLifts(lifts)
02rise.xml  108  0
2002Qvols.xml  1280  0
2004_PUBLIC_BUGS_INVENTORY.xml  4493  0
Aggregate20Governanc#A8A51.xml  2528  0
Financial-Projections.xml  1836  0
funding.xml  471  0
high_2003_belg.xml  2829  99
iste-cs-2003-modeling-sim.xml  400  0
MRP_Excel.xml  524  527
notes5CMISB200SP04H2KEY.xml  1500  0
Test20Station20Powe#A90F3.xml  1229  0
WasteCalendarCalculat#A843B.xml  603  0
# Compute (file, mean speedup, standard deviation) for every *.out log in the
# benchmark folder, relative to the corresponding baseline log.
computeSpeedups <- function (benchmark, baseline) {
  files <- list.files(benchmark, pattern="out")
  speedups <- t(sapply(files,
                       function (file) {
                           f <- gsub(".out", "", file)
                           s <- getSpeedup(benchmark, baseline, f)
                           return(rbind(f, s[1], s[2]))
                       }))
  rownames(speedups) <- files
  return(speedups)
}

speedups <- computeSpeedups("euses/arrays", "euses/seq")
speedupsF <- subset(speedups, speedups[,1] %in% successful)
02rise.xml  1.29736282169393  0.0111646533392518
2002Qvols.xml  1.00485802863102  0.0551914400974847
2004_PUBLIC_BUGS_INVENTORY.xml  2.26636076468086  0.038273942036477
Aggregate20Governanc#A8A51.xml  0.678750559823602  0.0369747742062879
Financial-Projections.xml  0.665963892870879  0.170498553241524
funding.xml  0.927404742297863  0.0126249406063556
high_2003_belg.xml  0.993168877525272  0.00702673279166305
iste-cs-2003-modeling-sim.xml  1.07109807058989  0.0222093270795209
MRP_Excel.xml  1.04868287181988  0.00789956041152768
notes5CMISB200SP04H2KEY.xml  0.908493534641156  0.0303644427416775
Test20Station20Powe#A90F3.xml  1.13931893816619  0.0372818552054307
WasteCalendarCalculat#A843B.xml  0.958700792238062  0.111673990606713
# Bar plot of the given column of a speedup matrix, one bar per file.
plot.bar <- function (cols, col) {
    ts <- as.numeric(cols[,col])
    names(ts) <- cols[,1]
    return(barplot(ts))
}
plot.bar(speedupsF, 2)

plots/errorbars.png

Synthetic Benchmarks

computeSpeedups("examples/arrays", "examples/seq")
finance2.xml  1.7455890057058  0.0843146578163405
finance.xml  2.29626631288287  0.134346665415993
testsdf.xml  2.29954388133998  0.0665480544807438
plot.bar(computeSpeedups("examples/arrays", "examples/seq"), 2)

plots/barplot_examples.png

I changed the number of benchmark runs for testsdf.xml to 100. Clearly, the large or computationally heavy synthetic sheets gain much more from cell array lifting than the real-life sheets do.

Now the same for Filby’s sheets:

computeSpeedups("filby/arrays", "filby/seq")
DNA.xml  0.807176156370794  0.0710962266408015
EUSE.xml  1.33580283542263  0.092334646020717
PLANCK.xml  2.33002200568256  0.362830128081373

Let’s compare them with the speedup achieved via per-cell parallelism:

rbind(computeSpeedups("examples/cells", "examples/seq"),
      computeSpeedups("filby/cells", "filby/seq"))
finance2.xml  0.349169601377365  0.033142860048034
finance.xml  0.143679949828006  0.0221177314455975
testsdf.xml  0.170397986407867  0.0255833552621607
DNA.xml  0.446559153011431  0.0134697219740229
EUSE.xml  0.170836984769614  0.0194000274627591
PLANCK.xml  0.586276036553474  0.046148612962701

This clearly shows that our approach is useful: per-cell parallelism slows all of these sheets down, while cell array lifting speeds most of them up.
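To make the comparison easier to read, the two sets of results can be put side by side (a sketch reusing computeSpeedups; it assumes the cells and arrays folders contain logs for the same files in the same order; columns are file, array-lifting mean speedup, per-cell mean speedup):

arrayS <- rbind(computeSpeedups("examples/arrays", "examples/seq"),
                computeSpeedups("filby/arrays", "filby/seq"))
cellS  <- rbind(computeSpeedups("examples/cells", "examples/seq"),
                computeSpeedups("filby/cells", "filby/seq"))
cbind(arrayS[,1], arrayS[,2], cellS[,2])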

plot.bar(computeSpeedups("filby/arrays", "filby/seq"), 2)

plots/barplot_filby.png

How many formula cells per sheet?

# Count the occurrences of the string "Formula" in a sheet's XML file as a
# proxy for the number of formula cells.
countFormulas <- function (file) {
    formulas <- sum(sapply(scan(file, what=character()),
                           function (token) { return(grepl("Formula", token)) }))
    return(c(basename(file), as.numeric(formulas)))
}
formulas <- t(sapply(list.files("~/Documents/funcalc-euses/",
                                recursive=TRUE, pattern="xml$",
                                full.names=TRUE),
                     countFormulas))
2004_PUBLIC_BUGS_INVENTORY.xml  4495
Aggregate20Governanc#A8A51.xml  3546
high_2003_belg.xml  12861
02rise.xml  10316
financial-model-spreadsheet.xml  3115
Financial-Projections.xml  3649
2000_places_School.xml  1375
2002Qvols.xml  2184
EducAge25.xml  1470
notes5CMISB200SP04H2KEY.xml  1557
Test20Station20Powe#A90F3.xml  2164
v1tmp.xml  1129
MRP_Excel.xml  4809
ny_emit99.xml  4353
Time.xml  4198
WasteCalendarCalculat#A843B.xml  844
funding.xml  1636
iste-cs-2003-modeling-sim.xml  1991
modeling-3.xml  213

We compute the theoretical maximum speedup by using Amdahl’s law:

# Amdahl's law: maximum speedup when a fraction pWork of the work can be
# parallelized across nThreads threads.
amdahl <- function (pWork, nThreads) {
    return(1 / (1 - pWork + pWork / nThreads))
}
# Theoretical maximum speedup of a sheet on 32 threads, assuming that all
# cells in lifted arrays are parallelizable work.
max.speedup <- function (formulas, arrayCells) {
    return(amdahl(arrayCells / formulas, 32))
}
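# Sanity check: fully parallelizable work gives the full thread count as the
# bound, purely sequential work gives 1.
amdahl(1, 32)   # 32
amdahl(0, 32)   # 1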

Let’s assume a sheet of 3000 formulas of which 400 are in parallelizable cell arrays:

max.speedup(3000, 400)
1.14832535885167
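(Here the parallelizable fraction is p = 400/3000 ≈ 0.133, so Amdahl's law gives 1 / (1 - 0.133 + 0.133/32) ≈ 1.148.)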

This is actually not too far from what we achieve on average, even when counting the sheets that were not converted:

speedups <- computeSpeedups("euses/arrays", "euses/seq")
mean(as.numeric(speedups[,2]))
1.06837660717346

Keep in mind that the estimate is overly optimistic! There are potential sequential dependencies between the cell arrays, which our theoretical bound does not take into account.

How well are we doing?

At first there seemed to be something wrong with the formula count: how could the number of lifted cell-array cells ever be larger than the overall number of formulas? It turned out to be an R mistake on my part: both data sets must be sorted alphabetically by file name before they are combined.

fc0 <- formulas[sort.list(formulas[,1]),]
fc <- subset(fc0, fc0[,1] %in% successful)
ratios <- cbind(fc, as.numeric(successful[,2]) + as.numeric(successful[,3]))
02rise.xml  10316  108
2002Qvols.xml  2184  1280
2004_PUBLIC_BUGS_INVENTORY.xml  4495  4493
Aggregate20Governanc#A8A51.xml  3546  2528
Financial-Projections.xml  3649  1836
funding.xml  1636  471
high_2003_belg.xml  12861  2928
iste-cs-2003-modeling-sim.xml  1991  400
MRP_Excel.xml  4809  1051
notes5CMISB200SP04H2KEY.xml  1557  1500
Test20Station20Powe#A90F3.xml  2164  1229
WasteCalendarCalculat#A843B.xml  844  603

Now, we can compute the hypothetical bound.

bounds <- cbind(ratios[,1], max.speedup(as.numeric(ratios[,2]), as.numeric(ratios[,3])))
02rise.xml  1.01024592672387
2002Qvols.xml  2.3135593220339
2004_PUBLIC_BUGS_INVENTORY.xml  31.5646258503401
Aggregate20Governanc#A8A51.xml  3.23245214220602
Financial-Projections.xml  1.95094566597607
funding.xml  1.38677121135864
high_2003_belg.xml  1.28295675594793
iste-cs-2003-modeling-sim.xml  1.24165887121921
MRP_Excel.xml  1.26858301664372
notes5CMISB200SP04H2KEY.xml  14.9891696750903
Test20Station20Powe#A90F3.xml  2.22312112748403
WasteCalendarCalculat#A843B.xml  3.24810583283223

How far are we from reaching the overly optimistic, hypothetical bound? We compute the difference between hypothetical bound and actual speedup, divided by the bound:

δ_speedup = (bound - speedup) / bound

cbind(bounds[,1], (as.numeric(bounds[,2]) - as.numeric(speedupsF[,2])) / as.numeric(bounds[,2]))
02rise.xml  -0.284204951858754
2002Qvols.xml  0.565665760518461
2004_PUBLIC_BUGS_INVENTORY.xml  0.928199346463774
Aggregate20Governanc#A8A51.xml  0.790019919874086
Financial-Projections.xml  0.658645597114724
funding.xml  0.331248922171328
high_2003_belg.xml  0.225875016503221
iste-cs-2003-modeling-sim.xml  0.137365265599756
MRP_Excel.xml  0.173343125312861
notes5CMISB200SP04H2KEY.xml  0.939390002626301
Test20Station20Powe#A90F3.xml  0.487513782276187
WasteCalendarCalculat#A843B.xml  0.704843117318591

A negative value means that the measured speedup exceeds the hypothetical bound, which is good but suspicious.
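To list the sheets in question, we can recompute the deltas and select the negative ones (a sketch):

deltas <- (as.numeric(bounds[,2]) - as.numeric(speedupsF[,2])) / as.numeric(bounds[,2])
bounds[deltas < 0, 1]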

For synthetic sheets:

synthS <- computeSpeedups("examples/arrays", "examples/seq")
synthF <- cbind(countFormulas("~/src/funcalc-examples/applied/finance2.xml"),
                countFormulas("~/src/funcalc-examples/applied/finance.xml"),
                countFormulas("~/src/funcalc-examples/tests/testsdf.xml"))
finance2.xmlfinance.xmltestsdf.xml
106987159433774
synthL <- getLifts("examples/arrays")
synthH <- max.speedup(as.numeric(synthF[2,]), as.numeric(synthL[,2]) + as.numeric(synthL[,3]))
cbind(synthS[,1], (as.numeric(synthH) - as.numeric(synthS[,2])) / as.numeric(synthH))
finance2.xml  0.924886703190993
finance.xml  0.849128836307796
testsdf.xml  -1.01857483727986

Again, negative results. I think this approach is flawed, because our measure of the possible parallel work is very inaccurate. The idea would only be useful if we could approximate the parallel work more precisely; until we can, we cannot use it.

Analysis for synthetic sheets:

computeSpeedups("synth/arrays", "synth/seq")
synth-map.xml  3.13738659618067  0.0596575681287037
synth-prefix.xml  10.3001083761698  0.655205245937034
