Memory usage for 100K Rows is >1.5GB
JuniorRaja opened this issue · comments
Hi,
I have been trying to write an .xlsx file from a really large dataset of 350K rows and 150 columns. How would you approach writing such a large file? Please find the dry-run results and the code below.
For 10K records:
elapsed time: 10.374 sec
memory peak usage: 76 MB

For 50K records:
elapsed time: 56.563 sec
memory peak usage: 364 MB

For 75K records:
elapsed time: 97.748 sec
memory peak usage: 546 MB
<?php

require_once __DIR__ . '/vendor/autoload.php';
require_once __DIR__ . '/src/autoload.php';

use \avadim\FastExcelWriter\Excel;

function generateXlsx($savePath, $filename, $jsonFilePath)
{
    if ($jsonFilePath === null) {
        die("Error: File path is null.");
    }

    // Load the whole JSON document into memory, then decode it into an array
    $jsonData = file_get_contents($jsonFilePath);
    $data = json_decode($jsonData, true);
    if ($data === null) {
        die("Error: Failed to decode JSON from text file.");
    }

    $header = [
        'c1-text' => '@', // text
        'c2-text' => '@text', // text
        'c3-integer' => '@integer',
        'c4-integer' => ['text-color' => '#f0f'], // default format
        'c5-money' => '@money',
        'c6-price' => '#\'##0.000', // custom numeric format
        'c7-date' => ['format' => '@date', 'width' => 'auto'],
        'c8-date' => ['format' => 'YYYY-MM-DD', 'width' => 11],
        'c9-time' => 'H:MM',
    ];

    $outFileName = $savePath . '/' . $filename;

    $timer = microtime(true);
    $excel = Excel::create();
    $sheet = $excel->getSheet();

    $rowOptions = ['font-style' => 'bold'];
    $sheet->writeHeader($header, $rowOptions);

    foreach ($data as $row) {
        $sheet->writeRow($row);
    }
    $excel->save($outFileName);

    echo '<b>', basename(__FILE__, '.php'), "</b><br>\n<br>\n";
    echo 'out filename: ', $outFileName, "<br>\n";
    echo 'elapsed time: ', round(microtime(true) - $timer, 3), ' sec', "<br>\n";
    echo 'memory peak usage: ', memory_get_peak_usage(true) / (1024 * 1024), ' MB<br>';
}

// Check if the script is invoked with the required parameters
if (isset($argv) && count($argv) >= 4) {
    $savePath = $argv[1];
    $filename = $argv[2];
    $jsonFilePath = $argv[3];
    // Generate the XLSX file
    generateXlsx($savePath, $filename, $jsonFilePath);
} else {
    echo "Usage: php script.php <savePath> <filename> <jsonFilePath>\n";
}
?>
In this code, memory_get_peak_usage() reports the memory usage of more than just the library. You load a large JSON file into a string and then call json_decode() on it; those two steps alone consume far more memory than FastExcelWriter itself.
In the code below you can change the $maxRow variable and measure memory consumption. You will see that it stays at roughly 4 MB even for large values of $maxRow.
<?php

require_once __DIR__ . '/vendor/autoload.php';
require_once __DIR__ . '/src/autoload.php';

$maxCol = 150;
$maxRow = 100000;
$rowData = array_fill(0, $maxCol, 'text');
$outFileName = __DIR__ . '/' . $maxCol . '-' . (int)($maxRow / 1000) . 'k.xlsx';

$timer = microtime(true);
$excel = \avadim\FastExcelWriter\Excel::create(['Sheet1']);
$sheet = $excel->sheet();

// Write the same row $maxRow times; no large dataset is held in memory
for ($rowNum = 1; $rowNum <= $maxRow; $rowNum++) {
    $sheet->writeRow($rowData);
}
$excel->save($outFileName);

$time = round(microtime(true) - $timer, 3);
$memory = round(memory_get_peak_usage(true) / (1024 * 1024), 2);

echo '<b>', basename(__FILE__, '.php'), "</b><br>\n<br>\n";
echo 'out filename: ', $outFileName, "<br>\n";
echo 'elapsed time: ', $time, ' sec', "<br>\n";
echo 'memory peak usage: ', $memory, " MB<br>\n";
echo 'total rows: ', $maxRow, "<br>\n";
echo 'speed: ', round($maxRow / $time, 3), " rows/sec<br>\n";
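
If the real dataset is too large to json_decode() as one array, the same flat memory profile can be kept by decoding one row at a time. Below is a minimal sketch, not part of the library: it assumes the rows can be exported as newline-delimited JSON (one JSON array per line) instead of a single JSON document, and the input file name rows.ndjson is hypothetical.

<?php

require_once __DIR__ . '/vendor/autoload.php';

use \avadim\FastExcelWriter\Excel;

// Assumption: rows.ndjson holds one JSON-encoded row per line, e.g.
// ["text","text",1,2,3.5,1234.567,"2024-01-15","2024-01-15","12:30"]
$inFileName  = __DIR__ . '/rows.ndjson'; // hypothetical input file
$outFileName = __DIR__ . '/streamed.xlsx';

$fh = fopen($inFileName, 'rb');
if ($fh === false) {
    die("Error: cannot open $inFileName\n");
}

$excel = Excel::create(['Sheet1']);
$sheet = $excel->sheet();

// Decode and write one row at a time, so peak memory stays close to
// the size of a single row instead of the whole dataset
while (($line = fgets($fh)) !== false) {
    $row = json_decode($line, true);
    if ($row !== null) {
        $sheet->writeRow($row);
    }
}
fclose($fh);

$excel->save($outFileName);
echo 'memory peak usage: ', round(memory_get_peak_usage(true) / (1024 * 1024), 2), " MB\n";

With this approach only the fgets() buffer, one decoded row, and the library's internal write buffer are in memory at any moment, so peak usage should stay in the same few-megabyte range as the synthetic benchmark above rather than growing with the row count.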