Size of 1.9.34 is 1.7MB. Size of Chrome store extension is 64MB. Why?
marksolaris opened this issue
Whilst appraising whether this is safe to run, I unpacked the Chrome extension CRX and saw:
drwx------ 2 me me 4096 Oct 14 14:31 _metadata/
-rw-r--r-- 1 me me 17948354 Oct 14 14:15 alltests.bundle.js
-rw-r--r-- 1 me me 14084024 Oct 14 14:15 background.bundle.js
-rw-r--r-- 1 me me 14742964 Oct 14 14:15 control.bundle.js
-rw-r--r-- 1 me me 527 Oct 14 14:15 datatables_override.css
-rw-r--r-- 1 me me 3581 Oct 14 14:15 icon128.png
-rw-r--r-- 1 me me 1558 Oct 14 14:15 icon48.png
-rw-r--r-- 1 me me 18424289 Oct 14 14:15 inject.bundle.js
-rw-r--r-- 1 me me 820 Oct 14 14:15 inject.css
-rw-r--r-- 1 me me 13900 Oct 14 14:15 jquery.dataTables.min.css
-rw-r--r-- 1 me me 1368 Oct 14 14:31 manifest.json
-rw-r--r-- 1 me me 2084 Oct 14 14:15 popup.css
-rw-r--r-- 1 me me 8782 Oct 14 14:15 popup.html
-rw-r--r-- 1 me me 160 Oct 14 14:15 sort_asc.png
-rw-r--r-- 1 me me 201 Oct 14 14:15 sort_both.png
-rw-r--r-- 1 me me 158 Oct 14 14:15 sort_desc.png
Why are those bundles so huge?
I haven't been exposed to Node.js much, hence the query.
Did the extension get packaged with the 'development' env var still turned on? I can see all the files in tests/ were included.
I can replicate the chonk with a default build
% du -hs build
63M build
Setting utils/env.js to 'normal' helps a bit.
% du -hs build
28M build
-rwxrwxrwx 1 root root 7822147 Oct 22 10:37 alltests.bundle.js*
-rwxrwxrwx 1 root root 6212804 Oct 22 10:37 background.bundle.js*
-rwxrwxrwx 1 root root 6483854 Oct 22 10:37 control.bundle.js*
-rwxrwxrwx 1 root root 527 Oct 22 10:37 datatables_override.css*
-rwxrwxrwx 1 root root 3581 Oct 22 10:37 icon128.png*
-rwxrwxrwx 1 root root 1558 Oct 22 10:37 icon48.png*
-rwxrwxrwx 1 root root 7942361 Oct 22 10:37 inject.bundle.js*
-rwxrwxrwx 1 root root 820 Oct 22 10:37 inject.css*
-rwxrwxrwx 1 root root 13900 Oct 22 10:37 jquery.dataTables.min.css*
-rwxrwxrwx 1 root root 1302 Oct 22 10:37 manifest.json*
-rwxrwxrwx 1 root root 2084 Oct 22 10:37 popup.css*
-rwxrwxrwx 1 root root 8782 Oct 22 10:37 popup.html*
-rwxrwxrwx 1 root root 160 Oct 22 10:37 sort_asc.png*
-rwxrwxrwx 1 root root 201 Oct 22 10:37 sort_both.png*
-rwxrwxrwx 1 root root 158 Oct 22 10:37 sort_desc.png*
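I don't know this project's actual build setup beyond utils/env.js, but in a generic webpack project the mode and devtool settings are what usually separate a bloated development build from a lean release build: development mode skips minification and can embed inline source maps in every bundle. A hypothetical config sketch (entry names and paths are illustrative, not this extension's real config):

```javascript
// webpack.config.js — hypothetical sketch, NOT this extension's real config.
// 'production' mode enables Terser minification and tree shaking;
// 'development' mode skips both, and inline source maps can multiply
// bundle size several times over.
module.exports = (env) => ({
  mode: env.production ? 'production' : 'development',
  // Inline source maps balloon dev bundles; omit them for release builds.
  devtool: env.production ? false : 'inline-source-map',
  entry: {
    background: './src/background.js', // entry names are illustrative
    inject: './src/inject.js',
    control: './src/control.js',
  },
  output: { filename: '[name].bundle.js' },
});
```

Building with `webpack --env production` versus a bare `webpack` run is typically where a 2-3x (or larger) size difference comes from.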
I got scared off by the chonk. After messing around with curl, extensions and PhantomJS (Amazon's Siege encryption sucks), I settled on coding up a Tampermonkey userscript that tears out the current Orders page HTML and saves it to disk. It's suitable for my low-volume purchasing. I like it because both files add up to 5KB of JavaScript and Perl, with none of the recursive dependencies that Node has.
I like your extension; it's awesome work, and the funds raised for charity are epic. Alas, I'm a "less is more" advocate, so I'll bow out from using it.
My userscript sticks a Download Orders link at the top of each Orders page and saves that HTML scrape to a filename containing the year and pagination number.
// ==UserScript==
// @name Amazon Order HTML Grab
// @version 1.1
// @include https://www.amazon.com/your-orders/order*
// @require https://ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.min.js
// @run-at document-idle
// @grant unsafeWindow
// @description Grab Amazon orders raw HTML
// ==/UserScript==
(function () {
    // order_div arrives as a jQuery object from waitForKeyElements,
    // so prepend() parses the '<BR>' strings as HTML.
    function insert_download(order_div) {
        var new_a = document.createElement('A');
        new_a.setAttribute('href', '#');
        new_a.textContent = 'Download Orders';
        new_a.addEventListener("click", download_orders_div);
        order_div.prepend('<BR>');
        order_div.prepend('<BR>');
        order_div.prepend(new_a);
        order_div.prepend('<BR>');
        // Add a hidden anchor which will carry the order_div outerHTML
        // as a data: URL when Download Orders is clicked
        var filesave_a = document.createElement("A");
        filesave_a.classList.add("filesave");
        filesave_a.style.display = 'none';
        document.body.appendChild(filesave_a);
    }

    function download_orders_div() {
        var target_div = document.getElementsByClassName("your-orders-content-container__content")[0];
        var filesave_a = document.getElementsByClassName("filesave")[0];
        var chosen_year_span = document.getElementsByClassName("a-dropdown-prompt")[0]; // always at the top
        var chosen_year = chosen_year_span.textContent.replace(/[\n\r]+|[\s]{2,}/g, ' ').trim();
        var page = '1';
        var ul_pagination = document.getElementsByClassName("a-pagination")[0];
        if (ul_pagination != null) {
            var selected_li = ul_pagination.getElementsByClassName("a-selected")[0];
            if (selected_li != null) {
                page = selected_li.textContent.trim();
            }
        }
        var filename = window.location.hostname.replace(/\./g, '_') + '_order_scrape_' + chosen_year + '_' + page + '.div';
        if (target_div != null) {
            console.log('download_orders_div: saving HTML to ' + filename);
            filesave_a.setAttribute('href', 'data:text/plain;charset=utf-8,' + encodeURIComponent(target_div.outerHTML));
            filesave_a.setAttribute('download', filename);
            filesave_a.click();
        } else {
            console.log('DIV .your-orders-content-container__content not found');
        }
    }

    // Standard waitForKeyElements utility: polls every 500ms for nodes
    // matching selectorTxt and runs actionFunction once per node found.
    function waitForKeyElements(selectorTxt, actionFunction, bWaitOnce, iframeSelector) {
        var targetNodes, btargetsFound;
        if (typeof iframeSelector == "undefined") targetNodes = $(selectorTxt);
        else targetNodes = $(iframeSelector).contents().find(selectorTxt);
        if (targetNodes && targetNodes.length > 0) {
            targetNodes.each(function () {
                var jThis = $(this);
                var alreadyFound = jThis.data('alreadyFound') || false;
                if (!alreadyFound) {
                    actionFunction(jThis);
                    jThis.data('alreadyFound', true);
                }
            });
            btargetsFound = true;
        } else {
            btargetsFound = false;
        }
        var controlObj = waitForKeyElements.controlObj || {};
        var controlKey = selectorTxt.replace(/[^\w]/g, "_");
        var timeControl = controlObj[controlKey];
        if (btargetsFound && bWaitOnce && timeControl) {
            clearInterval(timeControl);
            delete controlObj[controlKey];
        } else {
            if (!timeControl) {
                timeControl = setInterval(function () {
                    waitForKeyElements(selectorTxt, actionFunction, bWaitOnce, iframeSelector);
                }, 500);
                controlObj[controlKey] = timeControl;
            }
        }
        waitForKeyElements.controlObj = controlObj;
    }

    waitForKeyElements('DIV[class="your-orders-content-container__content js-yo-main-content"]', insert_download);
})();
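The save step above hinges on a data: URL assigned to a hidden anchor with a download attribute; clicking the anchor makes the browser write the string to disk without any server round trip. Stripped of the page specifics, the encoding half can be sketched as (hypothetical helper name, not from the userscript):

```javascript
// Sketch of the data: URL scheme used by the hidden "filesave" anchor.
// encodeURIComponent escapes characters that aren't legal in a URL,
// so arbitrary HTML can be carried in the link target.
function toDataUrl(text) {
  return 'data:text/plain;charset=utf-8,' + encodeURIComponent(text);
}

const html = '<div class="a-box">Order placed</div>';
const url = toDataUrl(html);
// The payload round-trips: decoding the part after the comma restores it.
console.log(decodeURIComponent(url.split(',')[1]) === html); // true
```

In the browser, assigning this URL to an `<a download="...">` and calling `click()` triggers the save, which is exactly what the userscript does with `target_div.outerHTML`.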
I end up with:
www_amazon_com_order_scrape_2017_1.div
www_amazon_com_order_scrape_2018_1.div
www_amazon_com_order_scrape_2019_1.div
www_amazon_com_order_scrape_2019_2.div
www_amazon_com_order_scrape_2020_1.div
www_amazon_com_order_scrape_2020_2.div
www_amazon_com_order_scrape_2021_1.div
www_amazon_com_order_scrape_2022_1.div
www_amazon_com_order_scrape_2023_1.div
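The filenames above come straight from the hostname/year/page concatenation in the userscript; factored out as a standalone helper (hypothetical function name), the logic is:

```javascript
// Hypothetical standalone version of the filename logic in the userscript:
// dots in the hostname become underscores, then the chosen year and the
// pagination number are appended.
function scrapeFilename(hostname, year, page) {
  return hostname.replace(/\./g, '_') + '_order_scrape_' + year + '_' + page + '.div';
}

console.log(scrapeFilename('www.amazon.com', '2020', '2'));
// → www_amazon_com_order_scrape_2020_2.div
```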
And with this quick Mojo::DOM extractor I can generate a CSV and/or push the values elsewhere as needed.
#!/usr/bin/perl
use strict;
use warnings;
use Mojo::DOM;

# Read in Amazon DIV contents
my $div_file = $ARGV[0] // '';
if (not -f $div_file) {
    print STDERR "Usage: $0 amazon_orders_yyyy_x.div\n";
    exit(1);
}

# Wrap the scraped DIV in a minimal DOM string
my $html = "<HTML><BODY>";
open(my $fh, '<', $div_file) or die "Cannot open $div_file: $!";
$html .= join("", <$fh>);
close($fh);
$html .= "</BODY></HTML>";
$html =~ s/\s+/ /g;    # collapse all whitespace, including newlines

# Find printables and output
my $dom = Mojo::DOM->new;
$dom->parse($html);
for my $element ($dom->find('*')->each) {
    my $class = $element->attr('class') // '';
    if ($class =~ /a-color-secondary/) {
        my $text = $element->text;
        $text =~ s/^\s+//;
        print STDOUT " $text" if length($text);
    }
    if ($element->tag eq 'bdi' && ($element->attr('dir') // '') eq "ltr") {
        printf STDOUT " %s\n", $element->text if length($element->text);
    }
}
Initially this gives the output below; more work to do to extract the item titles.
./dump_orders www.amazon.com_order_scrape_2020_2.div
Order placed March 21, 2020 Total $352.32 Ship to Order # 113-7085813-00000000
Order placed February 21, 2020 Total $13402.28 Ship to Order # 114-9894300-00000000
Order placed February 9, 2020 Total $9926.87 Ship to Order # 113-4279419-00000000
Order placed January 27, 2020 Total $6920.52 Ship to Order # 113-9354114-00000000
Order placed January 23, 2020 Total $17897818.23 Ship to Order # 114-5737246-00000000
Order placed January 13, 2020 Total $415635.32 Ship to Order # 113-5279924-00000000
This can be closed as answered by the reality of the Node.js usage. The ticket was originally about the extension being 64MB, since a bundle that size has security implications: it exposes users to so much of other people's Node.js code.