Potential memory bloat
ashmaroli opened this issue · comments
Tl;dr
This ticket is to pitch the possibility of the following:
- Bail out earlier if the rendering won't be cached eventually.
- Consider hashing the params (`params.hash`) before they are passed to `Digest`.
- Cache the generated Digest key itself.
Summary
Using `{% include ... %}` within a Liquid loop is known to increase build times. But in such cases, this plugin should consider memory usage (within practical limits) as well.
While I agree that the Cache itself would grow over time, the intermediate usage can be reduced.
Details
This ticket is regarding the following lines:
jekyll-include-cache/lib/jekyll-include-cache/tag.rb
Lines 8 to 11 in f976a1e
jekyll-include-cache/lib/jekyll-include-cache/tag.rb
Lines 30 to 32 in f976a1e
👉 L#10 is executed irrespective of whether the tag will be rendered or not (and therefore allocates memory even if the program returns in L#11).
The allocation due to L#10 can be huge in certain situations. For example (sourced from an actual repo):
<ul class="list">
{% for post in page.posts %}
{% if post.categories contains 'links' %}
<li class="list__item">
<div class="card card--link">
=> {% include_cached components/link-card.html link=post %}
</div>
</li>
{% else %}
<li class="list__item list__item--large">
<div class="card card--article">
=> {% include_cached components/post-card.html page=post %}
</div>
</li>
{% endif %}
{% endfor %}
</ul>
In both uses above, `#parse_params` is going to yield the `post` object jsonified (or maybe the actual `Jekyll::Document` object), either of which isn't a small object.
Consequently, when the `params` Hash is stringified in L#31, the entire JSON string, or perhaps the result of `Jekyll::Document#to_s`, gets passed to `Digest::MD5`.
Since this operation occurs before the Cache is traversed, it will always allocate significant memory.
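To make the cost concrete, here is a minimal standalone sketch (not the plugin's code). The `post` and `params` values are hypothetical stand-ins for a large document payload. It compares stringifying the whole Hash before digesting with digesting only the Integer returned by `Hash#hash`.

```ruby
require "digest"

# Hypothetical stand-in for a large post payload; not real Jekyll data.
post   = { "title" => "Hello", "content" => "x" * 100_000 }
params = { "page" => post }

# Stringifying the Hash builds a brand-new String roughly as large as
# the payload itself, just to feed it to the digest.
stringified        = params.to_s
digest_from_string = Digest::MD5.hexdigest(stringified)

# Hash#hash walks the same contents but returns an Integer, so the only
# String handed to the digest is a few dozen bytes long.
compact          = params.hash
digest_from_hash = Digest::MD5.hexdigest(compact.to_s)

puts "stringified params: #{stringified.bytesize} bytes"
puts "hashed params:      #{compact.to_s.bytesize} bytes"
```

One caveat with this approach: `Hash#hash` values are only stable within a single Ruby process, which should be fine for an in-memory build cache but not for anything persisted between builds. Whether a `Jekyll::Document` passed as a param hashes by content or by identity is not verified here.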
@ashmaroli thanks for this. Would the solution be to return after the path is calculated?
To me, the solution is multi-pronged, as listed under `## Tl;dr` above. To elaborate further:
- Yes, 👍 to `return unless path` right after calculating `path`.
- However, if `path` is valid, then `#key` is still going to take the `path` and the large `params` object (based on the cited example) as arguments. I was pitching for having `key = key(path, params.hash)` instead.
- That said, the `key` technically has to be computed for every instance of the tag before the `Cache` can be checked. Therefore, a layout containing `{% include_cached file.html foo='bar' %}` will generate a new String object (but with the same Digest value) for every `page` render. So it'd be awesome if the "digest key" itself could be cached as well.
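The three points above could be combined roughly as follows. This is a self-contained sketch under stated assumptions, not the plugin's actual implementation: the class `CachedIncludeSketch`, its `@digest_cache`, and the placeholder "rendering" are all hypothetical, and the real tag would take a Liquid context rather than a path and a Hash.

```ruby
require "digest"

# Hypothetical stand-in for the tag; none of these names come from the plugin.
class CachedIncludeSketch
  attr_reader :renders

  def initialize
    @cache        = {} # digest key => rendered output
    @digest_cache = {} # [path, params.hash] => digest key
    @renders      = 0
  end

  def render(path, params)
    # 1. Bail out before any key is computed if there is nothing to render.
    return unless path

    # 2. Pass the Integer from params.hash instead of the params themselves.
    cache_key = key(path, params.hash)

    @cache[cache_key] ||= begin
      @renders += 1
      "rendered #{path}" # placeholder for the real include rendering
    end
  end

  private

  # 3. Memoize the digest so repeated tags reuse the same key String.
  def key(path, params_hash)
    @digest_cache[[path, params_hash]] ||=
      Digest::MD5.hexdigest("#{path}-#{params_hash}")
  end
end

tag = CachedIncludeSketch.new
tag.render("file.html", { "foo" => "bar" })
tag.render("file.html", { "foo" => "bar" })
puts tag.renders
```

The digest lookup still allocates a small two-element Array per call, so this trades one large temporary String for a small constant-size object. A nested Hash keyed by `path` would avoid even that, at the cost of slightly more code.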