Really slow retrieval of manifest item by relative path with massive spines.
mickael-menu-mantano opened this issue
Group discussion:
https://groups.google.com/forum/#!topic/readium-dev/H01tvjUVKGc
Part of this pull request is fixing this issue:
#198
Media Overlays code reference points (copy / paste from the email discussion):
I fixed a similar performance issue a long time ago with the Media
Overlays path matcher (I resorted to a cached map).
std::map<string, std::shared_ptr<ManifestItem>> cache_smilRelativePathToManifestItem;
std::map<std::shared_ptr<ManifestItem>, string> cache_manifestItemToAbsolutePath;
So, a lot of string concatenation / manipulations occur at that point:
string ManifestItem::AbsolutePath() const
https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/manifest.cpp#L183
BaseHref
https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/manifest.cpp#L203
Also, note that when the first loop iteration fails, we have to check
for lower/upper-case percent encoding mismatches! Hopefully, your test
did not include this codepath? (otherwise it would have added a huge
processing cost)
https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/package.cpp#L173
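One way to avoid a second matching pass for percent-encoding case mismatches is to normalize the hex digits once, before comparing or caching paths. The following is only a sketch of that idea, not what the SDK does: `NormalizePercentEncodingCase` is a hypothetical helper, and it assumes plain `std::string` paths rather than the SDK's own string type.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of readium-sdk): upper-cases the two hex
// digits after each '%' so that "%c3" and "%C3" compare equal.
// All other characters are left untouched.
std::string NormalizePercentEncodingCase(const std::string& path) {
    std::string out = path;
    // i + 2 < size guarantees both hex digit positions are valid indices.
    for (std::size_t i = 0; i + 2 < out.size(); ++i) {
        const unsigned char hi = static_cast<unsigned char>(out[i + 1]);
        const unsigned char lo = static_cast<unsigned char>(out[i + 2]);
        if (out[i] == '%' && std::isxdigit(hi) && std::isxdigit(lo)) {
            out[i + 1] = static_cast<char>(std::toupper(hi));
            out[i + 2] = static_cast<char>(std::toupper(lo));
            i += 2;  // skip past the escape sequence just rewritten
        }
    }
    return out;
}
```

If both the manifest hrefs and the looked-up path go through the same normalization, a single exact comparison should be enough in the common case.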
See getReferencedManifestItem():
https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/media-overlays_smil_model.cpp#L616
This improved performance by an order of magnitude!
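The cached-map idea can be sketched roughly as follows. This is a simplified illustration, not the SDK code: `ManifestItem` here is a hypothetical stand-in with a single field, `ManifestPathCache` is an invented name, and the paths are assumed to be already resolved to a comparable form.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the SDK's ManifestItem; only the path matters here.
struct ManifestItem {
    std::string absolutePath;
};

// Builds the path -> item map once, so later lookups do not re-walk the
// manifest or rebuild path strings on every call.
class ManifestPathCache {
public:
    explicit ManifestPathCache(const std::vector<std::shared_ptr<ManifestItem>>& items) {
        for (const auto& item : items) {
            // emplace keeps the first item if two items share a path.
            byPath_.emplace(item->absolutePath, item);
        }
    }

    // Returns the cached item, or nullptr when the path is unknown.
    std::shared_ptr<ManifestItem> Find(const std::string& absolutePath) const {
        const auto it = byPath_.find(absolutePath);
        if (it == byPath_.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<ManifestItem>> byPath_;
};
```

With a massive spine, the cost moves from one linear scan per lookup to one scan at construction plus a logarithmic lookup per query. A `std::unordered_map` would be another reasonable choice if ordering is not needed.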
I found that this is super slow:
string::size_type i = iri.find_first_of('#');
compared to plain old:
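const char * str = iri.c_str();
const size_t size = strlen(str); // assumed; the original snippet does not show how size is set
for (size_t j = 0; j < size; j++) {
    char c = str[j];
    if (c == '#') {
        break; // j is the index of the first '#'
    }
}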
See the timer I used to measure the difference:
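For anyone wanting to reproduce the comparison, here is a minimal sketch using `std::chrono`. It is not the timer referred to above: `FindHashManual`, `FindHashLibrary` and `TimeMicros` are hypothetical names, and it uses `std::string` rather than the SDK's own string class, so the gap it shows may differ from what was measured in the SDK.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Manual scan over the raw buffer, mirroring the loop in this thread.
std::size_t FindHashManual(const std::string& iri) {
    const char* str = iri.c_str();
    const std::size_t size = iri.size();
    for (std::size_t j = 0; j < size; ++j) {
        if (str[j] == '#') {
            return j;
        }
    }
    return std::string::npos;
}

// Library call, for comparison.
std::size_t FindHashLibrary(const std::string& iri) {
    return iri.find_first_of('#');
}

// Calls fn(iri) `iterations` times and returns the elapsed microseconds.
template <typename Fn>
long long TimeMicros(Fn fn, const std::string& iri, int iterations) {
    // volatile sink keeps the compiler from discarding the calls.
    volatile std::size_t sink = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int n = 0; n < iterations; ++n) {
        sink = fn(iri);
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `TimeMicros(FindHashManual, iri, n)` and `TimeMicros(FindHashLibrary, iri, n)` with the same input and iteration count gives two numbers to compare; results depend heavily on compiler, optimization level and standard library.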
Closed, see PR #208