gulrak / filesystem

An implementation of C++17 std::filesystem for C++11 /C++14/C++17/C++20 on Windows, macOS, Linux and FreeBSD.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Duplicate/Unexpected call detail::toUtf8

phprus opened this issue · comments

The path constructor already uses detail::toUtf8.

Duplicate call (and possible wrong in wchar_t environment):

template <class Source>
inline path& path::append(const Source& source)
{
    return this->operator/=(path(detail::toUtf8(source)));
}

and:

template <class EcharT>
inline path::path_type_EcharT<EcharT>& path::operator+=(EcharT x)
{
    std::basic_string<EcharT> part(1, x);
    concat(detail::toUtf8(part));
    return *this;
}

I admit that there is a potential performance overhead in that the string gets an additional copy (detail::toUtf8 does not do any conversion if called with std::string), but I can see no way it would lead to an error. In the wchat_t version, there is an additional conversion from and to wchar_t if called with a std::wstring. It will convert to std::string using detail::toUtf8 and then call the path constructor with it, where it is converted back to std::wstring. Due to the "string-is-utf8" and "wstring-is-utf16-on-windows" rules of this implementation these conversions are lossless. The constructor of path for the what_t case does call detail::toWChar not detail::toUtf8. In Release mode I couldn't see the impact of the additional non-converting detail::toUtf8 so I guess the compilers I tested it on optimized that out.

Still there are places that sure could get some optimization, especially for the wchar_t variant, and from time to time, or if I get reports of heavy performance issues I try to enhance the code, but given the time I can invest, my focus is on correctness first and I might be wrong, but I don't see potential for wrong conversions here. If I'm missing something, I need more details or a test case breaking.

Yes, there will be no encoding conversion errors. I misunderstood the code.

I don't understand why this chain of calls is needed: detail::toUtf8/*in constructor*/(detail::toUtf8/*in append function*/(...)) /*in the char version*/ or detail::toWChar/*in constructor*/(detail::toUtf8/*in append function*/(...)) /*in the wchar_t version*/? One call to detail::toUtf8 in this chain doesn't do any useful work.

I created a pull request that removes the internal call detail::toUtf8.

This is now part of release v1.5.2