Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

With c_str(...) the string length information is lost. #66

Open
Spammed opened this issue Nov 12, 2020 · 10 comments
Open

With c_str(...) the string length information is lost. #66

Spammed opened this issue Nov 12, 2020 · 10 comments

Comments

@Spammed
Copy link

Spammed commented Nov 12, 2020

I pass a string_view to map_source instead of a null-terminated string.

detail::open_file() or win::open_file_helper() use c_str(...) and c_str(...) is
casting std::string.data() to a char*.

This way the string length is no longer taken into account and an incorrect file name is used.

Obviously I can work around this problem,
but think that this is at least an API-Bug.

@Spammed
Copy link
Author

Spammed commented Nov 12, 2020

Ah! Most likely the pull request
"Adding support of UTF8 file name processing in mio on windows" (#64)
of shivendra14 will solve this issue as a side effect.

@vimpunk
Copy link
Owner

vimpunk commented Dec 20, 2020

Interesting! Mio was not written with string_view in support, that came after it was written.

Let me know if #64 fixes it for you, it should be merged soon.

@Spammed
Copy link
Author

Spammed commented Dec 20, 2020

I am having several issues getting the patch to work:
a) I am missing an #include <vector>.
and when I fix this I get
b) 'std::wstring mio::detail::win::s2ws(const std::string &)': cannot convert argument 1 from 'const String' to 'const std::string &'
To escape this problem I changed
std::wstring s2ws(const std::string& s)
to

template<
	typename String,
	typename = typename std::enable_if<
	std::is_same<typename char_type<String>::type, char>::value
	>::type
> std::wstring s2ws(const String& s)

but then I get
c) 'c_str': is not a member of 'std::basic_string_view<char,std::char_traits>' for the line
const auto wideCharCount = MultiByteToWideChar(CP_UTF8, 0, s.c_str(), sLength, buf.data(), sLength);
which can be cured by
const auto wideCharCount = MultiByteToWideChar(CP_UTF8, 0, c_str(s), sLength, buf.data(), sLength);

This does the job, but I'am totally uncomfortable with the way it does.
The original issue is, that the string length was lost. Now that new s2ws() is using Length() before that happens, which is fine.
The rest of this string dance just doesn't feel straight to me and IMHO this needs to be reworked again by someone who likes string types more than I do.

For your own experiments, my use case:

bool files_are_equal(const std::string_view file1, std::string_view file2)
{
	try {
		auto f1 = mio::mmap_source(file1);
		auto f2 = mio::mmap_source(file2);
		return f1.size() == f2.size() && std::equal(f1.data(), f1.data() + f1.size(), f2.data());
	} catch (...) {
		return false;
	}
}

const auto new_name {"name.ext.new'}; 
const auto p = new_name.find_last_of('.');
const auto orig_name = new_name.substr(0, p);
if (files_are_equal(orig_name, new_name) {
...
}

@Spammed Spammed changed the title With c_str() the string length information is lost. With c_str(...) the string length information is lost. Dec 21, 2020
@Spammed
Copy link
Author

Spammed commented Dec 22, 2020

Unfortunatly, I still have two issues with the your last commit 3f86a95

Issue a)
Error C2039: 'vector': is not a member of 'std' in line 802 of mio.hpp
Which means, that an #include <vector> is missing.

Issue b) (after curing a))
Error C2664: 'std::wstring mio::detail::win::s_2_ws(const std::string &)': cannot convert argument 1 from 'const String' to 'const std::string &' in line 814 of mio.hpp.

I think it's irrelevant, but for the sake of completeness:
Microsoft Visual Studio, Version 16.9.0 Preview 2.0, compiler flags /permissive-, /std:c++latest and /W4.

@vimpunk
Copy link
Owner

vimpunk commented Dec 22, 2020

I'm really sorry, it looks like the latest PR that I accepted doesn't compile then. I don't have a windows machine nearby so I couldn't test it, but it seems this was a mistake.

There are two options at the moment:

  • I roll back the commit and accept there is a bug in the windows execution path until fixed.
  • If you have time, you could send a patch to apply your suggested fixes, either on top of current master, or after I have rolled back.

What do you think?

@Spammed
Copy link
Author

Spammed commented Dec 22, 2020

I am willing and have time to create a proposal/patch.
I don't care, but probably it's better to restore the old, compilable state first.

Unfortunately I have not fully understand template< typename String ... and fear to create strange or inappropriate code for this. A few explanatory words would probably help me.

I suggest I create a patch to the best of my knowledge and we'll see from there.

@Spammed
Copy link
Author

Spammed commented Dec 22, 2020

Sorry I don't know GitHub working style.
I finally made up a patch on the latest version.

I try not to commit this patch for several reasons:

  1. it only affects the single header (I don't use CMake...).
  2. it is not clear to me what should really happen in case of path=="".
  3. It is not clear to me whether one should really waive s_2_ws(). But the code with s_2_ws() does not seem better/clearer to me.
  4. my compiler still throws some warnings for other places in mio.hpp.
Subject: [PATCH] a) add missing '#include <vector>' b) rework
 open_file_helper() to silence MSVC Error C2664:    'std::wstring
 mio::detail::win::s_2_ws(const std::string &)': cannot convert argument 1
 from 'const String' to 'const std::string &'

---
 single_include/mio/mio.hpp | 32 ++++++++++++++------------------
 1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/single_include/mio/mio.hpp b/single_include/mio/mio.hpp
index c568a46..b49ebb5 100644
--- a/single_include/mio/mio.hpp
+++ b/single_include/mio/mio.hpp
@@ -104,6 +104,7 @@ inline size_t make_offset_page_aligned(size_t offset) noexcept
 
 #include <iterator>
 #include <string>
+#include <vector>
 #include <system_error>
 #include <cstdint>
 
@@ -794,30 +795,25 @@ inline DWORD int64_low(int64_t n) noexcept
     return n & 0xffffffff;
 }
 
-std::wstring s_2_ws(const std::string& s)
-{
-    if (s.empty())
-        return{};
-    const auto s_length = static_cast<int>(s.length());
-    auto buf = std::vector<wchar_t>(s_length);
-    const auto wide_char_count = MultiByteToWideChar(CP_UTF8, 0, s.c_str(), s_length, buf.data(), s_length);
-    return std::wstring(buf.data(), wide_char_count);
-}
-
 template<
     typename String,
     typename = typename std::enable_if<
-        std::is_same<typename char_type<String>::type, char>::value
+    std::is_same<typename char_type<String>::type, char>::value
     >::type
 > file_handle_type open_file_helper(const String& path, const access_mode mode)
 {
-    return ::CreateFileW(s_2_ws(path).c_str(),
-            mode == access_mode::read ? GENERIC_READ : GENERIC_READ | GENERIC_WRITE,
-            FILE_SHARE_READ | FILE_SHARE_WRITE,
-            0,
-            OPEN_EXISTING,
-            FILE_ATTRIBUTE_NORMAL,
-            0);
+    const auto path_length = path.length();
+    if (path_length == 0)
+        return{};
+    std::vector<wchar_t> buf(path_length);
+    const auto wide_char_count = MultiByteToWideChar(CP_UTF8, 0, c_str(path), path_length, buf.data(), path_length);
+    return ::CreateFileW(buf.data(),
+        mode == access_mode::read ? GENERIC_READ : GENERIC_READ | GENERIC_WRITE,
+        FILE_SHARE_READ | FILE_SHARE_WRITE,
+        0,
+        OPEN_EXISTING,
+        FILE_ATTRIBUTE_NORMAL,
+        0);
 }
 
 template<typename String>
-- 
2.27.0.windows.1

@Spammed
Copy link
Author

Spammed commented Dec 22, 2020

Off Topic: The warnings, my compiler throws are (after applying the patch above):

...\mio.hpp(885,39): warning C4244: 'return': conversion from 'int64_t' to 'size_t', possible loss of data
...\mio.hpp(910,61): warning C4244: 'argument': conversion from 'const int64_t' to 'size_t', possible loss of data
...\mio.hpp(931,13): warning C4244: 'argument': conversion from 'const int64_t' to 'SIZE_T', possible loss of data
...\mio.hpp(1097,22): warning C4244: '=': conversion from 'const int64_t' to 'mio::basic_mmap<mio::access_mode::read,char>::size_type', possible loss of data
...\mio.hpp(1063): message : while compiling class template member function 'void mio::basic_mmap<mio::access_mode::read,char>::map(const mio::basic_mmap<mio::access_mode::read,char>::handle_type,const mio::basic_mmap<mio::access_mode::read,char>::size_type,const mio::basic_mmap<mio::access_mode::read,char>::size_type,std::error_code &)'
...\mio.hpp(1052): message : see reference to function template instantiation 'void mio::basic_mmap<mio::access_mode::read,char>::map(const mio::basic_mmap<mio::access_mode::read,char>::handle_type,const mio::basic_mmap<mio::access_mode::read,char>::size_type,const mio::basic_mmap<mio::access_mode::read,char>::size_type,std::error_code &)' being compiled
...\mio.hpp(1140): message : while compiling class template member function 'void mio::basic_mmap<mio::access_mode::read,char>::unmap(void)'
...\mio.hpp(971): message : see reference to function template instantiation 'void mio::basic_mmap<mio::access_mode::read,char>::unmap(void)' being compiled
...\mio.hpp(969): message : while compiling class template member function 'mio::basic_mmap<mio::access_mode::read,char>::~basic_mmap(void)'
...\mio.hpp(536): message : see reference to class template instantiation 'mio::basic_mmap<mio::access_mode::read,char>' being compiled
...\mio.hpp(1098,29): warning C4244: '=': conversion from 'const int64_t' to 'mio::basic_mmap<mio::access_mode::read,char>::size_type', possible loss of data

@vimpunk
Copy link
Owner

vimpunk commented Feb 23, 2021

Hi @Spammed, thanks for all your work on this so far. I had a patchy start to the year and didn't have time to work on open source. Just wanted to give a life sign, I'll hopefully be able to get back on this in the next weeks.

@Spammed
Copy link
Author

Spammed commented Apr 11, 2021

Hello,
I just stumbled across this little essay and thought it fit the topic:
https://blogs.msmvps.com/gdicanio/2021/03/26/the-case-of-string_view-and-the-magic-string/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants