Discussion:
[Bug c++/63631] New: std::regex_match yielding inexplicable garbage; invalid reads in valgrind
meme01 at eaku dot net
2014-10-23 16:21:53 UTC
Permalink
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63631

Bug ID: 63631
Summary: std::regex_match yielding inexplicable garbage;
invalid reads in valgrind
Product: gcc
Version: 4.9.1
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: meme01 at eaku dot net

Newly installed g++ 4.9.1, built from the tarball today:

---------------
$ cat /proc/version
Linux version 2.6.18-194.el5PAE (***@x86-007.build.bos.redhat.com) (gcc
version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Tue Mar 16 22:00:21 EDT 2010
$ cat /etc/redhat-release
Red Hat Enterprise Linux Client release 5.5 (Tikanga)
$ /usr/local/bin/g++ --version
g++ (GCC) 4.9.1
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
---------------

Code sample to reproduce the bug:

---------------
#include <string>
#include <iostream>
#include <regex>

//#define INEXPLICABLY_WORK
//#define OPTION_THAT_DOESNT_HELP

int main(int argc, char** argv)
{
    std::cmatch matches;
    const std::string test_string(" 1:2.3 ");
    std::cout << test_string << std::endl;
    std::regex words_regex("[^\\s]+");

    auto words_begin = std::sregex_iterator(test_string.begin(),
                                            test_string.end(), words_regex);

#ifdef INEXPLICABLY_WORK
    std::string unused = words_begin->str().c_str();
#endif

#ifdef OPTION_THAT_DOESNT_HELP
    const char* c = words_begin->str().c_str();
#endif

    if (std::regex_match(words_begin->str().c_str(), matches,
                         std::regex(".*:(\\d+\\.?\\d*)")))
        std::cout << matches[0] << std::endl;

    return 0;
}
---------------

I compile and execute with: rm -f test; /usr/local/bin/g++ --std=c++14 -O0
test.cpp -o test 2>&1 | more; ./test

Output is:

---------------
1:2.3
1:2.X
---------------

But if you enable INEXPLICABLY_WORK, it correctly yields:

---------------
1:2.3
1:2.3
---------------

Running valgrind yields the following errors, and a bunch more similar, none in
my code:

---------------
==7930== Invalid read of size 1
==7930== at 0x4007B01: memcpy (mc_replace_strmem.c:482)
==7930== by 0x40BC15B: std::string::_S_copy_chars(char*, char const*, char
const*) (in /usr/local/lib/libstdc++.so.6.0.20)
==7930== by 0x804E8AB: char* std::string::_S_construct_aux<char const*>(char
const*, char const*, std::allocator<char> const&, std::__false_type) (in MYDIR)
==7930== by 0x804D069: char* std::string::_S_construct<char const*>(char
const*, char const*, std::allocator<char> const&) (in MYDIR)
==7930== by 0x804C0E6: std::basic_string<char, std::char_traits<char>,
std::allocator<char> >::basic_string<char const*>(char const*, char const*,
std::allocator<char> const&) (in MYDIR)
==7930== by 0x804BDD7: std::sub_match<char const*>::str() const (in MYDIR)
==7930== by 0x804B6B1: std::basic_ostream<char, std::char_traits<char> >&
std::operator<< <char, std::char_traits<char>, char
const*>(std::basic_ostream<char, std::char_traits<char> >&, std::sub_match<char
const*> const&) (in MYDIR)
==7930== by 0x804A44F: main (in MYDIR)
==7930== Address 0x4193580 is 16 bytes inside a block of size 18 free'd
==7930== at 0x4005234: operator delete(void*) (vg_replace_malloc.c:346)
==7930== by 0x40BD137: std::string::_Rep::_M_destroy(std::allocator<char>
const&) (in /usr/local/lib/libstdc++.so.6.0.20)
==7930== by 0xC22E9B: (below main) (in /lib/libc-2.5.so)
---------------

The valgrind errors still occur even if INEXPLICABLY_WORK is enabled.

Thoughts?
redi at gcc dot gnu.org
2014-10-23 16:45:11 UTC

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID
           Severity|major                       |normal

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
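words_begin->str().c_str() returns a pointer that dangles as soon as the
enclosing expression has been evaluated, because words_begin->str() returns a
temporary std::string that is destroyed at the end of that full expression.

This means your regex_match() call fills the match_results object with pointers
into a string that has already been deallocated by the time you print
matches[0].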
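
A minimal sketch of one way to avoid the dangling pointer, assuming the goal is
simply to match the first word against the second pattern: copy the word into a
named std::string and match against that with std::smatch, then copy the result
out while the string is still alive. The helper names match_number_suffix and
first_word_match are hypothetical and do not come from the report.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper (not from the report): returns the full match of
// `word` against the reporter's second pattern, or an empty string.
std::string match_number_suffix(const std::string& word)
{
    static const std::regex pattern(".*:(\\d+\\.?\\d*)");
    std::smatch matches;              // will hold iterators into `word`
    if (std::regex_match(word, matches, pattern))
        return matches[0].str();      // copy out while `word` is still alive
    return std::string();
}

// Hypothetical helper: extracts the first whitespace-delimited word of
// `text` and matches it, keeping the word in a named variable so that the
// match results never refer to a destroyed temporary.
std::string first_word_match(const std::string& text)
{
    const std::regex words_regex("[^\\s]+");
    std::sregex_iterator words_begin(text.begin(), text.end(), words_regex);
    if (words_begin == std::sregex_iterator())
        return std::string();         // no word found
    const std::string word = words_begin->str();  // named copy, not a temporary
    return match_number_suffix(word);
}
```

Calling first_word_match on the test string from the report goes through the
same two regexes as the original program, but every pointer or iterator held by
the match results refers to a string that outlives them.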
meme01 at eaku dot net
2014-10-23 16:54:00 UTC

KarenRei <meme01 at eaku dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |FIXED

--- Comment #2 from KarenRei <meme01 at eaku dot net> ---


Ugh... thanks. So sregex_iterator->str() yields just a temporary. I'm
surprised; I'd have thought that would be a big performance hit. But there must
be some internal reason.

Again, thanks.
redi at gcc dot gnu.org
2014-10-23 16:59:29 UTC

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |INVALID

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
sregex_iterator::operator->() yields a pointer to a std::smatch (that is, a
std::match_results<std::string::const_iterator>), and match_results::str()
returns a temporary std::string.

The reason is that match_results doesn't contain strings; it contains
iterators into some other string. When you want to retrieve a match, it
constructs a new string from the relevant iterators. That avoids having to
construct all the strings up front when they may never be needed.
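
To illustrate the point about iterators, here is a small sketch using only
standard-library calls. The helper names match_offset and
str_returns_independent_copies are hypothetical, and the second helper assumes
a non-empty match.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: offset of the first match of `re` within `subject`,
// computed purely from the sub_match iterators; -1 if there is no match.
std::ptrdiff_t match_offset(const std::string& subject, const std::regex& re)
{
    std::smatch m;
    if (!std::regex_search(subject, m, re))
        return -1;
    // m[0].first and m[0].second are iterators into `subject`;
    // no characters have been copied at this point.
    return m[0].first - subject.begin();
}

// Hypothetical helper: checks that each call to str() builds a separate
// std::string whose buffer is distinct from `subject`'s own storage.
// Assumes the match is non-empty.
bool str_returns_independent_copies(const std::string& subject,
                                    const std::regex& re)
{
    std::smatch m;
    if (!std::regex_search(subject, m, re) || m.length(0) == 0)
        return false;
    const std::string a = m.str();   // first copy
    const std::string b = m.str();   // second, separate copy
    return a == b
        && a.data() != b.data()
        && a.data() != &*m[0].first;
}
```

Because the match results only refer to the subject string, they stay valid
only as long as that string does; the copies returned by str() are independent
and can safely outlive it.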