A note on strcpy

User923005 · Post by **User923005** » Tue Nov 26, 2013 11:34 pm

hyatt wrote:
BTW Abort does NOTHING on mac osx. I ran this under lldb. Nothing. Just "Abort" and "normal termination".

I would say Apple's solution is idiotic.

ulimit -c unlimited

I assume that Mac OSX works like every other Posix system.

Jeremy Bernstein · Post by **Jeremy Bernstein** » Tue Nov 26, 2013 11:40 pm

We actually have found some real-world occurrences of this bug in our (that is, my day job's) software since updating test machines to 10.9. Our code base has parts dating back to 1985 or so, and has had lots of hands of varying levels of expertise touching it over the years. Typically, it happens accidentally of course (because you assigned the string to multiple variables), but we were seeing some extremely vexing memory corruption rather than crashing. I definitely prefer a hard crash with a solid crashlog than the mysterium of trashed data structures in memory...

jb

hyatt · Post by **hyatt** » Tue Nov 26, 2013 11:51 pm

Sorry, my man page does NOT say "don't do this". It says "the behavior is undefined." The behavior in Mavericks is NOT "undefined". You get a message that says "Abort" with no explanation as to why.

I looked at the strcpy() code. It works everywhere. You did NOT give me "50 examples" where it didn't work. You gave me 50 examples where it was used stupidly and did not work as expected. Again, feel free to point out ANY operating system / compiler combination where you compile Crafty and type the command "read xxx.pgn" and it crashes. Just one will do. NOT counting Mavericks. Where it doesn't crash either, the library simply checks for overlapping and terminates execution capriciously. In addition, apple strcpy() has slowed down because of this "error checking" that is pointless...

User923005 · Post by **User923005** » Tue Nov 26, 2013 11:59 pm

hyatt wrote:Sorry, my man page does NOT say "don't do this". It says "the behavior is undefined." The behavior in Mavericks is NOT "undefined". You get a message that says "Abort" with no explanation as to why.

I looked at the strcpy() code. It works everywhere. You did NOT give me "50 examples" where it didn't work. You gave me 50 examples where it was used stupidly and did not work as expected. Again, feel free to point out ANY operating system / compiler combination where you compile Crafty and type the command "read xxx.pgn" and it crashes. Just one will do. NOT counting Mavericks. Where it doesn't crash either, the library simply checks for overlapping and terminates execution capriciously. In addition, apple strcpy() has slowed down because of this "error checking" that is pointless...

The behavior is undefined because the standard says it is undefined. The code does not work everywhere. In fact, it is broken in any place that strcpy overlaps. It would fail any sensible code review.

When the manual says, "the behavior is undefined" that means it is a really, really, really, really bad idea to do it. It means that bad things can happen like core dumps and lost money and writes outside of memory bounds and Arienne rockets detonating[1] and x-ray machines killing patients[2].

[1][2] These are real results from programs with undefined behavior.

User923005 · Post by **User923005** » Wed Nov 27, 2013 12:02 am

Jeremy Bernstein wrote:We actually have found some real-world occurrences of this bug in our (that is, my day job's) software since updating test machines to 10.9. Our code base has parts dating back to 1985 or so, and has had lots of hands of varying levels of expertise touching it over the years. Typically, it happens accidentally of course (because you assigned the string to multiple variables), but we were seeing some extremely vexing memory corruption rather than crashing. I definitely prefer a hard crash with a solid crashlog than the mysterium of trashed data structures in memory...

jb

Amen to that.
The first step to fix a problem is to reproduce the problem. Once you can do that, fixing it is trivial.
Mangling memory and chugging along on our merry way is the worst possible behavior that I can imagine. We may not even know that we have done something incredibly stupid and dangerous.

User923005 · Post by **User923005** » Wed Nov 27, 2013 12:17 am

hyatt wrote: The behavior in Mavericks is NOT "undefined". You get a message that says "Abort" with no explanation as to why.

That is a perfectly good example of undefined behavior.
And the C language standard specifically says that the above exact behavior is completely acceptable and even lists it as one likely outcome.

Any time that code with undefined behavior works, it works by accident.
Now, it is possible to have a special implementation that gives implementation defined behavior for something that the standard says is undefined behavior.
An example of this is reading a hardware I/O port from a specific hardware address. Of course, this may work on a particular compiler and OS, but it is obviously not portable to other systems.

As a suggestion, go to the C language instructor at UAB and ask him what "undefined behavior" means in the C programming language. After his excellent description, I am quite sure that you will understand the situation better.

User923005 · Post by **User923005** » Wed Nov 27, 2013 12:46 am

FROM:
https://bugzilla.redhat.com/show_bug.cgi?id=638477#c99

Ma Ling 2010-11-15 02:17:54 EST
The blow reason is why to use backward copy, it is good on Core2, NHM, Opteron.
glbc memcpy will prefer to backward/forward copy mode according to offset from source and destination.

Optimize memcpy by avoiding memory false dependece

All read operations after allocation stage can run speculatively,
all write operation will run in program order, and if addresses are
different read may run before older write operation, otherwise wait
until write commit. However CPU don't check each address bit,
so read could fail to recognize different address even they
are in different page.For example if rsi is 0xf004, rdi is 0xe008,
in following operation there will generate big performance latency.
1. movq (%rsi), %rax
2. movq %rax, (%rdi)
3. movq 8(%rsi), %rax
4. movq %rax, 8(%rdi)

If %rsi and rdi were in really the same meory page, there are TRUE
read-after-write dependence because instruction 2 write 0x008 and
instruction 3 read 0x00c, the two address are overlap partially.
Actually there are in different page and no any issues,
but without checking each address bit CPU could think they are
in the same page, and instruction 3 have to wait for instruction 2
to write data into cache from write buffer, then load data from cache,
the cost time read spent is equal to mfence instruction. We may avoid it by
tuning operation sequence as follow.

1. movq 8(%rsi), %rax
2. movq %rax, 8(%rdi)
3. movq (%rsi), %rax
4. movq %rax, (%rdi)

Instruction 3 read 0x004, instruction 2 write address 0x010, no any
dependence. At last on Core2 we gain 1.83x speedup compared with
original instruction sequence. In this patch we first handle small
size(less 20bytes), then jump to different copy mode. Based on our
micro-benchmark small bytes from 1 to 127 bytes, we got up to 2X
improvement, and up to 1.5X improvement for 1024 bytes on Corei7.

Thanks
Ling

hyatt · Post by **hyatt** » Wed Nov 27, 2013 1:21 am

User923005 wrote:
hyatt wrote: The behavior in Mavericks is NOT "undefined". You get a message that says "Abort" with no explanation as to why.
That is a perfectly good example of undefined behavior.
And the C language standard specifically says that the above exact behavior is completely acceptable and even lists it as one likely outcome.

Any time that code with undefined behavior works, it works by accident.
Now, it is possible to have a special implementation that gives implementation defined behavior for something that the standard says is undefined behavior.
An example of this is reading a hardware I/O port from a specific hardware address. Of course, this may work on a particular compiler and OS, but it is obviously not portable to other systems.

As a suggestion, go to the C language instructor at UAB and ask him what "undefined behavior" means in the C programming language. After his excellent description, I am quite sure that you will understand the situation better.

Here is the incredible useful osx output with this bug included in an older crafty:

scrappy% gdb /users/chess/old/crafty-23.6/crafty
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin13.0.0".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /Users/chess/old/crafty-23.6/crafty...done.
(gdb) run
Starting program: /Users/chess/old/crafty-23.6/crafty
unable to open book file [./books.bin].
pondering disabled.
noise level set to 0.

Crafty v23.6 (1 cpus)

White(1): read db.pgn

Program received signal SIGABRT, Aborted.
0x00007fff8c949866 in ?? ()
(gdb)

REALLY points out the problem. core dump size is irrelevant here although ulimit -c says "unlimited".

hyatt · Post by **hyatt** » Wed Nov 27, 2013 1:23 am

User923005 wrote:FROM:
https://bugzilla.redhat.com/show_bug.cgi?id=638477#c99

Ma Ling 2010-11-15 02:17:54 EST
The blow reason is why to use backward copy, it is good on Core2, NHM, Opteron.
glbc memcpy will prefer to backward/forward copy mode according to offset from source and destination.

Optimize memcpy by avoiding memory false dependece

All read operations after allocation stage can run speculatively,
all write operation will run in program order, and if addresses are
different read may run before older write operation, otherwise wait
until write commit. However CPU don't check each address bit,
so read could fail to recognize different address even they
are in different page.For example if rsi is 0xf004, rdi is 0xe008,
in following operation there will generate big performance latency.
1. movq (%rsi), %rax
2. movq %rax, (%rdi)
3. movq 8(%rsi), %rax
4. movq %rax, 8(%rdi)

If %rsi and rdi were in really the same meory page, there are TRUE
read-after-write dependence because instruction 2 write 0x008 and
instruction 3 read 0x00c, the two address are overlap partially.
Actually there are in different page and no any issues,
but without checking each address bit CPU could think they are
in the same page, and instruction 3 have to wait for instruction 2
to write data into cache from write buffer, then load data from cache,
the cost time read spent is equal to mfence instruction. We may avoid it by
tuning operation sequence as follow.

1. movq 8(%rsi), %rax
2. movq %rax, 8(%rdi)
3. movq (%rsi), %rax
4. movq %rax, (%rdi)

Instruction 3 read 0x004, instruction 2 write address 0x010, no any
dependence. At last on Core2 we gain 1.83x speedup compared with
original instruction sequence. In this patch we first handle small
size(less 20bytes), then jump to different copy mode. Based on our
micro-benchmark small bytes from 1 to 127 bytes, we got up to 2X
improvement, and up to 1.5X improvement for 1024 bytes on Corei7.

Thanks
Ling

Who is talking about memcpy? With memcpy you KNOW how many bytes you are to copy. Not so with strcpy(). Totally unrelated.

User923005 · Post by **User923005** » Wed Nov 27, 2013 1:24 am

Compile the following program on your favorite compiler:

#include <string.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
char b[32];
strcpy(b, "123456789012345");
strcpy(b + 1, b);
printf("[%s]\n", b);
return 0;
}

For Microsoft Visual C++ in 64 bit mode, it exits with a dump and gives the following error message:
Unhandled exception at 0x000007F7AE6013C0 in bozo.exe: Stack cookie instrumentation code detected a stack-based buffer overrun.

In 32 bit mode, it exits with a core dump and gives the following error message:
Unhandled exception at 0x009A1035 in bozo.exe: 0xC0000005: Access violation writing location 0x00200000.

GCC gave me this:

dcorbit@dcorbit /q/cc
$ cat bozo.c
#include <string.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
char b[32];
strcpy(b, "123456789012345");
strcpy(b + 1, b);
printf("[%s]\n", b);
return 0;
}

dcorbit@dcorbit /q/cc
$ gcc -Wall -ansi -pedantic bozo.c

dcorbit@dcorbit /q/cc
$ ./a
[1123456788012345]

Look at it carefully, is it what you expected?

OpenChess

OpenChess

A note on strcpy

Re: A note on strcpy

Re: A note on strcpy

Re: A note on strcpy

Re: A note on strcpy

Re: A note on strcpy

Re: A note on strcpy

Re: A note on strcpy

Re: A note on strcpy

Re: A note on strcpy

Re: A note on strcpy