This program segfaults at the commented line:
#include <ctype.h>
#include <stdio.h>

void uppercase_ascii(char *s) {
    while (*s) {
        *s = toupper(*s); // SEGFAULT
        s++;
    }
}

int main() {
    char* text = "this is a test";
    printf("before [%s]\n", text);
    uppercase_ascii(text);
    printf("after [%s]\n", text);
}
text points to "this is a test", which is stored in the program binary, and that's why you can't modify it.
Change it to:
char text[] = "this is a test";
You can modify that; text gets its own copy.
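A minimal sketch of that fix, assuming the array is declared at block scope. The helper name `uppercased_first_char` is hypothetical and exists only for illustration; the cast to `unsigned char` is an addition to the posted function, since passing a negative `char` value to `toupper` is undefined.

```c
#include <ctype.h>

/* Uppercase a writable, NUL-terminated string in place. */
void uppercase_ascii(char *s) {
    while (*s) {
        /* toupper() needs a value representable as unsigned char (or EOF). */
        *s = (char)toupper((unsigned char)*s);
        s++;
    }
}

/* Hypothetical helper: uppercases a private array copy and returns its first character. */
char uppercased_first_char(void) {
    char text[] = "this is a test"; /* array initialised from the literal: writable */
    uppercase_ascii(text);          /* modifies the array, not the literal */
    return text[0];
}
```

Calling `uppercase_ascii` on any `char` array works the same way; only a pointer to the literal itself is off limits.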
The formatting was messed up by Pan.
The function was:
void uppercase_ascii(char *s) {
    while (*s) {
        *s = toupper(*s);
        s++;
    }
}
I know there are better ways to do ASCII uppercase, I don't care about
that; what I don't understand is why I can't do an in-place edit of a
non-const char*?
On Thu, 01 Aug 2024 08:06:57 +0000
Mark Summerfield <mark@qtrac.eu> wrote:
This program segfaults at the commented line:
#include <ctype.h>
#include <stdio.h>
void uppercase_ascii(char *s) {
while (*s) {
*s = toupper(*s); // SEGFAULT
s++;
}
}
int main() {
char* text = "this is a test";
printf("before [%s]\n", text);
uppercase_ascii(text);
printf("after [%s]\n", text);
}
The answers to your question are already given above, so I'd talk about something else. Sorry about it.
To my surprise, none of the 3 major compilers that I tried issued the
warning at this line:
char* text = "this is a test";
If implicit conversion of 'const char*' to 'char*' does not warrant
a compiler warning then I don't know what does.
Is there something in the Standard that explicitly forbids diagnostic
for this sort of conversion?
BTW, all 3 compilers issue reasonable warnings when I write it slightly differently:
const char* ctext = "this is a test";
char* text = ctext;
I am starting to suspect that compilers (and the Standard?) consider
string literals as being of type 'char*' rather than 'const char*'.
On 2024-08-01, Mark Summerfield <mark@qtrac.eu> wrote:
This program segfaults at the commented line:
#include <ctype.h>
#include <stdio.h>
void uppercase_ascii(char *s) {
while (*s) {
*s = toupper(*s); // SEGFAULT
s++;
}
}
int main() {
char* text = "this is a test";
The "this is a test" object is a literal. It is part of the program's image.
When you try to change it, you're making your program self-modifying.
Program received signal SIGSEGV, Segmentation fault.
0x000055555555516e in uppercase_ascii (s=0x555555556004 "this is a test")
at inplace.c:6
6 *s = toupper(*s);
On Linux, the string literals of a C executable are located together
with the program text. They are interspersed among the machine
instructions which reference them. The program text is mapped
read-only, so an attempted modification is an access violation trapped
by the OS, turned into a SIGSEGV signal.
Bart <bc@freeuk.com> writes:
On 01/08/2024 09:38, Richard Harnden wrote:
On 01/08/2024 09:06, Mark Summerfield wrote:
This program segfaults at the commented line:
#include <ctype.h>
#include <stdio.h>
void uppercase_ascii(char *s) {
    while (*s) {
        *s = toupper(*s); // SEGFAULT
        s++;
    }
}
int main() {
    char* text = "this is a test";
    printf("before [%s]\n", text);
    uppercase_ascii(text);
    printf("after [%s]\n", text);
}
text is pointing to "this is a test" - and that is stored in the
program binary and that's why can't modify it.
That's not the reason for the segfault in this case.
I'm fairly sure it is.
With some
compilers, you *can* modify it, but that will permanently modify that
string constant. (If the code is repeated, the text is already in
capitals the second time around.)
It segfaults when the string is stored in a read-only part of the binary.
A string literal creates an array object with static storage duration.
Any attempt to modify that array object has undefined behavior.
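Since writing to the literal's array is undefined, one way to get something writable is to copy it first. The sketch below assumes a heap copy is acceptable; `writable_copy` is a hypothetical helper name, not something from the thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a writable heap copy of s, or NULL if allocation fails. */
char *writable_copy(const char *s) {
    size_t n = strlen(s) + 1;   /* include the terminating NUL */
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;                /* the caller is responsible for free() */
}
```

The copy can then be passed to a function such as uppercase_ascii() without touching the literal.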
On 01/08/2024 21:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
...
With some
compilers, you *can* modify it, but that will permanently modify that
string constant. (If the code is repeated, the text is already in
capitals the second time around.)
It segfaults when the string is stored in a read-only part of the binary.
A string literal creates an array object with static storage duration.
Any attempt to modify that array object has undefined behavior.
What's the difference between such an object, and an array like one of
these:
static char A[100];
static char B[100]={1};
Do these not also have static storage duration? Yet presumably these can
be legally modified.
On 01/08/2024 16:40, Michael S wrote:
I am starting to suspect that compilers (and the Standard?) consider
string literals as being of type 'char*' rather than 'const char*'.
Your suspicions are correct - in C, string literals are used to
initialise an array of char (or wide char, or other appropriate
character type). Perhaps you are thinking of C++, where the type is
"const char" (or other const character type).
So in C, when a string literal is used in an expression it is converted
to a "char *" pointer. You can, of course, assign that to a "const char
*" pointer. But it does not make sense to have a warning when assigning
it to a non-const "char *" pointer. This is despite it being undefined behaviour (explicitly stated in the standards) to attempt to write to a string literal.
The reason string literals are not const in C is backwards compatibility
- they existed before C had "const", and making string literals into
"const char" arrays would mean that existing code that assigned them to non-const pointers would then be in error. C++ was able to do the right thing and make them arrays of const char because it had "const" from the beginning.
gcc has the option "-Wwrite-strings" that makes string literals in C
have "const char" array type, and thus give errors when you try to
assign to a non-const char * pointer. But the option has to be
specified explicitly (it is not in -Wall) because it changes the meaning
of the code and can cause compatibility issues with existing correct code.
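A sketch of the const-correct alternative, which does not depend on any warning flag: once the pointer is declared `const char *`, the compiler rejects a write through it. The names `count_spaces` and `greeting` are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>

/* Hypothetical read-only helper: counts the spaces in s without modifying it. */
size_t count_spaces(const char *s) {
    size_t n = 0;
    for (; *s; s++) {
        if (*s == ' ') {
            n++;
        }
    }
    return n;
}

/* Pointing at a literal through const char * documents that it must not be written. */
const char *greeting(void) {
    const char *text = "this is a test";
    /* text[0] = 'T';   would be rejected at compile time: the pointee is const */
    return text;
}
```

With gcc, compiling the original program with -Wwrite-strings should flag the `char* text = "this is a test";` line for the same reason.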
On 2024-08-01, Bart <bc@freeuk.com> wrote:
It segfaults when the string is stored in a read-only part of the binary.
A string literal creates an array object with static storage duration.
Any attempt to modify that array object has undefined behavior.
What's the difference between such an object, and an array like one of
these:
Programming languages can have objects that have the same lifetime, yet some of which are mutable and some of which are immutable.
If the compiler believes that the immutable objects are in fact
not mutated, it's a bad idea to modify them behind the compiler's
back.
There doesn't have to be any actual difference in the implementation of
these objects, like in what area they are stored, other than the rules regarding their correct use, namely prohibiting modification.
The Racket language has both mutable and immutable cons cells.
The difference is that the immutable cons cells simply lack the
operations needed to mutate them. I'm not an expert on the Racket
internals but I don't see a reason why they couldn't be stored in the
same heap.
static char A[100];
static char B[100]={1};
Do these not also have static storage duration? Yet presumably these can
be legally modified.
That 1 which initializes B[0] cannot be modified.
On 2024-08-01, Bart <bc@freeuk.com> wrote:
On 01/08/2024 20:39, Kaz Kylheku wrote:
On 2024-08-01, Mark Summerfield <mark@qtrac.eu> wrote:
This program segfaults at the commented line:
#include <ctype.h>
#include <stdio.h>
void uppercase_ascii(char *s) {
while (*s) {
*s = toupper(*s); // SEGFAULT
s++;
}
}
int main() {
char* text = "this is a test";
The "this is a test" object is a literal. It is part of the program's image.
So is the text here:
char text[]="this is a test";
But this can be changed without making the program self-modifying.
The array which is initialized by the literal is what can be
changed.
In this situation, the literal is just initializer syntax,
not required to be an object with an address.
I guess it depends on what is classed as the program's 'image'.
I'd say the image in the state it is in just after loading or just
before execution starts (since certain fixups are needed). But some
sections will be writable during execution, some not.
Programs can self-modify in ways designed into the run time.
The toaster has certain internal receptacles that can take
certain forks, according to some rules, which do not affect
the user operating the toaster according to the manual.
The dangers are small, but there must be reasons why a dedicated
section is normally used. gcc on Windows creates up to 19 sections, so
it would be odd for literal strings to share with code.
One reason is that PC-relative addressing can be used by code to
find its literals. Since that usually has a limited range, it helps
to keep the literals with the code. Combining sections also reduces
size. The addressing is also relocatable, which is useful in shared
libs.
candycanearter07 <candycanearter07@candycanearter07.nomail.afraid>
writes:
David Brown <david.brown@hesbynett.no> wrote at 17:56 this Thursday (GMT):
[...]
gcc has the option "-Wwrite-strings" that makes string literals in C
have "const char" array type, and thus give errors when you try to
assign to a non-const char * pointer. But the option has to be
specified explicitly (it is not in -Wall) because it changes the meaning
of the code and can cause compatibility issues with existing correct code.
-Wwrite-strings is included in -Wpedantic.
No it isn't, nor is it included in -Wall -- and it wouldn't make sense
to do so.
The -Wpedantic option is intended to produce all required diagnostics
for the specified C standard. -Wwrite-strings gives string literals the
type `const char[LENGTH]`, which enables useful diagnostics but is *non-conforming*.
For example, this program:
```
#include <stdio.h>
int main(void) {
char *s = "hello, world";
puts(s);
}
```
is valid (no diagnostic required), since it doesn't actually write to
the string literal object, but `-Wwrite-strings` causes gcc to warn
about it (because making the pointer non-const creates the potential for
an error).
Is there any reason not to always write ...
static const char *s = "hello, world";
... ?
You get all the warnings for free that way.
On 02/08/2024 02:06, Kaz Kylheku wrote:
On 2024-08-01, Bart <bc@freeuk.com> wrote:
It segfaults when the string is stored in a read-only part of the
binary.
A string literal creates an array object with static storage duration.
Any attempt to modify that array object has undefined behavior.
What's the difference between such an object, and an array like one of
these:
Programming languages can have objects that have the same lifetime,
yet some of which are mutable and some of which are immutable.
If the compiler believes that the immutable objects are in fact
not mutated, it's a bad idea to modify them behind the compiler's
back.
There doesn't have to be any actual difference in the implementation of
these objects, like in what area they are stored, other than the rules
regarding their correct use, namely prohibiting modification.
The Racket language has both mutable and immutable cons cells.
The difference is that the immutable cons cells simply lack the
operations needed to mutate them. I'm not an expert on the Racket
internals but I don't see a reason why they couldn't be stored in the
same heap.
static char A[100];
static char B[100]={1};
Do these not also have static storage duration? Yet presumably these can
be legally modified.
That 1 which initializes B[0] cannot be modified.
Why not? I haven't requested that those are 'const'. Further, gcc has no problem running this program:
#include <stdio.h>

static char A[100];
static char B[100]={1};

int main(void) {
    printf("%d %d %d\n", A[0], B[0], 1);
    A[0]=55;
    B[0]=89;
    printf("%d %d %d\n", A[0], B[0], 1);
}
But it does use readonly memory for string literals.
(The point of A and B was to represent .bss and .data segments
respectively. A's data is not part of the EXE image; B's is.
While the point of 'static' was to avoid having to specify whether A and
B were at module scope or within a function.)
That 1 which initializes B[0] cannot be modified.
Or do you literally mean the value of that '1'? Then it doesn't make
sense; here that is a copy of the literal stored in one cell of 'B'. The
value of the cell can change, then that particular copy of '1' is lost.
Here:
static char B[100] = {1, 1, 1, 1, 1, 1};
changing B[0] will not affect the 1s in B[1..5], and in my example
above, that standalone '1' is not affected.
On 8/2/24 5:43 AM, Bart wrote:
On 02/08/2024 02:06, Kaz Kylheku wrote:
On 2024-08-01, Bart <bc@freeuk.com> wrote:
It segfaults when the string is stored in a read-only part of the
binary.
A string literal creates an array object with static storage duration.
Any attempt to modify that array object has undefined behavior.
What's the difference between such an object, and an array like one of
these:
static char A[100];
static char B[100]={1};
Do these not also have static storage duration? Yet presumably these can
be legally modified.
That 1 which initializes B[0] cannot be modified.
Why not? I haven't requested that those are 'const'. ...
You don't get a choice in the matter. The C language doesn't permit
numeric literals of any kind to be modified by your code. They cannot be,
and don't need to be, declared 'const'. I've heard that in some other
languages, if you call foo(3), and foo() changes the value of its
argument to 2, then subsequent calls to bar(3) will pass a value of 2 to
bar(). That sounds like such a ridiculous mis-feature that I hesitate to
identify which languages I had heard accused of having that feature, but
it is important to note that C is not one of them.
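A small sketch of why C cannot show that behaviour: arguments are passed by value, so foo() only ever changes its own copy. The bodies of `foo` and `bar` and the driver `bar_after_foo` are hypothetical, written just to make the point concrete.

```c
/* The parameter n is a copy of the argument; changing it cannot affect the caller. */
int foo(int n) {
    n = 2;          /* modifies only foo's local copy */
    return n;
}

/* Returns its argument unchanged. */
int bar(int n) {
    return n;
}

/* Hypothetical driver: calls foo(3), then reports what bar(3) receives. */
int bar_after_foo(void) {
    (void)foo(3);
    return bar(3);
}
```

The literal 3 in the second call is unaffected by whatever foo() did to its parameter.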
Just as 1 is an integer literal whose value cannot be modified,
Richard Harnden <richard.nospam@gmail.invalid> writes:
[...]
Is there any reason not to always write ...
static const char *s = "hello, world";
... ?
You get all the warnings for free that way.
The "static", if this is at block scope, specifies that the pointer
object, not the array object, has static storage duration. If it's at
file scope it specifies that the name "s" is not visible to other
translation units. Either way, use it if that's what you want, don't
use it if it isn't.
There's no good reason not to use "const". (If string literal objects
were const, you'd have to use "const" here.)
If you also want the pointer to be const, you can write:
const char *const s = "hello, world";
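The two declarations can be compared side by side. This is only a sketch; the names `s1`, `s2`, `reseat_and_measure` and `fixed` are hypothetical, and the rejected statements are left as comments so that the block still compiles.

```c
#include <string.h>

static const char *s1 = "hello, world";        /* pointer may be reseated; chars are read-only via s1 */
static const char *const s2 = "hello, world";  /* neither the pointer nor the chars may change via s2 */

/* Hypothetical helper: reseats s1 and returns the length of the new string. */
size_t reseat_and_measure(void) {
    s1 = "bye";        /* allowed: s1 itself is not const */
    /* s2 = "bye";     rejected at compile time: s2 is a const pointer */
    /* s1[0] = 'B';    rejected at compile time: the pointee is const */
    return strlen(s1);
}

/* Hypothetical accessor for the fully const pointer. */
const char *fixed(void) {
    return s2;
}
```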
On 8/2/24 2:24 PM, Keith Thompson wrote:...
Richard Harnden <richard.nospam@gmail.invalid> writes:
[...]
Is there any reason not to always write ...
static const char *s = "hello, world";
... ?
...There's no good reason not to use "const". (If string literal objects
were const, you'd have to use "const" here.)
The one good reason to not make it const is if you are passing it
to functions that take (non-const) char* parameters but don't
actually change the parameter's contents.
On 8/2/24 14:42, Richard Damon wrote:
On 8/2/24 2:24 PM, Keith Thompson wrote:...
Richard Harnden <richard.nospam@gmail.invalid> writes:
[...]
Is there any reason not to always write ...
static const char *s = "hello, world";
... ?
...There's no good reason not to use "const". (If string literal objects
were const, you'd have to use "const" here.)
The one good reason to not make it const is if you are passing it
to functions that take (non-const) char* parameters but don't
actually change the parameter's contents.
Actually, that's not a good reason. If you can't modify the function's interface, you should use a (char*) cast, which will serve to remind
future programmers that this is a dangerous function call. You shouldn't
make the pointer's own type "char *".
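A sketch of that advice, assuming a legacy interface that cannot be changed. `legacy_length` stands in for such a function and is hypothetical, as is `greeting_length`; the cast is only safe on the assumption that the callee never writes through the pointer.

```c
#include <stddef.h>

/* Hypothetical legacy interface: takes char * but only reads the string. */
size_t legacy_length(char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Keep the pointer const; cast only at the one call that needs it. */
size_t greeting_length(void) {
    static const char *s = "hello, world";
    /* Safe only because legacy_length() is assumed never to write through s. */
    return legacy_length((char *)s);
}
```

The cast stays visible at the call site, while every other use of `s` keeps the compiler's const checking.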
On 01/08/2024 20:39, Kaz Kylheku wrote:
On 2024-08-01, Mark Summerfield <mark@qtrac.eu> wrote:
This program segfaults at the commented line:
#include <ctype.h>
#include <stdio.h>
void uppercase_ascii(char *s) {
while (*s) {
*s = toupper(*s); // SEGFAULT
s++;
}
}
int main() {
char* text = "this is a test";
The "this is a test" object is a literal. It is part of the program's
image.
So is the text here:
char text[]="this is a test";
But this can be changed without making the program self-modifying.
I guess it depends on what is classed as the program's 'image'.
I'd say the image in the state it is in just after loading or just
before execution starts (since certain fixups are needed). But some
sections will be writable during execution, some not.
When you try to change it, you're making your program self-modifying.
Program received signal SIGSEGV, Segmentation fault.
0x000055555555516e in uppercase_ascii (s=0x555555556004 "this is a test")
at inplace.c:6
6 *s = toupper(*s);
On Linux, the string literals of a C executable are located together
with the program text. They are interspersed among the machine
instructions which reference them. The program text is mapped
read-only, so an attempted modification is an access violation trapped
by the OS, turned into a SIGSEGV signal.
Does it really do that? That's the method I've used for read-only
strings, to put them into the code-segment (since I neglected to support
a dedicated read-only data section, and it's too much work now).
But I don't like it since the code section is also executable; you could inadvertently execute code within a string (which might happen to
contain machine code for other purposes).
The dangers are small, but there must be reasons why a dedicated
section is normally used. gcc on Windows creates up to 19 sections, so
it would be odd for literal strings to share with code.
For some reason I had a sort of a habit wrt const pointers:
(experimental code, no ads, raw text...)
https://pastebin.com/raw/f52a443b1
________________________________
/* Interfaces
____________________________________________________________________*/
#include <stddef.h>

struct object_prv_vtable {
    int (*fp_destroy) (void* const);
};

struct device_prv_vtable {
    int (*fp_read) (void* const, void*, size_t);
    int (*fp_write) (void* const, void const*, size_t);
};
;^)
On Fri, 2 Aug 2024 14:19:49 -0400, James Kuyper wrote:
I've heard that in some other
languages, if you call foo(3), and foo() changes the value of it's
argument to 2, then subsequent calls to bar(3) will pass a value of 2 to
bar(). That sounds like such a ridiculous mis-feature that I hesitate to
identify which languages I had heard accused of having that feature ...
I heard that, too. I think it was on some early FORTRAN compilers, on
early machine architectures, without stacks or reentrancy. And with the
weird FORTRAN argument-passing conventions.
On 8/2/24 9:31 PM, Lawrence D'Oliveiro wrote:
On Fri, 2 Aug 2024 14:19:49 -0400, James Kuyper wrote:
I've heard that in some other
languages, if you call foo(3), and foo() changes the value of its
argument to 2, then subsequent calls to bar(3) will pass a value of 2 to
bar(). That sounds like such a ridiculous mis-feature that I hesitate to
identify which languages I had heard accused of having that feature ...
I heard that, too. I think it was on some early FORTRAN compilers, on
early machine architectures, without stacks or reentrancy. And with the
weird FORTRAN argument-passing conventions.
I remember it too; it was based on the fact that all arguments were
passed by reference (so they could be either in or out parameters), and
constants were passed as pointers to the location in memory where that
constant was stored, and perhaps used elsewhere too. Why waste
precious memory to set up a temporary to be initialized and hold
the value, when you could just pass the address of a location that you
knew had the right value?
On 01/08/2024 22:42, Bart wrote:
char text[]="this is a test";
But this can be changed without making the program self-modifying.
"this is a test" is a string literal, and is typically part of the
program's image. (There are some C implementations that do things
differently, like storing such initialisation data in a compressed format.)
The array "char text[]", however, is a normal variable of type array of
char. It is most definitely not part of the program image - it is in
ram (statically allocated or on the stack, depending on the context) and
is initialised by copying the characters from the string literal (prior
to main(), or at each entry to its scope if it is a local variable).
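The "at each entry to its scope" part can be observed directly. In this sketch (both helper names are hypothetical), the automatic array starts from the literal's contents on every call, while the static array is initialised once and keeps the modification.

```c
/* Hypothetical helper: returns the first character seen on entry, then modifies the array. */
char first_char_on_entry(void) {
    char text[] = "this is a test"; /* automatic array: re-initialised on every call */
    char seen = text[0];
    text[0] = 'T';                  /* this change is lost when the function returns */
    return seen;
}

/* Same shape with a static array: initialised once, so the change persists. */
char first_char_static(void) {
    static char text[] = "this is a test";
    char seen = text[0];
    text[0] = 'T';
    return seen;
}
```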
David Brown <david.brown@hesbynett.no> wrote at 17:56 this Thursday (GMT):
gcc has the option "-Wwrite-strings" that makes string literals in C
have "const char" array type, and thus give errors when you try to
assign to a non-const char * pointer. But the option has to be
specified explicitly (it is not in -Wall) because it changes the meaning
of the code and can cause compatibility issues with existing correct code.
-Wwrite-strings is included in -Wpedantic.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 03 Aug 2024 17:07:37 -0700, Keith Thompson wrote:
... general compression isn't something I've seen ...
I recall Apple had a patent on some aspects of the “PEF” executable format
that they created for their PowerPC machines running old MacOS. This had
to do with some clever instruction encodings for loading stuff into
memory.
Is that relevant to what I asked about?
What I had in mind is something that, given this:
static int buf[] = { 1, 1, 1, ..., 1 }; // say, 1000 elements
would store something less than 1000*sizeof(int) bytes in the executable
file. It wouldn't be hard to do, but I'm not convinced it would be
worthwhile.
David Brown <david.brown@hesbynett.no> writes:
[...]
"this is a test" is a string literal, and is typically part of the
program's image. (There are some C implementations that do things
differently, like storing such initialisation data in a compressed
format.)
[...]
What implementations do that? Typically data that's all zeros isn't
stored in the image, but general compression isn't something I've seen
(not that I've paid much attention). It would save space in the image,
but it would require decompression at load time and wouldn't save any
space at run time.
On 8/2/2024 3:29 PM, Ben Bacarisse wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
For some reason I had a sort of a habit wrt const pointers:
(experimental code, no ads, raw text...)
https://pastebin.com/raw/f52a443b1
________________________________
/* Interfaces
____________________________________________________________________*/
#include <stddef.h>
struct object_prv_vtable {
    int (*fp_destroy) (void* const);
};
struct device_prv_vtable {
    int (*fp_read) (void* const, void*, size_t);
    int (*fp_write) (void* const, void const*, size_t);
};
Why? It seems like an arbitrary choice to const qualify some pointer
types and some pointed-to types (but never both).
I just wanted to get the point across that the first parameter, aka, akin
to "this" in C++ is a const pointer. Shall not be modified in any way shape
or form. It is as it is, so to speak:
void foo(struct foobar const* const self);
constant pointer to a constant foobar, fair enough?
;^)
Does the wink mean I should not take what you write seriously? If so,
please ignore my question.
The wink was meant to show my habit in basically a jestful sort of
way.
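For readers following the const placement, here is a sketch of the two cases being discussed. The struct layout and the function names `foobar_get` and `foobar_set` are hypothetical; the rejected statements are left as comments so that the block compiles.

```c
struct foobar {
    int value;   /* hypothetical member, for illustration only */
};

/* self is a const pointer to a const foobar: neither may be changed inside the function. */
int foobar_get(struct foobar const *const self) {
    /* self = 0;          rejected: the pointer parameter itself is const */
    /* self->value = 1;   rejected: the pointee is const */
    return self->value;
}

/* Here only the pointer is const; the object it designates may be modified. */
void foobar_set(struct foobar *const self, int value) {
    self->value = value;
}
```

Note that a top-level const on a parameter only constrains the function body; it is not part of the function's type as seen by callers, whereas a const on the pointed-to type is.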
On 8/4/2024 6:06 PM, Ben Bacarisse wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 8/2/2024 3:29 PM, Ben Bacarisse wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
For some reason I had a sort of a habit wrt const pointers:
(experimental code, no ads, raw text...)
https://pastebin.com/raw/f52a443b1
________________________________
/* Interfaces
____________________________________________________________________*/
#include <stddef.h>
struct object_prv_vtable {
int (*fp_destroy) (void* const);
};
struct device_prv_vtable {
int (*fp_read) (void* const, void*, size_t);
int (*fp_write) (void* const, void const*, size_t);
};
Why? It seems like an arbitrary choice to const qualify some pointer
types and some pointed-to types (but never both).
I just wanted to get the point across that the first parameter, aka, akin
to "this" in C++ is a const pointer. Shall not be modified in any way shape
or form. It is as it is, so to speak:
void foo(struct foobar const* const self);
constant pointer to a constant foobar, fair enough?
No. If you intended a const pointer to const object why didn't you
write that? My point was that the consts seem to be scattered about
without any apparent logic and you've not explained why.
;^)
Does the wink mean I should not take what you write seriously? If so,
please ignore my question.
The wink was meant to show my habit in basically a jestful sort of
way.
Your habit of what?
To write the declaration with names and the const access I want, so:
extern void (void const* const ptr);
void (void const* const ptr)
{
// ptr is a const pointer to a const void
}
On 8/5/2024 4:03 AM, Ben Bacarisse wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 8/4/2024 6:06 PM, Ben Bacarisse wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 8/2/2024 3:29 PM, Ben Bacarisse wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
For some reason I had a sort of a habit wrt const pointers:
(experimental code, no ads, raw text...)
https://pastebin.com/raw/f52a443b1
________________________________
/* Interfaces
____________________________________________________________________*/
#include <stddef.h>
struct object_prv_vtable {
int (*fp_destroy) (void* const);
};
struct device_prv_vtable {
int (*fp_read) (void* const, void*, size_t);
int (*fp_write) (void* const, void const*, size_t);
};
Why? It seems like an arbitrary choice to const qualify some pointer
types and some pointed-to types (but never both).
I just wanted to get the point across that the first parameter, aka, akin
to "this" in C++ is a const pointer. Shall not be modified in any way shape
or form. It is as it is, so to speak:
void foo(struct foobar const* const self);
constant pointer to a constant foobar, fair enough?
No. If you intended a const pointer to const object why didn't you
write that? My point was that the consts seem to be scattered about
without any apparent logic and you've not explained why.
;^)
Does the wink mean I should not take what you write seriously? If so,
please ignore my question.
The wink was meant to show my habit in basically a jestful sort of
way.
Your habit of what?
To write the declaration with names and the const access I want, so:
extern void (void const* const ptr);
void (void const* const ptr)
{
// ptr is a const pointer to a const void
}
I don't think you are following what I'm saying. If you think there
might be some value in finding out, you could ask a few questions. I
won't say it again ;-)
I must be misunderstanding you. My habit in such code was to always make
the "this" pointer wrt some of my "object" oriented code a const
pointer. This was always the first parameter:
extern void foobar(void const* const ptr);
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 03 Aug 2024 19:58:37 -0700, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 03 Aug 2024 17:07:37 -0700, Keith Thompson wrote:
... general compression isn't something I've seen ...
I recall Apple had a patent on some aspects of the “PEF” executable
format that they created for their PowerPC machines running old
MacOS. This had to do with some clever instruction encodings for
loading stuff into memory.
Is that relevant to what I asked about?
“Compression”
Was that intended to be responsive?
I must have completely missed it. Sorry about that. Please redefine?
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sun, 04 Aug 2024 23:38:14 -0700, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 03 Aug 2024 19:58:37 -0700, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 03 Aug 2024 17:07:37 -0700, Keith Thompson wrote:
... general compression isn't something I've seen ...
I recall Apple had a patent on some aspects of the “PEF” executable
format that they created for their PowerPC machines running old
MacOS. This had to do with some clever instruction encodings for
loading stuff into memory.
Is that relevant to what I asked about?
“Compression”
Was that intended to be responsive?
Hint: you have to know something about executable formats.
I am profoundly uninterested in hints.
Here's what you snipped from what I wrote upthread:
What I had in mind is something that, given this:
static int buf[] = { 1, 1, 1, ..., 1 }; // say, 1000 elements
would store something less than 1000*sizeof(int) bytes in the executable
file. It wouldn't be hard to do, but I'm not convinced it would be
worthwhile.
There's a lot I don't know about executable formats, and you seem
uninterested in doing more than showing off your presumed knowledge
without actually sharing it. Others have already answered my direct
question (Richard Damon and David Brown mentioned implementations
that use simple run-length encoding, and David gave some reasons
why it could be useful), so you can stop wasting everyone's time.
On 05/08/2024 23:40, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sun, 04 Aug 2024 23:38:14 -0700, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 03 Aug 2024 19:58:37 -0700, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 03 Aug 2024 17:07:37 -0700, Keith Thompson wrote:
... general compression isn't something I've seen ...
I recall Apple had a patent on some aspects of the “PEF” executable
format that they created for their PowerPC machines running old
MacOS. This had to do with some clever instruction encodings for
loading stuff into memory.
Is that relevant to what I asked about?
“Compression”
Was that intended to be responsive?
Hint: you have to know something about executable formats.
I am profoundly uninterested in hints.
Here's what you snipped from what I wrote upthread:
What I had in mind is something that, given this:
static int buf[] = { 1, 1, 1, ..., 1 }; // say, 1000 elements
would store something less than 1000*sizeof(int) bytes in the
executable file. It wouldn't be hard to do, but I'm not convinced it
would be worthwhile.
There's a lot I don't know about executable formats, and you seem
uninterested in doing more than showing off your presumed knowledge
without actually sharing it. Others have already answered my direct
question (Richard Damon and David Brown mentioned implementations
that use simple run-length encoding, and David gave some reasons
why it could be useful), so you can stop wasting everyone's time.
Storing those 1000 integers is normally going to take 4000 bytes (at
least, since data sections may be rounded up etc).
Doing it in under 4000 bytes would require some extra help. Who or what
is going to do that, and at what point?
There are two lots of support needed:
(1) Some process needs to run either while generating the EXE, or
compressing an existing EXE, to convert that data into a more compact form
(2) When launched, some other process is needed to decompress the data
before reaching the normal entry point.
I can tell you that nothing about Windows' EXE format will help here for either (1) or (2), since it would need support from the OS loader to decompress any data, and that doesn't exist.
So it would presumably need to be done by some extra code that is added
to the executable, that needs to be arranged to run as part of the
user-code.
A compiler that supports such compression could do this job: compressing
sections, and then generating extra code, which must be called
first, which decompresses those sections.
Or an external utility like UPX can be applied, which typically reduces
the size of an EXE by 2/3 (both code /and/ data), and which
transparently expands it when launched.
So, with the existence of such a utility, I wouldn't even bother trying
it within a compiler.
On 8/6/2024 4:29 AM, Ben Bacarisse wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
I must have completely missed it. Sorry about that. Please redefine?
It's going to seem silly after all these exchanges. I simply wanted to
know why you chose to use const as you originally posted:
| struct object_prv_vtable {
| int (*fp_destroy) (void* const);
| int (*fp_read) (void* const, void*, size_t);
| int (*fp_write) (void* const, void const*, size_t);
| };
because that looks peculiar (to the point of being arbitrary) to me.
You went on to talk about "self" pointers being const pointers to const
void, but that was not what you wrote, so it did not address what I was
asking about.
In general, const qualified argument types are rarely used and are even
more rarely used in function (or type) declarations because they have
no effect at all in that position. For example, I can assign fp_destroy
from a function declared without the const-qualified parameter:
int destroy(void *self) { /* ... */; return 1; }
...
vtab.fp_destroy = destroy;
or, if I do want the compiler to check that the function does not alter
its parameter, I can add the const in the function definition (where it
can be useful) even if it is missing from the declaration:
struct object_prv_vtable {
int (*fp_destroy) (void*);
/* ... */
};
int destroy(void *const self) { /* ... */; return 1; }
...
vtab.fp_destroy = destroy;
But if you want the const there so that the declaration matches the
function definition, why not do that for all the parameters? Basically,
I would have expected either this (just one const where it matters):
struct object_prv_vtable {
int (*fp_destroy) (void *);
int (*fp_read) (void *, void *, size_t);
int (*fp_write) (void *, void const *, size_t);
};
and the actual functions that get assigned to these pointers might, if
you want that extra check, have all their parameters marked const. Or,
for consistency, you might have written
struct object_prv_vtable {
int (*fp_destroy) (void * const);
int (*fp_read) (void * const, void * const, size_t const);
int (*fp_write) (void * const, void const * const, size_t const);
};
even if none of the actual functions have const parameters.
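The point about top-level const in parameter declarations can be checked with a small sketch. The struct below reuses the first declaration from the original post; the names destroy_plain, destroy_checked and vtable_demo are made up for illustration. Both functions should be accepted as initializers for the same member, because top-level qualifiers on parameters are ignored when function types are compared for compatibility.

```c
#include <assert.h>
#include <stddef.h>

/* Declaration keeps the top-level const from the original post. */
struct object_prv_vtable {
    int (*fp_destroy)(void *const);
};

/* Definition without const: still a compatible function type. */
static int destroy_plain(void *self) {
    (void)self;
    return 1;
}

/* Definition with const: the qualifier only stops the body from
   reassigning `self`; callers cannot tell the difference. */
static int destroy_checked(void *const self) {
    (void)self;
    return 2;
}

/* Hypothetical helper: assigns both functions to the same member. */
int vtable_demo(void) {
    struct object_prv_vtable vtab;
    int total = 0;

    vtab.fp_destroy = destroy_plain;    /* no diagnostic expected */
    total += vtab.fp_destroy(NULL);
    vtab.fp_destroy = destroy_checked;  /* no diagnostic expected */
    total += vtab.fp_destroy(NULL);

    assert(total == 3);
    return total;
}
```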
Finally, if you had intended to write what you later went on to talk
about, you would have written either
struct object_prv_vtable {
int (*fp_destroy) (const void *);
int (*fp_read) (const void *, void *, size_t);
int (*fp_write) (const void *, void const *, size_t);
};
or
struct object_prv_vtable {
int (*fp_destroy) (const void * const);
int (*fp_read) (const void * const, void * const, size_t const);
int (*fp_write) (const void * const, void const * const, size_t const);
};
TL;DR: where you put the consts in the original just seemed arbitrary.
I'll also note that the term "const pointer" is often used when the
pointer is not const! It most often means that the pointed-to type is
const qualified. As such, it's best to avoid the term altogether.
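Since the term is ambiguous, it may help to spell out the three placements side by side. This is only a sketch; the variable and function names are invented for the example, and the commented-out lines are the ones a compiler should reject.

```c
#include <assert.h>

/* Hypothetical demo of where const can sit in a pointer declaration. */
int const_positions_demo(void) {
    int value = 1;
    int other = 2;

    const int *to_const = &value;   /* pointed-to int is const          */
    to_const = &other;              /* ok: the pointer can be reseated  */
    /* *to_const = 3;                  error: pointee is read-only      */

    int *const const_ptr = &value;  /* the pointer itself is const      */
    *const_ptr = 10;                /* ok: the pointee is writable      */
    /* const_ptr = &other;             error: pointer is read-only      */

    const int *const both = &other; /* neither may be changed via both  */

    assert(value == 10);
    return *to_const + *const_ptr + *both;  /* 2 + 10 + 2 */
}
```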
I wanted to get across that the pointer value for the first parameter
itself should not be modified. I read (void* const) as a const pointer to a "non-const" void. Now a const pointer to a const void is (void const*
const), from my code, notice the first parameter?
I consider the first parameter to be special in this older OO experiment of mine. It shall not be modified, so I wrote it into the API:
On 8/2/24 9:31 PM, Lawrence D'Oliveiro wrote:
On Fri, 2 Aug 2024 14:19:49 -0400, James Kuyper wrote:
I've heard that in some other
languages, if you call foo(3), and foo() changes the value of its
argument to 2, then subsequent calls to bar(3) will pass a value of 2 to
bar(). That sounds like such a ridiculous mis-feature that I hesitate to
identify which languages I had heard accused of having that feature ...
I heard that, too. I think it was on some early FORTRAN compilers, on
early machine architectures, without stacks or reentrancy. And with the
weird FORTRAN argument-passing conventions.
I remember it too, and it was based on the fact that all arguments were
passed by reference (so they could be either in or out parameters), and
constants were passed as pointers to the location in memory where that
constant was stored, and perhaps used elsewhere too. Why waste
precious memory to set up a temporary to be initialized and hold
the value, when you could just pass the address of a location that you
knew had the right value.
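The effect described above can be imitated in C, purely as a simulation: C itself passes arguments by value, so the shared cell and the by-reference calls are written out by hand here. The names shared_constant_3, foo, bar and by_reference_constant_demo are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stands in for the one memory cell holding the constant 3. */
static int shared_constant_3 = 3;

/* Callee that modifies its by-reference argument. */
static void foo(int *arg) {
    assert(arg != NULL);
    *arg = 2;
}

/* Callee that only reads its by-reference argument. */
static int bar(const int *arg) {
    assert(arg != NULL);
    return *arg;
}

/* Simulates "foo(3); bar(3);" when every 3 shares one storage cell. */
int by_reference_constant_demo(void) {
    shared_constant_3 = 3;           /* start from a clean state     */
    foo(&shared_constant_3);         /* "foo(3)" clobbers the cell   */
    return bar(&shared_constant_3);  /* "bar(3)" now sees 2          */
}
```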
On 8/3/24 10:58 PM, Keith Thompson wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
On Sat, 03 Aug 2024 17:07:37 -0700, Keith Thompson wrote:
... general compression isn't something I've seen ...
I recall Apple had a patent on some aspects of the “PEF”
executable format that they created for their PowerPC machines
running old MacOS. This had to do with some clever instruction
encodings for loading stuff into memory.
Is that relevant to what I asked about?
What I had in mind is something that, given this:
static int buf[] = { 1, 1, 1, ..., 1 }; // say, 1000 elements
would store something less than 1000*sizeof(int) bytes in the
executable file. It wouldn't be hard to do, but I'm not convinced
it would be worthwhile.
I vaguely seem to remember an embedded format that did something like
this. The .init segment that was "copied" to the .data segment has
a simple run-length encoding option. For non-repetitive data, it
just encoded 1 copy of length n. But it could also encode repeats
like your example. When EPROM was a scarce commodity squeezing out a
bit of size for the .init segment was useful.
My guess is that since it didn't persist, it didn't actually help that
much.
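For readers who want to picture the run-length idea, here is a rough sketch of a startup-time expander. The record layout (struct rle_run) and the function name rle_expand are invented for this example and do not describe any particular toolchain's format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: `count` consecutive copies of `value`. */
struct rle_run {
    size_t count;
    int value;
};

/* Expand `nruns` records into dst, which has room for dst_len ints.
   Returns the number of ints written, or 0 if the data would not fit. */
size_t rle_expand(const struct rle_run *runs, size_t nruns,
                  int *dst, size_t dst_len) {
    size_t out = 0;

    assert(runs != NULL && dst != NULL);
    for (size_t i = 0; i < nruns; i++) {
        if (runs[i].count > dst_len - out) {
            return 0;  /* refuse to write past the end of dst */
        }
        for (size_t j = 0; j < runs[i].count; j++) {
            dst[out++] = runs[i].value;
        }
    }
    return out;
}
```

With this layout, the 1000-element array of ones from the question needs a single record in the image instead of 1000*sizeof(int) bytes, at the cost of running the expander before the data is first used.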
On 8/2/24 2:58 PM, James Kuyper wrote:
On 8/2/24 14:42, Richard Damon wrote:
On 8/2/24 2:24 PM, Keith Thompson wrote:
Richard Harnden <richard.nospam@gmail.invalid> writes:
[...]
Is there any reason not to always write ...
static const char *s = "hello, world";
... ?
...
There's no good reason not to use "const". (If string literal objects
were const, you'd have to use "const" here.)
...
The one good reason to not make it const is if you are passing it
to functions that take (non-const) char* parameters that don't
actually change that parameter's contents.
Actually, that's not a good reason. If you can't modify the function's
interface, you should use a (char*) cast, which will serve to remind
future programmers that this is a dangerous function call. You shouldn't
make the pointer's own type "char *".
Depends on the library and how many times it is used. It may be a
perfectly safe call, as the function is defined not to change its
parameter, but being external code the signature might not be fixable.
Adding the cast at each call, may cause a "crying wolf" response that
trains people to just add the cast where it seems to be needed (even
if not warranted).
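One possible middle ground between the two positions, not proposed by either poster, is to keep the pointer const and confine the cast to a single wrapper. The sketch below assumes a legacy routine, here called legacy_length, that is documented not to modify its argument; both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical legacy routine: only reads its argument, but its
   signature was written without const and cannot be changed. */
size_t legacy_length(char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Const-correct wrapper: the one (char *) cast lives here, next to a
   comment saying why it is safe, instead of at every call site. */
size_t checked_length(const char *s) {
    assert(s != NULL);
    /* Safe only because legacy_length() never writes through s. */
    return legacy_length((char *)s);
}
```

Callers can then keep `const char *` everywhere and call checked_length(s) without repeating the cast.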
On 8/2/24 2:24 PM, Keith Thompson wrote:
Richard Harnden <richard.nospam@gmail.invalid> writes:
[...]
Is there any reason not to always write ...
static const char *s = "hello, world";
... ?
You get all the warnings for free that way.
The "static", if this is at block scope, specifies that the pointer
object, not the array object, has static storage duration. If it's at
file scope it specifies that the name "s" is not visible to other
translation units. Either way, use it if that's what you want, don't
use it if it isn't.
There's no good reason not to use "const". (If string literal objects
were const, you'd have to use "const" here.)
If you also want the pointer to be const, you can write:
const char *const s = "hello, world";
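A short sketch of the three declarations discussed above may make the differences easier to see. The function and variable names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* File scope: static gives the name internal linkage, so other
   translation units cannot see `greeting`. */
static const char *greeting = "hello, world";

/* Block scope: static gives the pointer object static storage
   duration, so it remembers its position between calls. */
char next_greeting_char(void) {
    static const char *s = "hello, world";
    char c = *s;
    if (c != '\0') {
        s++;  /* allowed: only the pointed-to chars are const */
    }
    return c;
}

char first_greeting_char(void) {
    const char *const fixed = greeting;  /* const pointer to const char */
    /* fixed++;        error: the pointer itself is const   */
    /* *fixed = 'H';   error: the pointed-to char is const  */
    assert(fixed != NULL);
    return *fixed;
}
```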
The one good reason to not make it const is if you are passing it
to functions that take (non-const) char* parameters that don't
actually change that parameter's contents.
These may still exist in legacy code since so far nothing has required
them to change.
Perhaps it is getting to the point that the language needs to abandon
support for that ancient code, and force "const correctness" (which I
admit some will call const-pollution) onto code, first with a formal deprecation period, where implementations are strongly suggested to
make the violation of the rule a warning, and then later changing the
type of string constants.
On 2024-08-01, Bart <bc@freeuk.com> wrote:
On 01/08/2024 20:39, Kaz Kylheku wrote:
On 2024-08-01, Mark Summerfield <mark@qtrac.eu> wrote:
This program segfaults at the commented line:
#include <ctype.h>
#include <stdio.h>
void uppercase_ascii(char *s) {
while (*s) {
*s = toupper(*s); // SEGFAULT
s++;
}
}
int main() {
char* text = "this is a test";
The "this is a test" object is a literal. It is part of the
program's image.
So is the text here:
char text[]="this is a test";
But this can be changed without making the program self-modifying.
The array which is initialized by the literal is what can be
changed.
In this situation, the literal is just initializer syntax,
not required to be an object with an address.
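To make the distinction concrete, here is a sketch of the original function applied to an array that is initialized from the literal. The helper uppercase_demo is a made-up name; the cast to unsigned char is an addition that avoids undefined behavior in toupper() for negative char values.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Uppercase a NUL-terminated string in place.
   The caller must pass writable storage, not a string literal. */
void uppercase_ascii(char *s) {
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        *s = (char)toupper((unsigned char)*s);
    }
}

/* Returns 1 if the writable-array version behaves as expected. */
int uppercase_demo(void) {
    /* The array owns its own copy of the bytes, so writing is fine.
       `char *text = "..."` would instead point at the literal. */
    char text[] = "this is a test";

    uppercase_ascii(text);
    return strcmp(text, "THIS IS A TEST") == 0;
}
```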
A string literal creates an array object with static storage
duration. [...]
candycanearter07 <candycanearter07@candycanearter07.nomail.afraid>
writes:
David Brown <david.brown@hesbynett.no> wrote at 17:56 this Thursday (GMT):
[...]
gcc has the option "-Wwrite-strings" that makes string literals in
C have "const char" array type, and thus give errors when you try
to assign to a non-const char * pointer. But the option has to be
specified explicitly (it is not in -Wall) because it changes the
meaning of the code and can cause compatibility issues with
existing correct code.
-Wwrite-strings is included in -Wpedantic.
No it isn't, nor is it included in -Wall -- and it wouldn't make
sense to do so.
The -Wpedantic option is intended to produce all required
diagnostics for the specified C standard. -Wwrite-strings
gives string literals the type `const char[LENGTH]`, which
enables useful diagnostics but is *non-conforming*.
Richard Harnden <richard.nospam@gmail.invalid> writes:
[...]
Is there any reason not to always write ...
static const char *s = "hello, world";
... ?
You get all the warnings for free that way.
The "static", if this is at block scope, specifies that the
pointer object, not the array object, has static storage duration.
If it's at file scope it specifies that the name "s" is not
visible to other translation units. Either way, use it if that's
what you want, don't use it if it isn't.
There's no good reason not to use "const". [...]
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
candycanearter07 <candycanearter07@candycanearter07.nomail.afraid>
writes:
David Brown <david.brown@hesbynett.no> wrote at 17:56 this Thursday (GMT):
[...]
gcc has the option "-Wwrite-strings" that makes string literals in
C have "const char" array type, and thus give errors when you try
to assign to a non-const char * pointer. But the option has to be
specified explicitly (it is not in -Wall) because it changes the
meaning of the code and can cause compatibility issues with
existing correct code.
-Wwrite-strings is included in -Wpedantic.
No it isn't, nor is it included in -Wall -- and it wouldn't make
sense to do so.
The -Wpedantic option is intended to produce all required
diagnostics for the specified C standard. -Wwrite-strings
gives string literals the type `const char[LENGTH]`, which
enables useful diagnostics but is *non-conforming*.
As long as the -Wwrite-strings diagnostics are only warnings the
result is still conforming.
It's not just about diagnostics. This program:
#include <stdio.h>
int main(void) {
puts(_Generic("hello",
char*: "char*",
const char*: "const char*",
default: "?"));
}
must print "char*" in a conforming implementation. With
(gcc|clang) -Wwrite-strings, it prints "const char*".
And something as simple as:
char *p = "hello";
is rejected with a fatal error with "-Wwrite-strings -pedantic-errors".
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
candycanearter07 <candycanearter07@candycanearter07.nomail.afraid>
writes:
David Brown <david.brown@hesbynett.no> wrote at 17:56 this Thursday (GMT):
[...]
gcc has the option "-Wwrite-strings" that makes string literals in
C have "const char" array type, and thus give errors when you try
to assign to a non-const char * pointer. But the option has to be
specified explicitly (it is not in -Wall) because it changes the
meaning of the code and can cause compatibility issues with
existing correct code.
-Wwrite-strings is included in -Wpedantic.
No it isn't, nor is it included in -Wall -- and it wouldn't make
sense to do so.
The -Wpedantic option is intended to produce all required
diagnostics for the specified C standard. -Wwrite-strings
gives string literals the type `const char[LENGTH]`, which
enables useful diagnostics but is *non-conforming*.
As long as the -Wwrite-strings diagnostics are only warnings the
result is still conforming.
It's not just about diagnostics. This program:
#include <stdio.h>
int main(void) {
puts(_Generic("hello",
char*: "char*",
const char*: "const char*",
default: "?"));
}
must print "char*" in a conforming implementation. With
(gcc|clang) -Wwrite-strings, it prints "const char*".
Good point. I hadn't considered such cases.
And something as simple as:
char *p = "hello";
is rejected with a fatal error with "-Wwrite-strings -pedantic-errors".
That violates the "As long as the -Wwrite-strings diagnostics are
only warnings" condition.
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
A string literal creates an array object with static storage
duration. [...]
A small quibble. Every string literal does sit in an array,
but it might not be a _new_ array, because different string
literals are allowed to overlap as long as the bytes in the
overlapping arrays have the right values.
On 12/08/2024 22:11, Tim Rentsch wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
A string literal creates an array object with static storage
duration. [...]
A small quibble. Every string literal does sit in an array,
but it might not be a _new_ array, because different string
literals are allowed to overlap as long as the bytes in the
overlapping arrays have the right values.
And this is exactly why string literals should always have been
const.
A compiler is entitled to share memory between strings, so given
puts("lap");
puts("overlap");
it's entitled to make them overlap. Then add
char * p = "lap";
*p='X';
and it can overwrite the shared string, I think, which would
mean that writing "lap" again would have a different result.
But that ship has sailed. I'm not even sure const had been
invented that far back!
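Whether a given compiler actually shares storage can be observed without modifying anything, by comparing addresses. The sketch below uses invented function names; each function may return 0 or 1 depending on the implementation and its options, so no particular result should be relied on.

```c
#include <assert.h>

/* Returns 1 if two identical literals share one array, else 0.
   Either answer is permitted. */
int identical_literals_share(void) {
    const char *a = "lap";
    const char *b = "lap";
    return a == b;
}

/* Returns 1 if "lap" is stored as the tail of "overlap", else 0.
   Either answer is permitted. */
int suffix_literal_shares(void) {
    const char *whole = "overlap";
    const char *tail = "lap";

    assert(whole[4] == 'l');  /* "over" is 4 chars, so +4 is "lap" */
    return tail == whole + 4;
}
```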
On 2024-08-01, Mark Summerfield <mark@qtrac.eu> wrote:
This program segfaults at the commented line:
#include <ctype.h>
#include <stdio.h>
void uppercase_ascii(char *s) {
while (*s) {
*s = toupper(*s); // SEGFAULT
s++;
}
}
int main() {
char* text = "this is a test";
The "this is a test" object is a literal. It is part of the
program's image. When you try to change it, you're making your
program self-modifying.
The ISO C language standard doesn't require implementations to
support self-modifying programs; the behavior is left undefined.
It could work in some documented, reliable way, in a given
implementation.
It's the same with any other constant in the program. [...]
Just as 1 is an integer literal whose value cannot be modified,
[...]
In 20/20 hindsight, my personal opinion is that it would have been
better to make string literals const in C89/C90.
I can't speak for most people, but I want string literals to be const
and I've thought about both sides of the equation. (Existing code could
be compiled with options to enable the old behavior and could be changed incrementally.)
In 20/20 hindsight, my personal opinion is that it would have been
better to make string literals const in C89/C90. Compilers could
still accept old const-incorrect code with a non-fatal warning,
and programmers would be encouraged but not immediately forced to
use const.
This could still be done in C2y, but I'm not aware of any proposals.
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
Just as 1 is an integer literal whose value cannot be modified,
[...]
The C language doesn't have integer literals. C has string
literals, and compound literals, and it has integer constants.
But C does not have integer literals.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
Just as 1 is an integer literal whose value cannot be modified,
[...]
The C language doesn't have integer literals. C has string
literals, and compound literals, and it has integer constants.
But C does not have integer literals.
Technically correct (but IMHO not really worth worrying about).
There is a proposal for C2y, authored by Jens Gustedt, to change the
term "constant" to "literal" for character, integer, and floating
constants. (I think it's a good idea.)
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3239.htm>
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
Just as 1 is an integer literal whose value cannot be modified,
[...]
The C language doesn't have integer literals. C has string
literals, and compound literals, and it has integer constants.
But C does not have integer literals.
True, but C++ does, and it means the same thing by "integer literal"
that C means by "integer constant".
C doesn't define the term "integer
literal" with any conflicting meaning, and my use of the C++ terminology allowed me to make the parallel with string literals clearer, so I don't
see any particular problem with my choice of words.
Is there any reason not to always write ...
static const char *s = "hello, world";
... ?
You get all the warnings for free that way.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Richard Harnden <richard.nospam@gmail.invalid> writes:
[...]
Is there any reason not to always write ...
static const char *s = "hello, world";
... ?
You get all the warnings for free that way.
The "static", if this is at block scope, specifies that the
pointer object, not the array object, has static storage duration.
If it's at file scope it specifies that the name "s" is not
visible to other translation units. Either way, use it if that's
what you want, don't use it if it isn't.
There's no good reason not to use "const". [...]
Other people have different opinions on that question.
You could have told us your opinion. You could have explained why
someone might have a different opinion. You could have given us a
good reason not to use "const", assuming there is such a reason.
You know the language well enough to make me suspect you might
have something specific in mind. [...]
On 28.09.2024 05:34, Keith Thompson wrote:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
The more C is changed to resemble C++ the worse it becomes. It
isn't surprising that you like it.
For context, since the parent article is from a month and a half
ago, I was discussing a proposal to change a future C standard to
refer to "constants" as "literals". I mentioned that I think it's
a good idea.
I've heard of and seen various forms to name such entities...
- in a Pascal and an Eiffel book I find all these named "constants"
- in an Algol 68 book I read about "standard designations"
- in a book about languages and programming in general I find
"literals" ("abc"), "numerals" (42), "word-symbols" (false),
"graphemes" (), etc., differentiated
- I've also heard about "standard representations [for the
values of a respective type]"; also a type-independent term
I also think (for various reasons) that "constants" is not a good
term. (Personally I like terms like the Algol 68 term, that seems
to "operate" on another [more conceptual] abstraction level.)
But you'll certainly have to expect a lot of anger if the terminology
of some standards documents gets changed from one version to another.
Janis
Phillip Frabott <nntp@fulltermprivacy.com> writes:
In reply to "Janis Papanagnou" who wrote the following:
[...]
I also think (for various reasons) that "constants" is not a good
term. (Personally I like terms like the Algol 68 term, that seems
to "operate" on another [more conceptual] abstraction level.)
But you'll certainly have to expect a lot of anger if the terminology
of some standards documents gets changed from one version to another.
The only gripe I would have if we synonymized constants and literals
is that not every const is initialized with a literal. There have been times where I have initialized a const from the value of a variable. I don't think that const and literals are the same thing because of
this.
Though the word "const" is obviously derived from the English word "constant", in C "const" and "constant" are very different things.
The "const" keyword really means "read-only" (and perhaps would have
been clearer if it had been spelled "readonly").
A "constant" is what some languages call a "literal", and a "constant expression" is an expression that can be evaluated at compile time.
For example, this:
const int r = rand();
is perfectly valid.
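The difference shows up as soon as the language requires a constant expression. In this sketch the names TABLE_LEN and const_vs_constant_demo are made up, and the commented-out lines are the ones a C compiler should reject.

```c
#include <assert.h>
#include <stdlib.h>

enum { TABLE_LEN = 8 };  /* an integer constant expression */

int const_vs_constant_demo(void) {
    const int r = rand();         /* valid: read-only, set at run time   */
    static int table[TABLE_LEN];  /* valid: size is a constant expression */

    /* static int bad[r];   invalid: r is const, but not a constant      */
    /* r = 0;               invalid: r is read-only                      */

    assert(r >= 0);  /* rand() returns a value in [0, RAND_MAX] */
    return (int)(sizeof table / sizeof table[0]);
}
```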
The more C is changed to resemble C++ the worse it becomes. It
isn't surprising that you like it.
I presume that was intended as a personal insult.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
The more C is changed to resemble C++ the worse it becomes. It
isn't surprising that you like it.
For context, since the parent article is from a month and a half
ago, I was discussing a proposal to change a future C standard to
refer to "constants" as "literals". I mentioned that I think it's
a good idea.