* README.Portability: New file.
From-SVN: r35039
This commit is contained in:
parent
245aefecd3
commit
9139939530
@ -1,3 +1,7 @@
|
||||
2000-07-15 Neil Booth <NeilB@earthling.net>
|
||||
|
||||
* README.Portability: New file.
|
||||
|
||||
Fri Jul 14 18:13:53 2000 Mark P Mitchell <mark@codesourcery.com>
|
||||
|
||||
* INSTALL: Give special instructions for building GCC on Irix 6.
|
||||
|
369
gcc/README.Portability
Normal file
369
gcc/README.Portability
Normal file
@ -0,0 +1,369 @@
|
||||
Copyright (C) 2000 Free Software Foundation, Inc.
|
||||
|
||||
This file is intended to contain a few notes about writing C code
|
||||
within GCC so that it compiles without error on the full range of
|
||||
compilers GCC needs to be able to compile on.
|
||||
|
||||
The problem is that many ISO-standard constructs are not accepted by
|
||||
either old or buggy compilers, and we keep getting bitten by them.
|
||||
This knowledge until know has been sparsely spread around, so I
|
||||
thought I'd collect it in one useful place. Please add and correct
|
||||
any problems as you come across them.
|
||||
|
||||
I'm going to start from a base of the ISO C89 standard, since that is
|
||||
probably what most people code to naturally. Obviously using
|
||||
constructs introduced after that is not a good idea.
|
||||
|
||||
The first section of this file deals strictly with portability issues,
|
||||
the second with common coding pitfalls.
|
||||
|
||||
|
||||
Portability Issues
|
||||
==================
|
||||
|
||||
Unary +
|
||||
-------
|
||||
|
||||
K+R C compilers and preprocessors have no notion of unary '+'. Thus
|
||||
the following code snippet contains 2 portability problems.
|
||||
|
||||
int x = +2; /* int x = 2; */
|
||||
#if +1 /* #if 1 */
|
||||
#endif
|
||||
|
||||
|
||||
Pointers to void
|
||||
----------------
|
||||
|
||||
K+R C compilers did not have a void pointer, and used char * as the
|
||||
pointer to anything. The macro PTR is defined as either void * or
|
||||
char * depending on whether you have a standards compliant compiler or
|
||||
a K+R one. Thus
|
||||
|
||||
free ((void *) h->value.expansion);
|
||||
|
||||
should be written
|
||||
|
||||
free ((PTR) h->value.expansion);
|
||||
|
||||
|
||||
String literals
|
||||
---------------
|
||||
|
||||
K+R C did not allow concatenation of string literals like
|
||||
|
||||
"This is a " "single string literal".
|
||||
|
||||
Moreover, some compilers like MSVC++ have fairly low limits on the
|
||||
maximum length of a string literal; 509 is the lowest we've come
|
||||
across. You may need to break up a long printf statement into many
|
||||
smaller ones.
|
||||
|
||||
|
||||
Empty macro arguments
|
||||
---------------------
|
||||
|
||||
ISO C (6.8.3 in the 1990 standard) specifies the following:
|
||||
|
||||
If (before argument substitution) any argument consists of no
|
||||
preprocessing tokens, the behavior is undefined.
|
||||
|
||||
This was relaxed by ISO C99, but some older compilers emit an error,
|
||||
so code like
|
||||
|
||||
#define foo(x, y) x y
|
||||
foo (bar, )
|
||||
|
||||
needs to be coded in some other way.
|
||||
|
||||
|
||||
signed keyword
|
||||
--------------
|
||||
|
||||
The signed keyword did not exist in K+R comilers, it was introduced in
|
||||
ISO C89, so you cannot use it. In both K+R and standard C,
|
||||
unqualified char and bitfields may be signed or unsigned. There is no
|
||||
way to portably declare signed chars or signed bitfields.
|
||||
|
||||
All other arithmetic types are signed unless you use the 'unsigned'
|
||||
qualifier. For instance, it is safe to write
|
||||
|
||||
short paramc;
|
||||
|
||||
instead of
|
||||
|
||||
signed short paramc;
|
||||
|
||||
If you have an algorithm that depends on signed char or signed
|
||||
bitfields, you must find another way to write it before it can be
|
||||
integrated into GCC.
|
||||
|
||||
|
||||
Function prototypes
|
||||
-------------------
|
||||
|
||||
You need to provide a function prototype for every function before you
|
||||
use it, and functions must be defined K+R style. The function
|
||||
prototype should use the PARAMS macro, which takes a single argument.
|
||||
Therefore the parameter list must be enclosed in parentheses. For
|
||||
example,
|
||||
|
||||
int myfunc PARAMS ((double, int *));
|
||||
|
||||
int
|
||||
myfunc (var1, var2)
|
||||
double var1;
|
||||
int *var2;
|
||||
{
|
||||
...
|
||||
}
|
||||
|
||||
You also need to use PARAMS when referring to function protypes in
|
||||
other circumstances, for example see "Calling functions through
|
||||
pointers to functions" below.
|
||||
|
||||
Variable-argument functions are best described by example:-
|
||||
|
||||
void cpp_ice PARAMS ((cpp_reader *, const char *msgid, ...));
|
||||
|
||||
void
|
||||
cpp_ice VPARAMS ((cpp_reader *pfile, const char *msgid, ...))
|
||||
{
|
||||
#ifndef ANSI_PROTOTYPES
|
||||
cpp_reader *pfile;
|
||||
const char *msgid;
|
||||
#endif
|
||||
va_list ap;
|
||||
|
||||
VA_START (ap, msgid);
|
||||
|
||||
#ifndef ANSI_PROTOTYPES
|
||||
pfile = va_arg (ap, cpp_reader *);
|
||||
msgid = va_arg (ap, const char *);
|
||||
#endif
|
||||
|
||||
...
|
||||
va_end (ap);
|
||||
}
|
||||
|
||||
For the curious, here are the definitions of the above macros. See
|
||||
ansidecl.h for the definitions of the above macros and more.
|
||||
|
||||
#define PARAMS(paramlist) paramlist /* ISO C. */
|
||||
#define VPARAMS(args) args
|
||||
|
||||
#define PARAMS(paramlist) () /* K+R C. */
|
||||
#define VPARAMS(args) (va_alist) va_dcl
|
||||
|
||||
|
||||
Calling functions through pointers to functions
|
||||
-----------------------------------------------
|
||||
|
||||
K+R C compilers require brackets around the dereferenced pointer
|
||||
variable. For example
|
||||
|
||||
typedef void (* cl_directive_handler) PARAMS ((cpp_reader *, const char *));
|
||||
p->handler (pfile, p->arg);
|
||||
|
||||
needs to become
|
||||
|
||||
(p->handler) (pfile, p->arg);
|
||||
|
||||
|
||||
Macros
|
||||
------
|
||||
|
||||
The rules under K+R C and ISO C for achieving stringification and
|
||||
token pasting are quite different. Therefore some macros have been
|
||||
defined which will get it right depending upon the compiler.
|
||||
|
||||
CONCAT2(a,b) CONCAT3(a,b,c) and CONCAT4(a,b,c,d)
|
||||
|
||||
will paste the tokens passed as arguments. You must not leave any
|
||||
space around the commas. Also,
|
||||
|
||||
STRINGX(x)
|
||||
|
||||
will stringify an argument; to get the same result on K+R and ISO
|
||||
compilers x should not have spaces around it.
|
||||
|
||||
|
||||
Enums
|
||||
-----
|
||||
|
||||
In K+R C, you have to cast enum types to use them as integers, and
|
||||
some compilers in particular give lots of warnings for using an enum
|
||||
as an array index.
|
||||
|
||||
Bitfields
|
||||
---------
|
||||
|
||||
See also "signed keyword" above. In K+R C only unsigned int bitfields
|
||||
were defined (i.e. unsigned char, unsigned short, unsigned long.
|
||||
Using plain int/short/long was not allowed).
|
||||
|
||||
|
||||
free and realloc
|
||||
----------------
|
||||
|
||||
Some implementations crash upon attempts to free or realloc the null
|
||||
pointer. Thus if mem might be null, you need to write
|
||||
|
||||
if (mem)
|
||||
free (mem);
|
||||
|
||||
|
||||
Reserved Keywords
|
||||
-----------------
|
||||
|
||||
K+R C has "entry" as a reserved keyword, so you should not use it for
|
||||
your variable names.
|
||||
|
||||
|
||||
Type promotions
|
||||
---------------
|
||||
|
||||
K+R used unsigned-preserving rules for arithmetic expresssions, while
|
||||
ISO uses value-preserving. This means an unsigned char compared to an
|
||||
int is done as an unsigned comparison in K+R (since unsigned char
|
||||
promotes to unsigned) while it is signed in ISO (since all of the
|
||||
values in unsigned char fit in an int, it promotes to int).
|
||||
|
||||
** Not having any argument whose type is a short type (char, short,
|
||||
float of any flavor) and subject to promotion. **
|
||||
|
||||
Trigraphs
|
||||
---------
|
||||
|
||||
You weren't going to use them anyway, but trigraphs were not defined
|
||||
in K+R C, and some otherwise ISO C compliant compilers do not accept
|
||||
them.
|
||||
|
||||
|
||||
Suffixes on Integer Constants
|
||||
-----------------------------
|
||||
|
||||
**Using a 'u' suffix on integer constants.**
|
||||
|
||||
|
||||
errno
|
||||
-----
|
||||
|
||||
errno might be declared as a macro.
|
||||
|
||||
|
||||
Common Coding Pitfalls
|
||||
======================
|
||||
Implicit int
|
||||
------------
|
||||
|
||||
In C, the 'int' keyword can often be omitted from type declarations.
|
||||
For instance, you can write
|
||||
|
||||
unsigned variable;
|
||||
|
||||
as shorthand for
|
||||
|
||||
unsigned int variable;
|
||||
|
||||
There are several places where this can cause trouble. First, suppose
|
||||
'variable' is a long; then you might think
|
||||
|
||||
(unsigned) variable
|
||||
|
||||
would convert it to unsigned long. It does not. It converts to
|
||||
unsigned int. This mostly causes problems on 64-bit platforms, where
|
||||
long and int are not the same size.
|
||||
|
||||
Second, if you write a function definition with no return type at
|
||||
all:
|
||||
|
||||
operate(a, b)
|
||||
int a, b;
|
||||
{
|
||||
...
|
||||
}
|
||||
|
||||
that function is expected to return int, *not* void. GCC will warn
|
||||
about this. K+R C has no problem with 'void' as a return type, so you
|
||||
need not worry about that.
|
||||
|
||||
Implicit function declarations always have return type int. So if you
|
||||
correct the above definition to
|
||||
|
||||
void
|
||||
operate(a, b)
|
||||
int a, b;
|
||||
...
|
||||
|
||||
but operate() is called above its definition, you will get an error
|
||||
about a "type mismatch with previous implicit declaration". The cure
|
||||
is to prototype all functions at the top of the file, or in an
|
||||
appropriate header.
|
||||
|
||||
Char vs unsigned char vs int
|
||||
----------------------------
|
||||
|
||||
In C, unqualified 'char' may be either signed or unsigned; it is the
|
||||
implementation's choice. When you are processing 7-bit ASCII, it does
|
||||
not matter. But when your program must handle arbitrary binary data,
|
||||
or fully 8-bit character sets, you have a problem. The most obvious
|
||||
issue is if you have a look-up table indexed by characters.
|
||||
|
||||
For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
|
||||
WITH ACUTE ACCENT. In the proper locale, isalpha('\341') will be
|
||||
true. But if you read '\341' from a file and store it in a plain
|
||||
char, isalpha(c) may look up character 225, or it may look up
|
||||
character -31. And the ctype table has no entry at offset -31, so
|
||||
your program will crash. (If you're lucky.)
|
||||
|
||||
It is wise to use unsigned char everywhere you possibly can. This
|
||||
avoids all these problems. Unfortunately, the routines in <string.h>
|
||||
take plain char arguments, so you have to remember to cast them back
|
||||
and forth - or avoid the use of strxxx() functions, which is probably
|
||||
a good idea anyway.
|
||||
|
||||
Another common mistake is to use either char or unsigned char to
|
||||
receive the result of getc() or related stdio functions. They may
|
||||
return EOF, which is outside the range of values representable by
|
||||
char. If you use char, some legal character value may be confused
|
||||
with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
|
||||
The correct choice is int.
|
||||
|
||||
A more subtle version of the same mistake might look like this:
|
||||
|
||||
unsigned char pushback[NPUSHBACK];
|
||||
int pbidx;
|
||||
#define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
|
||||
#define get(c) (pbidx ? pushback[--pbidx] : getchar())
|
||||
...
|
||||
unget(EOF);
|
||||
|
||||
which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
|
||||
WITH UMLAUT.
|
||||
|
||||
|
||||
Other common pitfalls
|
||||
---------------------
|
||||
|
||||
o Expecting 'plain' char to be either sign or unsigned extending
|
||||
|
||||
o Shifting an item by a negative amount or by greater than or equal to
|
||||
the number of bits in a type (expecting shifts by 32 to be sensible
|
||||
has caused quite a number of bugs at least in the early days).
|
||||
|
||||
o Expecting ints shifted right to be sign extended.
|
||||
|
||||
o Modifying the same value twice within one sequence point.
|
||||
|
||||
o Host vs. target floating point representation, including emitting NaNs
|
||||
and Infinities in a form that the assembler handles.
|
||||
|
||||
o qsort being an unstable sort function (unstable in the sense that
|
||||
multiple items that sort the same may be sorted in different orders
|
||||
by different qsort functions).
|
||||
|
||||
o Passing incorrect types to fprintf and friends.
|
||||
|
||||
o Adding a function declaration for a module declared in another file to
|
||||
a .c file instead of to a .h file.
|
Loading…
Reference in New Issue
Block a user