This patch simplifies (!!!) the logic governing the naming of dump
files and auxiliary output files in the driver, in the compiler, and
in the LTO wrapper. No changes are made to the naming of primary
outputs, there are often ways to restore past behavior, and a number
of inconsistencies are fixed. Some internal options are removed
(-auxbase and -auxbase-strip), sensible existing uses of the -dumpdir
and -dumpbase options remain unchanged, and additional useful cases
are added, making for a naming scheme that is still admittedly quite
complex. Extensive documentation and testcases provide numerous
examples, from normal to corner cases.
The most visible changes are:
- aux and dump files now always go in the same directory, which
defaults to the directory of the primary output but can be
overridden with -dumpdir, -save-temps=*, or, preserving past behavior,
with a -dumpbase with a directory component.
- driver and compiler now have the same notion of naming of auxiliary
outputs, e.g. .dwo files will no longer be in one location while the
debug info suggests they are elsewhere, and -save-temps and .dwo
auxiliary outputs now go in the same location as .su, .ci and
coverage data, with consistent naming.
- explicitly-specified primary output names guide not only the
location of aux and dump outputs: the output base name is also used in
their base name, as a prefix when also linking (e.g. foo.c bar.c -o
foobar creates foobar-foo.dwo and foobar-bar.dwo with -gsplit-dwarf),
or as the base name instead of the input name (foo.c -c -o whatever.o
creates whatever.su rather than foo.su with -fstack-usage). The
preference for the input file base name, quite useful for our
testsuite, can be restored with -dumpbase "". When compiling and
linking tests in the testsuite with additional inputs, we now use this
flag. Files named in dejagnu board ldflags, libs, and ldscripts are
now quoted in the gcc testsuite with -Wl, so that they are not counted
as additional inputs by the compiler driver.
- naming a -dumpbase when compiling multiple sources used to cause
dumps from later compiles to overwrite those of earlier ones; it is
now used as a prefix when compiling multiple sources, like an
executable name above.
- the dumpbase, explicitly specified or computed from output or input
names, now also governs the naming of aux outputs; since aux outputs
usually replaced the suffix from the input name, while dump outputs
append their own additional suffixes, a -dumpbase-ext option is
introduced to enable a chosen suffix to be dropped from dumpbase to
form aux output names.
- LTO dump and aux outputs were quite a mess, sometimes leaking
temporary output names into -save-temps output names, sometimes
conversely generating desirable aux outputs in temporary locations.
They now obey the same logic as compiler aux and dump outputs, landing
in the expected location and taking the linker output name or an
explicit dumpbase override into account.
- Naming of -fdump-final-insns outputs now follows the dump file
naming logic for the .gkd files, and the .gk dump files generated in
the second -fcompare-debug compilation get the .gk inserted before the
suffix that -dumpbase-ext drops in aux outputs.
gcc/ChangeLog:
* common.opt (aux_base_name): Define.
(dumpbase, dumpdir): Mark as Driver options.
(-dumpbase, -dumpdir): Likewise.
(dumpbase-ext, -dumpbase-ext): New.
(auxbase, auxbase-strip): Drop.
* doc/invoke.texi (-dumpbase, -dumpbase-ext, -dumpdir):
Document.
(-o): Introduce the notion of primary output, mention it
influences auxiliary and dump output names as well, add
examples.
(-save-temps): Adjust, move examples into -dump*.
(-save-temps=cwd, -save-temps=obj): Likewise.
(-fdump-final-insns): Adjust.
* dwarf2out.c (gen_producer_string): Drop auxbase and
auxbase_strip; add dumpbase_ext.
* gcc.c (enum save_temps): Add SAVE_TEMPS_DUMP.
(save_temps_prefix, save_temps_length): Drop.
(save_temps_overrides_dumpdir): New.
(dumpdir, dumpbase, dumpbase_ext): New.
(dumpdir_length, dumpdir_trailing_dash_added): New.
(outbase, outbase_length): New.
(The Specs Language): Introduce %". Adjust %b and %B.
(ASM_FINAL_SPEC): Use %b.dwo for an aux output name always.
Precede object file with %w when it's the primary output.
(cpp_debug_options): Do not pass on incoming -dumpdir,
-dumpbase and -dumpbase-ext options; recompute them with
%:dumps.
(cc1_options): Drop auxbase with and without compare-debug;
use cpp_debug_options instead of dumpbase. Mark asm output
with %w when it's the primary output.
(static_spec_functions): Drop %:compare-debug-auxbase-opt and
%:replace-exception. Add %:dumps.
(driver_handle_option): Implement -save-temps=*/-dumpdir
mutual overriding logic. Save dumpdir, dumpbase and
dumpbase-ext options. Do not save output_file in
save_temps_prefix.
(adds_single_suffix_p): New.
(single_input_file_index): New.
(process_command): Combine output dir, output base name, and
dumpbase into dumpdir and outbase.
(set_collect_gcc_options): Pass a possibly-adjusted -dumpdir.
(do_spec_1): Optionally use dumpdir instead of save_temps_prefix,
and outbase instead of input_basename in %b, %B and in
-save-temps aux files. Handle empty argument %".
(driver::maybe_run_linker): Adjust dumpdir and auxbase.
(compare_debug_dump_opt_spec_function): Adjust gkd dump file
naming. Spec-quote the computed -fdump-final-insns file name.
(debug_auxbase_opt): Drop.
(compare_debug_self_opt_spec_function): Drop auxbase-strip
computation.
(compare_debug_auxbase_opt_spec_function): Drop.
(not_actual_file_p): New.
(replace_extension_spec_func): Drop.
(dumps_spec_func): New.
(convert_white_space): Split-out parts into...
(quote_string, whitespace_to_convert_p): ... these. New.
(quote_spec_char_p, quote_spec, quote_spec_arg): New.
(driver::finalize): Release and reset new variables; drop
removed ones.
* lto-wrapper.c (HAVE_TARGET_EXECUTABLE_SUFFIX): Define if...
(TARGET_EXECUTABLE_SUFFIX): ... is defined; define this to the
empty string otherwise.
(DUMPBASE_SUFFIX): Drop leading period.
(debug_objcopy): Use concat.
(run_gcc): Recognize -save-temps=* as -save-temps too. Obey
-dumpdir. Pass on empty dumpdir and dumpbase with a directory
component. Simplify temp file names.
* opts.c (finish_options): Drop aux base name handling.
(common_handle_option): Drop auxbase-strip handling.
* toplev.c (print_switch_values): Drop auxbase, add
dumpbase-ext.
(process_options): Derive aux_base_name from dump_base_name
and dump_base_ext.
(lang_dependent_init): Compute dump_base_ext along with
dump_base_name. Disable stack usage and callgraph-info during
lto generation and compare-debug recompilation.
gcc/fortran/ChangeLog:
* options.c (gfc_get_option_string): Drop auxbase, add
dumpbase_ext.
gcc/ada/ChangeLog:
* gcc-interface/lang-specs.h: Drop auxbase and auxbase-strip.
Use %:dumps instead of -dumpbase. Add %w for implicit .s
primary output.
* switch.adb (Is_Internal_GCC_Switch): Recognize dumpdir and
dumpbase-ext. Drop auxbase and auxbase-strip.
lto-plugin/ChangeLog:
* lto-plugin.c (skip_in_suffix): New.
(exec_lto_wrapper): Use skip_in_suffix and concat to build
non-temporary output names.
(onload): Look for -dumpdir in COLLECT_GCC_OPTIONS, and
override link_output_name with it.
contrib/ChangeLog:
* compare-debug: Adjust for .gkd files named as dump files,
with the source suffix rather than the object suffix.
gcc/testsuite/ChangeLog:
* gcc.misc-tests/outputs.exp: New.
* gcc.misc-tests/outputs-0.c: New.
* gcc.misc-tests/outputs-1.c: New.
* gcc.misc-tests/outputs-2.c: New.
* lib/gcc-defs.exp (gcc_adjusted_linker_flags): New.
(gcc_adjust_linker_flags): New.
(dg-additional-files-options): Call it. Pass -dumpbase ""
when there are additional sources.
* lib/profopt.exp (profopt-execute): Pass the executable
suffix with -dumpbase-ext.
* lib/scandump.exp (dump-base): Mention -dumpbase "" use.
* lib/scanltranstree.exp: Adjust dump suffix expectation.
* lib/scanwpaipa.exp: Likewise.
Copyright (C) 2000-2020 Free Software Foundation, Inc.
This file is intended to contain a few notes about writing C code
within GCC so that it compiles without error on the full range of
compilers GCC needs to be able to compile on.
The problem is that many ISO-standard constructs are not accepted by
either old or buggy compilers, and we keep getting bitten by them.
This knowledge until now has been sparsely spread around, so I
thought I'd collect it in one useful place. Please add and correct
any problems as you come across them.
I'm going to start from a base of the ISO C90 standard, since that is
probably what most people code to naturally. Obviously using
constructs introduced after that is not a good idea.
For the complete coding style conventions used in GCC, please read
http://gcc.gnu.org/codingconventions.html
String literals
---------------
Some compilers like MSVC++ have fairly low limits on the maximum
length of a string literal; 509 is the lowest we've come across. You
may need to break up a long printf statement into many smaller ones.
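
As a minimal sketch of one way to stay under such limits, the help
text below is kept as an array of short literals and written one piece
at a time. The names print_help and help_lines are made up for this
example and do not come from GCC.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical help text.  Each literal is far shorter than the
   509-character limit mentioned above.  */
static const char *const help_lines[] = {
  "Usage: example [options] file...\n",
  "  -o <file>   Place the output into <file>.\n",
  "  -v          Print the commands being run.\n"
};

/* Write the pieces to OUT one at a time.  Returns the number of
   characters written, or -1 if a write fails.  */
int
print_help (FILE *out)
{
  size_t i;
  int total = 0;

  for (i = 0; i < sizeof help_lines / sizeof help_lines[0]; i++)
    {
      if (fputs (help_lines[i], out) == EOF)
        return -1;
      total += (int) strlen (help_lines[i]);
    }
  return total;
}
```

Calling print_help (stdout) prints the three lines in order; adding a
line means adding one more short literal rather than growing a single
long one.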
Empty macro arguments
---------------------
ISO C (6.8.3 in the 1990 standard) specifies the following:
    If (before argument substitution) any argument consists of no
    preprocessing tokens, the behavior is undefined.
This was relaxed by ISO C99, but some older compilers emit an error,
so code like
  #define foo(x, y) x y
  foo (bar, )
needs to be coded in some other way.
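
One possible "other way", sketched here under the assumption that a
placeholder macro is acceptable in the code at hand, is to pass a
macro that expands to nothing. The argument then consists of one
preprocessing token before substitution, so the C90 rule quoted above
is not triggered. NOTHING and DECLARE are names invented for this
example.

```c
/* Expands to no tokens, but is itself a token when used as an
   argument.  */
#define NOTHING

/* Same shape as foo(x, y) above: paste two pieces side by side.  */
#define DECLARE(type_and_name, init) type_and_name init

DECLARE (int with_init, = 42);        /* int with_init = 42;  */
DECLARE (int without_init, NOTHING);  /* int without_init ;   */
```

Another option is simply to provide a separate one-argument macro for
the case where the second piece is not needed.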
Avoid unnecessary test before free
----------------------------------
Since SunOS 4 stopped being a reasonable portability target (which
happened around 2007), there has been no need to guard against
"free (NULL)". Thus, any guard like the following constitutes a
redundant test:
  if (P)
    free (P);
It is better to avoid the test.[*]
Instead, simply free P, regardless of whether it is NULL.
[*] However, if your profiling exposes a test like this in a
performance-critical loop, say where P is nearly always NULL, and
the cost of calling free on a NULL pointer would be prohibitively
high, consider using __builtin_expect, e.g., like this:
  if (__builtin_expect (ptr != NULL, 0))
    free (ptr);
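
For the common case, a cleanup routine can be sketched as follows.
The struct and function names are hypothetical; the point is only that
each member is passed to free without a preceding test.

```c
#include <stdlib.h>

/* Hypothetical structure owning two heap blocks, either of which may
   be NULL.  */
struct buffer
{
  char *data;
  char *name;
};

/* Release both members unconditionally.  free (NULL) does nothing,
   so no "if" guard is needed.  */
void
buffer_clear (struct buffer *b)
{
  free (b->data);
  free (b->name);
  b->data = NULL;
  b->name = NULL;
}
```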
Trigraphs
---------
You weren't going to use them anyway, but some otherwise ISO C
compliant compilers do not accept trigraphs.
Suffixes on Integer Constants
-----------------------------
You should never use a 'l' suffix on integer constants ('L' is fine),
since it can easily be confused with the number '1'.
Common Coding Pitfalls
======================
errno
-----
errno might be declared as a macro.
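
In practice this means a hand-written declaration such as
"extern int errno;" is not portable; include <errno.h> and use errno
only through that header. A small sketch, with parse_long being a
name invented for this example:

```c
#include <errno.h>
#include <stdlib.h>

/* Parse S as a decimal long and store it in *OUT.  Returns 0 on
   success and -1 if the value is out of range.  errno comes from
   <errno.h>; it is never declared by hand.  */
int
parse_long (const char *s, long *out)
{
  errno = 0;
  *out = strtol (s, NULL, 10);
  return errno == ERANGE ? -1 : 0;
}
```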
Implicit int
------------
In C, the 'int' keyword can often be omitted from type declarations.
For instance, you can write
  unsigned variable;
as shorthand for
  unsigned int variable;
There are several places where this can cause trouble. First, suppose
'variable' is a long; then you might think
  (unsigned) variable
would convert it to unsigned long. It does not. It converts to
unsigned int. This mostly causes problems on 64-bit platforms, where
long and int are not the same size.
Second, if you write a function definition with no return type at
all:
  operate (int a, int b)
  {
    ...
  }
that function is expected to return int, *not* void. GCC will warn
about this.
Implicit function declarations always have return type int. So if you
correct the above definition to
  void
  operate (int a, int b)
    ...
but operate() is called above its definition, you will get an error
about a "type mismatch with previous implicit declaration". The cure
is to prototype all functions at the top of the file, or in an
appropriate header.
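
A minimal sketch of that cure, reusing the operate example with an
invented caller named run:

```c
/* Prototype first, so the call in run () below sees the real return
   type instead of an implicit "int operate ()".  */
static void operate (int a, int b);

static int total;

int
run (void)
{
  operate (2, 3);
  return total;
}

/* The definition may now appear after its first use.  */
static void
operate (int a, int b)
{
  total = a + b;
}
```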
Char vs unsigned char vs int
----------------------------
In C, unqualified 'char' may be either signed or unsigned; it is the
implementation's choice. When you are processing 7-bit ASCII, it does
not matter. But when your program must handle arbitrary binary data,
or fully 8-bit character sets, you have a problem. The most obvious
issue is if you have a look-up table indexed by characters.
For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
WITH ACUTE ACCENT. In the proper locale, isalpha('\341') will be
true. But if you read '\341' from a file and store it in a plain
char, isalpha(c) may look up character 225, or it may look up
character -31. And the ctype table has no entry at offset -31, so
your program will crash. (If you're lucky.)
It is wise to use unsigned char everywhere you possibly can. This
avoids all these problems. Unfortunately, the routines in <string.h>
take plain char arguments, so you have to remember to cast them back
and forth - or avoid the use of strxxx() functions, which is probably
a good idea anyway.
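
Where plain char cannot be avoided, for instance because the data
arrives as a char string, a cast at the point of the <ctype.h> call
keeps the argument in range. The helper below is a hypothetical
example of that pattern.

```c
#include <ctype.h>

/* Count the alphabetic characters in S.  The cast makes sure isalpha
   never sees a negative value, whatever the signedness of plain
   char.  */
int
count_alpha (const char *s)
{
  int n = 0;

  for (; *s != '\0'; s++)
    if (isalpha ((unsigned char) *s))
      n++;
  return n;
}
```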
Another common mistake is to use either char or unsigned char to
receive the result of getc() or related stdio functions. They may
return EOF, which is outside the range of values representable by
char. If you use char, some legal character value may be confused
with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
The correct choice is int.
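
A short sketch of a read loop written that way; count_bytes is a name
made up for this example.

```c
#include <stdio.h>

/* Count the bytes remaining in IN.  C is an int so that EOF stays
   distinct from every byte value, including '\377'.  */
long
count_bytes (FILE *in)
{
  int c;
  long n = 0;

  while ((c = getc (in)) != EOF)
    n++;
  return n;
}
```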
A more subtle version of the same mistake might look like this:
  unsigned char pushback[NPUSHBACK];
  int pbidx;
  #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
  #define get(c) (pbidx ? pushback[--pbidx] : getchar())
  ...
  unget(EOF);
which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
WITH UMLAUT.
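
One way to repair it, sketched under the assumption that nothing else
depends on the element type, is to store the pushed-back values as
int. The value chosen for NPUSHBACK is arbitrary, and get is declared
here without a parameter so that calling it does not pass an empty
macro argument.

```c
#include <assert.h>
#include <stdio.h>

#define NPUSHBACK 8   /* arbitrary size for this sketch */

/* int elements, so a pushed-back EOF comes back out as EOF.  */
static int pushback[NPUSHBACK];
static int pbidx;

#define unget(c) (assert (pbidx < NPUSHBACK), pushback[pbidx++] = (c))
#define get() (pbidx ? pushback[--pbidx] : getchar ())
```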
Other common pitfalls
---------------------
o Expecting 'plain' char to be either sign extending or zero extending.
o Shifting an item by a negative amount or by greater than or equal to
the number of bits in a type (expecting shifts by 32 to be sensible
has caused quite a number of bugs at least in the early days).
o Expecting ints shifted right to be sign extended.
o Modifying the same value twice between two sequence points.
o Host vs. target floating point representation, including emitting NaNs
and Infinities in a form that the assembler handles.
o qsort being an unstable sort function (unstable in the sense that
multiple items that sort the same may be sorted in different orders
by different qsort functions).
o Passing incorrect types to fprintf and friends.
o Adding a declaration for a function defined in another file to
a .c file instead of to a .h file.
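
For the qsort item, one common way to get the same order from every
qsort implementation is to make the comparison function a total order
by adding a tie-breaker. The sketch below assumes each element
carries a unique sequence number; the struct and function names are
invented for this example.

```c
#include <stdlib.h>

/* KEY is what we really sort on; SEQ is a unique number (for
   instance the original position) used only to break ties.  */
struct item
{
  int key;
  int seq;
};

/* Never returns 0 for two distinct elements, so the result does not
   depend on how a particular qsort orders equal elements.  */
static int
compare_items (const void *pa, const void *pb)
{
  const struct item *a = (const struct item *) pa;
  const struct item *b = (const struct item *) pb;

  if (a->key != b->key)
    return a->key < b->key ? -1 : 1;
  if (a->seq != b->seq)
    return a->seq < b->seq ? -1 : 1;
  return 0;
}

void
sort_items (struct item *v, size_t n)
{
  qsort (v, n, sizeof *v, compare_items);
}
```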