1995-08-28 06:15:04 -04:00
Notes on the GNU Implementation of DWARF Debugging Information
--------------------------------------------------------------
Last Updated: Sun Jul 17 08:17:42 PDT 1994 by rfg@segfault.us.com
------------------------------------------------------------
|
|
|
|
|
|
|
|
This file describes special and unique aspects of the GNU implementation
|
|
|
|
of the DWARF debugging information language, as provided in the GNU version
|
|
|
|
2.x compiler(s).
|
|
|
|
|
|
|
|
For general information about the DWARF debugging information language,
|
|
|
|
you should obtain the DWARF version 1 specification document (and perhaps
|
|
|
|
also the DWARF version 2 draft specification document) developed by the
|
|
|
|
UNIX International Programming Languages Special Interest Group. A copy
of the DWARF version 1 specification (in PostScript form) may be
obtained either from me <rfg@netcom.com> or from the main Data General
|
|
|
|
FTP server. (See below.) The file you are looking at now only describes
|
|
|
|
known deviations from the DWARF version 1 specification, together with
|
|
|
|
those things which are allowed by the DWARF version 1 specification but
|
|
|
|
which are known to cause interoperability problems (e.g. with SVR4 SDB).
|
|
|
|
|
|
|
|
To obtain a copy of the DWARF Version 1 and/or DWARF Version 2 specification
|
|
|
|
from Data General's FTP server, use the following procedure:
|
|
|
|
|
|
|
|
---------------------------------------------------------------------------
|
|
|
|
ftp to machine: "dg-rtp.dg.com" (128.222.1.2).
|
|
|
|
|
|
|
|
Log in as "ftp".
|
|
|
|
cd to "plsig"
|
|
|
|
get any of the following files you are interested in:
|
|
|
|
|
|
|
|
dwarf.1.0.3.ps
|
|
|
|
dwarf.2.0.0.index.ps
|
|
|
|
dwarf.2.0.0.ps
|
|
|
|
---------------------------------------------------------------------------
|
|
|
|
|
|
|
|
The generation of DWARF debugging information by the GNU version 2.x C
compiler has now been tested rather extensively for m88k, i386, i860, and
Sparc targets. The DWARF output of the GNU C compiler appears to
interoperate well with the standard SVR4 SDB debugger on these kinds of
target systems (but of course, there are no guarantees).
|
|
|
|
|
|
|
|
DWARF generation for the GNU g++ compiler is still not operable. This is
|
|
|
|
due primarily to the many remaining cases where the g++ front end does not
|
|
|
|
conform to the conventions used in the GNU C front end for representing
|
|
|
|
various kinds of declarations in the TREE data structure. It is not clear
|
|
|
|
at this time how these problems will be addressed.
|
|
|
|
|
|
|
|
Future plans for the dwarfout.c module of the GNU compiler(s) include the
addition of full support for GNU FORTRAN. (This should, in theory, be a
lot simpler than adding support for g++... but we'll see.)
|
|
|
|
|
|
|
|
Many features of the DWARF version 2 specification have been adapted to
|
|
|
|
(and used in) the GNU implementation of DWARF (version 1). In most of
|
|
|
|
these cases, a DWARF version 2 approach is used in place of (or in addition
|
|
|
|
to) DWARF version 1 stuff simply because it is apparent that DWARF version
|
|
|
|
1 is not sufficiently expressive to provide the kinds of information which
|
|
|
|
may be necessary to support really robust debugging. In all of these cases
|
|
|
|
however, the use of DWARF version 2 features should not interfere in any
|
|
|
|
way with the interoperability (of GNU compilers) with generally available
|
|
|
|
"classic" (pre version 1) DWARF consumer tools (e.g. SVR4 SDB).
|
|
|
|
|
|
|
|
The DWARF generation enhancement for the GNU compiler(s) was initially
|
|
|
|
donated to the Free Software Foundation by Network Computing Devices.
|
|
|
|
(Thanks NCD!) Additional development and maintenance of dwarfout.c has
|
|
|
|
been largely supported (i.e. funded) by Intel Corporation. (Thanks Intel!)
|
|
|
|
|
|
|
|
If you have questions or comments about the DWARF generation feature, please
|
|
|
|
send mail to me <rfg@netcom.com>. I will be happy to investigate any bugs
|
|
|
|
reported and I may even provide fixes (but of course, I can make no promises).
|
|
|
|
|
|
|
|
The DWARF debugging information produced by GCC may deviate in a few minor
|
|
|
|
(but perhaps significant) respects from the DWARF debugging information
|
|
|
|
currently produced by other C compilers. A serious attempt has been made
|
|
|
|
however to conform to the published specifications, to existing practice,
|
|
|
|
and to generally accepted norms in the GNU implementation of DWARF.
|
|
|
|
|
|
|
|
** IMPORTANT NOTE ** ** IMPORTANT NOTE ** ** IMPORTANT NOTE **
|
|
|
|
|
|
|
|
Under normal circumstances, the DWARF information generated by the GNU
|
|
|
|
compilers (in an assembly language file) is essentially impossible for
|
|
|
|
a human being to read. This fact can make it very difficult to debug
|
|
|
|
certain DWARF-related problems. In order to overcome this difficulty,
|
|
|
|
a feature has been added to dwarfout.c (enabled by the -fverbose-asm
|
|
|
|
option) which causes additional comments to be placed into the assembly
|
|
|
|
language output file, out to the right-hand side of most bits of DWARF
|
|
|
|
material. The comments indicate (far more clearly than the obscure
|
|
|
|
DWARF hex codes do) what is actually being encoded in DWARF. Thus, the
|
|
|
|
-fverbose-asm option can be highly useful for those who must study the
|
|
|
|
DWARF output from the GNU compilers in detail.
|
|
|
|
|
|
|
|
---------
|
|
|
|
|
|
|
|
(Footnote: Within this file, the term `Debugging Information Entry' will
|
|
|
|
be abbreviated as `DIE'.)
|
|
|
|
|
|
|
|
|
|
|
|
Release Notes (aka known bugs)
|
|
|
|
-------------------------------
|
|
|
|
|
|
|
|
In one very obscure case involving dynamically sized arrays, the DWARF
|
|
|
|
"location information" for such an array may make it appear that the
|
|
|
|
array has been totally optimized out of existence, when in fact it
|
|
|
|
*must* actually exist. (This only happens when you are using *both* -g
|
|
|
|
*and* -O.) This is due to aggressive dead store elimination in the
|
|
|
|
compiler, and to the fact that the DECL_RTL expressions associated with
|
|
|
|
variables are not always updated to correctly reflect the effects of
|
|
|
|
GCC's aggressive dead store elimination.
|
|
|
|
|
|
|
|
-------------------------------
|
|
|
|
|
|
|
|
When attempting to set a breakpoint at the "start" of a function compiled
|
|
|
|
with -g1, the debugger currently has no way of knowing exactly where the
|
|
|
|
end of the prologue code for the function is. Thus, for most targets,
|
|
|
|
all the debugger can do is to set the breakpoint at the AT_low_pc address
|
|
|
|
for the function. But if you stop there and then try to look at one or
|
|
|
|
more of the formal parameter values, they may not have been "homed" yet,
|
|
|
|
so you may get inaccurate answers (or perhaps even addressing errors).
|
|
|
|
|
|
|
|
Some people may consider this simply a non-feature, but I consider it a
bug, and I hope to provide some GNU-specific attributes (on function
DIEs) which will specify the address of the end of the prologue and the
|
|
|
|
address of the beginning of the epilogue in a future release.
|
|
|
|
|
|
|
|
-------------------------------
|
|
|
|
|
|
|
|
It is believed at this time that old bugs relating to the AT_bit_offset
|
|
|
|
values for bit-fields have been fixed.
|
|
|
|
|
|
|
|
There may still be some very obscure bugs relating to the DWARF description
|
|
|
|
of type `long long' bit-fields for target machines (e.g. 80x86 machines)
|
|
|
|
where the alignment of type `long long' data objects is different from
|
|
|
|
(and less than) the size of a type `long long' data object.
|
|
|
|
|
|
|
|
Please report any problems with the DWARF description of bit-fields as you
|
|
|
|
would any other GCC bug. (Procedures for bug reporting are given in the
|
|
|
|
GNU C compiler manual.)
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
At this time, GCC's DWARF generation does not handle the GNU C "nested functions"
|
|
|
|
extension. (See the GCC manual for more info on this extension to ANSI C.)
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
The GNU compilers now represent inline functions (and inlined instances
|
|
|
|
thereof) in exactly the manner described by the current DWARF version 2
|
|
|
|
(draft) specification. The version 1 specification for handling inline
|
|
|
|
functions (and inlined instances) was known to be brain-damaged (by the
|
|
|
|
PLSIG) when the version 1 spec was finalized, but it was simply too late
|
|
|
|
in the cycle to get it removed before the version 1 spec was formally
|
|
|
|
released to the public (by UI).
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
At this time, GCC does not generate the kind of really precise information
|
|
|
|
about the exact declared types of entities with signed integral types which
|
|
|
|
is required by the current DWARF draft specification.
|
|
|
|
|
|
|
|
Specifically, the current DWARF draft specification seems to require that
|
|
|
|
the type of a non-unsigned integral bit-field member of a struct or union
|
|
|
|
type be represented as either a "signed" type or as a "plain" type,
depending upon the exact set of keywords that were used in the
type specification for the given bit-field member. It was felt (by the
|
|
|
|
UI/PLSIG) that this distinction between "plain" and "signed" integral types
|
|
|
|
could have some significance (in the case of bit-fields) because ANSI C
|
|
|
|
does not constrain the signedness of a plain bit-field, whereas it does
|
|
|
|
constrain the signedness of an explicitly "signed" bit-field. For this
|
|
|
|
reason, the current DWARF specification calls for compilers to produce
|
|
|
|
type information (for *all* integral typed entities... not just bit-fields)
|
|
|
|
which explicitly indicates the signedness of the relevant type to be
|
|
|
|
"signed" or "plain" or "unsigned".
|
|
|
|
|
|
|
|
Unfortunately, the GNU DWARF implementation is currently incapable of making
|
|
|
|
such distinctions.
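To make the "plain" versus "signed" distinction concrete, here is a small
C sketch. It is an illustration only, not code from GCC, and the struct and
function names are made up for this example. The three bit-fields differ
only in the keywords used to declare them, which is exactly the information
the draft specification asks compilers to preserve.

```c
#include <assert.h>

/* Three 3-bit fields that differ only in their declared signedness. */
struct flags {
    int          plain_field    : 3;  /* "plain": signedness left to the compiler */
    signed int   signed_field   : 3;  /* explicitly "signed" */
    unsigned int unsigned_field : 3;  /* explicitly "unsigned" */
};

/* Set every bit of each field. */
static void store_all_ones(struct flags *f)
{
    assert(f != 0);
    f->plain_field = -1;
    f->signed_field = -1;
    f->unsigned_field = 7u;
}

/* Nonzero if this compiler treats a plain `int' bit-field as signed. */
static int plain_bitfield_is_signed(void)
{
    struct flags f;

    store_all_ones(&f);
    return f.plain_field < 0;
}
```

The signed and unsigned fields read back as -1 and 7 respectively on any
conforming compiler, whereas the result of plain_bitfield_is_signed() is
up to the implementation.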
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
|
|
|
|
Known Interoperability Problems
|
|
|
|
-------------------------------
|
|
|
|
|
|
|
|
Although the GNU implementation of DWARF conforms (for the most part) with
|
|
|
|
the current UI/PLSIG DWARF version 1 specification (with many compatible
|
|
|
|
version 2 features added in as "vendor specific extensions" just for good
|
|
|
|
measure) there are a few known cases where GCC's DWARF output can cause
|
|
|
|
some confusion for "classic" (pre version 1) DWARF consumers such as the
|
|
|
|
System V Release 4 SDB debugger. These cases are described in this section.
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
The DWARF version 1 specification includes the fundamental type codes
|
|
|
|
FT_ext_prec_float, FT_complex, FT_dbl_prec_complex, and FT_ext_prec_complex.
|
|
|
|
Since GNU C is only a C compiler (and since C doesn't provide any "complex"
|
|
|
|
data types) the only one of these fundamental type codes which GCC ever
|
|
|
|
generates is FT_ext_prec_float. This fundamental type code is generated
|
|
|
|
by GCC for the `long double' data type. Unfortunately, due to an apparent
|
|
|
|
bug in the SVR4 SDB debugger, SDB can become very confused wherever any
|
|
|
|
attempt is made to print a variable, parameter, or field whose type was
|
|
|
|
given in terms of FT_ext_prec_float.
|
|
|
|
|
|
|
|
(Actually, SVR4 SDB fails to understand *any* of the four fundamental type
codes mentioned here. This fact will cause additional problems when
there is a GNU FORTRAN front-end.)
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
In general, it appears that SVR4 SDB is not able to effectively ignore
|
|
|
|
fundamental type codes in the "implementation defined" range. This can
|
|
|
|
cause problems when a program being debugged uses the `long long' data
|
|
|
|
type (or the signed or unsigned varieties thereof) because these types
|
|
|
|
are not defined by ANSI C, and thus, GCC must use its own private fundamental
|
|
|
|
type codes (from the implementation-defined range) to represent these types.
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
|
|
|
|
General GNU DWARF extensions
|
|
|
|
----------------------------
|
|
|
|
|
|
|
|
In the current DWARF version 1 specification, no mechanism is specified by
|
|
|
|
which accurate information about executable code from include files can be
|
|
|
|
properly (and fully) described. (The DWARF version 2 specification *does*
|
|
|
|
specify such a mechanism, but it is about 10 times more complicated than
|
|
|
|
it needs to be so I'm not terribly anxious to try to implement it right
|
|
|
|
away.)
|
|
|
|
|
|
|
|
In the GNU implementation of DWARF version 1, a fully downward-compatible
|
|
|
|
extension has been implemented which permits the GNU compilers to specify
|
|
|
|
which executable lines come from which files. This extension places
|
|
|
|
additional information (about source file names) in GNU-specific sections
|
|
|
|
(which should be totally ignored by all non-GNU DWARF consumers) so that
|
|
|
|
this extended information can be provided (to GNU DWARF consumers) in a way
|
|
|
|
which is totally transparent (and invisible) to non-GNU DWARF consumers
|
|
|
|
(e.g. the SVR4 SDB debugger). The additional information is placed *only*
|
|
|
|
in specialized GNU-specific sections, where it should never even be seen
|
|
|
|
by non-GNU DWARF consumers.
|
|
|
|
|
|
|
|
To understand this GNU DWARF extension, imagine that the sequence of entries
in the .line section is broken up into several subsections. Each contiguous
sequence of .line entries which relates to a sequence of lines (or statements)
from one particular file (either a `base' file or an `include' file) could
be called a `line entries chunk' (LEC).
|
|
|
|
|
|
|
|
For each LEC there is one entry in the .debug_srcinfo section.
|
|
|
|
|
|
|
|
Each normal entry in the .debug_srcinfo section consists of two 4-byte
|
|
|
|
words of data as follows:
|
|
|
|
|
|
|
|
    (1) The starting address (relative to the entire .line section)
        of the first .line entry in the relevant LEC.

    (2) The starting address (relative to the entire .debug_sfnames
        section) of a NUL terminated string representing the
        relevant filename. (This filename may be either a
        relative or an absolute filename, depending upon how the
        given source file was located during compilation.)
|
|
|
|
|
|
|
|
Obviously, each .debug_srcinfo entry allows you to find the relevant filename,
|
|
|
|
and it also points you to the first .line entry that was generated as a result
|
|
|
|
of having compiled a given source line from the given source file.
|
|
|
|
|
|
|
|
Each subsequent .line entry should also be assumed to have been produced
|
|
|
|
as a result of compiling yet more lines from the same file. The end of
|
|
|
|
any given LEC is easily found by looking at the first 4-byte pointer in
|
|
|
|
the *next* .debug_srcinfo entry. That next .debug_srcinfo entry points
|
|
|
|
to a new and different LEC, so the preceding LEC (implicitly) must have
|
|
|
|
ended with the last .line section entry which occurs at the 2 1/2 words
|
|
|
|
just before the address given in the first pointer of the new .debug_srcinfo
|
|
|
|
entry.
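The layout of a normal entry, and the rule for finding the end of an LEC,
can be sketched in C roughly as follows. This is only an illustration under
stated assumptions: the struct and function names are hypothetical (they do
not come from dwarfout.c or from any debugger), the 4-byte words are assumed
to be stored little-endian, and a .line entry is taken to be 10 bytes long
(the "2 1/2 words" mentioned above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of one .line entry: 2 1/2 words of 4 bytes each. */
enum { LINE_ENTRY_SIZE = 10 };

/* One normal .debug_srcinfo entry (hypothetical struct name). */
struct srcinfo_entry {
    uint32_t line_off;    /* offset of the LEC's first entry in .line */
    uint32_t sfname_off;  /* offset of the filename in .debug_sfnames */
};

/* Decode a 4-byte word, ASSUMING little-endian target byte order. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 8-byte normal entry that starts at p. */
static struct srcinfo_entry read_srcinfo_entry(const unsigned char *p)
{
    struct srcinfo_entry e;

    assert(p != NULL);
    e.line_off = read_u32_le(p);
    e.sfname_off = read_u32_le(p + 4);
    return e;
}

/* Number of .line entries in the LEC described by cur, where next is
   the .debug_srcinfo entry that immediately follows cur. */
static size_t lec_entry_count(struct srcinfo_entry cur,
                              struct srcinfo_entry next)
{
    assert(next.line_off >= cur.line_off);
    return (next.line_off - cur.line_off) / LINE_ENTRY_SIZE;
}
```

A consumer would call read_srcinfo_entry() on two consecutive entries and
pass both to lec_entry_count() to learn how many .line entries belong to
the first one's source file.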
|
|
|
|
|
|
|
|
The following picture may help to clarify this feature. Let's assume that
`LE' stands for `.line entry'. Also, assume that `*' stands for a pointer.
|
|
|
|
|
|
|
|
|
|
|
|
    .line section      .debug_srcinfo section     .debug_sfnames section
    ----------------------------------------------------------------

    LE  <---------------------- *
    LE                          * -----------------> "foobar.c" <---
    LE                                                             |
    LE                                                             |
    LE  <---------------------- *                                  |
    LE                          * -----------------> "foobar.h" <| |
    LE                                                           | |
    LE                                                           | |
    LE  <---------------------- *                                | |
    LE                          * -----------------> "inner.h"   | |
    LE                                                           | |
    LE  <---------------------- *                                | |
    LE                          * -------------------------------- |
    LE                                                             |
    LE                                                             |
    LE                                                             |
    LE                                                             |
    LE  <---------------------- *                                  |
    LE                          * ----------------------------------
    LE
    LE
    LE
|
|
|
|
|
|
|
|
In effect, each entry in the .debug_srcinfo section points to *both* a
|
|
|
|
filename (in the .debug_sfnames section) and to the start of a block of
|
|
|
|
consecutive LEs (in the .line section).
|
|
|
|
|
|
|
|
Note that just like in the .line section, there are specialized first and
|
|
|
|
last entries in the .debug_srcinfo section for each object file. These
|
|
|
|
special first and last entries for the .debug_srcinfo section are very
|
|
|
|
different from the normal .debug_srcinfo section entries. They provide
|
|
|
|
additional information which may be helpful to a debugger when it is
|
|
|
|
interpreting the data in the .debug_srcinfo, .debug_sfnames, and .line
|
|
|
|
sections.
|
|
|
|
|
|
|
|
The first entry in the .debug_srcinfo section for each compilation unit
|
|
|
|
consists of five 4-byte words of data. The contents of these five words
|
|
|
|
should be interpreted (by debuggers) as follows:
|
|
|
|
|
|
|
|
(1) The starting address (relative to the entire .line section)
|
|
|
|
of the .line section for this compilation unit.
|
|
|
|
|
|
|
|
(2) The starting address (relative to the entire .debug_sfnames
|
|
|
|
section) of the .debug_sfnames section for this compilation
|
|
|
|
unit.
|
|
|
|
|
|
|
|
(3) The starting address (in the execution virtual address space)
|
|
|
|
of the .text section for this compilation unit.
|
|
|
|
|
|
|
|
(4) The ending address plus one (in the execution virtual address
|
|
|
|
space) of the .text section for this compilation unit.
|
|
|
|
|
|
|
|
(5) The date/time (in seconds since midnight 1/1/70) at which the
|
|
|
|
compilation of this compilation unit occurred. This value
|
|
|
|
should be interpreted as an unsigned quantity because gcc
|
|
|
|
might be configured to generate a default value of 0xffffffff
|
|
|
|
in this field (in cases where it is desired to have object
|
|
|
|
files created at different times from identical source files
|
|
|
|
be byte-for-byte identical). By default, these timestamps
|
|
|
|
are *not* generated by dwarfout.c (so that object files
|
|
|
|
compiled at different times will be byte-for-byte identical).
|
|
|
|
If you wish to enable this "timestamp" feature however, you
|
|
|
|
can simply place a #define for the symbol `DWARF_TIMESTAMPS'
|
|
|
|
in your target configuration file and then rebuild the GNU
|
|
|
|
compiler(s).
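The five words listed above map naturally onto a small C structure. The
following is a sketch only: the names are hypothetical, the words are
assumed to be little-endian, and treating 0xffffffff as "no timestamp
recorded" is my reading of the description of word (5).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default value of word (5) when timestamps are not generated. */
#define SRCINFO_NO_TIMESTAMP 0xffffffffu

/* The five-word initial .debug_srcinfo entry (hypothetical names). */
struct srcinfo_header {
    uint32_t line_start;     /* (1) start of this unit's .line data */
    uint32_t sfnames_start;  /* (2) start of this unit's .debug_sfnames data */
    uint32_t text_start;     /* (3) first address of this unit's .text */
    uint32_t text_end;       /* (4) last .text address plus one */
    uint32_t timestamp;      /* (5) seconds since 1/1/70, unsigned */
};

/* Decode a 4-byte word, ASSUMING little-endian target byte order. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 20-byte initial entry that starts at p. */
static struct srcinfo_header read_srcinfo_header(const unsigned char *p)
{
    struct srcinfo_header h;

    assert(p != NULL);
    h.line_start    = read_u32_le(p);
    h.sfnames_start = read_u32_le(p + 4);
    h.text_start    = read_u32_le(p + 8);
    h.text_end      = read_u32_le(p + 12);
    h.timestamp     = read_u32_le(p + 16);
    return h;
}

/* Nonzero if addr lies inside this compilation unit's .text range. */
static int header_covers_address(const struct srcinfo_header *h,
                                 uint32_t addr)
{
    return addr >= h->text_start && addr < h->text_end;
}

/* Nonzero if a real compilation time was recorded in word (5). */
static int header_has_timestamp(const struct srcinfo_header *h)
{
    return h->timestamp != SRCINFO_NO_TIMESTAMP;
}
```

Because word (4) is the ending address plus one, the range test uses a
strict comparison on the upper bound.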
|
|
|
|
|
|
|
|
Note that the first string placed into the .debug_sfnames section for each
|
|
|
|
compilation unit is the name of the directory in which compilation occurred.
|
|
|
|
This string ends with a `/' (to help indicate that it is the pathname of a
|
|
|
|
directory). Thus, the second word of each specialized initial .debug_srcinfo
|
|
|
|
entry for each compilation unit may be used as a pointer to the (string)
|
|
|
|
name of the compilation directory, and that string may in turn be used to
|
|
|
|
"absolutize" any relative pathnames which may appear later on in the
|
|
|
|
.debug_sfnames section entries for the same compilation unit.
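One possible way for a DWARF consumer to "absolutize" a filename is
sketched below. The function name and its interface are assumptions made
for this example. It relies only on the two facts stated above: the
compilation directory string ends with a `/', and a filename is either
relative or absolute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write the absolute form of name into out (out_size bytes).
   comp_dir is the first .debug_sfnames string for the compilation
   unit and is expected to end with '/'.
   Returns 0 on success, -1 if the result did not fit. */
static int absolutize(const char *comp_dir, const char *name,
                      char *out, size_t out_size)
{
    int n;

    assert(comp_dir != NULL && name != NULL && out != NULL);
    if (name[0] == '/')
        n = snprintf(out, out_size, "%s", name);   /* already absolute */
    else
        n = snprintf(out, out_size, "%s%s", comp_dir, name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

For instance, with a compilation directory of "/usr/src/" the relative
name "foobar.c" becomes "/usr/src/foobar.c", while a name that already
starts with `/' is copied unchanged.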
|
|
|
|
|
|
|
|
The fifth and last word of each specialized starting entry for a compilation
|
|
|
|
unit in the .debug_srcinfo section may (depending upon your configuration)
|
|
|
|
indicate the date/time of compilation, and this may be used (by a debugger)
|
|
|
|
to determine if any of the source files which contributed code to this
|
|
|
|
compilation unit are newer than the object code for the compilation unit
|
|
|
|
itself. If so, the debugger may wish to print an "out-of-date" warning
|
|
|
|
about the compilation unit.
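A debugger's "out-of-date" check might look something like the sketch
below. The function name is hypothetical, and how the debugger obtains the
modification times of the source files is left out. The check treats
0xffffffff as meaning that no compilation time was recorded, which is an
assumption based on the default value described earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default value of the fifth word when timestamps are not generated. */
#define SRCINFO_NO_TIMESTAMP 0xffffffffu

/* Count the source files whose modification time (seconds since
   1/1/70) is later than the recorded compilation time.
   Returns -1 if no compilation time was recorded. */
static int count_newer_sources(uint32_t compile_time,
                               const uint32_t *mtimes, size_t n)
{
    size_t i;
    int newer = 0;

    assert(mtimes != NULL || n == 0);
    if (compile_time == SRCINFO_NO_TIMESTAMP)
        return -1;
    for (i = 0; i < n; i++)
        if (mtimes[i] > compile_time)
            newer++;
    return newer;
}
```

A positive result would be the cue to print the warning; a result of -1
means the comparison cannot be made at all.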
|
|
|
|
|
|
|
|
The .debug_srcinfo section associated with each compilation will also have
|
|
|
|
a specialized terminating entry. This terminating .debug_srcinfo section
|
|
|
|
entry will consist of the following two 4-byte words of data:
|
|
|
|
|
|
|
|
(1) The offset, measured from the start of the .line section to
|
|
|
|
the beginning of the terminating entry for the .line section.
|
|
|
|
|
|
|
|
(2) A word containing the value 0xffffffff.
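Putting the three kinds of entry together, a consumer could walk one
compilation unit's .debug_srcinfo data as sketched here. The names are
hypothetical and the words are assumed to be little-endian. The walk skips
the five-word initial entry, then reads two-word entries until it meets
one whose second word is 0xffffffff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SRCINFO_HEADER_SIZE = 20,  /* five 4-byte words */
    SRCINFO_ENTRY_SIZE  = 8    /* two 4-byte words */
};

/* Second word of the terminating entry. */
#define SRCINFO_END_MARK 0xffffffffu

/* Decode a 4-byte word, ASSUMING little-endian target byte order. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Count the normal entries (one per LEC) in the .debug_srcinfo data
   of a single compilation unit, which starts at cu and is at most
   size bytes long.  Returns -1 if no terminating entry is found. */
static long count_srcinfo_entries(const unsigned char *cu, size_t size)
{
    size_t pos = SRCINFO_HEADER_SIZE;
    long count = 0;

    assert(cu != NULL);
    while (pos + SRCINFO_ENTRY_SIZE <= size) {
        if (read_u32_le(cu + pos + 4) == SRCINFO_END_MARK)
            return count;              /* reached the terminating entry */
        count++;
        pos += SRCINFO_ENTRY_SIZE;
    }
    return -1;                         /* ran off the end of the data */
}
```

The first word of the terminating entry then gives the end of the last
LEC, in the same way that each normal entry gives the end of the LEC
before it.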
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
In the current DWARF version 1 specification, no mechanism is specified by
|
|
|
|
which information about macro definitions and un-definitions may be provided
|
|
|
|
to the DWARF consumer.
|
|
|
|
|
|
|
|
The DWARF version 2 (draft) specification does specify such a mechanism.
|
|
|
|
That specification was based on the GNU ("vendor specific extension")
|
|
|
|
which provided some support for macro definitions and un-definitions,
|
|
|
|
but the "official" DWARF version 2 (draft) specification mechanism for
|
|
|
|
handling macros and the GNU implementation have diverged somewhat. I
|
|
|
|
plan to update the GNU implementation to conform to the "official"
|
|
|
|
DWARF version 2 (draft) specification as soon as I get time to do that.
|
|
|
|
|
|
|
|
Note that in the GNU implementation, additional information about macro
|
|
|
|
definitions and un-definitions is *only* provided when the -g3 level of
|
|
|
|
debug-info production is selected. (The default level is -g2 and the
|
|
|
|
plain old -g option is considered to be identical to -g2.)
|
|
|
|
|
|
|
|
GCC records information about macro definitions and undefinitions primarily
|
|
|
|
in a section called the .debug_macinfo section. Normal entries in the
|
|
|
|
.debug_macinfo section consist of the following three parts:
|
|
|
|
|
|
|
|
(1) A special "type" byte.
|
|
|
|
|
|
|
|
(2) A 3-byte line-number/filename-offset field.
|
|
|
|
|
|
|
|
(3) A NUL terminated string.
|
|
|
|
|
|
|
|
The interpretation of the second and third parts is dependent upon the
|
|
|
|
value of the leading (type) byte.
|
|
|
|
|
|
|
|
The type byte may have one of four values depending upon the type of the
|
|
|
|
.debug_macinfo entry which follows. The 1-byte MACINFO type codes presently
|
|
|
|
used, and their meanings are as follows:
|
|
|
|
|
|
|
|
    MACINFO_start       A base file or an include file starts here.
    MACINFO_resume      The current base or include file ends here.
    MACINFO_define      A #define directive occurs here.
    MACINFO_undef       A #undef directive occurs here.
|
|
|
|
|
|
|
|
(Note that the MACINFO_... codes mentioned here are simply symbolic names
|
|
|
|
for constants which are defined in the GNU dwarf.h file.)
|
|
|
|
|
|
|
|
For MACINFO_define and MACINFO_undef entries, the second (3-byte) field
|
|
|
|
contains the number of the source line (relative to the start of the current
|
|
|
|
base source file or the current include file) where the #define or #undef
|
|
|
|
directive appears. For a MACINFO_define entry, the following string field
|
|
|
|
contains the name of the macro which is defined, followed by its definition.
|
|
|
|
Note that the definition is always separated from the name of the macro
|
|
|
|
by at least one whitespace character. For a MACINFO_undef entry, the
|
|
|
|
string which follows the 3-byte line number field contains just the name
|
|
|
|
of the macro which is being undef'ed.
|
|
|
|
|
|
|
|
For a MACINFO_start entry, the 3-byte field following the type byte contains
|
|
|
|
the offset, relative to the start of the .debug_sfnames section for the
|
|
|
|
current compilation unit, of a string which names the new source file which
|
|
|
|
is beginning its inclusion at this point. Following that 3-byte field,
|
|
|
|
each MACINFO_start entry always contains a zero length NUL terminated
|
|
|
|
string.
|
|
|
|
|
|
|
|
For a MACINFO_resume entry, the 3-byte field following the type byte contains
|
|
|
|
the line number WITHIN THE INCLUDING FILE at which the inclusion of the
|
|
|
|
current file (whose inclusion ends here) was initiated. Following that
|
|
|
|
3-byte field, each MACINFO_resume entry always contains a zero length NUL
|
|
|
|
terminated string.
|
|
|
|
|
|
|
|
Each set of .debug_macinfo entries for each compilation unit is terminated
|
|
|
|
by a special .debug_macinfo entry consisting of a 4-byte zero value followed
|
|
|
|
by a single NUL byte.
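The entry format described above (type byte, 3-byte field, NUL terminated
string, with an all-zero terminator) can be stepped through with a routine
along the following lines. This is a sketch under assumptions: the names
are invented for this example, the byte order of the 3-byte field is not
stated in this file and is assumed here to be little-endian, and the type
byte is passed through uninterpreted (the real MACINFO_... values live in
the GNU dwarf.h file).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One decoded .debug_macinfo entry (hypothetical struct name). */
struct macinfo_entry {
    unsigned char type;   /* one of the MACINFO_... codes */
    uint32_t value;       /* line number or .debug_sfnames offset */
    const char *text;     /* NUL terminated string (may be empty) */
};

/* Decode the entry at p, where end points just past the section data.
   Returns a pointer to the following entry, or NULL when the
   terminating entry (or malformed data) is reached. */
static const unsigned char *macinfo_next(const unsigned char *p,
                                         const unsigned char *end,
                                         struct macinfo_entry *out)
{
    const unsigned char *nul;

    assert(p != NULL && end != NULL && out != NULL);
    if (end - p < 5)
        return NULL;                   /* too short for any entry */
    nul = memchr(p + 4, '\0', (size_t)(end - (p + 4)));
    if (nul == NULL)
        return NULL;                   /* string is not terminated */

    out->type = p[0];
    /* ASSUMPTION: the 3-byte field is stored least significant byte first. */
    out->value = (uint32_t)p[1] | ((uint32_t)p[2] << 8) |
                 ((uint32_t)p[3] << 16);
    out->text = (const char *)(p + 4);

    /* Terminator: a 4-byte zero value followed by a single NUL byte. */
    if (out->type == 0 && out->value == 0 && nul == p + 4)
        return NULL;
    return nul + 1;
}
```

A caller would loop on macinfo_next() until it returns NULL, switching on
the type field to decide whether value is a line number (define, undef,
resume) or a .debug_sfnames offset (start).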
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
In the current DWARF draft specification, no provision is made for providing
|
|
|
|
a separate level of (limited) debugging information necessary to support
|
|
|
|
tracebacks (only) through fully-debugged code (e.g. code in system libraries).
|
|
|
|
|
|
|
|
A proposal to define such a level was submitted (by me) to the UI/PLSIG.
|
|
|
|
This proposal was rejected by the UI/PLSIG for inclusion into the DWARF
|
|
|
|
version 1 specification for two reasons. First, it was felt (by the PLSIG)
|
|
|
|
that the issues involved in supporting a "traceback only" subset of DWARF
|
|
|
|
were not well understood. Second, and perhaps more importantly, the PLSIG
|
|
|
|
is already having enough trouble agreeing on what it means to be "conforming"
|
|
|
|
to the DWARF specification, and it was felt that trying to specify multiple
|
|
|
|
different *levels* of conformance would only complicate our discussions of
|
|
|
|
this already divisive issue. Nonetheless, the GNU implementation of DWARF
|
|
|
|
provides an abbreviated "traceback only" level of debug-info production for
|
|
|
|
use with fully-debugged "system library" code. This level should only be
|
|
|
|
used for fully debugged system library code, and even then, it should only
|
|
|
|
be used where there is a very strong need to conserve disk space. This
|
|
|
|
abbreviated level of debug-info production can be used by specifying the
|
|
|
|
-g1 option on the compilation command line.
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
As mentioned above, the GNU implementation of DWARF currently uses the DWARF
|
|
|
|
version 2 (draft) approach for inline functions (and inlined instances
|
|
|
|
thereof). This is used in preference to the version 1 approach because
|
|
|
|
(quite simply) the version 1 approach is highly brain-damaged and probably
|
|
|
|
unworkable.
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
|
|
|
|
GNU DWARF Representation of GNU C Extensions to ANSI C
|
|
|
|
------------------------------------------------------
|
|
|
|
|
|
|
|
The file dwarfout.c has been designed and implemented so as to provide
|
|
|
|
some reasonable DWARF representation for each and every declarative
|
|
|
|
construct which is accepted by the GNU C compiler. Since the GNU C
|
|
|
|
compiler accepts a superset of ANSI C, this means that there are some
|
|
|
|
cases in which the DWARF information produced by GCC must take some
|
|
|
|
liberties in improvising DWARF representations for declarations which
|
|
|
|
are only valid in (extended) GNU C.
|
|
|
|
|
|
|
|
In particular, GNU C provides at least three significant extensions to
|
|
|
|
ANSI C when it comes to declarations. These are (1) inline functions,
(2) dynamic arrays, and (3) incomplete enum types. (See the GCC
|
|
|
|
manual for more information on these GNU extensions to ANSI C.) When
|
|
|
|
used, these GNU C extensions are represented (in the generated DWARF
|
|
|
|
output of GCC) in the most natural and intuitively obvious ways.
|
|
|
|
|
|
|
|
In the case of inline functions, the DWARF representation is exactly as
|
|
|
|
called for in the DWARF version 2 (draft) specification for an identical
|
|
|
|
function written in C++; i.e. we "reuse" the representation of inline
|
|
|
|
functions which has been defined for C++ to support this GNU C extension.
|
|
|
|
|
|
|
|
In the case of dynamic arrays, we use the most obvious representational
|
|
|
|
mechanism available; i.e. an array type in which the upper bound of
|
|
|
|
some dimension (usually the first and only dimension) is a variable
|
|
|
|
rather than a constant. (See the DWARF version 1 specification for more
|
|
|
|
details.)
|
|
|
|
|
|
|
|
In the case of incomplete enum types, such types are represented simply
|
|
|
|
as TAG_enumeration_type DIEs which DO NOT contain either AT_byte_size
|
|
|
|
attributes or AT_element_list attributes.
|
|
|
|
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
|
|
|
|
Future Directions
|
|
|
|
-----------------
|
|
|
|
|
|
|
|
The codes, formats, and other paraphernalia necessary to provide proper
|
|
|
|
support for symbolic debugging for the C++ language are still being worked
|
|
|
|
on by the UI/PLSIG. The vast majority of the additions to DWARF which will
|
|
|
|
be needed to completely support C++ have already been hashed out and agreed
|
|
|
|
upon, but a few small issues (e.g. anonymous unions, access declarations)
|
|
|
|
are still being discussed. Also, we in the PLSIG are still discussing
|
|
|
|
whether or not we need to do anything special for C++ templates. (At this
|
|
|
|
time it is not yet clear whether we even need to do anything special for
|
|
|
|
these.)
|
|
|
|
|
|
|
|
Unfortunately, as mentioned above, there are quite a few problems in the
|
|
|
|
g++ front end itself, and these are currently responsible for severely
|
|
|
|
restricting the progress which can be made on adding DWARF support
|
|
|
|
specifically for the g++ front-end. Furthermore, Richard Stallman has
|
|
|
|
expressed the view that C++ friendships might not be important enough to
|
|
|
|
describe (in DWARF). This view directly conflicts with both the DWARF
|
|
|
|
version 1 and version 2 (draft) specifications, so until this small
|
|
|
|
misunderstanding is cleared up, DWARF support for g++ is unlikely.
|
|
|
|
|
|
|
|
With regard to FORTRAN, the UI/PLSIG has defined what is believed to be a
|
|
|
|
complete and sufficient set of codes and rules for adequately representing
|
|
|
|
all of FORTRAN 77, and most of Fortran 90 in DWARF. While some support for
|
|
|
|
this has been implemented in dwarfout.c, further implementation and testing
|
|
|
|
will have to await the arrival of the GNU Fortran front-end (which is
|
|
|
|
currently in early alpha test as of this writing).
|
|
|
|
|
|
|
|
GNU DWARF support for other languages (e.g. Pascal and Modula) is a moot
|
|
|
|
issue until there are GNU front-ends for these other languages.
|
|
|
|
|
|
|
|
GNU DWARF support for DWARF version 2 will probably not be attempted until
|
|
|
|
such time as the version 2 specification is finalized. (More work needs
|
|
|
|
to be done on the version 2 specification to make the new "abbreviations"
|
|
|
|
feature of version 2 more easily implementable. Until then, it will be
|
|
|
|
a royal pain in the ass to implement version 2 "abbreviations".) For the
|
|
|
|
time being, version 2 features will be added (in a version 1 compatible
|
|
|
|
manner) when and where these features seem necessary or extremely desirable.
|
|
|
|
|
|
|
|
As currently defined, DWARF only describes a (binary) language which can
|
|
|
|
be used to communicate symbolic debugging information from a compiler
|
|
|
|
through an assembler and a linker, to a debugger. There is no clear
|
|
|
|
specification of what processing should be (or must be) done by the
|
|
|
|
assembler and/or the linker. Fortunately, the role of the assembler
|
|
|
|
is easily inferred (by anyone knowledgeable about assemblers) just by
|
|
|
|
looking at examples of assembly-level DWARF code. Sadly though, the
|
|
|
|
allowable (or required) processing steps performed by a linker are
|
|
|
|
harder to infer and (perhaps) even harder to agree upon. There are
|
|
|
|
several forms of very useful `post-processing' steps which intelligent
|
|
|
|
linkers *could* (in theory) perform on object files containing DWARF,
|
|
|
|
but any and all such link-time transformations are currently both disallowed
|
|
|
|
and unspecified.
|
|
|
|
|
|
|
|
In particular, possible link-time transformations of DWARF code which could
|
|
|
|
provide significant benefits include (but are not limited to):
|
|
|
|
|
|
|
|
Commonization of duplicate DIEs obtained from multiple input
|
|
|
|
(object) files.
|
|
|
|
|
|
|
|
Cross-compilation type checking based upon DWARF type information
|
|
|
|
for objects and functions.
|
|
|
|
|
|
|
|
Other possible `compacting' transformations designed to save disk
|
|
|
|
space and to reduce linker & debugger I/O activity.
|