The rationale for this patch comes from the ROCm port [1], the goal
being to reduce the number of back and forths between GDB and the
target when doing successive operations. I'll start with explaining
the rationale and then go over the implementation. In the ROCm / GPU
world, the term "wave" is somewhat equivalent to a "thread" in GDB.
So if you read if from a GPU stand point, just s/thread/wave/.
ROCdbgapi, the library used by GDB [2] to communicate with the GPU
target, gives the illusion that it's possible for the debugger to
control (start and stop) individual threads. But in reality, this is
not how it works. Under the hood, all threads of a queue are
controlled as a group. To stop one thread in a group of running ones,
the state of all threads is retrieved from the GPU, all threads are
destroyed, and all threads but the one we want to stop are re-created
from the saved state. The net result, from the point of view of GDB,
is that the library stopped one thread. The same thing goes if we
want to resume one thread while others are running: the state of all
running threads is retrieved from the GPU, they are all destroyed, and
they are all re-created, including the thread we want to resume.
This leads to some inefficiencies when combined with how GDB works,
here are two examples:
- Stopping all threads: because the target operates in non-stop mode,
when the user interface mode is all-stop, GDB must stop all threads
individually when presenting a stop. Let's suppose we have 1000
threads and the user does ^C. GDB asks the target to stop one
thread. Behind the scenes, the library retrieves 1000 thread
states and restores the 999 others still running ones. GDB asks
the target to stop another one. The target retrieves 999 thread
states and restores the 998 remaining ones. That means that to
stop 1000 threads, we did 1000 back and forths with the GPU. It
would have been much better to just retrieve the states once and
stop there.
- Resuming with pending events: suppose the 1000 threads hit a
breakpoint at the same time. The breakpoint is conditional and
evaluates to true for the first thread, to false for all others.
GDB pulls one event (for the first thread) from the target, decides
that it should present a stop, so stops all threads using
stop_all_threads. All these other threads have a breakpoint event
to report, which is saved in `thread_info::suspend::waitstatus` for
later. When the user does "continue", GDB resumes that one thread
that did hit the breakpoint. It then processes the pending events
one by one as if they just arrived. It picks one, evaluates the
condition to false, and resumes the thread. It picks another one,
evaluates the condition to false, and resumes the thread. And so
on. In between each resumption, there is a full state retrieval
and re-creation. It would be much nicer if we could wait a little
bit before sending those threads on the GPU, until it processed all
those pending events.
To address this kind of performance issue, ROCdbgapi has a concept
called "forward progress required", which is a boolean state that
allows its user (i.e. GDB) to say "I'm doing a bunch of operations,
you can hold off putting the threads on the GPU until I'm done" (the
"forward progress not required" state). Turning forward progress back
on indicates to the library that all threads that are supposed to be
running should now be really running on the GPU.
It turns out that GDB has a similar concept, though not as general,
commit_resume. One difference is that commit_resume is not stateful:
the target can't look up "does the core need me to schedule resumed
threads for execution right now". It is also specifically linked to
the resume method, it is not used in other contexts. The target
accumulates resumption requests through target_ops::resume calls, and
then commits those resumptions when target_ops::commit_resume is
called. The target has no way to check if it's ok to leave resumed
threads stopped in other target methods.
To bridge the gap, this patch generalizes the commit_resume concept in
GDB to match the forward progress concept of ROCdbgapi. The current
name (commit_resume) can be interpreted as "commit the previous resume
calls". I renamed the concept to "commit_resumed", as in "commit the
threads that are resumed".
In the new version, we have two things:
- the commit_resumed_state field in process_stratum_target: indicates
whether GDB requires target stacks using this target to have
resumed threads committed to the execution target/device. If
false, an execution target is allowed to leave resumed threads
un-committed at the end of whatever method it is executing.
- the commit_resumed target method: called when commit_resumed_state
transitions from false to true. While commit_resumed_state was
false, the target may have left some resumed threads un-committed.
This method being called tells it that it should commit them back
to the execution device.
Let's take the "Stopping all threads" scenario from above and see how
it would work with the ROCm target with this change. Before stopping
all threads, GDB would set the target's commit_resumed_state field to
false. It would then ask the target to stop the first thread. The
target would retrieve all threads' state from the GPU and mark that
one as stopped. Since commit_resumed_state is false, it leaves all
the other threads (still resumed) stopped. GDB would then proceed to
call target_stop for all the other threads. Since resumed threads are
not committed, this doesn't do any back and forth with the GPU.
To simplify the implementation of targets, this patch makes it so that
when calling certain target methods, the contract between the core and
the targets guarantees that commit_resumed_state is false. This way,
the target doesn't need two paths, one for commit_resumed_state ==
true and one for commit_resumed_state == false. It can just assert
that commit_resumed_state is false and work with that assumption.
This also helps catch places where we forgot to disable
commit_resumed_state before calling the method, which represents a
probable optimization opportunity. The commit adds assertions in the
target method wrappers (target_resume and friends) to have some
confidence that this contract between the core and the targets is
respected.
The scoped_disable_commit_resumed type is used to disable the commit
resumed state of all process targets on construction, and selectively
re-enable it on destruction (see below for criteria). Note that it
only sets the process_stratum_target::commit_resumed_state flag. A
subsequent call to maybe_call_commit_resumed_all_targets is necessary
to call the commit_resumed method on all target stacks with process
targets that got their commit_resumed_state flag turned back on. This
separation is because we don't want to call the commit_resumed methods
in scoped_disable_commit_resumed's destructor, as they may throw.
On destruction, commit-resumed is not re-enabled for a given target
if:
1. this target has no threads resumed, or
2. this target has at least one resumed thread with a pending status
known to the core (saved in thread_info::suspend::waitstatus).
The first point is not technically necessary, because a proper
commit_resumed implementation would be a no-op if the target has no
resumed threads. But since we have a flag do to a quick check, it
shouldn't hurt.
The second point is more important: together with the
scoped_disable_commit_resumed instance added in fetch_inferior_event,
it makes it so the "Resuming with pending events" described above is
handled efficiently. Here's what happens in that case:
1. The user types "continue".
2. Upon destruction, the scoped_disable_commit_resumed in the
`proceed` function does not enable commit-resumed, as it sees some
threads have pending statuses.
3. fetch_inferior_event is called to handle another event, the
breakpoint hit evaluates to false, and that thread is resumed.
Because there are still more threads with pending statuses, the
destructor of scoped_disable_commit_resumed in
fetch_inferior_event still doesn't enable commit-resumed.
4. Rinse and repeat step 3, until the last pending status is handled
by fetch_inferior_event. In that case,
scoped_disable_commit_resumed's destructor sees there are no more
threads with pending statues, so it asks the target to commit
resumed threads.
This allows us to avoid all unnecessary back and forths, there is a
single commit_resumed call once all pending statuses are processed.
This change required remote_target::remote_stop_ns to learn how to
handle stopping threads that were resumed but pending vCont. The
simplest example where that happens is when using the remote target in
all-stop, but with "maint set target-non-stop on", to force it to
operate in non-stop mode under the hood. If two threads hit a
breakpoint at the same time, GDB will receive two stop replies. It
will present the stop for one thread and save the other one in
thread_info::suspend::waitstatus.
Before this patch, when doing "continue", GDB first resumes the thread
without a pending status:
Sending packet: $vCont;c:p172651.172676#f3
It then consumes the pending status in the next fetch_inferior_event
call:
[infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
[infrun] target_wait (-1.0.0, status) =
[infrun] 1517137.1517137.0 [Thread 1517137.1517137],
[infrun] status->kind = stopped, signal = GDB_SIGNAL_TRAP
It then realizes it needs to stop all threads to present the stop, so
stops the thread it just resumed:
[infrun] stop_all_threads: Thread 1517137.1517137 not executing
[infrun] stop_all_threads: Thread 1517137.1517174 executing, need stop
remote_stop called
Sending packet: $vCont;t:p172651.172676#04
This is an unnecessary resume/stop. With this patch, we don't commit
resumed threads after proceeding, because of the pending status:
[infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus
When GDB handles the pending status and stop_all_threads runs, we stop a
resumed but pending vCont thread:
remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)
That thread was never actually resumed on the remote stub / gdbserver,
so we shouldn't send a packet to the remote side asking to stop the
thread.
Note that there are paths that resume the target and then do a
synchronous blocking wait, in sort of nested event loop, via
wait_sync_command_done. For example, inferior function calls, or any
run control command issued from a breakpoint command list. We handle
that making wait_sync_command_one a "sync" point -- force forward
progress, or IOW, force-enable commit-resumed state.
gdb/ChangeLog:
yyyy-mm-dd Simon Marchi <simon.marchi@efficios.com>
Pedro Alves <pedro@palves.net>
* infcmd.c (run_command_1, attach_command, detach_command)
(interrupt_target_1): Use scoped_disable_commit_resumed.
* infrun.c (do_target_resume): Remove
target_commit_resume call.
(commit_resume_all_targets): Remove.
(maybe_set_commit_resumed_all_targets): New.
(maybe_call_commit_resumed_all_targets): New.
(enable_commit_resumed): New.
(scoped_disable_commit_resumed::scoped_disable_commit_resumed)
(scoped_disable_commit_resumed::~scoped_disable_commit_resumed)
(scoped_disable_commit_resumed::reset)
(scoped_disable_commit_resumed::reset_and_commit)
(scoped_enable_commit_resumed::scoped_enable_commit_resumed)
(scoped_enable_commit_resumed::~scoped_enable_commit_resumed):
New.
(proceed): Use scoped_disable_commit_resumed and
maybe_call_commit_resumed_all_targets.
(fetch_inferior_event): Use scoped_disable_commit_resumed.
* infrun.h (struct scoped_disable_commit_resumed): New.
(maybe_call_commit_resumed_all_process_targets): New.
(struct scoped_enable_commit_resumed): New.
* mi/mi-main.c (exec_continue): Use scoped_disable_commit_resumed.
* process-stratum-target.h (class process_stratum_target):
<commit_resumed_state>: New.
* record-full.c (record_full_wait_1): Change commit_resumed_state
around calling commit_resumed.
* remote.c (class remote_target) <commit_resume>: Rename to...
<commit_resumed>: ... this.
(struct stop_reply): Move up.
(remote_target::commit_resume): Rename to...
(remote_target::commit_resumed): ... this. Check if there is any
thread pending vCont resume.
(remote_target::remote_stop_ns): Generate stop replies for resumed
but pending vCont threads.
(remote_target::wait_ns): Add gdb_assert.
* target-delegates.c: Regenerate.
* target.c (target_wait, target_resume): Assert that the current
process_stratum target isn't in commit-resumed state.
(defer_target_commit_resume): Remove.
(target_commit_resume): Remove.
(target_commit_resumed): New.
(make_scoped_defer_target_commit_resume): Remove.
(target_stop): Assert that the current process_stratum target
isn't in commit-resumed state.
* target.h (struct target_ops) <commit_resume>: Rename to ...
<commit_resumed>: ... this.
(target_commit_resume): Remove.
(target_commit_resumed): New.
(make_scoped_defer_target_commit_resume): Remove.
* top.c (wait_sync_command_done): Use
scoped_enable_commit_resumed.
[1] https://github.com/ROCm-Developer-Tools/ROCgdb/
[2] https://github.com/ROCm-Developer-Tools/ROCdbgapi
Change-Id: I836135531a29214b21695736deb0a81acf8cf566
gdb.base/maint-target-async-off.exp fails if you test against
gdbserver with "maint set target-non-stop on" forced.
(gdb) run
Starting program: build/gdb/testsuite/outputs/gdb.base/maint-target-async-off/maint-target-async-off
Breakpoint 1, main () at src/gdb/testsuite/gdb.base/maint-target-async-off.c:21
21 return 0;
(gdb) FAIL: gdb.base/maint-target-async-off.exp: continue until exit (timeout)
Above, GDB just stopped listening to stdin.
Basically, GDB assumes that a target working in non-stop mode
operation also supports async mode; it's a requirement. GDB
misbehaves badly otherwise, and even hits failed assertions.
Fix this by making target_is_non_stop_p return false if async is off.
gdb/ChangeLog:
* target.c (target_always_non_stop_p): Also check whether the
target can async.
Change-Id: I7e52e1061396a5b9b02ada462f68a14b76d68974
I noticed a spot in the DWARF reader using "per_objfile->per_bfd",
where a local per_bfd variable had already been created. Looking
through the file, I found a number of such spots. This patch changes
them to use the already-existing local, avoiding a bit of excess
pointer chasing.
gdb/ChangeLog
2021-03-26 Tom Tromey <tom@tromey.com>
* dwarf2/read.c (dwarf2_read_debug_names)
(dwarf2_build_psymtabs_hard, create_addrmap_from_aranges)
(dw2_debug_names_iterator::next, create_type_unit_group):
Simplify.
This commit resolves the remaining duplicate test names in
gdb.cp/*.exp. These are all the easy duplicates, I'm either giving
tests a new, unique name, extending an existing name to make it
unique, or changing an existing name to better reflect what the test
is actually doing, and thus, making this test name unique.
There should be no change in what is tested after this commit.
gdb/testsuite/ChangeLog:
* gdb.cp/breakpoint.exp: Extend test names to make them unique.
* gdb.cp/casts.exp: Give tests unique names.
* gdb.cp/filename.exp: Likewise.
* gdb.cp/gdb2495.exp: Likewise.
* gdb.cp/mb-ctor.exp: Extend test names to make them unique.
* gdb.cp/misc.exp: Rename test to make it unique.
* gdb.cp/nsnested.exp: Give tests unique names.
* gdb.cp/ovldbreak.exp: Likewise.
* gdb.cp/pr17494.exp: Rename test to reflect what is actually
being tested. This also removes the duplicate test name.
* gdb.cp/ref-types.exp: Likewise.
* gdb.cp/temargs.exp: Likewise.
While resolving duplicate test names I spotted that a test in
gdb.cp/cplusfuncs.exp included an unescaped '[]'. In TCL square
brackets enclose expressions to evaluate, and so in this case, where
there is no enclosed expression, this just evaluates to the empty
string.
This clearly was not what the test intended, so in this commit I have
escaped the square brackets. This has extended the test coverage.
gdb/testsuite/ChangeLog:
* gdb.cp/cplusfuncs.exp (test_paddr_operator_functions): Escape
square brackets in test.
I wanted to remove the duplicate test name from gdb.cp/maint.exp. In
this test we run some checks against different operator names. For
one operator we test with a variable number of spaces. However, we
were accidentally testing the one space version twice, and the zero
space version not at all, leading to a duplicate test name.
I could have just changed the duplicate one space version into the
missing zero space version, but I thought it would be neater to wrap
multiple tests in a loop, and check all operators with either zero,
one, or two spaces.
These tests are super quick so take almost no extra time, and this
gives marginally more test coverage.
gdb/testsuite/ChangeLog:
* gdb.cp/maint.exp (test_first_component): Run more tests with a
variable number of spaces, this removes the duplicate testing of
'operator ->' which existed before.
The test gdb.cp/gdb2384.exp contains some duplicate test names, and
also some test names with a string inside parentheses at the end. In
order to resolve the duplicates the obvious choice would be to add yet
more strings inside parentheses at the end of names, however, this is
discouraged in our test naming scheme.
The string in parentheses originates from a comment in the test source
code, which naturally leads to including this comment in the test
name.
In this commit I have changed the comment in the test source to remove
the string in parentheses, I then rename the tests in the .exp script
to match, making sure that all test names are unique.
There should be no change in test coverage after this commit.
gdb/testsuite/ChangeLog:
* gdb.cp/gdb2384.cc (main): Change comments used for breakpoints.
* gdb.cp/gdb2384.exp: Change and extend test names to avoid
duplicates, and also to avoid having a string inside parentheses
at the end of test names.
In trying to resolve the duplicate test names for the
gdb.cp/nsusing.exp script, I ended up giving the test script a serious
spring clean.
This reverts some of the changes introduced in commit df83a9bf8b,
but I don't think that we have lost any testing.
The test program is made of many functions, the test script wants to
stop in different functions and check which symbols are in scope.
Previously the test script would either restart GDB completely in
order to "progress" to the next function, or the script would restart
the test program using 'runto'.
In this commit I have reordered the steps of the test to correspond to
program order, I then progress through the test program once by just
placing a breakpoint and then continuing. As I said, the test is
checking which symbols are in scope at each location, so the exact
order of the tests doesn't matter, so long as we check the correct
symbols at each location.
I have also given the comments capital letters and full stops, and
re-wrapped them to a more sensible line length.
There was a duplicate test block introduced in the df83a9bf8b
commit which I have removed in this commit, this duplicate code was
responsible for one of the duplicate test names.
The other duplicate test name was due to the same command being run at
different locations, in this case I just gave the two tests explicit,
unique, names.
gdb/testsuite/ChangeLog:
* gdb.cp/nsusing.exp: Rewrite test, remove a duplicate test block.
Avoid repeated uses of 'runto', and instread just progress once
through the test stopping at different breakpoints. Give comments
a capital letter and full stop. Give duplicate tests unique names.
While all of MMX, SSE, and SSE2 are included in "generic64", they can be
individually disabled. There are two MOVQ forms lacking respective
attributes. While the MMX one would get refused anyway (due to MMX
registers not recognized with .nommx), the assembler did happily accept
the SSE2 form. Add respective CPU settings to both, paralleling what the
MOVD counterparts have.
When testing with "maint set target-non-stop on",
gdb.server/bkpt-other-inferior.exp sometimes fails like so:
(gdb) inferior 2
[Switching to inferior 2 [process 368191] (<noexec>)]
[Switching to thread 2.1 (Thread 368191.368191)]
[remote] Sending packet: $m7ffff7fd0100,1#5b
[remote] Packet received: 48
[remote] Sending packet: $m7ffff7fd0100,1#5b
[remote] Packet received: 48
[remote] Sending packet: $m7ffff7fd0100,9#63
[remote] Packet received: 4889e7e8e80c000049
#0 0x00007ffff7fd0100 in ?? ()
(gdb) PASS: gdb.server/bkpt-other-inferior.exp: inf 2: switch to inferior
break -q main
Breakpoint 2 at 0x1138: file /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.server/server.c, line 21.
(gdb) PASS: gdb.server/bkpt-other-inferior.exp: inf 2: set breakpoint
delete breakpoints
Delete all breakpoints? (y or n) y
(gdb) [remote] wait: enter
[remote] wait: exit
FAIL: gdb.server/bkpt-other-inferior.exp: inf 2: delete all breakpoints in delete_breakpoints (timeout)
ERROR: breakpoints not deleted
Remote debugging from host ::1, port 55876
monitor exit
The problem is here:
(gdb) [remote] wait: enter
The testcase isn't expecting any output after the prompt.
Why is that "[remote] wait" output? What happens is that "delete
breakpoints" queries the user, and `query` disables/reenables target
async, which results in the remote target's async event handler ending
up marked:
(top-gdb) bt
#0 mark_async_event_handler (async_handler_ptr=0x556bffffffff) at ../../src/gdb/async-event.c:295
#1 0x0000556bf71b711f in infrun_async (enable=1) at ../../src/gdb/infrun.c:119
#2 0x0000556bf7471387 in target_async (enable=1) at ../../src/gdb/target.c:3684
#3 0x0000556bf748a0bd in gdb_readline_wrapper_cleanup::~gdb_readline_wrapper_cleanup (this=0x7ffe3cf30eb0, __in_chrg=<optimized out>) at ../../src/gdb/top.c:1074
#4 0x0000556bf74874e2 in gdb_readline_wrapper (prompt=0x556bfa17da60 "Delete all breakpoints? (y or n) ") at ../../src/gdb/top.c:1096
#5 0x0000556bf75111c5 in defaulted_query(const char *, char, typedef __va_list_tag __va_list_tag *) (ctlstr=0x556bf7717f34 "Delete all breakpoints? ", defchar=0 '\000', args=0x7ffe3cf31020) at ../../src/gdb/utils.c:893
#6 0x0000556bf751166f in query (ctlstr=0x556bf7717f34 "Delete all breakpoints? ") at ../../src/gdb/utils.c:985
#7 0x0000556bf6f11404 in delete_command (arg=0x0, from_tty=1) at ../../src/gdb/breakpoint.c:13500
...
... which then later results in a target_wait call:
(top-gdb) bt
#0 remote_target::wait_ns (this=0x7ffe3cf30f80, ptid=..., status=0xde530314f0802800, options=...) at ../../src/gdb/remote.c:7937
#1 0x0000556bf7369dcb in remote_target::wait (this=0x556bfa0b2180, ptid=..., status=0x7ffe3cf31568, options=...) at ../../src/gdb/remote.c:8173
#2 0x0000556bf745e527 in target_wait (ptid=..., status=0x7ffe3cf31568, options=...) at ../../src/gdb/target.c:2000
#3 0x0000556bf71be686 in do_target_wait_1 (inf=0x556bfa1573d0, ptid=..., status=0x7ffe3cf31568, options=...) at ../../src/gdb/infrun.c:3463
#4 0x0000556bf71be88b in <lambda(inferior*)>::operator()(inferior *) const (__closure=0x7ffe3cf31320, inf=0x556bfa1573d0) at ../../src/gdb/infrun.c:3526
#5 0x0000556bf71bebcd in do_target_wait (wait_ptid=..., ecs=0x7ffe3cf31540, options=...) at ../../src/gdb/infrun.c:3539
#6 0x0000556bf71bf97b in fetch_inferior_event () at ../../src/gdb/infrun.c:3879
#7 0x0000556bf71a27f8 in inferior_event_handler (event_type=INF_REG_EVENT) at ../../src/gdb/inf-loop.c:42
#8 0x0000556bf71cc8b7 in infrun_async_inferior_event_handler (data=0x0) at ../../src/gdb/infrun.c:9220
#9 0x0000556bf6ecb80f in check_async_event_handlers () at ../../src/gdb/async-event.c:327
#10 0x0000556bf76b011a in gdb_do_one_event () at ../../src/gdbsupport/event-loop.cc:216
...
... which returns TARGET_WAITKIND_IGNORE.
Fix this by only enabling remote output around setting the breakpoint.
gdb/testsuite/ChangeLog:
* gdb.server/bkpt-other-inferior.exp: Only enable remote output
around setting the breakpoint.
Change-Id: I2fd152fd9c46b1c5e7fa678cc4d4054dac0b2bd4
Running gdb.server/stop-reply-no-thread-multi.exp with "maint set
target-non-stop on" occasionally hit an internal error like this:
...
continue
Continuing.
warning: multi-threaded target stopped without sending a thread-id, using first non-exited thread
/home/pedro/gdb/binutils-gdb/src/gdb/inferior.c:291: internal-error: inferior* find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
This is a bug, please report it.
FAIL: gdb.server/stop-reply-no-thread-multi.exp: to_disable=Tthread: continue until exit (GDB internal error)
The backtrace looks like this:
...
#5 0x0000560357b0879c in internal_error (file=0x560357be6c18 "/home/pedro/gdb/binutils-gdb/src/gdb/inferior.c", line=291, fmt=0x560357be6b21 "%s: Assertion `%s' failed.") at /home/pedro/gdb/binutils-gdb/src/gdbsupport/errors.cc:55
#6 0x000056035762061b in find_inferior_pid (targ=0x5603596e9560, pid=0) at /home/pedro/gdb/binutils-gdb/src/gdb/inferior.c:291
#7 0x00005603576206e6 in find_inferior_ptid (targ=0x5603596e9560, ptid=...) at /home/pedro/gdb/binutils-gdb/src/gdb/inferior.c:305
#8 0x00005603577d43ed in remote_target::check_pending_events_prevent_wildcard_vcont (this=0x5603596e9560, may_global_wildcard=0x7fff84fb05f0) at /home/pedro/gdb/binutils-gdb/src/gdb/remote.c:7215
#9 0x00005603577d2a9c in remote_target::commit_resumed (this=0x5603596e9560) at /home/pedro/gdb/binutils-gdb/src/gdb/remote.c:6680
...
pid is 0 in this case because the queued event is a process exit event
with no pid associated:
(top-gdb) p event->ws
During symbol reading: .debug_line address at offset 0x563c9a is 0 [in module /home/pedro/gdb/binutils-gdb/build/gdb/gdb]
$1 = {kind = TARGET_WAITKIND_EXITED, value = {integer = 0, sig = GDB_SIGNAL_0, related_pid = {m_pid = 0, m_lwp = 0, m_tid = 0}, execd_pathname = 0x0, syscall_number = 0}}
(top-gdb)
This fixes it, and adds a "maint set target-non-stop on/off" axis to the testcase.
gdb/ChangeLog:
* remote.c
(remote_target::check_pending_events_prevent_wildcard_vcont):
Check whether the event's ptid is not null_ptid before looking up
the corresponding inferior.
gdb/testsuite/ChangeLog:
* gdb.server/stop-reply-no-thread-multi.exp (run_test): Add
"target_non_stop" parameter and use it.
(top level): Add "maint set target-non-stop on/off" testing axis.
Change-Id: Ia30cf275305ee4dcbbd33f731534cd71d1550eaa
The data object and function info sections (collectively "symtypetabs")
usually (i.e. if non-indexed) have sizes defined by the size of the ELF
dynamic symbol table in the object they are linked to. This means test
results should not depend on the exact sizes of these sections, because
adding entirely irrelevant symbols to the dynsym can cause spurious test
failures. (This also means we should not match the offset of sections
that follow them, since those too depend on the exact size of the
symtypetab sections.)
Spotted by turning the sanitizer on, which introduced new dynsym entries
and expanded the symtypetab sizes to match.
ld/ChangeLog
2021-03-25 Nick Alcock <nick.alcock@oracle.com>
* testsuite/ld-ctf/array.d: Only check that the data object
section is nonempty: do not check its exact size.
* testsuite/ld-ctf/diag-parlabel.d: Likewise.
* testsuite/ld-ctf/slice.d: Likewise.
* testsuite/ld-ctf/data-func-conflicted.d: Likewise, and for the
func info section too.
* testsuite/ld-ctf/function.d: Likewise, for the func info section.
The address sanitizer contains a redirector that captures dlopen calls,
so checks for dlopen with AC_SEARCH_LIBS will always conclude that
dlopen is present when the sanitizer is on. This means it won't add
-ldl to LIBS even if needed, and the immediately-following attempt to
actually link with -lbfd will fail because libbfd also needs dlsym,
which ASAN does *not* contain a redirector for.
If we check for dlsym instead of dlopen, the check works whether ASAN is
on or off. (bfd uses both in close proximity: if it needs one, it will
always need the other.)
libctf/ChangeLog
2021-03-25 Nick Alcock <nick.alcock@oracle.com>
* configure.ac: Check for dlsym, not dlopen.
* configure: Regenerate.
Harmless, but causes noise that makes it harder to spot other leaks.
libctf/ChangeLog
2021-03-25 Nick Alcock <nick.alcock@oracle.com>
* testsuite/libctf-writable/symtypetab-nonlinker-writeout.c: Don't
leak buf.
isqualifier, which is used by ctf_lookup_by_name to figure out if a
given word in a type name is a qualifier, takes the address of a
possibly out-of-bounds location before checking its bounds.
In any reasonable compiler this will just lead to a harmless address
computation that is then discarded if out-of-bounds, but it's still
undefined behaviour and the sanitizer rightly complains.
libctf/ChangeLog
2021-03-25 Nick Alcock <nick.alcock@oracle.com>
PR libctf/27628
* ctf-lookup.c (isqualifier): Don't dereference out-of-bounds
qhash values.
This makes it possible to use LIBCTF_DEBUG to debug things that happen
before the ctf_bfdopen_internal call that ctf_bfdopen_ctfsect eventually
thunks down to (symtab/strtab lookup, archive opening, etc).
This is not important for ctf_open callers, since ctf_fdopen already
calls libctf_init_debug, but ctf_bfdopen_ctfsect is a public entry point
that can be called directly (e.g. objdump and readelf both do so).
libctf/ChangeLog
2021-03-25 Nick Alcock <nick.alcock@oracle.com>
* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Initialize debugging.
Every place that accesses a function's dtd_vlen accesses it only if the
number of args is nonzero, except the serializer, which always tries to
memcpy it. The number of bytes it memcpys in this case is zero, but it
is still undefined behaviour to copy zero bytes from a null pointer.
So check for this case explicitly.
libctf/ChangeLog
2021-03-25 Nick Alcock <nick.alcock@oracle.com>
PR libctf/27628
* ctf-serialize.c (ctf_emit_type_sect): Allow for a NULL vlen in
CTF_K_FUNCTION types.
This turns into a signed left shift by 31 bits, otherwise. This is an
offset and is always treated as unsigned in any case, so add an
appropriate cast.
include/ChangeLog
2021-03-25 Nick Alcock <nick.alcock@oracle.com>
PR libctf/27628
* ctf-api.h: Fix some indentation.
(CTF_SET_STID): Always do an unsigned shift, even if STID is
signed.
When we dump normal types, we emit their size and/or alignment:
but size and alignment dumping can return errors if the type is
part of a chain that terminates in a forward.
Emitting 0xffffffff as a size or alignment is unhelpful, so simply
skip emitting this info for any type for which size or alignment
checks return an error, no matter what the error is.
libctf/ChangeLog
2021-03-25 Nick Alcock <nick.alcock@oracle.com>
* ctf-dump.c (ctf_dump_format_type): Don't emit size or alignment
on error.
I ran into a new failure in gdb.base/gdb-caching-proc.exp:
FAIL: gdb.base/gdb-caching-proc.exp: supports_memtag: initial: memory-tag check
This is a failure from the `supports_memtag` proc added recently (this
new proc is in lib/gdb.exp).
The problem here is that `supports_memtag` is hitting one of the
default error cases in gdb_test_multiple, specifically it is finding a
$gdb_prompt left unmatched from an earlier call to gdb_test_multiple.
Looking back through the test output I found that the problem is the
proc `gnat_runtime_has_debug_info` in lib/ada.exp. This proc is not
matching the trailing $gdb_prompt. This leaves the prompt in the
expect buffer, then when we run `supports_memtag` it sees the prompt
and thinks that the test completed with no output.
Fixed by making use of `-wrap` in `gnat_runtime_has_debug_info` to
ensure the trailing prompt gets matched.
gdb/testsuite/ChangeLog:
* lib/ada.exp (gnat_runtime_has_debug_info): Use -wrap with
gdb_test_multiple.
To allow breakpoints to be created at invalid addresses,
target_read_code is used instead of read_code. This was fixed in
commit:
commit c01660c625
Date: Wed Apr 17 00:31:43 2019 +0100
gdb/riscv: Allow breakpoints to be created at invalid addresses
Unfortunately, the call to read_code was left in by mistake. The
result is that GDB will fail when trying to create the breakpoint,
rather than when trying to install the breakpoint (as is the case with
other targets).
This commit fixes this mistake and removes the offending call to
read_code.
gdb/ChangeLog:
* riscv-tdep.c (riscv_breakpoint_kind_from_pc): Remove call to
read_code.
The code was checking wrong bit for sign extension. It caused it
to zero-extend instead of sign-extend the immediate value.
2021-03-25 Abid Qadeer <abidh@codesourcery.com>
opcodes/
* nios2-dis.c (nios2_print_insn_arg): Fix sign extension of
immediate in br.n instruction.
gas/
* testsuite/gas/nios2/brn.s: New.
* testsuite/gas/nios2/brn.d: New.
In match_template() i.tm hasn't been filled yet, so it is necessarily t
which needs checking. This is only a latent issue as no other templates
with the same base_opcode have an extension_opcode of 1.
For VEX-encoded ones, all three involved vector registers have to be
distinct. For EVEX-encoded ones an actual mask register has to be in use
and zeroing-masking cannot be used (violation of either will #UD).
Additionally both involved vector registers have to be distinct for
EVEX-encoded gathers.
For INVLPGB the operand count was wrong (besides %edx there's also %ecx
which is an input to the insn). In this case I see little sense in
retaining the bogus 2-operand template. Plus swapping of the operands
wasn't properly suppressed for Intel syntax.
For PVALIDATE, RMPADJUST, and RMPUPDATE bogus single operand templates
were specified. These get retained, as the address operand is the only
one really needed to expressed non-default address size, but only for
compatibility reasons. Proper multi-operand insn get introduced and the
testcases get adjusted / extended accordingly.
While at it also drop the redundant definition of __amd64__ - we already
have x86_64 defined (or not) to distinguish 64-bit and non-64-bit cases.
This is only a partial fix for PR/gas 27419, in that it limits the bad
behavior of accepting mismatched operands to just x32 mode. The full fix
would be to revert commits 27f134698a and b3a3496f83, and to address
the issue in gcc instead.
The current_top_target function is a hidden dependency on the current
inferior. Since I'd like to slowly move towards reducing our dependency
on the global current state, remove this function and make callers use
current_inferior ()->top_target ()
There is no expected change in behavior, but this one step towards
making those callers use the inferior from their context, rather than
refer to the global current inferior.
gdb/ChangeLog:
* target.h (current_top_target): Remove, make callers use the
current inferior instead.
* target.c (current_top_target): Remove.
Change-Id: Iccd457036f84466cdaa3865aa3f9339a24ea001d
I noticed that dw2_map_matching_symbols does not use its 'kind'
parameter. This patch removes it. Tested by rebuilding.
2021-03-24 Tom Tromey <tom@tromey.com>
* dwarf2/read.c (dw2_map_matching_symbols): Update.
(dw2_expand_symtabs_matching_symbol): Remove 'kind' parameter.
(check_match, dw2_expand_symtabs_matching)
(dwarf2_debug_names_index::map_matching_symbols)
(dwarf2_debug_names_index::expand_symtabs_matching): Update.
Simon pointed out an error that I made in
compile_cplus_conver_struct_or_union in my original C++ compile submission:
if (type->code () == TYPE_CODE_STRUCT)
{
const char *what = TYPE_DECLARED_CLASS (type) ? "struct" : "class";
resuld = instance->plugin ().build_decl
(what, name.get (), (GCC_CP_SYMBOL_CLASS | nested_access
| (TYPE_DECLARED_CLASS (type)
? GCC_CP_FLAG_CLASS_NOFLAG
: GCC_CP_FLAG_CLASS_IS_STRUCT)),
0, nullptr, 0, filename, line);
}
Notice that WHAT will contain "struct" for TYPE_DECLARED_CLASS. Whoops.
Fortunately this first parameter of build_decl is only used for
debugging.
gdb/ChangeLog
2021-03-24 Keith Seitz <keiths@redhat.com>
* compile/compile-cplus-types.c
(compile_cplus_convert_struct_or_union): Fix TYPE_DECLARED_CLASS
thinko.
This variable was made static in:
6bd434d6ca ("gdb: make some variables static")
But I modified gdbarch.c instead of gdbarch.sh, so the change was
later reverted when gdbarch.c was re-generated.
Do it right this time.
gdb/ChangeLog:
* gdbarch.sh (gdbarch_data_registry): Make static.
* gdbarch.c: Re-generate.
Change-Id: I4048ba99a0cf47acd9da050934965db222fbd159
Add an AArch64-specific test and a more generic memory tagging test that
other architectures can run.
Even though architectures not supporting memory tagging can run the memory
tagging tests, the runtime check will make the tests bail out early, as it
would make no sense to proceed without proper support.
It is also tricky to do any further runtime tests for memory tagging, given
we'd need to deal with tags, and those are arch-specific. Therefore the
test in gdb.base is more of a smoke test.
If an architecture wants to implement memory tagging, then it makes sense to
have tests within gdb.arch instead.
gdb/testsuite/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* gdb.arch/aarch64-mte.c: New file.
* gdb.arch/aarch64-mte.exp: New test.
* gdb.base/memtag.c: New file.
* gdb.base/memtag.exp: New test.
* lib/gdb.exp (supports_memtag): New function.
Mention the new packets and memory tagging features.
gdb/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* NEWS: Mention memory tagging changes.
Document the changes to the "print" and "x" commands to support memory
tagging.
gdb/doc/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* gdb.texinfo (Data): Document memory tagging changes to the "print"
command.
(Examining Memory): Document memory tagging changes to the "x"
command.
(Memory Tagging): Update with more information on changes to the "x"
and "print" commands.
Extend the "x" and "print" commands to make use of memory tagging
functionality, if supported by the architecture.
The "print" command will point out any possible tag mismatches it finds
when dealing with pointers, in case such a pointer is tagged. No additional
modifiers are needed.
Suppose we have a pointer "p" with value 0x1234 (logical tag 0x0) and that we
have an allocation tag of 0x1 for that particular area of memory. This is the
expected output:
(gdb) p/x p
Logical tag (0x0) does not match the allocation tag (0x1).
$1 = 0x1234
The "x" command has a new 'm' modifier that will enable displaying of
allocation tags alongside the data dump. It will display one allocation
tag per line.
AArch64 has a tag granule of 16 bytes, which means we can have one tag for
every 16 bytes of memory. In this case, this is what the "x" command will
display with the new 'm' modifier:
(gdb) x/32bxm p
<Allocation Tag 0x1 for range [0x1230,0x1240)>
0x1234: 0x01 0x02 0x00 0x00 0x00 0x00 0x00 0x00
0x123c: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
<Allocation Tag 0x1 for range [0x1240,0x1250)>
0x1244: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x124c: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) x/4gxm a
<Allocation Tag 0x1 for range [0x1230,0x1240)>
0x1234: 0x0000000000000201 0x0000000000000000
<Allocation Tag 0x1 for range [0x1240,0x1250)>
0x1244: 0x0000000000000000 0x0000000000000000
gdb/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* printcmd.c (decode_format): Handle the 'm' modifier.
(do_examine): Display allocation tags when required/supported.
(should_validate_memtags): New function.
(print_command_1): Display memory tag mismatches.
* valprint.c (show_memory_tag_violations): New function.
(value_print_option_defs): Add new option "memory-tag-violations".
(user_print_options) <memory_tag_violations>: Initialize to 1.
* valprint.h (struct format_data) <print_tags>: New field.
(value_print_options) <memory_tag_violations>: New field.
gdb/testsuite/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* gdb.base/options.exp: Adjust for new print options.
* gdb.base/with.exp: Likewise.
Document the new "memory-tag" command prefix and all of its subcommands.
gdb/doc/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* gdb.texinfo (Memory Tagging): New subsection and node.
(AArch64 Memory Tagging Extension): New subsection.
Add new commands under the "memory-tag" prefix to allow users to inspect,
modify and check memory tags in different ways.
The available subcommands are the following:
- memory-tag print-logical-tag <expression>: Prints the logical tag for a
particular address.
- memory-tag withltag <expression> <tag>: Prints the address tagged with the
logical tag <tag>.
- memory-tag print-allocation-tag <expression>: Prints the allocation tag for
a particular address.
- memory-tag setatag <expression> <length> <tags>: Sets one or more allocation
tags to the specified tags.
- memory-tag check <expression>: Checks if the logical tag in <address>
matches its allocation tag.
These commands make use of the memory tagging gdbarch methods, and are still
available, but disabled, when memory tagging is not supported by the
architecture.
I've pondered about a way to make these commands invisible when memory tagging
is not available, but given the check is at runtime (and support may come and go
based on a process' configuration), that is a bit too late in the process to
either not include the commands or get rid of them.
Ideas are welcome.
gdb/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* printcmd.c: Include gdbsupport/rsp-low.h.
(memory_tag_list): New static global.
(process_print_command_args): Factored out of
print_command_1.
(print_command_1): Use process_print_command_args.
(show_addr_not_tagged, show_memory_tagging_unsupported)
(memory_tag_command, memory_tag_print_tag_command)
(memory_tag_print_logical_tag_command)
(memory_tag_print_allocation_tag_command, parse_with_logical_tag_input)
(memory_tag_with_logical_tag_command, parse_set_allocation_tag_input)
(memory_tag_set_allocation_tag_command, memory_tag_check_command): New
functions.
(_initialize_printcmd): Add "memory-tag" prefix and subcommands.
gdbsupport/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* rsp-low.cc (fromhex, hex2bin): Move to ...
* common-utils.cc: ... here.
(fromhex) Change error message text to not be RSP-specific.
* rsp-low.h (fromhex, hex2bin): Move to ...
* common-utils.h: ... here.
This patch handles the tagged_addr_ctrl register that is exported when
generating a core file.
gdb/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* aarch64-linux-tdep.c
(aarch64_linux_iterate_over_regset_sections): Handle MTE register set.
* aarch64-linux-tdep.h (AARCH64_LINUX_SIZEOF_MTE_REGSET): Define.
Adds the AArch64-specific memory tagging support (MTE) by implementing the
required hooks and checks for GDBserver.
gdbserver/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* Makefile.in (SFILES): Add /../gdb/nat/aarch64-mte-linux-ptrace.c.
* configure.srv (aarch64*-*-linux*): Add arch/aarch64-mte-linux.o and
nat/aarch64-mte-linux-ptrace.o.
* linux-aarch64-low.cc: Include nat/aarch64-mte-linux-ptrace.h.
(class aarch64_target) <supports_memory_tagging>
<fetch_memtags, store_memtags>: New method overrides.
(aarch64_target::supports_memory_tagging)
(aarch64_target::fetch_memtags)
(aarch64_target::store_memtags): New methods.
Whenever a memory tag violation occurs, we get a SIGSEGV. Additional
information can be obtained through the siginfo data structure.
For AArch64 the Linux kernel may expose the fault address and tag
information, if we have a synchronous event. Otherwise there is
no fault address available.
The synchronous event looks like this:
--
(gdb) continue
Continuing.
Program received signal SIGSEGV, Segmentation fault
Memory tag violation while accessing address 0x0500fffff7ff8000
Allocation tag 0x1.
Logical tag 0x5
--
The asynchronous event looks like this:
--
(gdb) continue
Continuing.
Program received signal SIGSEGV, Segmentation fault
Memory tag violation
Fault address unavailable.
--
gdb/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* aarch64-linux-tdep.c
(aarch64_linux_report_signal_info): New function.
(aarch64_linux_init_abi): Register
aarch64_linux_report_signal_info as the report_signal_info hook.
* arch/aarch64-linux.h (SEGV_MTEAERR): Define.
(SEGV_MTESERR): Define.
Add some unit testing to exercise setting/getting logical tags in the
AArch64 implementation.
gdb/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* aarch64-linux-tdep.c: Include gdbsupport/selftest.h.
(aarch64_linux_ltag_tests): New function.
(_initialize_aarch64_linux_tdep): Register aarch64_linux_ltag_tests.
The Linux kernel exposes the information about MTE-protected pages via the
proc filesystem, more specifically through the smaps file.
What we're looking for is a mapping with the 'mt' flag, which tells us that
mapping was created with a PROT_MTE flag and, thus, is capable of using memory
tagging.
We already parse that file for other purposes (core file
generation/filtering), so this patch refactors the code to make the parsing
of the smaps file reusable for memory tagging.
The function linux_address_in_memtag_page uses the refactored code to allow
querying for memory tag support in a particular address, and it gets used in the
next patch.
gdb/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* linux-tdep.c (struct smaps_vmflags) <memory_tagging>: New flag
bit.
(struct smaps_data): New struct.
(decode_vmflags): Handle the 'mt' flag.
(parse_smaps_data): New function, refactored from
linux_find_memory_regions_full.
(linux_address_in_memtag_page): New function.
(linux_find_memory_regions_full): Refactor into parse_smaps_data.
* linux-tdep.h (linux_address_in_memtag_page): New prototype.
This is a quick cleanup that removes the use of fixed-length char arrays and
uses std::string instead.
gdb/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* linux-tdep.c (linux_find_memory_regions_full): Use std::string
instead of char arrays.
The patch implements the memory tagging target hooks for AArch64, so we
can handle MTE.
gdb/ChangeLog:
2021-03-24 Luis Machado <luis.machado@linaro.org>
* Makefile.in (ALL_64_TARGET_OBS): Add arch/aarch64-mte-linux.o.
(HFILES_NO_SRCDIR): Add arch/aarch64-mte-linux.h and
nat/aarch64-mte-linux-ptrace.h.
* aarch64-linux-nat.c: Include nat/aarch64-mte-linux-ptrace.h.
(aarch64_linux_nat_target) <supports_memory_tagging>: New method
override.
<fetch_memtags>: New method override.
<store_memtags>: New method override.
(aarch64_linux_nat_target::supports_memory_tagging): New method.
(aarch64_linux_nat_target::fetch_memtags): New method.
(aarch64_linux_nat_target::store_memtags): New method.
* arch/aarch64-mte-linux.c: New file.
* arch/aarch64-mte-linux.h: Include gdbsupport/common-defs.h.
(AARCH64_MTE_GRANULE_SIZE): Define.
(aarch64_memtag_type): New enum.
(aarch64_mte_get_tag_granules): New prototype.
* configure.nat (NATDEPFILES): Add nat/aarch64-mte-linux-ptrace.o.
* configure.tgt (aarch64*-*-linux*): Add arch/aarch64-mte-linux.o.
* nat/aarch64-mte-linux-ptrace.c: New file.
* nat/aarch64-mte-linux-ptrace.h: New file.