Commit Graph

179421 Commits

Author SHA1 Message Date
Nathan Sidwell
9d377c280c i386: Fix array index in expander
I noticed a compiler warning about out-of-bound access.  Fixed thusly.

	gcc/
	* config/i386/sse.md (mov<mode>): Fix operand indices.
2020-09-11 14:19:04 -07:00
Thomas Rodgers
64064678d6 libstdc++: only pull in bits/align.h if C++11 or later
libstdc++-v3/ChangeLog:

	* include/std/memory: Move #include <bits/align.h> inside C++11
	conditional includes.
2020-09-11 14:08:13 -07:00
Nathan Sidwell
f76b0f231b c++: Concepts and local externs
I discovered that we'd accept constraints on block-scope function
decls inside templates.  This fixes that.

	gcc/cp/
	* decl.c (grokfndecl): Don't attach to local extern.
2020-09-11 13:55:45 -07:00
Will Schmidt
2fda9e9bad [PATCH,rs6000] Testsuite fixup pr96139 tests
Hi,
  As reported, the recently added pr96139 tests will fail on older targets
  because the tests are missing the appropriate -mvsx or -maltivec options.
  This adds the options and clarifies the dg-require statements.

  The pr96139-c.c test needs -maltivec to work, but does not actually use
  vectors, so does not require -mvsx like the others.

  Sniff-regtested OK when specifying older targets on a power7 host.
  --target_board=unix/'{-mcpu=power4,-mcpu=power5,-mcpu=power6,-mcpu=power7,
  -mcpu=power8,-mcpu=power9}''{-m64,-m32}'"


gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/pr96139-a.c: Specify -mvsx option and update the
	dg-require stanza to match.
	* gcc.target/powerpc/pr96139-b.c: Same.
	* gcc.target/powerpc/pr96139-c.c: Specify -maltivec option and update
	the dg-require stanza to match.
2020-09-11 15:47:20 -05:00
Thomas Rodgers
2c3b1c5f95 libstdc++: Split std::align/assume_aligned to bits/align.h
We would like to be able to use std::align and std::assume_aligned
without pulling in everything in <memory>.

libstdc++-v3/ChangeLog:

	* include/Makefile.am (bits_headers): Add new header.
	* include/Makefile.in: Regenerate.
	* include/bits/align.h: New file.
	* include/std/memory (align): Move definition to bits/align.h.
	(assume_aligned): Likewise.
2020-09-11 13:13:12 -07:00
Jonathan Wakely
53ad6b1979 libstdc++: Fix chrono::__detail::ceil to work with C++11
In C++11 constexpr functions can only have a return statement, so we
need to fix __detail::ceil to make it valid in C++11. This can be done
by moving the comparison and increment into a new function, __ceil_impl,
and calling that with the result of the duration_cast.

This would mean the standard C++17 std::chrono::ceil function would make
two further calls, which would add too much overhead when not inlined.
For C++17 and later use a using-declaration to add chrono::ceil to
namespace __detail. For C++11 and C++14 define chrono::__detail::__ceil
as a C++11-compatible constexpr function template.
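
The C++11-compatible shape described above can be sketched roughly as follows. The names `ceil_impl` and `ceil_cxx11` are illustrative assumptions, not the actual libstdc++ internals; the point is only that each function body is a single return statement.

```cpp
#include <chrono>

// Hypothetical helper: one return statement, so it is a valid C++11
// constexpr function.  t is the truncated value, d the original duration;
// if truncation rounded down, bump t by one tick.
template <typename T, typename U>
constexpr T ceil_impl(const T& t, const U& d)
{
  return (t < d) ? (t + T{1}) : t;
}

// Hypothetical C++11 stand-in for std::chrono::ceil: do the duration_cast
// once and hand both values to the helper.
template <typename ToDur, typename Rep, typename Period>
constexpr ToDur ceil_cxx11(const std::chrono::duration<Rep, Period>& d)
{
  return ceil_impl(std::chrono::duration_cast<ToDur>(d), d);
}
```

With this sketch, 1500ms rounds up to 2s, while -1500ms rounds up to -1s because duration_cast already truncates toward zero.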

libstdc++-v3/ChangeLog:

	* include/std/chrono [C++17] (chrono::__detail::ceil): Add
	using declaration to make chrono::ceil available for internal
	use with a consistent name.
	(chrono::__detail::__ceil_impl): New function template.
	(chrono::__detail::ceil): Use __ceil_impl to compare and
	increment the value. Remove SFINAE constraint.
2020-09-11 19:59:11 +01:00
Sunil K Pandey
40e99ed5f4 Fix fma test case [PR97018]
These tests are written for 256-bit vectors. With -march=cascadelake,
the vector size changes to 512 bits. This doubles the number of fma
instructions and the tests fail. The fix is to explicitly disable 512-bit
vectors by passing the additional option -mno-avx512f.

Tested on x86-64.

gcc/testsuite/ChangeLog:

	PR target/97018
	* gcc.target/i386/l_fma_double_1.c: Add option -mno-avx512f.
	* gcc.target/i386/l_fma_double_2.c: Likewise.
	* gcc.target/i386/l_fma_double_3.c: Likewise.
	* gcc.target/i386/l_fma_double_4.c: Likewise.
	* gcc.target/i386/l_fma_double_5.c: Likewise.
	* gcc.target/i386/l_fma_double_6.c: Likewise.
	* gcc.target/i386/l_fma_float_1.c: Likewise.
	* gcc.target/i386/l_fma_float_2.c: Likewise.
	* gcc.target/i386/l_fma_float_3.c: Likewise.
	* gcc.target/i386/l_fma_float_4.c: Likewise.
	* gcc.target/i386/l_fma_float_5.c: Likewise.
	* gcc.target/i386/l_fma_float_6.c: Likewise.
2020-09-11 09:52:21 -07:00
Martin Sebor
f36a8168f0 Move/correct offset adjustment (PR middle-end/96903).
Resolves:
PR middle-end/96903 - bogus warning on memcpy at negative offset from array end

gcc/ChangeLog:

	PR middle-end/96903
	* builtins.c (compute_objsize): Remove incorrect offset adjustment.
	(compute_objsize): Adjust offset range here instead.

gcc/testsuite/ChangeLog:

	PR middle-end/96903
	* gcc.dg/Wstringop-overflow-42.c: Add comment.
	* gcc.dg/Wstringop-overflow-43.c: New test.
2020-09-11 09:42:29 -06:00
Nathan Sidwell
1be7bf7dab objc++: Always pop scope with method definitions [PR97015]
Syntax errors in method definition lists could leave us in a function
scope.  My recent change for block scope externs didn't like that.
This reimplements the parsing loop to finish the method definition we
started.  AFAICT the original code was attempting to provide some
error recovery.  Also while there, simply do the token peeking at the
top of the loop, rather than at the two(!) ends.

	gcc/cp/
	* parser.c (cp_parser_objc_method_definition_list): Reimplement
	loop, make sure we pop scope.
	gcc/testsuite/
	* obj-c++.dg/syntax-error-9.mm: Adjust expected errors.
2020-09-11 08:27:40 -07:00
Marek Polacek
13144466f1 c++: Remove LOOKUP_CONSTINIT.
Since we now have DECL_DECLARED_CONSTINIT_P, we no longer need
LOOKUP_CONSTINIT.

gcc/cp/ChangeLog:

	* cp-tree.h (LOOKUP_CONSTINIT): Remove.
	(LOOKUP_REWRITTEN): Adjust.
	* decl.c (duplicate_decls): Set DECL_DECLARED_CONSTINIT_P.
	(check_initializer): Use DECL_DECLARED_CONSTINIT_P instead of
	LOOKUP_CONSTINIT.
	(cp_finish_decl): Don't set DECL_DECLARED_CONSTINIT_P.  Use
	DECL_DECLARED_CONSTINIT_P instead of LOOKUP_CONSTINIT.
	(grokdeclarator): Set DECL_DECLARED_CONSTINIT_P.
	* decl2.c (grokfield): Don't handle LOOKUP_CONSTINIT.
	* parser.c (cp_parser_decomposition_declaration): Remove
	LOOKUP_CONSTINIT handling.
	(cp_parser_init_declarator): Likewise.
	* pt.c (tsubst_expr): Likewise.
	(instantiate_decl): Likewise.
	* typeck2.c (store_init_value): Use DECL_DECLARED_CONSTINIT_P instead
	of LOOKUP_CONSTINIT.
2020-09-11 11:17:03 -04:00
Jonathan Wakely
29216f56d0 libstdc++: Fix build error in <bits/regex_error.h>
libstdc++-v3/ChangeLog:

	* include/bits/regex_error.h (__throw_regex_error): Fix
	parameter declaration and use reserved attribute names.
2020-09-11 14:52:40 +01:00
Mike Crowe
e05ff30078 libstdc++: Avoid rounding errors on custom clocks in condition_variable
The fix for PR68519 in 83fd5e73b3 only
applied to condition_variable::wait_for. This problem can also apply to
condition_variable::wait_until but only if the custom clock is using a
more recent epoch so that a small enough delta can be calculated. Let's
use the newly-added chrono::__detail::ceil to fix this and also make use
of that function to simplify the previous wait_for fixes.

Also, simplify the existing test case for PR68519 a little and make its
variables local so we can add a new test case for the above problem.
Unfortunately, the test would only have started failing once sufficient
time had passed since the chrono::steady_clock epoch anyway, but it's
better than nothing.

libstdc++-v3/ChangeLog:

	* include/std/condition_variable (condition_variable::wait_until):
	Convert delta to steady_clock duration before adding to current
	steady_clock time to avoid rounding errors described in PR68519.
	(condition_variable::wait_for): Simplify calculation of absolute
	time by using chrono::__detail::ceil in both overloads.
	* testsuite/30_threads/condition_variable/members/68519.cc:
	(test_wait_for): Renamed from test01. Replace unassigned val
	variable with constant false. Reduce scope of mx and cv
	variables to just test_wait_for function.
	(test_wait_until): Add new test case.
2020-09-11 14:28:50 +01:00
Mike Crowe
f9ddb696a2 libstdc++: Avoid rounding errors in std::future::wait_* [PR 91486]
Convert the specified duration to the target clock's duration type
before adding it to the current time in
__atomic_futex_unsigned::_M_load_when_equal_for and
_M_load_when_equal_until.  This removes the risk of the timeout being
rounded down to the current time resulting in there being no wait at all
when the duration type lacks sufficient precision to hold the
steady_clock current time.

Rather than using the style of fix from PR68519, let's expose the C++17
std::chrono::ceil function as std::chrono::__detail::ceil so that it can
be used in code compiled with earlier standards versions and simplify
the fix. This was suggested by John Salmon in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91486#c5 .

This problem has become considerably less likely to trigger since I
switched the __atomic_futex_unsigned::__clock_t reference clock from
system_clock to steady_clock and added the loop, but the consequences of
triggering it have changed too.

By my calculations it takes just over 194 days from the epoch for the
current time not to be representable in a float. This means that
system_clock is always subject to the problem (with the standard 1970
epoch) whereas steady_clock with float duration only runs out of
resolution once the machine has been running for that long (assuming the Linux
implementation of CLOCK_MONOTONIC.)
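
One way to arrive at the 194-day figure, assuming an IEEE-754 single-precision float (24-bit significand) holding a count of whole seconds: every integer up to 2^24 is exactly representable, and beyond that adding one second can be lost. This is a sketch of that arithmetic, not code from the patch, and the names are hypothetical.

```cpp
// 2^24: the largest count of seconds up to which a float with a 24-bit
// significand can still represent every whole second.
constexpr double float_exact_limit_seconds = 16777216.0;

// Convert a number of seconds to days.
constexpr double seconds_to_days(double s)
{
  return s / 86400.0;
}

// True if adding one second to s is lost when the sum is stored in a float.
// The volatile store forces rounding to float precision.
inline bool one_second_is_lost(float s)
{
  volatile float next = s + 1.0f;
  return next == s;
}
```

Under these assumptions the limit works out to a little over 194 days (16777216 / 86400).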

The recently-added loop in
__atomic_futex_unsigned::_M_load_when_equal_until turns this scenario
into a busy wait.

Unfortunately the combination of both of these things means that it's
not possible to write a test case for this occurring in
_M_load_when_equal_until as it stands.

libstdc++-v3/ChangeLog:

	PR libstdc++/91486
	* include/bits/atomic_futex.h
	(__atomic_futex_unsigned::_M_load_when_equal_for)
	(__atomic_futex_unsigned::_M_load_when_equal_until): Use
	__detail::ceil to convert delta to the reference clock
	duration type to avoid resolution problems.
	* include/std/chrono (__detail::ceil): Move implementation
	of std::chrono::ceil into private namespace so that it's
	available to pre-C++17 code.
	* testsuite/30_threads/async/async.cc (test_pr91486):
	Test __atomic_futex_unsigned::_M_load_when_equal_for.
2020-09-11 14:28:50 +01:00
Mike Crowe
b9faa3301c libstdc++: Loop when futex waits against arbitrary clock
If std::future::wait_until is passed a time point measured against a
clock that is neither std::chrono::steady_clock nor
std::chrono::system_clock then the generic implementation of
__atomic_futex_unsigned::_M_load_when_equal_until is called which
calculates the timeout based on __clock_t and calls the
_M_load_when_equal_until method for that clock to perform the actual
wait.

There's no guarantee that __clock_t is running at the same speed as the
caller's clock, so if the underlying wait times out we need to
check the timeout against the caller's clock again before potentially
looping.
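
The re-check can be sketched as below. This is a simplified illustration, not the libstdc++ code: `wait_until_any_clock`, the `wait_steady` callable, and `manual_clock` are all hypothetical names, and the sketch uses a plain duration_cast, leaving rounding concerns aside.

```cpp
#include <chrono>

// Hypothetical clock for experiments: it only advances when ticks() is
// incremented, so it can run arbitrarily slower than steady_clock.
struct manual_clock {
  using rep = long long;
  using period = std::milli;
  using duration = std::chrono::duration<rep, period>;
  using time_point = std::chrono::time_point<manual_clock, duration>;
  static constexpr bool is_steady = false;
  static rep& ticks() { static rep t = 0; return t; }
  static time_point now() { return time_point(duration(ticks())); }
};

// wait_steady is a hypothetical callable: it blocks until the given
// steady_clock time point and returns true if the awaited condition was
// met, false if it timed out.
template <typename Clock, typename Duration, typename WaitSteady>
bool wait_until_any_clock(
    const std::chrono::time_point<Clock, Duration>& deadline,
    WaitSteady wait_steady)
{
  using ref_clock = std::chrono::steady_clock;
  auto now = Clock::now();
  while (now < deadline) {
    // Translate the time remaining on the caller's clock to the
    // reference clock.
    const auto remaining = deadline - now;
    const auto ref_deadline = ref_clock::now()
        + std::chrono::duration_cast<ref_clock::duration>(remaining);
    if (wait_steady(ref_deadline))
      return true;
    // The reference clock says we timed out, but the caller's clock may
    // not have reached the deadline yet, so check it again.
    now = Clock::now();
  }
  return false;
}
```

A clock that advances by only one tick per wait forces the loop to go round once per remaining tick before reporting a timeout.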

Also add two extra tests to the testsuite's async.cc:

* run test03 with steady_clock_copy, which behaves identically to
  chrono::steady_clock, but isn't chrono::steady_clock. This causes
  the overload of __atomic_futex_unsigned::_M_load_when_equal_until
  that takes an arbitrary clock to be called.

* invent test04 which uses a deliberately slow running clock in order
  to exercise the looping behaviour of
  __atomic_futex_unsigned::_M_load_when_equal_until described above.

libstdc++-v3/ChangeLog:

	* include/bits/atomic_futex.h
	(__atomic_futex_unsigned::_M_load_when_equal_until): Add
	loop on generic _Clock to check the timeout against _Clock
	again after _M_load_when_equal_until returns indicating a
	timeout.
	* testsuite/30_threads/async/async.cc: Invent slow_clock
	that runs at an eleventh of steady_clock's speed. Use it
	to test the user-supplied-clock variant of
	__atomic_futex_unsigned::_M_load_when_equal_until works
	generally with test03 and loops correctly when the timeout
	time hasn't been reached in test04.
2020-09-11 14:28:50 +01:00
Mike Crowe
87fce1923f libstdc++: Use std::chrono::steady_clock as atomic_futex reference clock
The user-visible effect of this change is that std::future::wait_for now
uses std::chrono::steady_clock to determine the timeout.  This makes it
immune to changes made to the system clock.  It also means that anyone
using their own clock types with std::future::wait_until will have the
timeout converted to std::chrono::steady_clock rather than
std::chrono::system_clock.

Now that use of both std::chrono::steady_clock and
std::chrono::system_clock are correctly supported for the wait timeout, I
believe that std::chrono::steady_clock is a better choice for the reference
clock that all other clocks are converted to since it is guaranteed to
advance steadily.  The previous behaviour of converting to
std::chrono::system_clock risks timeouts changing dramatically when the
system clock is changed.

libstdc++-v3/ChangeLog:

	* include/bits/atomic_futex.h (__atomic_futex_unsigned): Change
	__clock_t typedef to use steady_clock so that unknown clocks are
	synced to it rather than system_clock. Change existing __clock_t
	overloads of _M_load_and_test_until_impl and
	_M_load_when_equal_until to use system_clock explicitly. Remove
	comment about DR 887 since these changes address that problem as
	best as we are currently able.
2020-09-11 14:28:24 +01:00
Mike Crowe
01d412ef36 libstdc++: Support futex waiting on chrono::steady_clock directly
The user-visible effect of this change is for std::future::wait_until to
use CLOCK_MONOTONIC when passed a timeout of std::chrono::steady_clock
type.  This makes it immune to any changes made to the system clock
CLOCK_REALTIME.

Add an overload of __atomic_futex_unsigned::_M_load_and_test_until_impl
that accepts a std::chrono::steady_clock, and correctly passes this
through to __atomic_futex_unsigned_base::_M_futex_wait_until_steady
which uses CLOCK_MONOTONIC for the timeout within the futex system call.
These functions are mostly just copies of the std::chrono::system_clock
versions with small tweaks.

Prior to this commit, a std::chrono::steady_clock timeout would be converted
via std::chrono::system_clock which risks reducing or increasing the
timeout if someone changes CLOCK_REALTIME whilst the wait is happening.
(The commit immediately prior to this one increases the window of
opportunity for that from a short period during the calculation of a
relative timeout, to the entire duration of the wait.)

FUTEX_WAIT_BITSET was added in kernel v2.6.25.  If futex reports ENOSYS
to indicate that this operation is not supported then the code falls
back to using clock_gettime(2) to calculate a relative time to wait for.
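
The fallback amounts to subtracting the current time from the absolute deadline. A minimal sketch of that subtraction follows, assuming both values are POSIX `timespec`s read from the same clock; `relative_timeout` is a hypothetical helper, not a function in futex.cc.

```cpp
#include <time.h>

// Compute rt = deadline - now, borrowing from the seconds field when the
// nanoseconds go negative.  Returns false if the deadline has already
// passed, in which case there is nothing to wait for.
inline bool relative_timeout(const timespec& now, const timespec& deadline,
                             timespec& rt)
{
  rt.tv_sec = deadline.tv_sec - now.tv_sec;
  rt.tv_nsec = deadline.tv_nsec - now.tv_nsec;
  if (rt.tv_nsec < 0) {
    rt.tv_nsec += 1000000000L;
    --rt.tv_sec;
  }
  return rt.tv_sec >= 0;
}
```

In real use, `now` would come from clock_gettime(CLOCK_MONOTONIC, ...) immediately before the relative wait.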

I believe that I've added this functionality in a way that it doesn't
break ABI compatibility, but that has made it more verbose and less type
safe.  I believe that it would be better to maintain the timeout as an
instance of the correct clock type all the way down to a single
_M_futex_wait_until function with an overload for each clock.  The
current scheme of separating out the seconds and nanoseconds early risks
accidentally calling the wait function for the wrong clock.
Unfortunately, doing this would break code that compiled against the old
header.

libstdc++-v3/ChangeLog:

	* config/abi/pre/gnu.ver: Update for addition of
	__atomic_futex_unsigned_base::_M_futex_wait_until_steady.
	* include/bits/atomic_futex.h (__atomic_futex_unsigned_base):
	Add comments to clarify that _M_futex_wait_until and
	_M_load_and_test_until use CLOCK_REALTIME.
	(__atomic_futex_unsigned_base::_M_futex_wait_until_steady)
	(__atomic_futex_unsigned_base::_M_load_and_test_until_steady):
	New member functions that use CLOCK_MONOTONIC.
	(__atomic_futex_unsigned_base::_M_load_and_test_until_impl)
	(__atomic_futex_unsigned_base::_M_load_when_equal_until): Add
	overloads that accept a steady_clock time_point and use the
	new member functions.
	* src/c++11/futex.cc: Include headers required for
	clock_gettime.
	(futex_clock_monotonic_flag): New constant to tell futex to
	use CLOCK_MONOTONIC to match existing futex_clock_realtime_flag.
	(futex_clock_monotonic_unavailable): New global to store the
	result of trying to use CLOCK_MONOTONIC.
	(__atomic_futex_unsigned_base::_M_futex_wait_until_steady): Add
	new variant of _M_futex_wait_until that uses CLOCK_MONOTONIC to
	support waiting using steady_clock.
2020-09-11 14:25:00 +01:00
Mike Crowe
5bad23ceec libstdc++: Use FUTEX_CLOCK_REALTIME for futex wait
The futex system call supports waiting for an absolute time if
FUTEX_WAIT_BITSET is used rather than FUTEX_WAIT.  Doing so provides two
benefits:

1. The call to gettimeofday is not required in order to calculate a
   relative timeout.

2. If someone changes the system clock during the wait then the futex
   timeout will correctly expire earlier or later.  Currently that only
   happens if the clock is changed prior to the call to gettimeofday.

According to futex(2), support for FUTEX_CLOCK_REALTIME was added in the
v2.6.28 Linux kernel and FUTEX_WAIT_BITSET was added in v2.6.25.  To
ensure that the code still works correctly with earlier kernel versions,
an ENOSYS error from futex[1] results in the
futex_clock_realtime_unavailable flag being set.  This flag is used to
avoid the unnecessary unsupported futex call in the future and to fall
back to the previous gettimeofday and relative time implementation.

glibc applied an equivalent switch in pthread_cond_timedwait to use
FUTEX_CLOCK_REALTIME and FUTEX_WAIT_BITSET rather than FUTEX_WAIT for
glibc-2.10 back in 2009.  See
glibc:cbd8aeb836c8061c23a5e00419e0fb25a34abee7

The futex_clock_realtime_unavailable flag is accessed using
std::memory_order_relaxed to stop it becoming a bottleneck.  If the
first two calls to _M_futex_wait_until happen to occur simultaneously
then the only consequence is that both will try to use
FUTEX_CLOCK_REALTIME, both risk discovering that it doesn't work and, if
so, both set the flag.
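
The flag pattern described here can be sketched as follows. The two callables stand in for the real futex paths and are assumptions for illustration only, as are the names `absolute_wait_unavailable` and `wait_with_fallback`.

```cpp
#include <atomic>
#include <cerrno>

// Hypothetical stand-in for futex_clock_realtime_unavailable.
static std::atomic<bool> absolute_wait_unavailable{false};

// try_absolute: hypothetical callable returning 0 on success or an errno
// value on failure.  relative_fallback: hypothetical callable for the
// older relative-timeout path, returning whether the wait succeeded.
template <typename TryAbsolute, typename RelativeFallback>
bool wait_with_fallback(TryAbsolute try_absolute,
                        RelativeFallback relative_fallback)
{
  // Relaxed ordering is enough: the flag guards no other data.
  if (!absolute_wait_unavailable.load(std::memory_order_relaxed)) {
    const int err = try_absolute();
    if (err != ENOSYS)
      return err == 0;
    // Two racing threads may both get here and both store true,
    // which is harmless.
    absolute_wait_unavailable.store(true, std::memory_order_relaxed);
  }
  return relative_fallback();
}
```

Once the flag is set, later calls skip the unsupported operation and go straight to the fallback.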

[1] This is how glibc's nptl-init.c determines whether these flags are
    supported.

libstdc++-v3/ChangeLog:

	* src/c++11/futex.cc: Add new constants for required futex
	flags.  Add futex_clock_realtime_unavailable flag to store
	result of trying to use FUTEX_CLOCK_REALTIME.
	(__atomic_futex_unsigned_base::_M_futex_wait_until): Try to
	use FUTEX_WAIT_BITSET with FUTEX_CLOCK_REALTIME and only
	fall back to using gettimeofday and FUTEX_WAIT if that's not
	supported.
2020-09-11 14:24:59 +01:00
Mike Crowe
f639343dc8 libstdc++: Improve std::async test
Add tests for waiting for the future using both chrono::steady_clock and
chrono::system_clock in preparation for dealing with those clocks
properly in futex.cc.

libstdc++-v3/ChangeLog:

	* testsuite/30_threads/async/async.cc (test02): Test steady_clock
	with std::future::wait_until.
	(test03): Add new test templated on clock type waiting for future
	associated with async to resolve.
	(main): Call test03 to test both system_clock and steady_clock.
2020-09-11 14:24:59 +01:00
Christophe Lyon
55bdee9af3 libstdc++-v3/libsupc++/eh_call.cc: Avoid "set but not used" warning
When building with -fno-exceptions, bad_exception_allowed is set but
not used, causing a warning during the build.

This patch adds __attribute__((unused)) to avoid it.

2020-09-11  Torbjörn SVENSSON  <torbjorn.svensson@st.com>
	    Christophe Lyon  <christophe.lyon@linaro.org>

	libstdc++-v3/
	* libsupc++/eh_call.cc: Avoid warning with -fno-exceptions.
2020-09-11 13:00:29 +00:00
Christophe Lyon
fb00a9fc39 libstdc++-v3/libsupc++/eh_call.cc: Avoid warning with -fno-exceptions.
When building with -fno-exceptions, __throw_exception_again expands to
nothing, causing a "suggest braces around empty body in an 'if'
statement" warning.

This patch adds braces, like what was done in eh_personality.cc in svn
r193295 (git g:54ba39f599fc2f3d59fd3cd828a301ce9b731a20)

2020-09-11  Torbjörn SVENSSON  <torbjorn.svensson@st.com>
	    Christophe Lyon  <christophe.lyon@linaro.org>

	libstdc++-v3/
	* libsupc++/eh_call.cc: Avoid warning with -fno-exceptions.
2020-09-11 13:00:23 +00:00
Christophe Lyon
b32d2ea8c2 libstdc++-v3/include/bits/regex_error.h: Avoid warning with -fno-exceptions.
When building with -fno-exceptions, __GLIBCXX_THROW_OR_ABORT expands to
abort(), causing warnings:
unused parameter '__ecode'
unused parameter '__what'

This patch adds __attribute__((unused)) to avoid them.

2020-09-11  Torbjörn SVENSSON <torbjorn.svensson@st.com>
	    Christophe Lyon  <christophe.lyon@linaro.org>

	libstdc++-v3/
	* include/bits/regex_error.h: Avoid warning with -fno-exceptions.
2020-09-11 13:00:13 +00:00
Richard Biener
8d3767c302 tree-optimization/97020 - account SLP cost in loop vect again
The previous re-org made the cost of SLP vector stmts in loop
vectorization ignored.  The following rectifies this mistake.

2020-09-11  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/97020
	* tree-vect-slp.c (vect_slp_analyze_operations): Apply
	SLP costs when doing loop vectorization.
2020-09-11 13:53:34 +02:00
Andrew Stubbs
2c1d809e93 testsuite: gimplefe-44 requires exceptions
This avoids an ICE on amdgcn.

gcc/testsuite/ChangeLog:

	* gcc.dg/gimplefe-44.c: Require exceptions.
2020-09-11 12:16:36 +01:00
Andrea Corallo
4ecc0061c4 libgccjit: Add new gcc_jit_global_set_initializer entry point
gcc/jit/ChangeLog

2020-08-01  Andrea Corallo  <andrea.corallo@arm.com>

	* docs/topics/compatibility.rst (LIBGCCJIT_ABI_14): New ABI tag.
	* docs/topics/expressions.rst (gcc_jit_global_set_initializer):
	Document new entry point in section 'Global variables'.
	* jit-playback.c (global_new_decl, global_finalize_lvalue): New
	method.
	(playback::context::new_global): Make use of global_new_decl,
	global_finalize_lvalue.
	(load_blob_in_ctor): New template function in use by the
	following.
	(playback::context::new_global_initialized): New method.
	* jit-playback.h (class context): Decl 'new_global_initialized',
	'global_new_decl', 'global_finalize_lvalue'.
	(lvalue::set_initializer): Add implementation.
	* jit-recording.c (recording::memento_of_get_pointer::get_size)
	(recording::memento_of_get_type::get_size): Add implementation.
	(recording::global::write_initializer_reproducer): New function in
	use by 'recording::global::write_reproducer'.
	(recording::global::replay_into)
	(recording::global::write_to_dump)
	(recording::global::write_reproducer): Handle
	initialized case.
	* jit-recording.h (class type): Decl 'get_size' and
	'num_elements'.
	* libgccjit++.h (class lvalue): Declare new 'set_initializer'
	method.
	(class lvalue): Decl 'is_global' and 'set_initializer'.
	(class global): Decl 'write_initializer_reproducer'. Add
	'm_initializer', 'm_initializer_num_bytes' fields.  Implement
	'set_initializer'. Add a destructor to free 'm_initializer'.
	* libgccjit.c (gcc_jit_global_set_initializer): New function.
	* libgccjit.h (gcc_jit_global_set_initializer): New function
	declaration.
	* libgccjit.map (LIBGCCJIT_ABI_14): New ABI tag.

gcc/testsuite/ChangeLog

2020-08-01  Andrea Corallo  <andrea.corallo@arm.com>

	* jit.dg/all-non-failing-tests.h: Add test-blob.c.
	* jit.dg/test-global-set-initializer.c: New testcase.
2020-09-11 12:18:59 +02:00
Tom de Vries
1554556312 [libatomic] Add nvptx support
Add nvptx support to libatomic.

Given that atomic_test_and_set is not implemented for nvptx (PR96964), the
compiler translates __atomic_test_and_set falling back onto the "Failing all
else, assume a single threaded environment and simply perform the operation"
case in expand_atomic_test_and_set, so it doesn't map onto an actual atomic
operation.

Still, that counts as supported for the configure test of libatomic, so we
end up with HAVE_ATOMIC_TAS_1/2/4/8/16 == 1, and the corresponding
__atomic_test_and_set_1/2/4/8/16 in libatomic all using that non-atomic
implementation.

Fix this by adding an atomic_test_and_set expansion for nvptx, that uses
libatomic's __atomic_test_and_set_1.

This again makes the configure tests for HAVE_ATOMIC_TAS_1/2/4/8/16 fail, so
instead we use this case in tas_n.c:
...
/* If this type is smaller than word-sized, fall back to a word-sized
   compare-and-swap loop.  */
bool
SIZE(libat_test_and_set) (UTYPE *mptr, int smodel)
...
which for __atomic_test_and_set_8 uses INVERT_MASK_8.

Add INVERT_MASK_8 in libatomic_i.h, as well as MASK_8.

Tested libatomic testsuite on nvptx.

gcc/ChangeLog:

	PR target/96964
	* config/nvptx/nvptx.md (define_expand "atomic_test_and_set"): New
	expansion.

libatomic/ChangeLog:

	PR target/96898
	* configure.tgt: Add nvptx.
	* libatomic_i.h (MASK_8, INVERT_MASK_8): New macro definition.
	* config/nvptx/host-config.h: New file.
	* config/nvptx/lock.c: New file.
2020-09-11 12:06:15 +02:00
Andrew Stubbs
8ae0de5621 amdgcn: align TImode registers
This prevents execution failures caused by partially overlapping input and
output registers.  This is the same solution already used for DImode.

gcc/ChangeLog:

	* config/gcn/gcn.c (gcn_hard_regno_mode_ok): Align TImode registers.
	* config/gcn/gcn.md: Assert that TImode registers do not early clobber.
2020-09-11 10:55:32 +01:00
Richard Biener
054fc495fa improve BB vectorization dump locations
This tries to improve BB vectorization dumps by providing more
precise locations.  Currently the vect_location is simply the
very last stmt in a basic-block that has a location.  So for

double a[4], b[4];
int x[4], y[4];
void foo()
{
  a[0] = b[0]; // line 5
  a[1] = b[1];
  a[2] = b[2];
  a[3] = b[3];
  x[0] = y[0]; // line 9
  x[1] = y[1];
  x[2] = y[2];
  x[3] = y[3];
} // line 13

we show the user with -O3 -fopt-info-vec

t.c:13:1: optimized: basic block part vectorized using 16 byte vectors

while with the patch we point to both independently vectorized
opportunities:

t.c:5:8: optimized: basic block part vectorized using 16 byte vectors
t.c:9:8: optimized: basic block part vectorized using 16 byte vectors

There's the possibility that the location regresses in case the
root stmt in the SLP instance has no location.  For a SLP subgraph
with multiple entries the location also chooses one entry at random,
not sure in which case we want to dump both.

Still as the plan is to extend the basic-block vectorization
scope from single basic-block to multiple ones this is a first
step to preserve something sensible.

Implementation-wise this makes both costing and code-generation
happen on the subgraphs as analyzed.

2020-09-11  Richard Biener  <rguenther@suse.de>

	* tree-vectorizer.h (_slp_instance::location): New method.
	(vect_schedule_slp): Adjust prototype.
	* tree-vectorizer.c (vec_info::remove_stmt): Adjust
	the BB region begin if we removed the stmt it points to.
	* tree-vect-loop.c (vect_transform_loop): Adjust.
	* tree-vect-slp.c (_slp_instance::location): Implement.
	(vect_analyze_slp_instance): For BB vectorization set
	vect_location to that of the instance.
	(vect_slp_analyze_operations): Likewise.
	(vect_bb_vectorization_profitable_p): Remove wrapper.
	(vect_slp_analyze_bb_1): Remove cost check here.
	(vect_slp_region): Cost check and code generate subgraphs separately,
	report optimized locations and missed optimizations due to
	profitability for each of them.
	(vect_schedule_slp): Get the vector of SLP graph entries to
	vectorize as argument.
2020-09-11 11:29:18 +02:00
Eric Botcazou
ef4ab841d9 Fix ICE on nested packed variant record type
This is a regression present on the mainline and 10 branch: the compiler
aborts on code accessing a component of a packed record type whose type
is a packed discriminated record type with variant part.

gcc/ada/ChangeLog:
	* gcc-interface/utils.c (type_has_variable_size): New function.
	(create_field_decl): In the packed case, also force byte alignment
	when the type of the field has variable size.

gcc/testsuite/ChangeLog:
	* gnat.dg/pack27.adb: New test.
	* gnat.dg/pack27_pkg.ads: New helper.
2020-09-11 11:14:49 +02:00
Eric Botcazou
b5ffd55a61 Add missing stride entry in debug info
This adds a missing stride entry for bit-packed arrays of record types.

gcc/ada/ChangeLog:
	* gcc-interface/misc.c (get_array_bit_stride): Return TYPE_ADA_SIZE
	for record and union types.
2020-09-11 11:13:54 +02:00
Eric Botcazou
230e0dbdcb Drop GNAT encodings for fixed-point types
GDB can now deal with the DWARF representation just fine.

gcc/ada/ChangeLog:
	* gcc-interface/misc.c (gnat_get_fixed_point_type): Bail out only
	when the GNAT encodings are specifically used.
2020-09-11 11:13:16 +02:00
Eric Botcazou
7c919c12be Fix crash on array component with nonstandard index type
This is a regression present on mainline, 10 and 9 branches: the compiler
goes into an infinite recursion eventually exhausting the stack for the
declaration of a discriminated record type with an array component having
a discriminant as bound and an index type that is an enumeration type with
a non-standard representation clause.

gcc/ada/ChangeLog:
	* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Subtype>: Only
	create extra subtypes for discriminants if the RM size of the base
	type of the index type is lower than that of the index type.

gcc/testsuite/ChangeLog:
	* gnat.dg/specs/discr7.ads: New test.
2020-09-11 10:43:38 +02:00
Eric Botcazou
e898facaf3 Adjust email address 2020-09-11 10:16:17 +02:00
Eric Botcazou
a82c4c4cef Adjust email address 2020-09-11 10:12:28 +02:00
Eric Botcazou
dedf9ebc89 Adjust email address 2020-09-11 10:09:59 +02:00
Richard Biener
a9c960a3bd tree-optimization/97013 - avoid duplicate 'vectorization is not profitable'
This avoids dumping 'vectorization is not profitable' one more time
if none of the opportunities in a BB is profitable.

2020-09-11  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/97013
	* tree-vect-slp.c (vect_slp_analyze_bb_1): Remove duplicate dumping.
2020-09-11 09:01:25 +02:00
Richard Biener
563326b5e4 random vectorizer fixes
This fixes random things found when doing SLP discovery from
arbitrary sets of stmts.

2020-09-10  Richard Biener  <rguenther@suse.de>

	* tree-vect-slp.c (vect_build_slp_tree_1): Check vector
	types for all lanes are compatible.
	(vect_analyze_slp_instance): Appropriately check for stores.
	(vect_schedule_slp): Likewise.
2020-09-11 08:10:38 +02:00
Tom de Vries
5e044c673f [nvptx] Fix UB in nvptx_assemble_value
When nvptx_assemble_value is called with size == 16, this bitshift runs
into UB:
...
  val &= ((unsigned  HOST_WIDE_INT)2 << (size * BITS_PER_UNIT - 1)) - 1;
...

Fix this by checking the shift amount.
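
A guarded mask of this kind might look like the sketch below. It assumes a 64-bit host integer and is not the actual nvptx.c change; `mask_low_bytes` is a hypothetical name.

```cpp
#include <climits>
#include <cstdint>

// Keep only the low `size` bytes of val.  Shifting a 64-bit value by 64
// or more bits is undefined behaviour, so the mask is applied only when
// the shift amount is in range; for wider sizes val is left unchanged.
inline std::uint64_t mask_low_bytes(std::uint64_t val, unsigned size)
{
  const unsigned bits = size * CHAR_BIT;
  if (bits < 64)
    val &= (std::uint64_t{1} << bits) - 1;
  return val;
}
```

With size == 16 the shift is skipped entirely rather than evaluated with an out-of-range amount.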

Tested on nvptx.

gcc/ChangeLog:

	* config/nvptx/nvptx.c (nvptx_assemble_value): Fix undefined
	behaviour.
2020-09-11 07:27:56 +02:00
Tom de Vries
60e537a026 [nvptx] Fix printing of 128-bit constant (negative case)
For this code:
...
__int128 min_one = -1;
...
we currently generate:
...
.visible .global .align 8 .u64 min_one[2] = { -1, 0 };
...

Fix this in nvptx_assemble_value, such that we have instead:
...
.visible .global .align 8 .u64 min_one[2] = { -1, -1 };
...
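
The expected output follows from two's-complement sign extension: a negative value must fill the upper 64-bit word with ones. The sketch below illustrates only that expectation for a value that fits in 64 bits; it is not the nvptx_assemble_value code, and `widen_to_two_words` is a hypothetical name.

```cpp
#include <array>
#include <cstdint>

// Split a signed 64-bit value into the two 64-bit words of its 128-bit
// two's-complement representation, low word first.
inline std::array<std::int64_t, 2> widen_to_two_words(std::int64_t v)
{
  // Sign-extend into the upper word: all ones for negative values.
  const std::int64_t high = (v < 0) ? -1 : 0;
  return {{v, high}};
}
```

For -1 this gives { -1, -1 }, and for 9 it gives { 9, 0 }.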

gcc/ChangeLog:

	* config/nvptx/nvptx.c (nvptx_assemble_value): Handle negative
	__int128.

gcc/testsuite/ChangeLog:

	* gcc.target/nvptx/int128.c: New test.
2020-09-11 07:27:54 +02:00
Aaron Sawdey
848e74bea1 [PATCH][PR96791] disable POImode ld/st for memcpy
This is a (hopefully temporary) fix to PR96791. This will make
the default be -mno-block-ops-vector-pair even on power10, so we will
not hit the issue of DSE trying to truncate a POImode register. I am
still concerned it will be possible to hit this because the MMA builtins
will also generate POImode stores, but I think any example of that will
be somewhat more contrived.

gcc/ChangeLog:

	* config/rs6000/rs6000.c (rs6000_option_override_internal):
	Change default.
2020-09-10 21:13:38 -05:00
David Malcolm
b7028f060c analyzer: stricter handling of non-pure builtins [PR96798]
Amongst other things PR analyzer/96798 notes that
region_model::on_call_pre treats any builtin that hasn't been coded
yet as a no-op (albeit with an unknown return value), which is wrong
for non-pure builtins.

This patch updates that function's handling of such builtins so that it
instead conservatively assumes that any escaped/reachable regions can
be affected by the call, and implements enough handling of specific
builtins to avoid regressing the testsuite (I hope).

gcc/analyzer/ChangeLog:
	PR analyzer/96798
	* region-model-impl-calls.cc (region_model::impl_call_memcpy):
	New.
	(region_model::impl_call_strcpy): New.
	* region-model.cc (region_model::on_call_pre): Flag unhandled
	builtins that are non-pure as having unknown side-effects.
	Implement BUILT_IN_MEMCPY, BUILT_IN_MEMCPY_CHK, BUILT_IN_STRCPY,
	BUILT_IN_STRCPY_CHK, BUILT_IN_FPRINTF, BUILT_IN_FPRINTF_UNLOCKED,
	BUILT_IN_PUTC, BUILT_IN_PUTC_UNLOCKED, BUILT_IN_FPUTC,
	BUILT_IN_FPUTC_UNLOCKED, BUILT_IN_FPUTS, BUILT_IN_FPUTS_UNLOCKED,
	BUILT_IN_FWRITE, BUILT_IN_FWRITE_UNLOCKED, BUILT_IN_PRINTF,
	BUILT_IN_PRINTF_UNLOCKED, BUILT_IN_PUTCHAR,
	BUILT_IN_PUTCHAR_UNLOCKED, BUILT_IN_PUTS, BUILT_IN_PUTS_UNLOCKED,
	BUILT_IN_VFPRINTF, BUILT_IN_VPRINTF.
	* region-model.h (region_model::impl_call_memcpy): New decl.
	(region_model::impl_call_strcpy): New decl.

gcc/testsuite/ChangeLog:
	PR analyzer/96798
	* gcc.dg/analyzer/memcpy-1.c: New test.
	* gcc.dg/analyzer/strcpy-1.c: New test.
2020-09-10 21:08:09 -04:00
GCC Administrator
fdcc0283c6 Daily bump. 2020-09-11 00:16:28 +00:00
Michael Meissner
aa53f657aa PowerPC: Change cmove function return to bool.
In doing the other work for adding ISA 3.1 128-bit minimum, maximum, and
conditional move support, I noticed the two functions that process conditional
moves return 'int' instead of 'bool'.  This patch changes these functions to
return 'bool'.

gcc/
2020-09-10  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000-protos.h (rs6000_emit_cmove): Change return
	type to bool.
	(rs6000_emit_int_cmove): Change return type to bool.
	* config/rs6000/rs6000.c (rs6000_emit_cmove): Change return type
	to bool.
	(rs6000_emit_int_cmove): Change return type to bool.
2020-09-10 19:11:45 -04:00
Tom de Vries
af47a2035a [nvptx] Fix printing of 128-bit constant
Currently, for this code from c-c++-common/spec-barrier-1.c:
...
__int128 g = 9;
...
we generate:
...
// BEGIN GLOBAL VAR DEF: g
.visible .global .align 8 .u64 g[2] = { 9, 9 };
...
and consequently the test-case fails in execution.

The problem is caused by a shift in nvptx_assemble_value:
...
      val >>= part * BITS_PER_UNIT;
...
where the shift amount is equal to the number of bits in val, which is
undefined behaviour.

Fix this by detecting the situation and setting val to 0.
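
A minimal sketch of that idea, assuming a 64-bit value type; shift_right_or_zero and value_bits are hypothetical names and this is not the code from the patch:

```cpp
#include <cstdint>

// Assumed width of the value being shifted.
constexpr unsigned value_bits = 64;

// Shift VAL right by AMOUNT bits.  A shift by the full width (or more)
// is undefined behaviour in C and C++, so that case returns 0 directly.
constexpr std::uint64_t shift_right_or_zero(std::uint64_t val, unsigned amount)
{
  return amount >= value_bits ? 0 : val >> amount;
}

// With val == 9, consuming all 64 bits must leave 0 for the second word.
static_assert(shift_right_or_zero(9, 0) == 9, "no shift");
static_assert(shift_right_or_zero(9, 64) == 0, "full-width shift");
```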

Tested on nvptx.

gcc/ChangeLog:

	PR target/97004
	* config/nvptx/nvptx.c (nvptx_assemble_value): Handle shift by
	number of bits in shift operand.
2020-09-10 21:30:33 +02:00
Jakub Jelinek
a8f9b4c54c lto: Fix up lto BLOCK tree streaming
When I tried to backport recent LTO changes of mine, I ran into
FAIL: g++.dg/ubsan/align-3.C   -O2 -flto -fno-use-linker-plugin -flto-partition=none  output pattern test
FAIL: g++.dg/ubsan/align-3.C   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
regressions that don't for some reason show up on the trunk.

I've tracked it down to input_location_and_block being called recursively,
first on some UBSAN_NULL ifn call's location which needs to stream a BLOCK
that hasn't been streamed yet, which in turn needs to stream some locations
for the decls in the BLOCK.

Looking at that align-3.C testcase on the trunk, I also see the recursive
lto_output_location_1 calls as well.  First on block:
$18 = <block 0x7fffea738480>
which is in the block tree from what I can see just fine (see the end of
mail).

Now, output_function has code that should stream the BLOCK tree leafs:
  /* Output DECL_INITIAL for the function, which contains the tree of
     lexical scopes.  */
  stream_write_tree (ob, DECL_INITIAL (function), true);
  /* As we do not recurse into BLOCK_SUBBLOCKS but only BLOCK_SUPERCONTEXT
     collect block tree leafs and stream those.  */
  auto_vec<tree> block_tree_leafs;
  if (DECL_INITIAL (function))
    collect_block_tree_leafs (DECL_INITIAL (function), block_tree_leafs);
  streamer_write_uhwi (ob, block_tree_leafs.length ());
  for (unsigned i = 0; i < block_tree_leafs.length (); ++i)
    stream_write_tree (ob, block_tree_leafs[i], true);

static void
collect_block_tree_leafs (tree root, vec<tree> &leafs)
{
  for (root = BLOCK_SUBBLOCKS (root); root; root = BLOCK_CHAIN (root))
    if (! BLOCK_SUBBLOCKS (root))
      leafs.safe_push (root);
    else
      collect_block_tree_leafs (BLOCK_SUBBLOCKS (root), leafs);
}

but the problem is that it is broken: it doesn't cover all block leafs,
but only leafs with an odd depth from DECL_INITIAL (and only some of those).

The following patch fixes that, but I guess we are going to stream at that
point significantly more blocks than before (though I guess most of the time
we'd stream them later on when streaming the gimple_locations that refer to
them).
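
The difference between the two recursions can be seen on a toy model.  In this sketch the Block struct and its subblocks/chain members are hypothetical stand-ins for a BLOCK tree, BLOCK_SUBBLOCKS and BLOCK_CHAIN; the first function mirrors the walk quoted above and the second recurses on root instead:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a BLOCK node (hypothetical, not GCC's tree type).
struct Block
{
  Block *subblocks = nullptr;  // first child
  Block *chain = nullptr;      // next sibling
};

// Mirrors the quoted walk: the recursion is handed the first child,
// so the loop then starts at that child's children and skips a level.
void collect_leafs_quoted(Block *root, std::vector<Block *> &leafs)
{
  for (root = root->subblocks; root; root = root->chain)
    if (!root->subblocks)
      leafs.push_back(root);
    else
      collect_leafs_quoted(root->subblocks, leafs);
}

// Recurses on root itself, so every level of children is visited.
void collect_leafs_fixed(Block *root, std::vector<Block *> &leafs)
{
  for (root = root->subblocks; root; root = root->chain)
    if (!root->subblocks)
      leafs.push_back(root);
    else
      collect_leafs_fixed(root, leafs);
}

struct LeafCounts
{
  std::size_t quoted;
  std::size_t fixed;
};

// Sample tree: top has children a and b; a has leaf children a1 and a2.
LeafCounts count_leafs_in_sample_tree()
{
  Block top, a, b, a1, a2;
  top.subblocks = &a;
  a.chain = &b;
  a.subblocks = &a1;
  a1.chain = &a2;

  std::vector<Block *> quoted, fixed;
  collect_leafs_quoted(&top, quoted);
  collect_leafs_fixed(&top, fixed);
  assert(fixed.size() >= quoted.size());
  return LeafCounts{ quoted.size(), fixed.size() };
}
```

On this sample tree the quoted walk finds only b, while the walk that recurses on root finds a1, a2 and b.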

2020-09-10  Jakub Jelinek  <jakub@redhat.com>

	* lto-streamer-out.c (collect_block_tree_leafs): Recurse on
	root rather than BLOCK_SUBBLOCKS (root).
2020-09-10 20:55:46 +02:00
Jonathan Wakely
1d5589d11e libstdc++: Fix -Wsign-compare warnings
libstdc++-v3/ChangeLog:

	* include/bits/locale_conv.h (__do_str_codecvt, __str_codecvt_in_all):
	Add casts to compare types of the same signedness.
2020-09-10 18:57:39 +01:00
Jonathan Wakely
866c53cb2e libstdc++: Fix -Wunused-local-typedefs warning
libstdc++-v3/ChangeLog:

	* include/bits/ranges_algobase.h (__equal_fn): Remove unused
	typedef.
2020-09-10 18:57:05 +01:00
Jonathan Wakely
f903c13ce8 libstdc++: Fix macro redefinition warnings
Including <version> after <iterator> gives a warning about redefining
the __cpp_lib_array_constexpr macro. What happens is that <iterator>
sets the C++20 value, then <version> redefines it to the C++17 value,
then undefines it and defines it again to the C++20 value.

This change avoids defining it to the C++17 value when compiling C++20
or later (which also means we no longer need the #undef).

A similar warning happens for __cpp_lib_constexpr_char_traits when
including <version> after any header that includes <bits/char_traits.h>.

libstdc++-v3/ChangeLog:

	* include/std/version (__cpp_lib_array_constexpr):
	(__cpp_lib_constexpr_char_traits): Only define C++17 value when
	compiling C++17.
2020-09-10 18:51:24 +01:00
Jonathan Wakely
0943b55817 libstdc++: Fix -Wdeprecated-declarations warnings
libstdc++-v3/ChangeLog:

	* include/experimental/bits/shared_ptr.h (shared_ptr(auto_ptr&&))
	(operator=(auto_ptr&&)): Add diagnostic pragmas to suppress
	warnings for uses of std::auto_ptr.
	* include/experimental/type_traits (is_literal_type_v):
	Likewise, for use of std::is_literal_type.
	* include/std/condition_variable (condition_variable_any::_Unlock):
	Likewise, for use of std::uncaught_exception.
2020-09-10 18:48:25 +01:00
Jonathan Wakely
b6b9fd4af9 libstdc++: Fix -Wnarrowing warnings
libstdc++-v3/ChangeLog:

	* include/bits/fs_path.h (path::_List::type()): Avoid narrowing
	conversion.
	* include/std/chrono (operator+(const year&, const years&)):
	Likewise.
2020-09-10 18:47:08 +01:00
Nathan Sidwell
f9189e1088 c++: TINFO_VAR_DECLARED_CONSTINIT -> DECL_DECLARED_CONSTINIT_P
We need to record whether template function-scope static decls are
constinit.  That's currently held on the var's TEMPLATE_INFO data.
But I want to get rid of such decl's template header as they're not
really templates, and they're never instantiated separately from their
containing function's definition.  (Just like auto vars, which don't
get them for instance).

This patch moves the flag into a spare decl_lang_flag.

	gcc/cp/
	* cp-tree.h (TINFO_VAR_DECLARED_CONSTINIT): Replace with ...
	(DECL_DECLARED_CONSTINIT_P): ... this.
	* decl.c (start_decl): No need to retrofit_lang_decl for constinit
	flag.
	(cp_finish_decl): Use DECL_DECLARED_CONSTINIT_P.
	* pt.c (tsubst_decl): No need to handle constinit flag
	propagation.
	(tsubst_expr): Or here.
2020-09-10 09:37:37 -07:00