cse_process_notes did a very simple substitution, which in the wrong
circumstances could create non-canonical RTL and invalid MEMs.
Various sticking plasters have been applied to cse_process_notes_1
to handle cases like ZERO_EXTEND, SIGN_EXTEND and UNSIGNED_FLOAT,
but I think this PR is a plaster too far.
The code is trying hard to avoid creating unnecessary rtl, which of
course is a good thing. If we continue to do that, then we can end
up changing subexpressions while keeping the containing rtx.
This in turn means that validate_change will be a no-op on the
containing rtx, even if its contents have changed. So in these
cases we have to apply validate_change to the individual subexpressions.
On the other hand, if we always apply validate_change to the
individual subexpressions, we'll end up calling validate_change
on something before it has been simplified and canonicalised.
And that's one of the situations we're trying to avoid.
There might be a middle ground in which we queue the validate_changes
as part of a group, and so can cancel the pending validate_changes
for subexpressions if there's a change in the outer expression.
But that seems even more ad-hoc than the current code.
It would also be quite an invasive change.
I think the best thing is just to hook into the existing
simplify_replace_fn_rtx function, keeping the REG and MEM handling
from cse_process_notes_1 essentially unchanged. It can generate
more redundant rtl when a simplification takes place, but it has
the advantage of being relatively well-used code (both directly
and via simplify_replace_rtx).
2020-04-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR rtl-optimization/94740
* cse.c (cse_process_notes_1): Replace with...
(cse_process_note_1): ...this new function, acting as a
simplify_replace_fn_rtx callback to process_note. Handle only
REGs and MEMs directly. Validate the MEM if cse_process_note
changes its address.
(cse_process_notes): Replace with...
(cse_process_note): ...this new function.
(cse_extended_basic_block): Update accordingly, iterating over
the register notes and passing individual notes to cse_process_note.
PR 94856 is a call graph verifier error. We have a method which (in
the course of IPA-CP) loses its this pointer because it is unused and
the pass then does not clone all the this adjusting thunks and just
makes the calls go straight to the new clone - and then the verifier
complains that the edge does not seem to point to a clone of what it
used to. This looked weird because the verifier actually has logic
detecting this case but it turns out that it is confused by inliner
body-saving mechanism which invents a new decl for the base function.
Making the inliner's body-saving mechanism correctly set
former_clone_of allows us to detect this case too. Then we pass this
particular round of verification but the subsequent one fails because
we have inlined the function into its former thunk - which
subsequently does not have any callees, but the verifier still accesses
them and segfaults. Therefore the patch also adds a test of whether
a former thunk even has any call.
2020-04-30 Martin Jambor <mjambor@suse.cz>
PR ipa/94856
* cgraph.c (clone_of_p): Also consider thunks which had their bodies
saved by the inliner and thunks which had their call inlined.
* ipa-inline-transform.c (save_inline_function_body): Fill in
former_clone_of of new body holders.
PR ipa/94856
* g++.dg/ipa/pr94856.C: New test.
A template header is not incrementally updated as we parse its parameters.
We maintain a dummy level until the closing > when we replace the dummy with
a real parameter set. requires processing was expecting a properly populated
arg_vec in current_template_parms, and then creates a self-mapping of parameters
from that. But we don't need to do that, just teach map_arguments to look at
TREE_VALUE when args is NULL.
* constraint.cc (map_arguments): If ARGS is null, it's a
self-mapping of parms.
(finish_nested_requirement): Do not pass argified
current_template_parms to normalization.
(tsubst_nested_requirement): Don't assert no template parms.
The testcase ICEs because the range-based for generates three
artificial variables that need to be allocated to the coroutine
frame but, when walking the BIND_EXPR that contains these, the
DECL_INITIAL for one of them refers to an entry appearing later,
which means that the frame entry hasn't been allocated when that
INITIAL is walked.
The solution is to defer walking the DECL_INITIAL/SIZE etc. until
all the BIND_EXPR vars have been processed.
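The deferral can be pictured as a two-pass walk. The following is only a minimal sketch of that idea and not the coroutines.cc code: struct local_var, its fields and transform_vars are hypothetical names, and a "frame slot" is reduced to a plain index.

```c
#include <stddef.h>

/* Hypothetical model of a BIND_EXPR variable.  */
struct local_var
{
  int frame_slot;    /* -1 until a frame entry has been allocated.  */
  int init_ref;      /* Index of the variable that the initializer
                        refers to, or -1 if it refers to none.  */
  int resolved_slot; /* Frame slot seen when the initializer was walked.  */
};

/* Pass 1 allocates a frame slot for every variable.  Pass 2 then walks
   the initializers, so a reference to a later variable already has a
   slot.  Returns the number of initializers that saw an unallocated
   reference, which is 0 with this ordering.  */
int
transform_vars (struct local_var *vars, size_t n)
{
  int unresolved = 0;

  for (size_t i = 0; i < n; i++)
    vars[i].frame_slot = (int) i;

  for (size_t i = 0; i < n; i++)
    {
      int ref = vars[i].init_ref;
      if (ref < 0)
        continue;
      vars[i].resolved_slot = vars[ref].frame_slot;
      if (vars[i].resolved_slot < 0)
        unresolved++;
    }
  return unresolved;
}
```

With a single interleaved pass, the initializer of variable 0 referring to variable 2 would have been walked while variable 2 still had no slot.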
gcc/cp/ChangeLog:
2020-04-30 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94886
* coroutines.cc (transform_local_var_uses): Defer walking
the DECL_INITIALs of BIND_EXPR vars until all the frame
allocations have been made.
gcc/testsuite/ChangeLog:
2020-04-30 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94886
* g++.dg/coroutines/pr94886-folly-3.C: New test.
The problem here is that target cleanup expressions have been
added to the initialisers for the awaitable (and returns of
non-trivial values from await_suspend() calls). This is because
the expansion of the co_await into its control flow is not
apparent to the machinery adding the target cleanup expressions.
The solution being tested is simply to recreate target expressions
as the co_awaits are lowered. Teaching the machinery to handle
walking co_await expressions in different ways at different points
(outside the coroutine transformation) seems overly complex.
gcc/cp/ChangeLog:
2020-04-30 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94883
* coroutines.cc (register_awaits): Update target
expressions for awaitable and suspend handle
initializers.
gcc/testsuite/ChangeLog:
2020-04-30 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94883
* g++.dg/coroutines/pr94883-folly-2.C: New test.
There are several places where the handling of a variable
declaration depends on whether it corresponds to a compiler
temporary, or to some other entity. We were testing that var
decls were artificial in determining this. However, proxy vars
are also artificial so that this is not sufficient. The solution
is to exclude variables with a DECL_VALUE_EXPR as well, since
the value variable will not be a temporary.
gcc/cp/ChangeLog:
2020-04-30 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94879
* coroutines.cc (build_co_await): Account for variables
with DECL_VALUE_EXPRs.
(captures_temporary): Likewise.
(register_awaits): Likewise.
gcc/testsuite/ChangeLog:
2020-04-30 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94879
* g++.dg/coroutines/pr94879-folly-1.C: New test.
Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it
gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by
build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1.
When we strip_typedefs the element of the array "const d", we see it's
a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is
char, but we need to add the const qualifier, so we call
cp_build_qualified_type -> build_qualified_type
where get_qualified_type checks to see if we already have such a type
by walking the variants list, which in this case is:
char -> c -> const char -> const char -> d -> const d
Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN,
we choose the first const char, which has TYPE_USER_ALIGN set. If the
element type of an array has TYPE_USER_ALIGN, the array type gets it too.
So we can make check_base_type stricter. I was afraid that it might make
us reuse types less often, but measuring showed that we build the same
number of types with and without the patch, while bootstrapping.
PR c++/94775
* tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match.
(check_aligned_type): Check if TYPE_USER_ALIGN match.
* g++.dg/warn/Warray-bounds-10.C: New test.
2020-04-30 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* config/aarch64/aarch64.h (TARGET_OUTLINE_ATOMICS): Define.
* config/aarch64/aarch64.opt (moutline-atomics): Change to Int variable.
* doc/invoke.texi (moutline-atomics): Document as on by default.
It was previously discussed that indirect branches cannot go to
NOTE_INSN_DELETED_LABEL so inserting a landing pad is unnecessary.
See https://gcc.gnu.org/pipermail/gcc-patches/2019-May/522625.html
Before the patch a bti j was inserted after the label in
__attribute__((target("branch-protection=bti")))
int foo (void)
{
label:
return 0;
}
This is not necessary and weakens the security protection.
gcc/ChangeLog:
PR target/94748
* config/aarch64/aarch64-bti-insert.c (rest_of_insert_bti): Remove
the check for NOTE_INSN_DELETED_LABEL.
gcc/testsuite/ChangeLog:
PR target/94748
* gcc.target/aarch64/pr94748.c: New test.
From the generated manpages, it was not clear that its usage is
'-debuglib=<libname>'.
gcc/d/ChangeLog:
* gdc.texi (Options for Linking): Clarify usage of -defaultlib= and
-debuglib= options.
Corrects a previous change made to the SPARC stdc bindings, and
backports PPC-related fixes. The library and language testsuite now
passes fully on powerpc64le-linux-gnu.
Fixes: PR d/90719
Fixes: PR d/94825
Reviewed-on: https://github.com/dlang/dmd/pull/11079
             https://github.com/dlang/druntime/pull/3078
             https://github.com/dlang/druntime/pull/3083
libphobos/ChangeLog:
PR d/94825
* libdruntime/Makefile.am (DRUNTIME_SOURCES_CONFIGURED): Remove
config/powerpc/switchcontext.S
* libdruntime/Makefile.in: Regenerate.
* libdruntime/config/powerpc/callwithstack.S: Remove.
* libdruntime/config/powerpc/switchcontext.S: Fix symbol name of
fiber_switchContext.
* libdruntime/core/thread.d: Disable fiber migration tests on PPC.
* testsuite/libphobos.thread/fiber_guard_page.d: Set guardPageSize
same as stackSize.
> , CHANGES_URL ("gcc-10/changes.html#empty_base");
>
> where the macro would just use preprocessor string concatenation?
Ok, the following patch implements it (doesn't introduce a separate
macro and just uses CHANGES_ROOT_URL "gcc-10/changes.html#empty_base"),
in addition adds the documentation Joseph requested.
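For illustration, the string concatenation relies on adjacent string literals being joined at compile time. This is a minimal sketch: the CHANGES_ROOT_URL value is hard-coded to the https://gcc.gnu.org/ default as an assumption (in GCC it comes from configure), and get_empty_base_url is a hypothetical helper, not a GCC function.

```c
#include <string.h>

/* Assumption: stands in for the configure-provided definition.  The
   value must end with '/', otherwise the joined URL is malformed.  */
#define CHANGES_ROOT_URL "https://gcc.gnu.org/"

/* Adjacent string literals are concatenated by the compiler, so the
   result is a single constant that needs no allocation and no free.  */
static const char *const empty_base_url
  = CHANGES_ROOT_URL "gcc-10/changes.html#empty_base";

/* Hypothetical accessor used only to exercise the constant.  */
const char *
get_empty_base_url (void)
{
  return empty_base_url;
}
```

Since the pieces are joined textually, a root URL without the trailing slash would silently yield a wrong address, which is presumably why configure diagnoses it.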
2020-04-30 Jakub Jelinek <jakub@redhat.com>
* configure.ac (--with-documentation-root-url,
--with-changes-root-url): Diagnose URL not ending with /,
use AC_DEFINE_UNQUOTED instead of AC_SUBST.
* opts.h (get_changes_url): Remove.
* opts.c (get_changes_url): Remove.
* Makefile.in (CFLAGS-opts.o): Don't add -DDOCUMENTATION_ROOT_URL
or -DCHANGES_ROOT_URL.
* doc/install.texi (--with-documentation-root-url,
--with-changes-root-url): Document.
* config/arm/arm.c (aapcs_vfp_is_call_or_return_candidate): Don't call
get_changes_url and free, change url variable type to const char * and
set it to CHANGES_ROOT_URL "gcc-10/changes.html#empty_base".
* config/s390/s390.c (s390_function_arg_vector,
s390_function_arg_float): Likewise.
* config/aarch64/aarch64.c (aarch64_vfp_is_call_or_return_candidate):
Likewise.
* config/rs6000/rs6000-call.c (rs6000_discover_homogeneous_aggregate):
Likewise.
* config.in: Regenerate.
* configure: Regenerate.
This fixes a problem with the vec_store_len_r intrinsic. The macros
mapping the intrinsic to a GCC builtin had the wrong signature.
With the patch an immediate length operand of vlrl/vstrl is handled
the same way as if it was passed in a register to vlrlr/vstrlr.
Values bigger than 15 always load the full vector. If it can be
recognized that it is in effect a full vector register load or store
it is now implemented with vl/vst instead.
gcc/ChangeLog:
2020-04-30 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/constraints.md ("j>f", "jb4"): New constraints.
* config/s390/vecintrin.h (vec_load_len_r, vec_store_len_r): Fix
macro definitions.
* config/s390/vx-builtins.md ("vlrlrv16qi", "vstrlrv16qi"): Add a
separate expander.
("*vlrlrv16qi", "*vstrlrv16qi"): Add alternative for vl/vst.
Change constraint for vlrl/vstrl to jb4.
gcc/testsuite/ChangeLog:
2020-04-30 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/zvector/vec_load_len_r.c: New test.
* gcc.target/s390/zvector/vec_store_len_r.c: New test.
While bootstrapping GCC on S/390 the following warning/error is raised:
gcc/var-tracking.c:10239:34: error: 'pre' may be used uninitialized in this function [-Werror=maybe-uninitialized]
10239 | VTI (bb)->out.stack_adjust += pre;
| ^
The lines of interest are:
HOST_WIDE_INT pre, post = 0;
// ...
if (!frame_pointer_needed)
{
insn_stack_adjust_offset_pre_post (insn, &pre, &post);
// ...
}
// ...
adjust_insn (bb, insn);
if (!frame_pointer_needed && pre)
VTI (bb)->out.stack_adjust += pre;
Both if statements depend on global variable frame_pointer_needed. In function
insn_stack_adjust_offset_pre_post local variable pre is initialized. The
problematic part is the function call between both if statements. Since
adjust_insn also calls functions which are defined in a different compilation
unit, we are not able to prove that global variable frame_pointer_needed is not
altered by adjust_insn and its siblings. Thus we must assume that
frame_pointer_needed may be true before the call and false afterwards which
renders the warning true (admittedly the location hint is not totally perfect).
By initialising pre we silence the warning.
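The path the compiler has to assume can be reproduced in a few lines. This is a minimal sketch and not the var-tracking.c code: the global is a stand-in with the same name, and stack_adjust_pre, adjust_insn_stub and accumulate_stack_adjust are hypothetical. The stub deliberately clears the flag to model what an opaque call might do.

```c
/* Stand-in for the global; any external function may modify it.  */
int frame_pointer_needed;

/* Hypothetical stand-in for insn_stack_adjust_offset_pre_post.  */
static void
stack_adjust_pre (long *pre)
{
  *pre = 8;
}

/* Hypothetical stand-in for adjust_insn.  It clears the flag to model
   the worst case the compiler must consider.  */
static void
adjust_insn_stub (void)
{
  frame_pointer_needed = 0;
}

long
accumulate_stack_adjust (long stack_adjust)
{
  long pre = 0;	/* Defined on every path, including the one below.  */

  if (!frame_pointer_needed)
    stack_adjust_pre (&pre);

  adjust_insn_stub ();

  /* If the flag was set before the call and cleared by it, PRE is read
     here without stack_adjust_pre ever having run.  */
  if (!frame_pointer_needed && pre)
    stack_adjust += pre;
  return stack_adjust;
}
```

Calling accumulate_stack_adjust with frame_pointer_needed set to 1 takes exactly that path; with the initializer the result is simply the unchanged input.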
gcc/ChangeLog:
2020-04-30 Stefan Schulze Frielinghaus <stefansf@linux.ibm.com>
* var-tracking.c (vt_initialize): Move variables pre and post
into inner block and initialize both in order to fix warning
about uninitialized use. Remove unnecessary checks for
frame_pointer_needed.
Ensure that CF does not equal NULL in function output_stack_usage_1
before calling fprintf. This fixes the following warning/error:
gcc/toplev.c:976:13: error: argument 1 null where non-null expected [-Werror=nonnull]
976 | fprintf (cf, "\\n" HOST_WIDE_INT_PRINT_DEC " bytes (%s)",
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
977 | stack_usage,
| ~~~~~~~~~~~~
978 | stack_usage_kind_str[stack_usage_kind]);
An example call site where CF is NULL is in function output_stack_usage.
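The guard amounts to checking the stream before use. A minimal sketch follows, assuming a simplified signature; report_stack_usage and its parameters are hypothetical and do not mirror output_stack_usage_1.

```c
#include <stdio.h>

/* Hypothetical reduction: CF is an optional output stream.  Returns 1
   if something was written and 0 if CF was null and nothing was done.  */
int
report_stack_usage (FILE *cf, long bytes, const char *kind)
{
  /* Passing a null FILE * to fprintf is undefined behaviour, which is
     what -Wnonnull diagnoses, so skip the call in that case.  */
  if (cf == NULL)
    return 0;

  fprintf (cf, "\t%ld bytes (%s)\n", bytes, kind);
  return 1;
}
```

A caller that has no callgraph-info file open can then pass NULL without any special handling of its own.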
gcc/ChangeLog:
2020-04-30 Stefan Schulze Frielinghaus <stefansf@linux.ibm.com>
* toplev.c (output_stack_usage_1): Ensure that first
argument to fprintf is not null.
The following patch attempts to use the diagnostics URL support if available
to provide more information about the C++17 empty base and C++20
[[no_unique_address]] empty class ABI changes in -Wpsabi diagnostics.
The "in GCC 10.1" text at the end of the diagnostic is then underlined
with a dotted line in some terminals and points to a (to be written)
anchor in gcc-10/changes.html, which we need to write anyway.
2020-04-29 Jakub Jelinek <jakub@redhat.com>
* configure.ac (--with-changes-root-url): New configure option,
defaulting to https://gcc.gnu.org/.
* Makefile.in (CFLAGS-opts.o): Define CHANGES_ROOT_URL for
opts.c.
* pretty-print.c (get_end_url_string): New function.
(pp_format): Handle %{ and %} for URLs.
(pp_begin_url): Use pp_string instead of pp_printf.
(pp_end_url): Use get_end_url_string.
* opts.h (get_changes_url): Declare.
* opts.c (get_changes_url): New function.
* config/rs6000/rs6000-call.c: Include opts.h.
(rs6000_discover_homogeneous_aggregate): Use %{in GCC 10.1%} instead
of just in GCC 10.1 in diagnostics and add URL.
* config/arm/arm.c (aapcs_vfp_is_call_or_return_candidate): Likewise.
* config/aarch64/aarch64.c (aarch64_vfp_is_call_or_return_candidate):
Likewise.
* config/s390/s390.c (s390_function_arg_vector,
s390_function_arg_float): Likewise.
* configure: Regenerated.
* c-format.c (PP_FORMAT_CHAR_TABLE): Add %{ and %}.
So, based on yesterday's discussions, similarly to powerpc64le-linux
I've done some testing for s390x-linux too.
First of all, I found a bug in my patch from yesterday: it was printing
the wrong type, like 'double' etc., rather than the class that contained
such an element. Fix below.
For s390x-linux, I was using
struct X { };
struct Y { int : 0; };
struct Z { int : 0; Y y; };
struct U : public X { X q; };
struct A { double a; };
struct B : public X { double a; };
struct C : public Y { double a; };
struct D : public Z { double a; };
struct E : public U { double a; };
struct F { [[no_unique_address]] X x; double a; };
struct G { [[no_unique_address]] Y y; double a; };
struct H { [[no_unique_address]] Z z; double a; };
struct I { [[no_unique_address]] U u; double a; };
struct J { double a; [[no_unique_address]] X x; };
struct K { double a; [[no_unique_address]] Y y; };
struct L { double a; [[no_unique_address]] Z z; };
struct M { double a; [[no_unique_address]] U u; };
#define T(S, s) extern S s; extern void foo##s (S); int bar##s () { foo##s (s); return 0; }
T (A, a)
T (B, b)
T (C, c)
T (D, d)
T (E, e)
T (F, f)
T (G, g)
T (H, h)
T (I, i)
T (J, j)
T (K, k)
T (L, l)
T (M, m)
as testcase and looking for "\tld\t%f0,".
While g++ 9 with -std=c++17 used to pass in fpr just
A, g++ 9 -std=c++14, as well as current trunk -std=c++14 & 17
and clang++ from today -std=c++14 & 17 all pass A, B, C
in fpr and nothing else. The intent stated by Jason seems to be
that A, B, C, F, G, J, K should all be passed in fpr.
Attached are two (updated) versions of the patch on top of the
powerpc+middle-end patch just posted.
The first one emits two separate -Wpsabi warnings like powerpc, one for
the -std=c++14 vs. -std=c++17 ABI difference and one for GCC 9 vs. 10
[[no_unique_address]] passing changes, the other one is silent about the
second case.
2020-04-29 Jakub Jelinek <jakub@redhat.com>
PR target/94704
* config/s390/s390.c (s390_function_arg_vector,
s390_function_arg_float): Use DECL_FIELD_ABI_IGNORED instead of
cxx17_empty_base_field_p. In -Wpsabi diagnostics use the type
passed to the function rather than the type of the single element.
Rename cxx17_empty_base_seen variable to empty_base_seen, change
type to int, and adjust diagnostics depending on if the field
has [[no_unique_address]] or not.
* g++.target/s390/s390.exp: New file.
* g++.target/s390/pr94704-1.C: New test.
* g++.target/s390/pr94704-2.C: New test.
* g++.target/s390/pr94704-3.C: New test.
* g++.target/s390/pr94704-4.C: New test.
As reported in the PR, most uses of macro arguments in the -O0 intrinsic
macros are properly wrapped in ()s or used in a context where passing a
complex expression as the argument doesn't pose a problem (e.g. when the
macro argument use is between commas, or between ( and comma, or between
comma and ) etc.). The gather/scatter macros in particular don't do
this: if one passes e.g. x + y as the argument to such a macro, the
corresponding inline function would cast the whole argument, but
the macro does (int) ARG, which is (int) x + y rather than (int) (x + y).
The following patch fixes those issues in *gather/*scatter*; additionally,
the AVX2 macros were passing incorrect mask of e.g.
(__v2df)_mm_set1_pd((double)(long long int) -1)
which is IMHO equivalent to
(__v2df){-1.0, -1.0}
when it really wants to pass __v2df vector with all bits set.
I've used what the inline functions use for those cases.
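Both issues can be shown outside the intrinsic headers. This is a minimal sketch with hypothetical macro and function names, and the bit-pattern part assumes IEEE 754 binary64 doubles.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical macros: the first casts only the first token of ARG.  */
#define TO_INT_UNSAFE(ARG) ((int) ARG)
#define TO_INT_SAFE(ARG) ((int) (ARG))

/* Expands to ((int) x + y): only X is truncated.  */
double
cast_unsafe (double x, double y)
{
  return TO_INT_UNSAFE (x + y);
}

/* Expands to ((int) (x + y)): the sum is truncated.  */
double
cast_safe (double x, double y)
{
  return TO_INT_SAFE (x + y);
}

/* Returns the object representation of D as a 64-bit integer.  */
uint64_t
bits_of_double (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u;
}
```

For x = y = 1.5, cast_unsafe gives 2.5 while cast_safe gives 3.0. Likewise bits_of_double (-1.0) is 0xBFF0000000000000 under the stated assumption, which is not the all-ones pattern.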
2020-04-29 Jakub Jelinek <jakub@redhat.com>
PR target/94832
* config/i386/avx2intrin.h (_mm_mask_i32gather_pd,
_mm256_mask_i32gather_pd, _mm_mask_i64gather_pd,
_mm256_mask_i64gather_pd, _mm_mask_i32gather_ps,
_mm256_mask_i32gather_ps, _mm_mask_i64gather_ps,
_mm256_mask_i64gather_ps, _mm_i32gather_epi64,
_mm_mask_i32gather_epi64, _mm256_i32gather_epi64,
_mm256_mask_i32gather_epi64, _mm_i64gather_epi64,
_mm_mask_i64gather_epi64, _mm256_i64gather_epi64,
_mm256_mask_i64gather_epi64, _mm_i32gather_epi32,
_mm_mask_i32gather_epi32, _mm256_i32gather_epi32,
_mm256_mask_i32gather_epi32, _mm_i64gather_epi32,
_mm_mask_i64gather_epi32, _mm256_i64gather_epi32,
_mm256_mask_i64gather_epi32): Surround macro parameter uses with
parens.
(_mm_i32gather_pd, _mm256_i32gather_pd, _mm_i64gather_pd,
_mm256_i64gather_pd, _mm_i32gather_ps, _mm256_i32gather_ps,
_mm_i64gather_ps, _mm256_i64gather_ps): Likewise. Don't use
as mask vector containing -1.0 or -1.0f elts, but instead vector
with all bits set using _mm*_cmpeq_p? with zero operands.
* config/i386/avx512fintrin.h (_mm512_i32gather_ps,
_mm512_mask_i32gather_ps, _mm512_i32gather_pd,
_mm512_mask_i32gather_pd, _mm512_i64gather_ps,
_mm512_mask_i64gather_ps, _mm512_i64gather_pd,
_mm512_mask_i64gather_pd, _mm512_i32gather_epi32,
_mm512_mask_i32gather_epi32, _mm512_i32gather_epi64,
_mm512_mask_i32gather_epi64, _mm512_i64gather_epi32,
_mm512_mask_i64gather_epi32, _mm512_i64gather_epi64,
_mm512_mask_i64gather_epi64, _mm512_i32scatter_ps,
_mm512_mask_i32scatter_ps, _mm512_i32scatter_pd,
_mm512_mask_i32scatter_pd, _mm512_i64scatter_ps,
_mm512_mask_i64scatter_ps, _mm512_i64scatter_pd,
_mm512_mask_i64scatter_pd, _mm512_i32scatter_epi32,
_mm512_mask_i32scatter_epi32, _mm512_i32scatter_epi64,
_mm512_mask_i32scatter_epi64, _mm512_i64scatter_epi32,
_mm512_mask_i64scatter_epi32, _mm512_i64scatter_epi64,
_mm512_mask_i64scatter_epi64): Surround macro parameter uses with
parens.
* config/i386/avx512pfintrin.h (_mm512_prefetch_i32gather_pd,
_mm512_prefetch_i32gather_ps, _mm512_mask_prefetch_i32gather_pd,
_mm512_mask_prefetch_i32gather_ps, _mm512_prefetch_i64gather_pd,
_mm512_prefetch_i64gather_ps, _mm512_mask_prefetch_i64gather_pd,
_mm512_mask_prefetch_i64gather_ps, _mm512_prefetch_i32scatter_pd,
_mm512_prefetch_i32scatter_ps, _mm512_mask_prefetch_i32scatter_pd,
_mm512_mask_prefetch_i32scatter_ps, _mm512_prefetch_i64scatter_pd,
_mm512_prefetch_i64scatter_ps, _mm512_mask_prefetch_i64scatter_pd,
_mm512_mask_prefetch_i64scatter_ps): Likewise.
* config/i386/avx512vlintrin.h (_mm256_mmask_i32gather_ps,
_mm_mmask_i32gather_ps, _mm256_mmask_i32gather_pd,
_mm_mmask_i32gather_pd, _mm256_mmask_i64gather_ps,
_mm_mmask_i64gather_ps, _mm256_mmask_i64gather_pd,
_mm_mmask_i64gather_pd, _mm256_mmask_i32gather_epi32,
_mm_mmask_i32gather_epi32, _mm256_mmask_i32gather_epi64,
_mm_mmask_i32gather_epi64, _mm256_mmask_i64gather_epi32,
_mm_mmask_i64gather_epi32, _mm256_mmask_i64gather_epi64,
_mm_mmask_i64gather_epi64, _mm256_i32scatter_ps,
_mm256_mask_i32scatter_ps, _mm_i32scatter_ps, _mm_mask_i32scatter_ps,
_mm256_i32scatter_pd, _mm256_mask_i32scatter_pd, _mm_i32scatter_pd,
_mm_mask_i32scatter_pd, _mm256_i64scatter_ps,
_mm256_mask_i64scatter_ps, _mm_i64scatter_ps, _mm_mask_i64scatter_ps,
_mm256_i64scatter_pd, _mm256_mask_i64scatter_pd, _mm_i64scatter_pd,
_mm_mask_i64scatter_pd, _mm256_i32scatter_epi32,
_mm256_mask_i32scatter_epi32, _mm_i32scatter_epi32,
_mm_mask_i32scatter_epi32, _mm256_i32scatter_epi64,
_mm256_mask_i32scatter_epi64, _mm_i32scatter_epi64,
_mm_mask_i32scatter_epi64, _mm256_i64scatter_epi32,
_mm256_mask_i64scatter_epi32, _mm_i64scatter_epi32,
_mm_mask_i64scatter_epi32, _mm256_i64scatter_epi64,
_mm256_mask_i64scatter_epi64, _mm_i64scatter_epi64,
_mm_mask_i64scatter_epi64): Likewise.
While bootstrapping GCC on S/390 the following warning occurs:
gcc/fortran/io.c: In function 'bool gfc_resolve_dt(gfc_code*, gfc_dt*, locus*)':
gcc/fortran/io.c:3857:7: error: 'num' may be used uninitialized in this function [-Werror=maybe-uninitialized]
3857 | if (num == 0)
| ^~
gcc/fortran/io.c:3843:11: note: 'num' was declared here
3843 | int num;
Since gfc_resolve_dt is a non-static function we cannot assume anything about
argument DT. Argument DT gets passed to function check_io_constraints which
passes values depending on DT, namely dt->asynchronous->value.character.string
to function compare_to_allowed_values as well as argument warn which is true as
soon as DT->dterr is true. Thus both arguments depend on DT.
If function compare_to_allowed_values is called with
dt->asynchronous->value.character.string not being an allowed value, and
ALLOWED_F2003 as well as ALLOWED_GNU being NULL (which is the case at the
particular call site), and WARN equals true, then the function returns with a
non-zero value and leaves num uninitialized which renders the warning true.
Initialized num to -1 and added an assert statement.
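The shape of the problem is an out-parameter that is not written on every successful return. This is a minimal sketch and not the io.c code: match_allowed and classify_async are hypothetical, the allowed list is invented, and the assert added by the patch is left out so that the sentinel stays observable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reduction of the lookup: on a match, stores the index in
   *NUM and returns 1.  Without a match it returns 1 if WARN is set and
   0 otherwise, and in both cases leaves *NUM untouched.  */
static int
match_allowed (const char *value, const char *const *allowed,
	       int warn, int *num)
{
  for (int i = 0; allowed[i] != NULL; i++)
    if (strcmp (value, allowed[i]) == 0)
      {
	*num = i;
	return 1;
      }
  return warn ? 1 : 0;
}

/* Returns the matched index, -1 if the value was accepted with only a
   warning, or -2 on a hard error.  */
int
classify_async (const char *value, int warn)
{
  static const char *const allowed[] = { "YES", "NO", NULL };
  int num = -1;	/* Sentinel: defined even if nothing is stored.  */

  if (!match_allowed (value, allowed, warn, &num))
    return -2;
  return num;
}
```

Without the initializer, classify_async ("MAYBE", 1) would return an indeterminate value, which is the case the warning points at.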
gcc/fortran/ChangeLog:
2020-04-29 Stefan Schulze Frielinghaus <stefansf@linux.ibm.com>
PR fortran/94769
* io.c (check_io_constraints): Initialize local variable num to
-1 and assert that it receives a meaningful value by function
compare_to_allowed_values.
This is the rs6000 version of the earlier committed x86, aarch64 and arm
fixes. Because create_tmp_var_raw is used (the C FE can call this outside
of function context), we need to make sure the first references to those
VAR_DECLs are through a TARGET_EXPR, so that each gets gimple_add_tmp_var
marked in whatever function it gets expanded in. Without that DECL_CONTEXT
is NULL and the vars aren't added as local decls of the containing function.
2020-04-29 Jakub Jelinek <jakub@redhat.com>
PR target/94826
* config/rs6000/rs6000.c (rs6000_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for first assignment to
fenv_var, fenv_clear and old_fenv variables. For fenv_addr
take address of TARGET_EXPR of fenv_var with void_node initializer.
Formatting fixes.
Array retval is not necessarily initialized by function is_call_safe and
may be used afterwards. Thus, initialize it explicitly.
gcc/ChangeLog:
2020-04-29 Stefan Schulze Frielinghaus <stefansf@linux.ibm.com>
PR tree-optimization/94774
* gimple-ssa-sprintf.c (try_substitute_return_value): Initialize
variable retval.
This patch makes the order in which template parameters appear in the
TREE_LIST returned by find_template_parameters deterministic between
runs.
The current nondeterminism is semantically harmless, but it has the
undesirable effect of causing some concepts diagnostics which print a
constraint's parameter mapping via pp_cxx_parameter_mapping to also be
nondeterministic, as in the testcases below.
gcc/cp/ChangeLog:
PR c++/94830
* pt.c (find_template_parameter_info::parm_list): New field.
(keep_template_parm): Use the new field to build up the
parameter list here instead of ...
(find_template_parameters): ... here. Return ftpi.parm_list.
gcc/testsuite/ChangeLog:
PR c++/94830
* g++.dg/concepts/diagnostics12.C: Clarify the dg-message now
that the corresponding diagnostic is deterministic.
* g++.dg/concepts/diagnostics13.C: New test.
This predicate is now used by aarch64 targets.
2020-04-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* calls.h (cxx17_empty_base_field_p): Turn into a function declaration.
* calls.c (cxx17_empty_base_field_p): New function. Check
DECL_ARTIFICIAL and RECORD_OR_UNION_TYPE_P in addition to the
previous checks.
Allow -fcf-protection with external thunk since the external thunk can be
made compatible with -fcf-protection.
gcc/
PR target/93654
* config/i386/i386-options.c (ix86_set_indirect_branch_type):
Allow -fcf-protection with -mindirect-branch=thunk-extern and
-mfunction-return=thunk-extern.
* doc/invoke.texi: Update notes for -fcf-protection=branch with
-mindirect-branch=thunk-extern and -mfunction-return=thunk-extern.
gcc/testsuite/
PR target/93654
* gcc.target/i386/pr93654.c: New test.
Essentially the same fix as for x86.
2020-04-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/arm/arm-builtins.c (arm_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for the first assignments to
fenv_var and new_fenv_var.
This patch makes the ABI code ignore zero-sized [[no_unique_address]]
fields when deciding whether something is a HFA or HVA.
For the tests, I wanted an -march setting that was stable enough
to use check-function-bodies and also wanted to force -mfloat-abi=hard.
I couldn't see any existing way of doing both together, since most
arm-related effective-target keywords are agnostic about the choice
between -mfloat-abi=softfp and -mfloat-abi=hard. I therefore added
a new effective-target keyword for this combination.
I used the arm_arch_* framework for the effective-target rather than
writing a new set of custom Tcl routines. This has the nice property
of separating the "compile and assemble" cases from the "link and run"
cases. I only need compilation to work for the new tests, so requiring
linking to work would be an unnecessary restriction.
However, including an ABI requirement is arguably stretching what the
list was originally intended to handle. The name arm_arch_v8a_hard
doesn't fit very naturally with some of the NEON-based tests.
On the other hand, the naming convention isn't entirely consistent,
so any choice would be inconsistent with something.
2020-04-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* doc/sourcebuild.texi (arm_arch_v8a_hard_ok): Document new
effective-target keyword.
(arm_arch_v8a_hard_multilib): Likewise.
(arm_arch_v8a_hard): Document new dg-add-options keyword.
* config/arm/arm.c (arm_return_in_memory): Note that the APCS
code is deprecated and has not been updated to handle
DECL_FIELD_ABI_IGNORED.
(WARN_PSABI_EMPTY_CXX17_BASE): New constant.
(WARN_PSABI_NO_UNIQUE_ADDRESS): Likewise.
(aapcs_vfp_sub_candidate): Replace the boolean pointer parameter
avoid_cxx17_empty_base with a pointer to a bitmask. Ignore fields
whose DECL_FIELD_ABI_IGNORED bit is set when determining whether
something actually is a HFA or HVA. Record whether we see a
[[no_unique_address]] field that previous GCCs would not have
ignored in this way.
(aapcs_vfp_is_call_or_return_candidate): Update the calls to
aapcs_vfp_sub_candidate and report a -Wpsabi warning for the
[[no_unique_address]] case. Use TYPE_MAIN_VARIANT in the
diagnostic messages.
(arm_needs_doubleword_align): Add a comment explaining why we
consider even zero-sized fields.
gcc/testsuite/
* lib/target-supports.exp: Add v8a_hard to the list of arm_arch_*
targets.
* g++.target/arm/no_unique_address_1.C: New test.
* g++.target/arm/no_unique_address_2.C: Likewise.
This ICE appears because gcc will stream it to the function_body section
when processing the variable with the initial value of the constructor
type, and the error_mark_node to the decls section.
When recompiling, the value obtained with DECL_INITIAL will be error_mark.
2020-04-29 Richard Biener <rguenther@suse.de>
Li Zekun <lizekun1@huawei.com>
PR lto/94822
* tree.c (component_ref_size): Guard against error_mark_node
DECL_INITIAL as it happens with LTO.
* gcc.dg/lto/pr94822_0.c: New testcase.
* gcc.dg/lto/pr94822_1.c: Alternate file.
* gcc.dg/lto/pr94822.h: Likewise.
This patch makes the ABI code ignore zero-sized [[no_unique_address]]
fields when deciding whether something is a HFA or HVA.
As things stood, we'd get two sets of -Wpsabi warnings, one when
trying to decide whether something was an SVE function, and another
when actually processing the function definition or function call.
The patch therefore makes aapcs_vfp_sub_candidate honour the
CUMULATIVE_ARGS "silent_p" flag where applicable.
This doesn't stop all duplicate warnings for parameters, and I suspect
we'll get duplicate warnings for return values too, but it should be
better than nothing.
2020-04-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_function_arg_alignment): Add a
comment explaining why we consider even zero-sized fields.
(WARN_PSABI_EMPTY_CXX17_BASE): New constant.
(WARN_PSABI_NO_UNIQUE_ADDRESS): Likewise.
(aapcs_vfp_sub_candidate): Replace the boolean pointer parameter
avoid_cxx17_empty_base with a pointer to a bitmask. Ignore fields
whose DECL_FIELD_ABI_IGNORED bit is set when determining whether
something actually is a HFA or HVA. Record whether we see a
[[no_unique_address]] field that previous GCCs would not have
ignored in this way.
(aarch64_vfp_is_call_or_return_candidate): Add a parameter to say
whether diagnostics should be suppressed. Update the calls to
aapcs_vfp_sub_candidate and report a -Wpsabi warning for the
[[no_unique_address]] case.
(aarch64_return_in_msb): Update call accordingly, never silencing
diagnostics.
(aarch64_function_value): Likewise.
(aarch64_return_in_memory_1): Likewise.
(aarch64_init_cumulative_args): Likewise.
(aarch64_gimplify_va_arg_expr): Likewise.
(aarch64_pass_by_reference_1): Take a CUMULATIVE_ARGS pointer and
use it to decide whether aarch64_vfp_is_call_or_return_candidate
should be silent.
(aarch64_pass_by_reference): Update calls accordingly.
(aarch64_vfp_is_call_candidate): Use the CUMULATIVE_ARGS argument
to decide whether aarch64_vfp_is_call_or_return_candidate should be
silent.
gcc/testsuite/
* g++.target/aarch64/no_unique_address_1.C: New test.
* g++.target/aarch64/no_unique_address_2.C: Likewise.
mve.exp changed the default dg-do action to "assemble", but then
left it like that for later exp files. This meant that in a
two-multilib test run, the first arm.exp run would have a default
of "dg-do compile" and the second would have a default of
"dg-do assemble".
2020-04-29 Richard Sandiford <richard.sandiford@arm.com>
gcc/testsuite/
* g++.target/arm/mve.exp: Restore the original dg-do-what-default
before finishing.
Adds classKind information to the front-end AST, which in turn allows us
to fix code generation of type names for extern(C) and extern(C++)
structs and classes. Inspecting such types inside a debugger now just
works without the need to 'cast(module_name.cxx_type)'.
gcc/d/ChangeLog:
* d-codegen.cc (d_decl_context): Don't include module in the name of
class and struct types that aren't extern(D).
This is a simple fix for PR94820.
The underlying problem was only fixed on i386, but the same error was also reported on aarch64.
This function, because it is sometimes called even outside of function bodies, uses create_tmp_var_raw rather than create_tmp_var.
But in order for that to work, when first referenced, the VAR_DECLs need to appear in a TARGET_EXPR so that during gimplification
the var gets the right DECL_CONTEXT and is added to local decls. Without that, e.g. tree-nested.c ICEs on those.
2020-04-29 Haijian Zhang <z.zhanghaijian@huawei.com>
PR target/94820
* config/aarch64/aarch64-builtins.c
(aarch64_atomic_assign_expand_fenv): Use TARGET_EXPR instead of
MODIFY_EXPR for first assignment to fenv_cr, fenv_sr and
new_fenv_var.
Fix-up for commit 955cd05745 "Add
gcc/config/gcn/t-omp-device for OpenMP declare variant kind/arch/isa".
With AMD GCN offloading configured, I'm seeing occasional GCC build hangs.
I've now captured and analyzed one of them:
$ ps -f
UID PID PPID C STIME TTY TIME CMD
[...]
tschwing 5113 4508 0 20:24 pts/5 00:00:00 /bin/sh -c rm -f tmp-omp-device-properties.h; \ for kind in kind arch isa; do \ echo 'const char omp_offload_device_'${kind}'[] = ' \ >> tmp-omp-device-properties.h; \ for prop in no
tschwing 5126 5113 0 20:24 pts/5 00:00:00 sed -n s/^kind: //p
tschwing 5127 5113 0 20:24 pts/5 00:00:00 sed s/[[:blank:]]/ /g;s/ */ /g;s/^ //;s/ $//;s/ /\\0/g;s/^/"/;s/$/\\0\\0"/
[...]
$ pstree -p $$
[...]---sh(5113)-+-sed(5126)
`-sed(5127)
$ ls -lrt build-gcc/gcc/*omp-device*
-rw-r--r-- 1 tschwing eeg 39 Apr 23 20:24 build-gcc/gcc/omp-device-properties-nvptx
-rw-r--r-- 1 tschwing eeg 634 Apr 23 20:24 build-gcc/gcc/omp-device-properties-i386
-rw-r--r-- 1 tschwing eeg 58 Apr 23 20:24 build-gcc/gcc/tmp-omp-device-properties.h
Notably missing is the 'omp-device-properties-gcn' file...
$ grep ^ build-gcc/gcc/*omp-device*
build-gcc/gcc/omp-device-properties-i386:kind: cpu
build-gcc/gcc/omp-device-properties-i386:arch: x86 x86_64 i386 i486 i586 i686 ia32
build-gcc/gcc/omp-device-properties-i386:isa: sse4 cx16 [...]
build-gcc/gcc/omp-device-properties-nvptx:kind: gpu
build-gcc/gcc/omp-device-properties-nvptx:arch: nvptx
build-gcc/gcc/omp-device-properties-nvptx:isa: sm_30 sm_35
build-gcc/gcc/tmp-omp-device-properties.h:const char omp_offload_device_kind[] =
build-gcc/gcc/tmp-omp-device-properties.h:"amdgcn-amdhsa\0"
..., which we here seem to be intending to fill into
'tmp-omp-device-properties.h'.
$ grep ^omp_device_properties\ = build-gcc/gcc/Makefile
omp_device_properties = amdgcn-amdhsa= nvptx-none=omp-device-properties-nvptx x86_64-intelmicemul-linux-gnu=omp-device-properties-i386
Given the 's-omp-device-properties-h' Makefile rule, indeed there is an
unescaped '$${props}', which is meant to be the filename following the equals
sign -- but there is none for 'amdgcn-amdhsa=', so this tries to read from
'stdin'!
The real problem of course is elsewhere.
gcc/
* configure.ac <$enable_offload_targets>: 'amdgcn' is 'gcn'.
* configure: Regenerate.
... given that the GCN target did away with the constant 'vec_select'
restriction.
gcc/
PR target/94279
* rtlanal.c (set_noop_p): Handle non-constant selectors.
2020-04-29 Jakub Jelinek <jakub@redhat.com>
PR target/94706
* config/ia64/ia64.c (hfa_element_mode): Use DECL_FIELD_ABI_IGNORED
instead of cxx17_empty_base_field_p.
As reported by Iain and David, powerpc-darwin and powerpc-aix* have C++14
vs. C++17 ABI incompatibilities which are not fixed by mere adding of
cxx17_empty_base_field_p calls. Unlike the issues that were seen on other
targets where the artificial empty base field affected function argument
passing or returning of values, on these two targets the difference is
during class layout, not afterwards (e.g.
struct empty_base {};
struct S : public empty_base { unsigned long long l[2]; };
will have different __alignof__ (S) between C++14 and C++17 (or possibly
with double instead of unsigned long long too)).
I've tried:
struct X { };
struct Y { int : 0; };
struct Z { int : 0; Y y; };
struct U : public X { X q; };
struct A { float a, b, c, d; };
struct B : public X { float a, b, c, d; };
struct C : public Y { float a, b, c, d; };
struct D : public Z { float a, b, c, d; };
struct E : public U { float a, b, c, d; };
struct F { [[no_unique_address]] X x; float a, b, c, d; };
struct G { [[no_unique_address]] Y y; float a, b, c, d; };
struct H { [[no_unique_address]] Z z; float a, b, c, d; };
struct I { [[no_unique_address]] U u; float a, b, c, d; };
struct J { float a, b; [[no_unique_address]] X x; float c, d; };
struct K { float a, b; [[no_unique_address]] Y y; float c, d; };
struct L { float a, b; [[no_unique_address]] Z z; float c, d; };
struct M { float a, b; [[no_unique_address]] U u; float c, d; };
#define T(S, s) extern S s; extern void foo##s (S); int bar##s () { foo##s (s); return 0; }
T (A, a)
T (B, b)
T (C, c)
T (D, d)
T (E, e)
T (F, f)
T (G, g)
T (H, h)
T (I, i)
T (J, j)
T (K, k)
T (L, l)
T (M, m)
testcase on powerpc64-linux. Results:
G++ 9 -std=c++14 A, B, C passed in fprs, the rest in gprs
G++ 9 -std=c++17 A passed in fprs, the rest in gprs
current trunk -std=c++14 & 17 A, B, C passed in fprs, the rest in gprs
patched trunk -std=c++14 & 17 A, B, C, F, G, J, K passed in fprs, the rest in gprs
clang++ [*] -std=c++14 & 17 A, B, C, F, G, J, K passed in fprs, the rest in gprs
[*] clang version 11.0.0 (git@github.com:llvm/llvm-project.git 5c352e69e76a26e4eda075e20aa6a9bb7686042c)
Is that what we want? I think it matches the stated intent of P0840R2 or
what Jason/Jonathan said, and doing something different like e.g. not
treating C, G and K as homogeneous because of the int : 0 in empty bases
or in zero sized [[no_unique_address]] fields would be quite hard to
implement (because for C++14 the FIELD_DECL just isn't there).
2020-04-29 Jakub Jelinek <jakub@redhat.com>
PR target/94707
* tree-core.h (tree_decl_common): Note decl_flag_0 used for
DECL_FIELD_ABI_IGNORED.
* tree.h (DECL_FIELD_ABI_IGNORED): Define.
* calls.h (cxx17_empty_base_field_p): Change into a temporary
macro, check DECL_FIELD_ABI_IGNORED flag with no "no_unique_address"
attribute.
* calls.c (cxx17_empty_base_field_p): Remove.
* tree-streamer-out.c (pack_ts_decl_common_value_fields): Handle
DECL_FIELD_ABI_IGNORED.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Likewise.
* lto-streamer-out.c (hash_tree): Likewise.
* config/rs6000/rs6000-call.c (rs6000_aggregate_candidate): Rename
cxx17_empty_base_seen to empty_base_seen, change type to int *,
adjust recursive calls, use DECL_FIELD_ABI_IGNORED instead of
cxx17_empty_base_field_p, if "no_unique_address" attribute is
present, propagate that to the caller too.
(rs6000_discover_homogeneous_aggregate): Adjust
rs6000_aggregate_candidate caller, emit different diagnostics
when c++17 empty base fields are present and when empty
[[no_unique_address]] fields are present.
* config/rs6000/rs6000.c (rs6000_special_round_type_align,
darwin_rs6000_special_round_type_align): Skip DECL_FIELD_ABI_IGNORED
fields.
* class.c (build_base_field): Set DECL_FIELD_ABI_IGNORED on C++17 empty
base artificial FIELD_DECLs.
(layout_class_type): Set DECL_FIELD_ABI_IGNORED on empty class
field_poverlapping_p FIELD_DECLs.
* lto-common.c (compare_tree_sccs_1): Handle DECL_FIELD_ABI_IGNORED.
* g++.target/powerpc/pr94707-1.C: New test.
* g++.target/powerpc/pr94707-2.C: New test.
* g++.target/powerpc/pr94707-3.C: New test.
* g++.target/powerpc/pr94707-4.C: New test.
* g++.target/powerpc/pr94707-5.C: New test.
* g++.target/powerpc/pr94707-4.C: New test.
This fixes a regression when canonicalizing refs for LIM PR84362.
This possibly unshares and rewrites the refs in the internal data
and thus pointer equality no longer works in ref_always_accessed
computation.
2020-04-29 Richard Biener <rguenther@suse.de>
* tree-ssa-loop-im.c (ref_always_accessed::operator ()):
Just check whether the stmt stores.
As observed in PR94719, an inherited constructor for an instantiation of
a constructor template confusingly has as its DECL_INHERITED_CTOR the
TEMPLATE_DECL of the constructor template rather than the particular
instantiation of the template.
This means two inherited constructors for two different instantiations
of the same constructor template have the same DECL_INHERITED_CTOR. And
since in satisfy_declaration_constraints our decl satisfaction cache is
keyed off of the result of strip_inheriting_ctors, we may end up
conflating the satisfaction values of the two inherited constructors'
constraints.
This patch fixes this issue by using the original tree, not the result
of strip_inheriting_ctors, as the key to the decl satisfaction cache.
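
A toy model of the conflation, under loudly stated assumptions: Decl, strip
and SatisfactionCache are hypothetical names invented for this sketch and do
not mirror the data structures in constraint.cc. It only shows why keying a
cache on the stripped declaration merges two distinct inherited constructors.

```cpp
#include <map>
#include <string>

// Hypothetical declaration: an inherited constructor points back at the
// constructor template it was inherited from.
struct Decl {
  std::string name;
  const Decl *inherited_from;  // null unless this is an inherited ctor
};

// Models strip_inheriting_ctors: follow the link back to the original.
const Decl *strip(const Decl *d) {
  while (d->inherited_from)
    d = d->inherited_from;
  return d;
}

struct SatisfactionCache {
  std::map<const Decl *, bool> cache;
  bool key_on_stripped;  // true models the old behaviour

  // Return the cached value for D, storing FRESH on a cache miss.
  bool lookup(const Decl *d, bool fresh) {
    const Decl *key = key_on_stripped ? strip(d) : d;
    std::map<const Decl *, bool>::iterator it = cache.find(key);
    if (it != cache.end())
      return it->second;
    cache[key] = fresh;
    return fresh;
  }
};
```

With key_on_stripped set, two inherited constructors that share a template
share one cache slot, so the second lookup returns the first one's answer.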
gcc/cp/ChangeLog:
PR c++/94819
* constraint.cc (satisfy_declaration_constraints): Use saved_t
instead of t as the key to decl_satisfied_cache.
gcc/testsuite/ChangeLog:
PR c++/94819
* g++.dg/cpp2a/concepts-inherit-ctor10.C: New test.
* g++.dg/cpp2a/concepts-inherit-ctor11.C: New test.
When printing the substituted parameter list of a requires-expression as
part of the "in requirements with ..." context line during concepts
diagnostics, we weren't considering that substitution into a parameter
pack can yield zero or multiple parameters.
This patch changes the way we print the parameter list of a
requires-expression in print_requires_expression_info. We now print the
dependent form of the parameter list (along with its template parameter
mapping) instead of printing its substituted form. Besides being an
improvement in its own right, this also sidesteps the substitution issue
in the PR altogether.
gcc/cp/ChangeLog:
PR c++/94808
* error.c (print_requires_expression_info): Print the dependent
form of the parameter list with its template parameter mapping,
rather than printing the substituted form.
gcc/testsuite/ChangeLog:
PR c++/94808
* g++.dg/concepts/diagnostic12.C: New test.
* g++.dg/concepts/diagnostic5.C: Adjust dg-message.
The emulation of mffsl with mffs, used when !TARGET_P9_MISC, is going
through the motions, but not storing the result in the given
operands[0]; it rather modifies operands[0] without effect. It also
creates a DImode pseudo that it doesn't use, overwriting subregs
instead.
The patch below fixes all of these, the indentation and a typo.
I'm concerned about several issues in the mffsl testcase. First, I
don't see that comparing the values as doubles rather than as long
longs is desirable. These are FPSCR bitfields, not FP numbers. I
understand mffs et al use double because they output to FP registers,
and the bit patterns are subnormal FP numbers, so it works, but given
the need for bit masking of at least one side, I'm changing the
compare to long longs.
Another issue with the test is that, if the compare fails, it calls
mffsl again to print the value, as if it would yield the same result.
But part of the FPSCR that mffsl (emulated with mffs or not) copies to
the output FP register is the FPCC, so the fcmpu used to compare the
result of the first mffsl will modify FPSCR and thus the result of the
second mffsl call. After changing the compare, this is no longer the
case, but I still think it's better to make absolutely sure what we
print is what we compared.
Yet another issue is that the test assumed the mffs bits that are not
to be extracted by mffsl to be already zero, instead of masking them
out explicitly. This is not about the mffs emulation in the mffsl
implementation, but about the mffs use in the test proper. The bits
appear to be zero indeed, as the bits left out are for sticky
exceptions, but there are reserved parts of FPSCR that might turn out
to be set in the future, so we're better off masking them out
explicitly, otherwise those bits could cause the compare to fail.
If some future mffsl is changed so that it copies additional nonzero
bits, the test will fail, and then we'll have a chance to adjust it
and the emulation.
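
The comparison strategy described above can be sketched as follows. This is
not the code in test_mffsl.c: the helper names are made up, the sketch uses
unsigned long long rather than long long to keep the masking free of sign
questions, and the mask is left as a parameter because the exact mffsl mask
is not stated here.

```cpp
#include <cstring>

// Reinterpret the bit pattern of D as an integer without converting it.
unsigned long long bits_of(double d) {
  static_assert(sizeof(unsigned long long) == sizeof(double),
                "this sketch assumes a 64-bit double");
  unsigned long long u;
  std::memcpy(&u, &d, sizeof u);
  return u;
}

// Compare only the bits selected by MASK, as integers.  No floating-point
// compare is involved, so nothing here can modify FP condition bits.
bool same_under_mask(double a, double b, unsigned long long mask) {
  return (bits_of(a) & mask) == (bits_of(b) & mask);
}
```

Reading each register value once into a variable and passing the saved copies
to same_under_mask (and to any later printf) keeps what is printed identical
to what was compared.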
for gcc/ChangeLog
PR target/94812
* gcc/config/rs6000/rs6000.md (rs6000_mffsl): Copy result to
output operand in emulation. Don't overwrite pseudos.
for gcc/testsuite/ChangeLog
PR target/94812
* gcc.target/powerpc/test_mffsl.c: Call mffsl only once.
Reinterpret the doubles as long longs for compares. Mask out
mffs bits that are not expected from mffsl.
My last patch rejected a namespace-scope declaration of the
implicitly-declared friend operator== before the class, but redeclaring it
after the class should be OK.
gcc/cp/ChangeLog
2020-04-28 Jason Merrill <jason@redhat.com>
PR c++/94583
* decl.c (use_eh_spec_block): Check nothrow type after
DECL_DEFAULTED_FN.
* pt.c (maybe_instantiate_noexcept): Call synthesize_method for
DECL_MAYBE_DELETED fns here.
* decl2.c (mark_used): Not here.
* method.c (get_defaulted_eh_spec): Reject DECL_MAYBE_DELETED here.
PR analyzer/94816 reports an ICE when attempting to copy a struct
containing a field for which add_region_for_type fails (on
an OFFSET_TYPE): the region for the src field comes from
make_region_for_unexpected_tree_code which gives it a NULL type, and
then the copy calls add_region_for_type which unconditionally
dereferences the NULL type.
This patch fixes the ICE by checking for NULL types in
add_region_for_type.
gcc/analyzer/ChangeLog:
PR analyzer/94816
* engine.cc (impl_region_model_context::on_unexpected_tree_code):
Handle NULL tree.
* region-model.cc (region_model::add_region_for_type): Handle
NULL type.
* region-model.h
(test_region_model_context::on_unexpected_tree_code): Handle NULL
tree.
gcc/testsuite/ChangeLog:
PR analyzer/94816
* g++.dg/analyzer/pr94816.C: New test.
Turns out for consistency with LLVM the +nofp option shouldn't remove ALL of FP and MVE, just the FP part of MVE.
This requires more surgery with feature bits so for GCC 10 I'd rather just not support +nofp for -mcpu=cortex-m55
and implement it properly for GCC 11.
2020-04-28 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* config/arm/arm-cpus.in (cortex-m55): Remove +nofp option.
* doc/invoke.texi (Arm Options): Remove -mcpu=cortex-m55 from +nofp option.
In C++14, an empty class deriving from an empty base is not an
aggregate, while in C++17 it is. In order to implement this, GCC adds
an artificial field to such classes.
This artificial field has no mapping to Fundamental Data Types in the
Arm PCS ABI and hence should not count towards determining whether an
object can be passed using the vector registers as per section
"7.1.2 Procedure Calling" in the arm PCS
https://developer.arm.com/docs/ihi0042/latest?_ga=2.60211309.1506853196.1533541889-405231439.1528186050
This patch avoids counting this artificial field in
aapcs_vfp_sub_candidate, and hence calculates whether such objects
should be passed in vector registers in the same manner as C++14 (where
the artificial field does not exist).
Before this change, the test below would pass the arguments to `f` in
general registers. After this change, the test passes the arguments to
`f` using the vector registers.
The new behaviour matches the behaviour of `armclang`, and also matches
the GCC behaviour when run with `-std=gnu++14`.
> gcc -std=gnu++17 -march=armv8-a+simd -mfloat-abi=hard test.cpp
``` test.cpp
struct base {};
struct pair : base
{
float first;
float second;
pair (float f, float s) : first(f), second(s) {}
};
void f (pair);
int main()
{
f({3.14, 666});
return 1;
}
```
We add a `-Wpsabi` warning to catch cases where this fix has changed the ABI for
some functions. Unfortunately this warning is emitted only once for multiple
calls to the same function, but I feel this is not much of a problem and can be
fixed later if need be.
(i.e. if `main` called `f` twice in a row we only emit a diagnostic for the
first).
Testing:
Bootstrapped and regression tested on arm-linux.
This change fixes the struct-layout-1 tests Jakub added
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544204.html
Regression tested on arm-none-eabi.
gcc/ChangeLog:
2020-04-28 Matthew Malcomson <matthew.malcomson@arm.com>
Jakub Jelinek <jakub@redhat.com>
PR target/94711
* config/arm/arm.c (aapcs_vfp_sub_candidate): Account for C++17 empty
base class artificial fields.
(aapcs_vfp_is_call_or_return_candidate): Warn when PCS ABI
decision is different after this fix.
From what I can tell -Wanalyzer-use-of-uninitialized-value has not
yet found a true diagnostic in real-world code, and seems to be
particularly susceptible to false positives. These relate to bugs in
the region_model code.
For GCC 10 it seems best to remove this warning, which this patch does.
Internally it also removes POISON_KIND_UNINIT.
I'm working on a rewrite of the region_model code for GCC 11 that I
hope will fix these issues, and allow this warning to be reintroduced.
gcc/analyzer/ChangeLog:
PR analyzer/94447
PR analyzer/94639
PR analyzer/94732
PR analyzer/94754
* analyzer.opt (Wanalyzer-use-of-uninitialized-value): Delete.
* program-state.cc (selftest::test_program_state_dumping): Update
expected dump result for removal of "uninit".
* region-model.cc (poison_kind_to_str): Delete POISON_KIND_UNINIT
case.
(root_region::ensure_stack_region): Initialize stack with null
svalue_id rather than with a typeless POISON_KIND_UNINIT value.
(root_region::ensure_heap_region): Likewise for the heap.
(region_model::dump_summary_of_rep_path_vars): Remove
summarization of uninit values.
(region_model::validate): Remove check that the stack has a
POISON_KIND_UNINIT value.
(poisoned_value_diagnostic::emit): Remove POISON_KIND_UNINIT
case.
(poisoned_value_diagnostic::describe_final_event): Likewise.
(selftest::test_dump): Update expected dump result for removal of
"uninit".
(selftest::test_svalue_equality): Remove "uninit" and "freed".
* region-model.h (enum poison_kind): Remove POISON_KIND_UNINIT.
gcc/ChangeLog:
PR analyzer/94447
PR analyzer/94639
PR analyzer/94732
PR analyzer/94754
* doc/invoke.texi (Static Analyzer Options): Remove
-Wanalyzer-use-of-uninitialized-value.
(-Wno-analyzer-use-of-uninitialized-value): Remove item.
gcc/testsuite/ChangeLog:
PR analyzer/94447
PR analyzer/94639
PR analyzer/94732
PR analyzer/94754
* gcc.dg/analyzer/data-model-1.c: Mark "use of uninitialized
value" warnings as xfail for now.
* gcc.dg/analyzer/data-model-5b.c: Remove uninitialized warning.
* gcc.dg/analyzer/pr94099.c: Mark "uninitialized" warning as xfail
for now.
* gcc.dg/analyzer/pr94447.c: New test.
* gcc.dg/analyzer/pr94639.c: New test.
* gcc.dg/analyzer/pr94732.c: New test.
* gcc.dg/analyzer/pr94754.c: New test.
* gcc.dg/analyzer/zlib-6.c: Mark "uninitialized" warning as xfail
for now.
On the following testcase, match.pd during GENERIC folding
changes the -1U / x < y into __imag__ .MUL_OVERFLOW (x, y),
but unfortunately unlike for normal calls nothing sets TREE_SIDE_EFFECTS on
the call. There is the process_call_operands function that non-internal
call creation calls and it is usable for internal calls too,
e.g. TREE_SIDE_EFFECTS is derived from checking whether the
call has side-effects (non-ECF_{CONST,PURE}; we have those for internal
calls) and from whether any of the arguments has TREE_SIDE_EFFECTS.
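
For reference, the equivalence behind that fold can be checked in user code.
The two function names below are made up for this sketch;
__builtin_mul_overflow is the GCC/Clang builtin, used here as a stand-in for
the internal .MUL_OVERFLOW call.

```cpp
// Source form: true when x * y does not fit in unsigned int.
// X must be nonzero, as for any division.
bool overflows_by_division(unsigned x, unsigned y) {
  return -1U / x < y;
}

// Folded form: ask the multiplication itself whether it overflowed.
bool overflows_by_builtin(unsigned x, unsigned y) {
  unsigned product;
  return __builtin_mul_overflow(x, y, &product);
}
```

For nonzero x the two agree: floor(UINT_MAX / x) < y holds exactly when
x * y exceeds UINT_MAX.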
2020-04-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94809
* tree.c (build_call_expr_internal_loc_array): Call
process_call_operands.
* gcc.c-torture/execute/pr94809.c: New test.
Here is the patch introducing thunderx3t110 machine model
for the scheduler. A name for the new chip was added to the
list of the names to be recognized as a valid parameter for
mcpu and mtune flags. Added the TX3 tuning table and cost
model tables.
Added the new chip name to the documentation. Fixed copyright
names and dates.
Lowering the chip capabilities to v8.3 to be on the safe side.
Bootstrapped on AArch64.
2020-04-27 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
* config/aarch64/aarch64-cores.def: Add the chip name.
* config/aarch64/aarch64-tune.md: Regenerated.
* config/aarch64/aarch64.c: Add tuning table for the chip.
* gcc/config/aarch64/aarch64-cost-tables.h: Add cost tables.
* config/aarch64/thunderx3t110.md: New file: add the new
machine model for the scheduler
* config/aarch64/aarch64.md: Include the new model.
* doc/invoke.texi: Add the new name to the list
> We probably have to look into providing a -Wpsabi warning as well.
So like this?
2020-04-28 Jakub Jelinek <jakub@redhat.com>
PR target/94704
* config/s390/s390.c (s390_function_arg_vector,
s390_function_arg_float): Emit -Wpsabi diagnostics if the ABI changed.
The previous patch for this PR handled separate comparisons.
However, as arm targets show, the same fix is needed when
handling comparisons embedded in a VEC_COND_EXPR.
Here too, the problem is that vect_get_constant_vectors will
calculate its own vector type, using truth_type_for on the
STMT_VINFO_VECTYPE, and the vectorizable_* routines need to be
consistent with that.
2020-04-28 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94727
* tree-vect-stmts.c (vect_is_simple_cond): If both comparison
operands are invariant booleans, use the mask type associated with the
STMT_VINFO_VECTYPE. Use !slp_node instead of !vectype to exclude SLP.
(vectorizable_condition): Pass vectype unconditionally to
vect_is_simple_cond.
We changed the argument passed to the promise parameter preview
to match a reference to *this. However to be consistent with the
other ports, we do need to match the reference transformation in
the traits lookup and the promise allocator lookup.
gcc/cp/ChangeLog:
2020-04-28 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94760
* coroutines.cc (instantiate_coro_traits): Pass a reference to
object type rather than a pointer type for 'this', for method
coroutines.
(struct param_info): Add a field to hold that the parm is a lambda
closure pointer.
(morph_fn_to_coro): Check for lambda closure pointers in the
args. Use a reference to *this when building the args list for the
promise allocator lookup.
gcc/testsuite/ChangeLog:
2020-04-28 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94760
* g++.dg/coroutines/pr94760-mismatched-traits-and-promise-prev.C:
New test.
From the standard:
The header <coroutine> defines the primary template coroutine_traits
such that if ArgTypes is a parameter pack of types and if the
qualified-id R::promise_type is valid and denotes a type, then
coroutine_traits<R,ArgTypes...> has the following publicly accessible
member:
using promise_type = typename R::promise_type;
this should not prevent more specialised cases and the following
code should be accepted, but is currently rejected with:
'error: coroutine return type ‘void’ is not a class'
This is because the check for non-class-ness of the return value was
in the wrong place; it needs to be carried out in a SFINAE context.
The following patch removes the restriction in the traits template
instantiation and allows for the case that the ramp function could
return void.
The <coroutine> header is amended to implement the required
functionality.
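
A minimal sketch of a SFINAE-friendly traits template of this kind. It is an
illustration, not the libstdc++ implementation: sketch_coroutine_traits,
traits_impl, has_promise, task and void_promise are all hypothetical names,
and void_t is defined locally so that the sketch does not need C++17.

```cpp
#include <type_traits>

// Local void_t (std::void_t would need C++17).
template <typename...> struct make_void { typedef void type; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// Fallback: no promise_type member when R::promise_type is not valid.
template <typename R, typename = void> struct traits_impl {};

// Selected only when R::promise_type is valid and names a type.
template <typename R>
struct traits_impl<R, void_t<typename R::promise_type>> {
  using promise_type = typename R::promise_type;
};

// Hypothetical stand-in for std::coroutine_traits.
template <typename R, typename... ArgTypes>
struct sketch_coroutine_traits : traits_impl<R> {};

// A user specialisation for a non-class return type stays possible.
struct void_promise {};
template <> struct sketch_coroutine_traits<void> {
  using promise_type = void_promise;
};

// Example class return type with a nested promise_type.
struct task {
  struct promise_type {};
};

// Detection helper: does T have a promise_type member?
template <typename T, typename = void>
struct has_promise : std::false_type {};
template <typename T>
struct has_promise<T, void_t<typename T::promise_type>> : std::true_type {};
```

Because the validity check happens during partial-specialisation matching,
instantiating the traits for int or void is not an error in itself; the
missing promise_type can be diagnosed later, where it is actually needed.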
gcc/cp/ChangeLog:
2020-04-28 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94759
* coroutines.cc (coro_promise_type_found_p): Do not
exclude non-classes here (this needs to be handled in the
coroutine header).
(morph_fn_to_coro): Allow for the case where the coroutine
returns void.
gcc/testsuite/ChangeLog:
2020-04-28 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94759
* g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C: Adjust for
updated error messages.
* g++.dg/coroutines/coro-bad-alloc-01-bad-op-del.C: Likewise.
* g++.dg/coroutines/coro-bad-alloc-02-no-op-new-nt.C: Likewise.
* g++.dg/coroutines/coro-missing-promise.C: Likewise.
* g++.dg/coroutines/pr93458-5-bad-coro-type.C: Likewise.
* g++.dg/coroutines/torture/co-ret-17-void-ret-coro.C: New test.
libstdc++-v3/ChangeLog:
2020-04-28 Jonathan Wakely <jwakely@redhat.com>
Iain Sandoe <iain@sandoe.co.uk>
PR c++/94759
* include/std/coroutine: Implement handling for non-
class coroutine return types.
Structured binding makes use of the DECL_VALUE_EXPR fields
in local variables. We need to recognise these and only amend
the expression values, retaining the 'alias' value intact.
gcc/cp/ChangeLog:
2020-04-27 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94701
* coroutines.cc (struct local_var_info): Add fields for static
variables and those with DECL_VALUE_EXPR redirection.
(transform_local_var_uses): Skip past typedefs and static vars
and then account for redirected variables.
(register_local_var_uses): Likewise.
gcc/testsuite/ChangeLog:
2020-04-27 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94701
* g++.dg/coroutines/torture/local-var-06-structured-binding.C: New test.
This function, because it is sometimes called even outside of function
bodies, uses create_tmp_var_raw rather than create_tmp_var. But in order
for that to work, when first referenced, the VAR_DECLs need to appear in a
TARGET_EXPR so that during gimplification the var gets the right
DECL_CONTEXT and is added to local decls. Without that, e.g. tree-nested.c
ICEs on those.
2020-04-27 Jakub Jelinek <jakub@redhat.com>
PR target/94780
* config/i386/i386.c (ix86_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for first assignment to
sw_var, exceptions_var, mxcsr_orig_var and mxcsr_mod_var.
* gcc.dg/pr94780.c: New test.
We previously happened to accept this testcase, but never actually did
anything useful with the attribute. The patch for PR86379 stopped using
TREE_TYPE as USING_DECL_SCOPE, so 'using A::b' no longer had TREE_TYPE set,
so the language-independent decl_attributes started crashing on it.
GNU attributes are more flexible in their placement than C++11 attributes,
so if we encounter a dependent GNU attribute that syntactically appertains
to a type rather than the declaration as a whole, move it to the
declaration; that's almost certainly what the user meant, anyway.
gcc/cp/ChangeLog
2020-04-27 Jason Merrill <jason@redhat.com>
PR c++/90750
PR c++/79585
* decl.c (grokdeclarator): Move dependent attribute to decl.
* decl2.c (splice_template_attributes): No longer static.
In the first testcase below, the call to the target constructor foo{} from foo's
delegating constructor is encoded as the INIT_EXPR
*(struct foo *) this = AGGR_INIT_EXPR <4, __ct_comp, D.2140, ...>;
During initialization of the variable 'bar', we prematurely set TREE_READONLY on
bar's CONSTRUCTOR in two places before the outer delegating constructor has
returned: first, at the end of cxx_eval_call_expression after evaluating the RHS
of the above INIT_EXPR, and second, at the end of cxx_eval_store_expression
after having finished evaluating the above INIT_EXPR. This then prevents the
rest of the outer delegating constructor from mutating 'bar'.
This (hopefully minimally risky) patch makes cxx_eval_call_expression refrain
from setting TREE_READONLY when evaluating the target constructor of a
delegating constructor. It also makes cxx_eval_store_expression refrain from
setting TREE_READONLY when the object being initialized is '*this', on the basis
that it should be the responsibility of the routine that set 'this' in the first
place to set the object's TREE_READONLY appropriately.
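
An illustrative example of the pattern (written for this note, not copied
from the new tests; it needs C++14 or later, and bar_x is a made-up helper).
Each delegating constructor keeps mutating the object after its target
constructor has returned, which only works if the object has not been marked
read-only too early.

```cpp
struct foo {
  int x;
  constexpr foo() : x(0) {}
  // Delegate to the target constructor, then keep modifying *this.
  constexpr foo(int a) : foo{} { x = -a; }
  constexpr foo(int a, int b) : foo{a} { x += a + b; }
};

// foo{1} leaves x == -1, then x += 1 + 2 gives 2.
constexpr foo bar{1, 2};
static_assert(bar.x == 2, "mutation after delegation must be allowed");

// Run-time view of the same value.
int bar_x() { return bar.x; }
```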
gcc/cp/ChangeLog:
PR c++/94772
* constexpr.c (cxx_eval_call_expression): Don't set new_obj if we're
evaluating the target constructor of a delegating constructor.
(cxx_eval_store_expression): Don't set TREE_READONLY if the LHS of the
INIT_EXPR is '*this'.
gcc/testsuite/ChangeLog:
PR c++/94772
* g++.dg/cpp1y/constexpr-tracking-const23.C: New test.
* g++.dg/cpp1y/constexpr-tracking-const24.C: New test.
* g++.dg/cpp1y/constexpr-tracking-const25.C: New test.
PR 92830 reports that we always use "gcc/Warning-Options.html" when we
emit escaped documentation URLs when printing "[-Wname-of-option]" for
a warning.
This page is wrong for most Fortran warnings, and for analyzer warnings.
I considered various schemes involving adding extra tags to the .opt
format to capture where options are documented, but for now this patch
fixes the issue by introducing some special-casing logic.
It only fixes the URLs for warning options, not for other command-line
options, but those are the only options for which get_option_url is
currently called.
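
The special-casing could look roughly like the sketch below. Everything in it
is an assumption except the "gcc/Warning-Options.html" page name quoted
above: the Option type, the fortran_only flag, the other two page names and
both function names are illustrative and should not be read as the contents
of opts.c.

```cpp
#include <string>

// Hypothetical, simplified view of an option record.
struct Option {
  std::string text;   // e.g. "-Wanalyzer-double-free"
  bool fortran_only;  // assumed: option is documented in the Fortran manual
};

// Choose the page, relative to the documentation root.
std::string option_html_page(const Option &opt) {
  if (opt.text.find("analyzer-") != std::string::npos)
    return "gcc/Static-Analyzer-Options.html";         // assumed page name
  if (opt.fortran_only)
    return "gfortran/Error-and-Warning-Options.html";  // assumed page name
  return "gcc/Warning-Options.html";
}

// ROOT no longer ends in "gcc/"; the page supplies the manual directory.
std::string option_url(const std::string &root, const Option &opt) {
  return root + option_html_page(opt);
}
```

Dropping the trailing "gcc/" from the root is what lets a single root URL
serve pages that live in different manuals.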
gcc/ChangeLog:
PR 92830
* configure.ac (DOCUMENTATION_ROOT_URL): Drop trailing "gcc/" from
default value, so that it can be supplied by get_option_html_page.
* configure: Regenerate.
* opts.c: Include "selftest.h".
(get_option_html_page): New function.
(get_option_url): Use it. Reformat to place comments next to the
expressions they refer to.
(selftest::test_get_option_html_page): New.
(selftest::opts_c_tests): New.
* selftest-run-tests.c (selftest::run_tests): Call
selftest::opts_c_tests.
* selftest.h (selftest::opts_c_tests): New decl.
branch-protection=pac-ret is not supported on ilp32 now and
the test requires it via branch-protection=standard.
committed as obvious.
gcc/testsuite/ChangeLog:
PR target/94697
* gcc.target/aarch64/pr94697.c: Require lp64.
cde-errors.c and cde-mve-error-2.c were failing with an rtl checking
failure because we applied UINTVAL to a nonconstant argument
(specifically a REG).
2020-04-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/arm/arm-builtins.c (arm_expand_builtin_args): Only apply
UINTVAL to CONST_INTs.
When this builtin has no parameters, speculation_safe_value_resolve_call
returns BUILT_IN_NONE, but resolve_overloaded_builtin uselessly
dereferences the first param just to return error_mark_node immediately.
The following patch rearranges it so that we only read the first parameter
if fncode is not BUILT_IN_NONE.
2020-04-27 Jakub Jelinek <jakub@redhat.com>
PR c/94755
* c-common.c (resolve_overloaded_builtin): Return error_mark_node for
fncode == BUILT_IN_NONE before initialization of first_param.
* c-c++-common/pr94755.c: New test.
The code change that caused this regression was not meant to affect neon
code-gen, however I missed the REG fall through. This patch makes sure we only
get the left-hand of the PLUS if it is indeed a PLUS expr.
gcc/ChangeLog:
2020-04-27 Andre Vieira <andre.simoesdiasvieira@arm.com>
* config/arm/arm.c (output_move_neon): Only get the first operand if
addr is PLUS.
In the testcase for PR94784, we have two vectors with the same ABI identity
but with different TYPE_MODEs. It would be better to flip the assert around
so that it checks that the two vectors have equal TYPE_VECTOR_SUBPARTS and
that converting the corresponding element types is a useless_type_conversion_p.
2020-04-27 Felix Yang <felix.yang@huawei.com>
gcc/
PR tree-optimization/94784
* tree-ssa-forwprop.c (simplify_vector_constructor): Flip the
assert around so that it checks that the two vectors have equal
TYPE_VECTOR_SUBPARTS and that converting the corresponding element
types is a useless_type_conversion_p.
gcc/testsuite/
PR tree-optimization/94784
* gcc.dg/pr94784.c: New test.
On aarch64, -mbranch-protection=pac-ret reuses the dwarf
opcode for window_save to mean "toggle the return address
mangle state". However, the dwarf2cfi internal logic did not
update its state when that opcode was emitted: the existing
update logic is only valid for the original sparc use of
window_save. A separate bool is therefore used on aarch64
to track the state.
This bug can cause the unwinder not to authenticate return
addresses that were signed (or vice versa) which means a
runtime crash on a pauth enabled system.
Currently only aarch64 pac-ret uses REG_CFA_TOGGLE_RA_MANGLE.
This should be backported to gcc-9 and gcc-8 branches.
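
A reduced model of the state tracking, with loud caveats: CfiRow and the
three functions are hypothetical names for this sketch, and only the
ra_mangled field name comes from the ChangeLog below. The point is that the
toggle state has to be part of the row, both when an opcode is processed and
when two rows are compared.

```cpp
// Hypothetical, reduced model of a CFI row.
struct CfiRow {
  int cfa_offset;
  bool ra_mangled;  // is the return address currently signed?
};

// Each toggle opcode flips the state, and the row must record that.
void toggle_ra_mangle(CfiRow &row) { row.ra_mangled = !row.ra_mangled; }

// Rows that differ only in ra_mangled are not equal.
bool row_equal(const CfiRow &a, const CfiRow &b) {
  return a.cfa_offset == b.cfa_offset && a.ra_mangled == b.ra_mangled;
}

// Number of toggle opcodes needed to move from OLD_ROW to NEW_ROW.
int toggles_needed(const CfiRow &old_row, const CfiRow &new_row) {
  return old_row.ra_mangled != new_row.ra_mangled ? 1 : 0;
}
```

If the comparison ignored ra_mangled, two paths reaching the same point with
different signing states would look identical and no toggle would be emitted
between them.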
gcc/ChangeLog:
PR target/94515
* dwarf2cfi.c (struct GTY): Add ra_mangled.
(cfi_row_equal_p): Check ra_mangled.
(dwarf2out_frame_debug_cfa_window_save): Remove the argument,
this only handles the sparc logic now.
(dwarf2out_frame_debug_cfa_toggle_ra_mangle): New function for
the aarch64 specific logic.
(dwarf2out_frame_debug): Update to use the new subroutines.
(change_cfi_row): Check ra_mangled.
gcc/testsuite/ChangeLog:
PR target/94515
* g++.target/aarch64/pr94515-1.C: New test.
* g++.target/aarch64/pr94515-2.C: New test.
The following patch fixes the C++14 vs. C++17 ABI passing incompatibility
on s390x-linux.
Bootstrapped/regtested on s390x-linux without and with the patch, the
difference being:
-FAIL: tmpdir-g++.dg-struct-layout-1/t032 cp_compat_x_alt.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t032 cp_compat_x_tst.o-cp_compat_y_alt.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t032 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t055 cp_compat_x_alt.o-cp_compat_y_alt.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t055 cp_compat_x_alt.o-cp_compat_y_tst.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t055 cp_compat_x_tst.o-cp_compat_y_alt.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t055 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t056 cp_compat_x_alt.o-cp_compat_y_alt.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t056 cp_compat_x_alt.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t056 cp_compat_x_tst.o-cp_compat_y_alt.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t056 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t057 cp_compat_x_alt.o-cp_compat_y_alt.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t057 cp_compat_x_alt.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t057 cp_compat_x_tst.o-cp_compat_y_alt.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t057 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t058 cp_compat_x_alt.o-cp_compat_y_alt.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t058 cp_compat_x_alt.o-cp_compat_y_tst.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t058 cp_compat_x_tst.o-cp_compat_y_alt.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t058 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t059 cp_compat_x_alt.o-cp_compat_y_alt.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t059 cp_compat_x_alt.o-cp_compat_y_tst.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t059 cp_compat_x_tst.o-cp_compat_y_alt.o execute
-FAIL: tmpdir-g++.dg-struct-layout-1/t059 cp_compat_x_tst.o-cp_compat_y_tst.o execute
when performing ALT_CXX_UNDER_TEST=g++ testing with a system GCC 10 compiler
from a week ago. So, the alt vs. alt FAILs are all expected (we know before
this patch there is an ABI incompatibility) and some alt vs. tst (or tst vs.
alt) FAILs too - that depends on if the particular x or y test is compiled
with -std=c++14 or -std=c++17 - if x_tst is compiled with -std=c++14 and
y_alt is compiled with -std=c++17, then it should FAIL, similarly if x_alt
is compiled with -std=c++17 and y_tst is compiled with -std=c++14.
2020-04-27 Jakub Jelinek <jakub@redhat.com>
PR target/94704
* config/s390/s390.c (s390_function_arg_vector,
s390_function_arg_float): Ignore cxx17_empty_base_field_p fields.
Previously -fweb was disabled when only small loops were unrolled. Since
then we have found cases where it helps to rename pseudos and avoid
some anti-dependences that may occur after unrolling.
This patch enables -fweb for small-loop unrolling.
2020-04-27 Jiufu Guo <guojiufu@cn.ibm.com>
* common/config/rs6000/rs6000-common.c
(rs6000_option_optimization_table) [OPT_LEVELS_ALL]: Remove turn off
-fweb.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Avoid
setting flag_web.
This bug was exposed by the FRE refactoring in r263875. Comparing the FRE
dump files shows no obvious change in the function that segfaults, which
indicates a target issue.
frame_pointer_needed is set to true in the reload pass's setup_can_eliminate,
but regs_ever_live[31] is false. pro_and_epilogue uses it without a liveness
check, so r31 is not saved/restored and CPU2006 465.tonto segfaults when
loading from invalid addresses. Thus, add a HARD_FRAME_POINTER_REGNUM
liveness check via frame_pointer_needed_indeed when generating
pro_and_epilogue instructions.
gcc/ChangeLog
2020-04-27 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR target/91518
* config/rs6000/rs6000-logue.c (frame_pointer_needed_indeed):
New variable.
(rs6000_emit_prologue_components):
Check with frame_pointer_needed_indeed.
(rs6000_emit_epilogue_components): Likewise.
(rs6000_emit_prologue): Likewise.
(rs6000_emit_epilogue): Set frame_pointer_needed_indeed.
This test is rejected with a bogus "use of deleted function" error
starting with r225705 whereby convert_like_real/ck_base no longer
sets LOOKUP_ONLYCONVERTING for user_conv_p conversions. This does
not seem to be always correct. To recap, when we have something like
T t = x where T is a class type and the type of x is not T or derived
from T, we perform copy-initialization, something like:
1. choose a user-defined conversion to convert x to T, the result is
a prvalue,
2. use this prvalue to direct-initialize t.
In the second step, explicit constructors should be considered, since
we're direct-initializing. This is what r225705 fixed.
In this PR we are dealing with the first step, I think, where explicit
constructors should be skipped. [over.match.copy] says "The converting
constructors of T are candidate functions" which clearly eliminates
explicit constructors. But we also have to copy-initialize the argument
we are passing to such a converting constructor, and here we should
disregard explicit constructors too.
In this testcase we have
V v = m;
and we choose V::V(M) to convert m to V. But we wrongly choose
the explicit M::M<M&>(M&) to copy-initialize the argument; it's
a better match for a non-const lvalue than the implicit M::M(const M&)
but because it's explicit, we shouldn't use it.
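As a rough sketch of the two initialization paths described above, the
following records which M constructor initializes the argument of V::V(M).
The member "how" and the function names are made up for illustration, this
is not the testcase added by the patch, and it assumes a compiler that
contains this fix:

```cpp
#include <cassert>

struct M {
  int how = 0;  // 0: default ctor, 1: copy ctor, 2: explicit template ctor
  M() = default;
  M(const M&) : how(1) {}
  template <typename T> explicit M(T&&) : how(2) {}
};

struct V {
  int arg_how;
  V(M m) : arg_how(m.how) {}  // converting constructor chosen for V v = m
};

// Copy-initialization: the argument of V::V(M) is itself copy-initialized,
// so the explicit M::M<M&>(M&) must be skipped.
int copy_init_path() {
  M m;
  V v = m;
  return v.arg_how;
}

// Direct-initialization: explicit constructors are candidates, and
// M::M<M&>(M&) is the better match for a non-const lvalue.
int direct_init_path() {
  M m;
  M d(m);
  return d.how;
}
```

With the fix, copy_init_path () should return 1 (the implicit-style
M::M(const M&)), while direct_init_path () returns 2.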
When convert_like is processing the ck_user conversion -- the convfn is
V::V(M) -- it can see that cand->flags contains LOOKUP_ONLYCONVERTING,
but then when we're in build_over_call for this convfn, we have no way
to pass the flag to convert_like for the argument 'm', because convert_like
doesn't take flags. Fixed by creating a new conversion flag, copy_init_p,
set in ck_base/ck_rvalue to signal that explicit constructors should be
skipped.
LOOKUP_COPY_PARM looks relevant, but again, it's a LOOKUP_* flag, so
can't pass it to convert_like. DR 899 also seemed related, but that
deals with direct-init contexts only.
PR c++/90320
* call.c (struct conversion): Add copy_init_p.
(standard_conversion): Set copy_init_p in ck_base and ck_rvalue
if FLAGS demands LOOKUP_ONLYCONVERTING.
(convert_like_real) <case ck_base>: If copy_init_p is set, OR
LOOKUP_ONLYCONVERTING into FLAGS.
* g++.dg/cpp0x/explicit13.C: New test.
* g++.dg/cpp0x/explicit14.C: New test.
Adds a new test directive COMPILABLE_MATH_TEST, and support has been
added for it in gdc-convert-test so that such tests are skipped if phobos
is not present on the target.
Only change in D runtime is a small documentation fix.
Reviewed-on: https://github.com/dlang/druntime/pull/3067
             https://github.com/dlang/dmd/pull/11060
gcc/testsuite/ChangeLog:
PR d/89418
* lib/gdc-utils.exp (gdc-convert-test): Add dg-skip-if for compilable
tests that depend on the phobos standard library.
Named arguments were being passed around by invisible reference, just
not variadic arguments. There is a need to de-duplicate the routines
that handle declaration/parameter promotion and reference checking.
However, for now the parameter helper functions have just been renamed
to parameter_reference_p and parameter_type, to make it clearer that
they are the Parameter equivalents of declaration_reference_p and
declaration_type.
On writing the tests, a forward-reference bug was discovered on x86_64
during va_list type semantic. This was due to fields not having their
parent set-up correctly.
gcc/d/ChangeLog:
PR d/94777
* d-builtins.cc (build_frontend_type): Set parent for generated
fields of built-in types.
* d-codegen.cc (argument_reference_p): Rename to ...
(parameter_reference_p): ... this.
(type_passed_as): Rename to ...
(parameter_type): ... this. Make TREE_ADDRESSABLE types restrict.
(d_build_call): Move handling of non-POD types here from ...
* d-convert.cc (convert_for_argument): ... here.
* d-tree.h (argument_reference_p): Rename declaration to ...
(parameter_reference_p): ... this.
(type_passed_as): Rename declaration to ...
(parameter_type): ... this.
* types.cc (TypeVisitor::visit (TypeFunction *)): Update caller.
gcc/testsuite/ChangeLog:
PR d/94777
* gdc.dg/pr94777a.d: New test.
* gdc.dg/pr94777b.d: New test.
Parameters to user-defined coroutines might be unnamed.
In that case, we must synthesize a name for the coroutine
frame copy.
gcc/cp/ChangeLog:
2020-04-26 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94752
* coroutines.cc (morph_fn_to_coro): Ensure that
unnamed function params have a usable and distinct
frame field name.
gcc/testsuite/ChangeLog:
2020-04-26 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94752
* g++.dg/coroutines/pr94752.C: New test.
Pragma inline affects whether functions are inlined or not. If at the
declaration level, it affects the functions declared in the block it
controls. If inside a function, it affects the function it is enclosed
by. Support has been in the front-end for some time, but the
information was not leveraged by the code generation pass.
gcc/d/ChangeLog:
* decl.cc (get_symbol_decl): Set DECL_DECLARED_INLINE_P or
DECL_UNINLINABLE for declarations with pragma(inline).
* toir.cc (IRVisitor::visit (GccAsmStatement *)): Set ASM_INLINE_P if
in function decorated with pragma(inline).
AIX pushes a stack frame when debugging is enabled. With -fcompare-debug
this generates comparison failures because code generation is different.
This patch disables the stack push for -fcompare-debug, which is used only
for internal testing and not for the normal debug information that
is consumed by AIX tools.
This patch also removes xfails from testsuite testcases that use
-fcompare-debug and no longer fail on AIX without the stack push difference.
* config/rs6000/rs6000-logue.c (rs6000_stack_info): Don't push a
stack frame when debugging and flag_compare_debug is enabled.
testsuite/
* g++.dg/debug/dwarf2/pr61433.C: Unfail AIX.
* g++.dg/opt/pr48549.C: Same.
* g++.dg/opt/pr60002.C: Same.
* g++.dg/opt/pr80436.C: Same.
* g++.dg/opt/pr83084.C: Same.
* g++.dg/other/pr42685.C: Same.
* gcc.dg/pr41241.c: Same.
* gcc.dg/pr42629.c: Same.
* gcc.dg/pr42630.c: Same.
* gcc.dg/pr42719.c: Same.
* gcc.dg/pr42728.c: Same.
* gcc.dg/pr42889.c: Same.
* gcc.dg/pr42916.c: Same.
* gcc.dg/pr43084.c: Same.
* gcc.dg/pr43670.c: Same.
* gcc.dg/pr44023.c: Same.
* gcc.dg/pr44971.c: Same.
* gcc.dg/pr45449.c: Same.
* gcc.dg/pr46771.c: Same.
* gcc.dg/pr47684.c: Same.
* gcc.dg/pr47881.c: Same.
* gcc.dg/pr48768.c: Same.
* gcc.dg/pr50017.c: Same.
* gcc.dg/pr56023.c: Same.
* gcc.dg/pr64935-1.c: Same.
* gcc.dg/pr64935-2.c: Same.
* gcc.dg/pr65521.c: Same.
* gcc.dg/pr65779.c: Same.
* gcc.dg/pr65980.c: Same.
* gcc.dg/pr66688.c: Same.
* gcc.dg/pr70405.c: Same.
* gcc.dg/vect/pr49352.c: Same.
ipa-sra-19.c uses a vector type that elicits a non-standard ABI warning
on AIX causing a spurious testsuite failure.
* gcc.dg/ipa/ipa-sra-19.c: Add -Wno-psabi option on AIX.
AIX 7.2 XCOFF does not support DWARF 5 sections. Skip the explicit
DWARF 5 tests that emit the new loc_lists and range_lists sections.
* gcc.dg/debug/dwarf2/pr82718-1.c: Skip on AIX.
* gcc.dg/debug/dwarf2/pr82718-2.c: Skip on AIX.
Our intrinsics do not handle spans on their return values (yet),
so this creates a temporary for subref array pointers.
2020-04-25 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/94578
* trans-expr.c (arrayfunc_assign_needs_temporary): If the
LHS is a subref pointer, we also need a temporary.
2020-04-25 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/94578
* gfortran.dg/pointer_assign_14.f90: New test.
* gfortran.dg/pointer_assign_15.f90: New test.
This just enables a test that can now be run since we've
resolved the PRs blocking it.
2020-04-25 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/torture/co-ret-16-simple-control-flow.C:
Enable test.
2020-04-25 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Define to
enable PC-relative addressing for -mcpu=future.
* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Move
after OTHER_FUTURE_MASKS. Use OTHER_FUTURE_MASKS.
* config/rs6000/rs6000.c (PCREL_SUPPORTED_BY_OS): If not defined,
suppress PC-relative addressing.
(rs6000_option_override_internal): Split up error messages
checking for -mprefixed and -mpcrel. Enable -mpcrel if the target
system supports it.
P2085 clarified that a defaulted comparison operator must be the first
declaration of the function. Rejecting that avoids the ICE trying to
compare the noexcept-specifications.
gcc/cp/ChangeLog
2020-04-24 Jason Merrill <jason@redhat.com>
PR c++/94583
* decl.c (redeclaration_error_message): Reject defaulted comparison
operator that has been previously declared.
This adds a note suggesting to enable concepts whenever 'requires' is parsed as
an invalid type name with concepts disabled.
gcc/cp/ChangeLog:
* parser.c (cp_parser_diagnose_invalid_type_name): Suggest enabling
concepts if the invalid identifier is 'requires'.
gcc/testsuite/ChangeLog:
* g++.dg/concepts/diagnostic11.C: New test.
* New core.math.toPrec templates have been added as an intrinsic.
Some floating point algorithms, such as Kahan-Babuska-Neumaier
Summation, require rounding to specific precisions. Rounding to
precision after every operation, however, loses overall precision in
the general case and is a runtime performance problem.
Adding these functions guarantees the rounding at required points in
the code, and documents where in the algorithm the requirement exists.
* Support IBM long double types in core.internal.convert.
* Add missing aliases for 64-bit vectors in core.simd.
* RUNNABLE_PHOBOS_TEST directive has been properly integrated into the
D2 language testsuite.
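To illustrate why the summation algorithm named above depends on rounding
at specific points, here is a minimal C++ sketch of Kahan-Babuska-Neumaier
summation. It is not the D implementation and does not use toPrec; it
simply assumes that every operation is rounded to double (no excess
precision, no -ffast-math), which is the kind of guarantee toPrec is meant
to provide. The function names are made up for illustration:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Compensated summation: 'comp' accumulates the low-order bits that the
// rounded addition 'sum + x' drops.
double neumaier_sum(const std::vector<double>& xs) {
  double sum = 0.0;
  double comp = 0.0;
  for (double x : xs) {
    double t = sum + x;
    if (std::fabs(sum) >= std::fabs(x))
      comp += (sum - t) + x;  // low-order bits of x were lost
    else
      comp += (x - t) + sum;  // low-order bits of sum were lost
    sum = t;
  }
  return sum + comp;
}

// Plain left-to-right summation, for comparison.
double naive_sum(const std::vector<double>& xs) {
  double sum = 0.0;
  for (double x : xs) sum += x;
  return sum;
}
```

For the input {1.0, 1e100, 1.0, -1e100}, the compensated sum is 2.0 under
the stated assumption, whereas the naive sum is 0.0. If intermediate
results were kept in a wider format, the correction terms would no longer
capture the rounding error.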
Reviewed-on: https://github.com/dlang/druntime/pull/3063
             https://github.com/dlang/dmd/pull/11054
gcc/d/ChangeLog:
* intrinsics.cc (expand_intrinsic_toprec): New function.
(maybe_expand_intrinsic): Handle toPrec intrinsics.
* intrinsics.def (TOPRECF, TOPREC, TOPRECL): Add toPrec intrinsics.
finish_call_expr already has code to set current_function_returns_abnormally
if a template calls a noreturn function, but on the following testcase it
doesn't call a FUNCTION_DECL, but TEMPLATE_DECL instead, in which case
we didn't check noreturn at all and just assumed it could return.
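A sketch of the situation, loosely modelled on the description above
rather than copied from the new test (all names are illustrative): inside
a template, the call names a noreturn function template, so control never
reaches the closing brace and -Wreturn-type should stay quiet.

```cpp
#include <cassert>
#include <stdexcept>

// A noreturn function template; a call to it inside another template
// refers to a TEMPLATE_DECL rather than a FUNCTION_DECL.
template <typename T>
[[noreturn]] void fail(const T&) {
  throw std::runtime_error("fail");
}

// No return statement: the call above never returns normally.
template <typename U>
int always_fails() {
  fail(42);
}

// Helper so the behaviour can be observed without terminating.
bool always_fails_throws() {
  try {
    always_fails<long>();
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}
```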
2020-04-25 Jakub Jelinek <jakub@redhat.com>
PR c++/94742
* semantics.c (finish_call_expr): When looking if all overloads
are noreturn, use STRIP_TEMPLATE to look through TEMPLATE_DECLs.
* g++.dg/warn/Wreturn-type-12.C: New test.
As the new testcase shows, it is not safe to assume we can optimize
a conditional store into an automatic non-addressable var; we can do it
only if we can prove that the unconditional load or store actually will
not be outside of the boundaries of the variable.
If the offset and size are constant, we can, but this is already all
checked in !tree_could_trap_p, otherwise we really need to check for
a dominating unconditional store, or for the specific case of automatic
non-addressable variables, it is enough if there is a dominating load
(that is what those 4 testcases have). tree-ssa-phiopt.c has some
infrastructure for this already, see the add_or_mark_expr method etc.,
but right now it handles only MEM_REFs with SSA_NAME first operand
and some integral offset. So, I think it can be for GCC11 extended
to handle other memory references, possibly up to just doing
get_inner_reference and hashing based on the base, offset expressions
and bit_offset and bit_size, and have also a special case that for
!TREE_ADDRESSABLE automatic variables it could ignore whether something
is a load or store because the local stack should be always writable.
But it feels way too dangerous to do this this late for GCC10, so this
patch just restricts the optimization to the safe case (where lhs doesn't
trap), and on Richi's request also ignores TREE_ADDRESSABLE bit if
flag_store_data_races, because my understanding is that the reason for the
TREE_ADDRESSABLE check is that we want to avoid introducing
store data races (if address of an automatic var escapes, some other thread
could be accessing it concurrently).
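The hazard described at the start of this message can be sketched as
follows. This is an illustration in the spirit of the new testcase, not a
copy of it, and the function name is made up. The guard keeps every store
in bounds; making the store unconditional would access arr[i] for i >= 16.

```cpp
#include <cassert>

int guarded_fill_sum(int n) {
  int arr[16] = {0};  // automatic, non-addressable variable
  for (int i = 0; i < n; i++) {
    if (i < 16)
      arr[i] = i;
    // The unsafe rewrite would be, in effect:
    //   int tmp = arr[i];             // unconditional load
    //   arr[i] = (i < 16) ? i : tmp;  // unconditional store
    // Both accesses are out of bounds once i >= 16.
  }
  int sum = 0;
  for (int i = 0; i < 16; i++) sum += arr[i];
  return sum;
}
```

Because the index is not constant, tree_could_trap_p cannot rule out a
trap, which is the case the patch now refuses to optimize.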
2020-04-25 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>
PR tree-optimization/94734
PR tree-optimization/89430
* tree-ssa-phiopt.c: Include tree-eh.h.
(cond_store_replacement): Return false if an automatic variable
access could trap. If -fstore-data-races, don't return false
just because an automatic variable is addressable.
* gcc.dg/tree-ssa/pr89430-1.c: Add xfail.
* gcc.dg/tree-ssa/pr89430-2.c: Add xfail.
* gcc.dg/tree-ssa/pr89430-5.c: Add xfail.
* gcc.dg/tree-ssa/pr89430-6.c: Add xfail.
* gcc.c-torture/execute/pr94734.c: New test.
The order of precedence used by the upstream reference compiler for
determining what library to link against is:
- No library if -nophoboslib or -fno-druntime was seen.
- The library passed to -debuglib if -g was also seen.
- The library passed to -defaultlib
- The in-tree libgphobos library.
This aligns the D language driver to follow the same rules.
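The precedence list above can be sketched as a small decision function.
This is only an illustration: LinkOptions and choose_runtime_library are
hypothetical names, not the data structures used in d-spec.cc, and the
empty string stands for "link no library".

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of the driver options that matter here.
struct LinkOptions {
  bool nophoboslib = false;  // -nophoboslib was seen
  bool no_druntime = false;  // -fno-druntime was seen
  bool debug_info = false;   // -g was seen
  std::string debuglib;      // argument of -debuglib, empty if absent
  std::string defaultlib;    // argument of -defaultlib, empty if absent
};

// Returns the library to link against, or "" for none.
std::string choose_runtime_library(const LinkOptions& o) {
  if (o.nophoboslib || o.no_druntime) return "";
  if (o.debug_info && !o.debuglib.empty()) return o.debuglib;
  if (!o.defaultlib.empty()) return o.defaultlib;
  return "gphobos";  // the in-tree libgphobos library
}
```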
gcc/d/ChangeLog:
* d-spec.cc (need_phobos): Remove.
(lang_specific_driver): Replace need_phobos with phobos_library.
Reorder -debuglib and -defaultlib to have precedence over libphobos.
(lang_specific_pre_link): Remove test for need_phobos.
The PR shows the compiler crashing with -mvsx -mlittle -O0. This turns
out to be caused by a failure to mask off the higher bits in an index
endian conversion.
2020-04-24 Segher Boessenkool <segher@kernel.crashing.org>
PR target/94710
* config/rs6000/vector.md (vec_shr_<mode> for VEC_L): Correct little
endian byteshift_val calculation.
> I haven't added (yet) checks if the alternate compiler does support these
> options (I think that can be done incrementally), so for now this testing is
> done only if the alternate compiler is not used.
This patch does that, so now when testing against a not-too-old compiler
it can also do the -std=c++14 vs. -std=c++17 testing between the under-test
and alt compilers.
2020-04-24 Jakub Jelinek <jakub@redhat.com>
PR c++/94383
* g++.dg/compat/struct-layout-1.exp: Use the -std=c++14 vs. -std=c++17
ABI compatibility testing even with ALT_CXX_UNDER_TEST, as long as
that compiler accepts -std=c++14 and -std=c++17 options.
This helps avoid spilling 64-bit constant loads to stack by simplifying the
code that LRA sees.
2020-04-24 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn.md (*mov<mode>_insn): Only split post-reload.
The vector size chosen here is for V64DImode. The concept of this setting is
not well adapted for GCN, in which the vector size varies with the number of
lanes, not the other way around, but this is ok for now.
2020-04-24 Andrew Stubbs <ams@codesourcery.com>
gcc/testsuite/
* lib/target-supports.exp (available_vector_sizes): Add amdgcn.
(check_effective_target_vect_cmdline_needed): Disable for amdgcn.
(check_effective_target_vect_pack_trunc): Add amdgcn.
Some target C libraries that aren't recognized as freestanding don't
have filesystem support, so calling tmpnam, fopen/open and
remove/unlink fails to link.
This patch introduces a fileio effective target to the testsuite, and
requires it in the tests that call tmpnam.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp (check_effective_target_fileio): New.
* gcc.c-torture/execute/fprintf-2.c: Require it.
* gcc.c-torture/execute/printf-2.c: Likewise.
* gcc.c-torture/execute/user-printf.c: Likewise.
I've had a couple of conversations now in which the shortness
of arm_sve.h was causing confusion, with people thinking that
the types and intrinsics were missing. It seems worth adding
a comment to explain what's going on.
2020-04-24 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/arm_sve.h: Add a comment.
As discussed on PR94708, it's unsafe for rtl combine to generate fp
min/max under -funsafe-math-optimizations, considering NaNs. In
addition to flag_unsafe_math_optimizations check, we also need to
do extra mode feature testing here: && !HONOR_NANS (mode)
&& !HONOR_SIGNED_ZEROS (mode)
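To see why both checks matter, consider the source-level selection that
such a min pattern corresponds to. The sketch below (select_min is a
made-up name, and the code assumes IEEE doubles without -ffast-math) shows
that the result depends on operand order for NaNs and for signed zeros, so
it cannot be replaced by an operation with different NaN or zero handling:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// The conditional form; any comparison involving a NaN is false, and
// 0.0 < -0.0 is false as well, so the second operand is returned.
double select_min(double a, double b) { return a < b ? a : b; }

// Helper producing a quiet NaN for the checks below.
double quiet_nan() { return std::numeric_limits<double>::quiet_NaN(); }
```

Here select_min (NaN, 1.0) is 1.0 while select_min (1.0, NaN) is NaN, and
select_min (0.0, -0.0) is -0.0 while select_min (-0.0, 0.0) is +0.0.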
2020-04-24 Haijian Zhang <z.zhanghaijian@huawei.com>
gcc/
PR rtl-optimization/94708
* combine.c (simplify_if_then_else): Add check for
!HONOR_NANS (mode) && !HONOR_SIGNED_ZEROS (mode).
gcc/testsuite/
PR fortran/94708
* gfortran.dg/pr94708.f90: New test.
The default test timeout duration of the gc compiler is 10 minutes,
and the current default timeout duration of gofrontend is 240 seconds,
which is not long enough for some big tests. This CL changes it to
600s, so that all tests have enough time to complete.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/229657
The testcase uses the -flto option but does not ensure that LTO support
is enabled. This patch adds the test to the testcase.
* g++.dg/cpp0x/lambda/pr94426-1.C: Require LTO.
This fixes an ICE coming from mangle.c:write_expression when building the
testsuite of range-v3; the added testcase is a reduced reproducer for the ICE.
gcc/cp/ChangeLog:
* tree.c (zero_init_expr_p): Use uses_template_parms instead of
dependent_type_p.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/dependent3.C: New test.
In the testcase below, when grokfndecl processes the operator() decl for the
lambda inside the friend function foo, processing_template_decl is rightly 1,
but template_class_depth on the lambda's closure type incorrectly returns 0
instead of 1.
Since processing_template_decl > template_class_depth, this makes grokfndecl
think that the operator() has its own set of template arguments, and so we
attach the innermost set of constraints -- those belonging to struct l -- to the
operator() decl. We then get confused when checking constraints_satisfied_p on
the operator() because it doesn't have template information and yet has
constraints associated with it.
This patch fixes template_class_depth to return the correct template nesting
level in cases like these, in that when it hits a friend function it walks into
the DECL_FRIEND_CONTEXT of the friend rather than into the CP_DECL_CONTEXT.
gcc/cp/ChangeLog:
PR c++/94645
* pt.c (template_class_depth): Walk into the DECL_FRIEND_CONTEXT of a
friend declaration rather than into its CP_DECL_CONTEXT.
gcc/testsuite/ChangeLog:
PR c++/94645
* g++.dg/cpp2a/concepts-lambda6.C: New test.
Every other *_exec insn has the exec operand last. This being the other way
around is a cause of bugs, and prevents use in macro templates.
2020-04-23 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn-valu.md (mov<mode>_exec): Swap the numbers on operands
2 and 3.
(mov<mode>_exec): Likewise.
(trunc<vndi><mode>2_exec): Swap parameters to gen_mov<mode>_exec.
(<convop><mode><vndi>2_exec): Likewise.
The GIMPLE SSA store merging pass blows up when it is rewriting the
stores because it didn't realize that they don't belong to the same
EH region. Fixed by refusing to merge them.
PR tree-optimization/94717
* gimple-ssa-store-merging.c (try_coalesce_bswap): Return false if
one of the stores doesn't have the same landing pad number as the
first.
(coalesce_immediate_stores): Do not try to coalesce the store using
bswap if it doesn't have the same landing pad number as the first.
A user reported that we are still referring to a public review
draft of the ELFv2 ABI specification. Replace that by a permalink.
2020-04-24 Bill Schmidt <wschmidt@linux.ibm.com>
* gcc/doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions):
Replace outdated link to ELFv2 ABI.
This PR is about the rs6000 backend emitting wrong assembly
for a whole vector shift by 0. While I think it is desirable
to fix the backend, I don't see why the expander should
try to emit that: a whole vector shift by 0 is the identity, so we can
just return the operand.
2020-04-23 Jakub Jelinek <jakub@redhat.com>
PR target/94710
* optabs.c (expand_vec_perm_const): For shift_amt const0_rtx
just return v2.
Normally, when we find a statement containing an await expression
this will be expanded to a statement list implementing the control
flow implied. The expansion process successively replaces each
await expression in a statement with the result of its await_resume().
In the case of conditional statements (if, while, do, switch) the
expansion of the condition (or expression in the case of do-while)
cannot take place 'inline', leading to the PR.
The solution is to evaluate the expression separately, and to
transform while and do-while loops into endless loops with a break
on the required condition.
In fixing this, I realised that I'd also made a thinko in the case
of expanding truth-and/or-if expressions, where one arm of the
expression might need to be short-circuited. The mechanism for
expanding via the tree walk will not work correctly in this case and
we need to pre-expand any truth-and/or-if with an await expression
on its conditionally-taken arm. This applies to any statement with
truth-and/or-if expressions, so can be handled generically.
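The shape of the two rewrites can be sketched in ordinary C++, without
coroutines. Source::next () stands in for a condition that would contain
an await expression; all names here are illustrative and this is not the
code generated by the compiler.

```cpp
#include <cassert>

// Stand-in for a condition with side effects.
struct Source {
  int remaining;
  bool next() { return remaining-- > 0; }
};

// Original shape: the condition is evaluated in the loop header.
int count_while(Source s) {
  int n = 0;
  while (s.next()) ++n;
  return n;
}

// Rewritten shape: an endless loop, the condition evaluated as a separate
// statement, and a break on the negated condition.
int count_endless(Source s) {
  int n = 0;
  for (;;) {
    bool cond = s.next();  // can now be expanded into a statement list
    if (!cond) break;
    ++n;
  }
  return n;
}

// Truth-and-if: the right arm must only run when the left arm is true.
bool and_if_expanded(bool lhs, Source& rhs) {
  bool result = lhs;
  if (result) result = rhs.next();  // conditionally-taken arm
  return result;
}
```

Both loop forms evaluate the condition the same number of times, and the
expanded truth-and-if keeps the short-circuit behaviour.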
gcc/cp/ChangeLog:
2020-04-23 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94288
* coroutines.cc (await_statement_expander): Simplify cases.
(struct susp_frame_data): Add fields for truth and/or if
cases, rename one field.
(analyze_expression_awaits): New.
(expand_one_truth_if): New.
(add_var_to_bind): New helper.
(coro_build_add_if_not_cond_break): New helper.
(await_statement_walker): Handle conditional expressions,
handle expansion of truth-and/or-if cases.
(bind_expr_find_in_subtree): New, checking-only.
(coro_body_contains_bind_expr_p): New, checking-only.
(morph_fn_to_coro): Ensure that we have a top level bind
expression.
gcc/testsuite/ChangeLog:
2020-04-23 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94288
* g++.dg/coroutines/torture/co-await-18-if-cond.C: New test.
* g++.dg/coroutines/torture/co-await-19-while-cond.C: New test.
* g++.dg/coroutines/torture/co-await-20-do-while-cond.C: New test.
* g++.dg/coroutines/torture/co-await-21-switch-value.C: New test.
* g++.dg/coroutines/torture/co-await-22-truth-and-of-if.C: New test.
* g++.dg/coroutines/torture/co-ret-16-simple-control-flow.C: New test.
find_tm_attribute was using TREE_PURPOSE to get the attribute name,
which is breaking now that we preserve the C++11-style attribute
format past decl_attributes. So use get_attribute_name which can
handle both formats of attributes.
PR c++/94733
* c-attribs.c (find_tm_attribute): Use get_attribute_name instead of
TREE_PURPOSE.
* g++.dg/tm/attrib-5.C: New test.
In the recent get_narrower change, I wanted it to be efficient and avoid
recursion if there are many nested COMPOUND_EXPRs. That builds the
COMPOUND_EXPR nest with the right arguments, but as build2_loc computes some
flags like TREE_SIDE_EFFECTS, TREE_CONSTANT and TREE_READONLY, when it
is called with something that will not be the argument in the end, those
flags are computed incorrectly.
So, this patch instead uses an auto_vec and builds them in the reverse order
so when they are built, they are built with the correct operands.
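The idea can be sketched with a toy expression node. This is not GCC's
tree representation and none of these names exist in GCC; the only point
is that a flag computed at build time is correct when each node is built
from its final operands, innermost first.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy node: 'side_effects' is computed once, when the node is built.
struct Node {
  bool is_compound;
  bool side_effects;
  std::shared_ptr<Node> op0, op1;
};
using NodeP = std::shared_ptr<Node>;

NodeP leaf(bool side_effects) {
  return std::make_shared<Node>(Node{false, side_effects, nullptr, nullptr});
}

NodeP build_compound(NodeP op0, NodeP op1) {
  bool se = op0->side_effects || op1->side_effects;
  return std::make_shared<Node>(Node{true, se, op0, op1});
}

// Replace the innermost value of an (a, (b, (c, x))) chain, iteratively.
NodeP replace_innermost(NodeP expr, NodeP replacement) {
  std::vector<NodeP> chain;  // outermost compound first
  while (expr->is_compound) {
    chain.push_back(expr);
    expr = expr->op1;
  }
  NodeP result = replacement;
  // Walk the vector in reverse so each node sees its final second operand.
  for (auto it = chain.rbegin(); it != chain.rend(); ++it)
    result = build_compound((*it)->op0, result);
  return result;
}
```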
2020-04-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/94724
* tree.c (get_narrower): Instead of creating COMPOUND_EXPRs
temporarily with non-final second operand and updating it later,
push COMPOUND_EXPRs into a vector and process it in reverse,
creating COMPOUND_EXPRs with the final operands.
* gcc.c-torture/execute/pr94724.c: New test.
This one took a bit of detective work. When array pointers point
to components of derived types, we currently set the span field
and then create an array temporary when we pass the array
pointer to a procedure as a non-pointer or non-target argument.
(This is inefficient, but that's for another release).
Now, the compiler detected this case when there was a direct assignment
like p => a%b, but not when p was returned either as a function result
or via an argument. This patch fixes that.
2020-04-23 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/93956
* expr.c (gfc_check_pointer_assign): Also set subref_array_pointer
when a function returns a pointer.
* interface.c (gfc_set_subref_array_pointer_arg): New function.
(gfc_procedure_use): Call it.
2020-04-23 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/93956
* gfortran.dg/pointer_assign_13.f90: New test.
Update the inline namespace to __n4861.
Define '__cpp_lib_coroutine' as 201902L, per n4861.
libstdc++-v3/ChangeLog:
2020-04-23 Iain Sandoe <iain@sandoe.co.uk>
* include/std/coroutine: Update the inline namespace to __n4861.
Add the __cpp_lib_coroutine define, set to 201902L.
* include/std/version: Add __cpp_lib_coroutine, set to 201902L.
gcc/testsuite/ChangeLog:
2020-04-23 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C: Adjust for
changed inline namespace.
* g++.dg/coroutines/coro-bad-alloc-01-bad-op-del.C: Likewise.
* g++.dg/coroutines/coro-bad-alloc-02-no-op-new-nt.C: Likewise.
* g++.dg/coroutines/coro.h: Likewise.
The bti pass currently first emits bti c at function start
if there is no paciasp (which also acts as an indirect call
landing pad), and then emits bti j at jump labels. However,
if there is a label right before paciasp, the function
start can end up like
  foo:
  label:
    bti j
    paciasp
    ...
This patch is a minimal fix that just moves the bti c handling
after the bti j handling so we end up with
  foo:
    bti c
  label:
    bti j
    paciasp
    ...
This could be improved by emitting bti jc in this case, or by
detecting that the label is not in fact an indirect jump target
and then this situation would be much less common.
Needs to be backported to gcc-9 branch.
gcc/ChangeLog:
PR target/94697
* config/aarch64/aarch64-bti-insert.c (rest_of_insert_bti): Swap
bti c and bti j handling.
gcc/testsuite/ChangeLog:
PR target/94697
* gcc.target/aarch64/pr94697.c: New test.
Add extra testing in the following two tests to make sure that
redefinition of CPP predefines on #pragma works as expected when the
-mgeneral-regs-only option is specified (see PR94678):
gcc.target/aarch64/pragma_cpp_predefs_2.c
gcc.target/aarch64/pragma_cpp_predefs_3.c
2020-04-23 Felix Yang <felix.yang@huawei.com>
gcc/testsuite/
PR target/94678
* gcc.target/aarch64/pragma_cpp_predefs_2.c: Fix typos, pop_pragma ->
pop_options. Add tests for general-regs-only.
* gcc.target/aarch64/pragma_cpp_predefs_3.c: Add tests for
general-regs-only.
2020-04-23 Andrew Stubbs <ams@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
PR middle-end/93488
gcc/
* omp-expand.c (expand_omp_target): Use force_gimple_operand_gsi on
t_async and the wait arguments.
gcc/testsuite/
* c-c++-common/goacc/pr93488.c: New file.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
This PR was caused by mismatched expectations between
vectorizable_comparison and SLP. We had a "<" comparison
between two booleans that were leaves of the SLP tree, so
vectorizable_comparison fell back on:
  /* Invariant comparison. */
  if (!vectype)
    {
      vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
                                             slp_node);
      if (maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
        return false;
    }
rhs1 and rhs2 were *unsigned* boolean types, so we got back a vector
of unsigned integers. This in itself was OK, and meant that "<"
worked as expected without the need for the boolean fix-ups:
  /* Boolean values may have another representation in vectors
     and therefore we prefer bit operations over comparison for
     them (which also works for scalar masks). We store opcodes
     to use in bitop1 and bitop2. Statement is vectorized as
       BITOP2 (rhs1 BITOP1 rhs2) or
       rhs1 BITOP2 (BITOP1 rhs2)
     depending on bitop1 and bitop2 arity. */
  bool swap_p = false;
  if (VECTOR_BOOLEAN_TYPE_P (vectype))
    {
However, vectorizable_comparison then used vect_get_slp_defs to get
the actual operands. The request went to vect_get_constant_vectors,
which also has logic to calculate the vector type. The problem was
that this type was different from the one chosen above:
if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op))
&& vect_mask_constant_operand_p (stmt_vinfo))
vector_type = truth_type_for (stmt_vectype);
else
vector_type = get_vectype_for_scalar_type (vinfo, TREE_TYPE (op), op_node);
So the function gave back a vector of mask types, which here are vectors
of *signed* booleans. This meant that "<" gave:
true (-1) < false (0)
and so the boolean fixup above was needed after all.
Fixed by making vectorizable_comparison also pick a mask type in
this case.
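The encoding mismatch described above can be reproduced in scalar form.
The following is a minimal sketch, not GCC code: it assumes that an
unsigned boolean stores true as 1 and that a signed (mask) boolean stores
true as -1, and the function names are made up for illustration.

```cpp
// Scalar model of the two boolean encodings (illustrative names, not GCC code).
//   unsigned boolean:      false = 0, true = 1
//   signed (mask) boolean: false = 0, true = -1

// "<" on unsigned booleans gives the expected answer directly.
constexpr bool lt_unsigned_bool (bool a, bool b)
{
  return (a ? 1u : 0u) < (b ? 1u : 0u);
}

// The same "<" applied to the signed mask encoding inverts the result,
// because true (-1) < false (0).
constexpr bool lt_signed_mask (bool a, bool b)
{
  return (a ? -1 : 0) < (b ? -1 : 0);
}

// Encoding-independent form: for booleans, a < b is the same as !a && b.
constexpr bool lt_bitop (bool a, bool b)
{
  return !a && b;
}
```

In this model lt_signed_mask (true, false) holds while
lt_unsigned_bool (true, false) does not, which is why the operands and the
comparison have to agree on a single vector type.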
2020-04-23 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94727
* tree-vect-stmts.c (vectorizable_comparison): Use mask_type when
comparing invariant scalar booleans.
gcc/testsuite/
PR tree-optimization/94727
* gcc.dg/vect/pr94727.c: New test.
In C++17, an empty class deriving from an empty base is an
aggregate, while in C++14 it is not. In order to implement this, GCC adds
an artificial field to such classes.
This artificial field has no mapping to Fundamental Data Types in the
AArch64 PCS ABI and hence should not count towards determining whether an
object can be passed using the vector registers as per section
"6.4.2 Parameter Passing Rules" in the AArch64 PCS.
https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#the-base-procedure-call-standard
This patch avoids counting this artificial field in
aapcs_vfp_sub_candidate, and hence calculates whether such objects
should be passed in vector registers in the same manner as C++14 (where
the artificial field does not exist).
Before this change, the test below would pass the arguments to `f` in
general registers. After this change, the test passes the arguments to
`f` using the vector registers.
The new behaviour matches the behaviour of `armclang`, and also matches
the behaviour when run with `-std=gnu++14`.
> gcc -std=gnu++17 test.cpp
``` test.cpp
struct base {};
struct pair : base
{
float first;
float second;
pair (float f, float s) : first(f), second(s) {}
};
void f (pair);
int main()
{
f({3.14, 666});
return 1;
}
```
We add a `-Wpsabi` warning to catch cases where this fix has changed the ABI for
some functions. Unfortunately this warning is emitted only once when there are
multiple calls to the same function (i.e. if `main` called `f` twice in a row we
would only emit a diagnostic for the first call), but I feel this is not much of
a problem and can be fixed later if need be.
Testing:
Bootstrap and regression test on aarch64-linux.
All struct-layout-1 tests now pass.
gcc/ChangeLog:
2020-04-23 Matthew Malcomson <matthew.malcomson@arm.com>
Jakub Jelinek <jakub@redhat.com>
PR target/94383
* config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Account for C++17
empty base class artificial fields.
(aarch64_vfp_is_call_or_return_candidate): Warn when ABI PCS decision is
different after this fix.
libgfortran/ChangeLog:
2020-04-22 Fritz Reese <foreese@gcc.gnu.org>
PR libfortran/94694
PR libfortran/94586
* intrinsics/trigd.c, intrinsics/trigd_lib.inc, intrinsics/trigd.inc:
Guard against unavailable math functions.
Use suffixes from kinds.h based on the REAL kind.
gcc/fortran/ChangeLog:
2020-04-22 Fritz Reese <foreese@gcc.gnu.org>
* trigd_fe.inc: Use mpfr to compute cosd(30) rather than a host-
precision floating point literal based on an invalid macro.
branch-protection=pac-ret is only supported with lp64 abi.
gcc/testsuite/ChangeLog:
PR target/94514
* g++.target/aarch64/pr94514.C: Require lp64.
* gcc.target/aarch64/pr94514.c: Likewise.
Anyway, based on discussion with Richard Sandiford on IRC, we should
probably test type uids instead of type pointers because type uids aren't
reused, but type pointers in a very bad luck case could be, and having the
static var at filescope and GTY((deletable)) is overkill (and has costs
during GC time).
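The idea of remembering the most recently diagnosed type by uid rather than
by pointer can be sketched as follows. This is only an illustrative model:
`Type`, `LastDiagnosed` and `demo_diagnostic_count` are hypothetical names,
`uid` stands in for TYPE_UID, and the sketch assumes that 0 is never a
valid uid.

```cpp
// Illustrative model only: `Type` and its `uid` stand in for a GCC tree type
// and TYPE_UID; none of these names are taken from rs6000-call.c.
struct Type
{
  unsigned uid;  // Unique for the whole compilation, never reused.
};

struct LastDiagnosed
{
  unsigned uid = 0;  // Assumption: 0 is never a valid uid.

  // Return true if a diagnostic should be emitted for TYPE, i.e. if it is
  // not the type that was diagnosed most recently.
  constexpr bool should_diagnose (const Type &type)
  {
    if (type.uid == uid)
      return false;
    uid = type.uid;
    return true;
  }
};

// Count the diagnostics emitted for the sequence A, A, B, A.
constexpr int demo_diagnostic_count ()
{
  LastDiagnosed last{};
  Type a = {7}, b = {9};
  int count = 0;
  count += last.should_diagnose (a);
  count += last.should_diagnose (a);  // Suppressed: same type as last time.
  count += last.should_diagnose (b);
  count += last.should_diagnose (a);  // Diagnosed again: B intervened.
  return count;
}
```

Because only a small integer is stored, nothing has to be registered with
the garbage collector, and a freed type whose address gets recycled cannot
be mistaken for the previously diagnosed one.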
2020-04-23 Jakub Jelinek <jakub@redhat.com>
PR target/94707
* config/rs6000/rs6000-call.c (rs6000_discover_homogeneous_aggregate):
Use TYPE_UID (TYPE_MAIN_VARIANT (type)) instead of type to check
if the same type has been diagnosed most recently already.
As mentioned in the PR and on IRC, the recently added struct-layout-1.exp
new tests FAIL on powerpc64le-linux (among other targets).
FAIL: tmpdir-g++.dg-struct-layout-1/t032 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t058 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t059 cp_compat_x_tst.o-cp_compat_y_tst.o execute
in particular. The problem is that the presence or absence of the C++17
artificial empty base fields, which have non-zero TYPE_SIZE but zero
DECL_SIZE, changes the ABI decisions: if such a field is present
(-std=c++17), the type might not be considered homogeneous, while if it is
absent (-std=c++14), it can be.
The following patch fixes that and emits a -Wpsabi inform; perhaps more
often than it could, because the fact that rs6000_discover_homogeneous_aggregate
returns true when it didn't in GCC 7/8/9 with -std=c++17 still doesn't
mean it will make a different ABI decision, but the warning triggered only
on the test I've changed (the struct-layout-1.exp tests use -w -Wno-psabi
already).
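The effect of ignoring the artificial field can be sketched with a toy
model of the field walk. This is a simplified illustration under stated
assumptions, not the rs6000 implementation: `FieldKind`, `Candidate` and
`float_candidate` are hypothetical names, only float members are
considered, and the old behaviour is modelled as rejecting the type
outright.

```cpp
#include <cstddef>

// Toy classification of the fields of a struct (hypothetical names).
enum class FieldKind { Float, Int, Cxx17EmptyBase };

struct Candidate
{
  int float_count;       // Number of float members, or -1 if not homogeneous.
  bool empty_base_seen;  // True if an artificial empty-base field was found.
};

// Walk FIELDS and decide whether they form a homogeneous float aggregate.
// If SKIP_EMPTY_BASE is true, artificial empty-base fields are ignored
// (but recorded), so that C++14 and C++17 layouts get the same answer.
template <std::size_t N>
constexpr Candidate
float_candidate (const FieldKind (&fields)[N], bool skip_empty_base)
{
  Candidate result = {0, false};
  for (std::size_t i = 0; i < N; ++i)
    {
      if (fields[i] == FieldKind::Cxx17EmptyBase)
        {
          result.empty_base_seen = true;
          if (skip_empty_base)
            continue;
        }
      if (fields[i] != FieldKind::Float)
        {
          result.float_count = -1;
          return result;
        }
      ++result.float_count;
    }
  return result;
}

// `struct pair : base { float first; float second; }` as modelled for
// -std=c++14 (no artificial field) and -std=c++17 (artificial field first).
constexpr FieldKind pair_cxx14[] = { FieldKind::Float, FieldKind::Float };
constexpr FieldKind pair_cxx17[]
  = { FieldKind::Cxx17EmptyBase, FieldKind::Float, FieldKind::Float };
```

With skipping enabled both layouts yield two float elements, and the
recorded empty_base_seen flag is what a -Wpsabi note could be keyed on.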
2020-04-23 Jakub Jelinek <jakub@redhat.com>
PR target/94707
* config/rs6000/rs6000-call.c (rs6000_aggregate_candidate): Add
cxx17_empty_base_seen argument. Pass it to recursive calls.
Ignore cxx17_empty_base_field_p fields after setting
*cxx17_empty_base_seen to true.
(rs6000_discover_homogeneous_aggregate): Adjust
rs6000_aggregate_candidate caller. With -Wpsabi, diagnose homogeneous
aggregates with C++17 empty base fields.
* g++.dg/tree-ssa/pr27830.C: Use -Wpsabi -w for -std=c++17 and higher.
On the following testcase GCC ICEs, because last_decl is error_mark_node,
and diag_attr_exclusions assumes that if it is not NULL, it must be a decl.
The following patch just doesn't diagnose attribute exclusions if the
other decl is erroneous (and thus we've already reported errors for it).
2020-04-23 Jakub Jelinek <jakub@redhat.com>
PR c/94705
* attribs.c (decl_attribute): Don't diagnose attribute exclusions
if last_decl is error_mark_node or has such a TREE_TYPE.
* gcc.dg/pr94705.c: New test.
My fix for PR94549 broke constraints_satisfied_p in the case where the inherited
constructor decl points to an instantiation of a constructor template coming
from an instantiation of a class template.
This is because the DECL_TI_ARGS of the inherited constructor decl in this case
contains only the innermost level of template arguments (those for the
constructor template), but constraint satisfaction expects to have the full set
of template arguments. This causes template argument substitution during
constraint satisfaction to fail in various ways.
On the other hand, the DECL_TI_ARGS of the DECL_INHERITED_CTOR is a full set of
template arguments but with the innermost level still in its dependent form,
which is the source of PR94549. So if we could combine these two sets of
template arguments then we'd be golden.
This patch does just that, by effectively reverting the fix for PR94549 and
instead using add_outermost_template_args to combine the template arguments of
the inherited constructor decl with those of its DECL_INHERITED_CTOR.
gcc/cp/ChangeLog:
PR c++/94719
PR c++/94549
* constraint.cc (satisfy_declaration_constraints): If the inherited
constructor points to an instantiation of a constructor template,
remember and use its attached template arguments.
gcc/testsuite/ChangeLog:
PR c++/94719
PR c++/94549
* g++.dg/cpp2a/concepts-inherit-ctor9.C: New test.
This PR was initially accepts-invalid, but I think it's actually valid
C++20 code. My reasoning is that in C++20 we no longer require the
declaration of operator== (#if-defed in the test), because C++20's
[temp.names]/2 says "A name is also considered to refer to a template
if it is an unqualified-id followed by a < and name lookup either finds
one or more functions or finds nothing." so when we're parsing
constexpr friend bool operator==<T>(T lhs, const Foo& rhs);
we treat "operator==" as a template name, because name lookup of
"operator==" found nothing and we have an operator-function-id, which is
an unqualified-id, and it's followed by a <. So the declaration isn't
needed to treat "operator==<T>" as a template-id.
PR c++/93807
* g++.dg/cpp2a/fn-template20.C: New test.
Since the -mabi=ilp32 option is not compatible with the large code model,
require an lp64 target for the following tests:
gcc.target/aarch64/pr63304_1.c
gcc.target/aarch64/pr70120-2.c
gcc.target/aarch64/pr94530.c
gcc.target/aarch64/reload-valid-spoff.c
2020-04-22 Duan bo <duanbo3@huawei.com>
gcc/testsuite/
PR testsuite/94712
* gcc.target/aarch64/pr63304_1.c: Require lp64 target.
* gcc.target/aarch64/pr70120-2.c: Likewise.
* gcc.target/aarch64/pr94530.c: Likewise.
* gcc.target/aarch64/reload-valid-spoff.c: Likewise.
As the two testcases for PR94678 show, -mgeneral-regs-only is not handled
properly with SVE. We should issue an error message instead of expanding
SVE builtin functions when the -mgeneral-regs-only option is specified.
The middle end should never try to use vector patterns when the vector
modes have been disabled by !have_regs_of_mode. But it's still wrong
for the target to provide patterns that would inevitably lead to spill
failure due to lack of registers. So we should also add a check for
!TARGET_GENERAL_REGS_ONLY in TARGET_SVE and other SVE related macros.
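The way the SVE-related macros are chained in this patch can be modelled
with plain boolean functions. This is a minimal sketch, assuming one ISA
feature bit per extension; `TargetFlags` and the function names are
hypothetical stand-ins and not the definitions in aarch64.h.

```cpp
// Illustrative model of the macro dependencies; the struct and its members
// are hypothetical stand-ins for the ISA feature bits and the option flag.
struct TargetFlags
{
  bool isa_sve;
  bool isa_sve2;
  bool isa_sve2_aes;
  bool general_regs_only;
};

// SVE is usable only if the ISA bit is set and vector registers are allowed.
constexpr bool target_sve (TargetFlags f)
{
  return f.isa_sve && !f.general_regs_only;
}

// Each dependent feature additionally requires the feature it builds on,
// so disabling SVE switches off the whole chain.
constexpr bool target_sve2 (TargetFlags f)
{
  return f.isa_sve2 && target_sve (f);
}

constexpr bool target_sve2_aes (TargetFlags f)
{
  return f.isa_sve2_aes && target_sve2 (f);
}
```

Chaining the conditions this way means that the -mgeneral-regs-only check
only has to be written once, in the base condition.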
2020-04-22 Felix Yang <felix.yang@huawei.com>
gcc/
PR target/94678
* config/aarch64/aarch64.h (TARGET_SVE):
Add && !TARGET_GENERAL_REGS_ONLY.
(TARGET_SVE2): Add && TARGET_SVE.
(TARGET_SVE2_AES, TARGET_SVE2_BITPERM, TARGET_SVE2_SHA3,
TARGET_SVE2_SM4): Add && TARGET_SVE2.
* config/aarch64/aarch64-sve-builtins.h
(sve_switcher::m_old_general_regs_only): New member.
* config/aarch64/aarch64-sve-builtins.cc (check_required_registers):
New function.
(reported_missing_registers_p): New variable.
(check_required_extensions): Call check_required_registers before
return if all required extensions are present.
(sve_switcher::sve_switcher): Save TARGET_GENERAL_REGS_ONLY in
m_old_general_regs_only and clear MASK_GENERAL_REGS_ONLY in
global_options.x_target_flags.
(sve_switcher::~sve_switcher): Set MASK_GENERAL_REGS_ONLY in
global_options.x_target_flags if m_old_general_regs_only is true.
gcc/testsuite/
PR target/94678
* gcc.target/aarch64/sve/acle/general/nosve_6.c: New test.
For future architecture with prefix instructions, always use plq/pstq
rather than lq/stq for atomic load of quadword. Then we never have to
do the doubleword swap on little endian. Before this fix, -mno-pcrel
would generate lq with the doubleword swap (which was ok) and -mpcrel
would generate plq, also with the doubleword swap, which was wrong.
2020-04-20 Aaron Sawdey <acsawdey@linux.ibm.com>
PR target/94622
* config/rs6000/sync.md (load_quadpti): Add attr "prefixed"
if TARGET_PREFIXED.
(store_quadpti): Ditto.
(atomic_load<mode>): Do not swap doublewords if TARGET_PREFIXED as
plq will be used and doesn't need it.
(atomic_store<mode>): Ditto, for pstq.
These warnings have nothing to do with virtual functions, so "override"
is inappropriate. The warnings are just talking about defining special
members, so let's say that.
PR translation/94698
* class.c (check_field_decls): Change "override" to "define" in
-Weffc++ diagnostics.
2020-04-22 José Rui Faustino de Sousa <jrfsousa@gmail.com>
PR fortran/90350
* simplify.c (simplify_bound): In the case of assumed-size arrays
check if the reference is to a full array.
2020-04-22 José Rui Faustino de Sousa <jrfsousa@gmail.com>
PR fortran/90350
* gfortran.dg/PR90350.f90: New test.
ia64 seems to be affected too, but the backend doesn't have any
-Wpsabi warnings and I'm not sure if we really need them for an (almost?)
dead target.
2020-04-22 Jakub Jelinek <jakub@redhat.com>
PR target/94706
* config/ia64/ia64.c (hfa_element_mode): Ignore
cxx17_empty_base_field_p fields.
As multiple targets are apparently affected (I believe at least
aarch64, arm, powerpc64le, s390{,x} and ia64),
I think we should have a middle-end predicate for this, so that if we need
to tweak it, we can do it in one spot.
2020-04-22 Jakub Jelinek <jakub@redhat.com>
PR target/94383
* calls.h (cxx17_empty_base_field_p): Declare.
* calls.c (cxx17_empty_base_field_p): Define.
Since arm_acle.h includes stdint.h, its use requires the presence of
the right gnu/stub-*.h, so make sure to include arm_acle.h when
checking the effective targets that generally imply that the testcase
will include it: arm_dsp, arm_crc, arm_coproc[1-4]
This makes several tests unsupported rather than fail.
2020-04-22 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_arm_dsp)
(check_effective_target_arm_crc_ok_nocache)
(check_effective_target_arm_coproc1_ok_nocache)
(check_effective_target_arm_coproc2_ok_nocache)
(check_effective_target_arm_coproc3_ok_nocache)
(check_effective_target_arm_coproc4_ok_nocache): Include
arm_acle.h.
Since arm_cde.h includes stdint.h, its use requires the presence of
the right gnu/stub-*.h, so make sure to include it when checking the
arm_v8*m_main_cde* effective targets, otherwise we can decide CDE is
supported while it's not really (all tests that use arm_v8m_main_cde*
also include arm_cde.h anyway).
Similarly for the effective targets that also require MVE.
This makes several tests unsupported rather than fail.
2020-04-22 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* lib/target-supports.exp (arm_v8m_main_cde, arm_v8m_main_cde_fp)
(arm_v8_1m_main_cde_mve, arm_v8_1m_main_cde_mve_fp): Include
arm_cde.h and arm_mve.h as needed.
Since arm_mve.h includes stdint.h, its use requires the presence of
the right gnu/stub-*.h, so make sure to include it when checking the
arm_v8_1m_mve_ok_nocache effective target, otherwise we can decide MVE
is supported while it's not really. This makes several tests
unsupported rather than fail.
2020-04-22 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_arm_v8_1m_mve_ok_nocache): Include
arm_mve.h.
Several ARM/MVE tests can be compiled even if the toolchain does not
support -mfloat-abi=hard (softfp is OK).
Use dg-add-options arm_v8_1m_mve or arm_v8_1m_mve_fp instead of using
dg-additional-options.
2020-04-22 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_vector_float.c: Use
arm_v8_1m_mve_fp.
* gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_int.c: Use
arm_v8_1m_mve.
* gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint1.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint2.c: Likewise.
This test can pass with a hard-float toolchain, provided we don't
force -mfloat-abi=softfp.
This patch removes this useless option, as well as -save-temps which
is implied by arm_v8_1m_mve_fp.
2020-04-22 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_move_gpr_to_gpr.c: Remove
useless options.
Some MVE tests explicitly test a -mfloat-abi=hard option, but we need
to check that the toolchain actually supports it (which may not be the
case for arm-linux-gnueabi* targets). We can thus remove the related
dg-skip directives.
We also make use of dg-add-options arm_v8_1m_mve_fp and arm_v8_1m_mve
instead of duplicating the corresponding options in
dg-additional-options where we keep only -mfloat-abi to override the
option selected by arm_v8_1m_mve_fp.
2020-04-22 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: Use arm_hard_ok
effective target and arm_v8_1m_mve_fp options.
* gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Use arm_softfp_ok
effective target and arm_v8_1m_mve_fp options.
* gcc.target/arm/mve/intrinsics/mve_fpu1.c: Use arm_hard_ok
effective target and arm_v8_1m_mve options.
* gcc.target/arm/mve/intrinsics/mve_fpu2.c: Use arm_softfp_ok
effective target and arm_v8_1m_mve options.
For arm-linux-gnueabi* targets, a toolchain cannot support the
float-abi opposite to the one it has been configured for: since glibc
does not support such multilibs, we end up lacking gnu/stubs-*.h when
including stdint.h for instance.
This patch introduces two new effective targets to detect whether we
can compile tests with -mfloat-abi=softfp or -mfloat-abi=hard.
This makes it possible to mark such tests as unsupported rather than
have them fail.
2020-04-22 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* lib/target-supports.exp (arm_softfp_ok): New effective target.
(arm_hard_ok): Likewise.
gcc/
* doc/sourcebuild.texi (arm_softfp_ok, arm_hard_ok): Document.
This patch adds initial -mcpu support for the Arm Cortex-M55 CPU.
This CPU is an Armv8.1-M Mainline CPU supporting MVE.
An option to disable floating-point (and MVE) is provided with the +nofp.
For GCC 11 I'd like to add further fine-grained options to enable integer-only MVE
but that needs a bit more elaborate surgery in arm-cpus.in that I don't want to do
in GCC 10 at this stage.
As this CPU is not supported in gas and I don't want to couple GCC 10 to the very
latest binutils anyway, this CPU emits the cpu string in the assembly file as a build attribute
rather than a .cpu directive, thus sparing us the need to support .cpu cortex-m55 in gas.
The .cpu directive in gas isn't used for anything besides setting the Tag_CPU_name
build attribute anyway (which itself is not used by any tools I'm aware of).
All the architecture information used for target detection is already emitted using .arch_extension
directives and similar.
Bootstrapped and tested on arm-none-linux-gnueabihf. Also tested on arm-none-eabi.
2020-04-22 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Andre Vieira <andre.simoesdiasvieira@arm.com>
Mihail Ionescu <mihail.ionescu@arm.com>
* config/arm/arm.c (arm_file_start): Handle isa_bit_quirk_no_asmcpu.
* config/arm/arm-cpus.in (quirk_no_asmcpu): Define.
(ALL_QUIRKS): Add quirk_no_asmcpu.
(cortex-m55): Define new cpu.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Likewise.
* doc/invoke.texi (Arm Options): Document -mcpu=cortex-m55.
While '!$' with -fopenmp unsets load_line's seen_comment flag too often,
this only affects <tab> warnings; for truncation warnings, gfc_next_char_literal
re-handles the directives correctly. In terms of missed warnings, a directive
that is completely in the truncated part is not diagnosed (as it starts
with a '!').
PR fortran/94709
* scanner.c (load_line): In fixed form, also treat 'C' as comment and
'D'/'d' only with -fd-lines-as-comments. Treat '!$' with -fopenmp,
'!$acc' with -fopenacc and '!GCC$' as non-comment to permit <tab>
and truncation warnings.
PR fortran/94709
* gfortran.dg/gomp/warn_truncated.f: New.
* gfortran.dg/gomp/warn_truncated.f90: New.
This is really PR94683 part 2, handling the case in which the vector is
an identity and so doesn't need a VEC_PERM_EXPR. I should have realised
at the time that the other arm of the "if" would need the same fix.
2020-04-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94700
* tree-ssa-forwprop.c (simplify_vector_constructor): When processing
an identity constructor, use a VIEW_CONVERT_EXPR to handle mixtures
of similarly-structured but distinct vector types.
gcc/testsuite/
PR tree-optimization/94700
* gcc.target/aarch64/sve/acle/general/pr94700.c: New test.
As reported in the PR, per [dcl.fct.def.coroutine]/4 we should
be passing a reference to the object to the promise parameter
preview, and we are currently passing a pointer (this). Amend to
pass the reference.
gcc/cp/ChangeLog:
2020-04-22 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94682
* coroutines.cc (struct param_info): Add a field to note that
the param is 'this'.
(morph_fn_to_coro): Convert this to a reference before using it
in the promise parameter preview.
gcc/testsuite/ChangeLog:
2020-04-22 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94682
* g++.dg/coroutines/pr94682-preview-this.C: New test.
Some tests use --save-temps, but schedule-cleanups strictly matches
-save-temps, so we leave many temporary files after validation.
Instead of fixing every offending testcase, it's simpler and
future-proof to make schedule-cleanups handle both --save-temps and
-save-temps.
2020-04-22 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* lib/gcc-dg.exp (schedule-cleanups): Accept --save-temps.
While instantiating test(Plot) we partially instantiate the generic lambda.
We look at forward<T>(rest)... and see that it's just replacing parameter
packs with new parameter packs and tries to do a direct substitution. But
because register_parameter_specializations had built up a
NONTYPE_ARGUMENT_PACK around the new parameter pack, the substitution
failed. So let's not wrap it that way.
gcc/cp/ChangeLog
2020-04-22 Jason Merrill <jason@redhat.com>
PR c++/94546
* pt.c (register_parameter_specializations): If the instantiation is
still a parameter pack, don't wrap it in a NONTYPE_ARGUMENT_PACK.
(tsubst_pack_expansion, tsubst_expr): Adjust.
The change committed to GCC 9 to allow string literals as template arguments
caused the compiler to prune away, and thus miss diagnosing, conversion from
nullptr to int in an array initializer. After looking at various approaches
to improving the pruning, we realized that the only place the pruning is
necessary is in the mangler.
gcc/cp/ChangeLog
2020-04-22 Martin Sebor <msebor@redhat.com>
Jason Merrill <jason@redhat.com>
PR c++/94510
* decl.c (reshape_init_array_1): Avoid stripping redundant trailing
zero initializers...
* mangle.c (write_expression): ...and handle them here even for
pointers to members by calling zero_init_expr_p.
* cp-tree.h (zero_init_expr_p): Declare.
* tree.c (zero_init_expr_p): Define.
(type_initializer_zero_p): Remove.
* pt.c (tparm_obj_values): New hash_map.
(get_template_parm_object): Store to it.
(tparm_object_argument): New.
gcc/testsuite/ChangeLog
2020-04-22 Martin Sebor <msebor@redhat.com>
PR c++/94510
* g++.dg/init/array58.C: New test.
* g++.dg/init/array59.C: New test.
* g++.dg/cpp2a/nontype-class34.C: New test.
* g++.dg/cpp2a/nontype-class35.C: New test.
This updates diagnose_valid_expression to mirror the convert_to_void check added
to tsubst_valid_expression_requirement by r10-7554.
gcc/cp/ChangeLog:
PR c++/67825
* constraint.cc (diagnose_valid_expression): Check convert_to_void here
as well as in tsubst_valid_expression_requirement.
gcc/testsuite/ChangeLog:
PR c++/67825
* g++.dg/concepts/diagnostic10.C: New test.
* g++.dg/cpp2a/concepts-pr67178.C: Adjust dg-message.
A comment in satisfy_declaration_constraints says
/* For inherited constructors, consider the original declaration;
it has the correct template information attached. */
d = strip_inheriting_ctors (d);
but it looks like this comment is wrong when the inherited constructor is for an
instantiation of a constructor template. In that case, DECL_TEMPLATE_INFO is
correct and DECL_INHERITED_CTOR points to the constructor template of the base
class rather than to the particular instantiation of the constructor template
(and so the DECL_TI_ARGS of the DECL_INHERITED_CTOR are in their dependent
form).
So doing strip_inheriting_ctors in this case then eventually leads to
satisfy_associated_constraints returning true regardless of the constraints
themselves, due to the passed in 'args' being dependent.
An inherited constructor seems to have a non-empty DECL_TEMPLATE_INFO only when
it's for an instantiation of a constructor template, so this patch fixes this
issue by checking for empty DECL_TEMPLATE_INFO before calling
strip_inheriting_ctors.
There is another unguarded call to strip_inheriting_ctors in
get_normalized_constraints_from_decl, but this one seems to be safe to do
unconditionally because the rest of that function doesn't need/look at the
DECL_TI_ARGS of the decl.
gcc/cp/ChangeLog:
PR c++/94549
* constraint.cc (satisfy_declaration_constraints): Don't strip the
inherited constructor if it already has template information.
gcc/testsuite/ChangeLog:
PR c++/94549
* g++.dg/concepts/inherit-ctor3.C: Adjust expected diagnostics.
* g++.dg/cpp2a/concepts-inherit-ctor4.C: New test.
* g++.dg/cpp2a/concepts-inherit-ctor8.C: New test.
The front end now supports parenthesized initialization for arrays in
C++20, so extend std::is_nothrow_constructible to support them too.
gcc/testsuite:
PR c++/94149
* g++.dg/cpp2a/paren-init24.C: Fix FIXMEs.
libstdc++-v3:
PR c++/94149
* include/std/type_traits (__is_nt_constructible_impl): Add partial
specializations for bounded arrays with non-empty initializers.
* testsuite/20_util/is_nothrow_constructible/value_c++20.cc: New test.
gcc/ChangeLog:
PR middle-end/94647
* gimple-ssa-warn-restrict.c (builtin_access::builtin_access): Correct
the computation of the lower bound of the source access size.
(builtin_access::generic_overlap): Remove a hack for setting ranges
of overlap offsets.
gcc/testsuite/ChangeLog:
PR middle-end/94647
* c-c++-common/Warray-bounds-2.c: Adjust a test case and add a new one.
* c-c++-common/Warray-bounds-3.c: Add tests for missing warnings.
* c-c++-common/Wrestrict.c: Invert bounds in printed ranges.
* gcc.dg/Warray-bounds-59.c: New test.
* gcc.dg/Wrestrict-10.c: Add a missing warning.
* gcc.dg/Wrestrict-5.c: Adjust text of expected warning.
* gcc.dg/Wrestrict-6.c: Expect to see a range of overlap offsets.
With -mbranch-protection=pac-ret the debug info toggles the
signedness state of the return address so the unwinder knows when
the return address needs pointer authentication.
The unwind context flags were not updated according to the dwarf
frame info.
This causes unwinding across frames that were built without pac-ret
to incorrectly authenticate the return address, which corrupts the
return address on a system where PAuth is enabled.
Note: This even affects systems where all code uses pac-ret, because
when unwinding across a signal frame the return address is not signed.
gcc/testsuite/ChangeLog:
PR target/94514
* g++.target/aarch64/pr94514.C: New test.
* gcc.target/aarch64/pr94514.c: New test.
libgcc/ChangeLog:
PR target/94514
* config/aarch64/aarch64-unwind.h (aarch64_frob_update_context):
Update context->flags according to the frame state.
2020-04-21 John David Anglin <danglin@gcc.gnu.org>
* config/pa/som.h (ASM_WEAKEN_LABEL): Delete.
(ASM_WEAKEN_DECL): New define.
(HAVE_GAS_WEAKREF): Undefine.
The type compatibility handling in simplify_vector_constructor is
based on the number of elements and on element type compatibility,
but that's no longer enough to ensure that two vector types are
compatible. This patch uses a VIEW_CONVERT_EXPR if the permutation
type and result type are distinct.
2020-04-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94683
* tree-ssa-forwprop.c (simplify_vector_constructor): Use a
VIEW_CONVERT_EXPR to handle mixtures of similarly-structured
but distinct vector types.
gcc/testsuite/
PR tree-optimization/94683
* gcc.target/aarch64/sve/acle/general/pr94683.c: New test.
Jonathan reported an ABI incompatibility between C++14 and C++17 in
passing some aggregates with empty bases on aarch64 (and apparently on arm
too).
The following patch adds 3000 (by default) tests for such interoperability,
using the struct-layout-1* framework. The current 3000 tests are generated
as is (so unchanged from previous ones), and afterwards there is another set
of 3000 ones, where one of the tNNN_x.C and tNNN_y.C tests always gets the
-std=c++14 -DCXX14_VS_CXX17 options and the other one the
-std=c++17 -DCXX14_VS_CXX17 options (which one gets which is chosen
pseudo-randomly), which causes the structs to have an empty base.
I haven't added (yet) checks if the alternate compiler does support these
options (I think that can be done incrementally), so for now this testing is
done only if the alternate compiler is not used.
I had to fix a bug in the flexible array handling, because while we were
lucky in the 3000 generated tests not to have toplevel fields after a field
with a flexible array member, in the next 3000 we aren't lucky anymore.
But even with that change, diff -upr between old and new
testsuite/g++/g++.dg/g++.dg-struct-layout-1/ doesn't show any differences
except for the ^Only in... messages for the new tests in there.
Bootstrapped/regtested on x86_64-linux and i686-linux and additionally
tested on aarch64-linux, where
FAIL: tmpdir-g++.dg-struct-layout-1/t032 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t056 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t057 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t058 cp_compat_x_tst.o-cp_compat_y_tst.o execute
FAIL: tmpdir-g++.dg-struct-layout-1/t059 cp_compat_x_tst.o-cp_compat_y_tst.o execute
because of the backend bug, and with that bug fixed it succeeds.
Matthew has kindly tested it also on aarch64-linux and arm*-*.
The primary goal of the patch is to catch whether some targets other than
aarch64 or arm are affected too.
2020-04-21 Jakub Jelinek <jakub@redhat.com>
PR c++/94383
* g++.dg/compat/struct-layout-1.exp: If !$use_alt, add -c to generator
args.
* g++.dg/compat/struct-layout-1_generate.c (dg_options): Add another
%s to the start of dg-options arg.
(cxx14_vs_cxx17, do_cxx14_vs_cxx17): New variables.
(switchfiles): If cxx14_vs_cxx17, prepend -std=c++14 -DCXX14_VS_CXX17
or -std=c++17 -DCXX17_VS_CXX14 - randomly - to dg-options.
(output): Don't append further fields once one with flexible array
member is added.
(generate_random_tests): Don't use toplevel unions if cxx14_vs_cxx17.
(main): If -c, emit second set of tests for -std=c++14 vs. -std=c++17
testing.
* g++.dg/compat/struct-layout-1_x1.h (empty_base): New type.
(EMPTY_BASE): Define.
(TX): Use EMPTY_BASE.
* g++.dg/compat/struct-layout-1_y1.h (empty_base): New type.
(EMPTY_BASE): Define.
(TX): Use EMPTY_BASE.
When building the parameter mapping for an atomic constraint,
find_template_parameters does not spot the template parameter within the
conversion-type-id of a dependent conversion operator, which later leads to an
ICE during substitution when looking up the missing template argument for this
unnoticed template parameter.
gcc/cp/ChangeLog:
PR c++/94597
* pt.c (any_template_parm_r) <case IDENTIFIER_NODE>: New case. If this
is a conversion operator, visit its TREE_TYPE.
gcc/testsuite/ChangeLog:
PR c++/94597
* g++.dg/cpp2a/concepts-conv2.C: New test.
The option -mabi=ilp32 should not be used with the large code model. An
error message is added for the option conflict.
2020-04-21 Duan bo <duanbo3@huawei.com>
gcc/
PR target/94577
* config/aarch64/aarch64.c: Add an error message for option conflict.
* doc/invoke.texi (-mcmodel=large): Mention that -mcmodel=large is
incompatible with -fpic, -fPIC and -mabi=ilp32.
gcc/testsuite/
PR target/94577
* gcc.target/aarch64/pr94577.c: New test.
An ICE on darwin, when a SFINAE-context substitution produced
error_mark_node for an operand of a POINTER_PLUS_EXPR.
fold_build_pointer_plus is unprepared to deal with that, so we need to
check earlier. We had no luck reducing the testcase to something
manageable.
* pt.c (tsubst_copy_and_build) [POINTER_PLUS_EXPR]: Check for
error_mark_node.
The PR noticed that omp-low.c contains a self-assignment in the
function new_omp_context:
if (outer_ctx) {
...
ctx->outer_reduction_clauses = ctx->outer_reduction_clauses;
This is obviously useless. The original intention might have been
to copy the field from the outer_ctx to ctx. Since this is done
(properly) in the only function where this field is actually used
(in function scan_omp_for) and the field is being initialized to zero
during the struct allocation, there is no need to attempt to do
anything to this field in new_omp_context. Thus this commit
removes any assignment to the field from new_omp_context.
2020-04-21 Frederik Harwath <frederik@codesourcery.com>
PR other/94629
* gcc/omp-low.c (new_omp_context): Remove assignments to
ctx->outer_reduction_clauses and ctx->local_reduction_clauses.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
This has been fixed by the PR71311 r7-1170-g4618c453205f18
change.
2020-04-21 Jakub Jelinek <jakub@redhat.com>
PR c/94686
* gcc.c-torture/compile/pr94686.c: New test.
Coroutine ramp functions have synthesised return values (the
user-authored function body cannot have an explicit 'return').
The current implementation attempts to optimise by building
the return in-place, in the manner of C++17 code. Clearly,
that was too ambitious and the fix builds a target expr for
the constructed version and passes that to finish_return_stmt.
This also means that we now get the same error messages for
implicit use of deleted CTORs etc.
gcc/cp/ChangeLog:
2020-04-21 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94661
* coroutines.cc (morph_fn_to_coro): Simplify return
value computation.
gcc/testsuite/ChangeLog:
2020-04-21 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94661
* g++.dg/coroutines/ramp-return-a.C: New test.
* g++.dg/coroutines/ramp-return-b.C: New test.
* g++.dg/coroutines/ramp-return-c.C: New test.
si_code in siginfo_t is a macro on NetBSD, not a member of the
struct itself, so add a C trampoline for receiving its value.
Also replace references to mos.waitsemacount with the replacement and
add some helpers from os_netbsd.go in the GC repository.
Update golang/go#38538.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/228918
As an extension (there should be a CWG about this though), we support
braced-init-list as a template argument, but convert_nontype_argument
had trouble digesting them. We ICEd because of the double coercion we
perform for template arguments: convert_nontype_argument called from
finish_template_type got a { }, and since a class type was involved and
we were in a template, convert_like created an IMPLICIT_CONV_EXPR. Then
the second conversion of the same argument crashed in constexpr.c
because the IMPLICIT_CONV_EXPR had gotten wrapped in a TARGET_EXPR.
Another issue was that an IMPLICIT_CONV_EXPR leaked to constexpr.c when
building an aggregate init.
We should have instantiated the IMPLICIT_CONV_EXPR in the first call to
convert_nontype_argument, but we didn't, because the call to
is_nondependent_constant_expression returned false because it checks
!BRACE_ENCLOSED_INITIALIZER_P. Then non_dep was false even though the
expression didn't contain anything dependent and we didn't instantiate
it in convert_nontype_argument. To fix this, check
BRACE_ENCLOSED_INITIALIZER_P in cxx_eval_outermost_constant_expr rather
than in is_nondependent_*.
PR c++/94592
* constexpr.c (cxx_eval_outermost_constant_expr): Return when T is
a BRACE_ENCLOSED_INITIALIZER_P.
(is_nondependent_constant_expression): Don't check
BRACE_ENCLOSED_INITIALIZER_P.
(is_nondependent_static_init_expression): Likewise.
* g++.dg/cpp2a/nontype-class34.C: New test.
* g++.dg/cpp2a/nontype-class35.C: New test.
This PR seems to be similar to PR c++/43382, except that the recursive call to
the variadic function with trailing return type in this testcase is additionally
given some explicit template arguments.
In the first testcase below, when resolving the recursive call to 'select',
fn_type_unification first substitutes in the call's explicit template arguments
before doing unification, and so during this substitution the template argument
pack for Args is incomplete.
Since the pack is incomplete, the substitution of 'args...' in the trailing
return type decltype(f(args...)) is handled by the unsubstituted_packs case of
tsubst_pack_expansion. But the handling of this case happens _before_ we reset
local_specializations, and so the substitution ends up reusing the old binding
for 'args' from local_specializations rather than building a new one.
This patch fixes this issue by setting up local_specializations sooner in
tsubst_pack_expansion, before the handling of the unsubstituted_packs case.
It also adds a new policy to local_specialization_stack so that we could use the
class here to conditionally replace local_specializations.
gcc/cp/ChangeLog:
PR c++/94628
* cp-tree.h (lss_policy::lss_nop): New enumerator.
* pt.c (local_specialization_stack::local_specialization_stack): Handle
an lss_nop policy.
(local_specialization_stack::~local_specialization_stack): Likewise.
(tsubst_pack_expansion): Use a local_specialization_stack instead of
manually saving and restoring local_specializations. Conditionally
replace local_specializations sooner, before the handling of the
unsubstituted_packs case.
gcc/testsuite/ChangeLog:
PR c++/94628
* g++.dg/cpp0x/variadic179.C: New test.
* g++.dg/cpp0x/variadic180.C: New test.
We issue bogus -Wparentheses warnings (3 of them!) for this fold expression:
((B && true) || ...)
Firstly, issuing a warning for a compiler-generated expression is wrong
and secondly, B && true must be wrapped in ( ) otherwise you'll get
error: binary expression in operand of fold-expression.
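A minimal sketch of code with this shape, assuming C++17. The function name any_true is illustrative and is not taken from the PR's testcase:

```cpp
#include <cassert>

// Hypothetical reduction, not the PR's testcase: a unary right fold over ||
// whose operand is itself a parenthesized && expression.
template <bool... B>
constexpr bool any_true ()
{
  // The inner parentheses are required: without them the operand of the
  // fold-expression would be a bare binary expression, which is ill-formed.
  return ((B && true) || ...);
}

static_assert (any_true<false, true> (), "one element is true");
static_assert (!any_true<false, false> (), "no element is true");
static_assert (!any_true<> (), "an empty fold over || yields false");
```

Since the user has to write the inner parentheses and the compiler builds the outer expression itself, neither is something -Wparentheses should complain about.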
PR c++/94505 - bogus -Wparentheses warning with fold-expression.
* pt.c (fold_expression): Add warning_sentinel for -Wparentheses
before calling build_x_binary_op.
* g++.dg/cpp1z/fold11.C: New test.
parm = STRIP_NOPS (parm); is unnecessary and generates
warning: operation on 'parm' may be undefined [-Wsequence-point]
when cp/coroutines.cc is compiled with -std=c++11.
* coroutines.cc (captures_temporary): Don't assign the result of
STRIP_NOPS to the same variable.
The vector popcount expanders use a hardcoded subreg. This might lead
to double subregs being generated which then fail to match. With this
patch simplify_gen_subreg is used instead to fold the subregs.
gcc/ChangeLog:
2020-04-20 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/vector.md ("popcountv8hi2_vx", "popcountv4si2_vx")
("popcountv2di2_vx"): Use simplify_gen_subreg.
gcc/testsuite/ChangeLog:
2020-04-20 Andreas Krebbel <krebbel@linux.ibm.com>
* g++.dg/pr94666.C: New test.
The vsel instruction is a bit-wise select instruction. Using an
IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code being
generated in the combine pass.
With the patch the pattern is written using bit operations. However,
I've just noticed that the manual still demands a fixed point mode for
AND/IOR and friends although several targets emit bit ops on floating
point vectors (including i386, Power, and s390). So I assume this is a
safe thing to do?!
gcc/ChangeLog:
2020-04-20 Andreas Krebbel <krebbel@linux.ibm.com>
PR target/94613
* config/s390/s390-builtin-types.def: Add 3 new function modes.
* config/s390/s390-builtins.def: Add mode dependent low-level
builtin and map the overloaded builtins to these.
* config/s390/vx-builtins.md ("vec_selV_HW"): Rename to ...
("vsel<V_HW"): ... this and rewrite the pattern with bitops.
gcc/testsuite/ChangeLog:
2020-04-20 Andreas Krebbel <krebbel@linux.ibm.com>
PR target/94613
* gcc.target/s390/zvector/pr94613.c: New test.
* gcc.target/s390/zvector/vec_sel-1.c: New test.
This patch fixes a large lmbench performance regression with
128-bit SVE, compiled in length-agnostic mode.
vect_better_loop_vinfo_p (new in GCC 10) tries to estimate whether
a new loop_vinfo is cheaper than a previous one, with an in-built
preference for the old one. For variable VF it prefers the old
loop_vinfo if it is cheaper for at least one VF. However, we have
no idea how likely that VF is in practice.
Another extreme would be to do what most of the rest of the
vectoriser does, and rely solely on the constant estimated VF.
But as noted in the comment, this means that a one-unit cost
difference would be enough to pick the new loop_vinfo,
despite the target generally preferring the old loop_vinfo
where possible. The cost model just isn't accurate enough
for that to produce good results as things stand: there might
not be any practical benefit to the new loop_vinfo at the
estimated VF, and it would be significantly worse for higher VFs.
The patch instead goes for a hacky compromise: make sure that the new
loop_vinfo is also no worse than the old loop_vinfo at double the
estimated VF. For all but trivial loops, this ensures that the
new loop_vinfo is only chosen if it is better than the old one
by a non-trivial amount at the estimated VF. It also avoids
putting too much faith in the VF estimate.
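The compromise can be sketched roughly as follows. The loop_cost structure and its linear cost formula are assumptions made purely for illustration; the real comparison in vect_better_loop_vinfo_p works on the vectoriser's own cost data:

```cpp
#include <cassert>

// Toy cost model (an assumption, not GCC's): the cost figure for a loop
// candidate at runtime vectorization factor VF is overhead + per_unit * VF.
struct loop_cost
{
  unsigned overhead;
  unsigned per_unit;

  unsigned at (unsigned vf) const { return overhead + per_unit * vf; }
};

// Prefer the new candidate only if it is strictly cheaper at the estimated
// VF and no worse at double the estimated VF.
inline bool
prefer_new_loop (const loop_cost &new_cost, const loop_cost &old_cost,
                 unsigned estimated_vf)
{
  return (new_cost.at (estimated_vf) < old_cost.at (estimated_vf)
          && new_cost.at (2 * estimated_vf) <= old_cost.at (2 * estimated_vf));
}
```

In this toy model, a candidate that is only one unit cheaper at the estimated VF but scales worse is rejected, because it loses at double the estimate.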
I realise this isn't great, but it's supposed to be a conservative fix
suitable for stage 4. The only affected testcases are the ones for
pr89007-*.c, where Advanced SIMD is indeed preferred for 128-bit SVE
and is no worse for 256-bit SVE.
Part of the problem here is that if the new loop_vinfo is better,
we discard the old one and never consider using it even as an
epilogue loop. This means that if we choose Advanced SIMD over SVE,
we're much more likely to have left-over scalar elements.
Another is that the estimate provided by estimated_poly_value might have
different probabilities attached. E.g. when tuning for a particular core,
the estimate is probably accurate, but when tuning for generic code,
the estimate is more of a guess. Relying solely on the estimate is
probably correct for the former but not for the latter.
Hopefully those are things that we could tackle in GCC 11.
2020-04-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_better_loop_vinfo_p): If old_loop_vinfo
has a variable VF, prefer new_loop_vinfo if it is cheaper for the
estimated VF and is no worse at double the estimated VF.
gcc/testsuite/
* gcc.target/aarch64/sve/cost_model_8.c: New test.
* gcc.target/aarch64/sve/cost_model_9.c: Likewise.
* gcc.target/aarch64/sve/pr89007-1.c: Add -msve-vector-bits=512.
* gcc.target/aarch64/sve/pr89007-2.c: Likewise.
This testcase triggered an ICE in rtx_vector_builder::step because
we were trying to use a stepped representation for floating-point
constants. The underlying problem was that the arguments to
rtx_vector_builder were the wrong way around, meaning that some
variations were likely to be incorrectly encoded for integers
(but probably as a silent failure).
Also, aarch64_sve_expand_vector_init_handle_trailing_constants
tries to extend the trailing constant elements to a full vector
by following the "natural" pattern of the original vector, which
should generally lead to nicer constants. However, for the testcase,
we'd then end up picking a variable for some elements. Fixed by
stubbing out all variable elements with zeros.
That fix involved testing valid_for_const_vector_p. For consistency,
the patch uses the same test when finding trailing constants, instead
of the previous aarch64_legitimate_constant_p.
2020-04-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR target/94668
* config/aarch64/aarch64.c (aarch64_sve_expand_vector_init): Fix
order of arguments to rtx_vector_builder.
(aarch64_sve_expand_vector_init_handle_trailing_constants): Likewise.
When extending the trailing constants to a full vector, replace any
variables with zeros.
gcc/testsuite/
PR target/94668
* gcc.target/aarch64/sve/pr94668.c: New test.
We treat tpl-tpl-parms as types. They're not; bound-tpl-tpl-parms
are. We can get away with them being type-like. Unfortunately we
give the original level==orig_level case a canonical type, but the
reduced cases of level<orig_level get structural equality. This patch
gives them structural type always.
* pt.c (canonical_type_parameter): Assert not a tpl-tpl-parm.
(process_template_parm): tpl-tpl-parms are structural.
(rewrite_template_parm): Propagate structuralness.
We were not comparing expression pack expansions correctly. We could
consider distinct expansions equal and create two, apparently equal,
specializations that would sometimes collide. cp_tree_operand_length
says a pack has 1 operand (for mangling), whereas it actually has 3,
of which only two are significant for equality. We must special
case that in cp_tree_equal. That new code matches the hasher and the
type_pack_expansion case in structural_comp_types.
* tree.c (cp_tree_equal): [TEMPLATE_ID_EXPR, default] Refactor.
[EXPR_PACK_EXPANSION]: Add.
One of the problems hit by pr94454 was that the argument hasher was
not skipping nodes that template_args_equal would. Fixed by replacing
the STRIP_NOPS invocation by a bespoke loop. We also confuse the
canonical type machinery by treating tpl-tpl-parms as types. They're
not; bound-tpl-tpl-parms are. We can get away with them being
type-like. Unfortunately we give the original level==orig_level case
a canonical type, but the reduced cases of level<orig_level get
structural equality. That breaks the hasher because we'll use
TYPE_HASH (CANONICAL_TYPE ()) when we can. There's a note in
tsubst[TEMPLATE_TEMPLATE_PARM] about why the reduced ones cannot have
a canonical type. (I didn't feel like questioning that assertion at
this point.)
* pt.c (iterative_hash_template_arg): Strip nodes as
template_args_equal does.
[ARGUMENT_PACK_SELECT, TREE_VEC, CONSTRUCTOR]: Refactor.
[node_class:TEMPLATE_TEMPLATE_PARM]: Hash by level & index.
[node_class:default]: Refactor.
Add missing check in gfc_set_array_spec for sum of rank and corank to not
exceed GFC_MAX_DIMENSIONS.
2020-04-20 Harald Anlauf <anlauf@gmx.de>
PR fortran/93364
* array.c (gfc_set_array_spec): Check for sum of rank and corank
not exceeding GFC_MAX_DIMENSIONS.
2020-04-20 Harald Anlauf <anlauf@gmx.de>
PR fortran/93364
* gfortran.dg/pr93364.f90: New test.
2020-04-20 Steve Kargl <kargl@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/91800
* decl.c (variable_decl): Reject Hollerith constants as type
initializer.
2020-04-20 Steve Kargl <kargl@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/91800
* gfortran.dg/hollerith_9.f90: New test.
While the coroutines implementation, and most of the coroutines
tests, will operate with C++14 or newer, these tests require
facilities introduced in C++17. Add the target requirement.
gcc/testsuite/
2020-04-19 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/torture/co-await-17-capture-comp-ref.C: Require
C++17.
* g++.dg/coroutines/torture/co-ret-15-default-return_void.C: Likewise.
Returning &gfc_bad_expr when simplifying bounds after a division by zero
happened results in the division by zero error actually reaching the user.
2020-04-19 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/93500
* resolve.c (resolve_operator): If both operands are
NULL, return false.
* simplify.c (simplify_bound): If a division by zero
was seen during bound simplification, free the
corresponding expression and return &gfc_bad_expr.
2020-04-19 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/93500
* arith_divide_3.f90: New test.
Similarly to inline asm, :: (or any other number of consecutive colons) can
appear in ObjC @selector argument and with the introduction of CPP_SCOPE
into the C FE, we need to treat CPP_SCOPE as two CPP_COLON tokens.
The C++ FE already does it that way.
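As a rough illustration of the idea, the sketch below counts the colons in a selector argument. The tok enumeration and selector_colon_count are hypothetical stand-ins; the real parser works on the lexer's own token objects:

```cpp
#include <cassert>
#include <vector>

// Hypothetical token kinds standing in for the lexer's CPP_NAME, CPP_COLON
// and CPP_SCOPE.
enum class tok { name, colon, scope };

// Count the colons that make up a selector argument, treating each
// scope token (::) as two consecutive colons.
inline unsigned
selector_colon_count (const std::vector<tok> &tokens)
{
  unsigned colons = 0;
  for (tok t : tokens)
    {
      if (t == tok::colon)
        colons += 1;
      else if (t == tok::scope)
        colons += 2;
    }
  return colons;
}
```

Under this model, a selector spelled a::b: lexes as name, scope, name, colon and still contributes three colons.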
2020-04-19 Jakub Jelinek <jakub@redhat.com>
PR objc/94637
* c-parser.c (c_parser_objc_selector_arg): Handle CPP_SCOPE like
two CPP_COLON tokens.
* objc.dg/pr94637.m: New test.
Patch fixes test failure seen on X32 where a nested struct was passed in
registers, rather than via invisible reference. Now, all non-POD
structs are passed by invisible reference, not just those with a
user-defined copy constructor/destructor.
gcc/d/ChangeLog:
PR d/94609
* d-codegen.cc (argument_reference_p): Don't check TREE_ADDRESSABLE.
(type_passed_as): Build reference type if TREE_ADDRESSABLE.
* d-convert.cc (convert_for_argument): Build explicit TARGET_EXPR if
needed for arguments passed by invisible reference.
* types.cc (TypeVisitor::visit (TypeStruct *)): Mark all structs that
are not POD as TREE_ADDRESSABLE.
The intended purpose of the option is both for targets that don't
support phobos yet, and for gdc itself to support bootstrapping itself
as a self-hosted D compiler.
The libphobos testsuite has been updated to only add libphobos to the
search paths if it's being built. A new D2 testsuite directive
RUNNABLE_PHOBOS_TEST has also been patched in to disable some runnable
tests that have phobos dependencies, which is a temporary measure
until upstream DMD fixes or removes these tests entirely.
gcc/testsuite/ChangeLog:
* lib/gdc-utils.exp (gdc-convert-test): Add dg-skip-if for tests that
depend on the phobos standard library.
libphobos/ChangeLog:
* configure: Regenerate.
* configure.ac: Add --with-libphobos-druntime-only option and the
conditional ENABLE_LIBDRUNTIME_ONLY.
* configure.tgt: Define LIBDRUNTIME_ONLY.
* src/Makefile.am: Add phobos sources if not ENABLE_LIBDRUNTIME_ONLY.
* src/Makefile.in: Regenerate.
* testsuite/testsuite_flags.in: Add phobos path if compiling phobos.
The current check_effective_target_d_runtime procedure returns false if
the target is built without any core runtime library for D being
available (--disable-libphobos). This additional procedure is for
targets where the core runtime library exists, but without the higher
level standard library.
gcc/ChangeLog:
* doc/sourcebuild.texi (Effective-Target Keywords, Environment
attributes): Document d_runtime_has_std_library.
gcc/testsuite/ChangeLog:
* gdc.dg/link.d: Use d_runtime_has_std_library effective target.
* gdc.dg/runnable.d: Move phobos tests to...
* gdc.dg/runnable2.d: ...here. New test.
* lib/target-supports.exp
(check_effective_target_d_runtime_has_std_library): New.
libphobos/ChangeLog:
* testsuite/libphobos.phobos/phobos.exp: Skip if effective target is
not d_runtime_has_std_library.
* testsuite/libphobos.phobos_shared/phobos_shared.exp: Likewise.
In the testcase below, during specialization of c<int>::d, we build two
identical specializations of the parameter type b<decltype(e)::k> -- one when
substituting into c<int>::d's TYPE_ARG_TYPES and another when substituting into
c<int>::d's DECL_ARGUMENTS.
We don't reuse the first specialization the second time around as a consequence
of the fix for PR c++/56247 which made PARM_DECLs always compare different from
one another during spec_hasher::equal. As a result, when looking up existing
specializations of 'b', spec_hasher::equal considers the template argument
decltype(e')::k to be different from decltype(e'')::k, where e' and e'' are the
result of two calls to tsubst_copy on the PARM_DECL e.
Since the two specializations are considered different due to the mentioned fix,
their TYPE_CANONICAL points to themselves even though they are otherwise
identical types, and this triggers an ICE in maybe_rebuild_function_decl_type
when comparing the TYPE_ARG_TYPES of c<int>::d to its DECL_ARGUMENTS.
This patch fixes this issue at the spec_hasher::equal level by ignoring the
'comparing_specializations' flag in cp_tree_equal whenever the DECL_CONTEXTs of
the two parameters are identical. This seems to be a sufficient condition to be
able to correctly compare PARM_DECLs structurally. (This also subsumes the
CONSTRAINT_VAR_P check since constraint variables all have empty, and therefore
identical, DECL_CONTEXTs.)
gcc/cp/ChangeLog:
PR c++/94632
* tree.c (cp_tree_equal) <case PARM_DECL>: Ignore
comparing_specializations if the parameters' contexts are identical.
gcc/testsuite/ChangeLog:
PR c++/94632
* g++.dg/template/canon-type-14.C: New test.
When updating an auto return type of an abbreviated function template in
splice_late_return_type, we should also propagate PLACEHOLDER_TYPE_CONSTRAINTS
(and cv-qualifiers) of the original auto node.
gcc/cp/ChangeLog:
PR c++/92187
* pt.c (splice_late_return_type): Propagate cv-qualifiers and
PLACEHOLDER_TYPE_CONSTRAINTS from the original auto node to the new one.
gcc/testsuite/ChangeLog:
PR c++/92187
* g++.dg/concepts/abbrev5.C: New test.
* g++.dg/concepts/abbrev6.C: New test.
This time instead of having a NOP copy insn that we can completely ignore and
ultimately remove, we have a NOP set within a multi-set PARALLEL. It triggers
the same failure when the source of such a set is a hard register for the same
reasons as we've already noted in the BZ and patches-to-date.
For prior cases we've been able to mark the insn as a nop set and ignore it for
the rest of cse_insn, ultimately removing it. That's not really an option here
as there are other sets that we have to preserve.
We might be able to fix this instance by splitting the multi-set insn, but I'm
not keen to introduce splitting into cse. Furthermore, the target may not be
able to split the insn. So I considered this a non-starter.
What I finally settled on was to use the existing do_not_record machinery to
ignore the nop set within the parallel (and only that set within the parallel).
One might argue that we should always ignore a REG_UNUSED set. But I rejected
that idea -- we could have cse-able divmod insns where the first had a
REG_UNUSED note for a destination, but the second did not.
One might also argue that we could have a nop set without a REG_UNUSED in a
multi-set parallel and thus we could trigger yet another insert_regs ICE at
some point. I tend to think this is a possibility. If we see this happen,
we'll have to revisit.
PR rtl-optimization/90275
* cse.c (cse_insn): Avoid recording nop sets in multi-set parallels
when the destination has a REG_UNUSED note.
In this PR, we're ICEing on a use of an 'int... a' template parameter pack as
part of the variadic lambda init-capture [...z=a].
The unexpected thing about this variadic init-capture is that it is not
type-dependent, and so the call to do_auto_deduction from
lambda_capture_field_type actually resolves its type to 'int' instead of exiting
early like it does for a type-dependent variadic initializer. This later
confuses add_capture which, according to one of its comments, assumes that
'type' is always 'auto' for a variadic init-capture.
The simplest fix (and the approach that this patch takes) seems to be to avoid
doing auto deduction in lambda_capture_field_type when the initializer uses
parameter packs, so that we always return 'auto' even in the non-type-dependent
case.
gcc/cp/ChangeLog:
PR c++/94483
* lambda.c (lambda_capture_field_type): Avoid doing auto deduction if
the explicit initializer has parameter packs.
gcc/testsuite/ChangeLog:
PR c++/94483
* g++.dg/cpp2a/lambda-pack-init5.C: New test.
In the testcase for this PR, we try to parse the statement
A(value<0>());
first tentatively as a declaration (with a parenthesized declarator), and during
this tentative parse we end up issuing a hard error from
cp_parser_check_template_parameters about its invalidness as a declaration.
Rather than issuing a hard error, it seems we should instead simulate an error
since we're parsing tentatively. This would then allow cp_parser_statement to
recover and successfully parse the statement as an expression-statement instead.
gcc/cp/ChangeLog:
PR c++/88754
* parser.c (cp_parser_check_template_parameters): Before issuing a hard
error, first try simulating an error instead.
gcc/testsuite/ChangeLog:
PR c++/88754
* g++.dg/parse/ambig10.C: New test.
The attached patch fixes an ICE on invalid: When the return type of
a function was misdeclared with a wrong rank, we issued a warning,
but not an error (unless with -pedantic); later on, an ICE ensued.
Nothing good can come from wrongly declaring a function type
(considering the ABI), so I changed that into a hard error.
2020-04-17 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/94090
* gfortran.h (gfc_compare_interfaces): Add
optional argument bad_result_characteristics.
* interface.c (gfc_check_result_characteristics): Fix
whitespace.
(gfc_compare_interfaces): Handle new argument; return
true if function return values are wrong.
* resolve.c (resolve_global_procedure): Hard error if
the return value of a function is wrong.
2020-04-17 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/94090
* gfortran.dg/interface_46.f90: New test.
We were seeing performance regressions on 256-bit SVE with code like:
  for (int i = 0; i < count; ++i)
  #pragma GCC unroll 128
    for (int j = 0; j < 128; ++j)
      *dst++ = 1;
(derived from lmbench).
For 128-bit SVE, it's clearly better to use Advanced SIMD STPs here,
since they can store 256 bits at a time. We already do this for
-msve-vector-bits=128 because in that case Advanced SIMD comes first
in autovectorize_vector_modes.
If we handled full-loop predication well for this kind of loop,
the choice between Advanced SIMD and 256-bit SVE would be mostly
a wash, since both of them could store 256 bits at a time. However,
SVE would still have the extra prologue overhead of setting up the
predicate, so Advanced SIMD would still be the natural choice.
As things stand though, we don't handle full-loop predication well
for this kind of loop, so the 256-bit SVE code is significantly worse.
Something to fix for GCC 11 (hopefully). However, even though we
account for the overhead of predication in the cost model, the SVE
version (wrongly) appeared to need half the number of stores.
That was enough to drown out the predication overhead and meant
that we'd pick the SVE code over the Advanced SIMD code.
512-bit SVE has a clear advantage over Advanced SIMD, so we should
continue using SVE there.
This patch tries to account for this in the cost model. It's a bit
of a compromise; see the comment in the patch for more details.
2020-04-17 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_advsimd_ldp_stp_p): New function.
(aarch64_sve_adjust_stmt_cost): Add a vectype parameter. Double the
cost of load and store insns if one loop iteration has enough scalar
elements to use an Advanced SIMD LDP or STP.
(aarch64_add_stmt_cost): Update call accordingly.
gcc/testsuite/
* gcc.target/aarch64/sve/cost_model_2.c: New test.
* gcc.target/aarch64/sve/cost_model_3.c: Likewise.
* gcc.target/aarch64/sve/cost_model_4.c: Likewise.
* gcc.target/aarch64/sve/cost_model_5.c: Likewise.
* gcc.target/aarch64/sve/cost_model_6.c: Likewise.
* gcc.target/aarch64/sve/cost_model_7.c: Likewise.
This change fixes two obvious redundant assignments reported by cppcheck:
trunk.git/gcc/c/c-parser.c:16969:2: style: Variable 'data.clauses' is reassigned a value before the old one has been used. [redundantAssignment]
trunk.git/gcc/cp/call.c:5116:9: style: Variable 'arg2' is reassigned a value before the old one has been used. [redundantAssignment]
2020-04-17 Jakub Jelinek <jakub@redhat.com>
PR other/94629
* c-parser.c (c_parser_oacc_routine): Remove redundant assignment
to data.clauses.
* call.c (build_conditional_expr_1): Remove redundant assignment to
arg2.
As the testcase shows, there are unfortunately more problematic cases
in *testqi_ext_3 if the mode is not CCZmode, because the sign flag might
not behave the same between the insn with zero_extract and what we split it
into.
The previous fix to the insn condition was because *testdi_1 for mask with
upper 32-bits clear and bit 31 set is implemented using SImode test and thus
SF is set depending on that bit 31 rather than being always cleared.
But we can have other cases. On the zero_extract (which has <MODE>mode),
we can have either the pos + len == precision of <MODE>mode, or
pos + len < precision of <MODE>mode cases. The former one copies the most
significant bit into SF, the latter will have SF always cleared.
For the former case, either it is a zero_extract from a larger mode, but
then when we perform test in that larger mode, SF will be always clear and
thus mismatch from the zero_extract case (so we need to enforce CCZmode),
or it will be a zero_extract from same mode with pos 0 and len equal to
mode precision, such zero_extracts should have been really simplified
into their first operand.
For the latter case, when SF is always clear on the define_insn with
zero_extract, we need to split into something that doesn't sometimes set
SF, i.e. it has to be a test with mask that doesn't have the most
significant bit set. In some cases it can be achieved through using test
in a wider mode (e.g. in the testcase, there is
(zero_extract:SI (reg:HI) (const_int 13) (const_int 3))
which will always set SF to 0, but we split it into
(and:HI (reg:HI) (const_int -8))
which will copy the MSB of (reg:HI) into SF, but we can do:
(and:SI (subreg:SI (reg:HI) 0) (const_int 0xfff8))
which will keep SF always cleared), but there are various cases where we
can't (when already using DImode, or when SImode and we'd turned it into
the problematic *testdi_1 implemented using SImode test, or when
the val operand is a MEM (we don't want to read from memory more than
the user originally wanted), paradoxical subreg of MEM could be problematic
too if, through the narrowing, we end up with a MEM).
So, the patch attempts to require CCZmode (and not CCNOmode) if it can't
really ensure the SF will have same meaning between the define_insn and what
we split it into, and if we decide we allow CCNOmode, it needs to avoid
performing narrowing and/or widening if pos + len would indicate we'd have MSB
set in the mask.
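The sign-flag mismatch in the HImode example above can be modelled with plain integer arithmetic. The helper names below are illustrative only, and the undefined upper half of the paradoxical subreg is modelled by a caller-supplied value:

```cpp
#include <cassert>
#include <cstdint>

// SF as a TEST of the given width would set it: a copy of the result's MSB.
inline bool sf16 (uint16_t result) { return (result & 0x8000u) != 0; }
inline bool sf32 (uint32_t result) { return (result & 0x80000000u) != 0; }

// (zero_extract:SI (reg:HI) (const_int 13) (const_int 3)):
// 13 bits starting at bit 3, zero-extended to 32 bits.
inline uint32_t
zero_extract_13_3 (uint16_t reg)
{
  return (static_cast<uint32_t> (reg) >> 3) & 0x1fffu;
}

// (and:HI (reg:HI) (const_int -8))
inline uint16_t
and_hi (uint16_t reg)
{
  return static_cast<uint16_t> (reg & 0xfff8u);
}

// (and:SI (subreg:SI (reg:HI) 0) (const_int 0xfff8)).  The upper 16 bits of
// the paradoxical subreg are undefined, so they are passed in as garbage;
// the mask clears them again.
inline uint32_t
and_si (uint16_t reg, uint16_t upper_garbage)
{
  uint32_t wide = (static_cast<uint32_t> (upper_garbage) << 16) | reg;
  return wide & 0xfff8u;
}
```

For a register value of 0x8000, all three forms agree on whether the result is zero, but only the HImode AND reports SF set, which is why that split is only acceptable under CCZmode.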
2020-04-17 Jakub Jelinek <jakub@redhat.com>
Jeff Law <law@redhat.com>
PR target/94567
* config/i386/i386.md (*testqi_ext_3): Use CCZmode rather than
CCNOmode in ix86_match_ccmode if len is equal to <MODE>mode precision,
or pos + len >= 32, or pos + len is equal to operands[2] precision
and operands[2] is not a register operand. During splitting perform
SImode AND if operands[0] doesn't have CCZmode and pos + len is
equal to mode precision.
* gcc.c-torture/execute/pr94567.c: New test.
Co-Authored-By: Jeff Law <law@redhat.com>
PR lto/94612
* lto-common.c: Initialize file_data->lto_section_header
before lto_mode_identity_table call. It is needed because
it decompresses a LTO section.
delete_insn_and_edges calls purge_dead_edges whenever deleting the last insn
in a bb, whatever it is. If it called it only for mandatory last insns
in the basic block (that may not be followed by DEBUG_INSNs, dunno if that
is control_flow_insn_p or something more complex), that wouldn't be a
problem, but as it calls it on any last insn and can actually do something
in the bb, if such an insn is followed by one or more DEBUG_INSNs and
nothing else in the same bb, we don't call purge_dead_edges with -g and do
call it with -g0.
On the testcase, there are two reg-to-reg moves with REG_EH_REGION notes
(previously memory accesses but simplified and not yet optimized), and the
second is followed by DEBUG_INSNs; the second move is delete_insn_and_edges
and after removing it, for -g0 purge_dead_edges removes the REG_EH_REGION
from the now last insn in the bb (the first reg-to-reg move), while
for -g it isn't called and things quickly diverge from there.
Fixed by calling purge_dead_edges even if we remove the last real insn
followed only by DEBUG_INSNs in the same bb.
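The difference between the old and new conditions can be sketched on a toy basic block. The insn_kind enumeration and both predicates are hypothetical models, not the cfgrtl.c code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a basic block: each insn is either a real insn or a debug insn.
enum class insn_kind { real, debug };

// Old condition: purge only if the deleted insn is the very last one.
inline bool
purge_old (const std::vector<insn_kind> &bb, std::size_t deleted)
{
  return deleted + 1 == bb.size ();
}

// New condition: purge also if everything after the deleted insn is a
// debug insn, so that -g and -g0 take the same decision.
inline bool
purge_new (const std::vector<insn_kind> &bb, std::size_t deleted)
{
  for (std::size_t i = deleted + 1; i < bb.size (); ++i)
    if (bb[i] != insn_kind::debug)
      return false;
  return true;
}
```

With the old predicate, deleting the second real insn gives different answers depending on whether trailing debug insns are present; the new predicate answers the same in both cases.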
2020-04-17 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/94618
* cfgrtl.c (delete_insn_and_edges): Set purge not just when
insn is the BB_END of its block, but also when it is only followed
by DEBUG_INSNs in its block.
* g++.dg/opt/pr94618.C: New test.
When I added the VLA tweak for OpenMP to avoid error_mark_nodes in the IL in
type, I forgot that TYPE_DOMAIN could be NULL. Furthermore, as an optimization,
this patch checks the hopefully cheapest condition that is very likely false
most of the time (enabled only during OpenMP handling) first.
2020-04-17 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94621
* tree-inline.c (remap_type_1): Don't dereference NULL TYPE_DOMAIN.
Move id->adjust_array_error_bounds check first in the condition.
* gcc.c-torture/compile/pr94621.c: New test.
PR gcov-profile/94570
* ltmain.sh: Do not define HAVE_DOS_BASED_FILE_SYSTEM
for CYGWIN.
PR gcov-profile/94570
* coverage.c (coverage_init): Use separator properly.
PR gcov-profile/94570
* filenames.h (defined): Do not define HAVE_DOS_BASED_FILE_SYSTEM
for CYGWIN.
Co-Authored-By: Jonathan Yong <10walls@gmail.com>