This test currently fails on VxWorks 7 SR06x0 targets in kernel
mode, because it expects a discrepancy between the built-in and
system intmax_t on all VxWorks targets in kernel mode. Fortunately,
that discrepancy has been fixed when targeting VxWorks 7 SR06x0, so
this commit adjusts the "dg-error" condition to exclude newer
versions of VxWorks 7.
for gcc/testsuite/ChangeLog
* gcc.dg/intmax_t-1.c: Do not expect an error on *-*-vxworks7r*
targets.
Match xfail on kernel instead of rtp mode.
for gcc/testsuite/ChangeLog
* gcc.dg/pthread-init-1.c: Fix the VxWorks xfail filters.
* gcc.dg/pthread-init-2.c: Ditto.
Explicitly disable, in the testsuite, some features missing on
VxWorks that the current feature tests incorrectly detect as present.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp (check_weak_available,
check_fork_available, check_effective_target_lto,
check_effective_target_mempcpy): Add vxworks filters.
The implicit -mlong-calls used in our vxworks configurations changes
the call sequences from those expected in the mve_libcall testcases.
This patch brings the test output in line with the expectations, with
an explicit -mno-long-calls.
for gcc/testsuite/ChangeLog
* gcc.target/arm/mve/intrinsics/mve_libcall1.c: Pass an
explicit -mno-long-calls.
* gcc.target/arm/mve/intrinsics/mve_libcall2.c: Likewise.
The implicit -mlong-calls from our vxworks configurations makes the
tail-call instructions differ from those expected by the
no_unique_address tests in gcc.target/arm.
This patch adds -mno-long-calls to the compilation commands, so that
we generate the expected sequences.
for gcc/testsuite/ChangeLog
* g++.target/arm/no_unique_address_1.C: Add -mno-long-calls.
* g++.target/arm/no_unique_address_2.C: Likewise.
The headmerge tests pass a constant to conditional calls, so that the
same constant is always passed to a function, though it's a different
function depending on which path is taken.
The test checks that the constant appears only once in the assembly
output, as a means to verify that the insns setting up the argument
are unified: they appear as separate insns up to jump2, where
crossjump identifies a common prefix to all conditional paths and
unifies them.
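As a hypothetical sketch (not the actual testcase), the tests have
roughly this shape:

  void g1 (int);
  void g2 (int);

  void
  f (int x)
  {
    /* Both paths pass the same constant to a (different) callee, so
       the argument-register setup is a common prefix that crossjump
       can unify.  */
    if (x)
      g1 (5);
    else
      g2 (5);
  }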
Alas, with the -mlong-calls that we enable in our arm-vxworks
configurations, the argument register is loaded only after the
callee address is loaded into another register. Since each path
calls a different function, there is no common initial code sequence
for crossjump to unify, the argument-register setup remains
duplicated, and the test fails.
Though it would surely be desirable for the compiler to unify the
argument-register setup even in this case, this patch merely avoids
the effects of -mlong-calls with an explicit -mno-long-calls.
for gcc/testsuite/ChangeLog
* gcc.target/arm/headmerge-1.c: Add -mno-long-calls.
* gcc.target/arm/headmerge-2.c: Likewise.
The implicit -mlong-calls used in our arm-vxworks configurations
changes the register allocation patterns in the arm/fp16-aapcs-2.c
test: r3 ends up used in the long-call sequence, and we end up using
ip as a temporary, which doesn't match the expected mov patterns.
This patch adds an explicit -mno-long-calls for the generated code to
match the expectation.
for gcc/testsuite/ChangeLog
* gcc.target/arm/fp16-aapcs-2.c: Use -mno-long-calls.
On some targets, the dump contains no "< 8191;" and ">= 8191;"
strings, but rather "< 8191)" and ">= 8191)", so just remove the ";"
from the regexps.
2021-01-01 Jakub Jelinek <jakub@redhat.com>
PR testsuite/98489
PR tree-optimization/56719
* gcc.dg/tree-ssa/pr56719.c: Remove semicolon from
scan-tree-dump-times regexps.
In this testcase we end up with:
unsigned long long x = ...;
char y = (char) (x << 37);
The overwidening pattern realised that only the low 8 bits
of x << 37 are needed, but then tried to turn that into:
unsigned long long x = ...;
char y = (char) x << 37;
which gives an out-of-range shift. In this case y can simply
be replaced by zero, but as the comment in the patch says,
it's kind-of awkward to do that in the middle of vectorisation.
Most of the overwidening stuff is about keeping operations
as narrow as possible, which is important for vectorisation
but could be counter-productive for scalars (especially on
RISC targets). In contrast, optimising y to zero in the above
feels like an independent optimisation that would benefit scalar
code and that should happen before vectorisation.
gcc/
PR tree-optimization/98302
* tree-vect-patterns.c (vect_determine_precisions_from_users): Make
sure that the precision remains greater than the shift count.
gcc/testsuite/
PR tree-optimization/98302
* gcc.dg/vect/pr98302.c: New test.
This PR is about a case in which the vectoriser was feeding
incorrect alignment information to tree-data-ref.c, leading
to incorrect runtime alias checks. The alignment was taken
from the TREE_TYPE of the DR_REF, which in this case was a
COMPONENT_REF with a normally-aligned type. However, the
underlying MEM_REF was only byte-aligned.
This patch uses dr_alignment to calculate the (byte) alignment
instead, just like we do when creating vector MEM_REFs.
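A minimal sketch of the situation (hypothetical, not the committed
testcase):

  /* The accesses a[i].x and b[i].x are COMPONENT_REFs whose type
     (int) is normally aligned, but because the struct is packed the
     underlying MEM_REFs may be only byte-aligned, and the runtime
     alias checks must not assume more.  */
  struct __attribute__ ((packed)) st { int x; };

  void
  f (struct st *a, struct st *b, int n)
  {
    for (int i = 0; i < n; ++i)
      a[i].x = b[i].x + 1;
  }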
gcc/
PR tree-optimization/94994
* tree-vect-data-refs.c (vect_vfa_align): Use dr_alignment.
gcc/testsuite/
PR tree-optimization/94994
* gcc.dg/vect/pr94994.c: New test.
The static GET_MODE_MASKs for SVE vectors are based on the
static precisions, which in turn are based on 128-bit SVE.
The precisions are later updated based on -msve-vector-bits
(usually to become variable length), but the GET_MODE_MASK
stayed the same. This caused combine to fold:
(*_extract:DI (subreg:DI (reg:VNxMM R) 0) ...)
to zero because the extracted bits appeared to be insignificant.
gcc/
PR rtl-optimization/98214
* genmodes.c (emit_insn_modes_h): Emit a definition of CONST_MODE_MASK.
(emit_mode_mask): Treat mode_mask_array as non-constant if adj_nunits.
(emit_mode_adjustments): Update GET_MODE_MASK when updating
GET_MODE_NUNITS.
* machmode.h (mode_mask_array): Use CONST_MODE_MASK.
The following patch adds some clz simplifications. If
clz is 0, then the MSB of the argument is set, and if clz is prec-1, then
the argument is 1.
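For a 32-bit unsigned int, the recognized idioms look like this
(a sketch):

  int
  f1 (unsigned int x)
  {
    /* clz (x) == 0 means the MSB is set; folds to (int) x < 0.  */
    return __builtin_clz (x) == 0;
  }

  int
  f2 (unsigned int x)
  {
    /* clz (x) == prec - 1 means only the low bit can be set;
       folds to x == 1.  */
    return __builtin_clz (x) == 31;
  }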
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94802
* match.pd (clz(X) == 0 -> (int)X < 0): New simplification.
(clz(X) == (prec-1) -> X == 1): Likewise.
* gcc.dg/tree-ssa/pr94802-1.c: New test.
The following patch adds two simplifications to recognize idioms
for ABS_EXPR resp. ABSU_EXPR.
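The idioms look like this (a sketch):

  int
  my_abs (int x)
  {
    /* -(x < 0) | 1 evaluates to -1 for negative x and 1 otherwise,
       so the product is recognized as ABS_EXPR.  */
    return (-(x < 0) | 1) * x;
  }

  unsigned int
  my_absu (int x)
  {
    /* The same idiom with an unsigned multiplication is recognized
       as ABSU_EXPR.  */
    return (-(x < 0) | 1U) * x;
  }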
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94785
* match.pd ((-(X < 0) | 1) * X -> abs (X)): New simplification.
((-(X < 0) | 1U) * X -> absu (X)): Likewise.
* gcc.dg/tree-ssa/pr94785.c: New test.
The following testcase is miscompiled, because niter analysis miscomputes
the number of iterations to 0.
The problem is that niter analysis uses mpz_t (wonder why, wouldn't
widest_int do the same job?) and when wi::to_mpz is called e.g. on
the TYPE_MAX_VALUE of __uint128_t, it initializes the mpz_t result
with the wrong value.
wi::to_mpz has code to handle negative wide_ints in signed types by
inverting all bits, importing that to mpz and complementing it, which
is fine, but it doesn't correctly handle the case where the wide_int's
len (times HOST_BITS_PER_WIDE_INT) is smaller than the precision and
wi::neg_p is true.
E.g. the 0xffffffffffffffffffffffffffffffff TYPE_MAX_VALUE is represented
in wide_int as 0xffffffffffffffff len 1, and wi::to_mpz would create
0xffffffffffffffff mpz_t value from that.
This patch handles it by appending the needed all-ones (-1) host wide
int words (and also has code to deal with precisions that aren't a
multiple of HOST_BITS_PER_WIDE_INT).
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98474
* wide-int.cc (wi::to_mpz): If wide_int has MSB set, but type
is unsigned and excess negative, append set bits after len until
precision.
* gcc.c-torture/execute/pr98474.c: New test.
The following testcase is diagnosed by UBSan as invalid, even though
it is valid.
We have a Base2 base subobject at offset 1 with alignment 1 and do:
(const Derived &) ((const Base2 *) this + -1)
but the folder, before ubsan in the FE gets a chance to instrument
it, optimizes that into:
(const Derived &) this + -1
and so we require that this have the 8-byte alignment which the
Derived class needs.
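A hypothetical sketch of such a hierarchy (not the committed
testcase):

  struct Derived;
  struct Base1 { char a; };
  struct Base2
  {
    char b;
    const Derived &self () const;
  };
  /* Derived needs 8-byte alignment; its Base2 subobject sits at
     offset 1 with alignment 1.  */
  struct Derived : Base1, Base2 { long long x; };

  const Derived &
  Base2::self () const
  {
    /* The base-to-derived adjustment here is effectively
       (const Derived &) ((const Base2 *) this + -1), which the
       folder would turn into (const Derived &) this + -1.  */
    return *static_cast<const Derived *> (this);
  }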
Fixed by avoiding such an optimization when -fsanitize=alignment is in
effect if it would affect the alignments (and guarded with !in_gimple_form
because we don't really care during GIMPLE, though pointer conversions are
useless then and so such folding isn't needed very much during GIMPLE).
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR c++/98206
* fold-const.c: Include asan.h.
(fold_unary_loc): Don't optimize (ptr_type) (((ptr_type2) x) p+ y)
into ((ptr_type) x) p+ y if sanitizing alignment in GENERIC and
ptr_type points to type with higher alignment than ptr_type2.
* g++.dg/ubsan/align-4.C: New test.
The following patch adds an optimization mentioned in PR56719 #c8.
We already have the x != 0 && y != 0 && z != 0 into (x | y | z) != 0
and x != -1 && y != -1 && z != -1 into (x & y & z) != -1
optimizations; this patch just extends that to turn
x < C && y < C && z < C, for power-of-two constants C, into
(x | y | z) < C (for unsigned comparisons).
I didn't want to create too many buckets (there can be TYPE_PRECISION
such constants), so the patch instead just uses one bucket for all
such constants and loops over that bucket up to TYPE_PRECISION times.
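For example (a sketch with the power-of-two constant 16):

  int
  f (unsigned x, unsigned y, unsigned z)
  {
    /* Each unsigned comparison < 16 checks that no bit above bit 3
       is set, so the conjunction folds to (x | y | z) < 16.  */
    return x < 16 && y < 16 && z < 16;
  }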
2020-12-31 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/56719
* tree-ssa-reassoc.c (optimize_range_tests_cmp_bitwise): Also optimize
x < C && y < C && z < C when C is a power of two constant into
(x | y | z) < C.
* gcc.dg/tree-ssa/pr56719.c: New test.
Symbols with extern(D) linkage are now mangled using back references to
types and identifiers if these occur more than once in the mangled name
as emitted before. This reduces symbol length, especially with chained
expressions of templated functions with Voldemort return types.
For example, the average symbol length of the 127000+ symbols created by
a libphobos unittest build is reduced by a factor of about 3, while the
longest symbol shrinks from 416133 to 1142 characters.
Reviewed-on: https://github.com/dlang/dmd/pull/12079
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 2bd4fc3fe.
This does not yet include support for the //go:embed directive added
in this release.
* Makefile.am (check-runtime): Don't create check-runtime-dir.
(mostlyclean-local): Don't remove check-runtime-dir.
(check-go-tool, check-vet): Copy in go.mod and modules.txt.
(check-cgo-test, check-carchive-test): Add go.mod file.
* Makefile.in: Regenerate.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/280172
There is no need for combine splitters to emit insn patterns with
clobbers; the pass is smart enough to add clobbers to patterns as
necessary.
2020-12-30 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386.md: Remove unnecessary clobbers
from combine splitters.
The implementation in d-lang.cc was based on what was present in libcpp.
This synchronizes the escaping logic to match the current version.
gcc/d/ChangeLog:
* d-lang.cc (deps_add_target): Handle quoting ':' character.
Reimplement backslash tracking.
CST trees that were converted back to a D front-end AST node lost all
location information of the original expression. Now this is propagated
on to the literal expression.
gcc/d/ChangeLog:
* d-tree.h (d_eval_constant_expression): Add location argument.
* d-builtins.cc (d_eval_constant_expression): Give generated constants
a proper file location.
* d-compiler.cc (Compiler::paintAsType): Pass expression location to
d_eval_constant_expression.
* d-frontend.cc (eval_builtin): Likewise.
The following patch adds combine splitters to optimize:
- vpcmpeqd %ymm1, %ymm1, %ymm1
- vpandn %ymm1, %ymm0, %ymm0
vpmovmskb %ymm0, %eax
+ notl %eax
etc. (for vectors with fewer than 32 elements, with xorl instead of
notl).
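A sketch of the kind of source that now matches (assuming GCC's
vector-extension ~ on __m256i and compilation with -mavx2; not the
committed testcases):

  #include <immintrin.h>

  int
  f (__m256i x)
  {
    /* movemask of a bitwise-NOT vector: rather than materializing
       the NOT with vpcmpeqd + vpandn, emit vpmovmskb and invert the
       scalar mask with notl.  */
    return _mm256_movemask_epi8 (~x);
  }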
2020-12-30 Jakub Jelinek <jakub@redhat.com>
PR target/98461
* config/i386/sse.md (<sse2_avx2>_pmovmskb): Add splitters
for pmovmskb of NOT vector.
* gcc.target/i386/sse2-pr98461.c: New test.
* gcc.target/i386/avx2-pr98461.c: New test.
2020-12-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/97612
* primary.c (build_actual_constructor): Missing allocatable
components are set unallocated using EXPR_NULL. Then missing
components are tested for a default initializer.
gcc/testsuite/
PR fortran/97612
* gfortran.dg/structure_constructor_17.f90: New test.
2020-12-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/93833
* trans-array.c (get_array_ctor_var_strlen): If the character
length backend_decl cannot be found, convert the expression and
use the string length. Clear up some minor white space issues
in the rest of the file.
gcc/testsuite/
PR fortran/93833
* gfortran.dg/deferred_character_36.f90 : New test.
The ARC code contains code which should only be used with the old
reload pass. Such code is found in the arc_secondary_reload hook;
however, it was not properly guarded. Reverse the if-condition
predicate such that reg_equiv_mem is called when lra is not in
progress.
gcc/
2020-12-29 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (arc_secondary_reload): Flip if-condition
predicates.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
The REGNO_OK_FOR_BASE_P macro uses the reg_renumber array. However,
that array is not always defined. Use it only when it is defined.
gcc/
2020-12-29 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.h (REGNO_OK_FOR_BASE_P): Check if defined
reg_renumber.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
We need a temporary register when moving data from cached memory to
uncached memory. Fix this issue and add a test for it.
gcc/
2020-12-29 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (prepare_move_operands): Use a temporary
register when we have cached mem-to-uncached mem moves.
gcc/testsuite/
2020-12-29 Vladimir Isaev <isaev@synopsys.com>
* gcc.target/arc/uncached-9.c: New test.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
Update the movdi, movdf and vector mov patterns not to use
predicated vadd2 instructions. vadd2 is used as a "fast" move in
these patterns. This fixes a number of dejagnu failures.
gcc/
2020-12-29 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.md (movdi_insn): Update pattern, no predicated
vadd2 usage.
(movdf_insn): Likewise.
* config/arc/simdext.md (movVEC_insn): Likewise.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
Use copy_to_reg where appropriate, use int_mode_for_mode
and fix comment indentation.
2020-12-29 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386-expand.c (ix86_gen_TWO52): Use REAL_MODE_FORMAT
to determine number of mantissa bits. Use real_2expN instead
of real_ldexp.
(ix86_expand_rint): Use copy_to_reg.
(ix86_expand_floorceildf_32): Ditto.
(ix86_expand_truncdf_32): Ditto.
(ix86_expand_rounddf_32): Ditto.
(ix86_expand_floorceil): Use copy_to_reg and int_mode_for_mode.
(ix86_expand_trunc): Ditto.
(ix86_expand_round): Ditto.
The libgomp texinfo docs lead to an invalid "up" link on the Top node,
which we can avoid similarly to the Top link in the main GCC manual.
2020-12-28 Sandra Loosemore <sandra@codesourcery.com>
libgomp/
* libgomp.texi (Top): Avoid bad "up" link.
Support for HSAIL has been deprecated in GCC 10, and the upstream
web server has been down for weeks.
gcc/
2020-12-28 Gerald Pfeifer <gerald@pfeifer.com>
* doc/standards.texi (HSAIL): Remove section.
The ix86_expand_rint expander uses ix86_sse_copysign_to_positive,
which is unable to change the sign from - to +. When the FE_DOWNWARD
rounding direction is in effect, the expanded sequence, which
involves a subtraction, can trigger the x - x = -0.0 special rule.
ix86_sse_copysign_to_positive then fails to change the sign of the
intermediate value, assumed to always be positive, back to positive.
The patch adds one extra fabs that strips the sign from the intermediate
value when flag_rounding_math is in effect.
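The underlying IEEE rule is easy to demonstrate (a small example,
not the committed testcase):

  #include <fenv.h>
  #include <stdio.h>

  int
  main (void)
  {
    /* Under round-toward-negative, exact cancellation yields -0.0,
       so an intermediate x - x in the expanded rint sequence can
       come out with the sign bit set.  */
    fesetround (FE_DOWNWARD);
    volatile double x = 1.5;
    printf ("%g\n", x - x);  /* prints -0 */
    return 0;
  }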
2020-12-28 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/96793
* config/i386/i386-expand.c (ix86_expand_rint):
Remove the sign of the intermediate value for flag_rounding_math.
gcc/testsuite/
PR target/96793
* gcc.target/i386/pr96793-2.c: New test.
It is possible to avoid the call to force_reg and use the existing
temporary register in the ix86_expand_trunc, ix86_expand_round and
ix86_expand_rounddf_32 expanders.
2020-12-28 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386-expand.c (ix86_expand_trunc): Use
existing temporary register to avoid a call to force_reg.
gcc:
2020-12-27 Gerald Pfeifer <gerald@pfeifer.com>
* doc/analyzer.texi (Analyzer Internals): Find a new source for
the "A Memory Model for Static Analysis of C Programs" paper.
2020-12-27 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/97694
PR fortran/97723
* check.c (allocatable_check): Select rank temporaries are
permitted even though they are treated as associate variables.
* resolve.c (gfc_resolve_code): Break on select rank as well as
select type so that the block is resolved.
* trans-stmt.c (trans_associate_var): Class associate variables
that are optional dummies must use the backend_decl.
gcc/testsuite/
PR fortran/97694
PR fortran/97723
* gfortran.dg/select_rank_5.f90: New test.
2020-12-26 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/98022
* data.c (gfc_assign_data_value): Throw an error for inquiry
references. Follow with corrected code that would provide the
expected result and provide clean error recovery.
gcc/testsuite/
PR fortran/98022
* gfortran.dg/data_inquiry_ref.f90: Change to dg-compile and
add errors for inquiry references.
2020-12-23 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/83118
* trans-array.c (gfc_alloc_allocatable_for_assignment): Make
sure that class expressions are captured for dummy arguments by
use of gfc_get_class_from_gfc_expr, otherwise the wrong vptr is
used.
* trans-expr.c (gfc_get_class_from_gfc_expr): New function.
(gfc_get_class_from_expr): If a constant expression is
encountered, return NULL_TREE.
(gfc_trans_assignment_1): Deallocate rhs allocatable components
after passing derived type function results to class lhs.
* trans.h: Add prototype for gfc_get_class_from_gfc_expr.