On x86, since -fPIC and -shared should be used to create offload image,
we put them the last to properly create offload image.
2019-11-11 H.J. Lu <hjl.tools@gmail.com>
PR target/87833
* config/i386/intelmic-mkoffload.c (prepare_target_image): Put
-fPIC and -shared the last to create offload image.
From-SVN: r278041
The 'gcc/configure' script sources all 'gcc/*/config-lang.in' files, but fails
to emit such dependency information into the build machinery. That means,
currently, when something gets changed in a 'gcc/*/config-lang.in' file, this
is not noticed, and doesn't propagate through the build machinery.
Handling of configure fragments is modelled in the same way as it already
exists for Makefile fragments.
gcc/
* Makefile.in (LANG_CONFIGUREFRAGS): Define.
(config.status): Use/depend on it.
* configure.ac (all_lang_configurefrags): Track, 'AC_SUBST'.
* configure: Regenerate.
From-SVN: r278035
In this patch, loop unroll adjust hook is introduced for powerpc. We
can do target related heuristic adjustment in this hook. In this patch,
-funroll-loops is enabled for small loops at O2 and above with an option
-munroll-small-loops to guard the small loops unrolling, and it works
fine with -flto.
gcc/
2019-11-11 Jiufu Guo <guojiufu@linux.ibm.com>
PR tree-optimization/88760
* gcc/config/rs6000/rs6000.opt (-munroll-only-small-loops): New option.
* gcc/common/config/rs6000/rs6000-common.c
(rs6000_option_optimization_table) [OPT_LEVELS_2_PLUS_SPEED_ONLY]:
Turn on -funroll-loops and -munroll-only-small-loops.
[OPT_LEVELS_ALL]: Turn off -fweb and -frename-registers.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
set of PARAM_MAX_UNROLL_TIMES and PARAM_MAX_UNROLLED_INSNS.
Turn off -munroll-only-small-loops for explicit -funroll-loops.
(TARGET_LOOP_UNROLL_ADJUST): Add loop unroll adjust hook.
(rs6000_loop_unroll_adjust): Define it. Use -munroll-only-small-loops.
gcc.testsuite/
2019-11-11 Jiufu Guo <guojiufu@linux.ibm.com>
PR tree-optimization/88760
* gcc.dg/pr59643.c: Update back to r277550.
From-SVN: r278034
To align with rs6000_insn_cost costing more for load type insns,
this patch is to make load insns cost more in vectorization cost
function. The latency of load insns is about twice that of
"simple" instructions; 2 vs. 1 on older cores, and 4 (or so) vs.
2 on newer cores. Considering that the result of load usually
is used somehow later (true-dep) but store won't, we keep the
store as before.
The SPEC2017 performance evaluation on Power8 shows 525.x264_r
+9.56%, 511.povray_r +2.08%, 527.cam4_r 1.16% gains, no
significant degradation, SPECINT geomean +0.88%, SPECFP geomean
+0.26%.
The SPEC2017 performance evaluation on Power9 shows no significant
improvement or degradation, SPECINT geomean +0.04%, SPECFP geomean
+0.04%.
The SPEC2006 performance evaluation on Power8 shows 454.calculix
+4.41% gain but 416.gamess -1.19% and 453.povray -3.83% degradation.
I looked into the two degradation bmks, the degradation were NOT
due to hotspot changes by vectorization, were all side effects.
SPECINT geomean +0.10%, SPECFP geomean no changed considering
the degradation.
gcc/ChangeLog
2019-11-11 Kewen Lin <linkw@gcc.gnu.org>
* config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Make
scalar_load, vector_load, unaligned_load and vector_gather_load cost
more to conform hardware latency and insn cost settings.
From-SVN: r278033
Some of the solution to PR71767 is incomplete, and we need finer-grained
control over whether symbols need to be made linker-visible. This is a
preparation patch, providing the flag.
gcc/ChangeLog:
2019-11-10 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin.h (MACHO_SYMBOL_FLAG_LINKER_VIS): New.
(MACHO_SYMBOL_LINKER_VIS_P): New.
From-SVN: r278028
As part of PR 91413, GFortran now prints a warning when a variable is
moved from the stack to static storage. However, when the user
explicitly specifies that all local variables should be put in static
storage with the -fno-automatic option, don't print this warning.
Regtested on x86_64-pc-linux-gnu, committed as obvious.
gcc/fortran/ChangeLog:
2019-11-10 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91413
* trans-decl.c (gfc_finish_var_decl): Don't print warning when
-fno-automatic is enabled.
From-SVN: r278027
This paper was delayed until the February meeting in Prague so that we could
get a better idea of what the impact on existing code would actually be. To
that end, I'm implementing it now.
* typeck2.c (check_narrowing): Treat pointer->bool as a narrowing
conversion with -std=c++2a.
From-SVN: r278026
2019-11-10 Paul Thomas <pault@gcc.gnu.org>
PR fortran/92123
*decl.c (gfc_verify_c_interop_param): Remove error asserting
that pointer or allocatable variables in a bind C procedure are
not supported. Delete some trailing spaces.
* trans-stmt.c (trans_associate_var): Correct the attempt to
treat scalar pointer or allocatable temporaries as if they are
array descriptors.
2019-11-10 Paul Thomas <pault@gcc.gnu.org>
PR fortran/92123
* gfortran.dg/bind_c_procs_3.f90 : New test.
* gfortran.dg/ISO_Fortran_binding_15.c : New test.
* gfortran.dg/ISO_Fortran_binding_15.f90 : Additional source.
From-SVN: r278025
The liveness of eliminable hard registers is not tracked by LRA between
basic blocks, so they should not be used as spill registers as LRA may
decide to allocate them to pseudos while the spilled value is still live.
2019-11-10 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* lra-spills.c (assign_spill_hard_regs): Do not spill into
registers in eliminable_regset.
From-SVN: r278024
* ipa-inline.c (compute_uninlined_call_time,
compute_inlined_call_time): Take edge frequency as
parameter rather than computing it by itself.
(big_speedup_p, edge_badness): Manually CSE sreal
frequency calculations.
From-SVN: r278023
* ipa-prop.c (ipa_propagate_indirect_call_infos): Remove ipa edge
args summaries of inlined edge unless it holds info about
described reference.
From-SVN: r278020
Sometimes combine wants to do a move in CCFPmode, but we don't currently
handle moves in any CC mode other than CCmode. Fix that oversight.
* config/rs6000/rs6000.md (CC_any): New mode iterator.
(*movcc_internal1): Rename to...
(*movcc_<mode> for CC_any): ... this. Support moves of all CC modes.
From-SVN: r278017
* cgraph.h (struct cgraph_node): Add ipcp_clone flag.
(cgraph_node::create_virtual_clone): Copy it.
* ipa-cp.c (ipcp_versionable_function_p): Watch for missing
summaries.
(ignore_edge_p): If caller has ipa-cp disabled, skip the edge, too.
(ipcp_verify_propagated_values): Do not verify nodes where ipcp
is disabled.
(propagate_constants_across_call): If callee is not analyzed, give up.
(propagate_constants_topo): Lower to bottom latties of all callees of
functions with ipa-cp disabled.
(ipcp_propagate_stage): Skip functions with ipa-cp disabled.
(cgraph_edge_brings_value_p): Check for availability first.
(create_specialized_node): Set ipcp_clone.
(ipcp_store_bits_results): Check that info is present.
* ipa-fnsummary.c (evaluate_properties_for_edge): Do not analyze
thunks.
(ipa_call_context::duplicate_from, ipa_call_context::equal_to): Be
conservative when callee summary is missing.
(remap_edge_summaries): Lookup call summary only when needed.
* ipa-icf.c (sem_function::param_used_p): Be ready for missing summary.
* ipa-prpo.c (ipa_alloc_node_params, ipa_initialize_node_params):
Use get_create.
(ipa_analyze_node): Use get_create.
(propagate_controlled_uses): Do not propagate when function is not
analyzed.
(ipa_propagate_indirect_call_infos): Remove summary of inline clone.
(ipa_read_node_info): Use get_create.
* ipa-prop.h (IPA_NODE_REF): Use get.
(IPA_NODE_REF_GET_CREATE): New.
From-SVN: r278016
* ipa-inline-analysis.c (do_estimate_growth_1): Add support for
capping the growth cumulated.
(offline_size): Break out from ...
(estimate_growth): ... here.
(check_callers): Add N, OFFLINE and MIN_SIZE and KNOWN_EDGE
parameters.
(growth_likely_positive): Turn to ...
(growth_positive_p): Re-implement.
* ipa-inline.h (growth_likely_positive): Remove.
(growth_positive_p): Declare.
* ipa-inline.c (want_inline_small_function_p): Use
growth_positive_p.
(want_inline_function_to_all_callers_p): Likewise.
From-SVN: r278007
* ipa-fnsummary.c (estimate_edge_size_and_time): Do not call
estimate_edge_devirt_benefit when not computing hints;
do not compute time when not asked for.
(estimate_calls_size_and_time): Pass NULL hints and time when
these are not computed; do not evaluate hint predicates when these are
not computed.
(ipa_merge_fn_summary_after_inlining): Do not re-evaluate edge
frequency.
From-SVN: r278005
PR tree-optimization/92401
* gimple-match-head.c (gimple_resimplify1): Call const_unop only
if res_op->code is an expression with code length 1.
* gimple-match-head.c (gimple_resimplify2): Call const_binop only
if res_op->code is an expression with code length 2.
* gimple-match-head.c (gimple_resimplify3): Call fold_ternary only
if res_op->code is an expression with code length 3.
* g++.dg/opt/pr92401.C: New test.
From-SVN: r278004
When a stub is used to call the mcount function, the code is already
marking it as used unconditionally; This is the only use of the so-
called validation outside darwin.{h,c}. This moves the 'validation'
into darwin.c which is a step towards making validation routine local.
gcc/
2019-11-09 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin.c (machopic_mcount_stub_name): Validate the
symbol stub name when it is created.
* config/i386/darwin.h (FUNCTION_PROFILER): Remove the symbol
stub validation.
From-SVN: r278000
The Darwin protos header is missing an include guard, this adds one.
gcc/ChangeLog:
2019-11-08 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin-protos.h: Add include quard.
From-SVN: r277993
I noticed that for code like
struct S {
int *foo : 3;
};
we generate nonsensical
r.C:2:8: error: function definition does not declare parameters
2 | int *foo : 3;
It talks about a function because after parsing the declspecs of 'foo' we don't
see either ':' or "name :", so we think it's not a bit-field decl. So we parse
the declarator and since a ctor-initializer begins with a ':', we try to parse
it as a function body, generating the awful diagnostic. With this patch, we
issue:
r.C:2:8: error: bit-field ‘foo’ has non-integral type ‘int*’
2 | int *foo : 3;
* parser.c (cp_parser_member_declaration): Add a diagnostic for
bit-fields with non-integral types.
* g++.dg/diagnostic/bitfld4.C: New test.
From-SVN: r277991
2019-11-08 Andrew MacLeod <amacleod@redhat.com>
* range-op.h (range_operator::fold_range): Return result in a
reference parameter instead of by value.
(range_operator::wi_fold): Same.
* range-op.cc (range_operator::wi_fold): Return result in a reference
parameter instead of by value.
(range_operator::fold_range): Same.
(value_range_from_overflowed_bounds): Same.
(value_range_with_overflow): Same
(create_possibly_reversed_range): Same.
(operator_equal::fold_range): Same.
(operator_not_equal::fold_range): Same.
(operator_lt::fold_range): Same.
(operator_le::fold_range): Same.
(operator_gt::fold_range): Same.
(operator_ge::fold_range): Same.
(operator_plus::wi_fold): Same.
(operator_plus::op1_range): Change call to fold_range.
(operator_plus::op2_range): Change call to fold_range.
(operator_minus::wi_fold): Return result via reference parameter.
(operator_minus::op1_range): Change call to fold_range.
(operator_minus::op2_range): Change call to fold_range.
(operator_min::wi_fold): Return result via reference parameter.
(operator_max::wi_fold): Same.
(cross_product_operator::wi_cross_product): Same.
(operator_mult::wi_fold): Same.
(operator_div::wi_fold): Same.
(operator_div op_floor_div): Fix whitespace.
(operator_exact_divide::op1_range): Change call to fold_range.
(operator_lshift::fold_range): Return result via reference parameter.
(operator_lshift::wi_fold): Same.
(operator_rshift::fold_range): Same.
(operator_rshift::wi_fold): Same.
(operator_cast::fold_range): Same.
(operator_cast::op1_range): Change calls to fold_range.
(operator_logical_and::fold_range): Return result via reference.
(wi_optimize_and_or): Adjust call to value_range_with_overflow.
(operator_bitwise_and::wi_fold): Return result via reference.
(operator_logical_or::fold_range): Same.
(operator_bitwise_or::wi_fold): Same.
(operator_bitwise_xor::wi_fold): Same.
(operator_trunc_mod::wi_fold): Same.
(operator_logical_not::fold_range): Same.
(operator_bitwise_not::fold_range): Same.
(operator_bitwise_not::op1_range): Change call to fold_range.
(operator_cst::fold_range): Return result via reference.
(operator_identity::fold_range): Same.
(operator_abs::wi_fold): Same.
(operator_absu::wi_fold): Same.
(operator_negate::fold_range): Same.
(operator_negate::op1_range): Change call to fold_range.
(operator_addr_expr::fold_range): Return result via reference.
(operator_addr_expr::op1_range): Change call to fold_range.
(operator_pointer_plus::wi_fold): Return result via reference.
(operator_pointer_min_max::wi_fold): Same.
(operator_pointer_and::wi_fold): Same.
(operator_pointer_or::wi_fold): Same.
(range_op_handler): Change call to fold_range.
(range_cast): Same.
* tree-vrp.c (range_fold_binary_symbolics_p): Change call to
fold_range.
(range_fold_unary_symbolics_p): Same.
(range_fold_binary_expr): Same.
(range_fold_unary_expr): Same.
From-SVN: r277979
With the new reduction vectype handling, neutral_op_for_slp_reduction
needs to know whether the caller is using STMT_VINFO_REDUC_VECTYPE
(for an epilogue value) or STMT_VINFO_VECTYPE (for a PHI argument).
This fixes various gcc.target/aarch64/sve/slp_* tests.
2019-11-08 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (neutral_op_for_slp_reduction): Take the
vector type as an argument rather than reading it from the
stmt_vec_info.
(vect_create_epilog_for_reduction): Update accordingly.
(vectorizable_reduction): Likewise.
(vect_transform_cycle_phi): Likewise.
From-SVN: r277977
* config/rs6000/predicates.md (branch_comparison_operator): Allow only
the comparison codes that make sense for the mode used, and only the
codes that can be done with a single branch instruction.
From-SVN: r277976
Allows character literals to used to assign values to non-character variables
in the same way that Hollerith constants are used. In addition character
literals can be used in data statements just like Hollerith constants.
Warnings of such use are output to discourage this usage as it is a non-standard
legacy feature and must be explicitly enabled.
Enabled by -fdec and -fdec-char-conversions.
Co-Authored-By: Jim MacArthur <jim.macarthur@codethink.co.uk>
From-SVN: r277975
gcc/ChangeLog:
2019-11-08 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR tree-optimization/92351
* tree-vect-data-refs.c (vect_compute_data_ref_alignment): When we are
peeling the main loop for alignment, make sure to set the misalignment
of the epilogue's data references to DR_MISALIGNMENT_UNKNOWN.
gcc/testsuite/ChangeLog:
2019-11-08 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR tree-optimization/92351
* gcc.dg/vect/vect-peel-2.c: Disable epilogue vectorization and
split the source of this test to...
* gcc.dg/vect/vect-peel-2-src.c: ... This.
* gcc.dg/vect/vect-peel-2-epilogues.c: New test.
From-SVN: r277974
2019-11-08 Richard Biener <rguenther@suse.de>
PR ipa/92409
* tree-inline.c (declare_return_variable): Properly handle
type mismatches for the return slot.
From-SVN: r277972
PR target/92095
* config/sparc/sparc-protos.h (output_load_pcrel_sym): Declare.
* config/sparc/sparc.c (sparc_cannot_force_const_mem): Revert latest
change.
(got_helper_needed): New static variable.
(output_load_pcrel_sym): New function.
(get_pc_thunk_name): Remove after inlining...
(load_got_register): ...here. Rework the initialization of the GOT
register and of the GOT helper.
(save_local_or_in_reg_p): Test the REGNO of the GOT register.
(sparc_file_end): Test got_helper_needed to decide whether the GOT
helper must be emitted. Use output_asm_insn instead of fprintf.
(sparc_init_pic_reg): In PIC mode, always initialize the PIC register
if optimization is enabled.
* config/sparc/sparc.md (load_pcrel_sym<P:mode>): Emit the assembly
by calling output_load_pcrel_sym.
From-SVN: r277966
If get_ref_base_and_extent returns poly_int offsets or sizes,
tree-sra.c:create_access prevents SRA from being applied to the base.
However, we haven't verified by that point that we have a valid base
to disqualify.
This originally led to an ICE on the attached testcase, but it
no longer triggers there after the introduction of IPA SRA.
2019-11-08 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-sra.c (create_access): Delay disqualifying the base
for poly_int values until we know we have a base.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/inline_2.c: New test.
From-SVN: r277965
gcc/ChangeLog:
2019-11-08 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop.c (vect_analyze_loop): Disable epilogue vectorization
for loops with SIMDUID set. Enable epilogue vectorization for loops
with SIMDLEN set after finding a main loop with a VF that matches it.
From-SVN: r277964
PR target/92038
* gimple-ssa-store-merging.c (find_constituent_stores): For return
value only, return non-NULL if there is a single non-clobber
constituent store even if there are constituent clobbers and return
one of clobber constituent stores if all constituent stores are
clobbers.
(split_group): Handle clobbers.
(imm_store_chain_info::output_merged_store): When computing
bzero_first, look after all clobbers at the start. Don't count
clobber stmts in orig_num_stmts, except if the first orig store is
a clobber covering the whole area and split_stores cover the whole
area, consider equal number of stmts ok. Punt if split_stores
contains only ->orig stores and their number plus number of original
clobbers is equal to original number of stmts. For ->orig, look past
clobbers in the constituent stores.
(imm_store_chain_info::output_merged_stores): Don't remove clobber
stmts.
(rhs_valid_for_store_merging_p): Don't return false for clobber stmt
rhs.
(store_valid_for_store_merging_p): Allow clobber stmts.
(verify_clear_bit_region_be): Fix up a thinko in function comment.
* g++.dg/opt/store-merging-1.C: New test.
* g++.dg/opt/store-merging-2.C: New test.
* g++.dg/opt/store-merging-3.C: New test.
From-SVN: r277963
PR c++/92384
* function.c (assign_parm_setup_block, assign_parm_setup_stack): Don't
copy TYPE_EMPTY_P arguments from data->entry_parm to data->stack_parm
slot.
(assign_parms): For TREE_ADDRESSABLE parms with TYPE_EMPTY_P type
force creation of a unique data.stack_parm slot.
* g++.dg/torture/pr92384.C: New test.
From-SVN: r277962