Manually compiling with all the -O3 flags differs from literally specifying "-O3"

Time: 2013-12-07 01:23:55

Tags: c++ c g++

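I have a very peculiar problem that I would like to get to the bottom of. According to the GCC optimization options page, -O3 is just a set of optimization flags. So I tried compiling a program with g++ program.c -o program -fauto-inc-dec -fcompare-elim -..., where I manually listed all of the optimizations in -O3. Then I tried g++ program.c -o program -O3 and found that the latter binary is faster. This means the manually listed optimizations are not equivalent to -O3. Any idea why this is the case? We have observed this behavior with multiple programs, and even with -O1 and -O2.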

1 answer:

Answer 0: (score: 8)

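I have found that the optimization options page is not necessarily complete. You can find what appears to be the full set of options GCC applied to a particular program by compiling with the flags -S -fverbose-asm and examining the .s file the compiler generates.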

For example, for a program I compiled locally, GCC reports that it used the following flags:

# GNU C (GCC) version 4.8.0 (x86_64-unknown-linux-gnu)
#   compiled by GNU C version 4.8.0, GMP version 4.3.2, MPFR version 3.0.0-p3, MPC version 0.8.2
# GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
# options passed:  -I . -I .. -I /usr/include/SDL -D linux -D _GNU_SOURCE=1
# -D _REENTRANT -D JZINTV_VERSION_MAJOR=0x01 -D JZINTV_VERSION_MINOR=0x00
# gfx/gfx_scale.c -msse -mtune=generic -march=x86-64
# -auxbase-strip gfx/gfx_scale.s -ggdb3 -O6 -Wall -Wextra -Wshadow
# -Wpointer-arith -Wbad-function-cast -Wcast-qual -Wc++-compat
# -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -std=c99
# -fverbose-asm -flto -fomit-frame-pointer -fprefetch-loop-arrays
# options enabled:  -faggressive-loop-optimizations
# -fasynchronous-unwind-tables -fauto-inc-dec -fbranch-count-reg
# -fcaller-saves -fcombine-stack-adjustments -fcommon -fcompare-elim
# -fcprop-registers -fcrossjumping -fcse-follow-jumps -fdefer-pop
# -fdelete-null-pointer-checks -fdevirtualize -fdwarf2-cfi-asm
# -fearly-inlining -feliminate-unused-debug-types -fexpensive-optimizations
# -fforward-propagate -ffunction-cse -fgcse -fgcse-after-reload -fgcse-lm
# -fgnu-runtime -fguess-branch-probability -fhoist-adjacent-loads -fident
# -fif-conversion -fif-conversion2 -findirect-inlining -finline
# -finline-atomics -finline-functions -finline-functions-called-once
# -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-profile
# -fipa-pure-const -fipa-reference -fipa-sra -fira-hoist-pressure
# -fira-share-save-slots -fira-share-spill-slots -fivopts
# -fkeep-static-consts -fleading-underscore -fmath-errno -fmerge-constants
# -fmerge-debug-strings -fmove-loop-invariants -fomit-frame-pointer
# -foptimize-register-move -foptimize-sibling-calls -foptimize-strlen
# -fpartial-inlining -fpeephole -fpeephole2 -fpredictive-commoning
# -fprefetch-loop-arrays -free -freg-struct-return -fregmove
# -freorder-blocks -freorder-functions -frerun-cse-after-loop
# -fsched-critical-path-heuristic -fsched-dep-count-heuristic
# -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic
# -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
# -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column -fshrink-wrap
# -fsigned-zeros -fsplit-ivs-in-unroller -fsplit-wide-types
# -fstrict-aliasing -fstrict-overflow -fstrict-volatile-bitfields
# -fsync-libcalls -fthread-jumps -ftoplevel-reorder -ftrapping-math
# -ftree-bit-ccp -ftree-builtin-call-dce -ftree-ccp -ftree-ch
# -ftree-coalesce-vars -ftree-copy-prop -ftree-copyrename -ftree-cselim
# -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre
# -ftree-loop-distribute-patterns -ftree-loop-if-convert -ftree-loop-im
# -ftree-loop-ivcanon -ftree-loop-optimize -ftree-parallelize-loops=
# -ftree-partial-pre -ftree-phiprop -ftree-pre -ftree-pta -ftree-reassoc
# -ftree-scev-cprop -ftree-sink -ftree-slp-vectorize -ftree-slsr -ftree-sra
# -ftree-switch-conversion -ftree-tail-merge -ftree-ter
# -ftree-vect-loop-version -ftree-vectorize -ftree-vrp -funit-at-a-time
# -funswitch-loops -funwind-tables -fvar-tracking
# -fvar-tracking-assignments -fvect-cost-model -fverbose-asm
# -fzero-initialized-in-bss -m128bit-long-double -m64 -m80387
# -maccumulate-outgoing-args -malign-stringops -mfancy-math-387
# -mfp-ret-in-387 -mglibc -mieee-fp -mlong-double-80 -mmmx -mno-sse4
# -mpush-args -mred-zone -msse -msse2 -mtls-direct-seg-refs

That's a lot of flags...
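To compare such a list against a hand-picked set of flags, something like the following may help. It is only a rough sketch: the helper names enabled_flags and diff_flags are made up for illustration, and it assumes the .s header uses the same "# options enabled:" layout as the GCC 4.8 x86-64 output above. Other GCC versions or targets may format this header differently or omit it.

```shell
# Hypothetical helper: print one flag per line from the "options enabled:"
# comment block of a -fverbose-asm assembly file ($1), sorted and de-duplicated.
# Assumes "#" is the assembler comment character, as in the x86-64 output above.
enabled_flags() {
    awk '
        # Start of the block: keep only the leading "#" so the flags
        # begin at field 2, just like on the continuation lines.
        /^# options enabled:/ { on = 1; sub(/^# options enabled:/, "#") }
        # The block ends at the first line that is not a comment.
        on && !/^#/           { exit }
        on                    { for (i = 2; i <= NF; i++) print $i }
    ' "$1" | sort -u
}

# Hypothetical helper: show flags enabled in only one of two .s files.
# Column 1: only in $1. Column 2 (tab-indented): only in $2.
# The body runs in a subshell so its variables do not leak.
diff_flags() (
    left=$(mktemp)
    right=$(mktemp)
    enabled_flags "$1" > "$left"
    enabled_flags "$2" > "$right"
    comm -3 "$left" "$right"
    rm -f "$left" "$right"
)
```

One possible use: generate the two files with the commands from the question plus -S -fverbose-asm, for example g++ program.c -S -fverbose-asm -O3 -o o3.s and the manually listed variant into manual.s, then run diff_flags manual.s o3.s to see which flags differ between the two builds.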