Program Performance Optimization
23 September 2017 by yx-
My experience
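ifort enables -O2 by default;
Compiling with ifort and -xhost lets the compiler use the best instruction set of the current processor, which speeds up computation while keeping the results correct ("This option tells the compiler to generate instructions for the highest instruction set available on the compilation host processor");
With ifort (2013), adding -vec-report at compile time shows which code the compiler vectorized.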
loop unrolling, vectorize
Vectorization Essentials
vectorization support: unroll factor set to xxxx
Avoid Manual Loop Unrolling
Vectorization and Optimization Reports
Tutorial: Using Auto Vectorization
pointer aliasing and vectorization
Common Vectorization Tips
Random Number Function Vectorization
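The points in the links above about pointer aliasing and avoiding manual unrolling can be illustrated with a plain loop. This is a minimal sketch in C rather than Fortran; the function name `saxpy_restrict` is hypothetical, and whether the loop is actually vectorized depends on the compiler and its flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes y[i] += a * x[i] for i in [0, n).
 * `restrict` promises the compiler that x and y never overlap, which
 * removes the aliasing hazard that can prevent auto-vectorization.
 * The loop is deliberately not unrolled by hand: the compiler is left
 * to choose the vector width and unroll factor itself. */
void saxpy_restrict(size_t n, float a,
                    const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Calling it with overlapping arrays would break the `restrict` promise, so the caller has to guarantee that the two buffers are distinct.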
Memory alignment
Fortran Array Data and Arguments and Vectorization (a helpful post about alignment, vectorization, arrays, and pointers)
Data Alignment to Assist Vectorization
Improving Performance by Aligning Data (Fortran)
C++ memory alignment explained in detail
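As a rough illustration of what the alignment links are about, here is a minimal sketch in C, assuming a C11 library that provides `aligned_alloc`. The helper names `alloc_aligned_doubles` and `is_aligned` are hypothetical, and the 64-byte boundary used in the usage note is only an example value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for n doubles starting on an
 * `alignment`-byte boundary (alignment must be a power of two).
 * The caller releases the memory with free(). */
double *alloc_aligned_doubles(size_t n, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    size_t bytes = n * sizeof(double);
    /* C11 aligned_alloc expects the size to be a multiple of the
     * alignment, so round the request up. */
    size_t padded = (bytes + alignment - 1) & ~(alignment - 1);
    if (padded == 0) {
        padded = alignment;
    }
    return aligned_alloc(alignment, padded);
}

/* Returns 1 if p sits on an `alignment`-byte boundary, 0 otherwise. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

For example, `alloc_aligned_doubles(100, 64)` requests a buffer that starts on a 64-byte boundary, which is the kind of guarantee vectorized loads and stores can benefit from.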
OpenMP, MPI
lanl MPI tutorial
lanl OpenMP tutorial
MPI Forum
www.openmp.org: Resources
llnl HPC online training materials
OpenMP online tutorial
OpenMP online tutorial: Exercise
Some experience using OpenMP
OpenMP reduction (can be used to calculate min, max of an array)
OpenMP Array Reduction in Fortran
cnblog: Fortran OpenMP parallel programming: reduction
An Introduction to MPI: Parallel Programming with the Message Passing Interface (online HTML with code)
[YouTube REWL Lecture III]
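The note above that an OpenMP reduction can compute the min and max of an array can be sketched as follows. This is a minimal C version (the linked posts use Fortran), assuming a compiler that supports OpenMP 3.1 or later for the `min` and `max` reduction operators; the function name `array_min_max` is hypothetical. Without OpenMP enabled the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: find the smallest and largest element of a[0..n-1].
 * Each thread keeps private copies of lo and hi; OpenMP combines them
 * with min and max when the loop finishes. */
void array_min_max(const double *a, size_t n,
                   double *min_out, double *max_out)
{
    assert(a != NULL && n > 0);

    double lo = a[0];
    double hi = a[0];

    #pragma omp parallel for reduction(min : lo) reduction(max : hi)
    for (long i = 1; i < (long)n; ++i) {
        if (a[i] < lo) lo = a[i];
        if (a[i] > hi) hi = a[i];
    }

    *min_out = lo;
    *max_out = hi;
}
```

Using the reduction clause avoids a critical section around the comparison, which would otherwise serialize the loop.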
GPU, CUDA Fortran, OpenACC
PGI: Tuning a Monte Carlo Algorithm on GPUs
11 programming tips for accelerating Fortran with OpenACC
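C code optimization: using bit operations for multiplication, division, and the remainder modulo 2^n
Optimizing the running speed of C programs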
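The bit-operation tricks named in the first link above (multiply, divide, and remainder by a power of two) can be sketched like this. It is a minimal illustration restricted to unsigned 32-bit values; the function names are hypothetical, and optimizing compilers usually apply these rewrites on their own.

```c
#include <assert.h>
#include <stdint.h>

/* x * 2^k via a left shift (the result wraps modulo 2^32). */
uint32_t mul_pow2(uint32_t x, unsigned k)
{
    assert(k < 32);
    return x << k;
}

/* x / 2^k via a right shift (unsigned values only). */
uint32_t div_pow2(uint32_t x, unsigned k)
{
    assert(k < 32);
    return x >> k;
}

/* x % 2^k by masking off everything above the low k bits. */
uint32_t mod_pow2(uint32_t x, unsigned k)
{
    assert(k < 32);
    return x & ((UINT32_C(1) << k) - 1u);
}
```

For example, `mul_pow2(5, 3)` is 40, `div_pow2(41, 3)` is 5, and `mod_pow2(41, 3)` is 1. The sketch sticks to unsigned types because for negative signed values a right shift does not round the same way as division.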