CrayPat output from an XT4 at a certain site
- Spherical shell model
- Resolution: T84L48
- Compiler options:
  -fastsse -O4 -Minline -Mvect=noassoc -Mlre=noassoc -mp=nonuma \
  -Bstatic
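- Results
* MPI-related cost is lower than expected.
* lumatrix_f77.f is the trouble spot. Its cost looks too high, although maybe this is simply what it takes...
  To be looked into with top priority as soon as time allows.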
---------------------------------------------------------------
  96.1% | 536.638646 |       -- |    -- | 451717.0 |USER
|--------------------------------------------------------------
|  29.4% | 164.159681 | 3.642548 |  2.3% |      1.0 |main
|  15.3% |  85.239062 | 0.078474 |  0.1% |   4102.0 |lusolv_
|   9.1% |  50.551754 | 2.168129 |  4.4% |  19039.0 |snls2g_
|   5.7% |  31.887453 | 0.051958 |  0.2% |  59118.0 |fttctf_
|   5.1% |  28.221466 | 0.053867 |  0.2% |   6012.0 |wa_base_module_xya_wa_
|   5.0% |  27.751203 | 0.412491 |  1.6% | 173362.0 |fttzl4_
|   3.8% |  21.487004 | 0.039820 |  0.2% |   3006.0 |wt_module_xyz_kgrad_wt_
|   3.3% |  18.411535 | 0.118059 |  0.7% |   6012.0 |snpsog_
|   3.1% |  17.488779 | 0.275271 |  1.7% |   8524.0 |snlg2s_
|   3.1% |  17.471511 | 0.514526 |  3.1% |  55126.0 |fttzl2_
|   2.6% |  14.458635 | 0.007494 |  0.1% |  59118.0 |fttruf_
|   2.3% |  12.972823 | 0.106637 |  0.9% |   6012.0 |sngsog_
|   1.9% |  10.804498 | 0.028414 |  0.3% |  13027.0 |sncs2g_
|   1.9% |  10.519774 | 0.019756 |  0.2% |   1068.0 |lusol2_
|   1.7% |   9.506332 | 0.025235 |  0.3% |   3006.0 |wt_module_xyz_gradlat_wt_
|   1.5% |   8.220879 | 0.107584 |  1.4% |  19039.0 |snfs2g_
|   1.3% |   7.486052 | 0.019462 |  0.3% |  16144.0 |at_module_at_dx_at_
|==============================================================
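
As a rough cross-check of the lumatrix_f77.f remark, the rows of the profile can be tallied with a short script. This is only a sketch under assumptions: the notes do not say which routines live in lumatrix_f77.f, so treating lusolv_ and lusol2_ as its entry points is a guess. Reading the second field as time in seconds and the fifth as the call count follows the usual CrayPat layout rather than a header in this output. The names parse_rows and total_time are hypothetical helpers, not part of any tool.

```python
# Function rows copied from the USER group of the profile.
ROWS = """
29.4% | 164.159681 | 3.642548 | 2.3% | 1.0 |main
15.3% | 85.239062 | 0.078474 | 0.1% | 4102.0 |lusolv_
9.1% | 50.551754 | 2.168129 | 4.4% | 19039.0 |snls2g_
5.7% | 31.887453 | 0.051958 | 0.2% | 59118.0 |fttctf_
5.1% | 28.221466 | 0.053867 | 0.2% | 6012.0 |wa_base_module_xya_wa_
5.0% | 27.751203 | 0.412491 | 1.6% | 173362.0 |fttzl4_
3.8% | 21.487004 | 0.039820 | 0.2% | 3006.0 |wt_module_xyz_kgrad_wt_
3.3% | 18.411535 | 0.118059 | 0.7% | 6012.0 |snpsog_
3.1% | 17.488779 | 0.275271 | 1.7% | 8524.0 |snlg2s_
3.1% | 17.471511 | 0.514526 | 3.1% | 55126.0 |fttzl2_
2.6% | 14.458635 | 0.007494 | 0.1% | 59118.0 |fttruf_
2.3% | 12.972823 | 0.106637 | 0.9% | 6012.0 |sngsog_
1.9% | 10.804498 | 0.028414 | 0.3% | 13027.0 |sncs2g_
1.9% | 10.519774 | 0.019756 | 0.2% | 1068.0 |lusol2_
1.7% | 9.506332 | 0.025235 | 0.3% | 3006.0 |wt_module_xyz_gradlat_wt_
1.5% | 8.220879 | 0.107584 | 1.4% | 19039.0 |snfs2g_
1.3% | 7.486052 | 0.019462 | 0.3% | 16144.0 |at_module_at_dx_at_
"""

# Time reported on the USER group row of the profile.
USER_TIME = 536.638646


def parse_rows(text):
    """Parse pipe-delimited profile rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.strip().strip("|").split("|")]
        if len(fields) != 6:
            continue  # skip separators or malformed lines
        _pct, time, _imb, _imb_pct, calls, name = fields
        rows.append({"name": name, "time": float(time), "calls": float(calls)})
    return rows


def total_time(rows, prefix):
    """Sum the time of every function whose name starts with prefix."""
    return sum(r["time"] for r in rows if r["name"].startswith(prefix))


if __name__ == "__main__":
    rows = parse_rows(ROWS)
    lu = total_time(rows, "lusol")  # assumed to be the lumatrix_f77.f routines
    print(f"lusol* time: {lu:.2f} ({100.0 * lu / USER_TIME:.1f}% of USER)")
    # Time per call shows whether a routine is slow or just called often.
    for r in sorted(rows, key=lambda r: r["time"] / r["calls"], reverse=True)[:5]:
        print(f"{r['name']:28s} {r['time'] / r['calls']:.6f} per call")
```

Under that assumption the two lusol* routines together take about 95.8 of the 536.6 recorded under USER, which matches the 15.3% + 1.9% in the first column. The per-call listing is there because a routine such as fttzl4_ is called far more often than lusolv_, so total time alone does not show which one is expensive per invocation.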