We are working on a translation of LAPACK to .NET. We wrote a FORTRAN compiler that successfully translates all of LAPACK, including all tests. For real data types, (almost) all tests pass. For complex data, we still see a few precision issues.
Example:
XEIGTSTZ < zec.in fails due to an underflow in ZLAHQR.
Steps to reproduce: ZGET37 -> knt == 31, ZHSEQR -> ZLAHQR -> at the end of the second QR step (ITS == 2), the following code causes an underflow (with certain registers, see below):
TEMP = H( I, I-1 )
IF( DIMAG( TEMP ).NE.RZERO ) THEN
RTEMP = ABS( TEMP) ! <-- underflow on TEMP = (~1e-0173, ~1e-0173)
IF (RTEMP .EQ. RZERO) RTEMP = CABS1(TEMP)
H( I, I-1 ) = RTEMP
TEMP = TEMP / RTEMP
IF( I2.GT.I )
Our compiler targets the .NET CLR. Its JIT decides to use SSE registers for ABS(TEMP), which causes an underflow in the intermediate computation of the magnitude. Ifort (as another example) uses the floating-point registers in this situation and therefore does not underflow (because they are 80 bits long). I am trying to get a clear picture of what LAPACK expects from the compiler/processor at run time in terms of precision and numeric range.
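The effect can be reproduced outside Fortran. The following Python sketch assumes that the magnitude is computed with the textbook formula sqrt(re^2 + im^2) in plain IEEE 64-bit arithmetic. That is an assumption about the SSE code path, not something verified against the .NET JIT, and the helper names `naive_abs` and `cabs1` are hypothetical.

```python
import math

def naive_abs(z: complex) -> float:
    """Textbook |z| = sqrt(re^2 + im^2); the squares can underflow in 64-bit."""
    return math.sqrt(z.real * z.real + z.imag * z.imag)

def cabs1(z: complex) -> float:
    """CABS1-style measure |re| + |im|; nothing is squared, so no underflow here."""
    return abs(z.real) + abs(z.imag)

temp = complex(1e-173, 1e-173)

# Each square is about 1e-346, far below the smallest positive double.
print("naive ABS :", naive_abs(temp))
print("CABS1     :", cabs1(temp))
# math.hypot scales internally and keeps the magnitude representable.
print("math.hypot:", math.hypot(temp.real, temp.imag))
```

With 80-bit extended registers the intermediate squares would still be representable, which would explain why the same source line behaves differently across compilers.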
Are all double-precision tests designed to require at least 64-bit registers? Or are they designed to succeed with the set of popular FORTRAN compilers available today? (In the first case, the issue above, and similar ones, may need attention. Should I file issues for them?)
I looked for a specification but have not found one yet. Any link would be appreciated. Thanks in advance!
The underflow itself is not the real problem. After the underflow, the algorithm switches to CABS1, which is less prone to underflow. The problem is that TEMP then no longer has exactly unit modulus, so the scaling applied to Z introduces errors.
A possible fix is to pre-scale using CABS1 and then correct using ABS (thanks to the first scaling, ABS no longer underflows). (I cannot test this because my machine does not underflow.)
IF (RTEMP .EQ. RZERO) THEN
RTEMP = CABS1(TEMP)
H( I, I-1 ) = RTEMP
TEMP = TEMP / RTEMP
RTEMP = ABS( TEMP)
H( I, I-1 ) = H( I, I-1 )*RTEMP
TEMP = TEMP / RTEMP
END IF
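To see why the second scaling matters, here is a small Python sketch that mimics both variants. It simulates an underflowing ABS with the textbook formula in 64-bit arithmetic (an assumption), and all function names are hypothetical. The returned pair corresponds to the new H(I,I-1) and the scaling factor TEMP.

```python
import math

def naive_abs(z: complex) -> float:
    # Stand-in for an ABS that underflows: sqrt(re^2 + im^2) in plain doubles.
    return math.sqrt(z.real * z.real + z.imag * z.imag)

def cabs1(z: complex) -> float:
    return abs(z.real) + abs(z.imag)

def scale(z: complex, r: float) -> complex:
    # Component-wise division of a complex value by a real one.
    return complex(z.real / r, z.imag / r)

def normalize_fallback(temp: complex):
    """Original code path: fall back to CABS1 when ABS returns zero."""
    rtemp = naive_abs(temp)
    if rtemp == 0.0:
        rtemp = cabs1(temp)
    return rtemp, scale(temp, rtemp)

def normalize_prescaled(temp: complex):
    """Suggested variant: pre-scale with CABS1, then correct with ABS."""
    rtemp = naive_abs(temp)
    if rtemp == 0.0:
        rtemp = cabs1(temp)
        h = rtemp
        temp = scale(temp, rtemp)   # now |re| + |im| == 1
        rtemp = naive_abs(temp)     # safe: modulus lies between about 0.707 and 1
        h = h * rtemp
    else:
        h = rtemp
    return h, scale(temp, rtemp)

z = complex(1e-173, 1e-173)
h_old, t_old = normalize_fallback(z)
h_new, t_new = normalize_prescaled(z)
print("fallback : |TEMP| =", naive_abs(t_old))
print("prescaled: |TEMP| =", naive_abs(t_new), " H =", h_new)
```

In the fallback variant the factor has modulus of roughly 0.707 for this input, so multiplying rows and columns by it is no longer a unitary scaling. The pre-scaled variant restores a modulus of 1 up to rounding.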
I think the tests are designed to succeed with the set of popular FORTRAN compilers, simply because that is how the tests are run. Predicting underflow/overflow is very difficult. At least in my case, these subroutines were designed just by testing them thoroughly (with common compilers) and fixing whatever overflow/underflow turned up.
Thank you! This is very helpful.
We had tried to recover from the underflow using CABS1, but our attempts did not go far enough. Your suggestion seems to work much better. Using...
*
* Ensure that H(I,I-1) is real.
*
TEMP = H( I, I-1 )
IF( DIMAG( TEMP ).NE.RZERO ) THEN
RTEMP = ABS( TEMP)
IF (RTEMP .EQ. RZERO) THEN
RTEMP = CABS1(TEMP)
H( I, I-1 ) = RTEMP
TEMP = TEMP / RTEMP
RTEMP = ABS( TEMP)
H( I, I-1 ) = H( I, I-1 )*RTEMP
ELSE
H( I, I-1 ) = RTEMP
END IF
TEMP = TEMP / RTEMP
IF( I2.GT.I )
$ CALL ZSCAL( I2-I, DCONJG( TEMP ), H( I, I+1 ), LDH )
CALL ZSCAL( I-I1, TEMP, H( I1, I ), 1 )
IF( WANTZ ) THEN
CALL ZSCAL( NZ, TEMP, Z( ILOZ, I ), 1 )
END IF
END IF
*
130 CONTINUE
...this iteration completes successfully (even when SSE registers are used for ABS()).
> I think the tests are designed to succeed with the set of popular FORTRAN compilers, simply because that is how the tests are run. Predicting underflow/overflow is very difficult. At least in my case, these subroutines were designed just by testing them thoroughly (with common compilers) and fixing whatever overflow/underflow turned up.
The test suite is very helpful! By my rough estimate, less than 1% of the tests are affected by this or similar overflow issues (when using our compiler). Making the tests more robust against underflow/overflow could help bring LAPACK to more platforms. The (failed) attempt above is just one example, and it clearly shows that we are hardly able to come up with fixes on our side. Before I file multiple related issues, I would like to start a discussion about whether there is interest in such an effort and what a good approach would be.
Following the direction of @hokb and @thijssteel:
Given my limited experience with the project, I would be glad to take your efforts and your PR as a guideline for these. We ... future PR ... (if that is okay!)
Hi @hokb,
> I looked for a specification but have not found one yet. Any link would be appreciated. Thanks in advance!
I do not think anything is specified anywhere.
> Our compiler targets the .NET CLR. Its JIT decides to use SSE registers for ABS(TEMP), which causes an underflow in the intermediate computation of the magnitude. Ifort (as another example) uses the floating-point registers in this situation and therefore does not underflow (because they are 80 bits long). I am trying to get a clear picture of what LAPACK expects from the compiler/processor at run time in terms of precision and numeric range.
Bold statement: LAPACK should work if all computations are done in IEEE 64-bit arithmetic.
LAPACK does not expect 80-bit registers to ever help with its computations. The algorithms are designed with 64-bit arithmetic in mind. Now, as @thijssteel mentioned, we have not done anything systematic to address these issues. In general, we are happy enough when the algorithms pass the test suite, and if they get help from 80-bit registers along the way, so be it.
> Are all double-precision tests designed to require at least 64-bit registers? Or are they designed to succeed with the set of popular FORTRAN compilers available today? (In the first case, the issue above, and similar ones, may need attention. Should I file issues for them?)
> By my rough estimate, less than 1% of the tests are affected by this or similar overflow issues (when using our compiler).
Oh my. 1%?! That is a scary number.
The tests do a lot of testing around the underflow and overflow regions, so I would expect a test to trigger these problems far more often than user code would. But still.
> Making the tests more robust against underflow/overflow could help bring LAPACK to more platforms.
Portability to more platforms is certainly one concern. Another is extended precision with packages such as GMP, where, as I understand it, the precision is fixed throughout the computation. (For example, you work with 256 bits, and there are no 300-bit registers to fall back on.)
> The (failed) attempt above is just one example, and it clearly shows that we are hardly able to come up with fixes on our side. Before I file multiple related issues, I would like to start a discussion about whether there is interest in such an effort and what a good approach would be.
Yes, we are interested. But there is only so much we can do, and we have our own habits as well. So we can take these issues one at a time and see how far we get.
In any case, posting issues on GitHub is always a good idea. It raises awareness of the problem and helps gather support and ideas for fixing it.
It seems we are heading down this road, and I recommend taking it easy.
Maybe, for gfortran, we need to compile with the flags -mfpmath=sse -msse2 for testing purposes. I think this would force all computations to be done in 64-bit arithmetic. I am not sure, though.
> Given my limited experience with the project, I would be glad to take your efforts and your PR as a guideline for these. We ... future PR ... (if that is okay!)
Sure! See #577.
@weslleyspereira Great! I have not yet checked whether this applies to CLAHQR in the same way. I will post my results as soon as possible (tomorrow).
Hi @langou!
> Bold statement: LAPACK should work if all computations are done in IEEE 64-bit arithmetic.
Good! By "work" I assume you mean "does not overflow because of a particular register size when supplied with data in a certain range"?
> LAPACK does not expect 80-bit registers to ever help with its computations. The algorithms are designed with 64-bit arithmetic in mind. Now, as @thijssteel mentioned, we have not done anything systematic to address these issues. In general, we are happy enough when the algorithms pass the test suite, and if they get help from 80-bit registers along the way, so be it.
That sounds very reasonable!
> By my rough estimate, less than 1% of the tests are affected by this or similar overflow issues (when using our compiler).
> Oh my. 1%?! That is a scary number.
Well, maybe. It is probably much less ;)
> Portability to more platforms is certainly one concern. Another is extended precision with packages such as GMP, where, as I understand it, the precision is fixed throughout the computation. (For example, you work with 256 bits, and there are no 300-bit registers to fall back on.)
That sounds interesting, but I lack experience with such fixed-precision approaches, so I cannot comment on it.
> Yes, we are interested. But there is only so much we can do, and we have our own habits as well. So we can take these issues one at a time and see how far we get.
I do not yet know what a good general approach would be. Please bear with me if my understanding is too naive. But does overflow/underflow not always depend on both the input data and the algorithm? So, instead of flooding the code with new conditions to test and new code to recover from them, could one reduce the "allowed range" of the input data? However, I do not have the insight needed to estimate the work either approach would require, so I cannot judge which is more feasible.
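As an illustration of what an "allowed range" could look like for one single operation, the Python sketch below checks whether the squares in re^2 + im^2 stay inside the normal 64-bit range. The bounds `SAFE_MIN` and `SAFE_MAX` and the function name are hypothetical and chosen with a small safety margin. A real algorithm chains many operations, so a usable range for a whole LAPACK routine would be much harder to state.

```python
import math
import sys

# Hypothetical bounds: magnitudes whose squares (and their sum) remain
# normal, finite doubles, with a factor-of-two safety margin on each side.
SAFE_MIN = 2.0 * math.sqrt(sys.float_info.min)   # roughly 3e-154
SAFE_MAX = math.sqrt(sys.float_info.max) / 2.0   # roughly 6.7e+153

def squares_are_safe(z: complex) -> bool:
    """True if re^2 + im^2 can be formed without underflow or overflow.

    An exactly zero component is harmless and therefore accepted.
    """
    parts = (abs(z.real), abs(z.imag))
    return all(p == 0.0 or SAFE_MIN <= p <= SAFE_MAX for p in parts)

print(squares_are_safe(complex(1e-173, 1e-173)))  # the value from ZLAHQR above
print(squares_are_safe(complex(1.0, -2.0)))
```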
> In any case, posting issues on GitHub is always a good idea. It raises awareness of the problem and helps gather support and ideas for fixing it.
Good. I will file issues as I go. I understand that it is hard to make a fix without being able to reproduce the underflow. So what information can I provide to make an issue clearer? Would the path to the concrete underflow help, that is, iteration counts and the current values of the locals, along with file names and so on?
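One way to pass on the current values of locals without losing bits is to report them as hexadecimal floating-point literals. The Python sketch below shows the idea; the helper names and the report layout are hypothetical, and a Fortran or .NET port would need its own equivalent of hexadecimal float output.

```python
def describe(name: str, z: complex) -> str:
    """Format a complex local with the exact bit pattern of both parts."""
    return f"{name} = ({z.real.hex()}, {z.imag.hex()})"

def parse_parts(re_hex: str, im_hex: str) -> complex:
    """Rebuild the value from the two hexadecimal strings of a report."""
    return complex(float.fromhex(re_hex), float.fromhex(im_hex))

temp = complex(1e-173, 1e-173)
# Example report line: routine, iteration counter, then the exact operand.
print("ZLAHQR, ITS == 2:", describe("TEMP", temp))
```

Decimal output such as ~1e-173 is enough to spot the problem, but the hexadecimal form lets someone else rebuild exactly the same operand.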
> It seems we are heading down this road, and I recommend taking it easy.
That is fine with me! :)
One outcome of #577 is that LAPACK relies on the FORTRAN compiler to implement reasonably robust (with respect to underflow/overflow) complex division and ABS(). Should we start maintaining a document that collects this and similar requirements? They are equally important and useful for people who want to use LAPACK with other or new compilers, for compiler builders, and for people who port some or all of the LAPACK algorithms to other languages.
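To make "reasonably robust complex division" concrete, the following Python sketch contrasts the textbook formula with Smith's algorithm, a well-known scaled variant. Whether a particular Fortran compiler uses this or another scheme is not stated in this thread, so treat it as an illustration only. The helper `ieee_div` is hypothetical and exists because Python raises an exception on division by zero instead of returning NaN or infinity.

```python
import math

def ieee_div(x: float, y: float) -> float:
    """Real division that returns NaN/inf on a zero divisor, as IEEE 754 does."""
    if y == 0.0:
        if x == 0.0 or math.isnan(x):
            return math.nan
        return math.copysign(math.inf, x) * math.copysign(1.0, y)
    return x / y

def naive_div(a: complex, b: complex) -> complex:
    """Textbook a/b: the denominator re^2 + im^2 can underflow or overflow."""
    den = b.real * b.real + b.imag * b.imag
    re = a.real * b.real + a.imag * b.imag
    im = a.imag * b.real - a.real * b.imag
    return complex(ieee_div(re, den), ieee_div(im, den))

def smith_div(a: complex, b: complex) -> complex:
    """Smith's algorithm: scale by the larger part of b (assumes b != 0)."""
    if abs(b.real) >= abs(b.imag):
        r = b.imag / b.real
        den = b.real + b.imag * r
        return complex((a.real + a.imag * r) / den, (a.imag - a.real * r) / den)
    r = b.real / b.imag
    den = b.imag + b.real * r
    return complex((a.real * r + a.imag) / den, (a.imag * r - a.real) / den)

z = complex(1e-173, 1e-173)
print("naive:", naive_div(z, z))
print("Smith:", smith_div(z, z))
```

Smith's algorithm is not perfectly robust either, which is one reason a written list of what LAPACK expects from the compiler would be useful.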
Sure! It would be good to have this information well documented.
I also spent some time tracking down (possibly) all divisions in the files LAPACK/SRC/z*.f (COMPLEX*16 algorithms) that have the form
REAL / COMPLEX or COMPLEX / COMPLEX
In total I found 53 files. See the attached file: complexDivisionFound.code-search
For this I used the following REGEX expression in Visual Studio Code:
\n.*/[^0-9](?!DBLE)(?!REAL)(?!MIN)(?!MAX)[^0-9]
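A similar scan can be scripted. The Python sketch below uses its own simple heuristic, which is an assumption and not the expression used for the search above: it reports a `/` whose right-hand operand is neither a numeric literal nor a DBLE/REAL/MIN/MAX call, and it skips fixed-form comment lines. It cannot tell REAL operands from COMPLEX ones, so the hits still need manual review.

```python
import re

# Heuristic pattern (hypothetical): '/' followed by an identifier or '(',
# excluding calls whose result is known to be real.
DIVISION = re.compile(r"/\s*(?!DBLE\b|REAL\b|MIN\b|MAX\b)[A-Z(]", re.IGNORECASE)

def is_comment(line: str) -> bool:
    """Fixed-form Fortran: '*', 'C', 'c' or '!' in column 1 marks a comment."""
    return line[:1] in ("*", "C", "c", "!")

def find_divisions(lines):
    """Return (line number, text) for every line with a candidate division."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(lines, start=1)
        if not is_comment(line) and DIVISION.search(line)
    ]

sample = [
    "      TEMP = TEMP / RTEMP",
    "*     ratio A / B in a comment",
    "      X = Y / 2",
    "      W = Z / DBLE( N )",
]
for number, text in find_divisions(sample):
    print(number, text)
```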
> Maybe, for gfortran, we need to compile with the flags -mfpmath=sse -msse2 for testing purposes. I think this would force all computations to be done in 64-bit arithmetic. I am not sure, though.
Yes, with GCC that is required, but these flags should be set by default on x86-64. The documentation excerpt below is for GCC 11, but much older versions of GCC should behave the same way. Using the GNU Compiler Collection (GCC): 3.19.59 x86 Options
> sse
> Use scalar floating-point instructions present in the SSE instruction set. This instruction set is supported by Pentium III and newer chips, and in the AMD line by Athlon-4, Athlon XP and Athlon MP chips. The earlier version of the SSE instruction set supports only single-precision arithmetic, thus the double and extended-precision arithmetic are still done using 387. A later version, present only in Pentium 4 and AMD x86-64 chips, supports double-precision arithmetic too.
> For the x86-32 compiler, you must use -march=cpu-type, -msse or -msse2 switches to enable SSE extensions and make this option effective. For the x86-64 compiler, these extensions are enabled by default.
> The resulting code should be considerably faster in the majority of cases and avoid the numerical instability problems of 387 code, but may break some existing code that expects temporaries to be 80 bits.
> This is the default choice for the x86-64 compiler, Darwin x86-32 targets, and the default choice for x86-32 targets with the SSE2 instruction set when -ffast-math is enabled.
> Example:
> XEIGTSTZ < zec.in fails due to an underflow in ZLAHQR.
> Steps to reproduce: ZGET37 -> knt == 31, ZHSEQR -> ZLAHQR -> at the end of the second QR step (ITS == 2), the following code causes an underflow (with certain registers, see below):
> TEMP = H( I, I-1 )
> IF( DIMAG( TEMP ).NE.RZERO ) THEN
> RTEMP = ABS( TEMP) ! <-- underflow on TEMP = (~1e-0173, ~1e-0173)
> IF (RTEMP .EQ. RZERO) RTEMP = CABS1(TEMP)
> H( I, I-1 ) = RTEMP
> TEMP = TEMP / RTEMP
> IF( I2.GT.I )
> Our compiler targets the .NET CLR. Its JIT decides to use SSE registers for ABS(TEMP), which causes an underflow in the intermediate computation of the magnitude. Ifort (as another example) uses the floating-point registers in this situation and therefore does not underflow (because they are 80 bits long). I am trying to get a clear picture of what LAPACK expects from the compiler/processor at run time in terms of precision and numeric range.
To get to the point:
Correct?
By default, GCC generates code only for the 64-bit floating-point registers on x86-64, and on my machine all LAPACK tests usually pass except for one or two.
Does the Netlib LAPACK test suite pass when compiled with GCC?
Edit: resolved, see https://github.com/Reference-LAPACK/lapack/pull/577#issuecomment-859496175
> Maybe, for gfortran, we need to compile with the flags -mfpmath=sse -msse2 for testing purposes. I think this would force all computations to be done in 64-bit arithmetic. I am not sure, though.
I tried -mfpmath=sse -msse2 with GCC 11 on both MacOS and Linux: https: https:
@hokb, can you reproduce the overflow issue mentioned in https://github.com/Reference-LAPACK/lapack/issues/575#issuecomment-855880000 with GCC using the SSE flags? Could you help with that?
@weslleyspereira I have never tried GCC. The only setup I have access to and running is ifort on Windows. Getting GCC up and running via cygwin for the tests would take me a few days (especially given how little room for maneuver I have at the moment ... :|). But let me know if you need me to take on this challenge.
At least according to https://godbolt.org/z/YYv5oPxe9, using the flags has no effect on the code generated by gfortran. But of course, only a test run will tell for sure...
I do not use Windows, but I am on it. First I will test LAPACK with ifort on Ubuntu and see what happens. Enjoy your holidays!