__ftol2 is exported from msvcrt on NT6+, not from ntdll for some reason. So native apps still need to statically link _ftol2 and _ftoul2_legacy, but apps linking to msvcrt only need to statically link _ftoul2_legacy (via msvcrtex), when on NT6+.
This is used by CL v19.41+. It replicates the behavior of the inline assembly code that previous CL versions generated. According to tests it works the same as with previous VS versions.
MSVC was previously given a "result" variable to copy the fscale result from st(0). This led to another "fld" FPU stack push at the very end without popping the source value from the FPU stack.
Moreover, this copy isn't even needed: A simple "fstp st(1)" at the end pops an element from the FPU stack while effectively storing the result in st(0), the register used for returning a double value.
This problem didn't affect GCC, as it is only given the "fscale" instruction and does all necessary stack operations itself.
However, looking into the CRT sources, I found many other i386 implementations with inline assembly suffering from the same problem.
Fortunately, they have been replaced by pure assembly implementations a while ago, so it's time to finally remove them.
ldexp would have also been a candidate for a pure assembly implementation, but the required check for NaN and setting errno (verified on Win2003) already outweighs the benefits.
And we cannot just do a NaN check with FUCOMI as this is an i686/pentiumpro instruction while we're still targeting i586/pentium.
I'm also using this opportunity to clean up the ldexp.c header and only put in the remaining contributors as returned by "git blame".
Thanks to NightWolve1975 for reporting the problem! (https://twitter.com/nightwolve1975/status/1099042477531643912)