There's no great advantage to using MMX instructions here on modern
processors, since REP MOVSB/STOSB are optimized in microcode anyway
and tend to run much faster than MMX/SSE/AVX variants.
This also makes it much easier to implement high-level emulation of
memcpy/memset in UserspaceEmulator once we get there. :^)
This was supposed to be the foundation for some kind of pre-kernel
environment, but nobody is working on it right now, so let's move
everything back into the kernel and remove all the confusion.
Move the "fast memcpy" stuff out of StdLibExtras.h and into Memory.h.
This will break a ton of things that were relying on StdLibExtras.h
to include a bunch of other headers. Fix will follow immediately after.
This makes it possible to include StdLibExtras.h from Types.h, which is
the main point of this exercise.