use memcpyx on TINI: up to 432 Dhrystones/sec