• Rosetta 2 in tight code

    From Alan Browne@21:1/5 to All on Sat Nov 4 19:16:14 2023
    Been working on a project that has a particular function running in 8
    threads.

    Over a small 'patch' of a sphere (about 0.4 x 1 radians of surface),
    there are 100,000 objects between the surface and an arbitrary height
    above the surface. The function searches for other objects within a
    small circle around each object:

    1) search for objects close to each other in the vertical; then,
    2) if the vertical is close, check the horizontal distance within that
    small circle on the sphere. The latter is trig heavy, but does not
    occur very often.
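
    For anyone wanting to picture the two-stage test, here is a rough
    Python sketch. It is not the project code: the (lat, lon, height)
    tuple layout, the threshold names and the use of the haversine
    formula for the trig step are all assumptions made for illustration.

```python
import math

def central_angle(lat1, lon1, lat2, lon2):
    """Great-circle central angle in radians (haversine formula)."""
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp guards against rounding pushing sqrt(h) just above 1.0.
    return 2 * math.asin(min(1.0, math.sqrt(h)))

def within_range(a, b, max_dh, max_angle):
    """a and b are hypothetical (lat, lon, height) tuples, angles in radians.

    Step 1 is a cheap vertical comparison; the trig-heavy step 2 only
    runs for the few pairs that pass it.
    """
    if abs(a[2] - b[2]) > max_dh:
        return False
    return central_angle(a[0], a[1], b[0], b[1]) <= max_angle
```

    Putting the subtraction first means most pairs are rejected without
    ever touching sin/cos/asin.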

    This code is tight - no I/O at all. Its purpose is to extend a linked
    list attached to the first object if another object is within
    interaction range (for eventual further computations).
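
    The list-building part might look roughly like this in Python. Again
    a sketch under assumptions: the Obj class, its field names and the
    in_range callback are hypothetical, and a plain Python list stands in
    for the linked list.

```python
from dataclasses import dataclass, field

@dataclass
class Obj:
    ident: int
    height: float
    near: list = field(default_factory=list)  # stand-in for the linked list

def link_pairs(objs, in_range):
    """Visit each unordered pair once; record hits on the first object."""
    for i in range(len(objs) - 1):
        for j in range(i + 1, len(objs)):
            if in_range(objs[i], objs[j]):
                objs[i].near.append(objs[j].ident)

objs = [Obj(0, 0.0), Obj(1, 1.0), Obj(2, 10.0)]
link_pairs(objs, lambda a, b: abs(a.height - b.height) <= 2.0)
print([o.near for o in objs])  # prints [[1], [], []]
```

    Only the first object of a pair gets the entry, which matches the
    "attached to the first object" description; nothing here does I/O
    inside the loops.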

    = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
    Testing (non Rosetta 2 part).
    Isolated the function into a test program.
    First, the code is compiled on each computer for its own target, i.e.,
    x86_64 for the i7 and AArch64 for the M2.
    The i7 is a Hyper-Threading quad core, 3.7 GHz (iMac 27" 2012).

    Run times - for 8 threads.
    ((99999^2 - 1)/2 compares = 4,999,900,000 compares).
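
    A quick check of that arithmetic in Python. The second figure,
    n(n-1)/2 for n = 100,000, is only there for comparison and assumes
    every unordered pair is visited once; the exact count depends on the
    loop bounds in the real code.

```python
n = 100_000

quoted = (99_999 ** 2 - 1) // 2      # the formula quoted above
all_pairs = n * (n - 1) // 2         # every unordered pair of n objects

print(f"{quoted:,}")                 # prints 4,999,900,000
print(f"{all_pairs:,}")              # prints 4,999,950,000
print(f"{all_pairs - quoted:,}")     # prints 50,000
```

    Either way it is about 5 billion compares, so the difference does not
    matter for the timings.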

    iMac i7 : 29.9 seconds
    Mac Mini M2 : 8.1 seconds

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Run times - for 4 threads.
    iMac i7 : 49.4 seconds
    Mac Mini M2 : 10.4 seconds
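
    The ratios implied by those four timings, worked out in Python. Only
    the run times come from the tests; the dictionary layout and the
    ratio() helper are just for illustration.

```python
# Wall-clock run times in seconds, keyed by (machine, thread count).
times = {
    ("i7", 8): 29.9,
    ("M2", 8): 8.1,
    ("i7", 4): 49.4,
    ("M2", 4): 10.4,
}

def ratio(slow, fast):
    """How many times longer the 'slow' run took than the 'fast' one."""
    return times[slow] / times[fast]

m2_vs_i7_8thr = ratio(("i7", 8), ("M2", 8))   # M2 advantage, 8 threads
m2_vs_i7_4thr = ratio(("i7", 4), ("M2", 4))   # M2 advantage, 4 threads
i7_gain_4_to_8 = ratio(("i7", 4), ("i7", 8))  # i7 gain going 4 -> 8 threads
m2_gain_4_to_8 = ratio(("M2", 4), ("M2", 8))  # M2 gain going 4 -> 8 threads

for name, value in [("M2 vs i7, 8 threads", m2_vs_i7_8thr),
                    ("M2 vs i7, 4 threads", m2_vs_i7_4thr),
                    ("i7, 4 -> 8 threads", i7_gain_4_to_8),
                    ("M2, 4 -> 8 threads", m2_gain_4_to_8)]:
    print(f"{name}: {value:.2f}x")
```

    So the M2 is roughly 3.7x to 4.75x faster natively, and doubling the
    thread count gains about 1.65x on the i7 and about 1.28x on the M2.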


    i7 8-thread core loads: https://www.dropbox.com/scl/fi/muqt8es9awn7njro1uspj/FLT-i7-8-Thr.png?rlkey=gc1c6tkc3q554b6ihcf8f8jqm

    i7 4-thread core loads: https://www.dropbox.com/scl/fi/rgqlhwepqvaoe8nh9vy4c/FLT-i7-4-Thr.png?rlkey=0jllngnevekq8w4lqiluva7wm

    = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

    Second test (The Rosetta 2 part)
    Transferred the x86_64 executable to the Mac Mini M2, and ran it there
    (under Rosetta 2 - which is automatic if you try to run x86_64 code).
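
    Since the translation is automatic, it can be worth confirming that a
    process really is running translated. A minimal Python sketch,
    assuming the macOS sysctl command-line tool and the
    sysctl.proc_translated key; the function name is made up. Note that
    the value describes the spawned sysctl process, which normally runs
    as the same architecture as its parent.

```python
import subprocess

def running_under_rosetta():
    """Return True/False on macOS, or None if the check is unavailable."""
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return None              # no sysctl tool on this system
    if out.returncode != 0:
        return None              # key not known here (e.g. not macOS)
    value = out.stdout.strip()
    if value not in ("0", "1"):
        return None              # unexpected output; do not guess
    return value == "1"          # "1" means translated, "0" means native

print(running_under_rosetta())
```

    In a compiled program the same key can be read with sysctlbyname(),
    which avoids spawning a second process.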

    x86_64 code under Rosetta - 8 threads
    Mac Mini M2 : 215 seconds
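
    To put that number next to the native 8-thread results, a small
    Python calculation. The three run times are the ones measured in
    these tests; the variable names are only for illustration.

```python
# 8-thread wall-clock run times in seconds.
native_i7 = 29.9      # x86_64 build on the iMac i7
native_m2 = 8.1       # AArch64 build on the Mac Mini M2
rosetta_m2 = 215.0    # x86_64 build on the Mac Mini M2 under Rosetta 2

vs_native_m2 = rosetta_m2 / native_m2   # cost of translation on the same machine
vs_native_i7 = rosetta_m2 / native_i7   # same binary, translated vs. real x86_64

print(f"Rosetta vs native M2: {vs_native_m2:.1f}x slower")
print(f"Rosetta vs native i7: {vs_native_i7:.1f}x slower")
```

    That works out to about 26.5x slower than the native M2 build and
    about 7.2x slower than the same binary on the 2012 i7.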

    So, while Rosetta 2 in general cases might be quick as a whip, it's not
    so hot if the code is tight and does lots of math.
    (Side note: When Rosetta 2 is up an extra thread is active... )

    I don't expect a huge improvement on M3... If it were a higher spec M3
    with more cores, then more threads would certainly help.

    --
    “Markets can remain irrational longer than you can remain solvent.”
    - John Maynard Keynes.
