• Hotdog Salesman@programming.dev
    3 hours ago

    They do do it in C. The packages are written in C; Python is just the wrapper that lets data scientists with less coding experience use them easily.

    That’s like the entire data science joke. It’s C in a Python trench coat.

    • Grumpy@sh.itjust.works
      21 minutes ago

      Nearly every language’s core packages are written in C, and almost every higher-level package contains some amount of C. That doesn’t mean we get to say every program is done in C. If you keep drilling down, everything is just machine language. And it certainly still disproves OP’s point about inefficient Python.

      Saying it’s all done in C is hardly even true. Just look at the xformers library on GitHub: only 2.7% of the code is C, and the entire library is about optimization.

      Additionally, the vast majority of the great leaps in ML efficiency haven’t come from better-programmed packages, though those certainly made big strides too. How we calculate has itself changed, and that is what produces the greatest optimizations in anything. It doesn’t matter what language it is: looping 1000000 times to add 1 is going to perform worse than just multiplying 1 by 1000000. How we calculate and what we choose to give up (such as determinism in some implementations of SDP attention) make big differences.
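
      A rough sketch of the loop-versus-multiply point, in plain Python. The function names are made up for illustration, and the timings will vary by machine; the only thing it is meant to show is that both approaches give the same answer while doing very different amounts of work.

```python
import timeit

N = 1_000_000


def add_in_loop(n: int) -> int:
    """Add 1 to a running total n times (n separate additions)."""
    total = 0
    for _ in range(n):
        total += 1
    return total


def multiply(n: int) -> int:
    """Get the same result with a single multiplication."""
    return 1 * n


# Both approaches compute the same value.
assert add_in_loop(N) == multiply(N)

# Time a few runs of each; absolute numbers depend on the machine.
loop_seconds = timeit.timeit(lambda: add_in_loop(N), number=3)
mult_seconds = timeit.timeit(lambda: multiply(N), number=3)

print(f"loop:     {loop_seconds:.6f} s")
print(f"multiply: {mult_seconds:.6f} s")
```

      Porting the loop to C would shave the constant factor, but it would still be doing a million additions. Changing the calculation removes the work altogether.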

      Optimizations also have to be done by someone, whether that be data scientists or otherwise. The ability of higher-level languages to enable them to do so, like you say, also makes a big difference. If all the programmers had to optimize in C only, we’d still be way behind where we are now in performance.

      Just swapping languages doesn’t yield better results like OP is implying.