_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
   URI   What Is Specialized Hardware and Why Open Source Will Drive Adoption
        exikyut wrote 3 days ago:
        I guess the most relevant question within an open source context is,
        "okay, but when/how do we get to play with this fancy new hardware?"
        OpenPOWER remains a massive niche - as in, it's big, but it's still a
        niche - because nobody (in the sense of "just anybody", ie the long
        tail) can really get their hands on it. It's nontrivially difficult to
        maintain access to IBM POWER8 kit, and the Talos is also certainly way
        beyond what I could personally afford from a "directionless
        tinkering/learning" standpoint.
        If I understand this article correctly, cloud-native means there's not
        likely to be a Talos equivalent for sale, and it's all remote access.
        Also, OpenPOWER is, like, an entire CPU, with a design that is extremely
        old and can be expected to stay around for a long time, and even with
        that sort of centralizable focus opportunity it's still a niche.
        I get the impression this is suggesting the creation of custom
        components with somewhat shorter design lifecycles - years, certainly,
        but not multiples of decades, and maybe only months for individual
        hardware revisions.
        If this really wants to attract developers from outside of the
        immediate focus of the relevant industries... how are the
        discoverability and accessibility equations going to be solved?
        Of course, the more potential cooks you attract to the kitchen the more
        overheads you have to deal with, but I wonder if it is a necessary
        element to maintain interest and familiarity with what would apparently
        prefer to be a fast-changing environment.
          mrmrcoleman wrote 3 days ago:
          "how are the discoverability and accessibility equations going to be
          solved?" - this is the whole challenge.
          I referred to the WorksOnARM project in the blog [1]. WorksOnARM
          solved this problem for ARM by making machines available through
          Equinix Metal's API for development and testing.
          Hardware manufacturers can ship boxes to developers, give them access
          through an API like Equinix Metal's, or some other approach, but one
          way or another developers are going to need access.
   URI    [1]: https://www.worksonarm.com/
        mwcampbell wrote 3 days ago:
        > Kubernetes and its family of cloud native projects revolutionized 
        computing in 4 short years.
        This strikes me as wild hyperbole. It's a new management layer for
        server-side computing -- nothing compared to the changes brought by
        microprocessors, or even by minicomputers like the DEC PDP line.
          jgalt212 wrote 3 days ago:
          Even more so if you can cram more and more processor cores into a
          chip, or more and more sockets into a board.  It's easy to envision
          a multi-socket ARM server making more than a few microservice-based
          architectures unnecessary.
        WaitWaitWha wrote 3 days ago:
        I am unclear of the article's target.
        Your article implies we are reading about HW manufacturers that have
        prioritization & workload issues, but then you mention Apple, AWS, et
        al.  These HW designs are all directed work to the HW manufacturer.
        There are no concerns about prioritization or workload.  They get paid
        handsomely for making the right choices.
          sitkack wrote 3 days ago:
          It is a not so hidden message to industry about what Equinix is doing
          in the cloud server space.
            6pac3rings wrote 2 days ago:
            Does anyone have insight on what implications follow from Jim
            Keller's Tenstorrent graph processing chip? On Lex Fridman Podcast
            #162, he speculates that the future may have a $5 add-on chip to a
            child's toy which will have more intelligence than a human. I get
            how the single-threaded raster pixel GPU architecture is less
            evolved than a graph processor.
              sitkack wrote 1 day ago:
              I saw that; his comment was more around what the future could
              look like in 20-30 years. I don't think he was talking about
              their next product. Think Sturgeon and PKD vantage points and
              timelines. [1] And now I have fallen back down the Epiphany [2]
              rabbit hole. The main thing that graph processors get you is
              being able to avoid the von Neumann bottleneck. [3]
   URI        [1]: https://en.wikipedia.org/wiki/Explicit_data_graph_execut...
   URI        [2]: https://en.wikipedia.org/wiki/Adapteva
   URI        [3]: https://en.wikipedia.org/wiki/Von_Neumann_architecture
        imtringued wrote 3 days ago:
        This article didn't mention PIM (processing in memory), which is a way
        of keeping the established general purpose CPU model and accelerating
        it by simply adding processors directly into RAM. The compute power
        scales with the size of your dataset. You also benefit from greater
        memory bandwidth and lower power consumption.
        Here is an existing implementation: [1]
   URI  [1]: https://www.upmem.com/
          sitkack wrote 3 days ago:
          See [1] and [2]. I ran into one of these companies, in 2015(?), at a
          big data conference. The founders claimed not to have known about
          IRAM ;)  It feels like PIM comes up every couple of years, but RAM is
          expensive already, and these chips are going to be some multiple more
          expensive. There is some inflection point (memory bandwidth, power,
          something) that will enable PIM to finally get traction.
          The immediate problem was tooling, OS support, supply, scale, etc. I
          think it would make a lot of sense to get this stuff installed
          sitewide after your code has been tuned for support. So it needs a
          good simulator or it needs an interposer so one can code to the ABI
          of the hardware.
          I could see a big supercomputer initiative being used as a test bed,
          probably one designed for genomics. I think we overestimate how
          effective and easy to use PIM (processing in RAM) could be. Where it
          will really start to shine is when the processor can retire
          instructions into the PIM, pre-materialize data streams from the PIM,
          do a scheduled reduction and then push compute bundles back to
          memory.  The computational RAM needs to be integrated with the CPU; it
          can't just be an accelerator.
          Computational disks might be way, way easier to implement.  We will
          know how that is going as soon as Western Digital tells us. [3]
   URI    [1]: https://en.wikipedia.org/wiki/Computational_RAM
   URI    [2]: http://iram.cs.berkeley.edu/
   URI    [3]: https://github.com/chipsalliance/Cores-SweRV
            mrmrcoleman wrote 3 days ago:
            Thanks for the references, sitkack.
          jleahy wrote 3 days ago:
          Presumably your computing power scales with the number of 'data
          processors' rather than the size of your dataset per se. In the same
          way that you might say "if you use GPUs your computing power scales
          with the size of your dataset" (as you buy more GPUs to get more
          memory, thus get more compute).
          But how does this differ from a CPU or GPU? Both are just DRAM with
          something bolted onto the side, and both have a limited bandwidth
          between the DRAM and the processors. The difference with putting the
          processors on the DIMMs is that you now have an extremely restrictive
          thermal envelope to work within (and if you put it on the same die,
          which I hope you're not, then a CMOS technology that's supremely
          ill-suited to computation).
          mrmrcoleman wrote 3 days ago:
          Hey, thanks for the link! This post will be followed up with a series
          of articles that will go into more detail about the various
          technologies. I'll be sure to include PIM in my research.