• modeler@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    edit-2
    5 months ago

    Typically you need about 1GB graphics RAM for each billion parameters (i.e. one byte per parameter). This is a 405B parameter model. Ouch.

    Edit: you can try quantizing it. This reduces the amount of memory required per parameter to 4 bits, 2 bits or even 1 bit. As you reduce the size, the performance of the model can suffer. So in the extreme case you might be able to run this in under 64GB of graphics RAM.

    • cheddar@programming.dev
      link
      fedilink
      English
      arrow-up
      1
      ·
      5 months ago

      Typically you need about 1GB graphics RAM for each billion parameters (i.e. one byte per parameter). This is a 405B parameter model.