Doing the Lord’s work in the Devil’s basement

  • 0 Posts
  • 8 Comments
Joined 6 months ago
Cake day: May 8th, 2024


  • Yeah, I did some looking up in the meantime and indeed you're gonna have a context-size issue. That's why it's only summarizing the last few thousand characters of the text: that's all that fits in the model's context window.

    There are some models fine-tuned for an 8K-token context window, some even for 16K like this Mistral brew. If you have a GPU with 8 GB of VRAM you should be able to run it using one of the quantized versions (Q4 or Q5 should be fine). Summarization quality should still be reasonably good.

    If 16K isn't enough for you, then that's probably not something you can run locally. However, you can still run a larger model privately in the cloud. Hugging Face, for example, lets you rent GPUs by the minute and run inference on them; it should only cost you a few dollars. As far as I know this approach is still compatible with Open WebUI.
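    To get a feel for why the context window cuts off most of a long document, here's a rough back-of-the-envelope sketch. The ~4-characters-per-token ratio is just a rule of thumb for English text (the real number depends on the tokenizer), and the `reserved_for_output` figure is an assumption, not anything from a specific model.

    ```python
    def chars_that_fit(context_tokens, reserved_for_output=512, chars_per_token=4):
        """Rough estimate of how many characters of input text fit in a
        model's context window, after reserving room for the summary itself.

        chars_per_token=4 is a common rule of thumb for English; actual
        tokenizers vary, so treat the result as an order-of-magnitude guide.
        """
        usable_tokens = context_tokens - reserved_for_output
        return usable_tokens * chars_per_token

    # An 8K-token window covers roughly 30K characters of input:
    print(chars_that_fit(8192))   # -> 30720
    # Doubling the window to 16K roughly doubles that:
    print(chars_that_fit(16384))  # -> 63488
    ```

    So if your document is, say, 200K characters, even a 16K-token model only "sees" a third of it, which matches the behavior of it summarizing just the tail end of the text.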


  • I've been getting back into anarchy Minecraft, as an old buddy of mine is kinda resurrecting a base I used to be active at.

    The scene is mostly dead; on our main server it's 2 to 4 players on average, which is crazy to me. It used to be 50 to 100 most evenings.

    Now I've got a 2-million-block trip to make, and even auto-walking on the nether roof that's gonna take some time. But it's also an occasion to revisit some historic milestones along the way! I was able to get my hands on a signed book a friend had given me some time before he passed away, so it's also kind of an emotional journey.