LLaMA Now Goes Faster on CPUs

justine.lol

My kernels go 2x faster than MKL for matrices that fit in L2 cache. They are still a work in progress, since the speedup works best for prompts with fewer than 1,000 tokens.
