**skawa (guru）** @skawa@mstdn.guru · 2023-03-13T11:13:43Z

With LLaMA, reports range from 2 tokens/sec on a high-end model all the way down to 1 token per 10 seconds on a Raspberry Pi. I love how wide that range is. RT

I've successfully run the LLaMA 7B model on my 4GB RAM Raspberry Pi 4. It's super slow, about 10 sec/token. But it looks like we can run powerful cognitive pipelines on cheap hardware. https://twitter.com/miolini/status/1634982361757790209/photo/1