Milk-V Duo runs baby Llama 2 at 0.5 tok/s.

https://twitter.com/Redstone_Bi/status/1683777532309696513

I can confirm this. Everything below ran on the board itself (Arch Linux; see the MilkV Community thread 【Arch Linux On Milkv-duo】 Running Arch Linux on Milkv-duo):

pacman -Sy git tcc wget
git clone https://github.com/karpathy/llama2.c
cd llama2.c
tcc -o run run.c -lm
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin
./run stories15M.bin
Once upon a time, there was a little girl named Lily. She loved to play with her toys and eat yummy snacks. One day, Lily's mommy said, "Let's clean up your toys before we eat the snacks!" Lily thought it was a good idea, so they started cleaning up.
After they finished cleaning, Lily's mommy put the snacks in the freezer. Lily was so excited to eat them and check the freezer to see if they were fresh. But when they opened the freezer, they saw that the snacks had turned out thin and tasted funny. Lily's mommy felt bad that they didn't have enough to eat.
Lily had an idea. She said, "Let's put some new snacks in the freezer and try again!" So they did. And sure enough, they had gained some new snacks from the freezer. Lily and her mommy were happy and enjoyed their snacks together. The end.
achieved tok/s: 0.192897

On another run it achieved 0.35 tok/s.
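
To put those rates in perspective, here is a rough sketch that converts tok/s into seconds per token and into the wall-clock time for a whole generation. It is plain arithmetic in Python, not part of llama2.c. The function names are my own, and the 256-token length is only an assumed example, so substitute whatever length your run actually produces.

```python
# Rough sketch: turn a reported tok/s figure into more tangible numbers.
# Helper names and the 256-token example length are assumptions for
# illustration; they do not come from llama2.c or from the runs above.

def seconds_per_token(tok_per_s: float) -> float:
    """Average time spent on one token at the given throughput."""
    if tok_per_s <= 0:
        raise ValueError("throughput must be positive")
    return 1.0 / tok_per_s


def generation_time(tok_per_s: float, n_tokens: int) -> float:
    """Estimated wall-clock seconds to produce n_tokens tokens."""
    if n_tokens < 0:
        raise ValueError("token count must be non-negative")
    return n_tokens * seconds_per_token(tok_per_s)


if __name__ == "__main__":
    # Rates reported in this thread: headline figure and my two runs.
    reported = {"headline": 0.5, "run 1": 0.192897, "run 2": 0.35}
    example_tokens = 256  # assumed generation length, adjust as needed
    for label, rate in reported.items():
        per_tok = seconds_per_token(rate)
        total_min = generation_time(rate, example_tokens) / 60.0
        print(f"{label}: {per_tok:.2f} s/token, "
              f"about {total_min:.1f} min for {example_tokens} tokens")
```

The gap between runs matters more than it looks: going from roughly 0.19 to 0.35 tok/s nearly halves the wait for a full story.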
