AI

From Hegemon Wiki
Revision as of 10:21, 14 April 2023 by 192.168.1.2 (talk)

LLaMA

https://wiki.installgentoo.com/wiki/Home_server#Expanding_Your_Storage

https://rentry.org/llama-tard-v2

https://rentry.org/llamaaids

https://hackmd.io/@reneil1337/alpaca

https://boards.4channel.org/g/catalog#s=lmg%2F

https://find.4chan.org/?q=AI+Dynamic+Storytelling+General

https://find.4chan.org/?q=AI+Chatbot+General

https://find.4chan.org/?q=%2Flmg%2F (local models general)

https://boards.4channel.org/g/thread/92400764#p92400764


https://files.catbox.moe/lvefgy.json

https://pytorch.org/hub/nvidia_deeplearningexamples_tacotron2/


python server.py --model llama-7b-4bit --wbits 4

python server.py --model llama-13b-4bit-128g --wbits 4 --groupsize 128
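The two launch commands above differ only in the flags implied by the model directory name. A minimal sketch of deriving those flags from the name follows. It assumes a naming convention where "<N>bit" gives the quantization width and a trailing "-<N>g" gives the group size; that convention is inferred from the two names above, not something the web UI guarantees, and `launch_args` is a hypothetical helper, not part of text-generation-webui.

```python
import re


def launch_args(model_dir: str) -> list[str]:
    """Build a server.py command line from a quantized model directory name.

    Hypothetical helper: only the flags --model, --wbits and --groupsize
    that appear in the commands above are used.
    """
    args = ["python", "server.py", "--model", model_dir]

    # Assumption: "<N>bit" in the name is the quantization width.
    bits = re.search(r"(\d+)bit", model_dir)
    if bits:
        args += ["--wbits", bits.group(1)]

    # Assumption: a "-<N>g" component in the name is the GPTQ group size.
    group = re.search(r"-(\d+)g(?:-|$)", model_dir)
    if group:
        args += ["--groupsize", group.group(1)]

    return args


if __name__ == "__main__":
    for name in ("llama-7b-4bit", "llama-13b-4bit-128g"):
        print(" ".join(launch_args(name)))
```

If a model directory does not follow this naming pattern, the helper simply omits the flag, so check the model card for the actual bit width and group size.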

https://github.com/qwopqwop200/GPTQ-for-LLaMa/issues/59 (for an out-of-space error during installation)

https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#4-bit-mode


https://github.com/pybind/pybind11/discussions/4566

https://lmsysvicuna.miraheze.org/wiki/How_to_use_Vicuna#Use_with_llama.cpp%3A

https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g


Here's the uncensored Vicuna model (trained on a dataset with the moralizing responses removed). Too bad it's only the CPU-quantized version.

Vicuna generating its own prompts

Benchmarks

Interface  Model       GPTQ                   Xformers?  HW                              Load (s)  Speed (tokens/sec)
text-gen   anonVic13B  GPTQ-for-LLaMa-triton  yes        240 GB SSD, 16 GB, desktop off  10.53     7.97
text-gen   anonVic13B  GPTQ-for-LLaMa-triton  no         240 GB SSD, 16 GB, desktop off  10.22     7.55
text-gen   anonVic13B  GPTQ-for-LLaMa-cuda    no         240 GB SSD, 16 GB, desktop off  16.68     4.03
text-gen   anonVic13B  GPTQ-for-LLaMa-cuda    yes        240 GB SSD, 16 GB, desktop off  9.34      4.01
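
To compare the rows directly, a small sketch that computes the triton-to-cuda speed ratio from the figures above. The numbers are copied from the table; the first load time has no unit in the original and is assumed to be seconds. Each row looks like a single run, so treat the ratios as rough.

```python
# Rows copied from the benchmark table:
# (kernel, xformers enabled, load time in seconds, tokens per second)
RUNS = [
    ("triton", True, 10.53, 7.97),   # load unit assumed to be seconds
    ("triton", False, 10.22, 7.55),
    ("cuda", False, 16.68, 4.03),
    ("cuda", True, 9.34, 4.01),
]


def speed(kernel: str, xformers: bool) -> float:
    """Return tokens/sec for the matching benchmark row."""
    for k, x, _load, tokens_per_sec in RUNS:
        if k == kernel and x == xformers:
            return tokens_per_sec
    raise KeyError((kernel, xformers))


def triton_speedup(xformers: bool) -> float:
    """Ratio of triton to cuda generation speed at the same xformers setting."""
    return speed("triton", xformers) / speed("cuda", xformers)


if __name__ == "__main__":
    for xf in (True, False):
        print(f"xformers={xf}: triton is {triton_speedup(xf):.2f}x the cuda speed")
```

On these figures the triton kernel generates a little under twice as fast as the cuda kernel, while xformers changes generation speed only slightly for either kernel.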