==== Older ====
{| class="wikitable sortable"
|+
!Date
!Interface
!Model
!GPTQ / backend
!Xformers?
!Hardware and flags
!Load time
!Speed
|-
|
|text-gen
|anon8231489123-vicuna-13b-GPTQ-4bit-128g
|GPTQ-for-LLaMa-'''triton'''
|yes
|240gb SSD, 16gb, desktop off
|10.53
|7.97 tokens/s
|-
|
|text-gen
|anon8231489123-vicuna-13b-GPTQ-4bit-128g
|GPTQ-for-LLaMa-'''triton'''
|No xformers
|240gb SSD, 16gb, desktop off
|10.22s
|7.55 tokens/s
|-
|
|text-gen
|anon8231489123-vicuna-13b-GPTQ-4bit-128g
|GPTQ-for-LLaMa-'''cuda'''
|No xformers
|240gb SSD, 16gb, desktop off
|16.68s
|4.03 tokens/s
|-
|
|text-gen
|anon8231489123-vicuna-13b-GPTQ-4bit-128g
|GPTQ-for-LLaMa-'''cuda'''
|yes
|240gb SSD, 16gb, desktop off
|9.34s
|4.01 tokens/s
|-
|
|text-gen
|llama-30b-sft-oa-alpaca-epoch-2-4bit'''-ggml'''
|no
|no
|2TB SSD, 64gb
|?
|0.67 tokens/s
|-
|
|text-gen
|llama-30b-sft-oa-alpaca-epoch-2-4bit'''-ggml'''
|no
|no
|2TB SSD, 64gb, '''--threads 8'''
|maybe 30s?
|0.51 tokens/s
|-
|
|text-gen
|llama-30b-sft-oa-alpaca-epoch-2-4bit'''-ggml'''
|no
|no
|2TB SSD, 64gb, '''--threads 7'''
|
|0.68 tokens/s
|-
|
|text-gen
|llama-30b-sft-oa-alpaca-epoch-2-4bit'''-ggml'''
|no
|no
|2TB SSD, 64gb, '''--threads 6'''
|
|0.61 tokens/s
|-
|
|text-gen
|anon8231489123-vicuna-13b-GPTQ-4bit-128g'''-ggml'''
|no
|no
|2TB SSD, 64gb
|
|1.17 tokens/s
|-
|
|text-gen
|anon8231489123-vicuna-13b-GPTQ-4bit-128g
|GPTQ-for-LLaMa-'''triton'''
|yes
|2TB SSD, 64gb, '''--pre_layer 25'''
|45.69
|0.25 tokens/s
|-
|
|text-gen
|anon8231489123-vicuna-13b-GPTQ-4bit-128g
|GPTQ-for-LLaMa-'''triton'''
|yes
|2TB SSD, 64gb
|36.47
|9.63 tokens/s
|-
|
|'''llama.cpp'''
|llama-30b-sft-oa-alpaca-epoch-2-4bit'''-ggml'''
|
|
|2TB SSD, 64gb
|10317.90 ms
|1096.21 ms per token
|-
|
|'''llama.cpp-modern-avx512'''
|llama-30b-sft-oa-alpaca-epoch-2-4bit'''-ggml'''
|
|
|2TB SSD, 64gb
|9288.69 ms
|1049.03 ms per token
|-
|
|'''llama.cpp-avx512-pr833'''
|llama-30b-sft-oa-alpaca-epoch-2-4bit'''-ggml'''
|
|
|2TB SSD, 64gb
|13864.06 ms
|0.89 tokens/s, 820.68 ms per token
|-
|
|text-gen
|TheBloke-gpt4-alpaca-lora-30B-4bit-GGML/ggml-model-'''q4_0'''
|
|
|2TB SSD, 64gb
|
|0.78 tokens/s
|-
|
|text-gen+'''avx512-pr833'''
|TheBloke-gpt4-alpaca-lora-30B-4bit-GGML/ggml-model-q4_0
|
|
|2TB SSD, 64gb
|
|1.04 tokens/s
|-
|2023-04-24
|text-gen
|anon8231489123-vicuna-13b-GPTQ-4bit-128g
|GPTQ-for-LLaMa-'''triton'''
|yes
|2TB SSD, 64gb, also running llama.cpp with another model
|16.36
|5.07 tokens/s
|-
|2023-04-26
|koboldcpp
|gozfarb-llama-30b-supercot-ggml/ggml-model-q4_0.bin
|'''clblast'''
|n/a
|2TB SSD, 64gb, '''--threads 8'''
|
|1073ms/T
|-
|2023-04-29
|koboldcpp
|Alpacino-30b-q4_0.bin
|'''clblast'''
|n/a
|2TB SSD, 64gb
|
|700ms/T
|-
|2023-07-13
|koboldcpp
|llama-33b-supercot-ggml-'''q5_1''' (complains about '''old format''')
|'''cublas'''
|n/a
|2TB SSD, 64gb, '''--nommap --smartcontext --usecublas --gpulayers 18'''
|
|643ms/T, 1.4T/s
|-
|2023-07-13
|koboldcpp
|llama-33b-supercot-ggml-'''q5_1''' (complains about '''old format''')
|'''clblast'''
|n/a
|2TB SSD, 64gb, --nommap --smartcontext '''--useclblast 0 0''' --gpulayers 18
|
|685ms/T, 1.2T/s
|-
|2023-07-13
|koboldcpp
|'''airoboros-33b-gpt4-1.2.ggmlv3.q4_K_M.bin'''
|'''cublas'''
|n/a
|2TB SSD, 64gb, --nommap --smartcontext --usecublas --gpulayers 18 '''(probably space for more)'''
|
|652ms/T, 1.5T/s
|-
|2023-07-13
|koboldcpp
|airoboros-33b-gpt4-1.2.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --nommap --smartcontext --usecublas '''--gpulayers 26 (I note 3 threads are set by default)'''
|
|593ms/T, 1.6T/s
|-
|2023-07-13
|koboldcpp
|airoboros-33b-gpt4-1.2.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --nommap --smartcontext --usecublas --gpulayers 26 '''--psutil_set_threads (4 threads)'''
|
|514ms/T, 1.8T/s
|-
|2023-07-13
|koboldcpp
|airoboros-33b-gpt4-1.2.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --smartcontext --usecublas --gpulayers 26 --psutil_set_threads '''(removed nommap)'''
|
|508ms/T, 1.9T/s
|-
|2023-07-13
|koboldcpp
|airoboros-33b-gpt4-1.2.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --smartcontext --usecublas --gpulayers 26 '''--threads 5'''
|
|454ms/T, 2.1T/s
|-
|2023-07-13
|koboldcpp
|airoboros-33b-gpt4-1.2.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --smartcontext --usecublas --gpulayers 26 '''--threads 6'''
|
|'''422ms/T, 2.2T/s'''
|-
|2023-07-13
|koboldcpp
|airoboros-33b-gpt4-1.2.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --smartcontext --usecublas --gpulayers 26 '''--threads 7'''
|
|509ms/T, 1.8T/s
|-
|2023-07-13
|koboldcpp
|airoboros-33b-gpt4-1.2.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --smartcontext --usecublas --gpulayers 26 '''--threads 8'''
|
|494ms/T, 1.7T/s
|-
|2023-07-13
|koboldcpp
|airoboros-33b-gpt4-1.2.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --smartcontext --usecublas --gpulayers 26 '''--threads 6 --linearrope (no difference, needs supercot?)'''
|
|425ms/T, 2.2T/s
|-
|2023-07-13
|koboldcpp
|airoboros-33b-gpt4-'''1.4'''.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --smartcontext --usecublas --gpulayers 26 '''--threads 6'''
|
|400ms/T, 2.3T/s
|-
|2023-07-13
|koboldcpp
|airoboros-'''65b'''-gpt4-1.4.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, --nommap --smartcontext --usecublas --gpulayers 13 --threads 6
|
|1366ms/T, 0.7T/s
|-
|2023-07-14
|koboldcpp
|airoboros-65b-gpt4-1.4.ggmlv3.'''q2_K.'''bin
|cublas
|n/a
|2TB SSD, 64gb, --nommap --smartcontext --usecublas --gpulayers 13 --threads 6
|
|765ms/T, 1.2T/s
|-
|2023-09-06
|koboldcpp
|'''guanaco-33B.ggmlv3.q4_K_M.bin'''
|cublas
|n/a
|2TB SSD, 64gb, --stream --smartcontext --usecublas --gpulayers 29 --threads 6
|
|562ms/T, 1.3T/s
|-
|2023-09-06
|koboldcpp
|guanaco-33B.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, '''--nommap''' --stream --smartcontext --usecublas --gpulayers 29 --threads 6
|
|567ms/T, total 70.7s, 1.4T/s
|-
|2023-09-06
|koboldcpp
|guanaco-33B.ggmlv3.q4_K_M.bin
|cublas
|n/a
|2TB SSD, 64gb, '''--nommap''' --stream --smartcontext --usecublas --gpulayers 25 --threads 6
|
|563ms/T, total 70.2s, 1.4T/s
|-
|2023-12-03
|koboldcpp
|guanaco-33B.q4_K_M.'''gguf'''
|cublas
|n/a
|2TB SSD, 64gb, --nommap --smartcontext --usecublas --threads 6 '''--gpulayers 27'''
|
|330.7ms/T, total 40.79s, 2.94T/s
|-
|2023-12-07
|koboldcpp
|guanaco-33B.q4_K_M.gguf
|cublas
|n/a
|'''7950x3d''', 2TB SSD, 64gb, --nommap --smartcontext --usecublas --threads 6 --gpulayers 27
|
|202.1ms/T, 4.78T/s
|-
|2023-12-07
|koboldcpp
|guanaco-33B.q4_K_M.gguf
|cublas
|n/a
|7950x3d, 2TB SSD, 64gb, --nommap --smartcontext --usecublas '''--threads 32''' --gpulayers 27
|
|360.8ms/T, 2.68T/s
|-
|2023-12-07
|koboldcpp
|guanaco-33B.q4_K_M.gguf
|cublas
|n/a
|7950x3d, 2TB SSD, 64gb, --nommap --smartcontext --usecublas '''--threads 16''' --gpulayers 27
|
|202.6ms/T, 4.82T/s
|-
|2023-12-07
|koboldcpp
|guanaco-33B.q4_K_M.gguf
|cublas
|n/a
|7950x3d, 2TB SSD, 64gb, --nommap --smartcontext --usecublas '''--threads 15''' --gpulayers 27
|
|195.0ms/T, 5.03T/s
|-
|2023-12-16
|koboldcpp
|'''mistral-7b-instruct-v0.2.Q8_0.gguf'''
|cublas
|n/a
|7950x3d, 2TB SSD, 64gb, --nommap --smartcontext --usecublas --threads 15 '''--gpulayers 33'''
|
|22.9ms/T, 42.90T/s
|-
|2023-12-17
|koboldcpp
|'''mixtral-8x7b-moe-rp-story.Q8_0.gguf'''
|cublas
|n/a
|7950x3d, 2TB SSD, 64gb, --nommap --smartcontext --usecublas --threads 15 '''--gpulayers 6'''
|
|214.9ms/T, 4.47T/s
|-
|2024-02-04
|SillyTavern
|miqu 70b
|
|
|gpu layers 9
|
|1.4T/s
|}
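The Speed column mixes two units: tokens per second (tokens/s, T/s) and milliseconds per token (ms per token, ms/T). To compare rows that only give one of them, the two can be converted with a plain reciprocal. The sketch below is only an illustration; the function names are made up for this page and are not part of llama.cpp, koboldcpp or text-gen. Where a row lists both figures they do not always match this conversion, presumably because the tools measured the two numbers over different spans (an assumption, the logs above do not say).

```python
def to_tokens_per_second(ms_per_token: float) -> float:
    """Convert a latency in ms per token to a throughput in tokens/s."""
    if ms_per_token <= 0:
        raise ValueError("ms_per_token must be positive")
    return 1000.0 / ms_per_token


def to_ms_per_token(tokens_per_second: float) -> float:
    """Convert a throughput in tokens/s to a latency in ms per token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return 1000.0 / tokens_per_second


if __name__ == "__main__":
    # ms-per-token values copied from the llama.cpp rows of the table above.
    llama_cpp_rows = {
        "llama.cpp": 1096.21,
        "llama.cpp-modern-avx512": 1049.03,
        "llama.cpp-avx512-pr833": 820.68,
    }
    for name, ms in llama_cpp_rows.items():
        print(f"{name}: {ms} ms/T is about {to_tokens_per_second(ms):.2f} T/s")
```

This only rescales the numbers already in the table; it does not account for load time or prompt processing.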