== LLaMA ==
[https://boards.4channel.org/g/catalog#s=lmg%2F /lmg/]

=== Models ===
* [https://old.reddit.com/r/LocalLLaMA/comments/1obrvab/support_for_ling_and_ring_models_1000b103b16b_has/ Support for Ling and Ring models (1000B/103B/16B)]
* https://github.com/PrimeIntellect-ai/prime-rl - claims to be a smarter finetune of GLM-4.5 Air

==== Abliteration / MXFP4 MoE ====
* [https://old.reddit.com/r/LocalLLaMA/comments/1p5epot/the_most_objectively_correct_way_to_abliterate_so/ The most objectively correct way to abliterate]
* [https://huggingface.co/noctrex/GLM-4.5-Air-Derestricted-MXFP4_MOE-GGUF noctrex/GLM-4.5-Air-Derestricted-MXFP4_MOE-GGUF]
* [https://old.reddit.com/r/LocalLLaMA/comments/1oypwa7/a_more_surgical_approach_to_abliteration/ A more surgical approach to abliteration]
* [https://www.reddit.com/r/LocalLLaMA/comments/1oymku1/heretic_fully_automatic_censorship_removal_for/ Heretic: fully automatic censorship removal]
* [https://old.reddit.com/r/LocalLLaMA/comments/1ozh8py/mxfp4_hybrid_dense_models_ready_to_share_near/ MXFP4 hybrid dense models ready to share] - magiccodingman

==== Upcoming ====
* [https://github.com/ggml-org/llama.cpp/issues/16186 Qwen3-Omni-30B-A3B]
* [https://qwen.ai/blog?id=qwen3-omni-flash-20251201 Qwen3-Omni-Flash-2025-12-01]
* [https://huggingface.co/inclusionAI/Ring-flash-2.0-GGUF Ring-Flash-2.0]
* [https://old.reddit.com/r/LocalLLaMA/comments/1oh5asg/new_model_from_the_minimax_team_minimaxm2_an/ MiniMax-M2, a 230B-A10B LLM from the MiniMax team] - not sure if this works with llama.cpp yet
* [https://old.reddit.com/r/LocalLLaMA/comments/1ojzekg/moonshotaikimilinear48ba3binstruct_hugging_face/ moonshotai/Kimi-Linear-48B-A3B-Instruct]
* [https://huggingface.co/unsloth/Olmo-3-32B-Think-GGUF Olmo 3]
* [https://github.com/ggml-org/llama.cpp/pull/17420 GigaChat3]
* [https://github.com/ggml-org/llama.cpp/issues/15512 ERNIE-4.5-VL-28B-A3B-Thinking]
* [https://old.reddit.com/r/LocalLLaMA/comments/1p6gsjh/llada20_103b16b_has_been_released/ LLaDA 2.0 (103B/16B) has been released]

==== Other ====
* [https://huggingface.co/ibm-granite/models Granite 4.0]
* LLaDA-MoE-7B-A1B-Instruct
* OLMoE

=== Misc ===
[https://boards.4channel.org/g/thread/93415313#p93421310 llama.cpp server?] - "There was a tokenizer caching error, some people said. Redownload the hf_output files from the repo, or just change the use_cache line in config.json to say <code>"use_cache": true</code>" (for Vicuna13B-free).
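A minimal sketch of that config.json edit, assuming a standard Hugging Face model folder; the path below is illustrative, so point it at whichever directory holds your download:

<pre>
# Flip "use_cache" to true in a model's config.json.
# The path is an assumption; adjust it to your model directory.
import json
from pathlib import Path

cfg_path = Path("models/Vicuna13B-free/config.json")
cfg = json.loads(cfg_path.read_text())
cfg["use_cache"] = True
cfg_path.write_text(json.dumps(cfg, indent=2))
print("use_cache is now", cfg["use_cache"])
</pre>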
* https://github.com/stochasticai/xturing/tree/main/examples/int4_finetuning
* https://wiki.installgentoo.com/wiki/Home_server#Expanding_Your_Storage
* https://rentry.org/llama-tard-v2
* https://rentry.org/llamaaids
* https://hackmd.io/@reneil1337/alpaca
* https://find.4chan.org/?q=AI+Dynamic+Storytelling+General
* https://find.4chan.org/?q=AI+Chatbot+General
* https://find.4chan.org/?q=%2Flmg%2F (local models general)
* https://boards.4channel.org/g/thread/92400764#p92400764
* <nowiki>https://files.catbox.moe/lvefgy.json</nowiki>
* https://pytorch.org/hub/nvidia_deeplearningexamples_tacotron2/

Running in 4-bit mode (see https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#4-bit-mode):
<blockquote>python server.py --model llama-7b-4bit --wbits 4
python server.py --model llama-13b-4bit-128g --wbits 4 --groupsize 128</blockquote>

* https://github.com/qwopqwop200/GPTQ-for-LLaMa/issues/59 - for installing when you hit an out-of-space error
* https://github.com/pybind/pybind11/discussions/4566
* https://lmsysvicuna.miraheze.org/wiki/How_to_use_Vicuna#Use_with_llama.cpp%3A
* https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g
* [https://huggingface.co/ShreyasBrill/Vicuna-13B Uncensored Vicuna model, trained on the dataset with the moralizing responses stripped out; unfortunately only the CPU-quantized version is available]
* [https://www.reddit.com/r/Oobabooga/comments/12hyini/vicuna_generating_its_own_prompts/jfrtvh3/ Vicuna generating its own prompts]
* https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g - <code>python3 llama.py vicuna-AlekseyKorshuk-7B c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors vicuna-AlekseyKorshuk-7B-GPTQ-4bit-128g.safetensors</code>
* [https://github.com/ggerganov/llama.cpp/pull/933 ~65% speedup of the AVX-512 implementation of <code>ggml_vec_dot_q4_0()</code>] (#933)
* "Speaking of which, for any 30B anons struggling with context size, I figured something out. If you use the Triton branch on WSL, go into GPTQ_loader.py and comment out make_quant_attn" ([https://boards.4channel.org/g/thread/92835207#p92837143 from here]); see the sketch after this list.
* [https://boards.4channel.org/g/thread/92842505#p92845181 Quantizing a 30B model yourself]: grab the CUDA branch of qwop's GPTQ-for-LLaMa (or the Triton branch), or, if you have the webui installed, go into its GPTQ folder. Make sure all the requirements are installed and run <code>python llama.py /path/to/30b c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors alpacino-4bit-128g.safetensors</code>. On Windows, flip the slashes the other way; on Linux you may need to prefix the command with <code>CUDA_VISIBLE_DEVICES=0</code>.
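A rough sketch of the GPTQ_loader.py edit mentioned above, assuming the Triton-branch loader calls <code>make_quant_attn()</code> right after the quantized model is loaded; the function names, arguments, and surrounding lines are illustrative and will differ between checkouts:

<pre>
# Hypothetical excerpt from GPTQ_loader.py on the Triton branch (under WSL).
# Your actual file will look different; the point is only to skip the
# fused-attention pass, which the post above reports helps with context size.
model = load_quant(model_path, checkpoint_path, 4, 128)  # wbits=4, groupsize=128

# make_quant_attn(model)  # commented out per the tip above

model = model.to("cuda")
</pre>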