RedPajama is “a project to create leading open-source models”, which “starts by reproducing the LLaMA training dataset of over 1.2 trillion tokens”. It is a collaboration between Together, Ontocord.ai, ETH DS3Lab, Stanford CRFM, …
What's in the RedPajama-Data-1T LLM training set
togethercomputer/RedPajama-Data-V2 · Datasets at Hugging Face
LLM360, A true Open Source LLM
Inside language models (from GPT to Olympus) – Dr Alan D. Thompson
65-Billion-Parameter Large Model Pretraining Accelerated by 38
GitHub - togethercomputer/RedPajama-Data
Web LLM runs the vicuna-7b Large Language Model entirely in your browser
Bringing LLM Fine-Tuning and RLHF to Everyone