By A Mystery Man Writer
RedPajama, a project to create fully open-source large language models, has released a 1.2-trillion-token dataset that follows the LLaMA recipe.
Open source large language models: Benefits, risks and types - IBM Blog
2023 in science - Wikipedia
RedPajama: New Open-Source LLM Reproducing LLaMA Training Dataset of over 1.2 trillion tokens
RedPajama Reproducing LLaMA🦙 Dataset on 1.2 Trillion Tokens, by Angelina Yang
RedPajama - Llama is getting Open Source!
List of Open Sourced Fine-Tuned Large Language Models (LLM), by Sung Kim
Cloud Intelligence at the speed of 5000 tok/s - with Ce Zhang and Vipul Ved Prakash of Together AI
Llama 2: The New Open LLM SOTA (ft. Nathan Lambert, Matt Bornstein, Anton Troynikov, Russell Kaplan, Whole Mars Catalog et al.)
Why LLaMA-2 is such a Big Deal
GitHub - dsdanielpark/open-llm-datasets: Repository for organizing datasets and papers used in Open LLM.
[2311.17035] Scalable Extraction of Training Data from (Production) Language Models
AI news that caught my attention today [23/4/24] | shanda
Inside language models (from GPT to Olympus) – Dr Alan D. Thompson – Life Architect
Cameron R. Wolfe, Ph.D. on X: LLaMA-2 outlines the remaining limitations of open-source language models well. Put simply, the gap in performance between open-source and proprietary LLMs is largely due to the
Timeline of computing 2020–present - Wikiwand