Blog

Falcon 180B: The largest openly available language model

Falcon 180B sets a new state-of-the-art for open models. It is the largest openly available language model, with 180 billion parameters, trained on a massive 3.5 trillion tokens. It tops the leaderboard for (pre-trained) open-access models and rivals proprietary models like PaLM-2. Although definitive rankings are difficult at this early stage, it is generally considered on par with PaLM-2 Large, making Falcon 180B one of the most capable LLMs publicly known.
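To give a concrete sense of what "openly available" means here, the sketch below loads the model through the Hugging Face transformers library. The model ID tiiuae/falcon-180B and the generation settings are our assumptions for illustration, and the full bfloat16 checkpoint needs several high-memory GPUs, so treat this as illustrative rather than something to run on a laptop.

```python
# A minimal sketch of loading Falcon 180B from the Hugging Face Hub.
# Assumes the `transformers` and `torch` packages and access to the
# "tiiuae/falcon-180B" checkpoint; the bf16 weights require multiple
# high-memory GPUs, so this is illustrative rather than practical on
# most hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-180B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halve memory vs. fp32
    device_map="auto",           # shard layers across available GPUs
)

inputs = tokenizer("Falcon 180B is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```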

Read more

State of GPT

Learn about the training pipeline of GPT assistants like ChatGPT, from tokenization to pretraining, supervised finetuning, and Reinforcement Learning from Human Feedback (RLHF). Then dive deeper into practical techniques and mental models for using these models effectively, including prompting strategies, finetuning, the rapidly growing ecosystem of tools, and future extensions.
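As a concrete taste of the pipeline's first stage, the sketch below runs byte-pair-encoding tokenization with the tiktoken library; the library choice and the cl100k_base encoding are our assumptions for illustration, not material from the talk itself.

```python
# A small sketch of the tokenization stage described above, using the
# `tiktoken` library (an assumption; the talk ships no code).
# Byte-pair encoding maps raw text to the integer token IDs that a GPT
# model is actually trained on.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # BPE vocabulary used by recent OpenAI models

text = "Reinforcement Learning from Human Feedback"
token_ids = enc.encode(text)

print(token_ids)                 # a short list of integer IDs
print(enc.decode(token_ids))     # round-trips back to the original text
print(len(token_ids), "tokens")  # models see tokens, not characters
```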

Read more

OpenAI Whisper: A Robust and Versatile Speech Recognition System

Whisper is an automatic speech recognition (ASR) system trained on a massive 680,000-hour multilingual and multitask dataset collected from the web. This extensive and diverse dataset makes Whisper robust to accents, background noise, and technical language. It also enables transcription in multiple languages, as well as translation from those languages into English. OpenAI has open-sourced the models and inference code to provide a foundation for building practical applications and for further research on robust speech processing.
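Since the models and inference code are open-sourced, here is a minimal sketch of transcription and English translation with the openai-whisper Python package; the "audio.mp3" input is a hypothetical file, and ffmpeg must be installed for audio decoding.

```python
# A minimal sketch using the open-sourced `openai-whisper` package
# (pip install openai-whisper; ffmpeg must be on PATH).
# "audio.mp3" is a hypothetical input file.
import whisper

model = whisper.load_model("base")  # smaller checkpoints trade accuracy for speed

# Transcribe in the audio's original language.
result = model.transcribe("audio.mp3")
print(result["text"])

# Translate non-English speech directly into English.
translated = model.transcribe("audio.mp3", task="translate")
print(translated["text"])
```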

Read more