Falcon LLM
An open-source model launched by Abu Dhabi’s Technology Innovation Institute (TII)
About Falcon LLM
Falcon-40B is a foundational LLM with 40B parameters, trained on one trillion tokens. It is an autoregressive decoder-only model, meaning it is trained to predict the next token in a sequence given all the previous tokens. The GPT models are a well-known example of this architecture.
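To make "predict the next token given the previous tokens" concrete, here is a minimal toy sketch of greedy autoregressive decoding. The vocabulary and the bigram lookup table are made-up illustrations; a real decoder-only model like Falcon replaces the table lookup with a transformer forward pass that conditions on the entire prefix.

```python
# Toy autoregressive decoding loop: generate one token at a time,
# feeding the growing sequence back in as context each step.

# Hypothetical next-token table (illustrative only). A real LLM scores
# the whole vocabulary from the full prefix, not just the last token.
NEXT_TOKEN = {
    "<bos>": "the",
    "the": "falcon",
    "falcon": "flies",
    "flies": "high",
    "high": "<eos>",
}

def generate(prompt, max_new_tokens=10):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        next_tok = NEXT_TOKEN[tokens[-1]]  # the "model's" prediction
        tokens.append(next_tok)            # append and re-condition on it
        if next_tok == "<eos>":
            break
    return tokens

print(generate(["<bos>"]))
# ['<bos>', 'the', 'falcon', 'flies', 'high', '<eos>']
```

The key property shown here is the loop itself: each new token becomes part of the context for the next prediction, which is exactly what "autoregressive" means.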
TII also offers a smaller version, Falcon-7B, with 7B parameters trained on 1,500B tokens. Falcon-40B-Instruct and Falcon-7B-Instruct models are available as well, if you are looking for a ready-to-use chat model.
The architecture of Falcon has been shown to significantly outperform GPT-3 using only 75% of the training compute budget, while requiring only a fifth of the compute at inference time. Falcon was developed using specialized tools and incorporates a unique data pipeline capable of extracting valuable content from web data. The pipeline was designed to extract high-quality content by employing extensive filtering and deduplication techniques.
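The filtering-and-deduplication idea can be sketched in a few lines. This is a minimal illustration of the general technique, not TII's actual pipeline: the length threshold is an assumed heuristic, and real pipelines also apply fuzzy deduplication and much richer quality filters.

```python
import hashlib

def clean_corpus(docs, min_words=5):
    """Keep documents that pass a simple quality filter and are not
    exact duplicates of something already kept.

    min_words is an illustrative threshold, not a value from Falcon's pipeline.
    """
    seen = set()
    kept = []
    for doc in docs:
        text = doc.strip()
        # Quality filter: drop very short documents (likely boilerplate).
        if len(text.split()) < min_words:
            continue
        # Exact deduplication via a content hash.
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        kept.append(text)
    return kept

corpus = [
    "The falcon is a bird of prey found on every continent except Antarctica.",
    "The falcon is a bird of prey found on every continent except Antarctica.",  # duplicate
    "Click here to subscribe!",  # too short, filtered out
    "Large language models are trained on vast amounts of filtered web text.",
]
print(clean_corpus(corpus))  # keeps 2 of the 4 documents
```

At web scale the same two steps are run over billions of pages, which is why careful filtering and deduplication can matter as much as model architecture for final quality.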
Sources:
- https://www.kdnuggets.com/2023/06/falcon-llm-new-king-llms.html
- https://www.packtpub.com/article-hub/falcon-llm-the-dark-horse-in-open-source-llm-race