Falcon 180B: What Is It, and How Good Is It?

The UAE’s Technology Innovation Institute (TII) has unveiled its most advanced large language model (LLM) to date: Falcon 180B. The UAE-based research institute has made the model openly available, subject to its license terms, for both commercial and research applications.

What is Falcon 180B?

Falcon 180B is the latest LLM from TII, following its predecessors, Falcon 7B and Falcon 40B. It is a scaled-up version of Falcon 40B and builds on that model’s design choices, such as multi-query attention, which improves scalability. The model has 180 billion parameters and was trained on 3.5 trillion tokens using up to 4096 GPUs on Amazon SageMaker, for a total of approximately 7 million GPU hours. That makes Falcon 180B 2.5 times larger than Llama 2, and it was trained with four times more compute.
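The scale figures quoted above can be sanity-checked with some quick arithmetic. The following is a minimal Python sketch, not anything published by TII. It assumes that the Llama 2 comparison refers to the 70B variant (the article does not say which), and that all 4096 GPUs ran for the whole job, which gives only a lower bound on wall-clock time because the article says "up to" 4096. The function names are illustrative.

```python
# Figures quoted in the article.
FALCON_PARAMS = 180e9      # 180 billion parameters
TRAINING_TOKENS = 3.5e12   # 3.5 trillion tokens
GPU_HOURS = 7e6            # approximately 7 million GPU hours
MAX_GPUS = 4096            # "up to 4096 GPUs"

# Assumption (not stated in the article): the comparison is with Llama 2 70B.
LLAMA2_PARAMS = 70e9


def size_ratio(params_a: float, params_b: float) -> float:
    """How many times larger model A is than model B, by parameter count."""
    return params_a / params_b


def min_wall_clock_days(gpu_hours: float, gpus: int) -> float:
    """Lower bound on training time in days, assuming every GPU ran throughout."""
    return gpu_hours / gpus / 24


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


if __name__ == "__main__":
    print(f"Size ratio vs Llama 2 70B: {size_ratio(FALCON_PARAMS, LLAMA2_PARAMS):.2f}x")
    print(f"Minimum wall-clock time:   {min_wall_clock_days(GPU_HOURS, MAX_GPUS):.1f} days")
    print(f"Tokens per parameter:      {tokens_per_parameter(TRAINING_TOKENS, FALCON_PARAMS):.1f}")
```

Under these assumptions, the size ratio comes out at about 2.57, close to the quoted 2.5 times, and 7 million GPU hours spread across 4096 GPUs corresponds to at least roughly 71 days of continuous training.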

For a deeper understanding of Falcon 180B’s architecture and capabilities, refer to the official Hugging Face blog post.

Source: https://huggingface.co/blog/falcon

How Good is Falcon 180B?

Performance evaluations place Falcon 180B between OpenAI’s GPT-3.5 and GPT-4, depending on the benchmark used. On MMLU, it outperforms both GPT-3.5 and Llama 2. It is on par with Google’s PaLM 2-Large on LAMBADA, Winogrande, ARC, HellaSwag, WebQuestions, PIQA, CB, RTE, WiC, WSC, BoolQ, COPA, and ReCoRD.

Credit: Hugging Face.

At the time of writing, Falcon 180B scores 68.74 on the Hugging Face Open LLM Leaderboard, narrowly ahead of Llama 2’s 67.35, a margin of 1.39 points. That places it at the top of the openly available models compared here, though the lead is small.