UAE’s Technology Innovation Institute launches open source Falcon 40B LLM for research and commercial use

Dr. Ray O. Johnson, Chief Executive Officer of Technology Innovation Institute

The Technology Innovation Institute (TII), a global scientific research center and the applied research pillar of Abu Dhabi’s Advanced Technology Research Council (ATRC), strengthened its growing global influence in the field of artificial intelligence by announcing that “Falcon 40B,” the UAE’s first large-scale AI model, is now open source for research and commercial use.

What does this mean for the industry?

This pioneering move shows Abu Dhabi’s commitment to fostering collaboration across sectors and driving advancements in generative AI. Falcon, a foundational large language model (LLM) with 40 billion parameters, trained on one trillion tokens, grants unprecedented access to researchers and small and medium-sized enterprise (SME) innovators alike.

TII is providing access to the model’s weights as a more comprehensive open-source package, with the aim of enabling access to powerful LLM capabilities, promoting transparency and accountability, and supporting innovation and research in the field.

In the current AI ecosystem, developers find large language models that provide access to their weights more appealing, because open weights make fine-tuning far easier than it is with closed models. While most large language models have been licensed exclusively for non-commercial use, TII has taken a key step by offering both researchers and commercial users access to the Falcon 40B large language model.

In conjunction with the release of Falcon 40B as an open-source model, TII launched a call for proposals, inviting scientists, researchers, and visionaries who are enthusiastic about harnessing the potential of the foundation model. They are encouraged to contribute their ideas and leverage the model to build inspiring use cases or explore further possibilities for its application to cover areas like engineering, healthcare, sustainability, coding, and much more.

As an incentive for the research proposals, selected projects will receive “training compute power” in the form of investment, enabling innovators to leverage robust computational resources for accelerated data analysis, complex modeling, and new discoveries.

This support will ultimately nurture and accelerate the development of novel ideas, providing the necessary resources to turn them into impactful artificial intelligence solutions with commercial viability and societal benefits. VentureOne, the commercialization arm of ATRC, will facilitate computation power to productize the most innovative solutions.

H.E. Faisal Al Bannai, Secretary General, Advanced Technology Research Council

“Making Falcon 40B open source is a vital step in our commitment to fostering AI innovation. We are disrupting LLM access and enabling researchers and entrepreneurs to come up with the most innovative use cases. We will further support these submissions with computation power as funding through VentureOne, helping to advance a thriving research ecosystem,” said H.E. Faisal Al Bannai, Secretary General, Advanced Technology Research Council (ATRC).

What makes Falcon 40B a unique player in the space?

Falcon, unveiled in March 2023, showcased unique performance and underscored the UAE’s commitment to tech progress. According to Stanford University’s HELM LLM benchmarking tool, Falcon 40B outperformed its counterparts while using significantly less training compute.

Falcon 40B was trained with only 75% of the training compute of OpenAI’s GPT-3, 40% of that of DeepMind’s Chinchilla AI, and 80% of that of Google’s PaLM-62B, a result that substantiates the Technology Innovation Institute’s commitment to advancing developments in generative AI.
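
For readers who want to sanity-check these ratios, the sketch below applies a widely used rule of thumb that estimates training compute as roughly 6 x parameters x training tokens. This is an approximation, not TII's stated methodology. Falcon's figures (40 billion parameters, one trillion tokens) come from this article; the GPT-3 and Chinchilla figures are assumptions taken from their publicly reported training setups, and the function names are illustrative only.

```python
def training_flops(params: float, tokens: float) -> float:
    """Rough training compute using the C ~= 6 * N * D rule of thumb.

    N is the parameter count and D is the number of training tokens.
    This is an approximation, not an exact accounting of real runs.
    """
    return 6.0 * params * tokens


def relative_compute(model_flops: float, baseline_flops: float) -> float:
    """Fraction of the baseline's training compute used by the model."""
    return model_flops / baseline_flops


# Figures stated in the article: 40B parameters, 1T training tokens.
FALCON_40B = training_flops(40e9, 1e12)

# Assumptions (publicly reported, not from the article):
# GPT-3: 175B parameters, about 300B tokens.
# Chinchilla: 70B parameters, about 1.4T tokens.
GPT3 = training_flops(175e9, 300e9)
CHINCHILLA = training_flops(70e9, 1.4e12)

if __name__ == "__main__":
    print(f"Falcon 40B estimate: {FALCON_40B:.2e} FLOPs")
    print(f"vs GPT-3:      {relative_compute(FALCON_40B, GPT3):.0%}")
    print(f"vs Chinchilla: {relative_compute(FALCON_40B, CHINCHILLA):.0%}")
```

Under these assumptions the estimate lands close to the 75% and 40% figures quoted above, which suggests the comparison is based on this kind of parameter-and-token accounting.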

Commenting on the new initiative, Dr. Ray O. Johnson, Chief Executive Officer of TII, said: “Computing power plays a pivotal role in expediting AI system training and enabling faster implementation of use cases. As the new fuel that drives tech innovation, the move to offer such support will be game-changing in enhancing the capabilities of innovators, and enabling them to push the boundaries of their projects to achieve remarkable advancements.”

Falcon 40B is a breakthrough led by the Technology Innovation Institute’s AI and Digital Science Research Center (AIDRC). The same team also launched NOOR, the world’s largest Arabic NLP model, last year and is on track to develop and announce Falcon 180B soon.

Dr. Ebtesam Almazrouei, Director, AI Cross-Center Unit at TII

Dr. Ebtesam Almazrouei, Director, AI Cross-Center Unit, the Technology Innovation Institute, said: “The open-source release of Falcon 40B, 7.5B, and 1.3B parameter artificial intelligence models and our high-quality REFINEDWEB dataset, exemplifies the profound scientific contributions of the UAE. With each breakthrough, we defy limitations, reshape the realm of possibilities, and pave the way for collaborative efforts with transformative impact.”

The United Arab Emirates recently moved up five places to rank as the top Arab country, and 37th out of 166 countries, in the UN Frontier Technologies Readiness Index 2023. Complementing a long list of progressive technology milestones, the open-source generative AI model is set to boost the UAE’s credentials as a mainstream artificial intelligence player.

If you are interested in the Falcon AI models or in submitting to the use case call for proposals, visit the website. The open-sourced Falcon LLMs will be made available under a license built on the principles of the open-source Apache 2.0 software license, which allows for a wide range of free use.

What is the wider industry context of this product news?

The top technical challenges of training large language models (LLMs) in 2023 include:

  • Data availability and quality. LLMs require massive amounts of data to train, and this data must be of high quality. Inadequate data can lead to LLMs that are biased, inaccurate, or simply not very useful.
  • Computational resources. LLMs are computationally expensive to train because they combine billions of parameters with very large datasets. The cost of training an LLM can be prohibitive for many organizations.
  • Model complexity. LLMs are complex models, which makes them difficult to train and understand. This complexity can lead to problems such as overfitting, where a model fits its training data too closely and becomes less accurate on new inputs.
  • Interpretability. It can be difficult to interpret the outputs of an LLM, which makes it harder to understand how the model works and to use it effectively.

Despite these challenges, LLMs are becoming increasingly important tools for a variety of tasks, including natural language processing, machine translation, and text generation. As LLMs become more powerful, they are likely to have a significant impact on international relations. For example, LLMs could be used to:

  • Improve communication and understanding between different cultures. LLMs could be used to translate languages more accurately and to generate text that is tailored to different cultures. This could help to improve communication and understanding between people from different countries.
  • Detect and prevent misinformation. LLMs could be used to identify and flag misinformation, such as fake news and propaganda. This could help to prevent the spread of misinformation and to protect people from harm.
  • Resolve conflicts. LLMs could be used to analyze complex data sets and to identify potential solutions to problems. This could help to resolve conflicts and to improve international relations.

The development of LLMs is a rapidly evolving field, and it is difficult to predict how they will ultimately shape international relations. However, it is clear that LLMs have the potential to play a significant role in the future of international relations.

In addition to the technical challenges mentioned above, there are also a number of ethical challenges associated with the development and use of LLMs. For example, LLMs could be used to generate harmful content, such as hate speech or propaganda. It is important to develop ethical guidelines for the development and use of LLMs in order to mitigate these risks.

Gerald Ainomugisha is a business news reporter and freelance B2B marketer with over 10 years of experience in writing high-converting copy and content for businesses of all kinds, especially SaaS providers in the niches of HR, IT, fintech, eCommerce and web3. Since joining Upwork in 2012 (back when it was still eLance), Gerald A. has delivered great results for hundreds of clients, maintaining a 98% Job Success rate as well as 5+ years of Top Rated Plus rating (and Premium Writers Talent Cloud membership).
