Pre-training multi-billion parameter LLMs on a single GPU with Flora
Blog
First, we show how to incorporate Flora into code. Second, we give a high-level overview of how Flora works. Third, we provide benchmark training results. Finally, we compare Flora to the subsequent and closely related GaLore method.
Leading in artificial intelligence through education
News
RBC recently piloted Leading in Artificial Intelligence (AI), a new one-day training course and the first of its kind to be offered to senior executives, operating committees, and members of the RBC Board of Directors in both Canada and the United States.
Neural Tangent Kernel Applications
In this research tutorial, we consider the implications of the neural tangent kernel (NTK). This edition is part III in our series.