ARM Details Project Trillium MLP Architecture

Stefan Mileschin · 25th May 2018, 08:06

Arm first announced its Project Trillium machine learning IPs back in February, and we were promised we’d hear more about the product in a few months’ time. Project Trillium is unusual for Arm to talk about because the IP hasn’t been finalised yet and won’t be finished until this summer, yet Arm made sure not to miss out on the machine learning and AI “hype train” that has been running over the last 8 months in the semiconductor industry and particularly in the mobile industry.

Today Arm details more of the architecture of what it now seems to more consistently call its “machine learning processor”, or MLP from here on. The MLP IP started off as a blank sheet in terms of architecture implementation, and the team consists of engineers pulled from the CPU and GPU teams.

With the MLP, Arm set out to provide three key aspects that are demanded in machine learning IPs: efficient convolutional computations, efficient data movement, and sufficient programmability. From a high-level perspective the MLP seems no different from many other neural network accelerator IPs out there. It still has a set of MAC engines for the raw computational power, while offering some sort of programmable control flow block alongside a sufficiently robust memory subsystem.
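To make the role of the MAC engines concrete, here is a minimal Python sketch of why convolutional layers reduce to multiply-accumulate (MAC) operations. It is purely illustrative and does not describe Arm's implementation: the function name `conv2d_mac`, the pure-Python loop structure, and the tiny example inputs are all assumptions made for this example.

```python
def conv2d_mac(feature_map, kernel):
    """Illustrative 'valid' 2D convolution (cross-correlation, as used in CNNs).

    Hypothetical helper, not Arm's design: it only shows that each output
    value is a chain of multiply-accumulate steps.
    """
    fh, fw = len(feature_map), len(feature_map[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = fh - kh + 1, fw - kw + 1
    output = [[0] * out_w for _ in range(out_h)]
    mac_count = 0  # number of multiply-accumulate operations performed

    for y in range(out_h):
        for x in range(out_w):
            acc = 0  # accumulator for one output value
            for ky in range(kh):
                for kx in range(kw):
                    # One MAC: multiply an input value by a weight, add to the accumulator
                    acc += feature_map[y + ky][x + kx] * kernel[ky][kx]
                    mac_count += 1
            output[y][x] = acc
    return output, mac_count


if __name__ == "__main__":
    fmap = [[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]]
    weights = [[1, 0],
               [0, 1]]
    result, macs = conv2d_mac(fmap, weights)
    print(result, macs)
```

The MAC count grows as output height × output width × kernel height × kernel width (and, in real networks, also with the channel counts), which is why accelerators of this kind dedicate most of their compute to MAC units.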

Starting off with a more detailed view of the IP’s block diagram, the MLP consists of common functional blocks such as the memory interconnect interfaces as well as a DMA engine. In the above graphic we see a portrayal of the data flow (green arrows) and control flow (red arrows) throughout the functional blocks of the processor. The SRAM is a common block of the MLP, sized at 1MB, which serves as the local buffer for computations done by the compute engines. The compute engines each contain fixed-function blocks which operate on the various layers of the neural network model, such as input feature map read blocks which pass on control information to a weight decoder.

https://www.anandtech.com/show/12791...p-architecture

