Publication
Parallel dot-products for deep learning on FPGA
| Abstract: | Deep neural networks have recently shown great results across a wide range of image applications. The associated deep learning models are computationally very demanding and, therefore, several hardware solutions have been proposed to accelerate their computation. FPGAs have recently shown very good performance for this kind of application and are therefore considered a promising platform to accelerate the execution of deep learning algorithms. A common operation in these algorithms is multiply-accumulate (MACC), which is used to calculate dot products. Since many dot products can be calculated in parallel, as long as memory bandwidth is available, it is very important to implement this operation efficiently to increase the density of MACC units in an FPGA. In this paper, we propose an implementation of parallel MACC units in FPGA for dot-product operations with very high performance/area ratios using a mix of DSP blocks and LUTs. We consider 8-bit fixed-point representations, but the method can be applied to other bit widths. The method allows us to achieve performance in the TOPS (tera-operations per second) range, even on low-cost FPGAs. |
|---|---|
| Main authors: | Véstias, Mário |
| Other authors: | Duarte, Rui; De Sousa, Jose; Cláudio de Campos Neto, Horácio |
| Subject: | Multiply-accumulate (Multiplicar-acumular); Deep learning; FPGA |
| Year: | 2017 |
| Country: | Portugal |
| Document type: | conference paper |
| Access type: | restricted access |
| Associated institution: | Instituto Politécnico de Lisboa |
| Language: | English |
| Source: | Repositório Científico do Instituto Politécnico de Lisboa |
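
As an illustration of the multiply-accumulate operation named in the abstract, the following minimal Python sketch computes a dot product over 8-bit fixed-point values. It is not the paper's FPGA implementation: the Q0.7 number format and the helper names `to_fixed`, `macc_dot`, and `from_fixed` are assumptions made only for this example.

```python
from __future__ import annotations


def to_fixed(x: float, frac_bits: int = 7) -> int:
    """Quantize x to a signed 8-bit fixed-point integer (assumed Q0.7 format)."""
    scaled = round(x * (1 << frac_bits))
    return max(-128, min(127, scaled))  # saturate to the signed 8-bit range


def macc_dot(a: list[int], b: list[int]) -> int:
    """Dot product computed as a chain of multiply-accumulate (MACC) steps."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    acc = 0  # in hardware the accumulator must be wider than the 16-bit products
    for x, y in zip(a, b):
        acc += x * y  # one MACC step per pair of elements
    return acc


def from_fixed(acc: int, frac_bits: int = 7) -> float:
    """Convert an accumulated sum of products back to a real value.

    Each product of two Q0.7 values carries 2 * frac_bits fractional bits.
    """
    return acc / (1 << (2 * frac_bits))
```

Each loop iteration corresponds to one MACC step; because the iterations of separate dot products are independent of each other, many such units can run in parallel, which is the property the abstract relies on.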
Related records
article Efficient design of pruned convolutional neural networks on FPGA
by: Véstias, Mário
Published: (2020)
groups Hybrid dot-product calculation for convolutional neural networks in FPGA
by: Véstias, Mário
Published: (2019)
school Real-time implementation of 3D LiDAR point cloud semantic segmentation in an FPGA
by: Delgado, Pedro Paulo Fontes
Published: (2023)
article A fast and scalable architecture to run convolutional neural networks in low density FPGAs
by: Véstias, Mário
Published: (2020)
article Moving deep learning to the edge
by: Véstias, Mário
Published: (2020)
groups Design of a Multiband Full-Rate Ultra-Wideband Receiver in FPGA
by: Véstias, Mário
Published: (2013)
groups Trends Of CPU, GPU and FPGA for high-performance computing
by: Véstias, Mário
Published: (2014)
article Fast convolutional neural networks in low density FPGAs using zero-skipping and weight pruning
by: Véstias, Mário
Published: (2019)
article Decimal multiplication in FPGA with a novel decimal adder/subtractor
by: Véstias, Mário
Published: (2021)
groups A many-core co-processor for embedded parallel computing on FPGA
by: José, Wilson
Published: (2015)
article Improving the area of fast parallel decimal multipliers
by: Véstias, Mário
Published: (2018)
groups Efficient Implementation Of A Single-Precision Floating-Point Arithmetic Unit on FPGA
by: José, Wilson
Published: (2014)
article Smart embedded system for skin cancer classification
by: Durães, Pedro F. F.
Published: (2023)
article FPGA controlled MEMS inclinometer
by: Alves, F. S.
Published: (2013)
article Pipelined FPGA coprocessor for elliptic curve cryptography based on residue number system
by: Miguens Matutino, Pedro
Published: (2017)
article Trusted execution environments leveraging reconfigurable FPGA technology
by: Pereira, Sérgio Augusto Gomes
Published: (2022)
groups Using dynamic reconfiguration to reduce the area of a JPEG decoder on FPGA
by: Rodrigues, Tiago
Published: (2015)
school Distributed deep learning for sleep apnea detection on ECG signals
by: Machado, Ana Margarida da Silva
Published: (2020)
article A FPGA based C runtime hardware accelerator
by: Garcia, Paulo
Published: (2011)
article Revisiting a macroeconomic controversy: the case of the multiplier–accelerator effect
by: Mourão, Paulo
Published: (2022)
school Many-core approach to 2D-DCT calculation using an FPGA
by: Mália, Wilson Alexandre Borges
Published: (2014)
article ZX fusion: a ZX spectrum implementation on an FPGA with modern peripherals
by: Jacinto, Gustavo
Published: (2024)
school Realização de um ZX spectrum em FPGA
by: Jacinto, Gustavo Dinis Venturinha Cercas Lopes
Published: (2023)
school Deep learning aplicado aos videojogos
by: Moreno, João Paulo Henriques
Published: (2023)
school Codificador JPEG baseado em FPGA
by: Brilhante, André Miguel de Sousa
Published: (2012)
article Combining YOLO and deep reinforcement learning for autonomous driving in public roadworks scenarios
by: Andrade, Nuno
Published: (2022)
school Development of a spectrum analyzer based on FPGA
by: Ferreira, André Tiago Pereira Almendra
Published: (2023)
school Acceleration on FPGA of an SVM classifier for road condition sensor
by: Alves, Marcelo Quintela
Published: (2020)
article Deepmol: an automated machine and deep learning framework for computational chemistry
by: Correia, João
Published: (2024)
article Development of deep learning approaches to predict relationships between chemical structures and sweetness
by: Capela, João
Published: (2022)
article Deep learning searches for vector-like leptons at the LHC and electron/muon colliders
by: Morais, António P.
Published: (2023)
category A single chip FPGA-based cross-coupling multi-motor drive system
by: Amornwongpeeti, Sarayut
Published: (2015)
groups FPGA-based architecture for hyperspectral endmember extraction
by: Rosário, João
Published: (2014)
article Towards an FPGA-based edge device for the internet of things
by: Pinto, Sandro
Published: (2015)
article Hyperspectral compressive sensing with a system-on-chip FPGA
by: Nascimento, Jose
Published: (2020)
article Forecast in the pharmaceutical area – Statistic models vs deep learning
by: Ferreira, Raquel
Published: (2018)
article A deep learning approach to identify not suitable for work images
by: Bicho, Daniel
Published: (2020)
article Synthesizable and prototypic visual-tactile system-in FPGA: an alternative to analysis and improvement of the voice quality for the hearing impaired people
by: Alves, R. L.
Published: (2016)
article Deep learning for drug response prediction in cancer
by: Baptista, Delora
Published: (2021)
school Identification and classification of transporter proteins using deep learning models
by: Silva, Andrea Ferreira Meireles
Published: (2019)