Implementation of RISC Processor for DSP Accelerator Architecture Exploiting Carry Save Arithmetic

Lanke Kalyani, A Sowjanya

Abstract


Hardware acceleration has proved to be an extremely promising implementation strategy for the digital signal processing (DSP) domain. Rather than adopting a monolithic application-specific integrated circuit design approach, in this brief we present a novel accelerator architecture comprising flexible computational units that support the execution of a large set of operation templates found in DSP kernels. We differentiate from previous works on flexible accelerators by enabling computations to be aggressively performed with carry-save (CS) formatted data. Advanced arithmetic design concepts, i.e., recoding techniques, are utilized, enabling CS optimizations to be performed in a larger scope than in previous approaches. Extensive experimental evaluations show that the proposed accelerator architecture delivers average gains of up to 61.91% in area-delay product and 54.43% in energy consumption compared with state-of-the-art flexible datapaths. Whereas the earlier work concentrates on 16-bit operations, the scheme proposed here focuses on 32-bit operations. Hardware acceleration refers to the use of dedicated computer hardware to perform certain functions faster than is possible in software running on a general-purpose CPU. RISC, or Reduced Instruction Set Computer, is a design philosophy that has become mainstream in scientific and engineering applications. The main objective of this paper is to design and implement a 32-bit RISC processor for a flexible DSP accelerator architecture. The design is intended to improve the speed and overall performance of the processor. The most important feature of the RISC processor is that it is very simple and supports a load/store architecture. Its main components are the Arithmetic Logic Unit, Shifter, Rotator and Control Unit. The functionality of each module is analyzed, together with performance measures such as area, power dissipation and propagation delay.
The design therefore addresses key constraints such as the complexity of the instruction set; reducing that complexity lowers the space, time, cost, power, heat and other resources needed to implement the instruction-set portion of a processor. As execution time decreases, execution speed increases correspondingly.
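
The idea behind carry-save (CS) formatted data can be illustrated with a short software sketch. It is not the authors' hardware design; it only models the general principle that three operands can be reduced to a sum word and a carry word without propagating carries, so that a single carry-propagate addition is needed at the very end. The 32-bit width follows the abstract, while the names `MASK32`, `carry_save_add`, `resolve` and `cs_sum` are hypothetical and chosen for illustration.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit datapath width, as stated in the abstract


def carry_save_add(a, b, c):
    """3:2 compression: reduce three operands to a (sum, carry) pair.

    Each bit position is handled independently, so no carry ripples
    across the word in this step.
    """
    s = (a ^ b ^ c) & MASK32                            # bitwise sum
    cy = (((a & b) | (a & c) | (b & c)) << 1) & MASK32  # majority bits, shifted left
    return s, cy


def resolve(s, cy):
    """Convert a CS pair to ordinary binary with one carry-propagate addition."""
    return (s + cy) & MASK32


def cs_sum(operands):
    """Accumulate operands in CS form and resolve only once at the end."""
    s, cy = 0, 0
    for x in operands:
        s, cy = carry_save_add(s, cy, x & MASK32)
    return resolve(s, cy)


if __name__ == "__main__":
    values = [1, 2, 3, 4]
    print(cs_sum(values) == (sum(values) & MASK32))
```

Keeping intermediate results as a (sum, carry) pair is what allows chained additions to avoid the delay of a full carry-propagate adder at every step; in this sketch the result is taken modulo 2^32 to mimic a fixed-width datapath.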



Keywords


Hardware Acceleration, Reduced Instruction Set Computer (RISC), Carry Save formatted Data.




Copyright © 2013, All rights reserved.| ijseat.com

Creative Commons License
International Journal of Science Engineering and Advance Technology is licensed under a Creative Commons Attribution 3.0 Unported License. Based on a work at IJSEat. Permissions beyond the scope of this license may be available at http://creativecommons.org/licenses/by/3.0/deed.en_GB.