Dissertations / Theses on the topic 'Application specific processors'
Consult the top 50 dissertations / theses for your research on the topic 'Application specific processors.'
Mutigwe, Charles. "Automatic synthesis of application-specific processors." Thesis, Bloemfontein : Central University of Technology, Free State, 2012. http://hdl.handle.net/11462/163.
This thesis describes a method for the automatic generation of application specific processors. The thesis was organized into three separate but interrelated studies, which together provide: a justification for the method used, a theory that supports the method, and a software application that realizes the method. The first study looked at how modern day microprocessors utilize their hardware resources and it proposed a metric, called core density, for measuring the utilization rate. The core density is a function of the microprocessor's instruction set and the application scheduled to run on that microprocessor. This study concluded that modern day microprocessors use their resources very inefficiently and proposed the use of subset processors to execute the same applications more efficiently. The second study sought to provide a theoretical framework for the use of subset processors by developing a generic formal model of computer architecture. To demonstrate the model's versatility, it was used to describe a number of computer architecture components and entire computing systems. The third study describes the development of a set of software tools that enable the automatic generation of application specific processors. The FiT toolkit automatically generates a unique Hardware Description Language (HDL) description of a processor based on an application binary file and a parameterizable template of a generic microprocessor. Area-optimized and performance-optimized custom soft processors were generated using the FiT toolkit and the utilization of the hardware resources by the custom soft processors was characterized. The FiT toolkit was combined with an ANSI C compiler and a third-party tool for programming field-programmable gate arrays (FPGAs) to create an unconstrained C-to-silicon compiler.
Lau, C. H. "Computational structures for application specific VLSI processors." Thesis, University of Edinburgh, 1989. http://hdl.handle.net/1842/12396.
Nyländen, T. (Teemu). "Application specific programmable processors for reconfigurable self-powered devices." Doctoral thesis, Oulun yliopisto, 2018. http://urn.fi/urn:isbn:9789526218755.
Abstract. The Internet of Things will completely change our living environment in the future. It will enable interactive environments in place of today's passive ones. In addition, our living environment will react to our actions and speech, and also to our emotions. This ubiquitous wireless infrastructure will require unprecedented computing efficiency combined with extreme energy efficiency. Current Internet of Things solutions rely almost entirely on commercial off-the-shelf general-purpose microcontrollers. These, however, are optimized only for low power consumption, not for energy efficiency, let alone for the computing performance that the future Internet of Things will require. Yet the Internet of Things inherently requires application-specific computation, so using general-purpose processors for signal processing tasks is illogical. Using application-specific accelerators for computation instead would likely make it possible to reach the targeted requirement level. This dissertation presents one possible solution for simultaneously achieving low energy consumption, high performance, and flexibility in a cost-effective way, using reconfigurable heterogeneous processor solutions. The work presents a new GPU-style reconfigurable accelerator for the Internet of Things application domain, which can be used with most applications that demand computing performance. The properties of the proposed accelerator are evaluated using two computer vision applications as examples, and it is shown to achieve an excellent combination of energy efficiency and performance.
The accelerator is designed using an efficient and fast hardware/software co-design toolchain that comes close to the ease of development of commercial off-the-shelf processors, which in turn enables cost-effective development and design work.
Glökler, Tilman Meyr Heinrich. "Design of energy-efficient application-specific instruction set processors /." Boston, Mass. [u.a.] : Kluwer Acad. Publ, 2004. http://www.loc.gov/catdir/enhancements/fy0820/2004041376-d.html.
Hautala, I. (Ilkka). "From dataflow models to energy efficient application specific processors." Doctoral thesis, Oulun yliopisto, 2019. http://urn.fi/urn:isbn:9789526223681.
Abstract. The development of wireless networks has created the conditions for several new applications. Social media, streaming services, virtual reality, and the Internet of Things, among others, place diverse requirements on portable and wearable devices relating to functionality, performance, energy consumption, and physical form. One of the biggest challenges is the energy consumption of embedded devices. Efforts to improve device energy efficiency have relied on parallel computing and customized computing resources. This in turn has complicated both hardware and application development, because widely used development tools are based on low-level abstractions and use programming languages originally designed for single-core processors. The adoption of high-level and automated development methods has been slowed by the inadequate performance of the resulting systems and their inefficient use of hardware resources. The dissertation presents a toolchain based on dataflow-based design, intended for implementing energy-efficient signal processing systems. On the hardware side, the design flow presented in the work is based on a customizable and programmable transport-triggered processor template. The proposed design flow allows multiple heterogeneous processor cores and their interconnections to be tailored to the needs of the applications. In the design flow, software is described using high-level dataflow models. This enables software, especially software containing parallel computation, to be mapped automatically onto different multiprocessor systems and speeds up the exploration of different system-level solutions. The usability of the design flow is demonstrated using three different signal processing applications as examples. The results show that the abstraction level of design methods can be raised without a significant loss of performance.
The central application area of the dissertation is video coding. The work presents energy-efficient and reprogrammable processor cores designed for video coding. The solutions are based on using multiple processor cores, exploiting in particular the pipeline parallelism characteristic of video processing algorithms. The power consumption, performance, and area of the processors have been analyzed using simulation models that take into account the placement and routing of logic cells. The proposed application-specific processor solutions offer a new energy-efficient trade-off between conventional programmable processors and hardwired video accelerators.
Sohl, Joar. "Efficient Compilation for Application Specific Instruction set DSP Processors with Multi-bank Memories." Doctoral thesis, Linköpings universitet, Datorteknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-113702.
Franz, Jonathan D. Duren Russell Walker. "An evaluation of CoWare Inc.'s Processor Designer tool suite for the design of embedded processors." Waco, Tex. : Baylor University, 2008. http://hdl.handle.net/2104/5254.
Lim, Wei Ming. "Design of application specific instruction set processors for the domain of GF(2'm)." Thesis, University of Sheffield, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.412439.
Radhakrishnan, Swarnalatha Computer Science & Engineering Faculty of Engineering UNSW. "Heterogeneous multi-pipeline application specific instruction-set processor design and implementation." Awarded by:University of New South Wales. Computer Science and Engineering, 2006. http://handle.unsw.edu.au/1959.4/29161.
Tell, Eric. "Design of Programmable Baseband Processors." Doctoral thesis, Linköping : Univ, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-4377.
Ma, Ning. "Ultra-low-power Design and Implementation of Application-specific Instruction-set Processors for Ubiquitous Sensing and Computing." Doctoral thesis, KTH, Industriell och Medicinsk Elektronik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-174896.
Wahlen, Oliver [Verfasser]. "C Compiler Aided Design of Application-Specific Instruction-Set Processors Using the Machine Description Language LISA / Oliver Wahlen." Aachen : Shaker, 2004. http://d-nb.info/1181603536/34.
Wahlen, Oliver [Verfasser]. "C compiler aided design of application specific instruction set processors using the machine description language LISA / vorgelegt von Oliver Wahlen." Aachen : Shaker, 2004. http://nbn-resolving.de/urn:nbn:de:hbz:82-opus-9116.
Wilczák, Milan. "Ladicí nástroj generických simulátorů mikroprocesorů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237257.
FRANCHINI, Silvia Giuseppina. "Graphic Coprocessors with Native Clifford Algebra Support." Doctoral thesis, Università degli Studi di Palermo, 2009. http://hdl.handle.net/10447/178952.
Bishell, Aaron. "Designing application-specific processors for image processing : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Computer Science, Massey University, Palmerston North, New Zealand." Massey University, 2008. http://hdl.handle.net/10179/1024.
Grant, David. "A Lightweight Processor Core for Application Specific Acceleration." Thesis, University of Waterloo, 2004. http://hdl.handle.net/10012/800.
Mikó, Albert. "Akcelerace aplikací pomocí specializovaných instrukcí." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2016. http://www.nusl.cz/ntk/nusl-255444.
Dolíhal, Luděk. "Testování generovaných překladačů jazyka c pro procesory ve vestavěných systémech." Doctoral thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2017. http://www.nusl.cz/ntk/nusl-412583.
Kreutz, Marcio Eduardo. "Geração de processador para aplicacao especifica." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 1997. http://hdl.handle.net/10183/17752.
This work discusses an application-specific processor architecture based on the MCS8051 microcontroller. This processor is used as a solution for many local industry applications, being the base of dedicated systems. The dedicated 8051 generated should allow complete integration of the system and, with the value added to the chip, reduced costs. The architecture optimization produces a reduced instruction set, made up of the instructions most often used by each application. The main goal of the instruction set optimization concerns the instruction decoders and microcode generators in the control part, because a large area of the processor is needed to implement them. Thus, a reduced instruction set allows area savings, making complete system integration on a chip possible. An ASIP architecture will have a higher cost than the original one. An alternative way to solve this problem is to add value to the chip, creating an Application Specific Integrated System (ASIS). An ASIS can be made at an acceptable cost if it is possible to integrate other circuits onto the chip without increasing area. This can be done in the area saved by implementing fewer instructions. Because the 8051 is a commercial architecture, there is a large amount of software developed for it. This can be considered an advantage because basic software such as compilers is available and does not need to be created. Another advantage is the large number of engineers trained to use the 8051. To preserve the already developed applications it is necessary to maintain software compatibility. Assembler-level programming is a very tedious and error-prone task, so it is desirable to have software compatibility at higher levels through the use of high-level languages. To create the necessary software compatibility, a C compiler developed for the 8051 was optimized. The C language was chosen because of its wide use.
The optimized C compiler tries to use a reduced instruction set, formed from the most important instructions for each application, in order to save area. When an instruction needs to be used in an application and is not present in the instruction set, the compiler tries to replace it with other instructions. The compiler will not use instructions that are not present in the original 8051 instruction set, so no new instructions are created. To create an instruction set formed from the most important instructions for each application, a static analysis is performed on a precompiled assembler source. An assembler source generated with a reduced instruction set (RISC) will probably have more instructions than the same assembler generated with a full instruction set (CISC). This is explained by the instruction replacements. If one instruction is replaced by two others, and these are from the original instruction set, the time needed to execute them would probably be higher. To deal with this problem, an instruction pipeline was added to the 8051. This work presents Standard Cells and FPGA results of logic synthesis of the optimized architecture. Assembly programs generated by the optimized compiler are also presented.
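The static analysis step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a plain-text assembly listing with one instruction per line, optional `label:` prefixes and `;` comments; the function name `mnemonic_histogram` and the sample listing are hypothetical and are not taken from the thesis, whose actual analysis may work differently.

```python
from collections import Counter

def mnemonic_histogram(asm_source: str) -> Counter:
    """Count how often each mnemonic appears in an assembly listing.

    Assumption: one instruction per line, ';' starts a comment, and an
    optional 'label:' may precede the instruction. Assembler directives
    are not filtered out in this sketch.
    """
    counts = Counter()
    for line in asm_source.splitlines():
        code = line.split(";", 1)[0].strip()      # drop trailing comment
        if ":" in code:                           # drop a leading label
            code = code.split(":", 1)[1].strip()
        if not code:
            continue
        counts[code.split()[0].upper()] += 1      # first token is the mnemonic
    return counts

# Hypothetical 8051-style fragment used only for illustration.
SAMPLE = """
start:  MOV A, #10      ; load counter
loop:   DEC A
        JNZ loop
        MOV R0, A
        RET
"""

hist = mnemonic_histogram(SAMPLE)
reduced_set = sorted(hist)   # mnemonics the application actually uses
```

The sorted key list is one simple candidate for the application's reduced instruction set; the frequency counts could additionally be used to rank which instructions are worth keeping in hardware.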
McMullin, John Derek. "Accelerating the parsing process with an application specific VLSI RISC processor." Thesis, University of Central Lancashire, 1997. http://clok.uclan.ac.uk/8694/.
Petrov, Peter. "Application specific embedded processor customizations for low power and high performance /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2004. http://wwwlib.umi.com/cr/ucsd/fullcit?p3137218.
Full textPackiaraj, Vivek. "Study, Design and Implementation of an Application Specific Instruction Set Processor for a Specific DSP Task." Thesis, Linköping University, Electronics System, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-52314.
There is a lot of literature already available describing a well-structured approach to the embedded design and implementation of an Application Specific Integrated Processor (ASIP) microprocessor core.
This concept features a hardware-structured approach to implementing a processor core, from a minimal instruction set, encoding standards, hardware mapping, microarchitecture design, and coding conventions to RTL, verification, and burning into an FPGA. The goal is to design an ASIP processor core (microarchitecture design and RTL) which can perform a DSP task, e.g., FIR. The report presents a well-structured approach to the design and implementation of an ASIP DSP processor for DSP applications like FIR. It contains the design flow, starting from instruction set design, microarchitecture design, and RTL implementation of the core. Details of the power simulations of the FPGA are also listed and analyzed.
Martin, Rovira Julia, and Fructoso Melero Francisco Manuel. "Micro-Network Processor : A Processor Architecture for Implementing NoC Routers." Thesis, Jönköping University, JTH, Computer and Electrical Engineering, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-941.
Routers are probably the most important component of a NoC, as the performance of the whole network is driven by the routers’ performance. Cost for the whole network in terms of area will also be minimised if the router design is kept small. A new application specific processor architecture for implementing NoC routers is proposed in this master thesis, which will be called µNP (Micro-Network Processor). The aim is to offer a solution in which there is a trade-off between the high performance of routers implemented in hardware and the high level of flexibility that could be achieved by loading a software that routed packets into a GPP. Therefore, a study including the design of a hardware based router and a GPP based router has been conducted. In this project the first version of the µNP has been designed and a complete instruction set, along with some sample programs, is also proposed. The results show that, in the best case for all implementation options, µNP was 7.5 times slower than the hardware based router. It has also behaved more than 100 times faster than the GPP based router, keeping almost the same degree of flexibility for routing purposes within NoC.
Vogt, Timo. "A reconfigurable application-specific instruction-set processor for trellis-based channel decoding /." Kaiserslautern : Techn. Univ. Kaiserslautern, 2008. http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&doc_number=016537958&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA.
Shee, Seng Lin Computer Science & Engineering Faculty of Engineering UNSW. "ADAPT : architectural and design exploration for application specific instruction-set processor technologies." Awarded by:University of New South Wales, 2007. http://handle.unsw.edu.au/1959.4/35404.
Stothard, David. "The development of an application specific processor for the transmission line matrix method." Thesis, Loughborough University, 2000. https://dspace.lboro.ac.uk/2134/14899.
Cheung, Newton Computer Science & Engineering Faculty of Engineering UNSW. "Design automation methodologies for extensible processor platform." Awarded by:University of New South Wales. School of Computer Science and Engineering, 2005. http://handle.unsw.edu.au/1959.4/26118.
Gerlach, Lukas [Verfasser]. "KAVUAKA: A Low-Power Application-Specific Processor Architecture for Digital Hearing Aids / Lukas Gerlach." Hannover : Gottfried Wilhelm Leibniz Universität, 2021. http://d-nb.info/1230550674/34.
Yassin, Yahya H. "ULTRA LOW POWER APPLICATION SPECIFIC INSTRUCTION-SET PROCESSOR DESIGN : for a cardiac beat detector algorithm." Thesis, Norwegian University of Science and Technology, Department of Electronics and Telecommunications, 2009. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9914.
High efficiency and low power consumption are among the main topics in embedded systems today. For complex applications, off-the-shelf processor cores might not meet the desired goals in terms of power consumption. By optimizing the processor for the application, or a set of applications, one could improve the computing power by introducing special purpose hardware units. The execution cycle count of the application would in this case be reduced significantly, and the resulting processor would consume less power. This thesis investigates how to optimize software and hardware development for ultra low power consumption. A cardiac beat detector algorithm is implemented in ANSI C and optimized for low power consumption using several software power optimization techniques. The resulting application is mapped onto a basic processor architecture provided by Target Compiler Technologies. This processor is optimized further for ultra low power consumption by adding application specific hardware and by using several hardware power optimization techniques. A general processor and the optimized processor have been mapped onto a chip, using a 90 nm low power TSMC process. Information about power dissipation is extracted through netlist simulation, and the results of both processors have been compared. The optimized processor consumes 55% less average power, and the duty cycle of the processor, i.e., the time in which the processor executes its task with respect to the time budget available, has been reduced from 14% to 2.8%. The reduction in the total execution cycle count is 81%. The possibilities of applying power gating, or voltage and frequency scaling, are discussed, and it is concluded that further reduction in power consumption is possible by applying these power optimization techniques. For a given case, the average leakage power dissipation is estimated to be reduced by 97.2%.
Schlechter, E. J. (Emile Johan). "Manufacturing intelligence : a dissemination of intelligent manufacturing principles with specific application." Thesis, Stellenbosch : Stellenbosch University, 2002. http://hdl.handle.net/10019.1/52927.
ENGLISH ABSTRACT: Artificial intelligence has provided several techniques with applications in manufacturing. Knowledge based systems, neural networks, case based reasoning, genetic algorithms and fuzzy logic have been successfully employed in manufacturing. This thesis will provide the reader with an introduction to and an understanding of each of these techniques (Chapters 2 and 3). The intelligent manufacturing process can be a complex one and can be decomposed into several components: intelligent design, intelligent process planning, intelligent quality management, intelligent maintenance and diagnosis, intelligent scheduling and intelligent control. This thesis will focus on how each of the artificial intelligence techniques can be applied to each of the manufacturing process fields. Chapters 5 to 10 cover knowledge based systems, neural networks, fuzzy logic, case based reasoning and genetic algorithms. Manufacturing intelligence can be approached from two main directions: theoretical research and practical application. Most of the concepts, methods and techniques discussed in this thesis are approached from a theoretical research point of view. This thesis is also aimed at providing the reader with a broader picture of manufacturing intelligence and how to apply the intelligent techniques, in theory. Specific attention will be given to intelligent scheduling as an application (Chapter 11). The application will demonstrate how case based reasoning can be applied in intelligent scheduling within a small manufacturing plant.
AFRIKAANS SUMMARY: Artificial intelligence offers a variety of techniques and applications in the manufacturing environment. Knowledge based systems, neural networks, case based reasoning, genetic algorithms and fuzzy logic are successfully applied in the manufacturing setting. This thesis gives the reader an introduction to and a basic overview of methods for using each of the techniques (Chapters 2 and 3). The intelligent manufacturing process is a complex process and can be broken down into several components: intelligent design, intelligent process planning, intelligent quality management, intelligent maintenance and diagnosis, intelligent control and intelligent scheduling. This thesis will focus on how each of the artificial intelligence techniques can be applied to each of the manufacturing process fields. Chapters 5 to 10 cover knowledge based systems, fuzzy logic, neural networks, case based reasoning and genetic algorithms. Manufacturing intelligence can be approached from two points of view, namely a theoretical investigation and a practical approach. Most of these concepts, methods and techniques are approached in this thesis from a theoretical point of view. The thesis aims to give the reader a wider perspective on intelligent manufacturing and on how to apply the intelligent techniques in theory. Specific attention will be given to intelligent scheduling as an application (Chapter 11). The application will demonstrate how case based reasoning can be applied in intelligent scheduling.
Lüthje, Olaf [Verfasser]. "A Methodology for Automated Analysis of Application Specific Processor Models with Respect to Test Generation / Olaf Lüthje." Aachen : Shaker, 2005. http://d-nb.info/1181615461/34.
Žádník, Jakub. "Implementation of Fast Fourier Transformation on Transport Triggered Architecture." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2017. http://www.nusl.cz/ntk/nusl-361729.
Bytyn, Andreas [Verfasser], Gerd [Akademischer Betreuer] Ascheid, and Rainer [Akademischer Betreuer] Leupers. "Efficiency and scalability exploration of an application-specific instruction-set processor for deep convolutional neural networks / Andreas Bytyn ; Gerd Ascheid, Rainer Leupers." Aachen : Universitätsbibliothek der RWTH Aachen, 2020. http://d-nb.info/1230325506/34.
Šulek, Jakub. "Verifikace ASIP založena na formálních tvrzeních." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2015. http://www.nusl.cz/ntk/nusl-264941.
Noury, Ludovic. "Contribution à la conception de processeurs d'analyse de signaux à large bande dans le domaine temps-fréquence : l'architecture F-TFR." Paris 6, 2008. http://www.theses.fr/2008PA066206.
Khan, Muhammad Jazib. "Programmable Address Generation Unit for Deep Neural Network Accelerators." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-271884.
Convolutional neural networks are becoming more and more popular due to their applications in revolutionary technologies such as autonomous driving, biomedical imaging, and natural language processing. With this increase in adoption, the complexity of the underlying algorithms is also increasing. This has implications for the computing platforms, such as GPUs and FPGA- or ASIC-based accelerators, and especially for the Address Generation Unit (AGU), which is responsible for memory access. Existing accelerators typically have Parametrizable Datapath AGUs, which have very limited adaptability to evolving algorithms. New hardware is therefore required for new algorithms, which is a very inefficient approach in terms of time, resources, and reusability. In this research, six algorithms with different implications for address generation hardware are evaluated, and a fully programmable AGU (PAGU) is presented that can adapt to these algorithms. These algorithms are Standard, Strided, Dilated, Upsampled, and Padded convolution, and MaxPooling. The proposed AGU architecture is a Very Long Instruction Word based application-specific instruction processor with specialized components such as hardware counters and zero-overhead loops, and a powerful Instruction Set Architecture (ISA) that can model static and dynamic constraints and affine and non-affine address equations. The goal has been to minimize the trade-off of flexibility versus area, power, and performance. For a working test network for semantic segmentation, the results show that the PAGU achieves close to the ideal performance of 1 cycle per address for all algorithms considered except Upsampled Convolution, for which it is 1.7 cycles per address. The area of the PAGU is approximately 4.6 times larger than that of the Parametrizable Datapath approach, which is still reasonable considering the large flexibility benefits.
The potential of the PAGU is not limited to neural network applications; it also extends to more general digital signal processing areas, which can be explored in the future.
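The affine address equations mentioned in the abstract above can be illustrated with a minimal sketch for the one-dimensional case. The function `conv_addresses` and its parameters are hypothetical and are not taken from the thesis; the sketch only shows how stride and dilation enter an affine address expression, not how the PAGU hardware computes it.

```python
def conv_addresses(base, out_len, kernel, stride=1, dilation=1):
    """Yield input addresses in the order a 1-D convolution reads them.

    Assumed affine form: address = base + o * stride + k * dilation,
    where o indexes the output element and k the kernel tap.
    """
    for o in range(out_len):
        for k in range(kernel):
            yield base + o * stride + k * dilation

# Two output elements, three kernel taps, starting at address 0.
standard = list(conv_addresses(base=0, out_len=2, kernel=3))
strided = list(conv_addresses(base=0, out_len=2, kernel=3, stride=2))
dilated = list(conv_addresses(base=0, out_len=2, kernel=3, dilation=2))
```

In this form a standard convolution is simply the special case with stride and dilation both equal to 1, which is why one programmable address generator can in principle cover several of the listed variants.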
Fatmi, Hassane. "Méthodologie d’analyse des signaux et caractérisation hydrogéologique : application aux chroniques de données obtenues aux laboratoires souterrains du Mont Terri, Tournemire et Meuse/Haute-Marne." Thesis, Toulouse, INPT, 2009. http://www.theses.fr/2009INPT020H/document.
This report presents a set of statistical methods for pre-processing and analyzing multivariate hydrogeologic time series, such as pore pressure and its relation to atmospheric pressure. The goal is to study the hydrogeologic characteristics of low permeability geologic formations (argilite) in the context of deep disposal of radioactive waste. The pressure time series are analyzed in relation to different phenomena, such as earth tides, barometric effects, and the evolution of excavated galleries. The pre-processing is necessary for reconstituting and homogenizing the time series in the presence of data gaps, outliers, and variable time steps. The preprocessed signals are then analyzed with a view to characterizing the hydraulic properties of this type of low permeability formation (specific storativity; effective porosity). To this end, we have developed and used the following methods (implemented in Matlab): temporal correlation analyses; spectral/Fourier analyses; multiresolution wavelet analyses; envelopes of random processes. This methodology is applied to data collected at the URL (Underground Research Laboratory) of the Mont Terri International Consortium (Swiss Jura), as well as some other data collected at the URL of IRSN at Tournemire (Aveyron) and at the URL of ANDRA (Meuse / Haute-Marne).
Husár, Adam. "Implementace obecného assembleru." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2007. http://www.nusl.cz/ntk/nusl-412779.
Li, Bo. "Conception et test de cellules de gestion d'énergie à commande numérique en technologies CMOS avancées." Phd thesis, INSA de Lyon, 2012. http://tel.archives-ouvertes.fr/tel-00782429.
Karkhanis, Tejas. "Automated design of application-specific superscalar processors." 2006. http://www.library.wisc.edu/databases/connect/dissertations.html.
Kim, Kyosun. "Automatic synthesis of application-specific programmable processors." 1998. https://scholarworks.umass.edu/dissertations/AAI9909175.
Min, Jae Hong. "Fused floating-point arithmetic for application specific processors." Thesis, 2013. http://hdl.handle.net/2152/23342.
Rajan, Kaushik. "Efficient Cache Organization For Application Specific And General Purpose Processors." Thesis, 2008. http://hdl.handle.net/2005/838.
Κάργας, Χρήστος. "Energy efficient instruction decoding in application: Specific instruction - set processors." Thesis, 2012. http://hdl.handle.net/10889/6295.
With modern processor design technology, a designer can easily design a programmable Application-Specific Instruction-set Processor (ASIP) for a specific range of applications. Several such processors are available for wireless applications, cryptography, and biomedical applications (e.g. for the electrocardiogram beat detection algorithm). In traditional processors and digital signal processors (DSPs), the definition of the instruction set and its complexity have a large impact, especially on power consumption. One possible solution to this problem is orthogonal Very Large Instruction Word (VLIW) processors. The term orthogonal processor denotes a processor with a horizontal instruction set, that is, a processor in which every combination of the available instructions and the addressing modes for accessing memory and the register file can occur. Orthogonal processors do not burden the instruction decoder as much. Instead, the instruction word becomes very large, which shifts the energy cost to the program memory or the instruction cache. For the purposes of this thesis, a SIMD processor was developed and compared with a soft-SIMD processor in order to study the required chip area, performance, and energy consumption for a biomedical application, as well as how the description of a processor in the ASIP processor description language nML determines the generated Hardware Description Language (HDL) code.
This processor is converted into an orthogonal one, and iterative experiments are used to study the effect on energy consumption of changes to the instruction set architecture and to the size of the program memory. The thesis also studies how the designer can exploit restructuring of the instruction set to improve energy consumption.
Chao, Chie-Min, and 趙至敏. "Development of Software Tools for Application-Specific Instruction-set Processors (ASIPs)." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/10553662144219920236.
國立交通大學
電子工程系所
93
Programmable processors are increasingly attractive as a way to amortize manufacturing costs and design effort as system complexity grows. In addition, to satisfy the tight design constraints of today’s embedded systems, such as performance and power, processor architectures are becoming more specialized for particular application domains (e.g., the application-specific instruction-set processor, ASIP). This thesis discusses accelerating the system prototyping of new processor cores by reducing software development time. First, we propose a simple and effective high-level language compilation method that encapsulates new processor cores in a compiler-friendly RISC shell. The native code translation from compiled RISC code to the target processor is carried out by cooperating hardware and software. Second, we propose an efficient instruction set simulator (ISS) with a decoupled hazard checker and memory simulator. Simulation time is significantly reduced via native translation, while cycle accuracy is maintained with proper instrumentation. Finally, we have constructed a C compiler with 6.98% hardware overhead and a cycle-accurate ISS with a speedup of 10^2 to 10^4 for a proprietary DSP processor. Moreover, we have developed JPEG and H.264 encoding systems based on these software tools.
Yu, Cheng-Juei, and 余承叡. "Methodologies and Algorithms for High-Level Synthesis of Application Specific Processors." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/19319584832695146324.
國立臺灣大學
電機工程學研究所
99
Growing design complexity has led designers to work at higher levels of abstraction, such as the behavioral description level. The core task in synthesizing a behavioral description is high-level synthesis (HLS), which creates a hardware architecture of datapath elements, control logic, and memory elements in three main steps: resource allocation, binding, and scheduling. Among these, scheduling is considered the most important step in an HLS process. Part of this thesis studies the problem of resource-constrained scheduling (RCS) and proposes two search algorithms that solve the problem exactly in an effective, systematic way. The proposed algorithms reduce the computational effort required to obtain the best schedules on a predefined datapath by effectively pruning the unpromising parts of the search space. Their effectiveness relative to existing approaches, in both time and space, is demonstrated by theorems and related analysis. Furthermore, this thesis presents methodologies that help convert an application specified in the C programming language into a hardware implementation of a custom processor. In addition to the datapaths of standalone functional blocks, which the proposed RCS algorithms schedule effectively, control paths representing function calls are also considered. Based on and extending the concept of hierarchical finite-state machines (HFSMs), a number of built-in HFSM templates are proposed and used as the elementary components of a hardware design. Guidelines for refining a C program are introduced; the refined C functions are compiled into HFSMs, which in turn generate synthesizable hardware description language (HDL) code as the final design. A set of HFSMs is viewed as an intermediate representation between C and HDL and can be functionally simulated. Two modeling levels, cycle-accurate and cycle-approximate, are supported.
At the end of this thesis, experimental results on a series of well-known algorithmic benchmarks demonstrate the effectiveness of the proposed algorithms and approaches against existing ones.
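To make the idea of exact resource-constrained scheduling with search-space pruning concrete, here is a minimal Python sketch. It is not the thesis's algorithm: it assumes unit-latency operations, a single pool of identical functional units, and an acyclic dependence graph, and it uses a plain branch-and-bound with two textbook lower bounds (resource count and critical path). The function name `rcs_min_cycles` and the example graph are hypothetical.

```python
from itertools import combinations


def rcs_min_cycles(preds, units):
    """Exact minimum schedule length (in cycles) for unit-latency operations.

    preds: dict mapping each operation to the list of operations it depends on
           (assumed to form a DAG).
    units: number of identical functional units available per cycle (>= 1).
    """
    ops = list(preds)
    succs = {}
    for op, ps in preds.items():
        for p in ps:
            succs.setdefault(p, []).append(op)

    # Longest dependence chain starting at each op (counting the op itself).
    chain = {}

    def depth(op):
        if op not in chain:
            chain[op] = 1 + max((depth(s) for s in succs.get(op, [])), default=0)
        return chain[op]

    for op in ops:
        depth(op)

    best = [len(ops)]  # a fully serial schedule is always feasible

    def search(done, cycle):
        if len(done) == len(ops):
            best[0] = min(best[0], cycle)
            return
        remaining = [op for op in ops if op not in done]
        resource_bound = -(-len(remaining) // units)  # ceiling division
        chain_bound = max(chain[op] for op in remaining)
        if cycle + max(resource_bound, chain_bound) >= best[0]:
            return  # prune: this branch cannot beat the best known schedule
        ready = [op for op in remaining if all(p in done for p in preds[op])]
        # With unit latencies, leaving a unit idle while an op is ready never
        # helps, so only maximal selections of ready ops are explored.
        k = min(units, len(ready))
        for chosen in combinations(ready, k):
            search(done | set(chosen), cycle + 1)

    search(frozenset(), 0)
    return best[0]


# Hypothetical dataflow graph: (a, b) -> e, (c, d) -> f, (e, f) -> g.
example = {"a": [], "b": [], "c": [], "d": [],
           "e": ["a", "b"], "f": ["c", "d"], "g": ["e", "f"]}
print(rcs_min_cycles(example, 2))
```

On the example graph, two units need four cycles while four units reach the three-cycle critical path. Multi-cycle operations or several unit types would need a richer state and different bounds.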
Chen, Bo-hong, and 陳柏宏. "Task Partition and Scheduling on Multiple Heterogeneous Application Specific Instruction Set Processors." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/68773497398506205745.
逢甲大學
資訊工程所
97
Stream processing applications demand high throughput. Architectures built from multiple heterogeneous Application-Specific Instruction-set Processors (ASIPs) offer not only sufficient throughput but also runtime flexibility, which custom ASIC solutions cannot provide. In this paper, a partitioning and scheduling methodology is proposed that uses both Application-Specific Instructions (ASIs) within each ASIP, to exploit fine-grain data parallelism, and multiple ASIPs, to exploit task and pipeline parallelism. In our design flow, the data dependences of a streaming application are analyzed against all possible ASI candidates to generate the corresponding task graph. The task graph is then partitioned and scheduled onto pipeline stages to meet the desired throughput constraint at lower hardware cost. The experimental results show an average hardware reduction of 8% compared to the results without ASIs.
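The partitioning step can be illustrated with a deliberately simplified Python sketch. It assumes the task graph is a linear chain, that each pipeline stage maps to one processor, and that the throughput target translates into a per-item cycle budget (`period`). The function `partition_chain` and all cycle counts are hypothetical and only show how shorter task times, such as those an ASI might provide, can reduce the number of stages needed.

```python
def partition_chain(task_cycles, period):
    """Greedily pack a chain of tasks into contiguous pipeline stages.

    task_cycles: cycle count of each task, in chain order.
    period: maximum cycles a stage may spend per stream item, derived
            from the throughput target.
    Returns a list of stages, each a list of task indices.
    """
    stages, current, load = [], [], 0
    for index, cycles in enumerate(task_cycles):
        if cycles > period:
            raise ValueError(f"task {index} alone exceeds the period")
        if load + cycles > period:
            # Close the current stage and start a new one with this task.
            stages.append(current)
            current, load = [], 0
        current.append(index)
        load += cycles
    if current:
        stages.append(current)
    return stages


# Hypothetical cycle counts for the same five tasks without and with ASIs.
baseline = [40, 30, 50, 20, 60]
with_asi = [25, 20, 30, 20, 40]
print(len(partition_chain(baseline, 80)), len(partition_chain(with_asi, 80)))
```

With a budget of 80 cycles per item, the baseline chain needs three stages and the accelerated chain needs two. For a chain with contiguous stages this greedy packing yields the fewest stages; a general task graph with heterogeneous processors is a harder problem than this sketch covers.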
Wang, Hui-Shan, and 王惠珊. "Generating and Exploiting Reconfigurable Custom Functional Unit in Application Specific VLIW Processors." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/54856053965819254006.
國立交通大學
資訊科學與工程研究所
97
To improve processor performance, a customized accelerator, the reconfigurable custom functional unit (RCFU), may be added to a very long instruction word (VLIW) processor architecture. The technique generates the RCFU from frequently occurring operation segments and collapses the operation segments that can execute on the RCFU into customized instructions. Instruction scheduling is then performed at compile time to exploit instruction-level parallelism for better performance. In this research, we propose both a tightly coupled RCFU design for the VLIW processor and an algorithm to exploit a processor augmented with an RCFU. We assume that the FUs in the processor pipeline and the RCFU can execute simultaneously, and we integrate the otherwise independent operation mapping and instruction scheduling algorithms into a single phase to obtain greater performance gains and higher hardware utilization. We compared processors with and without an RCFU. Overall, our proposed RCFU design, used with our proposed exploitation algorithm, achieves a large average speedup over previous generation algorithms. Furthermore, the algorithm for exploiting the RCFU also achieves a clear average speedup over previous methods that run the two algorithms separately.
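The notion of collapsing frequent operation segments into customized instructions can be sketched in Python as follows. This is a simplification under stated assumptions: operations form a straight-line sequence of opcode names rather than a dataflow graph, segments have a fixed length, and the names `most_frequent_segment`, `collapse`, and `mac` are hypothetical. It does not reproduce the thesis's generation or scheduling algorithms.

```python
from collections import Counter


def most_frequent_segment(ops, length):
    """Return (segment, count) for the most common run of `length` opcodes."""
    if len(ops) < length:
        raise ValueError("sequence is shorter than the segment length")
    counts = Counter(tuple(ops[i:i + length])
                     for i in range(len(ops) - length + 1))
    return counts.most_common(1)[0]


def collapse(ops, segment, name):
    """Replace non-overlapping occurrences of `segment` with one custom op."""
    segment = tuple(segment)
    out, i = [], 0
    while i < len(ops):
        if tuple(ops[i:i + len(segment)]) == segment:
            out.append(name)      # the whole segment becomes one instruction
            i += len(segment)
        else:
            out.append(ops[i])
            i += 1
    return out


# Hypothetical opcode trace in which multiply-then-add recurs.
trace = ["mul", "add", "shl", "mul", "add", "ld", "mul", "add", "st"]
segment, count = most_frequent_segment(trace, 2)
print(segment, count, collapse(trace, segment, "mac"))
```

In this trace the pair `("mul", "add")` occurs three times, and collapsing it shortens the sequence from nine operations to six. Whether a collapsed segment pays off in practice depends on the custom unit's latency and on scheduling, which this sketch leaves out.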
Li, Gin-Hsuan, and 李京軒. "Application-Specific Instruction-set Processors With Implicit Registers To Improve Register Bandwidth." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/47423386130421458728.
逢甲大學
資訊工程所
98
Application-Specific Instruction-set Processors (ASIPs) have become an important design choice for embedded systems because of their runtime flexibility, which custom ASIC solutions cannot provide. However, the limited register bandwidth of the core processor becomes a potential performance bottleneck. In our design flow, implicit registers (Iregs) and an implicit register allocation algorithm are proposed to improve the data bandwidth of CIs and reduce the number of additional move instructions. The experimental results show an average speedup of 8% (up to 27%) compared to the results without implicit registers.