Journal articles: 'Processor Architectures'

1

Page, Ian. "Reconfigurable processor architectures." Microprocessors and Microsystems 20, no. 3 (May 1996): 185–96. http://dx.doi.org/10.1016/0141-9331(95)01076-9.

2

Byrd, G. T., and M. A. Holliday. "Multithreaded processor architectures." IEEE Spectrum 32, no. 8 (1995): 38–46. http://dx.doi.org/10.1109/6.402166.

3

Yantır, Hasan Erdem, Wenzhe Guo, Ahmed M. Eltawil, Fadi J. Kurdahi, and Khaled Nabil Salama. "An Ultra-Area-Efficient 1024-Point In-Memory FFT Processor." Micromachines 10, no. 8 (July 31, 2019): 509. http://dx.doi.org/10.3390/mi10080509.

Abstract:

Current computation architectures rely on more processor-centric design principles. On the other hand, the inevitable increase in the amount of data that applications need forces researchers to design novel processor architectures that are more data-centric. By following this principle, this study proposes an area-efficient Fast Fourier Transform (FFT) processor through in-memory computing. The proposed architecture occupies the smallest footprint of around 0.1 mm² inside its class together with acceptable power efficiency. According to the results, the processor exhibits the highest area efficiency (FFT/s/area) among the existing FFT processors in the current literature.

4

Korolija, Nenad, and Kent Milfeld. "Towards hybrid supercomputing architectures." Journal of Computer and Forensic Sciences 1, no. 1 (2022): 47–54. http://dx.doi.org/10.5937/1-42710.

Abstract:

In light of recent work on combining control-flow and dataflow architectures on the same chip die, a new architecture based on an asymmetric multicore processor is proposed. The control-flow architectures are described as the most commonly used computer architecture today. Both multicore and manycore architectures are explained, as they are based on the same principles. A dataflow computing model assumes that data input flows through hardware as either a software or hardware dataflow implementation. In software dataflow, processors based on the control-flow paradigm process tasks based on their availability from the same queue (if there are any). In hardware dataflow architectures, the hardware is configured for a particular algorithm, data input is streamed into the hardware, and the output is streamed back to the multicore processor for further processing. Hardware dataflow architectures are usually implemented with FPGAs. Hybrid architectures employ asymmetric multicore and manycore computer architectures that are based on the control-flow and hardware dataflow architecture, all combined on the same chip die. Advantages include faster processing time, lower power consumption (and heating), and less space needed for the hardware.

5

Tabak, Daniel. "Microelectronics: Processor architectures I." Microprocessing and Microprogramming 24, no. 1-5 (August 1988): 563. http://dx.doi.org/10.1016/0165-6074(88)90111-1.

6

Tabak, Daniel. "Microelectronics: Processor architectures II." Microprocessing and Microprogramming 24, no. 1-5 (August 1988): 693. http://dx.doi.org/10.1016/0165-6074(88)90131-7.

7

Rao, Wenjing, Alex Orailoglu, and Ramesh Karri. "Towards Nanoelectronics Processor Architectures." Journal of Electronic Testing 23, no. 2-3 (March 20, 2007): 235–54. http://dx.doi.org/10.1007/s10836-006-0555-7.

8

Göhringer, Diana, Thomas Perschke, Michael Hübner, and Jürgen Becker. "A Taxonomy of Reconfigurable Single-/Multiprocessor Systems-on-Chip." International Journal of Reconfigurable Computing 2009 (2009): 1–11. http://dx.doi.org/10.1155/2009/395018.

Abstract:

Runtime adaptivity of hardware in processor architectures is a novel trend, which is under investigation in a variety of research labs all over the world. The runtime exchange of modules, implemented on a reconfigurable hardware, affects the instruction flow (e.g., in reconfigurable instruction set processors) or the data flow, which has a strong impact on the performance of an application. Furthermore, the choice of a certain processor architecture related to the class of target applications is a crucial point in application development. A simple example is the domain of high-performance computing applications found in meteorology or high-energy physics, where vector processors are the optimal choice. A classification scheme for computer systems was provided in 1966 by Flynn where single/multiple data and instruction streams were combined to four types of architectures. This classification is now used as a foundation for an extended classification scheme including runtime adaptivity as further degree of freedom for processor architecture design. The developed scheme is validated by a multiprocessor system implemented on reconfigurable hardware as well as by a classification of existing static and reconfigurable processor systems.
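
The Flynn classification mentioned in the abstract above combines single or multiple instruction streams with single or multiple data streams, giving four classes. A minimal Python sketch of that enumeration follows. The function name `flynn_class` is hypothetical, and the runtime-adaptivity dimension that the paper adds is not modelled here.

```python
from itertools import product


def flynn_class(multiple_instruction: bool, multiple_data: bool) -> str:
    """Return Flynn's label for a combination of instruction and data streams."""
    instruction = "M" if multiple_instruction else "S"
    data = "M" if multiple_data else "S"
    return f"{instruction}I{data}D"


# Enumerate all four classes in the order SISD, SIMD, MISD, MIMD.
FLYNN_CLASSES = [flynn_class(i, d) for i, d in product([False, True], repeat=2)]
```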

9

Bezzubtsev, Stanislav O., Vyacheslav V. Vasin, Dmitry Yu Volkanov, Shynar R. Zhailauova, Vladislav A. Miroshnik, Yuliya A. Skobtsova, and Ruslan L. Smeliansky. "An Approach to the Construction of a Network Processing Unit." Modeling and Analysis of Information Systems 26, no. 1 (March 15, 2019): 39–62. http://dx.doi.org/10.18255/1818-1015-2019-1-39-62.

Abstract:

The paper proposes the architecture and basic requirements for a network processor for OpenFlow switches of software-defined networks. An analysis of the architectures of well-known network processors is presented: NP-5 from EZchip (now Mellanox) and Tofino from Barefoot Networks. The advantages and disadvantages of two different versions of network processor architectures are considered: pipeline-based architecture, the stages of which are represented by a set of general-purpose processor cores, and pipeline-based architecture whose stages correspond to cores specialized for specific packet processing operations. Based on a dedicated set of the most common use case scenarios, a new architecture of the network processor unit (NPU) with functionally specialized pipeline stages was proposed. The article presents a description of the simulation model of the NPU of the proposed architecture. The simulation model of the network processor is implemented in C++ using SystemC, the open-source C++ library. For the functional testing of the obtained NPU model, the described use case scenarios were implemented in C. In order to evaluate the performance of the proposed NPU architecture, a set of software products developed by KM211 company and the KMX32 family of microcontrollers were used. Evaluation of NPU performance was made on the basis of a simulation model. Estimates of the processing time of one packet and the average throughput of the NPU model for each scenario are obtained.

10

Katz, Randy H., and John L. Hennessy. "High Performance Microprocessor Architectures." International Journal of High Speed Electronics and Systems 01, no. 01 (March 1990): 1–17. http://dx.doi.org/10.1142/s0129156490000022.

Abstract:

Single chip processor performance has improved dramatically since the inception of the four-bit microprocessor in 1971. This is due in part to technological advances, (i.e., faster devices and greater device density), but also because of the adoption of architectural approaches well suited to the opportunities and limitations of VLSI. The most appropriate are those that effectively reduce off-chip memory accesses and admit to a regular pipelined implementation. The over-riding goal of pipelining is to achieve “single cycle execution”, i.e., instructions appear to execute in a single processor cycle. Today’s RISC processors are close to realizing this goal, and the next generation will reduce the cycles per instruction even further. In this paper, we will review the design issues and the proposed architectures for high performance VLSI processors.

11

Musoll, Enric, and Mario Nemirovsky. "Design Space Exploration of High-Performance Parallel Architectures." Journal of Integrated Circuits and Systems 3, no. 1 (November 18, 2008): 32–38. http://dx.doi.org/10.29292/jics.v3i1.279.

Abstract:

High-performance single-threaded processors achieve their performance goal partly by relying, among other architectural techniques, on speculation and large on-chip caches. The hardware to support these techniques is usually a large portion of the overall processor real estate, and therefore it consumes a significant amount of power that sometimes is not optimally used toward doing useful work. In this work, we study the intuitive fact that architectures with hardware support for threads are more power efficient than a more traditional single-threaded superscalar architecture. Toward this goal, we have created a model of the power, performance and area of several parallel architectures. This model shows that a parallel architecture can be designed so that (a) it requires less area and power (to reach the same performance), or (b) it achieves better power efficiency and less area (for the same power budget), or (c) it has higher performance and better power efficiency (for the same area constraint), when compared to a single-threaded superscalar architecture.

12

Aamodt, Tor M., Wilson Wai Lun Fung, and Timothy G. Rogers. "General-Purpose Graphics Processor Architectures." Synthesis Lectures on Computer Architecture 13, no. 2 (May 21, 2018): 1–140. http://dx.doi.org/10.2200/s00848ed1v01y201804cac044.

13

Vehlies, Uwe. "Stepwise Transformation of Algorithms into Array Processor Architectures by the DECOMP." VLSI Design 3, no. 1 (January 1, 1995): 67–80. http://dx.doi.org/10.1155/1995/76861.

Abstract:

A formal approach for the transformation of computation intensive digital signal processing algorithms into suitable array processor architectures is presented. It covers the complete design flow from algorithmic specifications in a high-level programming language to architecture descriptions in a hardware description language. The transformation itself is divided into manageable design steps and implemented in the CAD-tool DECOMP which allows the exploration of different architectures in a short time. With the presented approach data independent algorithms can be mapped onto array processor architectures. To allow this, a known mapping methodology for array processor design is extended to handle inhomogeneous dependence graphs with nonregular data dependences. The implementation of the formal approach in the DECOMP is an important step towards design automation for massively parallel systems.

14

Motlagh, Bahman S., and Ronald F. DeMara. "Performance of Scalable Shared-Memory Architectures." Journal of Circuits, Systems and Computers 10, no. 01n02 (February 2000): 1–22. http://dx.doi.org/10.1142/s0218126600000068.

Abstract:

Analytical models were developed and simulations of memory latency were performed for Uniform Memory Access (UMA), Non-Uniform Memory Access (NUMA), Local-Remote-Global (LRG), and RCR architectures for hit rates from 0.1 to 0.9 in steps of 0.1, memory access times of 10 to 100 ns, proportions of read/write access from 0.01 to 0.1, and block sizes of 8 to 64 words. The RCR architecture provides favorable performance over UMA and NUMA architectures for all ranges of application and system parameters. RCR outperforms LRG architectures when the hit rates of the processor cache exceed 80% and replicated memory exceeds 25%. Thus, inclusion of a small replicated memory at each processor significantly reduces expected access time since all replicated memory hits become independent of global traffic. For configurations of up to 32 processors, results show that latency is further reduced by distinguishing burst-mode transfers between isolated memory accesses and those which are incrementally outside the working set.
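
To make the parameter sweep above concrete, the sketch below evaluates the textbook two-level average access time, t_avg = h * t_hit + (1 - h) * t_mem, over the hit rates and memory access times quoted in the abstract. This is not the paper's analytical model, since the UMA, NUMA, LRG and RCR models are not given here. The function name, the 1 ns cache hit time and the 10 ns step between memory access times are assumptions chosen for illustration.

```python
def average_access_time(hit_rate: float, hit_time_ns: float, memory_time_ns: float) -> float:
    """Textbook average access time: hits cost hit_time_ns, misses cost memory_time_ns."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_time_ns + (1.0 - hit_rate) * memory_time_ns


HIT_TIME_NS = 1.0  # assumed cache hit time, not taken from the paper

# Hit rates 0.1..0.9 in steps of 0.1 (from the abstract).
hit_rates = [round(0.1 * k, 1) for k in range(1, 10)]
# Memory access times 10..100 ns (range from the abstract, 10 ns step assumed).
memory_times_ns = range(10, 101, 10)

# Map each (hit rate, memory time) pair to its average access time in ns.
table = {
    (h, t): average_access_time(h, HIT_TIME_NS, t)
    for h in hit_rates
    for t in memory_times_ns
}
```

Under these assumptions, a higher hit rate or a shorter memory access time lowers the average, which is the same qualitative trade-off the abstract reports for replicated-memory hits.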

15

Skvortsov, Leonid Vladlenovich, Roman Vyacheslavovich Baev, Ksenia Yurievna Dolgorukova, and Eugene Yurievich Sharygin. "Developing an LLVM-based compiler for stack based TF16 processor architecture." Proceedings of the Institute for System Programming of the RAS 33, no. 5 (2021): 137–54. http://dx.doi.org/10.15514/ispras-2021-33(5)-8.

Abstract:

Development for stack-based architectures is usually done using legacy low-level languages or assembly code, so there exists a problem of high-level programming language support for such architectures. In this paper we describe the development process of an LLVM/Clang-based C compiler for the stack-based TF16 processor architecture. LLVM was used due to the adaptation possibilities of its components for new architectures, such as the disassembler, linker and debugger. Two compiler versions were developed. The first version generated code without using the stack capabilities of TF16, treating it instead as a register-based architecture. This version was relatively easy to develop and it provided us with a comparison point for the second one. In the second version we implemented a platform-independent stack scheduling algorithm that allowed us to generate code that makes use of the stack capabilities of the CPU. When comparing the two versions, the version that utilized stack capabilities generated code that was on average 35.7% faster and 50.8% smaller than the original version. The developed stack scheduling algorithm also makes it possible to support other stack-based architectures in the LLVM toolchain.

16

Yantır, Hasan Erdem, Ahmed M. Eltawil, and Khaled N. Salama. "Efficient Acceleration of Stencil Applications through In-Memory Computing." Micromachines 11, no. 6 (June 26, 2020): 622. http://dx.doi.org/10.3390/mi11060622.

Abstract:

The traditional computer architectures severely suffer from the bottleneck between the processing elements and memory that is the biggest barrier in front of their scalability. Nevertheless, the amount of data that applications need to process is increasing rapidly, especially after the era of big data and artificial intelligence. This fact forces new constraints in computer architecture design towards more data-centric principles. Therefore, new paradigms such as in-memory and near-memory processors have begun to emerge to counteract the memory bottleneck by bringing memory closer to computation or integrating them. Associative processors are a promising candidate for in-memory computation, which combines the processor and memory in the same location to alleviate the memory bottleneck. One of the applications that need iterative processing of a huge amount of data is stencil codes. Considering this feature, associative processors can provide a paramount advantage for stencil codes. For demonstration, two in-memory associative processor architectures for 2D stencil codes are proposed, implemented by both emerging memristor and traditional SRAM technologies. The proposed architecture achieves a promising efficiency for a variety of stencil applications and thus proves its applicability for scientific stencil computing.

17

Bakó, László, Szabolcs Hajdú, and Fearghal Morgan. "Evaluation and Comparison of Low FPGA Footprint, Embedded Soft-Core Processors." MACRo 2015 2, no. 1 (October 1, 2017): 23–30. http://dx.doi.org/10.1515/macro-2017-0003.

Abstract:

The paper presents three embedded soft-core processor architectures, developed by the authors to be easily implementable while yielding low digital resource usage. These architectures are compared and contrasted with each other by introducing a special testing method, based on control algorithm implementations. For reference, the same testing and comparison has also been implemented on a well-established architecture, the Xilinx PicoBlaze processor. Measurement results and application suggestions are given in the concluding section.

18

Jonckers, N., B. Engelen, K. Appels, S. De Raedemaeker, L. Mariën, and J. Prinzie. "Towards Single-Event Upset detection in Hardware Secure RISC-V processors." Journal of Instrumentation 19, no. 06 (June 1, 2024): C06009. http://dx.doi.org/10.1088/1748-0221/19/06/c06009.

Abstract:

Single-event effects and hardware security show close similarities in terms of vulnerabilities and mitigation techniques. Secure processors address physical attacks from the outside, such as external laser stimulation, to compromise the program and extract sensitive information from the systems. To overcome this vulnerability, secure extensions to the hardware architecture are often built into modern processor cores. Given the limited design resources often found in space or high-energy physics experiment development teams, this article addresses the extent to which secure hardware architectures can be a reliable source of processor SEU detection.

19

Lin, Rong. "Reconfigurable parallel inner product processor architectures." IEEE Transactions on Very Large Scale Integration (VLSI) Systems 9, no. 2 (April 2001): 261–72. http://dx.doi.org/10.1109/92.924037.

20

Ramdas, Tirath, Gregory K. Egan, David Abramson, and Kim K. Baldridge. "ERI sorting for emerging processor architectures." Computer Physics Communications 180, no. 8 (August 2009): 1221–29. http://dx.doi.org/10.1016/j.cpc.2009.01.029.

21

Edwards, Chris. "Processor Makers Embrace DPUs." New Electronics 53, no. 21 (December 8, 2020): 16–17. http://dx.doi.org/10.12968/s0047-9624(22)61661-4.

22

Jain. S, Poonam, Pooja S, Sripath Roy. K, Abhilash K, and Arvind B V. "Implementation of asymmetric processing on multi core processors to implement IOT applications on GNU/Linux framework." International Journal of Engineering & Technology 7, no. 2.7 (March 18, 2018): 710. http://dx.doi.org/10.14419/ijet.v7i2.7.10928.

Abstract:

The Internet of Things has brought bigger computing challenges, creating a need to run tasks where multiple sensors and large-scale data processing are involved. To meet this requirement, multiprocessors are being used to implement IoT gateways. Specific tasks then need a resource dedicated to their job. To fulfill this, we face a hurdle in choosing between a dedicated processor and a shared processor in a symmetric processing architecture. Dedicated processors are those in which all tasks are processed on a single core, whereas in fair-share processors specific processes are assigned to specific cores. Symmetric processing makes use of dedicated processors, whereas asymmetric processing makes use of shared processors. Asymmetric multiprocessing can be used in real-time applications to solve real-time problems; one such platform is IoT. In this paper we evaluate asymmetric processing on the GNU/Linux platform by testing multiple threads running on different multi-core processor architectures, with a view to running IoT applications with higher computational requirements in the future.

23

Sharif, Md Haidar. "High-Performance Mathematical Functions for Single-Core Architectures." Journal of Circuits, Systems and Computers 23, no. 04 (April 2014): 1450051. http://dx.doi.org/10.1142/s0218126614500510.

Abstract:

Nowadays high-performance computing (HPC) architectures are designed to resolve assorted sophisticated scientific as well as engineering problems across an ever intensifying number of HPC and professional workloads. The key trigonometric functions sine and cosine are applied and computed in all spheres of daily life, yet computing them is a fairly time-consuming task in high-performance numerical simulations. In this paper, we present a detailed discussion of how the micro-architecture of single-core Itanium® and Alpha 21264/21364 processors, together with manual optimization techniques, improves the computing performance of several mathematical functions. By describing the detailed algorithm and its execution pattern on the processor, we confirm that the processor micro-architecture combined with manual optimization techniques improves computing performance significantly compared not only with the standard math library's built-in functions with compiler optimizing options but also with the Intel® Itanium® library's highly optimized mathematical functions.

24

Thiele, Lothar, and Ulrich Arzt. "On the Synthesis of Massively Parallel Architectures." International Journal of High Speed Electronics and Systems 04, no. 02 (June 1993): 99–131. http://dx.doi.org/10.1142/s0129156493000078.

Abstract:

We describe synthesis methods for massively parallel architectures. In particular, a methodology is proposed which leads to a mechanical and provably correct design of systems consisting of memory banks, switches, and processor arrays. In particular, the trajectory is based on the concept of piecewise-linear/regular algorithms and processor arrays. Complex design transformations such as partitioning, clustering, control generation, and flattening/creation of hierarchical levels are embedded in a homogeneous design flow. The final specification of the implementation not only contains the processing elements but also control flow, control processors and interfaces to memory banks and switches.

25

Garzia, Fabio, Roberto Airoldi, and Jari Nurmi. "Implementation of FFT on General-Purpose Architectures for FPGA." International Journal of Embedded and Real-Time Communication Systems 1, no. 3 (July 2010): 24–43. http://dx.doi.org/10.4018/jertcs.2010070102.

Abstract:

This paper describes two general-purpose architectures targeted to Field Programmable Gate Array (FPGA) implementation. The first architecture is based on the coupling of a coarse-grain reconfigurable array with a general-purpose processor core. The second architecture is a homogeneous multi-processor system-on-chip (MP-SoC). Both architectures have been mapped onto two different Altera FPGA devices, a StratixII and a StratixIV. Although mapping onto the StratixIV results in higher operating frequencies, the capabilities of the device are not fully exploited. The implementation of an FFT on the two platforms shows a considerable speed-up in comparison with a single-processor reference architecture. The speed-up is higher in the reconfigurable solution but the MP-SoC provides an easier programming interface that is completely based on the C language. The authors’ approach proves that implementing a programmable architecture on FPGA and then programming it using a high-level software language is a viable alternative to designing a dedicated hardware block with a hardware description language (HDL) and mapping it on FPGA.

26

Lee, Jongbok. "Performance Study of Asymmetric Multicore Processor Architectures." Journal of the Institute of Webcasting, Internet and Telecommunication 14, no. 3 (June 30, 2014): 163–69. http://dx.doi.org/10.7236/jiibc.2014.14.3.163.

27

Rezgui, S., R. Velazco, R. Ecoffet, S. Rodriguez, and J. R. Mingo. "Estimating error rates in processor-based architectures." IEEE Transactions on Nuclear Science 48, no. 5 (2001): 1680–87. http://dx.doi.org/10.1109/23.960357.

28

Pradhan. "Dynamically Restructurable Fault-Tolerant Processor Network Architectures." IEEE Transactions on Computers C-34, no. 5 (May 1985): 434–47. http://dx.doi.org/10.1109/tc.1985.1676583.

29

Gebali, F., and A. N. M. E. Rafiq. "Processor array architectures for deep packet classification." IEEE Transactions on Parallel and Distributed Systems 17, no. 3 (March 2006): 241–52. http://dx.doi.org/10.1109/tpds.2006.39.

30

Diamantaras, K. I., W. H. Chou, and S. Y. Kung. "Dynamic programming implementation on array processor architectures." Journal of VLSI signal processing systems for signal, image and video technology 13, no. 1 (August 1996): 27–35. http://dx.doi.org/10.1007/bf00930665.

31

Zmyzgova, T. R., A. V. Solovyev, A. G. Rabushko, A. A. Medvedev, and Yu V. Adamenko. "Issues of compatibility of processor command architectures." IOP Conference Series: Earth and Environmental Science 421 (January 7, 2020): 042006. http://dx.doi.org/10.1088/1755-1315/421/4/042006.

32

Gehrke, W., and K. Gaedke. "Associative controlling of monolithic parallel processor architectures." IEEE Transactions on Circuits and Systems for Video Technology 5, no. 5 (1995): 453–64. http://dx.doi.org/10.1109/76.473558.

33

Moran, J., and S. Alexandres. "A comparison of some processor farm architectures." Microprocessing and Microprogramming 34, no. 1-5 (February 1992): 85–88. http://dx.doi.org/10.1016/0165-6074(92)90108-j.

34

Withagen, Willem Jan, and Rob Takken. "Hierachical modeling and simulation of processor architectures." Microprocessing and Microprogramming 39, no. 2-5 (December 1993): 229–32. http://dx.doi.org/10.1016/0165-6074(93)90094-2.

35

Barbierato, Enrico, Daniele Manini, and Marco Gribaudo. "A Multiformalism-Based Model for Performance Evaluation of Green Data Centres." Electronics 12, no. 10 (May 10, 2023): 2169. http://dx.doi.org/10.3390/electronics12102169.

Abstract:

Although the coexistence of ARM and INTEL technologies in green data centres is technically feasible, significant challenges exist that must be addressed. These challenges stem from the differences in instruction sets and power consumption between the two processor architectures. While ARM processors are known for their energy efficiency, INTEL processors tend to consume more power. Consequently, evaluating the performance of hybrid architectures can be a complex task. The contributions of this article consist of (i) a multiformalism-based model of a data centre, providing a natural and convenient approach to the specification process and performance analysis of a realistic scenario and (ii) a review of the performance indices, including the choice of one architecture over another, power consumption, the response time, and request loss, according to different policies. As a result, the model aims to address issues such as system underutilization and the need to estimate the optimal workload balance, thereby providing an effective solution for evaluating the performance of hybrid hardware architectures.

36

Wang, Guang, and Yin Sheng Gao. "A Control Path Design of Communications Processor." Advanced Materials Research 694-697 (May 2013): 1459–64. http://dx.doi.org/10.4028/www.scientific.net/amr.694-697.1459.

Abstract:

With the widespread popularity of wireless mobile devices, emerging applications such as video telephony and HD video playback have brought new demands. The next generation of wireless mobile computing devices needs higher data transfer rates, more complex algorithms, and low power consumption. This paper presents a design method for the controller and data transfer module of a variable-width SIMD processor architecture. The controller design focuses on controlling the program flow and initializing the data transfer module. By simulating the architecture in Xilinx ISE Design Suite, the fundamental modules have been designed and tested. The simulation results verify the correctness of the controller and data transfer module design.

37

Mahmood, Ausif. "Behavioral Simulation and Performance Evaluation of Multi-Processor Architectures." VLSI Design 4, no. 1 (January 1, 1996): 59–68. http://dx.doi.org/10.1155/1996/91035.

Abstract:

The development of multi-processor architectures requires extensive behavioral simulations to verify the correctness of design and to evaluate its performance. A high level language can provide maximum flexibility in this respect if the constructs for handling concurrent processes and a time mapping mechanism are added. This paper describes a novel technique for emulating hardware processes involved in a parallel architecture such that an object-oriented description of the design is maintained. The communication and synchronization between hardware processes is handled by splitting the processes into their equivalent subprograms at the entry points. The proper scheduling of these subprograms is coordinated by a timing wheel which provides a time mapping mechanism. Finally, a high level language pre-processor is proposed so that the timing wheel and the process emulation details can be made transparent to the user.

38

Srinivasan, Sudarshan K., Koushik Sarker, and Rajendra S. Katti. "Token-Aware Completion Functions for Elastic Processor Verification." Research Letters in Electronics 2009 (2009): 1–5. http://dx.doi.org/10.1155/2009/480740.

Abstract:

We develop a formal verification procedure to check that elastic pipelined processor designs correctly implement their instruction set architecture (ISA) specifications. The notion of correctness we use is based on refinement. Refinement proofs are based on refinement maps, which—in the context of this problem—are functions that map elastic processor states to states of the ISA specification model. Data flow in elastic architectures is complicated by the insertion of any number of buffers in any place in the design, making it hard to construct refinement maps for elastic systems in a systematic manner. We introduce token-aware completion functions, which incorporate a mechanism to track the flow of data in elastic pipelines, as a highly automated and systematic approach to construct refinement maps. We demonstrate the efficiency of the overall verification procedure based on token-aware completion functions using six elastic pipelined processor models based on the DLX architecture.

39

Kyriakis-Bitzaros, E. D., D. J. Soudris, and C. E. Goutis. "Transformation of Nested Loops into Uniform Recurrences and Their Mapping to Regular Processor Arrays." Journal of Circuits, Systems and Computers 06, no. 03 (June 1996): 243–65. http://dx.doi.org/10.1142/s0218126696000194.

Abstract:

A methodology for transforming a class of iterative algorithms, expressed in nested loops, into Uniform Recurrent Equation (URE) forms and their mapping into regular processor array architectures is presented. The propagation space of the variables of the original nested loop is specified by a system of linear equations formed by their index functions. The data flow within the index space is then localized by the derivation of a set of parametric dependence vectors, which eventually can be used to transform the initial algorithm into a set of UREs. The mapping of the UREs is accomplished by decomposition of the index space into independent subsets of variable instances using the derived dependence vectors. The dependence graphs of the subsets are normalized and subsequently mapped onto the processor array architecture. The exploitation of the independent subsets leads to significant improvement of the efficiency of the processor array compared to architectures derived by using linear transformations of the entire index space. Under certain conditions, only local interconnections in the processor array are required. The proposed methodology is illustrated by the design of alternative processor arrays implementing the convolution algorithm.

40

Jung, Yongchul, Jaechan Cho, Seongjoo Lee, and Yunho Jung. "Area-Efficient Pipelined FFT Processor for Zero-Padded Signals." Electronics 8, no. 12 (November 22, 2019): 1397. http://dx.doi.org/10.3390/electronics8121397.

Abstract:

This paper proposes an area-efficient fast Fourier transform (FFT) processor for zero-padded signals based on the radix-2² and the radix-2³ single-path delay feedback pipeline architectures. The delay elements for aligning the data in each pipeline stage are among the most complex units, and those of stage 1 are the largest. By exploiting the fact that the input data sequence is zero-padded and that the twiddle factor multiplication in stage 1 is trivial, the proposed FFT processor can dramatically reduce the required number of delay elements. Moreover, the 256-point FFT processors were designed using hardware description language (HDL) and were synthesized to gate-level circuits using a standard cell library for a 65 nm CMOS process. The proposed architecture results in a logic gate count of 40,396, which can be efficient and suitable for zero-padded FFT processors.
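
The arithmetic observation behind this kind of saving can be sketched in software. The example below uses a plain recursive radix-2 decimation-in-frequency FFT, not the radix-2²/radix-2³ single-path delay feedback pipelines of the paper, and it assumes that the zero padding fills exactly the second half of the input. Under that assumption the first-stage butterfly never needs the second half: the sum branch is the data itself and the difference branch is the data times a twiddle factor. The function names are hypothetical, and the sketch does not model delay elements or hardware cost.

```python
import cmath


def dft(x):
    """Reference O(N^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft_dif(x):
    """Recursive radix-2 decimation-in-frequency FFT (length a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # Stage butterfly: sum branch and twiddled difference branch.
    a = [x[t] + x[t + half] for t in range(half)]
    b = [(x[t] - x[t + half]) * cmath.exp(-2j * cmath.pi * t / n)
         for t in range(half)]
    out = [0] * n
    out[0::2] = fft_dif(a)  # even-indexed frequency bins
    out[1::2] = fft_dif(b)  # odd-indexed frequency bins
    return out


def fft_zero_padded(data):
    """FFT of `data` followed by len(data) zeros, skipping stage-1 add/subtract.

    With x[t + half] == 0, the stage-1 butterfly reduces to
    a[t] = x[t] and b[t] = x[t] * W, so the zero half is never read.
    """
    half = len(data)
    n = 2 * half
    a = list(data)
    b = [data[t] * cmath.exp(-2j * cmath.pi * t / n) for t in range(half)]
    out = [0] * n
    out[0::2] = fft_dif(a)
    out[1::2] = fft_dif(b)
    return out
```

For example, `fft_zero_padded([1, 2, 3, 4])` agrees with `dft([1, 2, 3, 4, 0, 0, 0, 0])` up to floating-point rounding.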

41

Buinevich, M., and K. Izrailov. "Identification of Processor’s Architecture of Executable Code Based on Machine Learning. Part 1. Frequency Byte Model." Proceedings of Telecommunication Universities 6, no. 1 (2020): 77–85. http://dx.doi.org/10.31854/1813-324x-2020-6-1-77-85.

Abstract:

This article presents the results of a study of a machine-learning method for identifying the processor architecture of executable code. The first part gives an overview of existing solutions for identifying machine code and formulates the assumption behind a new method. The features of machine code instructions are considered and a frequency-byte model of the code is built. A processor architecture identification scheme based on this model is proposed. Frequency signatures are provided for the following top 10 processor architectures: amd64, arm64, armel, armhf, i386, mips, mips64el, mipsel, ppc64el, s390x.
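
A frequency-byte model of this kind can be sketched as a 256-bin histogram of byte values, with identification done by comparing the histogram of an unknown sample against stored signatures. The abstract does not say how the comparison is made, so the cosine similarity used below is an assumption, as are the function names; the byte strings in the usage note are toy data, not real machine code or the signatures from the article.

```python
import math
from collections import Counter


def byte_frequencies(code):
    """Return the relative frequency of each byte value 0..255 in `code`."""
    total = len(code)
    if total == 0:
        return [0.0] * 256
    counts = Counter(code)  # iterating over bytes yields integers 0..255
    return [counts.get(value, 0) / total for value in range(256)]


def cosine_similarity(p, q):
    """Cosine of the angle between two frequency vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def identify(code, signatures):
    """Return the name of the signature most similar to `code`.

    `signatures` maps an architecture name to a 256-element frequency vector.
    """
    freq = byte_frequencies(code)
    return max(signatures, key=lambda name: cosine_similarity(freq, signatures[name]))
```

With two toy signatures, `{"toy_a": byte_frequencies(bytes([1] * 8 + [2] * 2)), "toy_b": byte_frequencies(bytes([3] * 6 + [4] * 4))}`, a sample such as `bytes([1] * 5 + [2])` is assigned to `toy_a`, because it shares no byte values with the other signature.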

42

Erten, Gail, and Fathi M. Salam. "Two Cellular Architectures for Integrated Image Sensing and Processing on a Single Chip." Journal of Circuits, Systems and Computers 08, no. 05n06 (October 1998): 637–59. http://dx.doi.org/10.1142/s0218126698000407.

Abstract:

Two architectures for a programmable image processor with on-chip light sensing capability are described. The first is a VLSI implementation of a cellular neural network. The second is a distributed dual-structure mutation of the first architecture. The distributed dual architecture leverages the speed of silicon against the large silicon area requirements. Moreover, the innovative integrated nature of the dual-structure design significantly reduces the bottleneck and computational overload caused by data transfer from sensory focal plane to the image processor. The paper also describes VLSI chip prototypes and test results.

43

Liu, Wanli, David H. Albonesi, John Gostomski, Lloyd Palum, Dave Hinterberger, Rick Wanzenried, and Mark Indovina. "An Evaluation of a Configurable VLIW Microarchitecture for Embedded DSP Applications." Journal of Circuits, Systems and Computers 13, no. 06 (December 2004): 1321–45. http://dx.doi.org/10.1142/s0218126604001994.

Abstract:

The last decade has witnessed a significant increase in processor offerings geared towards embedded DSP applications. Such processors are commonly VLIW architectures with special ISA and/or microarchitecture features for speeding up signal processing functions and customization options to improve cost/performance. The Jazz Programmable System Architecture from Improv Systems is one such processor offering. Jazz employs a VLIW architecture which is well-suited to the characteristics of embedded DSP applications such as voice over packet, media processing, and home connectivity. The microarchitecture incorporates overlaid datapaths, distributed register file and memory systems, code compression, and parallel computation and memory access. Jazz permits design-time configuration in an attempt to bridge the gap between the flexibility of a programmable processor and the cost-benefit of full customization. In this paper, we explore the cost/performance tradeoffs of the Jazz microarchitecture on various embedded multimedia applications using a detailed cycle-level simulator as well as area and power models. Through a comparison of the performance, power, and area of different hardware configurations running these applications, we demonstrate how the configurability of the architecture affords a cost-performance benefit over a fixed microarchitecture. Key features of the microarchitecture are quantitatively evaluated in terms of their influence on performance. The relationship between compiler optimizations and processor performance is also explored.

44

Prisagjanec, Milcho, and Pece Mitrevski. "Reducing Competitive Cache Misses in Modern Processor Architectures." International Journal of Computer Science and Information Technology 8, no. 6 (December 30, 2016): 49–57. http://dx.doi.org/10.5121/ijcsit.2016.8605.

45

Corporaal, Henk, and Marnix Arnold. "Using Transport Triggered Architectures for Embedded Processor Design." Integrated Computer-Aided Engineering 5, no. 1 (January 1, 1998): 19–38. http://dx.doi.org/10.3233/ica-1998-5103.

46

Lee, Jongbok. "A Performance Study of Embedded Multicore Processor Architectures." Journal of the Institute of Webcasting, Internet and Telecommunication 13, no. 1 (February 28, 2013): 163–69. http://dx.doi.org/10.7236/jiibc.2013.13.1.163.

47

Lee, Jongbok. "Performance Study of Multicore Digital Signal Processor Architectures." Journal of the Institute of Webcasting, Internet and Telecommunication 13, no. 4 (August 31, 2013): 171–77. http://dx.doi.org/10.7236/jiibc.2013.13.4.171.

48

Marks, R. Jackson, Les E. Atlas, Seho Oh, and Kwan F. Cheung. "Optical-processor architectures for alternating-projection neural networks." Optics Letters 13, no. 6 (June 1, 1988): 533. http://dx.doi.org/10.1364/ol.13.000533.

49

Touloupis, E., J. A. Flint, V. A. Chouliaras, and D. D. Ward. "Modelling multiple faults in fault-tolerant processor architectures." Electronics Letters 41, no. 21 (2005): 1162. http://dx.doi.org/10.1049/el:20053160.

50

Balkesen, Cagri, Jens Teubner, Gustavo Alonso, and M. Tamer Özsu. "Main-Memory Hash Joins on Modern Processor Architectures." IEEE Transactions on Knowledge and Data Engineering 27, no. 7 (July 1, 2015): 1754–66. http://dx.doi.org/10.1109/tkde.2014.2313874.
