Log in

Relevant bibliographies by topics / In-memory-computing (IMC) / Journal articles

To see the other types of publications on this topic, follow the link: In-memory-computing (IMC).

Journal articles on the topic 'In-memory-computing (IMC)'

Author: Grafiati

Published: 19 July 2023

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 34 journal articles for your research on the topic 'In-memory-computing (IMC).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Song, Soonbum, and Youngmin Kim. "Novel In-Memory Computing Adder Using 8+T SRAM." Electronics 11, no. 6 (2022): 929. http://dx.doi.org/10.3390/electronics11060929.

Full text

Abstract:

Von Neumann architecture-based computing systems are facing a von Neumann bottleneck owing to data transfer between separated memory and processor units. In-memory computing (IMC), on the other hand, reduces energy consumption and improves computing performance. This study explains an 8+T SRAM IMC circuit based on 8+T differential SRAM (8+T SRAM) and proposes 8+T SRAM-based IMC full adder (FA) and 8+T SRAM-based IMC approximate adder, which are based on the 8+T SRAM IMC circuit. The 8+T SRAM IMC circuit performs SRAM read and bitwise operations simultaneously and performs each logic operation parallelly. The proposed IMC FA and the proposed IMC approximate adder can be applied to a multi-bit adder. The two adders are based on the 8+T SRAM IMC circuit and thus read and compute simultaneously. In this study, the 8+T SRAM IMC circuit was applied to the adder, leveraging its ability to perform read and logic operations simultaneously. According to the performance in this study, the 8+T SRAM IMC circuit, proposed FA, proposed RCA, and proposed approximated adder are good candidates for IMC, which aims to reduce energy consumption and improve overall performance.

APA, Harvard, Vancouver, ISO, and other styles

2

Mannocci, P., M. Farronato, N. Lepri, et al. "In-memory computing with emerging memory devices: Status and outlook." APL Machine Learning 1, no. 1 (2023): 010902. http://dx.doi.org/10.1063/5.0136403.

Full text

Abstract:

In-memory computing (IMC) has emerged as a new computing paradigm able to alleviate or suppress the memory bottleneck, which is the major concern for energy efficiency and latency in modern digital computing. While the IMC concept is simple and promising, the details of its implementation cover a broad range of problems and solutions, including various memory technologies, circuit topologies, and programming/processing algorithms. This Perspective aims at providing an orientation map across the wide topic of IMC. First, the memory technologies will be presented, including both conventional complementary metal-oxide-semiconductor-based and emerging resistive/memristive devices. Then, circuit architectures will be considered, describing their aim and application. Circuits include both popular crosspoint arrays and other more advanced structures, such as closed-loop memory arrays and ternary content-addressable memory. The same circuit might serve completely different applications, e.g., a crosspoint array can be used for accelerating matrix-vector multiplication for forward propagation in a neural network and outer product for backpropagation training. The different algorithms and memory properties to enable such diversification of circuit functions will be discussed. Finally, the main challenges and opportunities for IMC will be presented.

APA, Harvard, Vancouver, ISO, and other styles

3

Sun, Zhaohui, Yang Feng, Peng Guo, et al. "Flash-based in-memory computing for stochastic computing in image edge detection." Journal of Semiconductors 44, no. 5 (2023): 054101. http://dx.doi.org/10.1088/1674-4926/44/5/054101.

Full text

Abstract:

Abstract The “memory wall” of traditional von Neumann computing systems severely restricts the efficiency of data-intensive task execution, while in-memory computing (IMC) architecture is a promising approach to breaking the bottleneck. Although variations and instability in ultra-scaled memory cells seriously degrade the calculation accuracy in IMC architectures, stochastic computing (SC) can compensate for these shortcomings due to its low sensitivity to cell disturbances. Furthermore, massive parallel computing can be processed to improve the speed and efficiency of the system. In this paper, by designing logic functions in NOR flash arrays, SC in IMC for the image edge detection is realized, demonstrating ultra-low computational complexity and power consumption (25.5 fJ/pixel at 2-bit sequence length). More impressively, the noise immunity is 6 times higher than that of the traditional binary method, showing good tolerances to cell variation and reliability degradation when implementing massive parallel computation in the array.

APA, Harvard, Vancouver, ISO, and other styles

4

Pedretti, Giacomo, and Daniele Ielmini. "In-Memory Computing with Resistive Memory Circuits: Status and Outlook." Electronics 10, no. 9 (2021): 1063. http://dx.doi.org/10.3390/electronics10091063.

Full text

Abstract:

In-memory computing (IMC) refers to non-von Neumann architectures where data are processed in situ within the memory by taking advantage of physical laws. Among the memory devices that have been considered for IMC, the resistive switching memory (RRAM), also known as memristor, is one of the most promising technologies due to its relatively easy integration and scaling. RRAM devices have been explored for both memory and IMC applications, such as neural network accelerators and neuromorphic processors. This work presents the status and outlook on the RRAM for analog computing, where the precision of the encoded coefficients, such as the synaptic weights of a neural network, is one of the key requirements. We show the experimental study of the cycle-to-cycle variation of set and reset processes for HfO2-based RRAM, which indicate that gate-controlled pulses present the least variation in conductance. Assuming a constant variation of conductance σG, we then evaluate and compare various mapping schemes, including multilevel, binary, unary, redundant and slicing techniques. We present analytical formulas for the standard deviation of the conductance and the maximum number of bits that still satisfies a given maximum error. Finally, we discuss RRAM performance for various analog computing tasks compared to other computational memory devices. RRAM appears as one of the most promising devices in terms of scaling, accuracy and low-current operation.

APA, Harvard, Vancouver, ISO, and other styles

5

Bansla, Neetu, and Rajneesh . "Future ERP: In-Memory Computing (IMC)Technology Infusion." Journal of Information Technology and Sciences 6, no. 3 (2020): 17–21. http://dx.doi.org/10.46610/joits.2020.v06i03.003.

Full text

APA, Harvard, Vancouver, ISO, and other styles

6

Kim, Manho, Sung-Ho Kim, Hyuk-Jae Lee, and Chae-Eun Rhee. "Case Study on Integrated Architecture for In-Memory and In-Storage Computing." Electronics 10, no. 15 (2021): 1750. http://dx.doi.org/10.3390/electronics10151750.

Full text

Abstract:

Since the advent of computers, computing performance has been steadily increasing. Moreover, recent technologies are mostly based on massive data, and the development of artificial intelligence is accelerating it. Accordingly, various studies are being conducted to increase the performance and computing and data access, together reducing energy consumption. In-memory computing (IMC) and in-storage computing (ISC) are currently the most actively studied architectures to deal with the challenges of recent technologies. Since IMC performs operations in memory, there is a chance to overcome the memory bandwidth limit. ISC can reduce energy by using a low power processor inside storage without an expensive IO interface. To integrate the host CPU, IMC and ISC harmoniously, appropriate workload allocation that reflects the characteristics of the target application is required. In this paper, the energy and processing speed are evaluated according to the workload allocation and system conditions. The proof-of-concept prototyping system is implemented for the integrated architecture. The simulation results show that IMC improves the performance by 4.4 times and reduces total energy by 4.6 times over the baseline host CPU. ISC is confirmed to significantly contribute to energy reduction.

APA, Harvard, Vancouver, ISO, and other styles

7

Zhang, Jin, Zhiting Lin, Xiulong Wu, et al. "An 8T SRAM Array with Configurable Word Lines for In-Memory Computing Operation." Electronics 10, no. 3 (2021): 300. http://dx.doi.org/10.3390/electronics10030300.

Full text

Abstract:

In-memory computing (IMC) has been widely accepted to be an effective method to improve energy efficiency. To realize IMC, operands in static random-access memory (SRAM) are stored in columns, which contradicts SRAM write patterns and requires additional data movement. In this paper, an 8T SRAM array with configurable word lines is proposed, in where the operands are arranged in rows, following the traditional SRAM storage pattern, and therefore additional data movement is not required. The proposed structure supports three different computing modes. In the ternary multiplication mode, the reference voltage generation column is not required. The energy of computing is only 1.273 fJ/bit. In the unsigned multibit multiplication mode, discharge and charging paths are used to enlarge the voltage difference of the least significant bit. In the logic operation mode, different types of operations (e.g., IMP, OR, NOR, XNOR, and XOR) are achieved in a single cycle. The frequency of logic computing is up to 909 MHz.

APA, Harvard, Vancouver, ISO, and other styles

8

Xue, Wang, Liu, Lv, Wang, and Zeng. "An RISC-V Processor with Area-Efficient Memristor-Based In-Memory Computing for Hash Algorithm in Blockchain Applications." Micromachines 10, no. 8 (2019): 541. http://dx.doi.org/10.3390/mi10080541.

Full text

Abstract:

Blockchain technology is increasingly being used in Internet of things (IoT) devices for information security and data integrity. However, it is challenging to implement complex hash algorithms with limited resources in IoT devices owing to large energy consumption and a long processing time. This paper proposes an RISC-V processor with memristor-based in-memory computing (IMC) for blockchain technology in IoT applications. The IMC-adapted instructions were designed for the Keccak hash algorithm by virtue of the extendibility of the RISC-V instruction set architecture (ISA). Then, an RISC-V processor with area-efficient memristor-based IMC was developed based on an open-source core for IoT applications, Hummingbird E200. The general compiling policy with the data allocation method is also disclosed for the IMC implementation of the Keccak hash algorithm. An evaluation shows that >70% improvements in both performance and energy saving were achieved with limited area overhead after introducing IMC in the RISC-V processor.

APA, Harvard, Vancouver, ISO, and other styles

9

Krishnan, Gokul, Sumit K. Mandal, Manvitha Pannala, et al. "SIAM: Chiplet-based Scalable In-Memory Acceleration with Mesh for Deep Neural Networks." ACM Transactions on Embedded Computing Systems 20, no. 5s (2021): 1–24. http://dx.doi.org/10.1145/3476999.

Full text

Abstract:

In-memory computing (IMC) on a monolithic chip for deep learning faces dramatic challenges on area, yield, and on-chip interconnection cost due to the ever-increasing model sizes. 2.5D integration or chiplet-based architectures interconnect multiple small chips (i.e., chiplets) to form a large computing system, presenting a feasible solution beyond a monolithic IMC architecture to accelerate large deep learning models. This paper presents a new benchmarking simulator, SIAM, to evaluate the performance of chiplet-based IMC architectures and explore the potential of such a paradigm shift in IMC architecture design. SIAM integrates device, circuit, architecture, network-on-chip (NoC), network-on-package (NoP), and DRAM access models to realize an end-to-end system. SIAM is scalable in its support of a wide range of deep neural networks (DNNs), customizable to various network structures and configurations, and capable of efficient design space exploration. We demonstrate the flexibility, scalability, and simulation speed of SIAM by benchmarking different state-of-the-art DNNs with CIFAR-10, CIFAR-100, and ImageNet datasets. We further calibrate the simulation results with a published silicon result, SIMBA. The chiplet-based IMC architecture obtained through SIAM shows 130 and 72 improvement in energy-efficiency for ResNet-50 on the ImageNet dataset compared to Nvidia V100 and T4 GPUs.

APA, Harvard, Vancouver, ISO, and other styles

10

Kiran Cherupally, Sai, Jian Meng, Adnan Siraj Rakin, et al. "Improving the accuracy and robustness of RRAM-based in-memory computing against RRAM hardware noise and adversarial attacks." Semiconductor Science and Technology 37, no. 3 (2022): 034001. http://dx.doi.org/10.1088/1361-6641/ac461f.

Full text

Abstract:

Abstract We present a novel deep neural network (DNN) training scheme and resistive RAM (RRAM) in-memory computing (IMC) hardware evaluation towards achieving high accuracy against RRAM device/array variations and enhanced robustness against adversarial input attacks. We present improved IMC inference accuracy results evaluated on state-of-the-art DNNs including ResNet-18, AlexNet, and VGG with binary, 2-bit, and 4-bit activation/weight precision for the CIFAR-10 dataset. These DNNs are evaluated with measured noise data obtained from three different RRAM-based IMC prototype chips. Across these various DNNs and IMC chip measurements, we show that our proposed hardware noise-aware DNN training consistently improves DNN inference accuracy for actual IMC hardware, up to 8% accuracy improvement for the CIFAR-10 dataset. We also analyze the impact of our proposed noise injection scheme on the adversarial robustness of ResNet-18 DNNs with 1-bit, 2-bit, and 4-bit activation/weight precision. Our results show up to 6% improvement in the robustness to black-box adversarial input attacks.

APA, Harvard, Vancouver, ISO, and other styles

11

Bärenfänger, Rieke, Boris Otto, and Hubert Österle. "Business value of in-memory technology – multiple-case study insights." Industrial Management & Data Systems 114, no. 9 (2014): 1396–414. http://dx.doi.org/10.1108/imds-07-2014-0212.

Full text

Abstract:

Purpose – The purpose of this paper is to assess the business value of in-memory computing (IMC) technology by analyzing its organizational impact in different application scenarios. Design/methodology/approach – This research applies a multiple-case study methodology analyzing five cases of IMC application scenarios in five large European industrial and service-sector companies. Findings – Results show that IMC can deliver business value in various applications ranging from advanced analytic insights to support of real-time processes. This enables higher-level organizational advantages like data-driven decision making, superior transparency of operations, and experience with Big Data technology. The findings are summarized in a business value generation model which captures the business benefits along with preceding enabling changes in the organizational environment. Practical implications – Results aid managers in identifying different application scenarios where IMC technology may generate value for their organizations from business and IT management perspectives. The research also sheds light on the socio-technical factors that influence the likelihood of success or failure of IMC initiatives. Originality/value – This research is among the first to model the business value creation process of in-memory technology based on insights from multiple implemented applications in different industries.

APA, Harvard, Vancouver, ISO, and other styles

12

Mambu, Kévin, Henri-Pierre Charles, Maha Kooli, and Julie Dumas. "Towards Integration of a Dedicated Memory Controller and Its Instruction Set to Improve Performance of Systems Containing Computational SRAM." Journal of Low Power Electronics and Applications 12, no. 1 (2022): 18. http://dx.doi.org/10.3390/jlpea12010018.

Full text

Abstract:

In-memory computing (IMC) aims to solve the performance gap between CPU and memories introduced by the memory wall. However, it does not address the energy wall problem caused by data transfer over memory hierarchies. This paper proposes the data-locality management unit (DMU) to efficiently transfer data from a DRAM memory to a computational SRAM (C-SRAM) memory allowing IMC operations. The DMU is tightly coupled within the C-SRAM and allows one to align the data structure in order to perform effective in-memory computation. We propose a dedicated instruction set within the DMU to issue data transfers. The performance evaluation of a system integrating C-SRAM within the DMU compared to a reference scalar system architecture shows an increase from ×5.73 to ×11.01 in speed-up and from ×29.49 to ×46.67 in energy reduction, versus a system integrating C-SRAM without any transfer mechanism compared to a reference scalar system architecture.

APA, Harvard, Vancouver, ISO, and other styles

13

Lin, Huai, Xi Luo, Long Liu, et al. "All-Electrical Control of Compact SOT-MRAM: Toward Highly Efficient and Reliable Non-Volatile In-Memory Computing." Micromachines 13, no. 2 (2022): 319. http://dx.doi.org/10.3390/mi13020319.

Full text

Abstract:

Two-dimensional van der Waals (2D vdW) ferromagnets possess outstanding scalability, controllable ferromagnetism, and out-of-plane anisotropy, enabling the compact spintronics-based non-volatile in-memory computing (nv-IMC) that promises to tackle the memory wall bottleneck issue. Here, by employing the intriguing room-temperature ferromagnetic characteristics of emerging 2D Fe3GeTe2 with the dissimilar electronic structure of the two spin-conducting channels, we report on a new type of non-volatile spin-orbit torque (SOT) magnetic tunnel junction (MTJ) device based on Fe3GeTe2/MgO/Fe3GeTe2 heterostructure, which demonstrates the uni-polar and high-speed field-free magnetization switching by adjusting the ratio of field-like torque to damping-like torque coefficient in the free layer. Compared to the conventional 2T1M structure, the developed 3-transistor-2-MTJ (3T2M) cell is implemented with the complementary data storage feature and the enhanced sensing margin of 201.4% (from 271.7 mV to 547.2 mV) and 276% (from 188.2 mV to 520 mV) for reading “1” and “0”, respectively. Moreover, superior to the traditional CoFeB-based MTJ memory cell counterpart, the 3T2M crossbar array architecture can be executed for AND/NAND, OR/NOR Boolean logic operation with a fast latency of 24 ps and ultra-low power consumption of 2.47 fJ/bit. Such device to architecture design with elaborated micro-magnetic and circuit-level simulation results shows great potential for realizing high-performance 2D material-based compact SOT magnetic random-access memory, facilitating new applications of highly reliable and energy-efficient nv-IMC.

APA, Harvard, Vancouver, ISO, and other styles

14

Rajput, Anil Kumar, and Manisha Pattanaik. "A Nonvolatile 7T2M SRAM Cell with Improved Noise Margin for Energy Efficient In Memory Boolean Computations." International Journal of Engineering Research in Electronics and Communication Engineering 9, no. 1 (2022): 1–8. http://dx.doi.org/10.36647/ijerece/09.01.a001.

Full text

Abstract:

The current computing systems are facing von Neumann bottleneck (VNB) in modern times due to the high prominence on big-data applications such as artificial intelligence and neuromorphic computing. In-memory computation is one of the emerging computing paradigms to mitigate this VNB. In this paper, a memristor-based robust 7T2M Nonvolatile-SRAM (NvSRAM) is proposed for energy-efficient In-memory computation. The 7T2M NvSRAM is designed using CMOS and memristor with a higher resistance ratio, which improved the write margin by 74.44% and the energy consumption for read and write operation by 5.10% and 9.66% over conventional 6T SRAM at the cost of increment in write delay. The read decoupled path with the VGND line enhances the read margin and read path Ion/Ioff ratio of 7T2M NvSRAM cell by 2.69× and 102.42%, respectively over conventional 6T SRAM. The proposed cell uses a stacking transistor to reduce the leakage power in standby mode by 64.20% over conventional 6T SRAM. In addition to the normal SRAM function, the proposed 7T2M NvSRAM performs In-Memory Boolean Computation (IMBC) operations such as NAND, AND, NOR, OR, and XOR in a single cycle without compute-disturb (stored data flips during IMC). It achieves 4.29-fJ/bit average energy consumption at 1.8 V for IMBC operations

APA, Harvard, Vancouver, ISO, and other styles

15

Krishnan, Gokul, Sumit K. Mandal, Chaitali Chakrabarti, Jae-Sun Seo, Umit Y. Ogras, and Yu Cao. "Impact of On-chip Interconnect on In-memory Acceleration of Deep Neural Networks." ACM Journal on Emerging Technologies in Computing Systems 18, no. 2 (2022): 1–22. http://dx.doi.org/10.1145/3460233.

Full text

Abstract:

With the widespread use of Deep Neural Networks (DNNs), machine learning algorithms have evolved in two diverse directions—one with ever-increasing connection density for better accuracy and the other with more compact sizing for energy efficiency. The increase in connection density increases on-chip data movement, which makes efficient on-chip communication a critical function of the DNN accelerator. The contribution of this work is threefold. First, we illustrate that the point-to-point (P2P)-based interconnect is incapable of handling a high volume of on-chip data movement for DNNs. Second, we evaluate P2P and network-on-chip (NoC) interconnect (with a regular topology such as a mesh) for SRAM- and ReRAM-based in-memory computing (IMC) architectures for a range of DNNs. This analysis shows the necessity for the optimal interconnect choice for an IMC DNN accelerator. Finally, we perform an experimental evaluation for different DNNs to empirically obtain the performance of the IMC architecture with both NoC-tree and NoC-mesh. We conclude that, at the tile level, NoC-tree is appropriate for compact DNNs employed at the edge, and NoC-mesh is necessary to accelerate DNNs with high connection density. Furthermore, we propose a technique to determine the optimal choice of interconnect for any given DNN. In this technique, we use analytical models of NoC to evaluate end-to-end communication latency of any given DNN. We demonstrate that the interconnect optimization in the IMC architecture results in up to 6 × improvement in energy-delay-area product for VGG-19 inference compared to the state-of-the-art ReRAM-based IMC architectures.

APA, Harvard, Vancouver, ISO, and other styles

16

Nanda, Ipseeta, and Rajesh De. "REVIEW OF CLOUD COMPUTING CRYPTOGRAPHY." Information Management And Computer Science 5, no. 2 (2022): 31–33. http://dx.doi.org/10.26480/imcs.02.2022.31.33.

Full text

Abstract:

The delivery of computing services over the internet, as opposed to storing data on a local memory device or a proprietary disc drive, is known as cloud computing. Servers, storage, databases, networking, and software are some examples of computing services. The primary justification and major benefit of using the cloud are the user’s ability to store data there and access it from any location at any time, as well as the low cost of all its services. Despite this, because the data stored in the cloud is not directly maintained by the customer, security has always been a major concern with cloud computing. The data owners are unlikely to be aware of the route their data is taking when they upload or store data using a cloud computing service. The user is unaware of whether or not a third party is gathering, processing, and accessing their information. Numerous cryptography algorithms have been proposed to address security concerns. This paper discussed different cryptography algorithms that are present in the previous work with a focus on the fundamentals of cloud computing.

APA, Harvard, Vancouver, ISO, and other styles

17

Verma, Anil, Divya Anand, Aman Singh, et al. "IoT-Inspired Reliable Irregularity-Detection Framework for Education 4.0 and Industry 4.0." Electronics 11, no. 9 (2022): 1436. http://dx.doi.org/10.3390/electronics11091436.

Full text

Abstract:

Education 4.0 imitates Industry 4.0 in many aspects such as technology, customs, challenges, and benefits. The remarkable advancement in embryonic technologies, including IoT (Internet of Things), Fog Computing, Cloud Computing, and Augmented and Virtual Reality (AR/VR), polishes every dimension of Industry 4.0. The constructive impacts of Industry 4.0 are also replicated in Education 4.0. Real-time assessment, irregularity detection, and alert generation are some of the leading necessities of Education 4.0. Conspicuously, this study proposes a reliable assessment, irregularity detection, and alert generation framework for Education 4.0. The proposed framework correspondingly addresses the comparable issues of Industry 4.0. The proposed study (1) recommends the use of IoT, Fog, and Cloud Computing, i.e., IFC technological integration for the implementation of Education 4.0. Subsequently, (2) the Symbolic Aggregation Approximation (SAX), Kalman Filter, and Learning Bayesian Network (LBN) are deployed for data pre-processing and classification. Further, (3) the assessment, irregularity detection, and alert generation are accomplished over SoTL (the set of threshold limits) and the Multi-Layered Bi-Directional Long Short-Term Memory (M-Bi-LSTM)-based predictive model. To substantiate the proposed framework, experimental simulations are implemented. The experimental outcomes substantiate the better performance of the proposed framework, in contrast to the other contemporary technologies deployed for the enactment of Education 4.0.

APA, Harvard, Vancouver, ISO, and other styles

18

Krasnov, Mikhail Mikhailovich, and Olga Borisovna Feodoritova. "Using the functional programming library for solving numerical problems on graphics accelerators with CUDA technology." Proceedings of the Institute for System Programming of the RAS 33, no. 5 (2021): 167–80. http://dx.doi.org/10.15514/ispras-2021-33(5)-10.

Full text

Abstract:

Modern graphics accelerators (GPUs) can significantly speed up the execution of numerical tasks. However, porting programs to graphics accelerators is not an easy task. Sometimes the transfer of programs to such accelerators is carried out by almost completely rewriting them (for example, when using the OpenCL technology). This raises the daunting task of maintaining two independent source codes. However, CUDA graphics accelerators, thanks to technology developed by NVIDIA, allow you to have a single source code for both conventional processors (CPUs) and CUDA. The machine code generated when compiling this single text depends on which compiler it is compiled with (the usual one, such as gcc, icc and msvc, or the compiler for CUDA, nvcc). However, in this single source code, you need to somehow tell the compiler which parts of this code to parallelize on shared memory. For the CPU, this is usually done using OpenMP and special pragmas to the compiler. For CUDA, parallelization is done in a completely different way. The use of the functional programming library developed by the authors allows you to hide the use of one or another parallelization mechanism on shared memory within the library and make the user source code completely independent of the computing device used (CPU or CUDA). This article shows how this can be done.

APA, Harvard, Vancouver, ISO, and other styles

19

Backx, Rosa, Caroline Skirrow, Pasquale Dente, Jennifer H. Barnett, and Francesca K. Cormack. "Comparing Web-Based and Lab-Based Cognitive Assessment Using the Cambridge Neuropsychological Test Automated Battery: A Within-Subjects Counterbalanced Study." Journal of Medical Internet Research 22, no. 8 (2020): e16792. http://dx.doi.org/10.2196/16792.

Full text

Abstract:

Background Computerized assessments are already used to derive accurate and reliable measures of cognitive function. Web-based cognitive assessment could improve the accessibility and flexibility of research and clinical assessment, widen participation, and promote research recruitment while simultaneously reducing costs. However, differences in context may influence task performance. Objective This study aims to determine the comparability of an unsupervised, web-based administration of the Cambridge Neuropsychological Test Automated Battery (CANTAB) against a typical in-person lab-based assessment, using a within-subjects counterbalanced design. The study aims to test (1) reliability, quantifying the relationship between measurements across settings using correlational approaches; (2) equivalence, the extent to which test results in different settings produce similar overall results; and (3) agreement, by quantifying acceptable limits to bias and differences between measurement environments. Methods A total of 51 healthy adults (32 women and 19 men; mean age 36.8, SD 15.6 years) completed 2 testing sessions, which were completed on average 1 week apart (SD 4.5 days). Assessments included equivalent tests of emotion recognition (emotion recognition task [ERT]), visual recognition (pattern recognition memory [PRM]), episodic memory (paired associate learning [PAL]), working memory and spatial planning (spatial working memory [SWM] and one touch stockings of Cambridge), and sustained attention (rapid visual information processing [RVP]). Participants were randomly allocated to one of the two groups, either assessed in-person in the laboratory first (n=33) or with unsupervised web-based assessments on their personal computing systems first (n=18). Performance indices (errors, correct trials, and response sensitivity) and median reaction times were extracted. Intraclass and bivariate correlations examined intersetting reliability, linear mixed models and Bayesian paired sample t tests tested for equivalence, and Bland-Altman plots examined agreement. Results Intraclass correlation (ICC) coefficients ranged from ρ=0.23-0.67, with high correlations in 3 performance indices (from PAL, SWM, and RVP tasks; ρ≥0.60). High ICC values were also seen for reaction time measures from 2 tasks (PRM and ERT tasks; ρ≥0.60). However, reaction times were slower during web-based assessments, which undermined both equivalence and agreement for reaction time measures. Performance indices did not differ between assessment settings and generally showed satisfactory agreement. Conclusions Our findings support the comparability of CANTAB performance indices (errors, correct trials, and response sensitivity) in unsupervised, web-based assessments with in-person and laboratory tests. Reaction times are not as easily translatable from in-person to web-based testing, likely due to variations in computer hardware. The results underline the importance of examining more than one index to ascertain comparability, as high correlations can present in the context of systematic differences, which are a product of differences between measurement environments. Further work is now needed to examine web-based assessments in clinical populations and in larger samples to improve sensitivity for detecting subtler differences between test settings.

APA, Harvard, Vancouver, ISO, and other styles

20

An, SangWoo, and Seog Chung Seo. "Highly Efficient Implementation of Block Ciphers on Graphic Processing Units for Massively Large Data." Applied Sciences 10, no. 11 (2020): 3711. http://dx.doi.org/10.3390/app10113711.

Full text

Abstract:

With the advent of IoT and Cloud computing service technology, the size of user data to be managed and file data to be transmitted has been significantly increased. To protect users’ personal information, it is necessary to encrypt it in secure and efficient way. Since servers handling a number of clients or IoT devices have to encrypt a large amount of data without compromising service capabilities in real-time, Graphic Processing Units (GPUs) have been considered as a proper candidate for a crypto accelerator for processing a huge amount of data in this situation. In this paper, we present highly efficient implementations of block ciphers on NVIDIA GPUs (especially, Maxwell, Pascal, and Turing architectures) for environments using massively large data in IoT and Cloud computing applications. As block cipher algorithms, we choose AES, a representative standard block cipher algorithm; LEA, which was recently added in ISO/IEC 29192-2:2019 standard; and CHAM, a recently developed lightweight block cipher algorithm. To maximize the parallelism in the encryption process, we utilize Counter (CTR) mode of operation and customize it by using GPU’s characteristics. We applied several optimization techniques with respect to the characteristics of GPU architecture such as kernel parallelism, memory optimization, and CUDA stream. Furthermore, we optimized each target cipher by considering the algorithmic characteristics of each cipher by implementing the core part of each cipher with handcrafted inline PTX (Parallel Thread eXecution) codes, which are virtual assembly codes in CUDA platforms. With the application of our optimization techniques, in our implementation on RTX 2070 GPU, AES and LEA show up to 310 Gbps and 2.47 Tbps of throughput, respectively, which are 10.7% and 67% improved compared with the 279.86 Gbps and 1.47 Tbps of the previous best result. In the case of CHAM, this is the first optimized implementation on GPUs and it achieves 3.03 Tbps of throughput on RTX 2070 GPU.

APA, Harvard, Vancouver, ISO, and other styles

21

Kazemi, Arman, Franz Müller, Mohammad Mehdi Sharifi, et al. "Achieving software-equivalent accuracy for hyperdimensional computing with ferroelectric-based in-memory computing." Scientific Reports 12, no. 1 (2022). http://dx.doi.org/10.1038/s41598-022-23116-w.

Full text

Abstract:

AbstractHyperdimensional computing (HDC) is a brain-inspired computational framework that relies on long hypervectors (HVs) for learning. In HDC, computational operations consist of simple manipulations of hypervectors and can be incredibly memory-intensive. In-memory computing (IMC) can greatly improve the efficiency of HDC by reducing data movement in the system. Most existing IMC implementations of HDC are limited to binary precision which inhibits the ability to match software-equivalent accuracies. Moreover, memory arrays used in IMC are restricted in size and cannot immediately support the direct associative search of large binary HVs (a ubiquitous operation, often over 10,000+ dimensions) required to achieve acceptable accuracies. We present a multi-bit IMC system for HDC using ferroelectric field-effect transistors (FeFETs) that simultaneously achieves software-equivalent-accuracies, reduces the dimensionality of the HDC system, and improves energy consumption by 826x and latency by 30x when compared to a GPU baseline. Furthermore, for the first time, we experimentally demonstrate multi-bit, array-level content-addressable memory (CAM) operations with FeFETs. We also present a scalable and efficient architecture based on CAMs which supports the associative search of large HVs. Furthermore, we study the effects of device, circuit, and architectural-level non-idealities on application-level accuracy with HDC.

APA, Harvard, Vancouver, ISO, and other styles

22

Dazzi, Martino, Abu Sebastian, Luca Benini, and Evangelos Eleftheriou. "Accelerating Inference of Convolutional Neural Networks Using In-memory Computing." Frontiers in Computational Neuroscience 15 (August 3, 2021). http://dx.doi.org/10.3389/fncom.2021.674154.

Full text

Abstract:

In-memory computing (IMC) is a non-von Neumann paradigm that has recently established itself as a promising approach for energy-efficient, high throughput hardware for deep learning applications. One prominent application of IMC is that of performing matrix-vector multiplication in O(1) time complexity by mapping the synaptic weights of a neural-network layer to the devices of an IMC core. However, because of the significantly different pattern of execution compared to previous computational paradigms, IMC requires a rethinking of the architectural design choices made when designing deep-learning hardware. In this work, we focus on application-specific, IMC hardware for inference of Convolution Neural Networks (CNNs), and provide methodologies for implementing the various architectural components of the IMC core. Specifically, we present methods for mapping synaptic weights and activations on the memory structures and give evidence of the various trade-offs therein, such as the one between on-chip memory requirements and execution latency. Lastly, we show how to employ these methods to implement a pipelined dataflow that offers throughput and latency beyond state-of-the-art for image classification tasks.

APA, Harvard, Vancouver, ISO, and other styles

23

Fan, Anjunyi, Yihan Fu, Yaoyu Tao, et al. "Hadamard product-based in-memory computing design for floating point neural network training." Neuromorphic Computing and Engineering, February 9, 2023. http://dx.doi.org/10.1088/2634-4386/acbab9.

Full text

Abstract:

Abstract Deep neural networks (DNNs) are one of the key fields of machine learning. It requires considerable computational resources for cognitive tasks. As a novel technology to perform computing inside/near memory units, in-memory computing (IMC) significantly improves computing efficiency by reducing the need for repetitive data transfer between the processing and memory units. However, prior IMC designs mainly focus on the acceleration for DNN inference. DNN training with the IMC hardware has rarely been proposed. The challenges lie in the requirement of DNN training for high precision (e.g. floating point) and various operations of tensors (e.g. inner and outer products). These challenges call for the IMC design with new features. This paper proposes a novel Hadamard product-based IMC design for floating point DNN training. Our design consists of multiple compartments, which are the basic units for the matrix element-wise processing. We also develop BFloat16 post-processing circuits and fused adder trees, laying the foundation for IMC floating point processing. Based on the proposed circuit scheme, we reformulate the back-propagation training algorithm for the convenience and efficiency of the IMC execution. The proposed design is implemented with commercial 28nm technology process design kits and benchmarked with widely used neural networks. We model the influence of the circuit structural design parameters and provide an analysis framework for design space exploration. Our simulation validates that MobileNet training with the proposed IMC scheme saves 91.2% in energy and 13.9% in time versus the same task with NVIDIA GTX 3060 GPU. The proposed IMC design has a data density of 769.2Kb/mm2 with the floating point processing circuits included, showing a 3.5× improvement than the prior floating point IMC designs.

APA, Harvard, Vancouver, ISO, and other styles

24

Chang, Chih-Cheng, Shao-Tzu Li, Tong-Lin Pan, et al. "Device quantization policy in variation-aware in-memory computing design." Scientific Reports 12, no. 1 (2022). http://dx.doi.org/10.1038/s41598-021-04159-x.

Full text

Abstract:

AbstractDevice quantization of in-memory computing (IMC) that considers the non-negligible variation and finite dynamic range of practical memory technology is investigated, aiming for quantitatively co-optimizing system performance on accuracy, power, and area. Architecture- and algorithm-level solutions are taken into consideration. Weight-separate mapping, VGG-like algorithm, multiple cells per weight, and fine-tuning of the classifier layer are effective for suppressing inference accuracy loss due to variation and allow for the lowest possible weight precision to improve area and energy efficiency. Higher priority should be given to developing low-conductance and low-variability memory devices that are essential for energy and area-efficiency IMC whereas low bit precision (< 3b) and memory window (< 10) are less concerned.

APA, Harvard, Vancouver, ISO, and other styles

25

Lalchhandama, F., Mukesh Sahani, Vompolu Mohan Srinivas, Indranil Sengupta, and Kamalika Datta. "In-Memory Computing on Resistive RAM Systems Using Majority Operation." Journal of Circuits, Systems and Computers 31, no. 04 (2021). http://dx.doi.org/10.1142/s0218126622500712.

Full text

Abstract:

Memristors can be used to build nonvolatile memory systems with in-memory computing (IMC) capabilities. A number of prior works demonstrate the design of an IMC-capable memory macro using a memristor crossbar. However, read disturbance limits the use of such memory systems built using a 0-transistor, 1-RRAM (0T1R) structure that suffers from the sneak path problem. In this paper, we introduce a scheme for both memory and logic operations using the 1-transistor, 1-RRAM (1T1R) memristor crossbar, which effectively mitigates the read disturbance problem. The memory array is designed using nMOS transistors and the VTEAM memristor model. The peripheral circuitry like decoders, voltage multiplexers, and sense amplifiers is designed using a 45[Formula: see text]nm CMOS technology node. We introduce a mapping technique to realize arbitrary logic functions using Majority (MAJ) gate operations in the 1T1R crossbar. Through extensive experimentation on benchmark functions, it has been found that the proposed mapping method gives an improvement of 65% or more in terms of the number of time steps required, and 59% or more in terms of energy consumption as compared to some of the recent methods.

APA, Harvard, Vancouver, ISO, and other styles

26

Zeng, Junwei, Nuo Xu, Yabo Chen, Chenglong Huang, Zhiwei Li, and Liang Fang. "AIMCU-MESO: An In-Memory Computing Unit Constructed by MESO Device." ACM Transactions on Design Automation of Electronic Systems, May 26, 2022. http://dx.doi.org/10.1145/3539575.

Full text

Abstract:

Traditional CMOS-based von-Neumann computer architecture faces the issue of memory wall that the limitation of bus-bandwidth and the speed mismatch between processor and memory restrict the efficiency of data processing along with an irreducible energy consumption conducted by data movement, especially in some data-intensive applications. Recently, some novel in-memory computing (IMC) paradigms developed by utilizing the characteristics of different non-volatile memories provide promising ways to overcome the bottleneck of memory wall. Here, we propose a new IMC unit based on a memory array with the core element of magnetoelectric spin-orbit logic (MESO) device (AIMCU-MESO), in which the characteristics of the MESO device are exploited to achieve several in-memory logic operations with the functions of NAND, NOR and XOR in the MESO-based memory array. With the aid of some transistor-based switches, these logic operations can be achieved between any two MESOs in the array. Furthermore, the computing process of a 1-bit full adder (FA) is achieved in AIMCU-MESO by the in-memory logic manner to demonstrate the ability of logic cascading. The result of SPICE simulation for achieving the 1-bit FA using MESO devices is demonstrated, and the performances are compared with other designs of spintronics-based devices. Compared to multilevel voltage-controlled SOT-based magnetic memory (MV-SOTM), the proposed design demonstrates 71.4% and 49.2% reduction in terms of storage delay and logic delay, respectively.

APA, Harvard, Vancouver, ISO, and other styles

27

Baroni, Andrea, Artem Glukhov, Eduardo Pérez, et al. "An energy-efficient in-memory computing architecture for survival data analysis based on resistive switching memories." Frontiers in Neuroscience 16 (August 9, 2022). http://dx.doi.org/10.3389/fnins.2022.932270.

Full text

Abstract:

One of the objectives fostered in medical science is the so-called precision medicine, which requires the analysis of a large amount of survival data from patients to deeply understand treatment options. Tools like machine learning (ML) and deep neural networks are becoming a de-facto standard. Nowadays, computing facilities based on the Von Neumann architecture are devoted to these tasks, yet rapidly hitting a bottleneck in performance and energy efficiency. The in-memory computing (IMC) architecture emerged as a revolutionary approach to overcome that issue. In this work, we propose an IMC architecture based on resistive switching memory (RRAM) crossbar arrays to provide a convenient primitive for matrix-vector multiplication in a single computational step. This opens massive performance improvement in the acceleration of a neural network that is frequently used in survival analysis of biomedical records, namely the DeepSurv. We explored how the synaptic weights mapping strategy and the programming algorithms developed to counter RRAM non-idealities expose a performance/energy trade-off. Finally, we discussed how this application is tailored for the IMC architecture rather than being executed on commodity systems.

APA, Harvard, Vancouver, ISO, and other styles

28

Kingra, Sandeep Kaur, Vivek Parmar, and Manan Suri. "In-Memory Computation Based Mapping of Keccak-f Hash Function." Frontiers in Nanotechnology 4 (March 16, 2022). http://dx.doi.org/10.3389/fnano.2022.841756.

Full text

Abstract:

Cryptographic hash functions play a central role in data security for applications such as message authentication, data verification, and detecting malicious or illegal modification of data. However, such functions typically require intensive computations with high volume of memory accesses. Novel computing architectures such as logic-in-memory (LIM)/in-memory computing (IMC) have been investigated in the literature to address the limitations of intense compute and memory bottleneck. In this work, we present an implementation of Keccak-f (a state-of-the-art secure hash algorithm) using a variant of simultaneous logic-in-memory (SLIM) that utilizes emerging non-volatile memory (NVM) devices. Detailed operation and instruction mapping on SLIM-based digital gates is presented. Through simulations, we benchmark the proposed approach using LIM cells based on four different emerging NVM devices (OxRAM, CBRAM, PCM, and FeRAM). The proposed mapping strategy when used with state-of-the-art emerging NVM devices offers EDP savings of up to 300× compared to conventional methods.

APA, Harvard, Vancouver, ISO, and other styles

29

Bahnasawy, Nirmeen A., Gamal M. Attiya, Mervat Mosa, and Magdy A. Koutb. "A MODIFIED A* ALGORITHM FOR ALLOCATING TASK IN HETEROGENEOUS DISTIRBUTED COMPUTING SYSTEMS." International Journal of Computing, August 1, 2014, 50–57. http://dx.doi.org/10.47839/ijc.8.2.666.

Full text

Abstract:

Distributed computing can be used to solve large scale scientific and engineering problems. A parallel application could be divided into a number of tasks and executed concurrently on different computers in the system. This paper provides an optimal task assignment algorithm under memory constraints to minimize required time of finishing a parallel application. The proposed algorithm is based on the optimal assignment sequential search (OASS) of the A* algorithm with additional modifications. This modified algorithm yields optimal solution, lower time complexity, reduces the turnaround time of the application and considerably faster compared with the sequential search algorithm.

APA, Harvard, Vancouver, ISO, and other styles

30

"A Decentralized Accountability Framework For Enhancing Secure Data Sharing Through Icm In Cloud." International Journal of Innovative Technology and Exploring Engineering 8, no. 11 (2019): 3446–52. http://dx.doi.org/10.35940/ijitee.k2556.0981119.

Full text

Abstract:

The Cloud substitutes a computing criterion where shared configurable resources are afforded as an on-demand service over the Internet. Moreover, the cloud environment provides resources to the users on the basis of services like SaaS, PaaS and IaaS. Generally, a cloud can be referred as private cloud or public cloud. When a Cloud Service Provider (CSP) imposes upon public cloud resources to compile their private cloud, the result is demonstrated as a virtual private cloud. Private or public, the imperious intent of cloud computing is to provide simplistic, reliable usage of various computing resources. One of the significant features of cloud is that the outsourced data are accessed through any anonymous machines over the Internet. On the other hand, it creates an issue that user’s fear of unknown access of data, which can become a major difficulty to the wide implementation of cloud. In this paper, a decentralized accountability framework is developed to monitor the actual usage and access of the data that is shared on cloud. For that, a logging mechanism that includes authentication for each user to access the data has also been provided. Moreover, some procedures for providing the data under the control of data owner includes Integrity Checking Mechanism (ICM) have also been developed. The overall process strengthens the security constraints over cloud. And the experimental results reveal that the approach affords secure and scalable data sharing with reduced memory utilization and processing time

APA, Harvard, Vancouver, ISO, and other styles

31

"A Decentralized Accountability Framework for Enhancing Secure Data Sharing Through ICM in Cloud." VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE 8, no. 10 (2019): 1505–11. http://dx.doi.org/10.35940/ijitee.a1026.0881019.

Full text

Abstract:

The Cloud substitutes a computing criterion where shared configurable resources are afforded as an on-demand service over the Internet. Moreover, the cloud environment provides resources to the users on the basis of services like SaaS, PaaS and IaaS. Generally, a cloud can be referred as private cloud or public cloud. When a Cloud Service Provider (CSP) imposes upon public cloud resources to compile their private cloud, the result is demonstrated as a virtual private cloud. Private or public, the imperious intent of cloud computing is to provide simplistic, reliable usage of various computing resources. One of the significant features of cloud is that the outsourced data are accessed through any anonymous machines over the Internet. On the other hand, it creates an issue that user’s fear of unknown access of data, which can become a major difficulty to the wide implementation of cloud. In this paper, a decentralized accountability framework is developed to monitor the actual usage and access of the data that is shared on cloud. For that, a logging mechanism that includes authentication for each user to access the data has also been provided. Moreover, some procedures for providing the data under the control of data owner includes Integrity Checking Mechanism (ICM) have also been developed. The overall process strengthens the security constraints over cloud. And the experimental results reveal that the approach affords secure and scalable data sharing with reduced memory utilization and processing time.

APA, Harvard, Vancouver, ISO, and other styles

32

Kosmatopoulos, C., and N. Tsagourias. "DEVELOPMENT OF A STAND ALONE MONITORING SYSTEM (S.A.MO.S.)." International Journal of Computing, August 1, 2014, 59–62. http://dx.doi.org/10.47839/ijc.1.2.114.

Full text

Abstract:

Environmental monitoring is nowadays a common instrumentation application, not only in cases where scientific information is needed, but also for pollution control and development planning of certain sensitive areas. This paper describes the development of a Stand Alone Monitoring System (S.A.MO.S.) which is actually a complete monitoring station for environmental measurements in rivers, lakes, lagoons and other sensitive ecosystems. S.A.MO.S. is capable of performing scheduled measurements by sensors or other independent electronic measuring instruments, and stores data locally in a memory module. Measurement data may then be transferred to a PC computing system for further analysis either by using cellular phone network or via a UHF transceiver. Validation of measurements is achieved by system’s self check control and warning signals can be sent to authorized personnel in case of errors. The basic features and the configuration of this system are presented in this work.

APA, Harvard, Vancouver, ISO, and other styles

33

Xiong, Fu-Rui, Zhi-Chang Qin, Qian Ding, et al. "Parallel Cell Mapping Method for Global Analysis of High-Dimensional Nonlinear Dynamical Systems1." Journal of Applied Mechanics 82, no. 11 (2015). http://dx.doi.org/10.1115/1.4031149.

Full text

Abstract:

The cell mapping methods were originated by Hsu in 1980s for global analysis of nonlinear dynamical systems that can have multiple steady-state responses including equilibrium states, periodic motions, and chaotic attractors. The cell mapping methods have been applied to deterministic, stochastic, and fuzzy dynamical systems. Two important extensions of the cell mapping method have been developed to improve the accuracy of the solutions obtained in the cell state space: the interpolated cell mapping (ICM) and the set-oriented method with subdivision technique. For a long time, the cell mapping methods have been applied to dynamical systems with low dimension until now. With the advent of cheap dynamic memory and massively parallel computing technologies, such as the graphical processing units (GPUs), global analysis of moderate- to high-dimensional nonlinear dynamical systems becomes feasible. This paper presents a parallel cell mapping method for global analysis of nonlinear dynamical systems. The simple cell mapping (SCM) and generalized cell mapping (GCM) are implemented in a hybrid manner. The solution process starts with a coarse cell partition to obtain a covering set of the steady-state responses, followed by the subdivision technique to enhance the accuracy of the steady-state responses. When the cells are small enough, no further subdivision is necessary. We propose to treat the solutions obtained by the cell mapping method on a sufficiently fine grid as a database, which provides a basis for the ICM to generate the pointwise approximation of the solutions without additional numerical integrations of differential equations. A modified global analysis of nonlinear systems with transient states is developed by taking advantage of parallel computing without subdivision. To validate the parallelized cell mapping techniques and to demonstrate the effectiveness of the proposed method, a low-dimensional dynamical system governed by implicit mappings is first presented, followed by the global analysis of a three-dimensional plasma model and a six-dimensional Lorenz system. For the six-dimensional example, an error analysis of the ICM is conducted with the Hausdorff distance as a metric.

APA, Harvard, Vancouver, ISO, and other styles

34

Wu, Chuangwen, Xiangqing Zhou, Guang Zeng, et al. "Field-free spin-orbit torque switching in interlayer exchange coupled Co/Ta/CoTb." Journal of Physics: Condensed Matter, July 5, 2023. http://dx.doi.org/10.1088/1361-648x/ace4b1.

Full text

Abstract:

Abstract This study investigates a T-type field-free spin-orbit torque (SOT) device with an in-plane magnetic layer coupled to a perpendicular magnetic layer via a non-magnetic spacer. The device utilizes a Co/Ta/CoTb structure, in which the in-plane Co layer and the perpendicular CoTb layer are ferromagnetically coupled through the Ta spacer. "T-type" refers to the magnetization arrangement in the FM/spacer/FIM structure, where the magnetization in FM is in-plane, while in FIM, it is out-of-plane. This configuration forms a T-shaped arrangement for the magnetization of the two magnetic layers. Additionally, "interlayer exchange coupling" denotes the interaction between the two magnetic layers, which is achieved by adjusting the material and thickness of the spacer. Our results show that an in-plane effective field from the interlayer exchange coupling (IEC) enables deterministic current-induced magnetization switching of the CoTb layer. The field-driven and the current-driven asymmetric domain wall motion are observed and characterized by Magneto-optic Kerr effect (MOKE) measurements. The functionality of multistate synaptic plasticity is demonstrated by understanding the relationship between the anomalous Hall resistance and the applied current pulses, indicating the potential for the device in spintronic memory and neuromorphic computing.

APA, Harvard, Vancouver, ISO, and other styles

We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!