Zaloguj się

Gotowe bibliografie tematyczne / Data layout and Computation Reordering / Artykuły w czasopismach

Kliknij ten link, aby zobaczyć inne rodzaje publikacji na ten temat: Data layout and Computation Reordering.

Artykuły w czasopismach na temat „Data layout and Computation Reordering”

Autor: Grafiati

Data publikacji: 6 września 2023

Utwórz poprawne odniesienie w stylach APA, MLA, Chicago, Harvard i wielu innych

Wybierz rodzaj źródła:

Sprawdź 50 najlepszych artykułów w czasopismach naukowych na temat „Data layout and Computation Reordering”.

Przycisk „Dodaj do bibliografii” jest dostępny obok każdej pracy w bibliografii. Użyj go – a my automatycznie utworzymy odniesienie bibliograficzne do wybranej pracy w stylu cytowania, którego potrzebujesz: APA, MLA, Harvard, Chicago, Vancouver itp.

Możesz również pobrać pełny tekst publikacji naukowej w formacie „.pdf” i przeczytać adnotację do pracy online, jeśli odpowiednie parametry są dostępne w metadanych.

Przeglądaj artykuły w czasopismach z różnych dziedzin i twórz odpowiednie bibliografie.

1

Masselos, K., P. Merakos, T. Stouraitis, and C. E. Goutis. "Computation Reordering: A Novel Transformation for Low Power DSP Synthesis." VLSI Design 10, no. 2 (1999): 177–202. http://dx.doi.org/10.1155/1999/16415.

Pełny tekst źródła

Streszczenie:

A novel architectural transformation for low power synthesis of inner product computational structures is presented. The proposed transformation reorders the sequence of evaluation of the multiply-accumulate operations that form the inner products. Information related to both coefficients, which are statically determined, and data, which are dynamic, is used to drive the reordering of computation. The reordering of computation reduces the switching activity at the inputs of the computational units but inside them as well leading to power consumption reduction. Different classes of algorithms requiring inner product computation are identified and the problem of computation reordering is formulated for each of them. The target architecture to which the proposed transformation applies is based on a power optimal memory organization and is described in detail. Experimental results for several DSP algorithms show that the proposed transformation leads to significant savings in net switching activity and thus in power consumption.

Style APA, Harvard, Vancouver, ISO itp.

2

Shen, Zhao-Li, Yu-Tong Liu, Bruno Carpentieri, Chun Wen, and Jian-Jun Wang. "Recursive reordering and elimination method for efficient computation of PageRank problems." AIMS Mathematics 8, no. 10 (2023): 25104–30. http://dx.doi.org/10.3934/math.20231282.

Pełny tekst źródła

Streszczenie:

<abstract><p>The PageRank model is widely utilized for analyzing a variety of scientific issues beyond its original application in modeling web search engines. In recent years, considerable research effort has focused on developing high-performance iterative methods to solve this model, particularly when the dimension is exceedingly large. However, due to the ever-increasing extent and size of data networks in various applications, the computational requirements of the PageRank model continue to grow. This has led to the development of new techniques that aim to reduce the computational complexity required for the solution. In this paper, we present a recursive 5-type lumping algorithm combined with a two-stage elimination strategy that leverage characteristics about the nonzero structure of the underlying network and the nonzero values of the PageRank coefficient matrix. This method reduces the initial PageRank problem to the solution of a remarkably smaller and sparser linear system. As a result, it leads to significant cost reductions for computing PageRank solutions, particularly in scenarios involving large and/or multiple damping factors. Numerical experiments conducted on over 50 real-world networks demonstrate that the proposed methods can effectively exploit characteristics of PageRank problems for efficient computations.</p></abstract>

Style APA, Harvard, Vancouver, ISO itp.

3

Rodrigues, Thiago Nascimento, Maria Claudia Silva Boeres, and Lucia Catabriga. "Parallel Implementations of RCM Algorithm for Bandwidth Reduction of Sparse Matrices." TEMA (São Carlos) 18, no. 3 (2018): 449. http://dx.doi.org/10.5540/tema.2017.018.03.449.

Pełny tekst źródła

Streszczenie:

The Reverse Cuthill-McKee (RCM) algorithm is a well-known heuristicfor reordering sparse matrices. It is typically used to speed up the computation ofsparse linear systems of equations. This paper describes two parallel approachesfor the RCM algorithm as well as an optimized version of each one based on someproposed enhancements. The first one exploits a strategy for reducing lazy threads,while the second one makes use of a static bucket array as the main data structureand suppress some steps performed by the original algorithm. These related changesled to outstanding reordering time results and significant bandwidth reductions.The performance of two algorithms is compared with the respective implementationmade available by Boost library. The OpenMP framework is used for supportingthe parallelism and both versions of the algorithm are tested with large sparse andstructural symmetric matrices.

Style APA, Harvard, Vancouver, ISO itp.

4

LIU, YING, and WENYUAN LI. "VISUALIZING MICROARRAY DATA FOR BIOMARKER DISCOVERY BY MATRIX REORDERING AND REPLICATOR DYNAMICS." Journal of Bioinformatics and Computational Biology 06, no. 06 (2008): 1089–113. http://dx.doi.org/10.1142/s0219720008003862.

Pełny tekst źródła

Streszczenie:

In most microarray data sets, there are often multiple sample classes, which are categorized into the normal or diseased type. Traditional feature selection methods consider multiple classes equally without paying attention to the upregulation/downregulation across the normal and diseased classes; while the specific gene selection methods for biomarker discovery particularly consider differential gene expressions across the normal and diseased classes, but ignore the existence of multiple classes. More importantly, there are few visualization algorithms to assist biomarker discovery from microarray data. In this paper, to help users visually analyze microarray data and improve biomarker discovery, we propose to employ matrix reordering techniques that have been developed and used in matrix computation. In particular, we generalized a well-known population genetic algorithm, namely, replicator dynamics, to reorder a microarray data matrix with multiple classes. The new algorithm simultaneously takes into account the global between-class data pattern and local within-class data pattern. Our results showed that our matrix reordering algorithm not only provides a visualization method to effectively analyze microarray data on both genes and samples, but also improves the accuracy of classifying the samples.

Style APA, Harvard, Vancouver, ISO itp.

5

DE STEFANO, CLAUDIO, and ANGELO MARCELLI. "AN EFFICIENT METHOD FOR ONLINE CURSIVE HANDWRITING STROKES REORDERING." International Journal of Pattern Recognition and Artificial Intelligence 18, no. 07 (2004): 1157–71. http://dx.doi.org/10.1142/s0218001404003691.

Pełny tekst źródła

Streszczenie:

In the framework of online cursive handwriting recognition, we present an efficient method for reordering the sequence of strokes composing handwriting in two special cases of interest: the horizontal bar of the character "" and the dot of the character "". The proposed method exploits shape information for selecting the strokes that most likely correspond to the features of interest, and layout and topological information for locating the strokes representing the body of the characters to which the features belong to. The method does not depend on the specific algorithm used for detecting the elementary strokes in which the electronic ink may be decomposed into. The performance of our method, evaluated on a data set of cursive words produced by 50 different writers, has shown a correct reordering of the sequence in more than 85% of the cases. Thus, the proposed method allows obtaining a more stable and invariant description of the electronic ink in terms of elementary stroke sequences, and therefore can be helpfully used as a preprocessing step for both segmentation-based and word-based handwriting recognition systems.

Style APA, Harvard, Vancouver, ISO itp.

6

Mehta, Dinesh P. "CLOTH MEASURE: A Software Tool for Estimating the Memory Requirements of Corner Stitching Data Structures." VLSI Design 7, no. 4 (1998): 425–36. http://dx.doi.org/10.1155/1998/64716.

Pełny tekst źródła

Streszczenie:

In a previous paper [1], we derived formulae for estimating the storage requirements of the Rectangular and L-shaped Corner Stitching data structures [2, 3] for a given layout. These formulae require the computation of quantities called violations, which are geometric properties of the layout. In this paper, we present optimal Θ(n log n) algorithms for computing violations, where n is the number of rectangles in the layout. These algorithms are incorporated into a software tool called CLOTH MEASURE. Experiments conducted with CLOTH MEASURE show that it is a viable tool for estimating the memory requirements of a layout without having to implement the corner stitching data structures, which is a tedious and time-consuming task.

Style APA, Harvard, Vancouver, ISO itp.

7

Hoang, Vinh Quoc, and Yuhua Chen. "Cost-Effective Network Reordering Using FPGA." Sensors 23, no. 2 (2023): 819. http://dx.doi.org/10.3390/s23020819.

Pełny tekst źródła

Streszczenie:

The advancement of complex Internet of Things (IoT) devices in recent years has deepened their dependency on network connectivity, demanding low latency and high throughput. At the same time, expanding operating conditions for these devices have brought challenges that limit the design constraints and accessibility for future hardware or software upgrades. These limitations can result in data loss because of out-of-order packets if the design specification cannot keep up with network demands. In addition, existing network reordering solutions become less applicable due to the drastic changes in the type of network endpoints, as IoT devices typically have less memory and are likely to be power-constrained. One approach to address this problem is reordering packets using reconfigurable hardware to ease computation in other functions. Field Programmable Gate Array (FPGA) devices are ideal candidates for hardware implementations at the network endpoints due to their high performance and flexibility. Moreover, previous research on packet reordering using FPGAs has serious design flaws that can lead to unnecessary packet dropping due to blocking in memory. This research proposes a scalable hardware-focused method for packet reordering that can overcome the flaws from previous work while maintaining minimal resource usage and low time complexity. The design utilizes a pipelined approach to perform sorting in parallel and completes the operation within two clock cycles. FPGA resources are optimized using a two-layer memory management system that consumes minimal on-chip memory and registers. Furthermore, the design is scalable to support multi-flow applications with shared memories in a single FPGA chip.

Style APA, Harvard, Vancouver, ISO itp.

8

Meng, Xiankai, Zhuo Zhang, Jianxin Xue, Fangshu Chen, and Jiahui Wang. "Reliability Analysis for Programs with Redundancy Computation for Soft Errors." Journal of Physics: Conference Series 2522, no. 1 (2023): 012022. http://dx.doi.org/10.1088/1742-6596/2522/1/012022.

Pełny tekst źródła

Streszczenie:

Abstract Soft error is one of the factors which may affect the reliability of computer programs. A common method to alleviate the impact of soft errors is redundancy computation, a classical data flow error detection mechanism. However, a program with redundancy computation may still have some vulnerable spots, which might be caused by the flaw during implementation or the instruction reordering given by compiler optimization. Finding the vulnerable spots of a program with redundancy computation is of great significance to evaluate the capability of the error detection mechanism. There are some conventional methods to analyze the reliability of a program under soft errors, such as the irradiation experiment, fault injection, and modeling analysis. However, the irradiation experiment is expensive, fault injection is very time-consuming, and the existing modeling analysis methods have not considered the error detection mechanism. This paper proposes a novel method of reliability analysis for programs with redundancy computation by analyzing the dynamic instruction sequence. Experimental results show that our approach has fairly high accuracy and a false negative rate of about 0.0545.

Style APA, Harvard, Vancouver, ISO itp.

9

MISHRA, SK. "On accelerating the FFT of Cooley and Tukey." MAUSAM 36, no. 2 (2022): 167–72. http://dx.doi.org/10.54302/mausam.v36i2.1833.

Pełny tekst źródła

Streszczenie:

The efficient Fourier transform (EFT) and FFT algorithms are described and their computational efficiencies with respect to the direct method are discussed. An efficient procedure is proposed for the reordering of data set; the use of EFT algorithm for the initial Fourier transforms and restricting the size of final subsets to not less than 4 is also suggested for saving computation time in the FFT. It is found that on average the FFT with the proposed modifications is more than twice as fast as the original FFT. The amount of overhead operations involved in computer routine, based on the modified FFT is estimated.

Style APA, Harvard, Vancouver, ISO itp.

10

Phetkaew, Thimaporn, Wanchai Rivepiboon, and Boonserm Kijsirikul. "Reordering Adaptive Directed Acyclic Graphs for Multiclass Support Vector Machines." Journal of Advanced Computational Intelligence and Intelligent Informatics 7, no. 3 (2003): 315–21. http://dx.doi.org/10.20965/jaciii.2003.p0315.

Pełny tekst źródła

Streszczenie:

The problem of extending binary support vector machines (SVMs) for multiclass classification is still an ongoing research issue. Ussivakul and Kijsirikul proposed the Adaptive Directed Acyclic Graph (ADAG) approach that provides accuracy comparable to that of the standard algorithm-Max Wins and requires low computation. However, different sequences of nodes in the ADAG may provide different accuracy. In this paper we present a new method for multiclass classification, Reordering ADAG, which is the modification of the original ADAG method. We show examples to exemplify that the margin (or 2/|w| value) between two classes of each binary SVM classifier affects the accuracy of classification, and this margin indicates the magnitude of confusion between the two classes. In this paper, we propose an algorithm to choose an optimal sequence of nodes in the ADAG by considering the |w| values of all classifiers to be used in data classification. We then compare our performance with previous methods including the ADAG and the Max Wins algorithm. Experimental results demonstrate that our method gives higher accuracy. Moreover it runs faster than Max Wins, especially when the number of classes and/or the number of dimensions are relatively large.

Style APA, Harvard, Vancouver, ISO itp.

11

Li, Yun, Yanping Chen, Miaoxi Zhao, and Xinxin Zhai. "Optimization of Planning Layout of Urban Building Based on Improved Logit and PSO Algorithms." Complexity 2018 (November 19, 2018): 1–11. http://dx.doi.org/10.1155/2018/9452813.

Pełny tekst źródła

Streszczenie:

There is a huge amount of data in the opportunity of “turning waste into treasure” with the arrival of the big data age. Urban layout is very important for the development of urban transportation and building system. Once the layout of the city is finalized, it will be difficult to start again. Therefore, the urban architectural layout planning and design have a very important impact. This paper uses the urban architecture layout big data for building layout optimization using advanced computation techniques. Firstly, a big data collection and storage system based on the Hadoop platform is established. Then, the evaluation model of urban building planning based on improved logit and PSO algorithm is established. The PSO algorithm is used to find the suitable area for this kind of building layout, and then through five impact indicators: land prices, rail transit, historical protection, road traffic capacity, and commercial potential have been established by using the following logit linear regression model. Then, the bridge between logit and PSO algorithm is established by the fitness value of particle. The particle in the particle swarm is assigned to the index parameter of logit model, and then the logit model in the evaluation system is run. The performance index corresponding to the set of parameters is obtained. The performance index is passed to the PSO as the fitness value of the particle to search for the best adaptive position. The reasonable degree of regional architectural planning is obtained, and the rationality of urban architectural planning layout is determined.

Style APA, Harvard, Vancouver, ISO itp.

12

Kyriacou, Costas, Paraskevas Evripidou, and Pedro Trancoso. "CacheFlow: Cache Optimizations for Data Driven Multithreading." Parallel Processing Letters 16, no. 02 (2006): 229–44. http://dx.doi.org/10.1142/s0129626406002599.

Pełny tekst źródła

Streszczenie:

Data-Driven Multithreading is a non-blocking multithreading model of execution that provides effective latency tolerance by allowing the computation processor do useful work, while a long latency event is in progress. With the Data-Driven Multithreading model, a thread is scheduled for execution only if all of its inputs have been produced and placed in the processor's local memory. Data-driven sequencing leads to irregular memory access patterns that could affect negatively cache performance. Nevertheless, it enables the implementation of short-term optimal cache management policies. This paper presents the implementation of CacheFlow, an optimized cache management policy which eliminates the side effects due to the loss of locality caused by the data-driven sequencing, and reduces further cache misses. CacheFlow employs thread-based prefetching to preload data blocks of threads deemed executable. Simulation results, for nine scientific applications, on a 32-node Data-Driven Multithreaded machine show an average speedup improvement from 19.8 to 22.6. Two techniques to further improve the performance of CacheFlow, conflict avoidance and thread reordering, are proposed and tested. Simulation experiments have shown a speedup improvement of 24% and 32%, respectively. The average speedup for all applications on a 32-node machine with both optimizations is 26.1.

Style APA, Harvard, Vancouver, ISO itp.

13

Mathur, Kapil K., and S. Lennart Johnsson. "All-to-All Communication on the Connection Machine CM-200." Scientific Programming 4, no. 4 (1995): 251–73. http://dx.doi.org/10.1155/1995/637864.

Pełny tekst źródła

Streszczenie:

Detailed algorithms for all-to-all broadcast and reduction are given for arrays mapped by binary or binary-reflected Gray code encoding to the processing nodes of binary cube networks. Algorithms are also given for the local computation of the array indices for the communicated data, thereby reducing the demand for the communications bandwidth. For the Connection Machine system CM-200, Hamiltonian cycle-based all-to-all communication algorithms yield a performance that is a factor of 2 to 10 higher than the performance offered by algorithms based on trees, butterfly networks, or the Connection Machine router. The peak data rate achieved for all-to-all broadcast on a 2,048-node Connection Machine system CM-200 is 5.4 Gbyte/s. The index order of the data in local memory depends on implementation details of the algorithms, but it is well defined. If a linear ordering is desired, then including the time for local data reordering reduces the effective peak data rate to 2.5 Gbyte/s.

Style APA, Harvard, Vancouver, ISO itp.

14

Ginting, Bobby, and Ralf-Peter Mundani. "Comparison of Shallow Water Solvers: Applications for Dam-Break and Tsunami Cases with Reordering Strategy for Efficient Vectorization on Modern Hardware." Water 11, no. 4 (2019): 639. http://dx.doi.org/10.3390/w11040639.

Pełny tekst źródła

Streszczenie:

We investigate in this paper the behaviors of the Riemann solvers (Roe and Harten-Lax-van Leer-Contact (HLLC) schemes) and the Riemann-solver-free method (central-upwind scheme) regarding their accuracy and efficiency for solving the 2D shallow water equations. Our model was devised to be spatially second-order accurate with the Monotonic Upwind Scheme for Conservation Laws (MUSCL) reconstruction for a cell-centered finite volume scheme—and be temporally fourth-order accurate using the Runge–Kutta fourth-order method. Four benchmark cases of dam-break and tsunami events dealing with highly-discontinuous flows and wet–dry problems were simulated. To this end, we applied a reordering strategy for the data structures in our code supporting efficient vectorization and memory access alignment for boosting the performance. Two main features are pointed out here. Firstly, the reordering strategy employed has enabled highly-efficient vectorization for the three solvers investigated on three modern hardware (AVX, AVX2, and AVX-512), where speed-ups of 4.5–6.5× were obtained on the AVX/AVX2 machines for eight data per vector while on the AVX-512 machine we achieved a speed-up of up to 16.7× for 16 data per vector, all with singe-core computation; with parallel simulations, speed-ups of up to 75.7–121.8× and 928.9× were obtained on AVX/AVX2 and AVX-512 machines, respectively. Secondly, we observed that the central-upwind scheme was able to outperform the HLLC and Roe schemes 1.4× and 1.25×, respectively, by exhibiting similar accuracies. This study would be useful for modelers who are interested in developing shallow water codes.

Style APA, Harvard, Vancouver, ISO itp.

15

Chen, Guoxing. "A Data-Driven Intelligent System for Assistive Design of Interior Environments." Computational Intelligence and Neuroscience 2022 (August 25, 2022): 1–11. http://dx.doi.org/10.1155/2022/8409495.

Pełny tekst źródła

Streszczenie:

This paper analyses the design of a healthy interior environment using big data intelligence. The application of big data intelligence in the design of healthy interior environments is necessary because the traditional interior design approaches consume a lot of energy and other problems. Benefited by its strong ability of computation and analytics, artificial intelligence can well improve a series of problems in the field of interior design. The proposal summarizes the sources, classifications, and expressions of behavioral data in interior spaces, carries out analysis and research on behavioral data from two aspects: display space and supermarket space, summarizes the interior methods based on behavioral data, and analyses the visualization application of behavioral data in different interior scenes, to explore the application value of behavioral data in interior design. In contrast to it is the unconscious behavioral response, the biggest characteristic of which is that it is regulated by the behavioral subject’s physiological factors or habits of the behavior issuer. In this paper, we convert the layout recommendation problem of a space into a functional classification problem of segmented segments and household segments on a plane. The scene layout features are extracted by binary coding, the abstraction of the cross features between the vector segments is achieved by using a word embedding algorithm, the feature matrix is reduced in dimensionality, and finally, the segmentation network model and the layout network model are constructed, respectively, by using a bidirectional LSTM. The experiments show that the accuracy of the layout recommendation model in this paper is 98%, which can meet the demand for real-time online layouts.

Style APA, Harvard, Vancouver, ISO itp.

16

Zhu, Chenhong, J. G. Wang, Na Xu, Wei Liang, Bowen Hu, and Peibo Li. "A Combination Approach of the Numerical Simulation and Data-Driven Analysis for the Impacts of Refracturing Layout and Time on Shale Gas Production." Sustainability 14, no. 23 (2022): 16072. http://dx.doi.org/10.3390/su142316072.

Pełny tekst źródła

Streszczenie:

Refracturing can alleviate the rapid decline of shale gas production with a low drilling cost, but an appropriate fracture layout and optimal refracturing time have been unclear without a heavy computation load. This paper proposes a combination approach with a numerical simulation and data-driven analysis to quickly evaluate the impacts of the refracturing layout and refracturing time on shale gas production. Firstly, a multiphysical coupling model with the creep of natural fractures is established for the numerical simulation on shale gas production. Secondly, the effects of the refracturing layout and refracturing time on the shale gas production are investigated through a single factor sensitivity analysis, but this analysis cannot identify the fracture interaction. Thirdly, the influence of fractures interaction on shale gas production is explored through a combination of a global sensitivity analysis (GSA) and an artificial neural network (ANN). The GSA results observed that the adjacent fractures have more salient interferences, which means that a denser fracture network will not significantly increase the total gas production, or will reduce the contribution from each fracture, resulting in higher fracturing costs. The new fractures that are far from existing fractures have greater contributions to cumulative gas production. In addition, the optimal refracturing time varies with the refracturing layout and is optimally implemented within 2–3 years. A suitable refracturing scale and time should be selected, based on the remaining gas reserve. These results can provide reasonable insights for the refracturing design on the refracturing layout and optimal time. This ANN-GSA approach provides a fast evaluation for the optimization of the refracturing layout and time without enormous numerical simulations.

Style APA, Harvard, Vancouver, ISO itp.

17

Filelis-Papadopoulos, Christos K., and George A. Gravvanis. "Hybrid multilevel solution of sparse least-squares linear systems." Engineering Computations 34, no. 8 (2017): 2752–66. http://dx.doi.org/10.1108/ec-10-2016-0353.

Pełny tekst źródła

Streszczenie:

Purpose Large sparse least-squares problems arise in different scientific disciplines such as optimization, data analysis, machine learning and simulation. This paper aims to propose a two-level hybrid direct-iterative scheme, based on novel block independent column reordering, for efficiently solving large sparse least-squares linear systems. Design/methodology/approach Herewith, a novel block column independent set reordering scheme is used to separate the columns in two groups: columns that are block independent and columns that are coupled. The permutation scheme leads to a two-level hierarchy. Using this two-level hierarchy, the solution of the least-squares linear system results in the solution of a reduced size Schur complement-type square linear system, using the preconditioned conjugate gradient (PCG) method as well as backward substitution using the upper triangular factor, computed through sparse Q-less QR factorization of the columns that are block independent. To improve the convergence behavior of the PCG method, the upper triangular factor, computed through sparse Q-less QR factorization of the coupled columns, is used as a preconditioner. Moreover, to further reduce the fill-in, then the column approximate minimum degree (COLAMD) algorithm is used to permute the block consisting of the coupled columns. Findings The memory requirements for solving large sparse least-squares linear systems are significantly reduced compared to Q-less QR decomposition of the original as well as the permuted problem with COLAMD. The memory requirements are reduced further by choosing to form larger blocks of independent columns. The convergence behavior of the iterative scheme is improved due to the chosen preconditioning scheme. The proposed scheme is inherently parallel due to the introduction of block independent column reordering. Originality/value The proposed scheme is a hybrid direct-iterative approach for solving sparse least squares linear systems based on the implicit computation of a two-level approximate pseudo-inverse matrix. Numerical results indicating the applicability and effectiveness of the proposed scheme are given.

Style APA, Harvard, Vancouver, ISO itp.

18

Rieber, Dennis, Axel Acosta, and Holger Fröning. "Joint Program and Layout Transformations to Enable Convolutional Operators on Specialized Hardware Based on Constraint Programming." ACM Transactions on Architecture and Code Optimization 19, no. 1 (2022): 1–26. http://dx.doi.org/10.1145/3487922.

Pełny tekst źródła

Streszczenie:

The success of Deep Artificial Neural Networks (DNNs) in many domains created a rich body of research concerned with hardware accelerators for compute-intensive DNN operators. However, implementing such operators efficiently with complex hardware intrinsics such as matrix multiply is a task not yet automated gracefully. Solving this task often requires joint program and data layout transformations. First solutions to this problem have been proposed, such as TVM, UNIT, or ISAMIR, which work on a loop-level representation of operators and specify data layout and possible program transformations before the embedding into the operator is performed. This top-down approach creates a tension between exploration range and search space complexity, especially when also exploring data layout transformations such as im2col, channel packing, or padding. In this work, we propose a new approach to this problem. We created a bottom-up method that allows the joint transformation of both computation and data layout based on the found embedding. By formulating the embedding as a constraint satisfaction problem over the scalar dataflow, every possible embedding solution is contained in the search space. Adding additional constraints and optimization targets to the solver generates the subset of preferable solutions. An evaluation using the VTA hardware accelerator with the Baidu DeepBench inference benchmark shows that our approach can automatically generate code competitive to reference implementations. Further, we show that dynamically determining the data layout based on intrinsic and workload is beneficial for hardware utilization and performance. In cases where the reference implementation has low hardware utilization due to its fixed deployment strategy, we achieve a geomean speedup of up to × 2.813, while individual operators can improve as much as × 170.

Style APA, Harvard, Vancouver, ISO itp.

19

McGue, Matt, Emily A. Willoughby, Aldo Rustichini, Wendy Johnson, William G. Iacono, and James J. Lee. "The Contribution of Cognitive and Noncognitive Skills to Intergenerational Social Mobility." Psychological Science 31, no. 7 (2020): 835–47. http://dx.doi.org/10.1177/0956797620924677.

Pełny tekst źródła

Streszczenie:

We investigated intergenerational educational and occupational mobility in a sample of 2,594 adult offspring and 2,530 of their parents. Participants completed assessments of general cognitive ability and five noncognitive factors related to social achievement; 88% were also genotyped, allowing computation of educational-attainment polygenic scores. Most offspring were socially mobile. Offspring who scored at least 1 standard deviation higher than their parents on both cognitive and noncognitive measures rarely moved down and frequently moved up. Polygenic scores were also associated with social mobility. Inheritance of a favorable subset of parent alleles was associated with moving up, and inheritance of an unfavorable subset was associated with moving down. Parents’ education did not moderate the association of offspring’s skill with mobility, suggesting that low-skilled offspring from advantaged homes were not protected from downward mobility. These data suggest that cognitive and noncognitive skills as well as genetic factors contribute to the reordering of social standing that takes place across generations.

Style APA, Harvard, Vancouver, ISO itp.

20

Ravikumar, Penugonda, Palla Likhitha, Bathala Venus Vikranth Raj, Rage Uday Kiran, Yutaka Watanobe, and Koji Zettsu. "Efficient Discovery of Periodic-Frequent Patterns in Columnar Temporal Databases." Electronics 10, no. 12 (2021): 1478. http://dx.doi.org/10.3390/electronics10121478.

Pełny tekst źródła

Streszczenie:

Discovering periodic-frequent patterns in temporal databases is a challenging problem of great importance in many real-world applications. Though several algorithms were described in the literature to tackle the problem of periodic-frequent pattern mining, most of these algorithms use the traditional horizontal (or row) database layout, that is, either they need to scan the database several times or do not allow asynchronous computation of periodic-frequent patterns. As a result, this kind of database layout makes the algorithms for discovering periodic-frequent patterns both time and memory inefficient. One cannot ignore the importance of mining the data stored in a vertical (or columnar) database layout. It is because real-world big data is widely stored in columnar database layout. With this motivation, this paper proposes an efficient algorithm, Periodic Frequent-Equivalence CLass Transformation (PF-ECLAT), to find periodic-frequent patterns in a columnar temporal database. Experimental results on sparse and dense real-world and synthetic databases demonstrate that PF-ECLAT is memory and runtime efficient and highly scalable. Finally, we demonstrate the usefulness of PF-ECLAT with two case studies. In the first case study, we have employed our algorithm to identify the geographical areas in which people were periodically exposed to harmful levels of air pollution in Japan. In the second case study, we have utilized our algorithm to discover the set of road segments in which congestion was regularly observed in a transportation network.

Style APA, Harvard, Vancouver, ISO itp.

21

Fuschini, Franco, Marina Barbiroli, Giovanna Calò, et al. "Multi-Level Analysis of On-Chip Optical Wireless Links." Applied Sciences 10, no. 1 (2019): 196. http://dx.doi.org/10.3390/app10010196.

Pełny tekst źródła

Streszczenie:

Networks-on-chip are being regarded as a promising solution to meet the on-going requirement for higher and higher computation capacity. In view of future kilo-cores architectures, electrical wired connections are likely to become inefficient and alternative technologies are being widely investigated. Wireless communications on chip may be therefore leveraged to overcome the bottleneck of physical interconnections. This work deals with wireless networks-on-chip at optical frequencies, which can simplify the network layout and reduce the communication latency, easing the antenna on-chip integration process at the same time. On the other end, optical wireless communication on-chip can be limited by the heavy propagation losses and the possible cross-link interference. Assessment of the optical wireless network in terms of bit error probability and maximum communication range is here investigated through a multi-level approach. Manifold aspects, concurring to the final system performance, are simultaneously taken into account, like the antenna radiation properties, the data-rate of the core-to core communication, the geometrical and electromagnetic layout of the chip and the noise and interference level. Simulations results suggest that communication up to some hundreds of μm can be pursued provided that the antenna design and/or the target data-rate are carefully tailored to the actual layout of the chip.

Style APA, Harvard, Vancouver, ISO itp.

22

Zhang, Qunshan, and George A. McMechan. "Common-image gathers in the incident phase-angle domain from reverse time migration in 2D elastic VTI media." GEOPHYSICS 76, no. 6 (2011): S197—S206. http://dx.doi.org/10.1190/geo2011-0015.1.

Pełny tekst źródła

Streszczenie:

Reverse time migration (RTM) was implemented with a modified crosscorrelation imaging condition for data from 2D elastic vertically transversely isotropy (VTI) media. The computation cost was reduced because scalar qP- and qS-wavefield separations are performed in VTI media, for the source and receiver wavefields only at the RTM imaging time, to calculate the migrated qP and qS images. Angle-domain common-image gathers (CIGs) were extracted from qPqP and qPqS common-source RTM images. The local incident angle was produced as the difference between the qP-wave phase angle, obtained directly from the source wavefield polarization, and the normal to the reflector, calculated as the instantaneous wavenumber direction via a directional Hilbert transform of the stacked image. Angle-domain CIGs were extracted by reordering the prestack-migrated images by local incident phase angle, source by source. Vector decomposition of the source qP-wavefield was required to calculate the qP-wave phase polarization direction for each image point at its imaging time. RTM and angle-domain CIG extraction were successfully implemented and illustrated with a synthetic 2D elastic VTI example.

Style APA, Harvard, Vancouver, ISO itp.

23

Tran, Nhat-Phuong, Myungho Lee, and Sugwon Hong. "Performance Optimization of 3D Lattice Boltzmann Flow Solver on a GPU." Scientific Programming 2017 (2017): 1–16. http://dx.doi.org/10.1155/2017/1205892.

Pełny tekst źródła

Streszczenie:

Lattice Boltzmann Method (LBM) is a powerful numerical simulation method of the fluid flow. With its data parallel nature, it is a promising candidate for a parallel implementation on a GPU. The LBM, however, is heavily data intensive and memory bound. In particular, moving the data to the adjacent cells in the streaming computation phase incurs a lot of uncoalesced accesses on the GPU which affects the overall performance. Furthermore, the main computation kernels of the LBM use a large number of registers per thread which limits the thread parallelism available at the run time due to the fixed number of registers on the GPU. In this paper, we develop high performance parallelization of the LBM on a GPU by minimizing the overheads associated with the uncoalesced memory accesses while improving the cache locality using the tiling optimization with the data layout change. Furthermore, we aggressively reduce the register uses for the LBM kernels in order to increase the run-time thread parallelism. Experimental results on the Nvidia Tesla K20 GPU show that our approach delivers impressive throughput performance: 1210.63 Million Lattice Updates Per Second (MLUPS).

Style APA, Harvard, Vancouver, ISO itp.

24

Wang, Yang, Jie Liu, Xiaoxiong Zhu, Qingyang Zhang, Shengguo Li, and Qinglin Wang. "Improving Structured Grid-Based Sparse Matrix-Vector Multiplication and Gauss–Seidel Iteration on GPDSP." Applied Sciences 13, no. 15 (2023): 8952. http://dx.doi.org/10.3390/app13158952.

Pełny tekst źródła

Streszczenie:

Structured grid-based sparse matrix-vector multiplication and Gauss–Seidel iterations are very important kernel functions in scientific and engineering computations, both of which are memory intensive and bandwidth-limited. GPDSP is a general purpose digital signal processor, which is a very significant embedded processor that has been introduced into high-performance computing. In this paper, we designed various optimization methods, which included a blocking method to improve data locality and increase memory access efficiency, a multicolor reordering method to develop Gauss–Seidel fine-grained parallelism, a data partitioning method designed for GPDSP memory structures, and a double buffering method to overlap computation and access memory on structured grid-based SpMV and Gauss–Seidel iterations for GPDSP. At last, we combined the above optimization methods to design a multicore vectorization algorithm. We tested the matrices generated with structured grids of different sizes on the GPDSP platform and obtained speedups of up to 41× and 47× compared to the unoptimized SpMV and Gauss–Seidel iterations, with maximum bandwidth efficiencies of 72% and 81%, respectively. The experiment results show that our algorithms could fully utilize the external memory bandwidth. We also implemented the commonly used mixed precision algorithm on the GPDSP and obtained speedups of 1.60× and 1.45× for the SpMV and Gauss–Seidel iterations, respectively.

Style APA, Harvard, Vancouver, ISO itp.

25

Farmakis, Panagiotis M., and Athanasios P. Chassiakos. "Genetic algorithm optimization for dynamic construction site layout planning." Organization, Technology and Management in Construction: an International Journal 10, no. 1 (2018): 1655–64. http://dx.doi.org/10.1515/otmcj-2016-0026.

Pełny tekst źródła

Streszczenie:

AbstractThe dynamic construction site layout planning (DCSLP) problem refers to the efficient placement and relocation of temporary construction facilities within a dynamically changing construction site environment considering the characteristics of facilities and work interrelationships, the shape and topography of the construction site, and the time-varying project needs. A multi-objective dynamic optimization model is developed for this problem that considers construction and relocation costs of facilities, transportation costs of resources moving from one facility to another or to workplaces, as well as safety and environmental considerations resulting from facilities’ operations and interconnections. The latter considerations are taken into account in the form of preferences or constraints regarding the proximity or remoteness of particular facilities to other facilities or work areas. The analysis of multiple project phases and the dynamic facility relocation from phase to phase highly increases the problem size, which, even in its static form, falls within the NP (for Nondeterministic Polynomial time)-hard class of combinatorial optimization problems. For this reason, a genetic algorithm has been implemented for the solution due to its capability to robustly search within a large solution space. Several case studies and operational scenarios have been implemented through the Palisade’s Evolver software for model testing and evaluation. The results indicate satisfactory model response to time-varying input data in terms of solution quality and computation time. The model can provide decision support to site managers, allowing them to examine alternative scenarios and fine-tune optimal solutions according to their experience by introducing desirable preferences or constraints in the decision process.

Style APA, Harvard, Vancouver, ISO itp.

26

Jakubowski, Aleksander, Leszek Jarzebowicz, Mikołaj Bartłomiejczyk, et al. "Modeling of Electrified Transportation Systems Featuring Multiple Vehicles and Complex Power Supply Layout." Energies 14, no. 24 (2021): 8196. http://dx.doi.org/10.3390/en14248196.

Pełny tekst źródła

Streszczenie:

The paper proposes a novel approach to modeling electrified transportation systems. The proposed solution reflects the mechanical dynamics of vehicles as well as the distribution and losses of electric supply. Moreover, energy conversion losses between the mechanical and electrical subsystems and their bilateral influences are included. Such a complete model makes it possible to replicate, e.g., the impact of voltage drops on vehicle acceleration or the necessity of partial disposal of regenerative braking energy due to temporary lack of power transmission capability. The modeling methodology uses a flexible twin data-bus structure, which poses no limitation on the number of vehicles and enables modeling complex traction power supply structures. The proposed solution is suitable for various electrified transportation systems including suburban and urban systems. The modeling methodology is applicable i.a. to Matlab/Simulink, which makes it broadly available and customizable, and provides short computation time. The applicability and accuracy of the method were verified by comparing simulation and measurement results on an exemplary trolleybus system operating in Pilsen, Czech Republic. Simulation of daily operation of an area including four supply sections and maximal simultaneous number of nine vehicles showed a good conformance with the measured data, with the difference in the total consumed energy not exceeding 5%.

Style APA, Harvard, Vancouver, ISO itp.

27

Ohmori, Shunichi, and Kazuho Yoshimoto. "A Primal-Dual Interior-Point Method for Facility Layout Problem with Relative-Positioning Constraints." Algorithms 14, no. 2 (2021): 60. http://dx.doi.org/10.3390/a14020060.

Pełny tekst źródła

Streszczenie:

We consider the facility layout problem (FLP) in which we find the arrangements of departments with the smallest material handling cost that can be expressed as the product of distance times flows between departments. It is known that FLP can be formulated as a linear programming problem if the relative positioning of departments is specified, and, thus, can be solved to optimality. In this paper, we describe a custom interior-point algorithm for solving FLP with relative positioning constraints (FLPRC) that is much faster than the standard methods used in the general-purpose solver. We build a compact formation of FLPRC and its duals, which enables us to establish the optimal condition very quickly. We use this optimality condition to implement the primal-dual interior-point method with an efficient Newton step computation that exploit special structure of a Hessian. We confirm effectiveness of our proposed model through applications to several well-known benchmark data sets. Our algorithm shows much faster speed for finding the optimal solution.

Style APA, Harvard, Vancouver, ISO itp.

28

He, Peng, Zhen Zhou, Lian Peng Wang, and Na Wang. "Numerical Simulation Software Design on Protein Particle Detection." Advanced Materials Research 345 (September 2011): 223–27. http://dx.doi.org/10.4028/www.scientific.net/amr.345.223.

Pełny tekst źródła

Streszczenie:

Light scattering approach and optical measuring instrument are the most popular system in the protein aggregation detection, especially in the therapeutic and diagnostic protein industry. Microsoft windows based software written in Matlab was developed as a teaching/research tool for designing light particle instrument and detection systems. This software incorporates information for lots of optical parameters, formulas and methods. There is a choice among 3 light scattering system design style. The software output includes advising parameter of light particle instrument and detection system, intensity of light scattering at some angle, characteristic curves of light scattering style. The software also provides the capability of visualizing the design layout and performs numerical computation to analyze the experimental data.

Style APA, Harvard, Vancouver, ISO itp.

29

Li, Guohua, An Liu, and Huajie Shen. "A Massive Image Recognition Algorithm Based on Attribute Modelling and Knowledge Acquisition." Advances in Mathematical Physics 2021 (November 24, 2021): 1–12. http://dx.doi.org/10.1155/2021/4632070.

Pełny tekst źródła

Streszczenie:

In this paper, an in-depth study and analysis of attribute modelling and knowledge acquisition of massive images are conducted using image recognition. For the complexity of association relationships between attributes of incomplete data, a single-output subnetwork modelling method for incomplete data is proposed to build a neural network model with each missing attribute as output alone and other attributes as input in turn, and the network structure can deeply portray the association relationships between each attribute and other attributes. To address the problem of incomplete model inputs due to the presence of missing values, we propose to treat and describe the missing values as system-level variables and realize the alternate update of network parameters and dynamic filling of missing values through iterative learning among subnets. The method can effectively utilize the information of all the present attribute values in incomplete data, and the obtained subnetwork population model is a fit to the attribute association relationships implied by all the present attribute values in incomplete data. The strengths and weaknesses of existing image semantic modelling algorithms are analysed. To reduce the workload of manually labelling data, this paper proposes the use of a streaming learning algorithm to automatically pass image-level semantic labels to pixel regions of an image, where the algorithm does not need to rely on external detectors and a priori knowledge of the dataset. Then, an efficient deep neural network mapping algorithm is designed and implemented for the microprocessing architecture and software programming framework of this edge processor, and a layout scheme is proposed to place the input feature maps outside the kernel DDR and the reordered convolutional kernel matrices inside the kernel storage body and to design corresponding efficient vectorization algorithms for the multidimensional matrix convolution computation, multidimensional pooling computation, local linear normalization, etc., which exist in the deep convolutional neural network model. The efficient vectorized mapping scheme is designed for the multidimensional matrix convolution computation, multidimensional pooling computation, local linear normalization, etc. in the deep convolutional neural network model so that the utilization of MAC components in the core loop can reach 100%.

Style APA, Harvard, Vancouver, ISO itp.

30

Holdener, D., S. Nebiker, and S. Blaser. "DESIGN AND IMPLEMENTATION OF A NOVEL PORTABLE 360° STEREO CAMERA SYSTEM WITH LOW-COST ACTION CAMERAS." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W8 (November 13, 2017): 105–10. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w8-105-2017.

Pełny tekst źródła

Streszczenie:

The demand for capturing indoor spaces is rising with the digitalization trend in the construction industry. An efficient solution for measuring challenging indoor environments is mobile mapping. Image-based systems with 360° panoramic coverage allow a rapid data acquisition and can be processed to georeferenced 3D images hosted in cloud-based 3D geoinformation services. For the multiview stereo camera system presented in this paper, a 360° coverage is achieved with a layout consisting of five horizontal stereo image pairs in a circular arrangement. The design is implemented as a low-cost solution based on a 3D printed camera rig and action cameras with fisheye lenses. The fisheye stereo system is successfully calibrated with accuracies sufficient for the applied measurement task. A comparison of 3D distances with reference data delivers maximal deviations of 3 cm on typical distances in indoor space of 2-8 m. Also the automatic computation of coloured point clouds from the stereo pairs is demonstrated.

Style APA, Harvard, Vancouver, ISO itp.

31

Pilati, Francesco, Emilio Ferrari, Mauro Gamberi, and Silvia Margelli. "Multi-Manned Assembly Line Balancing: Workforce Synchronization for Big Data Sets through Simulated Annealing." Applied Sciences 11, no. 6 (2021): 2523. http://dx.doi.org/10.3390/app11062523.

Pełny tekst źródła

Streszczenie:

The assembly of large and complex products such as cars, trucks, and white goods typically involves a huge amount of production resources such as workers, pieces of equipment, and layout areas. In this context, multi-manned workstations commonly characterize these assembly lines. The simultaneous operators’ activity in the same assembly station suggests considering compatibility/incompatibility between the different mounting positions, equipment sharing, and worker cooperation. The management of all these aspects significantly increases the balancing problem complexity due to the determination of the start/end times of each task. This paper proposes a new mixed-integer programming model to simultaneously optimize the line efficiency, the line length, and the workload smoothness. A customized procedure based on a simulated annealing algorithm is developed to effectively solve this problem. The aforementioned procedure is applied to the balancing of the real assembly line of European sports car manufacturers distinguished by 665 tasks and numerous synchronization constraints. The experimental results present remarkable performances obtained by the proposed procedure both in terms of solution quality and computation time. The proposed approach is the practical reference for efficient multi-manned assembly line design, task assignment, equipment allocation, and mounting position management in the considered industrial fields.

Style APA, Harvard, Vancouver, ISO itp.

32

Novaković, Vedran. "Batched Computation of the Singular Value Decompositions of Order Two by the AVX-512 Vectorization." Parallel Processing Letters 30, no. 04 (2020): 2050015. http://dx.doi.org/10.1142/s0129626420500152.

Pełny tekst źródła

Streszczenie:

In this paper a vectorized algorithm for simultaneously computing up to eight singular value decompositions (SVDs, each of the form [Formula: see text]) of real or complex matrices of order two is proposed. The algorithm extends to a batch of matrices of an arbitrary length [Formula: see text], that arises, for example, in the annihilation part of the parallel Kogbetliantz algorithm for the SVD of matrices of order [Formula: see text]. The SVD method for a single matrix of order two is derived first. It scales, in most instances error-free, the input matrix [Formula: see text] such that the scaled singular values cannot overflow whenever the elements of [Formula: see text] are finite, and then computes the URV factorization of the scaled matrix, followed by the SVD of the non-negative upper-triangular middle factor. A vector-friendly data layout for the batch is then introduced, where the same-indexed elements of each of the input and the output matrices form vectors, and the algorithm’s steps over such vectors are described. The vectorized approach is shown to be about three times faster than processing each matrix in the batch separately, while slightly improving accuracy over the straightforward method for the [Formula: see text] SVD.

Style APA, Harvard, Vancouver, ISO itp.

33

Diaz-Guerra, Francisco, and Angel Jimenez-Molina. "Continuous Prediction of Web User Visual Attention on Short Span Windows Based on Gaze Data Analytics." Sensors 23, no. 4 (2023): 2294. http://dx.doi.org/10.3390/s23042294.

Pełny tekst źródła

Streszczenie:

Understanding users’ visual attention on websites is paramount to enhance the browsing experience, such as providing emergent information or dynamically adapting Web interfaces. Existing approaches to accomplish these challenges are generally based on the computation of salience maps of static Web interfaces, while websites increasingly become more dynamic and interactive. This paper proposes a method and provides a proof-of-concept to predict user’s visual attention on specific regions of a website with dynamic components. This method predicts the regions of a user’s visual attention without requiring a constant recording of the current layout of the website, but rather by knowing the structure it presented in a past period. To address this challenge, the concept of visit intention is introduced in this paper, defined as the probability that a user, while browsing, will fixate their gaze on a specific region of the website in the next period. Our approach uses the gaze patterns of a population that browsed a specific website, captured via an eye-tracker device, to aid personalized prediction models built with individual visual kinetics features. We show experimentally that it is possible to conduct such a prediction through multilabel classification models using a small number of users, obtaining an average area under curve of 84.3%, and an average accuracy of 79%. Furthermore, the user’s visual kinetics features are consistently selected in every set of a cross-validation evaluation.

Style APA, Harvard, Vancouver, ISO itp.

34

Vrublova, Dana, Roman Kapica, Beata Gibesova, Jaroslav Mudruňka, and Adam Struś. "APPLICATION OF GNSS TECHNOLOGY IN SURFACE MINING." Geodesy and cartography 42, no. 4 (2016): 122–28. http://dx.doi.org/10.3846/20296991.2016.1268433.

Pełny tekst źródła

Streszczenie:

VŠB – Technical University of Ostrava, institute of geodesy and mine surveying has been cooperating with Severočeske doly j.s.c. (SD) in important research project since 2007. The main goal is improve control system for opencast mining. Two bucket wheel excavators (K800/103 and KU300/27) were equipped with measurement hardware at the Libouš Lignite Mine (North Bohemia brown coal basin). The position of the bucket wheel centre is computed by means of GNSS data, inclinometer and incremental measurements. Data is transferred to a base. All the values measured are saved in this database. The surface layout of the mine as well as positions of underground geological layers are updated on a regular basis in the digital model of the mine. The main aim of the research is verifying the system in connection to digital model for short time prognosis of qualitative parameters of coal (Ad, Sd, Qr, Wr and MS), continuous automatic computation of mined materials (m3, tons) and continuous checking of creation of the movement surface/plane of the excavator and mining goals. Mine surveyors have a lead role in the working team. The paper describes possibilities of using the GNSS for mine surveying and for production planning.

Style APA, Harvard, Vancouver, ISO itp.

35

Sang, Xiaoting, Zhenghui Hu, Huanyu Li, Chunlei Li, and Zhoufeng Liu. "A Block-Based and Highly Parallel CNN Accelerator for Seed Sorting." Journal of Electrical and Computer Engineering 2022 (November 17, 2022): 1–16. http://dx.doi.org/10.1155/2022/5608573.

Pełny tekst źródła

Streszczenie:

Seed sorting is critical for the breeding industry to improve the agricultural yield. The seed sorting methods based on convolutional neural networks (CNNs) have achieved excellent recognition accuracy on large-scale pretrained network models. However, CNN inference is a computationally intensive process that often requires hardware acceleration to operate in real time. For embedded devices, the high-power consumption of graphics processing units (GPUs) is generally prohibitive, and the field programmable gate array (FPGA) becomes a solution to perform high-speed inference by providing a customized accelerator for a particular user. To date, the recognition speeds of the FPGA-based universal accelerators for high-throughput seed sorting tasks are slow, which cannot guarantee real-time seed sorting. Therefore, a block-based and highly parallel MobileNetV2 accelerator is proposed in this paper. First, a hardware-friendly quantization method that uses only fixed-point operation is designed to reduce resource consumption. Then, the block convolution strategy is proposed to avoid latency and energy consumption increase caused by large-scale intermediate result off-chip data transfers. Finally, two scalable computing engines are explicitly designed for depth-wise convolution (DWC) and point-wise convolution (PWC) to develop the high parallelism of block convolution computation. Moreover, an efficient memory system with a double buffering mechanism and new data reordering mode is designed to address the imbalance between memory access and parallel computing. Our proposed FPGA-based MobileNetV2 accelerator for real-time seed sorting is implemented and evaluated on the platform of Xilinx XC7020. Experimental results demonstrate that our implementation can achieve about 29.4 frames per second (FPS) and 10.86 Giga operations per second (GOPS), and 0.92× to 5.70 × DSP-efficiency compared with previous FPGA-based accelerators.

Style APA, Harvard, Vancouver, ISO itp.

36

Fink, Martin, and Denis Noble. "Markov models for ion channels: versatility versus identifiability and speed." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 367, no. 1896 (2009): 2161–79. http://dx.doi.org/10.1098/rsta.2008.0301.

Pełny tekst źródła

Streszczenie:

Markov models (MMs) represent a generalization of Hodgkin–Huxley models. They provide a versatile structure for modelling single channel data, gating currents, state-dependent drug interaction data, exchanger and pump dynamics, etc. This paper uses examples from cardiac electrophysiology to discuss aspects related to parameter estimation. (i) Parameter unidentifiability (found in 9 out of 13 of the considered models) results in an inability to determine the correct layout of a model, contradicting the idea that model structure and parameters provide insights into underlying molecular processes. (ii) The information content of experimental voltage step clamp data is discussed, and a short but sufficient protocol for parameter estimation is presented. (iii) MMs have been associated with high computational cost (owing to their large number of state variables), presenting an obstacle for multicellular whole organ simulations as well as parameter estimation. It is shown that the stiffness of models increases computation time more than the number of states. (iv) Algorithms and software programs are provided for steady-state analysis, analytical solutions for voltage steps and numerical derivation of parameter identifiability. The results provide a new standard for ion channel modelling to further the automation of model development, the validation process and the predictive power of these models.

Style APA, Harvard, Vancouver, ISO itp.

37

Huang, Lei, Haoqiang Jin, Liqi Yi, and Barbara Chapman. "Enabling Locality-Aware Computations in OpenMP." Scientific Programming 18, no. 3-4 (2010): 169–81. http://dx.doi.org/10.1155/2010/185421.

Pełny tekst źródła

Streszczenie:

Locality of computation is key to obtaining high performance on a broad variety of parallel architectures and applications. It is moreover an essential component of strategies for energy-efficient computing. OpenMP is a widely available industry standard for shared memory programming. With the pervasive deployment of multi-core computers and the steady growth in core count, a productive programming model such as OpenMP is increasingly expected to play an important role in adapting applications to this new hardware. However, OpenMP does not provide the programmer with explicit means to program for locality. Rather it presents the user with a “flat” memory model. In this paper, we discuss the need for explicit programmer control of locality within the context of OpenMP and present some ideas on how this might be accomplished. We describe potential extensions to OpenMP that would enable the user to manage a program's data layout and to align tasks and data in order to minimize the cost of data accesses. We give examples showing the intended use of the proposed features, describe our current implementation and present some experimental results. Our hope is that this work will lead to efforts that would help OpenMP to be a major player on emerging, multi- and many-core architectures.

Style APA, Harvard, Vancouver, ISO itp.

38

Bay, Christopher J., Paul Fleming, Bart Doekemeijer, Jennifer King, Matt Churchfield, and Rafael Mudafort. "Addressing deep array effects and impacts to wake steering with the cumulative-curl wake model." Wind Energy Science 8, no. 3 (2023): 401–19. http://dx.doi.org/10.5194/wes-8-401-2023.

Pełny tekst źródła

Streszczenie:

Abstract. Wind farm design and analysis heavily rely on computationally efficient engineering models that are evaluated many times to find an optimal solution. A recent article compared the state-of-the-art Gauss-curl hybrid (GCH) model to historical data of three offshore wind farms. Two points of model discrepancy were identified therein: poor wake predictions for turbines experiencing a lot of wakes and wake interactions between two turbines over long distances. The present article addresses those two concerns and presents the cumulative-curl (CC) model. Comparison of the CC model to high-fidelity simulation data and historical data of three offshore wind farms confirms the improved accuracy of the CC model over the GCH model in situations with large wake losses and wake recovery over large inter-turbine distances. Additionally, the CC model performs comparably to the GCH model for single- and fewer-turbine wake interactions, which were already accurately modeled. Lastly, the CC model has been implemented in a vectorized form, greatly reducing the computation time for many wind conditions. The CC model now enables reliable simulation studies for both small and large offshore wind farms at a low computational cost, thereby making it an ideal candidate for wake-steering optimization and layout optimization.

Style APA, Harvard, Vancouver, ISO itp.

39

Cho, Hyungmin. "RiSA: A Reinforced Systolic Array for Depthwise Convolutions and Embedded Tensor Reshaping." ACM Transactions on Embedded Computing Systems 20, no. 5s (2021): 1–20. http://dx.doi.org/10.1145/3476984.

Pełny tekst źródła

Streszczenie:

Depthwise convolutions are widely used in convolutional neural networks (CNNs) targeting mobile and embedded systems. Depthwise convolution layers reduce the computation loads and the number of parameters compared to the conventional convolution layers. Many deep neural network (DNN) accelerators adopt an architecture that exploits the high data-reuse factor of DNN computations, such as a systolic array. However, depthwise convolutions have low data-reuse factor and under-utilize the processing elements (PEs) in systolic arrays. In this paper, we present a DNN accelerator design called RiSA, which provides a novel mechanism that boosts the PE utilization for depthwise convolutions on a systolic array with minimal overheads. In addition, the PEs in systolic arrays can be efficiently used only if the data items ( tensors ) are arranged in the desired layout. Typical DNN accelerators provide various types of PE interconnects or additional modules to flexibly rearrange the data items and manage data movements during DNN computations. RiSA provides a lightweight set of tensor management tasks within the PE array itself that eliminates the need for an additional module for tensor reshaping tasks. Using this embedded tensor reshaping, RiSA supports various DNN models, including convolutional neural networks and natural language processing models while maintaining a high area efficiency. Compared to Eyeriss v2, RiSA improves the area and energy efficiency for MobileNet-V1 inference by 1.91× and 1.31×, respectively.

Style APA, Harvard, Vancouver, ISO itp.

40

Yang, Hsiao-Fang, and Heng-Li Yang. "Development of a self-design system for greeting cards on the basis of interactive evolutionary computation." Kybernetes 45, no. 3 (2016): 521–35. http://dx.doi.org/10.1108/k-07-2015-0178.

Pełny tekst źródła

Streszczenie:

Purpose – User-centered product designs have been attracting increasing attention, particularly in digital design. In interacting with the design support system, designers may face problems such as changing demands (e.g. unclear demands) and insufficient descriptions of these demands (e.g. data scarcity). The purpose of this paper is to build a design support system prototype for demonstrating the feasibility of meeting the high involvement of users in digital products. Design/methodology/approach – Interactive evolutionary computation is applied. Findings – A prototype of self-design greeting card system (SDGCS) was proposed. It provides professional design layouts, offers users numerous self-design models, and allows nonprofessional users to easily design greeting cards. The results of this study show that users were satisfied with the functionality, usefulness, and ease-of-use of the SDGCS. Research limitations/implications – This study used digital card design as an example for demonstrating the feasibility of satisfying the unclear needs of uses, enabling users to design a digital card creatively and complete their designs quickly. However, the current system only supports the design of static objects and layout of card. And the evaluation sample size was small, which might affect generalizability of the findings. Practical implications – In practice, greeting card web operators can image the feasible business models by providing the attraction of self-design functionalities. Originality/value – In current human-centric marketing era, consumers have begun to request interaction with designers in creating the value of a product. However, very few previous studies have provided support for digital product self-design. This study demonstrated the feasibility of satisfying the needs of self-design.

Style APA, Harvard, Vancouver, ISO itp.

41

Chang, Wenhao, Xiaopei Cai, and Qihao Wang. "Dynamic characteristic difference of steel-spring floating slab track between single-carriage and multi-carriage models." Noise & Vibration Worldwide 52, no. 6 (2021): 156–67. http://dx.doi.org/10.1177/0957456521999868.

Pełny tekst źródła

Streszczenie:

The steel-spring floating slab track (SSFST) is a low-stiffness structure, sensitive to the vehicle loads. Due to the coupling effect of the superposition of adjacent bogies, it is difficult for conventional single-carriage models to meet the simulation requirements. To find a balance between computation efficiency and authenticity of analytical model results, the influence of carriage number on SSFST should be studied. Based on the finite element method and multi-body dynamics, a refined three-dimensional coupled model of multi-carriage-SSFST-tunnel was established. The difference in the dynamic response of the SSFST between single-carriage and multi-carriage models was analyzed and compared with the measured data. The field test results show that structural displacements and accelerations under the two-carriage model are much closer to the measured data. The dynamic model analysis results show that the maximum displacement of the rail and SSFST in the midspan of the slab increase by 0.48 mm and 0.34 mm under the multi-carriage model, and the vibration reduction effectiveness increases by 1.4–2.0 dB. Dynamic responses of the rail and SSFST show minor differences under the two-carriage and three-carriage models. The article is expected to provide a reference for the theoretical research, design, and layout optimization of subway SSFST.

Style APA, Harvard, Vancouver, ISO itp.

42

Hinman, James R., Holger Dannenberg, Andrew S. Alexander, and Michael E. Hasselmo. "Neural mechanisms of navigation involving interactions of cortical and subcortical structures." Journal of Neurophysiology 119, no. 6 (2018): 2007–29. http://dx.doi.org/10.1152/jn.00498.2017.

Pełny tekst źródła

Streszczenie:

Animals must perform spatial navigation for a range of different behaviors, including selection of trajectories toward goal locations and foraging for food sources. To serve this function, a number of different brain regions play a role in coding different dimensions of sensory input important for spatial behavior, including the entorhinal cortex, the retrosplenial cortex, the hippocampus, and the medial septum. This article will review data concerning the coding of the spatial aspects of animal behavior, including location of the animal within an environment, the speed of movement, the trajectory of movement, the direction of the head in the environment, and the position of barriers and objects both relative to the animal’s head direction (egocentric) and relative to the layout of the environment (allocentric). The mechanisms for coding these important spatial representations are not yet fully understood but could involve mechanisms including integration of self-motion information or coding of location based on the angle of sensory features in the environment. We will review available data and theories about the mechanisms for coding of spatial representations. The computation of different aspects of spatial representation from available sensory input requires complex cortical processing mechanisms for transformation from egocentric to allocentric coordinates that will only be understood through a combination of neurophysiological studies and computational modeling.

Style APA, Harvard, Vancouver, ISO itp.

43

Ozherelev, D. A., V. V. Shalai, and I. A. Ridel. "Study of the operating efficiency of centrifugal separators for gas preparation." Proceedings of Higher Educational Institutions. Маchine Building, no. 9 (750) (September 2022): 63–72. http://dx.doi.org/10.18698/0536-1044-2022-9-63-72.

Pełny tekst źródła

Streszczenie:

There exist two ways to intensify production in an oil and gas facility: the first one involves accepting associated petroleum gas for treatment from third-party subsoil users and supplying gas to the main gas pipeline in accordance with the technical specifications, while the second one concerns upgrading crucial processing equipment, which includes separators. Associated petroleum gas as mixed with natural gas affects the separation process in terms of a significant decrease in separator efficiency for the same set of operational parameters due to increasing the mass flow rate. In turn, this low separation efficiency results in the separated gas ablating the liquid phase. This factor varies over a wide range and depends on the design and actual performance of the separator, as well as on the pressure, temperature and composition of the gas mixture supplied. We consider a tentative layout for a centrifugal separator of a combined design for treating natural gas containing a quantity of associated petroleum gas. The paper presents numerical computation results for the separation simulation, as well as data obtained during actual separator operation for different heat and pressure values. We established that the separator design proposed provides high efficiency of gas treatment.

Style APA, Harvard, Vancouver, ISO itp.

44

V, Karthik, Savita Chaudhary, and Radhika A D. "Feature Extraction in Music information retrival using Machine Learning Algorithms." International Journal of Data Informatics and Intelligent Computing 1, no. 1 (2022): 1–10. http://dx.doi.org/10.59461/ijdiic.v1i1.11.

Pełny tekst źródła

Streszczenie:

Music classification is essential for faster Music record recovery. Separating the ideal arrangement of highlights and selecting the best investigation technique are critical for obtaining the best results from sound grouping. The extraction of sound elements could be viewed as an exceptional case of information sound information being transformed into sound instances. Music division and order can provide a rich dataset for the analysis of sight and sound substances. Because of the great dimensionality of sound highlights as well as the variable length of sound fragments, Music layout is dependent on the overpowering computation. By focusing on rhythmic aspects of different songs, this article provides an introduction of some of the possibilities for computing music similarity. Almost every MIR toolkit includes a method for extracting the beats per minute (BPM) and consequently the tempo of each music. The simplest method of computing very low-level rhythmic similarities is to sort and compare songs solely by their tempo There are undoubtedly far better and more precise solutions. work discusses some of the most promising ways for computing rhythm similarities in a Big Data framework usaing machine Learning algorithms.

Style APA, Harvard, Vancouver, ISO itp.

45

Hussain, Md Muzakkir, Ahmad Taher Azar, Rafeeq Ahmed, et al. "SONG: A Multi-Objective Evolutionary Algorithm for Delay and Energy Aware Facility Location in Vehicular Fog Networks." Sensors 23, no. 2 (2023): 667. http://dx.doi.org/10.3390/s23020667.

Pełny tekst źródła

Streszczenie:

With the emergence of delay- and energy-critical vehicular applications, forwarding sense-actuate data from vehicles to the cloud became practically infeasible. Therefore, a new computational model called Vehicular Fog Computing (VFC) was proposed. It offloads the computation workload from passenger devices (PDs) to transportation infrastructures such as roadside units (RSUs) and base stations (BSs), called static fog nodes. It can also exploit the underutilized computation resources of nearby vehicles that can act as vehicular fog nodes (VFNs) and provide delay- and energy-aware computing services. However, the capacity planning and dimensioning of VFC, which come under a class of facility location problems (FLPs), is a challenging issue. The complexity arises from the spatio-temporal dynamics of vehicular traffic, varying resource demand from PD applications, and the mobility of VFNs. This paper proposes a multi-objective optimization model to investigate the facility location in VFC networks. The solutions to this model generate optimal VFC topologies pertaining to an optimized trade-off (Pareto front) between the service delay and energy consumption. Thus, to solve this model, we propose a hybrid Evolutionary Multi-Objective (EMO) algorithm called Swarm Optimized Non-dominated sorting Genetic algorithm (SONG). It combines the convergence and search efficiency of two popular EMO algorithms: the Non-dominated Sorting Genetic Algorithm (NSGA-II) and Speed-constrained Particle Swarm Optimization (SMPSO). First, we solve an example problem using the SONG algorithm to illustrate the delay–energy solution frontiers and plotted the corresponding layout topology. Subsequently, we evaluate the evolutionary performance of the SONG algorithm on real-world vehicular traces against three quality indicators: Hyper-Volume (HV), Inverted Generational Distance (IGD) and CPU delay gap. The empirical results show that SONG exhibits improved solution quality over the NSGA-II and SMPSO algorithms and hence can be utilized as a potential tool by the service providers for the planning and design of VFC networks.

Style APA, Harvard, Vancouver, ISO itp.

46

Xu, Lewei, Zhuhua Hu, Chong Zhang, and Wei Wu. "Remote Sensing Image Segmentation of Mariculture Cage Using Ensemble Learning Strategy." Applied Sciences 12, no. 16 (2022): 8234. http://dx.doi.org/10.3390/app12168234.

Pełny tekst źródła

Streszczenie:

In harbour areas, the irrational layout and high density of mariculture cages can lead to a dramatic deterioration of the culture’s ecology. Therefore, it is important to analyze and regulate the distribution of cages using intelligent analysis based on deep learning. We propose a remote sensing image segmentation method based on the Swin Transformer and ensemble learning strategy. Firstly, we collect multiple remote sensing images of cages and annotate them, while using data expansion techniques to construct a remote sensing image dataset of mariculture cages. Secondly, the Swin Transformer is used as the backbone network to extract the remote sensing image features of cages. A strategy of alternating the local attention module and the global attention module is used for model training, which has the benefit of reducing the attention computation while exchanging global information. Then, the ensemble learning strategy is used to improve the accuracy of remote sensing cage segmentation. We carry out quantitative and qualitative analyses of remote sensing image segmentation of cages at the ports of Li’an, Xincun and Potou in Hainan Province, China. The results show that our proposed segmentation scheme has significant performance improvement compared to other models. In particular, the mIoU reaches 82.34% and pixel accuracy reaches 99.71%.

Style APA, Harvard, Vancouver, ISO itp.

47

Derbel, Khaoula, and Károly Beneda. "Sliding Mode Control for Micro Turbojet Engine Using Turbofan Power Ratio as Control Law." Energies 13, no. 18 (2020): 4841. http://dx.doi.org/10.3390/en13184841.

Pełny tekst źródła

Streszczenie:

The interest in turbojet engines was emerging in the past years due to their simplicity. The purpose of this article is to investigate sliding mode control (SMC) for a micro turbojet engine based on an unconventional compound thermodynamic parameter called Turbofan Power Ratio (TPR) and prove its advantage over traditional linear methods and thrust parameters. Based on previous research by the authors, TPR can be applied to single stream turbojet engines as it varies proportionally to thrust, thus it is suitable as control law. The turbojet is modeled by a linear, parameter-varying structure, and variable structure sliding mode control has been selected to control the system, as it offers excellent disturbance rejection and provides robustness against discrepancies between mathematical model and real plant as well. Both model and control system have been created in MATLAB® Simulink®, data from real measurement have been taken to evaluate control system performance. The same assessment is conducted with conventional Proportional-Integral-Derivative (PID) controllers and showed the superiority of SMC, furthermore TPR computation using turbine discharge temperature was proven. Based on the results of the simulation, a controller layout is proposed and its feasibility is investigated. The utilization of TPR results in more accurate thrust output, meanwhile it allows better insight into the thermodynamic process of the engine, hence it carries an additional diagnostic possibility.

Style APA, Harvard, Vancouver, ISO itp.

48

Ma, Song, Jianguo Tan, Xiankai Li, and Jiang Hao. "The effect analysis of an engine jet on an aircraft blast deflector." Transactions of the Institute of Measurement and Control 41, no. 4 (2018): 990–1001. http://dx.doi.org/10.1177/0142331218755892.

Pełny tekst źródła

Streszczenie:

This paper establishes a novel mathematical model for computing the plume flow field of a carrier-based aircraft engine. Its objective is to study the impact of jet exhaust gases with high temperature, high speed and high pressure on the jet blast deflector. The working condition of the nozzle of a fully powered on engine is first determined. The flow field of the exhaust jet is then numerically simulated at different deflection angle using the three-dimensional Reynolds averaged Navier–Stokes equations and the standard [Formula: see text]-[Formula: see text] turbulence method. Moreover, infra-red temperature tests are further carried out to test the temperature field when the jet blast deflector is at the [Formula: see text] deflection angle. The comparison between the simulation results and the experimental results show that the proposed computation model can perfectly describe the system. There is only 8–10% variation between them. A good verification is achieved. Moreover, the experimental results show that the jet blast deflector plays an outstanding role in driving the high-temperature exhaust gases. It is found that [Formula: see text] may be the best deflection angle to protect the deck and the surrounding equipment effectively. These data results provide a valuable basis for the design and layout optimization of the jet blast deflector and deck.

Style APA, Harvard, Vancouver, ISO itp.

49

Khaliq, Aleem, Lorenzo Comba, Alessandro Biglia, Davide Ricauda Aimonino, Marcello Chiaberge, and Paolo Gay. "Comparison of Satellite and UAV-Based Multispectral Imagery for Vineyard Variability Assessment." Remote Sensing 11, no. 4 (2019): 436. http://dx.doi.org/10.3390/rs11040436.

Pełny tekst źródła

Streszczenie:

In agriculture, remotely sensed data play a crucial role in providing valuable information on crop and soil status to perform effective management. Several spectral indices have proven to be valuable tools in describing crop spatial and temporal variability. In this paper, a detailed analysis and comparison of vineyard multispectral imagery, provided by decametric resolution satellite and low altitude Unmanned Aerial Vehicle (UAV) platforms, is presented. The effectiveness of Sentinel-2 imagery and of high-resolution UAV aerial images was evaluated by considering the well-known relation between the Normalised Difference Vegetation Index (NDVI) and crop vigour. After being pre-processed, the data from UAV was compared with the satellite imagery by computing three different NDVI indices to properly analyse the unbundled spectral contribution of the different elements in the vineyard environment considering: (i) the whole cropland surface; (ii) only the vine canopies; and (iii) only the inter-row terrain. The results show that the raw s resolution satellite imagery could not be directly used to reliably describe vineyard variability. Indeed, the contribution of inter-row surfaces to the remotely sensed dataset may affect the NDVI computation, leading to biased crop descriptors. On the contrary, vigour maps computed from the UAV imagery, considering only the pixels representing crop canopies, resulted to be more related to the in-field assessment compared to the satellite imagery. The proposed method may be extended to other crop typologies grown in rows or without intensive layout, where crop canopies do not extend to the whole surface or where the presence of weeds is significant.

Style APA, Harvard, Vancouver, ISO itp.

50

Аверченков, Владимир, Vladimir Averchenkov, Александр Самсоненко, and Aleksandr Samsonenko. "AUTOMATION OF OPTIC INSPECTION CONTROL AT QUALITY MANAGEMENT OF PRINTED CIRCUIT ASSEMBLY SOLDERING." Bulletin of Bryansk state technical university 2016, no. 2 (2016): 149–55. http://dx.doi.org/10.12737/20271.

Pełny tekst źródła

Streszczenie:

The development of methods of designing and assembling electronic units on circuit boards with the use of surface assembling techniques (SAT) became one of the priority fields. The product optic control is the most common method of control carried out by specialized equipment, an automated optic inspection (AOI). On the basis of the fulfilled analysis of technical solutions on the organization of inspections of different manufacturers and according to the test results in practice of some variants there was offered a circuit including the following equipment layout: a set of chambers, a system of linear displacement, a controller, a computer and software. The optical inspection software can be divided into the software of management and the software of control. The former serves for the computation of a control signal, the comparison of coordinates of equipment (with physical (F), imported (C) and in-spected (P)) interpretation of coordinates, interpolation and so on. The software of control contains modules: of data import from Gerberfile, the identification of board images obtained from the chamber, of the com-putation and comparison of identified components with a standard, of the analysis defects and the definition of a defect type, of database (DB) filling, of the interaction with a user.  The system described in comparison with foreign analogues has a smaller set of options, but solves a problem of control under conditions of enterprises manufacturing electronics of the second class (according to IPCA-610 Standard “Criteria for Electronic Assemblage Acceptance).

Style APA, Harvard, Vancouver, ISO itp.

Oferujemy zniżki na wszystkie plany premium dla autorów, których prace zostały uwzględnione w tematycznych zestawieniach literatury. Skontaktuj się z nami, aby uzyskać unikalny kod promocyjny!