Journal articles on the topic "Multithreaded application"

Author: Grafiati

Published: 28 June 2021

Last updated: 21 June 2025

Browse the top 50 journal articles for research on the topic "Multithreaded application".

Next to each work in the list of references there is an "Add to bibliography" option. Use it, and the bibliographic reference for the chosen work is formatted automatically in the citation style you need (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles across a wide range of disciplines and compile your bibliography correctly.

1

Giebas, Damian, and Rafał Wojszczyk. "Deadlocks Detection in Multithreaded Applications Based on Source Code Analysis." Applied Sciences 10, no. 2 (2020): 532. http://dx.doi.org/10.3390/app10020532.

Annotation:

This paper extends the multithreaded application source code model and shows how to use it to detect deadlocks in C language applications. Four known deadlock scenarios from the literature can be detected using our model. For every scenario we created theorems and proofs whose fulfillment guarantees the occurrence of deadlocks in multithreaded applications. The paper also compares the multithreaded application source code model with Petri nets and describes the advantages and disadvantages of both.
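
To illustrate what a source-level deadlock check looks for, here is a minimal Python sketch that flags a lock-order inversion, one classic deadlock scenario. It is not the source code model from the paper. The thread names, the lock names, and the helpers `build_lock_graph` and `find_lock_cycle` are hypothetical, and each thread is reduced to the ordered list of locks it acquires while still holding the earlier ones.

```python
from collections import defaultdict

def build_lock_graph(acquisitions):
    """Edge a -> b means some thread acquires lock b while holding lock a."""
    graph = defaultdict(set)
    for locks in acquisitions.values():
        for i, held in enumerate(locks):
            for wanted in locks[i + 1:]:
                graph[held].add(wanted)
    return graph

def find_lock_cycle(graph):
    """Return one cycle of locks as a list, or None if the lock order is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished
    color = defaultdict(int)
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in sorted(graph.get(node, ())):
            if color[nxt] == GREY:              # back edge: the path loops
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

# Hypothetical input: t1 takes A then B, t2 takes B then A.
risky = {"t1": ["A", "B"], "t2": ["B", "A"]}
safe = {"t1": ["A", "B"], "t2": ["A", "B"]}
print(find_lock_cycle(build_lock_graph(risky)))  # ['A', 'B', 'A']
print(find_lock_cycle(build_lock_graph(safe)))   # None
```

A cycle in this graph signals only a potential deadlock. Whether it can actually occur depends on the threads really running concurrently, which this sketch does not model.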

2

Giebas, Damian, and Rafał Wojszczyk. "Graphical Representations of Multithreaded Applications." Applied Computer Science 14, no. 2 (2018): 20–37. http://dx.doi.org/10.35784/acs-2018-10.

Annotation:

This article briefly describes existing graphical methods for presenting multithreaded applications, namely the Control Flow Graph and Petri nets. These methods are discussed, and then a way to represent multithreaded applications using the concurrent process system model is presented. All three methods are used to represent a multithreaded application that contains a race condition. In the summary, the three methods are compared and evaluated according to whether the given representation makes it possible to find the race condition.

3

Kundan, Shivam, Theodoros Marinakis, Iraklis Anagnostopoulos, and Dimitri Kagaris. "A Pressure-Aware Policy for Contention Minimization on Multicore Systems." ACM Transactions on Architecture and Code Optimization 19, no. 3 (2022): 1–26. http://dx.doi.org/10.1145/3524616.

Annotation:

Modern Chip Multiprocessors (CMPs) are integrating an increasing number of cores to address the continually growing demand for high application performance. The cores of a CMP share several components of the memory hierarchy, such as the Last-Level Cache (LLC) and main memory. This allows for considerable gains in multithreaded applications while also helping to maintain architectural simplicity. However, sharing resources can also result in performance bottlenecks due to contention among concurrently executing applications. In this work, we formulate a fine-grained application characterization methodology that leverages Performance Monitoring Counters (PMCs) and Cache Monitoring Technology (CMT) in Intel processors. We utilize this characterization methodology to develop two contention-aware scheduling policies, one static and one dynamic, that co-schedule applications based on their resource-interference profiles. Our approach focuses on minimizing contention on both the main-memory bandwidth and the LLC by monitoring the pressure that each application inflicts on these resources. We achieve performance benefits for diverse workloads, outperforming Linux and three state-of-the-art contention-aware schedulers in terms of system throughput and fairness for both single-threaded and multithreaded workloads. Compared with Linux, our policy achieves up to 16% greater throughput for single-threaded and up to 40% greater throughput for multithreaded applications. Additionally, the policies increase fairness by up to 65% for single-threaded and up to 130% for multithreaded ones.

4

Muralidhara, Sai Prashanth, Mahmut Kandemir, and Padma Raghavan. "Intra-application shared cache partitioning for multithreaded applications." ACM SIGPLAN Notices 45, no. 5 (2010): 329–30. http://dx.doi.org/10.1145/1837853.1693498.

5

Molchanov, Viktor. "Implementation of multithreaded calculations in educational web applications." Development Management 17, no. 2 (2019): 1–7. http://dx.doi.org/10.21511/dm.17(2).2019.01.

Annotation:

The growing complexity of the logic of educational web applications raises the question of how efficiently they are implemented. Efficiency, including pedagogical efficiency, depends among other factors on the technology used to implement the programs. When the browser is the main execution environment, its characteristics must be taken into account, in particular its single-threaded execution of programs (scripts). Implementing more complex algorithms in web applications delays the response of the application interface to user actions. This causes discomfort for the user and, as a result, reduces the effectiveness of their work. As the range of devices used to access the Internet expands, mobile devices are used more and more often for learning as well. Another side of the problem is therefore the quality of the connection to the server: the program must keep working when the connection is interrupted, or the impact of interruptions must be reduced. One solution may be to perform part of the calculations in the background. The article deals with the use of calculations in background threads of the browser and with caching control for educational web applications. Various ways of creating such threads and the peculiarities of their use are analyzed.

6

Makarov, Igor Sergeevich, Denis Vyacheslavovich Larin, Evgeniia Grigor'evna Vorobeva, Daniil Pavlovich Emelin, and Dmitry Aleksandrovich Kartashov. "The impact of asynchronous and multithreaded query processing models on the performance of server-side web applications." Программные системы и вычислительные методы, no. 1 (January 2025): 13–20. https://doi.org/10.7256/2454-0714.2025.1.73665.

Annotation:

The object of the study is server-side web applications and their performance when processing a large number of simultaneous requests. The study covers asynchronous technologies (Node.js, Python Asyncio, Go, Kotlin Coroutines) and multithreaded models (Java Threading, Python Threading). The authors analyze asynchronous event loops, goroutines, coroutines, and classical multithreaded approaches in detail, evaluating their effectiveness in tasks with intensive use of I/O and computing resources. An experiment is conducted in which an API is developed in three languages (Java, Node.js, Go) and tested using the hey utility. The study also explores scalability, performance optimization, caching, error handling, load tests, and implementation features of parallel computing. The purpose of the study is to determine which approaches provide the highest performance in server applications. Research methods include load testing, collection of metrics (response time, bandwidth, and server resource consumption), and analysis of the results. The scientific novelty lies in comparing asynchronous and multithreaded methods in real-world web development scenarios. The main conclusions of the study are recommendations to use asynchronous technologies in high-load I/O tasks and multithreading in computationally complex scenarios. The results obtained will help developers optimize the performance of server applications depending on their tasks and workload. Additionally, the study examines the complexity of debugging asynchronous applications, the impact of thread pools on the performance of multithreaded solutions, and scenarios in which asynchronous and multithreaded approaches can complement each other. Special attention is paid to server resource management under scalable loads, which will allow IT specialists to select tools and technologies for specific tasks more accurately. In conclusion, possible ways to optimize the operation of server applications are discussed, including the use of new approaches and algorithms, as well as the prospects for the development of asynchronous and multithreaded technologies in the context of highly loaded systems, their impact on the overall application architecture, and their contribution to fault tolerance and security.
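
The two request-processing models named in the annotation can be put side by side in a short Python sketch. This is not the authors' benchmark: the handler names, the simulated I/O delay `DELAY`, and the request count `REQUESTS` are hypothetical, and `sleep` stands in for a real network or database wait. The script prints measured timings and asserts nothing about which model is faster.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.05    # simulated I/O wait per request in seconds (hypothetical)
REQUESTS = 20   # number of simultaneous requests (hypothetical)

def handle_blocking(i):
    """Thread-based handler: the worker thread blocks during the I/O wait."""
    time.sleep(DELAY)
    return i * 2

async def handle_async(i):
    """Coroutine handler: the event loop serves other requests during the wait."""
    await asyncio.sleep(DELAY)
    return i * 2

def run_threaded(workers=REQUESTS):
    """Multithreaded model: one pool thread per in-flight request."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_blocking, range(REQUESTS)))

async def run_async():
    """Asynchronous model: all requests share a single event-loop thread."""
    return await asyncio.gather(*(handle_async(i) for i in range(REQUESTS)))

def timed(fn):
    """Run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

if __name__ == "__main__":
    threaded, t_threads = timed(run_threaded)
    evented, t_async = timed(lambda: asyncio.run(run_async()))
    assert threaded == evented
    print(f"threads: {t_threads:.3f}s  asyncio: {t_async:.3f}s")
```

Lowering `workers` below `REQUESTS` shows the effect of thread pool size on the threaded variant, one of the factors the study examines.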

7

Ryabev, A. V. "Overview and classification of advanced schemes of multithreaded combined energy transmissions based on their kinematic analysis." Izvestiya MGTU MAMI 10, no. 2 (2016): 55–65. http://dx.doi.org/10.17816/2074-0530-66932.

Annotation:

The article deals with existing and promising modern automotive multithreaded combined energy transmissions, which are based on the principle of splitting power between electrical and mechanical streams. Because their design includes a continuously variable electric transmission, these combined energy transmissions make it possible to obtain an arbitrary gear ratio from the engine to the wheels while maintaining the high efficiency inherent to a manual transmission. This suggests that multithreaded combined energy transmissions are promising for use in hybrid vehicles, as evidenced by the successful operation of the Toyota Prius. The article describes 16 different schemes of electromechanical transmissions. Some of them are applied in practice, while others exist only as prototypes or theoretical projects. On the basis of a kinematic analysis, including determination of the number of operating modes and degrees of freedom as well as the construction of kinematic plans for different operating modes, a classification of multithreaded combined energy transmissions by type of differential mechanism (the mechanical part of the transmission) is proposed. Single-mode and multi-mode multithreaded combined energy transmissions are distinguished. The latter are divided into three classes, depending on the method of obtaining different modes: stepped, variable and combined. Moreover, within each class, transmissions with a differential at the input, with a differential at the output, and with complex power division are identified. This review makes it possible to get acquainted with the possibilities of applying multithreaded combined energy transmissions in road transport, to understand their strengths and weaknesses, and to identify promising areas of application for multithreaded electromechanical transmissions of various types.

8

Shen, Hua, Guo Shun Zhou, and Hui Qi Yan. "A Study of Parallelization and Performance Optimizations Based on OpenMP." Applied Mechanics and Materials 321-324 (June 2013): 2933–37. http://dx.doi.org/10.4028/www.scientific.net/amm.321-324.2933.

Annotation:

The primary consequence of the transition to multicore processors is that applications will increasingly need to be parallelized to improve their throughput, responsiveness and latency. Multithreading is becoming increasingly important for modern programming. Unfortunately, parallel programming is much more tedious and error-prone than serial programming. Although modern compilers can manage threads well, in practice synchronization errors (such as data races and deadlocks) require careful management and good optimization methods. This paper presents a preliminary study of the usability of the Intel threading tools for multicore programming. This work compares the performance of a single-threaded application with that of multithreaded applications and uses Intel® VTune Performance Analyzer, Intel® Thread Checker and OpenMP to optimize multithreaded applications efficiently.

9

Settle, Alex, Dan Connors, Enric Gibert, and Antonio González. "A dynamically reconfigurable cache for multithreaded processors." Journal of Embedded Computing 2, no. 2 (2006): 221–33. https://doi.org/10.3233/emc-2006-00027.

Annotation:

Chip multi-processors (CMP) are rapidly emerging as an important design paradigm for both high performance and embedded processors. These machines provide an important performance alternative to increasing the clock frequency. In spite of the increase in potential performance, several issues related to resource sharing on the chip can negatively impact the performance of embedded applications. In particular, the shared on-chip caches make each job's memory access times dependent on the behavior of the other jobs sharing the cache. If not adequately managed, this can lead to problems in meeting hard real-time scheduling constraints. This work explores adaptable caching strategies which balance the resource demands of each application and in turn lead to improvements in throughput for the collective workload. Experimental results demonstrate speedups of up to 1.47X for workloads of two co-scheduled applications compared against a fully-shared two-level cache hierarchy. Additionally, the adaptable caching scheme is shown to achieve an average speedup of 1.10X over the leading cache partitioning model. By dynamically managing cache storage for multiple application threads at runtime, sizable performance levels are achieved, which provides chip designers the opportunity to maintain high performance as cache size and power budgets become a concern in the CMP design space.

10

Mao, Li Na, and Lin Yan Tang. "The Design and Application of Monitoring Framework Based on AOP." Applied Mechanics and Materials 685 (October 2014): 671–75. http://dx.doi.org/10.4028/www.scientific.net/amm.685.671.

Annotation:

This article applies AOP technology to multithreaded monitoring and presents an implementation scheme for a multithreaded monitoring platform that uses a database to store thread information. The thread monitoring module is completely independent of the original system.

11

Xue, Xiaozhen, Sima Siami-Namini, and Akbar Siami Namin. "Testing Multi-Threaded Applications Using Answer Set Programming." International Journal of Software Engineering and Knowledge Engineering 28, no. 08 (2018): 1151–75. http://dx.doi.org/10.1142/s021819401850033x.

Annotation:

We introduce a technique to formally represent and specify race conditions in multithreaded applications. Answer set programming (ASP) is a logic-based knowledge representation paradigm to formally express belief acquired through reasoning in an application domain. The transparent and expressive representation of problems, along with powerful non-monotonic reasoning, enables ASP to abstractly represent and solve certain classes of NP-hard problems in polynomial time. We use ASP to formally express race conditions and thus represent potential data races that often occur in multithreaded applications with shared memory models. We then use ASP to generate all possible test inputs and thread interleavings, i.e. schedules, whose execution would deterministically expose thread interleaving failures. We evaluated the proposed technique with some moderate-sized Java programs, and our experimental results confirm that the proposed technique can practically expose common data races in multithreaded programs with low false positive rates. We conjecture that, in addition to generating thread schedules whose execution order exposes data races, ASP has several other applications in constraint-based software testing research and can be utilized to express and solve similar test case generation problems where constraints play a key role in determining the complexity of searches.
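
The idea of generating every thread interleaving and keeping the ones that expose a race can be shown on a toy case. The sketch below uses plain brute-force enumeration in Python rather than ASP, and it is not the authors' tool. The two-step increment model and the helper names `interleavings`, `run`, and `failing_schedules` are assumptions made for illustration.

```python
from itertools import combinations

# Each thread performs `counter += 1` as two atomic steps: read, then write.
STEPS = ("read", "write")

def interleavings(n_a, n_b):
    """Yield every schedule of threads A and B as a tuple of labels,
    preserving each thread's own program order."""
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        yield tuple("A" if i in a_slots else "B" for i in range(total))

def run(schedule):
    """Execute one schedule on a shared counter and return its final value."""
    counter = 0
    local = {"A": 0, "B": 0}   # value each thread last read
    pc = {"A": 0, "B": 0}      # index of each thread's next step
    for thread in schedule:
        if STEPS[pc[thread]] == "read":
            local[thread] = counter
        else:                  # write back the stale or fresh value plus one
            counter = local[thread] + 1
        pc[thread] += 1
    return counter

def failing_schedules(expected=2):
    """Schedules whose final counter differs from the race-free result."""
    return [s for s in interleavings(len(STEPS), len(STEPS)) if run(s) != expected]

for schedule in failing_schedules():
    print("".join(schedule), "->", run(schedule))
```

Of the six possible schedules, only `AABB` and `BBAA` keep both increments. The other four lose an update and leave the counter at 1. A real program has far too many interleavings for brute force, which is why a constraint-based search is attractive.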

12

Kama, Sami, Charles Leggett, Scott Snyder, and Vakho Tsulaia. "The ATLAS multithreaded offline framework." EPJ Web of Conferences 214 (2019): 05018. http://dx.doi.org/10.1051/epjconf/201921405018.

Annotation:

In preparation for Run 3 of the LHC, scheduled to start in 2021, the ATLAS experiment is revising its offline software so as to better take advantage of machines with many cores. A major part of this effort is migrating the software to run as a fully multithreaded application, as this has been shown to significantly improve the memory scaling behavior. This note outlines changes made to the software framework to support this migration.

13

Tang, Xulong, Mahmut Taylan Kandemir, and Mustafa Karakoy. "Mix and Match: Reorganizing Tasks for Enhancing Data Locality." Proceedings of the ACM on Measurement and Analysis of Computing Systems 5, no. 2 (2021): 1–24. http://dx.doi.org/10.1145/3460087.

Annotation:

Application programs that exhibit strong locality of reference lead to minimized cache misses and better performance in different architectures. However, to maximize the performance of multithreaded applications running on emerging manycore systems, data movement in the on-chip network should also be minimized. Unfortunately, the way many multithreaded programs are written does not lend itself well to minimal data movement. Motivated by this observation, in this paper, we target task-based programs (which cover a large set of available multithreaded programs), and propose a novel compiler-based approach that consists of four complementary steps. First, we partition the original tasks in the target application into sub-tasks and build a data reuse graph at a sub-task granularity. Second, based on the intensity of temporal and spatial data reuses among sub-tasks, we generate new tasks where each such (new) task includes a set of sub-tasks that exhibit high data reuse among them. Third, we assign the newly generated tasks to cores in an architecture-aware fashion with the knowledge of data location. Finally, we re-schedule the execution order of sub-tasks within new tasks such that sub-tasks that belong to different tasks but share data among them are executed in close proximity in time. The detailed experiments show that, when targeting a state-of-the-art manycore system, our proposed compiler-based approach improves the performance of 10 multithreaded programs by 23.4% on average, and it also outperforms two state-of-the-art data access optimizations for all the benchmarks tested. Our results also show that the proposed approach i) improves the performance of multiprogrammed workloads, and ii) generates results that are close to the maximum savings that could be achieved with perfect profiling information. Overall, our experimental results emphasize the importance of dividing an original set of tasks of an application into sub-tasks and constructing new tasks from the resulting sub-tasks in a data movement- and locality-aware fashion.

14

Eickemeyer, Richard J., Ross E. Johnson, Steven R. Kunkel, Mark S. Squillante, and Shiafun Liu. "Evaluation of multithreaded uniprocessors for commercial application environments." ACM SIGARCH Computer Architecture News 24, no. 2 (1996): 203–12. http://dx.doi.org/10.1145/232974.232994.

15

Kumar, Pradeep. "Optimizing Multithreaded Applications: Techniques and Strategies." International Journal of Multidisciplinary Research and Growth Evaluation 1, no. 3 (2020): 67–76. https://doi.org/10.54660/.ijmrge.2020.1.3.67-76.

Annotation:

Optimizing multithreaded applications has become a cornerstone of modern computing, driven by the widespread adoption of multi-core processors. These applications aim to leverage thread-level parallelism to maximize hardware utilization, but achieving this is fraught with challenges, including synchronization overheads, cache inefficiencies, and diminishing returns in performance scaling. Effective optimization requires a comprehensive understanding of performance metrics, cache behavior, and the underlying hardware architecture. Parallel efficiency metrics, such as speedup and CPU utilization, are instrumental in identifying bottlenecks and guiding optimization strategies (Hennessy & Patterson, 2017, pp. 353–354). Scaling challenges often arise when thread counts exceed the number of physical cores, leading to resource contention and degraded performance (Hennessy & Patterson, 2017, p. 362). Cache coherence issues further exacerbate these challenges. True sharing, caused by frequent updates to shared memory locations, and false sharing, due to adjacent data access in shared cache lines, remain significant impediments to performance (Fog, 2016, pp. 112–113). This paper explores advanced techniques such as dynamic task scheduling, lock-free programming, and memory alignment to mitigate these challenges. Tools like Coz and eBPF are highlighted for their role in profiling and diagnosing bottlenecks in multithreaded applications (Seznec & Michaud, 2006, p. 60) [2]. A case study demonstrates the application of these techniques, showcasing improvements in scalability and throughput by addressing thread contention and synchronization overheads (Fog, 2016, p. 121) [3]. By integrating advanced profiling tools with targeted optimization strategies, developers can enhance multithreaded performance and fully exploit modern hardware capabilities. 
However, continued research is needed to address emerging challenges in hybrid architectures and memory technologies (Hennessy & Patterson, 2017, p. 372).
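
The parallel efficiency metrics mentioned in the annotation follow standard definitions: speedup is S = T_1 / T_p and efficiency is E = S / p for p threads. The Python sketch below computes both. The timing values are hypothetical and are not taken from the paper's case study.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_1 / T_p for serial time T_1 and parallel time T_p."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, threads):
    """Parallel efficiency E = S / p, where p is the thread count."""
    if threads < 1:
        raise ValueError("threads must be >= 1")
    return speedup(t_serial, t_parallel) / threads

# Hypothetical wall-clock timings in seconds, keyed by thread count.
timings = {1: 8.0, 2: 4.0, 4: 2.5, 8: 2.0}

for p, t in timings.items():
    s = speedup(timings[1], t)
    e = efficiency(timings[1], t, p)
    print(f"threads={p}  speedup={s:.2f}  efficiency={e:.2f}")
```

With these made-up numbers, efficiency falls from 1.00 at two threads to 0.50 at eight. That falling curve is what diminishing returns in performance scaling look like in measured data.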

16

Nagel, Everton, Ricardo Melo Czekster, Thais Webber, and César Augusto Missio Marcon. "A Framework Prototype for Multithreaded Implementation Over Micro-Controllers." Journal of Integrated Circuits and Systems 14, no. 1 (2019): 1–10. http://dx.doi.org/10.29292/jics.v14i1.39.

Annotation:

Multithreading is pervasive in embedded system application development. Application requirements are becoming more rigorous, demanding the execution of concurrent tasks that must also take into account modularity and flexibility. An important part of operating system development concerns the implementation of scheduling algorithms. In an embedded system context, it is essential to consider that the scheduling algorithm heavily influences application behavior. Due to restricted and finite hardware resources, it is important to evaluate the use of flexible algorithms to guarantee efficiency. Embedded operating system projects that implement scheduling algorithms currently exist for microcontroller devices; however, the developer cannot change or add new scheduling policies without implementing kernel tweaks and modifications. The alternatives are not flexible when choosing the scheduling algorithm according to the application needs. This imposes restrictions on many systems, forcing them to run specific static scheduling algorithms because no other options are available. The objective of this work is the design and development of a framework that implements a microkernel with a modular scheduler unit, allowing the execution of tailored algorithms according to the application profile. The idea is to provide a flexible platform to conveniently select the most appropriate algorithm. We have employed low-capacity hardware to implement multithreading patterns corresponding to sets of concurrent tasks, demonstrating the strengths of adopting our approach. Our results show that the use of modern techniques that combine modularity, multithreading, and scheduling methods for embedded systems yields better execution than their sequential counterparts.

17

Azagury, Alain, Elliot K. Kolodner, and Erez Petrank. "A Note on the Implementation of Replication-Based Garbage Collection for Multithreaded Applications and Multiprocessor Environments." Parallel Processing Letters 09, no. 03 (1999): 391–99. http://dx.doi.org/10.1142/s0129626499000360.

Annotation:

Replication-based incremental garbage collection is one of the more appealing concurrent garbage collection algorithms known today. It allows continuous operation of the application (the mutator) with very short pauses for garbage collection. There is a growing need for such garbage collectors suitable for multithreaded environments such as the Java Virtual Machine. Furthermore, it is desirable to construct collectors that also work on multiprocessor computers. We begin by pointing out an important, yet subtle point, which arises when implementing the replication-based garbage collector for a multithreaded environment. We first show that a simple and natural implementation of the algorithm may lead to incorrect behavior of multithreaded applications. We then show that another simple and natural implementation eliminates the problem completely. Thus, the contribution of this part is in stressing this warning to future implementors. Next, we address the effects of the memory coherence model on this algorithm. We show that even when the algorithm is properly implemented with respect to our first observation, a problem might still arise when a multiprocessor system is used. Adopting a naive solution to this problem results in very frequent (and expensive) synchronization. We offer a slight modification to the algorithm which eliminates the problem and requires little synchronization.

18

Gibadullin, R. F., and I. S. Vershinin. "Associative Protection of Numerical Information in Text Documents Using the Parallel Framework Library on the .NET Platform." Computational nanotechnology 10, no. 3 (2023): 121–29. http://dx.doi.org/10.33693/2313-223x-2023-10-3-121-129.

Annotation:

The paper discusses the development and analysis of an application designed to protect numeric data in text files using an associative data protection mechanism. The application, based on the .NET platform and using the Parallel Framework library, was tested in detail to evaluate the effectiveness of multithreaded data processing and the use of regular expressions to extract numeric information from text. The results showed that parallel processing can significantly increase performance, achieving a twofold speedup on a multi-core hardware platform. At the same time, the paper highlights and analyzes some of the challenges and limitations associated with parallel processing, including user interface locking, the need for thread safety, and the peculiarities of working with regular expressions in multithreaded mode. Possible directions for further improvement of the application are discussed. The research is of practical value for the development of parallel data processing methods in the context of information protection.

19

Silva, Bruno, Luiz Guerreiro Lopes, and Fábio Mendonça. "Multithreaded and GPU-Based Implementations of a Modified Particle Swarm Optimization Algorithm with Application to Solving Large-Scale Systems of Nonlinear Equations." Electronics 14, no. 3 (2025): 584. https://doi.org/10.3390/electronics14030584.

Annotation:

This paper presents a novel Graphics Processing Unit (GPU) accelerated implementation of a modified Particle Swarm Optimization (PSO) algorithm specifically designed to solve large-scale Systems of Nonlinear Equations (SNEs). The proposed GPU-based parallel version of the PSO algorithm uses the inherent parallelism of modern hardware architectures. Its performance is compared against both sequential and multithreaded Central Processing Unit (CPU) implementations. The primary objective is to evaluate the efficiency and scalability of PSO across different hardware platforms with a focus on solving large-scale SNEs involving thousands of equations and variables. The GPU-parallelized and multithreaded versions of the algorithm were implemented in the Julia programming language. Performance analyses were conducted on an NVIDIA A100 GPU and an AMD EPYC 7643 CPU. The tests utilized a set of challenging, scalable SNEs with dimensions ranging from 1000 to 5000. Results demonstrate that the GPU accelerated modified PSO substantially outperforms its CPU counterparts, achieving substantial speedups and consistently surpassing the highly optimized multithreaded CPU implementation in terms of computation time and scalability as the problem size increases. Therefore, this work evaluates the trade-offs between different hardware platforms and underscores the potential of GPU-based parallelism for accelerating SNE solvers.

20

Dam, Mads, Bart Jacobs, Andreas Lundblad, and Frank Piessens. "Security monitor inlining and certification for multithreaded Java." Mathematical Structures in Computer Science 25, no. 3 (2014): 528–65. http://dx.doi.org/10.1017/s0960129512000916.

Annotation:

Security monitor inlining is a technique for security policy enforcement whereby monitor functionality is injected into application code in the style of aspect-oriented programming. The intention is that the injected code enforces compliance with the policy (security), and otherwise interferes with the application as little as possible (conservativity and transparency). Such inliners are said to be correct. For sequential Java-like languages, inlining is well understood, and several provably correct inliners have been proposed. For multithreaded Java one difficulty is the need to maintain a shared monitor state. We show that this problem introduces fundamental limitations in the type of security policies that can be correctly enforced by inlining. A class of race-free policies is identified that precisely characterizes the inlineable policies by showing that inlining of a policy outside this class is either not secure or not transparent, and by exhibiting a concrete inliner for policies inside the class which is secure, conservative and transparent. The inliner is implemented for Java and applied to a number of practical application security policies. Finally, we discuss how certification in the style of proof-carrying code could be supported for inlined programs by using annotations to reduce a potentially complex verification problem for multithreaded Java bytecode to sequential verification of just the inlined code snippets.


21

Albertian, A. M., and I. I. Kurochkin. "Use of specialized computational devices on the node of the desktop grid system for the solve of combinatorial problems." Transaction Kola Science Centre 11, no. 8-2020 (2020): 105–10. http://dx.doi.org/10.37614/2307-5252.2020.8.11.010.


Annotation:

The paper discusses the use of specialized Intel Xeon Phi devices in a desktop grid system. As an example of the successful use of specialized devices, the combinatorial problem of finding diagonal Latin squares is given. Implementation features of the computing application are discussed. Results of running the application in multithreaded modes on different processors and specialized devices are given.


22

Alsaffar, Awse S., and Ayad H. Alezzy. "A Lightweight Portable Multithreaded Client-Server Docker Containers." Technium: Romanian Journal of Applied Sciences and Technology 4, no. 10 (2022): 31–43. http://dx.doi.org/10.47577/technium.v4i10.7722.


Annotation:

Given the diversity of operating systems, environments, and platforms, and the limited host infrastructure resources available to hold all of them, the need arises to design applications that run on many, or ideally all, operating systems and platforms. This paper deals with designing and implementing a lightweight multithreaded client-server application in which the client and the server each run as a separate Docker container. Both containers are based on Alpine Linux and developed using the Python programming language. The execution units in the containers are Python program files. As a case study, the server acts as a Wikipedia server that can serve many clients simultaneously. Docker builds the containers from a Dockerfile written for each container and pushes them to the registry (Docker Hub). Once a container image has been pulled from the registry account, the container can be run on a host. The proposed containerized multithreaded client-server model becomes portable, with limited capabilities, as an alternative to virtualization. Resource and cost requirements to achieve portability are evaluated for both the virtualization and containerization paradigms. The results show the superiority of containerization over virtualization in both resource and cost requirements.


23

Sharamet, A. V. "Multithreaded Convolution Implementation Based on Block Methods." Doklady BGUIR 20, no. 7 (2022): 81–87. http://dx.doi.org/10.35596/1729-7648-2022-20-7-81-87.


Annotation:

A multithreaded convolution implementation based on block algorithms is considered. Convolution is the basis of many methods that determine the degree of similarity or independence of two processes, in other words their degree of correlation. The algorithm itself is executed with a significant delay, because the entire signal must be accumulated before it can be processed. The analysis showed that one possible way to reduce time costs is a multithreaded implementation of convolution based on block algorithms. The article shows the main features of implementing convolution by the overlap-add method and the overlap-save method, as well as numerical examples. The results obtained show that applying these methods without a window function leads to significant distortions in the signal spectrum. Based on the results of the analysis, a universal scheme for performing convolution based on multithreaded processing of an input data block is proposed. This makes it possible to achieve a good compromise between computational complexity, system architecture, and time costs.
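
The block idea in this annotation can be illustrated with a minimal overlap-add sketch in Python. It is not the article's implementation: the function names, the block size, and the use of a thread pool are assumptions made for illustration, and each block is convolved directly rather than with an FFT. In CPython the threads mainly show the structure, since the global interpreter lock limits speedup for pure-Python arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor


def direct_convolve(x, h):
    """Full linear convolution of two sequences (length len(x) + len(h) - 1)."""
    if not x or not h:
        return []
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def overlap_add(x, h, block_size=4, workers=4):
    """Overlap-add: convolve each input block on its own, then add overlapping tails."""
    if not x or not h:
        return []
    # Split the input into consecutive blocks (the last one may be shorter).
    blocks = [x[i:i + block_size] for i in range(0, len(x), block_size)]
    # Each block is independent, so the blocks can be handed to worker threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda block: direct_convolve(block, h), blocks))
    # Every partial result is len(h) - 1 samples longer than its block;
    # those tails overlap the next block's output and are summed in.
    y = [0.0] * (len(x) + len(h) - 1)
    for k, part in enumerate(partials):
        offset = k * block_size
        for n, value in enumerate(part):
            y[offset + n] += value
    return y
```

Calling `overlap_add(signal, kernel, block_size=3)` should give the same samples as `direct_convolve(signal, kernel)`; only the order of the work differs.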


24

Metzner, Alexander, and Juergen Niehaus. "MSparc: Multithreading in Real-Time Architectures." JUCS - Journal of Universal Computer Science 6, no. 10 (2000): 1034–51. https://doi.org/10.3217/jucs-006-10-1034.


Annotation:

This paper presents the use of multithreaded processors in real-time architectures. In particular we will handle real-time applications with hard timing constraints. In our approach, events (e.g. timer interrupts, signals from the environment, etc) are distinguished into three classes according to the reaction times that have to be met. Since two of these classes are well known in real-time systems, we will focus on the new class, for which the special features of a multithreaded processor together with a real-time scheduler realized in hardware are employed. Doing so enables us to realize the handling of events from this new class in software while still meeting the demands on reaction time. Additionally, the predictability of the application and the ease of implementing them are increased. The processor, named MSparc, which we developed to support these features, is based on block multithreading and is outlined in this paper, too. We then present an architecture, designed for rapid prototyping of embedded systems, to show the feasibility of this approach. Finally, a case study shows the potential of multithreading for embedded systems.


25

Pashchenko, Dmitry V., Dmitry A. Trokoz, Alexey I. Martyshkin, Tatyana Yu Pashchenko, Mikhail M. Butaev, and Mikhail Yu Babich. "Research of a multithreaded non-deterministic system model." Nexo Revista Científica 34, no. 01 (2021): 193–204. http://dx.doi.org/10.5377/nexo.v34i01.11297.


Annotation:

Managing systems whose behaviour is non-deterministic is one of the most important problems in modern management theory. Today, systems with structural and behavioural complexity are prevalent in all areas of human activity, and therefore research into them is of the utmost importance. Such systems, as opposed to deterministic systems, are called non-deterministic. They are characterised by hard-to-predict behaviour determined both by external random influences and by factors within the systems themselves. Clear examples of non-deterministic systems are crowds of people, factories, and computer networks and systems. The problem of non-deterministic behaviour in the context of professional activities can be seen in the example of building syntactic analysers. The aim of the paper is to design a class of systems oriented towards supporting elements of a discrete event model. The target of the research is the simulation of discrete event models. The subject of the research is the creation of a discrete event model based on the behaviour of a non-deterministic finite state automaton. In the course of the work, an algorithm for the application was developed and practically implemented that embodies the principle of working with threads. The results obtained are aimed at solving the problem of parallel data processing based on the parallelism of the behaviour of an NFA (non-deterministic finite automaton) when reading the characters of the input string. This should have a positive impact on the regulation of the simulation processes of a non-deterministic system, increasing its efficiency and stability. In conclusion, the algorithm of the application is described and conclusions about the effectiveness and efficiency of its development are drawn.
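
To make the NFA behaviour mentioned in this annotation concrete, here is a minimal Python sketch that follows all possible states at once while reading the input string. It is an illustrative assumption, not the paper's threaded algorithm: the function name, the transition-table layout, and the example automaton are hypothetical, and epsilon transitions are not handled.

```python
def nfa_accepts(transitions, start, accepting, text):
    """Simulate an NFA by tracking every state it could be in after each character.

    transitions maps (state, character) to a set of possible next states.
    """
    current = {start}
    for ch in text:
        # Each active state branches independently; these branches are the
        # units that a threaded implementation could process in parallel.
        current = {
            nxt
            for state in current
            for nxt in transitions.get((state, ch), ())
        }
        if not current:
            return False  # every branch has died
    return bool(current & accepting)


# Hypothetical example automaton: accepts strings over {a, b} that end in "ab".
ENDS_IN_AB = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
```

With this table, `nfa_accepts(ENDS_IN_AB, 0, {2}, "aab")` follows the state sets {0}, {0, 1}, {0, 1}, {0, 2} and accepts, while `"aba"` ends in {0, 1} and is rejected.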


26

S.K, SRIVATSA, and RAVI KUMAR CH. "RECONFIGURABLE FRAME WORK FOR CHIPMULTIPROCESSORS AND ITS APPLICATION IN MULTITHREADED ENVIRONMENT." International Journal on Information Sciences and Computing 6, no. 1 (2012): 41–47. http://dx.doi.org/10.18000/ijisac.50111.


27

Sun, Ya Juan, Hong Lin, and Bao Hui Wang. "Research and Application on Optimization of Multi-Thread Download Technology for Enhanced Search Engine." Advanced Materials Research 756-759 (September 2013): 1008–12. http://dx.doi.org/10.4028/www.scientific.net/amr.756-759.1008.


Annotation:

Multi-threaded file download, as the key technology of the content acquisition system of a search engine, determines the efficiency and timeliness of content acquisition. In this paper, we research optimization technologies, including multithreaded download based on P2SP, task scheduling based on MapReduce, and download based on protocol adaptation, designed to improve the efficiency of an enhanced search engine. Finally, the results show that the optimization method is successful for content acquisition.


28

Karasik, O. N., and A. A. Prihozhy. "ADVANCED SCHEDULER FOR COOPERATIVE EXECUTION OF THREADS ON MULTI-CORE SYSTEM." «System analysis and applied information science», no. 1 (May 4, 2017): 4–11. http://dx.doi.org/10.21122/2309-4923-2017-1-4-11.


Annotation:

Three architectures of the cooperative thread scheduler in a multithreaded application that is executed on a multi-core system are considered. Architecture A0 is based on the synchronization and scheduling facilities provided by the operating system. Architecture A1 introduces a new synchronization primitive and a single queue of blocked threads in the scheduler, which reduces the interaction between the threads and the operating system and significantly speeds up the blocking and unblocking of threads. Architecture A2 replaces the single queue of blocked threads with dedicated queues, one for each synchronization primitive, extends the number of internal states of the primitive, reduces the interdependence of the scheduling threads, and further significantly speeds up the blocking and unblocking of threads. All scheduler architectures are implemented on Windows operating systems and are based on User Mode Scheduling. Important experimental results are obtained for multithreaded applications that implement two blocked parallel algorithms for solving systems of linear algebraic equations by Gaussian elimination. The algorithms differ in the way data are distributed among threads and in their thread synchronization models. The number of threads varied from 32 to 7936. Architecture A1 shows an acceleration of up to 8.65% and architecture A2 an acceleration of up to 11.98% compared to architecture A0 for the blocked parallel algorithms computing the triangular form and performing the back substitution. On the back substitution stage of the algorithms, architecture A1 gives an acceleration of up to 125%, and architecture A2 an acceleration of up to 413%, compared to architecture A0. The experiments clearly show that the proposed architectures, A1 and A2, outperform A0 depending on the number of thread blocking and unblocking operations that happen during the execution of multithreaded applications. The computational experiments demonstrate the improvement in the parameters of multithreaded applications on a heterogeneous multi-core system due to the proposed advanced versions of the thread scheduler.


29

Hamad, Faten, and Abdelsalam Alawamrah. "Measuring the Performance of Parallel Information Processing in Solving Linear Equation Using Multiprocessor Supercomputer." Modern Applied Science 12, no. 3 (2018): 74. http://dx.doi.org/10.5539/mas.v12n3p74.


Annotation:

Evaluating the performance of an algorithm and of the method used to implement it plays a major role in assessing the performance of many applications. It helps researchers decide which algorithm to use and how to implement it, and it also gives an indication of the performance of the hardware on which the algorithm is tested. In this paper we evaluate the performance of a linear-equation-solving application implemented with the Message Passing Interface (MPI) library on a supercomputer. Sequential and multithreaded algorithms for solving linear equations were also tested and the results recorded. The speedup and efficiency of the algorithm were calculated, and the results showed that the parallel algorithm outperforms the other methods for the large matrix size of 8192 × 8192 on 64 processors. For large input sizes, the results also showed a noticeable decrease in running time as the number of processors increases. In the multithreaded case, however, the results showed that as the matrix size increases, the time required to run the algorithm increases rapidly even though the number of threads is increased. This indicates that the parallel (MPI) implementation performs better for large matrix inputs and outperforms the other methods.
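
The speedup and efficiency metrics named in this annotation are usually defined as S = T_serial / T_parallel and E = S / p for p processors. The short Python sketch below assumes those standard definitions; the paper's exact formulas are not given in the annotation, and the function names are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel (both times in the same unit)."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, workers):
    """Efficiency E = S / p, where p is the number of processors or threads."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return speedup(t_serial, t_parallel) / workers
```

As a made-up illustration (not data from the paper), a run that takes 100 s sequentially and 25 s on 8 processors has a speedup of 4 and an efficiency of 0.5.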


30

Toktorbaev, A., S. Karabaev, and Zh Toktomuratova. "Comparative Analysis of Asynchronous and Multithreaded Programming Methods in Python for Big Data Processing." Bulletin of Science and Practice 11, no. 5 (2025): 131–38. https://doi.org/10.33619/2414-2948/114/19.


Annotation:

This paper presents a comparative analysis of asynchronous and multithreaded programming methods in Python aimed at optimizing big data processing. The main concepts, architectural features, and practical implementation aspects using Python’s standard libraries (asyncio, threading) and additional tools (concurrent.futures) are examined. An experimental performance analysis on typical data processing tasks is provided, highlighting the advantages and drawbacks of each approach and offering recommendations for optimal application depending on the task type.
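
The two approaches compared in this annotation can be sketched side by side with the standard-library modules it names. This is a minimal illustration, not the paper's benchmark: the task (a short sleep standing in for an I/O wait), the function names, and the delay values are assumptions.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_blocking(item, delay=0.01):
    """Hypothetical blocking task: time.sleep stands in for an I/O wait."""
    time.sleep(delay)
    return item * 2


async def fetch_async(item, delay=0.01):
    """The same task as a coroutine: asyncio.sleep yields to the event loop."""
    await asyncio.sleep(delay)
    return item * 2


def run_threaded(items, workers=8):
    """Multithreaded variant: a pool of OS threads, each blocking on its own task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_blocking, items))


def run_async(items):
    """Asynchronous variant: one thread, many coroutines scheduled cooperatively."""
    async def main():
        return await asyncio.gather(*(fetch_async(i) for i in items))
    return asyncio.run(main())
```

Both `run_threaded(range(5))` and `run_async(range(5))` return the results in input order. Which variant is faster depends on the workload; for CPU-bound processing neither avoids the global interpreter lock in standard CPython.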


31

Yu, Qian, Tong Li, Zhong Wen Xie, Na Zhao, and Ying Lin. "Distributed Computing Design Methods for Multicore Application Programming." Advanced Materials Research 756-759 (September 2013): 1295–99. http://dx.doi.org/10.4028/www.scientific.net/amr.756-759.1295.


Annotation:

New design methods are presented to address the serial execution caused by multithreaded concurrent access to shared data and to achieve dynamic load balancing of tasks on shared-memory symmetric multi-processor (multi-core) computing platforms. By introducing multicore distributed locks, multicore shared data localization, and a multicore distributed queue, the new design methods can greatly decrease the number of accesses to shared data and achieve dynamic load balancing of tasks. For illustration, a design scheme for the multicore task manager of server software is given using the new design methods. Results show that the new design methods reduce the number of accesses to shared resources, partially resolve the serial execution of cooperative threads, and achieve dynamic task balancing in server software, which validates the superiority of this approach.


32

NATARAJAN, RAGAVENDRA, VINEETH MEKKAT, WEI-CHUNG HSU, and ANTONIA ZHAI. "EFFECTIVENESS OF COMPILER-DIRECTED PREFETCHING ON DATA MINING BENCHMARKS." Journal of Circuits, Systems and Computers 21, no. 02 (2012): 1240006. http://dx.doi.org/10.1142/s0218126612400063.


Annotation:

For today's increasingly power-constrained multicore systems, integrating simpler and more energy-efficient in-order cores becomes attractive. However, since in-order processors lack complex hardware support for tolerating long-latency memory accesses, developing compiler technologies to hide such latencies becomes critical. Compiler-directed prefetching has been demonstrated effective on some applications. On the application side, a large class of data centric applications has emerged to explore the underlying properties of the explosively growing data. These applications, in contrast to traditional benchmarks, are characterized by substantial thread-level parallelism, complex and unpredictable control flow, as well as intensive and irregular memory access patterns. These applications are expected to be the dominating workloads on future microprocessors. Thus, in this paper, we investigated the effectiveness of compiler-directed prefetching on data mining applications in in-order multicore systems. Our study reveals that although properly inserted prefetch instructions can often effectively reduce memory access latencies for data mining applications, the compiler is not always able to exploit this potential. Compiler-directed prefetching can become inefficient in the presence of complex control flow and memory access patterns; and architecture dependent behaviors. The integration of multithreaded execution onto a single die makes it even more difficult for the compiler to insert prefetch instructions, since optimizations that are effective for single-threaded execution may or may not be effective in multithreaded execution. Thus, compiler-directed prefetching must be judiciously deployed to avoid creating performance bottlenecks that otherwise do not exist. Our experiences suggest that dynamic performance tuning techniques that adjust to the behaviors of a program can potentially facilitate the deployment of aggressive optimizations in data mining applications.


33

Theobald, Kevin B., Rishi Kumar, Gagan Agrawal, Gerd Heber, Ruppa K. Thulasiram, and Guang R. Gao. "Implementation and evaluation of a communication intensive application on the EARTH multithreaded system." Concurrency and Computation: Practice and Experience 14, no. 3 (2002): 183–201. http://dx.doi.org/10.1002/cpe.604.


34

GRELCK, CLEMENS, and SVEN-BODO SCHOLZ. "SAC — FROM HIGH-LEVEL PROGRAMMING WITH ARRAYS TO EFFICIENT PARALLEL EXECUTION." Parallel Processing Letters 13, no. 03 (2003): 401–12. http://dx.doi.org/10.1142/s0129626403001379.


Annotation:

SAC is a purely functional array processing language designed with numerical applications in mind. It supports generic, high-level program specifications in the style of APL. However, rather than providing a fixed set of built-in array operations, SAC provides means to specify such operations in the language itself in a way that still allows their application to arrays of any rank and size. This paper illustrates the major steps in compiling generic, rank- and shape-invariant SAC specifications into efficiently executable multithreaded code for parallel execution on shared memory multiprocessors. The effectiveness of the compilation techniques is demonstrated by means of a small case study on the PDE1 benchmark, which implements 3-dimensional red/black successive over-relaxation. Comparisons with HPF and ZPL show that despite the genericity of code, SAC achieves highly competitive runtime performance characteristics.


35

Mori, Shinichiro, Masanao Kobayashi, Motoki Kumagai, and Shinichi Minohara. "Development of a GPU-based multithreaded software application to calculate digitally reconstructed radiographs for radiotherapy." Radiological Physics and Technology 2, no. 1 (2009): 40–45. http://dx.doi.org/10.1007/s12194-008-0040-3.


36

Ngo, Hieu Khanh, and Grolleau Emmanuel. "DARTSVIEW, A TOOLKIT FOR DARTS IN LABVIEW." Science and Technology Development Journal 12, no. 14 (2009): 69–76. http://dx.doi.org/10.32508/stdj.v12i14.2341.


Annotation:

DARTS (Design Approach for Real Time Systems) [4] is a software design method for real-time systems. LabVIEW (Laboratory Virtual Instrument Engineering Workbench) is a graphical application development environment developed by National Instruments Corporation based on the dataflow representation of the "G" language [6][2]. LabVIEW is implicitly multithreaded and has high-level functions for communication and synchronization, allowing it to be used as a programming language for control/command and soft real-time applications. To help a designer develop a real-time application, we propose the library DARTSVIEW, which simplifies the passage from the conception of a "multitasking" application to its implementation [8]. DARTSVIEW can be used in different phases of the life cycle of real-time system software. The latest version of DARTSVIEW allows several normalized real-time programming languages to be defined in XML and part of the code to be generated for different specific programming languages (Ada, POSIX 1003.1, VxWorks, OSEK/VDX, etc.). The flexibility introduced by the use of XML also allows a designer to generate code targeting real-time scheduling analysis tools in order to achieve temporal validation. The objective of this article is to present an overview of DARTSVIEW, a toolkit for DARTS in LabVIEW, and the role of DARTSVIEW in the software.


37

Rybak, A. T., A. V. Ivanovskaya, P. P. Batura, and A. Yu Pelipenko. "Synchronization in multi-motor hydromechanical systems." Advanced Engineering Research 21, no. 4 (2022): 337–45. http://dx.doi.org/10.23947/2687-1653-2021-21-4-337-345.


Annotation:

Introduction. The paper presents an analysis of existing design solutions of flow dividers used to synchronize hydraulic drives of working bodies of technological and mobile machines. The market demand for multithreaded throttle flow dividers without valves with a controlled division ratio, for example in multi-axle vehicle chassis, is identified. The objective of the work was to analyze the possibility and rationale for developing a throttle four-way flow divider without valves with sensing elements of the Venturi tube type. The solution should provide synchronous movement (rotation) of more than three working bodies of technological and mobile machines. Materials and Methods. A patent search for the designs of hydraulic flow dividers is carried out, and systems that require the division of the hydraulic fluid flow among more than two executive bodies are considered. An upgrade option, which allows dividing the flow into four branches, is proposed for the design of a three-channel throttle flow divider without valves. Results. The urgency of developing a multithreaded throttle flow divider without valves for application in industrial and mobile machines is validated. Two types of four-flow dividers are considered, and their weaknesses are indicated. It is noted that the development of a multithreaded throttle flow divider based on the designs created in 1989 and 1991 will reduce the number of hydraulic pumps and remove the need for the series connection of double-flow dividers. In this way, it is possible to reduce pressure losses in the hydraulic system and implement adaptive control of hydraulic motors of multi-motor mobile machines. The possibility of obtaining a divider/combiner into four flows on the basis of a three-flow throttle divider, by adding an outlet chamber connected to the membrane chamber through a channel entering the Venturi nozzle, is shown. The principle of operation of such equipment is described. Discussion and Conclusions. The principles of construction of throttle flow dividers without valves are considered. An upgrade option is proposed to increase the number of division channels from three to four. However, to validate the operability of this design, a numerical analysis of the various modes of operation of the divider is required: calculation of the reduced volumetric stiffness of its working cavities. The information obtained can be used to modernize the hydraulic units of technological and mobile machines and to increase their reliability, manufacturability, and efficiency. The issues that need to be solved in further research are identified.


38

Cao, Yang, Meina Zhang, Xianjun Wang, Yaqin Shang, and Simiao Jia. "Application of Multi-threading Mechanisms to the Connect6 Gaming System." Frontiers in Computing and Intelligent Systems 8, no. 2 (2024): 18–21. http://dx.doi.org/10.54097/4wf78n15.


Annotation:

Computer games are currently a popular research direction in artificial intelligence and occupy a pivotal position in its development and research. Connect6 is an emerging board game in this field whose position complexity and research value are comparable to those of Go and Xiangqi. A Connect6 game system has a large number of branching positions, and the traditional single-threaded search mechanism is time-consuming, which reduces the search efficiency of the game to a certain extent; how to find the best move quickly therefore becomes an important factor affecting the strength of a Connect6 program. This paper introduces a multi-threading mechanism into a Connect6 intelligent confrontation system: a thread is assigned to each branch position, and the computer's multi-threaded parallel computing mechanism, combined with a reasonable evaluation method, makes it possible to obtain the best next move quickly. Experimental comparison shows that, for Connect6, the multithreaded search mechanism searches positions more efficiently and also gives better game prediction, improving search efficiency and reducing search time to a certain extent.
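
The idea of evaluating each branch position in its own thread can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the paper's search: `best_move`, the worker count, and the `centre_preference` scoring function are hypothetical, and a real engine would search deeper than one ply.

```python
from concurrent.futures import ThreadPoolExecutor


def best_move(candidates, evaluate, workers=4):
    """Score every candidate move in a worker thread and return the best one.

    candidates is a list of moves; evaluate maps a move to a numeric score.
    Returns a (move, score) pair; ties go to the earliest candidate.
    """
    if not candidates:
        raise ValueError("no candidate moves")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(evaluate, candidates))
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]


def centre_preference(move):
    """Toy evaluation (an assumption): prefer points near the centre of a 19x19 board."""
    row, col = move
    return -(abs(row - 9) + abs(col - 9))
```

For example, `best_move([(0, 0), (9, 8), (18, 18)], centre_preference)` picks `(9, 8)`, the candidate closest to the centre point.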


39

Johnson, Seth R., Julien Esseiva, Elliott Biondo, et al. "Celeritas: Accelerating Geant4 with GPUs." EPJ Web of Conferences 295 (2024): 11005. http://dx.doi.org/10.1051/epjconf/202429511005.


Annotation:

Celeritas [1] is a new Monte Carlo (MC) detector simulation code designed for computationally intensive applications (specifically, High Luminosity Large Hadron Collider (HL-LHC) simulation) on high-performance heterogeneous architectures. In the past two years Celeritas has advanced from prototyping a GPU-based single physics model in infinite medium to implementing a full set of electromagnetic (EM) physics processes in complex geometries. The current release of Celeritas, version 0.3, has incorporated full device-based navigation, an event loop in the presence of magnetic fields, and detector hit scoring. New functionality incorporates a scheduler to offload electromagnetic physics to the GPU within a Geant4-driven simulation, enabling integration of Celeritas into high energy physics (HEP) experimental frameworks such as CMSSW. On the Summit supercomputer, Celeritas performs EM physics between 6× and 32× faster using the machine’s Nvidia GPUs compared to using only CPUs. When running a multithreaded Geant4 ATLAS test beam application with full hadronic physics, using Celeritas to accelerate the EM physics results in an overall simulation speedup of 1.8–2.3× on GPU and 1.2× on CPU.


40

Vidal, Eric Cesar Jr, and Alexander Nareyek. "A Real-Time Concurrent Planning and Execution Framework for Automated Story Planning for Games." Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 7, no. 2 (2011): 98–105. http://dx.doi.org/10.1609/aiide.v7i2.12475.


Annotation:

This paper presents a framework that facilitates communication between a planning system (“planner”) and a plan execution system (“executor”) to enable them to run concurrently, with the main emphasis on meeting the real-time requirements of the application domain. While the framework is applicable to general-purpose planning, its features are optimized for the requirements of automated story planning for games—with emphasis on monitoring player-triggered events and handling on-time (re-)generation of story assets such as characters, maps and scenarios. This framework subsumes the traditional interleaved planning-and-execution paradigm used in embedded continual planning systems and generalizes it to a non-embedded context, making the framework ideal for use with contemporary game architectures (e.g., multithreaded game engines, or games with subsystems communicating over a network).


41

Jones, Christopher, and Patrick Gartung. "CMSSW Scaling Limits on Many-Core Machines." EPJ Web of Conferences 295 (2024): 03008. http://dx.doi.org/10.1051/epjconf/202429503008.


Annotation:

Today the LHC offline computing relies heavily on CPU resources, despite the interest in compute accelerators, such as GPUs, for the longer term future. The number of cores per CPU socket has continued to increase steadily, reaching the levels of 64 cores (128 threads) with recent AMD EPYC processors, and 128 cores on Ampere Altra Max ARM processors. Over the course of the past decade, the CMS data processing framework, CMSSW, has been transformed from a single-threaded framework into a highly concurrent one. The first multithreaded version was brought into production by the start of the LHC Run 2 in 2015. Since then, the framework’s threading efficiency has gradually been improved by adding more levels of concurrency and reducing the amount of serial code paths. The latest addition was support for concurrent Runs. In this work we review the concurrency model of the CMSSW, and measure its scalability with real CMS applications, such as simulation and reconstruction, on modern many-core machines. We show metrics such as event processing throughput and application memory usage with and without the contribution of I/O, as I/O has been the major scaling limitation for the CMS applications.


42

Tang, Xulong, Mahmut Taylan Kandemir, and Mustafa Karakoy. "Mix and Match: Reorganizing Tasks for Enhancing Data Locality." ACM SIGMETRICS Performance Evaluation Review 49, no. 1 (2022): 47–48. http://dx.doi.org/10.1145/3543516.3460103.


Annotation:

Application programs that exhibit strong locality of reference lead to minimized cache misses and better performance in different architectures. In this paper, we target task-based programs, and propose a novel compiler-based approach that consists of four complementary steps. First, we partition the original tasks in the target application into sub-tasks and build a data reuse graph at a sub-task granularity. Second, based on the intensity of temporal and spatial data reuses among sub-tasks, we generate new tasks where each such (new) task includes a set of sub-tasks that exhibit high data reuse among them. Third, we assign the newly-generated tasks to cores in an architecture-aware fashion with the knowledge of data location. Finally, we re-schedule the execution order of sub-tasks within new tasks such that sub-tasks that belong to different tasks but share data among them are executed in close proximity in time. The experiments show that, when targeting a state of the art manycore system, our compiler-based approach improves the performance of 10 multithreaded programs by 23.4% on average.


43

Saravanan, G., and N. Yuvaraj. "Cloud resource optimization based on poisson linear deep gradient learning for mobile cloud computing." Journal of Intelligent & Fuzzy Systems 40, no. 1 (2021): 787–97. http://dx.doi.org/10.3233/jifs-200799.

Annotation:

Mobile Cloud Computing (MCC) addresses the limitations of mobile users (MUs) by transferring the in-depth evaluation of mobile applications to a centralized cloud via a wireless medium, which reduces load and thereby optimizes resources. In this paper, we consider the resource (i.e., bandwidth and memory) allocation problem to support mobile applications in an MCC environment. In such an environment, Mobile Cloud Service Providers (MCSPs) form a coalition to create a resource pool and share their resources with the mobile cloud users. To enhance the welfare of the MCSPs, a method for optimal resource allocation to the mobile users, called Poisson Linear Deep Resource Allocation (PL-DRA), is designed. For resource allocation between mobile users, we formulate and solve optimization models to obtain an optimal number of application instances while meeting the requirements of mobile users. For optimal application instances, the Poisson Distributed Queuing model is designed. The distributed resource management is designed as a multithreaded model in which parallel computation is provided. Next, a Linear Gradient Deep Resource Allocation (LG-DRA) model is designed based on the constraints, bandwidth, and memory to allocate mobile user instances. This model combines the advantages of both decision making (i.e., Linear Programming) and perception ability (i.e., Deep Resource Allocation). In addition, Stochastic Gradient Learning is utilized to address mobile user scalability. The simulation results show that the Poisson queuing strategy based on the improved Deep Learning algorithm has better performance in response time, response overhead, and energy consumption than other algorithms.

44

Adam, George K. "Timing and Performance Metrics for TWR-K70F120M Device." Computers 12, no. 8 (2023): 163. http://dx.doi.org/10.3390/computers12080163.

Annotation:

Currently, single-board computers (SBCs) are sufficiently powerful to run real-time operating systems (RTOSs) and applications. The purpose of this research was to investigate the timing performance of an NXP TWR-K70F120M device with μClinux OS on concurrently running tasks with real-time features and constraints, and to provide new and distinct technical data not yet available in the literature. Towards this goal, a custom-built multithreaded application with specific compute-intensive sorting and matrix operations was developed and applied to obtain measurements of specific timing metrics, including task execution time, thread waiting time, and response time. In this way, this research extends the literature by documenting performance results on specific timing metrics. The performance of this device was additionally benchmarked and validated against commonly used platforms, the Raspberry Pi 4 and BeagleBone AI SBCs. The experimental results showed that this device performs well in terms of both timing and efficiency metrics. Execution times were lower than with the other platforms, by approximately 56% in the case of two threads and by 29% in the case of 32-thread configurations. The outcomes could be of practical value to companies that intend to use such low-cost embedded devices in the development of reliable real-time industrial applications.
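
To make the three timing metrics named in this annotation concrete, here is a minimal sketch of how they could be recorded per task in a thread pool. It is not the author's benchmark: the sorting workload, the function names, and the parameters are assumptions, and in CPython the global interpreter lock serializes CPU-bound work, so the sketch shows how the metrics are defined rather than reproducing the reported measurements.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def sort_workload(n, seed):
    """Hypothetical compute-intensive task: sort n pseudo-random numbers."""
    rng = random.Random(seed)
    data = [rng.random() for _ in range(n)]
    data.sort()
    return data


def timed(task, submitted_at, *args):
    """Run task(*args) and report its waiting, execution, and response times."""
    started_at = time.perf_counter()
    task(*args)
    finished_at = time.perf_counter()
    return {
        "waiting": started_at - submitted_at,     # queued until a thread picks it up
        "execution": finished_at - started_at,    # time spent running the task
        "response": finished_at - submitted_at,   # submission to completion
    }


def measure(num_threads, num_tasks, n=20_000):
    """Submit num_tasks sorting tasks to a pool of num_threads worker threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = []
        for seed in range(num_tasks):
            submitted_at = time.perf_counter()
            futures.append(pool.submit(timed, sort_workload, submitted_at, n, seed))
        return [future.result() for future in futures]


for threads in (2, 32):
    results = measure(threads, num_tasks=64)
    mean_exec = sum(r["execution"] for r in results) / len(results)
    print(f"{threads} threads: mean execution time {mean_exec:.6f} s")
```

By construction, each task's response time is the sum of its waiting and execution times, so comparing configurations comes down to how the pool size shifts time between the two.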

45

Городняя, Лидия Васильевна. "Perspectives of Functional Programming of Parallel Computations." Russian Digital Libraries Journal 24, no. 6 (2022): 1090–116. http://dx.doi.org/10.26907/1562-5419-2021-24-6-1090-1116.

Annotation:

The article presents an analysis of modern trends in functional programming, considered as a metaparadigm for solving the problems of organizing parallel computations and multithreaded programs for multiprocessor complexes and distributed systems. Taking into account the multi-paradigm nature of parallel programming, a paradigm analysis of languages and functional programming systems is used. This makes it possible to reduce the complexity of the problems being solved by decomposing programs into autonomously developed components and to evaluate their similarities and differences. Consideration of such features is necessary when predicting the course of application processes, as well as when planning the study and organizing the development of programs. There is reason to believe that functional programming can improve program performance. A variety of paradigmatic characteristics inherent in the preparation and debugging of long-lived parallel computing programs are shown.

46

Guerra, Jorge, Hajime Nobuhara, and Kaoru Hirota. "Fuzzy Configuration Space for Moving Obstacle Avoidance of Autonomous Mobile Robots." Journal of Advanced Computational Intelligence and Intelligent Informatics 10, no. 1 (2006): 26–34. http://dx.doi.org/10.20965/jaciii.2006.p0026.

Annotation:

A fuzzy configuration space description method that provides a path planning solution for autonomous mobile robots in dynamically changing environments is proposed, based on a hybrid planning algorithm that combines total solutions and reactive control through fuzzy proximity measures. The system (written in C++), which monitors and controls mobile robots remotely, is built on a multithreaded model and takes advantage of high-performance OpenGL routines to counter the increase in computational cost generated by this approach. Experiments on a real Lego robot are performed using a personal computer with a 1.5 GHz Pentium 4 CPU and a CCD camera. The efficiency of the hybrid algorithm and the potential of this approach, as a distributed system, in greatly changing dynamic environments are shown. The system provides a starting point for further development of distributed robotic systems for application in human support tasks, where interactions with imprecise human behaviors are better described with fuzzy parameters.

47

Malek, Maximilian, and Christoph W. Sensen. "Instant Feedback Rapid Prototyping for GPU-Accelerated Computation, Manipulation, and Visualization of Multidimensional Data." International Journal of Biomedical Imaging 2018 (June 3, 2018): 1–9. http://dx.doi.org/10.1155/2018/2046269.

Annotation:

Objective. We have created an open-source application and framework for rapid GPU-accelerated prototyping, targeting image analysis, including volumetric images such as CT or MRI data. Methods. A visual graph editor enables the design of processing pipelines without programming. Run-time compiled compute shaders enable prototyping of complex operations in a matter of minutes. Results. GPU acceleration increases the processing speed by at least an order of magnitude when compared to traditional multithreaded CPU-based implementations, while offering the flexibility of scripted implementations. Conclusion. Our framework enables real-time, intuition-guided accelerated algorithm and method development, supported by built-in scriptable visualization. Significance. This is, to our knowledge, the first tool for medical data analysis that provides both high performance and rapid prototyping. As such, it has the potential to act as a force multiplier for further research, enabling handling of high-resolution datasets while providing quasi-instant feedback and visualization of results.

48

Y M, Manu. "StateOS: A Memory Sufficient Hybrid Operating System for IOT Devices." International Journal of Scientific Research in Engineering and Management 08, no. 06 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem35618.

Annotation:

The proliferation of Internet of Things (IoT) technology has spurred the need for efficient operating systems tailored to the constraints of IoT devices. This paper presents StateOS, a hybrid operating system designed to meet the demands of sensor-based IoT devices and wireless sensors. StateOS combines the memory efficiency of event-driven systems with the clarity of control flow provided by multithreaded approaches. By implementing a microkernel architecture, cross-layer network protocol design, and a hybrid task scheduler, StateOS offers real-time capabilities while minimizing memory footprint. Macro-based task APIs and state machine-based visual programming facilitate cooperative threaded programming, making StateOS suitable for both experienced developers and novices. The paper provides an in-depth overview of StateOS's architecture, kernel functionalities, code examples, implementation on various platforms, performance evaluation, and concludes by highlighting its potential as a valuable tool for modern IoT application development. Keywords — Hybrid operating system (OS), Internet of Things (IoT) and Memory efficiency.

49

Machado, Iar, Michael Stanton, and Tiago Salmito. "Internet Protocol Version 6 (IPv6) - Internet Protocol Version 4 (IPv4) Network Address, Port & Protocol Translation And Multithreaded DNS-Application Gateway." Proceedings of the Asia-Pacific Advanced Network 31 (June 1, 2011): 12. http://dx.doi.org/10.7125/apan.31.2.

50

Singh, Abhinav, Navpreet Singh, and Vinay Bajpai. "Internet Protocol Version 6 (IPv6) - Internet Protocol Version 4 (IPv4) Network Address, Port & Protocol Translation And Multithreaded DNS-Application Gateway." Proceedings of the Asia-Pacific Advanced Network 31 (June 1, 2011): 12. http://dx.doi.org/10.7125/apan.31.21.
