Journal articles on the topic 'Parallelisation in time'

Consult the top 50 journal articles for your research on the topic 'Parallelisation in time.'

1

Ajtonyi, István, and Gábor Terstyánszky. "Real-Time Requirements and Parallelisation in Fault Diagnosis." IFAC Proceedings Volumes 28, no. 5 (May 1995): 471–77. http://dx.doi.org/10.1016/s1474-6670(17)47268-5.

2

Kaber, Sidi-Mahmoud, Amine Loumi, and Philippe Parnaudeau. "Parallel Solution of Linear Systems." East Asian Journal on Applied Mathematics 6, no. 3 (July 20, 2016): 278–89. http://dx.doi.org/10.4208/eajam.210715.250316a.

Abstract:
Computational scientists generally seek more accurate results in shorter times, and to achieve this a knowledge of evolving programming paradigms and hardware is important. In particular, optimising solvers for linear systems is a major challenge in scientific computation, and numerical algorithms must be modified or new ones created to fully use the parallel architecture of new computers. Parallel space discretisation solvers for Partial Differential Equations (PDE) such as Domain Decomposition Methods (DDM) are efficient and well documented. At first glance, parallelisation seems to be inconsistent with inherently sequential time evolution, but parallelisation is not limited to space directions. In this article, we present a new and simple method for time parallelisation, based on partial fraction decomposition of the inverse of some special matrices. We discuss its application to the heat equation and some limitations, in associated numerical experiments.
3

Drysdale, Timothy David, and Tomasz P. Stefanski. "Parallelisation of Implicit Time Domain Methods: Progress with ADI-FDTD." PIERS Online 5, no. 2 (2009): 117–20. http://dx.doi.org/10.2529/piers080905063810.

4

Poulhaon, Fabien, Francisco Chinesta, and Adrien Leygue. "A first step toward a PGD-based time parallelisation strategy." European Journal of Computational Mechanics 21, no. 3-6 (August 30, 2012): 300–311. http://dx.doi.org/10.1080/17797179.2012.714985.

5

Dodson, S. J., S. P. Walker, and M. J. Bluck. "Parallelisation issues for high speed time domain integral equation analysis." Parallel Computing 25, no. 8 (September 1999): 925–42. http://dx.doi.org/10.1016/s0167-8191(99)00031-9.

6

Niculescu, Virginia, and Robert Manuel Ştefănică. "Tries-Based Parallel Solutions for Generating Perfect Crosswords Grids." Algorithms 15, no. 1 (January 13, 2022): 22. http://dx.doi.org/10.3390/a15010022.

Abstract:
A general crossword grid generation is considered an NP-complete problem and theoretically it could be a good candidate to be used by cryptography algorithms. In this article, we propose a new algorithm for generating perfect crosswords grids (with no black boxes) that relies on using tries data structures, which are very important for reducing the time for finding the solutions, and offers good opportunity for parallelisation, too. The algorithm uses a special tries representation and it is very efficient, but through parallelisation the performance is improved to a level that allows the solution to be obtained extremely fast. The experiments were conducted using a dictionary of almost 700,000 words, and the solutions were obtained using the parallelised version with an execution time in the order of minutes. We demonstrate here that finding a perfect crossword grid could be solved faster than has been estimated before, if we use tries as supporting data structures together with parallelisation. Still, if the size of the dictionary is increased by a lot (e.g., considering a set of dictionaries for different languages—not only for one), or through a generalisation to a 3D space or multidimensional spaces, then the problem still could be investigated for a possible usage in cryptography.
7

Iman Fitri Ismail, Akmal Nizam Mohammed, Bambang Basuno, Siti Aisyah Alimuddin, and Mustafa Alas. "Evaluation of CFD Computing Performance on Multi-Core Processors for Flow Simulations." Journal of Advanced Research in Applied Sciences and Engineering Technology 28, no. 1 (September 11, 2022): 67–80. http://dx.doi.org/10.37934/araset.28.1.6780.

Abstract:
Previous parallel computing implementations for Computational Fluid Dynamics (CFD) focused extensively on Complex Instruction Set Computer (CISC). Parallel programming was incorporated into the previous generation of the Raspberry Pi Reduced Instruction Set Computer (RISC). However, it yielded poor computing performance due to the processing power limits of the time. This research focuses on utilising two Raspberry Pi 3 B+ with increased processing capability compared to its previous generation to tackle fluid flow problems using numerical analysis and CFD. Parallel computing elements such as Secure Shell (SSH) and the Message Passing Interface (MPI) protocol were implemented for Advanced RISC Machine (ARM) processors. The parallel network was then validated by a processor call attempt and core execution test. Parallelisation of the processors enables the study of fluid flow and computational fluid dynamics (CFD) problems, such as validation of the NACA 0012 airfoil and an additional case of the Laplace equation for computing the temperature distribution via the parallel system. The experimental NACA 0012 data was validated using the parallel system, which can simulate the airfoil's physics. Each core was enabled and tested to determine the system's performance in parallelising the execution of various programming algorithms such as pi calculation. A comparison of the execution time for the NACA 0012 validation case yielded a parallelisation efficiency above 50%. The case studies confirmed the Raspberry Pi 3 B+'s successful parallelisation independent of external software and machines, making it a self-sustaining compact demonstration cluster of parallel computers for CFD.
8

HAMMERSLEY, ANDREW. "Parallelisation of a 2-D Fast Fourier Transform Algorithm." International Journal of Modern Physics C 02, no. 01 (March 1991): 363–66. http://dx.doi.org/10.1142/s0129183191000494.

Abstract:
The calculation of two and higher-dimension Fast Fourier Transforms (FFT’s) are of great importance in many areas of data analysis and computational physics. The two-dimensional FFT is implemented for a parallel network using a master-slave approach. In-place performance is good, but the use of this technique as an “accelerator” is limited by the communications time between the host and the network. The total time is reduced by performing the host-master communications in parallel with the master-slave communications. Results for the calculation of the two-dimensional FFT of real-valued datasets are presented.
9

MCAVANEY, CHRISTOPHER, and ANDRZEJ GOSCINSKI. "AUTOMATIC PARALLELISATION AND EXECUTION OF APPLICATIONS ON CLUSTERS." Journal of Interconnection Networks 02, no. 03 (September 2001): 331–43. http://dx.doi.org/10.1142/s0219265901000427.

Abstract:
Parallel execution is a very efficient means of processing vast amounts of data in a small amount of time. Creating parallel applications has never been easy, and requires much knowledge of the task and the execution environment used to execute parallel processes. The process of creating parallel applications can be made easier through using a compiler that automatically parallelises a supplied application. Executing the parallel application is also simplified when a well designed execution environment is used. Such an execution environment provides very powerful operations to the programmer transparently. Combining both a parallelising compiler and execution environment and providing a fully automated parallelisation and execution tool is the aim of this research. The advantage of using such a fully automated tool is that the user does not need to provide any additional input to gain the benefits of parallel execution. This report shows the tool and how it transparently supports the programmer creating parallel applications and supports their execution.
10

Taygan, Ugur, and Adnan Ozsoy. "Performance analysis and GPU parallelisation of ECO object tracking algorithm." New Trends and Issues Proceedings on Advances in Pure and Applied Sciences, no. 12 (April 30, 2020): 109–18. http://dx.doi.org/10.18844/gjpaas.v0i12.4991.

Abstract:
The classification and tracking of objects has gained popularity in recent years due to the variety and importance of their application areas. Although object classification does not necessarily have to be real time, object tracking is often intended to be carried out in real time. While the object tracking algorithm mainly focuses on robustness and accuracy, the speed of the algorithm may degrade significantly. Due to their parallelisable nature, the use of GPUs and other parallel programming tools are increasing in the object tracking applications. In this paper, we run experiments on the Efficient Convolution Operators object tracking algorithm, in order to detect its time-consuming parts, which are the bottlenecks of the algorithm, and investigate the possibility of GPU parallelisation of the bottlenecks to improve the speed of the algorithm. Finally, the candidate methods are implemented and parallelised using the Compute Unified Device Architecture. Keywords: Object tracking, parallel programming.
11

Epicoco, I., S. Mocavero, F. Macchia, M. Vichi, T. Lovato, S. Masina, and G. Aloisio. "Performance and results of the high-resolution biogeochemical model PELAGOS025 within NEMO." Geoscientific Model Development Discussions 8, no. 12 (December 16, 2015): 10585–625. http://dx.doi.org/10.5194/gmdd-8-10585-2015.

Abstract:
The present work aims at evaluating the scalability performance of a high-resolution global ocean biogeochemistry model (PELAGOS025) on massive parallel architectures and the benefits in terms of the time-to-solution reduction. PELAGOS025 is an on-line coupling between the physical ocean model NEMO and the BFM biogeochemical model. Both the models use a parallel domain decomposition along the horizontal dimension. The parallelisation is based on the message passing paradigm. The performance analysis has been done on two parallel architectures, an IBM BlueGene/Q at ALCF (Argonne Leadership Computing Facilities) and an IBM iDataPlex with Sandy Bridge processors at CMCC (Euro Mediterranean Center on Climate Change). The outcome of the analysis demonstrated that the lack of scalability is due to several factors such as the I/O operations, the memory contention, the load unbalancing due to the memory structure of the BFM component and, for the BlueGene/Q, the absence of a hybrid parallelisation approach.
12

Fairbrother-Browne, Aine, Sonia García-Ruiz, Regina Hertfelder Reynolds, Mina Ryten, and Alan Hodgkinson. "ensemblQueryR: fast, flexible and high-throughput querying of Ensembl LD API endpoints in R." Gigabyte 2023 (September 14, 2023): 1–10. http://dx.doi.org/10.46471/gigabyte.91.

Abstract:
We present ensemblQueryR, an R package for querying Ensembl linkage disequilibrium (LD) endpoints. This package is flexible, fast and user-friendly, and optimised for high-throughput querying. ensemblQueryR uses functions that are intuitive and amenable to custom code integration, familiar R object types as inputs and outputs as well as providing parallelisation functionality. For each Ensembl LD endpoint, ensemblQueryR provides two functions, permitting both single- and multi-query modes of operation. The multi-query functions are optimised for large query sizes and provide optional parallelisation to leverage available computational resources and minimise processing time. We demonstrate improved computational performance of ensemblQueryR over an existing tool in terms of random access memory (RAM) usage and speed, delivering a 10-fold speed increase whilst using a third of the RAM. Finally, ensemblQueryR is near-agnostic to operating system and computational architecture through Docker and singularity images, making this tool widely accessible to the scientific community.
13

DASH, S. K. "On the parallelisation of weather and climate models." MAUSAM 52, no. 1 (December 29, 2021): 191–200. http://dx.doi.org/10.54302/mausam.v52i1.1687.

Abstract:
The numerical models used for weather forecasting and climate studies need very large computing resources. The current research in the field indicates that for accurate forecasts, one needs to use models at very high resolution, sophisticated data assimilation techniques and physical parameterisation schemes and multi-model ensemble integrations. In fact the spatial resolution required for accurate forecasts may demand computing power which is prohibitively high considering the processing power of a single processor of any supercomputer. During the last two decades, the developments in computing technology show the emergence of parallel computers with a number of processors which are capable of supplying enormously large computing power as against a single computer. Today, a cluster of workstations or personal computers can be used in parallel to integrate a global climate model for a long time. However, there are bottlenecks to be overcome in order to achieve maximum efficiency. Inter-processor communication is the key issue in case of global weather and climate models. The present paper aims at discussing the status of parallelisation of weather and climate models at leading centres of operational forecasting and research, the inherent parallelism in weather and climate models, the problems encountered in inter-processing communication and various ways of achieving maximum parallel efficiency.
14

Bashford, Tim, Kelvin Donne, Arnaud Marotin, and Ala Al-Hussany. "Parallelisation Techniques for the Dual Reciprocity and Time-Dependent Boundary Element Method Algorithms." International Journal of Computational Methods and Experimental Measurements 5, no. 3 (April 1, 2017): 395–403. http://dx.doi.org/10.2495/cmem-v5-n3-395-403.

15

Epicoco, Italo, Silvia Mocavero, Francesca Macchia, Marcello Vichi, Tomas Lovato, Simona Masina, and Giovanni Aloisio. "Performance and results of the high-resolution biogeochemical model PELAGOS025 v1.0 within NEMO v3.4." Geoscientific Model Development 9, no. 6 (June 10, 2016): 2115–28. http://dx.doi.org/10.5194/gmd-9-2115-2016.

Abstract:
The present work aims at evaluating the scalability performance of a high-resolution global ocean biogeochemistry model (PELAGOS025) on massive parallel architectures and the benefits in terms of the time-to-solution reduction. PELAGOS025 is an on-line coupling between the Nucleus for the European Modelling of the Ocean (NEMO) physical ocean model and the Biogeochemical Flux Model (BFM) biogeochemical model. Both the models use a parallel domain decomposition along the horizontal dimension. The parallelisation is based on the message passing paradigm. The performance analysis has been done on two parallel architectures, an IBM BlueGene/Q at ALCF (Argonne Leadership Computing Facilities) and an IBM iDataPlex with Sandy Bridge processors at the CMCC (Euro Mediterranean Center on Climate Change). The outcome of the analysis demonstrated that the lack of scalability is due to several factors such as the I/O operations, the memory contention, the load unbalancing due to the memory structure of the BFM component and, for the BlueGene/Q, the absence of a hybrid parallelisation approach.
16

Jaruga, A., S. Arabas, D. Jarecka, H. Pawlowska, P. K. Smolarkiewicz, and M. Waruszewski. "libmpdata++ 0.1: a library of parallel MPDATA solvers for systems of generalised transport equations." Geoscientific Model Development Discussions 7, no. 6 (November 26, 2014): 8179–273. http://dx.doi.org/10.5194/gmdd-7-8179-2014.

Abstract:
This paper accompanies first release of libmpdata++, a C++ library implementing the Multidimensional Positive-Definite Advection Transport Algorithm (MPDATA). The library offers basic numerical solvers for systems of generalised transport equations. The solvers are forward-in-time, conservative and non-linearly stable. The libmpdata++ library covers the basic second-order-accurate formulation of MPDATA, its third-order variant, the infinite-gauge option for variable-sign fields and a flux-corrected transport extension to guarantee non-oscillatory solutions. The library is equipped with a non-symmetric variational elliptic solver for implicit evaluation of pressure gradient terms. All solvers offer parallelisation through domain decomposition using shared-memory parallelisation. The paper describes the library programming interface, and serves as a user guide. Supported options are illustrated with benchmarks discussed in the MPDATA literature. Benchmark descriptions include code snippets as well as quantitative representations of simulation results. Examples of applications include: homogeneous transport in one, two and three dimensions in Cartesian and spherical domains; shallow-water system compared with analytical solution (originally derived for a 2-D case); and a buoyant convection problem in an incompressible Boussinesq fluid with interfacial instability. All the examples are implemented out of the library tree. Regardless of the differences in the problem dimensionality, right-hand-side terms, boundary conditions and parallelisation approach, all the examples use the same unmodified library, which is a key goal of libmpdata++ design. The design, based on the principle of separation of concerns, prioritises the user and developer productivity. The libmpdata++ library is implemented in C++, making use of the Blitz++ multi-dimensional array containers, and is released as free/libre and open-source software.
17

Jaruga, A., S. Arabas, D. Jarecka, H. Pawlowska, P. K. Smolarkiewicz, and M. Waruszewski. "libmpdata++ 1.0: a library of parallel MPDATA solvers for systems of generalised transport equations." Geoscientific Model Development 8, no. 4 (April 8, 2015): 1005–32. http://dx.doi.org/10.5194/gmd-8-1005-2015.

Abstract:
This paper accompanies the first release of libmpdata++, a C++ library implementing the multi-dimensional positive-definite advection transport algorithm (MPDATA) on regular structured grid. The library offers basic numerical solvers for systems of generalised transport equations. The solvers are forward-in-time, conservative and non-linearly stable. The libmpdata++ library covers the basic second-order-accurate formulation of MPDATA, its third-order variant, the infinite-gauge option for variable-sign fields and a flux-corrected transport extension to guarantee non-oscillatory solutions. The library is equipped with a non-symmetric variational elliptic solver for implicit evaluation of pressure gradient terms. All solvers offer parallelisation through domain decomposition using shared-memory parallelisation. The paper describes the library programming interface, and serves as a user guide. Supported options are illustrated with benchmarks discussed in the MPDATA literature. Benchmark descriptions include code snippets as well as quantitative representations of simulation results. Examples of applications include homogeneous transport in one, two and three dimensions in Cartesian and spherical domains; a shallow-water system compared with analytical solution (originally derived for a 2-D case); and a buoyant convection problem in an incompressible Boussinesq fluid with interfacial instability. All the examples are implemented out of the library tree. Regardless of the differences in the problem dimensionality, right-hand-side terms, boundary conditions and parallelisation approach, all the examples use the same unmodified library, which is a key goal of libmpdata++ design. The design, based on the principle of separation of concerns, prioritises the user and developer productivity. The libmpdata++ library is implemented in C++, making use of the Blitz++ multi-dimensional array containers, and is released as free/libre and open-source software.
18

FOGELBERG, CHRISTOPHER, and VASILE PALADE. "DENSE STRUCTURAL EXPECTATION MAXIMISATION WITH PARALLELISATION FOR EFFICIENT LARGE-NETWORK STRUCTURAL INFERENCE." International Journal on Artificial Intelligence Tools 22, no. 03 (June 2013): 1350011. http://dx.doi.org/10.1142/s0218213013500115.

Abstract:
Research on networks is increasingly popular in a wide range of machine learning fields, and structural inference of networks is a key problem. Unfortunately, network structural inference is time consuming and there is an increasing need to infer the structure of ever-larger networks. This article presents the Dense Structural Expectation Maximisation (DSEM) algorithm, a novel extension of the well-known SEM algorithm. DSEM increases the efficiency of structural inference by using the time-expensive calculations required in each SEM iteration more efficiently, and can be O(N) times faster than SEM, where N is the size of the network. The article has also combined DSEM with parallelisation and evaluated the impact of these improvements over SEM, individually and combined. The possibility of combining these novel approaches with other research on structural inference is also considered. The contributions also appear to be usable for all kinds of structural inference, and may greatly improve the range, variety and size of problems which can be tractably addressed. Code is freely available online at: http://syntilect.com/cgf/pubs:software .
19

Naeem, M. Asif, Habib Khan, Saad Aslam, and Noreen Jamil. "Parallelisation of a Cache-Based Stream-Relation Join for a Near-Real-Time Data Warehouse." Electronics 9, no. 8 (August 12, 2020): 1299. http://dx.doi.org/10.3390/electronics9081299.

Abstract:
Near real-time data warehousing is an important area of research, as business organisations want to analyse their businesses sales with minimal latency. Therefore, sales data generated by data sources need to reflect immediately in the data warehouse. This requires near-real-time transformation of the stream of sales data with a disk-based relation called master data in the staging area. For this purpose, a stream-relation join is required. The main problem in stream-relation joins is the different nature of inputs; stream data is fast and bursty, whereas the disk-based relation is slow due to high disk I/O cost. To resolve this problem, a famous algorithm CACHEJOIN (cache join) was published in the literature. The algorithm has two phases, the disk-probing phase and the stream-probing phase. These two phases execute sequentially; that means stream tuples wait unnecessarily due to the sequential execution of both phases. This limits the algorithm to exploiting CPU resources optimally. In this paper, we address this issue by presenting a robust algorithm called PCSRJ (parallelised cache-based stream relation join). The new algorithm enables the execution of both disk-probing and stream-probing phases of CACHEJOIN in parallel. The algorithm distributes the disk-based relation on two separate nodes and enables parallel execution of CACHEJOIN on each node. The algorithm also implements a strategy of splitting the stream data on each node depending on the relevant part of the relation. We developed a cost model for PCSRJ and validated it empirically. We compared the service rates of both algorithms using a synthetic dataset. Our experiments showed that PCSRJ significantly outperforms CACHEJOIN.
20

Haveraaen, Magne. "Machine and Collection Abstractions for User-Implemented Data-Parallel Programming." Scientific Programming 8, no. 4 (2000): 231–46. http://dx.doi.org/10.1155/2000/485607.

Abstract:
Data parallelism has appeared as a fruitful approach to the parallelisation of compute-intensive programs. Data parallelism has the advantage of mimicking the sequential (and deterministic) structure of programs as opposed to task parallelism, where the explicit interaction of processes has to be programmed. In data parallelism data structures, typically collection classes in the form of large arrays, are distributed on the processors of the target parallel machine. Trying to extract distribution aspects from conventional code often runs into problems with a lack of uniformity in the use of the data structures and in the expression of data dependency patterns within the code. Here we propose a framework with two conceptual classes, Machine and Collection. The Machine class abstracts hardware communication and distribution properties. This gives a programmer high-level access to the important parts of the low-level architecture. The Machine class may readily be used in the implementation of a Collection class, giving the programmer full control of the parallel distribution of data, as well as allowing normal sequential implementation of this class. Any program using such a collection class will be parallelisable, without requiring any modification, by choosing between sequential and parallel versions at link time. Experiments with a commercial application, built using the Sophus library which uses this approach to parallelisation, show good parallel speed-ups, without any adaptation of the application program being needed.
21

James, Edward, and Peter Munro. "Diffuse Correlation Spectroscopy: A Review of Recent Advances in Parallelisation and Depth Discrimination Techniques." Sensors 23, no. 23 (November 22, 2023): 9338. http://dx.doi.org/10.3390/s23239338.

Abstract:
Diffuse correlation spectroscopy is a non-invasive optical modality used to measure cerebral blood flow in real time, and it has important potential applications in clinical monitoring and neuroscience. As such, many research groups have recently been investigating methods to improve the signal-to-noise ratio, imaging depth, and spatial resolution of diffuse correlation spectroscopy. Such methods have included multispeckle, long wavelength, interferometric, depth discrimination, time-of-flight resolution, and acousto-optic detection strategies. In this review, we exhaustively appraise this plethora of recent advances, which can be used to assess limitations and guide innovation for future implementations of diffuse correlation spectroscopy that will harness technological improvements in the years to come.
22

Buono, Daniele, Gabriele Mencagli, Alessio Pascucci, and Marco Vanneschi. "Performance analysis and structured parallelisation of the space–time adaptive processing computational kernel on multi-core architectures." International Journal of Parallel, Emergent and Distributed Systems 29, no. 5 (February 11, 2014): 460–98. http://dx.doi.org/10.1080/17445760.2014.885967.

23

Albertsson, Kim, Sergei Gleyzer, Marc Huwiler, Vladimir Ilievski, Lorenzo Moneta, Saurav Shekar, Victor Estrade, Akshay Vashistha, Stefan Wunsch, and Omar Andres Zapata Mesa. "New Machine Learning Developments in ROOT/TMVA." EPJ Web of Conferences 214 (2019): 06014. http://dx.doi.org/10.1051/epjconf/201921406014.

Abstract:
The Toolkit for Multivariate Analysis, TMVA, the machine learning package integrated into the ROOT data analysis framework, has recently seen improvements to its deep learning module, parallelisation of multivariate methods and cross validation. Performance benchmarks on datasets from high-energy physics are presented with a particular focus on the new deep learning module which contains robust fully-connected, convolutional and recurrent deep neural networks implemented on CPU and GPU architectures. Both dense and convolutional layers are shown to be competitive on small-scale networks suitable for high-level physics analyses in both training and in single-event evaluation. Parallelisation efforts show an asymptotical 3-fold reduction in boosted decision tree training time while the cross validation implementation shows significant speed up with parallel fold evaluation.
24

Karchev, Konstantin. "Analytic auto-differentiable ΛCDM cosmography." Journal of Cosmology and Astroparticle Physics 2023, no. 07 (July 1, 2023): 065. http://dx.doi.org/10.1088/1475-7516/2023/07/065.

Abstract:
I present general analytic expressions for distance calculations (comoving distance, time coordinate, and absorption distance) in the standard ΛCDM cosmology, allowing for the presence of radiation and for non-zero curvature. The solutions utilise the symmetric Carlson basis of elliptic integrals, which can be evaluated with fast numerical algorithms that allow trivial parallelisation on GPUs and automatic differentiation without the need for additional special functions. I introduce a PyTorch-based implementation in the phytorch.cosmology package and briefly examine its accuracy and speed in comparison with numerical integration and other known expressions (for special cases). Finally, I demonstrate an application to high-dimensional Bayesian analysis that utilises automatic differentiation through the distance calculations to efficiently derive posteriors for cosmological parameters from up to 10^6 mock type Ia supernovæ using variational inference.
25

GENT, IAN P., IAN MIGUEL, PETER NIGHTINGALE, CIARAN MCCREESH, PATRICK PROSSER, NEIL C. A. MOORE, and CHRIS UNSWORTH. "A review of literature on parallel constraint solving." Theory and Practice of Logic Programming 18, no. 5-6 (August 2, 2018): 725–58. http://dx.doi.org/10.1017/s1471068418000340.

Abstract:
As multi-core computing is now standard, it seems irresponsible for constraints researchers to ignore the implications of it. Researchers need to address a number of issues to exploit parallelism, such as: investigating which constraint algorithms are amenable to parallelisation; whether to use shared memory or distributed computation; whether to use static or dynamic decomposition; and how to best exploit portfolios and cooperating search. We review the literature, and see that we can sometimes do quite well, some of the time, on some instances, but we are far from a general solution. Yet there seems to be little overall guidance that can be given on how best to exploit multi-core computers to speed up constraint solving. We hope at least that this survey will provide useful pointers to future researchers wishing to correct this situation.
26

Otani, Yoshihiro, Naoshi Nishimura, and Michihiro Kitahara. "On the parallelisation of a time domain fast boundary integral equation method for three dimensional elastodynamics for shared memory computers." Journal of Applied Mechanics 7 (2004): 295–304. http://dx.doi.org/10.2208/journalam.7.295.

27

Rodríguez-Vázquez, Juan José, José Luis Vázquez-Poletti, Carlos Delgado, Andrea Bulgarelli, and Miguel Cárdenas-Montes. "Performance study of a signal-extraction algorithm using different parallelisation strategies for the Cherenkov Telescope Array's real-time-analysis software." Concurrency and Computation: Practice and Experience 29, no. 12 (May 17, 2017): e4086. http://dx.doi.org/10.1002/cpe.4086.

28

Walter, Johanna-Gabriela, Lourdes S. M. Alwis, Bernhard Roth, and Kort Bremer. "All-Optical Planar Polymer Waveguide-Based Biosensor Chip Designed for Smartphone-Assisted Detection of Vitamin D." Sensors 20, no. 23 (November 27, 2020): 6771. http://dx.doi.org/10.3390/s20236771.

Abstract:
An all-optical plasmonic sensor platform designed for smartphones based on planar-optical waveguide structures integrated in a polymer chip is reported for the first time. To demonstrate the applicability of the sensor system for biosensing purposes, the detection of 25-hydroxyvitamin D (25OHD) in human serum samples using an AuNP-enhanced aptamer-based assay was demonstrated. With the aid of the developed assay, a sensitivity of 0.752 pixel/nM was achieved for 25OHD concentrations ranging from 0–100 nM. The waveguide structure of the sensor enables miniaturisation and parallelisation, and thus demonstrates the potential for simultaneous detection of various analytes including biomarkers. The entire optical arrangement can be integrated into a single polymer chip, which allows for large-scale and cost-efficient sensor fabrication. The broad utilization of and access to smartphone electronics make the proposed design most attractive for wider use in lab-on-chip applications.
29

Lütgehetmann, Daniel, Dejan Govc, Jason P. Smith, and Ran Levi. "Computing Persistent Homology of Directed Flag Complexes." Algorithms 13, no. 1 (January 7, 2020): 19. http://dx.doi.org/10.3390/a13010019.

Abstract:
We present a new computing package Flagser, designed to construct the directed flag complex of a finite directed graph, and compute persistent homology for flexibly defined filtrations on the graph and the resulting complex. The persistent homology computation part of Flagser is based on the program Ripser by U. Bauer, but is optimised specifically for large computations. The construction of the directed flag complex is done in a way that allows easy parallelisation by arbitrarily many cores. Flagser also has the option of working with undirected graphs. For homology computations Flagser has an approximate option, which shortens compute time with remarkable accuracy. We demonstrate the power of Flagser by applying it to the construction of the directed flag complex of digital reconstructions of brain microcircuitry by the Blue Brain Project and several other examples. In some instances we perform computation of homology. For a more complete performance analysis, we also apply Flagser to some other data collections. In all cases the hardware used in the computation, the use of memory and the compute time are recorded.
30

Riemer, Björn, and Kay Hameyer. "Multi-harmonic approach to determine load dependent local flux variations in power transformer cores." Archives of Electrical Engineering 64, no. 1 (March 1, 2015): 129–38. http://dx.doi.org/10.1515/aee-2015-0012.

Abstract:
This paper presents a methodology for the calculation of the flux distribution in power transformer cores considering nonlinear material, with reduced computational effort. The calculation is based on a weakly coupled multi-harmonic approach. The methodology can be applied to 2D and 3D Finite Element models. The decrease of the computational effort for the proposed approach is >90% compared to a time-stepping method at comparable accuracy. Furthermore, the approach offers a possibility for parallelisation to reduce the overall simulation time. The speed-up of the parallelised simulations is nearly linear. The methodology is applied to a single-phase and a three-phase power transformer. As an example, the flux distribution for a capacitive load case is determined and the differences in the flux distribution obtained by a 2D and a 3D FE model are pointed out. Deviations are significant because the 2D FE model underestimates the stray fluxes. It is shown that a 3D FE model of the transformer is required if the nonlinearity of the core material has to be taken into account.
31

Fleig, Luisa, and Klaus Hoschke. "An Automated Parametric Surface Patch-Based Construction Method for Smooth Lattice Structures with Irregular Topologies." Applied Sciences 13, no. 20 (October 12, 2023): 11223. http://dx.doi.org/10.3390/app132011223.

Abstract:
Additive manufacturing enables the realization of complex component designs that cannot be achieved with conventional processes, such as the integration of cellular structures (for example, lattice structures) for weight reduction. To include lattice structures in component designs, an automated algorithm compatible with conventional CAD that is able to handle various lattice topologies as well as variable local shape parameters such as strut radii is required. Smooth node transitions are desired due to their advantages in terms of reduced stress concentrations and improved fatigue performance. The surface patch-based algorithm developed in this work is able to solidify given lattice frames to smooth lattice structures without manual construction steps. The algorithm requires only a few seconds of sketching time for each node and favours parallelisation. Automated special-case workarounds as well as fallback mechanisms are considered for non-standard inputs. The algorithm is demonstrated on irregular lattice topologies and applied for the construction of a lattice infill of an aircraft component that was additively manufactured.
32

Vojinović, M. "CMS HGCAL electronics vertical integration system tests." Journal of Instrumentation 19, no. 02 (February 1, 2024): C02022. http://dx.doi.org/10.1088/1748-0221/19/02/c02022.

Abstract:
In preparation for the High-Luminosity phase of the CERN Large Hadron Collider, the Compact Muon Solenoid collaboration will replace the existing endcap calorimeters with the High Granularity Calorimeter. Considering both endcaps, this novel sub-detector will have around six million readout channels, pushing approximately 108 Tbit s⁻¹ of usable data between the detector front-end (on-detector) electronics and the back-end (off-detector) electronics. Given the scale and complexity of the endcap calorimeter upgrade project, the electronics testing must be carefully planned. The strategy has been to split the efforts between vertical (start-to-end) and horizontal (parallelisation) test systems wherever possible. An important milestone was the development and operation of a test system that prototypes a complete vertical slice of the future endcap electronics system. For the first time, a version of a test system consisting of the full vertical electronics readout chain was successfully operated in a beam test, where it was used to acquire real physics data.
33

Sirigu, M., E. Faraggiana, A. Ghigo, and G. Bracco. "Development of MOST, a fast simulation model for optimisation of floating offshore wind turbines in Simscape Multibody." Journal of Physics: Conference Series 2257, no. 1 (April 1, 2022): 012003. http://dx.doi.org/10.1088/1742-6596/2257/1/012003.

Abstract:
The paper presents the development of an innovative non-linear, time domain numerical model for the simulation of offshore floating wind turbines, named MOST. The model is able to evaluate the movement of the platform in six degrees of freedom, the power production and the load cycles acting on the blades. MOST is implemented in the Matlab-Simulink environment using Simscape Multibody. The aerodynamics is modelled with the blade element momentum theory and the hydrodynamics is modelled using WEC-Sim, a Simscape library developed by NREL and SANDIA. The use of Simscape offers great flexibility to quickly introduce complex dynamic systems such as hybrid wave-wind platforms, flexible platforms, or sea water active ballast systems. Additionally, Matlab provides useful toolboxes and extensive libraries for advanced control systems, linearisation analysis, parallelisation and generation of C code. The results of MOST are then compared to FAST, an open-source code widely used in academic research. The case study is a 15 MW reference wind turbine installed on the Volturn US platform. The comparison shows a good agreement between the two codes with a significant reduction of the simulation time.
34

Chevallier, F. "On the parallelization of atmospheric inversions of CO2 surface fluxes within a variational framework." Geoscientific Model Development Discussions 6, no. 1 (January 8, 2013): 37–57. http://dx.doi.org/10.5194/gmdd-6-37-2013.

Abstract:
The variational formulation of Bayes' theorem allows inferring CO2 sources and sinks from atmospheric concentrations at much higher space-time resolution than the ensemble approach or the analytical one. However, it usually exhibits limited scalable parallelism. This limitation hinders global atmospheric inversions operated on decadal time scales and regional ones with kilometric spatial scales, because of the computational cost of the underlying transport model that has to be run at each iteration of the variational minimization. Here, we introduce a Physical Parallelisation (PP) of variational atmospheric inversions. In the PP, the inversion still manages a single physically and statistically consistent window, but the transport model is run in parallel overlapping sub-segments in order to massively reduce the computation wall clock time of the inversion. For global inversions, a simplification of transport modelling is described to connect the output of all segments. We demonstrate the performance of the approach on a global inversion for CO2 with a 32-yr inversion window (1979–2010) with atmospheric measurements from 81 sites of the NOAA global cooperative air sampling network. In this case, we show that the duration of the inversion is reduced by a seven-fold factor (from months to days) while still processing the three decades consistently and with improved numerical stability.
35

Rahimi, M. M., and F. Hakimpour. "TOWARDS A CLOUD BASED SMART TRAFFIC MANAGEMENT FRAMEWORK." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-4/W4 (September 27, 2017): 447–53. http://dx.doi.org/10.5194/isprs-archives-xlii-4-w4-447-2017.

Abstract:
Traffic big data has brought many opportunities for traffic management applications. However, several challenges such as heterogeneity, storage, management, processing and analysis of traffic big data may hinder their efficient and real-time applications. All these challenges call for a well-adapted distributed framework for smart traffic management that can efficiently handle big traffic data integration, indexing, query processing, mining and analysis. In this paper, we present a novel, distributed, scalable and efficient framework for traffic management applications. The proposed cloud computing based framework can answer technical challenges for efficient and real-time storage, management, processing and analysis of traffic big data. For evaluation of the framework, we have used OpenStreetMap (OSM) real trajectories and road network on a distributed environment. Our evaluation results indicate that the speed of data importing to this framework exceeds 8000 records per second when the size of the datasets is close to 5 million. We also evaluate the performance of data retrieval in our proposed framework. The data retrieval speed exceeds 15000 records per second when the size of the datasets is close to 5 million. We have also evaluated the scalability and performance of our proposed framework using parallelisation of a critical pre-analysis in transportation applications. The results show that the proposed framework achieves considerable performance and efficiency in traffic management applications.
36

Varvoutas, Konstantinos, Georgia Kougka, and Anastasios Gounaris. "Systematic exploitation of parallel task execution in business processes." Computer Science and Information Systems, no. 00 (2023): 57. http://dx.doi.org/10.2298/csis230401057v.

Abstract:
Business process re-engineering (or optimization) has been attracting a lot of interest, and it is considered a core element of business process management (BPM). One of its most effective mechanisms is task re-sequencing with a view to decreasing process duration and costs, whereas duration (aka cycle time) can be reduced using task parallelism as well. In this work, we propose a novel combination of these two mechanisms, which is resource allocation-aware. Starting from a solution where a given resource allocation in business processes can drive optimizations in an underlying BPMN diagram, our proposal considers resource allocation and model modifications in a combined manner, where an initially suboptimal resource allocation can lead to better overall process executions. More specifically, the main contribution is twofold: (i) to present a proposal that leverages a variant of representation of processes as Refined Process Structure Trees (RPSTs) with a view to enabling novel resource allocation-driven task re-ordering and parallelisation in a principled manner, and (ii) to introduce a resource allocation paradigm that assigns tasks to resources taking into account the re-sequencing opportunities that can arise. The results show that we can yield improvements in a very high proportion of our experimental cases, while these improvements can reach a 45% decrease in cycle time.
37

Hong, Chengyu, Xuben Wang, Gaishan Zhao, Zhao Xue, Fei Deng, Qinping Gu, Zhixiang Song, et al. "Discontinuous finite element method for efficient three-dimensional elastic wave simulation." Journal of Geophysics and Engineering 18, no. 1 (February 2021): 98–112. http://dx.doi.org/10.1093/jge/gxaa070.

Abstract:
The existing discontinuous Galerkin (DG) finite element method (FEM) for the numerical simulation of elastic wave propagation is primarily implemented in two dimensions. Here, a discontinuous FEM (DFEM) for efficient three-dimensional (3D) elastic wave simulation is presented. First, the velocity–stress equations of 3D elastic waves in isotropic media are transformed into first-order coefficient-changed partial differential equations. A DG discretisation method for wave field values on a unit boundary is then defined using the local Lax–Friedrichs flux format. The equations are first transformed into equivalent integral equations, and subsequently into a spatial semi-discrete ordinary differential equation system using a hierarchical orthogonal basis function. The DFEM is extended to an arbitrary high-order accuracy in the time domain using the exponential integrator technique and the explicit optimal strong-stability-preserving Runge–Kutta method. Finally, an efficient method for selecting the calculation area of the geometry of the current shot record is realised. For the computation, a multi-node parallelism with improved resource utilisation and parallelisation efficiency is implemented. The numerical results show that the proposed method can improve both the accuracy of the simulation and the efficiency of the calculation compared with existing methods.
38

Leonards, U., R. Rettenbach, and R. Sireteanu. "The Dynamics of Perceptual Learning in Different Visual Search Tasks: Psychophysics and Psychophysiology." Perception 26, no. 1_suppl (August 1997): 232. http://dx.doi.org/10.1068/v970149.

Abstract:
Serial visual search can become parallel with practice (Sireteanu and Rettenbach, 1995 Vision Research 35 2037–2043). Our purpose was to examine whether psychophysiological indices reflect the changes in reaction time during training. We used targets and distractors that differed either in orientation (‘tilt’), or in local brightness: closed circles with or without an additional line element (‘added line’), or circles with gaps of different width (‘gap’). The subjects’ task was to indicate the presence or absence of a target on a computer screen by immediately pressing a button and pointing to the location of the target if the trial was positive, or raise the hand if negative. No feedback was given. Response time and error rate were recorded. In addition, electrocardiograms, galvanic skin response, respiration rate and amplitude, horizontal eye movements, and electromyograms were monitored. Two naive and two experienced subjects participated in at least 16 experimental sessions. Before training, slopes differed for the three tasks, ranging from parallel search for the feature ‘tilt’ to a very steep serial search for the feature ‘gap’. These differences were reflected in the psychophysiological parameters. Reaction time and error rate decreased continuously with learning, leading to parallel search after prolonged practice for all three tasks (see Nase et al, 1995 Perception 24 Supplement, 84). Preliminary results indicate that the psychophysiological measures do not follow the perceptual changes during learning. We conclude that, despite the perceptual parallelisation with practice, the attentional load remains high for initially serial tasks.
39

Van Loo, Maarten, and Gert Verstraeten. "A Spatially Explicit Crop Yield Model to Simulate Agricultural Productivity for Past Societies under Changing Environmental Conditions." Water 13, no. 15 (July 24, 2021): 2023. http://dx.doi.org/10.3390/w13152023.

Abstract:
Most contemporary crop yield models focus on a small time window, operate on a plot location, or do not include the effects of the changing environment, which makes it difficult to use these models to assess the agricultural sustainability for past societies. In this study, adaptions were made to the agronomic AquaCrop model. This adapted model was run to cover the last 4000 years to simulate the impact of climate and land cover changes, as well as soil dynamics, on the productivity of winter wheat crops for a Mediterranean mountain environment in SW Turkey. AquaCrop has been made spatially explicit, which allows hydrological interactions between different landscape positions, whilst computational time is kept limited by implementing parallelisation schemes on a supercomputer. The adapted model was calibrated and validated using crop and soil information sampled during the 2015 and 2016 harvest periods. Simulated crop yields for the last 4000 years show the strong control of precipitation, while changes in soil thickness following erosion, and to a lesser extent re-infiltration of runoff along a slope catena, also have a significant impact on crop yield. The latter is especially important in the valleys, where soil and water accumulate. The model results also show that water export to the central valley strongly increased (up to four times) following deforestation and the resulting soil erosion on the hillslopes, turning it into a marsh and rendering it unsuitable for crop cultivation.
40

Strakos, Petr, Milan Jaros, Lubomir Riha, and Tomas Kozubek. "Speed Up of Volumetric Non-Local Transform-Domain Filter Utilising HPC Architecture." Journal of Imaging 9, no. 11 (November 20, 2023): 254. http://dx.doi.org/10.3390/jimaging9110254.

Abstract:
This paper presents a parallel implementation of a non-local transform-domain filter (BM4D). The effectiveness of the parallel implementation is demonstrated by denoising image series from computed tomography (CT) and magnetic resonance imaging (MRI). The basic idea of the filter is based on grouping and filtering similar data within the image. Due to the high level of similarity and data redundancy, the filter can provide even better denoising quality than current extensively used approaches based on deep learning (DL). In BM4D, cubes of voxels named patches are the essential image elements for filtering. Using voxels instead of pixels means that the area for searching similar patches is large. Because of this and the application of multi-dimensional transformations, the computation time of the filter is exceptionally long. The original implementation of BM4D is only single-threaded. We provide a parallel version of the filter that supports multi-core and many-core processors and scales on such versatile hardware resources, typical for high-performance computing clusters, even if they are concurrently used for the task. Our algorithm uses hybrid parallelisation that combines open multi-processing (OpenMP) and message passing interface (MPI) technologies and provides up to 283× speedup, which is a 99.65% reduction in processing time compared to the sequential version of the algorithm. In denoising quality, the method performs considerably better than recent DL methods on the data type that these methods have yet to be trained on.
41

Allen, C. B., A. G. Sunderland, and R. Johnstone. "Application of a parallel rotor CFD code on HPCx." Aeronautical Journal 111, no. 1117 (March 2007): 145–52. http://dx.doi.org/10.1017/s0001924000004401.

Abstract:
Aspects of parallel simulation of rotor flows are considered. These flows can be extremely expensive for a compressible finite-volume CFD code, and parallelisation can be essential. The award of HPCx time through the UK Applied Aerodynamics Consortium has allowed large rotor simulations to be performed and wake grid dependence to be investigated. However, there are several issues that need to be investigated when considering very large simulations, including the grid generation process, the parallel flow-solver, including an effective mesh motion approach, and visualisation options. Details of these are presented here, with particular emphasis on the flow-solver parallel performance. A detailed performance analysis of the unsteady flow-solver has been undertaken and the code optimised to improve parallel performance, and details of the parallel scaling performance are presented. The parallel scaling of the code is very good on all the HPC architectures tested here, and this has been recognised by an HPCx Gold Star Capability Incentive award. Results of simulation of a four-bladed lifting rotor in forward flight are also presented, for two mesh densities. It is shown that the solution computed on the serial limit on mesh size, around four million cells, exhibits excessive diffusion, and is of limited use in terms of detailed flow features. The results on a very fine mesh, 32 million cells, have shown a much better solution resolution, and it is also demonstrated that the λ2 vortex core visualisation option is extremely useful.
42

Yang, Feng Wei, Chandrasekhar Venkataraman, Vanessa Styles, and Anotida Madzvamuse. "A Robust and Efficient Adaptive Multigrid Solver for the Optimal Control of Phase Field Formulations of Geometric Evolution Laws." Communications in Computational Physics 21, no. 1 (December 5, 2016): 65–92. http://dx.doi.org/10.4208/cicp.240715.080716a.

Abstract:
We propose and investigate a novel solution strategy to efficiently and accurately compute approximate solutions to semilinear optimal control problems, focusing on the optimal control of phase field formulations of geometric evolution laws. The optimal control of geometric evolution laws arises in a number of applications in fields including material science, image processing, tumour growth and cell motility. Despite this, many open problems remain in the analysis and approximation of such problems. In the current work we focus on a phase field formulation of the optimal control problem, hence exploiting the well developed mathematical theory for the optimal control of semilinear parabolic partial differential equations. Approximation of the resulting optimal control problem is computationally challenging, requiring massive amounts of computational time and memory storage. The main focus of this work is to propose, derive, implement and test an efficient solution method for such problems. The solver for the discretised partial differential equations is based upon a geometric multigrid method incorporating advanced techniques to deal with the nonlinearities in the problem and utilising adaptive mesh refinement. An in-house two-grid solution strategy for the forward and adjoint problems, that significantly reduces memory requirements and CPU time, is proposed and investigated computationally. Furthermore, parallelisation as well as an adaptive-step gradient update for the control are employed to further improve efficiency. Along with a detailed description of our proposed solution method together with its implementation we present a number of computational results that demonstrate and evaluate our algorithms with respect to accuracy and efficiency. A highlight of the present work is simulation results on the optimal control of phase field formulations of geometric evolution laws in 3-D which would be computationally infeasible without the solution strategies proposed in the present work.
43

Musa, Zainab I., Sahalu Balarabe Junaidu, Baroon Ismaeel Ahmad, A. F. Donfack Kana, and Adamu Abubakar Ibrahim. "An Enhanced Predictive Analytics Model for Tax-Based Operations." International Journal on Perceptive and Cognitive Computing 9, no. 1 (January 28, 2023): 44–49. http://dx.doi.org/10.31436/ijpcc.v9i1.343.

Abstract:
In order to meet their basic responsibilities of governance, such as the provision of infrastructure, governments the world over require a significant amount of funds. Consequently, citizens and businesses are required to pay certain legislated amounts as taxes and royalties. However, tax compliance and optimal revenue generation remain a major source of concern. Measures such as penalties and, more recently, Data and Predictive Analytics have been devised to curb these issues. Such effective Analytics measures are absent in Bauchi State and Nigeria as a whole. Previous studies in Nigeria have done much in the area of tax compliance but have not implemented Data Analytics solutions to unearth the relationships which this study will cover. A Combined Sequential Minimal Optimisation (CSMO) model has been developed to analyse the correlation of tax-payers, classification and predictive traits, which uncovers trends on which to base overall decisions for the ultimate goal of revenue generation. Experimental validation demonstrates the advantages of CSMO in terms of classification, training time and prediction accuracy in comparison to Sequential Minimal Optimisation (SMO) and Parallel Sequential Minimal Optimisation (PSMO). CSMO recorded a Kappa Statistics measure of 0.916, which is 8% more than SMO and 7.8% more than PSMO; it correctly classified 99.74% of instances, compared to 98.28% for SMO and 98.35% for PSMO. Incorrectly classified instances for CSMO recorded a value of 0.25%, which is better than 1.72% for SMO and 1.68% for PSMO. A training time of 223ms was recorded, compared to 378ms for SMO and 286ms for PSMO. A better value of 0.9981 for CSMO was achieved in the ROC Curve plot against 0.944 for SMO and 0.913 for PSMO. CSMO takes advantage of powerful Analytics techniques such as prediction and parallelisation in function-based classifiers to discover relationships that were initially non-existent.
44

Räss, Ludovic, Ivan Utkin, Thibault Duretz, Samuel Omlin, and Yuri Y. Podladchikov. "Assessing the robustness and scalability of the accelerated pseudo-transient method." Geoscientific Model Development 15, no. 14 (July 25, 2022): 5757–86. http://dx.doi.org/10.5194/gmd-15-5757-2022.

Abstract:
The development of highly efficient, robust and scalable numerical algorithms lags behind the rapid increase in massive parallelism of modern hardware. We address this challenge with the accelerated pseudo-transient (PT) iterative method and present a physically motivated derivation. We analytically determine optimal iteration parameters for a variety of basic physical processes and confirm the validity of theoretical predictions with numerical experiments. We provide an efficient numerical implementation of PT solvers on graphical processing units (GPUs) using the Julia language. We achieve a parallel efficiency of more than 96 % on 2197 GPUs in distributed-memory parallelisation weak-scaling benchmarks. The 2197 GPUs allow for unprecedented tera-scale solutions of 3D variable viscosity Stokes flow on 4995^3 grid cells involving over 1.2 trillion degrees of freedom (DoFs). We verify the robustness of the method by handling contrasts up to 9 orders of magnitude in material parameters such as viscosity and arbitrary distribution of viscous inclusions for different flow configurations. Moreover, we show that this method is well suited to tackle strongly nonlinear problems such as shear-banding in a visco-elasto-plastic medium. A GPU-based implementation can outperform direct-iterative solvers based on central processing units (CPUs) in terms of wall time, even at relatively low spatial resolution. We additionally motivate the accessibility of the method by its conciseness, flexibility, physically motivated derivation and ease of implementation. This solution strategy thus has a great potential for future high-performance computing (HPC) applications, and for paving the road to exascale in the geosciences and beyond.
45

Schannwell, Clemens, Reinhard Drews, Todd A. Ehlers, Olaf Eisen, Christoph Mayer, Mika Malinen, Emma C. Smith, and Hannes Eisermann. "Quantifying the effect of ocean bed properties on ice sheet geometry over 40 000 years with a full-Stokes model." Cryosphere 14, no. 11 (November 11, 2020): 3917–34. http://dx.doi.org/10.5194/tc-14-3917-2020.

Abstract:
Simulations of ice sheet evolution over glacial cycles require integration of observational constraints using ensemble studies with fast ice sheet models. These include physical parameterisations with uncertainties, for example, relating to grounding-line migration. More complete ice dynamic models are slow and have thus far only been applied for < 1000 years, leaving many model parameters unconstrained. Here we apply a 3D thermomechanically coupled full-Stokes ice sheet model to the Ekström Ice Shelf embayment, East Antarctica, over a full glacial cycle (40 000 years). We test the model response to differing ocean bed properties that provide an envelope of potential ocean substrates seawards of today's grounding line. The end-member scenarios include a hard, high-friction ocean bed and a soft, low-friction ocean bed. We find that predicted ice volumes differ by > 50 % under almost equal forcing. Grounding-line positions differ by up to 49 km, show significant hysteresis, and migrate non-steadily in both scenarios with long quiescent phases disrupted by leaps of rapid migration. The simulations quantify the evolution of two different ice sheet geometries (namely thick and slow vs. thin and fast), triggered by the variable grounding-line migration over the differing ocean beds. Our study extends the timescales of 3D full-Stokes by an order of magnitude compared to previous studies with the help of parallelisation. The extended time frame for full-Stokes models is a first step towards better understanding other processes such as erosion and sediment redistribution in the ice shelf cavity impacting the entire catchment geometry.
APA, Harvard, Vancouver, ISO, and other styles
46

Nader, François, Patrick Pizette, Nicolin Govender, Daniel N. Wilke, and Jean-François Ferellec. "Modelling realistic ballast shape to study the lateral pull behaviour using GPU computing." EPJ Web of Conferences 249 (2021): 06003. http://dx.doi.org/10.1051/epjconf/202124906003.

Abstract:
The use of the Discrete Element Method to model engineering structures implementing granular materials has proven to be an efficient way to study their response under various conditions. However, the computational cost of the simulations increases rapidly as the number of particles and the complexity of particle shapes increase. An affordable solution to render problems computationally tractable is to use graphical processing units (GPU) for computing. Modern GPUs offer up to 10496 compute cores, which allows for greater parallelisation relative to the 32 cores offered by high-end Central Processing Units (CPU). This study outlines the application of BlazeDEM-GPU, using an RTX 2080Ti GPU (4352 cores), to investigate the influence of the modelling of particle shape on the lateral pull behaviour of granular ballast systems used in railway applications. The idea is to validate the model and show the benefits of simulating non-spherical shapes in future large-scale tests. The algorithm created to generate the shape of the ballast from real grain scans, using polyhedral shape approximations of varying degrees of complexity, is shown. The particle size is modelled to scale. A preliminary investigation of the effect of the grain shape is conducted, where a sleeper lateral pull test is carried out on a sample of spherical grains and a sample of cubic grains. Preliminary results show that elementary polyhedral shape representations (cubic) recreate some of the characteristic responses in the lateral pull test, such as stick/slip phenomena and force chain distributions, which looks promising for future work on railway simulations. These responses cannot be recreated with simple spherical grains unless heuristics are added, which require additional calibration and approximations. The significant reduction in time when using non-spherical grains also implies that larger granular systems can be investigated.
47

Brown, John C. "High speed feature unification and parsing." Natural Language Engineering 1, no. 4 (December 1995): 309–38. http://dx.doi.org/10.1017/s1351324900000243.

Abstract:
Feature unification in parsing has previously used either inefficient Prolog programs, or LISP programs implementing early pre-WAM Prolog models of unification involving searches of binding lists, and the copying of rules to generate edges: features within rules and edges have traditionally been expressed as lists or functions, with clarity being preferred to speed of processing. As a result, parsing takes about 0.5 seconds for a 7-word sentence. Our earlier work produced an optimised chart parser for a non-unification context-free grammar that achieved 5 ms parses, with high-ambiguity sentences involving hundreds of edges, using the grammar and sentences from Tomita's work on shift-reduce parsing with multiple stack branches. A parallel logic card design resulted that would speed this by a further factor of at least 17. The current paper extends this parser to treat a much more complex unification grammar with structures, using extensive indexing of rules and edges and the optimisations of top-down filtering and look-ahead, to demonstrate where unification occurs during parsing. Unification in parsing is distinguished from that in Prolog, and four alternative schemes for storing features and performing unification are considered, including the traditional binding-list method and three other methods optimised for speed for which overall unification times are calculated. Parallelisation of unification using cheap logic hardware is considered, and estimates show that unification will negligibly increase the parse time of our parallel parser card. Preliminary results are reported from a prototype serial parser that uses the fourth most efficient unification method, and achieves 7 ms for 7-word sentences, and under 1 s for a 36-word 360-way ambiguous sentence with 10,000 edges, on a conventional workstation.
48

Bergami, Giacomo, Samuel Appleby, and Graham Morgan. "Quickening Data-Aware Conformance Checking through Temporal Algebras." Information 14, no. 3 (March 8, 2023): 173. http://dx.doi.org/10.3390/info14030173.

Abstract:
A temporal model describes processes as a sequence of observable events characterised by distinguishable actions in time. Conformance checking allows these models to determine whether any sequence of temporally ordered and fully-observable events complies with their prescriptions. The latter aspect leads to Explainable and Trustworthy AI, as we can immediately assess the flaws in the recorded behaviours while suggesting any possible way to amend the wrongdoings. Recent findings on conformance checking and temporal learning lead to an interest in temporal models beyond the usual business process management community, thus including other domain areas such as Cyber Security, Industry 4.0, and e-Health. As current technologies for accessing this are purely formal and not ready for the real world returning large data volumes, the need to improve existing conformance checking and temporal model mining algorithms to make Explainable and Trustworthy AI more efficient and competitive is increasingly pressing. To effectively meet such demands, this paper offers KnoBAB, a novel business process management system for efficient Conformance Checking computations performed on top of a customised relational model. This architecture was implemented from scratch after following common practices in the design of relational database management systems. After defining our proposed temporal algebra for temporal queries (xtLTLf), we show that this can express existing temporal languages over finite and non-empty traces such as LTLf. This paper also proposes a parallelisation strategy for such queries, thus reducing conformance checking into an embarrassingly parallel problem leading to super-linear speed up. This paper also presents how a single xtLTLf operator (or even entire sub-expressions) might be efficiently implemented via different algorithms, thus paving the way to future algorithmic improvements. Finally, our benchmarks highlight that our proposed implementation of xtLTLf (KnoBAB) outperforms state-of-the-art conformance checking software running on LTLf logic.
49

Poulhaon, Fabien, Francisco Chinesta, and Adrien Leygue. "A first step toward a PGD-based time parallelisation strategy." European Journal of Computational Mechanics, June 6, 2012. http://dx.doi.org/10.13052/17797179.2012.714985.

Abstract:
This paper proposes a new method for solving the heat transfer equation based on a parallelisation in time of the computation. A parametric multidimensional model is solved within the context of the Proper Generalised Decomposition (PGD). The initial temperature field and the boundary conditions of the problem are treated as extra coordinates, similar to time and space. Two main approaches are presented: a “full” parallelisation based on an off-line parallel computation and a “partial” parallelisation based on a decomposition of the original problem. Thanks to an optimised overlapping strategy, the reattachment of the local solutions at the interfaces of the time subdomains can be improved. For large problems, the parallel execution of the algorithm provides an interesting speedup and opens new perspectives regarding real-time simulation.
50

Drousiotis, Efthyvoulos, and Paul Spirakis. "Single MCMC chain parallelisation on decision trees." Annals of Mathematics and Artificial Intelligence, July 2, 2023. http://dx.doi.org/10.1007/s10472-023-09876-9.

Abstract:
Decision trees (DT) are widely used in machine learning and often achieve state-of-the-art performance. Despite that, well-known variants like CART, ID3, random forest, and boosted trees lack a probabilistic version that encodes prior assumptions about tree structures and shares statistical strength between node parameters. Existing work on Bayesian DT depends on Markov Chain Monte Carlo (MCMC), which can be computationally slow, especially on high dimensional data and expensive proposals. In this study, we propose a method to parallelise a single MCMC DT chain on an average laptop or personal computer that enables us to reduce its run-time through multi-core processing while the results are statistically identical to conventional sequential implementation. We also calculate the theoretical and practical reduction in run time, which can be obtained utilising our method on multi-processor architectures. Experiments showed that we could achieve 18 times faster running time provided that the serial and the parallel implementation are statistically identical.