Academic literature on the topic 'Toolchain provenance'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Toolchain provenance.'

Journal articles on the topic "Toolchain provenance":

1

Villa-Uriol, M. C., G. Berti, D. R. Hose, A. Marzo, A. Chiarini, J. Penrose, J. Pozo, et al. "@neurIST complex information processing toolchain for the integrated management of cerebral aneurysms." Interface Focus 1, no. 3 (April 6, 2011): 308–19. http://dx.doi.org/10.1098/rsfs.2010.0033.

Abstract:
Cerebral aneurysms are a multi-factorial disease with severe consequences. A core part of the European project @neurIST was the physical characterization of aneurysms to find candidate risk factors associated with aneurysm rupture. The project investigated measures based on morphological, haemodynamic and aneurysm wall structure analyses for more than 300 cases of ruptured and unruptured aneurysms, extracting descriptors suitable for statistical studies. This paper deals with the unique challenges associated with this task, and the implemented solutions. The consistency of results required by the subsequent statistical analyses, given the heterogeneous image data sources and multiple human operators, was met by a highly automated toolchain combined with training. A testimonial of the successful automation is the positive evaluation of the toolchain by over 260 clinicians during various hands-on workshops. The specification of the analyses required thorough investigations of modelling and processing choices, discussed in a detailed analysis protocol. Finally, an abstract data model governing the management of the simulation-related data provides a framework for data provenance and supports future use of data and toolchain. This is achieved by enabling the easy modification of the modelling approaches and solution details through abstract problem descriptions, removing the need to repeat manual processing work.
2

Abuhamad, Mohammed, Tamer Abuhmed, David Mohaisen, and Daehun Nyang. "Large-scale and Robust Code Authorship Identification with Deep Feature Learning." ACM Transactions on Privacy and Security 24, no. 4 (November 30, 2021): 1–35. http://dx.doi.org/10.1145/3461666.

Abstract:
Successful software authorship de-anonymization has both software forensics applications and privacy implications. However, the process requires an efficient extraction of authorship attributes. The extraction of such attributes is very challenging, due to various software code formats, from executable binaries with different toolchain provenance to source code in different programming languages. Moreover, the quality of attributes is bounded by the availability of software samples to a certain number of samples per author and a specific size for software samples. To this end, this work proposes a deep learning-based approach for software authorship attribution that facilitates large-scale, format-independent, language-oblivious, and obfuscation-resilient software authorship identification. The proposed approach incorporates the process of learning deep authorship attribution using a recurrent neural network, and an ensemble random forest classifier for scalability to de-anonymize programmers. Comprehensive experiments are conducted to evaluate the proposed approach over the entire Google Code Jam (GCJ) dataset across all years (from 2008 to 2016) and over real-world code samples from 1,987 public repositories on GitHub. The results of our work show high accuracy despite requiring a smaller number of samples per author. Experimenting with source code, our approach allows us to identify 8,903 GCJ authors, the largest-scale dataset used by far, with an accuracy of 92.3%. Using the real-world dataset, we achieved an identification accuracy of 94.38% for 745 C programmers on GitHub. Moreover, the proposed approach is resilient to language specifics, and thus it can identify authors of four programming languages (C, C++, Java, and Python), and authors writing in mixed languages (e.g., Java/C++, Python/C++). Finally, our system is resistant to sophisticated obfuscation (e.g., using C Tigress) with an accuracy of 93.42% for a set of 120 authors.
Experimenting with executable binaries, our approach achieves 95.74% for identifying 1,500 programmers of software binaries. Similar results were obtained when software binaries are generated with different compilation options, optimization levels, and removal of symbol information. Moreover, our approach achieves 93.86% for identifying 1,500 programmers of obfuscated binaries using all features adopted in the Obfuscator-LLVM tool.
3

Jang, Hohyeon, Nozima Murodova, and Hyungjoon Koo. "ToolPhet: Inference of Compiler Provenance from Stripped Binaries with Emerging Compilation Toolchains." IEEE Access, 2024, 1. http://dx.doi.org/10.1109/access.2024.3355098.

Dissertations / Theses on the topic "Toolchain provenance":

1

Benoit, Tristan. "Cartographie des programmes et de leurs interrelations." Electronic Thesis or Diss., Université de Lorraine, 2023. http://www.theses.fr/2023LORR0320.

Abstract:
In the field of software engineering, ensuring the quality and security of software is complex. This is due to a set of factors, notably the increasing use of libraries and practices such as copying code from online services. A common response to this problem is to apply formal methods to validate programs before their release. However, this approach requires a precise understanding of what is to be verified and a high degree of expertise. This thesis introduces new reverse engineering methods to automatically collect information about a program's toolchain provenance and to identify program clones within large datasets. Our first contribution is the new neural network model Site Neural Network (SNN), which predicts the compilation toolchain used to produce an entire program. SNN offers excellent speed as well as good accuracy. Its modularity, due to the use of hierarchies of classifiers, makes it easy to add further toolchains. Our second contribution is Program Spectral Similarity (PSS), a tool that provides a quick and efficient way to detect program clones, even when their target hardware architecture differs or in the case of obfuscation. Unlike methods based on binary functions or on graph edit distance, which are time-consuming and not very robust, PSS relies on the spectral analysis of graphs to measure the similarity between programs. This thesis thus contributes to cyber security by providing tools to identify malware clones quickly. In addition, it supports computer forensics by providing relevant information on the compilation toolchain. This work paves the way for new neural networks specialized for programs, as well as the development of spectral analysis methods for studying binary code similarity.

Conference papers on the topic "Toolchain provenance":

1

Rosenblum, Nathan, Barton P. Miller, and Xiaojin Zhu. "Recovering the toolchain provenance of binary code." In the 2011 International Symposium. New York, New York, USA: ACM Press, 2011. http://dx.doi.org/10.1145/2001420.2001433.

2

Benoit, Tristan, Jean-Yves Marion, and Sebastien Bardin. "Binary level toolchain provenance identification with graph neural networks." In 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 2021. http://dx.doi.org/10.1109/saner50967.2021.00021.
