Academic literature on the topic 'Nested Dataset'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Nested Dataset.'

Journal articles on the topic "Nested Dataset"

1

Dinh, Thi Lan Anh, and Filipe Aires. "Nested leave-two-out cross-validation for the optimal crop yield model selection." Geoscientific Model Development 15, no. 9 (May 5, 2022): 3519–35. http://dx.doi.org/10.5194/gmd-15-3519-2022.

Abstract:
The use of statistical models to study the impact of weather on crop yield continues to grow. Unfortunately, this type of application is characterized by datasets with a very limited number of samples (typically one sample per year). In general, statistical inference uses three datasets: the training dataset to optimize the model parameters, the validation dataset to select the best model, and the testing dataset to evaluate the model's generalization ability. Splitting the overall database into three datasets is often impossible in crop yield modelling because of the limited number of samples. The leave-one-out cross-validation method, or simply leave one out (LOO), is often used to assess model performance or to select among competing models when the sample size is small. However, the model choice is typically made using only the testing dataset, which can be misleading because it favours unnecessarily complex models. The nested cross-validation approach was introduced in machine learning to avoid this problem by truly utilizing three datasets even with limited databases. In this study, we propose one particular implementation of nested cross-validation, called the nested leave-two-out cross-validation method or simply leave two out (LTO), to choose the best model with an optimal model selection (using the validation dataset) and to estimate the true model quality (using the testing dataset). Two applications are considered: robusta coffee in Cu M'gar (Dak Lak, Vietnam) and grain maize over 96 French departments. In both cases, LOO is misleading because it chooses models that are too complex; LTO indicates that simpler models actually perform better when a reliable generalization test is considered. The simple models obtained using the LTO approach have improved yield anomaly forecasting skills in both study crops. This LTO approach can also be used in seasonal forecasting applications. We suggest that the LTO method should become a standard procedure for statistical crop modelling.
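
The nested selection described in this abstract can be sketched in a few lines. The following is a minimal illustration in Python, not the authors' implementation: the two candidate models (`fit_mean`, `fit_line`), the squared-error score, the refit on all non-test samples, and the function name `nested_leave_two_out` are all assumptions made for the example.

```python
from math import sqrt


def fit_mean(xs, ys):
    """Hypothetical candidate 1: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Hypothetical candidate 2: least-squares line in one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def nested_leave_two_out(xs, ys, candidates):
    """Return (test RMSE, model chosen for each held-out test sample)."""
    n = len(xs)
    chosen, sq_errors = [], []
    for t in range(n):  # outer loop: sample t is the test sample
        val_error = {name: 0.0 for name in candidates}
        for v in range(n):  # inner loop: sample v is the validation sample
            if v == t:
                continue
            train = [i for i in range(n) if i != t and i != v]
            tx = [xs[i] for i in train]
            ty = [ys[i] for i in train]
            for name, fit in candidates.items():
                val_error[name] += (fit(tx, ty)(xs[v]) - ys[v]) ** 2
        # Model selection uses validation errors only; sample t is never seen.
        best = min(val_error, key=val_error.get)
        rest = [i for i in range(n) if i != t]
        model = candidates[best]([xs[i] for i in rest], [ys[i] for i in rest])
        sq_errors.append((model(xs[t]) - ys[t]) ** 2)
        chosen.append(best)
    return sqrt(sum(sq_errors) / n), chosen


# Toy data (assumed for illustration): an exactly linear relationship.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [2 * x + 1 for x in xs]
rmse, chosen = nested_leave_two_out(xs, ys, {"mean": fit_mean, "line": fit_line})
```

The point of the design is that the sample used to score the final model plays no part in choosing it, which is what separates this scheme from selecting a model directly on LOO test error.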
2

Sheikhaei, Mohammad Sadegh, Hasan Zafari, and Yuan Tian. "Joined Type Length Encoding for Nested Named Entity Recognition." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 3 (May 31, 2022): 1–23. http://dx.doi.org/10.1145/3487057.

Abstract:
In this article, we propose a new encoding scheme for named entity recognition (NER) called Joined Type-Length encoding (JoinedTL). Unlike most existing named entity encoding schemes, which focus on flat entities, JoinedTL can label nested named entities in a single sequence. JoinedTL uses a packed encoding to represent both the type and the span of a named entity, which not only results in fewer tagged tokens than existing encoding schemes but also enables it to support nested NER. We evaluate the effectiveness of JoinedTL for nested NER on three nested NER datasets: GENIA in English, GermEval in German, and PerNest, our newly created nested NER dataset in Persian. We apply CharLSTM+WordLSTM+CRF, a three-layer sequence tagging model, to the three datasets encoded using JoinedTL and two existing nested NE encoding schemes, JoinedBIO and JoinedBILOU. Our experimental results show that CharLSTM+WordLSTM+CRF trained on JoinedTL-encoded datasets achieves F1 scores competitive with those of models trained on datasets encoded by the two other schemes, but with 27%–48% fewer tagged tokens. To leverage the power of the three encodings (JoinedTL, JoinedBIO, and JoinedBILOU), we propose an encoding-based ensemble method for nested NER. Evaluation results show that the ensemble method achieves higher F1 scores on all datasets than the three models each trained using one of the three encodings. By using nested NE encodings including JoinedTL with CharLSTM+WordLSTM+CRF, we establish new state-of-the-art performance with an F1 score of 83.7 on PerNest, 74.9 on GENIA, and 70.5 on GermEval, surpassing two recent neural models specially designed for nested NER.
3

Li, Zan, Hong Zhang, Zhengzhen Li, and Zuyue Ren. "Residual-Attention UNet++: A Nested Residual-Attention U-Net for Medical Image Segmentation." Applied Sciences 12, no. 14 (July 15, 2022): 7149. http://dx.doi.org/10.3390/app12147149.

Abstract:
Image segmentation is a basic technology in the field of image processing and computer vision. Medical image segmentation is an important application field of image segmentation and plays an increasingly important role in clinical diagnosis and treatment. Deep learning has made great progress in medical image segmentation. In this paper, we propose Residual-Attention UNet++, an extension of the UNet++ model with a residual unit and an attention mechanism. First, the residual unit mitigates the degradation problem. Second, the attention mechanism can increase the weight of the target area and suppress the background area irrelevant to the segmentation task. Three medical image datasets (skin cancer, cell nuclei, and coronary artery in angiography) were used to validate the proposed model. The results showed that Residual-Attention UNet++ achieved superior evaluation scores: an Intersection over Union (IoU) of 82.32% and a Dice coefficient of 88.59% on the skin cancer dataset; a Dice coefficient of 85.91% and an IoU of 87.74% on the cell nuclei dataset; and a Dice coefficient of 72.48% and an IoU of 66.57% on the angiography dataset.
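
The two scores reported in this abstract, IoU and the Dice coefficient, are both overlap measures between a predicted mask and a reference mask. The snippet below is a small sketch of how they are commonly computed for one pair of binary masks. The flat 0/1 list representation, the function name `iou_and_dice`, and the convention of returning 1.0 for two empty masks are assumptions for the example, and the paper's own evaluation code may aggregate differently.

```python
def iou_and_dice(pred, target):
    """Compute IoU and Dice for two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_size = sum(1 for p in pred if p)
    target_size = sum(1 for t in target if t)
    union = pred_size + target_size - inter
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0, 1.0
    iou = inter / union
    dice = 2 * inter / (pred_size + target_size)
    return iou, dice


# Toy masks (assumed for illustration): 3 predicted pixels, 3 true pixels, 2 shared.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
iou, dice = iou_and_dice(pred, target)
```

For a single pair of masks the two measures are tied by Dice = 2·IoU / (1 + IoU), so here IoU is 0.5 and Dice is 2/3.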
4

Zhang, Jilong, Yajuan Zhang, Hongyang Zhang, Quan Zhang, Weihua Su, Shijie Guo, and Yuanquan Wang. "Segmentation of biventricle in cardiac cine MRI via nested capsule dense network." PeerJ Computer Science 8 (November 30, 2022): e1146. http://dx.doi.org/10.7717/peerj-cs.1146.

Abstract:
Background: Cardiac magnetic resonance imaging (MRI) has been widely used in the diagnosis of cardiovascular diseases because of its noninvasive nature and high image quality. The evaluation standard of physiological indexes in cardiac diagnosis is essentially the accuracy of segmentation of the left ventricle (LV) and right ventricle (RV) in cardiac MRI. Traditional symmetric single-codec network structures such as U-Net tend to expand the number of channels to make up for lost information, which makes the network cumbersome. Methods: Instead of a single codec, we propose a multiple-codec structure based on the FC-DenseNet (FCD) model and capsule convolution-capsule deconvolution, named Nested Capsule Dense Network (NCDN). NCDN uses multiple codecs to achieve multi-resolution, which makes it possible to preserve more spatial information and improve the robustness of the model. Results: The proposed model is tested on three datasets: the York University Cardiac MRI dataset, the Automated Cardiac Diagnosis Challenge (ACDC-2017) dataset, and a local dataset. The results show that the proposed NCDN outperforms most methods. In particular, we achieved near state-of-the-art accuracy in the ACDC-2017 segmentation challenge. This indicates that our method is a reliable segmentation method, which is conducive to the application of deep learning-based segmentation methods in the field of medical image segmentation.
5

Fu, Yao, Chuanqi Tan, Mosha Chen, Songfang Huang, and Fei Huang. "Nested Named Entity Recognition with Partially-Observed TreeCRFs." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 14 (May 18, 2021): 12839–47. http://dx.doi.org/10.1609/aaai.v35i14.17519.

Abstract:
Named entity recognition (NER) is a well-studied task in natural language processing. However, the widely used sequence labeling framework has difficulty detecting entities with nested structures. In this work, we view nested NER as constituency parsing with partially-observed trees and model it with partially-observed TreeCRFs. Specifically, we view all labeled entity spans as observed nodes in a constituency tree, and other spans as latent nodes. With the TreeCRF we achieve a uniform way to jointly model the observed and the latent nodes. To compute the probability of partial trees with partial marginalization, we propose a variant of the Inside algorithm, the Masked Inside algorithm, that supports different inference operations for different nodes (evaluation for the observed, marginalization for the latent, and rejection for nodes incompatible with the observed) with an efficient parallelized implementation, thus significantly speeding up training and inference. Experiments show that our approach achieves state-of-the-art (SOTA) F1 scores on the ACE2004 and ACE2005 datasets and shows performance comparable to SOTA models on the GENIA dataset. We release the code at https://github.com/FranxYao/Partially-Observed-TreeCRFs.
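
The three node categories named in this abstract (observed, latent, and rejected) can be illustrated without any of the TreeCRF machinery. The sketch below only partitions candidate spans. It is not the Masked Inside algorithm or the released code, and the half-open span convention, the function names `crosses` and `partition_spans`, and the four-token example sentence are assumptions made for illustration.

```python
def crosses(a, b):
    """True if half-open token spans a=[i, j) and b=[k, l) partially overlap."""
    (i, j), (k, l) = a, b
    return i < k < j < l or k < i < l < j


def partition_spans(n_tokens, observed):
    """Split all non-observed spans into latent and rejected lists.

    A span is rejected when it crosses an observed entity span, because no
    single constituency tree could then contain both spans.
    """
    all_spans = [(i, j) for i in range(n_tokens) for j in range(i + 1, n_tokens + 1)]
    latent, rejected = [], []
    for span in all_spans:
        if span in observed:
            continue
        if any(crosses(span, obs) for obs in observed):
            rejected.append(span)
        else:
            latent.append(span)
    return latent, rejected


# Hypothetical sentence with one entity nested inside another.
tokens = ["Bank", "of", "China", "said"]
observed = {(0, 3): "ORG", (2, 3): "GPE"}  # "China" sits inside "Bank of China"
latent, rejected = partition_spans(len(tokens), observed)
```

In this example the spans `(1, 4)` and `(2, 4)` are rejected because each one cuts across the observed `(0, 3)` entity, while the remaining six unlabelled spans stay latent. Nested entities themselves never cross, which is why they fit in a single tree.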
6

Kulkarni, Rishikesh U., Catherine L. Wang, and Carolyn R. Bertozzi. "Analyzing nested experimental designs—A user-friendly resampling method to determine experimental significance." PLOS Computational Biology 18, no. 5 (May 2, 2022): e1010061. http://dx.doi.org/10.1371/journal.pcbi.1010061.

Abstract:
While hierarchical experimental designs are near-ubiquitous in neuroscience and biomedical research, researchers often do not take the structure of their datasets into account while performing statistical hypothesis tests. Resampling-based methods are a flexible strategy for performing these analyses but are difficult due to the lack of open-source software to automate test construction and execution. To address this, we present Hierarch, a Python package to perform hypothesis tests and compute confidence intervals on hierarchical experimental designs. Using a combination of permutation resampling and bootstrap aggregation, Hierarch can be used to perform hypothesis tests that maintain nominal Type I error rates and generate confidence intervals that maintain the nominal coverage probability without making distributional assumptions about the dataset of interest. Hierarch makes use of the Numba JIT compiler to reduce p-value computation times to under one second for typical datasets in biomedical research. Hierarch also enables researchers to construct user-defined resampling plans that take advantage of Hierarch’s Numba-accelerated functions.
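
To make concrete why the structure of a hierarchical dataset matters for a resampling test, here is a minimal sketch of a permutation test that shuffles treatment labels at the cluster level, not across individual measurements. It does not use the Hierarch package or reproduce its API, and it omits the bootstrap step the abstract mentions. The function name `cluster_permutation_test`, the choice of test statistic, and the toy measurements are assumptions for the example.

```python
from itertools import combinations
from statistics import mean


def cluster_permutation_test(clusters_a, clusters_b):
    """Exact two-sided permutation p-value with labels permuted per cluster.

    Each argument is a list of clusters (for example, one list of repeated
    measurements per animal). Repeated measurements are first reduced to a
    cluster mean so that they are not treated as independent samples.
    """
    means = [mean(c) for c in clusters_a + clusters_b]
    n_a = len(clusters_a)
    observed = abs(mean(means[:n_a]) - mean(means[n_a:]))
    indices = range(len(means))
    extreme = total = 0
    for group_a in combinations(indices, n_a):  # every relabelling of clusters
        in_a = set(group_a)
        stat = abs(
            mean(means[i] for i in group_a)
            - mean(means[i] for i in indices if i not in in_a)
        )
        total += 1
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total


# Toy data (assumed): 3 clusters per condition, 3 measurements per cluster.
control = [[1.0, 1.2, 0.9], [1.1, 1.0, 1.3], [0.8, 1.0, 1.1]]
treated = [[2.0, 2.1, 1.9], [2.2, 2.0, 2.3], [1.8, 2.1, 2.0]]
p_value = cluster_permutation_test(control, treated)
```

With three clusters per condition there are only 20 possible relabellings, so the smallest attainable two-sided p-value is 2/20 = 0.1 however many measurements each cluster contains. Pooling all 18 measurements as if they were independent would report a far smaller value.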
7

Liu, Wen, Yankui Sun, and Qingge Ji. "MDAN-UNet: Multi-Scale and Dual Attention Enhanced Nested U-Net Architecture for Segmentation of Optical Coherence Tomography Images." Algorithms 13, no. 3 (March 4, 2020): 60. http://dx.doi.org/10.3390/a13030060.

Abstract:
Optical coherence tomography (OCT) is an optical high-resolution imaging technique for ophthalmic diagnosis. In this paper, we take advantage of multi-scale input, multi-scale side output, and a dual attention mechanism to present an enhanced nested U-Net architecture (MDAN-UNet), a new, powerful fully convolutional network for automatic end-to-end segmentation of OCT images. We evaluated two versions of MDAN-UNet (MDAN-UNet-16 and MDAN-UNet-32) on two publicly available benchmark datasets, the Duke Diabetic Macular Edema (DME) dataset and the RETOUCH dataset, in comparison with other state-of-the-art segmentation methods. Our experiments demonstrate that MDAN-UNet-32 achieved the best performance, followed by MDAN-UNet-16 with fewer parameters, for multi-layer segmentation and multi-fluid segmentation respectively.
8

Turanzas, J., M. Alonso, H. Amaris, J. Gutierrez, and S. Pastrana. "A nested decision tree for event detection in smart grids." Renewable Energy and Power Quality Journal 20 (September 2022): 353–58. http://dx.doi.org/10.24084/repqj20.308.

Abstract:
The digitalization of traditional power networks into smart grids extends the challenges faced by power grid operators to the field of cybersecurity. False data injection attacks, one of the most common cyberattacks in smart grids, could lead the power grid to sabotage itself. In this paper, an event detection algorithm for cyberattacks in smart grids is developed based on a decision tree. To find the most accurate algorithm, two decision trees with different goals are trained: one classifies the status of the network, corresponding to an event, and the other classifies the location where the event is detected. The decision trees are trained on a dataset produced by co-simulating a power network and a communication network. They are compared in different settings by changing the division criteria, the dataset used to train them, and the misclassification cost. After their performance is examined independently, the best way to combine them into a single algorithm is presented.
9

Jamali, A., and A. Abdul Rahman. "EVALUATION OF ADVANCED DATA MINING ALGORITHMS IN LAND USE/LAND COVER MAPPING." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-4/W16 (October 1, 2019): 283–89. http://dx.doi.org/10.5194/isprs-archives-xlii-4-w16-283-2019.

Abstract:
Remote sensing is an effective method for environmental monitoring, land-cover mapping, and urban planning. In this paper, Landsat 8 OLI images are first classified for land use/land cover (LULC) mapping using six machine learning algorithms: Random Forest, Decision Table, DTNB, Multilayer Perceptron, Non-Nested Generalized Exemplars (NNge), and Simple Logistic. The results are then compared in terms of Overall Accuracy (OA), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). Simple Logistic had the best performance, with OA, MAE, and RMSE values of 99.9293, 0.0006, and 0.016 on the training dataset and 99.9467, 0.0005, and 0.0153 on the test dataset.
10

Hazard, Derek, Martin Schumacher, Mercedes Palomar-Martinez, Francisco Alvarez-Lerma, Pedro Olaechea-Astigarraga, and Martin Wolkewitz. "Improving nested case-control studies to conduct a full competing-risks analysis for nosocomial infections." Infection Control & Hospital Epidemiology 39, no. 10 (August 30, 2018): 1196–201. http://dx.doi.org/10.1017/ice.2018.186.

Abstract:
Objective: Competing risks are a necessary consideration when analyzing risk factors for nosocomial infections (NIs). In this article, we identify additional information that a competing risks analysis provides in a hospital setting. Furthermore, we improve on established methods for nested case-control designs to acquire this information. Methods: Using data from 2 Spanish intensive care units and model simulations, we show how controls selected by time-dynamic sampling for NI can be weighted to perform risk-factor analysis for death or discharge without infection. This extension not only enables hazard rate analysis for the competing risk, it also enables prediction analysis for NI. Results: The estimates acquired from the extension were in good agreement with the results from the full (real and simulated) cohort dataset. The reduced dataset results averted any false interpretation common in a competing-risks setting. Conclusions: Using additional information that is routinely collected in a hospital setting, a nested case-control design can be successfully adapted to avoid a competing risks bias. Furthermore, this adapted method can be used to reanalyze past nested case-control studies to enhance their findings.

Dissertations / Theses on the topic "Nested Dataset"

1

DENTI, FRANCESCO. "Bayesian Mixtures for Large Scale Inference." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2020. http://hdl.handle.net/10281/262923.

Abstract:
Bayesian mixture models are ubiquitous in statistics due to their simplicity and flexibility and can be easily employed in a wide variety of contexts. In this dissertation, we aim to provide a few contributions to current Bayesian data analysis methods, often motivated by research questions from biological applications. In particular, we focus on the development of novel Bayesian mixture models, typically in a nonparametric setting, to improve and extend active research areas that involve large-scale data: the modeling of nested data, multiple hypothesis testing, and dimensionality reduction. Our goal is therefore twofold: to develop robust statistical methods motivated by a solid theoretical background, and to propose efficient, scalable, and tractable algorithms for their applications. The thesis is organized as follows. In Chapter 1 we briefly review the methodological background and discuss the necessary concepts that belong to the different areas to which this dissertation contributes. In Chapter 2 we propose a Common Atoms model (CAM) for nested datasets, which overcomes the limitations of the nested Dirichlet Process, as discussed in [Camerlenghi2018]. We derive its theoretical properties and develop a slice sampler for nested data to obtain an efficient algorithm for posterior simulation. We then embed the model in a Rounded Mixture of Gaussian kernels framework to apply our method to an abundance table from a microbiome study. We then develop a Bayesian nonparametric (BNP) version of the two-group model [Efron2004], modeling both the null density $f_0$ and the alternative density $f_1$ with Pitman-Yor (PY) process mixture models. We propose to fix the two discount parameters $\sigma_0$ and $\sigma_1$ so that $\sigma_0 > \sigma_1$, according to the rationale that the null PY should be closer to its base measure (appropriately chosen to be a standard Gaussian base measure), while the alternative PY should have fewer constraints. To induce separation, we employ a non-local prior [Johnson] on the location parameter of the base measure of the PY placed on $f_1$. We show how the model performs in different scenarios and apply this methodology to a microbiome dataset. A further chapter presents a second proposal for the two-group model. Here, we make use of non-local distributions to model the alternative density directly in the likelihood formulation. We propose both a parametric and a nonparametric formulation of the model. We provide a theoretical justification for the adoption of this approach and, after comparing the performance of our model with several competitors, we present three applications on real, publicly available genomic datasets. In another chapter we focus on improving the model for intrinsic dimension (ID) estimation discussed in [Allegra]. In particular, the authors estimate the IDs by modeling the ratio of the distances from a point to its first and second nearest neighbors (NNs). First, we propose to include more suitable priors in their parametric, finite mixture model. Then, we extend the existing theoretical methodology by deriving closed-form distributions for the ratios of distances from a point to two NNs of generic order. We propose a simple Dirichlet process mixture model, where we exploit the novel theoretical results to extract more information from the data. The chapter concludes with simulation studies and an application to real data. Finally, the last chapter presents future directions and conclusions.
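
For orientation, the two-group model mentioned in this abstract is usually written as a two-component mixture of a null and an alternative density. The form below is the standard statement of that model and not a formula quoted from the thesis: the symbol $\pi_0$ for the null proportion and the local false discovery rate $\mathrm{fdr}(z)$ are assumptions about notation, and the thesis may use different symbols.

```latex
f(z) = \pi_0 \, f_0(z) + (1 - \pi_0) \, f_1(z),
\qquad
\mathrm{fdr}(z) = \frac{\pi_0 \, f_0(z)}{f(z)}
```

Read this way, the abstract's proposals amount to different prior specifications for $f_0$ and $f_1$, with the non-local prior used to keep $f_1$ away from the region where $f_0$ concentrates.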
2

Schulz, Sebastian [Verfasser], and B. [Akademischer Betreuer] Nestler. "Phase-field simulations of multi-component solidification and coarsening based on thermodynamic datasets / Sebastian Schulz. Betreuer: B. Nestler." Karlsruhe : KIT-Bibliothek, 2016. http://d-nb.info/1106330110/34.

3

Schulz, Sebastian [Verfasser], and B. [Akademischer Betreuer] Nestler. "Phase-field simulations of multi-component solidification and coarsening based on thermodynamic datasets / Sebastian Schulz ; Betreuer: B. Nestler." Karlsruhe : KIT Scientific Publishing, 2017. http://d-nb.info/1185759832/34.

4

Mauricio-Sanchez, David, Alneu de Andrade Lopes, and Pedro Nelson Shiguihara Juarez. "Approaches based on tree-structures classifiers to protein fold prediction." Institute of Electrical and Electronics Engineers Inc, 2017. http://hdl.handle.net/10757/622536.

Abstract:
The full text of this work is not available in the UPC Academic Repository because of restrictions imposed by the publisher.
Protein fold recognition is an important task in biology. Different machine learning methods, such as multiclass classifiers, one-vs-all, and ensembles of nested dichotomies, have been applied to this task, and in most cases multiclass approaches were used. In this paper, we compare classifiers organized in tree structures to classify folds. We used a benchmark dataset containing 125 features to predict folds, comparing different supervised methods and achieving 54% accuracy. An approach based on a tree structure of classifiers obtained better results than a hierarchical approach.
Peer reviewed
5

Calçada, David Tiago. "Predicting chelonia mydas nests survivability rates with use of machine learning techniques: applying machine learning techniques on conservation data – case study." Master's thesis, 2020. http://hdl.handle.net/10362/97228.

Abstract:
Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence
The general goal of knowledge discovery techniques is to help us find useful patterns in data without subjecting us to ambiguous and overly complex models. In fact, it has become increasingly important to establish a common language between biologists and data scientists. In my thesis I aim to use Green Turtle (Chelonia mydas) nesting data obtained in surveys conducted from 2015 to 2019 on Príncipe Island to achieve two things: first, to gain insights related to sea turtle survivability rates; second, to develop prediction models for those rates using popular Machine Learning algorithms. For this purpose, I detail how my collaboration with the sea turtle conservation team on Príncipe Island began and how the work has developed since. I describe all steps of dataset transformation, manipulation, and exploration, and detail how each step allowed me to feed the sea turtle data into powerful Machine Learning algorithms, which are evaluated on their ability to predict nest survivability rates accurately. At the end of the contextual part of this document, I explain my findings and present the limitations of this project; I hope to provide a solid example that will help future students and researchers keep in mind the challenges that await them should they pursue this field. Finally, a key aspect of this thesis is that it is written in such a way that individuals with different backgrounds can understand its content and objectives.

Books on the topic "Nested Dataset"

1

Reis, Lucas Bond. Florianópolis arqueológica. Editora da UFSC, 2021. http://dx.doi.org/10.5007/978-65-5805-023-0.

Abstract:
This work presents a collection of articles on the archaeological heritage of Florianópolis. Its 12 chapters cover a range of themes and periods in the history of human occupation of the municipality, focusing mainly on Santa Catarina Island. The book opens with two more general texts: one on the history of archaeological research in Florianópolis, and another giving an overview of the composition of this heritage and its current state of preservation and research. Nine chapters follow on varied themes in the municipality's archaeology, covering about 5,000 years of history. These include a chapter on shell sites, among them the sambaquis (shell mounds) well known to much of the local population; a chapter on the sites known as "lithic workshops" or "fixed grinder-polisher" sites and on sites with rock art, also well known to the population; and two chapters that deal more specifically with sambaqui contexts, presenting two case studies of sambaquis on the shores of Lagoa da Conceição. One of these studies emphasizes the identification of animal remains found at the site and builds a direct dialogue with the biology and composition of the island's fauna, using this to discuss the use of space and the formation of territories by the populations that built the site, while the other focuses on the analysis of human skeletal remains found at the site known as Ponta das Almas. Next come three chapters, in dialogue with one another, on the indigenous presence associated with the Jê and Guarani groups on Santa Catarina Island. The first discusses whether there was a Jê occupation on the island, using a case study of a site with "pseudo" subterranean structures to question the evidence that archaeology has used in this debate. The second presents a case study of a Guarani site excavated in the north of the island, with dates in the 16th century, and the third presents a study based on historical and archaeological sources on the indigenous presence on Santa Catarina Island in the 18th and 19th centuries. The ninth chapter of this part, which is the eleventh chapter of the book, reflects on cultural diversity in the construction of Florianópolis's urban space, discussing European immigration from the perspective of archaeology. The twelfth chapter presents a discussion based on a series of educational activities on the municipality's archaeological heritage carried out in municipal schools between 2013 and 2015. Finally, an appendix provides an updated list of the known and registered archaeological sites in the municipality, along with a map in A1 format with all the sites plotted. With this composition, the book combines more general texts, presents the history of research, deals with case studies, and shows the diversity of contexts and the potential for research and for actions related to public policy on archaeological heritage in the municipality.

Book chapters on the topic "Nested Dataset"

1

Gong, Yansheng, and Wenfeng Jing. "A Fully-Nested Encoder-Decoder Framework for Anomaly Detection." In Proceeding of 2021 International Conference on Wireless Communications, Networking and Applications, 749–59. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-2456-9_75.

Abstract:
Anomaly detection is an important branch of computer vision. At present, a variety of deep learning models are applied to anomaly detection. However, the lack of abnormal samples makes supervised learning difficult to implement. In this paper, we mainly study anomaly detection tasks based on unsupervised learning and propose a Fully-Nested Encoder-Decoder Framework. The main part of the proposed generative model consists of a generator and a discriminator, which are adversarially trained on normal data samples. To improve the image reconstruction capability of the generator, we design a Fully-Nested Residual Encoder-Decoder Network, which is used to encode and decode the images. In addition, we add residual structure to both the encoder and the decoder, which reduces the risk of overfitting and enhances feature expression ability. In the test phase, a distance measurement model is used to determine whether the test sample is abnormal. Experimental results on the CIFAR-10 dataset demonstrate the excellent performance of our method, which achieves state-of-the-art results compared with existing models.
2

Quicke, Donald, Buntika A. Butcher, and Rachel Kruft Welton. "More on apply family of functions - avoid loops to get more speed." In Practical R for biologists: an introduction, 322–25. Wallingford: CABI, 2021. http://dx.doi.org/10.1079/9781789245349.0027.

Abstract:
This chapter focuses on the 'apply' set of functions. These functions are for those who need to process very large datasets, or who need to perform loop-type operations on largish datasets but perhaps in a nested fashion.
3

Quicke, Donald, Buntika A. Butcher, and Rachel Kruft Welton. "More on apply family of functions - avoid loops to get more speed." In Practical R for biologists: an introduction, 322–25. Wallingford: CABI, 2021. http://dx.doi.org/10.1079/9781789245349.0322.

Abstract:
This chapter focuses on the 'apply' set of functions. These functions are for those who need to process very large datasets, or who need to perform loop-type operations on largish datasets but perhaps in a nested fashion.
4

Osakabe, Yoshihiro, and Akinori Asahara. "Proposing Novel High-Performance Compounds by Nested VAEs Trained Independently on Different Datasets." In Advances and Trends in Artificial Intelligence. Theory and Practices in Artificial Intelligence, 714–22. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-08530-7_60.

5

Chebotko, Artem, and Shiyong Lu. "Nested Optional Join for Efficient Evaluation of SPARQL Nested Optional Graph Patterns." In Advances in Semantic Web and Information Systems, 281–308. IGI Global, 2010. http://dx.doi.org/10.4018/978-1-60566-992-2.ch013.

Abstract:
Relational technology has been shown to be very useful for scalable Semantic Web data management. Numerous researchers have proposed using RDBMSs to store and query voluminous RDF data with SQL and RDF query languages. This chapter studies how RDF queries with so-called well-designed graph patterns and nested optional patterns can be efficiently evaluated in an RDBMS. The authors propose to extend relational algebra with a novel relational operator, nested optional join (NOJ), that is more efficient than left outer join in processing nested optional patterns of well-designed graph patterns. They design three efficient algorithms to implement the new operator in relational databases: (1) a nested-loops NOJ algorithm, NL-NOJ, (2) a sort-merge NOJ algorithm, SM-NOJ, and (3) a simple hash NOJ algorithm, SH-NOJ. Using a real-life RDF dataset, the authors demonstrate the efficiency of their algorithms by comparing them with the corresponding left outer join implementations and explore the effect of join selectivity on the performance of these algorithms.
6

Li, Liu, and Fusong Ling. "Chinese Medical Named Entity Recognition Method Based on Word-Word Relationship." In Computer Methods in Medicine and Health Care. IOS Press, 2022. http://dx.doi.org/10.3233/atde220541.

Abstract:
To address the difficulty that current deep learning methods have in recognizing nested entities in Chinese medical text, a deep learning model based on word-word relationships is introduced, in which the relationships between words are built with multi-granularity 2D graphs to improve the recognition of nested entities. First, we use BERT (Bidirectional Encoder Representations from Transformers) for pre-training, then we use BiLSTM (Bidirectional Long Short-Term Memory) to extract the context information. Then, we merge the token representation information, the word distance information and the word regional information, and pass them through a multi-granularity hole (dilated) convolution to obtain the role information of different words. Finally, we use a decoding layer to predict entity relationships and decode the result. The model is tested on the CMeEE Chinese medical dataset. Compared with popular models such as BiLSTM-CRF (Conditional Random Field) and BERT-BiLSTM-CRF, the F1 value is improved by 2.52%. Experimental results show that for Chinese medical named entity recognition with nested entities, this model can better recognize the medical entities in Chinese medical text.
7

Pham, Thien, Loi Truong, Mao Nguyen, Akhil Garg, Liang Gao, and Tho Quan. "Sequence-in-Sequence Learning for SOH Estimation of Lithium-Ion Battery." In Proceedings of CECNet 2021. IOS Press, 2021. http://dx.doi.org/10.3233/faia210385.

Abstract:
State-of-Health (SOH) prediction of a lithium-ion battery is essential for preventing malfunction and keeping the battery working efficiently. In practice, this task is difficult due to the high level of noise and complexity. Many machine learning methods, especially deep learning approaches, have recently been proposed to address this problem. However, there is much room for improvement because battery data are highly non-linear and depend strongly on multidisciplinary parameters such as resistance, voltage and the external conditions the battery is subjected to. In this paper, we propose an approach known as bidirectional sequence-in-sequence, which exploits the dependency of nested cycle-wise and channel-wise battery data. In experiments on a real dataset acquired from NASA, our method significantly reduces error, by up to approximately 32.5%.
8

Muche Fenta, Setegn, and Haile Mekonnen Fenta. "Level and Determinant of Child Mortality Rate in Ethiopia." In Mortality Rates in Middle and Low-Income Countries. IntechOpen, 2022. http://dx.doi.org/10.5772/intechopen.100482.

Abstract:
Background: One of the objectives of the Sustainable Development Goals (SDG) is to reduce the under-five mortality rate and improve maternal health. This study aims to identify factors that affect under-five mortality based on the 2016 EDHS dataset using the multilevel count regression model. Method: The EDHS data have a two-level hierarchical structure, with 14,370 women nested within 11 geographical regions. Multilevel count models were employed to predict the outcomes. Results: The data were found to have excess zeros (53.7%); the variance (1.697) is higher than the mean (0.90). Among families of count models, the HNB model was found to be a better fit for the dataset than the others. The study revealed that a child of a multiple birth is 1.45 times more likely to die than a child of a single birth. Babies delivered in the private sector have a 0.65 lower risk of under-five mortality than babies delivered at home. Conclusion: Vaccination of the child, family size, age of mother, antenatal visit, birth interval, birth order, contraceptive use, father's education level, mother's education level, father's occupation, place of delivery, twin status, age at first birth and religion were significantly associated with under-five mortality. The Ministry of Health should work to raise parents' awareness of vaccination and family planning services, and efforts should be made to improve parental educational levels.
9

Mu’inah, U. M., R. Fajriyah, and H. Nugrahapraja. "Organ-specific expression revealed using support vector machine on maize nested association mapping datasets." In Empowering Science and Mathematics for Global Competitiveness, 532–36. CRC Press, 2019. http://dx.doi.org/10.1201/9780429461903-72.

10

Ravindra, Padmashree, and Kemafor Anyanwu. "Nesting Strategies for Enabling Nimble MapReduce Dataflows for Large RDF Data." In Information Retrieval and Management, 811–38. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5191-1.ch035.

Abstract:
Graph and semi-structured data are usually modeled in relational processing frameworks as “thin” relations (node, edge, node), and processing such data involves many join operations. Intermediate results of joins with multi-valued attributes or relationships contain redundant subtuples due to repetition of single-valued attributes. The amount of redundant content is high for real-world multi-valued relationships in social network (millions of Twitter followers of popular celebrities) or biological (multiple references to related proteins) datasets. In MapReduce-based platforms such as Apache Hive and Pig, redundancy in intermediate results contributes avoidable costs to the overall I/O, sorting, and network transfer overhead of join-intensive workloads due to longer workflows. Consequently, providing techniques for dealing with such redundancy will enable more nimble execution of such workflows. This paper argues for the use of a nested data model for representing intermediate data concisely using nesting-aware dataflow operators that allow for lazy and partial unnesting strategies. This approach reduces the overall I/O and network footprint of a workflow by concisely representing intermediate results during most of a workflow's execution, until complete unnesting is absolutely necessary. The proposed strategies are integrated into Apache Pig, and experimental evaluation over real-world and synthetic benchmark datasets confirms their superiority over relational-style MapReduce systems such as Apache Pig and Hive.
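The redundancy argument in this abstract can be illustrated with a small Python sketch. It is not Apache Pig or Hive code, and it does not reproduce the authors' operators: `nest`, `unnest`, and the sample rows are hypothetical names and data chosen here to show how a nested representation stores the single-valued attributes once while the multi-valued part is expanded only on demand.

```python
from collections import defaultdict


def nest(flat_rows):
    """Group (subject, name, follower) rows so the repeated part is stored once."""
    nested = defaultdict(list)
    for subject, name, follower in flat_rows:
        # (subject, name) is the single-valued part that a flat join repeats.
        nested[(subject, name)].append(follower)
    return dict(nested)


def unnest(nested):
    """Lazily expand the nested form back into flat rows, one row at a time."""
    for (subject, name), followers in nested.items():
        for follower in followers:
            yield (subject, name, follower)


if __name__ == "__main__":
    # Hypothetical join output: the (subject, name) pair repeats per follower.
    flat = [
        ("u1", "Ada", "f1"),
        ("u1", "Ada", "f2"),
        ("u1", "Ada", "f3"),
        ("u2", "Bob", "f4"),
    ]
    nested = nest(flat)
    flat_fields = sum(len(row) for row in flat)
    nested_fields = sum(len(key) + len(vals) for key, vals in nested.items())
    print(flat_fields, nested_fields)  # prints: 12 8
```

Because `unnest` is a generator, rows are materialized only when a downstream step actually iterates over them, which is the general idea behind delaying unnesting until it is needed.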

Conference papers on the topic "Nested Dataset"

1

Ringland, Nicky, Xiang Dai, Ben Hachey, Sarvnaz Karimi, Cecile Paris, and James R. Curran. "NNE: A Dataset for Nested Named Entity Recognition in English Newswire." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/p19-1510.

2

Jonak, Martin, Stepan Jezek, and Radim Burget. "Evaluation of Nested U-Net models performance on MVTec AD dataset." In 2022 14th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT). IEEE, 2022. http://dx.doi.org/10.1109/icumt57764.2022.9943348.

3

Loukachevitch, Natalia, Ekaterina Artemova, Tatiana Batura, Pavel Braslavski, Ilia Denisov, Vladimir Ivanov, Suresh Manandhar, Alexander Pugachev, and Elena Tutubalina. "NEREL: A Russian Dataset with Nested Named Entities, Relations and Events." In International Conference Recent Advances in Natural Language Processing. INCOMA Ltd. Shoumen, BULGARIA, 2021. http://dx.doi.org/10.26615/978-954-452-072-4_100.

4

Dinh, Tuan Le, Suk-Hwan Lee, Seong-Geun Kwon, and Ki-Ryong Kwon. "Cell Nuclei Segmentation in Cryonuseg dataset using Nested Unet with EfficientNet Encoder." In 2022 International Conference on Electronics, Information, and Communication (ICEIC). IEEE, 2022. http://dx.doi.org/10.1109/iceic54506.2022.9748537.

5

Wu, Shuhui, Yongliang Shen, Zeqi Tan, and Weiming Lu. "Propose-and-Refine: A Two-Stage Set Prediction Network for Nested Named Entity Recognition." In Thirty-First International Joint Conference on Artificial Intelligence {IJCAI-22}. California: International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/613.

Abstract:
Nested named entity recognition (nested NER) is a fundamental task in natural language processing. Various span-based methods have been proposed to detect nested entities with span representations. However, span-based methods do not consider the relationship between a span and other entities or phrases, which is helpful in the NER task. Besides, span-based methods have trouble predicting long entities due to limited span enumeration length. To mitigate these issues, we present the Propose-and-Refine Network (PnRNet), a two-stage set prediction network for nested NER. In the propose stage, we use a span-based predictor to generate some coarse entity predictions as entity proposals. In the refine stage, proposals interact with each other, and richer contextual information is incorporated into the proposal representations. The refined proposal representations are used to re-predict entity boundaries and classes. In this way, errors in coarse proposals can be eliminated, and the boundary prediction is no longer constrained by the span enumeration length limitation. Additionally, we build multi-scale sentence representations, which better model the hierarchical structure of sentences and provide richer contextual information than token-level representations. Experiments show that PnRNet achieves state-of-the-art performance on four nested NER datasets and one flat NER dataset.
6

Zeng, Yu, Yan Gao, Jiaqi Guo, Bei Chen, Qian Liu, Jian-Guang Lou, Fei Teng, and Dongmei Zhang. "RECPARSER: A Recursive Semantic Parsing Framework for Text-to-SQL Task." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/504.

Abstract:
Neural semantic parsers usually fail to parse long and complicated utterances into nested SQL queries, due to the large search space. In this paper, we propose a novel recursive semantic parsing framework called RECPARSER to generate the nested SQL query layer-by-layer. It decomposes the complicated nested SQL query generation problem into several progressive non-nested SQL query generation problems. Furthermore, we propose a novel Question Decomposer module to explicitly encourage RECPARSER to focus on different components of an utterance when predicting SQL queries of different layers. Experiments on the Spider dataset show that our approach is more effective compared to the previous works at predicting the nested SQL queries. In addition, we achieve an overall accuracy that is comparable with state-of-the-art approaches.
7

Couto, João M. M., Breno Pimenta, Igor M. de Araújo, Samuel Assis, Julio C. S. Reis, Ana Paula C. da Silva, Jussara M. Almeida, and Fabrício Benevenuto. "Central de Fatos: Um Repositório de Checagens de Fatos." In Dataset Showcase Workshop. Sociedade Brasileira de Computação, 2021. http://dx.doi.org/10.5753/dsw.2021.17421.

Abstract:
Recently, interest in research on the mechanisms by which disinformation spreads, as well as on ways to prevent it, has increased significantly. In this scenario, a recurring obstacle is the unavailability of fact checks. In this work, we compile an extensive collection of fact checks from major Brazilian fact-checking agencies. We offer the scientific community a novel collection containing checks from several reliable sources that cover a broad spectrum of topics. In total, the resulting collection comprises 11,647 fact-check instances collected from 6 different agencies, which can be used in various studies on identifying and combating disinformation on digital platforms in Brazil.
8

Silva, Mariana O., Amanda F. Paula, Gabriel P. Oliveira, Iago A. D. Vaz, Henrique Hott, Larissa D. Gomide, Arthur P. G. Reis, et al. "LiPSet: Um conjunto de Dados com Documentos Rotulados de Licitações Públicas." In Dataset Showcase Workshop. Sociedade Brasileira de Computação, 2022. http://dx.doi.org/10.5753/dsw.2022.224925.

Abstract:
This work presents LiPSet, a dataset of labeled public bidding documents from Minas Gerais. After an overview of the collection and manual labeling process, a brief exploratory data analysis summarizes the main characteristics and contributions of the proposed dataset. Potential applications and the main challenges involved in using LiPSet are also discussed.
9

Albuquerque, Aldéryck Félix de, Abílio Nogueira Barros, Andreza Alencar, André Nascimento, Ibsen Mateus Bittencourt, and Rafael Ferreira Mello. "Dataset de Estimativas populacionais desagregada por município e idade 2014-2020." In Dataset Showcase Workshop. Sociedade Brasileira de Computação, 2022. http://dx.doi.org/10.5753/dsw.2022.225525.

Abstract:
This study seeks to address the lack of population estimates broken down by municipality and age, for the period 2014 to 2020 and for all municipalities in Brazil, by creating a dataset that provides these data in a structured form, enriched with features that facilitate reuse. It starts from official data, such as those of IBGE and the Ministry of Health, processed with a methodology already approved by a state agency. Besides the implementation of the methodology for generating the dataset, opportunities for improving the processing method are also discussed, thus guiding future studies of population disaggregation that take into account the particularities of the datasets of state agencies in Brazil.
10

Porto, Fabio, Amir Khatibi, João N. Rittmeyer, Eduardo Ogasawara, Patrick Valduriez, and Dennis Shasha. "Constellation Queries over Big Data." In Simpósio Brasileiro de Banco de Dados. Sociedade Brasileira de Computação - SBC, 2018. http://dx.doi.org/10.5753/sbbd.2018.22221.

Abstract:
A geometrical pattern is a set of points with all pairwise distances (or, more generally, relative distances) specified. Finding matches to such patterns has applications to spatial data in seismic, astronomical, and transportation contexts. Finding geometric patterns is a challenging problem as the potential number of sets of elements that compose shapes is exponentially large in the size of the dataset and the pattern. In this paper, we propose algorithms to find patterns in large data applications. Our methods combine quadtrees, matrix multiplication, and bucket join processing to discover sets of points that match a geometric pattern within some additive factor on the pairwise distances. Our distributed experiments show that the choice of composition algorithm (matrix multiplication or nested loops) depends on the freedom introduced in the query geometry through the distance additive factor. Three clearly identified blocks of threshold values guide the choice of the best composition algorithm.

Reports on the topic "Nested Dataset"

1

Idakwo, Gabriel, Sundar Thangapandian, Joseph Luttrell, Zhaoxian Zhou, Chaoyang Zhang, and Ping Gong. Deep learning-based structure-activity relationship modeling for multi-category toxicity classification : a case study of 10K Tox21 chemicals with high-throughput cell-based androgen receptor bioassay data. Engineer Research and Development Center (U.S.), July 2021. http://dx.doi.org/10.21079/11681/41302.

Abstract:
Deep learning (DL) has attracted the attention of computational toxicologists as it offers a potentially greater power for in silico predictive toxicology than existing shallow learning algorithms. However, contradicting reports have been documented. To further explore the advantages of DL over shallow learning, we conducted this case study using two cell-based androgen receptor (AR) activity datasets with 10K chemicals generated from the Tox21 program. A nested double-loop cross-validation approach was adopted along with a stratified sampling strategy for partitioning chemicals of multiple AR activity classes (i.e., agonist, antagonist, inactive, and inconclusive) at the same distribution rates amongst the training, validation and test subsets. Deep neural networks (DNN) and random forest (RF), representing deep and shallow learning algorithms, respectively, were chosen to carry out structure-activity relationship-based chemical toxicity prediction. Results suggest that DNN significantly outperformed RF (p < 0.001, ANOVA) by 22–27% for four metrics (precision, recall, F-measure, and AUPRC) and by 11% for another (AUROC). Further in-depth analyses of chemical scaffolding shed insights on structural alerts for AR agonists/antagonists and inactive/inconclusive compounds, which may aid in future drug discovery and improvement of toxicity prediction modeling.
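As an illustration of the partitioning described in this abstract, the following is a minimal Python sketch of a nested double-loop cross-validation split with stratified sampling. It is not the authors' code: the function names `stratified_folds` and `nested_splits`, the fold counts, and the toy label counts are assumptions made here for illustration; only the four class names come from the abstract.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal each class out round-robin so every fold gets its share.
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


def nested_splits(labels, outer_k=5, inner_k=4, seed=0):
    """Yield (train, validation, test) index lists for a nested double loop."""
    outer = stratified_folds(labels, outer_k, seed)
    for i, test in enumerate(outer):
        # Everything outside the held-out test fold feeds the inner loop.
        rest = [idx for j, fold in enumerate(outer) if j != i for idx in fold]
        rest_labels = [labels[idx] for idx in rest]
        inner = stratified_folds(rest_labels, inner_k, seed + i + 1)
        for m, val_pos in enumerate(inner):
            val = [rest[p] for p in val_pos]
            train = [rest[p] for n, fold in enumerate(inner) if n != m for p in fold]
            yield train, val, test


if __name__ == "__main__":
    # Hypothetical toy labels; the counts are invented for the example.
    toy_labels = (["agonist"] * 20 + ["antagonist"] * 20
                  + ["inactive"] * 40 + ["inconclusive"] * 20)
    train, val, test = next(nested_splits(toy_labels))
    print(len(train), len(val), len(test))  # prints: 60 20 20
```

The outer loop holds out a test fold and the inner loop rotates a validation fold within the remaining data, so model selection on the validation fold never touches the test fold.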
2

Alviarez, Vanessa, Michele Fioretti, Ken Kikkawa, and Monica Morlacco. Two-Sided Market Power in Firm-to-Firm Trade. Inter-American Development Bank, August 2021. http://dx.doi.org/10.18235/0003493.

Abstract:
Firms in global value chains (GVCs) are granular and exert bargaining power over the terms of trade. We show that these features are crucial to understanding the well-established variation in prices and pass-through across importers and exporters. We develop a novel theory of prices in GVCs, which tractably nests a wide range of bilateral concentration and bargaining power configurations. We test and evaluate the model's predictions using a novel dataset merging transaction-level U.S. import data with balance sheet data for both U.S. importers and foreign exporters. Our pricing framework improves on traditional frameworks in the literature in accurately predicting price changes following a tariff shock. The results shed light on the role of firms in determining the tariff pass-through onto import prices.