Theses on the topic "Neural networks (Computer science)"
Consult the top 50 theses for your research on the topic "Neural networks (Computer science)".
Landassuri, Moreno Victor Manuel. "Evolution of modular neural networks". Thesis, University of Birmingham, 2012. http://etheses.bham.ac.uk//id/eprint/3243/.
Sloan, Cooper Stokes. "Neural bus networks". Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119711.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 65-68).
Bus schedules are unreliable, leaving passengers waiting and increasing commute times. This problem can be solved by modeling the traffic network, and delivering predicted arrival times to passengers. Research attempts to model traffic networks use historical, statistical and learning based models, with learning based models achieving the best results. This research compares several neural network architectures trained on historical data from Boston buses. Three models are trained: multilayer perceptron, convolutional neural network and recurrent neural network. Recurrent neural networks show the best performance when compared to feed forward models. This indicates that neural time series models are effective at modeling bus networks. The large amount of data available for training bus network models and the effectiveness of large neural networks at modeling this data show that great progress can be made in improving commutes for passengers.
by Cooper Stokes Sloan.
M. Eng.
Khan, Altaf Hamid. "Feedforward neural networks with constrained weights". Thesis, University of Warwick, 1996. http://wrap.warwick.ac.uk/4332/.
Zaghloul, Waleed A. Lee Sang M. "Text mining using neural networks". Lincoln, Neb. : University of Nebraska-Lincoln, 2005. http://0-www.unl.edu.library.unl.edu/libr/Dissertations/2005/Zaghloul.pdf.
Title from title screen (sites viewed on Oct. 18, 2005). PDF text: 100 p. : col. ill. Includes bibliographical references (p. 95-100 of dissertation).
Hadjifaradji, Saeed. "Learning algorithms for restricted neural networks". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0016/NQ48102.pdf.
Cheung, Ka Kit. "Neural networks for optimization". HKBU Institutional Repository, 2001. http://repository.hkbu.edu.hk/etd_ra/291.
Ahamed, Woakil Uddin. "Quantum recurrent neural networks for filtering". Thesis, University of Hull, 2009. http://hydra.hull.ac.uk/resources/hull:2411.
Williams, Bryn V. "Evolutionary neural networks : models and applications". Thesis, Aston University, 1995. http://publications.aston.ac.uk/10635/.
De, Jongh Albert. "Neural network ensembles". Thesis, Stellenbosch : Stellenbosch University, 2004. http://hdl.handle.net/10019.1/50035.
ENGLISH ABSTRACT: It is possible to improve on the accuracy of a single neural network by using an ensemble of diverse and accurate networks. This thesis explores diversity in ensembles and looks at the underlying theory and mechanisms employed to generate and combine ensemble members. Bagging and boosting are studied in detail and I explain their success in terms of well-known theoretical instruments. An empirical evaluation of their performance is conducted and I compare them to a single classifier and to each other in terms of accuracy and diversity.
AFRIKAANSE OPSOMMING: Dit is moontlik om op die akkuraatheid van 'n enkele neurale netwerk te verbeter deur 'n ensemble van diverse en akkurate netwerke te gebruik. Hierdie tesis ondersoek diversiteit in ensembles, asook die meganismes waardeur lede van 'n ensemble geskep en gekombineer kan word. Die algoritmes "bagging" en "boosting" word in diepte bestudeer en hulle sukses word aan die hand van bekende teoretiese instrumente verduidelik. Die prestasie van hierdie twee algoritmes word eksperimenteel gemeet en hulle akkuraatheid en diversiteit word met 'n enkele netwerk vergelyk.
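The bagging procedure named in the abstract above can be illustrated with a minimal sketch. It is not taken from the thesis: to keep the example dependency-free, the base learner is a one-dimensional threshold classifier (a decision stump) rather than a neural network, and the names `train_stump`, `bagging` and `predict` are hypothetical. Each ensemble member is trained on a bootstrap resample of the data, and the members are combined by majority vote.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a threshold t so that the rule 'predict 1 if x > t else 0'
    makes the fewest errors on the given (x, label) samples."""
    xs = sorted({x for x, _ in samples})
    # Candidate thresholds: below all points, between neighbours, above all points.
    candidates = [xs[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    candidates.append(xs[-1] + 1.0)
    best_t, best_err = candidates[0], len(samples) + 1
    for t in candidates:
        err = sum((x > t) != bool(y) for x, y in samples)
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def bagging(samples, n_members, rng):
    """Train n_members stumps, each on a bootstrap resample
    (same size as the data, drawn with replacement)."""
    members = []
    for _ in range(n_members):
        bootstrap = [rng.choice(samples) for _ in samples]
        members.append(train_stump(bootstrap))
    return members


def predict(members, x):
    """Combine the ensemble members by majority vote."""
    votes = Counter(int(x > t) for t in members)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    data = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (5.0, 1)]
    ensemble = bagging(data, 11, random.Random(0))
    print([predict(ensemble, x) for x in (0.5, 4.5)])
```

An odd number of members is used so that a binary vote cannot tie. Diversity between members comes only from the resampling here; boosting, which the thesis also studies, reweights the training data instead and is not shown.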
Lee, Ji Young Ph D. Massachusetts Institute of Technology. "Information extraction with neural networks". Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/111905.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 85-97).
Electronic health records (EHRs) have been widely adopted, and are a gold mine for clinical research. However, EHRs, especially their text components, remain largely unexplored because they must be de-identified prior to any medical investigation. Existing systems for de-identification rely on manual rules or features, which are time-consuming to develop and fine-tune for new datasets. In this thesis, we propose the first de-identification system based on artificial neural networks (ANNs), which achieves state-of-the-art results without any human-engineered features. The ANN architecture is extended to incorporate features, further improving the de-identification performance. Under practical considerations, we explore transfer learning to take advantage of a large annotated dataset to improve the performance on datasets with a limited number of annotations. The ANN-based system is publicly released as an easy-to-use software package for general purpose named-entity recognition as well as de-identification. Finally, we present an ANN architecture for relation extraction, which ranked first in the SemEval-2017 task 10 (ScienceIE) for relation extraction in scientific articles (subtask C).
by Ji Young Lee.
Ph. D.
Zeng, Brandon. "Towards understanding residual neural networks". Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/123067.
Cataloged from PDF version of thesis.
Includes bibliographical references (page 37).
Residual networks (ResNets) are now a prominent architecture in the field of deep learning. However, an explanation for their success remains elusive. The original view is that residual connections allow for the training of deeper networks, but it is not clear that added layers are always useful, or even how they are used. In this work, we find that residual connections distribute learning behavior across layers, allowing ResNets to indeed effectively use deeper layers and outperform standard networks. We support this explanation with results for network gradients and representation learning that show that residual connections make the training of individual residual blocks easier.
by Brandon Zeng.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
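As a reminder of what a residual connection is, the sketch below contrasts a plain block with a residual block. It is an illustration under simplifying assumptions, not code from the thesis: the learned function is a single dense layer followed by ReLU, and the names `plain_block` and `residual_block` are hypothetical. The residual block computes x + F(x), so with all-zero weights it reduces to the identity, whereas the plain block outputs zeros.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def matvec(weights, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def plain_block(weights, x):
    """Standard block: y = F(x), with F a dense layer plus ReLU."""
    return relu(matvec(weights, x))


def residual_block(weights, x):
    """Residual block: y = x + F(x), the input is added back via a skip connection."""
    return [a + f for a, f in zip(x, plain_block(weights, x))]


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    x = [1.5, -2.0]
    print(plain_block(zeros, x))     # all-zero weights wipe out the input
    print(residual_block(zeros, x))  # the skip connection passes the input through
```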
Sarda, Srikant 1977. "Neural networks and neurophysiological signals". Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/9806.
Includes bibliographical references (p. 45).
The purpose of this thesis project is to develop, implement, and validate a neural network which will classify compound muscle action potentials (CMAPs). The two classes of signals are "viable" and "non-viable." This classification system will be used as part of a quality assurance mechanism on the NC-stat nerve conduction monitoring system. The results show that standard backpropagation neural networks provide exceptional classification results on novel waveforms. Also, principal components analysis is a powerful preprocessing technique which allows for a significant reduction in processing requirements, while maintaining performance standards. This system is implementable as a real-time quality control process for the NC-stat.
by Srikant Sarda.
S.B. and M.Eng.
Nareshkumar, Nithyalakshmi. "Simultaneous versus Successive Learning in Neural Networks". Miami University / OhioLINK, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=miami1134068959.
Amin, Muhamad Kamal M. "Multiple self-organised spiking neural networks". Thesis, Available from the University of Aberdeen Library and Historic Collections Digital Resources. Online version available for University members only until Feb. 1, 2014, 2009. http://digitool.abdn.ac.uk:80/webclient/DeliveryManager?application=DIGITOOL-3&owner=resourcediscovery&custom_att_2=simple_viewer&pid=26029.
With: Clustering with self-organised spiking neural network / Muhamad K. Amin ... et al. Joint 4th International Conference on Soft Computing and Intelligent Systems (SCIS) and 9th International Symposium on Advanced Intelligent Systems (SIS) Sept. 17-21, 2008, Nagoya, Japan. Includes bibliographical references.
McMichael, Lonny D. (Lonny Dean). "A Neural Network Configuration Compiler Based on the Adaptrode Neuronal Model". Thesis, University of North Texas, 1992. https://digital.library.unt.edu/ark:/67531/metadc501018/.
Yang, Horng-Chang. "Multiresolution neural networks for image edge detection and restoration". Thesis, University of Warwick, 1994. http://wrap.warwick.ac.uk/66740/.
Polhill, John Gareth. "Guaranteeing generalisation in neural networks". Thesis, University of St Andrews, 1995. http://hdl.handle.net/10023/12878.
Salama, Rameri. "On evolving modular neural networks". University of Western Australia. Dept. of Computer Science, 2000. http://theses.library.uwa.edu.au/adt-WU2003.0011.
Bhattacharya, Dipankar. "Neural networks for signal processing". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1996. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/nq21924.pdf.
Tipping, Michael E. "Topographic mappings and feed-forward neural networks". Thesis, Aston University, 1996. http://publications.aston.ac.uk/672/.
Rountree, Nathan. "Initialising neural networks with prior knowledge". University of Otago. Department of Computer Science, 2007. http://adt.otago.ac.nz./public/adt-NZDU20070510.135442.
Shah, Jagesh V. (Jagesh Vijaykumar). "Learning dynamics in feedforward neural networks". Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/36541.
Includes bibliographical references (leaves 108-115).
by Jagesh V. Shah.
M.S.
Mars, Risha R. "Organic LEDs for optoelectronic neural networks". Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/77537.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 79-81).
In this thesis, I investigate the characteristics of Organic Light Emitting Diodes (OLEDs) and assess their suitability for use in the Compact Optoelectronic Integrated Neural (COIN) coprocessor. The COIN coprocessor, a prototype artificial neural network implemented in hardware, seeks to implement neural network algorithms in native optoelectronic hardware in order to do parallel type processing in a faster and more efficient manner than all-electronic implementations. The feasibility of scaling the network to tens of millions of neurons is the main reason for optoelectronics: they do not suffer from crosstalk and other problems that affect electrical wires when they are densely packed. I measured the optical and electrical characteristics of different types of OLEDs, and made calculations based on existing optical equipment to determine the specific characteristics required if OLEDs were to be used in the prototype. The OLEDs were compared to Vertical Cavity Surface Emitting Lasers (VCSELs) to determine the tradeoffs in using one over the other in the prototype neural network.
by Risha R. Mars.
M.Eng.
Doshi, Anuja. "Aircraft position prediction using neural networks". Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/33300.
Includes bibliographical references (leaf 64).
The Federal Aviation Administration (FAA) has been investigating early warning accident prevention systems in an effort to prevent runway collisions. One system in place is the Airport Movement Area Safety System (AMASS), developed under contract with the FAA. AMASS uses a linear prediction system to predict the position of an aircraft 5 to 30 seconds in the future. The system sounds an alarm to warn air traffic controllers if it foresees a potential accident. However, research done at MIT and Volpe National Transportation Systems Center has shown that neural networks more accurately predict the future position of aircraft. Neural networks are self-learning, and the time required for the optimization of safety logic will be minimized using neural networks. More accurate predictions of aircraft position will deliver earlier warnings to air traffic controllers while reducing the number of nuisance alerts. There are many factors to consider in designing an aircraft position prediction neural network, including history length, types of inputs and outputs, and applicable training data. This document chronicles the design, training, performance, and analysis of a position prediction neural network, and presents the resulting optimal neural network for the AMASS System. Additionally, the neural network prediction model is then compared to other prediction models, including a constant speed, a linear regression, and an autoregression model. In this analysis, neural networks present themselves as a superior model for aircraft position prediction.
by Anuja Doshi.
M.Eng. and S.B.
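The constant speed model that the abstract lists as a comparison baseline can be sketched as a straight-line extrapolation from the two most recent position samples. This is an assumed reading of "constant speed", not the thesis's or AMASS's actual implementation; the function name `constant_speed_predict` and the `(t, x, y)` sample layout are hypothetical.

```python
def constant_speed_predict(track, horizon_s):
    """Extrapolate a position horizon_s seconds ahead.

    track is a list of (t, x, y) samples ordered by time; the velocity is
    estimated from the last two samples and assumed to stay constant.
    """
    if len(track) < 2:
        raise ValueError("need at least two samples to estimate velocity")
    (t0, x0, y0), (t1, x1, y1) = track[-2], track[-1]
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("samples must be strictly increasing in time")
    vx = (x1 - x0) / dt
    vy = (y1 - y0) / dt
    return (x1 + vx * horizon_s, y1 + vy * horizon_s)


if __name__ == "__main__":
    # Two samples one second apart; look 5 seconds ahead.
    track = [(0.0, 0.0, 0.0), (1.0, 10.0, 5.0)]
    print(constant_speed_predict(track, 5.0))
```

A baseline like this ignores acceleration and turning, which is the kind of behaviour a learned predictor is meant to capture over the 5 to 30 second horizon mentioned in the abstract.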
Gu, Youyang. "Food adulteration detection using neural networks". Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/106015.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 99-100).
In food safety and regulation, there is a need for an automated system to be able to make predictions on which adulterants (unauthorized substances in food) are likely to appear in which food products. For example, we would like to know that it is plausible for Sudan I, an illegal red dye, to adulterate "strawberry ice cream", but not "bread". In this work, we show a novel application of deep neural networks in solving this task. We leverage data sources of commercial food products, hierarchical properties of substances, and documented cases of adulterations to characterize ingredients and adulterants. Taking inspiration from natural language processing, we show the use of recurrent neural networks to generate vector representations of ingredients from Wikipedia text and make predictions. Finally, we use these representations to develop a sequential method that has the capability to improve prediction accuracy as new observations are introduced. The results outline a promising direction in the use of machine learning techniques to aid in the detection of adulterants in food.
by Youyang Gu.
M. Eng.
Mehta, Haripriya (Haripriya P.). "Secure inference of quantized neural networks". Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/127663.
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 63-65).
Running image recognition algorithms on medical datasets raises several privacy concerns. Hospitals may not have access to an image recognition model that a third party may have developed, and medical images are HIPAA protected and thus cannot leave hospital servers. However, with secure neural network inference, hospitals can send encrypted medical images as input to a modified neural network that is compatible with leveled fully homomorphic encryption (LHE), a form of encryption that can support evaluation of degree-bounded polynomial functions over encrypted data without decrypting it, and the Brakerski/Fan-Vercauteren (BFV) scheme, an efficient LHE cryptographic scheme which only operates with integers. To make the model compatible with LHE with the BFV scheme, the neural net weights and activations must be converted to integers through quantization, and non-linear activation functions must be approximated with low-degree polynomial functions. This paper presents a pipeline that can train real world models such as ResNet-18 on large datasets and quantize them without significant loss in accuracy. Additionally, we highlight customized quantize inference functions which we will eventually modify to be compatible with LHE and measure the impact on model accuracy.
by Haripriya Mehta.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
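The conversion of weights to integers mentioned in the abstract can be illustrated with symmetric uniform quantization. This is a generic sketch, not the pipeline from the thesis: the 8-bit width, the per-tensor scale, and the names `quantize` and `dequantize` are all assumptions made for illustration.

```python
def quantize(weights, num_bits=8):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integers."""
    return [v * scale for v in q]


if __name__ == "__main__":
    weights = [0.4, -1.0, 0.3, 0.0]
    q, scale = quantize(weights)
    print(q)
    print(dequantize(q, scale))
```

Only the integer list would be used in integer-only arithmetic; the scale is kept aside to interpret the results. The reconstruction error per weight is at most about half a quantization step, which is why accuracy can often be preserved.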
Srivastava, Sanjana. "On foveation of deep neural networks". Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/123134.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 61-63).
The human ability to recognize objects is impaired when the object is not shown in full. "Minimal images" are the smallest regions of an image that remain recognizable for humans. [26] show that a slight modification of the location and size of the visible region of the minimal image produces a sharp drop in human recognition accuracy. In this paper, we demonstrate that such drops in accuracy due to changes of the visible region are a common phenomenon between humans and existing state-of-the-art convolutional neural networks (CNNs), and are much more prominent in CNNs. We found many cases where CNNs classified one region correctly and the other incorrectly, though they only differed by one row or column of pixels, and were often bigger than the average human minimal image size. We show that this phenomenon is independent from previous works that have reported lack of invariance to minor modifications in object location in CNNs. Our results thus reveal a new failure mode of CNNs that also affects humans to a lesser degree. They expose how fragile CNN recognition ability is for natural images even without synthetic adversarial patterns being introduced. This opens potential for CNN robustness in natural images to be brought to the human level by taking inspiration from human robustness methods. One of these is eccentricity dependence, a model of human focus in which attention to the visual input degrades proportional to distance from the focal point [7]. We demonstrate that applying the "inverted pyramid" eccentricity method, a multi-scale input transformation, makes CNNs more robust to useless background features than a standard raw-image input. Our results also find that using the inverted pyramid method generally reduces useless background pixels, therefore reducing required training data.
by Sanjana Srivastava.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Behnke, Sven. "Hierarchical neural networks for image interpretation /". Berlin [u.a.] : Springer, 2003. http://www.loc.gov/catdir/enhancements/fy0813/2003059597-d.html.
Whyte, William John. "Statistical mechanics of neural networks". Thesis, University of Oxford, 1995. http://ora.ox.ac.uk/objects/uuid:e17f9b27-58ac-41ad-8722-cfab75139d9a.
Adamu, Abdullahi S. "An empirical study towards efficient learning in artificial neural networks by neuronal diversity". Thesis, University of Nottingham, 2016. http://eprints.nottingham.ac.uk/33799/.
Mohr, Sheila Jean. "Temporal EKG signal classification using neural networks". Master's thesis, This resource online, 1991. http://scholar.lib.vt.edu/theses/available/etd-02022010-020115/.
Treadgold, Nicholas K. Computer Science & Engineering Faculty of Engineering UNSW. "Constructive neural networks : generalisation, convergence and architectures". Awarded by:University of New South Wales. School of Computer Science and Engineering, 1999. http://handle.unsw.edu.au/1959.4/17615.
Tavanaei, Amirhossein. "Spiking Neural Networks and Sparse Deep Learning". Thesis, University of Louisiana at Lafayette, 2019. http://pqdtopen.proquest.com/#viewpdf?dispub=10807940.
This document proposes new methods for training multi-layer and deep spiking neural networks (SNNs), specifically, spiking convolutional neural networks (CNNs). Training a multi-layer spiking network poses difficulties because the output spikes do not have derivatives and the commonly used backpropagation method for non-spiking networks is not easily applied. Our methods use novel versions of the brain-like, local learning rule named spike-timing-dependent plasticity (STDP) that incorporates supervised and unsupervised components. Our method starts with conventional learning methods and converts them to spatio-temporally local rules suited for SNNs.
The training uses two components for unsupervised feature extraction and supervised classification. The first component refers to new STDP rules for spike-based representation learning that trains convolutional filters and initial representations. The second introduces new STDP-based supervised learning rules for spike pattern classification via an approximation to gradient descent by combining the STDP and anti-STDP rules. Specifically, the STDP-based supervised learning model approximates gradient descent by using temporally local STDP rules. Stacking these components implements a novel sparse, spiking deep learning model. Our spiking deep learning model is categorized as a variation of spiking CNNs of integrate-and-fire (IF) neurons with performance comparable with the state-of-the-art deep SNNs. The experimental results show the success of the proposed model for image classification. Our network architecture is the only spiking CNN which provides bio-inspired STDP rules in a hierarchy of feature extraction and classification in an entirely spike-based framework.
Czuchry, Andrew J. Jr. "Toward a formalism for the automation of neural network construction and processing control". Diss., Georgia Institute of Technology, 1993. http://hdl.handle.net/1853/9199.
Bragansa, John. "On the performance issues of the bidirectional associative memory". Thesis, Georgia Institute of Technology, 1993. http://hdl.handle.net/1853/17809.
Post, David L. "Network Management: Assessing Internet Network-Element Fault Status Using Neural Networks". Ohio : Ohio University, 2008. http://www.ohiolink.edu/etd/view.cgi?ohiou1220632155.
Morphet, Steven Brian Işık Can. "Modeling neural networks via linguistically interpretable fuzzy inference systems". Related electronic resource: Current Research at SU : database of SU dissertations, recent titles available full text, 2004. http://wwwlib.umi.com/cr/syr/main.
Ngom, Alioune. "Synthesis of multiple-valued logic functions by neural networks". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp03/NQ36787.pdf.
Rivest, François. "Knowledge transfer in neural networks : knowledge-based cascade-correlation". Thesis, McGill University, 2002. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=29470.
Künzle, Philippe. "Building topological maps for robot navigation using neural networks". Thesis, McGill University, 2005. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=82266.
In this thesis, we explore the issue of creating a topological map from range data. A robot in a simulated environment uses the distance from objects around it (range data) and a compass as inputs. From this information, the robot finds intersections, classifies them as landmarks using a neural network and creates a topological map of its environment. The neural network detecting landmarks is trained online on sample intersections. Although the robot evolves in a simulated environment, the ideas developed in this thesis could be applied to a real robot in an office space.
Yang, Xiao. "Memristor based neural networks : feasibility, theories and approaches". Thesis, University of Kent, 2014. https://kar.kent.ac.uk/49041/.
Wang, Fengzhen. "Neural networks for data fusion". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp02/NQ30179.pdf.
Horvitz, Richard P. "Symbol Grounding Using Neural Networks". University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1337887977.
Turner, Joe. "Application of artificial neural networks in pharmacokinetics /". Connect to full text, 2003. http://setis.library.usyd.edu.au/adt/public_html/adt-NU/public/adt-NU20031007.090937/index.html.
Miller, Paul Ian. "Recurrent neural networks and adaptive motor control". Thesis, University of Stirling, 1997. http://hdl.handle.net/1893/21520.
Chen, Francis Xinghang. "Modeling human vision using feedforward neural networks". Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/112824.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 81-86).
In this thesis, we discuss the implementation, characterization, and evaluation of a new computational model for human vision. Our goal is to understand the mechanisms enabling invariant perception under scaling, translation, and clutter. The model is based on I-Theory [50], and uses convolutional neural networks. We investigate the explanatory power of this approach using the task of object recognition. We find that the model has important similarities with neural architectures and that it can reproduce human perceptual phenomena. This work may be an early step towards a more general and unified human vision model.
by Francis Xinghang Chen.
M. Eng.
Dernoncourt, Franck. "Sequential short-text classification with neural networks". Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/111880.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 69-79).
Medical practice too often fails to incorporate recent medical advances. The two main reasons are that over 25 million scholarly medical articles have been published, and medical practitioners do not have the time to perform literature reviews. Systematic reviews aim at summarizing published medical evidence, but writing them requires tremendous human efforts. In this thesis, we propose several natural language processing methods based on artificial neural networks to facilitate the completion of systematic reviews. In particular, we focus on short-text classification, to help authors of systematic reviews locate the desired information. We introduce several algorithms to perform sequential short-text classification, which outperform state-of-the-art algorithms. To facilitate the choice of hyperparameters, we present a method based on Gaussian processes. Lastly, we release PubMed 20k RCT, a new dataset for sequential sentence classification in randomized control trial abstracts.
by Franck Dernoncourt.
Ph. D.
Zhang, Jeffrey M. Eng Massachusetts Institute of Technology. "Enhancing adversarial robustness of deep neural networks". Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122994.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 57-58).
Logit-based regularization and pretrain-then-tune are two approaches that have recently been shown to enhance adversarial robustness of machine learning models. In the realm of regularization, Zhang et al. (2019) proposed TRADES, a logit-based regularization optimization function that has been shown to improve upon the robust optimization framework developed by Madry et al. (2018) [14, 9]. They were able to achieve state-of-the-art adversarial accuracy on CIFAR10. In the realm of pretrain-then-tune models, Hendrycks et al. (2019) demonstrated that adversarially pretraining a model on ImageNet then adversarially tuning on CIFAR10 greatly improves the adversarial robustness of machine learning models. In this work, we propose Adversarial Regularization, another logit-based regularization optimization framework that surpasses TRADES in adversarial generalization. Furthermore, we explore the impact of trying different types of adversarial training on the pretrain-then-tune paradigm.
by Jeffry Zhang.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Miglani, Vivek N. "Comparing learned representations of deep neural networks". Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/123048.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 63-64).
In recent years, a variety of deep neural network architectures have obtained substantial accuracy improvements in tasks such as image classification, speech recognition, and machine translation, yet little is known about how different neural networks learn. To further understand this, we interpret the function of a deep neural network used for classification as converting inputs to a hidden representation in a high dimensional space and applying a linear classifier in this space. This work focuses on comparing these representations as well as the learned input features for different state-of-the-art convolutional neural network architectures. By focusing on the geometry of this representation, we find that different network architectures trained on the same task have hidden representations which are related by linear transformations. We find that retraining the same network architecture with a different initialization does not necessarily lead to more similar representation geometry for most architectures, but the ResNeXt architecture consistently learns similar features and hidden representation geometry. We also study connections to adversarial examples and observe that networks with more similar hidden representation geometries also exhibit higher rates of adversarial example transferability.
by Vivek N. Miglani.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Trinh, Loc Quang. "Greedy layerwise training of convolutional neural networks". Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/123128.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 61-63).
Layerwise training presents an alternative approach to end-to-end back-propagation for training deep convolutional neural networks. Although previous work was unsuccessful in demonstrating the viability of layerwise training, especially on large-scale datasets such as ImageNet, recent work has shown that layerwise training on specific architectures can yield highly competitive performances. On ImageNet, the layerwise trained networks can perform comparably to many state-of-the-art end-to-end trained networks. In this thesis, we compare the performance gap between the two training procedures across a wide range of network architectures and further analyze the possible limitations of layerwise training. Our results show that layerwise training quickly saturates after a certain critical layer, due to the overfitting of early layers within the networks. We discuss several approaches we took to address this issue and help layerwise training improve across multiple architectures. From a fundamental standpoint, this study emphasizes the need to open the blackbox that is modern deep neural networks and investigate the layerwise interactions between intermediate hidden layers within deep networks, all through the lens of layerwise training.
by Loc Quang Trinh.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science