opent03 / works

Fun things I wrote/cowrote.

I co-wrote "Neural Networks: A Continuum of Potential" with Eric Hu and Johnny Huang as the final project for COMP 599 - Topics in Computer Science: Mathematical Techniques for Machine Learning, taught by Prof. Panangaden, Fall 2019.

"On the Concentration of Measure in Orlicz spaces of exponential type" was my final project for MATH 598 - Topics in Probability & Statistics: Concentration Phenomena, taught by Prof. Lin, Winter 2020.

"On the Analysis of Stochastic Gradient Descent in Neural Networks via Gradient Flows" was my final report for MATH 470 - Honors Research Project, supervised by Prof. Khalili, Winter 2020.

I co-wrote "Fader Networks: A Heuristic Approach" with Marcos Cardenas Zelaya and Marie Leech, our final report for COMP 551, taught by Prof. William Hamilton. The level of reproducibility of Lample et. al.'s work was mindblowing. We generated some pretty hilarious domain adapted images.

Neural Networks: A Continuum of Potential, December 2019

Theories of neural networks with an infinite number of hidden units have been explored since the late 1990s, deepening the understanding of these computational models in two principal aspects: (1) network behavior under limiting processes on the parameters, and (2) neural networks viewed from a functional perspective. Continuous neural networks are of particular interest due to their computational feasibility through finite affine parametrizations and their strong convergence properties under constraints. In this paper, we survey some of the theoretical groundings of continuous networks, notably their correspondence with Gaussian processes and their computation and applications through the Neural Tangent Kernel (NTK) formulation, and we apply the infinite-dimensional extension to inputs and outputs while considering the resulting universal approximation properties.
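
For readers skimming, the NTK mentioned above is commonly defined through the network's parameter gradients; the notation below is the standard formulation rather than an excerpt from our paper:

\[ \Theta(x, x') = \big\langle \nabla_\theta f(x; \theta), \, \nabla_\theta f(x'; \theta) \big\rangle. \]

In the infinite-width limit with appropriate scaling, \(\Theta\) remains essentially constant during training, so gradient descent on the squared loss behaves like kernel regression with \(\Theta\); at random initialization the network output converges to a Gaussian process, which is the correspondence the abstract refers to.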

On the Concentration of Measure in Orlicz spaces of exponential type, April 2020

The study of Orlicz spaces, first described in 1931 by Wladyslaw Orlicz and Zygmunt Wilhelm Birnbaum, became popular in the empirical processes and functional analysis literature, owing respectively to the rising interest in chaining arguments for deriving probabilistic bounds on stochastic processes, and to generalizations of Lp spaces as well as Sobolev space embeddings. Orlicz spaces exhibit strong concentration phenomena inherited from their construction; in particular, they are associated with the sub-exponential and sub-Gaussian classes of random variables. In this article, we aim to provide a brief introduction to concentration of measure in Orlicz spaces, in particular Orlicz spaces of exponential type. We begin with the construction of these spaces, then delve into certain concentration guarantees and applications.
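
As a pointer to what "exponential type" means here, these spaces are built from the Orlicz norm below; this is the standard textbook definition, not a quote from the article:

\[ \psi_\alpha(x) = e^{x^\alpha} - 1, \qquad \|X\|_{\psi_\alpha} = \inf\big\{ t > 0 : \mathbb{E}\,\psi_\alpha(|X|/t) \le 1 \big\}. \]

Finiteness of \(\|X\|_{\psi_2}\) characterizes sub-Gaussian random variables and finiteness of \(\|X\|_{\psi_1}\) the sub-exponential ones, with tail bounds of the form \(\mathbb{P}(|X| \ge u) \le 2 \exp\!\big(-(u/\|X\|_{\psi_\alpha})^\alpha\big)\).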

On the Analysis of Stochastic Gradient Descent in Neural Networks via Gradient Flows, May 2020

Research in neural network theory is steadily gaining traction, as there is growing interest in a thorough understanding of the mechanisms through which these models achieve strong performance in decision problems. Several methods have been proposed to quantitatively assess the optimization of the neural network's high-dimensional non-convex objective, employing tools such as kernel methods, global optimization, optimal transport, and functional analysis. In this work, we focus on Mei et al.'s mean field analysis of the risk of two-layer neural networks, which associates stochastic gradient descent's (SGD) training dynamics with a partial differential equation (PDE) in the space of probability measures equipped with the topology of weak convergence. Precisely, we dissect the proof of the convergence of SGD's dynamics to the solution of the PDE, showcase several results on the analysis of the latter and their implications for the training of neural networks via SGD, and discuss related work as well as potential further explorations stemming from various fields.
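
For orientation, the PDE alluded to describes the evolution of the distribution of the hidden units; written in a simplified form that omits the step-size scaling appearing in Mei et al.'s statement, it reads

\[ \partial_t \rho_t = \nabla_\theta \cdot \big( \rho_t \, \nabla_\theta \Psi(\theta; \rho_t) \big), \qquad \Psi(\theta; \rho) = V(\theta) + \int U(\theta, \theta') \, \rho(\mathrm{d}\theta'), \]

where \(V\) and \(U\) are data-dependent potentials and \(\rho_t\) is the limit of the empirical distribution of the neurons' parameters after rescaling time; exact constants and normalizations vary across statements of the result.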

Fader Networks: A Heuristic Approach, April 2019

In recent years, approaches to approximating complex data distributions have centered on the generative adversarial network (GAN) paradigm, eliminating the need for Markov chains in generative stochastic networks or for approximate inference in Boltzmann machines. GANs have been applied to image and video editing in a supervised setting, where external information about the data allows the re-generation of real images with deterministic, complex modifications, using Invertible Conditional GANs (IcGANs). Fader Networks extend this idea by learning a post-encoding latent space invariant to labeled features of the image, and re-generating the original image by providing the decoder with alternate attributes of choice. In this paper, we explore the impact of modifications to the encoding and decoding convolutional blocks, analyze the effects of dropout in the discriminator and of different loss functions on the quality of the generated images using appropriate metrics, and extend the model with skip connections. We finish by providing an empirical assessment of how Fader Networks develop a pseudo-understanding of higher-level image features.
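
To make the "latent space invariant to labeled features" idea concrete, the training objective of a Fader Network can be sketched as follows; the precise weighting schedule differs in Lample et al.'s paper, so treat this as an illustration rather than their exact formulation:

\[ \mathcal{L}_{\mathrm{dis}} = -\,\mathbb{E}_{(x,y)}\big[\log P_{\theta_{\mathrm{dis}}}(y \mid E(x))\big], \qquad \mathcal{L}_{\mathrm{enc,dec}} = \mathbb{E}_{(x,y)}\Big[ \|D(E(x), y) - x\|_2^2 - \lambda \log P_{\theta_{\mathrm{dis}}}(1 - y \mid E(x)) \Big], \]

where \(E\) is the encoder, \(D\) the decoder conditioned on the attribute vector \(y\), and the adversarial term pushes \(E(x)\) to carry no information about \(y\), so that swapping \(y\) at decoding time yields the attribute-edited images mentioned above.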
