Deep-learning neural networks can be programmed, or structured, by a human to perform one or more complex tasks. The key requirements are the ability to (1) design the network topology and (2) lock weights in the ANN (Artificial Neural Network) during training. A powerful example of structured deep learning comes from the 1993 paper by Farber et al., “Identification of Continuous-Time Dynamical Systems: Neural Network Based Algorithms and Parallel Implementation,” which implemented a fourth-order Runge-Kutta numerical integrator, discussed how to handle stiff sets of equations, performed identification of continuous-time systems, and trained “netlets” to model a set of ODEs (Ordinary Differential Equations). The paper notes that both implicit and explicit integrators can be used. Succinctly, repeated iterations of a feed-forward neural network are used to train the implicit integrator, while a recurrent neural network is used during training of the explicit integrator. The paper also discusses the algorithms used and their implementation on parallel machines (SIMD and MIMD architectures). Once trained, these task-level neural networks can be incorporated into other deep-learning systems to train other neural network subsystems, as well as be integrated into conventional computational applications.
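To make the idea of a structured network concrete, here is a minimal sketch (in Python/NumPy, not the paper's actual implementation) of an explicit fourth-order Runge-Kutta step built around a small "netlet" that approximates the right-hand side of an ODE. The RK4 topology is designed by hand; only the netlet weights are trainable, and they can be locked once training is complete. The names Netlet and rk4_step, the layer sizes, and the use of NumPy are illustrative assumptions.

import numpy as np

# Minimal "netlet": a single-hidden-layer network that approximates the
# right-hand side f(y) of an autonomous ODE dy/dt = f(y).
# Sizes and names are illustrative, not from the 1993 paper.
class Netlet:
    def __init__(self, n_state, n_hidden, rng=np.random.default_rng(0)):
        self.W1 = rng.normal(scale=0.1, size=(n_hidden, n_state))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(scale=0.1, size=(n_state, n_hidden))
        self.b2 = np.zeros(n_state)

    def __call__(self, y):
        # tanh hidden layer, linear output: approximates f(y)
        return self.W2 @ np.tanh(self.W1 @ y + self.b1) + self.b2

def rk4_step(f, y, h):
    """One explicit fourth-order Runge-Kutta step structured around the netlet f.
    The RK4 structure is fixed by the designer; only f's weights are trainable."""
    k1 = f(y)
    k2 = f(y + 0.5 * h * k1)
    k3 = f(y + 0.5 * h * k2)
    k4 = f(y + h * k3)
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Example: advance a 2-state system one step of size h = 0.01
net = Netlet(n_state=2, n_hidden=8)
y_next = rk4_step(net, np.array([1.0, 0.0]), h=0.01)

Because the RK4 scaffolding is hand-built, training only has to fit the netlet to the vector field, and the same frozen netlet can later be dropped into a conventional integrator or a larger deep-learning system.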
Beyond the ability to integrate and model stiff sets of equations, follow-on work by Ramiro Rico-Martínez, Raymond A. Adomaitis, and Ioannis G. Kevrekidis investigated the noninvertibility of this approach in the 2000 paper “Noninvertibility in neural networks,” whose abstract reads:
We present and discuss an inherent shortcoming of neural networks used as discrete-time models in system identification, time series processing, and prediction. Trajectories of nonlinear ordinary differential equations (ODEs) can, under reasonable assumptions, be integrated uniquely backward in time. Discrete-time neural network mappings derived from time series, on the other hand, can give rise to multiple trajectories when followed backward in time: they are in principle noninvertible. This fundamental difference can lead to model predictions that are not only slightly quantitatively different, but qualitatively inconsistent with continuous time series. We discuss how noninvertibility arises, present key analytical concepts and some of its phenomenology. Using two illustrative examples (one experimental and one computational), we demonstrate when noninvertibility becomes an important factor in the validity of artificial neural network (ANN) predictions, and show some of the overall complexity of the predicted pathological dynamical behavior. These concepts can be used to probe the validity of ANN time series models, as well as provide guidelines for the acquisition of additional training data.
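To make the abstract's point concrete, the following toy sketch (my own illustration, not taken from the paper) shows how a quadratic discrete-time map, similar in spirit to a map an ANN might learn from sampled time-series data, yields two distinct preimages when followed backward in time, even though the underlying continuous-time ODE trajectory would be uniquely reversible. The map, parameter a, and function names are assumptions for illustration.

import numpy as np

# Illustrative example (not from the paper): a quadratic discrete-time map
# standing in for a trained discrete-time ANN model. It is noninvertible
# because a state can have multiple preimages when iterated backward in time.
def forward_map(x, a=3.7):
    return a * x * (1.0 - x)

def preimages(y, a=3.7):
    # Solve a*x*(1 - x) = y  =>  a*x**2 - a*x + y = 0; two real roots when disc > 0
    disc = a * a - 4.0 * a * y
    if disc < 0:
        return []                      # no real preimage at all
    r = np.sqrt(disc)
    return [(a - r) / (2.0 * a), (a + r) / (2.0 * a)]

x0 = 0.3
x1 = forward_map(x0)
print(preimages(x1))   # two distinct preimages (0.3 and 0.7); only one is the true past state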
These techniques have found numerous applications across a variety of fields, including vision research, mathematical analysis, control, and chemical engineering, to name a few.
Please see my publications list for other applications: http://techenablement.com/rob-farber.
The techniques discussed in “Identification of Continuous-Time Dynamical Systems: Neural Network Based Algorithms and Parallel Implementation” can readily be applied to the farbopt teaching code, which achieves a TF/s per GPU or Intel Xeon Phi and over 13 PF/s on the ORNL Titan supercomputer due to the near-linear scaling of the Farber parallel mapping.
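As a rough illustration of why that scaling is near-linear, the sketch below (a hedged approximation, not the farbopt source) shows the map-reduce pattern behind the Farber parallel mapping: each training example's error term is evaluated independently, and the partial results are reduced to a single objective value for the optimizer. The toy linear model, the function names, and the use of Python multiprocessing are assumptions made for clarity.

import numpy as np
from multiprocessing import Pool

# Hedged sketch of the map-reduce objective evaluation: each example's error
# is independent (the "map"); a single sum (the "reduce") yields the scalar
# objective handed to the optimizer. Not the farbopt implementation.
def model(params, x):
    # stand-in for the structured ANN; a plain linear map for brevity
    return params @ x

def example_error(args):
    params, x, target = args
    return float(np.sum((model(params, x) - target) ** 2))

def objective(params, inputs, targets, pool):
    work = [(params, x, t) for x, t in zip(inputs, targets)]
    partial = pool.map(example_error, work)   # map: embarrassingly parallel
    return sum(partial)                       # reduce: one number for the optimizer

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = rng.normal(size=(2, 4))
    inputs = [rng.normal(size=4) for _ in range(1000)]
    targets = [rng.normal(size=2) for _ in range(1000)]
    with Pool(4) as pool:
        print(objective(params, inputs, targets, pool))

Because the per-example work dominates and the reduction is a single sum, adding more workers (or GPUs and nodes, in the real code) keeps the scaling close to linear.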
Of course, TechEnablement has utilized many of the techniques discussed in our articles, such as structured deep learning, in consulting across a variety of fields, from manufacturing optimization to color matching and small-molecule drug design.