- Unsupervised learning review
- Neural networks
- Application
| Session | Date | Content |
|---|---|---|
| Day 0 | 06/16/2023 (2:30-3:30 PM) | Introduction, Setting up your Python Notebook |
| Day 1 | 06/19/2023 (2:30-3:30 PM) | Python Data Types |
| Day 2 | 06/20/2023 (2:30-3:30 PM) | Python Functions and Classes |
| Day 3 | 06/21/2023 (2:30-3:30 PM) | Scientific Computing with Numpy and Scipy |
| Day 4 | 06/22/2023 (2:30-3:30 PM) | Data Manipulation and Visualization |
| Day 5 | 06/23/2023 (2:30-3:30 PM) | Materials Science Packages |
| Day 6 | 06/26/2023 (2:30-3:30 PM) | Introduction to ML, Supervised Learning |
| Day 7 | 06/27/2023 (2:30-3:30 PM) | Regression Models |
| Day 8 | 06/28/2023 (2:30-3:30 PM) | Unsupervised Learning |
| Day 9 | 06/29/2023 (2:30-3:30 PM) | Neural Networks |
| Day 10 | 06/30/2023 (2:30-3:30 PM) | Advanced Applications in Materials Science |
Example: The "Swiss Roll" manifold
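For reference, the Swiss Roll can be generated with scikit-learn's `make_swiss_roll`. The sketch below is illustrative only; the sample size, noise level, and plotting choices are assumptions, not taken from the lecture materials.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_swiss_roll

# Sample 1000 points from the "Swiss Roll" manifold: 3D coordinates X and the
# position t of each point along the roll (used here only for coloring).
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=t, s=5)
ax.set_title('The "Swiss Roll" manifold')
plt.show()
```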
The eigenvectors of the correlation matrix are called principal components.
The associated eigenvalues describe the proportion of the data variance in the direction of each principal component.
We can project our (normalized) data onto the first $n$ principal components to reduce the dimensionality of the data, while still keeping most of the variance:
$$\mathbf{z} \mapsto \mathbf{u} = \begin{bmatrix} \mathbf{z}^T\mathbf{p}^{(1)} \\ \mathbf{z}^T\mathbf{p}^{(2)} \\ \vdots \\ \mathbf{z}^T\mathbf{p}^{(n)} \end{bmatrix}$$
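A minimal NumPy sketch of this procedure (the toy data and variable names are illustrative, not taken from the course notebooks): standardize the data, diagonalize its correlation matrix, and project onto the leading eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # toy data: 200 samples, 5 features

# Normalize each feature to zero mean and unit variance (z-scores).
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Correlation matrix of the data and its eigendecomposition.
C = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)     # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # re-order by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project onto the first n principal components: u_i = z . p^(i).
n = 2
P = eigvecs[:, :n]                       # columns are p^(1), ..., p^(n)
U = Z @ P                                # reduced data, shape (200, n)

print("fraction of variance kept:", eigvals[:n].sum() / eigvals.sum())
```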
Neural Networks
Introduction to Neural Networks
- They are flexible non-linear models capable of solving many difficult supervised learning problems.
- They often work best on large, complex datasets.
- This predictive power comes at the cost of model interpretability.
The Perceptron neuron model applies the $\text{sign}$ function to a weighted sum of the inputs:
$$f(\mathbf{x}) = \text{sign}\left(w_0 + \sum_{i=1}^{D} w_i x_i\right)$$
($\text{sign}(x) = \pm 1$, depending on the sign of $x$.)
$f(\mathbf{x}) = +1$ only if the weighted sum of the inputs $x_i$ exceeds a given threshold (i.e., $-w_0$).
This is similar to the electrical response of a neuron to external stimuli.
The Perceptron neuron model has some disadvantages:
- The function $\text{sign}(x)$ is not continuous, and its derivative is zero everywhere it is defined.
- This makes the model difficult to fit to data, since gradient-based fitting requires a continuous, differentiable function.
Instead of the $\text{sign}(x)$ function, we apply a continuous, non-linear function $\sigma(x)$ to the weighted sum:
$$f(\mathbf{x}) = \sigma\left(w_0 + \sum_{i=1}^{D} w_i x_i\right)$$
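To make the contrast concrete, here is a small NumPy sketch of a single neuron evaluated with both the $\text{sign}$ activation and a sigmoid $\sigma(x) = 1/(1+e^{-x})$; the weights and inputs are made-up values for illustration.

```python
import numpy as np

def sigmoid(x):
    """Continuous, differentiable activation: sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def neuron(x, w, activation):
    """Apply `activation` to the weighted sum w_0 + w_1*x_1 + ... + w_D*x_D."""
    return activation(w[0] + w[1:] @ x)

x = np.array([0.5, -1.2, 2.0])           # example inputs x_1, ..., x_D
w = np.array([-0.1, 0.8, 0.3, -0.4])     # w_0 (bias) followed by w_1, ..., w_D

print(neuron(x, w, np.sign))             # Perceptron-style output: -1 or +1
print(neuron(x, w, sigmoid))             # smooth output in (0, 1)
```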
We can combine multiple independent neurons into a layer of neurons.
The layer computes a vector $\mathbf{a} = f(\mathbf{x})$ of outputs, one from each neuron.
Consider a layer of $m$ neurons, each with $D+1$ weights.
We can organize the layer's weights into an $m \times (D+1)$ matrix $\mathbf{W}$, with the weights $(w_0, w_1, \dots, w_D)$ of the $j$-th neuron in row $j$; the whole layer's output is then $\mathbf{a} = \sigma(\mathbf{W}\tilde{\mathbf{x}})$, where $\tilde{\mathbf{x}} = (1, x_1, \dots, x_D)^T$, as in the sketch below.
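A sketch of the layer's forward pass in NumPy, assuming the row-per-neuron layout of $\mathbf{W}$ described above; the random weights and example input are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_forward(x, W, activation=sigmoid):
    """Compute a = activation(W @ [1, x]) for a layer of m neurons.

    W has shape (m, D + 1): column 0 holds the biases w_0, the remaining
    columns hold each neuron's weights w_1, ..., w_D.
    """
    x_tilde = np.concatenate(([1.0], x))  # prepend a 1 so the bias is included
    return activation(W @ x_tilde)

rng = np.random.default_rng(0)
D, m = 3, 4                               # 3 inputs, 4 neurons in the layer
W = rng.normal(size=(m, D + 1))           # random weights, for illustration only
x = np.array([0.5, -1.2, 2.0])

a = layer_forward(x, W)                   # vector of m outputs
print(a)
```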
Each step of training moves the weights a small distance against the gradient of the error (gradient descent):
$$\mathbf{w} \leftarrow \mathbf{w} - \eta \, \nabla_{\mathbf{w}} \mathcal{E}(f)$$
$\eta$ is a constant called the learning rate.
The numerical procedure used to compute $\nabla_{\mathbf{w}} \mathcal{E}(f)$ for layered neural networks is called backpropagation.
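As a minimal illustration of the update rule, here is gradient descent for a single sigmoid neuron with a squared-error loss, so the gradient can be written in one line rather than by full backpropagation; the toy data and learning-rate value are made up.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                  # toy inputs
y = (X[:, 0] + X[:, 1] > 0).astype(float)      # toy binary targets (0 or 1)

X_tilde = np.hstack([np.ones((100, 1)), X])    # prepend 1s so w_0 acts as the bias
w = np.zeros(3)                                # weights [w_0, w_1, w_2]
eta = 0.5                                      # learning rate

for step in range(200):
    f = sigmoid(X_tilde @ w)                   # current predictions
    # Gradient of the mean squared error E = mean((f - y)^2) with respect to w:
    grad = 2 * X_tilde.T @ ((f - y) * f * (1 - f)) / len(y)
    w = w - eta * grad                         # gradient-descent update

print("final weights:", w)
print("training accuracy:", np.mean((sigmoid(X_tilde @ w) > 0.5) == y))
```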