{ "cells": [ { "cell_type": "markdown", "id": "b0cdbb34", "metadata": {}, "source": [ "# Adaptive Stepsize\n", "\n", "The field is updated along a search direction $d_n$ with step size $\\eta$:\n", "\n", "$E_n'-E_n = \\eta\\cdot d_n$ \n", "\n", "For gradient-descent-like methods: $d=-1\\cdot\\nabla L$  \n", "For (quasi)-Newton-like methods: $d=-1\\cdot (H_L)^{-1}\\nabla L \\quad\\Rightarrow\\quad d^\\dagger H d = -1\\cdot d^\\dagger\\nabla L$" ] }, { "cell_type": "markdown", "id": "08ea9178", "metadata": {}, "source": [ "## Based on the Taylor Series" ] }, { "cell_type": "markdown", "id": "dc8a99d9", "metadata": {}, "source": [ "### 1st Order\n", "\n", "$L'=L + 2\\cdot \\Re\\left\\{\\sum_n (E_n'-E_n)^\\dagger \\nabla_nL\\right\\}$ \n", "\n", "Inserting $E_n'-E_n=\\eta\\cdot d_n$ and solving for the step size that reaches a target value $L'$:\n", "\n", "$\\eta = \\frac{L'-L}{2\\cdot \\Re\\{ d^\\dagger\\cdot \\nabla L \\}}$ \n", "\n", "In Copra: $L'=-L \\quad\\rightarrow\\quad$ $\\eta = \\frac{-L}{\\Re\\{ d^\\dagger\\cdot \\nabla L \\}}$ \n", "\n", "\n", "\n", "### 2nd Order\n", "\n", "$L'=L + 2\\cdot \\Re\\left\\{\\sum_n (E_n'-E_n)^\\dagger \\nabla_nL\\right\\} + \\Re\\left\\{\\sum_{np} (E_n'-E_n)^\\dagger H_{np}(E_p'-E_p)\\right\\}$ \n", "\n", "Inserting $E_n'-E_n=\\eta\\cdot d_n$ and the Newton relation $d^\\dagger H d = -1\\cdot d^\\dagger\\nabla L$ gives $L'-L = (2\\eta-\\eta^2)\\cdot\\Re\\{ d^\\dagger\\cdot \\nabla L \\}$, so\n", "\n", "$\\eta = 1 \\mp\\sqrt{1-\\frac{L'-L}{\\Re\\{ d^\\dagger\\cdot \\nabla L \\}}}$ \n", "\n", "(The square root is evaluated as a signed square root for numerical stability; this applies to every square root in this notebook.)" ] }, { "cell_type": "markdown", "id": "40c9181e", "metadata": {}, "source": [ "## Based on the Padé Approximation" ] }, { "cell_type": "markdown", "id": "0486d978", "metadata": {}, "source": [ "$R_{nm}(x)=\\frac{P(x)}{Q(x)} = \\frac{\\sum\\limits_{i=0}^{n} a_i x^i}{1+\\sum\\limits_{i=1}^{m} b_i x^i}$ \n", "\n", "If the Taylor series is: $\\quad T(x)=f_0 + f_1 x +f_2x^2 + \\mathcal{O}(x^3)$ \n", "\n", "
\n", "\n", "$R_{01}(x)=\\frac{a_0}{1+b_1x} = \\frac{f_0}{1-\\frac{f_1}{f_0}x}$ \n", "\n", "$R_{11}(x)=\\frac{a_0+a_1x}{1+b_1x} = \\frac{f_0 + \\left(f_1-\\frac{f_2}{f_1}f_0\\right)x}{1-\\frac{f_2}{f_1}x}$ \n", "\n", "$R_{02}(x)=\\frac{a_0}{1+b_1x+b_2x^2} = \\frac{f_0}{1-\\frac{f_1}{f_0}x + \\left(\\left(\\frac{f_1}{f_0}\\right)^2-\\frac{f_2}{f_0}\\right)x^2}$ \n", "\n" ] }, { "cell_type": "markdown", "id": "eedf4971", "metadata": {}, "source": [ "Setting $R_{nm}(\\eta)=L'$ with $f_0=L$, $f_1=2\\cdot\\Re\\{ d^\\dagger\\cdot \\nabla L \\}$ and, for a Newton-like direction, $f_2=-\\Re\\{ d^\\dagger\\cdot \\nabla L \\}$, then solving for $\\eta$:\n", "\n", "$\\eta_{01}=\\frac{L}{L'}\\cdot\\frac{L'-L}{2\\cdot\\Re\\{ d^\\dagger\\cdot \\nabla L \\}}$ \n", "\n", "$\\eta_{11}=\\frac{2(L'-L)}{4\\cdot\\Re\\{ d^\\dagger\\cdot \\nabla L \\} - (L'-L)}$ \n", "\n", "\n", "$\\eta_{02}=\\frac{L}{4\\cdot\\Re\\{ d^\\dagger\\cdot \\nabla L \\}+L}\\cdot\\left(1\\pm\\sqrt{1 - 4\\cdot\\left(1 + \\frac{L}{4\\cdot\\Re\\{ d^\\dagger\\cdot \\nabla L \\}}\\right)\\cdot\\frac{L'-L}{L'}}\\right)$" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 5 }