1. Proximal Dogleg Opportunistic Majorization for Nonconvex and Nonsmooth Optimization (arXiv)
Author : Yiming Zhou, Wei Dai
Abstract : We consider minimizing a function consisting of a quadratic term and a proximable term which is possibly nonconvex and nonsmooth. This problem is also known as the scaled proximal operator. Despite its simple form, existing methods suffer from slow convergence, high implementation complexity, or both. To overcome these limitations, we develop a fast and user-friendly second-order proximal algorithm. The key innovation involves building and solving a series of opportunistically majorized problems along a hybrid Newton direction. The approach directly uses the precise Hessian of the quadratic term and calculates its inverse only once, eliminating the iterative numerical approximation of the Hessian that is common practice in quasi-Newton methods. The algorithm's convergence to a critical point is established, and a local convergence rate is derived based on the Kurdyka-Łojasiewicz property of the objective function. Numerical comparisons are conducted on well-known optimization problems. The results demonstrate that the proposed algorithm not only achieves faster convergence but also tends to converge to a better local optimum compared to benchmark algorithms.
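The problem class here is min_x ½xᵀQx + cᵀx + h(x) with h proximable. As a point of reference, the sketch below shows the plain proximal-majorization baseline that such second-order methods aim to beat: the quadratic is upper-bounded using L ≥ λmax(Q), so each majorized subproblem collapses to a single prox step. This is a minimal illustration, not the authors' dogleg algorithm; the function names (`proximal_majorization`, `prox_l1`) and all parameter choices are assumptions for the example.

```python
import numpy as np

def prox_l1(v, t):
    """Soft-thresholding: the prox of t * ||.||_1, one standard proximable term."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_majorization(Q, c, prox_h, x0, n_iter=500):
    """Minimize 0.5 x^T Q x + c^T x + h(x) by majorizing the quadratic with a
    curvature bound L >= lambda_max(Q); each step is one gradient-prox update.
    Plain majorization baseline only -- not the paper's hybrid Newton/dogleg scheme."""
    L = np.linalg.eigvalsh(Q).max()          # curvature bound, computed once
    x = x0.copy()
    for _ in range(n_iter):
        grad = Q @ x + c                     # exact gradient of the quadratic term
        x = prox_h(x - grad / L, 1.0 / L)    # majorized subproblem has a closed form
    return x

# Usage: sparsity-regularized quadratic, min 0.5 x^T Q x + c^T x + lam * ||x||_1.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
Q, c, lam = A.T @ A, rng.standard_normal(10), 0.1
x_hat = proximal_majorization(Q, c, lambda v, t: prox_l1(v, lam * t), np.zeros(10))
```

The contrast with the paper: the baseline above uses only the scalar bound L, whereas the proposed method exploits the exact Hessian Q (inverted once) along a hybrid Newton direction, which is where the speedup comes from.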
2. Zeroth-order Gradient and Quasi-Newton Methods for Nonsmooth Nonconvex Stochastic Optimization (arXiv)
Author : Luke Marrinan, Uday V. Shanbhag, Farzad Yousefian
Abstract : We consider the minimization of a Lipschitz continuous and expectation-valued function defined as f(x) ≜ E[f̃(x,ξ)] over a closed and convex set. Our focus lies on obtaining both asymptotics as well as rate and complexity guarantees for computing an approximate stationary point (in a Clarke sense) via zeroth-order schemes. We adopt a smoothing-based approach reliant on minimizing fη, where fη(x) = Eu[f(x+ηu)], u is a random variable defined on a unit sphere, and η > 0. It has been observed that a stationary point of the η-smoothed problem is a 2η-stationary point for the original problem in the Clarke sense. In such a setting, we develop two sets of schemes with promising empirical behavior. (I) We develop a smoothing-enabled variance-reduced zeroth-order gradient framework (VRG-ZO) and make two sets of contributions for the sequence generated by the proposed zeroth-order gradient scheme. (a) The residual function of the smoothed problem tends to zero almost surely along the generated sequence, allowing for guarantees for η-Clarke stationary solutions of the original problem; (b) computing an x that ensures that the expected norm of the residual of the η-smoothed problem is within ε requires no more than O(η⁻¹ε⁻²) projection steps and O(η⁻²ε⁻⁴) function evaluations. (II) Our second scheme is a zeroth-order stochastic quasi-Newton scheme (VRSQN-ZO) reliant on a combination of randomized and Moreau smoothing; the corresponding iteration and sample complexities for this scheme are O(η⁻⁵ε⁻²) and O(η⁻⁷ε⁻⁴), respectively.
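For intuition, the spherical-smoothing gradient estimator underlying such zeroth-order schemes can be sketched as follows. This is the textbook single-loop version with projection onto the convex set, not VRG-ZO itself: the paper's variance reduction and step-size policies are omitted, and the names (`zo_gradient`, `projected_zo_descent`) and all parameter values are illustrative assumptions.

```python
import numpy as np

def zo_gradient(f, x, eta, n_samples=20, rng=None):
    """Zeroth-order estimate of the gradient of f_eta(x) = E_u[f(x + eta*u)]:
    g = (d/eta) * mean[(f(x + eta*u) - f(x)) * u] with u uniform on the unit
    sphere (subtracting f(x) is a standard variance-reducing control variate)."""
    rng = rng or np.random.default_rng()
    d = x.size
    g = np.zeros(d)
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)               # uniform direction on the unit sphere
        g += (f(x + eta * u) - f(x)) / eta * u
    return (d / n_samples) * g

def projected_zo_descent(f, project, x0, eta=0.1, step=0.01, n_iter=300):
    """Projected zeroth-order descent on the eta-smoothed problem over a closed
    convex set; `project` is the Euclidean projection onto that set."""
    x = x0.copy()
    for _ in range(n_iter):
        x = project(x - step * zo_gradient(f, x, eta))
    return x

# Usage: minimize a nonsmooth Lipschitz f over the unit ball (projection = rescale).
f = lambda x: np.abs(x).max()
proj_ball = lambda x: x / max(1.0, np.linalg.norm(x))
x_hat = projected_zo_descent(f, proj_ball, np.ones(5))
```

Per the abstract, iterates of such schemes should be read as approximate stationary points of the η-smoothed problem, which in turn are 2η-stationary for the original problem in the Clarke sense.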