1. PAC-tuning: Fine-tuning Pretrained Language Models with PAC-driven Perturbed Gradient Descent
Authors: Guangliang Liu, Zhiyu Xue, Xitong Zhang, Kristen Marie Johnson, Rongrong Wang
Summary: Fine-tuning pretrained language models (PLMs) for downstream tasks is a large-scale optimization problem, in which the choice of training algorithm critically determines how well the trained model generalizes to unseen test data, especially in the context of few-shot learning. To achieve good generalization performance and avoid overfitting, techniques such as data augmentation and pruning are often applied. However, adding these regularizations necessitates heavy tuning of the optimization algorithm's hyperparameters, such as those of the popular Adam optimizer. In this paper, we propose a two-stage fine-tuning method, PAC-tuning, to address this optimization challenge. First, based on PAC-Bayes training, PAC-tuning directly minimizes the PAC-Bayes generalization bound to learn a proper parameter distribution. Second, PAC-tuning modifies the gradient by injecting noise with the variance learned in the first stage into the model parameters during training, resulting in a variant of perturbed gradient descent (PGD). In the past, the few-shot scenario posed difficulties for PAC-Bayes training because the PAC-Bayes bound, when applied to large models with limited training data, might not be tight. Our experimental results across 5 GLUE benchmark tasks demonstrate that PAC-tuning successfully handles the challenges of fine-tuning and outperforms strong baseline methods by a visible margin, further confirming the potential to apply PAC-Bayes training to other settings where the Adam optimizer is currently used for training.
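The two stages described in the summary can be illustrated with a minimal, hedged sketch on a toy linear classifier (not the authors' implementation; the simplified complexity term, variable names, and hyperparameters are assumptions): stage 1 learns a per-parameter noise variance by minimizing a noisy loss plus a PAC-Bayes-style complexity term, and stage 2 fixes that variance and keeps injecting noise while updating only the weights, i.e., a perturbed-gradient-descent step.

```python
# Illustrative sketch of the two-stage idea, assuming a toy linear classifier.
import torch

torch.manual_seed(0)
X = torch.randn(64, 16)                 # toy few-shot inputs
y = (X[:, 0] > 0).long()                # toy labels

w = torch.zeros(16, 2, requires_grad=True)              # model parameters
logvar = torch.full((16, 2), -4.0, requires_grad=True)  # per-parameter noise log-variance

def noisy_loss(w, logvar):
    # Reparameterization: evaluate the loss under Gaussian parameter perturbation.
    eps = torch.randn_like(w)
    w_tilde = w + eps * torch.exp(0.5 * logvar)
    return torch.nn.functional.cross_entropy(X @ w_tilde, y)

# Stage 1 (sketch): jointly minimize the noisy loss plus a simplified
# KL-based complexity term to learn the posterior variance; the actual
# PAC-Bayes bound in the paper is more involved.
opt1 = torch.optim.Adam([w, logvar], lr=1e-2)
for _ in range(200):
    kl_proxy = 0.5 * (torch.exp(logvar) + w**2 - logvar - 1).sum() / X.shape[0]
    loss = noisy_loss(w, logvar) + kl_proxy
    opt1.zero_grad()
    loss.backward()
    opt1.step()

# Stage 2 (sketch): perturbed gradient descent -- keep the learned variance
# fixed and continue injecting noise while updating only the weights.
opt2 = torch.optim.Adam([w], lr=1e-2)
for _ in range(200):
    loss = noisy_loss(w, logvar.detach())
    opt2.zero_grad()
    loss.backward()
    opt2.step()
```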