Hugging Face released local-gemma, a framework built on top of Transformers and bitsandbytes to run Gemma 2 locally.
It makes it easy to set up a local instance of Gemma 2, with three memory presets that trade off speed and accuracy for memory.
This is achieved simply by combining two techniques for reducing GPU memory consumption (sketched in plain Transformers terms after this list):
- 4-bit quantization with bitsandbytes
- A device map to offload parts of the model to the CPU
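
For reference, here is what these two techniques look like with plain Transformers and bitsandbytes, independent of local-gemma. This is a minimal sketch; the model ID and dtype choice are illustrative:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Technique 1: 4-bit quantization with bitsandbytes (NF4 weights,
# dequantized on the fly at compute time).
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Technique 2: device_map="auto" lets Accelerate fill the GPU first and
# offload the remaining layers to CPU RAM when VRAM runs out.
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it",
    quantization_config=quantization_config,
    device_map="auto",
)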
Moreover, local-gemma also offers different “mode” presets for inference depending on your target task: “chat”, “factual”, or “creative”.
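
These modes are, in effect, generation presets. As a rough illustration of the idea (the values below are hypothetical, not local-gemma’s actual settings):

# Hypothetical per-mode generation settings; local-gemma's real values may differ.
MODE_KWARGS = {
    "chat": {"do_sample": True, "temperature": 0.7, "top_p": 0.9},
    "factual": {"do_sample": False},  # greedy decoding for reproducible answers
    "creative": {"do_sample": True, "temperature": 1.0, "top_p": 0.95},
}
# Applied at generation time, e.g.: model.generate(input_ids, **MODE_KWARGS["chat"])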
There’s a CLI, but you might prefer code for more flexibility (code example published by Hugging Face):
from local_gemma import LocalGemma2ForCausalLM
from transformers import AutoTokenizer
model = …
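
The published snippet is truncated above. Based on the project’s README, the full example plausibly continues along these lines; the preset name, prompt, and generation parameters here are assumptions:

from local_gemma import LocalGemma2ForCausalLM
from transformers import AutoTokenizer

# "memory" preset (assumed name): lower GPU memory use at some cost in speed
model = LocalGemma2ForCausalLM.from_pretrained("google/gemma-2-9b-it", preset="memory")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")

# Build a chat prompt with the tokenizer's chat template
messages = [{"role": "user", "content": "What is a good recipe for pesto?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

generated_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])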