Mastering LLM File Formats with Python | by Boqiang & Henry | Apr, 2024

Large language models, such as GPT-3, BERT, and others, are trained on vast amounts of text data, and the largest contain billions of parameters. To store and distribute these models efficiently, different organizations and frameworks have developed a variety of file formats. In this article, we’ll explore some of the most common file formats used for large language models and provide examples of how to load and run them using Python.

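GGML and GGUF are two file formats designed for efficient storage and loading of large machine learning models. Both come from the open-source ggml and llama.cpp projects started by Georgi Gerganov: GGML takes its name from the author’s initials plus “ML”, and GGUF is its successor.

GGML is a binary format that aims to reduce the memory footprint and loading times of large models, typically by storing quantized weights, which makes it suitable for running on consumer hardware. GGUF, on the other hand, replaces GGML in current versions of llama.cpp. It keeps the tensors and the model metadata (architecture, hyperparameters, tokenizer) in a single file, and it is designed so that new metadata can be added without breaking compatibility with existing models. Both are inference formats, so fine-tuning is normally done in a training framework before the model is converted.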
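
To make the format a little less abstract, here is a minimal sketch that reads the fixed-size header at the start of a GGUF file. It assumes GGUF version 2 or later, where the file opens with the 4-byte magic GGUF, a little-endian uint32 version, and two uint64 counts (tensors and metadata key-value pairs). The name read_gguf_header is our own and is not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Assumed layout (GGUF v2+): magic, uint32 version, uint64 tensor count, uint64 metadata count
HEADER = struct.Struct("<4sIQQ")


def read_gguf_header(path):
    """Return the version and counts stored in the first 24 bytes of a GGUF file."""
    with open(path, "rb") as f:
        raw = f.read(HEADER.size)
    if len(raw) < HEADER.size:
        raise ValueError("file is too short to contain a GGUF header")
    magic, version, tensor_count, metadata_kv_count = HEADER.unpack(raw)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic={magic!r})")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }
```

Calling read_gguf_header('path/to/model.gguf') on a real file should return a small dictionary with the format version and the two counts. A legacy GGML file would be rejected at the magic check.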

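To load and run a GGML-family model in Python, you can use the llama-cpp-python package, which provides Python bindings for llama.cpp. Current releases read GGUF files, so a legacy .ggml file has to be converted to GGUF first:

from llama_cpp import Llama  # pip install llama-cpp-python


class GGMLModel:
    def __init__(self, model_path):
        # Load the model file (GGUF in current llama.cpp releases)
        self.model = Llama(model_path=model_path, verbose=False)

    def run(self, input_text, max_tokens=64):
        # Preprocess input: text -> token ids (tokenize expects bytes)
        input_tokens = self.model.tokenize(input_text.encode("utf-8"))
        # Run inference: generate() yields one token id at a time until we stop it
        output_tokens = []
        for token in self.model.generate(input_tokens):
            if token == self.model.token_eos() or len(output_tokens) >= max_tokens:
                break
            output_tokens.append(token)
        # Postprocess output: token ids -> text (detokenize returns bytes)
        return self.model.detokenize(output_tokens).decode("utf-8", errors="ignore")


# Usage
model = GGMLModel("path/to/model.gguf")
output = model.run("Input text goes here")
print(output)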

The Hugging Face Transformers library provides a unified interface for working with various pre-trained language models. Models in the HF format are typically stored in a directory structure containing multiple files, including the model weights, configuration, and tokenizer data.
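
As a rough illustration of that directory layout, the sketch below groups the files in a local model folder into weights, configuration, and tokenizer data. The file names it looks for (config.json, tokenizer.json, *.safetensors, *.bin, and so on) are common conventions rather than an exhaustive list, and summarize_model_dir is a hypothetical helper, not part of transformers.

```python
from pathlib import Path

# Assumed, non-exhaustive naming conventions for Hugging Face model folders
WEIGHT_SUFFIXES = {".safetensors", ".bin"}
CONFIG_NAMES = {"config.json", "generation_config.json"}
TOKENIZER_NAMES = {
    "tokenizer.json", "tokenizer_config.json", "special_tokens_map.json",
    "vocab.json", "vocab.txt", "merges.txt", "tokenizer.model",
}


def summarize_model_dir(model_dir):
    """Group the files in model_dir by the role they usually play."""
    groups = {"weights": [], "config": [], "tokenizer": [], "other": []}
    for path in sorted(Path(model_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        if path.name in CONFIG_NAMES:
            groups["config"].append(path.name)
        elif path.name in TOKENIZER_NAMES:
            groups["tokenizer"].append(path.name)
        elif path.suffix in WEIGHT_SUFFIXES:
            groups["weights"].append(path.name)
        else:
            groups["other"].append(path.name)
    return groups
```

Running summarize_model_dir on a downloaded model folder gives a quick check that the weights, configuration, and tokenizer files are all present before you try to load the model.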

To load and run an HF model in Python, you can use the transformers library:

from transformers import AutoTokenizer, AutoModelForCausalLM


class HuggingFaceModel:
    def __init__(self, model_path):
        self.tokenizer =…
