Computer Science Notes

CS Notes is a simple blog to keep track of CS-related stuff I consider useful.

08 Apr 2023

Publishing your model on Hugging Face.

by Harpo MAxx (5 min read)

If you have been working with AI-based applications in recent years, you have almost certainly heard about Hugging Face. Hugging Face is a technology company that specializes in natural language processing (NLP) and artificial intelligence (AI). According to their website, their mission is to democratize AI and make these technologies more accessible to the broader community, encouraging researchers and developers to build innovative applications based on NLP.

Last week I was experimenting with the different aspects of the Hugging Face platform. My idea was to publish a model for DGA detection (yep, the same model I've been mentioning in my last couple of posts).

Hugging Face Services

In brief, the Hugging Face platform provides you with four different services.

  1. PUBLISH YOUR MODEL. With the basic service, you can publish your pre-trained model together with information about the training process, such as weights, the training dataset used, etc.
  2. PUBLISH YOUR DATASET. You can publish a dataset and, eventually, link a particular model to the dataset you used for training it.
  3. PUBLISH A MINIMAL UI for your model. These are called Spaces, and they provide you with different SDKs for building a simple UI for querying your model. For building the UI, you can rely on very well-known Python libraries such as Gradio and Streamlit. More recently, they have also included support for dockerized apps, so you can use your favorite language and libraries for building your own UI. A basic computer infrastructure with two CPUs is available for free. For more hardware, you will need to pay! 👊
  4. CREATE A CUSTOM ENDPOINT for your model. Basically, you can create an endpoint for accessing your model from any other source. An endpoint refers to a URL (Uniform Resource Locator) or URI (Uniform Resource Identifier) that a client can use to access a specific service or resource on a server or web application (see the sketch after this list). ⚠️ Notice that this is a paid service. ⚠️
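
As an illustration, once an endpoint is up, querying it from R boils down to an authenticated HTTP POST. Below is a minimal sketch using the httr and jsonlite packages; the endpoint URL is hypothetical, the HF_TOKEN environment variable is assumed to hold your User Access Token, and the exact payload format depends on how the endpoint is configured.

library(httr)
library(jsonlite)

# Hypothetical endpoint URL: copy the real one from the Endpoints dashboard
endpoint_url <- "https://xxxxxxxx.endpoints.huggingface.cloud"

# Authenticated POST request; HF_TOKEN is assumed to hold your token
response <- POST(
  url = endpoint_url,
  add_headers(Authorization = paste("Bearer", Sys.getenv("HF_TOKEN"))),
  content_type_json(),
  body = toJSON(list(inputs = "www.example.com"), auto_unbox = TRUE)
)

# Inspect the parsed response body
content(response)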

Models, Spaces, and datasets are hosted on the Hugging Face Hub as Git repositories, which means that version control and collaboration are core elements of the Hub.

The Model Hub

To manage all these services at Hugging Face, you can access the so-called Model Hub. The Model Hub is where members of the Hugging Face community can host all of their model checkpoints for simple storage, discovery, and sharing. You can download pre-trained models with the huggingface_hub client library, with 🤗 Transformers for fine-tuning and other usages, or with any of the over 15 integrated libraries.
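
For example, a single file can be fetched from a public model repository with the hf_hub_download() function. Here is a minimal sketch using reticulate; the filename is just a placeholder, since the actual files depend on how the model was saved.

library(reticulate)
hfhub <- reticulate::import('huggingface_hub')

# Download one file from a public model repository.
# The returned value is the local path of the cached file.
path <- hfhub$hf_hub_download(repo_id = "harpomaxx/dga-detector",
                              filename = "saved_model.pb")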

The huggingface_hub package is a Python module for interacting with Hugging Face. In many cases, you must be logged in with a Hugging Face account to interact with the Hub: downloading private repos, uploading files, creating PRs, etc. Create an account if you don't already have one, and then sign in to get your User Access Token from your Settings page. The User Access Token is used to authenticate your identity to the Hub.

Once you have set up your token, you can log in and start using the Model Hub. This can be done programmatically via the huggingface_hub package. We are going to use the excellent reticulate package for accessing Python from R.

library(reticulate)

# Import the huggingface_hub Python module and log in.
# login() prompts for your User Access Token; alternatively,
# you can pass it directly: hfhub$login(token = "hf_...")
hfhub <- reticulate::import('huggingface_hub')
hfhub$login()

Now you are logged into the Hugging Face Hub, so the next step is to push the model into the Model Hub. As I mentioned (actually, I copied & pasted from Hugging Face), the Model Hub supports over 15 well-known integrated libraries. Of course, Keras is one of them.

To push a model into the Model Hub, you will need to create a Keras model or (as in my case) load a previously saved one. Then, using the Hugging Face API, you can push the model into the Model Hub with the push_to_hub_keras() function. The function takes two parameters: the model and the name of the remote repository where you want to store it.

library(keras)

# Load the previously trained DGA classifier from an HDF5 file
model <- load_model_hdf5("cacic-pmodel.h5")

# Push the model to the Hugging Face Hub
hfhub$push_to_hub_keras(model, 'dga-detector')

When you call push_to_hub_keras(), the model is pushed into the remote repository ('dga-detector' in this case). This will also generate a model card that includes your model's hyperparameters, a plot of your model, and a couple of sections related to the intended use of your model, model biases, and limitations to consider before putting the model in production. Model cards are files that accompany the models and provide handy information. Under the hood, model cards are simple Markdown files with additional metadata. Here you can check the model card generated for dga-detector. In my case, I didn't include training information, since I lost the TensorBoard logs. The DGA detector is 5 years old, so…
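
To give an idea of the format, here is a minimal sketch of what the Markdown source of a model card could look like. The metadata fields and section names are illustrative, not the exact output of push_to_hub_keras().

---
library_name: keras
tags:
- dga-detection
---

# dga-detector

A Keras model for detecting algorithmically generated domain names (DGA).

## Intended uses & limitations

More information needed.

## Training procedure

More information needed.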

Theoretically, the model is now ready to be used by anybody interested in DGA detectors: just call the from_pretrained_keras() function with the model name.

model <- hfhub$from_pretrained_keras("harpomaxx/dga-detector")
## Do your stuff with the model

However, to actually use the model, you will need to provide some information about the input format and, ideally, the code needed to reproduce the preprocessing. Since harpomaxx/dga-detector is a git repository, you can easily add a couple of files with the required preprocessing pipeline. In this case, the only preprocessing consists of tokenizing the domain name.
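
For instance, a preprocessing script can be pushed to the repository programmatically with huggingface_hub's upload_file() function. A sketch follows; preprocess.R is a hypothetical local file name.

library(reticulate)
hfhub <- reticulate::import('huggingface_hub')

# Upload a local preprocessing script to the model repository
hfhub$upload_file(path_or_fileobj = "preprocess.R",
                  path_in_repo = "preprocess.R",
                  repo_id = "harpomaxx/dga-detector")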

# Code for using the DGA detector model

library(keras)
library(reticulate)

# Load the model from the Hub
hfhub <- reticulate::import('huggingface_hub')
model <- hfhub$from_pretrained_keras("harpomaxx/dga-detector")
modelid <- "cacic-2018-model"

# Build the character-to-token lookup table used during training.
# The leading '$' maps to token 0, matching the default padding value.
valid_characters <- "$abcdefghijklmnopqrstuvwxyz0123456789-_."
valid_characters_vector <- strsplit(valid_characters, split = "")[[1]]
tokens <- 0:(length(valid_characters_vector) - 1)
names(tokens) <- valid_characters_vector

# DGA prediction function
predict_dga <- function(domain) {
  # Tokenize: map each character of the domain to its integer token
  domain_encoded <- sapply(
    unlist(strsplit(tolower(domain), split = "")),
    function(x) tokens[[x]]
  )

  # Pad/truncate to the fixed input length (45) expected by the model
  domain_encoded <- pad_sequences(t(domain_encoded),
                                  maxlen = 45,
                                  padding = 'post',
                                  truncating = 'post')

  prediction <- predict(model, domain_encoded)
  list(modelid = modelid,
       domain = domain,
       class = ifelse(prediction[1] > 0.9, 1, 0),
       probability = prediction[1])
}
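
A quick usage example (the domains below are just illustrative). The function returns a list with the model id, the domain, the predicted class, and the probability.

# Score a legitimate-looking domain and a random-looking one
predict_dga("google.com")
predict_dga("xkqjwzpd1f.info")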

That is the minimal work required to share a model on Hugging Face. You can check the model at Hugging Face here.
However, if you want to provide something that works out of the box, you will need to use Hugging Face Spaces. I will discuss that process in the next post…