An Introduction to Hugging Face and Their Pipelines

In a previous post, linked here, I introduced Hugging Face models without much explanation. So, in this post I’ll introduce the Hugging Face ecosystem and several amazing things you can do ‘right out of the box’ using the ‘pipeline’ API in their Transformers library.

Most of these examples come straight from Hugging Face themselves, either from their website or from their book, Natural Language Processing with Transformers: Building Language Applications with Hugging Face, which is written by the Hugging Face team.

If you want to follow along in my Google Colab notebook, you can find it here.

The Hugging Face Ecosystem

Two squares showcase the Hugging Face Ecosystem. The top square contains the Hugging Face Hub with Models, Datasets, Metrics, and Docs. The bottom square shows Tokenizers, Transformers, Datasets, and Accelerate. There are three arrows pointing between the squares, indicating a natural two-way connection from Tokenizers, Transformers, and Datasets up to the Hugging Face Hub. Accelerate is connected to Transformers with a two-way arrow.

The Hugging Face ecosystem consists of two parts:

  • A group of libraries for use in Python
  • A hub of models and datasets

In the previous post we used a few of these libraries, which are collections of code that simplify working with Transformers and other ML models. The models we used came from the Hub, and our code automatically downloaded them from the Hub.

The Hugging Face website has full documentation for both the libraries and the models. There are also free classes available. On the Hugging Face website each model has a ‘model card’ which gives the details for that model. Here is an example for Llama 2, the model from Meta (formerly Facebook):

A screenshot of the Hugging Face website, looking at a Model card. The appearance of the site is similar to a GitHub page in structure and design and gives details regarding the specific use of Llama 2.

Hugging Face Pipelines: The Easiest Way to Get Started with Artificial Intelligence

Okay, now that we’ve been introduced to Hugging Face properly, what can we do right out of the box with their libraries and models?

The easiest way to get started is with pipelines. There are numerous out-of-the-box pipelines that let you immediately add Machine Learning capabilities to your apps with little effort. In this article I’ll walk through several simple pipeline examples that I’ve collected from Hugging Face.

Let’s start with a few imports:

from transformers import pipeline
import pandas as pd
import torch

Now let’s use PyTorch to determine whether we have a GPU:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

We’ll use this device throughout this demo to try and speed things up if a GPU is available. If you are doing this on Google Colab be sure to ‘Change runtime type’ and set it to T4 GPU. But this code should also work with a regular CPU.
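If you’d like the same check to keep working on a machine where PyTorch isn’t installed at all, here is a minimal sketch. The helper name pick_device is my own invention, not part of any library. It returns a plain string, which the pipeline function’s device argument also accepts.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper still works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device_name = pick_device()
print(device_name)
```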

Let’s start with a sentiment analysis pipeline:

classifier = pipeline("text-classification", device=device)

Now let’s analyze a Dear John letter:

text = ("Dear John, we're breaking up. As much as I enjoyed Paris, I just don't see you and I together. "
"I'm keeping the Star Trek phaser you gave me when we met at Comic Con. "
"Please don't contact me. Love, Beth.")
outputs = classifier(text)
print(outputs)

Here is the result:

[{'label': 'NEGATIVE', 'score': 0.5937128663063049}]
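The output is a list with one dictionary per input text, each holding a label and a score. Here is a small sketch of how you might read it in your own code, using the exact result above. The 0.75 cutoff and the describe helper are my own assumptions for illustration; pick whatever threshold suits your application.

```python
# The exact output shown above: one dict per input text.
outputs = [{'label': 'NEGATIVE', 'score': 0.5937128663063049}]

CONFIDENCE_THRESHOLD = 0.75  # arbitrary cutoff, chosen only for illustration


def describe(prediction, threshold=CONFIDENCE_THRESHOLD):
    """Format a prediction, flagging scores below the threshold."""
    label, score = prediction["label"], prediction["score"]
    if score < threshold:
        return f"{label} (low confidence: {score:.2f})"
    return f"{label} ({score:.2f})"


summary = describe(outputs[0])
print(summary)
```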

So, despite phrases like “Love, Beth” it correctly figured out that a Dear John letter is negative. Now let’s pull out the proper nouns (named entities) in this letter using a named entity pipeline:

named_entities_tagger = pipeline("ner", aggregation_strategy="simple", device=device)
outputs = named_entities_tagger(text)
pd.DataFrame(outputs)

We get a spot-on result:

A chart showcasing the results of the test. There are five categories and five rows, starting from 0 to 4. The categories are entity_group, score, word, start, and end. Under entity_group, the results in order from 0-4 are PER, LOC, MISC, MISC, and PER. Under score, 0.984036, 0.997841, 0.993135, 0.980863, and 0.964161. Under word, John, Paris, Star Trek, Comic Con, and Beth. Under start, 5, 51, 111, 155, and 197. Under end, 9, 56, 120, 164, and 201.

It correctly found every proper noun!
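The start and end columns are character offsets into the letter, so you can slice the original text to get each entity back. Here is a minimal sketch that copies the rows from the table above (scores left out for brevity). It assumes the letter’s three string pieces are joined with a single space between sentences, which is what the offsets in the table imply.

```python
text = ("Dear John, we're breaking up. As much as I enjoyed Paris, I just don't see you and I together. "
        "I'm keeping the Star Trek phaser you gave me when we met at Comic Con. "
        "Please don't contact me. Love, Beth.")

# (entity_group, word, start, end) copied from the table above
entities = [
    ("PER", "John", 5, 9),
    ("LOC", "Paris", 51, 56),
    ("MISC", "Star Trek", 111, 120),
    ("MISC", "Comic Con", 155, 164),
    ("PER", "Beth", 197, 201),
]

# Slicing the text with each (start, end) pair recovers the tagged word.
recovered = [text[start:end] for _, _, start, end in entities]
print(recovered)

# Group by entity type, for example to list every person mentioned.
people = [word for group, word, _, _ in entities if group == "PER"]
print(people)
```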

Suppose we want to be able to ask questions about the letter and have our AI find the answers within the letter:

reader = pipeline("question-answering", device=device)
question = "Who is Beth breaking up with?"
outputs = reader(question=question, context=text)
pd.DataFrame([outputs])

Result:

A chart showcasing the results of the test. There are four categories and one row. The categories are score, start, end, and answer. Under score the result is 0.969272. Under start is 0. Under end is 9. Under answer is Dear John.

It correctly recognized that Beth was breaking up with John, although the answer it returned includes the salutation (“Dear John”). And note how it even tells you where in the letter that answer is found (between characters 0 and 9).
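Because start and end are plain character positions, you can use them to pull the answer out of the context yourself or to highlight it. Here is a small sketch using the row from the result table. The highlight helper is my own, made up for this example.

```python
# Opening of the letter; the answer span falls entirely inside it.
context = "Dear John, we're breaking up. As much as I enjoyed Paris, I just don't see you and I together."

# The row from the result table above.
answer = {"score": 0.969272, "start": 0, "end": 9, "answer": "Dear John"}


def highlight(context, start, end):
    """Wrap the answer span in square brackets so it stands out."""
    return context[:start] + "[" + context[start:end] + "]" + context[end:]


span = context[answer["start"]:answer["end"]]
print(span)
print(highlight(context, answer["start"], answer["end"]))
```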

Now let’s get a summary:

summarizer = pipeline("summarization", device=device)
outputs = summarizer(text, min_length=5, max_length=30, clean_up_tokenization_spaces=True)
print(outputs[0]['summary_text'])

Result:

Dear John, we're breaking up. As much as I enjoyed Paris, I just don't see you and I together. I'm

Not bad.

Now let’s translate the letter to German:

translator = pipeline("translation_en_to_de", model="Helsinki-NLP/opus-mt-en-de", device=device)
outputs = translator(text, clean_up_tokenization_spaces=True, min_length=25)
print(outputs[0]['translation_text'])

Result:

Lieber John, wir trennen uns. So sehr mir Paris gefallen hat, ich sehe dich und mich einfach nicht zusammen. Ich behalte den Star Trek Phaser, den du mir gegeben hast, als wir uns bei Comic Con trafen. Bitte kontaktiere mich nicht. Love, Beth.

I tried translating that back to English using Google Translate and got this result:

Dear John, we are parting ways. As much as I liked Paris, I just don't see you and me together. I'm keeping the Star Trek phaser you gave me when we met at Comic Con. Please do not contact me. Love, Beth.

Almost perfect!

Now let’s write an automatic response back to Beth:

generator = pipeline("text-generation", device=device)
prompt = text + "\n\nShort Response to Beth:\nDear Beth,\nI received your letter breaking up with me."
outputs = generator(prompt, min_length=25, max_length=200)
print(outputs[0]['generated_text'])

I won’t show you my result. It was pretty bad. You probably don’t want to write automatic responses to ex-girlfriends using AI. 😊
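One practical note: by default the generated_text that comes back from a text-generation pipeline starts with your prompt, followed by whatever the model added. If you only want the new part, here is a minimal sketch for trimming it. The generated string below is a made-up placeholder standing in for real model output, and continuation_only is my own helper name.

```python
prompt = "Short Response to Beth:\nDear Beth,\nI received your letter breaking up with me."

# Hypothetical placeholder standing in for outputs[0]['generated_text']
generated_text = prompt + " <model continuation would appear here>"


def continuation_only(full_text: str, prompt: str) -> str:
    """Return only the text the model added after the prompt."""
    if full_text.startswith(prompt):
        return full_text[len(prompt):].lstrip()
    return full_text  # prompt not found at the start; leave the text untouched


reply = continuation_only(generated_text, prompt)
print(reply)
```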

Working with Images

Let’s now play with images. Let’s try image generation. First a few installs:

!pip install diffusers
!pip install accelerate

And now let’s create an image of Superman!

from diffusers import AutoPipelineForText2Image
sd_pipeline = AutoPipelineForText2Image.from_pretrained("runwayml/stable-diffusion-v1-5", low_cpu_mem_usage=False).to(device)
prompt = "Superman to the rescue!"

image = sd_pipeline(prompt, num_inference_steps=25).images[0]
# Display the image
display(image)

Here is my result:

A cartoon image of Superman flying, in the style of a typical comic book. The quality of the image and artwork is decent, although the hands are somewhat incorrect with particularly oddly designed thumbs.

I would note that I had some trouble getting this one to work in Google Colab notebooks. It seemed to work better with a GPU, but it often ran out of memory and errored out. Just skip it if you have a problem.

Now let’s use a pipeline to find and label items in an image:

from IPython.display import display
from PIL import Image, ImageDraw
import requests
from transformers import pipeline

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png"
image_data = requests.get(url, stream=True).raw
original_image = Image.open(image_data)
display(original_image)

Result:

An image of two adult cats lying on a pink towel or blanket, each lying next to a typical TV remote control.

Before we can detect objects, we need one more install:

!pip install timm
import timm

Now let’s find objects and set labels:

# Allocate a pipeline for object detection
object_detector = pipeline('object-detection', device=device)
results = object_detector(original_image)
print(results)
image = original_image.copy()

# Draw bounding boxes and labels on the image
draw = ImageDraw.Draw(image)
for result in results:
    label = result['label']
    score = result['score']
    box = result['box']
    xmin, ymin, xmax, ymax = box['xmin'], box['ymin'], box['xmax'], box['ymax']

    # Draw bounding box
    draw.rectangle([xmin, ymin, xmax, ymax], outline="white", width=3)

    # Draw the label (score is also available if you want to show it)
    draw.text((xmin, ymin), f" {label}", fill="white")

# Display the image
display(image)

Result:

The same image of two adult cats lying on a pink blanket or towel, each lying next to a typical TV remote. There are now identifiers in the image, five boxes: one surrounding the entire image identifying a couch, two surrounding each cat individually and correctly labeling them as cats, and two surrounding the remotes individually and correctly labeling them as remotes.

Pretty cool for something that required so little work, right?
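Each detection is a dictionary with a label, a score, and a box, which is exactly what the drawing loop reads. That makes it easy to post-process the list, for example to keep only confident detections or to measure box sizes. Here is a minimal sketch. The detections below are made-up values in the same shape, not the real output for the cat photo, and the 0.9 cutoff is an arbitrary choice.

```python
# Hypothetical detections in the same shape the drawing loop reads
results = [
    {"label": "cat", "score": 0.99, "box": {"xmin": 10, "ymin": 50, "xmax": 310, "ymax": 450}},
    {"label": "remote", "score": 0.98, "box": {"xmin": 40, "ymin": 70, "xmax": 180, "ymax": 120}},
    {"label": "couch", "score": 0.42, "box": {"xmin": 0, "ymin": 0, "xmax": 640, "ymax": 480}},
]


def box_area(box):
    """Area of a bounding box in square pixels."""
    return (box["xmax"] - box["xmin"]) * (box["ymax"] - box["ymin"])


def confident(detections, min_score=0.9):
    """Keep only detections whose score is at least min_score."""
    return [d for d in detections if d["score"] >= min_score]


kept = confident(results)
labels = [d["label"] for d in kept]
print(labels)
print(box_area(results[0]["box"]))
```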

For this next part, if you’re in a Notebook (such as Google Colab) you’ll need to import this:

from IPython.display import Audio

Text-to-Speech

Now let’s try some text-to-speech:

pipe = pipeline("text-to-speech", model="suno/bark-small", device=device)
text = "Ladybugs have had important roles in culture and religion, being associated with luck, love, fertility and prophecy."
output = pipe(text)

Audio(output["audio"], rate=output["sampling_rate"])

I found the results of this to be mixed. It worked, but sometimes dropped the first or last word. I’ll bet it works better if we don’t use the default model and use something meant specifically to do this. I also sometimes got an error (usually that timm was not imported even though it was) or it would just hang. But it usually worked, so hopefully you’ll have good luck with this one.
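If you want to keep the audio outside of a notebook, one option is to write it to a WAV file. Here is a minimal sketch using only Python’s standard library. The sine wave stands in for output["audio"], which I’m assuming can be flattened into a plain sequence of floats between -1 and 1, and the 24000 sampling rate is a placeholder; in practice you would pass output["sampling_rate"]. The function name floats_to_wav_bytes is my own.

```python
import io
import math
import struct
import wave


def floats_to_wav_bytes(samples, sampling_rate):
    """Encode floats in [-1.0, 1.0] as 16-bit mono PCM WAV bytes."""
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    pcm = struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# Placeholder audio: one second of a 440 Hz tone instead of real model output
sampling_rate = 24000
samples = [0.5 * math.sin(2 * math.pi * 440 * n / sampling_rate) for n in range(sampling_rate)]

wav_bytes = floats_to_wav_bytes(samples, sampling_rate)
print(len(wav_bytes))
```

Writing wav_bytes to disk with open("speech.wav", "wb") gives you a file that any audio player should be able to open.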

Final Thoughts

As this blog post shows, you can do some amazing things right out-of-the-box with Hugging Face pipelines. And there are many more pipelines available that we didn’t cover here. (See a complete list here and documents here.) It is very easy to add Artificial Intelligence and Machine Learning capabilities to your applications.

Be sure to check out the rest of the Mindfire Blog for more blog posts and tutorials on Machine Learning, and also be sure to follow us on LinkedIn to stay in the loop. If you have any questions, be sure to ask them down below!

If you are interested in having AI/ML added to a project, or need help with any software development, Mindfire is here to help and would love to talk to you. Just reach out through our Contact Us page.
