Wikipedia and Google Gemini
- By Bruce Nielson
- ML & AI Specialist
Integrating Wikipedia Into Your Python Applications Using Google Gemini
In my last post, I showed how to use the Gemini API. This post picks up right where we left off. You can follow along with this post in this Google Colab Notebook which combines this and the last post.
As a quick reminder, you should already have an API Key for Google’s Gemini. (See last post for details or go here).
Gemini is Google’s latest and greatest Large Language Model (LLM), and Google has an API to integrate it into your applications for free. Well, up to 60 queries per minute are free, anyhow. After that you’ll have to pay.
There are two main models: Gemini Pro, which is optimized for text, and Gemini Pro Vision, which is multimodal and handles both text and images. Today we’re going to see how to use both models to grab information from Wikipedia.
Logging Into Gemini’s API
After obtaining an API key (which I named “GeminiSecret” in the Google Colab Notebook) we need to log in. First, we must import a few libraries:
import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold
import textwrap
from google.colab import userdata
Then log in as follows. (You may need to replace ‘GeminiSecret’ with whatever you named your secret for your Google API Key in your Notebook.)
GOOGLE_API_KEY=userdata.get('GeminiSecret')
genai.configure(api_key=GOOGLE_API_KEY)
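If you run this code outside of Colab, `google.colab.userdata` won't be available. Here is a minimal sketch of one way to cover both cases, assuming you export the key in an environment variable. The helper name `load_api_key` and the variable name `GOOGLE_API_KEY` are my own choices for illustration, not part of the Gemini library:

```python
import os


def load_api_key(secret_name="GeminiSecret", env_var="GOOGLE_API_KEY"):
    """Return the API key from Colab secrets if possible, else from the environment."""
    try:
        # This import only succeeds inside a Google Colab runtime.
        from google.colab import userdata
        key = userdata.get(secret_name)
        if key:
            return key
    except Exception:
        # Not running in Colab, or the secret is missing or not shared with this notebook.
        pass

    # Fall back to an environment variable (the name is an assumption).
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"No API key found: set the {env_var} environment variable.")
    return key
```

You would then call `genai.configure(api_key=load_api_key())` in place of the two lines above.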
Using Wikipedia In Your Applications
Having access to information on Wikipedia is an amazing resource. It becomes even more amazing when you integrate it into your applications using an LLM like Gemini. Imagine your applications being able to look up any Wikipedia page you wish to obtain information for the user.
The easiest way to use Wikipedia in your Python applications is through the ‘wikipedia’ module in Python. You need to first install it using pip:
!pip install -q wikipedia
Note that the “!” prefix (which tells the Notebook to run the line as a shell command) is only necessary if you are doing this in a Notebook.
Then import the wikipedia module:
import wikipedia
Let’s test that the module works by grabbing the official summary of a page. Since we wrote an AI Dungeon Master using Gemini in the last post, let’s stick with a Dungeons & Dragons-style theme and look up Conan the Barbarian:
observation = wikipedia.summary("Conan the Barbarian", sentences=1000, auto_suggest=False)
observation
Here is what I got back:
Conan the Barbarian (also known as Conan the Cimmerian) is a fictional sword and sorcery hero who originated in pulp magazines and has since been adapted to books, comics, films (including Conan the Barbarian and Conan the Destroyer), television programs (animated and live-action), video games, and role-playing games. Robert E. Howard created the character in 1932 for a series of fantasy stories published in Weird Tales magazine.
The earliest appearance of a Robert E. Howard character named Conan was that of a black-haired barbarian with heroic attributes in the 1931 short story "People of the Dark". By 1932, Howard had fully conceptualized Conan. Before his death, Howard had written 21 stories starring the barbarian. Over the years many other writers have written works featuring Conan.
Many Conan the Barbarian stories feature Conan embarking on heroic adventures filled with common fantasy elements such as princesses and wizards. Howard's mythopoeia has the stories set in the legendary Hyborian Age in the times after the fall of Atlantis. Conan is a Cimmerian, who are descendants of the Atlanteans and ancestors of the modern Gaels. Conan is himself a descendant of Kull of Atlantis (an earlier adventurer of Howard's).
Not bad, though it isn’t really a summary of the true Wikipedia page. It’s more a summary of the character himself. Let’s see if Gemini can help us out with this. But let’s first explore the capabilities of the ‘wikipedia’ module and what it can do.
The Wikipedia Module
Here is how to grab the URL for the page:
wiki_url = wikipedia.page("Conan the Barbarian", auto_suggest=True)
wiki_url.url
Response:
https://en.wikipedia.org/wiki/Conan_the_Barbarian
What if we want to search on “Conan the Barbarian” to get all Conan related pages instead of just going to the main Wikipedia entry for him?
search_results = wikipedia.search("Conan the Barbarian")
search_results
Response:
['Conan the Barbarian',
'Conan the Barbarian (1982 film)',
'Conan the Barbarian (comics)',
'Conan the Barbarian (2011 film)',
'Valeria (Conan the Barbarian)',
'Conan (comics)',
'Conan the Barbarian (disambiguation)',
'Conan the Destroyer',
'Hyborian Age',
'List of games based on Conan the Barbarian']
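Notice that one of the results is a disambiguation page, which is rarely what an application wants to load. A rough sketch of filtering those out before picking a page is below. The helper name `drop_disambiguation` is made up for this example, and matching on the title suffix is only a heuristic (an assumption on my part), since not every disambiguation page is labeled this way:

```python
def drop_disambiguation(titles):
    """Return the titles that do not look like disambiguation pages."""
    return [t for t in titles if not t.lower().endswith("(disambiguation)")]


# A shortened copy of the search results shown above.
search_results = [
    "Conan the Barbarian",
    "Conan the Barbarian (1982 film)",
    "Conan the Barbarian (disambiguation)",
    "Conan the Destroyer",
]

candidates = drop_disambiguation(search_results)
print(candidates)
```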
We could, if we wished, now use the wikipedia.page() function to grab any of these pages. This all makes me wonder, what else can we do with this module? Let’s list out all the attributes and methods that a page object has:
# Create a wikipedia page object
page_object = wikipedia.page("Conan the Barbarian")
# Filter out attributes that start with an underscore (private and dunder names)
filtered_attributes = [attribute for attribute in dir(page_object) if not attribute.startswith('_')]
# Print each attribute on its own line
for attribute in filtered_attributes:
    print(attribute)
I’m intentionally skipping attributes of the page object that start with an underscore because those are typically internal and we aren’t supposed to call them. Here is what I get back:
categories
content
coordinates
html
images
links
original_title
pageid
parent_id
references
revision_id
section
sections
summary
title
url
With a bit of testing, I found ‘coordinates’ doesn’t work right, so I am going to skip over that one. And we’ll skip ‘content’ for now too because we’ll deal with it later. But for the rest, let’s look at what they contain:
for attribute in filtered_attributes:
    if attribute != 'content' and attribute != 'coordinates':
        if hasattr(page_object, attribute):  # Check if the attribute exists
            value = getattr(page_object, attribute)
            print(f"Attribute: {attribute}, Value: {value}")
This will list every single attribute of the page for Conan the Barbarian. Go try this in the Notebook and look at the results. One that catches my eye is the ‘images’ attribute. It is a list of URLs for each image on the Wikipedia page! That’s a perfect opportunity to try out the Gemini Pro Vision model to see how it works, but we’ll come back to that later. For now, let’s finish using Gemini Pro to get a good summary of the page.
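As an aside on the attribute loop above: instead of hard-coding which attributes to skip, you could catch whatever error an attribute raises and keep going. This is only a sketch. The helper `describe_attributes` and the stand-in `FakePage` class are invented for illustration so the example runs without a network connection, and I'm assuming a misbehaving attribute fails by raising an exception when it is read:

```python
def describe_attributes(obj, skip=(), max_len=80):
    """Map each public attribute name to a short, printable description."""
    described = {}
    for name in dir(obj):
        if name.startswith("_") or name in skip:
            continue
        try:
            value = getattr(obj, name)
        except Exception as exc:
            # Record the failure instead of stopping the whole loop.
            described[name] = f"<error: {type(exc).__name__}>"
            continue
        text = repr(value)
        # Truncate long values so the listing stays readable.
        described[name] = text if len(text) <= max_len else text[:max_len] + "..."
    return described


class FakePage:
    """Hypothetical stand-in for a page object with one broken property."""
    title = "Conan the Barbarian"

    @property
    def coordinates(self):
        raise KeyError("coordinates")


for name, text in describe_attributes(FakePage()).items():
    print(f"{name}: {text}")
```

With a real page you would call `describe_attributes(page_object, skip=("content",))` instead.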
First, take a peek at the ‘content’ of the page by doing this:
print(page_object.content)
The result is far too long for me to print in this post. It contains not only information about the character of Conan the Barbarian but also about the author, the publishing history, etc. In other words, it contains a lot of information not contained in the official Wikipedia summary. So, let’s get Gemini to try to summarize this for us better.
Unfortunately, Gemini has no built-in summary methods… but that won’t stop us! We’ll just make a prompt that asks Gemini to summarize the text of the page like this:
# The text to summarize: the full content of the page
text_to_summarize = page_object.content
# Define the model (assuming the 'gemini-pro' text model, as in the last post) and the prompt
model = genai.GenerativeModel('gemini-pro')
prompt = f"Provide a one paragraph summary of the following text: \n{text_to_summarize}"
# Generate the summary
response = model.generate_content(contents=prompt)
Notice how I did this. We just built a prompt that said: “Provide a one paragraph summary of the following text:” and then added the actual text to the prompt! Yes, this is all you have to do with Gemini to get it to do a summary for you! In future posts, we’ll see that this sort of technique is quite powerful! It is almost like you are just talking to the computer on Star Trek and ‘programming it’ via a simple conversation.
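One thing to watch for with this technique is that some Wikipedia pages are very long. Below is a minimal sketch of trimming the text before building the prompt. The helper `build_summary_prompt` is my own invention, and the 20,000-character default is an arbitrary placeholder rather than an actual Gemini limit, so check the current model limits before relying on it:

```python
def build_summary_prompt(text, max_chars=20000):
    """Build a one-paragraph-summary prompt, trimming very long input first."""
    if len(text) > max_chars:
        # Prefer cutting at the last line break inside the budget,
        # so the text doesn't stop in the middle of a line.
        cut = text.rfind("\n", 0, max_chars)
        text = text[: cut if cut > 0 else max_chars]
    return f"Provide a one paragraph summary of the following text: \n{text}"


# Tiny demonstration with a deliberately small budget.
sample = "First paragraph.\nSecond paragraph.\nThird paragraph."
print(build_summary_prompt(sample, max_chars=40))
```

In the code above you would use `prompt = build_summary_prompt(page_object.content)` in place of the f-string.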
Now let’s take a look at the results:
# Access the summarized text
summary_text = response.text
# Print the summary
print(textwrap.fill(summary_text, width=80))
Here is the result I got back:
Conan the Barbarian, created by Robert E. Howard in 1932, is a renowned sword and sorcery hero. The character has been featured in a multitude of formats, including pulp magazines, books, comics, films, animated and live-action television programs, video games, and role-playing games. Howard's original stories were published in Weird Tales magazine, with over 20 stories written before his death in 1936. Conan, a Cimmerian descended from the Atlanteans, is a powerful, chivalrous, and humorous barbarian. His iconic appearance includes a square-cut black mane, blue eyes, and a giant stature. The character's adventures take place in the fictional Hyborian Age, after the fall of Atlantis and before the rise of ancient civilizations. Conan's adventures often involve heroic feats for personal gain and protection. Despite his brutish appearance, he is a skilled warrior and tactician. His adventures have gained him the position of king of Aquilonia. Howard's original stories have been influenced by other works, such as Clark Ashton Smith's Hyperborean Cycle and H. P. Lovecraft's Cthulhu Mythos. Later writers, including L. Sprague de Camp and Lin Carter, have expanded the Conan universe with new stories and rewrites. The character has achieved widespread recognition through various adaptations. Arnold Schwarzenegger's portrayal in the 1982 film Conan the Barbarian brought immense popularity to the character. Conan has also been featured in successful comic book series by Marvel Comics and Dark Horse Comics. Despite ongoing copyright and trademark disputes, Conan remains a beloved and enduring figure in popular culture.
Wow! That’s quite the improvement over the official summary! It briefly covers nearly every important aspect of the Wikipedia page.
The Gemini Pro Vision Model
Okay, but we’re still not done! Let’s get back to those images we previously noticed and see if the Gemini Pro Vision model can ‘read’ an image correctly.
First, we’ll need some new libraries:
from PIL import Image
import requests
from io import BytesIO
import matplotlib.pyplot as plt
Let’s test loading the image from Wikipedia:
# URL of the first image on the page
image_url = page_object.images[0]
print(image_url)
# Wikipedia expects a User-Agent header.
# See https://stackoverflow.com/questions/69791356/downloading-certain-images-from-wikipedia-results-in-unexpected-unidentifiedimag
headers = {
    'User-Agent': 'My User Agent 1.0'
}
# Send a GET request to the URL to fetch the image
response = requests.get(image_url, headers=headers)
# Check if the request was successful (status code 200)
if response.status_code == 200:
    print("Success!")
    # Read the image data from the response
    image_data = BytesIO(response.content)
    # Open the image using Pillow
    image = Image.open(image_data)
    # Display the image using matplotlib
    plt.imshow(image)
    plt.show()
else:
    print("Failed to fetch the image. Status code:", response.status_code)
One gotcha here that threw me off: Wikipedia won’t let you just grab images directly. You have to send a ‘User-Agent’ header with the request or it will fail to retrieve the image.
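Another possible gotcha: the ‘images’ list can include formats such as SVG icons, which Pillow cannot open, so the first entry is not guaranteed to be a usable picture. Here is a rough sketch of filtering the list by file extension first. The helper `pick_raster_images`, the set of extensions, and the second URL in the demo list are all assumptions made for illustration:

```python
import os
from urllib.parse import urlparse

# Extensions treated as "safe to open with Pillow" (an assumed, non-exhaustive set).
RASTER_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def pick_raster_images(image_urls):
    """Keep only URLs whose path ends in a common raster image extension."""
    picked = []
    for url in image_urls:
        extension = os.path.splitext(urlparse(url).path)[1].lower()
        if extension in RASTER_EXTENSIONS:
            picked.append(url)
    return picked


image_urls = [
    "https://upload.wikimedia.org/wikipedia/commons/7/7a/Conan_colors_by_rodrigokatrakas_ddcrjw1-fullview.jpg",
    # Made-up URL standing in for an SVG icon.
    "https://upload.wikimedia.org/wikipedia/commons/example/Example-icon.svg",
]
usable = pick_raster_images(image_urls)
print(usable)
```

With a real page you would pass `page_object.images` and then fetch `usable[0]` as shown earlier.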
Here is the result I get back:
https://upload.wikimedia.org/wikipedia/commons/7/7a/Conan_colors_by_rodrigokatrakas_ddcrjw1-fullview.jpg
Success!
Yup, that is definitely Conan the Barbarian! But let’s see what Gemini Pro Vision says about it!
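multi_mode_model = genai.GenerativeModel('gemini-pro-vision')
response = multi_mode_model.generate_content(image)
# to_markdown() is not defined in this post (presumably the display helper from the last post);
# print(response.text) also works
to_markdown(response.text)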
And how does Gemini do at reading this image? I get this as a response:
Conan, The Cimmerian.
Heck yeah, Gemini! Spot on! It can recognize an image of Conan the Barbarian!
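In the call above we passed only the image and let the model decide what to say. `generate_content` also accepts a list that mixes text and images, so you can ask a specific question about the picture. The wrapper below is a small sketch of that idea. The name `describe_image` and the default question are my own, and I have not rerun it against the Conan image, so treat the wording as a starting point:

```python
def describe_image(model, image, question="Who or what is shown in this image?"):
    """Send a text question plus an image to a multimodal model and return its reply."""
    # The content is a list of parts: the question first, then the image.
    response = model.generate_content([question, image])
    return response.text
```

For example, `describe_image(multi_mode_model, image, "Which fictional character is this?")` would reuse the model and image from the code above.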
So, in this post we saw how to use the Wikipedia module in Python to allow your applications to use Wikipedia as a resource. By combining that with the power of an LLM (Gemini) we showed how you can easily integrate the knowledge on Wikipedia into your applications by simply talking to the Gemini model via text like a computer on Star Trek.