GCP Authentication and Google Translation API

Part Two: Using the Google Translate API to Translate International News

This is a continuation of a three-part series. To go to part one, click here.

Analyzing Article Sources

In the last part, we pulled metadata from the latest news headlines using the News API. In part two, we will learn how to call the Google Translation API to translate text.

To start, let's take a look at where the articles we pulled come from:

import matplotlib.pyplot as plt
import seaborn as sns

ax = sns.countplot(x = 'country', data = sources, palette='Set1')
plt.ylabel('Number of News Sources')
plt.xlabel('Publication Country')
ax.set_xticklabels(['Australia', 'USA', 'UK', 'Germany', 'Italy', 'India'])
plt.title('Breakdown of News Source Publication Countries')
plt.show()

The articles from the News API come from 6 different countries.

Now, let's look at the language of the article data:

ax = sns.countplot(x = 'language', data = sources, palette='Set1')
plt.ylabel('Number of News Sources')
plt.xlabel('Publication Language')
ax.set_xticklabels(['English', 'German'])
plt.title('Breakdown of News Source Languages')
plt.show()

Even though some of the source countries are not primarily English-speaking, the article data is only in English and German.

Google Translate API

We are going to use the Google Translate API to translate any German-language articles captured in our News API request. In the next post, we will use the Google Natural Language API to perform sentiment and entity analysis on all of the articles.

Google Cloud Platform (GCP) Authentication

Authentication on GCP can be tricky, but these steps should guide you through the process. There are plenty of online resources if you get stuck (the GCP documentation is robust and helpful).

We will use a service account to access the GCP APIs.

  1. Create a GCP account at https://console.cloud.google.com. If you already have a Google Account for Gmail, you can use that.

  2. Once logged into GCP, go to the console and create or select a project.

  3. You will likely need to set up a billing account. Unless you are making many frequent requests, you are unlikely to be charged for API usage.

  4. Open the console's left-side menu, select APIs & Services, and then select Library.

  5. Click the Translation API under Google Cloud Machine Learning.

  6. Click ENABLE.

  7. From the APIs & Services page you previously visited, click "Credentials" in the left-side menu.

  8. Under "Create Credentials", select "Service account key".

  9. Create a new service account and make yourself the owner. Choose JSON for the key type.

  10. Click "Create" and the key will automatically download. Save this to your computer - don't lose it!
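
Before moving on, it can help to confirm that the downloaded file really is a service account key. The sketch below is not part of the original walkthrough. It assumes the key is a JSON file containing the usual `type`, `project_id`, `private_key`, and `client_email` fields, and the helper name `check_service_account_key` is made up for illustration.

```python
import json

# Fields a downloaded service account key is expected to contain (an
# assumption based on the standard key layout; a real key has a few more).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")

def check_service_account_key(path):
    """Return the service account email if the key file looks valid."""
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError(f"Key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError(f"Unexpected key type: {key['type']!r}")
    return key["client_email"]
```

Calling `check_service_account_key('path/to/apikey.json')` should return the service account's email address, or raise an error that points at what is wrong with the file.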

Making it Work in Python

  1. Now that we are done in the GCP Console, we will download the GCP Client Library for Python. The documentation is here.

  2. I'm using Anaconda, so I use the Anaconda prompt to install the Google Translate client library.

$ pip install --upgrade google-cloud-translate
  3. We also need the os module. It is part of the Python standard library, so there is nothing to install.
  4. Now, in Python, import the libraries and set up your API key path.
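# translate.Client() below is the v2 client; recent releases of
# google-cloud-translate expose it under translate_v2
from google.cloud import translate_v2 as translate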
import os
path = 'path/to/apikey.json'
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = path
  5. We will write a simple translation function to test that it is working properly.
# define client
translate_client = translate.Client()

# create a function to translate text to English
def translated_test(text, target='en'):
    result = translate_client.translate(text, target_language=target)

    print('Translation: ', result['translatedText'])
    print('Source language: ', result['detectedSourceLanguage'])

text_test = 'Hola amigo'
translated_test(text_test)

Great, it works!
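
One thing to watch for: the v2 Translation API treats input as HTML by default, so translated text can come back with HTML entities such as `&#39;` in place of an apostrophe. Here is a minimal sketch of cleaning that up with the standard library, assuming the result dictionary has the same `translatedText` key used above. The `clean_translation` helper and the sample dictionary are invented for illustration.

```python
import html

def clean_translation(result):
    """Return the translated text with HTML entities decoded."""
    return html.unescape(result['translatedText'])

# Hand-written sample shaped like an API result (not real API output)
sample = {'translatedText': 'I don&#39;t know', 'detectedSourceLanguage': 'de'}
print(clean_translation(sample))
```

Passing `format_='text'` to `translate_client.translate` should be another way to avoid the entities, though it is worth checking against the client documentation for your installed version.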

Translating German Articles

Let's go back to where we started. There are 10 news sources that are providing articles in German to translate for our 'Choose Your News' program.

We will first extract the articles in German (where language = "de") from the DataFrame we created in the last part.

results.head()

german = results[results.language == 'de']
german = german.reset_index(drop=True)

Now, we define our function to use the Translation API.

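# named translate_column so it does not shadow the imported translate module
def translate_column(to_trans):
    """Return English translations of each string in to_trans."""
    translated = []
    for text in to_trans:
        translation = translate_client.translate(text, target_language='en')
        translated.append(translation['translatedText'])
    return translated

german['description_x'] = translate_column(german.description_x)
german.head()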

The descriptions of the articles are now in English. Great! Our Translation API call is working!
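
The loop above makes one API request per description. The v2 client's `translate` method also accepts a list of strings and returns a list of result dictionaries, which allows a batched variant. The sketch below assumes that behaviour and uses a stand-in `FakeClient` class (hypothetical, not part of the library) so it runs without credentials. The `translate_in_batches` name and the default batch size of 50 are likewise illustrative choices.

```python
def translate_in_batches(client, texts, target='en', batch_size=50):
    """Translate a list of strings in chunks, returning a new list."""
    translated = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        # One request per batch instead of one request per string
        results = client.translate(batch, target_language=target)
        translated.extend(r['translatedText'] for r in results)
    return translated


class FakeClient:
    """Stand-in that mimics the shape of the real client's response."""
    def translate(self, values, target_language='en'):
        # Upper-casing is a placeholder for an actual translation
        return [{'translatedText': v.upper(), 'input': v} for v in values]


print(translate_in_batches(FakeClient(), ['hallo', 'welt', 'danke'], batch_size=2))
```

With the real client, the equivalent call would be `translate_in_batches(translate_client, german['description_x'].tolist())`, with the returned list assigned back to the column.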

We will now combine the translated articles with our other articles.

english = results[results.language == 'en']

# two DataFrames to combine
to_combine = [english, german]

# ignore_index=True because the concatenation axis
# does not have meaningful indexing information
translated = pd.concat(to_combine, ignore_index=True)
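
To see what `ignore_index=True` changes, here is a small sketch with two toy DataFrames. The names `english_demo` and `german_demo` and their contents are made up for illustration and are not the article data.

```python
import pandas as pd

# Two toy DataFrames standing in for the English and translated German articles
english_demo = pd.DataFrame({'title': ['A', 'B']}, index=[0, 3])
german_demo = pd.DataFrame({'title': ['C']}, index=[0])

kept = pd.concat([english_demo, german_demo])
fresh = pd.concat([english_demo, german_demo], ignore_index=True)

print(list(kept.index))   # original labels are kept, including the duplicate 0
print(list(fresh.index))  # labels are renumbered from 0
```

Without `ignore_index=True`, the combined frame would carry duplicate index labels, which makes later label-based lookups ambiguous.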

Summary

We have now used the News API to pull in the latest news headlines and their descriptions from 70 different news sources. We then used our Google Cloud Platform account to create a service account and authenticate usage of the Translation API.

We are ready for Part 3: Using the Google Natural Language API to Analyze News Sentiment
