An Introduction to Google’s Cloud Natural Language API

Are you caught up on the Google I/O announcements? I’m still wrapping my head around the MUM and LaMDA announcements but found this article from Edwin Toonen to be an excellent overview. Edwin says, “In complex queries like this, it all comes down to combining entities, sentiments, and intent to figure out what something means.” I’ve found it easiest to start learning about entities, sentiments, and natural language processing by playing around with Google’s Cloud Natural Language API. Although its language model is separate from that of Google Search, the API can give us clues into how, and how well, machines understand our content. Let’s learn how the API works, which language processing techniques it uses, and how marketers can put it to work.

How does the Natural Language API work?

To work with the API, we have to input data in an expected manner through an API request. Then the API will return a response with different data types we need to understand.

As with most APIs, requests are not always free, so you will need to set up a Google Cloud account with billing information. This guide focuses on helping you understand what the API offers and determining if it suits your needs. If you decide to start using it, you’ll need to set up your API key and authenticate your environment. Once you have the correct setup, you can make sample requests in your preferred programming language. Code samples are available in Go, Java, Node.js, PHP, and Python.

The API Input/Request

A request to the API needs two inputs: the Document and the Content. The Document type provides metadata about the content. As you might guess, the Content is the text to process.

Think of the Document as your doctype declaration in HTML. It tells the API what type of information to expect with the “type” field and what language with the “language” field. The document type can be HTML or plain text. The language field is optional, but if you don’t populate it, you will be relying on the API to correctly detect the language of the content. Refer to Google’s list of supported languages.

You should also pass an encodingType with your request, such as UTF8, UTF16, or UTF32. If the encoding is not specified, all encoding-based indexing (which you’ll see in responses as “beginOffset”) will be set to -1.
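To make the request shape concrete, here is a minimal sketch in Python that builds a request body using the field names from the REST version of the API (document, type, language, content, encodingType). The build_request helper is a hypothetical name used for illustration, not something Google ships, and the sketch only builds the JSON. It does not authenticate or send anything.

```python
import json

VALID_TYPES = {"PLAIN_TEXT", "HTML"}
VALID_ENCODINGS = {"NONE", "UTF8", "UTF16", "UTF32"}


def build_request(text, doc_type="PLAIN_TEXT", language=None, encoding_type="UTF8"):
    """Build a JSON-serializable request body (hypothetical helper)."""
    if doc_type not in VALID_TYPES:
        raise ValueError(f"unsupported document type: {doc_type}")
    if encoding_type not in VALID_ENCODINGS:
        raise ValueError(f"unsupported encoding type: {encoding_type}")
    document = {"type": doc_type, "content": text}
    if language is not None:
        # Leaving "language" out means the API has to detect it.
        document["language"] = language
    return {"document": document, "encodingType": encoding_type}


body = build_request("She loves the taste of Coca-Cola.", language="en")
print(json.dumps(body, indent=2))
```

Passing the encoding explicitly is the design choice worth copying: it keeps the beginOffset values in the response usable instead of -1.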

Google breaks down the Content into tokens, which are defined as “the smallest syntactic building block of the text.” We will talk more about the Token output, but I want to note that both the content and tokens have limits imposed on how much the API will process. If you exceed these quotas, the API will either throw an error or, even worse, completely ignore your requests. 

The API Outputs/Response

In addition to the Token, other main response types are TextSpan and Sentence. Let’s take a look at each of these.


TextSpan is an object that includes the content or text for the output being returned and where that content is located within the original document, indicated by the beginOffset field. See below an example TextSpan output converted into JSON.

The location of the content is vital for understanding dependencies. The TextSpan object is the expected data type for the “text” field within the Token.
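As a rough illustration of how beginOffset points back into the original text, the sketch below uses a hand-written TextSpan (not copied from a real API response). It assumes the request used the UTF8 encodingType, in which case my understanding is that offsets count bytes of the UTF-8 encoded text. The slice_by_utf8_offset function is a hypothetical helper.

```python
text = "She loves the taste of Coca-Cola."

# Hand-written TextSpan for illustration; not a real API response.
text_span = {"content": "taste", "beginOffset": 14}


def slice_by_utf8_offset(source, span):
    """Return the text a TextSpan points to, assuming UTF8 byte offsets."""
    raw = source.encode("utf-8")
    start = span["beginOffset"]
    end = start + len(span["content"].encode("utf-8"))
    return raw[start:end].decode("utf-8")


print(slice_by_utf8_offset(text, text_span))  # prints "taste"
```

For plain ASCII text like this, byte offsets and character offsets are the same. They diverge once the content includes accented letters, emoji, or other multi-byte characters.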


A Token is usually a word or punctuation mark. The output from the API gives us more data than simply the text, though. Other fields in the Token JSON object are partOfSpeech, dependencyEdge, and lemma.


Does the simple mention of parts of speech take you back to school days and having your language teacher drill nouns, verbs, adjectives, adverbs, etc., into your head? It sure did for me, so I’ll spare you all of those details. Beyond the basic tag (noun, verb, and so on), the partOfSpeech object also includes:

  • Aspect
  • Case
  • Form
  • Gender
  • Mood
  • Number
  • Person
  • Proper
  • Reciprocity
  • Tense
  • Voice

If you want to dig into the syntax, read Google’s explanation for each partOfSpeech field and their possible values.


As mentioned earlier, TextSpan includes the location of the content, which helps with understanding the dependencies between tokens. Going back to the days of learning syntax, you may remember that words (tokens) within a sentence can modify or depend on other words to clarify the whole sentence. For each sentence, Google creates a dependency tree and identifies a root token, which corresponds to the main verb in the sentence.

Google’s example dependency tree

The dependencyEdge type explains the output token’s relationship to the token it modifies (see my example below for more on this). The token being modified is considered the head token, so the headTokenIndex field passes its 0-based index. The label explains the relationship the token being output has to its head token. The label will return ROOT if the token is the root token of the sentence.

Let’s look at the example content of “She loves the taste of Coca-Cola.”

“Loves” is the root token in the dependency tree, since it is the verb. See how the label is ROOT in the dependency_edge object of the response (which I’ve reformatted into JSON for better human readability):

Here’s what the Token response for “of” looks like:

“Of” has a headTokenIndex of 3. Since the dependency tree uses a 0-based index, we can count 4 tokens from the beginning of the sentence. “Taste” is at that index, making it the head token for “of.” The label PREP represents a prepositional modifier. “Of” is modifying “taste,” its head token. Google’s documentation lists all of the labels and their meanings.
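The head-token lookup can be sketched in a few lines of Python. The tokens below are hand-written and stop after “of” to avoid guessing how the API splits “Coca-Cola.” Only the ROOT label on “loves” and the PREP edge from “of” to index 3 come from the example above. The other labels and indexes, and the root pointing at its own index, are assumptions for illustration.

```python
# Hand-written tokens for the start of "She loves the taste of Coca-Cola."
# Only "loves"/ROOT and "of"/PREP/3 come from the article; the rest is assumed.
tokens = [
    {"text": {"content": "She"}, "dependencyEdge": {"headTokenIndex": 1, "label": "NSUBJ"}},
    {"text": {"content": "loves"}, "dependencyEdge": {"headTokenIndex": 1, "label": "ROOT"}},
    {"text": {"content": "the"}, "dependencyEdge": {"headTokenIndex": 3, "label": "DET"}},
    {"text": {"content": "taste"}, "dependencyEdge": {"headTokenIndex": 1, "label": "DOBJ"}},
    {"text": {"content": "of"}, "dependencyEdge": {"headTokenIndex": 3, "label": "PREP"}},
]


def describe_edges(token_list):
    """Pair each token with its label and the text of its head token."""
    edges = []
    for token in token_list:
        edge = token["dependencyEdge"]
        head = token_list[edge["headTokenIndex"]]["text"]["content"]
        edges.append((token["text"]["content"], edge["label"], head))
    return edges


for word, label, head in describe_edges(tokens):
    print(f"{word:6} --{label}--> {head}")
```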


Google doesn’t provide much documentation on the lemma field and instead points to the Wikipedia article, which has some great examples for different languages. The lemma is the canonical, or dictionary, form of a word. From the previous example, “loves” has a lemma of “love,” as would “loving” and “loved.” When I was learning Spanish and French, we studied verbs by starting with the infinitive, which is the lemma for every form of that verb.
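One practical use of the lemma is grouping different word forms together. The sketch below does that with hand-written tokens. The “love” lemmas follow the example above, while the “taste” token and the group_by_lemma helper are assumptions for illustration.

```python
from collections import defaultdict

# Hand-written tokens; a real response carries more fields per token.
tokens = [
    {"text": {"content": "loves"}, "lemma": "love"},
    {"text": {"content": "loved"}, "lemma": "love"},
    {"text": {"content": "loving"}, "lemma": "love"},
    {"text": {"content": "taste"}, "lemma": "taste"},
]


def group_by_lemma(token_list):
    """Collect the surface forms that share each lemma, in input order."""
    groups = defaultdict(list)
    for token in token_list:
        groups[token["lemma"]].append(token["text"]["content"])
    return dict(groups)


forms = group_by_lemma(tokens)
print(forms)
```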


Since the API is creating dependencies based on each sentence, it can, of course, output sentence data. The Sentence output also has a “text” field with a TextSpan type value. 
Let’s update our request content to “Although she loves the taste of Coca-Cola, she drinks Coke Zero. Her friend worked at Coca-Cola.” Here’s the response (as JSON again).

The sentiment output, which is blank in this example, is tied to sentiment analysis methods, which we will learn more about in the next section.

The types described above show up throughout the API’s outputs, but they aren’t the only types. Sentiment (as mentioned above), Entity, and ClassificationCategory are other types that are output by specific API methods.

What language processing techniques does the Natural Language API use?

We already know that the API can analyze syntax, from the way it breaks down sentences and tokens. So, what other techniques can it offer us? The API’s other methods are:

  • analyzeEntities and analyzeEntitySentiment
  • classifyText
  • analyzeSentiment 
  • annotateText 

Analyze Entities

The analyzeEntities method finds named entities (currently proper names and common nouns) in the text along with entity types, salience, mentions for each entity, and other properties. This method returns the Entity type.

From our last example content, the API found the entity “Coca-Cola” and matched it with metadata. The metadata key of “MID” represents a Knowledge Graph machine ID. The other metadata key of “wikipedia_url” is pretty self-explanatory, but means that the entity exists in Wikipedia’s knowledge base, too. If you want to learn more about the different knowledge bases, Xoogler Michael Ringgaard manages a fantastic tool that lists entities across knowledge bases and helps humans more easily visualize how entities are related. Check out this example for Coca-Cola.

Notice how this entity is about Coca-Cola, the beverage product, not the organization. The MID listed in our output, “/m/03phgz,” is the company’s. (I incorrectly thought using “taste of” would give enough context to indicate that the mention refers to the soft drink, so keep this in mind when you have similar entities!) In the References section of the screenshot, I highlighted the Freebase ID for the Coca-Cola beverage, which is “/m/01yvs.”

What is a Freebase ID? Google bought Freebase in 2010, then closed it in 2015 to focus on the Knowledge Graph. Any entities created after the closing of Freebase have a MID that begins with /g/. For example, Coca-Cola Zero Sugar, which was introduced in 2016, has a MID (or “Freebase ID” in Ringgaard’s database) of “/g/1212gl1p.”

You can also check for the MID in Google Trends. As long as you have selected a topic (hint: it will have “Topic” or some other category listed under the keywords, rather than “search term”), you can pull the MID from the “q” query parameter of the page’s URL. The “/” in the MID is encoded to “%2F.”
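Pulling the MID out of a Trends URL is easy to script. This is a small sketch using Python’s standard library. The URL shape is an assumption based on the description above, and mid_from_trends_url is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlparse


def mid_from_trends_url(url):
    """Return the decoded "q" value if it looks like a MID, else None."""
    values = parse_qs(urlparse(url).query).get("q", [])
    if not values:
        return None
    candidate = values[0]  # parse_qs has already turned %2F back into "/"
    return candidate if candidate.startswith(("/m/", "/g/")) else None


url = "https://trends.google.com/trends/explore?q=%2Fm%2F01yvs"
print(mid_from_trends_url(url))  # prints "/m/01yvs"
```

A plain search term in the “q” parameter does not start with /m/ or /g/, so the helper returns None for it.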

Looking back at the API’s entity output, below the metadata we have salience. Google considers salience “the importance or centrality of that entity to the entire document text.”

Following the salience are the mentions of the Entity in the Document. The first mention includes sentiment. The Sentiment type has two values, magnitude and score. The score is based on a -1 to 1 range, with -1 being a negative sentiment and 1 being positive. The magnitude is a non-negative number that represents the overall strength of the sentiment, regardless of whether it is negative or positive. The sentiment score available outside of the mentions applies to the overall sentiment of the Entity. Sentiment is only provided on entities when the analyzeEntitySentiment method is run. We won’t go into any further details on the analyzeEntitySentiment method since we’ve already reviewed its output in this example.
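Score and magnitude are easier to read together than apart. The sketch below turns the pair into a rough label. The cutoff values are arbitrary assumptions to tune for your own content, and describe_sentiment is a hypothetical helper, not part of the API.

```python
def describe_sentiment(score, magnitude, score_cutoff=0.25, magnitude_cutoff=1.0):
    """Map a (score, magnitude) pair to a rough label using assumed cutoffs."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1 and 1")
    if magnitude < 0:
        raise ValueError("magnitude must be non-negative")
    if score >= score_cutoff:
        return "positive"
    if score <= -score_cutoff:
        return "negative"
    # A score near zero with a large magnitude suggests positive and negative
    # statements cancelling out; a small magnitude suggests little emotion.
    return "mixed" if magnitude >= magnitude_cutoff else "neutral"


print(describe_sentiment(0.8, 1.6))
print(describe_sentiment(0.0, 3.0))
```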

Classify Text

Another method provided by the Cloud Natural Language API is classifyText, which returns the ClassificationCategory type.

The output above indicates that our example content is within the Food & Drink > Beverages > Soft Drinks category with a confidence of 0.86. The name field displays the category. Google’s documentation includes the full list of predefined categories.

I assumed these categories would match those available in Google Trends. The top-level category of “Food & Drink” matches the top category in Google Trends, as does the Trends topic of “soft drink.” However, the Beverages subcategory is not available in Trends. “Alcoholic Beverages” and “Non-Alcoholic Beverages” are the options given in Trends.

The second field in the ClassificationCategory is the confidence score. The value is from 0 to 1 to indicate how confident the classifier is in this categorization. The confidence score could be helpful if you have a website that covers various categories and you aren’t sure which to set as the primary category for a particular blog post.
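Choosing a primary category from the confidence scores could look something like the sketch below. The first category and its 0.86 confidence come from the example above. The second category, the 0.5 minimum, and the primary_category helper are made up for illustration.

```python
# The first entry mirrors the example above; the second is invented.
categories = [
    {"name": "/Food & Drink/Beverages/Soft Drinks", "confidence": 0.86},
    {"name": "/Hypothetical/Second Category", "confidence": 0.52},
]


def primary_category(category_list, minimum=0.5):
    """Return the most confident category name, or None if none clears the bar."""
    eligible = [c for c in category_list if c["confidence"] >= minimum]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c["confidence"])["name"]


print(primary_category(categories))  # prints "/Food & Drink/Beverages/Soft Drinks"
```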

Analyze Sentiment

Beyond just the sentiment of entities, the API has a method to provide sentiment analysis for the whole document. My original example didn’t have enough sentiment, so I created a new example: “Coca-Cola is the best soft drink ever! I love it more than any other brand.” Here is the output (converted to JSON again) of the analyzeSentiment method.

As pictured above, this method provides an overall sentiment for the document, output in the “document_sentiment” object. Sentiment is also populated for each Sentence object.

Annotate Text

The annotateText method allows you to run several or all of the other methods at once. You will need to pass an extra “features” parameter in the request, alongside the Document, to indicate which features you want to enable. Each feature expects a Boolean (true/false) value. The features are:

  • extractSyntax, which enables the analyzeSyntax method and returns the tokens and sentences
  • extractEntities
  • extractDocumentSentiment
  • extractEntitySentiment
  • classifyText
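The feature flags above can be sketched as a request body like this one, using the REST field names. The build_annotate_request helper is hypothetical, and the sketch only assembles the JSON rather than calling the API.

```python
import json

FEATURES = (
    "extractSyntax",
    "extractEntities",
    "extractDocumentSentiment",
    "extractEntitySentiment",
    "classifyText",
)


def build_annotate_request(text, enabled, encoding_type="UTF8"):
    """Build an annotateText-style body with one Boolean per feature."""
    unknown = set(enabled) - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        # Features sit next to the document in the request, not inside it.
        "features": {name: (name in enabled) for name in FEATURES},
        "encodingType": encoding_type,
    }


request = build_annotate_request(
    "Coca-Cola is the best soft drink ever! I love it more than any other brand.",
    enabled={"extractEntities", "extractDocumentSentiment"},
)
print(json.dumps(request, indent=2))
```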

How can marketers use the Natural Language API?

Content Categorization

Although the classifyText method is meant to categorize, I found the categories too broad when working on a niche topic. 

I was helping to tag travel industry posts; sometimes content about United Airlines was abbreviated to United. I wanted to use entity extraction to determine if the ~2,000 posts that included “United,” but not “United Airlines,” should be categorized as the latter. Using the analyzeEntities method, I checked if an entity on the page matched the MID for United Airlines.
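A check like that might be sketched as follows. The response is hand-written, TARGET_MID is a placeholder rather than the real United Airlines MID, and mentions_entity is a hypothetical helper, so treat this as the shape of the idea rather than the exact code used.

```python
TARGET_MID = "/m/EXAMPLE"  # placeholder; look up the real MID before using this

# Hand-written analyzeEntities-style response for illustration.
response = {
    "entities": [
        {"name": "United", "type": "ORGANIZATION",
         "metadata": {"mid": "/m/EXAMPLE"}, "salience": 0.4},
        {"name": "flight", "type": "OTHER", "metadata": {}, "salience": 0.1},
    ]
}


def mentions_entity(api_response, target_mid):
    """Return True if any entity in the response carries the target MID."""
    return any(
        entity.get("metadata", {}).get("mid") == target_mid
        for entity in api_response.get("entities", [])
    )


print(mentions_entity(response, TARGET_MID))  # prints True
```

Matching on the MID instead of the entity name is what lets “United” and “United Airlines” resolve to the same thing.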

I took this a step further for another client in the manufacturing industry. From their glossary of industry terms, I was able to identify 40 entities tied to their content. I ran all of their content, including crawled and indexed PDFs, through the API to determine each entity’s salience to each piece of content. This information helped visualize where some pages might need to be differentiated from each other, where content around an entity is vague or thin, and where content gaps existed. If an entity had little or no salience within the existing body of content, that indicated an opportunity for new content. If you are interested in this, check out the Python script in my GitHub repository.
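To show the gap-finding idea in miniature (this is not the script from the repository), the sketch below looks for tracked entities that never reach a minimum salience on any page. The pages, entity names, salience values, and the 0.05 cutoff are all invented for illustration.

```python
# Invented data: page URL -> {entity name: salience on that page}
salience_by_page = {
    "/guides/fasteners": {"torque": 0.42, "tolerance": 0.08},
    "/guides/machining": {"tolerance": 0.35, "anodizing": 0.01},
}
tracked_entities = ["torque", "tolerance", "anodizing"]


def coverage_gaps(pages, entities, minimum=0.05):
    """Return tracked entities whose best salience on any page is below minimum."""
    gaps = []
    for entity in entities:
        best = max((scores.get(entity, 0.0) for scores in pages.values()), default=0.0)
        if best < minimum:
            gaps.append(entity)
    return gaps


print(coverage_gaps(salience_by_page, tracked_entities))  # prints ['anodizing']
```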

User-Generated Content Sentiment Analysis

The analyzeSentiment method can also be used to track your customers’ and partners’ feelings about your products.

Do you use Review markup for your products? Run that schema through the analyzeSentiment method to find the overall sentiment toward your product based on the reviews found on the product page.

Are you working with influencers or affiliate marketers? Sometimes it can be hard to choose the right partner. Check the sentiment of their content. If they tend to have more negative sentiment, you might not want them advocating for your brand.

Content Quality Analysis

Using the annotateText method, you can analyze syntax and sentiment to understand how they may affect high-performing vs. low-performing content. Based on the results, you could test tweaking lower-performing content by updating the copy to reflect a more positive sentiment. Do how-to guides with more imperative mood verbs tend to perform better than others? Is your text being marked as subjunctive when you thought you were providing a fact-based take?

How else might you use the Cloud Natural Language API for your marketing efforts? I hope this guide has piqued your interest and has made you feel more comfortable learning about natural language processing. If you’re using the Natural Language API to drive content efforts, let us know how it’s going in the comments!

Written by
Mike founded UpBuild in 2015 and served as its CEO for seven years, before passing the torch to Ruth Burr Reedy. Mike remains with the company today as Head of Business Operations.
