Company News Lab


Overview

Using the Intrinio Financial Data API (application programming interface), we can programmatically retrieve the latest news articles for a company.

We will use the All News by Company API end-point to retrieve company news throughout this lab.

In this lab we will cover:

  • Retrieving the latest news articles for Apple Inc. (AAPL) using Python or R
  • Extracting keywords and occurrence counts from news articles for Apple Inc.
  • Plotting the occurrence counts for the top 40 keywords in news about Apple Inc.

Code Examples

Code examples in Python and R are provided throughout this lab.

All code examples can be run in your browser using a Jupyter Notebook. To launch the Jupyter Notebook for any code example, simply click the Run Jupyter Notebook button at the top of the code example.

If you wish to run the code examples in your own development environment, be sure to first install the Intrinio SDK for the language of your choice.

API Keys

In code examples throughout this lab you will be asked to provide your API Key.

The API keys assigned to you are available in your Intrinio account.

For the purposes of this lab, be sure to use your Production API key.


Section 1

Retrieving the latest news articles for Apple Inc.

In this section we will use the All News by Company API end-point to retrieve the latest news articles for Apple Inc.

After calling the API we will iterate over the news articles returned, adding each news article summary to an 'article_summaries' array.

Run the code below to see the result...
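
A minimal Python sketch of this step follows. It assumes the Intrinio Python SDK is installed as 'intrinio_sdk', that it exposes 'CompanyApi.get_company_news', and that each returned article carries a 'summary' attribute; check these against the current SDK documentation before relying on them. The helper names 'fetch_company_news' and 'collect_summaries' are illustrative and not part of the SDK.

```python
def fetch_company_news(identifier, api_key, page_size=50):
    """Call the All News by Company end-point through the Intrinio Python SDK.

    Assumption: the SDK calls below match the installed intrinio_sdk version.
    """
    import intrinio_sdk  # imported lazily so collect_summaries works without the SDK

    intrinio_sdk.ApiClient().configuration.api_key['api_key'] = api_key
    company_api = intrinio_sdk.CompanyApi()
    response = company_api.get_company_news(identifier, page_size=page_size)
    return response.news


def collect_summaries(articles):
    """Return the non-empty summary text of each article, in order."""
    article_summaries = []
    for article in articles:
        summary = getattr(article, "summary", None)
        # Skip articles that have no summary or only whitespace.
        if summary and summary.strip():
            article_summaries.append(summary.strip())
    return article_summaries
```

With a valid Production API key, the two helpers could be combined as: article_summaries = collect_summaries(fetch_company_news("AAPL", "YOUR_API_KEY")).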

Section 2

Extracting keywords and occurrence counts from news articles for Apple Inc.

Now that we have an array of the latest news article summaries for Apple Inc. we'll iterate over the array extracting keywords from each article summary.

For keyword extraction we'll use an algorithm called 'TextRank'.

In Python we'll use the 'gensim' library's TextRank implementation (the 'gensim.summarization' module, which is only available in gensim versions before 4.0).

In R we'll use the 'textrank' library, with support from the 'udpipe' library.
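
To give a feel for what TextRank does, here is a simplified, standard-library-only sketch. It is not the gensim or 'textrank' implementation: it links words that occur close together and ranks them with a PageRank-style iteration, whereas library implementations add steps such as part-of-speech filtering and multi-word keyword merging. The function name, the stop-word list and all parameter defaults are assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Tiny illustrative stop-word list; real implementations use a much larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "on", "is",
              "its", "it", "with", "as", "by", "at", "that", "this", "has"}


def textrank_keywords(text, top_n=5, window=2, damping=0.85, iterations=50):
    """Rank the words of `text` with a simplified TextRank and return the top ones."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOP_WORDS and len(w) > 2]

    # Undirected co-occurrence graph: words within `window` positions are linked.
    neighbours = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1:i + 1 + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)

    if not neighbours:
        return []

    # PageRank-style power iteration: each word shares its score with its neighbours.
    scores = {word: 1.0 for word in neighbours}
    for _ in range(iterations):
        scores = {
            word: (1 - damping) + damping * sum(
                scores[n] / len(neighbours[n]) for n in neighbours[word])
            for word in neighbours
        }

    # Highest score first, ties broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]
```

Words that co-occur with many other words end up with the highest scores, which is why a company name tends to rank near the top of its own news summaries.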

Extracting keywords from an article summary will return an array of keywords. We'll iterate over this array of keywords adding each keyword to a global 'all_keywords' array.

Once this operation has completed for every article and every keyword within every article, the 'all_keywords' array will contain every keyword from every article. A keyword that appears in several article summaries will therefore occur several times in the array.

Since we're interested in knowing the number of occurrences for each unique keyword within news articles for Apple, we'll need to perform a transformation on our 'all_keywords' array.

In Python we'll simply use a Counter (from the standard 'collections' module) to count our 'all_keywords' array; its most_common() method returns a list of unique (keyword, count) tuples sorted by descending count. We'll store this list of tuples as 'unique_keywords'.

In R we'll convert our 'all_keywords' list into a data frame, then use the table() function to return a list of unique keywords and counts. We'll cast the result returned from table() into a new data frame called 'unique_keywords'.

Run the code below to see the result...
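
The Python side of this step can be sketched as follows. The extractor is passed in as a parameter so that any TextRank implementation can be plugged in; the default 'simple_keywords' is a deliberately crude stand-in (it is not TextRank) that only exists to keep the sketch runnable without extra libraries. Both function names are illustrative assumptions.

```python
import re
from collections import Counter


def simple_keywords(summary):
    """Stand-in extractor (NOT TextRank): lower-cased words of four or more letters."""
    return re.findall(r"[a-z]{4,}", summary.lower())


def count_keywords(article_summaries, extract_keywords=simple_keywords):
    """Collect keywords from every summary, then count each unique keyword."""
    all_keywords = []
    for summary in article_summaries:
        for keyword in extract_keywords(summary):
            all_keywords.append(keyword)

    # most_common() returns (keyword, count) tuples sorted by descending count.
    unique_keywords = Counter(all_keywords).most_common()
    return all_keywords, unique_keywords
```

Calling count_keywords(article_summaries) returns both the flat 'all_keywords' list and the counted 'unique_keywords' list described above.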

Section 3

Plotting the occurrence counts for the top 40 keywords in news about Apple Inc.

Building on our previous work, let's now create a bar chart of our top 40 unique keywords and their occurrence counts.

Run the code below to see the result...
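
One way to draw this in Python is sketched below, using a horizontal bar chart (one bar per keyword, which suits a single count per keyword). It assumes matplotlib is installed and that 'unique_keywords' is a list of (keyword, count) tuples; the function names, figure size and labels are illustrative choices.

```python
def top_keywords(unique_keywords, n=40):
    """Return the n (keyword, count) pairs with the highest counts."""
    return sorted(unique_keywords, key=lambda pair: pair[1], reverse=True)[:n]


def plot_top_keywords(unique_keywords, n=40):
    """Draw a horizontal bar chart of the n most frequent keywords."""
    import matplotlib.pyplot as plt  # imported lazily; requires matplotlib

    top = top_keywords(unique_keywords, n)
    labels = [keyword for keyword, _ in top]
    counts = [count for _, count in top]

    fig, ax = plt.subplots(figsize=(8, 10))
    ax.barh(labels, counts)
    ax.invert_yaxis()  # most frequent keyword at the top
    ax.set_xlabel("Occurrences")
    ax.set_title("Top keywords in news about Apple Inc.")
    fig.tight_layout()
    return fig
```

In a notebook, calling plot_top_keywords(unique_keywords) displays the chart; in a script, follow it with matplotlib's plt.show().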

Going further...
  • Can you retrieve the news articles for multiple companies, creating arrays of news article keywords for each company?
  • There are many general keywords contained within news articles about Apple Inc. These keywords could apply to any company and don't provide us any information that might be useful for investment purposes. How might we filter the list of keywords down to those which apply to Apple Inc. most specifically?
  • How might we filter our list of keywords down to those which apply to Apple Inc. as a company within the technology industry?
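
As one possible starting point for the filtering questions, the sketch below keeps only those keywords that show up for the chosen company and for few (by default, no) other companies. It assumes you have already built a dictionary mapping each ticker to its keyword list; the function name and the 'max_other_companies' parameter are hypothetical, and more refined weightings such as TF-IDF are equally valid approaches.

```python
from collections import Counter


def company_specific_keywords(keywords_by_company, company, max_other_companies=0):
    """Keep keywords of `company` that appear for at most `max_other_companies` others."""
    # Count in how many companies' keyword lists each keyword appears.
    company_frequency = Counter()
    for keywords in keywords_by_company.values():
        company_frequency.update(set(keywords))

    counts = Counter(keywords_by_company[company])
    return [
        (keyword, count)
        for keyword, count in counts.most_common()
        # Subtract 1 so the chosen company itself is not counted as "other".
        if company_frequency[keyword] - 1 <= max_other_companies
    ]
```

Restricting 'keywords_by_company' to technology companies, or to companies outside that industry, gives two different views of which keywords are specific to Apple Inc.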