Bogdan Covrig

Posted on Aug 22, 2020

Content Sentiment Analysis: Explore the emotion score of your content with Google API

#actionshackathon #github #googlecloud #showdev

My Workflow

I spent a few nights wrapping Google's analyzeSentiment API into a GitHub Action. The Action runs sentiment analysis over the content of HTML files and provides an overview of the overall emotion of all (or only the selected) pages in your project.

The API returns a score from -1 to 1 for each page: the sign indicates whether the emotion is negative or positive, and the closer the value is to -1 or 1, the stronger that emotion is. After the Action runs, a table with the score for each page is printed in its logs. Read more about Interpreting sentiment analysis values.
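As a rough illustration of reading those scores, here is a minimal Python sketch that buckets a score into a coarse label. The function name `label_sentiment` and the 0.25 cut-off are assumptions made for this example; they are not part of the Action or of Google's API.

```python
def label_sentiment(score: float, threshold: float = 0.25) -> str:
    """Bucket a sentiment score in [-1, 1] into a coarse label.

    The 0.25 threshold is an illustrative assumption, not a value
    taken from Google's documentation or from the Action itself.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score must be between -1 and 1, got {score}")
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    # Scores close to zero are treated as neutral (or mixed) content.
    return "neutral"


if __name__ == "__main__":
    for value in (0.8, 0.1, -0.6):
        print(value, label_sentiment(value))
```

The right cut-off depends on your content, so treat the threshold as something to tune rather than a fixed rule.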

Along with other content analysis tools, it might come in handy for maintainers who want to understand the text that is pushed to their project every day. See it in action 🚀

⚠️ I got really excited about this and will keep developing it, along with other automation ideas that I have. Since this is an early release of the Action, please take a look at the roadmap and submit issues describing the features you would like to see in future releases.

Submission Category:

Maintainer Must-Haves

Yaml File or Link to Code

Here is an example of how to use the Action on public .html files.

name: Sentiment analysis on public

on: push

jobs:
  analysis:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2 # Be sure you check out the files beforehand
    - name: Run sentiment analysis on HTML files
      uses: bogdaaamn/copy-sentiment-analysis@v0.6.1
      with:
        gcp_key: ${{ secrets.GCP_KEY }} # Google Cloud Platform API key. Read the README for instructions

Along with the code, more examples, the requirements, and a roadmap of known issues are available in the bogdaaamn/copy-sentiment-analysis repository (view it on Marketplace).


Additional Resources / Info

Open source use cases

At the moment, the overview table is printed in the Actions tab after the workflow has run. That feels counter-intuitive and puts too much friction between pushing content and seeing the results.

I am curious about what the community thinks: How would you like this Action to report its results? As a comment on the PR? As a table in the Action's log? By failing if there are too many negative results?
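To make two of these options concrete, here is a small Python sketch, not the Action's actual code. It renders per-page scores as a Markdown table (which could go in a log or a PR comment) and decides whether a run should fail. The function names, the -0.25 cut-off, and the 50% ratio are all hypothetical choices for illustration.

```python
from typing import Dict


def render_markdown_table(scores: Dict[str, float]) -> str:
    """Render page scores as a Markdown table, sorted by page name."""
    lines = ["| Page | Score |", "| --- | --- |"]
    for page, score in sorted(scores.items()):
        lines.append(f"| {page} | {score:+.2f} |")
    return "\n".join(lines)


def should_fail(
    scores: Dict[str, float],
    negative_below: float = -0.25,    # assumed cut-off for a "negative" page
    max_negative_ratio: float = 0.5,  # assumed tolerated share of negative pages
) -> bool:
    """Return True when the share of negative pages exceeds the tolerated ratio."""
    if not scores:
        return False
    negative = sum(1 for score in scores.values() if score < negative_below)
    return negative / len(scores) > max_negative_ratio


if __name__ == "__main__":
    example = {"index.html": 0.4, "about.html": -0.6, "blog.html": -0.3}
    print(render_markdown_table(example))
    print("fail the run:", should_fail(example))
```

Keeping the rendering and the pass/fail decision separate means the same scores could feed a log table, a PR comment, or a failing check without changing the analysis step.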

⚠️ GCP's bias in sentiment analysis

A few years ago, Google's API was criticized in the media for producing biased results with respect to race, gender, and religion. So I had mixed feelings about using a pre-trained model in this Action.

It is hard to know what is going on inside Google's proprietary algorithm and how they fight unwanted bias, but more recent research (see charlescearl, 2019) concluded that GCP seems to be less sensitive to the race or gender of participants than competing platforms. The same article recommends that users proceed with caution and conduct their own evaluations. The tests that I've done had neutral results, but I am ready to expand my research and pull the plug if needed.

Moreover, there is extraordinary research being done on identifying, analyzing, and reducing bias in data (see Dixon et al., 2018 from Google Research or Caliskan et al., 2016 and May et al., 2019), and all I hope is that Google (or really any cloud provider out there) keeps getting better at this. I believe it is really important to enable and support bias and fairness research, especially now, after the recent rise of GPT-3 (see Burus, 2020), when society is exposed to technology more and more.


Top comments (3)

Vadorequest • Aug 26 '20

An option to decide whether to do the sentiment analysis per file or per "block" could be handy.

I recently worked with sentiment analysis on sentences written in an Excel file. I could easily convert that Excel file into one HTML file, wrap each sentence in a div or mark it with a CSS class, and then easily get the results using your Action.

Regarding your results display question, I'd advise keeping something visual, like what you did with the table in the output. It would definitely be great to have it both as Action comment (as shown above) and as PR comment (for history and easier/faster visualization).
I'd also advise to generate a JSON artifact that's included with the "run".

That's what I do with my E2E tests: I store the screenshots and videos as artifacts. See github.com/UnlyEd/next-right-now/r...

Having such an artifact would allow the owner to actually use those results programmatically quite easily. (table is great for visualization)

Bogdan Covrig • Aug 26 '20

Thanks a lot @vadorequest for the input! Really appreciate it 🙌🏻

An option to decide whether to do the sentiment analysis per file or per "block" could be handy.

That is a great idea, I actually thought about that myself. From my experience, when we usually run sentiment analysis on actual documents, we do it for the whole document. But I see how there is a content issue there. While papers are meant to be written about the same topic, on a website you have different sections that might be totally unrelated and have a different tone, vibe, or language. So I think an option to switch between those would be really helpful.

It would definitely be great to have it both as Action comment (as shown above) and as PR comment (for history and easier/faster visualization).

Thank you for that, I totally agree. I am currently trying to sort out @actions/github and then I will jump on the PR comment.

Having such an artifact would allow the owner to actually use those results programmatically quite easily. (table is great for visualization)

This is such a great idea. I never thought about how to pass the results programmatically to the next steps.

Thanks for sharing UnlyEd/next-right-now, you have really nice pipelines in there.