Ildar Sharafeev

Posted on Mar 7, 2023 • Originally published at thesametech.com on Mar 3, 2023

Automated Blog Promotion with ChatGPT, Twitter and AWS

#chatgpt #openai #aws #serverless

In today’s digital landscape, creating a blog is just the first step in establishing an online presence. To attract readers and build a following, bloggers need to promote their content effectively. In this blog post, we’ll explore the topic of automated blog promotion and how to use ChatGPT, Twitter API, and AWS to create a blog promotion toolkit that can help bloggers promote their content more efficiently. The toolkit is designed to automate the promotion process using cutting-edge technologies, making it easier for bloggers to reach their target audience.

By following the steps outlined in this post, you’ll be able to create your own automated blog promotion toolkit and take your blog promotion efforts to the next level. Let’s dive in and see how ChatGPT, Twitter API, and AWS can help you promote your blog like a pro!

Requirements:

Schedule automation. The toolkit must run automatically on a schedule, without requiring manual intervention, to save users time and effort in promoting their content.
Extendibility. The toolkit should be easy to extend, allowing users to add new promotion tools as needed. The MVP version of the toolkit will contain only one tool designed to engage new readers by replying to tweets using OpenAI APIs and ChatGPT, with AI text generation to ensure that the Twitter bot responds using relevant context. In the future, we may want to add functionality (publishing to relevant communities) or even expand to new platforms (e.g. LinkedIn).
Politeness. The tool must be polite and respectful to Twitter accounts, avoiding multiple replies to the same user within a short period to prevent appearing annoying or spammy.
Hybrid runtime. The toolkit should be able to run both in the AWS cloud and locally, providing users with the flexibility to choose the deployment method that works best for them.

Problem

Creating high-quality content for a blog is a challenging task, but promoting that content to reach a wider audience can be even more difficult. Social media platforms like Twitter offer a powerful way to share blog posts and engage with readers, but manually managing a Twitter account and replying to tweets can be time-consuming and repetitive. Furthermore, engaging with Twitter users in a way that is polite and respectful can be challenging, particularly when trying to reach a large number of users. These challenges can make it difficult for bloggers to effectively promote their content on Twitter and reach new readers. In the next section, we will explore how our blog promotion toolkit can help address these challenges and enable bloggers to reach their promotion goals.

Solution

Our blog promotion tool aims to simplify the process of promoting blog content on Twitter by automating the process of finding relevant tweets and engaging with Twitter users. The tool takes a list of blog posts, each with its own set of hashtags and metadata. These hashtags and metadata are configured manually by the author of the blog (using the same ChatGPT usually 😀) and extracted directly from the HTML.

Once the hashtags have been extracted, the toolkit passes them to the Twitter SearchTweets API, which returns a list of relevant tweets from oldest to newest. The toolkit then checks each tweet for various requirements, such as politeness and the author’s audience size. If a tweet qualifies, the toolkit feeds the tweet URL and blog post URL to the OpenAI API, which generates a relevant reply using the ChatGPT language model. Finally, the toolkit sends the generated reply using the Twitter Manage Tweets API.

We can visualize this flow as a pipeline with the following key components:

By automating the process of finding relevant tweets and generating polite and engaging responses, our blog promotion toolkit can help bloggers save time and effort in promoting their content on Twitter. Additionally, the ability to extend the toolkit with new promotion tools in the future can provide even more ways to engage with readers and promote blog content.

Tools and libraries

Our blog promotion toolkit is built using a variety of tools and libraries, including:

Python : a popular programming language that is widely used in data analysis, machine learning, and web development. This is my first time using this language to build something meaningful so please don’t judge me harshly.
Tweepy : a Python library that provides easy access to the Twitter API, allowing developers to build applications that interact with Twitter. I will be using V2 Client.
OpenAI library : a Python library that provides access to OpenAI’s powerful language models, including GPT-3.
BeautifulSoup : a Python library for web scraping and parsing HTML documents.
AWS services : DynamoDB (for storage), AWS Lambda (for computing), AWS CloudWatch (for event scheduling to trigger Lambda), and of course my favourite AWS App Composer to integrate all of these services together into a working SAM template and deploy it via AWS CloudFormation.

Implementation

TLDR; Link to the code

GitHub repo: https://github.com/sr-shifu/blog-promo-toolkit

Building infrastructure

As mentioned earlier, I will use AWS App Composer UI to build my SAM template. I will not go deep into the details (you can find more information about how to work with it in one of my previous posts) — I will just leave my final diagram here:

Infrastructure is pretty simple:

EngageTweets lambda function. I decided to set its timeout to the maximum (15 mins) to give it as much time as possible: OpenAI API can be slow, and Twitter search API might have pretty long cooldown timeouts after being throttled.
Event rule (AWS::Events::Rule) that will trigger EngageTweets lambda function using this cron expression (unfortunately UI does not support this configuration, so you need to write it by hand): cron(0 18 * * ? *). This expression can be decoded to human language as "at 18:00 UTC every day".
IntegrationTokens Secrets Manager instance to store tokens to access Twitter and OpenAI APIs. EngageTweets lambda will consume them via environment variables (again, need to configure manually in the template).
RepliedTweets table. It will be used to store information about tweets that have been already replied — it would be used by both the SearchTweets and TweetFilter components. It will use userId as a partitioning key and tweetId as a range (sort) key. There are also 2 additional attributes that are worth mentioning:
- TTL: expiration key (UNIX time of when the record needs to be deleted from the table). The idea behind this is to make storage more efficient. Twitter’s SearchTweets API can search tweets only for the last 7 days, so there is no need to store tweet reply data forever. We can reply to the tweet, set TTL to be current_time + 7d , and make sure that we will not disturb the author during the next 7 days or will not reply to the same tweet again.
- searchKey: query string that was used to search this tweet (about this later). The idea here is to build Global Secondary Index (GSI) using searchKey as the primary key and tweetId as a range key. Using this index we can find the latest tweet that was replied to in case the previous function execution was interrupted (for instance, exceeded lambda timeout). You may ask why don't store reply time in the table and use it as a sort key instead. The answer is simple: Twitter uses Snowflake IDs that guarantee that all tweets follow the rough order (roughly sortable). Search API also supports since_id a request parameter that will give us all tweets that were sent after the last replied tweet id stored in our table. In case if our tool didn't run during the last 7 days and all data in the table was purged, it's not a problem at all - remember, API returns only the recent 7 days.

Deeper into the code

Initialize Tweepy client:

client = tweepy.Client(
    bearer_token=bearer_token,
    consumer_key = consumer_key,
    consumer_secret = consumer_secret,
    access_token = access_token,
    access_token_secret = access_token_secret,
    wait_on_rate_limit = True
)

With wait_on_rate_limit = True the client will swallow all throttling exceptions and wait for the API to cool down.

Generate reply using openAI:

def generate_tweet_reply_message(tweet_url, post_url, lang = 'en'):
    prompt = f"Reply to tweet {tweet_url}. Reply must include link to article {post_url} and engage to follow @TheSameTech{' using ' + lang + ' language' if lang != 'en' else ''}. Don't exceed {str(MAX_TWITTER_MESSAGE_LENGTH)} chars"
    completions = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=300,
        n=1,
        stop=None,
        temperature=0.7,
    )
    author_id = re.search(r"twitter\.com/([^/]+)/status", tweet_url).group(1)
    # this is one of the weird stuff I noticed - sometimes ChatGPT tags author using their ID, and not account name
    return completions.choices[0].text.replace("\n\n", "").replace(f" @{author_id}", "")

Parameters:

engine: This parameter specifies the ID of the OpenAI language model to use for generating text. In this example, the text-davinci-003 model (ChatGPT-3) is used, which is one of OpenAI's most advanced models.
prompt: This parameter specifies the text prompt to use as input to the language model. In the blog promotion toolkit, the prompt will be the relevant tweet that the ChatGPT algorithm will be generating a response to.
max_tokens: This parameter specifies the maximum number of tokens (words or punctuation marks) that the language model should generate in its response.
n: This parameter specifies the number of responses to generate. In this case, only one response will be generated.
stop: This parameter specifies a sequence of tokens that should be used as a stopping point for the language model's response. In this case, no stopping sequence is specified.
temperature: This parameter controls the randomness of the language model's responses. A higher temperature value will produce more creative and varied responses, while a lower temperature value will produce more predictable and conservative responses. In this example, a temperature of 0.7 is used, which should produce responses that are creative but not too unpredictable.

UPDATE : OpenAI released new gpt-3.5-turbo language model a few days after I wrote this post (Mar 7, 2023). It's priced at $0.002 per 1K tokens, which is 10x cheaper than the existing GPT-3.5 models.

Search tweets:

twitter_metadata = extract_twitter_metadata(post_url, tagsSelector='.post-tags')
        keywords, hash_tags, description, *rest = twitter_metadata
        combos = list(combinations(hash_tags, 2))
        for combo in(combos):
            hash_tags_string = " ".join(combo)
            latest_tweet_id = get_latest_activity(hash_tags_string)
            if latest_tweet_id is None and search_days_ago is not None:
                start_time=(datetime.datetime.now() - datetime.timedelta(days=search_days_ago)).strftime("%Y-%m-%dT%H:%M:%SZ")
            tweets = search_recent_tweets_with_pagination(query=hash_tags_string, max_results = 100, start_time=start_time, latest_tweet_id=latest_tweet_id, tweet_fields=['id', 'author_id', 'created_at', 'in_reply_to_user_id', 'lang'])
            # do other stuff (filtering, generating reply, promoting)

After Twitter metadata is extracted from HTML, I use hashtags as search criteria to find relevant tweets. Usually, every blog post has about 4–5 tags associated with it. If I pass down all of them together, most likely Twitter API will return nothing — that’s why I combine them by pairs and use every pair as a search key (so every blog post will generate from 6 to 10 search requests in general).

Running locally

With SAM, you don’t need to do a lot. Only execute 3 commands and you are good to go. If you want to run the local version of DynamoDB, please follow the README instructions in my GitHub repo.

sam build
sam local start-lambda
aws lambda invoke --function-name "EngageTweets" --endpoint-url "http://127.0.0.1:3001" --no-verify-ssl out.txt

It’s the same simple using Python executable:

cd src/engage-tweets-lambda
pip install -r requirements.txt
source env/bin/activate   
TABLE_NAME=PromotedTweets python engage_tweets.py

Just make sure you have all tokens stored in your local .env file.

Deploying to AWS

One command you need to know:

sam deploy

Limitations

Twitter API rate limit: Twitter API has a rate limit that limits the number of API calls that can be made per user per 15-minute window. This means that if a user exceeds the rate limit, they may not be able to make any further API calls until the limit resets (but we already got this covered — see previous section).
Twitter monthly limit to search tweets: Twitter limits the number of tweets that can be searched in a given month. The current limit is 500,000 tweets per month per developer account, which is also subject to change.
OpenAI API is not free: The OpenAI API charges based on the number of requests and responses (tokens) sent and received. While we have obtained a free trial package of $18, continued use of the API will require payment based on usage.
Recent Twitter API changes: Twitter has recently made changes to its API that impact the availability of certain features and data. For example, as of February 2022, Twitter has suspended access to the user profile and follower count endpoints, making it difficult to determine the size of a tweet author’s audience. This may impact the effectiveness of our blog promotion toolkit.
Twitter’s recent announcement regarding automation: Twitter has announced new rules regarding automation on the platform, aimed at preventing spam and abusive behaviour. While the full extent of these changes is not yet clear, they may impact the functionality of our blog promotion toolkit. We will continue to monitor developments and adjust our approach as needed.

These limitations highlight the challenges involved in building and maintaining a blog promotion toolkit that relies on external APIs and platforms. Despite these limitations, I believe that our toolkit can still be effective in promoting our content and engaging with our audience. By staying informed and adapting to changes as they arise, we can continue to use these tools to achieve our goals.

Final words

In conclusion, our blog promotion toolkit is designed to help content creators reach new audiences by engaging with relevant Twitter users. By leveraging the power of AI language models and Twitter’s API, we are able to generate personalized replies to tweets that mention our blog post’s relevant content in the context of the target tweet message. However, it’s important to note that there are limitations to this approach, including Twitter API rate limits and monthly search limits, as well as the cost of using OpenAI’s API for language processing.

Overall, I believe that this toolkit can be a valuable asset for bloggers and content creators looking to promote their work on social media. I look forward to continuing to refine and improve this toolkit, and we welcome any feedback or suggestions from the community. Thank you for considering my toolkit, and I hope it can help you achieve your content promotion goals!

Originally published at https://thesametech.com on March 3, 2023.

You can also follow me on Twitter and connect on LinkedIn to get notifications about new posts!

DEV Community