Case Study: Hong Kong Protest Movement 2019

In this example, we will be extracting tweets related to the Hong Kong Protest Movement 2019, which I have written an analysis on. The code can be configured to suit your own needs.

The first order of business was to obtain the tweets. I considered and tried out tools such as Octoparse, but they either only supported Windows (I am using a MacBook), were unreliable, or only allowed you to download a certain number of tweets unless you subscribed to a plan. In the end, I threw these ideas into the bin and decided to do it myself.

I tried out a few Python libraries and decided to go ahead with Tweepy. Tweepy was the only library that did not throw any errors in my environment, and it was quite easy to get things going. One downside is that I couldn't find any documentation that tells you the parameter values for pulling certain metadata out of a tweet. I only managed to get most of the ones I needed after a few rounds of trial and error.

Prerequisites: Setting up a Twitter Developer Account

Before you start using Tweepy, you need a Twitter Developer Account in order to call Twitter's APIs. Just follow the instructions and after some time (only a few hours for me), they will grant you access. After you have been granted access and created an app, you can view the app's page. You need 4 pieces of information ready: API key, API secret key, access token, and access token secret.

Switch over to Jupyter Notebook and import the following libraries:

    from tweepy import OAuthHandler
    from tweepy.streaming import StreamListener

Then enter your credentials and hand them to Tweepy:

    # Twitter credentials
    # Obtain them from your twitter developer account
    consumer_key = "YOUR_API_KEY"
    consumer_secret = "YOUR_API_SECRET_KEY"
    access_key = "YOUR_ACCESS_TOKEN"
    access_secret = "YOUR_ACCESS_TOKEN_SECRET"

    # Pass your twitter credentials to tweepy via its OAuthHandler
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)

If you run into any authentication errors, regenerate your keys and try again.

A basic, free developer account can only make a limited number of API calls (~900 calls every 15 minutes before your access is denied). Because of this, I created a function that extracts 2,500 tweets per run, once every 15 minutes (I tried to extract 3,000 and above, but that got me denied after the second batch). The function takes four parameters:

search_words: search parameters such as keywords and hashtags.
date_since: starting date, after which all tweets are extracted (you can only extract tweets that are not older than 7 days).
numTweets: number of tweets to extract per run.
numRuns: number of runs, one of which happens every 15 minutes.

You may explore the list of metadata available from the tweepy.Cursor object in detail (this is the really messy part). I only extracted the metadata that I deemed relevant to my case.

    def scraptweets(search_words, date_since, numTweets, numRuns):
        # Define a for-loop to generate tweets at regular intervals
        # We cannot make large API calls in one go.
        ...
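To make the batching idea concrete, here is a minimal sketch of how a function that pulls a fixed number of tweets per run and waits 15 minutes between runs could be structured. It is not the original implementation. The names `scrape_in_batches`, `make_tweepy_fetcher`, `fetch_batch`, and the dictionary keys are hypothetical and chosen for illustration. The Tweepy part assumes that `tweepy.Cursor` is used with the standard search method (`API.search_tweets` in Tweepy 4, `API.search` in earlier versions), that the start date is passed through the `since:` search operator, and that your account has access to that endpoint.

```python
import time
from typing import Callable, Dict, Iterable, List


def scrape_in_batches(
    fetch_batch: Callable[[int], Iterable[Dict]],
    num_tweets: int,
    num_runs: int,
    wait_seconds: float = 15 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Call fetch_batch(num_tweets) num_runs times, pausing between runs."""
    rows: List[Dict] = []
    for run in range(num_runs):
        rows.extend(fetch_batch(num_tweets))
        # Wait out the rate-limit window, but not after the final run.
        if run < num_runs - 1:
            sleep(wait_seconds)
    return rows


def make_tweepy_fetcher(api, search_words: str, date_since: str):
    """Build a fetch_batch callable backed by Tweepy (assumed usage, see text)."""
    import tweepy  # imported lazily so the batching logic runs without Tweepy

    # Tweepy 4 renamed API.search to API.search_tweets; accept either.
    search = getattr(api, "search_tweets", None) or api.search
    # Assumption: the start date is expressed with the "since:" search operator.
    query = f"{search_words} since:{date_since}"

    def fetch_batch(num_tweets: int):
        cursor = tweepy.Cursor(search, q=query, lang="en", tweet_mode="extended")
        for tweet in cursor.items(num_tweets):
            # Keep only a handful of metadata fields (an illustrative choice).
            yield {
                "username": tweet.user.screen_name,
                "created_at": tweet.created_at,
                "text": tweet.full_text,
                "retweets": tweet.retweet_count,
                "favorites": tweet.favorite_count,
            }

    return fetch_batch


# Hypothetical usage, given the `auth` object created earlier:
# api = tweepy.API(auth, wait_on_rate_limit=True)
# fetch = make_tweepy_fetcher(api, "#hongkong", "2019-11-03")
# rows = scrape_in_batches(fetch, num_tweets=2500, num_runs=6)
```

Separating the waiting loop from the Tweepy call keeps the rate-limit logic easy to test with a fake fetcher. Successive runs may return overlapping tweets, so de-duplicating the collected rows afterwards is worth considering.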