


This is a wrapper around publicly/privately available Reddit RSS feeds. It can be used to fetch content from the front page, a subreddit, all comments of a subreddit, all comments of a certain post, comments of a certain Reddit user, search pages, and many more. For more details about the types of RSS feed Reddit provides, refer to these links: link1 and link2.

Note: These feeds are rate limited and hence can only be used for testing purposes. For serious scraping, register your bot at apps to get client details and use it with Praw.

Install via PyPI: pip install reddit-rss-reader

Install from the master branch (if you want to try the latest features): git clone

RedditRSSReader requires a feed URL; refer to the link to generate one, for example to fetch all comments on the subreddit r/wallstreetbets. Now you can run the following example:

    import pprint
    from datetime import datetime, timedelta

    import pytz

    from reddit_rss_reader.reader import RedditRSSReader

    # To consider comments entered in the past 5 days only
    since_time = datetime.now(pytz.utc) + timedelta(days=-5)

    # fetch_content will fetch all contents if no parameters are passed.
    # If `after` is passed then it will fetch contents after this date
    # If `since_id` is passed then it will fetch contents after this id

The reader returns RedditContent objects, which carry the following information (extracted_text and image_alt_text are extracted from the Reddit content via BeautifulSoup): class RedditContent:

The output uses the UTF-8 charset. If you are scraping non-English subreddits, set the environment to use UTF: export LANG=en_US.

I've started to show RSS feed data on my Raspberry Pi Radio using the xscreensaver 'Phosphor' screensaver. With the caveats that (a) I don't know much about Python, (b) I don't want to learn that much about it right now, and (c) I'm not concerned with performance at the moment, the following Python script does the following:

- Downloads an RSS feed from the URL given on the command line.
- Checks a database to see if the title of each feed item is already in the database, and if so, whether it was put in there more than 12 hours ago.
- For titles not already in the database, writes the titles and timestamps to the database.

Here's the admittedly-crappy-but-functional Python source code:

    import time

    current_time_millis = lambda: int(round(time.time() * 1000))
    current_timestamp = current_time_millis()

    # return true if the title is in the database with a timestamp > limit
    def post_is_in_db_with_old_timestamp(title):
        ...

    # if post is already in the database, skip it
    if post_is_in_db_with_old_timestamp(title):
        ...

    # add all the posts we're going to print to the database with the current timestamp
    # (but only if they're not already in there)
    f.write(title + "|" + str(current_timestamp) + "\n")

When run with the Chicago Tribune RSS feed URL shown, the script writes data like the following to its "database" (which is a text file with the fields separated by a | character):

    Bears claim tackle Ola off waivers|1401322649189
    Sox Game Day: Noesi still searching for 1st Sox victory|1401322649189
    Stanley Cup Final to start Wednesday|1401322649189
    Emanuel: No Wrigley Field hearing next month|1401322649189
    Chicago teen Townsend stuns French star in French Open|1401322649189
    Renteria pushing Samardzija for All-Star game|1401322649189
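For readers who want to try the feed de-duplication idea from the blog post (read the titles from an RSS feed, skip titles that were stored in a "title|timestamp" text file more than 12 hours ago, and append titles that are not stored yet), here is a minimal sketch. It is not the original script: the function names, the LIMIT_MS constant, and the use of the standard library's xml.etree.ElementTree for parsing are all assumptions made for illustration, and the skip rule is one reading of the description (titles stored less than 12 hours ago are still displayed).

```python
import time
import xml.etree.ElementTree as ET

LIMIT_MS = 12 * 3600 * 1000  # 12 hours in milliseconds (assumed limit)


def current_time_millis():
    return int(round(time.time() * 1000))


def load_db(path):
    """Read 'title|timestamp' lines into a dict; a missing file is an empty database."""
    entries = {}
    try:
        with open(path, encoding="utf-8") as f:
            for line in f:
                # rpartition splits on the last '|', so a title may itself contain '|'
                title, sep, stamp = line.rstrip("\n").rpartition("|")
                if sep and stamp.isdigit():
                    entries[title] = int(stamp)
    except FileNotFoundError:
        pass
    return entries


def titles_from_rss(xml_text):
    """Return the non-empty <title> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    titles = (item.findtext("title", default="").strip() for item in root.iter("item"))
    return [t for t in titles if t]


def select_and_record(titles, db_path, now_ms=None):
    """Return the titles to display and append unseen titles to the database."""
    now_ms = current_time_millis() if now_ms is None else now_ms
    db = load_db(db_path)
    to_print, to_add = [], []
    for title in titles:
        if title in db and now_ms - db[title] > LIMIT_MS:
            continue  # stored more than 12 hours ago: skip it
        to_print.append(title)
        if title not in db and title not in to_add:
            to_add.append(title)
    with open(db_path, "a", encoding="utf-8") as f:
        for title in to_add:
            f.write(title + "|" + str(now_ms) + "\n")
    return to_print
```

To use it against a live feed, the XML could be fetched with urllib.request.urlopen(url).read() and passed to titles_from_rss, which accepts either text or bytes. Keeping the download separate from the selection logic means the database handling can be exercised without network access.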
