JCaselles planetparser-feedparser 20130716-140539

Posted: 2013-07-16 14:05

Parse Planet using feedparser

Request

Repeat the assignment of printing the title and author of each post in planet.fedoraproject.org, but this time using feedparser.

Solution

yolk -l output:

(virt)[manel@manu virt]$ yolk -l
Python          - 2.7.3        - active development (/usr/lib/python2.7/lib-dynload)
distribute      - 0.6.24       - active
feedparser      - 5.1.3        - active
pip             - 1.1          - active
wsgiref         - 0.1.2        - active development (/usr/lib/python2.7)
yolk            - 0.4.3        - active

Code

#!/usr/bin/env python

from sys import exit
import feedparser

url = "http://planet.fedoraproject.org/rss20.xml"


def print_blog_info():
    """
    Parse the RSS feed of planet.fedoraproject.org with feedparser and print
    the title and author of each post.

    feedparser.parse(url) fetches and parses all the content of the feed.
    The posts are stored in the `entries` list of the returned object. Each
    entry behaves like a dictionary from which content is extracted by
    giving the proper key. The key needed here is "title", which the feed is
    expected to format as "Author: Post title", so both the author and the
    title of the post can be extracted from it.
    """
    rss_doc = feedparser.parse(url)

    # enumerate(..., 1) numbers the posts from 1 without a manual counter.
    for number, entry in enumerate(rss_doc.entries, 1):
        # Split at the first colon only, so colons inside the post title
        # are kept; strip() removes the space that follows the colon.
        author, _, title = entry['title'].partition(':')
        # print(...) with one argument runs on both Python 2.7 and Python 3.
        print("""
Blog Entry n. %.2i:
-----------------

Title: '%s'
Author: %s
        """ % (number, title.strip(), author.strip()))


if __name__ == "__main__":
    print_blog_info()
    exit(0)
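
The script above separates author and title by splitting each entry title at a colon. Titles that contain a second colon, or none at all, are worth handling explicitly. The following is a minimal sketch of one way to do that, assuming the feed formats titles as "Author: Post title" as the script expects. The helper name split_author_title and the sample titles are hypothetical, made up for illustration and not taken from the real feed. Treating text without a colon as a bare title is one possible choice, not the only one.

```python
def split_author_title(raw_title):
    """Split 'Author: Post title' at the first colon only.

    Returns an (author, title) tuple. If there is no colon, the author is
    an empty string and the whole text is treated as the title.
    """
    author, sep, title = raw_title.partition(':')
    if not sep:
        # No colon found: assume the text is just a title.
        return '', raw_title.strip()
    return author.strip(), title.strip()


# Made-up sample titles, not taken from the real feed.
samples = [
    "Jane Doe: Fedora 19: first impressions",
    "A title without an author",
]
for raw in samples:
    author, title = split_author_title(raw)
    print("Title: '%s'\nAuthor: %s" % (title, author))
```

Because it works on plain strings, the helper can be tried without network access or feedparser installed. In the script it would be called on each entry's 'title' value.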