sheeshmohsin Pycon Funnel Statistics 20131101    Posted:


It is a web application used to calculate statistics about the candidates who applied for talks. The application is built with Flask and Jinja; Python code scrapes and calculates the statistics from the PyCon Funnel 2013 website.


Download the app using this command:

git clone

and run the app using this command:


Packages Required

Flask, subprocess, jinja2, bs4 (beautifulsoup4), urllib2
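The scraping step described above can be sketched roughly as follows. This is a hypothetical illustration using only the standard library (the actual app uses urllib2 and BeautifulSoup), and the "/proposal/" href pattern is an assumption for illustration, not taken from the Funnel site:

```python
# Hypothetical sketch: count talk-proposal links in a fetched funnel page.
# The "/proposal/" pattern below is an assumption, not the real URL scheme.
from html.parser import HTMLParser

class ProposalCounter(HTMLParser):
    """Counts <a> tags whose href looks like a talk-proposal link."""
    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and "/proposal/" in (value or ""):
                    self.count += 1

page = ('<a href="/2013/proposal/42">Talk A</a>'
        '<a href="/about">About</a>'
        '<a href="/2013/proposal/43">Talk B</a>')
counter = ProposalCounter()
counter.feed(page)
print(counter.count)  # 2
```

In the real app the page would be fetched over HTTP and the counts rendered through a Flask/Jinja template.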

Team Name:-

"Triple S"

Team Members:-

  1. Shalini Roy
  2. Shantanu Sarkar
  3. Sheesh Mohsin


iamsudip vid2mp3 v0.1.1 release 20131011    Posted:

Commandline tool to extract audio from online flash videos as mp3.

It communicates with an online conversion service for this purpose. Several popular video sites are also supported by this tool.


This is an early release, so it is better to try it in a virtual environment; if you want to install it in your system, it requires su/superuser access.

To install the script, do this:

Option 1 : Install via pip

$ sudo pip install vid2mp3

Option 2 : If you have downloaded the source

$ sudo python setup.py install


#!/usr/bin/env python

'''
Commandline tool to extract audio from online flash videos as mp3.
It communicates with an online conversion service for this purpose.
Several popular video sites are also supported by this tool.
'''

__appname__ = "vid2mp3"
__version__ = "0.1.1"
__author__ = "iamsudip <>"
__license__ = "MIT"

import requests
import sys, os
from bs4 import BeautifulSoup
import time

def download(data):
    try:
        with open("%s" % data[1], "wb") as fobj:
            print "File size: " + data[2]
            response = requests.get(data[0], stream=True)
            fsrc = response.raw
            size = response.headers.get("content-length")
            length = 16*1024
            while True:
                buf = fsrc.read(length)
                if not buf:
                    break
                fobj.write(buf)
                sys.stdout.write("Downloaded " + str(os.path.getsize(data[1])/1024) + " kb of " + str(int(size)/1024) + " kb\r")
                sys.stdout.flush()
            print "Download complete."
            sys.exit(0)
    except IOError:
        data[1] = "Rename_manually.mp3"
        download(data)

def process_url(s_url, sleep_time=10):
    try:
        s_url_soup = BeautifulSoup(requests.get(s_url).text, "xml")
        data = [str(s_url_soup.downloadurl.contents[0]),
            str(s_url_soup.file.contents[0]),
            str(s_url_soup.filesize.contents[0])
        ]
        if not data[0]:
            raise AttributeError
        print "Downloading: %s" % data[1]
        print "Fetching data from: %s" % data[0]
        return data
    except AttributeError:
        time.sleep(sleep_time)
        sleep_time += 10
        return process_url(s_url, sleep_time)

def make_url(url):
    try:
        # conversion-service URL is missing from the original post
        response = requests.post("", data={"mediaurl": url})
        s_url = eval(response.text)['statusurl'].replace('\/', '/')
        data = process_url(s_url)
        download(data)
    except KeyError:
        print "Please check the given link.\n Exiting"
        sys.exit(1)

if __name__ == '__main__':
    if len(sys.argv) == 2:
        make_url(sys.argv[1])
    else:
        print "usage:\n vid2mp3 [link-to-video]"

How to Use

Use it as

$ vid2mp3 [video-link]



  • First stable release.
  • Supports several popular video-hosting sites.
  • Downloads the audio file(.mp3) into present working directory.


  • More user friendly and verbose.
  • Fixed wrong url exception.

Reporting Bugs

Please report bugs at github issue tracker:


tenstormavi Update 20131010    Posted:

Team name:-


Things done:-

The code for storing the data from the client is done. Currently reading about how to send the stored data to the server.

Things left to do :-

Code for sending the stored data to the server still needs to be written.
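Sending the stored data could look something like the following minimal sketch. The post does not say which protocol the team plans to use, so the plain-TCP-plus-JSON approach, the function name, and the record shape here are all assumptions:

```python
# Hypothetical sketch: serialize stored records as JSON and push them to the
# server over a TCP socket. Protocol and record shape are assumptions.
import json
import socket

def send_stored_data(records, host, port):
    """Send the list of stored records to (host, port) as one JSON payload."""
    payload = json.dumps(records).encode("utf-8")
    sock = socket.create_connection((host, port))
    try:
        sock.sendall(payload)
    finally:
        sock.close()
```

The server side would read until EOF and decode the JSON to recover the records.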


No such problems till now. Reading more to make the project better.

Team members :-

  1. tenstormavi
  2. priyanka
  3. iamsudip
  4. sonam


iamsudip pysub-dl v0.2.6 20131004    Posted:

Subtitle downloader

At the moment it only supports downloading the subtitles.

It is an early release but works well. So, do not forget to upgrade it frequently.


Now it is a stable release; to install it in your system you need su/superuser access.

To install the script, do this:

Option 1 : Install via pip

$ sudo pip install pysub-dl

Option 2 : If you have downloaded the source

$ sudo python setup.py install

Do not forget to upgrade it every week or so; I am modifying many things very frequently:

$ sudo pip install pysub-dl --upgrade

How to use

Use it like: pysub-dl movie [language]

[language] is optional; if a language is provided you will see only the subtitles available for that language, otherwise it will show all of them.

For help

$ pysub-dl --help

General usage

$ pysub-dl iron-man english

$ pysub-dl hitman spanish

If you do not provide any language it will show you all the available subtitles for that movie for all languages.

One request: do not use spaces in the movie name; use '-' (hyphen) instead of ' ' (white space). The movie name is not case sensitive, so spaces are all you have to worry about. If you want to use special characters like ' you can wrap the name in double quotes, i.e.

$ pysub-dl "we're the millers"
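The naming rule above (hyphens for spaces, case-insensitive) can be illustrated with a small hypothetical helper; this is not part of pysub-dl itself, just a sketch of what the tool expects you to type:

```python
# Hypothetical helper illustrating the movie-name rule described above;
# pysub-dl itself simply expects the hyphenated, case-insensitive form.
def normalize_title(title):
    """Lowercase the title and replace runs of whitespace with hyphens."""
    return "-".join(title.lower().split())

print(normalize_title("Iron Man"))           # iron-man
print(normalize_title("We're The Millers"))  # we're-the-millers
```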

It is best to change to your movie directory first and then execute the command, because the subtitle file(s) are downloaded into the present working directory; this is convenient but not mandatory.

On execution it will show the list of subtitles currently available for that particular movie. You have to choose one number from the list.

If you happen to give a wrong movie name, no problem: it will show you the probable movie names and you can choose from them.

Windows users should download the package manually, extract it, and run it from that directory as

C:\master> python pysub-dl incredibles english

Press [Ctrl]+C or [Ctrl]+D to exit immediately.

Reporting Bugs

Please report bugs at github issue tracker:


sheeshmohsin dup_images 20130815    Posted:

To write a Python script to find duplicate images in different directories.

The code can be run by:-

$ python <1st_dir_path> <2nd_dir_path>

or, if made executable,

$./ <1st_dir_path> <2nd_dir_path>


This script is useful for finding duplicate images present in different directories; the paths of the directories are passed as command-line arguments.
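One possible approach can be sketched as below. This is a minimal illustration, not necessarily the author's implementation (the post does not show the code): hash every file in both directories and report the paths that share an MD5 digest.

```python
# Sketch of one possible approach (not necessarily the author's):
# hash every file in both directories, then group paths by MD5 digest.
import hashlib
import os

def file_md5(path):
    """Return the MD5 hex digest of a file, read in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            md5.update(chunk)
    return md5.hexdigest()

def find_dups(dir_a, dir_b):
    """Group files from both directories by digest; return duplicate groups."""
    seen = {}
    for directory in (dir_a, dir_b):
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if os.path.isfile(path):
                seen.setdefault(file_md5(path), []).append(path)
    return [paths for paths in seen.values() if len(paths) > 1]
```

Hashing file contents catches byte-identical copies; comparing exif data instead (as the other dup_images posts below do) also catches re-saved copies of the same photo.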


Christina-B Dummy project 20130815    Posted:

Problem: Write a Python program to make a package and upload it to PyPI.

Download the package

Click the link, or use $ pip install Mypackge


Once the package is installed, execute this command in a terminal:

$ shell

A prompt will open up in which we can run two commands: greet and stock.


To exit use "Ctrl+d"
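A prompt with a couple of commands like that can be sketched with the standard-library cmd module. This is a hypothetical illustration; the actual internals of Mypackge are not shown in the post, so the command bodies and names below are placeholders:

```python
# Hypothetical sketch of a greet/stock prompt using the stdlib cmd module;
# the real Mypackge internals are not shown in the post.
import cmd

class MyShell(cmd.Cmd):
    prompt = "(mypackage) "

    def do_greet(self, line):
        """greet [name] -- print a greeting."""
        print("Hello, %s!" % (line or "world"))

    def do_stock(self, line):
        """stock -- print a placeholder stock report."""
        print("stock: no data (placeholder)")

    def do_EOF(self, line):
        """Exit on Ctrl+D, as described above."""
        return True

if __name__ == "__main__":
    # The interactive tool would run MyShell().cmdloop();
    # onecmd() dispatches a single command the same way the loop would.
    MyShell().onecmd("greet world")
```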

Github link: link


Christina-B Userfinder 20130812    Posted:

Problem: A Python program to find out the users who can log in to a Linux system.

Package used: pwd

Description:

  1. Get all the user entries using the getpwall() function.
  2. Display the users who can log in using the information provided.

Link: link

$ python
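The approach can be sketched as follows. Since the original script is only linked, this is an assumption about how "can log in" is decided: here an account counts as a login user if its shell is not one of the usual non-login shells.

```python
#!/usr/bin/env python
# Hypothetical sketch: list accounts that can log in, assuming "can log in"
# means the login shell is not a nologin/false shell. The shell blacklist
# below is an assumption, not taken from the linked script.
import pwd

NON_LOGIN_SHELLS = ("/sbin/nologin", "/usr/sbin/nologin",
                    "/bin/false", "/usr/bin/false")

def login_users():
    """Return names of accounts whose shell allows an interactive login."""
    return [entry.pw_name for entry in pwd.getpwall()
            if entry.pw_shell not in NON_LOGIN_SHELLS]

if __name__ == "__main__":
    for name in login_users():
        print(name)
```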


JCaselles dup_images 20130801    Posted:



To write a command-line tool to find duplicate images, images being jpeg files with exif information.


Use Pillow (formerly PIL) to read the exif information of every image, and compare the values to find duplicates across the given directories.


  • pillow module


Link to GitHub


#!/usr/bin/env python

"""Finds duplicate images in the specified directories."""

from sys import argv, exit
from PIL import Image
from os import listdir, walk
from os.path import join
from argparse import ArgumentParser

def print_duplicates(im_dict):
    """Prints, from all entries in the im_dict dictionary, those keys which
    have more than one value, i.e. more than one location for the same exif
    information.
    """
    if im_dict:

        print "\n\nDuplicated images:\n------------------\n\n"

        theres_any = False

        for x in im_dict.values():

            if len(x) > 1:
                theres_any = True
                print "Duplicated instances:"

                for w in range(len(x)):
                    print " - %s" % x[w]

                print "\n"

        if not theres_any:
            print "No duplicated images found. Hurra!!\n\n"

    else:
        print "\n\nNo images in these directories!\n\n"

def get_exif(dirs):
    """Returns a dictionary with all the ~.jpg files found in the specified
    directories of the dirs list. Format:
    {"exif_values" : "[path1, <path2, ...>]"}

    We create a nested loop as follows: for each filename in dir_content,
    which is each list generated by listdir() of the given list of directories,
    we check if the file is .jpg or related, open it and get the exif
    information. We don't need to read it, therefore no need for TAGS.
    _getexif() returns a dictionary, which can't be a dictionary key as it is
    mutable. Therefore, we generate a string with all exif values concatenated.

    Then we add the dictionary entry using setdefault, which helps in our
    purpose: if the key exists, it appends the new value to the existing values
    of that key; if the key doesn't exist, it creates a new entry. Therefore we
    can know which files are duplicates: those with more than one value in
    their entry.
    """
    exifpath_dict = {}  # returned dict

    for dir_content, x in [(listdir(x), x) for x in set(dirs)]:
        # Using set(dirs) to prevent duplicate dirs

        for filename in dir_content:

            exif_string = ""

            if filename.endswith(".jpg") or filename.endswith(".jpeg") \
                    or filename.endswith(".JPG") or filename.endswith(".JPEG"):

                try:
                    exif_info = Image.open(join(x, filename))._getexif()

                except IOError, e:
                    print "**ERROR** Error opening %s: %s" % (filename, e)

                else:
                    try:
                        for z in exif_info.keys():
                            exif_string += str(exif_info[z])

                    except AttributeError:
                        print "**WARNING** %s doesn't have exif " \
                              "info or is corrupted, skipping it" % filename

                    else:
                        exifpath_dict.setdefault(exif_string, []).append(join(x, filename))

    return exifpath_dict

if __name__ == "__main__":

    parser = ArgumentParser()

    parser.add_argument("dirs",  # argument name reconstructed; missing from the post
                        help = "Directories to search for duplicates",
                        nargs = "+")

    args = parser.parse_args()

    print_duplicates(get_exif(args.dirs))


rahulc93 dup_images 20130801    Posted:

In this assignment, we pass some paths as command-line arguments, and the script tells us whether duplicates exist for any image file.

Source Code

The source code for the above problem can be found here


The solution to the problem is shown below.

#!/usr/bin/env python

import os
import sys
import fnmatch
from PIL import Image

def get_exif(path, image):
    if path[-1] != '/':
        path += '/'
    img = Image.open(path + image)  # adding the file-name to the 'path' string and creating an Image object
    exif_data = img._getexif()  # obtaining exif data for the 'img' object
    return exif_data  # return the data obtained

dir_list = sys.argv[1:]  # list of directory paths given as input
for path in dir_list[:]:  # iterating over a copy of 'dir_list', since invalid paths are removed below
    if not os.path.exists(path):  # 'path' is not a valid path
        print "Invalid path: %r" % path  # display error message
        dir_list.remove(path)  # remove the invalid path from 'dir_list'

jpeg_files = []  # list of all the jpeg files
for count, path in enumerate(dir_list):  # iterating through 'dir_list'
    jpeg_files.append([path])  # adding the 'path' to 'jpeg_files' as a list
    for fl in os.listdir(path):  # iterating through the directory entries in 'path'
        if fnmatch.fnmatch(fl, '*.JPG'):  # a jpg file has been encountered
            jpeg_files[count].append([fl])  # add the filename to 'jpeg_files'

for entries in jpeg_files:  # scanning the 'jpeg_files' list
    if len(entries) == 1:  # no jpg files present in a particular entry
        jpeg_files.remove(entries)  # remove the unwanted entry

for entries in jpeg_files:  # iterate through the list entries of 'jpeg_list'
    for files in entries[1:]:  # iterate through the file names in 'entries'
        if not get_exif(entries[0], files[0]):  # no exif data for 'files' in path 'entries[0]'
            entries.remove(files)  # remove such a file entry from 'entries'

dup_images = []  # contains information about duplicate images of 'file1'
count = 0  # number of files without duplicates
for index1, list1 in enumerate(jpeg_files):  # iterating through list entries of 'jpeg_files'
    for file1 in list1[1:]:  # iterating through the members of 'list1'
        dup_images.insert(count, [file1])  # adding the name of 'file1' to 'dup_images'
        dup_images[count].append(list1[0])  # adding the location of 'file1' to 'dup_images'
        for index2, list2 in enumerate(jpeg_files):  # iterating through list entries of 'jpeg_files'
            for file2 in list2[1:]:  # iterating through members of 'list2'
                if not cmp(get_exif(list1[0], file1[0]), get_exif(list2[0], file2[0])) and jpeg_files[index1][list1.index(file1)] != jpeg_files[index2][list2.index(file2)]:
                    # exif data of 'file1' and 'file2' are same, and they are not the same files
                    dup_images[count].append(list2[0])  # adding the location of 'file2' to 'dup_images'
                    jpeg_files[index2].remove(file2)  # remove 'file2' from 'jpeg_files[index2]'
        jpeg_files[index1].remove(file1)  # remove 'file1' from 'jpeg_files[index1]'
        count+=1  # increment count by 1

for entries in dup_images:  # iterate through the members of 'dup_images'
    if len(entries) == 2:  # no duplicate images found in 'entries'
        dup_images.remove(entries)  # remove the above entry

if len(dup_images) == 0:  # no duplicate image found for the paths provided by the user
    print 'No duplicate images found'  # display the message
else:  # some duplicate files are present
    print '%d files with duplicate copies found.\n' % len(dup_images)  # number of duplicate files present
    for count, entries in enumerate(dup_images):  # iterate through members of 'dup_images'
        print 'File %d: %r' % (count+1, entries[0][0])  # print file name which has duplicate images
        for index, location in enumerate(entries[1:]):  # iterating through locations
            print 'Location %d: %r' % (index+1, location)  # print the paths to duplicate images
        print ""  # print a newline

Run the Code

To run the code, save it and follow these steps.

  1. Change the file's permissions and make it executable:

    $ chmod +x
  2. Execute the script:

    $ ./ <path-1> <path-2> ... <path-n>

Alternatively, you can try:

$ python <path-1> <path-2> ... <path-n>


where <path-n> represents the n-th /path/to/be/scanned


ThyArmageddon Duplicate Image Finder 20130801    Posted:

Duplicate Image Finder

This script will take any number of directories, search them recursively and return the duplicate images with their locations. The dup_images script can be found on GitHub.


#!/usr/bin/env python2
"""This Python script will scan one or more directories and return
the duplicate images in them based on their md5sum.
"""

import os
import sys
import Image
import hashlib

def md5sum(img):
    """md5sum returns the md5sum of the file given to it."""
    md5 = hashlib.md5()
    with open(img, "rb") as f:
        # chunk-size multiplier reconstructed; the original value was lost
        for chunk in iter(lambda: f.read(128 * md5.block_size), b''):
            md5.update(chunk)
    return md5.hexdigest()

def find_files(path):
    """The following function will take a path,
    search it recursively and find all the images
    inside the directories inside. Then it will find
    the md5 checksum of each image and yield it in a
    {md5: path} dictionary.
    """
    for root, dirs, files in os.walk(path):
        if files:
            for _file in files:
                try:
                    Image.open(root + "/" + _file)  # raises IOError for non-images
                    md5 = md5sum(root + "/" + _file)
                    yield {md5: root + "/" + _file}

                except IOError:
                    pass

def parse_paths(paths):
    """This function will take a list of paths and check if the paths
    are valid; if not, it will drop them.
    """
    for path in paths:
        if os.path.isdir(path):
            file_list = find_files(path)
            yield file_list

        else:
            print path + " is not a valid path"

def find_dups(_list):
    """This function will search through the dictionaries
    comparing md5 checksums in search of duplicates.
    If it finds duplicates it will output the duplicates'
    path and filename.
    """
    md5_list = {}
    print "Duplicate images are:"
    for items in _list:
        for item in items:
            for _md5 in item:
                if _md5 in md5_list:
                    print "'" + item[_md5] + "' and '" + \
                        md5_list[_md5] + "' are duplicates"
                else:
                    md5_list[_md5] = item[_md5]

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print "Please provide at least one directory"
    else:
        file_list = parse_paths(sys.argv[1:])
        find_dups(file_list)

Contents © 2013 dgplug - Powered by Nikola