Scrape Bing Search Engine Results Page

bingLogo_5F00_lgBased on my last post for scraping the Google SERP I decided to make the small change to scrape the organic search results from Bing.

I wasn’t able to find a way to display 100 results per page in the Bing results so this script will only return the top 10. However it could be enhanced to loop through the pages of results but I have left that out of this code.

Example Usage:

$ python BingScrape.py
http://twitter.com/halotis
http://www.halotis.com/
http://www.halotis.com/progress/
http://doi.acm.org/10.1145/367072.367328
http://runtoloseweight.com/privacy.php
http://twitter.com/halotis/statuses/2391293559
http://friendfeed.com/mfwarren
http://www.date-conference.com/archive/conference/proceedings/PAPERS/2001/DATE01/PDFFILES/07a_2.pdf
http://twitterrespond.com/
http://heatherbreen.com

Here’s the Python Code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# (C) 2009 HalOtis Marketing
# written by Matt Warren
# http://halotis.com/
 
import urllib,urllib2
 
from BeautifulSoup import BeautifulSoup
 
def bing_grab(query):
 
    address = "http://www.bing.com/search?q=%s" % (urllib.quote_plus(query))
    request = urllib2.Request(address, None, {'User-Agent':'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'} )
    urlfile = urllib2.urlopen(request)
    page = urlfile.read(200000)
    urlfile.close()
 
    soup = BeautifulSoup(page)
    links =   [x.find('a')['href'] for x in soup.find('div', id='results').findAll('h3')]
 
    return links
 
if __name__=='__main__':
    # Example: Search written to file
    links = bing_grab('halotis')
    print '\n'.join(links)