Sometimes it’s useful to know where all the back-links to a website are coming from.

If you’re competing with a site, its back-links show how that site is being promoted. Seeing who links to your competitors is a shortcut to finding good places to get links from, and to spotting potential clients or useful contacts for your business.

If you’re buying or selling a website, the number and quality of its back-links help determine its value. Checking the links to a site should be on your checklist when buying a website.

With that in mind I wrote a short script that scrapes the links to a particular domain from the list that Alexa provides.

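import urllib2
from BeautifulSoup import BeautifulSoup

# NOTE: the line that built `url` is missing from the original listing.
# This pattern is an assumed placeholder; substitute the real address of
# Alexa's links-in list, with slots for the page number and the domain.
LINKSIN_URL = 'http://www.alexa.com/site/linksin;%d/%s'

def get_alexa_linksin(domain):
    """Return (href, text) tuples for every link in Alexa's links-in list."""
    page = 0
    linksin = []
    while True:
        url = LINKSIN_URL % (page, domain)
        req = urllib2.Request(url)
        html = urllib2.urlopen(req).read()
        soup = BeautifulSoup(html)
        container = soup.find(id='linksin')
        if container is None:
            # No links-in section on this page, so there is nothing more to read.
            break
        # A link with class "next" means there is another page of results.
        next_link = container.find('a', attrs={'class': 'next'})
        linksin += [(link['href'], link.string) for link in container.findAll('a')]
        if next_link:
            page += 1
        else:
            break
    return linksin

if __name__ == '__main__':
    linksin = get_alexa_linksin('halotis.com')
    print linksin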
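
The script above targets Python 2 (urllib2 and BeautifulSoup 3). For readers on Python 3, here is a minimal sketch of just the parsing step, using only the standard library's html.parser and run against an inline HTML sample rather than a live page. The names LinksInParser, parse_linksin and SAMPLE are hypothetical, and the sample markup is an assumption that only mirrors the two things the script relies on: a container with id "linksin" and a paging link with class "next". Unlike the script, this sketch leaves the "next" link out of the results and reports it as a separate flag.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {'br', 'hr', 'img', 'input', 'link', 'meta'}


class LinksInParser(HTMLParser):
    """Collect (href, text) pairs for <a> tags inside the id='linksin' element."""

    def __init__(self):
        super().__init__()
        self.links = []        # collected (href, text) tuples
        self.has_next = False  # True once a link with class "next" is seen
        self._depth = 0        # nesting depth inside the container (0 = outside)
        self._href = None      # href of the result link currently open, if any
        self._text = []        # text fragments of the link currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._depth == 0:
            # Outside the container: only watch for the container itself.
            if attrs.get('id') == 'linksin':
                self._depth = 1
            return
        if tag not in VOID_TAGS:
            self._depth += 1
        if tag == 'a' and 'href' in attrs:
            if 'next' in (attrs.get('class') or '').split():
                self.has_next = True
            else:
                self._href = attrs['href']
                self._text = []

    def handle_endtag(self, tag):
        if self._depth == 0 or tag in VOID_TAGS:
            return
        if tag == 'a' and self._href is not None:
            self.links.append((self._href, ''.join(self._text).strip()))
            self._href = None
        self._depth -= 1

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)


def parse_linksin(html):
    """Return (links, has_next) for one page of HTML."""
    parser = LinksInParser()
    parser.feed(html)
    parser.close()
    return parser.links, parser.has_next


# Assumed sample markup, not copied from any real page.
SAMPLE = """
<div id="other"><a href="http://ignored.example/">outside</a></div>
<div id="linksin">
  <ol>
    <li><a href="http://a.example/page">a.example</a></li>
    <li><a href="http://b.example/">b.example</a><br></li>
  </ol>
  <a class="next" href="?page=1">Next</a>
</div>
<a href="http://after.example/">after</a>
"""

if __name__ == '__main__':
    links, has_next = parse_linksin(SAMPLE)
    print(links)
    print(has_next)
```

In a paging loop like the one in the script, the has_next flag would play the role of the check for the "next" link: keep requesting pages while it is true and stop when it is false.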