Sometimes it’s useful to know where all the back-links to a website are coming from.
As a competitor, it tells you how your competition is promoting their site. By finding out who links to them, you can shortcut the process of finding good places to get links from, and discover who might be a client or a useful contact for your business.
If you're buying or selling a website, the number and quality of back-links helps determine the site's value, so checking the links to a site should be on the checklist you use when buying a website.
With that in mind, I wrote a short script that scrapes the back-links to a particular domain from the list that Alexa provides.
import urllib2
from BeautifulSoup import BeautifulSoup

def get_alexa_linksin(domain):
    """Scrape the "Sites Linking In" list that Alexa publishes for a domain.

    Returns a list of (href, anchor text) tuples.
    """
    page = 0
    linksin = []
    while True:
        # Alexa paginates the results; the page number goes after the semicolon
        url = 'http://www.alexa.com/site/linksin;' + str(page) + '/' + domain
        req = urllib2.Request(url)
        html = urllib2.urlopen(req).read()
        soup = BeautifulSoup(html)

        # Collect every link in the results block on this page
        results = soup.find(id='linksin')
        linksin += [(link['href'], link.string) for link in results.findAll('a')]

        # Keep going as long as there is a "next" pagination link
        next_link = results.find('a', attrs={'class': 'next'})
        if next_link:
            page += 1
        else:
            break
    return linksin

if __name__ == '__main__':
    linksin = get_alexa_linksin('halotis.com')
    print linksin
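Since get_alexa_linksin() returns a plain list of (href, anchor text) tuples, it's easy to push the results somewhere more useful than the console. Here's a minimal sketch that writes them out to a CSV file you could open in a spreadsheet; the save_linksin_csv name and the output filename are just examples, not part of the script above:

import csv

def save_linksin_csv(linksin, filename):
    # linksin is the list of (href, anchor text) tuples
    # returned by get_alexa_linksin()
    with open(filename, 'wb') as f:  # 'wb' mode for the csv module under Python 2
        writer = csv.writer(f)
        writer.writerow(['URL', 'Anchor Text'])
        writer.writerows(linksin)

save_linksin_csv(get_alexa_linksin('halotis.com'), 'linksin.csv')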