Category Archives: Python

Bit.ly offers a very simple API for creating short URLs. The service can also provide you with some basic click statistics. Unfortunately there are a few missing pieces to the API. To get around that you’ll have to keep a list of bit.ly links you want to track. Depending on your situation you may need to keep some of the information updated regularly and stored locally to do a deeper analysis of your links.

There are a couple of advanced tricks you can use to get more out of your tracking.

  1. Add GET arguments to the end of the URL to split test – If you want to track clicks from different sources that land on the same page, you need to use different links. The easiest way to create two links to the same page is to append a GET argument. So if you wanted to promote my site http://halotis.com and wanted to compare Twitter to AdWords then you could create bit.ly links to http://halotis.com?from=twitter and http://halotis.com?from=adwords. You can add more information with more arguments such as http://halotis.com/?from=adwords&adgroup=group1. If you control the landing page, then you will see those arguments in Google Analytics and will have even more information about who clicked your links.
  2. Look at stats for any bit.ly link, including referring sites, real-time click timelines, and locations, by adding a + to the end of it: http://bit.ly/10HYCo+
  3. Find out which other bit.ly users have shortened a link using the API – google.com bitly info
  4. Use the JavaScript library to grab stats and embed them into a webpage – see the JavaScript code below
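
To make the first trick less error-prone, here is a minimal sketch of how the tagged URLs could be built in Python before handing them to bit.ly. The `add_tracking_args` helper is a hypothetical name used for illustration, not part of any bit.ly library, and the sketch assumes Python 3's `urllib.parse`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_tracking_args(url, args):
    """Return url with the GET arguments in args appended.

    Hypothetical helper: existing arguments on the URL are kept, and an
    empty path is normalized to "/" so the query string attaches cleanly.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(args.items())
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path,
                       urlencode(query), parts.fragment))


# One landing page, two traffic sources to compare.
twitter_url = add_tracking_args("http://halotis.com", {"from": "twitter"})
adwords_url = add_tracking_args("http://halotis.com/?from=adwords",
                                {"adgroup": "group1"})
```

Each resulting URL would then be shortened separately so that every source gets its own bit.ly link and its own click count.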

You can insert click counts next to the bit.ly links on a page with this JavaScript example code. Just update the login and apiKey values and put this in the head section of your webpage:

<script type="text/javascript" charset="utf-8" src="http://bit.ly/javascript-api.js?version=latest&login=YOURBITLYLOGIN&apiKey=YOURAPIKEYGOESHERE"></script>
<script type="text/javascript" charset="utf-8">
	BitlyCB.myStatsCallback = function(data) {
		var results = data.results;
 
		var links = document.getElementsByTagName('a');
		for (var i=0; i < links.length; i++) {
			var a = links[i];
			if (a.href && a.href.match(/^http\:\/\/bit\.ly/)) {
				var hash = BitlyClient.extractBitlyHash(a.href);
				if (results.hash == hash || results.userHash == hash) {
					if (results.userClicks) {
						var uc = results.userClicks + " clicks on this bit.ly URL. ";
					} else {
						var uc = "";
					}
 
					if (results.clicks) {
						var c = results.clicks;
					} else {
						var c = "0";
					}
					c += " clicks on all shortened URLS for this source. ";
 
					var sp = BitlyClient.createElement('span', {'text': " [ " + uc + c + " ] "});
					a.parentNode.insertBefore(sp, a.nextSibling);
				}
			}
 
		};
 
	}
 
	// wait until page is loaded to call API
	BitlyClient.addPageLoadEvent(function(){
		var links = document.getElementsByTagName('a');
		var fetched = {};
		var hashes = [];
		for (var i=0; i < links.length; i++) {
			var a = links[i];
			if (a.href && a.href.match(/^http\:\/\/bit\.ly/)) {
				if (!fetched[a.href]) {
					BitlyClient.stats(BitlyClient.extractBitlyHash(a.href), 'BitlyCB.myStatsCallback');
					fetched[a.href] = true;
				}
			}
		};
 
	});
	</script>

If you want to have a small command line script that can fetch this data from bit.ly and print it then check out this Python script that uses the bitly library which makes it very easy:

import bitly       #http://code.google.com/p/python-bitly/
BITLY_LOGIN = "YOUR_BITLY_LOGIN"
BITLY_API_KEY = "YOUR_BITLY_API_KEY"
 
short_url='http://bit.ly/31IqMl'
 
b = bitly.Api(login=BITLY_LOGIN,apikey=BITLY_API_KEY)
stats = b.stats(short_url)
print("%s - User clicks %s, total clicks: %s" % (short_url, stats.user_clicks, stats.total_clicks))
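
As mentioned at the top, you may want to keep these numbers stored locally so you can analyze them over time. Here is a minimal sketch of one way to do that, assuming you call it with the `user_clicks` and `total_clicks` values returned by the script above. The `ClickHistory` table and both function names are hypothetical, the sample numbers are made up, and the sketch is written for Python 3.

```python
import sqlite3
from time import strftime


def record_clicks(conn, short_url, user_clicks, total_clicks):
    """Append one timestamped snapshot of the click counts for short_url."""
    conn.execute("CREATE TABLE IF NOT EXISTS ClickHistory "
                 "(short_url, checked_at, user_clicks, total_clicks)")
    conn.execute("INSERT INTO ClickHistory VALUES (?,?,?,?)",
                 (short_url, strftime("%Y-%m-%d %H:%M:%S"),
                  user_clicks, total_clicks))
    conn.commit()


def clicks_gained(conn, short_url):
    """Return total clicks gained between the first and latest snapshot."""
    rows = conn.execute("SELECT total_clicks FROM ClickHistory "
                        "WHERE short_url=? ORDER BY rowid",
                        (short_url,)).fetchall()
    if len(rows) < 2:
        return 0
    return rows[-1][0] - rows[0][0]


conn = sqlite3.connect(":memory:")  # use a file name to keep the history
record_clicks(conn, "http://bit.ly/31IqMl", 3, 10)  # made-up sample values
record_clicks(conn, "http://bit.ly/31IqMl", 5, 17)  # made-up sample values
gained = clicks_gained(conn, "http://bit.ly/31IqMl")
```

Run on a schedule, something like this would build up a click history for each link that you can query however you like.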

For the past two weeks I have been working on a project that has real potential to take off. I’m developing the site using Python and the Django framework running on Google App Engine. I have a lot of good things to say about working with this development stack. Some of the big wins are:

  • Python is an awesome language – easy to read, write, and maintain. There are lots of libraries available, which makes development go faster.
  • Django is a great framework that makes developing webapps very clean. There’s a great separation between templates, views, and urls. Once I got the hang of how things are supposed to be done in Django it was very easy to get things up quickly.
  • Google App Engine has a really amazing admin interface that gives access to the logging information, database tables, and website statistics. The free quotas are generous, it scales well, and it takes almost no time to set up. The GUI development app for OS X works really well and does development debugging better than the stock Django manage.py script.

But there have been some really frustrating points during the development of my first real web service running on GAE.

  • There are too many choices/variations of Django – none have great documentation
    • The built-in Django that comes with GAE is stripped down to the bare essentials – no admin interface, different forms, different Models, different User/authentication. Big portions of the documentation at djangoproject.com are useless if you use this version of Django.
    • app-engine-helper – provides a way to get more of the standard Django installed.  I haven’t tried this one.
    • app-engine-patch – similar to helper, but development seems more active.  app-engine-patch also includes a bunch of app-engine ready Django applications such as jQuery, blueprintCSS, and registration.  It supports using standard Django user accounts and the admin interface.

The biggest problem I’ve had is with user registration and authentication. Between app-engine-patch and Google App Engine, there seem to be at least four different authentication and session schemes, and multiple User Models to choose from. Some require additional middleware; others don’t. I want to use the registration application and standard Django Users, but that combination doesn’t seem to work with a Model’s UserProperty. To top it off there’s very little documentation and I haven’t found an example application to see how it should be done. Argh.

The exciting news is that I expect to have my first web service up and running in about a week. The second one is in development and I expect to launch it in early August.

I was a little bored today and decided to write up a simple script that pushes RSS feed information out to Twitter and manages to keep track of the history so that tweets are not sent out more than once.

It was a trivial script to write, but it could be useful for something that I’ll be working on in the future.

The script makes use of an SQLite database to store history and bit.ly for shortening URLs. I’ve made heavy use of some really nice open source libraries to make for a very short and sweet little script.
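
The history check boils down to remembering which links have already been seen. Here is a small sketch of that idea on its own, assuming Python 3 and an in-memory database. The `Seen` table and the `remember` function are hypothetical names, and the sketch swaps the select-then-insert check for a PRIMARY KEY combined with `INSERT OR IGNORE`, so treat it as a variation rather than what the full script does.

```python
import sqlite3


def remember(conn, url):
    """Record url and return True if it is new, False if already seen."""
    conn.execute("CREATE TABLE IF NOT EXISTS Seen (url TEXT PRIMARY KEY)")
    # The primary key makes the database reject a second copy of the url;
    # OR IGNORE turns that rejection into a silent no-op.
    cur = conn.execute("INSERT OR IGNORE INTO Seen (url) VALUES (?)", (url,))
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
first_time = remember(conn, "http://www.halotis.com/some-post/")
second_time = remember(conn, "http://www.halotis.com/some-post/")
```

Only entries for which the check returns True would be tweeted, which is what keeps repeated runs from sending the same tweet twice.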

Grab the necessary Python libraries:
python-twitter
python-bitly
feedparser

You’ll need to sign up for free accounts at Twitter and Bit.ly to use this script.

Hopefully someone out there can take this code example to do something really cool with Twitter and Python.

Update: I’ve added some bit.ly link tracking output to this script. After it tweets the RSS feed it will print out the click count information for every bit.ly link.

from time import strftime
import sqlite3
 
import twitter     #http://code.google.com/p/python-twitter/
import bitly       #http://code.google.com/p/python-bitly/
import feedparser  #available at feedparser.org
 
 
DATABASE = "tweets.sqlite"
 
BITLY_LOGIN = "bitlyUsername"
BITLY_API_KEY = "api key"
 
TWITTER_USER = "username"
TWITTER_PASSWORD = "password"
 
def print_stats():
	conn = sqlite3.connect(DATABASE)
	conn.row_factory = sqlite3.Row
	c = conn.cursor()
 
	b = bitly.Api(login=BITLY_LOGIN,apikey=BITLY_API_KEY)
 
	c.execute('SELECT title, url, short_url from RSSContent')
	all_links = c.fetchall()
 
	for row in all_links:
 
		short_url = row['short_url']
 
		if short_url is None:
			short_url = b.shorten(row['url'])
			c.execute('UPDATE RSSContent SET `short_url`=? WHERE `url`=?',(short_url,row['url']))
 
 
		stats = b.stats(short_url)
		print("%s - User clicks %s, total clicks: %s" % (row['title'], stats.user_clicks, stats.total_clicks))
 
	conn.commit()
 
def tweet_rss(url):
 
	conn = sqlite3.connect(DATABASE)
	conn.row_factory = sqlite3.Row
	c = conn.cursor()
 
	#create the table if it doesn't exist
	c.execute('CREATE TABLE IF NOT EXISTS RSSContent (`url`, `title`, `dateAdded`, `content`, `short_url`)')
 
	api = twitter.Api(username=TWITTER_USER, password=TWITTER_PASSWORD)
	b = bitly.Api(login=BITLY_LOGIN,apikey=BITLY_API_KEY)
 
	d = feedparser.parse(url)
 
	for entry in d.entries:
 
		#check for duplicates
		c.execute('select * from RSSContent where url=?', (entry.link,))
		if not c.fetchall():
 
			tweet_text = "%s - %s" % (entry.title, entry.summary)
 
			shortened_link = b.shorten(entry.link)
 
			t = (entry.link, entry.title, strftime("%Y-%m-%d %H:%M:%S", entry.updated_parsed), entry.summary, shortened_link)
			c.execute('insert into RSSContent (`url`, `title`,`dateAdded`, `content`, `short_url`) values (?,?,?,?,?)', t)
			print("%s.. %s" % (tweet_text[:115], shortened_link))
 
			api.PostUpdate("%s.. %s" % (tweet_text[:115], shortened_link))
 
	conn.commit()
 
if __name__ == '__main__':
  tweet_rss('http://www.halotis.com/feed/')
  print_stats()
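
The script trims every tweet to a fixed 115 characters, which can cut a word in half. As a possible refinement, here is a sketch (Python 3) that trims at the last space that still leaves room for the short link. `compose_tweet` is a hypothetical helper, and the 140-character limit is an assumption based on Twitter's limit when this was written.

```python
TWEET_LIMIT = 140  # assumption: Twitter's character limit at the time


def compose_tweet(title, summary, short_url, limit=TWEET_LIMIT):
    """Build "title - summary.. short_url" without exceeding limit."""
    suffix = ".. " + short_url
    room = limit - len(suffix)
    text = "%s - %s" % (title, summary)
    if len(text) > room:
        # Cut at the last space that fits so words are not split;
        # fall back to a hard cut if there is no usable space.
        cut = text.rfind(" ", 0, room + 1)
        text = text[:cut if cut > 0 else room]
    return text.rstrip() + suffix


short_tweet = compose_tweet("Title", "Summary", "http://bit.ly/x")
long_tweet = compose_tweet("A long title", "word " * 60,
                           "http://bit.ly/31IqMl")
```

The returned string could be passed to `api.PostUpdate` in place of the formatted string used in the script.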

I discovered this very handy trick for getting relevant YouTube videos in an RSS feed and I have used it to build up some very powerful blog posting scripts to grab relevant content for some blogs that I have.  It could also be helpful to pull these into an RSS Reader to quickly skim the newest videos relevant to a specific search.  I thought I would share some of this with you and hopefully you’ll be able to use these scripts to get an idea of your own.

To start with you need to create the URL of the RSS feed for the search.  To do that you can do the search on YouTube and click the RSS icon in the address bar.  The structure of the URL should be something like this:

http://gdata.youtube.com/feeds/base/videos?q=halotis%20marketing&client=ytapi-youtube-search&alt=rss&v=2
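
If you would rather build that URL in code than copy it from the browser, a minimal sketch could look like this. It simply reproduces the structure of the URL shown above for an arbitrary search phrase. `youtube_search_feed` is a hypothetical helper name, and the sketch assumes Python 3.

```python
from urllib.parse import quote, urlencode

FEED_BASE = "http://gdata.youtube.com/feeds/base/videos"


def youtube_search_feed(phrase):
    """Return the RSS feed URL for a YouTube search on phrase."""
    params = [
        ("q", phrase),
        ("client", "ytapi-youtube-search"),
        ("alt", "rss"),
        ("v", "2"),
    ]
    # quote_via=quote encodes spaces as %20, matching the URL above.
    return FEED_BASE + "?" + urlencode(params, quote_via=quote)


feed_url = youtube_search_feed("halotis marketing")
```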

The problem with the RSS feed is that it doesn’t include the HTML required to embed the video. You have to parse the RSS content and find the URL for the video, which can then be used to create the embed code so you can post the videos somewhere else.

In my example code I have categorized each url to a target keyword phrase.  The code below is not a comprehensive program, but just an idea of how to go about repurposing YouTube RSS content.

import random
import sqlite3
import time

import feedparser  # available at: feedparser.org

try:
	from urllib.parse import urlparse, parse_qs  # Python 3
except ImportError:
	from urlparse import urlparse, parse_qs  # Python 2

DATABASE = "YouTubeMatches.sqlite"

conn = sqlite3.connect(DATABASE)
conn.row_factory = sqlite3.Row
c = conn.cursor()

# create the table if it doesn't exist (columns inferred from the queries below)
c.execute('CREATE TABLE IF NOT EXISTS YoutubeResources (`phrase`, `url`, `used`, `title`, `dateAdded`, `content`, `dateUsed`)')

def LoadIntoDatabase(phrase, url):
	"""Store the entries of the search feed at url under the keyword phrase."""
	d = feedparser.parse(url)
	for entry in d.entries:

		# check for duplicates with the url
		c.execute('select * from YoutubeResources where url=?', (entry.link,))
		if len(c.fetchall()) == 0:
			# only adding the resources that are not already in the table.
			t = (phrase, entry.link, 0, entry.title, time.strftime("%Y-%m-%d %H:%M:%S", entry.updated_parsed), entry.summary)
			c.execute('insert into YoutubeResources (`phrase`, `url`, `used`, `title`,`dateAdded`, `content`) values (?,?,?,?,?,?)', t)

	conn.commit()

def getYouTubeEmbedCode(phrase):
	"""Return embed HTML for a random unused video, or None if none are left."""
	c.execute("select * from YoutubeResources where phrase=? and used=0", (phrase,))
	result = c.fetchall()
	if not result:
		return None
	content = random.choice(result)

	# assumes the stored link looks like http://www.youtube.com/watch?v=VIDEO_ID
	# and turns it into the embeddable http://www.youtube.com/v/VIDEO_ID form
	watch_url = content['url']
	video_id = parse_qs(urlparse(watch_url).query)['v'][0]
	url = 'http://www.youtube.com/v/%s' % video_id
	embedCode = '<div class="youtube-video"><object width="425" height="344"><param name="movie" value="%s&hl=en&fs=1"> </param><param name="allowFullScreen" value="true"> </param><param name="allowscriptaccess" value="always"> </param><embed src="%s&hl=en&fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"> </embed></object></div>' % (url, url)

	# mark the video as used, matching on the original link stored in the table
	t = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
	c.execute("UPDATE YoutubeResources SET used = 1, dateUsed = ? WHERE url = ? ", (t, watch_url))
	conn.commit()

	return embedCode