Category Archives: Software

Schrödinger’s Programmer is a thought experiment: a real-life paradox that comes as a result of the Copenhagen interpretation of quantum mechanics. The thought experiment presents a programmer who may or may not have written software.

You have a closed office. In this office are a computer (with internet access) and a software programmer. She is tasked with writing a piece of software that can be written in an hour. However, there is an equal chance that she will instead find and read something interesting on reddit.com and accomplish nothing in that hour.

After an hour has elapsed, one would say that the software is finished if, in the meantime, she did the work. The psi-function of the entire system would express this by having in it both the project-completed state and the nothing-done state mixed or smeared out in equal parts.

It is typical of these cases that an indeterminacy originally restricted to whether or not something of interest is on reddit.com becomes transformed into macroscopic indeterminacy, which can then be resolved by direct observation – by opening the office door. That prevents us from so naively accepting as valid a “blurred model” for representing reality. In itself, it would not embody anything unclear or contradictory. There is a difference between a shaky or out-of-focus photograph and a snapshot of clouds and fog banks.

Mobile platforms are the new hotness. Growth is explosive as more and more smartphones get bought up, but the problem is that, unlike PC development and web development, there isn’t a standard way of writing applications that will work across platforms.

The two major smartphone platforms – Android and iOS – are the ones most developers are targeting. Luckily, as the development tools for these operating systems have matured, a number of frameworks have come out that let you write your application once and have it run on both.

For game development there is Corona SDK. Corona lets you write applications in Lua, a dynamic scripting language, which are then compiled down to native applications. It includes some pretty neat graphics and physics engine components so that you don’t have to write them yourself – you can just focus on what the game is and how it looks and sounds.

Pretty cool. The SDK comes with plenty of example code and Lua is pretty easy to pick up. It’s definitely the way to go if you are programming a game.

For other kinds of apps – ones with buttons, menus, and other native widgets – there’s Titanium. Titanium lets you write native applications with web technologies, so you can use what you know of JavaScript, HTML, and CSS.

Titanium includes plenty of APIs for getting at the stuff you wouldn’t normally be able to do in JavaScript, and there’s a lot of great documentation.

Both of these solutions integrate with the platform simulators (the iOS simulator is only available on a Mac), and they are both very active in keeping up with the latest device features. If you want to create an app for iPhone or Android, take a look at these.

With any software product there is a massive wishlist of features to implement and things to develop. It is a never-ending process of thinking of ideas, evaluating them, and perhaps adding them to your software.

Nowhere is this more apparent, and more of a problem, than in the first release.

I have been experiencing this lately.

You get near the end of the list of things that need to be done, and somehow new things always get added that seem crucial to the product. The release date slips, and you go even longer without customers and revenue.

One of the great philosophies in software development is “Release early. Release often. And listen to your customers.” It was popularized by Eric S. Raymond and was originally applied to Linux kernel development. It has since been adopted by many more projects, including essentially everything that is developed at Google.

One of the most influential business books I have read recently, Ready, Fire, Aim: Zero to 100 Million in No Time Flat (aff), shares an idea that meshes very well with the release-early, release-often development process.

From a business perspective, the most important thing a new business has to do is get customers. In the process of finding your first customers you learn the most valuable thing you need to succeed: how to sell your product. The sales process can take years to really figure out – everything from identifying who the customers are, what language they use, where you can find them, and what their pain points typically are. It’s fair to say that many of your assumptions in this area will be wrong.

Under the assumption that you don’t really know what your customers want, releasing early is even more critical. Getting something out into the marketplace that can later be improved with feedback from the sales process or from your first customers can give you a first-mover advantage and reduce the missteps of developing features you thought would be important but which are rarely used.

So, as an entrepreneur and software developer, how can you balance these concerns and draw the line to determine the minimum set of core features required for an initial release?

Here are some of my tips:

  1. Identify the core problem that you are trying to solve; make it something concrete and simple
  2. For every feature idea, ask yourself, "Can I say I have a solution to the core problem without this feature?"
  3. For everything that’s not truly required, defer it to a later phase of the project
  4. Keep in mind that it’s much easier to add new features than to modify poorly implemented ones (this is why many first movers in a market lose out to copycats)
  5. Start the sales process and marketing push before the product is actually ready

Following this framework should whittle the list of features down to something manageable and keep the pressure on to get it out.

Good luck.

The software and technology industry has produced perhaps the largest number of self-made millionaires (and billionaires) of our generation. It’s something that happens with every disruptive new industry – a massive redistribution of wealth. Software is unique in the power it provides and the low cost of distribution.

As a result the margins can be extremely high.

We are on the cusp of yet another sea change in the technology industry, as sales of mobile phones are now outpacing those of computers, while at the same time new mobile phones are becoming capable of running useful software. Unlike computers, mobile phones work everywhere and usually have some interesting capabilities, like cameras, GPS, and accelerometers.

This is still a new industry and there’s plenty of space for new ideas to be implemented and make millions.

Someone asked me recently to develop a content spinner algorithm which can take a document and produce variations of that document. I thought it was an interesting thing to think about, and very rarely do I get to think about algorithms like this in my normal day-to-day programming.

The document contains variation options written in a special syntax. For example:

{Hi|Hello|Good morning}, my name is Matt and I have {something {important|special} to say|a favorite book}.

The algorithm recursively goes through the string and generates a new string by choosing one of the pipe-separated options inside each pair of curly braces.

import random
 
def spin(content):
    """takes a string like
 
    {Hi|Hello|Good morning}, my name is Matt and I have {something {important|special} to say|a favorite book}.
 
    and randomly selects from the options in curly braces
    to produce unique strings.
    """
    start = content.find('{')
    end = content.find('}')
 
    if start == -1 and end == -1:
        # no braces left to expand
        return content
    elif start == -1:
        return content
    elif end == -1:
        raise "unbalanced brace"
    elif end < start:
        return content
    elif start < end:
        rest = spin(content[start+1:])
        end = rest.find('}')
        if end == -1:
            raise "unbalanced brace"
        return content[:start] + random.choice(rest[:end].split('|')) + spin(rest[end+1:])
 
if __name__=='__main__':
    print spin('{Hi|Hello|Good morning}, my name is Matt and I have {something {important|special} to say|a favorite book}.')

This is the project that I have been focused on building for the last few months, and it has expanded from work that I have been doing off and on for several years. I’m excited that it finally has a home of its own and is nearly ready to release to the public.

I will be finishing up the design over the next month or so with the plan to do a very small announcement in early January to let some early adopters in to start using the system.

You are probably a lot like me and have a couple of idle domain names that you’ve picked up over the years and never really done anything with. You also have a few websites that are actually pretty decent but not getting the traffic they deserve from the search engines. That’s why I created the Automatic Blog Machine – for myself, to solve these two problems. I can easily create a blog, hook it into the system, then forget about it. It will run for months, building traffic and links, attracting advertisers, and driving traffic to the websites I actually care about.

I’m using this system to build out my network of websites and build an increasing base of pages I can then sell through Google DoubleClick for Publishers, sell text link ads on, or use to promote Amazon products, eBay auctions, or any number of other affiliate products. By integrating an ad server I can quickly and easily put an ad across the entire network at no cost to me and immediately drive massive amounts of traffic.

One concern is that the sites should not be spammy, so I took great effort to make sure the content that the Automatic Blog Machine creates is unique, natural, and readable. That means auto-translation is not used, because it creates hard-to-read content; it also means there is a strategy for both internal and external linking. Getting these things to work correctly was actually pretty complex – it requires me to data-mine a lot of content in order to build each blog post.

I’m pretty proud of the development of this tool and I’m looking forward to letting people in to see how it works.

Check out the Automatic Blog Machine

After seeing the successful launch of the Autoblog Samurai product come through my inbox over the last week, I thought it might be time to dig up the scripts I wrote several years ago to attempt the same thing. Over the past few years I have run a couple of autoblogs but never really turned the concept into something that was truly profitable or very easy to use (even though the blogs I did run were making a small profit).

But after seeing the amount of excitement that Autoblog Samurai has been able to create around their software I’m intrigued enough to give it a second shot.

So I have started to revamp my existing hodgepodge of scripts into a proper web-based application. It will be a Django-based web application that allows users to configure many blogs and pipe any number of content sources into each one. Wrapped around everything will be a number of specific monetization tools, cross-promotion tools, and hopefully some analytics built right in.
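
To give a rough idea of the shape I have in mind, here’s a minimal sketch of the kind of Django models such an app might be built around. The model and field names here are hypothetical, not the actual schema:

# Hypothetical Django models for a multi-blog autoblogging app (illustrative only)
from django.db import models

class Blog(models.Model):
    # one row per autoblog managed by the system
    name = models.CharField(max_length=200)
    url = models.URLField()
    post_interval_minutes = models.PositiveIntegerField(default=60)

class ContentSource(models.Model):
    # each blog can have many content sources piped into it
    # (on_delete is required by newer Django versions)
    blog = models.ForeignKey(Blog, on_delete=models.CASCADE)
    feed_url = models.URLField()  # e.g. an RSS/Atom feed to pull content from
    enabled = models.BooleanField(default=True)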

The one big perk of having it web-based is that it will never sleep, unlike your home computer, which may not always be turned on and have the software running. That bit of out-of-sight, out-of-mind might actually mean that users forget they’re running a bunch of autoblogs until they get a check in the mail from Google AdSense.

I’m not sure yet if I’ll make the software public or if I’ll keep it for myself.

After an hour of work on it last night I actually got a functional prototype running. There are a couple of things that need to be cleaned up, and features to add to make it competitive, but most of the work will be designing and polishing a nice user interface.

I would like to be able to create a system that can scale to 10,000+ blogs and publish content to all of them as often as every minute.  I think that would be an interesting experiment in internet marketing.

Connecting to a Google Gmail account is easy with Python using the built-in imaplib library. It’s possible to download, read, mark, and delete messages in your Gmail account by scripting it.

Here’s a very simple script that prints out the latest email received:

#!/usr/bin/env python
 
import imaplib
M=imaplib.IMAP4_SSL('imap.gmail.com', 993)
M.login('myemailaddress@gmail.com','password')
status, count = M.select('Inbox')
status, data = M.fetch(count[0], '(UID BODY[TEXT])')
 
print data[0][1]
M.close()
M.logout()

As you can see, not a lot of code is required to log in and check an email. However, imaplib provides just a very thin layer over the IMAP protocol, and you’ll have to refer to the documentation on how IMAP works and the commands available to really use it. As you can see in the fetch command, the “(UID BODY[TEXT])” bit is a raw IMAP instruction. In this case I’m calling fetch with the size of the Inbox folder, because the most recent email is listed last (the sequence number of the most recent message equals the message count), and telling it to return the body text of the email. There are many more complex ways to navigate an IMAP inbox. I recommend playing with it in the interpreter and connecting directly to the server with telnet to understand exactly what is happening.
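
For example, here’s a small sketch that goes a little further with imaplib – searching for unread messages, printing their From and Subject headers, and marking them as read. It follows the same style as the script above (hard-coded credentials), so treat it as a starting point rather than production code:

import imaplib

M = imaplib.IMAP4_SSL('imap.gmail.com', 993)
M.login('myemailaddress@gmail.com', 'password')
M.select('Inbox')

# SEARCH returns a space-separated list of message sequence numbers
status, data = M.search(None, '(UNSEEN)')
for num in data[0].split():
    # BODY.PEEK fetches the headers without setting the \Seen flag
    status, header = M.fetch(num, '(BODY.PEEK[HEADER.FIELDS (FROM SUBJECT)])')
    print header[0][1]
    # explicitly mark the message as read
    M.store(num, '+FLAGS', '\\Seen')

M.close()
M.logout()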

Here’s a good resource for quickly getting up to speed with IMAP: Accessing IMAP email accounts using telnet

As much as I’ve found basic web scraping to be really simple with urllib and BeautifulSoup, it leaves something to be desired. The BeautifulSoup project has languished, and recent versions have switched the HTML parser to one that is less able to cope with the poorly encoded pages found on real websites.

Scrapy is a full-on framework for scraping websites, and it offers many features, including a standalone command-line interface and a daemon tool, to make scraping websites much more systematic and organized.

I have yet to build any substantial scraping scripts based on Scrapy, but judging from the snippets I’ve read at http://snippets.scrapy.org, the documentation at http://doc.scrapy.org, and the project blog at http://blog.scrapy.org, it seems like a solid project with a good future and a lot of really great features that will make my scripts more automatable and standardized.
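
To give a feel for it, here’s a minimal spider sketch based on the Scrapy documentation. I haven’t built anything real with this yet, and the API has shifted between versions, so check the current docs for the exact class and selector names:

import scrapy

class ExampleSpider(scrapy.Spider):
    name = 'example'
    start_urls = ['http://example.com/']

    def parse(self, response):
        # yield one item per link on the page; Scrapy handles the crawling,
        # scheduling, and output (e.g. scrapy crawl example -o links.json)
        for link in response.xpath('//a'):
            yield {
                'text': link.xpath('text()').extract_first(),
                'href': link.xpath('@href').extract_first(),
            }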

I got an email the other day from Frank Kern, who was pimping another make-money-online product from his cousin Trey. The Number Effect is a DVD containing the results of an experiment where Trey created an affiliate link to every one of the 12,000 products for sale on ClickBank, sent paid (PPV) traffic to all of those links, and found which ones were profitable. He found 54 niches with profitable campaigns out of the 12,000.

Trey went on to talk about the software that he had written for this experiment. It apparently took a bit of work for his outsourced programmer to get it going.

I thought it would be fun to try to implement the same script myself. It took about an hour to program the whole thing.

So if you want to create your own ClickBank affiliate link for every ClickBank product for sale, here’s a script that will do it. Keep in mind that I never did any work to make this thing fast, and it takes about 8 hours to scrape all 13,000 products, create the affiliate links, and resolve the URLs they point to. Sure, I could make it faster, but I’m lazy.

Here’s the Python script to do it:

#!/usr/bin/env python
# encoding: utf-8
"""
ClickBankMarketScrape.py
 
Created by Matt Warren on 2010-09-07.
Copyright (c) 2010 HalOtis.com. All rights reserved.
 
"""
 
 
 
CLICKBANK_URL = 'http://www.clickbank.com'
MARKETPLACE_URL = CLICKBANK_URL+'/marketplace.htm'
AFF_LINK_FORM = CLICKBANK_URL+'/info/jmap.htm'
 
AFFILIATE = 'mfwarren'
 
import urllib, urllib2
from BeautifulSoup import BeautifulSoup
import re
 
product_links = []
product_codes = []
pages_to_scrape = []
 
def get_category_urls():
	request = urllib2.Request(MARKETPLACE_URL, None)
	urlfile = urllib2.urlopen(request)
	page = urlfile.read()
	urlfile.close()
 
	soup = BeautifulSoup(page)
	parentCatLinks = [x['href'] for x in soup.findAll('a', {'class':'parentCatLink'})]
	return parentCatLinks
 
def get_products():
 
	fout = open('ClickBankLinks.csv', 'w')
 
	while len(pages_to_scrape) > 0:
 
		url = pages_to_scrape.pop()
		request = urllib2.Request(url, None)
		urlfile = urllib2.urlopen(request)
		page = urlfile.read()
		urlfile.close()
 
		soup = BeautifulSoup(page)
 
		results = [x.find('a') for x in soup.findAll('tr', {'class':'result'})]
 
		nextLink = soup.find('a', title='Next page')
		if nextLink:
			pages_to_scrape.append(nextLink['href'])
 
		for product in results:
			try:
				product_code = str(product).split('.')[1]
				product_codes.append(product_code)
				m = re.search(r'^<(.*)>(.*)<', str(product))
				title = m.group(2)
				my_link = get_hoplink(product_code)
				request = urllib2.Request(my_link)
				urlfile = urllib2.urlopen(request)
				display_url = urlfile.url
				#page = urlfile.read()  #continue here if you want to scrape keywords etc from landing page
 
				print my_link, display_url
				product_links.append({'code':product_code, 'aff_link':my_link, 'dest_url':display_url})
				fout.write(product_code + ', ' + my_link + ', ' + display_url + '\n')
				fout.flush()
			except:
				continue  # handle cases where destination url is offline
 
	fout.close()
 
def get_hoplink(vendor):
	request = urllib2.Request(AFF_LINK_FORM + '?affiliate=' + AFFILIATE + '&promocode=&submit=Create&vendor='+vendor+'&results=', None)
	urlfile = urllib2.urlopen(request)
	page = urlfile.read()
	urlfile.close()
	soup = BeautifulSoup(page)
	link = soup.findAll('input', {'class':'special'})[0]['value']
	return link
 
if __name__=='__main__':
	urls = get_category_urls()
	for url in urls:
		pages_to_scrape.append(CLICKBANK_URL+url)
	get_products()