Ask HN: "How-Tos" for Rolling Your Own Short URLs?
20 points by daveambrose on May 3, 2009 | 22 comments
I'm interested in working on a small side project to roll my own short URL, similar to how TechCrunch recently unveiled their version of http://tcrn.ch/.

If you know of any detailed How-Tos for creating your own version of a short URL, I'd appreciate it.

(P.S. To keep the question on topic, I'm not interested in debating whether short URLs promote spam, act as a middleman, etc., as was covered here some weeks ago by joshu.)



I made one for Tipjoy where I mapped a content ID to a short string, and loaded a frame like the Digg Bar where people could donate to the site and view the content. For example: http://tipjoy.com/2w/

Here is the Python code I used to "shrink" and unshrink the content ID.

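  def shrink( id ):
      # Encode a non-negative integer content ID as a short string.
      validCharacters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._-~"
      base = len(validCharacters)
      if id == 0:
          return validCharacters[0]  # otherwise 0 would shrink to ""
      characterArray = ""
      d = id
      while d > 0:
          ind = d % base
          characterArray = validCharacters[ind] + characterArray
          d = d // base  # integer division; "/" yields a float on Python 3
      return characterArray
  
  def unshrink( code ):
      # Decode a string produced by shrink() back to the integer ID.
      validCharacters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._-~"
      base = len(validCharacters)
      val = 0
      revCode = code[::-1]
      power = 0
      for r in revCode:
          i = validCharacters.find(r)
          if i==-1:
              # a bare "raise" fails here because no exception is active
              raise ValueError("invalid character in code: %r" % r)
          val += i * (base**power)
          power += 1
      return val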
If you think this code is bad for some reason, please tell me in the comments. I spent little time reviewing it when I wrote it about a year ago.


input.php - input box asking for url. call create.php on form submit.

create.php - generates random string, adds url and random string to urls table

redirect.php - takes random string. finds url associated with it. redirects

.htaccess - catches http://domain.com/<random string> and maps it to redirect.php


and make sure to redirect with a 301


i.e. header($final_destination, true, 301)


Wow, I really botched that up.

header("Location: " . $final_destination, true, 301)

In my defense, I generally wrap this call into my own redirect function.
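
For anyone not on PHP, here is a minimal sketch of the same lookup-and-301 step as a Python WSGI app. The `URLS` dict and the `app` name are made up for illustration; a real service would look the code up in its urls table instead.

```python
# Hypothetical in-memory stand-in for the urls table (code -> destination).
URLS = {"abc123": "http://example.com/some/long/path"}


def app(environ, start_response):
    """WSGI app: /<code> gets a 301 to the stored URL, anything else a 404."""
    code = environ.get("PATH_INFO", "").lstrip("/")
    destination = URLS.get(code)
    if destination is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Unknown short URL\n"]
    # A permanent redirect is the 301 status line plus a Location header,
    # the same pair the corrected PHP header() call sends.
    start_response("301 Moved Permanently", [("Location", destination)])
    return [b""]
```

To try it locally, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library should be enough.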


I wrote up my experiences building a vanity short URL service a few weeks ago: http://simonwillison.net/2009/Apr/11/revcanonical/ - algorithms and code included


I would associate each URL with a sequential number. And express the number using base-36 (0...9 + a...z). I don't see the need to use obfuscation for this application but if desired I'd hash the URL + a secret and prepend that fixed-length hash to the base-36 sequential number. I'd express the hash in base-36 too.
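
A rough sketch of what I mean, in Python. The secret, the 4-character tag length, and all function names are assumptions for illustration. I've used HMAC-SHA256 for the "hash the URL + a secret" step, which is just one reasonable choice.

```python
import hashlib
import hmac

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
SECRET = b"change-me"  # hypothetical secret, kept server-side
TAG_LEN = 4            # assumed length of the fixed-size hash prefix


def to_base36(n):
    """Express a non-negative integer in base 36."""
    if n == 0:
        return "0"
    out = ""
    while n > 0:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out


def tag_for(url):
    """Fixed-length base-36 tag derived from the URL and the secret."""
    digest = hmac.new(SECRET, url.encode("utf-8"), hashlib.sha256).digest()
    value = int.from_bytes(digest, "big") % (36 ** TAG_LEN)
    return to_base36(value).rjust(TAG_LEN, "0")


def make_code(seq_id, url):
    """Prepend the tag to the base-36 sequential number."""
    return tag_for(url) + to_base36(seq_id)


def split_code(code):
    """Return (tag, sequential id) from a code built by make_code()."""
    return code[:TAG_LEN], int(code[TAG_LEN:], 36)
```

On lookup you would split the code, fetch the URL by sequential id, and reject the request if the tag does not match `tag_for(stored_url)`.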


Well, originally, tinyurl just went for sequential short URLs too. But as those were predictable, abuse was afoot: "Early on, the resulting URL aliases of the service were predictable, and were exploited by users to create vulgar associations. The URL aliases dick and cunt were made to redirect to the White House websites of U.S. Vice President Dick Cheney and Second Lady Lynne Cheney." Source: http://en.wikipedia.org/wiki/Tinyurl#Early_abuses


Interesting... tack that hash on. :) Probably does not need to be a hash actually. A few random base-36 digits would do it. You'd just need to store the random digits with the URL in whatever sort of persistent storage system you're utilizing.
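
Something like this, as a sketch only. The dict stands in for whatever persistent storage you use, and the names and the choice of 3 random digits are made up.

```python
import secrets

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
SALT_LEN = 3  # assumed number of random base-36 digits

# Hypothetical storage: sequential id -> (random digits, destination URL)
store = {}


def base36(n):
    """Express a non-negative integer in base 36."""
    out = ""
    while n > 0:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out or "0"


def add_url(url):
    """Store the URL with fresh random digits and return its short code."""
    seq_id = len(store) + 1
    salt = "".join(secrets.choice(DIGITS) for _ in range(SALT_LEN))
    store[seq_id] = (salt, url)
    return salt + base36(seq_id)


def resolve(code):
    """Return the URL for a code, or None if the id or random digits are wrong."""
    salt, rest = code[:SALT_LEN], code[SALT_LEN:]
    try:
        seq_id = int(rest, 36)
    except ValueError:
        return None
    row = store.get(seq_id)
    if row is None or row[0] != salt:
        return None
    return row[1]
```

With 3 random digits, someone guessing the code for a known sequential id has a 1 in 36^3 (46,656) chance per attempt.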


    1.to_s(36) #=> "1"
    10.to_s(36) #=> "a"
    100.to_s(36) #=> "2s"
    123456.to_s(36) #=> "2n9c"
Ruby. Where the number is the id of the record in the database or something.


Also forgot to mention, you can do the opposite with:

    "2n9c".to_i(36) #=> 123456


Create your own URL shortening service with Heroku and Shorty: http://brad.posterous.com/create-your-own-url-shortening

Some useful stuff to go on there even if you're not using Heroku.


Here is a related question. Is creating a hash of letters and numbers instead of just using an ID beneficial in any way?


The only benefit is that by switching from base 10 to base 36 or base 62 (26 lowercase + 26 uppercase + numbers) your average URL will be shorter. For base 36 you can usually do this using http://php.net/base_convert (it only handles bases up to 36, so base 62 needs your own encoder).


It's about length. If you limit yourself to numbers you only get 10 possible characters instead of 36 if you use letters or 62 if you make them case sensitive.
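
To put numbers on it, here is a quick Python sketch (the function name is mine) that counts how many characters an ID needs in each base:

```python
def code_length(n, base):
    """Number of digits needed to write the non-negative integer n in `base`."""
    length = 0
    while n > 0:
        n //= base
        length += 1
    return max(length, 1)  # zero still takes one digit


# The 100-millionth URL: 9 characters in base 10, 6 in base 36, 5 in base 62.
for base in (10, 36, 62):
    print(base, code_length(100_000_000, base))
```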


By an ID I simply mean starting at 1 and moving up from there


It prevents people from iterating over all URLs in your database. Whether you consider that a plus or a minus is mostly up to you.

It also prevents people from easily seeing how many URLs you've shortened.


You can write the basics of that sort of thing from start to finish in half a day - easy stuff. You need:

1) Database table to store the "codes" - two "main" columns: Code, DestinationUrl

2) Random code generator to generate the "Code"

That's really the gist of it.
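
For what it's worth, a minimal sketch of those two pieces in Python with SQLite. The table and column names, the 6-character code length, and the function names are my own assumptions, not a reference implementation.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters
CODE_LEN = 6                                     # assumed code length


def open_db(path=":memory:"):
    """Open the database and create the two-column table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS urls ("
        " code TEXT PRIMARY KEY,"
        " destination_url TEXT NOT NULL)"
    )
    return db


def random_code(length=CODE_LEN):
    """Random code generator for the Code column."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def shorten(db, url, attempts=5):
    """Store url under a fresh random code, retrying if the code is taken."""
    for _ in range(attempts):
        code = random_code()
        try:
            with db:  # commits on success, rolls back on error
                db.execute(
                    "INSERT INTO urls (code, destination_url) VALUES (?, ?)",
                    (code, url),
                )
            return code
        except sqlite3.IntegrityError:
            continue  # collision with an existing code, draw again
    raise RuntimeError("could not find a free code")


def lookup(db, code):
    """Return the destination URL for a code, or None if unknown."""
    row = db.execute(
        "SELECT destination_url FROM urls WHERE code = ?", (code,)
    ).fetchone()
    return row[0] if row else None
```

The primary key on the code column is what turns a duplicate random code into an error you can retry on, rather than a silent overwrite.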


Exactly. The "win" for tinyurl.com (etc.) was that someone had the foresight to see the need at all and do it, not that it was hard to do.

Amazon got patents on less.


Wasn't it services like learn.to, beam.to and similar that came first? With the intent to have freely pickable shorthand URLs, where the shortness of them was just a byproduct?

I'm not sure of this, but iirc the learn.to/beam.to etc services were around in the early web already, mid-90s, while tinyurl and friends came later.


Those services were different because they were meant to provide an alternative URL for your website, not someone else's. Also, they required you to go through a signup process and I remember some of them displaying ads.

And of course, there was no Twitter back in those days to create an artificial need for short URLs.


Yep, I had that experience too. But working out the details took unexpectedly long.



