I guess you could say sometimes I care a bit too much about stupid things… this is one of those things. I really really like short and straightforward URLs. What follows is a description of a script that takes care of exactly that, making it super easy to manage those urls using github.

What I’ve been doing so far is having a redis database with all my urls and a simple python script with Flask that does this

@app.route('/', defaults={'path': ''})
@app.route('/<path:path>')
def catch_all(path):
    dest = redis_cli.get('url:%s' % path)
    if dest:
        return redirect(dest)
    return 'Page not found', 404

While that actually worked, it was a PITA to actually update a link… I’d have to ssh into the server, open redis-cli and update it. Also, I’ve recently added cloudflare everywhere to have them cache my sites, that means I now need to purge the cache once URLs are updated (I’m having cloudflare cache my content for 30 days, and whenever I update the content, I purge it).

In my ideal scenario, I would handle a file containing all the url mappings, and that should be it. Modifying the file should modify the redirections… and, the most natural way to handle those files for me is obviously git. Github has webhooks, what means I can get a notification every time the urls file is modified for free (literally).

Also, cloudflare’s API has a very convenient method to purge the cache… so this was just about fitting all the pieces together.

Basically, this script has two endpoints… /refresh that… well, refreshes the URLs and /whatever-url-you-set-up that does the redirection. Then, I set up a github webhook to hit /refresh whenever I do a push and that in turn fetches the urls files, reloads them on redis and purges the cache on cloudflare. BUT… github has a small delay between the moment you push and when they make the content available on their website, so I added a &wait parameter that just waits as many seconds as requested before actually pulling the urls.

I hope the idea is clear by now, and you can see the code on github (it’s just one tiny file doing everything). The script uses the following environemnt values:

  • CONFIG_FILE: path to a configuration file containing the refresh key and the list of domains with their urls file paths
  • REDIS_HOST: the hostname of the redis server it uses
  • REDIS_DB: the database you want it to use

The configuration file is a simple json file, this is mine

{
    "key": "thisIsACrazySecretNobodyKnows",
    "domains": {
        "gmc.uy": {
            "urls": "https://raw.githubusercontent.com/g3rv4/redirections/master/urls.json",
            "afterRefresh": "/var/config/clear-cache.sh -zone myZoneId"
        },
        "dv.uy": {
            "urls": "https://raw.githubusercontent.com/d4tagirl/redirections/master/urls.yml",
            "afterRefresh": "/var/config/clear-cache.sh -zone anotherZoneId"
        }
    }
}

The schema is pretty self-explanatory, but here’s a detail:

  • key: this is the key you’ll have to pass to refresh to have the system actually refresh the urls
  • domains: an object containing the domains you want the script to handle.
    • urls: contains a url to the file containing the redirections
    • afterRefresh: an optional parameter, if it’s present, this script will be executed… well, after refreshing the domain :)

And the urls files are also pretty straightforward… just an object containing slug: destinationUrl either as a json file or as a yaml file (you can see my file here)

Once that’s taken care of, all you need to do is set up a webhook on your redirections repository on github to https://yourdomain.com/refresh?key=thisIsACrazySecretNobodyKnows&wait=10&domain=domaintorefresh.com and that’s it! and if you are using cloudflare, just call their API on your afterRefresh script.

That works, and you can use it just like that by running gunicorn… but I really like docker. I have everything dockerized on my server and this is not the exception, so I made a docker container with everything it needs to run (python + my script + dependencies + redis), so that I don’t have to manage anything.

If you want to go that route, you should totally use it via docker compose, here’s my docker-compose.yml. I have a web custom network that’s what the dockerized nginx server uses to reach the different containers, so that I don’t have to bind the ports to the host… but if you don’t want to go that route, you can just map the port 8000

version: '2'
services:
  simple-redirect:
    image: 'g3rv4/simple-redirect'
    hostname: simple-redirect
    restart: always
    environment:
      - REDIS_HOST=localhost
      - REDIS_DB=1
      - CONFIG_FILE=/var/config/config.json
    volumes:
      - '/path/to/config:/var/config'
    networks:
      - web
networks:
  web:
    external:
      name: web

Then, you can put an nginx as a reverse proxy… here’s my site config

server {
  listen *:80;
  server_name www.gmc.uy;
  server_tokens off;

  root /nowhere;
  rewrite ^ https://gmc.uy$request_uri? permanent;
}

server {
  listen *:80;
  server_name gmc.uy;
  server_tokens off;

  root /nowhere;
  location /refresh {
    add_header Cache-Control "s-maxage=1";
    proxy_set_header Host $http_host;
    proxy_pass http://simple-redirect:8000;
  }

  location / {
    add_header Cache-Control "s-maxage=2592000";
    proxy_set_header Host $http_host;
    proxy_pass http://simple-redirect:8000;
  }
}

aaaand… that’s it! super straightforward, right? I’m setting Cache-Control: s-maxage=1 for the /refresh endpoint so that cloudflare doesn’t cache that ;)

Gervasio Marchand

@[email protected] g3rv4


Published