This week’s update: Just keep spinner!

As the wife is in Portugal getting some sun and I am home alone with the kids, who are now finally asleep, I thought I would review the week(s).

Burf.co

There is not much to report except a realisation that creating a new Burf Search Engine is gonna be a lot of work! Even taking a fairly small chunk of it (100 million pages), I just can’t produce results fast enough using MongoDB. So I am thinking about a few key points I want to focus on.

  • Must be fast
  • Must produce fairly useful results
  • Must update itself
  • Must have a niche

So at the moment, I take the first 100,000,000 URLs from the CommonCrawl that returned an HTTP status of 200 and are marked as English. The CommonCrawl contains, I believe, around 2.3 billion URLs, so what I may do is filter down to the ones I actually want. I could also build up a simple list of top sites (BBC, Wikipedia, MSN etc) and just index them once a week.
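To give a rough idea of what that filtering step looks like, here is a minimal Python sketch. It is not my actual import code. It assumes each index line has the shape "key timestamp {json}" and that the JSON part carries `url`, `status` and `languages` fields (with `eng` meaning English); those field names and the `first_n` helper are assumptions for illustration, so check them against the index files you actually download.

```python
import json
from itertools import islice
from typing import Iterable, Iterator, List


def english_ok_urls(lines: Iterable[str]) -> Iterator[str]:
    """Yield URLs whose record has status 200 and lists English as a language."""
    for line in lines:
        # Assumed layout: "<key> <timestamp> <json record>"
        parts = line.strip().split(" ", 2)
        if len(parts) != 3:
            continue  # not in the expected shape, skip it
        try:
            record = json.loads(parts[2])
        except json.JSONDecodeError:
            continue  # unreadable JSON, skip it
        if not isinstance(record, dict):
            continue
        # Status may be stored as a string or a number, so compare as text.
        if str(record.get("status")) != "200":
            continue
        # Assumed: languages is a comma-separated list such as "eng,fra".
        languages = str(record.get("languages", "")).split(",")
        url = record.get("url")
        if url and "eng" in languages:
            yield url


def first_n(lines: Iterable[str], limit: int) -> List[str]:
    """Take only the first `limit` matching URLs (100,000,000 in my case)."""
    return list(islice(english_ok_urls(lines), limit))
```

Because it works on any iterable of lines, the same function can be fed an open file one line at a time, so the full index never has to sit in memory.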

I also need to think about the people who submit their sites; I need to at least action them. I am thinking of moving to Elasticsearch just to speed everything up. Everything seems slow in MongoDB past 1 million records, even on an SSD.

I have also found myself regressing a bit to my old Burf.com ways of buying domains, looking at turnkey websites and SEO tips and tricks.  I used to love this and had over 100 sites at one point.

On a side note, Burf.co has a new site design and is now hosted remotely 🙂

Echo MG

So iOS development has started on the MVP, which is good. More would have happened if Burf.co had played ball. This is far more important than Burf.co, so if it comes to it, I will turn Burf.co off to focus. It’s nice to do some iOS development again.

Hack24

This had been parked, however there is a game jam coming up this Friday and I think, with a lot of RedBull and sugar, I may be able to actually complete the MVP and get it uploaded, which would be great.

Robots….

So some bits have been printed but I really need to sort the garage out so that I can then sort out all the EDR parts.  Too many jobs, not enough time 🙁
