Thursday Update: The RETURN

Hello and welcome to the first official Thursday update. Yes, many updates have come out on a Thursday, but they were always called Wednesday Updates. Go figure.

Summer

Summer was rammed with kids’ stuff, work, family, and all things not very robotic or techie. The nights are now getting darker and so the geeky epic stuff will return.

Burf.co

So today I was going to shut down Burf.co because I have done very little with it and I am still annoyed with the terrible hosting company (HostPapa), who seem to be proper ropy. However, when I went to see the site stats, which I assumed were maybe 1-2k a month (who would use a search engine other than Google?), I was pleasantly surprised.

Seeing as I have done very little with the site, that’s quite bonkers. So instead of turning it off (due to the price of energy), I have decided I should do something useful with it.

Electronics course
Completed, and I have bought the next module, which is more around robotics and Arduinos.

Review of this year’s goals

So I set out a lot of goals this year based on me working part-time, which sadly ended, and I was already a bit behind. However, I have to say I am still proud that I have achieved quite a bit and I am confident I will achieve more! Maybe not everything, but a lot more than last year.

  • Math: Learn Algebra and kinematics (Stalled)
  • Finish Introducing Robotics  (Done)
  • Start and Finish Robotic Vision: Principles of Vision (Scrapped as very old)
  • Finish Artificial Intelligence for Robotics : This is the self-driving course I really want to do but needs maths. (Not started yet)
  • Give a good shot to Machine Learning by Andrew Ng : A very famous ML course (Not started yet)
  • Get Robotic head talking and listening (Done, will go further)
  • Do something useful with Gwiz (Done and sold)
  • Finish Electronics course (Done)
  • Start and finish Self driving course using deep learning (Not started yet)
  • Build/use already built little robot to do maze-solving work (Little progress made)
  • Finish Robotic Diploma (Missed off list but was important) (Done)
  • Finish Inmoov Robot (Missed off list but is very important) (Progressing, the body is built, still need to do the electronics)

Weirdly, Burf.co, which has been rewritten this year and has extra stuff, isn’t on this list.

The new focused list of things to do by the end of the year

So as there are 3 months to go, let’s focus on a few easy tasks, a few medium tasks, and maybe 1 hard task.

  • Finish Artificial Intelligence for Robotics : This is the self-driving course I really want to do but needs maths. (HARD task and requires me to know Maths better, stretch goal)
  • Do something with Mobility Scooter or Electric Wheelchair (EASY task)
  • Make progress with the Inmoov Robot’s electronics (MEDIUM task)
  • Do some sort of ML with Burf.co (MEDIUM task)
  • Complete Introduction to Application of Robotic Tools (2-week course, EASY task I hope)
  • Start and make good progress on the Self-driving course using deep learning (EASY task)
  • Get ROS robots back and running, remind myself of syntax, etc (EASY task)

Hopefully, I can do most of these. Having a list definitely helps me focus.

Wish me luck 🙂

The Wednesday Update: The Sunny Edition

Welcome to another Wednesday update and boy has the weather been great the last 5 days. Compared to the usual rain we had for a good few weeks, the weather has been epic!

Gwiz update
Stuff has happened, the big thing being that I have now ordered brand new (yes, actually new) batteries for it. I have gone with some 100Ah deep cycle Hankook DC31s. I will be running around 65 volts, which does make the Gwiz shift a bit! They should arrive ready for the weekend 🙂

Inmoov Robotic Head update

Well, this has started to look pretty cool now that I have tidied up the internals. I have mounted the amplifier, sound level PCB, and other things. I need to reprint the top part of the skull.

I think I have got this to a great place. I do want to add a camera to it, but mounting that onto the Pi will be a massive challenge. I also need to fix the Burf.co chat API as it’s dying badly. I would also like the Python script to autostart when the head is turned on, for all of the speech, TTS and API services.

Burf.co : Goodbye server!

So do not run random Linux scripts from the internet on a live server lol. I completely removed sudo permissions from my box, and 1.2 terabytes of data were lost! It was a bad day, I was multitasking badly and paid the price. I managed to rebuild the server rather quickly (2 hours), however the main index data will take ages to get back. I like to see this huge mess-up as an opportunity to do something different, so watch this space!

Review

Let’s see how I am actually doing against the plan. I had a few lost sessions due to house improvements and covid, but how am I doing against the list?

List from February

  • Math : Learn Algebra and kinematics (Little progress made)
  • Finish Introducing Robotics  (Done)
  • Start and Finish Robotic Vision: Principles of Vision (Scrapped as very old)
  • Finish Artificial Intelligence for Robotics : This is the self driving course I really want to do but needs maths. (Not started yet)
  • Give a good shot to Machine Learning by Andrew Ng : A very famous ML course (Not started yet)
  • Get Robotic head talking and listening (Done, will go further)
  • Do something useful with Gwiz (Progressing)
  • Finish Electronics course (Not started yet)
  • Start and finish Self driving course using deep learning (Not started yet)
  • Build/use already built little robot to do maze solving work (Little progress made)
  • Finish Robotic Diploma (Missed off list but was important) (Done)
  • Finish Inmoov Robot (Missed off list but is very important) (Progressing)

I find it very odd how I missed the Robotics Diploma and finishing the Inmoov Robot (full size) off the main list as these are actually more important than most of the other things.

So 3 things done, 1 scrapped (it was mentioned in a previous post why), 2 progressing well, and 5 academic courses to start. Seeing as it’s 3 months in, I think that is actually good progress.

Elasticsearch : I did it, I used an existing framework rather than reinventing the wheel :)

Burf.co and Elasticsearch

So, as you can guess by the title, I decided to check out Elasticsearch (and do a couple of courses on it) to see if I could use it with the Burf.co Search Engine.  Elasticsearch is built to be a search engine, whereas previously I was using MongoDB’s full-text indexing and, well, it just wasn’t up to the job of returning results within a few seconds.

I still use the CommonCrawl data, which is filtered and put into MongoDB (the actual HTML document is stored). I then parse this and chuck it into Elasticsearch.  The program is all written in Java and seems to work pretty well.
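To give a feel for the "parse this and chuck it in" step, here is a rough sketch. It is in Python using only the standard library, not the real Java program, and the names `PageTextParser` and `to_search_document` plus the `url`/`title`/`body` fields are made up for illustration. It only shows turning stored HTML into a flat document; actually sending it to Elasticsearch is left out.

```python
from html.parser import HTMLParser


class PageTextParser(HTMLParser):
    """Collect the <title> and the visible text of a page, skipping script/style."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore JavaScript and CSS content
        if self._in_title:
            self.title_parts.append(data)
        else:
            self.text_parts.append(data)


def to_search_document(url, html):
    """Turn a stored HTML page into a flat dict that a search index could take."""
    parser = PageTextParser()
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        # Collapse runs of whitespace so the indexed text stays compact.
        "title": " ".join("".join(parser.title_parts).split()),
        "body": " ".join(" ".join(parser.text_parts).split()),
    }
```

Keeping the raw HTML in MongoDB and only indexing the flattened version means the parsing rules can change later without downloading anything again.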

The only issues I had were around updating my Java Spring Boot APIs to talk to Elasticsearch.  The Spring Boot Elasticsearch component uses an older version of Elasticsearch (current is v6.4.2, Spring Boot I believe is v5.x).  The other issue I had, which seems silly, is that most examples I found use Maven as the build tool, whereas I use Gradle.

The result is that search on Burf.co now works and is a lot quicker than before (disclaimer: I am still adding data to Elasticsearch, last time I looked it was around 55m pages).  I still need to learn how to optimise it and how to do better, more accurate searches. However, I am happy that I moved to the correct tool for the job 🙂

Hack24

Last Friday, a few of my old Compsoft work colleagues met up for a GameJam, which was fun.  I decided to focus on finishing Hack24 (finally), or to at least evaluate how much work was required.  Sadly the GameJam only lasted 7 hours (until 2 am), however I did get quite a lot of bugs fixed and features implemented in Hack24.  An Android MVP version is coming really soon!

I read a book!!!

So, it appears that you don’t just need to watch TV and play games while you’re on the exercise bike, you can READ, and I did so.  I read a book on flipping businesses (random, I know) and quite enjoyed it.  So I have The $100 Startup to read next 🙂

This week’s update: Just keep spinner!

So as the wife is in Portugal getting some sun and I am home alone with the kids, who are now finally asleep, I thought I would review the week(s).

Burf.co

There is not much to report except a realisation that creating a new Burf Search Engine is gonna be a lot of work!  Even taking a fairly small chunk of it (100 million pages), I just can’t produce results fast enough using MongoDB.  So I am thinking about a few key points I want to focus on.

  • Must be fast
  • Must be fairly useful and produce useful results
  • Must update itself
  • Must have a niche

So at the moment, I take the first 100,000,000 URLs from the CommonCrawl that returned an HTTP status of 200 and are marked as English.  The CommonCrawl contains, I believe, around 2.3 billion URLs, so what I may do is filter which ones I want.  I could also build up a simple list of top sites (BBC, Wikipedia, MSN etc) and just index them once a week.
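That "first N URLs with status 200 and marked as English" filter could be sketched like this. It is a minimal Python sketch, not the real Burf.co code, and it assumes each record is a dict with `url`, `status` and `languages` keys where `languages` is a comma-separated string of codes such as `eng`. That record shape, and the function name `english_ok_urls`, are assumptions for illustration.

```python
from itertools import islice


def english_ok_urls(records, limit=100_000_000):
    """Return an iterator over at most `limit` URLs that are HTTP 200 and English."""
    matches = (
        record["url"]
        for record in records
        if str(record.get("status")) == "200"
        and "eng" in str(record.get("languages", "")).split(",")
    )
    # islice stops reading the input as soon as `limit` matches have been taken,
    # so the whole crawl index never has to sit in memory.
    return islice(matches, limit)
```

Because it is lazy, the same function works on a small test list or on a stream of index lines, and adding a filter for a list of top sites would just be one more condition.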

I also need to think about the people who submit their sites; I need to at least action them.  I am thinking of moving to Elasticsearch just to speed everything up.  Everything seems slow in MongoDB past 1 million records, even on an SSD.

I have also found myself regressing a bit to my old Burf.com ways of buying domains, looking at turnkey websites and SEO tips and tricks.  I used to love this and had over 100 sites at one point.

On a side note, Burf.co has a new site design and is now hosted remotely 🙂

Echo MG

So iOS development has started on the MVP which is good, more would have happened if Burf.co had played ball.  This is far more important than Burf.co, so if it comes to it, I will turn Burf.co off to focus.  It’s nice to do some iOS development again.

Hack24

This had been parked, however there is a game jam coming up this Friday and I think with a lot of Red Bull and sugar, I may be able to actually complete the MVP and get it uploaded, which would be great.

Robots….

So some bits have been printed but I really need to sort the garage out so that I can then sort out all the EDR parts.  Too many jobs, not enough time 🙁

New Burf.co, Business Success Diploma and building furniture!

So, I am trying my best to make sure I update my blog regularly so that I can look back and see what I have achieved.

Burf.co, now with no search button

So between spending over 12 hours building furniture for the wife (2 new bunk beds, 2 desks, 2 chairs and a bookcase) I also managed to build a new MVP of Burf.co.  The previous version was using Swift and was all in one (frontend/backend).  The new one is properly structured and uses VueJS for the frontend, hosted externally, and the backend is in Java Spring Boot.  It is just a prototype at the moment and doesn’t even have a search button: you just type and it starts getting the results (and smashing the db). It was a fun prototype to build and is definitely the way I want to go.
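One common way to stop search-as-you-type from smashing the db is debouncing: only run a query once the typing has paused for a moment. The sketch below is not what the prototype does. It just illustrates the idea in Python, with made-up names (`debounced_queries`, `quiet_period`) and keystrokes given as `(timestamp_in_seconds, text_so_far)` pairs.

```python
def debounced_queries(keystrokes, quiet_period=0.3):
    """Return the texts that would be searched if we wait for a typing pause.

    `keystrokes` is a time-ordered list of (timestamp, text_so_far) pairs.
    A text is searched only if no further keystroke arrives within
    `quiet_period` seconds of it.
    """
    fired = []
    for index, (typed_at, text) in enumerate(keystrokes):
        is_last = index == len(keystrokes) - 1
        if is_last or keystrokes[index + 1][0] - typed_at >= quiet_period:
            fired.append(text)
    return fired
```

With five keystrokes typed in two quick bursts, this would run two queries instead of five. In a real frontend the same effect comes from a timer that is reset on every keypress.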

Next Steps

  • Increase dataset from 10 million to 100 million and make it fast as f*ck
  • Add site pages (contact, about, submit etc)
  • Add tag cloud and routing to make it better for SEO

Business Success Diploma

So ages ago I signed up to Shaw Academy as it has some great courses and reviews.  I got a lifetime membership so I wasn’t in a rush.  Since I finished the bodybuilding show, I have been caning through the course at super speed and hope to take my exam this week :). It was one of my new year’s resolutions to do (so was Hack24 🙁 ).  I am learning a lot and hope to carry on with more advanced training around business.

Echo

The above projects are the fluff, as I call it, to get out of the way so that I can focus on Echo and getting this startup off the ground.  I have already started building a basic website for them.  Once that’s complete, it’s time to focus on the MVP.

Robotics

When it rains, it pours!  So before I had even thought about the above 3 projects, I had decided I was going to build a humanoid out of VEX EDR, which is a brilliant building platform.  They had just brought out their new V5 system which looks epic (more powerful motors, more motors, vision control, touch screen etc).  VEX was very nice in sending me some V5 hardware to start building this 3D Printed / VEX EDR humanoid!  This will be my chilled weekend project 🙂

 

Let’s kick it off again!

So one of the only plus points of not getting any sleep (insomnia being one of the effects of the extreme diet for this bodybuilding show) is that my mind gets very creative and forces me to start kicking off new ideas, projects, missions etc!

So, if you saw my last post, I said I was gonna finish Hack24, fix Burf.co and sort the garage!   So far, Burf.co is back up but about to completely change, the garage is nearly finished being geared up as a robotics lab, and Hack24 has not moved.  I do want to finish Hack24 but I don’t want to rush it, and I want to harness my energy on some crazy robotics ideas while my brain still works 🙂

So the plan v2!  Warning it’s a little bonkers, even for me!

Build a backend set of machine learning APIs that Burf.co, mobile devices, and my robots use to send and retrieve data.  The idea is I could send it a question, a command or an image and it does some magic and responds.

  • So for mobile devices, they would send images and text to speech, it would return ImageNet classification or answers to questions.
  • Burf.co would become more of a knowledge base system using NLP to feed into other systems.
  • There would also be a public-facing chatbot which would hopefully learn off of all of this.  Planning a system POC using AIML to test the waters.
  • This would all somehow be also brought together to add some usefulness to my future robotics projects (image classification, knowledge base, etc)

I bought some odd bits of hardware, upgraded the Burf.co server, bought some domains, and started rewriting Burf.co in Java.  I decided I want to try and use a common language, and randomly Java seemed the best fit (client, server, mobile etc).

It’s gonna be slow progress but I think it’s gonna be exciting.

Let’s kick it all off!

Sorry that there has been little update for the last few months. I decided that before I got too old I should enter a bodybuilding contest, which honestly seemed like a good idea at the time. Well, it’s been the hardest 3 months of my life, to be honest. First off, it isn’t the cheapest thing to do (I have spent £500 on chicken alone), you have to be super disciplined (up at 5, 1 hour of cardio, training even if ill), and you become very moody, tired and even get insomnia (writing this at 3am) because you’re constantly hungry! However, with less than 2 weeks to go, I am still chuffed I decided to do it. On the positive side, because you don’t sleep as much, your brain seems to become very creative!!!

So what’s the plan Batman?

So, I have many plans, lots of ideas and have been researching lots of stuff, but I need to do some housekeeping first!

1) Release a cross-platform v1 of Hack24 to prove the framework works.

2) Fix Burf.co Search Engine because I turned it off.

3) Finish the garage so that large projects are possible 🙂

All of the above are in progress and I hope to have them done very soon. Then it is full speed ahead for some cool robotics / machine learning project that I will discuss in my next post 🙂

ALL THE SERVERS (Sponsored by SUGAR)

So off to see the new Avengers tonight, super excited about it (probably due to a large amount of Pick & Mix and Energy drinks I have already consumed!)

Before I go, I thought I would rebuild the Burf.co server, as the site has been running off my desktop computer for a week or so in preparation for the new server.  At the same time my MacBook Pro has also been filled up with the CommonCrawl!  The reason I took it offline in the first place was that it ran MongoDB like a dog!  Even running RAID 0, the full-text search was slower than my 69-year-old mum! (She is really slow, bless her).

So, the rebuild: I have scrapped the RAID 0 and put in an SSD. I am also running 2 instances of MongoDB on the same box.  The server has 128GB of RAM now so should be fine, however this time I want 2 distinct datasets without the power cost of running 2 servers (Yes, I know I can run stuff in the cloud, but look up the costs of 4TB of space).

One data set will live on the 4TB drive and will be the raw data from CommonCrawl before I have processed it.  The other dataset, which will live on the SSD, will be the processed data for the search engine.   The aim is to have a much smaller refined set of keywords for each page that will live in memory, and in hard times be read off the SSD.  This approach also means I can reprocess the data as many times as I like, plus switch out the full-text engine (2nd instance of Mongo) for Postgres without losing the raw web pages held in the main MongoDB.
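As a rough idea of what "a much smaller refined set of keywords for each page" could mean, here is a minimal Python sketch. It is an assumption about the approach rather than the real processing code: it simply counts words, drops very short words and a tiny hand-picked stopword list, and keeps the most frequent ones. The names `STOPWORDS` and `top_keywords` are made up for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, just for illustration.
STOPWORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
})


def top_keywords(text, limit=10):
    """Return up to `limit` of the most frequent meaningful words in `text`."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(
        word for word in words if len(word) > 2 and word not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(limit)]
```

Since the raw pages stay untouched in the first MongoDB instance, a step like this can be rerun with a different `limit` or stopword list whenever the results look poor.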

My original plan was to try and get between 1-5 million pages indexed, which was more than the original Burf.com ever did.  The current solution is already at 7.7 million without breaking a sweat, and with the new solution I hope to hit 50 million!

I did plan to crawl the web manually before I discovered the CommonCrawl (and I may still do parts), so I bought a second-hand HP c7000 blade server (It’s a f@cking beast, and I can’t even lift it!).  However, I think it’s going to be repurposed for some machine learning stuff across the larger dataset.  I can’t let 16 * 4 * 2 cores go to waste, even though it keeps my house warm!

 

c7000 blade server

 

So next steps for Burf.co

  • Move all the data from the other machines onto the new server and fire up the current Burf.co
  • Get 4TB of CommonCrawl web data and process it
  • Build a new search algorithm
  • Make the site sexy!