The Wednesday Update: Today was an awesome day!

So, as mentioned before, I have reduced my hours at work so that I can try and learn new exciting things, mainly around Robotics, and actually get stuff done without impacting the family. So on Wednesdays I only work until 12.

Today was a really good productive day!

Gwiz Update

After not being on the road for 7 years, having only done 8 miles since 2015, the Gwiz is back on the road! This really got the afternoon off to a good start. For anyone who doesn’t know what this car is, it was one of the first mainstream (ish) electric cars that were popular in London. I believe around 4000 were sold in the UK between 2005 and 2008. My aim is to make this drive itself 🙂

Robotics Update

Even if the day had stopped with the Gwiz passing, it would have been a good day. However, I decided to get back to work on my Inmoov head and make its jaw do lip sync. It’s far from perfect but it is working, and it’s the start of me building stuff.

A bit of a mess
Not the worst soldering

Alfie’s new room

Alfie had been sharing a bedroom with his brothers, so once my ex left, I really wanted to try and make him a cool room. Alfie painted the walls the colour he wanted, chose the carpet, and to finish it off we added some Alexa-enabled stuff. But wait, that wasn’t enough: after some searching, Alfie found a bed he really liked. It had an LED backlight and Bluetooth speakers built in. This arrived today and my father and I built it in rapid time. Now he can actually sleep in his room!

The icing on the cake

Honestly, I was so chuffed by now; it’s 5:30pm and I feel I have achieved so much! However, Alfie, my 10-year-old son, comes to me and says, “Dad, I am ready to release my Roblox game, I need your help”. I was on it like a car bonnet. Between us we got it released and I couldn’t be prouder! I bought Alfie a Roblox coding book for Christmas as he asked for it, and he has been reading it at night off his own bat. I think that was a good investment! Well done Alfie!

The Wednesday Update: On Saturday!

Even though it’s Saturday night, I wanted to do a weekly update. It’s important to me to at least show progress or benefits of taking time off from work to progress my hobbies/education in one way or another. In the last update, I said I had a rough plan. Now, how well have I progressed with that plan?

Begin Robotics by Reading University

This is now complete and I really enjoyed doing it, even though I have completed it before. The simulation tasks really get you thinking and I now even understand the maths in the motor controller example. I did have to take a little time out to get my head around it; however, this is all part of the plan.

Maths

So I have started relearning Algebra using Khan Academy. I have also been asking friends and family who know maths a lot better than me. This has really helped and I am starting to get the brain working again. At school, maths was my strongest subject by far.

Gwiz electric car

The Gwiz is back from the welder. I have done a few jobs on it and it is booked in for an MOT next Wednesday :). It did come back with an odd beep sound, but that was just the door sensor being a pain.

Arena

Taking a step back from trying to jump before I can walk: as mentioned, I did really enjoy the simulations on the Begin Robotics course. So much so that I plan to set up a hardware version of it (i.e. using actual robots). I have a VEX Robotics competition area that I plan to put in the conservatory. This should allow me to do ultrasonic wall detection, light following, line following and even sending data between robots.
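
To give a flavour of the wall detection part, here is a minimal Python sketch of how an ultrasonic reading turns into a distance. It assumes a sensor that reports the echo round-trip time in seconds and a speed of sound of roughly 343 m/s. The function names and the 20 cm threshold are made up for illustration and are not taken from the VEX kit.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at roughly 20 C


def echo_to_distance_cm(round_trip_seconds):
    """Convert an ultrasonic echo round-trip time into a distance in cm."""
    # The pulse travels to the wall and back, so halve the total path.
    one_way_m = SPEED_OF_SOUND_M_PER_S * round_trip_seconds / 2
    return one_way_m * 100


def wall_ahead(round_trip_seconds, threshold_cm=20.0):
    """Hypothetical check: is there a wall closer than threshold_cm?"""
    return echo_to_distance_cm(round_trip_seconds) < threshold_cm
```

A robot loop would call wall_ahead() on each reading and turn when it comes back True.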

Conclusion

All in all, I do think I am making good progress so far!

Progress so far: I need a plan

So I have had my first Wednesday afternoon off and I did do some useful things (clear space for working, sell some stuff, do some online learning). However, it has become apparent I really need to focus on a subset of things and get them done. Already I have signed up for lots of online courses, and have a billion ideas. Usually, this would be a good thing, but context switching wastes time and I don’t have a lot of time!

So, initially, here are the high-level objectives I want to achieve over the next couple of months from my Wednesday afternoons.

Finish Begin Robotics by Reading University

To be fair, I have completed this 4-week course before, but I really enjoyed it and have already nearly finished it again. This is a great introduction to robotics. (ETA 1 week)

Maths

Next, I think I need to improve my Maths, mainly Algebra and surrounding topics that are useful for robotics. Robotics can contain quite a lot of formulas, from PID controllers to working out the position of things (kinematics), and this is usually the area I really struggle with. Khan Academy is my go-to place for this. (2-3 weeks)
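
As an example of the kind of formula involved, a PID controller adds up three terms: one proportional to the current error, one to the accumulated error, and one to how fast the error is changing. Below is a textbook discrete version sketched in Python. The class name and the gains used in the example are purely illustrative; this is not the motor controller from the course.

```python
class PID:
    """Textbook discrete PID controller (illustrative, not tuned for any real motor)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.previous_error = None

    def update(self, setpoint, measured, dt):
        """Return the control output for one time step of length dt seconds."""
        error = setpoint - measured
        # Integral term: running sum of error over time
        self.integral += error * dt
        # Derivative term: rate of change of error (zero on the very first step)
        if self.previous_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

For instance, with gains of 2.0, 0.5 and 0.1, a setpoint of 10, a reading of 8 and a 0.1 second step, the first output is 2.0 × 2 + 0.5 × 0.2 + 0 = 4.1.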

Re-intro into Python and Kinematics
Next, there are 2 other FutureLearn courses I want to finish: another robotics one (again nearly finished, but paused on the maths) and a really simple Python course which I should smash through. I have done a bit of Python in the past but it’s been a while. Python is the go-to language for AI & Robotics. (2 weeks)

Do some practical
Once I reach here I hope to take a short pause from education and put into practice some of the stuff I have learned. I would like to get my Inmoov head talking with lip sync (Arduino wired up to a sound level meter) and I want to add some sensors to the GWiz car and see if we can get a picture of the outside. (2-3 weeks)
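
The lip sync idea boils down to mapping a sound level onto a jaw servo angle. Here is a rough Python sketch of just that mapping (the real thing would run on the Arduino). Every number in it, including the quiet/loud readings and the servo angles, is a hypothetical placeholder that would need tuning on the actual head.

```python
def jaw_angle(sound_level, quiet=50, loud=600, closed_deg=10, open_deg=40):
    """Map a raw sound-level reading to a jaw servo angle in degrees.

    All default values are hypothetical placeholders.
    """
    # Clamp so background noise keeps the jaw shut and peaks don't over-drive the servo
    level = max(quiet, min(loud, sound_level))
    # 0.0 means silent (jaw closed), 1.0 means loudest (jaw fully open)
    fraction = (level - quiet) / (loud - quiet)
    return round(closed_deg + fraction * (open_deg - closed_deg))
```

Reading the sound level in a loop and writing jaw_angle() to the servo each time gives a jaw that opens wider the louder the speech gets.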

Burf.co

On a slight diversion, I would also like to update Burf.co to be a bit more useful/themed around my learning. Everything is a bit static and rather boring. (1-2 weeks)

End of year goals

Once I hit these, I think it’s then time to review and see what I want to do next.
By the end of the year, I would like to be more comfortable with Maths, general robotics and have a semi-complete Inmoov robot. I would also love to finish the Udacity Self Driving course I attempted years ago.

Happy New Year!

So even though I ended 2021 with Covid and wasn’t allowed to see anyone for most of the Christmas period, I look back at the year and feel very thankful!

At the start of 2021 I was in a job I hated, I had a pending court case over the house (so I didn’t even know where I was going to live), I was fighting the ex for joint custody of the kids and my favourite pastime (instead of robots) was drinking a beer and going to bed.

At the end of 2021, I am in a job I love, learning new stuff every day, I managed to keep the house and I have 50/50 custody of my kids. I now have a good relationship with my ex, which is super important for the kids, and I have reduced my hours at work so I can also start focusing on robotics and crazy projects. Beer consumption has also been reduced and replaced with Hot Chocolate.

I really am thankful to be where I am today and to all the people who helped me get here. Special thanks to my loving partner Jo!

So what does 2022 hold? Have I set 4096 New Year’s resolutions… hmm, not really.

Reduce and focus!

As I have reduced my hours at work slightly, I need to make sure that time is used for useful things. My main goal is to get back into robotics. So building and programming robots. This is really the number one goal!

Side projects to this are of course the Burf.co search engine and the GWiz electric car. I see both of these kinda fitting in with robotics a little. Burf has an AI chatbot which will now become the focus. The GWiz would be perfect to make self-drive, or at least to model it.

Ok, now you’re thinking what crazy money or health goal have I got lined up? Over the years I have been able to do some amazing things, however this year, I want to keep it simple, just get a bit fitter and encourage the kids to get involved. Spend money on useful things (like the kitchen) not 15 3D printers!

The last but of course most important goal is to be a good father, a good person, and a good friend to the people around me. I think over the last 2 years I have learned a lot about myself (good and bad) and I really feel that as humans we can constantly strive to be better, kinder people.

Here are some of the things I worked on while I had Covid

Restarted on the Inmoov project
Robot Dog which can track objects etc
New CNC machine for Xmas 🙂

Burf : The reboot!

Did you notice I didn’t say Burf.co? Well, there is a reason for that!

I hope (really hope) that this is the first article of many on the road to me returning to building stuff! It’s been a long road filled with many personal issues (divorce, for example), redundancy, and COVID; however, I feel that journey is now at an end. I am rather excited about the future and have already started a few projects.

Work

So I have returned to Compsoft Creative, briefly as a Senior Software Engineer (C# and Android) and then promoted to the Head of Software Engineering! Compsoft has always been a fantastic place to work and I hope to try and make it even better.

Motorbikes

Yep, I have a new hobby and I am really enjoying it! I have as much fun fixing/restoring them as I do riding them. I have owned about 6 bikes in the last year, plus an awesome Robin Reliant Trike. It’s a great hobby with my father and takes me back to being a kid again.

GWiz of Death

One of my less popular purchases was a tiny electric car called a GWiz. New, they were a terrible car, and mine had been left to die for many years, which made it just that little bit shitter. When it was delivered (as it didn’t work), it was a shed. I have had great fun trying to fix it and now it actually moves. The aim is to get it to do 60mph within a few minutes (or before I run out of road).

Burf.co

So due to a few reasons (electric bills of £200pm) I shut down Burf.co and turned everything off. I killed Vibe Innovation and gave up on anything interesting until home life had settled down a bit. As I said, I think it’s time to start resuming the coding projects and I have turned Burf.co back on :). I hope to actually do some cool stuff with Burf.co in the coming months.

Inmoov robot

I did start restoring it in 2019/2020 but had to take a break from it. I did lots of stuff to it but just didn’t make any videos. Like Burf.co, I hope to get this moving on soon 🙂

The answer is 2,828,752,948

Well, I never thought I would get there and it took a few attempts, but I managed to stick CommonCrawl’s text corpus of the Internet into MongoDB. This is around 13TB of text, which is definitely more than I can read in a day!

The next task is to work out the language of each page (I have a few ways to do that), then ignore anything that is not English. I am not really sure how I would analyse text data in a language I do not understand, hence why I plan to skip it.
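
One very crude way to do the English check is to count how many of a page’s words are common English stopwords. The Python sketch below is only an illustration of that idea: the stopword list and the 10% threshold are arbitrary assumptions, and a proper language detection library would do a far better job on real pages.

```python
# Tiny, hand-picked stopword list (an assumption, not a real linguistic resource)
ENGLISH_STOPWORDS = {"the", "and", "of", "to", "in", "is", "that", "it", "for", "with"}


def looks_english(text, threshold=0.1):
    """Guess whether text is English from its share of common English stopwords."""
    words = text.lower().split()
    if not words:
        return False
    # Strip surrounding punctuation so "happy." or "(the" still match
    hits = sum(1 for word in words if word.strip(".,!?;:\"'()") in ENGLISH_STOPWORDS)
    return hits / len(words) >= threshold
```

Pages where looks_english() returns False would simply be skipped before any further analysis.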

After that I need to try and work out what each page’s context is, then chuck it in ElasticSearch.

File Search

CommonCrawl also contains millions (63,270,007) of links to files like PDFs, Docs, and images. I have started processing this data to see what useful information I can extract.

Burf.co Website

Shocking, I know, I think I need to hire someone to do a good job of it! Watch this space.

It’s been a while

So it’s been over 4 months since I last updated my blog. To me, it seems a lot longer; maybe that’s to do with the Lockdown? Who knows?

So, what have I been up to, except drinking lots of Beer like most of the people under Lockdown are doing? Well, I have been semi-productive. I would say pretty productive; however, compared to some of my friends (one of whom launched a radio station while on lockdown), I will retire to semi-productive.

Inmoov Robot

For the first 3 months of the year, progress was fairly non-existent; however, due to the lockdown, I have come on leaps and bounds. I have rebuilt the head, neck and torso, fixed or improved tons of things, rewired most of it, and made a stand (which blew a hole in my ceiling and nearly killed me). I am waiting for some potentiometers (an absolute pain to find the right ones) for the arms, but I hope to have the biceps done soon. I have made a start printing the legs for it on my new CopyMaster 400 3D printer, which is pretty cool.

I have turned my conservatory into a robotics area especially for this robot which I have found has helped a lot.

Burf.co

So on the mission to learn C# and Azure, I have completely rewritten (all 20 lines of it) the Burf.co Search Engine to be Azure Functions written in C# and running in Docker containers. I actually really enjoyed doing this, and again it’s come further along in weeks than the old Java/Kotlin one did in months. I have Azure Functions that serve up the results to the website, parse and index websites, crawl sites, and even a chatbot for fun 🙂 (SearchAI.uk)

I have even fired up the old HP DL580 server (currently keeping the house warm) to see if I can process data faster.

CommonCrawl WET File Processing

So I decided to try and write a script that would download the WET files from the CommonCrawl (56000 files, 8TB compressed).  These files contain 2.8 billion webpages or so and could be a really fun thing to process using ML etc.

Here is my V1.02 of this script; it’s hacky at best but it’s a start:

 

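// Usings inferred from the types used below (MongoDB driver, Polly, SharpZipLib)
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using ICSharpCode.SharpZipLib.GZip;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Driver;
using Polly;

namespace SimonPlayGround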
{
    class Program
    {
        private const string Path = "https://commoncrawl.s3.amazonaws.com/";
        private const string TargetFolder = @"z:\";
        public static int Jump = 2320;
        private const int Threads = 20;
        private const int Block = 10000;
        private const string UserId = "simon";

        public static async Task Main(string[] args)
        {
            var client = new WebClient();
            var paths = new List<string>();

            var mongo = new MongoClient(new MongoClientSettings()
            {
                Server = new MongoServerAddress("192.168.0.150"),
                MaxConnectionPoolSize = 500
            });

            var db = mongo.GetDatabase("WEB");
            var collection = db.GetCollection<Page>("wet");

            DownloadPaths(client);

            foreach (var line in File.ReadLines(TargetFolder + "wet"))
            {
                paths.Add(line);
            }

            // hack to remove done ones
            for (int i = 0; i < Jump; i++)
            {
                paths.RemoveAt(paths.Count - 1);
            }

            // keep up to Threads downloads/parses in flight at once
            var tasks = new List<Task>();

            for (var i = 0; i < Threads; i++)
            {
                var filename = paths.Last();
                tasks.Add(Task.Run(() => Process(filename, collection)));
                paths.RemoveAt(paths.Count - 1);
            }

            while (tasks.Any())
            {
                await Task.WhenAny(tasks);
                var finishedTasks = tasks.Where(t => t.IsCompleted).ToList();
                foreach (var finishedTask in finishedTasks)
                {
                    tasks.Remove(finishedTask);
                    if (paths.Count > 0)
                    {
                        var filename = paths.Last();
                        tasks.Add(Task.Run(() => Process(filename, collection)));
                        paths.RemoveAt(paths.Count - 1);
                        Console.WriteLine($"Left {paths.Count} {tasks.Count} {Jump}");
                        // todo write here the number of files done
                    }
                }
            }
        }

        public static async Task Process(string filename, IMongoCollection<Page> collection)
        {
            var file = await DownloadWetAsync(filename);

            await ParseWet(file, collection);
            Console.WriteLine($"FILE PROCESSED");
            File.Delete(file);
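            // Process runs on several tasks at once, so bump the shared counter atomically
            System.Threading.Interlocked.Increment(ref Jump);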
            // todo write here that file was completed
        }

        public static async Task ParseWet(string filename, IMongoCollection<Page> collection)
        {
            using StreamReader sr = File.OpenText(filename);
            string s;
            StringBuilder sb = new StringBuilder();
            var foundDoc = false;
            var foundURL = false;
            var url = string.Empty;
            var count = 0;
            var pages = new List<Page>();

            Console.WriteLine($"Processing {filename}");

            while ((s = sr.ReadLine()) != null)
            {
                if (foundDoc == false && s.Equals("WARC-Type: conversion"))
                {
                    sb.Append(s + Environment.NewLine);
                    foundDoc = true;
                }
                else if (foundDoc == true && s.Equals("WARC/1.0"))
                {
                    var from = sb.ToString().IndexOf("Content-Length: ", StringComparison.Ordinal) + "Content-Length: ".Length;
                    var text = sb.ToString()[@from..sb.Length];
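                    // skip the whole line break after the Content-Length value ("\r\n" is 2 chars on Windows)
                    var body = text.Substring(text.IndexOf(Environment.NewLine, StringComparison.Ordinal) + Environment.NewLine.Length);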
                    foundDoc = false;
                    foundURL = false;
                    sb.Clear();

                    try
                    {
                        count += 1;
                        pages.Add(new Page()
                        {
                            Url = url,
                            Body = body
                        });

                        if (count % 1000 == 0)
                        {
                            Console.WriteLine($"Processed {count} {DateTime.Now}");
                        }

                        if (count == Block)
                        {
                            count = 0;
                            await BulkSave(pages, collection);
                            Console.WriteLine($"{Block} done {DateTime.Now}");
                            pages.Clear();
                        }
                    }
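                    catch (Exception ex)
                    {
                        // don't let one bad record kill the whole file, but do say what went wrong
                        Console.WriteLine($"Failed on {url}: {ex.Message}");
                    }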
                }
                else if (foundDoc == true)
                {
                    sb.Append(s + Environment.NewLine);

                    if (foundURL == false && s.StartsWith("WARC-Target-URI: "))
                    {
                        var from = s.IndexOf("WARC-Target-URI: ", StringComparison.Ordinal) + "WARC-Target-URI: ".Length;
                        url = s[@from..s.Length];
                        foundURL = true;
                    }
                }
            }

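            // the last record in the file has no following "WARC/1.0" line, so add it here
            if (foundDoc)
            {
                var from = sb.ToString().IndexOf("Content-Length: ", StringComparison.Ordinal) + "Content-Length: ".Length;
                var text = sb.ToString()[@from..sb.Length];
                pages.Add(new Page()
                {
                    Url = url,
                    Body = text.Substring(text.IndexOf(Environment.NewLine, StringComparison.Ordinal) + Environment.NewLine.Length)
                });
            }

            // save any left over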
            if (pages.Count > 0)
            {
                await BulkSave(pages, collection);
            }
        }

        public static async Task BulkSave(List<Page> pages, IMongoCollection<Page> collection)
        {
            try
            {
                var updateOneModels = pages.Select(x =>
                {
                    var filterDefinition = Builders<Page>.Filter.Eq(p => p.Url, x.Url);
                    var updateDefinition = Builders<Page>.Update.SetOnInsert(p => p.Body, x.Body);

                    return new UpdateOneModel<Page>(filterDefinition, updateDefinition) { IsUpsert = true };
                }).ToList();

                var resultWrites = await collection.BulkWriteAsync(updateOneModels);
                // upserted documents are reported under Upserts rather than InsertedCount
                Console.WriteLine($"OK?: {resultWrites.IsAcknowledged} - Upserted Count: {resultWrites.Upserts.Count} Modified Count: {resultWrites.ModifiedCount}");

                updateOneModels.Clear();

            }
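            catch (Exception ex)
            {
                // log and carry on; a failed batch should not stop the whole crawl
                Console.WriteLine($"Bulk save failed: {ex.Message}");
            }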
        }
        public class HttpRetryMessageHandler : DelegatingHandler
        {
            public HttpRetryMessageHandler(HttpClientHandler handler) : base(handler) { }

            protected override Task<HttpResponseMessage> SendAsync(
                HttpRequestMessage request,
                CancellationToken cancellationToken) =>
                Policy
                    .Handle<HttpRequestException>()
                    .Or<TaskCanceledException>()
                    .OrResult<HttpResponseMessage>(x => !x.IsSuccessStatusCode)
                    .WaitAndRetryAsync(10, retryAttempt => TimeSpan.FromSeconds(Math.Pow(3, retryAttempt)))
                    .ExecuteAsync(() => base.SendAsync(request, cancellationToken));
        }

        public static async Task<string> DownloadWetAsync(string line)
        {
            var filename = line.Split('/').Last();

            if (!File.Exists(TargetFolder + filename))
            {
                Console.WriteLine($"downloading {filename}");

                using (HttpClient client = new HttpClient(new HttpRetryMessageHandler(new HttpClientHandler())))
                using (HttpResponseMessage response = await client.GetAsync(Path + line, HttpCompletionOption.ResponseHeadersRead))
                {
                    // if every retry failed, throw rather than saving an error page as a .gz file
                    response.EnsureSuccessStatusCode();

                    using (Stream streamToReadFrom = await response.Content.ReadAsStreamAsync())
                    using (Stream streamToWriteTo = File.Open(TargetFolder + filename, FileMode.Create))
                    {
                        await streamToReadFrom.CopyToAsync(streamToWriteTo);
                    }
                }
            }
            else
            {
                Console.WriteLine($"GZ exist {filename}");
            }

            var wetFile = TargetFolder + filename.Substring(0, filename.Length - 3);

            if (!File.Exists(wetFile))
            {
                Console.WriteLine($"Decompressing {filename}");
                DecompressGZip(TargetFolder + filename, wetFile);
            }
            else
            {
                Console.WriteLine($"WET exist {wetFile}");
            }

            return wetFile;
        }

        public static void DownloadPaths(WebClient client)
        {
            client.DownloadFile("https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2020-16/wet.paths.gz", TargetFolder + "wet.gz");
            DecompressGZip(TargetFolder + "wet.gz", TargetFolder + "wet");
        }


        public static void DecompressGZip(String fileRoot, String destRoot)
        {
            using FileStream fileStream = new FileStream(fileRoot, FileMode.Open, FileAccess.Read);
            using GZipInputStream zipStream = new GZipInputStream(fileStream);
            using FileStream output = File.Create(destRoot);
            // stream straight to disk so a large file is never held in memory as one string
            zipStream.CopyTo(output);
        }
    }

    public class Page
    {
        [BsonId] public ObjectId Id { get; set; }
        [BsonElement("url")] public string Url { get; set; }
        [BsonElement("body")] public string Body { get; set; }

    }
}
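
For anyone who wants to play with the record-splitting idea without the download and MongoDB parts, here is a small Python sketch of the same approach. It assumes the WET layout the script above relies on: each record starts with a "WARC/1.0" line, page records carry "WARC-Type: conversion" and a "WARC-Target-URI: " header, and a blank line separates headers from the text. The function name parse_wet is made up for this example.

```python
def parse_wet(lines):
    """Yield (url, body) for each 'conversion' record in an iterable of WET lines."""
    url, body, in_record, in_body = None, [], False, False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "WARC/1.0":
            # A new record starts, so emit the page we were collecting (if any)
            if in_record and url is not None:
                yield url, "\n".join(body).strip()
            url, body, in_record, in_body = None, [], False, False
        elif not in_body:
            if line == "WARC-Type: conversion":
                in_record = True
            elif line.startswith("WARC-Target-URI: "):
                url = line[len("WARC-Target-URI: "):]
            elif line == "":
                in_body = True  # a blank line ends the headers
        else:
            body.append(line)
    # The final record has no following "WARC/1.0" line, so emit it here
    if in_record and url is not None:
        yield url, "\n".join(body).strip()
```

Because it is a generator, it can be fed an open file object line by line, so a whole WET file never has to sit in memory.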

2019 can F*ck off! 2020, yeah baby!

Wow, it appears that I have not updated my blog since Feb 2019. At that time, life was great, brainpower was at a high, ideas were flowing and world domination (Burf Style) was near (nope, this isn’t a Brexit rant). Then life got turned upside down, and I had to deal (well, try and deal) with a great crisis in my life: my wife sadly no longer wanted to be with me. 2019 was a horrible, horrible year, so I didn’t really build anything. I shut down my sites, projects, and hobbies and just tried to deal with what was going on at home.

Long road ahead
So, I would love to write that there was a happy ending to 2019, but these things take years, especially when kids are involved, and the process has only just started. However, you just can’t keep my crazy brain down for too long! So while I was aimlessly surfing eBay for random things, I came across this for sale!

Inmoov Robot 3D Printed

Seeing an Inmoov robot for sale on eBay is as rare as rocking horse poo. It takes people years to print these (4 years for this one) and they are usually attached to a college, university or something. This one needed some work, but I thought: what a perfect project for me to sink my efforts into! My dad agreed and purchased it as my Christmas present (thanks, Dad).

The Inmoov robot project is one of the most amazing 3D printing projects out there. It allows someone with a lot of time (and some money to buy servos) to print a fully functional top half of a robot; to buy the same sort of thing would cost you hundreds of thousands of pounds. I have printed a few parts of it before to merge with educational products like LEGO and VEX; however, I always ran out of steam (time) while actually attempting to print the whole thing. Buying a mostly built one means I can fix a bit at a time.

So this year’s focus is to complete this while learning more about robotics, engineering, and electronics :) I will try and do a YouTube series on my progress and I will still try to think of crazy ways to make Burf.co a better search engine.

Watch this space 🙂

Well that plan went Pete Tong!

So, in my last blog post, my servers had given me the middle finger, and in the post before that, I had said that I was determined to do Robotics this year, stay focused and not take anything else on. Well, let’s just put them crazy thoughts into the bin; we all knew I was going to epically fail them!

So in no particular order, here’s what’s been going on:

The Server
So, the new DL580 server is now working. I won’t mention I took a power drill to it, but let’s just say it’s playing ball. It’s now got a new RAID card, an HP StorageWorks array and 22TB of lovely storage ready for anything I may want to chuck at it. It’s currently turned off, but the reasons for that come up soon!

OpenFaas
Short version: it’s the t*ts, it’s epically cool! Go install it now!
So OpenFaas is an Open Source Function as a Service framework (think AWS Lambda, Azure Functions etc) which runs on top of Docker/Kubernetes and allows a developer to focus on creating functions instead of worrying so much about infrastructure. It auto scales, supports tons of programming languages and has great analytics. It’s free and runs on practically any hardware, including the Raspberry Pi. The Cloud addon to it (addon may not be the right word, maybe V2) makes things even easier by hooking into GitHub etc, and it can then automatically deploy after a commit. What really got me excited about this was that I could have a single package containing Python, Java and C# (for example) functions and not need to worry about any of the infrastructure needed to make them an API. I could then use a single command to deploy it all to my server!
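
To show how little code a function needs, here is a sketch of a Python one. It assumes the classic python3 template, where the function lives in a handler.py file and exposes a handle(req) entry point that takes the request body as a string and returns the response body. The word-counting logic itself is just a made-up example.

```python
import json


def handle(req):
    """Entry point in the style of the OpenFaaS python3 template: request body in, response body out."""
    # Made-up example logic: report some simple stats about the text that was posted
    words = req.split()
    return json.dumps({"words": len(words), "characters": len(req)})
```

Everything else (the HTTP server, scaling, routing) is meant to be handled by the framework, which is the whole appeal.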

There are other free FaaS solutions out there (FN, Azure Functions Runtime, OpenWhisk, Serverless etc) but OpenFaas is extremely popular and is only getting stronger (13K stars on GitHub). It has a great set of examples, including a function store, and a very helpful, active Slack channel for support etc.

So, the reason the server has been off is partly that I have been waiting for some hard drives to turn up, and partly because of OpenFaas. Before I had discovered FaaS, I was going to just build VMs (Hyper-V) for all the servers I want. However, all that has changed now due to OpenFaas and I would really like all future stuff to work serverless. Because most Serverless frameworks seem to sit on top of Docker, there is some further research to be done around Windows Server 2016 and containers. The end result may be that I have to format my new server with Ubuntu, or create a Linux VM to host Docker on top of Windows. I still need to do some research, as I believe WS2016 introduced mixed (Windows/Linux) containers for Docker.

Another Startup wants some Burf
Yup, I seem to be in demand, or it’s a way to stop me building Skynet (or some sort of killer robot). Another startup company, which is nearly ready to launch, has asked me to help them out and get them across the line. It’s a fantastic opportunity which I am pretty excited about, as I get to take ownership of the entire technology stack (which for once I don’t need to develop). My first challenge is around AWS and making sure their solution scales, which should be fun! I will disclose more soon!

My servers shit the bed!

So, this year I plan to focus on my robotics and try and get back into machine learning. With my usual way of working, I thought I would fire up the Blade Server (HP C7000) and start getting some environments going etc. Server says no! Well, its admin console does anyway, and this is how you control all the blades. So after a week of kicking it, I decided to get something more practical (and something I could actually lift). I got an HP DL580 Gen 7 + HP StorageWorks running 16TB of space! PERFECT

Round 2: More shitting of the bed!
So everything was going great, except the StorageWorks Disk Array seemed a little slow: sometimes it would do 500MB a sec, then 700KB??? Then it shut down a few times and now it doesn’t even switch on 🙁 Hopefully it will be replaced this week; however, I had just got used to Hyper-V and made a few VMs to replace my old servers.

I think once I have a server infrastructure that works even when it’s not a full moon, I am going to start rewriting the Burf.co search engine so that NLP (Natural Language Processing) is built in at the start. I am thinking of building a search engine just for science and technology articles.

In other news, I have been checking out YOLO (You Only Look Once) for image recognition using TensorFlow. It is super cool and I hope to use it soon with real-time video for my robots.