UTC The Empire Archival

Introduction

This post requires an introduction, and I hope to do it justice here. UTC, better known as The UNKNOWN [?] Trading Company, is a group of players who originated in Counter-Strike (1.6, I believe), moved on to Minecraft, and eventually settled in 2012 on a Minecraft server called MineZ. MineZ is basically DayZ if it were ported to Minecraft and weren't as buggy as DayZ. In concept it is a very cool idea, and it generally works better for the casual player. The UNKNOWN [?] Trading Company, as the name suggests, set out to do Trade Runs in the south. In doing so, they developed tactics and a unique culture and community unlike anything else on the server at the time, much of it spearheaded by the Directoire (Directors) of the clan. I became involved with them in the summer of 2016. I stuck around for a good four years and enjoyed a lot of my time with the UTC. Nowadays I'm no longer in their Discord, by choice (it was a major distraction for me), but I am still very active on the UTC subreddit.

Project

The project in question here is a full archive of all News Articles. When I was new to UTC, and still sometimes to this day, I liked to look at the older News Articles, as they give some insight into the formation of the clan. Building such an archive also wasn't impossible. In 2018 I attempted something similar, though my efforts were cut short by the lack of a proper website to host the contents and by my lack of knowledge of external APIs for gathering the subreddit data for these news articles.

Another part of the motivation is my desire to become competent with Shell Scripting and other glue tools like sed. It's my opinion that a developer doesn't need to overcomplicate anything, and can just use Shell Scripts or other scripts (see Python or JavaScript) to handle generation of blog pages or other typically "complicated" tasks.

On my main page, under the extra information, you will find a direct link to the full archive. In this list you will find both an archive link to the Markdown text converted to HTML and a full URL link to the original reddit post. The motivation for keeping both is that subreddits can get taken down or quarantined (and given the sometimes dicey nature of the content on the subreddit, this isn't at all impossible), though I will say it is unlikely given that it has only some 200 subscribers.

The full script is available if you wish to generate the archive yourself. It is uncommented, but only 15 lines long, so you should be able to understand it. The link to this code is here. As the observant Shell Scripter will note, it is a POSIX-compliant Shell Script, which means it can be run in shells other than Bash. The observant Shell Scripter will also note 3 dependencies: wget, which if you don't have, just stop reading this and evaluate your life decisions; jq, which is a JSON parser that can be used in the Shell; and Markdown. Markdown just converts text in Markdown format to the corresponding HTML. It's very useful for this specific case of archiving every single News Article while retaining the formatting used, though it doesn't cover the Reddit Markdown Specification, which contains slightly more features than the regular Markdown Specification. An example of this can be found in the first issue, with the tiny text. My solution was... well, it was too minor to be worth any time, and it barely affects the readability of the news anyway.
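For readers who want a feel for how wget, jq and Markdown fit together, here is a minimal sketch. It is not the actual 15-line script. The JSON layout (a `posts` array with `id`, `title`, `body` and `url` fields), the `build_archive` function and the `MD_CMD` variable are all assumptions made up for illustration; the real API response will look different.

```shell
#!/bin/sh
# Sketch only: the JSON layout (.posts[] with .id, .title, .body, .url) is an
# assumption for illustration, not the real API's response format.

# MD_CMD is the Markdown converter to run; "markdown" is the usual command name.
MD_CMD=${MD_CMD:-markdown}

# build_archive JSON_FILE OUT_DIR
# Writes one HTML file per post into OUT_DIR and prints one tab-separated
# index line per post: title, local HTML file, original URL.
build_archive() {
    json=$1
    out=$2
    mkdir -p "$out" || return 1

    # One id per line; ids are assumed to contain no whitespace.
    jq -r '.posts[].id' "$json" | while read -r id; do
        # Pull this post's Markdown body and convert it to HTML.
        # MD_CMD is left unquoted on purpose so it may carry options.
        jq -r --arg id "$id" '.posts[] | select(.id == $id) | .body' "$json" \
            | $MD_CMD > "$out/$id.html"

        # Index entry: title, archived copy, link back to the original post.
        jq -r --arg id "$id" \
            '.posts[] | select(.id == $id) | "\(.title)\t\($id).html\t\(.url)"' \
            "$json"
    done
}
```

In use, something like `wget -q -O posts.json "$API_URL"` (with `API_URL` standing in for whatever endpoint you query) would come first, followed by `build_archive posts.json archive > index.tsv`. Calling jq once per post is simple to read but is not the fastest way to do it.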

As for jq... one major performance improvement remains to be made in this script. I could use jq's streaming mode instead of parsing the whole file, which would be much more efficient. I didn't, because jq's documentation for parsing JSON with streams is difficult to read and understand. However, as the JSON file is very small, parsing only takes about 20 seconds in the script, plus a variable amount of time for the wget operation. If you have a better solution, feel free to send the source code of your better shell script to my email, aaronleonard@risingthumb.xyz. Feel free to send me suggestions there too.
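To give an idea of what the streaming approach might look like, here is a small sketch using jq's `--stream` flag. In that mode jq emits `[path, value]` events for each leaf instead of building the whole document in memory. The `posts` array and `title` field are the same made-up layout as before, and `stream_titles` is a hypothetical helper name, so treat this as a starting point rather than a drop-in replacement.

```shell
#!/bin/sh
# stream_titles: reads JSON on stdin and prints every .posts[N].title,
# one per line, without loading the whole document at once.
# The "posts"/"title" names are assumptions about the JSON layout.
stream_titles() {
    # With --stream, each leaf arrives as [path, value]; closing events
    # have only a path (length 1), so length == 2 keeps just the leaves.
    jq -r --stream '
        select(length == 2
               and .[0][0] == "posts"
               and .[0][2] == "title")
        | .[1]'
}
```

For example, `stream_titles < posts.json` would print the titles in the order they appear in the file. Whether this is noticeably faster on a small file is something you would have to measure.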

Final notes

This is just to show that a static website can be used as a full archival system for a regular clan newspaper posted on the subreddit. One might ask, "Hey, does that really count if you need to be there to actually run the code?" To answer that, just set up a cron job to update the archive once a week or so.
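As a rough illustration of the weekly cron job, the helper below prints a crontab line that runs a script every Sunday at 03:00. The `cron_entry` name, the schedule and the script path are placeholders I picked for the example; adjust them to taste.

```shell
#!/bin/sh
# cron_entry SCRIPT_PATH
# Prints a crontab line that runs SCRIPT_PATH every Sunday at 03:00.
# Fields: minute hour day-of-month month day-of-week command.
cron_entry() {
    printf '0 3 * * 0 %s\n' "$1"
}

# To install it, append the line to your current crontab, for example:
#   ( crontab -l 2>/dev/null; cron_entry "$HOME/bin/archive.sh" ) | crontab -
```

The install line is left as a comment so that running the snippet does not touch your crontab by accident.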

The most likely way this script will break is if the API I am doing wget requests to breaks. If that happens, the saving grace is that the parsing will almost definitely fail and the script will just not run.
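If you would rather not rely on the parsing failing by luck, one option is to only replace a file when the command that produces it actually succeeded. The sketch below shows the idea in plain POSIX shell. `safe_update` is a hypothetical helper, not something from the original script.

```shell
#!/bin/sh
# safe_update OUT_FILE COMMAND [ARGS...]
# Runs COMMAND with its output captured in OUT_FILE.tmp, and replaces
# OUT_FILE only if COMMAND exited successfully and wrote something.
# On failure the old OUT_FILE is left untouched.
safe_update() {
    out=$1
    shift
    tmp=$out.tmp
    if "$@" > "$tmp" && [ -s "$tmp" ]; then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

For instance, `safe_update posts.json wget -q -O - "$API_URL" || exit 1` (with `API_URL` as a placeholder) keeps last week's `posts.json` in place and stops the script if the download fails or comes back empty.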

So if you own a site (or even a github.io site), I highly recommend experimenting with Shell Scripts for managing that site, and using either FTP or Git to update it. There are plenty of ways to automate the execution of Shell Scripts, Git and FTP. For a personal site like this, it's more than enough to handle all the changing parts.