The UNKNOWN [?] Archival Project


In a short span of time, colossal quantities of data and information have been lost. This extends to everything: great civilizations, our past way of life and that of our ancestors, and their cultural and symbolic icons. Even in recent history, as recently as 40 years ago, a huge amount of data has been lost. Software on floppy disks is disappearing, as the hardware was not designed to last so long. Discussions on platforms such as Usenet, bulletin board systems (BBSes) and forums have vanished, and many old social media platforms have gone extinct. It would be a shame to lose information of such quantity and quality, but the task of archival carries costs of its own: the storage medium, and the gathering of the data. I believe it is the responsibility of each person to at least attempt to archive something, or to contribute to other archives. Without this attempt, future historians, scholars, or even just interested and fascinated individuals will be disappointed by our lack of effort in any form of archival.

With this said, I am setting out to find and archive as much relevant and important information regarding the UNKNOWN [?] Clan as I can. I have already begun this project on a smaller scale, but this blog post will discuss the technology, software and plan for the archival as a whole.

The technology

It is worth taking a moment first to evaluate the technology. The vast majority of data and information relevant to the UNKNOWN [?] Clan takes the form of binary data, and much of it is text: posts on forums or Reddit, or conversations on older platforms such as Raidcall, Skype and Discord. As such, it is important to establish a set of four conditions that will be used to evaluate whether some data or information is worth preserving in this archive.

  1. Relatively public information. By this I mean information that, in its various forms, can be gathered by a member of the UNKNOWN [?] Clan.
  2. Non-confidential information (or information scrubbed of confidential detail). This is to protect all involved parties, as well as to respect the privacy of the Clan.
  3. Originating from, or relating to, the UNKNOWN [?] Clan. This is simply a matter of establishing the origin and relevance of any information in this archive.
  4. Deemed to hold merit. This is a necessity due to time and storage constraints, and so that certain information can be prioritised for archival.

Elaborating on this, I will take a moment to discuss the future-proofing of this technology. I will use APIs where relevant, and I will attempt to use data formats that are likely to be around for a long time, such as plain HTML files, plain media files, and so on. I will avoid using any database software, and a lot of the data will be formatted and archived using shell scripts, primarily in POSIX-compliant shell, but possibly some in Bash. I will also provide the raw data and information files zipped for people to download and make personal backups of. The greatest vulnerability in all this is its centralisation on this server, and to tackle that, a lot of the data will be mirrored on my physical machine. This should render the archive future-proof (save for the abandonment of file standards, or the loss of software to read the files) and destruction-resistant (save for nobody making personal backups, combined with damage to both the main server and my main machine).
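As a rough illustration of how the mirroring and downloadable backups could be checked with nothing more than POSIX shell, here is a minimal sketch. It is not the archive's actual tooling: the function names, the manifest layout and the use of tar (standing in for zip) are all assumptions made for the example. The idea is to record a checksum for every file so that the server copy, the local mirror and any personal backup can be compared later.

```shell
#!/bin/sh
# Hypothetical sketch, not the archive's real scripts.
# Function names and the manifest layout are assumptions for illustration.

# list_checksums DIR
# Print one line per regular file under DIR: CRC, size in bytes, relative path.
# Sorted by path so two identical copies of a tree produce identical output.
# (File names containing newlines are not handled.)
list_checksums() {
    ( cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort -k 3 )
}

# write_manifest DIR MANIFEST
# Save the checksum list to MANIFEST, which should live outside DIR.
write_manifest() {
    list_checksums "$1" > "$2"
}

# verify_manifest DIR MANIFEST
# Succeed only if DIR still matches the checksums recorded in MANIFEST.
verify_manifest() {
    list_checksums "$1" | cmp -s - "$2"
}

# pack_archive DIR OUTFILE
# Bundle DIR into a single tar file for download; OUTFILE must be outside DIR.
pack_archive() {
    ( cd "$1" && tar -cf - . ) > "$2"
}
```

With these defined, running write_manifest on the server copy and verify_manifest against the mirror (or against an unpacked personal backup) would show whether the two still agree. cksum is used only because it is part of POSIX; a stronger hash tool could be swapped in where one is available.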

Arguably the biggest issue overall will be one of permission: respecting intellectual property and copyright, and giving credit to the individuals involved.

Current Priority for Archival

How you can help