Experiments in AI generated code

I personally think that AI generated code is still rather naff. I have written at length about my opinions on AI generation in various fields and areas here. This post is not going to be me singing its praises, nor damning it to all hell.

The tool I wrote with it is a little tool called ctag. It was already condemned to the graveyard of good ideas left unimplemented because of too little time and interest. When I started using AI on it, it could navigate a directory structure and tab between the directory view and a metadata viewer (which wasn't implemented). When I finished, it could navigate a directory structure, find specific files and folders with a fuzzy finder, select multiple music files and change their metadata, tag the track order of those files by the order in which they were selected, and recursively apply a piece of metadata to all music files in a directory. The actual tagging is done by another program called id3v2.
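To make the "track order by selection order" feature concrete, here is a minimal Python sketch. It is not ctag's actual code: the function name is hypothetical, and the `--track` flag with its `N/total` form is assumed from id3v2's help text, so check `id3v2 --help` on your own system. It only builds the commands and does not run anything.

```python
def track_commands(selected_files):
    """Build one id3v2 command per file, numbering tracks by selection order.

    `selected_files` is the list of paths in the order the user picked them.
    The `--track N/total` form is an assumption about id3v2's interface.
    """
    total = len(selected_files)
    return [
        ["id3v2", "--track", f"{number}/{total}", str(path)]
        for number, path in enumerate(selected_files, start=1)
    ]


if __name__ == "__main__":
    # The first file selected becomes track 1, regardless of its name.
    for command in track_commands(["outro.mp3", "intro.mp3", "middle.mp3"]):
        print(" ".join(command))
```

Each returned list could be handed to something like `subprocess.run` once you are happy with what it would do.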

My experience was pretty much that LLMs are competent at making small tools. However, as the number of LOC in this program grew, the LLM got slower and slower at making changes. I suspect that in large projects the utility of LLMs fades somewhat unless you're able to manage these context issues. One method I have heard about is having the LLM write and update its own documentation of sorts, an "agents.md" file. I did not try this. Another way of managing it is likely building a test suite. If I were to expand this project any more (which I don't currently plan to do, as I consider it mostly feature complete for my purpose of tagging my own music library), I would need to set up a test suite.

Another issue with LLMs is that, firstly, the implementations they choose are not ideal, and you end up with quite a bit of duplicated code. Compared against what you'd expect from a junior software developer it's fine, and for a small project kept to itself, especially one that will have a single or "at best" double-digit userbase, that is also fine. I would not trust this software for anything larger.

Another issue is the sheer quantity of code it produces. It is very difficult for a human to verify the correctness of that much code. This doubles my suspicion that, for software development, the burden has shifted significantly from writing code to maintaining it and reviewing pull requests for correctness. Because of that sheer quantity, LLMs also miss places where the logic is the same. As an example, in ctag it duplicated the code for recursively tagging genres, years, albums and artists, when these could have been a single function. Secondly, the approach LLMs take matches the common approach of a lot of low quality software, which is mixing data with implementation and hard coding things. If the interface of the id3v2 program changes, this would be tedious to update. In most software I write I try to take a data oriented approach, where the data is mostly separate from the implementation (the only time data should live in the implementation is for sensible defaults or for things that are unlikely to change). This has the effect that I can then write or tweak tools that produce this data, or transform it to meet my specifications.
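As a rough sketch of what "a single function" plus a data oriented approach could look like, here is a small Python example. It is not ctag's code. The names `FIELD_FLAGS`, `MUSIC_SUFFIXES`, `build_command` and `recursive_tag_commands` are hypothetical, and the id3v2 flags in the table are assumed from its help text (verify with `id3v2 --help`). The point is that the flags live in one table, so an interface change means editing data rather than four near-identical functions. The code only builds commands and does not run them.

```python
from pathlib import Path

# Data, kept apart from the logic: which id3v2 flag sets which field.
# These flag names are assumptions; check them against your id3v2 version.
FIELD_FLAGS = {
    "artist": "--artist",
    "album": "--album",
    "genre": "--genre",
    "year": "--year",
}

# File extensions treated as music files (also just data, easy to extend).
MUSIC_SUFFIXES = {".mp3"}


def build_command(field, value, path):
    """Return the argument list that would set `field` to `value` on `path`."""
    return ["id3v2", FIELD_FLAGS[field], str(value), str(path)]


def recursive_tag_commands(root, field, value):
    """One function for every field: walk `root` and build a command per file."""
    files = sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in MUSIC_SUFFIXES
    )
    return [build_command(field, value, p) for p in files]
```

Something like `recursive_tag_commands("Music/SomeAlbum", "year", 1984)` then covers what would otherwise be a dedicated "recursively tag year" function, and adding a new field is one more line in `FIELD_FLAGS`.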

Another thing has me concerned. For context, I used GPT-5 Mini integrated in Visual Studio Code using Copilot in agent mode. Agent mode is concerning and has the potential for huge damage if an AI requests to run erroneous commands. There have been occurrences of rm -rf being run against an entire PC instead of a specific directory. I believe both Steam and a Linux distribution whose name I've forgotten have made this mistake before, basically deleting a user's entire system. These sorts of things are likely to happen, especially as LLMs are likely to want to run peculiar programs.
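One way to blunt this kind of mistake is to check that a path really sits inside the directory you meant before anything destructive touches it. The Python sketch below shows that idea only. It is not something Copilot or VS Code provides, and the function name is hypothetical. It resolves both paths first, so `..` segments and symlinks cannot sneak the target outside the allowed root.

```python
from pathlib import Path


def is_safe_to_delete(target, allowed_root):
    """Return True only if `target` is strictly inside `allowed_root`.

    Both paths are resolved first, so "project/../.." style tricks and
    symlinks are compared by where they actually point. The root itself
    is rejected, so the whole project cannot be wiped by accident.
    """
    root = Path(allowed_root).resolve()
    resolved = Path(target).resolve()
    return resolved != root and root in resolved.parents


if __name__ == "__main__":
    # A build folder inside the project passes; the filesystem root does not.
    print(is_safe_to_delete("project/build", "project"))
    print(is_safe_to_delete("/", "project"))
```

A check like this is no substitute for reading the commands an agent asks to run, but it turns "delete everything" into a refused request rather than a lost system.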

I will say I am pleasantly surprised by its capabilities for making small tools and "toy" programs... I have additionally used LLMs as a minor debugging tool, where I paste my code and ask it to suggest where errors are occurring. This can be useful for jogging the mind about possible causes in particularly complex systems (though this is only relevant if the LLM has been trained on that project or language).

As I said before in my opinions on AI, I see LLMs mainly as a threat to entry-level software development jobs, but not to senior-level ones, as the requirement for understanding and knowing far-reaching consequences is still there. That requirement only grows with the sheer quantity of code and the sheer... badness of the code, especially when correctness isn't enforced or validated. Where LLMs shine currently is small and quick developer or private tools of 2000 LOC (Lines of Code) or less, ideally 500-1000 LOC. Even then it depends on you being able to describe in precise and exact language what you need implemented, and implementing it one step at a time rather than throwing the entire list at an LLM. It may also help to provide scaffolding code so that the LLM can conform roughly to the scaffolding (like how vines cling to fences).

I will also say that asking an LLM to design something for you is, in my opinion, doomed to produce the same kind of results as AI generated art. An LLM cannot make reasonable decisions about what is a good feature to add. In this way, an AI is even more of a victim of additive design: the mentality some people have that the only way to improve something is to add things. Sometimes the best improvement is removing or simplifying things. I think this somewhat applies to design patterns in programming as well, which is why the idea above about scaffolding code may be a worthwhile consideration.

I do personally see LLMs being a useful minor tool for making very rough and dirty private development/convenience tools, and for minor bits of debugging, and this will probably be how I use LLMs in programming going forward. It still requires a competent guiding hand, otherwise it'll make complete nonsense. That said, I still see it as a minor tool, not something revolutionary, and it has many caveats.

Finally, I need to remind people to be civilised, since AI and LLMs are hot topics full of hatred and vitriol these days and this is just an experiment. See my blog post on my current opinions. As I said in that post, this doesn't cover all the overwhelmingly evil and corrupt ethical failures of various LLM companies. The only thing I will say about this is that governments have provided one set of rules for them and another set of rules for us, and this sort of behaviour inspires vigilante justice, lawlessness, anarchism (in the sense of disrespecting and ignoring laws, even the ones that improve society) and, in both the best and worst case scenarios... revolution. I condemn the evil at the steering wheel of these technologies.

Published on 2026/03/09
