When it comes to all the fancy software tools out there, I find myself becoming more and more of a Luddite.
I've learned, through reading and discussion, many ways to handle data; it's something I'm always interested in. However, the one thing I've learned - sometimes the hard way - is that no matter how well you handle your data, it all falls to pieces when your tools break.
The wider issue will be explored further in The Chain of Trust. The basic idea is that you build on top of your tools, and that reliance can bring about one of those "all your eggs in one basket" problems. I'll briefly summarize my particular case, which leads to the main topic of this post: versioned directories.
Most people have their web-displayed data stored entirely online; the single copy of it lives remotely. So their chain of trust is something like Computer => Internet Connection => Hosting Provider.
Their computer dies, the data is safe with their provider. Their net connection breaks, they can't get at their stuff anymore. Their hosting provider goes under, they're screwed. Nice.
This is solved by looping the chain of trust in on itself with Computer => Internet Connection => Hosting Provider => Computer. This is obvious enough.
But let's take a programming project as our example. I'm not sure how to explain this universally, so I'll use my Compiled Website project. I think it would work something like Source Code [editor] => Computer [OS] => Internet Connection [wireless?] => Hosting [browser], with the bracketed items describing some of the tools used.
If we look more closely at the tools and layers, we can see a reliance on the editor for the Source Code, a reliance on the OS for the Computer, etc.
Many programmers put their projects under version control with something like Concurrent Versions System (CVS) or Subversion (SVN), and now git is becoming quite popular. This extra layer adds complexity and a reliance on yet another tool.
Every tool and service being used is another link in a long chain of trust. Any breakage within it can be a serious issue. A computer can have various issues from hardware failure through to operating system issues, even hacking, viruses or malware. A piece of software could break, have its configuration break, or it could reach its end-of-life and become unsupported. Mentioning software in this light is a nod to the value of free software.
So where am I going with all of this? Well with the Compiled Website project, I'm cutting out huge amounts of dependencies, essentially removing links in that chain of trust.
No scripting: No PHP etc. No special CMS engine. No database, so no MySQL (or its superior PostgreSQL). No version control system. No special hosting functionality whatsoever.
So how do I handle multiple versions? I decided to just do things the brute-force way: I keep separate directories with the files. If I cared, I could keep them as tarballs or in some other compressed format. While working, my editor gives me unlimited undos and I save regularly, but after testing and confirming a new feature or a good fix I bump up the version. It's a horrible duplication of space to do things this way, because I may have some files which are exactly the same within multiple version-directories, but I don't mind so much. [note to self: don't let fdupes run amok there]
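If I ever do go the tarball route, it's a one-liner per version; here's a minimal sketch, with 1.2.3 standing in for whatever version directory is being archived:

tar czf 1.2.3.tar.gz 1.2.3/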
So I have a list of versioned directories, and a simple problem to solve: I want my scripting to start off with the most recent version already picked out. But how do I do that? It's a simple concept to grasp, so it should be straightforward to solve.
ls -1d */ | grep '[0-9]*\.[0-9]*\.[0-9]' | sed 's|/$||' | tail -n 1
Welcome to Linux.
ls -1d */
It should be a simple affair to bring up a listing of only directories. Apparently it's not, and lots of people are hauling out find or other tools. Seriously, is there no such functionality built into ls? A few people out there knew about this apparently occult trick, and it turns out the */ isn't an ls feature at all: it's the shell's glob, which expands only to directory names. I put in the 1 to get a nice single column that's easy to process.
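In fact, since the shell's glob is doing the real work, ls is nearly optional here; a pure-shell sketch of the same listing:

printf '%s\n' */

Here printf prints each expanded directory name on its own line, no ls involved.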
grep '[0-9]*\.[0-9]*\.[0-9]'
I actually considered using sed for this filtering, but then realized that would be silly; grep is the tool for the job. This lets me see only the versioned directories and not other directories, such as those with libraries in them. Oh, I also learned to version my libraries too. =)
The dots are escaped so they match literal dots rather than any character, so something like 123.456.789 is valid and is passed through by this command.
Can I use regular expressions directly with ls? Probably not; ls only sees whatever names the shell's globbing hands it.
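If I wanted to be stricter, anchoring the pattern would keep out directories that merely contain a version-like string somewhere in their name; a sketch using grep's extended regex mode:

ls -1d */ | grep -E '^[0-9]+\.[0-9]+\.[0-9]+/$'

The ^ and /$ anchors mean the whole name has to be a dotted version number (plus the glob's trailing slash).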
sed 's|/$||'
This removes the trailing slash. It turns out ls isn't adding it at all: the shell's */ glob expands to names that already end in a slash, and ls just prints the arguments it was handed. That also explains why ls -1d --indicator-style=slash */ shows two trailing slashes: one from the glob and one appended by ls itself.
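Or skip sed entirely and let the shell strip the suffix itself; a pure-shell sketch using parameter expansion:

for d in */; do echo "${d%/}"; done

The ${d%/} expansion chops the trailing slash off each name, which is one less link in this little chain.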
tail -n 1
Simple enough: just grab the last entry of whatever gets passed through. Since ls sorts its output, that's the highest version number, at least as long as lexical order agrees with numeric order.
Now I may end up with some issues, such as:
0 1 10 2 ...
This would piss me off quite a bit. Maybe I could sort differently somehow (there's a sketch after this list), or I may just break down and zero-pad my directory names, like
00 01 02
But I hope I don't have to do that.
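Actually, it turns out I may not have to: GNU sort has a -V (version sort) flag that orders dotted version numbers numerically. Assuming GNU coreutils, the pipeline just needs a sort stage so I'm not trusting ls's lexical ordering:

ls -1d */ | grep '[0-9]*\.[0-9]*\.[0-9]' | sed 's|/$||' | sort -V | tail -n 1

With that, 10 lands after 2 where it belongs, and multi-part versions like 1.10.0 sort after 1.2.0 as well.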
So even though I'm eliminating all kinds of tools, I still have stupid shit like this to deal with. I'm still relying on command-line tools: Linux in general, but also grep, sed and tail. I'm not so bothered by this; I'll probably keep using Linux for a while. I did always want to try BSD (DragonFly BSD looks awesome), but the odd wrappers they use to run Linux software weird me out.
Whoa, on that note... from their 2.5.x Development news, 08-Feb-2010:
Our new swapcache is now fully operational in the development branch. Swapcache is a general system feature which expands the use of swap to cover clean filesystem data and meta-data (not just dirty anonymous memory). When used with a SSD (Solid State Drive) swapcache has an enormously positive effect on system performance, almost acting like extended memory. Even a small 40G SSD can cache upwards of 80 million inodes for fast directory operations, or make your working dataset completely free of slow HD seeks virtually without having to lift a finger. The implications are staggering.