In the previous article, we discussed the analogy between the Unix File System and Fluidinfo’s tag hierarchy. This analogy forms the basis and inspiration for the Fluidinfo Shell, Fish. But a file system without move and copy commands would be a sad and contemptible thing, and at the moment Fish, like Fluidinfo, is impoverished by the lack of such basic functionality as cp and mv. Here we will try to design such functionality, building on the analogy.
Copying and Moving
In Unix we can
- copy files with the cp command
- copy directories (and their contents) with cp -R
- move files to a different location with mv
- move directories (and their contents) to a different location with mv
- rename files, also with the mv command
- rename directories with mv
- delete files with rm (“remove”)
- delete empty directories with the rmdir command and delete directories together with their contents with the rm -r command.
In general, the functionality of mv is conceptually equivalent to copying and then removing an item.
We can also copy files and directories between different machines using the rcp and rsync commands, which are both similar to cp but understand a host: prefix. An alternative to these commands is the ftp command, which operates in a very different manner, and uses different mechanisms, to ultimately similar effect.
Fish, today, offers no commands for moving, renaming or copying tags or namespaces, but does provide an rm command that performs the combined functions of rm and rmdir (also requiring the -r and -f flags in some cases). It is worth noting, briefly, that in some sense Fish’s rm goes much further than rm on Unix, in that the command
fish rm rating
removes not only the abstract rating tag, but every occurrence of that tag in Fluidinfo, potentially on millions of objects. This is why Fish requires the -f flag to force the removal of a tag that is in use. In the previous article, we argued that Fluidinfo’s objects are the natural analogues of computers in a network. From that perspective, if we think of rcp as a more powerful, remote version of cp, Fish’s rm command is more like a remote rm (presumably rrm) that allows you to remove all files with a given path on all hosts simultaneously. It’s as if you could say something like:
rrm -f *:~/.bashrc
to remove the .bashrc in your home directory on every machine on which you have an account. Indeed, if Linus Torvalds were not merely Linux’s creator but the superuser on all copies of the OS, with such a command he could remove everything on all Linux hosts with
rrm -rf *:/
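To make the reach of Fish’s rm concrete, here is a minimal sketch in Python of the behaviour described above. It is a toy in-memory model, not Fish’s actual code: the names rm_tag and TagInUseError, the dictionary layout and the sample objects are all assumptions made for illustration.

```python
class TagInUseError(Exception):
    """Raised when an in-use tag would be removed without force."""


def rm_tag(abstract_tags, objects, tag, force=False):
    """Remove an abstract tag and every concrete instance of it.

    abstract_tags: set of tag paths such as "alice/rating" (hypothetical layout)
    objects: dict mapping an object's about value to its {tag path: value} dict
    Returns the number of concrete instances removed.
    """
    if tag not in abstract_tags:
        raise KeyError(tag)
    in_use = [about for about, tags in objects.items() if tag in tags]
    if in_use and not force:
        # Mirrors the -f requirement: refuse to strip a tag that is in use.
        raise TagInUseError(f"{tag} is on {len(in_use)} object(s); use force")
    for about in in_use:
        del objects[about][tag]
    abstract_tags.discard(tag)
    return len(in_use)


# Made-up sample data: two objects carry alice/rating.
abstract_tags = {"alice/rating", "alice/has-drunk"}
objects = {
    "book:dune": {"alice/rating": 9},
    "book:emma": {"alice/rating": 7, "alice/has-drunk": None},
}
removed = rm_tag(abstract_tags, objects, "alice/rating", force=True)
```

A single call strips the tag from every object that carries it, which is the sense in which the command resembles the imaginary rrm rather than plain rm.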
Let’s think about the Unix commands cp and mv, and their possible generalizations to the realm of Fluidinfo. Recall that, when we want to be precise, we need to distinguish between two different senses of the word “tag”. Ordinarily when we attach a tag, possibly with a value, to an object, we create what we might variously call a “tag instance” or a “concrete tag”. Fluidinfo, however, maintains a user’s tag hierarchy independent of whether tags are actually in use. When discussing tags in this sense, independent of objects, I call them abstract or platonic tags. These are quite real and can persist even when the tag is not in use. The diagram below shows the abstract tag hierarchy for Alice, on the right, and her file system, on the left. Note carefully that in her private namespace Alice has both a tag and a namespace called moments.
In what follows, assume Alice’s current working directory on Unix is her home directory /home/alice, and for concreteness, assume that her shell is the Bourne Again Shell, Bash.
Copying a file or a tag
On Unix, Alice can copy her has-drunk file into her private directory by saying any of the following:
cp has-drunk private/has-drunk
cp has-drunk private/
cp has-drunk private
Obvious though it is, let us spell out what this means. After running one of these commands, Alice will have two files, /home/alice/has-drunk and /home/alice/private/has-drunk, where previously she had only one, and each will contain a separate copy of the same data.
We could plausibly adopt any or all of these in Fish to copy Alice’s has-drunk tag to her private namespace. But what would that do? I think the most obvious action would be to create a new abstract tag called alice/private/has-drunk and then tag all the objects currently tagged with alice/has-drunk with the new alice/private/has-drunk tag, copying their values if any. We would need to consider quite carefully how to handle permissions when performing such copying. The case is rather different from Unix, because in Unix permissions are hierarchical in the sense that a file with public read permission in a fully private directory cannot be read. This is an important detail. On Unix, let’s assume that Alice’s private/things file has open read permissions (644):
alice$ ls -l private/things
-rw-r--r-- 1 alice staff 0 17 Jan 18:57 things
and that she then locks her private directory so that only she can look at it.
alice$ chmod 700 private
If Bert now tries to look at alice/private/things he will find that he cannot:
bert$ cat ~alice/private/things
cat: /home/alice/private/things: Permission denied
In particular, on Unix this means that if Alice moves a non-private file to a private directory (by which I mean, one with neither read nor execute permission) it becomes unreadable.
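The hierarchical check can be sketched as a small Python function. This is a simplification that ignores ACLs, group membership and the superuser, and the function name other_can_read is made up for illustration; the 755 modes for the directories above private are also assumptions, since the article does not state them.

```python
import stat


def other_can_read(file_mode, ancestor_dir_modes):
    """Return True if a user who is neither owner nor in the group can read the file.

    file_mode: permission bits of the file, e.g. 0o644
    ancestor_dir_modes: permission bits of every directory on the path to it
    """
    # Every directory on the path must grant search (execute) permission...
    for mode in ancestor_dir_modes:
        if not mode & stat.S_IXOTH:
            return False
    # ...and only then do the file's own read bits matter.
    return bool(file_mode & stat.S_IROTH)


# Assumed modes: /home (755), /home/alice (755), then private
# before (755) and after (700) Alice's chmod.
before = other_can_read(0o644, [0o755, 0o755, 0o755])
after = other_can_read(0o644, [0o755, 0o755, 0o700])
```

The file’s own mode is the same in both calls; only the mode of the enclosing directory changes the answer for Bert.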
In Fluidinfo, the permissions hierarchy is consulted only when new tags and namespaces are created. So if Alice creates a new tag in her private namespace, it will default to being private; if we copy the permissions of a tag when copying the tag, its permissions will be unaltered, and potentially different from if we created the tag afresh in the new location.
The correct behaviour is not clear, and either way there is potential for surprising the user in unpleasant ways, most obviously by making public data that the user intended to be private. We have seen above how by copying permissions with tags we could violate a (reasonable) assumption that moving a tag into a private namespace would make it private. If we fail, however, to copy permissions, copying a private tag to a non-private namespace would result in a non-private tag, which might also be a nasty surprise.
My first inclination here is to do a “reverse Facebook” by, when in doubt, setting the permissions on the destination to the more restrictive of the two possibilities, on the assumption that revealing data that Alice wanted to keep private is both worse and less correctable than making data more private than intended, given the inability to make people unsee (or even, uncopy) things. Needless to say, we could also have options to allow the user to choose what behaviour she wants.
Q1. How should permissions behave when tags and namespaces are copied or moved? Should we go for:
a. The permission of the destination is copied from the source?
b. We follow mv = cp + rm and create the new tag or namespace in the new location according to default rules?
c. Maximum privacy: apply the more restrictive of the permissions suggested by a. and b. (or, if necessary, their most restrictive intersection)?
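The three options in Q1 can be compared with a small Python sketch. It assumes that a permission can be modelled as an open or closed policy plus a set of exceptions; the Permission class and both function names are hypothetical, and real permissions cover several actions rather than the single one modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permission:
    policy: str  # "open" (all but exceptions) or "closed" (only exceptions)
    exceptions: frozenset = frozenset()

    def allows(self, user):
        listed = user in self.exceptions
        return not listed if self.policy == "open" else listed


def most_restrictive(a, b):
    """Permission allowing exactly the users that both a and b allow."""
    if a.policy == "open" and b.policy == "open":
        # Denied by either means denied.
        return Permission("open", a.exceptions | b.exceptions)
    if a.policy == "closed" and b.policy == "closed":
        # Allowed only if allowed by both.
        return Permission("closed", a.exceptions & b.exceptions)
    open_p, closed_p = (a, b) if a.policy == "open" else (b, a)
    # Allowed by the closed one and not denied by the open one.
    return Permission("closed", closed_p.exceptions - open_p.exceptions)


def destination_permission(source, dest_default, option):
    """Options a., b. and c. from Q1."""
    if option == "a":
        return source
    if option == "b":
        return dest_default
    if option == "c":
        return most_restrictive(source, dest_default)
    raise ValueError(option)


# A public tag copied into a namespace whose default is private to alice.
public = Permission("open")
private = Permission("closed", frozenset({"alice"}))
result = destination_permission(public, private, "c")
```

Under option c. the result is the same whichever way round the copy goes, which is the point of the “reverse Facebook” rule: neither direction can widen access by accident.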
Moving or renaming a file or a tag
Going back to Unix, Alice can move her has-drunk file from her home directory to her private directory with any of these commands:
mv has-drunk private/has-drunk
mv has-drunk private/
mv has-drunk private
Again, we could plausibly adopt all of these forms in Fish to move Alice’s has-drunk tag to her private namespace. In this case, there seems no real issue about what should happen. We can’t sensibly “move” the abstract tag but not the concrete ones. Using our rule of thumb that
move = copy to new location + remove the original
or
mv src dest = cp src dest; rm src
this reinforces the case for making cp copy all the concrete tags as well as the abstract tag.
Renaming really raises no extra issues: just as in Unix Alice can rename her has-drunk file to drunk with a simple
mv has-drunk drunk
she should be able to rename her has-drunk abstract tag as drunk with the same command in Fish, and in the process rename all its concrete instances.
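The rule mv = cp + rm, applied to both abstract and concrete tags, might look like this in Python. Again this is a toy in-memory model with hypothetical function names, not Fish’s implementation; it ignores permissions and clobbering, and the tag values in the sample data are invented.

```python
def cp_tag(abstract_tags, objects, src, dest):
    """Create abstract tag dest and copy every src instance, with its value."""
    if src not in abstract_tags:
        raise KeyError(src)
    abstract_tags.add(dest)
    for tags in objects.values():
        if src in tags:
            tags[dest] = tags[src]


def rm_tag(abstract_tags, objects, tag):
    """Remove the abstract tag and all of its concrete instances."""
    abstract_tags.discard(tag)
    for tags in objects.values():
        tags.pop(tag, None)


def mv_tag(abstract_tags, objects, src, dest):
    """mv src dest = cp src dest; rm src"""
    cp_tag(abstract_tags, objects, src, dest)
    if dest != src:
        rm_tag(abstract_tags, objects, src)


# Invented values: None stands for a tag with no value.
abstract_tags = {"alice/has-drunk"}
objects = {
    "drink me": {"alice/has-drunk": True},
    "drink me (not poison)": {"alice/has-drunk": None},
}
mv_tag(abstract_tags, objects, "alice/has-drunk", "alice/drunk")
```

Because mv_tag is built from cp_tag, the move can only carry the concrete tags along if the copy does, which is the argument made above for having cp copy instances as well as the abstract tag.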
Copying and Moving and Renaming Directories
What about copying a directory in Unix? We use cp for that, but now we need to use -R to force the directory and all its contents to be copied recursively: without this -R, we can’t even copy an empty directory such as things:
$ cp things thangs
cp: things is a directory (not copied).
But with -R, we can copy a directory hierarchy as easily as a file. Let’s suppose Alice wants a duplicate of her private directory in her things directory. She can use any of
cp -R private things
cp -R private things/
cp -R private things/private
cp -R private things/private/
and the result will be a duplicate:
$ ls -RF things
private/
things/private:
moments/ things thoughts/
things/private/moments:
things/private/thoughts:
Move works essentially the same way and needs no example. Again, so far there seems to be no reason why we shouldn’t build analogous functionality in Fish for copying and moving namespaces and their contents. We would certainly allow the -R flag, but might not require it, and would certainly allow -r to be used as a synonym. As with copying simple files, and following our mv = cp + rm dictum, concrete tags in the hierarchy would be copied, together with their values, on all objects to which they are attached.
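A recursive namespace copy could be sketched as follows in Python. The function cp_namespace, the flat sets of paths and the reduced hierarchy in the example are all assumptions for illustration; the caller is assumed to have already resolved the destination to its full path, and copying a namespace into itself is not handled.

```python
def cp_namespace(abstract_tags, namespaces, objects, src, dest):
    """Copy namespace src, everything beneath it, and all tag instances, to dest."""
    prefix = src + "/"

    def relocate(path):
        # Swap the leading src for dest, keeping the rest of the path.
        return dest + path[len(src):]

    # Snapshot the matches first so the sets are not changed while iterated.
    for ns in [n for n in namespaces if n == src or n.startswith(prefix)]:
        namespaces.add(relocate(ns))
    copied = [t for t in abstract_tags if t.startswith(prefix)]
    for tag in copied:
        abstract_tags.add(relocate(tag))
    # Concrete tags: copy values on every object that carries them.
    for tags in objects.values():
        for tag in copied:
            if tag in tags:
                tags[relocate(tag)] = tags[tag]


# Reduced, assumed version of Alice's hierarchy.
namespaces = {"alice", "alice/private", "alice/private/moments", "alice/things"}
abstract_tags = {"alice/private/things", "alice/has-drunk"}
objects = {"drink me": {"alice/private/things": 3, "alice/has-drunk": None}}
cp_namespace(abstract_tags, namespaces, objects,
             "alice/private", "alice/things/private")
```

The source namespace and its tags are left in place; a move would follow the copy with a recursive removal of the source.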
Clobbering
Now consider the following commands on Unix, in the context of the same directory structure shown in the original figure:
cp has-drunk private/things
mv has-drunk private/things
The destination, private/things is a file that already exists: it will be clobbered (overwritten) by both cp and mv. The same would be true if Alice copied her private/moments/things file to her private directory with any of
cp private/moments/things private
cp private/moments/things private/
cp private/moments/things private/things
or their mv counterparts. So in Unix, the rule is
When the destination specified is a directory, move or copy the source into that directory. If there was already a file with that name in the directory, delete it first.
When the destination specified is a file, first remove that file if it exists, then copy or move the source to that destination.
Except that this isn’t quite true: you can’t clobber a file with a directory. So
$ cp -R things private/things
cp: private/things: Not a directory
cp: private/things: Not a directory
cp: private/things: Not a directory
cp: private/things: Not a directory
(the four failures being as each of the entities in private fails to be copied), and
$ mv things private/things
mv: rename things to private/things: Not a directory
Why can’t a hulking great directory clobber a puny file? I don’t know. Unix has many wonderful attributes, but consistency is not foremost among them. To my surprise, even adding -f cannot persuade the system to do it. Whether Fish should copy this apparently anomalous behaviour is not completely clear to me: logic suggests not, but fidelity to Unix conventions suggests maybe so. The point may be moot anyway, as there’s a good chance I will require a -f to clobber even a tag, just as I do with rm, if it is in use. This is because, whereas on Unix clobbering a single file removes a single entity, however big, in Fluidinfo a single abstract tag could have a million instances or more, and I feel that requiring a -f flag to encourage the user to confirm her intent before engaging in such (potentially) widespread destruction is not unreasonable.
These minor exceptions notwithstanding, the way files get clobbered suggests that we might extend our recipe to include the rule:
If dest is a file:
mv src dest = rm -f dest; cp -R src dest; rm -r src
Does that make sense for tags in Fish?
This, I think, is an interesting question. We could certainly make Fish remove the abstract destination tag and all its concrete tags before moving or copying another tag. But it also seems reasonable to consider the possibility of replacing only those concrete destination tags on objects that also carry the source tag, and leaving the rest untouched.
To make this real: suppose Alice says (in Fish)
$ cp drunk has-drunk
and at the time she does so, the state of her has-drunk and drunk tags is as follows:
$ fish show -q 'has alice/has-drunk' /about
2 objects matched
Object d440c5cf-9680-4748-b70e-56f07f35ca09:
/fluiddb/about = "drink me (not poison)"
Object ec430756-e110-4bc4-b882-544afda1cce8:
/fluiddb/about = "drink me"
$ fish show -q 'has alice/drunk' /about
2 objects matched
Object d440c5cf-9680-4748-b70e-56f07f35ca09:
/fluiddb/about = "drink me (not poison)"
Object 49126b6d-18bd-457f-af55-a251cf400fc9:
/fluiddb/about = "drink me not"
or, diagrammatically:
It seems clear that the value of the has-drunk tag on "drink me (not poison)" (no value) should be replaced with the value of the drunk tag (true), and that a new has-drunk tag should be placed on "drink me not", also with the value true. It is less clear, however, that the has-drunk tag on "drink me" needs to be deleted. We will be moving on to discuss selective copying and moving later anyway, but we have certainly formed a question:
Q2. If a tag is clobbered by a mv or cp command, should all of its instances be clobbered, or only those necessary to make way for the tag values from the source?
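The two answers to Q2 can be compared on the example above with a short Python sketch, taking drunk as the source and has-drunk as the destination, as the discussion describes. The function names are hypothetical, a tag with no value is modelled as None, and the value of has-drunk on "drink me" is not given above, so None is assumed for it too.

```python
def overwrite_only(objects, src, dest):
    """Copy src instances onto dest, leaving other dest instances alone."""
    for tags in objects.values():
        if src in tags:
            tags[dest] = tags[src]


def clobber_all(objects, src, dest):
    """Remove every dest instance first, then copy src instances onto dest."""
    for tags in objects.values():
        tags.pop(dest, None)
    overwrite_only(objects, src, dest)


def state():
    """The three objects from the example, keyed by about value."""
    return {
        "drink me (not poison)": {"alice/has-drunk": None, "alice/drunk": True},
        "drink me": {"alice/has-drunk": None},  # value assumed, not stated
        "drink me not": {"alice/drunk": True},
    }


def tagged(objects, tag):
    """Sorted about values of the objects carrying the given tag."""
    return sorted(about for about, tags in objects.items() if tag in tags)


full, selective = state(), state()
clobber_all(full, "alice/drunk", "alice/has-drunk")
overwrite_only(selective, "alice/drunk", "alice/has-drunk")
```

In this model the two strategies agree on "drink me (not poison)" and "drink me not", which both end up with has-drunk set to true; they differ only on "drink me", which loses its has-drunk tag under clobber_all and keeps it under overwrite_only.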