26 July 2011

Movement and Renaming In Fluidinfo and Fish

The Fluidinfo Shell, Fish, already includes Unix-inspired commands for many tag-management tasks including listing tags and namespaces (ls), removing them (rm), creating them (touch and mkns/mkdir) as well an analogue of the Unix chmod permissions-changing command (perms). What about copying and renaming?

The Unix command for renaming files is mv (short for move). I imagine this is briefly confusing for newcomers to Unix, but the command is actually well named when all its functions are considered. In Unix, to rename a file from foo to bar one says:

mv foo bar

The same form can be used to rename a directory, say from private to secret:

mv private secret

The true appropriateness of the name mv becomes apparent only when the command is used for it third form, in which the destination is an existing directory. If etc is such an existing directory then

mv foo etc

will move foo into the directory etc; or, if you prefer, it renames foo as etc/foo. The same thing happens with a directory, i.e.

mv private etc

will move the directory private into etc as etc/private. We can also move several files at the same time:

mv foo private etc

will move both foo and private to etc., which is handy.

Should Fish do something similar?

Moving Tags and Values in Fish

There seems to be every reason to support a similar mv command in Fish. As with files, we may wish to rename a tag or namespace, or to move it within the tag hierarchy, and Unix provides a well-thought-through, powerful and convenient template for this. Tags would behave like files, and namespaces like directories. But there is a complication, as we shall see.

In Fluidinfo, a named tag, such as njr/rating is an abstract entity: it doesn’t really store any significant information itself. The tag may be attached to many objects, with different values or none on each. We can think of those tags-attached-to-objects as concrete tags or tag instances.

So if Fish implemented an mv command as above, then if authenticated as user njr (and using Unix-style paths in Fish), the command

mv rating star-rating

would change the name of the tag njr/rating to njr/star-rating not just in one place, but everywhere it occurs in the system. Similarly, if njr/private is a namespace,

mv private secret

would rename not only the tags in njr/private, but on every object to which those tags are attached.

This is not so much a problem as an opportunity. There is definitely a need for an easy way to rename and move whole named abstract tags and namespaces, but it would also be useful to be able to move and rename individual concrete tags. For example, maybe I want to rename my njr/rating tag on Citizen Kane from njr/rating to njr/star-rating (preserving its value). (After all, film ratings are all about stars, right?)

The obvious thing to do there would be to add a -a to specify the about tag for the object:

mv -a 'film:citizen kane (1941)' rating star-rating

It could also accept -i, to specify by ID, or -q to choose a set of objects for which to apply the renaming with a Fluidinfo query.

So far, so good. But what if it is a different kind of movement that I desire? What if I want to move the tag to a different object? Perhaps I’ve used the wrong object, or wish to swap to a different convention for about tags. As an example, let’s suppose I missed out the year required in the film-u convention when constructing the about tag for the film Citizen Kane. How would I move the tag from one object to another?

Perhaps the most obvious approach would be to repeat the -a flag to specify two different about tags. The library Fish uses for processing command-line options does allow them to be repeated, so the command could be

mv -a 'film:citizen kane' -a 'film:citizen kane (1941)' rating

Notice that there tag appears only once, because here, it’s not being renamed. However, I really don’t like repeated tags; they are quite rare in Unix and feel sloppy to me. I’d be more tempted to use -b for the second one (from a to b; geddit?). So then it would be:

mv -a 'film:citizen kane' -b 'film:citizen kane (1941)' rating

You could also perform a renaming at the same time:

mv -a 'film:citizen kane' -b 'film:citizen kane (1941)' rating star-rating

which you might prefer to express as

mv -a 'film:citizen kane'/rating -b 'film:citizen kane (1941)' star-rating

We could do something similar with -i (-j for the second, I suppose); it doesn’t really apply in the case of -q since it would be optimistic in the extreme to expect pairs of about tags from two different queries to align correctly.

What I’ve described so far seems fairly workable, and is my current best guess at what I’ll try to implement. A cp command would be fairly similar, and would probably require a -R flag for recursion.

But there is an alternative that seems worth discussing, even if only to reject it.

The endpoint approach

I generally think of tags as being attached to objects, sometimes with values, pretty much exactly as the diagrams generated by http://abouttag.appspot.com show them. Here’s an example for Citizen Kane.

citizen-kane-object.png

(You can generate this live, in a modern browser, by visiting here).

And I think of the (abstract) tags and namespaces as living in a tree, exactly like the directory structure in a hierarchical file system. Something like this, where the yellow folder icons represent namespaces and the tags are red.

FluidinfoTagHierarchy.png

The analogy can even be extended further by thinking of the value on a tag as being like a file’s contents; this seems particularly apt in cases where the tag values really do correspond to file contents, as is the case with Fish’s documentation. Using a pair of tricks stolen from Nicholas Tollervery (@ntoll), the documentation is actually stored in Fluidinfo, with the content of the index.html file stored in a tag called index.html under the fish namespace. So the full path for the tag that stores the front page of Fish’s documentation is fish/index.html and this (concrete) tag is stored on the object with the about tag fish. The URL for the documentation is then http://fluiddb.fluidinfo.com/about/fish/fish/index.html. The second “trick” is to set the MIME type of the value of fish/index.html on the fish object to text/html, which causes Fluidinfo to serve the content with that MIME type, so that as far as the internet is concerned, http://fluiddb.fluidinfo.com/about/fish/fish/index.html is regular HTML web content. The same applies to all the other pages. It will come as no surprise to learn that the fish documentation is published programmatically, and that I have vague plans to add a publish command to Fish to allow a directory tree of files to be uploaded to Fluidinfo as a set of tags on namespaces on a nominated object.

It will not have escaped your attention in this digression, that a crucial difference between a file in a filesystem and a tag in Fluidinfo is that the tag may have arbitrarily many different values, each attached to a different object. It’s like having your file system partially replicated on every computer in the world, with different content on the subset of files stored on each.

This suggests a different way of thinking about the relationship between the tag hierarchy and objects, which is really the way that Fluidinfo’s end-points use them. In this view, each object has a partial copy of the (abstract) tag hierarchy instantiated on it, with values that are specific (and hopefully appropriate) to that object.

This is neatly illustrated by looking at a recursive listing of the fish user’s top-level namespace:

$ fish ls -R

fish:
_static/            h                   p                   tags.html
a                   help.html           perms.html          test.html
about.html          i                   pwd.html            touch.html
abouttag.html       index.html          pwn.html            u
amazon.html         install.html        q                   unixlike.html
b                   j                   r                   untag.html
c                   k                   rm.html             v
cli.html            l                   s                   version.html
commands.html       ls.html             search.html         w
count.html          m                   searchindex.js      whoami.html
d                   mkdir.html          shell-fish.html     x
e                   mkns.html           show.html           y
f                   n                   su.html             z
g                   normalize.html      t
genindex.html       o                   tag.html

fish/_static:
basic.css           default.css         fish-doc-logo.png   pygments.css
def.css             doctools.js         jquery.js           searchtools.js

All the tags that look like filenames (the ones containing a dot) are tags that exist in exactly one place—on the object with the about tag fish. (They could occur in multiple places, however. For example, I could publish the full set of documentation to version-numbered objects fish v3.12, fish v3.11 etc., rather than constantly replacing the old with the new, as I do now.)

The other tags (those with single-letter names) are there to allow anyone playing with the online version of Fish (shell-fish) to tag things with a set of different tags, and might exist on any set of objects in Fluidinfo.

In this view, the full path for a concrete tag can be expressed either in terms of its about tag (if it has one) or its ID. In the case of fish/index.html, the only concrete instantiation taht exists today lives under the /about endpoint, at

/about/fish/fish/index.html

or, for those of you who prefer UUIDs, under the /objects endpoint, at

/objects/0e7b94c7-1b37-41c2-bd7f-1a376786aa4e/fish/index.html

This suggests an alternative possible syntax for moving concrete tags. We could recast

mv -a 'film:citizen kane' /rating -b 'film:citizen kane (1941)' star-rating

as

mv '/about/film:citizen kane/njr/rating' '/about/film:citizen kane (1941)/star-rating'

This approach is not without its attractions. It is rather elegant and unambiguous, and it completely eliminates the need for flags in this case, which can hardly fail to be a benefit. On balance, however, I think it’s probably not the way to go. It feels somewhat inconsistent with thre rest of Fish, it forces me to put the username in (which I don’t really want to do) and—to me—it feels slightly inverted (if logical). There’s also a potential problem that there’s technically ambiguity with respect to a potential user called about, but that probably isn’t such a worry.

But the more important reason is that I dont want to have to repeat the object specification if I do something like

mv '/about/film:citizen kane (1941)/njr/rating' '/about/film:citizen kane (1941)/star-rating'

In that case, the -a tag feels much better, and that seems like a strong reason for with with either -a and -b or repeated -a flags when shifting between different objects.

So at the moment I’m leaning strongly towards the first suggestions introduced above, using -a and -b as required. As ever, however, I’m interested in opinions. I expect it will be a while before I get around to implementing this anyway.

No comments:

Post a Comment

Labels