07 September 2009

The Permissions Sketch

Hey, Terry, Can I borrow your tags?

njr: Hi Terry

terrycojones: Hi Nick

njr: Seeing as we’re friends, could I see your ratings, please?

terrycojones: Sure. I’ve just set an exception on see for terrycojones/rating so njr can see it.

njr: Great thanks. Except, it’s odd. I can see that you have a rating on The Hitchhiker’s Guide to the Galaxy, but I can’t see its value.

terrycojones: Oh, you mean you want read permission. No problem. There I’ve set an exception for njr to be able to read terrycojones/rating too.

njr: Excellent. Yes, I can see it now. Just an 8, eh?

terrycojones: Yeah, well . . .

[A bit later]

njr: Terry, you know how we’re really good friends.

terrycojones: Sure.

njr: Well, I was wondering if it wouldn’t be useful if I could actually set ratings for you. For instance, when we were talking the other day, you were saying how maybe you didn’t even think THHGTTG deserved an 8, and you planned to take it down to a 6; only there you were working on three tickets and the net was down where you were, and basically you couldn’t be arsed?

terrycojones: Yeah, that makes sense. OK, I’ve given you update permission on terrycojones/rating now, so you can change it.

njr: Wow, you’re really fast at that. Are you using some cool FluidDB client?

terrycojones: No, I just use curl and send the raw HTTP. The API’s so RESTful . . .

njr: [Rolls eyes]. OK, let me try that. No, not curl: changing your rating. Yeah, that worked. Cool. And I know you also wanted to give Pärt’s Tabula Rasa a 10, so I’ll just do that for you.

terrycojones: Great. Hang on, I’ll just give you permission.

njr: Eh? You’re going senile, mate: you just did. I changed your rating on Hitchhiker’s Guide to the Galaxy, remember?

terrycojones: Yes. But you only needed update permission for that. Now you want to tag a new object with terrycojones/rating. You need create permission to do that. But don’t worry, I’ve given you that too.

njr: Oh wow. These permissions are pretty fine-grained, aren’t they? Yeah, I’ve done it. Except (fool!) I tagged the wrong object. I forgot the umlaut on Pärt in the about tag. I know you’re fussy about your accents, being Australian. (Mind you, Pärt might be fussy about it too.) So there, I’ve put it on the right one now. Except, that’s weird, I can’t seem to untag the first one. Surely if I have update and create permission, that must allow me to remove a tag, right?

terrycojones: Of course not! Deletion is completely different! But no problem. I’ve given you delete permission on terrycojones/rating now.

njr: Right. So now I can do anything with terrycojones/rating, right? I can see it, read it, tag things with it (create), change tag values (update) and even untag things (delete). Truly, I have power over your ratings.

terrycojones: Yup. You could even make me the world’s biggest Mariah Carey fan. But of course, I’d have to kill you if you did that. And you’re the only one apart from me who can set my ratings, so I’ll know.

njr: Consider it done.

But there’s more . . .

terrycojones: Of course, there are still things you can’t do with my ratings.

njr: There are?

terrycojones: Sure. You can’t do anything to the tag itself.

njr: You mean apart from see it, read it, apply it, change it and delete it?

terrycojones: I said the tag itself.

njr: The tag . . . itself?

terrycojones: Yup.

njr: That sounds a bit abstract to me, Terry. I’m just a simple physicist. What are you talking about?

terrycojones: See, you can set my ratings, and change my ratings, but you can’t change terrycojones/rating itself. You can’t change what it means. And you can’t delete it.

njr: What it means?

terrycojones: Yes. If you look at the properties of the tag, you’ll find that the description of it is "terrycojones's ratings (Spinal Tap) scale".

njr: Spinal Tap scale?

terrycojones: Sure. Zero to eleven.

njr: Of course. Well, eleven’s nice, if a bit odd; and prime. What do you rate eleven?

terrycojones: Oh not much. The FluidDB permissions system. And Esteve.

njr: Esteve? Better not tell him. He’ll want a raise.

terrycojones: It’s OK; he’s on the exception list. He can’t read my ratings.

njr: But he writes the code!

terrycojones: Yeah, but you should see his principles. He’s incorruptible.

njr: Alright, alright.

terrycojones: Anyway, the point is, you can’t change that.

njr: The incorruptible genius of Esteve?

terrycojones: Well, that too. But I meant the meaning of my ratings.

njr: Even though I have every conceivable write permission on the tag?

terrycojones: Yeah, but not on the tag itself.

njr: (If you say itself in that meaningful tone one more time . . .)

terrycojones: Yeah, well, anyway, there’s a separate permission for updating the tag itself.

njr: [Expletives deleted.] Of course there is. And what’s that called?

terrycojones: update.

njr: No, see, you already gave me update permission, wise guy.

terrycojones: On the tag. Not the tag itself.

njr: Oh, update on the tag itself. I see. And what about delete? You said I couldn’t delete the tag. But I’ve already removed that Part tag from the object without the umlaut.

terrycojones: Ah yes; but you haven’t deleted the tag itself.

njr: [Further colourful expletives deleted] Right. So I can delete every terrycojones/rating you’ve ever put on anything, and indeed, any terrycojones/rating anyone else has ever put on anything for you. But I can’t delete the essence of terrycojones/rating, the meta-data about terrycojones/rating, the terrycojones/rating itself.

terrycojones: That’s right. (And it’s not meta-data; it’s data. All data is equal in FluidDB.)

njr: Whatever. So is that it? Is that really it?

terrycojones: Yup.

njr: So let me see if I have this straight. There are permissions for seeing, creating, reading, updating and deleting tags. And then there are some special administrative permissions for updating and deleting the tag itself.

terrycojones: That’s right.

njr: And that’s absolutely it? If you gave me those, I really would own you. I’d have complete control of terrycojones/rating.

terrycojones: Ah, well, it’s funny you should say that.

njr: There’s more isn’t there?

terrycojones: Well, there’s control.

njr: There’s control.

terrycojones: Yes, there’s control.

njr: Meaning . . .?

terrycojones: Well, who do you think’s been giving you all these permissions to tell the world about my secret infatuation with Mariah Carey?

njr: Ah, yes, of course. There’s control of tags. Kind of like ownership of files. But surely, you own all the terrycojones tags, don’t you?

terrycojones: Sure I do. But I could give them to you if I wanted to. Or I could even let us share them, so we both controlled them.

njr: And that’s called control?

terrycojones: Right.

njr: And that works the same way? With an open/closed policy and an exception list?

terrycojones: Sure does.

njr: Wow. So you could set it to be closed and not have any exceptions?

terrycojones: Yup.

njr: And then no one would be able to change it?

terrycojones: Not even God.

njr: Really? Is God subject to the FluidDB permissions system?

terrycojones: Well, not yet. She doesn’t have an account. But if she ever gets on, it’s the same rules for her as you, and me, and FluidDB. No exceptions.

njr: Wow, so if you closed off all the permissions on your ratings and then took away control, your ratings would be like digital tattoos. No one could ever change them.

terrycojones: That’s right. I’m going to rate my mum 10 and then do that.

njr: Aaahhh . . .

Some months later

terrycojones: Hi Nick

njr: Hi Terry

terrycojones: You know how we’re great friends and all that shit?

njr: Sure

terrycojones: And how I trusted you with my ratings, and you mostly didn’t abuse it, except for that whole rating-Mariah-Carey-11 business.

njr: You love her really.

terrycojones: Whatever. The thing is, I thought it would be good if you gave me permission on your njr/guardian-1000 namespace, so I can add some stuff and fix all the unicode you screwed up.

njr: Ah, unicode, yes. OK, what do I need to do?

terrycojones: Well, you could just give me control of it; then I could do anything.

njr: Control of njr/guardian-1000? So I guess control on a namespace is like control on a tag?

terrycojones: On a tag itself, right.

njr: Well, you know, Terry, I trust you and everything, but . . .

terrycojones: Yeah, OK. I don’t really need control. But create permission would be useful.

njr: OK, done. Is that it?

terrycojones: Well, actually, update would be useful too. Since you screwed up its description.

njr: So update permission on a namespace is like the administrative update permission on a tag?

terrycojones: On the tag itself. Right. It lets you change the data about the namespace. Like the description.

njr: OK, I’ve done that too. Anything else?

terrycojones: Well you have an open policy on list, so I can see what’s there. But there seems to be some junk. I mean you have a sub-namespace njr/guardian-1000/best-FluidDB-UUIDs. I don’t think the Guardian has published its list of all-time best FluidDB-UUIDs yet. And even if it did, I think you’d want a tag not a namespace for that. So I think we could lose it. You know how I love deleting things.

njr: So delete it!

terrycojones: I will. But you need to give me delete permission on the namespace for that.

njr: Right, so that would be another administrative permission on the namespace. Like delete on the abstract tag itself.

terrycojones: You’re finally getting this.

njr: Slowly, slowly, the egg walks, (as they say in Addis Ababa).

terrycojones: You ever been to Addis Ababa?

njr: No. But I know a woman who rates it an 11.

terrycojones: OK. It’s in the njr/guardian-1000/cities now too.

njr: It is?

terrycojones: It is!

njr: OK, Ciao, Terry. I’ll let you get back to your Mariah Carey records.

terrycojones: [Expletives I’ve never even heard before deleted.]
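For anyone who prefers code to dialogue, the open/closed policy and exception list that Terry and Nick keep coming back to can be sketched in a few lines of Python. This is only a toy model of the rule as the sketch describes it, not FluidDB's implementation: the Permission class and the three example states are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Permission:
    """One action (say 'read') on one tag: a policy plus an exception list."""
    policy: str = "closed"                       # "open" or "closed"
    exceptions: set = field(default_factory=set)

    def allows(self, user: str) -> bool:
        # Open: everyone except the exceptions. Closed: only the exceptions.
        if self.policy == "open":
            return user not in self.exceptions
        return user in self.exceptions


# Hypothetical states, loosely following the conversation above.
update = Permission("closed", {"terrycojones", "njr"})  # njr was let in
read = Permission("open", {"esteve"})                   # Esteve was shut out
tattoo = Permission("closed", set())                    # closed, no exceptions

print(update.allows("njr"))           # closed policy, njr is an exception
print(read.allows("esteve"))          # open policy, esteve is an exception
print(tattoo.allows("terrycojones"))  # nobody gets through, owner included
```

The same shape would apply to each action separately (see, read, create, update, delete, control), which is why Terry has to keep going back to grant one more thing.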

02 September 2009

Permissions Worth Getting Excited About

Permissions. Doesn’t the very subject set your heart racing?

No?

Well, me neither; normally. But, rather to my surprise, my thesis today is that one of the most revolutionary and remarkable features of FluidDB is its permissions system, and that this, more than anything else, explains why it has a shot at blowing apart web applications and social data.

I accept it’s an unlikely claim. But let’s see if I can make a case.

WTF?

If you ever hear Terry talk about FluidDB, or read his stuff, the guy speaks in riddles. One minute he sounds like Richard Stallman, banging on about freedom and how when he rules the world no one will have to ask permission to write data, no one will own objects, everything will be shared and social and anarchic and messy and we’ll all be able to do exactly what we want. He even calls FluidDB the database with the heart of a wiki, whatever that’s supposed to mean. And the very next minute, he’s talking about how at the heart of FluidDB lies a really strong permissions system that allows us to control exactly what goes in FluidDB.

The guy’s clearly schizophrenic, if not paranoid delusional. Right?

In Defence of Terry

FluidDB is a remarkably simple conception. At its heart, it’s just a collection of initially empty containers to which we can attach tags. In best computer-science tradition, we call the containers objects and they really are 100% shared.

If you can find an object, you can tag it . . . and so can anyone else.
http://StochasticSolutions.com/fluiddb/image/bare-objects.png

As you can see, the objects all have rather long, individual identifiers—actually 128-bit numbers sometimes known as UUIDs [1]. Some objects also have a second identifier, known as the about tag. About tags are unique and never change, so there can only be one object in FluidDB whose about tag is set to the (exact) string oxygen, though there can be any number with no about tag.

http://StochasticSolutions.com/fluiddb/image/objects.png

So objects are not owned, but shared, and once created, are never destroyed [2].
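To make the about-tag rule concrete, here is a minimal sketch of it in Python. It is a toy in-memory model, not how FluidDB stores anything; the ObjectStore class and its create method are names invented for this illustration.

```python
import uuid


class ObjectStore:
    """Toy model: objects are bare IDs; an about tag maps to exactly one ID."""

    def __init__(self):
        self._by_about = {}   # about string -> object ID
        self._ids = set()     # every object ID ever created (never removed)

    def create(self, about=None):
        # Reuse the existing object if this exact about string was seen before.
        if about is not None and about in self._by_about:
            return self._by_about[about]
        object_id = str(uuid.uuid4())   # a random 128-bit identifier
        self._ids.add(object_id)
        if about is not None:
            self._by_about[about] = object_id
        return object_id


store = ObjectStore()
first = store.create(about="oxygen")
second = store.create(about="oxygen")   # same object, not a new one
anonymous_a = store.create()            # no about tag: always a fresh object
anonymous_b = store.create()
```

Asking for "oxygen" twice gives back the same ID, while objects with no about tag are always distinct. Note that the match is on the exact string, so "Oxygen" would be a different object.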

So one half of the claim is good: objects are shared, and anyone can tag them with whatever they like. Where do all the permissions and control come in?

Tags and Values

When we say that anyone can write to any objects in the system, that’s true. Anyone can stick any tag they like onto any object they can find [3]. The tag has a name, such as rating or colour or Ulysses, and may have a value, such as 7, or interesting, or a picture of some leaves. Like this:

http://StochasticSolutions.com/fluiddb/image/LuminousLeaves.png

The only restriction is that your tag names all start with your FluidDB username. For example, my FluidDB username is njr, so my tag names start with njr/. So I can have njr/rating, njr/colour, njr/Ulysses, or anything else I like. And you, Josephine, can have tags called josephine/rating, josephine/colour (or even josephine/color, if you’re that way inclined) and so on.

http://StochasticSolutions.com/fluiddb/image/hitchhiker-njr-josephine.png

Placing tags on the same object in FluidDB is, of course, a way of relating them, but that isn’t the focus of this article.

Permissions

So finally, we come to the bit.

By default, I can only create and set tags starting njr, but can read any tags, while Josephine, naturally enough, can only create and set tags starting josephine. Unlike objects, tags are owned.
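As a rough illustration of that default, before any exceptions are set, the rule can be written as a couple of small Python functions. The function names are invented for this sketch, and it only distinguishes the actions mentioned here: reading on one side, creating and setting on the other.

```python
def owner_of(tag_path: str) -> str:
    """Return the username a tag path starts with, e.g. 'njr/rating' -> 'njr'."""
    return tag_path.strip("/").split("/")[0]


def allowed_by_default(user: str, action: str, tag_path: str) -> bool:
    # Reading is open to everyone by default; every other action is owner-only.
    if action == "read":
        return True
    return owner_of(tag_path) == user


print(allowed_by_default("njr", "create", "njr/rating"))        # the owner
print(allowed_by_default("josephine", "create", "njr/rating"))  # not the owner
print(allowed_by_default("josephine", "read", "njr/rating"))    # reading is open
```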

But there is a permissions system that means that I, as the owner of tags starting njr, can control exactly what Josephine can do with my tags and vice versa.

So if I want to, I can hide my ratings from Josephine, but let her see my colours. Or I can hide them from everyone except Josephine and Terry.

Slightly more unusually, I can also choose to allow some, or even all other users to write some of my tags if I like. For example, I might decide that I trust Terry enough to allow him to set some of my tags.

This ability to control who can do what extends not only to users, but to applications as well. So suppose someone writes an application called Read-Planner that allows people to tag any book they run across on the web (or anywhere, in fact) with a to-read tag. If I want to use this application, I don’t have to give it my password or open up access to all the rights that I have: I can simply give Read-Planner the ability to create to-read tags for me. And if at some point I become unhappy with it, of course I can revoke that permission. I don’t even need something like OAuth: I can just do it.

Even better, I can also choose to allow other applications whatever form of access to my to-read tags I want—maybe read only, or maybe some of them have write access too. It’s up to me.

And this is why we think FluidDB has the potential to be so revolutionary: it is truly social, while leaving each user firmly in control of his or her data.

New Rules for Web Applications

At the moment, any of us who use web applications tend to spend a lot of time and effort populating application databases to make them useful to us. But when we do so, we tend to lose control of our data. They go into a private database schema, and what access we have to that depends entirely on what the application allows us to do. Sometimes there are reasonable ways to get the data back out (some kind of an XML dump perhaps), sometimes not. But always the application is in control. And linking data across applications is, in general, somewhere between hard and impossible.

FluidDB can change all that by leaving the user in control of his or her data, granting the application only such permissions as necessary or desired, and ensuring that the user retains flexibility and control.

Now obviously, this only works if applications agree to work with data in this way, and equally obviously, a lot of them are going to be extremely reluctant to do so. After all, in the well-worn phrase, information is power. So it’s more than possible that far from embracing the openness and intrinsic cross-application, cross-user interoperability championed and facilitated by FluidDB, many applications will seek to hold onto the data. We fully expect this.

But we also expect that there will be some applications, perhaps new ones, perhaps small ones initially, that will embrace the idea. And over time, a movement can grow, perceptions can change, and maybe in ten years’ time the idea of an application owning and controlling your data will seem as antiquated as the notion that you shouldn’t be allowed to see your own medical records.

A Tiny Bit of Detail

I’ll close by talking in slightly more detail about the way the permissions system in FluidDB operates. Further articles will go into the gory details.

Briefly, you can set permissions for each group of tags sharing a name. [4] So, for example, if I use a rating tag, I can control exactly who can read, change and create this njr/rating tag. And I can control each of those aspects independently. [5]

Similarly, if I choose to use names that contain slashes, like maybe njr/book/rating and njr/book/own, I can control who can do what at the level of a stem like njr/book if I choose. So I could allow some book applications, or bookish friends, to manipulate my book tags, maybe even creating new ones for me, but not my other tags (especially not my njr/private ones!).

Conclusion

Unlikely as it sounds, one of the key innovations in FluidDB is its combining a completely shared set of information containers (the objects) with the ability for users to tightly control, at a granular level, precisely who can read, write and create the tags used to store information on those objects. This applies not only to users, but also to applications, giving a simple way for users to grant applications broad or narrow access to read or manipulate some or all of their data while ultimately retaining complete control of it.

In FluidDB, you never need to ask permission to write to an object, but you always need permission to use someone else’s tags.

It’s a powerful combination.


[1] So-called Universal Unique Identifiers.
[2] In fact, we sometimes like to say that objects for every possible about tag already exist, in rather the same way that Plato believed that all numbers and other mathematical objects existed in perpetuity in what we now call his platonic universe. It’s simply that we only bother to allocate storage for objects with any particular about tag when someone actually wants to use it.
[3] If it doesn’t have an about tag, an object might be very hard to find, essentially requiring you to guess a 128-bit number. But it doesn’t really matter, as we’ll see.
[4] We sometimes refer to the set of tags having, or potentially having, a particular name (like njr/rating) as an abstract tag. We can think of the (concrete) tags that we actually attach to FluidDB objects as concrete instantiations of a canonical, abstract, platonic, master tag of the same name.
[5] There are actually more than three aspects I can control, but this is the essence of it.

01 September 2009

FDB 1.21

I just pushed fdb.py 1.21 to GitHub (http://github.com/njr0/fdb.py).

The command line utilities and the API now work consistently with subnamespaces using the functionality introduced in 1.19.

So

fdb tag -a DADGAD /njr/rather/deep/space/rating=9

will now work, creating the namespaces rather, deep and space, and the tag rating, as required.
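To see which pieces such a command has to make sure exist, here is a small sketch that splits a tag path into the namespaces above it and the tag name at the end. It is plain string handling for illustration only; split_tag_path is an invented helper, not a function in fdb.py.

```python
def split_tag_path(path: str):
    """Split an absolute tag path into the namespaces it needs and the tag name."""
    parts = path.strip("/").split("/")
    *namespaces, tag = parts
    # Build each ancestor namespace in turn: /njr, /njr/rather, ...
    ancestors = ["/" + "/".join(namespaces[:i])
                 for i in range(1, len(namespaces) + 1)]
    return ancestors, tag


needed, tag = split_tag_path("/njr/rather/deep/space/rating")
for namespace in needed:
    print(namespace)
print("tag:", tag)
```

The first entry is the user's own top-level namespace, which already exists; anything after that is created only if it is missing.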

FDB 1.20: Namespace Functions Added to API

I’ve added some new functions to the python API to FluidDB in the fdb.py library. This has been pushed to GitHub (http://github.com/njr0/fdb.py).

At the moment, the functionality is somewhat embryonic, but useful. Basically there are new functions to:

  • Create a namespace, e.g.

    id = db.create_namespace ('/njr/bas/bar/foo',  'three levels deep',
                              verbose=True)
    

    This is recursive, and will create bas under /njr and bar under bas if required.

  • Delete a namespace:

    status = db.delete_namespace ('bas/bar/foo')
    

    This is not recursive, though arguments for recurse and force have been added to the function signature. (They are casually ignored at present.)

  • Fetch the description of a namespace, e.g.

    print db.describe_namespace ('bas/bar/foo')
    

These follow FDB’s usual convention that if paths start with a ‘/’ they are taken to be absolute, and if not they are taken to be relative to the user’s top-level namespace. So for me (njr), /njr/foo/bar is the same as foo/bar.
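The convention is simple enough to state as code. The following is a sketch of the rule only; absolute_path is a name made up here, and fdb.py may well implement the same thing differently.

```python
def absolute_path(path: str, username: str) -> str:
    """Apply the convention: a leading '/' means absolute, otherwise relative."""
    if path.startswith("/"):
        return path
    # Relative paths hang off the user's top-level namespace.
    return "/%s/%s" % (username, path)


print(absolute_path("foo/bar", "njr"))
print(absolute_path("/njr/foo/bar", "njr"))
```

Both calls give the same result for njr, which is the equivalence described above.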

Points to note:

  • There are no unit tests for this functionality yet (lazy, lazy, bad njr!)
  • There is, however, a set of examples in nstest.py that illustrate and, to some extent, test the functionality (I know, I know, why not just make them tests. I will, I will.)
  • There are no new commands in the CLI to leverage this API functionality. (As you might imagine, I don’t intend that to be a permanent state of affairs.)
  • Not only are there no new commands, but the existing commands have not been extended to take advantage of the new functionality properly. So, for example, fdb tag still can’t use the new namespace created. Ridiculous? Yes.

The following code shows the API:

import fdb
import simplejson as json

db = fdb.FluidDB ()
user = db.credentials.username

db.create_namespace ('subs', 'Subnamespace of njr', verbose=True)
print
db.delete_namespace ('/%s/subs' % user, verbose=True)
print

db.create_namespace ('bas/bar/foo',  'three levels deep', verbose=True)
print

print db.describe_namespace ('/%s/bas/bar/foo' % user), '\n'
print db.describe_namespace ('/%s/bas/bar' % user), '\n'
print db.describe_namespace ('/%s/bas' % user), '\n\n'

db.delete_namespace ('/%s/bas/bar/foo' % user, verbose=True)
db.delete_namespace ('bas/bar', verbose=True)
db.delete_namespace ('/%s/bas' % user, verbose=True)
print

assert db.describe_namespace ('bas/bar/foo') == 404
assert db.describe_namespace ('bas/bar') == 404
assert db.describe_namespace ('bas') == 404

print 'All namespaces verified to have been deleted.'

It produces the following output (starting from a clean sheet, anyway).

Created namespace /njr/subs with ID 1671aa1f-9cdb-44a0-b55e-ed4ca78553cd

Removed namespace /njr/subs

Created namespace /njr/bas with ID 2ce1aa8e-d069-4d3b-960c-40685c47868f
Created namespace /njr/bas/bar with ID 4328b9ec-d96a-40e4-9a30-6906d360d9a8
Created namespace /njr/bas/bar/foo with ID d83240e7-2863-4cac-95d2-42d61b2050d4

         description: three levels deep
                  id: d83240e7-2863-4cac-95d2-42d61b2050d4
      namespaceNames: []
            tagNames: []

         description: None
                  id: 4328b9ec-d96a-40e4-9a30-6906d360d9a8
      namespaceNames: ['foo']
            tagNames: []

         description: None
                  id: 2ce1aa8e-d069-4d3b-960c-40685c47868f
      namespaceNames: ['bar']
            tagNames: []


Removed namespace /njr/bas/bar/foo
Removed namespace /njr/bas/bar
Removed namespace /njr/bas

All namespaces verified to have been deleted.

31 August 2009

Flags in FDB from version 1.18 onwards

I posted recently that FDB 1.18 has been released and that one of the changes in there was that my custom flags library for handling command line options (switches; flags; things like -a) has been replaced with the standard optparse library. (Thanks again, Yevgen (@xa4a_).)

When I wrote that post, I hadn’t fully appreciated that this had subtly changed the behaviour of command line arguments, and in doing so invalidated some of the examples I posted.

Most people will almost certainly prefer the new behaviour (or at least, will probably find it more familiar), but since it’s different and inconsistent with previously published examples, I thought I’d better document it.

The main changes are as follows:

  • Single-letter options may no longer be conflated, i.e.

    USE   fdb tag -v -a DADGAD rating=10
    NOT   fdb tag -av DADGAD rating=10

    to specify that you want to set a rating of 10 on the object whose about tag is DADGAD

  • -q, -a and -i now take the argument immediately following as the query, about-tag or ID, rather than using the first argument not tied to an option. So

    USE   fdb tag -v -a DADGAD rating=10
          fdb tag -sand -i edfaaf70-b9b5-42f1-ad4c-30601b70a2ac rating=10
          fdb tag -v -sand -q "has njr/rating" rating=10
    
    NOT   fdb tag -a -v DADGAD rating=10
          fdb tag -i -sand edfaaf70-b9b5-42f1-ad4c-30601b70a2ac rating=10
          fdb tag -v -q -sand "has njr/rating" rating=10
  • options may now appear anywhere in the command

    USE  fdb tag -sand -i edfaaf70-b9b5-42f1-ad4c-30601b70a2ac rating=10
    OR   fdb tag -i edfaaf70-b9b5-42f1-ad4c-30601b70a2ac rating=10 -sand
    OR   fdb -i edfaaf70-b9b5-42f1-ad4c-30601b70a2ac tag rating=10 -sand
  • there are now long versions with -- available to go with the short (single-letter) form for each action. (Personally, I hate --, but I lost that battle years ago.) Specifically, you can use

    --about ABOUT, -a ABOUT  to specify an about tag of ABOUT
    --id ID, -i ID           to specify an id of ID
    --query QUERY, -q QUERY  to specify a query of QUERY
    --timeout N, -T N        to specify an HTTP timeout of N
    --sandbox, -s            to specify use of the sandbox
    --verbose, -v            to increase verbosity
    --debug, -D              for debug mode
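
For the curious, the behaviour described above is just what optparse does out of the box. Here is a minimal sketch that declares the same options and shows the effect; it is not the parser from fdb.py, and the dest names, the float type for the timeout and the use of store_true for the switches are my assumptions for illustration.

```python
from optparse import OptionParser


def build_parser() -> OptionParser:
    parser = OptionParser(usage="fdb [options] command [arguments]")
    parser.add_option("-a", "--about", dest="about", help="about tag of the object")
    parser.add_option("-i", "--id", dest="id", help="ID of the object")
    parser.add_option("-q", "--query", dest="query", help="query selecting objects")
    parser.add_option("-T", "--timeout", dest="timeout", type="float",
                      help="HTTP timeout")
    parser.add_option("-s", "--sandbox", action="store_true", default=False)
    parser.add_option("-v", "--verbose", action="store_true", default=False)
    parser.add_option("-D", "--debug", action="store_true", default=False)
    return parser


parser = build_parser()

# The value of -a is whatever argument comes immediately after it.
good, good_args = parser.parse_args(["tag", "-v", "-a", "DADGAD", "rating=10"])

# Here -a swallows "-v" as its value, which is almost never what was meant.
bad, bad_args = parser.parse_args(["tag", "-a", "-v", "DADGAD", "rating=10"])

# Options may also follow the positional arguments.
late, late_args = parser.parse_args(["tag", "rating=10", "--about", "DADGAD", "-s"])

print(good.about, good_args)
print(bad.about, bad_args)
print(late.about, late.sandbox, late_args)
```

The second call is the trap: the about tag ends up as "-v", verbose is never switched on, and DADGAD is left over as an ordinary argument.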

29 August 2009

FDB 1.18

FDB version 1.18 is now available from GitHub at http://github.com/njr0/fdb.py/.

There’s quite a lot of cleanup, almost all done by Yevgen Varavva (@xa4a_ on Twitter). Changes since the last version are:

  • Using a decorator to avoid code repetition
  • Moving flag/option handling to use the standard optparse library instead of the custom flags lib
  • fixing a couple of minor bugs
  • making everything work against the sandbox as well as the main instance.

At the time of writing, the main instance is offline, so I haven’t actually been able to test against that, but I suspect it will be OK.

I haven’t actually removed the flags library from the repository yet, but probably will.

26 August 2009

The Guardian 1000 Novels Everyone Must Read

I wanted to create a few objects that would begin to show some other aspects of FluidDB.

Some may remember that in January The Guardian ran a series of articles entitled 1000 Novels Everyone Must Read. They published the list in themed sections, which made a lot of sense for the paper, but which made it surprisingly hard to see what the 1,000 were, so I spent a while building a digital version of the list which I blogged about and published here. (Ironically, they then did the same thing, but my work wasn’t entirely wasted, because at least I ended up with a nice, clean, structured, electronic version of their list.)

But . . .

Wouldn’t it be great if people could stick ratings on those 1,000 novels, or indicate which ones they’ve read, and also augment the list with novels they think should have been on there? And wouldn’t it be insanely great (in that strangely familiar phrase) if you could then find out things like

  • which novels your friends have read and rated highly that you haven’t read
  • which of your friends seem to like the same novels as you
  • who of your non-friends like the same novels as you
  • which novels have the highest ratings from users.

This is exactly the kind of thing FluidDB was built for.

So, I’ve made a start by creating FluidDB objects for some [1] of the novels.

The Hitchhiker’s Guide to FluidDB

In order to load information into FluidDB in a useful way, we need to decide how we want to represent it. FluidDB is fantastically relaxed about this, and it is not necessary for everyone to agree. But at the same time, for users to share data meaningfully, and even for a single user to be able to find information systematically, fixing a structure is really useful.

This is what I am using so far.

http://StochasticSolutions.com/fluiddb/image/hitchhiker.png

In principle, it seems to me that the ISBN number is the natural way to identify a book (despite some limitations), so I’ve chosen that as the about tag (even though they’re painful for me to look up).

In a perfect world, the tags in grey would ideally have some namespace with more authority than njr (ideally isbn.org:), and in time, that may come, quite possibly using the very same object (edfaaf70-b9b5-42f1-ad4c-30601b70a2ac). But for now, I’ve stuck the basic metadata in my namespace (njr), and that should be enough to get things rolling. These tags were put on programmatically using a few lines of python.

The social tag, of course, is the black one—njr/rating = 9. Unlike the others, which were generated automatically, I put that one on myself, by hand. There are lots of ways of doing this, but I used FDB, with the command

$ fdb tag -av isbn:978-0330258647 rating=9
Tagged object with about="isbn:978-0330258647" with rating = 9

So: 10 down, 990 to go. I will add the rest as soon as I can find a reasonable way of getting their ISBN numbers.

Tag Conventions

While it is fundamental to FluidDB that each user and application can make their own choices, I believe good conventions are useful and will nurture the growth of FluidDB. (Terry refers to my desire for conventions as my fascist librarian tendencies, but I’m sure he means it kindly.) I will maintain my personal suggested conventions on this blog on a post I’ll make soon. The first ones will be these:

The About Tag

  • Books: I suggest a primary convention that matches about="isbn:978-0330258647". [2] That is, the four lower-case letters ‘isbn’, followed by a colon, followed by the 13-digit ISBN number.
  • Books without an ISBN number: I am going to use about="book:Name of Book/Author" for now, but that’s not very precise. (Maybe eventually FluidDB IDs will replace ISBN numbers.)
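
As a sketch of how these two conventions might be applied consistently, here is a small helper that builds the about string. The function book_about is invented for this post, and the hyphen placement simply copies the example above (three-digit prefix, hyphen, remaining ten digits); it is an assumption rather than a rule anyone has agreed.

```python
import re


def book_about(isbn=None, title=None, author=None) -> str:
    """Build an about string for a book using the conventions above."""
    if isbn is not None:
        digits = re.sub(r"\D", "", isbn)   # keep digits only
        if len(digits) != 13:
            raise ValueError("expected a 13-digit ISBN, got %r" % isbn)
        # Same layout as the example: 3-digit prefix, hyphen, other 10 digits.
        return "isbn:%s-%s" % (digits[:3], digits[3:])
    if title and author:
        return "book:%s/%s" % (title, author)
    raise ValueError("need either an ISBN or both a title and an author")


print(book_about("978-0-330-25864-7"))
print(book_about(title="Name of Book", author="Author"))
```

Normalising the ISBN first matters because about tags are matched as exact strings, so two spellings of the same ISBN would otherwise land on two different objects.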

Other Tags

  • Ratings: username/rating. Mostly because Terry Jones (the father of FluidDB) has effectively promulgated this with his every description of FluidDB, I suggest ratings on a scale of 0 (worst) to 10 (best). Feel free to use decimals if 11 ratings aren’t enough. Obviously, anyone can go higher (or lower) if they want, but when I’m calculating means and things, I’ll ignore out-of-bounds values.
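
Ignoring out-of-bounds values when averaging is a one-liner in practice. A minimal sketch, with mean_rating as a hypothetical helper and the 0 to 10 bounds taken from the convention above:

```python
def mean_rating(values, lowest=0, highest=10):
    """Mean of the ratings within [lowest, highest]; None if there are none."""
    in_bounds = [v for v in values if lowest <= v <= highest]
    if not in_bounds:
        return None
    return sum(in_bounds) / len(in_bounds)


print(mean_rating([9, 8, 11, 10]))   # the 11 is ignored
print(mean_rating([7.5, 8.5]))       # decimals are fine
```

Returning None when nothing is in bounds avoids a division by zero and leaves it to the caller to decide what an unrated book should look like.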

The Guardian 10 Novels Everyone Must Read

As noted before, I’ve currently only added objects for the first ten. This is so tiny, I thought I might as well show the FDB output illustrating what’s there. If anyone does have access, do tag these with your rating, or a has_read tag or whatever (as if you need my permission!).

At the time of writing, The Hitchhiker’s Guide to the Galaxy has three ratings; but I bet that increases really fast.

fdb show -q 'has njr/guardian-1000' /about title publication-year author-forename author-surname guardian-1000
10 objects matched
Object edfaaf70-b9b5-42f1-ad4c-30601b70a2ac:
  /fluiddb/about = "isbn:978-0330258647"
  /njr/title = "The Hitchhiker's Guide to the Galaxy"
  /njr/publication-year = 1979
  /njr/author-forename = "Douglas"
  /njr/author-surname = "Adams"
  /njr/guardian-1000 = 1

Object 9d49f0b5-fb0b-49c8-b4fc-d82eb35d90e1:
  /fluiddb/about = "isbn:978-1569470039"
  /njr/title = "Silver Stallion"
  /njr/publication-year = 1990
  /njr/author-forename = "Junghyo"
  /njr/author-surname = "Ahn"
  /njr/guardian-1000 = 1

Object 1be19311-1f81-4f70-ba7a-4075d9b06b4d:
  /fluiddb/about = "isbn:978-9774246036"
  /njr/title = "al-Bab al-Maftouh"
  /njr/publication-year = 1960
  /njr/author-forename = "Latifa"
  /njr/author-surname = "al-Zayyat"
  /njr/guardian-1000 = 1

Object f0130f4f-2bc2-4ea4-8e52-d6d4922969a2:
  /fluiddb/about = "isbn:978-0701206048"
  /njr/title = "Death of a Hero"
  /njr/publication-year = 1929
  /njr/author-forename = "Richard"
  /njr/author-surname = "Aldington"
  /njr/guardian-1000 = 1

Object e65b1540-1fe4-46b5-b14d-da32ae592dfe:
  /fluiddb/about = "isbn:978-0141188539"
  /njr/title = "The Face of Another"
  /njr/publication-year = 1964
  /njr/author-forename = "Kobo"
  /njr/author-surname = "Abe"
  /njr/guardian-1000 = 1

Object 9cdfb63a-38e2-4c27-a739-5107ec24d151:
  /fluiddb/about = "isbn:978-1857989984"
  /njr/title = "Non-Stop"
  /njr/publication-year = 1958
  /njr/author-forename = "Brian W"
  /njr/author-surname = "Aldiss"
  /njr/guardian-1000 = 1

Object 14371df6-73d9-43e0-96e5-2008e852d67f:
  /fluiddb/about = "isbn:978-0140621198"
  /njr/title = "Little Women"
  /njr/publication-year = 1868
  /njr/author-forename = "Louisa May"
  /njr/author-surname = "Alcott"
  /njr/guardian-1000 = 1

Object 97401884-3590-4c22-b9dc-0cd2744270eb:
  /fluiddb/about = "isbn:978-1841955612"
  /njr/title = "The Man with the Golden Arm"
  /njr/publication-year = 1949
  /njr/author-forename = "Nelson"
  /njr/author-surname = "Algren"
  /njr/guardian-1000 = 1

Object 4a96c86b-1117-4c93-a815-1494c13fc2cf:
  /fluiddb/about = "isbn:978-0141186900"
  /njr/title = "Anthills of the Savannah"
  /njr/publication-year = 1987
  /njr/author-forename = "Chinua"
  /njr/author-surname = "Achebe"
  /njr/guardian-1000 = 1

Object f95a950e-1b1a-425d-b4ac-b8fc82c99cb3:
  /fluiddb/about = "isbn:978-0141023380"
  /njr/title = "Things Fall Apart"
  /njr/publication-year = 1958
  /njr/author-forename = "Chinua"
  /njr/author-surname = "Achebe"
  /njr/guardian-1000 = 1
[1]OK, only ten so far, but there’s a reason. Doing them all would be really easy if I knew a good way of looking up an ISBN number programmatically. If anyone knows one, please let me know (by leaving a comment, or any other way). Otherwise I’ll have to resort to screen-scraping, which is completely doable, but dull and requires manual intervention when it goes wrong.
[2]When I say the about tag and write about="...", this is shorthand for fluiddb/about. This tag is special in that it is unique (only one object can ever exist with a given value of the about tag), and immutable. This is a basic property of FluidDB, and means that when a rating is attached to the object with about="isbn:978-0330258647", we can be confident that the object will forever more be about that ISBN number.

fdb 1.15: Important Bug Fix

I pushed a new version of FDB (version 1.15) to GitHub (http://github.com/njr0/fdb.py).

Among other minor things, this fixed a serious problem whereby tags created by FDB have the wrong type. (Sorry about that.) So if you have used FDB (the library or the command line utilities) you may want to reset any tags.

The new version is better, but can’t set tags with no value because of a bug currently present in both the main and sandbox instances of FluidDB. That means that one test currently fails.

I’ve added a KNOWN-PROBLEMS file, which documents these issues.

25 August 2009

Delicious and FluidDB (redux)

Preamble

This article discusses the relationship between the social bookmarking site, del.icio.us, and the new online database FluidDB. It also talks about a utility that can be used to transfer bookmarks from del.icio.us to FluidDB, and a little about what else you can do with FluidDB.

This article is an HTML version of the PDF previously posted here.

Some examples in this article use the command line version of fdb.py. This is a client library for FluidDB, available from GitHub at http://github.com/njr0/fdb.py

The delicious importer described here is also distributed with fdb.py.

Delicious and FluidDB

A simple mental model of a bookmark in del.icio.us (http://del.icio.us) [1] looks something like this:

google-delicious-simple.png

This represents the site www.google.com, and I have tagged it with my search, google and home tags.

Perhaps the simplest way to map this to FluidDB is as follows. This is what delicious2fluiddb.py currently does.

google-fluiddb-simple-short-about.png

Everything looks very similar except that the address (ID) of the object we store the information on is more prominent and the subject of the bookmark (the URL) is placed on the special about tag. This address (id) is real (it exists in FluidDB and is permanent).

$ fdb show -a "http://www.google.com/" /id
Object with about="http://www.google.com/":
  /id = "6c1d7519-35c2-42c0-8b8c-08ffbbc5990d"

Unlike the other tags, the about tag is not owned by njr, but is system-wide. So we often abbreviate it to just about.

google-fluiddb-simple-short-about.png

The about tag is special in that (a) its value can never be changed and (b) it is unique. [2] This one is permanently associated with the object having the ID 6c1d7519-35c2-42c0-8b8c-08ffbbc5990d. So by attaching these tags to this object, we can be entirely confident that they will always be associated with the URL http://www.google.com/.

Del.icio.us actually allows us to store a little more information in a bookmark—a title, some notes and an update date. We can also think of the URL as a tag itself, but as in FluidDB, it’s owned by the system, not by njr. So a more complete picture of a del.icio.us bookmark might be:

google-delicious-bigger.png

Note that in del.icio.us, the tags are just names, but the other four items of information on the bookmark have values.

Replicating this in FluidDB is trivial, because all tags can take values, which can have various types—strings, numbers, dates and much more.

google-fluiddb-simple-bigger.png

Although delicious2fluiddb.py doesn’t do this yet, it would be trivial for it to replicate the delicious title, notes and update fields, to produce the structure shown above; soon, it will.

There are many other ways we could represent information from del.icio.us in FluidDB. Since the value of a tag in FluidDB can be a set, another obvious way to represent the tags would be as a set of strings on a tag called tags (or perhaps delicious-tags).

google-fluiddb-simple-tags-as-values.png

Of course, once in FluidDB, we can also add as many other tags as we like to the object, and these tags may or may not have values.

Privacy, Sharing, and Permissions

But before we get onto that, there’s one more crucial bit of information that del.icio.us stores about a bookmark—whether it is shared (meaning anyone can see that njr has the bookmark, and what tags I’ve put in it), or private (meaning that only I can see it).

google-delicious-shared.png

Right now, delicious2fluiddb.py only uploads the shared bookmarks to FluidDB. But how would it make information shared or private?

As well as allowing the arbitrary tagging of objects, and providing a query language, FluidDB also possesses a rich permissions structure. In fact, there are fine-grained permissions associated with every different tag name in the system (njr/google, njr/title, terrycojones/rating etc.). [3]

abstract-attribute-tag.png

To illustrate this, we can use fdb to query FluidDB for the object that corresponds to the tag name njr/title.

$ fdb show -a "Object for the attribute njr/title" /id
Object with about="Object for the attribute njr/title":
  /id = "a6e022bf-14e9-4033-aa91-39789dc11b23"

This command asks fdb to show the /id (which is fdb shorthand for the object’s ID), for the object whose about tag is the string “Object for the attribute njr/title”. The -a means that we wish to specify the object by its about tag rather than, for example, by ID or by a more general FluidDB query. If you have fdb and a username and password, you should be able to run this query and get exactly this result.

While this isn’t the place to go into all the detail, the key thing is that while all objects in FluidDB are shared, all data in FluidDB is subject to a strong, fine-grained permissions system.

A user can control various kinds of read and write access for each of their tags, with separate permissions for tags having each name (njr/title, njr/google etc.)

Beyond Simple (Valueless) Tags

Of course, the whole point of FluidDB is not to imitate del.icio.us, but to enable more powerful things. Some of the power comes when different people tag the same object, which doesn’t have to be about a URL.

object-about-fugitive-pieces.png

This object is guaranteed to be about isbn:978-0747534969, since the about tag can never change. So if we query FluidDB looking for books that both njr and terrycojones have rated 10, we will find this and can retrieve any information on the object, subject to the permissions on the tags.

fugitive-pieces-with-ratings.png

The FluidDB query terrycojones/rating=10 and njr/rating=10 would allow this object to be retrieved, and its title could be found with the FDB command:

$ fdb show -q "terrycojones/rating=10 and njr/rating=10" /njr/title

(For more details of the FluidDB query language, see http://doc.fluidinfo.com/fluidDB/queries.html.)

Notes

[1]del.icio.us, now often delicious.com, is the original social bookmarking site. It is now owned by Yahoo.
[2]There can, however, be many objects with no about tag; and there are.
[3]Each tag name used in FluidDB actually has a corresponding object in the system; we sometimes call this the “abstract tag”, to distinguish it from real tags attached to objects in the system. The permissions system operates on these abstract tags, and permissions set on an abstract tag apply to all the tags used with that name.

24 August 2009

fdb.py 1.14

I've pushed various changes to FDB to the repository here at GitHub.
Nothing huge, but these changes:

# 2008/08/23 v1.01      Added delicious2fluiddb.pdf
#
# 2008/08/24 v1.11      Added missing README to repository
#
# 2008/08/24 v1.12      Fixed four tests so that they work even
#                       for users *not* called njr...
#
# 2008/08/24 v1.13      Fixed bug that prevented tagging with real values.
#                       Added tests for reading various values.
#                       Made various minor corrections to 
#                       delicious2fluiddb.pdf.
#                       Split out test class for FDB internal unit tests
#                       that don't exercise/require the FluidDB API.
#                       Added command for that and documented '-v'
#                       (Also, in fact, fixed and simplified the regular
#                       expressions for floats and things, which were wrong.)
#
# 2008/08/24 v1.14      Fixed delicious.py so it sets the font on the page.

23 August 2009

On the Relationship between Delicious and FluidDB

I've drawn lots of pictures and written a few words to try to illustrate similarities and differences between Delicious (http://del.icio.us) and FluidDB as part of the process of trying to document and explain what delicious2fluiddb.py actually does.
It would be nice to post this directly on the blog, but turning it into HTML and PNGs etc. will take a while, so for now I'm just linking to the PDF, which you can get below.

FDB 1.0

fdb.py, available at GitHub, is up to 1.1.
New features include
  • a count command in the command line interface to count the number of results from a query
  • new (not thoroughly tested) raw HTTP GET and associated commands for talking to the FluidDB API at a low level
  • the ability to specify a host (with -host) or specifically to work with the sandbox (with -sandbox)
  • the ability to retrieve an object's id with /id and its about tag with /about (as special cases).
The 1.1 release also includes a PDF describing in pictures what the delicious2fluiddb.py code does. See next blog post for details.

22 August 2009

What the homepage generated from del.icio.us looks like

Homepage.png

del.icio.us integration with FluidDB through fdb.py 0.8

My fdb.py library (available at GitHub) includes the ability to upload bookmarks and tags from del.icio.us to FluidDB. It also (just because that's the old code I based it on) has the ability to create a home page consisting of all your bookmarks tagged with a particular tag or set of tags (home by default) in a dense grid.
This post is currently a verbatim dump of the new README-DELICIOUS file I wrote as documentation for how to use it. But I might reformat it later, and will almost certainly write some posts on the relationship between del.icio.us and FluidDB.
CODE FOR WORKING WITH DELICIOUS DISTRIBUTED WITH FDB
====================================================

Because FluidDB shares some characteristics with del.icio.us
(http://del.icio.us/ or http://delicious.com), some code for
working with del.icio.us is distributed with fdb.

There are four files:

  1. delicious2fluiddb.py
     This imports data from del.icio.us to FluidDB.
     It uses library code from delicious.py (below).

  2. delicious.py
     This uses the delicious API to extract bookmarks and tags from
     delicious, as an XML feed, and then to create a web page
     (normally a home page) with a fairly dense set of links to
     pages tagged with a particular tag (normally 'home') on delicious.
     It also caches the data it extracts (in files).

  3. deliconfig.py
     This configuration file controls the behaviour of delicious.py.
     It partly provides information about paths for various data,
     and partly controls formatting of the home page created by delicious.py

  4. delicious.cgi
     This is a CGI script that can be used to run delicious.py.
     The typical usage mode is:

        Add or modify some bookmarks on delicious
        Run the cgi script (usually by clicking a link on the home page).
        Return home

     This updates the home page.



IMPORTING BOOKMARKS FROM DELICIOUS TO FLUIDDB
=============================================

This should be straightforward.
First, make sure that fdb.py is working by running
the tests.   (See README file.)

Then edit deliconfig.py and at the very least set the
cache path to an XML file name in a location you
can write to and set credentials to a file containing
your username and password for delicious on separate lines.
Like this:

username
password

You probably also need to set the homepage to a writable
location.

If all you want to do is upload your bookmarks and tags
to FluidDB, you can probably ignore the rest.

(Though if you don't have any bookmarks tagged home, it
might be a good idea to set tags to a bookmark that you
do have.)

If you have already downloaded your delicious bookmarks
in xml, just set the cache variable to point to the relevant
XML file and run with -c.

Then, to download your data from delicious, run

    python delicious.py

If you have already got them in the file, you can skip
this stage.   (delicious2fluiddb.py reads from the cache.)

Then, to upload to FluidDB, type

    python delicious2fluiddb.py


WHAT IT DOES
============

It creates an object for each of your shared bookmarks with
the about tag (fluiddb/about) set to the URL of the bookmarked page.

Then, for each tag you have, it creates a FluidDB tag under
your username with the same name as the delicious tag.

It currently ignores all other fields, though I will fix that
in a later release.


KNOWN PROBLEMS
==============

Contrary to the FluidDB documentation, tags with colons
in the name (notably, tags that start with for:) fail.
This is because FluidDB currently bans colons in tag names.
This is fixed in the sandbox (as I type), and should go live
soon, so I don't plan to alter this functionality.


NOTE ON DELICIOUS BACKUPS
=========================

Because the author is paranoid, delicious.py never overwrites
an XML dump from delicious, but simply renames the old one with
a datestamp.   I have about 180 backups of delicious.
Obviously, you can delete them if you don't like keeping backups.


CREATING A HOME PAGE
====================

When you run delicious.py, it creates a web page in the location
specified by the homepage location in deliconfig.py.
This basically consists of links for all your bookmarks tagged
with 'home' (or any space-separated list of tags you set
in the variable tags).

The body of the link will be the title from delicious unless you
put something in the notes field, in which case that will be used
instead.   This is particularly useful if the title is long.

There's some special functionality allowing two related links
to be put in one position.   This is achieved, given one link
called "foo", by having the notes field of another set to "foo +bar".

If you do this, a single position in the grid will get two links,
the first called foo, pointing to its URL, and the second called
bar, pointing to its URL.   For example, Google.com with title
Google and Google UK with notes set to "Google +UK".
This produces two adjacent links, the first of which is called Google
and the second of which is called UK, pointing at the two Google
sites.
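
The pairing rule above can be sketched in a few lines of Python. This is
only an illustration of the "foo +bar" convention, not the code in
delicious.py; the function name group_links and the (label, url) input
format are assumptions made for the example.

```python
def group_links(bookmarks):
    """Group (label, url) pairs into grid positions.

    A label of the form "foo +bar" is attached to the position of the
    link labelled "foo" and is shown there with the text "bar".
    """
    positions = {}   # base label -> list of (text, url) for that position
    order = []       # base labels in first-seen order

    # First pass: every plain label gets its own grid position
    for label, url in bookmarks:
        if ' +' not in label:
            positions[label] = [(label, url)]
            order.append(label)

    # Second pass: attach "foo +bar" links to the position of "foo"
    for label, url in bookmarks:
        if ' +' in label:
            base, extra = label.split(' +', 1)
            if base in positions:
                positions[base].append((extra, url))
            else:
                # No partner found: fall back to a position of its own
                positions[label] = [(label, url)]
                order.append(label)

    return [positions[key] for key in order]


grid = group_links([
    ('Google', 'http://www.google.com/'),
    ('Google +UK', 'http://www.google.co.uk/'),
])
print(grid)
```

Here the label stands for whatever text would be used for the link
(the notes field if set, otherwise the title).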


CREATING A LINK TO REFRESH YOUR HOME PAGE
=========================================

If you really like the delicious tag-based home page, you might
want to install delicious.cgi to run this from a link in your
browser.   All you really need to do for that is to stick
delicious.cgi, delicious.py and deliconfig.py into your cgi-bin
directory, suitably configured, get it running, and then bookmark
that link from delicious, tagging it with home.
If you're doing this, you need to make sure that some of the locations
in deliconfig.py are writable by your Apache (or any other web server
you might be using.)

When you run it successfully, you get output like this:

Reading entries from del.icio.us
Writing cache /Users/njr/Sites/cache/delicious.xml
Building home page /Users/njr/Sites/cache/index.html
Home page built and backed up
Completed OK.

fdb.py 0.8: Documentation (the new README file)

This post is a webified version of the new README file provided with fdb.py version 0.8.

FDB Python Library

fdb is primarily a library for providing access to the FluidDB database (http://fluidinfo.com/fluiddb) from Fluidinfo (http://fluidinfo.com/). There is lots of coverage of the library (and its evolution) at http://abouttag.blogspot.com/.

FDB Command Line Access

fdb can also be used for command-line access to FluidDB. See Using the Command Line.

Dependencies

If you're running python 2.6, fdb.py should just run. With earlier versions of python, you need to get access to simplejson and httplib2. You can get simplejson from http://pypi.python.org/pypi/simplejson/ and httplib2 from http://code.google.com/p/httplib2/.

Credentials

For many operations, you also need an account on FluidDB, and credentials (a username and password). You can get these from
    http://fluidinfo.com/accounts/new
The library allows you to give it your credentials in various different ways, but life is simplest if you stick them in a 2-line file (preferably with restricted read access) in the format
username
password
On Unix, the default location for this is ~/.fluidDBcredentials, and on Windows the default file is fluidDBcredentials.ini in your home folder.
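
For illustration, a two-line credentials file of this kind can be read with a few lines of Python. This is a minimal sketch, not the code fdb.py itself uses; the function name read_credentials is made up, and the demonstration writes a throwaway temporary file rather than touching a real ~/.fluidDBcredentials.

```python
import os
import tempfile


def read_credentials(path):
    """Read a two-line credentials file: username first, password second."""
    with open(path) as f:
        lines = [line.strip() for line in f.read().splitlines()]
    if len(lines) < 2 or not lines[0] or not lines[1]:
        raise ValueError('expected username and password on separate lines')
    return lines[0], lines[1]


# Demonstration with a temporary file and a placeholder password
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as f:
    f.write('njr\nnot-a-real-password\n')
username, password = read_credentials(demo_path)
os.remove(demo_path)
print(username)
```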

Tests

The library includes a set of tests. If you have valid credentials, and everything is OK, these should run successfully if you just execute the file fdb.py. For example, at the time of writing this README file (version 0.8 of fdb), I get this:
$ python fdb.py
....................
----------------------------------------------------------------------
Ran 20 tests in 46.311s

OK

Using the Library

Four ways of exploring the library are:
  1. look at the tests (the ones in the class TestFluidDB)
  2. look at the blog (http://abouttag.blogspot.com)
  3. read the function documentation, which is . . . existent.
  4. look at and run example.py, which should print DADGAD and 10.
Here is example.py:
import fdb

db = fdb.FluidDB ()  # assumes credentials are in the standard place

# Read the about tag of the object about DADGAD; prints DADGAD
(status, value) = db.get_tag_value_by_about ('DADGAD', '/fluiddb/about')
print value

# Tag the object with rating=10, then read the value back; prints 10
assert db.tag_object_by_about ('DADGAD', 'rating', 10) == 0
(status, value) = db.get_tag_value_by_about ('DADGAD', 'rating')
print value

Using the Command Line

Commands can be run by giving arguments to fdb.py. For a list of commands, use
    python fdb.py help
An example command is
    python fdb.py show -a DADGAD rating /fluiddb/about
Obviously, if you want to use fdb as a command from the shell, it will probably be convenient to use an alias or create a trivial shell script to run it. I use bash, with the alias
    alias fdb='python ~/python/fluiddb/fdb.py'
which allows me to type
    fdb show -a DADGAD rating /fluiddb/about
etc.

Delicious

Also distributed with fdb itself is code for accessing delicious.com (http://del.icio.us/, as was), and for migrating bookmarks and other data to FluidDB. This also includes functionality for creating web homepages from delicious based on a home tag.
A further post on using the del.icio.us uploader and other functionality will follow.

fdb.py 0.5: get becomes show

I've done a bit more to fdb.py. It's mostly tidying up and more tests (you can never have enough tests!), but there's a significant change to the command line interface: I've changed the get command to show. So now, you'd type something like:
  fdb show -a DADGAD /njr/rating
instead of
  fdb get -a DADGAD /njr/rating
That assumes you have an alias or shell script pointing at it, of course. If not, the full thing is:
  python fdb.py show -a DADGAD /njr/rating
The reason for the change is that (somewhat reluctantly) I've concluded that it's going to be useful to support raw HTTP GET, POST, PUT, DELETE and HEAD for lower-level stuff, and it seems cleaner if those commands are just of the form
  fdb get whatever
There are other minor changes; see the log. And I added a nice, permissive license ("stolen" from Nicholas Tollervey; but I don't think he'll mind).

21 August 2009

The fdb command line in fdb.py 0.3

I just pushed fdb.py 0.3 to github (http://github.com/njr0/fdb.py). It has a few extra things in the API, but the big new thing is you can use it from the command line.
I hope the following is reasonably self-explanatory. Obviously you can use an alias or a 1-line shell script to get rid of the need for the python fdb.py.
Script started on Fri Aug 21 16:27:35 2009
$ # reference objects by their about tag using -a
$ # the -v just tells it to be verbose and tell you what it's doing 
$ python fdb.py tag -av DADGAD rating=10
Tagged object with about="DADGAD" with rating = 10

$ python fdb.py tag -av DADGAD favourite
Tagged object with about="DADGAD" with favourite

$ python fdb.py get -a DADGAD rating favourite /terry/rating
Object with about=DADGAD:
  /njr/rating = 10
  /njr/favourite
  <tag /terry/rating not present>

$ python fdb.py untag -a -v DADGAD rating
Removed tag rating from object with about="DADGAD"

$ python fdb.py get -a DADGAD rating favourite /terry/rating
Object with about=DADGAD:
  <tag /njr/rating not present>
  /njr/favourite
  <tag /terry/rating not present>

$ # reference objects by their id tag using -i

$ python fdb.py tag -iv a984efb2-67d8-4b5c-86d0-267b87832fa4 rating=10
Tagged object a984efb2-67d8-4b5c-86d0-267b87832fa4 with rating = 10

$ python fdb.py tag -iv a984efb2-67d8-4b5c-86d0-267b87832fa4 favourite
Tagged object a984efb2-67d8-4b5c-86d0-267b87832fa4 with favourite

$ python fdb.py get -i a984efb2-67d8-4b5c-86d0-267b87832fa4 rating favourite /terry/rating
Object a984efb2-67d8-4b5c-86d0-267b87832fa4:
  /njr/rating = 10
  /njr/favourite
  <tag /terry/rating not present>

$ python fdb.py untag -i -v a984efb2-67d8-4b5c-86d0-267b87832fa4 rating
Removed tag rating from object a984efb2-67d8-4b5c-86d0-267b87832fa4

$ python fdb.py get -i a984efb2-67d8-4b5c-86d0-267b87832fa4 rating favourite /terry/rating
Object a984efb2-67d8-4b5c-86d0-267b87832fa4:
  <tag /njr/rating not present>
  /njr/favourite
  <tag /terry/rating not present>
$ exit

Script done on Fri Aug 21 16:31:45 2009

fdb.py on github

If you like your libraries revision controlled, fdb.py is now available on github at http://github.com/njr0/fdb.py. And Xavi says it works!

Tagging and untagging from python with fdb.py 0.2

I extended fdb.py a bit. Version 0.2 is available where 0.1 used to be, at http://stochasticsolutions.com/fluiddb/fdb.py, but also at http://stochasticsolutions.com/fluiddb/fdb0.2.py. Version 0.1 is still available at http://stochasticsolutions.com/fluiddb/fdb0.1.py. (I know, I know, I should make it publicly available through a VCS, and I will, but this is much easier for me right now.)
Main new features are:
  • Ability to untag objects
  • More tests
  • Tests that work for users whose FluidDB name isn't njr
  • More tag path manipulation
  • Most (all?) of the commands on tags now accept relative or absolute tag names
It still doesn't support subnamespaces or (m)any queries yet, though.
The following code illustrates the new functionality:
import fdb
import types

db = fdb.FluidDB (fdb.Credentials (filename='/Users/njr/.fluidDBcredentials'))

# Create an object with about='DADGAD' (or look up its ID if it already exists)
o = db.create_object ('DADGAD')
assert type (o) != types.IntType        # Would indicate an error code
id_DADGAD = o.id

# Add njr/rating=10 to the DADGAD object
assert db.tag_object_by_id (id_DADGAD, 'rating', 10) == 0

# Read the value back and check it's right
(status, value) = db.get_tag_value_by_id (id_DADGAD, 'rating')
assert value == 10

# Remove njr/rating from DADGAD object
assert db.untag_object_by_id (id_DADGAD, 'rating') == 0

# Again, using absolute tag path
assert db.untag_object_by_id (id_DADGAD, '/njr/rating') == 0

# Again, using absolute tag path
assert db.untag_object_by_id (id_DADGAD, '/njr/rating') == 0

# Yet again, requesting error if the tag or object isn't there
error = db.untag_object_by_id (id_DADGAD, '/njr/rating', False)
assert error == fdb.STATUS.NOT_FOUND    # 404 :-)

print 'Well, that all worked!'
This produces:
zero:$ time python taguntag.py
Well, that all worked!

real 0m3.920s
user 0m0.121s
sys 0m0.068s

Tagging, Tags and Abstract Tags

An issue anyone using the FluidDB API (in any form) will run across fairly quickly is the occasionally vexed issue of what we actually mean by a tag.
Conceptually, it's very straightforward: a tag is exactly like the tags we all know and love from del.icio.us, Flickr, GMail etc., with the twist that they can have values. So a tag has a name (like njr/rating, terry/toread etc.), and optionally has a value (which can be of almost any type) too.
Additionally, tags have some other properties, like a rather full permissions system (that controls who can see, edit and attach them to objects) and some metadata, like an optional description.
The potential confusion arises because sets of tags sharing the same name are (for good reason) often managed together, and in some cases share metadata in FluidDB.
We can see this if we look at the process of tagging an object in a little detail.
In the client library, fdb.py, that I published earlier, you can add a tag to an object very simply if you know its ID. For example, if I wanted to add an njr/rating tag with the value 10 to the object with the id a984efb2-67d8-4b5c-86d0-267b87832fa4, I could just say
import fdb
db = fdb.FluidDB (fdb.Credentials (filename='/Users/njr/.fluidDBcredentials'))
o = db.tag_object_by_id ('a984efb2-67d8-4b5c-86d0-267b87832fa4', 'rating', 10)
assert o == 0
This corresponds very closely to the conceptual model I use and encourage others to use, and works even if there have never been any njr/rating tags in the system before.
Under the covers, however, the native HTTP API sees things slightly differently. Before I can tag something with a tag such as njr/rating I first have to tell the system I want to use tags with this name. The underlying API refers to this process as tag creation, though I prefer to think of it as abstract tag creation, or tag declaration.
So the way fdb.py actually works, is that when you ask it to tag an object with a given tag (and perhaps a value), it goes ahead and tries to do that for you. But if that tag hasn't previously been declared (i.e., if the abstract form of it hasn't been created), this will fail. In this case, the library backs up and creates the abstract tag and then tries again. It does this using another fdb call:
db.create_abstract_tag ('rating', description=None, indexed=True)
As you can see, we can also give a description for an (abstract) tag, which in effect applies to all the real ("concrete") tags we create when we tag objects, and can also specify whether FluidDB should index the tag (making it searchable). So we could say:
db.create_abstract_tag ('rating',
          description="njr's rating for things, on a scale of 0-10")
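The try, declare, retry behaviour described above can be sketched with a small in-memory stand-in for FluidDB. Everything here is an assumption made for illustration: FakeFluidDB is not the real fdb.FluidDB class, raw_tag is a made-up name for the low-level tagging call, and using 404 as the failure code for an undeclared tag is a guess.

```python
NOT_FOUND = 404   # assumed status for tagging with an undeclared tag name


class FakeFluidDB(object):
    """In-memory stand-in for the HTTP API; not the real fdb.FluidDB."""

    def __init__(self):
        self.abstract_tags = {}   # tag name -> description
        self.values = {}          # (object id, tag name) -> value

    def create_abstract_tag(self, tag, description=None):
        """Declare a tag name so that it can be attached to objects."""
        self.abstract_tags[tag] = description
        return 0

    def raw_tag(self, object_id, tag, value):
        """Attach a tag to an object; fails if the tag was never declared."""
        if tag not in self.abstract_tags:
            return NOT_FOUND
        self.values[(object_id, tag)] = value
        return 0

    def tag_object_by_id(self, object_id, tag, value=None):
        """Try to tag; on failure, declare the abstract tag and retry once."""
        status = self.raw_tag(object_id, tag, value)
        if status == NOT_FOUND:
            self.create_abstract_tag(tag)
            status = self.raw_tag(object_id, tag, value)
        return status


db = FakeFluidDB()
status = db.tag_object_by_id('a984efb2-67d8-4b5c-86d0-267b87832fa4',
                             'rating', 10)
print(status)
```

The caller never has to think about abstract tags: the first attempt fails behind the scenes, the tag name gets declared, and the second attempt succeeds.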
Similarly, I will soon implement some untag methods in fdb.py, but these shouldn't be confused with the delete_abstract_tag function that already exists. The delete_abstract_tag method doesn't simply remove a tag from an object, but actually deletes all tags with the given tag name (and the abstract tag itself) from the system.
The reason FluidDB cares so much about abstract tags (or, if you prefer, sets of tags sharing the same tag name) is that this is the level at which the permissions system acts. In FluidDB, there is fairly fine-grained control, allowing the owner of an (abstract) tag to decide who is allowed to read tags with a given name, apply them to objects, and alter them.
More on that later.

20 August 2009

fdb: A simple python client library for FluidDB

I spent most of today playing with FluidDB, building on the work that I mentioned earlier from Sanghyeon Seo and Nicholas Tollervey.
The result is the fdb.py library, which you can find here.
If anyone wants to use it, feel free. I'll attach a licence some time, but it'll be BSD or similar --- something very permissive --- as long as Sanghyeon Seo and Nicholas Tollervey are happy.
There are six tests at the end that show its use reasonably clearly (I hope). The easiest way to use it is to stick your credentials in a file, possibly ~/.fluidDBcredentials as username and password on separate lines. Like this:
$ cat ~/.fluidDBcredentials
njr
myVerySecretPasswordThatAbsolutelyNobodyKnows
I'll post some example code using it, but that might not be till tomorrow evening.
The main thing I've used it for (other than the tests) is for pushing about 1500 bookmarks
from del.icio.us into FluidDB and it worked almost flawlessly as far as I can tell.
It hung after 678 object creations, but was fine after I interrupted and started again.

On tags, namespaces, paths

The full specification of a tag in fluidDB might be something like
http://fluidDB.fluidinfo.com/tags/njr/var/rating
The way we talk about this in FluidDB (at least, the bits we're agreed on) is as follows:
  • rating is the name of the tag;
  • njr/var is the name of a hierarchical namespace; njr is me and var is a (sub-)namespace that I've created under my username namespace njr/.
The problem, as usual, is that we may wish to refer to different bits of it. I am currently (for myself, and in my code) using the following names, though others will undoubtedly use others.
  • http://fluidDB.fluidinfo.com/tags/njr/var/rating is the tag URI;
  • /tags/njr/var/rating is the full tag path;
  • /njr/var/rating is the absolute tag path;
  • /njr/var is the absolute namespace;
  • var/rating is the relative tag path;
  • rating is the short tag name.
Why do I care? Well, partly just so that we can talk about things, and partly because I want to make my functions be fairly liberal about what they accept (relative or absolute paths, filling in missing namespaces when appropriate), etc. I'm sure there'll be confusion for a while; hopefully followed by blissful clarity.
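
To make the naming concrete, here is a minimal sketch that splits a tag URI into the pieces listed above. The function name describe_tag and the dictionary keys are made up for this example; it is plain string handling and is not taken from fdb.py.

```python
def describe_tag(uri, user, base='http://fluidDB.fluidinfo.com'):
    """Split a tag URI into the named parts used in this post."""
    prefix = base + '/tags/' + user + '/'
    if not uri.startswith(prefix):
        raise ValueError('not a tag URI for user %s: %s' % (user, uri))
    full_path = uri[len(base):]                   # /tags/njr/var/rating
    absolute_path = full_path[len('/tags'):]      # /njr/var/rating
    absolute_namespace, short_name = absolute_path.rsplit('/', 1)
    relative_path = absolute_path[len('/' + user + '/'):]
    return {
        'tag URI': uri,
        'full tag path': full_path,
        'absolute tag path': absolute_path,
        'absolute namespace': absolute_namespace,
        'relative tag path': relative_path,
        'short tag name': short_name,
    }


parts = describe_tag('http://fluidDB.fluidinfo.com/tags/njr/var/rating', 'njr')
for name in sorted(parts):
    print('%s: %s' % (name, parts[name]))
```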

Getting started with FluidDB

This post describes the exact steps I took to get started with FluidDB. It describes how to

  • Get a login and password
  • Get the libraries to allow you to use it from python 2.5
  • Get the FluidDB object ID corresponding to a FluidDB username
  • Create a new object in FluidDB
  • Retrieve that object

What I had to do

  1. Get a username and password. I got my username as a consequence of following @fluiddb on twitter, and the password was sent to me this morning. If you haven’t already reserved a username this way, you can reserve one instead by signing up at http://fluidinfo.com/accounts/new.

  2. I started using the very simple but useful Python library fluiddb put together by Sanghyeon Seo and augmented by Nicholas Tollervey, who gives some nice examples of its use on his blog. (I found bitbucket a bit confusing, but you can get the actual source, gzipped, from http://bitbucket.org/sanxiyn/fluidfs/get/054092b8d3ff.gz or as a zip or bz2 by changing the extension.)

  3. I found that I needed to install a couple of libraries to use with Python 2.5. These were httplib2, available from http://code.google.com/p/httplib2/ and simplejson, available from http://pypi.python.org/pypi/simplejson/. I then tried Tollervey’s examples, which all worked fine, without any credentials for FluidDB. So far so good.

  4. Then I used the following trivial code to get the ID for the object corresponding to me (njr).

    import fluiddb
    
    njrID = fluiddb.call('GET', '/objects', query='fluiddb/users/username = "njr"')
    print njrID
    

which produced

$ python getnjr.py
(200, {'ids': ['cde01b2c-68bb-4d41-b25c-3ca49dcec434']})
  5. I then wanted to try creating an object, which does require credentials. For this, I created the following trivial credentials class (credentials.py):

    class Credentials:
        def __init__ (self, username, password, id=None):
            self.username = username
            self.password = password
            self.id = id
    
    njr = Credentials ('njr', 'my-secret-password',
        'cde01b2c-68bb-4d41-b25c-3ca49dcec434')
    

    (Obviously, that isn’t the real password...)

  6. That having worked, it seemed like it was time to create an object and test that I could retrieve it, which I did with the following code (createDADGAD.py):

    import fluiddb, credentials
    import simplejson as json
    
    njr = credentials.njr
    fluiddb.login(njr.username, njr.password)
    # The POST body is a JSON object giving the new object's about value
    about_DADGAD = json.dumps({'about': 'DADGAD'})
    (status, o) = fluiddb.call('POST', '/objects', about_DADGAD)
    print status, o
    # Fetch the object back by its ID, asking for the about value too
    print fluiddb.call('GET', '/objects/' + o['id'], '{"showAbout": true}')
    

    which produced

    $ python createDADGAD.py
    201 {'id': 'a984efb2-67d8-4b5c-86d0-267b87832fa4',
    'URI': 'http://fluidDB.fluidinfo.com/objects/a984efb2-67d8-4b5c-86d0-267b87832fa4'}
    (200, {'about': 'DADGAD', 'tagPaths': ['fluiddb/about']})
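Both scripts above print a (status, body) pair straight from fluiddb.call. A small sketch of how one might check the status and pull out the IDs is shown below. The helper name expect is hypothetical, and the sample pairs are simply copied from the output above rather than fetched live (the URI field is left out for brevity).

```python
def expect(response, ok_status):
    """Unpack a (status, body) pair and raise if the status is not
    the one we were hoping for. Hypothetical helper, not part of the
    fluiddb library."""
    status, body = response
    if status != ok_status:
        raise RuntimeError('FluidDB returned status %s: %r' % (status, body))
    return body


# Sample responses copied from the script output shown above
query_result = (200, {'ids': ['cde01b2c-68bb-4d41-b25c-3ca49dcec434']})
create_result = (201, {'id': 'a984efb2-67d8-4b5c-86d0-267b87832fa4'})

njr_id = expect(query_result, 200)['ids'][0]   # a query returns a list of IDs
new_id = expect(create_result, 201)['id']      # a creation returns a single ID
```

In a real script the first argument would be the result of fluiddb.call itself, for example expect(fluiddb.call('POST', '/objects', about_DADGAD), 201).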
