31 August 2009

Flags in FDB from version 1.18 onwards

I posted recently that FDB 1.18 has been released and that one of the changes was that my custom flags library for handling command line options (switches; flags; things like -a) has been replaced with the standard optparse library. (Thanks again, Yevgen (@xa4a_).)

When I wrote that post, I hadn’t fully appreciated that this had subtly changed the behaviour of command line arguments, and in doing so invalidated some of the examples I posted.

Most people will almost certainly prefer the new behaviour (or at least, will probably find it more familiar), but since it’s different and inconsistent with previously published examples, I thought I’d better document it.

The main changes are as follows:

  • Single-letter options may no longer be conflated, i.e.

    USE   fdb tag -v -a DADGAD rating=10
    NOT   fdb tag -av DADGAD rating=10

    to specify that you want to set a rating of 10 on the object whose about tag is DADGAD

  • -q, -a and -i now take the argument immediately following as the query, about-tag or ID, rather than using the first argument not tied to an option. So

    USE   fdb tag -v -a DADGAD rating=10
          fdb tag -sand -i edfaaf70-b9b5-42f1-ad4c-30601b70a2ac rating=10
          fdb tag -v -sand -q "has njr/rating" rating=10
    
    NOT   fdb tag -a -v DADGAD rating=10
          fdb tag -i -sand edfaaf70-b9b5-42f1-ad4c-30601b70a2ac rating=10
          fdb tag -v -q -sand "has njr/rating" rating=10

  • Options may now appear anywhere in the command

    USE  fdb tag -sand -i edfaaf70-b9b5-42f1-ad4c-30601b70a2ac rating=10
    OR   fdb tag -i edfaaf70-b9b5-42f1-ad4c-30601b70a2ac rating=10 -sand
    OR   fdb -i edfaaf70-b9b5-42f1-ad4c-30601b70a2ac tag rating=10 -sand

  • There are now long versions with -- available to go with the short (single-letter) form for each action. (Personally, I hate --, but I lost that battle years ago.) Specifically, you can use

    --about ABOUT, -a ABOUT  to specify an about tag of ABOUT
    --id ID, -i ID           to specify an id of ID
    --query QUERY, -q QUERY  to specify a query of QUERY
    --timeout N, -T N        to specify an HTTP timeout of N
    --sandbox, -s            to specify use of the sandbox
    --verbose, -v            to increase verbosity
    --debug, -D              for debug mode
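
The changes above follow from the way optparse parses options. Here is a minimal sketch, assuming option definitions along the lines of the table above. The parser setup, the dest names and the float type for the timeout are assumptions made for illustration; this is not FDB's actual code.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser resembling the option table above (illustrative only)."""
    parser = OptionParser(usage="fdb COMMAND [options] [args]")
    # Options that take the argument immediately following them.
    parser.add_option("-a", "--about", dest="about")
    parser.add_option("-i", "--id", dest="id")
    parser.add_option("-q", "--query", dest="query")
    parser.add_option("-T", "--timeout", dest="timeout", type="float")
    # Simple on/off switches.
    parser.add_option("-s", "--sandbox", action="store_true",
                      dest="sandbox", default=False)
    parser.add_option("-v", "--verbose", action="store_true",
                      dest="verbose", default=False)
    parser.add_option("-D", "--debug", action="store_true",
                      dest="debug", default=False)
    return parser


parser = build_parser()

# Separate options: -a picks up DADGAD, the rest are positional arguments.
opts, args = parser.parse_args(["tag", "-v", "-a", "DADGAD", "rating=10"])
print(opts.about, opts.verbose, args)

# Conflated options: optparse treats the "v" in "-av" as the value of -a,
# so the about tag becomes "v" and DADGAD is left as a positional argument.
opts2, args2 = parser.parse_args(["tag", "-av", "DADGAD", "rating=10"])
print(opts2.about, opts2.verbose, args2)

# Options may appear anywhere, and long forms work too.
opts3, args3 = parser.parse_args(["--id", "some-id", "tag", "rating=10",
                                  "--sandbox"])
print(opts3.id, opts3.sandbox, args3)
```

The second call shows why -av no longer does what it used to: once -a takes an argument, whatever follows it (even within the same word) is consumed as that argument.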

29 August 2009

FDB 1.18

FDB version 1.18 is now available from GitHub at http://github.com/njr0/fdb.py/.

There’s quite a lot of cleanup, almost all done by Yevgen Varavva (@xa4a_ on Twitter). Changes since the last version are:

  • Using a decorator to avoid code repetition
  • Moving flag/option handling to use the standard optparse library instead of the custom flags lib
  • fixing a couple of minor bugs
  • making everything work against the sandbox as well as the main instance.

At the time of writing, the main instance is offline, so I haven’t actually been able to test against that, but I suspect it will be OK.

I haven’t actually removed the flags library from the repository yet, but probably will.

26 August 2009

The Guardian 1000 Novels Everyone Must Read

I wanted to create a few objects that would begin to show some other aspects of FluidDB.

Some may remember that in January The Guardian ran a series of articles entitled 1000 Novels Everyone Must Read. They published the list in themed sections, which made a lot of sense for the paper, but which made it surprisingly hard to see what the 1,000 were, so I spent a while building a digital version of the list, which I blogged about and published here. (Ironically, they then did the same thing, but my work wasn’t entirely wasted, because at least I ended up with a nice, clean, structured, electronic version of their list.)

But . . .

Wouldn’t it be great if people could stick ratings on those 1,000 novels, or indicate which ones they’ve read, and also augment the list with novels they think should have been on there. And wouldn’t it be insanely great (in that strangely familiar phrase) if you could then find out things like

  • which novels your friends have read and rated highly that you haven’t read
  • which of your friends seem to like the same novels as you
  • who of your non-friends like the same novels as you
  • which novels have the highest ratings from users.

This is exactly the kind of thing FluidDB was built for.

So, I’ve made a start by creating FluidDB objects for some [1] of the novels.

The Hitchhiker’s Guide to FluidDB

In order to load information into FluidDB in a useful way, we need to decide how we want to represent it. FluidDB is fantastically relaxed about this, and it is not necessary for everyone to agree. But at the same time, for users to share data meaningfully, and even for a single user to be able to find information systematically, fixing a structure is really useful.

This is what I am using so far.

http://StochasticSolutions.com/fluiddb/image/hitchhiker.png

In principle, it seems to me that the ISBN number is the natural way to identify a book (despite some limitations), so I’ve chosen that as the about tag (even though they’re painful for me to look up).

In a perfect world, the tags in grey would have some namespace with more authority than njr (ideally isbn.org:), and in time that may come, quite possibly using the very same object (edfaaf70-b9b5-42f1-ad4c-30601b70a2ac). But for now, I’ve stuck the basic metadata in my namespace (njr), and that should be enough to get things rolling. These tags were put on programmatically using a few lines of Python.

The social tag, of course, is the black one—njr/rating = 9. Unlike the others, which were generated automatically, I put that one on myself, by hand. There are lots of ways of doing this, but I used FDB, with the command

$ fdb tag -av isbn:978-0330258647 rating=9
Tagged object with about="isbn:978-0330258647" with rating = 9

So: 10 down 990 to go. I will add the rest as soon as I can find a reasonable way of getting their ISBN numbers.

Tag Conventions

While it is fundamental to FluidDB that each user and application can make their own choices, I believe good conventions are useful and will nurture the growth of FluidDB. (Terry refers to my desire for conventions as my fascist librarian tendencies, but I’m sure he means it kindly.) I will maintain my personal suggested conventions on this blog on a post I’ll make soon. The first ones will be these:

The About Tag

  • Books: I suggest a primary convention that matches about="isbn:978-0330258647". [2] That is, the four lower-case letters ‘isbn’, followed by a colon, followed by the 13-digit ISBN number.
  • Books without an ISBN number: I am going to use about="book:Name of Book/Author" for now, but that’s not very precise. (Maybe eventually FluidDB IDs will replace ISBN numbers.)
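
As a sketch of the first convention, the function below turns an ISBN-13 written in any common way into an about-tag value of the form shown above. The function name is hypothetical and is not part of FDB; the check-digit test is the standard ISBN-13 rule (weights alternating 1 and 3, total divisible by 10).

```python
def isbn_about_tag(isbn):
    """Return an about-tag value such as 'isbn:978-0330258647' for an ISBN-13."""
    # Keep only the digits, so hyphens and spaces in the input don't matter.
    digits = "".join(ch for ch in isbn if ch.isdigit())
    if len(digits) != 13:
        raise ValueError("expected a 13-digit ISBN, got %r" % isbn)
    # Standard ISBN-13 check: weights alternate 1, 3, 1, 3, ... and the
    # weighted sum must be a multiple of 10.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    if total % 10 != 0:
        raise ValueError("bad ISBN-13 check digit in %r" % isbn)
    # Convention: 'isbn:' + 3-digit prefix + '-' + remaining 10 digits.
    return "isbn:%s-%s" % (digits[:3], digits[3:])


print(isbn_about_tag("978-0-330-25864-7"))  # isbn:978-0330258647
```

Normalising before tagging matters because the about tag is unique: two different spellings of the same ISBN would otherwise end up on two different objects.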

Other Tags

  • Ratings: username/rating. Mostly because Terry Jones (the father of FluidDB) has effectively promulgated this with his every description of FluidDB, I suggest ratings on a scale of 0 (worst) to 10 (best). Feel free to use decimals if 11 ratings aren’t enough. Obviously, anyone can go higher (or lower) if they want, but when I’m calculating means and things, I’ll ignore out-of-bounds values.
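
The "ignore out-of-bounds values" rule for means can be sketched as follows. This is only an illustration of the convention; the function name and the choice to return None when nothing is in range are assumptions.

```python
def mean_rating(ratings, low=0, high=10):
    """Mean of the ratings lying within [low, high]; None if there are none."""
    # Out-of-bounds ratings (for example 11 or -1) are ignored, not clipped.
    in_bounds = [r for r in ratings if low <= r <= high]
    if not in_bounds:
        return None
    return sum(in_bounds) / float(len(in_bounds))


# 11 and -1 are dropped, so this is the mean of 9, 10 and 7.5.
print(mean_rating([9, 10, 7.5, 11, -1]))
```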

The Guardian 10 Novels Everyone Must Read

As noted before, I’ve currently only added objects for the first ten. This is so tiny, I thought I might as well show the FDB output illustrating what’s there. If anyone does have access, do tag these with your rating, or a has_read tag or whatever (as if you need my permission!).

At the time of writing, The Hitchhiker’s Guide to the Galaxy has three ratings; but I bet that increases really fast.

fdb show -q 'has njr/guardian-1000' /about title publication-year author-forename author-surname guardian-1000
10 objects matched
Object edfaaf70-b9b5-42f1-ad4c-30601b70a2ac:
  /fluiddb/about = "isbn:978-0330258647"
  /njr/title = "The Hitchhiker's Guide to the Galaxy"
  /njr/publication-year = 1979
  /njr/author-forename = "Douglas"
  /njr/author-surname = "Adams"
  /njr/guardian-1000 = 1

Object 9d49f0b5-fb0b-49c8-b4fc-d82eb35d90e1:
  /fluiddb/about = "isbn:978-1569470039"
  /njr/title = "Silver Stallion"
  /njr/publication-year = 1990
  /njr/author-forename = "Junghyo"
  /njr/author-surname = "Ahn"
  /njr/guardian-1000 = 1

Object 1be19311-1f81-4f70-ba7a-4075d9b06b4d:
  /fluiddb/about = "isbn:978-9774246036"
  /njr/title = "al-Bab al-Maftouh"
  /njr/publication-year = 1960
  /njr/author-forename = "Latifa"
  /njr/author-surname = "al-Zayyat"
  /njr/guardian-1000 = 1

Object f0130f4f-2bc2-4ea4-8e52-d6d4922969a2:
  /fluiddb/about = "isbn:978-0701206048"
  /njr/title = "Death of a Hero"
  /njr/publication-year = 1929
  /njr/author-forename = "Richard"
  /njr/author-surname = "Aldington"
  /njr/guardian-1000 = 1

Object e65b1540-1fe4-46b5-b14d-da32ae592dfe:
  /fluiddb/about = "isbn:978-0141188539"
  /njr/title = "The Face of Another"
  /njr/publication-year = 1964
  /njr/author-forename = "Kobo"
  /njr/author-surname = "Abe"
  /njr/guardian-1000 = 1

Object 9cdfb63a-38e2-4c27-a739-5107ec24d151:
  /fluiddb/about = "isbn:978-1857989984"
  /njr/title = "Non-Stop"
  /njr/publication-year = 1958
  /njr/author-forename = "Brian W"
  /njr/author-surname = "Aldiss"
  /njr/guardian-1000 = 1

Object 14371df6-73d9-43e0-96e5-2008e852d67f:
  /fluiddb/about = "isbn:978-0140621198"
  /njr/title = "Little Women"
  /njr/publication-year = 1868
  /njr/author-forename = "Louisa May"
  /njr/author-surname = "Alcott"
  /njr/guardian-1000 = 1

Object 97401884-3590-4c22-b9dc-0cd2744270eb:
  /fluiddb/about = "isbn:978-1841955612"
  /njr/title = "The Man with the Golden Arm"
  /njr/publication-year = 1949
  /njr/author-forename = "Nelson"
  /njr/author-surname = "Algren"
  /njr/guardian-1000 = 1

Object 4a96c86b-1117-4c93-a815-1494c13fc2cf:
  /fluiddb/about = "isbn:978-0141186900"
  /njr/title = "Anthills of the Savannah"
  /njr/publication-year = 1987
  /njr/author-forename = "Chinua"
  /njr/author-surname = "Achebe"
  /njr/guardian-1000 = 1

Object f95a950e-1b1a-425d-b4ac-b8fc82c99cb3:
  /fluiddb/about = "isbn:978-0141023380"
  /njr/title = "Things Fall Apart"
  /njr/publication-year = 1958
  /njr/author-forename = "Chinua"
  /njr/author-surname = "Achebe"
  /njr/guardian-1000 = 1

[1] OK, only ten so far, but there’s a reason. Doing them all would be really easy if I knew a good way of looking up an ISBN number programmatically. If anyone knows one, please let me know (by leaving a comment, or any other way). Otherwise I’ll have to resort to screen-scraping, which is completely doable, but dull and requires manual intervention when it goes wrong.
[2] When I say the about tag and write about="...", this is shorthand for fluiddb/about. This tag is special in that it is unique (only one object can ever exist with a given value of the about tag), and immutable. This is a basic property of FluidDB, and means that when a rating is attached to the object with about="isbn:978-0330258647", we can be confident that the object will forever more be about that ISBN number.

fdb 1.15: Important Bug Fix

I pushed a new version of FDB (version 1.15) to GitHub (http://github.com/njr0/fdb.py).

Among other minor things, this fixed a serious problem whereby tags created by FDB have the wrong type. (Sorry about that.) So if you have used FDB (the library or the command line utilities) you may want to reset any tags.

The new version is better, but can’t set tags with no value because of a bug currently present in both the main and sandbox instances of FluidDB. That means that one test currently fails.

I’ve added a KNOWN-PROBLEMS file, which documents these issues.

25 August 2009

Delicious and FluidDB (redux)

Preamble

This article discusses the relationship between the social bookmarking site, del.icio.us, and the new online database FluidDB. It also talks about a utility that can be used to transfer bookmarks from del.icio.us to FluidDB, and a little about what else you can do with FluidDB.

This article is an HTML version of the PDF previously posted here.

Some examples in this article use the command line version of fdb.py. This is a client library for FluidDB, available from GitHub at http://github.com/njr0/fdb.py

The delicious importer described here is also distributed with fdb.py.

Delicious and FluidDB

A simple mental model of a bookmark in del.icio.us (http://del.icio.us) [1] looks something like this:

google-delicious-simple.png

This represents the site www.google.com, and I have tagged it with my search, google and home tags.

Perhaps the simplest way to map this to FluidDB is as follows. This is what delicious2fluiddb.py currently does.

google-fluiddb-simple-short-about.png

Everything looks very similar except that the address (ID) of the object we store the information on is more prominent and the subject of the bookmark (the URL) is placed on the special about tag. This address (id) is real (it exists in FluidDB and is permanent).

$ fdb show -a "http://www.google.com/" /id
Object with about="http://www.google.com/":
  /id = "6c1d7519-35c2-42c0-8b8c-08ffbbc5990d"

Unlike the other tags, the about tag is not owned by njr, but is system-wide. So we often abbreviate it to just about.

google-fluiddb-simple-short-about.png

The about tag is special in that (a) its value can never be changed and (b) it is unique. [2] This one is permanently associated with the object having the ID 6c1d7519-35c2-42c0-8b8c-08ffbbc5990d. So by attaching these tags to this object, we can be entirely confident that they will always be associated with the URL http://www.google.com/.

Del.icio.us actually allows us to store a little more information in a bookmark—a title, some notes and an update date. We can also think of the URL as a tag itself, but as in FluidDB, it’s owned by the system, not by njr. So a more complete picture of a del.icio.us bookmark might be:

google-delicious-bigger.png

Note that in del.icio.us, the tags are just names, but the other four items of information on the bookmark have values.

Replicating this in FluidDB is trivial, because all tags can take values, which can have various types—strings, numbers, dates and much more.

google-fluiddb-simple-bigger.png

Although delicious2fluiddb.py doesn’t do this yet, it would be trivial for it to replicate the delicious title, notes and update fields, to produce the structure shown above; soon, it will.

There are many other ways we could represent information from del.icio.us in FluidDB. Since the value of a tag in FluidDB can be a set, another obvious way to represent the tags would be as a set of strings on a tag called tags (or perhaps delicious-tags).

google-fluiddb-simple-tags-as-values.png
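
The two representations can be compared side by side with plain Python dictionaries. This is just a sketch of the data layout, not of anything delicious2fluiddb.py does: the function names are hypothetical, and None is used here to stand for "tag present with no value".

```python
bookmark = {
    "url": "http://www.google.com/",          # becomes the about tag
    "tags": ["search", "google", "home"],     # the delicious tags
}


def as_valueless_tags(bookmark, user):
    """One FluidDB tag per delicious tag, none of them carrying a value."""
    return dict(("%s/%s" % (user, tag), None) for tag in bookmark["tags"])


def as_tag_set(bookmark, user, name="delicious-tags"):
    """A single FluidDB tag whose value is the set of delicious tag names."""
    return {"%s/%s" % (user, name): set(bookmark["tags"])}


print(as_valueless_tags(bookmark, "njr"))
print(as_tag_set(bookmark, "njr"))
```

In the first layout each delicious tag can be queried (and given permissions) by name; in the second, everything imported from delicious lives under one tag.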

Of course, once in FluidDB, we can also add as many other tags as we like to the object, and these tags may or may not have values.

Privacy, Sharing, and Permissions

But before we get onto that, there’s one more crucial bit of information that del.icio.us stores about a bookmark—whether it is shared (meaning anyone can see that njr has the bookmark, and what tags I’ve put in it), or private (meaning that only I can see it).

google-delicious-shared.png

Right now, delicious2fluiddb.py only uploads the shared bookmarks to FluidDB. But how would it make information shared or private?

As well as allowing the arbitrary tagging of objects, and providing a query language, FluidDB also possesses a rich permissions structure. In fact, there are fine-grained permissions associated with every different tag name in the system (njr/google, njr/title, terrycojones/rating etc.). [3]

abstract-attribute-tag.png

To illustrate this, we can use fdb to look up the object that FluidDB holds for the tag njr/title.

$ fdb show -a "Object for the attribute njr/title" /id
Object with about="Object for the attribute njr/title":
  /id = "a6e022bf-14e9-4033-aa91-39789dc11b23"

This command asks fdb to show the /id (which is fdb shorthand for the object’s ID), for the object whose about tag is the string “Object for the attribute njr/title”. The -a means that we wish to specify the object by its about tag rather than, for example, by ID or by a more general FluidDB query. If you have fdb and a username and password, you should be able to run this query and get exactly this result.

While this isn’t the place to go into all the detail, the key thing is that while all objects in FluidDB are shared, all data in FluidDB is subject to a strong, fine-grained permissions system.

A user can control various kinds of read and write access for each of their tags, with separate permissions for tags having each name (njr/title, njr/google etc.)

Beyond Simple (Valueless) Tags

Of course, the whole point of FluidDB is not to imitate del.icio.us, but to enable more powerful things. Some of the power comes when different people tag the same object, which doesn’t have to be about a URL.

object-about-fugitive-pieces.png

This object is guaranteed to be about isbn:978-0747534969, since the about tag can never change. So if we query FluidDB looking for books that both njr and terrycojones have rated 10, we will find this and can retrieve any information on the object, subject to the permissions on the tags.

fugitive-pieces-with-ratings.png

The FluidDB query terrycojones/rating=10 and njr/rating=10 would allow this object to be retrieved, and its title could be found with the FDB command:

$ fdb show -q "terrycojones/rating=10 and njr/rating=10" /njr/title

(For more details of the FluidDB query language, see http://doc.fluidinfo.com/fluidDB/queries.html.)

Notes

[1] del.icio.us, now often delicious.com, is the original social bookmarking site. It is now owned by Yahoo.
[2] There can, however, be many objects with no about tag; and there are.
[3] Each tag name used in FluidDB actually has a corresponding object in the system; we sometimes call this the “abstract tag”, to distinguish it from real tags attached to objects in the system. The permissions system operates on these abstract tags, and permissions set on an abstract tag apply to all the tags used with that name.

24 August 2009

fdb.py 1.14

I've pushed various changes to FDB to the repository here at GitHub.
Nothing huge, but these changes:

# 2008/08/23 v1.01      Added delicious2fluiddb.pdf
#
# 2008/08/24 v1.11      Added missing README to repository
#
# 2008/08/24 v1.12      Fixed four tests so that they work even
#                       for user's *not* called njr...
#
# 2008/08/24 v1.13      Fixed bug that prevented tagging with real values.
#                       Added tests for reading various values.
#                       Made various minor corrections to 
#                       delicious2fluiddb.pdf.
#                       Split out test class for FDB internal unit tests
#                       that don't exercise/require the FluidDB API.
#                       Added command for that and documented '-v'
#                       (Also, in fact, fixed and simplified the regular
#                       expressions for floats and things, which were wrong.)
#
# 2008/08/24 v1.14      Fixed delicious.py so it sets the font on the page.

23 August 2009

On the Relationship between Delicious and FluidDB

I've drawn lots of pictures and written a few words to try to illustrate similarities and differences between Delicious (http://del.icio.us) and FluidDB as part of the process of trying to document and explain what delicious2fluiddb.py actually does.
It would be nice to post this directly on the blog, but turning it into HTML and PNGs etc. will take a while, so for now I'm just linking to the PDF, which you can get below.

FDB 1.0

fdb.py, available at GitHub, is up to 1.1.
New features include
  • a count command in the command line interface to count the number of results from a query
  • new (not thoroughly tested) raw HTTP GET and associated commands for talking to the FluidDB API at a low level
  • the ability to specify a host (with -host) or specifically to work with the sandbox (with -sandbox)
  • the ability to retrieve an object's id with /id and its about tag with /about (as special cases).
The 1.1 release also includes a PDF describing in pictures what the delicious2fluiddb.py code does. See next blog post for details.

22 August 2009

What the homepage generated from del.icio.us looks like

Homepage.png

del.icio.us integration with FluidDB through fdb.py 0.8

My fdb.py library (available at GitHub) includes the ability to upload bookmarks and tags from del.icio.us to FluidDB. It also (just because that's the old code I based it on) has the ability to create a home page consisting of all your bookmarks tagged with a particular tag or set of tags (home by default) in a dense grid.
This post is currently a verbatim dump of the new README-DELICIOUS file I wrote as documentation for how to use it. But I might reformat it later, and will almost certainly write some posts on the relationship between del.icio.us and FluidDB.
CODE FOR WORKING WITH DELICIOUS DISTRIBUTED WITH FDB
====================================================

Because FluidDB shares some characteristics with del.icio.us
(http://del.icio.us/ or http://delicious.com), some code for
working with del.icio.us is distributed with fdb.

There are four files:

  1. delicious2fluiddb.py
     This imports data from del.icio.us to FluidDB.
     It uses library code from delicious.py (below).

  2. delicious.py
     This uses the delicious API to extract bookmarks and tags from
     delicious, as an XML feed, and then to create a web page
     (normally a home page) with a fairly dense set of links to
     pages tagged with a particular tag (normally 'home') on delicious.
     It also caches the data it extracts (in files).

  3. deliconfig.py
     This configuration file controls the behaviour of delicious.py.
     It partly provides information about paths for various data,
     and partly controls formatting of the home page created by delicious.py

  4. delicious.cgi
     This is a CGI script that can be used to run delicious.py.
     The typical usage mode is:

        Add or modify some bookmarks on delicious
        Run the cgi script (usually by clicking a link on the home page).
        Return home

     This updates the home page.



IMPORTING BOOKMARKS FROM DELICIOUS TO FLUIDDB
=============================================

This should be straightforward.
First, make sure that fdb.py is working by running
the tests.   (See README file.)

Then edit deliconfig.py and at the very least set the
cache path to an xml file name in a location you
can write to and set credentials to a file containing
your username and password for delicious on separate lines.
Like this:

username
password

You probably also need to set the homepage to a writable
location.

If all you want to do is upload your bookmarks and tags
to FluidDB, you can probably ignore the rest.

(Though if you don't have any bookmarks tagged home, it
might be a good idea to set tags to a bookmark that you
do have.)

If you have already downloaded your delicious bookmarks
in xml, just set the cache variable to point to the relevant
XML file and run with -c.

Then, to download your data from delicious, run

    python delicious.py

If you have already got them in the file, you can skip
this stage.   (delicious2fluiddb.py reads from the cache.)

Then, to upload to FluidDB, type

    python delicious2fluiddb.py


WHAT IT DOES
============

It creates an object for each of your shared bookmarks with
the about tag (fluiddb/about) set to the URL of the bookmarked page.

Then, for each tag you have, it creates a FluidDB tag under
your username with the same name as the delicious tag.

It currently ignores all other fields, though I will fix that
in a later release.


KNOWN PROBLEMS
==============

Contrary to the FluidDB documentation, tags with colons
in the name (notably, tags that start for:) fail.
This is because FluidDB currently bans colons in tag names.
This is fixed in the sandbox (as I type), and should go live
soon, so I don't plan to alter this functionality.
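
A rough sketch of the import logic described in the last two sections, written as a pure "planning" step that never talks to FluidDB. The function name and the bookmark dictionary layout are hypothetical, and skipping tags that contain a colon is this sketch's own choice: the text above only says that such tags currently fail.

```python
def plan_import(bookmarks, username):
    """Yield (about, tag_path) pairs for the tags an import would create."""
    for bookmark in bookmarks:
        # Only shared bookmarks are uploaded.
        if not bookmark.get("shared", True):
            continue
        for tag in bookmark["tags"]:
            # Tags with colons (e.g. for:someone) are currently rejected
            # by FluidDB, so this sketch leaves them out.
            if ":" in tag:
                continue
            # The about tag is the URL; the tag lives under the username.
            yield (bookmark["url"], "%s/%s" % (username, tag))


bookmarks = [
    {"url": "http://www.google.com/", "shared": True,
     "tags": ["search", "google", "home", "for:terry"]},
    {"url": "http://example.com/private", "shared": False,
     "tags": ["secret"]},
]

for about, tag_path in plan_import(bookmarks, "njr"):
    print(about, tag_path)
```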


NOTE ON DELICIOUS BACKUPS
=========================

Because the author is paranoid, delicious.py never overwrites
an XML dump from delicious, but simply renames the old one with
a datestamp.   I have about 180 backups of delicious.
Obviously, you can delete them if you don't like keeping backups.
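
The rename-before-write idea can be sketched like this. It is not the code in delicious.py: the function name, the datestamp format and the numeric suffix used to avoid name clashes are all assumptions made for illustration.

```python
import os
import tempfile
import time


def write_with_backup(path, data):
    """Write data to path, first renaming any existing file with a datestamp.

    Returns the path of the backup, or None if there was nothing to back up.
    """
    backup = None
    if os.path.exists(path):
        stamp = time.strftime("%Y%m%d-%H%M%S")
        root, ext = os.path.splitext(path)
        backup = "%s-%s%s" % (root, stamp, ext)
        # Never overwrite an earlier backup made within the same second.
        n = 1
        while os.path.exists(backup):
            backup = "%s-%s-%d%s" % (root, stamp, n, ext)
            n += 1
        os.rename(path, backup)
    with open(path, "w") as f:
        f.write(data)
    return backup


# Demonstration in a throwaway directory.
demo_dir = tempfile.mkdtemp()
cache = os.path.join(demo_dir, "delicious.xml")
first_backup = write_with_backup(cache, "old")    # nothing to back up yet
second_backup = write_with_backup(cache, "new")   # "old" is kept as a backup
third_backup = write_with_backup(cache, "newer")  # "new" is kept as well
print(sorted(os.listdir(demo_dir)))
```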


CREATING A HOME PAGE
====================

When you run delicious.py, it creates a web page in the location
specified by the homepage location in deliconfig.py.
This basically consists of links for all your bookmarks tagged
with 'home' (or any space-separated list of tags you set
in the variable tags).

The body of the link will be the title from delicious unless you
put something in the notes field, in which case that will be used
instead.   This is particularly useful if the title is long.

There's some special functionality allowing two related links
to be put in one position.   This is achieved, given one link
called "foo" by having the notes field of another set to "foo +bar".

If you do this, a single position in the grid will get two links,
the first called foo, pointing to its URL, and the second called
bar, pointing to its URL.   For example, Google.com with title
Google and Google UK with notes set to "Google +UK".
This produces two adjacent links, the first of which is called Google
and the second of which is called UK, pointing at the two Google
sites.
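
The pairing rule can be sketched as a small grouping function. This is an illustration of the behaviour described above, not the code in delicious.py; the function name, the bookmark dictionary layout, the example URLs and the fallback for a "foo +bar" link with no matching "foo" are assumptions.

```python
def grid_cells(bookmarks):
    """Group bookmarks into grid cells; each cell is a list of (label, url)."""
    cells = {}    # primary label -> list of (label, url) links
    order = []    # primary labels in the order first seen
    paired = []   # (primary, extra, url) for links with notes "foo +bar"

    def add(cell_key, label, url):
        if cell_key not in cells:
            cells[cell_key] = []
            order.append(cell_key)
        cells[cell_key].append((label, url))

    for b in bookmarks:
        # The link text is the notes field if set, otherwise the title.
        label = b.get("notes") or b["title"]
        if " +" in label:
            primary, extra = label.split(" +", 1)
            paired.append((primary, extra, b["url"]))
        else:
            add(label, label, b["url"])

    for primary, extra, url in paired:
        if primary in cells:
            # Share the grid position of the link called `primary`.
            cells[primary].append((extra, url))
        else:
            # No partner found: show it alone under its full label.
            full = "%s +%s" % (primary, extra)
            add(full, full, url)

    return [cells[key] for key in order]


bookmarks = [
    {"title": "Google", "notes": "", "url": "http://www.google.com/"},
    {"title": "Google UK", "notes": "Google +UK",
     "url": "http://www.google.co.uk/"},
    {"title": "A very long news site title", "notes": "News",
     "url": "http://news.example.com/"},
]
for cell in grid_cells(bookmarks):
    print(cell)
```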


CREATING A LINK TO REFRESH YOUR HOME PAGE
=========================================

If you really like the delicious tag-based home page, you might
want to install delicious.cgi to run this from a link in your
browser.   All you really need to do for that is to stick
delicious.cgi, delicious.py and deliconfig.py into your cgi-bin
directory, suitably configured, get it running, and then bookmark
that link from delicious, tagging it with home.
If you're doing this, you need to make sure that some of the locations
in deliconfig.py are writable by your Apache (or any other web server
you might be using.)

When you run it successfully, you get output like this:

Reading entries from del.icio.us
Writing cache /Users/njr/Sites/cache/delicious.xml
Building home page /Users/njr/Sites/cache/index.html
Home page built and backed up
Completed OK.

fdb.py 0.8: Documentation (the new README file)

This post is a webified version of the new README file provided with fdb.py version 0.8.

FDB Python Library

fdb is primarily a library for providing access to the FluidDB database (http://fluidinfo.com/fluiddb) from Fluidinfo (http://fluidinfo.com/). There is lots of coverage of the library (and its evolution) at http://abouttag.blogspot.com/.

FDB Command Line Access

fdb can also be used for command-line access to FluidDB. See Using the Command Line.

Dependencies

If you're running Python 2.6, fdb.py should just run. With earlier versions of Python, you need to get access to simplejson and httplib2. You can get simplejson from http://pypi.python.org/pypi/simplejson/ and httplib2 from http://code.google.com/p/httplib2/.

Credentials

For many operations, you also need an account on FluidDB, and credentials (a username and password). You can get these from
    http://fluidinfo.com/accounts/new
The library allows you to give it your credentials in various different ways, but life is simplest if you stick them in a 2-line file (preferably with restricted read access) in the format
username
password
On Unix, the default location for this is ~/.fluidDBcredentials, and on Windows the default file is fluidDBcredentials.ini in your home folder.
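
For illustration, a two-line credentials file of this kind could be read as sketched below. This is not how fdb itself reads the file (fdb.Credentials does that); the function name is hypothetical, and only the file format and default locations come from the description above.

```python
import os
import tempfile


def read_credentials(path=None):
    """Return (username, password) from a two-line credentials file."""
    if path is None:
        # Default locations as described above.
        name = ("fluidDBcredentials.ini" if os.name == "nt"
                else ".fluidDBcredentials")
        path = os.path.join(os.path.expanduser("~"), name)
    with open(path) as f:
        lines = f.read().splitlines()
    if len(lines) < 2:
        raise ValueError("expected username and password on separate lines")
    return lines[0].strip(), lines[1].strip()


# Demonstration with a throwaway file rather than real credentials.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "credentials")
with open(demo_path, "w") as f:
    f.write("username\npassword\n")
print(read_credentials(demo_path))
```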

Tests

The library includes a set of tests. If you have valid credentials, and everything is OK, these should run successfully if you just execute the file fdb.py. For example, at the time of writing this README file (version 0.8 of the fdb), I get this:
$ python fdb.py
....................
----------------------------------------------------------------------
Ran 20 tests in 46.311s

OK

Using the Library

Four ways of exploring the library are:
  1. look at the tests (the ones in the class TestFluidDB)
  2. look at the blog (http://abouttag.blogspot.com)
  3. read the function documentation, which is . . . existent.
  4. look at and run example.py, which should print DADGAD and 10.
Here is example.py:
import fdb

db = fdb.FluidDB () # assumes credentials are in the standard place

# Read the about tag of the object about DADGAD; this should print DADGAD
(status, value) = db.get_tag_value_by_about ('DADGAD', '/fluiddb/about')
print value

# Tag the same object with your rating tag set to 10 (the assert expects a
# return value of 0), then read the value back; this should print 10
assert db.tag_object_by_about ('DADGAD', 'rating', 10) == 0
(status, value) = db.get_tag_value_by_about ('DADGAD', 'rating')
print value

Using the Command Line

Commands can be run by giving arguments to fdb.py. For a list of commands, use
    python fdb.py help
An example command is
    python fdb.py show -a DADGAD rating /fluiddb/about
Obviously, if you want to use fdb as a command from the shell, it will probably be convenient to use an alias or create a trivial shell script to run it. I use bash, with the alias
    alias fdb='python ~/python/fluiddb/fdb.py'
which allows me to type
    fdb show -a DADGAD rating /fluiddb/about
etc.

Delicious

Also distributed with fdb itself is code for accessing delicious.com (http://del.icio.us/, as was), and for migrating bookmarks and other data to FluidDB. This also includes functionality for creating web homepages from delicious based on a home tag.
A further post on using the del.icio.us uploader and other functionality will follow.

fdb.py 0.5: get becomes show

I've done a bit more to fdb.py. It's mostly tidying up and more tests (you can never have enough tests!), but there's a significant change to the command line interface: I've changed the get command to show. So now, you'd type something like:
  fdb show -a DADGAD /njr/rating
instead of
  fdb get -a DADGAD /njr/rating
That assumes you have an alias or shell script pointing at it, of course. If not, the full thing is:
  python fdb.py show -a DADGAD /njr/rating
The reason for the change is that (somewhat reluctantly) I've concluded that it's going to be useful to support raw HTTP GET, POST, PUT, DELETE and HEAD for lower-level stuff, and it seems cleaner if those commands are just of the form
  fdb get whatever
There are other minor changes; see the log. And I added a nice, permissive license ("stolen" from Nicholas Tollervey; but I don't think he'll mind).

21 August 2009

The fdb command line in fdb.py 0.3

I just pushed fdb.py 0.3 to github (http://github.com/njr0/fdb.py). It has a few extra things in the API, but the big new thing is you can use it from the command line.
I hope the following is reasonably self-explanatory. Obviously you can use an alias or a 1-line shell script to get rid of the need for the python fdb.py.
Script started on Fri Aug 21 16:27:35 2009
$ # reference objects by their about tag using -a
$ # the -v just tells it to be verbose and tell you what it's doing 
$ python fdb.py tag -av DADGAD rating=10
Tagged object with about="DADGAD" with rating = 10

$ python fdb.py tag -av DADGAD favourite
Tagged object with about="DADGAD" with favourite

$ python fdb.py get -a DADGAD rating favourite /terry/rating
Object with about=DADGAD:
  /njr/rating = 10
  /njr/favourite
  <tag /terry/rating not present>

$ python fdb.py untag -a -v DADGAD rating
Removed tag rating from object with about="DADGAD"

$ python fdb.py get -a DADGAD rating favourite /terry/rating
Object with about=DADGAD:
  <tag /njr/rating not present>
  /njr/favourite
  <tag /terry/rating not present>

$ # reference objects by their id tag using -i

$ python fdb.py tag -iv a984efb2-67d8-4b5c-86d0-267b87832fa4 rating=10
Tagged object a984efb2-67d8-4b5c-86d0-267b87832fa4 with rating = 10

$ python fdb.py tag -iv a984efb2-67d8-4b5c-86d0-267b87832fa4 favourite
Tagged object a984efb2-67d8-4b5c-86d0-267b87832fa4 with favourite

$ python fdb.py get -i a984efb2-67d8-4b5c-86d0-267b87832fa4 rating favourite /terry/rating
Object a984efb2-67d8-4b5c-86d0-267b87832fa4:
  /njr/rating = 10
  /njr/favourite
  <tag /terry/rating not present>

$ python fdb.py untag -i -v a984efb2-67d8-4b5c-86d0-267b87832fa4 rating
Removed tag rating from object a984efb2-67d8-4b5c-86d0-267b87832fa4

$ python fdb.py get -i a984efb2-67d8-4b5c-86d0-267b87832fa4 rating favourite /terry/rating
Object a984efb2-67d8-4b5c-86d0-267b87832fa4:
  <tag /njr/rating not present>
  /njr/favourite
  <tag /terry/rating not present>
$ exit

Script done on Fri Aug 21 16:31:45 2009
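
As the session shows, each argument after the object reference is either a bare tag name (favourite) or a name=value pair (rating=10). A minimal sketch of how such arguments might be split is below. parse_tag_arg is a hypothetical helper, not the function fdb.py actually uses, and treating all-digit values as integers is an assumption made for the example.

```python
def parse_tag_arg(arg):
    """Split 'rating=10' into ('rating', 10) and 'favourite' into ('favourite', None).

    Hypothetical helper for illustration; not taken from fdb.py.
    """
    if '=' not in arg:
        return (arg, None)              # valueless tag
    name, value = arg.split('=', 1)     # split on the first '=' only
    if value.isdigit():
        return (name, int(value))       # assumption: all-digit values are integers
    return (name, value)


for arg in ['rating=10', 'favourite', 'comment=a=b']:
    print(parse_tag_arg(arg))
```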

fdb.py on github

If you like your libraries revision controlled, fdb.py is now available on github at http://github.com/njr0/fdb.py. And Xavi says it works!

Tagging and untagging from python with fdb.py 0.2

I extended fdb.py a bit. Version 0.2 is available where 0.1 used to be, at http://stochasticsolutions.com/fluiddb/fdb.py, but also at http://stochasticsolutions.com/fluiddb/fdb0.2.py. Version 0.1 is still available at http://stochasticsolutions.com/fluiddb/fdb0.1.py. (I know, I know, I should make it publicly available through a VCS, and I will, but this is much easier for me right now.)
Main new features are:
  • Ability to untag objects
  • More tests
  • Tests that work for users whose FluidDB name isn't njr
  • More tag path manipulation
  • Most (all?) of the commands on tags now accept relative or absolute tag names
It still doesn't support subnamespaces or (m)any queries yet, though.
The following code illustrates the new functionality:
import fdb
import types

db = fdb.FluidDB (fdb.Credentials (filename='/Users/njr/.fluidDBcredentials'))

# Create an object with about='DADGAD' (or look up its ID if it already exists)
o = db.create_object ('DADGAD')
assert type (o) != types.IntType        # Would indicate an error code
id_DADGAD = o.id

# Add njr/rating=10 to the DADGAD object
assert db.tag_object_by_id (id_DADGAD, 'rating', 10) == 0

# Read the value back and check it's right
(status, value) = db.get_tag_value_by_id (id_DADGAD, 'rating')
assert value == 10

# Remove njr/rating from DADGAD object
assert db.untag_object_by_id (id_DADGAD, 'rating') == 0

# Again, using absolute tag path
assert db.untag_object_by_id (id_DADGAD, '/njr/rating') == 0

# Yet again, requesting error if the tag or object isn't there
error = db.untag_object_by_id (id_DADGAD, '/njr/rating', False)
assert error == fdb.STATUS.NOT_FOUND    # 404 :-)

print 'Well, that all worked!'
This produces:
zero:$ time python taguntag.py
Well, that all worked!

real 0m3.920s
user 0m0.121s
sys 0m0.068s
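
The untag calls above differ only in the optional third argument: by default, removing a tag that isn't there counts as success (0), whereas passing False asks for an error instead. A rough in-memory model of that behaviour follows. ToyStore, its method names and the missing_ok parameter name are all made up for illustration (the code above only shows a positional False), and 404 stands in for fdb.STATUS.NOT_FOUND.

```python
NOT_FOUND = 404     # stand-in for fdb.STATUS.NOT_FOUND


class ToyStore(object):
    """In-memory stand-in for FluidDB: maps object id -> {tag path: value}."""

    def __init__(self):
        self.objects = {}

    def tag(self, obj_id, path, value=None):
        self.objects.setdefault(obj_id, {})[path] = value
        return 0

    def untag(self, obj_id, path, missing_ok=True):
        tags = self.objects.get(obj_id, {})
        if path in tags:
            del tags[path]
            return 0
        # Nothing to remove: success by default, an error only if requested.
        return 0 if missing_ok else NOT_FOUND


store = ToyStore()
store.tag('DADGAD', '/njr/rating', 10)
assert store.untag('DADGAD', '/njr/rating') == 0             # removes the tag
assert store.untag('DADGAD', '/njr/rating') == 0             # already gone; still 0
assert store.untag('DADGAD', '/njr/rating', False) == NOT_FOUND
print('Well, that all worked!')
```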

Tagging, Tags and Abstract Tags

An issue anyone using the FluidDB API (in any form) will run across fairly quickly is the occasionally vexed issue of what we actually mean by a tag.
Conceptually, it's very straightforward: a tag is exactly like the tags we all know and love from del.icio.us, Flickr, GMail etc., with the twist that they can have values. So a tag has a name (like njr/rating, terry/toread etc.), and optionally has a value (which can be of almost any type) too.
Additionally, tags have some other properties, like a rather full permissions system (that controls who can see, edit and attach them to objects) and some metadata, like an optional description.
The potential confusion arises because sets of tags sharing the same name are (for good reason) often managed together, and in some cases share metadata in FluidDB.
We can see this if we look at the process of tagging an object in a little detail.
In the client library fdb.py that I published earlier, you can add a tag to an object very simply if you know its ID. For example, if I wanted to add an njr/rating tag with a value of 10 to the object with the ID a984efb2-67d8-4b5c-86d0-267b87832fa4, I could just say
import fdb
db = fdb.FluidDB (fdb.Credentials (filename='/Users/njr/.fluidDBcredentials'))
o = db.tag_object_by_id ('a984efb2-67d8-4b5c-86d0-267b87832fa4', 'rating', 10)
assert o == 0
This corresponds very closely to the conceptual model I use and encourage others to use, and works even if there have never been any njr/rating tags in the system before.
Under the covers, however, the native HTTP API sees things slightly differently. Before I can tag something with a tag such as njr/rating I first have to tell the system I want to use tags with this name. The underlying API refers to this process as tag creation, though I prefer to think of it as abstract tag creation, or tag declaration.
So the way fdb.py actually works is that when you ask it to tag an object with a given tag (and perhaps a value), it goes ahead and tries to do that for you. But if that tag hasn't previously been declared (i.e., if the abstract form of it hasn't been created), this will fail. In this case, the library backs up, creates the abstract tag, and then tries again. It does this using another fdb call:
db.create_abstract_tag ('rating', description=None, indexed=True)
As you can see, we can also give a description for an (abstract) tag, which in effect applies to all the real ("concrete") tags we create when we tag objects, and can also specify whether FluidDB should index the tag (making it searchable). So we could say:
db.create_abstract_tag ('rating',
          description="njr's rating for things, on a scale of 0-10")
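
The try, declare, retry sequence described above can be sketched with a toy in-memory store. Everything here (ToyFluidDB, its method names, the status codes returned) is invented for illustration and is not fdb.py's actual code; only the overall flow follows the description.

```python
OK, NOT_FOUND = 0, 404      # illustrative status codes


class ToyFluidDB(object):
    """Toy model: a tag must be declared (abstract tag) before it can be applied."""

    def __init__(self):
        self.abstract_tags = {}     # tag path -> description
        self.values = {}            # (object id, tag path) -> value

    def create_abstract_tag(self, path, description=None):
        self.abstract_tags[path] = description
        return OK

    def raw_tag(self, obj_id, path, value=None):
        """Low-level tagging: fails if the tag has never been declared."""
        if path not in self.abstract_tags:
            return NOT_FOUND
        self.values[(obj_id, path)] = value
        return OK

    def tag_object(self, obj_id, path, value=None):
        """High-level tagging: try, and on failure declare the tag and retry."""
        status = self.raw_tag(obj_id, path, value)
        if status == NOT_FOUND:
            self.create_abstract_tag(path)
            status = self.raw_tag(obj_id, path, value)
        return status


db = ToyFluidDB()
assert db.raw_tag('some-id', '/njr/rating', 10) == NOT_FOUND    # not yet declared
assert db.tag_object('some-id', '/njr/rating', 10) == OK        # declared on demand
assert '/njr/rating' in db.abstract_tags
```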
Similarly, I will soon implement some untag methods in fdb.py, but these shouldn't be confused with the delete_abstract_tag function that already exists. The delete_abstract_tag method doesn't simply remove a tag from an object, but actually deletes all tags with the given tag name (and the abstract tag itself) from the system.
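
To make the difference concrete, here is a toy model (plain dictionaries, with names invented for the example and nothing to do with the real implementation) in which untagging touches one object while deleting the abstract tag removes that tag from every object.

```python
# Toy data: object -> {tag path: value}, plus the set of declared (abstract) tags.
objects = {
    'DADGAD': {'/njr/rating': 10, '/njr/favourite': None},
    'EADGBE': {'/njr/rating': 7},
}
abstract_tags = set(['/njr/rating', '/njr/favourite'])


def untag(obj, path):
    """Remove one tag from one object; the abstract tag survives."""
    objects[obj].pop(path, None)


def delete_abstract_tag(path):
    """Remove the tag from every object and forget its declaration."""
    for tags in objects.values():
        tags.pop(path, None)
    abstract_tags.discard(path)


untag('DADGAD', '/njr/favourite')
assert '/njr/favourite' in abstract_tags        # still declared

delete_abstract_tag('/njr/rating')
assert all('/njr/rating' not in tags for tags in objects.values())
assert '/njr/rating' not in abstract_tags
```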
The reason FluidDB cares so much about abstract tags (or, if you prefer, sets of tags sharing the same tag name) is that this is the level at which the permissions system acts. In FluidDB, there is fairly fine-grained control, allowing the owner of an (abstract) tag to decide who is allowed to read tags with a given name, apply them to objects, and alter them.
More on that later.

20 August 2009

fdb: A simple python client library for FluidDB

I spent most of today playing with FluidDB, building on the work that I mentioned earlier from Sanghyeon Seo and Nicholas Tollervey.
The result is the fdb.py library, which you can find here.
If anyone wants to use it, feel free. I'll attach a licence some time, but it'll be BSD or similar --- something very permissive --- as long as Sanghyeon Seo and Nicholas Tollervey are happy.
There are six tests at the end that show its use reasonably clearly (I hope). The easiest way to use it is to stick your credentials in a file, possibly ~/.fluidDBcredentials as username and password on separate lines. Like this:
$ cat ~/.fluidDBcredentials
njr
myVerySecretPasswordThatAbsolutelyNobodyKnows
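
Reading a file in this format takes only a few lines. The sketch below is not lifted from fdb.py (read_credentials is a name made up for the example); it simply shows the two-line format being parsed, using a temporary file in place of the real one.

```python
import os
import tempfile


def read_credentials(filename):
    """Return (username, password) from a two-line credentials file."""
    with open(filename) as f:
        lines = [line.strip() for line in f.read().splitlines()]
    if len(lines) < 2 or not lines[0] or not lines[1]:
        raise ValueError('expected username and password on separate lines')
    return lines[0], lines[1]


# Demonstrate with a throwaway file rather than a real ~/.fluidDBcredentials
fd, path = tempfile.mkstemp()
os.write(fd, b'njr\nmyVerySecretPasswordThatAbsolutelyNobodyKnows\n')
os.close(fd)
try:
    username, password = read_credentials(path)
    print(username)
finally:
    os.remove(path)
```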
I'll post some example code using it, but that might not be till tomorrow evening.
The main thing I've used it for (other than the tests) is for pushing about 1500 bookmarks
from del.icio.us into FluidDB and it worked almost flawlessly as far as I can tell.
It hung after 678 object creations, but was fine after I interrupted and started again.

On tags, namespaces, paths

The full specification of a tag in FluidDB might be something like
http://fluidDB.fluidinfo.com/tags/njr/var/rating
The way we talk about this in FluidDB (at least, the bits we're agreed on) is as follows:
  • rating is the name of the tag;
  • njr/var is the name of a hierarchical namespace; njr is me and var is a (sub-)namespace that I've created under my username namespace njr/.
The problem, as usual, is that we may wish to refer to different bits of it. I am currently (for myself, and in my code) using the following names, though others will undoubtedly use others.
  • http://fluidDB.fluidinfo.com/tags/njr/var/rating is the tag URI;
  • /tags/njr/var/rating is the full tag path;
  • /njr/var/rating is the absolute tag path;
  • /njr/var is the absolute namespace;
  • var/rating is the relative tag path;
  • rating is the short tag name.
Why do I care? Well, partly just so that we can talk about things, and partly because I want my functions to be fairly liberal about what they accept (relative or absolute paths, filling in missing namespaces when appropriate, etc.). I'm sure there'll be confusion for a while; hopefully followed by blissful clarity.
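
The names above can be pulled out of a tag URI mechanically, and the "liberal" behaviour amounts to filling in the user's namespace when a path doesn't start with a slash. The sketch below shows one way that might look; tag_names and absolute_tag_path are illustrative names, not functions from fdb.py, and the code assumes the URI has the /tags/ prefix and lives under the given user's namespace.

```python
TAGS_PREFIX = '/tags'


def tag_names(uri, username):
    """Break a tag URI into the differently named pieces listed above.

    Illustrative only; the function and key names are not from fdb.py.
    """
    # Drop scheme and host: 'http://host/tags/njr/var/rating' -> '/tags/njr/var/rating'
    full_path = '/' + uri.split('://', 1)[1].split('/', 1)[1]
    absolute_path = full_path[len(TAGS_PREFIX):]            # '/njr/var/rating'
    namespace, short_name = absolute_path.rsplit('/', 1)    # '/njr/var', 'rating'
    relative_path = absolute_path[len('/' + username + '/'):]
    return {
        'tag URI': uri,
        'full tag path': full_path,
        'absolute tag path': absolute_path,
        'absolute namespace': namespace,
        'relative tag path': relative_path,
        'short tag name': short_name,
    }


def absolute_tag_path(path, username):
    """Accept a relative or absolute tag path and return the absolute form."""
    if path.startswith('/'):
        return path
    return '/%s/%s' % (username, path)


names = tag_names('http://fluidDB.fluidinfo.com/tags/njr/var/rating', 'njr')
for key in sorted(names):
    print('%-20s %s' % (key, names[key]))
print(absolute_tag_path('rating', 'njr'))
print(absolute_tag_path('/terry/rating', 'njr'))
```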

Getting started with FluidDB

This post describes the exact steps I took to get started with FluidDB. It describes how to

  • Get a login and password
  • Get the libraries to allow you to use it from python 2.5
  • Get the FluidDB object ID corresponding to a FluidDB username
  • Create a new object in FluidDB
  • Retrieve that object

What I had to do

  1. Get a username and password. I got my username as a consequence of following @fluiddb on twitter, and the password was sent to me this morning. If you haven’t already reserved a username this way, you can reserve one instead by signing up at http://fluidinfo.com/accounts/new.

  2. I started using the very simple but useful python library fluiddb put together by Sanghyeon Seo and augmented by Nicholas Tollervey, who gives some nice examples of its use on his blog. (I found bitbucket a bit confusing, but you can get the actual source, gzipped, from http://bitbucket.org/sanxiyn/fluidfs/get/054092b8d3ff.gz or as a zip or bz2 by changing the extension.)

  3. I found that I needed to install a couple of libraries to use with Python 2.5. These were httplib2, available from http://code.google.com/p/httplib2/, and simplejson, available from http://pypi.python.org/pypi/simplejson/. I then tried Tollervey’s examples, which all worked fine, without any credentials for FluidDB. So far so good.

  4. Then I used the following trivial code to get the ID for the object corresponding to me (njr).

    import fluiddb
    
    njrID = fluiddb.call('GET', '/objects', query='fluiddb/users/username = "njr"')
    print njrID
    

which produced

$ python getnjr.py
(200, {'ids': ['cde01b2c-68bb-4d41-b25c-3ca49dcec434']})
  5. I then wanted to try creating an object, which does require credentials. For this, I created the following trivial credentials class (credentials.py):

    class Credentials:
        def __init__ (self, username, password, id=None):
            self.username = username
            self.password = password
            self.id = id
    
    njr = Credentials ('njr', 'my-secret-password',
        'cde01b2c-68bb-4d41-b25c-3ca49dcec434')
    

    (Obviously, that isn’t the real password...)

  6. That having worked, it seemed like it was time to create an object and test that I could retrieve it, which I did with the following code (createDADGAD.py):

    import fluiddb, credentials
    import simplejson as json
    
    njr=credentials.njr
    fluiddb.login (njr.username, njr.password)
    about_DADGAD = json.dumps ({'about' : 'DADGAD'})
    (status, o) = fluiddb.call ('POST', '/objects', about_DADGAD)
    print status, o
    print fluiddb.call ('GET', '/objects/' + o['id'], '{"showAbout": true}')
    

    which produced

    $ python createDADGAD.py
    201 {'id': 'a984efb2-67d8-4b5c-86d0-267b87832fa4',
    'URI': 'http://fluidDB.fluidinfo.com/objects/a984efb2-67d8-4b5c-86d0-267b87832fa4'}
    (200, {'about': 'DADGAD', 'tagPaths': ['fluiddb/about']})
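
For anyone curious what the GET and POST calls above amount to at the HTTP level, here is a small sketch of how the request URL and body might be assembled. This is a guess at the shape, not the fluiddb library's code: build_url is a made-up helper, and it assumes that keyword arguments such as query end up URL-encoded in the query string.

```python
import json

try:
    from urllib.parse import urlencode      # Python 3
except ImportError:
    from urllib import urlencode            # Python 2

BASE = 'http://fluidDB.fluidinfo.com'       # host as it appears in the URIs above


def build_url(path, **params):
    """Join the base URL, a path such as '/objects', and URL-encoded parameters."""
    url = BASE + path
    if params:
        url += '?' + urlencode(params)
    return url


# The GET: the query travels in the URL (assumed), so it must be URL-encoded
get_url = build_url('/objects', query='fluiddb/users/username = "njr"')
print(get_url)

# The POST: the about value travels as a JSON body, as in createDADGAD.py
post_url = build_url('/objects')
post_body = json.dumps({'about': 'DADGAD'})
print(post_url + ' ' + post_body)
```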

About Tag

This blog is about Fluidinfo, an online database based on tags from Fluidinfo plc.
For information on the tools and libraries discussed in the blog see the Tools and Links page.
My name is Nick Radcliffe, and I'm an advisor to and investor in Fluidinfo. I've known Terry Jones, whose brainchild FluidDB is, for over 20 years, from back in the days when he and I used to work on genetic algorithms and would meet at conferences. I mostly work on customer analytics, currently through my company Stochastic Solutions. I have developed a bit of a bad habit of creating new, non-standard databases, but I think this one's going to be great, and change the world.

Labels