06 October 2008

Web2.0 database theory




at the heart of the classic web2.0 site
is a database of
users U
times
content-items C
U*C
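
(a minimal sketch in python
of how U*C might be stored,
assuming a sparse table keyed by (user, item);
the names ratings, rate and rating_of
are illustrative, not any site's real api)

```python
# sparse U*C table: only the (user, item) cells actually rated are stored
# (hypothetical layout, for illustration only)
ratings = {}

def rate(user, item, score):
    """record user's rating of item, overwriting any earlier rating"""
    ratings[(user, item)] = score

def rating_of(user, item):
    """return the stored rating, or None if this cell of U*C is empty"""
    return ratings.get((user, item))

# made-up sample data: 1 = faved, 0 = seen but not faved
rate("ann", "pic1", 1)
rate("bob", "pic1", 1)
rate("ann", "pic2", 0)
```

most cells stay empty
so a dict costs far less
than a full U-by-C grid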

each user u
can in principle rate
each content-item c

and these ratings
guide
what sequence of content-items s(C)
is offered to any user
(aka recommendations)
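
(one simple way ratings could guide s(C)
is to rank items by how many users faved them;
this is an assumed rule for illustration,
not how any of the sites below actually rank,
and the fave log here is made up)

```python
from collections import Counter

# hypothetical fave log: (user, item) pairs
faves = [
    ("ann", "c1"), ("bob", "c1"),
    ("cat", "c2"),
    ("ann", "c3"), ("bob", "c3"), ("cat", "c3"),
]

def stream(faves):
    """order items by number of distinct users who faved them, most-faved first"""
    # set() drops duplicate (user, item) pairs so each user counts once per item
    counts = Counter(item for _, item in set(faves))
    # sort by descending count, then by item id so ties come out deterministic
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked]

single_stream = stream(faves)
```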

eg flickr
tracks faves via pink stars
and creates from these
a single stream (explore)
offered to all users
but also allows subscriptions
to others' newly-added pix or faves
(not trimming redundancies)
and to tags and thematic groups
(as designated by the photographer)

or eg twitter
tracks faves via gold stars
but leaves it to Favrd
to collate these into a stream
also allowing subscriptions
to others' newly-added tweets or faves
(ignoring redundancies)
and to topical hashtags

or eg youtube
supports faving and rating
and makes recommendations
(though i doubt many people
bother with these?)




if the fundamental web2.0 goal is
helping users find the best content-items
the most important omission
from these sites is
redundancy-trimming:

once you've seen/judged an item
you won't need to see it again
so it can be silently trimmed
from offered streams

(one exception being
streams of favorites
from users you're interested in
so you want to be told
if they-liked-that-too)
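
(the trimming rule and its exception
can be sketched in a few lines of python;
trim, seen and is_friend_faves
are hypothetical names,
assuming each user's seen-items are kept as a set)

```python
def trim(stream, seen, is_friend_faves=False):
    """drop items the user has already seen from an offered stream

    streams of friends' favorites are passed through untrimmed,
    so the user still learns that a friend liked a seen item too
    """
    if is_friend_faves:
        return list(stream)
    return [c for c in stream if c not in seen]

# made-up sample data
offered = ["c1", "c2", "c3", "c4"]
seen = {"c1", "c3"}

fresh = trim(offered, seen)                             # seen items removed
friends = trim(offered, seen, is_friend_faves=True)     # nothing removed
```

order is preserved
so the ranking of the remaining items
is untouched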




another useful database is C*C similarity

people who like c1
usually like c2 too
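
(a minimal sketch of a C*C table,
assuming similarity is measured
as jaccard overlap between the sets of users
who liked each item;
jaccard is my choice for illustration
and the likes data is made up)

```python
from itertools import combinations

# hypothetical data: item -> set of users who liked it
likes = {
    "c1": {"ann", "bob", "cat"},
    "c2": {"ann", "bob"},
    "c3": {"dan"},
}

def similarity(likes):
    """return {(a, b): jaccard overlap of the users who liked a and b}"""
    sim = {}
    for a, b in combinations(sorted(likes), 2):
        both = len(likes[a] & likes[b])      # users who liked both items
        either = len(likes[a] | likes[b])    # users who liked at least one
        sim[(a, b)] = both / either if either else 0.0
    return sim

sim = similarity(likes)
```

c1 and c2 share two of three fans
so they score 2/3
while c3 shares none and scores 0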

possibly decomposable into
thematic dimensions T:
easy-difficult
natural-artificial
common-rare
emotional-dry
sexual-chaste
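
(if such a decomposition holds,
each item could be a point in T-space
and similarity a comparison of points;
a sketch assuming scores from -1 to +1
on each dimension, compared by cosine;
the scores are invented
and cosine is just one plausible measure)

```python
import math

DIMENSIONS = (
    "easy-difficult", "natural-artificial", "common-rare",
    "emotional-dry", "sexual-chaste",
)

# hypothetical scores: -1 = first pole of the pair, +1 = second pole
items = {
    "c1": (-0.8, -0.5, 0.0, -0.6, 0.9),
    "c2": (-0.7, -0.4, 0.1, -0.5, 0.8),
    "c3": (0.9, 0.6, 0.7, 0.8, 0.9),
}

def cosine(a, b):
    """cosine of the angle between two score vectors (0.0 if either is all zeros)"""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

near = cosine(items["c1"], items["c2"])   # similar profiles
far = cosine(items["c1"], items["c3"])    # mostly opposite profiles
```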





