who knows man i'm just getting started myself
been so busy trying to get the guts to work i haven't really thought about the ux of this
i mean let's say you download and install this thing, then what
(a docker image is forthcoming btw)
anyway it's gonna be an empty thing so not really very interesting to most people for a while
i already said in the video what the first things are:
- my website
- my client extranets
- the nature of software
- remaking the ibis tool
this thing has a bunch of cruft on the file system which will stay there for the time being
excited to finally replace the fake scaled images (rewrite rules) that have been there for like 15 years with real ones
also excited to do a content inventory for once, basically bootstrap some content inventory ui
also scan my whole site for terminology, proper nouns etc, finally flesh out the audience modeling stuff
go back and look at the document stats and try to do something useful with it
oh how about do some friggin stretchtext
also noodle with that video stuff
this is a no-brainer; get everything into content-addressable storage
this one is a lot newer and can live completely in content-addressable storage
i am very interested in making an annotation system
plus also same indexes as my site: concepts, books, people/orgs
photo index/credits a big one
there is also a bunch of stuff with the original books that i'd like to do
this thing lives almost completely in the graph
this one will need the proxy handler for sure for external links
here is where most of the work on really dense graph-forward stuff is gonna be
definitely hook in pm ontology
definitely need a data entry interface for people/orgs too
say static site, like jekyll; if it's wordpress or whatever i have no idea
i mean you could probably just load all the files into content-addressable storage, that'd be a good start
here's the thing i've been thinking though: stuff like git literally is a content-addressable store with an index on top, it probably wouldn't be hard to make an adapter that just hoovers up git repositories and sticks their metadata in the graph such that you could perfectly recreate a git repository
only wrinkle is
Store::Digest
isn't smart enough to store metadata about functional relationships between objects which is what you would need to represent stuff without it going nuts with redundant datalike if you have a general-purpose content-addressable store, like just a huge gaping maw where you chuck whatever blobs of data, especially if you're using it for cache too, it's gonna get real big real fast
git in particular runs everything through
DEFLATE
no matter what it is and then stores that, but the hashes in the file names correspond to the uncompressed datagit also diffs the current version of every file against the last version, keeps the current version whole to be fast, and then just stores the diff for all previous versions, which it reconstitutes on the fly (or at least it did, i dunno what it does now)
thing about
Store::Digest
(and really any other content-addressable store) is it knows what it has in it, but it doesn't know why; that's what the index layer is forso if you have two things in the store, a big object and a small object, and can say definitively that the small object is the same as the big object with a particular invertible function applied to it (
b = gzip(a) <=> a = gunzip(b)
), then you can delete the big object and just recompute it when you need tonow that gets trickier when a third object is used as input:
diff(a, b) = c <=> patch(b, c) = a
because you will have to make goddamn certain you never throw that part outeasy enough to manage in git because everything in a git repository belongs to it. not the case for a general-purpose content-addressable store.
worst thing for a content-addressable store is you have something in there and no idea why
i.e., your graph has no record of the object at all
can't delete it; something else might be using it
anyway, strategies for de-chesterton-fencing a content-addressable store probably a decent idea
inclined to partition it so stuff like cache doesn't get mixed in with non-cache; only thing about that is there will be collisions so sometimes you'll be storing the same object twice, although that's probably not nearly as bad as an ocean of runaway blobjects with no provenance
i mean you'll be to scan them or whatever and make determinations on most of them but that is Work™ that somebody has to do
another thing is you could have a definitely-cache flag when you insert an object: if the identical object is ever subsequently inserted with that flag off, then the flag stays off and never gets flipped back on no matter how many other times the object is reinserted with the flag on. hey that's actually kind of a not bad idea
could even have some automatic capacity management and LRU policy or whatever
aaanyway that was my digression on content-addressable stores, back to the absorbing a website business
thing is, Intertwingler
doesn't really have much to offer you if all you have is a regular website, i mean besides the whole url stuff and i guess the transforms on pages and images etc are nothing to sneer at
that and i guess the potential to make things get real weird
like what do you make when you don't have to think in terms of pages anymore
very PKM-ey but also kinda not
makes me think about investing in some kind of sparse-to-dense onboarding process
that might require me finishing Loupe though
anyway i'll pick this up later, ta