Skip to content

Latest commit

 

History

History
153 lines (152 loc) · 9.66 KB

behaviour.org

File metadata and controls

153 lines (152 loc) · 9.66 KB

generic engine behaviour

  • [ ] <<P001>> engine initializes with configuration
  • [ ] <<I001>> we don’t want to run one of these things for each site, especially if they’re sharing data. more redundant config plus moving parts to fail. if anything it should be optional to run multiple daemons but not necessary.
    • [ ] <<P002>> engine must be configurable with arbitrarily many authorities (host:port pairs)
      • [ ] <<P003>> authorities dispatched by Host: request header
      • [ ] <<P004>> we should be able to alias authorities
  • [ ] <<I002>> we need a way for the static site generator to know in advance the files it has to write out.
    • [ ] <<P005>> engine responds with manifest of available URIs
      • [ ] <<P006>> manifest can be narrowed per site
      • [ ] <<P007>> manifest should be since whenever
  • [ ] <<I003>> we want the engine to have a standard interface.
    • [ ] <<P008>> engine main operation receives HTTP request object
      • [ ] <<P009>> engine transforms request
        • [ ] <<P010>> rewrite headers
          • e.g. rewrite accept headers based on user-agent header
          • e.g. rewrite accept headers based on query parameter
        • [ ] <<P011>> rewrite request-URI (path + query parameters + path parameters)
          • note: path parameters are UI/designators for response transforms
        • [ ] <<P012>> rewrite request method
        • [ ] <<P013>> rewrite request body
        • [ ] <<P014>> manipulate request transform stack
          • note: static configuration of response transforms will need to manipulate the request transform stack
            • e.g. markdown to (x)html transform: presence in the stack should insert a request transform that adds text/markdown to the request’s accept header
        • [ ] <<P015>> manipulate source polling sequence
        • [ ] <<P016>> manipulate response transform stack
      • [ ] <<P017>> engine transforms response
  • [ ] <<P018>> engine main operation returns HTTP response object
  • [ ] <<P019>> transforms must have access to graph
  • [ ] <<P020>> transforms must have access to subrequests
    • (eg GETing linked resources, POSTing to content-addressable store…)

source modules

  • [ ] <<P020>> a source module should be serviceable as a stand-alone Web app.
  • [ ] <<I004>> how are we going to negotiate which source handles a request?
    • [ ] <<P021>> have the resolver append alternate URLs to the request object.
      • [ ] <<I005>> this entails subclassing the Rack request objects.
    • [ ] <<P022>> have the source modules register one or more URI schemes.

filesystem source

  • mainly intended to be for transitional (eg to straight CAS + graph)
  • want it to JFW with existing content
  • [ ] <<P023>> handle file: URIs
    • [ ] <<A000>> no don’t do this actually, we’ll treat content handlers as microservices that handle their own URI resolution (with some constraints)
    • [ ] <<I006>> what about content negotiation?
      • [ ] <<P024>> we can just say that the filesystem has additional properties that cause it to respond to Accept-* headers.
  • [ ] <<P000>> behave like ~CatalystX::Action::Negotiate~

content-addressable store

  • P022
    • [ ] <<P025>> handle ni: URIs
  • [ ] <<I007>> Store::Digest implements a /.well-known/ni/ root and index pages that could/should be exposed

reverse proxy

  • [ ] <<I008>> it is not safe to proxy just any old URL.
    • [ ] <<P026>> institute a default-deny policy for reverse proxying.
      • [ ] <<P027>> require explicit domains
        • [ ] <<I009>> do we need more granular than domains?
          • [ ] <<P028>> probably not. domains should be good enough.
        • [ ] <<P029>> allow any subdomain of a specified domain; if you need to block a sub-subdomain then add a block; make block take precedence over allow.

fully graph-sourced (ie transparent; no opaque representation)

  • [ ] <<I010>> how do we get the request to the right handler?
    • [ ] <<P030>> route handlers by subject URI
      • [ ] <<P031>> subject URI should take precedence
    • [ ] <<P032>> route handlers by rdf:type
      • [ ] <<P033>> resources with type assertions that are “closer” topologically (via rdfs:subClassOf or owl:equivalentClass) to the handler’s configured types get a higher score
    • [ ] <<P034>> route handlers by Accept-* headers
      • [ ] <<I011>> really only Accept and Accept-Language because Accept-Charset and Accept-Encoding are moot (via transforms)
        • note: what about those other dimensions recently added (Prefer etc)?
        • [ ] <<P035>> make request transforms strip off Accept-Charset and Accept-Encoding

indexes (skos concept schemes/collections, sioc containers, etc)

  • [ ] <<P036>> paginate (?)

document stats

  • [ ] <<P037>> respond to qb:DataSet, qb:Slice, and qb:ObservationGroup.

generic

  • [ ] <<P038>> make this thing powered by Loupe

other generated/non-(x)html

feeds
  • [ ] <<P039>> respond to dcat:Catalog and others
  • [ ] <<P040>> compute feed for audience
google site map
  • [ ] <<I012>> google prefers big sites split up their sitemap
    • [ ] <<P041>> split up the sitemap according to some manageable scheme and link to it from the root
  • [ ] <<I013>> let’s not pollute the root shall we
    • [ ] <<P042>> place sitemap at /.well-known/sitemap.xml with a permanent redirect
robots.txt
  • one might also imagine humans.txt, ads.txt, credits.txt and so on
  • [ ] <<P043>> make whatever happens here actually use the graph data so it stays up to date
json-ld?
  • more generally a ;meta pseudo-transform that could be polymorphic, rdfa/turtle/json-ld/whatever
    • would it really be a pseudo-transform though?

URI resolver

  • [ ] <<I014>>

transforms in general

  • [ ] <<P044>> transforms should be microservices with their own URLs that you can POST to directly.
    • [ ] <<A001>> this facilitates intelligent heterogeneity by making it possible for transforms to be stand-alone microservices
  • [ ] <<I015>> how do we ensure transforms get executed in the right order?
    • [ ] <<P045>> create execution phases like apache

request transforms

  • [ ] <<I016>> Certain request transforms will elicit erroneous responses (ie they will return something other than what the client asked for) if they do not have a concomitant response transform.
    • [ ] <<P046>> make it so request transforms can conditionally push a response transform onto the stack.

response transforms

  • [ ] <<P047>> map response transforms to content types.
  • [ ] <<I017>> certain response transforms will not make sense to run without a concomitant request transform having been run first.
    • [ ] <<P048>> have a way to pair request transforms to response transforms.
    • [ ] <<P049>> have a given response transform install its paired request transform?

static site generator behaviour

  • [ ] <<P050>> generator initializes with configuration

generator main function

  • [ ] <<I018>> we need a way for the static site generator to know in advance the files it has to write out.
    • [ ] <<P051>> generator receives manifest from engine
    • [ ] <<P052>> generator writes resources to disk
      • [ ] <<P053>> generator examines target mtimes
        • [ ] <<P054>> generator only overwrites files changed since last write
    • [ ] <<P055>> generator writes rewrite maps
    • [ ] <<P056>> generator writes site map

“live” engine adapter behaviour

  • [ ] <<P057>> adapter initializes with configuration
  • [ ] <<P058>> adapter spawns daemon
    • [ ] <<P059>> daemon forks/threads as necessary (tunable in config)
    • [ ] <<P060>> daemon listens on a socket

async daemon behaviour

  • [ ] <<P061>> async daemon initializes with configuration
  • [ ] <<P062>> async daemon runs plain command queue
    • [ ] <<P063>> queue has persistent state/resumes when interrupted
    • [ ] <<I019>> AMQP node?
  • [ ] <<P064>> async daemon behaves like at(1) (scheduled one-off commands)
  • [ ] <<P065>> async daemon behaves like cron(1) (scheduled repeating commands)

pluggable operations

  • [ ] <<P066>> external link crawler
  • [ ] <<P067>> RSS/Atom feed poller
    • [ ] <<I020>> PSHB event handler via webhook?
    • [ ] <<I021>> polling cues from statistics/feed payload? (yeah right)
  • [ ] <<P068>> content-addressable store bulk scanner/compressor

CLI behaviour

  • [ ] <<P069>> spawn daemon from CLI
  • [ ] <<P070>> run static site generator from CLI

interactive shell

  • [ ] <<I022>> tooling for RDF sucks in general
    • [ ] <<P071>> query and manipulate graph
      • [ ] <<P072>> shell interprets basic graph manipulation commands (as Turtle/SPARQL syntax)
        • [ ] <<P073>> autocomplete symbols
          • [ ] <<P074>> autocomplete all syntax
        • [ ] <<P075>> set prefix mappings
      • [ ] <<P076>> shell interprets SPARQL commands
        • [ ] <<P077>> pipe sparql output to targets
          • [ ] <<P078>> provide alternative syntaxes
  • [ ] <<P079>> load graph from file
    • [ ] <<P080>> auto-detect syntax
    • [ ] <<P081>> set default graph context (?)
  • [ ] <<P082>> dump graph to file
    • [ ] <<P083>> Turtle and others
  • [ ] <<P084>> find and tag jargon
    • [ ] <<P085>> must attempt to resolve to existing SKOS concepts or provide UI to create new ones
    • [ ] <<P086>> must write back to source