concept stores

Research how third-party SPARQL endpoints could be added as new “concept stores” (besides Wikipedia and Wikidata).

[ Note: This research follows from a Twitter discussion with Kingsley Uyi Idehen ]

Use cases:

  • An organization with a SPARQL endpoint that wants to use the Conzept UI to give people an integrated/customizable/augmented view of its knowledge base.
  • Allow for adding extra “concept stores” to a Conzept system, so other (more niche) concepts can be presented.
  • (probably more difficult) Allow a user to explore an unknown SPARQL endpoint, render the results in Conzept, and run full-text searches within that knowledge base (the latter may need more than a SPARQL endpoint).

requirements

  • Create a list of store-definitions containing:
    • Name
    • Description
    • Website
    • active (boolean)
    • SPARQL-endpoint URL
    • OpenSearch URL (for JSON, and XML?). See also: OpenSearchlight.
  • Multi-query support for the autocomplete widget:
    • result merging
  • Multi-query support for the search-query results:
    • result merging
    • result ranking
    • result paging
  • Integrate a SPARQL-explorer sub-system, like Sparklis
    • sparql-endpoint testing: query all endpoints (note: some endpoints are dead) → then open “sparql explorer 2” in the main-section
    • Handle HTTP GET/POST CORS issues with the SPARQL endpoints (it is unclear whether the current Conzept proxy service supports POST)
    • build steps here
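
As an illustration of the store-definition list and the autocomplete result merging above, here is a minimal sketch in Python. It is not Conzept code: the names StoreDefinition, active_stores and merge_suggestions are hypothetical, the field names are simply derived from the list above, and round-robin interleaving with case-insensitive de-duplication is only one possible merging policy.

```python
from dataclasses import dataclass
from itertools import zip_longest
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StoreDefinition:
    """One entry in the list of store-definitions (hypothetical field names)."""
    name: str
    description: str
    website: str
    active: bool
    sparql_endpoint_url: str
    opensearch_url: Optional[str] = None  # assumption: OpenSearch is optional


def active_stores(stores: List[StoreDefinition]) -> List[StoreDefinition]:
    """Return only the stores flagged as active, in their original order."""
    return [s for s in stores if s.active]


def merge_suggestions(per_store: Dict[str, List[str]], limit: int = 10) -> List[str]:
    """Interleave autocomplete suggestions round-robin across stores.

    per_store maps a store name to that store's ranked list of labels.
    Labels that differ only by case count as duplicates; the first
    occurrence wins. At most `limit` labels are returned.
    """
    merged: List[str] = []
    seen = set()
    # zip_longest yields the 1st label of every store, then the 2nd, and so on,
    # padding exhausted stores with None.
    for row in zip_longest(*per_store.values()):
        for label in row:
            if label is None:
                continue
            key = label.casefold()
            if key not in seen:
                seen.add(key)
                merged.append(label)
    return merged[:limit]
```

Interleaving gives every store a chance to appear near the top of the widget without needing comparable scores across stores, which a generic SPARQL endpoint may not provide.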

adapt conzept internals

  • Adapt internal data structures to allow for handling non-Wikidata entities.
  • Adapt field-definitions to allow for non-Wikidata property resolving
  • How to 'understand' the SPARQL-endpoint class graph and class properties?
    • Adapt topic classification code to allow for non-Wikidata classification rules.
  • Allow the Conzept type-system to handle entity IDs that are not Wikidata Qids (e.g. EntityDocument URLs)
  • How to page the results correctly, when combining them from multiple data sources?
    • get total results per data source
    • combine page-N results per data source
    • result ordering in pages
    • understand when to stop querying a data source
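
One way to approach the paging questions above is a lazy k-way merge over the data sources. The sketch below is only an illustration under strong assumptions: every source returns hits best-first, the scores are comparable across sources (which is unlikely to hold without normalization), and a batch shorter than the requested size means the source is exhausted. All names (source_iter, merged_page, list_source) are hypothetical.

```python
import heapq
from itertools import islice
from typing import Callable, Dict, Iterator, List, Tuple

Hit = Tuple[float, str]                  # (score, entity ID); higher score ranks first
Fetch = Callable[[int, int], List[Hit]]  # (offset, limit) -> hits, best first


def source_iter(fetch: Fetch, batch: int) -> Iterator[Hit]:
    """Walk one data source lazily, one batch at a time."""
    offset = 0
    while True:
        hits = fetch(offset, batch)
        yield from hits
        if len(hits) < batch:  # short batch: stop querying this source
            return
        offset += batch


def merged_page(fetchers: Dict[str, Fetch], page: int, page_size: int,
                batch: int = 20) -> List[Hit]:
    """Return page `page` (0-based) of the globally ordered result list."""
    streams = [source_iter(fetch, batch) for fetch in fetchers.values()]
    # heapq.merge sorts ascending, so negate the score to get best-first order.
    merged = heapq.merge(*streams, key=lambda hit: -hit[0])
    start = page * page_size
    return list(islice(merged, start, start + page_size))


def list_source(hits: List[Hit]) -> Fetch:
    """Stand-in for a real query with LIMIT/OFFSET, backed by a list."""
    def fetch(offset: int, limit: int) -> List[Hit]:
        return hits[offset:offset + limit]
    return fetch


if __name__ == "__main__":
    sources = {
        "wikidata": list_source([(0.9, "wd:Q1"), (0.5, "wd:Q2"), (0.1, "wd:Q3")]),
        "example": list_source([(0.8, "ex:a"), (0.4, "ex:b")]),
    }
    print(merged_page(sources, page=1, page_size=2, batch=2))
    # prints [(0.5, 'wd:Q2'), (0.4, 'ex:b')]
```

Because the merge restarts from the first hit for every page, deep pages become progressively more expensive; caching per-source cursors between page requests would be a natural refinement. Total counts per data source are not needed for correctness here, only for showing a result total in the UI.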

other tools

  • SparqlExplorer (Not used on Conzept. I have not yet been able to build it from source, due to npm package version dependency issues.)
  • SQID (Not used on Conzept. This is a Wikidata-specific search/browse tool, based on the Wikidata SPARQL and REST API.)