Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts – SpazioDati dataGRAPH Wikisearch API

My last post was the fifth in a series of “spotlight posts” I am using to illuminate practical examples of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts. All of these examples support my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™.

My sixth post in this series is focused on the SpazioDati dataGRAPH Wikisearch API:

“Wikisearch is a semantic search API that helps you find the specific Wikipedia page you’re looking for. It’s designed to work even if you don’t remember its exact title, or have only a vague remembrance that it relates to some specific topic.

Wikisearch can understand semantic relationships between different things and concepts. For instance, it understands that Westminster church and Westminster abbey refer to the same thing, even though they are spelled differently.”1

As an example of these capabilities, the Wikisearch API can be used to connect movies with books: say you know about the Game of Thrones TV series but want to find the original book. Searching Wikisearch for ‘screenplay game of thrones‘ will lead you to the Wikipedia page titled ‘a song of ice and fire‘, even if you had no clue about the name of the book.

Wikisearch is especially useful in auto-complete scenarios, where the user can typically examine only the first few search results before making a decision. By filtering out things like disambiguation pages and providing content that is semantically relevant to the search at hand, it can significantly improve the overall user experience.
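A minimal sketch of how a client might call such a search endpoint over HTTP is shown below. The endpoint path and parameter names (/search, text, limit) are illustrative assumptions on my part, not the documented Wikisearch API; consult the reference links that follow for the real interface.

```python
import requests

# Hypothetical endpoint and parameter names; check the official
# Wikisearch API reference for the real ones.
WIKISEARCH_URL = "http://wikisearch.dandelion.eu/search"  # illustrative only

def wikisearch(query, limit=5):
    """Return the top Wikipedia page candidates for a fuzzy query."""
    response = requests.get(
        WIKISEARCH_URL,
        params={"text": query, "limit": limit},  # assumed parameter names
    )
    response.raise_for_status()
    return response.json()

# e.g. wikisearch("screenplay game of thrones") would be expected to
# surface the 'A Song of Ice and Fire' Wikipedia page near the top.
```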

To learn how to use the Wikisearch API:

  • For Wikisearch API “getting started” information, click here.
  • For generic information on the API, click here.
  • For specific Wikisearch API reference information, click here.
SpazioDati Wikisearch API Demo Site

When focused on open source and open standards, the Wikisearch API does implement a number of the same components of my Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE), including:

My next post will discuss a seventh practical example of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts.

=david.l.woolfenden

1. http://wikisearch.dandelion.eu/#/

Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts – Metreeca Tools

My last post was the fourth in a series of “spotlight posts” I am using to illuminate practical examples of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts. All of these examples support my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™.

My fifth post in this series is focused on Metreeca Tools:

“Metreeca provides your organization with an agile roadmap for kick-starting and leveraging semantic and linked data projects, applying our know-how to resolve current practical problems in a way that will pay back now and in the future.

Metreeca assists data owners and users in developing organization-wide or application-specific knowledge bases that allow data to be shared and reused across application, department and enterprise boundaries.”1

To provide these capabilities, Metreeca’s current toolset includes:

  • Metreeca Graph Rover

Metreeca Graph Rover is a self‑service search and analysis platform natively designed for W3C-compliant semantic knowledge bases. It enables non-technical users to visually interact with complex data graphs, shielding them from RDF and SPARQL technicalities.

Graph Rover is a user-friendly search and navigation tool that provides an interactive canvas for exploring complex data graphs and for summarizing relevant features as tabular datasets, supporting:

– Faceted semantic search, linked data navigation, and set‑based pivoting
– Dynamic user interface, automatically adapting to endpoint contents
– Compatibility with any CORS-enabled SPARQL 1.1 endpoint

UPDATE: Version 0.37 of the Metreeca Graph Rover public beta is out, with the first working examples on DBpedia! Graph Rover is a self-service search and analysis platform for SPARQL graph databases that enables business users with no technical skills to perform advanced searches and analyses on complex data graphs.
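To make the SPARQL technicalities that Graph Rover shields its users from concrete, here is a minimal sketch of a raw query against the public, CORS-enabled DBpedia SPARQL 1.1 endpoint, using nothing but the standard SPARQL Protocol (an HTTP GET carrying a query parameter):

```python
import requests

# The standard SPARQL 1.1 Protocol: an HTTP GET with a 'query'
# parameter, requesting JSON results.
ENDPOINT = "https://dbpedia.org/sparql"

QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?abstract WHERE {
  <http://dbpedia.org/resource/Vienna> dbo:abstract ?abstract .
  FILTER (lang(?abstract) = "en")
}
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
)
response.raise_for_status()
for binding in response.json()["results"]["bindings"]:
    print(binding["abstract"]["value"][:120], "...")
```

Tools like Graph Rover generate and submit queries of this kind behind a visual interface, so end users never see them.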

  • Metreeca Path Finder

Metreeca Path Finder is a causal issue analysis and lightweight dashboarding tool based on the Current Reality Tree (CRT) Thinking Process from Eliyahu M. Goldratt’s Theory of Constraints (TOC). It enables your team to quickly identify core problems and to contextually link them to a coherent and intuitive set of operational key performance indicators (KPIs).

Path Finder includes a user-friendly modelling tool that provides an interactive whiteboard for building real-world causal models and enriching them with a coherent set of operational KPIs, offering:

– Streamlined user interface with multiple task-focused views
– Automatic real-time highlighting of critical model features
– Contextual specification of operational metrics and KPIs

A lightweight dashboarding tool presents KPIs in the context of causal models, enabling them to be related to each other through causal connections between the underlying entities along with:

– Agile and secure integration with local data sources
– Contextual monitoring and causal analysis of KPI deviations
– Tight integration with self-service semantic data analysis tools

Metreeca Path Finder causal model review

Metreeca states that business data is a unique asset that lives and grows within and outside your organization, and that achieving your goals depends on sharing, integrating, and analyzing available data. Semantic and linked data technologies were born to provide a common, web-scale framework that makes people and programs more effective at these tasks.

Their tools focus on client needs by developing a shared business vocabulary that: defines how to exchange and interpret data in a coherent way; enables agile data integration from distributed sources; and, enforces data consistency. This “integrated knowledge base” may power a wide range of value-added applications and services, even beyond its initial planned scope, taking advantage of standardized data exchange protocols and formats, such as:

  • Semantic search, data exploration, and analysis; integrated performance management; and, integrated risk and compliance management
  • On-the-fly mashup with external, third-party, statistical, and reference data for marketing and strategic analysis
  • Publishing of semantic-enriched content and commercial offers
  • High visibility to search engines
  • Business-to-business (B2B) integration with suppliers, customers, and other business partners
  • Machine-to-machine (M2M) integration and support for IoT applications

– A very informative 4-minute introduction to semantic knowledge management by Metreeca is available here.

– A free-to-use Solo Edition of the Metreeca Tools is available here. This edition supports the creation of real-world TOC models and KPI systems.

When focused on open source and open standards, the Metreeca tools do implement a number of the same components of my Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE), including:

  • W3C-compliant semantic knowledge bases
  • CORS-enabled SPARQL 1.1 endpoints
  • RDF
  • SPARQL

My next post will discuss a sixth practical example of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts.

=david.l.woolfenden

1. http://www.metreeca.it/solutions/knowledge/

Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts – Redlink

My last post was the third in a series of “spotlight posts” I am using to illuminate practical examples of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts. All of these examples support my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™.

My fourth post in this series is focused on Redlink, which:

“provides APIs to access the services of the (Redlink) platform. SDKs are available for major programming languages. Plugins are available for leading content management platforms such as Drupal, Alfresco, and WordPress, as well as for search engines like Apache Solr.”1

  • A short screencast for non-experts to set up their first Redlink App is available here.

Redlink’s tagline is: “We help you Make Sense of your Data.” They understand that organizations and users are creating an ever-increasing volume of content that is unstructured and lacks meaningful meta-data; manual tagging of this content with meta-data is impossible given the velocity and volume at which content is being generated. This limits the content’s findability, information discovery, and knowledge management and, hence, decreases the value of content for business processes.

Redlink can help enterprises make sense of their data by semantically enriching, linking, and searching the vast amounts of unstructured data. This helps businesses to become more effective by managing actionable knowledge in the form of linked data instead of plain documents, content, or databases. They do this by exposing APIs in three different service areas of the platform:

1.  Content Analysis – Redlink Content Analysis offers fact extraction, topic classification, content categorisation and fact linking from textual and media documents in different languages.

  • Content Analysis API documentation is available here.

2.  Linked Data Publishing – Redlink Linked Data Publishing offers data management, data publication, and data integration for enterprise data using open standards and technologies (RDF and Linked Data). Legacy proprietary data can be transformed into a standardized data model and integrated with data from other sources.

  • Linked Data Publishing API documentation is available here.

3.  Semantic Search – Redlink Semantic Search offers high performance faceted and semantic search over enterprise content, different ranking algorithms, auto completion, thesaurus/vocabulary integration, etc.

  • Semantic Search API documentation is available here.
  • Several demonstrations of developer platform access using these APIs are here.

There is also an SDK offering native integration of the Redlink services into the most common programming environments. A developer can simply configure the SDK with an API key and is then able to access the functionality without needing to explicitly call Web services. Currently, Redlink provides SDKs for Java, Python, PHP, and JavaScript.

  • More information is available here.
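The exact SDK calls vary by language, but the pattern is the same: configure the client once with an API key, then call plain functions instead of hand-rolling Web service requests. Below is a rough Python-style sketch of that pattern; the class name, base URL, path, and authentication scheme are illustrative assumptions, not the actual Redlink SDK surface.

```python
import requests

class RedlinkStyleClient:
    """Hypothetical client illustrating the 'configure an API key once,
    then call plain functions' pattern of the Redlink SDKs. The base
    URL and path below are assumptions, not real Redlink endpoints."""

    def __init__(self, api_key, base_url="https://api.example.org"):
        self.api_key = api_key
        self.base_url = base_url

    def analyse(self, text):
        # Content Analysis: send raw text, receive extracted facts,
        # topics, and entity links back as JSON.
        response = requests.post(
            f"{self.base_url}/analysis",           # assumed path
            params={"key": self.api_key},          # assumed auth scheme
            data=text.encode("utf-8"),
            headers={"Content-Type": "text/plain"},
        )
        response.raise_for_status()
        return response.json()

client = RedlinkStyleClient(api_key="YOUR-API-KEY")
# facts = client.analyse("Berlin is the capital of Germany.")
```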

The Redlink Platform is attempting to turn semantic processing into a commodity by simplifying its use for web and application developers around the world without requiring the otherwise necessary technology expertise. Since the platform is implemented as a private cloud infrastructure, there is no need to install and configure complex technology. Instead, users can access the platform through the Internet, typically by calling Web services with an API key for identification, tracking, and billing. Even in small applications, like blogs or custom web applications, it is easy to use the technology. At the same time, the platform provides the scalability needed for large-scale applications with millions of documents. The platform is built on several existing open source framework architectures (e.g. Apache Stanbol, Apache Marmotta, Apache Solr – links below), illustrated below:

Redlink Platform Architecture

As with previous “spotlighted” projects, this solution has also attempted to address a number of the concerns, identified in my previous posts, that are commonly associated with today’s RDF Stores and Linkeddata Framework and Model solutions. Its focus on creating a “Platform for Content Analysis, Linked Data and Semantic Search” should serve it well in the long run.

When focused on open source and open standards, this solution does implement a number of the same components (the Enhanced parts) of my Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE), including:

My next post will discuss a fifth practical example of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts.

=david.l.woolfenden

  1. http://redlink.co/platform/evaluate/

Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts – Austria Linked Open Data (LOD) Project

My last post was the second in a series of “spotlight posts” I am using to illuminate practical examples of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts. All of these examples support my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™.

My third post in this series is focused on the Austria Linked Open Data (LOD) Project, which:

“Following examples in the U.S., EU, UK, and Holland, [the project] implements, on the basis of existing regional and national open records (Open Data) at data.gv.at or data.wien.gv.at, a freely available Linked Open Data (LOD) infrastructure for Austria.”1

  • A short video about LOD Austria on YouTube is here.
  • A project description of the Linked Open Data (LOD) PILOT Austria, presented at the PiLOD event at VU Amsterdam (Netherlands) on 29-01-2014, is here.

The freely available LOD takes the form of modeling and publishing the ~30-50 most important basic data sets (e.g. postal codes, districts, public places, schools, school types, industry sectors, etc.) as LOD, which results in efficient and sustainable data publication and data sharing on the part of administration, economy, and society. Relevant data are identified, converted, linked, and made available as LOD. Linked Open Data provides a method for delivering structured data for optimal reuse. The project has proven to be an important building block for a sustainable digital infrastructure in Austria.
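As an illustration of what “modeling a basic data set as LOD” can amount to in practice, here is a minimal sketch using Python’s rdflib: a single Viennese district expressed as triples and serialized as Turtle. The URIs and vocabulary choices are my own illustrative assumptions, not the project’s actual modeling.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

# Illustrative URIs only, not the identifiers the Austrian LOD
# project actually minted.
EX = Namespace("http://example.data.gv.at/district/")
SCHEMA = Namespace("http://schema.org/")

g = Graph()
g.bind("schema", SCHEMA)

district = EX["1010"]
g.add((district, RDF.type, SCHEMA.AdministrativeArea))
g.add((district, RDFS.label, Literal("Innere Stadt", lang="de")))
g.add((district, SCHEMA.postalCode, Literal("1010")))
g.add((district, SCHEMA.containedInPlace,
       URIRef("http://dbpedia.org/resource/Vienna")))

print(g.serialize(format="turtle"))
```

Once such records carry stable URIs, other publishers can link to them rather than re-keying postal codes and district names.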

The City of Vienna has played a leadership role in this project by stressing the usage of open standards for the interfaces and the software to allow for more transparency, participation, and collaboration. In addition to technical interfaces, a legal framework was also established by city administration. The City of Vienna’s Open Data site includes an extensive Data Catalog and a number of Applications that have been built upon the LOD base data formats.

  • For more technical details on the City of Vienna’s Open Data site and the overall Austria LOD-driven Open Government Data (OGD) initiative, go here.

A system architecture for the transformation of the various input data formats to LOD formats, usage of ontologies, and Android app access is shown below:

Vienna Linked Open Data (LOD) project architecture 2

As with previous “spotlighted” projects, this project has also attempted to address a number of the concerns, identified in one of my previous posts, that are commonly associated with today’s RDF Stores and Linkeddata Framework solutions. This project’s strong and successful ontology-based directions have proven that both short- and long-term benefits are achievable using Linked Data Concepts.

When focused on open source and open standards, this project does implement a number of the same components (the Enhanced parts) of my Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE), including:

My next post will discuss a fourth practical example of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts.

=david.l.woolfenden

  1. http://data.gv.at/anwendungen/vienna-linked-open-data/
  2. http://cweiss.net/lod/

Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts – The SPITFIRE Project

My last post was the first in a series of “spotlight posts” I am using to illuminate practical examples of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts.  All of these examples support my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™.

My second post in this series is focused on The SPITFIRE Project, which:

“Uses open standards and state-of-the-art protocols and is based on semantic web technologies (RDF) for meaningful data exchange, autonomous sensor correlation to learn about the environment, and software built around the Linked Data principles to be open for novel and unforeseen applications.”1

The SPITFIRE Project’s goal is to help the Internet of Things (IoT) become a reality. There are several obstacles on different levels that need to be overcome to meet this goal:

1. Devices and Sensors (or “Things”) need to connect to the Internet and be able to offer services. To do so, they have to announce and describe these services in machine-understandable ways so that user-facing systems are able to find and utilize them.
2. These “Things” have to learn about their physical surroundings so they can serve sensing or acting purposes without requiring explicit configuration or programming.
3. Finally, it must be possible to include IoT devices in complex systems that combine local and remote data, from different sources, in novel and surprising ways.

The SPITFIRE ontology promotes the semantic annotation of IoT sensors and devices in order to simplify:

(a) The integration of sensor data by data providers,
(b) The discovery of sensors, and
(c) The development of services and applications based on sensor data.

To ease the integration with other annotated data sources, the SPITFIRE Project follows the recommended best practice of reusing existing, popular ontologies/vocabularies as much as possible. The design of an ontology for SPITFIRE usage is composed of two main steps. First, existing vocabularies, taxonomies, and ontologies are reviewed and evaluated with respect to their usefulness for describing the concepts required by the SPITFIRE Project use cases. Second, all concepts required in the use cases, but lacking a suitable existing description, are identified, and an ontology to describe these concepts is defined.

Initial SPITFIRE ontology concepts are from the domain of energy saving in building automation and support for modeling any kind of activity/event that has been “sensed together.” The resulting SPITFIRE ontology aligns already existing vocabularies, including the World Wide Web Consortium (W3C) Semantic Sensor Network (SSN) ontology and social and provenance vocabularies, as well as the Internet Engineering Task Force (IETF®) Constrained RESTful Environments (CoRE) specs and especially the Constrained Application Protocol (CoAP), to enable the semantic description of not only sensor measurements and sensor metadata but also of the context surrounding them, the detected activities, and energy efficiency concepts.
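To make this concrete, here is a minimal rdflib sketch of the kind of annotation such an alignment enables: a temperature sensor and one of its observations described with terms from the original W3C SSN ontology. The instance URIs are invented for illustration, and the terms shown are a simplified subset rather than the full SPITFIRE alignment.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

# The original W3C SSN (Semantic Sensor Network) namespace; all
# instance URIs below are invented purely for illustration.
SSN = Namespace("http://purl.org/net/ssnx/ssn#")
EX = Namespace("http://example.org/building/")

g = Graph()
g.bind("ssn", SSN)

sensor = EX["temp-sensor-42"]
obs = EX["obs-0001"]

g.add((sensor, RDF.type, SSN.Sensor))
g.add((sensor, RDFS.label, Literal("Room 101 temperature sensor")))

g.add((obs, RDF.type, SSN.Observation))
g.add((obs, SSN.observedBy, sensor))            # which device sensed it
g.add((obs, SSN.observedProperty, EX.roomTemperature))

print(g.serialize(format="turtle"))
```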

The SPITFIRE ontology is composed of modules with a focus on:

(a) energy saving in building automation (the SPITFIRE consolidated use case) thus allowing the developer to monitor and describe both the structure and the performance of sensor networks and their components; and,

(b) modeling any kind of activity/event that has been sensed together and enriched by descriptions of the surrounding environment. The SPITFIRE ontology enables a full and rich description of not only sensor data but also the sensed event, its structure, what triggered it, and its relation with other activities.

An alignment diagram of already existing ontologies from the social, sensor, and provenance domains, aimed at further modeling the context surrounding sensors, is shown below:

Spitfire Ontology Social Provenance Diagram
  • For more details on the SPITFIRE ontology, go here.
  • A SlideShare presentation on the overall SPITFIRE Project is available here.

One key focus of the SPITFIRE Project was to bring its works and results together in a unified architecture and software suite. The main result is a unified SPITFIRE software suite: the SPITFIRE developer toolchain. This toolchain combines the smart service proxy (SSP), various Constrained Application Protocol (CoAP) extensions and implementations of SPITFIRE, semantic annotations, Service-Level Semantic Entities (SLSE), In-Network Semantic Entities (INSE), In-Network Query Processing (INQP), sleep scheduling, and cryptographic routines. Many of these components are available as modular WiseLib code. They can be used, as is, to quickly create SPITFIRE-enabled networks, or be easily adapted and re-assembled for custom applications.

  • More toolchain details and download links for these components are available here.
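For a taste of the protocol layer the toolchain builds on, here is a minimal CoAP GET in Python using the aiocoap library, following its documented client pattern. This is not part of the SPITFIRE toolchain itself, and the device URI is invented.

```python
import asyncio
from aiocoap import Context, Message, GET

async def read_sensor():
    # Create a CoAP client context and issue a GET to a (hypothetical)
    # constrained device exposing a temperature resource.
    protocol = await Context.create_client_context()
    request = Message(code=GET, uri="coap://sensor.example.org/temperature")
    response = await protocol.request(request).response
    print(f"code={response.code}, payload={response.payload.decode()}")

asyncio.run(read_sensor())
```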

As with the Optique Project, the SPITFIRE Project attempts to address a number of the concerns, identified in one of my previous posts, that are commonly associated with today’s RDF Stores and Linkeddata Framework solutions. The SPITFIRE Project’s strong and successful ontology-based directions have proven that both short- and long-term benefits are achievable using Linked Data Concepts.

When focused on open source and open standards, the SPITFIRE Project does implement a number of the same components (the Enhanced parts) of my Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE), including:

My next post will discuss a third practical example of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts.

=david.l.woolfenden

Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts – The Optique Project

My last few posts have discussed and addressed, what I believe to be, some of the current stumbling blocks to full adoption of Linked Data integration concepts and patterns in today’s Information Interoperability / Information Sharing space. I have attempted to address each of these current stumbling blocks by identifying additional components (the Enhanced parts) of my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™.

This post is the first in a series of “spotlight posts” I will use to illuminate practical examples of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts. My first example is The Optique Project, which:

“Advocates for a next generation of the well-known Ontology-Based Data Integration (Access) (OBDA) approach to address the data access problem. OBDA systems address the data access problem by presenting a general ontology-based and end-user oriented query interface over heterogeneous data sources.”1

This project, in its recent 1.0 release, implemented a 3-layer architecture that offers several functions, including the ability to query/visualize data and to install/maintain an ontology and associated mappings, as well as an efficient query processing mechanism. Optique 1.0 allows users to pose queries via a visual query formulation (VQF) interface, a SPARQL editor, or a preexisting query catalog. The VQF exploits reasoning to show both explicit and implicit domain knowledge, which helps guide the formulation of the query. This architecture is depicted below:

Optique 1.0 System Architecture
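The idea that reasoning can surface “implicit domain knowledge” is easy to demonstrate outside Optique’s own stack. The sketch below uses the rdflib and owlrl Python libraries with an invented example vocabulary: a query for a superclass also returns instances asserted only under a subclass once the RDFS closure has been materialized.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS
import owlrl

EX = Namespace("http://example.org/energy/")

g = Graph()
# Schema: a Turbine is a kind of Equipment.
g.add((EX.Turbine, RDFS.subClassOf, EX.Equipment))
# Data: t1 is only ever asserted to be a Turbine.
g.add((EX.t1, RDF.type, EX.Turbine))

# Materialize the RDFS entailments; t1 implicitly becomes an Equipment.
owlrl.DeductiveClosure(owlrl.RDFS_Semantics).expand(g)

rows = g.query(
    "SELECT ?x WHERE { ?x a <http://example.org/energy/Equipment> }"
)
for row in rows:
    print(row.x)   # -> http://example.org/energy/t1
```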

Optique 1.0 is built on top of the fluid Operations Information Workbench (IWB) Open Source project, a generic platform for semantic data management. The IWB provides a shared triple store for managing the assets of Optique 1.0 [e.g. ontologies, mappings, query logs, (excerpts of) query answers, database metadata, etc.]. The IWB also provides generic interfaces and APIs for semantic data management (i.e. ontology processing APIs). In addition to these backend data management capabilities, the IWB provides an extensible user interface that follows a semantic wiki approach and is based on a rich, extensible pool of widgets for visualization, interaction, mashup, and collaboration.

The Optique Project is committed to making its results available in the form of open source software.

  • Optique Project Public Deliverables are downloadable here.
  • Details on the Optique European Partner Programme are available here.
  • A series of videos showing a demonstration of Optique 1.0 in action is available here.

The Optique Project addresses a number of the concerns, identified in my previous post, that are commonly associated with today’s RDF Stores and Linkeddata Framework solutions. The Optique Installation Wizards, both Basic and Advanced, provide a set of Open Semantic Web Modeling Solutions that reduce the perceived complexity of building the models necessary for successful Linked Data integration concepts and patterns implementation.

When focused on open source and open standards, Optique 1.0 does implement a number of the same components (the Enhanced parts) of my Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE), including:

  • Information Workbench (IWB) – (Info and Download) and its Open Source HTML5 Pivot Viewer Solution (Download)
  • -ontop- framework – (Info and Download)
  • LogMap: Logic-based Methods for Ontology Mapping (Info and Download)

My next post will discuss a second practical example of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts.

=david.l.woolfenden

1. http://www.optique-project.eu/

My Vision for an Enhanced Linked Data Architecture – PART 4

In my previous post, I continued to discuss and address what I believe to be some of the current stumbling blocks to full adoption of Linked Data integration concepts and patterns in today’s Information Interoperability / Information Sharing space. I attempted to address one of these stumbling blocks by identifying additional components (the Enhanced parts) of my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™.

In this post, I will begin to discuss another concern associated with today’s RDF Stores and Linkeddata Framework solutions that could also be addressed via this Enhanced Linked Data Architecture. This concern is the perceived complexity of building the models necessary for successful Linked Data integration concepts and patterns implementation.

From an Ontology Management and Taxonomy creation perspective, tools do exist today that can support multiple ontology representation languages that are used to develop and maintain the persistence of a model and to support visual navigation possibilities within a specific knowledge model implementing Linked Data integration concepts and patterns. A handful of these tools are listed on this Semantic Web Modeling solutions page.

An evolving standard that has helped to simplify the syntax used to format and give expression to Linked Data models–specifically, RDF, Ontologies, and Taxonomies–is Turtle (Terse RDF Triple Language). Turtle is generally recognized as being more readable and easier to manually edit than RDF/XML. A majority of today’s RDF Stores and Linkeddata Framework solutions and toolkits include Turtle parsing and serialization ability.
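A small sketch of the difference in readability, using the rdflib Python library: the same statements are far easier to read and hand-edit in Turtle than in the equivalent RDF/XML serialization.

```python
from rdflib import Graph

# A compact, human-readable Turtle snippet: prefixes, then
# subject-predicate-object statements.
TURTLE = """
@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:alice a foaf:Person ;
    foaf:name "Alice" ;
    foaf:knows ex:bob .
"""

g = Graph()
g.parse(data=TURTLE, format="turtle")

# The equivalent RDF/XML is markedly more verbose and harder to edit
# by hand, which is why Turtle has become the preferred syntax.
print(g.serialize(format="xml"))
```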

When focused on open source and open standards, some other existing components (the Enhanced parts) of my proposed Enhanced Linked Data Architecture that will help address this concern are:

  • Neologism – by the Linked Data Research Centre – DERI – National University of Ireland, Galway (Info and Download)
  • Xturtle Editor Eclipse-based plugin – by the Research Group Agile Knowledge Engineering and Semantic Web (AKSW) – (Info and GitHub)
  • Semaphore Ontology Manager – by Smartlogic Semaphore Limited (Info and Download)

My next post is the first in a series of “spotlight posts” I will use to illuminate practical examples of Real-life Semantic Web Technology-based Information Sharing via Linked Data Concepts.

=david.l.woolfenden

My Vision for an Enhanced Linked Data Architecture – PART 3

In my previous post, I discussed and began to address what I believe to be a current stumbling block to full adoption of Linked Data integration concepts and patterns in today’s Information Interoperability / Information Sharing space. I attempted to address this stumbling block by identifying additional components (the Enhanced parts) of my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™. In this post, I will discuss and begin to address a third major stumbling block.

The third major stumbling block to full adoption of these concepts revolves around poor RDF store performance, reflected in sluggish data loading times and less-than-ideal SPARQL Protocol and RDF Query Language (SPARQL) query-response and update results against distributed and federated models. These characteristics are commonly associated with a majority of today’s RDF (i.e. Triple Store or Graph DB) Stores and LinkedData Framework solutions. Fortunately, these types of performance issues are being addressed on a daily basis by several software development organizations and their supporting open-source communities.

A Linked Open Data 2 (LOD2) project “Big Data RDF Store Benchmarking Experience” blog entry recently recorded a general observation that:

“RDF stores have made significant advances in architecture (cluster-ready) and functionality (Business Intelligence queries), as well as in performance and scalability. By now, we can truly conclude that Big Data projects can make use of RDF technology, and that is a win.”

A recent posting on Orri Erling’s blog (OpenLink Software, Inc.) stated:

“To get much further in performance, physical storage needs to adapt to the data. Thus, in the long term, we see RDF as a lingua franca of data interchange and publishing, supported by highly scalable and adaptive databases that exploit the structure implicit in the data to deliver performance equal to the best in Structured Query Language (SQL) data warehousing. When we get the schema from the data, we have schema-last flexibility and schema-first performance. The genie is back in the bottle, and data models are unified.”

Just updated on June 24, 2013, this NoSQL Databases for RDF: An Empirical Evaluation page links to some very telling Benchmark Results comparing five different NoSQL stores for RDF processing.
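Benchmarking at that scale is a serious engineering exercise, but the basic measurement pattern, timing a bulk load and then timing a representative query, can be sketched in a few lines of Python with rdflib. A real benchmark would of course run against a production store such as Virtuoso rather than an in-memory graph, and with far larger datasets.

```python
import time
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/bench/")
g = Graph()

# Metric 1: data loading time, here simulated with synthetic triples.
start = time.perf_counter()
for i in range(10_000):
    g.add((EX[f"item{i}"], RDF.type, EX.Item))
    g.add((EX[f"item{i}"], EX.value, Literal(i)))
print(f"load:  {time.perf_counter() - start:.2f}s for {len(g)} triples")

# Metric 2: query-response time for a representative SPARQL query.
start = time.perf_counter()
count = len(list(g.query(
    "SELECT ?s WHERE { ?s <http://example.org/bench/value> ?v . "
    "FILTER(?v > 9990) }"
)))
print(f"query: {time.perf_counter() - start:.2f}s, {count} results")
```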

When focused on open source and open standards, some additional existing components (the Enhanced parts) of my proposed ELAPSE™ Architecture that will help address these performance concerns are:

  • Virtuoso Open-Source Edition – scalable cross-platform server that combines Relational, Graph, and Document Data Management with Web Application Server and Web Services Platform functionality (Github and interview with founder)
  • MonetDB – pioneered column-store solutions for high-performance data warehouses for business intelligence and eScience since 1993 (Downloads)

My next post will discuss other concerns associated with today’s RDF Stores and Linkeddata Framework solutions that could also be addressed via this ELAPSE™ Architecture.

=david.l.woolfenden

My Vision for an Enhanced Linked Data Architecture – PART 2

In my previous post I discussed two of, what I believe to be, the current stumbling blocks to full adoption of Linked Data integration concepts and patterns in today’s Information Interoperability / Information Sharing space. I also began to address one of these stumbling blocks via additional components (the Enhanced parts) of my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™. With this posting, I will begin to discuss possible data security labeling/classification solutions that may address this stumbling block and how the current “Triple-level” security concerns associated with today’s Semantic Web Technology solutions could also be resolved via my proposed ELAPSE™ Architecture.

A second major stumbling block to full adoption of these concepts is the lack of built-in “Triple-level” security capability in available RDF (i.e. Triple Store or Graph DB) Stores and LinkedData Framework solutions. Today’s Information Interoperability / Information Sharing space requires that actionable shared information be gleaned from numerous different and diverse file format(s) and data types, including traditional structured data sources along with unstructured data, semi-structured data, and raw data sources in both open and proprietary format(s). Each of these data sources, when required, has typically already had appropriate Authentication, Authorization, and Accounting (AAA) policies adopted and implemented. As these data sources are “exposed” as Interlinked Semantic Data, to take advantage of Linked Data integration concepts and patterns, their existing AAA policies must be maintained and enforced. A recent blog posting by Orri Erling here discusses “combined provenance and security label” and “selective hash join” graph-level access concepts to address access control needs and their associated performance effects.
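One way to preserve existing AAA policies when data is exposed as triples, in the spirit of the graph-level labeling Erling describes, is to partition triples into named graphs per source and classification, and then restrict each query to the graphs the requesting user is cleared for. The sketch below is a minimal illustration of that idea, with an invented clearance table and graph IRIs; a production system would enforce this inside the store, not in client code.

```python
# Each data source keeps its triples in a named graph whose IRI encodes
# its security label; a user's clearances map to a set of graph IRIs.
CLEARANCES = {
    "analyst.a": [
        "http://example.org/graphs/public",
        "http://example.org/graphs/restricted",
    ],
    "intern.b": [
        "http://example.org/graphs/public",
    ],
}

def scoped_query(user, where_clause):
    """Build a SPARQL query whose dataset is limited, via FROM clauses,
    to the named graphs this user is cleared to read."""
    froms = "\n".join(f"FROM <{g}>" for g in CLEARANCES[user])
    return f"SELECT *\n{froms}\nWHERE {{ {where_clause} }}"

print(scoped_query("intern.b", "?s ?p ?o"))
# The 'restricted' graph never enters the intern's dataset, so its
# triples cannot leak into query answers.
```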

A prime example of the need to deal with numerous different and diverse file format(s) and data types is the wealth of geospatial data required to be stored and analyzed by the intelligence community. The Open Geospatial Consortium (OGC) standards [i.e. Web Map Service (WMS), Web Feature Service (WFS), Web Coverage Service (WCS), and Web Processing Service (WPS)] were specifically defined to provide open standards based interoperability for access to and exchange of geospatial information across multiple data sources, which typically have existing AAA policies that must be maintained as this data is modeled and exposed as a semantic knowledge base.

A recent paper published on 15 April 2013 in The Institute of Electrical & Electronics Engineers, Inc. (IEEE) Transactions on Dependable and Secure Computing (TDSC) journal, titled “Authorization Control for a Semantic Data Repository Through an Inference Policy Engine,” proposes a powerful multi-layered authorization and access control model. This model is a combination mechanism that includes: a ‘security role and labeling technique’ in which many security properties can be determined by the expressiveness of the authorization scheme; a powerful authorization system [26]; and a multi-clearance paradigm [30].

Without diving deeply into the usage of Description Logic (DL) when defining the Semantic Web Model (or semantic knowledge base), this model/knowledge base can be perceived as consisting of Terminological Knowledge Box (TBox) Axioms and Assertional Knowledge Box (ABox) Axioms [18]/[29]. I believe the proposed semantic-reasoner-based authorization model and its support for content-based access control, in which the authorization requirements are established not only for the model’s concepts in the TBox (conceptual schema) but also for their individuals in the ABox (actual data), is exactly what is needed to help address today’s security-related adoption stumbling block.

As always, when focused on open standards, some additional existing components (the Enhanced parts) of my proposed ELAPSE™ Architecture that will help address these security concerns may also include:

My next post will discuss other concerns associated with today’s RDF Stores and Linkeddata Framework solutions that could also be addressed via this ELAPSE™ Architecture.

=david.l.woolfenden


IEEE Paper Citations:

Abdullah Alamri, Peter Bertok, and James A. Thom, “Authorization Control for a Semantic Data Repository Through an Inference Policy Engine,” IEEE Transactions on Dependable and Secure Computing, 15 April 2013. IEEE Computer Society Digital Library, IEEE Computer Society.

  • [18] V. Milea, F. Frasincar, and U. Kaymak. tOWL: A temporal web ontology language. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 42(1):268–281, Feb. 2012.
  • [26] R. S. Sandhu. Role-based access control. In Advances in Computers. Academic Press, 1994.
  • [29] A.-Y. Turhan. Description logic reasoning for semantic web ontologies. In Proceedings of the International Conference on Web Intelligence, Mining and Semantics, WIMS ’11, pages 6:1–6:5, New York, NY, USA, 2011. ACM.
  • [30] L. Xu, H. Zhang, X. Du, and C. Wang. Research on mandatory access control model for application system. In Networks Security, Wireless Communications and Trusted Computing, 2009. NSWCTC ’09. International Conference on, volume 2, pages 159–163, April 2009.

My Vision for an Enhanced Linked Data Architecture – PART 1

In my previous post, I discussed the movement towards Semantic Web Technology (SWT)-based information sharing via Linked Data Concepts. Within that posting, I also made a prediction that evolving Linked Data open standards, specifications, and solutions are now creating, and will in the future create, continued disruption across the entire Information Interoperability / Information Sharing space. In this post, I will identify some of what I believe to be the current stumbling blocks to full adoption of Linked Data integration concepts and patterns in today’s Information Interoperability / Information Sharing space. I will also begin to address some of these stumbling blocks via additional components (the Enhanced parts) of my proposed Enhanced Linked Data Architecture and its current incarnation as the Enhanced Linkeddata Architecture for Persistent Sharing Environments (ELAPSE)™.

One stumbling block to full adoption of these concepts is the lack of a built-in, event-based, “Real-time” Push Notification ability in available RDF [i.e. Triple Store or Graph Database (Graph DB)] Stores and LinkedData Framework solutions. Some popular RDF Store solutions do support the concept of “Events,” though not many. Events, in this context, are basic notifications of any changes made in the RDF Store; a typical, basic “Real-time” Push Notification ability can be customized to meet the needs of the business domain. I attempt to address this deficiency in my proposed Enhanced Linked Data Architecture by including components that support a “Real-time” Push Notification style of Internet-based immediate communication, where the request for a given transaction is initiated by the publisher of that transaction/notification, not by the receiver. This capability is implemented via a Hypertext Transfer Protocol (HTTP) Webhook/Service Hook Web Application Programming Interface (API), i.e. a simple HTTP(S)-based webserver-to-webserver, service-oriented communication mechanism initiated on an event/transaction basis. WebHooks are typically invoked via an API using simple mechanisms for sending Push Notification “trigger events” between APIs using HTTP POST callbacks.
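A minimal sketch of that webhook pattern in Python follows; the subscriber URL and the shape of the event payload are illustrative assumptions.

```python
import requests

# Subscribers register callback URLs; the publisher (a wrapper around
# the RDF Store) POSTs to each of them when a change event occurs.
SUBSCRIBERS = ["https://consumer.example.org/hooks/rdf-changes"]

def notify_change(graph_iri, operation, triple_count):
    """Push a change event to every registered webhook endpoint."""
    event = {
        "graph": graph_iri,
        "operation": operation,        # e.g. "insert" or "delete"
        "triples_affected": triple_count,
    }
    for url in SUBSCRIBERS:
        # The HTTP POST callback: publisher-initiated, per event.
        requests.post(url, json=event, timeout=5)

# notify_change("http://example.org/graphs/public", "insert", 12)
```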

An additional stumbling block to full adoption of these concepts is the perceived, and real, lack of “Triple-level” security across most RDF Stores and Linkeddata Stack Framework solutions. Like the move to the “cloud” itself (i.e. cloud computing), this can become a major issue, real or not, and a serious stumbling block to full adoption of Linked Data integration concepts and patterns in today’s Information Interoperability / Information Sharing space.

My next post will begin to discuss security labeling/classification and how “Triple-level” security concerns associated with today’s RDF Stores and Linkeddata Stack Framework solutions could be addressed via this Enhanced Linked Data Architecture.

=david.l.woolfenden