Posts Tagged ‘foaf’

What’s the point : Semantic RESTful Web Services ?

March 9, 2011

Well, I think it’s dawning on me that what Roy Fielding talks about (rather abstractly) [1] is what Henry Story neatly summarises and provides examples of [2] – REST SOA, with connected semantics. I’m one of those who can be accused of implementing REST not in the Roy Fielding sense of the word, but in the “anything that’s not WS*” sense. I’ve done request mapping, content negotiation, resource rendering in XML and JSON (and a bunch of others), GET, PUT, POST, HEAD and so on, but never all together, and never in the true Spirit of Roy. But when you add the semantic web, you can really see that there’s something good going on here – “easy” and ubiquitous web services.

Roy talks about representations, resources and connectedness, about agents or service consumers that deal with well-known media types and links, and nothing else – REST implies that a user agent is “thin”, understands basic and well-known types and protocols, and renders a look and feel and a behaviour that reacts to what it is fed. As he says, it should work on the “follow your nose” principle (no need for WADL [3,4]).

For a browser this would mean that you point it at a URL, it displays the content suitably, and it receives and displays links with appropriate CRUD capabilities for the resource and any relations it is given. For example, given a book resource, render it using the .book CSS class, and create links to add it to a shopping cart, get a contents list, or add it to a favourites list. For a chapter in the book, there may be links to print it, to relate it to a chapter in another similar book, or to annotate it and send it to a colleague. For a daemon or agent it might mean that it alters the time at which it performs an action against a resource, or what action it takes. The navigation and action controls aren’t determined by business or display logic, but by the resource and its relations – the agent consuming the resource knows it has to display or follow a link, the CSS may have display capabilities based on the resource type or context, and the workflow steps will appear at the right time for the right user, under the right circumstances. Client logic is solely to deal with converting representations to appropriate media types, and driving application state – using relations and verbs to make transitions with links.

But the thing that got me spinning, as I tried to understand the abstractedness, and as I looked into JAX-RS [5] and its various implementations (well, Spring* in particular TBH, which doesn’t in fact do JAX-RS [6]), was that the connectedness and the follow-your-nose principle seemed absent. It’s all very well and cosy (and arguably easy) to create some platform code that maps URIs to classes, methods and HTTP verbs, and then to output XML or JSON or not (think JSP), or perhaps even Atom, OData, RDF, N3 or TTL, but where’s the linked connectedness – the things we talk about and take for granted in the Linked Open Semantic Data world ? And how does it know what links to create, how to generate them, and how they should be presented (if there’s a human involved) ?

Well, Henry switches on that lightbulb for me when he illustrates all the foaf:knows relations from his foaf profile [2]. In a RESTful world where a service returns a foaf file, an agent that reads the foaf:knows elements can decide what to do based on that predicate – it can deduce that the resource represented is a Person and can create the links it chooses using what it knows about foaf:knows and the REST verbs – create/read/update/delete. It might allow addition of another foaf:knows with a PUT to the URI identifying the owner, an update to a mailing list so that all those foaf:knows objects are added, or an automatic update of a trust counter against a system resource, because if Henry foaf:knows TBL in this context, then TBL must be “good for it” :-). In addition, it only knows that a URI represents that Person, and the URI could be a hypermedia link in the form of a URL, an FTP or WebDAV link, or some other protocol. Finally, this “knows” concept is really an upfront agreement about what representations are being used for the state of the application (it knows an XML schema, or an ontology, or perhaps even looks them up on the fly), but navigating through state is controlled by the interactions with the service (HTTP verbs) and the responses (status and agreed representation in the body content) received.
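
A minimal sketch of that idea in plain Java, with no RDF library involved: the agent is handed a profile as a list of triples and derives the requests it could offer purely from the foaf:knows predicate. The class name, the `Triple` type and the example.org URIs are assumptions made for illustration; only the foaf:knows URI comes from the FOAF vocabulary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FoafAgentSketch {
    static final String FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows";

    /** One RDF statement held as three plain strings (hypothetical helper type). */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Lists the requests the agent could offer for the owner of a foaf profile. */
    static List<String> actionsFor(String owner, List<Triple> profile) {
        List<String> actions = new ArrayList<String>();
        for (Triple t : profile) {
            if (t.subject.equals(owner) && t.predicate.equals(FOAF_KNOWS)) {
                // foaf:knows points at another Person, so the agent can follow the link.
                actions.add("GET " + t.object);
            }
        }
        if (!actions.isEmpty()) {
            // The owner has foaf:knows relations, so offer to add another one.
            actions.add("PUT " + owner);
        }
        return actions;
    }

    public static void main(String[] args) {
        String henry = "http://example.org/people/henry#me"; // hypothetical URI
        List<Triple> profile = Arrays.asList(
                new Triple(henry, FOAF_KNOWS, "http://example.org/people/tbl#i"),
                new Triple(henry, "http://xmlns.com/foaf/0.1/name", "Henry"));
        for (String action : actionsFor(henry, profile)) {
            System.out.println(action);
        }
    }
}
```

Nothing in the agent is specific to one service: any resource that uses foaf:knows would produce the same controls, which is the point of agreeing on the vocabulary up front.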

At first sight those RESTful libraries don’t really need to know that much about the connectedness – they only need to map verbs and serve resources with those links embedded (RDF anyone ?), using those well-known vocabularies, classes, relations and constraints – i.e. ontologies. But what about workflow : I post an object or resource, I get a response with the ID of that resource, and I need a link that tells me where to go for the next state transition ?
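
One way a service could hand back that “where next” link is an HTTP `Link` header alongside the response to the POST. The following is a rough sketch of a client reading such a header; the `payment` relation and the example.org URIs are hypothetical, and the parser is deliberately simplified (it only understands a quoted `rel` as the first parameter of each link).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkHeaderSketch {
    // Matches one link-value such as: <http://example.org/x>; rel="next"
    private static final Pattern LINK =
            Pattern.compile("<([^>]*)>\\s*;\\s*rel=\"([^\"]*)\"");

    /** Maps each relation name found in a Link header to its target URI. */
    static Map<String, String> parse(String header) {
        Map<String, String> byRel = new LinkedHashMap<String, String>();
        Matcher m = LINK.matcher(header);
        while (m.find()) {
            byRel.put(m.group(2), m.group(1));
        }
        return byRel;
    }

    public static void main(String[] args) {
        // Hypothetical header returned after POSTing a new order.
        String header = "<http://example.org/orders/42>; rel=\"self\", "
                + "<http://example.org/orders/42/payment>; rel=\"payment\"";
        // The client picks its next state transition by relation, not by URI pattern.
        System.out.println(parse(header).get("payment"));
    }
}
```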

So, lo-and-behold, we have semantic linked data and REST superadditively combined, in a loosely coupled web (or “cloud”, if you like that keyword) of semantic links, intelligent user agents that understand those links or their context, web resolvable URIs, and value-added interlinked services – in effect a “Web Service Bus”. [7] !!

Now

  1. Point your People tool at the RESTful people+location web service and it “just works” to give you a social-network-mashup of connected people and interests (provenance, trust), and then
  2. switch over to your Energy consumption application and it also just works (based on what it has chosen to do and the well-known ontologies and resources it understands) – see how big your carbon footprint is when you meet TBL next week at Geneva if you fly, drive or take the train – and maybe you’ll be able to see who you can meet on the way and who else will be sitting beside you.

But you’re not out of the woods yet. Doing semantic RESTful web apps isn’t a clear open space : your application still has to deal with authentication, input validation, long-lived database transaction control, multithreading, performance and perhaps object-relational mapping. But JAX-RS/REST takes care of the object-message mapping (the interface-to-implementation layer), your client or agent is thin but intelligent, and your middle tier contains your business logic.

Your application will need to honour the request-response state machine, perhaps checking which methods are available using OPTIONS, or checking whether a representation has changed using ETags.
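
As an illustration of the ETag half of that, here is a small sketch of the decision a server makes for a conditional GET: if the client’s `If-None-Match` header names the current ETag, answer 304 Not Modified, otherwise send the full representation with 200. The class and method names are made up for the example, and the comma split is a simplification that assumes ETag values contain no commas.

```java
class ConditionalGetSketch {

    /** Drops the weak-validator prefix, since GET uses weak comparison. */
    private static String stripWeak(String tag) {
        return tag.startsWith("W/") ? tag.substring(2) : tag;
    }

    /** Returns the status code for a GET that may carry an If-None-Match header. */
    static int statusFor(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null) {
            return 200; // unconditional request: always send the representation
        }
        String current = stripWeak(currentEtag);
        for (String candidate : ifNoneMatch.split(",")) {
            String tag = stripWeak(candidate.trim());
            if (tag.equals("*") || tag.equals(current)) {
                return 304; // the client's cached copy is still good
            }
        }
        return 200;
    }

    public static void main(String[] args) {
        System.out.println(statusFor("\"v1\", \"v2\"", "\"v2\""));
        System.out.println(statusFor("\"v1\"", "\"v2\""));
    }
}
```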

You’ll need to decide how to transform from your programming model of choice – OO perhaps – to Resource. Some of the object-to-RDF mapping libraries like Empire [15], JenaBean [16], Sommer [22] (defunct ?) and object-triple [17] may help. Perhaps this won’t be an issue for you if you can foist the RESTful resource and linkage proposition onto an object model and remain in the object world – why waste processor and resource when you store data in RDF, convert to an Object on retrieval, process, convert to RDF/XML or JSON on the way out, then parse and walk in a JSP before rendering as HTML ? As an OO programmer on the web you’re familiar with marshalling objects in and out of different serializations – RDF/XML/JSON/HTML – but you do want and need to minimise those transitions. Perhaps for “Big Data” we should stay in the Resource world : persist to a fast native RDF triplestore or an HPC based system on a cluster of MapReduce or somesuch (CouchDB [20], Heart/HBase [21] ? perhaps BigData [18] or SHARD [19], AllegroGraph [23] ?), and talk to it with Prolog or some such – forget the Object paradigm and embrace the Linked, Open World Resources, and also do it with REST.
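
To make the object-to-Resource transition concrete, here is a hand-rolled sketch that turns a plain Java object into N-Triples lines. It does not use the API of Empire, JenaBean or any of the libraries above (those work from annotations or bean conventions); the `Person` class and the example.org URIs are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PersonToTriplesSketch {
    static final String FOAF = "http://xmlns.com/foaf/0.1/";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /** An ordinary domain object (hypothetical), identified by a URI. */
    static final class Person {
        final String uri;
        final String name;
        final List<String> knows; // URIs of other people

        Person(String uri, String name, List<String> knows) {
            this.uri = uri;
            this.name = name;
            this.knows = knows;
        }
    }

    /** Escapes backslashes and double quotes for an N-Triples literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Serialises one object as N-Triples: a type, a name, and one line per friend. */
    static List<String> toNTriples(Person p) {
        List<String> lines = new ArrayList<String>();
        lines.add("<" + p.uri + "> <" + RDF_TYPE + "> <" + FOAF + "Person> .");
        lines.add("<" + p.uri + "> <" + FOAF + "name> \"" + escape(p.name) + "\" .");
        for (String friend : p.knows) {
            lines.add("<" + p.uri + "> <" + FOAF + "knows> <" + friend + "> .");
        }
        return lines;
    }

    public static void main(String[] args) {
        Person henry = new Person("http://example.org/people/henry#me", "Henry",
                Arrays.asList("http://example.org/people/tbl#i"));
        for (String line : toNTriples(henry)) {
            System.out.println(line);
        }
    }
}
```

Every hop between the object and resource worlds costs a conversion like this one, which is why the number of such transitions is worth keeping down.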

You also have to be clear that REST suits what you want to do (other architectures haven’t just been demoted to history), and clear about what your services are – what you are interfacing with, what your domain objects are, what service operations are exposed when, what workflow you need to encompass [13], and how granular you need to be. A shopping cart application will need to save items to a shopping list, rather than save the items themselves (or probably the cart resource), but it will also, behind the service, need to update a stock control or inventory system – which isn’t exposed to your end user. So be clear about which service level CRUD operations you need to expose to your user or “agent”, and which, if any, domain objects you need to manipulate directly.

But in the end, hopefully, you’ve still followed your enterprise principles and patterns, but you’ve adopted a long-lasting web-scale architecture, and if you’ve added the semantic vocabulary, you’ve got the basis for successful evolution, a network effect, adaptable clients and agents and a successful resolution to an important business case – that’s why you’re doing this, isn’t it, not because it’s cool ?

Update : April 24 – read Otavio’s paper on RESTfulGrounding [25] but also read Alowisheq, Millard and Tiropanis’ EXPRESS RESTful services paper [26]. RESTfulGrounding does for REST and WADL what OWL-S does for WSDL – it gets semantic descriptions into the syntactic descriptions that automated services might use to interact with a web service, and facilitates discovery, composition, monitoring and execution. EXPRESS takes a different approach and, based on an existing RESTful web service, allows you to create an OWL description that can also be RESTfully accessed to describe the service’s resources, relations and “parameters” (OWL DatatypeProperty and ObjectProperty). They describe an adaptation of Amazon S3 buckets and docs with EXPRESS and compare it with the SA-REST and OWL-S approaches.

I like EXPRESS more than RESTfulGrounding as the simplicity appeals : the way it in turn relies on REST to underpin the service description access and interaction, adheres to RESTful principles for message exchange (using TTL rather than XML) and follow-your-nose, and the fact that this in turn means I don’t have to learn much if I want to make use of it. It does need a code generator for stubs and URIs and a manual step to define which methods apply to which URIs, and it doesn’t do much for discovery and composition – but they acknowledge this and intend to work on it – and a real implementation with these tools needs to be made available so that people like me can try it out. Is there one ?

I need to understand more about WADL[27,28] (why is it needed in the first place ?) and how I might go about actually building a set of services that need to be described and then discovered and composed to provide some useful value, but EXPRESS fits nicely into web scale, lo-fi approaches that quickly gain traction and that might make use of a CPoA kind of approach for discovery and composition.

* You’ve got other choices :

  • Apache CXF – perhaps best if you come from the WS* camp or have a mixture [8]
  • GlassFish Jersey – seems to have good traction, with hooks into Spring et al [9]
  • RESTeasy – JBoss jax-rs implementation [10]
  • RESTlet – not sure about this, seems to have good support, taking a different approach apparently – eg RESTlet vs SERVlet, but I need more info to do it justice [11]
  • PLAY Framework – has good REST support I understand from others. [12]
  • Clerezza – Apache incubator project with RDF, jax-rs, scala and “renderlet” support. Looks interesting from a RDF PoV, but maybe not so interesting from an OOD PoV [14]

[1] http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
[2] http://blogs.sun.com/bblfish/entry/rest_apis_must_be_hypertext
[3] http://wadl.java.net/
[4] http://bitworking.org/news/193/Do-we-need-WADL
[5] http://jcp.org/en/jsr/detail?id=311
[6] http://grzegorzborkowski.blogspot.com/2009/03/test-drive-of-spring-30-m2-rest-support.html
[7] http://wisdomofganesh.blogspot.com/2010/06/wanted-esc-not-esb.html
[8] http://cxf.apache.org/
[9] http://jersey.java.net/
[10] http://www.jboss.org/resteasy
[11] http://www.restlet.org/
[12] http://www.playframework.org/documentation/1.1/routes
[13] http://www.infoq.com/articles/webber-rest-workflow
[14] http://incubator.apache.org/clerezza/
[15] https://github.com/clarkparsia/Empire
[16] http://code.google.com/p/jenabean/
[17] http://code.google.com/p/object-triple/
[18] http://www.systap.com/bigdata.htm
[19] http://www.dist-systems.bbn.com/people/krohloff/shard_overview.shtml
[20] http://couchdb.apache.org/
[21] http://wiki.apache.org/incubator/HeartProposal
[22] http://java.net/projects/sommer/
[23] http://www.franz.com/agraph/allegrograph/
[24] http://blog.cubrid.org/web-2-0/database-technology-for-large-scale-data/
[25] http://www.fullsemanticweb.com/blog/ontologies/restfulgrounding/
[26] http://ebookbrowse.com/express-expressing-restful-semantic-services-using-domain-ontologies-pdf-d12806537
[27] http://java.net/projects/wadl/
[28] http://bitworking.org/news/193/Do-we-need-WADL

Java Semantic & Linked Open Data webapps – Part 5.1

January 18, 2011

How to Architect ?

Well – what before how  – this is firstly about requirements, and then about treatment

Linked Open Data app

Create a semantic repository for a read only dataset with a sparql endpoint for the linked open data web. Create a web application with Ajax and html (no server side code) that makes use of this data and demonstrates linkage to other datasets. Integrate free text search and query capability. Generate a data driven UI from ontology if possible.

So – a fairly tall order : in summary

  • define ontology
  • extract entities from digital text and transform to rdf defined by ontology
  • create an RDF dataset and host in a repository.
  • provide a sparql endpoint
  • create a URI namespace and resolution capability. ensure persistence and decoupling if possible
  • provide content negotiation for human and machine addressing
  • create a UI with client side code only
  • create a text index for keyword search and possibly faceted search, and integrate into the UI alongside query driven interfaces
  • link to other datasets – geonames, dbpedia, any others meaningful – demonstrate promise and capability of linkage
  • build an ontology driven UI so that a human can navigate data, with appropriate display based on type, and appropriate form to drive exploration

Here’s what we end up with :

Lewis Topographical Dictionary linked data app - system diagram

  1. UserAgent – a browser navigates to Lewis TDI homepage – http://uoccou.endofinternet.net:8080/resources/sparql – and
  2. the webserver (tomcat in fact) returns html and javascript. This is the “application”.
  3. interactions on the webpage invoke javascript that either makes direct calls to Joseki (6) or makes use of permanent URIs (at purl.org) for subject instances from the ontology
  4. purl.org redirects to dynamic dns which resolves to hosted application – on EC2, or during development to some other server. This means we have permanent URIs with flexible hosting locations, at the expense of some network round trips – YMMV.
  5. dyndns resolves to EC2, where a 303 filter intercepts the request and resolves it to a sparql (6) call for html, json or rdf. Pluggable logic for different URIs and/or accept headers means this can be a select, describe, or construct.
  6. Joseki as a sparql endpoint provides RDF query processing with extensions for freetext search, aggregates, federation, inferencing
  7. TDB provides a single semantic repository instance (java, persistent, memory mapped) addressable by joseki. For failover or horizontal scaling with multiple sparql endpoints SDB should probably be used. For vertical scaling at TDB – get a bigger machine ! Consider other repository options where physical partitioning, failover/resilience or concurrent webapp instance access is required (ie if you’re building a webapp connected to a repository by code rather than a web page that makes use of a sparql endpoint).
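
The 303 step in (5) can be sketched roughly as follows. This is not the filter used in the app, just an illustration of the idea under stated assumptions: the `/sparql` path, the `format` query parameter and the choice of DESCRIBE for RDF versus SELECT for everything else are all hypothetical, and the Accept header is matched naively with no q-value handling.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SeeOtherSketch {
    static final String ENDPOINT = "/sparql"; // hypothetical endpoint path

    /** Picks an output format from the Accept header (no q-value handling). */
    static String formatFor(String accept) {
        if (accept != null && accept.contains("application/rdf+xml")) {
            return "rdf";
        }
        if (accept != null && accept.contains("application/json")) {
            return "json";
        }
        return "html";
    }

    /** Builds the Location header value for the 303 See Other response. */
    static String locationFor(String resourceUri, String accept) {
        String format = formatFor(accept);
        // Assumption: agents asking for RDF get a DESCRIBE, everyone else a SELECT.
        String query = format.equals("rdf")
                ? "DESCRIBE <" + resourceUri + ">"
                : "SELECT ?p ?o WHERE { <" + resourceUri + "> ?p ?o }";
        try {
            return ENDPOINT + "?query=" + URLEncoder.encode(query, "UTF-8")
                    + "&format=" + format; // "format" is a made-up parameter name
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        String uri = "http://example.org/id/place/1"; // hypothetical permanent URI
        System.out.println("303 -> " + locationFor(uri, "application/rdf+xml"));
        System.out.println("303 -> " + locationFor(uri, "text/html"));
    }
}
```

Keeping the query choice in one small function is what makes the logic pluggable: a different URI pattern or media type only needs another branch.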

The next article will provide a similar description of the architecture used for the Java web application, with code that is directly connected to a repository rather than one that talks to a sparql endpoint.

Building a Semantic Web Application in Java

November 22, 2010

I assume you know Java and you have built web applications. You also know what a database is, what hierarchical data are, what a schema is. You may have used ORM libraries like Hibernate. You’ve done some UI work with Javascript, JSP or RCP perhaps. You know what MVC means in the context of a web application. Now you want to know what the Semantic Web and the Linked Open Data web are, and why you would use them to build a useful application.

You have also decided that you want to make sure it can fulfil all the usual use-cases – performance, concurrent updating, ease of maintenance, basic functional capability (for a business app perhaps) – and, importantly, you want to know what the benefit is : how do you sell this to your manager ? For you it’s cool, it’s being talked about in all the hard-to-reach places that only you know, and it has more and more promise the more you look into it. You want to do it, but if it doesn’t cut the mustard, what’s the point – wait a while, let someone else solve the problems, then come back to it. You can still work in the web app world, doing what you’ve done for the last while, make some money, pay the bills. So, before you even start, the Semantic Web is up against it – it has a lot to prove against a long-embedded technology where experience has already paid off and dividends have been reaped. It had better be good……

First question – where to start ? It’s a hard question to answer if you’re not familiar with the Semantic Web, and even if you are – even if you can grasp the basic and simple ideas behind it – you’re still going to have difficulty. And what’s the Linked Open Data web ? How do I use it, why should I bother ? Why would my boss be interested ? How can I make them interested ? How can they make money from it ? Doesn’t this mean my company’s data is going to be available to the public, and my competitors ?

So, over the next while, I’m going to try and relate how and why I did what I did to build two different kinds of web application –

1) a read-only reference data application that makes use of content loaded into a repository and fronted with a sparql endpoint, and talks to geonames, dbpedia, sindice, uberblic and google maps.

2) a “White Label” J2EE Location Based Service that uses JPA behind its DAOs to talk to a semantic repository. It also makes use of OpenID to provide some anonymity, Spring Security to provide security and ACL controlled authorization (all stored in a semantic repository) and integrates the first app using JSONP (Facebook Authentication and OAuth authorization are also on the cards).

It has taken a while, and there have been some good and bad choices, so tune in next time for the first installment in the series, selected from this bunch :

  1. Target Functionality – what is my application and what parts of the semantic web do I want and/or need to deliver. Does the application want/need to be part of the Linked Open Data web ?
  2. Selection criteria – technologies, content, delivery
  3. Available tools and technologies. Can I avoid SOA pitfalls or is this just the same old story ?
  4. What needs writing ?
  5. How to architect ?
  6. Do I need an ontology – how to create ? What’s OWL-DL/Lite/RDFS/DAML/XYZ ?
  7. Text vs RDF vs XML – Reading, generating, parsing, APIs.
  8. Size, performance, scale
  9. Output – files, database, rdf, URIs, linkage, API ?
  10. Do I need freetext search ? How do I do it ? Why not just use SOLR and faceted search ?
  11. Content Negotiation – who/what is my audience ? browser, script, machine, API ?
  12. Mapping – I’ve got location, lets use it – show it and link it
  13. Ajax, js – can I use semantic web/rdf/rdfa libs in my UI  – do I have to do it all ? What help is there ?
  14. UI – can I build a UI to display and collect queries (forms) using my ontology ? How do I allow a human to easily navigate through my structured, but infinitely (hey, what about closure ?) graphable set of nodes ?

Which do you want to hear about first ?

Like I said, I built two web applications for the Semantic and Linked Open Data web. The articles following this introduction will talk about both in parallel.

Lewis Topographical Dictionary Ireland and SkyTwenty on EC2

November 17, 2010

Both applications are now running on Amazon EC2 in a micro instance.

  • AMI 32bit Ubuntu 10.04 (ami-480df921)
  • OpenJDK6
  • Tomcat6
  • MySQL5
  • Joseki 3.4.2
  • Jena 2.6.2
  • Sesame2
  • Empire 0.7
  • DynDNS ddclient (see [1])

Don’t try installing sun-java6-jdk, it won’t work. You might get it installed if you try running the instance as m1.small, and do it as the first task on the AMI instance. That didn’t suit me, as I discovered too late, and my motivation for wanting to install it turned out to be non-propagation of JAVA_OPTS, not the jdk. See the earlier post on setting up Ubuntu.

  • Lewis Topographical Dictionary of Ireland
    • Javascript/Ajax to sparql endpoint. Speedy.
    • Extraction and RDF generation from unstructured text with custom software.
    • Sparql endpoint on Joseki, with custom content negotiation
    • Ontology for location, roads, related locations, administrative description, natural resources, populations, peerage.
    • Ontology for Peerage – Nobility, Gentry, Commoner.
    • Find locations where peers have more than one seat
    • Did one peer know another, in what locations, degree of separation
    • Linked Open Data connections to dbPedia, GeoNames (uberblic and sindice to come) – find people in dbPedia born in 1842 for your selected location. Map on google maps with geoNames sourced wgs84 lat/long.
  • SkyTwenty
    • Location based service built as a JPA based Enterprise app on a Semantic repo (sesame native).
    • Spring with SpringSec ACL, OpenID Authorisation.
    • Location and profile tagging with Umbel Subject Concepts.
    • FOAF and SIOC based ontology
    • Semantic query console – “find locations tagged like this”, “find locations posted by people like me”
    • Scheduled queries, with customisable action on success or failure
    • Location sharing and messaging with ACL groups – identity hidden and location and date time cloaked to medium accuracy.
    • Commercial apps possible – identity hidden and location and date time cloaked to low accuracy
    • Data mining across all data for aggregate queries – very low accuracy, no app/group/person identifiable
    • To come
      • OAuth for application federation,
      • split/dual JPA – to rdbms for typical app behaviour, to semantic repo for query console
      • API documentation
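
The dbPedia lookup listed for the Lewis app (“people born in 1842 for your selected location”) could be expressed as a SPARQL query along these lines. The post does not show the query the app actually sends, so this is only a sketch: the class and method names are invented, `dbo:birthPlace` and `dbo:birthDate` are DBpedia ontology properties assumed to be the relevant ones, and the date-range filter relies on the endpoint being able to compare `xsd:date` values.

```java
class BornInQuerySketch {

    /** Builds a SPARQL SELECT for people born at a place during a given year. */
    static String bornIn(String placeUri, int year) {
        return "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
                + "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n"
                + "SELECT ?person WHERE {\n"
                + "  ?person dbo:birthPlace <" + placeUri + "> ;\n"
                + "          dbo:birthDate ?date .\n"
                + "  FILTER (?date >= \"" + year + "-01-01\"^^xsd:date"
                + " && ?date < \"" + (year + 1) + "-01-01\"^^xsd:date)\n"
                + "}";
    }

    public static void main(String[] args) {
        // The place URI is an example; the app would take it from the selected location.
        System.out.println(bornIn("http://dbpedia.org/resource/Sligo", 1842));
    }
}
```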

A report on how these were developed and the things learned is planned, warts and all.

[1]http://blog.codesta.com/codesta_weblog/2008/02/amazon-ec2—wh.html – not everything needs to be done, but you’ll get the idea. Install ddclient and follow instructions.

FOAF for historical figures.

November 7, 2009

It seems to me having recently looked at the Elphin Census of 1749 and the ideas of Improvement, eg at Caldwell Castle and Estate, that it’s the actors in the events of history who are the most important, and one of the best sources of investigation to understand many topics. Why not create a FOAF repository for them with

  • Name
  • Place of Birth
  • Place of Death
  • Where lived, when
  • Friends
  • Interests
  • Publications
  • Title or role

It would then be possible to link and see

  • which improvers knew Jonathan Swift, were members of the RDS, and had lands in some part of the country. Or
  • what Baronets knew each other who were also diametrically opposed on the question of Patriotism (in the anglo-irish 18th century sense).
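
To give a feel for the first of those questions, here is a toy in-memory sketch: each figure carries a set of people they knew and societies they belonged to, and the query is a simple filter. The figures “Improver A” and “Improver B” are placeholders, not historical claims, and a real repository would hold this as FOAF triples and answer with a query language rather than a Java loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HistoricalFoafSketch {

    /** A historical figure with friends and memberships (hypothetical model). */
    static final class Figure {
        final String name;
        final Set<String> knows;    // names of people this figure knew
        final Set<String> memberOf; // societies this figure belonged to

        Figure(String name, List<String> knows, List<String> memberOf) {
            this.name = name;
            this.knows = new HashSet<String>(knows);
            this.memberOf = new HashSet<String>(memberOf);
        }
    }

    /** Names of figures who knew the given person and belonged to the given society. */
    static List<String> knewAndBelonged(List<Figure> figures, String person, String society) {
        List<String> names = new ArrayList<String>();
        for (Figure f : figures) {
            if (f.knows.contains(person) && f.memberOf.contains(society)) {
                names.add(f.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Placeholder data only; not a statement about who actually knew whom.
        List<Figure> figures = Arrays.asList(
                new Figure("Improver A", Arrays.asList("Jonathan Swift"), Arrays.asList("RDS")),
                new Figure("Improver B", Arrays.asList("Jonathan Swift"),
                        Collections.<String>emptyList()));
        System.out.println(knewAndBelonged(figures, "Jonathan Swift", "RDS"));
    }
}
```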

FOAF ? Drupal ? WordPress ? RDFa ?

November 2, 2009

Why can’t I upload my foaf profile to wordpress ? Or can I ?

Will wordpress go the semantic way as Drupal7 is heading ?

Any modules for RDFa -ing the HTML in Drupal, without me having to manually crank it out, if I wanted to ?

Categories: semantic