Posts Tagged ‘cloud’

What’s the point: Semantic RESTful Web Services?

March 9, 2011

Well, I think it’s dawning on me that what Roy Fielding talks about (rather abstractly) [1] is what Henry Story neatly summarises and provides examples of [2] – REST SOA, with connected semantics. I’m one of those who can be accused of implementing REST not in the Roy Fielding sense of the word, but in the anything-that’s-not-WS* “sense”. I’ve done request mapping, content negotiation, resource rendering in XML, JSON (and a bunch of others), GET, PUT, POST, HEAD and so on – but never all together, and never in the true spirit of Roy. When you add the semantic web, though, you can really see that there’s something good going on here – “easy” and ubiquitous web services.

Roy talks about representations, resources and connectedness – about agents or service consumers that deal with well-known media types and links, and nothing else. REST implies that a user agent is “thin”: it understands basic, well-known types and protocols, and renders a look and feel and a behaviour that reacts to what it is fed. As he says, it should work on the “follow your nose” principle (no need for WADL [3,4]).

For a browser this means that you point it at a URL, it displays the content suitably, and it receives and displays links with appropriate CRUD capabilities for the resource and the relations it is given. For example, given a book resource, render it using the .book CSS class, and create links to add it to a shopping cart, get a contents list, or add it to a favourites list. For a chapter in the book, there may be a link to print it, to relate it to a chapter in another similar book, or to annotate it and send it to a colleague. For a daemon or agent it might mean altering the time at which it performs an action against a resource, or what action it takes. The navigation and action controls aren’t determined by business or display logic, but by the resource and its relations – the agent consuming the resource knows it has to display or follow a link, the CSS may have display capabilities based on the resource type or context, and the workflow steps will appear at the right time for the right user, under the right circumstances. Client logic deals solely with converting representations to appropriate media types and with driving application state – using relations and verbs to make transitions via links.
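
To make that concrete, here’s a minimal sketch of such a thin client – the URI, media type and link relation are all invented for illustration – that drives itself purely from what the response tells it:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// A "follow your nose" client: it understands media types and link
// relations, never URI structures. All names here are illustrative.
public class NoseFollower {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.org/books/42");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestProperty("Accept", "application/xml");

        // The server advertises the next transition, e.g.
        // Link: <http://example.org/books/42/toc>; rel="contents"
        String link = con.getHeaderField("Link");
        if (link != null && link.contains("rel=\"contents\"")) {
            String next = link.substring(link.indexOf('<') + 1, link.indexOf('>'));
            System.out.println("Would follow contents link: " + next);
        }

        BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream()));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        in.close();
    }
}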

But the thing that got me spinning, as I tried to understand the abstractness, and as I looked into JAX-RS [5] and its various implementations (well, Spring* in particular, to be honest, which in fact doesn’t do JAX-RS [6]), was that the connectedness and the follow-your-nose principle seemed absent. It’s all very well and cosy (and arguably easy) to create some platform code that maps URIs to classes, methods and HTTP verbs, and then to output XML or JSON or not (think JSP), or perhaps even Atom, OData, RDF, N3 or TTL – but where’s the linked connectedness, the thing we talk about and take for granted in the Linked Open Semantic Data world? And how does the platform know what links to create, how to generate them, and how they should be presented (if there’s a human involved)?
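
For what it’s worth, that “easy” platform mapping looks something like this in JSR 311 annotations – a sketch, with the book lookup stubbed out – and note that there is nothing in it about links or connectedness:

import javax.ws.rs.DELETE;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// JAX-RS maps URIs and HTTP verbs onto classes and methods declaratively.
@Path("/books/{id}")
public class BookResource {

    @GET
    @Produces(MediaType.APPLICATION_XML)
    public String read(@PathParam("id") String id) {
        // hypothetical lookup; a real service would return a domain object
        // and let a MessageBodyWriter handle the representation
        return "<book id='" + id + "'/>";
    }

    @DELETE
    public void remove(@PathParam("id") String id) {
        // delete the underlying resource (persistence omitted)
    }
}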

Well, Henry switches that lightbulb on for me when he illustrates, from his FOAF profile, all the foaf:knows relations [2]. In a RESTful world, a service that returns a FOAF file and reads the foaf:knows elements can decide what to do based on that predicate – it can deduce that the resource represented is a Person, and can create the links it chooses using what it knows about foaf:knows and the REST verbs – create/read/update/delete. It might allow the addition of another foaf:knows with a PUT to the URI identifying the owner, an update to a mailing list so that all those foaf:knows objects are added, or an automatic update to a trust counter against a system resource – because if Henry foaf:knows TBL in this context, then TBL must be “good for it” :-). In addition, it knows only that a URI represents that Person, and the URI could be a hypermedia link in the form of a URL, an FTP or WebDAV link, or some other protocol. Finally, this “knows” concept is really an upfront agreement about what representations are being used for the state of the application (it knows an XML schema, or an ontology, or perhaps even looks them up on the fly), but navigating through state is controlled by the interactions with the service (HTTP verbs) and the responses (status and agreed representation in the body content) received.
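
A sketch of that deduction with Jena (2.x package names assumed; the profile URL is illustrative and error handling is omitted):

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Statement;
import com.hp.hpl.jena.rdf.model.StmtIterator;

// Dereference a FOAF profile and react to the foaf:knows predicate.
public class FoafKnowsReader {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.read("http://example.org/people/henry/card"); // RDF/XML assumed
        Property knows = model.createProperty("http://xmlns.com/foaf/0.1/knows");
        StmtIterator it = model.listStatements(null, knows, (RDFNode) null);
        while (it.hasNext()) {
            Statement st = it.nextStatement();
            // each object is a dereferenceable URI - a candidate link for the
            // agent to render, follow, or PUT another foaf:knows against
            System.out.println(st.getSubject() + " knows " + st.getObject());
        }
    }
}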

At first sight those RESTful libraries don’t really need to know that much about the connectedness – they only need to map verbs and serve resources with those links embedded (RDF, anyone?), using well-known vocabularies, classes, relations and constraints – i.e. ontologies. But what about workflow: I POST an object or resource, I get a response with the ID of that resource, and I need a link that tells me where to go for the next state transition?
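
One answer, sketched with invented resource and relation names, is for the POST handler to return 201 Created with a Location header plus a typed link for the next legal transition:

import java.net.URI;
import javax.ws.rs.Consumes;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.UriInfo;

@Path("/orders")
public class OrderResource {

    @POST
    @Consumes(MediaType.APPLICATION_XML)
    public Response create(String orderXml, @Context UriInfo uriInfo) {
        String id = persist(orderXml);                  // hypothetical
        URI self = uriInfo.getAbsolutePathBuilder().path(id).build();
        return Response.created(self)                   // 201 + Location
                .header("Link", "<" + self + "/payment>; rel=\"payment\"")
                .build();
    }

    private String persist(String xml) { return "42"; } // stub
}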

So, lo and behold, we have semantic linked data and REST superadditively combined, in a loosely coupled web (or “cloud”, if you like that keyword) of semantic links, intelligent user agents that understand those links and their context, web-resolvable URIs, and value-added interlinked services – in effect a “Web Service Bus” [7]!

Now:

  1. Point your People tool at the RESTful people+location web service and it “just works”, giving you a social-network mashup of connected people and interests (provenance, trust), and then
  2. switch over to your Energy Consumption application and it also just works (based on what it has chosen to do and the well-known ontologies and resources it understands) – see how big your carbon footprint will be when you meet TBL next week in Geneva, whether you fly, drive or take the train – and maybe you’ll be able to see who you can meet on the way and who else will be sitting beside you.

But you’re not out of the woods yet; doing semantic RESTful web apps isn’t a clear open space. Your application still has to deal with authentication, input validation, long-lived database transaction control, multithreading, performance and perhaps object-relational mapping – but JAX-RS/REST takes care of the object-message mapping (the interface-to-implementation layer), your client or agent is thin but intelligent, and your middle tier contains your business logic.

Your application will also need to honour the request-response state machine, perhaps checking availability using OPTIONS, or using ETags.
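
For the ETag case, JAX-RS gives you most of it – a sketch, with the entity load stubbed out:

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.EntityTag;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Request;
import javax.ws.rs.core.Response;

@Path("/books/{id}")
public class CachedBookResource {

    @GET
    @Produces(MediaType.APPLICATION_XML)
    public Response read(@PathParam("id") String id, @Context Request request) {
        String xml = "<book id='" + id + "'/>";            // hypothetical load
        EntityTag tag = new EntityTag(Integer.toHexString(xml.hashCode()));
        // non-null if the client's If-None-Match matches: answer 304, no body
        Response.ResponseBuilder rb = request.evaluatePreconditions(tag);
        if (rb != null) {
            return rb.build();
        }
        return Response.ok(xml).tag(tag).build();          // 200 + ETag
    }
}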

You’ll need to decide how to transform from your programming model of choice – OO, perhaps – to Resource. Some of the object-to-RDF mapping in libraries like Empire [15], JenaBean [16], Sommer [22] (defunct?) and object-triple [17] may help. Perhaps this won’t be an issue for you if you can foist the RESTful resource-and-linkage proposition onto an object model and remain in the object world – but why waste processor and resources when you store data in RDF, convert it to an object on retrieval, process it, convert it to RDF/XML or JSON on the way out, then parse and walk it in a JSP before rendering it as HTML? As an OO programmer on the web you’re familiar with marshalling objects in and out of different serializations – RDF/XML/JSON/HTML – but you do want and need to minimise those transitions. Perhaps for “Big Data” we should stay in the Resource world: persist to a fast native RDF triplestore or an HPC-based system on a MapReduce cluster or some such (CouchDB [20], Heart/HBase [21], perhaps BigData [18], SHARD [19] or AllegroGraph [23]?), and talk to it with Prolog or similar – forget the Object paradigm, embrace linked, open-world Resources, and do it with REST.
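
If you do stay object-side, the hand-rolled flavour of what Empire or JenaBean automate looks roughly like this (URIs and fields invented for illustration):

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Resource;

// Project an object into triples on the way out - one of the transitions
// the paragraph above suggests minimising.
public class BookToRdf {
    static class Book {
        String uri = "http://example.org/books/42";
        String title = "A Book About REST";
    }

    public static void main(String[] args) {
        Book book = new Book();
        Model model = ModelFactory.createDefaultModel();
        Resource r = model.createResource(book.uri);
        r.addProperty(model.createProperty("http://purl.org/dc/terms/title"),
                book.title);
        model.write(System.out, "TURTLE");  // or RDF/XML, N3 ...
    }
}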

You also have to be clear that REST suits what you want to do (other architectures haven’t just been demoted to history), what your services are, what you are interfacing with, what your domain objects are, what service operations are exposed when, what workflow you need to encompass [13], and how granular you need to be. A shopping cart application will need to save items to a shopping list rather than save the items themselves (or probably the cart resource), but behind the service it will also need to update a stock control or inventory system – which isn’t exposed to your end user. So be clear about which service-level CRUD operations you need to expose to your user or “agent”, and which, if any, domain objects you need to manipulate directly.
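
As a sketch of that granularity split (all names invented): the cart items are the exposed resource, while the inventory update stays behind the service:

import java.net.URI;
import javax.ws.rs.Consumes;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.UriInfo;

@Path("/carts/{cartId}/items")
public class CartItemsResource {

    @POST
    @Consumes(MediaType.APPLICATION_XML)
    public Response add(@PathParam("cartId") String cartId, String itemXml,
                        @Context UriInfo uriInfo) {
        String itemId = addToCart(cartId, itemXml); // the exposed CRUD operation
        reserveStock(itemXml);                      // never exposed as a resource
        URI self = uriInfo.getAbsolutePathBuilder().path(itemId).build();
        return Response.created(self).build();
    }

    private String addToCart(String cartId, String xml) { return "7"; } // stub
    private void reserveStock(String xml) { }                           // stub
}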

But in the end, hopefully, you’ve still followed your enterprise principles and patterns, and you’ve adopted a long-lasting, web-scale architecture – and if you’ve added the semantic vocabulary, you’ve got the basis for successful evolution, a network effect, adaptable clients and agents, and a successful resolution to an important business case. That’s why you’re doing this, isn’t it – not because it’s cool?

Update, April 24: read Otavio’s paper on RESTfulGrounding [25], but also read Alowisheq, Millard and Tiropanis’s EXPRESS RESTful services paper [26]. RESTfulGrounding does for REST and WADL what OWL-S does for WSDL – it gets semantic descriptions into the syntactic descriptions that automated services might use to interact with a web service, and facilitates discovery, composition, monitoring and execution. EXPRESS takes a different approach: starting from an existing RESTful web service, it lets you create an OWL description that can itself be RESTfully accessed and that describes the service’s resources, relations and “parameters” (OWL DatatypeProperty and ObjectProperty). They describe an adaptation of Amazon S3 buckets and docs with EXPRESS, and compare it with the SA-REST and OWL-S approaches.

I like EXPRESS more than RESTfulGrounding because its simplicity appeals: it relies in turn on REST to underpin service description access and interaction, adheres to RESTful principles for message exchange (using TTL rather than XML), follows your nose – and all of that means I don’t have to learn much if I want to make use of it. It does need a code generator for stubs and URIs, plus a manual step to define which methods apply to which URIs, and it doesn’t do much for discovery and composition – but the authors acknowledge this and intend to work on it – and a real implementation with these tools needs to be made available so that people like me can try it out. Is there one?

I need to understand more about WADL [27,28] (why is it needed in the first place?) and about how I might actually go about building a set of services that need to be described, then discovered and composed to provide some useful value – but EXPRESS fits nicely into web-scale, lo-fi approaches that quickly gain traction, and it might make use of a CPoA kind of approach for discovery and composition.

* You’ve got other choices:

  • Apache CXF – perhaps best if you come from the WS* camp or have a mixture [8]
  • GlassFish Jersey – seems to have good traction, with hooks into Spring et al [9]
  • RESTEasy – JBoss’s JAX-RS implementation [10]
  • RESTlet – not sure about this; it seems to have good support and apparently takes a different approach – e.g. RESTlet vs Servlet – but I need more info to do it justice [11]
  • Play Framework – has good REST support, I understand from others [12]
  • Clerezza – Apache incubator project with RDF, JAX-RS, Scala and “renderlet” support. Looks interesting from an RDF PoV, but maybe not so interesting from an OOD PoV [14]

[1] http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
[2] http://blogs.sun.com/bblfish/entry/rest_apis_must_be_hypertext
[3] http://wadl.java.net/
[4] http://bitworking.org/news/193/Do-we-need-WADL
[5] http://jcp.org/en/jsr/detail?id=311
[6] http://grzegorzborkowski.blogspot.com/2009/03/test-drive-of-spring-30-m2-rest-support.html
[7] http://wisdomofganesh.blogspot.com/2010/06/wanted-esc-not-esb.html
[8] http://cxf.apache.org/
[9] http://jersey.java.net/
[10] http://www.jboss.org/resteasy
[11] http://www.restlet.org/
[12] http://www.playframework.org/documentation/1.1/routes
[13] http://www.infoq.com/articles/webber-rest-workflow
[14] http://incubator.apache.org/clerezza/
[15] https://github.com/clarkparsia/Empire
[16] http://code.google.com/p/jenabean/
[17] http://code.google.com/p/object-triple/
[18] http://www.systap.com/bigdata.htm
[19] http://www.dist-systems.bbn.com/people/krohloff/shard_overview.shtml
[20] http://couchdb.apache.org/
[21] http://wiki.apache.org/incubator/HeartProposal
[22] http://java.net/projects/sommer/
[23] http://www.franz.com/agraph/allegrograph/
[24] http://blog.cubrid.org/web-2-0/database-technology-for-large-scale-data/
[25] http://www.fullsemanticweb.com/blog/ontologies/restfulgrounding/
[26] http://ebookbrowse.com/express-expressing-restful-semantic-services-using-domain-ontologies-pdf-d12806537
[27] http://java.net/projects/wadl/
[28] http://bitworking.org/news/193/Do-we-need-WADL

Amazon EC2 t1.micro swizz!

January 6, 2011

Just got my bill from Amazon for the two instances I’m running and found I’ve been charged for 728 hours on one of them – I thought this was supposed to be free for a year! Reading the small print again (ugh), it seems you are entitled to 750 free hours, but it doesn’t explicitly say per instance. So it seems it’s per account: you can run as many instances as you like and use a total of 750 hours across them before you get charged. Then again, I suppose that’s reasonable enough – Amazon wouldn’t want every SME in the world running in the cloud for free for a year when it could be getting cash from them, would it? I must have been in a daze 🙂
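
For the record, the arithmetic fits (assuming both instances ran more or less continuously): two instances over a 31-day month is 2 × 744 = 1,488 instance-hours; take away the 750 free hours and roughly 738 hours are billable – about the figure that turned up on the bill.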


Google Funds Social Network Semantic Research at DERI

December 16, 2010

An announcement today [1] says that Google will fund a team led by Dr Alexandre Passant [2] to work on mobile social network applications. I wonder how much it will be like my own SkyTwenty platform [3] – far from being a social networking or blogging system, that is a mobile location service with anonymity, access control and data shrouding. The DERI work is apparently going to build on some Google tech – PubSubHubbub [4] – and on DERI’s own hub-based mobile blogging service, SMOB [5], v2 of which was released in January 2010 and integrates with Sindice. Distributed storage, Git-like? What about anonymity, provenance, trust, security and scalability?

[1] http://www.siliconrepublic.com/innovation/item/19694-google-funds-semantic-web/
[2] http://www.deri.ie/about/team/member/alexandre_passant/
[3] http://skytwenty.endofinternet.net:8080/treasure/moreInfo.usp
[4] http://code.google.com/p/pubsubhubbub/
[5] http://smob.me/

Installing JDKs & Tomcat on Ubuntu 10.04/10.10 AMIs on EC2

November 8, 2010

That title is a mouthful, isn’t it? A cloudful. Only slightly better than “How to go about successfully installing Sun JDK 6 on Ubuntu 10.04/10.10 (Lucid/Maverick) using an Ubuntu EBS image on Amazon EC2”.

I am trying to set up an Ubuntu 10.04 image on EC2 as a t1.micro. Amazon are giving away free time on this instance size for the first year, so it’s a good way to get to grips with things and learn the ins and outs.

So:

1) get an Amazon account 🙂

2) sign up for EC2 – this means handing over credit card details – and read the quick getting-started guide

2.1) download the cmdline tools or use the web interface. There are some things you need the tools for, because the web console doesn’t have everything. I used the web interface to set up credentials and so on, had to set up ssh on Cygwin, then switched to PuTTY (use PuTTYgen to convert the Amazon private key first). The cmdline tools need a private key and an X.509 cert, so follow the instructions to create those as well.

3) find an image – I have unsuccessfully tried

  • ami-480df921 – a basic Ubuntu i386 10.04 on EBS
  • ami-508c7839 – a basic Ubuntu i386 10.10 on EBS

4) read the usage docs about volumes, snapshots, reboots and termination, and whatever you can from the library of docs and tutorials on pricing, sizing and architecture

5) decide what you need installed for your purposes – you will install software on your instance just as if it were your own; that is to say, there is no special Amazon repo or methodology to follow

6) go for it – first change the instance so the root volume is preserved on reboot (otherwise you’ll lose it all)

/dev/sda1=vol-ddf910b5:attached:2010-11-15T17:09:44.000Z:true

6.1) I wanted Java, Tomcat, MySQL and my own software

6.1.1) Java & Tomcat – I used sudo termsel – and chose the Java-Tomcat option – what could be easier ? Well, lots actually. The task hangs, and no amount of ping, ssh or retries helps. So, reboot the instance : this seems to work.  Syslog has nothing to reveal. Try again, same thing happens. Turns out its trashed, so you need to use apt-get purge on the openjdk jre packages it attempts to install then do an apt-get update, then try again. It worked for me, but I suspect YMMV.

6.1.2) MySQL – use apt-get and install the mysql-server package

6.1.3) my software – I jarred up an instance of my webapp, WinSCP’d it to the instance (very slow), and unjarred it. Had to install the OpenJDK JDK (not the JRE) package to get jar, of course. Sigh. Added the ubuntu user to the tomcat6 group with adduser (not useradd!), copied and moved webapps and libs around the place, then fixed up local paths in my config.

Restart Tomcat and cross fingers. My JNDI context name (jdbc) isn’t being found, and an instrumentation agent I’m using via -javaagent in JAVA_OPTS is failing (getAllLoadedClasses() is returning null!). Could it be that I need the Sun JDK?…
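
For context, the agent mechanism in question is the java.lang.instrument premain hook – a minimal sketch (class name and wiring invented; the real agent jar needs a Premain-Class manifest entry):

import java.lang.instrument.Instrumentation;

// If the JVM never sees -javaagent:agent.jar, premain() never runs and the
// stored Instrumentation stays null - matching the failure described above.
public class ProbeAgent {
    private static volatile Instrumentation inst;

    public static void premain(String agentArgs, Instrumentation i) {
        inst = i;
    }

    public static Class[] loadedClasses() {
        return inst == null ? null : inst.getAllLoadedClasses();
    }
}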

6.1.4) Attempt to install the Sun JDK. apt-get install sun-java6-jdk, you might guess – but that’s too easy. Sun JDK 6 is now in an archive repository, so you need to add that:

sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
sudo apt-get update
sudo apt-get install sun-java6-jdk

and then, wait for it, your instance hangs. So I’m about to try to fix that, by following instructions from an Ubuntu Launchpad bug report.

i) In theory, the first thing to do is start your t1.micro instance as an m1.small. You need the cmdline tools for this:

ec2-run-instances -k your-keypair-name -g your-security-group-name -t m1.small ami-image-id

for example

ec2-run-instances -k .ssh/uoc_kp -g uoc-test-1 -t m1.small ami-480df871

ii) connect to the instance and install Java – you’ll need to use the same key in whatever ssh client you are using – then install it with:

sudo apt-get update
sudo apt-get install sun-java6-jdk

iii) stop the instance

iv) start it normally – that is, without setting the type to m1.small

However, when you do i) you end up with a new instance with a new volume. So how can you install the JDK on the volume you need? I’ve asked, but got no answer…

So now I try an Amazon Linux image. This, however, doesn’t have the Sun JDK available to install, so I’ll try one more time with OpenJDK – maybe this version will work:

  • launch the instance and connect
  • scp a tarball of the install from the Ubuntu instance (install the public key from the ubuntu user in .ssh/authorized_keys on the Amazon Linux AMI)
  • link /var/lib/tomcat6 to the home dir: ln -sf /var/lib/tomcat6 tomcat6
  • install tomcat6, and tweak /etc/init.d/tomcat6.conf so that CATALINA_BASE points at /var/lib/tomcat6
  • add ec2-user to the tomcat group
  • chmod 775 ~/tomcat6 so that the tomcat group can write logs etc.
  • mv the MySQL connector jar to $CATALINA_HOME/lib
  • install MySQL: sudo yum install mysql-server.i386 – but how do you set up the root user? More pain!
    • well – when you first run /etc/init.d/mysqld it tells you:
      /usr/bin/mysqladmin -u root password 'new-password'
      /usr/bin/mysqladmin -u root -h ip-10-100-10-100 password 'new-password'
  • run tomcat6 …
  • same problem – no instrumentation (my app won’t run without it)

So, digging deeper, I eventually find that my JAVA_OPTS don’t get passed through sudo. So I use -E, and they get as far as /etc/init.d/tomcat6, but no further. Looking at /usr/sbin/tomcat6, which is called by the init script, I see that /etc/tomcat6/tomcat6.conf is pulled in, and env vars can be created there, including JAVA_OPTS (which carries the -javaagent JVM param). It’s writable by the tomcat group, so this is the best place to put it, as long as you wrap the value in quotes.
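
For illustration, the line I mean looks something like this – the agent path is hypothetical, and the heap size echoes the 384m mentioned below:

# /etc/tomcat6/tomcat6.conf - pulled in by /usr/sbin/tomcat6
JAVA_OPTS="-Xmx384m -javaagent:/home/ec2-user/agent.jar"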

So now tomcat6 appears to be starting. I have other problems to solve, but at least there’s progress. A snail’s pace, but progress.

[Update]

That would have been too easy, though. On the Amazon Linux AMI I had all sorts of strange permissions problems: properties files couldn’t be read, a directory couldn’t be created. So I have given up and moved back to the Ubuntu 10.04 install for now. Knowing that the JAVA_OPTS were a problem, and seeing that /etc/tomcat6/tomcat6.conf doesn’t exist and is not used by /etc/init.d/tomcat6 on Ubuntu, all you need to do to get JAVA_OPTS through is set them in your env and use sudo with -E.

So that’s Tomcat 6 running in up to 384m, and Joseki in 256m. Wonder how they’ll fare…

For log4j, I needed to set a fully qualified path, or I got log-creation errors.

And then, finally, I had to update the Security Group to let traffic through on the right ports. Now I just need to update the code to the latest level, start a fresh repo on Sesame, restart a few more times (sigh) and test the jsp/js code for machine specifics I may have left lying around. Almost there!