
Thursday, October 6, 2011

The Internet of Things

On 4 October, the Danish Board of Technology (Teknologirådet) held the kick-off meeting for a project that aims to shed light on the possible areas of application, and the particular security requirements, that will arise when the 'Internet of Things' very soon becomes a reality. The meeting was opened by Kim Escherich of IBM, who is responsible for the 'Internet of Things' and 'Smarter Planet' at IBM. This film illustrates the concept and IBM's views on the subject.

All the IT companies and consultancies have begun to take an interest in the subject – see for instance McKinsey's blog, Sun's page on 'The Internet of Things' dating all the way back to 2003, and not least the network vendor CISCO.

But can we define the concept a little more precisely and more operationally? We understand the Internet of Things as the things – devices, processors, storage units, 'appliances' – that have an IP address and can communicate via the Internet, whether they sit on closed, encrypted networks that use the Internet protocol or are connected to the 'big' Internet.

Today there are more than 20 billion such devices, and the estimate for 2020 is 'at least 50 billion'. This should be seen in relation to today's 2 billion registered human Internet users.

But already today we have 5 billion mobile phone users, who will presumably soon also become part of what IBM calls 'a system of systems'.

When we talk about 'things', they can in principle play two different roles: as detectors or as 'effectors' (cf. Hood & Margetts). A detector is something that can collect data, measure, 'see' – and an effector is a thing that can perform an action: open a door, change a signal, start a motor, manoeuvre a vehicle. But detectors and effectors are instruments, and for us to use them in a network there must necessarily also be a logic, an intelligence, a program that, based on the collected data, can perform a computation and, from a set of rules, decide which effector to use and what it should do. The world is 'Instrumented, Connected & Intelligent'.
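As a minimal sketch of this detector–rule–effector loop (all names, the rule, and the threshold are hypothetical illustrations, not a real product API):

```python
# Minimal sketch of the detector -> rules -> effector loop described above.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Reading:
    sensor_id: str      # which detector produced the value
    value: float        # the measured quantity, e.g. a temperature

def open_vent(reading: Reading) -> None:
    """Effector: perform an action in the physical world."""
    print(f"Opening vent because {reading.sensor_id} reported {reading.value}")

# A rule couples a condition on the collected data to an effector.
Rule = tuple[Callable[[Reading], bool], Callable[[Reading], None]]

rules: list[Rule] = [
    (lambda r: r.value > 30.0, open_vent),   # if too hot, actuate the vent
]

def on_reading(reading: Reading) -> None:
    """The 'intelligence' layer: evaluate every rule against new data."""
    for condition, effector in rules:
        if condition(reading):
            effector(reading)

on_reading(Reading(sensor_id="greenhouse-temp-1", value=31.5))
```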

Once this basic description is agreed upon, it is also clear that the 'semantic web' plays a large role in defining the data interfaces and the results of the intelligent processes. It will help us ensure that widely different 'things', built for widely different purposes and for widely different user and customer groups, can potentially communicate with each other. One example is welfare technology, where an alliance of manufacturers of intelligent measuring devices – CONTINUA – has managed to set standards ensuring that the devices can be connected to, say, the same 'hub' via well-defined interfaces. An even broader definition of this type of platform is needed, so that it can also connect detectors and effectors used broadly for surveillance, control intelligent homes (see for example Philips' project), take part in intelligent grids (washing machine, refrigerator), open the garage door, control the temperature, and so on. See the presentation on 'intelligent grids', which describes the purpose of these grids and the types of equipment expected to be connected to them. It even looks as if Denmark holds a position of strength in this area – see for example this presentation:
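To make the idea of well-defined interfaces concrete, here is a hedged sketch of the plug-in pattern such a hub could use; this is an illustration only, not the actual CONTINUA specification, and all class names are made up:

```python
# Sketch: a hub that only knows a shared device interface, never vendor classes.
from abc import ABC, abstractmethod

class Device(ABC):
    """Common interface every vendor's device would implement."""

    @abstractmethod
    def device_type(self) -> str: ...

    @abstractmethod
    def read(self) -> dict:
        """Return the latest measurement in an agreed, self-describing format."""

class BloodPressureMonitor(Device):
    def device_type(self) -> str:
        return "blood-pressure"

    def read(self) -> dict:
        # Units travel with the payload so every hub interprets it the same way.
        return {"systolic": 128, "diastolic": 82, "unit": "mmHg"}

class Hub:
    def __init__(self) -> None:
        self.devices: list[Device] = []

    def register(self, device: Device) -> None:
        self.devices.append(device)

    def poll(self) -> list[dict]:
        return [{"type": d.device_type(), **d.read()} for d in self.devices]

hub = Hub()
hub.register(BloodPressureMonitor())
print(hub.poll())
```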

Among the areas of application it is worth noting that, in the food sector, chips are being introduced that can guarantee the traceability of a product's origin – 'klap din bøf' ('pat your steak') – and thereby contribute to greater consumer confidence (see the RFID project on fish). In a next version, where the chip can register what happens to the meat – how fresh it is, whether it contains toxins or additives – we can get improved food safety and reduced waste. We are no longer dependent on the very rough estimate stamped as a 'sell-by date' but can actually 'see' whether the meat is fit for human consumption. A chip in frozen raspberries could reveal a risk of poisoning. Avoiding waste and making transport more efficient is a large area, but we can also see the emergence of, for example, intelligent concrete blocks, which in prefabricated building elements can speed up the construction process and subsequently monitor durability to make maintenance more efficient. (See the Nordic project on traceability and safe food.)

Intelligent video surveillance is already a reality: video cameras only log sequences if programmed rules are violated, for example rules on vehicle speed, position on the road, people's behaviour, or attempted access to restricted areas, doors and the like.

(See for example Chicago's intelligent video surveillance system)

Among the more curious examples is 'intelligent sand', originally developed for military use: a number of small chips that can communicate with each other and contain 'logic' can judge whether an enemy tank or a platoon of soldiers is passing through an area, and then upload simple measurements via satellite. Civilian applications include, for example, temperature monitoring of forests at particularly high risk of wildfires, uploading GPS data to a satellite when a threshold value is exceeded.

The whole transport and traffic field is a class of application areas in itself – drivers in a city potentially spend twice as much time looking for a parking space as would theoretically be necessary if a system could keep track of vacant spaces and guide searching cars to them. That of course requires that a 'large' share of cars and parking spaces be equipped with detectors, connected to a rule mechanism that can control effectors in the cars.

One can imagine this system eventually being used to remove human intervention from cars entirely, so that, like aircraft, they are driven automatically. See Google's presentation of 'driverless cars'.

When do we reach 'the tipping point' – the moment when the number of 'things' is so large that solutions ('apps') are produced at sufficient quality and at sufficiently low prices to ensure continued adoption? (Already today, 10 billion apps have been downloaded!) Is it decided 'top down', as when the hospital or the municipality decides on and installs monitoring of chronic patients? Or is it the user who decides whether she wants the intelligent systems and devices? Who judges the quality? And where must we watch out for a potential disempowerment of the user? These are some of the questions that a sharp increase in adoption will raise.

To a large degree this can be seen as a 'chicken and egg' problem: not many solutions appear because there is not much demand – but judging from the explosion in the number of smartphone apps, a breakthrough in volume could happen at any time, especially if standards and interfaces are open and accessible.

Access to data collected by detectors, business intelligence, data mining, and expert systems that 'understand' the meaning of data and can make automated decisions are all part of the web 3.0 concept, and the further we get towards intelligent text analysis, concept analysis, and the drawing of inferences, the better the 'Internet of Things' will work. But at the same time we become ever more dependent on the solutions, and it demands careful thought how we secure an increasingly vital system. As Nordahl Grieg said: 'Lænker de slaver til dig, lænker de dig til sig' – if they chain slaves to you, they chain you to themselves.
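A toy sketch of what such automated inference over collected data can look like; the facts and the single rule are entirely made up for illustration (a real web 3.0 stack would use RDF/OWL reasoners, but the principle is the same):

```python
# Toy rule-based inference: derive new facts from collected ones.
facts = {("sensor-7", "located_in", "cold-store"),
         ("cold-store", "part_of", "warehouse-A")}

def infer(facts: set) -> set:
    """If x is located in y, and y is part of z, conclude x is located in z."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, p1, b) in list(inferred):
            for (c, p2, d) in list(inferred):
                if p1 == "located_in" and p2 == "part_of" and b == c:
                    new = (a, "located_in", d)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred

for triple in sorted(infer(facts) - facts):
    print(triple)   # -> ('sensor-7', 'located_in', 'warehouse-A')
```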

Crowdsourcing, where solutions are developed and refined in community-style networks, is also a possibility. The example here is from Japan, where the authorities' coverage of the Fukushima disaster did not include sufficiently precise measurements of radiation danger. A large number of Japanese therefore bought Geiger counters privately, and via a Dutch web site one could then get current information on radiation, wind direction, forecasts, etc.

All these applications demand that the protection of private data be designed in. Wherever personally identifiable data can be linked to critical measurements, opinions, actions, or even just geographical positions at given times, one must consider – taken to its fullest consequence – letting the user herself control who can access the data. Security in the 'Internet of Things' is a challenge, and 'privacy by design' is a requirement. It cannot be bolted on afterwards; it must be embedded in the solution from the start. It can be pseudonym-based, or a 'trusted third party' can hold the key. And this will be discussed in another context later in the Danish Board of Technology's exciting project.
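As a minimal sketch of the pseudonym-based variant (the names, key handling, and record format are hypothetical): measurements are stored under a keyed pseudonym, and only the trusted third party holding the key can re-link data to a person.

```python
# Sketch of pseudonym-based 'privacy by design' using only the standard library.
import hmac
import hashlib

def pseudonym(user_id: str, key: bytes) -> str:
    """Derive a stable pseudonym; without the key it cannot be reversed
    or linked back to the user."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key, held only by the trusted third party.
escrow_key = b"key-held-by-trusted-third-party"

record = {
    "subject": pseudonym("jens.hansen@example.org", escrow_key),
    "measurement": {"pulse": 72, "unit": "bpm"},
    "timestamp": "2011-10-04T10:15:00Z",
}
print(record)  # the stored record carries no directly identifiable data
```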

Thursday, July 9, 2009

The European Union and web 3.0
Already in October 2008 the European Commission launched a consultation on Web 3.0, aiming to position Europe and to communicate a strategy to the member states during the third quarter of 2009, proposing a policy approach “addressing the whole range of political and technological issues related to the move from RFID and sensing technologies to the Internet of Things”.

This indicates that at the time of the initial consultation, the Commission's view of Web 3.0 was focused more on the 'Internet of Things' than on the current understanding of a semantic network of interrelated data – or, as Tim Berners-Lee put it: “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”

During the autumn a number of external experts criticized the Commission for a lack of innovation and understanding of Web 3.0. See for instance this blog criticizing Commissioner Viviane Reding: http://blog.semantic-web.at/2008/09/30/eu-commissions-short-sighted-definition-of-web-30/

So what is the current status of the EU's view on Web 3.0? In January 2009, Gerard Santucci, Head of Unit at DG Information Society and Media, presented his unit's view on the status of the IoT at the conference on the Future of the Internet. At this point the EU seems to have begun to focus also on the tremendous amounts of data that could become available – either as a direct result of data collected through the IoT or by making data available online – so Santucci in his paper also stressed the need for security around it: “The exercise of data subject rights in the context of intelligent networked environments will require the creation of complex Identity Management systems, based on interoperability, identification, authentication, and authorisation.”


In May the EU hosted a conference on the Future of the Internet in Prague. This was a step towards a much more comprehensive understanding of the potential promises of the semantic web, and an excellent video was presented there. The Swedish Government, ahead of its presidency of the EU beginning July 1, 2009, was also present and, through State Secretary Leif Zetterberg, demonstrated a more forward-looking attitude – but again no mention of the opportunities and issues around the potential of a semantic web in Europe. His focus areas were 1) formulating a new ICT strategy for Europe for 2011-2015, 2) harmonizing the use of digital TV frequencies in Europe as well as creating a single market for the telecommunications industry, and 3) aiming at a secure infrastructure. But not a word on making public data available or securing interoperability to create a whole new Web 3.0 marketplace. Hopefully the Swedish Presidency will redress this during the fall.


In June the Commission launched an action plan to embrace 'The Internet of Things' (IoT), including a comprehensive list of sub-action items – 13 in all, ranging from governance of the IoT, over privacy, risk assessments, impact on society, the need for standards, research needed, a public-private partnership for the IoT, RFID issues, institutional awareness, and international dialogue, to measurement of the uptake. But not much information about one of the key issues in the i2010 strategy for ICT: the creation of a Single European Information Space with a strong emphasis on content online, as stated by Ken Ducatel, Head of Unit at DG INFSO responsible for the follow-on programme to i2010, in a presentation in March 2009.


In May 2009 a group of international experts called upon the EU Commission to increase its focus on solving the inherent problems of the Internet's current infrastructure and standards:

“The Internet was never designed for how it is now being used and is creaking at the seams. We have connectivity today but it is not ubiquitous; we have bandwidth but it is not limitless; we have many devices but they don’t all talk to each other. We can transfer data but the transfers are far from seamless. We have access to content but it can’t be reused easily across every device. Applications and interfaces are still not intuitive, putting barriers in the way of the Internet’s benefits for many people. And, since security was an afterthought on the current Internet, we are exposed in various ways to spam, identity theft and fraud.”

So the experts are trying to put the focus back on some of the key words for Web 3.0: seamless data exchange, re-use of content, open interfaces. Plus – of course – the challenges raised by the mere fact that we are running out of IPv4 addresses and need to plan for IPv6 as fast as possible.

The timing of this statement is quite crucial, as the hearings for the next ICT strategic plan – i2015 – are just about to start, and the more consultations with experts and country spokesmen that point to these problems, the more likely it is that the EU will see the potential of the semantic web within a short time frame.

In some areas, particularly health care, it is obvious that a clear use of semantics is a sheer necessity for cross-border interoperability and for the exchange of electronic patient records – a long-standing standardization problem even within each member country. (Denmark, for instance, has at least 5 different standards for EPRs.)


In June 2009 the EU organized a seminar on 'Ontology-driven interoperability in eHealth'. This marked the midterm of a 3-year project trying to define the standards needed and the necessary steps towards interoperability in the eHealth area. The picture above illustrates the problem. In this context the organisations involved cover the key players for Web 3.0: CEN, CENELEC, ETSI, W3C, ISO, OASIS, and the more health-care-oriented standardisation committees CONTINUA and IHE. The eHealth domain, as well as the eProcurement domain that I mentioned in my earlier blog on Web 3.0, seem to be well on their way towards practical results and methodologies that can be deployed by other domains.


While the EU is gathering its forces and collecting input from all sides on the upcoming priorities, the Web 3.0/semantic web environment is slowly progressing, day by day and month by month producing simple-to-use, practical tags and definitions. See for instance these 4 examples of practical microformats enhancing the exchange of information on personal identity, calendar entries and more:

http://www.associatedcontent.com/article/1773550/semantic_web_4_fundamental_microformats.html
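As a hedged illustration of the idea (the linked article covers the actual formats): a classic hCard microformat simply adds agreed class names such as fn, org and url to ordinary HTML so machines can extract a person's identity. The snippet and the stand-in parser below are deliberately simplified sketches built on Python's standard library, not a full microformats implementation:

```python
# Simplified sketch: extract hCard-style properties from annotated HTML.
from html.parser import HTMLParser

HCARD = """
<div class="vcard">
  <span class="fn">Jens Hansen</span>
  <span class="org">Example Org</span>
  <a class="url" href="http://example.org/jens">home page</a>
</div>
"""

class HCardExtractor(HTMLParser):
    """Collect text inside elements whose class is an hCard property."""
    PROPS = {"fn", "org", "url"}

    def __init__(self):
        super().__init__()
        self.current = None
        self.card = {}

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "")
        hit = self.PROPS.intersection(classes.split())
        if hit:
            self.current = hit.pop()

    def handle_data(self, data):
        if self.current:
            # Simplification: a real parser would take 'url' from the href.
            self.card[self.current] = data.strip()
            self.current = None

parser = HCardExtractor()
parser.feed(HCARD)
print(parser.card)  # {'fn': 'Jens Hansen', 'org': 'Example Org', 'url': 'home page'}
```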


In the meantime, led by David Osimo, a crowdsourcing initiative is taking place to create the People's Declaration on i2015. I suggest you visit the page, register, and vote for the top priorities. As of today the top-rated recommendation is: release government data in free, open, standard, readily available, accessible formats. Or – if you want to participate in an upcoming event on the semantic web and its potential – register at: http://iswc2009.semanticweb.org/

Monday, June 22, 2009

Towards web 3.0 ???


Last week W3C opened a conference in San Jose, California, on semantic technology. Also present at the conference were a number of vendors, all hoping to commercialize products within the realm of semantics. I got curious and decided to do some research to see how far web 3.0 had really come. I put a question up on Facebook, but the reactions I got were like this: “Why are we talking about web 3.0? We haven't even started exploiting web 2.0!”

So where are we?

We are definitely moving from web 1.0 (connecting computers), through web 2.0 (connecting people), towards what may be called web 3.0 – if web 3.0 means the point where you combine the value of the web 2.0 technologies with the semantic tools to find your way through all the crud and get the right and/or most likely answers to your questions. To do this you of course need standards, and the status of this work was a major part of last week's conference.

W3C owns some of the core technologies within the semantic domain, and it is seen as a major turning point these days that the basic technologies have matured: RDF (Resource Description Framework) and OWL (Web Ontology Language). Based on the standards developed here, and of course on the XML standards, the semantic query language SPARQL has been developed.
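A small hands-on sketch of these building blocks, using the widely used rdflib Python package (the data and the example.org URIs are made up):

```python
# RDF triples plus a SPARQL query, using rdflib.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/")

g = Graph()
g.add((EX.sparql, RDF.type, EX.QueryLanguage))    # a triple: subject, predicate, object
g.add((EX.sparql, RDFS.label, Literal("SPARQL")))

results = g.query("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX ex:   <http://example.org/>
    SELECT ?label WHERE { ?s a ex:QueryLanguage ; rdfs:label ?label }
""")
for row in results:
    print(row.label)    # -> SPARQL
```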

Related but independently maintained standards, such as XBRL, are also moving ahead to help clarify definitions and meanings across systems and boundaries.

The whole concept of semantics is particularly important in a multi-language setup like the European Community, and the EU as such has long promoted the use of semantics as a core technology for providing interoperability across the boundaries of Europe. One of the first pan-European areas where the principles of the semantic web are being defined and tested is eProcurement. The PEPPOL conference in January in Copenhagen kicked off the project, in which a number of IT companies and user representatives are seated. The Danish Ministry of Finance and IBM Denmark are both partners here. A general perspective on the PEPPOL project can be found here: http://www.peppolinfrastructure.com/20090129ConnectingtoPEPPOL.pdf

PEPPOL is still a development project, although the demonstration phase will begin pretty soon.

But as the multitude of globally available solutions presented in San Jose this June showed, we may now be on the brink of a real breakthrough, with a wealth of commercial applications becoming available.

Ivan Herman is responsible for W3C's semantic web programme. He gave a lengthy interview describing his view of the status, and stated it like this:

“Web 3.0 is the idea of having data on the web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications." (From the San Francisco W3C conference)

Other blogs and discussion fora on the net have been dealing with the topic for quite some time.

Phil Wainewright, in 'What to expect from web 3.0?', tries to explain what the main differences and breakthrough concepts of web 3.0 really are. He explains that web 3.0 consists of four layers. The API layer is where service providers give access to content and data; he considers this layer pretty mature, with almost no profit left for newcomers. The next layer, the aggregation services layer, contains all the goodies of web 2.0, such as RSS feeds. The third and exciting 'new' area is the application services layer, where office, EPR, CMS, and other applications and services are being offered on demand – software as a service. A fourth layer may consist of serviced clients, and this may also be an interesting new business area, according to Phil Wainewright.

As an example of one of the application areas he expects to thrive, he points to the WebEx Office SaaS:

WebEx – an example of a company focusing on delivering SaaS using web 3.0

Searching the web I also found Richard MacManus lecturing about web 3.0:

“Web 1.0 is characterized by enabling reading, web 2.0 = read/write where everybody becomes a publisher – but web 3.0??”

“Unstructured information will give way to structured information, paving the way to more intelligent computing.”

The essence of his expectations is that web sites will be turned into web services. Whether or not this should be considered a brand-new paradigm is a matter of taste, but in Richard MacManus' opinion:

“There is a difference in the solutions we are seeing in 2009: more products based on structured data (Wolfram Alpha), more real time – made sadly necessary because of the situation in Iran – (Twitter, OneRiot), better filters (FriendFeed, and Facebook which copies FF)”

So if web 3.0 is all about structuring data and making data available, then some of the new semantic techniques for storing relations between entities – triplestore technology – like 'Peter is friend with Susan' or 'Muhammed is a member of the AK81 gang', are the way ahead. Much easier than describing EDIFACT rules in the '90s, and if you can really create these links of links of links and use powerful search tools across the variety of databases, then we will surely see a new level of intelligent computing.
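For illustration, here is the 'Peter is friend with Susan' statement as an actual RDF triple in Turtle syntax, loaded into a triplestore-style graph with the rdflib package (the URIs are hypothetical example identifiers):

```python
# A single triple in Turtle, parsed into an in-memory graph.
from rdflib import Graph

turtle = """
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/people/> .

ex:peter foaf:knows ex:susan .
"""

g = Graph()
g.parse(data=turtle, format="turtle")

for subject, predicate, obj in g:
    print(subject, predicate, obj)
# -> http://example.org/people/peter http://xmlns.com/foaf/0.1/knows http://example.org/people/susan
```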

Alexander Korth, in his article 'The Web of Data: Creating Machine-Accessible Information' (April 2009), gives the following example:

One promising approach is W3C's Linking Open Data (LOD) project. The image above (at the top of the blog) illustrates the participating data sets. The data sets themselves are set up to re-use existing ontologies such as WordNet, FOAF, and SKOS, and to interconnect them.

The data sets all grant access to their knowledge bases and link to items in other data sets. The project follows the basic design principles of the World Wide Web: simplicity, tolerance, modular design, and decentralization. The LOD project currently counts more than 2 billion RDF triples, which is a lot of knowledge. (A triple is a piece of information consisting of a subject, predicate, and object, expressing a particular subject's property or relationship to another subject.) The number of participating data sets is also growing rapidly. The data sets can currently be accessed in heterogeneous ways, for example through a semantic web browser or by being crawled by a semantic search engine.
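As a hedged sketch of what querying the LOD cloud looks like in practice, here is a query against DBpedia's public SPARQL endpoint using the SPARQLWrapper package; the endpoint and resource are real but external, so their availability and exact vocabulary are not guaranteed:

```python
# Query a public LOD endpoint (DBpedia) with SPARQLWrapper.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?abstract WHERE {
        <http://dbpedia.org/resource/Semantic_Web> dbo:abstract ?abstract .
        FILTER (lang(?abstract) = "en")
    }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["abstract"]["value"][:200])   # first 200 characters
```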

In a way this is reassuring, and at the same time it illustrates the amount of work still ahead of us before we reach the 'promised land' of web 3.0. It also shows that web 3.0 is a journey, not an end in itself. And finally: we will have to master the web 2.0 techniques and embed them in all the traditional services and solutions before we get a user-friendly and intuitive way of accessing all this data. But most importantly: it puts pressure on governments (in particular), but also on private companies, to make data available. Tim Berners-Lee gave a very interesting and inspiring pitch on this matter in February this year: Tim Berners-Lee on the next web

The conclusion is that we are still only stumbling around at the foot of the mountain, but we have spotted the way ahead.

(And if you are really interested in the topic of digital libraries, maybe you should attend the Conference on Semantic Web for Digital Libraries in Trento in September.)