A Framework for Web Science

(130 pages)

Tim Berners-Lee1, Wendy Hall2,

James A. Hendler3, Kieron O’Hara4,

Nigel Shadbolt4 and Daniel J. Weitzner5

1 Computer Science and Artificial Intelligence Laboratory, Massachusetts

Institute of Technology

2 School of Electronics and Computer Science, University of Southampton

3 Department of Computer Science, Rensselaer Polytechnic Institute

4 School of Electronics and Computer Science, University of Southampton

5 Computer Science and Artificial Intelligence Laboratory, Massachusetts

Institute of Technology

Foundations and Trends® in Web Science

Vol. 1, No. 1 (2006) 1–130

Abstract

This text sets out a series of approaches to the analysis and synthesis

of the World Wide Web, and other web-like information structures.

A comprehensive set of research questions is outlined, together with

a sub-disciplinary breakdown, emphasising the multi-faceted nature of

the Web, and the multi-disciplinary nature of its study and development.

These questions and approaches together set out an agenda for

Web Science, the science of decentralised information systems. Web

Science is required both as a way to understand the Web, and as a way

to focus its development on key communicational and representational

requirements. The text surveys central engineering issues, such as the

development of the Semantic Web, Web services and P2P. Analytic

approaches to discover the Web’s topology, or its graph-like structures,

are examined. Finally, the Web as a technology is essentially socially

embedded; therefore various issues and requirements for Web use and

governance are also reviewed.

Web Science: An Interdisciplinary Approach to Understanding the World Wide Web

(8 pages)

ABSTRACT

Despite the huge success of the World Wide Web as a technology,

and the significant amount of computing infrastructure on which it

sits, the Web as an entity remains surprisingly unstudied. In this

article, we look at some of the issues that need to be explored to

model the Web as a whole, to keep it growing, and to understand

its continuing social impact. We argue that a "systems" approach,

in the sense of "systems biology", is needed if we are to be able to

understand and engineer the future of the Web.

The Semantic Web Revisited

(6 pages)

The Semantic Web is a Web of actionable

information—information derived from data through

a semantic theory for interpreting the symbols. The semantic

theory provides an account of “meaning” in which the

logical connection of terms establishes interoperability

between systems. This was not a new vision. Tim Berners-Lee

articulated it at the very first World Wide Web Conference

in 1994. This simple idea, however, remains largely

unrealized.

SIR TIM BERNERS-LEE HEADS INTEL’S LIST OF MOST INFLUENTIAL TECHNOLOGISTS

From Hexus

LONDON, Jan. 29, 2008 – Intel Corporation today revealed Sir Tim Berners-Lee, the man widely regarded as the founder of the modern-day World Wide Web, as the most influential person in technology over the past 150 years for his impact on society and ground-breaking technology.

As Intel continues to celebrate the innovation of its 45 nanometer (nm) next-generation family of quad-core processors, it brought together a panel of experts including academics, journalists and independent third parties to vote on technology’s 45 most influential individuals.

In the judging session held last week in London, the panel’s full top ten comprised:

Tim Berners-Lee – Founder of the modern-day World Wide Web
Sergey Brin – Co-founder of Google
Larry Page – Co-founder of Google
Guglielmo Marconi – Inventor of the Radiotelegraph system
Jack Kilby – Inventor of the Integrated Circuit and Calculator
Gordon Moore – Co-founder of Intel
Alan Turing – Played a major role in deciphering German codes in WWII
Robert Noyce – Co-founder of Intel
William Shockley – Co-Inventor of the Transistor
Don Estridge – Led the development of the IBM computer

The two founders of Intel®, Gordon Moore and Robert Noyce, both featured in the top ten.

Moore – famous for Moore’s Law, a key factor in the rapid growth of the PC industry – was

voted 6th, while Noyce, who co-developed the integrated circuit, was placed 8th.

Web Science Workshop

(9 pages)

September 2005

Web and society (a conclusion)

The discussion on the Web and society was introduced by the chairman, Weitzner, who

reviewed earlier discussions on public policy questions of privacy and copyright, and

argued that we need to give people control over how they interact with information

resources, while adding transparency. Feigenbaum argued that questions of

accountability and transparency are linked: transparency enables accountability. It

was generally agreed that it was possible to write rules to cover policies and policy

violations, but that the problem was how to get people to use such rules.

Feigenbaum also noted that accountability was controversial, particularly at high

levels of abstraction. Alternative goals, such as fairness, could be pursued. Weitzner

distinguished between accountability and enforcement, and noted that the pursuit of

particular goals needed experimental and empirical work to address the question: what

happens to social interactions with particular architectures?

Berners-Lee and Milner discussed the use of process calculi to model small parts of

systems, as a potential method for helping with these social questions.

Goble noted that any kind of social regulation infrastructure would have to be

lightweight. Feigenbaum suggested that many aspects of the infrastructure, such as

identification and authentication, aren’t hard to use, although Hendler cautioned that

controlling identities is hard in distributed open systems. Weitzner noted the useful

property of the general SW architecture that allows generic high level rules covering

social interactions to be written.

Lynchpin talk by Tim Berners-Lee

Tim Berners-Lee’s talk highlighted different phases of information structure, moving

from characters, to linear structures, to trees and hierarchies, and lastly webs, which

can grow without a central bottleneck. The goal of a web is serendipitous reuse, but a

drawback is that it comes with global obligations, such as maintaining Web content,

which are important for allowing the serendipity to happen. The SW is a web of logic,

very different from hypertext. We need the same standard for symbol space as the

WWW. We need to be able to map URIs to anything. Looking up URIs is still the

critical architecture.

Berners-Lee also offered some comments on the NLP/SW debate stemming from

Wilks’ lynchpin talk, arguing that the two are very separate.

NLP | SW
Words | Terms of logic
Meaning is use | Meaning is defined in words, or code, or specific use
No ownership of words | URI ownership
“Hydrogen” | pt:Hydrogen
Defining words in ontology is never complete and a waste of time | Defining terms is never perfect but useful
NL is constantly changing | Ontologies are basically static
Can’t benefit from injecting logic | Can’t benefit from cloudy statistics
Machine finds stuff | Machine infers stuff
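
As a rough illustration of the contrast drawn in the comparison above (the word “Hydrogen” versus the URI-owned term pt:Hydrogen, and a machine that infers rather than finds), the following Python sketch expands a prefixed name into a full URI and applies a single inference rule to a handful of statements. The namespace URI, the terms pt:Element and pt:Substance, the predicate names and the function names are all assumptions made for this example; none of them comes from the workshop report.

```python
# Hypothetical prefix table. The report gives only the prefixed name
# "pt:Hydrogen", so the namespace URI here is an invented placeholder.
PREFIXES = {"pt": "http://example.org/periodic-table#"}


def expand(qname: str) -> str:
    """Expand a prefixed name such as 'pt:Hydrogen' into a full URI."""
    prefix, _, local = qname.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local


# Statements as (subject, predicate, object) triples.
# Every term below is illustrative, not taken from a real ontology.
TRIPLES = {
    (expand("pt:Hydrogen"), "type", expand("pt:Element")),
    (expand("pt:Element"), "subClassOf", expand("pt:Substance")),
}


def infer_types(triples):
    """Apply one rule until nothing new is derived:
    if (x type C) and (C subClassOf D), then (x type D)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for (x, p1, c) in list(inferred):
            if p1 != "type":
                continue
            for (c2, p2, d) in list(inferred):
                new_fact = (x, "type", d)
                if p2 == "subClassOf" and c2 == c and new_fact not in inferred:
                    inferred.add(new_fact)
                    changed = True
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer_types(TRIPLES)):
        print(triple)
```

The derived statement that pt:Hydrogen is a pt:Substance is never written down anywhere; it follows from the two stated triples, which is the sense in which the machine "infers stuff" here. A text search over the same data would only find the strings that are already present.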