Tuesday, August 30, 2005

Why Map the Web?

The Webgrapher blog will discuss issues relating to what some might consider a narrow domain of networks within cyberspace: static, public HTML documents accessible via the World Wide Web. Other domains (application spaces) of cyberspace, including email, instant messaging, and peer-to-peer networks, have been and are being mapped by academics and industry researchers, but they will not be the focus of this blog.

Why the relatively narrow approach? This is one question we will explore in this blog, but the simplest answer is that the Web is the only domain I'm currently aware of for which solid tools with transparent, open mapping algorithms are publicly available. Specifically, I am referring to the Issue Crawler software developed by Richard Rogers and the Govcom.org foundation. A better-known tool in blog circles, TouchGraph, is, in my opinion, not suitable for academic research, specifically because it makes use of Google's proprietary "similar pages" algorithm. (If you are aware of other publicly available tools that can map local portions of the web graph, please let me know.)

Another reason to map networks within the Web instead of networks mediated by email or peer-to-peer applications is that the Web tends to be more public and spans a wider range of media objects (text, video, sound, interactive databases) than many other Internet applications. If you are interested in studying broad social structures such as global political policy networks, the Web is an ideal place to look. Of course, this statement is subject to challenge and raises the question of why we would want to map any kind of network within the Internet in the first place. These issues will come up in the course of this blog.

In the not-too-distant future, expect a blog entry placing web graph networks within the context of cyberspace networks in general. This will include a distinction between the terms cyberspace and Internet, where the Internet is a real-time instantiation of the more abstract cultural object, cyberspace.

As this is a blog, expect the content here to be fairly informal and somewhat unstructured. Post your comments and questions and I'll do my best to follow up on them. More to come.


