World Wide Web

 Shahrad Rezaei Tehrani

 

 

Understanding the World Wide Web

August 2000

The World Wide Web is a system of Internet servers that supports hypertext to access several Internet protocols on a single interface. The World Wide Web is often abbreviated as the Web, WWW, or W3.

The World Wide Web was developed in 1989 by Tim Berners-Lee of the European Particle Physics Lab (CERN) in Switzerland. The initial purpose of the Web was to use networked hypertext to facilitate communication among the lab's members, who were located in several countries. Word soon spread beyond CERN, and rapid growth in the number of both developers and users ensued. In addition to hypertext, the Web began to incorporate graphics, video, and sound. The use of the Web has since reached global proportions.

Almost every protocol type available on the Internet is accessible on the Web. Internet protocols are sets of rules that allow for intermachine communication on the Internet. The following major protocols are accessible on the Web:

 

E-mail (Simple Mail Transfer Protocol or SMTP)
Distributes electronic messages and files to one or more electronic mailboxes

 

Telnet (Telnet Protocol)
Facilitates login to a computer host to execute commands

 

FTP (File Transfer Protocol)
Transfers text or binary files between an FTP server and client

 

Usenet (Network News Transfer Protocol or NNTP)
Distributes Usenet news articles derived from topical discussions on newsgroups

 

HTTP (HyperText Transfer Protocol)
Transmits hypertext over networks. This is the protocol of the WWW.

Many other protocols are available on the Web. To name just one example, the Voice over Internet Protocol (VoIP) allows users to place a telephone call over the Web.

The World Wide Web provides a single interface for accessing all these protocols. This creates a convenient and user-friendly environment. It is no longer necessary to be conversant in these protocols within separate, command-level environments. The Web gathers together these protocols into a single system. Because of this feature, and because of the Web's ability to work with multimedia and advanced programming languages, the World Wide Web is the fastest-growing component of the Internet.

 

HYPERTEXT: THE MOTION OF THE WEB

The operation of the Web relies primarily on hypertext as its means of information retrieval. Hypertext is text containing words that connect to other documents. These words are called links and are selectable by the user. A single hypertext document can contain links to many documents. In the context of the Web, words or graphics may serve as links to other documents, images, video, and sound. Links may or may not follow a logical path, as each connection is programmed by the creator of the source document. Overall, the WWW contains a complex virtual web of connections among a vast number of documents, graphics, videos, and sounds.

Producing hypertext for the Web is accomplished by creating documents with a language called HyperText Markup Language, or HTML. With HTML, tags are placed within the text to accomplish document formatting, visual features such as font size, italics and bold, and the creation of hypertext links. Graphics may also be incorporated into an HTML document. HTML is an evolving language, with new tags being added as each upgrade of the language is developed and released. The World Wide Web Consortium, led by Tim Berners-Lee, coordinates the efforts of standardizing HTML.
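As a rough illustration of how link tags sit inside HTML text, the following Python sketch pulls the link targets out of a small page. The sample page and the class name LinkCollector are made up for this example; only the standard library's html.parser module is used.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# A made-up page used only for illustration
SAMPLE_PAGE = """
<html><body>
<p>Read about <b>hypertext</b> at <a href="http://www.w3.org/">the W3C</a>
or fetch <a href="ftp://ftp.uu.net/graphics/picasso">a file</a>.</p>
</body></html>
"""

collector = LinkCollector()
collector.feed(SAMPLE_PAGE)
print(collector.links)
```

The <b> tag in the sample is a formatting tag, while each <a> tag creates a link; the sketch ignores the former and records the latter.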

 

PAGES ON THE WEB

The World Wide Web consists of files, called pages or Web pages, containing information and links to resources throughout the Internet.

Web pages can be created by user activity. For example, if you visit a Web search engine and enter keywords on the topic of your choice, a page will be created containing the results of your search. In fact, an increasing amount of information found on the Web today is served from databases, creating temporary Web pages "on the fly" in response to user queries.
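A page created "on the fly" can be pictured with a short Python sketch. The catalog below is a hypothetical stand-in for a real database, and build_results_page is a name invented for this example; an actual search engine works on a far larger scale.

```python
from html import escape

# Hypothetical in-memory "database" of page titles and addresses
CATALOG = [
    ("House Committee on Agriculture", "http://www.house.gov/agriculture/schedule.htm"),
    ("Netscape", "http://home.netscape.com/"),
    ("Microsoft", "http://www.microsoft.com/"),
]


def build_results_page(query):
    """Return an HTML page listing catalog entries whose title contains the query."""
    needle = query.lower()
    matches = [(title, url) for title, url in CATALOG if needle in title.lower()]
    items = "\n".join(
        f'<li><a href="{escape(url)}">{escape(title)}</a></li>'
        for title, url in matches
    )
    # escape() keeps characters typed by the user from being read as HTML tags
    return (
        f"<html><head><title>Results for {escape(query)}</title></head>\n"
        f"<body><ul>\n{items}\n</ul></body></html>"
    )


print(build_results_page("agriculture"))
```

The page exists only as the text returned by the function; nothing is stored as a file on the server.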

Access to Web pages may be accomplished by:

 

  1. Entering an Internet address and retrieving a page directly
  2. Browsing through pages and selecting links to move from one page to another
  3. Searching through subject directories linked to organized collections of Web pages
  4. Entering a search statement at a search engine to retrieve pages on the topic of your choice

 

RETRIEVING DOCUMENTS ON THE WEB: THE URL

URL stands for Uniform Resource Locator. The URL specifies the Internet address of a file stored on a host computer connected to the Internet. Every file on the Internet, no matter what its access protocol, has a unique URL. Web software programs use the URL to retrieve the file from the host computer and the directory in which it resides. This file is then displayed on the monitor connected to the user's local machine.

The host name in a URL is translated into a numeric address using the Internet Domain Name System (DNS). The numeric address is the "real" address of the host computer. Since numeric strings are difficult for humans to use, alphanumeric addresses are employed by end users. Once the translation is made, the browser can contact the Web server, which sends the requested page to the user's Web browser.
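The name-to-number translation can be tried from Python with the standard socket module. This is a minimal sketch; the function name resolve_host is an assumption made for the example, and a real lookup requires a working network connection.

```python
import socket


def resolve_host(host):
    """Return the numeric IPv4 address for a host name, or None if lookup fails."""
    try:
        return socket.gethostbyname(host)
    except OSError:
        return None


# An address that is already numeric is returned unchanged, so this line
# works without a network connection.
print(resolve_host("127.0.0.1"))

# A real lookup needs network access, and the answer varies by site and over time:
# print(resolve_host("www.house.gov"))
```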

Anatomy of a URL

This is the format of the URL:

protocol://host/path/filename

For example, this is a URL on the home page of the House Committee on Agriculture of the U.S. House of Representatives:

http://www.house.gov/agriculture/schedule.htm

This URL is typical of addresses hosted in domains in the United States.

Structure of this URL:

 

  1. Protocol: http
  2. Host computer name: www
  3. Second-level domain name: house
  4. Top-level domain name: gov
  5. Directory name: agriculture
  6. File name: schedule.htm

Note how much information about the content of the file is present in this well-constructed URL.

Other examples:

telnet://library.albany.edu      the University at Albany library text-based catalog

ftp://ftp.uu.net/graphics/picasso     a file at an ftp site
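The same breakdown can be sketched in Python with the standard urllib.parse module. The function name describe_url and the dictionary keys are chosen for this example only; splitting the host name on dots is a simplification that suits the addresses shown here.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the parts named in the list above."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".") if parts.hostname else []
    # Everything before the last "/" is the directory, the rest is the file name
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "second_level_domain": labels[-2] if len(labels) >= 2 else None,
        "top_level_domain": labels[-1] if labels else None,
        "directory": directory.strip("/"),
        "filename": filename,
    }


for example in (
    "http://www.house.gov/agriculture/schedule.htm",
    "telnet://library.albany.edu",
    "ftp://ftp.uu.net/graphics/picasso",
):
    print(describe_url(example))
```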

Several top-level domains (TLDs) are common in the United States:
com commercial enterprise
edu educational institution
gov U.S. government entity
mil U.S. military entity
net network access provider
org usually nonprofit organizations

In addition, dozens of domain names have been assigned to identify and locate files stored on host computers in countries around the world. These are referred to as two-letter Internet country codes, and have been standardized by the International Organization for Standardization as ISO 3166. For example:

 

ch Switzerland
de Germany
jp Japan
uk United Kingdom

It has been proposed that new top-level domains be added to the existing domain names. The Internet Corporation for Assigned Names and Numbers (ICANN), formed at the initiative of the U.S. Government, is working out these and other issues relating to domain names.

 

HOW TO ACCESS THE WORLD WIDE WEB: WEB BROWSERS

To access the World Wide Web, you must use a Web browser. A browser is a software program that allows users to access and navigate the World Wide Web. There are two types of browsers:

 

  1. Graphical: Text, images, audio, and video are retrievable through a graphical software program such as Netscape Navigator and Internet Explorer. These browsers are available for both Windows-based and Macintosh computers. Navigation is accomplished by pointing and clicking with a mouse on highlighted words and graphics.

    You can install a graphical browser such as Netscape Navigator on your Windows-based or Macintosh machine. Navigator is available for downloading on the Netscape Web site: http://home.netscape.com/. Microsoft's Internet Explorer is available from the Microsoft Web site: http://www.microsoft.com/. To use these programs to access the Web, you need an Ethernet connection or a dialup connection known as SLIP or PPP. The latter may be obtained from an Internet Service Provider. For more information, see How to Connect to the Internet.

     

  2. Text: Lynx is a browser that provides access to the Web in text-only mode. Navigation is accomplished by highlighting emphasized words on the screen with the up and down arrow keys, and then pressing the forward arrow (or Enter) key to follow the link. This browser is available through your personal VAX or UNIX account on campus. For more information, see Guide to Using Lynx.

 

Extending the Browser: Plug-Ins

Software programs may be configured to a Web browser in order to enhance its capabilities. When the browser encounters a sound, image or video file, it hands off the data to other programs, called plug-ins, to run or display the file. Working in conjunction with plug-ins, browsers can offer a seamless multimedia experience. Many plug-ins are available for free.

Browsers identify file formats, including those that require plug-ins, by their MIME types. MIME stands for Multipurpose Internet Mail Extensions, and was originally developed to help e-mail software handle a variety of binary (non-ASCII) file attachments. The use of MIME has expanded to the Web. For example, the basic MIME type handled by Web browsers is text/html, associated with the file extension .html.

A common plug-in utilized on the Web is the Adobe Acrobat Reader. The Acrobat Reader allows you to view documents created in Adobe's Portable Document Format. These documents are the MIME type application/pdf and are associated with the file extension .pdf. When the Acrobat Reader has been configured to your browser, the program will open and display the file requested when you click on a hyperlinked file name with the suffix .pdf. The latest versions of the Acrobat Reader allow for the viewing of documents within the browser window.
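The link between file extensions, MIME types, and the program that opens a file can be sketched with Python's standard mimetypes module. The HANDLERS table and the function name choose_handler are hypothetical. Note also that a Web server normally states the MIME type in a Content-Type header, so guessing from the file name, as done here, is only an approximation of what a browser does.

```python
import mimetypes

# Hypothetical table of which program displays which MIME type
HANDLERS = {
    "text/html": "the browser itself",
    "application/pdf": "Acrobat Reader plug-in",
}


def choose_handler(filename):
    """Guess a file's MIME type from its extension and look up a handler for it."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type, HANDLERS.get(mime_type, "no handler configured")


print(choose_handler("schedule.html"))
print(choose_handler("report.pdf"))
```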

Web browsers are often standardized with a small suite of plug-ins, especially for playing multimedia content. Additional plug-ins may be obtained at the browser's Web site, at special download sites on the Web, or from the Web sites of the companies that created the programs. The number of available plug-ins is increasing rapidly.

Once a plug-in is configured to your browser, it will automatically launch when you choose to access a file type that it uses.

 

Beyond Plug-Ins: ActiveX

ActiveX is a technology developed by Microsoft which may make plug-ins less necessary. ActiveX offers the opportunity to embed animated objects, data, and computer code on Web pages. A Web browser supporting ActiveX can render most items encountered on a Web page. For example, ActiveX allows users to view three-dimensional VRML worlds in a Web browser without the use of a VRML plug-in. As another example of the power of ActiveX, this technology can allow you to view and edit PowerPoint presentations directly within your Web browser. ActiveX works best with Microsoft's Internet Explorer browser.

 

THE EXPERIENCE OF THE WEB

Today's World Wide Web presents an ever-diversified experience of multimedia, programming languages, and real-time communication. There is no question that it is a challenge to keep up with the rapid pace of developments. The following presents a brief description of some of the more important trends to watch.

 

Multimedia

The Web has become a broadcast medium. It is possible to listen to audio and watch video over the Web, both pre-recorded and live. For example, you can visit the sites of various news organizations and view the same videos shown on the nightly television news. Several plug-ins are available for viewing these videos. For example, Apple's QuickTime Player downloads files with the .mov extension and displays these as "movies" in a small window on your computer screen. QuickTime files can be quite large, and it may take patience to wait for the entire movie to download into your computer before you can view it.

The problem of slow download times has been answered by a revolutionary development in multimedia capability: streaming media. In this case, audio or video files are played as they are downloading, or streaming, into your computer. Only a small wait, called buffering, is necessary before the file begins to play. The RealPlayer plug-in plays streaming audio and video files. Extensive files such as interviews, speeches and hearings work very well with the RealPlayer. The RealPlayer is also ideal for the broadcast of real-time events. These may include press conferences, live radio and television broadcasts, concerts, etc. The Windows Media Player is another streaming media player. Many sites offer the option to use one player or the other. A list of sites that make use of these programs is available on the page, Multimedia on the Web.
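The idea of playing a file while it is still arriving can be sketched in a few lines of Python. This is a toy model, not how RealPlayer or any other player is implemented: the function name stream_playback, the chunk size, and the buffer size are all assumptions, and an in-memory byte string stands in for the network.

```python
import io
from collections import deque


def stream_playback(source, chunk_size=4, buffer_chunks=2):
    """Yield chunks for playback while the rest of the file is still arriving.

    Playback begins once more than `buffer_chunks` chunks have arrived, and
    that many chunks are kept in reserve while the download continues.
    """
    buffer = deque()
    while True:
        chunk = source.read(chunk_size)  # stands in for data arriving from the network
        if not chunk:
            break
        buffer.append(chunk)
        if len(buffer) > buffer_chunks:
            yield buffer.popleft()  # play the oldest chunk
    # The download has finished: play whatever is still buffered
    while buffer:
        yield buffer.popleft()


media = io.BytesIO(b"0123456789abcdef")
played = b"".join(stream_playback(media))
print(played)
```

The first chunk is played before the whole file has been read, which is the difference between streaming and waiting for a complete download.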

Shockwave presents another multimedia experience. Shockwave allows for the creation and implementation of an entire multimedia display combining graphics, animation and sound.

Sound files, including music, may also be heard on the Web. It is not uncommon to visit a Web page and hear background music. Sound files are also available for downloading independent of Web page visits. Sound files of many types are supported by the Web with the appropriate plug-ins. The MP3 file format, and the choice of supporting plug-ins, is the latest music trend to sweep the Web. The famous Napster site allows for the exchange of MP3 files.

Live cams are another aspect of the multimedia experience available on the Web. Live cams are video cameras that send their data in real time to a Web server. These cams may appear in all kinds of locations, both serious and whimsical: an office, on top of a building, a scenic locale, a special event, and so on.

 

Programming languages and functions

The use of existing and new programming languages has extended the capabilities of the Web. What follows is a basic guide to a group of the more common languages and functions in use on the Web today.

CGI, Active Server Pages: CGI (Common Gateway Interface) refers to a specification by which programs can communicate with a Web server. A CGI program, or script, is any program designed to accept and return data that conforms to the CGI specification. The program can be written in any programming language, including C, Perl, and Visual Basic Script. A common use for a CGI script is to process an interactive form on a Web page. For example, you might fill out a form ordering a book through Interlibrary Loan. The script processes your information and sends it to a designated e-mail address in the Interlibrary Loan department.
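A CGI script can be very small. The Python sketch below reads form data from the QUERY_STRING environment variable, which a Web server sets when it runs a CGI program for a GET request, and returns an HTML reply. The form field name "title" and the function name handle_request are made up for this example, and the step of sending e-mail described above is left out.

```python
import os
from html import escape
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI response from the QUERY_STRING variable set by the Web server."""
    form = parse_qs(environ.get("QUERY_STRING", ""))
    # "title" is a hypothetical form field holding the requested book title
    title = form.get("title", ["(no title given)"])[0]
    body = f"<html><body><p>Request received for: {escape(title)}</p></body></html>"
    # A CGI response is a header block, a blank line, then the document
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a Web server, os.environ carries the details of the request
    print(handle_request(os.environ), end="")
```

Because the function takes the environment as an argument, it can be tried without a Web server, for example with handle_request({"QUERY_STRING": "title=Moby+Dick"}).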

Another type of dynamically generated Web page is called Active Server Pages (ASP). Developed by Microsoft, ASPs are HTML pages that include scripting and create interactive Web server applications. The scripts run on the server, rather than on the Web browser, to generate the HTML pages sent to browsers. VBScript and JScript (Microsoft's implementation of JavaScript) are often used for the scripting. ASPs end in the file extension .asp.

Java/Java Applets: Java is probably the most famous of the programming languages of the Web. Java is an object-oriented programming language similar to C++. Developed by Sun Microsystems, the aim of Java is to create programs that will be platform independent. The Java motto is, "Write once, run anywhere." A perfect Java program should work equally well on a PC, Macintosh, Unix, and so on, without any additional programming. This goal has yet to be realized. Java can be used to write applications for both Web and non-Web use.

Web-based Java applications are usually in the form of Java applets. These are small Java programs called from an HTML page that can be downloaded from a Web server and run on a Java-compatible Web browser. A few examples include live newsfeeds, moving images with sound, calculators, charts and spreadsheets, and interactive visual displays. Java applets can tend to load slowly, but programming improvements should lead to a shortened loading time.

JavaScript/JScript: JavaScript is a programming language created by Netscape Communications. Small programs written in this language are embedded within an HTML page, or called externally from the page, to enhance the page's functionality. Examples of JavaScript include moving tickers, drop-down menus, real-time calendars and clocks, and mouse-over interactions. JScript is a similar language developed by Microsoft and works with the company's Internet Explorer browser.

VRML: VRML (Virtual Reality Modeling Language) allows for the creation of three-dimensional worlds. These may be linked from Web pages and displayed with a VRML viewer. Netscape Communicator comes with the Cosmo viewer for experiencing these three-dimensional worlds. One of the most interesting aspects of VRML is the option to "enter" the world and control your movements within the world.

XML: XML (eXtensible Markup Language) is a Web page creation language that enables designers to create their own customized tags to provide functionality not available with HTML. XML is a language of data structure and exchange, and allows developers to separate form from content. At present, this language is little used as Web browsers are only beginning to support it. In May 1999, however, the W3 Consortium announced that HTML 4.0 has been recast as an XML application called XHTML. This move will have a significant impact on the future of both XML and HTML.
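To give a feel for customized tags, the Python sketch below reads a small XML document with the standard xml.etree.ElementTree module. The tag names (catalog, book, title, year) and the data are invented for this example; XML itself does not define them.

```python
import xml.etree.ElementTree as ET

# A made-up document: the tags describe what the data is, not how it looks
CATALOG_XML = """
<catalog>
  <book isbn="0000000000">
    <title>Understanding the Web</title>
    <year>2000</year>
  </book>
  <book isbn="1111111111">
    <title>Hypertext Basics</title>
    <year>1999</year>
  </book>
</catalog>
"""

root = ET.fromstring(CATALOG_XML)
titles = [book.findtext("title") for book in root.findall("book")]
print(titles)
```

Nothing in the document says how a title should be displayed. That decision is left to whatever program or style sheet presents the data, which is one sense in which XML separates form from content.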

 

Real-Time Communication

Text, audio and video communication can occur in real time on the Web. This capability allows people to conference and collaborate in real time. In general, the faster the Internet connection, the more successful the experience.

At their simplest, chat programs allow multiple users to type to each other in real time. Internet Relay Chat and America Online's Instant Messenger are prime examples of this type of program. The development of a messaging protocol is underway. Such a protocol would allow for the expansion of this capability throughout the Internet.

More enhanced real-time communication offers an audio and/or video component. CU-SeeMe is one of the most popular software programs of this type. Even more elaborate are programs that allow for true real-time collaboration. Microsoft's NetMeeting and Netscape's Conference (available with Communicator) are good examples of this.

Featured collaboration tools include:

 
audio: conduct a telephone conversation on the Web
video: view your audience
file transfer: send files back and forth among participants
chat: type in real time
whiteboard: draw, mark up, and save images on a shared window or board
document/application sharing: view and use a program on another's desktop machine
collaborative Web browsing: visit Web pages together

Currently no standard exists that will work among all conferencing programs.

Push: Push refers to a technology that sends data to a program without the program's request. This is the opposite of the typical "pull" of the Web, in which the user clicks on a link to request a file from a server. With push, the data is sent automatically. Content is sent through a "channel." The early Web-based implementation of push was commercial. Push can also be used to deliver software upgrades to a desktop machine.
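The contrast between pull and push can be modeled in a few lines of Python. This sketch runs inside a single program and only imitates the pattern; the Channel class and its method names are assumptions made for the example, not part of any push product.

```python
class Channel:
    """A toy channel: subscribers register once, then receive items unasked."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        # The receiving program signs up one time
        self._subscribers.append(callback)

    def publish(self, item):
        # The sender decides when data goes out; no request is made for each item
        for callback in self._subscribers:
            callback(item)


inbox = []
news = Channel()
news.subscribe(inbox.append)

news.publish("headline 1")
news.publish("software upgrade available")
print(inbox)
```

With pull, the receiving side would call the sender each time it wanted something; here the sender calls the receiver.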

 

Laura Cohen
lcohen@albany.edu

 

This page was last updated on 03-Mar-2002.

E-Mail:        Shahrad_rezaei@hotmail.com

YAHOO ID: Shrezai@yahoo.com