Understanding Interactive TV

By D.M. Schwartz.

Draft dated 16 October 2000

Synopsis
What is "interactive television"? Ask that question of ten industry participants, and you will likely get ten different answers. Presently, there is no general agreement on what constitutes interactive TV. The wide range of possibilities for interactive TV is part of what makes the concept so attractive, and so difficult to implement. This paper discusses the multiple definitions of interactive TV and various scenarios for its commercialization.

A Brief History
The history of interactive TV contains a number of distinct threads that have intertwined from time to time, but on the whole represent separate sequences of development. Interaction within the context of broadcast TV has had little to do with interactive video as it has developed in the computer industry. Similarly, until the past few years, interactivity based on closed circuit TV has been a separate thread. The telephony industry independently developed interactive video in the form of telephones with pictures. The computer industry's form of interactive video consisted entirely of graphical user interfaces to data processing applications, until the recent addition of multimedia data types. Then, there is the videogame industry, which over the course of 25 years has added whole new dimensions to interactivity, including high fidelity sound, force feedback, body motion input devices and 3-D imaging headsets.

Ever since the very first flickering of television about 70 years ago, the engineers and promoters of TV have talked about 2-way video. Their vision of bi-directional video communications was realized in closed circuit and private wireless TV in the 1950s. In the 1960s, AT&T attempted to commercialize 2-way video in the form of "videophones", using an adaptation of their voice telephony system. Until recently, the high cost of video cameras, the limited broadcast bandwidth available, and the high cost of cable for 2-way TV limited its application to proprietary systems used mainly for security within or around a specific facility. In these systems, the interactivity is that of people communicating with each other, and additionally there can be a means to remotely control the cameras.

The utility of video for surveillance made what is now referred to as "telepresence" a major form of interactive closed-circuit TV. Telepresence systems enable people to interact safely from a distance with objects in a hostile or dangerous environment using remote-controlled video cameras in combination with robotic devices such as manipulator arms and grippers. From bomb handling and disposal to underwater salvage, telepresence has proven itself to be an economically viable form of interactive TV. Telepresence in the form of 2-way video chat plus user-directed performance is now a profitable segment of Internet e-commerce, mainly for the purpose of adult entertainment.

Interaction with the content of broadcast television began with the first use of telephone call-ins by the host of the "Today Show" in the late 1950s. Callers responded to questions broadcast over the air by dialing the originating TV station's switchboard, and if they were deemed suitable for going on-air, the phone on the host's desk would ring. A 7-second broadcast delay was used to prevent inappropriate material from being aired. The telephone call-in form of interactive participation continues to this day on a variety of TV programs, ranging from games to shopping.

In the 1960s, interactive video took on an entirely different meaning within the computer industry. Video in the computer industry meant "glass teletype", as cathode ray tubes (CRTs) started to take over from online printers as one of the primary output devices for mainframe computers. Almost immediately, in university computing centers everywhere, the text-based computer game, "Space War", previously confined to one or more users at teletype machines, became a CRT and keyboard-based game. Within a few years, the text-only video displays of space warfare encounters gained primitive character-based graphics depicting ship and missile trajectories. Beginning in the 1970s with Atari's "Pong", computer games made the leap from mainframe computers to microcomputers that could use the home TV screen as the video display.

Videogame hardware morphed into home computers in the 1980s. Many of these early systems used the home TV as the display device. True to their heritage, games quickly became the second most popular use of home PCs. And, since computers can use modems to communicate with each other over telephone lines, interactivity extended to multiple remote users, so games could be played among a group of people at a distance from one another. Long before the rise of the Internet, bulletin board-based games offered chat between players and game downloading services. For a few years, a single architecture combined a broadcast receiver color TV with a microcomputer built into a keyboard and a modem to deliver conventional TV, electronic messaging, and videogames to the living room.

By 1996, when the Internet and Web browsers became mainstream PC applications, home computers no longer used the TV as the video display, simply because TVs could not support SVGA, and then XGA, screen resolutions. The Web developed into a powerful interactive medium in its own right, distinct from videogames, TV, and non-networked computers. Now, it is estimated that over 50% of American homes have both a TV and a PC - and it's the PC that is connected to the Web.

Over the past three years, interactivity has been added to both community-wide cable television and satellite TV. It is possible to connect to the Internet using a set-top box, a keyboard and the home TV as the display device. Typically, but not in all cases, the set-top box uses a telephone line to gain Internet access. All satellite TV systems require a phone line to support interactivity. Other interactive TV systems offer interactivity that enhances TV with "click to purchase" capability, without a keyboard, using only the remote control.

Levels of Interactivity
Aside from the hardware, it is useful to consider interactive TV on a purely functional basis. Acknowledging that some functions are more difficult and costly to implement than others, levels of interactivity can be defined, such as low, medium, and high. These distinctions are somewhat arbitrary, as the levels blend into one another and features that are easy to supply within one community system may be next to impossible to deliver in another.

Low Interactivity TV
One big step up from plain old TV, low interactivity TV offers basic Internet services, gaming, and near video on demand:

- NTSC standard images
- Scheduled-by-popular-demand video movies
- Stereo sound system
- Videogames at NTSC resolution
- Wired game controller
- Wired QWERTY Keyboard
- Access to email, Web pages and search engine via a gateway server that pre-processes pages for NTSC display

Medium Interactivity TV
With advanced gaming, 2-way video, video on demand and support for all Internet services, the medium level of interactive TV features:

- Mid-resolution images at 800 by 600, non-interlaced, 30 frames per second
- Picture-in-picture with two 320 by 240 window capability
- True video on demand (start any cable movie, any time)
- 2-way video chat
- 3-speaker sound system
- DVD player
- 3-D world games at 800 by 600 resolution with single remote player support
- Wireless game controller with force feedback
- Wired QWERTY Keyboard
- USB port for still camera and Webcam video input and optional output devices such as a printer, scanner or disk drive
- Access to Web-based productivity suite, including email, search engine, word processor, spreadsheet, financial management tools

High Interactivity TV
Capable of full-immersion gaming, personal video stream control, video conferencing and support for all Internet services, the highest level of interactive TV features:

- High-resolution images at 1024 by 768, non-interlaced, 30 frames per second
- Picture-in-picture with four 320 by 240 window capability
- Tape-deck-like control of video on demand
- Real-time branching video (for example, click to change scenes)
- 5-way video chat
- Theater-quality sound system
- DVD-RAM-based video and data recording
- 3-D world games at 1024 by 768 resolution, multiple remote player support
- Wireless game controller with force feedback
- Wireless QWERTY Keyboard
- Wireless headset for voice recognition
- Voice synthesis for user feedback and text to speech
- Firewire port for digital video camera input and other peripherals, like printers, scanners and advanced game controllers
- Access to Web-based productivity suite, including email, search engine, word processor, spreadsheet, financial management tools

Commercialization of Interactive TV
Interactive TV is widely perceived as a potentially huge revenue generator for providers and consumer electronics companies, and its implementation is being approached from a number of directions. Arguably, about 500,000 users already have interactive TV at the first level described above. Those users own a Microsoft WebTV set top box, a stereo TV, and a stand-alone videogame system, and subscribe to a cable TV service with subscription or pay-per-view movies. In the near future, with some digital cable services, neither the external videogame system nor the WebTV box will be required. The cable company will offer those features as options for digital cable, reducing living room clutter substantially. Indeed, some hotels offer such services today.

The medium and high interactivity TV systems may or may not be costly to realize for both the consumer and the provider, depending on how they are implemented. The beginnings of several implementations are now visible, as are their unique business models. It is convenient to group these interactive TV systems by type, without reference to specific offerings. Broadly speaking, there are three types of interactive TV systems in the process of being commercialized: head end digital cable systems, broadband Internet systems, and hybrid satellite/dial-up systems. Each faces its own set of challenges, and all of them share some common barriers.

The barriers to commercialization faced by proposed medium and highly interactive TV systems include the slow rate of adoption of HDTV due to the high price of digital TV sets, the high cost of provider-side infrastructure, Internet bandwidth costs and Internet congestion. The high cost of digital TV sets is a function of low demand, which in turn is caused by the lack of HDTV broadcasting, which is slow to get online because there are not enough digital TV viewers. This circular problem may be alleviated by the increasing availability of other sources of high resolution content that could drive HDTV sales.

High image resolution content is becoming increasingly attractive in two separate venues: on DVD for home theater, and on the Internet in the form of Web pages. The problem with these content sources as drivers for adoption of digital TV sets is twofold. To date, consumers would rather spend their money on very large NTSC screens than on smaller HDTV screens, and PC video monitors are very inexpensive. Add the fact that most HDTV sets don't accept input from a PC, and digital TV becomes an unattractive alternative. This may change, as another high definition video source becomes widely available: DVD-based videogame machines. Consumers willingly spend hundreds of dollars per year on videogames, after spending hundreds to buy the platform itself. Couple a high bandwidth Internet connection and an HDTV set to the videogame system and most of the functions of a medium to highly interactive TV are there. Another way around the expensive TV set problem is to use a PC monitor as the display for either the cable interface box or the videogame platform.

Provider-side infrastructure costs impact the different systems in various ways. For the head end digital cable services, upgrading their plants to support digital TV, video on demand, games and Internet access is a quadruple burden that can cost millions of dollars per head end. Cost recovery means persuading their subscribers to pay for each new premium service. However, cable companies are already encountering serious resistance above basic cable pricing. For the interactive TV over broadband Internet contingent, infrastructure is not as big a barrier, given that over 3 million high bandwidth connections and the back-end support for them already exist. So far, PC users seem willing to pay the freight to get the service. For interactive TV, in the form of interactive video streaming over the Internet, the lack of digital TV sets is not an issue, as the PC already has its dedicated high-definition monitor right out of the box.

On the down side, interactive TV over the Internet is subject to "netlock", traffic congestion that limits the number of high-quality video streams within any given metro area. In addition, every video stream on the Internet costs its provider about $0.50 per hour per viewer. Although this cost is decreasing every few months, it must be covered by advertising or pay-per-view fees. So far, only pay-per-view adult entertainment has been able to deliver a profit margin. As more viewers gain access to broadband Internet, and bandwidth costs go down, other pay-per-view content, such as sports and concerts, will become feasible. Obviously, these two barriers do not exist for head end based cable TV delivery of interactive TV.
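
The per-viewer cost arithmetic can be made concrete with a short sketch. The $0.50 per viewer-hour figure is the one quoted above; the audience size and program length are hypothetical values chosen only for illustration, and the function names are this sketch's own.

```python
# Illustrative break-even arithmetic for Internet video streaming.
COST_PER_VIEWER_HOUR = 0.50  # dollars; the figure quoted in the text


def streaming_cost(viewers: int, hours: float,
                   cost_per_viewer_hour: float = COST_PER_VIEWER_HOUR) -> float:
    """Total delivery cost in dollars for a given audience and duration."""
    return viewers * hours * cost_per_viewer_hour


def break_even_price(hours: float,
                     cost_per_viewer_hour: float = COST_PER_VIEWER_HOUR) -> float:
    """Minimum per-viewer fee that covers delivery cost alone."""
    return hours * cost_per_viewer_hour


if __name__ == "__main__":
    viewers, hours = 10_000, 2.0  # hypothetical concert audience and length
    print(f"Delivery cost: ${streaming_cost(viewers, hours):,.2f}")
    print(f"Break-even fee per viewer: ${break_even_price(hours):.2f}")
```

Under these assumptions, a two-hour event must earn at least a dollar per viewer from fees or advertising before content, production or marketing costs are counted.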

The hybrid of satellite digital video delivery and a dialup modem back channel for user data avoids many of the infrastructure cost, Internet bandwidth cost and Internet congestion issues associated with the digital head end cable approach and broadband Internet systems. The digital video servers can be concentrated cost-effectively at a few data centers, and dialup modem data support is ubiquitous. The remaining problem for the hybrid satellite-modem system is mainly performance, and the performance issue centers on latency. Network latency is the time it takes for the user's input to affect the interactive video stream. With a low-bandwidth modem as the control channel and a satellite up and down link gating the video, shoot-em-up and racing video games are virtually impossible to implement, leaving only card games and other lesser forms of interaction.
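
A rough latency budget shows why. The sketch below assumes a geostationary satellite directly overhead, so the result is a lower bound on propagation delay. The modem and server delays are assumed round numbers, not measurements, and the function names are hypothetical.

```python
# Rough latency budget for a satellite-downlink / dial-up-back-channel system.
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786.0  # altitude of geostationary orbit


def satellite_hop_ms(altitude_km: float = GEO_ALTITUDE_KM) -> float:
    """Minimum propagation delay for one uplink plus one downlink, in ms.

    Assumes the satellite is directly overhead; real slant paths are longer.
    """
    return 2 * altitude_km / SPEED_OF_LIGHT_KM_S * 1000


def input_to_screen_ms(modem_ms: float, server_ms: float) -> float:
    """Time from a user's input to the resulting change in the video."""
    return modem_ms + server_ms + satellite_hop_ms()


if __name__ == "__main__":
    # modem_ms and server_ms are assumed values for illustration only.
    total = input_to_screen_ms(modem_ms=150.0, server_ms=50.0)
    print(f"Satellite hop alone: {satellite_hop_ms():.0f} ms")
    print(f"Input to screen:     {total:.0f} ms")
```

The satellite hop alone is roughly a quarter of a second before any modem, server or decoding delay is added, which is acceptable for turn-based games but not for fast-action ones.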

Industry Standards
Given that the whole point of standardization is common definitions, it is no surprise that there are a number of proposed and effective standards pertaining to interactive TV. It is beyond the scope of this paper to examine them all. Instead, partial, non-representative samples will be considered here. Note that the range of standards covers hardware, signal transmission and software. The applicable software standards include everything from video encoding formats to on-screen object linking methods. Perhaps, at some point in the future, a de facto working set of standards will emerge.

Starting with the hardware, there is now a proposed digital TV hardware specification circulating among national committees of manufacturers. Last minute changes in the proposed standard have delayed shipments of digital TVs from some manufacturers. Other manufacturers are now shipping their best guess at compliant sets. On the PC hardware side, there is general agreement on XGA as the screen display format. Note that both analog and digital TV have rectangular pixels and PCs have square pixels. PCs do not interlace video frames and TVs, including most of the proposed digital ones, do. On the other hand, many digital camcorders are capable of recording and playing non-interlaced video. DVD players generally support a range of image formats, including HDTV.
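
The difference between rectangular and square pixels can be expressed as a pixel aspect ratio: the shape of the display divided by the shape of the pixel grid. The sketch below is a simplification that assumes the full pixel grid fills a 4:3 display. The 720 by 480 grid is a common digital sampling of NTSC video, used here as an illustrative assumption and not taken from any proposed specification.

```python
from fractions import Fraction


def pixel_aspect_ratio(width: int, height: int,
                       display_aspect: Fraction = Fraction(4, 3)) -> Fraction:
    """Width-to-height ratio of a single pixel.

    Assumes the whole width-by-height grid exactly fills the display.
    """
    storage_aspect = Fraction(width, height)
    return display_aspect / storage_aspect


if __name__ == "__main__":
    print("PC, 1024 by 768:", pixel_aspect_ratio(1024, 768))  # square pixels
    print("TV, 720 by 480: ", pixel_aspect_ratio(720, 480))   # narrower than tall
```

A ratio of 1 means square pixels. Anything else means an image drawn for one kind of display must be resampled to look correct on the other.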

For transmission of ATSC digital TV, which can range from the same effective screen resolution as analog NTSC right up to high-end HDTV, the proposed standard only covers the downstream side. This is because the user's back channel to the transmitter cannot be within the allocated spectrum of the main signal. A variety of proprietary formats for user upstream data intended for use on cable or fiber are presently in trial deployments. The competing satellite TV systems each use their own downstream transmission format, with a fully standardized back channel of data via modem. Digital video on the Internet is transmitted via either HTTP, which runs on top of TCP, or UDP, and the two are not compatible as server data protocols. However, underlying both of those is a common data packet format, IP. For interactive TV on the Web, HTTP is more common because its TCP connection handles both the downstream and upstream data with verification, even though UDP is more efficient for downstream-only video, which can be sent without any receiver verification.
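
The difference between the two delivery styles can be sketched with ordinary sockets on a single machine. This is a minimal illustration, not a streaming server: it uses plain TCP, the transport that HTTP runs on, in place of HTTP itself, and the function names and payloads are hypothetical.

```python
import socket
from typing import Tuple


def udp_one_way(payload: bytes) -> bytes:
    """Send one datagram over loopback; the sender gets no acknowledgement."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)
        sender.sendto(payload, receiver.getsockname())  # fire and forget
        data, _addr = receiver.recvfrom(2048)
        return data


def tcp_two_way(request: bytes, reply: bytes) -> Tuple[bytes, bytes]:
    """Exchange data over one TCP connection, which carries both directions."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            server, _addr = listener.accept()
            with server:
                server.settimeout(2.0)
                client.sendall(request)        # upstream: the user's input
                received = server.recv(2048)   # small payloads arrive in one read here
                server.sendall(reply)          # downstream: the video data
                return received, client.recv(2048)


if __name__ == "__main__":
    print(udp_one_way(b"video frame"))
    print(tcp_two_way(b"click", b"video frame"))
```

In the UDP case the sender never learns whether the datagram arrived, which keeps overhead low for one-way video. In the TCP case the same connection carries the viewer's input upstream and the response downstream, with delivery acknowledged by the transport.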

On the software side, the three main standardization issues are video encoding/decoding (the "codec"), stream format, and the user interface. Digital video must be encoded to make both storage and transmission practical. Digital broadcast, cable and satellite TV have all agreed on a type of codec, MPEG, that is used in different, sometimes compatible ways, by each segment. A version of MPEG-1 is used by satellite TV, digital cable and minimum standard broadcast digital TV. MPEG-2 is used by HDTV broadcasters and DVD-standard devices. On the Internet, MPEG-4 is used by about half of all streaming video providers. The other half is split between two other codecs, those of RealNetworks and Sorenson.
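
The need for encoding follows from simple arithmetic on uncompressed video. The sketch below uses the 800 by 600 and 1024 by 768 resolutions at 30 frames per second described earlier in this paper. The 24 bits per pixel and the 1.5 Mbit/s channel are assumptions chosen only for illustration.

```python
def raw_bitrate_mbps(width: int, height: int, fps: float = 30.0,
                     bits_per_pixel: int = 24) -> float:
    """Uncompressed video bitrate in megabits per second (1 Mbit = 10**6 bits)."""
    return width * height * bits_per_pixel * fps / 1_000_000


def compression_ratio_needed(raw_mbps: float, channel_mbps: float) -> float:
    """How many times smaller the stream must become to fit the channel."""
    return raw_mbps / channel_mbps


if __name__ == "__main__":
    channel_mbps = 1.5  # hypothetical broadband connection
    for label, (w, h) in {"800 by 600": (800, 600),
                          "1024 by 768": (1024, 768)}.items():
        raw = raw_bitrate_mbps(w, h)
        ratio = compression_ratio_needed(raw, channel_mbps)
        print(f"{label}: {raw:.1f} Mbit/s raw, needs about {ratio:.0f}:1 compression")
```

Under these assumptions, even the medium-resolution stream exceeds 300 Mbit/s uncompressed, so it must shrink by a factor of several hundred to fit such a channel.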

Then, there are the competing video stream formats. RealNetworks has its own stream format. Apple has QuickTime. And, Microsoft has MediaPlayer. Video streams contain additional information about the encoded video within the stream, as well as a method for controlling the behavior of the stream. None of these stream formats is compatible with the others. Each one needs its own software playback method. The playback methods are available to users of PCs and Macs both as standalone applications and as plug-ins for browsers.

For user interfaces, the standard face of plain old TV is changing as digital TV enters the frame. Each broadcast network, cable company and satellite TV provider has taken the opportunity offered by digital to add features to TV. Each of these features needs a way for users to take advantage of it, and this has resulted in a variety of new buttons on remote controls, new icons in the corner of the TV screen and menu screens of all flavors: pop-up, pull-down, and side-scrolling under the picture. There are no standards in this arena. For interactive TV on the Internet, standardization on the Web browser as the main user interface may seem inevitable, as most video is now delivered this way. But a number of Internet TV channels have opted for proprietary software applications that open their own user interfaces on the desktop.

TV and PCs - Convergence or Divergence?
This author, among many others, has written that the convergence of TVs, PCs and the Internet seems likely within 10 years. A number of trends point in this direction, including the decreasing cost of PCs, the increasing image quality of TV, smart digital TVs, the number of homes with PCs wired to the Internet, increasing Internet usage per household, pay per view on the Internet, video chat, and on and on. Those trends are supported by plenty of surveys and statistics. On the other hand, there are trends in the opposite direction that indicate TVs and PCs, and their audiences, will remain separate and distinct. Perhaps the two most important trends that point to divergence are the popularity of home theater and proprietary videogame platforms.

Big screen home theaters with multi-channel sound are increasingly popular with consumers. As broadcast and cable signals improve in quality with digital TV and HDTV, the visual benefits of sheer size will become even more obvious and attractive. It seems clear that without the additional cost and complexity burden of computer electronics, big TVs will always be less expensive than big-screen PCs. Not to mention the fact that TVs don't "crash" the way computers do. Greater reliability further enhances the visual entertainment bang for the buck advantage of TV over PCs.

The new-generation videogame platforms now on the market are supercomputers in their own right, complete with high-speed network connectivity at the user's option. Compared head to head with PCs of similar game graphics capability, the proprietary platforms perform as well, if not better, at a fraction of the price, typically one-sixth that of a PC. And again, the game machines are more reliable and require virtually no maintenance, compared to a PC.

Conclusion
Interactive TV is already on the market in a variety of forms, ranging from PCs with Internet-based video, to digital set top boxes on cable TV, to videogame systems. At this point in time, it appears unlikely that any single flavor or configuration will dominate the consumer space in the foreseeable future. Instead, interactive TV will continue to gain popularity in a multitude of forms, serving its users within their budgetary constraints both at work and at home.