Music and The Home Computer - An Introduction to Digital Music
John M. Zeigler, Ph.D.
We have discussed music appreciation software, software to aid in learning to play an instrument, and how music and sound are created on the PC in other articles in this series. But what do you do if you simply want to listen to music on your PC? Most PC's can play music CD's in their CD-ROM drives, but you still have to carry around the CD. These days there are other options that allow you to load and play music on your PC just like software. In this article, we'll take a quick look at how music is encoded for playing on the PC, how the various formats in which music can be sent over the net and played on a PC differ, and why there is so much interest in something called MP3, currently the format of choice for encoding music for transmission and playback on the computer.
As we discussed in our last article, Creating Sound and Music on the PC, there are two basic means by which music can be handled on the computer. The simplest method is to sample the sound waveform tens of thousands of times per second and then have the computer reproduce the music by playing the samples back. These kinds of files, called WAV files on the IBM PC, have some disadvantages. First, they are only as good as the number of samples taken. Thus, to produce CD-quality stereo sound, a piece of music has to be sampled 44,100 times a second in each of two channels, with each sample stored as a 16-bit number. Second, because every sample is stored, such files are huge; for example, 1 minute of stereo music sampled at 44,100 times a second will be about 10 MB in size. This is far too large to transmit comfortably over anything less than a T1 connection to the Internet, which operates at about 1.5 megabits per second.
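The arithmetic behind that file size can be sketched in a few lines of Python. This is only an illustration: the function names are our own, the figures assume CD-quality sound (44,100 samples per second, 2 bytes per sample, 2 channels), and the small header that a real WAV file carries is ignored.

```python
def pcm_size_bytes(seconds, sample_rate=44_100, channels=2, bytes_per_sample=2):
    """Size of raw, uncompressed sampled audio (WAV header not counted)."""
    return seconds * sample_rate * channels * bytes_per_sample


def transfer_seconds(size_bytes, bits_per_second):
    """Ideal transfer time, ignoring protocol overhead and line noise."""
    return size_bytes * 8 / bits_per_second


one_minute = pcm_size_bytes(60)
print(one_minute)                                      # 10584000 bytes, about 10 MB
print(round(transfer_seconds(one_minute, 56_000)))     # 1512 s on a 56 kbit/s modem
print(round(transfer_seconds(one_minute, 1_544_000)))  # 55 s on a T1 line
```

In other words, under these assumptions a single minute of music would tie up an ideal 56K dial-up line for about 25 minutes.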
Another approach, and one that is most commonly used on the Internet, is to encode not the music itself, but a set of instructions for how the music is to be played. Such files, called MIDI sequences, since they are sequences of instructions, are much smaller than waveform files and can be transmitted over standard dial-up connections in a reasonable length of time. The major disadvantage of MIDI sequences is that the interpretation of the instructions sent is entirely up to the hardware and software available. Thus, MIDI sequences will sound different from computer to computer. Of course, these inherent differences in sound will tend to cover up the nuances of performance that differentiate one performer from another, so that even a well-written MIDI file will lack the power and immediacy of a live performance.
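To get a feel for how compact such instructions are, here is a minimal Python sketch that builds the raw MIDI "note on" and "note off" messages for a one-octave scale. The helper names are our own, and the sketch leaves out the timing information and file headers that a real MIDI file also contains, so treat the byte count as a rough illustration only.

```python
def note_on(note, velocity=64, channel=0):
    """Three-byte MIDI message: start playing `note` on `channel`."""
    return bytes([0x90 | channel, note, velocity])


def note_off(note, channel=0):
    """Three-byte MIDI message: stop playing `note` on `channel`."""
    return bytes([0x80 | channel, note, 0])


# C major scale starting at middle C (MIDI note number 60).
scale = [60, 62, 64, 65, 67, 69, 71, 72]
messages = b"".join(note_on(n) + note_off(n) for n in scale)
print(len(messages))  # 48 bytes of instructions for eight notes
```

Played at two notes per second, the same scale stored as CD-quality samples would take about 700,000 bytes. Notice, too, that nothing in these messages says what the instrument actually sounds like, which is why the result depends on the hardware and software doing the playing.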
Fortunately, mathematicians and programmers have developed another alternative that provides a viable compromise between the sometimes poor aesthetic qualities of MIDI files and the extreme sizes of waveform music files. This involves something called compression, a mathematical process in which a large quantity of information can be squeezed into a smaller file. There are two basic forms of compression, "lossless" and "lossy" compression. As the names imply, "lossless" compression involves no actual loss of data, so that the original uncompressed file can be reconstituted from the compressed version by use of the appropriate software. Most Internet users are familiar with ZIP files, in which multiple files are placed together in a single file and compressed. Nearly all software is delivered in compressed archives of some sort, because they are typically about half the size of the original files and can be decompressed to give exactly the original files. Compression technology is a science in and of itself and usually involves the application of several forms of compression at the same time. For example, the "deflate" method used in most ZIP files combines two techniques: LZ77 matching, which replaces repeated sequences of digital information with short back-references that the decompression program understands to stand for the longer sequences they replace, and Huffman coding, which assigns the shortest codes to the most common symbols. Related techniques include run-length encoding (the replacement of a run of identical values with a single value and a count) and LZW compression, a dictionary-based technique used in GIF images.
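The defining property of lossless compression, getting back exactly what you put in, is easy to demonstrate. The Python sketch below uses a toy run-length encoder of our own devising and the standard zlib module (which implements the deflate method). The sample data are made up, and highly repetitive data like this compresses far better than typical software does.

```python
import zlib
from itertools import groupby


def rle_encode(data):
    """Toy run-length encoder: a list of (byte value, run length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(data)]


def rle_decode(pairs):
    """Expand (byte value, run length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for value, count in pairs)


pairs = rle_encode(b"aaaabbbcc")
print(pairs)                      # [(97, 4), (98, 3), (99, 2)]
assert rle_decode(pairs) == b"aaaabbbcc"

# Deflate round trip with the standard library.
original = b"the quick brown fox jumps over the lazy dog. " * 200
compressed = zlib.compress(original, 9)
restored = zlib.decompress(compressed)
assert restored == original       # lossless: every byte comes back
print(len(original), len(compressed))
```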
Lossless compression is a pretty good technique for software, but it doesn't come near to making a 10 MB file small enough (less than 100 KB or so) to transmit comfortably over an Internet dial-up connection. For that, we need compression factors that are much higher than the factor of two or so routinely achievable with lossless compression. Enter lossy compression. While lossy compression can't be used for software, since throwing away bytes of the compiled code would make the software unusable, it works well for graphics and sound. In lossy compression, the trick is to figure out what information one can safely discard without degrading the graphic or sound so severely that the difference between the compressed and original file is unacceptably obvious.
Once again, mathematics comes to the rescue. The most commonly used compressed format for "high color" (more than 64K colors) graphics files is the JPEG (for Joint Photographic Experts Group) or .JPG format, in use everywhere on the Web for photographs. JPEG is based on something called the DCT (Discrete Cosine Transform). The DCT essentially re-maps the contents of the uncompressed file mathematically in such a way that the elements of the file that have the least impact on the appearance of the photo can be readily identified and then discarded. If you've ever created a JPG file in a graphics program, you're familiar with a dialogue that asks you to set the "quality level" of the compression. What you're really being asked is how much of the least significant information you want thrown away. The more you discard, the smaller, but lower in quality, the resulting compressed file will be. Once you've made your decision, a lossless compression step (Huffman coding) is applied to the remaining information in the file. JPEG compression can easily produce compression ratios between 10:1 and 20:1, while still maintaining good graphics appearance. Note that lossy compression is "one-way" - once you discard the information to produce the JPG file, you can't reproduce the original file exactly, though you can come quite close to it depending on the quality level chosen for compressing the original file.
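The idea of transforming the data and then throwing away the least significant part can be sketched in Python. This is a simplified illustration, not the JPEG algorithm itself: it works on a single row of eight made-up brightness values rather than 8-by-8 blocks of pixels, and it uses one "step" size for all coefficients where JPEG uses a whole table of them. All of the names are our own.

```python
import math

N = 8  # number of samples in the row


def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(x):
    """Orthonormal DCT-II: samples -> frequency coefficients."""
    return [_scale(k) * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                            for n in range(N))
            for k in range(N)]


def idct(coeffs):
    """Inverse transform: frequency coefficients -> samples."""
    return [sum(_scale(k) * coeffs[k] * math.cos(math.pi * (n + 0.5) * k / N)
                for k in range(N))
            for n in range(N)]


def quantize(coeffs, step):
    """The lossy step: a bigger step rounds more coefficients down to zero."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    return [v * step for v in levels]


row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel brightness values
for step in (1, 10, 40):                # larger step = lower "quality level"
    levels = quantize(dct(row), step)
    approx = [round(v) for v in idct(dequantize(levels, step))]
    print(step, levels.count(0), approx)
```

Each line of output shows the step size, how many of the eight coefficients were rounded to zero (and so cost almost nothing to store), and the row of pixels you get back. As the step grows, more coefficients vanish and the reconstructed row drifts further from the original.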
Of course, video and sound files are much larger than photographic files and, therefore, in much greater need of even higher levels of compression. Numerous compression-decompression schemes ("codecs") have been developed for video and audio files. Although they differ in the specifics of the way they work, the most widely used ones are all lossy compression schemes that use mathematical techniques (e.g., DCT, "fractals," "wavelets") to identify information to discard and then compress the remainder to a high degree. The more efficient the compression method, the better it is at identifying the information that can be safely discarded. For example, one commercial audio codec discards all frequencies above 16,000 Hz (cycles per second), since most people can't hear those frequencies anyway. Video codecs typically discard some color depth information, since the eye can distinguish only a few million individual colors, and they avoid storing again any data that doesn't change from one video frame to the next ("delta encoding"). Generally, the various codecs have particular strengths and weaknesses, depending on the specific content of the video or sound clip to be compressed. For that reason, the codecs compete with one another in the market and have gained greater or lesser acceptance due both to their inherent qualities and to the market power of the companies touting them.
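Delta encoding is the easiest of these ideas to show in code. The Python sketch below treats a "frame" as a short list of made-up pixel values and records only the positions that changed since the previous frame. The function names are our own, and real video codecs use considerably more elaborate schemes; this is only the bare idea.

```python
def delta_encode(prev_frame, frame):
    """Keep only (position, new value) for pixels that changed."""
    return [(i, new)
            for i, (old, new) in enumerate(zip(prev_frame, frame))
            if old != new]


def delta_decode(prev_frame, changes):
    """Rebuild the current frame from the previous one plus the changes."""
    frame = list(prev_frame)
    for i, value in changes:
        frame[i] = value
    return frame


frame1 = [10, 10, 10, 200, 200, 10, 10, 10]   # a bright object...
frame2 = [10, 10, 10, 10, 200, 200, 10, 10]   # ...that has moved one pixel right

changes = delta_encode(frame1, frame2)
print(changes)                                 # [(3, 10), (5, 200)]
assert delta_decode(frame1, changes) == frame2
```

Only two of the eight pixels need to be stored for the second frame. The savings are largest when little moves between frames, which is one reason a codec's performance depends so much on the content of the clip.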
Unless you've been living on a desert island for the last couple of years, you've probably heard some of the hoopla on the Web about something called "MP3." MP3 stands for MPEG Audio Layer 3, where MPEG is the Moving Picture Experts Group. It's the third and most sophisticated of the audio coding "layers" defined in the MPEG-1 standard for compressing audio and video data at high compression ratios with excellent quality. If you have a DVD movie player or a DVD-ROM drive in your computer, you're already using a close relative of MP3, since DVD's encode their video with MPEG-2, a later standard from the same group. MP3 compression typically discards the highest frequencies (above roughly 16,000 Hz at common settings) as well as quiet sounds that are masked by louder sounds at or near the same frequency. This kind of selective removal of information results in very high compression ratios for MP3 encoded sound files.
Although DVD's are an established use of MPEG technology, most of the current excitement about MP3 is over its use for audio files. There are literally thousands of Web sites which offer MP3 audio files, both legal and illegal. You can download MP3 files and, with the proper player program, play them on your computer. Some popular player programs include Winamp, Sonique, and MusicMatch Jukebox. Failing that, even the Windows Media Player will play MP3 files, though without the flexibility provided by a dedicated player program. If you have a Diamond Rio (http://www.diamondmm.com) or similar appliance, you don't even need a computer to play MP3 files. The Rio is about the size of a pack of cigarettes and will hold about an hour of MP3 music files in its standard memory. The Apple iPod is a similar device that has become wildly popular; music purchased for it from Apple's online store is protected by a proprietary digital rights management scheme.
In principle, almost anybody can create their own MP3 files. The process involves two steps. First, a so-called "ripper" program captures the sound information from a CD, microphone, tape or other source. Second, an encoder program converts this to MP3 format. Just as with JPEG encoders for photos, the MP3 encoder program gives you some choices which determine how much of the sound information is discarded and, thereby, the degree of compression that you achieve and the quality of reproduction. Once a file is encoded in MP3, anyone with a player program or appliance can listen to it.
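The main choice an encoder usually offers is the bit rate, the number of bits used for each second of sound. The Python sketch below estimates what that choice means for file size. The bit rates shown are common settings picked for illustration, the function name is our own, and the estimate assumes a constant bit rate and ignores tags and other overhead in the file.

```python
def mp3_size_bytes(seconds, kbps):
    """Approximate size of a constant-bit-rate MP3 file."""
    return seconds * kbps * 1000 // 8


# CD audio: 44,100 samples/s x 16 bits x 2 channels, in kilobits per second.
CD_KBPS = 44_100 * 16 * 2 / 1000

song_seconds = 240  # a hypothetical four-minute song
for kbps in (64, 128, 192):
    size = mp3_size_bytes(song_seconds, kbps)
    ratio = CD_KBPS / kbps
    print(f"{kbps} kbps: {size:,} bytes, roughly {ratio:.0f}:1 compression")
```

At 128 kbps, the four-minute song comes to 3,840,000 bytes, about one eleventh of the roughly 42 MB it occupies on the CD. Halving the bit rate halves the file size again, at a cost in sound quality.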
Many legal sites offer MP3 files for purchase; one of the largest of these is MP3.com at http://www.mp3.com. This site offers thousands of legal MP3 files from all music genres for purchase, as well as a large number of free and legal MP3 sample files. In addition, you can find the latest MP3 software and technical information there and use the search engine to help locate MP3 files. MP3.com is a good way to get your feet wet with MP3 legally. There are also lots of other sites these days offering MP3's. If you only want one or two songs from an album, buying individual tracks is a good option, since at 99 cents apiece you save some real money, although there are limitations on how you can use the files.
It doesn't take much thought to realize that the ability to encode tracks off CD's is a sure route to copyright infringement by unscrupulous or poorly informed individuals. An MP3 file encoded from a commercial artist's recordings allows people to avoid paying for a CD and, thus, deprives the artist and the recording company of royalty income. For that reason, many of the MP3 recordings you'll find on the Web are illegally recorded ("pirated"). Those who prepare MP3 files from copyrighted works and those who listen to them are exposing themselves to a potentially very expensive infringement suit. Even those who simply prepare MP3 files from their favorite CD's for their personal, non-commercial use are asking for trouble.
Since the rapid adoption of MP3 as a de facto standard for sound files threatens the incomes of major recording companies, many Web sites offering illegal MP3 files have been shut down under threat of lawsuit. The most notable of these is Napster, which faced a multibillion dollar judgment and was shut down by legal action (Napster is now back, but offering only fully licensed, legal MP3's for sale). Most people have probably heard about the lawsuits the recording industry has filed against thousands of kids for using illegal (i.e., unlicensed) MP3's. Several companies are also developing proprietary versions of MP3 which carry key or locale information (much like DVD's) intended to limit infringing uses of the files. This whole area is in considerable flux right now, with no obvious winner among the competing strategies. Several different "digital rights management" (DRM) schemes are now in use. Most of them are intrusive, but they have been made necessary because pirating of copyrighted works has become so common.
Even if you don't plan to listen to digitized music on your computer, you may still want to know about and use MP3 or some other A-V codec. These are just about the best way to send short video clips or recordings of family greetings by e-mail. Similarly, composers and musicians can record their performances and distribute them as MP3 files over the Internet without the intervention of a recording company. As with most of the other technologies associated with the computer/Internet revolution, the question is not whether MP3 can be useful to you, but whether you'll choose to take advantage of it. We think it will be worth your while to do so.