Text, sound and images Revision Notes | IGCSE Computer Science 0478

1. Overview

Data representation is fundamental to computer science because it explains how computers store and manipulate information. Understanding how text, sound, and images are converted into binary form allows us to appreciate the limitations and possibilities of digital technology and to make informed choices about file formats and compression techniques. This topic also lays the groundwork for understanding more advanced concepts such as data structures and algorithms.

Key Definitions

Character Set: A collection of characters (letters, numbers, symbols) that a computer system recognises. Each character is assigned a unique numerical code.
ASCII (American Standard Code for Information Interchange): A 7-bit character encoding standard representing 128 characters.
Unicode: A character encoding standard that supports a vast number of characters, including those from different languages worldwide.
Sample Rate: The number of audio samples taken per second, measured in Hertz (Hz).
Sample Resolution (Bit Depth): The number of bits used to represent each audio sample, which determines how precisely each sample's amplitude is recorded.
Pixel: The smallest unit of a digital image, containing colour information.
Resolution: The number of pixels in an image, usually expressed as width x height (e.g., 1920 x 1080).
Colour Depth: The number of bits used to represent the colour of a single pixel.
Bitmap Image: An image represented as a grid of pixels, each containing colour data. Also known as a raster image.
Vector Graphic: An image stored as mathematical descriptions of shapes (lines, curves, polygons).

Core Content

Text Representation

Computers represent text using character sets, which assign a numerical code to each character.
ASCII:
- Uses 7 bits to represent 128 characters (0-127).
- Includes uppercase and lowercase letters (A-Z, a-z), digits (0-9), punctuation marks, and control characters (e.g., line feed, carriage return).
- Limited to representing English characters and basic symbols.
- Extended ASCII uses 8 bits, allowing for 256 characters, but isn't a universal standard.
Unicode:
- Supports over 143,000 characters from almost all writing systems worldwide.
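- Each character is assigned a unique code point.
- Code points are stored using encoding schemes such as UTF-8 (most common for web pages), UTF-16, and UTF-32.
- UTF-8 and UTF-16 are variable-length: UTF-8 uses 1-4 bytes per character and UTF-16 uses 2 or 4 bytes. UTF-32 is fixed-length and always uses 4 bytes.

Example ASCII codes: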

Character	Decimal	Binary
A	65	01000001
B	66	01000010
Z	90	01011010
a	97	01100001
0	48	00110000
Space	32	00100000
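
The codes in the table can be checked with a few lines of Python. This is a minimal sketch: the helper name `ascii_row` is made up for illustration, while `ord()`, `chr()` and `format()` are Python built-ins.

```python
# Characters taken from the table above.
characters = ["A", "B", "Z", "a", "0", " "]


def ascii_row(character):
    """Return (character, decimal code, 8-bit binary string) for one character."""
    code = ord(character)          # numerical code assigned to the character
    binary = format(code, "08b")   # binary form, padded to 8 bits as in the table
    return character, code, binary


for character, code, binary in (ascii_row(c) for c in characters):
    label = "Space" if character == " " else character
    print(label, code, binary)

# chr() goes the other way: from a code back to its character.
print(chr(65))
```

Each printed row should match the corresponding row of the table, for example `A 65 01000001`.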

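Note: Uppercase and lowercase letters have different codes (A=65, a=97).
- Advantage of Unicode: Supports almost all languages, so text is globally compatible.
- Disadvantage of Unicode: Characters can need more bits than in ASCII (at least 16 bits each in UTF-16 and 32 bits each in UTF-32), so files can be larger than the ASCII equivalent. Only UTF-8 keeps plain English text at one byte per character.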
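
How much storage a Unicode character needs depends on the encoding scheme. The sketch below compares byte counts using Python's built-in `str.encode()`. The helper name `encoded_size` and the sample characters are chosen for illustration and do not come from the notes.

```python
def encoded_size(text, encoding):
    """Return the number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))


# Sample characters chosen for illustration (an assumption, not from the notes).
samples = {
    "A": "Latin capital letter A",
    "\u00e9": "e with acute accent",
    "\u20ac": "euro sign",
    "\U0001F600": "grinning face emoji",
}

for character, description in samples.items():
    # The "-le" codec names are used so that no byte order mark is added
    # to the byte count.
    print(
        description,
        "| code point:", ord(character),
        "| UTF-8 bytes:", encoded_size(character, "utf-8"),
        "| UTF-16 bytes:", encoded_size(character, "utf-16-le"),
        "| UTF-32 bytes:", encoded_size(character, "utf-32-le"),
    )

# Plain English text takes the same space in ASCII and UTF-8,
# but four times as much in UTF-32.
print(encoded_size("Hello", "ascii"),
      encoded_size("Hello", "utf-8"),
      encoded_size("Hello", "utf-32-le"))
```

The UTF-8 column grows from 1 to 4 bytes as the characters move further from the ASCII range, while the UTF-32 column stays at 4 bytes throughout.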

Sound Representation

Sound is an analogue signal; computers need to convert it into a digital form (binary) through a process called sampling.
The analogue sound wave's amplitude is measured at regular intervals.
Sample Rate:
- Measured in Hertz (Hz) - samples per second.
- Higher sample rate means more samples are taken per second, resulting in a more accurate representation of the original sound.
- Higher sample rate = better sound quality, but larger file size.
- Example: CD quality audio uses a sample rate of 44,100 Hz (44.1 kHz).
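
Sampling can be sketched in code by measuring a wave's amplitude at evenly spaced times. The example below is an illustration only: it assumes the sound is a pure sine wave, and the function name `sample_wave` and the 440 Hz tone are made-up choices.

```python
import math


def sample_wave(frequency_hz, sample_rate_hz, duration_s):
    """Measure a sine wave's amplitude at regular intervals.

    Returns one amplitude (between -1 and 1) per sample.
    """
    number_of_samples = round(sample_rate_hz * duration_s)
    samples = []
    for n in range(number_of_samples):
        t = n / sample_rate_hz  # time of the n-th sample, in seconds
        samples.append(math.sin(2 * math.pi * frequency_hz * t))
    return samples


# The same 0.01 seconds of a 440 Hz tone at two sample rates.
low_rate = sample_wave(440, 8_000, 0.01)
high_rate = sample_wave(440, 44_100, 0.01)

print(len(low_rate), "samples at 8,000 Hz")
print(len(high_rate), "samples at 44,100 Hz")
```

The higher sample rate produces more measurements for the same stretch of sound, which is why it captures the wave more accurately and also needs more storage.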

Figure: How analogue sound is converted to digital through sampling. The continuous wave is sampled at regular intervals, and each measured amplitude is converted to a binary value.

Sample Resolution (Bit Depth):
- The number of bits used to represent the amplitude of each sample.
- Higher resolution means more possible values for the amplitude, leading to a more accurate and detailed sound.
- Higher resolution = better sound quality, but larger file size.
- Example: CD quality audio uses a sample resolution of 16 bits.
- Example: 8-bit resolution allows 2⁸ = 256 different amplitude levels.
- Example: Calculate the size of an uncompressed audio file that is 5 minutes long, recorded in stereo at 44.1 kHz with a 16-bit sample resolution.
  - Duration: 5 minutes = 300 seconds
  - Sample rate: 44,100 Hz
  - Resolution: 16 bits = 2 bytes
  - Stereo = 2 channels
  - File size = Duration x Sample Rate x Resolution x Channels = 300 x 44,100 x 2 x 2 = 52,920,000 bytes
  - File size in MB: 52,920,000 / (1024 x 1024) ≈ 50.47 MB
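
The audio file size calculation above can be written as a short function, which makes it easy to try other values. This is a sketch: the function names are made up for illustration, and it assumes uncompressed audio with no file header.

```python
def amplitude_levels(bits_per_sample):
    """Number of different amplitude values a sample can take."""
    return 2 ** bits_per_sample


def audio_file_size_bytes(duration_s, sample_rate_hz, bits_per_sample, channels):
    """Size of uncompressed audio in bytes (header data is ignored)."""
    total_bits = duration_s * sample_rate_hz * bits_per_sample * channels
    return total_bits // 8  # 8 bits = 1 byte


print(amplitude_levels(8))   # levels available at 8-bit resolution

# 5 minutes of stereo audio at 44.1 kHz and 16 bits per sample.
size_bytes = audio_file_size_bytes(300, 44_100, 16, 2)
size_mb = size_bytes / (1024 * 1024)
print(size_bytes, "bytes")
print(round(size_mb, 2), "MB")
```

Doubling any one input (duration, sample rate, sample resolution or number of channels) doubles the result, which is the relationship exam questions usually test.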

Image Representation

Digital images are represented as a grid of pixels.
Bitmap Images (Raster Images):
- Each pixel contains colour information.
- Resolution: The number of pixels in the image (width x height).
  - Higher resolution = more pixels = more detail, but a larger file size.
  - Example: A 1920x1080 image has 1920 x 1080 = 2,073,600 pixels (about 2 million).
- Colour Depth: The number of bits used to represent the colour of each pixel.
  - 1 bit = 2 colours (e.g., black and white)
  - 8 bits = 2⁸ = 256 colours (commonly used for GIFs)
  - 24 bits = 2²⁴ = 16,777,216 colours (True colour – commonly used for JPEGs and PNGs)
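
The link between colour depth and the number of available colours is a power of two, and it can be worked in both directions. The function names below are made up for illustration. The reverse calculation (bits needed for a given number of colours) is an extra not listed in the notes, but it follows from the same relationship.

```python
def number_of_colours(colour_depth_bits):
    """How many different colours a given colour depth can represent."""
    return 2 ** colour_depth_bits


def bits_needed(colours):
    """Smallest colour depth (in bits) that can represent this many colours."""
    # (colours - 1).bit_length() is the number of bits needed to count
    # from 0 up to colours - 1.
    return (colours - 1).bit_length()


for bits in (1, 8, 24):
    print(bits, "bits ->", number_of_colours(bits), "colours")

print(bits_needed(256), "bits are needed for 256 colours")
print(bits_needed(5), "bits are needed for 5 colours")
```

A palette of 5 colours needs 3 bits, because 2 bits only give 4 combinations.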

Figure: Digital images are made of pixels arranged in a grid. The resolution (width x height) and colour depth determine the file size.

- Example: Calculating the size of a bitmap image: a 640x480 image with 24-bit colour depth.
  - Resolution: 640 x 480 = 307,200 pixels
  - Colour depth: 24 bits = 3 bytes
  - File size = Pixels x Bytes per pixel = 307,200 x 3 = 921,600 bytes
  - File size in KB: 921,600 / 1024 = 900 KB

Vector Graphics:
- Images are stored as mathematical descriptions of shapes (lines, curves, polygons).
- Advantages: Scalable without loss of quality, smaller file sizes (generally) for simple images.
- Disadvantages: Not suitable for photorealistic images, complex images can be computationally expensive to render.
- Example file formats: .svg
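
The bitmap file size calculation (640x480 pixels at 24-bit colour depth) can be expressed as a function. This is a sketch with a made-up function name. It assumes an uncompressed image and ignores any file header or metadata.

```python
def bitmap_file_size_bytes(width, height, colour_depth_bits):
    """Size of the uncompressed pixel data in bytes (header data is ignored)."""
    total_bits = width * height * colour_depth_bits
    return (total_bits + 7) // 8  # round up to a whole number of bytes


size_bytes = bitmap_file_size_bytes(640, 480, 24)
print(size_bytes, "bytes")
print(size_bytes / 1024, "KB")

# The same calculation for a 1920x1080 image at 24-bit colour depth.
full_hd_bytes = bitmap_file_size_bytes(1920, 1080, 24)
print(full_hd_bytes, "bytes")
```

Increasing either the resolution or the colour depth increases the file size in direct proportion.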

Figure: Vector vs bitmap: how images are stored differently. Vector graphics are defined by coordinates and are scalable; bitmap graphics are defined by pixels and have a fixed resolution.
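
The difference between the two storage methods can be sketched in code. Below, a circle is stored once as a small vector description and then drawn into pixel grids of two sizes. Everything here is a simplified illustration: the dictionary layout and the `rasterise` function are made up, and real formats such as .svg store shapes differently.

```python
# Hypothetical vector description: the centre and radius are stored as
# fractions of the image size, so they do not depend on any resolution.
circle = {"shape": "circle", "cx": 0.5, "cy": 0.5, "r": 0.4}


def rasterise(shape, size):
    """Draw the circle into a size x size grid of pixels (1 = inside, 0 = outside)."""
    grid = []
    for row in range(size):
        line = []
        for col in range(size):
            # Position of the centre of this pixel, in the range 0 to 1.
            x = (col + 0.5) / size
            y = (row + 0.5) / size
            distance_squared = (x - shape["cx"]) ** 2 + (y - shape["cy"]) ** 2
            line.append(1 if distance_squared <= shape["r"] ** 2 else 0)
        grid.append(line)
    return grid


small = rasterise(circle, 8)    # 8 x 8 = 64 pixels
large = rasterise(circle, 64)   # 64 x 64 = 4,096 pixels

for line in small:
    print("".join("#" if pixel else "." for pixel in line))

print(len(small) * len(small[0]), "pixels in the small bitmap")
print(len(large) * len(large[0]), "pixels in the large bitmap")
```

The vector description stays the same four values at any size, while the bitmap needs 64 times as many pixels when the width and height are each multiplied by 8.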

Exam Focus

Examiners expect you to understand the relationship between data representation (e.g., sample rate, colour depth, resolution) and file size, and to explain how increasing any of these values increases both the quality and the file size.
Use precise technical terms: "sample rate", "sample resolution", "colour depth", "resolution", "character set", "encoding".
Explain why computers use binary, not just that they "understand" it.
Be prepared to calculate file sizes based on given parameters (e.g., audio duration, sample rate, colour depth).
Understand the differences between bitmap and vector images and their respective advantages and disadvantages.

Common Mistakes to Avoid

❌ Wrong: "Computers use binary because it's what they understand." ✓ Right: "Computers use binary because electronic circuits have two states (on/off), which can be represented as 0 and 1. This enables simple and reliable processing and storage of data."
❌ Wrong: "Higher resolution is always better." ✓ Right: "Higher resolution results in more detail in the image, but also increases file size. You need to consider the trade-off between quality and file size."
❌ Wrong: "ASCII supports all languages." ✓ Right: "ASCII is limited to 128 characters and primarily supports English characters. Unicode is needed to support a wider range of languages."
❌ Wrong: Failing to specify units. ✓ Right: Always include units like Hz, bits, bytes, KB, MB, pixels, etc. in your answers and calculations.
❌ Wrong: Not explaining what sample rate or bit depth actually mean. ✓ Right: Make sure you can clearly explain what each term represents and how it affects the quality of the sound or image.

Exam Tips

When asked to explain why computers use binary, focus on the ease of implementation with electronic circuits (on/off states).
For file size calculations, pay careful attention to units (bits vs. bytes, KB vs. MB) and make sure you show your working.
When comparing ASCII and Unicode, highlight Unicode's ability to represent characters from multiple languages as the main advantage.
If asked about choosing between bitmap and vector graphics, consider the type of image (photorealistic vs. simple shapes) and the need for scalability.
