Lecture 5: Number Representation and Instruction Encoding

Lectures on Computer Architecture

By Dr. Kisaru Liyanage

5.1 Introduction

This lecture covers how computers represent and manipulate data at the binary level. We explore number systems, two's complement representation for signed integers, ARM instruction encoding formats, and logical operations for bit manipulation. These fundamentals are essential for writing efficient assembly and for understanding how processors execute arithmetic and logical operations.

5.2 Number Representation Systems

5.2.1 Unsigned Binary Integers

Binary System Basics

Place Value Calculation

$$ \begin{align*} \text{Binary: } &1011 \\ \text{Value} &= (1 \times 2^3) + (0 \times 2^2) + (1 \times 2^1) + (1 \times 2^0) \\ &= 8 + 0 + 2 + 1 \\ &= 11 \text{ (decimal)} \end{align*} $$

N-Bit Unsigned Range

Binary to Decimal Conversion

$$ \begin{align*} \text{Example: } &10110101 \\ &= 1 \times 128 + 0 \times 64 + 1 \times 32 + 1 \times 16 + 0 \times 8 + 1 \times 4 + 0 \times 2 + 1 \times 1 \\ &= 128 + 32 + 16 + 4 + 1 \\ &= 181 \end{align*} $$
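
The place-value sums above can be checked mechanically. The following is a minimal Python sketch, not part of the lecture material; the function name `binary_to_unsigned` is an illustrative assumption.

```python
def binary_to_unsigned(bits: str) -> int:
    """Return the unsigned value of a binary string written MSB first."""
    value = 0
    # Walk from the least significant bit so the index equals the power of 2.
    for position, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        value += int(bit) * 2**position
    return value


print(binary_to_unsigned("1011"))      # 11
print(binary_to_unsigned("10110101"))  # 181
```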

5.2.2 Two's Complement Representation

Purpose of Two's Complement

Sign Bit

Positive Numbers

Negative Numbers

$$ 2^8 - 5 = 256 - 5 = 251 = 11111011 $$

Two's Complement Conversion

Method 1 (Invert and Add):

  1. Write positive value in binary
  2. Invert all bits (0→1, 1→0)
  3. Add 1 to result

Example: -5 in 8 bits

+5:        00000101
Invert:    11111010
Add 1:     11111011  (this is -5)

Method 2 (Subtraction):

$$ -5 = 2^8 - 5 = 256 - 5 = 251 = 11111011 $$
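
Both conversion methods can be modelled in a few lines of Python to confirm that they produce the same bit pattern. This is a sketch under the assumption of a fixed bit width (8 by default); the helper names are hypothetical.

```python
def negate_invert_add(magnitude: int, bits: int = 8) -> int:
    """Method 1: invert every bit within the width, then add 1."""
    mask = (1 << bits) - 1
    inverted = magnitude ^ mask       # flip all bits inside the width
    return (inverted + 1) & mask      # add 1 and discard any carry out


def negate_subtract(magnitude: int, bits: int = 8) -> int:
    """Method 2: subtract the magnitude from 2**bits."""
    return (2**bits - magnitude) % 2**bits


def to_signed(pattern: int, bits: int = 8) -> int:
    """Interpret a bit pattern as a two's complement signed value."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern


print(format(negate_invert_add(5), "08b"))  # 11111011
print(negate_subtract(5))                   # 251
print(to_signed(0b11111011))                # -5
```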

N-Bit Signed Range

Special Cases

5.2.3 Sign Extension

Purpose

Process

Examples

8-bit to 32-bit:
00000101 (+5) → 00000000 00000000 00000000 00000101 (+5)
11111011 (-5) → 11111111 11111111 11111111 11111011 (-5)
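
The extension rule illustrated above (copy the sign bit into every new upper bit) can be sketched in Python as follows. The function name and the default 32-bit target width are assumptions made for illustration.

```python
def sign_extend(value: int, from_bits: int, to_bits: int = 32) -> int:
    """Widen a two's complement pattern by replicating its sign bit."""
    if from_bits > to_bits:
        raise ValueError("target width must not be narrower than the source")
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value & (1 << (from_bits - 1)):
        # Negative: fill every bit above the source width with 1s.
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value


print(hex(sign_extend(0b00000101, 8)))  # 0x5
print(hex(sign_extend(0b11111011, 8)))  # 0xfffffffb
```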

ARM Instructions for Sign Extension

Example Usage

LDRH R0, [R1]     ; halfword 0xABCD in memory: R0 = 0x0000ABCD (zero-extended)
LDRSH R0, [R1]    ; halfword 0xABCD in memory: R0 = 0xFFFFABCD (sign-extended, bit 15 = 1)
LDRB R0, [R1]     ; byte 0xAB in memory: R0 = 0x000000AB (zero-extended)
LDRSB R0, [R1]    ; byte 0xAB in memory: R0 = 0xFFFFFFAB (sign-extended, bit 7 = 1)

5.2.4 Hexadecimal Notation

Why Hexadecimal?

Hex Digits

Binary   Hex   Decimal
0000     0     0
0001     1     1
0010     2     2
0011     3     3
0100     4     4
0101     5     5
0110     6     6
0111     7     7
1000     8     8
1001     9     9
1010     A     10
1011     B     11
1100     C     12
1101     D     13
1110     E     14
1111     F     15

Conversion Examples

Binary: 1011 0110 1101 0010
Hex:      B    6    D    2
Result: 0xB6D2

Hex: 0x3F
Binary: 0011 1111
Decimal: 63
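
The nibble-by-nibble conversion shown above can be reproduced with a short Python sketch. The function name `binary_to_hex` is an illustrative assumption; Python's built-in `int(text, 16)` handles the reverse direction.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a binary string to hex by translating one 4-bit group at a time."""
    bits = bits.replace(" ", "")
    # Pad on the left so the length is a multiple of 4.
    bits = bits.zfill(-(-len(bits) // 4) * 4)
    digits = "0123456789ABCDEF"
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "0x" + "".join(digits[int(group, 2)] for group in groups)


print(binary_to_hex("1011 0110 1101 0010"))  # 0xB6D2
print(int("3F", 16))                         # 63
```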

ARM Hexadecimal Usage

MOV R0, #0xFF        ; R0 = 255
MOV R1, #0x100       ; R1 = 256
LDR R2, =0xDEADBEEF  ; R2 = 3735928559

5.3 ARM Instruction Encoding

5.3.1 Fixed-Length Instructions

32-Bit Instruction Format

Advantages

Trade-offs

5.3.2 Data Processing Instruction Format

Format Structure

[Cond][00][I][Opcode][S][Rn][Rd][Operand2]
4-bit 2  1   4-bit   1  4   4   12-bit

Field Descriptions

Condition (4 bits, bits 28-31)

I bit (bit 25)

Opcode (4 bits, bits 21-24)

S bit (bit 20)

Rn (4 bits, bits 16-19)

Rd (4 bits, bits 12-15)

Operand2 (12 bits, bits 0-11)

Example: ADD R0, R1, R2

Encoding fields:
  Cond: 1110 (always)
  I: 0 (register operand)
  Opcode: 0100 (ADD)
  S: 0 (don't update flags)
  Rn: 0001 (R1)
  Rd: 0000 (R0)
  Operand2: 0000 0000 0010 (R2, no shift)

Result: 0xE0810002
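
To see how the fields combine into the 32-bit word, the following Python sketch shifts each field to the bit position listed in the field descriptions and ORs them together. It is an illustration only; the function and helper names are assumptions, and it does not validate that the opcode or operand is meaningful.

```python
def _field(value: int, width: int, shift: int) -> int:
    """Check that a field fits its width, then move it to its bit position."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"value {value} does not fit in {width} bits")
    return value << shift


def encode_data_processing(cond, i, opcode, s, rn, rd, operand2) -> int:
    """Assemble [Cond][00][I][Opcode][S][Rn][Rd][Operand2] into one word."""
    return (
        _field(cond, 4, 28)
        | _field(0b00, 2, 26)       # fixed bits 27-26 for data processing
        | _field(i, 1, 25)
        | _field(opcode, 4, 21)
        | _field(s, 1, 20)
        | _field(rn, 4, 16)
        | _field(rd, 4, 12)
        | _field(operand2, 12, 0)
    )


# ADD R0, R1, R2 with the field values listed above
word = encode_data_processing(0b1110, 0, 0b0100, 0, 1, 0, 2)
print(hex(word))  # 0xe0810002
```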

5.3.3 Data Transfer Instruction Format

Format Structure

[Cond][01][I][P][U][B][W][L][Rn][Rd][Offset]
4-bit 2  1  1  1  1  1  1  4   4   12-bit

Key Fields

L bit (bit 20)

B bit (bit 22)

P bit (bit 24)

U bit (bit 23)

W bit (bit 21)

Rn (base register)

Rd (data register)

Offset (12 bits)

Example: LDR R0, [R1, #4]

Encoding fields:
  Cond: 1110 (always)
  I: 0 (immediate offset)
  P: 1 (offset addressing)
  U: 1 (add offset)
  B: 0 (word)
  W: 0 (no write-back)
  L: 1 (load)
  Rn: 0001 (R1)
  Rd: 0000 (R0)
  Offset: 0x004 (immediate 4)

Result: 0xE5910004
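
The same shift-and-OR approach reproduces the data transfer encoding. This Python sketch follows the field order of the format structure above; the names are hypothetical, and the I and W bits are taken as 0 for this example (immediate offset, no write-back).

```python
def _field(value: int, width: int, shift: int) -> int:
    """Check that a field fits its width, then move it to its bit position."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"value {value} does not fit in {width} bits")
    return value << shift


def encode_data_transfer(cond, i, p, u, b, w, l, rn, rd, offset) -> int:
    """Assemble [Cond][01][I][P][U][B][W][L][Rn][Rd][Offset] into one word."""
    return (
        _field(cond, 4, 28)
        | _field(0b01, 2, 26)       # fixed bits 27-26 for data transfer
        | _field(i, 1, 25)
        | _field(p, 1, 24)
        | _field(u, 1, 23)
        | _field(b, 1, 22)
        | _field(w, 1, 21)
        | _field(l, 1, 20)
        | _field(rn, 4, 16)
        | _field(rd, 4, 12)
        | _field(offset, 12, 0)
    )


# LDR R0, [R1, #4]
word = encode_data_transfer(0b1110, i=0, p=1, u=1, b=0, w=0, l=1, rn=1, rd=0, offset=4)
print(hex(word))  # 0xe5910004
```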

5.3.4 Immediate Value Encoding

Challenge

ARM Solution: 8-bit + 4-bit Rotation

Calculation

$$ \text{Actual Value} = \text{Immediate} \ \text{ROR} \ (2 \times \text{Rotation}) $$

Examples

Immediate=0xFF, Rotation=0:
Value = 0xFF ROR 0 = 0x000000FF

Immediate=0xFF, Rotation=8:
Value = 0xFF ROR 16 = 0x00FF0000

Immediate=0xFF, Rotation=4:
Value = 0xFF ROR 8 = 0xFF000000
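
Rotations are easy to get wrong by hand, so it can help to compute them. The sketch below models a 32-bit rotate right in Python and decodes an (immediate, rotation) pair; `ror32` and `decode_immediate` are illustrative names, not ARM or assembler APIs.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right; bits leaving bit 0 re-enter at bit 31."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def decode_immediate(imm8: int, rotation: int) -> int:
    """Expand an 8-bit immediate and 4-bit rotation field to a 32-bit value."""
    return ror32(imm8 & 0xFF, 2 * (rotation & 0xF))


print(f"{decode_immediate(0xFF, 0):#010x}")  # 0x000000ff
print(f"{decode_immediate(0xFF, 8):#010x}")  # 0x00ff0000
print(f"{decode_immediate(0xFF, 4):#010x}")  # 0xff000000
```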

Allowed Immediates

Assembler Handling

LDR R0, =0x12345678  ; Loads from literal pool
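
Whether a constant can be encoded directly can be tested by trying all 16 rotations. The following brute-force Python sketch is an assumption about how such a check could be written, not a description of any particular assembler; when several encodings exist it simply returns the one with the smallest rotation field.

```python
MASK32 = 0xFFFFFFFF


def rol32(value: int, amount: int) -> int:
    """Rotate a 32-bit value left (the inverse of rotate right)."""
    amount %= 32
    value &= MASK32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def find_immediate_encoding(value: int):
    """Return (imm8, rotation) if value is an 8-bit rotated immediate, else None."""
    value &= MASK32
    for rotation in range(16):
        # Rotating left by 2*rotation undoes the ROR applied when decoding.
        imm8 = rol32(value, 2 * rotation)
        if imm8 <= 0xFF:
            return imm8, rotation
    return None


print(find_immediate_encoding(0xFF000000))  # (255, 4)
print(find_immediate_encoding(0x12345678))  # None
```

A `None` result corresponds to the case where the constant has to come from somewhere else, such as the literal pool load shown above.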

5.4 Logical Operations

5.4.1 Bitwise AND

Operation

Truth Table

A  B  A AND B
0  0  0
0  1  0
1  0  0
1  1  1

ARM Instruction

AND Rd, Rn, Rm       ; Rd = Rn AND Rm
AND Rd, Rn, #imm     ; Rd = Rn AND immediate

Common Uses

Bit Masking (Extract Specific Bits)

; Extract lower 8 bits of R1
MOV R0, R1
AND R0, R0, #0xFF    ; R0 = R1 & 0xFF (keep bits 0-7)

; Extract bits 8-15
MOV R0, R1
AND R0, R0, #0xFF00  ; R0 = R1 & 0xFF00 (keep bits 8-15)

Clearing Specific Bits

; Clear bit 5 of R1
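AND R1, R1, #0xFFFFFFDF  ; Bit 5 mask: ~(1 << 5)
; 0xFFFFFFDF is not an 8-bit rotated immediate; assemblers typically
; encode this as the equivalent BIC R1, R1, #0x20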

Checking if Bit Set

AND R2, R1, #0x80    ; Check if bit 7 is set
CMP R2, #0           ; Compare with zero
BEQ bit_clear        ; Branch if bit was clear
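
The three AND idioms above (extract, clear, test) can be modelled on 32-bit values in Python. This is a sketch with hypothetical helper names; the explicit `MASK32` is needed because Python integers are not limited to 32 bits.

```python
MASK32 = 0xFFFFFFFF


def extract_low_byte(value: int) -> int:
    """Keep bits 0-7, as AND with #0xFF does."""
    return value & 0xFF


def clear_bit(value: int, bit: int) -> int:
    """AND with the inverted single-bit mask ~(1 << bit)."""
    return value & ~(1 << bit) & MASK32


def is_bit_set(value: int, bit: int) -> bool:
    """A non-zero AND result means the bit was set."""
    return (value & (1 << bit)) != 0


print(hex(extract_low_byte(0x12345678)))  # 0x78
print(hex(clear_bit(0xFFFFFFFF, 5)))      # 0xffffffdf
print(is_bit_set(0x80, 7))                # True
```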

5.4.2 Bitwise OR

Operation

Truth Table

A  B  A OR B
0  0  0
0  1  1
1  0  1
1  1  1

ARM Instruction

ORR Rd, Rn, Rm       ; Rd = Rn OR Rm (ORR in ARM)
ORR Rd, Rn, #imm     ; Rd = Rn OR immediate

Common Uses

Setting Specific Bits

; Set bit 3 of R1
ORR R1, R1, #0x08    ; Bit 3 mask: (1 << 3) = 0x08

; Set bits 4 and 5
ORR R1, R1, #0x30    ; Mask: 0x30 = 0b00110000

Combining Values

; Combine lower byte of R1 with upper bytes of R2
AND R1, R1, #0xFF        ; Keep only lower byte
AND R2, R2, #0xFFFFFF00  ; Keep only upper bytes (typically encoded as BIC R2, R2, #0xFF)
ORR R0, R1, R2           ; Combine

5.4.3 Bitwise XOR (Exclusive OR)

Operation

Truth Table

A  B  A XOR B
0  0  0
0  1  1
1  0  1
1  1  0

ARM Instruction

EOR Rd, Rn, Rm       ; Rd = Rn XOR Rm (EOR in ARM)
EOR Rd, Rn, #imm     ; Rd = Rn EOR immediate

Common Uses

Toggling Specific Bits

; Toggle bit 2 of R1
EOR R1, R1, #0x04    ; Bit 2 mask: (1 << 2)
; If bit was 0, becomes 1; if was 1, becomes 0

Fast Zero

EOR R0, R0, R0       ; R0 = 0 (XOR with itself)

Comparison

; Check if R1 and R2 are equal
EOR R3, R1, R2       ; R3 = R1 XOR R2
CMP R3, #0           ; If R3 = 0, R1 == R2
BEQ values_equal

Swapping Without Temporary

; Swap R0 and R1 without using another register
EOR R0, R0, R1
EOR R1, R0, R1
EOR R0, R0, R1
; Now R0 and R1 are swapped
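
To follow why the three EOR steps swap the values, the sequence can be traced in Python. This is an illustrative model with the variables `r0` and `r1` standing in for the registers.

```python
def xor_swap(r0: int, r1: int):
    """Mirror the three EOR instructions and return the new (r0, r1)."""
    r0 = r0 ^ r1   # r0 = a ^ b
    r1 = r0 ^ r1   # r1 = (a ^ b) ^ b = a
    r0 = r0 ^ r1   # r0 = (a ^ b) ^ a = b
    return r0, r1


r0, r1 = xor_swap(0x1234, 0xABCD)
print(hex(r0), hex(r1))  # 0xabcd 0x1234
```

The trick relies on the two operands being different registers: if both operands named the same register, the first EOR would clear it and the original value would be lost.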

5.4.4 Bitwise NOT

Operation

ARM Instruction

MVN Rd, Rm           ; Rd = NOT Rm (Move Not)
MVN Rd, #imm         ; Rd = NOT immediate

Common Uses

Creating Bit Masks

; Create mask with all bits set except bit 3
MOV R0, #0x08        ; 0x08 = 0b00001000
MVN R1, R0           ; R1 = 0xFFFFFFF7 (all except bit 3)

Negation (with ADD)

; Negate R1 (two's complement)
MVN R1, R1           ; Invert all bits
ADD R1, R1, #1       ; Add 1
; Now R1 = -R1 (original)
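
The MVN and ADD sequence can be checked with a 32-bit model in Python. The helper names are assumptions for illustration; results are kept as unsigned 32-bit patterns and converted to signed values only for display.

```python
MASK32 = 0xFFFFFFFF


def mvn(value: int) -> int:
    """Bitwise NOT restricted to 32 bits."""
    return ~value & MASK32


def negate(value: int) -> int:
    """Two's complement negation: invert all bits, then add 1."""
    return (mvn(value) + 1) & MASK32


def to_signed32(pattern: int) -> int:
    """Interpret a 32-bit pattern as a signed value."""
    return pattern - (1 << 32) if pattern & 0x80000000 else pattern


print(hex(mvn(0x08)))          # 0xfffffff7
print(hex(negate(5)))          # 0xfffffffb
print(to_signed32(negate(5)))  # -5
```

One edge case is worth noting: negating 0x80000000 yields 0x80000000 again, because the most negative 32-bit value has no positive counterpart.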

5.4.5 Shift Operations

Logical Shift Left (LSL)

LSL Rd, Rn, #shift   ; Rd = Rn << shift
MOV Rd, Rn, LSL #shift

Logical Shift Right (LSR)

LSR Rd, Rn, #shift   ; Rd = Rn >> shift (unsigned)
MOV Rd, Rn, LSR #shift

Arithmetic Shift Right (ASR)

ASR Rd, Rn, #shift   ; Rd = Rn >> shift (signed)

Rotate Right (ROR)

ROR Rd, Rn, #shift   ; Rotate Rn right by shift

Common Shift Applications

Fast Multiplication/Division by Powers of 2

LSL R0, R1, #3       ; R0 = R1 × 8 (2^3)
LSR R0, R1, #2       ; R0 = R1 / 4 (unsigned)
ASR R0, R1, #2       ; R0 = R1 / 4 (signed, rounds toward negative infinity)
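
The difference between the two right shifts shows up only for negative values. The Python sketch below models LSL, LSR and ASR on 32-bit patterns, assuming shift amounts from 0 to 31; the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(pattern: int) -> int:
    """Interpret a 32-bit pattern as a signed value."""
    return pattern - (1 << 32) if pattern & 0x80000000 else pattern


def lsl(value: int, shift: int) -> int:
    """Logical shift left: zeros enter at bit 0, high bits are discarded."""
    return (value << shift) & MASK32


def lsr(value: int, shift: int) -> int:
    """Logical shift right: zeros enter at bit 31."""
    return (value & MASK32) >> shift


def asr(value: int, shift: int) -> int:
    """Arithmetic shift right: copies of the sign bit enter at bit 31."""
    return (to_signed32(value & MASK32) >> shift) & MASK32


minus_seven = -7 & MASK32                  # 0xFFFFFFF9
print(lsl(5, 3))                           # 40
print(lsr(20, 2))                          # 5
print(to_signed32(asr(minus_seven, 2)))    # -2
print(hex(lsr(minus_seven, 2)))            # 0x3ffffffe
```

Note that ASR gives -2 for -7 shifted by 2, while a truncating integer division of -7 by 4 gives -1, so the shift is an exact replacement for signed division only when the result does not need rounding toward zero.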

Bit Extraction

; Extract bits 8-11 from R1
LSR R0, R1, #8       ; Shift bits 8-11 to bits 0-3
AND R0, R0, #0xF     ; Mask to keep only 4 bits

Bit Positioning

; Move bit 0 to bit 7
LSL R0, R1, #7       ; Shift left 7 positions
AND R0, R0, #0x80    ; Keep only bit 7

5.5 Practical Bit Manipulation Examples

5.5.1 Extracting Bit Fields

Extract bits 16-23

LSR R0, R1, #16      ; Shift right to position
AND R0, R0, #0xFF    ; Mask to 8 bits

Extract bits 4-9 (6 bits)

LSR R0, R1, #4       ; Shift to position 0
AND R0, R0, #0x3F    ; Mask to 6 bits (0b111111)
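
Both extractions follow one pattern: shift the field down to bit 0, then mask to the field width. A generic Python sketch of that pattern, with a hypothetical function name and test value, looks like this:

```python
def extract_field(value: int, low: int, width: int) -> int:
    """Return `width` bits of `value` starting at bit `low`."""
    mask = (1 << width) - 1        # width 8 gives 0xFF, width 6 gives 0x3F
    return (value >> low) & mask


sample = 0x12345678
print(hex(extract_field(sample, 16, 8)))  # 0x34  (bits 16-23)
print(hex(extract_field(sample, 4, 6)))   # 0x27  (bits 4-9)
```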

5.5.2 Setting and Clearing Bits

Set bits 8-15

ORR R1, R1, #0xFF00  ; Set bits 8-15

Clear bits 16-23

LDR R0, =0xFF00FFFF  ; Mask with bits 16-23 clear
AND R1, R1, R0       ; Clear bits 16-23 of R1

Toggle bits 0-7

EOR R1, R1, #0xFF    ; Toggle lower byte
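
The masks used above can be derived from the bit range rather than written by hand. This Python sketch builds a mask for bits `low` to `high` inclusive and applies OR, AND-with-inverse and XOR to it; all names are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF


def field_mask(low: int, high: int) -> int:
    """Mask with bits low..high (inclusive) set."""
    return (((1 << (high - low + 1)) - 1) << low) & MASK32


def set_bits(value: int, low: int, high: int) -> int:
    return (value | field_mask(low, high)) & MASK32


def clear_bits(value: int, low: int, high: int) -> int:
    return value & ~field_mask(low, high) & MASK32


def toggle_bits(value: int, low: int, high: int) -> int:
    return (value ^ field_mask(low, high)) & MASK32


print(hex(set_bits(0x00000000, 8, 15)))     # 0xff00
print(hex(clear_bits(0xFFFFFFFF, 16, 23)))  # 0xff00ffff
print(hex(toggle_bits(0x12345678, 0, 7)))   # 0x12345687
```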

5.5.3 Checking Flags

Check if any of bits 4-7 are set

AND R2, R1, #0xF0    ; Mask bits 4-7
CMP R2, #0           ; Check if zero
BNE bits_set         ; Branch if any bit was set

Check if specific pattern matches

; Check if bits 8-11 are 0b1010
LSR R0, R1, #8       ; Position bits
AND R0, R0, #0xF     ; Mask 4 bits
CMP R0, #0xA         ; Compare with 0b1010
BEQ pattern_match

5.5.4 Color Packing/Unpacking

Pack RGB values (8 bits each)

; R0 = Red, R1 = Green, R2 = Blue
LSL R1, R1, #8       ; Green << 8
LSL R2, R2, #16      ; Blue << 16
ORR R3, R0, R1       ; Combine Red and Green
ORR R3, R3, R2       ; Combine with Blue
; R3 now contains 0x00BBGGRR

Unpack RGB values

; R0 contains 0x00BBGGRR
AND R1, R0, #0xFF      ; Extract Red
LSR R2, R0, #8
AND R2, R2, #0xFF      ; Extract Green
LSR R3, R0, #16
AND R3, R3, #0xFF      ; Extract Blue
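
As a cross-check of the 0x00BBGGRR layout, the packing and unpacking steps can be written as a round trip in Python. The function names and sample colour values are illustrative assumptions.

```python
def pack_rgb(red: int, green: int, blue: int) -> int:
    """Pack three 8-bit channels as 0x00BBGGRR (red in the low byte)."""
    return ((blue & 0xFF) << 16) | ((green & 0xFF) << 8) | (red & 0xFF)


def unpack_rgb(packed: int):
    """Recover (red, green, blue) by shifting and masking each byte."""
    red = packed & 0xFF
    green = (packed >> 8) & 0xFF
    blue = (packed >> 16) & 0xFF
    return red, green, blue


packed = pack_rgb(0x11, 0x22, 0x33)
print(f"{packed:#010x}")   # 0x00332211
print(unpack_rgb(packed))  # (17, 34, 51)
```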

Key Takeaways

  1. Unsigned binary integers represent values from 0 to 2^N - 1 using N bits.
  2. Two's complement represents signed integers, with MSB as sign bit and range -(2^(N-1)) to +(2^(N-1) - 1).
  3. Sign extension preserves value when expanding narrower signed values to wider registers.
  4. Hexadecimal notation provides compact representation with one hex digit per 4 binary bits.
  5. ARM instructions are fixed 32-bit length, simplifying fetch/decode but limiting immediate values.
  6. Data processing format includes condition, opcode, source/destination registers, and operand.
  7. Data transfer format specifies load/store, byte/word, addressing mode, and offset.
  8. Immediate encoding uses 8-bit value + 4-bit rotation, limiting which constants can be encoded directly.
  9. Bitwise AND is used for masking (extracting specific bits) and clearing bits.
  10. Bitwise OR is used for setting specific bits and combining values.
  11. Bitwise XOR is used for toggling bits, fast zeroing, and comparisons.
  12. Shift operations enable fast multiplication/division by powers of 2 and bit positioning.
  13. Bit manipulation is fundamental for low-level programming, hardware control, and optimization.
  14. Understanding encoding helps write efficient assembly and debug machine code issues.

Summary

Number representation and instruction encoding form the foundation of low-level programming. Two's complement enables efficient signed arithmetic with simple hardware, while sign extension preserves values across different data sizes. ARM's fixed 32-bit instruction format provides regularity but constrains immediate values; the 8-bit value with 4-bit rotation scheme and literal pool loads work around that limit. Logical operations (AND, OR, XOR, and NOT), combined with shift operations, provide the bit-manipulation tools needed in systems programming, embedded development, and performance optimization. Mastering these concepts enables efficient assembly programming and a deeper understanding of how high-level operations translate to machine instructions. These fundamentals prepare us for more complex topics including branching, function calls, and memory management.