Systems Software

This blog post is a short review of ECE353 (Systems Software, Winter 2017, taught by Professor Baochun Li) at the University of Toronto. The course covers seven categories of basic topics about computer systems:

  1. Concurrency
  2. Scheduling Policy
  3. Virtualization
  4. File System
  5. I/O
  6. Security
  7. BLITZ

0. What is an Operating System?

  • OS is a layer of software between applications and hardware. It implements an API so that applications can access hardware. It controls the execution of applications.
  • OS virtualizes resources: it makes the resources easier to use.

1. Concurrency

  • Process and Thread
    • A process is a running instance of a program.
    • A process has a group of resources and a “thread” of control.
    • Multiple threads can reside in the same process, but each has its own program counter (PC), stack, and registers.
  • Context switching
  • Mutex
    • A mutex is like the only microphone in a room. Only the one holding it can speak while others have to wait and listen.
    • TSL (test-and-set lock) + spin-loop implementation of a Mutex
      TSL is atomic; the atomicity is guaranteed by the hardware.
      def Lock():
          # TSL atomically sets the lock word to 1 and returns its previous value.
          # Keep spinning while the previous value was 1 (someone else held the lock).
          while TSL(lock) == 1:
              pass

      def Unlock():
          lock = 0    # just release the lock; an ordinary store is enough
  • Semaphore
    • A Semaphore is like a waitress outside a restaurant who counts the remaining seats inside and asks customers to line up when no seats are available.
    • Using a Semaphore to make sure event A happens before event B (see the sketch below):
      Init semaphore to 0.
      When A is done, sem.Up()
      When B starts, sem.Down()
    • A binary semaphore can be a Mutex.
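    • A minimal sketch of the A-before-B pattern above, using Python's threading.Semaphore in place of the course's primitives (release() plays the role of Up(), acquire() of Down()):

      import threading

      done = threading.Semaphore(0)   # initialized to 0, so B must wait for A

      def a():
          print("A: work that must happen first")
          done.release()              # Up(): signal that A has finished

      def b():
          done.acquire()              # Down(): block until A has called release()
          print("B: safe to run now")

      threading.Thread(target=b).start()   # even if B starts first, it blocks
      threading.Thread(target=a).start()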
  • Condition and Monitor
    • A Condition variable supports Wait(&mutex), Signal(&mutex), and Broadcast(&mutex) operations.
    • There are Hoare semantics and Mesa semantics for Monitors.
    • Hoare semantics require that the signalling thread yield control to the awakened thread immediately.
    • Mesa semantics are less strict: they only require that the thread calling Signal() or Broadcast() eventually exit the Monitor, so an awakened thread must re-check its condition.
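    • A small sketch using Python's threading.Condition, which follows Mesa semantics: the waiter must re-check its condition in a while loop after waking up (under Hoare semantics a simple if would suffice); the buffer here is just for illustration:

      import threading
      from collections import deque

      buffer = deque()
      lock = threading.Lock()
      not_empty = threading.Condition(lock)

      def consume():
          with not_empty:            # hold the mutex protecting the buffer
              while not buffer:      # Mesa: the condition may no longer hold
                  not_empty.wait()   # when we wake up, so test it again
              return buffer.popleft()

      def produce(item):
          with not_empty:
              buffer.append(item)
              not_empty.notify()     # Signal(): wake one waiting consumer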
  • Producer / Consumer problem, race condition (circular dependency condition)
  • Dining Philosopher, Sleeping Barber, etc.
    • Two solutions to the Dining Philosopher:
    1. Use five Semaphores, representing the five philosophers.
      See Professor Li’s slides for how this solution is derived.
    2. Use five Semaphores, representing the five forks. Let four of the philosophers pick up left fork before right, and the remaining one pick up the right fork first.
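      A sketch of solution 2 in Python, with five binary semaphores as the forks; making the last philosopher pick up the right fork first breaks the circular-wait condition:

      import threading

      N = 5
      forks = [threading.Semaphore(1) for _ in range(N)]   # one fork per semaphore

      def philosopher(i, meals=3):
          left, right = forks[i], forks[(i + 1) % N]
          # Philosopher N-1 reverses the order, so the philosophers cannot all
          # hold one fork while waiting for the next (no circular wait).
          first, second = (left, right) if i != N - 1 else (right, left)
          for _ in range(meals):
              first.acquire()
              second.acquire()
              print("philosopher", i, "eats")
              second.release()
              first.release()

      threads = [threading.Thread(target=philosopher, args=(i,)) for i in range(N)]
      for t in threads: t.start()
      for t in threads: t.join()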
  • Some synchronization problems and their causes:
    • Deadlock: four conditions must all hold for a deadlock to occur: mutual exclusion, hold-and-wait, no preemption, and circular wait (circular dependency).

2. Scheduling Policy

  • CPU / IO burst cycle.
    • There are CPU-bound tasks and I/O-bound tasks. The former need more computation time, while the latter need faster responses for better interactivity. Intuitively, we want to give CPU-bound tasks lower priority and I/O-bound tasks higher priority.
  • Some metrics to measure the effectiveness of a scheduling policy:
    • Turnaround time: time from when a job arrives to when it completes
    • Response time: time from when a job arrives to when it first gets to run
    • Throughput: number of jobs completed per unit of time
  • Round Robin
  • First Come First Serve
  • Shortest Job First
  • Shortest Time-to-finish first (Preemptive SJF)
  • Multiple-Level Feedback Queue (MLFQ) scheduling
    • Run the highest-priority queue first; a thread starts at the highest priority; its priority is lowered once it uses up its time allotment; a periodic priority boost moves every thread back to the top queue.
  • Linux O(1) scheduling
    • Keep two arrays of run queues, active and expired, to track the jobs.
    • Always execute the highest-priority job in the active array.
    • Move a job to the expired array once its allotted time slice is used up.
    • Swap the active and expired arrays when the active array becomes empty.
  • Proportional Share Scheduling
    • Lottery Scheduling
    • Stride Scheduling
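    • A minimal lottery-scheduling sketch (job names and ticket counts are made up); each job's share of the CPU is proportional to its tickets, and stride scheduling replaces the random draw with a deterministic pass counter:

      import random

      jobs = {"A": 100, "B": 50, "C": 50}        # A should get about half the CPU

      def pick_winner(jobs):
          total = sum(jobs.values())
          winner = random.randrange(total)       # draw a winning ticket in [0, total)
          counter = 0
          for name, tickets in jobs.items():
              counter += tickets
              if winner < counter:
                  return name                    # job whose ticket range covers the draw

      slices = [pick_winner(jobs) for _ in range(10000)]
      for name in jobs:
          print(name, slices.count(name) / len(slices))   # roughly 0.50 / 0.25 / 0.25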

3. Virtualization

  • Address Space
  • Address Translation, Segmentation
    • Dynamic partitioning (external fragmentation), coalescing
    • Buddy allocation (internal fragmentation)
    • Slab allocation
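    • A toy sketch of the buddy idea (the addresses and minimum block size are made up): a request is rounded up to the next power of two, which is where the internal fragmentation comes from, and a freed block coalesces with its buddy, whose address differs only in the bit equal to the block size:

      def round_up_pow2(n):
          # smallest power of two >= n; the slack is internal fragmentation
          p = 1
          while p < n:
              p *= 2
          return p

      def buddy_of(addr, size):
          # the buddy block's address differs only in the bit equal to the size
          return addr ^ size

      size = round_up_pow2(13)                 # a 13-byte request gets a 16-byte block
      print(size, hex(buddy_of(0x40, size)))   # the block at 0x40 coalesces with 0x50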
  • Paging: page tables, inverted page tables, multi-level page tables
  • Faster Paging: TLB
    • Motivation of TLB: to speed up the mapping from virtual page address to physical frame address.
    • The TLB lookup algorithm
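      A sketch of the lookup, assuming a software-managed TLB and a toy single-level page table (both are plain dicts here):

      PAGE_SIZE = 4096                       # 4 KB pages, i.e. 12 offset bits

      tlb = {}                               # VPN -> PFN cache (the TLB)
      page_table = {0: 7, 1: 3, 2: 9}        # VPN -> PFN (toy page table)

      def translate(vaddr):
          vpn, offset = divmod(vaddr, PAGE_SIZE)
          if vpn in tlb:                     # TLB hit: no page-table access needed
              pfn = tlb[vpn]
          else:                              # TLB miss: walk the page table ...
              if vpn not in page_table:
                  raise MemoryError("page fault")
              pfn = page_table[vpn]
              tlb[vpn] = pfn                 # ... and cache the mapping for next time
          return pfn * PAGE_SIZE + offset    # physical address = frame base + offset

      print(hex(translate(0x1234)))          # VPN 1 maps to PFN 3, so 0x3234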
  • Page Replacement Policies:
    • FIFO: first-in-first-out
    • LRU: least recently used
    • NRU: not recently used
    • Clock Algorithm (second chance, to approximate NRU)
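    • A sketch of the clock (second chance) algorithm, assuming each resident page has a reference bit that the hardware sets on access (the frame contents are made up):

      def clock_evict(ref_bits, hand):
          # Sweep the circular list of frames: a page with reference bit 1 gets a
          # second chance (bit cleared, hand advances); the first page with bit 0
          # is the victim. Returns (victim index, new hand position).
          while True:
              if ref_bits[hand]:
                  ref_bits[hand] = 0
                  hand = (hand + 1) % len(ref_bits)
              else:
                  return hand, (hand + 1) % len(ref_bits)

      frames = ["A", "B", "C", "D"]
      ref_bits = [1, 0, 1, 1]
      victim, hand = clock_evict(ref_bits, hand=0)
      print(frames[victim])                  # evicts "B", the first page with bit 0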
  • Virtual Machine: Type 1 / 2 hypervisor, virtualizable
    • A Type 1 hypervisor runs directly on the hardware, in kernel mode.
    • A Type 2 hypervisor runs as an ordinary program on top of a host OS.
    • An architecture is virtualizable if every sensitive instruction (one that may only be executed in kernel mode) is also a privileged instruction (one that traps into kernel mode if executed in user mode).

4. File System

  • FAT (Intel x86 structure): 4 KB pages, 32 bits per entry (12+10+10), 1024 entries per table
  • PAE (Physical Address Extension): 3 levels, 4 KB pages, 64 bits per entry (12+40+12), 512 entries per table
  • Data Integrity problems:
    Modern disks are unreliable.
    • Latent sector errors: can be detected (and often corrected) by the disk itself using error-correcting codes
    • Block corruption: silent; the disk cannot detect it. Caused by buggy firmware or a faulty disk.
  • RAID
    • Throughput:
    • RAID-0: Striping
    • RAID-1: Mirroring to increase reliability
    • RAID-4: use a dedicated parity (checksum) disk
    • RAID-5: rotate the parity across all disks so the single parity disk does not become a bottleneck or wear out quickly
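    • A small sketch of RAID-4/5 parity: the parity block is the bitwise XOR of the data blocks in a stripe, so any single lost block can be rebuilt by XORing the survivors (the block contents are arbitrary):

      from functools import reduce

      def xor_blocks(blocks):
          # byte-wise XOR of equal-sized blocks
          return bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*blocks))

      d0, d1, d2 = b"\x0f\x0f", b"\xf0\x00", b"\x33\x3c"   # one stripe of data blocks
      parity = xor_blocks([d0, d1, d2])                    # stored on the parity disk

      # The disk holding d1 fails: rebuild it from the other data blocks and parity.
      rebuilt = xor_blocks([d0, d2, parity])
      assert rebuilt == d1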
  • Journaling FS: crash consistency
    • After writing the TxA, TxB, etc. blocks, write a TxE block as the commit step. Only once the TxE block is safely on disk is the transaction committed; its presence indicates that the TxA, TxB, etc. blocks were written correctly.
    • Write the TxE block to commit only after the data blocks have reached the disk.
      Why is this step necessary? Why can't we write the data blocks and the TxE block at the same time? Because the disk may reorder the writes. If the disk writes TxE first and a power outage happens right after TxE finishes, the journal looks committed even though the TxA, TxB, etc. blocks may be damaged, and we have no way to tell.
    • In Linux ext4, a checksum of the transaction's contents is calculated and stored in TxE, so damaged blocks can still be detected during recovery.
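    • A toy simulation of the commit ordering above (the journal list and flush() barrier are stand-ins, not real file-system code):

      journal = []                                   # stand-in for the on-disk journal

      def flush():
          # stand-in for a write barrier: everything issued so far is now durable
          pass

      def commit(tx_blocks):
          for name, data in tx_blocks:               # 1. journal TxA, TxB, ... first
              journal.append((name, data))
          flush()                                    # 2. barrier: they must hit disk first
          journal.append(("TxE", len(tx_blocks)))    # 3. only then write the commit block
          flush()                                    # 4. TxE durable => transaction committed
          # 5. checkpoint: the blocks may now go to their final on-disk locations

      commit([("TxA", b"inode update"), ("TxB", b"data block")])
      print(journal)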

5. I/O Devices

  • Memory-mapped
  • DMA (Direct Memory Access)

6. Security

  • What can an adversary do?
    • Eavesdropping
    • Tampering
    • Masquerading
  • Three objectives of a secure channel:
    • Secrecy: Data is not known by others
    • Integrity: Data is not modified by others
    • Authenticity: Data is attributed to the correct author
  • Symmetric encryption
  • Asymmetric encryption
  • MAC: Message Authentication Code
    • A and B already share a secret key $K_{AB}$
    • A sends the message $m$ together with $\mathrm{MAC} = H(K_{AB}, m)$
    • B recomputes $H(K_{AB}, m)$ over the received message and checks that it equals the received MAC
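    • A sketch of this check using Python's hmac module as the keyed hash H (the key and message are made up):

      import hashlib
      import hmac

      k_ab = b"shared secret between A and B"        # K_AB, established beforehand

      # A: send the message together with MAC = H(K_AB, m)
      m = b"transfer 100 dollars to Bob"
      tag = hmac.new(k_ab, m, hashlib.sha256).digest()

      # B: recompute H(K_AB, m) over the received message and compare the tags
      expected = hmac.new(k_ab, m, hashlib.sha256).digest()
      print(hmac.compare_digest(tag, expected))      # True: message is authentic and intact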
  • TLS Handshake Protocol
    • Step 1:
      • Client sends ClientHello: the highest TLS protocol version supported, and a random number
      • The server responds with ServerHello: a chosen protocol version and a random number
      • Server sends Certificate: $K_{s,pub}$
      • Server sends ServerHelloDone
    • Step 2:
      • Client sends ClientKeyExchange with a PreMasterSecret, encrypted with $K_{s,pub}$
      • Both client and server derive the session keys from the PreMasterSecret and the two random numbers.
    • Step 3:
      • Client sends ChangeCipherSpec, meaning “Everything from now on will be authenticated and encrypted”.
      • Client sends ClientFinish, encrypted with the newly derived session keys
      • Server responds with ChangeCipherSpec
      • Server sends ServerFinish
  • Common Attacks
    • Man-in-the-middle
    • (Distributed) Denial of Service
      Each server has limited capacity. When too many clients send requests simultaneously, the server cannot keep up. A DDoS attack floods the server with requests so that it can no longer respond to legitimate clients.
    • Trojan Horse
    • Buffer Overflow
      • Exploit lack of bounds checking
      • Exploit lack of separation between stack and code
      • Allows user (attack) code to be placed in a setuid root process and hence executed in a more privileged protection domain
      • Hardware solutions:
        • The CPU disallows execution of code in the stack section of memory (e.g., Sun's SPARC)
        • The CPU can mark a page non-executable (e.g., the NX bit in AMD and Intel x86 CPUs, supported by Linux and Windows XP SP2)
    • SQL injection
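      A minimal illustration using Python's built-in sqlite3 (the table and inputs are made up): string-formatting user input into the query lets an attacker change the query's structure, while a parameterized query keeps the input as plain data.

      import sqlite3

      db = sqlite3.connect(":memory:")
      db.execute("CREATE TABLE users (name TEXT, password TEXT)")
      db.execute("INSERT INTO users VALUES ('alice', 's3cret')")

      name, password = "alice", "' OR '1'='1"        # attacker-controlled input

      # Vulnerable: the input becomes part of the SQL text, so the OR clause matches every row.
      unsafe = "SELECT * FROM users WHERE name = '%s' AND password = '%s'" % (name, password)
      print(db.execute(unsafe).fetchall())           # returns alice's row without her password

      # Safe: placeholders keep the input as data, never as SQL syntax.
      safe = "SELECT * FROM users WHERE name = ? AND password = ?"
      print(db.execute(safe, (name, password)).fetchall())   # returns []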

7. BLITZ

  • Lab 2
    • Implement Mutex with the sleep-wakeup model
    • Implement Producer-Consumer problem with a Mutex and two Semaphores
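    • The same structure sketched with Python primitives instead of BLITZ/KPL (buffer size and item count are arbitrary): the Mutex protects the buffer, while the two Semaphores count empty and full slots.

      import threading
      from collections import deque

      SIZE = 5
      buffer = deque()
      mutex = threading.Lock()             # protects the buffer itself
      empty = threading.Semaphore(SIZE)    # counts free slots
      full = threading.Semaphore(0)        # counts filled slots

      def producer():
          for item in range(10):
              empty.acquire()              # Down on "empty": wait for a free slot
              with mutex:
                  buffer.append(item)
              full.release()               # Up on "full": announce a filled slot

      def consumer():
          for _ in range(10):
              full.acquire()               # wait for a filled slot
              with mutex:
                  item = buffer.popleft()
              empty.release()              # announce a free slot
              print("consumed", item)

      t1 = threading.Thread(target=producer)
      t2 = threading.Thread(target=consumer)
      t1.start(); t2.start(); t1.join(); t2.join()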
  • Lab 3
    • Dining Philosophers with 5 Condition Variables (as forks)
    • Gaming Parlor
    • Sleeping Barber (note how Semaphores enforce a before/after ordering between events)
  • Lab 4
    • ThreadManager, ProcessManager, FrameManager
    • GetANewXXX(), ReturnXXX()
    • Remember to call Init()… Get familiar with KPL syntax
  • Lab 5
    • Get familiar with diskUtil. Enable the kernel to read in an executable file.
    • Create a new process and change to user-level process: InitFirstProcess() and StartUserProcess()
    • Implement Handle_Sys_Exec()
    • Minimal stub code for the other system calls.
  • Lab 6
    • ProcessFinish()
    • Handle_Sys_Fork(): return the child's pid in the parent, and 0 in the child. (Why does the child see 0? Its saved return-value register R1 is set to 0 before the child runs.)
    • Handle_Sys_Join() and Handle_Sys_Exit(): Why zombies are needed, and when they are eliminated
    • Handle_Sys_Yield(): just Thread.Yield()
  • Lab 7
    • Handle_Sys_Open() and Handle_Sys_Close(). Why are they needed? To build up channels for file operations.
    • Handle_Sys_Read() and Handle_Sys_Write(): when should we set a page's dirty or referenced bit? Dirty only on Sys_Read, because the kernel writes the file data into the user's page; referenced on both Sys_Read and Sys_Write.
    • Handle_Sys_Seek(). currentThread.myProcess.fileDescriptor[openFile].currentPos
    • Modified Handle_Sys_Fork() to copy the fileDescriptors
    • Modified ProcessFinish to decrement numberOfUsers in OpenFile and FileControlBlock
  • Lab 8
    • Serial Driver
      • GetChar() and PutChar(): producer-consumer problem
      • SerialHandler: Two scenarios of waking it up
    • Pointers are just addresses.
    • Raw and cooked mode for the terminal