Introduction

A computer virus is a type of malicious software that, when executed, replicates itself by modifying other computer programs and inserting its own code into them. The term "virus" was introduced in this context by computer scientist Fred Cohen, who used it in his 1984 paper "Computer Viruses: Theory and Experiments" (crediting his advisor Leonard Adleman with the name) and developed the theory further in his 1986 PhD dissertation. The name draws an analogy to biological viruses, which replicate by injecting their genetic material into host cells.

The defining characteristic that distinguishes viruses from other malware is parasitic self-replication. A virus cannot exist independently -- it must attach itself to a host program, document, or boot sector. When the host is executed, the virus code runs as well, infecting additional hosts. This dependence on a host and on human action to spread differentiates viruses from worms, which spread autonomously without host programs or human intervention.

"A computer virus is a program that can 'infect' other programs by modifying them to include a possibly evolved copy of itself. Every program that gets infected can also act as a virus, and thus the infection grows." -- Fred Cohen, "Computer Viruses: Theory and Experiments" (1984)

History of Computer Viruses

The conceptual foundations of self-replicating programs predate modern computing. In 1949, mathematician John von Neumann described the theoretical possibility of self-reproducing automata. The first programs that could be considered virus-like appeared in the early 1970s, though they were experimental rather than malicious.

Year | Virus/Event | Significance
-----|-------------|-------------
1971 | Creeper | First self-replicating program on ARPANET; displayed "I'm the creeper, catch me if you can!"
1982 | Elk Cloner | First virus to spread in the wild; infected Apple II boot sectors via floppy disk
1986 | Brain | First IBM PC virus; boot sector virus created by Basit and Amjad Farooq Alvi in Pakistan
1987 | Vienna, Cascade, Jerusalem | Early file infectors; Jerusalem triggered its payload on Friday the 13th
1988 | Morris Worm | First major internet worm (technically a worm, not a virus); prompted creation of CERT
1992 | Michelangelo | Boot sector virus that caused worldwide media panic despite limited actual infections
1995 | Concept | First macro virus; infected Word documents; demonstrated a new attack vector
1999 | Melissa | Macro virus/worm hybrid; spread via Outlook; caused an estimated $80 million in damage
2000 | ILOVEYOU | VBScript virus; an estimated $10 billion in damage; reportedly infected about 10% of internet-connected computers
2003 | Sobig.F | Fastest-spreading email virus at the time; reportedly generated 1 million copies in 24 hours
2010 | Stuxnet | Sophisticated malware attributed to nation-state actors (usually classified as a worm); targeted Iranian nuclear centrifuges; used four zero-days

How Viruses Work

Infection Mechanism

When a virus infects a program, it modifies the host file to include the virus code. The modification must be done in such a way that the virus code executes before (or instead of) the original program code. Common techniques include:

  • Prepending: The virus inserts itself at the beginning of the host file. When the program runs, the virus executes first, then transfers control to the original code.
  • Appending: The virus adds itself to the end of the host file and modifies the entry point to jump to the virus code. After executing, the virus restores the original entry point and transfers control.
  • Cavity Infection: The virus inserts itself into unused spaces (code caves) within the host file without changing its size, making detection more difficult.
  • Entry Point Obscuring (EPO): Rather than modifying the file's entry point, the virus patches a call instruction somewhere in the middle of the host code to redirect to the virus, making it harder to find.
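
From the defender's side, the appending technique leaves a recognizable trace: the entry point ends up in the last section of the file instead of the first code section. The following is a minimal sketch of that heuristic in Python. It assumes the section table has already been parsed by some other tool; the `Section` class and both function names are hypothetical and exist only for illustration. The check is deliberately simple: it would not catch cavity infection or EPO, and legitimate packers can trigger it as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """Hypothetical, simplified view of one section of an executable."""
    name: str
    start: int  # address where the section begins
    size: int   # section length in bytes

    def contains(self, address: int) -> bool:
        return self.start <= address < self.start + self.size


def entry_point_section(sections, entry_point):
    """Return the section that holds the entry point, or None if none does."""
    for section in sections:
        if section.contains(entry_point):
            return section
    return None


def looks_like_appended_code(sections, entry_point):
    """Flag an entry point that lies outside every section or in the last one."""
    if not sections:
        return False
    owner = entry_point_section(sections, entry_point)
    if owner is None:
        return True
    ordered = sorted(sections, key=lambda s: s.start)
    return len(ordered) > 1 and owner == ordered[-1]
```

A scanner would treat a positive result as one suspicious signal among several, not as proof of infection.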

The Virus Lifecycle

A virus typically passes through four phases:

  1. Dormant Phase: The virus is present on the system but inactive. Not all viruses have this phase; some activate immediately.
  2. Propagation Phase: The virus replicates itself by infecting new host files. It searches for uninfected targets (executables, documents, boot sectors) and modifies them to include a copy of itself. Most viruses include an infection marker to avoid infecting the same file twice.
  3. Triggering Phase: The virus activates its payload based on a trigger condition. Common triggers include specific dates (Jerusalem: Friday the 13th), boot counts (Elk Cloner: every 50th boot), or random conditions.
  4. Payload Phase: The virus executes its malicious payload. Payloads range from harmless messages and graphical effects to destructive actions like file deletion, disk formatting, or data corruption.
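
The trigger conditions named in the triggering phase are simple predicates over the system date or a counter. As an illustration, the sketch below lists the dates on which a Friday-the-13th trigger would fire in a given year and models a boot-count trigger. The function names are hypothetical, and the code only evaluates the conditions; an analyst might use something like it to decide which clock settings to test in a sandbox.

```python
from datetime import date


def friday_13ths(year: int) -> list[date]:
    """List every Friday the 13th in the given year."""
    return [
        date(year, month, 13)
        for month in range(1, 13)
        if date(year, month, 13).weekday() == 4  # Monday is 0, so Friday is 4
    ]


def boot_count_trigger(boot_count: int, interval: int = 50) -> bool:
    """True on every `interval`-th boot (50 matches the Elk Cloner example)."""
    return boot_count > 0 and boot_count % interval == 0
```

For example, `friday_13ths(2024)` returns the dates 13 September 2024 and 13 December 2024.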

Types of Viruses

Boot Sector Viruses

Boot sector viruses infect the Master Boot Record (MBR) or Volume Boot Record (VBR) of storage devices. They load into memory during the boot process, before the operating system, giving them control over the system from the earliest stage. Boot sector viruses spread when an infected storage medium (historically floppy disks, later USB drives) is used to boot a computer.

The Brain virus (1986), the first IBM PC virus, was a boot sector virus. It replaced the boot sector with its own code, moving the original boot sector to another location on the disk. When the system booted from the infected disk, Brain loaded first, installed itself in memory, and intercepted disk access calls to hide its presence and infect other disks.

Boot sector viruses were dominant in the late 1980s and early 1990s but declined as floppy disk usage decreased. However, the concept lives on in modern bootkits that target the UEFI boot process.
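
Because a boot sector virus must replace the boot code, comparing that code against a known-good hash is a straightforward check. The sketch below assumes the classic 512-byte MBR layout (446 bytes of boot code, a 64-byte partition table, and the two-byte 0x55 0xAA signature) and assumes the sector bytes have already been read from a disk image by the caller. The function names are illustrative, not part of any existing tool.

```python
import hashlib

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446           # classic MBR: boot code occupies bytes 0-445
BOOT_SIGNATURE = b"\x55\xaa"   # bytes 510-511 of a valid MBR


def has_boot_signature(sector: bytes) -> bool:
    """Check that the sector is 512 bytes long and ends with 0x55 0xAA."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


def boot_code_digest(sector: bytes) -> str:
    """SHA-256 of the boot code area only, so repartitioning does not alter it."""
    return hashlib.sha256(sector[:BOOT_CODE_SIZE]).hexdigest()


def boot_code_changed(sector: bytes, baseline_digest: str) -> bool:
    """Compare the current boot code against a digest recorded earlier."""
    return boot_code_digest(sector) != baseline_digest
```

Hashing only the boot code area is a design choice: the partition table changes legitimately when disks are repartitioned, while the boot code rarely does.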

File Infectors

File infector viruses attach themselves to executable programs (on Windows, typically .exe and .dll files). When the infected program runs, the virus code executes and searches for other executables to infect. File infectors were the most common virus type through the 1990s and early 2000s.

File infectors vary in their targeting strategy:

  • Direct-action viruses: Search for files to infect each time they execute, then transfer control to the host. They do not stay resident in memory.
  • Memory-resident viruses: Install themselves in memory and intercept system calls. They infect files as they are opened, executed, or copied, remaining active until the system is rebooted.
  • Companion viruses: Create a separate file with the same name but a different extension that takes precedence in execution order (e.g., creating PROGRAM.COM alongside PROGRAM.EXE, since DOS executed .COM files first).
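
Companion infections leave the original executable untouched, so the telltale sign is the pair of files itself. The following sketch looks through a list of file names for programs that exist under more than one executable extension in the same directory. It assumes the DOS lookup order of .COM, then .EXE, then .BAT; the function name is hypothetical, and a match is only a candidate for inspection, since some software legitimately ships such pairs.

```python
from collections import defaultdict
from pathlib import PurePath

# DOS resolved a bare command name in this order.
DOS_PRECEDENCE = [".com", ".exe", ".bat"]


def find_companion_candidates(filenames):
    """Map (directory, stem) to its extensions, in DOS execution order,
    for every program name that has more than one executable extension."""
    groups = defaultdict(set)
    for name in filenames:
        path = PurePath(name)
        ext = path.suffix.lower()
        if ext in DOS_PRECEDENCE:
            groups[(str(path.parent), path.stem.lower())].add(ext)

    findings = {}
    for key, extensions in groups.items():
        if len(extensions) > 1:
            findings[key] = sorted(extensions, key=DOS_PRECEDENCE.index)
    return findings
```

The first extension in each result is the file DOS would have run, so that is the one to examine first.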

Macro Viruses

Macro viruses exploit the macro programming languages built into applications like Microsoft Word, Excel, and Access. Rather than infecting executable files, they infect documents and templates. When a user opens an infected document, the macro executes and infects the global template (Normal.dot in Word), which then infects every subsequently opened or created document.

The Concept virus (1995) was the first macro virus, demonstrating that data files could carry executable code. The Melissa virus (1999) combined macro virus techniques with email propagation, sending infected documents to the first 50 entries in the victim's Outlook address book. Macro viruses dominated the malware landscape from 1995 to 2002.

Microsoft's decision to disable macros by default in Office applications (and later to block macros in files downloaded from the internet entirely, announced in 2022) has significantly reduced the macro virus threat, though malicious macro documents remain a common delivery mechanism for trojans.
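
A quick triage step for suspicious attachments is to check whether a document carries any macro code at all. Modern Office files (.docx, .docm, .xlsm, and similar) are ZIP archives, and VBA macros are stored in a part named `vbaProject.bin`. The sketch below uses only Python's standard `zipfile` module; the function name is illustrative. It does not handle the legacy binary formats (.doc, .xls), and it says nothing about whether a macro is malicious, only that one is present.

```python
import io
import zipfile


def contains_vba_project(document) -> bool:
    """Return True if an OOXML file (path or file object) has a VBA project part."""
    try:
        with zipfile.ZipFile(document) as archive:
            return any(
                name.lower().endswith("vbaproject.bin")
                for name in archive.namelist()
            )
    except zipfile.BadZipFile:
        # Legacy binary formats are not ZIP archives and need a different parser.
        return False


# Usage with an in-memory archive standing in for a real document.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("word/document.xml", "<w:document/>")
    archive.writestr("word/vbaProject.bin", b"\x00")
sample.seek(0)
print(contains_vba_project(sample))  # True
```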

Evasion Techniques

Polymorphic Viruses

Polymorphic viruses encrypt their code with a variable encryption key each time they replicate, producing different-looking copies that evade signature-based detection. The virus body is encrypted and therefore looks different in each infected file, but the decryption routine (the "decryptor") must remain in plaintext to decrypt the virus at runtime.
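
The effect on signature scanning can be shown with harmless data. In the sketch below, a fixed byte string stands in for a constant body, and a single-byte XOR stands in for the variable-key encryption; real polymorphic engines use more elaborate schemes, and every name here is made up for the illustration. Each encoded copy differs from the others and no longer contains the signature bytes, yet decoding always recovers the same body.

```python
def xor_encode(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


# Harmless stand-in for a constant body; the marker plays the role of a signature.
BODY = b"header|EXAMPLE-SIGNATURE-BYTES|trailer"
SIGNATURE = b"EXAMPLE-SIGNATURE-BYTES"

copy_a = xor_encode(BODY, 0x21)
copy_b = xor_encode(BODY, 0x5C)

print(SIGNATURE in BODY)                  # True: the plain body matches
print(SIGNATURE in copy_a)                # False: the encoded copy does not
print(copy_a == copy_b)                   # False: different keys, different bytes
print(xor_encode(copy_a, 0x21) == BODY)   # True: decoding recovers the body
```

This is also why emulation is effective against such code: once the decryptor has run, the constant body is exposed in memory and can be matched as usual.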

Early antivirus programs could detect polymorphic viruses by scanning for the decryptor code. In response, advanced polymorphic viruses generate unique decryptors for each copy using techniques like register reassignment, instruction substitution, code reordering, and garbage code insertion. The Mutation Engine (MtE), created by the virus author "Dark Avenger" in 1991, was the first polymorphic engine available as a toolkit, allowing any virus writer to make their creations polymorphic.
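
Scanning for a decryptor whose key bytes vary between copies is typically done with wildcard signatures: fixed bytes must match exactly, while marked positions may hold any value. A minimal sketch of such a matcher follows. The pattern shown is arbitrary and not taken from any real sample, and the names are hypothetical. As the paragraph above notes, this approach stops working once the decryptor itself is regenerated for each copy.

```python
from typing import Optional, Sequence

Pattern = Sequence[Optional[int]]  # None marks a wildcard position


def find_pattern(data: bytes, pattern: Pattern) -> int:
    """Return the first offset where the pattern matches, or -1 if it never does."""
    if not pattern:
        return -1
    for offset in range(len(data) - len(pattern) + 1):
        if all(
            expected is None or data[offset + i] == expected
            for i, expected in enumerate(pattern)
        ):
            return offset
    return -1


# Arbitrary illustrative pattern: two positions are allowed to vary.
STUB_PATTERN = [0xAA, None, 0xBB, 0xCC, None, 0xDD]
```

For instance, both `bytes([0xAA, 0x11, 0xBB, 0xCC, 0x22, 0xDD])` and the same bytes with different values in the wildcard positions match at offset 0, while changing any fixed byte makes the search return -1.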

Metamorphic Viruses

Metamorphic viruses go further than polymorphism by rewriting their entire code with each generation. Rather than encrypting a static body, the virus contains a metamorphic engine that analyzes its own code, disassembles it into an intermediate representation, applies transformations (instruction substitution, register reassignment, code transposition, subroutine permutation), and reassembles a functionally equivalent but structurally different version.

Aspect | Polymorphic | Metamorphic
-------|-------------|------------
Core mechanism | Encrypt virus body with variable key | Rewrite entire code each generation
Decryptor needed | Yes (constant-like stub in plaintext) | No (no encryption used)
Code variation | Encrypted body looks different; decryptor similar | Entire code is structurally different
Detection difficulty | Medium-High (decryptor analysis, emulation) | Very High (no consistent signatures)
Implementation complexity | Moderate | Very high (requires code analysis engine)
Notable examples | Storm Worm, Virlock, MtE-based viruses | Zmist (Zombie.Mistfall), Simile, Regswap

The Zmist (Zombie.Mistfall) virus, created by the virus author "Z0mbie" in 2001, is often cited as one of the most sophisticated metamorphic viruses ever written. It could disassemble its host program, integrate its own code into the host's code flow (a technique known as code integration), and reassemble the result, which made it very difficult to distinguish virus code from host code.
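
One way detection tools respond to instruction substitution and garbage insertion is to normalize code before comparing it, so that equivalent variants collapse to the same form. The sketch below is a toy version operating on assembly text. The rewrite rules are hypothetical and intentionally tiny, and they ignore side effects such as CPU flags that a real normalizer would have to model; the point is only to show the idea.

```python
# Toy rewrite rules: each key is replaced by one canonical form.
# Assumption: flag side effects are ignored for the sake of illustration.
CANONICAL = {
    "sub eax, eax": "mov eax, 0",
    "xor eax, eax": "mov eax, 0",
    "add ebx, 1": "inc ebx",
}

# Instructions treated as filler with no effect in this toy model.
JUNK = {"nop", "mov ebx, ebx"}


def normalize(instructions):
    """Lower-case, collapse whitespace, drop filler, and map equivalents
    to a single canonical spelling."""
    result = []
    for line in instructions:
        text = " ".join(line.lower().split())
        if text in JUNK:
            continue
        result.append(CANONICAL.get(text, text))
    return result


variant_a = ["xor eax, eax", "nop", "add ebx, 1", "ret"]
variant_b = ["SUB EAX, EAX", "inc ebx", "mov ebx, ebx", "ret"]

print(variant_a == variant_b)                        # False
print(normalize(variant_a) == normalize(variant_b))  # True
```

Practical engines work on disassembled or emulated code, not text, and need far larger rule sets, which is part of why metamorphic code is expensive to detect.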

Notable Viruses in History

ILOVEYOU (2000): A VBScript virus distributed as an email attachment named "LOVE-LETTER-FOR-YOU.TXT.vbs." When opened, it overwrote files with copies of itself, emailed itself to all Outlook contacts, and downloaded a password-stealing trojan. It is estimated to have infected 45 million computers within two days, causing an estimated $10 billion in damage and prompting the Pentagon, CIA, and British Parliament to shut down their email systems. The virus was traced to two programmers in the Philippines, but the case was dropped because the Philippines had no law against writing malware at the time.

CIH / Chernobyl (1998): One of the most destructive viruses ever created. It infected Windows PE executables using the cavity infection technique, splitting its code across unused gaps in the PE file so that the file size did not change. On April 26 (the anniversary of the Chernobyl disaster), it attempted to overwrite the system's BIOS flash memory, potentially rendering the computer unbootable. It also overwrote the first megabyte of the hard drive, destroying the partition table. Its creator, Chen Ing-Hau, was identified but not convicted at the time, since Taiwanese prosecutors could not bring charges under the laws then in force; the incident led to new computer crime legislation in Taiwan.

Stuxnet (2010): Often described as the most sophisticated malware discovered up to that time, Stuxnet is widely attributed to a joint US-Israeli operation (reportedly codenamed "Olympic Games") targeting Iran's Natanz uranium enrichment facility. Although it is usually classified as a worm because it propagated on its own, it is commonly discussed alongside viruses. It spread via USB drives and network shares, used four zero-day exploits, and carried drivers signed with stolen digital certificates. Its payload specifically targeted Siemens S7-315 and S7-417 PLCs controlling centrifuge motors, causing them to spin at destructive speeds while reporting normal readings to operators. Stuxnet is estimated to have destroyed approximately 1,000 of Iran's roughly 6,000 centrifuges.

"Stuxnet was the first true cyber weapon. It crossed the line from the digital world into the physical world, causing actual physical destruction." -- Ralph Langner, the researcher who first analyzed Stuxnet's payload

Antivirus and Defense

The computer virus threat drove the creation of the antivirus industry in the late 1980s. Detection techniques have evolved significantly:

Detection Method | How It Works | Strengths | Limitations
-----------------|--------------|-----------|------------
Signature-Based | Compares file content against a database of known virus byte patterns | Fast, accurate for known viruses, low false positives | Cannot detect unknown or polymorphic viruses
Heuristic Analysis | Analyzes code structure and behavior patterns for suspicious characteristics | Can detect unknown viruses with similar patterns | Higher false positive rate
Emulation/Sandboxing | Executes suspicious code in a virtual environment to observe behavior | Defeats encryption and packing; reveals true behavior | Slow; anti-emulation techniques exist
Behavioral Monitoring | Monitors running programs for virus-like behavior (mass file modification, etc.) | Detects zero-day viruses based on actions, not signatures | Can only detect after execution begins
Machine Learning | Classifies files using ML models trained on millions of malware and benign samples | Can generalize to unknown variants | Adversarial samples can evade ML models
Integrity Monitoring | Detects unauthorized changes to system files and executables | Reliably detects file infection | Generates alerts for legitimate updates
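
Of the methods in the table, integrity monitoring is the simplest to illustrate. The sketch below records a SHA-256 digest for each monitored file and later reports any file whose content has changed or which has disappeared. It uses only the Python standard library; the function names and the in-memory dictionary used as a baseline are assumptions made for the example, and a real tool would also need to protect the baseline itself from tampering.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths):
    """Record the current digest of every given file."""
    return {str(p): file_digest(Path(p)) for p in paths}


def changed_files(baseline):
    """Return recorded paths whose content now differs or which are missing."""
    changed = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or file_digest(path) != expected:
            changed.append(name)
    return sorted(changed)
```

As the table notes, legitimate updates also change digests, so the baseline has to be rebuilt after trusted software changes.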

Viruses vs. Other Malware

The term "virus" is often used colloquially to refer to all malware, but in technical usage it refers specifically to self-replicating code that requires a host. Understanding the distinctions is important:

Characteristic | Virus | Worm | Trojan
---------------|-------|------|-------
Self-replication | Yes (requires host) | Yes (standalone) | No
Requires host program | Yes | No | No (disguises itself as a legitimate program)
Requires human action to spread | Yes (running infected program) | No (spreads autonomously) | Yes (user must execute)
Primary spread method | Infected files, removable media | Network exploitation | Social engineering
Speed of spread | Slow (depends on sharing) | Very fast (automated) | Varies (depends on distribution)
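
The first two rows of the comparison amount to a small decision rule, which can be written out as a sketch. The enum and function below are hypothetical and cover only the three categories discussed here; real samples often combine traits (Melissa, for example, is described above as a virus/worm hybrid), so this is a simplification, not a taxonomy.

```python
from enum import Enum


class MalwareClass(Enum):
    VIRUS = "virus"
    WORM = "worm"
    TROJAN = "trojan"
    OTHER = "other"


def classify(self_replicates: bool, needs_host: bool, disguised: bool = False) -> MalwareClass:
    """Apply the distinctions from the comparison: replication first, then host dependence."""
    if self_replicates:
        return MalwareClass.VIRUS if needs_host else MalwareClass.WORM
    if disguised:
        return MalwareClass.TROJAN
    return MalwareClass.OTHER
```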

Modern Relevance

Traditional file-infecting viruses have declined significantly since the 2000s, replaced by trojans, ransomware, and fileless malware as the dominant threat categories. Several factors contributed to this decline:

  • The shift from local file sharing to internet-based distribution reduced the effectiveness of file infection as a spreading mechanism
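  • Improved OS security features made file infection more difficult: code signing exposes modified executables, and exploit mitigations such as DEP and ASLR hinder the memory-corruption attacks often used to deliver malware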
  • Antivirus software became ubiquitous and effective against known virus techniques
  • Cybercrime shifted toward profit-motivated attacks where trojans and ransomware are more effective

However, virus techniques remain relevant. File infection is still used by some advanced threats, macro-based attacks continue to be a major initial access vector, and the evasion techniques pioneered by virus writers (polymorphism, metamorphism, anti-analysis) are now standard features of most sophisticated malware. For analysis of modern malware, see malware analysis.

References

  • Cohen, F. (1986). "Computer Viruses." PhD Dissertation, University of Southern California.
  • Szor, P. (2005). The Art of Computer Virus Research and Defense. Addison-Wesley Professional.
  • Filiol, E. (2005). Computer Viruses: From Theory to Applications. Springer.
  • Ludwig, M. (1993). The Giant Black Book of Computer Viruses. American Eagle Publications.
  • Zetter, K. (2014). Countdown to Zero Day: Stuxnet and the Launch of the World's First Digital Weapon. Crown.
  • Langner, R. (2013). "To Kill a Centrifuge: A Technical Analysis of What Stuxnet's Creators Tried to Achieve." The Langner Group.
  • von Neumann, J. (1949). "Theory of Self-Reproducing Automata." University of Illinois Press (edited by A. Burks, 1966).
  • AV-TEST Institute. (2024). "Malware Statistics and Trends." https://www.av-test.org/
  • Microsoft. (2022). "Macros from the Internet Will Be Blocked by Default in Office." Microsoft Tech Community.
  • CERT/CC. (1988). "CERT Advisory CA-1988-01: Internet Worm." Carnegie Mellon University.