The Sieve of Eratosthenes

The sieve of Eratosthenes finds all prime numbers up to a given limit.

Method

The algorithm starts out by assuming all numbers are prime, and marking them as such.
At the end of the algorithm, only prime numbers up to an upper limit will still be marked.

The number 1 is a special case, so we start off by unmarking it.

Then we go through the numbers one by one.
For every non-prime number we find, skip to the next number.

If a number is still marked as prime when we get to it, that means it is prime.

Before moving on to the next number, we first unmark every multiple of the found prime.

Those multiples can be divided through the prime number we just found, so by definition isn't prime.

We repeat this process until we reach the upper limit.

Every number that is still marked as prime, is truly prime.

Optimizations

By using some math we can do significantly less work while still getting the same result.

Repeat until the square root

TL;DR: Only check numbers up to the square root of the upper limit.

After that, every number up to that limit will be accurately marked, because math is cool.

While iterating through all numbers, we can stop at the square root of the upper limit.

Any non-prime can be expressed as the product of 2 numbers that are not 1 or itself.

n = a * b

a and b are factors of n.

n = sqrt(n) * sqrt(n), so one factor has to be less than or equal to sqrt(n) while the other is greater than or equal to that square root.

a <= sqrt(n) <= b

Up to any number n, all multiples of a number bigger than sqrt(n) must have a factor smaller than sqrt(n).
As a result that multiple will already be unmarked.

The factors can both be identical to sqrt(n).

49 for example

This means that all the non-primes >= sqrt(limit) will be unmarked in the process of checking every number <= sqrt(limit).

Example

sqrt(21) = 4.58

Any number up to 21 that is a multiple of a number larger than 4.58 will have a factor smaller than 4.58.

Because 18 is a number up to 21.

It is also a multiple of a number that is bigger than 4.58.

That means a factor of 18 must be smaller than 4.58.

That checks out, 3 is a factor!

Because 3 is a factor of 18.
18 was unmarked while going through multiples when 3 was the number the algorithm was unmarking multiples for!

Start unmarking at the square

TL;DR: Start unmarking multiples of a number at that number squared.

All multiples below are already unmarked, because math is cool.

During the step the algorithm unmarks all multiples of a number.
We can start unmarking at that number squared.

Every smaller multiple was already unmarked in a previous iteration.

Why?

A multiple can be written as a multiplier times a number.

  • m = multiple
  • k = multiplier
  • p = prime

m = k * p

The number that is now p, was previously k for every smaller prime number.

Because k * p = p * k, every multiple smaller than p * p has already been unmarked in a previous iteration.

Example

If our current detected prime, p = 5.

5 was previously the multiplier for every smaller prime number.

  • 5 * 2 was unmarked when p was 2, we don't need to calculate 2 * 5
  • 5 * 3 was unmarked when p was 3, we don't need to calculate 3 * 5

Step by step in code

The goal is to write a function that returns a list of prime numbers, up to upper_bound.

I named that variable optimus_prime while writing code for this post, because I thought that was funny

We initialise a list of booleans that is 1 bigger than the given upper_bound and call it sieve.
These booleans tell us if the number at that index is prime or not. (True for prime, False for not)

def primes_up_to(upper_bound):
  # initialise sieve that marks all numbers as prime
  sieve = [True] * (upper_bound + 1)

Smart people decided programmers start counting at 0, so that's why that list is 1 bigger than upper_bound.
It's also the reason why we have to unmark the index 0 along with the index 1 before we start our loop.

def primes_up_to(upper_bound):
  # initialise sieve that marks all numbers as prime
  sieve = [True] * (upper_bound + 1)

  # 0 and 1 are not prime
  sieve[0] = False
  sieve[1] = False

This works out perfectly, because now every index exactly matches the number it represents.

You want to know if the number 69 is prime? The boolean at index 69 will tell you. Nice!

Loop over every number, starting at 2 and ending at the square root of upper_bound.
Inside the loop, index sieve with that number.

import math

def primes_up_to(upper_bound):
  # initialise sieve that marks all numbers as prime
  sieve = [True] * (upper_bound + 1)

  # 0 and 1 are not prime
  sieve[0] = False
  sieve[1] = False

  # iterate up to square root of upper_bound
  # reason: if one factor of num is bigger than sqrt(upper_bound),
  # an other factor _must_ be smaller than sqrt(upper_bound)
  for num in range(2, math.floor(math.sqrt(upper_bound)) + 1):
    # if sieve[num] is true, then num is prime
    if sieve[num]:

If the boolean at that location is True, the number is prime and we unmark every multiple before moving on to the next step of our loop.

Do this by skip counting.
Start at the number squared and add the number until you hit upper_bound.

For every encountered multiple, set sieve at that number's index to False.

import math

def primes_up_to(upper_bound):
  # initialise sieve that marks all numbers as prime
  sieve = [True] * (upper_bound + 1)

  # 0 and 1 are not prime
  sieve[0] = False
  sieve[1] = False

  # iterate up to square root of upper_bound
  # reason: if one factor of num is bigger than sqrt(upper_bound),
  # an other factor _must_ be smaller than sqrt(upper_bound)
  for num in range(2, math.floor(math.sqrt(upper_bound)) + 1):
    # if sieve[num] is true, then num is prime
    if sieve[num]:
      # unmark all multiples
      # start unmarking at num squared
      # every smaller multiple has already been unmarked in previous iterations
      for multiple in range(num ** 2, upper_bound + 1, num):
        sieve[multiple] = False

At the end of the outer loop, sieve will be full of booleans corresponding to the primeness of every possible index to that list.

Use your favourite method to loop over a list while also getting the index, put the indexes with a true into a new list, and presto, primes.

import math

def primes_up_to(upper_bound):
  # initialise sieve that marks all numbers as prime
  sieve = [True] * (upper_bound + 1)

  # 0 and 1 are not prime
  sieve[0] = False
  sieve[1] = False

  # iterate up to square root of upper_bound
  # reason: if one factor of num is bigger than sqrt(upper_bound),
  # an other factor _must_ be smaller than sqrt(upper_bound)
  for num in range(2, math.floor(math.sqrt(upper_bound)) + 1):
    # if sieve[num] is true, then num is prime
    if sieve[num]:
      # unmark all multiples
      # start unmarking at num squared
      # every smaller multiple has already been unmarked in previous iterations
      for multiple in range(num ** 2, upper_bound + 1, num):
        sieve[multiple] = False

  # sieve is done, turn `True` into numbers
  return [idx for idx, mark in enumerate(sieve) if mark]

The returned value is a list of prime numbers, starting at 2, and ending in the last prime up to upper_bound.

  • primes_up_to(16) returns [2, 3, 5, 7, 11, 13].
  • primes_up_to(17) returns [2, 3, 5, 7, 11, 13, 17].
  • primes_up_to(18) returns [2, 3, 5, 7, 11, 13, 17].
  • primes_up_to(19) returns [2, 3, 5, 7, 11, 13, 17, 19].

Final code

pub fn primes_up_to(upper_bound: usize) -> Vec<usize> {
    // initialise sieve that marks all numbers as prime
    let mut sieve = vec![true; upper_bound + 1];

    // 0 and 1 are not prime
    sieve[0] = false;
    sieve[1] = false;

    // iterate up to square root of upper_bound
    // reason: if one factor of num is bigger than sqrt(upper_bound),
    // an other factor _must_ be smaller than sqrt(upper_bound)
    for num in 2..=(upper_bound as f64).sqrt() as usize + 1 {
        // if sieve[num] is true, then num is prime
        if sieve[num] {
            // unmark all multiples
            // start unmarking at num squared
            // every smaller multiple has already been unmarked in previous iterations
            for multiple in (num * num..=upper_bound).step_by(num) {
                sieve[multiple] = false;
            }
        }
    }

    // sieve is done, turn `true` into numbers
    sieve
        .iter()
        .enumerate()
        .filter_map(|(idx, mark)| match mark {
            true => Some(idx),
            false => None,
        })
        .collect()
}
function primesUpTo(upperBound) {
  // initialise sieve that marks all numbers as prime
  const sieve = Array.from({ length: upperBound + 1 }, () => true);

  // 0 and 1 are not prime
  sieve[0] = false;
  sieve[1] = false;

  // iterate up to square root of upperBound
  // reason: if one factor of num is bigger than sqrt(upperBound),
  // an other factor _must_ be smaller than sqrt(upperBound)
  for (let num = 2; num <= Math.sqrt(upperBound) + 1; num++) {
    // if sieve[num] is true, then num is prime
    if (sieve[num]) {
      // unmark all multiples
      // start unmarking at num squared
      // every smaller multiple has already been unmarked in previous iterations
      for (let multiple = num ** 2; multiple <= upperBound; multiple += num) {
        sieve[multiple] = false;
      }
    }
  }

  // sieve is done, turn `true` into numbers
  const primes = [];
  for (const [idx, mark] of sieve.entries()) {
    mark && primes.push(idx);
  }

  return primes;
}
import math

def primes_up_to(upper_bound):
  # initialise sieve that marks all numbers as prime
  sieve = [True] * (upper_bound + 1)

  # 0 and 1 are not prime
  sieve[0] = False
  sieve[1] = False

  # iterate up to square root of upper_bound
  # reason: if one factor of num is bigger than sqrt(upper_bound),
  # an other factor _must_ be smaller than sqrt(upper_bound)
  for num in range(2,math.floor(math.sqrt(upper_bound)) + 1):
    # if sieve[num] is true, then num is prime
    if sieve[num]:
      # unmark all multiples
      # start unmarking at num squared
      # every smaller multiple has already been unmarked in previous iterations
      for multiple in range(num**2, upper_bound + 1, num):
        sieve[multiple] = False

  # sieve is done, turn `True` into numbers
  return [idx for idx, mark in enumerate(sieve) if mark]

19