100 Languages Speedrun: Episode 30: SageMath

There are two kinds of "mathematical" computations - numerical and symbolic.

Numerical ones (like Octave, Julia, NumPy etc.) take a lot of numbers, often whole vectors and matrices of them, and do some numerical computation.

Symbolic ones (like Z3, SageMath) take abstract mathematical expressions, and do transformations on them. In practice most systems are somewhat mixed, but it's still an useful distinction.

SageMath is a fairly weird language is that it's 99% Python, and it runs on Python engine, but it's not quite completely Python, and they introduced a bunch of "pre-parsing" changes to make it more math-friendly. It's sort of like relation between React's JSX or Babel and plain JavaScript.

There's no real point bothering with Hello World, FizzBuzzes, and such, as you can run Python code on SageMath, and they will generally just work unless you just so happen to hit one of the differences.

Syntax differences

You'd most commonly run SageMath from a Jupyter Notebook or similar, but you can run it from command line as well.

Two most obvious syntax differences are / and ^, but even plain numbers are not quite the same:

#!/usr/bin/env sage

# XOR in Python, ** in SageMath
print(3^4)

# Floats in Python, Rationals in SageMath
print(2/3)

# All numbers are wrapped in special classes
print(type(2))
print(type(2/3))

Let's try to run it:

$ sage syntax.sage
81
2/3
<class 'sage.rings.integer.Integer'>
<class 'sage.rings.rational.Rational'>
$ python3 syntax.sage
7
0.6666666666666666
<class 'int'>
<class 'float'>

But we can even see what Sage converted the code into (I adjusted spacing a little):

# This file was *autogenerated* from the file syntax.sage
from sage.all_cmdline import * # import sage library

_sage_const_3 = Integer(3)
_sage_const_4 = Integer(4)
_sage_const_2 = Integer(2)

#!/usr/bin/env sage

# XOR in Python, ** in SageMath
print(_sage_const_3 ** _sage_const_4)

# Floats in Python, Rationals in SageMath
print(_sage_const_2 / _sage_const_3)

# All numbers are wrapped in special classes
print(type(_sage_const_2))
print(type(_sage_const_2 / _sage_const_3))

Limits

Let's start with some high school stuff, like limits:

#!/usr/bin/env sage

print("limit of sin(x) / x as x approaches 0")
print(limit(sin(x) / x, x=0))

print("limit of sin(x) / x as x approaches infinity")
print(limit(sin(x) / x, x=oo))

print("limit of (1+1/x)^x as x approaches infinity")
print(limit((1+1/x)^x, x=oo))

When we run it:

$ ./limits.sage
limit of sin(x) / x as x approaches 0
1
limit of sin(x) / x as x approaches infinity
0
limit of (1+1/x)^x as x approaches infinity
e

In sage a lot of mathematical symbols are pre-defined. x, sin, oo are all symbolic, not what you'd expect in plain Python.

The oo alias for Infinity is just two os, not the Unicode infinite character. It's probably best for ease of typing, but it's somewhat annoying that Python doesn't support ∞ in variable names, as it's technically not a "letter". Raku has ∞ out of the box, and Julia doesn't by default, but you can do ∞ = inf in Julia.

Differentiation

We can do some simple differentiation:

#!/usr/bin/env sage

print("diff of (x+2)*(x-2):")
print(diff((x+2)*(x-2), x))

print("diff of cos(x^3) / x^3:")
print(diff(cos(x^3) / x^3, x))

Which outputs:

./diff.sage
diff of (x+2)*(x-2):
2*x
diff of cos(x^3) / x^3:
-3*sin(x^3)/x - 3*cos(x^3)/x^4

Integrals

And of course integration. We can also get numerical approximation. Sage takes requested precision in binary, not decimal digits.

#!/usr/bin/env sage

f = 1/x

print("Integral of f(x) = 1/x from 5 to 10:")
print(f.integral(x, 5, 10))

print("Numerical approximation to 100 binary digits:")
print(f.integral(x, 5, 10).n(100))

Which outputs:

$ ./integral.sage
Integral of f(x) = 1/x from 5 to 10:
log(10) - log(5)
Numerical approximation to 100 binary digits:
0.69314718055994530941723212146

GF(p) Finite Fields

SageMath can do all the math on real or complex numbers just fine, but so can so many other systems.

SageMath's more unique when in comes to its strengths with discrete mathematics, which matters a lot when it comes to cryptography, or CTFs.

Let's start with the simplest example of GF(p) (also known as Zp and a few other things) fields - basically integers modulo p, so they wrap around when you add, subtract, or multiply them:

#!/usr/bin/env sage

p = 65537
F = GF(p)

print("Some addition in GF(65537):")
for i in range(10):
  a = F.random_element()
  b = F.random_element()
  print(f"{a} + {b} = {a + b}")

print("Some inverses in GF(65537):")
for i in range(10):
  a = F.random_element()
  b = a^-1
  print(f"{a} * {b} = {a * b}")

Which randomly outputs:

./zp.sage
Some addition in GF(65537):
5993 + 52077 = 58070
34573 + 35447 = 4483
64465 + 9539 = 8467
23830 + 64214 = 22507
33632 + 5984 = 39616
32744 + 55138 = 22345
7174 + 12223 = 19397
12815 + 49721 = 62536
21107 + 51922 = 7492
32436 + 21708 = 54144
Some inverses in GF(65537):
4454 * 21880 = 1
8250 * 31990 = 1
2139 * 17403 = 1
34523 * 59411 = 1
44260 * 56663 = 1
3474 * 34655 = 1
53770 * 11490 = 1
12588 * 61575 = 1
34995 * 24902 = 1
2534 * 14354 = 1

GF(p) addition and multiplication is a field that pretty much every programming language can deal with, as that's just doing whatever operation modulo p.

Things like GF(p) division, inverses, and so on - that's generally not coming out of the box.

Elliptic Curves

SageMath can deal with mathematical structures much more complex than GF(p) fields. One such notable structure are elliptic curves, which are one of the basic building blocks of modern cryptography.

#!/usr/bin/env sage

F = GF(101)
EC = EllipticCurve(F, [2, 3])

for i in range(10):
  a = EC.random_element()
  b = EC.random_element()
  print(f"{a} + {b} = {a + b}")

Which outputs:

./ecc.sage
(84 : 56 : 1) + (41 : 15 : 1) = (40 : 94 : 1)
(41 : 15 : 1) + (50 : 41 : 1) = (57 : 51 : 1)
(0 : 1 : 0) + (90 : 93 : 1) = (90 : 93 : 1)
(27 : 67 : 1) + (56 : 30 : 1) = (35 : 86 : 1)
(50 : 41 : 1) + (11 : 89 : 1) = (20 : 93 : 1)
(30 : 55 : 1) + (30 : 55 : 1) = (35 : 15 : 1)
(9 : 12 : 1) + (92 : 93 : 1) = (96 : 26 : 1)
(49 : 40 : 1) + (40 : 94 : 1) = (48 : 55 : 1)
(81 : 89 : 1) + (35 : 15 : 1) = (21 : 69 : 1)
(99 : 71 : 1) + (0 : 1 : 0) = (99 : 71 : 1)

Breaking Diffie-Hellman Key Exchange

I mostly use SageMath for CTFs, and this example is adapted from one of such challenges, with numbers changed.

It's a two step attack, but first part happened in challenge's backstory.

The first step is that we hacked parameter negotiations between Alice and Bob to make each of them think that the other side only supports using '90s "export grade" encryption with very short keys, which was strong enough to withstand casual attackers, but not really against NSA and such. Thanks to SageMath, we can do what NSA could back in the '90s, and break such relatively short keys.

#!/usr/bin/env sage

# "Export Grade" Diffie-Hellman Key Exchange
K = GF(57702167561280060733)
g = K(2)

# We don't know these numbers, CTF server does
secret_a = 21994334292003664313
secret_b = 5307079022839176516

# What we get from the challenge is these:
# Alice sent to Bob: g^a
# Bob sent to Alice: g^a
ga = K(20214349094702392183)
gb = K(36652172046887073403)

# This is fast only because SageMath knows all the fancy algorithms
# It would be completely non-viable to brute force
# and extremely tedious and slow to code yourself:
a = discrete_log(ga, g)
gab = gb ^ a
print(f"We hacked the key: {gab}")

# CTF server can verify our solution:
print(f"Alice shared key: {gb^secret_a}")
print(f"Bob shared key: {ga^secret_b}")

Which outputs:

$ ./mitm.sage
We hacked the key: 44661687795688390997
Alice shared key: 44661687795688390997
Bob shared key: 44661687795688390997

Should you use SageMath?

You already know the syntax, as it's 99% Python, so it has really friendly learning curve.

For numerical mathematics there's so many alternatives, including so many Python-based solutions, I don't think what SageMath is doing is all that special.

For symbolic mathematics, especially anything cryptography-related, it's really good, and I don't think anything else really matches its capabilities.

There are some pure-Python symbolic mathematics libraries like SymPy, but from a quick glance they seem to lack most of the discrete mathematics and crypto features. And of course there's the tradeoff - going 100% Python and dropping the SageMath pre-parsing means considerably more verbose code in some cases.

Overall, SageMath is a useful tool for anyone interested in CTFs, or understanding how modern cryptography and cryptocurrency works. It's really amazing in that limited niche.

If you're not interested in cryptography, and you just have regular mathematical needs, SageMath could work for you, but you can generally stick to the usual Python libraries like NumPy and Pandas, or use a numerical mathematics language like Julia.

Code

17