How to Parse bash Variables

The Problem

I sometimes need to parse a shell script to pick out variable declarations. Consider the following bash script, myscript.sh:

echo hello
server="server1.example.com"
port=8899
echo done

The Solution

The first step is to read the contents of the script:

with open("myscript.sh") as stream:
    contents = stream.read().strip()
print(contents)
echo hello
server="server1.example.com"
port=8899
echo done

Next, we pick out the lines that contain variable declarations using a regular expression:

import re

var_declarations = re.findall(r"^[a-zA-Z0-9_]+=.*$", contents, flags=re.MULTILINE)
print(var_declarations)
['server="server1.example.com"', 'port=8899']
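It is worth seeing what this pattern does and does not pick up. The sketch below runs it against a hypothetical sample (these lines are not in myscript.sh) and compares it with a slightly stricter variant, offered only as a possible refinement, that requires the name to start with a letter or underscore as bash does. Neither pattern matches indented assignments or `export NAME=value` lines.

```python
import re

# Hypothetical sample lines for illustration; not part of myscript.sh
sample = """\
server="server1.example.com"
  indented=1
export PATH_EXTRA=/opt/bin
2fast=no
port=8899
"""

# The pattern used in this article
loose = re.findall(r"^[a-zA-Z0-9_]+=.*$", sample, flags=re.MULTILINE)

# A stricter variant (an assumption, not the article's pattern):
# the name must begin with a letter or underscore
strict = re.findall(r"^[a-zA-Z_][a-zA-Z0-9_]*=.*$", sample, flags=re.MULTILINE)

print(loose)   # ['server="server1.example.com"', '2fast=no', 'port=8899']
print(strict)  # ['server="server1.example.com"', 'port=8899']
```

If the scripts you parse indent their assignments or use `export`, the pattern would need to be adjusted for that.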

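Finally, we use the csv library, with "=" as the delimiter, to split each declaration into a name and a value, and build a dictionary from the resulting pairs. Because the reader's default quote character is the double quote, the quotes around the server value are stripped for us: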

import csv
reader = csv.reader(var_declarations, delimiter="=")
bash_vars = dict(reader)  # avoid shadowing the built-in vars()
print(bash_vars)
{'server': 'server1.example.com', 'port': '8899'}
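One caveat: `dict(reader)` expects exactly two fields per row. A double-quoted value may contain "=" safely, but an unquoted value containing "=" is split into extra fields, and `dict()` then raises a ValueError. The following is a minimal sketch of one way to cope, using hypothetical declarations and a hypothetical helper named `to_dict`; it rejoins the extra fields rather than relying on `dict(reader)`.

```python
import csv

# Hypothetical declarations for illustration; not from myscript.sh
declarations = ['url="http://example.com/?a=1"', 'opts=-x=1', 'port=8899']

def to_dict(lines):
    """Build a dict from NAME=VALUE lines, tolerating '=' inside values."""
    result = {}
    for name, *rest in csv.reader(lines, delimiter="="):
        # An unquoted value containing "=" arrives as several fields;
        # join them back together with the delimiter.
        result[name] = "=".join(rest)
    return result

parsed = to_dict(declarations)
print(parsed)
# {'url': 'http://example.com/?a=1', 'opts': '-x=1', 'port': '8899'}
```

Note that single-quoted values keep their quotes, since the reader only treats the double quote as a quote character by default.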

Putting it all together:

import re
import csv

def parse_bash_vars(path: str) -> dict:
    """
    Parses a bash script and returns a dictionary representing the
    variables declared in that script.

    :param path: The path to the bash script
    :return: Variables as a dictionary
    """
    with open(path) as stream:
        contents = stream.read().strip()

    var_declarations = re.findall(r"^[a-zA-Z0-9_]+=.*$", contents, flags=re.MULTILINE)
    reader = csv.reader(var_declarations, delimiter="=")
    bash_vars = dict(reader)
    return bash_vars

Test it out:

parse_bash_vars("myscript.sh")
{'server': 'server1.example.com', 'port': '8899'}
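For a check that does not depend on a file already being on disk, the function can be exercised against a temporary copy of the script. This is only a sketch: it repeats the function from above so that it runs on its own, and the temporary directory and the assert are additions for illustration.

```python
import csv
import re
import tempfile
from pathlib import Path

def parse_bash_vars(path: str) -> dict:
    """Same function as above, repeated so this snippet is self-contained."""
    with open(path) as stream:
        contents = stream.read().strip()

    var_declarations = re.findall(r"^[a-zA-Z0-9_]+=.*$", contents, flags=re.MULTILINE)
    reader = csv.reader(var_declarations, delimiter="=")
    return dict(reader)

# The example script from the top of this article
script = 'echo hello\nserver="server1.example.com"\nport=8899\necho done\n'

with tempfile.TemporaryDirectory() as tmp:
    script_path = Path(tmp) / "myscript.sh"
    script_path.write_text(script)
    result = parse_bash_vars(str(script_path))

assert result == {"server": "server1.example.com", "port": "8899"}
print(result)
```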




Conclusion

When the csv library is mentioned, most people think of it as a way to parse comma-separated-values files, but it has other uses, such as parsing bash variables or the /etc/os-release file.
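As a rough illustration of the /etc/os-release case, the same approach can be applied to text in that format. The sample below is made-up content, not read from a real system, and the sketch assumes values are either unquoted or double-quoted; single-quoted values and shell escapes are not handled.

```python
import csv

# Illustrative os-release style content; not read from a real system
sample = '''\
# This is a comment line
NAME="Ubuntu"
VERSION_ID="22.04"
ID=ubuntu

PRETTY_NAME="Ubuntu 22.04 LTS"
'''

# Skip blank lines and comments before handing the rest to csv
lines = [ln for ln in sample.splitlines() if ln.strip() and not ln.startswith("#")]
os_release = dict(csv.reader(lines, delimiter="="))

print(os_release["PRETTY_NAME"])  # Ubuntu 22.04 LTS
```

On Python 3.10 and later, `platform.freedesktop_os_release()` reads the real file for you, so this sketch is mainly useful for showing how the csv trick carries over.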
