Use sqlparse to tokenize SQL statements

The sqlparse module tokenizes SQL statements in Python. This article shows a few examples of how to use it.

First, install the module from PyPI.

pip install sqlparse

If the installation was successful, you should be able to import the module.

import sqlparse

Use the parse function to tokenize your first statement. It returns a tuple of Statement objects, one for each statement in the input string.

>>> sqlparse.parse("use mysql")
(<Statement 'use my...' at 0x7F54E50F4C10>,)

To get a list of all tokens, check out the example below. It parses the SQL statement, takes the first element of the result tuple and returns the list of tokens of that statement. Note that some entries, such as IdentifierList and Where, are groups that contain further tokens.

>>> sqlparse.parse("select b, c from my_table where a=1")[0].tokens
[<DML 'select' at 0x7F74A94A5688>,
 <Whitespace ' ' at 0x7F74A94A5A98>,
 <IdentifierList 'b, c' at 0x7F74A548F118>,
 <Whitespace ' ' at 0x7F74A54811D8>,
 <Keyword 'from' at 0x7F74A5481228>,
 <Whitespace ' ' at 0x7F74A5481278>,
 <Identifier 'my_tab...' at 0x7F74A548F3F0>,
 <Whitespace ' ' at 0x7F74A5481318>,
 <Where 'where ...' at 0x7F74A94ACE80>]

If you want to analyse a SQL statement or check it for errors, you can use the tokenizer to split the statement into tokens, but you have to implement the analyser yourself, because sqlparse does not validate SQL. Check out the official documentation for additional examples.

