If you’re looking for a simple script to extract data from an excel file and put it in an SQL table, you’ve come to the right place. In this article, I will show you how to use python pandas and sqlalchemy to import an excel file to a SQL database (MySQL) in a free, fast and flexible manner. The same method also works for SQLite or other SQL databases, but the connection details will be different.
The Excel File
The Excel file above is what we will be importing. It is a neatly organized four column file filled with randomly generated dummy data about people.
Ensure Required Packages are Installed
For this to work properly, make sure you have the required packages installed and are using Python 3. The packages are: pandas, xlrd, sqlalchemy
. If you are missing these, install them with pip install [package]
If using ubuntu, you may also need to ensure the python3-mysqldb package is installed with sudo apt install python3-mysqldb
Importing the Excel File in Python
The first step is to import the Excel file into python as a pandas dataframe. For this, we use the read_excel
function. The function accepts a variety of options to deal with more complicated Excel files. See the documentation for more information. In my case, the function can import the excel file without any extra parameters.
import pandas as pd
df = pd.read_excel('NameNumbers.xlsx')
df.head()
First Name | Last Name | Age | Phone Number | |
---|---|---|---|---|
0 | Thersa | Fullington | 43 | 183-130-5288 |
1 | Siu | Hopf | 54 | 957-849-2620 |
2 | Ginger | Eland | 32 | 236-726-4747 |
3 | Preston | Putz | 41 | 887-311-9146 |
4 | Novella | Robichaux | 37 | 797-363-5038 |
Inserting the DataFrame as an SQL Table
Now that the data is in Python as a dataframe, we need to write that dataframe to an SQL table. In this case, I am connecting to a MySQL database named contacts. I will write to a table named people, which didn’t exist prior to running this script. The table is created with the to_sql function. Like with the read_excel function, a variety of options are available to customize the output (including appending or replacing existing tables). See the documentation for more information.
from sqlalchemy import create_engine
# format: mysql://user:pass@host/db
engine = create_engine('mysql://ryan:ryan@localhost/contacts')
df.to_sql('people', con=engine)
That’s all it takes to write the table. We can see that everything went according to plan with the HeidiSQL screenshots below.
Specify data types, append table and turn off indexes.
The default behavior of the function will work just fine for many uses, but if you wish to append to an existing table you will want to make some changes.
from sqlalchemy.types import String, SmallInteger
...
df.to_sql('people', con=engine, if_exists='append', index=False, dtype={'First Name': String(length=255),
'Last Name': String(length=255),
'Age': SmallInteger, 'Phone Number': String(length=255)})
The above code tells pandas to append if the table exists, don’t include the indexes it creates (you can add an index in SQL, as shown in the YouTube video) and specifies data types to use in place of the ones it may have selected.
That’s it! If you have any questions, feel free to comment below.
Bonjour, j’ai exactement écrire le même code que vous et j’ai suivie ce que vous avez faire à la lettre mais il m’affiche une erreur et je ne sais pas comment faire pour y rémédier. Je vous pris de me répondre aussi rapidement que possible. merci
Bonjour, qu’est-ce que l’erreur dit exactement?
Hi,
While connecting from excel to mysql database, you have used ‘sudo apt install python3-mysqldb’. Can you please tell me how to connect to microsoft sql server. Can we use ‘ sudo apt install python3-pyodbc’ and proceed?
See here: https://docs.sqlalchemy.org/en/20/core/engines.html#microsoft-sql-server