Python is one of the most preferred language of data scientists (according to kdnuggets). Before we dive deeper with machine learning and visualization packages lets explore the basics of relational database management using Python.
So there are few packages using which we can execute sql queries from python program. I guess the most widely used packages are MySQLdb and PyMySQL. The difference is in the implementation where MySQLdb is a C extension and PyMySQL is pure Python. However is terms of functionality both are almost same. However if you are on Mac then you might like to prefer PyMySQL as the MySQLdb might give some compiler issues.
So if you are not familiar with python syntax but have experience with some other programming language then you might like to try few problems here. Perhaps that is the best way to learn python.
Before we have look at the actual code lets understand the flow,
- First we need to import corresponding python library
- Then define database connection string with appropriate parameters
- Connect to database using connection string and related library function
- Define a database connection variable (cursor)
- Use this connection variable to execute sql queries.
- Get results in some variable and print it
- Close the connection
Now lets write some code based on the steps above,
# Import the package
# Create database connection
conn = pymysql.connect(host='127.0.0.1', user='akshay', passwd='password', db='myblogs')
# Create cursor (connection variable
cur = conn.cursor()
# Execute query
cur.execute("SELECT * FROM bigdatagrind")
# Get results in python variable
results = cur.fetchall()
# Display results
# Close the the cursor
# Close the connection