PMs are incomplete about analysis, and in this tech-driven world, the analysis has a lot to do with the database. For extracting data from most databases, you need to write SQL queries.
Having worked in the startups, I believe knowledge of SQL is a huge advantage for PMs. You don’t have to be dependent on a developer for extracting data, save the time involved in coordination, and reduce the pressure on the developers 🙂 Trust me developers are mostly over-burdened in startups.
Let’s start learning SQL.
There are many databases that understand SQL, such as MySQL, PostgreSQL, MS SQL, Oracle and SQLite. Each of these databases can be read or queried using the SQL programming language.
These are the databases don’t fit into the same family of databases, they are known as NoSQL Databases such as MongoDB, CouchBase and Redis.
Besides minor differences in a few areas of SQL language used in these databases, SQL is almost same across all of these databases.
The Simples Query of SQL?
SELECT * FROM <table name>;
- The words written in uppercase are keywords that help in speaking with databases. While the keywords can be in lowercase, writing in uppercase distinguished them and doing so is an industry best practice.
- SELECT: Signals that you want to extract data from databases that meet the conditions in statement excluding all other data.
- (*): Signals that we want to extract all the rows and columns.
- FROM: Signal from which table you want the data
- <table_name>: Name of the table from which you extract data
Please note that running a query doesn’t change data, but just the presentation.
List of Important SQL Queries for Marketers
|Abfirstname.lastname@example.org||99999999||1||18 July 2017||19 July 2018|
|Acemail@example.com||99999998||2||19 July 2017||20 July 2018|
|Abfirstname.lastname@example.org||99998999||3||20 July 2017||21 July 2018|
|Acemail@example.com||99899999||4||21 July 2017||22 July 2018|
|Abfirstname.lastname@example.org||99999999||5||22 July 2017||23 July 2018|
Here is a sample SQL table, named library, from which we have to get the data.
#1 SQL Query: Get Just 1 Column (Email)
SELECT <column_name> FROM <table_name>;
Example: SELECT email FROM library;
#2 SQL Query: Get 2 Column (Email and name)
Format: SELECT <Column_1>, <Column_2> FROM <Table_name>;
You just have to write column names separated by commas.
Example: SELECT email, name FROM library;
#3 SQL Query: Get Column with Name of Your Choice(Alias)
Format: SELECT <column> AS “<Alias>” FROM <table_name>;
Please not that double quotes is required for alias if you use more than 1 word for the name.
Example: SELECT title, first_published AS “First Published Date” FROM library;
#4 SQL Query: Select Data Which Meets Condition(WHERE)
General Format: SELECT <column> FROM <table_name> WHERE <column> <Operator> <Value>
Here are some examples for different possible formats:
- SELECT title, author FROM books WHERE first_published=1997;
- SELECT title, author FROM books WHERE author=”J.K. Rowling”;
For characters you have to put the name in quotes.
- SELECT title, author FROM books WHERE first_published!=1997;
!= > Inequality operator. This query will give data of book that weren’t published in 1997.
Here is a list of other operators: =(equality), < (less than), and <= (less than to equal to).
- SELECT title, author FROM books WHERE first_published=”2015-12-10″;
The date also needs to be put in quotes.
- SELECT title, author FROM books WHERE <condition 1> AND /OR <condition 2>
AND or OR can be used for writing multiple conditions.
For example: SELECT title FROM books WHERE author=”J.K. Rowling” OR first_published<200
#4 SQL Query: Searching Within Set of Values
- Format: SELECT <column> FROM <table> WHERE <Column> IN (<value 1>, <value 2>, …<value n>)
Example: SELECT first_name, email FROM Library WHERE id IN (“M101”, “M102” “M103”)
- Format for deselction – eclusion of particular set of values: SELECT <column> FROM <table> WHERE <Column> NOT IN (<value 1>, <value 2>, …<value n>)
#5 SQL Query: Searching Within Range of Values
Format: SELECT <column> FROM <table> WHERE <Column> BETWEEN <minimum> AND <maximum>
Example: SELECT first_name, email FROM Library WHERE id BETWEEN 1800 AND 2000;
#6 SQL Query: Search on the basis of a pattern
In this case we have to use LIKE and %:
- LIKE: It’s similar to equality operator, but with this operator case doesn’t matter. Hence, you can use this to find values when you don’t whether it’s upper case or lower case.
- % (wild card): A wildcard is a symbol used to replace or represent one or more characters.
Format: SELECT title FROM books WHERE title LIKE “%Known_characters%”
For example – SELECT * FROM products WHERE name LIKE “%t-shirt%”
|1||t-shit polo||Gautam Vermani|
#6 SQL Query: Find missing values
How to find missing data in the table? Let’s say the return data of book is not mentioned in library table.
In this case, you have to use the keyword ‘NULL’ and ‘IS’ or ‘NOT IS’.
Format 1: SELECT <column> FROM <table> WHERE <Column> IS Null;
Format 2: SELECT <column> FROM <table> WHERE <Column> IS NOT Null;
This query will give the column for rows in column where data is missing (is null).
#7 SQL Query: Retreiving data in specific order
This query format can be used to order selected data in ascending or descending order.
- Format: SELECT <column> FROM <table> ORDER BY <column> [“ASC” or “DESC”];
ASC – Ascending order | DESC – Descending order
Please note that if you don’t ASC or DESC, by default, SQL will show data in ascending order.
Let’s say you want to sort by multiple columns i.e. first on the basis of column A and then on the basis of column B, then follow this:
- Format: SELECT <column> FROM <table> ORDER BY <column A> [“ASC” or “DESC”], <column B> [“ASC” or “DESC”];
#8 SQL Query: Limiting the number of results
- Format: SELECT * FROM <table> LIMIT <Number of rows>;
For example: SELECT * FROM Library ORDER BY sales DESC LIMIT 3;
This will limit the number of rows of results to three.
However, let’s say you have multi-page results, and you want to check the latter results. In this case, you have to offset the published result.
- Format: SELECT * FROM <table> LIMIT <Number of rows> OFFSET <skipped rows>;
For example: SELECT * FROM library LIMIT 50 OFFSET 51;
For this particular query, the result will start from 51st valid result.
That’s all for the basic queries. Now, we will discuss a bit advanced topics that are useful growth marketers.
Function in SQL
A function is SQL is used to transform the result.
Format: <NAME of Function>(<value or column>)
Example: SELECT UPPER(name) FROM library;
|AMITESH||Life of Pie|
This query will publish the name in upper case.
Let’s have a look at some of the common functions used in SQL by marketers.
#1 Join two column
- Format: SELECT CONCAT(<value or column>, <value or column2>) FROM <Table>
Example: SELECT CONCAT (first_name, ” “, last_name) AS “Full Name” FROM library
|Ajitesh Abhishek||Harry Potter|
|Amitesh Abhishek||Life of Pie|
This query will also add space between first name and last name.
Please note that double pipe can also be used to add two columns: first_name || last_name.
#2 Measure Length of a Text
- Format: SELECT <column>, LENGTH(<column>) FROM <Table>
Example: SELECT name, LENGTH(name) AS length FROM library;
You can also use this query to get the longest name:
SELECT name, LENGTH(name) AS length FROM library ORDER BY length DESC LIMIT 1;
This will return the longest text in the name column.
#3 Creating an excerpt from a text
- Format: SUBSTR(<column>, <start>, <length>)
- <start>: Specifies where to start in the string
- if is 0 (zero), then it is treated as 1.
- if is positive, then the function counts from the beginning of the string to find the first character.
- if is negative, then the function counts backward from the end of the string.
- <finish>: length of the desired substring
- <start>: Specifies where to start in the string
Query: SUBSTR(“Ajitesh”, 2, 4) | Output: jite
Example: SELECT user_name, description, SUBSTR(description, 1, 4) || “….” AS short_desc FROM library;
#4 Replace full or part of a string
- Format: SELECT REPLACE(<value or column>, <target>, <replacement>)
Example: SELECT REPLACE(state, “California”, “CA”)
#4 Function for counting results
We use count() function to count the results. Let’s have a look at multiple ways it’s used.
- SELECT COUNT(*) FROM <table_name>
Only non-null rows are counted.
Similarly, SELECT COUNT(last_name) FROM library; -> This will give the count of non-null values of last_name.
- SELECT COUNT( DISTINCT <column_name>) FROM <table_name>
This query will count the unique entries in the columns.
For example, let’s say customer have made multiple purchases. In this case, you can use this query to find the unique number of customers from purchase table. #SuperUseful
- SELECT <column_name> FROM <table_name> GROUP BY <column>
This query will aggregate the rows in <column> that has the same value. So, the output will be unique row names.
Let’s consider this table.
|India After Gandhi||Non-fiction|
Let’s run this query: SELECT book_type, COUNT(*) FROM library GROUP BY book_type;
Please note that
HAVING is used in SQL because
WHERE keyword can’t be used with aggregate functions.
Consider this query run on the same table: SELECT book_type, COUNT(*) FROM library GROUP BY book_type HAVING COUNT(*) > 3;
#5 Function for obtaining total
- Format: SELECT SUM(column_name) FROM <table_name>
Consider a table that has the spending of users on various product item. How can we get total spending per user? You can use the following query:
- SELECT SUM(cost) AS “total_spend” FROM orders GROUP BY user_id ORDER BY cost;
Now, let’s say in the same table, you just want to see the data of total spend per user for users who have purchases above a total of INR 250. Then you can run the following query:
- SELECT SUM(cost) AS “total_spend” FROM orders GROUP BY user_id HAVING total_spend>250 ORDER BY cost;
#5 Function for calculating the average
We use avg() to calculate the average value of a numeric function.
- Format: SELECT AVERAGE(cost) AS average FROM orders;
This average is calculated by dividing total sum by the total number of rows.
How can we calculate the average per user? Again, we have to make use of GROUP BY
- SELECT user_id, AVERAGE(cost) AS average FROM orders GROUP BY user_id;
Here is a list of functions that you might use with brief explanations:
- MAX(<column_name>): To get the maximum value
- MIN(<column_name>): To get the minimum value
- Data and Minutes functions are quite handy to get values till date. These functions differ for different databases. Here is the list for MySQL:
- Current date and time: NOW()
- Current time: CURTIME()
- Current date: CURDATE()
That’s all for now. I will discuss join queries (getting data from more than one table) and sets in next blog post on SQL.