A12: Pandas (Practice Exercises >> 1: Ecommerce Purchases)
This article is part of the “Data Science from Scratch — Can I to I Can” series.
✅ A suggestion: Open a new Jupyter notebook and type the code while reading this article; doing is learning. And yes, PLEASE read the comments, they are very useful!
Hi Guys,
After a crash course on pandas for data analysis, it's time to do some practice!!!
Because of privacy issues, I have created a fake dataset here with 30K entries.
The situation is this: customers provide some personal information while purchasing stuff online or in-store. For some reason, your client wants answers to a few questions from the dataset. Let’s try to help him!
Feel free to consult the solutions if needed. Please note that the tasks given in the exercises can be solved in different ways. Try your best answer and compare it with the solutions.
Exercises:
(Solutions are provided at the end.)
Here is the link to read the data from GitHub: data_link = "https://raw.githubusercontent.com/junaidqazi/DataSets_Practice_ScienceAcademy/master/Cust_Purch_FakeData.csv"
1. Please load the data into a variable "cust".
2. It's a good idea to see what the data looks like. Display the first 5 rows of your dataset.
3. How many entries does your data have? Can you tell the number of columns in your data?
4. What are the max and min ages of your customers? Can you find the mean age of your customers?
5. What are the three most common customer names?
✅ value_counts() returns an object containing counts of unique values. The resulting object is in descending order, so the first element is the most frequently occurring one. NA values are excluded by default.
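To see value_counts() in action before trying it on the real data, here is a minimal sketch on a tiny toy Series. The names in it are made up for illustration and are not taken from the exercise dataset.

```python
import pandas as pd

# Toy data, invented for illustration (not from the exercise dataset).
names = pd.Series(["Ali", "Sara", "Ali", None, "John", "Sara", "Ali"])

# Counts of unique values, most frequent first; the missing value is skipped.
counts = names.value_counts()

# head(n) on the result gives the n most common values.
top_two = counts.head(2)

# Pass dropna=False if you also want the missing values counted.
with_missing = names.value_counts(dropna=False)

print(counts)
print(top_two)
```

The same pattern should work on any single column of a DataFrame, since a column is a Series.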
6. Two customers have the same phone number. Can you find those customers?
7. How many customers have the profession “Structural Engineer”?
8. How many male customers are ‘Structural Engineer’?
9. Find the female Structural Engineers from the province of Alberta (AB).
10. What are the max, min, and average spending?
11. Who did not spend anything? The company wants to send a deal to encourage these customers to buy stuff!
12. As a loyalty reward, the company wants to send a thank-you coupon to those who spent 100 CAD or more. Please find those customers.
13. How many emails are associated with this credit card number ‘5020000000000230’?
14. We need to send new cards to the customers well before they expire. How many cards are expiring in 2019?
Use sum() and count() and see the difference in their use :)
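The difference between sum() and count() is easiest to see on a boolean mask. Below is a small sketch on toy data; the MM/YY expiry strings are an assumption made for illustration, and the format in the real dataset may differ.

```python
import pandas as pd

# Hypothetical expiry strings in MM/YY form (assumed format, toy data).
expiry = pd.Series(["03/19", "11/20", "07/19", "01/21"])

# Boolean Series: True where the string ends with "19".
mask = expiry.str.endswith("19")

# sum() treats True as 1 and False as 0, so it counts the matches.
n_true = mask.sum()

# count() counts every non-null entry, True or False alike.
n_all = mask.count()

# Filtering with the mask first and then counting also gives the matches.
n_filtered = expiry[mask].count()

print(n_true, n_all, n_filtered)  # 2 4 2
```

So on a mask, sum() answers "how many rows match?", while count() only tells you how many rows there are.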
15. How many people use Visa as their Credit Card Provider?
16. Can you find the customer who spent 100 CAD using Visa?
17. What are the two most common professions?
18. Can you tell the top 5 most popular email providers? (e.g. gmail.com, yahoo.com, etc…)
19. Is there any customer who is using an email with “am.edu”?
Hint: Use a lambda expression in apply(). Split the email address at @.
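If the hint feels abstract, here is a minimal sketch of the apply() plus lambda pattern on a toy Series. The email addresses are invented for illustration and are not from the exercise dataset.

```python
import pandas as pd

# Toy email addresses, invented for illustration.
emails = pd.Series(["a@gmail.com", "b@am.edu", "c@yahoo.com", "d@gmail.com"])

# apply() calls the lambda once per value; split("@") returns
# [user, domain], so index 1 is the part after the @.
domains = emails.apply(lambda e: e.split("@")[1])

# The extracted domains can then be used as a filter.
has_am_edu = emails[domains == "am.edu"]

# A vectorized alternative that gives the same domains without a lambda.
domains_vectorized = emails.str.split("@").str[1]

print(domains)
print(has_am_edu)
```

Once you have the domains as a Series, value_counts() on it gives the most popular providers as well.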
20. Which day of the week has the most customers?
Solutions:
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
Keep practicing to brush up and add new skills.
Excellent work!
Your clap and share can help us reach someone who is struggling to learn these concepts.
Good luck!
See you in the next lecture on “A13: Pandas (Practice Exercises >> 2: City of Chicago Payroll Data)”.
Note: This complete course, including video lectures and jupyter notebooks, is available on the following links:
Dr. Junaid Qazi is a Subject Matter Specialist, Data Science & Machine Learning Consultant and a Team Builder. He is a Professional Development Coach, Mentor, Author, and Invited Speaker. He can be reached for consulting projects and/or professional development training via LinkedIn.