Analysing Car Sales Data: 10 Questions Answered with Pandas (2024)

Buckle up as we embark on a data-driven journey through the fascinating realm of car sales data analysis! In this exhilarating adventure, we’ll navigate our way through a fictional car sales dataset, armed with Python’s Pandas library. Our mission? To unravel answers to 10 intriguing questions that not only scratch the surface but also delve deep into the heart of data-driven decision-making.

Each question is more than a curiosity: it points to an insight that a dealer or analyst could act on. Alongside every answer, we’ll highlight the Pandas function that powers it. So, fasten your seatbelts, because this is a ride you won’t want to miss!

Question 1: What’s the average price of cars in the dataset?

  • Why It Matters: Average price sets a benchmark for pricing strategies.
  • How Pandas Helps: We calculate the average price using the mean() function in Pandas.
# Assumes df is the car sales DataFrame, already loaded with Pandas
average_price = df['Price'].mean()

The mean() function computes the average of a column, in this case, the 'Price' column. It adds up all the non-missing values and divides by how many there are to give us the average price.
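
As a minimal sketch of how this behaves, here is mean() on a tiny made-up DataFrame. The prices below are hypothetical and not taken from the dataset; only the 'Price' column name comes from the article. The median is shown next to it because a single expensive car can pull the mean upward.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'Price' follows the article.
df = pd.DataFrame({'Price': [20000.0, 25000.0, None, 120000.0]})

average_price = df['Price'].mean()    # missing values are skipped by default
median_price = df['Price'].median()   # less sensitive to one very high price

print(average_price)  # 55000.0
print(median_price)   # 25000.0
```

If the two numbers differ a lot, as they do here, the average alone may not be a good pricing benchmark.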

Question 2: Which car make is the most and least expensive on average?

  • Why It Matters: Understanding price variations guides marketing and inventory decisions.
  • How Pandas Helps: We identify the most and least expensive car makes using Pandas’ groupby() and mean() functions.
avg_price_by_make = df.groupby('Car Make')['Price'].mean()
most_expensive = avg_price_by_make.idxmax()
least_expensive = avg_price_by_make.idxmin()

We group the data by 'Car Make', calculate the mean price for each group, and then use idxmax() and idxmin() to find the car makes with the highest and lowest average prices.
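
To see the whole ranking rather than just the two extremes, the grouped averages can also be sorted. The sketch below uses a small hypothetical DataFrame (the makes and prices are invented for illustration) with the article's column names.

```python
import pandas as pd

# Hypothetical sample data using the article's column names.
df = pd.DataFrame({
    'Car Make': ['Toyota', 'Toyota', 'BMW', 'BMW', 'Ford'],
    'Price': [20000, 24000, 50000, 54000, 30000],
})

# One average price per make, highest first.
avg_price_by_make = (
    df.groupby('Car Make')['Price'].mean().sort_values(ascending=False)
)

most_expensive = avg_price_by_make.idxmax()   # 'BMW'
least_expensive = avg_price_by_make.idxmin()  # 'Toyota'

print(avg_price_by_make)
```

Note that idxmax() and idxmin() return only the first match, so if two makes tie for the top or bottom spot, the sorted table is the safer thing to look at.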

Question 3: How many different car models are there in the dataset?

  • Why It Matters: Knowing the variety of models helps with stock management.
  • How Pandas Helps: We count the number of unique car models using Pandas’ nunique() function.
unique_models = df['Car Model'].nunique()

The nunique() function counts the number of unique values in the 'Car Model' column, giving us the total number of different car models.
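
One thing to watch for: nunique() treats differently spelled entries as different models. The sketch below assumes a hypothetical column with inconsistent casing, a stray space and a missing value, and shows how a light cleanup changes the count. Whether the real dataset needs this cleanup is an assumption.

```python
import pandas as pd

# Hypothetical sample data with messy model names.
df = pd.DataFrame({'Car Model': ['Corolla', 'corolla ', 'Civic', None, 'Civic']})

raw_count = df['Car Model'].nunique()              # 3, missing value ignored
with_missing = df['Car Model'].nunique(dropna=False)  # 4, missing counted once

# Normalize whitespace and casing before counting.
cleaned = df['Car Model'].str.strip().str.lower()
clean_count = cleaned.nunique()                    # 2

print(raw_count, with_missing, clean_count)
```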

Question 4: What’s the distribution of car colors among the sales?

  • Why It Matters: Insights into color popularity can influence production.
  • How Pandas Helps: We analyze the color distribution using Pandas’ value_counts() function.
color_distribution = df['Car Color'].value_counts()

The value_counts() function tallies the frequency of each unique color in the 'Car Color' column, providing a distribution of colors in the dataset.
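
Raw counts are easier to compare when expressed as shares. As a small sketch on hypothetical data, passing normalize=True to value_counts() returns proportions instead of counts.

```python
import pandas as pd

# Hypothetical sample data using the article's column name.
df = pd.DataFrame({
    'Car Color': ['Red', 'Blue', 'Red', 'Black', 'Red', 'Blue', 'Red', 'Black']
})

color_counts = df['Car Color'].value_counts()               # absolute counts
color_share = df['Car Color'].value_counts(normalize=True)  # fractions of all sales

print(color_counts)
print(color_share)
```

Here 'Red' accounts for 4 of 8 sales, so its share is 0.5.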

Question 5: Are there any correlations between car mileage and price?

  • Why It Matters: This helps in pricing used cars based on mileage.
  • How Pandas Helps: We calculate the correlation between mileage and price using Pandas’ corr() function.
correlation = df['Mileage'].corr(df['Price'])

The corr() function computes the Pearson correlation coefficient between two columns, in this case, 'Mileage' and 'Price'. The result ranges from -1 to 1: values near -1 or 1 indicate a strong linear relationship, while values near 0 indicate little or no linear relationship.
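
Here is a minimal sketch on made-up numbers in which price falls as mileage rises. The data is hypothetical. The second calculation applies Pearson correlation to the ranks of each column, which is how Spearman's rank correlation is defined, and is useful when the relationship is monotonic but not a straight line.

```python
import pandas as pd

# Hypothetical sample data using the article's column names.
df = pd.DataFrame({
    'Mileage': [10000, 30000, 50000, 80000, 120000],
    'Price':   [30000, 27000, 22000, 18000, 9000],
})

# Pearson: strength of the linear relationship.
pearson = df['Mileage'].corr(df['Price'])

# Pearson on ranks (Spearman): strength of any monotonic relationship.
rank_corr = df['Mileage'].rank().corr(df['Price'].rank())

print(round(pearson, 3))
print(round(rank_corr, 3))
```

Both values come out strongly negative for this sample, which matches the intuition that higher mileage goes with lower price. Correlation alone does not show that mileage causes the lower price.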

Question 6: Who buys more cars, men or women?

  • Why It Matters: Targeted marketing based on gender can be more effective.
  • How Pandas Helps: We count the number of buyers by gender using Pandas’ value_counts() function.
gender_counts = df['Buyer Gender'].value_counts()

The value_counts() function tallies the frequency of each unique gender in the 'Buyer Gender' column, helping us determine whether men or women buy more cars.
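
By default value_counts() silently drops missing entries, which can hide gaps in the gender field. The sketch below, on hypothetical data, counts missing values as their own category with dropna=False.

```python
import pandas as pd

# Hypothetical sample data; one record has no gender recorded.
df = pd.DataFrame({
    'Buyer Gender': ['Male', 'Female', 'Female', None, 'Male', 'Female']
})

gender_counts = df['Buyer Gender'].value_counts()               # missing excluded
counts_with_missing = df['Buyer Gender'].value_counts(dropna=False)
missing = df['Buyer Gender'].isna().sum()

top_gender = gender_counts.idxmax()  # 'Female'

print(counts_with_missing)
```

If the missing count is large compared with the gap between the groups, the comparison should be read with caution.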

Question 7: What’s the average age of car buyers?

  • Why It Matters: Age insights guide marketing and product design.
  • How Pandas Helps: We calculate the average age of car buyers with Pandas’ mean() function.
average_age = df['Buyer Age'].mean()

The mean() function computes the average age from the 'Buyer Age' column, providing valuable demographic information.
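
An average says little about the spread of ages. As a sketch on hypothetical ages, describe() returns the mean together with the minimum, maximum and quartiles in one call.

```python
import pandas as pd

# Hypothetical sample data using the article's column name.
df = pd.DataFrame({'Buyer Age': [22, 35, 41, 58, 29]})

average_age = df['Buyer Age'].mean()   # 37.0
summary = df['Buyer Age'].describe()   # count, mean, std, min, quartiles, max

print(summary)
```

In this sample the mean is 37.0, the median (the '50%' row) is 35.0, and the ages range from 22 to 58.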

Question 8: Do younger buyers prefer new car models?

  • Why It Matters: Knowing preferences aids inventory decisions.
  • How Pandas Helps: We group the data by age and calculate the average model year using Pandas’ groupby() and mean() functions.
average_model_year_by_age = df.groupby('Buyer Age')['Model Year'].mean()

By grouping data by age and analyzing the average model year, we can infer whether younger buyers prefer newer car models: if the average model year drops as buyer age rises, younger buyers tend to choose newer cars.
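
Grouping by exact age yields one row per age, which can be noisy when few buyers share an age. One possible refinement, sketched here on hypothetical data, is to bucket ages with pd.cut() first. The bin edges, the labels and the 'Age Group' column are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data using the article's column names.
df = pd.DataFrame({
    'Buyer Age':  [22, 27, 34, 41, 52, 58],
    'Model Year': [2023, 2021, 2020, 2018, 2016, 2014],
})

# Assumed age bands: (17, 30], (30, 45], (45, 60].
df['Age Group'] = pd.cut(
    df['Buyer Age'],
    bins=[17, 30, 45, 60],
    labels=['18-30', '31-45', '46-60'],
)

# observed=True keeps only the age groups that actually appear.
avg_year_by_group = df.groupby('Age Group', observed=True)['Model Year'].mean()

print(avg_year_by_group)
```

In this sample the average model year falls from 2022.0 to 2019.0 to 2015.0 across the three groups, the pattern we would expect if younger buyers preferred newer models.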

Question 9: Which car make is the favorite among first-time buyers?

  • Why It Matters: Insights into first-time buyer preferences help with stock.
  • How Pandas Helps: We filter the data for first-time buyers and count their favorite car makes using Pandas’ conditional filtering and value_counts().
favorite_make_first_time_buyers = df.loc[df['Buyer Type'] == 'First-Time', 'Car Make'].value_counts().idxmax()

We first filter the dataset to include only first-time buyers and then pick the car make that appears most often among them.
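
Splitting the one-liner into steps makes it easier to inspect. The sketch below uses hypothetical data and guards against the case where the filter matches no rows, because idxmax() raises an error on an empty result. The 'First-Time' label follows the article, and the rest of the data is invented.

```python
import pandas as pd

# Hypothetical sample data using the article's column names.
df = pd.DataFrame({
    'Buyer Type': ['First-Time', 'Repeat', 'First-Time', 'First-Time', 'Repeat'],
    'Car Make':   ['Honda', 'BMW', 'Honda', 'Toyota', 'Honda'],
})

# Step 1: keep only first-time buyers.
first_time = df[df['Buyer Type'] == 'First-Time']

# Step 2: count purchases per make within that subset.
make_counts = first_time['Car Make'].value_counts()

# Step 3: take the most frequent make, or None if nothing matched.
favorite_make = make_counts.idxmax() if not make_counts.empty else None

print(make_counts)
print(favorite_make)  # Honda
```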

Question 10: Is there a relationship between buyer age and preferred car color?

  • Why It Matters: Understanding age-color preferences can influence production.
  • How Pandas Helps: We analyze the relationship between age and car color using Pandas’ groupby() and value_counts() functions.
relationship = df.groupby('Buyer Age')['Car Color'].value_counts().unstack().fillna(0)

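By grouping the data by buyer age and counting car colors within each group, we get a table with one row per age and one column per color; unstack() pivots the colors into columns and fillna(0) fills in the combinations that never occur. Comparing the rows lets us discern if age influences color choice.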
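
An alternative way to build a similar table is pd.crosstab(), sketched below on hypothetical data. The age bands and their labels are assumptions made for illustration. Passing normalize='index' turns each row into shares, so age groups of different sizes can be compared directly.

```python
import pandas as pd

# Hypothetical sample data using the article's column names.
df = pd.DataFrame({
    'Buyer Age': [22, 25, 28, 50, 55, 58],
    'Car Color': ['Red', 'Red', 'Blue', 'Black', 'Black', 'Red'],
})

# Assumed age bands: (17, 40] and (40, 100].
age_group = pd.cut(
    df['Buyer Age'], bins=[17, 40, 100], labels=['Up to 40', 'Over 40']
).rename('Age Group')

# Counts of each color per age group (missing combinations become 0).
counts = pd.crosstab(age_group, df['Car Color'])

# Same table as row-wise shares.
shares = pd.crosstab(age_group, df['Car Color'], normalize='index')

print(counts)
print(shares.round(2))
```

In this sample, two of the three buyers aged up to 40 chose red, while two of the three older buyers chose black. With real data, a difference like this would still need enough records per group to be meaningful.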

In the rearview mirror of our data-driven adventure, we’ve glimpsed the colorful world of car sales data, thanks to the mighty Pandas. So, dear reader, as we reach the final pit stop, I can’t resist asking: What’s your favorite car make? As for me, I’ve got a soft spot for the sleek lines and luxurious feel of the Mercedes GLE-Class. With that, let’s keep the engines of curiosity running and stay ready for more exhilarating data escapades in the future!

Happy coding!

Author: The Hon. Margery Christiansen
