Pandas Project – Employee Data Analysis #28

Description

@asifmohammed1

🎯 Objective

You are given an employee dataset (employees.csv).

Using only Pandas, perform the following operations to analyze and manipulate the data.

This project covers topics from Create DataFrame → Map Values (Topic 24).


employees.csv

ID,Name,Age,Department,City,Salary,Experience,Gender
101,Asif,30,IT,Warangal,65000,5,M
102,Sara,28,HR,Hyderabad,55000,3,F
103,Ali,35,Finance,Delhi,70000,8,M
104,Priya,26,IT,Chennai,60000,2,F
105,Rahul,31,Sales,Mumbai,50000,6,M
106,Aisha,,HR,Hyderabad,52000,4,F
107,John,29,IT,Bangalore,68000,5,M
108,Meena,33,Finance,Delhi,72000,7,F
109,Ravi,27,Sales,Pune,48000,2,M
110,Sneha,30,IT,Warangal,65000,5,F
111,Asif,30,IT,Warangal,65000,5,M

Tasks

1. Read the CSV file.

  • Load employees.csv into a DataFrame.

2. Display the Data

  • Display first 5 rows.
  • Display last 5 rows.
  • Print shape.
  • Print column names.
  • Display information about the dataset.
  • Show summary statistics.
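A minimal sketch of Tasks 1 and 2, assuming pandas is installed. The first six rows of employees.csv are inlined through io.StringIO only so the snippet runs without the file; with the real file, pd.read_csv("employees.csv") would take its place.

```python
import io
import pandas as pd

# First rows of employees.csv, inlined so the sketch is self-contained.
CSV_TEXT = """ID,Name,Age,Department,City,Salary,Experience,Gender
101,Asif,30,IT,Warangal,65000,5,M
102,Sara,28,HR,Hyderabad,55000,3,F
103,Ali,35,Finance,Delhi,70000,8,M
104,Priya,26,IT,Chennai,60000,2,F
105,Rahul,31,Sales,Mumbai,50000,6,M
106,Aisha,,HR,Hyderabad,52000,4,F
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

print(df.head())          # first 5 rows
print(df.tail())          # last 5 rows
print(df.shape)           # (number of rows, number of columns)
print(list(df.columns))   # column names
df.info()                 # dtypes and non-null counts; prints directly
print(df.describe())      # summary statistics for numeric columns
```

Because Aisha's Age is empty, pandas reads the Age column as floating point with one missing value.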

3. Select Data

  • Display only the Name column.
  • Display Name and Salary.
  • Display first employee using loc.
  • Display third employee using iloc.
  • Display first three employees with only Name and Department.
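One possible approach to Task 3, sketched on four rows from the dataset. The variable names are illustrative, and the snippet assumes the default integer index (0, 1, 2, ...), which is what read_csv produces when no index column is specified.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [101, 102, 103, 104],
    "Name": ["Asif", "Sara", "Ali", "Priya"],
    "Department": ["IT", "HR", "Finance", "IT"],
    "Salary": [65000, 55000, 70000, 60000],
})

names = df["Name"]                    # single column, returned as a Series
name_salary = df[["Name", "Salary"]]  # list of columns, returned as a DataFrame
first = df.loc[0]                     # loc selects by index label
third = df.iloc[2]                    # iloc selects by integer position

# loc slices include the end label, so :2 covers labels 0, 1 and 2.
first_three = df.loc[:2, ["Name", "Department"]]
```

The key difference is that loc works with labels and iloc with positions; they only coincide while the index is the default 0-based range.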

4. Add a New Column

Create a column named:

Bonus = 10% of Salary

5. Update Values

Increase every employee's salary by 5000.
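Tasks 4 and 5 might be sketched as follows on three rows from the dataset. The order matters: here Bonus is computed from the original salary, before the raise, which is an assumption based on the order of the tasks.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asif", "Sara", "Ali"],
    "Salary": [65000, 55000, 70000],
})

# Task 4: Bonus is 10% of the salary as it stands before the raise.
df["Bonus"] = df["Salary"] * 0.10

# Task 5: vectorized update, adds 5000 to every row at once.
df["Salary"] = df["Salary"] + 5000
```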


6. Delete Data

  • Remove the Bonus column.
  • Remove employee whose ID is 109.
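A small sketch of Task 6, assuming a Bonus column already exists from Task 4. Filtering with a boolean mask is used here to remove the row; drop() with an index label would be an equally valid choice.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [108, 109, 110],
    "Name": ["Meena", "Ravi", "Sneha"],
    "Salary": [72000, 48000, 65000],
    "Bonus": [7200.0, 4800.0, 6500.0],
})

# Remove a column by name.
df = df.drop(columns=["Bonus"])

# Keep every row whose ID is not 109, then renumber the index.
df = df[df["ID"] != 109].reset_index(drop=True)
```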

7. Rename Columns

Rename:

Department → Dept
Experience → Exp
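Task 7 could be done with rename() and a mapping from old to new names, as in this short sketch. Note that later steps would then need to refer to Dept and Exp rather than the original names.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asif", "Sara"],
    "Department": ["IT", "HR"],
    "Experience": [5, 3],
})

# rename() returns a new DataFrame; columns not in the mapping are unchanged.
df = df.rename(columns={"Department": "Dept", "Experience": "Exp"})
```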

8. Filter Data

Display employees:

  • Age greater than 30
  • Salary greater than 60000
  • Department is IT
  • City is Warangal

9. Multiple Conditions

Display employees who:

  • Work in IT
  • Salary > 60000
  • Experience >= 5
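Tasks 8 and 9 both rely on boolean masks. The sketch below uses five rows from the dataset with the original column names; if the columns were renamed in Task 7, Dept and Exp would be used instead.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asif", "Ali", "Priya", "John", "Meena"],
    "Age": [30, 35, 26, 29, 33],
    "Department": ["IT", "Finance", "IT", "IT", "Finance"],
    "City": ["Warangal", "Delhi", "Chennai", "Bangalore", "Delhi"],
    "Salary": [65000, 70000, 60000, 68000, 72000],
    "Experience": [5, 8, 2, 5, 7],
})

# Task 8: one condition per filter.
older = df[df["Age"] > 30]
high_paid = df[df["Salary"] > 60000]
it_staff = df[df["Department"] == "IT"]
warangal = df[df["City"] == "Warangal"]

# Task 9: combine conditions with & and wrap each one in parentheses,
# because & binds more tightly than the comparison operators.
combined = df[
    (df["Department"] == "IT")
    & (df["Salary"] > 60000)
    & (df["Experience"] >= 5)
]
```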

10. Sort Data

  • Sort by Salary (Ascending)
  • Sort by Salary (Descending)
  • Sort by Age
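A sketch of Task 10 using sort_values(), which returns a sorted copy and leaves the original DataFrame untouched.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asif", "Sara", "Ali", "Priya"],
    "Age": [30, 28, 35, 26],
    "Salary": [65000, 55000, 70000, 60000],
})

by_salary_asc = df.sort_values("Salary")                    # ascending is the default
by_salary_desc = df.sort_values("Salary", ascending=False)
by_age = df.sort_values("Age")
```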

11. Handle Missing Values

  • Find missing values.
  • Count missing values.
  • Replace missing Age with average Age.

12. Remove Missing Values

Remove rows containing missing values.
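Tasks 11 and 12 might look like the sketch below. It uses three rows from the dataset, including Aisha's row with the missing Age. Filling and dropping are shown on separate copies, since dropping after filling would leave nothing to remove.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rahul", "Aisha", "John"],
    "Age": [31, float("nan"), 29],
    "Salary": [50000, 52000, 68000],
})

missing_mask = df.isna()          # True wherever a value is missing
missing_counts = df.isna().sum()  # missing values per column

# Task 11: replace missing Age with the column mean (mean() skips NaN).
filled = df.copy()
filled["Age"] = filled["Age"].fillna(filled["Age"].mean())

# Task 12: drop every row that has at least one missing value.
dropped = df.dropna()
```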


13. Remove Duplicate Records

Remove duplicate employees.
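One detail worth noting for Task 13: in employees.csv, rows 101 and 111 match in every column except ID, so a plain drop_duplicates() would keep both. The sketch below assumes the intent is to treat rows as duplicates when all non-ID columns match; that interpretation is an assumption, not something the task states.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [101, 110, 111],
    "Name": ["Asif", "Sneha", "Asif"],
    "Department": ["IT", "IT", "IT"],
    "City": ["Warangal", "Warangal", "Warangal"],
    "Salary": [65000, 65000, 65000],
})

# Default behaviour compares all columns, so differing IDs keep every row.
strict = df.drop_duplicates()

# Compare on everything except ID and keep the first occurrence.
columns_without_id = [col for col in df.columns if col != "ID"]
deduped = df.drop_duplicates(subset=columns_without_id, keep="first")
```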


14. Count Values

Display how many employees belong to each department.

Example:

IT         4
HR         2
Sales      2
Finance    2

15. Unique Values

Display unique cities.
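Tasks 14 and 15 could be sketched with value_counts() and unique(). Only five rows are used here, so the counts differ from the example above; the actual numbers depend on which earlier cleaning steps have been applied.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asif", "Sara", "Ali", "Priya", "Sneha"],
    "Department": ["IT", "HR", "Finance", "IT", "IT"],
    "City": ["Warangal", "Hyderabad", "Delhi", "Chennai", "Warangal"],
})

# Task 14: rows per department, sorted from most to least frequent.
dept_counts = df["Department"].value_counts()

# Task 15: distinct cities in order of first appearance (a NumPy array).
cities = df["City"].unique()
```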


16. Group By

Find the average salary of each department.

Expected Output:

IT
HR
Finance
Sales

17. Aggregate Functions

Find:

  • Total Salary
  • Average Salary
  • Highest Salary
  • Lowest Salary
  • Total Employees
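A sketch of Tasks 16 and 17 on five rows from the dataset. The variable names are illustrative, and the figures it produces apply to this subset only, not to the full file.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asif", "Sara", "Ali", "Priya", "Meena"],
    "Department": ["IT", "HR", "Finance", "IT", "Finance"],
    "Salary": [65000, 55000, 70000, 60000, 72000],
})

# Task 16: average salary per department.
avg_by_dept = df.groupby("Department")["Salary"].mean()

# Task 17: several aggregates in one call, plus the row count.
summary = df["Salary"].agg(["sum", "mean", "max", "min"])
total_employees = len(df)

print(avg_by_dept)
print(summary)
print(total_employees)
```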

18. Apply Function

Create a new column:

Tax = 5% of Salary

using apply().


19. Map Values

Convert Gender:

M → Male
F → Female

using map().
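Tasks 18 and 19 might be sketched as below. The lambda is one way to satisfy the apply() requirement; a vectorized df["Salary"] * 0.05 would give the same result. With a dictionary, map() yields NaN for any value that has no key, so the mapping should cover every code present in the column.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asif", "Sara"],
    "Salary": [65000, 55000],
    "Gender": ["M", "F"],
})

# Task 18: apply() calls the function once per salary value.
df["Tax"] = df["Salary"].apply(lambda salary: salary * 0.05)

# Task 19: map() replaces each value using the dictionary lookup.
df["Gender"] = df["Gender"].map({"M": "Male", "F": "Female"})
```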


20. String Operations

Convert all employee names:

  • Uppercase
  • Lowercase

Find length of every employee name.
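Task 20 can be handled through the .str accessor, which applies string methods to every element of a Series. A minimal sketch with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asif", "Sara", "Ali"]})

upper_names = df["Name"].str.upper()
lower_names = df["Name"].str.lower()
name_lengths = df["Name"].str.len()
```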


21. Boolean Column

Create a column:

Senior = Experience >= 5

22. Insert Column

Insert a new column named:

Country = India

after the Name column.
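A sketch covering Tasks 21 and 22. The position for the new column is looked up from the Name column rather than hard-coded, so the snippet still works if the column order changes. DataFrame.insert() modifies the DataFrame in place.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [101, 102],
    "Name": ["Asif", "Sara"],
    "Age": [30, 28],
    "Experience": [5, 3],
})

# Task 21: the comparison itself produces a boolean column.
df["Senior"] = df["Experience"] >= 5

# Task 22: insert Country immediately after Name.
position = df.columns.get_loc("Name") + 1
df.insert(position, "Country", "India")
```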


23. Set Index

Set:

ID

as the index.

Then reset the index.
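Task 23 could look like this. Both set_index() and reset_index() return new DataFrames, so the results are assigned to separate (illustrative) names here.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [101, 102],
    "Name": ["Asif", "Sara"],
})

indexed = df.set_index("ID")      # ID becomes the row labels
print(indexed.loc[102, "Name"])   # rows can now be looked up by ID

restored = indexed.reset_index()  # ID goes back to being a regular column
```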


24. Iterate Data

  • Print every row using iterrows().
  • Print every column name.
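A minimal sketch of Task 24. iterrows() yields an (index, row) pair for each row, where row is a Series keyed by column name. It is convenient for printing but slow on large data, which is why the bonus challenge prefers vectorized operations.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asif", "Sara"],
    "Salary": [65000, 55000],
})

# Print every row.
for index, row in df.iterrows():
    print(index, row["Name"], row["Salary"])

# Print every column name.
for column in df.columns:
    print(column)
```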

🎯 Bonus Challenge

Without using loops wherever possible, answer:

  1. Which employee has the highest salary?
  2. Which city has the most employees?
  3. What is the average salary of IT employees?
  4. How many female employees are there?
  5. List employees with more than 5 years of experience.
  6. Display employees whose names start with "A".
  7. Display employees whose salary is between 55,000 and 70,000.
  8. Find the youngest employee.
  9. Find the oldest employee.
  10. Save the final cleaned DataFrame into a new CSV file named employees_cleaned.csv.
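The bonus questions can be answered without loops, as in the sketch below. It runs on the raw data from employees.csv (inlined so the snippet is self-contained), so the answers would change after the cleaning steps in the earlier tasks. Variable names are illustrative, and the output is written to the system temp directory only to keep the sketch from touching the working directory.

```python
import io
import os
import tempfile
import pandas as pd

CSV_TEXT = """ID,Name,Age,Department,City,Salary,Experience,Gender
101,Asif,30,IT,Warangal,65000,5,M
102,Sara,28,HR,Hyderabad,55000,3,F
103,Ali,35,Finance,Delhi,70000,8,M
104,Priya,26,IT,Chennai,60000,2,F
105,Rahul,31,Sales,Mumbai,50000,6,M
106,Aisha,,HR,Hyderabad,52000,4,F
107,John,29,IT,Bangalore,68000,5,M
108,Meena,33,Finance,Delhi,72000,7,F
109,Ravi,27,Sales,Pune,48000,2,M
110,Sneha,30,IT,Warangal,65000,5,F
111,Asif,30,IT,Warangal,65000,5,M
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

top_earner = df.loc[df["Salary"].idxmax(), "Name"]                   # 1
busiest_city = df["City"].value_counts().idxmax()                    # 2
it_avg_salary = df.loc[df["Department"] == "IT", "Salary"].mean()    # 3
female_count = int((df["Gender"] == "F").sum())                      # 4
experienced = df[df["Experience"] > 5]                               # 5
starts_with_a = df[df["Name"].str.startswith("A")]                   # 6
mid_salary = df[df["Salary"].between(55000, 70000)]                  # 7, bounds inclusive
youngest = df.loc[df["Age"].idxmin(), "Name"]                        # 8, missing Age is skipped
oldest = df.loc[df["Age"].idxmax(), "Name"]                          # 9

# 10: index=False keeps the row index out of the saved file.
out_path = os.path.join(tempfile.gettempdir(), "employees_cleaned.csv")
df.to_csv(out_path, index=False)
```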
