This project covers data cleaning, preprocessing, and analysis of a student dataset using NumPy and Pandas, in four parts:
- Part 1: NumPy operations - mean, median, max, min, normalization
- Part 2: Pandas exploration - data types, missing values, filtering
- Part 3: Data preprocessing - handling missing values, datetime conversion, outlier detection, removing duplicates
- Part 4: Data analysis - average scores, top students, correlation, groupby
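
The NumPy operations in Part 1 can be sketched roughly as follows. The score values are invented for illustration, and min-max scaling is an assumption here, since the project does not say which normalization it uses.

```python
import numpy as np

# Hypothetical scores, invented for illustration (not taken from student_data.csv)
scores = np.array([72.0, 85.0, 90.0, 64.0, 78.0])

# Basic descriptive statistics
stats = {
    "mean": float(np.mean(scores)),
    "median": float(np.median(scores)),
    "max": float(np.max(scores)),
    "min": float(np.min(scores)),
}

# Min-max normalization (assumed method): rescale values into the range [0, 1]
normalized = (scores - scores.min()) / (scores.max() - scores.min())

print(stats)
print(normalized)
```

Min-max scaling maps the lowest score to 0 and the highest to 1. A z-score, (scores - scores.mean()) / scores.std(), would be an equally reasonable choice if the assignment calls for standardization instead.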
The dataset (student_data.csv) contains student records with fields such as name, gender, math, science, and English scores, attendance, and exam date.
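
As a minimal sketch of how preprocessing and analysis might look on records of this shape, the example below builds a small in-memory table rather than reading the real file. The column names (name, gender, math, science, english, attendance, exam_date) and all values are assumptions for illustration. Mean imputation and the 1.5 × IQR outlier rule are likewise assumed choices, not necessarily what assignment2_solution.py does.

```python
import numpy as np
import pandas as pd

# Hypothetical records; column names and values are assumptions for illustration
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev", "Dev"],
    "gender": ["F", "M", "F", "M", "M"],
    "math": [80.0, np.nan, 95.0, 60.0, 60.0],
    "science": [75.0, 70.0, 90.0, 65.0, 65.0],
    "english": [85.0, 60.0, 88.0, 70.0, 70.0],
    "attendance": [92.0, 85.0, 98.0, 75.0, 75.0],
    "exam_date": ["2024-05-01", "2024-05-01", "2024-05-02",
                  "2024-05-02", "2024-05-02"],
})
score_cols = ["math", "science", "english"]

# Preprocessing: drop exact duplicate rows, fill missing scores with the
# column mean (assumed strategy), and parse exam dates into datetime values
df = df.drop_duplicates().reset_index(drop=True)
df[score_cols] = df[score_cols].fillna(df[score_cols].mean())
df["exam_date"] = pd.to_datetime(df["exam_date"])

# Outlier detection with the 1.5 * IQR rule (assumed method), shown for math
q1 = df["math"].quantile(0.25)
q3 = df["math"].quantile(0.75)
iqr = q3 - q1
df["math_outlier"] = (df["math"] < q1 - 1.5 * iqr) | (df["math"] > q3 + 1.5 * iqr)

# Analysis: per-student average, top students, group means, correlation
df["average"] = df[score_cols].mean(axis=1)
top = df.nlargest(2, "average")[["name", "average"]]
by_gender = df.groupby("gender")["average"].mean()
corr = df["math"].corr(df["attendance"])

print(top)
print(by_gender)
print(corr)
```

With the real data, the in-memory table would be replaced by pd.read_csv("student_data.csv"), and the column names adjusted to match the file's actual header.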
To install the dependencies:
pip install numpy pandas
To run the solution:
python assignment2_solution.py