I clean messy datasets and turn raw CSV or Excel files into structured, analysis-ready data — then deliver statistical summaries, visualizations, and insights your team can actually use.
What I deliver:
- Data cleaning: duplicate removal, missing-value handling, date formatting, type correction
- Exploratory Data Analysis (EDA): distributions, correlations, outliers
- Summary statistics: mean, median, standard deviation, top categories
- Charts and visualizations (matplotlib, seaborn, or Plotly)
- Final output as clean CSV + PDF report with charts
- Reusable Python script (Pandas) so you can re-run it on new data
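To give a sense of what the cleaning and summary steps above can look like in Pandas, here is a minimal sketch. The column names (`order_date`, `amount`, `region`), the helper names `clean` and `summarize`, and the choice to fill missing numbers with the median are all assumptions for illustration; the real script depends on your data and on what we agree for each column.

```python
import pandas as pd


def clean(df: pd.DataFrame, date_col: str, numeric_col: str) -> pd.DataFrame:
    """Illustrative cleaning pass (hypothetical helper, not a fixed recipe)."""
    # Remove exact duplicate rows and work on a copy.
    out = df.drop_duplicates().copy()
    # Date formatting: unparseable values become NaT instead of raising.
    out[date_col] = pd.to_datetime(out[date_col], errors="coerce")
    # Type correction: non-numeric strings become NaN.
    out[numeric_col] = pd.to_numeric(out[numeric_col], errors="coerce")
    # Missing values: median fill is an assumption; dropping or flagging
    # the rows may be the better choice for a given dataset.
    out[numeric_col] = out[numeric_col].fillna(out[numeric_col].median())
    return out.reset_index(drop=True)


def summarize(df: pd.DataFrame, numeric_col: str, category_col: str, top_n: int = 3):
    """Return (mean/median/std of one column, top categories of another)."""
    stats = df[numeric_col].agg(["mean", "median", "std"])
    top = df[category_col].value_counts().head(top_n)
    return stats, top


# Made-up sample data, only to show the calls.
raw = pd.DataFrame({
    "order_date": ["2024-01-05", "2024-01-05", "not a date", "2024-02-10"],
    "amount": ["10", "10", "abc", "30"],
    "region": ["North", "North", "South", "North"],
})
cleaned = clean(raw, date_col="order_date", numeric_col="amount")
stats, top = summarize(cleaned, numeric_col="amount", category_col="region")
print(cleaned)
print(stats)
print(top)
```

On the sample data this drops one duplicate row, marks the unparseable date as missing, and converts `amount` from text to numbers before any statistics are computed.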
I work in two clear phases: clean first, analyze second. Before any analysis begins, you receive a cleaned file you can trust. No black-box processing: I document every transformation so you know exactly what changed and why.
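One way the transformation log can be kept is sketched below. This is an illustrative pattern under my own assumptions, not a fixed format: `apply_step` is a hypothetical helper, and the columns `id` and `score` are made up for the example.

```python
import pandas as pd


def apply_step(df, log, name, reason, func):
    """Run one cleaning step and record what it did (hypothetical helper)."""
    rows_before = len(df)
    result = func(df)
    log.append({
        "step": name,
        "reason": reason,
        "rows_before": rows_before,
        "rows_after": len(result),
    })
    return result


# Made-up sample data, only to show the pattern.
log = []
df = pd.DataFrame({"id": [1, 1, 2, 3], "score": [5.0, 5.0, None, 7.0]})

df = apply_step(df, log, "drop_duplicates", "exact duplicate rows",
                lambda d: d.drop_duplicates())
df = apply_step(df, log, "drop_missing_score", "score is required for analysis",
                lambda d: d.dropna(subset=["score"]))

# The log itself is tabular, so it can go straight into the report.
print(pd.DataFrame(log).to_string(index=False))
```

Each entry pairs a step with its reason and the row counts before and after, so a reader can see where rows were removed without reading the script.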
Tools: Python 3, Pandas, openpyxl, matplotlib, seaborn, Plotly, Streamlit (for interactive dashboards on request).