CST338 Learning Log #4
This week we focused heavily on data visualization and exploratory data analysis using Pandas, Matplotlib, and Seaborn. We worked with datasets involving campaign contributions and US Census information, and I learned that choosing the correct visualization is just as important as creating the graph itself. Different types of variables require different approaches. For example, histograms worked well for continuous variables like contribution amounts or hours worked per week, while grouped and stacked bar charts were better for comparing categories such as occupations, employment status, sex, and income level.

One thing I improved on this week was using Pandas methods to summarize and prepare data before plotting. We used functions like groupby(), value_counts(), and crosstab() repeatedly. For example, this line was useful for comparing contribution amounts across categories:

    df.groupby('candidate')['contb_receipt_amt'].median().plot.barh()

I also learned how normalizat...
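
To make the summarize-then-plot pattern concrete, here is a minimal sketch of how groupby(), value_counts(), and crosstab() can each produce a table that is ready to plot. The data is a tiny made-up DataFrame, not the class dataset. Only the 'candidate' and 'contb_receipt_amt' column names come from the line quoted in this log; the 'occupation' column, the values, and the plot_summaries() helper are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for the campaign contributions dataset.
# 'occupation' and all values are invented for this sketch.
df = pd.DataFrame({
    "candidate": ["A", "A", "A", "B", "B", "B"],
    "contb_receipt_amt": [25.0, 50.0, 500.0, 10.0, 100.0, 250.0],
    "occupation": ["Teacher", "Engineer", "Teacher",
                   "Engineer", "Teacher", "Engineer"],
})

# groupby(): one summary number (the median) per candidate.
median_by_candidate = df.groupby("candidate")["contb_receipt_amt"].median()

# value_counts(): how many rows fall into each category.
occupation_counts = df["occupation"].value_counts()

# crosstab(): counts for every combination of two categorical columns.
occupation_by_candidate = pd.crosstab(df["occupation"], df["candidate"])


def plot_summaries():
    """Draw each summary table (hypothetical helper; needs Matplotlib)."""
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 3, figsize=(12, 3))
    # Horizontal bars for a single number per category.
    median_by_candidate.plot.barh(ax=axes[0], title="Median contribution")
    # Plain bars for category counts.
    occupation_counts.plot.bar(ax=axes[1], title="Rows per occupation")
    # Stacked bars for a two-way table of counts.
    occupation_by_candidate.plot.bar(
        stacked=True, ax=axes[2], title="Occupation by candidate"
    )
    fig.tight_layout()
    return fig
```

Calling plot_summaries() draws the three charts side by side. Keeping the summary step separate from the plotting step makes it easy to print each table and check the numbers before looking at the graph.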