Advanced Data Viz Blog

Week 2

This week’s class on data cleaning was incredibly informative. In previous projects, I ran into similar issues with a database that was difficult to generate results from. Thankfully, our assignment with the Titanic data let me outline successful practices for working with less-than-ideal data.

I’ve used Excel before, but reminding myself of the COUNTIF function was really helpful. With this function, I was able to answer quantitative questions about the passengers aboard the Titanic, which was important information for reporting. I realized that some of the most important adjustments to the data were quite simple fixes. For example, using “1, 2, 3” for passenger class instead of “1st, 2nd, 3rd.” We also noted in class that it would’ve been insightful to include whether passengers were traveling alone or as part of a family. For organizational purposes, the data would’ve been a lot easier to navigate if the name variable had been split into last name and first name. Simple fixes like avoiding spaces, using plain text, and using lowercase would’ve made a huge difference.
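To make these fixes concrete for myself, here is a minimal sketch of how they might look in Python rather than Excel. The rows, the column names, and the helpers clean_key, clean_row, and count_if are all made up for illustration; this is not the actual Titanic file or the exact steps we followed in class.

```python
import re

# Hypothetical rows: the column names and values are invented for illustration.
raw_rows = [
    {"Passenger Name": "Smith, Mr John", "Class": "1st", "Survived": "Yes"},
    {"Passenger Name": "Doe, Miss Jane", "Class": "3rd", "Survived": "No"},
    {"Passenger Name": "Roe, Mrs Ann", "Class": "3rd", "Survived": "Yes"},
]


def clean_key(name):
    """Lowercase a column name and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


def clean_row(row):
    """Apply the simple fixes: plain column names, numeric class, split name."""
    cleaned = {clean_key(key): value.strip() for key, value in row.items()}
    # "1st" -> 1, "2nd" -> 2, "3rd" -> 3
    cleaned["class"] = int(re.match(r"\d+", cleaned["class"]).group())
    # "Last, First" -> separate fields; everything after the comma
    # (including a title such as "Mr") stays in first_name.
    last, first = cleaned.pop("passenger_name").split(",", 1)
    cleaned["last_name"] = last.strip()
    cleaned["first_name"] = first.strip()
    # Store survival as lowercase plain text.
    cleaned["survived"] = cleaned["survived"].lower()
    return cleaned


def count_if(rows, column, value):
    """Rough stand-in for Excel's COUNTIF over a single column."""
    return sum(1 for row in rows if row[column] == value)


rows = [clean_row(row) for row in raw_rows]
third_class = count_if(rows, "class", 3)
survivors = count_if(rows, "survived", "yes")
```

Once the class column holds plain numbers and the text is lowercase, the counting step only needs an exact match, which is part of why these small fixes matter so much.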

Consistency is a big factor in determining a successful database. Variable names, variable values, and subject IDs should always follow the same standard. This practice sets the precedent not only for organization but also for the clarity of the work. This is something that I definitely need to work on, as it is a fundamental step for working with databases.

For my future projects, I will definitely be more proactive about working with data. In my experience with databases, the sections were disorganized and not seamless to work with. I will keep in mind how to format the data for the software I will eventually use. With these steps and practices, I’m certain that I will be able to tackle database obstacles with confidence.

Lastly, practicing the partner system was great! My partner and I were able to catch inconsistencies we may not have found on our own. It was definitely great to try it out in real time with a colleague.

Melissa Gutierrez