Week 3 – BALT 4361 – Where Data Comes From and Where It Goes

The biggest takeaway for me from Chapters 3 and 4 was just how much data surrounds us all the time without our noticing. Before this week, whenever I thought about data, I pictured spreadsheets and numbers. Learning the difference between structured and unstructured data helped me realize that data comes in many other forms, such as emails, images, social media posts, and sensor readings. Real-world data is much messier than I originally thought, which helped me understand how organizations can fail when their data is not organized or reliable from the beginning.
Data quality was another key topic for me this week. It was fascinating to learn the qualities that data must have in order to be considered trustworthy: accuracy, completeness, consistency, timeliness, and relevance. If any of these is missing, the data cannot be fully trusted, and any decision based on it is at risk of being wrong. I often forget that having more data does not always mean better outcomes. Taking the time to clean your data and making sure you are collecting the right data can be more beneficial than simply gathering masses of data and analyzing it.
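To make a few of these qualities concrete for myself, here is a minimal sketch of how they might be checked in Python. Everything in it is an assumption for illustration: the customer records, the field names, the two-letter state rule, and the one-year freshness cutoff are all made up and do not come from the chapters. It only covers completeness, consistency, and timeliness, since accuracy and relevance usually need a trusted source or a business question to compare against.

```python
from datetime import date

# Hypothetical customer records; every field name and value here is made up.
records = [
    {"id": 1, "email": "ana@example.com", "state": "TX", "updated": date(2024, 5, 1)},
    {"id": 2, "email": "", "state": "Texas", "updated": date(2019, 1, 15)},
    {"id": 3, "email": "lee@example.com", "state": "TX", "updated": date(2024, 4, 20)},
]


def quality_issues(record, today, max_age_days=365):
    """Return a list of data quality problems found in one record."""
    issues = []
    # Completeness: a required field is empty.
    if not record["email"]:
        issues.append("incomplete: missing email")
    # Consistency: states are assumed to be stored as two-letter codes.
    if len(record["state"]) != 2:
        issues.append("inconsistent: state is not a two-letter code")
    # Timeliness: the record has not been updated within the cutoff.
    if (today - record["updated"]).days > max_age_days:
        issues.append("stale: not updated recently")
    return issues


today = date(2024, 6, 1)  # fixed date so the example is repeatable
report = {r["id"]: quality_issues(r, today) for r in records}

for record_id, issues in report.items():
    print(record_id, issues or "ok")
```

In this sketch only record 2 is flagged, and it fails all three checks at once, which matches my sense from the reading that messy records tend to be messy in more than one way.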
Chapter 4 really helped tie everything together for me. It answered the question of what happens to data once it is collected. Learning about data pipelines and data warehouses allowed me to see the entire process and system behind data in the real world. I enjoyed learning how different data roles, such as engineers, analysts, and scientists, work together. The section on cloud computing and big data tools also helped solidify my understanding of why cloud platforms are so prominent in organizations today.