Fixing Dirty Data - How to Clean Data by Classification and Normalization | Susan Walsh

40 minutes

Description

2 months ago

In the first-ever English episode of UNF#CK YOUR DATA, host
Christian Krug interviews Susan Walsh, the classification guru,
on how to clean your dirty data.


But first: what is dirty data, and why is it a problem?


Data in your company systems, like CRM or ERP, can have all sorts
of issues: duplicates, near-duplicates, inconsistent formats, and so on.


So records that should match don't, or your numbers are
off.
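
To make the near-duplicate problem concrete, here is a minimal sketch using Python's standard difflib module. It is not a method described in the episode; the function name near_duplicates, the 0.85 threshold, and the sample records are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(names, threshold=0.85):
    """Return pairs of names whose similarity ratio meets the threshold.

    The threshold is an illustrative assumption and needs tuning per dataset.
    """
    pairs = []
    for a, b in combinations(names, 2):
        # Compare case-insensitively so "ACME" and "Acme" count as the same.
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


# Hypothetical CRM records: the first two are the same supplier.
records = ["Acme Ltd", "ACME Ltd.", "Globex GmbH"]
print(near_duplicates(records))
```

A check like this only flags candidates. Deciding whether two records really are the same customer still takes someone who knows the business.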


Basically, you can't rely on the data in the system to make
decisions, such as sending a mail or a leaflet, or potentially even an
invoice. Nor can you tell who your real number one customer is.


To help you deal with this mess, Susan has created a framework
that helps you clean up your data. You have to normalize and
classify your data. First, agree on a common format and fit the
data to it. Afterwards, you can give the data meaning by
classifying it.
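
The two steps, normalize first and classify second, can be sketched roughly as follows. This is an illustration only, not Susan's framework. The suffix list, the category mapping, and the names normalize_name and classify are hypothetical, and in practice the mapping would be built by people who know the business.

```python
import re

# Hypothetical list of legal suffixes to strip; a real list would be longer.
LEGAL_SUFFIXES = {"ltd", "limited", "inc", "gmbh", "llc"}

# Hypothetical mapping from normalized supplier name to spend category.
CATEGORY_BY_SUPPLIER = {
    "acme": "Office Supplies",
    "globex": "IT Services",
}


def normalize_name(raw):
    """Step 1: fit the name to one agreed format (lowercase, no punctuation, no legal suffix)."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def classify(raw):
    """Step 2: give the normalized name a meaning by assigning a category."""
    return CATEGORY_BY_SUPPLIER.get(normalize_name(raw), "Unclassified")


for row in ["ACME Ltd.", "Acme, Limited", "Globex GmbH", "Initech"]:
    print(f"{row!r} -> {normalize_name(row)!r} -> {classify(row)}")
```

Anything that ends up as "Unclassified" is a prompt for human review rather than a final answer.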


So you can further process the data and base your decisions on
it.


Sad news for all the AI enthusiasts out there: this still
requires an awful lot of human knowledge. There is no shortcut to
speed up the process.


On the other hand, this step is crucial for your AI success, as
only good-quality training data will lead to great AI results,
regardless of which use case you tackle first.





But cleaning data once is not a lasting solution. It's a
continuous effort, and it has to start at the very source, where
people enter the data into the systems.


So data quality is both a process and a mantra.





Find in this episode:


- Why data is sometimes so dirty


- How the COAT method can help you clean data


- Why data quality is not an AI topic


- Susan's plans for a new framework








Profiles:





Susan on LinkedIn:
https://www.linkedin.com/in/susanewalsh/





Christian on LinkedIn:
https://www.linkedin.com/in/christian-krug/





Unf*ck Your Data on LinkedIn:
https://www.linkedin.com/company/unfck-your-data





Book recommendation:





Susan's book recommendation: Buy Back Your Time - Dan Martell





The “UYD” bookshelf at Melena’s store:
https://gunzenhausen.buchhandlung.de/unfuckyourdata








Where to find UNF#CK YOUR DATA:





Podcast on Spotify:
https://open.spotify.com/show/6Ow7ySMbgnir27etMYkpxT?si=dc0fd2b3c6454bfa





Podcast on iTunes:
https://podcasts.apple.com/de/podcast/unf-ck-your-data/id1673832019





Podcast on Deezer: https://deezer.page.link/FnT5kRSjf2k54iib6





Contact:





Email: christian@uyd-podcast.com





Timestamps:





00:00 Introduction and Welcome


01:13 Susan's Background and Expertise


03:03 Types of Dirty Data


04:01 The Impact of Dirty Data


06:12 Cleaning Data and the Role of Excel


07:34 The Limitations of AI in Data Cleaning


09:26 Automating Supplier Name Normalization


11:03 Data Classification and Context


13:52 The Importance of Business Understanding


16:26 The Role of Human Expertise in Data...
