What I want AI to do is comb through our (separate) databases that contain field failure information, clean and combine the data, then do some rules-based pre-processing before doing an analysis. The problems for me are:
- All our data is in separate systems, each requiring its own credentials to access.
- We've tried using ML and NLP to analyze this data before and it hasn't worked. Think of the text as a Jiffy Lube tech describing a problem in 256 characters or fewer. Typos, crazy acronyms and abbreviations make it difficult to come up with a training set, let alone successfully train anything. The rare times we were able to train a model successfully, it went out of date within a couple of months.
My big fear is that people don't realize these are major limitations, because I've seen it happen at my company. Just because your data came from a computer doesn't mean it's right. There is still a need for people to dig in and understand the data, where it comes from and how it gets there.
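
To make the "clean, combine, then rules-based pre-processing" idea concrete, here is a rough Python sketch. Everything specific in it is made up for illustration: the abbreviation table, the `serial` and `note` field names, the system names and the sample records are assumptions, not our real data or schema.

```python
import re

# Hypothetical abbreviation table. A real one would have to be built and
# maintained by people who actually understand the data.
ABBREVIATIONS = {
    "brk": "brake",
    "lkg": "leaking",
    "repl": "replaced",
    "cust": "customer",
}


def normalize_note(note: str) -> str:
    """Lowercase the note, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", note.lower())
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)


def combine(sources: dict) -> dict:
    """Merge records from several systems, keyed by a cleaned-up serial number."""
    combined = {}
    for system_name, records in sources.items():
        for rec in records:
            key = rec["serial"].strip().upper()  # same unit, inconsistent formatting
            combined.setdefault(key, []).append(
                {"system": system_name, "note": normalize_note(rec["note"])}
            )
    return combined


# Invented sample records standing in for two separate systems.
sources = {
    "warranty": [{"serial": "a100 ", "note": "Cust states BRK lkg."}],
    "service": [{"serial": "A100", "note": "repl brk line"}],
}
combined = combine(sources)

for serial, entries in combined.items():
    for entry in entries:
        print(serial, entry["system"], entry["note"])
```

Note what this does not do: a typo or an abbreviation that isn't in the table passes straight through unchanged, which is exactly why someone still has to dig in and understand the data.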