Thanks for your great insights. Yet I must confess that I disagree a little with the claim that data scientists will soon be limited to pre-processing data.
I think that there is a growing faction of interdisciplinary data scientists: researchers who have working knowledge of a second field in addition to computer science. For example, I am a physician by profession, but I recently fell in love with data science because it gave me an opportunity to learn more about statistics, and that in turn opened my eyes to frontiers in cancer research that I didn’t even imagine I could discover while I was only seeing patients.
And because gathering data took me a very long time, I started looking for readily available data to work with. I discovered the wonderful Surveillance, Epidemiology, and End Results (SEER) database, and I was able to publish my first medical paper in 2018.
But I can tell you that …
You are right that today’s databases are highly unstructured and need a lot of recoding before their data can be fed into statistical software. Yet pre-processing the data is not the only thing that a data scientist can excel at. Subject knowledge is another kind of value that multidisciplinary data scientists can add. So, I guess human data scientists will prevail in that domain for a little longer.
Thanks again for your informative article.