Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from noisy, structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains. Data science is related to data mining, machine learning and big data.
Data science is a “concept to unify statistics, data analysis, informatics, and their related methods” in order to “understand and analyze actual phenomena” with data. It uses techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, information science, and domain knowledge. However, data science is different from computer science and information science. Turing Award winner Jim Gray imagined data science as a “fourth paradigm” of science (empirical, theoretical, computational, and now data-driven) and asserted that “everything about science is changing because of the impact of information technology” and the data deluge.
A data scientist is someone who creates programming code, and combines it with statistical knowledge to create insights from data.