IBM Interview Records
Interview Questions for IBM Data Scientist Intern
Self Introduction Part
Introduce yourself
Your favorite programming languages, why?
C++, Python.
Data Science and Machine Learning Basics
What’s the difference between Supervised Learning and Unsupervised Learning?
Supervised Learning trains a model on a dataset in which every example has a known label y, also called the target. The trained model then predicts this label for new, unseen data.
Unsupervised Learning trains a model on data that has no target label y. The model looks for structure in the data on its own; clustering is the most common example, and dimensionality reduction is another.
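To make the contrast concrete, here is a minimal sketch in plain Python. The toy data, the label names, and both helper functions (predict_nearest_neighbor, two_means) are made up for illustration; they are not from the interview. The supervised part uses the labels to predict a label for a new point, while the unsupervised part sees only the points and groups them.

```python
# Toy 1-D data: two groups of points, around 1.0 and around 10.0.
points = [1.0, 1.2, 0.8, 10.0, 10.3, 9.7]
labels = ["small", "small", "small", "large", "large", "large"]  # target label y


def predict_nearest_neighbor(train_x, train_y, new_x):
    """Supervised: return the label of the closest labeled training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - new_x))
    return train_y[nearest]


def two_means(xs, iterations=10):
    """Unsupervised: split unlabeled points into two clusters (k-means, k=2)."""
    centers = [min(xs), max(xs)]  # simple deterministic initialization
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            closest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[closest].append(x)
        # Move each center to the mean of the points assigned to it.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Supervised: the labels are used to answer "which label does 9.0 get?"
prediction = predict_nearest_neighbor(points, labels, 9.0)

# Unsupervised: no labels are passed in; the groups come from the data alone.
centers, clusters = two_means(points)

print(prediction)
print(clusters)
```

Note that two_means never receives the labels list. It recovers the same two groups, but it cannot name them "small" and "large"; that naming only exists in the supervised setting.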
What is dimensionality reduction? Why do we need this? Can you explain how PCA works briefly?
Dimensionality reduction: It is the process of compressing the original features from a high-dimensional space into a lower-dimensional one while keeping as much of the important information as possible.
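As a small illustration of this definition, the sketch below reduces 2-D points to a single coordinate using the idea behind PCA: find the direction of largest variance and project onto it. The data are made up, and the closed-form eigenvalue formula used here only works for a 2x2 covariance matrix; a real project would more likely use a library implementation.

```python
import math

# Toy 2-D data lying close to the line y = x (made-up values).
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]

# Center the data so that each feature has mean zero.
n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n
centered = [(x - mean_x, y - mean_y) for x, y in data]

# Entries of the 2x2 sample covariance matrix [[sxx, sxy], [sxy, syy]].
sxx = sum(x * x for x, _ in centered) / (n - 1)
syy = sum(y * y for _, y in centered) / (n - 1)
sxy = sum(x * y for x, y in centered) / (n - 1)

# Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
trace = sxx + syy
det = sxx * syy - sxy * sxy
top_eigenvalue = trace / 2 + math.sqrt(trace * trace / 4 - det)

# Matching eigenvector (valid here because sxy is nonzero), normalized.
vx, vy = sxy, top_eigenvalue - sxx
norm = math.hypot(vx, vy)
direction = (vx / norm, vy / norm)

# Each 2-D point is compressed to one number: its position along that direction.
reduced = [x * direction[0] + y * direction[1] for x, y in centered]

# Share of the total variance that the single kept dimension retains.
explained = top_eigenvalue / trace
print(reduced)
print(explained)
```

Because the points lie almost on a line, one dimension retains nearly all of the variance, which is the sense in which the important information is kept.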
Why we need dimensionality reduction:
1.