scikit-learn fit() transform() fit_transform()

https://stackoverflow.com/questions/23838056/what-is-the-difference-between-transform-and-fit-transform-in-sklearn

In the scikit-learn estimator API:

fit(): learns the model parameters from the training data.

transform(): applies the parameters learned by fit() to a data set and returns the transformed data.

fit_transform(): combines fit() and transform() in a single call on the same data set.

Check out Chapter 4 of this book and the answer on Stack Exchange for more clarity.

A further explanation follows, using an example to show what fit() and fit_transform() mean:

To standardize the data (give it zero mean and unit standard deviation), you subtract the mean and then divide the result by the standard deviation.
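
As a rough illustration of that arithmetic, here is a minimal NumPy sketch. The array x is a made-up toy feature, and the population standard deviation (NumPy's default, ddof=0) is assumed:

```python
import numpy as np

# Hypothetical toy feature column, made up for illustration.
x = np.array([2.0, 4.0, 6.0, 8.0])

mu = x.mean()     # mean of the data
sigma = x.std()   # population standard deviation (ddof=0)

# Subtract the mean, then divide by the standard deviation.
z = (x - mu) / sigma

print(z.mean(), z.std())  # approximately 0 and 1, up to rounding
```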

You do that on the training set. You then have to apply the same transformation to your test set (e.g. in cross-validation) or to newly obtained examples before making predictions, and you have to use the same two parameters μ and σ that you computed from the training set.
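
The point about reusing μ and σ could be sketched by hand as follows. The arrays x_train and x_test are hypothetical toy values; the only thing that matters is that the test data is scaled with the training statistics, not its own:

```python
import numpy as np

# Hypothetical toy data, made up for illustration.
x_train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x_test = np.array([3.0, 6.0])

# Parameters are computed from the training set only.
mu = x_train.mean()
sigma = x_train.std()

# The same mu and sigma are applied to both sets.
z_train = (x_train - mu) / sigma
z_test = (x_test - mu) / sigma
```

Note that z_test generally does not have zero mean or unit standard deviation itself; it is simply expressed on the scale the model saw during training.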

Hence, the fit() method of every sklearn transformer just calculates the parameters (e.g. μ and σ in the case of StandardScaler) and saves them as the object's internal state. Afterwards, you can call its transform() method to apply the transformation to any particular set of examples.
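
A short sketch of this with StandardScaler, again on made-up toy arrays. After fit(), the learned parameters are exposed as the mean_ and scale_ attributes, and transform() should match the manual formula:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; sklearn expects 2-D arrays (samples x features).
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
X_test = np.array([[3.0], [6.0]])

scaler = StandardScaler()
scaler.fit(X_train)  # only computes and stores the parameters

print(scaler.mean_)   # per-feature mean learned from X_train
print(scaler.scale_)  # per-feature standard deviation learned from X_train

# transform() applies the stored parameters to any data set.
X_test_scaled = scaler.transform(X_test)

# Same result as applying the formula by hand with the stored values.
manual = (X_test - scaler.mean_) / scaler.scale_
```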

fit_transform() joins these two steps. It is used for the initial fitting of parameters on the training set x, and it also returns the transformed x′. By default, it simply calls fit() and then transform() on the same data, although some transformers override it with a more efficient implementation.
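
To check that equivalence on a small case, the sketch below compares the one-step and two-step forms for StandardScaler. The array X_train is a hypothetical example:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy training data with two features.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])

# One step: fit the parameters and return the transformed data.
one_step = StandardScaler().fit_transform(X_train)

# Two steps: fit() returns the scaler itself, so the calls can be chained.
two_step = StandardScaler().fit(X_train).transform(X_train)

same = np.allclose(one_step, two_step)
print(same)
```

In practice, fit_transform() is called on the training set only, and plain transform() on the test set or on new data, so that the training parameters are reused.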