If you’ve used the sklearn library in your own code, you may have noticed that many attribute names end with a trailing underscore. Here’s an example for the k-means algorithm:
```python
## Dependencies
from sklearn.cluster import KMeans
import numpy as np

## Data (Work (h) / Salary ($))
X = np.array([[35, 7000], [45, 6900], [70, 7100],
              [20, 2000], [25, 2200], [15, 1800]])

## One-liner
kmeans = KMeans(n_clusters=2).fit(X)

## Result & puzzle
cc = kmeans.cluster_centers_
print(cc)
'''
[[  50. 7000.]
 [  20. 2000.]]
'''
```
In the second-to-last line of code, we accessed the attribute cluster_centers_. Why doesn’t the sklearn library use an attribute name without the trailing underscore?
‘The short answer is, the trailing underscore (kmeans.cluster_centers_) in class attributes is a scikit-learn convention to denote “estimated” or “fitted” attributes.’ (source)
So the underscore simply indicates that the attribute was estimated from the data.
The sklearn documentation is very clear about this:
‘Attributes that have been estimated from the data must always have a name ending with trailing underscore, for example the coefficients of some regression estimator would be stored in a coef_ attribute after fit has been called.’
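To make the quoted rule concrete, here is a minimal sketch of a toy estimator that follows the same naming convention. The class `MeanEstimator`, its `round_digits` parameter, and its `mean_` attribute are hypothetical names invented for illustration; they are not part of sklearn.

```python
class MeanEstimator:
    """Toy estimator (hypothetical, not part of sklearn) following the naming convention."""

    def __init__(self, round_digits=2):
        # Hyperparameter chosen by the user: no trailing underscore.
        self.round_digits = round_digits

    def fit(self, values):
        # Estimated from the data: trailing underscore, set only inside fit().
        self.mean_ = round(sum(values) / len(values), self.round_digits)
        return self  # returning self allows chaining, e.g. MeanEstimator().fit(data)


est = MeanEstimator()
print(hasattr(est, "mean_"))  # False: nothing has been estimated yet
est.fit([20, 25, 15])
print(est.mean_)              # 20.0
```

The split mirrors the convention: values passed to the initializer keep plain names, while anything learned from the data gets the trailing underscore.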
This is very useful for you because you immediately know that these attributes have been set in the learning phase of the algorithm (and not in the initializer etc.). Thus, you can easily spot that a model has not been trained by checking the attributes with trailing underscores:
```python
## Dependencies
from sklearn.cluster import KMeans
import numpy as np

## Data (Work (h) / Salary ($))
X = np.array([[35, 7000], [45, 6900], [70, 7100],
              [20, 2000], [25, 2200], [15, 1800]])

## One-liner (note: fit() is never called)
kmeans = KMeans(n_clusters=2)
cc = kmeans.cluster_centers_
print(cc)

# Traceback (most recent call last):
#   File "C:\Users\xcent\Desktop\code.py", line 13, in <module>
#     cc = kmeans.cluster_centers_
# AttributeError: 'KMeans' object has no attribute 'cluster_centers_'
```
You can see that without calling the fit() method, there is no cluster_centers_ attribute yet. Instead, it’s created dynamically when fit() is executed.
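If you’d rather test for a fitted model than run into the AttributeError, one option is sklearn’s `check_is_fitted` utility, which raises a `NotFittedError` when the requested attribute is missing. The following is only a sketch: the wrapper `is_fitted` is a hypothetical helper, and the `n_init` and `random_state` arguments are assumptions added to make the run reproducible.

```python
from sklearn.cluster import KMeans
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted
import numpy as np

X = np.array([[35, 7000], [45, 6900], [70, 7100],
              [20, 2000], [25, 2200], [15, 1800]])


def is_fitted(estimator, attribute):
    """Hypothetical helper: True if the fitted attribute exists on the estimator."""
    try:
        check_is_fitted(estimator, attribute)
    except NotFittedError:
        return False
    return True


kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
before = is_fitted(kmeans, "cluster_centers_")  # fit() not called yet
kmeans.fit(X)
after = is_fitted(kmeans, "cluster_centers_")   # attribute now exists
print(before, after)  # False True
```

A plain `hasattr(kmeans, "cluster_centers_")` would give the same answer here; the utility simply wraps that check in a descriptive exception.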
While working as a researcher in distributed systems, Dr. Christian Mayer found his love for teaching computer science students.
To help students reach higher levels of Python success, he founded the programming education website Finxter.com that has taught exponential skills to millions of coders worldwide. He’s the author of the best-selling programming books Python One-Liners (NoStarch 2020), The Art of Clean Code (NoStarch 2022), and The Book of Dash (NoStarch 2022). Chris also coauthored the Coffee Break Python series of self-published books. He’s a computer science enthusiast, freelancer, and owner of one of the top 10 largest Python blogs worldwide.
His passions are writing, reading, and coding. But his greatest passion is to serve aspiring coders through Finxter and help them to boost their skills.