

Advanced Tokenization, Stemming, and Lemmatization
Bag-of-Words with More Than One Word (n-Grams)
Example Application: Sentiment Analysis of Movie Reviews
Grid-Searching Preprocessing Steps and Model Parameters
Accessing Attributes in a Pipeline inside GridSearchCV
Convenient Pipeline Creation with make_pipeline
Using Evaluation Metrics in Model Selection
The Danger of Overfitting the Parameters and the Validation Set

Stratified k-Fold Cross-Validation and Other Strategies
Binning, Discretization, Linear Models, and Trees
Convenient ColumnTransformer creation with make_columntransformer

OneHotEncoder and ColumnTransformer: Categorical Variables with scikit-learn
Representing Data and Engineering Features
Comparing and Evaluating Clustering Algorithms
Dimensionality Reduction, Feature Extraction, and Manifold Learning
The Effect of Preprocessing on Supervised Learning

Scaling Training and Test Data the Same Way
Relation of Model Complexity to Dataset Size
Generalization, Overfitting, and Underfitting
Building Your First Model: k-Nearest Neighbors
Measuring Success: Training and Testing Data
A First Application: Classifying Iris Species

Suggestions for improving your machine learning and data science skills.
Methods for working with text data, including text-specific processing techniques.
The concept of pipelines for chaining models and encapsulating your workflow.
Advanced methods for model evaluation and parameter tuning.
How to represent data processed by machine learning, including which data aspects to focus on.
Advantages and shortcomings of widely used machine learning algorithms.
Fundamental concepts and applications of machine learning.

Familiarity with the NumPy and matplotlib libraries will help you get even more from this book. Authors Andreas Müller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. You’ll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. With all the data available today, machine learning applications are limited only by your imagination. If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. Machine learning has become an integral part of many commercial applications and research projects, but this field is not exclusive to large companies with extensive research teams.
