The Basics of Machine Learning with scikit by Iwona Popek || Women in Technology Poland


Machine learning lets computers learn patterns from data and use them to make predictions or decisions. Its applications range from self-driving cars to personalized advertising. In this tutorial, we cover the basics of machine learning using the scikit-learn library with Iwona Popek, a prominent figure in Polish technology and a member of the Women in Technology Poland community.

Iwona Popek is a data scientist and machine learning expert who has been working in the field for many years. She is passionate about promoting diversity and inclusion in the tech industry, particularly through her work with Women in Technology Poland.

To get started with machine learning using scikit-learn, you need Python installed on your computer. You can download it from the official website (https://www.python.org/downloads/). A recent Python 3 release is the safest choice, since current scikit-learn versions no longer support older interpreters.

Once you have Python installed, you can install scikit-learn by using the following command:

pip install scikit-learn
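
To confirm that the installation worked, a quick check along these lines can help. This is only a minimal sketch: it assumes the package imports under the name sklearn (as in the rest of this tutorial), and the variable name installed_version is our own.

```python
import sklearn

# Read the installed scikit-learn version to confirm the import works
installed_version = sklearn.__version__
print("scikit-learn version:", installed_version)
```

If the import fails, the package was most likely installed into a different Python environment than the one running the script.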

Scikit-learn is a popular Python machine learning library that provides tools for tasks such as classification, regression, clustering, and preprocessing. It is built on NumPy and SciPy, and it works well alongside Matplotlib for plotting, which makes it a versatile choice for machine learning tasks.

Now that you have scikit-learn installed, you can start by loading a dataset. Scikit-learn provides a number of built-in datasets that you can use for practice. For this tutorial, we will be using the Iris dataset, which is a classic dataset in machine learning.

To load the Iris dataset, you can use the following code snippet:

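from sklearn.datasets import load_iris

iris = load_iris()  # Bunch object holding the data and its metadata
X = iris.data       # feature matrix, shape (150, 4)
y = iris.target     # integer class labels, shape (150,)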

In this code snippet, we import the load_iris function from the sklearn.datasets module and load the Iris dataset into the variables X and y. X contains the features (four measurements for each of the 150 flowers), while y contains the target: an integer class label for each flower's species.
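
To see what X and y actually hold, it can help to print their shapes and the names stored alongside the data. The sketch below is an illustration, not part of the original session; it uses the feature_names and target_names attributes of the object returned by load_iris.

```python
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data    # feature matrix: one row per flower
y = iris.target  # integer class label per flower

print(X.shape)             # (150, 4)
print(y.shape)             # (150,)
print(iris.feature_names)  # names of the four measurements
print(iris.target_names)   # names of the three species
```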

Next, we will split the dataset into training and testing sets using the train_test_split function from scikit-learn:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In this code snippet, we are importing the train_test_split function from the sklearn.model_selection module, and then splitting the dataset into training and testing sets. We are splitting the dataset such that 80% of the data is used for training and 20% is used for testing, and setting the random_state parameter to 42 to ensure reproducibility.
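
One optional variation, shown here as a sketch rather than as the tutorial's own method, is to pass stratify=y so that each species appears in the test set in the same proportion as in the full dataset. The stratify argument is our addition; everything else matches the split above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same 80/20 split as in the tutorial, plus stratify=y (our addition)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
print(np.bincount(y_test))          # [10 10 10]
```

Without stratification the class counts in a small test set can be uneven, which makes the accuracy figure a little noisier.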

Now that we have split the dataset, we can choose a machine learning algorithm to train on the data. For this tutorial, we will be using a simple logistic regression model:

from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(X_train, y_train)

In this code snippet, we are importing the LogisticRegression class from the sklearn.linear_model module, and then creating an instance of the model. We then use the fit method to train the model on the training data.
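
Logistic regression is fitted iteratively, and it often behaves better when the features are on a similar scale. The sketch below is one possible variation under our own assumptions, not the approach used in the original session: it standardizes the features with StandardScaler inside a pipeline and raises max_iter to 200. The names scaled_model and probabilities are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize the features, then fit logistic regression on the scaled values
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
scaled_model.fit(X_train, y_train)

# Class probabilities for the first test sample (one value per species)
probabilities = scaled_model.predict_proba(X_test[:1])
print(probabilities.round(3))
```

Because the scaler lives inside the pipeline, it is fitted on the training data only and then reused unchanged when predicting on the test data.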

Once the model has been trained, we can make predictions on the testing data and evaluate the performance of the model:

from sklearn.metrics import accuracy_score

y_pred = model.predict(X_test)  # predicted class labels for the test set
accuracy = accuracy_score(y_test, y_pred)  # fraction of correct predictions
print("Accuracy:", accuracy)

In this code snippet, we are using the predict method to make predictions on the testing data, and then using the accuracy_score function from the sklearn.metrics module to calculate the accuracy of the model. Finally, we print out the accuracy of the model on the testing data.
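
Accuracy is a single number, so it does not show which species get confused with one another. As a hedged illustration, the self-contained sketch below repeats the steps above and adds a confusion matrix via confusion_matrix from sklearn.metrics. The max_iter=200 setting and the labels argument are our additions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Rows are true classes, columns are predicted classes
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])

print("Accuracy:", accuracy)
print(matrix)
```

The diagonal of the matrix counts correct predictions, so its sum divided by the total number of test samples equals the accuracy.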

This is just a basic introduction to machine learning using scikit-learn with Iwona Popek. There is a lot more to learn in the field of machine learning, and scikit-learn provides a wide range of tools and algorithms to help you get started. Iwona Popek’s work with Women in Technology Poland is a testament to the importance of diversity and inclusion in the tech industry, and she serves as an inspiration to women who are interested in pursuing a career in technology.

I hope this tutorial has been helpful in getting you started with machine learning using scikit-learn. Remember to practice and explore different algorithms and datasets to develop your skills in machine learning. Thank you for reading, and good luck on your machine learning journey with Iwona Popek and Women in Technology Poland!

4 Comments
@danielgad8924
4 hours ago

If I were a woman, I would not take part in such a sexist training course. I understand that women are treated like an endangered species that needs special support… but is that really so? Maybe this is just about grant money?

@krzysztoflesniak2674
4 hours ago

I'll be the typical annoying nitpicker, but I have come across comments from statisticians saying that the difference in meaning between the terms data standardization and data normalization is, nomen omen, significant. They were very hard on the interchangeable use of these terms in everyday language.

Meanwhile others, whom I would call internet linguists and self-appointed arbiters of good literary taste, would gladly pick on the length and complexity of the sentences in this comment and on its word order. In the view of many, one should write "They were very hard on the interchangeable use of these terms in everyday language" in the more conventional Polish word order, since that is clearer for the poor reader, even if the other variant is correct and is meant to place the emphasis differently. But enough of this "miodo-bralczenie" (a play on the names of the Polish linguists Miodek and Bralczyk)…

Whatever one says or does, somehow there is always someone with sand to throw into the spokes of the bicycle.
Have a cheerful day!

@michasekua4642
4 hours ago

Great material. However, as of today the last cell of your code will not work, because the order of the required arguments of the scatterplot function is different now. The labels "PM10" and "O3" (which, by the way, looks like 03 in the video, so I wondered how your code could work :D) must be declared as x="PM10" and y="O3", or be written AFTER "data", not before it as in the video 🙂

@johntree4949
4 hours ago

Thanks 😛 I'm checking whether kmeans is suitable for building price "patterns" for the crypto market. I was missing this, because I'm a newbie when it comes to LM.
