Polars: The Next Big Python Data Science Library... written in RUST?
166,497 views
Published 2022-12-29
Timeline:
00:00 Intro
01:00 What is Polars?
02:43 Getting Started
06:32 Filtering
07:15 New Columns
08:10 Groupby
08:55 Combining Dataframes
10:17 Multithreaded Approach
11:21 Speed Test
12:50 Takeaways
Follow me on Twitch for live coding streams: www.twitch.tv/medallionstallion_
My other videos:
Speed Up Your Pandas Code: • Make Your Pandas Code Lightning Fast
Intro to Pandas video: • A Gentle Introduction to Pandas Data ...
Exploratory Data Analysis Video: • Exploratory Data Analysis with Pandas...
Working with Audio data in Python: • Audio Data Processing in Python
Efficient Pandas Dataframes: • Speed Up Your Pandas Dataframes
* YouTube: youtube.com/@robmulla?sub_confirmation=1
* Discord: discord.gg/HZszek7DQc
* Twitch: www.twitch.tv/medallionstallion_
* Twitter: twitter.com/Rob_Mulla
* Kaggle: www.kaggle.com/robikscube
#python #polars #datascience
All Comments (21)
-
Polars is built on top of Apache Arrow, which pandas supports, so you can easily convert your Polars dataframe to pandas with almost zero overhead. I use Polars to do the hard part and jump back to pandas for the visualization stuff.
-
Our team tried to integrate Polars into our analytics pipeline last year, and the result was kinda hit-and-miss. To be honest, the performance of pandas is not that bad. We spent some time fine-tuning it, such as rewriting key bottlenecks with our native modules or with vectorized pandas methods, and the result turned out just OK. The Polars integration, on the other hand, required some major revamping and refactoring because of API gaps and implementation differences between the two, and the performance gains didn't seem to justify the effort. What's worse, while pandas does come with pitfalls and caveats here and there, Polars is a relatively young project, and it had bugs in basic text manipulation operations. But don't get me wrong, that was my experience last year. I do think Polars has potential. It has a much more robust and modern architecture than pandas, in my opinion. Its API style is cleaner and more consistent. And it comes with a query optimization engine, which you will appreciate if you are familiar with tools like Apache Spark or some databases. Given time, I think Polars should become another powerful player. So, definitely give it a try if you're building something new!
-
10000 points for printing the version. Every tutorial video should do that.
-
13:20 Regarding learning the syntax… It’s worth mentioning that Polars syntax is very similar to PySpark, so it’s really two birds with one stone.
-
Nice video. Very interesting to see how Polars works. I hope to see it more frequently in your future streams to learn more about its practical use.
-
Great timing, I was looking to start playing with Polars since Mark Tenenholtz mentioned it some days ago. I went back to pandas because I couldn't find the assign() and astype() equivalents in Polars. I thought they were lacking, but they seem to be with_columns() and cast(). Now I will resume more persistently.
-
This is fantastic. Thank you
-
I saw some tweets about Polars, but seeing it in action is something else. Also, I can't believe it took me this long to find your channel, subbed!
-
Thanks for the recommendation, I will definitely give it a try 😊
-
Nice stuff. Polars seems like a killer tool. Thank you for sharing.
-
As usual, Rob, nice video. I have learned a lot from you.
-
Perfect, thank you!
-
Thanks for a good explanation of how Polars could benefit people who use pandas and need more speed. In my project we already have a heavy emphasis on multiprocessing and fast inter-process communication, so I am especially interested to see a pandas vs. Polars single-core performance comparison for groupby and join. I hope that someone does the comparison and posts it to YouTube.
-
I'm blown away by how fast this is. Sure there are some things it can't do, but man, even for just reading large data sets it's absolutely blazing.
-
OMG.. thank you!!
-
Thanks for bringing this to my attention. I think I might include Polars in some productionization processes. For data exploration, I typically only use parts of dataframes for plotting or investigation. Given that you can convert a Polars dataframe to pandas, it seems like a good approach would be to keep the full dataset in Polars and then filter into a pandas dataframe and plot.
-
Great channel!! Thanks for sharing. I'll check it out for sure!
-
DataTable is also pretty legendary, you might also find it super awesome. Thanks again for your amazing videos, I have watched and learned from every one of them. I hope I'll interview you about your 100k celebration sometime next year 🙏
-
Hi @rub, I think it's a good approach to diversify our tools these days, especially when it comes to dealing with memory (sometimes I find myself running out of time with pandas).
-
Good stuff!!