Wes McKinney

Python for Data Analysis, 3rd Edition

Joining Posit’s Polyglot Data Science Mission

work
open source
posit
TL;DR I am joining Posit today as a Principal Architect where I will advocate for the needs of the PyData ecosystem in Posit’s work as well as continue advancing critical…
Nov 6, 2023
Wes McKinney

Voltron Data Update: Transitions

startups
work
TL;DR I am transitioning out of my full-time CTO role at Voltron Data so that I can expand my portfolio of entrepreneurial and open source data projects. While no longer…
Oct 23, 2023
Wes McKinney

The Road to Composable Data Systems: Thoughts on the Last 15 Years and the Future

retrospective
thoughts
A new joint VLDB paper on Composable Data Management Systems with Meta, Databricks, Sundeck, and others is out! This post is a reflection on how I arrived at thinking…
Sep 1, 2023
Wes McKinney

Joining Forces for an Arrow-Native Future

work
apache arrow
Too often people say “let’s do something together” in passing, and don’t. There’s the occasional inter-project collaboration, but rarely will people take that next step.…
Aug 5, 2021
Wes McKinney

Ursa Labs March 2019 Report

ursa labs
work
The first quarter of 2019 has now wrapped up. In March we spent a good amount of time focused on getting the 0.13.0 Apache Arrow release out of the door. I will mention a…
Apr 4, 2019
Wes McKinney

Ursa Labs February 2019 Report

ursa labs
work
The team had a busy 28 days this February. The Apache Arrow community is discussing a 0.13 release toward the end of March, so we spent February helping the project toward…
Mar 6, 2019
Wes McKinney

Ursa Labs January 2019 Report

ursa labs
work
Ursa Labs had a busy January that went by too quickly. After a high-intensity 3 months of development, we helped release Apache Arrow 0.12 on January 20th. A good chunk of…
Feb 5, 2019
Wes McKinney

Leaving NYC for Nashville

For ten out of the last eleven years, I’ve lived in two places: New York City and San Francisco. The last two years have been in NYC. After founding Ursa Labs, a…
Dec 3, 2018
Wes McKinney

Announcing Ursa Labs’s partnership with NVIDIA

ursa labs
work
I’m excited to announce that NVIDIA AI Labs has signed on as a supporter of Ursa Labs. NVIDIA’s new open source RAPIDS data science platform uses Apache Arrow for an…
Oct 10, 2018
Wes McKinney

Announcing Ursa Labs: an innovation lab for open source data science

work
ursa labs
apache arrow
Funding open source software development is a complicated subject. I’m excited to announce that I’ve founded Ursa Labs (https://ursalabs.org), an independent development lab…
Apr 19, 2018
Wes McKinney

Some comments to Daniel Abadi’s blog about Apache Arrow

apache arrow
databases
Well-known database systems researcher Daniel Abadi published a blog post yesterday asking Apache Arrow vs. Parquet and ORC: Do we really need a third Apache project for…
Nov 1, 2017
Wes McKinney

Feather format update: Whence and Whither?

apache arrow
Earlier this year, development for the Feather file format moved to the Apache Arrow codebase. I will explain how this has already affected Feather and what to expect from…
Oct 16, 2017
Wes McKinney

Apache Arrow and the “10 Things I Hate About pandas”

pandas
apache arrow
This post is the first of many to come on Apache Arrow, pandas, pandas2, and the general trajectory of my work in recent times and into the foreseeable future. This is a bit…
Sep 21, 2017
Wes McKinney

Extreme IO performance with parallel Apache Parquet in Python

parquet
apache arrow
In this post, I show how Parquet can encode very large datasets in a small file footprint, and how we can achieve data throughput significantly exceeding disk IO bandwidth…
Feb 10, 2017
Wes McKinney

Streaming Columnar Data with Apache Arrow

apache arrow
Over the past couple weeks, Nong Li and I added a streaming binary format to Apache Arrow, accompanying the existing random access / IPC file format. We have implementations…
Jan 27, 2017
Wes McKinney

2017 Outlook: pandas, Arrow, Feather, Parquet, Spark, Ibis

apache arrow
pandas
ibis
parquet
work
2017 is shaping up to be an exciting year in Python data development. In this post I’ll give you a flavor of what to expect from my end. In follow-up blog posts, I plan to…
Dec 27, 2016
Wes McKinney

From Arrow to pandas at 10 Gigabytes Per Second

apache arrow
pandas
In this post I discuss some recent work in Apache Arrow to accelerate converting to pandas objects from general Arrow columnar memory.
Dec 27, 2016

Kinesis Advantage2: Impressions

work
gear

I discuss my impressions of the newest version of the classic Kinesis Advantage contoured mechanical keyboard.

Dec 4, 2016
Wes McKinney

GitHub’s one-dimensional view of open source contributions

rants
open source
TL;DR One of the most harmful parts of the GitHub platform is the code contribution calendar. This “hacker score card” overemphasizes the value of commits over the other…
Nov 6, 2016

Feather: it’s about metadata

apache arrow
Summary: Feather’s good performance is a side effect of its design, but the primary goal of the project is to have a common memory layout (Apache Arrow) and metadata (type…
Apr 26, 2016

Why pandas users should be excited about Apache Arrow

apache arrow
pandas
I’m super excited to be involved in the new open source Apache Arrow community initiative. For Python (and R, too!), it will help enable
Feb 22, 2016
Wes McKinney

The problem with the data science language wars

rants
data science
python
r
I really enjoyed the cheeky blog post by my pal Rob Story.
Nov 2, 2015
Wes McKinney

What’s changed

personal

Some reflections from turning 30.

Mar 23, 2015
Wes McKinney

Thoughts on joining Cloudera

work
After some unanticipated media leaks (here and here), I was very excited to finally share that my team and I are joining Cloudera. You can find out all the concrete details…
Oct 6, 2014
Wes McKinney

Introducing vbench, new code performance analysis and monitoring tool

python
benchmarks
Do you know how fast your code is? Is it faster than it was last week? Or a month ago? How do you know if you accidentally made a function slower by changes elsewhere?…
Dec 18, 2011
Wes McKinney

A Roadmap for Rich Scientific Data Structures in Python

rants
python
pandas
Discussion thread on Hacker News
Jul 21, 2011
Wes McKinney

    © Copyright 2025 Wes McKinney. Except where otherwise noted, all rights reserved. The views and opinions on this website are my own and do not represent my current or former employers.