Friday, September 14, 2007

Partitioning in MySQL 5.1, Part 1

I was browsing around the MySQL web site tonight and ran across some free webinars (recordings). Seeing as how I once did a podcast on Oracle partitioning, one webinar that jumped out at me was Partitioning in MySQL 5.1 and onwards.

I didn't even know MySQL did partitioning. Here is a description of the webinar:

In this webinar we will provide further insight into MySQL partitioning, including:

  • Introduction to MySQL Partitioning
  • Linear Key Partitioning
  • Partition Function
  • Partition Options
  • Information Schema for Partitions
  • Partition Management
  • Partition Pruning
  • Partitioning and NULL Values
  • Partitioning Implementation
  • Partitioning for MySQL Cluster
  • Future Roadmap for Partitioning
  • Partitioning Limitations in 5.1

Sounds good. It's a good presentation but kind of slow. If you're like me and would prefer to move at your own speed, you can download the presentation itself.

For those of you even more impatient than me, I will point out a few highlights. The beginning of the presentation was about what partitioning is and why use it. I'll skip over those parts. If you are interested in those details, post me a note and I will do a blog entry on that topic.

partitioning type One of the more important pieces of information is the types of partitioning supported.

Oracle supports all of these types of partitioning but some other very good databases do not. For example, PostgreSQL only supports range partitioning (and list becuase of that) but you would have to manually implement a hash partition and any composites are out of scope for now.

I am fairly impressed with this support. You can also mix and match engines (in future versions). That's pretty exciting.

A quick description of the types for those not familiar:

  • Range Partition - Things like by year, by month, etc. Used very frequently in data warehousing. You might have a partition for each year. You can drop partitions (old years) and add new parititions (for future years). Range partitions tend to be dynamic (you'll be adding and dropping partitions over time).
  • List Partition - A known list of partitioning values. You may partition by state, country, region etc. Each value gets its own partition. You can add partitions but a list tends to be fairly static.
  • Hash Partition - A hashing algorithm creates a numeric value. You'll use this on large data sets that don't have a logical partitioning key. The idea is to give you the ability to partition on numeric data like a primary key.
  • Composite Partition - A combination of the three partitioning types above.

partition function The partition function is the function that determines which partition a particular row will reside in. The current implementation has some serious limitations but those can be worked around. Basically you write a function, say by comparing dates, and that determines which partitions will hold your record. I'm sure that I could live with two of the current limitations: always an integer value (how do you get that to work with dates?) and the fact that you can only write the function on your primary key.

I'll finish up my coverage of this presentation in part 2. All in all, I am quite impressed with the partitioning function in MySQL 5.1.


1 comment:

Robert said...

Just use unixtime to get a partitionable integer value for dates (if greater than 1970)