Database scalability with a write-heavy application
There's no such thing as a Grand Unified Database Layout. If there are custom questionnaires, there really need to be custom tables. Otherwise you are on the quick path to a single-table-of-200-VARCHAR(128)-columns-with-no-primary-key monstrosity straight out of thedailywtf.com, which is inefficient, unsupportable, and will hurt you in the future.
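To make "custom tables" concrete, here is a hypothetical sketch (MySQL; all names and columns are made up for illustration): one purpose-built table per questionnaire, with real types and a real primary key, instead of a generic catch-all:

    -- Hypothetical table for one specific questionnaire:
    CREATE TABLE survey_2012_response (
        response_id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        customer_id     INT UNSIGNED NOT NULL,
        submitted_at    DATETIME     NOT NULL,
        overall_rating  TINYINT      NOT NULL,  -- e.g. 1..5
        would_recommend TINYINT(1)   NOT NULL,  -- boolean flag
        comments        TEXT,
        KEY (customer_id)
    );

Each questionnaire gets columns with proper types and constraints, so the optimizer and your indexes can actually do their jobs.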
Sharding, as recommended by toppledwagon, may be worth considering, but first double-check that your database is rationally designed. If it is not normalized, have a very good reason, preferably backed by testing, why it is not. If it has hundreds of tables, it's probably wrong; if it has a single table, it is definitely wrong. Look at the ways you can divide your problem into independent sets. You will spend more effort up front, but the system will be better for it.
A million rows at, let's say, 2 kB of data per row (which seems like a lot of characters for a survey) is about 2 GB of memory. If you can throw a bit more hardware at the problem, maybe you'll be able to keep your entire data set in RAM?
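If you're on MySQL/InnoDB, you can check your actual footprint before buying anything. A quick sanity check (the schema name is a placeholder), comparing the total against your innodb_buffer_pool_size:

    -- Approximate on-disk size of data + indexes per table:
    SELECT table_name,
           ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb
    FROM information_schema.tables
    WHERE table_schema = 'your_database'  -- placeholder: your schema name
    ORDER BY size_gb DESC;

If the total fits comfortably under the buffer pool size, reads will largely be served from memory and your I/O problem shrinks to the writes.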
Which leads to the next question: what's your load in absolute numbers? Customer requests per second, translated into I/Os per second, divided into reads and writes per second; how many gigabytes of data, with what growth rate? How does your load scale with the number of requests? Linearly? Exponentially? You don't have to publish your data, just write it down and think about it: what is it today, and how do you think it is going to look in a year or two?
Wikipedia says a 15k rpm SAS drive will give you 175-210 IOPS. How many do you need in RAID 10 to satisfy your current and projected load? How big is your data set? How many drives do you need to fit the data (probably a lot fewer than to meet the IOPS requirement)? Would buying a pair (or a dozen) of SSDs be justifiable? Is local storage going to be just fine, or are you going to saturate two 8 Gb fibre links to a high-end storage subsystem?
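As a back-of-the-envelope check (the 70/30 read/write split and the ~180 IOPS per drive below are assumptions; plug in your own measurements): RAID 10 costs two physical writes per logical write, so the drive count works out roughly as:

    -- Illustrative: 1000 IOPS load, 70% reads, 30% writes,
    -- RAID 10 write penalty of 2, ~180 IOPS per 15k SAS drive:
    SELECT CEIL((1000 * 0.7 + 1000 * 0.3 * 2) / 180) AS drives_needed;  -- 8

Round up to an even number of drives, since RAID 10 mirrors in pairs.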
If you currently need 1k IOPS but have three 10k rpm HDDs in RAID 5, there's no way your hardware will be able to satisfy your requirements. OTOH, if your app handles one user request per second and still brings a 32-core, 256 GB RAM beast backed by enterprise-class storage to its knees, then chances are the problem lies not within hardware capabilities.
master-master setup, but this typically hits a snag with auto-incremented primary keys
No. You just set auto_increment_increment and auto_increment_offset (the actual MySQL variable names) to avoid collisions.
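For example, the standard MySQL two-master configuration (shown at runtime here; you would normally also put these in my.cnf so they survive a restart):

    -- On master A:
    SET GLOBAL auto_increment_increment = 2;  -- step by the number of masters
    SET GLOBAL auto_increment_offset    = 1;  -- A generates 1, 3, 5, ...

    -- On master B:
    SET GLOBAL auto_increment_increment = 2;
    SET GLOBAL auto_increment_offset    = 2;  -- B generates 2, 4, 6, ...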
The solution is typically to have one server do odd numbers, and the other do evens. I want to avoid that.
Why? Surrogate keys, by their very nature, are unrelated to the data they index. Assigning meaning to such values is very dangerous.
A quick look at the Tungsten link you provided does not reveal much about what it does, but it does contain a number of inaccuracies (e.g. "you can do multiple masters replication, which is more than what you can do with MySQL native replication"). In the same paragraph it says that it can't handle conflicts. That does not fill me with confidence about the usefulness of this product.
Assuming that master-master replication (with or without federation to limit replication traffic) does not meet your requirements (though you should re-examine your thinking about auto-increment field types), you could shard the data between native clusters using MySQL Proxy, or use a NoSQL database.
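Whatever tool implements the sharding layer, its core job is mapping a sharding key to a shard before the query is routed. A hypothetical sketch of a directory-based approach (all names here are made up; a hash or modulo scheme works too):

    -- Shard directory, consulted by the application or proxy layer:
    CREATE TABLE shard_map (
        customer_id_from INT UNSIGNED NOT NULL,
        customer_id_to   INT UNSIGNED NOT NULL,
        shard_host       VARCHAR(64)  NOT NULL,
        PRIMARY KEY (customer_id_from)
    );

    -- Which shard owns customer 12345?
    SELECT shard_host
    FROM shard_map
    WHERE 12345 BETWEEN customer_id_from AND customer_id_to;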