We have a concept of a 'key activation' in Eigenland, and I think that would be the best thing to use. It's binary on/off (though I think it also has a soft/hard activation, with those thresholds set in the instrument agent at the moment - perhaps Jim or Geert could enlighten me on this). The logical way for a momentary switch Agent to work would be to take an incoming keygroup and use its key activations as binary switches. I think the detection of thresholds and generation of activations would properly be done in a separate Agent - it's one of those little 'mathematical' style agents we talked about at the Devcon. One could imagine a variety of these, and one of them could indeed produce a kind of ladder of keygroup key activations in the style you mentioned. I do think that keeping that separate from the switching would make it more flexible. I don't think you need that to do this right now, though, as key activation thresholds are already generated elsewhere.
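To make the idea of a separate 'mathematical' threshold agent a bit more concrete, here is a rough Python sketch. It is not EigenD code and uses no real agent API: the function names, the normalised 0..1 pressure value and the threshold numbers are all assumptions made up for illustration.

```python
from typing import List, Sequence


def key_activation(pressure: float, threshold: float = 0.1) -> bool:
    """Binary activation: on once pressure reaches the threshold.

    The 0.1 default is an arbitrary, assumed value.
    """
    return pressure >= threshold


def activation_ladder(pressure: float, thresholds: Sequence[float]) -> List[bool]:
    """One activation per rung of the ladder.

    A rung is on when the pressure has reached that rung's threshold,
    so pressing harder switches on progressively more rungs.
    """
    return [pressure >= t for t in thresholds]
```

Something like `activation_ladder(0.5, [0.1, 0.4, 0.8])` would switch on the first two rungs and leave the third off. The switching agent would then only ever see the resulting on/off values and never the raw pressure.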
Several nice possible configurable behaviours spring to mind. The immediately useful one is a momentary switch with just one active output - i.e. activate the key for that output and the signal goes there, with 'highest key wins' or some such when more than one key is held. In this mode we could have a straight-through output that is active when no other output is selected, which gives you the toggle switching behaviour you need now as the case when key one of the incoming switch control keygroup is pressed.
The second mode could be multiple momentary switching, i.e. the signal goes to every output whose key is held down. The third is a toggle for each output: press on, press off.
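The three modes can be sketched in a few lines of Python. This is only an illustration of the routing logic under my own assumptions: the names are hypothetical, keys are plain booleans (already-thresholded activations), and output 0 is taken to be the straight-through output in the first mode.

```python
from typing import List, Sequence


def momentary_single(keys: Sequence[bool]) -> List[bool]:
    """Mode 1: one active output, highest held key wins.

    Returns len(keys) + 1 gates. Gate 0 is the straight-through output,
    active only when no key is held; gate i + 1 belongs to key i.
    """
    gates = [False] * (len(keys) + 1)
    held = [i for i, down in enumerate(keys) if down]
    winner = held[-1] + 1 if held else 0
    gates[winner] = True
    return gates


def momentary_multi(keys: Sequence[bool]) -> List[bool]:
    """Mode 2: the signal goes to every output whose key is held."""
    return [bool(down) for down in keys]


class ToggleSwitch:
    """Mode 3: each key press flips its output (press on, press off)."""

    def __init__(self, n_keys: int) -> None:
        self.state = [False] * n_keys
        self._prev = [False] * n_keys

    def update(self, keys: Sequence[bool]) -> List[bool]:
        for i, down in enumerate(keys):
            # Flip only on the rising edge, so holding a key does nothing more.
            if down and not self._prev[i]:
                self.state[i] = not self.state[i]
            self._prev[i] = bool(down)
        return list(self.state)
```

With a single key, `momentary_single` gives the simple case needed now: key up routes to straight-through, key down routes to output one.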
I could see a lot of neat setup options being enabled by an agent like this. It could start very simply with just one in and two outs and get more functional as time went on without breaking backwards compatibility, which is always a concern with these things.
Of course the most awkward thing about this is dealing with audio, which can't just be switched or you'll get clicks. A fast crossfade is usually the best approach, followed by zero-crossing switching. The latter is usually harder than crossfading, as finding a zero crossing is not as straightforward as it sounds: we're in a sampled world, so the actual crossings generally fall between samples rather than on zero-valued data. Either way, there's a bit of signal processing involved.
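For what it's worth, here is a minimal Python sketch of both ideas, working on plain lists of samples. The function names and the linear ramp are my own assumptions (an equal-power curve may well sound better), and a real agent would process audio in blocks rather than whole lists.

```python
from typing import List, Optional, Sequence, Tuple


def crossfade_switch(
    signal: Sequence[float], switch_at: int, fade_len: int
) -> Tuple[List[float], List[float]]:
    """Move `signal` from output A to output B with a linear crossfade.

    The fade starts at sample `switch_at` and lasts `fade_len` samples.
    With fade_len == 0 this degenerates into a hard (clicky) switch.
    """
    out_a: List[float] = []
    out_b: List[float] = []
    for n, x in enumerate(signal):
        if n < switch_at:
            gain_b = 0.0
        elif n >= switch_at + fade_len:
            gain_b = 1.0
        else:
            # Ramp strictly between 0 and 1 across the fade region.
            gain_b = (n - switch_at + 1) / (fade_len + 1)
        out_a.append(x * (1.0 - gain_b))
        out_b.append(x * gain_b)
    return out_a, out_b


def first_zero_crossing(
    signal: Sequence[float], start: int = 0
) -> Optional[Tuple[int, float]]:
    """Find the first sign change at or after `start`.

    Returns (n, frac): the crossing lies `frac` of the way from sample n
    to sample n + 1, estimated by linear interpolation. It rarely lands
    exactly on a sample, which is what makes this approach awkward.
    """
    for n in range(start, len(signal) - 1):
        a, b = signal[n], signal[n + 1]
        if a == 0.0:
            return n, 0.0
        if (a < 0.0) != (b < 0.0):
            return n, a / (a - b)
    return None
```

The two gains in the crossfade always sum to one, so a steady signal keeps its overall level while it moves between outputs.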
I wouldn't bother with any of this right now though. I'm so keen to find out whether the idea functions at all that I'm champing at the bit to see it work - the convenience of being able to switch the mic in and out of it is icing on the cake in my book! If you just have a keygroup input and take one key press (using its activation signal) as a 'listen' button, I think we can arrange all the nice stuff later.
John