I have long been a fan of Peter Deutsch’s fallacies of network/distributed computing (and I’m not alone: a Google search this morning produced over 22k references). They have served as a set of guiding checkpoints for every distributed system that I have built. What I have found to be missing, however, is a similar set of fallacies/truisms for managing information as we approach “internet scale” information infrastructure… the information explosion.
Truisms of, and principles for managing, the information explosion:
- no one person/system is capable of managing all data
- optimizations will be continually applied, but by different vendors, thereby requiring an enterprise to distribute its information architecture
- information processing is inherently a pipelined process (though fork/join supports parallelism for reduction of latency)
- these pipelines can have “in parallel” replicas so long as sufficient locking is engineered and compensation models are supported
- locking for a given pipeline should be owned discretely by a single application context (workflow); though this workflow may be complex, it is stateless once it reaches its end state
- loose coupling / JIT integration requires coherent, federable data dictionaries and meta-data/structure maps
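To make the pipeline-with-fork/join truism concrete, here is a minimal Python sketch. The stage names (`parse`, `enrich_length`, `enrich_words`) and the record shape are my own hypothetical stand-ins, not taken from any particular product; the point is only that the stages run in order while independent branches fork and then join to cut latency.

```python
from concurrent.futures import ThreadPoolExecutor


def parse(record):
    # Stage 1 (sequential): normalise the raw input.
    return record.strip().lower()


def enrich_length(value):
    # Branch A: an independent enrichment step (hypothetical).
    return {"length": len(value)}


def enrich_words(value):
    # Branch B: another independent enrichment step (hypothetical).
    return {"words": len(value.split())}


def run_pipeline(record):
    value = parse(record)
    merged = {"value": value}
    # Fork: the two enrichment branches do not depend on each other,
    # so they can run in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(step, value)
                   for step in (enrich_length, enrich_words)]
        # Join: wait for every branch, then merge into one result.
        for future in futures:
            merged.update(future.result())
    return merged


if __name__ == "__main__":
    print(run_pipeline("  Hello World "))
```

The pipeline as a whole is still sequential (parse, then enrich, then merge); only the middle stage is widened.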
Translated into fallacies… which, agreeing with SGG, I think are way more powerful, and in some cases hilarious:
- there is one enterprise data architect who is responsible for the master models
- there is a system that is the authoritative master for a given entity domain
- there is one vendor involved across the SOA and EIM domain
- the data models are largely fixed, and the business will not ask for further changes/enhancements to the model
- data exchange will be based upon XA/2-phase transactional mechanisms to achieve ACID properties (pessimistic transactionality)
- there will be a singular data dictionary, with complete meta-data for a given entity domain
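As a counterpoint to the XA/2-phase fallacy, the compensation model mentioned in the truisms can be sketched roughly as follows. This is an illustration under my own assumptions, not a definitive implementation: `run_with_compensation` and the step names in `demo` are hypothetical, and a real workflow would also have to cope with compensations that themselves fail.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, the compensations of the steps that already
    completed are run in reverse order and False is returned.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # No global lock and no two-phase commit: we simply undo
            # what was already done, newest first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


def demo():
    log = []

    def fail():
        # Hypothetical failure in a third system.
        raise RuntimeError("billing system unavailable")

    steps = [
        (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
        (lambda: log.append("create order"), lambda: log.append("cancel order")),
        (fail, lambda: log.append("refund")),
    ]
    ok = run_with_compensation(steps)
    return ok, log


if __name__ == "__main__":
    print(demo())
```

Each system commits locally and exposes an undo, so the owning workflow keeps the state only until it reaches its end state, whether that is success or a fully compensated failure.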
Additions/Subtractions/debate most wanted!