Semprini

Lies, damn lies and statistics

The forgotton question mark

Ya know, I love me a question mark. The elegant sickle like curve and dangling dot, those little paired words of how? and why? It's much nicer than a full stop (or period for my American colleagues). Why do architects seem to prefer the latter?

There is a Proof of Concept below the rant - I beg your indulgence dear reader. The PoC uses Kafka and python to transfer files based on the argument laid out below. The GitHub project is here: mft-kafka

There and back again - Analytics Data Pipeline

As an architect it's my job to talk and write a lot of bollocks. One must constantly be on the forefront of buzzword lore to maintain ones architectural standing and feeling of general superiority over lesser IT peons.

However, in this obviously correct and proper feudal type system, what sometimes gets lost or confused is the practicality of implementation and shared understanding. The same architectural buzzwords can be interpreted in so many ways so it becomes difficult for people implementing to realise the actual benefit. This means uninspired doers either take no notice or go off on a tangent which looses any architectural benefit intended.

This is why I'm not a fan of ivory tower architects and IT leaders who shoot Google arrows and Gartner bolts from the parapet, and I prefer to wield a bloody big sword in the melee of IT implementation. This blog is about a quick PoC I had a bash at for a pipeline which streams from a simulated on-prem source to a data lake-house.

PoC, PoC, PoC, Production

The hint is in the name people. Proof of Concept. Why is this so difficult to understand? The point of a PoC is to show that a thing can be done, not how a thing can be done. I.e. we don't know something but we have an idea - awesome, I'm here to help. Where I seem to part ways is when the PoC plans to deliver best practice architecture, continuous delivery and push the code & pipelines into production.

Data Convergence

There is so much scope for change in IT. It's quite a pure form of expression of ideas because its underlying logic gates have been virtualized and almost completely abstracted. There are no laws of physics that apply.

But Semprini, you wizened lothario of technology, why then is the pace of change at most companies ever slower?

Excellent question my intrepid reader, lets dive into this and hopefully even plausibly relate it to the topic of the blog - data convergence.

Tikanga Data

A language can say a lot about a culture. All words are made up, therefore the things and ideas that have evolved to have their own words can provide some insight to the nature of a society.

"Tikanga is an inner form of life that manifests itself in one’s conduct. Good intention is embodied in character traits so the philosophy is neither pragmatism nor materialism - it's the character of a person which is given a primary place in virtue ethics." - Piripi Whaanaga. Related to Tikanga is the concept of Manaaki - which is derived from the word ‘mana’ (prestige) and the word to encourage ‘aki.’ Thus an important component of restoring balance is encouraging or building up mana.

This seems like a good place to start for an ethics framework for data.