Working with data is not so much difficult as it is tricky, in part because some data isn’t found in basic tables at all: it lives in joined structures, or is made up of derived, calculated or converted values.
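To make that concrete, here is a minimal sketch using Python’s sqlite3 module and two hypothetical tables (the table and column names are illustrative, not from any specific system). The reported value exists in no base table; it only appears once the tables are joined and a conversion is applied.

```python
import sqlite3

# In-memory database with two hypothetical base tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, currency_rate REAL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                         qty INTEGER, unit_price REAL);
    INSERT INTO customers VALUES (1, 'Acme', 1.08);
    INSERT INTO orders VALUES (100, 1, 5, 19.99);
""")

# 'order_value' is stored nowhere: it is derived by joining the two
# tables and converting with the customer's currency rate.
row = conn.execute("""
    SELECT o.id,
           c.name,
           ROUND(o.qty * o.unit_price * c.currency_rate, 2) AS order_value
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
""").fetchone()
print(row)  # (100, 'Acme', 107.95)
```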
A second aspect is that data is often grounded in legacy structures, some of which are all but impenetrable to the standard and third-party tools on the market.
Working with legacy systems, for example, is not widely supported by more modern applications. A third aspect of working with data is that in some instances, particularly in engineering, plant maintenance and project systems, the identifiers used to uniquely identify records are not easily associated with the friendlier, human-readable identifiers that some applications allow users to use as aliases.
Then there is the constant challenge of data volume. Some data sets run to billions of rows, and on traditional relational databases any kind of sorted, filtered or conditional selection is not only time consuming but resource intensive.
Many a seasoned database administrator can recall instances where poorly designed programs or queries ran away and ruined system performance for regular users. Third-party products have capabilities that help to avoid some of this pain, and NoSQL databases help too, but the latter often means you have duplicated or replicated the data to yet another system.
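To show what “poor design” can mean at the query level, here is a small sketch with Python’s sqlite3 and a hypothetical transactions table. It contrasts a filter that wraps the indexed column in a function, forcing a scan of every row, with the same filter expressed as a range that can use the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER PRIMARY KEY, posted_on TEXT)")
conn.execute("CREATE INDEX ix_posted ON transactions (posted_on)")

# Wrapping the indexed column in a function defeats the index -- harmless
# here, painful on billions of rows.
slow = "SELECT COUNT(*) FROM transactions WHERE substr(posted_on, 1, 4) = '2024'"

# The same filter expressed as a range lets the engine seek into the index.
fast = ("SELECT COUNT(*) FROM transactions "
        "WHERE posted_on >= '2024-01-01' AND posted_on < '2025-01-01'")

for label, sql in (("slow", slow), ("fast", fast)):
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    print(label, plan)  # 'slow' plans a full scan, 'fast' an index search
```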
The ability to easily use transactional data for analysis and reporting is undoubtedly one of the greatest ongoing challenges for the business. With IT teams continuously shrinking or being outsourced, the business is increasingly turning to self-service tools to close the gaps left by inadequate standard reports and IT-delivered reporting solutions.
Waiting months for IT-delivered reporting solutions
Companies I have spoken to suggest that although they lean heavily on IT tools and teams for business-critical reporting, the time it takes for new reporting structures to be formulated and delivered is far too protracted for the business to be agile: seizing advantages, or opportunistically pursuing a line of data inquiry to pivot decision making.
Delivery of new reporting cubes and structures can take months, or never happen at all, because of constantly changing business and IT priorities. While embedded analytics and business-user-enabling tools hold increasing promise for agile, accelerated reporting, this still leaves a great many businesses with limited choices.
Ironically, one of the great fallback scenarios is reporting and analysis out of off-the-shelf products like Microsoft Excel. When the power of Excel is combined with direct data access, small and large business units alike can generate elegant, relatively fresh data sets in digestible reports.
This output can be informational or actionable, depending on its structure, design and purpose, and when it is further combined with the concept of ‘round trip’ data extraction, transformation and load, a great many positive outcomes can be derived from working with the transactional data.
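A minimal sketch of that round trip, assuming pandas and SQLAlchemy, with placeholder connection, table and column names: extract a slice of transactional data, derive a value that exists in no base table, and land a static snapshot in a workbook. Re-running the script is the refresh.

```python
import pandas as pd
import sqlalchemy as sa

# Placeholder connection string -- substitute your own DSN.
engine = sa.create_engine("mssql+pyodbc://reporting_dsn")

# Extract: pull only the slice the report needs, not the whole table.
query = sa.text(
    "SELECT order_id, qty, unit_price FROM sales_orders "
    "WHERE order_date >= :start"
)
df = pd.read_sql(query, engine, params={"start": "2024-01-01"})

# Transform: derive a column that is stored nowhere in the source system.
df["order_value"] = df["qty"] * df["unit_price"]

# Load: a digestible, static snapshot for Excel users (requires openpyxl).
df.to_excel("sales_snapshot.xlsx", sheet_name="orders", index=False)
```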
A million rows isn’t enough?
Excel has its limitations, of course: earlier versions were constrained to 65,536 rows, while more contemporary versions support up to 1,048,576 rows per worksheet.
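When an extract does exceed the per-sheet limit, one simple workaround is to split it across sheets. A small sketch with pandas (the DataFrame here is purely illustrative):

```python
import pandas as pd

EXCEL_MAX_ROWS = 1_048_576  # hard per-sheet limit in current Excel versions

# Illustrative frame larger than a single sheet can hold.
df = pd.DataFrame({"value": range(2_500_000)})

with pd.ExcelWriter("big_extract.xlsx") as writer:
    chunk = EXCEL_MAX_ROWS - 1  # reserve one row per sheet for the header
    for i, start in enumerate(range(0, len(df), chunk)):
        df.iloc[start:start + chunk].to_excel(
            writer, sheet_name=f"part_{i + 1}", index=False
        )
```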
If you’re working with larger data sets, perhaps Excel isn’t really the right staging environment, but there are other choices in the Microsoft stable.
Access, for example, supports databases up to 2 GB, while SQL Server and an array of cloud options like Snowflake are popular staging environments for data-mart-style reporting. Most importantly, don’t assume that real-time analysis with instantaneously refreshed data is the best approach.
A great deal of data doesn’t change significantly within a window of a couple of hours, and if it does, then perhaps the analytics tools on offer should be replaced with business process monitoring tools. Static data analyzed in Excel, easily refreshed with on-demand queries, will cover probably 80–90% of the most common use cases, and at a much more economical price point.
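That pattern is cheap to build. Here is a sketch with SQLite standing in for whichever staging database you choose, and illustrative table and column names: land the full extract once, then let each on-demand refresh pull only the small aggregate the worksheet actually needs.

```python
import sqlite3
import pandas as pd

# Stage: land the full extract in a local database once.
staging = sqlite3.connect("staging.db")
extract = pd.DataFrame({
    "plant": ["P100", "P100", "P200"],
    "cost": [120.5, 80.0, 310.0],
})
extract.to_sql("maintenance_costs", staging, if_exists="replace", index=False)

# Refresh on demand: the worksheet only ever receives the small aggregate,
# not the raw rows, so the query stays cheap to re-run.
summary = pd.read_sql(
    "SELECT plant, SUM(cost) AS total_cost "
    "FROM maintenance_costs GROUP BY plant",
    staging,
)
summary.to_excel("cost_summary.xlsx", index=False)
```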
My message, then, is don’t dismiss Excel outright; there is a lot of flexibility in this routinely available package, enough to impress even the most experienced Tableau or QlikView report developer.