Two weeks ago I was lucky enough to join our general manager for a Business Architecture review in Dubai. This was my first trip to the UAE and I was in for a few surprises. First up was an “eye examination” at 01:00 in the morning. Apparently all the chaps with ZAF passports have really poor eyesight for some reason. Needless to say my eyes weren’t tested. My form got stamped and I was told to re-join the first queue I had stood in. Next up was the heat: a cool 35 degrees when we were picked up at the airport. Temperatures peaked at 45 degrees, making it less than pleasant to go for a stroll outside. Mind you, the humidity was pretty insane as well. And then finally, the cost of food and booze. A Heineken will set you back around R60, and a 300g fillet anything between R400 and R650. Eish.

On the brighter side were the friendly people, amazing buildings and malls you could never even dream of – complete with ski slopes, aquariums and 1,200 stores!

Ok, so why were we out there? A potential client had asked us to review their infrastructure, business processes and technical implementation, with a core focus on operational efficiencies. I was there more from a technical perspective and had the exciting job of building out a few POCs while we were on-site. This is where I have to give credit to the Microsoft stack, especially to PowerPivot. In just a few hours one is able to mash up multiple data sources and present back analysis that the client thought would take a very long time to complete. Our rapid BI development tool, timeXtender, also came in very handy when building out a finance cube on a subset of the client’s database.

These tools not only speed up the creation of POCs, but also significantly reduce the development time for full-scale projects. The company I work for has recently had numerous success stories where ROI has been realised very quickly, sometimes within the first month of going live! The common denominator – timeXtender.

The review came to an end all too quickly but I was happy to be back in SA again. There is nothing quite like your own bed. Well, we’ve compiled an exciting roadmap for the client – let’s hope they want us to partner with them on the journey.


The much-anticipated release of CTP 3 occurred on 11 July this year: http://blogs.technet.com/b/dataplatforminsider/archive/2011/07/11/sql-server-code-name-denali-ctp3-is-here.aspx

The past month has seen me building a virtual machine containing all the new features in the next version of SQL Server, specifically the BI ones. Top of my list was Project Crescent along with the new version of PowerPivot. I’ve also been delving into the new version of Master Data Services and started taking a look at Data Quality Services today. Here are my impressions of what I’ve seen so far:


PowerPivot

The new release of PowerPivot is really great and I am impressed with a lot of the new capabilities, including: KPIs, hierarchies, diagram view, multiple relationships on the same field, metadata, advanced column sorting, perspectives, measure formatting and the ability to show details by right-clicking on the PivotTable. I also like the fact that you can change a field property to ImageURL (Advanced Properties) and then have these images appear in your Crescent reports.

To get a full list of the new features, open the PowerPivot window, click on the Help icon and then select the What’s New menu item.

Project Crescent

Wow, I really like Crescent! Crescent reports are pretty much Silverlight PivotTables based on PowerPivot models (uploaded to SharePoint 2010) or the new Tabular version of Analysis Services. The reports are very dynamic and filtering in one area automatically filters data in the other PivotTables and graphs (called Highlighting). The scatter chart also has an option for a Play Axis which is great for comparing data over time. Some of you may remember this report in the PowerPivot Management Dashboard in SharePoint’s Central Administration. Below is a screen grab of the report I created from my rugby PowerPivot data.

The only downside I see to Crescent is the fact that you need SharePoint 2010 Enterprise Edition in order to get access to any of the cool BI features. This is going to prevent a lot of customers from leveraging the new technology. And currently, I’m unaware of any applications that are able to render these reports on a mobile device.

Microsoft Visual Studio 2010

So far I’ve explored the Tabular version of Analysis Services. It’s PowerPivot in Visual Studio (in blue instead of green) with a whole bunch of other advanced features, e.g. partitions and roles. All up it looks decent. When embarking on a new project, architects will now need to decide whether to go the classic Analysis Services route (as we know it) or the Tabular route.

The only thing I couldn’t see was how to add calculated DAX measures in my Tabular project…
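For the record, in the PowerPivot window a measure is just a DAX expression typed into the measure grid, so I’d expect something equivalent in the Tabular designer. A sketch of the kind of thing I was after (table and column names are made up from my rugby data, not from an actual project):

```dax
Total Points := SUM('PlayerStats'[Points])
Points Per Game := [Total Points] / COUNTROWS('Games')
```

The := assignment syntax is the same one the PowerPivot measure grid uses, so hopefully the Tabular story ends up being just as simple.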

Master Data Services

I must say that I was looking forward to an upgraded version of MDS but was disappointed to see that the majority of the application was unchanged. There are some Silverlight screens now embedded in the UI to improve the experience and performance, but that was about it. There is also an Excel 2010 MDS add-in that allows you to connect to an MDS instance. This add-in enables you to manage your master data in an environment familiar to all of us. I’ll need to explore this feature in more detail to see how robust it is.

Data Quality Services

As I mentioned earlier, I only started looking at this new addition earlier today and my first impressions are good. I set up a Knowledge Base with some Domain Rules and Term-Based Relations and it worked well. The UI also looks decent – a lot better than MDS. Below is another screen grab.

What will be interesting to validate is the matching capabilities as one really wants to be able to easily de-duplicate datasets.

Stay tuned for more…

Our BI CoP (Community of Practice) is currently having a refresher of SharePoint 2010 during the month of June and it’s my turn to show some of the BI capabilities on offer. Other than Excel and Visio Services, PowerPivot and PivotViewer, SharePoint has a great tool called Dashboard Designer (part of Performance Point).

Instead of using the Contoso or AdventureWorks datasets to build out my dashboard, I went for a dataset I came across last year which revolves around Tri-Nations rugby players. Quite interesting (if you like rugby, that is). I used this same dataset when I showed PivotViewer at Tech-Ed Africa 2010 and received some good feedback.

Dashboard Designer allows you to create a host of different objects including KPIs, Filters, Reports, Dashboards, Indicators and Scorecards. Dashboards are the containers and allow you to link up the other object types in one location. Once deployed, a Web Part page will display your dashboard. The only downside to this approach is that if you want to include document libraries and team discussions (as an example) into your dashboard, Dashboard Designer will overwrite those web parts the next time the dashboard is deployed.

Now, you can easily get around this dilemma by not using the dashboard component in Dashboard Designer and instead creating a new Web Part page from scratch. You can then quickly include your published Performance Point content in these web parts, as well as your document libraries, custom lists or team discussions. One trick: you need to create connections between your different web parts in order for filters to work or for additional reports to be generated once a user clicks on a particular item. Note: connections would also need to be set up in Dashboard Designer.

Here is what my dashboard looks like when you access it for the first time:

Once the user clicks on a player position or KPI, additional reports are displayed:

I’ve even embedded the new Strategy Companion Analyzer Scorecard component:

And finally a little fun with my team discussions. I’ve included some classic one-liners from our coach, Pieter de Villiers.

With just a little bit of luck Oregan Hoskins will see this dashboard and recruit me as a data analyst for the upcoming Rugby World Cup!

Agile BI

I’ve read various posts on agile BI and which methodologies would best suit this type of approach. In some instances it’s a really good idea; in others, not. Today I got to experience a new type of agile BI. Let me set the scene.

I’ve been pretty busy lately building POCs, so when a new one came by, one of our Account Executives decided to help me out by building it himself (Schneeberger likes this). We had a good model to start with, and building out the reports would also not be too hectic.

The client feedback session was set up for 09:00 this morning, and let’s just say that our model wasn’t working as expected. So at 07:45 I was trying to fix the model while the AE was building out some reports. We left for the client at 08:15, leaving me some time to finish off the reports in the car. Well, we hit another roadblock when the AE’s VM blue-screened on the way there. At 08:40 I started loading the client’s data into the model (using my boss’s SUV boot as a desk) on my VM. The data finished loading as we sat down in front of the client. Could this really be happening?

It turns out the client is not too interested in the model but more in the reports that can be created from it. So off I go building out reports in Excel 2010 and the client is loving it. Now that’s what I call Agile BI!

I believe there was a fair amount of luck involved but it also confirmed that the BI model we are using is really solid. Testament to all the hard work that has been put in to accelerate BI implementations.

Much to my surprise the iPad 2 was released in South Africa on Friday, along with very attractive pricing options.  Now, I’ve been waiting to get an iPad for some time and wasn’t going to miss out on the opportunity.

I called all the launch stores and noticed that the iStore on Sandton Drive was the only one to open at 07:30. To guarantee myself one of the devices I left home real early and was there at 03:55. After spending quite a bit of time explaining to the security guards why I was there that early, they finally gave in and opened the gate. I was the only person there for an hour before the second chap arrived. Let me also add that it was 3 degrees outside, which was not that pleasant.

So, 07:30 arrived and there were a lot of people queuing up (behind me of course 🙂 ). The Apple staff were really well organised: they handed everyone numbers and also let you pick your specific model based on availability. I went for the top-of-the-range 64GB Wi-Fi & 3G along with one of those magnetic covers. The wait was extended by another hour and a half as national sales could only kick off at 09:00. Come on! Anyway, at 09:05 I had the iPad 2 in my hands and the 5-hour stint was over.

I believe the device is miles ahead of any of the other slates in the market right now. I can’t wait to explore the business intelligence capabilities, as one of our products, Strategy Companion Analyzer, has just released its beta mobile version.

Here are some pics of the event and device.

MDM in Cape Town

I’ve just concluded a good week in the Mother City performing a Master Data Management Architecture Review for a potential client. I last visited Cape Town five years ago and it was good to spend some time again in the beautiful city. The grass is, however, not always greener on the other side and Cape Town does come with its fair share of frustrations…

  1. Parking: there isn’t any, and when there is a spot, it’s taken
  2. Wind: man, when the South Easter gets going there’s no hiding. I was surprised at the number of skew trees in the parks…
  3. The drivers: quite snobbish actually. We’re a lot better here in Jozi

Other than that it’s an amazing place and the good things definitely make up for the bad. To be quite honest, the mountain cancels out all the bad things! Quite a pity I was there to work and not on holiday.

The first day involved extracting client profiles from an Oracle database. Easy enough, or at least I thought so. I had installed the Oracle 11g client tools before my trip, so all I really had to do was set up my tnsnames.ora file and perform an import into my SQL Server 2008 R2 database. After three hours and still no closer to connecting to the Oracle database, I began feeling the heat. I’m not too sure exactly what the problem was, but not even my good mate Google could get me any closer.
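For anyone fighting the same battle, the entry I was trying to get right looks something like this (host, port and service name below are placeholders, not the client’s actual values):

```
# Hypothetical tnsnames.ora entry -- HOST, PORT and SERVICE_NAME
# must match the target Oracle environment.
CLIENTDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ora-host.example.local)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = CLIENTDB))
  )
```

With that in place, `tnsping CLIENTDB` is the quickest way to confirm whether the client tools can actually reach the listener before blaming anything else.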

The on-site support staff struggled to get the connection working as well, and they eventually decided to install an application called Oracle SQL Developer. Not the best-looking tool, but there was an option to return rows in a table and then export them to CSV. What a relief. From CSV back to SQL and I was a happy camper again.

That evening I got to spend some time at the very nice Protea Hotel: Fire & Ice. Known for their legendary burgers and milkshakes, a work colleague and I were keen to check them out. We settled for the Kudu fillet burger with Lindt chilli chocolate sauce (as the Cheesy, Cheese, Cheese was sadly not available!). I was a little disappointed as the meat was overcooked, but it looked good nonetheless.

The rest of the week involved performing a data quality check and then a MultiVue POC to determine the severity of the duplications. I must say that MultiVue is an excellent application for these architecture reviews and was very quickly able to demonstrate capabilities and results which the client had not been able to achieve before. What more and more clients are realising nowadays is that data quality is critical to the success of so many initiatives, and they are keen to explore new ways of improving the single view of the customer.

So, a good experience all up and I look forward to visiting (or possibly even moving to) Cape Town sometime in the future.

So I’m back from my short break and looking forward to 2011. 2010 was a great year from a personal perspective. After many years of slogging I finally finished my BSc IT degree. I now have a little more free time on my hands but that is soon to disappear as my lovely wife is expecting a little girl in 6 weeks (really looking forward to becoming a dad). I also had my first trip overseas and moved into a new position at work which I’m enjoying immensely! Finally, presenting at Tech-Ed was a great experience and something that I look forward to in 2011 if given another opportunity.

Right, down to business… In December I started a POC using a product called MultiVue. The product is an MDM (Master Data Management) solution developed by a company called VisionWare in Scotland. It uses probabilistic matching algorithms to match and merge clients, products, properties and any other entities which may be duplicated in the organisation. It is very powerful and I was fortunate enough to be part of the first production installation in Africa.

In short, I was trying to accomplish the following for the POC:

  • Import client and product data from three source systems
  • Setup some matching rules
  • Identify duplicate clients and present back to the client

Easy enough, or so I thought. After importing the source data, I created some matching rules and kicked off the process. I returned a couple of hours later only to find out that the process had failed due to insufficient disk space. Strange… I then extended the hard disk of my virtual machine, altered a few of the matching rules and started the process again. To my dismay I once again ran out of disk space! What could be going wrong? At this point I was getting extremely frustrated and decided it would be better to take a break.

The next day I tried the matching process again, but with a smaller subset of the original source data. Closer investigation into the temporary tables that were being created in the background led to the discovery of the problem. 600,000+ matches were being created on a dataset that only had 100,000 records. The root cause: I had committed the cardinal sin of not properly profiling the source data before importing it. One of the fields I was matching on was the client’s email address. The data capturers had thought it would be a good idea to specify a 0 for 10,000-odd clients. The matching algorithms were therefore matching thousands of clients with thousands of others that were in fact not the same individuals. I removed the invalid entries, re-ran the process on the full dataset and received the results I was expecting.
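The blow-up makes sense once you do the maths: k records sharing the same value on a matching field generate k(k-1)/2 candidate pairs for that value alone, so 10,000 “0” email addresses translate into roughly 50 million pairs. MultiVue’s internals aren’t public, so the Python sketch below only illustrates the arithmetic, not the product itself:

```python
from collections import defaultdict

def candidate_pairs_per_key(records, key):
    """Count the candidate pairs a naive matcher must compare
    for each distinct value of the matching key."""
    groups = defaultdict(int)
    for r in records:
        groups[r[key]] += 1
    # k records sharing a value produce k*(k-1)/2 candidate pairs
    return {value: k * (k - 1) // 2 for value, k in groups.items()}

# Five records with the placeholder email "0" plus two genuine addresses
records = [{"email": "0"} for _ in range(5)] + [
    {"email": "anne@example.com"},
    {"email": "ben@example.com"},
]

pairs = candidate_pairs_per_key(records, "email")
print(pairs["0"])  # 5 * 4 / 2 = 10 candidate pairs from the placeholder alone
```

Scale the placeholder group up to 10,000 records and you get 10,000 × 9,999 / 2 ≈ 50 million pairs, which explains both the temp-table growth and the disappearing disk space.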

Lesson learnt: always have a good look at your source data and understand its peculiarities before starting the actual work. The Data Profiling Task in SSIS is a great way to quickly pick up any issues in your source data and I’ll definitely be using it more often in the future.
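Even without SSIS to hand, the basic check is easy to script yourself: a frequency count per column flags placeholder values immediately. A quick sketch of the idea (not the Data Profiling Task itself):

```python
from collections import Counter

def top_values(values, n=3):
    """Quick profile: most frequent values in a column, with their share
    of the total. A placeholder like "0" dominating a supposedly unique
    column is an immediate red flag before any matching is attempted."""
    counts = Counter(values)
    total = len(values)
    return [(v, c, round(c / total, 2)) for v, c in counts.most_common(n)]

emails = ["0", "0", "0", "anne@example.com", "ben@example.com"]
print(top_values(emails))  # "0" tops the list with a 60% share
```

Had I run even this crude check on the email column, the 10,000 zeros would have jumped out long before the matching process ate my disk.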