Saturday, November 14, 2009

FIMA London 2009

A busy week, last week. FIMA attendance was down a bit, but at least we got back into the City. We missed the attendees from Asia and Australia who participated last year; I suspect budget restrictions imposed in '08 had an impact on that.

It was a great pleasure for us to have Jeremy Ruston of Osmosoft spend a few minutes during the Avox presentation to give the audience insight into the power of open source software development. Jonathan Lister, an independent open source software developer working with Avox, pulled off the almost impossible by successfully demoing two live applications over the internet during the presentation (and a custom app developed the night before using the wiki-data API - apps like these never work when you are doing live demos in front of large audiences!).

Thanks to all who joined the party at Prohibition on Tuesday night. The turnout was excellent (>200 people I'd say). Thanks to all the additional sponsors - Asset Control, EDM Council, Interactive Data, GoldenSource, S&P and SWIFT. The bash always makes our business a bit more tolerable.

Based on all the presentations, panel discussions and networking discussions last week, I see the potential for some substantial progress in the standardization and transparency around business entity data in 2010. Watch this blog over the next week or two. We've got some more things brewing.


Thursday, November 5, 2009

NIF - Bark worse than Bite?

It's no secret that I've not been convinced of the "achievability" or desirability of the regulatory infrastructure being proposed by the National Institute of Finance. But you know, I may be mellowing a bit. Here's why.

I bumped into an old business school/Algorithmics alumnus friend of mine in the Toronto airport on Tuesday (Dr. Dan Rosen - the brain with legs). He has been involved in NIF discussions over the past months and was actually the first person to tell me about it.

Well, I took the opportunity to suggest to Dan that NIF was behaving either arrogantly or naively or both. Dan gave me a kind but patronizing look and asserted the following.

"Ken, when the US government shut down Lehman but not AIG, they had no real understanding of whether that was the right move or not. They did not have anything to objectively measure what was the better decision." Crap, that's clear to me... "The objective of NIF is to help generate that understanding", Dan continued.

I immediately and quite cleverly (I thought) countered by shouting "Yes, but these guys are talking about splitting atoms, sending man to the moon and regulatory harmonization. The first two were easy!". Finally, I've managed to achieve some minimal sort of intellectual equity with the Ph.D. who was able to construct, graphically, Mick Jagger's lips with a gamma graph on RiskWatch while drinking his cappuccino and developing new credit risk management algorithms on his dirty napkin.

"Ken, Ken, Ken" (Dan's impatience starting to become evident), "these guys are politicians. They have to talk that way. All NIF wants to do is help increase transparency and provide better decision support."

Now that makes sense to me. If the NIF folks start talking pragmatically and dump the atom splitting talk, I'm in. Forget about world peace, regulatory harmonization and political multi-partisanship. Transparency is the single most valuable objective we can shoot for.

And wouldn't you know it, the dark knight steps up to the plate. Bloomberg, are you serious?

The plot thickens...

Dazed and confused,


Friday, October 30, 2009

New version of wiki-data is live

I’m pleased to say that the new version of wiki-data is now live. There is also a link from our website.

Features of this new version:

- It has been developed in open source software, which will provide significant flexibility to clients when the client-specific version of the site is rolled out.

- Every Avox ID (AVID) is a distinct URI (Uniform Resource Identifier) enabling links to and from the entity record. There are no restrictions on the use of AVID, nor are there any license fees associated with it.

- We have put more data fields into this version of wiki-data, including "formerly known as" name, "also known as" name, trading status, URL and additional operational address details.

- Users can hide/show columns, sort content in columns, click the AVID and land on a company page with a Google map, filter content, challenge data values, request additional data for a record and request new records.

- Limited numbers of additional data and new record requests will be processed at no charge. We will soon have a charging mechanism in place for requesting more data and new records. Existing clients will be able to apply these requests against their existing contracts if they wish.

- There is a facility called “New app ideas” at the bottom of the screen where users can suggest and vote for new functionality.

- An email support function is also included at the bottom of the screen.

We have already started working on the client-specific version of this application, called “my-wiki-data”, which will provide clients with more visualization, query and API functionality. We look forward to getting your feedback on both this public site and the client site as it evolves. Don’t hesitate to call if you have any questions. We hope that this new capability is helpful.

Please note that the email request function on the previous version of wiki-data was corrupted and not operational during the past two weeks. My apologies if you had tried to use that capability and received no response. It is working with this new version.

Wednesday, September 2, 2009

Democratization of Data (or not?)

I had an interesting conversation with someone who manages corporate customer data for a major investment bank last week. She said that she had read our blog and developed the impression that Avox was attempting to democratize entity data. In her opinion (and as it turns out, in mine), this alone is likely to degrade the overall quality level of the content.

After this discussion, it dawned on me that we had better be clear about our objectives at Avox, particularly as they relate to wiki-data:

1) Maximize the number of productive "challenges" received that help us to improve the quality of the data in wiki-data. Every challenge to records with an AVID will be verified by our expert team of analysts before being applied (assuming, of course, the challenge is proven correct). The "democratization" element of this activity equates to the opening up of a chunk of our content on wiki-data for the world to see and for anyone to challenge. BTW, the quality improvement benefits all our data vendor partner offerings too.

2) Provide a platform for others to link to and from. Our next release of wiki-data will incorporate additional identifiers from other firms. The AVID will be part of a URI for each entity enabling the technology community to efficiently leverage a single version of the truth. This will in turn attract more usage, more challenges, lower latency and increased accuracy.

3) Create a forum for anyone to comment on data records, propose changes in public and ultimately to add "non verified" data records which Avox or other firms can verify for clients. This is the big stretch and, if I'm honest, it's the iteration of wiki-data we are least certain about.

To be clear, Avox will never provide to clients data that has not been verified according to the terms of their service level agreements. We do, however, regularly get asked for large volumes of business entity data to facilitate marketing campaigns, for example, where data quality is not as important as it is for credit risk management or regulatory compliance. This is where information on millions of entities is sometimes required but the budget for procurement of this content is meager. Perhaps in these cases it is not necessary to have an independent and rigorous analysis performed on every entity at regular intervals, and the community self-checking mechanism is adequate. Moreover, a global community-maintained model may be a great solution for a free and universal yellow pages capability (for example). Over time, more and more of these entities will be verified by expert third parties such as Avox and assigned identifiers as the market demands.

We don't have all the answers; however, our aim is to give you, the user community, a platform to access and use the content cheaply and efficiently. We are looking for your views and guidance on how to shape wiki-data to make it a more market friendly service that represents real value for you.

BTW, keep your eyes open for v1 of later in September. You can check out Jonathan Lister's blog for progress - the software is open source.


Monday, July 20, 2009

Wiki-data development with BT/Osmosoft

It's been a while since my last post but we have been busy working on wiki-data. Some of you will know that we have been in discussions with the open source software group at British Telecom. It's a subsidiary called Osmosoft comprised of some frighteningly clever technologists with impressive credentials and serious talent.

A group of us from Avox, including senior management from our Wrexham office, our sales team (including Brett Hodge, who came over from Australia) and me, spent a day with Osmosoft last week. They affectionately refer to these days as "HackDays".

Within the space of 10 hours, this team, led by Jeremy Ruston, the Osmosoft founder, put together a brand new (I mean built from scratch) version of wiki-data with some great interactive functionality. We provided the guys with a very large dataset which they used as a base. By the end of the day, we had a platform where every AVID has its own URI (Uniform Resource Identifier) so that anyone can link to or from it; addresses of companies are linked to Google Maps so you can visualize where firms are based (we quickly found one in Iraq!); anyone can begin a comment string on an entity; and yes, we even had an online edit function (we are going to need to figure out how to govern that one before releasing it!).

If you are interested, your best bet is to have a look at Michael Mahemoff's blog where you will find detailed descriptions of the process and outcome as well as a video of me and Paul Downey of Osmosoft running you through the platform before we ran to the pub for pints. Just click on the link (embedded in the heading) at the top of this post.

I'll keep you posted on what will be rolling out and when. We are extremely excited.


Saturday, June 27, 2009

Who's Got the Entity Data Standard?

It's been an interesting month, with many articles and editorials in publications such as Reference Data Review, Inside Reference Data, Global Investment Technology and Securities Industry News, to name a few, all raising the issue of standards in the business entity data space. There is no shortage of contenders. SIIA is pushing IGI (Issuer, Guarantor ID). SWIFT is considering a role by expanding BIC coverage and addressing the non-uniqueness of the BIC. FactSet has made past announcements of their intention to make a public standard available. Other major data vendors such as Bloomberg, Thomson Reuters and D&B already have broad coverage. The European Central Bank is working on a data utility. So is Financial Intergroup's Allan Grody. And so is the Dubai International Financial Centre (DIFC). Avox has an initiative with S&P called the CABRE (CUSIP, Avox Business Reference Entity). We've also recently launched a free business entity directory (wiki-data). And of course there are the industry working groups including the EDM Council, JWG-IT and FISD, all talking about standards.

This is all looking pretty positive, isn't it??? (NOT!!!).

Let's be honest here. Every commercial firm in this space would love to own the global standard for business entity identification. Think about it. The entire planet "theoretically" would have to pay you and only you for the right to get an authoritative picture of a business entity. Just think of the monopoly play here. Millions of customers needing millions of pieces of information every day. One year's profit would pay off the US national debt.

At least that's what many buyers of such information believe. It's not quite that profitable a picture. Here's why.

There can never be a monopoly provider of this data. The function that all of the above mentioned firms play is that of data aggregator. Some firms use automated matching to provide a consolidated picture, some use manual analysis and all the others a combination of both. The sources we go to are the same that anybody can go to directly.

The sources used typically include national and state business registries, regulatory authorities and tax authorities. All these organizations have an obligation to make information available to the public (although some of them have decided to put a rather hefty price tag on some important data). Some data is not publicly available and various vendors have secured rights to it but, by and large, nobody has universal access to everything (the Google guys are trying...).

So where is all this spaghetti going to get us? In my opinion, it's going to get us to where you want to go. Competitive forces are pushing all vendors to be more innovative, cost effective and customer focused. A perfect world in the Avox view is one where everyone works off of the same underlying basic content that uniquely identifies a business entity. The ID becomes a secondary issue but remains important of course.

Some vendors are working to this objective. You won't be shocked to hear that Avox is one of them; however, a number of firms we have spoken to are also "chilling out" with respect to relinquishing value of the basic data.

Please believe me when I say that Avox does not expect to be able to solve the whole problem on our own when it comes to business entity identification. We can't. But we can help facilitate a quicker convergence onto a standard by being open and "influencing" other firms to be open.

If you buy into this view, let your vendor partners know. The sooner we connect, the sooner you can connect and that's when everyone can get on with the business of efficiently generating returns for customers.

Your Data Geek,


Saturday, June 6, 2009

KYC - Global Regulatory Conflict

On June 4, GoldTier, a KYC software company, hosted a speaker panel to discuss the impact of a changing global political and economic environment on the Know Your Customer (KYC) function at financial institutions. One of the discussion points exposed a great deal of frustration amongst the participants, who included representatives from North American and European financial institutions.

As a KYC professional at a global financial institution, it is incumbent upon you to ensure that the head office of your firm is aware of their global exposure to any and all customers. This requires transmission of what some consider to be "sensitive" data across international borders. Countries including Singapore, Switzerland and Korea are asserted to have in place data privacy regulations that prevent such data from being sent across their borders. By definition, this would make it legally impossible for firms located outside of those countries to do business there.

This is clearly not the business reality. The problem as I see it is that compliance officers are paid to be ultra conservative and risk averse. This results in overly conservative interpretations of international regulations governing, amongst other things, data protection. This puts them and their firms in an awkward position of having to rely on the opinion of their locally based compliance function to make judgements without having the ability to regularly audit the process and data from a foreign head office. So ironically the risk averse compliance function is introducing a significant risk to their firm by not sending what is typically generic company information between geographies.

We have international clients who have realized this and have engaged with regulators in countries like Singapore to get clarity. Indeed Avox has been involved in some of these discussions directly. In most cases, there is actually no restriction on sending company information outside of these countries. How on earth would firms in these countries do business internationally if this was the case?

It's time for the business to work closely with compliance and have frank discussions with any regulatory body they believe is hampering transparency. Any country that truly prohibits communication of important decision support information outside of their borders needs to be prepared to suffer a drastic reduction in international trade. I think you will find that the reality is most regulators will not stand in the way of legitimate commerce.


Tuesday, June 2, 2009

Canada ready to take a reference data lead?

I've just returned from the third of three EDM Council meetings chaired by Mike Atkin, this one in Toronto. The previous recent meetings were held in Boston and London. I was pretty surprised at the contrast.

The Toronto event was well attended by most of the major banks, funds and a number of vendors. John Mulholland of RBC kindly hosted and injected much relevant and interesting comment. A representative from the cash equities business group at CIBC provided a significant amount of challenging, insightful and truly helpful comments. This helped make the Toronto meeting one of the most dynamic and relevant EDM Council meetings yet in my opinion.

It was rather shocking to see the Toronto financial community coming together with so much vigour. I've personally been trying to foster a reference data community here (where, embarrassingly, I'm based) but to little avail. Hats off to Mike and John for getting the ball rolling and to all the firms that participated for grabbing the bull by the horns. Maybe it's time for Canada to start leading the reference data charge...


Saturday, May 16, 2009

If I'm perfectly honest, this exercise has been more painful than anticipated.

We started out by planning to publish a "skinny" version of every record in our CORE (COrroborated, Remediated, Enriched) Avox database. Then we realized that a good chunk of these records had been verified over a year ago for clients wanting just a one time clean-up (some folks never learn...) so those records had a reasonable chance of being out of date. As it takes a good 10 minutes per record to recheck, trolling through hundreds of thousands of stale records would take too long and cost too much for this initiative. Given that we are making this content publicly available so everyone can look at it, we thought it would be risky to publish data we knew had a high probability of being out of date.

Then in the pared down database, we found a large number of records that were not unique "legal" entities. That is to say, they were branches, departments or funds which had been included for specific Avox clients at their request. So while these are valid records from their perspective, we've excluded them from wiki-data for now with a view to starting with pure legal entities.

Then we released the database to a small trial audience and, as expected, we received helpful feedback on some quality issues which we have been in the process of addressing. This is exactly what we had hoped for as every new set of eyes that looks at the content brings with them new perspective and knowledge. The community helps improve its own asset. We also received very helpful feedback from users including some of our own staff about the layout of the search facility, the results and the issue filing process.

Perhaps most interesting and encouraging was the fact that one of our competitors provided constructive feedback. Could it be that we may be able to work with each other to jointly improve our mutual clients' data? Let's see.

So now we feel the database, although slimmed down quite a lot, is ready for broad public viewing and consumption. There will be out of date information. There may be some duplicate values. And yes, I expect there will be some outright errors. It's all expected. All we ask is that you tell us about any problems you find. You just need to click on the "Correction" button next to the record in question.

I know it's a bit of a pain, but please do share your feedback/thoughts with respect to wiki-data on this blog. You can just email me if you prefer. We also have a LinkedIn group (Avox business entity discussion forum) set up.


Tuesday, April 28, 2009

Avox partners with CUSIP/S&P

Like I said a few weeks ago, I anticipate a flurry of activity in the reference data space. OK, I had some inside information upon which to make this assertion but the reality is - it's happening.

We are really excited about this arrangement with CUSIP. Some skeptics out there are suggesting that this is not a good partnership because of some negative press CUSIP is getting, particularly in Europe. I look at it the other way around. By partnering with Avox, CUSIP is embracing a more open approach to entity reference data. The content managed by Avox within this partnership will be consistent with that of all Avox clients which currently include the likes of Barclays, Citigroup, Nomura and Standard Bank of South Africa. It will also be consistent with entity data held by our partners including firms such as SWIFT, Interactive Data and Markit. Applying the CABRE will be quick and efficient for these companies as well as for existing CUSIP customers and partners. In essence, this partnership proves that collaboration is growing and succeeding in the reference data world.

Will the CABRE become the industry standard entity identifier? Well, frankly, that's up to the market. What do you think? If you have an opinion, we'd like to hear it and get some public discussion going on.


Future of Journalism (Alan Rusbridger) = Future of Data?

Alan gives an interesting commentary in this video about the future of journalism and how The Guardian is addressing the new world of open information. The parallels between this and data management keep ringing home for me. What do you think?


Sunday, April 26, 2009

Out with the old, in with the new.

I've been tracking Jeff Jarvis, author of "What Would Google Do?" and of the blog. He posted an article recently about the demise of newspapers in the form of hypothetical testimony to Senator John Kerry's hearings. What's that got to do with data?

I find interesting parallels between the news business and our industry. Both have historically relied on IP and tight control over it. The Internet destroys control. But on the plus side, it proliferates knowledge. And on the down side, it proliferates much useless noise.

The question is, can large, blue chip businesses that rely heavily on conventional license revenue streams adapt quickly enough to this new regime of transparency to survive and prosper? A number of news organizations have found this a major challenge.

It's an exciting time to test these ideas as firms in the financial services space are forced to seriously question everything they do and how they do it. Might there be a more efficient way to improve data quality? I suppose you know my opinion on that already...


Thursday, April 23, 2009

Wiki-data beta is live!

OK, back in business. Looking for feedback from those of you who have an interest.


Comment on note below: As expected, we've already received some great feedback which means we've got to do some revamping to make this content more useful for you. We'll get back up and running asap.

Best Regards,


310,000 verified, maintained business entity data records. Not a bad starting point. Some partners will soon be coming along with authoritative identifiers, documentation and linkages to securities.

It's happening folks.


Tuesday, April 21, 2009


We are tremendously excited about this new partnership with Markit. The amount of efficiency that can be gained by mutual clients is tremendous without even considering the amount of risk that can be driven out by ensuring consistent data and documentation.

The folks at Markit are very customer focused, as are we, so please let us know if you have any suggestions for improvement or enhancement of our joint service.


Saturday, April 11, 2009

Is the industry finally getting "collaboration"?!

In the past 3 months, I've seen more concerted effort amongst the vendor community to work together than I have in my 15 years in this industry. Here's a prediction. Before the end of Q2, 2009, we are going to see a series of announcements and real progress toward global vendor collaboration in the business entity space.

We need to move faster in this industry if we want to remain relevant. That applies to all members of the food chain - data vendors, technology providers, consulting firms, regulators, utilities AND users. Business conditions are ripe for change that may initially be perceived as risky but which is already proving to deliver significantly higher value than that which has been achievable in the past.

I'm looking forward to some discussion on this topic.



We've been speaking with a lot of firms in the industry over the past year about the trade off between data quality, coverage, timeliness and cost. A challenge faced by data managers, particularly in the business entity space, is that a central data repository can feed many different groups with distinctly different requirements. Credit risk, for example, will have an extremely low tolerance for latency or errors but may not need massive volumes of entities - just those to which their firm has exposure. Marketing, on the other hand, may place a higher priority on a huge database of potential corporate customers which they can target for campaigns. In this case, quality, while important, does not have the same value as it does for risk.

So the question is how does one balance these requirements while leveraging a central data repository? Six sigma level data quality across an entire CRM database would cost far too much. Incomplete data population for marketing initiatives significantly compromises the effectiveness of a campaign.

What about all of us sharing, for free, very basic data about entities? Say, legal name, country & region (where necessary) of incorporation and perhaps a few other bits as agreed by the community. We can agree on a mechanism that ensures contributor identities are not disclosed unless they choose otherwise. All contributors have the capability to check, update and comment on data, very much like a Wikipedia model.

This data then serves as a platform for a financial institution's internal team and/or a third party to perform additional verification/certification. Over time, the shared and free data asset becomes more comprehensive and reliable. Firms like mine will continue to provide verification services; however, we will need to continue to enhance our service/content offerings in order to increase our value proposition to our market over time. And most importantly, a free and open foundation of basic data will enable anyone on the internet to identify errors. Is there a chance that some contributors may corrupt the data by accident or intentionally? Absolutely. However, similar models like Wikipedia are now proving that such abuse is rapidly identified and corrected by the well intentioned community members.

Financial institutions have consistently asked for a free data utility to help address business entity data quality and identification issues. Which of you is ready to proactively participate in such an initiative? It won't happen without your involvement.

I look forward to your comments, criticisms and ideas for improvement.