More than 80% of the world’s data is unstructured, and most of the new data sets coming to investors’ attention are in an unstructured format. For the consumer, this means having in-house data scientists, engineers and IT staff who can take all that unstructured content and turn it into something more useful, which is an extremely lengthy and expensive process. Most buy-side firms do not have access to these types of resources, and that’s why big data vendors are essential. Every day, these teams of experts take large volumes of unstructured content and convert it into tradable market data.
For hedge funds, asset managers and banks looking for a big data vendor, it’s important to ask the right questions. We have narrowed down the top 10 key areas that should be considered when deciding on an alternative data vendor.
1. Structured Data
Buy-side firms should look for alternative data vendors that pre-process unstructured data and deliver it in a 100% machine-readable, structured format, regardless of the data type.
2. Get a Full History
Many alternative data providers are relatively new. Consequently, they have only been storing data for a short time, which makes proper back-testing difficult or impossible.
3. Alternative Data Mishaps
The business of alternative data is not a perfect science; sometimes the vendor is not able to store data at the time it was actually generated. Vendors should be transparent about gaps or data integrity issues so that the consumer can make an informed decision on whether to use that part of the data.
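As a rough illustration of how a consumer might check a delivery for gaps, the sketch below scans a list of observation dates and reports any stretch where consecutive observations are further apart than expected. The function name `find_gaps`, the sample dates and the one-calendar-day spacing are assumptions made for this example; a real data set would need its own calendar (for instance, trading days only).

```python
from datetime import date, timedelta

def find_gaps(dates, max_step=timedelta(days=1)):
    """Return (last_seen, next_seen) pairs where consecutive
    observations are further apart than max_step."""
    ordered = sorted(set(dates))
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_step]

# Hypothetical delivery: 3 and 4 March are missing.
observed = [date(2021, 3, 1), date(2021, 3, 2), date(2021, 3, 5), date(2021, 3, 6)]
gaps = find_gaps(observed)
for last_seen, next_seen in gaps:
    print(f"No data between {last_seen} and {next_seen}")
```

A report like this does not say why the data is missing; it only shows where to ask the vendor for an explanation.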
4. Get Proof of Research
Some of the new vendors have limited or no research demonstrating the value of their data. As a result, the burden of doing all the early-stage research falls on the customer.
5. Context Matters
When you look at unstructured content such as text, the natural language processing (NLP) engine being used must understand financial terminology. To that end, vendors can build their own dictionary of industry-related terms.
6. Version Control is Essential
The vendor must keep its process under version control as technology improves or production methods change; otherwise, future results are more likely to vary from back-testing performance.
7. Point-in-time Sensitivity
Point-in-time sensitivity is about making sure your analysis only includes information that was relevant and available at the point in time being analyzed; otherwise, forward-looking bias is added to your results.
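A minimal sketch of the idea, assuming each record carries both the time of the underlying event and the time the record actually became available: a back-test at a given cutoff should filter on the availability time, not the event time. The `Record` fields, the `as_of` helper and the sample values are hypothetical and are not taken from any particular vendor's format.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Record:
    entity: str
    event_time: datetime      # when the underlying event happened
    available_time: datetime  # when the record was actually published
    value: float

def as_of(records, cutoff):
    """Keep only records that had already been published at `cutoff`."""
    return [r for r in records if r.available_time <= cutoff]

records = [
    Record("ACME", datetime(2020, 1, 2, 9), datetime(2020, 1, 2, 9, 5), 0.4),
    # The event happened on 3 January, but the record was published on 6 January.
    Record("ACME", datetime(2020, 1, 3, 9), datetime(2020, 1, 6, 8), -0.2),
]

# A simulated decision at the close on 3 January must not see the late record.
visible = as_of(records, datetime(2020, 1, 3, 16))
```

Filtering on `event_time` instead would let the second record into the 3 January decision, even though it was not yet available.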
8. Data Maps to Tradable Securities
Most alternative data out there is not about financial securities. Users need to figure out how to relate this information to a tradable security, such as a stock or a bond.
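In its simplest form, this mapping is a lookup from an entity name found in the data to a security identifier. The sketch below assumes a small hand-built table; the company names, the tickers and the `normalize` and `map_to_ticker` helpers are all made up for illustration, and a production mapping would also have to handle subsidiaries, name changes and identifiers that change over time.

```python
# Hypothetical lookup table from normalized entity names to tickers.
ENTITY_TO_TICKER = {
    "acme corp": "ACME",
    "example holdings": "EXH",
}

def normalize(name):
    """Lower-case the name, drop periods and commas, collapse whitespace."""
    cleaned = name.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())

def map_to_ticker(name):
    """Return the ticker for an entity name, or None if it is not mapped."""
    return ENTITY_TO_TICKER.get(normalize(name))

mentions = ["Acme Corp.", "EXAMPLE  Holdings", "Unknown Ltd"]
mapped = {m: map_to_ticker(m) for m in mentions}
```

Names that return `None` are worth tracking separately, since they show how much of the data set cannot yet be tied to anything tradable.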
9. Fast and Innovative
Alternative data analytics and AI are a fast-moving space: there is a lot of competition among companies, and the technology changes dramatically every year. To stay innovative and competitive, some data vendors maintain a dedicated full-time data science team that works with financial organizations and academic institutions to continuously conduct research and development in the analysis of unstructured data.
10. Make Sure the Data is Legal
Both vendors and clients must truly understand where their information comes from and how it is being sourced, to ensure that it doesn’t violate any laws.
Article by Raven Pack