Who we are
Opoint is the world’s leading provider of online news data and technology solutions to the media intelligence industry. True insights cannot be obtained without reliable data. If your task is to deliver valuable, trustworthy insights to your customers, the quality, coverage, and speed of your data matter tremendously.
Opoint sets the industry standard in all three parameters, which is why so many in the media monitoring and data analytics industries are switching to us as their preferred web-feed provider. We also supply historical datasets and live news feeds to companies specializing in market research, risk management, financial analysis, and AI/machine learning.
What we aim for
We continuously benchmark ourselves against other global news providers to make sure that we maintain our position as market leader. While no provider can guarantee 100 percent accuracy, we will always keep raising the bar in terms of coverage, quality, and speed.
How we do it
New sites are added either at our customers’ request or as a result of our continuous benchmarking against other crawlers.
Our system automatically detects new article links and recognizes the headline, body text, author, and date of an article. However, since all websites are constructed differently, our configuration team initially fine-tunes the extractor for each site to reduce the risk of incorrect or incomplete crawling.
Phase 1: Detect new articles
We crawl all websites repeatedly to look for links that point to new articles. Our system is programmed to automatically match the frequency of visits to a section with the frequency of new articles being published on that section. In other words, if a section is updated with new article links very often, we will keep visiting that section very often to look for new links.
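One way to picture this matching of visit frequency to publishing frequency is a simple adaptive interval. The sketch below is illustrative only and does not describe Opoint's actual scheduler: the `SectionSchedule` class, the halving and 1.5x factors, and the interval bounds are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SectionSchedule:
    """Revisit schedule for one website section (hypothetical structure)."""

    interval_s: float = 600.0        # current wait between visits, in seconds
    min_interval_s: float = 60.0     # assumed lower bound on the wait
    max_interval_s: float = 86400.0  # assumed upper bound: one day

    def update(self, new_links_found: int) -> float:
        """Shorten the wait if the last visit found new links, lengthen it otherwise."""
        if new_links_found > 0:
            self.interval_s /= 2      # assumed speed-up factor
        else:
            self.interval_s *= 1.5    # assumed back-off factor
        # Keep the interval inside the configured bounds.
        self.interval_s = max(self.min_interval_s,
                              min(self.interval_s, self.max_interval_s))
        return self.interval_s


# A busy section and a quiet section, each visited three times.
busy = SectionSchedule()
quiet = SectionSchedule()
for _ in range(3):
    busy.update(new_links_found=4)
    quiet.update(new_links_found=0)

print(busy.interval_s, quiet.interval_s)
```

With these assumed factors, the busy section ends up being visited far more often than the quiet one, which is the behavior described above.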
Our system is specifically designed to avoid crawling the same article more than once – an important measure to avoid false duplicates in the news feed.
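A minimal sketch of how repeated crawling of the same article could be avoided is to remember a normalized form of every link already seen. This is an assumption-laden illustration, not Opoint's implementation: `normalize_url`, `SeenArticles`, and the normalization rules (lowercase host, no fragment, no trailing slash) are hypothetical, and a production system would likely persist the set and handle more URL variants.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Reduce a link to a comparable form (assumed rules, for illustration)."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    # Lowercase scheme and host, keep the query, drop the fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       path, parts.query, ""))


class SeenArticles:
    """In-memory record of article links that were already crawled."""

    def __init__(self) -> None:
        self._seen = set()

    def is_new(self, url: str) -> bool:
        """Return True the first time a link is seen, False on every repeat."""
        key = normalize_url(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


seen = SeenArticles()
first = seen.is_new("https://Example.com/news/story-1/")
repeat = seen.is_new("https://example.com/news/story-1#comments")
other = seen.is_new("https://example.com/news/story-2")
```

Here `first` and `other` are new links, while `repeat` points to the same article as `first` and would be skipped.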
Phase 2: Extract new articles
As soon as a new article link has been detected, we will open the link and attempt to extract the text from the article (our automated quality control will alert us if the extraction fails).
Our extraction application is designed to filter out noise such as ads and promotions.
The publisher may edit the content of an article a number of times; our version represents the article as it was at the moment we detected it.
Once a new website has been fine-tuned by our configurators, it is transferred to the automated crawling and quality control.
The automated quality control consists of various checkpoints and alerts that tell us whether we are detecting new articles as expected (crawling phase 1), and whether those articles are extracted correctly (crawling phase 2).
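The two kinds of checkpoints can be sketched as two small functions, one per crawling phase. Everything here is a hypothetical illustration rather than Opoint's actual quality control: the `ExtractedArticle` fields mirror the headline, body text, author, and date mentioned earlier, while the function names, the 200-character minimum, and the 50 percent tolerance are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExtractedArticle:
    """Result of extracting one article (hypothetical structure)."""

    url: str
    headline: Optional[str]
    body: Optional[str]
    author: Optional[str]
    published: Optional[str]


def detection_alert(new_links_last_day: int, expected_per_day: float,
                    tolerance: float = 0.5) -> Optional[str]:
    """Phase 1 check: alert if far fewer new articles were found than expected."""
    if expected_per_day > 0 and new_links_last_day < expected_per_day * tolerance:
        return "fewer new articles detected than expected"
    return None


def extraction_alerts(article: ExtractedArticle,
                      min_body_chars: int = 200) -> List[str]:
    """Phase 2 check: list the problems found in one extracted article."""
    alerts = []
    if not article.headline:
        alerts.append("missing headline")
    if not article.body or len(article.body) < min_body_chars:
        alerts.append("body text missing or too short")
    if not article.published:
        alerts.append("missing date")
    return alerts


broken = ExtractedArticle(url="https://example.com/news/story-1",
                          headline="Example headline", body="",
                          author=None, published=None)
print(detection_alert(new_links_last_day=3, expected_per_day=40))
print(extraction_alerts(broken))
```

In this sketch a missing author does not raise an alert, on the assumption that many articles are published without a byline.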
Do you want to know more about the benefits of our products? Please fill out the form and write a few words about what you are looking for.