Publishers of professional content have a significant advantage over public search engines. As I’ve noted in the past, professional publishers do more than merely index content for a search engine. Professional human editors – often with the same degrees and credentials as the practitioners who rely on them – summarize, explain, link and topically classify content. Editors who are lawyers and accountants synthesize concepts from multiple sources of law into actionable content for lawyers and accountants, who further contextualize that content to resolve their clients’ issues. Editors who are physicians – many of whom still practice medicine – synthesize findings from clinical trials and review thousands of articles in the medical literature. Practicing physicians rely on the physician-editors’ insights to generate better patient outcomes as part of evidence-based medicine.
Professional customers simply cannot keep up with the stream of content. Public search engines, despite sophisticated relevance-ranking algorithms, still present answer sets of thousands or even hundreds of thousands of documents. The flood of content is only going to get bigger, and it is getting harder for editors to keep up with it. This is where automation and algorithms can step in. Computer-assisted content enrichment is far more sophisticated than it used to be. Computers can already write news stories when the content is predictable. Citations to primary sources of law can be automatically transformed into active hypertext links. Algorithms can summarize documents, making it easier for editors to determine the most important points in a document. That summary might not be readable, but it can give an editor a head start in creating a summary that is meaningful to professional customers. Algorithms can also topically classify documents and document fragments, and that classification can be used for browsing, searching, linking and relevance ranking. Again, the topics assigned by algorithms might not reach the level of precision needed by professional customers, but they can be used to route content to editors and speed up the process of assigning a more precise topical classification.
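To make the citation-linking idea concrete, here is a minimal sketch in Python. It assumes a deliberately simple citation pattern (U.S. Code citations such as "26 U.S.C. § 501") and a made-up URL scheme; the names `USC_CITATION`, `BASE_URL` and `link_citations` are hypothetical and exist only for illustration. A production system would need to handle many more citation formats.

```python
import re

# Assumption: only simple U.S. Code citations like "26 U.S.C. § 501" are handled.
USC_CITATION = re.compile(r"\b(\d+)\s+U\.S\.C\.\s+§\s*(\d+[a-z]?)")

# Assumption: a placeholder URL scheme, not a real publisher endpoint.
BASE_URL = "https://example.com/uscode/{title}/{section}"


def link_citations(text: str) -> str:
    """Wrap each recognized citation in an HTML anchor pointing at BASE_URL."""

    def to_link(match: re.Match) -> str:
        title, section = match.group(1), match.group(2)
        url = BASE_URL.format(title=title, section=section)
        # Keep the original citation text as the visible link label.
        return f'<a href="{url}">{match.group(0)}</a>'

    return USC_CITATION.sub(to_link, text)


if __name__ == "__main__":
    print(link_citations("See 26 U.S.C. § 501 for details."))
```

Text that does not match the pattern passes through unchanged, so an editor can still add by hand any links the algorithm misses.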
I still do not see algorithms replacing editors. But there is an interesting possibility with “low value” content. For example, officially unpublished opinions do not have precedential value, and the volume of these opinions is high. Human editors cannot invest the same resources in summarizing those opinions as they do for appellate court opinions. This is an example where automated methods might be sufficient for a customer’s research needs. Of course, automated content enrichment methods could also be used to identify for editors those officially unpublished decisions that might be of particular interest. Closer attention needs to be paid to the value of each type of content for professional customers and to what combination of human and algorithmic enhancements is necessary to make it actionable.
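The triage idea can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the topic vocabularies, the `min_hits` threshold and the queue labels are all hypothetical, and simple keyword overlap stands in for whatever classifier a publisher would actually use.

```python
import re

# Assumption: tiny hand-made vocabularies; a real system would use a trained classifier.
TOPIC_KEYWORDS = {
    "tax": {"deduction", "income", "irs", "taxpayer"},
    "employment": {"employer", "wages", "termination", "discrimination"},
}


def classify(text: str) -> tuple[str, int]:
    """Return the best-matching topic and the number of keyword hits."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {topic: len(words & kws) for topic, kws in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def route(text: str, min_hits: int = 2) -> str:
    """Send strongly on-topic documents to an editor queue; leave the rest automated."""
    topic, hits = classify(text)
    if hits >= min_hits:
        return f"editor-queue:{topic}"  # hypothetical queue label
    return "automated-only"


if __name__ == "__main__":
    print(route("The taxpayer claimed a deduction against income."))
    print(route("The court denied the motion."))
```

The threshold is the lever that sets the balance between editors and algorithms: raising `min_hits` sends fewer documents to editors, lowering it sends more.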
I am convinced that professional publishers must strike a balance between applying algorithms and applying editors to content. What is clear is that technology is advancing, and which tasks are delegated to computers versus editors will always be in flux. Any thoughts?