Monday, March 4, 2024

Can AI Fix "Tags?"

If you are a person who writes hundreds of blog posts every year, you have encountered the tedium of tagging or keywording or otherwise classifying content. The problem is similar to the issue with structured databases, which require predefined fields and relationships.


One has to choose a limited number of tags, based on assumptions about what other users will be searching for. But searched-for tags will shift over time, especially when new relationships or connections are investigated. 


That chore of pre-defining what tags will “always” make sense is so imprecise and unpredictable that I long ago gave up using them. Yes, that might make the content harder to “find,” but it saves so much manual work that I deem the tradeoff reasonable.


In principle, artificial intelligence should solve some of the problem, not only by automatically classifying content, but also by offering something like the advantage a relational database has over older, more-static models: new tables or relationships can be added without fear of disturbing what already exists.


Older databases, for example, were often hierarchical: data was organized in a parent-child structure, which made complex queries challenging. Flat-file databases also had limited querying capabilities.
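
To make the relational point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows are invented for illustration. It shows a new relationship (topics) being added after the fact, without touching the table that already exists.

```python
import sqlite3

# In-memory database; the schema and rows below are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO posts (id, title) VALUES (?, ?)",
    [(1, "Can AI Fix Tags?"), (2, "AI Impact on Data Centers")],
)

# Later, a new relationship is added. The posts table is not altered.
conn.execute(
    "CREATE TABLE post_topics (post_id INTEGER REFERENCES posts(id), topic TEXT)"
)
conn.executemany(
    "INSERT INTO post_topics (post_id, topic) VALUES (?, ?)",
    [(1, "classification"), (1, "databases"), (2, "data centers")],
)


def titles_for_topic(topic):
    """Return titles of posts linked to a topic, via a join on the new table."""
    rows = conn.execute(
        "SELECT p.title FROM posts p "
        "JOIN post_topics t ON t.post_id = p.id "
        "WHERE t.topic = ? ORDER BY p.id",
        (topic,),
    ).fetchall()
    return [row[0] for row in rows]


def post_count():
    """Count rows in the original table, to show it was left untouched."""
    return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

In a strict parent-child layout, the same change would typically mean restructuring the hierarchy itself rather than adding a table beside it.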


In principle, AI should provide similar benefits for tagging functions. Less manual labor should be needed to classify information in a rigid way, and relationships would not always have to be pre-defined. That should allow future searches to use more-dynamic queries that go beyond the fixed structure.
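
As a rough illustration of what a query that goes beyond fixed tags might look like, here is a toy sketch in Python. It is only a stand-in: simple word counts and cosine similarity take the place of whatever learned representation a real AI system would use, and the post titles and text are invented. The point is that nothing is tagged in advance, and matching happens when the question is asked.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector (a crude stand-in for an AI embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query, posts, top_n=1):
    """Rank untagged posts against a free-form query at search time."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(body)), title) for title, body in posts.items()]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [title for score, title in ranked[:top_n] if score > 0]


# Hypothetical, untagged posts.
POSTS = {
    "Fiber buildouts": "rural fiber networks need capital and subsidies for broadband access",
    "Data center power": "data centers running ai workloads need more power and cooling",
}
```

Calling `search("ai power demand", POSTS)` picks out the data center post even though no one ever assigned it a "power" or "AI" tag.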


Equally important, AI should enable easier discovery of new connections beyond the predefined keywords or database schemas. And where assignment of tags and keywords can be subjective and ambiguous, AI should allow us to overcome such ambiguity or subjectivity. 


Tags and labels are often applied based on individual interpretations, leading to inconsistencies and ambiguity, especially across large datasets. AI should help overcome such limitations.


Limited scope and granularity could be less of an issue once we apply AI. Obviously, less manual effort will have to be expended on content classification. And AI-enhanced translation should mean all content can be accessed, irrespective of original language.


Of course, many will note that AI still lacks the fuzzy, subtle and contextual nuances that make expert human tagging useful. But it should be better than manual tags.


And what would ultimately be better is automated classification that updates over time as "key words" change and the original tags lose some relevance in their first context, while possibly becoming more relevant in new ones.


AI Impact on Data Centers

source: PTC