A Comparative Analysis of NLP Algorithms for Implementing AI Conversational Assistants
Upreti, Aanchal (2023)
The permanent address of the publication is
https://urn.fi/URN:NBN:fi-fe20231030141967
Abstract
The rapid adoption of low-code/no-code software systems has reshaped the landscape of software development, but it also brings challenges in usability and accessibility, particularly for those unfamiliar with the specific components and templates of these platforms. This thesis aims to improve the developer experience in Nokia Corporation's low-code/no-code software system for network management by incorporating Natural Language Interfaces (NLIs) built on Natural Language Processing (NLP) algorithms.
Focusing on two key NLP tasks, entity extraction and intent classification, we analyzed a range of algorithms: the MaxEnt Classifier with NLTK, spaCy, and Conditional Random Fields with Stanford NER for entity recognition; and SVM Classifier, Logistic Regression, Naïve Bayes, Decision Tree, Random Forest, and RASA DIET for intent classification. Each algorithm was evaluated on a dataset generated from network-related utterances. The evaluation covered both predictive performance metrics and system metrics.
Our research uncovers significant trade-offs in algorithm selection, in particular between computational cost and predictive accuracy. Some models, such as RASA DIET, excel in accuracy but require extensive computational resources, making them less suitable for lightweight systems. In contrast, simpler models such as spaCy and Stanford NER offer balanced performance but require careful consideration for specific entity types.
While the study is limited by dataset size and focuses on simpler algorithms, it offers an empirically grounded framework for practitioners and decision-makers at Nokia and similar corporations. The findings point towards future research directions, including the exploration of ensemble methods, the fine-tuning of existing models, and the real-world implementation and scalability of these algorithms in low-code/no-code platforms.
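As an illustration of the kind of comparison described above, the following is a minimal sketch of how one intent classifier could be scored on both a performance metric (accuracy) and a system metric (training time). It is not the thesis's implementation: the pure-Python multinomial Naïve Bayes model, the function names, and the toy utterances and intent labels are all assumptions made for illustration, and the thesis's actual dataset and tooling are not reproduced here.

```python
import math
import time
from collections import Counter, defaultdict

# Hypothetical toy utterances and intent labels, invented for illustration only;
# they are not taken from the thesis dataset.
TRAIN = [
    ("show alarms for router one", "show_alarms"),
    ("list active alarms on the network", "show_alarms"),
    ("display critical alarms", "show_alarms"),
    ("restart the base station", "restart_node"),
    ("reboot router two", "restart_node"),
    ("restart node three now", "restart_node"),
]
TEST = [
    ("show critical alarms", "show_alarms"),
    ("reboot the base station", "restart_node"),
]


def train_naive_bayes(examples):
    """Count words per intent for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter()
    vocab = set()
    for text, intent in examples:
        tokens = text.lower().split()
        word_counts[intent].update(tokens)
        intent_counts[intent] += 1
        vocab.update(tokens)
    return word_counts, intent_counts, vocab


def predict(model, text):
    """Return the intent with the highest log-posterior (add-one smoothing)."""
    word_counts, intent_counts, vocab = model
    total = sum(intent_counts.values())
    best_intent, best_score = None, -math.inf
    for intent, n in intent_counts.items():
        score = math.log(n / total)  # log prior of the intent
        denom = sum(word_counts[intent].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[intent][tok] + 1) / denom)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


def evaluate(train, test):
    """Return (accuracy, training time in seconds) for one train/test split."""
    start = time.perf_counter()
    model = train_naive_bayes(train)
    train_seconds = time.perf_counter() - start
    correct = sum(predict(model, text) == intent for text, intent in test)
    return correct / len(test), train_seconds


if __name__ == "__main__":
    accuracy, train_seconds = evaluate(TRAIN, TEST)
    print(f"accuracy={accuracy:.2f} train_time={train_seconds:.6f}s")
```

Running the same `evaluate` step for each candidate model would give one accuracy figure and one timing figure per model, which is the shape of trade-off the abstract describes. A realistic comparison would also need a larger dataset and further system metrics, such as memory use and inference latency.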