Conversation disentanglement as-a-service
Additional information
Authors
Type
Conference proceedings contribution
Year
2023
Language
English
Abstract
Modern instant messaging applications (e.g., Gitter, Slack, Discord) provide users with real-time communication means. Developers use them for collaborative development, to ask for code reviews, and to have software-related discussions. In short, a (potential) treasure trove for program comprehension. However, as with any high-throughput “chat application”, messages interleave, leading to concurrent conversations. Associating messages to conversations is called conversation disentanglement, a useful and necessary pre-processing step to analyze datasets of instant messages. Although various conversation disentanglement algorithms have been proposed, it is cumbersome to set up proper execution environments and hard to ensure input data format consistency, calling for better practices and tool support. We present CODI, a RESTful API micro-service and web interface for conversation disentanglement. It provides an easy way to disentangle conversation transcripts with pre-trained models or to train new ones on custom datasets, features, and hyper-parameters. CODI achieves state-of-the-art performance on transcripts of IRC, Slack, and Discord conversations. We show how CODI can provide a significant improvement to reusability (and replicability) of research results, while reducing the effort and potential mistakes due to configuration, setup, and execution.
Keywords
CoDi, Conversation disentanglement, Instant messaging, Micro-services
Conference proceedings title
IEEE/ACM 31st International Conference on Program Comprehension (ICPC), 15-16 May 2023
Pages (or article number)
59-63
Distribution
License
All rights reserved
Visibility
Public
Open access status
Green