Babel Machine Changelog
Last Updated: March 01, 2024
Version 2.1
Current Babel Machine version as of March 01, 2024.
Predictions and Modeling
- Implemented pooled models for Emotions Babel and Sentiment Babel.
- Initial release for Named Entity Recognition (NER) Babel: https://nerbabel.poltextlab.com
Pipeline and Backend
- Implemented a dropdown for choosing the processing unit used for prediction: GPU or CPU. We recommend the GPU in most scenarios unless your file contains 1,000 rows or fewer (for example, small samples or trying out the service). Please note that NER Babel is CPU-only by design.
- Fixed an issue that produced an error message or NA values in the prediction column beyond the 30,000th row.
- Fixed an issue that caused CUDA error messages and resulted in processing failures.
- Technical improvements to VM memory usage.
- Implemented an interim solution for VMs not shutting down properly after a job completes.
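The GPU/CPU recommendation above can be expressed as a simple rule. This is only an illustrative sketch: the function name and parameters are ours, not the service's actual code.

```python
def recommend_device(n_rows: int, module: str = "cap") -> str:
    """Suggest a processing unit per the guidance above: NER Babel is
    CPU-only by design; otherwise the GPU pays off beyond ~1,000 rows."""
    if module == "ner":
        return "cpu"
    return "cpu" if n_rows <= 1000 else "gpu"
```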
Improvements and Adjustments
- Announcements are now displayed above the upload forms with their own formatting.
- Added clarifications regarding supported languages in the upload form text.
- Improved the internal messages we use to track and debug submissions.
- Emotion Babel has been renamed to Emotions Babel.
- Adjusted page subtitles and upload form texts for clarification.
- Adjusted the messages of the successful prediction emails to properly reflect the module used.
- Dataset download emails now include the computation cost of the prediction for the submitted file.
Version 2.0
Babel Machine version from January 24, 2024 to March 01, 2024.
This version introduces the division of the Babel Machine into separate modules. The CAP Babel Machine is now found at https://capbabel.poltextlab.com, and https://babel.poltextlab.com serves as a landing page for module selection.
Predictions and Modeling
- Initial release for Manifesto Babel.
- Initial release for Sentiment Babel.
- Initial release for Emotion Babel.
- The baseline model for CAP Babel Machine has been retrained.
Pipeline and Backend
- Support for the Babel Machine models has been implemented.
Improvements and Adjustments
- Module pages now have a menu at the top of the page that can be used to jump to the other Babel modules.
Version 1.2
This version introduced the 10-domain setup (see the table and note below).
Predictions and Modeling
Version 1.1 Domains | Version 1.2 Domains
--- | ---
Budget | Budget
- | Executive Orders
Judicial Decision | Judiciary
Legal | Legislative
Manifesto | Party Manifestos
Media | Media
- | Public Opinion
Social Media | Social Media
Speech | Executive Speech, Parliamentary Speech
Other | -
- We have developed language-domain models that cover 7 languages (Hungarian, English, Italian, Dutch, Spanish, French, and Danish) and 10 domains (Media, Social Media, Parliamentary Speech, Legislative, Executive Speech, Executive Orders, Party Manifestos, Judiciary, Budget, Public Opinion). Babel DOES WORK for other domains and languages, but we cannot provide validity scores due to a lack of hand-coded test data.
- There is a model selection step that chooses between the language and language-domain models for supported datasets, picking whichever has the higher F1 score.
- We implemented softmax scores (a feature request): the results email includes the three highest-probability category predictions from the Babel Machine model, along with the probability (softmax) score assigned to each label. Take these scores with a grain of salt.
- Support for "None" category (label 999): "Most of the language models that the CAP Babel Machine uses were fine-tuned on training data containing the label 'None' in addition to the 21 CAP major policy topics, indicating that the given text contains no relevant policy content. We use the label 999 for these cases. Note that some of the models (e.g., Danish legislative, Dutch media) do not recognize this category and thus cannot predict if the row has no policy content." It thus serves as a policy relevancy binary classifier as well.
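As an illustration of the softmax scores described above, here is a minimal sketch of how the three highest-probability labels could be derived from a model's raw logits. The function name and example values are ours; the Babel Machine's internal implementation is not shown here.

```python
import numpy as np

def top3_with_scores(logits, labels):
    """Return the three highest-probability labels and their softmax scores."""
    z = np.asarray(logits, dtype=float)
    z -= z.max()                      # subtract max for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    best = np.argsort(probs)[::-1][:3]
    return [(labels[i], float(probs[i])) for i in best]

# Hypothetical example: two CAP major topic codes plus 999 ("None").
result = top3_with_scores([2.0, 1.0, 0.5], [1, 2, 999])
```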
Pipeline and Backend
- Support for uploading large datasets (up to ~800 MB); no need to split files manually (unless they are far larger than this limit).
- Improved prediction speed.
- Added a cache for loading models.
- Added CSV validation that checks the dataset for typical errors (such as improper delimiter usage that breaks file processing).
- Security improvements (important for us, as providers):
- Character limits on the upload form
- Upload form inputs and CSV files are sanitized (special characters are rejected) to prevent exploits such as SQL injection
- Internal reporting adjustments so we can identify submission information and issues faster:
- Detailed metadata description on Slack that corresponds to the metadata on the upload form
- Report the number of rows and the number of coded rows so we can verify that all rows were properly coded
- Runtime is now reported as timestamps, and an estimated runtime cost is added so we can see the price of each dataset coding (especially for bigger files)
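The service's validator is internal, but the typical checks mentioned above (UTF-8 encoding, delimiter usage, consistent row widths) can be sketched with the Python standard library. The function name and exact checks are illustrative, not the pipeline's actual code.

```python
import csv
import io

def check_csv(raw: bytes, delimiter: str = ",") -> list:
    """Return a list of problems of the kind the validation step catches."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]
    problems = []
    try:
        # Sniff the delimiter from a sample of the file
        dialect = csv.Sniffer().sniff(text[:4096])
        if dialect.delimiter != delimiter:
            problems.append(f"unexpected delimiter {dialect.delimiter!r}")
    except csv.Error:
        problems.append("could not detect a consistent delimiter")
    rows = [r for r in csv.reader(io.StringIO(text)) if r]
    if len({len(r) for r in rows}) > 1:
        problems.append("rows have inconsistent column counts")
    return problems
```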
Improvements and Adjustments
- Added a Contact Us form to the page so users can reach out to us with inquiries and questions.
- There is a menu at the top of the page that includes a link back to poltextlab.com.
- The upload form shows the characters remaining for each field (due to the new character limits).
- Updated the upload instructions with more technical details about the dataset validation steps, so uploaders can address common issues that cause processing failures and resubmit the file after correcting the error.
- Error messages on the upload form have been clarified; in particular, the UTF-8 error now provides more context for those less familiar with character encoding.
- Users will receive an email if the CSV validation step fails, with an explanation of what caused the error. As some errors are inherently difficult to catch (such as improper delimiter usage), we recommend following the upload instructions for troubleshooting.
- The email with the coded dataset and softmax scores provides suggestions on opening CSV files properly (in our experience, Excel and LibreOffice do not handle them well by default).
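If a spreadsheet application mangles the downloaded file, reading it programmatically with explicit encoding and delimiter settings avoids the guesswork. A small pandas sketch (the column names here are hypothetical, not the actual output schema):

```python
import io
import pandas as pd

def load_coded_csv(path_or_buffer):
    """Open a Babel Machine output CSV with the encoding and delimiter
    stated explicitly, instead of letting a spreadsheet app guess them."""
    return pd.read_csv(path_or_buffer, sep=",", encoding="utf-8")

# In-memory stand-in for a downloaded file; 999 is the "None" label.
sample = io.StringIO("text,major_topic_pred\nsome sentence,999\n")
df = load_coded_csv(sample)
```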
Version 1.1
This version added the initial language-domain setup to the pipeline.
Predictions and Modeling
- We have developed language-domain models that cover 7 languages (Hungarian, English, Italian, Dutch, Spanish, French, and Danish) and 7 domains (Legal, Speech, Budget, Manifesto, Media, Social Media, Other). Babel DOES WORK for other domains and languages, but we cannot provide validity scores due to a lack of hand-coded test data.
Pipeline and Backend
- The number of standby VMs for prediction has been increased from 2 to 4.
Improvements and Adjustments
- Polished the text of the upload form.
- Updated the upload instructions.
- Internal submission notifications have been improved.
- Prediction completion email text has been adjusted.
- An institutional affiliation field has been added to the upload form.
- A dataset language field has been added to the upload form.
Version 1.0
Initial release of the CAP Babel Machine. Predictions were handled by an XLM-RoBERTa model fine-tuned on training data in 6 languages (English, Spanish, Hungarian, Polish, Danish, Dutch).
The research was supported by the Ministry of Innovation and Technology NRDI Office within the RRF-2.3.1-21-2022-00004 Artificial Intelligence National Laboratory project and received additional funding from the European Union's Horizon 2020 program under grant agreement no. 101008468. We also thank the Babel Machine project and HUN-REN Cloud (Héder et al. 2022; https://science-cloud.hu) for their support. We used the machine learning service of the Slices RI infrastructure (https://www.slices-ri.eu/).
HOW TO CITE: If you use the Babel Machine for your work or research, please cite this paper:
Sebők, M., Máté, Á., Ring, O., Kovács, V., & Lehoczki, R. (2024). Leveraging Open Large Language Models for Multilingual Policy Topic Classification: The Babel Machine Approach. Social Science Computer Review, 0(0). https://doi.org/10.1177/08944393241259434