NusaCrowd: Open source initiative for Indonesian NLP resources S Cahyawijaya, H Lovenia, AF Aji, G Winata, B Wilie, F Koto, R Mahendra, ... Findings of the Association for Computational Linguistics: ACL 2023, 13745-13818, 2023 | 1161 | 2023 |
IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP F Koto, A Rahimi, JH Lau, T Baldwin Proceedings of the 28th COLING 2020, 757-770, 2020 | 271 | 2020 |
CMMLU: Measuring Massive Multitask Language Understanding in Chinese H Li, Y Zhang, F Koto, Y Yang, H Zhao, Y Gong, N Duan, T Baldwin Findings of ACL 2024, 2024 | 176 | 2024 |
InSet lexicon: Evaluation of a word list for Indonesian sentiment analysis in microblogs F Koto, GY Rahmaningtyas 2017 International Conference on Asian Language Processing (IALP), 391-394, 2017 | 175 | 2017 |
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia AF Aji, GI Winata, F Koto, S Cahyawijaya, A Romadhony, R Mahendra, ... Proceedings of ACL 2022, 2022 | 83 | 2022 |
IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization F Koto, JH Lau, T Baldwin Proceedings of EMNLP 2021, 2021 | 82 | 2021 |
A comparative study on Twitter sentiment analysis: Which features are good? F Koto, M Adriani Proceedings of the 20th NLDB 2015, 453-457, 2015 | 82 | 2015 |
Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models N Sengupta, SK Sahu, B Jia, S Katipomu, H Li, F Koto, OM Afzal, ... Technical Report, 2023 | 78 | 2023 |
SMOTE-Out, SMOTE-Cosine, and Selected-SMOTE: An Enhancement Strategy to Handle Imbalance in Data Level F Koto The 6th ICACSIS, 2014 | 73 | 2014 |
NusaX: Multilingual parallel sentiment dataset for 10 Indonesian local languages GI Winata, AF Aji, S Cahyawijaya, R Mahendra, F Koto, A Romadhony, ... Proceedings of the 17th EACL 2023, 2022 | 64 | 2022 |
LLM360: Towards fully transparent open-source LLMs Z Liu, A Qiao, W Neiswanger, H Wang, B Tan, T Tao, J Li, Y Wang, S Sun, ... Proceedings of the First Conference on Language Modeling (COLM 2024), 2023 | 55 | 2023 |
Bactrian-X: Multilingual replicable instruction-following models with low-rank adaptation H Li, F Koto, M Wu, AF Aji, T Baldwin arXiv preprint arXiv:2305.15011, 2023 | 54 | 2023 |
Discourse Probing of Pretrained Language Models F Koto, JH Lau, T Baldwin Proceedings of NAACL 2021, 2021 | 49 | 2021 |
Liputan6: A Large-scale Indonesian Dataset for Text Summarization F Koto, JH Lau, T Baldwin Proceedings of AACL 2020, 2020 | 47 | 2020 |
Apparatus and method for sharing personal electronic-data of health A Kurniawan, O Abdillah, Fajri US Patent App. 15/221,140, 2017 | 42* | 2017 |
Are multilingual LLMs culturally-diverse reasoners? An investigation into multicultural proverbs and sayings CC Liu, F Koto, T Baldwin, I Gurevych Proceedings of NAACL 2024, 2024 | 41 | 2024 |
Top-down Discourse Parsing via Sequence Labelling F Koto, JH Lau, T Baldwin Proceedings of the 16th EACL 2021, 2021 | 36 | 2021 |
FFCI: A framework for interpretable automatic evaluation of summarization F Koto, T Baldwin, JH Lau Journal of Artificial Intelligence Research (JAIR) 73, 1553–1607, 2022 | 31 | 2022 |
Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU F Koto, N Aisyah, H Li, T Baldwin Proceedings of EMNLP 2023, 2023 | 25 | 2023 |
A Publicly Available Indonesian Corpora for Automatic Abstractive and Extractive Chat Summarization F Koto The 10th International Conference on Language Resources and Evaluation (LREC), 2016 | 23 | 2016 |