- Welcome to the coolest collection of LLM papers around! 🚀
- Here you'll find groundbreaking ideas, fresh perspectives, and meaningful work—all without the heavy math or engineering grind.
- Perfect for anyone looking to dive into innovative research with a light touch. Let’s keep it simple, sharp, and super inspiring!
- Wang, Y., Zhao, J., Ones, D. S., He, L., & Xu, X. (2025). Evaluating the ability of large language models to emulate personality. Scientific Reports, 15(1), 519.
- https://www.nature.com/articles/s41598-024-84109-5
- This research explores GPT-4's ability to role-play individuals with diverse personality traits, showing strong internal consistency and convergent validity in simulations, though performance declines with increased role complexity. Adding demographic information improved emulated traits' predictive validity, highlighting the potential for LLMs in simulating realistic human behaviors.
- Schmidgall, S., Su, Y., Wang, Z., Sun, X., Wu, J., Yu, X., ... & Barsoum, E. (2025). Agent Laboratory: Using LLM Agents as Research Assistants. arXiv preprint arXiv:2501.04227.
- https://arxiv.org/abs/2501.04227
- Agent Laboratory is an autonomous LLM-based framework that streamlines the research process by performing literature review, experimentation, and report writing, producing comprehensive outputs like code repositories and research reports. Evaluations show that its generated machine learning code achieves state-of-the-art performance compared with existing methods, that it cuts research expenses by 84% relative to previous autonomous research methods, and that human feedback at each stage significantly improves research quality. This framework aims to shift researchers' focus from routine tasks to creative ideation, accelerating scientific discovery.
- Zhou, L., Schellaert, W., Martínez-Plumed, F., Moros-Daval, Y., Ferri, C., & Hernández-Orallo, J. (2024). Larger and more instructable language models become less reliable. Nature, 1-8.
- Paper: https://www.nature.com/articles/s41586-024-07930-y
- GitHub: https://github.com/wschella/llm-reliability
(2024.9.1, arXiv working paper) The 🚀AI Scientist 🚀: Towards Fully Automated Open-Ended Scientific Discovery
- Lu, C., Lu, C., Lange, R. T., Foerster, J., Clune, J., & Ha, D. (2024). The AI Scientist: Towards fully automated open-ended scientific discovery. arXiv preprint arXiv:2408.06292.
- arXiv: https://arxiv.org/abs/2408.06292
- GitHub: https://github.com/SakanaAI/AI-Scientist
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., ... & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35, 24824-24837.
- Chain-of-Thought (CoT) prompting has significantly enhanced the reasoning capabilities of large language models (LLMs), enabling them to perform complex tasks by generating intermediate reasoning steps. This approach has led to state-of-the-art performance on various reasoning benchmarks, demonstrating the potential of CoT in advancing LLMs' problem-solving abilities.
- https://proceedings.neurips.cc/paper_files/paper/2022/hash/9d5609613524ecf4f15af0f7b31abca4-Abstract-Conference.html
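To make the idea of "intermediate reasoning steps" concrete, here is a minimal sketch of how a few-shot CoT prompt might be assembled. The `Exemplar` class and `build_cot_prompt` function are hypothetical names introduced for illustration, not code from the paper, and no model is called; the exemplar follows the tennis-ball style of worked example used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked example shown to the model (hypothetical helper type)."""
    question: str
    reasoning: str  # intermediate steps written out in plain language
    answer: str


def build_cot_prompt(exemplars, question):
    """Assemble a few-shot prompt whose exemplars spell out their reasoning."""
    parts = []
    for ex in exemplars:
        parts.append(
            f"Q: {ex.question}\nA: {ex.reasoning} The answer is {ex.answer}."
        )
    # The final answer is left open so the model continues with its own steps.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


exemplars = [
    Exemplar(
        question=(
            "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
            "Each can has 3 tennis balls. How many tennis balls does he have now?"
        ),
        reasoning=(
            "Roger started with 5 balls. 2 cans of 3 tennis balls each is "
            "6 tennis balls. 5 + 6 = 11."
        ),
        answer="11",
    )
]

prompt = build_cot_prompt(
    exemplars,
    "The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?",
)
print(prompt)
```

The only difference from a standard few-shot prompt is the `reasoning` field: dropping it and keeping just the final answer gives the baseline that CoT prompting is compared against.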
(2023.12.18, Nature Computational Science) Introducing Life2Vec - Using sequences of life-events to 🚀predict human lives 🚀
- Savcisens, G., Eliassi-Rad, T., Hansen, L. K., Mortensen, L. H., Lilleholt, L., Rogers, A., ... & Lehmann, S. (2024). Using sequences of life-events to predict human lives. Nature Computational Science, 4(1), 43-56.
- Paper: https://www.nature.com/articles/s43588-023-00573-5
- Website: https://life2vecai.com/
- GitHub: https://github.com/SocialComplexityLab/life2vec
LLMs and Finance @ Alex Kim: https://www.alexacct.com/
- Kim, A. G., & Nikolaev, V. V. (2024). Context‐Based Interpretation of Financial Information. Journal of Accounting Research.
- This study examines how narrative context in financial statements enhances the informativeness of numerical data. Utilizing deep learning techniques, the authors demonstrate that integrating narrative disclosures with numerical figures significantly improves predictions about a firm's future performance, especially when numerical data alone is less reliable.
- December 2024. "Financial Statement Analysis with Large Language Models" received the BlackRock Best Research Paper Award.
- December 2024. [New Paper] "Learning Fundamentals from Text" is now available on SSRN.
- November 2024. "Vocal Delivery Quality in Earnings Conference Calls" has been conditionally accepted for publication in the Journal of Accounting and Economics.
- October 2024. "Context-Based Interpretation of Financial Information" is forthcoming in the Journal of Accounting Research.
- Kim, A., Muhn, M., & Nikolaev, V. (2024). Financial statement analysis with large language models. arXiv preprint arXiv:2407.17866.
- arXiv: https://arxiv.org/abs/2407.17866
- SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4835311
(2024.8.28, SSRN working paper) Bloated Disclosures: Can ChatGPT Help Investors 🚀Process Information? 🚀
- Kim, A., Muhn, M., & Nikolaev, V. V. (2024). Bloated disclosures: can ChatGPT help investors process information?. Chicago Booth Research Paper, (23-07), 2023-59.
- arXiv: https://arxiv.org/abs/2306.10224
- SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4425527
- 2024 FMA Asia-Pacific Conference Best Paper Award in Corporate Finance
- Kim, A., Muhn, M., & Nikolaev, V. (2023). From transcripts to insights: Uncovering corporate risks using generative ai. arXiv preprint arXiv:2310.17721.
- SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4593660
- arXiv: https://arxiv.org/abs/2310.17721
(2024.6.26, PETRA'24) Stock Price Trend Prediction using Emotion Analysis of Financial Headlines with Distilled LLM Model
- https://dl.acm.org/doi/10.1145/3652037.3652076
- Bhat, R., & Jain, B. (2024, June). Stock Price Trend Prediction using Emotion Analysis of Financial Headlines with Distilled LLM Model. In Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments (pp. 67-73).
- This study uses a distilled LLM to analyze the emotional tone of financial news headlines rather than relying on scraped financial data. The model extracts emotion-based attributes, which are then fed to machine learning algorithms to predict stock price direction.
- https://arxiv.org/abs/2407.00890
- Carriero, A., Pettenuzzo, D., & Shekhar, S. (2024). Macroeconomic Forecasting with Large Language Models. arXiv preprint arXiv:2407.00890.
- This paper presents a comparative analysis evaluating the accuracy of Large Language Models (LLMs) against traditional macro time series forecasting approaches.
- https://arxiv.org/abs/2411.00640
- Miller, E. (2024). Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations. arXiv preprint arXiv:2411.00640.
- This article addresses a gap in the evaluation of LLMs by incorporating principles from experimental science and statistical analysis. It guides researchers trained in statistics on how to analyze data from LLM evaluations, measure differences between models, and effectively plan evaluation experiments. The author recommends specific strategies for conducting and reporting evaluations to reduce statistical noise and enhance the informativeness of the results.
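As a rough illustration of what "adding error bars" can look like in practice, the sketch below computes a mean score with its standard error and a 95% confidence interval, then does the same for the per-question difference between two models. It assumes questions are independent draws and uses a normal approximation; the function names and the scores are made up for illustration and are not taken from the paper, which covers more cases than this.

```python
import math
import statistics


def mean_with_ci(scores, z=1.96):
    """Return (mean, standard error, (low, high)) under a normal approximation."""
    n = len(scores)
    mean = statistics.fmean(scores)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    sem = statistics.stdev(scores) / math.sqrt(n)
    return mean, sem, (mean - z * sem, mean + z * sem)


def paired_difference(scores_a, scores_b, z=1.96):
    """Compare two models on the same questions via per-question differences."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean_with_ci(diffs, z)


# Made-up per-question scores (1 = correct, 0 = incorrect), illustration only.
scores_a = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
scores_b = [1, 0, 0, 1, 1, 0, 1, 0, 1, 0]

mean_a, sem_a, ci_a = mean_with_ci(scores_a)
diff, sem_diff, ci_diff = paired_difference(scores_a, scores_b)

print(f"model A accuracy: {mean_a:.2f} (95% CI {ci_a[0]:.2f} to {ci_a[1]:.2f})")
print(f"A minus B: {diff:.2f} (95% CI {ci_diff[0]:.2f} to {ci_diff[1]:.2f})")
```

Working with per-question differences rather than two separate intervals makes use of the fact that both models answered the same questions, which typically tightens the interval on the comparison.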
- Large language models encode clinical knowledge: https://www.nature.com/articles/s41586-023-06291-2
- Singhal, K., Azizi, S., Tu, T., Mahdavi, S. S., Wei, J., Chung, H. W., ... & Natarajan, V. (2023). Large language models encode clinical knowledge. Nature, 620(7972), 172-180.
- This paper introduces the MultiMedQA benchmark to evaluate the performance of large language models in medical question answering. By introducing the new dataset HealthSearchQA and a human evaluation framework, the study shows that Flan-PaLM achieved leading accuracy across multiple datasets but highlights gaps in areas like comprehension and reasoning. It also proposes the instruction prompt tuning method to improve model performance in the medical domain.
- https://arxiv.org/html/2402.17944v1
- Fang, X., Xu, W., Anting Tan, F., Zhang, J., Hu, Z., Qi, Y., ... & Faloutsos, C. (2024). Large language models on tabular data--a survey. arXiv e-prints, arXiv-2402.
- Related paper 1: Large Language Models versus Classical Machine Learning: Performance in COVID-19 Mortality Prediction Using High-Dimensional Tabular Data
- [AI for Grant Writing GitHub](https://github.com/eseckel/ai-for-grant-writing?tab=readme-ov-file)
- A curated list of resources for using AI to develop more competitive grant applications.
(AAAI 2023) Estimating Geographic Spillover Effects of COVID-19 Policies from Large-Scale Mobility Networks
- Chang, S., Vrabac, D., Leskovec, J., & Ugander, J. (2023, June). Estimating geographic spillover effects of COVID-19 policies from large-scale mobility networks. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 12, pp. 14161-14169).
- https://ojs.aaai.org/index.php/AAAI/article/view/26657
- DOI: https://doi.org/10.1609/aaai.v37i12.26657
- This paper investigates the spillover effects of county-level mobility restrictions under California's COVID-19 policies, showing that local restrictions are only 54% as effective as statewide restrictions in reducing mobility. Using a regression discontinuity design on mobility data, the study reveals significant cross-county movement, especially in sectors like retail and dining. The authors propose an optimized "macro-county" policy approach that groups counties to mitigate spillovers and achieves over 90% of the effectiveness of statewide restrictions.
- Chang, S., Chaszczewicz, A., Wang, E., Josifovska, M., Pierson, E., & Leskovec, J. (2024). LLMs generate structurally realistic social networks but overestimate political homophily. arXiv preprint arXiv:2408.16629.
- https://arxiv.org/abs/2408.16629
- This paper evaluates LLM-generated social networks, finding that "local" generation methods produce more realistic networks that align well with real-world characteristics like density and clustering. However, LLMs overestimate political homophily, placing more emphasis on political alignment than seen in actual social networks.
- Chang, S., Zhong, R., Adams, E., Lee, F. T., Varia, S., Patton, D., ... & McKeown, K. (2018). Detecting gang-involved escalation on social media using context. arXiv preprint arXiv:1809.03632.
- https://aclanthology.org/D18-1005/
- This paper presents a system for detecting expressions of aggression and loss in social media posts by gang-involved youth, using domain-specific resources and contextual representations of users’ recent tweets and interactions. By incorporating context into a CNN model, the system achieves significantly improved accuracy in identifying potential risks of real-world violence.
- GitHub: https://github.com/serinachang5/contextifier
- Submission Deadline: May 2024 (exact date TBD)
- Focus: Machine Learning, Deep Learning, Artificial Intelligence
- Submission Deadline: October 2024 (exact date TBD)
- Focus: Representation Learning, Deep Learning, Machine Learning
- Submission Deadline: Around February 2024 (exact date TBD)
- Focus: Natural Language Processing, Large Language Models, Language Understanding
- Submission Deadline: Around May 2024 (exact date TBD)
- Focus: Natural Language Processing, Large Language Models, Generative Models
- Submission Deadline: Around March 2024 (exact date TBD)
- Focus: Natural Language Processing, Language Modeling, Computational Linguistics
- Submission Deadline: Around September 2024 (exact date TBD)
- Focus: Artificial Intelligence, Machine Learning
- Submission Deadline: Around June 2024 (exact date TBD)
- Focus: Natural Language Processing, Language Models
- Submission Deadline: Around December 2024 (exact date TBD)
- Focus: Natural Language Processing
- Submission Deadline: Around November 2024 (exact date TBD)
- Focus: Computer Vision, Deep Learning
- Submission Deadline: Around February 2024 (exact date TBD)
- Focus: Machine Learning, Deep Learning
- Submission Deadline: Around October 2024 (exact date TBD)
- Focus: Statistical Learning, Machine Learning, AI
- Submission Deadline: Around April 2024 (exact date TBD)
- Focus: Natural Language Processing, Deep Learning
- Submission Deadline: Around October 2024 (exact date TBD)
- Focus: Speech Recognition, Signal Processing
- Submission Deadline: Around September 2024 (exact date TBD)
- Focus: Robotics, AI, Automation
- Submission Deadline: Around January 2024 (exact date TBD)
- Focus: Artificial Intelligence, Machine Learning, Automated Reasoning
- Submission Deadline: Around March 2024 (exact date TBD)
- Focus: Data Mining, Machine Learning, Artificial Intelligence
- Scientific Reports (NLP)
- Computers and Security
- Asian Journal of Social Science
- Frontiers in Public Health